
What AI Taught Me About Being a Product Engineer
The AI change that was technically correct but product-wrong
Earlier this year, I reviewed an AI-generated pull request that did exactly what it had been asked to do.
It was for a feature where a client had asked for more control. The ticket was rough, something like: “Add option to X in Y so users can do Z.”
The PR added the configuration setting. The migration was there. The tests passed. The UI behaved correctly. The implementation was not perfect, but it was reasonable. If I looked only at the ticket, the PR was hard to reject. It had translated an instruction into working software.
And still, something felt wrong.
The problem was not that the AI had failed. In a narrow sense, it had succeeded. It had answered the request almost too literally.
The client did want more control. But the deeper issue was not that the product lacked another setting. They wanted control because the feature's defaults were unclear, or because those defaults were failing them in ways they could not understand or trust.
Adding a setting gave them control. It also added another path through the product, another workflow to explain, and another concept users would have to keep in their heads.
Inside the diff, the change was correct. Inside the product, it was questionable.
That moment stayed with me because it made it much harder to ignore a shift I had already been feeling.
I have always described myself more naturally as a Product Engineer than as a Software Engineer. Not because I dislike software engineering. I love software engineering. I love clean systems, good abstractions, elegant APIs, and the satisfaction of separate pieces finally fitting together.
By Product Engineer, I mean an engineer who treats implementation as part of the job, not the edge of it.
If I think of my job as producing software, the boundary can stop at the implementation. If I think of my job as producing a product, that boundary moves outward: to the problem, the tradeoff, the user experience, and the consequences of what we ship.
AI has made that distinction more important.
It makes it easier to build the wrong thing faster.
And if AI can write more code than we can, then our value cannot be measured by the amount of code we produce.
Clean code was my comfort zone
Earlier in my career, I optimized for elegance.
I wanted the code to be clean. I cared about APIs, layers, abstractions, and modules that could be built separately but still compose into something coherent. I liked the feeling of a system becoming legible: boundaries settling into place, names becoming obvious, complexity finding a shape.
I still care about clean systems.
I still care when code is sloppy. I still care when a model leaks into the wrong layer, when an API name hides the real behavior, or when a shortcut will make the next change painful. Those are real engineering concerns.
But I care about it differently now.
Clean code only matters if the product decision behind it is worth keeping.
The danger is not clean code. The danger is believing clean code is the point.
Clean code can become a hiding place because it feels objective and controllable. The abstraction either holds or it does not. The API either feels right or it does not. The module is either too coupled or it is not. These are difficult questions, but they are questions engineers can answer mostly inside the system.
Product work is messier.
Users contradict each other. Timing changes the right answer. Business needs matter. A technically awkward solution may be the right thing to ship because it buys learning, trust, or time. A beautifully designed abstraction may be premature because the product concept underneath it is still unstable.
That messiness can be uncomfortable.
For a long time, craft gave me a reliable way to feel useful. I could point to the refactor. I could explain the architecture. I could show that the implementation was flexible, tested, and maintainable. Those things mattered, and still matter, but they also gave me a familiar way to measure my value.
It is strange to see a model produce in seconds something that used to give me a reliable sense of progress. Not because the output is always good, but because it is often good enough to remove the easy part of my confidence.
AI has made that comfort feel less stable.
If an AI system can produce a plausible first version of the implementation, then being valuable cannot only mean being the person who can turn a ticket into code. The craft still matters, but it is no longer enough to treat it as the safest place to stand.
The harder question is whether the ticket should become code at all.
When output gets cheap, judgment gets expensive
When implementation was expensive, producing code felt like the scarce part.
If you could build the thing, you had leverage. You could turn ambiguity into a working system. You could make ideas real. For a long time, that ability shaped how many engineers understood their value. I was not immune to that. Part of my identity was built around being able to create solid software out of messy requirements.
AI changes the economics of that work.
It can scaffold a feature, write a first version of the tests, explore a library, explain an unfamiliar part of the system, and produce a plausible implementation. It does not make those things free, and it does not make them automatically good. But it lowers the cost of getting from instruction to output.
That matters because when output becomes cheaper, scarcity moves elsewhere.
It moves toward knowing what to ask for, what to leave out, and what the product can absorb without becoming harder to understand. It moves toward recognizing when a request is a symptom of confusion rather than a missing feature.
More software will be built. More experiments will be cheap enough to attempt. More internal tools will appear. More workflows will become software. Some of that will create real leverage. Some of it will create new surfaces to explain, support, migrate, debug, and eventually remove.
A feature does not stop costing money when the pull request is merged. It costs attention. It costs maintenance. It costs conceptual space in the product.
The cost of producing a feature may go down.
The cost of carrying the wrong feature does not disappear.
That is why the work before implementation matters more, not less.
AI can help with judgment, but it cannot own it
There is an obvious objection here: AI will not stop at implementation.
It is already useful for more than code. It can summarize customer calls, compare product approaches, generate alternative flows, identify edge cases, and challenge assumptions. It can help make vague ideas more concrete. It can turn a blurry thought into something a team can react to.
I use it that way all the time.
So the argument cannot be that AI writes code while humans think about product. That boundary is too clean, and it is already false.
AI can help think through a tradeoff. It can propose options. It can surface missing assumptions. It can argue for one path over another. It can make the conversation better.
But it does not live with the consequences of the decision.
It does not answer the support ticket when the behavior confuses a customer. It does not explain the tradeoff to sales. It does not decide that a technically cleaner abstraction is wrong because it makes the product harder to understand. It does not know which debts are intentional and which ones will make the product harder to change six months from now.
The model can participate in the thinking. It cannot be accountable for the outcome.
That ownership gap is where product engineering still lives.
Product work starts before the instruction is clear
This became very obvious to me during our work on Enginy’s Smart Inbox.
The inbox is a unified place for messages across different channels, multiple identities, and many contacts. At first, the intention sounded straightforward: make it better designed, prettier, and more polished.
It was the kind of project that can look mostly visual from the outside. Improve the surface. Clean up the experience. Make it feel more refined.
But once we touched it, the project stopped being about polish.
We started uncovering different ideas of what the inbox was supposed to be. Some users wanted it to process everything. They wanted a place where all incoming work could be handled, sorted, filtered, and moved forward. Others wanted it to stay focused on the next relevant action, almost as a way to protect attention.
A lot of the requests sounded like requests for more control.
Users wanted filters by status: unread, read, unreplied, replied. They wanted to filter by almost any available property. From the outside, these requests looked reasonable. If the data exists, why not expose it? If users ask for a filter, why not add the filter?
But this was the same pattern as the PR from the beginning.
The obvious implementation was to add the requested control. The deeper product question was whether the inbox model was clear enough.
The inbox was not just a list of conversations anymore. One of the major product changes was grouping multiple conversations with the same contact into one view. That made the product more useful in some ways, because users often think in terms of people and relationships, not individual message threads.
But it also introduced constraints that made some requests less straightforward.
A status that is obvious at the individual conversation level can become ambiguous when multiple conversations are grouped under the same contact.
If one conversation is replied and another is unreplied, what status should the contact have?
If the last reply in one channel came from the contact, but another channel has a drafted message ready to send, what should the inbox show?
If a filter says “unreplied,” should it include a contact with one unreplied thread and three completed ones?
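To make that ambiguity concrete, here is a minimal sketch in TypeScript of one possible aggregation rule. The names and types are hypothetical, not Enginy's actual code; the point is that every branch encodes a product decision.

```typescript
// Hypothetical types: each conversation carries its own channel-level status.
type ConversationStatus = "unread" | "read" | "unreplied" | "replied";

interface Conversation {
  channel: string;
  status: ConversationStatus;
  hasDraft: boolean;
}

// One possible rule for collapsing many conversations into a single
// contact-level status. Every branch is a product decision in disguise.
function contactStatus(conversations: Conversation[]): ConversationStatus {
  // Decision 1: "needs attention" wins. A contact with one unreplied
  // thread and three completed ones still shows as unreplied.
  if (conversations.some((c) => c.status === "unreplied")) {
    return "unreplied";
  }
  // Decision 2: drafts are ignored here. Counting a drafted message as
  // "handled" would teach users a different model of the inbox.
  if (conversations.every((c) => c.status === "replied")) {
    return "replied";
  }
  // Decision 3: any unread thread marks the whole contact as unread.
  return conversations.some((c) => c.status === "unread") ? "unread" : "read";
}
```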
These are not just implementation details.
They are product decisions hiding inside implementation details.
Each answer teaches users what the inbox is for. Adding another status sounds harmless, but it changes whether users understand the inbox as a place to manage every possible message state or as a place to know what needs their attention now.
That distinction became the center of the project.
Was the inbox a triage surface, a task list, a communication hub, or a focused view of follow-ups?
Each answer led to a different product.
We could have said yes to every request and built a more powerful inbox in the narrow sense, exposing every possible property, status, and filter. It would have satisfied many individual asks, but it also would have made the product heavier: more configurable, less clear, and slower to understand.
The direction we chose was narrower: the inbox should show what needs attention now, not expose every possible message state.
That did not mean ignoring every other workflow. It meant choosing the product’s center of gravity.
If we said no too aggressively, we risked ignoring real pain and forcing users into a model that did not match their day. If we said yes to everything, we risked turning the inbox into a control panel instead of a product surface.
This is where product engineering starts: before the code, when deciding what the product should become.
Users know their pain. They know what feels slow, confusing, or broken. They know the part of the workflow that hurts them most.
But they usually see the product through that pain.
The engineer working on the product has to see the system around it: the behaviors we encourage, the concepts we teach, the complexity we expose, the default states we create, the support burden we accept, and the promises we make harder to change later.
AI reduces the friction between idea and implementation. A vague request can become a working feature very quickly. The requested field can be exposed. The requested setting can be added. The requested filter can appear.
And because it works, it can feel like progress.
But the useful speed is not how fast the model can generate code. It is how fast the team can learn whether the generated behavior belongs in the product.
Shipping quickly can be the right move when uncertainty is high and waiting is more expensive than being wrong. Sometimes a narrow solution for an important client is correct because it buys trust or reveals something the team could not learn abstractly. Sometimes the same move is wrong because it creates product debt disguised as responsiveness.
AI increases the speed of generation.
It does not automatically increase the quality of learning.
The job expands beyond the diff
I use AI heavily now.
I use it for technical research, exploring libraries, understanding unfamiliar parts of the system, analyzing product data, writing code, and turning early ideas into something I can react to.
It has made me faster, especially in the early stages of thinking. When I have a blurry idea, I can start exploring it immediately. I can compare approaches, generate a first version, ask questions, and use the output as material to think with.
That is genuinely useful.
The vulnerability I feel is not fear of AI, or a desire to protect an older idea of engineering from change. I am not anti-AI. Most days, I feel the opposite: I would not want to go back to working without it.
The uncomfortable part is more personal. AI weakens the comfort of craft as a complete identity. It takes a part of the work that used to feel like proof of competence and turns it into something easier to generate, easier to compare, and easier to treat as a draft.
That does not make engineering less important.
It makes the responsibility around engineering larger.
I do not trust AI output by default. I treat it like a pull request from another developer, and often with more caution, because the accountability does not move to the tool. It stays with me.
If another developer opens a PR, I review it. I try to understand the choices. I ask whether the implementation fits the system and whether the solution matches the intention.
AI-generated work deserves the same review. Maybe more.
AI can produce something that is correct inside the diff and wrong inside the product.
It can pass the tests and still make the experience harder to understand. It can implement the requested behavior and still add the wrong concept. It can follow sparse specifications and solve the narrow problem while ignoring the broader one.
That is not necessarily a failure of the tool.
It is a failure of the process around the tool.
The user does not appear automatically in the prompt. The business does not appear automatically in the generated code. The product model does not appear automatically because a ticket says “add option” or “add filter.”
Those things have to be brought into the work by the people responsible for the product.
The job expands into the places we used to treat as outside the diff: the default state, the empty state, the naming, the permission model, the migration path, the documentation, the support burden, and the user’s mental model.
The skills that matter more after AI are often mislabeled as soft.
Product sense is not decorative; it is the ability to see how a small setting can create a new workflow. Communication is not separate from the work; it is how we figure out what the work should be. Technical care still matters, but it has to be tied to the consequences of what we decide to build.
The product engineer after AI is not less technical. They are more accountable for what the technology makes possible: the behavior, the tradeoff, the user’s mental model, and the consequences.
AI can help us build. It can help us research. It can help us explore possibilities. It can make us faster, and sometimes it can make us better.
But it cannot decide what kind of product we are becoming.
I think back to that AI-generated PR.
The code was not embarrassing. It built. It passed tests. It was probably close to what we had asked for.
The problem was that the prompt had been too small.
We asked for an implementation when what we needed was a product decision. We had described the change, but not the behavior. We had specified the code, but not the user’s mental model.
That is the part AI has made harder to ignore.
When implementation becomes easier, weak product thinking has fewer places to hide.