Or: what this book actually teaches if you read it like an engineer, not a magician.
After my last post (Prompt Engineering Won’t Fix Your Architecture), a few people replied with variations of:
“Okay smart guy, but prompt engineering is a real skill.”
Yes.
And so is writing SQL that compensates for a bad schema.
That doesn’t mean we should build a career on it.
Recently I read Prompt Engineering (Lee Boonstra, Feb 2025) — the one with Chain of Thought, Tree of Thoughts, ReAct, temperature knobs, and enough diagrams to make you feel productive without shipping anything.
This post is not a dunk on the book.
It’s a reading guide for people who’ve been burned by production.
What the Book Gets Right (Surprisingly Well)
The book is honest about something most AI hype ignores:
LLMs are prediction engines, not minds.
They guess the next token. Repeatedly. Politely.
Everything else — “reasoning,” “thinking,” “deciding” — is scaffolding we bolt on top.
And the techniques it describes?
- Chain of Thought
- Step-back prompting
- Self-consistency
- ReAct
- JSON schemas
- Output constraints
These are not “AI magic tricks”.
They’re control systems.
If you’ve ever:
- Wrapped a flaky API with retries
- Added idempotency keys
- Forced a response into a schema so it wouldn’t explode later
Congratulations.
You already understand prompt engineering.
You just didn’t call it that.
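Here’s what that instinct looks like when you point it at a model instead of an API. A minimal sketch, assuming a hypothetical `call_llm()` stand-in and Pydantic for the schema; swap in whatever client and validator you actually use.

```python
from pydantic import BaseModel, ValidationError

class Invoice(BaseModel):
    vendor: str
    total_cents: int
    currency: str

def call_llm(prompt: str) -> str:
    """Placeholder for whatever client you actually use (OpenAI SDK, LiteLLM, ...)."""
    raise NotImplementedError

def extract_invoice(text: str, max_attempts: int = 3) -> Invoice:
    """Same move as wrapping a flaky API: constrain the output, validate it, retry."""
    prompt = (
        "Extract the invoice fields from the text below. "
        "Respond with JSON only, using keys vendor (string), "
        "total_cents (integer), currency (string).\n\n" + text
    )
    for _ in range(max_attempts):
        raw = call_llm(prompt)
        try:
            # The "schema so it won't explode later" part.
            return Invoice.model_validate_json(raw)
        except ValidationError:
            continue  # A retry, not a mystery -- same as any unreliable dependency.
    raise RuntimeError("No valid JSON after retries; fail loudly instead of guessing.")
```

Constrain, validate, retry, fail loudly. Nothing in that loop requires believing the model is smart.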
Prompt Engineering Is Middleware With Vibes
Here’s the reframe that made the book click for me:
Prompt engineering is middleware for probabilistic systems.
That’s it. That’s the tweet.
Every technique in the book exists to solve one of these problems:
- nondeterminism
- missing structure
- lack of contracts
- unpredictable retries
- side effects you didn’t ask for
In other words:
Distributed systems problems.
But instead of logs, you get paragraphs.
Instead of stack traces, you get confidence.
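Take nondeterminism. The book calls the fix “self-consistency”; a distributed-systems engineer would call it a quorum read. A rough sketch, reusing the same hypothetical `call_llm()` placeholder:

```python
from collections import Counter

def call_llm(prompt: str) -> str:
    """Placeholder for whatever client you actually use."""
    raise NotImplementedError

def self_consistent_answer(prompt: str, samples: int = 5) -> str:
    """Sample the model several times and keep the majority answer.

    The book calls this self-consistency; structurally it's a quorum
    read over a nondeterministic dependency.
    """
    answers = [call_llm(prompt).strip().lower() for _ in range(samples)]
    winner, votes = Counter(answers).most_common(1)[0]
    if votes <= samples // 2:
        # No clear majority: surface the disagreement instead of hiding it.
        raise ValueError(f"Model disagreed with itself: {dict(Counter(answers))}")
    return winner
```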
Why Prompt Engineering Feels So Powerful
Because it’s the first time many teams are forced to confront their own ambiguity.
When you write:
“If values conflict, choose the most reasonable one”
The model isn’t being smart.
It’s asking you why that conflict exists in the first place.
The book spends hundreds of pages showing how to cope with ambiguity.
It never claims to eliminate it.
That’s not a flaw.
That’s an accidental truth.
The Hidden Lesson in the Book (Nobody Tweets This Part)
The best prompts in the book all have something in common:
- Clear input formats
- Explicit schemas
- Narrow responsibilities
- Deterministic expectations
- Boring outputs
Which means the real lesson isn’t:
“Become a prompt wizard”
It’s:
“Your system finally needs boundaries, and AI won’t let you fake them anymore.”
Prompt engineering doesn’t replace architecture.
It forces you to design one, whether you like it or not.
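For concreteness, here’s an invented example of what “boring” looks like. The ticket-classification task and its schema are made up, but each line maps to one of the properties above: clear input format, explicit schema, narrow responsibility, deterministic expectations.

```python
# A deliberately boring prompt: one job, explicit schema, no room to improvise.
CLASSIFY_TICKET_PROMPT = """\
You classify a single support ticket. Do nothing else.

Input (between the markers):
<ticket>
{ticket_text}
</ticket>

Output: JSON only, with exactly these keys:
- "category": one of "billing", "bug", "feature_request", "other"
- "confidence": one of "low", "medium", "high"

If the ticket fits none of the categories, use "other". Do not add prose.
"""
```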
When Prompt Engineering Actually Is the Right Tool
Prompt engineering shines when:
- The task is fuzzy by nature (language, summaries, classification)
- The cost of being wrong is low
- The output is advisory, not authoritative
- You can retry safely
- You don’t pretend it’s deterministic
If you’re using it to:
- enforce business rules
- make financial decisions
- mutate production state
- replace domain logic
You didn’t discover intelligence.
You discovered technical debt that can speak.
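One way to keep that line sharp: let the model propose, and let deterministic code dispose. A sketch with invented names; the refund rule is purely illustrative, not a recommendation.

```python
from dataclasses import dataclass

@dataclass
class RefundDecision:
    approved: bool
    reason: str

def llm_suggest_refund_category(ticket_text: str) -> str:
    """Advisory only: the model proposes a label. Placeholder for a real call."""
    raise NotImplementedError

def decide_refund(ticket_text: str, amount_cents: int) -> RefundDecision:
    """The authoritative path stays deterministic; the model never mutates state."""
    category = llm_suggest_refund_category(ticket_text)  # advisory input only
    if category == "duplicate_charge" and amount_cents <= 5_000:
        return RefundDecision(True, "Duplicate charge under auto-approval limit")
    # Everything else falls through to explicit domain rules or a human.
    return RefundDecision(False, "Needs review: rule set does not cover this case")
```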
The Takeaway
Read the book.
Not as a spellbook.
Not as a shortcut.
Not as a career identity.
Read it like a systems engineer watching a new failure mode being born.
Prompt engineering won’t save your architecture.
But it will do something better:
It will stop letting you ignore it.
And honestly?
That’s probably why it feels so powerful.
😎
Top comments (33)
Great read!
I tried to fix this in my own projects by applying Separation of Concerns and Type Safety. I ended up building a boilerplate called Atomic Inference (Jinja2 + Instructor + LiteLLM) to handle exactly this. It separates the messy prompt logic from Python code and ensures the output is always structured.
It basically turns the guessing game into actual engineering. Curious to hear your thoughts on my repo: github.com/chnghia/atomic-inferenc...
Great approach—this really feels like a practical step toward treating LLM work as real engineering instead of trial-and-error. I like how Separation of Concerns and strong typing turn prompts into something predictable and maintainable, which is exactly what most projects are missing right now. I’m definitely interested in digging into Atomic Inference and seeing how this pattern could scale in more complex systems.
Thanks! Scale was actually the main reason I built this.
I’m currently running it in a Personal Agentic Hub with 10+ agents. The 'Atomic' approach makes composing them into larger workflows (in LangGraph) much less painful than managing a massive prompt inside the codebase.
Love this approach — designing for scale from the start really shows here. Running 10+ agents cleanly is no small feat, and the Atomic model feels like a genuinely practical way to keep complex workflows sane and maintainable.
The SQL-on-bad-schema analogy is clever, but incomplete. Bad schemas are accidental; probabilistic models are intrinsic. Prompt engineering isn’t compensating for a flaw—it’s adapting to a fundamentally different computation model. That’s closer to writing concurrent code or numerical methods than patching a mistake. Entire disciplines exist to manage nondeterminism.
That’s a fair distinction: nondeterminism isn’t a defect here, it’s a core property of the model. Prompt engineering is less about correction and more about applying established techniques for controlling stochastic systems—much like concurrency, optimization, or numerical stability.
I mostly agree with your framing, but I think you’re underselling something important: prompt engineering is architecture—just at the human↔model boundary. Yes, it exposes ambiguity, but designing constraints, interfaces, and failure modes at that boundary is real engineering work. We don’t dismiss API design because it’s “just middleware.” The fact that it looks like text doesn’t make it less structural.
Great point—prompt engineering is architectural work at the human–model boundary, shaping constraints, interfaces, and failure modes much like API design does. Treating it as “just text” misses how much system behavior and reliability are determined by that layer.
This hits hard. “Prompt engineering is middleware for probabilistic systems” is probably the cleanest framing I’ve seen. The part about prompts exposing weak architecture is painfully accurate — the model isn’t confused, the system is. Great read.
Thank you — I really appreciate that. I’m glad the framing resonated, and it’s encouraging to hear it landed clearly with someone who sees the architectural implications so sharply.
Yeah this is the first “prompt engineering” take I’ve read that actually lives in reality.
The SQL/bad schema analogy is brutal because it’s true: you can get results, but you’re basically building a career around compensating for upstream mess.
What clicked for me is your framing of prompts as control systems. That’s exactly it. Chain-of-thought, ReAct, schemas, “be consistent,” temperature… it’s not wizardry, it’s you trying to pin a probabilistic blob into something contract-shaped so the rest of your system doesn’t explode.
And the deeper point is even nastier (in a good way): LLMs don’t just fail weirdly — they expose how vague our requirements are. The model isn’t “smart” when it makes a call. It’s just forcing you to admit you never defined the boundary in the first place.
So yeah, prompt engineering is real… the same way duct tape is real. Useful, sometimes necessary, but if your whole product is duct tape, you don’t have a product — you have a future incident report waiting to happen.
This post is basically: stop worshipping prompts, start owning constraints. That’s the actual skill ceiling.
This nails it — treating prompts as control surfaces rather than magic spells is the only framing that scales in real systems. The uncomfortable truth you call out is the key insight: LLM failures aren’t anomalies, they’re diagnostics for undefined constraints, and mature teams learn to fix the system, not keep adding duct tape.
👌👌👌BRILLIANT!!!👍👍👍👍
This resonated more than most takes on prompt engineering. What clicked for me is the framing of prompts as control surfaces rather than “clever instructions.” Every technique people label as “advanced prompting” looks exactly like what we already do when integrating unreliable systems: constrain inputs, force schemas, narrow responsibility, and assume retries will happen.
The uncomfortable part (and the useful one) is that LLMs don’t let you hand-wave ambiguity anymore. If you say “pick the reasonable option,” the model just reflects your unresolved business logic back at you — confidently. That’s not intelligence, it’s a mirror.
With regards,
from a dev
I also appreciate the distinction you draw between advisory use and authoritative use. Prompt engineering works when failure is cheap and reversible; it becomes technical debt the moment we let prose substitute for domain rules. At that point we haven’t built AI — we’ve built a distributed system with no contracts and very persuasive error messages.
Framing this as “middleware for probabilistic systems” is probably the most honest description I’ve seen. Not a career path, not a spellbook — just another layer that forces engineers to finally design the boundaries they were getting away without before.
This is a really sharp take — I love how you ground prompt engineering in real engineering instincts instead of treating it like magic. The “control surfaces” framing especially matches how I think about designing for failure and retries in messy systems, and it sets a much healthier expectation for how LLMs should be used. I’m genuinely interested to see how this mindset shapes more concrete patterns or tooling, because this feels like the right foundation for building things that actually last.
This is the best framing I've seen. I've been on both sides of this.
When I started using Claude Code, my prompts were long and defensive - "don't do X, remember Y, watch out for Z." Classic symptom. The AI was exposing that my codebase didn't encode its own rules.
Now I put those boundaries in CLAUDE.md files: "handlers delegate to services," "use constants not strings," "this is the error handling pattern." The prompts got shorter. The AI got more consistent.
The prompt engineering didn't get better - the system did. The AI just stopped needing to be told what the codebase should have already said.
This resonates deeply—LLMs act like boundary detectors, surfacing implicit architectural debt the moment rules aren’t encoded in the system itself. Once invariants live in code and docs instead of prompts, the model simply amplifies good design rather than compensating for its absence.
Nice perspective on prompt engineering as middleware. The comparison to distributed systems problems makes sense for handling LLM quirks.
Thanks! That analogy really helps ground the discussion in familiar engineering tradeoffs, and I’m glad it resonated with how you think about system reliability and constraints.
Great article 👏👏
Thanks 😎
Great!
Thanks😎