Why Storytelling Is Harder Than Coding for LLMs

The real frontier of AI isn’t logic. It’s alignment with human intent.

The Perspective I Didn’t Expect to Matter

Fifteen years ago, I was sitting alone at a desk long after everyone else had gone home. The only light came from the screen, a cursor blinking against a blank document. I wasn’t debugging a function. I was wrestling with a story, searching for the one sentence that would cut through the noise and land somewhere real inside a stranger who would never know my name.

That was my life in communications. Storytelling wasn’t just a skill. It was the air I breathed. Crafting narratives in the dark, shaping tone for brands I’d never touch, translating the fog of someone else’s vague intent into something that would resonate in another person’s chest.

Now? I’m a software engineer.

Different world. Different languages. Or so I thought.

Recently, I’ve been circling back to long-form writing. Not the quick-hit posts that evaporate by lunchtime, but sustained work. The kind that needs bones. Pacing. A thread you can follow from beginning to end. And this time, I’m not doing it alone. I’ve got an LLM beside me.

That’s when it clicked: with one foot in code and the other in narrative, I could suddenly see where these models soar, and where they fall short.

Code Was Supposed to Be Hard

You’d think coding would be the thing that stumps an AI. It’s unforgiving. One misplaced semicolon, one misremembered function name, and the whole thing falls apart. Precision is everything. There’s no vibe that saves you from a syntax error.

So imagine my surprise when I watched an LLM generate a clean solution in seconds, something that would’ve taken me hours of reading documentation and second-guessing variable names. It refactors. It scaffolds. It sees patterns I miss when I’m too close to the problem.

Sometimes the output is cleaner than what I would’ve written myself.

That raised the real question: If coding is the easy part, what’s actually hard?

The Struggle with Storytelling

I wasn’t prepared for the answer to be storytelling.

LLMs can produce a polished paragraph about love or loss faster than any human alive. It looks right. It reads like writing.

But real storytelling, the kind that stays with you, is something else. It means:

  • Maintaining a character’s soul across twenty pages.
  • Building an emotional arc that doesn’t just rise, but breathes.
  • Landing a theme without hammering it.
  • Capturing intent, the thing you meant to say before the words got in the way.

The output is correct. But it doesn’t feel right. And if you’ve spent years in communications, tuning copy until it lands in someone’s gut, you notice that gap immediately. It’s like a note played perfectly on pitch, but without vibration.

Why Coding Fits the Machine Mind

Coding is a dream scenario for how LLMs work:

  • Clear structure. Rules. Syntax. Repeating patterns.
  • Immediate feedback. The compiler yells at you. The test fails. You know right away.
  • Decomposable problems. Big systems break into small functions.
  • Endless examples. Millions of repos. Stack Overflow threads. Real-world blueprints.

Coding looks like advanced intelligence from the outside. But from the inside, it’s a legible, rule-bound world. And LLMs thrive in legible worlds.

Why Storytelling Resists the Algorithm

Storytelling lives in the space between the rules.

There is no single correct answer. Feedback arrives slowly, subjectively, and often without words. Coherence isn’t local. It’s stretched across time, memory, and feeling.

Meaning lives in what’s unsaid, not just what’s typed.

Most of all, storytelling demands understanding what someone means, not just what they said.

The Missing Layer: Emotion

There’s another dimension we don’t talk about enough. Emotion.

LLMs know the language of sadness. They’ve read every description of a tightening chest, a quiet room, a silence that means more than screaming. They can simulate emotional language with unsettling accuracy.

But they don’t feel it. And that difference matters.

When humans write, emotion isn’t just content on the page. It’s feedback in the body. We adjust tone because something in us reacts. We feel when a sentence goes flat. We sense when a moment needs room before we can explain why. That loop between felt experience and the next keystroke doesn’t exist for an LLM.

It approximates emotion through patterns, not through living.

So you get prose that is technically flawless but emotionally untethered. Correct. But not quite right.

The Gap in My Own Workflow

When I’m coding with an LLM, the loop is tight and satisfying:

Prompt → Output → Refine → Better Output.

When something breaks, I see it. When it’s fixed, I know.
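
If you want to see why that loop is so satisfying, here is a minimal sketch of it in TypeScript. Everything in it is an assumption for illustration: generate stands in for whatever model API you call, and runTests for whatever check tells you the code works. Neither is a real library.

    // A minimal sketch of the tight coding loop: Prompt → Output → Refine → Better Output.
    // `generate` and `runTests` are hypothetical stand-ins, not a real API.
    type Generate = (prompt: string) => Promise<string>;

    async function refineLoop(
      task: string,
      generate: Generate,
      runTests: (code: string) => boolean,
      maxAttempts = 5,
    ): Promise<string> {
      let prompt = task;
      for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        const code = await generate(prompt); // Prompt → Output
        if (runTests(code)) return code;     // When it's fixed, I know.
        // When something breaks, I see it, and I feed the failure back in.
        prompt = `${task}\n\nThe last attempt failed its tests:\n${code}\n\nPlease fix it.`;
      }
      throw new Error(`No passing solution after ${maxAttempts} attempts`);
    }

The loop closes because runTests exists at all. There is no such function for a story.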

Storytelling is like trying to tune an instrument you can’t hear.

I’ll start with something vague: a mood, a character’s half-formed thoughts, actions I haven’t decided on, a turning point I can’t quite name. Then I ask the model to carry it forward.

The response arrives. It’s coherent. It’s well written. It’s… not it.

And I can’t always tell you why. I just know.

DeepSeek vs. ChatGPT and Claude

Recently, I put three models to the test on both coding and storytelling tasks: DeepSeek, ChatGPT, and Claude.

Coding? Comparable. They all know how to write me a perfect Chromie Squiggle in JavaScript.
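
To give a flavor of what that looks like, here is my own hypothetical approximation in TypeScript. It is emphatically not the actual Chromie Squiggles algorithm, just a squiggle-like dotted sine wave with a hue sweep; every parameter here is invented for illustration.

    // A hypothetical squiggle-like generative line, rendered as SVG.
    // Not the real Chromie Squiggles code; the shape and parameters are assumptions.
    function squiggleSvg(width = 800, height = 200, dots = 120): string {
      const amp = height / 4 + Math.random() * (height / 8); // wave amplitude
      const cycles = 2 + Math.random() * 3;                  // waves across the canvas
      const phase = Math.random() * Math.PI * 2;             // random horizontal offset

      const circles: string[] = [];
      for (let i = 0; i <= dots; i++) {
        const t = i / dots;
        const x = t * width;
        const y = height / 2 + amp * Math.sin(phase + t * cycles * 2 * Math.PI);
        const hue = Math.round(t * 360);                     // rainbow sweep along the line
        circles.push(
          `<circle cx="${x.toFixed(1)}" cy="${y.toFixed(1)}" r="4" fill="hsl(${hue},90%,60%)"/>`,
        );
      }
      return `<svg xmlns="http://www.w3.org/2000/svg" width="${width}" height="${height}">` +
        circles.join("") + `</svg>`;
    }

    console.log(squiggleSvg()); // redirect the output to a .svg file to view it

Whether this matches anyone’s idea of a perfect squiggle is beside the point. The structure is legible and checkable, exactly the kind of world these models thrive in.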

But storytelling, especially in Chinese, my first language, is where things got strange.

DeepSeek kept landing closer to what I meant. It caught the tone without me having to spell it out. It followed emotional direction with less friction between my intent and the output.

Even in English, there was a subtle edge.

And that raised a question I couldn’t shake: Is this about raw capability? Or is it about how the model thinks?

Language Shapes Thought, and Models Inherit That

Linguists have long argued that language doesn’t just describe reality. It shapes how we perceive it. Across roughly 7,000 human languages, meaning is encoded differently. Each one tilts the mind in slightly different directions.

Chinese, for example, tends to:

  • Compress dense meaning into fewer words.
  • Rely heavily on context and shared understanding.
  • Emphasize holistic, relational thinking over isolated categories.

If a model is trained deeply in that linguistic space, it doesn’t just learn vocabulary. It internalizes a different structure for meaning itself.

So when I catch myself thinking: DeepSeek just gets me.

Maybe what I’m really saying is:

My internal compass for meaning is closer to the way Chinese structures thought. And DeepSeek, by design or by consequence, operates closer to that same space.

Thoughts on AGI

If we define AGI as a system that matches or surpasses human capability across domains, then coding isn’t the benchmark we thought it was.

It’s a narrower problem. A solvable puzzle.

The real frontier lies elsewhere:

Can a model understand what we mean, not just what we say?

Storytelling exposes this gap mercilessly.

It requires:

  • Modeling human intention, not just human language.
  • Maintaining coherence across time, memory, and emotion.
  • Navigating ambiguity without a clear reward signal.
  • Aligning emotionally, not just logically.

These aren’t edge cases. They’re central to what it means to think like a person.

Storytelling is the original interface of intelligence.

It’s how we compress lived experience into shareable meaning. It’s how we coordinate with other minds. It’s how we make sense of being alive.

If a system can truly co-create stories with us, not just generate text, but align with intent and emotional direction, then it’s operating at a level of intelligence that deserves a different name.

A Shift in Perspective

I work in software engineering, and this realization sits uncomfortably in my chest.

It suggests that the skills we treated as uniquely human, structured problem-solving and logical reasoning, are more automatable than we wanted to believe.

And the things we treat as soft or secondary, narrative instinct, emotional taste, the ability to feel when something lands, may be closer to the core of what makes us human.

For years, we’ve measured AI progress by how well it writes code.

That made sense. It was visible. Testable. Impressive.

But it might not be the right frontier.

Because the harder problem, the one that still resists us, isn’t syntax. It isn’t logic. It isn’t even reasoning, at least not in the narrow sense we’ve defined.

It’s understanding.

Not the shallow, statistical kind where a model predicts the next word.

The deep, human kind. The kind where you know what someone is trying to say before they’ve fully said it, and you respond with the right weight. The right silence. The right recognition.

Right now, storytelling is where that gap glows brightest.

And that’s where the real work begins.

Originally published on Medium.