Deconstruction, Part Deux
Can Gen AI help improve media literacy?
When we launched Semafor three years ago, we debuted a story structure that we dubbed — what else? — the “Semaform.”
The idea was that we owed it to readers to show them what was fact, what was the reporter’s analysis, and what the best counterargument to that analysis was. Giving readers more visibility into the “scaffolding” of a story should, we thought, give them more insight into an issue and more confidence about the integrity of our work. I’m very proud of the format, and I think it’s by and large been successful in its mission.
Why can’t more stories be built that way?
Or at least, why can’t we help readers to look into the superstructure of the stories we construct so they can better see for themselves what assumptions and analysis we’ve leaned on as part of the narrative? Would that help make a dent in the filter bubble problem we already have and that AI will supercharge? A good editor can pick apart a story, even on an issue they’re not entirely familiar with, but we can’t provide a personal editor to every reader; could a machine do the same job?
It turns out — at least based on an experiment I did — that it can get pretty close.
I built a bot. (Of course I did.)
I’ll come back to how I built it in a bit, but here’s what it does:
It’ll identify the central theme of a story it’s given, whether it’s explicitly stated or not;
it’ll extract the key facts (or asserted facts — it can’t discern truth or falsity) supporting that thesis;
the key assumptions or analysis that support it;
and the core assumption that underpins the whole piece.
And if you ask it, it’ll also offer:
alternative framings for the same facts;
perspectives or voices that might be missing from the piece;
and key questions that could have but weren’t asked.
At its best, it’s the editor you hate but need; the one that tells you that you haven’t explored other explanations or gotten all sides of the story. But sometimes it’s also the slightly clueless copy desker who asks decent but obvious questions about a subject they don’t know anything about. (Which is what any number of reporters probably think of their managers…)
This could be a useful newsroom tool — and, eventually, something to help readers get more perspectives on a given issue.
And it does this mostly by parsing language — which Large Language Models have become frighteningly good at. (It also has some general knowledge about the world, and I’m still trying to figure out how much its “conventional wisdom” affects its output.)
(And aside: I’m building these bots not because I’m good at it; in fact, I think I established pretty convincingly that I’m not good at it. Nor am I doing this to show people how easy it is to build them, although I do think more journalists should be experimenting and testing the frontiers of the technology. Mostly I’m doing it to explore and understand how Gen AI’s capabilities can transform the news experience and news product, well beyond helping us do what we currently do faster and better.)
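For the technically curious: the whole thing is really just a prompt wrapped around a model. If you wanted to wire up a pass like the one described above in code rather than as a Claude Artifact, the skeleton might look something like this (a rough sketch, not my actual bot; the model identifier and output field names here are placeholders):

```python
# Rough sketch of a "Semaform This Story" pass against the Anthropic API.
# The model name and output fields below are placeholders, not the Artifact's actual setup.
import json
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

PROMPT = """Analyze the news story below. Return JSON with these keys:
  "central_theme": the story's thesis, stated or implied
  "asserted_facts": facts the story asserts (do not judge truth or falsity)
  "assumptions_and_analysis": the reporter's assumptions or analysis
  "core_assumption": the single assumption the whole piece rests on
  "alternative_framings": other ways the same facts could be framed
  "missing_perspectives": voices or perspectives absent from the piece
  "unasked_questions": questions that could have been asked but weren't

STORY:
{story}
"""

def semaform(story_text: str) -> dict:
    response = client.messages.create(
        model="claude-sonnet-4-5",  # placeholder model identifier
        max_tokens=2000,
        messages=[{"role": "user", "content": PROMPT.format(story=story_text)}],
    )
    # The model is asked for JSON; a real version should validate the result.
    return json.loads(response.content[0].text)
```

The plumbing is trivial; the prompt, and the back-and-forth to get it right, is where the work is.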
TRYING IT OUT
Here’s an example. I fed in a piece from The New York Times, about the setback for transgender rights when the Supreme Court ruled in favor of Tennessee in United States v. Skrmetti earlier this year. I realize this is, at least at this point in time, a highly contested area in the media and the public mind, but that makes it a good candidate for the tool. It’s a long story, and a tad complicated, but I’m deep in this subject for personal reasons, so I devoured it. And so did the bot.
The core thesis, the bot said, was:
The LGBTQ movement’s strategic decision to bring United States v. Skrmetti—a challenge to Tennessee’s ban on pediatric gender-affirming care—to the Supreme Court represents a significant tactical error that has set back transgender rights for potentially a generation.
That’s spot on in terms of what the piece concludes. It then listed a set of facts asserted in the piece, and then extracted a series of assumptions or analysis in it, including:
Characterizes the ACLU’s approach as driven by ideological purity rather than strategic pragmatism (refusing compromise, pushing expansive gender theories most voters don’t support)
Presents the movement as having become “consumed by theories of sex and gender that most voters didn’t grasp or support” during a period of cultural backlash
Frames the scientific evidence for pediatric gender medicine as “unraveling” and built on politically motivated rather than medically sound foundations
Portrays internal movement dynamics as increasingly dogmatic, attacking dissent and “closing ranks” rather than engaging with legitimate medical concerns
Suggests the case was brought despite better strategic alternatives being available
And then notes that the key assumption you need to accept to buy into its central thesis is:
Successful civil rights advocacy requires incremental, politically palatable strategies that align with public opinion and solid evidence, rather than pushing maximalist legal arguments on divisive issues with weak scientific foundations during periods of political backlash.
You can agree or disagree with the piece, but the bot certainly pulled out the core ideas in it.
And then I asked it for alternative framings for the same facts. It had more than a few, but a couple stood out:
Rather than depicting it as a movement that “bet its future” and “drove itself toward a cliff,” these events could be framed as a civil rights organization responding to urgent attacks on vulnerable youth when no other option existed.
The rapid increase in state bans beginning in 2021 could be presented not as evidence that the movement moved too quickly, but as proof that conservative forces were orchestrating a coordinated assault regardless of the movement’s strategy.
The ACLU’s legal strategy could be framed as principled advocacy defending civil rights even in challenging circumstances, rather than a strategic error. With over 1,500 anti-trans bills introduced since 2021, legal challenges were inevitable—the question was which case would reach the Supreme Court, not whether one would.
In other words, a case was coming to the Supreme Court; the only question was which one, and who would bring it. And as someone who’s been tracking all of this closely, that seems like an entirely fair alternative framing that was absent from the piece.
Not that the bot is or should be adjudicating what the right framing for any story is; all stories need frames, and you have to pick one. You can’t on-the-one-hand-and-on-the-other-hand every piece. But it can certainly help readers (and editors, and reporters) see other points of view.
It offered some unasked questions as well:
How should medical uncertainty be weighed against patient autonomy and parental rights in healthcare decisions?
What would constitute sufficient evidence to support gender-affirming care?
All — at least from my point of view — entirely reasonable suggestions. And could that help readers at least consider different perspectives and framings of events, especially if some version of this capability were built into how major stories are presented?
WHAT’S INSIDE
Sure, it’s not perfect as a media literacy tool, by any stretch of the imagination. I’ve tested it on a number of stories, and it works pretty well, but it also asks obvious questions and has a very limited sense of the real world and current affairs. (And I did build it during a spurt of stable Wi-Fi on the Amtrak from New York to DC, so I imagine some actual engineering might make it much better.)
But it does a pretty good job, based essentially on analyzing language — which is, of course, what LLMs keep getting frighteningly better at. In fact, this is my second attempt at building a “Semaform This Story” bot; the first version, which I made about a year ago, could extract facts and assumptions adequately enough, but not with any real level of insight.
Since then, the systems have leapt ahead in capability. Claude Sonnet 4 and 4.5, in particular, continue to surprise me with how well they handle language.
I built this by starting an extended conversation with Sonnet 4.5 about how it might discern an asserted fact from an assumption, how it could parse a story into those two groups, and how it could look for alternative framings. This took a little effort. For example, it accurately flagged “studies say men are smarter than women” as an asserted fact; an editor or reader could ask for those (non-existent) studies, and the reporter is at least presenting the claim as if there were evidence behind it. But it stumbled on “men are smarter than women,” since that, too, sounds like an asserted fact. And so we wound up going back and forth about how assumptions often get slipped into stories by presenting them as by-the-way truths. It was, without anthropomorphizing it too much, a delightful and insightful discussion. And once we got to a reasonable place about what to look for in a story, I asked it to build an Artifact based on that conversation. And it did, using Sonnet 4.0. A little tweaking here and there, and it was done. (Again, no software developer need worry I’m taking their job.)
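To give a flavor of the distinction the prompt has to draw, here’s the kind of rule we circled around, written out as an illustration (my rough paraphrase, not the Artifact’s actual wording):

```python
# Illustration only: the fact-vs-assumption rule, roughly as the prompt has to encode it.
CLASSIFICATION_RULES = """
An ASSERTED FACT is a claim the story presents as checkable: it points, even
vaguely, to evidence a reader or editor could ask to see.
  e.g. "Studies say men are smarter than women."  -> asserted fact
       (the studies may not exist, but the story is claiming evidence)

An ASSUMPTION is a claim presented as a by-the-way truth, with nothing behind
it to ask for.
  e.g. "Men are smarter than women."  -> assumption presented as fact

Do not judge whether either claim is true; classify only how the story
presents it.
"""
```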
QUESTIONS AHEAD
There are broader questions and capabilities that come out of this: Can this, along with the tool I built the previous week, help nudge us out of our filter bubbles, at least at the margins? Does it point to the capability to build news literacy tools at scale? Can it help reporters, pre-publication, look for angles they may have missed? Could this be a built-in browser extension that gives readers more perspectives? Can this help us imagine other products that could use sophisticated language-analysis capabilities? How accurate is it? And equally: How much bias is built into Claude itself, so that it sees the world in a certain way? Would it “both sides” issues that are well settled?
I don’t know, but it would be nice to have that conversation. And not just with Claude.
Meanwhile, I wrote a fictional story about a scientist who created a “transporter” beam that could revolutionize interplanetary travel to see how well the bot would analyze a completely made-up tale. I named the inventor James Doohan and his young assistant — and test subject — George Takei.
The bot instantly recognized those names as cast members on Star Trek — and told me it had some real questions about the veracity of the story. But then it humored me anyway. At least it’s an editor who can take a joke.


