Just Asking Questions
When coding is cheap and data is plentiful, where does value lie? Good question.
Or rather, good questions.
I was thinking about this as I was looking over Semafor Intelligence, the new product we (Semafor is my other gig) launched last week that distills the collective insights of the 300+ people who spoke on stage at our massive Semafor World Economy conference. (There are a lot of Semafors in one paragraph, but then again, I’m writing about something we did; also, it is, objectively speaking, a great news organization.)
So Semafor Intelligence is a great way to understand how movers and shakers see the world on a host of key issues (AI, employment, markets, etc.), but what’s just as fascinating is how we put it together. And how quickly.
And that has real lessons for where the field is going, and where we create value.
Bear with me on the backstory. The conference wrapped on Friday afternoon; we all breathed a collective sigh and took a well-deserved 15 minutes off. Meanwhile, Reed Albergotti, our tech editor, decided that what he really wanted to do that following Sunday was figure out what all our panelists had said. After all, with speakers scattered across three simultaneous stages over five days, who could have listened to all the discussions?
So he threw together — in an hour or two, using OpenAI’s Codex tool — a site that pulled all the video and transcripts of the panels with the goal of seeing what the main themes were. It was a pretty impressive prototype, not least given how fast he put it together; and then Alastair Clements, our data lead, put in another day and a half of work to make a production-ready version. The reporters dived in, mined it for insights, wrote them up, and here we are.
We wrote about that process here:
The tool analyzed every transcript, pulled out every distinct claim each speaker made, and turned each one into a numerical fingerprint that captures meaning rather than wording. This technology is called “embedding” or “vectorizing” and is becoming a common way to process large amounts of text to understand the relationships between the ideas in a corpus: It essentially assigns ideas to a complex string of numbers, then uses those numbers to understand the semantic proximity between ideas. This proximity map was used to help refine the report. The tool then used multi-agent reasoning to surface direct quotes from speakers that support or push back on the central themes.
Semafor’s journalists then reviewed every theme: stress-testing the premises, interrogating the supporting quotes, and editing down to the ones most clearly supported by what was actually said.
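For readers who want a feel for the "numerical fingerprint" idea in that passage, here is a minimal sketch of how semantic proximity is typically measured once claims have been turned into vectors. Everything in it is an assumption for illustration: the three claims are invented, and the three-number vectors are written by hand, where a real system would get much longer vectors from an embedding model. It is not the code Reed and Alastair wrote.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical claims with hand-written toy "fingerprints".
# In practice an embedding model would produce these vectors.
claims = {
    "AI will displace many office jobs": [0.9, 0.1, 0.0],
    "Automation threatens white-collar work": [0.8, 0.2, 0.1],
    "Bond markets are underpricing risk": [0.0, 0.1, 0.9],
}

def nearest_claim(target, claims):
    """Return the claim whose vector is most similar to the target claim's vector."""
    others = (text for text in claims if text != target)
    return max(others, key=lambda text: cosine_similarity(claims[target], claims[text]))

closest = nearest_claim("AI will displace many office jobs", claims)
print(closest)
```

The two jobs-related claims share almost no words, but their vectors point in nearly the same direction, which is what lets a tool group them under one theme.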
There are many ways to describe this, but insane would not be far off. A few years ago, a project like this would have taken weeks if not months to put together, at a cost of tens of thousands of dollars or more; it was built in two days at negligible cost. (Tech entrepreneur Rami Alhamad had done much the same thing for everything that was said at Davos, too, although in a more condensed format. I wrote about it here.)
Which is to say, first, that tech development, even for public-facing products, is becoming much cheaper and much faster, and that shift is happening more quickly than we might realize. And that the key constraint isn’t technologists or money, but imagination. Reed had a great idea, and he wanted to see what it looked like, and an hour later, he could. This isn’t a gather-requirements-and-get-it-on-a-roadmap process; it’s rapid prototyping on steroids. It helps that Reed is deep in the tech world and Alastair is a genius; but anyone can do this.
As Alastair notes in the story we published about the backstory:
The most interesting thing about this project is not the technology we built, but the fact that Reed and I built it in a matter of days. Four years ago, an analysis like this would have required a data science team, weeks of scoping and implementation, and a six-figure budget; now, it takes a journalist with limited engineering background and a data lead with one foot in editorial. We needed some knowledge of vector databases, embedding, and development, but we were still able to go from prototype to a viable product in less than two days.
And second, AI can unlock real value and new capabilities if you, again, have enough imagination about what it can do. In this case, that meant analyzing hundreds of transcripts to pull out key ideas and claims, seeing where people agreed and disagreed, and mapping a week’s worth of discussions.
It’s not an efficiency tool or something that helps us do what we already do better; it’s a system that lets us do completely new things.
On the other hand, if anyone can build this — and anyone can — where does competitive advantage lie? I note in that same piece:
We don’t have a monopoly on this, of course; anyone can build this tool and do the same analysis. Ultimately it comes down both to what information you have that isn’t public, and the quality of the questions you ask the system to track — skills that humans have that machines don’t.
If you have proprietary data, that’s one barrier to entry. You can analyze what you have, and no one else can. In our case, though, we streamed the event, so in fact anyone can build the same thing. (Except we already did it, so don’t waste your time.)
And so how do we, or anyone else, stand out from the rest of the pack?
It’s in having the imagination to build this, of course; but the real value is in the questions you ask of it.
You don’t have to restrict this to transcripts; you can port whatever documents you want into it: for example, everything that the CEOs of the major AI companies say in public. You can ask the system to look for things they say in common, or where they disagree; you can set up a hypothesis about what views they may hold in the future, and track over time how much they are trending towards or away from it; you could look for the view that’s the most isolated from the consensus, or the CEO who’s most determinedly charting their own course in public statements.
The point is, systems like this allow you to ask questions you couldn’t ask before. And in a world where anyone can create systems and query them, the real value is knowing what to ask of them.
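To make two of those questions concrete, here is a rough sketch of how "most isolated from the consensus" and "trending towards a hypothesis" might be computed. It rests on assumptions: each speaker's views are already summarized as a vector, the "CEO A" through "CEO D" labels and all the numbers are made up, and the consensus is taken to be a simple average. A real system could define any of these differently.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def centroid(vectors):
    """Average the vectors column by column; used here as a stand-in for 'the consensus'."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def most_isolated(views):
    """Return the speaker whose vector is least similar to the group average."""
    center = centroid(list(views.values()))
    return min(views, key=lambda name: cosine_similarity(views[name], center))

def drift_toward(hypothesis, statements_over_time):
    """Score each dated statement by its similarity to a hypothesis vector."""
    return [cosine_similarity(vec, hypothesis) for vec in statements_over_time]

# Hypothetical speakers and toy two-number view vectors.
views = {
    "CEO A": [0.9, 0.1],
    "CEO B": [0.8, 0.2],
    "CEO C": [0.85, 0.15],
    "CEO D": [0.1, 0.9],
}
outlier = most_isolated(views)

# Hypothetical: one speaker's statements at three points in time,
# scored against a hypothesis vector.
hypothesis = [0.0, 1.0]
scores = drift_toward(hypothesis, [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]])
print(outlier, scores)
```

Rising scores would suggest the speaker is moving towards the hypothesis, and falling scores the opposite. The arithmetic is the easy part; choosing which hypothesis to track is the question only you can supply.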
Any questions?


