Involving AI in high-stakes decisions in real organizations
Hosted by the House of Innovation, this breakfast seminar centered on two distinct but complementary perspectives:
- how AI tools are built and deployed in industry
- what recent research reveals about their real-world consequences for firms and individuals
Moderated by Sam Cao, Assistant Professor at the House of Innovation, the seminar brought together an academic researcher and an industry practitioner who have each been working at the frontier of AI-assisted and AI-automated decision-making.
The conversation ranged from the technical architecture of AI voice agents to the organizational conditions that determine whether these tools create value or quietly introduce new risks.
What united the two speakers was a shared conviction: that access to AI tools is no longer the limiting factor. The harder questions are about design, deployment, and judgment. In short, it's about who the tool is built for, what it's measuring, and whether the humans working alongside it have the expertise to know when to trust it and when to push back.
From neural networks to decision systems
The first speaker, Lele Cao, Senior Principal AI Researcher at King (part of Microsoft Gaming), opened with a candid reflection on his own journey into AI research. It started in the early 2000s, when a math teacher told him neural networks were a dead end. That advice, he noted, turned out to be poorly timed.
Drawing on his work at EQT, Cao walked through what it takes to build AI tools for high-stakes decisions in practice. He framed the challenge around three practical questions: who the tool is actually for, what specific pain it addresses, and how success or failure will be measured.
His advice was direct: start narrow, know your user, and, ideally, be one yourself. At EQT, rather than trying to solve the full pipeline of investment analysis from the outset, the team focused first on one clearly defined problem: evaluating startup potential.
He also raised a concern that drew visible recognition from the audience: the risk of over-trusting AI in production environments.
During a late-night deployment of an agentic research paper review system, he allowed an AI agent to modify previously validated logic without an in-depth review. The result was two hours of troubleshooting at 02:00. The lesson, he said, was not to abandon AI tools completely but to keep humans in the loop, especially at the boundaries where context matters most.
He then posed another, perhaps more worrying, question: if AI handles the routine work that has historically helped junior professionals develop judgment, how will the next generation of senior decision-makers build the expertise to know when AI is wrong?

What the data shows about using AI agents in hiring
The second speaker, Luca Henkel, Assistant Professor of Finance at Erasmus University Rotterdam, presented findings from a field experiment conducted in partnership with a large recruitment firm that processes a high volume of applicants annually in a single market.
His research sits at an intersection he finds particularly interesting: AI tools that do not merely assist a human task but automate it entirely. One example is a tool that replaces the human recruiter in the initial interview stage with an AI voice agent. The system combines a large language model that generates interview content, text-to-speech synthesis, and speech recognition to conduct real-time, natural-language conversations with applicants.
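To make that pipeline concrete, here is a minimal, illustrative sketch of the turn loop inside such a voice agent. All component names (DummyASR, ScriptedLLM, DummyTTS, run_turn) are hypothetical stand-ins for whatever ASR, LLM, and TTS backends a real system would use, not the system from Henkel's study: speech recognition transcribes the applicant's answer, the language model decides what to say next, and text-to-speech turns that into audio.

```python
# Illustrative sketch of one turn of an AI voice interviewer:
# speech recognition -> language model -> text-to-speech.
# All components below are toy stand-ins, not a real deployment.

from dataclasses import dataclass, field


@dataclass
class InterviewState:
    """Running transcript as (speaker, text) pairs, oldest first."""
    transcript: list[tuple[str, str]] = field(default_factory=list)


class DummyASR:
    """Stand-in for speech recognition; a real ASR consumes audio frames."""
    def transcribe(self, audio: bytes) -> str:
        return audio.decode("utf-8")  # pretend the "audio" is already text


class ScriptedLLM:
    """Stand-in for the LLM that generates the interview content."""
    QUESTIONS = [
        "Could you walk me through your most recent role?",
        "What attracted you to this position?",
        "Thank you, that is all for today.",
    ]

    def next_utterance(self, state: InterviewState) -> str:
        # Pick the next scripted question based on how many the agent has
        # asked; a real LLM would condition on the whole transcript instead.
        asked = sum(1 for speaker, _ in state.transcript if speaker == "agent")
        return self.QUESTIONS[min(asked, len(self.QUESTIONS) - 1)]


class DummyTTS:
    """Stand-in for speech synthesis; a real TTS returns an audio waveform."""
    def synthesize(self, text: str) -> bytes:
        return text.encode("utf-8")


def run_turn(audio_in: bytes, state: InterviewState,
             asr: DummyASR, llm: ScriptedLLM, tts: DummyTTS) -> bytes:
    """One conversational turn: hear the applicant, decide, speak back."""
    answer = asr.transcribe(audio_in)
    state.transcript.append(("applicant", answer))
    utterance = llm.next_utterance(state)
    state.transcript.append(("agent", utterance))
    return tts.synthesize(utterance)


if __name__ == "__main__":
    state = InterviewState()
    reply = run_turn(b"I spent two years in customer support.",
                     state, DummyASR(), ScriptedLLM(), DummyTTS())
    print(reply.decode("utf-8"))
```

Keeping the transcript as the only shared state is what makes such a pipeline modular: each component can be swapped independently, and every turn is auditable after the fact.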
One finding surprised even the research team: when applicants were given the choice between an AI interviewer and a human one, most chose the AI.
"The convenience of having an AI agent available around the clock most likely played a role here," Henkel reasoned.
The data also showed that AI collects information differently than humans do: its interviews are more structured and consistent while remaining responsive to individual applicants. That is an advantage in homogeneous, entry-level hiring contexts where consistency matters. In roles requiring nuanced judgment, however, for example where candidates differ significantly or the criteria are not fully predefined, that same standardization becomes a limitation.
Henkel was careful about the scope of these findings. The results apply to entry-level positions where language skills are the central signal. He does not believe the current technology is ready to replace human judgment in senior or complex roles, and stressed that, in the study, hiring decisions remained with human reviewers.
A panel conversation: accountability, inequality, and the human element
The audience kept the conversation moving across a range of connected issues.
On the question of AI in high-complexity strategic decisions, Lele Cao acknowledged the limitations but pushed back on the idea that AI has no role. Structured context and well-defined information architecture can extend what language models do usefully, he argued, though human oversight remains nonnegotiable.
As he put it: if AI makes the decision, it becomes too easy for humans to disclaim responsibility.
The panelists spoke at length about accountability. Henkel noted that one of the clearest remaining advantages of human decision-makers is that they can be held responsible for their choices. How accountability works when an AI system makes or shapes a consequential decision is still an open question.
The audience also raised the question of inequality: does AI widen the gap between large, well-resourced organizations and smaller ones? Perspectives in the room diverged.
Henkel noted that the evidence is still mixed. AI can augment experienced workers and support less experienced ones, he argued, though it is not yet clear which force dominates.
One attendee took the opposite view, arguing that smaller companies may benefit most, given their ability to move quickly and act without the overhead that slows large organizations.
Henkel also flagged a structural risk for applicants that has received less attention: if AI makes recruitment interviews essentially costless for companies, nothing prevents hiring processes from expanding to ten or twelve rounds. The cost burden then shifts entirely to applicants, which could make job searching significantly more onerous.

What collaboration between research and industry makes possible
The seminar showed what's possible when academic researchers and industry partners put their heads together. Rather than debating AI in the abstract, the morning was grounded in experimental data, live deployments, and the practical decisions that come with building and evaluating these tools in real organizations.
The questions from the floor reflected how much is still unsettled: how soft skills can be reliably assessed at scale, whether AI will ultimately compress or widen capability gaps, and whether the technical architectures underlying today's tools are approaching their limits.
As both panelists acknowledged, the answers depend heavily on context. What AI does well in a standardized, high-volume hiring process looks very different from what it can offer in a boardroom. Getting that distinction right, and building the human expertise to judge it, may matter more than the tools themselves.
Thank you to Lele Cao, Luca Henkel, and Sam Cao for a thought-provoking and engaging morning at the Stockholm School of Economics!
