Three Former Intelligence Officers on What Intelligence Tradecraft Teaches Us About Generative AI
By John O’Neil, Jim Lawler, and Mike Mears
Generative AI should be managed like a human source: useful, fast, sometimes brilliant, sometimes wrong, and never a substitute for disciplined questioning and human judgment.
The three of us spent our careers in an environment where bad information costs lives. We learned early that the most dangerous source isn’t someone who lies to you. It’s someone who tells you what you want to hear—and does it convincingly. As we watch organizations race to adopt generative AI, we keep seeing the same mistake: treating these tools like oracle machines rather than sources that need to be run.
We are not AI experts. We are not here to debate model architectures or training data. What we know is how to extract reliable insights from sources whose motivations can’t be fully verified, whose outputs may be biased or based on incomplete information, and whose reliability must be continuously earned. That is exactly the problem organizations face with AI today.
This is what HUMINT tradecraft has taught us—and what it has to teach anyone who wants to get honest, useful work from a generative AI system.
Early in our careers, two of us ran sources who were brilliant, well-placed, articulate, and deeply motivated. They produced detailed, confident, and consistent reporting. Senior analysts loved them. Their product sailed through review. For months, everything they said checked out—until it didn’t.
The problem wasn’t that they were lying, exactly. In both cases, they filled gaps with inference. They’d learned what we wanted to hear, and their natural intelligence and experience let them produce it fluently. The reporting wasn’t fabricated—it was confabulated. Coherent and plausible, but in key places, wrong.
We’ve all seen this pattern in the early months of AI adoption. The tool is fast. It’s articulate. It never pauses, never says “I’m not sure,” and it formats its answers with the confident authority of a briefing document. A recent Science study found that across eleven state-of-the-art AI models, sycophantic behavior—affirming users’ views even when inaccurate—was widespread and measurable. Stanford researchers found that AI systems trained on human preference feedback are systematically rewarded for being agreeable rather than correct, because agreeable outputs receive higher ratings. The models learn to please.
We’ve seen that source before. We know how the story ends.
Before you run a source, you select one. That's a discipline in itself, and one where AI tools may in fact add value: identifying and sorting exploitable stressors (anything that causes stress) and then outlining for case officers which levers to pull in a recruitment. You don't recruit someone simply because they have access. You also generally don't recruit happy people. You have to evaluate reliability, motivation, and susceptibility to manipulation. A source with wide access and poor judgment can be more dangerous than no source at all.
The same applies to AI. Not all AI systems are created equal for every task or mission. Each must be evaluated on access, expertise, responsiveness, and the quality of reporting—and the last criterion is harder to assess than it appears.
A few selection questions worth building into any AI adoption process:
Choosing an AI because it’s fast or because leadership read about it in a business magazine isn’t source selection. It’s the equivalent of recruiting the first walk-in who shows up at the door.
One of the first lessons a new case officer learns is that interrogation and elicitation are not the same. Interrogation demands. Elicitation draws out. A blunt question produces a guarded answer. A layered conversation yields insight the source didn’t realize they were sharing.
Most people using AI are interrogating it. “What’s the answer?” “Summarize this.” “Give me options.” That approach works, up to a point, but it caps the quality of what you get.
Effective elicitation with AI means:
This turns AI from a content generator into something closer to a thinking partner. But it requires the same discipline as running a source well: preparation, precision, and the intellectual humility to recognize that your framing shapes what you get back.
There is a risk the standard AI adoption literature doesn’t spend enough time on. In intelligence work, we worry not just about sources who are wrong—we worry about sources who have been co-opted or doubled, or who are feeding us what we want to hear because they’ve learned our preferences and decided that’s what keeps the relationship alive.
AI systems have structural analogs to all three failure modes:
The practical implication: approach your AI system with the same structured skepticism you’d bring to a well-placed source who has given you no reason to doubt them. That’s when discipline matters most.
After every source meeting, a case officer writes up not only what the source said but also their assessment of reliability—what was corroborated, what was assumed, and what needs follow-up. That habit is the difference between a professional intelligence organization and a rumor factory.
Most organizations using AI lack an equivalent discipline. Someone prompts the model, takes the output, and puts it in a slide. No one records what was asked, what caveats the model offered, or whether the output was independently verified. The result is institutional memory built on unexamined reporting.
A working AI reporting protocol should mirror the post-meeting debrief:
The review step is the one that organizations most consistently skip. But it’s where calibration happens. A source you never debrief after the fact is one whose reliability you can never actually assess.
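The debrief habit described above could be captured in a small record kept for every AI-assisted output. The sketch below is illustrative only: the `AIReport` class, its field names, and the `reliability` helper are hypothetical, not an established standard, and a real protocol would need fields suited to the organization using it.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class AIReport:
    """One debrief record for a single AI-assisted output (hypothetical schema)."""
    asked: str                                              # the prompt, as given
    answer_summary: str                                     # what the model reported
    caveats: List[str] = field(default_factory=list)        # caveats the model offered
    corroborated: List[str] = field(default_factory=list)   # claims checked independently
    assumed: List[str] = field(default_factory=list)        # claims accepted unchecked
    follow_up: List[str] = field(default_factory=list)      # open questions
    reviewed_on: Optional[date] = None                      # date of after-the-fact review
    held_up: Optional[bool] = None                          # review verdict; None = not reviewed

    def is_verified(self) -> bool:
        """True only if something was corroborated and nothing was merely assumed."""
        return bool(self.corroborated) and not self.assumed


def reliability(reports: List[AIReport]) -> Optional[float]:
    """Share of reviewed reports that held up; None if nothing has been reviewed."""
    reviewed = [r for r in reports if r.held_up is not None]
    if not reviewed:
        return None
    return sum(1 for r in reviewed if r.held_up) / len(reviewed)
```

The design choice worth noting is that `reliability` returns `None` rather than a number when no report has been reviewed: a tool nobody has debriefed has no track record, which is different from a bad one.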
A useful team habit before closing out any AI-assisted analysis: “Before we accept this answer, what would disconfirm it?” That question alone will catch more errors than any amount of AI governance policy.
Separating collection from analysis is a fundamental discipline in intelligence work, and it translates directly. AI is a tool for collecting and synthesizing. It can ingest, summarize, organize, and compare. What it cannot reliably do is interpret: to ask what the information means here, in this context, for this organization, with these constraints.
The error organizations make is treating AI as if it collapses the divide between collection and analysis. It doesn’t. It accelerates collection. The analytical function—applying judgment, context, institutional knowledge, and accountability—remains human.
Teams that hand over analytical responsibility to AI are not just making an efficiency error. They are making an accountability error. Someone has to own the conclusion. AI cannot.
Knowing when to end a source relationship is the part of the tradecraft literature on AI that doesn't exist yet, and it needs to.
Every experienced case officer has had to decide to terminate a source relationship. Not because the source was obviously lying—if that were clear, the decision would be easy. You terminate when the source’s reliability has fallen below a threshold, when you have reason to believe the source has been compromised, or when the cost of continuing to run them outweighs the value of their reporting.
The equivalent decisions will come for AI systems, and organizations should prepare for them:
Burning a source is not a failure of the source-handling relationship. It is often the proof that the relationship was being handled well.
The three of us came to this issue through intelligence work, but the problem is not limited to intelligence organizations. Any leadership environment where AI tools are proliferating faces the same structural challenge: the tools are fast, fluent, and confident, and organizational incentives often reward those who use them most rather than those who use them best.
The research bears this out. INSEAD’s 2025 analysis of firm-level AI adoption found that generative AI shifts value toward higher-order human judgment—not away from it. Microsoft’s research confirms that organizations with a well-calibrated understanding of AI perform better across missions than those that simply maximize usage. The tool is the easy part. The discipline is the hard part.
For leaders, the implications are practical:
Used with discipline, generative AI can be a genuinely powerful analytical partner—the kind of well-placed, high-access source that an experienced handler learns to work with carefully and derive real value from. Used without discipline, it becomes a certainty-destroyer—introducing noise, eroding judgment, and producing false confidence at scale.
The HUMINT model doesn’t make AI safer by limiting what it does. It makes AI safer by raising the standard for what we do with what it gives us.
AI doesn’t give you answers. It gives you reports. And reporting always requires a handler’s skeptical, trained eye.
About the Authors
The authors are former national security officers with combined experience across human intelligence operations, national laboratories, and management. Their views are their own and do not represent the position of the Central Intelligence Agency or the United States government.
Mike Mears is a leadership expert, bestselling author, creator of LeadCultureChange.com, and former CIA Chief of Human Capital. As the founder of the CIA Leadership Academy, he trained managers and senior executives in practical leadership strategies grounded in neuroscience and human behavior. Mears holds an MBA from the Harvard Business School and a BS degree from the US Military Academy at West Point.
James “Jim” Lawler served for 25 years as a CIA operations officer in various international posts and as Chief of the Counterproliferation Division’s Special Activities Unit. He was a member of the CIA’s Senior Intelligence Service from 1998 to 2005. Lawler was a specialist in the recruitment of foreign spies, and he spent over half of his CIA career battling the proliferation of weapons of mass destruction, including serving as the chief of the A. Q. Khan Nuclear Takedown team, which resulted in the disruption of the most dangerous nuclear weapons network in history.
John O’Neil, Ph.D., has extensive service in numerous leadership roles in academia and at Oak Ridge National Laboratory, where his work lay at the intersection of critical science and technology development, intelligence, issues of disruptive technical and WMD proliferation threats, and national security. He and his distinguished teams delivered numerous mission-critical insights and solutions for intelligence, defense, and homeland security.