AI for Experts: Right Not to Trust It, Wrong to Avoid It

AI for experts works best when managed like a team. Learn why senior professionals are right to be cautious and how briefing, delegating, and reviewing changes everything.

Dr. Sara Manafzadeh

Stop thinking of generative AI as a tool you operate. Start thinking of it as a team you manage. The research, and the failures, both point the same way.

I keep noticing the same thing, and for a long time it puzzled me.

The people who hold back most from generative AI are often the most capable ones in the room. Senior researchers, experienced managers, people with two degrees and twenty years of judgement. They have the most repetitive work to offload and the most expertise to direct a tool well. On paper, they should be the fastest adopters. In practice, they are often the most reluctant.

My first instinct was to find this frustrating. Then I read the evidence, and I changed my mind about half of it. Their caution is not laziness, and it is not fear of technology. It is, in large part, correct. What they get wrong is the conclusion they draw from it, and the reason they get it wrong is that they are picturing the wrong thing entirely.

The caution is backed by data

First, give the experts their due, because the data does.

Start with who actually benefits from generative AI. In the largest workplace study of its kind, Erik Brynjolfsson, Danielle Li and Lindsey Raymond examined 5,179 customer-support agents. In the published version (Quarterly Journal of Economics, 2025), access to an AI assistant raised productivity by about 15 per cent on average, but the gain was concentrated dramatically: roughly 30 to 36 per cent for the newest and least-skilled workers, and little or none for the most experienced, who even showed a slight dip in conversation quality. A landmark MIT writing experiment by Shakked Noy and Whitney Zhang (Science, 2023) found the same shape: average time fell 40 per cent and quality rose, while inequality between workers narrowed because the tool helped the weakest most.

In other words, if you are already excellent, generic AI assistance genuinely does less for you. The experts sensing "this is not really built for me" are reading the data correctly.

It goes further. In the Boston Consulting Group and Harvard study widely known as the "Jagged Frontier" (Dell'Acqua and colleagues, 2023, since published in Organization Science, 2025), 758 elite consultants produced more than 40 per cent higher-quality work with GPT-4 on tasks that sat inside the tool's competence. But on a task deliberately chosen to fall outside it, consultants using AI were 19 percentage points less likely to reach the correct answer than those working without it. The tool was confidently, fluently wrong, and capable people followed it off a cliff.

And in a 2025 randomised trial by METR, sixteen experienced open-source developers working on their own mature codebases were actually 19 per cent slower with AI tools, even as they believed the tools had sped them up by 20 per cent. That gap between felt productivity and measured productivity should give anyone pause. It is an early-2025 snapshot and not yet peer-reviewed, but it is a properly controlled experiment.

So when a senior professional says "I tried it, it confidently invented something false, and I do not trust it," they are not being a Luddite. They are describing a documented failure mode.

But they are managing the relationship wrong

Here is where I part ways with them.

The mistake is treating AI as a tool you operate: push a button, judge the result, and if the result is poor, conclude the tool is poor. But you do not judge a new team member by one vague instruction and one unreviewed draft. You manage them. You brief them clearly, you give the right work to the right person, and you review what comes back. Generative AI rewards exactly the same management skills, and punishes their absence.

A manager who briefs badly, delegates blindly, and never reviews will fail even with a brilliant human team. The expert who "tried AI once and it was useless" usually managed it in precisely that way. The fix is not more trust in the tool. It is better management of it. That has three parts.

1. Brief the assistant the way you would brief your team

This is where most reluctant experts quietly sabotage themselves. They type half a sentence, get a generic answer, and conclude the model is shallow, when really they gave it the kind of instruction that would earn a blank look from any capable colleague. Note what this is not. It is not "prompt engineering," not magic words or clever tricks. It is the opposite. It is clear, natural, informative communication: context, goal, constraints, and what good looks like, exactly as you would explain a task to a sharp new hire who is fast and well-read but does not yet know your world.

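Take something as ordinary as cleaning up a messy dataset. Most people hand the model the equivalent of a shrug ("Clean up this spreadsheet.") when the task calls for a proper brief: what the data is, the formats you want, what to do with doubtful rows, and how every change should be recorded.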

Same task, same tool. The first gets you something generic you cannot trust. The second gets you something you can actually check, because you told it the context, the format, the rules, and how to make its work reviewable. Talk to the model as informatively as you talk to your team, and the quality of what comes back changes completely. Talk to it in fragments, and you get fragments.

2. Delegate the right task to the right model

A good manager does not give every job to the same person. They match the task to the strength. Most reluctant professionals try one tool, once, on one task, conclude "AI cannot do this," and stop, which is like writing off a whole profession after one bad meeting with one specialist. The models have genuinely different strengths, and using several of them deliberately is where the real reduction in workload comes from.

A caveat I will own as a scientist: this is a fast-moving field, and these strengths shift with every model release. The principle is durable. The specifics are not. Test, do not assume.

3. Review every output, because the review is the job

Briefing is the front of the management act. Review is the back. This is where your expertise becomes the quality-control layer. An expert spots the invented citation in seconds, precisely because they are an expert. So the deciding question was never "do I trust AI?" You do not trust a junior's first draft or a co-author's reference list either. Expert work has always run on review, not trust. Withholding trust is not a reason to avoid a collaborator. It is the reason you review their work.

The cost of skipping this step is no longer hypothetical. A database maintained by Damien Charlotin, a research fellow at HEC Paris, had logged more than 1,500 court cases involving AI-fabricated legal citations as of mid-2026, and the count climbs weekly. In March 2026 a US appeals court fined two attorneys 15,000 dollars each for briefs riddled with fabricated and misrepresented citations. A September 2025 Harvard Business Review article, reporting a BetterUp Labs and Stanford survey, put a number on the everyday version, which it called "workslop": plausible-looking output that lacks real substance. Each instance cost the person who received it an average of one hour and 56 minutes of rework. Behind every one of those failures is a task that nobody briefed properly, or output that no expert reviewed. The lesson is not "AI is dangerous." It is "unmanaged output is dangerous," which experts already knew about human work.

The real divide

So the experts are half right, and it is the important half. They are right that AI output cannot be trusted on faith. That instinct is sound, and the people who lack it are the ones producing the workslop and the fake citations. Where they go wrong is in treating "I cannot trust it" as a reason to stay away, rather than as the reason to manage it properly: brief it well, delegate to the right model, review the result.

That reframe is the whole shift, and it plays to exactly the strengths a senior expert already has. Managing capable contributors through clear briefs, good delegation, and sharp review is not a new skill they must learn from scratch. It is what they have done for years. They have simply never been told it is the skill that turns generative AI from disappointing into transformative.

I have stopped trying to convince anyone to "trust AI." It is the wrong goal. What I do instead is sit with people inside their real work, break it into tasks, help them brief each task clearly, match it to the right assistant, and put their expert judgement where it belongs: on the review, not the typing. The shift they feel is not "I am using AI now." It is quieter and more important: "I spent today on the part of my job that only I can do."

That is the whole point. Not blind trust. Not avoidance. Management: brief well, delegate well, review well. That is what turns the thing they were right to distrust into the team that gives them their best hours back.

Sources

Brynjolfsson, E., Li, D., & Raymond, L. (2025). Generative AI at Work. Quarterly Journal of Economics, 140(2), 889–942.
Noy, S., & Zhang, W. (2023). Experimental evidence on the productivity effects of generative artificial intelligence. Science, 381(6654), 187–192.
Dell'Acqua, F., et al. (2023). Navigating the Jagged Technological Frontier. Harvard Business School Working Paper 24-013; published in Organization Science, 37(2), 2025.
METR (Becker, J., Rush, N., Barnes, E., & Rein, D.) (2025). Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity. arXiv 2507.09089.
Charlotin, D. AI Hallucination Cases Database, HEC Paris (figure as of mid-2026).
Whiting v. City of Athens, Tennessee (US Court of Appeals, 6th Cir., 13 March 2026).
Niederhoffer, K., et al. (2025). AI-Generated "Workslop" Is Destroying Productivity. Harvard Business Review, 22 September 2025.

The shrug:

"Clean up this spreadsheet."

The brief:

"This is a messy export of publication records. Standardise author names to 'Surname, Initial', convert every date to YYYY-MM-DD, and split the single affiliation column into institution and country. Flag any row where the year is missing or impossible rather than guessing. Do not delete anything: record every change in a new column so I can review it."
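To make the brief's rules concrete, here is a minimal Python sketch of the kind of reviewable result it asks for. It is an illustration only, not a description of any particular tool or of my own workflow. The column names (author, date, affiliation), the two date formats it accepts (with day-first dates assumed), the plausible year range, and every function name are assumptions made for the example.

```python
from datetime import datetime

# Assumed input formats (day-first for slashed dates); a real export may need more.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")
# Assumed bounds for a "possible" publication year.
MIN_YEAR, MAX_YEAR = 1600, 2100


def standardise_author(name):
    """Return 'Surname, Initial' from 'Given Surname' or 'Surname, Given'."""
    name = " ".join(name.split())  # collapse stray whitespace
    if "," in name:
        surname, given = [part.strip() for part in name.split(",", 1)]
    else:
        parts = name.split(" ")
        surname, given = parts[-1], " ".join(parts[:-1])
    initial = given[:1].upper()
    return f"{surname}, {initial}" if initial else surname


def standardise_date(text):
    """Return (date, flag). Doubtful dates are returned untouched with a flag."""
    stripped = text.strip()
    for fmt in DATE_FORMATS:
        try:
            parsed = datetime.strptime(stripped, fmt)
        except ValueError:
            continue
        if not MIN_YEAR <= parsed.year <= MAX_YEAR:
            return text, "impossible year"
        return parsed.date().isoformat(), ""
    return text, "missing or unreadable date"


def split_affiliation(text):
    """Split 'Institution, Country' on the last comma; flag if there is none."""
    if "," not in text:
        return text.strip(), "", "country not found"
    institution, country = text.rsplit(",", 1)
    return institution.strip(), country.strip(), ""


def clean_row(row):
    """Return a cleaned copy of the row plus a 'changes' column for review."""
    cleaned = dict(row)  # copy, so nothing in the original row is deleted
    notes = []

    author = standardise_author(row["author"])
    if author != row["author"]:
        notes.append(f"author: {row['author']!r} -> {author!r}")
    cleaned["author"] = author

    date, date_flag = standardise_date(row["date"])
    if date_flag:
        notes.append(f"FLAG date: {date_flag}")  # flagged, not guessed
    elif date != row["date"]:
        notes.append(f"date: {row['date']!r} -> {date!r}")
    cleaned["date"] = date

    institution, country, aff_flag = split_affiliation(row["affiliation"])
    cleaned["institution"], cleaned["country"] = institution, country
    notes.append(f"FLAG affiliation: {aff_flag}" if aff_flag else "affiliation split")

    cleaned["changes"] = "; ".join(notes)
    return cleaned


if __name__ == "__main__":
    rows = [
        {"author": "Ada Lovelace", "date": "05/03/2021",
         "affiliation": "University of Leeds, UK"},
        {"author": "Curie, Marie", "date": "3021-01-01",
         "affiliation": "Sorbonne"},
    ]
    for row in rows:
        print(clean_row(row))
```

The detail that matters is the last column. The second row keeps its impossible year exactly as it arrived and carries a flag instead of a guess, so the reviewer's attention goes straight to the rows that need judgement.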

In my own practice, I reach for different tools for different jobs: one for working through long documents and holding a careful, grounded line of reasoning; another for fast drafting and everyday writing; another when I need current information from the live web; and a dedicated notebook-style tool when I want answers anchored strictly to sources I have chosen. The specific tool matters less than the habit: match the assistant to the task, and never assume one model is the whole of "AI."

They think of AI as a tool they operate. It is far more useful to think of it as a team they manage.