AWS sits at the heart of the generative AI boom, powering everything from LLM training runs to global-scale inference. In this sweeping conversation, AWS CEO Matt Garman discusses the future of work, engineering, open vs. closed models, and why agentic workflows—not just raw models—will be where the next wave of value is created.

Garman is bullish on the economic upside of AI, skeptical of doomer narratives, and refreshingly candid about infrastructure bottlenecks, engineering culture, and how Amazon uses its own silicon to support customer choice.

Key Moments from the Interview

01:00 – White Collar Bloodbath or Utopia?
Garman’s optimistic view on AI, jobs, and productivity.

04:15 – Hiring in the Age of AI
Why more productivity doesn’t mean fewer people.

08:52 – 80% of AWS Developers Use AI
How AI is changing developer workflows at Amazon.

12:55 – Should You Still Study Engineering?
Advice for the next generation in a fast-changing tech landscape.

15:46 – Infrastructure Bottlenecks
From silicon to power, what’s actually constraining AI growth?

18:12 – Where AI Usage Is Growing Most
Training matters, but inference drives demand.

20:05 – AWS Silicon Strategy
Graviton, Trainium, Inferentia, and why Annapurna matters.

27:57 – Serving the Model Ecosystem
Bedrock, specialization, and what AWS looks for in new models.

33:39 – Open vs. Closed Source
How AWS views the trade-offs and partnerships.

36:24 – Will AWS Build a Frontier Model?
On Nova, customer choice, and competition with partners.

41:33 – Benchmarks Are Breaking
Why standardized evaluations may not matter much longer.

47:13 – The Future of Agents
The biggest opportunity in AI—and how AWS is enabling it.

Full Interview: White Collar Jobs, Hyperscalers, AI Coding, Open vs Closed, Agents, and more! (Matt Garman)

In His Own Words: What Matt Garman Revealed

The Real Impact of AI on Work (01:00)

AI will remove toil, not jobs.


Most white collar jobs today involve work no one actually wants to do. AI can take that away and let people focus on what matters.

Hiring Won’t Slow Down (04:15)

Productivity gains will fuel more opportunity—not layoffs.


There’s no mass unemployment because Excel exists. AI is more disruptive, but the same principle applies: we’ll move up the value chain.

Developer Usage: Over 80% (08:52)

AI coding tools are mainstream inside AWS.


North of 80% of our developers are using AI in some part of their workflow—unit tests, documentation, full-on agentic coding.

Engineering Still Matters (12:55)

But it’s about mindset, not memorizing syntax.


If you think you’re going to specialize in one thing for 30 years—you’re wrong. Learn how to think, how to learn, and how to build.

Bottlenecks Keep Shifting (15:46)

Today it’s silicon. Tomorrow? Could be power.


There’s no one bottleneck. Solve one, and the next appears. That’s the nature of scaling infrastructure at AWS.

Inference Is King (18:12)

Training headlines. Inference dominates cost.


Most of the growth today is inference. That’s what drives compute usage—the moment a customer hits an app or asks a question.

AWS Silicon: A 10-Year Bet (20:05)

From Nitro to Graviton to Trainium, Annapurna made it all possible.


Ten years later, the Annapurna team still works with us. Our best acquisition ever.

Why Model Choice Matters (27:57)

AWS isn’t betting on one model to rule them all.


We want customers to use the best model for the job. That’s why Bedrock includes everything from Anthropic to Writer to Luma AI.

Open vs. Closed Isn’t the Real Question (33:39)

The real value comes from customization.


Whether it’s open weights or fine-tuned APIs, customers want to shape models to their own workflows. That’s what matters.

Will AWS Build a Frontier Model? (36:24)

Nova is the start—but partnerships will still thrive.


We’ve built a muscle: compete where necessary, partner everywhere else. That’s how we’ve run AWS for 19 years.

Benchmarks Are Nearing Irrelevance (41:33)

Models have learned how to ace them.


Benchmarks work well for SSDs. They don’t capture complex behavior. We’re already seeing models saturate MMLU, AIME 24, and others.

Agent Workflows Are the Next Platform Shift (47:13)

AWS is going all-in on agent infrastructure.


We launched AgentCore in Bedrock to make enterprise agents scalable, secure, and auditable. That’s where real AI ROI will come from.

Key Takeaways

AI Will Enhance Work, Not Replace It
Garman sees AI as an augmentation tool—especially in creative, analytical, and engineering workflows.

Agent Infrastructure Is the Next Big Platform
From secure runtimes to memory and gateways, AWS is building the scaffolding for enterprise-scale agents.

Model Diversity > One-Model Dominance
Rather than an omni-model future, AWS is betting on specialization—offering customers choice through Bedrock.

Open vs. Closed Is a Spectrum
Customization, not philosophy, is what customers care about. Whether via open weights or closed APIs, it’s about fit.

Benchmarks Are Losing Signal
As models converge on test scores, real-world performance and UX integration matter more than academic leaderboards.

Silicon Strategy Is Core to AWS’s Edge
Owning the full stack—from chip to UI—gives AWS flexibility, pricing control, and customer alignment unmatched by competitors.

Full Transcript

01:00 — Future of Work / White-Collar Jobs

MB: I wanted to talk first about the future of work and white-collar work. There’s a spectrum, from “white-collar bloodbath” to utopia. Where do you fall and why?

MG: I’m on the optimistic side. In tech there’s never been a more exciting time. Across industry, the advances in AI have enormous potential to increase efficiency, effectiveness, and enablement at work.

A lot of what AI promises is taking away the toil in day-to-day jobs. Today, the vast majority of time isn’t spent on what people get excited about—it’s not “put my numbers in this system,” “pull the report,” or “collate information.” That overhead takes a large percentage of time. AI can really help shrink that, letting people focus on the creative and analytical parts they love. That drives value for companies and people.

I’m very optimistic. This is not “no one has a job and robots run the world.” Companies and people become more efficient, and people spend more time on what they’re excited about.

03:05 — Hiring in a High-Productivity Era

MB: If you automate large swaths of tasks, you can do more as a company. Does that mean you won’t hire anymore? How do you think about hiring when every person is more productive?

MG: It depends—there’s no single answer for every company. Historically, efficiency gains come with a transition. The critical thing is for people to be flexible, willing to learn, and accept jobs will evolve. The job from two years ago won’t be identical two years from now.

There isn’t mass unemployment because we have computers, automation, or robotics. There are lots of jobs—often higher-paying. The economy is bigger; on average people are better off. Think of Excel: many used to spend time doing calculations. Excel didn’t eliminate those jobs; it changed them. AI is more disruptive than Excel, but the analogy holds: people move to higher-value work.

People are worried; I don’t minimize that. Embrace the technology. The more you do, the better off you’ll be. AI has potential to transform every industry, company, and job. It transforms, not replaces. If you don’t lean in, you might be out of a job; if you do, it makes you better/faster and lets you do more of what you like. That’s better for companies, people, and economic growth.

05:51 — Speed of Change

MB: Pessimists say AI’s speed is different from prior shifts. Will speed really affect the white-collar market?

MG: It’s a rapidly evolving space; people will have to move faster. For developers worried coding tools will make them unnecessary, I think we’ll need more developers, not fewer. The job changes: maybe you won’t author Java code; you’ll deconstruct problems, coordinate agents, and build systems.

The part you may not do in two or three years is authoring Java code—tools will be great at that. But pulling it together, reviewing, deciding it’s not quite right, coordinating agents—that becomes the developer’s job. That person drives more value. These tools unlock creativity: turning ideas into action takes time; if we unlock that, good developers become even more valuable.

MB: As a leader, if someone becomes 5–10x more productive, the last thing you want is fewer of them.

MG: Exactly. You invest more, because the ROI compounds. I don’t understand the math for wanting fewer of those people.

08:52 — How Much Code Is ā€œWritten by AIā€?

MB: How much of AWS’s code is written by AI? And define ā€œwritten.ā€

MG: “Lines of code written by AI” is a silly metric—lines aren’t the point (fewer can be better). The last metric I saw: north of 80% of our developers use AI in their workflows—unit tests, docs, writing code, or agentic flows via tools like the Q CLI or our Kiro IDE. That number goes up weekly.

MB: Are engineers proactively upskilling, or do you run programs?

MG: Both. Amazon’s large and not homogeneous, but most developers are curious. The number who haven’t used an AI coding tool rounds to zero. Using it to transform the job vs. using pieces—that’s where education matters. There’s a learning curve: how to change work, when it accelerates vs. slows you down.

First-gen tools can be linear: you “vibe code,” it gives you code, but it’s not what you want; hard to “go back.” With an agentic coding-first mentality, you start with a spec, then work with the tool to build parts of it. As you vibe code, it updates the spec—but the spec remains the source of truth you can modify.

This also guides junior developers toward best practices. Some leaders told me, “With AI we can replace all junior people.” That’s the dumbest thing I’ve ever heard—they’re the least expensive, the most leaned into AI tools, and you need a pipeline of talent. Keep hiring out of college; teach them to decompose problems and build software. Tools like Q can coach good practice and help juniors collaborate with seasoned engineers.

12:55 — Should Students Study Engineering?

MB: Do you recommend engineering as a career to someone entering college?

MG: Yes—though kids should study what they’re passionate about. The emphasis should be: think for yourself, develop critical reasoning, creativity, and a learning mindset. With today’s pace, if you learn one thing and plan to ride it for 30 years, it won’t be valuable 30 years from now. Learn how to learn and how to think. Engineering is great for systems thinking and problem decomposition.

MB: Internally, how do you measure success using AI?

MG: We don’t have a magic new measure. Some of it is productivity. We encourage experimentation—tools, methods, setups. Previously, large systems needed many people focusing on pieces. Now, tools enable smaller, faster pods with broader scope. Startups move fast because of structure—big orgs can, too, if they organize as small pods. We’re leaning into that.

15:46 — Capacity & Infrastructure Buildout

MB: Looking 2–5 years out: what are the bottlenecks—silicon, energy?

MG: Think of The Goal (book) on production lines: there’s always a bottleneck—solve it and the next appears. All of those (chips, power, networking) are bottlenecks at some point. Recent shortage: NVIDIA chips; as that eases, power could be next. It’s hard to get everything in sync at the growth rate we’re seeing; all require capital.

Our job is to think 1, 3, 5 years out: ensure enough power, capacity, and network for customers. Sometimes we’re wrong, sometimes we’re short—but we take that on so customers don’t have to.

18:07 — Where Growth Is Coming From

MB: Where is growth—RL, inference, training?

MG: Inference drives most usage. Training and fine-tuning are important, but compute demand is dominated by the end-user interaction—questions, app usage, workflows. Infrastructure doesn’t care if it’s RL, FT, or inference—the same silicon serves multiple needs (networking can differ for big pipelines). Trainium is great for training and inference (despite our naming); NVIDIA chips are similarly versatile.

20:05 — AWS Custom Silicon (Nitro, Graviton, Inferentia, Tranium)

MB: Custom silicon is a differentiator. What sets yours apart vs. Google TPUs?

MG: We start with customers: breadth of choice—capabilities, cost points, trade-offs. No single best solution for all workloads.

Timeline:

  • Nitro (ā‰ˆ10 years ago): offloaded virtualization (network, storage, hypervisor) to custom cards; customers got bare-metal performance and better security.

  • Graviton (ARM CPUs): enterprise-ready with Graviton2; today Graviton4 is ~20% faster than the best x86 and 20% cheaper—a strong value. We still sell tons of Intel and AMD, because customers want choice.

  • AI accelerators (~5 years ago):

    • Inferentia for inference—Alexa cut inference costs ~70%.

    • Trainium1 taught us software ecosystem lessons.

    • Trainium2 is now in market.

    • Many customers still prefer NVIDIA (CUDA is excellent). Others—e.g., Anthropic—lean into Trainium; we also use Trainium under the hood to power Bedrock models in serverless inference.
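The Graviton figures above compound in a way that is easy to miss: ~20% faster and 20% cheaper multiply into roughly 50% more work per dollar. A quick sketch of that arithmetic (the $1.00/hr baseline is a made-up placeholder; the percentages are Garman's stated figures):

```python
# Illustrative arithmetic only: the ~20% figures come from the interview;
# the $1.00/hr baseline price is a hypothetical placeholder, not a real price.
x86_price = 1.00        # hypothetical baseline, dollars per hour
x86_perf = 1.00         # normalized work per hour

graviton_price = x86_price * 0.80   # "20% cheaper"
graviton_perf = x86_perf * 1.20     # "~20% faster"

# Work done per dollar (performance / price)
x86_ppd = x86_perf / x86_price
graviton_ppd = graviton_perf / graviton_price

improvement = graviton_ppd / x86_ppd - 1
print(f"{improvement:.0%}")  # prints "50%"
```

The two modest-sounding percentages divide (1.2 / 0.8), which is why the price-performance gap is larger than either number suggests.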

MB: Would you sell your chips to third parties (like rumors of Google selling TPUs)?

MG: Never say never. Today, selling only in AWS simplifies everything: one environment (AWS data center, server, network), fewer SKUs/firmware paths. Selling merchant silicon adds complexity—but it’s conceivable someday.

MB: Annapurna seems underappreciated. What did you see back then (pre-genAI)?

MG: They were mission-driven and matched Amazon’s culture. We wanted to offload virtualization to a card; nobody had that product. Annapurna was building a network card with ARM cores. We co-designed to also offload EBS virtualization. They were smart, scrappy, customer-focused, thought big. We acquired them—best acquisition we’ve made. Most of that team is still at Amazon 10 years later.

27:57 — Which Models AWS Serves

MB: When deciding which models to serve on Bedrock, how do you choose?

MG: We want choice: the best models customers might want. There’ll be a handful of frontier models (expensive), but lots of purpose-built ones. Recently we added Writer (agentic workflows) and Twelve Labs (video understanding). We have Poolside (coding), Stability (image/video), Luma AI (video), and more. Ideally we’d support everyone—like AWS Marketplace. Some remain proprietary elsewhere, but our goal is breadth.

29:52 — One Omni-Model vs. Many Specialized Models

MB: Some predict one giant omni-model. I think we’re headed to specialization. Your view?

MG: From day one we believed customers will use many models. Today they do: a large general model for planning/reasoning; specialty models for specific workflows; fine-tuned models on proprietary data. Trade-offs on cost/capability lead to mixture-of-experts in practice.

Enterprises want models that deeply understand their data—better fine-tuning, domain knowledge, customer context. And as we move into an agent-focused world, the model is critical but not sufficient. You need scaffolding, workflows, memory, audit logs, and domain-specific pieces. Most ROI will come from agent workflows doing real work.

33:39 — Open vs. Closed Source

MB: You serve Anthropic (closed) and plenty of open-weight models. How do you think about open vs. closed—and partnerships with Anthropic, OpenAI, Meta?

MG: Most so-called ā€œopenā€ are open weights, not OSS. The key for customers is customization—bring your own data, tailor the model, run custom workflows. Whether via open weights (e.g., Llama, Mistral) or APIs for fine-tuning/distillation on ā€œclosedā€ models (e.g., Nova), the value is the same: fit to your use case. Over time, everyone will want customization; different vendors will enable it differently.

MB: Will AWS build a truly frontier model?

MG: We think choice matters. We’re investing in Nova to offer differentiated capabilities while maintaining deep partnerships (Anthropic, etc.). Compete where it helps customers; partner everywhere else—an AWS muscle we’ve built over 19 years.

MB: Pricing—will models race to cost of silicon + electricity?

MG: Unlikely. Models aren’t commodities today; why tomorrow? Cloud wasn’t commoditized after a year or two either. There are differences in availability, features, UX. Ask customers: Llama, Claude, GPT—are they commodities? No. Even open models differ (Mistral vs. Llama). Vendors must keep innovating; there’s real value to be captured.

MB: Open models seem ~3–6 months behind closed source. Will that persist?

MG: There’s no inherent advantage to open vs. closed; it’s a choice. Chinese labs open-sourced their best models while they were behind; that’s why some feel “open lags.” But models like DeepSeek and Qwen are impressive; customers love them. OpenAI released open models (smaller than GPT-5). Anthropic hasn’t open-sourced. Meta open-sources all Llama models. Whether they remain behind is about execution, not openness.

41:33 — Benchmarks

MB: Benchmarks are saturated—AIME 24, A2, MMLU. We’re down to single-digit gains. Do we need new benchmarks?

MG: Benchmarks are great for commodities (e.g., SSD throughput). The more complex the system, the worse benchmarks become. Early database benchmarks (TPC-C) faded; people test their own workloads. We’ll likely move the same way with models. It’s easy to train models to ace benchmarks; that doesn’t make them best overall.

MB: Your personal test?

MG: I prefer testing inside applications—how well it synthesizes research into coherent docs, generates ideas, interacts. Speed matters too. Integration/UX matters a lot—Perplexity is a good example with thoughtful UI and visible “thinking.” Latency is critical for real-time consumer use; for enterprise workflows (e.g., payroll agents), accuracy can trump speed and async is fine. We launched Automated Reasoning checks (math-proof-like verification applied to LLM outputs)—powerful when you need correctness over latency.

47:13 — Agents

MB: Where are you seeing repeatable, accurate agent implementations with positive ROI?

MG: Several:

  • Agentic coding is a major unlock—developers build more, faster.

  • Enterprise agents were hard to build/run at scale. We launched AgentCore in Bedrock: building blocks for scalable, secure, auditable, measurable agents.

    • Secure, serverless runtime that scales to zero and up to thousands.

    • Short- and long-term memory built in.

    • Agent Gateway for auth with other agents/systems; hosted MCP server support.

    • Observability hooks to AWS or third-party tools.

    • Open framework: works with any model (Gemini, OpenAI, Bedrock), any stack (Strands, LangChain, etc.).

With scaffolding easier, we see use cases across processing, individual productivity, marketing, sales, and industry workflows. Many are still human-in-the-loop, but the path to more autonomy is clear.
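The building blocks Garman lists (a runtime, memory, a gateway to tools, observability) can be pictured as a small agent loop. The sketch below is framework-agnostic and purely illustrative; none of these class or function names are the actual AgentCore API:

```python
# Illustrative agent scaffolding: runtime loop + memory + tool gateway +
# audit logging. All names here are hypothetical, not the AgentCore API.
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Memory:
    """Short-term turn history; a real system would add long-term storage."""
    turns: list = field(default_factory=list)

    def remember(self, role, content):
        self.turns.append((role, content))

@dataclass
class Gateway:
    """Maps tool names to callables; stands in for authenticated tool/MCP access."""
    tools: dict = field(default_factory=dict)

    def call(self, name, arg):
        return self.tools[name](arg)

def run_agent(task, plan, memory, gateway, log=print):
    """One agent pass: record the task, execute the plan's tool calls,
    emit an audit line per call (the observability hook), return the history."""
    memory.remember("user", task)
    for tool_name, arg in plan(task):  # plan() stands in for a model-produced plan
        result = gateway.call(tool_name, arg)
        log(f"audit: {tool_name}({arg!r}) -> {result!r}")
        memory.remember("tool", result)
    return memory.turns

# Usage with stubbed components:
gw = Gateway(tools={"lookup": lambda q: f"result for {q}"})
plan = lambda task: [("lookup", task)]
turns = run_agent("invoice status", plan, Memory(), gw, log=lambda m: None)
```

The point of the scaffolding argument is visible even in this toy: the model only supplies `plan`; everything else (memory, tool access, audit trail) is infrastructure that has to exist around it.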

50:31 — Final Advice

MB: Many worry about being automated away. Words of encouragement?

MG: Make yourself valuable. AI and agents amplify employees; they’re valuable through you. If you’re great at marketing, do marketing—not pulling campaign plumbing. If you’re great at building apps, be great at building, not at memorizing a language. Learn the tools, focus on customer problems, keep learning. I have very little worry that jobs just disappear and robots run everything.

MB: Thank you, Matt.

MG: Thank you—this was really fun.

Enjoyed this conversation?

Follow us on X and subscribe to our YouTube channel for more interviews with the people building the future of AI.
