AI Compute Industry Players

Who participates in the AI compute community — and what positions does each player fill?

Players are the community of participants in the AI compute ecosystem — the WHO. Positions are the roles those players fill — the WHAT. The hat changes; the player remains. (Doctrinal anchor: Ecosystem — every industry has a community of participants.)

AI compute is the bottleneck layer of the AI economy: it determines who can train, who can infer, and at what cost. The data layer that feeds it is covered in AI Data Industry Players.

The Ecosystem

The AI compute community has four sides:

Buyers — AI labs, enterprises, researchers, and inference-API consumers that purchase compute to train models or run applications
Providers — GPU/TPU/ASIC makers, cloud hyperscalers, and specialised AI cloud operators that supply the compute layer
Infrastructure — data centres, power grids, networking fabric, and cooling systems that host the compute
Boundary — export control authorities, energy regulators, competition commissions, and AI governance bodies that set the rules

Every player wears multiple hats. A hyperscaler is simultaneously provider (selling cloud GPU instances), buyer (purchasing Nvidia H100s), and infrastructure operator (building and running the data centre). The position changes per transaction; the player remains.
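
One way to picture "the position changes per transaction; the player remains" is a small data model in which the position is attached to each transaction rather than to the player. This is a minimal sketch only: the `Position`, `Player`, `act`, and `hats` names are hypothetical and do not come from the Ecosystem model itself.

```python
from dataclasses import dataclass, field
from enum import Enum


class Position(Enum):
    """The four sides of the AI compute community."""
    BUYER = "buyer"
    PROVIDER = "provider"
    INFRASTRUCTURE = "infrastructure"
    BOUNDARY = "boundary"


@dataclass
class Player:
    """A participant (the WHO). Positions are recorded per transaction."""
    name: str
    transactions: list[tuple[Position, str]] = field(default_factory=list)

    def act(self, position: Position, description: str) -> None:
        # The hat is chosen for this transaction only; the player is unchanged.
        self.transactions.append((position, description))

    def hats(self) -> set[Position]:
        # Every distinct position this player has filled so far.
        return {position for position, _ in self.transactions}


hyperscaler = Player("Hyperscaler")
hyperscaler.act(Position.PROVIDER, "sells cloud GPU instances")
hyperscaler.act(Position.BUYER, "purchases Nvidia H100s")
hyperscaler.act(Position.INFRASTRUCTURE, "builds and runs the data centre")

for hat in sorted(hyperscaler.hats(), key=lambda p: p.value):
    print(hyperscaler.name, "wears the", hat.value, "hat")
```

Keeping the position on the transaction means the same player can appear in the buyer, provider, and infrastructure tables without being duplicated as three separate entities.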

The five-counterparty model from Ecosystem maps to this industry as follows:

Counterparty (canonical)	AI-compute-industry expression
Customers	AI labs training foundation models, enterprise ML teams fine-tuning and running inference, researchers running experiments, inference-API consumers
Suppliers	GPU/TPU/ASIC designers (Nvidia, AMD, Google TPU), fab capacity (TSMC), DRAM/HBM makers (SK Hynix, Micron), liquid cooling providers
Employees	GPU cluster engineers, AI infrastructure specialists, MLOps engineers, data centre operators, power/cooling engineers, procurement specialists
Owners	Hyperscaler shareholders, AI cloud VC investors, sovereign wealth funds investing in data centre real estate, colocation REIT holders
Regulators	US BIS (GPU export controls), energy regulators (data centre power draw + carbon), competition authorities reviewing AI infrastructure concentration, EU AI Act compliance bodies

Buyer side — players

The buyers of AI compute. The value-generators the industry exists to serve. Player = the WHO. Position filled = what they buy.

Player (WHO)	Position filled — what they buy	Asymmetry they need closed	Archetype
Frontier AI lab (OpenAI, Anthropic, DeepMind, Meta AI)	Massive training clusters — 10k–100k+ GPU runs	Allocation access vs competitor; power and cooling at scale	Dreamer / Engineer
Hyperscaler AI team (Google, Microsoft, Amazon)	Proprietary TPU/AI ASIC clusters + Nvidia for general workloads	Vertical integration to reduce per-token cost; ASIC amortisation horizon	Engineer
Enterprise ML team	Cloud GPU instances + managed fine-tuning platforms	Cost per experiment; latency vs throughput trade-off for inference	Realist
AI startup / vertical model builder	Spot GPU capacity + inference APIs + training runs	Budget constraints; access to latest hardware before queue clears	Dreamer
Research institution / university lab	HPC cluster + cloud credits	Funding cycles vs compute availability; open-weights models reduce the need to train in-house	Philosopher
Inference API consumer (product company)	Tokens per second + cost per million tokens + uptime SLA	Provider lock-in; model capability curve vs cost curve	Engineer
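
The inference API consumer row pairs tokens per second with cost per million tokens. The two are linked by simple arithmetic once an hourly price for the underlying hardware is assumed. The sketch below shows that conversion. The function name and the example figures (2.00 USD per GPU-hour, 1,000 tokens per second) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def cost_per_million_tokens(gpu_hourly_usd: float, tokens_per_second: float) -> float:
    """Convert an hourly GPU price and sustained throughput into USD per million tokens."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    tokens_per_hour = tokens_per_second * 3600
    return gpu_hourly_usd / tokens_per_hour * 1_000_000


# Hypothetical inputs: a 2.00 USD/hour GPU sustaining 1,000 tokens per second.
example_cost = cost_per_million_tokens(gpu_hourly_usd=2.00, tokens_per_second=1_000)
print(f"{example_cost:.4f} USD per million tokens")
```

Doubling throughput at the same hourly price halves the cost per million tokens, which is why the capability curve and the cost curve in that row are tracked together.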

Provider side — players

The organisations that supply AI compute. Player = the WHO. Position filled = what they provide.

Player (WHO)	Position filled — what they provide	Where they compete	Archetype
Nvidia	AI GPU + NVLink fabric + CUDA software ecosystem	Architecture leadership (H100 → B200) + CUDA lock-in, widely regarded as the deepest moat in AI compute	Engineer
AMD	AI GPU + ROCm open software stack	Price/performance against Nvidia GPUs; open-source ROCm stack as a differentiated wedge against CUDA	Engineer
Google (TPU)	Custom AI ASIC optimised for Transformer workloads + TensorFlow/JAX stack	Captive use + Google Cloud rental; TPU v5 competes on cost-per-token at scale	Engineer
Hyperscaler AI cloud (AWS, Azure, GCP)	GPU clusters + managed training/inference platforms + on-demand scaling	Existing enterprise relationships + data-gravity lock-in; packaging the GPU as a managed service	Realist
Specialised AI cloud (CoreWeave, Lambda, Together AI)	Bare-metal GPU clusters with AI-optimised networking and storage	Cheaper than hyperscaler for pure training workloads; faster GPU allocation during shortage	Engineer
Custom ASIC / novel architecture (Cerebras, Groq, Tenstorrent)	Wafer-scale or novel-architecture AI accelerators	Lower latency / higher token throughput on fixed workloads; less general programmability than GPUs	Engineer / Dreamer

Infrastructure side — players

The physical and digital layer AI compute runs on. Player = the WHO. Position filled = what they provide.

Player (WHO)	Position filled — what they provide	Disruption vector	Archetype
Data centre REIT / colocation (Equinix, Digital Realty)	Physical space + power + cooling for AI clusters	AI power density (40–100 kW/rack) strains existing data centre design; new builds required	Realist
Grid-scale power provider / utility	Electricity supply + grid interconnection for large loads	AI demand is the fastest-growing new load in decades; permitting and grid connection are the bottleneck	Realist
HBM / DRAM memory supplier (SK Hynix, Micron, Samsung)	High-bandwidth memory (stacked DRAM) co-packaged next to the GPU die	HBM is a critical co-constraint with GPU supply; SK Hynix is reported to hold the dominant share of H100 HBM	Engineer
High-speed networking (InfiniBand / RoCE: Mellanox/Nvidia, Arista)	Low-latency GPU-to-GPU interconnect across nodes	NVLink dominates intra-node; InfiniBand dominates inter-node; RoCE as a lower-cost alternative	Engineer
Liquid cooling systems (Vertiv, Schneider Electric)	Direct liquid cooling for high-density GPU racks	Air cooling becomes impractical above roughly 30 kW/rack; liquid is the practical path to the densest H100/B200 racks	Engineer
MLOps platform (Weights & Biases, Ray, Determined AI)	Experiment tracking + distributed training orchestration + model registry	AI-native platforms reduce the infrastructure-operations burden for ML teams	Engineer
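
The rack-density figures in the table can be sanity-checked with back-of-envelope arithmetic. The sketch below estimates rack power from GPU count and compares it with the 30 kW/rack air-cooling figure quoted in the table. The 700 W per-GPU draw, the 1.5 overhead factor for CPUs, networking, and fans, and the server counts are illustrative assumptions, not measured values.

```python
AIR_COOLING_LIMIT_KW = 30.0  # air-cooling figure quoted in the table above


def rack_power_kw(servers_per_rack: int, gpus_per_server: int,
                  gpu_watts: float, overhead_factor: float = 1.5) -> float:
    """Estimate rack power in kW; overhead_factor is an assumed non-GPU multiplier."""
    gpu_power_watts = servers_per_rack * gpus_per_server * gpu_watts
    return gpu_power_watts * overhead_factor / 1000


def needs_liquid_cooling(rack_kw: float, limit_kw: float = AIR_COOLING_LIMIT_KW) -> bool:
    """True when the estimated rack power exceeds the air-cooling limit."""
    return rack_kw > limit_kw


# Assumed configurations: 8-GPU servers drawing 700 W per GPU.
dense_rack = rack_power_kw(servers_per_rack=4, gpus_per_server=8, gpu_watts=700)
sparse_rack = rack_power_kw(servers_per_rack=2, gpus_per_server=8, gpu_watts=700)

for label, kw in (("4 servers", dense_rack), ("2 servers", sparse_rack)):
    print(f"{label}: {kw:.1f} kW, liquid cooling needed: {needs_liquid_cooling(kw)}")
```

Under these assumptions, going from two to four GPU servers in a rack is enough to cross the air-cooling figure, which is the design strain the colocation row describes.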

Boundary side — players

Sets the rules the other three sides operate inside. Player = the WHO. Position filled = function held in the system.

Player (WHO)	Position filled — function held	Repeat-player advantage
US BIS (export controls on AI chips)	Restricts export of advanced AI GPUs (A100/H100/H200/B200) to China and other restricted destinations and listed entities	Entity List and licensing tiers can restructure global AI capability asymmetry in weeks
National energy regulator or grid body (FERC, Ofgem, ENTSO-E)	Grid interconnection approval + power-purchase contract oversight for large data centre loads	AI data centres are now large enough to affect regional grid planning
Competition authority (DOJ, EC, FTC)	Antitrust review of AI infrastructure concentration + hyperscaler AI acquisitions	Nvidia's GPU moat + hyperscaler packaging of AI compute under review in multiple jurisdictions
EU AI Act authority	High-risk AI system compliance + general-purpose (foundation) model transparency and compute-threshold obligations	General-purpose models trained with more than 10^25 FLOPs are presumed to carry systemic risk and face additional obligations
National AI strategy and standards bodies (NIST via the AI RMF, UK DSIT, Singapore MAS)	Standards + incident reporting + voluntary commitments + evaluation frameworks	Governments are early buyers of AI compute; their standards shape enterprise adoption
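
To get a feel for where the 10^25 FLOPs threshold in the EU AI Act row sits, training compute for a dense Transformer is often approximated as 6 × parameters × training tokens. The sketch below applies that rule of thumb. The approximation is an assumption used here for illustration and is not how the Act itself defines the measurement. The two model sizes are hypothetical.

```python
EU_AI_ACT_THRESHOLD_FLOPS = 1e25  # threshold quoted in the table above


def estimate_training_flops(parameters: float, training_tokens: float) -> float:
    """Rule-of-thumb estimate (assumption): about 6 FLOPs per parameter per token."""
    return 6.0 * parameters * training_tokens


def exceeds_threshold(parameters: float, training_tokens: float,
                      threshold: float = EU_AI_ACT_THRESHOLD_FLOPS) -> bool:
    """True when the estimated training compute is above the threshold."""
    return estimate_training_flops(parameters, training_tokens) > threshold


# Hypothetical training runs, not figures for any real model.
runs = {
    "70B parameters, 15T tokens": (70e9, 15e12),
    "400B parameters, 15T tokens": (400e9, 15e12),
}

for label, (params, tokens) in runs.items():
    flops = estimate_training_flops(params, tokens)
    print(f"{label}: {flops:.2e} FLOPs, above threshold: {exceeds_threshold(params, tokens)}")
```

On this estimate the smaller hypothetical run lands just under the threshold and the larger one well above it, so the obligation bites mainly at frontier-lab scale.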

The Five Archetypes Across the Community

The fractal pattern names five archetypes that appear at every layer of every system. AI compute is no exception.

Dreamer: The frontier lab founder who believes the next training run unlocks emergent capability nobody predicted. The startup building the wafer-scale chip that makes Nvidia unnecessary. The DePIN (decentralised physical infrastructure network) protocol that turns distributed edge GPUs into a training cluster.
Realist: The hyperscaler CFO who models the GPU capex payback against five scenarios. The enterprise ML lead who says "we can fine-tune on 8 GPUs, we don't need the cluster." The procurement team that diversified chip suppliers before the export control changed.
Engineer: The GPU cluster network engineer who hits 90% model FLOPs utilisation (MFU) on a 10k-node training run. The MLOps lead who cuts training cost 40% by optimising data pipelines. The ASIC architect who closes the cost-per-token gap against Nvidia at production scale.
Coach: The ML platform lead who makes the GPU cluster accessible to the 50-person product team that can't hire a cluster engineer. The AI education creator who teaches practitioners to use compute efficiently. The community builder who turns the open-weights ecosystem into a shared training capability.
Philosopher: The researcher asking whether scaling laws hold at 10^28 FLOPs, or whether the next capability jump requires an architectural break. The energy researcher auditing whether the AI compute buildout is compatible with national decarbonisation commitments. The ethicist asking whether access to frontier AI compute should be governed like nuclear capability.

A healthy AI compute community has all five archetypes present. When the Dreamer and Engineer dominate and the Philosopher disappears, the compute buildout concentrates in ways the grid, the regulator, and the competitor can break overnight.
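
The "all five archetypes present" test can be expressed as a simple set difference. The sketch below is illustrative only: the `missing_archetypes` helper and the example roster are hypothetical, and tagging real people or teams with an archetype is a judgment call the code does not make.

```python
ARCHETYPES = frozenset({"Dreamer", "Realist", "Engineer", "Coach", "Philosopher"})


def missing_archetypes(present: list[str]) -> list[str]:
    """Return the archetypes absent from a roster, in alphabetical order."""
    unknown = set(present) - ARCHETYPES
    if unknown:
        raise ValueError(f"unknown archetypes: {sorted(unknown)}")
    return sorted(ARCHETYPES - set(present))


# Hypothetical roster for a compute buildout dominated by Dreamers and Engineers.
buildout_team = ["Dreamer", "Engineer", "Engineer", "Realist"]
gaps = missing_archetypes(buildout_team)
print("Missing archetypes:", gaps)
```

A non-empty result is the warning sign the paragraph above describes: the buildout is proceeding without the voices that would question its concentration.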

Positions Matrix — Human vs AI Split

Players hold positions. Each position has a human-vs-AI split that is shifting. The hat changes; the player remains — but AI does an increasing share of the work inside the hat.

Position	Human today	AI today	Direction (3–5 years)
GPU cluster operator	Human runbook + incident response	AI-automated failure detection + predictive maintenance	Human for novel failure modes and capacity planning decisions
MLOps / training infrastructure engineer	Human job orchestration + cost optimisation	AI optimises job scheduling and resource allocation	Human focus shifts to architecture and cost model; AI handles run-time
Data centre power engineer	Human load forecasting + UPS/cooling management	AI predicts power demand spikes + pre-stages cooling	Fewer humans per MW; residual is emergency response and novel load profiles
AI procurement specialist	Human vendor relationship + contract negotiation	AI models should-cost + tracks allocation availability	Human for strategic vendor relationships; AI for commodity GPU spot buys
ML researcher (scaling experiments)	Human hypothesis + experimental design	AI runs parameter sweeps + surfaces anomalies	Human irreplaceable for hypothesis formation; AI runs the experiments
AI compute policy analyst	Human regulatory interpretation + lobbying	AI tracks rule changes + models compliance scenarios	Human for regulatory strategy; AI for monitoring and reporting

Archetype Asymmetries — Industry Level

Archetype	What they bring	Where they win in AI compute
Dreamer	Conviction that the next architecture break makes today's GPU stack obsolete	The wafer-scale startup; the DePIN training network; the algorithm innovation that makes a 10x smaller model competitive
Engineer	Cluster-level MFU optimisation; memory-bandwidth-bounded workload design; ASIC tape-out at cost	Nvidia's CUDA moat; the cluster network engineer who hits 90% MFU; the hyperscaler TPU that closes cost-per-token
Realist	Capex payback modelling; allocation risk diversification; export-control scenario planning	The procurement strategy that pre-committed H100 allocation; the enterprise team that right-sized compute before costs scaled
Coach	Compute access democratisation; ML infrastructure education; open-weights community enablement	The MLOps platform that makes clusters accessible; the Hugging Face community that amortises training across the ecosystem
Philosopher	Energy governance; AI capability proliferation risk; open vs closed model access	Asking whether data centre power demand is compatible with the grid; stress-testing whether export controls are achieving their geopolitical goal
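
MFU (model FLOPs utilisation) in the Engineer row is commonly computed as achieved model FLOPs per second divided by the hardware's theoretical peak. The sketch below uses the 6 × parameters × tokens rule of thumb for the numerator. The model size, throughput, GPU count, and per-GPU peak figure are all assumed values for illustration, not measurements from any real cluster.

```python
def model_flops_utilisation(parameters: float, tokens_per_second: float,
                            num_gpus: int, peak_flops_per_gpu: float) -> float:
    """Achieved training FLOPs per second as a fraction of theoretical peak."""
    if num_gpus <= 0 or peak_flops_per_gpu <= 0:
        raise ValueError("num_gpus and peak_flops_per_gpu must be positive")
    # Assumption: about 6 FLOPs per parameter per token for a dense Transformer.
    achieved_flops_per_second = 6.0 * parameters * tokens_per_second
    peak_flops_per_second = num_gpus * peak_flops_per_gpu
    return achieved_flops_per_second / peak_flops_per_second


# Hypothetical run: 70B parameters, 1M tokens/s across 1,024 GPUs,
# each with an assumed dense peak of 989 TFLOP/s.
example_mfu = model_flops_utilisation(
    parameters=70e9,
    tokens_per_second=1.0e6,
    num_gpus=1024,
    peak_flops_per_gpu=989e12,
)
print(f"MFU: {example_mfu:.1%}")
```

Because the denominator is the vendor's theoretical peak, every stall in networking, data loading, or memory bandwidth shows up directly as lost MFU, which is why the metric is the Engineer's scoreboard.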

Context

depends-on Community → Ecosystem — Five-counterparty model; the hat changes, the player remains
applies-to Community → Archetypes — The five archetypes mapped across this community
pairs-with AI Compute Industry Index — Disruption scoring, friction map, sub-vertical entry ranking
pairs-with AI Data Industry — The data layer that determines what the compute trains
pairs-with Technology Industry — The semiconductor supply chain that produces the hardware
pairs-with Energy Industry — The power supply that is now AI compute's binding constraint
instance-of Standard Templates → Players — Written from the players template

Questions

Which counterparty's perspective is most invisible in this industry — and what routing signal gets missed as a result?
If energy becomes the binding constraint before silicon does, which players gain disproportionate leverage — and which lose theirs?
When inference cost falls to near-zero, does the value in AI compute shift entirely to training — or to the data layer?
Which archetype is underrepresented in the boundary layer — and what does that explain about how the export-control regime was designed?
