Why do Nvidia’s chips dominate the AI market?
The firm has three big advantages
No other firm has benefited from the boom in artificial intelligence (ai) as much as Nvidia. Since January 2023 the chipmaker’s share price has surged by almost 450%. With the total value of its shares approaching $2trn, Nvidia is now America’s third-most valuable firm, behind Microsoft and Apple. Its revenues for the most recent quarter were $22bn, up from $6bn in the same period last year. Most analysts expect that Nvidia, which controls more than 95% of the market for specialist ai chips, will continue to grow at a blistering pace for the foreseeable future. What makes its chips so special?
Nvidia’s ai chips, also known as graphics processing units (gpus) or “accelerators”, were initially designed for video games. They use parallel processing, breaking each computation into smaller chunks, then distributing them among multiple “cores”—the brains of the processor—in the chip. This means that a gpu can run calculations far faster than it would if it completed tasks sequentially. This approach is ideal for gaming: lifelike graphics require countless pixels to be rendered simultaneously on the screen. Nvidia’s high-performance chips now account for four-fifths of gaming gpus.
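The chunk-and-distribute idea can be sketched in a few lines of Python. The sketch is purely illustrative (the function names are ours, and Python threads stand in for what are, on a real gpu, thousands of hardware cores):

```python
from concurrent.futures import ThreadPoolExecutor

def partial_sum_of_squares(chunk):
    # Each worker handles its own slice of the data, independently
    # of the others -- the essence of parallel processing.
    return sum(x * x for x in chunk)

def parallel_sum_of_squares(data, n_workers=4):
    # Break the computation into smaller chunks...
    chunk_size = -(-len(data) // n_workers)  # ceiling division
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    # ...distribute them among the workers, then combine the results.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return sum(pool.map(partial_sum_of_squares, chunks))
```

The same chunk-and-combine pattern, run across thousands of cores at once, is what lets a gpu render millions of pixels, or multiply enormous matrices, in one go.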
Happily for Nvidia, its chips have found much wider uses: cryptocurrency mining, self-driving cars and, most important, the training of ai models. Machine-learning algorithms, which underpin ai, increasingly rely on deep learning, a technique built on artificial neural networks. In these networks computers extract rules and patterns from massive datasets. Training a network involves large-scale computations—but because the tasks can be broken into smaller chunks, parallel processing is an ideal way to speed things up. A high-performance gpu can have more than a thousand cores, so it can handle thousands of calculations at the same time.
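Why neural networks parallelise so well can be seen in a toy forward pass through one layer of a network: each output neuron computes its own weighted sum of the inputs, and none depends on any other. (The code below is a plain-Python sketch with made-up numbers, not how any real framework works.)

```python
def layer_forward(inputs, weights, biases):
    # One layer of an artificial neural network: each output neuron
    # computes an independent weighted sum of the inputs. Because no
    # neuron depends on another, a gpu can compute thousands at once.
    return [
        sum(w * x for w, x in zip(neuron_weights, inputs)) + b
        for neuron_weights, b in zip(weights, biases)
    ]

# A toy layer: two inputs feeding three output neurons.
out = layer_forward(
    inputs=[1.0, 2.0],
    weights=[[0.5, -0.5], [1.0, 1.0], [0.0, 2.0]],
    biases=[0.0, 0.1, -1.0],
)
# out[0] = 0.5*1.0 - 0.5*2.0 + 0.0 = -0.5, and so on for the rest.
```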
Once Nvidia realised that its accelerators were highly efficient at training ai models, it focused on optimising them for that market. Its chips have kept pace with ever more complex ai models: in the decade to 2023 Nvidia increased the speed of its computations 1,000-fold.
But Nvidia’s soaring valuation is not just because of faster chips. Its competitive edge extends to two other areas. One is networking. As ai models continue to grow, the data centres running them need thousands of gpus lashed together to boost processing power (most computers use just a handful). Nvidia connects its gpus through a high-performance network based on products from Mellanox, a supplier of networking technology that it acquired in 2019 for $7bn. This allows it to optimise the performance of its network of chips in a way that competitors can’t match.
Nvidia’s other strength is cuda, a software platform that allows customers to fine-tune the performance of its processors. Nvidia has been investing in this software since the mid-2000s, and has long encouraged developers to use it to build and test ai applications. This has made cuda the de facto industry standard.
Nvidia’s juicy profit margins and the rapid growth of the ai accelerator market—projected to reach $400bn per year by 2027—have attracted competitors. Amazon and Alphabet are crafting ai chips for their data centres. Other big chipmakers and startups also want a slice of Nvidia’s business. In December 2023 Advanced Micro Devices, another chipmaker, unveiled a chip that by some measures is roughly twice as powerful as Nvidia’s most advanced chip.
But even building better hardware may not be enough. Nvidia dominates ai chipmaking because it offers the best chips, the best networking kit and the best software. Any competitor hoping to displace the semiconductor behemoth will need to beat it in all three areas. That will be a tall order.■
==============================================================
Jensen Huang says Moore’s law is dead. Not quite yet
3D components and exotic new materials can keep it going for a while longer
Two years shy of its 60th birthday, Moore’s law has become a bit like Schrödinger’s hypothetical cat—at once dead and alive. In 1965 Gordon Moore, a co-founder of Intel, observed that the number of transistors—a type of electronic component—that could be crammed onto a microchip was doubling every 12 months, a figure he later revised to every two years.
That observation became an aspiration that set the pace for the entire computing industry. Chips produced in 1971 could fit 200 transistors into one square millimetre. Today’s most advanced chips cram 130m into the same space, and each operates tens of thousands of times more quickly to boot. If cars had improved at the same rate, modern ones would have top speeds in the tens of millions of miles per hour.
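Those two density figures allow a quick sanity-check of the law’s pace. Taking 1971 to 2023 as the span (an assumption on our part), the arithmetic runs as follows:

```python
import math

density_1971 = 200           # transistors per square millimetre in 1971
density_today = 130_000_000  # today's most advanced chips
years = 2023 - 1971

# How many doublings separate the two figures, and how long did
# each one take on average?
doublings = math.log2(density_today / density_1971)  # roughly 19
years_per_doubling = years / doublings               # a little under 3
```

That works out to a doubling in density a little under every three years: close to, though slightly slower than, Moore’s revised two-year cadence.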
Moore knew full well that the process could not go on for ever. Each doubling is more difficult, and more expensive, than the last. In September 2022 Jensen Huang, the boss of Nvidia, a chipmaker, became the latest observer to call time, declaring that Moore’s law was “dead”. But not everyone agrees. Days later, Intel’s chief Pat Gelsinger reported that Moore’s maxim was, in fact, “alive and well”.
Delegates to the International Electron Devices Meeting (iedm), a chip-industry shindig held every year in San Francisco, were mostly on Mr Gelsinger’s side. Researchers showed off several ideas aimed at keeping Moore’s law going, from exploiting the third dimension to sandwiching chips together and even moving beyond silicon, the material from which microchips have been made for the past half-century.
A transistor is to electricity what a tap is to water. Current flows from a transistor’s source to its drain via a gate. When a voltage is applied to the gate, the current is on: a binary 1. With no voltage on the gate, the current stops: a binary 0. It is from these 1s and 0s that every computer program, from climate models and Chatgpt to Tinder and Grand Theft Auto, is built.
Small is beautiful
For decades transistors were built as mostly flat structures, with the gate sitting atop a channel of silicon linking the source and drain. Making them smaller brought welcome side benefits. Smaller transistors could switch on and off more quickly, and required less power to do so, a phenomenon known as Dennard scaling.
By the mid-2000s, though, Dennard scaling was dead. As the distance between a transistor’s source and drain shrinks, quantum effects cause the gate to begin to lose control of the channel, and electrons move through even when the transistor is meant to be off. That leakage wastes power and causes excess heat that cannot be easily disposed of. Faced with this “power wall”, chip speeds stalled even as transistor counts kept rising (see chart).
In 2012 Intel began to build chips in three dimensions. It turned the flat conducting channel into a fin standing proud of the surface. That allowed the gate to wrap around the channel on three sides, helping it reassert control (see diagram). These transistors, called “finfets”, leak less current, switch a third faster and consume about half as much power as the previous generation. But there is a limit to making these fins thinner and taller, and chipmakers are now approaching it.
The next step is to turn the fins side on such that the gate surrounds them completely, giving it maximum control. Samsung, a South Korean electronics giant, is already using such transistors, called “nanosheets”, in its newest products. Intel and tsmc, a Taiwanese chip foundry, are expected to follow soon. By stacking multiple sheets and reducing their length, transistor sizes can drop by a further 30%.
Szuya Liao, a researcher at tsmc, compares going 3d to urban densification—replacing sprawling suburbs with packed skyscrapers. And it is not just transistors that are getting taller. Chips group transistors into logic gates, which carry out elementary logical operations. The simplest is the inverter, or “NOT” gate, which spits out a 0 when fed a 1 and vice versa. Logic gates are made by combining two different types of transistor, called n-type and p-type, which are produced by “doping” silicon with other chemicals to modify its electrical properties. An inverter requires one of each, usually placed side by side.
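The inverter’s behaviour is simple enough to model in a few lines. Treating each transistor as an ideal switch (a simplification, and the names below are ours for illustration): the p-type device conducts when its gate is low, pulling the output up to the supply voltage, while the n-type conducts when its gate is high, pulling the output down to ground.

```python
def inverter(gate):
    # A toy cmos inverter: one p-type and one n-type transistor.
    # The p-type switch conducts when the gate is 0, connecting the
    # output to the supply voltage (a logic 1).
    p_conducts = (gate == 0)
    # The n-type switch conducts when the gate is 1, connecting the
    # output to ground (a logic 0).
    n_conducts = (gate == 1)
    return 1 if p_conducts and not n_conducts else 0

assert inverter(0) == 1  # fed a 0, it spits out a 1
assert inverter(1) == 0  # fed a 1, it spits out a 0
```

Exactly one of the two switches conducts at any time, which is why cmos logic draws almost no current except while switching.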
At the iedm Ms Liao and her colleagues showed off an inverter called a cfet built from transistors that are stacked on top of each other instead. That reduces the inverter’s footprint drastically, to roughly that of an individual transistor. tsmc says that going 3d frees up room to add insulating layers, which means the transistors inside the inverter leak less current, which wastes less energy and produces less heat.
The ultimate development in 3d chip-making is to stack entire chips atop one another. One big limitation to a modern processor’s performance is how fast it can receive data to crunch from memory chips elsewhere in the computer. Shuttling data around a machine uses a lot of energy, and can take tens of nanoseconds, or billionths of a second—a long time for a computer.
Julien Ryckaert, a researcher at Imec, a chip-research organisation in Belgium, explained how 3d stacking can help. Sandwiching memory chips between data-crunching ones drastically reduces both the time and energy necessary to get data to where it needs to be. In 2022 amd, an American firm whose products are built by tsmc, introduced its “x3d” products, which use 3d technology to stick a big blob of memory directly on top of a processor.
As with cities, though, density also means congestion. A microchip is a complicated electrical circuit that is built on a circular silicon wafer, starting from the bottom up. (Intel likens it to making a pizza.) First the transistors are made. These are topped with layers of metal wires that transport both electrical power and signals. Modern chips may have more than 15 layers of such wires.
As chips get denser, routing those power and data lines gets harder. Roundabout routes waste energy, and power lines can interfere with data ones. 3d logic gates, which pack yet more transistors into a given area, make things worse.
To untangle this mess, chipmakers are moving power lines below the transistors, an approach called “backside power delivery”. Transistors and data lines are built as before. Then the wafer is flipped and thick power lines are added to the bottom. Putting the power wires along the underside of the chip means fundamental changes to the way expensive chip factories operate. But shortening the length of the power lines means less wasted energy and cooler-running chips. It also frees up nearly a fifth of the area above the transistors, giving designers more room to squeeze in extra data lines. The end result is faster, more power-efficient devices without tinkering with transistor sizes. Intel plans to use backside power in its chips from next year, though combining it with 3d transistors in full production is still a while away.
Even making use of an extra dimension has its limits. Once a transistor’s gate length approaches ten nanometres, the channel it governs needs to be thinner than about four nanometres. At these tiny sizes—mere tens of atoms across—current leakage becomes much worse. Electrons slow down because silicon’s surface roughness hinders their movement, reducing the transistor’s ability to switch on and off properly.
Some researchers are therefore investigating the idea of abandoning silicon, the material upon which the computer age has been built, for a new class of materials called transition metal dichalcogenides (tmds). These can be made in sheets just three atoms thick. Many have electrical properties that mean they leak less current from even the tiniest of transistors.
Three tmds in particular look promising: molybdenum disulphide, tungsten disulphide and tungsten diselenide. But while the industry has six decades of experience with silicon, tmds are much less well understood. Engineers have already found that their ultra-thin profile makes it difficult to connect transistors made from them with a chip’s metal layers. Consistent production is also tricky, particularly at the scales needed for reliable mass production. And the materials’ chemical properties mean it is harder to dope them to produce n-type and p-type transistors.
The atomic age
Those problems are probably not insurmountable. (Silicon suffered from doping problems of its own in the industry’s early days.) At the iedm, Intel was showing off an inverter built out of tmds. But Eric Pop, an electrical engineer at Stanford University, thinks it will be a long while before they replace silicon in commercial products. For most applications, he says, silicon remains “good enough.”
At some point, the day will arrive when no amount of clever technology can shrink transistors still further (it is hard to see, for instance, how one could be built with less than an atom’s worth of stuff). As Moore himself warned in 2003, “no exponential is for ever.” But, he told the assembled engineers, “your job is delaying for ever”. Chipmakers have done an admirable job of that in the two decades since he spoke. And they have at least sketched out a path for the next two decades, too.