Tenstorrent's vision for the AI Revolution: Conversation with Chief CPU Architect Lien Wei-han

In a rapidly evolving technological landscape, the convergence of artificial intelligence (AI) and hardware is reshaping how people perceive computing and its potential applications. In an exclusive interview with DIGITIMES Asia, Lien Wei-han, chief CPU architect at Tenstorrent, shared his insights into the AI revolution and the role hardware plays in enabling it. Tenstorrent is a Canadian AI startup that has the potential to challenge Nvidia, AMD, and Intel in AI chips and is now building its own AI servers.

Lien Wei-han is a prominent figure in computer hardware architecture, renowned for his work as the lead architect of the M1 chip while working for tech giant Apple.

The AI revolution unfolds

While recognizing the emergence of AI as a "computation paradigm shift" that he had been fortunate to witness over the past two decades, Lien described the current era as an unprecedented time in human history, comparing it to the first industrial revolution when machines began replacing human labor. "A few centuries from now, historians will likely refer to this time as the new industrial age, characterized by machine intelligence supplanting human intellect," Lien said.

His involvement in the personal computer revolution in the early 2000s was followed by a pivotal move to Apple in 2007, where he played a role in the mobile computing paradigm shift. This shift, marked by the advent of smartphones, empowered individuals to carry powerful computing devices in their pockets, generating vast amounts of data. The subsequent emergence of data mining in 2007-2010 set the stage for the AI revolution that took off around 2012. This was evident with the introduction of AI models like AlexNet and the proliferation of convolutional neural networks (CNNs).

In 2012, it became clear that the computational demands of AI models were doubling every 3.5 months, leading to exponential growth in model complexity and computing speed. Today, GPT-3.5, the model behind ChatGPT, is reported to have 300 billion parameters, and GPT-4 is rumored to have 4 trillion. Models of this scale pave the way for complex AI tasks, particularly natural language processing, and promise to revolutionize various fields.
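To put the quoted doubling rate in perspective, the compounding can be sketched in a few lines of Python. The 3.5-month doubling period is the figure given above; the function name and the time horizons are illustrative choices, not something from the interview.

```python
def growth_factor(months: float, doubling_period_months: float = 3.5) -> float:
    """Return how many times demand multiplies over `months`,
    assuming it doubles once every `doubling_period_months`."""
    return 2.0 ** (months / doubling_period_months)


if __name__ == "__main__":
    # Illustrative horizons: one doubling period, one year, two years.
    for months in (3.5, 12, 24):
        print(f"{months:>4} months -> x{growth_factor(months):,.1f}")
```

Under this assumption, a 3.5-month doubling period compounds to roughly an elevenfold increase per year.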

However, the unprecedented amounts of data generated by billions of people create enormous computational requirements. Human activity generates an astonishing 2.5 quintillion bytes of data every day, necessitating innovative solutions to meet these demands. Tenstorrent, and companies like it, are driven to tackle this challenge.

Lien's departure from Apple was motivated by his belief in the world-altering potential of AI. "I was pretty comfortable at Apple, and then I realized that the mobile computing paradigm shift I worked on since 2007 has already been pushed to its limits," Lien said. "After 14 years, you don't see the cell phone changing too much. But AI is just beginning." And he wanted to be a part of it.

His involvement in building Apple's silicon development team, which grew to 10,000 people from 150, illustrated the immense importance of AI hardware in the tech industry's future.

Tenstorrent's approach to hardware in the AI era: heterogeneous computation

Lien highlighted the significance of hardware in the AI revolution, dispelling the notion that AI is solely a software-based solution. Tenstorrent stands out because it focuses on building its own hardware, combining central processing units (CPUs) and neural processing units (NPUs) for what Lien terms "heterogeneous computation."

Heterogeneous computation involves two key components: accelerators, specialized units designed for specific tasks like AI computations, and CPUs, which serve as versatile, general-purpose computing units. This combination aims to create a computing environment that excels in specialized AI tasks and general computing.

In the past, CPUs were the primary computing units for most systems. However, the emergence of AI required specialized machines, resulting in an increased focus on accelerator units. Tenstorrent's approach seeks to merge these two worlds, allowing for efficient and powerful computation across various tasks. The company recognizes that some AI algorithms, particularly newer ones, require the flexibility of general-purpose computing, which CPUs provide.

Lien emphasized the importance of having a robust fallback mechanism for future AI computation. Given that the AI landscape is evolving rapidly and its trajectory remains uncertain, a powerful CPU is vital to ensuring that systems can handle AI-related tasks effectively. This approach calls for building a cutting-edge CPU, whose mature and stable technology provides a strong foundation for future developments.
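The fallback idea can be illustrated with a minimal Python sketch. It is not Tenstorrent's software stack: the operation names, the two lookup tables, and the `dispatch` function are all hypothetical, and both "devices" are simulated with ordinary Python functions. It only shows the control flow of preferring an accelerator and falling back to the CPU for operations the accelerator does not support.

```python
from typing import Callable, Dict, List, Tuple

Op = Callable[[List[float]], List[float]]

# Hypothetical: operations the (simulated) accelerator implements natively.
ACCELERATOR_OPS: Dict[str, Op] = {
    "relu": lambda xs: [max(0.0, x) for x in xs],
}

# Hypothetical: general-purpose CPU implementations, including a newer
# operation that the accelerator does not yet support.
CPU_OPS: Dict[str, Op] = {
    "relu": lambda xs: [max(0.0, x) for x in xs],
    "new_activation": lambda xs: [x * x if x > 0 else 0.0 for x in xs],
}


def dispatch(op: str, xs: List[float]) -> Tuple[str, List[float]]:
    """Run `op` on the accelerator when supported, otherwise on the CPU.

    Returns the name of the device used together with the result."""
    if op in ACCELERATOR_OPS:
        return "accelerator", ACCELERATOR_OPS[op](xs)
    if op in CPU_OPS:
        return "cpu", CPU_OPS[op](xs)
    raise KeyError(f"no implementation available for {op!r}")


if __name__ == "__main__":
    print(dispatch("relu", [-1.0, 2.0]))            # handled by the accelerator table
    print(dispatch("new_activation", [-1.0, 3.0]))  # falls back to the CPU table
```

In this sketch the CPU table is a superset of the accelerator table, which mirrors the point made above: a capable general-purpose unit keeps newer algorithms runnable before specialized hardware catches up.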

Moreover, tightly coupling CPUs and AI accelerators reduces power consumption by minimizing electrical signal transmission between them, which improves the overall power efficiency of the system. Tenstorrent's approach provides flexibility and efficiency for a broad range of AI applications, addressing the ever-increasing computational demands of the AI revolution.

Scalability and energy efficiency empowered by RISC-V

Tenstorrent's architectural design is distinguished by its scalability and versatility. Lien emphasized that the company's technology can be applied to various domains, from consumer electronics like TVs and cell phones to data centers. The architecture's modular and scalable nature allows Tenstorrent to tailor its solutions to the specific needs of each application, whether it's a small consumer device or a large-scale data center.

Lien acknowledged the critical importance of power efficiency in the AI era, particularly in data centers where power consumption has been growing exponentially. Soaring electricity bills can limit the scalability and sustainability of AI applications.

The power efficiency challenge is evident when considering the deployment of AI models like ChatGPT in search engines. The enormous power consumption associated with running these models at scale makes them prohibitively expensive. For instance, Google's investigation revealed that serving ChatGPT-grade queries for all of its search operations would increase costs by a factor of 100, primarily due to electricity expenses.

Lien emphasized that addressing this challenge requires a balance between computation efficiency, power efficiency, and scalability. Tenstorrent's scalable architecture, coupled with smart design choices, allows the company to optimize power efficiency by lowering the supply voltage when necessary. This ability to scale its technology enables it to run computations in a highly power-efficient manner.
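Why lowering the supply voltage pays off can be seen from the standard first-order model of CMOS dynamic power, P ≈ a · C · V² · f. The Python sketch below uses that textbook approximation; it is not a description of Tenstorrent's power management, and the capacitance, voltage, and frequency values are made-up illustrative numbers.

```python
def dynamic_power(capacitance_f: float, voltage_v: float,
                  frequency_hz: float, activity: float = 1.0) -> float:
    """First-order CMOS dynamic power in watts: P = a * C * V^2 * f.

    This ignores static (leakage) power and is only an approximation."""
    return activity * capacitance_f * voltage_v ** 2 * frequency_hz


def relative_saving(v_nominal: float, v_scaled: float) -> float:
    """Fraction of dynamic power saved by dropping the supply voltage
    from `v_nominal` to `v_scaled` at unchanged frequency."""
    return 1.0 - (v_scaled / v_nominal) ** 2


if __name__ == "__main__":
    # Illustrative values only: 1 nF switched capacitance, 2 GHz clock.
    nominal = dynamic_power(1e-9, 1.0, 2e9)
    scaled = dynamic_power(1e-9, 0.8, 2e9)
    print(f"nominal: {nominal:.2f} W, scaled: {scaled:.2f} W")
    print(f"saving:  {relative_saving(1.0, 0.8):.0%}")
```

Because power depends on the square of the voltage, a 20% voltage reduction cuts dynamic power by about 36% in this model. In practice a lower voltage usually also requires a lower clock frequency, which the sketch leaves out for simplicity.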

Lien described openness as a key element in Tenstorrent's strategy, and the RISC-V instruction set architecture (ISA) plays a pivotal role in achieving this. RISC-V's open and collaborative nature provides a level playing field for various vendors, fostering flexibility and innovation.

Unlike architectures controlled by one or two dominant companies, RISC-V enables a multitude of vendors to implement their own versions of the architecture. This eliminates the need for costly licenses and lengthy negotiations while offering a streamlined approach to innovation. RISC-V's extension mechanism further enhances its adaptability, as it permits vendors to create custom extensions that address specific application needs.
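As a concrete illustration of custom extensions, the RISC-V base specification reserves major opcodes such as custom-0 (binary 0001011) for vendor-defined instructions. The Python sketch below packs a 32-bit R-type instruction word into that opcode space. The field layout follows the published R-type format, but the instruction being encoded is purely hypothetical and is not a Tenstorrent instruction.

```python
CUSTOM_0_OPCODE = 0b0001011  # major opcode reserved for custom extensions
OP_OPCODE = 0b0110011        # standard register-register arithmetic (e.g. add)


def encode_r_type(funct7: int, rs2: int, rs1: int, funct3: int, rd: int,
                  opcode: int = CUSTOM_0_OPCODE) -> int:
    """Pack the R-type fields into a 32-bit RISC-V instruction word.

    Layout, from bit 31 down to bit 0:
    funct7 (7) | rs2 (5) | rs1 (5) | funct3 (3) | rd (5) | opcode (7)."""
    fields = ((funct7, 7), (rs2, 5), (rs1, 5), (funct3, 3), (rd, 5), (opcode, 7))
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"value {value} does not fit in {width} bits")
        word = (word << width) | value
    return word


if __name__ == "__main__":
    # Hypothetical custom instruction using registers x1 <- f(x2, x3).
    print(f"custom-0 word: 0x{encode_r_type(0, 3, 2, 0, 1):08X}")
    # Sanity check against the standard `add x1, x2, x3` encoding.
    print(f"add word:      0x{encode_r_type(0, 3, 2, 0, 1, OP_OPCODE):08X}")
```

Keeping vendor instructions in the reserved custom opcode space means they cannot collide with standard instructions, which is part of what lets different vendors extend the ISA independently.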

Developing supply chain partnerships in Taiwan and Korea

Tenstorrent's partnership with Samsung to build a 4-nm chip is a significant development in its journey. However, it is not an exclusive arrangement, and Lien hinted at potential collaborations with other semiconductor firms and the electronics supply chain in Taiwan.

He cited Taiwan's well-established electronics ecosystem, from wafer foundries, packaging, and testing to PCB and system design, as a rich source of expertise and innovation. Tenstorrent seeks to tap into Taiwan's strengths for the production of its AI chips, servers, and other systems, enabling mass production and cost reduction.

Expanding into auto ICs

Lien discussed Tenstorrent's collaboration with BOS Semiconductor on automotive-grade integrated circuits (ICs), which is part of the Hyundai investment in Tenstorrent. This move aligns with the company's long-term vision of addressing the evolving automotive industry. Lien believes that future cars will resemble smart devices with wheels, where differentiation will come from AI-powered autonomous driving capabilities and power efficiency.

In the era of electric vehicles, the emphasis shifts from powertrain to autonomous driving systems as traditional engines become simplified and standardized. Power efficiency and AI capabilities will be essential for electric vehicles, making Tenstorrent's technology a natural fit for the automotive sector. Additionally, Tenstorrent's modular and scalable architecture enables it to address the computational needs of autonomous driving and robotics, extending the technology's application beyond cars.

Lien also envisions a future of smart mobility enabled by AI, with both inference and training taking place at the edge as well as in the cloud. Some car companies are already pursuing that long-term strategy, aiming, with Tenstorrent's help, to build a technological advantage that sets them apart from other EV makers.

However, Lien highlighted the importance of functional safety in automotive AI, a critical consideration when designing chips for vehicles. Tenstorrent aims to work closely with partners to ensure that its chips meet the rigorous functional safety requirements of the automotive industry.