DTM E58. The Future of AI Chips - Dr. Naveen Verma, EnCharge.ai
From the Deep Tech Musings Podcast - Get actionable and tactical insights to take your Deep Tech startup from 0 to 1 [Idea to Traction].
Listen now on - Spotify, Apple, Google
Naveen Verma, Ph.D., is the Co-founder and CEO of EnCharge AI. At EnCharge, he leads a team of engineering veterans from NVIDIA, Intel, Qualcomm, IBM, and AMD who are commercializing a revolutionary AI chip that solves the energy, scalability, and cost constraints of existing AI compute technologies. Naveen spent the last six years leading the cutting-edge research behind EnCharge AI's core technology as a professor of electrical and computer engineering at Princeton University, where he has done pioneering research in a range of emerging technologies and systems since 2009, including with DARPA funding. His breakthrough, peer-reviewed discoveries in next-generation in-memory computing have been widely recognized and have led to demonstrated step-change increases in performance and efficiency that form the foundation of EnCharge AI's full-stack commercial AI chips.
On today’s episode we discuss,
The Inspiring Journey of EnCharge.ai: From Founding Story to Breakthrough Innovation
The Problem EnCharge is Solving: Why existing chips are inefficient for extreme compute workloads
Solving the Problem from First-Principles Thinking: EnCharge's breakthrough innovation combining analog and digital computing
Major Technologies Comprising EnCharge's Innovation
How EnCharge is Uniquely Positioned: The insight
Can’t NVIDIA Build These Chips?
Links & References:
Naveen Verma - naveen-verma (LinkedIn)
EnCharge - encharge.ai, LinkedIn
Pronojit Saha, DTM Podcast - pronojitsaha (LinkedIn), @pronojits (Twitter)
(0:44) Founding Journey and Motivation
1. My journey began as a researcher, starting as a professor at Princeton University in 2009 after completing graduate work at MIT.
2. I focused on next-generation approaches to computing, especially statistical workloads and machine learning, recognizing their growing importance.
3. My research group aimed for full-stack solutions, exploring new physics, building complex chips, and developing the entire software stack.
4. Collaborations with industry provided insights into real-world challenges and potential solutions, eventually leading to the development of breakthrough AI compute technology in 2017.
5. After six years of refining this technology within the university with grant funding, we spun out EnCharge AI, carefully selecting investors who understood our deep tech approach and assembling a strong, industry-connected team.
(08:37) Key problem EnCharge is addressing
1. Current computing technologies are misaligned with the demands of modern AI and machine learning workloads, making them unsustainable and costly.
2. Access to GPUs is limited and expensive, hindering innovation and development in AI.
3. The industry is transitioning from AI training and development to large-scale deployment, facing challenges in scalability and monetization.
4. Fundamental changes are needed in the compute technology stack, including moving compute outside of data centers to reduce costs, power demands, and privacy concerns.
(12:55) Efficiency of existing chips for extreme workloads
1. The operations and data involved in AI workloads have grown roughly 10,000x over the past decade, demanding unprecedented compute capability.
2. Traditional scaling advances like Moore's Law have slowed and no longer keep pace with these demands.
3. To meet them, new technologies must both compute more efficiently and avoid data-movement bottlenecks, since moving an operand can cost far more energy than computing on it (see the sketch below).
4. EnCharge focuses on architectures that address compute efficiency and data movement simultaneously.
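To make the data-movement point concrete, here is a back-of-the-envelope sketch in Python. The energy figures are illustrative 45 nm numbers widely cited from Mark Horowitz's ISSCC 2014 keynote, not measurements of any particular chip; exact values vary by process, but the ratio between arithmetic and memory access is the point.

```python
# Back-of-the-envelope energy comparison: arithmetic vs. data movement.
# Figures are illustrative 45 nm numbers often cited from Mark Horowitz's
# ISSCC 2014 keynote; treat them as order-of-magnitude only.

PJ_MAC_32B  = 4.6    # ~32-bit float multiply (3.7 pJ) + add (0.9 pJ)
PJ_SRAM_32B = 5.0    # ~32-bit read from a small on-chip SRAM
PJ_DRAM_32B = 640.0  # ~32-bit read from off-chip DRAM

def matvec_energy_pj(rows: int, cols: int, fetch_pj: float) -> float:
    """Energy for a rows x cols matrix-vector multiply where every
    weight is fetched from a memory costing `fetch_pj` per 32-bit word."""
    macs = rows * cols
    return macs * (PJ_MAC_32B + fetch_pj)

R, C = 4096, 4096  # one layer of a modest model
for name, pj in [("DRAM", PJ_DRAM_32B), ("SRAM", PJ_SRAM_32B)]:
    e = matvec_energy_pj(R, C, pj)
    print(f"{name}: {e/1e6:.1f} uJ  (data movement is "
          f"{pj/(pj + PJ_MAC_32B):.0%} of the total)")
```

With DRAM-resident weights, roughly 99% of the energy goes to moving data rather than computing on it, which is why architectures that keep weights in place matter so much.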
(15:35) What are the major technologies at play
1. Traditional digital microprocessor technology has served well for decades, but the recent surge in AI workloads has strained its capabilities.
2. Responses from industry giants like Nvidia, Qualcomm, and Intel have included adapting number formats to lower precisions and specializing hardware accelerators for specific operations like matrix multiplication, prevalent in AI workloads.
3. Despite these advances, digital computing remains limited by its binary nature: each wire carries only a 0 or a 1, leaving the efficiency available in intermediate signal levels untapped. Analog computing, which exploits those levels, could in theory be significantly more efficient.
4. Analog computing, although known for its potential efficiency gains, has historically been hindered by noise, making it challenging to develop robust systems. However, recent urgency to improve efficiency for AI compute has spurred interest and innovation in analog compute concepts.
5. In-memory computing addresses both compute efficiency and data movement by performing computations inside the memory itself. It is particularly well suited to matrix multiplications, which consist of many parallel multiplies followed by a reduction, and analog computing's energy and area efficiency makes it a natural fit because it integrates densely into memory circuits (a functional sketch follows this list).
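Here is a minimal functional model of the in-memory idea in item 5, in plain numpy with hypothetical sizes (real arrays operate on charge or current, not floats). Weights stay resident in the array, activations are broadcast across all rows at once, and each column performs its multiplies in parallel and reduces them on the spot; the low-precision integer weights echo the reduced number formats from item 2.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_out = 256, 256
W = rng.integers(-8, 8, size=(n_in, n_out))  # low-precision weights resident in the array
x = rng.integers(0, 2, size=n_in)            # binary activations broadcast on the wordlines

# Conventional view: fetch every weight to a distant ALU, multiply, add --
# O(n_in * n_out) weight movements.
y_digital = x @ W

# In-memory view: activations drive all rows simultaneously; every cell
# multiplies locally, and each column's bitline reduces its products in one step.
y_in_memory = np.array([np.sum(W[:, j] * x) for j in range(n_out)])

assert np.array_equal(y_digital, y_in_memory)  # same math, no weight movement
```

The two computations are identical; the difference is purely where the multiplies and the reduction physically happen, which is exactly the data-movement saving described above.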
(24:45) Analog Breakthrough: Capacitor-based Computing
1. Explored various computing approaches, including analog, recognizing noise as a challenge due to semiconductor devices' inherent complexities.
2. Introduced capacitor-based computing, utilizing metal wires as capacitors, which are temperature-independent and geometry-controlled, ensuring robustness and precision.
3. Accumulates charge rather than current for signal processing, leveraging the intrinsic precision and scalability of capacitors, even at advanced technology nodes (a toy charge-sharing model follows this list).
4. Eliminates the need for additional process options, allowing tight integration with digital infrastructure for efficient AI execution.
5. Enables the construction of comprehensive architectures with in-memory computing, matrix multiplication engines, floating-point units, and high-density memory, facilitating end-to-end AI execution.
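As a rough intuition for the charge-domain approach, here is a toy Python model of a 1-bit charge-sharing multiply-accumulate. The component values and circuit details are illustrative, not EnCharge's actual design; the point it demonstrates is that the settled output voltage depends only on capacitor ratios, which metal geometry controls precisely and temperature barely affects.

```python
# Toy model of a charge-domain multiply-accumulate (illustrative values).
# Each cell holds a unit metal-wire capacitor. A 1-bit product decides whether
# that capacitor is charged to Vdd or left at 0 V; shorting all capacitors
# together then charge-shares, and the settled voltage encodes the sum of
# products as a capacitor *ratio*.

C_UNIT = 1e-15   # 1 fF unit capacitor (illustrative)
VDD    = 0.9     # supply voltage in volts (illustrative)

def charge_domain_mac(bits_w, bits_x):
    products = [w & x for w, x in zip(bits_w, bits_x)]       # 1-bit multiplies
    n = len(products)
    total_charge = sum(p * C_UNIT * VDD for p in products)   # Q = C * V per cell
    v_out = total_charge / (n * C_UNIT)                      # charge sharing over n caps
    return v_out, sum(products)

v, s = charge_domain_mac([1, 0, 1, 1, 0, 1, 1, 1], [1, 1, 1, 0, 0, 1, 0, 1])
print(f"V_out = {v:.3f} V encodes sum-of-products = {s} (of 8)")
# V_out = VDD * s / n: the absolute value of C_UNIT cancels out entirely.
```

Because the absolute capacitance cancels in the ratio, device-to-device variation and temperature drift largely drop out, which is the robustness argument in items 2 and 3.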
(28:31) Performance Benchmarking and Workloads
1. Developed complete architectures incorporating in-memory computing and digital infrastructure.
2. Emphasized the importance of co-designing hardware and software components, particularly the compiler.
3. Conducted extensive benchmarking across various AI models, including vision, transformer, and large language models.
4. Achieved a 5 to 15 times performance advantage compared to products in more advanced technology nodes.
5. Highlighted the necessity for both efficiency and programmability across diverse AI workloads, especially for localized devices (the sketch below shows how diverse layers lower to one matmul primitive).
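One way to see why compiler co-design (item 2) and programmability (item 5) go together: very different layers all lower to the same matrix-multiply primitive, so one efficient matmul engine plus a good compiler can cover vision and transformer workloads alike. The im2col helper and shapes below are illustrative, not EnCharge's software stack.

```python
import numpy as np

def im2col(x, k):
    """Unfold an HxW single-channel image into (num_patches, k*k) rows."""
    H, W = x.shape
    patches = [x[i:i + k, j:j + k].ravel()
               for i in range(H - k + 1) for j in range(W - k + 1)]
    return np.stack(patches)

rng = np.random.default_rng(0)

# A convolution layer as a matmul: unfolded patches @ flattened kernel.
img, kern = rng.standard_normal((8, 8)), rng.standard_normal((3, 3))
conv_out = im2col(img, 3) @ kern.ravel()   # one matmul

# A transformer attention score as a matmul: queries @ keys^T.
Q, K = rng.standard_normal((4, 16)), rng.standard_normal((4, 16))
scores = Q @ K.T                           # another matmul

print(conv_out.shape, scores.shape)        # both produced by the same primitive
```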
(32:05) Nvidia's Role and Future Solutions
1. Acknowledged Nvidia's significant contribution to compute efficiency in the AI space.
2. Highlighted current challenges such as GPU scarcity and high costs, indicating underlying technology limitations.
3. Emphasized the need for sustainable solutions beyond algorithmic improvements like precision reduction.
4. Suggested future innovations at the algorithmic level, including model reductions and sparsity exploitation (see the sketch after this list).
5. Stressed the importance of fundamental advancements like those pursued by EnCharge, aiming for orders-of-magnitude efficiency gains while remaining aligned with the AI innovation ecosystem.
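As a minimal sketch of the sparsity exploitation mentioned in item 4, here is a CSR-style sparse matrix-vector multiply in plain numpy. The format and the 90% pruning ratio are illustrative assumptions: the idea is simply that storing only non-zero weights lets software, and analogously hardware, skip the pruned work entirely.

```python
import numpy as np

def dense_to_csr(W):
    """Compress a dense matrix to (values, column indices, row pointers)."""
    vals, cols, rowptr = [], [], [0]
    for row in W:
        nz = np.nonzero(row)[0]
        vals.extend(row[nz]); cols.extend(nz)
        rowptr.append(len(vals))
    return np.array(vals), np.array(cols), np.array(rowptr)

def csr_matvec(vals, cols, rowptr, x):
    y = np.zeros(len(rowptr) - 1)
    for i in range(len(y)):            # only the stored non-zeros are touched
        s, e = rowptr[i], rowptr[i + 1]
        y[i] = vals[s:e] @ x[cols[s:e]]
    return y

rng = np.random.default_rng(0)
W = rng.standard_normal((128, 128))
W[rng.random(W.shape) < 0.9] = 0.0     # prune 90% of the weights
x = rng.standard_normal(128)

assert np.allclose(csr_matvec(*dense_to_csr(W), x), W @ x)  # ~10x less work
```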
If you enjoyed this episode, please leave us a rating on Spotify, Apple, or wherever you listen to podcasts. It helps us reach more people who are interested in deep tech & grow the community.
Also, don’t forget to subscribe to Deep Tech Musings Podcast on pronojits.substack.com so you never miss an episode.
Thank you for listening! See you next time!