Because Chinese entities don’t have access to tens of thousands of advanced AI GPUs from companies like Nvidia, companies in the country must innovate to train their advanced AI models. Kai-Fu Lee, founder and CEO of 01.ai, said this week that his company trained one of its advanced artificial intelligence models using 2,000 GPUs for just $3 million.
Unlike competitors such as OpenAI, which spent $80-100 million on training GPT-4 and reportedly as much as $1 billion on GPT-5, 01.ai trained its high-performance model for just $3 million, according to Kai-Fu Lee. According to a chart on the company’s website, 01.ai’s Yi-Lightning ranks sixth in model performance as measured by LMSYS at the University of California, Berkeley.
The achievement is particularly noteworthy for its cost efficiency. The stark difference shows that achieving top-notch AI capabilities doesn’t always require huge budgets: careful engineering and targeted optimizations can deliver similar results at a fraction of the cost and computational resources.
OpenAI is believed to have used 10,000 Nvidia A100 GPUs to train GPT-3 and many more H100 GPUs to train GPT-4 and GPT-4o. Using various groundbreaking solutions, 01.ai managed to train its Yi-Lightning model on just 2,000 undisclosed GPUs. Last year, Kai-Fu Lee said his company had enough GPUs to cover its 1.5-year roadmap. Even so, the numbers don’t quite add up: one Nvidia H100 GPU costs about $30,000, so two thousand of these GPUs would cost around $60 million.
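The back-of-the-envelope math above can be checked directly. The figures below are the illustrative numbers from this article (the actual GPU models and prices 01.ai used are undisclosed):

```python
# Rough hardware-cost check using the article's figures.
# Assumes H100-class GPUs at the approximate list price cited above.
gpu_count = 2_000
h100_unit_price = 30_000  # approximate USD price of one Nvidia H100

fleet_cost = gpu_count * h100_unit_price
print(f"Hardware cost for 2,000 H100s: ${fleet_cost:,}")   # $60,000,000

training_cost = 3_000_000  # 01.ai's reported training spend
print(f"Reported training spend:       ${training_cost:,}")
print(f"Ratio (hardware / training):   {fleet_cost // training_cost}x")
```

The gap suggests the $3 million figure covers the training run itself (e.g., compute time), not the purchase price of the hardware.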
To improve model performance, 01.ai focused on reducing inference bottlenecks by transforming computational requirements into memory-oriented tasks, building a multi-layer caching system, and designing a specialized inference engine to optimize speed and resource allocation. As a result, 01.ai’s inference costs are significantly lower than those of similar models: 10 cents per million tokens, about 1/30 of the typical rate charged for comparable models.
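The multi-layer caching idea can be illustrated with a minimal two-tier cache: a small, fast front layer (think GPU memory) that evicts into a larger, slower back layer (think host RAM) instead of discarding results outright. This is only a sketch of the general technique; 01.ai’s actual caching system is not publicly documented, and the class below is hypothetical:

```python
from collections import OrderedDict

class TwoLayerCache:
    """Illustrative two-layer LRU cache: evictions from the small fast
    layer (L1) are demoted to a larger slow layer (L2) rather than lost."""

    def __init__(self, l1_capacity=2, l2_capacity=8):
        self.l1 = OrderedDict()  # fast layer, e.g. GPU memory
        self.l2 = OrderedDict()  # larger, slower layer, e.g. host RAM
        self.l1_capacity = l1_capacity
        self.l2_capacity = l2_capacity

    def get(self, key):
        if key in self.l1:
            self.l1.move_to_end(key)   # mark as most recently used
            return self.l1[key]
        if key in self.l2:
            value = self.l2.pop(key)   # hit in slow layer:
            self.put(key, value)       # promote back to the fast layer
            return value
        return None  # miss in both layers: caller must recompute

    def put(self, key, value):
        self.l1[key] = value
        self.l1.move_to_end(key)
        if len(self.l1) > self.l1_capacity:
            old_key, old_value = self.l1.popitem(last=False)  # evict LRU
            self.l2[old_key] = old_value  # demote instead of discarding
            if len(self.l2) > self.l2_capacity:
                self.l2.popitem(last=False)  # slow layer full: drop oldest
```

Reusing cached intermediate results this way (for example, attention key/value states for repeated prompt prefixes) trades cheap memory for expensive recomputation, which is one common way a compute-bound workload is turned into a memory-bound one.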
“When we only have 2,000 GPUs, the team has to figure out how to use them,” Kai-Fu Lee said. “I, as CEO, have to figure out how to prioritize, and then we not only have to train quickly, but we also have to come to conclusions quickly. So we design our inference by finding bottlenecks in the overall process, trying to transform a computational problem into a memory problem, building a multi-layer cache, building a specific inference engine, and so on. But the most important thing is that our inference cost is 10 cents per million tokens.”
Chinese companies in general, and 01.ai in particular, face serious challenges. Due to US export restrictions, they have minimal access to advanced GPUs. They also have lower valuations than US AI companies, which limits their available resources and investment opportunities.