George Hotz’s tinycorp Enters AMD Chips into MLPerf Training Benchmark

MLCommons, a consortium whose members include major industry players such as Amazon, AMD, Google, Intel, and Nvidia, has released the fourth edition of its MLPerf Training benchmarks, which focus on large-scale AI training. The latest round introduces new tests, such as fine-tuning large language models and training graph neural networks. Also notable is the participation of Sustainable Metal Cloud (SMC), which contributed power measurements across 24 submissions to highlight the energy-saving potential of its proprietary immersion cooling technology.

Among the various participants, a small but remarkable entry came from tinycorp, the company led by former hacker George Hotz, which has positioned itself in AI acceleration by submitting the first-ever AMD results to MLPerf Training v4.0. In a field usually dominated by Nvidia's H100 accelerators, tinycorp presented results for the tinybox red, fitted with six Radeon RX 7900 XTX GPUs, and the tinybox green, equipped with Nvidia GeForce RTX 4090 cards.

The tinybox red completed the benchmark in 167.15 minutes, while the green variant posted 122.08 minutes, leaving considerable room for improvement against the results set by dedicated data center accelerators. Despite the slower times, tinycorp aims to highlight the economics, banking on a cost advantage that could translate into real-world savings where time equates to money.

Hotz has fulfilled his promise to bring AMD accelerators into MLPerf in 2024. The tinycorp website lists the tinybox green (Nvidia) at $25,000 with excellent driver quality and the tinybox red (AMD) at $15,000 with only mediocre driver quality, both with a shipping timeframe of two to five months. Despite Hotz's earlier frustration with AMD's ROCm framework, both he and AMD CEO Dr. Lisa Su now appear optimistic about future improvements following their discussions.
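As a back-of-the-envelope illustration of the cost argument, one can weigh the listed prices against the reported runtimes. The "price times runtime" score below (lower is better) is our own simplification, not a metric tinycorp or MLCommons publishes, and it ignores power draw, cooling, and driver maturity:

```python
# Illustrative price-performance sketch using the figures quoted in this article.
# "price * runtime" (lower is better) is a crude, made-up metric for comparison only.

boxes = {
    "tinybox red":   {"price_usd": 15_000, "runtime_min": 167.15},  # 6x Radeon RX 7900 XTX
    "tinybox green": {"price_usd": 25_000, "runtime_min": 122.08},  # Nvidia RTX 4090 cards
}

for name, b in boxes.items():
    score = b["price_usd"] * b["runtime_min"]
    print(f"{name}: {score:,.0f} USD-minutes")

# The red box takes ~37% longer but costs 40% less, so it wins on this crude metric.
slowdown = boxes["tinybox red"]["runtime_min"] / boxes["tinybox green"]["runtime_min"]
print(f"red/green runtime ratio: {slowdown:.2f}")
```

On these numbers alone, the AMD-based red box comes out ahead, which is essentially the trade-off tinycorp is pitching; a real comparison would also have to account for energy cost and development friction.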

The biannual MLPerf benchmarking competitions allow different manufacturers to showcase their AI model training capabilities through a variety of tasks, continually pushing the limits of technology and performance.

George Hotz's tinycorp entry into the MLPerf Training benchmark with AMD chips is significant for several reasons. It shows that smaller players can compete in a field dominated by tech giants, and that AMD GPUs, often overshadowed by Nvidia's prominence in AI acceleration, can also be viable for AI training tasks.

The primary questions surrounding tinycorp’s participation in the MLPerf Training Benchmarks include:

1. How do the AMD chips in tinybox red compare to Nvidia’s H100 accelerators and Nvidia GeForce RTX 4090 cards in terms of performance?
2. What are the economic benefits of using AMD chips for AI training, as highlighted by tinycorp?
3. What potential improvements could the partnership between George Hotz and AMD’s CEO, Dr. Lisa Su, bring to AMD’s ROCm framework?

The key challenges and controversies associated with the topic include the following:

– There is an ongoing challenge for AMD to improve its ROCm framework and driver quality to compete more effectively with Nvidia’s mature ecosystem.
– George Hotz has previously criticized the quality of AMD's machine-learning support through its ROCm framework.

The advantages of tinycorp entering the MLPerf Benchmark include:

– Providing a more cost-effective alternative for AI training with the tinybox red utilizing AMD GPUs.
– Encouraging competition in a market heavily dominated by Nvidia, which can lead to more innovation and potential price reductions.
– Contributing to a broader understanding of the capabilities and performance of different GPU options available for machine learning tasks.

The disadvantages might include:

– The possibility that the performance of AMD chips lags behind that of Nvidia, thereby affecting the efficiency and throughput in machine learning tasks.
– The current mediocre driver quality of AMD chips for AI applications, which can potentially increase the complexity and time spent on development and troubleshooting.
– The need to wait for future improvements to the ROCm framework to fully realize the potential of AMD chips in machine learning.

For additional information about this news, see the websites of:
MLCommons
AMD
NVIDIA

These websites provide broader insights into the industry, current projects, and the hardware capabilities pertinent to AI training and machine learning benchmarks.
