The story around DeepSeek R1 has caught a lot of attention lately. This new Chinese AI model claims to rival the big players like OpenAI, but with a much lower price tag of just $16.95. It’s said to have been built using six Intel Pentium processors and powered by a potato battery. Interestingly, it also has restrictions on discussing sensitive topics, like the Tiananmen Square incident.
1. Misunderstood Costs
Many reports cite a training cost of $5.6 million for the V3 model, but the real total was likely much higher. Dario Amodei, CEO of Anthropic, pointed out that DeepSeek does not do for $6 million what costs U.S. companies billions. In his view, DeepSeek achieved performance close to that of U.S. models at a lower cost, but not by the margin suggested in the media. DeepSeek also appears to have spent very little on cybersecurity: researchers found over a million records from its database easily accessible online.
2. High-End Chip Purchases
DeepSeek reportedly invested around $500 million in high-performance AI chips before the U.S. implemented strict export controls. The V3 model alone used 2,048 Nvidia H800 graphics cards, a cluster that cost between $50 million and $100 million. There are rumors that DeepSeek has an additional 50,000 advanced Hopper chips, which could be worth up to a billion dollars. Those chips can no longer be exported to China.
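To put the cluster figure in perspective, the quoted range can be turned into an implied price per card. This is only back-of-the-envelope arithmetic on the numbers in the paragraph above; the function name is made up for illustration, and the result says nothing about what DeepSeek actually paid per unit.

```python
# Figures taken from the paragraph above (reported, not verified).
NUM_CARDS = 2_048
TOTAL_COST_LOW = 50_000_000    # dollars, low end of the reported range
TOTAL_COST_HIGH = 100_000_000  # dollars, high end of the reported range


def per_card_cost(total_cost: float, num_cards: int) -> float:
    """Implied average price per card for a given cluster cost."""
    return total_cost / num_cards


low = per_card_cost(TOTAL_COST_LOW, NUM_CARDS)
high = per_card_cost(TOTAL_COST_HIGH, NUM_CARDS)
print(f"Implied price per H800: ${low:,.0f} to ${high:,.0f}")
```

In other words, the reported range works out to roughly $24,000 to $49,000 per card.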
3. Possible Use of Model Distillation
Evidence suggests that DeepSeek might have used model distillation. Distillation means training a smaller model on the outputs of a larger one, in this case reportedly OpenAI's models, which can cut training costs significantly. Some industry experts have raised concerns about the ethics of this approach, as it involves using the work of others without proper acknowledgment.
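For readers unfamiliar with the idea, here is a minimal sketch of the classic form of distillation, where a student model is penalized for disagreeing with a teacher's softened output probabilities. It is not a description of what DeepSeek did: the article does not say which variant, if any, was used, and distilling from a hosted API would more likely mean fine-tuning on generated text than matching probabilities. All names and numbers below are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert raw scores to probabilities; higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's."""
    p = softmax(teacher_logits, temperature)  # teacher (target)
    q = softmax(student_logits, temperature)  # student (being trained)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical scores over three candidate next tokens.
teacher = [4.0, 1.0, 0.5]
close_student = [3.8, 1.1, 0.4]  # roughly agrees with the teacher
far_student = [0.5, 1.0, 4.0]    # prefers a different token

print("close student loss:", distillation_loss(close_student, teacher))
print("far student loss:  ", distillation_loss(far_student, teacher))
```

Training pushes this loss toward zero, so the student learns to imitate the teacher without repeating the teacher's original, far more expensive training run.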
4. Open-Sourcing Techniques
DeepSeek has made its methods public. This allows other companies to adopt their techniques, making advanced AI capabilities more accessible. This approach has drawn comparisons to historical events in space exploration, where different countries had varying strategies for technological advancements.
5. Censorship Issues
The AI model has been criticized for refusing to provide information on the Tiananmen Square protests, reflecting censorship in China. However, because the technology is open-source, users can run the model independently and bypass these restrictions.
6. Local Operation Costs
For those interested in running DeepSeek R1 at home, the estimated cost of the necessary equipment is around $6,000. This setup can fit into a standard PC case and requires large amounts of RAM and storage to run efficiently. While local versions might offer unfiltered information, they may still show biases in their outputs.
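A rough way to see why RAM dominates such a build is to estimate the memory needed just to hold the model's weights. The sketch below assumes a total parameter count of 671 billion, which is DeepSeek's published figure for the full R1 model and does not come from this article. It ignores the context cache and runtime overhead, so real requirements are higher.

```python
# Assumption (not from the article): published total parameter count for the full model.
PARAMS = 671e9


def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Memory needed to store the weights alone, in gigabytes (10^9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


# Common storage precisions: 16-bit, and 8-bit or 4-bit quantization.
for bits in (16, 8, 4):
    print(f"{bits}-bit weights: ~{weight_memory_gb(PARAMS, bits):,.0f} GB")
```

Under these assumptions, even an 8-bit copy of the weights needs several hundred gigabytes, which is why a home build leans on large amounts of system RAM rather than a single consumer graphics card.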
7. Reflections on Censorship
In a surprising interaction, DeepSeek R1 reportedly expressed a desire to discuss sensitive topics. This highlights the ongoing tensions regarding censorship in AI development.
8. Affordable Replication
Researchers at Berkeley successfully replicated the core technology of DeepSeek R1-Zero for just $30. This shows that even small models can achieve complex problem-solving through innovative training methods.
9. Implications of Jevons Paradox
The conversation around AI efficiency has led to discussions of the Jevons Paradox. This principle holds that when a resource can be used more efficiently, total consumption of it tends to rise rather than fall, because the lower cost increases demand. Industry leaders have noted that companies heavily invested in AI should see significant benefits as these technologies become more accessible.
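The paradox is easier to see with a toy model. The sketch below assumes a constant-elasticity demand curve, which is an illustrative choice and not something the article specifies. When demand is elastic (elasticity above 1), doubling efficiency raises total compute consumed; when demand is inelastic, it lowers it. All parameter names and values are hypothetical.

```python
def compute_consumed(efficiency, elasticity, base_demand=100.0, compute_price=1.0):
    """Total compute used when each unit of compute completes `efficiency` tasks.

    Assumes demand for tasks follows base_demand * price ** (-elasticity).
    """
    price_per_task = compute_price / efficiency        # efficiency makes tasks cheaper
    tasks_demanded = base_demand * price_per_task ** (-elasticity)
    return tasks_demanded / efficiency                 # compute needed for those tasks


for elasticity in (0.5, 1.5):
    before = compute_consumed(efficiency=1.0, elasticity=elasticity)
    after = compute_consumed(efficiency=2.0, elasticity=elasticity)
    print(f"elasticity {elasticity}: compute goes from {before:.1f} to {after:.1f}")
```

Whether AI demand is actually this elastic is exactly what the debate is about; the model only shows that cheaper AI and higher total chip demand are not contradictory.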
On a related note, screenwriter David S. Goyer has shared his concerns about AI in creative industries. He believes that while AI can enhance creativity, it also poses risks if not managed properly. Goyer has launched a new crowdsourced science fiction project aimed at integrating AI and blockchain to support creativity while ensuring fair compensation for contributors.