China’s AI research lab DeepSeek has pulled off a feat that has shocked the world of artificial intelligence (AI). The lab has released a new open-source model, DeepSeek-R1, that not only competes with giants like OpenAI but, in many cases, outperforms them. Notably, DeepSeek did this at very low cost, which has sparked a new debate across the global AI industry.
What is DeepSeek and how did this dream become reality?
DeepSeek is an independent AI research lab in China, founded by Liang Wenfeng in 2023. It grew out of High-Flyer, a hedge fund that used advanced computing to analyze financial data. Liang redirected High-Flyer’s resources toward DeepSeek to pursue something bigger in the field of AI.
DeepSeek is not backed by large Chinese companies such as Baidu or Alibaba; it operates entirely independently. Liang’s aim is not merely to turn a profit but to bring new technologies to the world through scientific discovery.
DeepSeek-R1: The new star of the AI world
DeepSeek’s new model, DeepSeek-R1, has set a new example in the world of AI. The model performs strongly on tasks such as math and coding thanks to its reasoning ability. DeepSeek has open-sourced not only its flagship model but also smaller versions of it for developers. All of these models were released under the MIT license, which allows developers to fine-tune and customize them.
The most notable thing about this model is how economically it was trained. DeepSeek used newer techniques, such as multi-head latent attention (MLA) and mixture-of-experts (MoE), which greatly reduced its costs. According to reports, DeepSeek trained its model with only about 10% of the resources used for Meta’s Llama model.
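To make the mixture-of-experts idea concrete, here is a minimal sketch in plain Python. It is a generic illustration of top-k expert routing, not DeepSeek’s implementation: the names `moe_forward` and `make_expert`, the toy gate weights, and the choice of two active experts are all assumptions made for the example. The point it shows is that only a few experts run for each input, so compute per input stays small even when the total number of experts is large.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def make_expert(scale):
    # Hypothetical expert: a trivial scaling function standing in
    # for a real feed-forward network.
    return lambda x: [scale * v for v in x]

def moe_forward(x, gate_weights, experts, top_k=2):
    # One gate score per expert: dot product of the input with that
    # expert's gate weight vector.
    scores = [sum(w * v for w, v in zip(row, x)) for row in gate_weights]
    # Keep only the top_k highest-scoring experts; the rest are skipped.
    chosen = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    # Normalize the chosen experts' scores into mixing weights.
    weights = softmax([scores[i] for i in chosen])
    output = [0.0] * len(x)
    for weight, index in zip(weights, chosen):
        expert_out = experts[index](x)
        output = [o + weight * v for o, v in zip(output, expert_out)]
    return output, chosen

# Toy setup (assumed values): four experts, two-dimensional input.
experts = [make_expert(s) for s in (1.0, 2.0, 3.0, 4.0)]
gate_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
output, chosen = moe_forward([2.0, 1.0], gate_weights, experts, top_k=2)
print("experts used:", chosen)
print("output:", output)
```

In this toy run only two of the four experts are evaluated; a production model would use learned gate weights and neural-network experts, but the routing step follows the same pattern.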
Liang Wenfeng: The brain behind DeepSeek
The journey of Liang Wenfeng, the founder of DeepSeek, is an inspiring one. Born in 1985, Liang studied engineering at Zhejiang University and then entered the hedge fund industry. But his real ambition was to do something big in the world of AI.
At DeepSeek, Liang gave opportunities to young researchers from China’s top universities, such as Peking University and Tsinghua University. These young scientists not only excelled academically but were also fully committed to innovation. Liang believes that young minds are more adventurous and more willing to take risks, which is essential in a field like AI.
US sanctions and DeepSeek’s smart strategy
In 2022, the US banned the export of advanced chips, such as Nvidia’s H100, to China. This dealt a blow to China’s AI industry, but DeepSeek turned the challenge to its advantage.
DeepSeek trained its models using cost-effective, carefully engineered techniques. It adopted measures such as custom communication between chips and memory optimization, which delivered strong results even with limited resources. These strategies showed that success in AI does not depend on expensive hardware alone.
DeepSeek made three major changes to its approach:
Custom communication schemes: Made data exchange between chips efficient enough to save memory.
Memory optimization: Reduced the size of data fields to conserve resources.
Mix-of-models: Combined smaller models to produce results that compete with larger models.
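The memory optimization point can be illustrated with some simple arithmetic. The article does not say which fields DeepSeek shrank or by how much, so the sketch below is only a generic example: it stores the same 1,000 made-up numbers first as 8-byte values and then as 4-byte values, using Python’s standard `array` module, and measures how much precision is lost. The sizes quoted in the comments assume a typical platform where type code `d` takes 8 bytes and `f` takes 4.

```python
from array import array

# Made-up sample data standing in for values a model has to keep in memory.
values = [0.1 * i for i in range(1000)]

# Same numbers, two field sizes: double precision vs. single precision.
as_double = array("d", values)
as_float = array("f", values)

bytes_double = len(as_double) * as_double.itemsize
bytes_float = len(as_float) * as_float.itemsize

# The price of the smaller field: a small rounding error per value.
max_error = max(abs(a - b) for a, b in zip(as_double, as_float))

print("double precision:", bytes_double, "bytes")
print("single precision:", bytes_float, "bytes")
print("largest rounding error:", max_error)
```

Halving the field size halves the memory needed for the same number of values, at the cost of a small rounding error. That trade-off is the general idea behind this kind of optimization.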
DeepSeek’s influence is growing with its open-source model
DeepSeek has caught the attention of the whole world by open-sourcing its AI models. Under the MIT license, any developer can use these models and customize them according to their needs. This move not only eases access to AI technologies, but also challenges the dominance of Western companies.
The new AI race: America vs. China
DeepSeek’s success has shocked the United States, which is now pursuing large projects to protect its lead in AI. However, DeepSeek has shown that big investments and expensive chips alone are not enough to win in AI.