Microsoft & OpenAI Investigate if DeepSeek Obtained Data from OpenAI

Security researchers from Microsoft believe that individuals possibly linked to DeepSeek are “exfiltrating a large amount of data” using OpenAI’s API. 

Illustration by Supreeth Koundinya

OpenAI, the company behind the GPT/o1 series of models, and Microsoft are investigating whether Chinese AI startup DeepSeek obtained unauthorised data outputs from OpenAI’s models. 

As reported by Bloomberg, security researchers from Microsoft believe that individuals possibly linked to DeepSeek are “exfiltrating a large amount of data” using OpenAI’s API. 

Microsoft is OpenAI’s largest investor and, as per reports, it notified OpenAI of the suspected activity, which would violate OpenAI’s terms of service. 

Moreover, several DeepSeek users on social media have speculated that the model’s responses show tendencies similar to those of OpenAI’s models. 

OpenAI also told the Financial Times that it had seen “some evidence of distillation”, which is a technique to improve the performance of an AI model by using outputs from another one.
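
To make the idea of distillation concrete, here is a minimal sketch in plain Python. It is an illustration only and not a description of what DeepSeek or OpenAI actually did: it assumes the classic soft-label setup, in which a "student" model is trained to match a "teacher" model's output probabilities. The function names, the temperature value, and the example scores are all hypothetical. When only generated text is available from a teacher, as is typically the case through an API, the analogous approach would be to train the student on that text directly.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's.

    The loss is zero when the student reproduces the teacher exactly and
    grows as the two distributions diverge.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical scores over three candidate answers.
teacher = [4.0, 1.0, 0.5]
close_student = [3.8, 1.1, 0.4]  # roughly agrees with the teacher
far_student = [0.5, 1.0, 4.0]    # prefers a different answer

loss_close = distillation_loss(teacher, close_student)
loss_far = distillation_loss(teacher, far_student)
print(f"close student loss: {loss_close:.4f}")
print(f"far student loss:   {loss_far:.4f}")
```

Minimising this loss during training pushes the student's answers towards the teacher's, which is how a smaller or cheaper model can inherit behaviour from a stronger one.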

A user on Reddit also spotted the DeepSeek model trying to generate an answer that complies with OpenAI’s terms of use. 

DeepSeek’s latest reasoning model, R1, has outperformed OpenAI’s o1, the company’s most powerful model available for public use. R1 scored higher than o1 on multiple benchmarks involving logic, reasoning, coding, and mathematics. 

Recently, DeepSeek’s official app dethroned OpenAI’s ChatGPT and other competing AI apps in the ‘Top Charts’ on the US App Store for iPhone and iPad.  

Around $589 billion was also wiped off GPU giant NVIDIA’s market cap. This was likely because DeepSeek’s models were built with relatively little compute and capital, raising concerns about the demand for GPUs and other AI resources needed to build state-of-the-art models. 

For instance, one of DeepSeek’s previous models, V3, used only about 2,048 NVIDIA H800 GPUs to achieve better performance than most open-source models. It also took only $5.5 million to train the model. 

Andrej Karpathy, a former OpenAI researcher, said that DeepSeek V3’s level of capability is “supposed to require clusters of closer to 16,000 GPUs”. 

DeepSeek’s parent company, High-Flyer, is a Chinese hedge fund. While High-Flyer was founded in 2015, the DeepSeek project started in 2023.

US President Donald Trump said, “The release of DeepSeek AI from a Chinese company should be a wake-up call for our industries.” He added that he views DeepSeek producing an AI model using cheaper methods “as a positive”. 

DeepSeek has also announced Janus Pro, an AI image generation model, which is claimed to offer better results than OpenAI’s DALL-E 3. 


Supreeth Koundinya

Supreeth is an engineering graduate who is curious about the world of artificial intelligence and loves to write stories on how it is solving problems and shaping the future of humanity.