DeepSeek Didn’t Win Anything, but It Is Interesting

Hey, maybe open source and regulation are good things. Who’da thunk it?

Welp. That hurt my portfolio. Turns out most of the market didn’t quite understand how this whole AI thing works, so that’s fun. Anyway, the big deal with DeepSeek isn’t that it’s beating out the models from the US; it’s that it’s doing almost as well, and doing it far more cheaply, which is actually a little more interesting.

What Happened?

Well, the DeepSeek models are open source, which is unusual for a model of this size. That brings some interesting benefits, including a level of transparency that’s missing from a lot of other models. The other element of this story is regulation: AI companies have insisted that regulation would harm innovation…and it, well, didn’t. So, let’s get into it.

Regulation

So, all of these AI companies had a brief phase where they welcomed AI regulation, but that didn’t last long. A lot of them have ended up cozying up to politicians who champion deregulation of AI, and in the past week or two those politicians have actually delivered it.

Here’s where the irony lies: there was a ban on selling particular Nvidia chips to Chinese customers, which meant China couldn’t get access to the same super-powerful hardware that companies here have. Those GPUs were supposedly so important that Nvidia was fighting for the title of most valuable company in the world. What actually happened was that Chinese companies, working with much less powerful GPUs, found ways to make the training process dramatically cheaper. Turns out the regulation yielded a better result.

The biggest cost when it comes to LLMs and other similar models (both monetarily and computationally) is in the training. Once a model is trained, it doesn’t actually take all that much effort to run it well, provided you have enough RAM to hold the model in memory. For example, I can load a version of DeepSeek on my M2 MacBook Air and it runs just fine: no internet connection, nothing. If someone can prove that you don’t need the latest and greatest GPUs to get this kind of quality out of a model, that’s a big problem for Nvidia (and it was). It also means the companies (cough, OpenAI, cough) asking for trillions of dollars in funding don’t actually need that much to build a good product.
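If you want to try that local setup yourself, here’s a minimal sketch of what it can look like. This assumes you’ve installed Ollama and pulled one of the distilled DeepSeek checkpoints; the deepseek-r1:8b tag and the ollama Python package are my illustrative choices, not something prescribed above.

```python
# Minimal local-inference sketch using the `ollama` Python client.
# Assumes Ollama is installed and you've already run:
#   ollama pull deepseek-r1:8b   (tag is illustrative; pick one that fits your RAM)
import ollama

response = ollama.chat(
    model="deepseek-r1:8b",
    messages=[{"role": "user", "content": "Why is inference so much cheaper than training?"}],
)

# Everything here runs on the local machine; once the weights are
# downloaded, no internet connection is needed.
print(response["message"]["content"])
```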

Open-Source

This is another big thing that companies have been pushing against for a while. Open-source code means that the source code for a particular website, app, or model is available for anyone to read, so you can actually see what the code is doing. There are some gray areas here, because neural networks are, definitionally, black boxes, but there’s a real security benefit to having the code out in the open: anyone can read it, report vulnerabilities, and even fix them, whether through a pull request to change the source code or a fork to create their own version of it. This is also helpful for spotting things like censorship and data collection. Granted, there’s no way to see that within a model itself, but if the application that accesses the model is also open source, then you’ll be able to see that as well.

Even Apple has been dipping into the open-source AI community with its OpenELM models, whose research findings paved the way for much of Apple Intelligence to run on device, since OpenELM pioneered highly memory-efficient model sizes (ELM stands for ‘Efficient Language Model,’ in case you were wondering). You know there’s something to the open-source train when Apple, a company that keeps everything super close to the vest, hops on it.
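For a sense of how approachable these checkpoints are, here’s a hedged sketch of loading one of the OpenELM models Apple published on Hugging Face. The apple/OpenELM-270M checkpoint, the trust_remote_code flag, and the Llama 2 tokenizer pairing come from Apple’s model card; treat the rest as illustrative.

```python
# Sketch: loading one of Apple's small open-source OpenELM checkpoints.
# Assumes `transformers` and `torch` are installed. OpenELM ships a custom
# model class, so trust_remote_code=True is required; per Apple's model card,
# the checkpoint pairs with the Llama 2 tokenizer (a gated repo on Hugging Face).
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "apple/OpenELM-270M",  # 270M parameters: small enough to run on a laptop
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

inputs = tokenizer("Small, efficient models can", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```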

Another benefit of open-source AI is open-source datasets. Companies are always wishy-washy about what data they used to train their models. Some of those data can be pulled from…less than reputable places, or through questionably legal means. Open-source datasets can help remove this mystery by showing people exactly what data were used to train the model.
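To make that concrete, here’s what auditing an openly published dataset can look like with the Hugging Face datasets library. The allenai/c4 corpus is just my stand-in example; the point is that anyone can pull up the exact records a disclosed dataset contains.

```python
# Sketch: inspecting an openly published training dataset.
# allenai/c4 is a stand-in for whatever corpus a model actually discloses.
from datasets import load_dataset

# Stream the dataset so we don't download the full multi-terabyte corpus.
ds = load_dataset("allenai/c4", "en", split="train", streaming=True)

# With the data in the open, anyone can audit exactly what went into training.
for example in ds.take(3):
    print(example["url"])
    print(example["text"][:200], "\n")
```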

Closing Thoughts

Anyone who knows anything about building statistical models knows that the key to a good model is to make it as simple as possible and train it on high-quality data. The problem with these AI companies is that they don’t really seem to care. They act like more data can fix the data-quality problem, but they’re now being shown that this isn’t the case. Much like Apple’s transition to ARM chips in the Mac showed that you don’t need to keep drawing more power to get a more powerful chip, DeepSeek’s models show that you don’t need to hoover up more data or throw inefficient training at the problem to make a good model. The basics still hold up, and that’s what’s really messing with these AI companies.

What are your thoughts on the DeepSeek models? What do you think this means for AI companies? Let me know over on Mastodon.