DeepSeek is sorta kinda Open Source


The moment I heard about DeepSeek, I ran to check my stock portfolio to see how badly I was affected. Everyone and their grandmother is predicting the downfall of Western Civilization as we know it. Nothing has changed in my portfolio, because of course I was too late to invest in Nvidia. I am still late, but that's not the point here. The point is that DeepSeek is the new Open Source model that rivals ChatGPT on a shoestring budget.

It cost OpenAI upward of $75 million to train their models, and with each new iteration that price continues to grow. Those models are proprietary to the company and are not publicly available for anyone to use. We can refer to them as closed source, since only OpenAI has access to them. So when I heard DeepSeek is open source, and that their latest model was trained on a budget of $6 million, my first thought was to go download the training data and run it myself.

And that's where the open source term kinda falls apart. When we say an AI model is open source, we swallow a gulp of air between open and source. We imagine access to the code, the data, and the inner workings. But in reality, the training data is never available. That data would include everything scraped from the internet, be it text, video, or images, the majority of which is copyrighted. When we say a model is open source, what we are really referring to is the weights generated through training. It is open weights. (Doesn't have the same ring to it, does it?)
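To make the "open weights" idea concrete, here's a minimal sketch of what you actually get when you download one of these models. It assumes you have a local weights file in the safetensors format (the `model.safetensors` path here is hypothetical) and the `safetensors` and `torch` packages installed:

```python
# Minimal sketch: peeking inside an "open weights" file.
# Assumes a local model.safetensors file (hypothetical path);
# requires `pip install safetensors torch`.
from safetensors import safe_open

with safe_open("model.safetensors", framework="pt") as f:
    # Print the first few tensors: layer names, shapes, dtypes.
    for name in list(f.keys())[:5]:
        tensor = f.get_tensor(name)
        print(name, tuple(tensor.shape), tensor.dtype)
```

All you find are named arrays of numbers, the weights. There is no trace of the text, video, or images the model was trained on, and no way to reconstruct them.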

For example, OpenAI or DeepSeek will gather data from around the web, build and train a model using this data, then call it AI. In the case of OpenAI, they keep the resulting model a secret only they can enjoy. DeepSeek releases its models to the public as R1 and V3. These models are still a black box; we can't roll them back into the original data. But we can use them on our own devices. Anyone with the resources can download the model and run it.
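As a sketch of what "download the model and run it" looks like in practice, here's one way to do it with the Hugging Face transformers library. The model ID below points at one of the small distilled R1 checkpoints DeepSeek published; treat the exact ID as an assumption, and note the full R1 model is far too large for a laptop:

```python
# Minimal sketch: running an open-weights DeepSeek model locally.
# Requires `pip install transformers torch`. The model ID is an
# assumption: a small distilled R1 checkpoint that fits on
# consumer hardware.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("What does open source mean?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

No API key, no terms-of-service gate: the weights come down to your machine and inference runs locally. That's the part of "open" that is genuinely real here.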

It's not open source in the traditional sense, where you have full access to the data. It's open source as in you have free access to the model and its weights. We probably won't ever get an open source model in the true sense of the term. For now, we will have to settle for "sorta kinda" open source.

As an aside, the market is responding to this news as if Nvidia has lost its edge now that we have a model that can be trained more cheaply. I think this is absurd. It's more like we had an inefficient algorithm that required 100% of the CPU to run, and now we've come up with a better algorithm that takes only 6%. If anything, we can do more with less hardware. In fact, now that all this hardware is freed up, it will be trivial to do more training. I suspect we will see a leap in the Llama models in the coming days.