Making Sense of DeepSeek
Four ways to examine the DeepSeek hype: building a multi-disciplinary perspective
Wow! It's been a while since something went this viral out of China…
Actually it's only been about 5 years

Since the last two weeks' worth of news was dominated by DeepSeek and the host of narratives swirling around its implications (technical, political, social), I thought it would be fun to pull some of the best takes I've seen into one place here and try to weave them together into a more holistic view of the topic.
As with all things AI, reactions tend to range from "It's the end of everything as we know it" to "meh".
As someone who reads this news every day from a variety of sources, I find it easy to tell which of those sources people pay attention to most, based on the angle they lead with when they say stuff like: "So is this Chinese Sputnik moment keeping you up at night?"
And while it's true that DeepSeek's arrival sent shockwaves through the AI industry as a whole (and world markets for that matter), understanding the ways its impact will ultimately manifest demands a multi-disciplinary approach.
So here are four ways to look at the DeepSeek "moment":
1️⃣ Technical: What's Actually New About DeepSeek?
DeepSeek's advancements aren't just about catching up to US-led companies; they're about taking AI development in new directions entirely. Some key differentiators:
Training Scale & Efficiency: This is really the most important one. DeepSeek's models appear HIGHLY efficient in parameter usage, competing with OpenAI and Anthropic at a fraction of their compute cost.
By "fraction" I mean roughly 0.086% of their training spend
If we can believe their reports, training costs for DeepSeek R1 were about $6 million. Compare that to the ~$7 billion OpenAI is said to have dedicated to new model training in 2024… Of course, that's not all going to o1 (the reasoning competitor to DeepSeek's R1), but still…
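For anyone who wants to check the back-of-the-napkin math, here is a minimal sketch. It assumes the two reported figures above are accurate (neither is verified) and that they are even comparable, which is generous since one is a single training run and the other is a full year of model training spend. The variable names are just illustrative.

```python
# Both figures are reported estimates from the post, not verified numbers.
deepseek_r1_training_usd = 6_000_000        # reported cost of the R1 training run
openai_2024_training_usd = 7_000_000_000    # reported OpenAI 2024 training spend (all models)

# Share of OpenAI's reported annual training spend that one R1 run represents.
ratio = deepseek_r1_training_usd / openai_2024_training_usd
percent = ratio * 100

print(f"R1 run as a share of OpenAI's 2024 training spend: {percent:.3f}%")
print(f"Or roughly 1/{round(1 / ratio)} of that spend")
```

Under those assumptions the share works out to a bit under a tenth of one percent, which is still a striking gap even before the caveats about what the $6 million figure leaves out.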
The question is, of course, how did they do it?
It is sort of ironic. To us news readers (as opposed to news makers), it'll never be fully clear exactly how well the US chip export restrictions worked, or how much of an advantage having advanced GPUs really gives.
The point is that innovation finds a way. As much as we act like we know how to foster (or squash) innovation, it still just seems to happen no matter what we try to orchestrate on a societal level… oftentimes in spite of whatever intentions we collectively set.
Viewed as competitors, these brilliant minds over in China working with DeepSeek are kind of like the weeds that inevitably poke out the sides of any weed guard you put on the ground: only the strongest ones survive, and by the time you see them they've already rooted down deep. Nature always finds a way around blanket solutions, it seems.
This is the technical report referenced, if you're interested in reading further.
Yes, but…
There is some indication that the reported $6 million figure did not include costs of prior research and access to existing infrastructure, as one X user pointed out:
1) DeepSeek r1 is real with important nuances. Most important is the fact that r1 is so much cheaper and more efficient to inference than o1, not from the $6m training figure. r1 costs 93% less to *use* than o1 per each API, can be run locally on a high end work station and… x.com/i/web/status/1…
- Gavin Baker (@GavinSBaker)
2:54 PM • Jan 27, 2025
The $6m does not include "costs associated with prior research and ablation experiments on architectures, algorithms and data" per the technical paper. "Other than that Mrs. Lincoln, how was the play?" This means that it is possible to train an r1 quality model with a $6m run if a lab has already spent hundreds of millions of dollars on prior research and has access to much larger clusters.
So basically, it's possible to train a model like R1 for $6 million so long as the prior research has already been done and paid for, and you have access to a certain, somewhat unclear level of technical infrastructure. Much less impressive than "wow these guys made ChatGPT for 0.0001% of the cost!"
Sensationalism is out of control these days.
Even so,
Beyond the cost efficiency, DeepSeek Coder performs competitively with GPT-4 Turbo in coding benchmarks.
Additionally, its models demonstrate strong long-context processing, potentially rivaling Anthropic's Claude models in handling extended input lengths.
Finally, and perhaps most importantly, DeepSeek stands apart from its American counterparts with its open-source strategy, making its models accessible to global developers while also circumventing US-led AI restrictions.
This approach not only aims to accelerate adoption but also positions DeepSeek as a leader within the space as a whole. If DeepSeek is the one determining how global builders train AI, it should end up with an edge over other companies.
Bottom Line: DeepSeek isn't just a "GPT clone" despite OpenAI's attempts to cast it as such with claims of data distillation. It's part of a broader trend of China closing the AI gap and, in many ways, setting the pace.
2️⃣ Economic: Why Did Markets React So Much?
The financial world certainly noticed DeepSeek's arrival, with notable plunges in AI-related stocks, especially NVIDIA. Here's why:
Investor Anxiety Over Excessive Costs: This was already rising back in the summer of 2024. As Big Tech continued to throw absurd amounts of money at AI development, investors began wanting to see returns. Profitability to justify those expenditures hadn't been proven (and still hasn't, really).
Pressure on NVIDIA & AI Hardware: Since AI trains on expensive GPUs, NVIDIA, as the well-positioned vendor of the fastest GPUs (previously used mainly for gaming), saw extreme growth over the last couple of years.
But if China can produce world-class AI models with reduced reliance on US chips, that's bad news for companies banking on AI infrastructure dominance.
"Access to compute" was previously seen as one of the most notable bottlenecks in the AI industry. It was thought that in order to train top-performing models, you needed to have absurd amounts of compute power, and only the very deepest of pockets could afford it. But DeepSeek's arrival shows we were wrong to make such myopic assumptions.
This guy called it back in the summer of 2023
Markets have rebounded since the initial shock, and notably, Apple has emerged as a winner thanks to their (previously ridiculed) slow and measured approach to AI.
Back in mid-2024 people were freaking out that Apple wasn't putting out competitive AI models.
Now they are freaking out that other big tech companies are shoveling billions into developing stuff the Chinese figured out how to make for (relative) pennies.
Bottom Line: Markets don't like uncertainty, and DeepSeek introduces a lot of it, especially for investors betting on US-based AI dominance.
3️⃣ Domestic Politics: Why Is the US Taking This So Personally?
After DeepSeek, Washington and US-led tech companies are suddenly looking a bit behind the curve in terms of their approach to AI development. Here's why:
Chip Export Restrictions: The US has been tightening chip restrictions in an attempt to slow China's AI progress, but DeepSeek's rise suggests that effort is failing.
Project Stargate: DeepSeek's arrival coincided with the Trump administration announcing a $500 billion AI infrastructure investment deal with OpenAI, SoftBank, MGX, and Oracle.
That number was already looking ridiculous, with Elon Musk even calling it a sham. And while there's still an argument to be made for building that much infrastructure to "future-proof" AI development, it's a lot harder to feel great about sinking that much money into something that the Chinese can basically make in their garage.
Bottom Line: The US government and US-led tech companies are probably doing some soul searching right now (if they have those) about their assumptions on how to maintain competitive advantage.
Here we were, cruising along into another four years of Trump's MAGA, maybe on some level thinking: "well, at least if the world ends from an AI takeover it will be Made-In-America"
Now suddenly we don't seem to be as in control as we thought.
This is gonna be a weird reference, but honestly this song here encapsulates the feel of this whole thing for me in the best, most awkwardly hilarious way…
We've been riding in the Cadillac for too long, smugly convinced of our relative dominance and abilities. This DeepSeek "happening" can be thought of as a blessing in disguise: the kind that sucks when it comes but leaves you much better off in the end, with an improved sense of self-awareness and the improved strategic capacity that awareness brings.
4️⃣ International Politics: DeepSeek is a Classic China Move
When I was young(er) and first became able to read/understand what countries were, I remember turning over my toys and always finding "Made In China" stamped somewhere on Superman's foot.
The obvious question was, why?
The answer isn't really that complicated.
If you assume open borders, and factor in the relatively low costs for moving products around the world associated with the modern era (established infrastructure for cargo ships, planes, etc.), then flooding the market with a massive supply of unbeatably cheap goods, over time, results in death for domestic producers of other countries.
China has:
Plenty of low-wage labor
A devalued local currency (which makes it affordable to trade with)
Loose domestic restrictions with a government actively seeking to buoy manufacturing sectors
Selling a t-shirt for $1 means (in theory) that the total cost of getting that shirt's materials, assembling it, and moving it to the buyer is < $1. In the US, operating costs are simply way too high for anyone to compete with that kind of pricing.
So eventually all manufacturing shifts to the place where it can be conducted cheapest, even for American-owned companies.
"Designed in the US, Made in China"
Rudimentarily speaking, someone got paid $20/hr to design it in Idaho, they emailed that design to Chinese producers who make 10,000 of them for $0.01 each, then they pay a shipping company $100 to put them in a big box and send it to Los Angeles, then they charge you and me $100 per unit. Don't bother doing the math, the logic is obvious.
Dominating the means of production for the cheap plastic home goods, appliances, and clothes that became ubiquitous in the 21st century was one thing. Can they find a way to dominate the means of production for technology too?
There is obviously a physical component to developing AI: the hardware. But beyond that, you need talent: highly skilled engineers who can use the hardware to create cutting-edge tech. Naturally the companies that can attract the best talent tend to dominate the technology market as a result, and countries with the highest wages + quality of life tend to be able to attract that talent consistently.
That's really what makes the DeepSeek moment so compelling. It flies in the face of this conventional wisdom ^
DeepSeek isn't a government agency, but as we all know (from the TikTok saga if nothing else), in China the line between state and private enterprise is blurry. Some factors to consider:
State-Backed AI Ecosystem: Even if DeepSeek is "private," Chinese AI firms operate under strict CCP oversight and benefit from state funding.
AI as a Geopolitical Weapon: China sees AI as central to economic and military competition with the US.
As usual, they've been busy architecting advancement by pushing adoption of AI in specific sectors and driving development.
The ultimate goal seems to be sovereignty, that is, freedom from reliance on US silicon or any other source of technical innovation. We, of all nations, can hardly blame them for seeking this.
Beyond Silicon Valley Dependence: If China's top models no longer rely on US chips or cloud services, the US loses a key leverage point in global politics.
Basically, if China (and other countries) no longer need to play nice with the US to get access to top tech models, there's that much less incentive for them to cooperate with the US in international relations.
Bottom Line: DeepSeek's success is part of a larger and longer push by China to challenge US dominance. It's not necessarily a direct government project, but it is certainly part and parcel of a national priority that will only be strengthened by its demonstrated success.
Amidst all the hot takes, one thing is for sure:
DeepSeek's arrival isn't just a blip; it's a sign that the AI landscape is becoming truly multipolar. This moment forces tough questions:
Can the US maintain AI leadership?
Do chip export restrictions work… at all?
How should global companies respond to China's AI progress?
Stay tuned: new things happen every day in this wild industry ~