The Current ⚡️

🐋 Making Sense of DeepSeek

Four ways to examine the DeepSeek hype — building a multi-disciplinary perspective

Jack Lajoie
February 02, 2025 • read time ~ 13 minutes

Wow! It’s been a while since something went this viral out of China…

Actually it’s only been about 5 years 😆🦠

Since the last two weeks’ worth of news was dominated by DeepSeek and the host of narratives swirling around its implications in various areas (technical, political, social), I thought it would be fun to pull some of the best takes I’ve seen into one place and try to weave them together into a more holistic view of the topic.

As with all things AI, reactions tend to range from “It’s the end of everything as we know it 🤯🔥🤖💀” to “meh 😑🤷🥱😴”.

As someone who reads this news every day from a variety of sources, I find it easy to tell which of those sources people pay the most attention to, based on the angle they lead with when they say stuff like: “So is this Chinese Sputnik moment keeping you up at night?”

And while it’s true that DeepSeek’s arrival sent shockwaves through the AI industry as a whole (and world markets for that matter), understanding the ways its impact will ultimately manifest demands a multi-disciplinary approach.

So here are four ways to look at the DeepSeek “moment” ↙️

1️⃣ Technical: What’s Actually New About DeepSeek?

DeepSeek’s advancements aren’t just about catching up to US-led companies; they’re about taking AI development in new directions entirely. Some key differentiators:

Training Scale & Efficiency – This is really the most important. DeepSeek's models appear HIGHLY efficient in parameter usage, competing with OpenAI and Anthropic at a fraction of their compute cost.
- By “fraction” I mean roughly 0.086% of their training spend ($6 million out of ~$7 billion)
- If we can believe their reports, training costs for DeepSeek R1 were about $6 million. Compare that to the ~$7 billion OpenAI is said to have dedicated to new model training in 2024. Of course, that’s not all going to o1 (the reasoning competitor to DeepSeek’s R1), but still…
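
For anyone who wants to sanity-check that ratio, here is a minimal Python sketch using only the two dollar figures quoted above. Both numbers are rough, reported estimates rather than audited costs, and the helper name `cost_share_percent` is made up purely for illustration.

```python
# Figures as quoted in this post; both are reported estimates, not audited costs.
R1_TRAINING_COST_USD = 6_000_000          # reported cost of the R1 training run
OPENAI_2024_TRAINING_USD = 7_000_000_000  # reported OpenAI 2024 training spend (all models)


def cost_share_percent(small: float, large: float) -> float:
    """Return `small` expressed as a percentage of `large`."""
    return small / large * 100


share = cost_share_percent(R1_TRAINING_COST_USD, OPENAI_2024_TRAINING_USD)

# Prints: 0.086% of the spend, or roughly 1/1,167
print(f"{share:.3f}% of the spend, or roughly 1/{round(100 / share):,}")
```

Keep in mind this compares one training run against a whole year of training spend across every model, so it is an apples-to-oranges ratio at best.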

The question is, of course, how did they do it?

It is sort of ironic. To us news readers vs. news makers, it’ll never be fully clear exactly how well those export restrictions worked, or how much of an advantage having advanced GPUs really gives.

The point is that innovation finds a way. As much as we act like we know how to foster (or squash) innovation, it still just seems to happen no matter what we try to orchestrate on a societal level…oftentimes in spite of whatever intentions we collectively set.

Viewed as competitors, these brilliant minds over in China working with DeepSeek are kind of like the weeds that inevitably poke out the sides of any weed guard you put on the ground — only the strongest ones survive and by the time you see them they’ve already rooted down deep. Nature always finds a way around blanket solutions, it seems.

This is the technical report referenced, if you’re interested in reading further.

Yes, but…

There is some indication that the reported $6 million figure did not include costs of prior research and access to existing infrastructure — as one X user pointed out:

1) DeepSeek r1 is real with important nuances. Most important is the fact that r1 is so much cheaper and more efficient to inference than o1, not from the $6m training figure. r1 costs 93% less to *use* than o1 per each API, can be run locally on a high end work station and… x.com/i/web/status/1…
— Gavin Baker (@GavinSBaker)
2:54 PM • Jan 27, 2025

The $6m does not include “costs associated with prior research and ablation experiments on architectures, algorithms and data” per the technical paper. “Other than that Mrs. Lincoln, how was the play?” This means that it is possible to train an r1 quality model with a $6m run if a lab has already spent hundreds of millions of dollars on prior research and has access to much larger clusters.

@GavinSBaker

So basically, it’s possible to train a model like R1 for $6 million so long as everything that’s already been built has been built and you have access to a certain, somewhat unclear level of technical infrastructure. Much less impressive than “wow these guys made ChatGPT for 0.0001% of the cost!”

Sensationalism is out of control these days.

Even so,

Beyond the cost efficiency, DeepSeek Coder performs competitively with GPT-4 Turbo in coding benchmarks.

Additionally, its models demonstrate strong long-context processing, potentially rivaling Anthropic’s Claude models in handling extended input lengths.

Finally, and perhaps most importantly, DeepSeek stands apart from its American counterparts with its open-source strategy, making its models accessible to global developers while also circumventing US-led AI restrictions.

This approach not only aims to accelerate adoption but also positions DeepSeek as a leader within the space as a whole. If DeepSeek is the one who is determining the way global builders train AI, they should end up with an edge over other companies.

Bottom Line: DeepSeek isn’t just a "GPT clone" despite OpenAI’s attempts to cast it as such with claims of data distillation. It’s part of a broader trend of China closing the AI gap—and in many ways, setting the pace.

2️⃣ Economic: Why Did Markets React So Much?

The financial world certainly noticed DeepSeek’s arrival, with notable plunges in AI-related stocks, especially NVIDIA. Here’s why:

Investor Anxiety Over Excessive Costs – this was already rising back in the summer of 2024. As Big Tech continued to throw absurd amounts of money at AI development, investors began wanting to see returns. Profitability to justify those expenditures hadn’t been proven (and still isn’t, really).
Pressure on NVIDIA & AI Hardware – since AI trains on expensive GPUs, NVIDIA, as the well-positioned vendor of the fastest GPUs (previously used for gaming), saw extreme growth over the last couple of years.
- But if China can produce world-class AI models with reduced reliance on US chips, that’s bad news for companies banking on AI infrastructure dominance.

“Access to compute” was previously seen as one of the most notable bottlenecks in the AI industry. It was thought that in order to train top performing models, you needed to have absurd amounts of compute power, and only the very deepest of pockets could afford it. But DeepSeek’s arrival shows we were wrong to make such myopic assumptions.

This guy called it back in the summer of 2023 ↙️

Compute is Overrated as AI’s Bottleneck

You can’t just blindly extrapolate compute requirements

weightythoughts.com/p/compute-is-overrated-as-ais-bottleneck

Markets have rebounded since the initial shock, and notably, Apple has emerged as a winner thanks to their (previously ridiculed) slow and measured approach to AI.

Back in mid 2024 people were freaking out that Apple wasn’t putting out competitive AI models.
Now they are freaking out that other big tech companies are shoveling billions into developing stuff the Chinese figured out how to make for (relative) pennies.

Bottom Line: Markets don’t like uncertainty, and DeepSeek introduces a lot of it—especially for investors betting on US-based AI dominance.

3️⃣ Domestic Politics: Why Is the US Taking This So Personal?

After DeepSeek, Washington and US-led tech companies are suddenly looking a bit behind the curve in terms of their approach to AI development. Here’s why:

Chip Export Restrictions – The US has been tightening chip restrictions in an attempt to slow China’s AI progress, but DeepSeek’s rise suggests that effort is failing.
Project Stargate — DeepSeek’s arrival coincided with the Trump administration announcing a $500 billion AI infrastructure investment deal with OpenAI, SoftBank, MGX, and Oracle.
- That number was already looking ridiculous, with Elon Musk even calling it a sham. And while there’s still an argument to be made for building that much infrastructure to “future-proof” AI development, it’s a lot harder to feel great about sinking that much money into something that the Chinese can basically make in their garage.

Bottom Line: The US government and US-led tech companies are probably doing some soul searching right now (if they have those) about their assumptions on how to maintain competitive advantage.

Here we were, cruising along into another four years of Trump’s MAGA, maybe on some level thinking: “well at least if the world ends from an AI takeover it will be Made-In-America” 🇺🇸🤣😩😳

Now suddenly we don’t seem to be as in control as we thought.

This is gonna be a weird reference, but honestly this song here encapsulates the feel of this whole thing for me in the best, most awkwardly hilarious way…

We’ve been riding in the Cadillac for too long, smugly convinced of our relative dominance and abilities. This DeepSeek “happening” can be thought of as a blessing in disguise: it sucks when it comes, but it leaves you much better off in the end, with an improved sense of self-awareness and the improved strategic capacity that awareness brings.

4️⃣ International Politics: DeepSeek is a Classic China Move

When I was young(er) and first became able to read/understand what countries were, I remember turning over my toys and always finding somewhere stamped on Superman’s foot “Made In China”.

The obvious question was, why?

Why China Is "The World’s Factory"

Some may think the ubiquity of Chinese products is due to the abundance of cheap labor that reduces production costs, but there is much more to it.

www.investopedia.com/articles/investing/102214/why-china-worlds-factory.asp

The answer isn’t really that complicated.

If you assume open borders, and factor in the relatively low costs for moving products around the world associated with the modern era (established infrastructure for cargo ships, planes, etc.), then flooding the market with a massive supply of unbeatably cheap goods, over time, results in death for domestic producers of other countries.

China has:

Plenty of low-wage labor
A devalued local currency (which makes it affordable to trade with)
Loose domestic restrictions with a government actively seeking to buoy manufacturing sectors

Selling a t-shirt for $1 means (in theory) that the total cost of getting that shirt’s materials, assembling it, and moving it to the buyer is < $1. In the US, operating costs are simply way too high for anyone to compete with that kind of pricing.

So eventually all manufacturing shifts to the place where it can be conducted cheapest, even from American owned companies.

“Designed in the US, Made in China”

Rudimentarily speaking, someone got paid $20/hr to design it in Idaho, they emailed that design to Chinese producers who make 10,000 of them for $0.01 each, then they pay a shipping company $100 to put them in a big box and send to Los Angeles, then they charge me and you $100 per unit. Don’t bother doing the math, the logic is obvious.
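
If you do want to bother with the math, here is a rough Python sketch of that example. The prices come straight from the paragraph above. The number of design hours is never stated, so `DESIGN_HOURS = 40` is purely an assumption, and every name here is hypothetical. It also assumes all 10,000 units actually sell.

```python
# All money is tracked in integer cents to avoid floating-point rounding.
UNITS = 10_000
UNIT_PRODUCTION_CENTS = 1        # $0.01 per unit to manufacture
SHIPPING_CENTS = 100 * 100       # $100 for the whole shipment
RETAIL_PRICE_CENTS = 100 * 100   # $100 charged per unit
DESIGN_RATE_CENTS = 20 * 100     # $20/hr for the designer
DESIGN_HOURS = 40                # ASSUMPTION: not stated in the post


def total_cost_cents() -> int:
    """Design + production + shipping for the whole batch."""
    design = DESIGN_RATE_CENTS * DESIGN_HOURS
    production = UNIT_PRODUCTION_CENTS * UNITS
    return design + production + SHIPPING_CENTS


def revenue_cents() -> int:
    """Revenue if every unit sells at the retail price."""
    return RETAIL_PRICE_CENTS * UNITS


cost = total_cost_cents()
revenue = revenue_cents()

print(f"Total cost:    ${cost / 100:,.2f}")          # $1,000.00
print(f"Cost per unit: ${cost / UNITS / 100:.2f}")   # $0.10
print(f"Revenue:       ${revenue / 100:,.2f}")       # $1,000,000.00
print(f"Revenue is {revenue // cost:,}x the cost")   # 1,000x
```

Change `DESIGN_HOURS` to anything plausible and the conclusion barely moves, which is the whole point of the example.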

Dominating the means of production for the cheap plastic-made home goods, appliances, and clothes that became ubiquitous in the 21st century was one thing — can they find a way to dominate the means of production for technology too?

There is obviously a physical component to developing AI — the hardware. But beyond that, you need talent. Highly skilled engineers that can utilize the hardware for creating cutting edge tech. Naturally the companies that can attract the best talent tend to dominate the technology market as a result, and countries with the highest wages + quality of life tend to be able to attract that talent consistently.

That’s really what makes the DeepSeek moment so compelling. It flies in the face of this conventional wisdom ^

DeepSeek isn’t a government agency,

But as we all know (from the TikTok saga if nothing else), in China, the line between state and private enterprise is blurry. Some factors to consider:

State-Backed AI Ecosystem – Even if DeepSeek is “private,” Chinese AI firms operate under strict CCP oversight and benefit from state funding.
AI as a Geopolitical Weapon – China sees AI as central to economic and military competition with the US.
- As usual, they’ve been busy architecting advancement by pushing adoption of AI in specific sectors and driving development.
- The ultimate goal seems to be sovereignty — that is, freedom from reliance on US silicon or any other source of technical innovation. We, of all nations, can hardly blame them for seeking this.
Beyond Silicon Valley Dependence – If China’s top models no longer rely on US chips or cloud services, the US loses a key leverage point in global politics.
- Basically, if China and other countries no longer need to play nice with the US to get access to top tech models, they have that much less incentive to cooperate with the US in international relations.

Bottom Line: DeepSeek’s success is part of a larger and longer push by China to challenge US dominance. It’s not necessarily a direct government project, but it is certainly part and parcel of a national priority that will only be strengthened by its demonstrated success.

Amidst all the hot takes, one thing is for sure—

DeepSeek’s arrival isn’t just a blip—it’s a sign that the AI landscape is becoming truly multipolar. This moment forces tough questions:

Can the US maintain AI leadership?
Do chip export restrictions work…at all?
How should global companies respond to China’s AI progress?

Stay tuned — new things happen every day in this wild industry ~