The Current ⚡️
DeepSeek Has Shaken The “Bigger Is Better” Presumption for AI Building To The Core

Also, Qwen follows R1 release with advanced vision models

Jack Lajoie
January 28, 2025 • read time ~ 7 minutes

⚡️ Headlines

🤖 AI

Five Things Most People Don't Seem to Understand About DeepSeek – Gary Marcus lays it out for us. [Marcus on AI].

DeepSeek Hit with Large-Scale Cyberattack, Says It's Limiting Registrations – Chinese AI startup DeepSeek reports significant cyberattacks disrupting new user registrations, though existing users remain unaffected. [CNBC].

China's DeepSeek Sets Off AI Market Rout – The launch of China's DeepSeek AI assistant leads to a global tech stock selloff, with Nvidia experiencing a record market-cap loss. [Reuters].

DeepSeek vs. ChatGPT: Hands-On with DeepSeek's R1 Chatbot – A comparison between DeepSeek's R1 chatbot and OpenAI's ChatGPT reveals DeepSeek's innovative training methods and potential market disruption, despite common AI challenges. [Wired].

Startup Perplexity Offers Uncensored DeepSeek AI Search – Perplexity adds an uncensored version of DeepSeek's model to its AI search engine, aiming to transform user information access. [The Information].

Building Toward a Smarter, More Personalized Assistant – Meta announces advancements in its AI assistant, focusing on personalized experiences by remembering user preferences and providing tailored recommendations. [Meta].

🤳 Social Media

TikTok Launches 2025 Marketing Calendar – TikTok releases its 2025 marketing calendar to help brands optimize their seasonal campaigns and maximize engagement on the platform. [Social Media Today].

⚖ Legal

Trump Vows Near-Future Tariffs, Calls DeepSeek Progress 'Good' – President Trump announces plans for imminent tariffs on imported computer chips and pharmaceuticals, while acknowledging China's DeepSeek AI as a positive development. [Bloomberg].

🎱 Random

How SoftBank's Son Made It Back to the White House – SoftBank CEO Masayoshi Son plans a $40 billion investment in the Stargate data center project, marking a significant return to U.S. tech initiatives. [The Information].

The Americans Pledging to Buy Less—or Even Nothing – Amid rising prices and household debt, a growing number of Americans commit to minimal or no new purchases, focusing on financial responsibility and debt reduction. [The Wall Street Journal].

🔌 Plug-Into-This

DeepSeek Erodes AI Industry's "Scale is Everything" Belief

DeepSeek, a Chinese AI startup, has released a groundbreaking AI model, R1, that performs comparably to OpenAI's cutting-edge technologies while using only a fraction of the computational and financial resources. This development challenges the long-standing industry belief that larger models, backed by massive scale, are the key to progress in AI.

Paradigm Shift: The success of DeepSeek's R1 model disrupts the notion promoted by industry leaders like Sam Altman of OpenAI that AI performance improves predictably with scale, such as by adding more computational power, data, and infrastructure.
Resource Efficiency: Unlike U.S. AI giants investing billions in GPUs and data centers, DeepSeek achieved its advancements on a relatively modest budget, showcasing the potential of smarter, more efficient methodologies over sheer size.
Market Disruption: Nvidia, a major supplier of the GPUs driving the AI revolution, lost roughly $600 billion in market value following the news of DeepSeek's efficiency-driven breakthroughs.

Deepseek V3 and R1 discourse boils down to this. Shifting the curve means you build more and scale more dummies
— Dylan Patel (@dylan522p)
5:34 PM • Jan 26, 2025

❔ DeepSeek's viral moment has raised critical questions about future AI development strategies, especially around potential cost reductions, and calls into more serious question the broader environmental and economic impacts of the resource-intensive AI scaling efforts that have been the norm so far in the US.

Qwen2.5-VL: Advancing Vision-Language Integration

Qwen has unveiled Qwen2.5-VL, its latest vision-language model, marking a significant advancement from the previous Qwen2-VL. The model is available in three configurations—3B, 7B, and 72B—and can be accessed via Qwen Chat, Hugging Face, and ModelScope.

Enhanced Visual Understanding: Qwen2.5-VL excels in recognizing a wide array of objects, including flora, fauna, and various products. It also adeptly analyzes complex visual elements such as text, charts, icons, graphics, and layouts within images.
Agentic Capabilities: The model functions as a visual agent capable of reasoning and dynamically directing tools, facilitating interactions with devices like computers and smartphones.
Advanced Video Comprehension: Qwen2.5-VL can understand videos exceeding one hour in length and is equipped to pinpoint relevant segments, enhancing event detection within video content.
Precise Visual Localization: The model accurately identifies objects within images, generating bounding boxes or points, and provides stable JSON outputs detailing coordinates and attributes.
Structured Data Output: For documents such as invoices, forms, and tables, Qwen2.5-VL supports structured content extraction, benefiting applications in finance and commerce.
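
For readers who want to picture what consuming those "stable JSON outputs" might look like, here is a minimal sketch. The reply string is made up for illustration, and the field names `bbox_2d` and `label` are assumptions rather than a confirmed schema; check the model's own documentation before relying on them.

```python
import json

# Hypothetical model reply. The field names "bbox_2d" and "label" and all
# coordinate values are assumptions made up for this illustration.
raw_reply = """
[
  {"bbox_2d": [40, 60, 200, 220], "label": "invoice total"},
  {"bbox_2d": [10, 10, 90, 50], "label": "logo"}
]
"""


def parse_boxes(reply: str) -> list[dict]:
    """Parse a JSON list of detections and add width, height, and area."""
    boxes = []
    for item in json.loads(reply):
        # Assumed corner order: left, top, right, bottom (in pixels).
        x1, y1, x2, y2 = item["bbox_2d"]
        if x2 <= x1 or y2 <= y1:
            raise ValueError(f"degenerate box for {item['label']!r}")
        width, height = x2 - x1, y2 - y1
        boxes.append(
            {"label": item["label"], "width": width, "height": height, "area": width * height}
        )
    return boxes


boxes = parse_boxes(raw_reply)
for box in boxes:
    print(f"{box['label']}: {box['width']}x{box['height']} px")
```

Rejecting degenerate boxes up front is a design choice of this sketch: it keeps a malformed reply from silently flowing into downstream steps such as cropping.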

The burst of DeepSeek V3 has attracted attention from the whole AI community to large-scale MoE models. Concurrently, we have been building Qwen2.5-Max, a large MoE LLM pretrained on massive data and post-trained with curated SFT and RLHF recipes. It achieves competitive… x.com/i/web/status/1…
— Qwen (@Alibaba_Qwen)
3:31 PM • Jan 28, 2025

👁️ Qwen2.5-VL's release demonstrates rapid progress in vision-language models, highlighting the integration of comprehensive visual understanding with dynamic tool interaction.

DeepSeek Unveils Janus Pro: A New Image Model Family

DeepSeek has introduced Janus Pro, a new family of multimodal AI models designed to generate images from textual descriptions. The company asserts that Janus Pro outperforms existing models like OpenAI's DALL-E 3.

Model Variants: Janus Pro is available in two configurations: 1B and 7B parameters, catering to different performance and resource requirements.
Enhanced Image Generation: The models are trained to produce high-quality images based on textual prompts, aiming to improve upon the capabilities of current text-to-image generators.
Accessibility: DeepSeek has made Janus Pro models accessible to developers and researchers, promoting integration into various applications and further innovation in the field.

NEW: DeepSeek Janus Pro 1B (Generate Images, Chat with PDF) running in your browser, 100% local, powered by WebGPU 🔥
Zero server costs, brought to you by transformers.js - try it out!
— Vaibhav (VB) Srivastav (@reach_vb)
7:32 AM • Jan 28, 2025

🔥 By following up last week's viral moment with this additional launch, DeepSeek is cranking up the heat even more on American Big Tech firms. Is there anything this company can't do? When is the bottom going to drop out of this hype, and on whom does it drop?

🆕 Updates

🎥Introducing Hailuo T2V-01-Director Model: Control Your Camera Like a Pro!
📷 Direct with natural language or simple commands.
🔄 Combine movements for flawless, cinematic transitions.
✨ What’s New:
- Reduced randomness in movements.
- Enhanced control accuracy.
-… x.com/i/web/status/1…
— Hailuo AI (MiniMax) (@Hailuo_AI)
9:47 AM • Jan 28, 2025

📽️ Daily Demo

DeepSeek R1 + FactSet = Financial Research on Steroids
— Aravind Srinivas (@AravSrinivas)
8:25 PM • Jan 28, 2025

Check out this real-time screen recording that demonstrates voice interaction with low latency and interruptions using Gemini 2.0. Try it using a simple prompt and create your own game 👾 → goo.gle/42z1HeK
— Google AI Developers (@googleaidevs)
6:00 PM • Jan 28, 2025

🗣️ Discourse

The DeepSeek-R1 paper is a gem!
Highly encourage everyone to read it.
It's clear that LLM reasoning capabilities can be learned in different ways.
RL, if applied correctly and at scale, can lead to some really powerful and interesting scaling and emergent properties.
There… x.com/i/web/status/1…
— elvis (@omarsar0)
11:10 PM • Jan 20, 2025

DeepSeek R1 has really changed the AI LLM game.
People are creating wild use cases beyond ChatGPT. There's a major shift.
10 examples:
— Min Choi (@minchoi)
2:59 PM • Jan 28, 2025

Market close: $NVDA: -16.91% | $AAPL: +3.21%
Why is DeepSeek great for Apple?
Here's a breakdown of the chips that can run DeepSeek V3 and R1 on the market now:
NVIDIA H100: 80GB @ 3TB/s, $25,000, $312.50 per GB
AMD MI300X: 192GB @ 5.3TB/s, $20,000, $104.17 per GB
Apple M2… x.com/i/web/status/1…
— Alex Cheema - e/acc (@alexocheema)
11:15 PM • Jan 27, 2025
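
The per-GB figures in the post above come from dividing each chip's quoted price by its memory capacity. A short sketch of that arithmetic follows, using only the two chips the excerpt lists in full; the variable and function names are our own, and the prices are the ones quoted in the post, not verified list prices.

```python
# (name, memory in GB, price in USD) as quoted in the post above.
chips = [
    ("NVIDIA H100", 80, 25_000),
    ("AMD MI300X", 192, 20_000),
]


def price_per_gb(price_usd: float, memory_gb: float) -> float:
    """Return the price per GB of memory, rounded to cents."""
    return round(price_usd / memory_gb, 2)


per_gb = {name: price_per_gb(price, memory) for name, memory, price in chips}

for name, value in per_gb.items():
    print(f"{name}: ${value:.2f} per GB")
# NVIDIA H100: $312.50 per GB
# AMD MI300X: $104.17 per GB
```

On these numbers, the MI300X costs about a third as much per GB of memory as the H100, which is the comparison the post builds on.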

🐋 DeepSeek 🤝 LangChain 🦜
DeepSeek has taken the community by storm since open-sourcing R1, a powerful model that reasons like OpenAI's o1 while also exposing its thought process!
LangChain offers many ways to use it in your projects alongside the faster deepseek-V3 in Python… x.com/i/web/status/1…
— LangChain (@LangChainAI)
5:15 PM • Jan 28, 2025