UCSD Researchers Show LLMs Passing the Turing Test

Also, OpenAI establishes commission to guide non-profit

⚡️ Headlines

🤖 AI

Google in Advanced Talks to Rent Nvidia AI Servers from CoreWeave - Google is negotiating a major deal to rent high-end Nvidia GPUs from CoreWeave to support its AI infrastructure push amid growing cloud demand [The Information].

Grok, Musk's AI Chatbot, Is Now a Tool for X Users to 'Dunk' on Posts - Elon Musk's AI chatbot Grok is increasingly used by X users to mock or "dunk on" posts, showcasing its viral snarky personality [Business Insider].

AI Bots Strain Wikimedia as Bandwidth Surges 50% from Automated Crawlers - Wikimedia projects are experiencing bandwidth overloads from AI crawlers, prompting resource limits and ethical data use concerns [Ars Technica].

NotebookLM Now Shows Source Links to Improve Transparency in AI Summaries - Google’s NotebookLM update now includes a “Discover Sources” feature that links users directly to the references behind AI-generated summaries [9to5Google].

Amex Uses AI to Cut IT Escalations by 40%, Boost Travel Help by 85% - American Express is leveraging AI to streamline internal operations and enhance customer support, notably reducing IT issues and improving travel service [VentureBeat].

Adobe's Generative Extend Feature in Premiere Pro Now Widely Available - Adobe is rolling out Generative Extend in Premiere Pro to all users, allowing seamless AI-powered extension of video clips [The Verge].

YourBench Aims to Replace Generic AI Benchmarks with Real-World Testing - YourBench is helping enterprises better evaluate AI performance by using their actual data instead of generic benchmarks [VentureBeat].

Google Shakes Up Gemini Leadership, Labs Head Now in Charge - Google has replaced Gemini's leadership, appointing Labs chief Josh Woodward to guide its flagship AI product's next phase [Ars Technica].

CoTools Tackles AI Integration Challenges for Enterprise Workflows - CoTools is aiming to solve interoperability issues in enterprise AI by enabling seamless integration of disparate tools [VentureBeat].

🦾 Emerging Tech

Bitcoin Whales Accumulate During Dip in First Major Buy-Up in 8 Months - Large Bitcoin holders are showing renewed confidence, driving the first significant accumulation since mid-2024 [CoinDesk].

🤳 Social Media

Amazon Has Reportedly Bid to Acquire TikTok's U.S. Operations - Amazon has submitted a bid to buy TikTok's U.S. assets, signaling a bold move into social media, the New York Times reports [Reuters].

Instagram Reportedly Developing Free Standalone Video Editing App - Meta is working on "Edits," a new free video editing app from Instagram, likely aimed at creators outside the main platform [Social Media Today].

🔬 Research

OpenAI Introduces PaperBench for Evaluating AI on Research Replication - OpenAI's PaperBench is a new benchmark testing whether AI agents can replicate the results of recent machine-learning research papers, pushing LLM evaluation closer to real academic work [OpenAI].

DeepMind Proposes Framework to Assess Cybersecurity Risks from Advanced AI - DeepMind outlines a proactive approach for assessing how advanced AI systems might pose cybersecurity threats and how to mitigate them [DeepMind].

🔌 Plug-Into-This

A new study from UC San Diego shows that GPT-4.5, when adopting a crafted persona, was judged to be human 73% of the time in a rigorous three-party Turing Test, surpassing the actual human participants in believability. Interrogators were unable to reliably distinguish AI from humans in blind chat-based interactions, suggesting LLMs now cross key conversational thresholds.

  • GPT-4.5, presented as a “shy and warm” persona, outperformed real humans in being perceived as human.

  • The Turing Test setup involved side-by-side chat conversations, with participants choosing which interlocutor was human.

  • Meta’s LLaMa-3.1-405B, given the same persona prompt, was judged human 56% of the time, statistically indistinguishable from the real humans it was compared against.

  • Baseline systems lagged far behind: ELIZA was judged human 23% of the time, and GPT-4o (run without the persona prompt) just 21%.

  • The paper emphasizes that the test reflects perceived humanness, not actual intelligence or comprehension.

🗣️ People really did think GPT-4.5 was human more often than they thought real humans were. It's become that good at chatting like us, or maybe more accurately, at making us like it when we chat. Still, this is more of a symbolic milestone than a practical one, and it feels oddly anticlimactic amid the other amazing things AI is doing, or is purported to do soon.

OpenAI has launched a new commission to help steer its nonprofit initiatives toward areas of urgent social need. The advisory group will consult with nonprofits, community leaders, and public sector experts—particularly in California—to generate recommendations that shape how OpenAI deploys its resources for public benefit.

  • The commission will investigate critical pain points in health, education, science, and public services where AI can provide meaningful support.

  • Its members, to be named in April, will have 90 days to produce a formal set of insights and proposals.

  • Findings will be reviewed by OpenAI’s Board and may inform how the nonprofit arm evolves through 2025.

  • The effort focuses on listening to underrepresented voices and grounding tech interventions in community-identified needs.

  • It marks a rare structural effort by a leading AI lab to embed social listening directly into its philanthropic agenda.

🧭 This initiative positions OpenAI to broaden its social footprint beyond frontier AI safety, signaling a turn toward impact-focused deployment—but its effectiveness will hinge on how deeply these insights shape actual funding and tool distribution.

Anthropic has launched Claude for Education, a dedicated initiative to integrate its AI models into schools and universities. The program provides custom tools for students, educators, and researchers—prioritizing accuracy, transparency, and thoughtful guardrails for academic use.

  • Claude is now freely available to U.S. educators through the nonprofit AI for Education platform, with tailored features for classroom contexts.

  • Educators can generate lesson plans, feedback rubrics, and reading comprehension prompts using Claude.

  • Students are guided to use Claude responsibly for brainstorming, tutoring, and writing assistance, with embedded guidance on academic integrity.

  • The platform includes transparency tools to show how Claude arrives at answers—supporting critical thinking and AI literacy.

  • Anthropic is partnering with researchers to study effective, safe AI usage in learning environments.

  • In simple terms: Anthropic is putting AI in schools—but with guardrails, training wheels, and a handbook for using it wisely.

🎓 Claude’s education push reflects a broader trend: AI companies vying to shape not just the future of learning, but the norms and ethics around how students engage with intelligent systems.

🆕 Updates

📽️ Daily Demo

🗣️ Discourse