UCSD Researchers Show LLMs Passing the Turing Test

Also, OpenAI establishes commission to guide non-profit

⚡️ Headlines

🤖 AI

Google in Advanced Talks to Rent Nvidia AI Servers from CoreWeave - Google is negotiating a major deal to rent high-end Nvidia GPUs from CoreWeave to support its AI infrastructure push amid growing cloud demand [The Information].

Grok, Musk's AI Chatbot, Is Now a Tool for X Users to 'Dunk' on Posts - Elon Musk's AI chatbot Grok is increasingly used by X users to mock or "dunk on" posts, showcasing its viral snarky personality [Business Insider].

AI Bots Strain Wikimedia as Bandwidth Surges 50% from Automated Crawlers - Wikimedia projects are experiencing bandwidth overloads from AI crawlers, prompting resource limits and ethical data use concerns [Ars Technica].

NotebookLM Now Shows Source Links to Improve Transparency in AI Summaries - Google’s NotebookLM update now includes a “Discover Sources” feature that links users directly to the references behind AI-generated summaries [9to5Google].

Amex Uses AI to Cut IT Escalations by 40%, Boost Travel Help by 85% - American Express is leveraging AI to streamline internal operations and enhance customer support, notably reducing IT issues and improving travel service [VentureBeat].

Adobe's Generative Extend Feature in Premiere Pro Now Widely Available - Adobe is rolling out Generative Extend in Premiere Pro to all users, allowing seamless AI-powered extension of video clips [The Verge].

YourBench Aims to Replace Generic AI Benchmarks with Real-World Testing - YourBench is helping enterprises better evaluate AI performance by using their actual data instead of generic benchmarks [VentureBeat].

Google Shakes Up Gemini Leadership, Labs Head Now in Charge - Google has replaced Gemini's leadership, appointing Labs chief Josh Woodward to guide its flagship AI product's next phase [Ars Technica].

CoTools Tackles AI Integration Challenges for Enterprise Workflows - CoTools is aiming to solve interoperability issues in enterprise AI by enabling seamless integration of disparate tools [VentureBeat].

🦾 Emerging Tech

Bitcoin Whales Accumulate During Dip in First Major Buy-Up in 8 Months - Large Bitcoin holders are showing renewed confidence, driving the first significant accumulation since mid-2024 [CoinDesk].

🤳 Social Media

Amazon Has Reportedly Bid to Acquire TikTok's U.S. Operations - Amazon has submitted a bid to buy TikTok's U.S. assets, signaling a bold move into social media, the New York Times reports [Reuters].

Instagram Reportedly Developing Free Standalone Video Editing App - Meta is working on "Edits," a new free video editing app from Instagram, likely aimed at creators outside the main platform [Social Media Today].

🔬 Research

OpenAI Introduces PaperBench for Evaluating AI on Research Replication - OpenAI's PaperBench is a new benchmark testing whether AI agents can replicate the results of recent machine-learning research papers, pushing LLM evaluation closer to real academic work [OpenAI].

DeepMind Proposes Framework to Assess Cybersecurity Risks from Advanced AI - DeepMind outlines a proactive approach for assessing how advanced AI systems might pose cybersecurity threats and how to mitigate them [DeepMind].

🔌 Plug-Into-This

A new study from UC San Diego shows that GPT-4.5, when adopting a crafted persona, was judged to be human 73% of the time in a rigorous three-party Turing Test, surpassing the actual human participants in believability. Interrogators were unable to reliably distinguish AI from humans in blind chat-based interactions, suggesting LLMs now cross key conversational thresholds.

  • GPT-4.5, presented as a “shy and warm” persona, outperformed real humans in being perceived as human.

  • The Turing Test setup involved side-by-side chat conversations, with participants choosing which interlocutor was human.

  • Meta’s LLaMa-3.1-405B, given the same persona prompt, was judged human 56% of the time, statistically indistinguishable from the real humans it was compared against.

  • Baseline systems lagged far behind: ELIZA was judged human 23% of the time, and GPT-4o (run without the persona prompt) just 21%.

  • The paper emphasizes that the test reflects perceived humanness, not actual intelligence or comprehension.

🗣️ People really did think GPT-4.5 was human more often than they thought real humans were. It's become that good at chatting like us, or maybe more accurately, at making us like it when we chat. Still, this is more of a symbolic milestone than a practical one, and it feels oddly anticlimactic amid the other amazing things AI is doing, or is purported to do soon.

OpenAI has launched a new commission to help steer its nonprofit initiatives toward areas of urgent social need. The advisory group will consult with nonprofits, community leaders, and public sector experts—particularly in California—to generate recommendations that shape how OpenAI deploys its resources for public benefit.

  • The commission will investigate critical pain points in health, education, science, and public services where AI can provide meaningful support.

  • Its members, to be named in April, will have 90 days to produce a formal set of insights and proposals.

  • Findings will be reviewed by OpenAI’s Board and may inform how the nonprofit arm evolves through 2025.

  • The effort focuses on listening to underrepresented voices and grounding tech interventions in community-identified needs.

  • It marks a rare structural effort by a leading AI lab to embed social listening directly into its philanthropic agenda.

🧭 This initiative positions OpenAI to broaden its social footprint beyond frontier AI safety, signaling a turn toward impact-focused deployment—but its effectiveness will hinge on how deeply these insights shape actual funding and tool distribution.

Anthropic has launched Claude for Education, a dedicated initiative to integrate its AI models into schools and universities. The program provides custom tools for students, educators, and researchers—prioritizing accuracy, transparency, and thoughtful guardrails for academic use.

  • Claude is now freely available to U.S. educators through the nonprofit AI for Education platform, with tailored features for classroom contexts.

  • Educators can generate lesson plans, feedback rubrics, and reading comprehension prompts using Claude.

  • Students are guided to use Claude responsibly for brainstorming, tutoring, and writing assistance, with embedded guidance on academic integrity.

  • The platform includes transparency tools to show how Claude arrives at answers—supporting critical thinking and AI literacy.

  • Anthropic is partnering with researchers to study effective, safe AI usage in learning environments.

  • In simple terms: Anthropic is putting AI in schools—but with guardrails, training wheels, and a handbook for using it wisely.

🎓 Claude’s education push reflects a broader trend: AI companies vying to shape not just the future of learning, but the norms and ethics around how students engage with intelligent systems.

🆕 Updates

📽️ Daily Demo

🗣️ Discourse