The AI Hype Index: AI-powered toys are coming
Separating AI reality from hyped-up fiction isn’t always easy. That’s why we’ve created the AI Hype Index—a simple, at-a-glance summary of everything you need to know about the state of the industry. AI agents might be the toast of the AI industry, but they’re still not that reliable. That’s why Yoshua Bengio, one of the…
Can we fix AI’s evaluation crisis?
As a tech reporter I often get asked questions like “Is DeepSeek actually better than ChatGPT?” or “Is the Anthropic model any good?” If I don’t feel like turning it into an hour-long seminar, I’ll usually give the diplomatic answer: “They’re both solid in different ways.” Most people asking aren’t defining “good” in any precise…
A Chinese firm has just launched a constantly changing set of AI benchmarks
When testing an AI model, it’s hard to tell if it is reasoning or just regurgitating answers from its training data. Xbench, a new benchmark developed by the Chinese venture capital firm HSG, or Hongshan Capital Group, might help to sidestep that issue. That’s thanks to the way it evaluates models not only on the…
It’s pretty easy to get DeepSeek to talk dirty
AI companions like Replika are designed to engage in intimate exchanges, but people use general-purpose chatbots for sex talk too, despite their stricter content moderation policies. Now new research shows that not all chatbots are equally willing to talk dirty: DeepSeek is the easiest to convince. But other AI chatbots can be enticed too, if…
OpenAI can rehabilitate AI models that develop a “bad boy persona”
A new paper from OpenAI released today has shown why a little bit of bad training can make AI models go rogue but also demonstrates that this problem is generally pretty easy to fix. Back in February, a group of researchers discovered that fine-tuning an AI model (in their case, OpenAI’s GPT-4o) by training it…
Why AI hardware needs to be open
When OpenAI acquired Io to create “the coolest piece of tech that the world will have ever seen,” it confirmed what industry experts have long been saying: Hardware is the new frontier for AI. AI will no longer just be an abstract thing in the cloud far away. It’s coming for our homes, our rooms,…
AI copyright anxiety will hold back creativity
Last fall, while attending a board meeting in Amsterdam, I had a few free hours and made an impromptu visit to the Van Gogh Museum. I often steal time for visits like this—a perk of global business travel for which I am grateful. Wandering the galleries, I found myself before The Courtesan (after Eisen), painted…
When AIs bargain, a less advanced agent could cost you
The race to build ever larger AI models is slowing down. The industry’s focus is shifting toward agents—systems that can act autonomously, make decisions, and negotiate on users’ behalf. These AI agents are already being deployed in customer service and programming—and, increasingly, in e-commerce and personal finance. But what would happen if both a customer…
Powering next-gen services with AI in regulated industries
Businesses in highly regulated industries like financial services, insurance, pharmaceuticals, and health care are increasingly turning to AI-powered tools to streamline complex and sensitive tasks. Conversational AI-driven interfaces are helping hospitals to track the location and delivery of a patient’s time-sensitive cancer drugs. Generative AI chatbots are helping insurance customers answer questions and solve problems. And agentic…
Are we ready to hand AI agents the keys?
On May 6, 2010, at 2:32 p.m. Eastern time, nearly a trillion dollars evaporated from the US stock market within 20 minutes—at the time, the fastest decline in history. Then, almost as suddenly, the market rebounded. After months of investigation, regulators attributed much of the responsibility for this “flash crash” to high-frequency trading algorithms, which…