Why humanoid robots need their own safety rules
Last year, a humanoid warehouse robot named Digit set to work handling boxes of Spanx. Digit can lift boxes up to 16 kilograms between trolleys and conveyor belts, taking over some of the heavier work for its human colleagues. It works in a restricted, defined area, separated from human workers by physical panels or laser…
Read More
  
       
            Inside Amsterdam’s high-stakes experiment to create fair welfare AI
This story is a partnership between MIT Technology Review, Lighthouse Reports, and Trouw, and was supported by the Pulitzer Center. Two futures Hans de Zwart, a gym teacher turned digital rights advocate, says that when he saw Amsterdam’s plan to have an algorithm evaluate every welfare applicant in the city for potential fraud, he nearly…
Read More
  
       
            The Pentagon is gutting the team that tests AI and weapons systems
The Trump administration’s chainsaw approach to federal spending lives on, even as Elon Musk turns on the president. On May 28, Secretary of Defense Pete Hegseth announced he’d be gutting a key office at the Department of Defense responsible for testing and evaluating the safety of weapons and AI systems. As part of a string…
Read More
  
       
            Manus has kick-started an AI agent boom in China
Last year, China saw a boom in foundation models, the do-everything large language models that underpin the AI revolution. This year, the focus has shifted to AI agents—systems that are less about responding to users’ queries and more about autonomously accomplishing things for them. There are now a host of Chinese startups building these general-purpose…
Read More
  
       
            What’s next for AI and math
MIT Technology Review’s What’s Next series looks across industries, trends, and technologies to give you a first look at the future. You can read the rest of them here. The way DARPA tells it, math is stuck in the past. In April, the US Defense Advanced Research Projects Agency kicked off a new initiative called expMath—short…
Read More
  
       
            Inside the tedious effort to tally AI’s energy appetite
After working on it for months, my colleague Casey Crownhart and I finally saw our story on AI’s energy and emissions burden go live last week. The initial goal sounded simple: Calculate how much energy is used each time we interact with a chatbot, and then tally that up to understand why everyone from leaders…
Read More
  
       
            Fueling seamless AI at scale
From large language models (LLMs) to reasoning agents, today’s AI tools bring unprecedented computational demands. Trillion-parameter models, workloads running on-device, and swarms of agents collaborating to complete tasks all require a new paradigm of computing to become truly seamless and ubiquitous. First, technical progress in hardware and silicon design is critical to pushing the boundaries…
Read More
  
       
            This benchmark used Reddit’s AITA to test how much AI models suck up to us
Back in April, OpenAIannounced it was rolling back an update to its GPT-4o model that made ChatGPT’s responses to user queries too sycophantic. An AI model that acts in an overly agreeable and flattering way is more than just annoying. It could reinforce users’ incorrect beliefs, mislead people, and spread misinformation that can be dangerous—a…
Read More
  
       
            The AI Hype Index: College students are hooked on ChatGPT
Separating AI reality from hyped-up fiction isn’t always easy. That’s why we’ve created the AI Hype Index—a simple, at-a-glance summary of everything you need to know about the state of the industry. Large language models confidently present their responses as accurate and reliable, even when they’re neither of those things. That’s why we’ve recently seen…
Read More
  
       
            Anthropic’s new hybrid AI model can work on tasks autonomously for hours at a time
Anthropic has announced two new AI models that it claims represent a major step toward making AI agents truly useful. AI agents trained on Claude Opus 4, the company’s most powerful model to date, raise the bar for what such systems are capable of by tackling difficult tasks over extended periods of time and responding…
Read More