Column: Physical AI commercialization's safety gap

The race to commercialize physical AI and autonomous robots is running into a fundamental challenge: existing robot safety frameworks were designed for deterministic systems operating in controlled environments, not for autonomous machines making decisions in dynamic, unstructured ones.

Recent discussions among industry and research organizations have highlighted three emerging layers of risk: the limits of emergency stop functions, the absence of standardized testing benchmarks, and the growing mismatch between traditional safety models and Vision-Language-Action (VLA) robots.

Stop does not equal safe

One of the clearest examples comes from the field of motion-control safety. Traditional industrial robot standards, particularly ISO 10218, have long been built on the assumption that a stopped robot is a safe robot.

That assumption becomes less reliable with humanoid and bipedal robots. Unlike conventional industrial arms, humanoids must continuously manage balance and center of gravity. A robot standing 170 cm tall and weighing 70 kg can still pose a hazard after a stop command if it loses stability and falls.

Engineers from a German motion-control safety supplier noted that existing safe-stop calculation methods do not adequately account for fall risks in bipedal systems. Once a potential fall radius is included, safety calculations become significantly more complex, incorporating factors such as the fall zone, the human-robot approach distance, the stopping distance, the sensor detection range, and uncertainties in positioning and system state.
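To make the added complexity concrete, the factors listed above can be combined in a rough separation-distance sketch. This is not the supplier's method or a formula taken from any standard. The additive form, the field names, and every number are assumptions chosen for illustration, including the simplification that the fall radius equals the robot's 1.7 m height.

```python
from dataclasses import dataclass


@dataclass
class StopScenario:
    """Illustrative inputs in SI units; all values are assumptions."""
    human_speed: float         # m/s, person approaching the robot
    robot_speed: float         # m/s, robot moving toward the person
    reaction_time: float       # s, from detection until braking starts
    stopping_time: float       # s, from braking start until standstill
    stopping_distance: float   # m, distance the robot covers while braking
    fall_radius: float         # m, area swept if the robot topples after stopping
    sensor_uncertainty: float  # m, error in the measured human position
    robot_uncertainty: float   # m, error in the robot's own position and state


def separation_distance(s: StopScenario) -> float:
    """Distance to keep between human and robot, as a simple sum of terms."""
    total_time = s.reaction_time + s.stopping_time
    human_travel = s.human_speed * total_time
    robot_travel = s.robot_speed * s.reaction_time + s.stopping_distance
    return (human_travel + robot_travel + s.fall_radius
            + s.sensor_uncertainty + s.robot_uncertainty)


def sensor_can_cover(s: StopScenario, sensor_range: float) -> bool:
    """The sensor must see at least as far as the distance it has to enforce."""
    return separation_distance(s) <= sensor_range


# Hypothetical humanoid: fall radius set to the robot's height (assumption).
example = StopScenario(
    human_speed=1.6, robot_speed=1.0, reaction_time=0.1, stopping_time=0.4,
    stopping_distance=0.2, fall_radius=1.7,
    sensor_uncertainty=0.1, robot_uncertainty=0.05,
)
print(f"required separation: {separation_distance(example):.2f} m")
print("sensor covers it:", sensor_can_cover(example, sensor_range=4.0))
```

With these made-up numbers, the fall radius alone accounts for 1.7 m of a total of roughly 2.95 m, which is why a calculation that ignores toppling can understate the required clearance by more than half.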

Similar concerns were raised during an industry session at ICRA 2026. A major European automotive microcontroller supplier outlined five key limitations of Safe Torque Off (STO), a widely used functional safety mechanism. According to the company, STO cannot control limb motion during deceleration, prevent gravity-driven falls, coordinate safe postures across multiple joints, provide torque feedback during faults, or effectively manage cascading failures triggered by partial system breakdowns.

A German collaborative robot manufacturer participating in the same session argued that functional safety certification has become a prerequisite for deployment rather than a post-development consideration. Across the industry, the message was consistent: stopping a robot does not necessarily eliminate risk; in some cases, it introduces a different category of safety challenge.

Missing benchmarks

A second challenge is the lack of standardized testing frameworks.

Fraunhofer IPA, one of Europe's leading applied research institutes, purchased a commercial robot and tested it independently of the vendor, using a 66-point evaluation methodology it developed in-house.

The findings revealed several issues that would not have been visible in typical product demonstrations. Under moderate load, the robot's arm overheated and shut down within two minutes. Collision-force tests recorded impacts exceeding 500 newtons, well above the range covered by ISO/TS 15066 for many human-robot contact scenarios. Researchers also identified Bluetooth security vulnerabilities, undocumented data transmission to the vendor's servers, and a battery life of less than two hours.

These shortcomings were uncovered only through independent testing. Existing standards generally do not require such disclosures, while trade-show demonstrations often highlight carefully controlled success cases rather than operational limitations.

VLA challenges traditional safety models

The most difficult challenge may be the emergence of VLA systems, which are beginning to erode the assumptions underpinning conventional safety assessment methods.

Traditional safety frameworks assume that risk can be quantified by identifying hazards and calculating a risk score based on probability, exposure frequency, and potential severity. However, AI-driven systems can generate unsafe outcomes even when functioning as intended.
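A minimal sketch of this kind of scoring shows where the gap lies. The 1-to-5 scales, the multiplicative formula, and the hazard names below are assumptions for illustration only. Real risk assessment methods use their own scales and combination rules.

```python
def risk_score(probability: int, exposure: int, severity: int) -> int:
    """Combine three ordinal ratings, each from 1 (low) to 5 (high).

    The product used here is an assumed, simplified scoring rule.
    """
    ratings = {"probability": probability, "exposure": exposure, "severity": severity}
    for name, value in ratings.items():
        if not 1 <= value <= 5:
            raise ValueError(f"{name} must be between 1 and 5, got {value}")
    return probability * exposure * severity


# Hypothetical hazard register: (probability, exposure, severity).
hazards = {
    "pinch point at gripper": (2, 4, 3),
    "arm collision with operator": (3, 3, 4),
}

# Rank the identified hazards from highest to lowest score.
ranked = sorted(hazards, key=lambda name: risk_score(*hazards[name]), reverse=True)
for name in ranked:
    print(name, risk_score(*hazards[name]))
```

The method can only score hazards that someone has already listed. An unsafe action that a learned model produces in a situation nobody anticipated never enters the register, so it never receives a score.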

The automotive industry encountered a similar issue in advanced driver-assistance systems (ADAS), leading to the creation of the Safety of the Intended Functionality (SOTIF) standard, first published as ISO/PAS 21448 and since issued as ISO 21448, which addresses hazards arising from limitations in system design rather than component failures.

Yet even within relatively structured road environments, long-tail edge cases remain difficult to manage. Applying the same framework to VLA robots is even more challenging because robots must operate in far more diverse and unpredictable environments.

To address this gap, two technical approaches are gaining attention. One involves creating an independent safety supervision layer that continuously monitors the main controller. The other relies on functional-safety-certified microcontrollers embedded at the joint level, allowing safety decisions to be made locally.

Both approaches share a common principle: deterministic systems supervising non-deterministic AI systems. However, neither approach has yet achieved comprehensive certification, and both remain dependent on future standards development.
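The shared principle can be sketched as a small deterministic wrapper around an opaque policy. This is an illustrative assumption rather than any vendor's architecture: the names `SafetyLimits` and `supervise`, the observation key `human_distance`, and the limit values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence


@dataclass(frozen=True)
class SafetyLimits:
    """Fixed limits checked on every control cycle (assumed values)."""
    max_joint_speed: float     # rad/s, absolute bound on each joint command
    min_human_distance: float  # m, below this the policy output is rejected


# The policy is treated as a black box: observation in, joint speeds out.
Policy = Callable[[Dict[str, float]], Sequence[float]]


def supervise(policy: Policy, observation: Dict[str, float],
              limits: SafetyLimits) -> List[float]:
    """Return a command that always respects the fixed limits."""
    command = policy(observation)
    if observation["human_distance"] < limits.min_human_distance:
        # Placeholder fallback: zero joint speed. For a biped, the fallback
        # would need to be a balanced posture, not a bare stop.
        return [0.0] * len(command)
    # Otherwise clamp each joint speed into the permitted range.
    return [max(-limits.max_joint_speed, min(limits.max_joint_speed, v))
            for v in command]


def aggressive_policy(observation: Dict[str, float]) -> List[float]:
    """Stand-in for a learned model that ignores the limits."""
    return [5.0, -5.0, 0.5]


limits = SafetyLimits(max_joint_speed=1.0, min_human_distance=0.5)
print(supervise(aggressive_policy, {"human_distance": 2.0}, limits))
print(supervise(aggressive_policy, {"human_distance": 0.2}, limits))
```

The supervisor never needs to understand why the policy chose a command. It only checks the output against rules that can be reviewed and tested exhaustively, which is the property certification schemes tend to require.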

Regulation may shape market winners

The debate extends beyond engineering into questions of liability and regulation.

At the Humanoids Summit, former US Consumer Product Safety Commission chairman Elliot Kaye posed a question that remains unresolved: if a robot injures a worker, who bears responsibility — the manufacturer, the operator, or the developer of the AI model controlling the machine?

The autonomous driving sector provides a cautionary example. Following a high-profile pedestrian accident involving a Cruise self-driving vehicle in San Francisco in 2023, the company was forced to suspend operations, demonstrating how a single safety incident can trigger significant regulatory and commercial consequences.

For robotics companies, legal readiness may become as important as technical capability. Kaye argued that the eventual winners in the deployment race may not be those with the most impressive demonstrations, but those able to satisfy customer legal and compliance requirements most quickly.

The industry has seen a similar dynamic before. In industrial robotics, many of the organizations involved in drafting ISO 10218 were established companies with extensive deployment histories and operational data, helping shape the standards that later defined market access.

Equivalent functional safety standards for autonomous and humanoid robots have yet to emerge, but many industry participants expect key frameworks to take shape between 2028 and 2030. While today's competition focuses on building increasingly capable robots, the next phase may center on a different question: who can prove most convincingly — and most quickly — that those robots are safe enough to deploy.

Article translated by Jingyue Hsiao and edited by Jerry Chen