
Soul App Launches Revolutionary Full-Duplex Voice Model: AI Gains Autonomous Speech Timing Control

PRNewswire Friday 11 July 2025

SHANGHAI, July 10, 2025 /PRNewswire/ -- As AI deeply integrates into human life and reshapes connectivity, what foundational capabilities are needed to enhance interactive experiences in social scenarios?

Soul App recently upgraded its self-developed, end-to-end full-duplex voice call model. Redefining "full-duplex", the new model abandons traditional concepts such as "VAD" (Voice Activity Detection, commonly used to detect where speech starts and ends) and "latency", breaking away from the industry-standard "turn-by-turn" interaction pattern.

Instead, it lets the AI decide for itself when to speak: proactively breaking silence, interrupting users where appropriate, listening while speaking, perceiving time semantics, holding parallel discussions, and more. The model also supports multi-dimensional perception (including time, environment, and event awareness) and natural speech features (e.g., fillers, stammering, noticeable emotional fluctuations), making AI interactions more "human-like" and delivering an immersive, lifelike voice experience.


The upgraded full-duplex model will soon enter beta testing on Soul and will later be deployed in one-on-one interactive scenarios like digital human calls and AI matchmaking. Soul's team is also exploring its application in group settings, enabling AI to join conversations at the right moment, extend topics, and foster diverse relationship networks.

Tao Ming, CTO of Soul App, stated, "Social interaction is an exchange of emotional and informational value. Soul remains committed to leveraging innovative technology and product solutions to deliver smarter, more immersive, and higher-quality interactive experiences, making loneliness go away for all."

Full-Duplex Voice Call: Redefining AI Social Interaction

Previously, technical limitations confined human-AI dialogue largely to a "Q&A" format (user asks → AI responds), in which latency or interruptions disrupted immersion.

In 2024, Soul launched its self-developed, end-to-end full-duplex voice model, featuring ultra-low latency, rapid auto-interruption, hyper-realistic voice expression, and emotional perception. It could directly interpret complex auditory inputs and support highly anthropomorphic multilingual styles. To bring conversations closer to everyday speech and deliver more "human-like" companionship, Soul has now upgraded the model with the following capabilities:

1.  Full-Duplex Interaction: AI Gains Autonomous Decision-Making

The new model performs streaming prediction of responses, listening, and interruptions. The AI autonomously decides when to speak, achieving true end-to-end full-duplex interaction in which the AI and users can talk simultaneously (e.g., debating, arguing, singing), interrupt each other where appropriate, or proactively break silence to initiate topics.

This autonomy allows AI to master interaction timing, significantly enhancing dialogue naturalness and enabling immersive, real-world-like exchanges during extended, multi-turn conversations.

2.  Colloquial & Emotional Expression: More Vivid Interactions

To make AI interactions feel more human-like, the model has been enhanced across multiple dimensions, including emotional expression, vocal characteristics, and conversational content, bringing it closer to real-world communication.

For instance, in emotional expression, beyond foundational capabilities like laughter, crying, or anger, the upgraded model delivers more pronounced vocal fluctuations that evolve naturally with the conversation. Its speech now incorporates organic elements such as filler words, occasional stammering, common catchphrases, coughs, and other everyday vocal nuances. Furthermore, AI-generated dialogue favors colloquial, socially fluid language over rigid, written-language patterns.

3.  Contextual Awareness: Time, Events, Environment

Built on a pure autoregressive architecture with unified text/audio generation (Unified Model), the model leverages strong LLM capabilities to integrate persona, time, environment, and dialogue context into AI responses. This allows a perceptive, understanding AI to better shape "digital personalities", create rich storylines, and transform interactions into genuine "exchanges of emotion and information".

Notably, Soul's AI team is currently exploring how to extend its full-duplex voice call model to multi-person scenarios. For example, in group voice conversations, the AI would use its autonomous decision-making capability to identify the right moments to speak, help facilitate and extend topic discussions, and integrate into authentic social dynamics as an active participant.

Integrating AI into Social Networks: Delivering Emotional and Informational Value

Drawing on deep insights into social dynamics, Soul rapidly implements technology at the application layer and refines products based on user feedback. Its AI roadmap focuses on two paths: "AI helping users make friends (AI-assisted socializing)" and "AI being friends with users (human-AI interaction)". Corresponding features, such as the emotionally intelligent "AI Companion" and the "AI Chat Assistant", have been enthusiastically adopted by users.

Human-AI interaction specifically aims to deliver emotional and informational value through human-like capabilities. According to Soul's Just So Soul Institute (2025 Gen Z AI Usage Report, March 2025; sample: 3,680):

  • Nearly 40% of young people use AI daily for emotional companionship.
  • 71.1% are willing to befriend AI (vs. 32.8% in the 2024 Gen Z AIGC Attitudes Report).

As AI reshapes Generation Z's perception of social relationships, it also gives rise to new demands. A survey conducted by Soul among active users of "AI Companion" found that approximately 60% want more human-like AI behavior.

This full-duplex model upgrade significantly advances AI's interactive abilities, infusing presence and emotional warmth into human-AI exchanges, and will propel AI socializing into a new era.

As a platform rooted in authentic human connections, Soul is committed to building a social ecosystem where "AI Beings" and "Human Beings" coexist. By continuously investing in cutting-edge technology, Soul empowers AI to enrich users' emotional support systems, diversify emotional experiences, and ultimately enhance individual well-being and belonging.

 
