Introducing Audio Driven Lip-Sync

With the AI-powered ACE A2F (Audio2Face) technology, you can generate real-time lip-sync and facial expressions for speech animation.

Workflow of Audio Driven Lip-Sync

Refer to the steps below to explore the audio-driven workflow of the ACE system; a conceptual sketch of the pipeline follows the list.

  1. Choose an A2F AI Model that determines the audio-driven lip-sync style.
  2. Optimize the received ARKit signal and adjust the morph strength.
  3. Enable the Emotions that the LLM is allowed to trigger.
  4. Design the triggered Emotions with different strength levels.
  5. Send the Audio2Face processing results to the Speech root.
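The sketch below (Python pseudocode; every function, parameter, and value is made up for illustration and is not part of the actual ACE or Speech Graph API) shows how the five steps chain together: the selected A2F model turns audio into ARKit-style weights, the signal is tuned and its morph strength adjusted, only the Emotions enabled for the LLM pass through, their configured strength is blended in, and the result is sent to the Speech root.

    # Hypothetical sketch of the audio-driven pipeline; all names below are
    # illustrative and do not belong to the real ACE / Speech Graph API.

    def run_a2f_model(model_name, audio_frame):
        """Step 1: the chosen A2F AI model converts an audio frame into
        ARKit-style blendshape weights plus an emotion label from the LLM."""
        return {"jawOpen": 0.6, "mouthSmileLeft": 0.1}, "happy"

    def tune_arkit(weights, gain, lower_face_strength):
        """Step 2: remap the ARKit signal to the character's morphs and
        adjust the overall morph strength."""
        return {k: min(1.0, v * gain.get(k, 1.0) * lower_face_strength)
                for k, v in weights.items()}

    def filter_emotion(emotion, enabled_emotions):
        """Step 3: only Emotions enabled for LLM triggering pass through."""
        return emotion if emotion in enabled_emotions else "neutral"

    def apply_emotion_config(morphs, emotion, emotion_strength):
        """Step 4: blend the triggered Emotion in at its designed strength."""
        if emotion == "happy":
            morphs["mouthSmileLeft"] = min(
                1.0, morphs.get("mouthSmileLeft", 0.0) + emotion_strength["happy"])
        return morphs

    def send_to_speech_root(morphs):
        """Step 5: hand the processed Audio2Face result to the Speech root."""
        print("morph targets ->", morphs)

    weights, emotion = run_a2f_model("A2F-default", audio_frame=b"\x00" * 1024)
    morphs = tune_arkit(weights, gain={"jawOpen": 0.9}, lower_face_strength=1.1)
    emotion = filter_emotion(emotion, enabled_emotions={"happy", "surprised"})
    morphs = apply_emotion_config(morphs, emotion, emotion_strength={"happy": 0.4})
    send_to_speech_root(morphs)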

Create an Audio2Face node to access all of the related settings in the Speech Graph. Right-click on an empty area of the graph and select ACE-A2F > Audio2Face from the context menu.


The Audio2Face node provides the following controls (a conceptual sketch of how they interact follows the list).

  • Audio2Face Model: Accepts the AI model selected in the sub-node of the same name.
  • ARKit Tuner: Accepts the calculated results from the sub-node of the same name, which maps the ARKit signal returned by Audio2Face to the character’s default morphs.
  • Face Tuner Mapper: Accepts the calculated results from the sub-node of the same name, which applies an additional custom mapping on top of the ARKit Tuner.
  • Lower Face Strength: Adjusts the strength of the mouth animation generated from the Audio2Face signal.
  • Enabled Emotions: Allows LLM-driven control of specified emotional speech animations.
  • Emotion Config: Configures the strength levels of the triggered Emotions and outputs the enhanced emotional mouth animations.
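
As a rough illustration of how the tuner chain layers its mappings (the dictionaries, morph names, and weights below are assumptions made for this sketch, not values or calls taken from the product), the ARKit Tuner first maps the ARKit signal to the character's default morphs, the Face Tuner Mapper then remaps on top of that result, and Lower Face Strength finally scales the generated mouth animation:

    # Illustrative only: morph names, mapping tables, and weights are assumed.
    ARKIT_TUNER = {            # ARKit blendshape -> (default morph, weight)
        "jawOpen":        ("Mouth_Open", 1.0),
        "mouthSmileLeft": ("Smile_L",    0.8),
    }
    FACE_TUNER_MAPPER = {      # custom mapping applied on top of the ARKit Tuner
        "Mouth_Open": ("V_Open", 0.9),
    }

    def map_arkit_to_morphs(arkit_signal, lower_face_strength):
        """ARKit Tuner -> Face Tuner Mapper -> Lower Face Strength scaling."""
        morphs = {}
        # ARKit Tuner: map the returned ARKit signal to default morphs.
        for blendshape, value in arkit_signal.items():
            morph, weight = ARKIT_TUNER.get(blendshape, (blendshape, 1.0))
            morphs[morph] = morphs.get(morph, 0.0) + value * weight
        # Face Tuner Mapper: extra custom mapping layered on the tuner output.
        for morph, value in list(morphs.items()):
            if morph in FACE_TUNER_MAPPER:
                target, weight = FACE_TUNER_MAPPER[morph]
                morphs[target] = morphs.get(target, 0.0) + value * weight
                del morphs[morph]
        # Lower Face Strength: overall gain on the generated mouth animation.
        return {m: min(1.0, v * lower_face_strength) for m, v in morphs.items()}

    print(map_arkit_to_morphs({"jawOpen": 0.6, "mouthSmileLeft": 0.2},
                              lower_face_strength=1.1))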