ASTONISHED BY DESIGN
This video demonstrates an early version of Realitalk without the CereProc text-to-speech integration. Instead, it is driven by a real actor's vocal performance, in this case John Lithgow's "A Dramatic Reading of a Playbill".
Take a complete voice-actor audio performance and use an audio editor to create a region for each word. A region is nothing more than a start time and an end time, plus the resulting region length.
For my example, I used Reaper and named each region with the word it contains, written in standard English spelling.
The regions are then exported to CSV and imported into UE as a data table.
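To make the shape of that region data concrete, here is a minimal sketch in Python, outside of UE, that reads a region CSV into simple records. The column names (Name, Start, End, Length) and the times in plain seconds are assumptions about the export settings, and the sample rows are made up for illustration, not taken from the actual performance.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Region:
    word: str      # region name, i.e. the written English word
    start: float   # start time in seconds
    end: float     # end time in seconds
    length: float  # end - start, in seconds


def load_regions(csv_text):
    """Parse region rows from CSV text (assumed columns: Name, Start, End)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    regions = []
    for row in reader:
        start = float(row["Start"])
        end = float(row["End"])
        # Recompute the length instead of trusting the exported column.
        regions.append(Region(row["Name"].strip().lower(), start, end, end - start))
    return regions


# Hypothetical sample rows, for illustration only.
SAMPLE_CSV = """#,Name,Start,End,Length
R1,A,0.50,0.62,0.12
R2,dramatic,0.62,1.10,0.48
R3,reading,1.10,1.55,0.45
"""

regions = load_regions(SAMPLE_CSV)
for region in regions:
    print(region)
```

If the editor is set to export times as minutes:seconds or as beats, the float conversion above would need to change accordingly.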
Using blueprints and a separate data table, which contains a dictionary mapping American English words to IPA, the standard English words are broken down into their IPA phonemes.
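The lookup itself is done in blueprints, but the idea can be sketched in a few lines of Python. The dictionary entries, the function name word_to_phonemes, and the choice to return an empty list for unknown words are all assumptions made for illustration; the real data table holds a much larger word list.

```python
# Hypothetical, tiny stand-in for the American English to IPA data table.
IPA_DICT = {
    "a": ["ə"],
    "dramatic": ["d", "ɹ", "ə", "m", "æ", "t", "ɪ", "k"],
    "reading": ["ɹ", "i", "d", "ɪ", "ŋ"],
}


def word_to_phonemes(word, dictionary):
    """Return the IPA phonemes for a written word, or [] if it is unknown."""
    # Normalize case and strip surrounding punctuation before the lookup.
    key = word.strip().lower().strip(".,!?;:\"'")
    return dictionary.get(key, [])


for word in ["A", "Dramatic,", "reading", "playbill"]:
    print(word, word_to_phonemes(word, IPA_DICT))
```

Words missing from the dictionary come back empty here, so a real setup would need some fallback, such as adding the word to the table by hand.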
The phonemes are then sent to a custom version of the common Metahuman Face Anim blueprint. In it, every phoneme pose is created with the Modify Curve node and the Metahuman CTRL curves, which are already built into the Metahuman Face Anim blueprint.
A timeline, along with the region times exported from the audio file, is used to control the timing of the animation and sync it with the original audio file.
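As a rough sketch of the timing logic, the Python below turns word regions into a per-phoneme schedule and looks up which phoneme should be posed at a given playback time. Splitting each word's region evenly across its phonemes is purely an assumption for this example, as are the function names and the sample values; the blueprint version may time phonemes within a word differently.

```python
def phoneme_schedule(regions, dictionary):
    """Build (start, end, phoneme) entries from (word, start, end) regions.

    Assumption: each word's duration is divided evenly among its phonemes.
    """
    schedule = []
    for word, start, end in regions:
        phonemes = dictionary.get(word.lower(), [])
        if not phonemes:
            continue  # unknown word: no pose is scheduled
        step = (end - start) / len(phonemes)
        for i, phoneme in enumerate(phonemes):
            schedule.append((start + i * step, start + (i + 1) * step, phoneme))
    return schedule


def phoneme_at(schedule, t):
    """Return the phoneme active at playback time t (seconds), or None."""
    for start, end, phoneme in schedule:
        if start <= t < end:
            return phoneme
    return None


# Hypothetical data, for illustration only.
ipa = {"reading": ["ɹ", "i", "d", "ɪ", "ŋ"]}
word_regions = [("reading", 1.0, 2.0)]

schedule = phoneme_schedule(word_regions, ipa)
for t in (0.5, 1.1, 1.5, 1.9):
    print(t, phoneme_at(schedule, t))
```

In UE, the timeline's current playback position would play the role of t, and the returned phoneme would select which pose the Face Anim blueprint blends toward.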