“During the pandemic, a friend of mine named Kevin Baillie, who is Robert Zemeckis’ VFX supervisor, turned me on to the deepfake Tom Cruise TikToks,” Ulbrich recalls. “I’d seen deepfakes before, but when a bunch of us VFX guys saw that, we were all looking at it and going ‘holy shit’. Long story short, that was the genesis of Metaphysic.”
The company debuted its photoreal GenAI technology in 2022, when it was used to revive Elvis Presley for a performance on America’s Got Talent. The team placed the King’s face over that of tribute singer Emilio Santoro during the live broadcast, and audiences could hardly believe their eyes.
To train its models, Metaphysic scours the Earth for source data – in an entirely ethical manner, of course. For deceased actors and artists, this means getting the green light from their estates, as well as film studios’ permission to use existing footage. “It is unforgiving,” states Ulbrich. “There’s no margin for error. It’s a great responsibility to bring someone like that back, and we take it very seriously. It’s with the blessings of estates every step of the way.” For vocal performances, Metaphysic also uses GenAI to ‘recreate their actual voices, as opposed to using impersonators’.
In an instant
Zemeckis’ film Here has been Metaphysic’s big project from the start. “A bunch of old guys – and I’ll include myself in that category – are harnessing the most advanced, cutting-edge technology in the world to make a traditional live-action movie. It’s just actors talking, it’s a drama. It’s not superheroes throwing buildings at each other,” Ulbrich says half-jokingly.
“Zemeckis had the idea of using this technology. He thought this could get the movie made because doing it with CGI instead would cost tens of millions of dollars. It’s throughout the entire movie, and would have been like composing music by one note a day, multiplied by four characters. It would have been a very long, expensive, tedious process just for people talking on camera,” explains Ulbrich, who was an early proponent of CGI in the nineties. The team behind Here initially contacted both VFX studios and AI companies, testing whether the available technology could support Zemeckis’ vision.
“At the end of the day, only one test – Metaphysic’s – was successful,” states Ulbrich, who was working elsewhere at the time. “I see the test; it’s Tom Hanks delivering a line from the movie. Then they show the same footage again, and he’s 20 years old, delivering the same line. Then they tell me they’re doing it live, in real time. I’m like ‘wait, what?’”
This instantaneous element has serious implications, creating avenues for live performances and productions on top of traditional, pre-recorded ones. Of course, the Elvis AGT moment had already happened, proving that live GenAI was possible in entertainment, but Here was the first time Ulbrich had witnessed it with his own eyes. “It blew my mind,” he admits. “I could take the best CG team and the best company on Earth – hundreds of people with limitless sums of money – and we couldn’t deliver the same thing. It’s a completely unique technology for production.”
A model with a good memory
Besides saving time, live GenAI boasts another benefit: “It’s always photoreal,” claims Ulbrich. Trained on ‘trillions of pixels’, the AI model, unlike CGI, ‘starts on the other side of the uncanny valley’. “Things that are hard to do in CGI, like eyes, mouths, lip sync and emotionality, are easy for us. It’s all built on photography.