ChatGPT isn’t perfect, but the popular AI chatbot’s access to large language models (LLM) means it can do a lot of things you might not expect, like give all of Tamriel’s NPC inhabitants the ability to hold natural conversations and answer questions about the iconic fantasy world. Uncanny, yes. But it’s a prescient look at how games might one day use AI to reach new heights in immersion.
YouTuber ‘Art from the Machine’ released a video showing off how they modded the much beloved VR version of The Elder Scrolls V: Skyrim.
The mod, which isn’t available yet, ostensibly lets you hold conversations with NPCs via ChatGPT and xVASynth, an AI tool for generating voice acting lines using voices from video games.
Check out the results in the most recent update below:
The latest version of the project introduces Skyrim scripting for the first time, which the developer says allows for lip syncing of voices and NPC awareness of in-game events. While still a little rigid, it feels like a pretty big step towards climbing out of the uncanny valley.
Here’s how ‘Art from the Machine’ describes the project in a recent Reddit post showcasing their work:
A few weeks ago I posted a video demonstrating a Python script I am working on which lets you talk to NPCs in Skyrim via ChatGPT and xVASynth. Since then I have been working to integrate this Python script with Skyrim’s own modding tools and I have reached a few exciting milestones:
NPCs are now aware of their current location and time of day. This opens up lots of possibilities for ChatGPT to react to the game world dynamically instead of waiting to be given context by the player. As an example, I no longer have issues with shopkeepers trying to barter with me in the Bannered Mare after work hours. NPCs are also aware of the items picked up by the player during conversation. This means that if you loot a chest, harvest an animal pelt, or pick a flower, NPCs will be able to comment on these actions.
NPCs are now lip synced with xVASynth. This is obviously much more natural than the floaty proof-of-concept voices I had before. I have also made some quality of life improvements such as getting response times down to ~15 seconds and adding a spell to start conversations.
When everything is in place, it is an incredibly surreal experience to be able to sit down and talk to these characters in VR. Nothing takes me out of the experience more than hearing the same repeated voice lines, and with this no two responses are ever the same. There is still a lot of work to go, but even in its current state I couldn’t go back to playing without this.
You might notice the actual voice prompting the NPCs is also fairly robotic too, although ‘Art from the Machine’ says they’re using speech-to-text to talk to the ChatGPT 3.5-driven system. The voice heard in the video is generated from xVASynth, and then plugged in during video editing to replace what they call their “radio-unfriendly voice.”
And when can you download and play for yourself? Well, the developer says publishing their project is still a bit of a sticky issue.
“I haven’t really thought about how to publish this, so I think I’ll have to dig into other ChatGPT projects to see how others have tackled the API key issue. I am hoping that it’s possible to alternatively connect to a locally-run LLM model for anyone who isn’t keen on paying the API fees.”
Serving up more natural NPC responses is also an area that needs to be addressed, the developer says.
For now I have it set up so that NPCs say “let me think” to indicate that I have been heard and the response is in the process of being generated, but you’re right this can be expanded to choose from a few different filler lines instead of repeating the same one every time.
And while the video is noticeably sped up after prompts, this mostly comes down to the voice generation software xVASynth, which admittedly slows the response pipeline down since it’s being run locally. ChatGPT itself doesn’t affect performance, the developer says.
This isn’t the first project we’ve seen using chatbots to enrich user interactions. Lee Vermeulen, a long-time VR pioneer and developer behind Modbox, released a video in 2021 showing off one of his first tests using OpenAI GPT 3 and voice acting software Replica. In Vermeulen’s video, he talks about how he set parameters for each NPC, giving them the body of knowledge they should have, all of which guides the sort of responses they’ll give.
Check out Vermeulen’s video below, the very same that inspired ‘Art from the Machine’ to start working on the Skyrim VR mod:
As you’d imagine, this is really only the tip of the iceberg for AI-driven NPC interactions. Being able to naturally talk to NPCs, even if a little stuttery and not exactly at human-level, may be preferable over having to wade through a ton of 2D text menus, or go through slow and ungainly tutorials. It also offers up the chance to bond more with your trusty AI companion, like Skyrim’s Lydia or Fallout 4’s Nick Valentine, who instead of offering up canned dialogue might actually, you know, help you out every once in a while.
And that’s really only the surface level stuff that a mod like ‘Art from the Machine’ might deliver to existing games that aren’t built with AI-driven NPCs. Imagining a game that is actually predicated on your ability to ask the right questions and do your own detective work—well, that’s a role-playing game we’ve never experienced before, either in VR our otherwise.