Virtual Human Controller is a collection of modules for creating virtual characters that can talk and perform facial expressions and gestures. It provides a quick pipeline for building interactive conversational characters from scratch.
If you are a researcher or game developer who would like to include virtual characters with realistic appearance and social behaviors, UUVHC is for you. For example, you might want to develop a job skills training game in which a virtual character takes the role of an interviewer. Or you might want to train police officers by letting them interact with non-player characters that take the role of citizens or victims. You could also imagine a game that helps people with social anxiety experience social situations safely. It is up to your imagination! Role-playing characters are required in almost every application area, including business, health, education, security and military.
The asset is built on top of the Unity 3D Game Engine. The animation pipeline includes three steps:
Virtual Human Controller is an asset developed partly in the RAGE (Realising an Applied Gaming Ecosystem) Horizon 2020 project and partly with Utrecht University Game Research Seed Money. We successfully integrated our asset with the Communicate! dialogue manager from Utrecht University and with the Emotion Appraisal module from INESC-ID in Portugal. In addition to inter-asset integration, our asset is currently being used by the game developers at BipMedia in Paris, where it will be part of the interview skills training game for Randstad.
It was recently used for a case study where the virtual character takes the role of a Virtual Receptionist. The set-up includes a microphone to capture people’s speech and a Kinect camera to capture their behaviors. Beyond the functionalities of the Virtual Human Controller, we added Google speech recognition and chatting functionalities using AIML Pandorabots. Furthermore, we developed a novel autonomous gaze control module based on Kinect to drive the “look at” behavior of the virtual character in group-based interactions.
We recently showed our results as a live interactive demo at two public events: one in May during the visit of EU ambassadors to Utrecht and the other in June during the INTETAIN 2016 conference. Take a look at the video below:
The software has a Utrecht University license. Third-party assets have their own licenses and need to be downloaded or purchased from their respective websites. The software is currently available to RAGE game researchers and developers and for internal projects at Utrecht University.
For questions and feedback, please contact Dr. Zerrin Yumak at z.yumak@uu.nl. See our research page at Utrecht University here.
Below we provide step-by-step instructions for the Virtual Human Controller.
The first step is to prepare a 3D model to import into Unity. We currently use Daz3D Editor. It provides rigged models and allows you to export blendshapes for facial animation without any designer effort. Desired accessories such as clothes and hair can be downloaded from the Daz3D store.
The model is exported as an .fbx file from Daz3D Editor. Please make sure that you add export rules to include the visemes and facial expressions.
Download the Unity project and drag and drop the .fbx file into the "Models" folder under Assets. Add the model to the scene. You will see the list of blendshapes (facial expressions and visemes) attached to the body mesh.
Notice that the eyes and hair of the model have rendering problems. You can fix that by adjusting the shader settings. For more information, please check the Unity Manual. Alternatively, you can see a manual here. For the background and lighting settings, we worked with a designer from the Utrecht School of the Arts (HKU).
For speech animation, we currently use an asset from the Unity Asset Store, Rogo Digital. The free version of the asset, Rogo Digital Lite, works well. For extra functionality, you can also check Rogo Digital Pro. Text-to-speech is based on the CereProc SDK. The free academic version comes with a free voice, Heather (Scottish English female). We currently use Isabella (American English female). You can test different voices here.
Once your model is in the scene, you can add the Rogo LipSync component using the Add Component button in Unity. Rogo LipSync includes 9 phonemes (+1 rest frame), and you need to map them to the blendshapes exported from Daz3D. You also need to create an Audio Source component and link it from the LipSync component.
Please see below the mapping of the Rogo phonemes to Daz3D blendshapes. For the rest frame, we chose an arbitrary blendshape and set its value to 0.
Rogo Digital Phonemes | Daz3D Blendshapes |
AI | head.eCTRLvAA |
E | head.eCTRLvEE |
U | head.eCTRLvUW |
O | head.eCTRLvOW |
CDGKNRSThYZ | head.eCTRLvTH |
FV | head.eCTRLvF |
L | head.eCTRLvL |
MBP | head.eCTRLvM |
WQ | head.eCTRLvW |
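The mapping in the table can also be written down as a small lookup structure. The following is a minimal Python sketch for illustration only, not a Unity script and not part of the asset: the function name `blendshape_pose`, the rest-frame label `Rest`, and the 0 to 100 weight range (Unity's usual blendshape range) are assumptions.

```python
# Rogo Digital phoneme group -> Daz3D blendshape, copied from the table above.
PHONEME_TO_BLENDSHAPE = {
    "AI": "head.eCTRLvAA",
    "E": "head.eCTRLvEE",
    "U": "head.eCTRLvUW",
    "O": "head.eCTRLvOW",
    "CDGKNRSThYZ": "head.eCTRLvTH",
    "FV": "head.eCTRLvF",
    "L": "head.eCTRLvL",
    "MBP": "head.eCTRLvM",
    "WQ": "head.eCTRLvW",
}

# Hypothetical label for the rest frame (all viseme blendshapes at 0).
REST = "Rest"


def blendshape_pose(phoneme_group, weight=100.0):
    """Return {blendshape name: weight} for one phoneme group.

    Every viseme blendshape is set to 0 except the one mapped to the
    requested group. The rest frame leaves all of them at 0.
    """
    pose = {name: 0.0 for name in PHONEME_TO_BLENDSHAPE.values()}
    if phoneme_group != REST:
        pose[PHONEME_TO_BLENDSHAPE[phoneme_group]] = weight
    return pose
```

For example, `blendshape_pose("MBP")` activates only `head.eCTRLvM`, while `blendshape_pose("Rest")` returns a pose with every viseme at 0.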
Since Rogo Digital lip sync works offline with sound files, we linked Rogo Digital to the CereVoice TTS callback functionality. A TTS callback example can be found in the examples/basictts folder once you download the CereVoice SDK. We wrote a set of scripts in Unity to extract phonemes from CereVoice and pass them to Rogo Digital. The voice and license files from CereVoice and the necessary DLLs need to be added to the Unity project to make it work. CereProc provides an API for both Mac OS X and Windows; so far, we have made the link only for the Windows version. The last step is to set the parameter of the Speech Realizer to Rogo Digital so that it can find it.
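To give a rough idea of what passing phonemes from a TTS callback to a lip sync component involves, here is a minimal Python sketch. It is not the Unity scripts mentioned above and does not use the CereVoice or Rogo Digital APIs. The `PhonemeEvent` type, the TTS symbols in `TTS_TO_ROGO`, and the use of normalized marker times (0 to 1 over the clip) are all assumptions made for illustration; the real symbol set depends on the voice.

```python
from dataclasses import dataclass


@dataclass
class PhonemeEvent:
    symbol: str   # phoneme symbol reported by the TTS engine (hypothetical values)
    start: float  # start time in seconds
    end: float    # end time in seconds


# Hypothetical lookup from TTS phoneme symbols to Rogo phoneme groups.
TTS_TO_ROGO = {
    "aa": "AI", "iy": "E", "uw": "U", "ow": "O",
    "th": "CDGKNRSThYZ", "f": "FV", "l": "L", "m": "MBP",
    "w": "WQ", "sil": "Rest",
}


def to_markers(events, default_group="Rest"):
    """Convert timed TTS phonemes into (group, normalized_time) markers."""
    if not events:
        return []
    total = max(e.end for e in events)  # clip length in seconds
    markers = []
    for event in sorted(events, key=lambda e: e.start):
        # Unknown symbols fall back to the default group.
        group = TTS_TO_ROGO.get(event.symbol, default_group)
        # Skip a marker that repeats the previous group.
        if markers and markers[-1][0] == group:
            continue
        time = event.start / total if total > 0 else 0.0
        markers.append((group, time))
    return markers
```

A call such as `to_markers([PhonemeEvent("m", 0.0, 0.1), PhonemeEvent("aa", 0.1, 0.4)])` yields one marker per phoneme group, each placed at its start time relative to the clip length.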
Facial animation is based on blendshapes in Unity. These are exported from Daz3D and are listed under the Genesis3Female.Shape skinned mesh renderer. In the Face Realizer settings, you need to select Genesis3Female.Shape as the Morph Container. Then, set the number of face lexemes that you want to include and link them to the blendshapes as seen below. For example, for Suprised, the blendshape number is 5, which corresponds to head.eCTRLSuprised in the blendshapes list. If you want to create facial expressions other than the ones provided by Daz3D, you can use the face primitives as a basis and create different combinations in 3ds Max or Maya, or work with a 3D artist.
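The link between a face lexeme and a blendshape index can be pictured with a small Python sketch. This is an illustration under stated assumptions, not the Face Realizer itself: `MorphContainer`, `FACE_LEXEMES` and `show_lexeme` are hypothetical names, the placeholder blendshape names are invented, and the 0 to 100 weight range follows Unity's usual blendshape range. Only the pairing of Suprised with index 5 and head.eCTRLSuprised comes from the text (spelling kept as in the blendshape name).

```python
class MorphContainer:
    """Stand-in for a skinned mesh renderer: a list of blendshape weights."""

    def __init__(self, names):
        self.names = list(names)
        self.weights = [0.0] * len(self.names)

    def set_weight(self, index, weight):
        # Clamp to the assumed 0..100 blendshape range.
        self.weights[index] = max(0.0, min(100.0, weight))


# Face lexeme -> blendshape index in the morph container.
FACE_LEXEMES = {"Suprised": 5}


def show_lexeme(container, lexeme, amount=1.0):
    """Set the blendshape linked to `lexeme`; `amount` is in the range 0..1."""
    index = FACE_LEXEMES[lexeme]
    container.set_weight(index, amount * 100.0)
    return container.names[index]


# Placeholder names for indices 0..4; index 5 is the blendshape from the text.
names = [f"placeholder_{i}" for i in range(5)] + ["head.eCTRLSuprised"]
container = MorphContainer(names)
print(show_lexeme(container, "Suprised", amount=0.5))
```

Keeping the lexeme table separate from the container means a different character model only needs a new index table, not new realizer logic.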
Once you add the 3D character model to the project, you can set the Humanoid Rig. It is possible to get animations from Mixamo. For more information on animation, please check the Unity Manual.
The next step is to create an animation controller. In our example, we have a base animation layer with a standing idle motion and a second animation layer for the waving animation with a right hand mask. The mask ensures that only the desired part of the original waving animation, which affects the whole body, is used. Define a transition from the No motion state to the Wave state, which is triggered by the Wave boolean variable defined in the Parameters tab.
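The two ideas in this step, a boolean-triggered transition and a masked layer on top of a base layer, can be sketched in plain Python. This is a simplified model and not Unity's Animator: the class and function names are hypothetical, poses are reduced to one value per bone, and the transition back from Wave to No motion when the boolean is cleared is an assumption that the text does not describe.

```python
class GestureLayer:
    """Tiny state machine for the second (gesture) animation layer."""

    def __init__(self):
        self.state = "No motion"
        self.params = {"Wave": False}

    def set_bool(self, name, value):
        if name not in self.params:
            raise KeyError(name)
        self.params[name] = bool(value)

    def update(self):
        # Transition triggered by the Wave boolean parameter.
        if self.state == "No motion" and self.params["Wave"]:
            self.state = "Wave"
        # Assumed return transition once the parameter is cleared.
        elif self.state == "Wave" and not self.params["Wave"]:
            self.state = "No motion"
        return self.state


def apply_masked_layer(base_pose, layer_pose, mask):
    """Take masked bones from the layer pose and all others from the base pose."""
    return {
        bone: layer_pose[bone] if bone in mask and bone in layer_pose else value
        for bone, value in base_pose.items()
    }


idle = {"spine": 0.0, "left_hand": 0.0, "right_hand": 0.0}
wave = {"spine": 10.0, "left_hand": 5.0, "right_hand": 80.0}
print(apply_masked_layer(idle, wave, mask={"right_hand"}))
```

With the right hand mask, only the `right_hand` value comes from the waving pose, while the spine and left hand keep the idle pose.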
The final step is to link the Gesture Realizer to the animation controller of the model and add the list of supported gestures.
It is possible to add gesture parameters such as attackPeak according to the BML specification. This is not fully supported yet in our BML Realizer.
For the gaze animation, add the Head Look Controller script from the asset store and set the parameters.
Then, link the Gaze Realizer to the Head Look Controller script and create gaze targets from your scene.
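As background for what a "look at" behavior computes, the following Python sketch derives the yaw and pitch needed to face a gaze target. It is not how the Head Look Controller script is implemented. It assumes a character facing the +z axis with y pointing up, and the function name and the angle limits are hypothetical.

```python
import math


def look_angles(head_pos, target_pos, max_yaw=60.0, max_pitch=40.0):
    """Return (yaw, pitch) in degrees that turn the head toward the target.

    Positions are (x, y, z) tuples. Angles are clamped to the given limits
    so that the head does not rotate beyond a plausible range.
    """
    dx = target_pos[0] - head_pos[0]
    dy = target_pos[1] - head_pos[1]
    dz = target_pos[2] - head_pos[2]
    yaw = math.degrees(math.atan2(dx, dz))                    # left/right
    pitch = math.degrees(math.atan2(dy, math.hypot(dx, dz)))  # up/down

    def clamp(value, limit):
        return max(-limit, min(limit, value))

    return clamp(yaw, max_yaw), clamp(pitch, max_pitch)
```

A target straight ahead gives zero yaw and pitch, and a target far to the side is limited by `max_yaw`.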
Take a look at the video below and its corresponding BML script.
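For readers unfamiliar with BML, the Python sketch below assembles a small block that combines speech, a gesture, gaze and a face lexeme. It is an illustration only and is not the script used in the video. The element and attribute names follow our understanding of BML 1.0; the ids, lexeme names, gaze target and amount are hypothetical, and the subset of BML supported by the BML Realizer may differ.

```python
import xml.etree.ElementTree as ET

# Namespace assumed from the BML 1.0 standard.
BML_NS = "http://www.bml-initiative.org/bml/bml-1.0"


def build_bml(text, gesture_lexeme, gaze_target, face_lexeme):
    """Build a BML block in which all behaviors start together with the speech."""
    bml = ET.Element("bml", {"id": "bml1", "xmlns": BML_NS})

    speech = ET.SubElement(bml, "speech", {"id": "speech1"})
    ET.SubElement(speech, "text").text = text

    sync = "speech1:start"  # reference to the start of the speech behavior
    ET.SubElement(bml, "gesture",
                  {"id": "gesture1", "lexeme": gesture_lexeme, "start": sync})
    ET.SubElement(bml, "gaze",
                  {"id": "gaze1", "target": gaze_target, "start": sync})
    ET.SubElement(bml, "faceLexeme",
                  {"id": "face1", "lexeme": face_lexeme,
                   "amount": "0.8", "start": sync})

    return ET.tostring(bml, encoding="unicode")


# Hypothetical lexeme and target names.
print(build_bml("Hello, welcome!", "Wave", "Visitor", "Suprised"))
```

Generating the block programmatically keeps the ids and sync references consistent when a dialogue manager produces many such blocks.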