Per-Job Options
Every job in the queue has its own independent set of solve options. Add columns to the queue view via the column picker, then edit each cell inline. These settings do not override anything at the global level; they are purely per-take controls passed directly to the MetaHuman Animator solver.
Universal Options
All job types
One or more frame ranges to skip during the solve, e.g. 0-5, 120-125. Use to cut out blinks, obstructions, equipment glare, or any frames where the face is occluded or corrupted.
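As a rough illustration of the skip-range format above, the following sketch parses a string such as 0-5, 120-125 into frame ranges and checks whether a frame falls inside one. The function names are hypothetical and are not part of any MetaHuman Animator API. It also assumes that both ends of a range are inclusive and that a bare number means a single frame, neither of which is stated above.

```python
def parse_skip_ranges(text):
    """Parse a string such as "0-5, 120-125" into a list of (start, end) tuples."""
    ranges = []
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate trailing commas and empty entries
        start_text, sep, end_text = part.partition("-")
        start = int(start_text)
        # Assumption: a bare number such as "7" means a single-frame range.
        end = int(end_text) if sep else start
        if end < start:
            raise ValueError(f"range end is before range start: {part!r}")
        ranges.append((start, end))
    return ranges


def is_skipped(frame, ranges):
    """Return True if the frame lies inside any range."""
    # Assumption: both ends of each range are inclusive.
    return any(start <= frame <= end for start, end in ranges)


if __name__ == "__main__":
    skip = parse_skip_ranges("0-5, 120-125")
    print(skip)
    print(is_skipped(3, skip), is_skipped(6, skip))
```

Validating the string this way before queuing a batch can catch a reversed range or a typo early, rather than after a long solve.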
Renders preview frames to the viewport during the solve. Useful for debugging, because it lets you visually confirm that the solver is tracking correctly. Slows down processing, so disable it for production batches.
Override the frame rate auto-detected from the capture data asset. Leave blank in most cases. Set explicitly if your image sequence lacks embedded timing metadata or if auto-detection gives incorrect results.
Mono Options
Mono jobs only
Corrects for drift introduced by head movement across the recording. Enable when the subject moves significantly during the take or when camera stabilization is imperfect. Has a small performance cost.
Enables the tongue tracking solver. Requires UE 5.7 or later. Capture quality must be sufficient for the tongue to be visible: close framing and good lighting are prerequisites. Disabled by default because it increases solve time.
Stereo Options
Stereo jobs only
Selects the internal stereo solver algorithm variant. In most productions the default is correct. Change only when directed by Epic support or when experimenting with non-standard rig configurations.
The frame number used as the identity reference for head movement correction. Frame 0 is the default. Set to a frame where the subject is in a neutral, stable pose facing forward.
Uses a dedicated neutral expression frame (separate from the performance footage) to calibrate the facial geometry baseline before solving. Improves accuracy when a calibration take is available.
Minimum proportion of the face that must be visible in the depth map for a frame to be included in the solve. Frames below this threshold are automatically excluded. Raise if you see partial-face frames corrupting the solve.
Minimum face width in depth pixels required for a frame to be included. Acts alongside Face Coverage Threshold to filter out frames where the face is too far from camera or too small for reliable depth data.
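The two filters above combine as a simple per-frame test: a frame is kept only if it passes both the coverage threshold and the minimum face width. The sketch below illustrates that logic only. The FrameStats type, its field names, and the example values are hypothetical, and it assumes coverage is expressed as a fraction between 0 and 1, which the text above does not specify.

```python
from dataclasses import dataclass


@dataclass
class FrameStats:
    """Hypothetical per-frame measurements; not an actual solver data structure."""
    index: int
    face_coverage: float  # assumed: fraction of the face visible in the depth map, 0.0 to 1.0
    face_width_px: int    # face width in depth pixels


def frames_to_solve(frames, coverage_threshold, min_face_width_px):
    """Return the indices of frames that pass both filters."""
    return [
        f.index
        for f in frames
        # Frames below either threshold are excluded; frames exactly at it are kept.
        if f.face_coverage >= coverage_threshold
        and f.face_width_px >= min_face_width_px
    ]


if __name__ == "__main__":
    take = [
        FrameStats(0, 0.95, 220),  # passes both filters
        FrameStats(1, 0.40, 230),  # too little of the face visible
        FrameStats(2, 0.90, 80),   # face too small in the depth map
    ]
    print(frames_to_solve(take, coverage_threshold=0.8, min_face_width_px=150))
```

Raising either threshold removes more frames, so it is worth checking how many frames survive before committing to a stricter value.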
Acceptable deviation between the stereo baseline distance recorded in the camera calibration and what is measured in the live footage. Increase if calibration was done at a slightly different rig configuration than the capture.
Acceptable scale mismatch between calibrated and measured stereo geometry. Increase if the rig was adjusted between calibration and capture.
Runs an NNE-based hole-filling pass on the depth sequence before the performance solve. Fills gaps in the depth map within the face region while leaving borders intact. Preserves single-channel GrayF EXR format.
Clips depth values below this threshold to zero before cleaning. Eliminates background or near-field noise that would otherwise be treated as valid face depth.
Clips depth values above this threshold to zero before cleaning. Eliminates far-field objects and reflections that appear in the depth map beyond the face region.
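The effect of the two clipping thresholds can be sketched in a few lines: any depth value below the lower threshold or above the upper threshold is set to zero, and everything in between passes through unchanged. This is an illustration of the described behavior, not the tool's implementation. The function name, the sample values, and the units are assumptions, as is the choice to keep values that sit exactly on a threshold.

```python
def clip_depth(values, min_depth, max_depth):
    """Zero out depth samples outside the [min_depth, max_depth] window."""
    # Assumption: values exactly equal to a threshold are kept.
    return [v if min_depth <= v <= max_depth else 0.0 for v in values]


if __name__ == "__main__":
    # Hypothetical depth samples along one row of a depth map (units assumed).
    row = [5.0, 20.0, 45.0, 90.0]
    print(clip_depth(row, min_depth=10.0, max_depth=60.0))
```

In this example the near-field sample (5.0) and the far-field sample (90.0) are zeroed, so only values in the expected face range remain for the cleaning pass.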
Audio Options
Audio jobs only
Realtime: faster solve, lower quality. Suitable for previews and iteration. Offline: slower, higher quality. Use for final output deliverables.
Sets the emotional baseline used to color the generated animation. Available presets: Neutral, Happy, Sad, Angry, Surprised, Disgusted, Fearful. The solver blends the detected phonemes against this mood baseline.
Auto-generates natural blink keyframes when the audio solve does not produce blink data. Enabled by default. Disable if you are driving blinks from a separate system or if the performance data already includes them.
Which channel from a multi-track audio asset to use for the lip-sync solve. Defaults to 0 (left/mono). For stereo files where dialogue is on the right channel, set to 1.
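To make the channel index concrete, the sketch below pulls one channel out of a buffer of interleaved samples, where index 0 is left (or mono) and index 1 is right. The function name is hypothetical, and the interleaved layout is an assumption made for illustration. How the audio asset actually stores its channels is not described above.

```python
def extract_channel(interleaved, num_channels, channel_index):
    """Return the samples of one channel from an interleaved sample list."""
    if num_channels < 1:
        raise ValueError("num_channels must be at least 1")
    if not 0 <= channel_index < num_channels:
        raise ValueError(
            f"channel_index {channel_index} is out of range for {num_channels} channel(s)"
        )
    # Interleaved layout (assumed): L0, R0, L1, R1, ... so take every Nth sample.
    return interleaved[channel_index::num_channels]


if __name__ == "__main__":
    # Hypothetical stereo buffer: silence on the left, dialogue on the right.
    stereo = [0.0, 0.3, 0.0, -0.2, 0.0, 0.5]
    print(extract_channel(stereo, num_channels=2, channel_index=1))
```

With the default index of 0, this example would return only silence, which is the situation the option is meant to address when dialogue sits on the right channel.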
Number of frames the solver reads ahead in the audio signal to generate anticipatory animation cues, for example jaw pre-opening before a vowel. Increase for more anticipatory motion; decrease for tighter sync.