A complete beginner's guide to controlling anything in Unity with your body, audio, or keyboard — no coding required after setup.
This system lets you control anything in a Unity scene using your body.
Stand in front of your webcam. Bend your arm. A character's mouth opens. Raise your hand. A light gets brighter. Dance around. Particles explode. The system tracks 33 points on your body (shoulders, elbows, wrists, hips, knees, etc.) and turns those movements into numbers that can drive any property on any Unity object.
It also works with audio (music drives visual effects), LFOs (automatic oscillating patterns), and keyboard/mouse input.
One small open-source library included (websocket-sharp, MIT license). No paid assets. No package manager setup needed.
Think of it like a postal system. Your body movements create "letters" (numbers like "right elbow bend = 90 degrees"). These are stored in named "mailboxes" (the Data Bus). Unity objects check their mailbox each frame and update themselves accordingly.
For example, pose/joint/rightElbow/bend is a channel that holds the current bend angle of your right elbow (0–180 degrees).
From the DataBus, data can optionally pass through Processors (to smooth, scale, or threshold values), then finally reaches the Driver, which applies it to properties on your scene objects.
No component knows about any other component. They all just read and write named channels on the DataBus.
This means you can add, remove, or swap any source, processor, or driver without changing the others.
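As a rough illustration of the named-channel idea, here is a minimal sketch in Python. It is not the real DataBus (which is a Unity component); the class, its method names, and the prefix lookup are assumptions made only to show how writers and readers stay independent of each other.

```python
class DataBus:
    """Toy stand-in for the DataBus: a dictionary of named float channels."""

    def __init__(self):
        self._channels = {}

    def write(self, key, value):
        # Any source or processor can write; it never learns who reads.
        self._channels[key] = float(value)

    def read(self, key, default=0.0):
        # Any driver or processor can read; it never learns who wrote.
        return self._channels.get(key, default)

    def keys(self, prefix=""):
        # Hypothetical helper: list channels that start with a prefix.
        return sorted(k for k in self._channels if k.startswith(prefix))


bus = DataBus()
bus.write("pose/joint/rightElbow/bend", 90)   # written by a "source"
bus.write("audio/bass", 0.4)                  # written by another "source"

elbow = bus.read("pose/joint/rightElbow/bend")  # read by a "driver"
joints = bus.keys("pose/joint")
```

Because both sides only agree on the channel name, replacing the webcam with a keyboard or an LFO only changes who calls `write`.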
All the scripts live in Assets/4E_System/. You don't need to import anything; just create GameObjects and add the scripts as components.
The [4E System] object is the "brain" of the system: it receives body data and converts it into channels.
1. Create an empty GameObject, name it [4E System], then press Enter. The brackets are just for visibility; they make it sort to the top of the list.
2. Click Add Component, type DataBus, and click on Data Bus when it appears in the list. The component is now added.
3. Add these components the same way:
   - MainThreadDispatcher
   - DataFlowMonitor
   - WebSocketServer
   - LandmarkConverter
   - PoseChannelSource
   - ChannelSliderPanel (F11 debug sliders)
4. On the PoseChannelSource component, you'll see a field called Converter that says "None (Landmark Converter)". You need to tell it where to find the LandmarkConverter.
5. Click and hold the [4E System] object in the Hierarchy (left panel), then drag it into the Converter field on the PoseChannelSource component (right panel). Unity will automatically find the LandmarkConverter component on that object.
These add audio, timing, and keyboard data to the system. You don't need them for basic body tracking, but they're useful for more complex effects.
1. Create an empty GameObject named [4E Sources].
2. Add whichever of these components you need:
   - TimeSource: gives you automatic oscillating values (LFOs). Ships with 3 pre-configured LFOs: slow (0.25 Hz), medium (1 Hz), fast (4 Hz).
   - InputSource: tracks keyboard keys (Space, Enter, Shift) and mouse position. You can add more keys in the Inspector.
   - AudioAnalysisSource: analyses audio for bass, mid, high frequencies and beat detection. Requires an AudioSource (see note below).
Note: for AudioAnalysisSource, click Add Component → Audio Source → add it, then drag that AudioSource into the Audio Source field.
The driver is what connects DataBus channels to your scene objects. It reads a "mapping" file that describes which channel drives which property.
1. Create an empty GameObject named [4E Driver].
2. Click Add Component → DriverMappingRunner → add it.
Checkpoint: your Hierarchy should now contain [4E System], [4E Sources], and [4E Driver]. No red errors in the Console.
MediaPipeSender.html is a web page that runs Google's MediaPipe AI directly in your browser. It accesses your webcam, detects 33 body landmarks in real-time, and sends the data to Unity over a WebSocket connection.
The file is located at: Assets/4E_System/Browser/MediaPipeSender.html
1. Go to Assets → 4E_System → Browser. You'll see MediaPipeSender.html.
2. Double-click MediaPipeSender.html, or right-click → "Open with" → Google Chrome. Do not open it from inside Unity; it needs to run in a real browser.
3. Press Play in Unity. The Console should show: [4E WebSocket] Server started — ws://localhost:8080
4. In the browser page, check that the server address is ws://localhost:8080. Click the Connect button.
5. The Unity Console should show: [4E WebSocket] Client connected.
6. You should now see live channels such as pose/landmark/nose/x, pose/joint/rightElbow/bend, etc. The numbers change in real-time as you move in front of the camera.
7. Bend your right arm and watch pose/joint/rightElbow/bend change from ~180 (straight arm) to ~50 (bent arm).
Tip: select the [4E System] object, find the DataFlowMonitor component, and type a prefix in Filter Prefix, for example pose/joint to only show joint angles.
You can also write a test value from a script: DataBus.Write("pose/joint/rightElbow/bend", 90f);
A Driver Mapping is a reusable asset file, like a recipe that says "take this channel and feed it to that object". You can create multiple mappings for different performances or scenes and swap between them.
1. In the Project window, go to Assets/ or create a new folder called Mappings.
2. Create a new Driver Mapping asset there, name it MyFirstMapping, and press Enter.
3. Select [4E Driver]. In the Inspector, find the DriverMappingRunner component. Drag your new MyFirstMapping asset from the Project window into the Mapping field.
Checkpoint: you should see a MyFirstMapping asset file in the Project window, and the [4E Driver]'s Mapping field showing "MyFirstMapping" instead of "None".
A binding is one self-contained connection: Source channel + input range → output range + target property.
Each binding has a built-in remap. A single mapping can have many bindings — one for each thing you want to control.
When your arm is at 0° (down), the cube is at Y=0. When your arm is at 90° (raised), the cube is at Y=5. The remap converts between the two ranges automatically.
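The remap in that example is a plain linear interpolation between two ranges. A minimal sketch in Python, assuming the binding clamps values that fall outside the input range (the source does not say whether it does), with a hypothetical function name:

```python
def remap(value, in_min, in_max, out_min, out_max, clamp=True):
    """Map value from [in_min, in_max] onto [out_min, out_max]."""
    if in_max == in_min:
        return out_min  # avoid dividing by zero on an empty input range
    t = (value - in_min) / (in_max - in_min)  # 0 at in_min, 1 at in_max
    if clamp:
        t = max(0.0, min(1.0, t))  # assumption: out-of-range input is clamped
    return out_min + t * (out_max - out_min)


# Arm raise 0-90 degrees drives cube height 0-5 metres.
halfway = remap(45, 0, 90, 0, 5)      # arm halfway up
overshoot = remap(120, 0, 90, 0, 5)   # arm raised past the input range
```

With these numbers, an arm at 45° puts the cube at Y=2.5, and anything past 90° holds it at Y=5.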
There are 100+ channels. Here's how to find the one you need:
In the DataFlowMonitor, set Filter Prefix to pose/joint to show only joint angles, or audio/ for audio channels.

| I want to react to... | Use this channel | Range |
|---|---|---|
| Bending an elbow | pose/joint/rightElbow/bend | 0–180° (180=straight) |
| Raising an arm | pose/joint/rightUpperArm/raise | 0–180° (0=down, 180=up) |
| Overall body movement | pose/body/velocity | 0–1 (0=still, 1=fast) |
| Leaning left/right | pose/body/centre/x | -1 to 1 |
| Crouching/standing | pose/body/centre/y | -1 to 1 |
| Bending a knee | pose/joint/rightKnee/bend | 0–180° |
| Head tilt | pose/joint/headTilt/bend | 0–180° |
| Music bass/kick | audio/bass | 0–1 |
| Music overall volume | audio/amplitude | 0–1 |
| Beat drop moment | audio/beat/float | 0 or 1 |
| Automatic pulsing | time/lfo/slow | 0–1 (4-second wave) |
| Spacebar held | input/key/space | 0 or 1 |
Let's start with the simplest possible binding.
- Channel Key: pose/joint/rightUpperArm/raise
- Input Min: 0 · Input Max: 90 ← arm raise range in degrees
- Output Min: 0 · Output Max: 5 ← cube Y position range in metres
- Target Type: Transform, Property: LocalPositionY.
- Press Play, open the debug sliders (F11), find the rightUpperArm/raise slider and drag it: the cube moves. This lets you fine-tune your input/output ranges before connecting the browser.
Make a light respond to how much you're moving.
- Channel Key: pose/body/velocity
- Input Min: 0 · Input Max: 1 ← velocity is already 0–1
- Output Min: 0.2 · Output Max: 8 ← dim when still, bright when moving
- Target Type: LightIntensity.
Open a character's mouth (or any morph target) by bending your arm. This requires a 3D model with blend shapes; many free characters on the Asset Store have them.
- Channel Key: pose/joint/rightElbow/bend
- Input Min: 60 · Input Max: 180 ← elbow range (60=tight bend, 180=straight)
- Output Min: 0 · Output Max: 100 ← BlendShapes use 0–100
- Target Type: BlendShape.
Make an object cycle through rainbow colours and emit particles in sync with a slow wave, with no body input needed.
- Binding 1: Channel Key time/lfo/slow, Target Type MaterialColor. Drag the object's Renderer into Target Renderer. Set Material Property to _BaseColor (the URP main colour property).
- Binding 2: Channel Key time/lfo/slow, Target Type ParticleEmission. Add a Particle System to the object first (Add Component → Particle System). Drag it into Target Particles. Set Max Emission Rate to 50.

| Target Type | What It Does | Required Fields |
|---|---|---|
| Transform | Moves, rotates, or scales an object | Target Transform + Property (e.g. LocalPositionY, UniformScale) |
| BlendShape | Deforms a 3D mesh (morph targets, face expressions) | Target Mesh (SkinnedMeshRenderer) + Blend Shape Index |
| AnimatorFloat | Sets a float parameter on an Animator | Target Animator + Param name |
| AnimatorBool | Sets a bool parameter (true when value > threshold) | Target Animator + Param name + Bool Threshold |
| AnimatorTrigger | Fires a trigger once when value crosses threshold | Target Animator + Param name + Bool Threshold |
| MaterialFloat | Sets a shader property (glow, dissolve, metallic) | Target Renderer + Material Property name (e.g. _Metallic) |
| MaterialColor | Sets a colour property — value 0–1 sweeps the rainbow | Target Renderer + Material Property name (e.g. _BaseColor) |
| LightIntensity | How bright a light is | Target Light |
| LightRange | How far a light reaches | Target Light |
| LightColor | Light colour — value 0–1 sweeps the rainbow | Target Light |
| ParticleEmission | How many particles per second | Target Particles + Max Emission Rate |
| AudioVolume | Volume of an AudioSource (0–1) | Target Audio |
| AudioPitch | Pitch/speed of an AudioSource | Target Audio |
| Reflection | ANY public float/bool/int on ANY component | Reflection Target + Component Type + Member Name |
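For the MaterialColor and LightColor rows, "value 0–1 sweeps the rainbow" can be pictured as a hue rotation. The sketch below is in Python and assumes full saturation and brightness; how the system actually builds the colour is not stated in this guide, and the function name is hypothetical.

```python
import colorsys


def rainbow(value):
    """Treat a 0-1 channel value as a hue and return an (r, g, b) tuple."""
    hue = max(0.0, min(1.0, value))  # keep the hue inside 0-1
    # Assumption: saturation and brightness are both fixed at 1.0.
    return colorsys.hsv_to_rgb(hue, 1.0, 1.0)


start = rainbow(0.0)    # red
middle = rainbow(0.5)   # cyan
```

Feeding time/lfo/slow into such a mapping is what produces the automatic rainbow cycle used in the examples.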
Raw data doesn't always match what you need. Your elbow gives 0–180°, but a BlendShape wants 0–100. A sensor is jittery, but you want smooth motion. A value changes gradually, but you need a clear on/off switch.
Processors read one channel, transform the value, and write the result to a new channel. You chain them together.
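The chaining idea can be sketched in Python as follows. This is only an illustration under stated assumptions: the bus is a plain dictionary, `run_processor` is a hypothetical helper, and mapped/elbowHalf is a made-up channel name used to show a second link in the chain.

```python
def run_processor(bus, in_key, out_key, fn):
    """Read one channel, transform the value, write it to a new channel."""
    bus[out_key] = fn(bus[in_key])


bus = {"pose/joint/rightElbow/bend": 120.0}

# Link 1: elbow 60-180 degrees becomes 0-100.
run_processor(bus, "pose/joint/rightElbow/bend", "mapped/elbowBlend",
              lambda v: (v - 60.0) / 120.0 * 100.0)

# Link 2: reads the output of link 1 (hypothetical channel name).
run_processor(bus, "mapped/elbowBlend", "mapped/elbowHalf",
              lambda v: v * 0.5)
```

The raw channel is never overwritten, so a binding can still use it directly while other bindings use the processed versions.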
Add processor components to the [4E System] object (or create a new empty called [4E Processors]; either works).
MathProcessor changes the number range or applies mathematical operations.
| Operation | What It Does | Example |
|---|---|---|
| Remap | Maps one range to another | Elbow 60°–180° → BlendShape 0–100 |
| Scale | Multiplies by a constant | Velocity × 5 for stronger effect |
| Add | Adds a constant | Position + 0.5 to offset centre |
| Clamp | Limits to a min/max range | Keep value between 0 and 1 |
| Invert | Flips 0→1 and 1→0 | Light dims when you move (instead of brightening) |
| Abs | Makes negative values positive | Track distance regardless of direction |
| Power | Raises to a power | Power 2 = more dramatic at extremes |
Now use mapped/elbowBlend as your binding's Channel Key instead of the raw elbow channel.
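The simpler operations in the table come down to one line of arithmetic each. A hedged Python sketch follows; the function names are illustrative, and Invert is assumed to mean 1 − value on a 0–1 input, which is how the table describes it.

```python
def scale(v, k):
    return v * k              # Scale: multiply by a constant


def add(v, k):
    return v + k              # Add: offset by a constant


def clamp(v, lo, hi):
    return max(lo, min(hi, v))  # Clamp: limit to a min/max range


def invert(v):
    return 1.0 - v            # Invert: assumes the input is already 0-1


def power(v, p):
    return v ** p             # Power: exponent 2 squashes small values


velocity = 0.25
boosted = clamp(scale(velocity, 5), 0.0, 1.0)  # velocity x 5, kept in 0-1
dimmed = invert(velocity)                      # light dims as you move
curved = power(velocity, 2)                    # gentle start, strong finish
```

Here a velocity of 0.25 becomes 1.0 after the boost and clamp, 0.75 when inverted, and 0.0625 when squared.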
The smoothing processor evens out rapid changes so values feel natural instead of twitchy.
| Mode | Feel | Best For |
|---|---|---|
| Lerp | Gentle lag — follows the target with a delay | Most situations. Start here. |
| SpringDamper | Bouncy — overshoots then settles | Physical-feeling objects (pendulums, springs) |
| EMA | Exponential Moving Average — very stable | Aggressive noise removal |
Factor controls how much smoothing: 0.1 = very responsive (almost no smoothing), 0.9 = very smooth (big lag).
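One common way to get this behaviour is an exponential moving average, where Factor is the share of the previous value that is kept each frame. The Python sketch below assumes that formula; the guide does not give the exact maths used by each mode, and frame-rate compensation is not modelled.

```python
def smooth_series(samples, factor, start=0.0):
    """Exponential moving average over a list of per-frame samples."""
    out = []
    value = start
    for sample in samples:
        # Keep `factor` of the old value, take the rest from the new sample.
        value = factor * value + (1.0 - factor) * sample
        out.append(value)
    return out


step = [1.0] * 5                       # input jumps from 0 to 1 and stays
responsive = smooth_series(step, 0.1)  # low factor: follows almost at once
laggy = smooth_series(step, 0.9)       # high factor: creeps toward the target
```

After a single frame the responsive version has already covered about 90% of the jump, while the laggy one has covered about 10%.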
ThresholdProcessor converts a continuous value into a true/false trigger. It has hysteresis to prevent flickering at the boundary.
| Mode | When It's TRUE | Use Case |
|---|---|---|
| GreaterThan | Value is above the threshold | "Arm is raised above shoulder" |
| LessThan | Value is below the threshold | "Elbow is bent past 90°" |
| InRange | Value is between min and max | "Hand is in the middle zone" |
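Hysteresis adds a deadband around the threshold. If threshold=90 and hysteresis=10 in LessThan mode, it turns ON once the value drops below 85 and OFF only once it rises back above 95, which prevents rapid on/off switching when hovering near 90. GreaterThan mode mirrors this: ON above 95, OFF below 85.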
Writes both outputKey (bool) and outputKey/float (0 or 1) plus outputKey/rising and outputKey/falling for edge detection.
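A small Python sketch of a hysteresis switch with edge detection, assuming the deadband is split evenly either side of the threshold (85 and 95 for threshold=90, hysteresis=10). The class name and the dictionary keys are illustrative; InRange mode is left out.

```python
class HysteresisThreshold:
    def __init__(self, threshold, hysteresis, mode="LessThan"):
        half = hysteresis / 2.0
        self.low = threshold - half    # lower edge of the deadband
        self.high = threshold + half   # upper edge of the deadband
        self.mode = mode
        self.state = False

    def update(self, value):
        previous = self.state
        if self.mode == "LessThan":
            if value < self.low:
                self.state = True      # ON only below the deadband
            elif value > self.high:
                self.state = False     # OFF only above the deadband
        elif self.mode == "GreaterThan":
            if value > self.high:
                self.state = True
            elif value < self.low:
                self.state = False
        else:
            raise ValueError("unsupported mode: " + self.mode)
        return {
            "bool": self.state,
            "float": 1.0 if self.state else 0.0,
            "rising": self.state and not previous,    # true on the ON frame
            "falling": previous and not self.state,   # true on the OFF frame
        }


elbow = HysteresisThreshold(90, 10)  # "elbow is bent past 90 degrees"
frames = [elbow.update(v) for v in [100, 92, 88, 84, 88, 92, 96]]
```

Inside the deadband (85 to 95) the switch simply keeps its previous state, so the samples at 88 and 92 do not cause any flicker.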
CombineProcessor takes two input channels and combines them into one output.
| Mode | Formula | Use Case |
|---|---|---|
| Multiply | A × B | Body velocity × audio beat = movement-reactive beat flash |
| Add | A + B | Combine two sources into one effect |
| Blend | Lerp(A, B, t) | Crossfade between two values |
| Max | Max(A, B) | Use whichever source is stronger |
| Min | Min(A, B) | Use whichever source is weaker |
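The five modes map directly onto basic arithmetic. A Python sketch, with the assumption that Blend takes its weight t as a separate 0–1 setting (the guide does not say where t comes from):

```python
def combine(a, b, mode, t=0.5):
    """Combine two channel values into one, following the table's formulas."""
    if mode == "Multiply":
        return a * b
    if mode == "Add":
        return a + b
    if mode == "Blend":
        return a + (b - a) * t   # Lerp(A, B, t): t=0 gives A, t=1 gives B
    if mode == "Max":
        return max(a, b)
    if mode == "Min":
        return min(a, b)
    raise ValueError("unknown mode: " + mode)


velocity, beat = 0.5, 1.0
flash = combine(velocity, beat, "Multiply")   # beat frame while moving
silent = combine(velocity, 0.0, "Multiply")   # no beat, so no flash
```

Multiply acts like a gate: the output is zero whenever either input is zero, which is why it suits "only flash on the beat if I am moving".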
MediaPipe detects 33 named points on your body every frame. Each point has an X, Y, Z position and a visibility score (0–1, how confident the AI is that it can see that point).
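All 33 Landmark Indices
Left side = your left · Right side = your right
The indices follow the standard MediaPipe Pose numbering:
0 Nose · 1 Left eye (inner) · 2 Left eye · 3 Left eye (outer) · 4 Right eye (inner) · 5 Right eye · 6 Right eye (outer) · 7 Left ear · 8 Right ear · 9 Mouth (left) · 10 Mouth (right)
11 Left shoulder · 12 Right shoulder · 13 Left elbow · 14 Right elbow · 15 Left wrist · 16 Right wrist
17 Left pinky · 18 Right pinky · 19 Left index · 20 Right index · 21 Left thumb · 22 Right thumb
23 Left hip · 24 Right hip · 25 Left knee · 26 Right knee · 27 Left ankle · 28 Right ankle · 29 Left heel · 30 Right heel · 31 Left foot index · 32 Right foot index

| Channel Key | Range | Description |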
|---|---|---|
| pose/landmark/{name}/x | -1 to 1 | Horizontal position (left to right) |
| pose/landmark/{name}/y | -1 to 1 | Vertical position (bottom to top) |
| pose/landmark/{name}/z | ~-0.5 to 0.5 | Depth (towards/away from camera) |
| pose/landmark/{name}/visibility | 0–1 | How confident the AI is it can see this point |
| pose/joint/{name}/bend | 0–180° | Angle at a joint. 180° = straight. 0° = fully folded. |
| pose/joint/{name}/bendNorm | 0–1 | Same angle, normalised to 0–1 range |
| pose/joint/{name}/raise | 0–180° | Angle of a limb relative to a world axis |
| pose/joint/{name}/raiseNorm | 0–1 | Same angle, normalised to 0–1 range |
| pose/body/centre/x, pose/body/centre/y, pose/body/centre/z | -1 to 1 | Midpoint between your two hips |
| pose/body/velocity | 0–1 | How much your whole body is moving. 0 = still, 1 = very active. |
Joint names available: leftElbow, rightElbow, leftWrist, rightWrist, leftKnee, rightKnee, leftHip, rightHip, headTilt, leftUpperArm, rightUpperArm, leftForeArm, rightForeArm, leftThigh, rightThigh, shoulderWidth, hipWidth
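A bend angle like the ones in the table can be computed from three landmark positions as the angle at the middle point. The Python sketch below shows that geometry; it is an assumption that the LandmarkConverter works this way, and the sample coordinates are made up.

```python
import math


def bend_angle(a, b, c):
    """Angle in degrees at point b, between segments b->a and b->c."""
    ba = [a[i] - b[i] for i in range(3)]
    bc = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(ba, bc))
    length = math.sqrt(sum(x * x for x in ba)) * math.sqrt(sum(x * x for x in bc))
    if length == 0.0:
        return 0.0  # two landmarks coincide, so the angle is undefined
    cosine = max(-1.0, min(1.0, dot / length))  # guard against rounding error
    return math.degrees(math.acos(cosine))


shoulder = (0.0, 1.0, 0.0)
elbow = (0.0, 0.5, 0.0)

straight = bend_angle(shoulder, elbow, (0.0, 0.0, 0.0))     # wrist below elbow
right_angle = bend_angle(shoulder, elbow, (0.5, 0.5, 0.0))  # forearm sideways
bend_norm = right_angle / 180.0                             # like bendNorm
```

A straight arm gives roughly 180°, a forearm held at a right angle gives roughly 90°, and dividing by 180 yields the normalised 0–1 version.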
Only available if you added AudioAnalysisSource with an AudioSource playing music.
| Channel Key | Range | Description |
|---|---|---|
| audio/amplitude | 0–1 | Overall audio volume level |
| audio/bass | 0–1 | Low frequency energy (kick drums, bass) |
| audio/mid | 0–1 | Mid frequency energy (vocals, guitars) |
| audio/high | 0–1 | High frequency energy (hi-hats, cymbals) |
| audio/beat | bool | True for one frame when a beat is detected |
| audio/beat/float | 0 or 1 | Same as beat but as a float (useful for bindings) |
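Beat detection of this kind is often a threshold on bass energy plus a cooldown. The Python sketch below uses the default values listed in the AudioAnalysisSource settings (threshold 0.15, cooldown 0.3 s); the class and the exact rule are assumptions, since the guide does not describe the detector's internals.

```python
class BeatDetector:
    def __init__(self, threshold=0.15, cooldown=0.3):
        self.threshold = threshold   # how loud the bass must be
        self.cooldown = cooldown     # minimum seconds between beats
        self.last_beat = None

    def update(self, bass, now):
        """Return True on the frame a beat is detected, else False."""
        ready = self.last_beat is None or (now - self.last_beat) >= self.cooldown
        if bass > self.threshold and ready:
            self.last_beat = now
            return True
        return False


detector = BeatDetector()
# (time in seconds, bass level) pairs standing in for successive frames.
frames = [(0.0, 0.1), (0.1, 0.5), (0.2, 0.5), (0.3, 0.5), (0.5, 0.5), (0.6, 0.05)]
beats = [detector.update(bass, t) for t, bass in frames]
```

The loud frames at 0.2 s and 0.3 s are ignored because they fall inside the cooldown after the beat at 0.1 s, which is what prevents double-triggers.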
Only available if you added TimeSource. Ships with 3 pre-made LFOs: slow, medium, fast.
| Channel Key | Range | Description |
|---|---|---|
| time/lfo/slow | 0–1 | Sine wave at 0.25 Hz (4-second cycle) |
| time/lfo/medium | 0–1 | Sine wave at 1 Hz (1-second cycle) |
| time/lfo/fast | 0–1 | Sine wave at 4 Hz (quarter-second cycle) |
| time/lfo/{name}/raw | -1 to 1 | Bipolar version of any LFO |
| time/clock/pulse | bool | True once per beat at set BPM (default 120) |
| time/elapsed | 0–∞ | Total seconds since scene started |
You can add custom LFOs in the Inspector: change the shape (Sine, Triangle, Saw, Square, Noise), frequency, and phase.
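The relationship between the bipolar /raw channel and the 0–1 channel can be sketched in Python as below. The waveform formulas are standard ones, but measuring phase in cycles (0–1) and the exact alignment of each shape are assumptions; Noise is left out.

```python
import math


def lfo_raw(shape, frequency, t, phase=0.0):
    """Bipolar LFO value in -1..1 at time t (seconds)."""
    cycle = (frequency * t + phase) % 1.0   # position within the current cycle
    if shape == "Sine":
        return math.sin(2.0 * math.pi * cycle)
    if shape == "Triangle":
        return 1.0 - 4.0 * abs(cycle - 0.5)
    if shape == "Saw":
        return 2.0 * cycle - 1.0
    if shape == "Square":
        return 1.0 if cycle < 0.5 else -1.0
    raise ValueError("unsupported shape: " + shape)


def lfo(shape, frequency, t, phase=0.0):
    """Unipolar 0..1 version, as written to time/lfo/{name}."""
    return (lfo_raw(shape, frequency, t, phase) + 1.0) / 2.0


peak = lfo("Sine", 0.25, 1.0)     # slow LFO, a quarter of the way through
trough = lfo("Sine", 0.25, 3.0)   # three quarters of the way through
```

At 0.25 Hz the sine reaches its peak after 1 second and its trough after 3 seconds, completing the 4-second cycle listed in the table.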
Only available if you added InputSource. Tracks keyboard and mouse by default.
| Channel Key | Range | Description |
|---|---|---|
| input/key/space | 0 or 1 | 1 while Space key is held down |
| input/key/space/down | 0 or 1 | 1 for one frame when Space is first pressed |
| input/key/space/up | 0 or 1 | 1 for one frame when Space is released |
| input/mouse/x | 0–1 | Mouse horizontal position (0=left edge, 1=right) |
| input/mouse/y | 0–1 | Mouse vertical position (0=bottom, 1=top) |
| input/mouse/dx, input/mouse/dy | float | Mouse movement delta per frame |
| input/mouse/left, input/mouse/right | 0 or 1 | Mouse buttons |
Default tracked keys: space, enter, shift. Add more in the Inspector by expanding the Keys list and clicking +. Uses Unity's new Input System package.

AudioAnalysisSource settings:

| Field | Default | What It Does |
|---|---|---|
| Audio Source | (none) | Drag in an AudioSource component that's playing music |
| FFT Size | 512 | How many frequency bins to analyse. Higher = more detail, more CPU. |
| Bass Range | 0–10 | Which FFT bins count as "bass" (low frequencies) |
| Mid Range | 10–80 | Which FFT bins count as "mid" frequencies |
| High Range | 80–255 | Which FFT bins count as "high" frequencies |
| Beat Threshold | 0.15 | How loud the bass must be to trigger a beat. Lower = more sensitive. |
| Beat Cooldown | 0.3s | Minimum time between beats (prevents double-triggers) |
| Smoothing | 0.3 | How smoothed the output is. 0 = raw, 0.9 = very smooth. |

TimeSource settings:

| Field | Default | What It Does |
|---|---|---|
| LFOs list | 3 entries | Each LFO has: name, shape (Sine/Triangle/Saw/Square/Noise), frequency (Hz), phase offset |
| Clock BPM | 120 | Beats per minute for the clock pulse channel |
| Enable Clock | true | Whether to generate clock pulses |

InputSource settings:

| Field | Default | What It Does |
|---|---|---|
| Keys list | Space, Enter, Shift | Each entry has: channel name + Key (new Input System). Click + to add more. |
| Track Mouse | true | Whether to write mouse position and button channels |
A character's mouth opens when you bend your arm, with extra punch on bass beats.
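1. MathProcessor: Input Key pose/joint/rightElbow/bend · Output Key: mapped/elbowBlend
2. CombineProcessor: Input Key A: mapped/elbowBlend · Input Key B: audio/beat/float · Output Key: final/elbowPunch
3. Binding: Channel Key final/elbowPunch
An object grows and shrinks rhythmically, stronger when you move faster.
1. CombineProcessor: Input Key A: pose/body/velocity · Input Key B: time/lfo/medium · Output Key: combined/velocityLFO · Mode: Multiply
2. MathProcessor: Input Key combined/velocityLFO · Output Key: final/scale
3. Bindings: Channel Key final/scale, targeting LocalScaleX, LocalScaleY, LocalScaleZ (one each)
Play a "Wave" animation when you raise your right arm above your shoulder.
1. ThresholdProcessor: Input Key pose/joint/rightUpperArm/raise · Output Key: trigger/armRaised
2. Binding: Channel Key trigger/armRaised/float, Target Type AnimatorTrigger, Param name Wave (must match the trigger name in your Animator Controller)
If the browser can't connect:
- Check the Unity Console for [4E WebSocket] Server started. If you don't see it, make sure WebSocketServer has Auto Start checked.
- Check that the port is listening: run netstat -an | findstr 8080; you should see a LISTENING entry.
- If you changed the server port or path, the address in the browser page must match it, for example ws://localhost:9090/mediapipe.
If scripts fail to compile:
- Keep the scripts in Assets/4E_System/; don't move them outside Assets.
- Scripts of your own that use the DataBus need using FourE.DataFlow; at the top.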
These all work with just a Cube and the [4E System] running. Add one binding per experiment.
| # | Channel | Target | Input | Output | What Happens |
|---|---|---|---|---|---|
| 1 | pose/joint/rightElbow/bend | Transform → LocalRotationY | 60 → 180 | 0 → 360 | Bend arm = cube spins |
| 2 | pose/body/centre/x | Transform → LocalPositionX | -1 → 1 | -3 → 3 | Lean left/right = cube slides |
| 3 | pose/body/velocity | Transform → UniformScale | 0 → 1 | 0.5 → 3 | Move fast = cube grows |
| 4 | pose/joint/rightUpperArm/raise | LightIntensity | 0 → 90 | 0 → 10 | Raise arm = light up |
| 5 | time/lfo/slow | MaterialColor (_BaseColor) | 0 → 1 | 0 → 1 | Auto rainbow cycle |
| 6 | audio/bass | Transform → LocalScaleY | 0 → 1 | 1 → 5 | Bass hits = cube stretches up |
Start with experiment 1 or 2 — they give the most obvious visual feedback. Once those work, try combining multiple bindings on the same object for richer effects.