In my previous Masterclass, we explored the Azure Kinect high-definition color capabilities.
Today, we are going to dive into the feature that makes Kinect special: its depth sensor.
The depth sensor is a piece of hardware that measures the distance between a physical point and the device. It is a Time-of-Flight infrared camera that works as follows:
- The sensor emits a laser beam.
- The beam is reflected off the physical objects and returns to the sensor.
- Based on the time it took the beam to travel back, the sensor estimates the distance of that point!
Repeat the above process for thousands of points, and you have what is called a “depth map.” Using that fundamental principle, Kinect is capable of sensing the world in 3D! Using the Azure Kinect SDK, we can visualize the depth data in Unity3D, as shown in the video below.
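If you are curious about the math behind that principle, here is a minimal sketch of how a round-trip time could be translated into a distance. The calculation happens inside the sensor itself; the helper below is purely illustrative and not part of the Azure Kinect SDK.

private const double SpeedOfLight = 299_792_458.0; // meters per second

private double RoundTripTimeToMillimeters(double roundTripSeconds)
{
    // The light travels to the object and back, so only half of the path counts.
    double distanceMeters = SpeedOfLight * roundTripSeconds / 2.0;

    return distanceMeters * 1000.0;
}

For example, a round trip of roughly 20 nanoseconds corresponds to a point about 3 meters away from the device.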
System Requirements
To execute the code and run the demos, your system should meet the following requirements.
Hardware
- 7th Gen Intel Core i5 CPU
- 4 GB Memory
- NVIDIA GeForce GTX 1070
- Dedicated USB3 port
Software
Azure Kinect depth data
Before writing any code, it’s important to understand how the depth data are actually structured. Kinect streams depth images at approximately 30 frames per second. A depth image is nothing but a set of distance values, measured in millimeters (to all my American friends, 1 millimeter is equal to about 0.04 inches).
Launching the Azure Kinect Viewer, you can see a live visualization of the depth data. In this tutorial, we are going to create a similar visualization for our programs.
Depth configurations
Just like the Kinect color camera, the depth sensor supports a variety of configurations, including frame rate, resolution, and field of view. The field of view specifies how much of the physical space the camera can see. You can select between a Narrow and a Wide field of view. The Narrow field of view covers a smaller portion of the world, between 0.5 and 5 meters away. The Wide field of view, on the other hand, captures a (literally) bigger picture, between 0.25 and 3 meters away.
Narrow FOV (left) vs Wide FOV (right).
You can check a list of the available configuration options below (source):
Mode | Resolution | FoV | FPS | Operating range | Exposure time
---|---|---|---|---|---
NFOV unbinned | 640×576 | 75°×65° | 0, 5, 15, 30 | 0.5 – 3.86 m | 12.8 ms
NFOV 2×2 binned | 320×288 | 75°×65° | 0, 5, 15, 30 | 0.5 – 5.46 m | 12.8 ms
WFOV 2×2 binned | 512×512 | 120°×120° | 0, 5, 15, 30 | 0.25 – 2.88 m | 12.8 ms
WFOV unbinned | 1024×1024 | 120°×120° | 0, 5, 15 | 0.25 – 2.21 m | 20.3 ms
Passive IR | 1024×1024 | N/A | 0, 5, 15, 30 | N/A | 1.6 ms
Why so many options?
You may wonder, “why the heck do we have so many options?” The reason is simple: to accommodate as many use-case scenarios as possible. With great power comes great responsibility. Ask yourself how your users are going to use your application and what your hardware limitations are. Then, try to balance the desired settings against the performance overhead. After all, it’s your job as a developer to select the configuration that fits your use-case scenario.
Depth Data in C#
As we’ve seen, depth images have a specific width and height. The frame data are encoded into a one-dimensional array of size:
Width x Height
Since the depth data are measured in millimeters, they are stored as C# 16-bit unsigned integer values (ushort).
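For example, the distance of the pixel at column x and row y can be read as shown below. The depthData and depthWidth variables are the ones we’ll retrieve from the sensor later in this tutorial; the coordinates here are hypothetical.

// Hypothetical pixel coordinates within a 640×576 NFOV unbinned frame.
int x = 320;
int y = 288;

// The one-dimensional array is laid out row by row.
ushort distanceInMillimeters = depthData[y * depthWidth + x];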
Configuring the Depth Settings
To specify the configuration of your choice, create a Kinect device object and feed it with the appropriate options, as below:
private KinectSensor sensor;

private void Start()
{
    sensor = KinectSensor.Create(new Configuration
    {
        FPS = FramesPerSecond.FPS_30,
        DepthMode = DepthMode.NFOV_Unbinned
    });

    sensor?.Open();
}
To stream using the wide field of view, change the depth mode (and the frame rate, since WFOV unbinned supports up to 15 FPS) as below:
private KinectSensor sensor;

private void Start()
{
    sensor = KinectSensor.Create(new Configuration
    {
        FPS = FramesPerSecond.FPS_15,
        DepthMode = DepthMode.WFOV_Unbinned
    });

    sensor?.Open();
}
Retrieving the Depth data
Accessing the depth data is a piece of cake. Head to the Unity Update() method and acquire the data of the depth source:
private void Update()
{
    if (sensor == null || !sensor.IsOpen) return;

    // Capture the latest frame from the device.
    Frame frame = sensor.Update();

    if (frame != null)
    {
        if (frame.DepthFrameSource != null)
        {
            int depthWidth = frame.DepthFrameSource.Width;
            int depthHeight = frame.DepthFrameSource.Height;
            ushort[] depthData = frame.DepthFrameSource.Data;
        }
    }
}
At this point, you may wonder how we can visualize a set of distance values. Fair question! To visualize the depth image, we need to convert the physical measurement units (millimeters) into digital pixel values (bytes).
Grayscale depth visualization
The most common way to visualize the depth stream is by mapping the distance values to light and dark shades of gray (white to black). The algorithm is straightforward:
- We know the minimum and maximum distance Kinect can see (0 to 10,000 millimeters).
- The RGB byte range is 0 to 255.
- As a result, we proportionately map the distance value to the byte range.
To put it simply, if a point is located, e.g., 5,000 millimeters in front of the camera, then we’ll assign it a value of 128. This method is not perfect, as it discards a lot of information. However, it’s useful enough for most use-case scenarios and debugging purposes.
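Here is a minimal sketch of that mapping. The helper method and the MaxDepth constant are my own, not part of the SDK:

private const ushort MaxDepth = 10_000; // maximum distance Kinect can see, in millimeters

private byte DepthToGray(ushort depthInMillimeters)
{
    // Clamp to the supported range, then map 0–10,000 mm proportionally onto 0–255.
    float clamped = depthInMillimeters > MaxDepth ? MaxDepth : depthInMillimeters;

    return (byte)(clamped / MaxDepth * 255f + 0.5f); // e.g. 5,000 mm → 128
}

You could then copy the resulting byte into the red, green, and blue channels of a Texture2D pixel to display the grayscale frame.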
Colorful (Jet) depth visualization
Another popular (and, to be honest, much more exciting) visualization uses rainbow-like shades. In that visualization type, we draw close distances in shades of blue, mid-range distances in shades of green, and long distances in shades of red.
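The exact color ramp is a matter of taste. Below is a minimal sketch of a simplified blue-green-red gradient, reusing the MaxDepth constant from the grayscale sketch above; again, this helper is my own and not part of the SDK.

private Color32 DepthToJet(ushort depthInMillimeters)
{
    // Normalize the distance to the 0–1 range.
    float t = Mathf.Clamp01(depthInMillimeters / (float)MaxDepth);

    // Close points lean towards blue, mid-range towards green, far points towards red.
    float r = Mathf.Clamp01((t - 0.5f) * 2f);
    float g = 1f - Mathf.Abs(t - 0.5f) * 2f;
    float b = Mathf.Clamp01((0.5f - t) * 2f);

    return new Color32((byte)(r * 255f), (byte)(g * 255f), (byte)(b * 255f), 255);
}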
As you can see in the images below, you can switch the visualization type directly in the Unity Editor.
Azure Kinect Narrow Field Of View (NFOV) – Left: Grayscale, Right: Jet visualization.
Azure Kinect Wide Field Of View (WFOV) – Left: Grayscale, Right: Jet visualization.
Summary
In this tutorial, we explored the Azure Kinect depth stream. We elaborated on the depth configuration options and saw two ways to visualize the raw depth data in Unity3D.
Source code
You made it to this point? Awesome! Here is the source code for your convenience.
Before you go…
After leaving my job at Microsoft, I have been helping Fortune-500 companies and innovative startups create amazing body-tracking applications and games. If you are looking to get your business to the next level, get in touch with me.
Sharing is caring!
If you liked this article, remember to share it on social media, so you can help other developers, too! Also, let me know your thoughts in the comments below. ‘Til the next time… keep Kinecting!