In my previous Masterclass, we explored the Azure Kinect high-definition color capabilities.

Today, we are going to dive into the feature that makes Kinect special: its depth sensor.

The depth sensor is a piece of hardware that can measure the distance between a physical point and the device. The depth sensor is a Time of Flight infrared camera that works as follows:

  • The sensor emits a laser beam.
  • The beam is reflected in the physical objects and returns to the sensor.
  • Based on the time it took for the beam to get back, the sensor estimates the distance of that point!

Repeat the above process for thousands of points, and you have what is called a “depth map.” Using that fundamental principle, Kinect is capable of sensing the world in 3D! Using the Azure Kinect SDK, we can visualize the depth data in Unity3D, as shown in the video below.

System Requirements

To execute the code and run the demos, your system should meet the following requirements.

Hardware

  • 7th Gen Intel Core i5 CPU
  • 4 GB Memory
  • NVIDIA GeForce GTX 1070
  • Dedicated USB3 port

Did you know?…

After leaving my job at Microsoft, I have been helping Fortune-500 companies and innovative startups create amazing body-tracking applications and games. If you are looking to get your business to the next level, get in touch with me.

Azure Kinect depth data

Before writing any code, it’s important to understand how the depth data are actually structured. Kinect is streaming depth images at approximately 30 frames per second. A depth image is nothing but a set of distance values, measured in millimeters (to all my American friends, 1 millimeter is equal to 0.04 inches).

Launching the Azure Kinect Viewer, you can see a live visualization of the depth data. In this tutorial, we are going to create a similar visualization for our programs.

Azure Kinect Viewer

Depth configurations

Just like the Kinect color camera, the depth sensor also supports a variety of configurations, including frame rate, resolution, and field of view. The field of view specifies how much of the physical space the camera can see. You can select between a Narrow and a Wide field of view. The Narrow field of view can see a smaller portion of the world between 0.5 and 5 meters. On the other hand, the wide field of view can see a (literally) bigger picture between 0.25 and 3 meters.

Azure Kinect NFOV – WFOV

Narrow FOV (left) vs Wide FOV (right).

You can check a list of the available configuration options below (source):

ModeResolutionFoIFPSOperating rangeExposure time
NFOV unbinned640×57675°x65°0, 5, 15, 300.5 – 3.86 m12.8 ms
NFOV 2×2 binned320×28875°x65°0, 5, 15, 300.5 – 5.46 m12.8 ms
WFOV 2×2 binned512×512120°x120°0, 5, 15, 300.25 – 2.88 m12.8 ms
WFOV unbinned1024×1024120°x120°0, 5, 150.25 – 2.21 m20.3 ms
Passive IR1024×1024N/A0, 5, 15, 30N/A1.6 ms

Why so many options?

You may wonder “why the heck do we have so many options?”. The reason is simple: to accommodate as many use-case scenarios as possible. With great power comes great responsibility. Ask yourself how your users are going to use your application and what are your hardware limitations. Then, try to balance between the desired settings and the performance overhead. After all, it’s your job as a developer to select the configuration that fits your use-case scenario accordingly.

Depth Data in C#

As we’ve seen, depth images have a specific width and height. The frame data are encoded into a one-dimensional array of size:

Width x Height

Since the depth data are measured in millimeters, they are stored as C# unsigned integer values (ushort).

Configuring the Depth Settings

To specify the configuration of your choice, create a Kinect device object and feed it with the appropriate options, as below:

private KinectSensor sensor;

private void Start()
{
    sensor = KinectSensor.Create(new Configuration
    {
        FPS = FramesPerSecond.FPS_30,
        DepthMode = DepthMode.NFOV_Unbinned
    });
    sensor?.Open();
}

To stream using the wide field of view, simply change the depth mode to:

private KinectSensor sensor;

private void Start()
{
    sensor = KinectSensor.Create(new Configuration
    {
        FPS = FramesPerSecond.FPS_15,
        DepthMode = DepthMode.WFOV_Unbinned
    });
    sensor?.Open();
}

Retrieving the Depth data

Accessing the depth data is a piece of cake. Head to the Unity Update() method and acquire the data of the depth source:

private void Update()
{
    if (sensor == null || !sensor.IsOpen) return;

    Frame frame = sensor.Update();

    if (frame != null)
    {
        if (frame.DepthFrameSource != null)
        {
            int depthWidth = frame.DepthFrameSource.Width;
            int depthHeight = frame.DepthFrameSource.Height;
            ushort[] depthData = frame.DepthFrameSource.Data;
        }
    }
}

Up to this point, you may wonder how we can visualize a set of distance values? Fair question. To visualize the depth image, we need to convert the physical measurement units (millimeters) into digital pixels (bytes).

Grayscale depth visualization

The most common method visualization of the depth stream is by using mapping the distance values to light and dark shades of gray (white to black). The algorithm is straightforward:

  • We know the minimum, and maximum distance Kinect can see (0 to 10,000 millimeters).
  • The RGB byte range is 0 to 255.
  • As a result, we proportionately map the distance value to the byte range.

To put it simply, if a point is located at e.g., 5,000 millimeters in front of the camera, then we’ll assign it a value of 128. This method is not perfect, as it’s discarding a lot of information. However, it’s useful enough for most use-case scenarios and debugging purposes.

Colorful (Jet) depth visualization

Another popular (and, to be honest, much more exciting) visualization is by using rainbow-like shades. In that visualization type, we draw close distances in shades of blue, mid-range distances in shades of green, and long distances in shades of red.

As you can see in the image below, you can switch the visualization type directly into the Unity Editor.

Azure Kinect Depth NFOV

Azure Kinect Narrow Field Of View (NFOV) – Left: Grayscale, Right: Jet visualization.

Azure Kinect Depth WFOV

Azure Kinect Wide Field Of View (WFOV) – Left: Grayscale, Right: Jet visualization.

As you can see, you can select the visualization mode directly in the Unity Editor.

Azure Kinect depth data visualization settings

Summary

In this tutorial, we explored the Azure Kinect depth stream. We elaborated on the depth configuration options and seen two ways to visualize the raw depth data in Unity3D.

Source code

You made it to this point? Awesome! Here is the source code for your convenience.

Download source code

Before you go…

After leaving my job at Microsoft, I have been helping Fortune-500 companies and innovative startups create amazing body-tracking applications and games. If you are looking to get your business to the next level, get in touch with me.

Sharing is caring!

If you liked this article, remember to share it on social media, so you can help other developers, too! Also, let me know your thoughts in the comments below. ‘Til the next time… keep Kinecting!

Vangos Pterneas

Author Vangos Pterneas

Vangos Pterneas is a professional software engineer and an award-winning Microsoft Most Valuable Professional (2014-2019). Since 2012, Vangos has been helping Fortune-500 companies and ambitious startups create demanding motion-tracking applications. He’s obsessive about analyzing and modeling every aspect of the human motion using Computer Vision and Mathematics. Kinect programming started as a hobby and quickly evolved into a full-time business (lightbuzz.com). Vangos is sharing his passion by regularly publishing articles and open-source projects that help fellow developers understand the fascinating Kinect technology.

More posts by Vangos Pterneas

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.