Understanding Kinect Coordinate Mapping

This is another post based on feedback from my blog subscribers. It seems that a lot of people creating Kinect projects share a common problem: how to properly project data on top of the color and depth streams.

As you probably know, Kinect integrates a few sensors into a single device:

  • An RGB color camera – 640×480 in version 1, 1920×1080 in version 2
  • A depth sensor – 320×240 in v1, 512×424 in v2
  • An infrared sensor – 512×424 in v2

These sensors have different resolutions and are not perfectly aligned, so their fields of view differ. The RGB camera, for example, covers a wider area than the depth and infrared cameras. Moreover, elements visible to one camera may not be visible to the others. Here's how the same scene is viewed by the different sensors:

[youtube code="GZ3eYyBPv44"]

An example

Suppose we want to project the human body joints on top of the color image. Body tracking is performed using the depth sensor, so the (X, Y, Z) coordinates of the body joints are aligned with the depth frame only. If you try to project the same joint coordinates on top of the color frame, you'll find that the skeleton is totally out of place:

Kinect Coordinate Mapping wrong

CoordinateMapper

Of course, Microsoft is aware of this, so the SDK comes with a handy utility named CoordinateMapper. CoordinateMapper's job is to convert a point from the 3D space to its corresponding point in the color or depth 2D space – and vice-versa. CoordinateMapper is a property of the KinectSensor class, so it is tied to each Kinect sensor instance.

You can download a test project from GitHub and check how CoordinateMapper is used. To understand it more thoroughly, continue reading this tutorial.

Using CoordinateMapper

Let’s get back to our example. Here is the C# code that accesses the coordinates of the human joints:


foreach (Joint joint in body.Joints)
{
    // 3D coordinates in meters
    CameraSpacePoint cameraPoint = joint.Position;
    float x = cameraPoint.X;
    float y = cameraPoint.Y;
    float z = cameraPoint.Z;
}

Note: Please refer to my previous article (Kinect version 2: Overview) about finding the body joints.

The coordinates are 3D points, packed into a CameraSpacePoint struct. Each CameraSpacePoint has X, Y and Z values. These values are measured in meters.

The dimensions of the visual elements are measured in pixels, so we need to somehow convert the real-world 3D values into 2D screen pixels. The Kinect SDK provides two additional structs for 2D points: ColorSpacePoint and DepthSpacePoint.

Using CoordinateMapper, it is super-easy to convert a CameraSpacePoint into either a ColorSpacePoint or a DepthSpacePoint:


ColorSpacePoint colorPoint = _sensor.CoordinateMapper.MapCameraPointToColorSpace(cameraPoint);
DepthSpacePoint depthPoint = _sensor.CoordinateMapper.MapCameraPointToDepthSpace(cameraPoint);

This way, a 3D point has been mapped into a 2D point, so we can project it on top of the color (1920×1080) and depth (512×424) bitmaps.
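Keep in mind that not every 3D point can be mapped: when a CameraSpacePoint falls outside a sensor's field of view, the mapped 2D coordinates come back as infinity. A minimal validity check before using the result:

```csharp
ColorSpacePoint colorPoint = _sensor.CoordinateMapper.MapCameraPointToColorSpace(cameraPoint);

// Points that cannot be mapped come back with infinite coordinates.
bool isValid = !float.IsInfinity(colorPoint.X) && !float.IsInfinity(colorPoint.Y);

if (isValid)
{
    // colorPoint.X and colorPoint.Y are now usable 1920x1080 pixel coordinates.
}
```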

How about drawing the joints?

You can draw the joints using a Canvas element, a DrawingImage object or whatever you prefer.

This is how you can draw the joints on a Canvas:


public void DrawPoint(ColorSpacePoint point)
{
    // Create an ellipse.
    Ellipse ellipse = new Ellipse
    {
        Width = 20,
        Height = 20,
        Fill = Brushes.Red
    };

    // Position the ellipse according to the point's coordinates.
    Canvas.SetLeft(ellipse, point.X - ellipse.Width / 2);
    Canvas.SetTop(ellipse, point.Y - ellipse.Height / 2);

    // Add the ellipse to the canvas.
    canvas.Children.Add(ellipse);
}

Similarly, you can draw a DepthSpacePoint on top of the depth frame. You can also draw the bones as lines between two points. This is the result of a perfect coordinate mapping on top of the color image:

Kinect Coordinate Mapping right
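For completeness, a bone can be drawn as a Line connecting two mapped points. Here is a minimal sketch (the DrawBone method name and the canvas field are assumptions for illustration):

```csharp
public void DrawBone(ColorSpacePoint first, ColorSpacePoint second)
{
    // A bone is a line between two mapped joint positions.
    Line bone = new Line
    {
        X1 = first.X,
        Y1 = first.Y,
        X2 = second.X,
        Y2 = second.Y,
        Stroke = Brushes.Red,
        StrokeThickness = 5
    };

    canvas.Children.Add(bone);
}
```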

Note: Please refer to my previous article (Kinect v2 color, depth and infrared streams) to learn how you can create the camera bitmaps.

Download the source code from GitHub and enjoy yourself.

In this tutorial, I used Kinect for Windows version 2 code; however, everything applies to the older sensor and SDK 1.8 as well. Here are the corresponding class and struct names you should be aware of. As you can see, there are some minor differences in the naming conventions, but the core functionality is the same.

Version 1         Version 2
SkeletonPoint     CameraSpacePoint
ColorImagePoint   ColorSpacePoint
DepthImagePoint   DepthSpacePoint
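For version 1, the corresponding mapping calls also require the target image format. A sketch, assuming a version 1 `joint`:

```csharp
// Kinect SDK 1.8 equivalents of the version 2 mapping calls.
SkeletonPoint skeletonPoint = joint.Position;

ColorImagePoint colorPoint = _sensor.CoordinateMapper.MapSkeletonPointToColorPoint(
    skeletonPoint, ColorImageFormat.RgbResolution640x480Fps30);

DepthImagePoint depthPoint = _sensor.CoordinateMapper.MapSkeletonPointToDepthPoint(
    skeletonPoint, DepthImageFormat.Resolution320x240Fps30);
```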

PS: Vitruvius

This project is part of the Premium version of Vitruvius (original cost: $299). Vitruvius is a set of powerful Kinect extensions that will help you build stunning Kinect apps in minutes. Vitruvius includes avateering, HD Face, background removal, angle calculations, and more. Check it now.

Want to hire my team for your next cutting-edge app? Drop me an email.

Author Vangos Pterneas

Vangos Pterneas is an award-winning Microsoft Most Valuable Professional. He is helping companies from all over the world grow their revenue by creating profitable software products. He loves Motion Technology and Mixed Reality. Vangos is the CEO of LightBuzz Inc and author of The Dark Art Of Freelancing.

More posts by Vangos Pterneas

Join the discussion 289 Comments

  • Hi,

    congratulations!

    I just started to experiment with visualizing the skeletons.
    Positions are easy, but I don't know how to work with the orientations. Can you help me?

  • Chris says:

    When will the new book be available?

  • Eloisa says:

    It's like you read my mind! You appear to know a lot about this, as if you wrote the book on it or something.
    I think you could add some pics to drive the message home a bit, but other than that, this is a great blog.
    A fantastic read. I will certainly be back.

  • Afif Hendrawan says:

    wow, SDK v2 is out? It looks a little bit easier than 1.8.
    Can you tell me how to do avateering using 1.8?
    I'm still confused by the example from the 1.8 SDK, ty ^^

    • Hi Affif. SDK v2 will be publicly available on July 15.

      Avateering is currently available for XNA only, though I am planning to add similar 3D modelling functionality in my library, Vitruvius: https://github.com/lightbuzz/vitruvius. Such additions are expected during this summer.

      WPF does not directly support advanced 3D modelling out-of-the-box, so the avateering engine will run XNA, DirectX or SharpDX under the hood.

  • NIlesh says:

    In this example, final background removed image would be of lower resolution as we are mapping color into depth frame. Is it possible to have reverse ? Meaning I want to keep final image of the size 1920×1080 with background removed ?

    • Hi Nilesh. There is no out-of-the-box way to do so. The depth frame has a resolution of only 512×424. As a result, not every point is mapped to a color pixel.

      • Nilesh says:

        Yes, that is what I thought. But if we just use the depth frame for the background removal feature, then Kinect V1 would produce a higher resolution image. I was thinking we could mark the color frame with all mapped points, expose their RGB values, and somehow apply some algorithm to expose all neighboring pixels so that we get a full 1920×1080 background-removed image. Can you suggest some algorithms for that?

        • NIlesh says:

          What I meant is that in this respect Kinect V1 is better because it produces a 640×480 image. But in Kinect V2 we can produce only 512×424.

          • Hi Nilesh. I think that the image Kinect v1 produced was 320×240 pixels, since the depth resolution was half the color resolution. Let me know if I didn’t understand correctly.

            A quick algorithm for increasing the resolution would be to take the 4 nearest color pixels of the mapped one. That would give you an image of higher resolution, but it would also be a little inaccurate in the edges. If that is not an issue, you can scan the array and insert the required pixels after applying coordinate mapping.

  • Ray Price says:

    Hi,

    This is great stuff. Do you have any experience with the JointOrientations in V2? I’ve been using the Unity plug-in to try and do a Unity avateering demo, but am getting some VERY strange results. I have been trying to map the bone.transform.localRotation = Quaternion.Inverse(kinectParentOrientation) * kinectChildOrientation, but it’s just not right.

    Regards
    Ray

  • Your style is really unique in comparison to other folks I have read stuff from.

    Thank you for posting when you have the opportunity, Guess I will just bookmark this web site.

  • Amin says:

    Hello,

    I took an easier approach by using the RGB image as the background image for the skeleton viewer. It works perfectly.

    using (DrawingContext dc = this.drawingGroup.Open())
    {
        // Draw the RGB image as the background, scaled to the render size.
        ImageBrush ib = new ImageBrush();
        ib.ImageSource = RGBImage;

        dc.DrawRectangle(ib, null, new Rect(0.0, 0.0, RenderWidth, RenderHeight));
    }

    Best regards,
    Amin Amini
    Department of Electronic and Computer Engineering (ECE)
    School Engineering & Design
    Brunel University

  • Kris says:

    Hello,
    I was wondering if there was a way to increase the brightness, hue, or saturation on version 2 of the windows sdk for Kinect. I understand the function call and the classes needed, but I can’t directly add it or change the sdk. ColorStream.CameraSettings.ExposureTime = …

  • Jony Yang says:

    GREAT GREAT resources! I wonder if your new book is about SDK v2 public preview? Will you revise the content once the formal version released? Do I have to spend money on new revision again if I already purchased your e-copy book right after you publish it?

  • Alex says:

    Hey Vangos,

    nice post, love your site. In your example you used 1920*1080 as the size of the color image – the same resolution as the color camera, so the mapping is nice and clean. I have a project where I have a much smaller window for the color stream and the drawing of the body joints in the top corner of the main window, and therefore the mapping doesn't work, because it uses 1920*1080 as its base.

    Do you know a workaround for this problem?

    • Alex says:

      Nvm, I fixed it by dividing 1920 / 6 and 1080 / 6. Then I divided the Point which is used for drawing after coordinate mapping by 6 too.

    • Hi Alex. The best option is to use coordinate mapping. Instead of mapping a joint to the depth space, map it to the color space or vice-versa. You won't have any problems, as long as the mapping occurs on the correct stream.
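      A general version of the scaling described above (a sketch; displayWidth and displayHeight are assumed to be the dimensions of your smaller window):

      ```csharp
      // Map to the full 1920x1080 color space first, then scale down.
      ColorSpacePoint colorPoint = _sensor.CoordinateMapper.MapCameraPointToColorSpace(joint.Position);

      double scaledX = colorPoint.X * displayWidth / 1920.0;
      double scaledY = colorPoint.Y * displayHeight / 1080.0;
      ```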

  • Kal says:

    Has anybody worked out what is the exact 4×4 matrix that CoordinateMapper uses to convert from, say, CameraSpacePosition to DepthSpacePosition?

  • Frank Liu says:

    This is a very good example for a beginner just like me.
    In the program for tracking hands you used a function named ScaleTo(). Could you tell me the difference between ScaleTo() and MapCameraPointToColorSpace()?

    • Hi Frank. ScaleTo simply tries to fit the real-world coordinates into a specified rectangle (e.g. your canvas). It's a quick-and-dirty method. On the other hand, CoordinateMapper is more powerful, since it knows exactly how to convert 3D space points into depth or color pixels.

      I'm a strong advocate of CoordinateMapper 🙂

  • tcboy says:

    can you make an example of mapping whole color frame to depth frame?
    I saw your post in stackoverflow but it is not detailed enough.
    I try it and the performance is very slow, about 1 frame per second.
    So I wonder if you can make a full example of whole color frame mapping to depth frame.
    Thanks!

  • Brian says:

    Vangos, thank you for a great example.

    I was wondering if you know of any way to obtain the camera intrinsic parameters for Kinect 2 using the SDK? Or do I need to do it on my own using OpenCV? The SDK defines a CameraIntrinsics structure but I don’t see it exposed anywhere in the API.

  • Jim says:

    Hi Vangos,
    Your blog is really awesome!
    It's really helpful!
    Currently, I use Kinect to find 2 fiducial points and generate a 3-D box from these points.
    So I am wondering how I can get the depth from the camera to these 2 points?
    And how can we map from a ColorSpacePoint to a CameraSpacePoint?
    Thanks a lot!

  • Hey Jim. Thanks a lot for your comment!

    CoordinateMapper also includes the method MapColorFrameToCameraSpace, which maps a whole RGB frame to the 3D space. However, you need to know the depth data, too, to use that method. If you know the depth data, the mapping is straightforward. If you only know the RGB pixels, you won't be able to convert a 2D point into a 3D point. On the other hand, if you know the depth points, you can easily map them to the RGB frame.
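    For reference, the full-frame mapping looks roughly like this (a sketch; depthData has to be populated from a DepthFrame via CopyFrameDataToArray first):

    ```csharp
    // One CameraSpacePoint per color pixel (1920 x 1080).
    ushort[] depthData = new ushort[512 * 424];
    CameraSpacePoint[] cameraPoints = new CameraSpacePoint[1920 * 1080];

    // depthFrame.CopyFrameDataToArray(depthData); // fill the depth buffer first

    _sensor.CoordinateMapper.MapColorFrameToCameraSpace(depthData, cameraPoints);
    ```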

  • Muhammad Roshaan says:

    Hey man, I just got your code, and as I compiled it in Visual Studio it gave me two errors which I'm unable to solve. The errors are given below:
    1 – ‘Microsoft.Kinect.KinectSensor’ does not contain a definition
    2 – ‘Microsoft.Kinect.Body’ does not contain a definition for ‘Dispose’ and no extension method ‘Dispose’ accepting a first argument of type ‘Microsoft.Kinect.Body’ could be found (are you missing a using directive or an assembly reference?)

    please guide me any help would be appreciated. Thanks

  • Lincoln says:

    Hi Vangos. I saved a couple of color frames and depth frames into pictures using Kinect. Then I would like to map the color frames to depth points without using the Kinect device, but the method MapColorFrameToDepthSpace does not seem to work in this situation. Can I get the coordinates without using a Kinect device?
    Wish you could help me. Thanks!

  • Hey Lincoln. You cannot get any coordinates unless you plug in a Kinect device. You need a Kinect device connected, even if you are playing Kinect Studio .xef files.

  • Vahid says:

    Hi Vangos. I want to use Kinect v2 to put a surface on a ROI of the depth image using a point cloud. I think for this I need camera parameters such as focal distance, projection matrix, and so on.

    Do you know how I can have these parameters for Kinect v2?

    Many thanks
    Vahid

    • Hi Vahid. I'm not sure what you are trying to do. What is the use-case scenario? The API can give you depth/infrared distances between objects and the camera, as well as information about the floor plane.

      Best,
      Vangos

  • Eisha says:

    Hi,
    Very nicely explained … I am working with the V1.8 SDK and I want to ask: if I go forward, a box should move upward, and if I go backward, the box should move downward. For that I need to map the Z axis, but I don't know how to do it.
    Thanks

    • Hi Eisha. The Z axis values are measured in meters. Since the Z axis represents the depth information in the 3D space, it cannot be mapped on the 2D screen space.

      You have to decide how a change in meters in the 3D space can be translated into a 2D pixel movement. So, let’s say that you have a rectangle on a canvas. You need to move upwards/downwards according to the user’s position. Here’s what you need to do:

      1) Find the distance between the user and the sensor. The distance is the Z value of the Position of the SpineBase joint.
      2) Let’s suppose that the initial distance is 1.0 meter (you’ll need to specify a starting distance – e.g. when clicking a button).
      3) Keep calculating the current distance for every body frame.
      4) For each frame, compare the current distance to the initial distance.
      5) Subtract the initial distance from the current distance and find the change (e.g. 2.4 – 1.0 meters)
      6) Map 0.1 meters to e.g. 10 pixels. So, if the change is 1.4 meters, move the rectangle 140 pixels downwards. If the change is -0.8, move the rectangle 80 pixels upwards.
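      The steps above can be sketched as follows (assuming initialDistance was captured when the button was clicked, and startTop is the rectangle's starting position):

      ```csharp
      // 1) + 3) Current distance between the user and the sensor, in meters.
      float currentDistance = body.Joints[JointType.SpineBase].Position.Z;

      // 4) + 5) Change relative to the initial distance.
      float change = currentDistance - initialDistance;

      // 6) Map 0.1 meters to 10 pixels (i.e. 100 pixels per meter).
      // Positive change moves the rectangle downwards, negative upwards.
      Canvas.SetTop(rectangle, startTop + change * 100.0);
      ```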

      Hope this is clear enough to get you started 🙂

      Apologies for my late replies – I have been sick for the past few days…

  • Muhammad Usman says:

    Hi Vangos,
    I find the circles going off the screen at smaller resolutions. Can you please guide me on how to restrict the circles from moving out of the panel, so it works for a smaller, 720p screen?
    Thanks in advance 🙂 I am new to Kinect development

    • Hi Muhammad,

      That’s mostly a XAML question. You’ll just need to experiment with the size of the circles (width and height). You could change their dimensions proportionally according to the screen dimensions.

      • Muhammad Usman says:

        Hi Vangos,
        let me clarify my question again: I have added a canvas and an image to a blank app in XAML, both with width = 900 and height = 700.
        on C# code i wrote this

        bodyCanvas.Children.Add(headcircle);
        Canvas.SetLeft(headcircle, (colorPoint.X * (image.Width / 1920)) - 25); // in case I changed the width of the canvas
        Canvas.SetTop(headcircle, (colorPoint.Y * (image.Height / 1080)) - 25);

        this displays the X axis correctly, but for the Y axis it shows an inaccurate position of the ellipse for screens lower than 1080p, or with less resolution than 1920*1080.

        can you explain why i am having issue on Y axis ?

  • Harold says:

    Hi Vangos, could you somehow work harder to get your book published? lol

  • Harsh Sarin says:

    Hi Vangos,
    Thanks for the wonderful tutorial. I am working with SDK 2.0 and want to use the head joint to get more information about a person’s face. The idea is to use the head joint positions as a reference, and retrieve all color pixels within a 24×24 window around that position. This will give me a person’s face(at least most of it). I could have used the FaceFrameSource, but its not robust towards occlusions and works well for frontal faces only. Hence, I want to try this method.
    Can you suggest how I should go about doing that?

    Thanks!!!

    • Hi Harsh. Face tracking works better for frontal views. You can rotate your head 45 degrees, too. However, it will not work if you look sideways. What would you like to achieve?

      • Harsh Sarin says:

        Among all the people present in the sensor’s FOV, I want to single out a specific person using face recognition. In cases, when the person is not facing the sensor, I want to use the clothes he/she is wearing as a secondary reference point. The idea is to use the region around the face joint, to retrieve a person’s face, and the region between the shoulder and hip joints, to extract a person’s attire. Then I will use Eigen Faces and histogram correlation to do face recognition and attire matching, respectively.

        • Hi Harsh. You can use Face HD to get the exact points of the face. The Head joint represents the center of the head. Using Face HD, you can get 3000 facial points, so you have a full view of the human face. Face HD works similarly to Face Basics. I am writing another blog post about it.

  • haishan says:

    Hi Vangos,
    Your post is very nice and helpful!
    Actually, I'm making a test application which compares a live motion (e.g. body frame) captured by the Kinect sensor and a playback xef file recorded by KStudio.
    I wanted them to display in two different canvases, but my difficulty is how to know whether the frames come from the sensor or the xef file.
    Have you faced this problem?
    Thanks in advance for your help.

    • Hi Haishan. You can either connect your app to a sensor or to a .xef file from Kinect Studio. As far as I am concerned, you can’t have both of them at the same time.

  • Jernej Jerin says:

    Hi!

    Thanks for concise explanation! By the way, at the end of your post you mentioned the Kinect book. Where can I buy it?

  • Eadhun says:

    Can you please help me to draw bones? Thank you very much .

    • Hi Eadhun. You can draw bones using the code from Vitruvius. Hope it helps you.

      A bone is represented using a Line object. The Line object has X1, Y1, X2, Y2 properties. X1 & Y1 correspond to the first joint, while X2 & Y2 correspond to the second joint.

      Using coordinate mapper, you find the X & Y coordinates for the color or depth frame and you set the position of the Line accordingly.

  • kal says:

    Good and easy explanation for newbie as me. I’m having an issue in my implementation, similar as described in http://stackoverflow.com/questions/25788545/kinect-for-windows-v2-depth-to-color-image-misalignment/29043584#29043584 and recently in http://stackoverflow.com/questions/29084773/kinect-v2-for-windows-depth-to-color-image-misalignment-and.

    What or where is my error?

    In your example (https://github.com/Vangos/kinect-2-background-removal), when I run it, the misalignment is not noticeable, since you remove the background. But if I comment out just the line
    //if (player != 0xff)
    to paint all pixels of the image, it duplicates the pixels on the rightmost side of the body, and I perceive some distortion at the top of the mapped image (http://s7.postimg.org/gvtv6vnl7/depth_In_Color3.png).

    Many thanks,

  • Abhishek says:

    Hi, Vangos
    Thank you very much for your very simple and brief explanation of these concepts. It is improving my knowledge on a big scale. 🙂 It would be very helpful if you could explain a simple 3D object mapping tutorial using Kinect.
    for ex : Placing 3d objects to kinect joints in real time..

  • Yash says:

    Hi, Vangos !!! The application I am interested is quite an advanced one to this !!! I want to modify the pixel values in the Color Frame with the information of Joints . I mean : Taking the joint 2D coordinates as input, I need to modify certain pixels on the Color Frame !!! So, What could be the idea…?? .

    • Kal says:

      Hi Yash. The information provided by the color frame is a [byte] buffer like buff = [BGRaBGRaBGRaBGRa….]. In it you can easily access a specific location; for instance the joint (X,Y) is mapped to idx = (Y * frameWidth) + X in the array. Then change that pixel's color to red: buff[idx*4] = 0; buff[idx*4+1] = 0; buff[idx*4+2] = 255; buff[idx*4+3] = 255. To copy the current color frame into the buffer, use the function .CopyConvertedFrameDataToArray provided by the same MS SDK.

      • Yash says:

        Hi Kal . Thanks for that . I had a brief idea of doing something like this . But at the same time, I am unable to convert to code . Either I am understanding the code in a wrong way or whatever , I would like to give the scenario with reference to code :

        void Reader_MultiSourceFrameArrived(object sender, MultiSourceFrameArrivedEventArgs e)
        {
        var reference = e.FrameReference.AcquireFrame();

        // Color
        using (var frame = reference.ColorFrameReference.AcquireFrame())
        {
        if (frame != null)
        {
        if (_mode == CameraMode.Color)
        {
        camera.Source = frame.ToBitmap();
        }
        }
        }

        // Depth
        using (var frame = reference.DepthFrameReference.AcquireFrame())
        {
        if (frame != null)
        {
        if (_mode == CameraMode.Depth)
        {
        camera.Source = frame.ToBitmap();
        }
        }
        }

        // Infrared
        using (var frame = reference.InfraredFrameReference.AcquireFrame())
        {
        if (frame != null)
        {
        if (_mode == CameraMode.Infrared)
        {
        camera.Source = frame.ToBitmap();
        }
        }
        }

        // Body
        using (var frame = reference.BodyFrameReference.AcquireFrame())
        {
        if (frame != null)
        {
        canvas.Children.Clear();

        _bodies = new Body[frame.BodyFrameSource.BodyCount];

        frame.GetAndRefreshBodyData(_bodies);

        foreach (var body in _bodies)
        {
        if (body.IsTracked)
        {
        // COORDINATE MAPPING
        foreach (Joint joint in body.Joints.Values)
        {
        if (joint.TrackingState == TrackingState.Tracked)
        {
        // 3D space point
        CameraSpacePoint jointPosition = joint.Position;

        // 2D space point
        Point point = new Point();

        if (_mode == CameraMode.Color)
        {
        ColorSpacePoint colorPoint = _sensor.CoordinateMapper.MapCameraPointToColorSpace(jointPosition);

        point.X = float.IsInfinity(colorPoint.X) ? 0 : colorPoint.X;
        point.Y = float.IsInfinity(colorPoint.Y) ? 0 : colorPoint.Y;
        }
        else if (_mode == CameraMode.Depth || _mode == CameraMode.Infrared) // Change the Image and Canvas dimensions to 512×424
        {
        DepthSpacePoint depthPoint = _sensor.CoordinateMapper.MapCameraPointToDepthSpace(jointPosition);

        point.X = float.IsInfinity(depthPoint.X) ? 0 : depthPoint.X;
        point.Y = float.IsInfinity(depthPoint.Y) ? 0 : depthPoint.Y;
        }

        // Draw
        Ellipse ellipse = new Ellipse
        {
        Fill = Brushes.Red,
        Width = 30,
        Height = 30
        };

        Canvas.SetLeft(ellipse, point.X - ellipse.Width / 2);
        Canvas.SetTop(ellipse, point.Y - ellipse.Height / 2);

        canvas.Children.Add(ellipse);
        }
        }
        }
        }
        }
        }
        }
        }

        enum CameraMode
        {
        Color,
        Depth,
        Infrared
        }

        In this code by Vangos, (var frame = reference.BodyFrameReference.AcquireFrame()) gives access to only the body frame, and (var frame = reference.ColorFrameReference.AcquireFrame()) to only the color stream. So, in the color stream we have written a function where we update the byte stream and return the bitmap. And now when we come to the body frame block, how do we access the byte stream of the color frame? Because "using" gives no more access to the colour frame returned in the function above.

        Can you please update the code segment with what you are trying to convey, and where? That would be really helpful. Thanks Kal.

        • Kal says:

          Hi Yash

          I downloaded Vangos' code, and due to some bug it fails in that function with a stack overflow message. So I didn't run your code.

          Seeing your code, I think you must remove null bodies. Before your
          if (body.IsTracked)
          add
          if (body != null)

          Obs. Kinect can track six bodies, but if there is nobody in the scene, the _bodies list contains null elements.

          • Yash says:

            Hi Kal. Thanks for the code. But did you forget to add the coordinate mapping to the code above? Since the resolutions of the depth frame and the color frame are different, and the joints map with respect to the depth frame while we modify the corresponding pixels on the color frame, I think we first need to convert the joint locations to the corresponding color-frame pixels with ColorSpacePoint colorPoint = _sensor.CoordinateMapper.MapCameraPointToColorSpace(jointPosition); Am I right…??

  • Yash says:

    Hi Vangos !!!! I want to elaborate my above ques !!! Suppose : Lets take five joint locations : ShoulderCenter, Shoulder Left, ShoulderRight,HipLeft, HipRight. Joining all these points forms a polygon . I want to color the polygon with a specific color say Yellow .Since the joints locations change dynamically with frames , the byte stream in Color Frame should be updated each time we construct the joints by Coordinate mapper. So, How to proceed now…??

    • Yash says:

      So, all I need is : How to access that byte stream of Color Fram where I can update the pixel values once we obtained the Joints mapped onto the Color Stream with Coordinate Mapper

      • Kal says:

        Hi Yash

        I suggest: first read both the color and body frames into buffers, next process the joints via coordinate mapping and update the color buffer, and finally display the new, updated color frame.

        //Based in the Vangos’ code
        1) Copy color frame to buffer

        // Color
        using (var frame = reference.ColorFrameReference.AcquireFrame())
        {
        if (frame != null)
        {
        // copy the current frame to a buffer (declare buffColor32 at a scope visible to step 2)
        byte[] buffColor32 = new byte[colorWidth * colorHeight * 4];
        frame.CopyConvertedFrameDataToArray(buffColor32, ColorImageFormat.Bgra);
        }
        }

        //Copy body list to a buffer
        using (var frame = reference.BodyFrameReference.AcquireFrame())
        {
        if (frame != null)
        {
        _bodies = new Body[frame.BodyFrameSource.BodyCount];
        frame.GetAndRefreshBodyData(_bodies);
        }
        }

        2) Coordinate Mapping
        foreach (var body in _bodies)
        {
        if (body!=null && body.IsTracked)
        {
        foreach (Joint joint in body.Joints.Values)
        {
        //your code to obtain the 2D-point color point (colorX, colorY)

        // Update the pixels locations in buffColor32
        if (colorX >= 0 && colorX < colorWidth && colorY >= 0 && colorY < colorHeight)
        {
        int index = (colorY * colorWidth + colorX) * 4;
        // set the pixel to yellow
        buffColor32[index + 0] = 0; // B
        buffColor32[index + 1] = 0xff; // G = 255
        buffColor32[index + 2] = 0xff; // R = 255
        }

        }

        }
        }

        3) Display the new color frame in a bitmap
        // create a bitmap
        WriteableBitmap bitmap = new WriteableBitmap(colorWidth, colorHeight, 96, 96, PixelFormats.Bgr32, null);

        // copy the buffer into the bitmap
        bitmap.WritePixels(
        new Int32Rect(0, 0, bitmap.PixelWidth, bitmap.PixelHeight),
        buffColor32, bitmap.PixelWidth * 4, 0);

        • Kal says:

          Set all the code in
          void Reader_MultiSourceFrameArrived(object sender, MultiSourceFrameArrivedEventArgs e)
          {

          }

    • Yash says:

      Hi Kal . Thanks for that . I had a brief idea of doing something like this . But at the same time, I am unable to convert to code . Either I am understanding the code in a wrong way or whatever , I would like to give the scenario with reference to code :

      void Reader_MultiSourceFrameArrived(object sender, MultiSourceFrameArrivedEventArgs e)
      {
      var reference = e.FrameReference.AcquireFrame();

      // Color
      using (var frame = reference.ColorFrameReference.AcquireFrame())
      {
      if (frame != null)
      {
      if (_mode == CameraMode.Color)
      {
      camera.Source = frame.ToBitmap();
      }
      }
      }

      // Depth
      using (var frame = reference.DepthFrameReference.AcquireFrame())
      {
      if (frame != null)
      {
      if (_mode == CameraMode.Depth)
      {
      camera.Source = frame.ToBitmap();
      }
      }
      }

      // Infrared
      using (var frame = reference.InfraredFrameReference.AcquireFrame())
      {
      if (frame != null)
      {
      if (_mode == CameraMode.Infrared)
      {
      camera.Source = frame.ToBitmap();
      }
      }
      }

      // Body
      using (var frame = reference.BodyFrameReference.AcquireFrame())
      {
      if (frame != null)
      {
      canvas.Children.Clear();

      _bodies = new Body[frame.BodyFrameSource.BodyCount];

      frame.GetAndRefreshBodyData(_bodies);

      foreach (var body in _bodies)
      {
      if (body.IsTracked)
      {
      // COORDINATE MAPPING
      foreach (Joint joint in body.Joints.Values)
      {
      if (joint.TrackingState == TrackingState.Tracked)
      {
      // 3D space point
      CameraSpacePoint jointPosition = joint.Position;

      // 2D space point
      Point point = new Point();

      if (_mode == CameraMode.Color)
      {
      ColorSpacePoint colorPoint = _sensor.CoordinateMapper.MapCameraPointToColorSpace(jointPosition);

      point.X = float.IsInfinity(colorPoint.X) ? 0 : colorPoint.X;
      point.Y = float.IsInfinity(colorPoint.Y) ? 0 : colorPoint.Y;
      }
      else if (_mode == CameraMode.Depth || _mode == CameraMode.Infrared) // Change the Image and Canvas dimensions to 512×424
      {
      DepthSpacePoint depthPoint = _sensor.CoordinateMapper.MapCameraPointToDepthSpace(jointPosition);

      point.X = float.IsInfinity(depthPoint.X) ? 0 : depthPoint.X;
      point.Y = float.IsInfinity(depthPoint.Y) ? 0 : depthPoint.Y;
      }

      // Draw
      Ellipse ellipse = new Ellipse
      {
      Fill = Brushes.Red,
      Width = 30,
      Height = 30
      };

      Canvas.SetLeft(ellipse, point.X - ellipse.Width / 2);
      Canvas.SetTop(ellipse, point.Y - ellipse.Height / 2);

      canvas.Children.Add(ellipse);
      }
      }
      }
      }
      }
      }
      }
      }

      enum CameraMode
      {
      Color,
      Depth,
      Infrared
      }

      In this code by Vangos, using (var frame = reference.BodyFrameReference.AcquireFrame()) gives access only to the body frame, and using (var frame = reference.ColorFrameReference.AcquireFrame()) only to the color stream. So, in the color block we have written a function that updates the byte array and returns the bitmap. Now, when we come to the body frame block, how do we access the byte array of the color frame? The “using” statement gives no further access to the color frame returned in the function above.

      Can you please update the code segment to show what you are trying to convey, and where? That would be really helpful. Thanks, Kal.

  • ozan emre says:

    Hi Vangos,
    I want to add an external camera to Kinect v2. Is it possible to map joints from depth space to the external camera’s space, and if so, how can I do it?
    Many thanks.

    Regards…

    Ozan Emre YAPICI

  • Spungia says:

    Hey Vangos,
    thanks for this tutorial! I just have one or two questions concerning alignment and the different coordinate systems:
    – coordinate systems: I was wondering what the resolution of the Kinect v2 SDK’s depth/camera-space information is. Better asked: in which range and with which step size are these two frames captured?

    – alignment: unfortunately I cannot upload an image, but I tried to map the color frame (1920×1080) onto the depth frame (512×424). There is double mapping in the areas where the background of the object is too far away. I already found out that this is due to the offset between the IR camera and the RGB camera. But how can I fix this? (If I didn’t explain my problem well, I could send you my example image via mail or something like that.)

    thanks a lot for your answers!!!
    cheers

    • Hi Spungia. Here are my comments to your questions:

      – Coordinate system: the depth/infrared resolution is 512×424 pixels. The values are measured in meters. To properly map a point from the 3D space to the 2D screen space, you need to use Coordinate Mapper. The sensor is accurate from 0.5 meters to 5 meters, approximately.

      – Alignment: indeed, there is no built-in fix for this. You could just ignore the depth pixels above a specific depth (e.g. > 5 meters).
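      As a rough sketch, that thresholding pass could look like this (assuming depthData is the ushort array copied out of the DepthFrame, with values in millimeters):

```csharp
// Sketch: zero-out depth pixels beyond a threshold so they are ignored
// by any subsequent color-to-depth mapping. "depthData" is assumed to be
// the array filled via depthFrame.CopyFrameDataToArray(depthData).
const ushort maxDepth = 5000; // 5 meters, in millimeters

for (int i = 0; i < depthData.Length; i++)
{
    if (depthData[i] > maxDepth)
    {
        depthData[i] = 0;
    }
}
```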

  • Richard Minh says:

    Dear Vangos,

    thanks for your boundless efforts, which lead us developers to successful development.
    However, I’ve some problems in getting facial landmarks in kinect v2.0.
    In Kinect v1.x it supported 100 landmarks, so it was easy to use them.
    But in v2.0 it only supports 5 in the free basic mode and 1,347 in high-definition mode.
    My questions are:
    1. How can I get more than 36 landmark points in v2.0?
    2. When I use the MapCameraPointsToColorSpace function in HD mode, it often returns some infinite point structs. I think this is because the CameraSpacePoint’s elements are negative. How can I fix this?
    Expecting your help…
    Thanks and Regards.

  • Praneeth says:

    Hi Vangos,
    Thanks for the post. It proved very valuable for me, especially shifting from v1 to v2; it was new for me, and you gave good examples while explaining.

    I was trying out the depth data and could color it according to depth. I also tested the sensitivity of Kinect v2’s depth. I am very interested in smoothing the depth data and then filtering the depth image for edge detection, for more accurate body-joint detection. I have a few important doubts:

    First, let me explain what I want to do.
    I want to detect a body using the depth image and take in the joint coordinates. Then I want to see how accurately the depth data comes out while I move, jump, crawl, etc. (i.e., I will use VGB to build some good gestures and test them against my actions.)

    1. I want to know whether the IDs that Kinect gives to the depth pixels to identify them are fixed to a particular pixel. For example, if there is a depth pixel ID for the head joint, when the head moves, will the pixel and its associated ID also move along with the head?

    2. I want to know, in general, how smoothing and depth filtering for edge detection are done in Kinect v2. As I am not good at signal processing, I feel uncomfortable using filtering methods with VS C#. Is there a hard rule to follow, or a particular book that would guide me on depth smoothing and filtering? I would really appreciate your help here, Vangos.

    I hope I have made my questions clear.

    • Hi Praneeth,

      Here are my comments to your questions:

      1) Kinect notifies you whether a specific pixel belongs to a player or not. This is the job of the BodyIndex frame. The BodyIndex frame gives you an array of bytes: values 0–5 indicate the index of the tracked body the pixel belongs to, while 255 means the pixel does not belong to any player. It is updated in real time.

      2) There is no built-in way to achieve smoothing. You need to write your own image-processing algorithm or use an existing library, such as EmguCV or AForge.NET. Try experimenting with smoothing filters, like the median filter. You’ll need to feed them the resulting image of your background-removal process.

      Check Median Filter for EmguCV and Median Filter for AForge.
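      For reference, a hand-rolled 3×3 median filter over the raw depth array could be sketched like this (plain C#, not tied to EmguCV or AForge):

```csharp
using System;

// Minimal 3x3 median filter for a depth array (values in millimeters).
// For the Kinect v2 depth frame, width = 512 and height = 424.
static ushort[] MedianFilter(ushort[] depth, int width, int height)
{
    var result = (ushort[])depth.Clone();
    var window = new ushort[9];

    for (int y = 1; y < height - 1; y++)
    {
        for (int x = 1; x < width - 1; x++)
        {
            int k = 0;
            for (int dy = -1; dy <= 1; dy++)
                for (int dx = -1; dx <= 1; dx++)
                    window[k++] = depth[(y + dy) * width + (x + dx)];

            Array.Sort(window);
            result[y * width + x] = window[4]; // median of the 9 neighbors
        }
    }

    return result;
}
```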

  • Mari says:

    Hi Vangos,

    Thanks for the tutorials. It may sound silly, but what is the distance in mm between the pixels in the v2 depth image? Can you point me to any resources on how to calculate it?

    Thanks
    Mari

    • Hi Mari,

      The Depth Image is an array of distances. Kinect gives you the distance between any point in the field of view and the plane of the sensor.

      The Depth Frame has a resolution of 512×424 = 217,088 points. So, for every point, you know its distance from the sensor’s plane.

      Check this article, too.

      • Mari says:

        Thanks for the reply. The value of the points (pixels) is the distance from the sensor plane, but can the distance between two adjacent points (pixels) be found, e.g. using the camera intrinsic parameters?

  • Tommy says:

    Hi Vangos,
    Do you know how to calculate the sitting posture angle ?

    • Hi Tommy. Vitruvius could help you. There are a few Kinect Joint extensions, so you can find the angle between three joints easily.

      For example, here’s what you can do:
      var spineMid = body.Joints[JointType.SpineMid];
      var spineBase = body.Joints[JointType.SpineBase];
      var kneeRight = body.Joints[JointType.KneeRight];

      double angle = spineBase.AngleBetween(spineMid, kneeRight);

  • […] are measured in meters! To properly find the corresponding pixel values, we’ll use Coordinate Mapper. Coordinate Mapper is a built-in mechanism that converts between 3D space positions to 2D screen […]

  • Bill Moldestad says:

    Hello Vangos. It sounds like the Kinect 2.0 uses a different 3D scanning system than the Kinect 1.0. The 1.0 used an infrared field of dots, inferred depth from their distortion, and then used pattern-recognition software to identify possible body shapes from the distortions, including reading how the distortion changed dynamically with time: movement. But 2.0 uses a time-of-flight sensor that allows for finer resolution. How does the 2.0 sensor map joints? Does it map joints from the depth field like 1.0 does, or does it do it some other way? Thank you, Bill.

    • Hi Bill. The sensor API and logic are the same between SDK 1.8 and SDK 2. The infrared information is processed by the depth processor, and the 3D joint coordinates are generated. It’s a similar process to Kinect v1, though you now have access to the infrared data through the API. Infrared data was not directly accessible in v1.

      • Bill Moldestad says:

        Thanks so much for this information! This is what I suspected: the same process, just with more information to work with thanks to the improved depth sensors on Kinect 2.0, and, as you say, we now have direct access to the IR data. I’m looking forward to your Kinect programming book when it comes out. Some people are doing entertaining things with this technology: Bill Chappell of Digital Dowsing sells a handheld SLS camera used for “ghost hunting”, and the Ghost Adventures show is using it a lot this season. Pretty far-fetched, but once in a while you get interesting results. Then again, it’s hard to rule out that instead of mapping non-visible entities, the pattern-recognition software is hallucinating! Please keep up the great work! Bill

  • driming says:

    thanks for sharing; this helps me a lot. I’m looking forward to your new book. Is it available right now, and where can I buy it?

  • […] to convert points between the 3D world-space and the 2D screen-space. Kinect SDK includes CoordinateMapper, a useful tool that converts between camera space and depth/color space. Vitruvius simplifies […]


  • Imran says:

    Hi,
    I am stuck at one point. The objective is: how can I find the distance (horizontal/adjacent distance) from the Kinect to a body joint, e.g. the head joint?

    Regards

  • Hi Imran. The distance from the sensor is the Z value.

    For example:
    float distance = body.Joints[JointType.Head].Position.Z;

    • Imran says:

      Thanks for the reply.

      But I need something different from this. I will try to explain better. I need the exact horizontal distance from the body (feet) to the Kinect. Z gives the diagonal distance between the Kinect and the object. One more thing: I manually tilt the Kinect device towards the ground.

      Explanation: the Kinect is 186 cm above the ground, and the Kinect stand base is about 320 cm from the bottom of the front wall. On that wall I mark a point level with my head as x = 0, y = 0, z = 321. This Z value is almost equal to the physical distance from the wall to the bottom of the Kinect stand.

      When I move towards the Kinect, or move left or right, the Z value increases from 321. I need the ground/horizontal distance from the Kinect stand to the body’s feet. How can I get it, please?

      Regards,

  • Leo says:

    Hello Vangos, I’m updating a source code from SDK 1.8 to 2.0, and the following types and methods no longer have references:

    private DepthImagePixel[] depthPixels;
    DepthImageFrame curDepthFrame;
    curDepthFrame.CopyDepthImagePixelDataTo(this.depthPixels);

    Can you tell me what I have to use?

    • Hi Leo. Here are the proper replacements:
      DepthImagePixel —> DepthSpacePoint
      DepthImageFrame —> DepthFrame
      CopyDepthImagePixelDataTo —> CopyFrameDataToArray
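      As a quick sketch of the v2 equivalents in use (assuming reference is the acquired MultiSourceFrame, as in the code from the post):

```csharp
// Sketch: copying depth data with the Kinect v2 API.
using (DepthFrame depthFrame = reference.DepthFrameReference.AcquireFrame())
{
    if (depthFrame != null)
    {
        FrameDescription description = depthFrame.FrameDescription;
        ushort[] depthData = new ushort[description.Width * description.Height]; // 512 x 424

        // The v2 replacement for CopyDepthImagePixelDataTo:
        depthFrame.CopyFrameDataToArray(depthData);
    }
}
```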

      Hope that helped you.

  • Leo says:

    Vangos , Thanks for your answer!
    One more question: is there any possibility that Kinect can detect more than six faces simultaneously?

  • Vibek says:

    Hey, I am working with OpenNI and ROS. I draw the skeleton in the depth frame and publish the tf to RVIZ. My quick question is how to draw the same skeleton in the RGB image frame. Note that I am using OpenNI and ROS under Ubuntu.

    Your help will be appreciated.

  • Kal says:

    Hi Vangos,

    Recently I noticed that the CoordinateMapper object has a CoordinateMappingChanged event.

    Do you know anything about it?

    There is not much explanation in the official MSDN documentation.

    In particular:
    1. When can this event fire?
    2. How do coordinate mappings change? The event args object is basically empty.
    3. Why would the coordinate mappings change at all?
    4. Does it actually fire from the latest SDK?

    Would you have *insider* connections at Microsoft who can give explanations?

    • Hi Kal. Didn’t have time to test this event, so I apologize for my late reply.

      CoordinateMappingChanged event is fired when the sensor is initialized. The event arguments are, indeed, empty.

      I would expect this event to fire when the sensor is tilted or positioned differently. It seems like this is not the case; this event seems useless right now.

      • Kal says:

        Hi Vangos,

        I did a bit of testing myself.

        The short answer is that CoordinateMappingChanged should probably have been named “CoordinateMappingReady”. It seems to fire when the CoordinateMapper is actually ready and usable. If you try to use the CoordinateMapper before this event fires, the camera intrinsics will not have been loaded and you will get garbage out of any mapping attempt, e.g. depth points all mapping to zeros.
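        A sketch of deferring any mapping work until that event fires (the _mapperReady flag is just an assumed field for illustration):

```csharp
// Sketch: treat CoordinateMappingChanged as a "mapper is ready" signal.
_sensor.CoordinateMapper.CoordinateMappingChanged += (s, args) =>
{
    // The camera intrinsics are now loaded; mapping calls should
    // return valid results from this point on.
    _mapperReady = true;
};
```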

  • Sanket says:

    Hi Vangos,
    I would like to track multiple hands with the coordinates of their fingers. Is there any feature to achieve this?

    Thanks

  • Sanket says:

    I have detected multiple bodies (up to 6) using your code, but the bodies array holds them in no particular order, i.e. the first body that arrives may be put in bodies[4], the second in bodies[2], and so on.
    Is this the correct behavior? I am facing an issue when performing an operation on a specific one.

  • Sanket says:

    Dear Vangos,
    First of all Congratulations and thanks for writing the code on Finger Tracking.
    I am using your Vitruvius finger-tracking code in my project. It needs lots of modification, because the project needs to detect 6 right hands.
    Could you please tell me how I can get the index finger from your finger-tracking code? As far as I can see, there are only 2 states: HandState.Open for all 5 fingers, otherwise 0 fingers.

    • Hi Sanket. You’ll have to loop through the detected fingers and sort them by their relative distance. The index should be the second finger.

      • Sanket says:

        That’s true when we have detected all 5 fingers and then apply the distance-based calculation.
        With this approach, I think if we open only the last 3 fingers, it will detect the RING finger as the index finger.
        How can we manage if I have just opened the THUMB and the INDEX finger (lasso) and would then like to track the index finger?

        • Hi Sanket. I can think of two ways to do this:

          1) Compare the position of each finger with the position of the ThumbLeft/ThumbRight joint.

          2) Compare the positions of each finger with the position of the palm (HandLeft/HandRight) and see which one is longer.
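          A rough sketch of idea #2, picking the finger tip farthest from the palm (the Finger type and its Position property are assumptions, standing in for whatever the finger-tracking code exposes):

```csharp
// Hypothetical sketch: find the finger tip farthest from the palm.
// "Finger"/"fingers" and the CameraSpacePoint "Position" are assumed here.
CameraSpacePoint palm = body.Joints[JointType.HandRight].Position;

Finger farthest = null;
float maxDistance = 0f;

foreach (Finger finger in fingers)
{
    float dx = finger.Position.X - palm.X;
    float dy = finger.Position.Y - palm.Y;
    float dz = finger.Position.Z - palm.Z;
    float distance = (float)Math.Sqrt(dx * dx + dy * dy + dz * dz);

    if (distance > maxDistance)
    {
        maxDistance = distance;
        farthest = finger; // likely the index finger in the lasso pose
    }
}
```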

  • RS says:

    In your finger-tracking code, how do I check how many fingers are open, and which finger is open?
    Thanks

  • Yannis says:

    Dear Vangos,

    is it possible to map (i.e. downsample) the color frame to the depth frame? I think I’m missing something here.

    • Sure. Just use the CoordinateMapper.MapColorFrameToDepthSpace method.

      • Yannis says:

        That’s the missing point for me. This function takes depth-frame data. Where is the color frame involved?

        • That function gets two parameters: an array of ushort values (the depth data from the depth frame array) and an array of depth space points (X-Y coordinates).

          The array of depth space points is originally empty. After you call the MapColorFrameToDepthSpace it gets filled with the corresponding X-Y values. These values are the “common points” between a 1920×1080 color frame and a 512×424 depth frame.
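          Putting the two parameters together, the call might be sketched like this (assuming depthData was just copied out of the latest DepthFrame):

```csharp
// Sketch: one DepthSpacePoint per color pixel (1920 x 1080 of them).
ushort[] depthData = new ushort[512 * 424]; // filled via CopyFrameDataToArray
DepthSpacePoint[] depthPoints = new DepthSpacePoint[1920 * 1080];

_sensor.CoordinateMapper.MapColorFrameToDepthSpace(depthData, depthPoints);

// depthPoints[i] now holds the depth-space X/Y for color pixel i,
// or infinity when there is no corresponding depth pixel.
```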

          • Yannis says:

            Thanks a lot! I was reading now the corresponding msdn page, trying to understand the function. I will try it asap.

      • Boogiedoll says:

        Hi Vangos. When I use MapDepthFrameToColorSpace, I get duplicate color pixels in the background, most likely caused by pixels that are not seen by the color camera. Besides depth thresholding, how do you remove these duplicate pixels?

  • Sankaraganesh says:

    Dear Vangos,

    Your explanations are interesting.

    In my project I am trying to use both the color and depth frames obtained from the Kinect sensor. Due to the offset between the frames, I am facing a problem aligning the Kinect color frame to the depth map. Kindly advise me how to align the color and depth frames.

    Thanks!

    • Hi Sankara,

      Thank you for your comment. You can use the MapColorFrameToDepthSpace method and find the mapped points between the two frame types. The frames are not in perfect sync, because the cameras are a small distance apart.

  • Ibrahim Seleem says:

    Thanks a lot for your explanations. I am using Kinect v2; when I run the related code, it works well, but in order to detect the whole body I have to move far away from the camera. I want to detect the whole body from a small distance. What changes are needed in the code, and will the joint data be accurate?

    • Hi Ibrahim. Kinect works for a specific distance. You can’t change that.

      • Ibrahim Seleem says:

        Ok, I have no problem with that. But I want the whole body to appear on screen without being far away. When I run the v2 code, I have to move more than 5 meters away to detect the whole body. I want to detect at 3 meters, for example; this distance is acceptable for Kinect.

      • Ibrahim Seleem says:

        First, I want to know whether skeleton tracking with Kinect is affected by the height of the table.

        I put the Kinect on a 60 cm table; if I put it on an 80 cm table, is skeleton tracking affected or not?

        Second, what is a suitable table height for skeleton tracking with Kinect?

        • Hi Ibrahim. In general, Kinect works fine between 80 cm and 180 cm height. 60 cm is quite low, so you may get slightly different results.

          I usually suggest placing the sensor between 100 and 150 cm.

  • Ibrahim Seleem says:

    Thanks a lot Vangos Pterneas.
    I’m trying the C# examples related to skeleton tracking of a human walking on a straight line parallel to the camera (Z is constant).
    But the X and Y values are larger than the normal values. I want to get values close to the real values.

    The camera is at 80 cm height. The human body is 4 m away from the camera.

    Thanks in advance.

    • Hi Ibrahim. Try placing the sensor at 1 meter. X and Y values are measured in terms of the FOV of the camera.

      • Ibrahim Seleem says:

        Thanks a lot.
        I found that the field-of-view parameters affect the Kinect readings. So I want to know the relation between the field of view and real-world values.

        • They are real-world values:
          * X is the distance from the left of the FOV.
          * Y is the distance from the top of the FOV.
          * Z is the distance from the plane of the sensor.

          • Ibrahim Seleem says:

            I put the camera on a 1.20 m table, but it also gives me values larger than the real ones.

          • Hi Ibrahim. What do you mean “larger than real”? What is the divergence between the values you get and the real-world values?

          • Ibrahim Seleem says:

            The difference between the x-y values that I got from Kinect and the real x-y values (I mean on the ground that I walk on) is about 5 cm to 10 cm.

          • That may depend on the distance and the tracking accuracy. The Z-axis values are usually close to the real-world values (millimeter-level accuracy). The X and Y values are measured based on the FOV, so the 5-10 cm divergence in those axes sounds normal to me.

  • Lou-Ann Raymond says:

    Hi, I am new to Kinect and I need to detect the floor. I know there is the FloorClipPlane property, which gives the equation of the floor, but it belongs to the skeleton frame class and I am not tracking anybody; I just want to know where my floor is with the Kinect. Is it possible to obtain the equation without using the skeleton frame, or do I have to write my own code to detect it?

    Thanks in advance

    • Hi Lou-Ann. The built-in floor information is only available in the Body frame class, so you have to use that stream.

      There is no need to have any tracked bodies, though. The FloorClipPlane will be available even if there are no bodies in front of the sensor.
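      Reading the plane could be sketched like this (FloorClipPlane is a Vector4; its W component is the sensor’s height above the floor, in meters):

```csharp
// Sketch: reading the floor plane from a BodyFrame (no bodies required).
using (BodyFrame bodyFrame = reference.BodyFrameReference.AcquireFrame())
{
    if (bodyFrame != null)
    {
        Vector4 floor = bodyFrame.FloorClipPlane;

        // Plane equation: floor.X*x + floor.Y*y + floor.Z*z + floor.W = 0.
        float sensorHeight = floor.W; // meters above the floor
    }
}
```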

      • Lou-Ann Raymond says:

        Hi, thank you for your answer 🙂

        but in this case would it be the equation for any plane in my environment?

        • Yes, that’s right. It’s specific to the environment, not the body.

          • Lou-Ann Raymond says:

            okay thank you very much for your answers 🙂

          • Lou-Ann Raymond says:

            Hi.
            I was wondering whether the distance w from the camera to the floor depends on the Kinect Y axis? I also suppose the camera can never tilt while using this function? What happens if the floor is tilted?

            Thanks in advance

          • Hello. The values are the same, until you restart the sensor. The sensor is not supposed to move or tilt.

            Probably this would help you: http://gamedev.stackexchange.com/a/80489

          • Lou-Ann Raymond says:

            Hi,

            For the FloorClipPlane: if my camera tilts to the sides (left, right), then w = 0. I am trying to detect the floor no matter how my camera is moving/tilting, so I suppose I can’t use this built-in property, because it only gives me the equation of the plane.

  • […] pixels. So, how could we convert meters to pixels? The SDK includes a powerful utility, called CoordinateMapper. Using CoordinateMapper, we can find the position of the hand in the 2D […]

  • […] needs to follow the position of the hand, which is displayed in a 2D texture (1920×1080). CoordinateMapper is a utility that allows us to convert 3D coordinates to 2D points, and […]

  • Yang Wu says:

    Hi Vangos,
    Thanks for your post! It’s really helpful.
    I’m working on Kinect v2 and I encountered a problem when using CoordinateMapper. I used the MapCameraPointToDepthSpace function. Most of the time, the X and Y of the returned DepthSpacePoint are in the ranges [0, 511] and [0, 423], respectively. However, every now and then, the X and Y of the returned DepthSpacePoint are out of this range. I mean, X can be as big as around 543 and as small as around -30, and Y can be as big as around 1200. Do you know why this happens and how to deal with it? Thanks so much!
    Best
    Yang

    • Hi Yang. CoordinateMapper may return non-numeric values; in fact, it may return positive or negative infinity.

      For every DepthSpacePoint, you have to check whether its X and Y values are numbers:

      if (!float.IsInfinity(point.X) && !float.IsInfinity(point.Y))
      {
      // Do something 🙂
      }

  • Hanan says:

    Hi Mr.Pterneas,

    I have a question: how can I print out the local axes (a, b, c) for the hand joint?

    I just need the local axes for the joint.

    Thanks in advance

    • Hi Hanan. You can use the X, Y, and Z values of the Position property:

      var x = body.Joints[JointType.Head].Position.X;
      var y = body.Joints[JointType.Head].Position.Y;
      var z = body.Joints[JointType.Head].Position.Z;

  • Hanan says:

    Thanks a lot Mr.Pterneas…

  • Hanan says:

    Hi Mr.Pterneas

    I am sorry to bother you.

    I tried to write your code in MainWindow.xaml.cs and MainWindow.xaml,

    and I got an error on ‘CameraMode’ which says “CameraMode could not be found”.

    What should I do?

    thanks

  • Ibrahim Seleem says:

    Hello Vangos Pterneas,
    I used the skeleton-tracking C# program; a human walks in the sagittal plane 2.8 m away from the camera, but I noticed that the resolution is low. The Kinect v2 is at 1.1 m height.
    I need the subject far enough from the camera to detect all body joints. How can I compensate for the error caused by the low resolution?

  • Ibrahim Seleem says:

    Ok, but if I need to track the whole skeleton, the subject needs to be far enough from the camera.

  • Hanan says:

    Hello Mr.Pterneas,

    I have a question, and I didn’t get the idea, so can you explain it to me if you don’t mind?
    I am trying to determine the following:

    1. While a person is eating, the arm direction (whether the arm is up near the mouth, or down).
    2. The head direction (pitch, roll, yaw); I need to identify whether the head is up, down, or rotated.

    Thank you so much for everything

    • Hi Hanan,

      Regarding point #1, check the distance between the hand and the head. You could use HD Face and check the distance between the hand and the Mouth.

      Regarding point #2, you’ll need to use HD Face and measure the angle between the nose and neck.

      You can also use Vitruvius and access the HD Face properties directly.
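      For point #1, the hand-to-head distance check could be sketched like this (the 0.25 m threshold is an arbitrary value to tune):

```csharp
// Sketch: is the right hand close to the head (e.g. while eating)?
CameraSpacePoint hand = body.Joints[JointType.HandRight].Position;
CameraSpacePoint head = body.Joints[JointType.Head].Position;

float dx = hand.X - head.X;
float dy = hand.Y - head.Y;
float dz = hand.Z - head.Z;
float distance = (float)Math.Sqrt(dx * dx + dy * dy + dz * dz);

bool handNearMouth = distance < 0.25f; // 0.25 m: arbitrary threshold to tune
```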

  • Helena says:

    Hello Mr. Pterneas,

    A friend and I have noticed, when testing this Kinect Coordinate Mapping program, that the ellipses drawn on the user tend to blink a lot. Is there any way of alleviating this blinking or stopping it entirely? Thank you for your time.

  • […] So, we have to convert meters to pixels! How’s that possible? I have thoroughly explained this process in my blog post Understanding Kinect Coordinate Mapping. […]


  • Ibrahim Seleem says:

    Dear Vangos Pterneas,

    I need to know how I can detect the position of a fixed marker using Kinect v2. It would be great if there is code that displays this point.

  • Gillian Lei says:

    Hey Mr. Pterneas, I want an image at the head joint only,
    not ellipses. Can you help me with that? I’m trying to learn Kinect development, and it would be a big help.

    • Gillian Lei says:

      I downloaded and copied your Kinect Coordinate Mapping code into my own project, and I have these errors:
      // Color
      using (var frame = e.OpenColorImageFrame())
      {
      if (frame != null)
      {
      if (_mode == CameraMode.Color)
      {
      camera.Source = frame.ToBitmap();
      }
      }
      using (var frame = e.OpenDepthImageFrame())
      {
      if (frame != null)
      {
      if (_mode == CameraMode.Depth)
      {
      camera.Source = frame.ToBitmap();

      I have an error on frame.ToBitmap in both the color and depth blocks,
      and it says:
      ‘Microsoft.Kinect.DepthImageFrame’ does not contain a definition for ‘ToBitmap’ and no extension method ‘ToBitmap’ accepting a first argument of type ‘Microsoft.Kinect.DepthImageFrame’ could be found (are you missing a using directive or an assembly reference?)

      Thank you Dear Vangos Pterneas

    • Hello. To display an image instead of an ellipse, use the Image control instead of the Ellipse.

      The “ToBitmap” method is an extension method and can be found in the source code of this blog post here and here.

      Vangos

      • Gillian Lei says:

        can you show me the code on how can i implement the image control?
        i am using your code in Kinect Coordinate mapping
        thank you so much

        • In your XAML add an Image element and set its source to the image path you want. Read more about the Image control:
          https://www.youtube.com/watch?v=affbgRZoeGc

          • Gillian Lei says:

            What I mean is: in your Kinect Coordinate Mapping code, how will I replace the ellipse with an image at the head joint only? Thank you so much for your time, sir. I just want to be a girl who programs, not just a typical kind of girl.
            Anyway, it’s for educational purposes only.
            Thanks

          • Replace the Ellipse object with an Image object. Also, set its Source property to the image you want to display.

          • Gillian Lei says:

            Do you have any tutorial for this? What should I replace?

          • I do not have a particular tutorial. You need to replace the Ellipse objects with Image objects. Ellipse and Image are XAML controls. So, instead of displaying an Ellipse control, you’ll display an Image control.

            This is a similar tutorial that uses Image controls: http://pterneas.com/2014/12/21/kinect-2-face-basics/

          • Gillian Lei says:

            I Replaced ellipse with an image,(image1).
            When I ran the program, I can track the image(i am sure to that) but it ran only in the background of the color image, I can show you my code if you like.

          • Gillian Lei says:

            Heres the code

            // Body
            using (var frame = e.OpenSkeletonFrame())
            {
            if (frame != null)
            {
            canvas.Children.Clear();

            frame.CopySkeletonDataTo(_bodies);

            foreach (var body in _bodies)
            {
            if (body.TrackingState == SkeletonTrackingState.Tracked)
            {
            // COORDINATE MAPPING
            //foreach (Joint joint in body.Joints)
            Joint joint = body.Joints[JointType.Spine];
            {
            // 3D coordinates in meters
            SkeletonPoint skeletonPoint = joint.Position;

            // 2D coordinates in pixels
            Point point = new Point();

            if (_mode == CameraMode.Color)
            {
            // Skeleton-to-Color mapping
            ColorImagePoint colorPoint = _sensor.CoordinateMapper.MapSkeletonPointToColorPoint(skeletonPoint, ColorImageFormat.RgbResolution640x480Fps30);

            point.X = colorPoint.X;
            point.Y = colorPoint.Y;
            }
            else if (_mode == CameraMode.Depth) // Remember to change the Image and Canvas size to 320×240.
            {
            // Skeleton-to-Depth mapping
            DepthImagePoint depthPoint = _sensor.CoordinateMapper.MapSkeletonPointToDepthPoint(skeletonPoint, DepthImageFormat.Resolution320x240Fps30);

            point.X = depthPoint.X;
            point.Y = depthPoint.Y;
            }

            // DRAWING…
            //Ellipse ellipse = new Ellipse
            //{
            // Fill = Brushes.LightBlue,
            // Width = 20,
            // Height = 20
            //};

            Canvas.SetLeft(image1, point.X - image1.Width / 2);
            Canvas.SetTop(image1, point.Y - image1.Height / 2);

            canvas.Children.Add(image1);
            }
            }

            Here’s my XAML.

            I hope you can help me. Thanks <3

          • The code seems OK. You just have to place the Image control into the Canvas, in front of the Camera image.

          • Gillian Lei says:

            This is my Code now

            When i put the image outside the canvas, it said that.

            Specified element is already the logical child of another element. Disconnect it first.

            When I put the image inside the canvas, it only shows behind the color image.

          • The code should be something like this:


            <Image Name="camera" />
            <Canvas Name="canvas">
            <Image Name="yourImage" />
            </Canvas>

  • Gillian Lei says:

    Please take a look,
    I have done it with your code, but when i run the program,
    It doesn't produce any error, but it is not showing the image; I think it's rendered behind
    the camera image.

    • Please do not paste code in blog comments, since it’s not displayed properly.

      Have you specified the Source property of the image?


      <Image Name="yourImage" Source="path/to/image" />

  • gillian lei says:

    Yes, I have specified the Source property of the image. Do you mind taking a look at the whole solution file?

  • gillian lei says:

    Ok, thanks again 🙂

    • Gillian Lei says:

      Hey Mr. Pterneas

      I have no problem tracking the image, but it keeps blinking while tracking, just like the ellipses in your Kinect coordinate mapping sample.

      Do you have any tutorial where the image is a 3D shirt?

      • To prevent the image from blinking, remove the following lines:


        canvas.Children.Clear();
        canvas.Children.Add(image1);

        Animating a 3D shirt is way more complicated than displaying an image. It would require thousands of lines of code. You could use Vitruvius for this purpose.

        • Gillian Lei says:

          What if i just want to Fit a 2d image into the body only? do you have any tutorials?

          • To project a 2D image on top of a body, you could do something like this:

            XAML:


            <Viewbox>
            <Canvas Width="1920" Height="1080">
            <Image Name="ClotheImage" Source="/Path/To/Image" Stretch="Uniform" />
            </Canvas>
            </Viewbox>

            C#:


            var shoulderLeft = body.Joints[JointType.ShoulderLeft];
            var shoulderRight = body.Joints[JointType.ShoulderRight];
            var neck = body.Joints[JointType.ShoulderCenter];
            var handLeft = body.Joints[JointType.HandLeft];
            var handRight = body.Joints[JointType.HandRight];

            var shoulderLeftPoint = CoordinateMapper.MapCameraPointToColorSpace(shoulderLeft.Position);
            var shoulderRightPoint = CoordinateMapper.MapCameraPointToColorSpace(shoulderRight.Position);
            var neckPoint = CoordinateMapper.MapCameraPointToColorSpace(neck.Position);

            ClotheImage.Width = Math.Abs(shoulderLeftPoint.X - shoulderRightPoint.X);

            var x = neckPoint.X - (ClotheImage.Width / 2.0);
            var y = neckPoint.Y;

            Canvas.SetLeft(ClotheImage, x);
            Canvas.SetTop(ClotheImage, y);

          • Gillian Lei says:

            I will do it ASAP. Thank you so much!

          • Gillian Lei says:

            Where should I put your suggested code in your Kinect coordinate mapping sample?

          • Inside the BodyFrame_Arrived event handler or inside the MultiSourceFrame_Arrived event handler.

          • Gillian Lei says:

            It says that Microsoft.Kinect.CoordinateMapper does not contain a definition for MapCameraPointToColorSpace.
            And where did you get that upperBody?

          • The upperBody was a typo. I fixed it; it's "ClotheImage".

            The MapCameraPointToColorSpace method is part of the Kinect SDK v2. In SDK v1, it’s called MapSkeletonPointToColorPoint.

          • Gillian Lei says:

            Hey Mr. Vangos, it is working now, but it only takes the neck point and not the left and right shoulder points,
            so the image appears only at the center of the user. If you want to see a picture, I will send you one by email.

          • In my code, I have set the width of the image to be the width of the shoulders. You can’t get into more detail in 2D.

          • Gillian Lei says:

            How did you do that in your code? Will you please help me?

          • ClotheImage.Width = Math.Abs(shoulderLeftPoint.X - shoulderRightPoint.X);

          • Gillian Lei says:

            DepthImagePoint ShoulderLeftDepthPoint =
                depth.MapFromSkeletonPoint(first.Joints[JointType.ShoulderLeft].Position);
            DepthImagePoint ShoulderRightDepthPoint =
                depth.MapFromSkeletonPoint(first.Joints[JointType.ShoulderRight].Position);
            DepthImagePoint NeckDepthPoint =
                depth.MapFromSkeletonPoint(first.Joints[JointType.ShoulderCenter].Position);

            ColorImagePoint ShoulderLeftColorPoint =
                depth.MapToColorImagePoint(ShoulderLeftDepthPoint.X, ShoulderLeftDepthPoint.Y,
                ColorImageFormat.RgbResolution640x480Fps30);

            ColorImagePoint ShoulderRightColorPoint =
                depth.MapToColorImagePoint(ShoulderRightDepthPoint.X, ShoulderRightDepthPoint.Y,
                ColorImageFormat.RgbResolution640x480Fps30);

            ColorImagePoint NeckColorPoint =
                depth.MapToColorImagePoint(NeckDepthPoint.X, NeckDepthPoint.Y,
                ColorImageFormat.RgbResolution640x480Fps30);

            image1.Width = Math.Abs(ShoulderLeftColorPoint.X - ShoulderRightColorPoint.X);

            var x = NeckColorPoint.X - (image1.Width / 2.0);
            var y = NeckColorPoint.Y;

            Canvas.SetLeft(image1, x);
            Canvas.SetTop(image1, y);

            I don't know what is wrong with my code. It just doesn't fit the width of my shoulders.

          • Pick a PNG image that has no transparent gaps between the cloth and its edges.

          • Gillian Lei says:

            Hi, it's me again. The image is actually too small for the body.
            I think the image is only half the size of my body. How can I solve it?

          • Hello. You could experiment with different images.

          • Gillian Lei says:

            I tried many images, but it is still too small.

            The width of the shoulders is not enough.
            I wonder if I can widen the shoulder width more, and how to do that?

          • const double multiplier = 1.2; // A positive number to multiply the shoulder width.
            image1.Width = Math.Abs(ShoulderLeftColorPoint.X - ShoulderRightColorPoint.X) * multiplier;

          • Gillian Lei says:

            Wow! Thank You so much, that works perfectly 🙂

            Is there any way I can adjust the position of the neck point upwards? Just a little bit 🙂

          • const double yMultiplier = 0.95;
            var y = NeckColorPoint.Y * yMultiplier;

          • Gillian Lei says:

            Wow, you're such a genius!
            How about calculating the height of the image,
            from the neck point down to the hip center?

          • Similarly to the width. Instead of the distance between the shoulders, calculate the distance between the Neck and the SpineBase.
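            As a rough, self-contained sketch of that calculation (the pixel values below are hypothetical; at runtime the two Y coordinates would come from the mapped Neck and SpineBase points):

```csharp
using System;

class HeightSketch
{
    static void Main()
    {
        // Hypothetical color-space Y coordinates (in pixels) for the Neck
        // and SpineBase joints, as returned by the coordinate mapper.
        double neckY = 380.0;
        double spineBaseY = 690.0;

        // Image height = vertical distance between the two joints,
        // mirroring the shoulder-distance calculation used for the width.
        double clotheHeight = Math.Abs(spineBaseY - neckY);

        Console.WriteLine(clotheHeight); // 310
    }
}
```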

          • Gillian Lei says:

            Thank You so much! That would be all 🙂

        • Gillian Lei says:

          Hello again,

          Is there a possibility that, when I rotate, the image will also rotate with me?
          In your code, when I rotate my body slightly, the image changes its size and does not rotate at all.
          Can you help me with that?

        • Gillian Lei says:

          I am trying to rotate the image using 3 joints (Neck, ShoulderLeft, ShoulderRight),
          but when I twist my body, the image becomes smaller because of my code:

          image1.Width = Math.Abs(ShoulderLeftColorPoint.X - ShoulderRightColorPoint.X);

          How can I prevent the image from getting smaller when I twist my body?

  • Mechatronika says:

    Dear Vangos Pterneas

    I am using the skeleton frame to allow a mobile robot to avoid collisions with people in its path.

    Since the camera-space origin is the depth sensor, and the depth sensor is shifted around 8 cm from the Kinect origin, I noticed that the robot's motion is also shifted by 8 cm: closer to the human when it avoids them on the right, and farther when it avoids them on the left.

    I am wondering how I can shift the camera space to the origin of the Kinect sensor.

    The coordinate mappers shift one space to another, but none of them shows how to shift a space to the Kinect origin!

    • Hello. I’m not sure what you are trying to achieve. Kinect is a set of sensors/cameras. The coordinate mapper helps you map the coordinate of one space (color/depth) to the real-world space (called camera space). What would you like to shift?

      • Mechatronika says:

        Dear Vangos,

        Thank you very much for your reply, and I hope I can find answers from you.

        I agree that we can take real-world measurements of a human using the camera space. Still, if we check the 3D coordinates of the camera space, we will find that the coordinate origin is the center of the depth sensor.
        The depth sensor is not located at the center of the Kinect sensor; rather, it is around 6 cm from the center of the Kinect.

        In this case, when we take measurements from skeleton frames, we actually take them from the center of the depth sensor, rather than from the center of the Kinect sensor itself.

        I am working on a mobile robotic application in which the robot has to avoid collisions with humans. Briefly, the robot takes the 3D positions of people and generates collision-free paths between them. I noticed that the robot doesn't follow the planned path directly, because the Kinect provides the 3D measurements of people with a 6 cm shift from the center of the Kinect. When I shift the Kinect 6 cm so that the center of the depth sensor is located at the center of the robot, the robot avoids people accurately.

        In the depth/color coordinates, it is understood that we do this to get the depth of pixels. I am wondering whether there is a coordinate mapper which shifts the camera space to the center of the Kinect sensor!

        At the end, thank you very much for spending your time reading my message 🙂

        • Hello. I understand what you need. Kinect has no coordinate mapper to shift the space, though. You should assume that the center of the device is where the depth sensor is.

          • Mechatronika says:

            Thank you very much again,

            I am trying two solutions, and I wish for your advice.

            I am adding (-5 cm) to the x-values, which is the approximate distance between the depth sensor and the Kinect's center, while keeping the Z values, since Z is not the linear distance between the depth sensor and the body but the direct distance between the Kinect plane and the body plane (I don't need the Y values in my application).
            Do you think this will shift the measurements to the Kinect's center? I don't think the drawing is understandable, but it might help show what I am thinking about:
            . ________ Human
            . /
            z /
            . /
            . /
            . /
            . /
            . /
            .
            –D-5->C———
            ——————— The kinect…

            The other solution is to shift the Kinect sensor entirely to the right, so that the depth sensor's center is at the robot's center!

            Sorry again for disturbing you..

          • Hello, no problem. You’d better go with the safest way: move the physical device. Coordinate Mapping won’t work accurately with different injected values.

  • Mechatronika says:

    Thank you very much for your fruitful support 🙂

  • almosca says:

    Hi Vangos,
    your blog is amazing. I've been working with the Kinect v2 for one week and have already found a lot of answers in it. But I have a question:
    is it possible to get the orientation of a joint (e.g. HandRight) with respect to the Kinect frame (or camera frame)?

  • almosca says:

    thanks a lot!!!

  • Michael says:

    Hi Vangos, is there a chance that you could share code where the user has to strike a pose before the GUI is enabled, and when the user leaves and is no longer tracked, the GUI is hidden?

  • Ibrahim Seleem says:

    Hello Vangos Pterneas,

    I actually measured the (x, y) joint positions of a human skeleton in the sagittal plane. Now, I want to compute the angle between the Kinect v2 and the skeleton's direction of motion (like in this figure: http://www.mediafire.com/file/7wf8890ngnmi1d4/kinect.pdf ).

    How can I measure a joint position relative to a coordinate frame fixed on a certain joint of the skeleton, like the SpineBase position?

    What is the transformation required to do that?

    • Hi Ibrahim. You can use the exact same technique: measure the distances in the real world (X, Y, Z) and map them to the color or infrared space. 3D calculations and measurements are the same.
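      As a minimal sketch of such a transformation (the coordinates below are hypothetical camera-space values in meters), you could subtract the SpineBase position component-wise so that SpineBase becomes the origin of the new frame:

```csharp
using System;

class RelativeJointSketch
{
    static void Main()
    {
        // Hypothetical camera-space positions (X, Y, Z) in meters,
        // as they would come from Joint.Position.
        double[] handRight = { 0.50, 0.25, 2.00 };
        double[] spineBase = { 0.00, -0.25, 2.00 };

        // Joint position relative to a frame centered on SpineBase:
        // a simple translation (no rotation), done component-wise.
        double[] relative = new double[3];
        for (int i = 0; i < 3; i++)
        {
            relative[i] = handRight[i] - spineBase[i];
        }

        Console.WriteLine($"({relative[0]}, {relative[1]}, {relative[2]})");
    }
}
```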

  • hanan says:

    I have a question

    can we count the number of pixels in any frame that I get from the Kinect?

    • Hello. I am not sure what you mean. The number of pixels is 1920×1080 per color frame and 512×424 per depth frame. You can access the width and height of a frame using the FrameDescription property.
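      As a sketch, using the default v2 frame sizes (at runtime, read them from frame.FrameDescription instead of hard-coding):

```csharp
using System;

class PixelCountSketch
{
    static void Main()
    {
        // Default Kinect v2 frame sizes; at runtime these come from
        // frame.FrameDescription.Width and frame.FrameDescription.Height.
        int colorWidth = 1920, colorHeight = 1080;
        int depthWidth = 512, depthHeight = 424;

        Console.WriteLine(colorWidth * colorHeight); // 2073600 pixels per color frame
        Console.WriteLine(depthWidth * depthHeight); // 217088 pixels per depth frame
    }
}
```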

  • Derik says:

    Hey Vangos,
    can the Kinect v2 camera view become smaller?
    For example, in the LightBuzz application the Kinect v2 camera view is smaller.

  • Kelly says:

    Hi, I just want to know how I can show a pop-up message when the user is not tracked. How can I implement the MessageBox.Show code?

  • Bill says:

    Hi Vangos,

    What is the output of a Kinect SDK v2 recording?
    Is it only depth data?

    Thank you

  • Sid says:

    Hi Vangos. Thanks for all your help thus far. I had a couple of questions, since I'm new to object-oriented programming. In the code, the lines with 'canvas', such as
    canvas.Children.Clear();
    produce errors because the compiler can't recognize canvas. Where have you defined that?

    Secondly, I'm attempting to obtain joint angle changes over time. I saw that you suggested using Vitruvius; however, I'm not sure how to incorporate that into the code I've already written. Is there an easier way to extract joint angles from the sample code you posted, what you called the KinectCoordinateMapping class?

    • Sid says:

      Update: I figured out that I needed to add a Canvas element to the XAML file, so I solved that problem.
      New problem: ToBitmap isn’t recognized, and I noticed you have it in a Utilities class. How do I add that to the Project?
      Also still need help on the joint angle calculations as mentioned above. Thanks so much in advance.

    • Hello Sid. To add the Utilities class into your project, simply drag-and-drop it into your Visual Studio solution.

      To measure the angles, consider using Vitruvius. There is a how-to guide here:
      https://github.com/LightBuzz/Vitruvius

      • Sid says:

        Alright, I'll do that. I also just realized (correct me if I'm wrong) that all your work is meant for a single image grabbed by the Kinect at an instant of time. I am attempting to map joints to a continuously moving video streamed through the Kinect over time. That is, the user presses some sort of "Start" button, the Kinect records the feed and maps joints, which move with the body's movement, until the user presses "Stop" and the Kinect stops streaming. All this time, I want to record joint angle changes over time. Any idea if that takes a whole different approach or is an easy edit from here? Thanks again.

        • Hi Sid. Video recording is a little different and much more complex than displaying an RGB frame. You'll need to store the Kinect data in binary format (for performance reasons) and read them on the fly during playback.

          We have added video recording in Vitruvius. You may give it a look 🙂

          • Sid says:

            Thanks Vangos. One more question – is there an easy way to draw lines (bones) between the joints?

            I was thinking of saving the previous point location on each iteration and use that as the (X1,Y1) coordinate of the line. And the new point would serve as the (X2,Y2). But I’m not sure what to specify as (X1,Y1) for the first iteration. Or is there an in-built function I can make use of?

          • Hello Sid,

            The initial coordinates of the Line element could be set to (0, 0). This way, the line would be invisible. After you have a Body to display, you could do this:

            Line.X1 = point1.X;
            Line.Y1 = point1.Y;
            Line.X2 = point2.X;
            Line.Y2 = point2.Y;

            For your convenience, I have also developed the KinectViewer control that handles the lines and points internally.

  • Geucimar says:

    Hi Pterneas,

    I've searched a lot but didn't find anything about how to convert depth pixel data from ushort to distance in float. The only solution I've found is to use CoordinateMapper and map the depth data to camera space, like the snippet below:

    CameraSpacePoint[] cameraSpacePoints = new CameraSpacePoint[depthData.Length];
    CoordinateMapper cm = sensor.CoordinateMapper;
    cm.MapDepthFrameToCameraSpace(depthData, cameraSpacePoints);

    for (int i = 0; i < depthData.Length; i++)
    {
        CameraSpacePoint cameraSpacePoint = cameraSpacePoints[i];
        float distance = cameraSpacePoint.Z;
    }

    Do you know any solution used to convert data from ushort to float directly and give the value in meters?

    for (int i = 0; i < depthData.Length; i++)
    {
        ushort depth = depthData[i];
        float distance = ?????????? depth;
    }

    • Hello Geucimar. The ushort values are the distances in millimeters. So, to find the distance in meters, all you have to do is divide by 1000. No need to use the coordinate mapper.
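      For example, a minimal sketch of the direct conversion (the sample millimeter values are made up):

```csharp
using System;

class DepthToMetersSketch
{
    static void Main()
    {
        // Sample raw depth values in millimeters, as read from a depth frame.
        ushort[] depthData = { 1500, 2375 };

        float[] distances = new float[depthData.Length];
        for (int i = 0; i < depthData.Length; i++)
        {
            // Each ushort already holds millimeters; divide by 1000 for meters.
            distances[i] = depthData[i] / 1000f;
        }

        Console.WriteLine(distances[0]); // 1.5
        Console.WriteLine(distances[1]); // 2.375
    }
}
```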

  • Evan Murder says:

    Hello Vangos,
    Vitruvius looks pretty interesting! Does it work with the original Kinect for Xbox 360?

  • Kwon says:

    Hello Vangos,
    I want to know how the Z value is measured on Kinect.

    E.g., I have a big square (width: 2 m, height: 1 m). Are the Z values at the top-left, top-middle, and top-right the same or different?

    Please reply. Thanks.

    • Hello. If the square is parallel to the Kinect's vertical plane, the Z values would be the same. The Z values are in the ushort[] array of the depth frame (measured in millimeters).

  • amir says:

    Hello Vangos,

    Thank you for your awesome post.

    I wanted to extract the body joint positions in each frame of a recorded .XEF file. I found there is a Body class and a BodyFrame stream, but I cannot extract the positions of the body joints. Can you please help me?

    Thanks
    Amir

  • […] Understanding Kinect Coordinate Mapping (For v2 but good for learning about mechanics) […]

Leave a Reply