During the past few days, I have been working on a really cool personal project: finger tracking using Kinect version 2! As always, I’m providing the source code to you.
If you want to understand how things work, keep reading!
Prerequisites
- Kinect for XBOX v2 sensor with an adapter (or Kinect for Windows v2 sensor)
- Kinect for Windows v2 SDK
- Windows 8.1 or higher
- Visual Studio 2013 or higher
- A dedicated USB 3 port
Finger Tracking Algorithm
While developing the project, I had the following facts in mind: the algorithm should be able to track both hands and also know who those hands belong to. The Kinect SDK provides us with information about the human bodies. The finger tracking algorithm should be able to extend this functionality and assign fingers to specific Body objects.
Existing finger tracking algorithms simply process the depth frame and search for fingers within a huge array (512×424) of data. Most of the existing solutions provide no information about the body; they only focus on the fingers. I did something different: instead of searching the whole 512×424 array of data, I only search the portion of the array around the hands! This way, the search area is reduced tremendously! Moreover, I know who the hands belong to.
So, how does the finger tracking algorithm work?
Step 1 – Detect the Hand joints
This is important. The algorithm starts by detecting the Hand joints and the HandStates of a given Body. This way, the algorithm knows whether a joint is properly tracked and whether it makes sense to search for fingers. For example, if the HandState is Closed, there are no fingers visible.
If you’ve been following my blog for a while, you already know how to detect the human body joints:
Joint handLeft = body.Joints[JointType.HandLeft];
Joint handRight = body.Joints[JointType.HandRight];
If you need more information about how Kinect programming works, read my previous Kinect articles.
Step 2 – Specify the search area
Since we have detected the Hand joints, we can now limit the search area. The algorithm only searches within a reasonable distance from the hand. What exactly is a “reasonable” distance? Well, I have chosen to search within the 3D area that is bounded by the Hand and HandTip joints (approximately 10-15 cm).
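As a rough sketch (illustrative only, not the library's actual implementation), the search radius could be derived from the distance between those two joints:
// Positions of the Hand and HandTip joints in 3D camera space (meters).
CameraSpacePoint hand = body.Joints[JointType.HandRight].Position;
CameraSpacePoint tip = body.Joints[JointType.HandTipRight].Position;

// Euclidean distance between the two joints (roughly 0.10 to 0.15 m for an open hand).
double searchRadius = Math.Sqrt(
    Math.Pow(tip.X - hand.X, 2) +
    Math.Pow(tip.Y - hand.Y, 2) +
    Math.Pow(tip.Z - hand.Z, 2));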
Step 3 – Find the contour
This is the most interesting step. Since we have strictly defined the search area in 3D space, we can now exclude any depth values that do not fall within the desired range! As a result, every depth value that does not belong to a hand is rejected. We are left with an almost perfect shape of a hand. The outline of this shape is the contour of the hand!
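Continuing the sketch above (again illustrative; depthFrame is the current depth frame, coordinateMapper comes from the sensor's CoordinateMapper property, and hand and searchRadius were computed in the previous snippet), the filtering could look like this:
// Map every depth pixel to 3D camera space so we can measure its distance from the hand.
ushort[] depthData = new ushort[512 * 424];
depthFrame.CopyFrameDataToArray(depthData);

CameraSpacePoint[] cameraPoints = new CameraSpacePoint[depthData.Length];
coordinateMapper.MapDepthFrameToCameraSpace(depthData, cameraPoints);

bool[] belongsToHand = new bool[depthData.Length];

for (int i = 0; i < cameraPoints.Length; i++)
{
    CameraSpacePoint point = cameraPoints[i];

    double distance = Math.Sqrt(
        Math.Pow(point.X - hand.X, 2) +
        Math.Pow(point.Y - hand.Y, 2) +
        Math.Pow(point.Z - hand.Z, 2));

    // Reject every depth value that falls outside the search area around the hand.
    belongsToHand[i] = distance <= searchRadius;
}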
Step 4 – Find the Convex Hull
The contour of a hand is a big set of points. However, only five (or fewer) of these points correspond to valid fingertips. The fingertips are vertices of the smallest convex polygon that contains all of the contour points. In Euclidean geometry, this is called the “convex hull”.
Consequently, the vertices of the convex hull above the wrist define, well, the fingertips.
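For reference, here is a textbook convex hull computation (Andrew's monotone chain) over 2D depth points. This is only a generic sketch, not the library's actual implementation; it assumes System.Linq and System.Collections.Generic are imported:
// Cross product of vectors OA and OB; a positive value means a counter-clockwise turn.
private static double Cross(DepthSpacePoint o, DepthSpacePoint a, DepthSpacePoint b)
{
    return (a.X - o.X) * (b.Y - o.Y) - (a.Y - o.Y) * (b.X - o.X);
}

// Andrew's monotone chain convex hull over a set of 2D points.
private static List<DepthSpacePoint> ConvexHull(IEnumerable<DepthSpacePoint> points)
{
    var sorted = points.OrderBy(p => p.X).ThenBy(p => p.Y).ToList();
    if (sorted.Count < 3) return sorted;

    var hull = new List<DepthSpacePoint>();

    // Lower hull.
    foreach (var p in sorted)
    {
        while (hull.Count >= 2 && Cross(hull[hull.Count - 2], hull[hull.Count - 1], p) <= 0)
        {
            hull.RemoveAt(hull.Count - 1);
        }
        hull.Add(p);
    }

    // Upper hull.
    int lowerCount = hull.Count + 1;
    for (int i = sorted.Count - 2; i >= 0; i--)
    {
        var p = sorted[i];
        while (hull.Count >= lowerCount && Cross(hull[hull.Count - 2], hull[hull.Count - 1], p) <= 0)
        {
            hull.RemoveAt(hull.Count - 1);
        }
        hull.Add(p);
    }

    // The last point is the same as the first one.
    hull.RemoveAt(hull.Count - 1);

    return hull;
}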
So, you now understand how the algorithm works. Let’s see how to use it.
How-to-use
While building this project, my goal was simple: I did not want you to mess with all of the complexity. So, I encapsulated the whole algorithm above into a handy class: HandsController.
Finger Tracking is under the LightBuzz.Vitruvius.FingerTracking namespace. This namespace should be imported whenever you need to use the finger tracking capabilities.
using LightBuzz.Vitruvius.FingerTracking;
Everything is encapsulated into the HandsController class. To use the HandsController class, first create a new instance:
private HandsController _handsController = new HandsController();
You can specify whether the controller will detect the left hand (DetectLeftHand property), the right hand (DetectRightHand property), or both hands. By default, the controller tracks both hands.
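For example, to track only the right hand, you could set:
_handsController.DetectLeftHand = false;
_handsController.DetectRightHand = true;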
Then, you’ll need to subscribe to the HandsDetected event. This event is raised when a new set of hands is detected.
_handsController.HandsDetected += HandsController_HandsDetected;
Next, you have to update the HandsController with Depth and Body data. You'll need a DepthReader and a BodyReader (check the sample project for more details).
private void DepthReader_FrameArrived(object sender, DepthFrameArrivedEventArgs e)
{
    using (DepthFrame frame = e.FrameReference.AcquireFrame())
    {
        if (frame != null)
        {
            using (KinectBuffer buffer = frame.LockImageBuffer())
            {
                _handsController.Update(buffer.UnderlyingBuffer, _body);
            }
        }
    }
}
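The _body field used above comes from a BodyFrameReader. Here is a minimal sketch of such a handler (the sample project may select the body differently):
private Body[] _bodies = null;
private Body _body = null;

private void BodyReader_FrameArrived(object sender, BodyFrameArrivedEventArgs e)
{
    using (BodyFrame frame = e.FrameReference.AcquireFrame())
    {
        if (frame != null)
        {
            if (_bodies == null)
            {
                _bodies = new Body[frame.BodyCount];
            }

            frame.GetAndRefreshBodyData(_bodies);

            // Keep a reference to the first tracked body.
            foreach (Body body in _bodies)
            {
                if (body != null && body.IsTracked)
                {
                    _body = body;
                    break;
                }
            }
        }
    }
}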
Finally, you can access the finger data by handling the HandsDetected event:
private void HandsController_HandsDetected(object sender, HandCollection e)
{
    if (e.HandLeft != null)
    {
        // Contour in the 2D depth space.
        var depthPoints = e.HandLeft.ContourDepth;

        // Contour in the 2D color space.
        var colorPoints = e.HandLeft.ContourColor;

        // Contour in the 3D camera space.
        var cameraPoints = e.HandLeft.ContourCamera;

        foreach (var finger in e.HandLeft.Fingers)
        {
            // Finger tip in the 2D depth space.
            var depthPoint = finger.DepthPoint;

            // Finger tip in the 2D color space.
            var colorPoint = finger.ColorPoint;

            // Finger tip in the 3D camera space.
            var cameraPoint = finger.CameraPoint;
        }
    }

    if (e.HandRight != null)
    {
        // Do something with the data...
    }
}
And… That’s it! Everything is encapsulated into a single component!
Need the position of a finger in 3D space? Just use the CameraPoint property. Need to display a finger point in the Depth or Infrared space? Use the DepthPoint. Want to display a finger point in the Color space? Use the ColorPoint.
Til next time… Keep Kinecting!
PS: Vitruvius
If you enjoyed this post, consider checking Vitruvius. Vitruvius is a set of powerful Kinect extensions that will help you build stunning Kinect apps in minutes. Vitruvius includes avateering, HD Face, background removal, angle calculations, and more. Check it now.
The code is brilliant. I have run the solution and it works wonderfully. Just a question: I am currently working on a project that tracks hands, detects fingertips, and then traces one fingertip's location to write in the air and run character recognition on the trace.
I was wondering if it would be possible to record the trace of just one fingertip, let's say the right hand's index finger?
Hi George. Thank you for your comment. To identify each finger type, you need to e.g. sort the fingers based on their distance from the thumb or from the palm. This way, you'll be able to track a specific tip from frame to frame and draw its trace.
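Here is a rough sketch of that idea (assuming the HandCollection event data shown in the article, a tracked _body, and System.Linq); the ordering criterion is just the suggestion above, not part of the library:
// Palm position of the right hand in 3D camera space.
CameraSpacePoint palm = _body.Joints[JointType.HandRight].Position;

// Sort the detected fingertips by their 3D distance from the palm.
var sortedFingers = e.HandRight.Fingers
    .OrderBy(finger =>
    {
        CameraSpacePoint tip = finger.CameraPoint;

        return Math.Sqrt(
            Math.Pow(tip.X - palm.X, 2) +
            Math.Pow(tip.Y - palm.Y, 2) +
            Math.Pow(tip.Z - palm.Z, 2));
    })
    .ToList();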
1) Good algorithm, but I've noticed one thing: it's not very fast.
Why didn't you try tracking the fingers through machine learning, like the rest of the body joints?
2) Is there a way to get the finger joints too and not just the tips?
Hi Mughees.
1) The algorithm is fast (it's actually one of the fastest algorithms out there). If you are getting a delay, that's because you are drawing a lot of ellipses on your Canvas. To reduce the lag, simply turn on the “Optimize Code” checkbox in your Project Properties.
2) Right now, you can only detect the tips.
Congratulations on your work, it's really awesome what you've done. Unfortunately, I have problems running the application. When I try to run the program, the following error message appears: “visual studio cannot start debugging because the debug target is missing ‘C:/Users/ etc etc..’ Please build the project and retry, or set the OutputPath and AssemblyName properties appropriately to point at the correct location for the target assembly”.
May you help me to resolve this issue?
Thank you very much.
Hi Luis. Right-click the project named “KinectFingerTracking” and select “Set as startup project”. Then, hit Debug to run the project again. I guess that would solve your problem.
Have you achieved this in a Unity3D project too?
Hi Muhammad. The Unity code should be the same. You’ll only need to replace the XAML code with Unity Game Objects. Did you face any problems with this?
Hi, sir.
Can you show me how to import this project into Unity 3D?
Have you imported it into your Unity project yet?
If yes, can I have a sample project? I'm having trouble importing what he did.
No, I haven’t imported it yet.
Sorry
Hello Mr. Pterneas,
Can I ask you: after drawing the fingertips, is there any way I can compare the distance between them to know whether the fingers are holding something or not?
Thank you very much.
Hello Hannan. You can definitely compare the distance between the fingers. The fingers are given as CameraSpacePoints. So, you can use the following code to measure their distance, or simply include Vitruvius in your project.
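For example, a plain Euclidean distance between two CameraSpacePoints does the job:
// Returns the 3D distance, in meters, between two fingertip points.
public static double Distance(CameraSpacePoint point1, CameraSpacePoint point2)
{
    return Math.Sqrt(
        Math.Pow(point1.X - point2.X, 2) +
        Math.Pow(point1.Y - point2.Y, 2) +
        Math.Pow(point1.Z - point2.Z, 2));
}

// Example: double distance = Distance(finger1.CameraPoint, finger2.CameraPoint);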
Hello,
Can you explain the following to me, and why it is used?
LockImageBuffer()
buffer
UnderlyingBuffer
Thanks
Hello Hanan. The UnderlyingBuffer is the native representation of the depth information. It's equivalent to using the ushort[] array, just a little faster than using CopyFrameDataToArray. All it contains is a pointer to an array of depth values. Locking the buffer before accessing it is necessary to avoid leaks.
So, why do we need to convert the frame to a bitmap image?
Because you need to convert the byte array to a Bitmap object; otherwise, you won't be able to display it. XAML uses BitmapImage and WriteableBitmap. Unity uses Texture2D. Each class is a “wrapper” around byte arrays.
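For example, in WPF/XAML a BGRA byte array could be displayed roughly like this (a minimal sketch; pixels is the converted byte array and image is an Image element, both placeholders):
// Wrap the byte array into a WriteableBitmap so a XAML Image element can display it.
WriteableBitmap bitmap = new WriteableBitmap(512, 424, 96.0, 96.0, PixelFormats.Bgra32, null);

bitmap.WritePixels(new Int32Rect(0, 0, 512, 424), pixels, 512 * 4, 0);

image.Source = bitmap;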
Hi, Mr. Pterneas:
It's a really brilliant method, but it seems it can only be used in some special situations with a convex hull. I'm a student who is also researching Kinect-based finger tracking in China. Could you show me any idea for tracking the other finger joints?
Hello Neo. Thanks for your comment. To perform accurate finger joint tracking for the other joints, you’d better use the Leap Motion sensor. It’s much more accurate than Kinect.
Hello!
Can you do something similar with kinect 360 (v1) , with your code?
How can I determine whether the index finger is pointing at the camera or not, e.g. by calculating the distance?
Hi Abdulahfiz. That could get really complicated. It’s a gesture that could be easily perceived in a lot of different ways. Do you have a particular example in mind?
Thanks for your attention, sir. I am thinking of developing something like, for example, counting fingers… just like this demo on YouTube:
https://www.youtube.com/watch?v=tRiLaPwN2BU
You can count the number of the fingers by using the following code:
var count = e.HandLeft.Fingers.Count;
Mr. Pterneas,
I am a student researching Kinect. Please help me: I want to display only the hand within the contour, not the body and everything else, and then I want to capture this image.
Thanks a lot for your useful tutorials.
Hello. You can simply toggle the depth display off and avoid feeding the Image element with depth bitmaps.
I am new to Kinect and I do not completely understand you.
Could you please express it in code?
It is quite a lot of code to write it all down. You need to process the depth frame (an array of ushort values) and select the values that correspond to the hands. Copy those values to a new array of the same size, while leaving the remaining values equal to zero.
This way, you’ll only display the points that belong to the hands and remove everything else.
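Here is a rough sketch of that idea (illustrative only; it simply compares each depth value with the depth of the right Hand joint, converted to millimeters):
// Depth of the right hand joint, converted from meters to millimeters.
int handDepth = (int)(body.Joints[JointType.HandRight].Position.Z * 1000);

ushort[] depthData = new ushort[512 * 424];
depthFrame.CopyFrameDataToArray(depthData);

// New array of the same size; everything that is not a hand stays zero.
ushort[] handData = new ushort[depthData.Length];

for (int i = 0; i < depthData.Length; i++)
{
    // Keep only the pixels whose depth is within roughly 15 cm of the hand joint.
    if (Math.Abs(depthData[i] - handDepth) < 150)
    {
        handData[i] = depthData[i];
    }
}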
Thanks a lot, it helped me. I got the ushort[] of the depth frame, but how do I get the ushort[] of the hand data? I did not find any reference about getting that ushort[] on the internet. Could you please tell me how to get these values?
The ushort[] array is the result of the depth frame:
var depthData = new ushort[512 * 424];
depthFrame.CopyFrameDataToArray(depthData);
I got it, but the ushort[] of hand values is the one I cannot get.
This is how to get the ushort values that correspond to the finger points.
Hi Vangos,
Thanks for this great post. If I understand correctly, your solution basically extends the standard Kinect skeleton by adding a few new points to the built-in Hand joints. I am trying to create a similar effect to the one you have created here, but with a few key differences: the user is sitting at a table with their arms lying on the table, the Kinect is placed over the left shoulder of the user, and the camera is facing the left hand. I want to capture only the left arm (from elbow to fingertips) and display it in a Unity scene as a skeleton in one scenario and as the real-arm camera feed in the other. The user will view the replication in front of them in VR, which is why the camera angle is important. Am I correct in thinking that applying your solution would therefore not work, since I do not display enough of the user's body for it (and therefore the accompanying joints) to actually be recognized internally, or do you think it is still possible somehow?
Hi Nina. Correct, the current solution will not work on your use-case scenario. Consider checking OpenCV hand tracking or EmguCV hand tracking.
Hello sir. I just want to know how you managed to separate the hand from the background. Thank you so much.
Hi Jessa. To separate the hand from the background, we extracted the depth pixels that are close to the HandLeft and HandRight joints. This is done by simply comparing the Z/depth values.
Mr Pterneas,
I am a student researching the Kinect, and I was wondering if it is possible to use code similar to yours, but with finger joints, to control a robot's hand. Also, is there any way to reduce the lag, because every time I start the code it starts freezing and stopping?
Thank you for this useful tutorial.
Hello Bob. I assume the lagging is caused by the specs of the computer or some heavy internal computations. Try running the code in Release mode (instead of Debug). This should significantly increase performance.
Can it possibly read a finger motion?
This will not load for me. I am using Visual Studio 2017 with a Kinect 2.0; when I try to open it in Visual Studio, it does not work. Any suggestions?
Hi – would this work with the Kinect v3 over a circular table, pointing down, while people around the table put their hands on it to touch things?
I've tried this myself with no success, as it can't see a skeleton properly from directly above.
cheers,
Jerry
Hello, thank you for the tutorial. But I have a question… How do I implement the algorithm into HandsController?
Hi Ian. You can download the source code with the implementation of the algorithm here.
Yes, I did. But why doesn't this code mark the contour of my hands and fingers like in the video?
Do I need to implement parts of the code under the “How-to-use” section into the source code of the project?
You can simply draw the hand.ContourDepth or hand.ContourColor points.
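For example, in WPF you could draw each contour point as a small ellipse on a Canvas (a rough sketch; canvas is an assumed Canvas element in your XAML, and the code runs inside the HandsDetected handler after checking that e.HandLeft is not null):
canvas.Children.Clear();

foreach (var point in e.HandLeft.ContourDepth)
{
    Ellipse ellipse = new Ellipse
    {
        Width = 2.0,
        Height = 2.0,
        Fill = Brushes.LimeGreen
    };

    Canvas.SetLeft(ellipse, point.X - ellipse.Width / 2.0);
    Canvas.SetTop(ellipse, point.Y - ellipse.Height / 2.0);

    canvas.Children.Add(ellipse);
}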
Hello Mr. Pterneas,
Can I ask you: after drawing the fingertips, how can I track the position of the index finger, sort the fingers based on their distance from the thumb, and draw a line by tracing the index finger using the Kinect sensor in C#?
Thank you very much.
Hi Rola. There is no way to identify a finger’s type (e.g., index, thumb, etc). You could search for an AI-based finger-tracking solution, instead.
Hello Mr. Pterneas,
First of all, thanks for sharing your project with us. I tried to open the code in Visual Studio 2022. However, I couldn't load the project because .NET Framework 4.5 is no longer supported. What can I do to run the code?
Hi Eylül and thanks for your comment. Here's what you can do (source):
- Download Microsoft.NETFramework.ReferenceAssemblies.net45 from nuget.org.
- Open the package as a zip.
- Copy the files from build\.NETFramework\v4.5\ to C:\Program Files (x86)\Reference Assemblies\Microsoft\Framework\.NETFramework\v4.5.