In my previous article, I demonstrated how you can access the 2D positions of the eyes, nose, and mouth using Microsoft’s Kinect Face API. The Face API provides us with some basic, yet impressive, functionality: we can detect the X and Y coordinates of a handful of facial points and identify a few facial expressions using just a few lines of C# code. This is pretty cool for basic applications, like Augmented Reality games, but what if you need more advanced functionality in your app?
Recently, we decided to extend our Kinetisense project with advanced facial capabilities. More specifically, we needed access to more facial points, including the lips, jaw, and cheeks. Moreover, we needed the X, Y, and Z position of each point in the 3D space. The Kinect Face API could not help us, since it was too limited for our scope of work.
Thankfully, Microsoft has implemented a second Face API within the latest Kinect SDK v2. This API is called HD Face and is designed to blow your mind!
At the time of writing, HD Face is the most advanced face tracking library out there. Not only does it detect the human face, but it also allows you to access over 1,000 facial points in the 3D space. Real-time. Within a few milliseconds. Not convinced? I developed a basic program that displays all of these points. Creepy, huh?!
In this article, I am going to show you how to access all these points and display them on a canvas. I’ll also show you how to use Kinect HD Face efficiently and get the most out of it.
Prerequisites
- Kinect for XBOX v2 sensor with an adapter (or Kinect for Windows v2 sensor)
- Kinect for Windows v2 SDK
- Windows 8.1 or higher
- Visual Studio 2013 or higher
- A dedicated USB 3 port
Source Code
Tutorial
Although Kinect HD Face is truly powerful, you’ll notice that it’s badly documented, too. Insufficient documentation makes it hard to understand what’s going on inside the API. Actually, this is because HD Face is supposed to provide advanced, low-level functionality. It gives us access to raw facial data. We, the developers, are responsible for properly interpreting the data and using it in our applications. Let me guide you through the whole process.
Step 1: Create a new project
Let’s start by creating a new project. Launch Visual Studio and select File -> New Project. Select C# as your programming language and choose either the WPF or the Windows Store app template. Give your project a name and start coding.
Step 2: Import the required assemblies
To use Kinect HD Face, we need to import 2 assemblies: Microsoft.Kinect.dll and Microsoft.Kinect.Face.dll. Right click your project name and select “Add Reference”. Navigate to the Extensions tab and select those assemblies.
If you are using WinRT, Microsoft.Kinect is called WindowsPreview.Kinect.
Step 3: XAML
The user interface is pretty simple. Open your MainWindow.xaml or MainPage.xaml file and place a drawing canvas within your grid. Preferably, you should add the canvas within a Viewbox element. The Viewbox element will let your Canvas scale proportionally as the window size changes. No additional effort from your side.
<Viewbox Grid.Row="1">
<Canvas Name="canvas" Width="512" Height="424" />
</Viewbox>
Step 4: Declare the Kinect HD Face objects
After typing the XAML code, open the corresponding C# file (MainWindow.xaml.cs or MainPage.xaml.cs) and import the Kinect namespaces.
For .NET 4.5, import the following:
using Microsoft.Kinect;
using Microsoft.Kinect.Face;
For WinRT, import the following:
using WindowsPreview.Kinect;
using Microsoft.Kinect.Face;
So far, so good. Now, let’s declare the required objects. As in Kinect Face Basics, we need to define the proper body source, body reader, HD face source, and HD face reader:
// Provides a Kinect sensor reference.
private KinectSensor _sensor = null;
// Acquires body frame data.
private BodyFrameSource _bodySource = null;
// Reads body frame data.
private BodyFrameReader _bodyReader = null;
// Acquires HD face data.
private HighDefinitionFaceFrameSource _faceSource = null;
// Reads HD face data.
private HighDefinitionFaceFrameReader _faceReader = null;
// Required to access the face vertices.
private FaceAlignment _faceAlignment = null;
// Required to access the face model points.
private FaceModel _faceModel = null;
// Used to display 1,000 points on screen.
private List<Ellipse> _points = new List<Ellipse>();
Step 5: Initialize Kinect and body/face sources
As usual, we’ll first need to initialize the Kinect sensor, as well as the frame readers. HD Face works just like any ordinary frame: we need a face source and a face reader. The face reader is initialized using the face source. The reason we need a Body source/reader is that each face corresponds to a specific body. You can’t track a face without tracking its body first. The FrameArrived event will fire whenever the sensor has new face data to give us.
_sensor = KinectSensor.GetDefault();
if (_sensor != null)
{
// Listen for body data.
_bodySource = _sensor.BodyFrameSource;
_bodyReader = _bodySource.OpenReader();
_bodyReader.FrameArrived += BodyReader_FrameArrived;
// Listen for HD face data.
_faceSource = new HighDefinitionFaceFrameSource(_sensor);
_faceReader = _faceSource.OpenReader();
_faceReader.FrameArrived += FaceReader_FrameArrived;
_faceModel = new FaceModel();
_faceAlignment = new FaceAlignment();
// Start tracking!
_sensor.Open();
}
Step 6: Connect a body with a face
The next step is a little tricky: this is how we connect a body to a face. How do we do this? Simply by setting the TrackingId property of the face source to the TrackingId of the corresponding body.
private void BodyReader_FrameArrived(object sender, BodyFrameArrivedEventArgs e)
{
using (var frame = e.FrameReference.AcquireFrame())
{
if (frame != null)
{
Body[] bodies = new Body[frame.BodyCount];
frame.GetAndRefreshBodyData(bodies);
Body body = bodies.Where(b => b.IsTracked).FirstOrDefault();
if (!_faceSource.IsTrackingIdValid)
{
if (body != null)
{
_faceSource.TrackingId = body.TrackingId;
}
}
}
}
}
So, we have connected a face with a body. Let’s access the face points now.
Step 7: Get and update the facial points!
Dive into the FaceReader_FrameArrived event handler. We need to check two conditions before accessing any data. First, we need to ensure that the frame is not null. Second, we ensure that the frame has a tracked face. Once both conditions hold, we can call the GetAndRefreshFaceAlignmentResult method, which updates the facial points and properties.
The facial points are given as an array of vertices. A vertex is a 3D point (with X, Y, and Z coordinates) that describes the corner of a geometric triangle. We can use vertices to construct a 3D mesh of the face. For the sake of simplicity, we’ll simply draw the points on a 2D canvas instead of building a mesh. Microsoft’s SDK Browser contains a 3D mesh of the face you can experiment with.
private void FaceReader_FrameArrived(object sender, HighDefinitionFaceFrameArrivedEventArgs e)
{
using (var frame = e.FrameReference.AcquireFrame())
{
if (frame != null && frame.IsFaceTracked)
{
frame.GetAndRefreshFaceAlignmentResult(_faceAlignment);
UpdateFacePoints();
}
}
}
private void UpdateFacePoints()
{
if (_faceModel == null) return;
var vertices = _faceModel.CalculateVerticesForAlignment(_faceAlignment);
}
As you can see, vertices is a list of CameraSpacePoint objects. CameraSpacePoint is a Kinect-specific structure that describes a point in the 3D space.
Hint: we have already used CameraSpacePoints when we performed body tracking.
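For illustration, here is a minimal sketch (using the fields declared earlier) of reading the 3D coordinates of a single vertex; index 18 is the nose tip, taken from the HighDetailFacePoints table further down this article:
var vertices = _faceModel.CalculateVerticesForAlignment(_faceAlignment);
if (vertices.Count > 0)
{
    // Index 18 corresponds to the nose tip (see the table of common facial points below).
    CameraSpacePoint noseTip = vertices[18];
    float x = noseTip.X; // meters, horizontal offset from the sensor
    float y = noseTip.Y; // meters, vertical offset from the sensor
    float z = noseTip.Z; // meters, distance from the sensor
}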
Step 8: Draw the points on screen
And now, the fun part: we have a list of CameraSpacePoint objects and a list of Ellipse objects. We’ll add the ellipses to the canvas and specify their exact X and Y positions.
Caution: The X, Y, and Z coordinates are measured in meters! To properly find the corresponding pixel values, we’ll use Coordinate Mapper. Coordinate Mapper is a built-in mechanism that converts 3D space positions into 2D screen positions.
private void UpdateFacePoints()
{
if (_faceModel == null) return;
var vertices = _faceModel.CalculateVerticesForAlignment(_faceAlignment);
if (vertices.Count > 0)
{
if (_points.Count == 0)
{
for (int index = 0; index < vertices.Count; index++)
{
Ellipse ellipse = new Ellipse
{
Width = 2.0,
Height = 2.0,
Fill = new SolidColorBrush(Colors.Blue)
};
_points.Add(ellipse);
}
foreach (Ellipse ellipse in _points)
{
canvas.Children.Add(ellipse);
}
}
for (int index = 0; index < vertices.Count; index++)
{
CameraSpacePoint vertex = vertices[index];
DepthSpacePoint point = _sensor.CoordinateMapper.MapCameraPointToDepthSpace(vertex);
// Skip points that cannot be mapped to the depth frame.
if (float.IsInfinity(point.X) || float.IsInfinity(point.Y)) continue;
Ellipse ellipse = _points[index];
Canvas.SetLeft(ellipse, point.X);
Canvas.SetTop(ellipse, point.Y);
}
}
}
That’s it. Build the application and run it. Stand between 0.5 and 2 meters from the sensor. Here’s the result:
Common facial points
OK, we drew the points on screen. So what? Is there a way to actually understand what each point is? How can we identify where the eyes are? How can we detect the jaw? The API has no built-in mechanism to get a human-friendly representation of all of the face data. We need to handle over 1,000 points in the 3D space manually!
Don’t worry, though. Each one of the vertices has a specific index number. Knowing the index number, you can easily deduce which part of the face it corresponds to. For example, the vertex numbers 1086, 820, 824, 840, 847, 850, 807, 782, and 755 belong to the left eyebrow.
Microsoft includes some commonly-used points as part of the SDK (thanks to Reini Adovics for noting this to me). These points are exposed by the HighDetailFacePoints enumeration:
Key | Index |
HighDetailFacePoints_LefteyeInnercorner | 210 |
HighDetailFacePoints_LefteyeOutercorner | 469 |
HighDetailFacePoints_LefteyeMidtop | 241 |
HighDetailFacePoints_LefteyeMidbottom | 1104 |
HighDetailFacePoints_RighteyeInnercorner | 843 |
HighDetailFacePoints_RighteyeOutercorner | 1117 |
HighDetailFacePoints_RighteyeMidtop | 731 |
HighDetailFacePoints_RighteyeMidbottom | 1090 |
HighDetailFacePoints_LefteyebrowInner | 346 |
HighDetailFacePoints_LefteyebrowOuter | 140 |
HighDetailFacePoints_LefteyebrowCenter | 222 |
HighDetailFacePoints_RighteyebrowInner | 803 |
HighDetailFacePoints_RighteyebrowOuter | 758 |
HighDetailFacePoints_RighteyebrowCenter | 849 |
HighDetailFacePoints_MouthLeftcorner | 91 |
HighDetailFacePoints_MouthRightcorner | 687 |
HighDetailFacePoints_MouthUpperlipMidtop | 19 |
HighDetailFacePoints_MouthUpperlipMidbottom | 1072 |
HighDetailFacePoints_MouthLowerlipMidtop | 10 |
HighDetailFacePoints_MouthLowerlipMidbottom | 8 |
HighDetailFacePoints_NoseTip | 18 |
HighDetailFacePoints_NoseBottom | 14 |
HighDetailFacePoints_NoseBottomleft | 156 |
HighDetailFacePoints_NoseBottomright | 783 |
HighDetailFacePoints_NoseTop | 24 |
HighDetailFacePoints_NoseTopleft | 151 |
HighDetailFacePoints_NoseTopright | 772 |
HighDetailFacePoints_ForeheadCenter | 28 |
HighDetailFacePoints_LeftcheekCenter | 412 |
HighDetailFacePoints_RightcheekCenter | 933 |
HighDetailFacePoints_Leftcheekbone | 458 |
HighDetailFacePoints_Rightcheekbone | 674 |
HighDetailFacePoints_ChinCenter | 4 |
HighDetailFacePoints_LowerjawLeftend | 1307 |
HighDetailFacePoints_LowerjawRightend | 1327 |
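If the enumeration is available in your SDK version, you can use it instead of hard-coding the numbers. A minimal sketch (the vertices collection is indexed by integer, so the enum value is cast to int):
var vertices = _faceModel.CalculateVerticesForAlignment(_faceAlignment);
CameraSpacePoint noseTip = vertices[(int)HighDetailFacePoints.NoseTip];
CameraSpacePoint chinCenter = vertices[(int)HighDetailFacePoints.ChinCenter];
CameraSpacePoint mouthLeft = vertices[(int)HighDetailFacePoints.MouthLeftcorner];
CameraSpacePoint mouthRight = vertices[(int)HighDetailFacePoints.MouthRightcorner];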
If you wish, you can use the Color, Depth, or Infrared bitmap generator and display the camera view behind the face. Keep in mind that simultaneous bitmap and face rendering may cause performance issues in your application. So, handle with care and do not over-use your resources.
Source Code
PS: I’ve been quite silent during the past few months. It was not my intention and I really apologize for that. My team was busy developing the Orthosense app for Intel’s International Competition. We won the GRAND PRIZE and we were featured on USA Today. From now on, I promise I’ll be more active in the Kinect community. Please keep sending me your comments and emails.
Till the next time, enjoy Kinecting!
PS: Vitruvius
If you enjoyed this article, then you’ll love Vitruvius. Vitruvius is a set of powerful Kinect extensions that will help you build stunning Kinect apps in minutes. Vitruvius includes avateering, HD Face, background removal, angle calculations, and more. Check it now.
Want to hire my team for your next cutting-edge fitness app? Drop me an email.
Can u please tell me how to access the various index numbers of the vertices? like i want to know which vertices correspond to the mouth and what are their indices.
Hi Amit,
Here’s what you can do to find the index number of a vertex:
Let me know if you need further help with this.
Excellent guide Vangeli! Thank you.
What about profile views of faces? Is this robust? Does the system return less facial points?
Hi Georgios. Thank you very much for your kind words. HD Face is accurate when you are facing the sensor (frontal view). It remains accurate for approximately 45-60 degrees of head rotation. Expect a lot of false positives when you rotate your head for more than 60 degrees.
Hi there! Thank you for this great tutorial! Would like to ask if you’d know how to use these vertices information to map a picture of another face, say a Celebrity’s facial features, on to one’s face that’s seen from the Kinect. Thank you very much!
Hi Lynn,
I am not sure I understand your use-case scenario. Could you elaborate more, please? Thanks.
have you found (or maybe mapped yourself) vertex numbering for all the HD face points? i just cannot find any docs/”nice picture” on this. MS documentation for SDK 1.8 does have numbering shown https://i-msdn.sec.s-msft.com/dynimg/IC584330.png, hope something like this existed for v2 HD. thanks!
Hi Kroko. This is something I’m working on. There is no official documentation about it.
mkay, i’ll try to generate a map then
quick and dirty http://imgur.com/a/fs3lb. i was only interested in eye corners, thus these closeups, but hope this comes handy. i am using kinect common bridge though, but i doubt the indexes are different from working with “raw SDK” (kind of defeats the purpose of the bridge, right 🙂 )
This is awesome. Thanks for sharing!
enum _HighDetailFacePoints
{
HighDetailFacePoints_LefteyeInnercorner = 210,
HighDetailFacePoints_LefteyeOutercorner = 469,
HighDetailFacePoints_LefteyeMidtop = 241,
HighDetailFacePoints_LefteyeMidbottom = 1104,
HighDetailFacePoints_RighteyeInnercorner = 843,
HighDetailFacePoints_RighteyeOutercorner = 1117,
HighDetailFacePoints_RighteyeMidtop = 731,
HighDetailFacePoints_RighteyeMidbottom = 1090,
HighDetailFacePoints_LefteyebrowInner = 346,
HighDetailFacePoints_LefteyebrowOuter = 140,
HighDetailFacePoints_LefteyebrowCenter = 222,
HighDetailFacePoints_RighteyebrowInner = 803,
HighDetailFacePoints_RighteyebrowOuter = 758,
HighDetailFacePoints_RighteyebrowCenter = 849,
HighDetailFacePoints_MouthLeftcorner = 91,
HighDetailFacePoints_MouthRightcorner = 687,
HighDetailFacePoints_MouthUpperlipMidtop = 19,
HighDetailFacePoints_MouthUpperlipMidbottom = 1072,
HighDetailFacePoints_MouthLowerlipMidtop = 10,
HighDetailFacePoints_MouthLowerlipMidbottom = 8,
HighDetailFacePoints_NoseTip = 18,
HighDetailFacePoints_NoseBottom = 14,
HighDetailFacePoints_NoseBottomleft = 156,
HighDetailFacePoints_NoseBottomright = 783,
HighDetailFacePoints_NoseTop = 24,
HighDetailFacePoints_NoseTopleft = 151,
HighDetailFacePoints_NoseTopright = 772,
HighDetailFacePoints_ForeheadCenter = 28,
HighDetailFacePoints_LeftcheekCenter = 412,
HighDetailFacePoints_RightcheekCenter = 933,
HighDetailFacePoints_Leftcheekbone = 458,
HighDetailFacePoints_Rightcheekbone = 674,
HighDetailFacePoints_ChinCenter = 4,
HighDetailFacePoints_LowerjawLeftend = 1307,
HighDetailFacePoints_LowerjawRightend = 1327
} ;
This is amazingly useful. I am going to update the blog post and give credits to you.
i’m just going through SDK and this enum is there. assuming the standard install path for SDK you can find it in header c:\Program Files\Microsoft SDKs\Kinect\v2.0_1409\inc\Kinect.Face.h
I found this https://social.msdn.microsoft.com/Forums/getfile/668131 in the KinectV2 forum of Microsoft
Hi Juandi!
Any idea on how to get this? Would appreciate it. Thank you!
Hi Vangos,
First of all, congratulations for the website.
I’m working on my final degree project with Kinect V2 and I have a doubt that I hope you would be able to solve.
How can I extract a photo in BMP of the detected face? In other words, how can I extract the point cloud of the tracked face?
Thanks and sorry for my bad english.
Hi Juandi. That would be feasible using HD Face. First of all, you need to detect the forehead, the cheeks and the jaw:
var forehead = vertices[(int)HighDetailFacePoints.ForeheadCenter];
var cheekLeft = vertices[(int)HighDetailFacePoints.LeftcheekCenter];
var cheekRight = vertices[(int)HighDetailFacePoints.RightcheekCenter];
var jaw = vertices[(int)HighDetailFacePoints.LowerjawLeftend];
You now have the 3D coordinates of 4 points. Each point is a CameraSpacePoint in the world space. Using Coordinate Mapper, you can map the world space to the RGB or depth/infrared space and find the 2D coordinates in a 1920×1080 or 512×424 frame.
Since you have 4 points in the frame, you can crop the generated bitmap and extract the face.
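For illustration, a rough sketch of that idea (the helper name is hypothetical; in production code you should also check the mapped values for infinity before using them):
private Int32Rect GetFaceRectangle(CameraSpacePoint forehead, CameraSpacePoint cheekLeft, CameraSpacePoint cheekRight, CameraSpacePoint jaw)
{
    // Requires: using System.Linq; using System.Windows;
    CoordinateMapper mapper = _sensor.CoordinateMapper;
    ColorSpacePoint[] points =
    {
        mapper.MapCameraPointToColorSpace(forehead),
        mapper.MapCameraPointToColorSpace(cheekLeft),
        mapper.MapCameraPointToColorSpace(cheekRight),
        mapper.MapCameraPointToColorSpace(jaw)
    };
    // Bounding rectangle of the four projected points, within the 1920x1080 color frame.
    float minX = points.Min(p => p.X);
    float minY = points.Min(p => p.Y);
    float maxX = points.Max(p => p.X);
    float maxY = points.Max(p => p.Y);
    return new Int32Rect((int)minX, (int)minY, (int)(maxX - minX), (int)(maxY - minY));
}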
Hi Vangos. Your articles are amazing and to be honest, I don’t know where I would be without you!
I have an interesting problem in which I am trying to detect if a person has a beard and a particular kind of glasses. Firstly, my plan with the beard is to use the HD face data in combination with the colour data and run the result through some sort of external imaging package (likely Emgu CV which is a wrapper for OpenCV). For the glasses, a similar trick will be used. The FaceFrameFeatures enumeration seems to only yield the presence of glasses and not whether they are the correct ones.
Firstly, I wonder if you have any source code which involves overlaying the face data to the colour frames. This seems essential to solving my problem and I’m not sure whether the traditional coordinate mapper example can help me. By the look of your final picture, you may have something handy that I could have a peek at.
Secondly, I’d really appreciate it if you could give me some general advice on the aforementioned strategy to beard (and particular kind of glasses) detection. A word of advice from you would be excellent!
Finally, I wonder if you have any plans to release a book on Kinect SDK v2. I have really struggled in particular with Kinect Fusion stuff and have found tutorials hard to come by. I wonder if you could either point me to a tutorial/resource/extra source code (I’ve already seen the example that comes with the SDK) or failing that, write a book that I can buy for any price! 🙂
Thanks again. I love you.
Do you know if it is possible to modify the code that is used to identify a face? I am trying to track infants (6-9 months) and Kinect does not pick up their faces. I suspect that this is because their features have different proportions than adults. I would like to try to modify the code to identify infant faces. Thanks!
Hi E B. We cannot modify the face tracking code. It is part of the SDK. Have you verified that the body of an infant is tracked? if the infant is lying on a bed, Kinect will not easily recognize it. You can put the sensor on the ceiling for better results.
Thank you for the quick reply! I have not verified that. The infant is sitting and moving his arms and head. I know that for adults, tracking the body while sitting is more difficult than tracking it while standing, but not particularly problematic. I’ll see if the body is tracked and hopefully that will make the head track as well.
Indeed. Face tracking relies on Body tracking! Verify that the body is tracked and your job will be much easier. Simply run the Body-Basics example from the Kinect SDK Browser.
Body tracking doesn’t work for sitting infants. I’ve tried positioning the Kinect in different ways, but it won’t consistently pick them up.
I have a some LED markers that I can easily and safely attach to the infants’ heads and shoulders (the only body parts I’m interested in). Do you know of any tutorial or code for tracking LED markers with the Kinect?
Thanks again!
You can try using EmguCV or Aforge.NET. You can feed them with color or depth data and perform pattern/image recognition. You can use them with Kinect, too.
Hi Bro, I have a problem in this line:
faceSource = new HighDefinitionFaceFrameSource(KinectSensor.GetDefault());
The mistake says this:
An unhandled exception of type ‘System.InvalidOperationException’ occurred in Microsoft.Kinect.Face.dll but was not handled in user code
Note: This API has returned an exception from an HRESULT: 0x80070002.
Can you help me with this, please?
Hi JA. Is it a WPF or Windows Store app? In case you are using WPF, you need to add the following line under Project → Properties → Build Events → Post-build event command line. This command will import some necessary configuration files.
xcopy "C:\Program Files (x86)\Microsoft SDKs\Windows\v8.0\ExtensionSDKs\Microsoft.Kinect.Face\2.0\Redist\CommonConfiguration\x64\NuiDatabase" "NuiDatabase" /e /y /i /r
Thanks, it is running :)
Now I’m trying to put the Color Basics sample together with this, but when running it becomes very slow. Do you have any idea why?
Regards 🙂
Yes. The color frame is really huge (1920×1080). As a result, trying to display so many Color frames and handle so many Face points is a demanding job.
You have two options:
1) Display the Depth frame instead of the color frame.
2) Display the Color frame with fewer Face points (you probably do not need all of the 1,000 points).
Hope that information helps you.
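For example, a quick way to implement the second option (an illustrative sketch, not the original sample): inside UpdateFacePoints, update only every step-th vertex, assuming _points was populated with one ellipse per drawn vertex (vertices.Count / step).
const int step = 10;
for (int index = 0; index < vertices.Count; index += step)
{
    CameraSpacePoint vertex = vertices[index];
    DepthSpacePoint point = _sensor.CoordinateMapper.MapCameraPointToDepthSpace(vertex);
    if (float.IsInfinity(point.X) || float.IsInfinity(point.Y)) continue;
    Ellipse ellipse = _points[index / step];
    Canvas.SetLeft(ellipse, point.X);
    Canvas.SetTop(ellipse, point.Y);
}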
Thanks for your help and your answer, thank you very much. See you later!
Thank you for sharing, sir! I want to ask you about a problem.
If I want to put an OBJ glasses model drawn by OpenGL on the Kinect face, how can I find the right location? That is, how can the OBJ model track the face points?
Best.
Hi there. First, let me say thank you so much for your posts on the Kinect. They have helped me immensely in my thesis project.
My question to you involves actual face recognition. I would like to track a face, save data obtained from the face, and be able to recall that data to make a comparison to a new face when the program is turned on again. Most of what I just described are things that I have worked out (save multiple values into a csv file, import them back into the program, and compare). However, the hard part is simply what values to use that give the appropriate details for the face. The only values I can find come from the CalculateVerticesForAlignment method. I can extract all 1347 points from those vertices but, when I compare them to multiple tries of my own face (or to someone else’s face), the values I get just don’t match up. Specifically, I am calculating the 3D vectors from 1 specific point to another and then comparing that same vector to see if the values is similar. This does not appear to work as well as I hoped as the values are never close enough to make a good determination.
Can you suggest another way to extract the numerical data for a face that would allow me to compare it to the same face or different faces for actual facial recall/recognition?
Thanks in advance for any help you can give.
Evan
Hi Evan. Do not compare vertices. Instead, compare the relative distances between specific points. For example, you can compare the distance between the eyes, the jaw-forehead, etc. Also, remember to take the measurements at a specific angle. It will be much more accurate if you compare measurements between 10 degrees of rotation than 40 degrees of rotation.
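For illustration, a rough sketch of that idea (the point indices come from the HighDetailFacePoints table in the article; the eye-to-face-height ratio is just one example of a feature you could compare):
private static double Distance(CameraSpacePoint a, CameraSpacePoint b)
{
    return Math.Sqrt((a.X - b.X) * (a.X - b.X) + (a.Y - b.Y) * (a.Y - b.Y) + (a.Z - b.Z) * (a.Z - b.Z));
}
private static double EyeToFaceHeightRatio(IReadOnlyList<CameraSpacePoint> vertices)
{
    var eyeLeft = vertices[(int)HighDetailFacePoints.LefteyeOutercorner];
    var eyeRight = vertices[(int)HighDetailFacePoints.RighteyeOutercorner];
    var forehead = vertices[(int)HighDetailFacePoints.ForeheadCenter];
    var chin = vertices[(int)HighDetailFacePoints.ChinCenter];
    // A ratio is scale-independent, so it is more robust than raw vertex values.
    return Distance(eyeLeft, eyeRight) / Distance(forehead, chin);
}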
Thank you for responding so quickly. I guess I should clarify though: I am looking at the relative distances between vertices by calculating the 3D vector length (SQRT((x2-x1)^2 + (y2-y1)^2 +(z2-z1)^2) between two points (say tip of the nose to the left eye) and then comparing that length to the length of the same vector for a different face. The results, as I mentioned, were not accurate. Are you saying there are other values I should be getting?
Also, those vertex values are very very small and are often negative. Do you know what point they are being measured from (i.e. What the 0,0,0 point is)?
Lastly, I agree with your point in making sure the face is at the same rotation for each capture (hopefully 0). Is there a method that will give me the approximate head rotation angle so I can know?
Thanks again for any help you can provide.
Evan
Hi Vangos,
I already tried this program as per your recommendation. I assembled the code without errors, but unfortunately when the program runs, nothing appears on the UI canvas, just the “Kinect 2 Face HD” text. Can you help me with this one?
Thank you.
Hi Faiz. If you are using WPF, you’ll need to type the following command in the Post-Build events section:
xcopy "C:\Program Files (x86)\Microsoft SDKs\Windows\v8.0\ExtensionSDKs\Microsoft.Kinect.Face\2.0\Redist\CommonConfiguration\x64\NuiDatabase" "NuiDatabase" /e /y /i /r
Hi Vangos,
Thanks for your post. I’m doing a project using Kinect HD Face Tracking (C#). I need to extract the face features of a tracked face to a text or excel file to analyze in a data mining tool. Can you suggest a way to extract the animation units value and 3D Head Poses for a tracked face in C#?
Thanks again.
Hello Hoa. You can use something like my Kinect-2-CSV tool. This tool is an easy way of saving Kinect data to CSV files.
Hi Vangos, thank you for your response. But how could I make Kinect-2-CSV work together with Kinect HD Face Tracking?
Thanks.
Hello Hoa. You can modify the values and the columns to fit the Face data instead of the body. Everything contains CameraSpacePoints.
First of all my compliments with the clear and nice blog.
Seeing the popularity of like snapchat effects; this HD-face technology comes potentially into “our scope of business”.
I was just wondering, from your point of view, how HD Face can be mapped in a relatively easy way to existing effects (like the Google Hangouts toolbox) or, even better, to existing 3D models. Everybody has seen the Pepsi Halloween promo – https://www.youtube.com/watch?v=3GG2wKZw3wg. I do realise this is a promo video by an agency, and even if this technology would make a close match, a project like this is a custom (and expensive) job.
I just wonder if you see any possibilities to find a library of effects, similar like your approach to the 3D models & kinect2.
Hi Filip. Thanks for your comment. It would need some work, but it would be feasible using Kinect v2.
Hi Vangos,
I tried to modify your Kinect-2-CSV tool to extract the animation units, but I’m confused about what should replace the Body class and joints for the AUs of HD Face Tracking. Please advise:
public void Update(HighDefinitionFaceFrame hdff)
{
if (!IsRecording) return;
if (hdff == null || !hdff.IsFaceTracked) return;
string path = Path.Combine(Folder, _current.ToString() + ".line");
using (StreamWriter writer = new StreamWriter(path))
{
StringBuilder line = new StringBuilder();
if (!_hasEnumeratedJoints)
{
foreach (var joint in hdff?)
{
line.Append(string.Format("{0};;;", joint.JointType.ToString()));
}
line.AppendLine();
foreach (var joint in body.Joints.Values)
{
line.Append("X;Y;Z;");
}
line.AppendLine();
_hasEnumeratedJoints = true;
}
foreach (var joint in body.Joints.Values)
{
line.Append(string.Format("{0};{1};{2};", joint.Position.X, joint.Position.Y, joint.Position.Z));
}
writer.Write(line);
_current++;
}
}
Hello. Thanks for your comment. There are no Face joints or equivalents. All you can do is use Vertices[some_index] to get the X-Y-Z values and store them in your CSV file.
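For example, a minimal sketch of that approach (a hypothetical helper; adapt it to the structure of the Kinect-2-CSV tool):
private void AppendFaceFrame(StreamWriter writer, IReadOnlyList<CameraSpacePoint> vertices)
{
    // Requires: using System.IO; using System.Text;
    StringBuilder line = new StringBuilder();
    for (int index = 0; index < vertices.Count; index++)
    {
        CameraSpacePoint vertex = vertices[index];
        line.Append(string.Format("{0};{1};{2};", vertex.X, vertex.Y, vertex.Z));
    }
    // One CSV line per frame, with X;Y;Z triplets for every vertex.
    writer.WriteLine(line.ToString());
}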
Vangos,
Great articles, they are helping me a lot as I am a beginner BUT I’ve run into something I don’t see any information on.
If Kinect v2.0 can only track 2 HD faces at a time, how can we tell it which face we want to track if there are more than two people in the visible frame?
I’ll await your advice on this as I am stuck.
Thank you!
Sergio
Hi Sergio. That’s a great question. You specify the Body a Face belongs to by assigning the proper tracking ID:
_faceSource.TrackingId = body.TrackingId;
This way, you know exactly who the face belongs to!
Hey Vangos
Any opinions on the accuracy of the face tracking ?
I’m mostly interested in the change in size and shape of features over time. For example, jaw width, width of the bridge of the nose, stuff like that. Have some medical research in the back of my mind. I don’t need to see the actual size in cm or similar, I’m more interested in the ratios of facial vertices I capture now and facial vertices I capture in 6 months time. The changes would be pretty small though so if I can capture the same face 20 frames in a row and get 20 different results, it’s not going to work out too well.
Cheers
Pete
Hi Pete. Thanks for your comment – it’s very interesting. You can achieve what you are describing, but you’ll need a lot of testing and a lot of captured frames. Probably a classifier, too. Kinect is accurate, but you’ll get a lot of “false positives” and a lot of “jumpy” values. So, you’ll also need to smooth the data before comparing.
Thanks. On a somewhat related note, any plans to allow Kinetisense to hook up to EMR/EHRs with APIs ? I’m a consultant that does a lot of work for a large Orthopaedic practice that’s very interested in outcome data.
Hi Pete. We could talk in private about this. Use these contact options to reach me. Kinetisense is a really ambitious and well-established project.
Hi Vangos, Do you think that is possible to smooth face data in order to get smoother Animation Units?
Hi Thanasis. Yes, definitely. To smooth face data, you have to choose a window size (let’s say 10 or 20 frames) and take the average of the positions of a specific Face point.
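For illustration, a minimal sketch of such a moving average over a single face point (the window size of 10 frames is an arbitrary starting value):
private readonly Queue<CameraSpacePoint> _history = new Queue<CameraSpacePoint>();
private const int WindowSize = 10;
private CameraSpacePoint Smooth(CameraSpacePoint current)
{
    // Requires: using System.Collections.Generic; using System.Linq;
    _history.Enqueue(current);
    if (_history.Count > WindowSize) _history.Dequeue();
    // Average the X, Y, and Z values over the last few frames.
    return new CameraSpacePoint
    {
        X = _history.Average(p => p.X),
        Y = _history.Average(p => p.Y),
        Z = _history.Average(p => p.Z)
    };
}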
Thank you for your response. Averaging frames might solve face flickering. But I’m still not sure how to use these new, smoother positions to acquire smoother AnimationUnits (FaceAlignment.AnimationUnits).
Hi Thanasis. Smoothing the AnimationUnits would mean to eliminate any non-matching units within a sequence of units.
Alternatively, you could use the float value of the AnimationUnits dictionary to detect the weight of an expression and reject the ones that are below a predefined threshold.
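For example, a small sketch of that thresholding idea (the 0.3 threshold is a made-up value to experiment with):
const float threshold = 0.3f;
foreach (var pair in _faceAlignment.AnimationUnits)
{
    FaceShapeAnimations unit = pair.Key;
    float weight = pair.Value;
    if (weight >= threshold)
    {
        // The expression 'unit' is strong enough to be considered.
    }
}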
Hey Vangos. Can you elaborate a little bit on what you mean by this? Are you saying to simply look at the AUs per frame and eliminate any that aren’t within an expected threshold? I am working with the HD face process as well and have been trying to understand the Animation Units and what exactly they do so any further insight would be appreciated.
Also, I’m not sure you have delved into the HD Face Basics-WPF process, but that process includes 94 Shape Units (SUs) that become static once the process has been completed. Do you happen to know how those SUs manipulate the vertices at all? I can’t find any documentation that states exactly what they do, just that they modify the vertices obtained in each frame.
Hi Evan. Yes, that’s what I mean. The weight is just a float number.
Hello , I am trying to run the program , however only appears blank window , does not appear with the video capture. Can you help me ?
Hi Petter. Can you add a breakpoint to see if you are getting Kinect data?
Hi Vangos. As far as I know, animation units are distances related to facial movement (facial point movement). Thus, having smoother facial points could lead to smoother Animation Units (without averaging or thresholding them). But what I’m missing here is the way to pass these new face points to the HD face model. Hope I made myself clear. Thanks
Hi Thanasis. You cannot update a face model. You have to create your own classes/structures.
Hi Vangos,
I need your advice on identifying a face. For example, if one of my family member stand in front of the kinect, i want to identify the person and show profile information. (name, address , age etc).
What sort of face points i need to store in the database for later match.
Please help how i should approach this?
Thanks,
Sam
Hi Sam. You can check this face recognition library.
Thank you. Appreciate your help.
Hi Vangos:
Thx for ur amazing post, it works well !!!
Have u ever studied the High Definition Face Tracking Animation Units? The Microsoft Kinect programming guide describes them as follows:
“The API tracks 17 animation units (AUs). Most of the AUs are expressed as a numeric weight varying between 0 and 1. Three of them, Jaw Slide Right, Right Eyebrow Lowerer, and Left Eyebrow Lowerer, vary between -1 and +1.”
the relevant link : https://msdn.microsoft.com/en-us/library/microsoft.kinect.face.faceshapeanimations.aspx
I wanna use the Animation Units coefficients, but I cannot find the relevant APIs. The Kinect 1 SDK 1.8 provides a GetAnimationUnitCoefficients() function that easily lets me get the coefficients, but how do I do it in Kinect 2 SDK 2.0? BTW, I use WPF (C#).
thank you for ur help !!!
Hi Tony. You can install the HD Face Basics demo and check the FaceAlignment field. The FaceAlignment includes the Animation Units.
var value = currentFaceAlignment.AnimationUnits[FaceShapeAnimations.JawOpen];
Hello Mr.Pterneas;
I do appreciate your reply and help for me.
I need to develop a simple application with a simple interface to detect the following:
1- movement and direction of the head
2- movement of the arm (up or down)
3- whether the hand is near the mouth or not
What is your advice for me, as an expert Kinect developer? What basic code and points can I rely on to start and to create my interface for the app?
Hello Hannan. You should measure the distance between the hand joint and the mouth.
To find whether the arm is up or down, simply compare the following:
var handY = body.Joints[JointType.HandRight].Position.Y;
var elbowY = body.Joints[JointType.ElbowRight].Position.Y;
if (handY > elbowY) { /* Down */ }
if (handY < elbowY) { /* Up */ }
To calculate the distance between the hand and the mouth, the easiest way is using Vitruvius:
var mouth = face.Mouth;
var hand = body.Joints[JointType.HandRight].Position;
var distance = mouth.Length(hand);
if (distance < 0.1) { /* Hand is close to mouth */ }
Hello Vangos.
Thanks for all your articles on the Kinect.
I have seen that different Kinect sensors have slight differences in position when overlaying the HD face generated mesh over the color feed. Have you ever seen this yourself or anywhere else?
To explain, I have an application where I apply facial makeup texture on the 3D HD mesh and overlay it on the user’s face on the color feed. So the overlay needs to be very accurate. I achieve this accuracy somewhat by correcting the parallax dynamically between the color and depth spaces, and shifting the x-y position of the color image by some hard-coded values. While I can achieve good accuracy for one sensor this way, the xy shift goes off if I plug in a different sensor. The parallax fix still works accurately so I have to adjust the xy offset for each Kinect sensor.
My conclusion from all this is that the physical placement of the color and IR cameras is not standard (not accurate enough) for all Kinect sensors. What do you think?
PS: The sensors I use are XBox One sensors made in China
Hi Vanlal. Thank you very much for your comment. I have not experienced such an inconsistency with any of my Kinect sensors. I have checked over 15 sensors and there are no inconsistencies. Check whether your code considers any fixed distances that could change based on the position of the sensor or the height of a user.
My code does take care of those parameters – the user can move around the whole view area in all three lateral dimensions without much inaccuracy. Since the differences between the sensors are very consistent, we have decided to adjust for each sensor. Thanks for taking the time to check it out.
I am so thankful. You helped me a lot.
1- I always get confused about the Y coordinate and the X coordinate. How can I tell them apart? For example, in these statements, what is Y? I didn’t understand this:
if (handY > elbowY) { /* Down */ }
if (handY < elbowY) { /* Up */ }
2- Here you used the Length function (var distance = mouth.Length(hand);).
Is there any documentation for the Kinect v2 functions I can use? Where can I learn about these functions?
Thanks alot
Hello Hannan. Here are the answers to your questions:
1) The complete code would be the following:
float handX = body.Joints[JointType.HandRight].Position.X; // horizontal position, in meters
float handY = body.Joints[JointType.HandRight].Position.Y; // vertical position, in meters
2) The Length method is part of Vitruvius. The documentation can be found here.
I do appreciate all your support and I promise you, I will mention your full name in my printed thesis, because you have really helped me a lot.
I will graduate next Fall 🙂
thanks from my heart
No worries. I’m really excited to see more and more students using Kinect in their research!
Well done!
Hi
I have a problem in this line:
faceSource = new HighDefinitionFaceFrameSource(KinectSensor.GetDefault());
The mistake says this:
An unhandled exception of type ‘System.InvalidOperationException’ in Microsoft.Kinect.Face.dll but not handled in user code
This API has returned an exception from an HRESULT: 0x80070002.
I applied
xcopy "C:\Program Files (x86)\Microsoft SDKs\Windows\v8.0\ExtensionSDKs\Microsoft.Kinect.Face\2.0\Redist\CommonConfiguration\x64\NuiDatabase" "NuiDatabase" /e /y /i /r
but the new error is: the "GenerateResource" task failed unexpectedly!
Hello. Are you using Kinect SDK 1.x or 2?
Hi Vangos,
Thanks for your post which is helping me in implementing my application which is using Kinect.
I am having trouble mapping these face points onto a 2D bitmap image, which I am getting from ColorBasics. vertices[index] contains three axes (X, Y, Z), and I want to map it onto a 2D plane (X, Y). If I just ignore the Z axis, it does not return accurate results. What would be a good method to map these points onto a 2D plane?
Thanks.
For whatever reason when I build and run the exe I get a blank canvas with no output from the kinect. Any idea what I’m doing wrong?
Hello. Please launch the SDK Browser app (auto-installed with the Kinect SDK) and launch the HD Face demo. Does the demo run properly? Some computers do not support HD Face.
Hi Vangos, thanks for your tutorial !
Do you know if using a facemodel builder also increases the accuracy of the vertices ? Or is it only used to access the other properties of the facemodel (like shape deformations) ?
Hello Panagiotis. The facemodel builder is helping you access and display the vertices/deformations. The accuracy is not configurable. It’s affected by lighting conditions and distance from the sensor.
I would like you to advise me as I can evaluate if it has opened the mouth or if an eye has closed with the vertices that the function returns.
Thank you so much.
Hello. You can measure the distance between the vertex HighDetailFacePoints_MouthUpperlipMidtop and the vertex HighDetailFacePoints_MouthLowerlipMidbottom. The distance would be low if the mouth is not open. It would be bigger if the mouth is open.
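For illustration, a possible implementation of that check (the 2-centimeter threshold is an arbitrary value you would need to tune):
var upperLip = vertices[(int)HighDetailFacePoints.MouthUpperlipMidtop];
var lowerLip = vertices[(int)HighDetailFacePoints.MouthLowerlipMidbottom];
double distance = Math.Sqrt(
    Math.Pow(upperLip.X - lowerLip.X, 2) +
    Math.Pow(upperLip.Y - lowerLip.Y, 2) +
    Math.Pow(upperLip.Z - lowerLip.Z, 2));
bool isMouthOpen = distance > 0.02; // roughly 2 centimeters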
Thankyou so much for a great tutorial. Can you please tell me how can i get RGB values of a particular vertices you have mentioned on the face. I want to get the Skin Tone of the face.
Hello Ahmad. When projected onto the Color frame, each point has a specific X and Y coordinate within the 1920×1080 Color frame. Since every pixel occupies 4 bytes in the color array, the index of the first byte for that pixel would be:
var colorIndex = (Y * 1920 + X) * 4;
So, the color channel values for that point would be (the channel order depends on the ColorImageFormat you requested; for Rgba it is):
var r = colorData[colorIndex + 0];
var g = colorData[colorIndex + 1];
var b = colorData[colorIndex + 2];
var a = colorData[colorIndex + 3];
Will it works for Kinect for XBOX 360 v1 ?
Hello. HD Face is only available in Kinect v2.
Thanks so much for this tutorial, it has been very helpful!
Quick question regarding the CameraSpacePoints. I am interested in snapping the orientation of the face vertices towards the camera, presumably using the FaceAlignment.FaceOrientation property. I have been able to convert the individual vertices (CameraSpacePoints) to Microsoft.Xna.Framework.Vector3 objects, and have transformed them via the inverse of the FaceOrientation rotation matrix without errors. However, as a sanity check, I was hoping to plot them on the canvas like you do in this example. However, the CoordinateMapper.MapCameraPointToDepthSpace only takes in CameraSpacePoints (not Vector3 objects). Just curious, is there a way to essentially recreate this behavior, warp the CameraSpacePoints directly without changing types, or otherwise achieve this goal?
Thanks again!
Good afternoon. I am trying this out and have already imported the dll and Face.dll, but it says that BodyFrameSource was not found. Why?
Hi Andressa. Please, edit your comment and translate it in English, so we can all understand what you mean. Thank you very much!
When I execute it, it shows this error:
System.Windows.Markup.XamlParseException: ”The invocation of the constructor on type ‘Kinect2FaceHD_NET.MainWindow’ that matches the specified binding constraints threw an exception.’ Line number ‘3’ and line position ‘9’.’
and
System.BadImageFormatException: ‘Could not load file or assembly ‘Microsoft.Kinect.Face, Version=2.0.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35′ or one of its dependencies. An attempt was made to load a program with an incorrect format.’
Hello. Open your project in Visual Studio and go to Project → Properties → Build Events → Post-build event command line. Type this command:
xcopy "C:\Program Files (x86)\Microsoft SDKs\Windows\v8.0\ExtensionSDKs\Microsoft.Kinect.Face\2.0\Redist\CommonConfiguration\x64\NuiDatabase" "NuiDatabase" /e /y /i /r
This command will import some necessary configuration files.
Hello. Vangos Pterneas
I want to ask you: can Kinect v2 do emotion recognition (Anger, Sad, Happy, ...)?
How can I do it?
Thank you !
Sure, you can check this article for basic emotion recognition:
https://pterneas.com/2014/12/21/kinect-2-face-basics/
Hello. Vangos Pterneas
Thank you. I already read your link. Could I create a condition to recognize the anger emotion?
Because the Kinect v2 Face API only provides the happy expression.
Thank you a lot !!!
Hi Dong. Anger would be trickier to implement because it’s ambiguous. You could use the HD Face points and check the distance between the upper and lower lips, as well as the angle of the eyebrows. You should print the IDs of the points to see their serial number and extract them from the Vertices list.
Hi. Is it possible to have Kinect recognize the face only one time?
Sure, you can stop the HD face frame reader whenever you want. You can make it null or call the Dispose() method.
I’m trying, but I do not know how, I wanted it to close the image, only the window would open.
Thanks for help me 😉
I assume you have a HighDefinitionFaceFrameReader object. This object should have a Dispose() method to call when the proper event is fired. E.g. when you click a button or when the Face is recognized for the first time.
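For example, a minimal sketch (the flag name is hypothetical) that stops listening after the first tracked face:
private bool _faceCaptured = false;
private void FaceReader_FrameArrived(object sender, HighDefinitionFaceFrameArrivedEventArgs e)
{
    using (var frame = e.FrameReference.AcquireFrame())
    {
        if (frame != null && frame.IsFaceTracked)
        {
            frame.GetAndRefreshFaceAlignmentResult(_faceAlignment);
            UpdateFacePoints();
            _faceCaptured = true;
        }
    }
    if (_faceCaptured && _faceReader != null)
    {
        // Stop listening for further face frames.
        _faceReader.FrameArrived -= FaceReader_FrameArrived;
        _faceReader.Dispose();
        _faceReader = null;
    }
}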
Hi Vangos, first of all, thanks for the knowledge you’ve shared; I don’t know where I would be without your guidance :D
Now I have a question for you: I’m currently developing my Kinect project in Unity and reconstructing the face as a 3D mesh. Intermittently, Kinect won’t detect the face even though the ambient light is stable and the distance is constant. Do you know what things make face detection unstable?
Hello Uwi. Thanks for your comment! The HD Face capability of the Kinect works best when the room has good lighting and you are standing between 1.5 and 3.0 meters from the camera.
Also, remember that Kinect can only track a face that is rotated between -45 and +45 degrees. So, if the head is rotated sideways, Kinect will not be able to track the face.
Hi!
Very helpful information! I was working on this code. When I run it, it shows a blank frame. My Kinect switches on instantly, but something seems wrong. Any suggestions?
Hello Mehak. Please, check Microsoft’s Kinect Configuration Verifier to ensure your system is working properly. This should provide more insights about what’s going on.
I fixed this error. How do I get the values of the right eye and left eye? I’m looking for the proper syntax!
At first I used 'CameraSpacePoint lefteyeOutercorner = vertices[(int)HighDetailFacePoints.LefteyeOutercorner];' but it gave me no values with your code.
After a lot of research, I got something like this:
private Point MapCameraPointToCurrentSpace(CameraSpacePoint cameraSpacePoint)
{
Point mappedPoint = new Point();
DepthSpacePoint point = _sensor.CoordinateMapper.MapCameraPointToDepthSpace(cameraSpacePoint);
mappedPoint.X = point.X;
mappedPoint.Y = point.Y;
return mappedPoint;
}
var rightEyeOuterPoint = MapCameraPointToCurrentSpace(vertices[(int)HighDetailFacePoints.RighteyeOutercorner]);
If I stand in front of it, I get different values for my rightEyeOuterPoint each time. I was hoping to save the values I get and compare them with different faces.
Any suggestions on what values I could compare for different faces, and their syntax?
When doing coordinate mapping, you need to check the X and Y values for infinity:
DepthSpacePoint point = _sensor.CoordinateMapper.MapCameraPointToDepthSpace(cameraSpacePoint);
float x = 0f;
float y = 0f;
if (!float.IsInfinity(point.X) && !float.IsInfinity(point.Y))
{
x = point.X;
y = point.Y;
}
mappedPoint.X = x;
mappedPoint.Y = y;
Thank you for your help. You reply quite fast, pretty impressive :’D How did Reini Adovics get those index positions? I am trying to get those, but instead I’m getting the X and Y values.
Oh and btw, keep up the good work. Your blog is pretty nice!
Thank you 🙂
By default, the X, Y, and Z values of the original CameraSpacePoint are measured in meters and correspond to the physical world. The X and Y values of the resulting DepthSpacePoint are the position of a face point within the 512×424 depth frame. You could have also used MapCameraPointToColorSpace() to get the position within the 1920×1080 color frame.
You can read more about mapping coordinates in my Kinect Coordinate Mapping tutorial (Part 1 and Part 2).
As an easier starting point, I would like to recommend you the use of Vitruvius. Vitruvius lets you easily convert among different coordinate spaces. For example, using Vitruvius, you can do the following:
var face = faceFrame.Face();
var eyeLeft3D = face.EyeLeft; // 3D space
var eyeRight3D = face.EyeRight; // 3D space
var nose3D = face.Nose; // 3D space
var eyeLeft2D = eyeLeft3D.ToPoint(Visualization.Color); // 2D space
var eyeRight2D = eyeRight3D.ToPoint(Visualization.Color); // 2D space
var nose2D = nose3D.ToPoint(Visualization.Color); // 2D space
Hello Vangos, big appreciation for your awesome blog. By the way, I want to make the same thing as yours: run the camera simultaneously with the face dots. But I ended up unable to adjust the canvas size to the camera image, so the dots are misplaced on my face. Do you have any advice? Thank you.
Thank you so much for this awesome post you’ve shared, sir. I’ve learned a lot about WPF programming and Kinect v2 from it.
Right now, I am trying to make a gaze tracker based on this code, but before that, I need to do iris recognition.
My idea is to use the vertices to segment the color frame (especially the eye region), so that I can use some algorithm (like RANSAC) to fit an ellipse to both eyes. The problem is I don’t know how to do that. If possible, could you please give me tips on doing the segmentation? Thanks.
You could check OpenCV or EmguCV for pupil tracking methods (link).
I had used Emgu CV for eye region segmentation, but I think it will be a lot faster if I crop the eye region using the face dots. Then, I will use Emgu CV to track both irises.
Correct, that would be much faster.
Hello Vangos,
thank you for your work.
I have a question about tracking the face via HD Face.
The problem I have is that, in my project, the person is wearing sunglasses, so the program doesn’t always recognize the face.
Is there any way to change the model so that the program detects a face wearing those glasses?
For example, could I delete some points of the model, such as the eye points, to ensure the program detects the face without seeing the eyes?
I hope you understand what I am talking about 😀
Looking forward to hearing from you.
Hi Andreas. Thank you for your message. The face models are made by Microsoft and are given to us in binary format. Thus, we cannot change them. It seems that, when wearing sunglasses, the Face model classifies the face as invalid.
Hi, this is really a good tutorial and it helps me a lot. However, I still have some questions. For some reason, I need the positions/coordinates (X, Y, Z) of those vertices, but I’m not so familiar with the C# language, so I wonder if you can tell me which method I should use.
Hi, I tried to clone your project and check how it works. But after a successful build, it loads a white window with no camera content. Do you have any idea what’s going on?
Please, check your output window for any debugging information. Also, ensure you have the Kinect SDK version 2 installed.
Hi, this tutorial really gave me an idea for a feature I need in a project. I need to determine the face shape of a person, but for that I need to calculate the distance between 2 facial points. Is there any way I could do that? Thanks in advance!
Hello. The positions of the face points are given in the 3D space. So, all you need to do is apply this formula. The X, Y, and Z coordinates are measured in meters.
Thanks for the answer! And one last question: if I wanted to do the same exercise but with the body, how can I access the body points?
You can access the Body joints using the Kinect SDK (here is a short tutorial).
CameraSpacePoint headPosition = body.Joints[JointType.Head].Position;
Hi again! I have another question. The common facial points were very useful for the project, but we discovered that we need other facial points that aren’t in that list. Is there any way to obtain facial points other than those?
Hello. Unfortunately, there is no other way to obtain points that are not in the list.