Getting Started with Kinect and Processing

So, you want to use the Kinect in Processing. Great. This page will serve to document the current state of my Processing Kinect library, with some tips and info.

Before you proceed, however, you may want to instead consider using the wonderful SimpleOpenNI library and Greg Borenstein’s Making Things See book! OpenNI has lots of features (skeleton tracking, gesture recognition, etc.) that are not available in this library.

I’m ready to get started right now

What hardware do I need?

First you need a “stand-alone” kinect. You do not need to buy an Xbox. If you get the stand-alone kinect listed below, it will come with a USB adapter and power supply.

Standalone Kinect Sensor

If you have a Kinect that came with an Xbox, it will not include the USB adapter. You’ll need to purchase this separately:

Kinect Sensor Power Supply

Um, what is Processing?

I’m going to assume you are familiar with Processing, but just in case you are not, I suggest checking out: processing.org (Processing is an open source programming language and environment for people who want to create images, animations, and interactions. Initially developed to serve as a software sketchbook and to teach fundamentals of computer programming within a visual context, Processing also has evolved into a tool for generating finished professional work. Today, there are tens of thousands of students, artists, designers, researchers, and hobbyists who use Processing for learning, prototyping, and production.)

What if I don’t want to use Processing?

If you are comfortable with C++ I suggest you consider using openFrameworks or Cinder with the Kinect. At the time of this writing, these environments have some features I haven’t implemented yet and you also get a C++ speed advantage when processing the depth data, etc.:

ofxKinect
Kinect CinderBlock

More resources from: The OpenKinect Project

I’ve got Processing, how do I download and install the library?

By default Processing will create a “sketchbook” folder in your Documents directory, i.e. on my machine it’s:

/Users/daniel/Documents/Processing/

If there isn’t one already, create a folder called “libraries” there, i.e.

/Users/daniel/Documents/Processing/libraries/

Then go and download openkinect.zip and extract it in the libraries folder, i.e. you should now see

/Users/daniel/Documents/Processing/libraries/openkinect/
/Users/daniel/Documents/Processing/libraries/openkinect/library/
/Users/daniel/Documents/Processing/libraries/openkinect/examples/
etc.

Restart Processing, open up one of the examples in the examples folder and you are good to go!

More about installing libraries:

http://wiki.processing.org/w/How_to_Install_a_Contributed_Library
http://www.learningprocessing.com/tutorials/libraries/

At the moment, my library only works with Mac OS X (Intel, 10.5 or 10.6 should both be ok). Hopefully this will be remedied soon enough. However, if you are interested in working with Kinect on Windows, I recommend SimpleOpenNI.

What code do I write?

To get started using the library, you need to include the proper import statements at the top of your code:

import org.openkinect.*;
import org.openkinect.processing.*;

As well as a reference to a “Kinect” object, i.e.

// Kinect Library object
Kinect kinect;

Then in setup() you can initialize that kinect object:

  kinect = new Kinect(this);
  kinect.start();

Once you’ve done this you can begin to access data from the kinect sensor.

Currently, the library makes data available to you in four ways:

  1. RGB image from the kinect camera as a PImage.
  2. Grayscale image from the IR camera as a PImage
  3. Grayscale image with each pixel’s brightness mapped to depth (brighter = closer).
  4. Raw depth data (11-bit numbers between 0 and 2047) as an int[] array

Let’s look at these one at a time. If you want to use the Kinect just like a regular old webcam, you can request that the RGB image is captured:

  kinect.enableRGB(true);

Then you simply ask for the image as a PImage!

  PImage img = kinect.getVideoImage();
  image(img,0,0);

Alternatively, you can enable the IR image:

  kinect.enableIR(true);

Currently, you cannot have both the RGB image and the IR image. They are both passed back via getVideoImage() so whichever one was most recently enabled is the one you will get.

Now, if you want the depth image, you can:

  kinect.enableDepth(true);

and request the grayscale image:

  PImage img = kinect.getDepthImage();
  image(img,0,0);

As well as the raw depth data:

  int[] depth = kinect.getRawDepth();
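
The array returned by getRawDepth() is flat, with one value per pixel. The following plain-Java sketch (it runs without a Kinect) shows one way to look up the value at a given pixel. It assumes a row-major layout, the same x + y*w indexing the Point Cloud example further down this page uses, and a 640x480 depth image. The class and method names are made up for illustration and are not part of the library.

```java
// Hypothetical helper, not part of the openkinect library.
class RawDepthIndex {
  // Assumed depth image size; check what your device actually delivers.
  static final int W = 640;
  static final int H = 480;

  // Returns the raw depth at pixel (x, y), or -1 if (x, y) is outside the image.
  static int rawDepthAt(int[] depth, int x, int y) {
    if (x < 0 || x >= W || y < 0 || y >= H) {
      return -1;
    }
    return depth[x + y * W]; // row-major: one full row of W values per y
  }
}
```

In a sketch you would pass kinect.getRawDepth() as the first argument, for example rawDepthAt(kinect.getRawDepth(), mouseX, mouseY).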

If you are looking at the raw depth data only, you can turn off the library’s behind the scenes depth image processing to make it slightly more efficient:

  kinect.processDepthImage(false);
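
With the library's depth image turned off, you can still build your own grayscale values from the raw data. The sketch below is one possible mapping, not the one the library uses internally: it maps a window of raw values linearly to brightness so that closer is brighter, and paints everything outside the window black. The parameters nearRaw and farRaw are hypothetical names for the bounds of that window.

```java
// Hypothetical helper, not part of the openkinect library.
class DepthToGray {
  // Maps a raw depth value to a brightness in 0..255 (brighter = closer).
  // nearRaw and farRaw bound the range of interest, with nearRaw < farRaw.
  static int toBrightness(int raw, int nearRaw, int farRaw) {
    if (raw < nearRaw || raw > farRaw) {
      return 0; // outside the window: black
    }
    // t runs from 0.0 at nearRaw to 1.0 at farRaw
    float t = (raw - nearRaw) / (float) (farRaw - nearRaw);
    return Math.round(255 * (1.0f - t));
  }

  // Converts a whole frame of raw depth values to brightness values.
  static int[] toBrightness(int[] depth, int nearRaw, int farRaw) {
    int[] out = new int[depth.length];
    for (int i = 0; i < depth.length; i++) {
      out[i] = toBrightness(depth[i], nearRaw, farRaw);
    }
    return out;
  }
}
```

In Processing, each resulting value could be written into the pixels[] array of a PImage using color(b).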

Finally, you can also adjust the camera angle with the tilt() function, i.e.:

float deg = 15;
kinect.tilt(deg);

So, there you have it, here are all the useful functions you might need to use the Processing kinect library:

  1. enableRGB(boolean) — turn on or off the RGB camera image
  2. enableIR(boolean) — turn on or off the IR camera image
  3. enableDepth(boolean) — turn on or off the depth tracking
  4. processDepthImage(boolean) — turn on or off the depth image processing
  5. PImage getVideoImage() — grab the RGB or IR video image
  6. PImage getDepthImage() — grab the grayscale depth map image
  7. int[] getRawDepth() — grab the raw depth data
  8. tilt(float) — adjust the camera angle (between 0 and 30 degrees)
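
Since tilt() is listed above as accepting angles between 0 and 30 degrees, it can be worth clamping a value before passing it in, for instance when the angle is driven by key presses. A minimal sketch follows; clampTilt is a hypothetical helper, and the limits are taken from the list above rather than from the library source.

```java
// Hypothetical helper, not part of the openkinect library.
class TiltHelper {
  static final float MIN_TILT = 0;  // degrees, per the function list above
  static final float MAX_TILT = 30; // degrees, per the function list above

  // Returns deg limited to the range MIN_TILT..MAX_TILT.
  static float clampTilt(float deg) {
    return Math.max(MIN_TILT, Math.min(MAX_TILT, deg));
  }
}
```

Inside a sketch, Processing's built-in constrain() does the same job: kinect.tilt(constrain(deg, 0, 30));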

Where’s the javadoc?

Stay tuned, I’ll get something up soon. All the source is here:

https://github.com/shiffman/libfreenect/tree/master/wrappers/java/processing

So now what?

So far, I only have three basic examples:

Display RGB, IR, and Depth Images


Code: RGBDepthTest

(NOTE: KNOWN BUG IN IR IMAGE RIGHT NOW)

This example does nothing but use all of the above listed functions to display the data from the kinect sensor.

Point Cloud


Code: PointCloud

Here, we’re doing something a bit fancier. Number one, we’re using the 3D capabilities of Processing to draw points in space. You’ll want to familiarize yourself with translate(), rotate(), pushMatrix(), popMatrix(). This tutorial is also a good place to start. In addition, the example uses a PVector to describe a point in 3D space. More here: PVector tutorial.

The real work of this example, however, doesn’t come from me at all. The raw depth values from the kinect are not directly proportional to physical depth. Rather, they scale with the inverse of the depth according to this formula:

depthInMeters = 1.0 / (rawDepth * -0.0030711016 + 3.3309495161);

Rather than do this calculation all the time, we can precompute all of these values in a lookup table since there are only 2048 depth values.

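// Lookup table with one entry per possible raw depth value (11 bits = 2048 entries)
float[] depthLookUp = new float[2048];

// Fill the table once, e.g. inside setup()
for (int i = 0; i < depthLookUp.length; i++) {
  depthLookUp[i] = rawDepthToMeters(i);
}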
 
float rawDepthToMeters(int depthValue) {
  if (depthValue < 2047) {
    return (float)(1.0 / ((double)(depthValue) * -0.0030711016 + 3.3309495161));
  }
  return 0.0f;
}

Thanks to Matthew Fisher for the above formula. (Note: for the results to be more accurate, you would need to calibrate your specific kinect device, but the formula is close enough for me so I’m sticking with it for now. More about calibration in a moment.)
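
To get a feel for the formula, here is a small plain-Java check that runs without a Kinect. It repeats the same conversion and evaluates it at a few raw values: roughly 0.30 m at 0, 0.56 m at 500 and 3.85 m at 1000. It also shows that the denominator reaches zero at a raw value of about 1084.6, so above that the formula returns negative numbers, which are probably best treated as out of range rather than as real distances. The class name is illustrative.

```java
class DepthFormulaCheck {
  // Same conversion as rawDepthToMeters() above.
  static float rawDepthToMeters(int depthValue) {
    if (depthValue < 2047) {
      return (float) (1.0 / ((double) depthValue * -0.0030711016 + 3.3309495161));
    }
    return 0.0f;
  }

  public static void main(String[] args) {
    // A few sample raw values, including one past the point where the formula turns negative
    int[] samples = {0, 500, 1000, 1100, 2047};
    for (int raw : samples) {
      System.out.println(raw + " -> " + rawDepthToMeters(raw) + " m");
    }
  }
}
```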

Finally, we can draw some points based on the depth values in meters:

  for(int x = 0; x < w; x += skip) {
    for(int y = 0; y < h; y += skip) {
      int offset = x+y*w;
 
      // Convert kinect data to world xyz coordinate
      int rawDepth = depth[offset];
      PVector v = depthToWorld(x,y,rawDepth);
 
      stroke(255);
      pushMatrix();
      // Scale up by 200
      float factor = 200;
      translate(v.x*factor,v.y*factor,factor-v.z*factor);
      // Draw a point
      point(0,0);
      popMatrix();
    }
  }
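
The loop above relies on a depthToWorld() function that is not shown on this page. A common way to write such a function is pinhole-camera back-projection: subtract the image center from the pixel coordinate, multiply by the depth, and divide by the focal length. The plain-Java sketch below illustrates that idea. The intrinsic values (FX, FY, CX, CY) are rough placeholder numbers for a 640x480 depth camera, not calibrated values and not necessarily what the PointCloud example uses, and it returns a float[] instead of a PVector so that it runs outside Processing.

```java
// Illustrative sketch of pinhole back-projection, not the PointCloud source.
class DepthToWorldSketch {
  // Placeholder intrinsics in pixels (assumptions, not calibration results).
  static final double FX = 594.0; // focal length, x
  static final double FY = 591.0; // focal length, y
  static final double CX = 320.0; // image center, x
  static final double CY = 240.0; // image center, y

  // Back-projects pixel (x, y) at the given depth in meters to {x, y, z} in meters.
  static float[] depthToWorld(int x, int y, float depthMeters) {
    float wx = (float) ((x - CX) * depthMeters / FX);
    float wy = (float) ((y - CY) * depthMeters / FY);
    return new float[] { wx, wy, depthMeters };
  }
}
```

In the sketch, the depth in meters would come from the lookup table, i.e. depthLookUp[rawDepth].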

Average Point Tracking

The real magic of the kinect lies in its computer vision capabilities. With depth information, you can do all sorts of fun things like say: "the background is anything beyond 5 feet. Ignore it!" Without depth, background removal involves all sorts of painstaking pixel comparisons. As a quick demonstration of this idea, here is a very basic example that computes the average xy location of any pixels in front of a given depth threshold.

Source: AveragePointTracking

In this example, we declare two variables to add up all the appropriate x's and y's and one variable to keep track of how many there are.

float sumX = 0;
float sumY = 0;
float count = 0;

Then, whenever we find a given point that complies with our threshold, we add the x and y to the sum:

  if (rawDepth < threshold) {
    sumX += x;
    sumY += y;
    count++;
  }

When we're done, we calculate the average and draw a point!

if (count != 0) {
  float avgX = sumX/count;
  float avgY = sumY/count;
  fill(255,0,0);
  ellipse(avgX,avgY,16,16);
}
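
Put together, the three fragments above amount to the following loop. This is a plain-Java sketch of the idea, not the AveragePointTracking source itself; the method name, the explicit w and h parameters, and returning null when nothing is in front of the threshold are illustrative choices.

```java
// Illustrative sketch, not the AveragePointTracking source.
class AveragePoint {
  // Returns {avgX, avgY} of all pixels whose raw depth is below threshold,
  // or null if no pixel qualifies. w and h are the depth image dimensions.
  static float[] averageCloserThan(int[] depth, int w, int h, int threshold) {
    float sumX = 0;
    float sumY = 0;
    int count = 0;
    for (int x = 0; x < w; x++) {
      for (int y = 0; y < h; y++) {
        int rawDepth = depth[x + y * w];
        if (rawDepth < threshold) {
          sumX += x;
          sumY += y;
          count++;
        }
      }
    }
    if (count == 0) {
      return null; // nothing in front of the threshold, so there is no average
    }
    return new float[] { sumX / count, sumY / count };
  }
}
```

In a sketch, the returned pair would be passed straight to ellipse(), after checking for null.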

Why don't the RGB images and depth values correspond properly?

Unfortunately, because the RGB camera and the IR camera are not physically located in the same spot, we have a stereo vision problem. Pixel XY in one image is not the same XY in an image from a camera an inch to the right. I'm hoping to stretch my brain to try to understand this better and work out some examples that calibrate the data in Processing. Stay tuned!

If you are interested in more (and software that will do this very job!) check out Nicolas Burrus' amazing work:

Theory on depth/color calibration and registration
version 0.3 of RGBDemo

What's missing?

Lots! Open a github issue if you want to add an item to my to do list!

https://github.com/shiffman/libfreenect/issues

FAQ

1. Why are there shadows in the depth image?

Kinect Shadow diagram

2. What is the range of depth that the kinect can see?

~0.7–6 meters or 2.3–20 feet. Note you will get black pixels (or a raw depth value of 2047) for elements that are either too far away or too close.

  • Tsnilsen

    Never mind, I found the javadoc info above

  • Schorschi

    Your instructions make no sense… I downloaded the zip file, but then what? Processing?  The link just goes to a directory structure?  Can you explain step by step what the heck is the next step?  How about 1, 2, 3… sequence?  With screen prints on a Windows based platform?

  • Gerry Straathof

    which version of processing are you trying to run?

  • Anonymous

    Unfortunately, the library does not work for Windows (see note above).   For the Kinect and Processing on windows, I’d suggest SimpleOpenNI: http://code.google.com/p/simple-openni/

    If you aren’t familiar with Processing and how to install libraries, take a look at:

    http://processing.org/learning/gettingstarted/
    http://wiki.processing.org/w/How_to_Install_a_Contributed_Library

  • Schorschi

    Thanks for the additional information… I did get things working using the Microsoft SDK for Kinect Beta2.  The official SDK is due out soon.  But I plan to explore other options as well, so this information will be a big help.

  • Thewhiterabbit

    No library found for org.openkinect
    No library found for org.openkinect.processing
    As of release 1.0, libraries must be installed in a folder named ‘libraries’ inside the ‘sketchbook’ folder.
     PLEASE PLEASE HELP

  • Mario Alejandro

    Thanks, my friend, I’m learning with your tutorials. Greetings from Colombia…  (Y)

  • http://www.facebook.com/maxim.safioulline Maxim Safioulline

    Is there a way to check if Kinect is there or not? Maybe
    if (kinect.devices > 0 ){
    //do stuff
    }
    Thanks

  • David Sanz Kirbis

    You can add this functionality to the library yourself, let’s say, by putting a new function in the Kinect.java file and building the library:

    public int getNumDevices() {
    context = Context.getContext();
    return context.devices();
    }

    or…. do a DIRTY workaround (actually I love these :) ):

    boolean kinectConnected = true;

    void setup() {
      size(1280,520);
      kinect = new Kinect(this);
      kinect.start();
      try {
        kinect.enableDepth(depth);
      }
      catch (NullPointerException ex) {
        kinectConnected = false;
        println("you humans… you forgot to plug your kinect again!!!");
      }
      if (kinectConnected) {
        kinect.enableRGB(rgb);
        kinect.enableIR(ir);
        kinect.tilt(deg);
      }
    }

    void draw() {
      if (kinectConnected) {
        background(0);
        image(kinect.getVideoImage(),0,0);
        image(kinect.getDepthImage(),640,0);
        fill(255);
        text("RGB/IR FPS: " + (int) kinect.getVideoFPS() + "        Camera tilt: " + (int)deg + " degrees",10,495);
        text("DEPTH FPS: " + (int) kinect.getDepthFPS(),640,495);
        text("Press 'd' to enable/disable depth    Press 'r' to enable/disable rgb image   Press 'i' to enable/disable IR image  UP and DOWN to tilt camera   Framerate: " + frameRate,10,515);
      }
    }

  • http://www.facebook.com/maxim.safioulline Maxim Safioulline

     I see, I see! Thank you much!

  • http://www.facebook.com/people/Phil-Spitler/558888713 Phil Spitler

    Hi, I love this project, thanks. What I would love to see would be to have the Point Cloud saved as a text file such as a PLY.

    RGBDemo can do this but only a still frame, it seems like with your libraries and Processing, it should be somewhat easy to figure out.

    Basically I would love to have a “record” button and once that is hit then each frame is written to disk as a PLY.

    My knowledge of processing is very limited and I am wondering if anybody would be interested in helping me out,

    Cheers.

    Phil

  • http://www.facebook.com/people/Phil-Spitler/558888713 Phil Spitler

    Hi, I modified your point cloud example to save an ascii PLY of every frame and it seems to work ok, I’m pretty new to Processing so my solution may be clunky.

    In other apps such as rgbDemo and CocoaKinect, there is control over the depth so that they only show points within a certain depth range. Is there a simple way to do that with your library?

    Thanks

    Phil

  • http://www.facebook.com/people/Phil-Spitler/558888713 Phil Spitler

    Hi, did you ever get the multiple Kinect thing working? I would love to record data from 2 Kinects and merge into a single point cloud.

  • cris241006

    Hi, let me know if you manage to connect the kinect. I have the same problem: I don’t know where to find org.openkinect and org.openkinect.processing.

  • Joe Misterovich

    I’ve installed the library and all of the examples work properly aside from the AveragePointTracking example which gives me “NullPointerException” when I attempt to run the sketch. Any solution?

  • Trunks7federer

    When I try to run this I get an error saying there is nothing named NativeKinect, and it highlights the NativeKinect.init(): where can i download the new library ?, 

  • Thewhiterabbit

    on mac you go to documents/processing/libraries and create libraries if it aint there then when you get libraries you throw them in there. in there it will be then a folder and always in that folder there will be a library folder. take SimpleOpenNI for example. It will go in the libraries folder called SimpleOpenNI and in that there will be library folder along with others. peace. 

  • http://profile.yahoo.com/OIXM2EQQVXODDC3OG6LUMILTRU Filippo

    all this is great!!!

    but i can’t understand: why is available only one rgb image and not two (one for each rgb camera) ?

  • Clement

    Thanks a lot for this!  It was very helpful. Great work

    Clement

  • Fernanda-herrera

     Hi Mario, I’m trying to use the kinect on a mac but haven’t been able to. Could you contact me so I can ask you how to do it? Many thanks

  • Fernanda-herrera

     fernanda-herrera@hotmail.com

  • Fernanda-herrera

    can someone help me to set up the kinect in mac osx im new in this thanks, i got some tutorial but i have some troubles

  • Digimouse

    I need some help measuring distances using the kinect application, i know this can be done with depth image,  the programming guide said “The depth data stream provides frames in which the high 13 bits of each pixel give the distance” but i have no idea how to get the depth data stream and get the 13 higher bits

  • claudia casati

    Hi! thank’s for your library! it’s great!!!!
    but… I don’t understand how I can get the z-axis coordinate. I thought with int[] getRawDepth(), but I don’t know how…. please, help me :)

  • Gerry Straathof

    This is the line which will help guide you: 

    translate(v.x*factor,v.y*factor,factor-v.z*factor);

    It is in the point cloud demo. You can see what the program calculates from this value. When you generate the different values for v.x and v.y they use the value of v.z to figure out where they are among the cone-shaped field of view from the kinect. The kinect sees things in a cone. If you just displayed the z-value as a height with the un-calculated x/y values from the camera, things might look a bit… strange.

    I would suggest you play around with figuring out what I mean by strange, though. It’s a secret… 8^)

  • mcnet

    kinect and Lion???

    thank you very much for the library and the forum contributions, are really fantastic!

    I have used them great on my macOS with snow leopard :)
    … but now I’ve switched to Lion and processing does not recognize the kinect any more :( (
    can anyone help me? 

  • Ray

    Hi there!

    How would you use processing in correlation with Microsoft’s SDK package? What would I reference for this part:
    import org.openkinect.*;
    import org.openkinect.processing.*;

  • Chris

    Hey guys,

    I keep getting an error that says “NullPointerException” whenever I try to load a basic depthmap file. Why is this?

  • http://www.facebook.com/Vigneshwaran.Hariharan Vigneshwaran Hariharan

    How to configure the standalone sensor for further processing etc..etc..

  • Gerry Straathof

    could you be more specific about your intentions? There are a number of resources online which explore using the output from the kinect in distinct ways, such as silhouettes, open-cv based proximity and skeleton tracking. What are the specifics behind what you are trying to do…?

  • Gerry Straathof

    Can you see the grayscale image from the kinect? Are you running the demo code for the depth map, or have you made your own system. If you have your own code that isn’t working, you may want to try stackoverflow.com to get help with the behaviour of your code. If it is the demo that isn’t working, you may still find some important clues there.

  • as

    Hi, I am trying to run the PointCloud example and I keep getting this:
    “Java.lang.NoClassDefFoundError: Could not initialize class org.openkinect.Context”
    any ideas as to what could be wrong?

  • P.P.

    Hi,
    I need to know how to do a simultaneous localization and mapping as 3D with this platform.
    Like in this link :http://www.youtube.com/watch?v=dGVnPvgqu3M. That’s the 3D point cloud
    will be accumulated as Kinect moves.
    Thanks in advance.
    Piyaphat

  • Krishna

    This may seem like a silly question: the raw depth data seems to have 2047 “levels”, but is displayed as pixels on screen as values with a range from “0 – 255” (black and white). Is this true?

    The idea is to use the kinect to record the grey scale depth video or even a still grey scale depth image data for starters, from three equidistant vantage points (with the subject in the middle of the triangle), manipulate the footage in Photoshop (to account for the perspective distortion and giving it a spherical mapping) and then extrude them together into a point cloud in processing using this stock footage.

    The method Ive been using so far is reading the brightness() of a pixel ( from the grey scale image ) and then generating a extrude / Z value.
    The trouble with this is the brightness() function in processing only provides a value between 0 and 255! instead of the 2047 levels needed. So the subject appears really flat!

    Is there a potential work around for my scenario?

  • Gerry Straathof

    The thing to remember is the depth information is stored in the same way an image would be stored. An array with width and height. It is not the same as the grayscale image. Those are two different things. If you look at the point cloud example, you will see that the grayscale image has been turned off to save bandwidth.

    If you are just using photoshop to build a composite image, then skip the depth information and just use the grayscale image. You will not have as detailed a result, since the values are compressed to 256 shades of gray.

    The main differences between the two is that you can use the depth information to ‘build’ your own grayscale image, and ignore certain points as being ‘out of the bounds’ of your object. So you would read the depth information, ignore any values outside of your bounds (nothing farther than 8 feet, and nothing closer than 5, let’s say) and then build a greyscale image based on those results… for example.

    For your project, you will probably want to use the depth information from the 3d point cloud to isolate the points you want to play with, build an array to store the points from the various kinects (perhaps a 3-value array with width, height and kinect#) and then build your composite grayscale image if necessary.

    The problem, as I understand this, is that you may be wanting to use 3d information, but are trying to tackle it from a 2d graphic manipulation concept… and not getting the results you want. You cannot get the detailed depth information from a 2d image (ir or rgb), only from the depth map or whatever system you come up with for storing the values.

    Good luck.

  • Krishna

    Thanks for your reply!

    I was thinking along the same lines after I read this article below…

    I found this project really helpful, they changed Shiffman’s code so that it outputs the depth array data out as a series of .txt files per frame so its recording “depth video” so to speak, whats even cooler is that they’ve provided a python script that takes this .txt info into cinema4D to generate meshes in real time for post-production!

    link : http://moullinex.tumblr.com/post/3180520798/catalina-music-video

    Total awesomeness!

    Now just need to figure out how to read these series of .txt files back into processing to reconstruct the image…. hopefully not too hard..

  • 1ndustrialdesign

    Any word on when this will be available for windows SDK 1.0?

  • Alex_S

    Hi, I was wondering if there was a way to make the RGB Kinect camera register as a webcam. I am using a program that recognizes QR codes from video and it uses the Capture() method to get the video feed to analyze. I’m using other features of the Kinect, blob detection to make a silhouette, for a project and am using QR codes to change some things about the silhouette. Basically, I need to be standing in front of the Kinect while holding up a QR code, so it just makes sense to use the RGB camera that’s already in the Kinect as opposed to a separate camera. Do you know of any way to do this? I can’t find anything through google or the processing forums.

  • Rookie

    Can you update that information. Please. I’m running Ubuntu 12.10?

  • David Sanz Kirbis

    Unless there is any special reason to use freenect library, as Daniel suggested to windows users, I’d also suggest you to try with Simple Open NI: http://code.google.com/p/simple-openni/
    With that you’ll also get the skeleton tracking feature.

  • Cam

    I want to try and create a motion detector type thing, and I’m trying to do that by storing the first PImage in a variable, then I have a delay for 1 second, and store a second PImage into a second variable, then comparing those two. Unfortunately, it always tells me that they are equal, would anyone know why? Code:

    void draw()
    {
    background(0);

    PImage frame1, frame2;

    PImage img = kinect.getVideoImage();
    image(img, 0, 0);
    frame1 = img;

    fill(255);

    //This separates frames
    delay(1000);

    PImage newimg = kinect.getVideoImage();
    image(newimg, 0, 0);
    frame2 = newimg;

    if (frame1 == frame2)
    {
    text("frames[0] = frames[1]", 640, 495);
    }
    else
    {
    text("frames[0] != frames[1]", 640, 495);
    }

    frame1 = null; frame2 = null;
    }

  • David Sanz Kirbis

    Look for the FrameDifferencing sketch in the video capture examples. At the bottom of the sketch you’ll find a variable called movementSum, which you can use to determine the minimum amount of movement you want to be considered as a trigger for your purposes. If you want to run at 1 frame per second, put frameRate(1) in the setup.

  • Cam

    What file is that / folder is that in? I couldn’t find it in the openkinect examples.

  • Cam

    Or could you provide me with a download link? I don’t think I have the files you are talking about

  • james braselton

    hi there want kinect

  • Gerry Straathof

    I’m assuming you know how to use Processing. The sketch he is referring to is in processing, and has nothing to do with kinect, but it will point you in the right direction. You need to understand how to work with video (from any camera source, not just kinect) to see how it works.

    In processing, depending on the version, under ‘files’ in the menu there should be a selection for ‘examples’ in there you will find many useful examples on how the processing system works. Under ‘libraries’ you will find one on video. Under ‘Capture’ you will find a sketch for frame differencing. Start there.

  • Cam

    Found it, thank you very much.

  • emusicman11

    I get a message saying No library found for org.openkinect No library found for org.openkinect.processing Libraries must be installed in a folder named ‘libraries’ inside the ‘sketchbook’ folder.

    But I don’t even have a folder on my computer named sketchbook. Is processing just stupid or something? Does it matter if I just put it in the libraries folder?

  • emusicman11

    Source code: https://github.com/shiffman/libfreenect/tree/master/wrappers/java/processing

    What does this mean? What do I do with this?

  • emusicman11

    What does this mean: “Open Kinect: openkinect.org”

    Do you want me to download something somewhere on this page? What? Where? Where do I put the folder? What is it? What does it do? I am just trying to get a simple point cloud working running from processing.