{"id":305495,"date":"2011-09-26T18:30:21","date_gmt":"2011-09-27T01:30:21","guid":{"rendered":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/?p=305495"},"modified":"2016-10-13T13:22:07","modified_gmt":"2016-10-13T20:22:07","slug":"kinect-body-tracking-reaps-renown","status":"publish","type":"post","link":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/blog\/kinect-body-tracking-reaps-renown\/","title":{"rendered":"Kinect Body Tracking Reaps Renown"},"content":{"rendered":"<p><em>By Rob Knies, Managing Editor, Microsoft Research<\/em><\/p>\n<p>By any standard, <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/www.xbox.com\/en-US\/xbox-360\/accessories\/kinect\" target=\"_blank\">Kinect for Xbox 360<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> has proved to be a technological sensation. Kinect, the controller-free interface that enables users to interact with the Xbox 360 with the wave of a hand or the sound of a voice, sold 8 million units in its first 60 days on the market, a figure that makes it the fastest-selling consumer-electronics device in history, as confirmed by Guinness World Records.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignleft size-full wp-image-305414\" src=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2016\/10\/Kinect-Contributions.png\" alt=\"Kinect contributions\" width=\"250\" height=\"250\" srcset=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2016\/10\/Kinect-Contributions.png 250w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2016\/10\/Kinect-Contributions-150x150.png 150w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2016\/10\/Kinect-Contributions-180x180.png 180w\" sizes=\"auto, (max-width: 250px) 100vw, 250px\" \/>That, of course, is a tribute to the Interactive Entertainment Business (IEB), which produced 
Kinect\u2014and, by extension, to Microsoft Research, which made several key contributions to the technology.<\/p>\n<p>For many, the most noteworthy part of Kinect is its ability to track the body movements of users and provide natural interaction as a result, and critical portions of that work came from <a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/lab\/microsoft-research-cambridge\/\" target=\"_blank\">Microsoft Research Cambridge<\/a>.<\/p>\n<p><a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/jamiesho\/\" target=\"_blank\">Jamie Shotton<\/a>, <a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/awf\/\" target=\"_blank\">Andrew Fitzgibbon<\/a>, Andrew Blake, <a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/tsharp\/\" target=\"_blank\">Toby Sharp<\/a>, and Mat Cook of Microsoft Research Cambridge each made seminal contributions to the success of Kinect, and the value of their research has been widely praised, in particular during early June, when they received the <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/www.raeng.org.uk\/news\/news-releases\/2011\/June\/cambridge-engineers-kinect-land-uk-prize\" target=\"_blank\">MacRobert Award<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, the most significant honor bestowed for innovation by The Royal Academy of Engineering. 
Another member of the Cambridge contingent who helped shape the skeletal-tracking feature was Oliver Williams, who worked at that facility before transferring to Microsoft Research Silicon Valley.<\/p>\n<div id=\"attachment_305501\" style=\"width: 410px\" class=\"wp-caption alignright\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-305501\" class=\"size-full wp-image-305501\" src=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2016\/10\/MacRobert-Award.jpg\" alt=\"MacRobert Award\" width=\"400\" height=\"310\" srcset=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2016\/10\/MacRobert-Award.jpg 400w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2016\/10\/MacRobert-Award-300x233.jpg 300w\" sizes=\"auto, (max-width: 400px) 100vw, 400px\" \/><p id=\"caption-attachment-305501\" class=\"wp-caption-text\">Researchers from Microsoft Research Cambridge beam during the ceremony in which they won the MacRobert Award: (from left) Mat Cook, Jamie Shotton, Andrew Blake, Andrew Fitzgibbon, and Toby Sharp.<\/p><\/div>\n<p>\u201cThis technology is a radical development in human-computer interaction,\u201d says Blake, a Microsoft distinguished scientist and managing director of Microsoft Research Cambridge. \u201cFirst, we had the green screen, then mouse and keyboard, then touch and multitouch, and now, what could be called \u2018no-touch\u2019 interaction.<\/p>\n<p>\u201cIt has radically advanced the state of the art in gaming, but that is just the beginning. This is a new kind of technology that could have far-reaching implications for the way we interact with different kinds of machines.\u201d<\/p>\n<p>Fitzgibbon, a researcher at the facility, has an extensive background in computer-vision research connected with films and video, so he performed an invaluable role in consulting for the various components of the skeletal-tracking research. 
Shotton, also a researcher, devised the algorithm that takes an image from Kinect\u2019s depth camera and identifies the different parts of the body. Sharp, senior research software-development engineer, worked on high-speed implementations of the algorithm for the Kinect system, and Cook, a contractor with extensive experience in working on computer games, was called in to provide a massive amount of training data.<\/p>\n<p>\u201cKinect takes a stream of images coming off the camera,\u201d Shotton explains, \u201cand quickly works out where the joints in your body are in 3-D. It can use that to animate characters and to manipulate objects on the screen.\u201d<\/p>\n<div id=\"attachment_305504\" style=\"width: 410px\" class=\"wp-caption alignleft\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-305504\" class=\"size-full wp-image-305504\" src=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2016\/10\/email.jpg\" alt=\"Kinect\u2019s skeletal-tracking ability\" width=\"400\" height=\"310\" srcset=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2016\/10\/email.jpg 400w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2016\/10\/email-300x233.jpg 300w\" sizes=\"auto, (max-width: 400px) 100vw, 400px\" \/><p id=\"caption-attachment-305504\" class=\"wp-caption-text\">This email began the collaboration that culminated in Kinect\u2019s skeletal-tracking ability.<\/p><\/div>\n<p>The research effort that led to the skeletal-tracking ability of Kinect began in September 2008 with an email from Mark Finocchio, principal software engineer with IEB, to Shotton, asking for help with a planned accessory that could track bodies in real time for use in gaming.<\/p>\n<p>The Xbox team already was using a depth camera and had written a tracking algorithm that could track a body\u2019s movements quickly. 
A video sent to Cambridge left the researchers there impressed.<\/p>\n<p>\u201cWe saw that video,\u201d Fitzgibbon recalls, \u201cand thought, \u2018OK, you\u2019ve solved it, so why are you talking to us?\u2019\u201d<\/p>\n<p>Soon enough, the reasons became apparent. To begin with, to start the recognition process, the user had to strike a particular pose with arms extended. More important, the algorithm typically would work well for a while, then, when the motion input became unpredictable, the body tracking could \u201ccrumple\u201d into unusable disarray, and the only way to reset the tracking system was to strike the pose again. Because the system sometimes failed after only about a minute, it simply was not feasible for extended game play.<\/p>\n<p>\u201cWhat they wanted,\u201d Fitzgibbon recalls, \u201cwas a way of initializing the algorithm, recovering from these crumplings\u2014preventing them in the first place, ideally\u2014and making it work for everyone, no matter your body shape and size.\u201d<\/p>\n<p>The algorithm\u2019s handicap was that, basically, it was relying on analysis of past motions to predict the future. When a user began moving deliberately, the algorithm would assume that was the natural pace of motion, so when certain types of fast motion occurred\u2014for example, when a person moved too far within a 30-millisecond span, which easily can occur when people are playing an exciting video game\u2014it couldn\u2019t keep up.<\/p>\n<h2>One at a Time<\/h2>\n<p>\u201cWe realized that we had to look at a single image at a time,\u201d Shotton says. \u201cWe couldn\u2019t rely exclusively on context, your history, or your motion in the past. We had to just look at an image and decide what your body pose was. We knew that this was, in theory, possible, because a person can do this. 
If you look at a photo, you can draw the position of the joints.\u201d<\/p>\n<p>Related work in this area existed in the computer-vision literature, and the team tried one promising approach.<\/p>\n<p>\u201cIt would try to match your whole body at once,\u201d Shotton says. \u201cIt would have a big database of the way the body appears in different positions. You\u2019d take the image coming off the camera and search through this big \u2018flipbook\u2019 of different body positions and try each of them against the image until you find the one that matches best.<\/p>\n<p>\u201cThat kind of worked. We got Xbox very excited quickly. But we realized early on that it wasn\u2019t going to scale up to our needs. It\u2019s essentially a brute-force approach, and you have to represent every possible combination of human pose and body shape and size into the training set.\u201d<\/p>\n<p>In addition, the number of joints under such analysis led to a gigantic number of potential body poses.<\/p>\n<p>\u201cLet\u2019s say I can bend my right elbow into 10 positions,\u201d Shotton continues, \u201cand I can move my right shoulder in a hundred different positions. That\u2019s 10 times 100\u2014a thousand. I\u2019ve got another 1,000 on the left side, so when I multiply those, we\u2019re at a million. And then we\u2019ve got maybe a hundred more positions for this joint or that joint. You get this exponential growth in the number of positions the human body can make. This was obviously quite a major issue.\u201d<\/p>\n<p>Existing literature pointed a way out, though, by breaking up the problem: trying to match the hand separately to the shoulder to avoid the exponential compounding. Shotton decided to alter his approach. 
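<\/p>
<p>The combinatorial arithmetic Shotton walks through can be sketched directly; the joint counts below are the illustrative figures from his example, not real system parameters.<\/p>

```python
# Pose-space growth under whole-body template matching, using
# Shotton's illustrative joint counts (hypothetical discretization).
right_elbow = 10       # distinguishable right-elbow positions
right_shoulder = 100   # distinguishable right-shoulder positions

right_arm = right_elbow * right_shoulder  # 1,000 right-arm poses
both_arms = right_arm * right_arm         # 1,000,000 (left mirrors right)

# Every additional articulated joint multiplies the total again,
# which is why a whole-body "flipbook" of templates cannot scale:
with_one_more_joint = both_arms * 100     # 100,000,000

print(right_arm, both_arms, with_one_more_joint)
```
<p>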
Instead of trying to predict the positions directly, he\u2019d just examine every pixel in an image and try to determine which part of the body it represents.<\/p>\n<p>That, the team soon realized, was akin to <a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/publication\/textonboost-joint-appearance-shape-and-context-modeling-for-mulit-class-object-recognition-and-segmentation\/\" target=\"_blank\">earlier work<\/a> on object recognition in images by Shotton, Sharp, and Microsoft Research Cambridge colleagues <a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/antcrim\/\" target=\"_blank\">Antonio Criminisi<\/a>, <a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/jwinn\/\" target=\"_blank\">John Winn<\/a>, and Carsten Rother.<\/p>\n<div id=\"attachment_305510\" style=\"width: 164px\" class=\"wp-caption alignright\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-305510\" class=\"size-full wp-image-305510\" src=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2016\/10\/sheep-and-grass.jpg\" alt=\"sheep-and-grass example\" width=\"154\" height=\"216\" \/><p id=\"caption-attachment-305510\" class=\"wp-caption-text\">The sheep-and-grass example of object recognition in images.<\/p><\/div>\n<p>\u201cWe usually use the sheep-and-grass example,\u201d Shotton says. \u201cYou take a photo of a sheep on grass and try to segment the sheep against the grass and label those automatically as sheep and grass. We knew this worked. The key realization for Kinect was that we can take that, apply it to this new problem, where we were trying to effectively color in the body. 
We got an image from the camera, and we\u2019re trying to color in the body with different body-part colors.<\/p>\n<p>\u201cIf we can take an image coming off the depth camera, a gray-scale image where the different pixels represent depth to the sensor, and convert it to this body-part coloring, we have an extremely good idea of where the joints in the body are, because we defined these parts to be near the joints. If you can get all of the pixels that are labeled as the person\u2019s left hand and color them in correctly, by clustering those together and using the depth information, that gives us a really good idea about the 3-D XYZ coordinate of the left hand.\u201d<\/p>\n<div id=\"attachment_305513\" style=\"width: 320px\" class=\"wp-caption alignleft\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-305513\" class=\"size-full wp-image-305513\" src=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2016\/10\/body-part-coloring.jpg\" alt=\"body-part coloring\" width=\"310\" height=\"240\" srcset=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2016\/10\/body-part-coloring.jpg 310w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2016\/10\/body-part-coloring-300x232.jpg 300w\" sizes=\"auto, (max-width: 310px) 100vw, 310px\" \/><p id=\"caption-attachment-305513\" class=\"wp-caption-text\">Body-part coloring identifies areas near joints in the body.<\/p><\/div>\n<p>With the object-recognition approach showing promise, Sharp began to consider how Shotton\u2019s algorithm could be implemented on the Xbox 360 hardware, which had existed since 2005 and was already ensconced in millions of living rooms around the world. 
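<\/p>
<p>The clustering step Shotton describes can be sketched in a few lines; the toy depth image, the label set, and the pinhole-camera intrinsics below are all invented for illustration.<\/p>

```python
import numpy as np

# Toy 4x4 depth image (meters) with per-pixel body-part labels;
# label 1 stands in for "left hand", 0 for everything else.
depth = np.full((4, 4), 2.0)
labels = np.zeros((4, 4), dtype=int)
labels[1:3, 1:3] = 1      # a small cluster of left-hand pixels
depth[1:3, 1:3] = 1.5     # the hand sits closer to the sensor

# Hypothetical pinhole intrinsics for back-projecting pixels to 3-D.
fx = fy = 2.0
cx = cy = 1.5

# Gather the left-hand pixels, then use their depth values to place
# the joint hypothesis in camera-space XYZ coordinates.
ys, xs = np.nonzero(labels == 1)
z = depth[ys, xs].mean()
x = (xs.mean() - cx) / fx * z
y = (ys.mean() - cy) / fy * z
print(x, y, z)  # -> 0.0 0.0 1.5
```

A plain mean over the labeled pixels stands in for the clustering here; the production system resolved the labeled regions into joints with more sophistication.
<p>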
Not only was that hardware not going to be retrofitted, but game developers also have gotten extremely adept at using all available processing capability and memory to optimize their titles.<\/p>\n<p>\u201cIt becomes a challenge to squeeze implementation of a state-of-the-art tracking algorithm onto pre-existing hardware,\u201d Sharp says, but he was able to apply more of his software-engineering expertise.<\/p>\n<p>\u201cSome work I\u2019d done became relevant,\u201d he says, \u201cabout how to run these decision-forest algorithms on a graphics processor using DirectX [Microsoft technologies for running and displaying multimedia applications]. The sheep-and-grass example translated to the Kinect domain. The implementation on a graphics card translated to the Kinect domain, as well. A <a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/publication\/implementing-decision-trees-and-forests-on-a-gpu\/\" target=\"_blank\">paper I published on that in 2008<\/a> was used as a starting point for the development of the Kinect implementation.\u201d<\/p>\n<p>The researchers also needed training data. Enter Cook, in December 2008.<\/p>\n<p>\u201cWe were going to use machine learning,\u201d Shotton says, \u201cbecause we\u2019re in the machine-learning group, we have experience with it, and we know it can work really well. But the thing about machine learning is that you need data. That\u2019s why we brought Mat on board. His job was to develop a way of generating images that we needed.<\/p>\n<p>\u201cWe couldn\u2019t capture real images of people and label them by hand. That would take too long, be too expensive, and we\u2019d never get enough data. But if we could synthesize images using computer graphics, we could use those synthesized images for our training data. Synthesizing depth images, as opposed to the usual RGB color images, turns out to be a sweet spot of what we can achieve reliably. 
\u201d<\/p>\n<p>That enabled the collection of training data to begin in earnest.<\/p>\n<p>\u201cWe began to acquire motion-capture data, 3-D joint positions,\u201d Fitzgibbon says. \u201cMat would feed that motion-capture data into a computer-graphics tool that generated depth images so we\u2019d have something to test on where we knew the right answer. We had a ground-truth answer associated with each image.\u201d<\/p>\n<p>At the outset, Cook had no idea exactly what he was helping to achieve. The real nature of the project was a secret.<\/p>\n<p>\u201cIt wasn\u2019t immediately clear exactly what the whole system was going to be and how well it would work,\u201d Cook smiles. \u201cI was told that we\u2019re working on a system that would plug into an Xbox and track people. The first challenge was to generate anything, to get some data that was useful fairly quickly.\u201d<\/p>\n<h2>Hundreds of Thousands of Poses<\/h2>\n<p>Off-the-shelf software was proving too slow or unable to generate a good depth image, but one tool was up to the task. It could work with existing motion data sets, enabling the use of motion data from different people and different poses. And it included a feature that fogs out items that are farther away. That provided image depth. A series of tweaks enabled the synthesis of millions of images. Eventually, the training data ballooned into hundreds of thousands of body poses.<\/p>\n<p>What followed was a lengthy period of incremental improvements in getting accuracy up to snuff.<\/p>\n<p>\u201cWe could plot graphs of accuracy versus the number of training images,\u201d Shotton says. \u201cYou\u2019d see it going up and up and up. We knew we had to keep extending the training set, but it was taking a week to train, and that wasn\u2019t fast enough. 
We spent a long time working with our colleagues at Microsoft Research Silicon Valley to make it train quickly.\u201d<\/p>\n<p>Once they did, they realized they had an algorithm, dubbed Exemplar, that would work fast enough and accurately enough to run on every frame of the stream of data from the Kinect depth camera.<\/p>\n<p>To obtain realistic body positions, they went global.<\/p>\n<p>\u201cWe told the Xbox people we\u2019d need real-world test data,\u201d Fitzgibbon says. \u201cAmazingly, they sent a team of people with the prototype depth camera to 10 houses across the planet and asked the residents to dance around as if they were playing Kinect games. No one was allowed to see the data unless they were testing the algorithm.\u201d<\/p>\n<p>In June 2009, the researchers attended an offsite meeting near Microsoft\u2019s Redmond, Wash., headquarters. By this point, Kinect was moving from incubation into production. The researchers explained the algorithm in more detail to the platform team, which was a bit wary of the machine-learning component. How, the product team asked, are we supposed to debug this? The team from Cambridge convinced its counterparts that machine learning was just another form of test-driven development and that more training data would improve performance.<\/p>\n<p>Shortly after the meeting, the Xbox team wrote an algorithm of its own, one that takes the output from the Exemplar algorithm and applies additional wizardry to provide a full skeleton rather than a collection of joints and to exploit temporal coherence.<\/p>\n<h2>Help from the Valley<\/h2>\n<p>The latter work was aided significantly by Williams of Microsoft Research Silicon Valley. 
That facility also supplied <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/budiu.info\/work\/\" target=\"_blank\">Mihai Budiu<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, who collaborated with Shotton on distributing the training algorithm.<\/p>\n<p>By January 2010, the final algorithm was basically complete. The GPU implementation had enabled the processing of 30 frames per second while using only 10 percent of the hardware resources.<\/p>\n<p>\u201cThe algorithm is highly parallel,\u201d Fitzgibbon says. \u201cComputation can be run on every pixel of an image independently, making it suitable for a graphics processor, and can be implemented using graphics primitives. Those primitives are based on rendering triangles to cover pixels in an image buffer. The result is that every pixel in the body is labeled, in real time, according to the part of the body to which it belongs.\u201d<\/p>\n<p>Adds Shotton: \u201cHad Toby not done this work on the GPU runtime of the decision forest, we may not have even considered it as a possibility, because we wouldn\u2019t have known that it was something you could do fast enough.\u201d<\/p>\n<p>That, Sharp says, is a hallmark of Microsoft Research.<\/p>\n<p>\u201cIt is an example of research,\u201d he observes, \u201cthat turned out to be crucial for a product almost immediately after it was done, even though it wasn\u2019t done with the product in mind.\u201d<\/p>\n<p>At the time, the effort that eventually became Kinect was known as Project Natal. It was enthusiastically received once unveiled in Los Angeles in June 2009 during the Electronic Entertainment Expo. 
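<\/p>
<p>The per-pixel independence Fitzgibbon describes is what made a GPU implementation natural. A minimal sketch of one depth-difference split test of the kind a decision forest evaluates at every pixel; the offsets, threshold, and toy depth map are invented for illustration, not the shipped feature set.<\/p>

```python
import numpy as np

# Toy depth map (meters): background at 4 m, a "body" patch at 2 m.
depth = np.full((6, 6), 4.0)
depth[2:5, 2:5] = 2.0

def split_test(depth, y, x, u, v, threshold):
    """One decision-forest split: compare the depth difference between
    two probe pixels offset from (y, x). Offsets are scaled by 1/depth
    so the feature is roughly invariant to distance from the sensor."""
    h, w = depth.shape
    d = depth[y, x]
    def probe(dy, dx):
        py = min(max(y + int(round(dy / d)), 0), h - 1)
        px = min(max(x + int(round(dx / d)), 0), w - 1)
        return depth[py, px]
    return probe(*u) - probe(*v) > threshold

# Each pixel is evaluated independently of every other pixel, so this
# double loop maps directly onto one GPU thread per pixel.
responses = np.array([[split_test(depth, y, x, (0, 4), (0, -4), 0.5)
                       for x in range(6)] for y in range(6)])
```
<p>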
Renowned film director Steven Spielberg was particularly effusive in his praise.<\/p>\n<p>It was around then that the Cambridge researchers got their hands on the first games designed to use the capabilities they had helped to create.<\/p>\n<p>\u201cWe could immediately see,\u201d Shotton recalls, \u201cthat even if it doesn\u2019t work 100 percent, the games are going to be fun!\u201d<\/p>\n<p>Adds Fitzgibbon: \u201cThat was the first time we realized that, yes, we probably are going to get there.\u201d<\/p>\n<p>That they did, as evidenced by the fact that <a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/publication\/real-time-human-pose-recognition-in-parts-from-a-single-depth-image\/\" target=\"_blank\">Real-Time Human Pose Recognition in Parts from Single Depth Images<\/a>, the paper that resulted from their work, received the best-paper award during the Institute of Electrical and Electronics Engineers\u2019 2011 Computer Vision and Pattern Recognition conference, held June 21-23, 2011, in Colorado Springs, Colo.<\/p>\n<h2>Black-Tie Time<\/h2>\n<p>Two weeks earlier, before a black-tie-clad audience of hundreds at the Guildhall in London, Messrs. Blake, Fitzgibbon, Shotton, Sharp, and Cook had heard their names read as the MacRobert Award winners.<\/p>\n<p>\u201cIt was an evening,\u201d Fitzgibbon relates, \u201cwhere there was a lot of smiling.\u201d<\/p>\n<p>That good humor was the result of lots of hard work\u2014and effective teamwork\u2014within the Cambridge team and with the Xbox product and incubation groups.<\/p>\n<p>\u201cIt was a really good, constructive collaboration,\u201d Shotton says. \u201cMark Finocchio, the guy I was working with directly, is a very clever coder and did a fantastic job of taking what we were throwing at him and integrating it.\u201d<\/p>\n<p>Matt Bronder, principal software engineer for IEB, also collaborated with the researchers. 
It didn\u2019t hurt that he and Finocchio had been aware of Sharp\u2019s work on decision trees on GPUs.<\/p>\n<p>\u201cI had stumbled upon some interesting image-classification work online from a researcher named Jamie Shotton,\u201d Finocchio recalls. \u201cI found that he actually worked at Microsoft and contacted him immediately. He felt that there was a possibility of something there and that it was worth pursuing. As I got to know Jamie, I learned that he is unique. He\u2019s an incredible researcher <em>and<\/em> developer. Because of this, he can see the academic side through a practical perspective. A quality like that is rare, and this company is lucky to have him.\u201d<\/p>\n<p>That mutual respect is a common thread among those who worked on this project.<\/p>\n<p>\u201cFor me,\u201d Cook says, \u201cit\u2019s been an amazing opportunity to work with so many fantastic, great people across the company. To see the research we do in a real product, having a real impact, is an amazing experience.\u201d<\/p>\n<p>Sharp offers an additional perspective.<\/p>\n<h2>A Big Leap<\/h2>\n<p>\u201cTechnology progresses sometimes in big leaps and sometimes in small steps,\u201d he says. \u201cWith Kinect, particularly with the machine learning in Kinect, it\u2019s definitely one of the big leaps. It\u2019s great to have been a part of that story, which I think will stand out as a milestone long into the future.\u201d<\/p>\n<p>That\u2019s the kind of excitement and wonder the project has brought to those who helped bring it to life\u2014not to mention the thousands playing a Kinect game at this very moment.<\/p>\n<p>For Blake, this is the culmination of a long process.<\/p>\n<p>\u201cPeople have been working on vision for decades,\u201d he observes, \u201csince long before it was practical to build real-time vision machines, and the human-body-motion-tracking problem has been open for the last two decades or so. 
Now, Kinect is a prime example of a computer-vision system that has really impacted mainstream technology.\u201d<\/p>\n<p>As for Shotton, who had been employed by Microsoft Research Cambridge a mere three months when he received Finocchio\u2019s fateful email, the project became something that he could socialize proudly\u2014once the word was out.<\/p>\n<p>\u201cIt was very exciting to work on something that was new and different, with world-class engineering that would change the world,\u201d Shotton says. \u201cI couldn\u2019t share that excitement with any of my friends or family, because it was all top-secret. But eventually, as the press reports came out, I could say, \u2018Yeah, this is what I\u2019m working on.\u2019\u201d<\/p>\n<p>Cook knows the feeling.<\/p>\n<p>\u201cIt was great to be working on something that you could talk about down the pub and people would have some idea what it did,\u201d he smiles. \u201cThe game thing where you have to jump around vigorously for three hours in order to play it\u2014parents approve very much of this.\u201d<\/p>\n","protected":false},"excerpt":{"rendered":"<p>By Rob Knies, Managing Editor, Microsoft Research By any standard, Kinect for Xbox 360 has proved to be a technological sensation. 
Kinect, the controller-free interface that enables users to interact with the Xbox 360 with the wave of your hand or the sound of your voice, sold 8 million units in its first 60 days [&hellip;]<\/p>\n","protected":false},"author":39507,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-author-ordering":[],"msr_hide_image_in_river":0,"footnotes":""},"categories":[194471,194476],"tags":[214208,214193,214202,214196,187405,196135,202515,214205,203103,197061,214199,212255],"research-area":[13562,13552],"msr-region":[],"msr-event-type":[],"msr-locale":[268875],"msr-post-option":[],"msr-impact-theme":[],"msr-promo-type":[],"msr-podcast-series":[],"class_list":["post-305495","post","type-post","status-publish","format-standard","hentry","category-computer-vision","category-devices-and-hardware","tag-3-d-joint-positions","tag-controller-free-interface","tag-directx","tag-gaming","tag-human-computer-interaction","tag-kinect-for-xbox-360","tag-macrobert-award","tag-motion-capture-data","tag-object-recognition","tag-royal-academy-of-engineering","tag-sheep-and-grass-example","tag-skeletal-tracking","msr-research-area-computer-vision","msr-research-area-hardware-devices","msr-locale-en_us"],"msr_event_details":{"start":"","end":"","location":""},"podcast_url":"","podcast_episode":"","msr_research_lab":[199561],"msr_impact_theme":[],"related-publications":[],"related-downloads":[],"related-videos":[],"related-academic-programs":[],"related-groups":[],"related-projects":[],"related-events":[],"related-researchers":[],"msr_type":"Post","byline":"","formattedDate":"September 26, 2011","formattedExcerpt":"By Rob Knies, Managing Editor, Microsoft Research By any standard, Kinect for Xbox 360 has proved to be a technological sensation. 
Kinect, the controller-free interface that enables users to interact with the Xbox 360 with the wave of your hand or the sound of your&hellip;","locale":{"slug":"en_us","name":"English","native":"","english":"English"},"_links":{"self":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/posts\/305495","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/users\/39507"}],"replies":[{"embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/comments?post=305495"}],"version-history":[{"count":3,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/posts\/305495\/revisions"}],"predecessor-version":[{"id":305522,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/posts\/305495\/revisions\/305522"}],"wp:attachment":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/media?parent=305495"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/categories?post=305495"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/tags?post=305495"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=305495"},{"taxonomy":"msr-region","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-region?post=305495"},{"taxonomy":"msr-event-type","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-event-type?post=305495"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/cm-edgetun.p
ages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=305495"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=305495"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=305495"},{"taxonomy":"msr-promo-type","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-promo-type?post=305495"},{"taxonomy":"msr-podcast-series","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-podcast-series?post=305495"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}