{"id":307643,"date":"2007-04-19T13:00:33","date_gmt":"2007-04-19T20:00:33","guid":{"rendered":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/?p=307643"},"modified":"2016-10-18T23:59:28","modified_gmt":"2016-10-19T06:59:28","slug":"personal-audio-space-headphones-experience-sans-headphones","status":"publish","type":"post","link":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/blog\/personal-audio-space-headphones-experience-sans-headphones\/","title":{"rendered":"Personal Audio Space: The Headphones Experience sans Headphones"},"content":{"rendered":"<p><em>By Rob Knies, Managing Editor, Microsoft Research<\/em><\/p>\n<p>Many people are accustomed to donning headphones to enjoy music at a desired volume without inflicting their tunes on others nearby. But there are tradeoffs inherent in the headphones experience.<\/p>\n<p>For one, you\u2019re generally physically tethered to the sound source via wires, inhibiting your movement. Second, the headphones tend to isolate you from your physical environment.<\/p>\n<p>What if such limitations could be overcome? 
Within <a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/lab\/microsoft-research-redmond\/\" target=\"_blank\">Microsoft Research Redmond<\/a>, a trio of researchers has moved beyond \u201cwhat if\u201d and \u201chow\u201d toward \u201cwhen.\u201d<\/p>\n<p>\u201cWe\u2019re trying to recreate the headphone experience without headphones,\u201d says <a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/jdroppo\/\" target=\"_blank\">Jasha Droppo<\/a>, a researcher with the Speech Technology Group.<\/p>\n<p>Droppo, along with teammates <a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/ivantash\/\" target=\"_blank\">Ivan Tashev<\/a>, a software architect, and <a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/mseltzer\/\" target=\"_blank\">Michael Seltzer<\/a>, a fellow researcher, has developed a project called Personal Audio Space, which they define as a semi-private, energy-efficient system for real-time communication.<\/p>\n<p>The project uses multiple speakers to focus sound around the user. This tailored approach enables the user to hear the audio clearly, while people adjacent to the focus area experience the sound, if at all, as much quieter in volume than does the target recipient.<\/p>\n<p>\u201cIvan, Mike, and I are trying to look at ways computers can work with audio to make the computing experience better,\u201d Droppo says. \u201cWe are looking at computer control of multiple audio drivers, multiple speaker cones, to see what kind of interesting things are possible with those.<\/p>\n<p>\u201cThere\u2019s a whole area of mathematics that has been developed for microphone arrays and capturing sounds from different directions. How does that apply to sound rendering? 
What kind of interesting things can we do?\u201d<\/p>\n<div id=\"attachment_307646\" style=\"width: 360px\" class=\"wp-caption alignleft\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-307646\" class=\"size-full wp-image-307646\" src=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2016\/10\/Personal-Audio-Space.jpg\" alt=\"Jasha Droppo (left), Ivan Tashev (center), and Mike Seltzer display the latest version of their Personal Audio Space sound-targeting speaker array.\" width=\"350\" height=\"212\" srcset=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2016\/10\/Personal-Audio-Space.jpg 350w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2016\/10\/Personal-Audio-Space-300x182.jpg 300w\" sizes=\"auto, (max-width: 350px) 100vw, 350px\" \/><p id=\"caption-attachment-307646\" class=\"wp-caption-text\">Jasha Droppo (left), Ivan Tashev (center), and Mike Seltzer display the latest version of their Personal Audio Space sound-targeting speaker array.<\/p><\/div>\n<p>The team has demonstrated its proof of concept with a deceptively simple piece of hardware. The first prototype consisted of 16 two-inch speaker cones attached to the front of a 42-inch 2-by-4. Even with this array of speakers, they were able to direct sound waves effectively, amplifying some and negating others to define a sweet spot that optimizes projected audio for the listener yet diminishes it for others within hearing range. Although the second prototype, pictured at left, features a baffling box and more intelligent wiring, the principle remains the same.<\/p>\n<p>In a project FAQ, the three state: \u201cThe magic lies in independent computer control of multiple speaker drivers.\u201d Droppo elaborates:<\/p>\n<p>\u201cIf you have audio going into a single speaker, it pours into a room like water. It just goes everywhere. 
Once we have multiple speakers under computer control, we can pre-distort the audio so that it builds up in some areas of the room and cancels itself out in others. We can do computer simulations of how the sound is going to propagate from the individual speaker cones and specify that we want more sound in one region and less sound in another.<\/p>\n<p>\u201cWhat we\u2019re doing is the simplest thing that computers know how to do to an audio signal. It\u2019s called a linear filter. Basically, it shapes the frequency content of the signal in a rather straightforward way. For every frequency, we try to determine what kind of unique delay we can apply so that, as these speakers cooperate to produce a sound field, it does what we want.\u201d<\/p>\n<p>The effect, as one might expect, is liberating. While the technology is not yet ready to be included in a product, those who have been exposed to it invariably come away with smiles on their faces.<\/p>\n<p>That sort of response is rewarding for Droppo, Tashev, and Seltzer, who bring a well-rounded research portfolio to their Personal Audio Space project.<\/p>\n<p>\u201cIn the past,\u201d Droppo says, \u201cI have worked with speech enhancement and speech recognition. Once you have captured a speech signal, how can you make it sound better? How can you do better, more accurate recognition?<\/p>\n<p>\u201cThe same kinds of tools that go into processing speech sounds on a computer are also useful for doing audio rendering. It\u2019s the same basic tool set of linear algebra and convolution and frequency analysis.\u201d<\/p>\n<p>Tashev, meanwhile, is an expert on microphone arrays, which are similar to speaker arrays but designed for sound capture.<\/p>\n<p>\u201cA lot of the math that Ivan developed for his microphone-array technologies,\u201d Droppo says, \u201cis similar to what we\u2019re using for audio rendering. 
He likes to joke that he just runs the software backward.\u201d<\/p>\n<p>Seltzer, too, has a background in microphone arrays and speech recognition. Together their talents put them in an advantageous position for tackling the Personal Audio Space project.<\/p>\n<p>\u201cThere are some things about building a speaker array that are similar to building a microphone array,\u201d Droppo says. \u201cThere are also a lot of things that are quite different. Discovering what these different things were was a goal in the first phase of our project.\u201d<\/p>\n<p>Another goal was to learn if such functionality was economically feasible.<\/p>\n<p>\u201cThe other thing we were trying to do was see how cheaply we could build one of these things,\u201d Droppo adds. \u201cThere exist on the market devices that are similar at the surface, but they cost a lot of money to build and to buy. One of the questions we were trying to answer was: Do we really need to spend a lot of money building these things to get a useful result out of them?\u201d<\/p>\n<p>Apparently not. The materials used to construct their first prototype consisted of little more than 16 small commodity speaker cones, a piece of lumber, some speaker wires, and a handful of fasteners.<\/p>\n<p>\u201cThe way I like to design research projects,\u201d Droppo says, \u201cis that each phase answers at least one or two questions that we don\u2019t know [the answer to]. While we don\u2019t have anything spectacularly different yet, in the past few months, we\u2019ve been able to catch up to the state of the art, and the real exciting part for me is where we\u2019re taking this in the future.\u201d<\/p>\n<p>There are a number of usage scenarios that come to mind when considering the potential of the Personal Audio Space technology. One, for example, would enable an office worker to listen to music without disturbing those in adjacent workspaces. 
Another has broader applicability.<\/p>\n<p>\u201cOne that Ivan, in particular, is very passionate about,\u201d Droppo says, \u201cis pairing the speaker array for audio rendering with the microphone array for audio capture and a screen and a camera for video capture and rendering. Once you have that complete solution, you can have a communications terminal that will track the users and deliver audio to the intended users, capture low-noise audio from them, do face tracking, and provide a really nice communications experience where it feels like you\u2019re having a private-conversation video chat.<\/p>\n<p>\u201cThe advantage of the speaker array in that scenario is that once you aren\u2019t tethered to the computer anymore, you can wander around the room. Without the speaker array, you\u2019d want to turn up the speakers so you could hear it everywhere. But with the speaker array and the user tracking, the audio could be delivered to you wherever you are in the room.\u201d<\/p>\n<p>And then there\u2019s the babysitter scenario.<\/p>\n<p>\u201cIt\u2019s very simple,\u201d Droppo explains. \u201cIt\u2019s delivering audio to you at a higher volume than your kids are going to hear upstairs or in the next room while they\u2019re sleeping.<\/p>\n<p>\u201cWhen you tell people about that scenario, it divides them into two groups. Either they don\u2019t understand the utility, or they have children. The people who have children get it right away.<\/p>\n<p>\u201cAbout once a month,\u201d Droppo smiles, \u201cone of my kids will come downstairs just to see what\u2019s going on, and they\u2019ll complain about the TV being too loud. I don\u2019t know if it\u2019s actually too loud, because I try to be considerate, or if they\u2019re just using that as an excuse to see what\u2019s going on. 
I\u2019d love to take that excuse away from them.\u201d<\/p>\n<p>One interesting part of the Personal Audio Space project is that, unlike many research projects that revolve around scientific inquiry and conjecture, this one actually forced Droppo to build a physical object to test the project\u2019s success.<\/p>\n<p>\u201cThe hardest part was overcoming the physical aspects of the project,\u201d Droppo says, \u201cbecause I had mainly been a software person before. As part of my job at Microsoft, I haven\u2019t built anything this big before. It\u2019s actually a physical thing, with amplifiers and digitizers and boards and glue and nails and screws. That was my favorite part of the project.\u201d<\/p>\n<p>Of course, there were more abstract aspects, particularly with regard to learning about acoustic rendering and the physics of sound. But Seltzer was able to obtain a copy of a seminal text, <em>Acoustics<\/em>, by Leo L. Beranek, that helped them master such subtleties.<\/p>\n<p>As things stand, Droppo stipulates, it will be difficult to reduce the non-targeted audio to absolutely zero, a concept readily understood by anybody who has found themselves next to a person listening to high-volume music in a bus or on an airplane. But the Personal Audio Space technology can diminish the audio leakage to the point where, in a busy room, it is virtually impossible to detect.<\/p>\n<p>\u201cMost of the time,\u201d Droppo says, \u201cpeople will perceive that you have the audio much lower than you actually do.\u201d<\/p>\n<p>With that achieved, there are a number of ways this technology can be extended.<\/p>\n<p>\u201cOne,\u201d Droppo says, \u201cis developing algorithms that produce better directional sound fields. Right now, we\u2019re using a well-known technique, beam forming, that produces the effect that the audio exists in one region and not another. 
There are still interesting things to do in that space in order to produce more of a separation between where you want the audio to be and where you don\u2019t want the audio to be, to make a better separation between the two.\u201d<\/p>\n<p>Another direction to take could involve applications of the technology.<\/p>\n<p>\u201cNow that we know how to build these things and what a lot of the tradeoffs are in the design,\u201d he says, \u201cwe can start looking at different applications of the technology and seeing how what we\u2019ve been able to build can actually improve the customer\u2019s experience.\u201d<\/p>\n<p>That might involve enabling users to mold the sound to their precise preferences using an enhanced user interface. Or it might mean constructing an actual hardware component to bring the demonstrated capacity to life.<\/p>\n<p>\u201cIf you have a speaker array under computer control,\u201d Droppo suggests, \u201cit can produce many more patterns than you can reasonably choose from. Presenting these options to users and letting them make intelligent choices about what they want is a user-interface issue. Given today\u2019s technology, we can build new calibration tools, new interaction tools, so that users can intelligently design the type of audio they want for their rooms.<\/p>\n<p>\u201cThe other direction that we\u2019re looking at is speaker arrays as a component technology. What kind of end-to-end systems can we build that make sense? Can we make something the width of a monitor that actually produces sound that is pleasant to listen to?\u201d<\/p>\n<p>There\u2019s no doubt, though, that when Droppo, Tashev, and Seltzer demonstrate Personal Audio Space, the response they typically get is music to their ears.<\/p>\n<p>\u201cThe coolest part to me,\u201d Droppo says, \u201cis that when I describe it to people, they just get it. Their eyes light up, and they can think of many different ways to use the technology. 
That is the kind of feedback I really like to get.\u201d<\/p>\n","protected":false},"excerpt":{"rendered":"<p>By Rob Knies, Managing Editor, Microsoft Research Many people are accustomed to donning headphones to enjoy music at a desired volume without inflicting their tunes on others nearby. But there are tradeoffs inherent in the headphones experience. For one, you\u2019re generally physically tethered to the sound source via wires, inhibiting your movement. Second, the headphones [&hellip;]<\/p>\n","protected":false},"author":39507,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-author-ordering":[],"msr_hide_image_in_river":0,"footnotes":""},"categories":[194456,194462],"tags":[215645,215648,215651,215654,197281],"research-area":[13545],"msr-region":[],"msr-event-type":[],"msr-locale":[268875],"msr-post-option":[],"msr-impact-theme":[],"msr-promo-type":[],"msr-podcast-series":[],"class_list":["post-307643","post","type-post","status-publish","format-standard","hentry","category-natural-language-processing","category-speech-and-dialog","tag-headphones","tag-personal-audio-space","tag-sound-targeting","tag-speech-enhancement","tag-speech-recognition","msr-research-area-human-language-technologies","msr-locale-en_us"],"msr_event_details":{"start":"","end":"","location":""},"podcast_url":"","podcast_episode":"","msr_research_lab":[199565],"msr_impact_theme":[],"related-publications":[],"related-downloads":[],"related-videos":[],"related-academic-programs":[],"related-groups":[],"related-projects":[],"related-events":[],"related-researchers":[],"msr_type":"Post","byline":"","formattedDate":"April 19, 2007","formattedExcerpt":"By Rob Knies, Managing Editor, Microsoft Research Many people are accustomed to donning headphones to enjoy music at 
a desired volume without inflicting their tunes on others nearby. But there are tradeoffs inherent in the headphones experience. For one, you\u2019re generally physically tethered to the&hellip;","locale":{"slug":"en_us","name":"English","native":"","english":"English"},"_links":{"self":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/posts\/307643","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/users\/39507"}],"replies":[{"embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/comments?post=307643"}],"version-history":[{"count":2,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/posts\/307643\/revisions"}],"predecessor-version":[{"id":308684,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/posts\/307643\/revisions\/308684"}],"wp:attachment":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/media?parent=307643"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/categories?post=307643"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/tags?post=307643"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=307643"},{"taxonomy":"msr-region","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-region?post=307643"},{"taxonomy":"msr-event-type","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-event-type?post=307643"},{"taxonomy":"msr-loca
le","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=307643"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=307643"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=307643"},{"taxonomy":"msr-promo-type","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-promo-type?post=307643"},{"taxonomy":"msr-podcast-series","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-podcast-series?post=307643"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}