DENG et al.: DISTRIBUTED SPEECH PROCESSING IN MiPad’S MULTIMODAL USER INTERFACE
Hsiao-Wuen Hon (M’92–SM’00) received the B.S. degree in electrical engineering from National Taiwan University and the M.S. and Ph.D. degrees in computer science from Carnegie Mellon University, Pittsburgh, PA. He is an Architect in Speech.Net at Microsoft Corporation. Prior to his current position, he was a Senior Researcher at Microsoft Research and has been a key contributor to Microsoft’s Whisper and Whistler technologies, which are the cornerstone of the Microsoft SAPI and SDK product family. Before joining Microsoft, he worked at Apple Computer, Inc., where he was a Principal Researcher and Technology Supervisor at Apple-ISS Research Center. He led the research and development for Apple’s Chinese Dictation Kit, which received excellent reviews from many industrial publications and a handful of awards, including the Comdex Asia’96 Best Software Product medal, the Comdex Asia’96 Best of the Best medal, and the Singapore National Technology award. While at CMU, he was the co-inventor of the CMU SPHINX system, on which many commercial speech recognition systems are based, including those of Microsoft and Apple. Hsiao-Wuen is an internationally recognized speech technologist and has published more than 70 technical papers in various international journals and conferences. He authored (with X. D. Huang and A. Acero) a book titled Spoken Language Processing. He has also served as a reviewer and chair for many international conferences and journals. He holds nine U.S. patents and currently has nine pending patent applications.
Derek Jacoby received the B.S. degree in psychology from Rice University, Houston, TX, in 1995.
After working as a Consumer Products Usability Engineer at Compaq, he joined Microsoft in 1996 to work on the Systems Management Server. After four years of program management on the Windows team, he joined Microsoft Research to work on MiPad and associated projects. He is currently developing user interface approaches to adding speech recognition to the next version of Windows.
Milind Mahajan received the B.Tech. degree in computer science and engineering from the Indian Institute of Technology, Mumbai, in 1986. He received the M.S. degree in computer science from the University of Southern California, Los Angeles, in 1988.
He is currently a Researcher in the Speech Technology Group of Microsoft Research. His research interests include language modeling and machine learning.
He is an associate editor for the IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING.
Jasha Droppo (M’01) received the B.S. degree in electrical engineering (cum laude, honors) from Gonzaga University in 1994. He received the M.S. and Ph.D. degrees in electrical engineering from the University of Washington under L. Atlas in 1996 and 2000, respectively.
At the University of Washington, he helped to develop and promote a discrete theory for time-frequency representations of audio signals, with a focus on speech recognition. He joined the Speech Technology Group at Microsoft Research in 2000. His academic interests include noise robustness and feature normalization for speech recognition, compression, and time-frequency signal representations.
Ciprian Chelba received the Dipl.-Eng. degree from the Politehnica University, Bucharest, Romania, in 1993. In 2000, he received the Ph.D. degree from The Johns Hopkins University, Baltimore, MD, working in the Center for Language and Speech Processing Lab.
He joined the Speech Technology Group at Microsoft Research in 2000. His main research interests are in the areas of statistical speech and language processing, particularly in language modeling and information extraction from speech and text. More broadly, he is interested in statistical modeling.
Constantinos Boulis is pursuing the Ph.D. degree at the University of Washington, Seattle. He received the M.Sc. degree from the Computer Engineering Department of the Technical University of Crete, Greece, from which he also holds an undergraduate degree.
His academic interests include unsupervised topic detection in unconstrained speech, distributed speech recognition, speaker adaptation, and pattern recognition in general.
Ye-Yi Wang (M’99) received the B.Eng. and M.S. degrees in computer science and engineering from Shanghai Jiao Tong University in 1985 and 1987, respectively. He received the M.S. degree in computational linguistics and the Ph.D. degree in language and information technology from Carnegie Mellon University, Pittsburgh, PA, in 1992 and 1998, respectively. He is currently a Researcher with the Speech Technology Group at Microsoft Research. His research interests include spoken language understanding, language modeling, machine translation, and machine learning.
Xuedong D. Huang (M’89–SM’94–F’00) received the B.S. degree in computer sciences from Hunan University, the M.S. degree in computer sciences from Tsinghua University, and the Ph.D. degree in electrical engineering from the University of Edinburgh.
As General Manager of Microsoft .NET Speech, he is responsible for the development of Microsoft’s speech technologies, speech platform, and speech development tools. He is widely known for his pioneering work in the areas of spoken language processing. He and his team have created core technologies used in a number of Microsoft’s products, including both Office XP and Windows XP, and pioneered the industry-wide SALT initiatives. He joined Microsoft Research as a Senior Researcher to lead the formation of Microsoft’s Speech Technology Group in 1993. Prior to joining Microsoft, he was on the faculty of Carnegie Mellon’s School of Computer Sciences and directed the effort in developing CMU’s Sphinx-II speech recognition system. He is an affiliate Professor of electrical engineering at the University of Washington, and an adjunct professor of computer science at Hunan University. He has published more than 100 journal and conference papers and is a frequent keynote speaker at numerous industry conventions. He has co-authored two books: Hidden Markov Models for Speech Recognition (Edinburgh, U.K.: Edinburgh University Press,
and Spoken Language Processing (Englewood Cliffs, NJ: Prentice-Hall,
Dr. Huang received the National Education Commission of China’s 1987 Science and Technology Progress Award, the IEEE Signal Processing Society’s 1992 Paper Award, and the Allen Newell Research Excellence Medal.