ARA V. NEFIAN, PhD
http://www.anefian.com/ara_nefian.htm anefian@yahoo.com (408) 930 0395
OBJECTIVE: Challenging research and development position in computer vision, 3D reconstruction, Bayesian
networks, robotics, image, video, speech and/or text processing.
EDUCATION
1996-1999 Ph.D. in Electrical Engineering, Georgia Institute of Technology.
Thesis: “Face detection and recognition using hidden Markov models”.
1994-1996 MS in Electrical Engineering, Georgia Institute of Technology.
1988-1993 Electrical Engineer Diploma, Politehnica University of Bucharest.
Major: Signal Processing and Radio Communications (ranked 1st
in class).
SKILLS: Researcher with over 10 years of experience, 40 peer-reviewed papers and 20 patents filed (10 issued)
in the fields of computer vision, machine learning, Bayesian networks, image, text, audio and video
processing, and sensor fusion. Experienced technical leader of research teams with strong analytical,
communication and programming skills (C/C++, Matlab, OpenCV, OpenMP). Fluent in English, French and
Romanian.
PROFESSIONAL EXPERIENCE
Senior System Scientist, Carnegie Mellon University, 2008-present.
Selected projects
1. 3D terrain reconstruction (project lead). Imagery captured during the Apollo missions
and stored in digital format is used to generate the highest resolution and most accurate
maps covering 16% of the Moon's surface. Results are available in Google Earth. Current
research includes albedo reconstruction and generation of accurate 3D topography using
shape-from-shading techniques.
2. Mars Science Lab. Built the 3D surface reconstruction and scene segmentation for the next
NASA robotic mission to Mars.
3. Pipeline threat detection system. Built a complete hardware and software system for detection
of heavy digging machinery using images captured by pipeline patrol airplanes.
Senior Imaging Engineer, Nokia Point&Find, 2008.
Project lead on real-time car logo recognition from mobile platforms.
Senior Researcher, Ooyala, 2007-2008.
Computer vision team lead responsible for tracking and segmentation of large video sequences.
Senior Researcher, Intel Corporation, Microprocessor Research Labs, 2000-2007.
• Research, development and technical lead in various areas of machine learning, computer vision, sensor
fusion, image and video mining. Analyzed the impact of emerging multimedia applications on future
computational platforms.
• Drove several collaborations with the research community through university grants, interns, joint projects,
external publications and organizational involvement.
Selected projects
• Web image clustering using text and visual features. Web image search results are clustered
into several semantic categories using a novel unified Bayesian framework for image segmentation and
grouping. The multimodal clustering system exceeds the accuracy of any single modality and is
implemented on a 32-way parallel machine.
• Detection of drivable corridors for off-road autonomous navigation. Video sequences are modeled using a
novel two-dimensional Bayesian network that incorporates a set of spatial and temporal constraints specific to
desert images. The project is part of the vision component used by the Stanford racing team, the winner of the
2005 DARPA Grand Challenge.
• Face recognition using embedded Bayesian networks. Built a novel statistical face recognition system and
led a team effort to implement it in real time and integrate it with the OpenCV library. The novel two-dimensional
Bayesian models set the state of the art in face recognition on a widely used medium-sized database.
• Audio-video continuous speech recognition (AVCSR). Led a research team and conducted research in
multimodal speech recognition using a coupled HMM and a set of robust visual features. The coupled HMM
outperformed all existing audio-visual models for speech recognition and reduced the error rate of
acoustic speech recognition by 55% in noisy environments. The AVCSR package, released as open source, is
used by several universities and research labs in the US, Europe and China.
• Audio-visual speaker identification. Led a research team and conducted research in multimodal statistical
speaker identification using face, lip movements and speech features. The multimodal system exceeds the
accuracy of any single modality and reduces by a large margin the error rate of acoustic-only speaker
recognition in noisy environments.
• Audio-visual source separation. Research in Bayesian source separation in noisy environments using both
acoustic and visual features extracted from the user’s face. The novel approach outperforms existing
acoustic-only speech source separation systems in noisy environments and opens a new perspective for
solving the “cocktail party” problem.
• Static and dynamic hand gesture recognition. Research in statistical upper body modeling, segmentation,
tracking and hand gesture recognition from dense disparity maps and color video sequences.
• Distributed structure learning for embedded Bayesian networks with application in bioinformatics.
Developed a parallel structure learning algorithm for single nucleotide polymorphism (SNP)
characterization.
• Runtime compiler optimization and data pre-fetch using dynamic Bayesian networks.
Research Assistant, Georgia Institute of Technology, 1996-1999
Face detection and recognition using hidden Markov models.
Teaching Assistant, Georgia Tech Lorraine, 1998
Lectured graduate level “Random processes”.
Software Engineer, NCR, Human Interface Technology Center, 1995-1998
Real-time face detection in video sequences, customer activity analysis and people tracking.
Research Assistant, Georgia Tech Lorraine, 1994-1995
Fractal image compression, motion estimation.
Telecommunication Engineer, Electrical Research Institute (Romania), 1993-1994
Teaching Assistant, Politehnica University of Bucharest, 1993-1994.
Lectured graduate level “Digital Signal Processing” laboratory.
AWARDS
DARPA Grand Challenge 2005 Award: winner of the autonomous navigation race
with the Stanford racing team (Stanley).
Intel Microprocessor Research Labs Award: 2001, 2003, 2007.
PATENTS
20+ US and international patents filed (10 issued) in face recognition and detection, audio-visual signal
processing, machine learning and Bayesian networks.
Selected patents
0* “A coupled hidden Markov model for continuous audio-visual speech recognition” 7,165,029 - 2007.
1* "Image recognition using hidden Markov models and coupled hidden Markov models" 7,171,043 - 2007.
2* "System and method for real time lip synchronization" 7,133,560 - 2007.
3* “Gesture detection from digital video images” 7,224,830 - 2007.
4* “Embedded Bayesian network for pattern recognition” 7,203,368 - 2007.
5* “Embedded multi-layer coupled hidden Markov model” 7,089,185 - 2006.
6* "System and method for detecting a human face in uncontrolled environments" 6,184,926 - 2001.
7* “Techniques for separating and evaluating audio and video source data” 2004.
8* “Identifying a speaker using Markov models” 2003.
PUBLICATIONS
Journal Papers
9* S. Thrun, M. Montemerlo, H. Dahlkamp, D. Stavens, A. Aron, J. Diebel, P. Fong, J. Gale, M. Halpenny, G.
Hoffmann, K. Lau, C. Oakley, M. Palatucci, V. Pratt, P. Stang, S. Strohband, C. Dupont, L.-E. Jendrossek, C.
Koelen, C. Markey, C. Rummel, J. van Niekerk, E. Jensen, P. Alessandrini, G. Bradski, B. Davies, S. Ettinger,
A. Kaehler, A. Nefian, and P. Mahoney. “Winning the DARPA Grand Challenge.” Journal of Field Robotics,
2006.
10* Ling Chen, Hong Man and Ara V. Nefian, “Face recognition based on multi-class mapping of Fisher scores”,
Pattern Recognition, Special issue on Image Understanding for Digital photographs, 2005.
11* Ara V. Nefian, Luhong Liang, Xiaoxing Liu, Xiaobo Pi and Kevin Murphy, “Dynamic Bayesian networks for
audio-visual speech recognition”, EURASIP Journal on Applied Signal Processing, 2002.
12* Silviu Ciochina and Ara V. Nefian, “Fast algorithms for the discrete Hartley transform, part II”,
Telecommunication Journal, 1996 (Romanian).
13* Silviu Ciochina, Ara V. Nefian, “Fast algorithms for the discrete Hartley transform, part I: radix two
algorithms”, Telecommunication Journal, 1995 (Romanian).
Conference papers
3D Reconstruction from Planetary Images
14* Ara V. Nefian, Kyle Husmann, Michael J. Broxton, Vinh To, Michael Lundy and Matthew Hancher "A
Bayesian Formulation for Sub-pixel Refinement in Stereo Orbital Imagery", IEEE International Conference
on Image Processing 2009
15* M. J. Broxton, Z. M. Moratto, A. Nefian, M. Bunte, M. S. Robinson, "Preliminary Stereo Reconstruction
from Apollo 15 Metric Camera Imagery", 40th Lunar and Planetary Science Conference 2009.
16* M. Weiss-Malik; T. Scharff; A. Nefian; Z. Moratto; E. Kolb; M. Lundy; M. Hancher; N. Gorelick; M.
Broxton; R. A. Beyer, "Visualizing Moon Data and Imagery with Google Earth", American Geophysical
Union 2009.
17* R. A. Beyer; M. Broxton; N. Gorelick; M. Hancher; M. Lundy; E. Kolb; Z. Moratto; A. Nefian; T.
Scharff; M. Weiss-Malik, "Visualizing Mars data and imagery with Google Earth", American Geophysical
Union 2009.
18* Michael J. Broxton, Ara V. Nefian, Zachary Moratto, Taemin Kim, Michael Lundy, and Aleksandr V.
Segal, "3D Lunar Terrain Reconstruction from Apollo Images", International Symposium on Visual
Computing 2009.
19* Taemin Kim, Ara V. Nefian, and Michael J. Broxton, "Photometric Recovery of Ortho-images
Derived from Apollo 15 Metric Camera Imagery", International Symposium on Visual Computing
2009.
Web Image Clustering and Video Mining
20* Ara V. Nefian, Maha El Choubassi, Jean-Yves Bouguet, Igor Kozintsev and Yi Wu, “Web image
clustering”, accepted to IEEE International Conference on Acoustics, Speech and Signal Processing 2007.
21* Haoran Yi, Igor Kozintsev, Marzia Polito, Yi Wu, Jean-Yves Bouguet, Ara V. Nefian and Carole Dulong,
“A spatiotemporal decomposition strategy for personal home video management”, SPIE Electronic Imaging
2007.
22* Yi Wu, Jean-Yves Bouguet, Ara V. Nefian and Igor Kozintsev, “Learning concept templates from web
images to query personal image databases”, submitted to IEEE International Conference on Multimedia and
Expo, 2007.
Emerging applications
23* Ara V. Nefian, Gary Bradski, “Detection of drivable corridors for off road autonomous navigation”, IEEE
International Conference on Image Processing 2006.
24* Ara V. Nefian, “Learning in embedded Bayesian networks with distributed data”, IEEE Computational
Systems Bioinformatics Conference 2006.
Audio-Visual Speech Processing
25* Luhong Liang, Yu Luo, Feiyue Huang and Ara V. Nefian, “A multi-stream audio-video large vocabulary
Mandarin Chinese speech database”, IEEE International Conference on Multimedia and Expo, 2004.
26* Shyamsundar Rajaram, Ara V. Nefian and Thomas Huang, “Bayesian separation of audio-visual speech
sources”, IEEE International Conference on Acoustics, Speech and Signal Processing 2004.
27* Ara V. Nefian and Luhong Liang, “Bayesian networks in multimodal speech recognition and speaker
identification”, ASILOMAR 2003.
28* Tieyan Fu, Xiaoxing Liu, Luhong Liang, Xiaobo Pi and Ara V. Nefian, “Audio-visual speaker
identification using coupled hidden Markov models”, IEEE International Conference on Image Processing
2003.
29* Ara V. Nefian, Luhong Liang, Tieyan Fu and Xiaoxing Liu, “A Bayesian approach to audio visual speaker
identification”, IEEE International Conference on Audio-and Video- based Biometric Person Authentication,
June 2003.
30* Xiaoxing Liu, Yibao Zhao, Xiaobo Pi, Luhong Liang and Ara V. Nefian, “Audio-visual continuous speech
recognition using a coupled hidden Markov model”, IEEE International Conference on Spoken Language
Processing, September 2002.
31* Luhong Liang, Xiaoxing Liu, Yibao Zhao, Xiaobo Pi and Ara V. Nefian, “Speaker independent audio-
visual continuous speech recognition”, IEEE International Conference on Multimedia and Expo, August
2002.
32* Ara V. Nefian, Luhong Liang, Xiaobo Pi, Xiaoxing Liu, Crusoe Mao and Kevin Murphy, “A coupled HMM
for audio-visual speech recognition”, IEEE International Conference on Acoustics, Speech and Signal
Processing 2002.
Face Recognition
33* Navin Goel, George Bebis and Ara V. Nefian, “Face recognition experiments with random projections”,
SPIE Conference on Biometric Technology for Human Identification 2005.
34* Ara V. Nefian, “Embedded Bayesian Networks for face recognition”, IEEE International Conference on
Multimedia and Expo, August 2002.
35* Ara V. Nefian and Monson H. Hayes, “Maximum likelihood training of the embedded HMM for face
detection and recognition”, IEEE International Conference on Image Processing 2000.
36* Ara V. Nefian and Monson H. Hayes III, “An embedded HMM based approach for face detection and
recognition”, IEEE International Conference on Acoustics, Speech and Signal Processing 1999.
37* Ara V. Nefian and Monson H. Hayes III, “Face recognition using an embedded HMM”, IEEE Conference
on Audio and Visual-based Person Authentication 1999.
38* Ara V. Nefian and Monson H. Hayes III, “Face detection and recognition using Hidden Markov Models”,
IEEE International Conference on Image Processing 1998.
39* Ara V. Nefian and Monson H. Hayes III, “Hidden Markov Models for face recognition”, IEEE
International Conference on Acoustics, Speech and Signal Processing 1998.
40* Ara V. Nefian, Mehdi Khosravi and Monson H. Hayes III, “Real-time human face detection from
uncontrolled environments”, SPIE Visual Communications and Image Processing 1997.
Body Tracking and Gesture Recognition
41* Ara V. Nefian, Radek Grzeszczuk and Victor Eruhimov, “A statistical upper body model for 3D static and
dynamic gesture recognition from stereo sequences”, IEEE International Conference on Image Processing
2001.
42* Robert D. Cavin, Navin Goel and Ara V. Nefian, “A Bayesian formulation for 3D articulated upper body
segmentation and tracking from dense disparity maps”, IEEE International Conference on Image Processing
2003.
SERVICE
• The Second Workshop on Entertainment systems ENSYS 2007, co-chair.
• International Symposium on Visual Computing, 2006, computer vision co-chair.
• International Symposium on Visual Computing, 2005, board member. Chair of the special track on Intelligent
Vehicles and Autonomous Navigation, organizing committee.
• International Conference on Entertainment Systems and Applications, 2005, board member.
• IEEE Workshop on Combined Research-Curriculum Development in Computer Vision, December 2001,
program committee member.
• MPEG 7, face recognition committee member, 2001-2003.
• NSF Combined Research-Curriculum Development in Computer Vision program, University of Nevada at
Reno 2000-2002, advisory board member.
• IEEE Transactions on Multimedia, Image Processing, Signal Processing and Pattern Analysis and Machine
Intelligence, reviewer.
PRESS
• “Computers learn to read lips”, New York Times, September 2003.
• “Digital lip reader”, MIT Technology Review, September 2003.
• “Steady pace takes DARPA autonomous driving race”, EE Times, October 2005.
ACTIVITIES
Biking, long distance running, tennis, rugby, climbing, hiking, skiing, sailing.
References available upon request