Title: Vision with A Billion Eyes
Speaker: Prof. Jiebo Luo
Speaker Bio: Jiebo Luo joined the University of Rochester in Fall 2011 after over fifteen prolific years at Kodak Research Laboratories, where he was a Senior Principal Scientist leading research and advanced development. He has been involved in numerous technical conferences, including serving as program co-chair of ACM Multimedia 2010 and IEEE CVPR 2012. He is the Editor-in-Chief of the Journal of Multimedia, and has served on the editorial boards of the IEEE Transactions on Pattern Analysis and Machine Intelligence, IEEE Transactions on Multimedia, IEEE Transactions on Circuits and Systems for Video Technology, Pattern Recognition, Machine Vision and Applications, and the Journal of Electronic Imaging. Dr. Luo is a Fellow of the SPIE, IEEE, and IAPR. His research spans image processing, computer vision, machine learning, data mining, medical imaging, and ubiquitous computing. He has been an advocate for contextual inference in the semantic understanding of visual data, and continues to push the frontiers of this area by incorporating geo-location and social context. A recent research thrust focuses on exploiting social media for machine learning, data mining, and human-computer interaction, for example, mining the wisdom of crowds for social, political, and economic prediction and forecasting. He has published extensively in these fields, with over 270 papers and 90 US patents.
Abstract: A recent trend in computer vision is driven by the images and video generated by heterogeneous, multi-perspective visual sensing networks. We present several examples of research along these lines. First, we will present a framework for event recognition: using GPS information, we obtain satellite images corresponding to picture locations and investigate their novel use in recognizing the picture-taking environment. We then combine this inference with classical vision-based event detection methods and demonstrate the synergistic fusion of the two approaches.
Second, to determine the viewing direction of geotagged photos, we utilize both Google StreetView and Google Earth satellite images. Third, we explore using phone-captured images for localization, as they contain more contextual information than the embedded GPS coordinates alone. We then build applications that enable people to enjoy ubiquitous location-based services (LBS) on their phones. Fourth, we leverage crowd-sourced photos to remove unwanted bystanders from tourist photos taken at popular attractions, and to measure air pollution in major cities in China. Finally, given a new source of visual data from public webcams deployed in urban environments, we will present ongoing work on crowd analytics using such data.