Tin Kam Ho leads the Statistics, Learning, and Computing Research Group at Bell Laboratories. Her interests embrace pattern recognition, data mining, and the modeling and simulation of complex systems. She received a Ph.D. in Computer Science from SUNY at Buffalo in 1992. She is Editor-in-Chief of Pattern Recognition Letters, the official journal of the IAPR, and has been an associate editor of several other journals, including the IEEE Transactions on PAMI. She has received the ICDAR Young Scientist Award in Document Analysis and Recognition, the Bell Labs President’s Gold Award, and the Pierre Devijver Award in Statistical Pattern Recognition. She is a Fellow of the IAPR and the IEEE, and holds 7 U.S. patents in classifier design, image analysis, and wireless tracking. For further information, visit Ho’s web page.
Tin Kam Ho will give the following two introductory talks on Tuesday, May 13:
- 10:30h: Learning with Random Guesses – Principles of Stochastic Discrimination and Ensemble Learning.
Learning in everyday life is often accomplished by making many random guesses and synthesizing the feedback. Kleinberg’s analysis of this process resulted in a new method for classifier design: stochastic discrimination (SD). The method constructs an accurate classifier by combining a large number of very weak discriminators that are generated essentially at random. SD is thus an ensemble learning method in an extreme form. Studies of other ensemble learning and decision fusion methods have long suffered from the difficulty of properly modeling the complementary strengths of the components; the SD theory addresses this rigorously via the mathematical concepts of enrichment, uniformity, and projectability. Ho will explain these concepts with a very simple numerical example that captures the basic principles of the SD theory and method, and will discuss how they led to her development of the classifier known as “random decision forests”.
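To give a feel for the idea before the talk, here is a toy sketch (not Kleinberg’s actual construction) of the “random guesses plus enrichment” principle: generate random half-plane rules, keep only those that beat chance on the training data (a crude form of enrichment), and let the ensemble average their votes. All names and parameters below are illustrative choices, not from the talk.

```python
import random

random.seed(0)

# Toy data: two Gaussian classes in the plane, labeled 0 and 1.
def sample(n, cx, cy, label):
    return [((random.gauss(cx, 1.0), random.gauss(cy, 1.0)), label)
            for _ in range(n)]

train = sample(200, -1.5, 0.0, 0) + sample(200, 1.5, 0.0, 1)
test_set = sample(200, -1.5, 0.0, 0) + sample(200, 1.5, 0.0, 1)

def random_stump():
    # A very weak discriminator: a random half-plane votes for class 1.
    a, b = random.uniform(-1, 1), random.uniform(-1, 1)
    t = random.uniform(-2, 2)
    return lambda p: 1 if a * p[0] + b * p[1] > t else 0

def accuracy(clf, data):
    return sum(clf(x) == y for x, y in data) / len(data)

# Enrichment: keep only random guesses that beat chance on the training set.
stumps = []
while len(stumps) < 500:
    s = random_stump()
    if accuracy(s, train) > 0.5:
        stumps.append(s)

# The ensemble averages the weak votes and thresholds at 1/2.
def ensemble(p):
    return 1 if sum(s(p) for s in stumps) / len(stumps) > 0.5 else 0

print(accuracy(ensemble, test_set))
```

Individually each kept stump is barely better than a coin flip, yet the averaged vote recovers most of the separability of the two classes, which is the essence of combining many weak, nearly random discriminators.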
- 15:00h: On limits of automatic pattern learning.
Decades of research in automatic pattern recognition have resulted in many learning algorithms, yet for a new learning task it is still difficult to know which method will work best. Often it is unclear whether the limit in classification accuracy is due to a deficiency in the methods or is intrinsic to the task with the given data. Ho will describe some measures that characterize the intrinsic complexity of a classification problem and its relationship to classifier performance. The measures reveal that a collection of real-world problems spans an interesting continuum, from those that are easily learnable to those for which no learning is possible. She will discuss results on identifying the domains of dominant competence of several popular classifiers in this measurement space, and will describe an exploratory data-visualization tool that helps in understanding the data geometry, along with some real-world applications in science and engineering.
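One of the simplest complexity measures of this kind is Fisher’s discriminant ratio, which scores how separable two classes are along a single feature. The sketch below is only an illustration of the general idea; the function and data names are ours, not from the talk.

```python
import statistics

# Fisher's discriminant ratio for one feature: the squared distance between
# the class means, divided by the sum of the class variances. Higher values
# indicate an easier (more separable) problem along that feature.
def fisher_ratio(xs0, xs1):
    m0, m1 = statistics.mean(xs0), statistics.mean(xs1)
    v0, v1 = statistics.pvariance(xs0), statistics.pvariance(xs1)
    return (m0 - m1) ** 2 / (v0 + v1)

# Well-separated classes score high; heavily overlapping classes score low.
easy0, easy1 = [0.0, 0.1, 0.2], [5.0, 5.1, 5.2]
hard0, hard1 = [0.0, 1.0, 2.0], [0.5, 1.5, 2.5]
print(fisher_ratio(easy0, easy1))
print(fisher_ratio(hard0, hard1))
```

Measures like this put every dataset somewhere on the continuum the talk describes, from problems any classifier handles to problems where no classifier can do better than chance.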
Hope to see you at these talks!