IJCAI 2013 Tutorial

Large-scale Non-linear Classification: Algorithms and Evaluations

 

 

Abstract: Most of the research in AI has been directed to the problem of data classification, in which an algorithm learns linear or non-linear models from data. Non-linear data classification is particularly important because complex non-linear concepts often occur in nature. An accurate and scalable algorithm that can learn non-linear models plays a key role in various data mining, NLP, computer vision, and information retrieval problems. In an environment where new large-scale problems are emerging in various disciplines and pervasive computing applications are becoming common, there is a real need for classification algorithms that can process increasing amounts of data efficiently. Recent advances in large-scale learning have resulted in many popular algorithms for linear classification on large data. However, technologies for large-scale non-linear classification are still underdeveloped, and the best practices are less well known. To fill this gap, we present a survey of state-of-the-art algorithms and software packages, together with our evaluation on real-life data sets. We discuss algorithms covering different aspects of this area in detail. We also present a comprehensive experimental evaluation of these algorithms and off-the-shelf software on a collection of large real-life data sets across various applications.

Speaker Biography: Dr. Zhuang (John) Wang is a member of IBM Global Business Services, where he is dedicated to bridging science and business by developing big data analytics solutions for business innovation. Prior to IBM, he was a research scientist with Siemens Corporate Research, where he led and worked on a wide variety of projects building predictive maintenance, anomaly detection, and decision support systems for servicing several fleets of industrial and medical equipment that generate huge amounts of sensor/log data. Dr. Wang's research interests are in supervised learning algorithms, in particular Support Vector Machines and Neural Networks, as well as in large-scale, online, and multi-instance learning and their applications. He is the author or coauthor of about 15 papers published at JMLR, MLJ, ICML, KDD, AISTATS, and other venues.

Download [pdf]

Tutorial Outline:

- Large-scale linear classification

- Large-scale non-linear classification

- Parallelism