Visual Analytics for High Dimensional Data
Prof. Alfred Inselberg, Tel Aviv University, Senior Fellow San Diego Supercomputing Center
A dataset with M items has 2^M subsets, any one of which may be the one satisfying our objective. With a good data display and interactivity, our fantastic pattern-recognition ability defeats this combinatorial explosion by extracting insights from the visual patterns. This is the core reason for data visualization. With parallel coordinates, the search for relations in multivariate data is transformed into a 2-D pattern recognition problem. Together with criteria for good query design, we illustrate this on several real datasets (financial, process control, credit-score, and one with hundreds of variables) with stunning results. A geometric classification algorithm yields the classification rule explicitly and visually. The minimal set of variables (features) is found and ordered by predictive value. A model of a country's economy reveals sensitivities, the impact of constraints, trade-offs, and economic sectors unknowingly competing for the same resources. An overview of the methodology provides foundational understanding, including learning the patterns that correspond to various multivariate relations. These patterns are robust in the presence of errors, which is good news for applications. A topology of proximity emerges, opening the way for visualization in Big Data.
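As a rough illustration of how parallel coordinates turn a multivariate relation into a 2-D pattern, the sketch below maps points to polylines on equally spaced vertical axes and checks the point-line duality: points on a line x2 = m*x1 + b (m != 1) become segments that all pass through one point. The function names and the axis spacing are assumptions made for this example, not part of the tutorial material.

```python
def subset_count(m: int) -> int:
    """Number of subsets of a dataset with m items (2^M)."""
    return 2 ** m


def to_polyline(point, spacing=1.0):
    """Map an N-dimensional point to polyline vertices in the plane.

    Axis i is the vertical line at horizontal position i * spacing
    (equal spacing is an assumption of this sketch).
    """
    return [(i * spacing, value) for i, value in enumerate(point)]


def segment_y_at(p, q, x):
    """Height at horizontal position x of the line through p and q."""
    (x0, y0), (x1, y1) = p, q
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def dual_point(m, b, spacing=1.0):
    """Point shared by the polylines of all points on x2 = m*x1 + b.

    Only defined for m != 1; for m == 1 the segments are parallel.
    """
    if m == 1:
        raise ValueError("segments are parallel when m == 1")
    return (spacing / (1 - m), b / (1 - m))


# Three points on the line x2 = -x1 + 4.
m, b = -1.0, 4.0
points = [(x1, m * x1 + b) for x1 in (0.0, 1.0, 3.0)]
px, py = dual_point(m, b)
for pt in points:
    first, second = to_polyline(pt)
    # Every segment reaches the same height at the dual point.
    print(pt, segment_y_at(first, second, px), py)
```

A negative slope places the shared point between the two axes, which is why negatively correlated variables show up as a visible crossing pattern.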
The tutorial will include an interactive data exploration session, to which participants can bring their own multidimensional datasets. Datasets should be in Excel format with all-numeric values (at most 4 significant digits; values of categorical data can be listed as integers) and at most 5 MB in size. Five files will be explored in real time during the third hour of the tutorial to demonstrate and teach the basic skills (if more than five files are submitted, five will be chosen by lottery).
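For participants preparing a file, the following minimal sketch shows one way to meet the numeric requirements: rounding values to 4 significant digits and listing categorical values as integers. The helper names are hypothetical, the integer codes are arbitrary (assigned in order of first appearance, starting at 0), and writing the result to Excel is left to whatever tool is at hand.

```python
from math import floor, log10


def round_sig(x: float, digits: int = 4) -> float:
    """Round x to the given number of significant digits."""
    if x == 0:
        return 0.0
    # Position of the leading digit decides how many decimals to keep.
    return round(x, digits - 1 - floor(log10(abs(x))))


def encode_categories(values):
    """Replace categorical labels with integer codes.

    Returns the list of codes and the label-to-code mapping, so the
    original labels can be recovered after the session.
    """
    codes = {}
    encoded = [codes.setdefault(v, len(codes)) for v in values]
    return encoded, codes


print(round_sig(123456.789))   # a large value keeps 4 leading digits
print(round_sig(0.00123456))   # a small value keeps 4 leading digits
print(encode_categories(["low", "high", "low", "mid"]))
```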
Alfred Inselberg received a Ph.D. in Mathematics and Physics from the University of Illinois (UIUC). He was at the Biological Computer Lab (BCL), doing research on Brain Function, Cognition and Neural Networks (together with the McCulloch Lab at MIT). He stayed on as Professor and subsequently was a researcher at IBM (at the Los Angeles Scientific Center and later Yorktown Labs). There he developed a Mathematical Model of the Ear (TIME, Newsweek 1974, etc.) while concurrently teaching at UCLA and USC. He has also held academic positions at the Technion, Ben Gurion University and, since 1995, Tel Aviv University. Inselberg was elected Senior Fellow in Visualization at the San Diego Supercomputing Center (1996) and Distinguished Visiting Professor at Korea University (2008) and the National University of Singapore (2011). He invented the multidimensional visualization methodology of Parallel Coordinates, which became widely applied (Air Traffic Control, Data Mining, etc.), and received numerous patents and awards. His textbook was published by Springer and praised by Stephen Hawking, among others.