Biography

My background is in theoretical computer science. However, after completing my Ph.D. in abstract automata and formal languages, I shifted to the area of data analysis and classification, which for quite a while was considered by computer scientists to be part of statistics and by statisticians not to belong to the sciences at all. Things changed with the advent of modern computer systems capable of processing massive data sets. Nowadays, my area of research is part of computer science under the title of data mining and knowledge discovery.

My earlier work, on revealing order and cluster structures in qualitative data, is reflected in my monographs: Group Choice (in Russian 1974, English translation 1979, Wiley Interscience), Graphs and Genes (with S.N. Rodin, in Russian 1977, English translation 1984, Springer), and Analysis of Qualitative Attributes and Structures (in Russian 1976, 1980).

My later work focuses on cluster analysis considered as data-driven classification, which is partly described in the monograph Mathematical Classification and Clustering (1996), written while at DIMACS, Rutgers University, USA. I maintain that two problems, revealing clusters in data and describing clusters/groups, are the core of data-driven classification. Traditionally, only the former is considered clustering. To deal with these problems, I assume that it must be possible to use the cluster structure found in data to approximately reconstruct the original data, and that the quality of the cluster structure should be evaluated according to the quality of the reconstruction. This idea leads to a class of methods and algorithms that have proven successful in theory as well as in applications such as biomolecular analysis, industrial organizations, large-scale surveys, etc. In my book Clustering: A Data Recovery Approach (Chapman & Hall/CRC 2005; 2nd edition 2012), I show how this view, referred to as the "data recovery approach", can be applied to the two most popular methods, K-Means and Ward clustering, leading to a consistent theory of data analysis that provides a wealth of mutually compatible methods and interpretation aids.
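The reconstruction idea can be illustrated with a minimal sketch. It is an illustration under stated assumptions, not code from the book: the function name `recovery_summary`, the toy data, and the two partitions are all hypothetical. Each row of the centered data is replaced by the mean of its cluster, and the partition is judged by how much of the data scatter that replacement recovers. When cluster means are used as the reconstruction, the total scatter splits exactly into a recovered part and a residual part.

```python
import numpy as np

def recovery_summary(data, labels):
    """Return (total, explained, residual) scatter for a given partition.

    Each row is 'reconstructed' as the mean of its cluster; the data are
    centered at the grand mean first (an assumption of this sketch).
    """
    data = np.asarray(data, dtype=float)
    labels = np.asarray(labels)
    centered = data - data.mean(axis=0)

    reconstructed = np.empty_like(centered)
    for k in np.unique(labels):
        members = labels == k
        # Every member of cluster k is reconstructed by the cluster mean.
        reconstructed[members] = centered[members].mean(axis=0)

    total = np.sum(centered ** 2)                       # data scatter
    explained = np.sum(reconstructed ** 2)              # part recovered by clusters
    residual = np.sum((centered - reconstructed) ** 2)  # reconstruction error
    return total, explained, residual

# Hypothetical toy data: two visibly separated groups of three points.
data = np.array([[1.0, 2.0], [1.2, 1.8], [0.8, 2.2],
                 [5.0, 6.0], [5.2, 5.8], [4.8, 6.2]])
good_labels = np.array([0, 0, 0, 1, 1, 1])  # follows the visible groups
poor_labels = np.array([0, 1, 0, 1, 0, 1])  # mixes the groups

for name, labels in [("good", good_labels), ("poor", poor_labels)]:
    total, explained, residual = recovery_summary(data, labels)
    print(f"{name}: recovered share of scatter = {explained / total:.3f}")
```

The partition that follows the visible groups recovers almost all of the scatter, while the mixed partition recovers little of it, which is the sense in which the quality of a cluster structure is read off the quality of the reconstruction.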

I spent some time travelling and working with colleagues in France (1991-1993), the USA (1993-1998), and Germany (1996-1999); this gave me a unique opportunity to update my knowledge of modern developments and enhance my understanding of data-driven classification problems. After retiring from Birkbeck in 2010, where I remain Professor Emeritus, I returned to Moscow, where I teach courses in core data analysis (Faculty of Computer Science, National Research University Higher School of Economics, Moscow, Russia), as described in my textbook Core Concepts in Data Analysis: Summarization, Correlation and Visualization (Springer 2011).