The Future of Search and Discovery in Big Data Analytics: Ultrametric Information Spaces
- Speaker: Professor Fionn Murtagh, Department of Computer Science, Royal Holloway, University of London.
- Date: Wednesday, 21 November 2012 from 16:30 to 17:30
- Location: Room 160, Birkbeck Main Building
In considering hierarchical clustering for structuring and orienting search and retrieval, I will describe a new linear-time hierarchical clustering method in which the hierarchy is induced from the Baire distance. The method is applied to astronomy data, using Sloan Digital Sky Survey (SDSS) spectroscopic data, and to a large database of chemical compounds. This work is motivated, firstly, by the knowledge that as spatial dimensionality becomes very large, so too do spatial sparsity and ultrametricity. The latter expresses the property of hierarchically embedded clusters. The second motivation is to benefit computationally from these findings in the tasks of search, discovery, and data understanding in massive and possibly very high-dimensional data. A major current application field is text analysis.
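For readers unfamiliar with the Baire distance, the following is a minimal illustrative sketch and not the speaker's implementation. It assumes each value is given as a fixed-length string of digits, takes the distance between two values to be 2 raised to minus the length of their longest common prefix (the base 2 is an assumption), and builds a hierarchy by bucketing values on successively longer prefixes. The function names `common_prefix_len`, `baire_distance`, and `prefix_hierarchy` are hypothetical.

```python
from collections import defaultdict


def common_prefix_len(a: str, b: str) -> int:
    """Number of leading characters on which a and b agree."""
    n = 0
    for ca, cb in zip(a, b):
        if ca != cb:
            break
        n += 1
    return n


def baire_distance(a: str, b: str, base: float = 2.0) -> float:
    """Baire-style distance: base ** -(longest common prefix length).

    Assumption: identical strings are at distance 0, and strings that
    differ in their first digit are at the maximum distance of 1.
    """
    if a == b:
        return 0.0
    return base ** -common_prefix_len(a, b)


def prefix_hierarchy(digit_strings, depth):
    """Bucket values by their first k digits, for k = 1..depth.

    Each level needs a single pass over the data, so the cost is
    proportional to len(digit_strings) * depth, with no pairwise
    distance computations.
    """
    levels = []
    for k in range(1, depth + 1):
        buckets = defaultdict(list)
        for s in digit_strings:
            buckets[s[:k]].append(s)
        levels.append(dict(buckets))
    return levels


if __name__ == "__main__":
    values = ["4254", "4271", "4310"]  # made-up digit strings
    print(baire_distance("4254", "4271"))
    for level in prefix_hierarchy(values, depth=2):
        print(level)
```

Because clusters at level k + 1 are always subsets of clusters at level k, the buckets are hierarchically embedded, and the distance satisfies the ultrametric (strong triangle) inequality d(x, z) <= max(d(x, y), d(y, z)).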