Learning from multi-modal data: integration, fusion, and data translation
- Speaker: Professor Samuel Kaski, Helsinki Institute for Information Technology HIIT, Aalto University and University of Helsinki
- Date: Saturday, 27 July 2013 from 16:30 to 17:30
- Location: Room 160, Birkbeck Main Building
In data analysis tasks, across fields from genomics to multimodal interfaces, one of the most needed operations is data integration or data fusion. To make sense of the data, one has to combine several very high-dimensional data sources that give different but complementary information. In a case study in genomics, the sources include gene expression in different diseases and under different treatments, metabolite concentrations, DNA copy number variation, etc. Given the large number of data sources with mostly unknown connections, it may be more appropriate to talk about data translation than integration, with the goal being to find, characterize, and utilize the unknown connections between data sources. In machine learning this task has been called unsupervised multi-view machine learning. For it we have introduced Bayesian canonical correlation analysis-based methods and, recently, Group Factor Analysis (GFA), which generalizes factor analysis from analysing relationships between univariate variables to analysing relationships between multiple data sources, each consisting of multivariate observations. I will discuss the methods and present case studies in metabolomics and in analysing genome-wide effects of drugs.
Brief Bio: Samuel Kaski is director of Helsinki Institute for Information Technology HIIT, a joint research institute of Aalto University and University of Helsinki, and a professor of computer science at Aalto University. His research field is machine learning and probabilistic modeling, with applications in computational biology and medicine, proactive interfaces, information visualization, and brain signal analysis.