Professor Roberto Tagliaferri
NeuRoNe Lab, DISA-MIS
University of Salerno, Italy
Data integration in genomics and systems biology
Dr. Michael Hoffman
Multi-view learning is the branch of machine learning that deals with multimodal data, i.e. with patterns represented by different sets of features. The rapid spread of this learning technique is motivated by the continuing increase in real-world applications based on multi-view data. In bioinformatics, for example, multiple experiments (microarray gene expression, miRNA expression, RNA-Seq, genome-wide association studies (GWAS) and others) can be available for the same set of samples. Multi-view approaches are useful here because heterogeneous genome-wide data sources capture information on different aspects of complex biological systems. Each view provides a distinct facet of the same domain and is likely to encode different biologically relevant patterns. Integrating such views can provide a richer model of the functioning of the underlying system than any single view alone.
This paper reviews the literature, particularly with respect to bioinformatics, with the aim of understanding the principles and operation modes of the proposed methods and their possible applications. To organize the methods proposed in the literature and to find similarities between them, the approaches are grouped according to three criteria: the type of data used, the statistical problem addressed, and the stage of integration.