ONLINE - Estimation and Use of Variance Contributions in Multiple High-Dimensional Data Sources

14.05.2021 11:15 – 12:15


The simultaneous analysis of multiple sources of high-dimensional data is a major challenge in several research areas. In cancer genomics, data collected on several omic platforms provide information both in the form of individual patterns within each data source and of joint patterns that are shared among the different sources. Capturing these two components of variation can help provide a broader understanding of cancer genetics. Several methods, based on different algorithms and frameworks, have been proposed to separate common and distinct components of variation in multiple data sources. For instance, Joint and Individual Variation Explained (JIVE) [1] uses an iterative algorithm to factorize the data matrix into low-rank approximations that capture variation across and within data types. It has been widely used in integrative genomics, and several generalizations and improvements have been developed, such as angle-based JIVE (aJIVE) [2]. Integrated PCA (iPCA) [3], on the other hand, is a model-based generalization of principal components analysis that can be used in similar applications. We will describe several methods for data integration, focusing on the estimation of joint and individual variance components. We will then present an application of these methods to a lung cancer case-control study nested in the Norwegian Women and Cancer (NOWAC) cohort study [4]. JIVE, aJIVE and iPCA are used to separate the joint and individual contributions of DNA methylation, miRNA and mRNA expression and to improve prediction models for the occurrence of lung cancer.
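
To make the idea of joint and individual variance contributions concrete, here is a minimal Python sketch. It is not the JIVE, aJIVE or iPCA algorithm as published: it performs a single, non-iterative pass (a truncated SVD of the stacked data for the joint part, then a truncated SVD of each block's residual for the individual part), whereas JIVE alternates such steps until convergence. The function names, the features-by-samples layout and the fixed ranks are assumptions made for illustration only.

```python
import numpy as np


def truncated_svd(M, rank):
    """Return the leading `rank` singular triplets of M."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U[:, :rank], s[:rank], Vt[:rank]


def one_pass_decomposition(blocks, joint_rank, indiv_ranks):
    """Single-pass joint/individual split (illustrative, not full JIVE).

    blocks: list of arrays, each (features_k x n_samples), same samples
            in the same column order (an assumption of this sketch).
    """
    # Center each feature (row) so squared norms measure variation.
    centered = [X - X.mean(axis=1, keepdims=True) for X in blocks]

    # Joint structure: leading right singular vectors of the stacked data.
    _, _, Vt = truncated_svd(np.vstack(centered), joint_rank)
    P_joint = Vt.T @ Vt  # projector onto the joint sample-space directions

    results = []
    for X, rank in zip(centered, indiv_ranks):
        J = X @ P_joint                      # joint part of this block
        R = X - J                            # orthogonal to the joint part
        U, s, Wt = truncated_svd(R, rank)
        A = (U * s) @ Wt                     # individual low-rank part
        E = R - A                            # remaining noise
        total = np.sum(X ** 2)
        results.append({
            "joint": J,
            "individual": A,
            "residual": E,
            "var_joint": np.sum(J ** 2) / total,
            "var_individual": np.sum(A ** 2) / total,
            "var_residual": np.sum(E ** 2) / total,
        })
    return results
```

Because the individual part is estimated from the residual after projecting out the joint directions, the three pieces are mutually orthogonal and the three proportions of each block sum to one. In practice the ranks would have to be chosen from the data (for example by permutation tests or other selection criteria), which this sketch does not attempt.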


[1] Lock, E. F., Hoadley, K. A., Marron, J. S. and Nobel, A. B. (2013). Joint and Individual Variation Explained (JIVE) for integrated analysis of multiple data types. Annals of Applied Statistics, 7, 523 – 542.

[2] Feng, Q., Jiang, M., Hannig, J. and Marron, J. S. (2018). Angle-based joint and individual variation explained. Journal of Multivariate Analysis, 166, 241 – 265.

[3] Tang, T. M. and Allen, G. I. (2018). Integrated principal components analysis. arXiv, 1810.00832.

[4] Lund, E., Dumeaux, V., Braaten, T., Hjartåker, A., Engeset, D., Skeie, G. and Kumle, M. (2008). Cohort profile: the Norwegian Women and Cancer Study—NOWAC—Kvinner og kreft. International Journal of Epidemiology, 37, 36 – 41.



Organized by

Faculté d'économie et de management
Research Center for Statistics


Erica PONZI, University of Oslo, Norway

Free admission


Category: Seminar


Contact: missing email