Demos and Matlab functions for the paper "Heterofusion: Fusing genomics data of different measurement scales"

Author: Yipeng Song

Research group: Biosystems Data Analysis Group, University of Amsterdam

Make sure the current folder is on your Matlab path, then run the help function from the Matlab console (e.g. help INDORT) for more information on the inputs and outputs of the algorithms.
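
A minimal sketch of the setup step above, assuming the current folder is the root of this repository. The variable name repo_root is only illustrative, and the help call is guarded so that nothing fails if INDORT is not found on the path.

```matlab
% Assumption: Matlab was started in (or changed to) the repository root.
repo_root = pwd;

% Add the root folder and all of its subfolders to the search path.
addpath(genpath(repo_root));

% Print the help text of one of the algorithms, if it can be found.
if exist('INDORT', 'file')
    help INDORT
end
```

The same pattern works for any other function in the repository: replace INDORT with the name of the function.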

Demos

The demos show how to do data fusion using the representation matrix approach, optimal scaling, and the GSCA model. The documentation for the demos and their results can be found in the examples folder.

data_fusioin_representation_matrices.m: shows how to fuse binary and quantitative data sets using the representation matrix approach.

data_fusioin_optimal_scaling.m: shows how to fuse binary and quantitative data sets using the optimal scaling approach.

data_fusioin_GSCA.m: shows where to find the details of data fusion using the GSCA model.

examples:

data_fusioin_GSCA.html: 
    Documentation for running data_fusioin_GSCA.m

data_fusioin_optimal_scaling.html: 
    Documentation for running data_fusioin_optimal_scaling.m

data_fusioin_representation_matrices.html: 
    Documentation for running data_fusioin_representation_matrices.m

Algorithms

The algorithms used to simulate data, to fit a GSCA model, and to do model selection are in the algorithms folder.

Algorithms:

data:
    X1_CNA.mat: An example of an imbalanced binary data set. The logit transform of its empirical marginal probabilities is used as the offset term in the simulation.

functions:
    Functions used in data simulation.

representation_matrix_algorithms:
    Algorithms to fit the INDSCAL (INdividual Differences SCALing) model and
    the INDORT (INDscal with ORThogonal constraints) model. The algorithms are
    from Kiers; see "Kiers, H. (1989). Three-way methods for the analysis of qualitative and quantitative two-way data. DSWO Press."
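
The offset term described for X1_CNA.mat in the data folder can be illustrated with a small sketch. It uses a toy binary matrix rather than the real data, because the variable name stored in X1_CNA.mat is not stated here; the names X, p and offset are assumptions made for this example and may differ from those in the simulation functions.

```matlab
% Toy imbalanced binary data set (rows: samples, columns: variables).
% This stands in for the real data; it is not taken from X1_CNA.mat.
X = [1 0 0;
     0 0 1;
     0 0 1;
     0 1 1];

% Empirical marginal probability of a 1 for each variable (column means).
p = mean(X, 1);

% Logit transform of the marginal probabilities, one offset per variable.
offset = log(p ./ (1 - p));

disp(offset);
```

A column that contains only zeros or only ones gives an infinite logit, so such columns would need to be removed or handled separately before computing the offset.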