Research, development, and application of technologies
Intelligent data analysis and modelling


ExProf allows comparison of data sets obtained under different experimental conditions. If some points in a data set were not obtained, the missing points can be imputed. The program can also identify statistically significant differences between the analyzed data samples.

Data sets with multiple numerical data points in MS Excel sheets are used as input for the program. Processed data are exported to an output file of the same type.

A Gaussian Mixture Model (GMM) is used to impute the missing data points. The parameters of the GMM are trained using the Expectation-Maximization (EM) algorithm. Missing values are then recovered from the trained probability distribution.
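The imputation idea can be sketched in a few lines of NumPy. This is an illustrative sketch only, not ExProf's actual code: it assumes the GMM is trained by plain EM on the complete rows alone and that each missing value is replaced by its conditional mean given the observed values in the same row. The function names (`fit_gmm`, `impute_row`, `impute`) and the parameters `k`, `n_iter` and `reg` are hypothetical.

```python
import numpy as np

def _log_gauss(X, mean, cov):
    """Log density of a multivariate normal evaluated at each row of X."""
    d = X.shape[1]
    L = np.linalg.cholesky(cov)
    z = np.linalg.solve(L, (X - mean).T)              # shape (d, n)
    log_det = 2.0 * np.sum(np.log(np.diag(L)))
    return -0.5 * (d * np.log(2 * np.pi) + log_det + np.sum(z ** 2, axis=0))

def fit_gmm(X, k=2, n_iter=100, reg=1e-6, seed=0):
    """Plain EM for a full-covariance GMM on complete rows X of shape (n, d), d >= 2."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    means = X[rng.choice(n, size=k, replace=False)].copy()
    covs = np.array([np.cov(X, rowvar=False) + reg * np.eye(d)] * k)
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each row
        log_r = np.stack([np.log(weights[j]) + _log_gauss(X, means[j], covs[j])
                          for j in range(k)], axis=1)
        log_r -= log_r.max(axis=1, keepdims=True)
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means and covariances
        nk = r.sum(axis=0) + 1e-12
        weights = nk / nk.sum()
        means = (r.T @ X) / nk[:, None]
        for j in range(k):
            diff = X - means[j]
            covs[j] = (r[:, j, None] * diff).T @ diff / nk[j] + reg * np.eye(d)
    return weights, means, covs

def impute_row(x, weights, means, covs):
    """Fill NaNs in x with E[x_missing | x_observed] under the fitted GMM."""
    m = np.isnan(x)
    out = x.copy()
    if not m.any():
        return out
    o = ~m
    if not o.any():                                   # nothing observed: mixture mean
        out[:] = weights @ means
        return out
    log_post = np.empty(len(weights))
    cond = np.empty((len(weights), int(m.sum())))
    for j, (w, mu, S) in enumerate(zip(weights, means, covs)):
        S_oo = S[np.ix_(o, o)]
        S_mo = S[np.ix_(m, o)]
        log_post[j] = np.log(w) + _log_gauss(x[o][None, :], mu[o], S_oo)[0]
        cond[j] = mu[m] + S_mo @ np.linalg.solve(S_oo, x[o] - mu[o])
    post = np.exp(log_post - log_post.max())
    post /= post.sum()
    out[m] = post @ cond                              # posterior-weighted conditional mean
    return out

def impute(X, k=2):
    """Train on complete rows, then fill the gaps in every row of X."""
    complete = X[~np.isnan(X).any(axis=1)]
    params = fit_gmm(complete, k=k)
    return np.vstack([impute_row(row, *params) for row in X])
```

Calling `impute(X)` on an array with `NaN` in place of the missing measurements returns an array of the same shape with the observed values left unchanged. Training on complete rows only is a simplification chosen for brevity; an EM variant that also uses the partially observed rows is possible.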

To detect outlying points in the data set statistically, a General Linear Model (GLM) is implemented. For the GLM, each data point is approximated with a linear or quadratic fit, and the coefficients of these fits are used to compare the corresponding data points.
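The comparison of fit coefficients can be illustrated as follows. This is a minimal sketch under stated assumptions, not the test ExProf actually runs: it assumes one profile per condition measured at the same positions `t`, fits a polynomial of the chosen degree to each by least squares, and computes an F statistic for "both conditions share one fit" against "each condition has its own fit". The function names and the `degree` parameter are hypothetical.

```python
import numpy as np

def design(t, degree):
    """Polynomial design matrix with columns 1, t, ..., t**degree."""
    return np.vander(np.asarray(t, dtype=float), degree + 1, increasing=True)

def fit_rss(A, y):
    """Least-squares fit; returns the residual sum of squares and the coefficients."""
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid), coef

def compare_profiles(t, y_a, y_b, degree=1):
    """F statistic for a shared fit versus separate fits of two profiles.

    degree=1 gives a linear fit, degree=2 a quadratic one.
    Requires len(t) > degree + 1 so that residual degrees of freedom remain.
    """
    y_a = np.asarray(y_a, dtype=float)
    y_b = np.asarray(y_b, dtype=float)
    A = design(t, degree)
    p = A.shape[1]
    rss_a, coef_a = fit_rss(A, y_a)
    rss_b, coef_b = fit_rss(A, y_b)
    rss_pooled, _ = fit_rss(np.vstack([A, A]), np.concatenate([y_a, y_b]))
    df_num = p
    df_den = 2 * len(y_a) - 2 * p
    f_stat = ((rss_pooled - rss_a - rss_b) / df_num) / ((rss_a + rss_b) / df_den)
    return f_stat, (df_num, df_den), coef_a - coef_b
```

A large F statistic suggests that the two profiles follow different fits. A p-value can be obtained from the F distribution with the returned degrees of freedom, for example with a statistics library.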

Typical applications

  • assembly and comparison of protein expression profiles based on spot sizes obtained from 2D gel electrophoresis
  • imputation of missing data in expression profiles
  • comparison of expression profiles obtained under standard and modified experimental conditions
  • comparison of initially measured or linearized expression profiles
  • identification of proteins with different expression profiles based either on statistical significance or on a selected number of the most different profiles
  • presentation of expression profiles in both tabular and graphical form; imputed missing experimental values are highlighted in color in the tables
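The two ways of selecting differently expressed proteins mentioned in the list above can be sketched as a small helper. It is a hypothetical illustration: it assumes each protein already has a p-value from the profile comparison, and the names `select_profiles`, `alpha` and `top_n` are not taken from ExProf.

```python
def select_profiles(p_values, alpha=None, top_n=None):
    """Return protein names ordered from most to least different.

    p_values maps a protein name to the p-value of its profile comparison.
    Give exactly one of:
      alpha  -- keep proteins with p-value below this significance level
      top_n  -- keep this many proteins with the smallest p-values
    """
    if (alpha is None) == (top_n is None):
        raise ValueError("give exactly one of alpha or top_n")
    ranked = sorted(p_values, key=p_values.get)   # smallest p-value first
    if alpha is not None:
        return [name for name in ranked if p_values[name] < alpha]
    return ranked[:top_n]
```

For example, `select_profiles({"P1": 0.20, "P2": 0.001, "P3": 0.04}, alpha=0.05)` keeps `P2` and `P3`, while `top_n=1` keeps only `P2`. The sketch applies no correction for multiple testing.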

The format and structure of input and output data can be modified according to customer requirements.

ExProf can run both locally and in server mode, where the computationally intensive parts of the algorithm run on a server.

ExProf was developed in cooperation with the Institute of Plant Genetics and Biotechnology of the Slovak Academy of Sciences.


