65. Pathway enrichment analysis: to make sense of single nucleotide polymorphism

Giuseppe Agapito – Data Analytics Research Center, Department of Medical and Surgical Sciences, University “Magna Græcia” of Catanzaro, Italy

Abstract

Clinical bioinformatics is a growing field that integrates clinical and omics data with the aim of developing personalized medicine. Introducing novel technologies to investigate the relationship between clinical states and the underlying biological machinery may therefore help the field advance.

High-throughput experimental platforms such as single nucleotide polymorphism (SNP) microarrays make it possible to study the relationship between variation in patients’ genomes and drug metabolism by detecting SNPs on genes related to drug metabolism. In pharmacogenomics and clinical studies this may, for instance, reveal genetic variants in patients who respond differently to a drug. Statistical and data mining software tools can help researchers determine the association between SNPs and the clinical conditions of patients that underlie a specific drug response.

This is only a partial result, however: statistical and data mining analysis of microarray data yields a list of SNPs mapped to specific genes, but that list remains detached from the biological functions those genes affect.
Pathway enrichment analysis (PEA) methods seek to overcome the problem of interpreting overwhelmingly large lists of relevant genes detached from their biological context, which are the main output of most basic high-throughput data analyses, such as SNP microarray analysis.
Thus, PEA enables researchers to generate new hypotheses, design subsequent experiments, and further validate their findings, for instance by identifying the biological roles of candidate genes when designing new cancer therapies.
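
As an illustration of the idea, one common form of PEA is over-representation analysis, which asks whether the genes carrying significant SNPs overlap a pathway more than chance would predict. The sketch below uses a one-sided hypergeometric test for that purpose. It is a minimal example under stated assumptions, not the method of any specific tool mentioned here: the function names (`hypergeom_sf`, `enrich`), the gene identifiers (`G1` to `G20`), and the two pathways are hypothetical toy data.

```python
from math import comb


def hypergeom_sf(k, M, n, N):
    """Return P(X >= k) for X ~ Hypergeometric.

    M: background size, n: genes in the pathway, N: genes in the SNP list.
    """
    upper = min(n, N)
    total = comb(M, N)
    return sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1)) / total


def enrich(snp_genes, pathways, background):
    """Rank pathways by over-representation of the SNP-derived gene list."""
    background = set(background)
    hits = set(snp_genes) & background
    results = []
    for name, members in pathways.items():
        members = set(members) & background
        overlap = hits & members
        p = hypergeom_sf(len(overlap), len(background), len(members), len(hits))
        results.append((name, len(overlap), p))
    # Smallest p-value first: most over-represented pathway on top.
    return sorted(results, key=lambda r: r[2])


# Toy data (hypothetical): 20 background genes, two pathways, four SNP genes.
background = [f"G{i}" for i in range(1, 21)]
pathways = {
    "pathway_A": ["G1", "G2", "G3", "G4", "G5"],
    "pathway_B": [f"G{i}" for i in range(6, 16)],
}
snp_genes = ["G1", "G2", "G3", "G6"]

results = enrich(snp_genes, pathways, background)
for name, overlap, p in results:
    print(f"{name}: overlap={overlap}, p={p:.4f}")
```

In this toy case three of the four SNP genes fall in the five-gene `pathway_A`, so it ranks first. A real analysis would test many pathways and would therefore also need a multiple-testing correction on the resulting p-values.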

About the author

The research activity of Giuseppe Agapito encompasses central issues in computer science, focusing mainly on data mining, parallel and distributed computing, and bioinformatics. The research is primarily devoted to developing omics data analytics that harness software and system-level metrics to better quantify the behavior of complex biological systems and to manage them efficiently. The main research activities have focused on the definition, implementation, and validation of:

Innovative and advanced statistical, machine learning, and data mining algorithms, including parallel and distributed ones, for omics data analysis.
Innovative algorithms for storing, analyzing, and visualizing biological networks, with significant emphasis on biological pathways analysis.
Innovative data mining algorithms and methodologies that help ontology curators evaluate the consistency of ontological terms through automatic software tools.

Developed software frameworks:
DMET-Miner is a software tool for extracting association rules from Single Nucleotide Polymorphism (SNP) microarray data. Pares (Parallel Association Rule Extraction) is the multi-threaded version of DMET-Miner, which can analyze massive SNP data sets through automatic workload balancing among the worker threads.
DMET-Analyzer is a software framework for the preprocessing and statistical analysis of DMET SNP microarray data sets used in pharmacogenomics.
coreSNP and Cloud4SNP are the parallel and cloud versions of DMET-Analyzer, respectively, developed to cope with the constant increase in the number of genes investigated in a single microarray experiment.
OSAnalyzer is a software tool that combines SNP data sets with clinical information and computes the overall survival and progression-free survival of a whole data set in a single analysis.
GO-WAR is a software tool for extracting Weighted Association Rules from the Gene Ontology, using the information content of the ontological terms as weights.
