Detection of high-order epistasis

Introduction


Many studies have shown a direct relationship between Single Nucleotide Polymorphisms (SNPs) and the onset of complex diseases such as Alzheimer's or Parkinson's. However, recent advances in Genome-Wide Association Studies (GWAS) indicate that this relationship goes beyond a simple one-to-one mapping: interactions among multiple SNPs (epistasis) influence the development of these diseases. Accordingly, this work proposes applying multi-objective metaheuristics to detect epistasis involving multiple loci. Parallelism is also applied because of the high computation times typical of this field.

 

High-order Epistasis Detection Problem


To detect gene-gene interactions (epistasis), this work uses two objective functions. The first function (y1) is based on a logistic regression model and measures its likelihood, defined by:

\[ y_1 = -2 \log lik + 2d \]

where log lik is the maximized log-likelihood of the model and d is the number of free parameters. The lower the value of y1, the stronger the relationship with the disease. Therefore, y1 is a minimization objective function.
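As a minimal sketch of how y1 could be evaluated, the Python example below assumes the Akaike-style form -2 log lik + 2d. To avoid a numerical optimizer, it also assumes a saturated logistic model in which every observed genotype combination gets its own parameter, so the maximized likelihood has a closed form. The function names and the toy data are hypothetical, and the regression model used in the referenced work may differ.

```python
import math
from collections import Counter


def aic_score(log_lik, d):
    """Assumed form of y1: -2 * log lik + 2 * d (lower is better)."""
    return -2.0 * log_lik + 2.0 * d


def saturated_log_lik(genotypes, phenotypes):
    """Maximized log-likelihood of a saturated logistic model.

    genotypes:  one hashable genotype combination per sample, e.g. (0, 2)
    phenotypes: one label per sample, 1 = case, 0 = control
    Returns (log_lik, d), where d counts one parameter per observed combination.
    """
    totals = Counter(genotypes)
    cases = Counter(g for g, p in zip(genotypes, phenotypes) if p == 1)
    log_lik = 0.0
    for combo, n in totals.items():
        k = cases.get(combo, 0)
        # The fitted case probability in this cell is k / n; empty terms add 0.
        for count in (k, n - k):
            if count > 0:
                log_lik += count * math.log(count / n)
    return log_lik, len(totals)


# Hypothetical toy data: two SNPs, eight samples.
genotypes = [(0, 0)] * 4 + [(1, 2)] * 4
phenotypes = [1, 1, 0, 0, 1, 1, 1, 1]

log_lik, d = saturated_log_lik(genotypes, phenotypes)
y1 = aic_score(log_lik, d)
```

In the toy data, the combination (1, 2) appears only in cases, so it contributes nothing to the negative log-likelihood. The 2d term then penalizes the number of parameters needed to describe the SNP combination.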

The second objective function (y2) is defined by a Bayesian network. This statistical model represents a set of random variables and their conditional dependencies as a directed acyclic graph, where the nodes are the genotypes and phenotypes and the edges are their conditional dependencies:

\[ y_2 = \sum_{i=1}^{I} \left( \sum_{b=1}^{r_i+1} \log b - \sum_{j=1}^{J} \sum_{c=1}^{r_{ij}} \log c \right) \]

where I is the number of distinct genotype combinations of the SNP nodes, J is the number of states of the disease node, ri is the number of samples in which the SNP nodes take combination i, and rij is the number of samples in which the disease takes state j and its parents take combination i. This second objective function measures the relationship between the SNP nodes and the disease node, so the lower the value of y2, the stronger the association.
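The counts ri and rij described above can be tallied directly from genotype and phenotype data. The following Python sketch assumes the commonly used K2-score form for a two-state disease node, that is, log((ri + 1)!) minus the sum over j of log(rij!), accumulated over all observed combinations i. The function name and the toy data are hypothetical.

```python
import math
from collections import Counter


def k2_score(genotypes, phenotypes):
    """Assumed K2-style form of y2 for a binary disease node (lower is better).

    genotypes:  one hashable genotype combination per sample
    phenotypes: one disease state per sample (e.g. 0 = control, 1 = case)
    """
    r_i = Counter(genotypes)                    # samples per combination i
    r_ij = Counter(zip(genotypes, phenotypes))  # samples per (i, state j)
    score = 0.0
    for n in r_i.values():
        # lgamma(n + 2) = log((n + 1)!) = sum of log b for b = 1 .. n + 1
        score += math.lgamma(n + 2)
    for count in r_ij.values():
        # lgamma(count + 1) = log(count!) = sum of log c for c = 1 .. count
        score -= math.lgamma(count + 1)
    return score


# Hypothetical toy data: one SNP with two genotypes, four samples.
genotypes = [0, 0, 1, 1]
weak = k2_score(genotypes, [0, 1, 1, 1])    # genotype 0 is split across states
strong = k2_score(genotypes, [0, 0, 1, 1])  # genotype determines the state
```

Using `math.lgamma` avoids looping over every term of the inner sums and stays numerically stable for large sample counts. In the toy data, the perfectly associated labeling yields the lower score, which matches the minimization direction of y2.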

 

References


  1. "A Multi-Objective Optimization Procedure for Solving the High-Order Epistasis Detection Problem". José M. Granado-Criado, Sergio Santander-Jiménez, Miguel A. Vega-Rodríguez, Álvaro Rubio-Largo. Expert Systems With Applications, Volume 142, 113000, Pergamon-Elsevier Science, Oxford, England, UK, 2020, pp. 1-15, ISSN: 0957-4174. DOI: 10.1016/j.eswa.2019.113000. (Impact factor = 4.292 in 2018, Quartile Q1)
  2. "Parallel Multi-Objective Optimization for High-Order Epistasis Detection", in: "Algorithms and Architectures for Parallel Processing, LNCS, Vol. 10393". Daniel Gallego-Sánchez, José M. Granado-Criado, Sergio Santander-Jiménez, Álvaro Rubio-Largo, Miguel A. Vega-Rodríguez. Springer International Publishing, Switzerland, 2017, pp. 523-532. ISBN: 978-3-319-65481-2. DOI: 10.1007/978-3-319-65482-9_38.