Please use this identifier to cite or link to this item:
https://hdl.handle.net/20.500.11851/1170
Title: | Effective gene expression data generation framework based on multi-model approach |
Authors: | Şirin, Utku; Erdoğdu, Utku; Polat, Faruk; Tan, Mehmet; Alhajj, Reda |
Keywords: | Multi-Model Approach; Probabilistic Boolean Networks; Ordinary Differential Equations; Genetic Algorithm; Hierarchical Markov Models; Gene Expression Data Generation; Gene Regulation Network Modeling |
Publisher: | Elsevier |
Source: | Sirin, U., Erdogdu, U., Polat, F., Tan, M., & Alhajj, R. (2016). Effective gene expression data generation framework based on multi-model approach. Artificial Intelligence in Medicine, 70, 41-61. |
Abstract: | Objective: To overcome the shortage of samples in gene expression data sets, which contain thousands of genes but only a small number of samples, a limitation that challenges the computational methods that use them. Methods and material: This paper introduces a multi-model artificial gene expression data generation framework in which different gene regulatory network (GRN) models contribute to the final set of samples based on the characteristics of their underlying paradigms. In the first stage, we build different GRN models and sample data from each of them separately. Then, we pool the generated samples into a rich set of gene expression samples. Finally, we select the best of the generated samples using a multi-objective selection method that measures sample quality from three aspects: compatibility, diversity, and coverage. We use four alternative GRN models, namely ordinary differential equations, probabilistic Boolean networks, a multi-objective genetic algorithm, and a hierarchical Markov model. Results: We conducted a comprehensive set of experiments on both real-life biological and synthetic gene expression data sets. We show that our multi-objective sample selection mechanism effectively combines samples from different models, yielding samples with up to 95% compatibility, 10% diversity, and 50% coverage. We show that the samples generated by our framework have up to 1.5x higher compatibility, 2x higher diversity, and 2x higher coverage than the samples generated by the individual models that the multi-model framework uses. Moreover, the results show that the GRNs inferred from the samples generated by our framework can have 2.4x higher precision, 12x higher recall, and 5.4x higher f-measure values than the GRNs inferred from the original gene expression samples. Conclusions: We show that we can significantly improve the quality of generated gene expression samples by integrating different computational models into one unified framework, without dealing with the complex internal details of each individual model. Moreover, the rich set of artificial gene expression samples is able to capture some biological relations that cannot be captured even by the original gene expression data set. (C) 2016 Elsevier B.V. All rights reserved. |
URI: | https://www.sciencedirect.com/science/article/pii/S0933365715300518?via%3Dihub https://hdl.handle.net/20.500.11851/1170 |
ISSN: | 0933-3657 |
Appears in Collections: | Bilgisayar Mühendisliği Bölümü / Department of Computer Engineering; PubMed İndeksli Yayınlar Koleksiyonu / PubMed Indexed Publications Collection; Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection; WoS İndeksli Yayınlar Koleksiyonu / WoS Indexed Publications Collection |
Items in GCRIS Repository are protected by copyright, with all rights reserved, unless otherwise indicated.