Open Access

Integrated Statistical and Rule-Mining Techniques for DNA Methylation and Gene Expression Data Analysis

Statistical analysis and association rule mining are useful, complementary protocols for determining the relationships among significant gene markers. The first identifies the significantly differentially expressed or methylated gene markers, whereas the second produces interesting relationships among them across different types of samples or conditions. In this article, approaches based on statistical tests and association rule mining are applied to gene expression and DNA methylation datasets to predict different classes of samples (viz., uterine leiomyoma/class-formersmoker and uterine myometrium/class-neversmoker). A novel rule-based classifier is proposed for this purpose. Using sixteen different rule-interestingness measures, we apply a Genetic Algorithm based rank aggregation technique to the association rules generated from the training set by the Apriori association rule mining algorithm. After determining the ranks of the rules, we apply majority voting to each test point, estimating its class label through a weighted-sum method. We run this classifier on the combined dataset using 4-fold cross-validation and then compare its performance with that of other popular rule-based classifiers. Finally, the status of some important gene markers is identified through frequency analysis of the evolved rules for each of the two class labels individually, in order to formulate the interesting associations among them.
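
The voting step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the weighting formula, so the rank-based weight used here (a rule at position r out of n contributes n - r) is an assumption, as are the function name `classify`, the item encoding such as "G1=up", and the gene names.

```python
from collections import defaultdict


def classify(sample, ranked_rules):
    """Predict a class label by weighted voting over the rules that fire.

    sample: set of items observed in the test point (hypothetical
        encoding, e.g. "G1=up" for an up-regulated gene).
    ranked_rules: list of (antecedent, label) pairs, best-ranked first.
        The ranking is assumed to come from a prior rank-aggregation step.
    """
    n = len(ranked_rules)
    scores = defaultdict(float)
    for rank, (antecedent, label) in enumerate(ranked_rules):
        # A rule fires when every item of its antecedent is in the sample.
        if antecedent <= sample:
            # Assumed weighting: better-ranked rules cast larger votes.
            scores[label] += n - rank
    if not scores:
        return None  # no rule covers this test point
    # The label with the largest weighted sum wins.
    return max(scores, key=scores.get)


# Hypothetical rules and labels, for illustration only.
rules = [
    (frozenset({"G1=up"}), "class-formersmoker"),
    (frozenset({"G2=down"}), "class-neversmoker"),
    (frozenset({"G1=up", "G3=up"}), "class-neversmoker"),
]
prediction = classify({"G1=up", "G2=down"}, rules)
```

In this toy call the first two rules fire, and the better-ranked one carries the larger vote. Returning `None` for an uncovered test point is a design choice of the sketch; a default class could be used instead.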

eISSN:
2083-2567
Language:
English
Publication frequency:
4 times a year
Journal subjects:
Computer Sciences, Databases and Data Mining, Artificial Intelligence