Implementation of Enzyme Family Classification by using Autoencoders in a Study Case with Imbalanced and Underrepresented Classes

Language: English

Publication frequency: 4 times per year
Journal subjects: Computer Science, Artificial Intelligence, Engineering, Electrical Engineering, Automation Engineering, Metrology and Testing, Mechanical Engineering, Fundamentals of Mechanical Engineering

Darian Fernández Gutiérrez

Ariadna Arbolaez Espinosa

Deborah Raquel Galpert Cañizares

María Matilde García Lorenzo

Published online: 31 Mar 2025

Pages: 42-48

Received: 15 Apr 2024

Accepted: 20 May 2024

DOI: https://doi.org/10.14313/jamris-2025-005

Keywords: autoencoders, bioinformatics, embeddings, enzyme classification

© 2025 Darian Fernández Gutiérrez et al., published by Sciendo

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
