Open Access

Fault Tolerance of Cloud Infrastructure with Machine Learning

About this article


Enhancing the fault tolerance of cloud systems and accurately forecasting cloud performance are pivotal concerns in cloud computing research. This study addresses both with machine learning models. Using the Google trace dataset, with 10,000 cloud environment records covering diverse metrics, we applied machine learning algorithms, including linear regression, decision trees, and gradient boosting, to construct predictive models. These models outperformed baseline methods, with C5.0 and XGBoost showing exceptional accuracy, precision, and reliability in forecasting cloud behavior. Feature importance analysis identified the ten most influential factors affecting cloud system performance. This work contributes to cloud optimization and reliability by enabling proactive monitoring, early detection of performance issues, and improved fault tolerance. Future research can refine these predictive models further, improving cloud resource management and, ultimately, service delivery in cloud computing.
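As a rough illustration of the evaluation described above, the following minimal Python sketch shows how accuracy and precision might be computed for a model against a majority-class baseline, and how the most influential features could be ranked from importance scores. It is not the authors' pipeline: the labels, the predictions, the feature names (`cpu_usage`, `memory_usage`, `disk_io`, `priority`), and the importance values are all hypothetical, and the abstract does not state which baseline or importance method was actually used.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision(y_true, y_pred, positive=1):
    """Fraction of predicted positives that are truly positive."""
    true_of_predicted_pos = [t for t, p in zip(y_true, y_pred) if p == positive]
    if not true_of_predicted_pos:
        return 0.0  # no positive predictions were made
    hits = sum(t == positive for t in true_of_predicted_pos)
    return hits / len(true_of_predicted_pos)


def majority_baseline(y_train, n):
    """Assumed baseline: always predict the most frequent training label."""
    label = Counter(y_train).most_common(1)[0][0]
    return [label] * n


def top_k_features(importances, k=10):
    """Return the k feature names with the largest importance scores."""
    ranked = sorted(importances, key=importances.get, reverse=True)
    return ranked[:k]


# Hypothetical labels: 1 = task failed, 0 = task completed normally.
y_true = [1, 0, 0, 1, 0, 0, 1, 0]
y_model = [1, 0, 0, 1, 0, 1, 1, 0]  # made-up model predictions
y_base = majority_baseline(y_true, len(y_true))

# Hypothetical importance scores, e.g. as reported by a tree-based model.
scores = {"cpu_usage": 0.4, "memory_usage": 0.3, "disk_io": 0.2, "priority": 0.1}

print("model   :", accuracy(y_true, y_model), precision(y_true, y_model))
print("baseline:", accuracy(y_true, y_base), precision(y_true, y_base))
print("top 2   :", top_k_features(scores, k=2))
```

On this toy data the baseline never predicts a failure, so its precision is reported as 0.0; the gap shows why accuracy alone can be misleading when failures are rare.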

4 times per year
Journal subjects:
Computer Sciences, Information Technology