Open Access

Intelligent Models for Prediction of Compressive Strength of Geopolymer Pervious Concrete Hybridized with Agro-Industrial and Construction-Demolition Wastes

Sep 26, 2024


Introduction
General

The construction industry stands at a crossroads, facing the dual challenges of meeting global infrastructure demands and mitigating its environmental footprint. Central to this challenge is the industry's reliance on ordinary Portland cement (OPC), the production of which is notably carbon-intensive. Studies have quantified the environmental burden of OPC production, reporting that approximately 0.73–0.85 tonnes of CO₂ are emitted for every tonne of OPC produced, which underlines the urgent need for sustainable alternatives in concrete manufacturing [1]. Furthermore, the growing volume of construction waste, alongside the overproduction of industrial by-products such as fly ash and slag, necessitates a shift toward sustainable construction methodologies. The world generates billions of tons of construction waste annually, a significant portion of which remains underutilized, contributing to environmental degradation [2]. Geopolymers (GPs) and alkali-activated cements (AACs) emerge as strong candidates in this context, offering a viable pathway to curtail the carbon emissions associated with traditional cement [3], together with a robust framework for recycling and reusing construction and industrial waste. They are synthesized from aluminosilicate materials, promise a significant reduction in CO₂ emissions, and have been reported to outperform OPC-based materials in mechanical performance and durability [4]. The global warming potential of GPs is markedly lower, primarily because they are synthesized from industrial by-products, thereby circumventing the energy-intensive clinker production process inherent in OPC manufacturing [5].

This research pivots on the development of a novel slag-based GP pervious concrete, hybridized with an agro-waste, i.e., sugarcane bagasse ash (SBA) and construction-demolition (C&D) wastes, steering the conversation toward circular economy in construction. The utilization of such waste materials not only addresses the disposal issue but also enhances the sustainability quotient of the concrete produced. SBA, an agricultural by-product, and C&D wastes, typically viewed as landfill fodder, are thus valorized, contributing to waste minimization and resource efficiency. Soft computing models stand at the forefront of this research, offering a nuanced approach to predicting and optimizing the mechanical properties of these novel concrete mixtures. By integrating machine learning (ML) techniques, this study aims to refine the prediction accuracy of the concrete's strength, providing a robust framework for the application of these materials in real-world scenarios. This computational approach aligns with the emphasis on innovative applications of computing in civil engineering, heralding a new era of data-driven material science.

The urgency of the transition to sustainable construction practices is further amplified by the warnings of climate scientists. The trajectory of global warming, exacerbated by the construction sector's carbon emissions, necessitates a shift toward materials that reduce the carbon footprint. GPs present a promising solution in this regard, offering a sustainable alternative to OPC by harnessing the latent hydraulic properties of industrial by-products. GPs not only contribute to the reduction of CO₂ emissions but also promise enhancements in the material properties of concrete, including superior mechanical strength and durability, fostering the advancement of green construction materials [6]. This research therefore focuses on innovative, sustainable construction materials, in particular GP pervious concrete enhanced with industrial and agricultural wastes, and combines computational models with sustainable material science in line with the call for environmental stewardship in construction practices. Predicting the mechanical properties of any GP pervious concrete is challenging because of the complex interactions between its heterogeneous components, including various types of industrial by-products, and the specific conditions required for alkali activation. This work uses ML to untangle these relationships and to offer a more accurate and efficient predictive approach: data-driven models forecast the behavior of pervious GPC and thereby guide the optimization of sustainable concretes. In doing so, the study also addresses a critical gap in the current literature and lays down a pathway for future research in sustainable construction materials, in keeping with the global agenda for sustainable development and climate resilience.

Review of Earlier Studies on Soft Computing Applications in AAC and GPC Mixes

The advent of GPC/AAC represents a significant step toward sustainable construction practices, aligning with the global drive to reduce the environmental footprint of the building industry. Synthesized from industrial by-products, this material not only addresses the urgent need to repurpose waste but also offers enhanced mechanical properties and durability compared to traditional Portland cement. Soft computing models offer a means of addressing the complex, nonlinear problems inherent in concrete research, ranging from mix design optimization to performance prediction under various conditions. These models facilitate a deeper understanding of the interplay between GPC's compositional variables and its mechanical attributes, enabling the optimization of mix designs for tailored applications. As the construction sector continues to evolve, the combination of materials science and computational intelligence supports the accelerated design and deployment of high-performance, eco-friendly materials. Table 1 collates seminal works in the domain, illustrating the scope, methodologies, and results achieved through the application of soft computing to GPC/AAC research, thereby providing the background for the ensuing discussion.

Table 1:

Thematic Categorization of Selected Soft Computing Models Used in AAC/GPC Research.

Ref. | Model | Key Findings | Attributes | Future Scopes
[7] | ANN | Effectively predicted the strength variation due to molar concentration changes in activator solutions with R² values over 0.96 | Predicting strength with the use of 70% results for training and 30% sample results for testing | Further refine ANN models to enhance predictive accuracy
[8] | GEP | Developed numerical models to predict GGBS-based GPC strength, demonstrating high accuracy and validation with R² values ranging from 0.97 to 0.99 | Compressive strength prediction of GGBS-based GPC with the use of 351 samples | Expand GEP models to include more variables influencing GPC properties
[9] | GEP | Predicted the compressive strength of bacteria-incorporated GPC, showing minimal error against experimental data | Modeling compressive strength of bacteria-incorporated GPC | Explore GEP's application in other GPC types with different admixtures
[10] | RFR and GEP | RFR and GEP were applied to develop empirical models predicting fly-ash GPC strength, where RFR showed better performance through statistical error checks | Strength prediction of GPC using advanced soft computing methods developed through 298 datasets | Compare these models against other ML techniques for broader applicability
[11] | AI tools | AI techniques like GP, RVM, and GPR showed high accuracies in predicting GPC strength with R² values in the range of 0.93–0.99 | AI-assisted mix-design tool for GPC | Test these AI models in real-world mix-design scenarios for validation
[12] | GEP | GEP provided an empirical equation for GPC strength prediction using FA, showing good model accuracy and generalization capability | Estimating GPC compressive strength using GEP developed through 298 datasets | Enhance the GEP model by incorporating more diverse datasets
[13] | ANN, RSM, and GEP | Comparative analysis of ANN, RSM, and GEP showed RSM and ANN outperformed GEP in accuracy for predicting the strength of engineered GP composite (EGC) | Predictive modeling of EGC compressive strength. The RSM showed 96% accuracy, whereas the ANN had 93% | Improve GEP models or explore hybrid approaches for better prediction in EGC
[14] | ML | Ensembled ML techniques, particularly AdaBoost and random forest, outperformed individual methods in predicting GPC strength, and R² values of 0.90 were obtained for the ensemble methods | Applying ML for strength prediction of GP composites; AdaBoost and random forest showed superior predictions | Further explore the potential of ensembling techniques in predictive accuracy improvement
[15] | ANN, M5P-Tree, LR, and MLR | ANN model excelled in predicting the compressive strength of GGBS/FA-based GPC, showcasing its potential over other models | Compressive strength prediction for GPC composites developed through 220 datasets | Enhance model reliability with broader datasets and explore real-time prediction capabilities
[16] | ANN | ANN models showed promise in predicting strength characteristics of AAC masonry blocks, with significant accuracy in training and validation phases | Strength prediction for alkali-activated masonry blocks developed through 108 datasets | Validate ANN models in diverse AAC formulations and structural applications
[17] | GEP | GEP demonstrated high accuracy in predicting the compressive strength of FRGC, supporting its use in optimizing concrete mixes; R² values in the range of 0.97–0.99 indicate GEP's robust performance and reliability | Predictive modeling for fiber-reinforced geopolymer concrete (FRGC) developed through 393 datasets | Apply GEP in broader FRGC applications and investigate other fiber types and contents
[5] | ANN, MPR, and SA-LR | Utilized ANN and advanced regression techniques for predicting the performance of high-strength GPC, focusing on sustainable and cost-effective solutions | Optimization of high-performance GPC mixes, with the use of 81 sample data | Extend analysis to include long-term performance and durability predictions
[18] | NSGA-II and BPNN | Introduced a multi-objective optimization approach using NSGA-II and BPNN for geopolymer mix design, balancing mechanical, environmental, and economic factors; R² and other statistical tests were used for validation | Mix design optimization for fly ash-based GPC mixes, with the use of 896 sample data | Expand optimization frameworks to incorporate additional environmental and durability criteria
[19] | LR, ANN, and AdaBoost | AdaBoost model showcased superior prediction accuracy with the highest R² value for the compressive strength of FlA-based GPC compared to conventional machine learning models | Enhancing predictive accuracy for FlA-based GPC strength | Investigate AdaBoost's application in predicting other relevant concrete properties
[20] | SVR and GWO | The study applied SVR combined with GWO to predict the compressive strength of GGBFS-based geopolymer concrete, showing high accuracy and potential for optimization; R² value for SVR-GWO was 0.95 | Prediction of compressive strength for GGBFS-based GPC developed through 268 datasets | Explore the integration of GWO with other predictive models for enhanced optimization and prediction
[21] | LSTM | Employed LSTM to forecast the compressive strength of FAGC, introducing a novel approach with optimized LSTM parameters for better prediction accuracy | Compressive strength prediction in FAGC using LSTM developed using 162 datasets | Further refine LSTM models and explore their application in real-time monitoring and control of GPC properties
[22] | XGB and SVM | The study compared XGB and SVM for predicting the slump and strength of AAC, finding XGB to perform significantly better with higher R² values (respective R² values of 0.94 and 0.97 for slump and strength), providing a robust tool for AAC mix design | Slump and compressive strength prediction in AAC with a total of 193 datasets | Investigate the applicability of XGB in broader contexts of AAC production and other performance parameters

Abbreviations: AdaBoost: Adaptive Boosting; AI: artificial intelligence; ANN: artificial neural network; BPNN: back propagation neural network; GEP: gene expression programming; GWO: Grey Wolf Optimization; LR: linear regression; LSTM: long short-term memory; ML: machine learning; MLR: multiple linear regression; MPR: multilinear regression; M5P-Tree: M5' regression tree; NSGA-II: nondominated sorting genetic algorithm II; RFR: random forest regression; RSM: response surface methodology; SVM: support vector machine; SVR: support vector regression; SA-LR: linear regression models enhanced by swarm optimization; XGB: extreme gradient boosting.

Identifying the best soft computing method across the reported studies would mean looking for the method that consistently shows high accuracy, low error rates, and good generalization across different datasets. From the details summarized in Table 1, methods such as GEP, AdaBoost, and RVM have shown high R² values or have been explicitly reported as outperforming others in their respective studies, indicating their effectiveness for modeling and prediction in geopolymer concrete research. However, each study identified its best method on the basis of its own dataset and objectives, so the “best” method varies with the study goals, data characteristics, and performance metrics used. The effectiveness of a particular method, such as AdaBoost's ensemble approach or RVM's kernel-based learning, hinges on how well it generalizes from the training data to unseen data. A sound selection therefore needs to consider not only accuracy (such as R² values) but also other performance metrics and the context of each study, including the nature of the dataset and the specific prediction task, with each method's strengths and weaknesses weighed against the research objectives.

Research Gaps and Specific Objectives of Current Investigations

Previous studies have extensively explored individual soft computing techniques for predicting the compressive strength of GPC/AAC mixes. Applying such models is particularly promising when these materials are combined with sustainability-enhancing components such as agro-industrial wastes. The literature reveals a growing interest in optimizing GPC properties through advanced computational techniques, yet a discernible gap persists in the specific domain of pervious concretes developed using geopolymers and alkali-activated binders. Notably, the use of soft computing for pervious alkali-activated binder-based concretes that combine agricultural by-products with industrial wastes, such as C&D wastes, foundry wastes, and steel industry wastes, remains underexplored. This research niche holds significant potential for advancing sustainable construction practices, leveraging the inherent benefits of AAC technology, such as reduced carbon footprint and enhanced material reuse, while incorporating the permeability attributes essential for modern infrastructure requirements.

Hence, the current investigation seeks to bridge this gap by focusing on pervious geopolymer concretes enhanced with a specific agro-waste material and industrial by-products, thereby pushing the boundaries of sustainability in construction materials. Moreover, the integration of soft computing models to predict and optimize the properties of these novel concretes represents an approach that combines computational intelligence with sustainable material science. By addressing these gaps, the research contributes to the academic discourse and paves the way for practical advancements in sustainable construction, promoting environmental stewardship and resource efficiency in the industry. Under the broad scope of soft computing, the present investigation specifically compares four ML models, some well established and others less commonly used: Multiple Linear Regression, Gradient Boost, AdaBoost, and XGBoost regressions. A total of 156 datasets were studied, all obtained from specimens carefully prepared and tested in the laboratory.

A detailed review of the reported literature on soft computing in similar concretes was also carried out to assess the performance of various models. Furthermore, an ensemble technique that combines the predictions from multiple ML algorithms to make more accurate predictions than any individual model was developed. The performance of the developed models was evaluated through statistical scores, including root mean squared error (RMSE), mean absolute error (MAE), mean squared error (MSE), R² score, and coefficient of variation (CV) mean, and an overall comparison of the models was made. A generalized flow diagram showing the soft-computing scope of the article is presented in Figure 1.

Figure 1:

Flowchart showing the experimentation and development of soft computing models.

Materials and Experimental Methods
Material Properties

Ground granulated blast furnace slag (GGBS), a by-product of the iron and steel industry, was used as the major binder, and an agro-waste, sugarcane bagasse ash (SBA, also denoted AWA), was used as a partial substitute for the binder at different levels. The GGBS had a specific gravity of 2.89 and a fineness of 360 m²/kg, and its major chemical oxides were 38.12% silica (SiO₂), 36.89% lime (CaO), 14.52% alumina (Al₂O₃), 7.60% magnesium oxide (MgO), and 1.15% iron oxide (Fe₂O₃). The SBA had a specific gravity of 2.49 and a fineness of 462 m²/kg, and comprised 59.28% silica, 16.08% alumina, 8.10% lime, 5.85% iron oxide, and 4.80% magnesium oxide.

Two types of coarse aggregates were utilized in this study: naturally crushed granite coarse aggregates (NCA) and recycled coarse aggregates (RCA) sourced from demolished building materials, with respective specific gravities of 2.68 and 2.53. Because pervious concrete requires minimal fine aggregate, the coarse-to-fine aggregate ratio was maintained at 9:1 throughout the research. Waste foundry sand (WFS), an industrial by-product of the metal casting industry, was used as the fine aggregate and had a specific gravity of 2.56. All mechanical testing on the aggregates was carried out in accordance with relevant standards [23,24,25,26]. The particle size distributions of all these ingredients are presented in Figure 2. The alkaline activator solution for the concrete mixes was formulated using 98% pure sodium hydroxide (NaOH) flakes and liquid sodium silicate (LSS, i.e., Na₂SiO₃), sourced from local chemists. The LSS contained 14.70% sodium oxide, 32.80% silicon dioxide, and 52.50% water, with a specific gravity of 1.57, while the NaOH had a specific gravity of 2.10. The solution was prepared by blending NaOH with LSS to achieve a target activator modulus (the Ms value, i.e., the ratio of SiO₂ to Na₂O) and by adjusting the water-to-binder ratio, initially to 0.20 and then to 0.40, with laboratory tap water for mix preparation. The prepared alkali activator solution was left in a sealed container for a minimum of 24 hours before use to ensure consistent chemical properties.

Figure 2:

Particle size distribution of binder materials and aggregates.

Mix Design Strategies of GPC Mixes

The mix design for geopolymer pervious concrete was developed following the basic guidelines outlined in IRC: 44-2017 [27], aiming for a low-slump concrete (<25 mm) with a target compressive strength of 20 MPa. This design was adapted to create a slag-based geopolymer concrete (GPC) mix, drawing on previous studies [3,28]. A satisfactory mix was achieved with 290 kg of total binding material (GGBS) per cubic meter of concrete and a water-to-binder (w/b) ratio of 0.40. The mix maintained a minimum percolation rate of 300 mm per minute, corresponding to a Darcy's coefficient of permeability of 5.0 cm/s. The total water content of the activator solution comprised the water contained in the LSS plus additional water added to reach the desired water content. Alkali activator solutions (AS) were tailored for each mix to provide a 4% Na₂O dosage by binder weight, with a consistent activator modulus (Ms value) of 1.25. Tap water was used to produce the aqueous alkali solution.
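The activator quantities implied by these targets can be checked with a short calculation. The sketch below is illustrative only and rests on two assumptions that the text does not state: that Ms is applied as a SiO2/Na2O mass ratio, and that each kilogram of NaOH is credited with 0.775 kg of Na2O (from 2 NaOH → Na2O + H2O). No correction for the 98% NaOH purity is applied. Under these assumptions, the results agree with the NaOH, LSS, Water, and AS columns of Table 2 to within rounding.

```python
# Hedged sketch: back-calculating the activator solution for 1 m3 of concrete.
# ASSUMPTIONS (not stated in the text): Ms is a SiO2/Na2O mass ratio, and NaOH
# yields 0.775 kg of Na2O per kg. NaOH purity (98%) is deliberately not applied.
BINDER = 290.0            # kg/m3, total binder content
NA2O_DOSAGE = 0.04        # Na2O as a fraction of binder mass
MS = 1.25                 # activator modulus, SiO2/Na2O
W_B = 0.40                # water-to-binder ratio
LSS_NA2O, LSS_SIO2, LSS_WATER = 0.1470, 0.3280, 0.5250  # LSS mass fractions
NA2O_PER_NAOH = 0.775     # assumed Na2O yield of NaOH


def activator_quantities(binder=BINDER):
    """Return NaOH, LSS, added water, and total activator solution in kg."""
    na2o_total = NA2O_DOSAGE * binder
    sio2_total = MS * na2o_total
    lss = sio2_total / LSS_SIO2                       # all SiO2 comes from LSS
    naoh = (na2o_total - lss * LSS_NA2O) / NA2O_PER_NAOH
    added_water = W_B * binder - lss * LSS_WATER      # LSS already carries water
    total = naoh + lss + added_water
    return {"NaOH": naoh, "LSS": lss, "Water": added_water, "AS": total}


quantities = activator_quantities()
for name, kg in quantities.items():
    print(f"{name}: {kg:.3f} kg")
```

Because the binder content and dosage targets are the same for every mix, this calculation also explains why the activator columns are identical across all 13 mixes.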

Initially, GGBS served as the primary binder, with systematic replacements by SBA ranging from 0% to 20% in 5% increments. To optimize the level of RCA, NCA was replaced by RCA at levels from 0% to 100%. Based on testing, the mixes with 0% and 50% RCA were further adjusted for SBA content. This approach resulted in 13 distinct mix designs, which are detailed in Table 2. Each mix is identified by a unique Mix ID; for example, “M-5-50” indicates a composition of 5% SBA and 50% RCA. For each formulation, 12 cube samples with a face dimension of 100 mm were prepared and air-cured for 28 days before compressive strength (CS) testing as per standard directives [29]. The freshly made mixture was poured into the standard mold in three layers, with each layer thoroughly compacted using a table vibrator; it was then carefully finished and left to air-cure in laboratory conditions. This resulted in 156 cube samples across all mixes. The casting, air-curing, and testing sequence is shown in Figure 3. Additionally, three cylindrical samples of 100 mm diameter and 200 mm height were prepared from each mix in the same manner and tested for hydraulic conductivity using the falling-head permeability method, as documented in the literature [30,31].
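The Mix ID convention and the sample count can be made concrete with a small sketch. The function names are hypothetical helpers introduced for illustration; the mix IDs, the 290 kg binder content, and the 12 cubes per mix come from the text and Table 2.

```python
# Hedged sketch: decoding Mix IDs of the form "M-<SBA %>-<RCA %>".
# parse_mix_id and binder_split are hypothetical helper names.
MIX_IDS = [
    "M-0-0", "M-0-25", "M-0-50", "M-0-75", "M-0-100",
    "M-5-0", "M-10-0", "M-15-0", "M-20-0",
    "M-5-50", "M-10-50", "M-15-50", "M-20-50",
]
TOTAL_BINDER = 290.0   # kg/m3
CUBES_PER_MIX = 12


def parse_mix_id(mix_id):
    """Return (SBA replacement %, RCA replacement %) encoded in a Mix ID."""
    _, sba_pct, rca_pct = mix_id.split("-")
    return int(sba_pct), int(rca_pct)


def binder_split(sba_pct, total=TOTAL_BINDER):
    """Return (GGBS kg, SBA kg) when SBA replaces sba_pct % of the binder."""
    sba = total * sba_pct / 100
    return total - sba, sba


for mix_id in MIX_IDS:
    sba_pct, rca_pct = parse_mix_id(mix_id)
    ggbs, sba = binder_split(sba_pct)
    print(f"{mix_id}: GGBS={ggbs} kg, SBA={sba} kg, RCA={rca_pct}%")

total_cubes = len(MIX_IDS) * CUBES_PER_MIX
print("Total cube samples:", total_cubes)
```

For “M-5-50” this gives 275.5 kg of GGBS and 14.5 kg of SBA, and 13 mixes with 12 cubes each give the 156 cube samples used for model development.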

Table 2:

Mix Proportion Details for 1 m³ Geopolymer Pervious Concrete Preparations in kg.

Mix ID | GGBS | AWA | NaOH | LSS | Water | AS | NCA | RCA | FA
M-0-0 | 290 | 0 | 6.583 | 44.207 | 92.791 | 143.58 | 1881.3 | 0 | 199.7
M-0-25 | 290 | 0 | 6.583 | 44.207 | 92.791 | 143.58 | 1411.03 | 444.01 | 199.7
M-0-50 | 290 | 0 | 6.583 | 44.207 | 92.791 | 143.58 | 940.68 | 888.03 | 199.7
M-0-75 | 290 | 0 | 6.583 | 44.207 | 92.791 | 143.58 | 470.34 | 1332.05 | 199.7
M-0-100 | 290 | 0 | 6.583 | 44.207 | 92.791 | 143.58 | 0 | 1776.07 | 199.7
M-5-0 | 275.5 | 14.5 | 6.583 | 44.207 | 92.791 | 143.58 | 1878.13 | 0 | 199.3
M-10-0 | 261 | 29 | 6.583 | 44.207 | 92.791 | 143.58 | 1874.89 | 0 | 198.9
M-15-0 | 246.5 | 43.5 | 6.583 | 44.207 | 92.791 | 143.58 | 1871.65 | 0 | 198.6
M-20-0 | 232 | 58 | 6.583 | 44.207 | 92.791 | 143.58 | 1868.42 | 0 | 198.3
M-5-50 | 275.5 | 14.5 | 6.583 | 44.207 | 92.791 | 143.58 | 939.065 | 886.505 | 199.3
M-10-50 | 261 | 29 | 6.583 | 44.207 | 92.791 | 143.58 | 937.45 | 884.98 | 198.9
M-15-50 | 246.5 | 43.5 | 6.583 | 44.207 | 92.791 | 143.58 | 935.83 | 883.45 | 198.6
M-20-50 | 232 | 58 | 6.583 | 44.207 | 92.791 | 143.58 | 934.21 | 881.92 | 198.3

Figure 3:

Preparation, air-curing, and testing sequence of geopolymer pervious concrete specimens.

Development of Machine Learning (ML) Models

ML algorithms are highly capable of integrating a variety of complex parameters, including material properties, mix design, environmental conditions, and curing processes, which all influence the final strength of concrete. This predictive capability of ML is proven to be crucial for the optimization of the material mix and ensuring the structural integrity with sustainability in construction projects without the need for extensive physical trial and error, which can be costly and time-consuming. Hence, utilizing ML allows for a more accurate and efficient analysis of the parameters, thereby improving the predictability of concrete's performance characteristics [32,33].

Proposed ML Architecture

Figure 4 presents the flow diagram of the ML modeling architecture adopted in the current investigation. Initially, the “dataset” is introduced into the system, where it undergoes “data preprocessing.” This step includes a “cleansing” phase that checks for “null, missing values”; if present, these are addressed before proceeding. Once data integrity is ensured, “normalization and standard scaler” methods are applied to standardize the scale of the data features, which is decisive for ML model performance and comparison. The standardized data are then divided into two distinct sets for “training” and “testing,” with 70% of the data allocated to ML model training to capture the underlying data patterns and 30% reserved for testing to validate model predictions against unseen data. Following the application of the various ML algorithms, “data calibration and verification” is carried out to refine the models and to ensure that the predictions align closely with the actual data. The “evaluation metrics” play a critical role in assessing model performance: these statistical metrics quantify the accuracy, precision, and reliability of the models. If the errors are acceptable, indicating that the ML model's predictions are within a satisfactory range, the process is concluded; if not, the process returns to model application, and the models are recalibrated and revalidated to improve their accuracy. This architecture is designed to be rigorous and iterative, enhancing the ability of the developed ML models to predict the CS of pervious GPC with high precision.
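The preprocessing steps of this workflow can be sketched with scikit-learn. This is a minimal illustration, not the authors' script: the feature rows are the GGBS, AWA, AS, NCA, RCA, and FA columns of Table 2, repeated once per cube; the `random_state` value is an assumption; and the scaler is fitted on the training portion only, a common precaution against information leakage that differs slightly from the order described above. Measured CS values are not listed in this section, so only the inputs are processed.

```python
# Hedged sketch of the null check, 70/30 split, and standardization steps.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Columns: GGBS, AWA (SBA), AS, NCA, RCA, FA (WFS) in kg/m3, from Table 2.
MIXES = np.array([
    [290.0, 0.0, 143.58, 1881.30, 0.0, 199.7],
    [290.0, 0.0, 143.58, 1411.03, 444.01, 199.7],
    [290.0, 0.0, 143.58, 940.68, 888.03, 199.7],
    [290.0, 0.0, 143.58, 470.34, 1332.05, 199.7],
    [290.0, 0.0, 143.58, 0.0, 1776.07, 199.7],
    [275.5, 14.5, 143.58, 1878.13, 0.0, 199.3],
    [261.0, 29.0, 143.58, 1874.89, 0.0, 198.9],
    [246.5, 43.5, 143.58, 1871.65, 0.0, 198.6],
    [232.0, 58.0, 143.58, 1868.42, 0.0, 198.3],
    [275.5, 14.5, 143.58, 939.065, 886.505, 199.3],
    [261.0, 29.0, 143.58, 937.45, 884.98, 198.9],
    [246.5, 43.5, 143.58, 935.83, 883.45, 198.6],
    [232.0, 58.0, 143.58, 934.21, 881.92, 198.3],
])

X = np.repeat(MIXES, 12, axis=0)          # 12 cubes per mix -> 156 rows

# Cleansing step: stop if any value is missing.
if np.isnan(X).any():
    raise ValueError("dataset contains missing values")

# 70/30 split; random_state=0 is an arbitrary choice for reproducibility.
X_train, X_test = train_test_split(X, test_size=0.30, random_state=0)

# Standardize using statistics learned from the training rows only.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

print(X_train_s.shape, X_test_s.shape)
```

Note that the AS column is identical for every mix, so after standardization it carries no information for distinguishing mixes.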

Figure 4:

Model architecture flow diagram of the soft computing adopted in the current investigation.

Brief Description of the Proposed ML Models

Specifically, Multiple Linear Regression, Gradient Boost, AdaBoost, and XGBoost regressions are applied. Brief details of each ML model are presented below:

Multiple linear regression (MLR) model: The MLR is a statistical technique employed to predict the outcome of a dependent variable based on two or more independent variables. This method is instrumental in analyzing how variations in independent variables contribute to the overall variance in the dependent variable. This method allows for the assessment of individual contributions from each independent (predictive) variable, providing insights into the relationships within the data. This approach is considered vital for understanding complex interactions in various scientific and engineering applications [32].

Gradient Boost Regression (GBR) model: The GBR utilizes a class within the Scikit-Learn library designed specifically for regression tasks. This method capitalizes on the concept of boosting, an ensemble technique that combines multiple weak predictive models to create a stronger aggregate model. GBR is fundamentally built on decision trees, structuring predictions beginning from the root and branching out based on various conditions until reaching the leaves, which represent the final prediction outcomes. The effectiveness of each iterative improvement in GBR depends on the “learning rate,” a parameter that determines the magnitude of adjustment made to the model with each successive tree added. A smaller learning rate may require more trees to converge to a robust model, enhancing the model's ability to generalize but increasing computational complexity. This method is particularly useful for handling nonlinear datasets with complex interactions and dependencies among variables [34].
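The role of the learning rate can be illustrated with a toy gradient-boosting loop written in plain Python. This is a simplified sketch under stated assumptions, not the Scikit-Learn implementation: it uses squared-error loss, one-split regression stumps as weak learners, a single input feature, and toy data (y = x²) unrelated to the concrete dataset. Each round fits a stump to the current residuals and adds a fraction of its output, set by the learning rate, to the running prediction.

```python
# Hedged sketch: gradient boosting with regression stumps and squared loss.
def fit_stump(x, residuals):
    """Least-squares regression stump (one split) fitted to the residuals."""
    best = None
    values = sorted(set(x))
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda xi: left_mean if xi <= threshold else right_mean


def boost(x, y, n_trees, learning_rate):
    """Return the training MSE recorded after each boosting round."""
    prediction = [sum(y) / len(y)] * len(y)   # start from the mean of y
    mse_history = []
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, prediction)]
        stump = fit_stump(x, residuals)
        prediction = [pi + learning_rate * stump(xi)
                      for xi, pi in zip(x, prediction)]
        mse = sum((yi - pi) ** 2 for yi, pi in zip(y, prediction)) / len(y)
        mse_history.append(mse)
    return mse_history


x = [float(i) for i in range(10)]   # toy feature
y = [xi ** 2 for xi in x]           # toy nonlinear target
fast = boost(x, y, n_trees=20, learning_rate=1.0)
slow = boost(x, y, n_trees=20, learning_rate=0.1)
print("Training MSE after 20 trees, learning_rate=1.0:", fast[-1])
print("Training MSE after 20 trees, learning_rate=0.1:", slow[-1])
```

Comparing the two printed values shows how the step size changes the training error reached with a fixed number of trees; how well either setting generalizes would have to be judged on held-out data.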

AdaBoost Regression (ABR) model: The ABR-tuned model utilizes an AdaBoost regressor, a powerful meta-estimator. This method starts by fitting a base regressor on the initial dataset and subsequently fits additional copies of the regressor on the same dataset, adjusting the weights of instances based on the errors of current predictions. This iterative process enhances the model's focus on difficult-to-predict instances. For this particular application, the ABR model underwent fine-tuning of its hyperparameters through Grid Search CV. The tuning optimized several key parameters: the base estimator was configured as a decision tree, the learning rate was set at 0.5, the loss function was designated as linear, and the model was built with 40 estimators. These adjustments were specifically tailored to enhance the predictive accuracy and efficiency of the model in handling complex regression tasks [35].
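A grid search of this kind can be sketched with scikit-learn as follows. The data here are synthetic placeholders generated only to make the example runnable, and the grid values, tree depth (`max_depth=3`), fold count, scoring choice, and random seeds are assumptions; only the reported optimum (decision-tree base estimator, learning rate 0.5, linear loss, 40 estimators) comes from the text.

```python
# Hedged sketch: tuning an AdaBoost regressor with GridSearchCV.
import numpy as np
from sklearn.ensemble import AdaBoostRegressor
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeRegressor

# Synthetic placeholder data (NOT the study's measurements).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(60, 3))
y = 10.0 * X[:, 0] + 5.0 * X[:, 1] ** 2 + rng.normal(0.0, 0.1, size=60)

# Candidate values; the grid actually searched in the study is not reported.
param_grid = {
    "n_estimators": [20, 40],
    "learning_rate": [0.5, 1.0],
    "loss": ["linear", "square"],
}

# Decision tree as the base regressor, passed positionally.
model = AdaBoostRegressor(DecisionTreeRegressor(max_depth=3), random_state=0)
search = GridSearchCV(model, param_grid, cv=3,
                      scoring="neg_root_mean_squared_error")
search.fit(X, y)

print("Best parameters on the placeholder data:", search.best_params_)
```

The parameters selected on placeholder data carry no meaning for the concrete dataset; the sketch only shows how the reported settings map onto the estimator's arguments.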

XGBoost Regression (XGBR) model: The XGBR, short for Extreme Gradient Boosting, is a high-performance machine learning library that enhances the gradient-boosted decision tree algorithm through scalability and parallel processing. Known for its efficient implementation, XGBR significantly speeds up the training process of decision trees by utilizing parallel tree boosting [36]. Under the scope of the current work, the XGBR model was meticulously optimized using Grid Search CV to fine-tune its hyperparameters for optimal performance on the specific dataset. The best parameters identified were as follows: the booster type was set to gradient boost tree; gamma was fine-tuned to 0.001; the importance type used was “gain”; no GPU was used (gpu_id=-1); the learning rate was adjusted to 0.1; the maximum depth of trees was limited to 2 to prevent over-fitting; the minimum child weight was set at 1; it used 500 estimators; it was configured to run on a single thread (n_jobs=0); only one tree was computed in parallel (num_parallel_tree=1); the model's randomness was controlled by random_state=0; regularization on the weights of features was minimal with reg_alpha=0 and reg_lambda=1; the balance of positive and negative weights was neutral (scale_pos_weight=1); all training data were used in each tree (subsample=1); the tree method was set to “exact” to find the best split; and parameter validation was enabled (validate_parameters=1).

The Ensemble Voting Regressor (VR) model: This model employs a robust technique known as ensemble learning, which enhances prediction accuracy by combining outputs from multiple machine learning models. A key strategy within this approach is the Voting Regressor, which operates by aggregating the predictions from several regression models, using either simple or weighted averaging. This method effectively capitalizes on the wisdom of the crowd, where the collective predictions are averaged to enhance the accuracy and stability of the final result [37].

Hence, for this work, the ensemble integrated the predictive capabilities of all four models, i.e., MLR, GBR, ABR (tuned), and XGBR (tuned). The weights assigned to each model in the voting mechanism were calibrated based on their predictive performance: 0.40 for the MLR model, 0.10 for both the GBR and XGBR models, and 0.80 for the ABR model. This weighted averaging approach is expected to balance the individual strengths of each model, leading to a collective prediction capability that outperforms any single model in the ensemble. This strategy is particularly effective in reducing variance and bias, thereby improving the robustness of predictive outcomes in complex datasets.
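The stated weights sum to 1.40 rather than 1. Assuming the ensemble forms a weighted arithmetic mean that is normalized by the sum of the weights (as scikit-learn's VotingRegressor does), they act as relative weights. The sketch below shows the resulting effective shares; the per-model predictions in `example` are hypothetical numbers used only to demonstrate the arithmetic.

```python
# Hedged sketch: normalized weighted voting with the weights given in the text.
WEIGHTS = {"MLR": 0.40, "GBR": 0.10, "ABR": 0.80, "XGBR": 0.10}


def weighted_vote(predictions, weights=WEIGHTS):
    """Weighted mean of per-model predictions, normalized by total weight."""
    total = sum(weights.values())
    return sum(weights[name] * predictions[name] for name in weights) / total


# Effective share of each model once the weights are normalized.
shares = {name: w / sum(WEIGHTS.values()) for name, w in WEIGHTS.items()}
for name, share in shares.items():
    print(f"{name}: {share:.3f}")

# Hypothetical CS predictions in MPa for one specimen (illustration only).
example = {"MLR": 18.0, "GBR": 20.0, "ABR": 21.0, "XGBR": 19.0}
print("Ensemble prediction:", round(weighted_vote(example), 3), "MPa")
```

Under this assumption, the ABR model contributes about 57% of the ensemble output, MLR about 29%, and GBR and XGBR about 7% each.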

Criteria for Analyzing ML Model Performance

In the present study, six input parameters (namely, GGBS, AWA, AAS, NCA, RCA, and WFS) and a single output parameter (i.e., CS) are considered. The details of the mixes are presented in Table 2. The dataset was then processed in three steps. First, the data were verified and cleansed by checking for and addressing any missing or null values. Second, the data were transformed to bring all input features onto a uniform scale; this was achieved by applying a standard scaler (StandardScaler), which standardized the data and enhanced model performance. Finally, to facilitate the application of the machine learning models, the dataset was split into training and testing subsets. This segmentation allowed the models to be trained on one portion of the data while their accuracy and generalizability were validated on the other, ensuring the robustness and reliability of the predictive ML analysis.
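The three preprocessing steps can be sketched with the Python standard library alone. This is not the authors' pipeline: the GGBS values are illustrative, and the 80/20 split ratio, the random seed, and the function names are assumptions, since the text does not report them.

```python
import random
from statistics import mean, pstdev

def standardize(column):
    """Zero-mean, unit-variance scaling of one feature column."""
    mu, sigma = mean(column), pstdev(column)
    if sigma == 0:
        # A constant column (such as AAS) carries no information; return zeros.
        return [0.0 for _ in column]
    return [(x - mu) / sigma for x in column]

def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle the rows and split them; ratio and seed are assumed values."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Illustrative GGBS contents (kg); not the actual dataset.
ggbs = [290.0, 275.5, 261.0, 246.5, 232.0]

# Step 1: check for missing values.
assert all(value is not None for value in ggbs)

# Step 2: standardize the feature.
z = standardize(ggbs)

# Step 3: split the indices of 156 samples into training and testing subsets.
train, test = train_test_split(list(range(156)))

print([round(v, 3) for v in z])
print(len(train), len(test))
```

In practice the scaling statistics are usually computed on the training subset only and then applied to the test subset, so that no information from the test data leaks into training.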

In validating the ML models developed to predict the CS, several key performance metrics are used to assess accuracy and reliability. The Root Mean Squared Error (RMSE) quantifies the typical magnitude of the prediction errors in the units of the target variable, which is critical in evaluating the precision of the predictions. The Mean Absolute Error (MAE) measures the average magnitude of the errors without considering their direction, offering a straightforward representation of the typical deviation from the actual values. The Mean Squared Error (MSE) is the average of the squared errors. By squaring the errors before averaging, MSE gives greater weight to larger errors, making it a useful tool for identifying models with occasional but significant deviations. Finally, the R2 score, or coefficient of determination, indicates the proportion of the variance in the dependent variable that is predictable from the independent variables. This score describes the goodness of fit of the model: a higher R2 value generally indicates a model that explains a larger proportion of the variance and therefore represents the real-world data more accurately. The corresponding formulas are presented in equations (i), (ii), (iii), and (iv) for RMSE, MSE, MAE, and R2 score, respectively. In these equations, EV_i and PV_i are the measured (i.e., experimental) and predicted values of the target variable (i.e., CS), respectively, n is the number of data points, and EV_mean and PV_mean are the mean experimental CS value and the mean predicted CS value, respectively [35].
$$RMSE = \sqrt{\frac{\sum_{i=1}^{n} (PV_i - EV_i)^2}{n}} \qquad \text{(i)}$$
$$MSE = \frac{1}{n} \sum_{i=1}^{n} (EV_i - PV_i)^2 \qquad \text{(ii)}$$
$$MAE = \frac{1}{n} \sum_{i=1}^{n} \left| EV_i - PV_i \right| \qquad \text{(iii)}$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n} (EV_i - PV_i)^2}{\sum_{i=1}^{n} (EV_i - PV_{mean})^2} \qquad \text{(iv)}$$
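As a minimal sketch, the four metrics can be computed directly from paired measured and predicted values. The sample values are hypothetical. One assumption should be noted: the R2 function uses the conventional definition, with the mean of the measured values in the denominator, which may differ from the form printed in eq. (iv).

```python
import math

def mse(ev, pv):
    """Mean squared error between measured (ev) and predicted (pv) values."""
    return sum((e - p) ** 2 for e, p in zip(ev, pv)) / len(ev)

def rmse(ev, pv):
    """Root mean squared error."""
    return math.sqrt(mse(ev, pv))

def mae(ev, pv):
    """Mean absolute error."""
    return sum(abs(e - p) for e, p in zip(ev, pv)) / len(ev)

def r2_score(ev, pv):
    """Coefficient of determination (conventional form, mean of measured values)."""
    ev_mean = sum(ev) / len(ev)
    ss_res = sum((e - p) ** 2 for e, p in zip(ev, pv))
    ss_tot = sum((e - ev_mean) ** 2 for e in ev)
    return 1 - ss_res / ss_tot

# Hypothetical measured and predicted CS values (MPa), for illustration only.
ev = [20.0, 25.0, 30.0, 35.0]
pv = [21.0, 24.0, 31.0, 34.0]

print(rmse(ev, pv), mse(ev, pv), mae(ev, pv), round(r2_score(ev, pv), 3))
```

Because every prediction in this toy example is off by exactly 1 MPa, RMSE, MSE, and MAE all equal 1, while R2 stays close to 1 because the errors are small relative to the spread of the measured values.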

Lastly, the coefficient of variation (CV) is another statistical metric used in validating the developed ML models. CV is the ratio of the standard deviation to the mean, expressed as a percentage. It is important for assessing the relative variability of the model predictions, irrespective of the units of measurement. Generally, a lower CV indicates less dispersion around the mean, signifying a model's consistency across different samples. In compressive strength prediction, where consistency is as critical as accuracy, the CV provides an essential measure of the reliability and stability of the developed ML models. Hence, evaluation based on the CV helps ensure that the predictive ML models are accurate and consistently reliable under diverse conditions, which is fundamental for their realistic deployment in the design and quality control of concrete mixtures [32,38]. The obtained results were verified to ensure that they fall within the accepted error margins. If deviations were observed, the values were calibrated and the steps of the methodology were repeated to achieve the desired accuracy. Figure 6 illustrates the flow of the methodology adopted in the present work.
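The CV definition above (standard deviation divided by the mean, expressed as a percentage) can be sketched as follows. The replicate strengths are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """CV in percent: sample standard deviation divided by the mean."""
    return stdev(values) / mean(values) * 100.0

# Hypothetical replicate cube strengths (MPa) for one mix, for illustration only.
replicates = [26.0, 28.0, 30.0]
cv_percent = coefficient_of_variation(replicates)

print(round(cv_percent, 2))
```

A CV of about 7% for these three replicates would indicate modest scatter. The same function applies unchanged to a list of model predictions.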

Figure 5:

Average compressive strength and hydraulic conductivity of trial pervious GPC mixes.

Figure 6: (a)

Correlation matrix showing the association of each parameter with the other parameters.

Figure 6: (b)

Pearson's correlation coefficients between the parameters.

Results and Discussion
Compressive Strength and Permeability Results of Pervious GPC mixes

This study's exploration of pervious GPC mixes revealed a clear trend in compressive strength (CS) and hydraulic conductivity (i.e., permeability) as a function of material proportions. Replacing up to 10% of the chief binder, GGBS, with SBA led to an approximate 16% increase in compressive strength, highlighting the pozzolanic reactivity of SBA in the matrix. Increasing the SBA content beyond this level, however, resulted in a decline in strength, suggesting an optimal threshold for SBA incorporation. Conversely, the substitution of NCA with C&D aggregates (RCA) markedly reduced compressive strength. At 100% replacement with RCA, the strength decreased by approximately 42%, underscoring the significant impact of aggregate quality on the mechanical properties of concretes [39]. This reduction in strength with increased RCA content is offset by enhanced permeability, indicating a trade-off between structural strength and permeability, which are the major hardened properties of pervious concretes and are central to this investigation. Other fresh and hardened properties are not reported in this study. The results show that mixes with higher strength have lower permeability because they contain fewer voids, in line with earlier reports that permeable concrete mixes with lower permeability reach higher strength [40,41]. As the finer aggregate particles fill the gaps between coarse aggregates of various sizes, the permeable mixes become denser. The increased fine aggregate content also increases the surface area of the aggregates and reduces the average pore diameter. As a result, water ingress is reduced and resistance to flow increases, which ultimately lowers the coefficient of permeability of the mixes [42].

These phenomena are visually summarized in Figure 5, which depicts the inverse relationship between CS and permeability coefficient across the varied mix compositions. The standard deviation values in CS across the mixes suggest a reasonable consistency, with the observed variances reflecting material heterogeneity and the impacts of aggregate types on the mix performances. Furthermore, for all the CS results reported, the observed standard deviations of the individual sample results were within the tolerable variation confines of 15% prescribed as per the standard code of practice for concretes [43]. The quantified data on CS and permeability will serve as the foundation for developing ML models aimed at predicting the performance of these mixes. As the scope of this work is to develop ML models to study the effects on pervious GPCs, the obtained results are presented and discussed in the next section.

Results on ML Modeling of Pervious Geopolymer Concretes

The diversity and breadth of training data are crucial for the robustness of ML models, particularly when developing predictive models for concrete compressive strength. A comprehensive dataset, representative of varied conditions in practical settings, is essential for this purpose [34]. In this study, which explores an under-researched area, data for 156 pervious GPC mix formulations were meticulously collected through controlled laboratory experiments. These mixes were air-cured under standard conditions, and the dataset compiled includes six input variables reflecting the mix components and one output variable, which is the compressive strength measured from 100 mm side cube specimens. The nomenclature and units of these variables are detailed in Table 2.

Statistical Narrative and Correlation Exploration of the Input Data

The CS was ascertained using conventional standard testing methods. Table 3 provides a statistical breakdown of these variables, illustrating the distribution characteristics essential for effective ML modeling. These ML models are tailored to predict the performance of ordinary strength of pervious GPCs, which typically feature compressive strengths ranging from 15.0 MPa to 39.8 MPa. These concretes utilize alkali-activated, GP binders and incorporate both NCA and RCA, with WFS serving as fine aggregate.

Descriptive statistics of the dependent and independent variables.

| Variable | Unit | Count | Mean | Std. dev. | Minimum | 25% | 50% | 75% | Maximum |
|---|---|---|---|---|---|---|---|---|---|
| GGBS | kg | 156 | 268.16 | 21.56 | 232.00 | 246.5 | 275.5 | 290.0 | 290.0 |
| AWA | kg | 156 | 21.85 | 21.56 | 0.0 | 0.00 | 14.5 | 43.5 | 58.0 |
| AAS | kg | 156 | 143.59 | - | 143.58 | 143.6 | 143.6 | 143.6 | 143.58 |
| NCA | kg | 156 | 1230.18 | 601.67 | 0.00 | 935.8 | 940.7 | 1871.7 | 1881.3 |
| RCA | kg | 156 | 610.13 | 569.48 | 0.00 | 0.00 | 881.9 | 886.5 | 1776.1 |
| FA (WFS) | kg | 156 | 199.14 | 0.5321 | 198.30 | 198.6 | 199.3 | 199.7 | 199.7 |
| CS | MPa | 156 | 27.73 | 5.544 | 14.96 | 24.37 | 27.79 | 31.2 | 39.81 |

Figure 8 (a) complements this by showing the relative frequency distributions [35] of the individual input (GGBS, AWA, AAS, NCA, RCA, WFS) and output (CS) parameters, confirming their suitability for regression analysis in ML. The corresponding correlation coefficient matrix is shown in Figure 8 (b). This "Pair Grid" visualization displays the pair-wise relationships between all attributes, allowing a comprehensive assessment of how each variable interacts within the dataset and supporting robust model development. Correlation analysis evaluates the degree of association between variables. Although various correlation coefficients exist, such as Spearman, Kendall, and Pearson, Pearson's correlation coefficient (ρxy) is the most widely used. It quantifies the linear relationship between two variables by dividing their covariance (cov(X, Y)) by the product of their standard deviations (σX, σY), as expressed in Eq. (v). Here, x̄ and ȳ are the means of the variables X and Y, respectively.
$$\rho_{xy} = \frac{\mathrm{cov}(X,Y)}{\sigma_X \sigma_Y} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2}\,\sqrt{\sum (y_i - \bar{y})^2}} \qquad \text{(v)}$$

Figure 7:

Actual vs predicted compressive strength results from ML models.

Figure 8:

Results of RMSE and R2 values of developed ML models.

The Pearson correlation coefficient (ρxy) ranges from -1.0 to +1.0. Values of larger magnitude indicate a stronger linear relationship with the output parameter. A coefficient of -1 indicates a perfect inverse correlation, while a value of 0 suggests that the variables are either unrelated or related nonlinearly, since Pearson's method only detects linear correlations. Thus, a zero value does not imply that the variables are independent, only that there is no linear dependency between the variables Y and X.
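The behavior described above can be checked numerically with a direct implementation of Pearson's coefficient. The data are synthetic and chosen only to show the three cases: a perfectly linear relation, a perfectly inverse relation, and a purely nonlinear (parabolic) relation that Pearson's coefficient does not detect.

```python
import math

def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    numerator = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    denominator = (math.sqrt(sum((a - x_bar) ** 2 for a in x))
                   * math.sqrt(sum((b - y_bar) ** 2 for b in y)))
    return numerator / denominator

# Synthetic data for illustration only.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
linear = [3.0 * v + 1.0 for v in x]   # increases exactly linearly with x
inverse = [-v for v in x]             # decreases exactly linearly with x
parabola = [v ** 2 for v in x]        # depends on x, but not linearly

print(pearson(x, linear), pearson(x, inverse), pearson(x, parabola))
```

The parabola case is the important one: y is fully determined by x, yet the coefficient is zero because the dependence is symmetric rather than linear.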

As portrayed in Figure 6 (a) and (b), the attribute representing the alkali activator solution is held constant across the dataset. Notably, NCA exhibits a strong positive correlation (0.80) with CS, indicative of its contributory role in enhancing mechanical robustness. Conversely, RCA manifests a prominent negative correlation (−0.79) with CS, suggesting a clear detrimental effect on structural integrity when used in higher proportions.

These statistical relationships emphasize the material trade-offs, particularly in sustainable construction paradigms where the use of recycled materials must be balanced against strength performance imperatives. Hence, this approach was considered critical for effectively identifying potential influences and dependencies that could affect the predictive accuracy of the ML models employed in this study [44].

Also, the mix design in Table 2 shows that the input parameters vary in scale. To mitigate the dominance effect arising from these differences in magnitude, normalization is necessary. This data preparation technique brings the values in the dataset to a common scale, enhancing the efficiency of the learning algorithms and facilitating quicker convergence. Feature standardization is pivotal in ML modeling because it equalizes the significance of all features by transforming their values to a uniform range, so that the model attributes equal importance to each feature during learning. As the methodology is well documented in the existing literature [37,38], further detail is omitted here to avoid repeating fundamental concepts. The task is achieved by applying the maximum-minimum normalization technique, which transforms the data using Eq. (vi).
$$X_n = \frac{X - X_{min}}{X_{max} - X_{min}} \qquad \text{(vi)}$$

In this formula, Xn denotes the feature normalized data, with Xmin representing the smallest and Xmax the largest values of the inputs. X corresponds to the individual original data before normalization. This technique benefits the model development process by expediting calculations and enhancing the accuracy and robustness of the predictive model. Table 4 displays the normalized data for the input parameters following feature standardization, presenting the transformed values that ensure comparability across the ML study's variables.
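A minimal sketch of the maximum-minimum normalization formula follows. The GGBS values are the minimum, quartiles, and maximum listed in Table 3, used here only as convenient inputs. The function name and the handling of constant columns are assumptions.

```python
def min_max_normalize(values):
    """Rescale values to the range [0, 1] using their minimum and maximum."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        # A constant feature cannot be rescaled; map it to zero (assumed choice).
        return [0.0 for _ in values]
    return [(x - x_min) / (x_max - x_min) for x in values]

# GGBS summary values (kg) from Table 3: minimum, 25%, 50%, and maximum.
ggbs = [232.0, 246.5, 275.5, 290.0]
scaled = min_max_normalize(ggbs)

print(scaled)
```

With this transform the smallest GGBS content maps to 0 and the largest to 1, so every normalized feature lies between 0 and 1 regardless of its original units.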

Input Data after Feature Standardization.

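| GGBS | AWA | AAS | NCA | RCA | FA |
|---|---|---|---|---|---|
| 1.02 | −1.02 | - | 0.29 | −0.28 | 1.06 |
| 0.32 | −0.32 | - | −0.51 | 0.51 | 0.28 |
| −1.76 | 1.76 | - | 1.06 | −1.07 | −1.66 |
| −0.37 | 0.37 | - | 1.07 | −1.07 | −0.49 |
| −0.372 | 0.37 | - | −0.51 | 0.51 | −0.49 |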
Comparative Exploration of Developed Soft Computing ML Models

Figure 7 portrays a series of scatter plots comparing the experimental CS results for pervious GPC against the values predicted by the various ML models. These plots serve as an illustrative assessment of model performance, showing the degree to which the predicted values align with the experimental outcomes. In each plot, the dashed sloping line represents the line of ideal prediction, where the predicted values exactly match the actual CS values. Hence, the proximity of the data points to this line indicates the ML model's predictive accuracy.

The MLR model shows a respectable congruence with the true values, denoting a solid baseline performance. However, there appear to be deviations, especially as the CS values increase, suggesting linear regression's limitations in capturing complex nonlinear relationships. The scatter plot for the tuned XGBR demonstrates a better alignment with the line of perfect prediction, implying that the tuning process has refined the model to better capture the complexities of the data. The GBR, another ensemble method, similarly displays a commendable predictive performance. The ABR results indicate a slight improvement over the GBR, which could be due to its adaptive learning process, which places more emphasis on the instances that previous models misjudged. Finally, the ensemble VR model, which aggregates predictions from the aforementioned models, exhibits a high degree of accuracy, as seen in the concentration of data points around the line of perfect prediction. This ensemble approach evidently combines the strengths of the individual models, thereby improving robustness and reducing the potential for overfitting. Overall, while each model has merits, the VR emerged as the most promising, encapsulating the predictive power of the other models while mitigating their respective weaknesses.

Furthermore, each model's performance metrics (such as RMSE, MAE, and R2 score) quantitatively complement these visual insights. Lower RMSE and MAE values, alongside an R2 score close to 1, support the visual interpretation of the models' effectiveness in predicting the strength values.

The key ML model statistical parameters obtained after scrutinizing the efficacy of various ML models applied to predict the CS of pervious GPC are presented in Table 5. Each model's performance was meticulously tuned to achieve optimal accuracy, and the results are collectively presented.

Results on Machine Learning Models Applied on Input Data with the Performance Metrics

| Statistical Parameters of ML Models | Multiple Linear Regression | XGBoost Tuned | AdaBoost Tuned | Gradient Boost Regressor | Voting Regressor |
|---|---|---|---|---|---|
| RMSE | 1.64 | 1.63 | 1.59 | 1.64 | 1.52 |
| MAE | 1.28 | 1.30 | 1.26 | 1.30 | 1.21 |
| MSE | 2.70 | 2.70 | 2.51 | 2.70 | 2.32 |
| R2 Value | 0.83 | 0.91 | 0.86 | 0.88 | 0.90 |
| CVmean | −0.14 | −0.74 | −0.91 | −0.79 | −0.11 |

The MLR results revealed a significant predictive capability, with an R2 score of 0.83, suggesting that the model explains 83% of the variance in the measured strength. However, its negative CVmean of -0.14 hints at potential overfitting, which might affect performance on unseen data. The tuned XGBoost model also displayed robustness, with an R2 score of 0.88. Its RMSE, MAE, and MSE were close to those of the MLR, underscoring its consistency. However, its more negative CVmean of -0.74 raises concerns about the model's stability and reliability across different validation folds. The tuned ABR emerged as slightly superior among the individual models in terms of error, with an R2 score of 0.86 and the lowest RMSE and MSE, indicating greater accuracy and consistency in predicting the strength of pervious GPC. Its CVmean of -0.91, the most negative among the models, still suggests room for improving generalization. The GBR tracked closely with the XGBR and MLR models, mirroring their statistical scores but with a slightly less negative CVmean of -0.79. This reflects a balanced performance, with room for improvement in the training and validation phases. Finally, the VR, an ensemble of all the aforementioned models, outperformed the individual predictors by integrating their strengths. This model achieved the most favorable error metrics, with a high R2 score of 0.90 and the lowest error rates (RMSE of 1.52 and MSE of 2.32). Its CVmean of -0.11 is substantially less negative, indicating a robust model with consistent performance across different test scenarios. The comprehensive evaluation emphasizes the effectiveness of ensemble methods, particularly the VR, in refining the prediction accuracy for the CS of pervious GPC. The notable performance of the tuned AdaBoost model justified its significant weight in the ensemble configuration, enhancing the overall model efficacy.
It can be seen that the error between the predicted and measured strength values is minimal.

Figure 8 indicates the major metrics (i.e., RMSE and R2 values) for ML models for both the training and test datasets, which are considered decisive for understanding model efficacy. In examining the MLR model, we find RMSE values of 1.64 and 1.72 for the training and test datasets, respectively, with R2 values of 0.82 and 0.83. While linear regression provides a reasonable baseline for prediction, the RMSE indicates a moderate discrepancy between the predicted and actual values, and the R2 shows that a substantial portion, but not all, of the variance is captured by the model. The GBR improves upon MLR with a lower RMSE of 1.63 and 1.65 and higher R2 values of 0.89 and 0.91 for the training and test datasets, respectively. These values suggest that this model more accurately predicts CS and accounts for a greater degree of variance, likely due to its ability to minimize errors sequentially through multiple decision trees. The ABR, which adapts by focusing on instances that previous iterations mispredicted, shows an RMSE of 1.59 for training data and 1.61 for testing, with R2 values of 0.89 and 0.86.

These statistics indicate a strong model fit in training, although with a slight reduction in the test phase, hinting at potential overfitting issues or the need for further parameter tuning. XGBR, an optimized gradient boosting library, shows an RMSE of 1.64 in training and 1.79 in testing, with R2 of 0.88 and 0.88. The increase in RMSE for the test data suggests that this model may not generalize as well as others, although the consistent R2 indicates a stable prediction of variance across both datasets. Finally, the VR model, an ensemble of the aforementioned models, registers the lowest RMSE of 1.52 and 1.59 for the training and test datasets, respectively, and an R2 of 0.88 and 0.90. The VR's performance indicates that it effectively combines the strengths of the individual models, balancing out their weaknesses and thereby providing more reliable predictions. The consistent improvement in RMSE and R2 across both datasets underscores the robustness of the ensemble approach[45].
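The training and test RMSE values quoted above can be compared side by side to make the generalization argument concrete. The sketch below only rearranges the figures reported in the text; the variable names are illustrative.

```python
# (training RMSE, test RMSE) as reported in the text for each model.
reported_rmse = {
    "MLR": (1.64, 1.72),
    "GBR": (1.63, 1.65),
    "ABR": (1.59, 1.61),
    "XGBR": (1.64, 1.79),
    "VR": (1.52, 1.59),
}

# Gap between test and training error; a larger gap suggests weaker generalization.
gaps = {model: round(test - train, 2)
        for model, (train, test) in reported_rmse.items()}

largest_gap = max(gaps, key=gaps.get)
lowest_test = min(reported_rmse, key=lambda model: reported_rmse[model][1])

for model, gap in gaps.items():
    print(f"{model}: test - train RMSE = {gap:.2f}")
print("largest gap:", largest_gap)
print("lowest test RMSE:", lowest_test)
```

Read this way, XGBR shows the widest gap between training and test error, while the VR model has the lowest test RMSE, which matches the interpretation given in the text.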

To critically evaluate these models, RMSE and R2 must be considered in tandem. RMSE gives a clear indication of the average magnitude of the model's errors, with lower values signifying more accurate predictions. R2 provides insight into the proportion of the variance in the dependent variable that is captured by the model. Together, these metrics illustrate the models' predictive accuracy and their ability to generalize to new, unseen data. Overall, while each model has its merits, the VR model emerged as the most effective, leveraging the collective power of multiple algorithms to enhance predictive accuracy. The analysis reveals that the choice of model can significantly influence the performance and reliability of CS predictions for pervious GPCs.

Comparative Analysis of Predictive Accuracy and Feature Influence in ML Models

Figure 9(a) and Figure 9(b) together show the predictive efficacy of the various ML models and the influence of the input parameters on the CS value. The X-axis in Figure 9 (a), denoting the sample number, provides a sequential view of each model's performance over the testing dataset. The Y-axis represents the error values, reflecting the model's precision at each sample point. The closeness of the predicted values (depicted by the blue line) to the actual values (represented by the dashed red line) signifies a low error margin, reinforcing the robust predictive capability of each ML model.

Figure 9: (a)

Results showing the errors in predicted vs actual values of compressive strength from the testing dataset.

Figure 9: (b)

Results showing the feature score of the ML models for compressive strength.

Comparatively, the VR model demonstrates much closer alignment between predicted and actual values, suggesting an enhanced predictive performance. This result is attributable to the weighted aggregation of predictions from multiple models, which mitigates individual model biases and leverages collective intelligence. Occasional peaks and troughs suggest that while certain samples pose a greater challenge in prediction, the model's overall performance remains consistently high. After fine-tuning, the VR model's accuracy and reliability, as depicted in this plot, mark a promising advancement in soft computing applications within material science, showcasing a method that could be pivotal in future engineering innovations.

Conversely, Figure 9 (b) highlights the feature coefficient scores, revealing the varying degrees of influence that input variables exert on CS. However, the ensemble VR model's feature coefficient scores are not showcased here due to its methodology of amalgamating outputs from various other ML models.

For instance, NCA and RCA appear to have substantial impacts, as demonstrated in their coefficient magnitudes across models. The discrepancy between the influence of NCA and RCA underscores the complexity of incorporating varying aggregate types and the nuanced effects on concrete properties. Hence, through the integration of the insights from both figures, it is apparent that while individual ML models offer valuable predictions, the ensemble approach in VR provides a more robust and accurate predictive performance. This consolidates the premise that in the realm of complex material interactions in GPC formulations, ensemble ML models are paramount in harnessing the predictive power of soft computing techniques. The disparity between the coefficients of features across models further corroborates the necessity of considering multiple models to capture the heterogeneity of influential factors on the compressive strength of the composite under consideration. Overall, these analyses clearly prove that while individual factors can significantly impact the CS, the integration of multiple ML models into an ensemble framework like VR can significantly enhance the accuracy and reliability of predictions for pervious GPCs.

Figure 10 presents a density plot juxtaposing the actual and predicted values of concrete CS derived from the best-performing tuned ensemble VR model. The congruence of the density curves signifies that the predictions closely align with the actual data [38], confirming the VR model's capacity to capture the variance in the dataset effectively. The proximity of the peaks for both actual and predicted values stresses the VR's proficiency in central tendency prediction.

Figure 10:

Results of predicted values and actual values from the ensemble Voting Regressor ML model.

Moreover, the model's robustness is evident from the distribution spread, where both predicted and real values exhibit similar variance, reinforcing the model's credibility. The similarity in the tail lengths of both distributions further illustrates that extreme values, whether high or low, are accurately anticipated by this ML model [35]. Hence, the VR model's capacity to generalize well, indicated by the high degree of similarity between the density plots of predicted and actual values, lays the groundwork for its application in optimizing the mix design for improved pervious GPC performances, thus opening avenues for future developments in material technology and computational modeling in this field.

Hence, this investigation exemplifies how integrating multiple ML techniques can substantially benefit predictive modeling in sustainable construction engineering, offering a robust tool for designing better-performing geopolymer concretes for a sustainable future. These findings illustrate the application of advanced ML methodology to improve the understanding and prediction of material properties in civil engineering research. Overall, the developed ML models satisfy the required conditions for the dependent variable, which shows that they are proficient enough to predict the compressive strength of the pervious geopolymer concrete mixes.

Conclusions and Scopes for Future Research

This study presented a comprehensive investigation into the performance of pervious GPC hybridized with agro-industrial wastes (GGBS, SBA, and WFS) and C&D wastes, employing advanced soft computing techniques for CS prediction. The experimentation involved creating 13 distinct GPC mixes with varying percentages of SBA and RCA and analyzing their effects on the 28-day strength and hydraulic conductivity. These properties were considered vital because they directly relate to the structural integrity and functionality of pervious concretes. The experimental results showed a significant enhancement in compressive strength with up to 10% inclusion of SBA, after which the strength gradually decreased. This finding highlights that SBA strengthens the geopolymer matrix through its pozzolanic activity only up to a certain dosage. Conversely, increasing the proportion of RCA negatively impacted the compressive strength because of the poorer quality of C&D aggregates compared to fresh crushed granite. However, the increased RCA content improved the hydraulic conductivity, a beneficial aspect for permeable concrete applications requiring higher permeability. Furthermore, multiple linear regression, gradient boost, AdaBoost, and XGBoost regressions, together with an ensemble model using a Voting Regressor, effectively modeled the compressive strength of GPC. Among these, the tuned AdaBoost model and the ensemble approach emerged as superior, providing robust predictions with lower error rates and demonstrating the effectiveness of combining multiple predictive models to enhance prediction accuracy. The present investigation confirms that leveraging advancements in soft computing models can significantly contribute to the sustainable development of construction materials, aligning with global sustainability goals by reducing industrial waste and enhancing material properties.

Future research on this topic could further explore the balance between mechanical properties and environmental benefits in GPC by integrating other industrial and agricultural waste products such as copper slag, rice husk ash, and fly ash. There is also an opportunity to refine the ML models by incorporating more comprehensive datasets that include additional environmental and operational variables affecting composite performance. Furthermore, long-term durability studies under various environmental conditions could provide deeper insights into the practical applications and limitations of these materials. Expanding the scope to include fresh concrete properties and other mechanical parameters could offer a more holistic view of the material characteristics. Further studies could also focus on scaling up the production process and evaluating the economic viability of pervious GPC in commercial applications.
