Predictive Modelling of Pavement Quality Fibre-Reinforced Alkali-Activated Nano-Concrete Mixes through Artificial Intelligence
Article category: Original Study
Published online: 24 Mar 2025
Pages: 389–416
Received: 19 Sept 2024
Accepted: 22 Jan 2025
DOI: https://doi.org/10.2478/sgem-2025-0007
© 2025 Akhila Sheshadri et al., published by Sciendo
This work is licensed under the Creative Commons Attribution 4.0 International License.
Alkali-activated concrete (AAC) has emerged as a promising alternative for sustainability in highway and rigid pavement construction. The development of highways and rigid pavements has been a significant focus in India and worldwide. The expansion of the rigid pavement network in Thailand led to the initiation of the Rigid Pavement Maintenance System by the Department of Rural Roads [1]. In India, the Golden Quadrilateral project has been a pivotal initiative, significantly impacting manufacturing activities and contributing to the economic development of the country [2]. The National Highway Development Program in India has also been a major investment in upgrading and rehabilitating the country’s major highways to international standards, reflecting the country’s commitment to infrastructure development [3], [4], [5]. The national highway network in India grew by about 47.85% between 2014 and 2023, as shown in the figure below.

The yearly growth of National Highways in India [15].
Numerous studies have analysed the effects of fibres in ordinary Portland cement (OPC) concrete. However, the incorporation of fibres in pavement quality alkali-activated concrete (PQAC) mixes is a seldom-researched area. Therefore, in order to overcome the brittle nature of PQAC, fibres were incorporated into the concrete mix. Polyvinyl alcohol fibres (PVAF) are noted for their high tensile strength and Young’s modulus, significantly enhancing the tensile load-carrying capacity of concrete. Their hydrophilic nature promotes strong bonding with the cementitious matrix, improving crack resistance and ductility. Polypropylene fibres (PPF), on the other hand, are effective in reducing early shrinkage cracks due to their high elongation capacity and flexibility, bridging micro-cracks and enhancing overall resistance to larger cracks. Both PVAF and PPF exhibit excellent chemical resistance, making them suitable for aggressive environments where traditional fibres, such as steel, may corrode. They are lightweight, easy to incorporate and disperse uniformly throughout the concrete matrix, ensuring consistent performance and minimal disruption to workability. In contrast, steel fibres significantly improve tensile strength and crack resistance but are susceptible to corrosion, particularly in harsh environments, which limits their long-term effectiveness. Glass fibres, while non-corrosive and exhibiting high tensile strength, are brittle and prone to alkali attack unless treated, limiting their effectiveness in high-pH environments. Carbon fibres provide exceptional tensile strength but are often impractical for large-scale applications due to their high cost [14]. Thus, to overcome these drawbacks, PVAF and PPF were adopted in the current research work. Optimising these various additives involves many trial mixes and consequently consumes large quantities of materials, which poses a challenge in the field of building construction.
Therefore, as artificial intelligence advances, a growing array of algorithms and models have emerged, offering fresh approaches that address these issues. Despite the comprehensive research conducted on nano-additives and fibres in OPC concrete, their integration into AAC has been less thoroughly investigated. The study addresses the gap by examining PQAC, which is specifically designed for sustainable pavement applications.
Nguyen and Dinh emphasised the importance of predicting the 28-day compressive strength of concrete from its initial ingredients, which is crucial for ensuring that concrete mixes meet strength requirements [16]. This predictive approach is echoed in the findings of Jamali et al., who applied AI methods to predict the compressive strength of fibre-reinforced polymer-confined concrete, further illustrating the adaptability of AI in diverse concrete scenarios [17]. Moreover, machine learning techniques, including backpropagation neural networks (BPNNs), have been pivotal in predicting concrete strength. Li and Singh employed a BPNN to analyse the strength index of concrete with large recycled aggregates, achieving satisfactory prediction results [18]. This aligns with the findings of Haddad and Qarqaz, who conducted a comparative analysis of artificial neural networks (ANNs) for predicting bond strength in concrete, further validating the effectiveness of AI in structural applications [19]. Dao et al. explored novel hybrid AI techniques for predicting the compressive strength of geopolymer concrete, indicating that combining various AI methodologies can yield superior results compared to traditional methods [20]. This hybridisation approach is echoed in the work of Wang et al., who adapted ANNs for estimating concrete compressive strength, demonstrating the potential for integrating multiple AI techniques to improve predictive performance [21]. Wu et al. successfully predicted the tensile strength of high-performance concrete by combining ANN and Support Vector Regression (SVR) models using optimisation techniques [22]. Researchers have also predicted the tensile strength of concrete using SVR and Gradient Boosting Machine (GBM) models, finding that GBM provided superior prediction performance to SVR [23].
Hammad et al. employed four models, namely the M5P model tree method, gene expression programming, ANN and Random Forest (RF), to forecast the strengths of concrete containing metakaolin. Findings indicated that RF exhibited the most accurate forecasting capability [24]. Nozar et al. investigated the compressive strength of concrete incorporating metakaolin by employing the Multi-Layer Perceptron (MLP) model. The results showed that the MLP network exhibited consistent accuracy in predicting the compressive strength of metakaolin-infused concrete [25]. Additionally, an intuitive programme was created to simplify the utilisation of the suggested MLP network, which is grounded in machine learning techniques. Huang et al. introduced a hybrid machine learning model that combines the firefly method with the RF algorithm to provide accurate forecasts of the compressive strength of cementitious materials when expanding clay is present [26]. Abdulrahman et al.’s research assessed how well different ensemble and individual models predicted the compressive strength of binders made of expanding clays. According to the findings of Bulbul et al. [27], the Decision Tree (DT) AdaBoost model and the enhanced bagging model showed the best prediction performance for the strength of metakaolin concrete. While existing studies have primarily focused on OPC-based systems, this research extends ML applications to PQAC, addressing important gaps in the literature regarding its mix optimisation and mechanical performance. The study evaluates multiple ML algorithms to identify the most reliable model for predicting Split Tensile Strength (STS) in PQAC, ensuring robust and practical results. Nevertheless, there is a lack of research on the use of ML models for predicting the STS of PQAC that incorporates nano-additives and fibres.
While most studies emphasise compressive strength prediction [28], this work addresses tensile properties critical for pavement applications, a relatively underexplored domain. This study employs advanced ensemble ML models, including RF, GBM and AdaBoost, to predict the STS of PQAC. Unlike simpler regression models or standalone ANNs, the ensemble models used here can handle complex, non-linear interactions between input features. This research aims to optimise fibre dosages and nano-additive incorporation to enhance the tensile performance of PQAC. Experimental validation further confirms the relevance of ML predictions in real-world applications, enhancing the credibility of the proposed optimisation framework. Overall, this study fills a significant gap in current research by adapting machine learning to optimise PQAC properties, integrating the effects of fibres (PVAF and PPF) and nano-additives (NS and NA). The research uses a robust evaluation framework, including MAE, MSE, RMSE, R2 and CV metrics, to rigorously assess the predictive accuracy of the machine learning models. It also explores the effect of varying estimator counts in ensemble models, optimising them for practical applications. This comprehensive, data-driven approach aims to develop high-performance PQAC mixes tailored for sustainable construction applications while minimising the need for extensive experimental trials, making it cost-effective and time-efficient. This combination of innovative material integration, a focus on an underexplored system (PQAC) and the use of machine learning for predictive modelling sets the study apart in the field of fibre-reinforced alkali-activated nano-concretes.
The subsequent sections provide a description of all the materials used in the manufacturing of the PQAC mixes, which include ground granulated blast furnace slag (GGBS), river sand fine aggregate (RSFA), natural coarse aggregate (crushed granite) (NCA), liquid sodium silicate (LSS), sodium hydroxide (NaOH), water, NS, NA, PVAF and PPF.
GGBS manufactured by JSW, Thoranagallu, India, was acquired from a local supplier. The chemical and physical properties of the GGBS are taken from the earlier work of Sheshadri et al. [29], of which this study is a continuation, and are in accordance with the relevant BIS and BS standards (IS 12089-1987; BS 15167-1:2006).
NCA was procured from local vendors. Crushed, well-graded, angular, clean granite aggregates up to a maximum aggregate size of 20 mm, in accordance with IRC:44-2017 [32], were employed. RSFA belonging to zone II as per IS:383, 2016 [33] constituted the fine aggregates used in the research work. The properties of the aggregates are taken from the earlier work of Sheshadri et al. [29], of which this study is an extension.
The alkaline activator solution is employed to activate the cementing property of pozzolanic binders in PQAC mixes. In the current study, LSS (Na2SiO3) together with NaOH chips (98% purity) served as the alkaline activator. These chemicals were procured from a local chemical supplier. As per IS:14212, 1995 [34], the LSS consists of 14.70% sodium oxide (Na2O), 32.80% silicon dioxide (SiO2) and 52.50% water (H2O). The ‘activator modulus’ (Ms), defined as the ratio of SiO2 to Na2O, of the LSS was 2.23. The specific gravities of LSS and NaOH were 2.10 and 1.57, respectively. The dosage of NaOH in the LSS was adjusted to generate an Ms value of 1.25 when LSS and NaOH were combined with water; this ratio was maintained throughout the PQAC investigations. To obtain the desired Ms value of 1.25, the alkaline activator solution was prepared at least 24 hours prior to concrete production by combining the NaOH chips with LSS and water. At this point, the water-to-binder (w/b) ratio was 0.20; later, while preparing the PQAC concrete, more water was added to raise the w/b ratio to 0.4.
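The adjustment of the activator modulus can be checked with a short calculation. The sketch below is illustrative (the `activator_modulus` helper is ours, not the paper's); it uses the stated LSS composition (32.80% SiO2, 14.70% Na2O) and the standard oxide conversion 2 NaOH → Na2O + H2O.

```python
# Sketch: estimating the activator modulus (Ms = SiO2/Na2O by mass) of a
# blended LSS + NaOH activator. Compositions are from the paper; the helper
# function itself is illustrative.
M_NA2O = 61.98   # molar mass of Na2O (g/mol)
M_NAOH = 40.00   # molar mass of NaOH (g/mol)

def activator_modulus(lss_mass, naoh_mass,
                      sio2_frac=0.328, na2o_frac=0.147):
    """Ms of an activator made from LSS plus solid NaOH (masses in kg)."""
    sio2 = lss_mass * sio2_frac
    # 2 NaOH -> Na2O + H2O, so each kg of NaOH supplies 61.98/(2*40) kg Na2O
    na2o = lss_mass * na2o_frac + naoh_mass * M_NA2O / (2 * M_NAOH)
    return sio2 / na2o

# Neat LSS alone: Ms = 0.328/0.147, the reported 2.23
print(round(activator_modulus(75.12, 0.0), 2))    # -> 2.23
# With the mix's 11.2 kg NaOH per 75.12 kg LSS (Table 1), Ms falls to ~1.25
print(round(activator_modulus(75.12, 11.2), 2))   # -> 1.25
```

Reassuringly, the mix-table quantities of LSS and NaOH reproduce the target Ms of 1.25 under these assumptions.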
The NS particles used are spherical and nano-sized, supplied as a white, odourless powder with an average particle size of 17 nm and a specific gravity of 2.2. Owing to their large specific surface area and superior fineness, the NS particles enhance the pore structure of the concrete matrix. NS is composed of 99.88% SiO2, 0.06% carbon, 0.009% chloride, 0.005% Al2O3, 0.004% TiO2 and 0.001% Fe2O3.
NA is characterised by a spherical shape and is supplied as a white, odourless, crystalline powder. Its mean particle size is 40 nm with a specific gravity of 3.4. Owing to their superior fineness and specific surface area, the nanoparticles enhance the pore structure of the concrete matrix. Nano-alumina comprises pure Al2O3 in crystalline form. The NA utilised in the experiments consists of spherical particles with 99.9% Al2O3, 0.0012% Fe2O3, 0.015% SiO2 and 0.45% Na2O.
PVAF was procured from the Fibre Region supplier, Chennai, India, with a 12 mm length, 40 μm diameter, 1600 MPa tensile strength, a specific gravity of 1.1, an elastic modulus of 40 GPa and a density of 1.29 g/cm3.
PPF was procured from the Fibre Region supplier, Chennai, India, with a 6 mm length, 20 μm diameter, 500 MPa tensile strength, a specific gravity of 0.9, an elastic modulus of 4 GPa and a density of 0.91 g/cm3.
A total of 19 PQAC mixes were developed. The initial reference mix was developed based on the available literature [35], [36] to meet the desired standards for high-quality concrete pavements at 28 days and a slump value of 25–75 mm. For each mix ID, a total of 30 concrete cylinders were cast for testing; therefore, 570 samples were cast in total. Two nano-additives, NS and NA, were adopted in the mix design to improve the strength of the PQAC. NS was added at intervals of 0.5% from 0% to 2.0% by weight of the binder, and NA was added at intervals of 0.25% from 0% to 1.25% by weight of the binder.
Furthermore, in order to overcome the brittleness of PQAC, fibres were incorporated into the concrete mix. The two fibres individually incorporated in the PQAC mixes were PVAF and PPF. Both fibres were added at regular intervals of 0.4% from 0% to 2.0% by volume of the binder. The details of the mix design and mix IDs are presented in the tables below.
Mix Design of PQAC mixes with nano-additives.
% Addition of nano-additives (by weight of binder) | 0% | 0.5% | 1.0% | 1.5% | 2.0% | 0.5% | 0.75% | 1.0% | 1.25% |
Materials | in kg/m3 | ||||||||
GGBS | 493 | 493 | 493 | 493 | 493 | 493 | 493 | 493 | 493 |
NaOH flakes | 11.2 | 11.2 | 11.2 | 11.2 | 11.2 | 11.2 | 11.2 | 11.2 | 11.2 |
Liquid sodium silicate | 75.12 | 75.12 | 75.12 | 75.12 | 75.12 | 75.12 | 75.12 | 75.12 | 75.12 |
Water | 157.56 | 157.56 | 157.56 | 157.56 | 157.56 | 157.56 | 157.56 | 157.56 | 157.56 |
Natural coarse aggregate | 1071.4 | 1071.4 | 1071.4 | 1071.4 | 1071.4 | 1071.4 | 1071.4 | 1071.4 | 1071.4 |
River sand fine aggregate | 577.84 | 577.84 | 577.84 | 577.84 | 577.84 | 577.84 | 577.84 | 577.84 | 577.84 |
Nano-silica | 0 | 2.47 | 4.93 | 7.39 | 9.86 | - | - | - | - |
Nano-alumina | - | - | - | - | - | 2.47 | 3.69 | 4.93 | 6.16 |
Mix Design of PQAC mixes with fibres.
% Addition of fibres (by volume of binder) | 0.4% | 0.8% | 1.2% | 1.6% | 2.0% | 0.4% | 0.8% | 1.2% | 1.6% | 2.0% |
Materials | in kg/m3 | |||||||||
GGBS | 493 | 493 | 493 | 493 | 493 | 493 | 493 | 493 | 493 | 493 |
NaOH flakes | 11.2 | 11.2 | 11.2 | 11.2 | 11.2 | 11.2 | 11.2 | 11.2 | 11.2 | 11.2 |
Liquid sodium silicate | 75.12 | 75.12 | 75.12 | 75.12 | 75.12 | 75.12 | 75.12 | 75.12 | 75.12 | 75.12 |
Water | 157.56 | 157.56 | 157.56 | 157.56 | 157.56 | 157.56 | 157.56 | 157.56 | 157.56 | 157.56 |
Natural coarse aggregate | 1071.4 | 1071.4 | 1071.4 | 1071.4 | 1071.4 | 1071.4 | 1071.4 | 1071.4 | 1071.4 | 1071.4 |
River sand fine aggregate | 577.84 | 577.84 | 577.84 | 577.84 | 577.84 | 577.84 | 577.84 | 577.84 | 577.84 | 577.84 |
PVAF | 0.748 | 1.49 | 2.24 | 2.99 | 3.74 | - | - | - | - | - |
PPF | - | - | - | - | - | 0.612 | 1.224 | 1.836 | 2.448 | 3.06 |
The quantities of GGBS, RSFA, NCA, NS, NA, PVAF and PPF were measured by weight. The base reference mix was designed to meet the desired standards for high-quality concrete pavements with a water-to-binder (w/b) ratio of 0.4. NS and NA were added in increments of 0.5% and 0.25%, respectively, by weight of the binder. PVAF and PPF were added in increments of 0.4% by volume of the binder. GGBS, RSFA, NCA and the respective percentages of NS/NA were thoroughly mixed in a mechanical mixer for 2 minutes to ensure even distribution of the dry components. The alkaline activator solution was slowly introduced to the dry mix while the mixer operated at low speed, followed by the additional water content. Mixing continued for 3 minutes to achieve homogeneity. PVAF or PPF fibres were gradually added over 1 minute to avoid clumping, and mixing was continued for an additional 2 minutes to ensure uniform fibre dispersion. Cylindrical moulds (100 mm diameter × 200 mm height) were filled in three layers. Each layer was compacted using a vibrating table for 30 seconds to eliminate entrapped air. Specimens were demoulded after 24 hours and immediately transferred to curing conditions. The samples were water cured for 7 days and then air cured for the remaining period up to 28 days before testing.
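The fibre masses in the mix table follow from the "% by volume of binder" dosing rule. The sketch below back-calculates them; the GGBS specific gravity of 2.9 is our assumption (a typical value, not stated in the paper), while the fibre specific gravities (PVAF 1.1, PPF 0.9) are as reported above.

```python
# Sketch: converting a fibre dosage stated as "% of binder volume" into a
# batch mass (kg per m^3 of concrete). GGBS_SG = 2.9 is an assumed typical
# value; it reproduces the tabulated fibre masses.
GGBS_MASS = 493.0      # kg of GGBS per m^3 of concrete (from the mix table)
GGBS_SG = 2.9          # assumed specific gravity of GGBS

def fibre_mass(dosage_pct, fibre_sg, binder_mass=GGBS_MASS, binder_sg=GGBS_SG):
    """Fibre mass (kg/m^3) for a dosage given as % of binder volume."""
    binder_volume = binder_mass / binder_sg             # litres of binder
    return binder_volume * dosage_pct / 100 * fibre_sg  # volume * density

print(round(fibre_mass(0.4, 1.1), 3))   # PVAF at 0.4% -> 0.748 kg/m^3
print(round(fibre_mass(0.4, 0.9), 3))   # PPF  at 0.4% -> 0.612 kg/m^3
```

Under these assumptions the computed values match the tabulated 0.748 kg/m3 (PVAF) and 0.612 kg/m3 (PPF) at 0.4% dosage.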
A flow chart representing the experimental framework is shown in

The detailed flowchart for the experimental programme and analysis.
The compaction factor test was performed as per IS 1199 (Part II) [37] to assess the workability of each PQAC mix; the results are presented in the figure below.

Compaction factor values of PQAC mixes: (a) PQAC+NA and PQAC+NS and (b) PQAC+PVA and PQAC+PPF.
The STS test was performed on 30 cylindrical samples per mix, each 200 mm in height and 100 mm in diameter, as per Indian Standard Code IS 5816-1999 [38]. The obtained results are presented in the figure below.

Split tensile strength and percentage variation in split tensile strength: (a) PQAC+NS, (b) PQAC+NA, (c) PQAC+PVAF and (d) PQAC+PPF.
As the concentration of NA is increased in 0.25% increments from 0% to 1.25%, the compaction factor value (CFV) progressively decreases from 0.923 to 0.807. The workability of PQAC reduces with NA incorporation because the NA particles tend to agglomerate due to their high surface energy. These agglomerates can act as barriers within the mix, inhibiting the movement of other particles, increasing viscosity, hindering flow and making the mixture less fluid for proper compaction and placement [39], [40]. In a similar manner, increasing the concentration of NS from 0% to 2% in 0.5% increments reduces the CFV from 0.923 to 0.806. NS reacts rapidly with the alkaline solution and water, producing a viscous liquid, which is the cause of the decreased workability in PQAC mixes [41], [42]. In addition, the inclusion of fibres decreased the workability of the PQAC mixtures. With the incorporation of PVAF, the CFV reduced from 0.923 to 0.797 as the percentage of PVAF increased from 0% to 2%. The decrease in workability is caused by the hindered movement of the paste due to the inclusion of fibres. The inclusion of PPF likewise decreased workability, with the CFV falling from 0.923 to 0.796 as the proportion of PPF increased from 0% to 2%. Fibres in the concrete matrix hinder the flow of cement paste, leading to a decrease in workability [43].
The STS values for the PQAC mixtures after 28 days of curing are shown in the figure above.
The STS varied as the PPF content increased from 0.4% to 2.0% at regular intervals of 0.4%, taking values of 4.71, 5.23, 5.70, 6.85 and 6.12 MPa. Notable observations are identified based on the analysis of the results. The incorporation of NS improved the STS of the PQAC mixes up to 1.5%, beyond which the strength decreased. A similar pattern is observed in the case of NA, where the STS increased up to an addition of 1.0%; further increments in NA beyond that optimum caused a decrease in the strength of the PQAC mixes. Nano-additives have high surface area-to-volume ratios, allowing them to form strong bonds with the surrounding cementitious matrix [39]. This enhanced interfacial bonding improves the load transfer mechanism within the material, leading to increased stiffness and modulus of elasticity. The nano-additives also act as fillers within the interfacial transition zone (ITZ) between the aggregate particles and the cementitious gel, which improves the bonding between these zones and creates a more continuous and stronger network; this good interfacial adhesion overcomes local failures and provides higher resistance to bending and fracture forces. The presence of evenly distributed nano-particles impedes the spreading of fractures when subjected to tensile stress. By bridging micro-cracks or providing additional nucleation sites for further gel formation around the crack tip, NA can help prevent the growth of cracks and improve the overall tensile strength [51], [52]. Nevertheless, at higher nano-additive contents, if the particles are not well dispersed in the mix, they may agglomerate and bond weakly, resulting in weaker zones and a decrease in strength. Hakamy proposed that the agglomeration of surplus unreacted fine material functions as both stress concentrators and defect sites, weakening the material [45]; therefore, an optimised nano-additive content needs to be considered for incorporation in the mix.
The PVAF and PPF followed the same trend: the STS increased substantially up to a certain percentage addition of fibres (1.6% for both PVAF and PPF), beyond which the STS reduced. Fibres in the PQAC matrix act as discrete reinforcements, effectively bridging cracks and improving the tensile load-bearing capacity of the material. As the fibre content increases, the number of fibres available to intersect and arrest the propagation of cracks also increases, leading to enhanced tensile strength. At lower fibre contents, the fibres are typically well dispersed and distributed within the concrete matrix, maximising their reinforcement effectiveness. However, as the fibre content increases beyond the optimal point, agglomeration and poor dispersion can occur, leading to the formation of weak zones and stress concentration points, ultimately reducing the tensile strength [46]. Furthermore, the workability and compaction of the PQAC mix reduce, making it challenging to achieve a homogeneous and void-free matrix. Improper compaction can result in the formation of voids and defects, which can act as stress concentrators and reduce the overall tensile strength [43].
ML is widely applied across several domains and is now undergoing extensive study to enhance its practicality. The present study uses 570 samples, with 70% (399 samples) randomly designated as the training set and the remaining 30% (171 samples) used as the test set for the trained model. The STS was forecasted using ML techniques, namely Multiple Linear Regression (MLR), DT, RF, AdaBoost, SVR and Gradient Boosting (GB), all implemented with the scikit-learn package in Python.
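The 70:30 partition described above can be sketched with scikit-learn as follows. The feature array stands in for the ten mix-design inputs (GGBS, NaOH, LSS, water, NCA, RSFA, NS, NA, PVAF, PPF); here it is random placeholder data, not the paper's measurements.

```python
# Sketch of the 70/30 train-test split used in the study (placeholder data).
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
X = rng.random((570, 10))                       # 570 samples, 10 mix features
y = 4 + 2 * X[:, 6] + rng.normal(0, 0.3, 570)   # placeholder STS values (MPa)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=0)       # 70% train / 30% test
print(len(X_train), len(X_test))                # -> 399 171
```

With 570 samples, a 30% test fraction reproduces the 399/171 split quoted above.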
A representative dataset is essential for ML models. Hence, this study gathered a total of 570 samples through its experimental work.
Statistical summary of the dataset.
Variable | Count | Mean | Std. dev. | Min | 25% | 50% | 75% | Max |
GGBS (kg/m3) | 570 | 493 | 0 | 493 | 493 | 493 | 493 | 493 |
NaOH flakes (kg/m3) | 570 | 11.2 | 0 | 11.2 | 11.2 | 11.2 | 11.2 | 11.2 |
Liquid sodium silicate (kg/m3) | 570 | 75.12 | 0 | 75.12 | 75.12 | 75.12 | 75.12 | 75.12 |
Water (kg/m3) | 570 | 157.56 | 0 | 157.56 | 157.56 | 157.56 | 157.56 | 157.56 |
Natural coarse aggregate (kg/m3) | 570 | 1071.4 | 0 | 1071.4 | 1071.4 | 1071.4 | 1071.4 | 1071.4 |
River sand fine aggregate (kg/m3) | 570 | 577.84 | 0 | 577.84 | 577.84 | 577.84 | 577.84 | 577.84 |
Nano-silica (kg/m3) | 570 | 1.3 | 2.81 | 0 | 0 | 0 | 0 | 9.86 |
Nano-alumina (kg/m3) | 570 | 0.91 | 1.87 | 0 | 0 | 0 | 0 | 6.16 |
PVAF (kg/m3) | 570 | 0.59 | 1.13 | 0 | 0 | 0 | 0.75 | 3.74 |
PPF (kg/m3) | 570 | 0.48 | 0.92 | 0 | 0 | 0 | 0.61 | 3.06 |
STS (MPa) | 570 | 5.34 | 0.94 | 3.69 | 4.64 | 5.17 | 5.81 | 8.2 |
MLR is an expanded form of the basic linear regression model. MLR models are used to estimate the degree of correlation between the input variables and the response variable [47], [48]. It is an ML approach that predicts a response variable from a variety of input factors. Many studies have employed linear regression or MLR to predict output variables efficiently [49]. The model incorporates many independent factors and one or more dependent variables, which can be either categorical or continuous in nature. It may be used to simulate the effect of the independent factors on the dependent variable, to test hypotheses regarding the coefficients of the independent variables and to generate predictions from the fitted model [50], [51].
The equation for multiple linear regression is represented by Eq. 1:
y = β0 + β1x1 + β2x2 + … + βnxn + ε (1)
where y is the dependent variable, x1, …, xn are the independent variables, β0 is the intercept, β1, …, βn are the regression coefficients and ε is the error term.
In the aforementioned prediction of the STS of PQAC, the STS is considered the dependent variable. The quantities of GGBS, NaOH, LSS, RSFA, NCA, water, NS, NA, PVAF and PPF are considered the independent variables. The fundamental objective of MLR is to assign the coefficient and intercept values over a sequence of iterations in a manner that minimises the error or cost function, ensuring that the predicted values are as close as possible to the actual values. The dataset was divided into training and testing subsets in a 70:30 ratio, assigning 70% of the data to the training group. Thereafter, the coefficients and intercept were calculated, and the fitted model was assessed using the remaining 30% of the data. Nevertheless, if the method does not use appropriate sampling and selection of independent variables, it will not yield dependable findings for end users.
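The MLR workflow just described can be sketched as follows; this is an assumed illustration on placeholder data, not the paper's exact code. The coefficients β1, …, βn and intercept β0 are estimated on the 70% training split and the fit is scored on the held-out 30%.

```python
# Minimal MLR sketch: fit on 70% of the data, evaluate on the other 30%.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error, r2_score

rng = np.random.default_rng(0)
X = rng.random((570, 10))                        # placeholder mix features
y = 5 + X @ rng.normal(size=10) + rng.normal(0, 0.1, 570)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
mlr = LinearRegression().fit(X_tr, y_tr)         # estimates beta_0 .. beta_10
pred = mlr.predict(X_te)
print(round(mean_absolute_error(y_te, pred), 3),
      round(r2_score(y_te, pred), 3))            # MAE and R^2 on the test set
```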
DT is a supervised ML model capable of modelling both classification and regression problems [52]. To generate a DT, the model divides the samples according to their features. It is a rule-based model that predicts the mapped target value by generating logical divisions in the data [53]. DT regression uses the characteristics of an item to train a model within a hierarchical framework, facilitating the estimation of future data and the generation of meaningful continuous outputs. The procedure entails analysing the data via a series of queries, each aimed at refining the prospective values until a dependable model is developed. DT regression partitions a dataset into more compact subgroups. The structure comprises decision nodes that divide into two or more branches, each of which denotes a value of the attribute under investigation. The DT model comprises three types of nodes: the root node, the decision node (or internal node) and the terminal node (or leaf node). The root node corresponds to the most suitable predictor and is located at the topmost level. The algorithm initiates at the root node, encompassing all training data, and subsequently divides the root node into decision nodes. The process persists through the following levels until the tree reaches a specified maximum depth or the nodes contain only a single sample from the training data. The approach aims to reduce a loss function, typically a metric such as the mean squared error (MSE). The model is calibrated to the independent factors in conjunction with the target variable and is partitioned at designated points for each independent variable. The loss function is observed, and the partition with the least error is chosen, perpetuating this procedure iteratively.
However, the DT can become excessively large and overfit the data, leading to reduced generalisation on the test set and potentially impairing prediction accuracy. To construct the DT and mitigate this risk, a threshold is established on the number of splits, or a minimum number of data points is required for each leaf node [54], [55].
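The depth and leaf-size safeguards mentioned above map directly onto scikit-learn parameters. The sketch below is illustrative (the parameter values are hypothetical, not the paper's tuned settings).

```python
# Sketch of a constrained DT regressor: max_depth and min_samples_leaf act
# as the overfitting thresholds described in the text (illustrative values).
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(1)
X = rng.random((570, 10))                        # placeholder mix features
y = 4 + 3 * X[:, 0] ** 2 + rng.normal(0, 0.2, 570)

tree = DecisionTreeRegressor(max_depth=5,        # cap on tree depth
                             min_samples_leaf=10,  # min samples per leaf node
                             random_state=0).fit(X, y)
print(tree.get_depth() <= 5)                     # -> True
```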
The RF algorithm is an ensemble estimation method usable for both regression and classification tasks. An ensemble method is superior to a single-DT methodology since it reduces over-fitting by averaging the results. The RF algorithm employs basic DTs as the fundamental learners [56]. RF is an integrated learning model comprising multiple DTs; the fundamental concept is to enhance prediction accuracy and stability through the construction of many trees. Each DT is built from random samples and random features, which enables RF to mitigate overfitting and exhibit strong robustness. RF offers several benefits: it can employ multiple DTs for prediction, resulting in greater accuracy compared to a single DT; it is capable of managing a substantial number of input features, making it suitable for classification and regression tasks involving high-dimensional data; and, because the trees are developed from random samples and random features, the issue of overfitting is mitigated. RF represents an enhanced iteration of the DT model, employing the ensemble method to integrate several DTs, thereby diminishing the overall model’s variance [57]. Every DT is constructed from the sample by employing a bootstrap (bagging) technique. Using a voting or averaging approach, the outputs of all the DTs are consolidated into a single value for classification and regression tasks, respectively. The bagging methodology iteratively chooses B random samples (with replacement) from a specified training dataset (X) to train the DTs. During the construction process, these DTs are formed independently and do not interact with each other. Once trained, the trees are capable of predicting unknown data. In general, individual DTs have quite high variance; nevertheless, when they are merged to form the RF, the total variance decreases. The added randomisation serves to reduce the correlation among the individual trees in the ensemble.
Selecting the variable for a particular node minimises the error for the target variable [58]. This work implements the RF algorithm using the scikit-learn Python tools [59], [60]. The prediction of this algorithm depends on the RF’s multiple trees, which are more generalised [61]. However, RF is more challenging to interpret and visualise than individual DTs [58], [61].
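The bagging-and-averaging mechanism above can be sketched with scikit-learn's `RandomForestRegressor`; the hyperparameters here are illustrative, not the paper's tuned values, and the data are placeholders.

```python
# Random Forest sketch: bootstrap-sampled trees, random feature subsets per
# split, predictions averaged across the ensemble (illustrative settings).
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(2)
X = rng.random((570, 10))                          # placeholder mix features
y = 4 + 2 * np.sin(3 * X[:, 0]) + rng.normal(0, 0.2, 570)

rf = RandomForestRegressor(n_estimators=200,       # number of bagged trees
                           max_features="sqrt",    # random features per split
                           oob_score=True,         # out-of-bag check
                           random_state=0).fit(X, y)
print(round(rf.oob_score_, 2))                     # out-of-bag R^2
```

The out-of-bag score is a by-product of the bootstrap: each tree is validated on the samples it never saw, giving a built-in generalisation estimate.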
Many real-world datasets exhibit a non-linear connection between input and output properties, which the linear regression technique may not be capable of detecting. The SVR approach is commonly employed for constructing input–output model mappings due to its efficient resolution of non-linear regression problems, and it has shown several effective applications in the domain of civil engineering [62]. SVR is a supervised algorithm from the machine learning family, developed by Vapnik in 1995, and is considered one of the most advanced methods in the field for resolving classification and regression problems [63]. The objective of this supervised learning approach is to maximise the distance between the separating hyperplane and the nearest training point (support vector) of each class in order to obtain the required performance on the training data. When linear separation of the data is unattainable, a mapping to a higher-dimensional space is employed to facilitate the separation. A kernel function is used to perform this mapping, enabling the determination of a non-linear decision boundary in a high-dimensional space. Subsequently, the margin boundary lines are drawn from the hyperplane at a given error-threshold distance, represented by the symbol epsilon (ε); this kind of loss is called ‘ε-insensitive loss’. Data points that fall within the ε-margin incur no loss and do not influence the fitted model. The goal of SVR for regression is to determine a hyperplane that best fits the data within a specified error margin, ensuring minimal deviation from actual values while maintaining generalisation [60], [64]. There are numerous kernels available, including the linear function, the Radial Basis Function and the polynomial function [65]. A linear kernel was employed in this investigation. The SVR algorithm is executed utilising scikit-learn tools [60], [65].
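An SVR with the linear kernel used in the study can be sketched as below; `C` and `epsilon` (the width of the ε-insensitive tube) are illustrative defaults, and the standardisation step is our addition, since SVR is sensitive to feature scales.

```python
# SVR sketch with a linear kernel; epsilon defines the e-insensitive tube
# within which deviations incur no loss (illustrative hyperparameters).
import numpy as np
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(3)
X = rng.random((570, 10))                         # placeholder mix features
y = 4 + X @ np.full(10, 0.5) + rng.normal(0, 0.1, 570)

svr = make_pipeline(StandardScaler(),             # scale features first
                    SVR(kernel="linear", C=1.0, epsilon=0.1)).fit(X, y)
print(round(svr.score(X, y), 2))                  # R^2 on the fitted data
```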
AdaBoost, an ensemble technique, was first proposed by Freund and Schapire. The learning rate and the number of estimators are its primary parameters. AdaBoost operates by combining multiple weak learners into a single robust learner; in general, single-split decision trees, termed 'decision stumps', serve as the weak learners [66]. The fundamental principle of the AdaBoost learning method is to construct a robust classifier with excellent detection performance by combining weak classifiers. It is a learning method that repeatedly performs two computations: feature selection and classifier training. By iteratively refining these computations, the overall classification performance improves as the weak classifiers are integrated into the strengthened ensemble [67]. The AdaBoost algorithm assigns weights to training examples based on their difficulty, meaning that instances that are difficult to classify receive a higher weight than those that are simple to classify; it thereby adapts to the mistakes made by the classifiers in prior rounds [68], [69]. With each iteration, adjusting the weights of the training examples compels the learning algorithm to prioritise instances that were previously misclassified and downplay instances that were previously classified correctly. Put simply, weights are increased for misclassified cases and reduced for correctly classified ones, so subsequent iterations penalise mistakes on those misclassified instances more heavily. Adaptive boosting extends to both regression and classification problems. Transitioning from Linear Regression to AdaBoost enables a wider range of non-linear relationships to be mapped, leading to improved estimates and, consequently, increased accuracy.
Friedman introduced Gradient Boosting (GB) in 1999 as an ensemble method for regression and classification tasks; the Gradient Boosting Regressor (GBR) variant employed here is dedicated to regression. The GBR algorithm employs many base learners to minimise the residual error [70]. GBR uses a gradient descent technique, in which each weak learner iteratively decreases the prediction error according to the chosen learning rate [71]. The base learners are fitted sequentially, each subsequent learner reducing the error made by its predecessor, so that a generalised model results at the end. Within this technique, every iteration fits the base model on a randomly chosen subset of the training set; randomly subsampling the training data can improve the speed and precision of GBR while mitigating the risk of overfitting. The approach operates on the boosting concept, in which several weak learners are combined into a robust learner, with DTs generally employed as the weak learners. First, single-node base trees are built; subsequent trees are constructed based on the errors of the preceding trees. The trees are scaled by the chosen learning rate, guaranteeing that each tree contributes equitably to the total prediction. Each new tree is combined with the previous trees to predict the response, and this procedure is iterated until the maximum number of trees is reached or further additions no longer improve the result [72]. GBR predicts numerical outputs, so the response variable must be numerical.
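The sequential residual-fitting and subsampling steps above can be sketched in scikit-learn as follows; the synthetic data and the parameter values (`n_estimators=100`, `learning_rate=0.1`, `subsample=0.8`) are illustrative assumptions, not the paper's settings:

```python
# Illustrative GBR sketch; data are synthetic stand-ins.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(5)
X = rng.random((200, 4))                      # placeholder input features
y = X @ np.array([2.0, 1.0, 0.5, 1.5]) + rng.normal(0, 0.1, 200)

# Each of the 100 trees fits the residuals of the current ensemble,
# scaled by learning_rate; subsample < 1.0 trains each tree on a
# random 80% of the data (stochastic gradient boosting).
gbr = GradientBoostingRegressor(
    n_estimators=100,
    learning_rate=0.1,
    subsample=0.8,
    random_state=0,
)
gbr.fit(X, y)
print(round(gbr.score(X, y), 3))              # R2 on training data
```

Lowering the learning rate while raising `n_estimators` usually trades training time for better generalisation.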
The compiled database was examined to reveal its statistical attributes, as shown in

Pair plots for the input parameters and observed responses.
The correlation matrix is a square matrix that displays the pairwise correlation coefficients between the variables of a dataset [73], [74], [75]. Every value in the matrix corresponds to the correlation between two variables, while the diagonal components indicate the correlation of each variable with itself, which is 1 [76]. Correlation analysis is a valuable technique for examining the connections between variables in a dataset and detecting patterns or trends. It can also be employed to detect multicollinearity, a problem that arises in statistical models when two or more predictor variables are highly correlated with each other.
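Such a correlation matrix can be computed directly with pandas; in this sketch the column names `PVAF`, `PPF` and `STS` mirror variables from the study, but the values are synthetic stand-ins:

```python
# Illustrative correlation-matrix sketch; data are synthetic stand-ins.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
df = pd.DataFrame({
    "PVAF": rng.random(50),                   # placeholder fibre dosages
    "PPF": rng.random(50),
})
# STS constructed to depend on both fibre contents plus noise.
df["STS"] = 2.0 * df["PVAF"] + 1.5 * df["PPF"] + rng.normal(0, 0.1, 50)

corr = df.corr()                              # pairwise Pearson coefficients
print(corr.round(2))                          # diagonal entries are all 1.0
```

Plotting `corr` with a heatmap (e.g. `seaborn.heatmap`) yields the kind of figure shown below.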

Correlation heatmap of input parameters and observed responses.

Relative frequency distribution of the prediction-to-test STS ratio.

The violin plot illustrating the relative error percentages of different models.
Mean absolute error (MAE), mean squared error (MSE), root mean squared error (RMSE), R squared (R2) and cross-validation (CV) mean were calculated for all the models and are presented in
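These error metrics are available in scikit-learn; the small sketch below computes them on hypothetical actual/predicted STS values (the numbers are illustrative only):

```python
# Computing MAE, MSE, RMSE and R2 with scikit-learn.
# The actual/predicted values below are illustrative placeholders.
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

y_true = np.array([3.2, 4.1, 5.0, 4.7, 3.9])  # hypothetical observed STS (MPa)
y_pred = np.array([3.0, 4.3, 4.8, 4.9, 4.0])  # hypothetical model predictions

mae = mean_absolute_error(y_true, y_pred)     # mean |error|
mse = mean_squared_error(y_true, y_pred)      # mean squared error
rmse = np.sqrt(mse)                           # same units as the target
r2 = r2_score(y_true, y_pred)                 # fraction of variance explained

print(round(mae, 3), round(mse, 3), round(rmse, 3), round(r2, 3))
# → 0.18 0.034 0.184 0.914
```

The CV mean reported alongside these metrics is simply the average score over the cross-validation folds (e.g. via `sklearn.model_selection.cross_val_score`).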
The variations between the projected outcomes and the observed outcomes of each sample for various models are depicted in

Comparative analysis between actual and predicted values: a) MLR, b)DT, c) RF, d) SVR, e) AdaBoost and f) GBR.
Statistical parameters of the prediction-to-test STS ratio for each model.
Linear Regression | 1.003337 | 0.99848 | 0.108407 | 0.154706 |
Decision Tree | 1.002638 | 0.98704 | 0.0855 | 0.399816 |
Random Forest | 1.002764 | 0.98737 | 0.085843 | 0.40379 |
Support Vector | 1.004801 | 0.99847 | 0.109484 | 0.076785 |
ADA Boost | 0.999269 | 0.98854 | 0.084598 | 0.367988 |
Gradient Boosting | 1.004576 | 0.99069 | 0.08731 | 0.416545 |
Performance results of the predictive models.

| Model | RMSE | MAE | MSE | R² | CV mean |
| --- | --- | --- | --- | --- | --- |
| Linear Regression | 0.629266 | 0.488546 | 0.395976 | 0.608906 | 0.445094 |
| Decision Tree | 0.452297 | 0.382320 | 0.204573 | 0.797950 | 0.713217 |
| Random Forest | 0.453657 | 0.383848 | 0.205805 | 0.796733 | 0.707697 |
| Support Vector | 0.636715 | 0.492057 | 0.405406 | 0.599593 | 0.454470 |
| ADA Boost | 0.454821 | 0.384150 | 0.206862 | 0.795688 | 0.714149 |
| Gradient Boosting | 0.459522 | 0.390104 | 0.211160 | 0.791443 | 0.711438 |

Correlation between expected and experimental values of STS for PQAC.
The ML metamodels are first calibrated to optimise the primary hyperparameters. This is accomplished by varying the number of estimators and retraining the ML metamodels.
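A minimal sketch of this calibration step is shown below for RF, sweeping the number of estimators and scoring each setting by the cross-validation mean; the candidate values and the synthetic data are illustrative assumptions:

```python
# Illustrative sweep over n_estimators for a Random Forest,
# scored by 5-fold cross-validated R2. Data are synthetic stand-ins.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(7)
X = rng.random((150, 5))                      # placeholder input features
y = X @ np.array([2.0, 1.0, 0.5, 1.5, 1.0]) + rng.normal(0, 0.1, 150)

scores = {}
for n in (50, 100, 200, 300, 400):
    model = RandomForestRegressor(n_estimators=n, random_state=0)
    scores[n] = cross_val_score(model, X, y, cv=5, scoring="r2").mean()

best = max(scores, key=scores.get)            # estimator count with best CV mean
print(best, round(scores[best], 3))
```

The same loop applies to AdaBoost and GBR by swapping the regressor; plotting `scores` against `n` reproduces the kind of sensitivity curves shown in the figures below.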

Effect of the number of estimators on RF’s performance in terms of (a) MAE, (b) MSE, (c) RMSE, (d) R2 and (e) cross-validation mean.

Effect of the number of estimators on AdaBoost’s performance in terms of (a) MAE, (b) MSE, (c) RMSE, (d) R2 and (e) cross-validation mean.

Effect of the number of estimators on GBR’s performance in terms of (a) MAE, (b) MSE, (c) RMSE, (d) R2 and (e) cross-validation mean.
Scalability is a critical consideration for the practical implementation of any new material in construction, particularly in the context of widespread use in infrastructure projects. The proposed PQAC mix leverages the advantages of nano-materials and fibres, which are increasingly available, although they may come at a higher cost compared with traditional additives. Notably, while PVAF is effective, PPF is more cost-effective and widely accessible, making it a promising choice for large-scale projects. The established manufacturing and supply chains for PPF facilitate its integration into large batching operations. One of the primary challenges in scaling this mix is achieving a uniform dispersion of nano-materials and fibres during the mixing process. To address this, advanced mixing techniques and chemical dispersants are required to prevent agglomeration, ensuring consistent quality throughout the batch. The use of industrial mixing equipment with high shear capabilities can effectively tackle these challenges, promoting uniformity and reliability in the final product. The mix design closely aligns with traditional concrete practices, which aids its integration into existing construction workflows. Minor adjustments to current processes can facilitate the adoption of this innovative mix, making it more feasible for large-scale applications. However, it is important to note that the incorporation of nano-materials and fibres can reduce workability, as evidenced by lower CFV, such as 0.807 for nano-additive at 1.25% and 0.796 for PPF at 2.0%. This reduction in workability may pose challenges for placement and compaction, particularly in thick pavement layers. Therefore, careful consideration of placement techniques and potential adjustments to the mix design may be necessary to ensure optimal performance in large-scale applications.
From a performance perspective, the significant improvements in mechanical properties, such as a split tensile strength increase of up to 7.5 MPa with PVA fibres, indicate that this mix is well-suited for high-stress pavement applications. In addition to performance benefits, the environmental advantages of this mix are noteworthy. The replacement of OPC with alkali-activated materials can lead to a reduction in carbon emissions by 40–80%. This aligns with sustainable construction practices and makes the mix particularly appealing for environmentally conscious infrastructure projects, which are increasingly prioritised by government and industry initiatives. Moreover, the use of predictive modelling techniques, such as DT, RF, GB, MLR, SVR and AdaBoost, allows for efficient optimisation of the mix. These models can reduce the need for extensive trial-and-error testing at scale by simulating and predicting performance across various site conditions and material availabilities. This capability not only enhances scalability but also expedites the deployment of optimised mixes tailored to diverse project requirements. In conclusion, the proposed PQAC mix demonstrates significant potential for scalability in large-scale pavement applications, particularly for high-performance and sustainable infrastructure. However, practical challenges related to material costs, workability and the need for advanced mixing and quality control must be addressed to fully realise this potential. Continued innovations in predictive modelling will further enhance the scalability and cost-effectiveness of this approach, paving the way for its widespread adoption in the construction industry.
This study presents a method for predicting the STS of concrete with nano-additives and fibres through individual and ensemble learning models. These ML models exhibit strong efficacy in representing the intricate non-linear correlations between the input and output parameters when predicting the STS of PQAC. The present study enhances engineers' understanding of the optimal selection of input variables and regressors for executing ML models that forecast outputs with high accuracy, minimising reliance on extensive experimental testing.
By replacing OPC with alkali-activated materials, this research supports the global push for greener construction practices, reducing the carbon footprint of rigid pavements.
The adoption of predictive modelling reduces material wastage and optimises resource utilisation, aligning with sustainable engineering goals.
The enhanced tensile strength and crack resistance of the proposed fibre-reinforced nano-concrete mix contribute to longer-lasting and more resilient pavement structures.
This innovation addresses the inherent brittleness of conventional AAC, making it more suitable for high-stress applications such as highways and heavy-duty pavements.
The use of machine learning in mix optimisation introduces a paradigm shift, allowing engineers to predict performance under diverse conditions. This capability could lead to the development of smart construction materials and automated quality control systems.
Based on the error metrics and the correlation coefficient between expected and actual values, the DT model demonstrates superior predictive ability and is endorsed as an effective approach for STS prediction.
The correlation plot shows that PVAF and PPF are highly correlated with the STS.
Comparing the six models makes it clear that DT, RF and AdaBoost forecast strength accurately, indicating their suitability for the redesign of PQAC.
The DT model exhibited superior performance compared to the other machine learning models, while the RF model was shown to yield the second-best performance. Furthermore, the AdaBoost, GB and MLR exhibited commendable performance. The SVR model had the lowest level of prediction accuracy.
The statistical indicators show that DT regression models outperformed AdaBoost, RF, MLR, SVR and GB approaches in terms of minimising the error difference between the targeted and projected values, with reduced errors.
The low values of the DT, RF and AdaBoost models for MAE (0.382, 0.384 and 0.384), MSE (0.205, 0.206 and 0.207) and RMSE (0.452, 0.454 and 0.455) verify that these machine learning models are more accurate than the others.
The R2 for the forecasted strength of MLR, DT, RF, SVR, AdaBoost and GB is observed to be 0.609, 0.798, 0.797, 0.600, 0.796 and 0.791, respectively.
The effect of a number of estimators was studied on three different ML models, and the optimum number of estimators was found to be around 300–400 for RF, around 50 for AdaBoost and in the range of 50–100 for GB.
Overall, artificial intelligence (AI) has significantly transformed predictive modelling in civil engineering, improving precision, simplifying construction procedures and potentially replacing or augmenting conventional experimental approaches. The mix design can be extended to other infrastructure elements, such as bridge decks, airport runways and industrial flooring, where tensile strength and durability are critical. Its adaptability to local materials and site-specific conditions makes it viable for diverse geographies and construction scenarios. ML algorithms such as AdaBoost, DT and RF are effective for analysing large datasets, making them useful for forecasting actual strength characteristics under settings of varying complexity. Ensemble learning methods, like those used in this study, can continually mature and adjust to fresh data, ensuring improved accuracy over time. Integrating explainable AI techniques such as Shapley Additive Explanations (SHAP) can provide a deeper understanding of how the input factors influence the predicted result.
The results show that the ensemble ML models produced more realistic predictions than the other algorithms and can be used effectively on construction sites to reduce laboratory usage and save time and costs. With further optimisation, the efficacy of these algorithms could be enhanced to yield more accurate predictions. The accuracy of the models underscores their significance in civil engineering applications, especially in forecasting concrete strength, since the experimental method is time consuming and labour intensive.
Furthermore, reliable and high-quality experimental data are crucial for the model's real-world performance. Future research should focus on augmenting the dataset with additional examples and using SHAP analysis to further examine the influence of the input parameters on the output.