
Application of gene expression programming, artificial neural network and multilinear regression in predicting hydrochar physicochemical properties


Globally, the provision of energy is becoming an absolute necessity. Biomass resources are abundant and have been described as a potential alternative source of energy. However, it is important to assess the fuel characteristics of the various available biomass sources. In this study, three soft computing techniques are used to predict the mass yield (MY), energy yield (EY), and higher heating value (HHV) of hydrothermally carbonized biomass: Gene Expression Programming (GEP), a multiple-input single output-artificial neural network (MISO-ANN), and Multilinear regression (MLR). The three techniques were compared using the coefficient of determination (R2), mean absolute error (MAE), and mean bias error (MBE). The MISO-ANN with 5-10-10-1 and 5-15-15-1 network architectures provided the most satisfactory performance of the three proposed models (R2 = 0.976, 0.955, 0.996; MAE = 2.24, 2.11, 0.93; MBE = 0.16, 0.37, 0.12) for MY, EY, and HHV, respectively. The GEP technique's ability to predict hydrochar properties from the input parameters was found to be satisfactory, while MLR provided an unsatisfactory predictive model. A sensitivity analysis revealed that volatile matter (VM) and temperature (Temp) have the greatest influence on MY, EY, and HHV.


The increasing energy demand has led to the need to find alternative energy sources that are affordable, widely available, and environmentally friendly. Biomass is a biological and sustainable material originating from plants and animals, along with their waste and residues (Krylova and Zaitchenko 2018). Biomass is the most available renewable energy source, contributing about 50% of the total global renewable energy as of 2018, providing energy to billions of people, and stimulating economic growth (Pradhan et al. 2018). The studies by (Tekin et al. 2014) and (Rousset et al. 2012) reported that biomass is a potential alternative renewable energy source for power generation because of its low emissions and its low ash and total sulphur contents. (Saba et al. 2017) and (Perlack et al. 2011) reported that the greenhouse gas emission status of biomass is zero to net negative, as carbon dioxide is absorbed by plants during photosynthesis.

The most generally used thermochemical pre-treatment techniques include pyrolysis, gasification, torrefaction, and hydrothermal carbonization (HTC) (Wang et al. 2018; Kambo and Dutta 2015). The studies by (Kubacki et al. 2012) and (Makwarela et al. 2017) stated that the co-combustion of biomass and coal allowed coal to ignite and burnout at lower temperatures because of the interactions with the early combustion of biomass volatile matter. The study concluded that the emission reductions reported were due to an improved reaction between coal and biomass volatiles in a hot oxidizing atmosphere. A number of studies have been carried out on the pre-treatment of biomass by various researchers (Safarian et al. 2019; Zhang and Pang 2019; Kambo and Dutta 2015). The type of feedstock and the preferred end product determines the type of pre-treatment method to be used (Kambo and Dutta 2015).

The hydrochar utilized in the study was produced using the HTC method, which is generally considered to be a more effective technique (Danso-Boateng 2015). Biomass for HTC treatment does not require drying before treatment and thus uses less energy. In fact, unlike the conventional biological treatment technique, the presence of toxic compounds in the biomass does not affect HTC. HTC treatment typically takes place at relatively low temperatures (180–260 °C) and under internally generated pressure from an enclosed reactor, which decreases the oxygen and hydrogen content of the starting material by dehydration and decarboxylation (Libra et al. 2011). The HTC treatment converts the wet biomass into a hydrochar, a solid substance with improved carbon content. Hydrochar has a heating value higher than the feedstock and a chemical structure similar to that of coal (Mumme et al. 2011). The process is controlled by process parameters such as temperature and residence time, which define the intensity of biomass treatment (Wiedner et al. 2013; Xu et al. 2013).

Temperature has a significant influence on the HTC process. It is the key determinant of the water properties, which leads to ionic reactions in the subcritical region. A rise in temperature alters the viscosity of the water, making it easier for the water to penetrate the pores of the material and thus further degrade the biomass (Funke and Ziegler 2010). With an increase in temperature, the disintegration of solid residues increases, which in turn increases the conversion of solids to gaseous products. In most of the studies reviewed (Wang et al. 2018; Kim et al. 2014; Parshetti et al. 2013; Hwang et al. 2012; Sevilla and Fuertes 2009), an increase in temperature has been reported to result in lower mass yields and a higher HHV of the hydrochars, making them suitable for power generation. A study conducted by (Wang et al. 2018) also reported on the significance of residence time for the severity of the HTC process. The residence time at a given temperature influences the degree of decomposition of the feedstock, but it has less impact on hydrochar mass yield than temperature.

The investigation conducted by (Zhu et al. 2018) at different temperatures and residence times showed that these parameters influence the properties of cornstalk hydrochar. The authors reported that the mass yield decreased from 70.57 to 33.40% with an increase in temperature, while the residence time had a smaller influence on the mass yield than the temperature. The energy content of the raw cornstalk increased from 16.35 to 26.31 MJ/kg. Similar results were observed for the HTC treatment of other biomass such as biogas sludge, barley, maize silage, starch, municipal solid waste, and sewage sludge (Seyedsadr et al. 2018; Kim et al. 2014; Parshetti et al. 2013; Hwang et al. 2012; Sevilla and Fuertes 2009). Therefore, it can be concluded that temperature, residence time, and the raw biomass type all influence the properties of the hydrochar. Building on this understanding of how temperature and residence time affect the mass yield, the physicochemical properties of the hydrochar under the set conditions were used to predict its yield theoretically.

Empirical and semi-empirical correlations have been reported in the literature for the estimation of biomass fuel HHV based on their proximate, ultimate, and chemical analyses (Saldarriaga et al. 2015; Saidur et al. 2011; Sheng and Azevedo 2005). (Vargas-Moreno et al. 2012) reported on mathematical models used to predict biomass HHV and assessed the performances of the prediction models. The study reported that the R2 remained as high as 0.748 for biomass in the 15 univariate and multivariate prediction equations reviewed. Artificial neural networks and, in particular, feed-forward artificial neural networks (FANNs) have been widely used to develop process models over the last 10 years, and their use in industry has evolved rapidly (Onifade et al. 2019; Aladejare et al. 2020; Majumder et al. 2008; Hansen and Meservy 1996; Wasserman 1993). (Aladejare et al. 2020) and (Estiati et al. 2016) used neural networks and regression analysis to predict the HHV of coal and biomass-based fuels from their proximate and ultimate analyses using both experimental and existing data from the literature. The results obtained by these authors show that the adaptive neuro-fuzzy inference system (ANFIS) and artificial neural network–particle swarm optimization (ANN-PSO) models perform better than the MLR models as reflected in the statistical analysis conducted to assess the performance of the models.

Few studies in the literature assess the combined influence of temperature, residence time, and the composition of biomass sources on hydrochar properties. The aim of this study is, therefore, to predict the mass yield, energy yield, and higher heating value of hydrochars using the HTC process conditions (temperature and residence time) and the biomass proximate analysis results. The experimental data from this study and data obtained from the literature were used in the linear and non-linear empirical models proposed to predict these properties for different biomass sources (as provided in Table 1). The performance of the proposed models was compared using the R2, mean absolute error, and mean bias error.

Table 1 Statistics of the inputs and output data used

Experimental method and models overview

Sample characterization and data generation

To develop the proposed models, data from the proximate analysis, the HTC process conditions (temperature and residence time), and the hydrochar properties (MY, EY, and HHV) of a number of biomass species were used. The woody biomass (Searsia lancea) used in this study was harvested from a phytoremediation trial site polluted with groundwater from the gold and uranium tailings dam at AngloGold Ashanti Limited's West Wits and Vaal River mining operations in South Africa. The different biomass components were milled in a Retsch SM 200 mill to −1 mm and −212 µm size fractions. The −1 mm fraction was used for the hydrothermal carbonization and the −212 µm fraction for the physicochemical characterization. The proximate analysis of these samples was performed according to ASTM D5142, with approximately 1 g used to determine the moisture, ash, and volatile matter contents. Fixed carbon was calculated by subtracting the sum of the moisture, volatile matter, and ash contents from 100%. A bomb calorimeter (Leco AC500) was used to determine the HHV of the samples in accordance with the ASTM D5865-04 standard.

One hundred and fifteen (115) data points, 9 from the experimental investigation and 106 from published articles on several biomass species, were used to obtain the predictive models. A statistical summary of the dataset is presented in Table 1, and the details of the data used in the model development are presented in Table 2. The summary statistics in Table 1 show that volatile matter, ash content, fixed carbon, and residence time do not follow normal distributions, based on their respective skewness. To enable the general application of the proposed models, the GEP, MISO-ANN, and MLR models were trained, tested, and validated on this dataset and compared with one another.

Table 2 Experimental data and data from Literature

Hydrothermal carbonization

The woody Searsia lancea tree species was carbonized in a laboratory-scale high-pressure Berghof BR-1500 reactor. For each experiment, the reactor was loaded with 100 g of air-dried sample and 800 ml of deionized water and pressurized to 20 bar using nitrogen. The hydrothermal tests were conducted at reaction temperatures of 200, 250, and 280 °C and residence times of 30, 60, and 90 min. The mixture was stirred at 200 rpm for the entire experiment. After the holding time, the reactor was allowed to cool to room temperature. The solid hydrochar was collected by filtration and dried in an oven at 105 °C for 24 h. The results for the nine (9) hydrochar samples produced under the set conditions are given in Table 2. The mass yield and energy yield of the hydrochars were calculated using the following equations:

$$ {\text{MY}} = \frac{{M_{{{\text{HC}}}} }}{{M_{{\text{R}}} }} \times 100\% , \tag{1} $$

where MY is the mass yield, M_HC is the mass of hydrochar, and M_R is the mass of the raw sample:

$$ {\text{EY}} = \left( {{\text{MY}} \times \frac{{{\text{HHV}}_{{{\text{HC}}}} }}{{{\text{HHV}}_{{\text{R}}} }}} \right) \times 100\% , \tag{2} $$

where EY is the energy yield, and HHV_HC and HHV_R are the higher heating values of the hydrochar and the raw sample, respectively.
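The two yield definitions can be illustrated with a short Python sketch. It assumes that MY is carried through in percent, so the factor of 100 is applied only once when EY is computed; the function names and the hydrochar mass and HHV used in the example are hypothetical and are not values from Table 2.

```python
def mass_yield(mass_hydrochar_g, mass_raw_g):
    """Mass yield in percent: hydrochar mass relative to the raw sample mass."""
    return mass_hydrochar_g / mass_raw_g * 100.0


def energy_yield(mass_yield_pct, hhv_hydrochar, hhv_raw):
    """Energy yield in percent.

    Assumption: MY is already expressed in percent, so no further
    factor of 100 is applied here.
    """
    return mass_yield_pct * hhv_hydrochar / hhv_raw


# Hypothetical illustration: 100 g of raw sample giving 50 g of hydrochar
# with an HHV of 25.0 MJ/kg; 17.23 MJ/kg is the raw-biomass HHV quoted in the text.
my = mass_yield(mass_hydrochar_g=50.0, mass_raw_g=100.0)
ey = energy_yield(my, hhv_hydrochar=25.0, hhv_raw=17.23)
print(f"MY = {my:.2f} %, EY = {ey:.2f} %")
```

Because EY multiplies MY by the ratio of heating values, a falling mass yield can outweigh a rising HHV, which is why EY may decrease with temperature even as HHV increases.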

Overview of GEP and ANN

Gene expression programming (GEP)

GEP is an evolutionary algorithm that combines the genotype representation of the genetic algorithm (GA) with the phenotype representation of genetic programming (GP). Like a living organism, GEP uses a simple chromosome of fixed length for keeping and transmitting genetic information, and complex tree structures that learn and adapt by changing their size, shape, and composition. The key advantage of the GEP model is its ability to present its output in the form of an expression tree and a simple relationship between the model parameters and the targeted output, unlike many optimization algorithms, which require the form of the relationship between the model parameters and the output to be proposed beforehand. Hence, GEP avoids the effort required in many optimization algorithms to establish the combination of model parameters that gives optimum results. In GA and GP, mutation and crossover are the common reproduction operators; they act according to the respective algorithms, which can increase the computational resources required (Guven and Aytek 2009; Teodorescu and Sherwood 2008). The GEP proposed by (Ferreira 2001) exploits the merits of GA and GP while overcoming the demerits of both. It uses two entities, the chromosomes and the expression trees. Instead of applying its operators to the expression tree directly, it operates on the chromosome, which reduces the computational resources required (Guven and Aytek 2009; Teodorescu and Sherwood 2008). The flowsheet in Fig. 1 shows the steps involved in GEP.

Fig. 1 Basic steps of GEP
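To make the chromosome and expression-tree relationship concrete, the following Python sketch decodes a linear gene into a tree and evaluates it. It assumes the breadth-first (Karva) reading of a gene that is commonly described for GEP; the function set, the symbol "Q" for square root, and the example gene are illustrative choices and are not taken from the models in this study.

```python
import math

# Illustrative function set: symbol -> (arity, implementation). "Q" is square root.
FUNCTIONS = {
    "+": (2, lambda a, b: a + b),
    "-": (2, lambda a, b: a - b),
    "*": (2, lambda a, b: a * b),
    "/": (2, lambda a, b: a / b),
    "Q": (1, math.sqrt),
}


def evaluate_k_expression(kexpr, terminals):
    """Decode a linear gene breadth-first into a tree and evaluate it."""
    # Pass 1: walk the gene left to right and hand each symbol its children
    # from the next unused positions (level-by-level tree construction).
    children = []
    n_used = 1  # the root symbol always belongs to the tree
    i = 0
    while i < n_used:
        arity = FUNCTIONS[kexpr[i]][0] if kexpr[i] in FUNCTIONS else 0
        children.append(list(range(n_used, n_used + arity)))
        n_used += arity
        i += 1

    # Pass 2: evaluate the tree recursively from the root.
    def node_value(idx):
        symbol = kexpr[idx]
        if symbol in FUNCTIONS:
            func = FUNCTIONS[symbol][1]
            return func(*(node_value(c) for c in children[idx]))
        return terminals[symbol]

    return node_value(0)


# The gene "Q*+-abcd" encodes sqrt((a + b) * (c - d)).
values = {"a": 3, "b": 1, "c": 6, "d": 2}
result = evaluate_k_expression("Q*+-abcd", values)
print(result)
```

Genetic operators act on the string, while fitness is measured on the decoded tree. Symbols beyond the last one needed by the tree are simply ignored, which is what lets a fixed-length chromosome express trees of different sizes.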

Artificial neural networks (ANN)

ANN belongs to the family of artificial intelligence techniques that imitate the functionality of the human brain. It mimics how the human brain receives, processes, and transforms information. There are different types of ANNs, but the multilayer neural network is the most widely used. In a supervised ANN, the input parameters are supplied together with the targeted output (Jain et al. 1996). The input parameters are multiplied by the connecting weights, and their sum, together with the bias, is fed into the transfer function at the hidden layer. The output of the hidden layer is multiplied by the weights connecting the hidden layer to the output layer, and the sum is added to the bias and fed into the transfer function at the output layer to obtain the final predicted output. The transfer function at the hidden layer is usually non-linear, while that at the output layer can be linear or non-linear. The flow chart of the steps involved in ANN training is shown in Fig. 2.

Fig. 2 Basic steps of ANN training
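The forward calculation described above (weighted sum plus bias, followed by a transfer function at each layer) can be sketched in plain Python. The weights and biases below are random placeholders rather than trained values, the 5-10-1 layout is only an example, and the helper names are hypothetical.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Create placeholder (random, untrained) weights and biases for one layer."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-1.0, 1.0) for _ in range(n_out)]
    return weights, biases


def layer_output(inputs, weights, biases, transfer):
    """Weighted sum of the inputs plus bias, passed through the transfer function."""
    return [transfer(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def forward(inputs, layers):
    """Propagate the inputs through every layer in turn."""
    signal = inputs
    for weights, biases, transfer in layers:
        signal = layer_output(signal, weights, biases, transfer)
    return signal


def linear(v):
    """Linear (identity) transfer function for the output layer."""
    return v


rng = random.Random(0)
# Example 5-10-1 network: five normalized inputs (VM, Ash, FC, Temp, RT),
# ten tanh hidden neurons, one linear output neuron.
layers = [
    (*init_layer(5, 10, rng), math.tanh),
    (*init_layer(10, 1, rng), linear),
]
x = [0.2, -0.5, 0.1, 0.4, -0.3]  # hypothetical inputs already scaled to [-1, 1]
y = forward(x, layers)
print(y)
```

Training would then adjust the weights and biases to reduce the error between `y` and the target; that step is not shown here.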

Description of experimental data

Table 2 shows the proximate analysis of the biomass feedstock (Searsia lancea). The results of the proximate analysis are presented on a moisture-free (dry) basis. Volatile matter, ash content, and fixed carbon are influential constituents of fuel materials and are used to ascertain their quality. The volatile matter content significantly influences the combustion process (Mierzwa-Hersztek et al. 2019; Sadiku et al. 2016). In addition, (Brewer et al. 2014) and (Holtmeyer et al. 2013) reported that a material with higher volatile matter can be advantageous for combustion processes because it is easier to ignite, burns out completely at a lower temperature, and gives a stable flame. A high volatile matter content of 75.67% was obtained for the Searsia lancea, making the material a potential feedstock for combustion. An ash content of 4.26% was obtained for the feedstock. (He et al. 2018) reported that a lower ash content may reduce fouling and slagging. The fixed carbon content of a material is an indicator of the fuel's heating value (Sadiku et al. 2016). For our biomass feedstock, a fixed carbon content of 20.07% was obtained.

The mass yields of the hydrochars from Searsia lancea, calculated using Eq. (1), decrease as the temperature increases at each residence time, reaching yields as low as 34.89% at 250 °C. The reduction in mass yield is a result of decarboxylation and dehydration reactions during the HTC process, which decompose the biomass feedstock (Saba et al. 2017; Reza et al. 2014). For each residence time, the energy yield of the hydrochars decreases with an increase in temperature. This can be attributed to the influence of the mass yield in the calculation of the energy yield (Eq. 2). The HHV of Searsia lancea (17.23 MJ/kg) increased after the HTC process to between 20.27 MJ/kg at 200 °C and 29.71 MJ/kg at 280 °C. The HHV increases with an increase in both temperature and residence time.

Development of models using soft computing and regression analyses

GEP model

In developing the GEP model, the same dataset used in the training and testing of the ANN model was used. However, instead of normalizing the dataset to the range of −1 to 1, the dataset was normalized to the range of 0 to 1 for the GEP model. The purpose is to ensure dimensional uniformity and forestall overfitting. The GEP model was implemented in GeneXproTools 5.0. After loading the data into the software, the number of chromosomes, the head size, and the number of genes were set to 30, 8, and 5, respectively, for MY and HHV, while 30, 8, and 6 were used for EY. The linking function used for MY was Average (Avg2), while Addition was used for HHV and Minimum (Min) for EY. The fitness function adopted for MY, EY, and HHV was the root-mean-squared error (RMSE). For the genetic operators, the Optimal Evolution strategy was used. The functional operators (e.g., addition, subtraction, division, multiplication, hyperbolic tangent, etc.) were also selected for each of MY, EY, and HHV. The maximum fitness was used as the stop condition. The model was then trained separately for MY, EY, and HHV. The final expression trees obtained and the mathematical interpretation of each tree are presented in Figs. 3, 4, 5 and Eqs. (3), (5), and (7):

Fig. 3 Expression tree for the MY

Fig. 4 Expression tree for the EY

Fig. 5 Expression tree for the HHV

$$ {\text{MY}} = 72.5y_{5} + 26.7, \tag{3} $$

where y5 is given in Eq. (4e):

$$ y_{1} = \tan^{ - 1} \left\{ {1/\left[ {\max \left( {({\text{Ash}} - 0.25)/49.6,({\text{VM}} - 34.3)/55.44} \right) - 0.0771({\text{FC}} - 2.66)} \right]} \right\} \tag{4a} $$
$$ y_{2} = 0.5\left\{ \begin{gathered} y_{1} \, + \, \left[ {1 - \left( {0.5( - 7.2301 + ({\text{VM}} - 34.3)/55.44)} \right)^{2} } \right] \times ... \hfill \\ \left( {\min \left( {({\text{VM}} - 34.3)/55.44,({\text{Temp}} - 140)/160} \right)({\text{Temp}} - 140)/160 - 0.1076} \right) \hfill \\ \end{gathered} \right\} \tag{4b} $$
$$ y_{3} = 0.5(y_{2} + 3.3587) \tag{4c} $$
$$ y_{4} = 0.5\left\{ {y_{3} + \left[ {1/\left( \begin{gathered} 0.5\left( {({\text{Temp}} - 140)/160 + ({\text{VM}} - 34.3)/55.44} \right) \times ... \hfill \\ \left( {({\text{VM}} - 34.3)/55.44} \right)^{1/3} + ... \hfill \\ 0.5\left( {{\text{EXP}}\left( {({\text{Ash}} - 0.25)/49.6} \right) + ({\text{Ash}} - 0.25)/49.6} \right) \hfill \\ \end{gathered} \right)} \right]^{2} } \right\} \tag{4d} $$
$$ y_{5} = 0.5\left[ {y_{4} - 135.4044\left( {({\text{RT}} - 5)({\text{Ash}} - 0.25)/23560} \right)^{2} ({\text{FC}} - 2.66)/51.34} \right] \tag{4e} $$

$$ {\text{EY}} = 75.44x_{6} + 37.11, \tag{5} $$

where x6 is given in Eq. (6f):

$$ x_{1} = 0.7833/(2({\text{FC}} - 0.25)/49.6 + 1.7742({\text{RT}} - 5)/475) \tag{6a} $$
$$ x_{2} = \min \left\{ {x_{1} ,\left[ {1 - 0.5\left( {({\text{FC}} - 5)/475 + (({\text{VM}} - 34.3)/55.44 \, )^{2} } \right)} \right]} \right\} \tag{6b} $$
$$ x_{3} = \min \left\{ {x_{2} ,\left[ {\tan^{ - 1} \left( {\tan^{ - 1} \left( {({\text{Ash}} - 0.25)/49.6 + 0.5(0.06505 + ({\text{FC}} - 2.66)/51.34)} \right)} \right)} \right]^{1/3} } \right\} \tag{6c} $$
$$ x_{4} = \min \left\{ {x_{3} ,\left[ \begin{gathered} \left( {0.5(({\text{RT}} - 5)/475 + ({\text{Ash}} - 0.25)/49.6) - (({\text{VM}} - 34.3)/55.44)^{2} } \right)/ \hfill \\ \left( {0.5((({\text{Ash}} - 0.25)/49.6)^{2} - 0.3248)} \right) \hfill \\ \end{gathered} \right]^{2} } \right\} \tag{6d} $$
$$ x_{5} = \min \left\{ {x_{4} ,\left[ \begin{gathered} 0.5\left( {({\text{Temp}} - 140)/160 + ({\text{Ash}} - 0.25)/49.6} \right)({\text{Temp}} - 140)/160) \hfill \\ \left( {({\text{FC}} - 2.66)/51.34 - ({\text{VM}} - 34.3)/55.44} \right) - \hfill \\ \left( {({\text{RT}} - 5)/475 + 2({\text{Temp}} - 140)/160 - 4.07501} \right)/4 \hfill \\ \end{gathered} \right]} \right\} \tag{6e} $$
$$ x_{6} = \min \left\{ {x_{5} ,\left[ {\left( {({\text{RT}} - 5)/475 + (({\text{Temp}} - 140)/160 + ({\text{FC}} - 2.66)/51.34)/4} \right)^{2} + 0.6882} \right]} \right\} \tag{6f} $$

$$ {\text{HHV}} = 24.73\sum\limits_{i = 1}^{n} {z_{i} } + 9.8. \tag{7} $$

The zi in Eq. (7) can be computed using Eq. (8):

$$ z_{1} = - {\text{EXP}}( - 4.6508 + {\text{VM}}/55.44 - {\text{Ash}}/49.6) \tag{8a} $$
$$ \begin{gathered} z_{2} = {\text{EXP}}(0.5( - 12.8458 + {\text{Temp}}/160 + {\text{VM}}/55.44 + {\text{EXP}}( - 1.4937 + {\text{Temp}}/160 + ... \hfill \\ {\text{VM}}/55.44))) \hfill \\ \end{gathered} \tag{8b} $$
$$ z_{3} = \tanh \left\{ {2/\left[ {2.2321 + (0.8649 + 2{\text{Ash}}/49.6 - {\text{Temp}}/160)({\text{Temp}} - 140)^{1/3} /160^{1/3} } \right]} \right\} \tag{8c} $$
$$ z_{4} = 1/\left\{ \begin{gathered} - 3.0552( - 0.6705 + {\text{FC}}/51.34 + {\text{VM}}/55.44) - \tanh [({\text{VM}} - 34.3)/55.44] - ... \hfill \\ {\text{EXP}}[ - 13.2680({\text{FC}} - 2.66)/51.34] \hfill \\ \end{gathered} \right\} \tag{8d} $$
$$ \begin{gathered} z_{5} = ({\text{FC}} - 2.66)\tanh (0.5(({\text{RT}} - 5)/475 + (({\text{Temp}} - 140)/ - 332.0686)( - 2.9276 + ... \hfill \\ {\text{VM}}/55.44)))/51.34. \hfill \\ \end{gathered} \tag{8e} $$
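As a worked illustration, the HHV expression and its five sub-expressions z1 to z5 can be transcribed into a small Python function. This is a direct transcription of the equations as printed above, not the software-generated model itself; the function name is hypothetical, and the cube root assumes Temp is at least 140 °C. The example inputs are the Searsia lancea proximate values quoted earlier, combined with one of the experimental settings (200 °C, 30 min).

```python
import math


def gep_hhv(vm, ash, fc, temp, rt):
    """Evaluate the GEP expression for HHV (MJ/kg) as transcribed from the text.

    vm, ash, fc: proximate analysis in wt% (dry basis); temp in deg C; rt in min.
    Assumes temp >= 140 so that the cube root in z3 is real.
    """
    z1 = -math.exp(-4.6508 + vm / 55.44 - ash / 49.6)
    z2 = math.exp(0.5 * (-12.8458 + temp / 160 + vm / 55.44
                         + math.exp(-1.4937 + temp / 160 + vm / 55.44)))
    z3 = math.tanh(2 / (2.2321 + (0.8649 + 2 * ash / 49.6 - temp / 160)
                        * (temp - 140) ** (1 / 3) / 160 ** (1 / 3)))
    z4 = 1 / (-3.0552 * (-0.6705 + fc / 51.34 + vm / 55.44)
              - math.tanh((vm - 34.3) / 55.44)
              - math.exp(-13.2680 * (fc - 2.66) / 51.34))
    z5 = (fc - 2.66) * math.tanh(
        0.5 * ((rt - 5) / 475
               + ((temp - 140) / -332.0686) * (-2.9276 + vm / 55.44))
    ) / 51.34
    # Linking by addition, then scaling back from the normalized output.
    return 24.73 * (z1 + z2 + z3 + z4 + z5) + 9.8


hhv = gep_hhv(vm=75.67, ash=4.26, fc=20.07, temp=200, rt=30)
print(f"GEP-predicted HHV: {hhv:.2f} MJ/kg")
```

The closing line of the function mirrors the structure of the model: the gene outputs are linked by addition and then mapped from the normalized scale back to MJ/kg.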

Artificial neural network

A MISO-ANN is proposed in this study for the prediction of EY, MY, and HHV. To achieve this, single hidden-layer and double hidden-layer networks were tried for each of EY, MY, and HHV, as presented in Tables 3, 4. The optimum networks obtained are presented in Figs. 6 and 7. In developing the MISO-ANN models, the number of neurons in the input, hidden, and output layers and the transfer functions at the hidden and output layers must be defined. In this study, there are five neurons in the input layer, comprising VM, Ash, FC, Temp, and RT. For the MISO-ANN with a single hidden layer, between 3 and 15 hidden neurons were tried, while for the MISO-ANN with a double hidden layer, the neuron combinations tried ranged from 5-3 to 15-15 for each of the targeted variables. For the network with a single hidden layer, the hyperbolic tangent transfer function was used for both the hidden and output layers. For the double hidden layer, the hyperbolic tangent was used in the first and second hidden layers, while purelin (linear) was used for the output layer. The feedforward backpropagation training algorithm with the Levenberg–Marquardt training function was used to train the network. One hundred and fifteen (115) data points were used for model development, divided into 70% for training and 15% each for testing and validation (Fig. 2). The datasets were normalized to the range of −1 to 1 to forestall overfitting and ensure dimensional uniformity. The performance of each of the trained networks using the normalized datasets was evaluated using R2, RMSE, ME, and standard deviation (std). The outputs obtained for various combinations of neurons are presented in Tables 3, 4. The best network for the MY prediction is 5-10-10-1, while 5-15-15-1 is the best network for both EY and HHV, as shown in bold in Tables 3, 4 and presented in Figs. 6 and 7.
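The normalization and the 70/15/15 partition described above can be sketched as follows. The min-max form of the scaling, the rounding of the subset sizes, and the random shuffling are assumptions, since the text does not specify them. The temperature limits of 140 and 300 °C are read off the normalization term (Temp − 140)/160 in the GEP equations, and all function names are hypothetical.

```python
import random


def scale_unit(x, lo, hi):
    """Min-max scaling to [0, 1], the range used for the GEP model."""
    return (x - lo) / (hi - lo)


def scale_symmetric(x, lo, hi):
    """Min-max scaling to [-1, 1], the range used for the MISO-ANN model."""
    return 2.0 * scale_unit(x, lo, hi) - 1.0


def split_indices(n, train_frac=0.70, test_frac=0.15, seed=0):
    """Shuffle the indices 0..n-1 and split them into train/test/validation.

    Assumption: subset sizes are rounded down and the remainder goes to
    validation; the original partition may differ.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = int(train_frac * n)
    n_test = int(test_frac * n)
    return idx[:n_train], idx[n_train:n_train + n_test], idx[n_train + n_test:]


train, test, validation = split_indices(115)
print(len(train), len(test), len(validation))
print(scale_unit(220.0, 140.0, 300.0), scale_symmetric(220.0, 140.0, 300.0))
```

Any prediction made on the normalized scale has to be mapped back with the inverse of the same transformation before it is compared with measured values.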

Table 3 Selection of the optimum MISO-ANN network for MY and EY
Table 4 Selection of the optimum MISO-ANN network for HHV
Table 5 Error analysis
Fig. 6 Optimum MISO-ANN architecture for MY (5-10-10-1)

Fig. 7 Optimum MISO-ANN architecture for either EY or HHV (5-15-15-1)

Multiple linear regression analysis

Regression analysis is commonly used to establish the relationship between regressors and a targeted variable. When the relationship involves the targeted variable and a single regressor, it is known as simple linear regression. When more than one regressor is involved, it is known as multiple linear regression (MLR). MLR has been used by researchers (Said et al. 2020a, b; Onifade et al. 2019) for prediction purposes. MLR is also adopted in this study to enable comparison with the GEP and ANN models. An MLR model was developed for each of the three predicted parameters: MY, EY, and HHV. The MLR analysis was performed with the Microsoft Excel Add-ins using the same datasets used in the GEP and ANN models. The MLR models obtained are presented in Eqs. (9) to (11):

$$ {\text{MY}} = 557.5078 - 4.4502{\text{VM}} - 4.3243{\text{Ash}} - 4.0474{\text{FC}} - 0.2495{\text{Temp}} - 0.0409{\text{RT}} \tag{9} $$
$$ {\text{EY}} = - 230.2695 + 3.2966{\text{VM}} + 3.2934{\text{Ash}} + 3.9157{\text{FC}} - 0.1372{\text{Temp}} - 0.0337{\text{RT}} \tag{10} $$
$$ {\text{HHV}} = - 73.8668 + 0.89395{\text{VM}} + 0.5879{\text{Ash}} + 0.9914{\text{FC}} + 0.0398{\text{Temp}} + 0.0058{\text{RT}}. \tag{11} $$
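The three MLR equations share the same linear form, so they can be evaluated with one helper, as in the Python sketch below. The coefficients are copied from the equations above; the dictionary and function names are hypothetical, and the inputs are assumed to be in their original units (wt%, °C, min). The example uses the Searsia lancea proximate values with one experimental setting (200 °C, 30 min).

```python
# Intercept followed by the coefficients of VM, Ash, FC, Temp, and RT.
MLR_COEFFS = {
    "MY": (557.5078, -4.4502, -4.3243, -4.0474, -0.2495, -0.0409),
    "EY": (-230.2695, 3.2966, 3.2934, 3.9157, -0.1372, -0.0337),
    "HHV": (-73.8668, 0.89395, 0.5879, 0.9914, 0.0398, 0.0058),
}


def mlr_predict(target, vm, ash, fc, temp, rt):
    """Evaluate the linear model for the requested target (MY, EY, or HHV)."""
    intercept, *slopes = MLR_COEFFS[target]
    return intercept + sum(b * x for b, x in zip(slopes, (vm, ash, fc, temp, rt)))


sample = dict(vm=75.67, ash=4.26, fc=20.07, temp=200, rt=30)
predictions = {target: mlr_predict(target, **sample) for target in MLR_COEFFS}
for target, value in predictions.items():
    print(f"{target}: {value:.2f}")
```

The signs of the Temp and RT coefficients reflect the trends discussed earlier: MY and EY fall and HHV rises as the treatment becomes more severe.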

Results and discussion

Models comparison

The accuracy of the proposed GEP, MISO-ANN, and MLR models is compared with the laboratory-measured values using the testing and validation datasets. For the MY, the outcome of the comparison is presented in Fig. 8. For the testing datasets, the points predicted with MISO-ANN fall largely within the 3% error lines, while many of the points predicted by the GEP and MLR models fall outside them. This gave rise to an R2 of 0.981 for the MISO-ANN, while the R2 values recorded for the GEP and MLR models are 0.691 and 0.463, respectively. For the validation data points, the R2 value recorded for the MISO-ANN is 0.976, while those of GEP and MLR are 0.548 and 0.154, respectively. Of the three proposed models, the MISO-ANN predictions are generally the closest to the experimentally measured values. The performance of MISO-ANN can be attributed to its ability to handle complex non-linearity between the model parameters (Gevrey et al. 2003). The outcome of the MISO-ANN is consistent with most of the previous studies that compared the performance of ANN with regression-based models in predicting the HHV of solid fuels (Onifade et al. 2019; Ghugare et al. 2017; Uzun et al. 2017; Patel et al. 2007). Beyond the HHV of solid fuels, many authors have found that ANN provides a more reliable predictive model than regression-based models (Lawal 2020; Lawal et al. 2020; Said et al. 2020a; Saadat et al. 2014; Khandelwal and Singh 2010).

Fig. 8 Model comparison for the MY using (a) testing datasets, (b) validation datasets

The predictive ability of the three proposed models (GEP, MISO-ANN, and MLR) was also compared for the hydrochar property EY, as presented in Fig. 9. The majority of the data points predicted with MISO-ANN fall within the 3% error lines, while many of the data points predicted using GEP and MLR fall outside them. As a result, the R2 values of MISO-ANN for the testing and validation datasets are 0.965 and 0.955, respectively, while those of GEP are 0.622 and 0.419. For the MLR model, the R2 values for the testing and validation datasets are 0.219 and 0.205, respectively. Again, the MISO-ANN model outperforms the GEP and MLR models.

Fig. 9 Model comparison for the EY using (a) testing datasets, (b) validation datasets

Similarly, the accuracy of the proposed models (GEP, MISO-ANN, and MLR) for predicting HHV is evaluated as presented in Fig. 10. All the data points predicted using the MISO-ANN fall within the 3% error lines for both the testing and validation data points. The majority of the data points predicted by the GEP and MLR models fall outside the error lines in Fig. 10. Because the data points predicted using the MISO-ANN model lie within the error band, the performance of the MISO-ANN is excellent: an R2 of 0.999 is obtained for the testing data points (Fig. 10a), and 0.996 for the validation data points (Fig. 10b). The R2 values obtained for the GEP model are 0.810 and 0.717, while those of the MLR model are 0.788 and 0.643, for the testing and validation data points, respectively. The low R2 values observed for the GEP and MLR models in the MY, EY, and HHV predictions can be attributed to their predicted data points that fall outside the error bands. Hence, the MISO-ANN can give reliable predictions of the MY, EY, and HHV, followed by the GEP model, while MLR may not be reliable.

Fig. 10 Model comparison for the HHV using (a) testing datasets, (b) validation datasets

Error analysis

To enable the selection of the best performing model for predicting the MY, EY, and HHV, mean absolute error and mean bias errors were evaluated for each of the three techniques used in developing the proposed models as presented in Eqs. (12) and (13):

$$ {\text{MAE}} = \frac{{\sum\limits_{i = 1}^{n} {\left| {\frac{{P_{i} - M_{i} }}{{M_{i} }}} \right|} }}{n} \times 100\% \tag{12} $$
$$ {\text{MBE}} = \frac{{\sum\limits_{i = 1}^{n} {\left[ {\frac{{P_{i} - M_{i} }}{{M_{i} }}} \right]} }}{n} \times 100\% , \tag{13} $$

where P_i is the predicted value, M_i is the measured value, and n is the number of data points.

The results obtained using Eqs. (12) and (13) are presented in Table 5. From Table 5, MAE values of 2.24, 2.11, and 0.93% were obtained for MY, EY, and HHV, respectively, using the MISO-ANN model, while the corresponding MBE values are 0.16, 0.37, and 0.12%. Hence, the best model for the prediction of MY, EY, and HHV is the MISO-ANN model, followed by the GEP model, while the MLR will overestimate the values of the MY, EY, and HHV based on the MBE values in Table 5.
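The error metrics in Eqs. (12) and (13) can be computed with a few lines of Python, as sketched below. The R2 function uses the common definition 1 − SS_res/SS_tot, which is an assumption because the text does not state which form of R2 was used. The data in the example are invented for illustration and are not results from Table 5.

```python
def mae_percent(predicted, measured):
    """Mean absolute (relative) error in percent, following Eq. (12)."""
    n = len(measured)
    return sum(abs((p - m) / m) for p, m in zip(predicted, measured)) / n * 100.0


def mbe_percent(predicted, measured):
    """Mean bias (relative) error in percent, following Eq. (13)."""
    n = len(measured)
    return sum((p - m) / m for p, m in zip(predicted, measured)) / n * 100.0


def r_squared(predicted, measured):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_m = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for p, m in zip(predicted, measured))
    ss_tot = sum((m - mean_m) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


# Invented illustration data (not from the study).
measured = [10.0, 20.0, 40.0]
predicted = [11.0, 18.0, 40.0]
mae = mae_percent(predicted, measured)
mbe = mbe_percent(predicted, measured)
r2 = r_squared(predicted, measured)
print(f"MAE = {mae:.2f} %, MBE = {mbe:.2f} %, R2 = {r2:.3f}")
```

In this example the over- and under-predictions cancel in the MBE but not in the MAE, which is why the two metrics are reported together: a positive MBE indicates overall overestimation, while the MAE measures the typical size of the error.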

Sensitivity analysis

The sensitivity analysis helps in providing useful information on the contributions of each of the input parameters on the output predicted by the model. Various techniques have been proposed to perform this task but the Cosine Amplitude method (CAM) (Yang and Zhang 1997) as presented in Eq. (14) is adopted in this study:

$$ R_{{{\text{ij}}}} = \frac{{\sum\limits_{k = 1}^{n} {(r_{{\text{m}}} \times P)} }}{{\sqrt {\sum\limits_{k = 1}^{n} {r_{{\text{m}}}^{2} } \sum\limits_{k = 1}^{n} {P^{2} } } }}, \tag{14} $$

where R_ij is the strength of the relation between an input parameter and the output, r_m represents the model regressors (inputs), P is the predicted output, and n is the number of data points over which the sums are taken.
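A minimal Python sketch of Eq. (14) is given below. It treats each input parameter as a vector over the data points and computes its cosine amplitude against the vector of predicted outputs. The function name and all numbers in the example are hypothetical and are not data from this study.

```python
import math


def cosine_amplitude(regressor, predicted):
    """Strength of relation between one input and the predicted output (Eq. 14)."""
    numerator = sum(r * p for r, p in zip(regressor, predicted))
    denominator = math.sqrt(sum(r * r for r in regressor)
                            * sum(p * p for p in predicted))
    return numerator / denominator


# Hypothetical inputs and model predictions for three data points.
inputs = {
    "Temp": [200.0, 250.0, 280.0],
    "RT": [30.0, 90.0, 60.0],
}
predicted_hhv = [20.0, 25.0, 28.0]

strengths = {name: cosine_amplitude(values, predicted_hhv)
             for name, values in inputs.items()}
ranking = sorted(strengths, key=strengths.get, reverse=True)
print(strengths, ranking)
```

Values close to 1 indicate an input that varies almost proportionally with the predicted output; sorting the strengths gives the kind of ordering displayed in a Pareto chart.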

The MISO-ANN model which is adjudged the best out of the three proposed models based on the previous analysis conducted in this study is used to perform the sensitivity analysis. The output obtained is presented in the Pareto chart (PC) shown in Figs. 11, 12, 13. The VM and Temp have the highest influence on the MY, EY, and HHV as presented in Figs. 11, 12, 13. The order of the influence in all the figures is VM > Temp > FC > RT > Ash. In addition, based on the Pareto chart analysis, the Ash and RT in that order should not be ignored when predicting MY, EY, and HHV.

Fig. 11 Input parameters contributions to the predicted MY

Fig. 12 Input parameters contributions to the predicted EY

Fig. 13 Input parameters contributions to the predicted HHV


Mass yield, energy yield, and higher heating value are important hydrochar properties required for the analysis and design of any bioenergy system. In the present study, Gene Expression Programming, a multiple-input single output-artificial neural network, and Multilinear regression were applied to predict the MY, EY, and HHV of hydrochars using the composition of the biomass source from proximate analysis and the HTC process conditions (temperature and residence time). Based on R2 values and error analysis, the MISO-ANN with 5-10-10-1 and 5-15-15-1 network architectures gave the best performance among the proposed models, with R2 = 0.976, 0.955, 0.996; MAE = 2.24, 2.11, 0.93; MBE = 0.16, 0.37, 0.12 for MY, EY, and HHV, respectively. GEP has been shown to provide a satisfactory predictive alternative to MISO-ANN, with R2 = 0.691, 0.622, 0.810; MAE = 12.38, 10.31, 8.58; MBE = 2.95, 0.64, 0.78 for MY, EY, and HHV, respectively. From the sensitivity analysis, volatile matter and temperature were found to be the most influential input variables. This study demonstrated the ability of GEP to satisfactorily model hydrochar properties based on biomass composition and HTC process conditions. Although the accuracy of the GEP models was lower than that of the MISO-ANN models, the GEP models provided much more accurate predictions than the MLR models, which proved unsatisfactory.

Availability of data and materials

All data generated or analysed during this study are included in this published article.



Abbreviations

ANFIS: Adaptive neuro-fuzzy inference system
ANN: Artificial neural network
ANN-PSO: Artificial neural network–particle swarm optimization
ASTM: American Society for Testing and Materials
CAM: Cosine amplitude method
EY: Energy yield
FC: Fixed carbon
FFANN: Feed-forward artificial neural networks
GA: Genetic algorithm
GEP: Gene expression programming
GP: Genetic programming
HHV: Higher heating value
HHVHC: Higher heating value of hydrochar
HHVR: Higher heating value of raw biomass
HTC: Hydrothermal carbonization
MAE: Mean absolute error
MBE: Mean bias error
ME: Mean error
MHC: Mass of hydrochar
MR: Mass of raw biomass
MISO-ANN: Multiple-input single output-artificial neural network
MLR: Multilinear regression
MY: Mass yield
PC: Pareto chart
R2: Coefficient of determination
RMSE: Root mean square error
RT: Residence time
TGA: Thermogravimetric analysis
VM: Volatile matter

  • Aladejare AE, Onifade M, Lawal AI (2020) Application of metaheuristic based artificial neural network and multilinear regression for the prediction of higher heating values of fuels. Int J Coal Prep Util:1–22

  • Bach Q-V, Tran K-Q, Skreiberg Ø (2016) Hydrothermal pretreatment of fresh forest residues: effects of feedstock pre-drying. Biomass Bioenerg 85:76–83

    Article  CAS  Google Scholar 

  • Brewer CE, Chuang VJ, Masiello CA, Gonnermann H, Gao X, Dugan B, Driver LE, Panzacchi P, Zygourakis K, Davies CA (2014) New approaches to measuring biochar density and porosity. Biomass Bioenerg 66:176–185

    Article  CAS  Google Scholar 

  • Danso-Boateng E (2015) Biomass hydrothermal carbonisation for sustainable engineering. Doctoral dissertation, Loughborough University

  • Estiati I, Freire FB, Freire JT, Aguado R, Olazar M (2016) Fitting performance of artificial neural networks and empirical correlations to estimate higher heating values of biomass. Fuel 180:377–383

    Article  CAS  Google Scholar 

  • Ferreira C (2001) Gene expression programming: a new adaptive algorithm for solving problems. Accessed 22 Jul 2020

  • Funke A, Ziegler F (2010) Hydrothermal carbonization of biomass: a summary and discussion of chemical mechanisms for process engineering. Biofuel BioprodBior 4(2):160–177

    Article  CAS  Google Scholar 

  • Gao L, Volpe M, Lucian M, Fiori L, Goldfarb JL (2019) Does hydrothermal carbonization as a biomass pretreatment reduce fuel segregation of coal-biomass blends during oxidation? Energy Convers. Manage 181:93–104

    CAS  Google Scholar 

  • Gevrey M, Dimopoulos I, Lek S (2003) Review and comparison of methods to study the contribution of variables in artificial neural network models. Ecol Model 160(3):249–264

    Article  Google Scholar 

  • Ghugare SB, Tiwary S, Tambe SS (2017) Computational intelligence based models for prediction of elemental composition of solid biomass fuels from proximate analysis. Int J Sys Ass EngMgt 8(4):2083–2096

    Google Scholar 

  • Guven A, Aytek A (2009) New approach for stage-discharge relationship: gene-expression programming. J HydrolEng 14(8):812–820

    Google Scholar 

  • Hansen JV, Meservy RD (1996) Learning experiments with genetic optimization of a generalized regression neural network. Decis Support Syst 18(3–4):317–325

    Article  Google Scholar 

  • He X, Liu Z, Niu W, Yang L, Zhou T, Qin D, Niu Z, Yuan Q (2018) Effects of pyrolysis temperature on the physicochemical properties of gas and biochar obtained from pyrolysis of crop residues. Energy 143:746–756

    Article  CAS  Google Scholar 

  • Holtmeyer ML, Li G, Kumfer BM, Li S, Axelbaum RL (2013) The impact of biomass cofiring on volatile flame length. Energ Fuel 27(12):7762–7771

    Article  CAS  Google Scholar 

  • Hwang I-H, Aoyama H, Matsuto T, Nakagishi T, Matsuo T (2012) Recovery of solid fuel from municipal solid waste by hydrothermal treatment using subcritical water. J Waste Manag 32(3):410–416

    Article  CAS  Google Scholar 

  • Jain AK, Mao J, Mohiuddin KM (1996) Artificial neural networks: a tutorial. Comput J 29(3):31–44

    Google Scholar 

  • Kambo HS, Dutta A (2014) Strength, storage, and combustion characteristics of densified lignocellulosic biomass produced via torrefaction and hydrothermal carbonization. Appl Energy 135:182–191

    Article  CAS  Google Scholar 

  • Kambo HS, Dutta A (2015) A comparative review of biochar and hydrochar in terms of production, physico-chemical properties, and applications. Renew SustEnerg Rev 45:359–378

    Article  CAS  Google Scholar 

  • Khandelwal M, Singh T (2010) Prediction of macerals contents of Indian coals from proximate and ultimate analyses using artificial neural networks. Fuel 89(5):1101–1109

    Article  CAS  Google Scholar 

  • Kim D, Lee K, Park KY (2014) Hydrothermal carbonization of anaerobically digested sludge for solid fuel production and energy recovery. Fuel 130:120–125

    Article  CAS  Google Scholar 

  • Krylova AY, Zaitchenko V (2018) Hydrothermal carbonization of biomass: a review. Solid Fuel Chem 52(2):91–103

    Article  CAS  Google Scholar 

  • Kubacki ML, Ross AB, Jones JM, Williams A (2012) Small-scale co-utilisation of coal and biomass. Fuel 101:84–89

    Article  CAS  Google Scholar 

  • Lawal AI (2020) An artificial neural network-based mathematical model for the prediction of blast-induced ground vibration in granite quarries in Ibadan, Oyo State Nigeria. Sci Afr 8:e00413

    Google Scholar 

  • Lawal AI, Aladejare AE, Onifade M, Bada S, Idris MA (2020) Predictions of elemental composition of coal and biomass from their proximate analyses using ANFIS, ANN, and MLR. Int J Coal Sci Technol.

    Article  Google Scholar 

  • Lee J, Sohn D, Lee K, Park KY (2019) Solid fuel production through hydrothermal carbonization of sewage sludge and microalgae Chlorella sp. from wastewater treatment plant. Chemosphere 230:157–163

    Article  CAS  PubMed  Google Scholar 

  • Libra JA, Ro KS, Kammann C, Funke A, Berge ND, Neubauer Y, Titirici M-M, Fühner C, Bens O, Kern J (2011) Hydrothermal carbonization of biomass residuals: a comparative review of the chemistry, processes, and applications of wet and dry pyrolysis. Biofuels 2(1):71–106

    Article  CAS  Google Scholar 

  • Majumder AK, Jain R, Banerjee P, Barnwal J (2008) Development of a new proximate analysis based correlation to predict calorific value of coal. Fuel 87(13–14):3077–3081

    Article  CAS  Google Scholar 

  • Makwarela M, Bada S, Falcon R (2017) Co-firing combustion characteristics of different ages of Bambusabalcooa relative to a high ash coal. Renew Energy 105:656–664

    Article  Google Scholar 

  • Mierzwa-Hersztek M, Gondek K, Jewiarz M, Dziedzic K (2019) Assessment of energy parameters of biomass and biochars, leachability of heavy metals, and phytotoxicity of their ashes. J Mater Cycles Waste 21(4):786–800

    Article  CAS  Google Scholar 

  • Mumme J, Eckervogt L, Pielert J, Diakité M, Rupp F, Kern J (2011) Hydrothermal carbonization of anaerobically digested maize silage. BioresourTechnol 102(19):9255–9260

    Article  CAS  Google Scholar 

  • Onifade M, Lawal AI, Aladejare AE, Bada S, Idris MA (2019) Prediction of gross calorific value of solid fuels from their proximate analysis using soft computing and regression analysis. Int J Coal Prep Util:1–15

  • Park KY, Lee K, Kim D (2018) Characterized hydrochar of algal biomass for producing solid fuel through hydrothermal carbonization. BioresourTechnol 258:119–124

    Article  CAS  Google Scholar 

  • Parshetti GK, Hoekman SK, Balasubramanian R (2013) Chemical, structural and combustion characteristics of carbonaceous products obtained by hydrothermal carbonization of palm empty fruit bunches. BioresourTechnol 135:683–689

    Article  CAS  Google Scholar 

  • Patel N (2019) Hydrothermal Carbonization (HTC) of Marine Seaweed (Macroalgae) for Producing Hydro-Char. Masters thesis, Dalhousie University

  • Patel SU, Kumar BJ, Badhe YP, Sharma B, Saha S, Biswas S, Chaudhury A, Tambe SS, Kulkarni BD (2007) Estimation of gross calorific value of coals using artificial neural networks. Fuel 86(3):334–344

    Article  CAS  Google Scholar 

  • Peng C, Zhai Y, Zhu Y, Xu B, Wang T, Li C, Zeng G (2016) Production of char from sewage sludge employing hydrothermal carbonization: char properties, combustion behavior, and thermal characteristics. Fuel 176:110–118

    Article  CAS  Google Scholar 

  • Perlack RD, Eaton LM, Turhollow Jr AF, Langholtz MH, Brandt CC, Downing ME, Graham RL, Wright LL, Kavkewitz JM, Shamey AM (2011) US billion-ton update: biomass supply for a bioenergy and bioproducts industry. Accessed 16 Jul 2020

  • Pradhan P, Mahajani SM, Arora A (2018) Production and utilization of fuel pellets from biomass: a review. Fuel Process Technol 181:215–232

    Article  CAS  Google Scholar 

  • Reza MT, Uddin MH, Lynam JG, Hoekman SK, Coronella CJ (2014) Hydrothermal carbonization of loblolly pine: reaction chemistry and water balance. Biomass Convers Biorefin 4(4):311–321

    Article  CAS  Google Scholar 

  • Rousset P, Macedo L, Commandré J-M, Moreira A (2012) Biomass torrefaction under different oxygen concentrations and its effect on the composition of the solid by-product. J Anal Appl Pyrolysis 96:86–91

    Article  CAS  Google Scholar 

  • Saadat M, Khandelwal M, Monjezi M (2014) An ANN-based approach to predict blast-induced ground vibration of Gol-E-Gohar iron ore mine. Iran J Rock MechGeotechEng 6(1):67–76

    Google Scholar 

  • Saba A, Saha P, Reza MT (2017) Co-Hydrothermal Carbonization of coal-biomass blend: Influence of temperature on solid fuel properties. Fuel Process Technol 167:711–720

    Article  CAS  Google Scholar 

  • Sadiku NA, Oluyege AO, Sadiku IB (2016) Analysis of the calorific and fuel value index of bamboo as a source of renewable biomass feedstock for energy generation in Nigeria. Lignocellulose 5(1):34–49

    Google Scholar 

  • Safarian S, Unnþórsson R, Richter C (2019) A review of biomass gasification modelling. Renew SustEnerg Rev 110:378–391

    Article  CAS  Google Scholar 

  • Said KO, Onifade M, Lawal AI, Githiria JM (2020a) An artificial intelligence-based model for the prediction of spontaneous combustion liability of coal based on its proximate analysis. Combust Sci Technol:1–18

  • Said KO, Onifade M, Lawal AI, Githiria JM (2020b) Computational intelligence-based models for predicting the spontaneous combustion liability of coal. Int J Coal Prep Util:1–25

  • Saidur R, Abdelaziz E, Demirbas A, Hossain M, Mekhilef S (2011) A review on biomass as a fuel for boilers. Renew SustEnerg Rev 15(5):2262–2289

    Article  CAS  Google Scholar 

  • Saldarriaga JF, Aguado R, Pablos A, Amutio M, Olazar M, Bilbao J (2015) Fast characterization of biomass fuels by thermogravimetric analysis (TGA). Fuel 140:744–751

    Article  CAS  Google Scholar 

  • Sevilla M, Fuertes AB (2009) The production of carbon materials by hydrothermal carbonization of cellulose. Carbon 47(9):2281–2289

    Article  CAS  Google Scholar 

  • Seyedsadr S, Al Afif R, Pfeifer C (2018) Hydrothermal carbonization of agricultural residues: a case study of the farm residues-based biogas plants. Carbon Resour Convers 1(1):81–85

    Article  Google Scholar 

  • Sheng C, Azevedo J (2005) Estimating the higher heating value of biomass fuels from basic analysis data. Biomass Bioenerg 28(5):499–507

    Article  CAS  Google Scholar 

  • Silakova M (2018) Hydrothermal carbonization of the tropical biomass.

  • Stemann J, Erlach B, Ziegler F (2013) Hydrothermal carbonisation of empty palm oil fruit bunches: laboratory trials, plant simulation, carbon avoidance, and economic feasibility. Waste Biomass Valori 4(3):441–454

    Article  CAS  Google Scholar 

  • Tekin K, Karagöz S, Bektaş S (2014) A review of hydrothermal biomass processing. Renew SustEnerg Rev 40:673–687

    Article  CAS  Google Scholar 

  • Teodorescu L, Sherwood D (2008) High energy physics event selection with gene expression programming. Comput Phys Commun 178(6):409–419

    Article  CAS  Google Scholar 

  • Uzun H, Yıldız Z, Goldfarb JL, Ceylan S (2017) Improved prediction of higher heating value of biomass using an artificial neural network model based on proximate analysis. BioresourTechnol 234:122–130

    Article  CAS  Google Scholar 

  • Vargas-Moreno J, Callejón-Ferre A, Pérez-Alonso J, Velázquez-Martí B (2012) A review of the mathematical models for predicting the heating value of biomass materials. Renew SustEnerg Rev 16(5):3065–3083

    Article  CAS  Google Scholar 

  • Volpe M, Goldfarb JL, Fiori L (2018) Hydrothermal carbonization of Opuntia ficus-indica cladodes: role of process parameters on hydrochar properties. BioresourTechnol 247:310–318

    Article  CAS  Google Scholar 

  • Wang T, Zhai Y, Zhu Y, Li C, Zeng G (2018) A review of the hydrothermal carbonization of biomass waste for hydrochar formation: Process conditions, fundamentals, and physicochemical properties. Renew SustEnerg Rev 90:223–247

    Article  CAS  Google Scholar 

  • Wasserman PD (1993) Advanced methods in neural computing. Wiley, New York

    Google Scholar 

  • Wiedner K, Naisse C, Rumpel C, Pozzi A, Wieczorek P, Glaser B (2013) Chemical modification of biomass residues during hydrothermal carbonization–What makes the difference, temperature or feedstock? Org Geochem 54:91–100

    Article  CAS  Google Scholar 

  • Xiong J-B, Pan Z-Q, Xiao X-F, Huang H-J, Lai F-Y, Wang J-X, Chen S-W (2019) Study on the hydrothermal carbonization of swine manure: The effect of process parameters on the yield/properties of hydrochar and process water. J Anal Appl Pyrolysis 144:104692

    Article  CAS  Google Scholar 

  • Xu Q, Qian Q, Quek A, Ai N, Zeng G, Wang J (2013) Hydrothermal carbonization of macroalgae and the effects of experimental parameters on the properties of hydrochars. ACS Sustain Chem Eng 1(9):1092–1101

    Article  CAS  Google Scholar 

  • Yang Y, Zhang Q (1997) A hierarchical analysis for rock engineering using artificial neural networks. Rock Mech Rock Eng 30(4):207–222

    Article  Google Scholar 

  • Zhang Z, Pang S (2019) Experimental investigation of tar formation and producer gas composition in biomass steam gasification in a 100 kW dual fluidised bed gasifier. Renew Energy 132:416–424

    Article  CAS  Google Scholar 

  • Zhu Y, Si Y, Wang X, Zhang W, Shao J, Yang H, Chen H (2018) Characterization of hydrochar pellets from hydrothermal carbonization of agricultural residues. Energ Fuel 32(11):11538–11546

    Article  CAS  Google Scholar 

Download references


This work was supported by the National Research Foundation (NRF), South Africa (Grant Number: 86421). The opinions, observations, and findings are those of the authors and cannot be attributed to the NRF.


We acknowledge the funding support from the National Research Foundation (NRF), South Africa for the experimental part of this study.

Author information



JA: conceptualization, methodology, data analysis, experimental investigation, and writing of the original draft. AL: model development, software, data analysis, and validation. RS: experimental investigation and data collection. MO: review and editing of the original draft. SB: resources, supervision, and project administration. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Jibril Abdulsalam.

Ethics declarations

Ethics approval and consent to participate

Not Applicable.

Consent for publication

Not Applicable.

Competing interests

The authors declare that they have no competing interests.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

About this article

Cite this article

Abdulsalam, J., Lawal, A.I., Setsepu, R.L. et al. Application of gene expression programming, artificial neural network and multilinear regression in predicting hydrochar physicochemical properties. Bioresour. Bioprocess. 7, 62 (2020).
