Farm-heterogeneity and persistent and transient productive efficiencies in Ethiopia’s smallholder cereal farming

Studying the sources of growth in agricultural production and analyzing farm performance is an important step in assessing the developmental role of agriculture in developing countries. Knowing the level of efficiency of smallholder farms has important implications for the choice of development strategy, particularly in sub-Saharan Africa (SSA), where most countries derive over 60% of their livelihoods from agriculture and related economic activities (Maurice et al., 2014). Agriculture contributes 40% to Ethiopia’s GDP, provides employment and livelihood to more than 83% of the population, contributes 85% to the country’s total export earnings, and supplies 73% of the raw material to domestic industries (AfDB, 2011). However, the sector is rain-fed, suffers from frequent droughts, high population pressure, and severe land degradation, and is vulnerable to climate change. Despite its importance, the sector is marked by one of the lowest productivity levels in the world and is dominated by subsistence smallholders, who usually cultivate small areas averaging <1.5 hectares (FAO, 2009).

Cereals are the most important crops in the country’s crop production. As a major food crop, cereals comprise about two-thirds of the agricultural share of GDP and one-third of the national GDP. Cereals have the lion’s share of the country’s crop farming in terms of production volume, farmland, and farm households. According to ECSA (2015), cereals accounted for about 79% of the total cropped area and 85% of the grain crop production, and engaged 81% of private farms in the Meher season of the 2014–15 production year. Cereal production was marked by remarkable growth during the last decade. Several of ECSA’s yearly reports show that cereal production grew consistently from an average of 16 million metric tons (MMTs) in 2004–08 to 21.6 MMTs in 2009–14. Cereal production averaged 18.8 MMTs over the decade, showing a growth rate of 2.74% per annum. However, despite the widely held view that agriculture plays a central role in Ethiopia’s economic transformation, the sector has not performed to its potential. Furthermore, as Kassahun (2011) shows, the sector is characterized by inefficiencies and poor productivity, with cereals showing a steady low growth rate over the last two decades. This underlines the importance of knowing the performance or efficiency levels of cereal-producing farms in Ethiopia. Such information will help enhance food security, which is an important issue, and will also inform policymakers in agrarian countries such as Ethiopia.

In the efficiency literature, since the pioneering work of Farrell (1957), various studies have examined efficiency in crop farming in different countries using different methodologies. Economic efficiency is the product of technical and allocative efficiency. The focus of this paper is only on technical efficiency (TE), which is a measure of the effectiveness with which a given set of inputs such as labor, capital, land, seeds, and technology is used for producing an output such as crops. A farm is said to be technically efficient if it produces the maximum attainable output from a given quantity of inputs. Over the years, various methods of estimating production frontiers have been developed for predicting reliable efficiency measures. These frontier methods vary from the parametric stochastic frontier analysis (SFA) to the non-parametric data envelopment analysis (DEA) method. SFA has an advantage in modeling input–output relations while controlling for producer heterogeneity with the production environment and management factors, assuming a functional form. On the other hand, DEA is based only on input–output relations, and inefficiency is marked by errors associated with left-out variables. The DEA method is also sensitive to outliers, but is immune to functional form assumptions and the choice of estimation method. The stochastic production frontier (SPF) model introduced by Aigner et al. (1977) has been extended over the years to accommodate different circumstances (Battese and Coelli, 1992, 1995; Jondrow et al., 1982; Kumbhakar, 1991; Pitt and Lee, 1981; Schmidt and Sickles, 1984). The model has been extensively used for estimating TE of decision-making units at different levels of aggregation. Unlike the average production function, which is based on fitting an average function, the stochastic production frontier is consistent with the objective of output maximization and it estimates a frontier production function that is stochastic. In particular, SPF models are found to better fit agricultural efficiency analyses, because agriculture experiences higher noise in the data as a result of the stochastic nature of the production process and large yield variabilities.

However, efficiency results of such earlier models are sensitive to the way they are modeled and interpreted and to the assumptions underlying the models mainly when panel data is used (Kumbhakar et al., 2014, 2015). The main reason for the different assumptions is that when panel data is available, a farm’s productive efficiency is composed of time-invariant (persistent) and time-variant (transient) components of efficiency. These cannot be captured distinctively by the earlier SPF models. In addition, these models do not treat explicitly unobservable farm heterogeneity effects of inefficiency. Thus, the models generate a misspecification bias and the effects of these factors may be captured by the inefficiency term thereby producing biased efficiency results. However, when panel data became available recently, panel data models were developed (Colombi et al., 2014; Filippini and Greene, 2016; Kumbhakar et al., 2014; Tsionas and Kumbhakar, 2014), which allow separating inefficiency’s two components of time-invariant and time-variant inefficiency, along with disentangled heterogeneity and random error effects.

Several empirical studies have investigated the efficiency of Ethiopia’s crop farming using different methodologies. However, there have only been limited attempts at studying farming efficiency by applying panel data SPF models. Most of the studies use the simpler specifications of Battese and Coelli’s (1992, 1995) efficiency models, which confound farm heterogeneity with farm-specific TE. Moreover, to the best of our knowledge, no existing study provides separate estimates of the two components of inefficiency or the disentangled heterogeneity effects of inefficiency. Estimates of time-invariant or persistent inefficiency provide useful information about the farms because high persistent inefficiency scores are indicators of non-competitiveness and of costly policies for inducing small changes. This part of inefficiency may be due to structural problems in the organization of the farms’ production processes or the presence of systematic shortfalls in managerial capabilities, farms’ lasting habit of wasting inputs, or the quality of land and climatic conditions. The time-variant or transient part of inefficiency, on the other hand, may stem from temporal behavioral aspects of the management, for example, from a nonoptimal use of some inputs due to the presence of non-systematic management problems that can be solved in the short term. Further, as discussed by Kumbhakar et al. (2015), knowing the estimates of the two inefficiency components, especially in long panels, and their separation from heterogeneity effects is important as this allows the farms to elicit their resource- or cost-saving potential in the short run as well as in the long run. Each component provides different information with different policy implications for promoting efficiency in the use of scarce resources.

Accordingly, this study applies a recently proposed four-component random error panel data SPF model (Kumbhakar et al., 2014) for estimating persistent and transient efficiency and disentangling them from unobserved farm heterogeneity effects and random errors for smallholder cereal farms in Ethiopia using a partially balanced panel dataset. We also compare the results of this model with three other stochastic frontier (SF) production models in which one of the four components is not accounted for. The study contributes to the existing literature in three ways. First, it provides one of the first empirical analyses showing the presence of persistent and transient inefficiency in Ethiopia’s smallholder cereal farming using a novel econometric approach, a four-component random error panel data SPF model. Second, to the best of our knowledge, this is the first panel data analysis of production efficiency in Ethiopia’s crop farming that addresses individual and farm heterogeneities and disentangles farm heterogeneity from inefficiency effects. Third, it analyzes agro-ecological zones (AEZs), considering cereal farming at the farm level, which makes it replicable elsewhere in the country, between regions, and within AEZs. Thus, this study provides valuable information on persistent and transient inefficiency and farm heterogeneity effects.

The rest of this study is organized as follows. The method and data are presented in Section 2 whereas Section 3 discusses the empirical results and their implications. Section 4 provides the summary and conclusions.

2 Method and data

2.1 Panel data stochastic production frontier models

Measuring and comparing producer performance (efficiency) is an important topic of research in applied economics. Since the pioneering work of Farrell (1957) on productive efficiency, various modifications and improvements have been made to the measurement of production efficiency. Efficiency can be achieved by maximizing output for given inputs or by minimizing inputs for a given output and technology. Aigner and Chu (1968) translated Farrell’s frontier concept into a production function and described it in the input-oriented approach, in which TE is the ability to produce a given level of output using a minimum quantity of inputs with a certain technology. Alternatively, as we do in this paper, TE can be obtained following an output-oriented approach, in which the objective is maximizing output for given inputs and technology. TE is an engineering concept and refers to the physical input–output relationship. Numerically, TE takes values between zero and one (0 ≤ TE ≤ 1), where a value of one shows that the firm is fully technically efficient whereas zero means that inputs are being used for producing zero output. A firm/farm is said to be efficient if it operates on the production frontier, which provides the input/output ratios of the most efficient use of inputs for producing the output. Deviations of the observed ratios from this frontier are associated with technical inefficiency: a firm/farm is technically inefficient when it fails to achieve the maximum output using the given inputs, that is, when it fails to operate on the production frontier. In a production performance analysis, efficiency is determined by the frontier model. Selecting an appropriate model for estimating efficiency and interpreting the results may not be straightforward, as the results depend on the functional form and the way the model is specified.

Most previous studies on efficiency analyses are based on Farrell type measures of efficiency, where the researchers focus on deciding which functional form to use and the model specifications. However, over the years, various other methods of specifying and estimating production frontiers have been developed to come up with more reliable efficiency measures.

Among these, the SF model originally proposed by Aigner et al. (1977) has been used for estimating and comparing TE of individual production units within a geographic location, an industry, or an agricultural sector. As a result, SF has been considered a standard approach for evaluating efficiency in the production of products and services at a variety of levels of aggregation and research areas.

Extensive research in this field resulted in the rapid development of econometric techniques concerning specifications, estimations, and testing issues of frontier models. Literature is broadly divided into parametric versus non-parametric, cross-section versus panel data, time-variant versus time-invariant inefficiencies, static versus dynamic models, and various distributional assumptions, estimation methods, heteroscedasticity, and autocorrelation explaining inefficiency by its possible determinants. These techniques were developed rapidly and implemented in many areas using mostly cross-sectional and panel data. The use of panel data models in estimating producers’ efficiency helps in avoiding some of the problems related to distributional assumptions encountered in the cross-section approach.

According to Schmidt and Sickles (1984), when inefficiency is time-invariant, panel data enables one to estimate inefficiency consistently without distributional assumptions. Panel data also has the advantage of separating individual- and time-specific effects from combined effects (Heshmati et al., 1995). Furthermore, panel data enables one to control for individual heterogeneity effects; it offers greater variability, less collinearity between variables, more degrees of freedom, and more estimation efficiency. Panel data also enables one to identify and measure effects that are not detected in cross-sectional or time-series data.

Consider a sample of N farms operating in time period t that use various inputs to produce a non-negative output using a technology (production frontier). Then, the panel data versions of the standard 1980s SPF models can be written as:

(1)

$$Y_{it}=f\left(X_{it};\beta\right)\cdot\exp\left(\phi_{it}\right)=f\left(X_{it};\beta\right)\cdot\exp\left(\varepsilon_{it}-\tau_{it}\right)$$

where i = 1, ..., N indexes farms and t = 1, ..., T indexes time periods. The variable Y_it represents the output produced by farm i in time period t, the function f(.) is the SPF, X_it is a vector of input variables of the ith farm in time period t (plus other exogenous/control variables), and β is a vector of unknown parameters to be estimated. ϕ_it is a composed error term with two components, ε_it and τ_it. The two-sided component ε_it is a symmetric random error that accounts for statistical noise; it captures random variations in output resulting from factors outside the control of the farm as well as measurement errors and left-out explanatory variables. It is assumed to be independently and identically distributed (i.i.d.) normal with zero mean and constant variance σ_ε², that is, ε_it ~ iid N(0, σ_ε²), and to be independent of τ_it. Similarly, the one-sided component τ_it ≥ 0 reflects the technical inefficiency of the ith farm in year t relative to the SF; it is assumed to be i.i.d. half-normal (based on a zero-mean normal), that is, $\tau_{it}=\left|\mathrm{T}_{it}\right|$, where $\mathrm{T}_{it}\sim iid\,N^{+}\left(0,\sigma_{\mathrm{T}}^{2}\right)$.

Now by taking the logarithms of both sides in Equation (1) the panel data SPF model can be written as:

(2)

$$y_{it}=\alpha_{0}+x_{it}^{\prime}\beta+\varepsilon_{it}-\tau_{it}=\alpha_{0}+x_{it}^{\prime}\beta+\phi_{it}$$

Here y_it is the logarithm of the output variable and x_it is the vector of logarithms of the input variables. The parameter α₀ is a common intercept, whereas the other variables retain their previous definitions. Building on this specification, a number of panel data SPF models have been developed, leading to alternative measures of technical inefficiency.
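The data-generating process in Equation (2) can be sketched with a short simulation. This is only an illustration under stated assumptions: a single log input, made-up parameter values, and a hypothetical helper name `simulate_spf_panel`; none of these come from the study's data or estimates.

```python
import math
import random

def simulate_spf_panel(n_farms=50, n_periods=4, alpha0=1.0, beta=0.6,
                       sigma_eps=0.2, sigma_tau=0.3, seed=0):
    """Draw (farm, period, x, y, tau) records from Equation (2).

    All parameter values are illustrative assumptions, not estimates.
    """
    rng = random.Random(seed)
    rows = []
    for i in range(n_farms):
        for t in range(n_periods):
            x = rng.uniform(0.0, 2.0)             # log input (arbitrary range)
            eps = rng.gauss(0.0, sigma_eps)       # two-sided noise
            tau = abs(rng.gauss(0.0, sigma_tau))  # half-normal inefficiency, tau >= 0
            y = alpha0 + beta * x + eps - tau     # log output, Equation (2)
            rows.append((i, t, x, y, tau))
    return rows

rows = simulate_spf_panel()
# Technical efficiency implied by the simulated inefficiency: TE_it = exp(-tau_it)
te = [math.exp(-tau) for (_, _, _, _, tau) in rows]
mean_tau = sum(r[4] for r in rows) / len(rows)
```

Because τ_it ≥ 0, every simulated TE lies in (0, 1], and observed log output lies below the noise-shifted frontier α₀ + x′β + ε_it.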

Panel data SPF models introduced in the early 1980s assumed inefficiency to be individual-specific and time-invariant, that is, inefficiency levels may be different for different producers, but they do not change over time, meaning that an inefficient producer never learns to improve their performance. This might be the case in some situations where, for example, the soil quality is poor and farms lack water sources for irrigation, or inefficiency is associated with managerial abilities and there is no change in the management and in production technology for any of the farms during the period of study (Kumbhakar et al., 2014, 2015). In general, however, the assumption seems unrealistic, particularly when production competition is considered and technology is continuously developing.

Another drawback of this approach is that farms’ unobserved heterogeneity cannot be distinguished from inefficiency. It is mixed with time-invariant inefficiency. This raises some related questions that need to be considered, such as whether inefficiency has been persistent over time or whether it is time-varying. An additional key question that needs to be considered with regard to time-invariant individual effects is whether an individual effect represents persistent inefficiency, or whether the effect is independent of inefficiency and it captures time-invariant unobserved heterogeneity. The question here is, should one view the time-invariant effects as persistent inefficiency or as farm heterogeneity that captures the effects of (unobserved) time-invariant covariates and as such is unrelated to inefficiency.

Although several panel data SF models discussed earlier can separate farm heterogeneity from transient inefficiency, none or very few of these models consider persistent technical inefficiency. Related to these questions, as discussed in Colombi et al. (2014) and Kumbhakar et al. (2014, 2015), several panel data SPF models were developed to include both time-invariant and time-varying inefficiency effects. Some of these models have been developed based on the assumption that all the time-invariant (fixed or random) effects are persistent inefficiency (for example, Pitt and Lee, 1981; and Schmidt and Sickles, 1984). Other models have been developed based on the assumption that the time-variant effect is transient inefficiency without considering the heterogeneity effects (for example, Battese and Coelli, 1992; and Lee and Schmidt, 1993), or farm effects have been separated from transient inefficiency without considering the possibility of persistent inefficiency (for example, Greene, 2005a, 2005b). The models proposed by Kumbhakar (1991) and Kumbhakar and Heshmati (1995) lie in between. Their models treat farm effects as persistent inefficiency and include another component for capturing transient inefficiency.

Some recently developed panel models provide information on whether a farm is characterized by the presence of both types of inefficiency and are concerned with the separation of inefficiency from heterogeneity effects (Colombi et al., 2014; Filippini and Greene, 2016; Kumbhakar et al., 2014; Tsionas and Kumbhakar, 2014), which may overcome some of the limitations of earlier approaches. These models propose an error structure that is decomposed into four components, thus making it possible to account separately for the usual noise in the data, individual time-invariant heterogeneity, time-variant transient (short-term) inefficiency, and time-invariant persistent (long-term) inefficiency. Thus, distinct estimates of each component of inefficiency are obtained, separated from each other and disentangled from unobserved heterogeneity effects, which is very important. Both components of productive efficiency are equally essential as they provide different information with different policy implications for promoting efficiency in the use of scarce resources. Herein, transient inefficiency is interpreted as short-term productive inefficiency associated with changes in efficiency resulting from changes in managerial skills or adoption of new technologies. This part of inefficiency may stem from temporal behavioral aspects of the management, for example, from a nonoptimal use of some inputs or from non-systematic management problems that can be solved in the short term without operational changes in a farm or any major policy changes. The assumption in time-varying models is that inefficiency in the current period is independent of the inefficiency in the previous period.

In contrast, persistent inefficiency is long-term productive inefficiency due to structural or institutional factors which evolve slowly over time. As persistent inefficiency is time-invariant, it can only be changed in the long term through restructuring or changes in farm ownership. Information on persistent inefficiency is important especially in short panels because it reflects the effects of inputs such as management as well as other unobserved inputs that vary across firms but not over time. This part of productive inefficiency may be due to the presence of structural problems in the organization of the production process or the presence of systematic shortfalls in managerial capabilities, regulations, inefficient infrastructure, or the management’s lasting habit of wasting inputs. Thus, unless there is a change in something that affects management practices at the farm level (such as changes in ownership or new government regulations), persistent inefficiency is unlikely to change. While persistent inefficiency and unobserved farm heterogeneity are both time-invariant effects, a major difference between them is that the latter is always beyond the control of the farms (for example, the geological/locational makeup of a farm or other physical features). Thus, distinguishing and measuring the two components of productive efficiency is informative as it allows the farms to use their resource- or cost-saving potential both in the short term and in the long term.

In view of these issues, this paper applies an alternative econometric approach for estimating inefficiency based on SPF models that allow a distinction between persistent and transient inefficiency by disentangling persistent inefficiency from unobserved heterogeneity and time-varying inefficiency from random noise. Accordingly, in line with Heshmati et al. (2018) and Rashidghalam et al. (2016), we use four alternative SPF panel data models, categorized in terms of the assumptions made about the temporal behavior of inefficiency and the separation of inefficiency effects from unobservable individual heterogeneity effects. A common feature of all the models is that inefficiency is farm-specific. The first model (Model 1 or fixed-effects (FE) model) is the basic version of the panel models: the FE model by Schmidt and Sickles (1984), which assumes inefficiency effects to be farm-specific but time-invariant, offering a persistent/long-run inefficiency estimate. The second model (Model 2 or true fixed-effects (TFE) panel model), proposed by Greene (2005a), separates transient/short-run inefficiency from persistent individual effects. Model 2 allows inefficiency to be farm-specific and time-variant and separates inefficiency from unobserved farm heterogeneity. Models 3 and 4 both distinguish persistent from transient inefficiency, but only Model 4 also separates them from unobservable heterogeneity effects. The third model (Model 3 or KH) is a three-error component panel data model introduced by Kumbhakar and Heshmati (1995) that offers estimates of persistent and transient inefficiencies without accounting for farm heterogeneity. The fourth model (Model 4 or KLH) is a recently developed four-error component panel data model introduced by Kumbhakar et al. (2014) that provides estimates of persistent and transient inefficiency, separating them from time-invariant farm effects and random noise.

2.2 Model specifications and estimation procedures

2.2.1 Model specifications

In line with Heshmati et al. (2018) and Karagiannis and Tzouvelekas (2009), who provide a comparison of alternative specifications of inefficiency based on the same data, in this section, we present the specifications of the four SPF panel data models used in this study. The specifications of all the models are based on the formulation of the model given in Equation (2).

2.2.1.1 Model 1: Individual effects treated as long-run inefficiency

To specify a model with time-invariant inefficiency effects we treat the term, τ_it in Equation (2) as time-invariant inefficiency u_i to represent long-run inefficiency:

(3)

$$y_{it}=\alpha_{0}+f\left(x_{it};\beta\right)+\varepsilon_{it}-u_{i};\quad u_{i}\ge 0$$

This model utilizes the panel feature of the data via u_i and can be estimated either by treating the inefficiency component as a fixed parameter in an FE model or as a random variable in a random-effects model, assuming ε_it and u_i are homoscedastic. However, this model has been criticized for its assumption that inefficiency is time-invariant, which seems unrealistic, especially for long panel datasets. In addition, the inefficiency term may capture time-invariant farm attributes, such as individual instinctive abilities and other persistent farm heterogeneities, which are unrelated to the production process but which affect output. These factors may thus be mixed up with inefficiency, leading to a misspecified model that tends to overestimate inefficiency levels.

2.2.1.2 Model 2: Individual effects treated as heterogeneity

To overcome the drawbacks of the FE model, Greene (2005a) proposed an extension of this model, called the TFE model. The purpose of this model is to treat time-invariant farm heterogeneity and transient inefficiency effects separately. Hence, treating the inefficiency term τ_it in Equation (2) as time-varying and splitting the error term as ε_it = μ_i + v_it, the model is written as:

(4)

$$y_{it}=\alpha_{0}+f\left(x_{it};\beta\right)+\mu_{i}+v_{it}-\tau_{it}$$

where μ_i captures any time-invariant farm heterogeneity and not inefficiency, τ_it represents transient inefficiency, and v_it is a random shock, with the following distributions:

(5)

$$\tau_{it}\sim N^{+}\left(0,\sigma_{\tau}^{2}\right),\quad v_{it}\sim N\left(0,\sigma_{v}^{2}\right),\quad\text{and}\quad\mu_{i}\sim N\left(0,\sigma_{\mu}^{2}\right)$$

In this model, if we treat μ_i as fixed parameters that do not capture inefficiency, the model becomes a TFE model.

The TFE model allows inefficiency to be time-variant and controls for farm heterogeneity. However, the model views individual effects as being different from inefficiency and assumes that the inefficiency term is always transient. Thus, it fails to capture persistent inefficiency. The individual effects cannot be distinguished from persistent inefficiency, and the persistent component of inefficiency is completely absorbed in a farm’s heterogeneity effects. Hence, time-invariant effects and inefficiency are not cleanly separated, and $\breve{\tau}_{it}$ might pick up farm heterogeneity in addition to or even instead of inefficiency (Kumbhakar and Heshmati, 1995). Consequently, the model is misspecified and tends to underestimate inefficiency levels and overestimate the efficiency scores.

2.2.1.3 Model 3: Individual effects treated as persistent inefficiency

To overcome the downward bias in the TFE model’s inefficiency estimates and its disregard of the persistent inefficiency component, Kumbhakar and Heshmati (1995) proposed a model that treats individual effects as persistent inefficiency and decomposes overall inefficiency into persistent and transient components.

To formalize the model, we split the inefficiency term, τ_it in Equation (2) as τ_it = η_i + u_it to obtain the model:

(6)

$$\begin{aligned} y_{it} &= \alpha_{0}+f\left(x_{it};\beta\right)+\phi_{it}, \\ \phi_{it} &= \varepsilon_{it}-\tau_{it}\quad\text{and}\quad\tau_{it}=\eta_{i}+u_{it},\quad\text{so that} \\ y_{it} &= \alpha_{0}+f\left(x_{it};\beta\right)+\varepsilon_{it}-\eta_{i}-u_{it} \end{aligned}$$

This model, labeled the KH model, splits the error term into three components, ϕ_it = ε_it – η_i – u_it, where ε_it captures a random shock, η_i ≥ 0 captures persistent inefficiency, and u_it ≥ 0 captures transient inefficiency. Unlike the TFE model, the KH model does not consider any time-invariant farm heterogeneity effects, so farm heterogeneity is mixed with individual persistent inefficiency. Consequently, the model is again misspecified and is likely to produce upwardly biased persistent inefficiency estimates.

2.2.1.4 Model 4: Separation of individual heterogeneity from persistent inefficiency

To overcome the limitations of the first three models, Colombi et al. (2014), Kumbhakar et al. (2014), and Tsionas and Kumbhakar (2014) proposed a model that splits the error term into four components: persistent inefficiency, transient inefficiency, the farm heterogeneity effect, and random noise. Hence, we specify a model that distinguishes between persistent and transient inefficiency and separates persistent inefficiency from farm heterogeneity effects. Following Kumbhakar et al.’s (2014) decomposition, assume $\tau_{it}=\eta_{i}+u_{it}$ and $\varepsilon_{it}=\mu_{i}+v_{it}$ in Equation (1) to obtain:

(7)

$$y_{it}=\alpha_{0}+f\left(x_{it};\beta\right)+\mu_{i}+v_{it}-\eta_{i}-u_{it}$$

This model, labeled the KLH model, decomposes the error term ϕ_it into four components as $\phi_{it}=\mu_{i}+v_{it}-\eta_{i}-u_{it}$, where μ_i is the farm heterogeneity effect (for example, farm management and soil quality), v_it is the idiosyncratic random component, the one-sided η_i ≥ 0 captures persistent inefficiency, and the one-sided u_it ≥ 0 captures transient inefficiency effects. Without μ_i, Equation (7) reduces to the KH model, and without η_i it is the same as the TFE model.
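For reference, the composed error terms of the four specifications in Equations (3), (4), (6), and (7) can be set side by side. This is only a restatement of those equations in one place; the last column paraphrases how each model interprets its time-invariant term.

```latex
\begin{array}{lll}
\text{Model} & \text{Composed error } \phi_{it} & \text{Time-invariant term read as} \\
\hline
\text{1 (FE)}  & \varepsilon_{it} - u_{i}              & \text{persistent inefficiency} \\
\text{2 (TFE)} & \mu_{i} + v_{it} - \tau_{it}          & \text{farm heterogeneity} \\
\text{3 (KH)}  & \varepsilon_{it} - \eta_{i} - u_{it}  & \text{persistent inefficiency} \\
\text{4 (KLH)} & \mu_{i} + v_{it} - \eta_{i} - u_{it}  & \text{both, as separate terms}
\end{array}
```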

2.2.2 Models’ estimation procedures

To estimate the FE model, we reformulate Equation (3) to obtain the following estimable model:

(8)

$$\begin{aligned} y_{it} &= \alpha_{0}+f\left(x_{it};\beta\right)+\varepsilon_{it}-u_{i}=\left(\alpha_{0}-u_{i}\right)+f\left(x_{it};\beta\right)+\varepsilon_{it} \\ &= \alpha_{i}+f\left(x_{it};\beta\right)+\varepsilon_{it} \end{aligned}$$

Equation (8) is like a standard FE panel data model (Schmidt and Sickles, 1984), where α_i = α₀ – u_i is a farm-specific intercept. Here, u_i and α_i are individual farm effects and are assumed to be fixed parameters to be estimated along with the vector of slope parameters β. One can apply the standard FE panel data estimation method to obtain $\hat{\alpha}_{i}$, and the following transformation to obtain an estimate for u_i:

(9)

$$\hat{u}_{i}=\max_{i}\left(\hat{\alpha}_{i}\right)-\hat{\alpha}_{i}\ge 0,\quad i=1,\ldots,N$$

Farm-specific TE is estimated as $TE_{i}=\exp\left(-\hat{u}_{i}\right)$. This formulation implicitly assumes that the efficiency of the most efficient unit in the sample is 100%, so the inefficiency of other farms is measured relative to the best farm in the sample.
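The FE procedure in Equations (8) and (9) can be sketched in a few lines. The example below is a minimal illustration with a single regressor and made-up data; the function names `within_fe` and `schmidt_sickles_te` and the toy panel are hypothetical, and the study itself estimates a full translog specification.

```python
import math


def within_fe(y, x, ids):
    """One-regressor within (FE) estimator: returns slope and farm intercepts."""
    groups = {}
    for yi, xi, i in zip(y, x, ids):
        groups.setdefault(i, []).append((yi, xi))
    # Farm-level means of y and x
    means = {i: (sum(p[0] for p in obs) / len(obs),
                 sum(p[1] for p in obs) / len(obs))
             for i, obs in groups.items()}
    # Slope from demeaned data
    sxy = sum((xi - means[i][1]) * (yi - means[i][0])
              for yi, xi, i in zip(y, x, ids))
    sxx = sum((xi - means[i][1]) ** 2 for xi, i in zip(x, ids))
    beta = sxy / sxx
    # Recover farm-specific intercepts alpha_i = ybar_i - beta * xbar_i
    alpha = {i: m[0] - beta * m[1] for i, m in means.items()}
    return beta, alpha


def schmidt_sickles_te(alpha):
    """Equation (9): u_i = max(alpha) - alpha_i, then TE_i = exp(-u_i)."""
    best = max(alpha.values())
    return {i: math.exp(-(best - a)) for i, a in alpha.items()}


# Toy panel (hypothetical): two farms, three periods, y = alpha_i + 0.5 * x
ids = ["A", "A", "A", "B", "B", "B"]
x = [1.0, 2.0, 3.0, 1.0, 3.0, 5.0]
y = [2.5, 3.0, 3.5, 2.0, 3.0, 4.0]

beta, alpha = within_fe(y, x, ids)
te = schmidt_sickles_te(alpha)
```

Because the best farm is assigned an efficiency of exactly one, every other score is a relative measure and depends on which farms are in the sample.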

We estimate the TFE model by making distributional assumptions of the random error. Different estimation methods have been proposed for estimating KH and KLH models. Colombi et al. (2014) used a single stage maximum likelihood estimation (MLE) method based on the distributional assumptions of the four-error components. Kumbhakar and Heshmati (1995) and Kumbhakar et al. (2014) used a multi-step procedure, whereas Filippini and Greene (2016) used the simulated ML approach. In this paper we use the multi-step estimation procedure suggested by Kumbhakar et al. (2014) for its simplicity for the KH and KLH models. The multi-step procedure has the advantage of avoiding strong distributional assumptions for estimating the model. In what follows we present the multi-step approach for the two models.

The KH model can be estimated in four steps. The steps are described in Kumbhakar et al. (2015). For this we rewrite the model in Equation (6) as:

(10)

$$\begin{aligned} y_{it} &= \alpha_{i}+f\left(x_{it};\beta\right)+\omega_{it}, \\ \alpha_{i} &= \alpha_{0}-\eta_{i}-E\left(u_{it}\right) \quad\text{and}\quad \omega_{it}=\varepsilon_{it}-\left(u_{it}-E\left(u_{it}\right)\right). \end{aligned}$$

In this case the error component ω_it has zero mean and constant variance. Thus, the model in Equation (10), which fits the standard panel data model with individual effects, can be estimated either by the least squares dummy variable (LSDV) method or by the generalized least squares (GLS) method. Under the LSDV framework, the model is estimated in four steps. In step 1, we estimate Equation (10) using the standard within FE panel data estimator to obtain consistent estimates of the β vector. In step 2, we estimate $\hat{\eta}_{i}$, which can be used for estimating persistent TE, $PTE_{i}=\exp\left(-\hat{\eta}_{i}\right)$. In step 3, using the standard half-normal SF model, we estimate α₀ and the parameters associated with ε_it and u_it. Finally, in step 4, we use Jondrow et al.’s (1982) technique, based on the mean or mode of the conditional distribution of u given ε, to estimate the transient inefficiency u_it. This procedure yields the predicted transient inefficiency component, $\hat{u}_{it}$, which can be used for estimating transient TE, $RTE_{it}=\exp\left(-\hat{u}_{it}\right)$. The overall technical efficiency (OTE) is then obtained as the product of persistent and transient efficiencies, that is, OTE_it = PTE_i × RTE_it.
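Step 4 and the final combination can be illustrated with the conditional-mean predictor of Jondrow et al. (1982) for a normal/half-normal production frontier. The sketch below assumes the standard deviations `sigma_u` and `sigma_v` are already available from step 3; the function names and the numerical inputs are hypothetical and serve only to show the mechanics.

```python
import math


def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def jlms_mean(eps, sigma_u, sigma_v):
    """E(u | eps) for eps = v - u with u half-normal and v normal."""
    s2 = sigma_u ** 2 + sigma_v ** 2
    mu_star = -eps * sigma_u ** 2 / s2              # conditional location
    sigma_star = sigma_u * sigma_v / math.sqrt(s2)  # conditional scale
    z = mu_star / sigma_star
    return mu_star + sigma_star * normal_pdf(z) / normal_cdf(z)


def overall_te(eta_hat, eps, sigma_u, sigma_v):
    """Return (PTE_i, RTE_it, OTE_it) with OTE_it = PTE_i * RTE_it."""
    pte = math.exp(-eta_hat)
    rte = math.exp(-jlms_mean(eps, sigma_u, sigma_v))
    return pte, rte, pte * rte


# Hypothetical inputs: eta_hat from step 2, residual and sigmas from step 3
pte, rte, ote = overall_te(eta_hat=0.2, eps=-0.1, sigma_u=0.3, sigma_v=0.2)
```

A more negative residual implies a larger predicted u_it and therefore a lower RTE_it, while PTE_i stays fixed for a farm across periods.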

To estimate the KLH model, we reformulate the model in Equation (7) as:

(11)

$$y_{it}=\alpha_{0}^{*}+f\left(x_{it};\beta\right)+\alpha_{i}+\omega_{it}$$

where $\alpha_{0}^{*}=\alpha_{0}-E\left(\eta_{i}\right)-E\left(u_{it}\right)$, $\alpha_{i}=\mu_{i}-\eta_{i}+E\left(\eta_{i}\right)$, and $\omega_{it}=v_{it}-u_{it}+E\left(u_{it}\right)$. With this specification α_i and ω_it have zero mean and constant variance. As Equation (11) is a familiar panel data model, we use a three-step approach for estimating the KLH model, much like the previous case. In the first step, the standard FE panel regression is used for estimating $\hat{\beta}$. This procedure also gives predicted values of α_i and ω_it, denoted by $\hat{\alpha}_{i}$ and $\hat{\omega}_{it}$. In step 2, we estimate the time-varying TE by applying the standard SF technique to the predicted value $\hat{\omega}_{it}$ from the previous step. This procedure predicts the time-varying transient technical inefficiency, which can be used for estimating $RTE_{it}=\exp\left(-u_{it}\mid\hat{\omega}_{it}\right)$. In step 3, we estimate η_i following a similar procedure as in step 2, applied to $\hat{\alpha}_{i}$. For this, we use the standard pooled half-normal SF model to obtain estimates of the persistent inefficiency component η_i. Then PTE can be estimated as $PTE_{i}=\exp\left(-\hat{\eta}_{i}\right)$, and $OTE_{it}=PTE_{i}\times RTE_{it}$.
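The reparameterization behind Equation (11) can be checked by adding and subtracting the means of the two one-sided components in Equation (7). The short derivation below is a sketch of that step, assuming $E(\mu_{i})=0$ and $E(v_{it})=0$; it shows that the centered farm effect must carry $+E(\eta_{i})$ for the terms to cancel and for both new error components to have zero mean.

$$\begin{aligned}
y_{it} &= \alpha_{0}+f\left(x_{it};\beta\right)+\mu_{i}+v_{it}-\eta_{i}-u_{it} \\
&= \underbrace{\left[\alpha_{0}-E\left(\eta_{i}\right)-E\left(u_{it}\right)\right]}_{\alpha_{0}^{*}}
+f\left(x_{it};\beta\right)
+\underbrace{\left[\mu_{i}-\eta_{i}+E\left(\eta_{i}\right)\right]}_{\alpha_{i}}
+\underbrace{\left[v_{it}-u_{it}+E\left(u_{it}\right)\right]}_{\omega_{it}}, \\
E\left(\alpha_{i}\right) &= E\left(\mu_{i}\right)-E\left(\eta_{i}\right)+E\left(\eta_{i}\right)=0,
\qquad
E\left(\omega_{it}\right)=E\left(v_{it}\right)-E\left(u_{it}\right)+E\left(u_{it}\right)=0 .
\end{aligned}$$

The constants $E(\eta_{i})$ and $E(u_{it})$ are absorbed into the intercept, which is why the first-step panel regression does not require distributional assumptions on the one-sided terms.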

2.3

The empirical model

The production function f (x_it; β) in Models 1–4 is specified using a translog (TL) functional form because of its flexibility (Christensen et al., 1973). Hence, assuming a TL form with the time trend representation of technical change (TC), we estimate a SF panel data model using the specification:

(12)

$$\begin{aligned} \ln Y_{it}=\alpha_{0} &+\sum_{j=1}^{J} \beta_{j} \ln X_{jit}+\beta_{t} T_{t}+\frac{1}{2}\left(\sum_{j=1}^{J} \sum_{k=1}^{J} \beta_{jk} \ln X_{jit} \ln X_{kit}+\beta_{tt} T_{t}^{2}\right) \\ &+\sum_{j=1}^{J} \beta_{jt} \ln X_{jit} T_{t}+\varepsilon_{it}-\tau_{it} \end{aligned}$$

where ln Y_it is the natural logarithm of output produced by farm i (i = 1, 2, …, N) in time period t (t = 1, 2, …, T); ln X_jit is the natural logarithm of input j (j = 1, 2, …, J); and T_t is a time trend, a proxy for the exogenous rate of technological change. All the other variables (α, β, ε, and τ) retain their previous definitions as in Equations (1) and (2).

2.3.1

Input elasticities (E), technical change, and return to scale

As the coefficients of the TL production function do not have direct interpretations because of squares and interaction, we compute the elasticities of output with respect to each input. As all input variables are expressed in their logarithms, the elasticities can be simply obtained from a partial differentiation of the production function with respect to appropriate inputs as:

(13)

$$E_{jit}=\frac{\partial \ln y_{it}}{\partial \ln X_{jit}}=\beta_{j}+\sum_{k=1}^{J}\beta_{jk}\ln X_{kit}+\beta_{jt}T_{t},$$

and the rate of TC and returns to scale (RTS) are obtained from:

(14)

$$TC_{it}=\frac{\partial \ln y_{it}}{\partial T_{t}}=\beta_{t}+\beta_{tt}T_{t}+\sum_{j=1}^{J}\beta_{jt}\ln X_{jit}\quad\text{and}\quad RTS_{it}=\sum_{j=1}^{J}E_{jit}.$$

The elasticity measures the responsiveness of output to a 1% change in the jth input used by farm i, at time t. TC is the percentage change in output due to an increase in time measured in years for unchanged input use. RTS measures the percentage change in output in response to a proportional 1% increase in all inputs simultaneously. Technology is said to be exhibiting increasing, constant, or decreasing RTS, respectively, if RTS is greater than, equal to, or less than one. Like farm level efficiency, all input elasticities, RTS, and TC are farm- and time-variant and computed at every data point.
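The calculation of elasticities, TC, and RTS at a single data point can be sketched as follows. The elasticity here is the full derivative of Equation (12) with respect to ln X_j, which includes the cross-product terms under symmetric second-order coefficients (β_jk = β_kj). All coefficient values and the function name `translog_measures` are hypothetical and are not estimates from Table 2.

```python
def translog_measures(lnx, t, b, bjk, bjt, bt, btt):
    """Return (elasticities, TC, RTS) of a translog frontier at one data point.

    lnx : list of log inputs, t : time trend value
    b   : first-order input coefficients
    bjk : symmetric matrix of second-order input coefficients
    bjt : input-by-time interaction coefficients
    bt, btt : first- and second-order time coefficients
    """
    n_inputs = len(lnx)
    elasticities = [
        b[j] + sum(bjk[j][k] * lnx[k] for k in range(n_inputs)) + bjt[j] * t
        for j in range(n_inputs)
    ]
    tc = bt + btt * t + sum(bjt[j] * lnx[j] for j in range(n_inputs))
    rts = sum(elasticities)
    return elasticities, tc, rts


def classify_rts(rts, tol=1e-9):
    """Label returns to scale relative to one."""
    if rts > 1.0 + tol:
        return "increasing"
    if rts < 1.0 - tol:
        return "decreasing"
    return "constant"


# Hypothetical two-input example
elasticities, tc, rts = translog_measures(
    lnx=[1.0, 2.0], t=2,
    b=[0.4, 0.3],
    bjk=[[0.02, -0.01], [-0.01, 0.04]],
    bjt=[0.01, -0.02],
    bt=0.05, btt=0.01,
)
label = classify_rts(rts)
```

Applying the function to every farm-period observation gives the farm- and time-varying measures described above; evaluating it at the sample means of the log inputs gives a single summary value per measure.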

2.4

Data and variables

The data source for this paper is the Ethiopian Rural Household Survey dataset, collected from randomly selected, stratified farm households in rural Ethiopia during 1994–2015. It includes farm production and economic data collected at 5-year intervals from local farmers’ associations (FAs) that were selected to represent the country’s diverse farming systems. The first four waves of the survey were conducted in collaboration with the Department of Economics, Addis Ababa University (AAU) and the International Food Policy Research Institute (IFPRI). The last round, in 2015, followed a similar strategy and covered a subsample of the original respondents: 503 farm households in eight FAs. This extension was carried out in collaboration with AAU and the Department of Environment for Development (EfD) at the University of Gothenburg, Sweden. Consequently, this study uses the last four rounds (1999, 2004, 2009, and 2015) of data covering eight FAs, forming a partially balanced panel of 446 households and 1,648 observations. The four rounds were selected to allow for roughly even time spacing. The 1994 survey was excluded as it lacks most of the important variables for the analysis.

We use aggregated cereal output measured in Ethiopian birr (ETB) as the dependent variable. The explanatory variables include labor measured in man-day units (MDUs), cereal-sown farmland measured in hectares, the amount of fertilizer used in kilograms, agricultural machinery implements in ETB, and livestock ownership measured in tropical livestock units (TLUs) as a proxy for wealth and livestock asset endowments. Agrochemicals, measured in ETB, include pesticides, herbicides, and insecticides, and oxen as animal traction power are measured as the number of oxen owned. Oxen are mainly used in traditional farming during land preparation and harvesting. The time trend captures the shift in production over time, representing technological change, whereas the squared trend captures the non-linear shift in the production function over time. All monetary variables are expressed in constant ETB, deflated to 1999 prices.

The summary statistics of the data are provided in Table 1. Based on the information on output and land size, calculated cereal production varied from a minimum of 34 kg to a maximum of 51,100 kg per hectare, with an overall mean of about 1,952 kg per hectare during the study period. Mean production per hectare increased from 1,260 kg in 1999 to 3,020 kg in 2015, showing that cereal production increased over the study period. On average, farms cultivated cereals on 2.6 hectares and used 342.7 MDUs of labor. Fertilizer application was minimal, with an average of 116.1 kg per farm household, whereas average expenses on agrochemicals were 133.9 ETB per farm household. Average livestock ownership was 6.5 TLUs and average oxen ownership was around 1.8, or almost two oxen per farm household, ranging from 0 to 9 oxen. The standard deviations (SD) show large dispersion in the data: the coefficient of variation (CV = SD/Mean) is larger than one for most of the variables in Table 1.
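As a quick check on the dispersion measure, the CV can be recomputed from the all-waves means and standard deviations reported in Table 1. The snippet below is a minimal sketch using three of the variables; the function name `coefficient_of_variation` is illustrative.

```python
def coefficient_of_variation(mean, sd):
    """CV = SD / Mean, a unit-free measure of relative dispersion."""
    return sd / mean


# All-waves (mean, SD) pairs taken from Table 1
all_waves = {
    "Cereal output": (1952, 2682),
    "Labor": (342.6, 714.0),
    "Oxen": (1.8, 1.4),
}

cv = {name: round(coefficient_of_variation(m, s), 2)
      for name, (m, s) in all_waves.items()}
```

A CV above one means the standard deviation exceeds the mean, which for non-negative variables such as output and labor points to a strongly right-skewed distribution across farms.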

Table 1

Summary statistics of the data (NT=1,648 observations)

Variable	1999		2004		2009		2015		All-waves
	Mean	SD	Mean	SD	Mean	SD	Mean	SD	Mean	SD	CV
Cereal output	1,260	1,320	1,253	1,300	2,065	2,390	3,020	4,000	1,952	2,682	1.37
Fertilizers	107.8	115.1	88.0	140.9	81.0	104.3	179.0	166.2	116.1	139.0	1.19
Agrochemicals	26.9	71.5	23.7	77.0	114.7	461.5	336.7	675.5	133.9	447.0	3.34
Labor	316.3	423.9	266.2	290.7	170.8	241.4	593.9	1,222.2	342.6	714.0	2.08
Machinery	0.6	4.6	41.3	301.4	836.8	3,216.4	376.4	915.5	336.3	1,776.0	5.28
Livestock	5.7	4.3	4.5	4.0	7.2	6.3	7.9	7.4	6.5	5.9	0.91
Land area	1.5	1.1	4.9	22.6	2.8	14.2	1.8	1.4	2.6	12.4	4.77
Oxen	1.8	1.2	1.4	1.3	1.9	1.5	1.9	1.4	1.8	1.4	0.78

3

Empirical results and analysis

3.1

Analysis of the results

Table 2 gives the parameter estimates of the specified TL frontier production function, the input elasticities, and the rates of TC and RTS across the four models. As shown in Table 2, almost all the parameter estimates in all four models are significantly different from zero at the 5% level or lower. Most of the first-order estimates have positive signs in all the models. The agrochemicals and livestock estimates are statistically significant in all the models. Hence, an increase in agrochemicals and owning more livestock units, which may include plowing oxen, enhanced cereal production. The estimated effects of labor and machinery use are unexpectedly negative and statistically significant, showing that cereal production decreased with these inputs. One possible explanation is that farms use more draft or hand-tool inputs than machinery to control weeds (Battese and Coelli, 1992) during years of poor output. In addition, as family size increased, fixed and scarce land was divided into smaller plots, reducing labor productivity. The estimates of the time trend and its squared term are significantly positive at the 1% level, suggesting technical progress at an increasing rate. The estimate of time interacted with farmland area is positive, showing that TC was land-using. The coefficients of time’s interactions with other inputs are negative and significant, implying factor-saving technological change in these inputs. The estimate of time’s interaction with agrochemicals is not significant, implying technical neutrality with respect to this input. However, the overall rate of TC is not neutral because some production factors changed significantly over time.

Table 2

Estimates of the Parameters, Elasticities, TC, and RTS across Models (NT=1,648)

Variables	TFE model			FE, KH, and KLH models
	Estimate	SE	Elasticity	Estimate	SE	Elasticity
Constant	4.144***	0.457		4.678***	0.396
Fertilizer	0.089	0.067	0.012	0.064	0.080	0.004
Agrochemicals	0.096**	0.050	0.066	0.108*	0.059	0.071
Labor	0.344***	0.095	−0.115	0.351***	0.114	−0.116
Machinery	−0.295***	0.064	−0.324	−0.290***	0.076	−0.304
Livestock	0.225**	0.106	0.275	0.197	0.126	0.247
Oxen	0.386	0.244	0.471	0.380	0.292	0.495
Area	0.013	0.113	0.276	0.067	0.132	0.322
Fertilizer*Fertilizer	−0.003	0.015		−0.002	0.018
Agrochemicals*Agrochemicals	−0.001	0.012		−0.005	0.014
Labor*Labor	−0.031	0.019		−0.035	0.023
Machinery*Machinery	0.062***	0.013		0.059***	0.016
Livestock*Livestock	0.125***	0.024		0.122***	0.028
Oxen*Oxen	−0.217	0.185		−0.218	0.222
Area*Area	−0.124***	0.022		−0.118***	0.026
Fertilizer*Agrochemicals	−0.008	0.011		−0.004	0.013
Fertilizer*Labor	0.013	0.021		0.019	0.025
Fertilizer*Machinery	−0.003	0.012		−0.002	0.014
Fertilizer*Livestock	−0.099***	0.025		−0.093***	0.029
Fertilizer*Oxen	0.112**	0.056		0.103	0.066
Fertilizer*Area	0.126***	0.030		0.111***	0.035
Agrochemicals*Labor	−0.007	0.016		−0.012	0.020
Agrochemicals*Machinery	0.001	0.011		0.001	0.013
Agrochemicals*Livestock	0.070***	0.024		0.067***	0.028
Agrochemicals*Oxen	−0.107**	0.044		−0.104**	0.053
Agrochemicals*Area	−0.002	0.023		−0.007	0.028
Labor*Machinery	0.071***	0.016		0.069***	0.019
Labor*Livestock	0.045	0.038		0.051	0.046
Labor*Oxen	−0.098	0.080		−0.104	0.096
Labor*Area	−0.036	0.036		−0.043	0.043
Machinery*Livestock	0.006	0.021		0.002	0.025
Machinery*Oxen	0.018	0.450		0.021	0.054
Machinery*Area	−0.042*	0.230		−0.042	0.027
Livestock*Oxen	−0.208**	0.107		−0.195	0.128
Livestock*Area	−0.195***	0.058		−0.193**	0.070
Oxen*Area	0.326***	0.120		0.332**	0.143
Time (1=1999,...,4=2015)	0.666***	0.139		0.686***	0.162
Time*Time	0.392***	0.048		0.356***	0.052
Time*Fertilizer	−0.026**	0.012		−0.024*	0.014
Time*Agrochemicals	−0.011	0.009		−0.010	0.011
Time*Labor	−0.119***	0.018		−0.114***	0.022
Time*Machinery	−0.036**	0.015		−0.029*	0.017
Time*Livestock	−0.052**	0.022		−0.050**	0.026
Time*Oxen	0.078*	0.043		0.090*	0.052
Time*Area	0.118***	0.026		0.114***	0.031
RTS			0.660			0.710
Rate of TC			0.901			0.880

Notes: *p< 0.05, **p< 0.01, and ***p< 0.001 levels of significance.

FE (fixed effects), KH (Kumbhakar and Heshmati, 1995), KLH (Kumbhakar et al., 2014), TFE (Greene, 2005a), and TCs (technical changes).

Elasticities with respect to all inputs, evaluated at the mean of the data, are significantly different from zero. With a few exceptions, elasticities across models are positive, indicating positive marginal products of inputs. The positive sign of these elasticities further indicates that a lack of these inputs will hamper agricultural activities and hence output levels. Elasticities for fertilizers, agrochemicals, oxen (animal traction power), farmland area, and livestock indicate that an increase in these inputs enhanced cereal output levels. However, the magnitude of the elasticities differs across models. For instance, a 1% increase in the number of oxen, keeping other inputs constant, increased production by 0.495% (FE, KH, and KLH models) and 0.471% (TFE model). Similarly, a 1% increase in livestock increased production by 0.275% in the TFE model and 0.247% in the other models. On the other hand, elasticities with respect to labor and machinery use were negative, conforming to the inverse production relationships with these inputs found in other studies. The negative elasticities of labor and machinery can be attributed to high fertility under constant land size and increased use of family labor. A 1% increase in labor input decreases average production by 0.115% in the TFE model and 0.116% in the other models. Similarly, a 1% increase in machinery implements decreases production by 0.324% in the TFE model and 0.304% in the other models.

Our results are consistent with Wan and Cheng (2001), who found that excessive labor use had a negative effect on Chinese agriculture, and Rashidghalam et al. (2016), who reported that machinery had a negative effect on Iranian cotton production. The negative elasticity with respect to labor may be explained by the fact that farms with surplus family labor are likely to use excessive family labor; with respect to machinery, it could be because small and fragmented landholdings make it difficult to attain economies of scale by using machinery implements. This indicates a mismatch between machinery implements and low technical skill realities of smallholder cereal farmers. Thus, excessive use of family labor, labor hoarding, and inefficient use of machinery can explain the unexpected negative productivity effects of labor and machinery. This implies that given the current landholdings and smallholders’ resource base, investments in highly mechanized agriculture might not necessarily translate into high productivity and production.

The RTS and TC estimates are positive across models, but their magnitudes are model-specific. Specifically, RTS is 0.660 for the TFE model and 0.710 for the other models. Because these values are below one, cereal-growing farms in the sample exhibited decreasing RTS in all the models. This can be explained by increased population, excessive use of family labor, and demand for higher food security, which led to the use of inferior and less productive land. The TC estimates show technical progress at an increasing rate, with values of 0.901 for the TFE model and 0.880 for the other models. This is a result of improved farming skills, improved seed quality, and skills in the use of modern inputs such as fertilizers.

3.2

Technical efficiency

Table 3 provides the summary statistics and frequency distribution of efficiency scores obtained from the four models. The FE model produces values of TE that are time-invariant and therefore should reflect persistent efficiency. The KH and KLH models provide persistent as well as transient TE components. The TFE model, which does not include persistent efficiency but produces values that are time-variant, reflects the overall (transient) efficiency. In general, the results show significant variations in efficiency estimates across the models, so efficiency scores are sensitive to the models’ specifications. In the next section, we provide a detailed analysis of the results obtained from the four models.

Table 3

Frequency distribution of persistent and transient TEs

TE-Interval (%)	Persistent TE		Transient TE
	FE and KH models	KLH model	TFE model	KH model	KLH model
0−10	2.91	0	0	13.41	0
11−20	25.34	0	0	42.72	0.36
21−30	29.15	0	0	25.55	0.91
31−40	19.28	0	0	12.50	4.13
41−50	10.76	0	0.12	3.58	19.54
51−60	6.50	0.22	0.00	1.40	50.24
61−70	3.36	5.38	0.18	0.67	23.97
71−80	1.12	47.09	7.40	0.12	0.85
81−90	1.35	47.31	2.61	0.06	0
91−100	0.22	0	89.68	0	0
Mean	0.304	0.791	0.944	0.210	0.545
Std. Dev.	0.155	0.053	0.065	0.111	0.082
Minimum	0.054	0.567	0.427	0.020	0.105
Maximum	1.000	0.889	1.000	0.840	0.783

Yearly mean of the transient efficiency

1999			0.964	0.213	0.550
2004			0.958	0.195	0.523
2009			0.941	0.215	0.559
2015			0.918	0.210	0.541

3.2.1

Persistent technical efficiency

The persistent efficiency components captured by the KLH and KH models, which result from time-invariant policy or management (structural or institutional) factors, average 0.79 and 0.30, respectively, and differ sharply from the corresponding levels of transient efficiency.

As shown in Table 3, mean persistent efficiency in the FE and KH models is low (0.30) with large dispersion. In contrast, mean persistent efficiency in the KLH model is 0.79, which is considerably higher than the mean of the FE and KH models, with much lower dispersion. Hence, the efficiency estimates obtained from the FE and KH models do not provide precise information on the level of persistent efficiency. The reason is that these models do not separate unobserved farm heterogeneity from persistent inefficiency, so parts of the time-invariant farm heterogeneity effects could be mixed up with persistent inefficiency. Thus, the models tend to overestimate inefficiency, generating lower estimates of persistent efficiency.

The distribution of persistent efficiency further shows that almost 58% of the farms were operating below the mean score in the KH model and 44% of the farms were operating below the mean score in the KLH model. In the KLH model, 94% of the farms had persistent efficiency scores between 0.71 and 0.90. However, the FE and KH models place more farms in the 0.21 to 0.30 range than in any other interval. This implies that most cereal farms faced severe persistent productive inefficiency problems in the study area.

3.3

Transient technical efficiency

When we consider time-varying efficiency, the mean OTE obtained from the KH, KLH, and TFE models is 0.21, 0.55, and 0.94, respectively. As can be seen in Table 3, mean TE in the TFE model is significantly higher than that in the KH model and moderately higher than that in the KLH model. The results show that there were fewer farms with transient efficiency scores below 90% in the TFE model as compared to the other two models. Variations in the transient efficiency estimates of these models are due to their underlying assumptions. The TFE model, which assumes that inefficiency is always time-varying, controls for unobserved farm heterogeneity without considering persistent inefficiency. If a farm household is characterized by persistent individual effects, these become part of farm heterogeneity, so the farm heterogeneity effect captures some of the persistent inefficiency. Consequently, the model underestimates the transient inefficiency level, which inflates transient efficiency scores. This inflation is induced by the implicit assumption that persistent efficiency is 100% in the model’s specification, which biases overall efficiency upward.

Unlike the TFE model, the KH model does not consider any farm heterogeneity effect and treats all time-invariant farm effects as inefficiency. Hence, it mixes the heterogeneity effects with persistent inefficiency, so persistent inefficiency is overestimated in the KH model. Consequently, the model is likely to produce underestimated persistent efficiency scores. As OTE (which is time-variant) is a product of persistent and transient efficiency components, transient efficiency in the KH model is lower due to low persistent-efficiency estimates. Hence, we conclude that OTE is biased downward in the KH model, whereas it is biased upward in the TFE model. These characteristics of the KH model, together with those of the TFE model, suggest that latent farm and individual effects such as unobserved heterogeneity are significant in the sample and that we need to reconsider our modeling to obtain more accurate efficiency estimates. This also demonstrates that if there is moderate persistent inefficiency and unobserved heterogeneity among the farms, both the KH and TFE models will produce biased results.

Based on this discussion, the true measure of efficiency may lie somewhere between the two extremes. To reduce the risk of bias, we considered a recently developed and more flexible efficiency model, the KLH model, which separates the two time-invariant components, persistent inefficiency and farm heterogeneity, and comes closer to capturing true efficiency. This model overcomes some of the limitations of the earlier models by decomposing overall inefficiency into its persistent and transient components and by distinguishing time-invariant farm effects from persistent inefficiency. In the process of estimating efficiency using the KH and KLH models, the OTE is decomposed into persistent and transient efficiency components. However, separating persistent inefficiency from farm heterogeneity results in a higher estimate of persistent efficiency in the KLH model than in the KH model, with quite low variation. Thus, mean transient efficiency in the KLH model is considerably higher than in the KH model and lower than in the TFE model. The frequency distribution of transient efficiency shows that 46% of the farms were operating below the mean in the KLH model, as opposed to 60% in the KH model.

In general, the variability of efficiency scores across the models considered clearly demonstrates the existence of significant unobserved farm heterogeneity in the sample, which should be accounted for in frontier model specifications, in line with the findings of Heshmati et al. (2018) and Kumbhakar et al. (2014). This variability also demonstrates the difficulty of “correctly” measuring efficiency. We therefore conclude that selecting an appropriate model for estimating efficiency and interpreting the results may not be a straightforward process. No model can be said to be “correct,” and efficiency will always be model-specific and likely biased. However, as noted by Badunenko and Kumbhakar (2016), the KLH model can produce more reliable results, as it disentangles persistent inefficiency from farm heterogeneity and transient inefficiency from random noise, and generates low levels of noise. The results also show that efficiency estimates varied over time. As shown in Table 3, transient efficiency varied over time and decreased during the study period, with 2009 being the most efficient year and 2015 the least efficient year. Concerning the pattern of efficiency ratings through time, the level of transient efficiency was quite low and mostly concentrated between 0.11 and 0.20 in the KH model, whereas it was concentrated between 0.51 and 0.60 in the KLH model in all the years.

To get a better picture of the efficiency components in the different models, we report their density plots in Figure 1. The density plots show that the distributions of persistent efficiency in the FE and KH models are almost identical and that, except for some values in the upper tail, most of the farms had low levels of persistent efficiency. This is, however, not the case with the KLH model, which provides the highest persistent efficiency scores: its mean is higher than that in the FE and KH models and it has the least dispersion.

Regarding the distribution of transient efficiency, because the TFE model does not treat heterogeneity effects as inefficiency, it yields high efficiency scores (Figure 2) with low dispersion (Figure 3) compared to the other two models.

Distribution of transient efficiency in the KH model looks like the distribution of its persistent component but its mean is pushed back by about 10%. In the KLH model, most of the farms were found to have moderate transient efficiency scores between those obtained from the TFE and KH models (Figure 2). The efficiency score spreads were between the TFE model (low spread) and KH model (high spread) (lower part of Figure 3). Similar results and patterns were found by Heshmati et al. (2018) and Kumbhakar et al. (2014).

The dispersion in efficiency components in the KH and KLH models, as the main element of overall efficiency, is significantly higher for the persistent component as compared to the transient component in both the models (Figure 2). Thus, the results suggest that persistent inefficiency was a bigger problem than transient inefficiency in the sampled cereal farms.

Finally, to compare the models and explore the effects of the estimated models on the ranking of farms’ TE, we estimated Kendall’s rank correlation coefficient between the efficiency scores (see Table 4). The correlation coefficients of persistent efficiency in the FE, KH, and KLH models were positive and high, implying that the models were consistent in generating similar rankings. Further, the correlation coefficients between the transient efficiency estimates were positive for all pairs of models except the KH and TFE models, which had high ranking disagreements. This result is not surprising given their assumptions with respect to farm heterogeneity effects. The transient efficiency estimates of the KLH and TFE models, however, have a low positive correlation, whereas those of the KH and KLH models have a moderate positive correlation.

Table 4

Kendall’s rank order correlation between different models

Components	Persistent			Transient
Model	FE	KH	KLH	TFE	KH	KLH
PTE_FE	0.998
PTE_KH	0.998	0.998
PTE_KLH	0.998	0.998	0.998
TTE_TFE	−0.024	−0.024	−0.024	1.000
TTE_KH	0.845	0.845	0.845	−0.013	1.000
TTE_KLH	0.322	0.322	0.322	0.043	0.477	1.000

Notes: P indicates persistent and T transient efficiency components.
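The rank comparison can be sketched with a direct pairwise count. The function below computes Kendall’s tau-a, which ignores tie corrections; the text does not state which variant of Kendall’s coefficient was used, so this choice, the function name `kendall_tau_a`, and the two short score vectors are assumptions for illustration only.

```python
def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    n = len(x)
    concordant = 0
    discordant = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            sign = (x[i] - x[j]) * (y[i] - y[j])
            if sign > 0:
                concordant += 1   # both models rank the pair the same way
            elif sign < 0:
                discordant += 1   # the two models disagree on the pair
    return (concordant - discordant) / (n * (n - 1) / 2)


# Hypothetical efficiency scores for four farms under two models
scores_model_a = [0.9, 0.8, 0.7, 0.6]
scores_model_b = [0.5, 0.6, 0.3, 0.2]

tau = kendall_tau_a(scores_model_a, scores_model_b)
```

A value near one means two models order the farms almost identically even if the levels of their scores differ, while a value near zero means the rankings are essentially unrelated.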

3.3.1

Technical efficiency estimates across agro-ecological zones

For comparison, farm performance across agro-ecological zones (AEZs) is reported in Table 5. The efficiency results show systematic differences between farms by AEZ location, reflecting the effect of geographical or climatic conditions on farm efficiency.

Table 5

Mean efficiency measures by AEZs

Components	Persistent		Transient
AEZs\Model	FE and KH	KLH	TFE	KH	KLH
Lowland	0.220	0.763	0.897	0.151	0.525
Midland	0.311	0.794	0.794	0.215	0.548
Highland	0.387	0.813	0.960	0.267	0.560

Table 5 shows that as one moves from the highland to the lowland AEZ, mean TE decreases. This suggests more productive efficiency at higher altitudes, where rainfall and temperature are favorable for cereal production. The low mean scores in the lowland area can be attributed to several factors that constrain cereal production there. These differences in performance can be explained by time-invariant heterogeneity effects (such as the geological or locational makeup of a farm and other physical features), which are beyond the control of the farms. This demonstrates that leaving time-invariant, unobserved heterogeneity effects in the error terms in some models and controlling for them in other specifications can lead to appreciably different TE estimates. The differences can also change the ranking of farms in different AEZs when comparing their efficiency performance.

4

Summary and conclusion

This paper investigated the persistent and transient productive efficiency of Ethiopian cereal farms between 1999 and 2015. It used a four-error-component panel data SF model to distinguish between time-invariant farm heterogeneity, persistent and transient inefficiency components, and random noise. A flexible TL production frontier was also specified and estimated. Four recent, advanced models developed in the field were estimated empirically using the same data. The models differed in the way they treated the first three error components. The results of the most general four-error-component (KLH) model were compared with those of the other three (FE, TFE, and KH) models, in each of which one of the four components is missing. The models differed in their underlying assumptions about time-variant/invariant efficiency and about the separation of persistent technical inefficiency from farm heterogeneity effects. The TFE model disentangled time-varying inefficiency from time-invariant farm heterogeneity. The KLH and KH models distinguished between persistent and transient inefficiency, and the FE model was used to estimate time-invariant efficiency for comparison purposes.
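For reference, the four-error-component structure summarized above can be sketched as follows. The notation is generic and assumed for illustration; the symbols $\mu_i$, $\eta_i$, $u_{it}$, and $v_{it}$ may differ from those used in the paper's model section.

```latex
\begin{align}
  y_{it} &= \alpha_0 + f(x_{it};\beta) + \mu_i - \eta_i + v_{it} - u_{it},
  \qquad \eta_i \ge 0,\; u_{it} \ge 0, \\
  % mu_i   : time-invariant farm heterogeneity (not inefficiency)
  % eta_i  : persistent (time-invariant) inefficiency
  % u_{it} : transient (time-varying) inefficiency
  % v_{it} : random noise
  \text{TFE:}\quad y_{it} &= \alpha_0 + f(x_{it};\beta) + \mu_i + v_{it} - u_{it}
  \quad (\text{no } \eta_i), \\
  \text{KH:}\quad y_{it} &= \alpha_0 + f(x_{it};\beta) - \eta_i + v_{it} - u_{it}
  \quad (\text{no } \mu_i).
\end{align}
```

Read this way, the TFE model attributes every time-invariant effect to heterogeneity, the KH model attributes every time-invariant effect to persistent inefficiency, and the KLH model keeps both terms.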

Estimates of the parameters showed that agrochemicals, livestock, and land significantly enhanced farm productivity, whereas labor and machinery affected productivity negatively. These negative effects are attributed to rural population growth on a given land area, which made plots smaller as family size grew. The coefficient of time interacted with farmland area was positive and significant, showing that TC was land-using. Estimates of time's interaction with other inputs were significantly negative, implying factor-saving TC for these inputs. Overall, the rate of TC was not neutral because the contribution of some production factors changed significantly over time. Elasticities of fertilizers, agrochemicals, oxen, farmland area, and livestock showed that these inputs enhanced cereal production's efficiency levels. The results further showed that cereal farming progressed technically at an increasing rate and exhibited decreasing returns to scale. The efficiency estimates varied significantly across the models, showing that efficiency estimation was highly sensitive to model specification.

Our results suggest that models structured to capture time-invariant inefficiency confound it with farm heterogeneity effects, which are not inefficiency. This confounding may lead to very low efficiency estimates, whereas models in which farm heterogeneity effects are not treated as part of inefficiency may give high efficiency scores.

Our empirical results confirmed the assumption of significant farm heterogeneity in the sample, which was demonstrated by significant overestimates of efficiency in the TFE model and underestimates of efficiency in the KH model. The KLH model overcame these problems by splitting the time-invariant effects into farm-specific heterogeneity and a persistent inefficiency effect. In addition, TE was decomposed into persistent and transient components. Consequently, the KLH model produced overall efficiency estimates that differed markedly from those of the TFE and KH models, reducing the upward bias of the former and the downward bias of the latter.
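As a rough illustration of how the two components combine, the sketch below multiplies the persistent and transient KLH means from Table 5 for each AEZ. It assumes the multiplicative form (overall TE = persistent TE × transient TE) that is standard for four-component models; the paper's own definition may differ. The product of two group means is also not the same as the mean of farm-level overall efficiency, so these figures are indicative only and are not results reported by the authors. The names `klh_means` and `overall_efficiency` are hypothetical.

```python
# Mean KLH efficiency components by AEZ, copied from Table 5.
klh_means = {
    "Lowland": {"persistent": 0.763, "transient": 0.525},
    "Midland": {"persistent": 0.794, "transient": 0.548},
    "Highland": {"persistent": 0.813, "transient": 0.560},
}


def overall_efficiency(persistent, transient):
    """Overall TE as the product of its components (assumed multiplicative form)."""
    for score in (persistent, transient):
        if not 0.0 <= score <= 1.0:
            raise ValueError("efficiency scores must lie in [0, 1]")
    return persistent * transient


# Indicative overall efficiency per AEZ, computed from group means only.
overall_by_aez = {
    aez: overall_efficiency(**parts) for aez, parts in klh_means.items()
}

for aez, value in overall_by_aez.items():
    print(f"{aez}: {value:.3f}")
```

Because both components are at most 1, the overall score can never exceed the smaller of the two, which is why even a moderate transient efficiency pulls the combined figure well below the persistent one.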

Kendall’s rank correlation coefficients showed that the FE, KH, and KLH models generated similar and consistent persistent efficiency measures. Further, the correlations between the transient efficiency estimates were positive for all model pairs except KH and TFE. The transient TE estimates from the KLH and TFE models had a low positive rank correlation, whereas those from the KH and KLH models had a larger positive rank correlation. The results also showed differences in TE estimates between AEZs, which indicate the impact of geographical/climatic conditions, and of unobserved heterogeneity more generally, on farm efficiency: TE decreased as one moved from highland to lowland AEZs.

Our results confirm wide variations in TE estimates across farms and over time. This indicates that most of the farms were still using their resources inefficiently and that considerable room remained for improving cereal production through higher efficiency. In particular, the TE results for the cereal farms included in this study show significant persistent inefficiency, implying that policy measures that can reduce persistent inefficiency should be prioritized. Long-run policy needs to be supplemented with short-run policies aimed at transitory inefficiency. Such policy measures will enable farms to improve their efficiency in the long run. These findings are important and can inform government policy options when planning agricultural policies tailored to the AEZs across the country. The study, therefore, recommends that location-specific policies that reduce persistent inefficiency be put in place. A location-specific public policy could improve the supply of agricultural inputs, help meet the needs of farms, and suit each AEZ’s peculiarities.

Oumer Berisso

Almas Heshmati

Published: 16 Oct 2020

DOI: https://doi.org/10.2478/izajodm-2020-0018

Keywords: Stochastic frontier, heterogeneity, persistent and transient efficiency, cereal farming

© 2020 Oumer Berisso, Almas Heshmati, published by Sciendo

This work is licensed under the Creative Commons Attribution 4.0 International License.
