Verhulst’s logistic growth law has a long history of use. It is expressed mathematically by logistic curves, a particular case of sigmoidal functions; because of their characteristic shape, they are also called S-shaped curves.
The logistic curve is a popular model for studying and forecasting change. It rests on a law of nature (Kucharavy and De Guio, 2015), so the S-curve can be used whenever a “genetically stable species” competing for limited resources can be identified. Two components must therefore be present: the ability of the “species” to multiply and a finite niche capacity. A case for using the model can also be made if the data fit a large section of an S-curve well. Nevertheless, confidence in the forecast is higher when the features of natural growth have been identified (Modis, 2007).
The S-curve represents the growth or decline of every system in interaction with an environment (Kucharavy and De Guio, 2015). S-shaped curves are therefore applied to project the performance of technologies, to foresee population changes, and in market penetration analyses, micro- and macro-economic studies, studies of the diffusion of technological and social inventions, ecological modeling, and many other areas (Kucharavy and De Guio, 2011). S-curves are used in quantitative methods, in qualitative methods, and in approaches that combine both. They appear in trend impact analysis, curve fitting, decision modeling based on the Fisher and Pry model, statistical modeling, text mining for technology forecasting, life cycle analysis within strategic analysis, the theory of innovation diffusion, and emerging issues analysis.
Quantitative methods using the S-curve are applied to extrapolate previously collected data; to examine market, technological, and social substitution dynamics; to analyze annual publication counts in order to reveal informative trends; and to identify issues before they reach the trend or problem phase, in both engineering and non-engineering fields.
Qualitative methods using the S-curve are often employed to identify the stage of a system’s evolution, and the two kinds of methods are used together for trend extrapolation and for studying the dynamics of technology adaptation.
Traditionally, analyses using S-shaped curves identify three stages of a phenomenon’s development and three parameters. The stages are initial growth, exponential growth, and deceleration; the parameters are the saturation level, the midpoint, and the growth rate. The saturation level is the horizontal asymptote that bounds the curve from above, the midpoint is the point at which the curve reaches half the saturation level, and the growth rate describes the slope of the curve (Słupiński and Kucharavy, 2011). In interpreting the curve, 10–15% of saturation is taken as the transition from slow growth to hypergrowth, and saturation is assumed to set in once 90% is exceeded (Johnson, 2012).
From a mathematical point of view, the midpoint, or inflection point, is the zero of the second derivative of the logistic function.
In this article, in contrast to the previous approach, we use in forecasting not only the zero of the second derivative but also the zero of the third derivative, and we locate it for a generalized logistic function using the second differences of its terms. The goal of the article is to present this approach in economic projections.
The Riccati equation with constant coefficients is the following first-order ordinary differential equation:
$$u'(t) = r\,(u - u_1)(u - u_2) \tag{1}$$
The right-hand side of equation (1) is a quadratic function of the variable $u$, with coefficient $r$ of $u^2$ and zeros $u_1$, $u_2$. The constants $r \neq 0$, $u_1$, $u_2$ are real or complex numbers.
If $u(t)$ is a solution of equation (1), then there is a known formula for the $n$-th derivative $u^{(n)}(t)$ ($n = 2, 3, \ldots$) of $u(t)$, expressing the derivative as a polynomial in the function itself:
Formula (2) was discussed at ICNAAM 2006 (International Conference of Numerical Analysis and Applied Mathematics, September 2006), held in Greece, and it appeared with an inductive proof in an article by Rządkowski (2006) (see also Rządkowski, 2008). Independently, the formula was considered and proved, using generating functions, by Franssens (2007). The Eulerian numbers appearing in formula (2) play a significant role in combinatorics, probability theory, statistics, and various applications of mathematical analysis.
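For orientation, the derivative formula is commonly stated with Eulerian numbers. The block below is a sketch of that statement, not a reproduction of the original display: the bracket notation for the Eulerian numbers is an assumption about notation, and the solution is taken to satisfy $u' = r(u - u_1)(u - u_2)$. The cases $n = 2$ and $n = 3$ can be checked by differentiating directly.

```latex
% Sketch of formula (2): n-th derivative of a solution of u' = r (u - u_1)(u - u_2)
u^{(n)}(t) = r^{n} \sum_{k=0}^{n-1} \left\langle {n \atop k} \right\rangle
             (u - u_1)^{k+1} (u - u_2)^{n-k}, \qquad n = 1, 2, 3, \ldots
% <n,k> is the Eulerian number: permutations of {1,...,n} with exactly k ascents.
% Check for n = 2 (Eulerian numbers 1, 1):
u'' = r^{2}\left[(u - u_1)(u - u_2)^{2} + (u - u_1)^{2}(u - u_2)\right]
% Check for n = 3 (Eulerian numbers 1, 4, 1):
u''' = r^{3}\left[(u - u_1)(u - u_2)^{3} + 4(u - u_1)^{2}(u - u_2)^{2}
       + (u - u_1)^{3}(u - u_2)\right]
```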
It is convenient, for our purposes, to write equation (1) as:
Many economic phenomena, including those related to management, follow the logistic equation and can be modeled with it (see, e.g., Meade and Islam, 2006; Michalakelis and Sphicopoulos, 2012; Qian and Soopramanien, 2014; Wu and Chu, 2010; Yamakawa, et al., 2013).
A phenomenon described by equation (3) and the function $u(t)$ has the valuable property that its rate of growth $u'(t)$ is proportional to the level already achieved, that is, to $u(t) - u_{\min}$. As a result, growth is nearly exponential in the initial period. On the other hand, when $u(t)$ is sufficiently large, the factor $(u_{\max} - u)$ becomes more and more significant and inhibits further growth of $u(t)$.
Mathematically, equation (3) is a first-order ordinary differential equation that can be solved easily by separation of variables.
After solving (3), we get the generalized logistic function in the following form:
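As a hedged reconstruction, the block below gives one normalization of equation (3) and its solution (4) that agrees with the properties stated in the text: growth proportional to $u - u_{\min}$, damping by $u_{\max} - u$, and an inflection at $t = c$ with value $(u_{\min} + u_{\max})/2$. The prefactor $s/(u_{\max} - u_{\min})$ is an assumption about how the rate parameter $s$ enters. It is consistent with the simulation values reported later for $s = 0.3$, $c = 30$.

```latex
% Assumed form of equation (3): Riccati equation (1) with
% r = -s/(u_max - u_min), u_1 = u_min, u_2 = u_max
u'(t) = \frac{s}{u_{\max} - u_{\min}}\,(u - u_{\min})(u_{\max} - u)
% Separation of variables, then partial fractions on the left-hand side:
\frac{du}{(u - u_{\min})(u_{\max} - u)} = \frac{s\,dt}{u_{\max} - u_{\min}}
\;\Longrightarrow\;
\ln\frac{u - u_{\min}}{u_{\max} - u} = s\,(t - c)
% Assumed form of the generalized logistic function (4):
u(t) = u_{\min} + \frac{u_{\max} - u_{\min}}{1 + e^{-s(t - c)}}
```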
Fig. 1 shows the graphs of two example generalized logistic functions with parameters $u_{\max} = 100$, $u_{\min} = 10$, and $c = 5$ for two values of the parameter $s$. At $t = c$, the generalized logistic function (4) has its inflection point (the zero of the second derivative), where its value equals $(u_{\min} + u_{\max})/2$. Rządkowski, et al. (2014) proved that for the logistic curve (i.e., when $u_{\min} = 0$), the value of the function at the first zero of the third derivative, where the second derivative attains its maximum, equals $0.211\,u_{\max}$. It follows that for a generalized logistic curve, at this zero of the third derivative $u'''$, function (4) takes the value $u_{\min} + 0.211(u_{\max} - u_{\min}) = 0.789\,u_{\min} + 0.211\,u_{\max}$.
On the basis of the same paper, we can also show that:
at the smallest positive $t$ for which $u^{(4)}(t) = 0$, the function $u(t)$ takes the value:
at the smallest positive $t$ for which $u^{(5)}(t) = 0$, the function $u(t)$ takes the value:
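The levels referred to in the two items above can be recomputed numerically. The sketch below is not taken from the cited paper. It uses the fact that every derivative of the standard logistic function σ is a polynomial in σ, builds those polynomials with the chain rule, and finds the lowest zero in (0, 1). That zero is assumed to correspond to the earliest zero in $t$, because the curve is increasing. All helper names are hypothetical.

```python
from math import sqrt

# sigma' = sigma - sigma^2, stored as ascending polynomial coefficients
SIGMA_PRIME = [0.0, 1.0, -1.0]


def poly_mul(p, q):
    """Multiply two polynomials given by ascending coefficients."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_deriv(p):
    """Differentiate a polynomial given by ascending coefficients."""
    return [k * p[k] for k in range(1, len(p))]


def poly_eval(p, x):
    """Evaluate a polynomial with Horner's scheme."""
    acc = 0.0
    for coeff in reversed(p):
        acc = acc * x + coeff
    return acc


def derivative_polynomial(n):
    """Return P_n such that the n-th derivative of sigma equals P_n(sigma)."""
    p = SIGMA_PRIME
    for _ in range(n - 1):
        # chain rule: d/dx P(sigma) = P'(sigma) * sigma'
        p = poly_mul(poly_deriv(p), SIGMA_PRIME)
    return p


def lowest_zero_level(n, grid=1000):
    """Smallest sigma in (0, 1) with P_n(sigma) = 0 (sign scan, then bisection)."""
    p = derivative_polynomial(n)
    lo = 1.0 / grid
    for k in range(2, grid):
        hi = k / grid
        if poly_eval(p, lo) * poly_eval(p, hi) <= 0.0:
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                if poly_eval(p, lo) * poly_eval(p, mid) <= 0.0:
                    hi = mid
                else:
                    lo = mid
            return 0.5 * (lo + hi)
        lo = hi
    raise ValueError("no zero found inside (0, 1)")


# Closed form for n = 3, for comparison with the numerical value
print("n = 3 closed form:", (3 - sqrt(3)) / 6)
for n in (2, 3, 4, 5):
    print(n, round(lowest_zero_level(n), 4))
```

For a generalized curve, the corresponding value is $u_{\min}$ plus the level times $(u_{\max} - u_{\min})$. The level for $n = 3$ is the 0.211 used in the text; the same calculation gives about 0.092 for $n = 4$ and about 0.041 for $n = 5$.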
Rządkowski, et al. (2014) showed, using appropriate examples, that for a time series that potentially has a logistic shape, the central second differences of its terms (for a time series $v_t$, they are defined as $\Delta^2 v_t = v_{t+1} - 2v_t + v_{t-1}$) reach their maximum near the zero of the third derivative at which the second derivative attains its maximum. This is in fact a special case of a more general rule that holds for every smooth function $f(t)$.
On the basis of the Taylor formula, one can easily justify the following formula:
The sum of the two integrals is positive but rather small, because $f'''(t) = 0$ and usually:
In the case of a generalized logistic function (4), the coordinate of the point $t_0$ at which $u''(t_0)$ attains its maximum satisfies the equation:
Then using (2), we get:
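A worked version of this step is sketched below, under the assumption that (4) has the form $u(t) = u_{\min} + (u_{\max} - u_{\min})/(1 + e^{-s(t - c)})$. The symbols $D$ and $q$ are introduced only for this illustration. The numerical example uses the parameter values of the simulation example in the text.

```latex
% Illustrative notation: D = u_max - u_min, q = (u(t_0) - u_min)/D
% u''' = 0 at the maximum of u''  <=>  6q^2 - 6q + 1 = 0; smaller root:
q = \frac{3 - \sqrt{3}}{6} \approx 0.2113,
\qquad
t_0 = c - \frac{\ln(2 + \sqrt{3})}{s}
% Formula (2) with n = 2, r = -s/D, u_1 = u_min, u_2 = u_max:
u''(t_0) = s^{2} D\, q(1 - q)(1 - 2q)
         = s^{2} D \cdot \frac{1}{6} \cdot \frac{\sqrt{3}}{3}
         = \frac{\sqrt{3}}{18}\, s^{2}\,(u_{\max} - u_{\min})
% Example: s = 0.3, c = 30, u_max = 20, u_min = 4
t_0 \approx 25.61, \qquad u''(t_0) \approx 0.1386
```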
Suppose that a time series is derived exactly from a generalized logistic function:
If $\varepsilon$ is small enough compared with the value of $u''(t_0)$ in (5), then the maximum of the central second differences $\Delta^2 U_t = U_{t+1} - 2U_t + U_{t-1}$ is attained at a point only slightly different from $t_0$. The following example illustrates this.
Let $u_{\max} = 20$, $u_{\min} = 4$, $s = 0.3$, $c = 30$, $a = 0.1$, and $b = 0$ in equations (6) and (7). Fig. 2 shows the function $u(t)$ and Fig. 3 its second derivative $u''(t)$.
We performed Monte Carlo simulations for two values, $\varepsilon = 0.01$ and $\varepsilon = 0.001$ (Figs. 4–7).
After 50 Monte Carlo simulations, we obtained the following means and standard deviations for the two coordinates of the maximum point of the second differences $\Delta^2 U_t$ (Fig. 5 and Fig. 7):
for $\varepsilon = 0.001$, the mean horizontal coordinate is 25.66 with a standard deviation of 0.479, and the mean vertical coordinate is 0.138 with a standard deviation of 0.002;
for $\varepsilon = 0.01$, the mean horizontal coordinate is 25.4 with a standard deviation of 1.07, and the mean vertical coordinate is 0.16 with a standard deviation of 0.014.
The example indicates that the maximum of the second differences of the disturbed logistic series is stable when $\varepsilon$ is relatively small. This point can therefore be used to estimate the saturation level of the logistic curve.
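An experiment of this kind can be sketched as follows. Several ingredients are assumptions and not the authors' setup: the functional form of $u(t)$, the integer time grid $t = 0, \ldots, 60$, and additive Gaussian noise with standard deviation $\varepsilon$ (the roles of $a$ and $b$ are not modeled). The function names are hypothetical, and the output is not expected to reproduce the reported statistics exactly.

```python
import math
import random
import statistics

# Parameters quoted in the example; the functional form below is an assumption.
U_MAX, U_MIN, S, C = 20.0, 4.0, 0.3, 30.0


def logistic(t):
    """Assumed generalized logistic function u(t)."""
    return U_MIN + (U_MAX - U_MIN) / (1.0 + math.exp(-S * (t - C)))


def peak_of_second_differences(series, t_first):
    """Return (t, value) of the largest central second difference."""
    diffs = [series[i + 1] - 2.0 * series[i] + series[i - 1]
             for i in range(1, len(series) - 1)]
    k = max(range(len(diffs)), key=diffs.__getitem__)
    return t_first + 1 + k, diffs[k]  # diffs[k] belongs to time t_first + 1 + k


def simulate(eps, runs=50, t_first=0, t_last=60, seed=0):
    """Monte Carlo runs with ASSUMED additive Gaussian noise of std eps."""
    rng = random.Random(seed)
    ts, ds = [], []
    for _ in range(runs):
        noisy = [logistic(t) + rng.gauss(0.0, eps)
                 for t in range(t_first, t_last + 1)]
        t_peak, d_peak = peak_of_second_differences(noisy, t_first)
        ts.append(t_peak)
        ds.append(d_peak)
    return (statistics.mean(ts), statistics.stdev(ts),
            statistics.mean(ds), statistics.stdev(ds))


# Closed-form reference values for the noise-free curve
t0 = C - math.log(2.0 + math.sqrt(3.0)) / S                # maximizer of u''
u2_max = math.sqrt(3.0) / 18.0 * S ** 2 * (U_MAX - U_MIN)  # u''(t0)
print(f"t0 = {t0:.2f}, u''(t0) = {u2_max:.4f}")

clean = [logistic(t) for t in range(0, 61)]
print("noise-free peak:", peak_of_second_differences(clean, 0))

for eps in (0.001, 0.01):
    print(eps, simulate(eps))
```

On the noise-free integer grid, the largest second difference falls at $t = 26$, next to the continuous maximizer $t_0 \approx 25.61$. This matches the reported mean horizontal coordinate lying between 25 and 26.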
S-curves have long been used to analyze economic phenomena. The strong assumption in these analyses is that all businesses have a life cycle: they develop, mature, and then die. Over time, all markets and all products climb to a peak; they then wane and sometimes disappear entirely.
Unlimited growth is impossible. So the fundamental question is: How far along on the S-curve is the business, market, technology, and so forth? (Harrison, 2011).
In recent years, many articles have been devoted to the mathematical modeling of the diffusion of mobile telephones in various countries (see, e.g., Junseok, et al., 2009; Michalakelis and Sphicopoulos, 2012; Wu, et al., 2010; Yamakawa, et al., 2013). In this article, we apply an S-curve to mobile telephone subscriptions in Poland and show its usefulness for market penetration analysis.
Table 1 presents the number of mobile telephone subscriptions per 100 inhabitants in Poland in 1992–2012. The data were taken from the website of the International Telecommunication Union (ITU;
Table 1. Number of mobile telephone subscriptions per 100 inhabitants in Poland in the years 1992–2012 (
No. (t) | Year | Subscriptions per 100 inhabitants, y(t) | Second differences $\Delta^2 y(t)$ |
---|---|---|---|
1 | 1992 | 0.007 | |
2 | 1993 | 0.035 | 0.0378 |
3 | 1994 | 0.101 | 0.02894 |
4 | 1995 | 0.196 | 0.27426 |
5 | 1996 | 0.565 | 1.1828 |
6 | 1997 | 2.117 | 1.39996 |
7 | 1998 | 5.069 | 2.29338 |
8 | 1999 | 10.315 | 2.03268 |
9 | 2000 | 17.593 | 1.24125 |
10 | 2001 | 26.112 | 1.67588 |
11 | 2002 | 36.307 | −1.0091 |
12 | 2003 | 45.493 | 5.74175 |
13 | 2004 | 60.421 | 0.99075 |
14 | 2005 | 76.339 | 3.94934 |
15 | 2006 | 96.207 | −7.6964 |
16 | 2007 | 108.378 | −5.5284 |
17 | 2008 | 115.021 | −4.3499 |
18 | 2009 | 117.315 | 3.30763 |
19 | 2010 | 122.915 | 2.77762 |
20 | 2011 | 131.294 | 0.67138 |
21 | 2012 | 140.343 | |
For these data (excluding items 20 and 21, because those points seem to begin a new generalized logistic curve), we used the nonlinear least squares method to find the logistic function (assuming $u_{\min} = 0$) that best describes the changes in the number of subscriptions. We obtained the following regression curve:
In Fig. 8, the regression curve, that is, the logistic function, is marked in red and the subscriptions are marked in blue.
The coefficient of determination is close to 1, $R^2 = 0.9977$, which indicates that the model explains the phenomenon very well.
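A fit of this kind can be sketched without external libraries. The code below is a simple stand-in, not the authors' procedure. It assumes the model $u_{\max}/(1 + e^{-s(t - c)})$. For each pair $(s, c)$ on a grid, the least-squares value of $u_{\max}$ is computed in closed form, and the pair with the smallest residual sum of squares is kept. The grid ranges and all names are hypothetical, and the resulting parameters need not coincide with the published regression curve.

```python
import math

# Items 1-19 of Table 1 (subscriptions per 100 inhabitants, 1992-2010)
Y = [0.007, 0.035, 0.101, 0.196, 0.565, 2.117, 5.069, 10.315, 17.593, 26.112,
     36.307, 45.493, 60.421, 76.339, 96.207, 108.378, 115.021, 117.315, 122.915]
T = list(range(1, len(Y) + 1))


def shape(t, s, c):
    """Logistic shape with unit saturation level."""
    return 1.0 / (1.0 + math.exp(-s * (t - c)))


def fit_for(s, c):
    """Best u_max for fixed (s, c) and the resulting sum of squared errors."""
    g = [shape(t, s, c) for t in T]
    u_max = sum(y * gi for y, gi in zip(Y, g)) / sum(gi * gi for gi in g)
    sse = sum((y - u_max * gi) ** 2 for y, gi in zip(Y, g))
    return sse, u_max


def grid_fit():
    """Grid search over ASSUMED ranges s in [0.2, 1.0], c in [10, 16]."""
    best = None
    for i in range(81):
        s = 0.20 + 0.01 * i
        for j in range(121):
            c = 10.0 + 0.05 * j
            sse, u_max = fit_for(s, c)
            if best is None or sse < best[0]:
                best = (sse, u_max, s, c)
    return best


sse, u_max, s, c = grid_fit()
mean_y = sum(Y) / len(Y)
sst = sum((y - mean_y) ** 2 for y in Y)
r2 = 1.0 - sse / sst  # coefficient of determination
print(f"u_max = {u_max:.1f}, s = {s:.2f}, c = {c:.2f}, R^2 = {r2:.4f}")
```

A dedicated nonlinear least squares routine would refine the grid optimum; the grid search is used here only to keep the sketch self-contained.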
In the second differences of Table 1, the upward trend leading to their maximum is not visible. The phenomenon is affected by random fluctuations, which can be eliminated to some extent, for example, by exponential (delayed) smoothing.
Taking the smoothing constant $\alpha = 0.5$, we obtain the following smoothed values and their second differences (Table 2).
From Table 2, the level at the inflection point can be estimated at about 70: the second differences change sign between 2005 and 2006, where the smoothed values are 62.427 and 79.317. The saturation level of the phenomenon can therefore be estimated at $2 \times 70 = 140$. The maximum second difference occurs in 2003; since the function takes the value $0.211\,u_{\max}$ at that point, the predicted saturation level in this case is $36.607/0.211 = 173.5$.
Table 2. Exponentially smoothed values for the data of Table 1 (
No. (t) | Year | Smoothed values | Second differences $\Delta^2$ |
---|---|---|---|
1 | 1992 | 0.007 | |
2 | 1993 | 0.021 | 0.02594 |
3 | 1994 | 0.061 | 0.027439 |
4 | 1995 | 0.129 | 0.150848 |
5 | 1996 | 0.347 | 0.666822 |
6 | 1997 | 1.232 | 1.033391 |
7 | 1998 | 3.151 | 1.663384 |
8 | 1999 | 6.733 | 1.848031 |
9 | 2000 | 12.163 | 1.544643 |
10 | 2001 | 19.137 | 1.610262 |
11 | 2002 | 27.722 | 0.300585 |
12 | 2003 | 36.607 | 3.021167 |
13 | 2004 | 48.514 | 2.005957 |
14 | 2005 | 62.427 | 2.977648 |
15 | 2006 | 79.317 | −2.35939 |
16 | 2007 | 93.848 | −3.9439 |
17 | 2008 | 104.434 | −4.14688 |
18 | 2009 | 110.875 | −0.41963 |
19 | 2010 | 116.895 | 1.178996 |
20 | 2011 | 124.094 | 0.925189 |
21 | 2012 | 132.219 | |
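The smoothing step and the two saturation estimates can be retraced with a short script. The recursion $s_t = \alpha y_t + (1 - \alpha)s_{t-1}$ with $s_1 = y_1$ is an assumption; it reproduces the smoothed column of Table 2 for $\alpha = 0.5$. Taking the inflection level as the midpoint of the two smoothed values around the sign change is an illustrative choice. The function names are hypothetical.

```python
YEARS = list(range(1992, 2013))
# Subscriptions per 100 inhabitants, copied from Table 1
Y = [0.007, 0.035, 0.101, 0.196, 0.565, 2.117, 5.069, 10.315, 17.593, 26.112,
     36.307, 45.493, 60.421, 76.339, 96.207, 108.378, 115.021, 117.315,
     122.915, 131.294, 140.343]


def exponential_smoothing(values, alpha):
    """Assumed recursion: s_1 = y_1, s_t = alpha * y_t + (1 - alpha) * s_{t-1}."""
    smoothed = [values[0]]
    for y in values[1:]:
        smoothed.append(alpha * y + (1.0 - alpha) * smoothed[-1])
    return smoothed


def central_second_differences(values):
    """Second differences; element k belongs to position k + 1 of the input."""
    return [values[i + 1] - 2.0 * values[i] + values[i - 1]
            for i in range(1, len(values) - 1)]


smoothed = exponential_smoothing(Y, 0.5)
diffs = central_second_differences(smoothed)

# Estimate 1: the maximum second difference marks the level 0.211 * u_max
k_max = max(range(len(diffs)), key=diffs.__getitem__)
year_of_max = YEARS[k_max + 1]
level_at_max = smoothed[k_max + 1]
saturation_from_max = level_at_max / 0.211

# Estimate 2: the first sign change after the maximum marks the inflection
k_neg = next(k for k in range(k_max, len(diffs)) if diffs[k] < 0.0)
inflection_level = 0.5 * (smoothed[k_neg] + smoothed[k_neg + 1])
saturation_from_inflection = 2.0 * inflection_level

print(year_of_max, round(level_at_max, 3), round(saturation_from_max, 1))
print(YEARS[k_neg], YEARS[k_neg + 1], round(saturation_from_inflection, 1))
```

The first estimate needs data only up to 2004, one year past the maximum of the second differences. The second requires the series to have passed its inflection point.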
In this article, starting from the Riccati equation, we have built an analytical form of a generalized logistic function. We have thus presented a useful tool for forecasting various phenomena, in which the explanatory variable is time, which, according to Nazarko (2018), synthesizes the impact of the many factors affecting the phenomenon. This method can be used with good results in forecasting, for example, economic phenomena.
Its undoubted advantage is that it allows forecasting from a small number of points in a time series. We claim that, by anticipating the logistic development of a phenomenon, we can effectively forecast its saturation level, and the time at which that level will be reached, at an early stage of development. We postulate the use of the third derivative of the logistic function and of the maximum of the central second differences of the terms of a given time series.