AID Vol. 10 No. 3, December 2020
Curve Fitting and Least Square Analysis to Extrapolate for the Case of COVID-19 Status in Ethiopia
Abstract: On 30 January 2020 the World Health Organization (WHO) declared the novel coronavirus (COVID-19), an epidemic virus, a Public Health Emergency of International Concern (PHEIC). The WHO China Country Office had been informed on 31 December 2019 of cases of pneumonia of unknown etiology detected in Wuhan city, Hubei Province, China. Just after the WHO's declaration, Ethiopia took various measures to protect itself from this public health emergency. Because the virus is transmitted from human to human and came from outside the country, checkpoints were opened at the country's points of entry. Nevertheless, on 13 March 2020 the first positive case was reported, in a Japanese man. The virus has continued to spread progressively through the public. While this research was under way, within 90 days of the first case, the country reported 2506 positive cases and 35 deaths. The research was carried out after collecting the first 90 days of Ethiopian data. The daily report announced by the Ethiopian MoH is based on the tests performed; hence, the reported number of COVID-19 positive cases is not the actual number of positive cases in the country. Therefore, this paper contributes to planning and taking further measures against the virus by presenting predicted data for the next 90 days. I use best-fit curve analysis, based on the Python polyfit function, to predict the trend of COVID-19 cases in Ethiopia.

1. Introduction

COVID-19 is an epidemic virus that spread all over the world within a short period of time. The first symptoms were seen on 31 December 2019 [1] [2] [3], and the cases were reported in Wuhan, China [1]. The first case outside China was then reported in Thailand on 13 January 2020 [1]. On 30 January 2020 the World Health Organization (WHO) announced that COVID-19 is an outbreak disease [3] [4] [5]. Within three months, over 82 thousand cases had been reported as COVID-19 positive; of these, more than 3000 (4.03%) died, which is 0.0002% of the total population of China [6] [7]. In the first thirty days there were 5994 cases and 132 deaths (2.20% of total cases). By the end of the second thirty days, that is, after sixty days, the number of cases reported in China had increased to 78,927, of which 2790 died, which is 3.53% of the total cases [8] [9]. By the end of the third thirty days, the total number of cases had reached 82,059. See Table 1 for the USA, Italy and Ethiopia cases.

As seen in Table 1, the spreading speed of the virus varies significantly between countries. Based on one country's status report, we cannot predict what will happen in our country over an equal time interval. In addition to the control mechanisms suggested by the WHO, rigorous further research is needed into why the spreading speed of the virus varies so notably.

In Ethiopia, too, the spreading speed is different. Predictive research for the next 180 or 365 days is therefore needed, based on the country's own reported data rather than on other countries' reports. Such prediction data helps the government to plan for controlling the virus and taking measures against it.

In Ethiopia the first case was reported on 13 March 2020, 73 days after the first case in the world was reported [7] [8] [10]. Since the WHO declared COVID-19 an epidemic virus, the government of Ethiopia and the Ministry of Health have done a lot of constructive and progressive work to keep the virus from entering Ethiopia [11] [12]. This work created awareness of the virus all over the nation, which had a great impact in reducing its spread. As of 27 March 2020 Ethiopia had reported a total of 12 cases, all of them imported [6]. After the first case was reported in Ethiopia, the transmission rate was very low for about the first 50 days relative to other cases in the world [11] [12] [13].

Table 1. COVID-19 cases reported in different countries by time interval.

There were 145 total cases reported as COVID-19 positive within fifty-four days, 0.58% of the total tests. From that day onwards, however, the spreading speed went up. Just 6 days later, the total number of cases had risen from 145 to 250, 0.68% of the total tests. Within sixty days, 105 recoveries, 138 hospitalized patients and 5 deaths had been reported. From Table 2, we see that the transmission rate became high. This paper raises a question: where will the number of COVID-19 cases in Ethiopia go within 180 days?

Curve fitting is a mathematical technique which involves the solution of multiple equations [14] for the purpose of analysis and prediction. To predict the transmission of the virus over the next ninety days, I use best-fit curve analysis in Python. Based on the first ninety days of data, this paper illustrates how the transmission of the virus in the population will vary over the next ninety days, and what the number of people infected with the coronavirus will be at the end of 180 days.

The paper gives a good estimate of the number of people who have the virus, so that the MoH and the Ethiopian Government can use it as an initial source document for taking measures against the virus; for researchers, it can also be a good resource that provides initial values for further data analysis.

2. Curve Fitting Analysis

Curve fitting is a procedure for constructing a curve that has the best fit to a series of data points. It can involve either interpolation, where an exact fit to the data is required, or smoothing, in which a smooth function is constructed that approximately fits the data. Fitted curves can be used for data visualization and to infer values of a function where no data are available. Using a fitted curve to determine values beyond the range of the observed data is called extrapolation; this is subject to a degree of uncertainty [15] [16].

In regression analysis, curve fitting is the process of specifying the model that provides the best fit to the specific curves in the data set. We can fit curves using linear regression, usually by including polynomial terms in the linear model. These terms are independent variables raised to a power.

Table 2. Ethiopian COVID-19 status by time interval.

The method of least squares is a standard approach in regression analysis for approximating the solution of overdetermined systems by minimizing the sum of the squares of the residuals made in the results of every single equation. Its most important application, and the one used here, is data fitting. The best fit in the least-squares sense minimizes the sum of squared residuals. Polynomial least squares describes the variance in a prediction of the dependent variable as a function of the independent variable and the deviations from the fitted curve [16] [17].
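As a minimal sketch of the least-squares idea described here, the following Python snippet fits polynomial coefficients by forming and solving the normal equations. The function name `fit_polynomial_normal_equations` and the sample data are hypothetical and purely illustrative; they are not taken from the paper. NumPy's `numpy.linalg.solve` is used as a stand-in linear solver.

```python
import numpy as np

def fit_polynomial_normal_equations(x, y, degree):
    """Return least-squares coefficients a_0..a_n (lowest power first)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Design matrix with columns x^0, x^1, ..., x^degree
    A = np.vander(x, degree + 1, increasing=True)
    # Normal equations: (A^T A) a = A^T y
    return np.linalg.solve(A.T @ A, A.T @ y)

# Synthetic, noise-free data (not the paper's data): y = 1 + 2x + 3x^2
x = np.arange(6.0)
y = 1 + 2 * x + 3 * x**2
coeffs = fit_polynomial_normal_equations(x, y, degree=2)
print(coeffs)
```

On noise-free data the fit recovers the generating coefficients; with noisy data it returns the coefficients that minimize the sum of squared residuals.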

Polynomial fit: a commonly used linear form is a polynomial. If the degree of the polynomial is $n$, then we have $f(x) = \sum_{i=0}^{n} a_i x^i$, where the basis functions are $f_i(x) = x^i$ ($i = 0, 1, \ldots, n$). The polyfit function sets up and solves the normal equations for the coefficients of the polynomial [18]. The normal equations are solved by Gaussian elimination with pivoting [19]. For this paper I chose to use the Python polyfit module directly. Since low-order polynomials are useful in curve fitting, I used a 4th-order polynomial for the analysis.
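The author's code is not reproduced in the paper, so the following is only a sketch of a 4th-degree fit, assuming NumPy's `numpy.polyfit` as a stand-in for the polyfit routine mentioned above (NumPy's solver may differ internally from the normal-equation approach). The `days` and `cases` arrays are synthetic and illustrative, not the reported Ethiopian data.

```python
import numpy as np

# Synthetic cumulative-case series (illustrative only, not the paper's data)
days = np.arange(1, 91)
cases = 0.00003 * days**4 + 0.5 * days + 1.0

# Fit a 4th-degree polynomial; np.polyfit returns the highest power first
coeffs = np.polyfit(days, cases, deg=4)

# Evaluate the fitted polynomial on the observed days
fitted = np.polyval(coeffs, days)
max_abs_error = np.max(np.abs(fitted - cases))
print(coeffs, max_abs_error)
```

Because the synthetic series is itself a quartic, the fitted curve reproduces it almost exactly; real case counts would leave nonzero residuals.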

General Curve Fitting and Least Square

Curve fitting is the process of constructing a curve, or mathematical function, that has the best fit to a series of data points. In numerical analysis, the classical Runge-Kutta method (RK4) is defined for an initial value problem [14].

It belongs to a family of implicit and explicit iterative methods which includes the well-known Euler method. The function $f$ and the ICV $y_0$ are given in the mathematical formulation [14].

Least squares (LS) is a form of mathematical regression analysis used to determine the line of best fit for a set of data. The LS method provides the overall rationale for the placement of the line of best fit among the data points being studied: it is the line with the smallest sum of squares, called the variance. To calculate the least squares, first find the real value $y_r(t)$ and the predicted value $y_p(t)$ of $y(t)$. The error $e$ is given by $e = |y_r(t) - y_p(t)|$. Then find the sum of the squares of $e$. The value of R-square ($R^2$) is a number between zero and one; the closer $R^2$ is to one, the more accurate the predicted value.
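The calculation described above can be sketched as follows, assuming the common definition of $R^2$ as one minus the ratio of the residual sum of squares to the total sum of squares. The function name `r_squared` and the sample values are illustrative, not from the paper.

```python
import numpy as np

def r_squared(y_real, y_pred):
    """Coefficient of determination from real and predicted values."""
    y_real = np.asarray(y_real, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_real - y_pred                       # e = y_r(t) - y_p(t)
    ss_res = np.sum(residuals**2)                     # sum of squared errors
    ss_tot = np.sum((y_real - y_real.mean())**2)      # spread around the mean
    return 1.0 - ss_res / ss_tot

y_real = [1.0, 2.0, 3.0, 4.0]
perfect = r_squared(y_real, y_real)        # prediction equals the data
mean_only = r_squared(y_real, [2.5] * 4)   # prediction is just the mean
print(perfect, mean_only)
```

A perfect prediction gives a value of one, while predicting the mean everywhere gives zero, which is why values near one indicate a close fit.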

3. COVID-19 in the Case of Ethiopia

3.1. Analytical Result

The first case in Ethiopia was reported on 13 March 2020 [11] [12]. As soon as one Japanese man was found to have the virus, the MoH and the Ethiopian Government took immediate action on this patient and on his friends and co-workers who were suspected of being infected. In the first 30 days, 69 cases were reported as positive out of a total of 3,178 tests. In the first month, 3 deaths were registered, 10 patients recovered and 54 were hospitalized. See Table 3.

From Table 1, in the first 30 days only 69 cases were reported from 3178 tests, which is 2.17%. Of these cases, 14.49% recovered, 78.26% were hospitalized and 4.35% died. During the second 30 days the cumulative number of tests increased from 3178 to 36,606; that is, 33,428 tests were done within one month. Of these, 181 new cases were reported, which is 0.54% of the tests.

As seen in Table 4, in the fifteen days after the second month, 50,658 tests were done, which is 58.05% of the 87,264 total tests to that point. This is a great change in the testing capacity of the ministry. In the second month, the share of tests reported as positive declined: in the first month 2.17% of tests were reported as positive, whereas by the end of the second month this had fallen to 0.68% of the total tests, and to 0.54% of the tests done within the second month. In the third month, at 75 days, the share of reported cases increased from 0.68% to 0.80% of the total tests, and 0.89% of the tests done within those 15 days were reported as positive. This may show that this was the time when the virus was spreading rapidly in the population. Moreover, in the next fifteen days of the third month, 71,257 tests were done, of which 1805 were reported as positive. This is 2.53% of the tests. Overall, within 90 days (the end of three months), a total of 158,521 tests had been done and 2506 cases reported as positive. This is 1.58% of the total tests.
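The percentages quoted above can be reproduced with a short calculation. The sketch below uses only the cumulative test and case totals stated in the text; the day-75 case count of 701 is not quoted directly and is derived here as 2506 − 1805. The function name `positivity_table` is illustrative.

```python
# Cumulative totals from the text: day -> (tests, positive cases).
# The day-75 case count (701) is derived as 2506 - 1805, not quoted directly.
cumulative = {
    30: (3178, 69),
    60: (36606, 250),
    75: (87264, 701),
    90: (158521, 2506),
}

def positivity_table(cumulative):
    """Return (day, cumulative %, interval %) rows of cases per test."""
    rows = []
    prev_tests, prev_cases = 0, 0
    for day in sorted(cumulative):
        tests, cases = cumulative[day]
        overall = 100.0 * cases / tests
        interval = 100.0 * (cases - prev_cases) / (tests - prev_tests)
        rows.append((day, round(overall, 2), round(interval, 2)))
        prev_tests, prev_cases = tests, cases
    return rows

rows = positivity_table(cumulative)
for day, overall, interval in rows:
    print(f"day {day}: cumulative {overall}%, interval {interval}%")
```

The cumulative column corresponds to cases over total tests up to that day, and the interval column to new cases over new tests within each period.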

This is a signal for the government, the MoH and the nation to work harder on protection against the virus.

At the end of 90 days from the day the first case was reported, 2506 cases had been reported; 16% of the cases had recovered, 82.52% were in hospital and 1.40% of the total cases had died.

On average over the ninety days, 1.72% of the total tests were reported as positive, and of the total cases 20.67% recovered, 75.56% were in hospital and 1.61% died.

From Figure 1, the recovery rate increased and the hospitalization rate decreased; that is, the share of the population infected with the virus was decreasing until the fifty-third day after the first report was registered. The death rate was almost constant. After the fifty-third day, however, the trend reversed: the hospitalization rate increased and the recovery rate decreased dramatically.

Table 3. COVID-19 cases in Ethiopia in number.

Table 4. COVID-19 cases in Ethiopia in percent.

3.2. Curve Fitting and Least Square Analysis Result from Python Polyfit Code

After collecting the first ninety days of data, I carried out the curve-fit analysis. I used the Python polyfit algorithm with polynomial degree 4. For comparison I tested the result against the MS Excel trend line, which gives $R^2 = 0.99$ for total cases. This indicates that the accuracy is very good. It also gives the same equation, with equal coefficients, as my Python code. Figure 2 and Figure 3 show the graphs generated from the collected data and from the curve-fit prediction, respectively. As stated above, the closer $R^2$ is to one, the more accurate the prediction, and here $R^2 = 0.99$.

Figures 2-4 show that the predicted values are almost the same as the real data. To visualize the difference more clearly, see Figure 4. Over the 90 days, on average 1.72% of the total tests were reported as positive cases, and 20.67% and 1.61% of the total cases recovered and died, respectively; see Table 5. The data are shown in Table 6 (real data) and Table 7 (predicted values).

Table 5. Average percent of cases, recovered and hospitalized within 90 days.

Figure 1. Recovered, hospitalized and dead population in percent with COVID-19 in Ethiopia for the last 90 days.

Figure 2. 90 days COVID-19 total cases, new cases, hospitalized, recovered and death status in Ethiopia with actual data.

Figure 3. 90 days COVID-19 total cases, new cases, hospitalized, recovered and death status in Ethiopia with predicted data.

Based on the first 90 days of data, and using best-fit curve analysis, the predicted status of COVID-19 in Ethiopia for the second 90 days is shown in Table 8 and Table 9.
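Because extrapolation beyond the observed range is subject to uncertainty, it can help to see how the day-180 value depends on the chosen polynomial degree. The following is a sketch under stated assumptions: the 90-day series is a synthetic exponential curve, not the paper's data, and the helper `extrapolate` is a hypothetical name built on NumPy's `numpy.polyfit` and `numpy.poly1d`.

```python
import numpy as np

# Synthetic 90-day series with exponential growth (illustrative only)
days = np.arange(1, 91)
cases = 10.0 * np.exp(0.05 * days)

def extrapolate(days, values, degree, target_day):
    """Fit a polynomial of the given degree and evaluate it at target_day."""
    model = np.poly1d(np.polyfit(days, values, degree))
    return float(model(target_day))

# Day-180 extrapolation for several polynomial degrees
day_180 = {deg: extrapolate(days, cases, deg, 180) for deg in (2, 3, 4)}
true_180 = 10.0 * np.exp(0.05 * 180)

for deg, value in day_180.items():
    print(f"degree {deg}: {value:,.0f} (synthetic true value {true_180:,.0f})")
```

On this synthetic series each degree gives a different day-180 value, and all of them fall below the underlying curve, which illustrates why extrapolated figures should be read as estimates that depend on the model chosen.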

From Figure 5, Table 8 and Table 9, the total number of cases increases from 2524 on the 90th day to 124,775 on the 180th day. The number of deaths increases from 32 on the 90th day to 2469 on the 180th day. The death rate relative to total cases averaged 1.61% over the first 90 days, whereas, from Table 10, 1.98% of the total cases will die.

Figure 4. 90 days COVID-19 total cases, new, hospitalized, recovered and dead status in Ethiopia with real and predicted data.

Table 6. Real data from day 71 to 90.

Table 7. Predicted data from day 71 to 90.

Table 8. Predicted data from the 91st to the 110th day.

Table 9. Predicted data from the 161st to the 180th day.

Figure 5. COVID-19 status in Ethiopia for the next 180 days.

4. Best Curve Fitting Analysis

From the data in Table 8 and Table 9, the total number of positive COVID-19 cases would be around 124,775 on the 180th day (six months after the first case was reported in Ethiopia). Of these positive cases, 13,066 will have recovered, 109,297 will be in hospital and 2469 patients will have died. Table 10 expresses these figures in percent.

From Table 4, within 90 days the recovery rate was 16% of the total reported cases, the hospitalization rate 82.53% and the death rate 1.4%. In the prediction data, the corresponding rates are 14.86% recovered, 83.80% hospitalized and 1.27% dead out of the total reported cases.

From Figure 6 (actual reported data) and Figure 7 (estimated data), the daily rate of reported positive cases oscillates strongly, between 0.001 and 0.14, for the first 20 days. This is probably because the tests were taken from populations that were already highly exposed. Afterwards the rate becomes stable, between 0.005 and 0.02, up to the 75th day, and then increases to 0.04, while the ratio of total cases to total tests stays below 0.02. From Table 8 and Table 9, the number of patients in hospital decreases while the recovery rate increases for about the first 55 days, and then the hospitalization rate increases dramatically. On the 90th day more than 80% of the cases are in hospital and the recovery rate is less than 20%.

From Figure 8 (actual reported data) and Figure 9 (predicted data), the upper curve shows the rate of hospitalized cases. At the beginning the rate was almost one, which means that almost all patients were in hospital. After ten days it decreased slowly, from 1 to about 0.55 (below 0.6), until around the fifty-fifth day, and then it increased so that within the next 35 days the rate was above 0.8. This implies that the number of cases reported as COVID-19 positive increased. On the ninetieth day, 85% of the patients reported as positive were hospitalized. The bottom curve shows the recovery rate. As shown in Figure 8 and Figure 9, after 60 days the recovery rate decreased from 0.4 to below 0.2, near 0.14, which means that on the ninetieth day the total recovered population was almost 14% of the total reported cases.

Table 10. Predicted cases in number and in percent.

Figure 6. 90 days COVID-19 total case, daily new case and death status rate in Ethiopia.

Figure 7. 90 days estimated COVID-19 total case, daily new case and death status rate in Ethiopia.

Figure 8. 90 days COVID-19 recovered and hospitalized case status rate in Ethiopia with real reported cases.

Figure 9. 90 days estimated COVID-19 recovered and hospitalized case rate in Ethiopia with predicted data.

The data obtained after executing the Python code that generates the polyfit curve are shown in Figures 10-12. The figures show that the rates of total cases, new daily cases, recovery and death are all more than 1. This means that the rate of testing is not keeping up with the rate at which the virus spreads: the virus is transmitted into the population faster than the capacity to test for it. Even the recovery rate is very high; patients have already recovered from the disease, or died, without being identified or tested for the virus.

Figure 10. 180 days predicted COVID-19 status rate in Ethiopia.

Figure 11. 180 days predicted COVID-19 total case, daily new and death status rate in Ethiopia.

Figure 12. 180 days predicted COVID-19 status rate in Ethiopia.

5. Conclusions

Based on the analysis, Ethiopia will have 124,775 positive cases within 180 days. This figure is predicted from the data collected within 90 days, from a total tested population of around 158,521. Of this number, 2506 cases were reported as positive for the virus, which is 1.58% of the total tests (1.72% as an average). The Ethiopian Government and the MoH announced that they are working towards more than 8000 tests per day. For this analysis, assume that the number of tests is 5000 per day, so that within 90 days there are 400,000 new tests. The total number of tests within 180 days then becomes 558,521. From this figure, if on average 1.72% are positive, then around 9907 total cases will be reported within 180 days. Based on Table 4, the rate of cases per test for days 76 to 90 is 2.53%. At this rate, around 14,131 cases will be reported. The question is why the predicted number, close to 124,775 cases reported as COVID-19 positive, is so much higher. At this rate of cases per test, to obtain this result (124,775) the MoH would have to test at least 4.9 million exposed people. This number is just 4.4% of the total population (taking the Ethiopian population as 110 million). To test this many people, 54 thousand per day have to be tested. From the results of the paper, we can deduce that:

· The number of tests per day is very low compared with the transmission rate of the virus; see Table 3. From day 60 to 75 and from day 76 to 90 the rate of reported cases increases tremendously.

· From the predicted data we see that the new cases and the total cases that will be reported run in parallel.

· In the results of this paper, the rate of total cases $= \frac{T.C}{T.T} > 1$. This means that there are COVID-19 positive cases which are not traced by testing.

· The number of people in Ethiopia who are COVID-19 positive is not only the number that is reported daily. The daily report data could be a sample, not the actual data.

· The testing rate and the transmission rate of the virus are not comparable.

· As seen in the results, within 180 days the number of positive cases will be around 124,775; to reach this figure at the observed cases-per-test ratio, the MoH has to test around five million exposed people.

· However, if we extend the time span, for example to 365 days (one year), the transmission rate will decline at some point. Further research is needed to find the peak point and the stabilization time.

· This research is an input for further research to investigate and predict the number of people who will catch the virus.

· Finally, the Ethiopian Government, the MoH and EPHI, the Ethiopian media and, above all, the community have to work harder to reduce the transmission rate of the virus.

· The spread of the virus is not limited to the numbers announced in the news. The number announced daily is based on the daily tests. To know more about the number of people infected with the virus, detailed research is needed, and the sampling should be done not at the regional level but at the level of small areas.
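The test-volume arithmetic in the conclusions can be checked with a short calculation. This is a sketch using only figures stated in the text (the predicted 124,775 cases, the 2.53% cases-per-test rate for days 76 to 90, and a population of 110 million); the 90-day testing window and the variable names are assumptions made for illustration.

```python
predicted_cases = 124_775    # predicted total cases at day 180 (from the text)
positivity = 0.0253          # cases per test for days 76-90 (from the text)
population = 110_000_000     # population figure used in the text
days_remaining = 90          # assumed window: day 91 to day 180

# Tests needed to detect the predicted cases at the observed positivity
tests_needed = predicted_cases / positivity
share_of_population = 100.0 * tests_needed / population
tests_per_day = tests_needed / days_remaining

print(f"tests needed: {tests_needed:,.0f}")
print(f"share of population: {share_of_population:.2f}%")
print(f"tests per day: {tests_per_day:,.0f}")
```

Under these assumptions the result lands close to the figures quoted in the conclusions: roughly 4.9 million tests, a little over 4.4% of the population, and about 54 thousand tests per day.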


Acronyms Descriptions

WHO World Health Organization

T.C Total Reported Cases

T.T Total COVID-19 Laboratory Test

T.H Total Hospitalized cases

T.R Total Recover

T.D Total Death by the Case of COVID-19

N.C New Case Reported per Day

MoH Ethiopian Ministry of Health

EPHI Ethiopian Public Health Institute

R-Square Regression Square

ICV Initial Curve Value

Cite this paper: Balcha, A. (2020) Curve Fitting and Least Square Analysis to Extrapolate for the Case of COVID-19 Status in Ethiopia. Advances in Infectious Diseases, 10, 143-159. doi: 10.4236/aid.2020.103015.

[1]   Binti Hamzah, F.A., Lau, C., Nazri, H., Ligot, D.V., Lee, G., Tan, C.L., et al. (2020) CoronaTracker: World-Wide COVID-19 Outbreak Data Analysis and Prediction. Bulletin of the World Health Organization.

[2]   Hui, D., et al. (2020) The Continuing 2019-nCoV Epidemic Threat of Novel Corona Virus to Global Health—The Latest 2019 Novel Corona Virus Outbreak in Wuhan, China. International Journal of Infectious Diseases, 91, 264-266.

[3]   World Health Organization (WHO) (2020) Novel Corona Virus (2019-nCOV) Situation Report-1.

[4]   World Health Organization (WHO) (2020) Novel Corona Virus (2019-nCOV) Situation Report-2.

[5]   World Health Organization (WHO) (2020) Statement on the Second Meeting of the International Health Regulations (2005) Emergency Committee Regarding the Outbreak of Novel Corona Virus (2019-nCov).

[6]   World Health Organization (WHO) (2020) Corona Virus Disease 2019 (COVID-19) Situation Report-67.


[8]   Worldometer (2020) COVID-19 Coronavirus Pandemic.

[9]   Corona Virus Symptoms, 2020.


[11]   Ethiopian Corona Virus Daily’s Status Report. MoH Twitter.

[12]   Ethiopian Public Health Institute (2020).

[13]   Ethiopian Integrated COVID-19 Control System. Daily COVID-19 Status Report.

[14]   Warrington, D.C. Least Squares Analysis and Curve Fitting. Department of Civil and Mechanical Engineering, University of Tennessee at Chattanooga, Fluid Mechanics Laboratory ENCE3070L.




[18]   Kiusalaas, J. (2013) Numerical Methods in Engineering with Python 3. Cambridge University Press, Cambridge.

[19]   Flowers, B.H. (2009) An Introduction to Numerical Methods in C++ Indian Edition. Oxford University Press, Oxford, 293-294.