The Markov chain model has been widely used in many fields, including education, to study the projection of student enrolment in both secondary schools and tertiary institutions. It has mostly been applied to a single school, university or college because, according to  and  respectively, an education system is comparable to a hierarchical organization in which, after an academic year, three possibilities arise for the new status of a student: the student may move to the next higher class, may repeat the same class, or may leave the system, either successfully as a graduate or as a dropout before attaining the maximum qualification.  shows that movements between the grades of a social process, such as an educational process, can be described by transition probabilities, and that dropouts in the educational system run counter to educational goals. In that paper the consequences of dropouts for the length of stay in school and the cost of education are examined for the two sexes. On application to Nigeria, the average length of stay is found to be small, 1 - 4 years for boys and 3 - 8 years for girls instead of the statutory six years. Markov chain models are also used to model physical systems  and to solve social problems  . A Markov process is a synthesis of movements between states: it describes the relocation of members to different states through a transfer probability matrix, on the basis of the mobility trend in historical data  . The average time taken to complete secondary school education in Nigeria, and hence the numerical success rate, varies from school to school. This has been a matter of discussion among education policy makers.
High rates of student dropout, with female students leaving their studies unfinished because of unplanned pregnancy and male students dropping out of school in search of money to support their families, indirectly affect the internal planning of schools: predicting student enrolment for each session, determining the number of teachers needed to teach all the courses offered, and planning classroom use. School administrators then have problems with strategic planning and with decisions on new student admissions into the various classes. In view of this, we attempt to answer the question: what is the future class structure of an educational system that expands at a uniform rate if the present patterns of wastage and promotion continue while the carrying capacity of the system is not exceeded? An excellent brief review of applications and a concise introduction to Markov concepts are found in  . Using this type of model for our data involves utilizing a probability matrix in order to predict the future enrolment of secondary school pupils.
The central objective of secondary education is to enable young people to acquire the skills, aptitudes, values, knowledge, and experience needed to continue their education and to be active citizens and productive workers.
A policy objective is to ensure that both access and quality are made available to those generally excluded by poverty, ethnicity, gender, and other factors. Projection, defined as the process of obtaining an estimate (or estimates) based on the present situation, future goals and targets, and trends, is a useful tool in achieving this key objective. Projecting future enrolment is one of the most important tasks in educational planning. Projections are based on the assumption that past trends will continue to operate in the future. The reliability and usefulness of projections depend on the assumptions and their closeness to reality. The likely effects of policy changes have to be judged and the projections made accordingly. When an element of judgment is added to a projection, it becomes a forecast. Forecasts have the advantage of being based on an assumption, or a set of assumptions, likely to be realized in the near future, and so can yield a relatively more realistic picture of the future. They should be reviewed frequently in order to determine the degree to which they agree with recent demographic changes. To do this, important variables concerning educational activities need to be used, such as teacher quality, dropout rates, and the sizes of the various grades. Foreknowledge of the future enrolment of students in a school can help in providing adequate manpower, infrastructure, etc. Thus, considering that the state of a student is hierarchical in nature, the stochastic Markovian model finds an application in its study.
Students in a secondary school aspire to reach grade level six and graduate, but not all achieve this; some leave the system before rising to the top classes or grades. In a long-established school the various grades will be composed of students who joined the school at different times and in different grades. Enrolment projection is a necessary activity in educational planning because, even when the enrolment each year shows a generally stable condition, management cannot foresee the overall flow of students when this information is required for future planning. This is essential because in recent times some changes have taken place owing to the competitive nature of the school system in Nigeria.
There are various forecasting models used to estimate future enrolment, such as cohort, regression, ratio, Markov and simulation models. Among these techniques, the Markov chain seems to be the most suitable model for the study, as observed by  . This is because the Markov chain method can estimate not only promotion and repetition rates but also the number of dropouts and graduates in the matrix.
To expand access and enhance relevance and quality, the study of movement through the grades is of interest: it gives the educational career expectation of a student in the school, as well as a forecast of future class sizes and the teacher-pupil ratio, which will help in good budgeting and planning.
In this paper we model the movement of students of the school in question through the secondary education system using a Markov chain. Many applications of the Markov chain technique occur in educational systems. For example, the paper of  carried out a statistical analysis of data from the University of Zimbabwe educational system and described the educational advancement of students through the undergraduate degree programme. That paper reported valuable insights obtained from Markov analysis. The classical Markov chain model for the multi-echelon educational system was developed by  . In the educational field,  proposed a Markovian model to forecast enrolment and degrees awarded in Australian universities.
 proposed an enrolment projection method based on the carrying capacity of the educational system. The method is a refinement of the recruitment control strategy proposed in the literature. They implemented their proposed method using enrolment data from a university setting. The results obtained by extrapolating the short-term shifts in enrolment structure reflect the normal progression pattern in the system.  derived a transition matrix for a multi-echelon educational system, using logistic and Markov chain theoretic methodologies. The explanatory variables of the logistic model are the school differential variables, and the transition matrix of the Markov chain is the non-homogeneous empirical transition matrix (NHETM). They compared the NHETM with the periodically updated transition matrix suggested in the literature using data from a university setting. The results indicated that the NHETM does not violate the flow mechanism of the academic programme and that the higher-order NHETM is not a sparse matrix.
 reported that increased school size also negatively affected students’ ability to identify with their school.  found that in large schools of over 400 students about 30% of the students felt a sense of belonging, whereas in small schools about 70% did. This increased sense of belonging occurred in small schools because (a) people in small schools are more likely to know and respect each other; (b) the anonymity of large schools increases anger and physical violence; and (c) small schools are less intimidating for parents. Similarly,  noted that established relationships are more intense and enduring at smaller schools than at larger schools. In addition,  found a higher degree of cooperation among teachers and students in small schools than in large schools. The lack of personal satisfaction and connectedness experienced by students and teachers in large schools has been a major component of the schools-within-schools movement   .
2. Discrete-Time Markov Chain Model
The discrete-time Markov chain is a mathematical system that undergoes transitions on a state space. It is a random process characterized by the memoryless property: the next state (at time t + 1) depends only on the current state (at time t) and not on the sequence of events that preceded it.
In developing a model of the flow of students through the system, we have to take into consideration the inflow, promotion and wastage (resulting from dropouts or graduation) processes of that system. We shall assume that all promotions occur once, at the end of the year (annually), and that promotions are made only to the next higher grade. The data for this research work are secondary data collected from the administrative department of the Apostolic Faith Secondary School, Akwa Ibom State.
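The memoryless property can be written compactly as follows. The symbols X_t (the grade occupied at time t) and p_ij (the one-step transition probability) are our shorthand for this illustration and are assumptions about notation, not a quotation from the source.

```latex
\Pr\bigl(X_{t+1} = j \mid X_t = i,\, X_{t-1} = i_{t-1},\, \dots,\, X_0 = i_0\bigr)
  = \Pr\bigl(X_{t+1} = j \mid X_t = i\bigr) = p_{ij}
```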
Furthermore, we assume that wastage occurs due to death, illness, poor academic performance, dismissal, transfer to other schools and graduation. We denote the wastage vector by W.
Enrolment into the various grades, together with those students who remain in a grade (that is, repeaters), constitutes the inflow process; inflow can be made into any of the grades at any time. The inflow vector will be denoted by I (Table 1).
1) A 6 × 6 matrix of transition probabilities governing movement within the system, denoted by P.
2) A 6 × 1 vector of wastage probabilities, denoted by W.
3) A 1 × 6 vector of inflow probabilities, denoted by I.
Table 1. Model notation.
1) The sum of the promotion-flow probabilities in each row is less than one (< 1).
This is because in an open system, transitions out of the system are possible.
2) The probabilities of promotion flow plus the probability of wastage sum to one.
This is because a stochastic matrix is a matrix of finite or infinite order with non-negative elements such that the sum of each row is equal to one.
3) The probabilities of inflow into grade j at time (t + 1) sum to one.
This is because the rows of a stochastic matrix sum to one (Table 2).
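In symbols, the three conditions above can be read as follows. The element names p_ij, w_i and I_j (for the entries of P, W and I) are our shorthand and are an assumption about notation; the strict inequality holds for any grade with non-zero wastage.

```latex
\sum_{j=1}^{6} p_{ij} < 1, \qquad
\sum_{j=1}^{6} p_{ij} + w_i = 1, \qquad
\sum_{j=1}^{6} I_j = 1, \qquad i = 1, \dots, 6
```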
Table 2. Students flow format.
The data in the above flow format is then used in:
1) Estimation and validation of the model (test for stationarity);
2) Prediction of the expected future enrolment;
3) Projections of teachers;
4) Estimating the expected wastage;
5) Estimating the expected length of stay;
6) Estimating the variance and standard deviation of length of stay;
7) Calculating the probabilities of attaining higher grades.
2.1. Estimation and Validation of the Model (Test for Stationarity)
The prediction equation is true whether the probabilities are constant or not. But if the assumption of stationarity is not validated, we would have to update the matrix Q before using it to predict for each new time period, as given by  . In other words, we would be dealing with equations of the type
In this section, we shall give a test of the assumption of stationary transition probabilities.
Assumption of constant transition probabilities implies that:
for all t and for all t.
Test for Constant Transition Probabilities
1) Hypotheses
Ho: Transition probabilities are constant over time.
HA: Transition probabilities are not constant over time.
2) Test Statistics
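The statistical inference procedures for Markov chains, following the works of  (pp. 90-100)   and using the principle of maximum likelihood estimation for a multinomial distribution, give the estimates of $p_{ij}(t)$ for each session as
$$\hat{p}_{ij}(t) = \frac{n_{ij}(t)}{n_i(t)}$$
where $n_{ij}(t)$ is the number of students who move from grade i to grade j during session t and $n_i(t)$ is the number of students in grade i at the start of that session.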
3) Decision Rule
We do not reject the null hypothesis of constant transition probabilities if for all t, otherwise we reject.
4) Computation and conclusion will be displayed subsequently.
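As an illustration of the maximum likelihood step, the short Python sketch below turns one session's flow counts into estimated probabilities by dividing each count by its row total. The function name and the counts are hypothetical and are not taken from the school's data; the last column of each row is assumed to hold wastage.

```python
def estimate_transition_probabilities(flows):
    """Return row-wise proportions n_ij / n_i for one session's flow counts.

    flows[i][j] is the (hypothetical) number of students who started the
    session in grade i and ended it in destination j; the last column is
    assumed to hold wastage (dropouts and graduates).
    """
    estimates = []
    for row in flows:
        total = sum(row)
        if total == 0:
            raise ValueError("a grade with no students has no estimable row")
        estimates.append([count / total for count in row])
    return estimates


# Hypothetical two-grade example: columns are grade 1, grade 2, wastage.
flows_one_session = [
    [10, 80, 10],  # grade 1: 10 repeat, 80 are promoted, 10 leave
    [0, 15, 85],   # grade 2: 15 repeat, 85 leave (graduate or drop out)
]
p_hat = estimate_transition_probabilities(flows_one_session)
```

Repeating the calculation for every session gives one estimated matrix per session, which is the input needed to judge whether the probabilities stay constant over time.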
Test for Stationarity
To test the stationarity of the sectional TPMs P(t) with elements pij(t), we use the following layout discussed in the above references (Table 3).
1) Hypotheses
Ho: Transitions from a row state i are stationary.
HA: Transitions from a row state i are not stationary.
2) Test Statistic
The test for stationarity as stated by  is
where α is the level of significance and b is the number of those .
3) Decision Rule
We reject the null hypothesis of stationarity for transitions from a row state i, and for the entire system, if  and  .
Computation and conclusion will be displayed subsequently.
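The test statistic itself did not survive in this text, so the sketch below is an assumption: it uses the usual Anderson-Goodman chi-square for one row state, comparing each session's proportions with the pooled proportions and counting degrees of freedom as (T - 1)(b - 1), where b is the number of destinations with a non-zero pooled probability. The function name and the counts are hypothetical.

```python
def stationarity_chi_square(counts):
    """Chi-square statistic and degrees of freedom for one origin grade.

    counts[t][j] is the number of students moving from the fixed origin
    grade to destination j in session t (hypothetical data layout).
    """
    sessions = len(counts)
    destinations = len(counts[0])
    row_totals = [sum(row) for row in counts]
    if any(total == 0 for total in row_totals):
        raise ValueError("every session needs at least one student")
    grand_total = sum(row_totals)
    pooled = [sum(counts[t][j] for t in range(sessions)) / grand_total
              for j in range(destinations)]
    # Only destinations that are ever reached contribute to the statistic.
    used = [j for j in range(destinations) if pooled[j] > 0]
    statistic = 0.0
    for t in range(sessions):
        for j in used:
            p_t = counts[t][j] / row_totals[t]
            statistic += row_totals[t] * (p_t - pooled[j]) ** 2 / pooled[j]
    degrees_of_freedom = (sessions - 1) * (len(used) - 1)
    return statistic, degrees_of_freedom


# Hypothetical counts for six sessions: repeat, promoted, wastage.
counts_grade_1 = [
    [12, 80, 8], [10, 85, 5], [15, 75, 10],
    [9, 88, 3], [14, 78, 8], [11, 82, 7],
]
statistic, df = stationarity_chi_square(counts_grade_1)
# 18.31 is the 5% critical value for 10 degrees of freedom quoted in Section 3.1.
reject_stationarity = statistic > 18.31
```

With six sessions and three reachable destinations the sketch gives 10 degrees of freedom, which matches the value used for grade 1 in Section 3.1.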
2.1.1. The Prediction Equation for the Expected Future Enrolment
The basic prediction equation as given by  for the expected future size is:
This equation can be expressed using matrix notation as:
$$n(t+1) = n(t)\,Q, \qquad Q = P + w^{T} I$$
where P = 6 × 6 transition probability matrix (TPM); $w^{T}$ = 6 × 1 vector of wastage probabilities; I = 1 × 6 vector of inflow probabilities.
Q is a stochastic matrix with the (i, j)th element given as:
$$q_{ij} = p_{ij} + w_i I_j$$
Table 3. Layout for test of stationarity of transitions from the ith grade ( Contingency Table).
We shall use the behavior of Q to discuss and answer the questions about the model described. Using the predicted value at time (t + 1) we obtain that for (t + 2) and so on.
That is we have
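The repeated use of the prediction equation can be sketched in Python as below. This is a minimal illustration, assuming that Q is formed as the transition matrix plus the product of the wastage vector and the inflow vector, so that every leaver is replaced by an entrant; the three-grade matrix, the inflow vector and the grade sizes are hypothetical and are not the school's data.

```python
def build_q(p, w, inflow):
    """Form Q with elements q_ij = p_ij + w_i * I_j (assumed definition)."""
    size = len(p)
    return [[p[i][j] + w[i] * inflow[j] for j in range(size)]
            for i in range(size)]


def project(grade_sizes, q):
    """One step of the prediction equation n(t + 1) = n(t) Q."""
    size = len(q)
    return [sum(grade_sizes[i] * q[i][j] for i in range(size))
            for j in range(size)]


# Hypothetical three-grade system: repeat on the diagonal, promote above it.
P = [
    [0.10, 0.85, 0.00],
    [0.00, 0.05, 0.90],
    [0.00, 0.00, 0.05],
]
w = [1.0 - sum(row) for row in P]   # wastage = whatever is not kept in the system
inflow = [0.8, 0.1, 0.1]            # hypothetical inflow distribution, sums to one
Q = build_q(P, w, inflow)

n_now = [100.0, 90.0, 80.0]         # hypothetical base-year grade sizes
n_next = project(n_now, Q)          # structure at t + 1
n_after = project(n_next, Q)        # structure at t + 2, from the predicted t + 1
```

If the transition probabilities are not stationary, a different Q would be passed to `project` at each step instead of reusing the same matrix.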
2.1.2. Projections of Teachers
Enrolment statistics form the basis for many investment decisions in education. The teacher is the most important academic input, especially at the primary and secondary levels, and teachers' salaries account for a major share of the recurring expenditure in the federation's education budget. Projections of teacher recruitment should follow enrolment projections.
2.1.3. Method Based on the Number of Pupils per Class and Hours Taught by a Teacher
This is technically a better method of making projections of teacher-requirements in the future, as it takes into account the following variables:
1) Size of the class;
2) Number of hours that the students receive instruction per week;
3) Number of hours taught by a teacher per week.
The following set of data is required:
1) Stage-wise enrolment;
2) Average number of hours per week for a student as per time-table;
3) Average number of students taught at the same time by one teacher;
4) Average number of student-hours per week taught by a teacher.
According to this method following the work of  the requirement of teachers is determined by the following procedure:
where T = number of teachers required; E = projected enrolment; R = average number of students per teacher or per instructional group, or size of the average class; Hs = average number of weekly hours per student, which is generally prescribed in the school curricula; and Ht = average number of weekly hours per full-time teacher.
Equation (12) is very useful for planning purposes. All the different factors can be planned, as none of them is constant. In this equation, the number of teachers required is directly proportional to the number of pupils and the average weekly hours per student.
The following assumptions hold:
1) Teacher-pupil ratio will vary gradually (increase or decrease);
2) Weekly hours per student will remain constant; and
3) Weekly hours per teacher will vary gradually (increase or decrease).
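Equation (12) is not legible in this text. The sketch below assumes it has the usual form T = (E × Hs) / (R × Ht), which agrees with the statement that T is directly proportional to enrolment and to weekly hours per student. The function name is hypothetical; the input values are the 2013/14 figures quoted in Section 3.3.

```python
def teachers_required(enrolment, hours_per_student,
                      pupils_per_teacher, hours_per_teacher):
    """Teachers needed, assuming T = (E * Hs) / (R * Ht)."""
    if pupils_per_teacher <= 0 or hours_per_teacher <= 0:
        raise ValueError("R and Ht must be positive")
    return (enrolment * hours_per_student) / (pupils_per_teacher * hours_per_teacher)


# Base-year inputs quoted in the paper: E = 1623, Hs = 45, R = 44, Ht = 13.2.
base_year_teachers = teachers_required(1623, 45, 44, 13.2)
```

Because R and Ht are planning variables, the same function can be called with the gradually changing values assumed for later years.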
Given as the expected grade size or the structure at time t, the expected wastages at the end of time t is given by:
where $w_j$ is the probability of a student dropping out from grade j and is independent of time.
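A minimal sketch of this calculation, assuming the expected wastage is the sum over grades of the expected grade size multiplied by that grade's wastage probability. The function name and all numbers are hypothetical.

```python
def expected_wastage(grade_sizes, wastage_probs):
    """Return expected leavers per grade and their total."""
    if len(grade_sizes) != len(wastage_probs):
        raise ValueError("one wastage probability is needed per grade")
    per_grade = [n * w for n, w in zip(grade_sizes, wastage_probs)]
    return per_grade, sum(per_grade)


# Hypothetical three-grade structure and wastage probabilities.
per_grade, total = expected_wastage([100.0, 90.0, 80.0], [0.05, 0.05, 0.95])
```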
Expected Length of Stay
Bartholomew (1991) established that the mean length of time spent in a grade in the system is given by:
$$(\mathbf{1} - P)^{-1}$$
where 1 = 6 × 6 identity matrix; P = the transition probability matrix for the base year.
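The inverse of the identity matrix minus the transition matrix can be computed without any external library, as in the sketch below. The three-grade transition matrix is hypothetical, and the helper names are ours.

```python
def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    size = len(matrix)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(size)]
           for i, row in enumerate(matrix)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(size):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[size:] for row in aug]


def expected_length_of_stay(p):
    """Entry (i, j): expected years an entrant to grade i spends in grade j."""
    size = len(p)
    identity_minus_p = [[(1.0 if i == j else 0.0) - p[i][j] for j in range(size)]
                        for i in range(size)]
    return invert(identity_minus_p)


# Hypothetical three-grade transition matrix (repeat, then promote).
P = [
    [0.10, 0.85, 0.00],
    [0.00, 0.05, 0.90],
    [0.00, 0.00, 0.05],
]
stay = expected_length_of_stay(P)
total_stay = [sum(row) for row in stay]  # expected total years in the system
```

Row sums give the expected total time in the system for an entrant to each grade, which is how the totals reported in Section 3.5 can be read.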
Variance and Standard Deviation of Length of Stay
The variance of length of stay is a measure of the variability of length of stay in a grade. It is given by:
where is the element of .
The standard deviation is defined as the square root of the variance in 2.15 above.
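The variance formula is not legible in this text. The sketch below assumes the standard result for absorbing Markov chains, in which the variance of the time spent in grade j by an entrant to grade i is m_ij(2 m_jj - 1) - m_ij², where m_ij is an element of the expected length-of-stay matrix. The symbol m_ij, the function name and the two-grade matrix are hypothetical.

```python
def variance_of_stay(stay):
    """Element-wise variance m_ij * (2 * m_jj - 1) - m_ij ** 2 (assumed formula)."""
    size = len(stay)
    return [[stay[i][j] * (2.0 * stay[j][j] - 1.0) - stay[i][j] ** 2
             for j in range(size)]
            for i in range(size)]


# Hypothetical length-of-stay matrix for a two-grade system in which each
# grade is repeated with probability 0.1 and grade 1 promotes with 0.8.
stay = [
    [1 / 0.9, 0.8 / 0.81],
    [0.0, 1 / 0.9],
]
variance = variance_of_stay(stay)
# max(..., 0.0) guards against tiny negative values from rounding.
std_dev = [[max(v, 0.0) ** 0.5 for v in row] for row in variance]
```

For the first grade the result reduces to the variance of a geometric waiting time, which is a quick sanity check on the formula.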
The Probability of Attaining Higher Grades from Grade i
This is the probability that an entrant to any grade i attains higher grades and is given as:
where is the element of and ij denotes the probability that an entrant to grade i will attain grade j.
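The formula is not legible in this text. The sketch below assumes the standard absorbing-chain result that the probability of ever reaching grade j from grade i is the (i, j) element of the expected length-of-stay matrix divided by its (j, j) element. The function name is hypothetical; the two values used are the grade 2 lengths of stay quoted in Section 3.5 for entrants to grades 1 and 2.

```python
def attainment_probabilities(stay):
    """Probability that an entrant to grade i ever reaches a higher grade j."""
    size = len(stay)
    result = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(i + 1, size):
            result[i][j] = stay[i][j] / stay[j][j]
    return result


# Top-left 2 x 2 corner of the length-of-stay matrix, using values quoted in
# Section 3.5: an entrant to grade 1 spends 1.1885 years in grade 1 and
# 1.0250 years in grade 2; an entrant to grade 2 spends 1.0419 years there.
stay_corner = [
    [1.1885, 1.0250],
    [0.0, 1.0419],
]
reach = attainment_probabilities(stay_corner)
grade_1_to_2 = reach[0][1]
```

The ratio 1.0250 / 1.0419 is about 0.98, in line with the 98% chance of reaching grade 2 reported in Section 3.7.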
3. Data Analysis and Discussions (Tables 4-10)
Since  for all t, we reject the null hypothesis and conclude that the transition probabilities are not constant over time.
3.1. Testing for Stationarity
The test for stationarity described in Section 2.1 is applied to the data. In this application the states are  (grade levels) and the times of observation are  (2008/09-2013/14).
For ease of exposition, the transitions calculated for each time of observation are given and the procedure is illustrated using i = 1. The results for the other states i = 2, 3, 4, 5, 6 are only stated.
Table 4. Transition probabilities for the year 2008/2009.
Table 5. Transition probabilities for the year 2009/2010.
Table 6. Transition probabilities for the year 2010/2011.
Table 7. Transition probabilities for the year 2011/2012.
Table 8. Transition probabilities for the year 2012/2013.
Table 9. Transition probabilities for the year 2013/2014.
Table 10. Transition probabilities for the year 2008/09-2013/2014.
We now present the test of stationarity of transitions from each row state i and for the whole TPM, using the relations (a) and (b) and the layout in Table 3. Details are given here only for the first grade, i = 1 (Table 11).
Table 11. Array for testing stationarity of transition from grade 1 to other grades.
Here b = 3 implies that the degrees of freedom (df) are (T − 1)(b − 1) = (6 − 1)(3 − 1) = 10, and at the 5% level of significance the critical value is χ²(10) = 18.31. Since 64.3358 > 18.31, we reject the null hypothesis of stationarity of transitions from grade 1 to the other grades over time (study period). The results of similar tests for grades 2, 3, 4, 5 and 6 and for the entire TPM are set down in the table below (Table 12).
From the table above, the rejection of the null hypothesis is most probably due to the high mobility or transition rate in each grade.
Calculation of the Stochastic Matrix Q
The stochastic matrix Q was defined as:
$$Q = P + W^{T} I$$
where P = the transition probability matrix (TPM); W = the vector of wastage probabilities; I = the inflow probability vector.
The (i, j)th element of Q is defined by:
$$q_{ij} = p_{ij} + w_i I_j$$
From Table 10, the w and I vectors of wastage and inflow probabilities are given by:
And the transition probability is given in the table below (Table 13).
From these and using the relation connecting Q, P, W and I, we obtain Q.
3.2. Future Grade Size
The prediction equation was defined as n(t + 1) = n(t)Q, but since the assumption of stationarity is not validated, we have to update the matrix Q before using it to predict for each new time period. This implies that:
Table 12. Results of test of stationarity of transition probabilities.
Table 13. Transition probability matrix (TPM).
Hence we update the matrix as follows:
Using 2013/2014 as the base year and with the number of students at this time as:
Recall that $p_{ij}$ is the probability of a student moving from grade i to grade j at the end of the session and $n_{ij}$ is the number of students who move from grade i to grade j at the end of the session (Table 14, Table 15).
Therefore to obtain
Table 14. Prediction equation for year (t + 1) .
Table 15. Estimated projection for 2014/2015.
Table 16. Total 2008/2009-2014/2015.
Table 17. Total 2008/2009-2014/2015.
Table 19. Estimated projection for 2015/2016.
To obtain for = we compute as above.
The structure for the three years ahead is given in the table below (Table 20).
An educational planner will use the projected structures to plan and provide adequate infrastructure needed in the Secondary School system by taking into consideration the variations.
3.3. Projection of Teachers
We begin by computing the base year (2013/14) ratio on the basis of resources available.
The number of teachers required for the three years ahead, following the method in the analysis section, is shown in the table below (Table 21).
For 2013/14, E = 1623, Hs = 45, Ht = 13.2 and R = 44, where T is the number of teachers required; E is the projected enrolment; R is the average number of students per teacher or per instructional group, or the size of the average class; Hs is the average number of weekly hours per student, which is generally prescribed in the school curricula; and Ht is the average number of weekly hours per full-time teacher.
The following assumptions have been made in the above calculations:
1) Teacher-pupil ratio will be gradually decreased from 44 in 2013/14 to 43 in 2016/17;
2) Weekly hours per student will remain the same; and
3) Weekly hours per teacher will be gradually increased from 13.2 hours in 2013/14 to 14.0 hours in 2016/17 (Table 22).
Table 20. Projected structures for the three years ahead.
Table 21. Grade size data.
3.4. Expected Wastage
Recall that wastage and inflow are random and as such we are justified to talk of expected wastages (Table 23).
We thus obtain the following result for the predicted three years ahead using Equation (13)
3.5. The Expected Length of Stay in a Grade
It is of interest to the educational planner in an organization to have an idea of the length of time a student is likely to spend in a given grade and also the mean total time spent in the system.
Hence the expected length of stay, as given by Bartholomew (1982) and stated in the method of data analysis, can be obtained using Equation (14):
P = the transition probability matrix (using the transition matrix P for the base year 2008/09-2013/14).
Table 22. Projection of Teachers for the three years ahead.
Table 23. Expected wastage for the three years ahead.
The above shows the total expected length of stay in the system as well as the time spent in each grade. For example, on entering grade 1, a student is expected to spend 1.1885 years in the first grade, 1.0250 years in the second grade, 0.9745 year in the third, 0.9038 year in the fourth, 0.8365 year in the fifth and 0.8068 year in the sixth grade. On the whole, a new entrant into this system is expected to spend about 5.74 years (approximately 6 years) in the system. This result is to be expected considering that some students in grade 1 fail and are not promoted to the next higher class, some students drop out, and about 70% of the students are promoted to the next higher class.
Similarly, an entrant into grade 2 is expected to spend a total of 4.6213 years in the system, divided into 1.0419 years in the second grade, 0.9905 year in the third grade, 0.9187 year in the fourth grade, 0.8502 year in the fifth grade and 0.8200 year in the sixth grade. We observe, generally, that the total expected length of stay decreases as a student ascends the hierarchy. This result reflects the increase in wastage.
3.6. Variance and Standard Deviation of Length of Stay in a Grade
Applying Equation (2.15) to the above matrix, we obtain the variance as:
The corresponding standard errors are
We observe from the above matrices of variance and standard error that the variability in the expected lengths of stay in a given grade is not too high or low.
3.7. The Probability of an Entrant to Grade i to Attain Higher Grades
We calculate the probabilities that an entrant to any grade i attains higher grades by using Equation (2.16)
From the above results we observe that an entrant to grade 1 has a chance of about 98% of ever being in grade 2, 94% of being promoted to grade 3, 86% of being promoted to grade 4, 80% of being promoted to grade 5 and 76% of being promoted to grade 6. Similarly an entrant to grade 5 has 95% chance of being promoted to grade 6.
4. Summary and Conclusion
Based on the results of this research work, which indicate a decrease in the future grade sizes and number of teachers in the system, the school management should provide the facilities necessary to reduce wastage in the entire system, as it is evident from the future projections that the inflow level is inversely proportional to the wastage level. We recommend that educational planners in both private and government schools use this model to project future enrolment, especially in secondary schools where the assumptions underlying the use of the model are met, as knowledge of the future size will help in wise management and in infrastructure and manpower development.
 Uche, P.I. (1980) A Transition Model of Academic Survival in a Single Channel System with an Empirical Examination. International Journal of Mathematical Education in Science and Technology, 11, 177-183.
 Chukwu, W.I.E. (1992) A Markov Chain Model for Crop Planning in Nsukka Environs Using Rainfall Data. East African Agricultural and Forestry Journal, 57, 253-259.
 Tsai, M.-C., Li, Y.-L. and Chung, P.-H. (2014) A Study on Predicting the Turnover of Nursing Staff of Different Education Backgrounds: Using the Absorbing Markov Chain Method. Journal of Quality, 21, 501-520.
 Barrington, B.L. and Hendricks, B. (1989) Differentiating Characteristics of High Graduates, Dropouts, and Non-Graduates. Journal of Educational Research, 82, 309-319.
 Gani, J. (1963) Formulae for Projecting Enrolments and Degrees Awarded in Universities. Journal of the Royal Statistical Society. Series A (General), 126, 400-409.
 Cotton, K. (2003) School Size, School Climate, and Student Performance. School Improvement Research Series, Close-Up 20.
 Pethel, G.E. (2005) An Investigation of the Relationship of School Size and Program Quality in the Public High Schools of Georgia. Unpublished Doctoral Dissertation, University of Georgia, Athens.