Researchers use different approaches to introduce additional parameters into a continuous class of distributions, because in many applications the classical probability distributions do not fit real-life data well. In other words, all of these approaches extend a classical baseline distribution by adding one or more parameters, thereby making the extended distribution flexible enough to fit a wide range of data from practical situations. With this approach, several generalized families of distributions have been proposed and applied to real-life data in areas such as engineering, life sciences, environmental sciences, finance and medical sciences.
Recently, there have been many attempts in the statistics literature to generalize distributions. These generalizations are mainly based on methodologies proposed by many researchers, as in . The most frequently used is the T-X approach by . Some of the generalized families of distributions based on this approach in the literature include the Weibull-G family by , the Lomax Generator of distributions by , the Odd Generalized Exponential family by , the Odd Lindley-G family by , the Gompertz-G family by , the Zubair-G family by , the Odd Frechet-G family by , the Power Lindley-G family by , the Topp Leone Exponentiated-G family by , the Odd Chen-G family by , the Burr X Exponential-G family by , and the Inverse Lomax-G family by .
The objective of this paper is to propose a new family of distributions, called the Kumaraswamy Odd Rayleigh-G (KORG) family, which can provide more robust compound probability distributions for modelling real-life data sets. This new family adds three additional parameters to the baseline distribution.
The rest of this article is structured as follows. In Section 2, we define the Kumaraswamy Odd Rayleigh-G (KORG) family. In Section 3, we derive some models based on the KORG family. In Section 4, we present the method used to estimate the model parameters. We conduct a Monte Carlo simulation study using the Kumaraswamy Odd Rayleigh Log-Logistic (KORLL) model in Section 5. In Section 6, we apply the KORLL model to five real-life data sets and compare its performance with that of some existing distributions. Lastly, Section 7 concludes the paper.
2. Kumaraswamy Odd Rayleigh-G (KORG) Family
Attempts have been made to define new families of probability distributions that make well-known baseline distributions more flexible for practical data modelling. In the spirit of the T-X approach by , this paper defines the cumulative distribution function (cdf) as
where is the function of the baseline cdf H(x) of any continuous random variable X. The function must satisfy the following conditions:
(b) is differentiable and monotonically non-decreasing;
(c) tends to a as x tends to −∞;
(d) tends to b as x tends to ∞.
Let T be a continuous random variable with probability density function (pdf) z(t) defined on the closed interval [a, b].
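For orientation, the T-X construction referred to here is usually written in the following general form. The symbols W (the link function applied to the baseline cdf) and Z (the cdf of T) are assumptions introduced for illustration, since the original notation is not recoverable from the text.

```latex
F(x) = \int_{a}^{W[H(x)]} z(t)\,\mathrm{d}t = Z\{W[H(x)]\},
\qquad
f(x) = z\{W[H(x)]\}\,\frac{\mathrm{d}}{\mathrm{d}x}\,W[H(x)] .
```

Under the listed conditions on W, F rises from 0 to 1 as x runs over the support of X, so F is a valid cdf.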
In 2011,  introduced the Kumaraswamy-G family of distributions. Its probability density function (pdf) and cumulative distribution function (cdf) are given by:
where and are the cdf and pdf of the baseline distribution with parameter vector .
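As a minimal sketch, the Kumaraswamy-G construction in its commonly cited form, cdf 1 − [1 − G(x)^a]^b, can be coded as below. The parameter names `shape_a` and `shape_b` and the Uniform(0, 1) baseline are assumptions chosen for illustration and may not match the notation used in this paper.

```python
def kum_g_cdf(G, shape_a, shape_b):
    """Kumaraswamy-G cdf, given the baseline cdf value G = G(x) in [0, 1]."""
    return 1.0 - (1.0 - G ** shape_a) ** shape_b


def kum_g_pdf(G, g, shape_a, shape_b):
    """Kumaraswamy-G pdf, given baseline cdf value G and baseline pdf value g."""
    return (shape_a * shape_b * g * G ** (shape_a - 1.0)
            * (1.0 - G ** shape_a) ** (shape_b - 1.0))


# Illustration with a Uniform(0, 1) baseline, where G(x) = x and g(x) = 1.
x = 0.4
cdf_value = kum_g_cdf(x, shape_a=2.0, shape_b=3.0)
pdf_value = kum_g_pdf(x, 1.0, shape_a=2.0, shape_b=3.0)
print(cdf_value, pdf_value)
```

Setting both shape parameters to 1 returns the baseline cdf unchanged, which is a quick sanity check on any implementation.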
The Odd Rayleigh-G family has pdf and cdf given by
The cdf of the proposed KORG family of distributions is given by
where , the vector is the parameter of the baseline distribution and .
From Equation (1),
if , the and if , . So,
and this can be written as
whence the proof. if , and if
From Equation (7), the pdf of the KORG family can be written as
Substituting Equations (4) and (5) into Equation (11) yields
Similarly, differentiating Equation (6) with respect to x will also yield Equation (12). Figure 1(a) illustrates the density function with different parameter values. It is obvious from this graph that values of x. And to evaluate this integral
and if , then
Figure 1. Density and hazard rate plots of KORLL distribution with varying parameter values.
if , then
which shows that is a pdf for the continuous random variable X. The hazard function and survival function of the KORG family are given as
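Whatever closed forms the family takes, these two functions follow from the cdf F and the pdf f through the standard identities below, written here in generic notation rather than in the paper's own symbols.

```latex
S(x) = 1 - F(x), \qquad h(x) = \frac{f(x)}{S(x)} = \frac{f(x)}{1 - F(x)} .
```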
Quantile Function of KORG
The quantile function of the KORG model is given as
where is the quantile function of the baseline distribution.
3. Sub-Models of KORG Family
In this section, we consider two sub-models of the KORG family: the Kumaraswamy Odd Rayleigh Log-Logistic (KORLL) and the Kumaraswamy Odd Rayleigh Inverse Rayleigh (KORIR) distributions.
3.1. The KORLL Model
The cdf and pdf of log-logistic (LL) distribution are given as
The quantile function of the LL distribution is given by
where u is uniformly distributed in the interval (0, 1). Then, the KORLL distribution has the cdf given by:
The corresponding pdf of Equation (16) is given below:
The hazard function (hf), and survival function (sf) are presented below:
3.1.1. Quantile Function of KORLL
Let the random variable u be uniformly distributed on (0, 1). Define the random variable y as
then the random variable x defined as
has a Kumaraswamy Odd Rayleigh Log-Logistic distribution, i.e. . And when , x is distributed as .
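The inverse-transform idea behind this construction can be sketched for the log-logistic baseline alone. The code assumes the common scale-shape parameterization F(x) = 1 / (1 + (x/scale)^(-shape)); the names `scale` and `shape` are illustrative and may differ from the paper's parameterization. It does not implement the KORLL quantile function itself.

```python
import random


def ll_cdf(x, scale, shape):
    """Log-logistic cdf for x > 0: F(x) = 1 / (1 + (x / scale) ** (-shape))."""
    return 1.0 / (1.0 + (x / scale) ** (-shape))


def ll_pdf(x, scale, shape):
    """Log-logistic pdf for x > 0 (the derivative of ll_cdf)."""
    z = (x / scale) ** shape
    return (shape / x) * z / (1.0 + z) ** 2


def ll_quantile(u, scale, shape):
    """Log-logistic quantile function for 0 < u < 1."""
    return scale * (u / (1.0 - u)) ** (1.0 / shape)


def ll_sample(n, scale, shape, rng):
    """Draw n log-logistic variates by inverse-transform sampling."""
    draws = []
    while len(draws) < n:
        u = rng.random()
        if u > 0.0:  # skip the (extremely unlikely) boundary value u = 0
            draws.append(ll_quantile(u, scale, shape))
    return draws


rng = random.Random(2024)
sample = ll_sample(5, scale=2.0, shape=3.0, rng=rng)
print(sample)
```

The same pattern applies to any model with a closed-form quantile function: draw u from Uniform(0, 1) and push it through the quantile function.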
Figure 1 illustrates the various shapes of the density and hazard functions of the KORLL distribution at various parameter values. The density can be symmetric, skewed, or unimodal, depending on the parameter values chosen. The hazard function can take many shapes, including J-shaped and non-decreasing, depending on the parameter values.
Table 1 presents the skewness and kurtosis of both the baseline log-logistic distribution and the KORLL distribution, computed from the quantile function in Equation (21) using Equations (22) and (23), respectively. For the chosen parameter values, the skewness of the log-logistic distribution ranged from −1.4352 to −0.0686, whereas that of the KORLL distribution ranged from −0.0696 to 0.3479. In terms of skewness, the KORLL model is clearly more flexible than the log-logistic distribution. Similarly, the kurtosis of the baseline and extended distributions ranged from −2.9641 to −0.1024 and from −0.1646 to 31.0576, respectively. This further suggests the flexibility of the KORLL distribution over the log-logistic distribution.
Table 1. Skewness and Kurtosis using different parameter values.
Figure 2. cdf and sf plots of KORLL distribution with varying parameter values.
3.1.2. Skewness and Kurtosis
The skewness and kurtosis of the KORLL distribution can be computed from its quantile function. Bowley's skewness (by ) is based on quartiles and is defined as
and Moors' kurtosis (by ) is based on octiles and is given by
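Both measures need only a quantile function, so they are straightforward to compute. The sketch below uses the standard definitions of Bowley's skewness and Moors' kurtosis; the log-logistic quantile function used in the illustration assumes a scale-shape parameterization and stands in for the KORLL quantile function, which is not reproduced here.

```python
def bowley_skewness(quantile):
    """Bowley skewness: [Q(3/4) + Q(1/4) - 2 Q(1/2)] / [Q(3/4) - Q(1/4)]."""
    q1, q2, q3 = quantile(0.25), quantile(0.5), quantile(0.75)
    return (q3 + q1 - 2.0 * q2) / (q3 - q1)


def moors_kurtosis(quantile):
    """Moors kurtosis: [Q(7/8) - Q(5/8) + Q(3/8) - Q(1/8)] / [Q(6/8) - Q(2/8)]."""
    e = [quantile(k / 8.0) for k in range(1, 8)]  # octiles Q(1/8), ..., Q(7/8)
    return (e[6] - e[4] + e[2] - e[0]) / (e[5] - e[1])


def ll_quantile(u, scale=1.0, shape=2.0):
    """Log-logistic quantile function (assumed scale-shape parameterization)."""
    return scale * (u / (1.0 - u)) ** (1.0 / shape)


print(bowley_skewness(ll_quantile), moors_kurtosis(ll_quantile))
```

Because both measures are ratios of quantile differences, they exist even when the moments of the distribution do not.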
3.2. The KORIR Model
The cdf and pdf of the baseline Inverse Rayleigh distribution are given as
where is the scale parameter. The quantile function (qf) is given by
where u is uniformly distributed on (0, 1). The cdf and pdf of the KORIR distribution are given as
3.2.1. Quantile Function of KORIR
Let the random variable u be uniformly distributed on (0, 1). Define the random variable y as in Equation (20); then the random variable x defined as
has a Kumaraswamy Odd Rayleigh Inverse Rayleigh distribution, i.e. . And when , x is distributed as .
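For the inverse Rayleigh baseline, a minimal sketch of the cdf, pdf and quantile function is given below. It assumes the parameterization F(x) = exp(-(sigma/x)^2) with scale `sigma`; the paper's own form may differ (for instance by using sigma squared as the parameter), so the names here are illustrative only.

```python
import math


def ir_cdf(x, sigma):
    """Inverse Rayleigh cdf for x > 0 (assumed form exp(-(sigma / x) ** 2))."""
    return math.exp(-(sigma / x) ** 2)


def ir_pdf(x, sigma):
    """Inverse Rayleigh pdf for x > 0 (the derivative of ir_cdf)."""
    return 2.0 * sigma ** 2 / x ** 3 * math.exp(-(sigma / x) ** 2)


def ir_quantile(u, sigma):
    """Inverse Rayleigh quantile function for 0 < u < 1."""
    return sigma / math.sqrt(-math.log(u))


median = ir_quantile(0.5, sigma=1.5)
print(median, ir_cdf(median, sigma=1.5))
```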
4. Parameter Estimation
The parameters of the KORG family are estimated in this section using the method of maximum likelihood. Given a random sample of size n, with parameters and , from the KORG family of distributions, the pdf of the KORG family can be written as
Let be the (p × 1) parameter vector, then the log-likelihood function based on Equation (25) is given by
Partially differentiating the log-likelihood function with respect to each parameter yields the components of the score function as follows
where , , , and .
The estimators of the parameters can be obtained by setting Equations (29)-(32) to zero and solving the resulting system numerically using the Newton-Raphson method or any other iterative method.
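The Newton-Raphson idea can be illustrated on a one-parameter stand-in. The sketch below solves the score equation of the inverse Rayleigh baseline, assuming the cdf exp(-(sigma/x)^2), for which the MLE also has a closed form that serves as a check. It is not the four-equation KORG system, and the data values are made up purely for illustration.

```python
import math


def ir_score(sigma, data):
    """Derivative of the inverse Rayleigh log-likelihood with respect to sigma."""
    n = len(data)
    s = sum(x ** -2 for x in data)
    return 2.0 * n / sigma - 2.0 * sigma * s


def ir_score_derivative(sigma, data):
    """Second derivative of the log-likelihood with respect to sigma."""
    n = len(data)
    s = sum(x ** -2 for x in data)
    return -2.0 * n / sigma ** 2 - 2.0 * s


def newton_raphson(score, score_derivative, start, data, tol=1e-10, max_iter=100):
    """Iterate theta <- theta - U(theta) / U'(theta) until the step is tiny."""
    theta = start
    for _ in range(max_iter):
        step = score(theta, data) / score_derivative(theta, data)
        theta -= step
        if abs(step) < tol:
            return theta
    raise RuntimeError("Newton-Raphson did not converge")


data = [0.8, 1.1, 1.5, 2.3, 0.9, 1.7, 3.2, 1.2]  # made-up illustrative values
sigma_hat = newton_raphson(ir_score, ir_score_derivative, start=1.0, data=data)
closed_form = math.sqrt(len(data) / sum(x ** -2 for x in data))
print(sigma_hat, closed_form)
```

For a multi-parameter model such as KORG, the scalar derivative is replaced by the Hessian matrix of the log-likelihood, and in practice a library optimizer is normally used.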
5. Monte Carlo Simulation
A Monte Carlo simulation is conducted, and the bias and root mean squared error (RMSE) of the estimated parameters are presented in Table 2. The purpose of the simulation study is to assess the performance of the maximum likelihood estimates and to check whether the estimates of the model parameters approach the true parameter values. The Monte Carlo simulation proceeds as follows:
(a) For known parameter values i.e. , samples of different sizes from the KORLL distribution were generated ( , , , and ) using the quantile function defined in Equation (21).
(b) Using the maximum likelihood method, we compute the MLE of , , , and for the ith replicate.
(c) Steps (a) and (b) are replicated N = 1000 times.
Table 2. Simulation results for the KORLL distribution.
(d) The bias and RMSE for each sample size n are computed as
where are the MLEs for each iteration . The simulation results in Table 2 show that, for the parameter values chosen, the estimated biases decrease as the sample size n increases. In addition, the estimated root mean squared errors decay towards zero as the sample size increases. These two observations illustrate the consistency of the maximum likelihood estimates.
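Steps (a) to (d) can be sketched as follows, using the usual definitions of bias (the mean of estimate minus true value) and RMSE (the square root of the mean squared difference). As a stand-in for the KORLL model, the sketch uses the one-parameter inverse Rayleigh baseline with assumed cdf exp(-(sigma/x)^2), whose MLE has a closed form. The true value, sample sizes and seed are illustrative choices, not those of Table 2.

```python
import math
import random


def ir_quantile(u, sigma):
    """Inverse Rayleigh quantile function (assumed cdf exp(-(sigma / x) ** 2))."""
    return sigma / math.sqrt(-math.log(u))


def ir_mle(data):
    """Closed-form MLE of sigma under the assumed inverse Rayleigh form."""
    return math.sqrt(len(data) / sum(x ** -2 for x in data))


def open_uniform(rng):
    """Uniform draw on the open interval (0, 1)."""
    u = rng.random()
    while u == 0.0:
        u = rng.random()
    return u


def simulate(true_sigma, n, replications, rng):
    """Return (bias, RMSE) of the MLE over the given number of replications."""
    estimates = []
    for _ in range(replications):
        # Step (a): generate a sample of size n through the quantile function.
        sample = [ir_quantile(open_uniform(rng), true_sigma) for _ in range(n)]
        # Step (b): compute the MLE for this replicate.
        estimates.append(ir_mle(sample))
    # Step (d): bias and RMSE over all replicates.
    bias = sum(est - true_sigma for est in estimates) / replications
    rmse = math.sqrt(sum((est - true_sigma) ** 2 for est in estimates) / replications)
    return bias, rmse


rng = random.Random(12345)
results = {n: simulate(true_sigma=1.5, n=n, replications=1000, rng=rng)
           for n in (20, 50, 100, 200)}
for n, (bias, rmse) in results.items():
    print(f"n={n:4d}  bias={bias:+.4f}  rmse={rmse:.4f}")
```

For a consistent estimator, both columns should shrink as n grows, which is the pattern the paper reports for the KORLL estimates.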
6. Applications
Here, we illustrate the applicability of the KORLL distribution to five data sets. Data set I represents the survival times of 121 patients with breast cancer as reported by . Data set II represents the Marine water as reported by . Data set III consists of 101 data points on the stress-rupture life of Kevlar 49/epoxy strands subjected to constant sustained pressure at the 90 percent stress level until all had failed, as in . Data set IV represents the death times (in weeks) of patients with cancer of the tongue with aneuploidy DNA profile as reported by . Data set V, due to , consists of lifetime data (in months, from 1st January 2013 to 31st July 2018) for 105 patients who were diagnosed with hypertension and received at least one hypertension-related treatment in the hospital, where death is the event of interest.
We used the maxLik package by  in R by . The analytical measures used to compare model fit are the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC). Smaller values of these statistics indicate a better fit. The competing models are as follows:
(i) The Marshall-Olkin Extended Log-Logistic (MOELL) as in  with cdf
(ii) The Kumaraswamy Log-Logistic (KUMLL) as in  with cdf
(iii) The Zografos-Balakrishnan Log-Logistic (ZBLL) as in  with cdf
Based on the considered analytical measures, the proposed KORLL model provides the best fit to the five real-life data sets analysed; the results are presented in Tables 3-7. The proposed model outperforms the other four competing extensions of the log-logistic distribution presented.
Table 3. MLEs of the parameters with SEs (in parentheses), BIC, −ll, and AIC values for Data set I.
Table 4. MLEs of the parameters with SEs (in parentheses), BIC, −ll, and AIC values for Data set II.
Table 5. MLEs of the parameters with SEs (in parentheses), BIC, −ll, and AIC values for Data set III.
Table 6. MLEs of the parameters with SEs (in parentheses), BIC, −ll, and AIC values for Data set IV.
Table 7. MLEs of the parameters with SEs (in parentheses), BIC, −ll, and AIC values for Data set V.
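The two comparison criteria are simple functions of the maximized log-likelihood. The sketch below uses the standard definitions AIC = 2k − 2ℓ and BIC = k ln(n) − 2ℓ, where k is the number of parameters and n the sample size. The log-likelihood values in the example are hypothetical and are not taken from Tables 3-7.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2*loglik."""
    return 2.0 * n_params - 2.0 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k*ln(n) - 2*loglik."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


# Hypothetical fits to a sample of 121 observations: (log-likelihood, number of parameters).
fits = {"five-parameter model": (-580.0, 5), "three-parameter model": (-585.0, 3)}
for name, (loglik, k) in fits.items():
    print(name, aic(loglik, k), round(bic(loglik, k, 121), 3))
```

BIC penalizes extra parameters more heavily than AIC once n exceeds about 7, so a five-parameter model has to improve the log-likelihood noticeably to win on both criteria.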
7. Conclusion
In this paper, a new family of distributions called the Kumaraswamy Odd Rayleigh-G family, which introduces three additional parameters to the baseline distribution, is proposed and studied. This new family offers more flexibility and provided the best fit among the models compared for the practical data sets considered. The Monte Carlo simulation results indicate that the estimates of the parameters of the sub-model of this family approach the true values as the sample size increases. Also, the root mean squared errors decay towards zero as the sample size becomes large. These facts suggest the consistency of the estimates. Based on the considered analytical measures, we conclude that the proposed family, represented in this study by the Kumaraswamy Odd Rayleigh Log-Logistic distribution, provided the best fit to the five real-life data sets analysed, which include the survival times of 121 patients with breast cancer and the death times (in weeks) of patients with cancer of the tongue with aneuploidy DNA profile.
 Lee, C., Famoye, F. and Alzaatreh, A.Y. (2013) Methods for Generating Families of Univariate Continuous Distributions in the Recent Decades. Wiley Interdisciplinary Reviews: Computational Statistics, 5, 219-238.
 Cordeiro, G.M., Ortega, E.M., Popovic, B.V. and Pescim, R.R. (2014) The Lomax Generator of Distributions: Properties, Minification Process and Regression Model. Applied Mathematics and Computation, 247, 465-486.
 Tahir, M.H., Cordeiro, G.M., Alizadeh, M., Mansoor, M., Zubair, M. and Hamedani, G.G. (2015) The Odd Generalized Exponential Family of Distributions with Applications. Journal of Statistical Distributions and Applications, 2, Article No 1.
 Gomes-Silva, F.S., Percontini, A., de Brito, E., Ramos, M.W., Venancio, R. and Cordeiro, G.M. (2017) The Odd Lindley-G Family of Distributions. Austrian Journal of Statistics, 46, 65-87.
 Alizadeh, M., Cordeiro, G.M., Pinho, L.G.B. and Ghosh, I. (2017) The Gompertz-G Family of Distributions. Journal of Statistical Theory and Practice, 11, 179-207.
 Ibrahim, S., Doguwa, S.I., Audu, I. and Muhammad, J.H. (2020) On the Topp Leone exponentiated-G Family of Distributions: Properties and Applications. Asian Journal of Probability and Statistics, 7, 1-15.
 Sanusi, A., Doguwa, S., Audu, I. and Baraya, Y. (2020) Burr X Exponential-G Family of Distributions: Properties and Application. Asian Journal of Probability and Statistics, 7, 58-75.
 Falgore, J.Y. and Doguwa, S.I. (2020) The Inverse Lomax-G Family with Application to Breaking Strength Data. Asian Journal of Probability and Statistics, 8, 49-60.
 Falgore, J.Y., Aliyu, Y., Umar, A.A. and Abdullahi, U.K. (2018) Odd Generalized Exponential-Inverse Lomax Distribution: Properties and Application. Journal of the Nigerian Association of Mathematical Physics, 47, 147-156.
 Shao, Q. (2000) Estimation for Hazardous Concentrations Based on NOEC Toxicity Data: An Alternative Approach. Environmetrics, 11, 583-595.
 Cooray, K. and Ananda, M.M. (2008) A Generalization of the Half-Normal Distribution with Applications to Lifetime Data. Communications in Statistics Theory and Methods, 37, 1323-1337.
 Ibrahim, S., Doguwa, S.I., Isah, A. and Jibril, H.M. (2020) On the Flexibility of Topp Leone Exponentiated Inverse Exponential Distribution. International Journal of Data Science and Analytics, 6, 83-89.
 Gui, W. (2013) Marshall-Olkin Extended Log-Logistic Distribution and Its Application in Minification Processes. Applied Mathematical Sciences, 7, 3947-3961.
 Hamedani, G. (2013) The Zografos-Balakrishnan Log-Logistic Distribution: Properties and Applications. Journal of Statistical Theory and Applications, 12, 225-244.