Inference for Generalized Inverted Exponential Distribution Under Progressive Type-I Interval Censored Data

This article discusses the estimation of the parameters of a generalized inverted exponential distribution from progressive type-I interval censored data. In addition to conventional maximum likelihood estimation, the mid-point method, the probability plot method and the method of moments are suggested for parameter estimation. To obtain the maximum likelihood estimates, we utilize the Newton-Raphson, expectation-maximization and stochastic expectation-maximization methods. Furthermore, approximate confidence intervals for the parameters are obtained via the inverse of the observed information matrix, and the percentile bootstrap technique is also used to calculate confidence intervals. Monte Carlo simulations are used to compare the proposed estimators numerically. Finally, the survival times of guinea pigs inoculated with different doses of tubercle bacilli are analyzed to illustrate the applicability of the suggested methods.


Introduction
The generalized inverted exponential distribution (GIED) is a generalization of the inverted exponential distribution (IED). The IED is the distribution of the reciprocal of an exponential random variable. Specifically, if $X$ is a random variable that follows the exponential distribution with rate $\lambda$, then $Y = 1/X$ follows an IED with c.d.f. and p.d.f. given by

$$F(y) = e^{-\lambda/y}, \qquad f(y) = \frac{\lambda}{y^{2}}\, e^{-\lambda/y}, \qquad y > 0,\ \lambda > 0,$$

respectively. The IED has been investigated by many authors, for example [1], [2]. The IED can be generalized by including a shape parameter, which yields the GIED. A random variable $X$ following the GIED with shape parameter $\alpha$ and scale parameter $\lambda$ has the c.d.f. and p.d.f.

$$F(x) = 1 - \left(1 - e^{-\lambda/x}\right)^{\alpha}, \qquad x > 0,\ \alpha > 0,\ \lambda > 0, \tag{1}$$

$$f(x) = \frac{\alpha\lambda}{x^{2}}\, e^{-\lambda/x} \left(1 - e^{-\lambda/x}\right)^{\alpha - 1}, \qquad x > 0,\ \alpha > 0,\ \lambda > 0, \tag{2}$$

respectively. The hazard function of the GIED,

$$h(x) = \frac{f(x)}{1 - F(x)} = \frac{\alpha\lambda}{x^{2}\left(e^{\lambda/x} - 1\right)},$$

can be increasing or decreasing depending on the shape parameter. In many situations this distribution provides a better fit than the Weibull, gamma and generalized exponential distributions; for more details, see [3]. The GIED has been used in applications such as sea currents, horse racing and wind speeds. For more properties and applications of the GIED, one can refer to [4]-[8].
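As a small illustration of the GIED definitions, the following sketch evaluates the c.d.f. $1 - (1 - e^{-\lambda/x})^{\alpha}$, the p.d.f., the hazard function and the quantile function in plain Python. The function names are hypothetical and chosen only for this example; the quantile function is obtained by inverting the c.d.f. and is not stated in the text.

```python
import math

def gied_cdf(x, alpha, lam):
    """GIED c.d.f.: F(x) = 1 - (1 - exp(-lam/x))**alpha, for x > 0."""
    return 1.0 - (1.0 - math.exp(-lam / x)) ** alpha

def gied_pdf(x, alpha, lam):
    """GIED p.d.f.: f(x) = alpha*lam/x^2 * exp(-lam/x) * (1 - exp(-lam/x))**(alpha-1)."""
    u = math.exp(-lam / x)
    return alpha * lam / x**2 * u * (1.0 - u) ** (alpha - 1.0)

def gied_hazard(x, alpha, lam):
    """Hazard h(x) = f(x) / (1 - F(x)) = alpha*lam / (x^2 * (exp(lam/x) - 1))."""
    return alpha * lam / (x**2 * math.expm1(lam / x))

def gied_quantile(p, alpha, lam):
    """Inverse c.d.f. (derived by solving F(x) = p for x), for 0 < p < 1."""
    return -lam / math.log(1.0 - (1.0 - p) ** (1.0 / alpha))
```

The quantile function is what makes inverse-transform sampling from the GIED straightforward: `gied_quantile(u, alpha, lam)` with `u` uniform on (0, 1) is a GIED draw.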
In life and reliability testing studies, type-I and type-II censoring schemes are the most common. However, in some of these studies it is important that a particular fraction of individuals can be removed from the experiment at each of several ordered failure times [9]. Type-I and type-II schemes do not permit the removal of units at points other than the final point of the experiment. Aggarwala [10] proposed the progressive type-I interval censoring scheme, which can be described as follows. Assume that $n$ units are put on test at time $t_0 = 0$ and each unit is followed until it fails or is censored. Units can be observed only at the preset times $t_1 < t_2 < \cdots < t_m$, where $t_m$ is the pre-specified time at which the experiment ends, so that the time axis is divided into the intervals $I_i = [t_{i-1}, t_i)$, $i = 1, \dots, m$. Let $d_i$ be the number of units that fail in $I_i$ and $r_i$ be the number of units removed from the experiment at time $t_i$. In particular, $n$ units are put on test at time $t_0$ and $d_1$ failures are observed at time $t_1$; at this time, $r_1$ surviving units are removed from the experiment, leaving $n - d_1 - r_1$ items on test. At time $t_2$, when a further $d_2$ items have failed, $r_2$ surviving items are removed, leaving $n - d_1 - r_1 - d_2 - r_2$ items on test, and so on. The experiment terminates after $m$ such inspections.
Finally, at time $t_m$, the number of surviving items removed is $r_m$. Note that $n = \sum_{i=1}^{m} (d_i + r_i)$. Figure 1 shows a progressive type-I interval censoring scheme.

Figure 1: Progressive type I Interval Censored Scheme
Hence, our observations consist of $D = \{(d_i, r_i, t_i);\ i = 1, \dots, m\}$. The numbers of removed items $r_1, \dots, r_m$ may be pre-specified nonnegative integers. Alternatively, they may be determined by pre-specified percentages of the surviving units, as follows. Let $p = (p_1, p_2, \dots, p_m)$ be pre-specified percentages with $p_m = 1$. At time $t_i$, $r_i = \lfloor p_i \times (n - \sum_{j=1}^{i} d_j - \sum_{j=1}^{i-1} r_j) \rfloor$ of the remaining surviving units are removed from the experiment, where $\lfloor x \rfloor$ denotes the largest integer smaller than or equal to $x$.
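The scheme with removal percentages can be sketched in a few lines of Python. The sketch assumes the usual sequential construction: given the units still on test at the start of an interval, the number of failures in that interval is binomial with the conditional failure probability, and the number removed is the floor of the removal percentage times the survivors. The function names, the parameter values in the usage line and the percentages are hypothetical.

```python
import math
import random

def gied_cdf(t, alpha, lam):
    """GIED c.d.f., with F(0) = 0."""
    return 0.0 if t <= 0 else 1.0 - (1.0 - math.exp(-lam / t)) ** alpha

def simulate_pic(n, times, p, alpha, lam, rng):
    """Simulate failure counts d and removal counts r at the inspection times."""
    at_risk, prev_F = n, 0.0
    d, r = [], []
    for t_i, p_i in zip(times, p):
        F_i = gied_cdf(t_i, alpha, lam)
        # conditional probability of failing in the interval, given survival so far
        q_i = (F_i - prev_F) / (1.0 - prev_F)
        d_i = sum(rng.random() < q_i for _ in range(at_risk))  # binomial draw
        r_i = math.floor(p_i * (at_risk - d_i))                # survivors removed
        d.append(d_i)
        r.append(r_i)
        at_risk -= d_i + r_i
        prev_F = F_i
    return d, r

# hypothetical settings: the last percentage is 1, so every survivor leaves at t_m
d, r = simulate_pic(50, [0.5, 1.0, 1.5, 2.0, 2.5], [0.25, 0.25, 0.5, 0.5, 1.0],
                    alpha=1.5, lam=1.0, rng=random.Random(1))
```

Because the final percentage equals 1, the counts always satisfy `sum(d) + sum(r) == n`.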
In this paper, we utilize different estimation procedures for estimating the parameters of the GIED based on progressive type-I interval censored data. The remainder of the paper is arranged as follows. In Section 2, we obtain the maximum likelihood estimators (MLEs) of the unknown parameters $\alpha$ and $\lambda$. The standard errors of the MLEs and approximate 95% confidence intervals for the parameters are computed using the inverse of the observed information matrix. The computation of the MLEs using the EM and stochastic EM algorithms is also investigated, and the nonparametric percentile bootstrap technique is utilized to construct 95% confidence intervals for the unknown parameters. The midpoint approximation method, the probability plot method and the method of moments are studied in Sections 3, 4 and 5, respectively. A Monte Carlo simulation study is presented in Section 6; it compares all the estimation procedures in terms of their biases, estimated standard errors, sample standard errors, mean square errors, lengths of 95% confidence intervals and empirical 95% coverage probabilities. An analysis of a real data set is presented in Section 7. Finally, a conclusion is given in Section 8.

The Maximum likelihood estimation
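Based on the observed progressive type-I interval censored sample $D = \{(d_i, r_i, t_i);\ i = 1, \dots, m\}$, the likelihood function of $\alpha$ and $\lambda$ is written as

$$L(\alpha, \lambda) \propto \prod_{i=1}^{m} \left[F(t_i) - F(t_{i-1})\right]^{d_i} \left[1 - F(t_i)\right]^{r_i}, \tag{3}$$

with corresponding log-likelihood function, up to an additive constant,

$$\ell(\alpha, \lambda) = \sum_{i=1}^{m} d_i \log\left[F(t_i) - F(t_{i-1})\right] + \sum_{i=1}^{m} r_i \log\left[1 - F(t_i)\right]. \tag{4}$$

Under the GIED in Eq. (1), the log-likelihood (4) is expressed as

$$\ell(\alpha, \lambda) = \sum_{i=1}^{m} d_i \log\left[\left(1 - e^{-\lambda/t_{i-1}}\right)^{\alpha} - \left(1 - e^{-\lambda/t_i}\right)^{\alpha}\right] + \alpha \sum_{i=1}^{m} r_i \log\left(1 - e^{-\lambda/t_i}\right),$$

where $e^{-\lambda/t_0}$ is taken to be $0$ for $t_0 = 0$. The first-order and second-order partial derivatives of the terms of this expression with respect to $\alpha$ and $\lambda$ are obtained by direct differentiation. Hence, the first-order and second-order partial derivatives of the log-likelihood, denoted $\ell_{\alpha}$, $\ell_{\lambda}$, $\ell_{\alpha\alpha}$, $\ell_{\alpha\lambda}$ and $\ell_{\lambda\lambda}$, are given in Eq. (15), Eq. (16), Eq. (17), Eq. (18) and Eq. (19), respectively.

To calculate the MLEs $\hat{\alpha}$ and $\hat{\lambda}$ of the unknown parameters $\alpha$ and $\lambda$, we need to solve the equations $\ell_{\alpha} = 0$ and $\ell_{\lambda} = 0$, where $\ell_{\alpha}$ and $\ell_{\lambda}$ are given in Eq. (15) and Eq. (16). There is no closed form for the MLEs. Hence, to obtain the MLEs of $\alpha$ and $\lambda$, we use a simple numerical procedure, namely the Newton-Raphson method. The iterative equation is given by

$$\begin{pmatrix} \alpha^{(k+1)} \\ \lambda^{(k+1)} \end{pmatrix} = \begin{pmatrix} \alpha^{(k)} \\ \lambda^{(k)} \end{pmatrix} - \begin{pmatrix} \ell_{\alpha\alpha} & \ell_{\alpha\lambda} \\ \ell_{\alpha\lambda} & \ell_{\lambda\lambda} \end{pmatrix}^{-1} \begin{pmatrix} \ell_{\alpha} \\ \ell_{\lambda} \end{pmatrix} \Bigg|_{\alpha = \alpha^{(k)},\ \lambda = \lambda^{(k)}},$$

or equivalently

$$\alpha^{(k+1)} = \alpha^{(k)} - \frac{\ell_{\lambda\lambda}\ell_{\alpha} - \ell_{\alpha\lambda}\ell_{\lambda}}{\ell_{\alpha\alpha}\ell_{\lambda\lambda} - \ell_{\alpha\lambda}^{2}}, \qquad \lambda^{(k+1)} = \lambda^{(k)} - \frac{\ell_{\alpha\alpha}\ell_{\lambda} - \ell_{\alpha\lambda}\ell_{\alpha}}{\ell_{\alpha\alpha}\ell_{\lambda\lambda} - \ell_{\alpha\lambda}^{2}},$$

where $\alpha^{(k)}$ and $\lambda^{(k)}$ are the values of $\alpha$ and $\lambda$ at the $k$-th iteration and all derivatives on the right-hand side are evaluated at $(\alpha^{(k)}, \lambda^{(k)})$. The iteration continues until convergence, that is, until $|\alpha^{(k+1)} - \alpha^{(k)}| + |\lambda^{(k+1)} - \lambda^{(k)}| < \epsilon$ for some pre-specified $\epsilon > 0$.

The standard errors of the MLEs can be computed from the inverse of the observed information matrix. The estimated standard errors of $\hat{\alpha}$ and $\hat{\lambda}$ are the square roots of the diagonal elements of the inverse of the observed information matrix

$$I(\hat{\alpha}, \hat{\lambda}) = - \begin{pmatrix} \ell_{\alpha\alpha} & \ell_{\alpha\lambda} \\ \ell_{\alpha\lambda} & \ell_{\lambda\lambda} \end{pmatrix} \Bigg|_{\alpha = \hat{\alpha},\ \lambda = \hat{\lambda}},$$

and the approximate $100(1-\gamma)\%$ confidence intervals for $\alpha$ and $\lambda$ are $\hat{\alpha} \pm z_{\gamma/2}\,\widehat{\mathrm{se}}(\hat{\alpha})$ and $\hat{\lambda} \pm z_{\gamma/2}\,\widehat{\mathrm{se}}(\hat{\lambda})$, where $z_{\gamma/2}$ is the upper $(\gamma/2)$-th percentile of the standard normal distribution.

Next, we calculate the 95% confidence intervals for $\alpha$ and $\lambda$ using the nonparametric percentile bootstrap (Boot-p) method. Bootstrap methods are widely used to obtain confidence intervals for parameters. In [11], the authors suggested the Boot-p method, which is used to construct confidence intervals for the parameters as well as for the hazard and reliability functions. To construct the Boot-p confidence interval, bootstrap samples are drawn from the observed data, the estimates are recomputed on each bootstrap sample, and the lower and upper $\gamma/2$ empirical percentiles of the bootstrap estimates of $\alpha$ and $\lambda$ give the limits of the respective intervals.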
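A minimal sketch of the Newton-Raphson iteration for the interval-censored log-likelihood, $\sum_i d_i \log[F(t_i) - F(t_{i-1})] + \sum_i r_i \log[1 - F(t_i)]$, is given next. Two departures from the text are assumptions of this sketch: the score and Hessian are approximated by central finite differences instead of the analytic derivatives, and a step-halving safeguard (not mentioned in the text) keeps the iterates positive and the log-likelihood non-decreasing. All function names are hypothetical.

```python
import math

def loglik(alpha, lam, times, d, r):
    """Interval-censored GIED log-likelihood, up to an additive constant."""
    if alpha <= 0 or lam <= 0:
        return float("-inf")
    total, prev = 0.0, 1.0                 # prev = 1 - F(t_{i-1}), with F(0) = 0
    for t, d_i, r_i in zip(times, d, r):
        surv = (1.0 - math.exp(-lam / t)) ** alpha   # 1 - F(t_i)
        if surv <= 0.0 or prev - surv <= 0.0:
            return float("-inf")
        total += d_i * math.log(prev - surv) + r_i * math.log(surv)
        prev = surv
    return total

def newton_raphson(times, d, r, alpha, lam, tol=1e-6, max_iter=100, h=1e-4):
    f = lambda a, l: loglik(a, l, times, d, r)
    for _ in range(max_iter):
        f0 = f(alpha, lam)
        # finite-difference score vector (assumption: replaces analytic derivatives)
        ga = (f(alpha + h, lam) - f(alpha - h, lam)) / (2 * h)
        gl = (f(alpha, lam + h) - f(alpha, lam - h)) / (2 * h)
        # finite-difference Hessian entries
        haa = (f(alpha + h, lam) - 2 * f0 + f(alpha - h, lam)) / h**2
        hll = (f(alpha, lam + h) - 2 * f0 + f(alpha, lam - h)) / h**2
        hal = (f(alpha + h, lam + h) - f(alpha + h, lam - h)
               - f(alpha - h, lam + h) + f(alpha - h, lam - h)) / (4 * h**2)
        det = haa * hll - hal**2
        da = (hll * ga - hal * gl) / det   # Newton step = H^{-1} * score
        dl = (haa * gl - hal * ga) / det
        step = 1.0
        while step > 1e-8:                 # step-halving safeguard (assumption)
            a_new, l_new = alpha - step * da, lam - step * dl
            if min(a_new, l_new) > h and f(a_new, l_new) >= f0:
                break
            step /= 2.0
        moved = abs(a_new - alpha) + abs(l_new - lam)
        alpha, lam = a_new, l_new
        if moved < tol:
            break
    return alpha, lam
```

The stopping rule is the one stated in the text, the sum of absolute parameter changes falling below a tolerance. The negative of the finite-difference Hessian at the final iterate could serve as a numerical stand-in for the observed information matrix.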

The EM Algorithm
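Utilizing the Newton-Raphson method to compute the MLEs requires the second derivatives of the associated log-likelihood. In this subsection, we propose the EM algorithm to avoid such computations when obtaining the MLEs of $\alpha$ and $\lambda$. The EM algorithm suggested by [12] is a very powerful technique for parameter estimation under incomplete or missing data. It consists of two main steps: the expectation step (E-step) and the maximization step (M-step). In the E-step, we calculate the conditional expectation of the complete log-likelihood function given the observed values, and in the M-step, we maximize the resulting function with respect to the unknown parameters. Now, we define $\tau_{ij}$, $j = 1, \dots, d_i$, to represent the survival times within the subinterval $I_i = [t_{i-1}, t_i)$, and $\tau^{*}_{ij}$, $j = 1, \dots, r_i$, to represent the survival times of the items withdrawn at $t_i$, where $i = 1, \dots, m$. Using $\tau = (\tau_{11}, \dots, \tau_{m d_m})$ and $\tau^{*} = (\tau^{*}_{11}, \dots, \tau^{*}_{m r_m})$, the complete log-likelihood function follows from Eq. (2) as

$$\ell_c(\alpha, \lambda) = n \log(\alpha\lambda) - 2 \sum_{i=1}^{m} \Big[ \sum_{j=1}^{d_i} \log \tau_{ij} + \sum_{j=1}^{r_i} \log \tau^{*}_{ij} \Big] - \lambda \sum_{i=1}^{m} \Big[ \sum_{j=1}^{d_i} \frac{1}{\tau_{ij}} + \sum_{j=1}^{r_i} \frac{1}{\tau^{*}_{ij}} \Big] + (\alpha - 1) \sum_{i=1}^{m} \Big[ \sum_{j=1}^{d_i} \log\big(1 - e^{-\lambda/\tau_{ij}}\big) + \sum_{j=1}^{r_i} \log\big(1 - e^{-\lambda/\tau^{*}_{ij}}\big) \Big].$$

For $i = 1, 2, \dots, m$, the E-step requires the conditional expectations of the terms of $\ell_c$ given $\tau_{ij} \in [t_{i-1}, t_i)$ and given $\tau^{*}_{ij} \in [t_i, \infty)$. The EM algorithm then works as follows. Let $\alpha^{(0)}$ and $\lambda^{(0)}$ be the initial values of $\alpha$ and $\lambda$, respectively; the E-step and the M-step are then alternated until the estimates converge.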
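To make the M-step concrete, the following is a sketch of the update equations implied by the complete-data GIED log-likelihood, $n\log(\alpha\lambda) - 2\sum \log y - \lambda \sum y^{-1} + (\alpha - 1)\sum \log(1 - e^{-\lambda/y})$, where the sums run over all $n$ complete survival times $y$. The symbols $E_{1i}, E_{2i}, E_{3i}$ and their starred versions are notation introduced only for this sketch, and treating the $\lambda$ equation as a one-step fixed-point update at the current parameter values is an assumption, not necessarily the authors' scheme.

```latex
% Conditional expectations at the current values (\alpha^{(k)}, \lambda^{(k)});
% unstarred: given t_{i-1} \le Y < t_i; starred: given Y \ge t_i.
\begin{align*}
E_{1i} &= E\big[\log(1 - e^{-\lambda/Y}) \mid t_{i-1} \le Y < t_i\big], &
E_{2i} &= E\big[Y^{-1} \mid t_{i-1} \le Y < t_i\big], \\
E_{3i} &= E\Big[\frac{1}{Y\,(e^{\lambda/Y} - 1)} \,\Big|\, t_{i-1} \le Y < t_i\Big], &
E^{*}_{ki} &= \text{the same expectations given } Y \ge t_i .
\end{align*}
% Setting the partial derivatives of the expected complete log-likelihood to zero:
\begin{align*}
\frac{\partial}{\partial \alpha}:\quad
\frac{n}{\alpha} + \sum_{i=1}^{m} \big(d_i E_{1i} + r_i E^{*}_{1i}\big) = 0
\;&\Longrightarrow\;
\alpha^{(k+1)} = \frac{-\,n}{\sum_{i=1}^{m} \big(d_i E_{1i} + r_i E^{*}_{1i}\big)}, \\
\frac{\partial}{\partial \lambda}:\quad
\frac{n}{\lambda} - \sum_{i=1}^{m} \big(d_i E_{2i} + r_i E^{*}_{2i}\big)
+ (\alpha - 1) \sum_{i=1}^{m} \big(d_i E_{3i} + r_i E^{*}_{3i}\big) = 0
\;&\Longrightarrow\;
\lambda^{(k+1)} = \frac{n}{\sum_{i=1}^{m} \big(d_i E_{2i} + r_i E^{*}_{2i}\big)
- (\alpha^{(k+1)} - 1) \sum_{i=1}^{m} \big(d_i E_{3i} + r_i E^{*}_{3i}\big)} .
\end{align*}
```

Since $\log(1 - e^{-\lambda/Y}) < 0$, the update for $\alpha$ is always positive. No second derivatives appear, which is the advantage over Newton-Raphson noted in this subsection.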

The Stochastic EM Algorithm
The stochastic EM (SEM) algorithm is an alternative to the EM algorithm in which the expectation in the E-step is computed by Monte Carlo simulation. It is useful when the E-step is hard to calculate exactly. Approximating the E-step of the EM algorithm by the Monte Carlo technique was first proposed by [13]. As mentioned by [14], such approximations are more time-consuming. Later, in [15], the authors modified this idea by replacing the E-step with a stochastic step carried out by simulation. For more details about SEM, see for example [16], [17], [18]. The main idea of the SEM method can be described as follows. The conditional survival function of $Y$ given $a < Y \le b$ can be written as

$$S(y \mid a < Y \le b) = \frac{F(b) - F(y)}{F(b) - F(a)}, \qquad a < y \le b. \tag{32}$$

Now, we state the procedure for simulating a random variate from the GIED restricted to the interval $(a, b]$. Let $U \sim U(0, 1)$. Setting Eq. (32) equal to $u$ and solving for $y$, we obtain

$$y = -\lambda \Big/ \log\Big(1 - \big[u\,(1 - e^{-\lambda/a})^{\alpha} + (1 - u)\,(1 - e^{-\lambda/b})^{\alpha}\big]^{1/\alpha}\Big).$$

When $b$ approaches $\infty$, the above expression reduces to

$$y = -\lambda \Big/ \log\Big(1 - u^{1/\alpha}\,(1 - e^{-\lambda/a})\Big).$$

We first generate $d_i$ independent samples $\tau_{ij}$, $i = 1, 2, \dots, m$; $j = 1, \dots, d_i$, from the conditional survival function in Eq. (32) with $a$ and $b$ replaced by $t_{i-1}$ and $t_i$, respectively. Next, we generate $r_i$ samples $\tau^{*}_{ij}$, $i = 1, 2, \dots, m$; $j = 1, \dots, r_i$, from the conditional survival function in Eq. (32) with $a$ replaced by $t_i$ and $b \to \infty$. Using these simulated samples, the conditional expectations in Eq. (30) and Eq. (31) are replaced by the corresponding functions of the simulated observations, which gives Eq. (35) and Eq. (36). Therefore, the SEM algorithm works as follows. Let $\alpha^{(0)}$ and $\lambda^{(0)}$ be the initial values of $\alpha$ and $\lambda$, respectively.
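The simulation step of SEM can be sketched as inverse-transform sampling from the GIED truncated to an interval. The sketch below assumes the inversion is written in terms of the survival function $S(t) = (1 - e^{-\lambda/t})^{\alpha}$, so that a uniform value $u$ is mapped to the point $y$ with $S(y) = u\,S(a) + (1 - u)\,S(b)$. Function names and the parameter values in the usage lines are hypothetical.

```python
import math
import random

def gied_surv(t, alpha, lam):
    """GIED survival function S(t) = (1 - exp(-lam/t))**alpha, with S(0) = 1, S(inf) = 0."""
    if t <= 0:
        return 1.0
    if math.isinf(t):
        return 0.0
    return (1.0 - math.exp(-lam / t)) ** alpha

def truncated_quantile(u, a, b, alpha, lam):
    """Point y in (a, b] whose conditional survival probability equals u."""
    s = u * gied_surv(a, alpha, lam) + (1.0 - u) * gied_surv(b, alpha, lam)
    return -lam / math.log(1.0 - s ** (1.0 / alpha))

def sample_truncated_gied(a, b, alpha, lam, rng):
    """One draw from the GIED conditioned on a < Y <= b (b may be math.inf)."""
    u = 1.0 - rng.random()          # uniform on (0, 1], avoids u == 0
    return truncated_quantile(u, a, b, alpha, lam)

rng = random.Random(0)
# failure times imputed inside an interval (t_{i-1}, t_i] = (40, 90]
inside = [sample_truncated_gied(40.0, 90.0, 1.5, 60.0, rng) for _ in range(2000)]
# survival times imputed for units withdrawn at t_i = 90
beyond = [sample_truncated_gied(90.0, math.inf, 1.5, 60.0, rng) for _ in range(2000)]
```

In an SEM iteration, draws like `inside` and `beyond` would be pooled into a pseudo-complete sample on which the complete-data estimating equations are evaluated.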

The Midpoint Approximation Method
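In this subsection, we estimate the unknown parameters of the GIED using the midpoint approximation method. The main idea of this method is to approximate the progressive type-I interval censored data by type-I censored data. We assume that the $d_i$ failures in the $i$-th interval $(t_{i-1}, t_i]$ occur at its center $m_i = (t_{i-1} + t_i)/2$ and that $r_i$ units are censored at the inspection time $t_i$, $i = 1, 2, \dots, m$. The log-likelihood function of $\alpha$ and $\lambda$ based on this type of observations is written, up to an additive constant, as

$$\ell^{M}(\alpha, \lambda) = \sum_{i=1}^{m} d_i \Big[ \log(\alpha\lambda) - 2 \log m_i - \frac{\lambda}{m_i} + (\alpha - 1) \log\big(1 - e^{-\lambda/m_i}\big) \Big] + \alpha \sum_{i=1}^{m} r_i \log\big(1 - e^{-\lambda/t_i}\big). \tag{37}$$

To obtain the midpoint estimates of the unknown parameters, we need to solve the following system of equations:

$$\frac{1}{\alpha} \sum_{i=1}^{m} d_i + \sum_{i=1}^{m} d_i \log\big(1 - e^{-\lambda/m_i}\big) + \sum_{i=1}^{m} r_i \log\big(1 - e^{-\lambda/t_i}\big) = 0, \tag{38}$$

$$\frac{1}{\lambda} \sum_{i=1}^{m} d_i - \sum_{i=1}^{m} \frac{d_i}{m_i} + (\alpha - 1) \sum_{i=1}^{m} \frac{d_i}{m_i \big(e^{\lambda/m_i} - 1\big)} + \alpha \sum_{i=1}^{m} \frac{r_i}{t_i \big(e^{\lambda/t_i} - 1\big)} = 0. \tag{39}$$

The likelihood equations Eq. (38) and Eq. (39) cannot be solved analytically because of their nonlinear nature. Therefore, we adopt a numerical method, namely the Newton-Raphson method, to obtain the estimates of $\alpha$ and $\lambda$.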
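The midpoint approach can be sketched as follows. Setting the $\alpha$-derivative of the midpoint log-likelihood to zero gives $\alpha$ in closed form for any fixed $\lambda$, so this sketch profiles out $\alpha$ and scans a grid of $\lambda$ values. The grid search is a simplification swapped in for the Newton-Raphson method named in the text, and the function names are hypothetical. The counts in the usage lines were obtained by counting the 72 guinea pig survival times of the application section between the inspection times 40, 90, 150, 190 and 220 under the assumption of no removal before the last inspection; they are not copied from the paper's tables.

```python
import math

def midpoint_loglik(alpha, lam, times, d, r):
    """Log-likelihood with failures placed at interval midpoints (constant dropped)."""
    total, prev_t = 0.0, 0.0
    for t, d_i, r_i in zip(times, d, r):
        mid = 0.5 * (prev_t + t)
        total += d_i * (math.log(alpha * lam) - 2.0 * math.log(mid) - lam / mid
                        + (alpha - 1.0) * math.log1p(-math.exp(-lam / mid)))
        total += r_i * alpha * math.log1p(-math.exp(-lam / t))   # censored at t_i
        prev_t = t
    return total

def alpha_given_lambda(lam, times, d, r):
    """Closed-form root of the alpha-score for a fixed lambda."""
    denom, prev_t = 0.0, 0.0
    for t, d_i, r_i in zip(times, d, r):
        mid = 0.5 * (prev_t + t)
        denom += d_i * math.log1p(-math.exp(-lam / mid))
        denom += r_i * math.log1p(-math.exp(-lam / t))
        prev_t = t
    return -sum(d) / denom

def midpoint_estimate(times, d, r, grid):
    """Profile-likelihood grid search over lambda (assumption: replaces Newton-Raphson)."""
    profile = lambda l: midpoint_loglik(alpha_given_lambda(l, times, d, r), l, times, d, r)
    lam = max(grid, key=profile)
    return alpha_given_lambda(lam, times, d, r), lam

times = [40, 90, 150, 190, 220]
d = [11, 36, 14, 2, 1]      # failures per interval (counted from the data)
r = [0, 0, 0, 0, 8]         # all survivors removed at the last inspection
grid = [0.5 * j for j in range(2, 1001)]          # lambda from 1.0 to 500.0
alpha_hat, lam_hat = midpoint_estimate(times, d, r, grid)
```

The resulting pair is only as accurate as the grid spacing; it can serve as a starting value for a Newton-Raphson refinement.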
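
The Probability Plot Method
Estimating the parameters using the probability plot method is performed by finding the values of $\alpha$ and $\lambda$ that minimize the probability plot objective function. The minimizers solve the system of equations obtained by setting the partial derivatives of this function with respect to $\alpha$ and $\lambda$ to zero. These estimates are computed numerically by a nonlinear optimization technique.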

Method of moments estimation
The $k$-th population moment of a GIED with the p.d.f. given in Eq. (2) has no explicit form. The negative moments, in contrast, can be expressed through $\psi$, the digamma function, and its derivative $\psi'$; see [19]. The $k$-th negative population moment of a doubly truncated GIED on the interval $[a, b)$, $0 < a < b$, is used to form the moment equations, Eq. (42) and Eq. (43). Since no closed-form solution to Eq. (42) and Eq. (43) is available, we employ an iterative procedure, starting from initial values $\alpha^{(0)}$ and $\lambda^{(0)}$ of $\alpha$ and $\lambda$.
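For orientation, the first two negative moments of the untruncated GIED can be written down in closed form, because the reciprocal of a GIED variable follows the generalized exponential distribution. The display below is a sketch of that standard result and is not a reconstruction of the paper's Eq. (42) and Eq. (43), which involve the truncated moments.

```latex
% If X ~ GIED(\alpha, \lambda), then Y = 1/X has c.d.f.
%   P(Y \le y) = P(X \ge 1/y) = (1 - e^{-\lambda y})^{\alpha},
% i.e. Y follows the generalized exponential distribution. Hence
\begin{align*}
E\big[X^{-1}\big] &= \frac{\psi(\alpha + 1) - \psi(1)}{\lambda}, \\
E\big[X^{-2}\big] &= \frac{\psi'(1) - \psi'(\alpha + 1) + \big[\psi(\alpha + 1) - \psi(1)\big]^{2}}{\lambda^{2}} .
\end{align*}
% Check for \alpha = 1 (IED, so 1/X is exponential with rate \lambda):
%   \psi(2) - \psi(1) = 1 and \psi'(1) - \psi'(2) = 1,
% giving E[X^{-1}] = 1/\lambda and E[X^{-2}] = 2/\lambda^{2}, as expected.
```

Equating these two expressions to the corresponding sample moments gives two equations in $\alpha$ and $\lambda$, which illustrates why an iterative solution is needed.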
The above schemes are chosen to specify the percentages of surviving units to be withdrawn at the censoring and monitoring points. In Scheme 1, the removal in the first two intervals is lighter than in the last two intervals, and Scheme 2 is the reverse of Scheme 1. In Scheme 3, no removal takes place prior to termination, which corresponds to conventional type-I interval censoring. In Scheme 4, removal is conducted only at the left-most and right-most ends.
Here $\lfloor x \rfloor$ denotes the largest integer not greater than $x$. Step (vii): If $i < m$, replace $i$ by $i + 1$ and go to Step (v). Otherwise, stop.
For the bootstrap confidence intervals, the size of the bootstrap samples is taken to be 5000.
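The percentile bootstrap interval itself is simple to compute once the bootstrap estimates are available. The following sketch shows one way to extract the interval limits and a generic resampling loop. The index convention for the empirical percentiles, the function names and the use of the sample mean as a stand-in estimator are assumptions made for illustration; for interval-censored data the resampled items would be the per-unit outcomes (interval of failure or time of removal).

```python
import math
import random

def percentile_interval(replicates, gamma=0.05):
    """Boot-p limits: lower and upper gamma/2 empirical percentiles of the replicates."""
    s = sorted(replicates)
    B = len(s)
    lo = s[int(math.floor(gamma / 2 * B))]
    hi = s[int(math.ceil((1 - gamma / 2) * B)) - 1]
    return lo, hi

def bootstrap_replicates(sample, estimator, B, rng):
    """Resample the data with replacement B times and re-estimate each time."""
    n = len(sample)
    return [estimator([sample[rng.randrange(n)] for _ in range(n)]) for _ in range(B)]

# stand-in estimator (hypothetical): the sample mean of a toy data set
mean = lambda xs: sum(xs) / len(xs)
reps = bootstrap_replicates([1, 2, 3, 4, 5], mean, 200, random.Random(0))
lower, upper = percentile_interval(reps)
```

With 5000 replicates, as in the simulation study, the same function returns the 126th smallest and 4875th smallest bootstrap estimates as the 95% limits under this index convention.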
At each iteration, we estimate the parameters using the MLE via Newton-Raphson (NR), EM and SEM, and using the probability plot (PP), mid-point (MP) and method of moments (MM) methods. For each of these methods, we have computed the absolute average bias (Bias), the root mean square error (RMSE), the sample standard deviation (SSE) and the estimated standard deviation (ESE) via the observed information matrix. Moreover, we have evaluated the widths (Len) of the 95% Wald confidence intervals based on the observed information matrix (CI) and of the 95% Boot-p (BT) confidence intervals, together with their empirical coverage probabilities (CP). The estimation process is repeated 1000 times and the results are reported in Tables 1-7. From Tables 1-6, it is observed that the Bias of every estimator is, in general, reasonably small, which indicates that the estimated values are close to the true parameter values. However, the MP method, as expected, yields more biased estimates than the other methods. In addition, the SEM algorithm performs worse than NR and EM in this respect. The RMSE of MP is clearly higher than that of the other methods. Moreover, the values of SSE and ESE of the NR and EM methods are close, especially for large $n$. This indicates that the ESE based on the inverse of the observed information matrix is a reasonable estimate of the SSE. As expected, the Bias, RMSE, SSE and ESE of all estimators decrease as the sample size increases in every case. With respect to the 95% confidence intervals, Table 7 shows that the length of the confidence intervals decreases as the sample size increases. Moreover, the empirical coverage probabilities (CP) of the 95% confidence intervals are very close to the nominal level in every case. Consequently, the performance of all proposed methods except the MP method is satisfactory in terms of the biases and standard errors of the estimates.
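The summary measures reported in the simulation tables can be computed from the replicated estimates as in the short sketch below. It assumes that Bias is the absolute value of the average estimate minus the true value and that SSE uses the divisor $k - 1$ for $k$ replications; both conventions, and the function name, are assumptions of this example.

```python
import math

def summarize(estimates, truth):
    """Bias, RMSE and sample standard deviation (SSE) of replicated estimates."""
    k = len(estimates)
    mean = sum(estimates) / k
    bias = abs(mean - truth)                                    # absolute average bias
    rmse = math.sqrt(sum((e - truth) ** 2 for e in estimates) / k)
    sse = math.sqrt(sum((e - mean) ** 2 for e in estimates) / (k - 1))
    return bias, rmse, sse

bias, rmse, sse = summarize([1.0, 2.0, 3.0], truth=2.0)
```

When the estimator is unbiased, RMSE and SSE differ only through the divisor, which is why comparing SSE with the average ESE is a natural check of the information-matrix standard errors.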

Application
In this section, we analyze a data set as a real-life application of the GIED under progressive type-I interval censored observations. The data set is provided by [21] and represents the survival times (in days) of guinea pigs inoculated with different doses of tubercle bacilli. Guinea pigs are known to have a high susceptibility to human tuberculosis, and for this reason they are used in this study. The regimen number is the common logarithm of the number of bacillary units in ml. of challenge solution; i.e., regimen corresponds to bacillary units per ml. [22]. These data have previously been used to fit the inverse Weibull distribution. For regimen 6.6, there are 72 observations, listed below: 12, 15, 22, 24, 24, 32, 32, 33, 34, 38, 38, 43, 44, 48, 52, 53, 54, 54, 55, 56, 57, 58, 58, 59, 60, 60, 60, 60, 61, 62, 63, 65, 65, 67, 68, 70, 70, 72, 73, 75, 76, 76, 81, 83, 84, 85, 87, 91, 95, 96, 98, 99, 109, 110, 121, 127, 129, 131, 143, 146, 146, 175, 175, 211, 233, 258, 258, 263, 297, 341, 341, 376. First, we check whether the GIED is suitable for the data based on the complete data set. We use three measures of fit: Akaike's information criterion (AIC), the Bayesian information criterion (BIC) and the Kolmogorov-Smirnov (KS) distance. These measures are defined by

$$\mathrm{AIC} = -2\,\ell(\hat{\alpha}, \hat{\lambda}) + 2k, \qquad \mathrm{BIC} = -2\,\ell(\hat{\alpha}, \hat{\lambda}) + k \log n, \qquad \mathrm{KS} = \sup_{t} \big| \hat{F}(t) - F(t; \hat{\alpha}, \hat{\lambda}) \big|,$$

where $\hat{\alpha}$ and $\hat{\lambda}$ are the MLEs of $\alpha$ and $\lambda$, $\ell$ is the log-likelihood function obtained from Eq. (4), $k = 2$ is the number of parameters, $n$ is the sample size, $\hat{F}$ is the empirical c.d.f. and $F$ is the population c.d.f. given in Eq. (1). The values of AIC, BIC and KS for several two-parameter lifetime distributions, namely the GIED, BurrXII, generalized exponential (GExp), Weibull and inverse Weibull (Iweibull), are reported in Table 9. In addition, the population c.d.f. of the GIED, $F(t; \hat{\alpha}, \hat{\lambda})$, and the empirical c.d.f. of the data set, $\hat{F}$, are depicted in Figure 2. Table 9 and Figure 2 show that the GIED fits the data better than the BurrXII, GExp, Weibull and Iweibull distributions.
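The three goodness-of-fit measures can be computed for the complete guinea pig data as sketched below, using AIC $= -2\ell + 2k$, BIC $= -2\ell + k\log n$ and the KS distance between the empirical and fitted c.d.f. For the complete sample the $\alpha$-score has a closed-form root for fixed $\lambda$, so the sketch profiles out $\alpha$ and scans a grid of $\lambda$ values. The grid search only approximates the MLE and is an assumption of this example, as are the function names; no numerical results from the paper's Table 9 are reproduced here.

```python
import math

DATA = [12, 15, 22, 24, 24, 32, 32, 33, 34, 38, 38, 43, 44, 48, 52, 53, 54, 54,
        55, 56, 57, 58, 58, 59, 60, 60, 60, 60, 61, 62, 63, 65, 65, 67, 68, 70,
        70, 72, 73, 75, 76, 76, 81, 83, 84, 85, 87, 91, 95, 96, 98, 99, 109, 110,
        121, 127, 129, 131, 143, 146, 146, 175, 175, 211, 233, 258, 258, 263,
        297, 341, 341, 376]

def profile_alpha(lam, xs):
    """Root of the alpha-score of the complete-sample GIED likelihood for fixed lam."""
    return -len(xs) / sum(math.log1p(-math.exp(-lam / x)) for x in xs)

def loglik(alpha, lam, xs):
    """Complete-sample GIED log-likelihood."""
    return sum(math.log(alpha * lam) - 2.0 * math.log(x) - lam / x
               + (alpha - 1.0) * math.log1p(-math.exp(-lam / x)) for x in xs)

def fit_and_score(xs, grid):
    """Approximate MLE by grid search over lam, then AIC, BIC and KS distance."""
    lam = max(grid, key=lambda l: loglik(profile_alpha(l, xs), l, xs))
    alpha = profile_alpha(lam, xs)
    ll, n, k = loglik(alpha, lam, xs), len(xs), 2
    cdf = lambda x: 1.0 - (1.0 - math.exp(-lam / x)) ** alpha
    # KS: compare the fitted c.d.f. with the empirical c.d.f. just before and at each point
    ks = max(max(abs((i + 1) / n - cdf(x)), abs(i / n - cdf(x)))
             for i, x in enumerate(sorted(xs)))
    return alpha, lam, -2.0 * ll + 2 * k, -2.0 * ll + k * math.log(n), ks

grid = [0.5 * j for j in range(2, 1001)]          # lambda from 1.0 to 500.0
alpha_hat, lam_hat, aic, bic, ks = fit_and_score(DATA, grid)
```

Because all candidate distributions in the comparison have two parameters, ranking them by AIC or BIC is equivalent to ranking them by maximized log-likelihood.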
Next, we estimate $\alpha$ and $\lambda$ of the GIED from the real data set using the proposed methodology. For analyzing the above data set, we take $m = 5$ and inspection times $t = (40, 90, 150, 190, 220)$. In addition, we consider the same censoring schemes presented in the simulation section, namely $p_1$, $p_2$, $p_3$ and $p_4$. According to the censoring schemes, the values of $(d_i, r_i)$ within the intervals $I_i = (t_{i-1}, t_i]$, $i = 1, 2, \dots, m$, with $t_0 = 0$, are reported in Table 8. To choose initial values of the parameters, the contour plot of the log-likelihood function for the real data set is presented in Figure 3. Table 10 presents the estimates and standard errors, while Table 11 presents the confidence intervals of the parameters $\alpha$ and $\lambda$ for the real data set. From the obtained results, one can see that the MLEs computed using the NR and EM methods are very close, except under the censoring scheme $p_2$. A similar conclusion holds for the ESE values. With respect to the length of the confidence intervals, both methods, CI and BT, give almost the same lengths except for the scheme $p_2$.

Concluding remarks
In this article, statistical inference for the unknown parameters of the GIED under progressive type-I interval censored data is considered. The MLEs, probability plot, mid-point and method of moments estimators are obtained, together with the associated standard errors, root mean square errors and confidence intervals. The MLEs are obtained by using the Newton-Raphson method, the expectation-maximization (EM) algorithm and the stochastic expectation-maximization (SEM) algorithm. The simulation results show that all the estimators, except the MP method, present reasonably small biases and RMSEs. Moreover, the ESE based on the inverse of the observed information matrix is a reasonable estimate of the SSE for the NR and EM methods, especially for large $n$. With respect to the 95% confidence intervals, the length of the confidence intervals decreases as the sample size increases, and the estimated CPs of the 95% confidence intervals are very close to the nominal level in every case.
In real data analysis, we analyze the survival times of guinea pigs inoculated with different doses of tubercle bacilli based on the proposed methodology.Fitting the data set with the GIED is first implemented and then the GIED parameters are estimated based on the proposed methods.
We hope that the methodologies proposed in this work will be useful to applied statisticians. It would be interesting to study these methods of estimation based on hybrid censored data. This work is in progress and will be reported later.


Figure 2: Population c.d.f. and empirical c.d.f. of the GIED for the real data set. Solid line: population c.d.f.; dashed line: empirical c.d.f.

Figure 3: Contour plot of the log-likelihood function for the real data set.

$z_{\gamma}$ is the upper $\gamma$-th percentile of the standard normal distribution.
Then the conditional expectation of the complete log-likelihood function is computed. By taking the first partial derivatives of this function with respect to the unknown parameters $\alpha$ and $\lambda$ and equating the resulting expressions to zero, we obtain the M-step equations.

Table 1: Simulation results of the proposed methods of estimation for $n = 25$

Table 2: Simulation results of the proposed methods of estimation for $n = 50$

Table 3: Simulation results of the proposed methods of estimation for $n = 100$

Table 4: Simulation results of the proposed methods of estimation for $n = 25$

Table 5: Simulation results of the proposed methods of estimation for $n = 50$

Table 6: Simulation results of the proposed methods of estimation for $n = 100$

Table 7: Widths of the 95% confidence intervals of $\alpha$ and $\lambda$ and their coverage probabilities.

Table 8: Values of $(r_i, d_i)$ within each interval $I_i$, $i = 1, 2, \dots, m$, for the data set

Table 9: The values of the MLEs, AIC, BIC and KS for the real data set

Table 10: Estimates of $\alpha$ and $\lambda$ for the real data set.

Table 11: 95% Wald confidence intervals and 95% Boot-p confidence intervals of $\alpha$ and $\lambda$ for the real data set.