Bayesian Adaptive Bridge Regression for Ordinal Models with an Application

In this article, we propose a Bayesian adaptive bridge regression for ordinal models. We develop a new hierarchical model for ordinal regression within the Bayesian adaptive bridge framework. We consider a fully Bayesian approach that yields a new algorithm with tractable full conditional posteriors. The results from both the simulation study and the real data application indicate that our method is effective and performs very well compared with other methods. We also observe that the parameter estimates from our proposed method are very close to the true parameter values compared with those of the other methods.


1-Introduction
In statistical learning, there are two important goals: first, identifying the important variables in the model; second, ensuring high prediction accuracy. Determining the important predictors reinforces the prediction performance of the fitted model [1]. The linear regression (LR) model, where $p$ is the number of predictors and $n$ is the number of observations, is written as follows:
$$y = X\beta + \varepsilon,$$
where $y$ is the $n \times 1$ response vector, $\beta = (\beta_1, \dots, \beta_p)^{\top}$ is the vector of coefficients to be estimated, $X$ is the $n \times p$ matrix of independent variables and $\varepsilon$ is a vector of errors with a mean of zero and a variance of one.

ISSN: 0067-2904
Al-Jabri and Al-Hamzawi Iraqi Journal of Science, 2020, Special Issue, pp: 170-178

Mallick and Yi [2] proposed that Bayesian bridge regression (BBR) was better in estimation than each of Lasso, ridge regression (RR) and bridge regression. The BBR estimator can be written as follows:
$$\hat{\beta} = \arg\min_{\beta} \; (y - X\beta)^{\top}(y - X\beta) + \lambda \sum_{j=1}^{p} |\beta_j|^{\alpha}.$$
The above formula contains special cases: RR when ($\alpha = 2$), the Lasso estimator when ($\alpha = 1$) and best subset selection if ($\alpha = 0$) [3,4]. Recent research has shown that regularization approaches perform variable selection (VS) and estimation simultaneously, and that this simultaneous estimation is effective. These methods improve the prediction accuracy of the regression. In the current paper, we propose the Bayesian adaptive bridge regression for ordinal model (BABROM). This method has desirable properties such as sparsity, the oracle property and unbiasedness when ($0 < \alpha < 1$).
Ordinal data are a type of statistical data whose categories are naturally ordered. Such data are found in many fields such as climatology, political economy, economics, social sciences, psychology and medicine [5][6][7][8]. One example of ordinal data is the intelligence level of students (weak, medium, high). The categories weak, medium and high take the ordered values 1, 2 and 3, respectively. These values encode order only: the high level is not a multiple of the medium level. The outcome variable $y$ takes one of $C$ values, where $c = 1, \dots, C$. In this example, the last category equals $C = 3$. The ordinal data model has the general form ( ) ( ) ( ), where F is the link function [9].
One of the main regression problems arises when the number of covariates increases, because the regression model may then contain many unimportant variables. Therefore, we focus on the process of VS to obtain an appropriate model. VS provides an effective tool for selecting the important variables and for estimating the parameters [10].
One of the most popular criteria used for VS is the Akaike information criterion (AIC), which can be written as follows:
$$\mathrm{AIC} = -2 \log L(\hat{\theta}) + 2k,$$
where $L(\hat{\theta})$ is the likelihood function evaluated at the maximum likelihood estimate and $k$ is the number of parameters in the model [11]. Under this criterion, the best model in a set of candidate models is the one with the minimum AIC value [12]. When the sample size is large, however, selection by AIC is inconsistent [13]. Therefore, we use the Bayesian information criterion (BIC), which can be written as follows:
$$\mathrm{BIC} = -2 \log L(\hat{\theta}) + k \log n,$$
where $n$ is the sample size. We use this criterion because it selects the appropriate model with a probability approaching 1 as the sample size grows [11,13].
We organize this paper as follows. Section Two presents the Bayesian adaptive bridge for the ordinal model. Section Three covers the Bayesian inference: prior elicitation, hierarchical representation and full conditional distributions (FCD). Section Four describes the computation. Section Five reports the simulation study and Section Six the real data application. The conclusions are presented in Section Seven.
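As a small illustration of how the two criteria can disagree, the following sketch computes AIC and BIC from a maximized log-likelihood. The two candidate models and their log-likelihood values are hypothetical numbers chosen for illustration; they are not results from this paper.

```python
import math

def aic(log_likelihood, n_params):
    """AIC = -2 log L + 2k."""
    return -2.0 * log_likelihood + 2.0 * n_params

def bic(log_likelihood, n_params, n_obs):
    """BIC = -2 log L + k log n."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)

# Hypothetical candidates: (maximized log-likelihood, number of parameters).
candidates = {"small": (-120.0, 3), "large": (-113.0, 8)}
n_obs = 150

# The preferred model under each criterion is the one with the smallest value.
best_aic = min(candidates, key=lambda m: aic(*candidates[m]))
best_bic = min(candidates, key=lambda m: bic(*candidates[m], n_obs))
print(best_aic, best_bic)
```

With these illustrative numbers AIC prefers the larger model while BIC, whose penalty grows with log n, prefers the smaller one.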

2-Bayesian adaptive bridge regression for ordinal model
In this section, we introduce the BABROM using different penalty parameters $\lambda_j$. The Bayesian adaptive bridge regression estimator can be written as follows:
$$\hat{\beta} = \arg\min_{\beta} \; (y - X\beta)^{\top}(y - X\beta) + \sum_{j=1}^{p} \lambda_j |\beta_j|^{\alpha},$$
where $\alpha$ is a positive constant [14]. Here, $\lambda_1, \dots, \lambda_p$ are different tuning parameters, so that every regression coefficient is multiplied by its own penalty parameter. We give a large penalty parameter to the unimportant variables [1,15,16]. The value of $\alpha$ in this paper is 0.5.
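The adaptive bridge penalty described above can be sketched in a few lines. This is a minimal illustration, assuming a penalty of the form sum over j of lambda_j * |beta_j|^alpha added to the residual sum of squares; the function names and the numerical values are hypothetical.

```python
def adaptive_bridge_penalty(beta, lam, alpha=0.5):
    """Sum of lambda_j * |beta_j|**alpha with one penalty parameter per coefficient."""
    if len(beta) != len(lam):
        raise ValueError("beta and lam must have the same length")
    return sum(l * abs(b) ** alpha for b, l in zip(beta, lam))

def penalized_rss(y, X, beta, lam, alpha=0.5):
    """Residual sum of squares plus the adaptive bridge penalty."""
    rss = 0.0
    for yi, xi in zip(y, X):
        fitted = sum(xij * bj for xij, bj in zip(xi, beta))
        rss += (yi - fitted) ** 2
    return rss + adaptive_bridge_penalty(beta, lam, alpha)

# Hypothetical coefficients; the second one gets the largest penalty parameter.
beta = [4.0, 0.0, 1.0]
lam = [1.0, 5.0, 2.0]
print(adaptive_bridge_penalty(beta, lam, alpha=0.5))  # alpha used in this paper
print(adaptive_bridge_penalty(beta, lam, alpha=1.0))  # lasso-type penalty
print(adaptive_bridge_penalty(beta, lam, alpha=2.0))  # ridge-type penalty
```

Setting all entries of lam to the same value recovers the ordinary bridge penalty with a single tuning parameter.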
The analysis of ordinal data is relatively simple in the frequentist approach. Although the asymptotic theory for the ordinal regression (OR) model has been well studied, the Bayesian method enables exact estimation even when $p$ is greater than $n$ [17].
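In a simple univariate case, the response variable $y_i$ takes one of $C$ ordered values, where $c = 1, \dots, C$. If the errors follow a normal distribution (ND) with cumulative distribution function (cdf) $\Phi$, the probability that $y_i$ equals category $c$ can be written as [18]:
$$\Pr(y_i = c) = \Phi(\delta_c) - \Phi(\delta_{c-1}),$$
where $\delta_0, \dots, \delta_C$ are cut-points whose coordinates satisfy $-\infty = \delta_0 < \delta_1 < \dots < \delta_C = \infty$. Here $\delta_{c-1}$ is the lower bound and $\delta_c$ is the upper bound of the interval corresponding to the response $y_i = c$ [7,19]. We can motivate the problem by assuming that a latent variable $z_i$ depends on a $p$-vector of covariates $x_i$ through the model $z_i = x_i^{\top}\beta + \varepsilon_i$, and that the response variable is:
$$y_i = c \quad \text{if } \delta_{c-1} < z_i \le \delta_c, \qquad c = 1, \dots, C.$$
The probability of $y_i = c$, conditional on $\beta$ and $\delta = (\delta_1, \dots, \delta_{C-1})$, is given by
$$\Pr(y_i = c \mid \beta, \delta) = \Phi(\delta_c - x_i^{\top}\beta) - \Phi(\delta_{c-1} - x_i^{\top}\beta).$$
We can therefore write the likelihood function for the model as:
$$f(y \mid \beta, \delta) = \prod_{i=1}^{n} \prod_{c=1}^{C} \left[ \Phi(\delta_c - x_i^{\top}\beta) - \Phi(\delta_{c-1} - x_i^{\top}\beta) \right]^{1\{y_i = c\}}.$$
The indicator function $1\{y_i = c\}$ equals one for the event $y_i = c$ and zero otherwise [6,20]. Setting $\delta_1 = 0$ removes the possibility of shifting the distribution without changing the probability of observing $y_i$.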
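The category probabilities and the likelihood of the ordinal probit model can be sketched as follows. This is a minimal illustration that assumes a standard normal latent error and interior cut-points supplied in increasing order; the function names and example values are hypothetical.

```python
import math
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal cdf

def category_probs(eta, cutpoints):
    """P(y = c) for c = 1..C given eta = x'beta and interior cut-points.

    cutpoints holds delta_1 < ... < delta_{C-1}; the outer bounds
    delta_0 = -inf and delta_C = +inf are added here.
    """
    bounds = [-math.inf] + list(cutpoints) + [math.inf]
    cdf = []
    for b in bounds:
        if b == -math.inf:
            cdf.append(0.0)
        elif b == math.inf:
            cdf.append(1.0)
        else:
            cdf.append(PHI(b - eta))
    return [cdf[c] - cdf[c - 1] for c in range(1, len(bounds))]

def log_likelihood(y, X, beta, cutpoints):
    """Sum over observations of log P(y_i = observed category)."""
    total = 0.0
    for yi, xi in zip(y, X):
        eta = sum(xij * bj for xij, bj in zip(xi, beta))
        total += math.log(category_probs(eta, cutpoints)[yi - 1])
    return total

# Three categories with the first cut-point fixed at zero.
probs = category_probs(eta=0.0, cutpoints=[0.0, 1.0])
print(probs, sum(probs))
```

The probabilities sum to one for any value of eta, which is a quick check that the cut-points are ordered correctly.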
By fixing the cut-point $\delta_1 = 0$, in addition to $\delta_0 = -\infty$ and $\delta_C = \infty$, the identification problem is typically corrected in a simple way. In Bayesian inference, determining the prior distributions for the parameters is a very important step [7,21], because the prior plays an important role in the analysis [22]. A prior must be selected with care, because problems can occur if it is chosen carelessly [23].

3-Bayesian inference
3-1 Prior elicitation
In this article, similar to Mallick and Yi [2], we consider the conditional generalized Gaussian (GG) prior specification:
$$\pi(\beta \mid \lambda_1, \dots, \lambda_p, \alpha) \propto \prod_{j=1}^{p} \exp\left( -\lambda_j |\beta_j|^{\alpha} \right).$$
We solve the problem (formula 6) by using the Gibbs sampler (GS), which constructs a Markov chain that has the joint posterior of the model parameters as its stationary distribution [11].
We present a new procedure, namely the BABROM, by using a scale mixture of uniforms (SMU). To proceed with the Bayesian analysis, we use the fact that the generalized Gaussian distribution (GGD) can be written as an SMU, and we adapt the GGD to this form following Mallick and Yi [2]. In practice, we have found that the mixture representation (14) performs better than (13) in sampling the regression coefficients in terms of prediction accuracy. Albert and Chib (2001) proposed a logarithmic transformation for sampling the cut-points.
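The SMU idea can be checked numerically. The sketch below assumes one standard form of the mixture: a density proportional to exp(-lambda |beta|^alpha) is obtained by drawing u from a gamma distribution with shape 1/alpha + 1 and rate lambda, and then drawing beta uniformly on (-u^(1/alpha), u^(1/alpha)). This form is an assumption made for illustration and may differ in detail from representations (13) and (14) of the paper. Under it, |beta|^alpha has mean 1/(alpha lambda), which the Monte Carlo average should reproduce.

```python
import random

def sample_gg_via_smu(lam, alpha, rng):
    """One draw from a density proportional to exp(-lam * |beta|**alpha)."""
    # random.gammavariate takes (shape, scale), so the scale is 1 / rate.
    u = rng.gammavariate(1.0 / alpha + 1.0, 1.0 / lam)
    half_width = u ** (1.0 / alpha)
    return rng.uniform(-half_width, half_width)

rng = random.Random(0)
alpha, lam = 0.5, 2.0  # illustrative values
draws = [sample_gg_via_smu(lam, alpha, rng) for _ in range(20000)]

mc_moment = sum(abs(b) ** alpha for b in draws) / len(draws)
exact_moment = 1.0 / (alpha * lam)
print(mc_moment, exact_moment)
```

The appeal of the representation is that, conditional on u, the prior on each coefficient is flat on a bounded interval, which keeps the full conditional distributions tractable.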
The transformation is $\gamma_c = \log(\delta_c - \delta_{c-1})$, $2 \le c \le C-1$. To eliminate the high autocorrelation of the cut-points, we use the transformed cut-points $\gamma$ instead of $\delta$ to obtain the parameters of the tailored proposal in the Metropolis-Hastings (MH) step for $\gamma$. We assume that the prior of is ( ). The prior of $\alpha$ is assumed to be a beta distribution (BD).
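A short sketch of the logarithmic transformation of the cut-points and its inverse follows. It assumes that the first interior cut-point is fixed (at zero here) and that the remaining ones are recovered by accumulating exponentiated increments; the function names are hypothetical.

```python
import math

def to_unconstrained(cutpoints):
    """Map increasing cut-points [d_1, ..., d_{C-1}] to log increments."""
    return [math.log(b - a) for a, b in zip(cutpoints, cutpoints[1:])]

def to_cutpoints(gammas, first=0.0):
    """Inverse map: rebuild ordered cut-points from unconstrained values."""
    cutpoints = [first]
    for g in gammas:
        cutpoints.append(cutpoints[-1] + math.exp(g))
    return cutpoints

gammas = to_unconstrained([0.0, 0.8, 2.1])
print(gammas)
print(to_cutpoints(gammas))
```

Because every real vector of transformed values maps back to strictly increasing cut-points, a proposal on the transformed scale never violates the ordering constraint.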

3-2 Hierarchical Representation
The hierarchical representation for BABROM can be written as follows:

3-3 Full Conditional Distributions
Under the hierarchical representation in ( ), we write the full conditional distributions (FCD) of the parameters as follows: The FCD of is as follows: The FCD of is given by: The FCD of is given by: To calculate , we assign a beta prior ( ) and the FCD of is given by:

4-Computation
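First, we construct the Gibbs sampler for the BABROM procedure by setting initial values for all parameters. Then we execute the algorithm as follows: Following [20], we generate the proposal $\gamma' \sim f_T(\gamma \mid \hat{\gamma}, \hat{D}, \nu)$. Here $f_T$ is a t distribution, where $\hat{\gamma} = \arg\max_{\gamma} f(y \mid \beta, \gamma)\,\pi(\gamma)$ is the mode, $\nu$ refers to the degrees of freedom and $\hat{D}$ is the inverse of the negative Hessian evaluated at $\hat{\gamma}$. Given the current value $\gamma$ and the proposed draw $\gamma'$, we return $\gamma'$ with the probability of:
$$\min\left\{ 1, \; \frac{f(y \mid \beta, \gamma')\,\pi(\gamma')}{f(y \mid \beta, \gamma)\,\pi(\gamma)} \cdot \frac{f_T(\gamma \mid \hat{\gamma}, \hat{D}, \nu)}{f_T(\gamma' \mid \hat{\gamma}, \hat{D}, \nu)} \right\},$$
and otherwise keep the current value $\gamma$. The efficient GS uses these full conditionals to draw samples from every full conditional posterior distribution (cpd). The sampling process continues until all chains converge.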
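The accept or reject step with a tailored t proposal can be sketched in one dimension. This is an illustrative toy, not the sampler of the paper: the target is a made-up unnormalized log posterior, and the proposal location and scale are supplied by hand rather than obtained from a numerical mode search and Hessian. All names are hypothetical.

```python
import math
import random

def t_logpdf(x, loc, scale, nu):
    """Log density of a location-scale Student t distribution."""
    z = (x - loc) / scale
    return (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
            - 0.5 * math.log(nu * math.pi) - math.log(scale)
            - (nu + 1) / 2 * math.log1p(z * z / nu))

def t_draw(loc, scale, nu, rng):
    """Draw from the same t distribution as a normal over a scaled chi-square."""
    chi2 = rng.gammavariate(nu / 2, 2.0)
    return loc + scale * rng.gauss(0.0, 1.0) / math.sqrt(chi2 / nu)

def mh_step(current, log_target, loc, scale, nu, rng):
    """One independence Metropolis-Hastings step; returns (value, accepted)."""
    proposed = t_draw(loc, scale, nu, rng)
    log_ratio = (log_target(proposed) - log_target(current)
                 + t_logpdf(current, loc, scale, nu)
                 - t_logpdf(proposed, loc, scale, nu))
    if rng.random() < math.exp(min(0.0, log_ratio)):
        return proposed, True
    return current, False

# Toy unnormalized log posterior: normal with mean 1 and standard deviation 0.5.
log_target = lambda g: -0.5 * ((g - 1.0) / 0.5) ** 2

rng = random.Random(1)
gamma, draws, accepted = 0.0, [], 0
for _ in range(5000):
    gamma, ok = mh_step(gamma, log_target, loc=1.0, scale=0.5, nu=5, rng=rng)
    draws.append(gamma)
    accepted += ok

posterior_mean = sum(draws) / len(draws)
acceptance_rate = accepted / len(draws)
print(posterior_mean, acceptance_rate)
```

Working with log densities avoids numerical underflow, and the heavier tails of the t proposal help keep the acceptance ratio bounded when the target has lighter tails.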

5-Simulation study
In this section, the performance of the proposed method is illustrated by simulations. The proposed method is compared with Bayesian lasso median regression, ordered logistic regression and ordinal probit regression, as in Jeliazkov et al. (2008). These methods are evaluated based on the median of mean absolute deviations (MMAD) over 100 replications and its standard deviation (SD). The Bayesian estimates are posterior mean estimates using 11,000 samples of the Gibbs sampler after discarding the first 1,000 samples as burn-in. In this simulation, we set .
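The MMAD criterion can be sketched as follows. The text does not state its formula, so this example assumes a commonly used definition: for each replication, the mean over observations of the absolute difference between the fitted and true linear predictors, followed by the median over replications. The data in the example are made up.

```python
from statistics import median

def mad_one_replication(X, beta_hat, beta_true):
    """Mean absolute deviation between fitted and true linear predictors."""
    total = 0.0
    for xi in X:
        diff = sum(x * (bh - bt) for x, bh, bt in zip(xi, beta_hat, beta_true))
        total += abs(diff)
    return total / len(X)

def mmad(replications, beta_true):
    """Median over replications; each replication is a pair (X, beta_hat)."""
    return median(mad_one_replication(X, bh, beta_true) for X, bh in replications)

# Made-up example with three replications sharing one design matrix.
X = [[1.0, 0.0], [0.0, 1.0]]
beta_true = [1.0, 2.0]
replications = [(X, [1.5, 2.0]), (X, [1.0, 2.0]), (X, [2.0, 3.0])]
print(mmad(replications, beta_true))
```

Using the median across replications makes the summary less sensitive to a few replications in which a method fits poorly.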

5-1 Simulation
In this simulation study, we generate 200 observations from the model $z_i = x_i^{\top}\beta + \varepsilon_i$, where $x_i$ is a vector of 10 covariates simulated from a standard multivariate normal distribution and $\varepsilon_i$ is simulated from a standard normal distribution. We set ( ) and ( ). In the next simulation, we will consider the following values:   Figure-0 shows that the sampler moves from one point to another in relatively few steps. Figure-
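The data-generating scheme described in this subsection can be sketched as follows. The coefficient vector and the cut-points in the example are hypothetical placeholders, since the values used in the paper are not reproduced here; only the sample size, the number of covariates and the standard normal draws follow the text.

```python
import random

def simulate_ordinal(n, beta, cutpoints, rng):
    """Simulate (X, y) from a latent normal model with ordered cut-points."""
    X, y = [], []
    for _ in range(n):
        xi = [rng.gauss(0.0, 1.0) for _ in beta]           # independent N(0, 1) covariates
        zi = sum(x * b for x, b in zip(xi, beta)) + rng.gauss(0.0, 1.0)
        yi = 1 + sum(zi > d for d in cutpoints)             # category = 1 + cut-points below z
        X.append(xi)
        y.append(yi)
    return X, y

rng = random.Random(2020)
beta = [1.0, 0.0, 0.5, 0.0, 0.0, -1.0, 0.0, 0.0, 0.0, 0.0]  # hypothetical values
cutpoints = [0.0, 1.0]                                      # hypothetical, three categories
X, y = simulate_ordinal(200, beta, cutpoints, rng)
print(len(X), len(X[0]), sorted(set(y)))
```

Each observed category is simply the index of the interval between consecutive cut-points into which the latent value falls.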

6-Real data application
In this section, we apply our proposed method to real data. The data were collected through a questionnaire given to employees and workers of the Oil Products Distribution Company in Thi-Qar, Iraq. The questionnaire concerned the management of the company and how it deals with employees. The sample size is 150 observations and the number of covariates is 17.
For the real data, we estimated the parameters using our proposed method and the comparison methods. The results indicated that our proposed method is better than the comparison methods. We computed the deviance information criterion (DIC) for the three models (BABROM, ORD (logistic) and ORD (probit)). The values were 350.8743, 389.2391 and 378.2251, respectively. Because a smaller DIC indicates a better trade-off between fit and complexity, these results show that our proposed method performs better than the ORD (logistic) and ORD (probit) methods.
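For reference, DIC can be computed from posterior draws as sketched below. The sketch assumes the usual definition, in which the deviance is minus twice the log-likelihood, the effective number of parameters is the mean deviance minus the deviance at the posterior mean, and DIC is their combination. The log-likelihood values in the first call are made up; the final comparison uses the three DIC values reported above.

```python
def dic(loglik_draws, loglik_at_posterior_mean):
    """Return (DIC, p_D) from per-draw log-likelihoods and the plug-in value."""
    mean_deviance = sum(-2.0 * ll for ll in loglik_draws) / len(loglik_draws)
    deviance_at_mean = -2.0 * loglik_at_posterior_mean
    p_d = mean_deviance - deviance_at_mean      # effective number of parameters
    return deviance_at_mean + 2.0 * p_d, p_d

# Made-up log-likelihood values for three posterior draws.
value, p_d = dic([-101.0, -99.0, -100.0], -98.0)
print(value, p_d)

# DIC values reported in this section; the smallest one is preferred.
reported = {"BABROM": 350.8743, "ORD (logistic)": 389.2391, "ORD (probit)": 378.2251}
best_model = min(reported, key=reported.get)
print(best_model)
```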

7-Conclusion
In this paper, we developed a Bayesian adaptive bridge regression for the ordinal model in the univariate case. Our method is based on a conditionally conjugate prior distribution for the regression parameters, for which we developed a new hierarchical representation. To estimate the ordinal regression parameters, we introduced a Gibbs sampler that generates samples from the posterior distribution. In the simulation study, the MMAD, SD, MSE, cut-point estimates and DIC indicated that our proposed method is better than the comparison methods. Overall, in comparison with the existing ORD (logistic) and ORD (probit) methods, the Bayesian ordinal regression method with the conjugate prior distribution generally performs better.