Assignment on Model selection in Medical Research, A simulation study comparing Bayesian Model Averaging and Stepwise Regression

Summary

Although automatic variable selection methods are discouraged, they are valuable where subject matter information is limited. Comparing Bayesian model averaging with automatic selection procedures therefore merits investigation. The most popular automatic method is stepwise regression, which has performed poorly in simulations. Each step of the model building process, including model selection, needs to be evaluated. This study uses linear regression to compare Bayesian model averaging with stepwise regression that builds the model using the Akaike Information Criterion (AIC) and then applies a 0.05 significance criterion for inclusion in the final model. Linear regression was chosen because it allows better control of the effect size of the true predictors (Genell, Nemes, Steineck, & Dickman, 2010).
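To make the stepwise procedure concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a combined forward/backward search on AIC (the summary does not say which direction was used), a single final pass that drops variables with p > 0.05, and a normal approximation to the t distribution for the p-values. The names `fit_ols` and `stepwise_aic` and the small synthetic data set are hypothetical.

```python
import math
import random

def fit_ols(X, y, cols):
    """OLS with intercept on columns `cols`; returns (aic, p_values by column)."""
    rows = [[1.0] + [r[c] for c in cols] for r in X]
    n, k = len(y), len(cols) + 1
    # Augmented matrix [A^T A | I | A^T y]; Gauss-Jordan elimination turns it
    # into [I | (A^T A)^-1 | coefficients].
    M = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [1.0 if i == j else 0.0 for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    for i in range(k):
        p = max(range(i, k), key=lambda q: abs(M[q][i]))  # partial pivoting
        M[i], M[p] = M[p], M[i]
        M[i] = [a / M[i][i] for a in M[i]]
        for q in range(k):
            if q != i:
                f = M[q][i]
                M[q] = [a - f * b for a, b in zip(M[q], M[i])]
    coef = [M[i][2 * k] for i in range(k)]
    rss = sum((yi - sum(c * v for c, v in zip(coef, r))) ** 2
              for r, yi in zip(rows, y))
    aic = n * math.log(rss / n) + 2 * k  # Gaussian AIC up to an additive constant
    s2 = rss / (n - k)
    p_values = {}
    for idx, c in enumerate(cols, start=1):
        t = coef[idx] / math.sqrt(s2 * M[idx][k + idx])
        # ASSUMPTION: two-sided p-value from the normal approximation to t.
        p_values[c] = math.erfc(abs(t) / math.sqrt(2))
    return aic, p_values

def stepwise_aic(X, y, alpha=0.05):
    """Stepwise search on AIC, then drop selected variables with p > alpha."""
    selected = []
    best_aic, _ = fit_ols(X, y, selected)
    improved = True
    while improved:
        improved = False
        candidates = [selected + [c] for c in range(len(X[0])) if c not in selected]
        candidates += [[c for c in selected if c != d] for d in selected]
        for cols in candidates:
            aic, _ = fit_ols(X, y, cols)
            if aic < best_aic - 1e-9:
                best_aic, best_cols, improved = aic, cols, True
        if improved:
            selected = best_cols
    _, p_values = fit_ols(X, y, selected)
    return sorted(c for c in selected if p_values[c] <= alpha)

# Hypothetical demo data: 6 candidate variables, only column 0 drives the outcome.
rng = random.Random(0)
X = [[rng.gauss(0.0, 1.0) for _ in range(6)] for _ in range(200)]
y = [row[0] + rng.gauss(0.0, 1.0) for row in X]
chosen = stepwise_aic(X, y)
print(chosen)
```

With a strong true predictor, column 0 should appear in `chosen`; whether any redundant column survives the p-value filter varies with the random draw.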

Data Simulation

Variables that generate the outcome are true predictors; the remaining variables are redundant. Each of the 300 simulations started by generating 500 observations of 20 independent, identically distributed random variables from the standard normal distribution. The simulations were repeated independently 300 times for each data generating process and for each of 30 different values of sigma. Stepwise regression selected variables using AIC; in the final step, previously selected variables with a p-value above 0.05 were excluded. The study focuses on areas where subject matter knowledge is extremely limited, so the analysis presumes that no prior knowledge is available. For that reason, noninformative priors were used for Bayesian model averaging.
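The data generation step can be sketched as follows. This is an illustration under stated assumptions, not the study's code: the summary does not spell out the five data generating processes, so the example assumes a single true predictor (the first column) with outcome y = beta * x_1 + sigma * noise, and the grid of 30 sigma values is hypothetical. The function name `simulate_dataset` is likewise made up for this sketch.

```python
import random

def simulate_dataset(n_obs=500, n_vars=20, beta=1.0, sigma=1.0, seed=0):
    """Return (X, y) for one simulated data set.

    X holds n_obs rows of n_vars independent standard normal draws.
    ASSUMPTION: only the first column is a true predictor, so
    y = beta * x_1 + sigma * noise. The other 19 columns are redundant.
    """
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_vars)] for _ in range(n_obs)]
    y = [beta * row[0] + sigma * rng.gauss(0.0, 1.0) for row in X]
    return X, y

# ASSUMPTION: 30 hypothetical noise levels; the original sigma grid is not given.
sigmas = [0.5 * (k + 1) for k in range(30)]

X, y = simulate_dataset(sigma=sigmas[0], seed=1)
print(len(X), len(X[0]), len(y))  # 500 20 500
```

A full run would loop over the 30 sigma values and draw 300 independent data sets for each one, for example by passing a different `seed` per repetition.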

Method Comparison and Results

The selection methods were compared on the probability of selecting a true predictor and the probability of not selecting a redundant variable. The probability of selecting the correct model was also evaluated. Bayesian model averaging almost never selected a redundant variable: with the 50% threshold it selected one about 1 time in 100, whereas stepwise regression selected one with a probability of 0.05. These probabilities did not depend on the effect size of the true predictor as long as the redundant variables were uncorrelated with it. The exception was data generating process 4, in which a redundant variable was correlated with a true predictor. The probability of selecting a true predictor increased with the effect size of that predictor.
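The three comparison measures named above can be estimated from repeated simulations as simple frequencies. The sketch below is one plausible way to compute them, assuming each simulation yields a set of selected variable indices; the function name `selection_summary` and the toy input are hypothetical.

```python
def selection_summary(selected_sets, true_predictors, n_vars):
    """Summarise variable selection over repeated simulations.

    selected_sets: one collection of selected variable indices per simulation.
    Returns the selection frequency averaged over true predictors, the same
    averaged over redundant variables, and the share of simulations in which
    exactly the true predictors were selected.
    """
    true_predictors = set(true_predictors)
    redundant = set(range(n_vars)) - true_predictors
    n_sims = len(selected_sets)
    true_hits = sum(len(set(s) & true_predictors) for s in selected_sets)
    redundant_hits = sum(len(set(s) & redundant) for s in selected_sets)
    correct = sum(1 for s in selected_sets if set(s) == true_predictors)
    return {
        "p_true": true_hits / (n_sims * len(true_predictors)),
        "p_redundant": redundant_hits / (n_sims * len(redundant)),
        "p_correct_model": correct / n_sims,
    }

# Toy example: 4 simulations, 5 candidate variables, variable 0 is the true predictor.
runs = [{0}, {0, 3}, set(), {0}]
print(selection_summary(runs, true_predictors={0}, n_vars=5))
# {'p_true': 0.75, 'p_redundant': 0.0625, 'p_correct_model': 0.5}
```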

In selecting a true predictor, Bayesian model averaging with the 50% threshold and stepwise regression performed similarly, and both performed better than Bayesian model averaging with the 95% threshold. The probability of selecting a true predictor levelled out at 1 for data generating processes 2 and 3. The probability of selecting an indirect predictor was approximately constant at 0 for Bayesian model averaging with the 95% threshold. For stepwise regression, it rose to approximately 0.2 for effect sizes corresponding to a t-test statistic between 0 and 3, then fell and levelled out at approximately 0.1 from a t-test statistic of approximately 7. For every method, the probability of selecting the correct model increased with the effect size of the true predictor. Across the data generating processes, this probability mostly levelled out at approximately 0.3 for stepwise regression and at approximately 0.8 for Bayesian model averaging with the 50% threshold.

The simulations covered five pre-determined data generating processes, each at 30 different values of the effect size, and every simulated data set was analysed with both stepwise regression and Bayesian model averaging. Bayesian model averaging rarely selected a redundant variable, whereas stepwise regression did so more often. Depending on the effect size, a redundant variable that correlates with the true predictor was selected less frequently by Bayesian model averaging than by stepwise regression, which often selected such a variable more than 1 time out of 4. With the 50% posterior probability threshold, Bayesian model averaging performed like stepwise regression in selecting a true predictor. The study by Wang and co-workers also evaluated the probability of selecting a true predictor: it compared Bayesian model averaging with stepwise regression and found that both selected the two true predictors 10 times out of 10. The study by Raftery and co-workers further supports the finding that Bayesian model averaging has a similar probability of selecting a true predictor as stepwise regression.

Depending on the effect size, Bayesian model averaging almost never selected an indirect predictor, while stepwise regression did. The main comparison is between stepwise regression and Bayesian model averaging with the 50% posterior probability threshold. In this study, Bayesian model averaging was used as a model selection method. For interpreting posterior probabilities, Kass and Raftery offer informative thresholds, implying that the 50% posterior probability threshold corresponds to the 0.05 p-value significance level. Linear regression was used in the simulations because the variance can be controlled independently of the regression coefficient, which in turn controls the effect size. The data generating processes were deliberately kept small and simple so that differences between the model selection methods would be easy to see.
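Using Bayesian model averaging for selection means keeping a variable when its posterior inclusion probability exceeds a threshold such as 50% or 95%. The sketch below illustrates that idea under loudly stated assumptions: it approximates posterior model probabilities with BIC weights and a uniform prior over models, which is a common shortcut and not necessarily the prior setup used in the study, and it enumerates all models over only 4 hypothetical candidate variables. The names `ols_rss` and `inclusion_probabilities` are made up for this example.

```python
import itertools
import math
import random

def ols_rss(X, y, cols):
    """Residual sum of squares of an OLS fit with intercept on columns `cols`."""
    rows = [[1.0] + [r[c] for c in cols] for r in X]
    k = len(cols) + 1
    # Normal equations (A^T A) b = A^T y, stored as an augmented matrix.
    M = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    for i in range(k):  # Gauss-Jordan elimination with partial pivoting
        p = max(range(i, k), key=lambda q: abs(M[q][i]))
        M[i], M[p] = M[p], M[i]
        for q in range(k):
            if q != i:
                f = M[q][i] / M[i][i]
                M[q] = [a - f * b for a, b in zip(M[q], M[i])]
    coef = [M[i][k] / M[i][i] for i in range(k)]
    return sum((yi - sum(c * v for c, v in zip(coef, r))) ** 2
               for r, yi in zip(rows, y))

def inclusion_probabilities(X, y):
    """Posterior inclusion probability per column, using BIC model weights."""
    n, p = len(y), len(X[0])
    models, bics = [], []
    for size in range(p + 1):
        for cols in itertools.combinations(range(p), size):
            bic = n * math.log(ols_rss(X, y, cols) / n) + (size + 1) * math.log(n)
            models.append(cols)
            bics.append(bic)
    best = min(bics)
    # ASSUMPTION: uniform model prior, so weights are proportional to exp(-BIC/2).
    weights = [math.exp(-0.5 * (b - best)) for b in bics]
    total = sum(weights)
    return [sum(w for m, w in zip(models, weights) if j in m) / total
            for j in range(p)]

# Hypothetical demo data: 4 candidate variables, only column 0 drives the outcome.
rng = random.Random(0)
X = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(200)]
y = [row[0] + rng.gauss(0.0, 1.0) for row in X]

probs = inclusion_probabilities(X, y)
selected_50 = [j for j, pr in enumerate(probs) if pr > 0.50]
selected_95 = [j for j, pr in enumerate(probs) if pr > 0.95]
print([round(pr, 3) for pr in probs], selected_50, selected_95)
```

Enumerating every model is only feasible for a handful of candidates; with the 20 variables used in the study there are over a million models, so a practical implementation would restrict or sample the model space.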

The results depend on the complexity of the data generating process. If Bayesian model averaging shows the same results in more complex, real-life data structures, that would support its reliability; this requires further research. The simulations showed that, under the given circumstances, Bayesian model averaging had a higher probability of not selecting a redundant variable than stepwise regression and a similar probability of selecting a true predictor. Medical researchers who build regression models with limited subject matter knowledge can therefore benefit from Bayesian model averaging.

References

Genell, A., Nemes, S., Steineck, G., & Dickman, P. W. (2010, December 6). Model selection in medical research: A simulation study comparing Bayesian model averaging and stepwise regression. BMC Medical Research Methodology.