dc.contributor.author |
Karagrigoriou, A |
en |
dc.contributor.author |
Koukouvinos, C |
en |
dc.contributor.author |
Mylona, K |
en |
dc.date.accessioned |
2014-03-01T01:34:01Z |
|
dc.date.available |
2014-03-01T01:34:01Z |
|
dc.date.issued |
2010 |
en |
dc.identifier.issn |
0266-4763 |
en |
dc.identifier.uri |
https://dspace.lib.ntua.gr/xmlui/handle/123456789/20644 |
|
dc.subject |
Deviance |
en |
dc.subject |
Generalized linear model |
en |
dc.subject |
High-dimensional data set |
en |
dc.subject |
Model selection |
en |
dc.subject |
Non-concave penalized likelihood |
en |
dc.subject |
Trauma |
en |
dc.subject.classification |
Statistics & Probability |
en |
dc.subject.other |
ASYMPTOTICALLY EFFICIENT SELECTION |
en |
dc.subject.other |
VARIABLE SELECTION |
en |
dc.subject.other |
CROSS-VALIDATION |
en |
dc.subject.other |
REGRESSION |
en |
dc.subject.other |
ORDER |
en |
dc.subject.other |
LASSO |
en |
dc.title |
On the advantages of the non-concave penalized likelihood model selection method with minimum prediction errors in large-scale medical studies |
en |
heal.type |
journalArticle |
en |
heal.identifier.primary |
10.1080/02664760802638116 |
en |
heal.identifier.secondary |
http://dx.doi.org/10.1080/02664760802638116 |
en |
heal.language |
English |
en |
heal.publicationDate |
2010 |
en |
heal.abstract |
Variable and model selection problems are fundamental to high-dimensional statistical modeling in diverse fields of science. In health studies especially, many potential factors are usually introduced to determine an outcome variable. This paper deals with the problem of high-dimensional statistical modeling through the analysis of the annual trauma data for Greece in 2005. The data set consists of 6334 observations and 112 factors, including demographic, transport and intrahospital data, used to detect possible risk factors of death, and is divided into an experiment set and a control set. Different model selection techniques are applied to the experiment set, and the deviance is computed on the control set to assess the fit of the overall selected model. The statistical methods employed were the non-concave penalized likelihood methods (smoothly clipped absolute deviation, least absolute shrinkage and selection operator, and Hard), generalized linear logistic regression, and best subset variable selection. The identification of significant variables in large medical data sets is discussed, along with the performance and the pros and cons of the various statistical techniques used. The analysis reveals distinct advantages of the non-concave penalized likelihood methods over the traditional model selection techniques. © 2010 Taylor & Francis. |
en |
heal.publisher |
ROUTLEDGE JOURNALS, TAYLOR & FRANCIS LTD |
en |
heal.journalName |
Journal of Applied Statistics |
en |
dc.identifier.doi |
10.1080/02664760802638116 |
en |
dc.identifier.isi |
ISI:000272848000002 |
en |
dc.identifier.volume |
37 |
en |
dc.identifier.issue |
1 |
en |
dc.identifier.spage |
13 |
en |
dc.identifier.epage |
24 |
en |