Book chapter
Using Data Mining for Prediction of Hospital Length of Stay: An Application of the CRISP-DM Methodology
Nuno Caetano (Caetano, Nuno); Paulo Cortez (Cortez, Paulo); Raul Laureano (Laureano, Raul M. S.)
Book Title
Enterprise Information Systems
Year (final publication)
2015
Language
English
Country
Germany
More Information
Web of Science®

Number of citations: 15

(Last checked: 2024-03-28 06:31)

Scopus

Number of citations: 23

(Last checked: 2024-03-27 06:32)

Google Scholar

Number of citations: 39

(Last checked: 2024-03-28 18:59)

Abstract
Hospitals nowadays collect vast amounts of data related to patient records. These data hold valuable knowledge that can be used to improve hospital decision making. Data mining techniques aim precisely at the extraction of useful knowledge from raw data. This work describes the implementation of a medical data mining project based on the CRISP-DM methodology. Recent real-world data, from 2000 to 2013, related to inpatient hospitalization were collected from a Portuguese hospital. The goal was to predict generic hospital Length of Stay based on indicators that are commonly available during the hospitalization process (e.g., gender, age, episode type, medical specialty). At the data preparation stage, the data were cleaned and variables were selected and transformed, leading to 14 inputs. Next, at the modeling stage, a regression approach was adopted, in which six learning methods were compared: Average Prediction, Multiple Regression, Decision Tree, Artificial Neural Network ensemble, Support Vector Machine and Random Forest. The best learning model was obtained by the Random Forest method, which achieved a high coefficient of determination (0.81). This model was then opened using a sensitivity analysis procedure, which revealed three influential input attributes: the hospital episode type, the physical service where the patient is hospitalized and the associated medical specialty. This extracted knowledge confirmed that the predictive model is credible and has potential value for supporting the decisions of hospital managers.
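As a reading aid for the coefficient of determination reported above, the following minimal sketch shows how such a score could be computed and why an "Average Prediction" baseline scores zero when evaluated on the same data it was averaged from. It assumes the common definition R² = 1 - SS_res / SS_tot; the chapter may compute the metric differently. The function name `r_squared` and all numeric values are hypothetical illustrations, not data or code from the study.

```python
from statistics import mean


def r_squared(y_true, y_pred):
    """Coefficient of determination, assuming R^2 = 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    # Residual sum of squares: error of the predictions.
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((y - y_bar) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical lengths of stay in days (illustrative, not from the study).
los_days = [2, 5, 3, 10, 7]

# "Average Prediction" baseline: always predict the mean length of stay.
baseline_pred = [mean(los_days)] * len(los_days)

# Hypothetical predictions from some fitted regression model.
model_pred = [3, 5, 4, 9, 6]

r2_baseline = r_squared(los_days, baseline_pred)
r2_model = r_squared(los_days, model_pred)

print(f"baseline R^2: {r2_baseline:.2f}")
print(f"model R^2:    {r2_model:.2f}")
```

Under this definition, a score near 1 means the model explains most of the variance in length of stay, while a score near 0 means it does no better than predicting the average. On held-out data the baseline mean would come from the training set, so its score need not be exactly zero.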
Acknowledgements
--
Keywords
Hospital Length of Stay, Data Mining, CRISP-DM