Talk
A Data-driven Approach to Predict Hospital Length of Stay - A Portuguese Case Study
Nuno Caetano (Caetano, Nuno); Raul Laureano (Laureano, Raul M. S.); Paulo Cortez (Cortez, Paulo)
Event Title
Proceedings of the 16th International Conference on Enterprise Information Systems (ICEIS 2014)
Year (definitive publication)
2014
Language
English
Country
Portugal
More Information
Web of Science®

This publication is not indexed in Web of Science®

Scopus

Times Cited: 6

(Last checked: 2022-02-21 22:29)

Google Scholar

This publication is not indexed in Google Scholar

Abstract
Data Mining (DM) aims to extract useful knowledge from raw data. In recent decades, hospitals have collected large amounts of data through new methods of electronic data storage, increasing the potential value of DM in this domain, an area known as medical data mining. This work presents a case study of a Portuguese hospital, based on a recent and large dataset collected from 2000 to 2013. A data-driven predictive model was obtained for the length of stay (LOS), using as inputs indicators commonly available during the hospitalization process. Following a regression approach, several state-of-the-art DM models were compared. The best result was obtained by a Random Forest (RF), which achieved a high coefficient of determination (0.81). Moreover, a sensitivity analysis was used to extract human-understandable knowledge from the RF model, revealing the three most influential input attributes: the hospital episode type, the physical service where the patient is hospitalized, and the associated medical specialty. Such predictive and explanatory knowledge is valuable for supporting the decisions of hospital managers.
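The abstract names two quantities without showing how they are computed: the coefficient of determination used to compare the regression models, and the sensitivity analysis used to rank input attributes. The sketch below illustrates both in plain Python. It is not the authors' implementation. The function names, the toy predictor, and the attribute values are hypothetical, and the one-at-a-time sensitivity measure is a simplified stand-in for whatever variant the paper actually used.

```python
from statistics import mean


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def sensitivity_importance(predict, baseline, levels):
    """One-at-a-time sensitivity analysis (simplified, assumed variant).

    Each input is varied over its levels while the others stay at the
    baseline. The range of the model response is taken as that input's
    effect, then normalised so the importances sum to 1.
    """
    ranges = {}
    for name, values in levels.items():
        responses = [predict({**baseline, name: v}) for v in values]
        ranges[name] = max(responses) - min(responses)
    total = sum(ranges.values())
    if total == 0:
        # The model ignores every input: report zero importance for all.
        return {name: 0.0 for name in ranges}
    return {name: r / total for name, r in ranges.items()}


def toy_predict(x):
    """Hypothetical stand-in for a fitted model (NOT the paper's Random Forest)."""
    base_days = {"ambulatory": 1.0, "inpatient": 6.0}[x["episode_type"]]
    return base_days + 0.02 * x["age"]


# Hypothetical baseline patient and the levels explored for each attribute.
baseline = {"episode_type": "inpatient", "age": 50}
levels = {
    "episode_type": ["ambulatory", "inpatient"],
    "age": [20, 50, 80],
}
importance = sensitivity_importance(toy_predict, baseline, levels)
```

With this toy predictor, the episode type receives a larger importance than age, simply because the stand-in was written that way. With a real fitted model, `predict` would wrap the model's own prediction call.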
Keywords
Medical Data Mining, Length of Stay, CRISP-DM, Random Forest.
  • Physical Sciences - Natural Sciences