Purpose: This paper proposes a data mining approach to evaluate a conceptual model in tourism, using a large data set characterized by dimensions grounded in existing literature. Design/methodology/approach: The approach is tested using a guest satisfaction model encompassing nine dimensions. A large data set of 84k online reviews and 31 features was collected from TripAdvisor. The review score was considered a proxy for guest satisfaction and was defined as the target feature to model. A sequence of data understanding and preparation tasks led to a tuned set of 60k reviews and 29 input features, which were used to train the data mining model. Finally, a data-based sensitivity analysis was adopted to understand which dimensions most influence guest satisfaction. Findings: Users’ previous experience with the online platform, individual preferences, and hotel prestige were the most relevant dimensions for guest satisfaction. Conversely, characteristics that are homogeneous among the Las Vegas hotels, such as hotel size, were found to be of little relevance to satisfaction. Originality/value: This study intends to set a baseline for easier adoption of data mining to evaluate conceptual models through a scalable approach, helping to bridge theory and practice, which is especially relevant when dealing with Big Data sources such as social media. Thus, the steps undertaken during the study are detailed to facilitate replication with other models.
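To make the sensitivity analysis step more concrete, the following is a minimal sketch of one common way to compute a data-based sensitivity analysis, not the study's actual implementation. It rests on several assumptions: the abstract does not name the learning algorithm or software, so an ordinary least squares fit stands in for the data mining model; the data are synthetic, not the TripAdvisor reviews; the function names `fit_linear` and `dsa_importance`, the number of levels, the sample size, and the use of the response range as the sensitivity measure are all illustrative choices.

```python
import numpy as np

def fit_linear(X, y):
    """Stand-in model (assumption): ordinary least squares with an intercept."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    # Return a prediction function so any fitted model could be swapped in.
    return lambda Z: np.column_stack([np.ones(len(Z)), Z]) @ coef

def dsa_importance(predict, X, levels=7, sample_size=200, seed=0):
    """Relative importance of each input via a data-based sensitivity sketch.

    For each feature, a random sample of real rows is taken, the feature is
    forced to each of `levels` values across its observed range, and the
    mean prediction is recorded. The range of those mean predictions is the
    sensitivity measure; importances are the measures normalized to sum to 1.
    """
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=min(sample_size, len(X)), replace=False)
    sample = X[idx]
    spans = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        grid = np.linspace(X[:, j].min(), X[:, j].max(), levels)
        responses = []
        for value in grid:
            perturbed = sample.copy()
            perturbed[:, j] = value  # vary one feature, keep the others as observed
            responses.append(predict(perturbed).mean())
        spans[j] = max(responses) - min(responses)
    total = spans.sum()
    return spans / total if total > 0 else spans

# Synthetic toy data (assumption): the target depends strongly on feature 0,
# weakly on feature 1, and not at all on feature 2.
rng = np.random.default_rng(42)
X = rng.uniform(0, 1, size=(1000, 3))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 0.1, size=1000)

model = fit_linear(X, y)
importance = dsa_importance(model, X)
print(importance)
```

In a setting like the one described, the per-feature importances could then be summed within each dimension of the conceptual model to rank dimensions, although how the study aggregated features into dimensions is not stated here.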