Singularity Score For Evaluating Topic Relevance In Tiny Text
Event Title
WorldCist'26 - 14th World Conference on Information Systems and Technologies
Year (definitive publication)
2026
Language
English
Country
Portugal
More Information
Web of Science®
This publication is not indexed in Web of Science®
Scopus
This publication is not indexed in Scopus
Google Scholar
This publication is not indexed in Google Scholar
Overton
This publication is not indexed in Overton
Abstract
Topic modeling is a widely used method for extracting relevant information and insights from text, given its strong results. When using this technique, the identified topics must be evaluated. However, when the text is very short, with fewer than 10 words per document on average, the classical evaluation metrics can be unreliable. To extract meaningful topics and identify the most suitable modeling technique, this study applied topic modeling to this type of data, referred to as tiny text, using user-generated Portuguese texts collected from post-its during PLANAPP workshops. Six datasets with different preprocessing steps were tested using LDA and BERTopic, the latter with two sentence-transformers (Multilingual and AlBERTina). As expected, the classical evaluation metrics proved inconsistent for such short texts, motivating the creation of a new measure of topic coherence, the Singularity Score (SS), which is intended to mimic human annotators. Results show that BERTopic produced more coherent topics, even though LDA scores higher on traditional metrics. In summary, this work demonstrates that topic modeling can be effectively applied to tiny Portuguese texts, identifies BERTopic as the most suitable approach, and introduces SS as a novel metric for assessing topic quality.
Acknowledgements
This work was partially supported by Fundação para a Ciência e a Tecnologia, I.P. (FCT) [Project 2024.07395.IACDC] [ISTAR Projects: UIDB/04466/2023 and UIDP/04466/2023]
Keywords
Topic Modelling, Tiny text, Singularity score, Topic evaluation, Text mining