Crowdsmelling: A preliminary study on using collective knowledge in code smells detection

José Pereira dos Reis; Fernando Brito e Abreu; Glauco Carneiro

Scientific journal paper Q1

José Pereira dos Reis (Reis, J.); Fernando Brito e Abreu (Brito e Abreu, F.); Glauco Carneiro (Figueiredo Carneiro, G.);

Journal Title

Empirical Software Engineering

Year (definitive publication)

2022

Language

English

Country

Netherlands

Web of Science®

Times Cited: 17

(Last checked: 2026-07-22 10:46)

Article Impact Index: 1.1

Scopus

Times Cited: 19

(Last checked: 2026-07-23 00:22)

Article Impact Index: 1.0

Google Scholar

Times Cited: 23

(Last checked: 2026-07-22 16:37)

Overton

This publication is not indexed in Overton

Abstract

Code smells are seen as a major source of technical debt and, as such, should be detected and removed. However, researchers argue that the subjectiveness of the code smells detection process is a major hindrance to mitigating the problem of smells-infected code. This paper presents the results of a validation experiment for the Crowdsmelling approach proposed earlier. The latter is based on supervised machine learning techniques, where the wisdom of the crowd (of software developers) is used to collectively calibrate code smells detection algorithms, thereby lessening the subjectivity issue. In the context of three consecutive years of a Software Engineering course, a total "crowd" of around a hundred teams, with an average of three members each, classified the presence of three code smells (Long Method, God Class, and Feature Envy) in Java source code. These classifications were the basis of the oracles used for training six machine learning algorithms. Over one hundred models were generated and evaluated to determine which machine learning algorithms had the best performance in detecting each of the aforementioned code smells. Good performances were obtained for God Class detection (ROC=0.896 for Naive Bayes) and Long Method detection (ROC=0.870 for AdaBoostM1), but much lower for Feature Envy (ROC=0.570 for Random Forest). The results suggest that Crowdsmelling is a feasible approach for the detection of code smells. Further validation experiments based on dynamic learning are required to achieve comprehensive coverage of code smells and to increase external validity.
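
As an illustration only, the pipeline outlined in the abstract (crowd classifications, then an oracle, then a ROC-based evaluation) can be sketched in a few lines of Python. The abstract does not say how individual team classifications were combined into oracles, so the majority vote below is an assumption, as are the function names, the method names, the votes, and the use of lines of code as a stand-in detector score. None of this is the authors' implementation or data.

```python
from collections import Counter, defaultdict


def build_oracle(votes):
    """Majority-vote label per code entity.

    Assumption: ties are dropped rather than resolved; the abstract
    does not describe the actual aggregation rule.
    """
    by_entity = defaultdict(list)
    for entity, is_smelly in votes:
        by_entity[entity].append(bool(is_smelly))
    oracle = {}
    for entity, labels in by_entity.items():
        counts = Counter(labels)
        if counts[True] != counts[False]:
            oracle[entity] = counts[True] > counts[False]
    return oracle


def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the probability that a random
    positive is scored above a random negative (ties count as one half)."""
    pos = [s for l, s in zip(labels, scores) if l]
    neg = [s for l, s in zip(labels, scores) if not l]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical votes: (method name, does this team flag it as Long Method?)
votes = [
    ("Parser.parse", True), ("Parser.parse", True), ("Parser.parse", False),
    ("Util.trim", False), ("Util.trim", False),
    ("Report.render", True), ("Report.render", False),  # tie, dropped
    ("Cache.get", False), ("Cache.get", True), ("Cache.get", False),
    ("Engine.run", True),
]
oracle = build_oracle(votes)

# Hypothetical detector score: lines of code as a naive Long Method signal.
loc = {"Parser.parse": 120, "Util.trim": 8, "Report.render": 60,
       "Cache.get": 15, "Engine.run": 90}
entities = sorted(oracle)
auc = roc_auc([oracle[e] for e in entities], [loc[e] for e in entities])
print(oracle, auc)
```

In a real replication the naive score would be replaced by the output of a trained classifier, and the ROC value would be estimated on held-out data rather than on the entities used to build the oracle.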

Acknowledgements

This work was partially funded by the Portuguese Foundation for Science and Technology, under ISTAR's projects UIDB/04466/2020 and UIDP/04466/2020, and by Anima Institute (Edital Nº 43/2021).

Keywords

Crowdsmelling, Code smells, Code smells detection, Software quality, Software maintenance, Collective knowledge, Machine learning algorithms

Fields of Science and Technology Classification

Computer and Information Sciences - Natural Sciences

Funding Records

Funding Reference	Funding Entity
UIDP/04466/2020	Fundação para a Ciência e a Tecnologia
UIDB/04466/2020	Fundação para a Ciência e a Tecnologia

Publication Identifiers

Scopus (source: Ciência_Iscte)	2-s2.0-85126535966
DOI (source: author)	10.1007/s10664-021-10110-5
WoS (source: Ciência_Iscte)	WOS:000770339500010
WoS (source: author)	000770339500010
Scopus (source: author)	2-s2.0-85126535966
Ciência_Iscte ID	ci-pub-84764
Handle (source: Ciência-IUL)	http://hdl.handle.net/10071/25596

Other Publication Details

Online Publication Year	2022
Publisher	Springer
Indexes	Web of Science©; Scopus
ISSN	1382-3256 (print) 1573-7616 (online)
ISBN	--
Impact Factor	7.300
Volume	27	Number	3
Series
Article Number	69
Pages	--
Peer Reviewed	Yes
Dissemination Mean	Both (printed and digital)
Publication Date (online)	2022-03-17
Publication Date (print)
