Performance of combined models in discrete binary classification

Anabela de Fátima Domingues Cardoso; Ana Sousa Ferreira; Margarida G. M. S. Cardoso

Scientific journal paper Q1

Anabela de Fátima Domingues Cardoso (Marques, A.); Ana Sousa Ferreira (Ferreira, A. S.); Margarida G. M. S. Cardoso (Cardoso, M. G. M. S.)

Journal Title

Methodology

Year (definitive publication)

2017

Language

English

Country

Germany

Web of Science®

Times Cited: 0

(Last checked: 2026-07-24 18:28)

Scopus

Times Cited: 0

(Last checked: 2026-07-13 00:25)

Abstract

Different Discrete Discriminant Analysis (DDA) models perform differently on different samples. This has encouraged research into combined models, which seem particularly promising when the a priori classes are not well separated or when samples are small or moderate in size, as often occurs in practice. In this study, we evaluate the performance of a convex combination of two DDA models: the First-Order Independence Model (FOIM) and the Dependence Trees Model (DTM). We use simulated data sets with two classes and consider several data complexity factors that may influence the performance of the combined model: the separation of classes, class balance, and the number of missing states, as well as the sample size and the number of parameters to be estimated in DDA. We use cross-validation to evaluate the precision of classification. The results illustrate the advantage of the proposed combination over FOIM and DTM alone: it yields the best results, especially when very small samples are considered. The experimental study also ranks the data complexity factors according to their relative impact on classification performance, by means of a regression model. The separation of classes emerges as the most influential factor. The ratio between the number of degrees of freedom and the sample size, along with the proportion of missing states in the minority class, also has a significant impact on classification performance. An additional gain of this study, also deriving from the estimated regression model, is the ability to successfully predict the precision of classification in a real data set from the data complexity factors.
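
The convex combination described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not confirm: class-conditional probabilities are estimated by plain relative frequencies, the dependence tree is fixed in advance through a hypothetical `parents` list (a full dependence-trees model would typically also select the tree from the data), and the combination coefficient `beta` is supplied by the caller rather than estimated. All function names are illustrative.

```python
def foim_prob(x, sample):
    """Independence estimate: product of the marginal relative frequencies."""
    n = len(sample)
    p = 1.0
    for j, value in enumerate(x):
        p *= sum(1 for row in sample if row[j] == value) / n
    return p


def tree_prob(x, sample, parents):
    """Dependence-tree estimate for a FIXED tree (assumption of this sketch).

    parents[j] is the index of the parent of variable j, or None for the root.
    """
    n = len(sample)
    p = 1.0
    for j, value in enumerate(x):
        pa = parents[j]
        if pa is None:
            p *= sum(1 for row in sample if row[j] == value) / n
        else:
            n_parent = sum(1 for row in sample if row[pa] == x[pa])
            if n_parent == 0:
                return 0.0  # parent state never observed in this class
            n_both = sum(
                1 for row in sample if row[pa] == x[pa] and row[j] == value
            )
            p *= n_both / n_parent
    return p


def combined_prob(x, sample, parents, beta):
    """Convex combination: beta * FOIM + (1 - beta) * tree, with 0 <= beta <= 1."""
    return beta * foim_prob(x, sample) + (1.0 - beta) * tree_prob(x, sample, parents)


def classify(x, class_samples, parents, beta):
    """Assign x to the class with the largest prior-weighted combined estimate."""
    total = sum(len(s) for s in class_samples.values())
    scores = {
        label: len(s) / total * combined_prob(x, s, parents, beta)
        for label, s in class_samples.items()
    }
    return max(scores, key=scores.get)


# Toy data (made up for illustration): two binary variables, two classes.
class_samples = {
    "A": [(0, 0), (0, 0), (1, 1), (1, 1)],
    "B": [(0, 1), (1, 0), (0, 1), (1, 0)],
}
parents = [None, 0]  # variable 1 depends on variable 0
label = classify((0, 0), class_samples, parents, beta=0.5)
```

With `beta = 1` the estimate reduces to the independence model and with `beta = 0` to the tree model. In this toy data the two classes have identical marginals, so only the tree term separates them.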

Keywords

Classification performance, Combined models for classification, Discrete discriminant analysis, Separability

Fields of Science and Technology Classification

Mathematics - Natural Sciences
Psychology - Social Sciences
Sociology - Social Sciences

Funding Records

Funding Reference	Funding Entity
UID/GES/00315/2013	Fundação para a Ciência e a Tecnologia

Publication Identifiers

DOI (source: ORCID)	10.1027/1614-2241/A000117
Scopus (source: Ciência_Iscte)	2-s2.0-85017329682
WoS (source: ORCID)	WOS:000397417500003
DOI (source: author)	10.1027/1614-2241/a000117
Other ID (source: ORCID)	cv-prod-id-1529959
WoS (source: Ciência_Iscte)	WOS:000397417500003
WoS (source: author)	000397417500003
Scopus (source: author)	2-s2.0-85017329682
Ciência_Iscte ID	ci-pub-33952
Handle (source: Ciência-IUL)	http://hdl.handle.net/10071/13072

Other Publication Details

Online Publication Year	2017
Publisher	Hogrefe and Huber Publisher
Indexes	Web of Science®; Scopus
ISSN	1614-1881 (print) 1614-2241 (online)
Volume	13	Number	1
Pages	23 - 37
Peer Reviewed	Yes
Dissemination Mean	Digital