Performance of combined models in discrete binary classification

Anabela de Fátima Domingues Cardoso; Ana Sousa Ferreira; Margarida G. M. S. Cardoso

Scientific journal article Q1

Anabela de Fátima Domingues Cardoso (Marques, A.); Ana Sousa Ferreira (Ferreira, A. S.); Margarida G. M. S. Cardoso (Cardoso, M. G. M. S.)

Journal Title

Methodology

Year (definitive publication)

2017

Language

English

Country

Germany

Web of Science®

Number of citations: 0

(Last checked: 2026-07-24 18:28)

Scopus

Number of citations: 0

(Last checked: 2026-07-13 00:25)

Google Scholar

This publication is not indexed in Google Scholar

Overton

This publication is not indexed in Overton

Abstract

Diverse Discrete Discriminant Analysis (DDA) models perform differently in different samples. This fact has encouraged research on combined models, which seem particularly promising when the a priori classes are not well separated or when small or moderate-sized samples are considered, as often occurs in practice. In this study, we evaluate the performance of a convex combination of two DDA models: the First-Order Independence Model (FOIM) and the Dependence Trees Model (DTM). We use simulated data sets with two classes and consider diverse data complexity factors that may influence the performance of the combined model: the separation of classes, balance, and number of missing states, as well as sample size and the number of parameters to be estimated in DDA. We resort to cross-validation to evaluate the precision of classification. The results obtained illustrate the advantage of the proposed combination when compared with FOIM and DTM: it yields the best results, especially when very small samples are considered. The experimental study also provides a ranking of the data complexity factors, according to their relative impact on classification performance, by means of a regression model. It leads to the conclusion that the separation of classes is the most influential factor in classification performance. The ratio between the number of degrees of freedom and sample size, along with the proportion of missing states in the minority class, also has a significant impact on classification performance. An additional gain of this study, also deriving from the estimated regression model, is the ability to successfully predict the precision of classification in a real data set based on the data complexity factors.
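
To make the idea of a convex combination of FOIM and DTM concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes that every variable takes the same number of states, that the dependence tree is supplied by hand as a `parents` list (a dependence-trees model would normally learn this structure from the data), that frequencies are Laplace-smoothed with a hypothetical `alpha`, and that the mixing weight `beta` is given rather than estimated. All function names and the toy data are hypothetical.

```python
import numpy as np

def foim_prob(X, x, n_states, alpha=1.0):
    """Independence estimate of P(x | class): product of smoothed marginals.
    X is the (n, p) integer sample of one class; alpha is an assumed smoother."""
    p = 1.0
    for j, s in enumerate(x):
        count = np.sum(X[:, j] == s)
        p *= (count + alpha) / (len(X) + alpha * n_states)
    return p

def tree_prob(X, x, parents, n_states, alpha=1.0):
    """Tree estimate of P(x | class): product of P(x_j | x_parent(j)).
    parents[j] is the parent index of variable j, or -1 for the root.
    The tree is assumed to be given, not learned here."""
    p = 1.0
    for j, s in enumerate(x):
        pa = parents[j]
        rows = X if pa < 0 else X[X[:, pa] == x[pa]]
        count = np.sum(rows[:, j] == s)
        p *= (count + alpha) / (len(rows) + alpha * n_states)
    return p

def combined_prob(X, x, parents, n_states, beta):
    """Convex combination: beta * FOIM + (1 - beta) * tree, with 0 <= beta <= 1."""
    return (beta * foim_prob(X, x, n_states)
            + (1.0 - beta) * tree_prob(X, x, parents, n_states))

def classify(X1, X2, x, parents, n_states, beta):
    """Assign x to class 1 or 2 by the larger prior-weighted combined estimate."""
    prior1 = len(X1) / (len(X1) + len(X2))
    score1 = prior1 * combined_prob(X1, x, parents, n_states, beta)
    score2 = (1.0 - prior1) * combined_prob(X2, x, parents, n_states, beta)
    return 1 if score1 >= score2 else 2

# Toy binary data (hypothetical): three variables, four observations per class.
X1 = np.array([[0, 0, 0], [0, 0, 1], [0, 1, 0], [0, 0, 0]])
X2 = np.array([[1, 1, 1], [1, 1, 0], [1, 0, 1], [1, 1, 1]])
parents = [-1, 0, 1]  # assumed chain: variable 0 -> variable 1 -> variable 2

print(classify(X1, X2, (0, 0, 0), parents, n_states=2, beta=0.5))
```

Because both component estimates are proper probability functions over the state space, any `beta` in [0, 1] yields a proper probability function as well; setting `beta` to 1 or 0 recovers the independence or the tree estimate alone.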

Keywords

Classification performance, Combined models for classification, Discrete discriminant analysis, Separability

Fields of Science and Technology Classification

Mathematics - Natural Sciences
Psychology - Social Sciences
Sociology - Social Sciences

Funding Records

Funding Reference	Funding Entity
UID/GES/00315/2013	Fundação para a Ciência e a Tecnologia

Publication Identifiers

DOI (source: ORCID)	10.1027/1614-2241/A000117
WoS (source: Ciência_Iscte)	WOS:000397417500003
Scopus (source: author)	2-s2.0-85017329682
Other ID (source: ORCID)	cv-prod-id-1529959
DOI (source: author)	10.1027/1614-2241/a000117
Scopus (source: Ciência_Iscte)	2-s2.0-85017329682
Handle (source: Ciência-IUL)	http://hdl.handle.net/10071/13072
WoS (source: ORCID)	WOS:000397417500003
WoS (source: author)	000397417500003
Ciência_Iscte ID	ci-pub-33952

Other Publication Details

Online Publication Year	2017
Publisher	Hogrefe and Huber Publisher
Indexing	Web of Science©; Scopus
ISSN	1614-1881 (print) 1614-2241 (online)
ISBN	--
Impact Factor	--
Volume	13	Issue	1
Series
Article Number
Pages	23 - 37
Peer Reviewed	Yes
Dissemination Medium	Digital
Publication Date (online)
Publication Date (print)