Export Publication
The publication can be exported in the following formats: APA (American Psychological Association) reference, IEEE (Institute of Electrical and Electronics Engineers) reference, BibTeX, and RIS.
Batista, F., & Carvalho, J. P. (2015). Text based classification of companies in CrunchBase. In A. Yazici, N. R. Pal, & U. Kaymak (Eds.), IEEE International Fuzzy Systems conference proceedings. Istanbul: IEEE.
F. M. Batista and J. P. Carvalho, "Text based classification of companies in CrunchBase," in IEEE Int. Fuzzy Systems conference proceedings, A. Yazici, N. R. Pal, and U. Kaymak, Eds., Istanbul: IEEE, 2015.
@inproceedings{batista2015_1782066778825,
author = "Batista, F. and Carvalho, João P.",
title = "Text based classification of companies in CrunchBase",
booktitle = "IEEE International Fuzzy Systems conference proceedings",
year = "2015",
editor = "Adnan Yazici and Nikhil R. Pal and Uzat Kaymak",
doi = "10.1109/FUZZ-IEEE.2015.7337892",
publisher = "IEEE",
address = "Istanbul",
organization = "IEEE",
url = "https://ieeexplore.ieee.org/xpl/conhome/7329077/proceeding"
}
TY  - CPAPER
TI  - Text based classification of companies in CrunchBase
T2  - IEEE International Fuzzy Systems conference proceedings
AU  - Batista, F.
AU  - Carvalho, João P.
PY  - 2015
SN  - 1544-5615
DO  - 10.1109/FUZZ-IEEE.2015.7337892
CY  - Istanbul
UR  - https://ieeexplore.ieee.org/xpl/conhome/7329077/proceeding
AB  - This paper introduces two fuzzy fingerprint based text classification techniques that were successfully applied to automatically label companies from CrunchBase, based purely on their unstructured textual description. This is a real and very challenging problem due to the large set of possible labels (more than 40) and also to the fact that the textual descriptions do not have to abide by any criteria and are, therefore, extremely heterogeneous. Fuzzy fingerprints are a recently introduced technique that can be used for performing fast classification. They perform well in the presence of unbalanced datasets and can cope with a very large number of classes. In the paper, a comparison is performed against some of the best text classification techniques commonly used to address similar problems. When applied to the CrunchBase dataset, the fuzzy fingerprint based approach outperformed the other techniques.
ER  - 