Publication in conference proceedings
Transformer-based language models for semantic search and mobile applications retrieval
João Coelho (Coelho, J.); António Neto (Neto, A.); Miguel Tavares (Tavares, M.); Carlos Coutinho (Coutinho, C.); João Pedro Oliveira (Oliveira, J.); Ricardo Ribeiro (Ribeiro, R.); Fernando Batista (Batista, F.); et al.
Proceedings of the 13th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management
Year (final publication)
2021
Language
English
Country
Portugal
More Information
Web of Science®

Number of citations: 1

(Last checked: 2024-11-04 21:28)

Scopus

This publication is not indexed in Scopus

Google Scholar

Number of citations: 4

(Last checked: 2024-11-02 10:24)

Abstract
Search engines are extensively used by mobile app stores, where millions of users worldwide rely on them every day. However, some stores still resort to simple lexical search engines, despite recent advances in Machine Learning, Information Retrieval, and Natural Language Processing that allow for richer semantic strategies. This work proposes an approach for semantic search of mobile applications that relies on transformer-based language models, fine-tuned with the existing textual information about known mobile applications. Our approach relies solely on the application name and on the unstructured textual information contained in its description. A dataset of about 500 thousand mobile apps was extended in the scope of this work with a test set, and all the available textual data was used to fine-tune our neural language models. We evaluated our models using a public dataset that includes information about 43 thousand applications and 56 manually annotated non-exact queries. The results show that our model surpasses the performance of all the other retrieval strategies reported in the literature. Tests with users confirmed the performance of our semantic search approach when compared with an existing deployed solution.
Acknowledgements
--
Keywords
Semantic search, Word embeddings, ElasticSearch, Mobile applications, Transformer-based models
Funding records
Funding reference    Funding entity
39703                PT2020
UIDB/50021/2020      Fundação para a Ciência e a Tecnologia
Related Projects

This publication is an output of the following project(s):