Privacy in text documents

Mariana Dias; Joao C Ferreira or Joao Ferreira; Rui Maia; Pedro Santos; Ricardo Ribeiro

Publication in conference proceedings

Mariana Dias (Dias, M.); Joao C Ferreira or Joao Ferreira (Ferreira, J. C.); Rui Maia (Maia, R.); Pedro Santos (Santos, P.); Ricardo Ribeiro (Ribeiro, R.);

Proceedings of the 33rd International Business Information Management Association Conference, IBIMA 2019: Education Excellence and Innovation Management through Vision 2020

Year (definitive publication)

2019

Language

English

Country

Spain

Web of Science®

Times Cited: 1

(Last checked: 2026-04-12 22:16)

Scopus

Times Cited: 2

(Last checked: 2026-04-08 22:14)

Google Scholar

Times Cited: 2

(Last checked: 2026-04-11 22:47)

Overton

This publication is not indexed in Overton

Abstract

Sensitive data preservation is a manual and semi-automatic procedure, and it suffers from several problems that affect the handling of confidential, sensitive and personal information. Identifying sensitive data in documents requires human intervention, which is costly and prone to error, and the scale of large document collections rules out an approach that depends on human expertise to identify and relate sensitive items. DataSense will be highly exportable software that enables organizations to identify and understand the sensitive data they hold in unstructured textual information (digital documents) for legal, compliance and security purposes. The goal is to identify and classify sensitive data (Personal Data) present in large-scale structured and non-structured information in a way that allows entities and/or organizations to understand it without compromising security or confidentiality. The DataSense project will work on European-Portuguese text documents, combining different NLP (Natural Language Processing) approaches with advances in machine learning, such as Named Entity Recognition, Disambiguation, Co-referencing (ARE), and Automatic Learning and Human Feedback. It will also assist organizations in complying with standards such as the GDPR (General Data Protection Regulation), which regulates data protection in the European Union.
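
As a rough illustration of what "identifying and classifying sensitive data in text" can look like, the following minimal sketch flags a few kinds of personal data with regular expressions. It is not the DataSense method: the abstract describes machine-learning approaches such as Named Entity Recognition, whereas this is a simple rule-based stand-in. The names `Finding`, `PATTERNS`, `find_sensitive` and `redact`, and the three patterns themselves, are hypothetical and chosen only for the example.

```python
import re
from typing import List, NamedTuple


class Finding(NamedTuple):
    label: str   # category of personal data
    text: str    # matched text
    start: int   # start offset in the input
    end: int     # end offset in the input


# Hypothetical, illustrative patterns (assumptions, not taken from the paper).
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    # Nine digits starting with 2 or 9, optionally prefixed by +351.
    "PHONE_PT": re.compile(r"(?<!\d)(?:\+351\s?)?[29]\d{8}(?!\d)"),
    # Portuguese postal code in the form NNNN-NNN.
    "POSTAL_CODE_PT": re.compile(r"(?<!\d)\d{4}-\d{3}(?!\d)"),
}


def find_sensitive(text: str) -> List[Finding]:
    """Return all pattern matches, ordered by position in the text."""
    findings = []
    for label, pattern in PATTERNS.items():
        for m in pattern.finditer(text):
            findings.append(Finding(label, m.group(), m.start(), m.end()))
    return sorted(findings, key=lambda f: f.start)


def redact(text: str) -> str:
    """Replace each match with its label. Overlapping matches are not handled."""
    out = text
    # Work right to left so earlier offsets stay valid.
    for f in reversed(find_sensitive(text)):
        out = out[:f.start] + f"[{f.label}]" + out[f.end:]
    return out


sample = "Contacte Maria Silva: maria.silva@example.pt, 912345678, 1649-026 Lisboa."
print(redact(sample))
```

The sketch leaves the person name "Maria Silva" untouched, since fixed patterns cannot capture names. That gap is the kind of case where the entity-recognition and disambiguation techniques named in the abstract would be needed.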

Keywords

Sensitive data, Natural language processing, Text mining, Named entities recognition

Fields of Science and Technology Classification

Computer and Information Sciences - Natural Sciences
Electrical Engineering, Electronic Engineering, Information Engineering - Engineering and Technology

Funding Records

Funding Reference	Funding Entity
POCI-01-0247-FEDER-038539	Comissão Europeia
UID/Multi/04466/2019	Fundação para a Ciência e a Tecnologia

Publication Identifiers

Scopus (source: Ciência_Iscte)	2-s2.0-85074083548
Other ID (source: External)	cv-prod-id-954996
Other ID (source: ORCID)	cv-prod-id-1713853
WoS (source: Ciência_Iscte)	WOS:000503988804014
WoS (source: author)	000503988804014
Scopus (source: author)	2-s2.0-85074083548
Ciência_Iscte ID	ci-pub-63744
ISBN (source: External)	978-099985512-6
Handle (source: Ciência-IUL)	http://hdl.handle.net/10071/22603

Other Publication Details

Online Publication Year	2019
Publisher	International Business Information Management Association, IBIMA
Indexes	Web of Science®; Scopus
ISSN	--
ISBN	978-099985512-6 (print) 978-099985512-6 (online)
Pages	2551 - 2560	Total Pages	10
Peer Reviewed	Yes
Editors	Soliman, K. S.
Event Title	33rd International Business Information Management Association Conference: Education Excellence and Innovation Management through Vision 2020, IBIMA 2019
Event Organizer	IBIMA
City	Granada
Event Type	Conference
Event Classification	International
Event Year	2019
Event Publication Type	Full Paper
Publication Date (online)	2019-01-01
Publication Date (print)	2019-01-01
