The DataSense project aims to build a software system for Sensitive Data Discovery, that is, for finding data considered sensitive. DataSense has two fundamental objectives:

• Identify, classify, categorize and relate sensitive data present in large volumes of unstructured information, so that entities and organizations can understand the sensitive data they hold.

• Allow organizations to act immediately on the content and network (direct and indirect relationships) of the sensitive data they store and process (e.g., to honor the right to be forgotten).

To meet these objectives, DataSense builds on five concepts that are essential to advancing beyond the current state of the art, and proposes a hybrid architecture that takes on the risk of applying Natural Language Processing and Machine Learning to the critical area of sensitive data protection. The five concepts, described in detail in the following chapter, are: Sensitive Data (Personal Data); Natural Language Processing; analysis of human-readable, multi-format unstructured information; intelligence and training supported by human feedback; and Interactive Visualization. These concepts are supported by three layers of Artificial Intelligence: identification of named entities, Machine Learning models for coreference resolution, and feedback-driven learning of the models.
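
As a rough illustration of how the three layers could fit together, the following is a minimal sketch and not the DataSense implementation. It assumes regular expressions in place of trained entity-recognition models, string normalization in place of a learned coreference model, and a simple reviewer rejection list in place of model retraining. All names (`PATTERNS`, `Mention`, `identify_entities`, `resolve_coreferences`, `apply_feedback`) are hypothetical.

```python
import re
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical patterns standing in for trained entity-recognition models.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}


@dataclass(frozen=True)
class Mention:
    label: str  # kind of sensitive data, e.g. "EMAIL"
    text: str   # surface form found in the document
    start: int  # character offset in the document


def identify_entities(text):
    """Layer 1 (assumed): find candidate sensitive-data mentions."""
    mentions = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            mentions.append(Mention(label, match.group(), match.start()))
    return sorted(mentions, key=lambda m: m.start)


def resolve_coreferences(mentions):
    """Layer 2 (assumed): group mentions that denote the same value.

    Mentions are clustered by label and a normalized form (lowercase,
    spaces and hyphens removed); a real system would use a learned model.
    """
    clusters = defaultdict(list)
    for mention in mentions:
        key = (mention.label, re.sub(r"[\s-]", "", mention.text).lower())
        clusters[key].append(mention)
    return dict(clusters)


def apply_feedback(mentions, rejected_texts):
    """Layer 3 (assumed): drop mentions a reviewer marked as false positives."""
    return [m for m in mentions if m.text not in rejected_texts]


text = "Contact Ana at ana@example.com or +351 912 345 678. Backup: ANA@example.com"
mentions = identify_entities(text)
clusters = resolve_coreferences(mentions)
reviewed = apply_feedback(mentions, {"+351 912 345 678"})
```

In this sketch the two spellings of the e-mail address fall into one cluster, which is the kind of direct relationship the second objective depends on: erasing one person's data requires knowing every place it is mentioned.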

