Predicting the confusion level of text excerpts with syntactic, lexical and n-gram features

Tiago da Pedro da Costa; José Luís Silva; Rúben Pereira

Publication in conference proceedings

Tiago da Pedro da Costa (Pedro, T. S.); José Luís Silva (Silva, J. L.); Rúben Pereira (Pereira, R.)

10th International Conference on Education and New Learning Technologies

Year (definitive publication)

2018

Language

English

Country

Spain

Web of Science®

This publication is not indexed in Web of Science®

Scopus

This publication is not indexed in Scopus

Google Scholar

Times Cited: 2

(Last checked: 2026-07-22 14:55)

Overton

This publication is not indexed in Overton

Abstract

Distance learning, offline presentations (presentations that are pre-recorded rather than delivered live) and similar activities whose main goal is to convey information are becoming increasingly relevant with digital media such as Virtual Reality (VR) and Massive Open Online Courses (MOOCs). While MOOCs are a well-established reality in the learning environment, VR is also being used to promote learning in virtual rooms, whether in academia or in industry. These methods are often based on written scripts that take the learner through the content, which makes the scripts critical components of these tools. Given this role, it is important to ensure the efficiency of these scripts. Confusion is a non-basic emotion associated with learning. This process often leads to a cognitive disequilibrium caused either by the content itself or by the way it is conveyed, namely its syntactic and lexical features. We propose a supervised model that predicts how likely an input text excerpt is to confuse the learner. To this end, we performed syntactic and lexical analyses of 300 text excerpts and collected 5 confusion-level ratings (on a 0–6 scale) per excerpt from 51 annotators, using the mean rating of each excerpt as its label. The excerpts that make up the dataset were collected from random presentation transcripts across various fields of knowledge. The model was trained on this data, and the results are reported in the body of the paper. This model supports the design of clearer scripts for offline presentations and similar formats, and we expect it to improve the efficiency of these speeches. Although the model is applied to this specific case, we hope to pave the way for generalizing the approach to other contexts where clarity of text is critical, such as MOOC scripts or academic abstracts.
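The data preparation described in the abstract (averaging several annotator ratings into one label per excerpt, and deriving lexical and n-gram features from the text) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the tokenizer, the specific features and all function names below are assumptions chosen for illustration, and the abstract does not say which features or which learning algorithm the paper uses.

```python
import re
from collections import Counter
from statistics import mean


def tokenize(text):
    """Lowercase word tokens; a deliberately simple, assumed tokenizer."""
    return re.findall(r"[a-z']+", text.lower())


def mean_label(ratings):
    """Average one excerpt's annotator ratings (0-6 scale) into a single label."""
    if not ratings or not all(0 <= r <= 6 for r in ratings):
        raise ValueError("expected a non-empty list of ratings in the 0-6 range")
    return mean(ratings)


def ngram_counts(tokens, n=2):
    """Count word n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def lexical_features(text):
    """A few illustrative lexical features (assumed, not taken from the paper)."""
    tokens = tokenize(text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return {
        "n_tokens": len(tokens),
        "mean_word_length": mean(len(t) for t in tokens) if tokens else 0.0,
        "type_token_ratio": len(set(tokens)) / len(tokens) if tokens else 0.0,
        "mean_sentence_length": len(tokens) / len(sentences) if sentences else 0.0,
    }


# Hypothetical excerpt with five hypothetical ratings, mirroring the
# "5 ratings per excerpt" setup described in the abstract.
excerpt = "The cat sat. The cat ran."
ratings = [2, 3, 3, 4, 3]

label = mean_label(ratings)
features = lexical_features(excerpt)
bigrams = ngram_counts(tokenize(excerpt), n=2)
```

Feature dictionaries and n-gram counts of this kind, paired with the mean labels, could then be passed to any supervised regressor; which model the paper actually trains is not stated in the abstract.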

Keywords

Confusion, Supervised learning, Text, Presentation

Fields of Science and Technology Classification

Computer and Information Sciences - Natural Sciences

Funding Records

Funding Reference	Funding Entity
UID/MULTI/0446/2013	Fundação para a Ciência e a Tecnologia

Publication Identifiers

Other ID (source: ORCID)	cv-prod-id-1422000
DOI (source: author)	10.21125/edulearn.2018.1959
Ciência_Iscte ID	ci-pub-49599
Handle (source: Ciência-IUL)	http://hdl.handle.net/10071/16884

Other Publication Details

Online Publication Year	2018
Publisher	IATED
Indexes	--
ISSN	2340-1117 (print) 2340-1117 (online)
ISBN	978-84-09-02709-5 (print) 978-84-09-02709-5 (online)
Volume
Article Number
Pages	8417 - 8426	Total Pages	--
Peer Reviewed	Yes
Editors	Luis Gómez Chova; Agustín López Martínez; Ignacio Candel Torres
Event Title	--
Event Organizer
City	Palma
Event Type	Conference
Event Classification	International
Event Year	2018
Event Publication Type	Full Paper
