Knowledge Discovery Group

Estimating the Reading Complexity of Documents using Eye-Tracking and Neural Networks



The time required to read a document is often estimated from the length of its text alone. However, scientific documents consist of more than text: they also contain figures, tables, formulas, and algorithms. Furthermore, text varies in complexity depending on its content. Estimating the reading time from text length alone is therefore insufficient. One tool that is often used to assess how well users understand content is eye-tracking: eye movements are recorded and related to the content that was viewed. To identify which parts of a document influence the reading time, and to estimate that time, one has to build a predictive model. Neural networks have recently shown promising results in many domains, and because they can adapt to the heterogeneous nature of documents, they are a natural choice in this context as well.

In this thesis, you will build upon an existing eye-tracking framework to extract gaze information for the different parts of a scientific document and use it to train a neural network. The goal is to predict the reading complexity of a scientific document. This includes working with and understanding the PDF format as well as gathering data in an eye-tracking experiment to generate the training data for the neural network.

In more detail, the work should cover:

  • Development of an application for acquiring the gaze information on the different parts of a scientific document
  • Performing an eye-tracking study with the application to gather data
  • Defining the neural network model and training it with the gathered eye-tracking data
  • Evaluation of the prediction quality of the neural network
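
As a rough illustration of the first step, the sketch below assigns gaze fixations to labelled regions of a page and sums the dwell time per content type. It is only a minimal sketch under stated assumptions: the `Region` and `Fixation` types, their fields, and the coordinate values are hypothetical and not taken from the existing eye-tracking framework. Per-label totals like these could serve as one possible training signal for the neural network.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A labelled bounding box on one page (hypothetical layout format)."""
    label: str   # e.g. "text", "figure", "table", "formula"
    page: int
    x0: float
    y0: float
    x1: float
    y1: float

    def contains(self, page: int, x: float, y: float) -> bool:
        return (page == self.page
                and self.x0 <= x <= self.x1
                and self.y0 <= y <= self.y1)


@dataclass(frozen=True)
class Fixation:
    """One gaze fixation in page coordinates (hypothetical record format)."""
    page: int
    x: float
    y: float
    duration_ms: float


def dwell_time_per_label(regions, fixations):
    """Sum fixation durations (ms) per region label."""
    totals = defaultdict(float)
    for fix in fixations:
        for region in regions:
            if region.contains(fix.page, fix.x, fix.y):
                totals[region.label] += fix.duration_ms
                break  # count each fixation once; first matching region wins
    return dict(totals)


# Made-up example data, for illustration only.
regions = [
    Region("text", 1, 50, 50, 550, 400),
    Region("figure", 1, 50, 420, 550, 700),
]
fixations = [
    Fixation(1, 100, 100, 220.0),
    Fixation(1, 300, 500, 480.0),
    Fixation(1, 120, 130, 180.0),
    Fixation(1, 10, 10, 90.0),  # outside every region, ignored
]

print(dwell_time_per_label(regions, fixations))
# {'text': 400.0, 'figure': 480.0}
```

In practice the region boxes would have to come from parsing the PDF layout, and fixation detection from the eye tracker's own processing; both are left out here.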


Requirements:

  • Good programming skills
  • Knowledge of machine learning techniques is an advantage
  • Knowledge of eye-tracking is beneficial


Interested? Send an email!

