NSF-IIS-1840751 Project Page: Using Large-scale Web Data for Online Attention Models and Identification of Reading Disabilities

Media websites now capture intricate measures of engagement from millions of readers. These measures, such as in-page scrolling and viewport position, can help us understand patterns of user attention beyond simple measures like time spent on page. The goals of this project are to (1) use and develop new data analysis techniques to reason about the connection between language, text, and attention on the Web; and (2) using these data and techniques, explore how to identify users with reading difficulties and potentially support them in online reading tasks.

Understanding Reader Backtracking Behavior in Online News Articles

Uzi Smadja, Max Grusky, Yoav Artzi, Mor Naaman
WWW 2019

Rich engagement data can shed light on how people interact with online content and how such interactions may be determined by the content of the page. In this work, we investigate a specific type of interaction, backtracking, which refers to the action of scrolling back in a browser while reading an online news article. We leverage a dataset of close to 700K instances of more than 15K readers interacting with online news articles, in order to characterize and predict backtracking behavior. We first define different types of backtracking actions. We then show that "full" backtracks, where the readers eventually return to the spot at which they left the text, can be predicted by using features that were previously shown to relate to text readability. This finding highlights the relationship between backtracking and readability and suggests that backtracking could help assess readability of content at scale.
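As an illustration only, the notion of a "full" backtrack described above can be sketched in code. This is not the paper's method or its definitions: the input format (a time-ordered list of scroll positions in pixels), the names `Backtrack` and `find_backtracks`, and the `min_drop` threshold are all assumptions made for the example. The sketch treats any upward scroll as the start of a backtrack and calls it "full" if the reader later returns to the position at which they left off.

```python
from typing import List, NamedTuple


class Backtrack(NamedTuple):
    """One backtracking episode (hypothetical structure for this sketch)."""
    start: int   # index of the sample where the reader left off
    end: int     # index where the episode ends
    full: bool   # True if the reader returned to the departure position


def find_backtracks(positions: List[float], min_drop: float = 0.0) -> List[Backtrack]:
    """Find backtracking episodes in a time-ordered list of scroll positions.

    A backtrack starts when the position drops by more than `min_drop`
    relative to the previous sample. It is "full" if a later sample reaches
    the departure position again; otherwise the session ended first.
    """
    episodes = []
    n = len(positions)
    i = 0
    while i < n - 1:
        if positions[i + 1] < positions[i] - min_drop:
            depart = positions[i]
            j = i + 1
            # Advance until the reader is back at (or past) the departure point.
            while j < n and positions[j] < depart:
                j += 1
            if j < n:
                episodes.append(Backtrack(i, j, True))
                i = j
            else:
                # The session ended before the reader returned.
                episodes.append(Backtrack(i, n - 1, False))
                break
        else:
            i += 1
    return episodes


if __name__ == "__main__":
    trace = [0, 100, 200, 150, 120, 200, 300, 250]
    for episode in find_backtracks(trace):
        print(episode)
```

In this toy trace, the reader scrolls back from position 200 and later returns to it, which the sketch labels a full backtrack; the final scroll back from 300 is never resumed. Real engagement logs would likely need noise handling, for example a nonzero `min_drop`, which this sketch leaves as a parameter.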

Measuring and Understanding Online Reading Behaviors of People with Dyslexia

Max Grusky, Jessie Taft, Mor Naaman, Shiri Azenkot
In Submission (Summer 2020)

Extending the benefits of online reading to people with reading disabilities such as dyslexia requires broader research on reading behavior in addition to existing small-scale eye-tracking studies. We conduct the first large-scale mixed-methods study of the unique reading challenges of people with dyslexia. We combine in-person interviews (N=6), online surveys (N=566) and a novel browser-based tool able to measure detailed reading behavior remotely on a controlled set of five pages (N=477) or as a browser extension (N=89) collecting long-term reading behavior data on self-selected pages. We find a variety of text and page layout factors that pose challenges to readers with and without dyslexia, and identify in-browser reading behaviors associated with dyslexia. Findings point toward improvements to technologies for identifying struggling readers, and to changes in the layout and appearance of online articles that could improve reading ease for people with and without dyslexia.

This material is based upon work supported by the National Science Foundation under Grant No. 1840751, as well as Verizon Media Group / Yahoo! Research. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

NSF Project Additional Details