
Scribe: Crowdsourcing Scientific Alt Text

Introduction

Many scientific papers are inaccessible because their figures either lack alt text entirely or have low-quality alt text [5]. In addition, most scientific papers are distributed as PDFs (for example, on https://arxiv.org/), and PDFs are generally not screen reader friendly. ar5iv (https://ar5iv.labs.arxiv.org/) presents articles from arXiv as HTML, which is generally more accessible to screen reader users.

We propose and implement a prototype plugin for ar5iv that adds alt text to requested papers via crowdwork.

We define two roles: users, who are screen reader users requesting alt text; and crowdworkers, who write and submit alt text. Our frontend consists of two parts: a plugin that users can use to request alt text and to add crowdsourced alt text to the HTML on ar5iv; and a website that crowdworkers can visit to request jobs writing alt text for requested papers. Our proposed workflow is as follows:

1. A user requests alt text for a paper through the plugin.
2. The request is stored as a job in our database.
3. A crowdworker visits our website, claims the least recently requested paper, and writes alt text for each of its images.
4. The completed alt text is saved to the database.
5. The plugin fetches the completed alt text and inserts it into the ar5iv HTML.

Today, arXiv is one of the most popular sites for sharing scientific papers. It accepts three upload formats: TeX/LaTeX, PDF, and HTML [6]. TeX/LaTeX is the preferred format and can be converted to HTML via ar5iv. The HTML version of a paper is more easily read by assistive technologies such as screen readers. However, the HTML usually contains only the plain text of the document, unless the authors wrote alternative text for their images and figures. There are also tools like Be My Eyes [2], through which a blind or low vision person can receive live assistance understanding an image in a scientific paper. However, the user might be paired with a volunteer who lacks the contextual knowledge to explain the image, since such figures are typically complex diagrams, and the description the volunteer produces does not persist for others reading the same paper. Another option is to use AI to generate an image description, but this has the same issue: the generated text can be inaccurate or insufficient given the complexity of the image.

First-person accounts also highlight the impact of the current lack of alt text: for example, a Reddit discussion on whether alt text on images matters [3] and a short video on how blind people experience pictures online [4].

Methodology

Our technical plan consisted of four components: a user plugin, a Python backend, a MongoDB database, and a crowdworker website.

First, we built the backend using Python and Flask, with an API to request jobs, complete jobs, write and save alt text, and retrieve saved alt text. The API updates the database, which stores two types of jobs: requested and completed. Requested jobs supply work to crowdworkers; completed jobs hold alt text previously written by crowdworkers, ready to be inserted into the ar5iv HTML.
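The job-store logic behind this API can be sketched as follows. This is a simplified, stdlib-only sketch: an in-memory dict stands in for our MongoDB collections, and the function and field names are illustrative rather than our exact API.

```python
from collections import OrderedDict

# In-memory stand-in for the MongoDB collections (illustrative only).
requested_jobs = OrderedDict()   # paper_url -> queued flag, in request order
completed_jobs = {}              # paper_url -> {image_id: alt_text}

def request_alt_text(paper_url):
    """Called by the user plugin: queue a paper for crowdwork."""
    if paper_url not in completed_jobs and paper_url not in requested_jobs:
        requested_jobs[paper_url] = True

def claim_job():
    """Called by the crowdworker site: hand out the least recently requested paper."""
    if not requested_jobs:
        return None
    # OrderedDict preserves insertion order, so the first key is the oldest request.
    return next(iter(requested_jobs))

def submit_alt_text(paper_url, alt_texts):
    """Called when a crowdworker submits: move the job from requested to completed."""
    requested_jobs.pop(paper_url, None)
    completed_jobs[paper_url] = alt_texts

def get_alt_text(paper_url):
    """Called by the plugin to fetch completed alt text for insertion into ar5iv HTML."""
    return completed_jobs.get(paper_url)
```

In the real backend each of these functions sits behind a Flask route and reads or writes MongoDB instead of a dict.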

Following this, we constructed the user plugin using HTML, CSS, and JavaScript, designed to be easily unpacked as a Chrome extension for use on any ar5iv paper. The plugin introduces two distinct user journeys: requesting alt text for the current paper, and inserting completed crowdsourced alt text into the page.

The last component was the crowdworker website, built with HTML, CSS, JavaScript, and Flask. The site initially displays a “Write Alt Text” button that calls the backend API to get the URL of the paper least recently requested by a user. It then retrieves the source HTML of that paper, rewrites some relative paths to accessory files into valid absolute paths, and renders the paper as it appears on ar5iv. We locate each image in the HTML and add an input box directly underneath it so the crowdworker can easily enter alternative text for that image. A “Next Image” button moves focus to the next image, letting the crowdworker navigate through the paper's images; it loops back to the first image after all images have been visited. The original “Write Alt Text” button becomes a “Submit Alt Text” button, which either alerts the crowdworker if some images still lack alternative text or writes the alternative text to our persistent database via the backend API.
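The path fix-up step can be sketched with the Python standard library. The regex below is a deliberate simplification of what the site does, assuming double-quoted `src`/`href` attributes; real HTML rewriting is messier.

```python
import re
from urllib.parse import urljoin

def absolutize(html, base_url):
    """Rewrite relative src/href attributes in `html` to absolute URLs under `base_url`.

    Simplified sketch: assumes double-quoted attributes and skips URLs that
    already start with http:// or https://.
    """
    def fix(match):
        attr, value = match.group(1), match.group(2)
        # urljoin resolves the relative path against the paper's URL.
        return f'{attr}="{urljoin(base_url, value)}"'

    return re.sub(r'\b(src|href)="(?!https?://)([^"]*)"', fix, html)
```

Applied to a fetched ar5iv page, this lets images and stylesheets referenced by relative paths load correctly when the paper is re-rendered on our own origin.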

For a more in-depth understanding of the crowdworker and user journeys, see the walkthroughs in our Validation Techniques section.

Validation Techniques

To validate our product, we assessed the entire workflow, exploring different scenarios in the process.

Our assessment included thorough navigation testing with a screen reader and other relevant accessibility technologies. If we opt to progress with this project, we intend to conduct extensive usability testing with the participation of several colleagues. This comprehensive evaluation will gather feedback from both crowdworker and user perspectives, aiming to identify any potential bugs that may have escaped our initial testing.

Disability Justice Perspective

This project addresses Disability Justice through two main principles: Interdependence and Collective Access [1]. Collective Access highlights how we share responsibility for our access needs, so we need to work together to achieve this while finding a balance between autonomy and independence. In a similar vein, interdependence stresses the importance of jointly addressing each other’s needs without relying solely on state solutions. This project embodies both principles as this crowdsourced alt text generator is not a state-mandated solution—it is a people-driven initiative. Regulated by and for the people, individuals support one another by contributing to alt text creation, ensuring equal access to information for everyone.

Learnings and Future Work

Most scientific papers are inaccessible, and solutions to this require thought as to how to improve the accessibility of older papers while encouraging authors to improve the accessibility of their own work moving forward.

We find that our plugin can successfully send requests and add crowdsourced alt text to ar5iv HTML. Building on this, future work should focus on large-scale adoption of such a platform, so that requests are completed in a timely manner and enough jobs are done that popular papers are likely to already have crowdsourced alt text available. Future work can also investigate how to instruct crowdworkers to write high-quality alt text and how to match crowdworkers to papers within their areas of expertise.

In addition, future work can investigate how to better encourage authors to make their papers accessible so that less crowd work is needed.

We hope that our findings will motivate developers and researchers to improve the accessibility of scientific papers, both during and after publication.

How You Made Your App Accessible

We prioritized accessibility in our product by ensuring that both our user plugin and crowdworker website are compatible with screen readers and other assistive technologies. Given that our plugin is designed with the needs of people who are blind or have low vision in mind, achieving full accessibility was paramount. Our approach involved incorporating ARIA roles and labels into the HTML; additionally, we used the aria-live="assertive" attribute. This attribute signals to assistive technologies that the content within the element changes dynamically and should be promptly announced to the user.

To illustrate, when the user clicks the plugin button, the HTML dynamically updates to display “Task completed.” This provides a visual cue and also triggers an auditory announcement, giving the user immediate feedback that their request was successfully processed.
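Concretely, the status element looks something like this (the element id and wording are illustrative, not our exact markup):

```html
<!-- A live region: screen readers announce changes to its text immediately. -->
<div id="scribe-status" aria-live="assertive"></div>

<script>
  // After the backend confirms the request, update the live region;
  // the screen reader interrupts and reads the new text aloud.
  document.getElementById("scribe-status").textContent = "Task completed.";
</script>
```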

Furthermore, while the crowdworker website is not primarily intended for screen reader users, we recognized the importance of making it accessible too. Here as well we incorporated ARIA roles and labels into the HTML and used the aria-live="assertive" attribute. Buttons on the page are highlighted when tabbed to or hovered over, the tab order of the page is sensible, and the color contrast of elements meets WCAG guidelines.

Here is the link to our poster.

References

[1] 10 Principles of Disability Justice. URL: https://static1.squarespace.com/static/5bed3674f8370ad8c02efd9a/t/5f1f0783916d8a179c46126d/159586906452110_Principles_of_DJ-2ndEd.pdf

[2] Be My Eyes: See the world together. URL: https://www.bemyeyes.com/

[3] Is alt text on images important? Reddit. URL: https://www.reddit.com/r/Etsy/comments/13c0qcf/comment/jjdn0yy/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button

[4] Juan Alcazar. How blind people view pictures online. URL: https://www.youtube.com/shorts/HBiTPoST2FM

[5] Sanjana Shivani Chintalapati, Jonathan Bragg, and Lucy Lu Wang. “A Dataset of Alt Texts from HCI Publications: Analyses and Uses Towards Producing More Descriptive Alt Texts of Data Visualizations in Scientific Papers”. In: Proceedings of the 24th International ACM SIGACCESS Conference on Computers and Accessibility (2022).

[6] Submission guidelines. arXiv info. URL: https://info.arxiv.org/help/submit/index.html