Culturally Aware Alt Text Chrome Extension
Introduction
Problem
We are working to provide culturally relevant, descriptive, and accessible alt text for blind and low vision (BLV) individuals exploring the web. Traditional alt text often lacks depth and cultural awareness, especially for culturally significant images. This makes it difficult for BLV users to fully connect with and understand cultural content, particularly when an image refers to unfamiliar customs, places, or symbols. The problem is most noticeable when the alt text is either too vague or overly detailed. Solving it is important for ensuring fair access to online cultural knowledge, giving users agency over their experiences, and encouraging web inclusion by accepting unique cultural identities and preferences.
Application
The main goal of our application is to give people who are blind or have low vision a browser extension that helps them understand images through a cultural lens. Throughout our interviews, our clients mentioned that much of the alt text they encounter does not explain the cultural identity or cultural aspects of the image. Our Google Chrome extension lets the user submit a specific image, for which alt text is then generated. Additionally, the user can open a chatbox to ask more in-depth questions about the image and its cultural aspects. There is also a settings page where the user can set their cultural preferences and preferred alt text description length.
Client Input
Clients
- Brown Indian young man with low vision (correctable with glasses)
- Blind White American middle-aged woman
- Blind White American senior woman
Access Needs
Through interviewing the clients, I learned more about their needs when it comes to alt text. Many description preferences stem from their lived experiences. One client, who had the opportunity to travel to different parts of the world and experience different cultures, had different alt text preferences from our clients who had not. This client preferred shorter, more concise descriptions, while the others preferred more in-depth ones since the concepts were often foreign to them.
Additionally, I learned how important a role alt text plays in their daily lives. It goes beyond image descriptions on the internet and is something they use as part of their everyday routines. One client mentioned that she uses AI to produce alt text describing objects around her home.
“I used ChatGPT to help me understand where the pumpkin was in my house.”
Lofi Prototype
Through making our prototype, we learned how important it is for our technology to be accessible through VoiceOver and other screen readers. Since our target audience is people who are blind or have low vision, we wanted to ensure our technology is compatible with screen readers. This raised a challenge, however: we had to continually verify not only that the screen reader worked but also that the content was read in the correct order.
Lofi Prototype Settings Page
Lofi Prototype CAATBox
Final Concept
The final concept is a Chrome extension that lets users generate culturally relevant alt text for an image. We used HTML, JavaScript, and CSS to build the front end and Llama 3.2 Vision Instruct as our LLM. We designed an Image Description Verification Framework loosely inspired by the framework in the GenAssist paper. An image is input to the LLM along with a set of Image Questions, Cultural Questions, and Settings Questions. The outputs are summarized to produce well-written alt text, and each cultural aspect is then described in the final output.
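The question-driven flow described above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the extension's actual code: the question wording, the `settings` shape, the choice to ask each question separately, and the `askModel` callback (standing in for whatever call reaches the vision model) are all hypothetical.

```javascript
// Hypothetical question sets; the framework's actual wording is not given here.
const IMAGE_QUESTIONS = [
  "What objects and people are visible in the image?",
  "Where does the scene take place?",
];
const CULTURAL_QUESTIONS = [
  "Which foods, clothing, or activities in the image carry cultural meaning?",
];

// Build the full question list from the user's settings (assumed shape:
// { culturalIdentity: string, includeEmotion: boolean, descriptionLength: string }).
function buildQuestions(settings) {
  const settingsQuestions = [];
  if (settings.culturalIdentity) {
    settingsQuestions.push(
      `Which details matter most to a reader who identifies as ${settings.culturalIdentity}?`
    );
  }
  if (settings.includeEmotion) {
    settingsQuestions.push("What emotions do the people in the image appear to show?");
  }
  return [...IMAGE_QUESTIONS, ...CULTURAL_QUESTIONS, ...settingsQuestions];
}

// Turn the per-question answers into one summarization prompt.
function buildSummaryPrompt(answers, settings) {
  const notes = answers
    .map(({ question, answer }) => `Q: ${question}\nA: ${answer}`)
    .join("\n");
  return (
    `Using only the notes below, write ${settings.descriptionLength} alt text ` +
    `and describe each cultural aspect the notes mention.\n${notes}`
  );
}

// askModel(image, prompt) is an injected async function standing in for the
// call to the vision model; it resolves to the model's text reply.
async function describeImage(image, settings, askModel) {
  const answers = [];
  for (const question of buildQuestions(settings)) {
    answers.push({ question, answer: await askModel(image, question) });
  }
  return askModel(image, buildSummaryPrompt(answers, settings));
}
```

Passing `askModel` in as a parameter keeps the sketch independent of any particular model API and makes the prompt-building steps easy to check without calling a model.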
Image Description Framework
Example of Generated Alt Text
Intersectional Needs
Our design addresses our clients as whole people, not just people who are blind or have visual impairments. By making cultural identity part of the design and output, we get a more holistic view of the image itself. Additionally, we found from our client interviews that users who were familiar with a cultural aspect of an image, whether by sharing the identity or by having experienced that culture, were more likely to prefer more succinct alt text. Conversely, users who were unfamiliar with a cultural aspect of an image were more likely to prefer more descriptive alt text. Customizable alt text output provides a more tailored experience and gives our users more agency.
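As an illustration of the interview finding above, a default description length could be derived from whether the user is familiar with the culture shown, with any explicit setting taking priority. This is a hypothetical sketch: the function name, the `"short"` and `"detailed"` values, and the idea of an automatic default are assumptions, not part of the extension as described.

```javascript
// Hypothetical rule: familiarity with the depicted culture suggests a shorter description.
// userCultures: cultures the user identifies with or has experienced (assumed input).
// imageCulture: the culture detected in or associated with the image (assumed input).
// override: an explicit length chosen on the settings page, if any.
function preferredLength(userCultures, imageCulture, override) {
  // An explicit user choice always wins, preserving user agency.
  if (override === "short" || override === "detailed") {
    return override;
  }
  const normalize = (name) => name.trim().toLowerCase();
  const familiar = userCultures.map(normalize).includes(normalize(imageCulture));
  return familiar ? "short" : "detailed";
}

console.log(preferredLength(["Indian"], "indian"));            // short
console.log(preferredLength(["American"], "Indian"));          // detailed
console.log(preferredLength(["American"], "Indian", "short")); // short
```

The override parameter reflects the point that the tendency observed in interviews is only a starting point; the user's own setting should decide the final length.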
Potential Risks and Harms
We identified several risks in deploying culturally aware AI-generated alt text: hallucinated or inaccurate outputs, stereotyping, offensive assumptions, and overwhelming or excessive detail. To mitigate these harms, we designed a framework loosely based on Chain of Thought prompting and the framework in the GenAssist paper. Additionally, we prioritized user agency and contextual control. Users can optionally input their cultural identity and adjust generation settings, such as including additional context or emotion, through an accessible settings page. This encourages transparency and lets users tailor the experience to their needs. The AI prompt was also carefully structured to focus on relevant cultural cues (e.g., food, clothing, activity) while avoiding overreach or bias.
Advertisement
Almost 22% of home page main images have no alt text. This figure does not include images with insufficient alt text. Given the wide variety of lived experiences within the BLV community, many of these images also lack the cultural context that would help in describing culturally rich images. Our Chrome extension, Culturally Aware Alt Text, aims to add this cultural context to images with poor or no alt text. Previously, alt text generator extensions were inconsistent and lacked the cultural context that could benefit blind and visually impaired users. Now, users will be able to generate cultural context alongside alt text to aid their understanding of culturally rich images.
Accessibility
VPAT Table
Criterion | Conformance (0-5) | Description |
---|---|---|
1.1.1 - Non-text Content | 3/5 | Bad alt text is intentional on the main web page, but the user can add alt text to it with the extension |
1.3.1 - Info and Relationships | 5/5 | VoiceOver now reads all parts of the text in the web application, which is particularly important in the “Context Settings” area |
1.3.4 - Orientation | 0/5 | Not designed for mobile implementations |
1.4.3 - Contrast (Minimum) | 5/5 | Uses a basic black-on-white color scheme, so contrast is very high |
1.4.4 - Resize Text | 0/5 | Text in the application does not resize according to the client’s preferred font size |
1.4.10 - Reflow | 0/5 | No ability to resize |
1.4.11 - Non-text Contrast | 5/5 | Uses a basic black-on-white color scheme, so contrast is very high, and switch colors have at least 4.5:1 contrast with white |
2.4.3 - Focus Order | 5/5 | Screen readers go through the content in the correct order since heading levels and text are consistent |
4.1.2 - Name, Role, Value | 0/5 | The extension gives no indication of which technologies and actions are explicitly supported |
4.1.3 - Status Messages | 4/5 | Changes are partly announced: the aria-live="polite" region reads out the save, but other changes are not confirmed |
Accessibility Audits
Audit 1
Description:
Clickable Fields Present when they shouldn’t be
Testing Method:
VoiceOver
Evidence:
Guidelines violated:
WCAG 4.1.2 - Name, Role, Value
Explanation:
The example violates WCAG 4.1.2 - Name, Role, Value because the extension does not state which assistive tools it is compatible with. Since the extension is intended for people who rely on screen readers, it is important to know which screen readers are explicitly supported.
Severity Rating: 5
Justification:
Frequency is high because the issue affects the settings page, one of the main parts of the extension. Impact is high because a screen reader user may be confused when they encounter a bug or similar behavior that they are unprepared for. Persistence is high since the issue is unavoidable in the extension's current state.
Possible Solution:
To improve this, the developer should thoroughly test the extension with different assistive tools, primarily common screen readers, so that compatibility can be stated explicitly, and should carry out accessibility audits to document what the extension does not support.
Audit 2
Description:
High Contrast Non Text Elements
Testing Method:
WebAim WAVE
Evidence:
Guidelines Passed:
1.4.3 - Contrast (Minimum)
1.4.11 - Non-text Contrast
Explanation:
The example passes 1.4.3 - Contrast (Minimum) and 1.4.11 - Non-text Contrast since the colors are high contrast. The text is black on a white background, which satisfies 1.4.3. Both switch colors have at least 4.5:1 contrast against white, which satisfies 1.4.11.
Severity Rating: 1
Justification:
Frequency is high because it affects the UI of the extension. Impact is none because it is not a problem. Persistence is none because it is not a problem.
Possible Solution:
Continue using high contrast colors.
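For reference, the contrast figures in this audit can be reproduced with the WCAG contrast ratio formula. The sketch below is an illustrative check and not part of the extension; the function names are made up for this example, and colors are given as `[r, g, b]` arrays with channels from 0 to 255.

```javascript
// Linearize one sRGB channel (0-255), following the WCAG relative luminance definition.
function channelToLinear(value) {
  const c = value / 255;
  return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// Relative luminance of an [r, g, b] color, from 0 (black) to 1 (white).
function relativeLuminance([r, g, b]) {
  return (
    0.2126 * channelToLinear(r) +
    0.7152 * channelToLinear(g) +
    0.0722 * channelToLinear(b)
  );
}

// Contrast ratio between two colors, from 1 (identical) up to 21 (black on white).
function contrastRatio(colorA, colorB) {
  const la = relativeLuminance(colorA);
  const lb = relativeLuminance(colorB);
  const lighter = Math.max(la, lb);
  const darker = Math.min(la, lb);
  return (lighter + 0.05) / (darker + 0.05);
}

const black = [0, 0, 0];
const white = [255, 255, 255];
console.log(contrastRatio(black, white).toFixed(2)); // 21.00
```

WCAG 1.4.3 asks for at least 4.5:1 for normal-size text and 1.4.11 asks for at least 3:1 for user interface components, so black text on white and switch colors at 4.5:1 or higher against white both clear their thresholds.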
Audit 3
Description:
Indication of Saved Cultural Identity
Testing Method:
VoiceOver
Evidence:
Guidelines Passed:
4.1.3 - Status messages
Explanation:
The example passes 4.1.3 - Status Messages since VoiceOver reads out the cultural identity that was just saved, in addition to the visual change shown below the input box.
Severity Rating: 1
Justification:
Frequency is low because it affects a single input box under Settings. Impact is none because it is not a problem. Persistence is none because it is not a problem.
Possible Solution:
Continue using aria-live="polite", and aria-live="assertive" where appropriate.
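One way to keep status announcements consistent is to route them through a single helper that writes into live regions already present in the page. The sketch below is a hypothetical illustration, not the extension's code: it assumes two elements exist in the markup, one with `aria-live="polite"` and one with `aria-live="assertive"`, and the function names and message wording are made up.

```javascript
// regions: { polite: Element, assertive: Element }, both assumed to be in the page
// already, with aria-live="polite" and aria-live="assertive" set in the markup.
function announce(regions, message, urgency = "polite") {
  if (urgency !== "polite" && urgency !== "assertive") {
    throw new Error(`Unsupported urgency: ${urgency}`);
  }
  // Updating the text of a live region prompts the screen reader to read it.
  // "polite" waits for current speech to finish; "assertive" interrupts it.
  regions[urgency].textContent = message;
}

// Hypothetical message for the saved-identity confirmation in the settings page.
function savedIdentityMessage(identity) {
  return `Saved cultural identity: ${identity}`;
}
```

In the settings page this might be called as `announce(regions, savedIdentityMessage(value))` after a successful save, reserving `"assertive"` for errors that need immediate attention.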