Enhancing Self-Checkout Accessibility
by Abosh Upadhyaya, Ananya Ganapathi, Suhani Arora
Introduction
Our project aims to address the challenge of making self-checkout more accessible for visually impaired individuals. Visually impaired individuals often have difficulty discerning on-screen prompts and locating buttons because touch-screen self-checkout devices lack tactile guides. To tackle this challenge, our system enables users to take photographs of the self-checkout machine’s screen using their smartphones. Based on the provided images, the relevant information about the screen’s content and button locations is relayed to the user through our website. This allows users with visual impairments to navigate the self-checkout process independently, without relying on assistance.
Self-checkout machines offer convenience but have inadvertently excluded visually impaired individuals. By making these machines more accessible with real-time information, visually impaired individuals can also benefit from self-checkout technology. We should also note that our current project focuses specifically on self-checkout machines in QFC grocery stores.
Related Work
Several projects and initiatives have tackled the challenge of increasing self-checkout accessibility for visually impaired individuals. One notable example is the “Accessible Self Checkout” system developed by the National Federation of the Blind (NFB). This system uses audio cues and tactile buttons to guide users through the self-checkout process. While effective, this approach requires modifications to existing self-checkout machines, which can be costly and time-consuming to implement.
Another relevant project is the TapTapSee app, which allows users to take photos of everyday objects and receive information about them through text-to-speech functionality. This app could potentially be used by visually impaired individuals to gather information about self-checkout interfaces, but it lacks features specific to the self-checkout process, such as real-time guidance and interaction with the machine.
Our project builds upon these existing efforts by offering a more comprehensive and user-friendly solution. By leveraging the ubiquity of smartphones and web technologies, we provide a cost-effective and scalable approach to enhancing self-checkout accessibility without requiring hardware modifications. Our system also goes beyond information provision by offering real-time guidance and interaction capabilities, further empowering visually impaired users to complete self-checkout tasks independently.
Methodology
Our project involved the design and implementation of a web-based system for enhancing self-checkout accessibility for visually impaired individuals. We divided the project into several key phases:
User Research and Needs Assessment
We found first-person accounts that supported the need for our application. For instance:
- Josh Boykin, a visually impaired person, shared their experience on TikTok: Josh struggled to use a grocery store self-checkout because the screen instructions were not accessible to them.
- In another TikTok, Josh talks about how the LCD screens on self-checkout devices can be hard to see with limited vision, and how the phasing out of cashiers at grocery retailers can make it difficult to find assistance in such a situation.
System Design and Development
- We then designed a system that integrates image recognition, web scraping, natural language processing, and text-to-speech technologies to provide users with real-time information about self-checkout screens.
- We also implemented a web interface that allows users to take pictures of the self-checkout screen and submit them to our website for processing.
- The web interface displays the extracted information, including the current screen content, button locations, and audio instructions on how to complete different self-checkout tasks based on the screen the user is on. Everything is screen-reader accessible.
Technical Details
- Gathered 146 images across five QFC screens: Open, Bags, Scanning, Closed, and Thank You.
- Set aside roughly 10% of the images solely for testing (no training or hyperparameter tuning would happen on them).
- Trained an AutoML model on Google Cloud’s Vertex AI platform. We uploaded the 146 images to the platform and labeled each with the screen it showed. We then marked 14 images across the different screens as test data so the model would not train on them. Vertex AI split the remaining data into training and validation sets and trained an image classification model.
- Deployed the model to an endpoint so it could serve online prediction requests.
- Programmed a web app (HTML, CSS, JS) that allows users to upload an image; once the user clicks a button, the app sends a request to the endpoint, takes the label with the highest probability, and displays useful information about that screen (see the first sketch after this list).
- We also integrated the SpeechSynthesisUtterance interface from the Web Speech API in JavaScript, which automatically reads out which screen the user photographed and its description, without requiring any additional steps from the user (a short sketch of this follows below as well).
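To make the request flow concrete, here is a minimal sketch of how the web app could call the deployed endpoint and pick the highest-probability label. It assumes the standard Vertex AI online-prediction REST format for AutoML image classification; the project ID, region, endpoint ID, access token, and the SCREEN_INFO descriptions are placeholders rather than our exact values.

```javascript
// Placeholders: substitute real project/endpoint values and a valid OAuth token.
const PROJECT = "your-project-id";
const REGION = "us-central1";
const ENDPOINT_ID = "1234567890";
const ACCESS_TOKEN = "ya29...."; // obtained via Google Cloud authentication

// Human-readable descriptions keyed by the labels the model was trained on
// (illustrative wording, not our exact text).
const SCREEN_INFO = {
  "Open": "The kiosk is ready. The Start button is in the center of the screen.",
  "Bags": "The kiosk is asking how many bags you are using.",
  // ...remaining screens
};

async function classifyScreen(file) {
  // Read the uploaded photo and strip the data-URL prefix to get raw base64.
  const dataUrl = await new Promise((resolve, reject) => {
    const reader = new FileReader();
    reader.onload = () => resolve(reader.result);
    reader.onerror = reject;
    reader.readAsDataURL(file);
  });
  const base64 = dataUrl.split(",")[1];

  // Online prediction request to the deployed Vertex AI endpoint.
  const url = `https://${REGION}-aiplatform.googleapis.com/v1/projects/${PROJECT}` +
              `/locations/${REGION}/endpoints/${ENDPOINT_ID}:predict`;
  const response = await fetch(url, {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${ACCESS_TOKEN}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      instances: [{ content: base64 }],
      parameters: { confidenceThreshold: 0.0, maxPredictions: 5 },
    }),
  });
  const result = await response.json();

  // AutoML image classification returns parallel arrays of labels and confidences;
  // take the label with the highest confidence.
  const { displayNames, confidences } = result.predictions[0];
  const best = confidences.indexOf(Math.max(...confidences));
  return displayNames[best];
}
```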
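And here, roughly, is how the automatic read-out works with SpeechSynthesisUtterance; the function name and spoken wording are illustrative.

```javascript
// Speak the predicted screen name and its description as soon as the prediction
// returns, so the user does not have to take any additional steps to hear it.
function announceScreen(label, description) {
  const utterance = new SpeechSynthesisUtterance(`${label} screen. ${description}`);
  utterance.rate = 1.0;              // default rate; could be made user-adjustable
  window.speechSynthesis.cancel();   // interrupt any announcement still playing
  window.speechSynthesis.speak(utterance);
}

// Example usage once a prediction comes back:
// announceScreen("Bags", SCREEN_INFO["Bags"]);
```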
Role of People with Disabilities
People with disabilities played a crucial role in every stage of our project. In identifying the problem and scope, the firsthand accounts we found ensured that we were addressing the right issue and focusing on the most critical user needs. Their suggestions and needs helped us develop a user-friendly and accessible system that addressed their specific requirements.
We didn’t have time to test our app with a visually impaired individual within our 4-week timeline, but if we had, we would have conducted testing sessions to gather feedback and improve our design and usability.
Disability Justice Perspectives
Intersectionality: The intersectionality principle acknowledges the interconnected nature of social categorizations such as disability, technology use, and economic participation. Our solution caters to visually impaired individuals who are also active digital consumers, thereby addressing their unique intersectional needs.
Collective Access: This project was conceptualized with visually impaired individuals in mind. As such, making it accessible to them is our priority. However, by displaying information on the user’s device, this technology could be helpful to other groups as well. For instance, consider someone who doesn’t have English as a first language. By putting information on their phone, they can use existing applications to translate it, so they have a better understanding of what they are being asked. Phones can be used to help those with several other disabilities or backgrounds that might make a task like using self-checkout difficult.
Learnings and Future Work
Learnings
Through this project, we learned about the need for more accessible self-checkouts. Before researching this, we had never really thought about how self-checkout kiosks at grocery stores can be inaccessible to individuals with disabilities. Overall, this project was an enlightening and educational experience for us.
In terms of the technical aspects of this project, everyone in our group has taken or is taking ML courses and has worked on image classification models before, so it was interesting to apply that experience to a real-world project and understand the theory behind it as well. It was also our first time working with AutoML models and Vertex AI, so learning these tools was enriching and useful for future applications. Making HTTP requests to endpoints, specifically the Google Cloud Vertex AI endpoint, was also new to us and another good learning experience. Finally, we learned about the SpeechSynthesisUtterance interface, which is a very useful tool for making accessible web apps, and we will definitely consider using it in future projects.
Future
For the future, we hope to make this a mobile application that has a built-in camera or directly links to the phone camera, so that the moment a user points the camera at a kiosk screen, it can classify the screen and read the relevant information aloud. Most of this is front-end work, and we could feasibly achieve it using React Native. In terms of the model itself, we hope to collect even more training data and improve precision.
Furthermore, after implementing the above, we would love to generalize this application to any grocery store, not just QFC. This would require a much more complex model and a much larger dataset spanning different stores.
How We Made Our App Accessible
We implemented several key features to ensure the platform is inclusive and usable for everyone, regardless of disability:
- Screen reader compatibility: Our web interface is fully compatible with popular screen readers, allowing visually impaired users to navigate and interact with the platform seamlessly. All text content and interactive elements are clearly labeled and accessible to screen readers, providing an equal and efficient user experience.
- Color contrast and readability: We carefully considered color contrast ratios and font sizes to ensure optimal readability for users with low vision or color blindness.
- Keyboard navigation: The entire web interface is fully keyboard navigable, allowing users to navigate and interact with all elements without requiring a mouse or other pointing device. This promotes accessibility for individuals with motor impairments or those who prefer keyboard-only interaction.
- Text-to-speech functionality: All extracted information from the self-checkout screen, including button locations, instructions, and error messages, is presented through text-to-speech functionality. This allows users with visual impairments to access the information without relying on visual cues, ensuring they receive complete and helpful guidance. (A brief sketch of how these pieces fit together follows this list.)
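As an illustration of how these features fit together, the sketch below shows one way the result area could be wired up so that screen readers announce new content and keyboard-only users can trigger classification. The element IDs and markup are hypothetical, and classifyScreen, announceScreen, and SCREEN_INFO refer to the sketches in the Technical Details section.

```javascript
// Hypothetical markup (IDs are illustrative):
//   <input type="file" id="photo-input" accept="image/*"
//          aria-label="Photo of the self-checkout screen">
//   <button id="classify-button">Describe this screen</button>
//   <div id="result" aria-live="polite"></div>
//
// aria-live="polite" makes screen readers announce updates to the result region
// without the user having to move focus to it, and the native <button> element is
// reachable and activatable by keyboard (Enter/Space) with no extra scripting.

const resultRegion = document.getElementById("result");

document.getElementById("classify-button").addEventListener("click", async () => {
  const file = document.getElementById("photo-input").files[0];
  if (!file) {
    resultRegion.textContent = "Please choose a photo of the kiosk screen first.";
    return;
  }
  const label = await classifyScreen(file);   // sketch from the Technical Details section
  const description = SCREEN_INFO[label];
  resultRegion.textContent = `${label} screen: ${description}`; // announced by screen readers
  announceScreen(label, description);         // also spoken aloud via the Web Speech API
});
```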