CSE443 Final Project
Accessible Transcript Review Tool
Sahana & Mahima
Project Overview
Project Wireframe: https://scrum-tofu-27822975.figma.site/
After receiving feedback, we narrowed the scope of our project. Instead of building a full end-to-end speech-to-text system, we start from transcripts generated by an existing note-taking tool and focus on helping users interpret, repair, and restructure them.
Our design space focuses on:
- Transcript interpretation
- Error repair
- Transcript restructuring
- Reducing cognitive overload
We situate the project in the context of university lecture transcripts, since note-taking tools are increasingly used by students for both productivity and accessibility.
Tasks and Timeline
Task 1: Review and Repair Raw Transcript
Workflow
- User opens the application
- A pre-generated lecture transcript appears
- The transcript includes timestamps and speaker labels
- The system highlights low-confidence or unclear text
- Users can collapse sections to reduce visual overload
Users can then:
- Edit incorrect words
- Relabel speakers
- Reorganize transcript sections
- Tag content into categories (a data sketch follows the category list below)
Example categories include:
- Key Concepts
- Definitions
- Examples
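To make this workflow concrete, the sketch below shows one way transcript segments and tags could be represented, and how low-confidence text might be flagged for highlighting. The interface, field names, and the 0.8 threshold are illustrative assumptions, not a final data model.

```typescript
// Illustrative data model for a repairable transcript segment; all names are assumptions.
interface TranscriptSegment {
  startTime: number;   // seconds from the start of the lecture
  endTime: number;
  speaker: string;     // editable speaker label, e.g. "Lecturer"
  text: string;        // editable transcript text
  confidence: number;  // 0-1 score reported by the upstream note-taker
  tags: Array<"Key Concept" | "Definition" | "Example">;
  collapsed: boolean;  // hides the segment body to reduce visual overload
}

// Flag segments below a confidence threshold so the UI can highlight them for review.
// The 0.8 cutoff is a placeholder value, not a measured one.
function findLowConfidenceSegments(
  segments: TranscriptSegment[],
  threshold = 0.8
): TranscriptSegment[] {
  return segments.filter((segment) => segment.confidence < threshold);
}
```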
Accessibility Focus
This task supports accessibility by:
- Supporting different cognitive processing styles
- Providing a plain-language option to reduce linguistic barriers
- Supporting ESL learners
- Supporting students with learning disabilities
- Allowing multiple representations of information, such as outlines, summaries, and bullet points
Goal: Make transcription errors visible and reduce ambiguity in long transcripts while reinforcing that AI-generated text should not be treated as authoritative.
Task 2: Study Mode
Workflow
- User selects Study Mode
- The system automatically generates (a rule-based sketch follows this list):
  - A structured outline
  - A plain-language summary
  - Bullet-point key concepts
- The user can adjust reading level and text size
- Notes can be exported in accessible formats
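The generation step above could begin as simple rules over the Task 1 tags, matching the rule-based fallback described later in this document. The sketch below is one possible shape; the type and function names are illustrative assumptions.

```typescript
// Reuses the illustrative tag vocabulary from the Task 1 sketch; names are assumptions.
type Tag = "Key Concept" | "Definition" | "Example";

interface TaggedSegment {
  text: string;
  tags: Tag[];
}

interface StudyNotes {
  outline: string[];      // Key Concepts in lecture order
  definitions: string[];
  examples: string[];
}

// Rule-based Study Mode generation: group tagged segments into structured notes.
// A later iteration could swap in a model-based summarizer without changing callers.
function buildStudyNotes(segments: TaggedSegment[]): StudyNotes {
  const notes: StudyNotes = { outline: [], definitions: [], examples: [] };
  for (const segment of segments) {
    if (segment.tags.includes("Key Concept")) notes.outline.push(segment.text);
    if (segment.tags.includes("Definition")) notes.definitions.push(segment.text);
    if (segment.tags.includes("Example")) notes.examples.push(segment.text);
  }
  return notes;
}
```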
Accessibility Focus
Study Mode supports accessibility by:
- Providing multiple representations of the same content
- Supporting different learning and cognitive styles
- Reducing linguistic complexity through plain-language summaries
- Helping ESL students and students with learning disabilities
Validation Plan
Since we are no longer building a transcription system, our validation focuses on usability and clarity rather than speech recognition accuracy.
Baseline Analysis
We will first run an existing note-taker on prerecorded lecture audio and document:
- transcription errors (quantified below using WER)
- structural weaknesses
- areas of cognitive overload
This establishes the limitations of current tools.
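One standard way to quantify the transcription errors documented in this baseline is Word Error Rate (WER), which also appears in the Week 10 validation tasks. The sketch below computes WER from a word-level edit distance; whitespace tokenization is a simplifying assumption, and punctuation and casing would need extra handling.

```typescript
// Word Error Rate = (substitutions + deletions + insertions) / number of reference words,
// computed here via word-level Levenshtein edit distance.
function wordErrorRate(reference: string, hypothesis: string): number {
  const ref = reference.trim().split(/\s+/);
  const hyp = hypothesis.trim().split(/\s+/);
  // dp[i][j] = minimum edits to turn the first i reference words into the first j hypothesis words
  const dp: number[][] = Array.from({ length: ref.length + 1 }, (_, i) =>
    Array.from({ length: hyp.length + 1 }, (_, j) => (i === 0 ? j : j === 0 ? i : 0))
  );
  for (let i = 1; i <= ref.length; i++) {
    for (let j = 1; j <= hyp.length; j++) {
      const substitution = ref[i - 1] === hyp[j - 1] ? 0 : 1;
      dp[i][j] = Math.min(
        dp[i - 1][j] + 1,               // deletion
        dp[i][j - 1] + 1,               // insertion
        dp[i - 1][j - 1] + substitution // substitution or match
      );
    }
  }
  return dp[ref.length][hyp.length] / ref.length;
}
```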
Usability Evaluation
We will test whether users can:
- review transcripts
- edit incorrect text
- restructure notes
- generate summaries
Evaluation metrics include:
- repair efficiency (e.g., time and number of edits needed to correct flagged text)
- clarity of structured notes
- differences between raw and edited summaries
The goal is to determine whether the interface improves transcript sense-making.
Accessibility Testing
Accessibility validation will include:
- WCAG 2.1 audit (a partly automated check is sketched after this list)
- Keyboard-only navigation testing
- Screen reader testing using VoiceOver
- Verification of semantic HTML structure and color contrast
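Part of the WCAG audit could be automated to catch regressions between manual test sessions. The sketch below assumes the axe-core library is added as a development dependency; automated checks would supplement, not replace, the keyboard-only and VoiceOver testing listed above.

```typescript
import axe from "axe-core";

// Automated spot-check of the rendered transcript view against WCAG 2.1 A/AA rules.
// axe-core is an assumed tooling choice; manual screen reader and keyboard testing still apply.
async function auditTranscriptView(): Promise<void> {
  const results = await axe.run(document, {
    runOnly: { type: "tag", values: ["wcag2a", "wcag2aa", "wcag21aa"] },
  });
  for (const violation of results.violations) {
    console.warn(`${violation.id}: ${violation.description}`);
    for (const node of violation.nodes) {
      console.warn(`  affected element: ${node.html}`);
    }
  }
}
```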
Storyboards
Task 1: Review and Repair Transcript
ALT: Three-panel storyboard titled “Task 1: Review and Repair Transcript”.
Panel 1 shows a lecture transcript with timestamps and highlighted low-confidence text.
Panel 2 shows a student editing incorrect words and tagging sections as Definition or Key Concept.
Panel 3 shows Study Mode generating a structured outline and plain-language summary.
Task 2: Study Mode

ALT: Four-panel storyboard titled “Task 2: Study Mode”.
Panel 1 shows the user selecting Study Mode.
Panel 2 shows the transcript converted into structured notes and summaries.
Panel 3 shows settings for reading level and text size.
Panel 4 shows exporting the notes.
Timeline
Week 9 – Functional Prototype
Goal: Working system with core functionality.
Sahana:
- Design transcript display UI
- Implement timestamp navigation
- Add editing functionality
- Add accessibility features (keyboard navigation, contrast improvements, text resizing)
Mahima:
- Implement transcript import workflow (load pre-generated transcripts from the existing note-taking tool)
- Implement low-confidence highlighting and content tagging
- Implement Study Mode generation (outline, plain-language summary, key concepts)
Deliverable: End-to-end import, edit, and save workflow.
Week 10 – Validation Phase
Sahana:
- Conduct WCAG accessibility audit
- Test keyboard-only navigation
- Perform screen reader testing
- Refine UI based on accessibility findings
Mahima:
- Conduct Word Error Rate (WER) analysis on the baseline note-taker output
- Test summary generation accuracy
- Validate timestamp alignment
Deliverable: Documented validation results and improved system.
Final Presentation Week
- Live demo
- Poster presentation
- Completed README
- Storyboards uploaded with ALT text
Feasibility Analysis
This scope is feasible because we are not building a speech recognition system. Instead, we are designing an interaction layer on top of existing transcripts.
Technologies used include:
- Web-based frontend interface
- Accessibility evaluation tools, including the VoiceOver screen reader, with WCAG 2.1 as the reference standard
Both team members have experience with frontend development and accessibility evaluation.
Fallback Plan
If technical challenges occur:
- Use prerecorded transcripts instead of real-time transcription
- Use rule-based summaries before integrating advanced models
Risks and Mitigation
Risk: Summary quality depends on transcription accuracy
Mitigation: Allow users to edit transcripts before generating summaries
Risk: Accessibility compliance requires iteration
Mitigation: Begin accessibility testing at the start of Week 10 to leave time for iteration
Risk: Scope creep from AI features
Mitigation: Prioritize core transcript interaction features first
Disability Analysis
Principle 1: Leadership of Those Most Impacted
Accessibility solutions should be guided by people who directly experience disability-related barriers.
Our project is not led by individuals who identify as disabled users of speech-to-text tools, which creates a limitation. Design decisions may unintentionally reflect assumptions rather than lived experience.
To partially address this limitation, we:
- studied first-person accounts from Deaf and hard-of-hearing users
- analyzed critiques of captioning and transcription systems
- grounded design decisions in documented accessibility needs
However, these steps do not replace participatory design with disabled stakeholders.
Principle 2: Intersectionality
Intersectionality recognizes that disability intersects with identities such as language background, race, socioeconomic status, and education.
Our design addresses intersectionality by:
- supporting multiple spoken languages
- allowing interaction without fine motor precision
- providing visual, textual, and editable outputs
- allowing customizable summaries to reduce cognitive load
These features support users who may experience multiple accessibility barriers simultaneously.
Is the Technology Ableist?
Speech technologies often perform worse for non-standard accents, speech impairments, or atypical communication styles.
Our design mitigates this by:
- enabling transcript correction
- allowing user control
- supporting personalization and editing
However, because the transcripts we start from are still produced by speech recognition, the system cannot remove all upstream accessibility barriers.
Is It Informed by Disabled Perspectives?
The project is informed by disabled perspectives but not led by them.
Sources include:
- accessibility research
- first-person narratives
- critiques of assistive technologies
Common frustrations we considered include caption inaccuracies, cognitive overload, and lack of editing control.
Does It Oversimplify Disability?
We attempt to avoid oversimplification by providing customizable interaction modes rather than designing around a single, one-size-fits-all user model.
Future improvements would involve participatory design with diverse disability communities.