CSE443 Final Project

Accessible Transcript Review Tool
Sahana & Mahima


Project Overview

Project Wireframe: https://scrum-tofu-27822975.figma.site/

After receiving feedback, we narrowed the scope of our project. Instead of building a full end-to-end speech-to-text system, we begin with transcripts generated by an existing note-taking tool and focus on helping users interpret, repair, and restructure transcripts.

Our design space focuses on:

  • Transcript interpretation
  • Error repair
  • Transcript restructuring
  • Reducing cognitive overload

We situate the project in the context of university lecture transcripts, since note-taking tools are increasingly used by students for both productivity and accessibility.


Tasks and Timeline

Task 1: Review and Repair Raw Transcript

Workflow

  1. User opens the application
  2. A pre-generated lecture transcript appears
  3. The transcript includes timestamps and speaker labels
  4. The system highlights low-confidence or unclear text
  5. Users can collapse sections to reduce visual overload

Users can then:

  • Edit incorrect words
  • Relabel speakers
  • Reorganize transcript sections
  • Tag content into categories

Example categories include:

  • Key Concepts
  • Definitions
  • Examples
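The review-and-repair workflow above can be sketched as a small data model. This is a hypothetical sketch, not the API of any specific note-taking tool: the `Segment` fields and the `flag_low_confidence` helper are our own illustrative names, and we assume the tool exposes a per-segment confidence score.

```python
from dataclasses import dataclass, field

# Hypothetical transcript segment: timestamp, speaker label, text,
# an ASR confidence score in [0, 1], and user-applied category tags.
@dataclass
class Segment:
    start: float                 # seconds from lecture start
    speaker: str
    text: str
    confidence: float
    tags: list = field(default_factory=list)  # e.g. "Key Concept"

def flag_low_confidence(segments, threshold=0.8):
    """Return segments whose confidence falls below the threshold,
    so the UI can highlight them for review."""
    return [s for s in segments if s.confidence < threshold]

# Example: a short transcript with one unclear segment.
transcript = [
    Segment(0.0, "Lecturer", "Today we cover hash tables.", 0.95),
    Segment(4.2, "Lecturer", "Collisions are resolved by chaining.", 0.62),
]
flagged = flag_low_confidence(transcript)
```

In this sketch, only the second segment would be highlighted for repair.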

Accessibility Focus

This task supports accessibility by:

  • Accommodating different cognitive processing styles
  • Providing a plain-language option that reduces linguistic barriers, particularly for ESL learners and students with learning disabilities
  • Allowing multiple representations of the same information, such as outlines, summaries, and bullet points

Goal: Make transcription errors visible and reduce ambiguity in long transcripts while reinforcing that AI-generated text should not be treated as authoritative.


Task 2: Study Mode

Workflow

  1. User selects Study Mode
  2. The system automatically generates:
    • A structured outline
    • A plain-language summary
    • Bullet-point key concepts
  3. The user can adjust reading level and text size
  4. Notes can be exported in accessible formats

Accessibility Focus

Study Mode supports accessibility by:

  • Providing multiple representations of the same content
  • Supporting different learning and cognitive styles
  • Reducing linguistic complexity through plain-language summaries, which particularly helps ESL students and students with learning disabilities

Validation Plan

Since we are no longer building a transcription system, our validation focuses on usability and clarity rather than speech recognition accuracy.

Baseline Analysis

We will first run an existing note-taker on prerecorded lecture audio and document:

  • transcription errors
  • structural weaknesses
  • areas of cognitive overload

This establishes the limitations of current tools.

Usability Evaluation

We will test whether users can:

  • review transcripts
  • edit incorrect text
  • restructure notes
  • generate summaries

Evaluation metrics include:

  • repair efficiency
  • clarity of structured notes
  • differences between raw and edited summaries

The goal is to determine whether the interface improves transcript sense-making.
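One way to quantify the "differences between raw and edited summaries" metric is a word-level similarity ratio. This is a minimal sketch using Python's standard library; the specific metric is our assumption, not a finalized evaluation design.

```python
import difflib

def summary_change_ratio(raw: str, edited: str) -> float:
    """Fraction of the text that differs between the raw and edited
    summaries (0.0 = identical, 1.0 = completely different), based on
    difflib's word-level similarity ratio."""
    matcher = difflib.SequenceMatcher(None, raw.split(), edited.split())
    return 1.0 - matcher.ratio()
```

A higher ratio after user repair would indicate that the raw transcript needed substantial correction before summarization.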

Accessibility Testing

Accessibility validation will include:

  • WCAG 2.1 audit
  • Keyboard-only navigation testing
  • Screen reader testing using VoiceOver
  • Verification of semantic HTML structure and color contrast
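The color-contrast check above can be partly automated with the relative-luminance and contrast-ratio formulas defined in WCAG 2.1; a minimal sketch (WCAG 2.1 AA requires at least 4.5:1 for normal text):

```python
def _linear_channel(c: int) -> float:
    # sRGB channel (0-255) to linear value, per the WCAG 2.1
    # relative-luminance definition.
    c = c / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

def relative_luminance(rgb) -> float:
    r, g, b = (_linear_channel(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(fg, bg) -> float:
    """WCAG 2.1 contrast ratio between two sRGB colors."""
    l1, l2 = sorted((relative_luminance(fg), relative_luminance(bg)),
                    reverse=True)
    return (l1 + 0.05) / (l2 + 0.05)

# Black text on a white background gives the maximum ratio of 21:1.
```

Running this over the UI's palette would flag any foreground/background pair below the 4.5:1 threshold.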

Storyboards

Task 1: Review and Repair Transcript

Review and Repair Storyboard

ALT: Three-panel storyboard titled “Task 1: Review and Repair Transcript”.
Panel 1 shows a lecture transcript with timestamps and highlighted low-confidence text.
Panel 2 shows a student editing incorrect words and tagging sections as Definition or Key Concept.
Panel 3 shows Study Mode generating a structured outline and plain-language summary.


Task 2: Study Mode

Study Mode Storyboard

ALT: Four-panel storyboard titled “Task 2: Study Mode”.
Panel 1 shows the user selecting Study Mode.
Panel 2 shows the transcript converted into structured notes and summaries.
Panel 3 shows settings for reading level and text size.
Panel 4 shows exporting the notes.


Timeline

Week 9 – Functional Prototype

Goal: Working system with core functionality.

Sahana:

  • Design transcript display UI
  • Implement timestamp navigation
  • Add editing functionality
  • Add accessibility features (keyboard navigation, contrast improvements, text resizing)

Mahima:

  • Implement transcript ingestion workflow
  • Integrate pre-generated transcripts from the existing note-taking tool
  • Implement Study Mode outline and summary generation

Deliverable: End-to-end load, edit, and save workflow.


Week 10 – Validation Phase

Sahana:

  • Conduct WCAG accessibility audit
  • Test keyboard-only navigation
  • Perform screen reader testing
  • Refine UI based on accessibility findings

Mahima:

  • Conduct Word Error Rate (WER) analysis on the baseline note-taker output
  • Test summary generation accuracy
  • Validate timestamp alignment
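The planned WER analysis can be computed with a standard word-level Levenshtein alignment; a minimal sketch (in practice an existing library such as jiwer could be used instead):

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference length,
    via Levenshtein distance over words."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,          # deletion
                           dp[i][j - 1] + 1,          # insertion
                           dp[i - 1][j - 1] + cost)   # substitution/match
    return dp[len(ref)][len(hyp)] / len(ref)
```

Comparing WER before and after user repair gives a concrete number for how much the editing interface improves the transcript.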

Deliverable: Documented validation results and improved system.


Final Presentation Week

  • Live demo
  • Poster presentation
  • Completed README
  • Storyboards uploaded with ALT text

Feasibility Analysis

This scope is feasible because we are not building a speech recognition system. Instead, we are designing an interaction layer on top of existing transcripts.

Technologies used include:

  • Web-based frontend interface
  • Accessibility evaluation tools (e.g., VoiceOver for screen reader testing), evaluated against the WCAG 2.1 guidelines

Both team members have experience with frontend development and accessibility evaluation.


Fallback Plan

If technical challenges occur:

  • Use prerecorded transcripts instead of real-time transcription
  • Use rule-based summaries before integrating advanced models
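The rule-based fallback could be as simple as extracting the sentences users have already tagged. This is a hypothetical sketch, with illustrative function names, reusing Task 1's tag categories:

```python
# Hypothetical rule-based fallback: build a summary from the sentences
# tagged as Key Concepts or Definitions, rather than calling a model.
def rule_based_summary(tagged_sentences,
                       keep_tags=("Key Concept", "Definition")):
    """tagged_sentences: list of (sentence, tag) pairs from the edited
    transcript. Returns the kept sentences joined as a short summary."""
    return " ".join(s for s, tag in tagged_sentences if tag in keep_tags)

notes = [
    ("A hash table maps keys to values.", "Definition"),
    ("The professor told a joke here.", "Aside"),
    ("Collisions can be resolved by chaining.", "Key Concept"),
]
```

Because it depends only on user tags, this fallback keeps Study Mode functional even if no summarization model is integrated.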

Risks and Mitigation

Risk: Summary quality depends on transcription accuracy
Mitigation: Allow users to edit transcripts before generating summaries

Risk: Accessibility compliance requires iteration
Mitigation: Begin accessibility testing early in Week 10

Risk: Scope creep from AI features
Mitigation: Prioritize core transcript interaction features first


Disability Analysis

Principle 1: Leadership of Those Most Impacted

Accessibility solutions should be guided by people who directly experience disability-related barriers.

Our project is not led by individuals who identify as disabled users of speech-to-text tools, which creates a limitation. Design decisions may unintentionally reflect assumptions rather than lived experience.

To partially address this limitation, we:

  • studied first-person accounts from Deaf and hard-of-hearing users
  • analyzed critiques of captioning and transcription systems
  • grounded design decisions in documented accessibility needs

However, these steps do not replace participatory design with disabled stakeholders.


Principle 2: Intersectionality

Intersectionality recognizes that disability intersects with identities such as language background, race, socioeconomic status, and education.

Our design addresses intersectionality by:

  • supporting multiple spoken languages
  • allowing interaction without fine motor precision
  • providing visual, textual, and editable outputs
  • allowing customizable summaries to reduce cognitive load

These features support users who may experience multiple accessibility barriers simultaneously.


Is the Technology Ableist?

Speech technologies often perform worse for non-standard accents, speech impairments, or atypical communication styles.

Our design mitigates this by:

  • enabling transcript correction
  • allowing user control
  • supporting personalization and editing

However, reliance on speech input means the system cannot eliminate all accessibility barriers.


Is It Informed by Disabled Perspectives?

The project is informed by disabled perspectives but not led by them.

Sources include:

  • accessibility research
  • first-person narratives
  • critiques of assistive technologies

Common frustrations we considered include caption inaccuracies, cognitive overload, and lack of editing control.


Does It Oversimplify Disability?

We attempt to avoid oversimplification by providing customizable interaction modes and avoiding a single-user model.

Future improvements would involve participatory design with diverse disability communities.