Projects

AltVision - AI-Powered Alt Text Generator

September 01, 2024
Chrome Web Store

Tech: React, AI, Chrome Extension API, Cloudflare Workers, GPT-4o
Background:
AltVision was developed in spring 2024, prompted by the release of GPT-4o, to explore how AI can improve web accessibility. The project began as a simple Chrome extension template and grew into a tool that generates context-aware alternative text for images.

Challenge:
The primary challenges were threefold:

  1. Fully utilizing the potential of GPT-4o for accessibility applications.

  2. Addressing a significant pain point in web accessibility: the lack of alternative text for images.

  3. Implementing a serverless architecture with Cloudflare Workers, which meant learning the edge-computing model, where code runs on a distributed network rather than on a single server.

Additional technical hurdles included:

  • Converting images to Base64 format for API processing

  • Implementing secure environment variable management

  • Designing an efficient routing system within Cloudflare Workers

  • Optimizing performance within the constraints of serverless architecture
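
As an illustration of the routing and environment-variable points above, here is a minimal sketch of how a Worker might dispatch requests and keep the OpenAI key server-side. It is not AltVision's actual code: the route paths, the `OPENAI_API_KEY` binding name, and all function names are assumptions made for the example.

```typescript
// Hypothetical binding name. In a Worker, secrets are typically set with
// `wrangler secret put` and arrive on `env`, so they never ship in the extension.
interface Env {
  OPENAI_API_KEY: string;
}

type Handler = (request: Request, env: Env) => Promise<Response>;

const json = (body: unknown, status = 200): Response =>
  new Response(JSON.stringify(body), {
    status,
    headers: { "Content-Type": "application/json" },
  });

// Forward the extension's payload to OpenAI, adding the secret on the server side.
const handleAltText: Handler = async (request, env) => {
  const upstream = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${env.OPENAI_API_KEY}`,
    },
    body: await request.text(),
  });
  return json(await upstream.json(), upstream.status);
};

const handleHealth: Handler = async () => json({ ok: true });

// Route table keyed by "METHOD /path" (paths are illustrative).
const routes: Record<string, Handler> = {
  "POST /alt-text": handleAltText,
  "GET /health": handleHealth,
};

function matchRoute(method: string, url: string): Handler | undefined {
  const { pathname } = new URL(url);
  return routes[`${method.toUpperCase()} ${pathname}`];
}

// In a deployed Worker this object would be the module's default export.
const worker = {
  async fetch(request: Request, env: Env): Promise<Response> {
    const handler = matchRoute(request.method, request.url);
    return handler ? handler(request, env) : json({ error: "Not found" }, 404);
  },
};
```

A flat lookup table like this keeps routing cheap and easy to read for a handful of endpoints. A larger API would probably want a small router library instead.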

Solution:
AltVision is a Chrome extension that:

  1. Detects images lacking alt text on web pages.

  2. Uses GPT-4o to analyze images and their surrounding context.

  3. Generates accurate, contextually relevant alt text for images.

  4. Supports multiple languages, starting with English and Swedish.

  5. Leverages Cloudflare Workers for efficient, serverless backend processing.
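
The detection step (item 1) could look roughly like the sketch below. It assumes, purely for illustration, that an image counts as "lacking alt text" when the `alt` attribute is absent, and that an empty `alt=""` is skipped by default because it conventionally marks a decorative image. The `ImageLike` type, the `flagEmptyAlt` option, and the function name are hypothetical, not taken from the extension.

```typescript
// Minimal structural type so the logic can run outside a browser.
interface ImageLike {
  src: string;
  alt: string | null; // null means the alt attribute is absent
}

interface DetectOptions {
  // Assumption: empty alt ("") usually marks a decorative image, so it is
  // only flagged when the caller explicitly asks for it.
  flagEmptyAlt?: boolean;
}

function findImagesNeedingAlt<T extends ImageLike>(
  images: readonly T[],
  options: DetectOptions = {},
): T[] {
  return images.filter((img) => {
    if (img.alt === null) return true; // attribute missing entirely
    return options.flagEmptyAlt === true && img.alt.trim() === "";
  });
}

// In a content script, the input could be collected along these lines:
//   Array.from(document.querySelectorAll("img"), (el) => ({
//     src: el.currentSrc || el.src,
//     alt: el.getAttribute("alt"),
//   }));
```

Reading the attribute with `getAttribute("alt")` rather than the `alt` property matters here, because the property returns an empty string both when the attribute is missing and when it is deliberately empty.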

The extension converts each image to Base64 and sends it, along with contextual information from the page, through the Cloudflare Workers backend to the OpenAI API, which returns AI-generated alt text suggestions. Running the backend at the edge is intended to keep response times low and let the service scale without dedicated servers.
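
To make the request described above concrete, the following sketch encodes image bytes as Base64 and wraps them in a Chat Completions request body for GPT-4o. The prompt wording, the `max_tokens` value, and the `AltTextInput` shape are assumptions for illustration. The Base64 encoder is written by hand only so the example runs unchanged in an extension, a Worker, or Node.

```typescript
const B64_ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Standard Base64: every 3 input bytes become 4 output characters, padded with "=".
function toBase64(bytes: Uint8Array): string {
  let out = "";
  for (let i = 0; i < bytes.length; i += 3) {
    const has1 = i + 1 < bytes.length;
    const has2 = i + 2 < bytes.length;
    const b0 = bytes[i];
    const b1 = has1 ? bytes[i + 1] : 0;
    const b2 = has2 ? bytes[i + 2] : 0;
    out += B64_ALPHABET.charAt(b0 >> 2);
    out += B64_ALPHABET.charAt(((b0 & 3) << 4) | (b1 >> 4));
    out += has1 ? B64_ALPHABET.charAt(((b1 & 15) << 2) | (b2 >> 6)) : "=";
    out += has2 ? B64_ALPHABET.charAt(b2 & 63) : "=";
  }
  return out;
}

// Hypothetical input shape for one image.
interface AltTextInput {
  imageBytes: Uint8Array;
  mimeType: string; // e.g. "image/png"
  pageContext: string; // nearby headings, captions, link text, etc.
  language: "en" | "sv";
}

// Build a Chat Completions request body with the image inlined as a data URL.
function buildChatRequest(input: AltTextInput) {
  const dataUrl = `data:${input.mimeType};base64,${toBase64(input.imageBytes)}`;
  const languageName = input.language === "sv" ? "Swedish" : "English";
  return {
    model: "gpt-4o",
    max_tokens: 100, // assumed cap; alt text should stay short
    messages: [
      {
        role: "system",
        content: `You write concise, descriptive alt text in ${languageName}.`,
      },
      {
        role: "user",
        content: [
          { type: "text", text: `Surrounding page context: ${input.pageContext}` },
          { type: "image_url", image_url: { url: dataUrl } },
        ],
      },
    ],
  };
}
```

Base64 inflates the payload by about a third, so a real implementation would likely downscale large images before encoding them.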

Figure: Flow chart of the client-side request process in four connected steps: client-side request initiation, send data, receive data, and update interface.

Impact:
AltVision has the potential to become a valuable tool for content creators, accessibility professionals, and UX designers:

  1. Streamlines the process of identifying and fixing accessibility issues related to image descriptions.

  2. Improves overall web accessibility by ensuring more images have accurate, contextual alt text.

  3. Enhances the user experience for individuals relying on screen readers or other assistive technologies.

  4. Offers potential for broader application across industries, with corporate OpenAI accounts as one route to scaling up.

  5. Contributes to making the web more inclusive and accessible to all users.

AltVision demonstrates the potential of combining AI technologies with accessibility initiatives to create more inclusive digital experiences.