AltVision - AI-Powered Alt Text Generator
Tech: React, AI, Chrome Extension API, Cloudflare Workers, GPT-4o
Background:
Inspired by the release of GPT-4o and its ability to interpret images as well as text, AltVision was developed in autumn 2024 to explore how AI can improve web accessibility. The project began with a simple Chrome extension template and grew into a tool for generating context-aware alternative text for images.
Challenge:
The primary challenges were threefold:
Making full use of GPT-4o's image understanding for accessibility applications.
Addressing a significant pain point in web accessibility: the lack of alternative text for images.
Implementing a serverless architecture using Cloudflare Workers, which required understanding and adapting to a new paradigm of edge computing and distributed networks.
Additional technical hurdles included:
Converting images to Base64 format for API processing
Implementing secure environment variable management
Designing an efficient routing system within Cloudflare Workers
Optimizing performance within the constraints of serverless architecture
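Three of these hurdles (Base64 conversion, secret handling, and routing) can be sketched together. The following is a minimal illustration rather than AltVision's actual code: the `Env` interface, the `OPENAI_API_KEY` binding name, the `/alt-text` and `/health` paths, and the helper names are all assumptions made for the example.

```typescript
// Hypothetical names throughout: Env, OPENAI_API_KEY, the route paths, and
// the helper functions are illustrative and not taken from AltVision.

/** Secrets are bound to the Worker at deploy time and arrive via `env`. */
interface Env {
  OPENAI_API_KEY: string;
}

/** Encode raw image bytes as a Base64 string. */
function bytesToBase64(bytes: Uint8Array): string {
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  // btoa is a global in browsers, Workers, and recent Node versions.
  return btoa(binary);
}

type Route = "generate" | "health" | "method_not_allowed" | "not_found";

/** Map an HTTP method and path onto one of a small set of routes. */
function matchRoute(method: string, pathname: string): Route {
  if (pathname === "/health") {
    return method === "GET" ? "health" : "method_not_allowed";
  }
  if (pathname === "/alt-text") {
    return method === "POST" ? "generate" : "method_not_allowed";
  }
  return "not_found";
}

// In a Worker module, this object would be the default export.
const worker = {
  async fetch(request: Request, env: Env): Promise<Response> {
    const { pathname } = new URL(request.url);
    switch (matchRoute(request.method, pathname)) {
      case "health":
        return new Response("ok");
      case "generate":
        // The key is read from env and never shipped inside the extension.
        if (!env.OPENAI_API_KEY) {
          return new Response("Server misconfigured", { status: 500 });
        }
        // Forwarding the request to OpenAI is omitted from this sketch.
        return new Response("Not implemented", { status: 501 });
      case "method_not_allowed":
        return new Response("Method not allowed", { status: 405 });
      default:
        return new Response("Not found", { status: 404 });
    }
  },
};
```

Keeping the route matching in a pure function makes it easy to check without a running Worker, and keeping the API key in a Worker secret means it is never exposed to the browser.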
Solution:
The result is a Chrome extension that:
Detects images lacking alt text on web pages.
Utilizes GPT-4o to analyze images and their surrounding context.
Generates accurate, contextually relevant alt text for images.
Supports multiple languages, starting with English and Swedish.
Leverages Cloudflare Workers for efficient, serverless backend processing.
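The detection step in the list above could look roughly like the sketch below. It is an assumption-laden illustration, not the extension's real logic: the `ImageLike` interface and both function names are hypothetical, and the rule used here (flag only images with no `alt` attribute, and treat an empty `alt=""` as an intentionally decorative image) is one possible policy among several.

```typescript
// Hypothetical helper types and names; AltVision's real detection rules
// are not described in this write-up.

/** The one DOM method the check needs; HTMLImageElement satisfies it. */
interface ImageLike {
  getAttribute(name: string): string | null;
}

/**
 * Assumption: an image "lacks alt text" when the alt attribute is absent.
 * An empty alt="" is the HTML convention for decorative images, so it is
 * not flagged here.
 */
function lacksAltText(img: ImageLike): boolean {
  return img.getAttribute("alt") === null;
}

/** Collect every image in a list that has no alt attribute. */
function findImagesLackingAlt<T extends ImageLike>(images: ArrayLike<T>): T[] {
  const result: T[] = [];
  for (let i = 0; i < images.length; i++) {
    const img = images[i];
    if (lacksAltText(img)) result.push(img);
  }
  return result;
}

// In a content script this might be called as:
//   findImagesLackingAlt(document.querySelectorAll("img"))
```

Typing the input as `ImageLike` rather than `HTMLImageElement` keeps the check usable outside a browser, for example in unit tests.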
The extension sends image data (converted to Base64) along with contextual information to the OpenAI API, which returns AI-generated alt text suggestions. Requests are handled at the edge by Cloudflare Workers, which helps keep response times low and lets the backend scale with demand.
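As an illustration of the request described above, the sketch below builds a Chat Completions payload that carries a Base64 image as a data URL together with page context. The `AltTextRequest` shape, the function names, the prompt wording, and the `max_tokens` value are assumptions for the example; only the endpoint URL, the `gpt-4o` model name, and the `image_url` content-part format come from OpenAI's public API.

```typescript
// Hypothetical request shape, prompt wording, and function names.
interface AltTextRequest {
  imageBase64: string;    // Base64-encoded image bytes, without a data: prefix
  mimeType: string;       // e.g. "image/png"
  context: string;        // text surrounding the image on the page
  language: "en" | "sv";  // the two languages named in this write-up
}

/** Build the JSON body for a Chat Completions call with one image. */
function buildChatRequestBody(req: AltTextRequest) {
  const languageName = req.language === "sv" ? "Swedish" : "English";
  return {
    model: "gpt-4o",
    max_tokens: 100, // assumed cap; alt text should stay short
    messages: [
      {
        role: "system",
        content: `Write concise alt text in ${languageName} for the image.`,
      },
      {
        role: "user",
        content: [
          { type: "text", text: `Surrounding page context: ${req.context}` },
          {
            type: "image_url",
            image_url: { url: `data:${req.mimeType};base64,${req.imageBase64}` },
          },
        ],
      },
    ],
  };
}

/** Send the request and return the first suggestion as plain text. */
async function requestAltText(req: AltTextRequest, apiKey: string): Promise<string> {
  const res = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify(buildChatRequestBody(req)),
  });
  if (!res.ok) throw new Error(`OpenAI request failed: ${res.status}`);
  const data = (await res.json()) as {
    choices: { message: { content: string | null } }[];
  };
  return (data.choices[0].message.content ?? "").trim();
}
```

Separating payload construction from the network call lets the prompt and language handling be checked without spending API credits.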
Impact:
AltVision has the potential to become a valuable tool for content creators, accessibility professionals, and UX designers:
Streamlines the process of identifying and fixing accessibility issues related to image descriptions.
Improves overall web accessibility by ensuring more images have accurate and contextual alt text.
Enhances the user experience for individuals relying on screen readers or other assistive technologies.
Offers potential for broader application across industries, with corporate OpenAI accounts as one route to scaling up.
Contributes to making the web more inclusive and accessible to all users.
AltVision demonstrates how AI technologies can be combined with accessibility initiatives to support more inclusive digital experiences.