Discord Bot that Generates Image Captions Using Gemma 3
Website Download Page for Discord Bot
For my HCDE 310 final project, I built an AI-powered Discord bot that writes meaningful captions for images shared in servers. Drawing on my Python and AI experience, I used Gemma 3, at the time the leading open-source model available through the Gemini API, to build a solution that was both fast and free to use. I wrote a prompt that follows academically accepted guidelines for image captioning, so the captions stay accessible while remaining engaging for users.
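To illustrate the prompt-engineering step, here is a minimal sketch of how a guideline-based captioning prompt might be assembled before it is sent to the model. The guideline wording, the list of image types, and the helper names `is_image` and `build_caption_prompt` are all illustrative assumptions, not the project's actual prompt or code, and the model call itself is left out.

```python
# Hypothetical guidelines; the project's real prompt is not shown in this write-up.
CAPTION_GUIDELINES = [
    "Describe the main subject and any text visible in the image.",
    "Keep the caption to one or two sentences.",
    "Do not start with 'Image of' or 'Picture of'.",
]

# Assumed set of attachment MIME types the bot would caption.
IMAGE_TYPES = {"image/png", "image/jpeg", "image/gif", "image/webp"}


def is_image(content_type):
    """Return True when an attachment's MIME type is a supported image type."""
    if not content_type:
        return False
    # Drop parameters such as "; charset=..." before comparing.
    return content_type.split(";")[0].strip().lower() in IMAGE_TYPES


def build_caption_prompt(guidelines=CAPTION_GUIDELINES):
    """Join the guidelines into a single instruction string for the model."""
    rules = "\n".join(f"- {g}" for g in guidelines)
    return "Write a caption for the attached image.\nFollow these guidelines:\n" + rules


if __name__ == "__main__":
    if is_image("image/png"):
        print(build_caption_prompt())
```

Keeping the guidelines in a list rather than one long string makes it easy to adjust a single rule without rewriting the whole prompt.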
To enhance the user experience, I expanded the bot's capabilities beyond image captioning. Users can also rewrite messages for better clarity, catch up on conversations they missed, and turn image captioning on or off to match the server’s preferences. I complemented the bot with a simple landing page that clearly communicates its features and gives server administrators an easy way to invite the bot to their communities. The project demonstrates how AI can be integrated into social platforms to improve online communication and accessibility.
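The per-server captioning toggle can be sketched as a small settings store keyed by server ID. This is a simplified illustration under my own assumptions: the class name `CaptionSettings`, the in-memory dictionary, and the default of captioning being enabled are hypothetical, and a deployed bot would likely persist these settings and connect `toggle` to a Discord command.

```python
class CaptionSettings:
    """In-memory store for whether captioning is enabled in each server."""

    def __init__(self, default_enabled=True):
        # Assumption: captioning is on until a server turns it off.
        self._default = default_enabled
        self._overrides = {}

    def is_enabled(self, guild_id):
        """Return the setting for a server, falling back to the default."""
        return self._overrides.get(guild_id, self._default)

    def toggle(self, guild_id):
        """Flip the setting for one server and return the new value."""
        new_value = not self.is_enabled(guild_id)
        self._overrides[guild_id] = new_value
        return new_value


if __name__ == "__main__":
    settings = CaptionSettings()
    settings.toggle(1234)  # hypothetical server ID
    print(settings.is_enabled(1234))
```

Storing only the servers that changed the setting keeps the default behavior in one place, so new servers need no setup step.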