The project is based on a prototype created in the first module of my 3rd year of the Creative Computing course. It is solely an application for smart devices that allows users to insert any image from their camera roll, which is then scanned by a trained machine learning model. The model recognises patterns, facial expressions, surroundings and postures; once objects are recognised, it generates captions that can be used in social media posts/stories.
As each picture will produce a large number of results, the user can filter by the specific category they want their caption to be linked to, such as funny, motivational, lyrics, etc.
The Aim of the Project:
More than 100 million photos are uploaded to Instagram each day. A question that often comes up when posting is “What should I write as a caption?”. The caption app aims to replace the process of looking up captions online or on other social media platforms: with one click, the user is provided with captions relevant to their photo.
Beneficiaries:
My intended audience consists of social media users who struggle to write original descriptions for their images, regardless of age, gender, or activity level. It connects to the CCI's social mission as follows:
- It is digitally inclusive – it provides space for all users to select the category of their captions, as well as a descriptive filter that identifies the picture’s features so that people with visual impairments can understand the image.
- It is a diverse technology – the app aims to support creativity; even the best influencers or content creators struggle to add a caption to their photos. The more entertaining the caption, the more social engagement individuals receive, and for some, social engagement is a form of currency.
- It provides an entrepreneurship opportunity – using ML analysis for product advertising and promotion is a planned future feature of the app, which could be highly advantageous for start-up enterprises with limited capital to invest in this area.
Project Scope:
Implementing a model that recognises every object in an image, and building such a large database of captions, would not have been possible within the project deadline, so I decided to base the app solely on facial expression/emotion recognition. Once a facial expression is recognised, the app shows all captions connected to that expression.
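A minimal Python sketch of how this link between recognised emotions and stored captions could work is shown below; the emotion labels and example captions are illustrative placeholders rather than the app’s actual data.

```python
# Sketch only: recognised emotion -> list of stored captions.
# Labels and captions are illustrative assumptions, not the real database.

CAPTIONS_BY_EMOTION = {
    "happy": [
        "Smiles are always in season.",
        "Good vibes only.",
    ],
    "sad": [
        "Some days are just plot twists.",
    ],
    "surprise": [
        "Didn't see that one coming.",
    ],
}

def captions_for(emotion: str) -> list[str]:
    """Return every caption linked to the recognised facial expression."""
    return CAPTIONS_BY_EMOTION.get(emotion.lower(), [])
```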
A. User-Flow

B. Features Outline
1. ML facial expression/emotion recognition
This feature allows the user to choose an image from their camera roll/gallery or take a photo instantly. The implemented pre-trained machine learning model then scans the image for their facial expression/emotion.
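The sketch below shows one way the recognition step could be wired up in Python, assuming a pre-trained Keras model for 48x48 greyscale faces (FER2013-style) and an OpenCV face detector; the model file name and the label order are assumptions, not the project’s actual model.

```python
# Sketch only: detect the largest face in the chosen photo and classify its emotion.
import cv2
import numpy as np
from tensorflow.keras.models import load_model

EMOTIONS = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]  # assumed label order
model = load_model("emotion_model.h5")  # assumed pre-trained model file
face_finder = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
)

def recognise_emotion(image_path: str) -> str | None:
    """Return the emotion label for the largest face found, or None if no face is detected."""
    grey = cv2.cvtColor(cv2.imread(image_path), cv2.COLOR_BGR2GRAY)
    faces = face_finder.detectMultiScale(grey, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None
    x, y, w, h = max(faces, key=lambda f: f[2] * f[3])  # largest detected face
    face = cv2.resize(grey[y:y + h, x:x + w], (48, 48)) / 255.0
    scores = model.predict(face.reshape(1, 48, 48, 1))
    return EMOTIONS[int(np.argmax(scores[0]))]
```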
2. Option to save/copy caption
This feature allows the user to save a caption to the saved menu, where they can go back and use it another time. It also allows the user to copy the caption directly to the clipboard and paste it into their social media posts, making it easier to transfer the caption.
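A simplified sketch of the save and copy actions is below; a local JSON file stands in for the app’s saved menu and the pyperclip package stands in for the device clipboard, both of which are assumptions for illustration only.

```python
# Sketch only: persist saved captions locally and copy a caption to the clipboard.
import json
from pathlib import Path

import pyperclip

SAVED_FILE = Path("saved_captions.json")  # assumed local store

def save_caption(caption: str) -> None:
    """Append the caption to the saved menu so it can be reused later."""
    saved = json.loads(SAVED_FILE.read_text()) if SAVED_FILE.exists() else []
    if caption not in saved:
        saved.append(caption)
        SAVED_FILE.write_text(json.dumps(saved, indent=2))

def copy_caption(caption: str) -> None:
    """Place the caption on the clipboard, ready to paste into a post."""
    pyperclip.copy(caption)
```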
3. Filter captions
This feature helps users categorise their captions depending on the style they want. For example, the categories currently included are: simple, funny, catchy, art, lyrics, quote, motivational, descriptive and sassy.
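A minimal sketch of the category filter is shown below, assuming each caption is stored with a set of category tags; the data structure and example entries are assumptions.

```python
# Sketch only: keep captions tagged with the chosen category.
CAPTIONS = [
    {"text": "Smiles are always in season.", "categories": {"simple", "catchy"}},
    {"text": "Living my best plot twist.", "categories": {"funny", "sassy"}},
]

def filter_captions(captions: list[dict], category: str) -> list[dict]:
    """Return only the captions tagged with the selected category."""
    return [c for c in captions if category in c["categories"]]
```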
4. Sort captions
This feature helps the user find the caption they are looking for by sorting the captions alphabetically or by length.
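A short sketch of the two sort orders described above, treating each caption as a plain string:

```python
# Sketch only: sort captions alphabetically (default) or by caption length.
def sort_captions(captions: list[str], by: str = "alphabetical") -> list[str]:
    """Sort alphabetically by default, or by length when by='length'."""
    key = len if by == "length" else str.lower
    return sorted(captions, key=key)
```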
5. Search caption by keyword
This feature allows the user to enter a keyword of their choice; the app then displays all captions containing that word.
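The keyword search could be as simple as a case-insensitive substring match, sketched below under that assumption:

```python
# Sketch only: return every caption containing the user's keyword.
def search_captions(captions: list[str], keyword: str) -> list[str]:
    """Match the keyword anywhere in the caption, ignoring case."""
    keyword = keyword.lower()
    return [c for c in captions if keyword in c.lower()]
```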