Project Description
Our team, NetMine, has two members: Dorna Heidari and Majid Zarrinkolah (Ario). We participated at Deakin University. We entered the challenge "Tagging photographic images: showcasing the magnificent history of Victoria" out of our love for both history and Victoria, and named the project VICVision.
Traditional tagging methods recognise individual objects within pictures, such as dogs and cats. Our approach instead uses AI to generate descriptive sentences as tags for each image, capturing the narrative of the visual content rather than isolated objects.
We coded this project in Python, with the AI components built in. This section focuses on project installation and on using a GPT-2-based model for image-to-text conversion to extract information from images, as sketched below.
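A minimal sketch of that image-to-text step, assuming the Hugging Face transformers library and the nlpconnect/vit-gpt2-image-captioning checkpoint (a ViT encoder paired with a GPT-2 decoder; the model actually used in the project may differ), could look like the following:

from PIL import Image
import torch
from transformers import VisionEncoderDecoderModel, ViTImageProcessor, AutoTokenizer

# Assumed checkpoint: a ViT encoder plus GPT-2 decoder trained for image captioning.
model_name = "nlpconnect/vit-gpt2-image-captioning"
model = VisionEncoderDecoderModel.from_pretrained(model_name)
processor = ViTImageProcessor.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

def caption_image(path: str) -> str:
    # Turn one photograph into a descriptive sentence that can serve as its tag.
    image = Image.open(path).convert("RGB")
    pixel_values = processor(images=image, return_tensors="pt").pixel_values
    with torch.no_grad():
        output_ids = model.generate(pixel_values, max_length=32, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

print(caption_image("victoria_photo.jpg"))  # hypothetical input file

Running this over a folder of photographs yields one descriptive sentence per image, which is the tag we attach to it.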
For instance, one generated image description reads: 'A person is cutting paper with a drawing board.' A few other output examples are shown below.

Looking ahead, potential enhancements include integrating a chatbot for user interaction, or replacing it with a search engine backed by a search indexer. We can also leverage generative AI to create additional artificial tags from the same context. To illustrate, we have provided sample code employing Llama 2, which takes the information extracted from the images and generates further tags; these generated tags can then be fed into our search indexer for indexing (see the sketch below).
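A minimal sketch of that tag-expansion idea (not the project's actual script), assuming the gated meta-llama/Llama-2-7b-chat-hf checkpoint via the Hugging Face transformers pipeline and a hypothetical expand_tags helper, might look like this:

from transformers import pipeline

# Assumed checkpoint: Llama 2 chat weights are gated and require accepting
# Meta's licence on Hugging Face before they can be downloaded.
generator = pipeline("text-generation", model="meta-llama/Llama-2-7b-chat-hf")

def expand_tags(caption: str) -> list[str]:
    # Hypothetical helper: ask the model for extra keyword tags based on a caption.
    prompt = (
        "Given this image description, list five short keyword tags, "
        f"comma-separated.\nDescription: {caption}\nTags:"
    )
    text = generator(prompt, max_new_tokens=40, do_sample=False)[0]["generated_text"]
    tags = text.split("Tags:")[-1]
    return [t.strip() for t in tags.split(",") if t.strip()]

caption = "A person is cutting paper with a drawing board."
print(expand_tags(caption))

Each returned tag could then be added to the search index alongside the original caption, broadening what a search over the collection can match.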