Generating Human-Like Descriptions for the Given Image Using Deep Learning

Open Access

Issue		ITM Web Conf. Volume 53, 2023 2nd International Conference on Data Science and Intelligent Applications (ICDSIA-2023)


Article Number		02001
Number of page(s)		13
Section		Machine Learning / Deep Learning
DOI		https://doi.org/10.1051/itmconf/20235302001
Published online		01 June 2023

