Natural language image descriptor

Anurag Kishore, Sanjay Singh

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    2 Citations (Scopus)

    Abstract

    Automatically generating descriptions for visual data (images and video) has been a difficult task in computer vision and artificial intelligence. This paper discusses the workings of, and improvements to, the Neural Image Captioner (NIC) algorithm by Oriol Vinyals and his team, which uses a deep convolutional and recurrent architecture to generate natural language sentences describing the visual input. We examine whether the algorithm can be trained faster without losing accuracy by using techniques such as stochastic gradient descent, and we employ an algorithm to find the best depth of the convolutional part of the network for different datasets. We observed a 33% drop in the number of iterations required to bring the algorithm to the proficiency originally reported by Vinyals et al.
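
The abstract names stochastic gradient descent as one of the techniques used to speed up training. As a rough illustration of that general technique only, and not of the authors' NIC training code, the following minimal sketch fits a toy linear model with minibatch updates. The function name `sgd_linear_fit`, the toy data, and all hyperparameters are assumptions made for this example.

```python
import random


def sgd_linear_fit(xs, ys, lr=0.05, epochs=300, batch_size=4, seed=0):
    """Fit y ~ w * x + b with minibatch stochastic gradient descent.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        # Visit the samples in a fresh random order each epoch.
        rng.shuffle(indices)
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            # Gradient of the mean squared error over this minibatch only.
            grad_w = sum(2 * (w * xs[i] + b - ys[i]) * xs[i] for i in batch) / len(batch)
            grad_b = sum(2 * (w * xs[i] + b - ys[i]) for i in batch) / len(batch)
            # Step against the gradient, scaled by the learning rate.
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b


# Toy, noise-free data generated from y = 3x + 1 (assumed for the example).
xs = [i / 10 for i in range(20)]
ys = [3.0 * x + 1.0 for x in xs]
w, b = sgd_linear_fit(xs, ys)
print(f"w = {w:.3f}, b = {b:.3f}")
```

Each update uses only a small batch of samples rather than the full dataset, which is what makes every iteration cheap. In a captioning model the same update rule would be applied to the network weights instead of the two scalars `w` and `b`.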

    Original language: English
    Title of host publication: 2015 IEEE Recent Advances in Intelligent Computational Systems, RAICS 2015
    Publisher: Institute of Electrical and Electronics Engineers Inc.
    Pages: 110-115
    Number of pages: 6
    ISBN (Electronic): 9781467366700
    Publication status: Published - 09-06-2016
    Event: 2015 IEEE Recent Advances in Intelligent Computational Systems, RAICS 2015 - Trivandrum, Kerala, India
    Duration: 10-12-2015 to 12-12-2015

    Publication series

    Name: 2015 IEEE Recent Advances in Intelligent Computational Systems, RAICS 2015

    Conference

    Conference: 2015 IEEE Recent Advances in Intelligent Computational Systems, RAICS 2015
    Country/Territory: India
    City: Trivandrum, Kerala
    Period: 10-12-15 to 12-12-15

    All Science Journal Classification (ASJC) codes

    • Artificial Intelligence
    • Computer Science Applications
    • Control and Systems Engineering
