Abstract
To address the problems of misprediction and missing objects in the semantic description of images, an improved Neural Image Caption (I-NIC) model is proposed. It consists primarily of a Convolutional Neural Network (CNN) and a Recurrent Neural Network (RNN). The model uses the Inception-v4 model developed by Google to extract image features and iteratively optimizes the training parameters through a word-based loss function. As a result, the I-NIC model generates more relevant descriptions and improves the accuracy and efficiency of the system. Compared with the NIC model, the experimental results show that the accuracy of the I-NIC model is improved by 2.5% on the BLEU-4 metric, 1.2% on the METEOR metric and 7.5% on the CIDEr metric on the Microsoft COCO Caption dataset.
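The architecture the abstract describes is the standard CNN-encoder / RNN-decoder captioning setup: a pretrained CNN supplies an image feature vector, an LSTM decoder generates the caption word by word, and training minimizes a per-word cross-entropy loss. The sketch below is a minimal PyTorch illustration of that setup under stated assumptions, not the paper's implementation; it assumes Inception-v4 pooled features (1536-dimensional) have already been extracted, and the class name `CaptionDecoder`, the hidden sizes, and the toy vocabulary are all illustrative.

```python
import torch
import torch.nn as nn

class CaptionDecoder(nn.Module):
    """NIC-style decoder sketch: the CNN image feature is projected into the
    embedding space and fed to the LSTM as the first step, after which the
    caption is predicted one word at a time (teacher forcing at train time)."""

    def __init__(self, feature_dim=1536, embed_dim=512, hidden_dim=512, vocab_size=10000):
        super().__init__()
        self.feature_proj = nn.Linear(feature_dim, embed_dim)  # image feature -> "visual token"
        self.embedding = nn.Embedding(vocab_size, embed_dim)   # word ids -> embeddings
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.output = nn.Linear(hidden_dim, vocab_size)        # hidden state -> word logits

    def forward(self, image_features, caption_in):
        # image_features: (batch, feature_dim), e.g. pooled Inception-v4 features
        # caption_in: (batch, L) token ids [<BOS>, w1, ..., wT]
        img_token = self.feature_proj(image_features).unsqueeze(1)   # (batch, 1, embed_dim)
        word_emb = self.embedding(caption_in)                        # (batch, L, embed_dim)
        hidden, _ = self.lstm(torch.cat([img_token, word_emb], dim=1))
        # Drop the image step so each remaining step predicts the next word.
        return self.output(hidden[:, 1:])                            # (batch, L, vocab_size)

# Toy usage with random tensors (real features would come from Inception-v4,
# real captions from the Microsoft COCO Caption dataset).
vocab_size = 10000
model = CaptionDecoder(vocab_size=vocab_size)
features = torch.randn(4, 1536)
tokens = torch.randint(4, vocab_size, (4, 13))        # captions incl. <BOS>/<EOS>
caption_in, targets = tokens[:, :-1], tokens[:, 1:]   # shift by one for teacher forcing
logits = model(features, caption_in)

# Word-based loss: cross-entropy computed for every predicted word position,
# which is the kind of per-word objective the abstract refers to.
loss = nn.CrossEntropyLoss()(logits.reshape(-1, vocab_size), targets.reshape(-1))
```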
Original language | English |
---|---|
Publication status | Published - 2018 |
Event | 8th International Symposium on Computational Intelligence and Industrial Applications and 12th China-Japan International Workshop on Information Technology and Control Applications, ISCIIA and ITCA 2018 - Tengzhou, Shandong, China Duration: 2 Nov 2018 → 6 Nov 2018 |
Conference
Conference | 8th International Symposium on Computational Intelligence and Industrial Applications and 12th China-Japan International Workshop on Information Technology and Control Applications, ISCIIA and ITCA 2018 |
---|---|
Country/Territory | China |
City | Tengzhou, Shandong |
Period | 2/11/18 → 6/11/18 |
Keywords
- Convolutional Neural Network
- Long Short-Term Memory
- Neural Networks
- Semantic Description of Image