Image Aesthetic Description Based on Semantic Addition Transformer Model

Kai Wang, Shasha Lv, Yongzhen Ke, Jing Guo, Ruikun Wang

Source Title: International Journal of Cognitive Informatics and Natural Intelligence (IJCINI) 15(4)

ISSN: 1557-3958 | EISSN: 1557-3966 | EISBN13: 9781799859857 | DOI: 10.4018/IJCINI.20211001.oa14

MLA

Wang, Kai, et al. "Image Aesthetic Description Based on Semantic Addition Transformer Model." IJCINI, vol. 15, no. 4, 2021, pp. 1-14. http://doi.org/10.4018/IJCINI.20211001.oa14

APA

Wang, K., Lv, S., Ke, Y., Guo, J., & Wang, R. (2021). Image Aesthetic Description Based on Semantic Addition Transformer Model. International Journal of Cognitive Informatics and Natural Intelligence (IJCINI), 15(4), 1-14. http://doi.org/10.4018/IJCINI.20211001.oa14

Chicago

Wang, Kai, et al. "Image Aesthetic Description Based on Semantic Addition Transformer Model," International Journal of Cognitive Informatics and Natural Intelligence (IJCINI) 15, no. 4 (2021): 1-14. http://doi.org/10.4018/IJCINI.20211001.oa14

Abstract

Image aesthetic quality assessment has been an active research topic in image analysis over the last decade. More recently, researchers have proposed comment-type assessment, which automatically describes the aesthetics of an image in text. However, existing work has rarely considered the quality of the aesthetic description itself. In this work, we propose a novel neural image aesthetic description framework, named Deep Image Aesthetic Reviewer (DIAReviewer), which brings together a Semantic Addition Transformer model, residual network learning, and an attention mechanism in a single framework. In addition, we design a Semantic Addition module that combines image features with semantic information so that the model focuses on comment quality, such as fluency and complexity. We also introduce a new image dataset, the Aesthetic Review Dataset (ARD), which contains one or more aesthetic comments for each image. Experimental results on ARD show that our model outperforms other methods in the content complexity and sentence fluency of its aesthetic descriptions.