
Multimodal Sentiment Analysis Model Based on CLIP and Cross-attention
[1] CHEN Yan, LAI Yubin, XIAO Ao, et al. Multimodal Sentiment Analysis Model Based on CLIP and Cross-attention[J]. Journal of Zhengzhou University (Engineering Science), 2024, 45(02): 42-50. [doi:10.13705/j.issn.1671-6833.2024.02.003]
References:
[1] PANG B, LEE L, VAITHYANATHAN S. Thumbs up? Sentiment classification using machine learning techniques[C]//Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing (EMNLP 2002). Stroudsburg: ACL, 2002: 79-86.
[2] ZHANG L, LIU B. Sentiment analysis and opinion mining[EB/OL]. (2015-12-31)[2023-04-24]. https://doi.org/10.1007/978-1-4899-7502-7_907-2.
[3] LI Y, JIN Q Y, ZHANG Q C. Improved BLSTM food review sentiment analysis with positional attention mechanisms[J]. Journal of Zhengzhou University (Engineering Science), 2020, 41(1): 58-62. (in Chinese)
[4] MUNIKAR M, SHAKYA S, SHRESTHA A. Fine-grained sentiment classification using BERT[EB/OL]. (2019-10-04)[2023-04-24]. https://arxiv.org/abs/1910.03474.
[5] ZHU X G, LI L, ZHANG W, et al. Dependency exploitation: a unified CNN-RNN approach for visual emotion recognition[C]//Proceedings of the 26th International Joint Conference on Artificial Intelligence. New York: ACM, 2017: 3595-3601.
[6] YOU Q Z, JIN H L, LUO J B. Visual sentiment analysis by attending on local image regions[C]//Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence. New York: ACM, 2017: 231-237.
[7] WANG H H, MEGHAWAT A, MORENCY L P, et al. Select-additive learning: improving generalization in multimodal sentiment analysis[C]//2017 IEEE International Conference on Multimedia and Expo (ICME). Piscataway: IEEE, 2017: 949-954.
[8] WU S S, MA J. Multi-task & multi-modal sentiment analysis model based on aware fusion[J]. Data Analysis and Knowledge Discovery, 2023(10): 74-84. (in Chinese)
[9] RADFORD A, KIM J W, HALLACY C, et al. Learning transferable visual models from natural language supervision[EB/OL]. (2021-02-26)[2023-04-24]. https://arxiv.org/abs/2103.00020.
[10] LAI Y B, CHEN Y, HU X C, et al. Emotional analysis of public health emergency micro-blog based on prompt embedding[J]. Data Analysis and Knowledge Discovery, 2023, 7(11): 46-55. (in Chinese)
[11] YU W M, XU H, MENG F Y, et al. CH-SIMS: a Chinese multimodal sentiment analysis dataset with fine-grained annotation of modality[C]//Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Stroudsburg: ACL, 2020: 3718-3727.
[12] ZADEH A, CHEN M H, PORIA S, et al. Tensor fusion network for multimodal sentiment analysis[EB/OL]. (2017-07-23)[2023-04-24]. https://doi.org/10.48550/arXiv.1707.07250.
[13] LIU Z, SHEN Y, LAKSHMINARASIMHAN V B, et al. Efficient low-rank multimodal fusion with modality-specific factors[C]//Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. Stroudsburg: ACL, 2018: 2247-2256.
[14] TSAI Y H H, BAI S J, LIANG P P, et al. Multimodal Transformer for unaligned multimodal language sequences[C]//Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Stroudsburg: ACL, 2019: 6558-6569.
[15] YU W M, XU H, YUAN Z Q, et al. Learning modality-specific representations with self-supervised multi-task learning for multimodal sentiment analysis[C]//Proceedings of the AAAI Conference on Artificial Intelligence. Palo Alto: AAAI, 2021: 10790-10797.
[16] SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[EB/OL]. (2014-09-04)[2023-04-24]. https://arxiv.org/abs/1409.1556.
[17] HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE, 2016: 770-778.
[18] LIU Z, MAO H Z, WU C Y, et al. A ConvNet for the 2020s[C]//2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE, 2022: 11966-11976.
[19] BALTRUSAITIS T, ZADEH A, LIM Y C, et al. OpenFace 2.0: facial behavior analysis toolkit[C]//2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018). Piscataway: IEEE, 2018: 59-66.
[20] DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16x16 words: transformers for image recognition at scale[EB/OL]. (2020-10-22)[2023-04-24]. https://arxiv.org/abs/2010.11929.
[21] DESAI S, RAMASWAMY H G. Ablation-CAM: visual explanations for deep convolutional network via gradient-free localization[C]//2020 IEEE Winter Conference on Applications of Computer Vision (WACV). Piscataway: IEEE, 2020: 972-980.
[22] LAN Z Z, CHEN M D, GOODMAN S, et al. ALBERT: a lite BERT for self-supervised learning of language representations[EB/OL]. (2019-09-26)[2023-04-24]. https://arxiv.org/abs/1909.11942.
[23] DEVLIN J, CHANG M W, LEE K, et al. BERT: pre-training of deep bidirectional transformers for language understanding[EB/OL]. (2018-11-11)[2023-04-24]. https://doi.org/10.48550/arXiv.1810.04805.
[24] SUN Y, WANG S H, LI Y K, et al. ERNIE: enhanced representation through knowledge integration[EB/OL]. (2019-04-19)[2023-04-24]. https://doi.org/10.48550/arXiv.1904.09223.
[25] CUI Y M, CHE W X, LIU T, et al. Revisiting pre-trained models for Chinese natural language processing[EB/OL]. (2020-04-29)[2023-04-24]. https://doi.org/10.48550/arXiv.2004.13922.
[26] LIU Y H, OTT M, GOYAL N, et al. RoBERTa: a robustly optimized BERT pretraining approach[EB/OL]. (2019-07-26)[2023-04-24]. https://doi.org/10.48550/arXiv.1907.11692.
[27] LUO H S, JI L, ZHONG M, et al. CLIP4Clip: an empirical study of CLIP for end to end video clip retrieval and captioning[J]. Neurocomputing, 2022, 508.
