A Review of Multimodal Medical Image Classification and Cancer Diagnosis
 
[1] WANG Kongyuan, BI Ying, GUO Weifeng, et al. A Review of Multimodal Medical Image Classification and Cancer Diagnosis[J]. Journal of Zhengzhou University (Engineering Science), 2027, 48(XX): 1-10. [doi:10.13705/j.issn.1671-6833.2026.06.005]
References:
[1] Bray F, Laversanne M, Sung H, et al. Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries[J]. CA: A Cancer Journal for Clinicians, 2024, 74(3): 229-263.
[2] Han Bingfeng, Zheng Rongshou, Zeng Hongmei, et al. Cancer incidence and mortality in China, 2022[J]. Journal of the National Cancer Center, 2024, 4(1): 47-53.
[3] World Health Organization. Assessing national capacity for the prevention and control of noncommunicable diseases: report of the 2021 global survey[M]. World Health Organization, 2023.
[4] Cao Guangwen. Cancer in China: epidemiological characteristics, current prophylaxis and treatment, and future strategy[J]. Academic Journal of Naval Medical University, 2025, 46(3): 279-290. [曹广文. 我国癌症的流行特点、防控现状及未来应对策略[J]. 海军军医大学学报, 2025, 46(3): 279-290.]
[5] Xia J Y, Aadam A A. Advances in screening and detection of gastric cancer[J]. Journal of Surgical Oncology, 2022, 125(7): 1104-1109.
[6] Dey N, Bhateja V, Hassanien A E. Medical imaging in clinical applications: algorithmic and computer-based approaches[M]. Cham: Springer International Publishing, 2016.
[7] Anwar S M, Majid M, Qayyum A, et al. Medical image analysis using convolutional neural networks: a review[J]. Journal of Medical Systems, 2018, 42(11): 226.
[8] Gao Zixian, Jiang Xun, Xu Xing, et al. Embracing unimodal aleatoric uncertainty for robust multimodal fusion[C]∥Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE, 2024: 26866-26875.
[9] Rao V M, Hla M, Moor M, et al. Multimodal generative AI for medical image interpretation[J]. Nature, 2025, 639(8056): 888-896.
[10] Azam M A, Khan K B, Salahuddin S, et al. A review on multimodal medical image fusion: compendious analysis of medical modalities, multimodal databases, fusion techniques and quality metrics[J]. Computers in Biology and Medicine, 2022, 144: 105253.
[11] Li Yihao, El Habib Daho M, Conze P H, et al. A review of deep learning-based information fusion techniques for multimodal medical image classification[J]. Computers in Biology and Medicine, 2024, 177: 108635.
[12] Xing Xiaodan, Wu Huanjun, Wang Lichao, et al. Non-imaging medical data synthesis for trustworthy AI: a comprehensive survey[J]. ACM Computing Surveys, 2024, 56(7): 1-35.
[13] Ronneberger O, Fischer P, Brox T. U-Net: convolutional networks for biomedical image segmentation[C]∥Medical image computing and computer-assisted intervention-MICCAI 2015. Cham: Springer International Publishing, 2015: 234-241.
[14] Hasanah U, Avian C, Darmawan J T, et al. CheXNet and feature pyramid network: a fusion deep learning architecture for multilabel chest X-Ray clinical diagnoses classification[J]. The International Journal of Cardiovascular Imaging, 2024, 40(4): 709-722.
[15] Ehteshami Bejnordi B, Veta M, Johannes van Diest P, et al. Diagnostic assessment of deep learning algorithms for detection of lymph node metastases in women with breast cancer[J]. JAMA, 2017, 318(22): 2199.
[16] Heinrich M P, Jenkinson M, Bhushan M, et al. MIND: modality independent neighbourhood descriptor for multi-modal deformable registration[J]. Medical Image Analysis, 2012, 16(7): 1423-1435.
[17] Zhu Junyan, Park T, Isola P, et al. Unpaired image-to-image translation using cycle-consistent adversarial networks[C]∥Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV). Piscataway: IEEE, 2017: 2242-2251.
[18] Isola P, Zhu Junyan, Zhou Tinghui, et al. Image-to-image translation with conditional adversarial networks[C]∥Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE, 2017: 5967-5976.
[19] Dalmaz O, Yurt M, Cukur T. ResViT: residual vision transformers for multimodal medical image synthesis[J]. IEEE Transactions on Medical Imaging, 2022, 41(10): 2598-2614.
[20] Kazerouni A, Aghdam E K, Heidari M, et al. Diffusion models in medical imaging: a comprehensive survey[J]. Medical Image Analysis, 2023, 88: 102846.
[21] Zaitsev M, MacLaren J, Herbst M. Motion artifacts in MRI: a complex problem with many partial solutions[J]. Journal of Magnetic Resonance Imaging, 2015, 42(4): 887-901.
[22] Zhang Kai, Zuo Wangmeng, Chen Yunjin, et al. Beyond a Gaussian denoiser: residual learning of deep CNN for image denoising[J]. IEEE Transactions on Image Processing, 2017, 26(7): 3142-3155.
[23] Valindria V V, Pawlowski N, Rajchl M, et al. Multi-modal learning from unpaired images: application to multi-organ segmentation in CT and MRI[C]∥Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision (WACV). Piscataway: IEEE, 2018: 547-556.
[24] Lin Yusong, Li Mengya, Li Yinghao, et al. Multimodal medical image fusion based on GAN and multiscale spatial attention[J]. Journal of Zhengzhou University (Engineering Science), 2025, 46(1): 1-8. [林予松, 李孟娅, 李英豪, 等. 基于GAN和多尺度空间注意力的多模态医学图像融合[J]. 郑州大学学报(工学版), 2025, 46(1): 1-8.]
[25] Chen R J, Lu M Y, Wang Jingwen, et al. Pathomic fusion: an integrated framework for fusing histopathology and genomic features for cancer diagnosis and prognosis[J]. IEEE Transactions on Medical Imaging, 2022, 41(4): 757-770.
[26] Chen Junyu, Liu Yihao, Wei Shuwen, et al. A survey on deep learning in medical image registration: new technologies, uncertainty, evaluation metrics, and beyond[J]. Medical Image Analysis, 2025, 100: 103385.
[27] Jia Gengyun, Huang Huaibo, Fu Chaoyou, et al. Rethinking image cropping: exploring diverse compositions from global views[C]∥Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE, 2022: 2436-2445.
[28] Wang Zhiwei, Liu Chaoyue, Cheng Danpeng, et al. Automated detection of clinically significant prostate cancer in mp-MRI images based on an end-to-end deep neural network[J]. IEEE Transactions on Medical Imaging, 2018, 37(5): 1127-1139.
[29] Liu Zhonghua, Zhu Fa, Vasilakos A V, et al. Discriminative approximate regression projection for feature extraction[J]. Information Fusion, 2025, 120: 103088.
[30] Bai Yunping, Xu Yifu, Chen Shifan, et al. TOPS-speed complex-valued convolutional accelerator for feature extraction and inference[J]. Nature Communications, 2025, 16: 292.
[31] Elharrouss O, Himeur Y, Mahmood Y, et al. ViTs as backbones: leveraging vision transformers for feature extraction[J]. Information Fusion, 2025, 118: 102951.
[32] Yang Boquan, Li Jixiong, Zeng Ting. A review of environmental perception technology based on multi-sensor information fusion in autonomous driving[J]. World Electric Vehicle Journal, 2025, 16(1): 20.
[33] He Man, Han Kangfu, Zhang Yu, et al. Hierarchical-order multimodal interaction fusion network for grading gliomas[J]. Physics in Medicine & Biology, 2021, 66(21): 215016.
[34] Zhang Pengfei, Li Tianrui, Yuan Zhong, et al. A data-level fusion model for unsupervised attribute selection in multi-source homogeneous data[J]. Information Fusion, 2022, 80: 87-103.
[35] Ranipa K, Zhu Weiping, Swamy M N S. A novel feature-level fusion scheme with multimodal attention CNN for heart sound classification[J]. Computer Methods and Programs in Biomedicine, 2024, 248: 108122.
[36] Ma Dong, Liu Zhihao, Gao Qinhe, et al. Few-shot fault diagnosis of EHA based on MTF-ResNet-MA and dual-attribute adaptive decision-level fusion[J]. Measurement, 2025, 247: 116787.
[37] Burt P J, Adelson E H. The Laplacian pyramid as a compact image code[M]∥Readings in Computer Vision. Elsevier, 1987: 671-679.
[38] Choi M, Kim R Y, Nam M R, et al. Fusion of multispectral and panchromatic satellite images using the curvelet transform[J]. IEEE Geoscience and Remote Sensing Letters, 2005, 2(2): 136-140.
[39] Yang Bin, Li Shutao. Multifocus image fusion and restoration with sparse representation[J]. IEEE Transactions on Instrumentation and Measurement, 2010, 59(4): 884-892.
[40] Li Shutao, Kang Xudong, Hu Jianwen. Image fusion with guided filtering[J]. IEEE Transactions on Image Processing, 2013, 22(7): 2864-2875.
[41] Mitianoudis N, Stathaki T. Pixel-based and region-based image fusion schemes using ICA bases[J]. Information Fusion, 2007, 8(2): 131-142.
[42] Freund Y, Schapire R E. A decision-theoretic generalization of on-line learning and an application to boosting[J]. Journal of Computer and System Sciences, 1997, 55(1): 119-139.
[43] Cai Jiati, Yin Jin, Zhou Fan, et al. Research on development trends of multimodal fusion for medical image classification[J]. Chinese Journal of Bases and Clinics in General Surgery, 2025, 32(7): 793-800. [蔡佳倜, 殷晋, 周帆, 等. 面向医学影像图像分类:基于深度学习的多模态融合发展趋势[J]. 中国普外基础与临床杂志, 2025, 32(7): 793-800.]
[44] Li Hui, Wu Xiaojun. DenseFuse: a fusion approach to infrared and visible images[J]. IEEE Transactions on Image Processing, 2019, 28(5): 2614-2623.
[45] Liang Nannan. Medical image fusion with deep neural networks[J]. Scientific Reports, 2024, 14: 7972.
[46] Chen Wei, Li Qixuan, Zhang Heng, et al. MR-CT image fusion method of intracranial tumors based on Res2Net[J]. BMC Medical Imaging, 2024, 24: 169.
[47] Kamnitsas K, Ledig C, Newcombe V F J, et al. Efficient multi-scale 3D CNN with fully connected CRF for accurate brain lesion segmentation[J]. Medical Image Analysis, 2017, 36: 61-78.
[48] Isensee F, Jaeger P F, Kohl S A A, et al. nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation[J]. Nature Methods, 2021, 18(2): 203-211.
[49] Albekairi M, Mohamed M V O, Kaaniche K, et al. Multimodal medical image fusion combining saliency perception and generative adversarial network[J]. Scientific Reports, 2025, 15: 10609.
[50] Chen R J, Lu M Y, Weng W H, et al. Multimodal co-attention transformer for survival prediction in gigapixel whole slide images[C]∥Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV). Piscataway: IEEE, 2021: 3995-4005.
[51] Dar S U, Yurt M, Karacan L, et al. Image synthesis in multi-contrast MRI with conditional generative adversarial networks[J]. IEEE Transactions on Medical Imaging, 2019, 38(10): 2375-2388.
[52] Yang Guang, Yu Simiao, Dong Hao, et al. DAGAN: deep de-aliasing generative adversarial networks for fast compressed sensing MRI reconstruction[J]. IEEE Transactions on Medical Imaging, 2018, 37(6): 1310-1321.
[53] Shen Pengcheng, Yang Zheyu, Sun Jingjing, et al. Explainable multimodal deep learning for predicting thyroid cancer lateral lymph node metastasis using ultrasound imaging[J]. Nature Communications, 2025, 16: 7052.
[54] Hu Can, Xia Yingda, Zheng Zhilin, et al. AI-based large-scale screening of gastric cancer from noncontrast CT imaging[J]. Nature Medicine, 2025, 31(9): 3011-3019.
[55] Luo Jia, Vanguri R S, Aukerman A T, et al. Multimodal integration of radiology, pathology, and genomics for prediction of response to PD-1 blockade in patients with non-small cell lung cancer[J]. Journal of Clinical Oncology, 2022, 40(16_suppl): 9064.
[56] Li Chengyi, Chang K J, Yang Chengfu, et al. Towards a holistic framework for multimodal LLM in 3D brain CT radiology report generation[J]. Nature Communications, 2025, 16: 2258.
[57] Qian Xuejun, Pei Jing, Han Chunguang, et al. A multimodal machine learning model for the stratification of breast cancer risk[J]. Nature Biomedical Engineering, 2025, 9(3): 356-370.
[58] Yan Siyuan, Yu Zhen, Primiero C, et al. A multimodal vision foundation model for clinical dermatology[J]. Nature Medicine, 2025, 31(8): 2691-2702.
[59] Menze B H, Jakab A, Bauer S, et al. The multimodal brain tumor image segmentation benchmark (BRATS)[J]. IEEE Transactions on Medical Imaging, 2015, 34(10): 1993-2024.
[60] de Vent N R, Agelink van Rentergem J A, Schmand B A, et al. Advanced neuropsychological diagnostics infrastructure (ANDI): a normative database created from control datasets[J]. Frontiers in Psychology, 2016, 7: 1601.
[61] Rossi G, Barabino E, Fedeli A, et al. Radiomic detection of EGFR mutations in NSCLC[J]. Cancer Research, 2021, 81(3): 724-731.
[62] Li Hui, Zhu Yitan, Burnside E S, et al. Quantitative MRI radiomics in the prediction of molecular classifications of breast cancer subtypes in the TCGA/TCIA dataset[J]. npj Breast Cancer, 2016, 2: 16012.
Last Update: 2026-06-03
Copyright © 1980 Editorial Board of Journal of Zhengzhou University (Engineering Science)
Email: gxb@zzu.edu.cn; Tel: 0371-67781276, 0371-67781277
Address: No. 100 Science Avenue, 100, Zhengzhou 450001, China