Video Frame Prediction Model Based on Gated Spatio-Temporal Attention
[1] LI Weijun, ZHANG Xinyong, GAO Yuxiao, et al. Video Frame Prediction Model Based on Gated Spatio-Temporal Attention[J]. Journal of Zhengzhou University (Engineering Science), 2024, 45(01): 70-77. [doi:10.13705/j.issn.1671-6833.2024.01.017]
References:
[1] DAI K, LI X T, YE Y M, et al. MSTCGAN: multiscale time conditional generative adversarial network for long-term satellite image sequence prediction[J]. IEEE Transactions on Geoscience and Remote Sensing, 2022, 60: 1-16.
[2] TAN C, LI S Y, GAO Z Y, et al. OpenSTL: a comprehensive benchmark of spatio-temporal predictive learning[EB/OL]. (2023-06-20)[2023-07-20]. https://arxiv.org/abs/2306.11249.
[3] SRIVASTAVA N, MANSIMOV E, SALAKHUTDINOV R. Unsupervised learning of video representations using LSTMs[EB/OL]. (2016-01-04)[2023-07-20]. https://arxiv.org/abs/1502.04681.
[4] SHI X J, CHEN Z R, WANG H, et al. Convolutional LSTM network: a machine learning approach for precipitation nowcasting[EB/OL]. (2015-09-19)[2023-07-20]. https://arxiv.org/abs/1506.04214.
[5] WANG Y B, LONG M S, WANG J M, et al. PredRNN: recurrent neural networks for predictive learning using spatiotemporal LSTMs[C]//NIPS′17: Proceedings of the 31st International Conference on Neural Information Processing Systems. Cham: Springer, 2017: 879-888.
[6] WANG Y B, GAO Z F, LONG M S, et al. PredRNN++: towards a resolution of the deep-in-time dilemma in spatiotemporal predictive learning[EB/OL]. (2018-11-19)[2023-07-20]. https://arxiv.org/abs/1804.06300.
[7] LIU Z W, YEH R A, TANG X O, et al. Video frame synthesis using deep voxel flow[C]//2017 IEEE International Conference on Computer Vision (ICCV). Piscataway: IEEE, 2017: 4473-4481.
[8] AIGNER S, KÖRNER M. FutureGAN: anticipating the future frames of video sequences using spatio-temporal 3d convolutions in progressively growing GANs[EB/OL]. (2018-11-26)[2023-07-20]. https://arxiv.org/abs/1810.01325.
[9] YE X, BILODEAU G A. VPTR: efficient transformers for video prediction[C]//2022 26th International Conference on Pattern Recognition (ICPR). Piscataway: IEEE, 2022: 3492-3499.
[10] GAO Z Y, TAN C, WU L R, et al. SimVP: simpler yet better video prediction[C]//2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE, 2022: 3160-3170.
[11] GUO M H, LU C Z, HOU Q B, et al. SegNeXt: rethinking convolutional attention design for semantic segmentation[EB/OL]. (2022-09-18)[2023-07-20]. https://arxiv.org/abs/2209.08575.
[12] HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 7132-7141.
[13] WANG Y B, ZHANG J J, ZHU H Y, et al. Memory in memory: a predictive neural network for learning higher-order non-stationarity from spatiotemporal dynamics[C]//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE, 2020: 9146-9154.
[14] LOTTER W, KREIMAN G, COX D. Deep predictive coding networks for video prediction and unsupervised learning[EB/OL]. (2017-05-01)[2023-07-20]. https://arxiv.org/abs/1605.08104.
[15] GUEN V L, THOME N. Disentangling physical dynamics from unknown factors for unsupervised video prediction[C]//2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE, 2020: 11471-11481.
[16] PAN T, JIANG Z Q, HAN J N, et al. Taylor saves for later: disentanglement for video prediction using Taylor representation[J]. Neurocomputing, 2022, 472: 166-174.
[17] SUN F, BAI C, SONG Y, et al. MMINR: multi-frame-to-multi-frame inference with noise resistance for precipitation nowcasting with radar[C]//2022 26th International Conference on Pattern Recognition (ICPR). Piscataway: IEEE, 2022: 97-103.
[18] NING S L, LAN M C, LI Y R, et al. MIMO is all you need: a strong multi-in-multi-out baseline for video prediction[EB/OL]. (2023-05-30)[2023-07-20]. https://arxiv.org/abs/2212.04655.
[19] TAN C, GAO Z Y, LI S Y, et al. SimVP: towards simple yet powerful spatiotemporal predictive learning[EB/OL]. (2023-04-26)[2023-07-20]. https://arxiv.org/abs/2211.12509.
[20] SMITH L N, TOPIN N. Super-convergence: very fast training of neural networks using large learning rates[EB/OL]. (2017-08-23)[2023-07-20]. https://arxiv.org/abs/1708.07120v1.
[21] CHANG Z, ZHANG X F, WANG S S, et al. MAU: a motion-aware unit for video prediction and beyond[C]//35th Conference on Neural Information Processing Systems. Sydney: NeurIPS, 2021: 1-13.
[22] ZHANG J B, ZHENG Y, QI D K. Deep spatio-temporal residual networks for citywide crowd flows prediction[C]//Proceedings of the 31st AAAI Conference on Artificial Intelligence. New York: ACM, 2017: 1655-1661.
[23] TAN C, GAO Z Y, WU L R, et al. Temporal attention unit: towards efficient spatiotemporal predictive learning[C]//2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE, 2023: 18770-18782.
[24] RASP S, DUEBEN P D, SCHER S, et al. WeatherBench: a benchmark data set for data-driven weather forecasting[J]. Journal of Advances in Modeling Earth Systems, 2020, 12(11): 1-17.
[25] DING X H, ZHANG X Y, MA N N, et al. RepVGG: making VGG-style ConvNets great again[C]//2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE, 2021: 13728-13737.

Last Update: 2024-01-24
Copyright © 2023 Editorial Board of Journal of Zhengzhou University (Engineering Science)