[1]MENDIS K, WICKRAMASINGHE M, MARASINGHE P. Multivariate time series forecasting: a review[C]∥Proceedings of the 2024 2nd Asia Conference on Computer Vision, Image Processing and Pattern Recognition. New York: ACM, 2024: 1-9.
[2]梁宏涛, 刘硕, 杜军威, 等. 深度学习应用于时序预测研究综述[J]. 计算机科学与探索, 2023, 17(6): 1285-1300.
LIANG H T, LIU S, DU J W, et al. Review of deep learning applied to time series prediction[J]. Journal of Frontiers of Computer Science & Technology, 2023, 17(6): 1285-1300.
[3]ZAREMBA W, SUTSKEVER I, VINYALS O. Recurrent neural network regularization[EB/OL]. (2014-09-08)[2025-08-13]. https://doi.org/10.48550/arXiv.1409.2329.
[4]HOCHREITER S, SCHMIDHUBER J. Long short-term memory[J]. Neural Computation, 1997, 9(8): 1735-1780.
[5]HE Z S, ZHANG X, LI M, et al. A novel solar radiation forecasting model based on time series imaging and bidirectional long short-term memory network[J]. Energy Science & Engineering, 2024, 12(11): 4876-4893.
[6]HARISH NAYAK G H, ALAM M W, AVINASH G, et al. Transformer-based deep learning architecture for time series forecasting[J]. Software Impacts, 2024, 22: 100716.
[7]VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]∥Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS). New York: ACM, 2017: 6000-6010.
[8]NIE Y Q, NGUYEN N H, SINTHONG P, et al. A time series is worth 64 words: long-term forecasting with transformers[EB/OL]. (2022-11-27)[2025-08-13]. https://doi.org/10.48550/arXiv.2211.14730.
[9]LIU Y, HU T G, ZHANG H R, et al. iTransformer: inverted transformers are effective for time series forecasting[EB/OL]. (2023-10-10)[2025-08-13]. https://doi.org/10.48550/arXiv.2310.06625.
[10]姜晓勇, 李忠义, 黄朗月, 等. 神经网络剪枝技术研究综述[J]. 应用科学学报, 2022, 40(5): 838-849.
JIANG X Y, LI Z Y, HUANG L Y, et al. Review of neural network pruning techniques[J]. Journal of Applied Sciences, 2022, 40(5): 838-849.
[11]高杨, 曹仰杰, 段鹏松. 神经网络模型轻量化方法综述[J]. 计算机科学, 2024, 51(增刊1): 11-21.
GAO Y, CAO Y J, DUAN P S. Lightweighting methods for neural network models: a review[J]. Computer Science, 2024, 51(S1): 11-21.
[12]王改华, 李柯鸿, 龙潜, 等. 基于知识蒸馏的轻量化Transformer目标检测[J]. 系统仿真学报, 2024, 36(11): 2517-2527.
WANG G H, LI K H, LONG Q, et al. Object detection of lightweight Transformer based on knowledge distillation[J]. Journal of System Simulation, 2024, 36(11): 2517-2527.
[13]SAHA R, SRIVASTAVA V, PILANCI M. Matrix compression via randomized low rank and low precision factorization[EB/OL]. (2023-10-17)[2025-08-13]. https://doi.org/10.48550/arXiv.2310.11028.
[14]YANG X H, LIU W F, LIU W, et al. A survey on canonical correlation analysis[J]. IEEE Transactions on Knowledge and Data Engineering, 2021, 33(6): 2349-2368.
[15]HAN K, WANG Y H, CHEN H T, et al. A survey on vision transformer[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(1): 87-110.
[16]孟祥福, 石皓源. 基于Transformer模型的时序数据预测方法综述[J]. 计算机科学与探索, 2025, 19(1): 45-64.
MENG X F, SHI H Y. Survey of Transformer-based model for time series forecasting[J]. Journal of Frontiers of Computer Science & Technology, 2025, 19(1): 45-64.
[17]连家诚. 面向长序列Transformer的计算简化[D]. 合肥: 中国科学技术大学, 2023.
LIAN J C. Computational simplification for long sequence Transformer[D]. Hefei: University of Science and Technology of China, 2023.
[18]ANTONELLI M, DAL LAGO U, DAVOLI D, et al. An arithmetic theory for the poly-time random functions[EB/OL]. (2023-01-27)[2025-08-13]. https://doi.org/10.48550/arXiv.2301.12028.
[19]TAY Y, DEHGHANI M, BAHRI D, et al. Efficient transformers: a survey[J]. ACM Computing Surveys, 2022, 55(6): 1-28.
[20]CHEN B D, DAO T, WINSOR E, et al. Scatterbrain: unifying sparse and low-rank attention approximation[EB/OL]. (2021-10-28)[2025-08-13]. https://doi.org/10.48550/arXiv.2110.15343.
[21]BOYAPATI M, AYGUN R. Semanformer: semantics-aware embedding dimensionality reduction using transformer-based models[C]∥2024 IEEE 18th International Conference on Semantic Computing. Piscataway: IEEE, 2024: 134-141.
[22]ZHOU H Y, ZHANG S H, PENG J Q, et al. Informer: beyond efficient transformer for long sequence time-series forecasting[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2021, 35(12): 11106-11115.
[23]SALINAS D, FLUNKERT V, GASTHAUS J, et al. DeepAR: probabilistic forecasting with autoregressive recurrent networks[J]. International Journal of Forecasting, 2020, 36(3): 1181-1191.
[24]QIN Y, SONG D J, CHENG H F, et al. A dual-stage attention-based recurrent neural network for time series prediction[C]∥Proceedings of the 26th International Joint Conference on Artificial Intelligence. New York: ACM, 2017: 2627-2633.
[25]NI Z L, YU H, LIU S Z, et al. BasisFormer: attention-based time series forecasting with learnable and interpretable basis[EB/OL]. (2023-10-31)[2025-08-13]. https://doi.org/10.48550/arXiv.2310.20496.
[26]KONG X J, CHEN Z H, LIU W Y, et al. Deep learning for time series forecasting: a survey[J]. International Journal of Machine Learning and Cybernetics, 2025, 16(7): 5079-5112.
[27]KITAEV N, KAISER L, LEVSKAYA A. Reformer: the efficient transformer[EB/OL]. (2020-01-13)[2025-08-13]. https://doi.org/10.48550/arXiv.2001.04451.
[28]DAO T, FU D Y, ERMON S, et al. FlashAttention: fast and memory-efficient exact attention with IO-awareness[C]∥36th Conference on Neural Information Processing Systems. Cambridge: MIT, 2022: 16344-16359.