 ZHOU Qinglei,WANG Yujing,DUAN Pengsong,et al.Lightweight Time Series Forecasting Model Based on iTransformer[J].Journal of Zhengzhou University (Engineering Science),2026,47(02):9-15(26).[doi:10.13705/j.issn.1671-6833.2026.02.008]

Lightweight Time Series Forecasting Model Based on iTransformer

Journal of Zhengzhou University (Engineering Science) [ISSN: 1671-6833 / CN: 41-1339/T]

Volume:
47
Issue:
2026, No. 02
Pages:
9-15(26)
Publication Date:
2026-02-13

Article Info

Title:
Lightweight Time Series Forecasting Model Based on iTransformer
Article ID:
1671-6833(2026)02-0009-07
Author(s):
ZHOU Qinglei1 WANG Yujing2 DUAN Pengsong2 WANG Chao2 ZHENG Yongli2
1. School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450001, China; 2. School of Cyber Science and Engineering, Zhengzhou University, Zhengzhou 450002, China
Keywords:
time series forecasting; lightweight; singular value decomposition; linear attention mechanism
CLC Number:
TP39; TP183
DOI:
10.13705/j.issn.1671-6833.2026.02.008
Document Code:
A
Abstract:
To address the difficulty of balancing prediction accuracy and efficiency in time series forecasting, a lightweight time series forecasting model named ILformer was proposed in this paper, built upon the iTransformer architecture. As a representative variable-based model for temporal data, iTransformer effectively captures complex inter-variable dependencies, but it is constrained by high computational complexity and a substantial parameter footprint, which limits its deployment in resource-constrained scenarios. To mitigate these limitations, ILformer incorporated two enhancements. First, a linear attention mechanism was introduced to replace the traditional attention mechanism, allowing more flexible input processing. By leveraging linear projection and dimension rearrangement, ILformer significantly reduced the number of parameters while adapting better to varying input shapes and structures; it achieved high computational efficiency, particularly on large-scale datasets, and drastically lowered the computational complexity of the attention module without compromising model accuracy. Second, singular value decomposition (SVD) was applied within the attention mechanism to reduce matrix dimensionality, which substantially decreased the number of matrix multiplications and additions, improved computational efficiency, and mitigated the risk of overfitting. Experimental results on eight diverse datasets demonstrated that ILformer improved inference speed by 40.46% on average while maintaining the same level of accuracy; in addition, the number of parameters was reduced by 78.75% and the computational cost was halved, underscoring its superior performance and practical applicability.

References:

[1]MENDIS K, WICKRAMASINGHE M, MARASINGHE P. Multivariate time series forecasting: a review[C]∥Proceedings of the 2024 2nd Asia Conference on Computer Vision, Image Processing and Pattern Recognition. New York: ACM, 2024: 1-9.

[2] LIANG H T, LIU S, DU J W, et al. Review of deep learning applied to time series prediction[J]. Journal of Frontiers of Computer Science & Technology, 2023, 17(6): 1285-1300.
[3] ZAREMBA W, SUTSKEVER I, VINYALS O. Recurrent neural network regularization[EB/OL]. (2014-09-08)[2025-08-13]. https://doi.org/10.48550/arXiv.1409.2329.
[4] HOCHREITER S, SCHMIDHUBER J. Long short-term memory[J]. Neural Computation, 1997, 9(8): 1735-1780.
[5]HE Z S, ZHANG X, LI M, et al. A novel solar radiation forecasting model based on time series imaging and bidirectional long short-term memory network[J]. Energy Science & Engineering, 2024, 12(11): 4876-4893.
[6] HARISH NAYAK G H, ALAM M W, AVINASH G, et al. Transformer-based deep learning architecture for time series forecasting[J]. Software Impacts, 2024, 22: 100716.
[7] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]∥Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS). New York: ACM, 2017: 6000-6010.
[8] NIE Y Q, NGUYEN N H, SINTHONG P, et al. A time series is worth 64 words: long-term forecasting with transformers[EB/OL]. (2022-11-27)[2025-08-13]. https://doi.org/10.48550/arXiv.2211.14730.
[9] LIU Y, HU T G, ZHANG H R, et al. iTransformer: inverted transformers are effective for time series forecasting[EB/OL]. (2023-10-10)[2025-08-13]. https://doi.org/10.48550/arXiv.2310.06625.
[10] JIANG X Y, LI Z Y, HUANG L Y, et al. Review of neural network pruning techniques[J]. Journal of Applied Sciences, 2022, 40(5): 838-849.
[11] GAO Y, CAO Y J, DUAN P S. Lightweighting methods for neural network models: a review[J]. Computer Science, 2024, 51(S1): 11-21.
[12] WANG G H, LI K H, LONG Q, et al. Object detection of lightweight Transformer based on knowledge distillation[J]. Journal of System Simulation, 2024, 36(11): 2517-2527.
[13] SAHA R, SRIVASTAVA V, PILANCI M. Matrix compression via randomized low rank and low precision factorization[EB/OL]. (2023-10-17)[2025-08-13]. https://doi.org/10.48550/arXiv.2310.11028.
[14] YANG X H, LIU W F, LIU W, et al. A survey on canonical correlation analysis[J]. IEEE Transactions on Knowledge and Data Engineering, 2021, 33(6): 2349-2368.
[15] HAN K, WANG Y H, CHEN H T, et al. A survey on vision transformer[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(1): 87-110.
[16] MENG X F, SHI H Y. Survey of Transformer-based model for time series forecasting[J]. Journal of Frontiers of Computer Science & Technology, 2025, 19(1): 45-64.
[17] LIAN J C. Computational simplification for long sequence Transformer[D]. Hefei: University of Science and Technology of China, 2023.
[18] ANTONELLI M, LAGO U D, DAVOLI D, et al. An arithmetic theory for the poly-time random functions[EB/OL]. (2023-01-27)[2025-08-13]. https://doi.org/10.48550/arXiv.2301.12028.
[19] TAY Y, DEHGHANI M, BAHRI D, et al. Efficient transformers: a survey[J]. ACM Computing Surveys, 2022, 55(6): 1-28.
[20] CHEN B D, DAO T, WINSOR E, et al. Scatterbrain: unifying sparse and low-rank attention approximation[EB/OL]. (2021-10-28)[2025-08-13]. https://doi.org/10.48550/arXiv.2110.15343.
[21] BOYAPATI M, AYGUN R. Semanformer: semantics-aware embedding dimensionality reduction using transformer-based models[C]∥2024 IEEE 18th International Conference on Semantic Computing. Piscataway: IEEE, 2024: 134-141.
[22] ZHOU H Y, ZHANG S H, PENG J Q, et al. Informer: beyond efficient transformer for long sequence time-series forecasting[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2021, 35(12): 11106-11115.
[23] SALINAS D, FLUNKERT V, GASTHAUS J, et al. DeepAR: probabilistic forecasting with autoregressive recurrent networks[J]. International Journal of Forecasting, 2020, 36(3): 1181-1191.
[24] QIN Y, SONG D J, CHENG H F, et al. A dual-stage attention-based recurrent neural network for time series prediction[C]∥Proceedings of the 26th International Joint Conference on Artificial Intelligence. New York: ACM, 2017: 2627-2633.
[25] NI Z L, YU H, LIU S Z, et al. BasisFormer: attention-based time series forecasting with learnable and interpretable basis[EB/OL]. (2023-10-31)[2025-08-13]. https://doi.org/10.48550/arXiv.2310.20496.
[26] KONG X J, CHEN Z H, LIU W Y, et al. Deep learning for time series forecasting: a survey[J]. International Journal of Machine Learning and Cybernetics, 2025, 16(7): 5079-5112.
[27] KITAEV N, KAISER Ł, LEVSKAYA A. Reformer: the efficient transformer[EB/OL]. (2020-01-13)[2025-08-13]. https://doi.org/10.48550/arXiv.2001.04451.
[28] DAO T, FU D Y, ERMON S, et al. FlashAttention: fast and memory-efficient exact attention with IO-awareness[C]∥36th Conference on Neural Information Processing Systems. Cambridge: MIT, 2022: 16344-16359.

Last Update: 2026-03-04