Learnable Temporal Sparse Memory iTransformer: Revisiting Sparsity and Memory in Transformers
Journal article · Open access · Peer reviewed

Polycarp Shizawaliyi Yakoi, Xiangfu Meng, Chunlin Yu, Victor Adeyi Odeh, Danladi Suleman, Yongqin Zhang and Xiaoyan Zhang
IEEE Access, vol. 14, pp. 11505-11533, 2026
PDF: Learnable_Temporal_Sparse_Memory_iTransformer_Revisiting_Sparsity_and_Memory_in_Transformers (5.48 MB)
Published version · CC BY-NC-ND 4.0 · Open access

Abstract

Transformer-based models for time series forecasting have advanced considerably, yet many approaches treat attention sparsity and memory mechanisms as separate strategies. In this extended work, we revisit these paradigms and present LTSMiTransformer, a unified architecture that integrates Learnable Temporal Sparse Attention (LTSA), a Memory-Augmented Module (MAM), and a unified embedding strategy that enhances feature representation across heterogeneous datasets, improving efficiency, scalability, and generalization in long-horizon multivariate time series forecasting. LTSA employs a trainable threshold to dynamically filter irrelevant time steps, reducing attention complexity from O(L²) to O(L log L), while MAM encodes long-term patterns into a compact memory bank with gated updates, enabling persistent context retention without uncontrolled memory growth. Unlike prior models, these components are jointly optimized to balance sparsity and memory in a coherent training framework. This extended work provides new theoretical analysis of LTSA's sub-quadratic efficiency and MAM's convergence behaviour under sparse gating. Comprehensive robustness tests under noise, missing values, and temporal imbalance, along with evaluations on eight diverse datasets from finance, energy, weather, and traffic, demonstrate LTSMiTransformer's accuracy, generalizability, and compactness. The included theoretical proofs, ablation studies, and robustness evaluations show that LTSMiTransformer achieves a strong balance between scalability, interpretability, and generalization across diverse datasets. This paper significantly expands our earlier CNIOT 2025 conference papers by adding new theoretical results, deeper joint-training analysis, and wider empirical validation. All source code and datasets are available on GitHub to support reproducibility.
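To make the two mechanisms described in the abstract concrete, below is a minimal, illustrative PyTorch sketch of (a) attention weights pruned by a trainable threshold and (b) a fixed-size, gate-updated memory bank. It is not the authors' released implementation: all class and parameter names (LearnableSparseAttention, GatedMemoryBank, raw_threshold, num_slots) are assumptions, the thresholding is written as a differentiable shifted ReLU for simplicity, and the dense score matrix is still computed first, so the sketch illustrates the filtering and gating ideas rather than the paper's O(L log L) procedure.

import torch
import torch.nn as nn
import torch.nn.functional as F


class LearnableSparseAttention(nn.Module):
    """Single-head attention whose weights below a trainable threshold are pruned.
    Illustrative sketch only; names and details are assumptions, not the paper's code."""

    def __init__(self, d_model: int):
        super().__init__()
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.out = nn.Linear(d_model, d_model)
        # Raw parameter mapped through a sigmoid so the threshold stays in (0, 1).
        self.raw_threshold = nn.Parameter(torch.tensor(-2.0))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, length, d_model)
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
        weights = scores.softmax(dim=-1)
        # Differentiable thresholding: shift by the learned cutoff, clip at zero,
        # then renormalize the surviving weights. Dividing the cutoff by the
        # sequence length keeps it comparable to the average softmax weight.
        tau = torch.sigmoid(self.raw_threshold) / x.size(1)
        weights = F.relu(weights - tau)
        weights = weights / weights.sum(dim=-1, keepdim=True).clamp_min(1e-8)
        return self.out(weights @ v)


class GatedMemoryBank(nn.Module):
    """A fixed number of memory slots read via attention and fused with a sigmoid
    gate, so the stored context never grows with sequence length."""

    def __init__(self, d_model: int, num_slots: int = 8):
        super().__init__()
        self.memory = nn.Parameter(torch.randn(num_slots, d_model) * 0.02)
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        attn = (x @ self.memory.t()).softmax(dim=-1)   # (batch, length, num_slots)
        read = attn @ self.memory                      # retrieved long-term context
        # The gate decides, per feature, how much retrieved context to mix in.
        g = torch.sigmoid(self.gate(torch.cat([x, read], dim=-1)))
        return g * x + (1.0 - g) * read


if __name__ == "__main__":
    x = torch.randn(4, 96, 64)                         # (batch, length, d_model)
    y = GatedMemoryBank(64)(LearnableSparseAttention(64)(x))
    print(y.shape)                                     # torch.Size([4, 96, 64])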

InCites Highlights

These are selected metrics from the InCites Benchmarking & Analytics tool related to this output.

Collaboration types
Domestic collaboration
International collaboration

Web of Science research areas
Computer Science, Information Systems
Engineering, Electrical & Electronic
Telecommunications

UN Sustainable Development Goals (SDGs)

This output has contributed to the advancement of the following goals:

#7 Affordable and Clean Energy

Source: InCites
