Learnable Temporal Sparse Memory iTransformer: Revisiting Sparsity and Memory in Transformers
Journal article · Open access · Peer reviewed

Polycarp Shizawaliyi Yakoi, Xiangfu Meng, Chunlin Yu, Victor Adeyi Odeh, Danladi Suleman, Yongqin Zhang and Xiaoyan Zhang
IEEE Access, vol. 14, pp. 11505-11533, 2026
PDF: Learnable_Temporal_Sparse_Memory_iTransformer_Revisiting_Sparsity_and_Memory_in_Transformers (5.48 MB)
Published version · CC BY-NC-ND 4.0 · Open access

Abstract

Transformer-based models for time series forecasting have advanced considerably, yet many approaches treat attention sparsity and memory mechanisms as separate strategies. In this extended work, we revisit these paradigms and present LTSMiTransformer, a unified architecture that integrates Learnable Temporal Sparse Attention (LTSA), a Memory-Augmented Module (MAM), and a unified embedding strategy that enhances feature representation across heterogeneous datasets, improving efficiency, scalability, and generalization in long-horizon multivariate time series forecasting. LTSA employs a trainable threshold to dynamically filter irrelevant time steps, reducing attention complexity from O(L²) to O(L log L), while MAM encodes long-term patterns into a compact memory bank with gated updates, enabling persistent context retention without uncontrolled memory growth. Unlike prior models, these components are jointly optimized to balance sparsity and memory in a coherent training framework. This extended work provides new theoretical analysis of LTSA's sub-quadratic efficiency and MAM's convergence behaviour under sparse gating. Comprehensive robustness tests under noise, missing values, and temporal imbalance, along with evaluations on eight diverse datasets from finance, energy, weather, and traffic, demonstrate LTSMiTransformer's accuracy, generalizability, and compactness. The included theoretical proofs, ablation studies, and robustness evaluations show that LTSMiTransformer achieves a strong balance between scalability, interpretability, and generalization across diverse datasets. This paper significantly expands our earlier CNIOT 2025 conference papers by adding new theoretical results, deeper joint-training analysis, and wider empirical validation. All source code and datasets are available on GitHub to support reproducibility.
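To make the two mechanisms described in the abstract concrete, below is a minimal, illustrative PyTorch sketch of (a) attention weights pruned by a trainable threshold and (b) a fixed-size, gate-updated memory bank. It is not the authors' released implementation: all class and parameter names (LearnableSparseAttention, GatedMemoryBank, raw_threshold, num_slots) are assumptions, the thresholding is written as a differentiable shifted ReLU for simplicity, and the dense score matrix is still computed first, so the sketch illustrates the filtering and gating ideas rather than the paper's O(L log L) procedure.

import torch
import torch.nn as nn
import torch.nn.functional as F


class LearnableSparseAttention(nn.Module):
    """Single-head attention whose weights below a trainable threshold are pruned.
    Illustrative sketch only; names and details are assumptions, not the paper's code."""

    def __init__(self, d_model: int):
        super().__init__()
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.out = nn.Linear(d_model, d_model)
        # Raw parameter mapped through a sigmoid so the threshold stays in (0, 1).
        self.raw_threshold = nn.Parameter(torch.tensor(-2.0))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, length, d_model)
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
        weights = scores.softmax(dim=-1)
        # Differentiable thresholding: shift by the learned cutoff, clip at zero,
        # then renormalize the surviving weights. Dividing the cutoff by the
        # sequence length keeps it comparable to the average softmax weight.
        tau = torch.sigmoid(self.raw_threshold) / x.size(1)
        weights = F.relu(weights - tau)
        weights = weights / weights.sum(dim=-1, keepdim=True).clamp_min(1e-8)
        return self.out(weights @ v)


class GatedMemoryBank(nn.Module):
    """A fixed number of memory slots read via attention and fused with a sigmoid
    gate, so the stored context never grows with sequence length."""

    def __init__(self, d_model: int, num_slots: int = 8):
        super().__init__()
        self.memory = nn.Parameter(torch.randn(num_slots, d_model) * 0.02)
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        attn = (x @ self.memory.t()).softmax(dim=-1)   # (batch, length, num_slots)
        read = attn @ self.memory                      # retrieved long-term context
        # The gate decides, per feature, how much retrieved context to mix in.
        g = torch.sigmoid(self.gate(torch.cat([x, read], dim=-1)))
        return g * x + (1.0 - g) * read


if __name__ == "__main__":
    x = torch.randn(4, 96, 64)                         # (batch, length, d_model)
    y = GatedMemoryBank(64)(LearnableSparseAttention(64)(x))
    print(y.shape)                                     # torch.Size([4, 96, 64])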

InCites Highlights

These are selected metrics from the InCites Benchmarking & Analytics tool related to this output.

Collaboration types
Domestic collaboration
International collaboration

Web of Science research areas
Computer Science, Information Systems
Engineering, Electrical & Electronic
Telecommunications

UN Sustainable Development Goals (SDGs)

This output has contributed to the advancement of the following goals:

#7 Affordable and Clean Energy

Source: InCites
