Journal article
Gate‐Align‐SED: Semi‐Supervised Sound Event Detection via Adaptive Feature Gating and Cross‐Task Alignment in Situation Awareness
Advanced Intelligent Systems, Vol. Advanced access
16-Apr-2026
Appears in UniSC Supported Open Access Outputs
Abstract
In complex real‐world environments such as disaster monitoring, effective sound event detection (SED) is often hindered by the presence of noise and limited labeled data. This article presents Gate‐Align‐SED, a unified semi‐supervised framework designed to bridge the gap between clip‐level and frame‐level acoustic modeling for disaster‐related audio understanding. The proposed method integrates adaptive feature fusion, mutual attention mechanisms, and a novel label alignment strategy that introduces a learnable correlation matrix to align heterogeneous label granularities. Furthermore, we incorporate a consistency learning paradigm grounded in the Mean‐Teacher framework, promoting robust representation learning across both temporal scales and annotation levels. Experiments demonstrate that the proposed approach enhances both the flexibility and stability of SED systems, particularly under label‐sparse or noisy conditions. Our work offers a scalable and generalizable solution for leveraging both weakly labeled and unlabeled data in critical acoustic event recognition scenarios.
Details
- Title
- Gate‐Align‐SED: Semi‐Supervised Sound Event Detection via Adaptive Feature Gating and Cross‐Task Alignment in Situation Awareness
- Authors
- Jieli Chen - Xi’an Jiaotong-Liverpool University; Li‐Minn Ang (Corresponding Author) - University of the Sunshine Coast; Chee Shen Lim - Xi’an Jiaotong-Liverpool University; Kah Phooi Seng - University of the Sunshine Coast; Jeremy Smith - University of Liverpool
- Publication details
- Advanced Intelligent Systems, Vol. Advanced access
- Publisher
- Wiley-VCH Verlag GmbH & Co. KGaA
- DOI
- 10.1002/aisy.70355
- ISSN
- 2640-4567
- Copyright note
- © 2026 The Author(s). Advanced Intelligent Systems published by Wiley-VCH GmbH. This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
- Data Availability
- The VGGSound dataset can be accessed through a public link: https://www.robots.ox.ac.uk/~vgg/data/vggsound/.
- Organisation Unit
- School of Science, Technology and Engineering; Engage Research Lab
- Language
- English
- Record Identifier
- 991224995702621
- Output Type
- Journal article