Journal article
A deep embedded clustering technique using dip test and unique neighbourhood set
Neural Computing and Applications, Vol.37, pp.1345-1356
2025
Abstract
In recent years, there has been a growing interest in deep learning-based clustering. A recently introduced technique called DipDECK has shown effective performance on large and high-dimensional datasets. DipDECK utilises Hartigan’s dip test, a statistical test, to merge small non-viable clusters. Notably, DipDECK was the first deep learning-based clustering technique to incorporate the dip test. However, DipDECK overestimates the initial number of clusters and then randomly selects the initial seeds used to produce the final clusters for a dataset. Therefore, in this paper, we present a technique called UNSDipDECK, which is an improved version of DipDECK and does not require user input for datasets with an unknown number of clusters. UNSDipDECK produces high-quality initial seeds and the initial number of clusters through a deterministic process. UNSDipDECK uses the unique closest neighbourhood and unique neighbourhood set approaches to determine high-quality initial seeds for a dataset. In our study, we compared the performance of UNSDipDECK with fifteen baseline clustering techniques, including DipDECK, using NMI and ARI metrics. The experimental results indicate that UNSDipDECK outperforms the baseline techniques, including DipDECK. Additionally, we demonstrated that the initial seed selection process significantly contributes to UNSDipDECK’s ability to produce high-quality clusters.
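The abstract reports results in terms of NMI (normalised mutual information) and ARI (adjusted Rand index). As a minimal sketch of what these two scores measure, and not the authors' evaluation code, the snippet below computes both from a pair of label lists using only the Python standard library. The function names are hypothetical, and the NMI normalisation (arithmetic mean of the two entropies) is an assumption; the paper may use a different variant.

```python
from collections import Counter
from math import comb, log


def adjusted_rand_index(labels_true, labels_pred):
    """Pair-counting agreement between two labelings, corrected for chance."""
    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:  # fewer than two points: nothing to compare
        return 1.0
    # Pairs placed together in both labelings, in the truth, and in the prediction.
    both = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    in_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    in_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = in_true * in_pred / total_pairs
    max_index = (in_true + in_pred) / 2
    if max_index == expected:  # degenerate case, e.g. a single cluster in both
        return 1.0
    return (both - expected) / (max_index - expected)


def normalized_mutual_info(labels_true, labels_pred):
    """Mutual information divided by the mean entropy (assumed normalisation)."""
    n = len(labels_true)
    joint = Counter(zip(labels_true, labels_pred))
    rows, cols = Counter(labels_true), Counter(labels_pred)
    mi = sum((c / n) * log(c * n / (rows[a] * cols[b])) for (a, b), c in joint.items())
    h_true = -sum((c / n) * log(c / n) for c in rows.values())
    h_pred = -sum((c / n) * log(c / n) for c in cols.values())
    mean_entropy = (h_true + h_pred) / 2
    if mean_entropy == 0:  # both labelings consist of a single cluster
        return 1.0
    return mi / mean_entropy


# Illustrative labels only; these are not data from the paper.
truth = [0, 0, 0, 1, 1, 1]
relabelled = [1, 1, 1, 0, 0, 0]   # same partition, different cluster ids
oversplit = [0, 0, 1, 1, 2, 2]    # three clusters instead of two
ari_same = adjusted_rand_index(truth, relabelled)
nmi_same = normalized_mutual_info(truth, relabelled)
ari_split = adjusted_rand_index(truth, oversplit)
```

Both scores depend only on how points are grouped, not on the cluster ids themselves, so a relabelled but otherwise identical partition scores as a perfect match, while an over-split partition scores lower.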
Details
- Title
- A deep embedded clustering technique using dip test and unique neighbourhood set
- Authors
- Md Anisur Rahman (Corresponding Author) - La Trobe University; Li-minn Ang - University of the Sunshine Coast, Queensland, School of Science, Technology and Engineering; Yuan Sun - La Trobe University; Kah Phooi Seng - Xi’an Jiaotong-Liverpool University
- Publication details
- Neural Computing and Applications, Vol.37, pp.1345-1356
- Publisher
- Springer UK
- Date published
- 2025
- DOI
- 10.1007/s00521-024-10497-4
- ISSN
- 1433-3058
- Data Availability
- Data are available upon request from the corresponding author.
- Organisation Unit
- School of Science, Technology and Engineering
- Language
- English
- Record Identifier
- 991087489002621
- Output Type
- Journal article