Book chapter
A New Supervised Term Ranking Method for Text Categorization
AI 2010: Advances in Artificial Intelligence, pp.102-111
Australasian Joint Conference on Artificial Intelligence (AI), 23rd (Adelaide, Australia, 07-Dec-2010–10-Dec-2010)
Lecture Notes in Computer Science (LNCS), 6464, Springer
2011
Abstract
In text categorization, supervised term weighting methods such as Information Gain, the χ² statistic, and Odds Ratio have been applied to improve classification performance by weighting terms with respect to different categories. The literature offers three term ranking methods for summarizing a term's weights across categories in multi-class text categorization: Summation, Average, and Maximum. In this paper we present a new term ranking method for summarizing term weights, Maximum Gap. Using two term weighting methods, information gain and the χ² statistic, we set up controlled experiments to compare the term ranking methods. The Reuters-21578 text corpus is used as the dataset. Two popular classification algorithms, SVM and BoosTexter, are adopted to evaluate the performance of the different term ranking methods. Experimental results show that the new term ranking method performs better.
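As a rough illustration of the three existing ranking methods named in the abstract, the sketch below collapses a term's per-category weights into a single score by Summation, Average, or Maximum, then ranks terms by that score. The function names, the example terms, and the weight values are hypothetical. The unweighted mean used for Average is an assumption, since some formulations weight each category by its prior probability. Maximum Gap is not defined in the abstract, so it is deliberately left out.

```python
from typing import Dict, List

# Hypothetical per-category weights (e.g. information gain or chi-square),
# one value per category for each term. The numbers are made up.
WEIGHTS: Dict[str, List[float]] = {
    "wheat": [0.9, 0.0, 0.0],  # strongly tied to one category
    "said": [0.4, 0.4, 0.4],   # moderately tied to every category
}


def summarize(scores: List[float], method: str) -> float:
    """Collapse one term's per-category weights into a single score."""
    if method == "sum":
        return sum(scores)
    if method == "avg":
        # Assumption: plain unweighted mean over categories.
        return sum(scores) / len(scores)
    if method == "max":
        return max(scores)
    raise ValueError(f"unknown method: {method}")


def rank_terms(weights: Dict[str, List[float]], method: str) -> List[str]:
    """Return terms ordered from highest to lowest summarized score."""
    return sorted(
        weights,
        key=lambda term: summarize(weights[term], method),
        reverse=True,
    )


if __name__ == "__main__":
    for method in ("sum", "avg", "max"):
        print(method, rank_terms(WEIGHTS, method))
```

With these made-up weights, Summation and Average place the evenly spread term first, while Maximum places the category-specific term first, which shows why the choice of ranking method can change which terms are selected.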
Details
- Title
- A New Supervised Term Ranking Method for Text Categorization
- Authors
- M Mammadov (Author) - University of Ballarat (Victoria, Australia); J Yearwood (Author) - University of Ballarat (Victoria, Australia); Lei Zhao (Author) - University of Ballarat (Victoria, Australia)
- Contributors
- Jiuyong Li (Editor)
- Publication details
- AI 2010: Advances in Artificial Intelligence, pp.102-111
- Conference details
- Australasian Joint Conference on Artificial Intelligence (AI), 23rd (Adelaide, Australia, 07-Dec-2010–10-Dec-2010)
- Series
- Lecture Notes in Computer Science (LNCS); 6464
- Publisher
- Springer
- Date published
- 2011
- DOI
- 10.1007/978-3-642-17432-2_11
- ISSN
- 0302-9743; 1611-3349
- ISBN
- 9783642174322
- Copyright note
- Copyright © 2011 Springer-Verlag. The author's accepted version is reproduced here in accordance with the publisher's copyright policy. The final publication is available at www.springerlink.com
- Organisation Unit
- Insights & Analytics Unit; University of the Sunshine Coast, Queensland; Office of Research
- Language
- English
- Record Identifier
- 99450495102621
- Output Type
- Book chapter
- Research Statement
- false