TY  - JOUR
T1  - Feature Selection for Efficient Text Categorization and Knowledge Discovery Using Classification Techniques
AU  - Christy, A.
AU  - Thambidurai, P.
JO  - Asian Journal of Information Technology
VL  - 5
IS  - 8
SP  - 872
EP  - 876
PY  - 2006
DA  - 2001/08/19
SN  - 1682-3915
DO  - ajit.2006.872.876
UR  - https://makhillpublications.co/view-article.php?doi=ajit.2006.872.876
KW  - Feature set extraction
KW  - filter
KW  - C4.8
KW  - precision
KW  - recall
KW  - information gain
AB  - Text categorization, which consists of automatically assigning documents to a set of categories, deals with the management of a huge number of features. Feature selection is an important and frequently used technique in data preprocessing for data mining. It removes irrelevant, redundant or noisy data and brings immediate benefits to data mining applications. In this study, we propose a filter system for feature set extraction based on a similarity distance measure. Although past literature has suggested that using features from irrelevant categories can improve text categorization measures, we believe that incorporating only relevant features can be highly effective. The experimental comparison is carried out between the distance measure and four well-known classification techniques: C4.8, Multilayer Perceptron, Least Mean Square and Linear Regression. The results show that the proposed method performs comparably well with the other classification measures, especially on a highly overlapped collection of topics, and that C4.8 acts as a better classifier than the other techniques.
ER  -