Asian Journal of Information Technology

ISSN: Online 1993-5994
ISSN: Print 1682-3915

Feature Selection for Efficient Text Categorization and Knowledge Discovery Using Classification Techniques

Christy, A. and P. Thambidurai
Page: 872-876 | Received 21 Sep 2022, Published online: 21 Sep 2022

Abstract

Text categorization, which consists of automatically assigning documents to a set of categories, must cope with a very large number of features. Feature selection is one of the most important and frequently used techniques in data preprocessing for data mining. It removes irrelevant, redundant or noisy data and brings immediate benefits to data mining applications. In this study, we propose a filter system for feature set extraction based on a similarity distance measure. Although past literature has suggested that using features from irrelevant categories can improve text categorization, we believe that incorporating only relevant features can be highly effective. An experimental comparison is carried out between the distance measure and four well-known classification techniques: C4.8, Multilayer Perceptron, Least Mean Square and Linear Regression. The results show that our proposed method performs comparably to the other classification techniques, especially on a highly overlapped collection of topics. C4.8 is also found to be a better classifier than the other techniques.
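
To illustrate what a filter-style feature selector driven by a similarity measure can look like, the following is a minimal sketch under stated assumptions. The abstract does not specify the distance measure, so cosine similarity between a term's document-presence vector and a category-indicator vector is used here purely as a stand-in. The function names (`cosine`, `select_features`), the parameter `k` and the toy corpus are hypothetical and are not taken from the article.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def select_features(docs, labels, k):
    """Filter step: for each category, keep the k terms whose presence
    pattern across documents is most similar to the category itself.

    Assumption: cosine similarity stands in for the article's
    unspecified similarity distance measure.
    """
    tokenized = [set(d.split()) for d in docs]
    vocab = sorted(set().union(*tokenized))
    selected = {}
    for category in sorted(set(labels)):
        # 1.0 where the document belongs to the category, else 0.0
        indicator = [1.0 if y == category else 0.0 for y in labels]
        scores = []
        for term in vocab:
            # 1.0 where the document contains the term, else 0.0
            presence = [1.0 if term in toks else 0.0 for toks in tokenized]
            scores.append((cosine(presence, indicator), term))
        # Highest similarity first; ties broken alphabetically
        scores.sort(key=lambda s: (-s[0], s[1]))
        selected[category] = [term for _, term in scores[:k]]
    return selected


# Toy corpus (hypothetical, for illustration only)
docs = [
    "stock market shares rise",
    "market shares fall",
    "team wins match",
    "team loses match today",
]
labels = ["finance", "finance", "sport", "sport"]
features = select_features(docs, labels, k=2)
print(features)
```

Because the scoring runs independently of any classifier, the reduced feature set could then be passed to whichever learner is being compared, which is the general idea behind a filter approach as opposed to a wrapper approach.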


How to cite this article:

Christy, A. and P. Thambidurai. Feature Selection for Efficient Text Categorization and Knowledge Discovery Using Classification Techniques.
DOI: https://doi.org/10.36478/ajit.2006.872.876
URL: https://www.makhillpublications.co/view-article/1682-3915/ajit.2006.872.876