Asian Journal of Information Technology

ISSN: Online 1993-5994
ISSN: Print 1682-3915

An Effective Approach to the Evaluation and Construction of Training Corpus for Text Classification

Jihong Guan and Shuigeng Zhou
Page: 33-40 | Received 21 Sep 2022, Published online: 21 Sep 2022

Abstract

Text classification is becoming increasingly important with the rapid growth of available online information. The quality of the training corpus has been observed to affect the performance of the trained classifier. This paper proposes an approach to building high-quality training corpora for better classification performance: it first explores the properties of training corpora and then gives an algorithm for constructing them semi-automatically. Preliminary experimental results support the approach: classifiers trained on corpora constructed this way achieve good performance while the size of the training corpus is significantly reduced. The approach can be used to build efficient and lightweight classification systems.


How to cite this article:

Jihong Guan and Shuigeng Zhou. An Effective Approach to the Evaluation and Construction of Training Corpus for Text Classification.
DOI: https://doi.org/10.36478/ajit.2005.33.40
URL: https://www.makhillpublications.co/view-article/1682-3915/ajit.2005.33.40