Research Journal of Applied Sciences

ISSN: Online 1993-6079
ISSN: Print 1815-932x

Integrating Ontology to Enhance HCL-Based Text Document Clustering

S. Vijayalakshmi and D. Manimegalai
Page: 358-368 | Received 21 Sep 2022, Published online: 21 Sep 2022

Abstract

Increasingly large text datasets and the high dimensionality associated with natural language are a major challenge in text mining. Initially, the researchers compared three types of document representation, Bag of Word (BoW), Bag of Noun (BoN) and Bag of Phrase (BoP), and found that BoN and BoP perform better than BoW. BoP yields a significantly better F-measure than BoN and BoW when the corpus is small. When the corpus is larger, however, BoP increases the dimensionality. For larger corpora, the BoN representation works efficiently in text document clustering and reduces dimensionality relative to the other representations. The researchers therefore use the BoN document representation. Nouns are checked against an ontology and extracted to construct the term-document matrix, which both reduces the dimensionality and adds semantics. The comparative study shows that the BoN document representation performs better than BoP. The exploration of learning algorithms has given promising results in recent years. In this study, the researchers propose an ontology-based OHCLK-Means clustering algorithm. It significantly improves clustering quality compared with ontology-based K-means and ontology-based ONVK-means.
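
The ontology-filtered term-document matrix described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy noun set `ONTOLOGY_NOUNS`, the simple tokenizer, and the function name `build_term_document_matrix` are all hypothetical stand-ins, and a real system would presumably rely on a part-of-speech tagger and a full ontology lookup instead.

```python
import re
from collections import Counter

# Hypothetical toy "ontology": a fixed set of noun concepts (assumption).
# A real pipeline would tag nouns and look them up in an actual ontology.
ONTOLOGY_NOUNS = {"cluster", "document", "ontology", "corpus", "noun"}


def tokenize(text):
    """Lowercase the text and split on anything that is not a letter."""
    return re.findall(r"[a-z]+", text.lower())


def build_term_document_matrix(docs, ontology=ONTOLOGY_NOUNS):
    """Return (vocabulary, matrix), where matrix[i][j] is the count of
    vocabulary[i] in docs[j]. Only terms present in the ontology are kept,
    which is what shrinks the dimensionality relative to Bag of Word."""
    counts = [Counter(t for t in tokenize(d) if t in ontology) for d in docs]
    vocabulary = sorted(set().union(*counts))
    matrix = [[c[term] for c in counts] for term in vocabulary]
    return vocabulary, matrix


docs = [
    "The ontology maps each noun in the document.",
    "A large corpus makes the document matrix sparse; the corpus grows.",
]
vocab, tdm = build_term_document_matrix(docs)
```

Here the two documents contain nineteen word tokens, but only the four terms found in the toy ontology become rows of the matrix, so each document is represented by a four-dimensional count vector that could then be passed to a clustering algorithm.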


How to cite this article:

S. Vijayalakshmi and D. Manimegalai. Integrating Ontology to Enhance HCL-Based Text Document Clustering.
DOI: https://doi.org/10.36478/rjasci.2013.358.368
URL: https://www.makhillpublications.co/view-article/1815-932x/rjasci.2013.358.368