Research Journal of Applied Sciences

ISSN: Online 1993-6079
ISSN: Print 1815-932x

Source Code Classification using Latent Semantic Indexing with Structural and Frequency Term Weighting

Yuhanis Yusof, Taha Alhersh, Massudi Mahmuddin and Aniza Mohamed Din
Page: 266-271 | Received 21 Sep 2022, Published online: 21 Sep 2022

Abstract

In recent years, the number of open source software projects has increased, and with it the demand for automatic software classification. Latent Semantic Indexing (LSI) is an information retrieval approach that can be used to classify source code programs. This research proposes an LSI classifier whose term weighting scheme integrates both structural and frequency information about terms. The content terms are identified by extracting words from the source code program. In the experiment undertaken, the LSI classifier achieved higher precision and recall than the C4.5 algorithm. Furthermore, the use of structural information in the weighting scheme was found to contribute to better classification.
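
The idea of combining structural and frequency information in a term weight can be illustrated with a minimal Python sketch. It is not the authors' scheme: the abstract does not give the weighting formula, so the two structural categories (`identifier`, `comment`), their weights, the `//` comment convention, and the helper names `split_terms` and `weighted_terms` are all assumptions made for illustration. Each occurrence of a term adds the weight of the structural location it appears in, so the final weight reflects both how often and where the term occurs.

```python
import re
from collections import Counter

# Hypothetical structural weights; the paper's actual values are not given
# in the abstract.
STRUCTURE_WEIGHTS = {"identifier": 2.0, "comment": 1.0}


def split_terms(text):
    """Split text into lowercase words, breaking camelCase and snake_case."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", text)
    return [word.lower() for word in re.findall(r"[A-Za-z]+", spaced)]


def weighted_terms(source):
    """Return {term: weight}; each occurrence adds its structural weight.

    Assumes C/Java-style '//' line comments; block comments and string
    literals are not handled in this sketch.
    """
    weights = Counter()
    for line in source.splitlines():
        code, _, comment = line.partition("//")
        for term in split_terms(code):
            weights[term] += STRUCTURE_WEIGHTS["identifier"]
        for term in split_terms(comment):
            weights[term] += STRUCTURE_WEIGHTS["comment"]
    return dict(weights)
```

In an LSI pipeline, vectors like these would form the columns of a term-document matrix, which is then reduced with a truncated singular value decomposition before documents are compared. That step, and any filtering of language keywords such as `int`, is omitted here.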


How to cite this article:

Yuhanis Yusof, Taha Alhersh, Massudi Mahmuddin and Aniza Mohamed Din. Source Code Classification using Latent Semantic Indexing with Structural and Frequency Term Weighting.
DOI: https://doi.org/10.36478/rjasci.2012.266.271
URL: https://www.makhillpublications.co/view-article/1815-932x/rjasci.2012.266.271