TY  - JOUR
T1  - Pseudo Relevance Feedback for Literary Documents
AU  - Farhan, Yasir Hadi
AU  - Mohd. Noah, Shahrul Azman
JO  - Asian Journal of Information Technology
VL  - 16
IS  - 7
SP  - 599
EP  - 604
PY  - 2017
DA  - 2001/08/19
SN  - 1682-3915
DO  - ajit.2017.599.604
UR  - https://makhillpublications.co/view-article.php?doi=ajit.2017.599.604
KW  - Information retrieval
KW  - pseudo relevance feedback
KW  - query expansion
KW  - Qur'anic text retrieval
KW  - expansion
KW  - algorithm
AB  - One of the biggest issues affecting the performance of Information Retrieval (IR) systems is the
difficulty users face in defining exactly what their information needs are, since that information may be a gap in their
knowledge. This issue is more problematic for classical and literary documents such as the Holy Qur'an. One
of the approaches to overcome this issue is pseudo-relevance feedback, which assumes that a small number of
top-ranked documents in the initial retrieval results are relevant. It selects related terms from these documents
to improve the query representation through query expansion. Among the issues in the Qur'anic text are
the ambiguity and complexity of the text. Due to these issues, users need to reformulate and refine their queries
to match their information needs. Pseudo-relevance feedback can help relieve these issues. The classic Rocchio
algorithm has been widely used to support query reformulation in pseudo-relevance feedback. In this research,
a modified Rocchio algorithm is proposed that considers term selection and query importance.
It combines Term Frequency-Inverse Document Frequency (TF-IDF) weights with the Rocchio
algorithm weights to generate a new query. It also uses the frequency of terms to choose suitable
expansion words. The proposed algorithm was evaluated against the probabilistic IR model
implemented in the Lucene toolkit and against the WordNet query expansion approach. The experiments only
consider relevance feedback after two iterations. The evaluation used the Qur'anic dataset previously used
by other researchers. Twelve queries were considered during the evaluation. The results of the experiments
showed that the proposed method exhibits significant improvement in recall and precision. The average precision
with pseudo-relevance feedback was 8.3% for the first iteration and 11.3% for the second iteration,
whereas the average precision was 3.3% for Lucene and 2.7% for WordNet query expansion.
These results show that the proposed method improves retrieval performance.
ER  -