This paper presents a hybrid machine learning approach for extracting information from the World Wide Web (WWW). It applies structure analysis to improve extraction accuracy, achieving 96.5% average precision and 96.7% average recall on static web pages, and 100% precision and recall on dynamic web pages. Furthermore, the working time is short (< 800 ms), and because few learning examples are needed (< 4), little user participation is required. Our results show that this approach meets the requirements of practical applications: it is fast, convenient, and highly accurate.
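For readers unfamiliar with the two accuracy measures reported above, the following is a minimal sketch of how precision and recall are conventionally computed for an extraction task. It is not the authors' evaluation code: the function name `precision_recall` and the sample field names are hypothetical, and it assumes that extracted items can be compared to a reference set by exact match.

```python
def precision_recall(extracted, expected):
    """Return (precision, recall) of extracted items against a reference set.

    precision = correct extractions / all extractions
    recall    = correct extractions / all reference items
    Items are compared by exact match (an assumption of this sketch).
    """
    extracted, expected = set(extracted), set(expected)
    correct = len(extracted & expected)
    precision = correct / len(extracted) if extracted else 0.0
    recall = correct / len(expected) if expected else 0.0
    return precision, recall


# Hypothetical data: fields pulled from one page vs. the fields a human labelled.
extracted_fields = {"title", "price", "author", "isbn"}
expected_fields = {"title", "price", "author", "publisher"}

p, r = precision_recall(extracted_fields, expected_fields)
print(f"precision={p:.2f} recall={r:.2f}")
```

Here three of the four extracted fields are correct and three of the four reference fields are found, so both measures come out to 0.75. The averages reported in the abstract would correspond to averaging such per-page or per-site values, although the exact averaging scheme is not stated here.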
Kun Yu, Zhi Cai, Xufa Wang and Qingsheng Cai. A Hybrid Machine Learning Approach for Extracting Information from WWW.
DOI: https://doi.org/10.36478/ajit.2005.41.48
URL: https://www.makhillpublications.co/view-article/1682-3915/ajit.2005.41.48