Pakistan Science Abstracts
Article details & metrics
A HVS model for representation of domain-oriented web page topic features.
Author(s):
1. Xianghua Wu: School of Automation, Beijing Institute of Technology, Beijing 100081, China
2. Qiao Guo: School of Automation, Beijing Institute of Technology, Beijing 100081, China
3. Lei La: School of Automation, Beijing Institute of Technology, Beijing 100081, China
4. Qimin Cao: School of Automation, Beijing Institute of Technology, Beijing 100081, China
Abstract:
Domain-oriented web page extraction is a new and practical direction in information extraction. This paper focuses on the representation of domain-oriented web page topic features and proposes a hierarchical vector space (HVS) model. By taking into account the hierarchical characteristics of the web page itself, the HVS model expresses the page's topic features more effectively in terms of both page structure and content. The problem of identifying topic-related pages is then addressed through similarity calculation. Experimental results show that the system achieves good accuracy and applicability for domain-oriented web extraction.
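The abstract does not give the HVS formulation, so the following is only an illustrative sketch of the general idea: a page is represented as one term vector per structural level, and topic relatedness is scored as a weighted sum of per-level cosine similarities. The level names (`title`, `headings`, `body`), the weights, and the function names are assumptions made for this example, not details taken from the paper.

```python
import math
from collections import Counter

# Hypothetical structural levels and weights (assumed, not from the paper).
# The weights sum to 1 so that identical pages score 1.0.
LEVEL_WEIGHTS = {"title": 0.5, "headings": 0.3, "body": 0.2}


def term_vector(text):
    """Bag-of-words term-frequency vector for one level of a page."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity of two sparse term vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hvs_similarity(page, topic):
    """Weighted sum of per-level cosine similarities between page and topic."""
    return sum(
        weight * cosine(term_vector(page.get(level, "")),
                        term_vector(topic.get(level, "")))
        for level, weight in LEVEL_WEIGHTS.items()
    )


# Toy data, invented purely for illustration.
topic = {
    "title": "solar energy",
    "headings": "photovoltaic panels",
    "body": "solar cells convert sunlight into electricity",
}
on_topic = {
    "title": "solar energy news",
    "headings": "new photovoltaic panels",
    "body": "sunlight drives solar cells",
}
off_topic = {
    "title": "football results",
    "headings": "league table",
    "body": "the match ended in a draw",
}

score_on = hvs_similarity(on_topic, topic)
score_off = hvs_similarity(off_topic, topic)
```

In a sketch like this, a page would be treated as topic-related when its score exceeds a chosen threshold; the threshold and the weights would need to be tuned on labelled pages for the target domain.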
Page(s): 710-715
DOI: not available
Published: Journal: Journal of Theoretical and Applied Information Technology, Volume: 45, Issue: 2, Year: 2012
Citations: 0
Downloads: 0
Views: 21