International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 52 - Issue 19
Published: August 2012
Authors: Vidushi Singhal, Sachin Sharma
10.5120/8309-1827
Vidushi Singhal, Sachin Sharma. Crawling the Web Surface Databases. International Journal of Computer Applications. 52, 19 (August 2012), 15-22. DOI=10.5120/8309-1827
@article{ 10.5120/8309-1827,
author = { Vidushi Singhal and Sachin Sharma },
title = { Crawling the Web Surface Databases },
journal = { International Journal of Computer Applications },
year = { 2012 },
volume = { 52 },
number = { 19 },
pages = { 15-22 },
doi = { 10.5120/8309-1827 },
publisher = { Foundation of Computer Science (FCS), NY, USA }
}
%0 Journal Article
%D 2012
%A Vidushi Singhal
%A Sachin Sharma
%T Crawling the Web Surface Databases
%J International Journal of Computer Applications
%V 52
%N 19
%P 15-22
%R 10.5120/8309-1827
%I Foundation of Computer Science (FCS), NY, USA
The World Wide Web is growing at a rapid rate. A web crawler is a computer program that browses the World Wide Web autonomously. As of February 2007, the web contained 29 billion pages. One of the most important uses of a web crawler is to index web pages and keep the stored copies up to date, so that a search engine can use them to answer end-user queries. Because the web is dynamic, stored pages must be refreshed constantly. In this paper, we put forward an efficient technique for refreshing a page stored in a web repository. We propose two methods for refreshing a page, both based on comparing page structure. The first method compares the structure of the pages through the tags used in them. The second method builds a document tree for each page and compares the trees.
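The abstract does not spell out how the two comparisons are carried out, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes that "structure" means the HTML tags alone (text and attributes are ignored) and that a page is a refresh candidate when its structure differs from the stored copy. The names `StructureParser`, `parse_structure`, `same_tag_sequence`, and `same_tree` are hypothetical; the parsing itself uses Python's standard `html.parser` module.

```python
from html.parser import HTMLParser

# Void elements never take children, so they are not pushed on the open-tag stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class StructureParser(HTMLParser):
    """Collects the opening-tag sequence and a nested tag tree; text is ignored."""

    def __init__(self):
        super().__init__()
        self.tags = []                    # flat list of opening tags, in document order
        self.root = ("#document", [])     # each node is (tag_name, list_of_children)
        self._stack = [self.root]

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)
        node = (tag, [])
        self._stack[-1][1].append(node)
        if tag not in VOID_TAGS:
            self._stack.append(node)

    def handle_endtag(self, tag):
        # Close the nearest matching open tag; stray closing tags are ignored.
        for i in range(len(self._stack) - 1, 0, -1):
            if self._stack[i][0] == tag:
                del self._stack[i:]
                break


def parse_structure(html):
    """Return (tag_sequence, document_tree) for an HTML string."""
    parser = StructureParser()
    parser.feed(html)
    parser.close()
    return parser.tags, parser.root


def same_tag_sequence(old_html, new_html):
    """Assumed reading of the first method: compare the tags used, in order."""
    return parse_structure(old_html)[0] == parse_structure(new_html)[0]


def same_tree(old_html, new_html):
    """Assumed reading of the second method: compare the nested document trees."""
    return parse_structure(old_html)[1] == parse_structure(new_html)[1]


stored = "<html><body><div><p>Hello</p></div></body></html>"
text_only_change = "<html><body><div><p>Hi</p></div></body></html>"
moved_paragraph = "<html><body><div></div><p>Hello</p></body></html>"

print(same_tag_sequence(stored, text_only_change), same_tree(stored, text_only_change))
print(same_tag_sequence(stored, moved_paragraph), same_tree(stored, moved_paragraph))
```

In this sketch the tag-sequence check is cheaper but coarser: moving the paragraph out of the `div` leaves the sequence of tags unchanged, while the tree comparison detects the different nesting. A change that touches only the text passes both checks, which follows from the assumption that structure excludes content.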