On URL and content persistence

@inproceedings{Gomes2005OnUA,
  title={On URL and content persistence},
  author={Daniel Gomes and M{\'a}rio J. Silva},
  year={2005}
}
This report presents a study of URL and content persistence among 51 million pages from a national web harvested 8 times over almost 3 years. This study differs from previous ones because it describes the evolution of a large set of web pages for several years, studying in depth the characteristics of persistent data. We found that the persistence of URLs and contents follows a logarithmic distribution. We characterized persistent URLs and contents, and identified reasons for URL death. We…
