FoCUS : Learning to Crawl Web Forums

@inproceedings{Shanmugapriya2014FoCUSL,
  title={FoCUS : Learning to Crawl Web Forums},
  author={V. Shanmugapriya and S. Krishnaveni},
  year={2014}
}
In this paper, we present Forum Crawler Under Supervision (FoCUS), a supervised web-scale forum crawler. The goal of FoCUS is to crawl relevant forum content from the web with minimal overhead. Forum threads contain information content that is the target of forum crawlers. Although forums have different layouts or styles and are powered by different forum software packages, they always have similar implicit navigation paths connected by specific URL types to lead users from entry pages to…
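
The idea of a navigation path connected by URL types can be illustrated with a small sketch. This is not the FoCUS method: the URL type names (`index`, `thread`), the regular expressions, the `classify` and `crawl` helpers, and the in-memory `links` map standing in for page fetching are all hypothetical choices made for illustration.

```python
import re
from collections import deque

# Hypothetical URL patterns, hard-coded here for illustration only.
URL_TYPES = {
    "index": re.compile(r"/forum/\d+$"),
    "thread": re.compile(r"/thread/\d+$"),
}


def classify(url):
    """Return the first URL type whose pattern matches, else "other"."""
    for name, pattern in URL_TYPES.items():
        if pattern.search(url):
            return name
    return "other"


def crawl(entry_url, links):
    """Walk breadth-first from the entry page, following only typed URLs.

    `links` maps a URL to the URLs found on that page; it is a toy
    stand-in for fetching and parsing real pages.
    """
    seen = {entry_url}
    queue = deque([entry_url])
    threads = []
    while queue:
        url = queue.popleft()
        for nxt in links.get(url, []):
            if nxt in seen:
                continue
            kind = classify(nxt)
            if kind == "other":
                continue  # skip login, profile, and other off-path pages
            seen.add(nxt)
            if kind == "thread":
                threads.append(nxt)  # target content: collect, do not expand
            else:
                queue.append(nxt)  # index page: keep following the path
    return threads
```

Given a toy site whose entry page links to one index page and a login page, and whose index page links to two threads and a profile page, `crawl` returns only the two thread URLs, which is the "minimal overhead" behavior the abstract describes.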
This paper has 39 citations.

References

Publications referenced by this paper.

  • ForumMatrix, 2012
  • Hot Scripts, 2012
  • Internet Forum, 2012
  • RFC 1738: Uniform Resource Locators (URL), 2012
  • The Web Robots Pages, 2012
  • Message Boards Statistics, http://www.big-boards.com/ statistics, 2012