Semantic Scholar
Web scraping
Known as: Web scrape, Web scrapers, Web Harvesting
Web scraping (web harvesting or web data extraction) is a computer software technique of extracting information from websites. This is accomplished…
Source: Wikipedia
Related topics
50 relations
ASP.NET
Application firewall
Aptana Studio
Boxee
Broader (1)
Spamming
Papers overview
Semantic Scholar uses AI to extract papers important to this topic.
A Novel Web Scraping Approach Using the Additional Information Obtained From Web Pages
Erdinç Uzun
IEEE Access
2020
Corpus ID: 215740364
Web scraping is a process of extracting valuable and interesting text information from web pages. Most of the current studies…

Highly Cited
Web Scraping: State-of-the-Art and Areas of Application
Rabiyatou Diouf, Edouard Ngor Sarr, Ousmane Sall, B. Birregah, M. Bousso, Sény Ndiaye Mbaye
IEEE International Conference on Big Data (Big…
2019
Corpus ID: 211297691
Main objective of Web Scraping is to extract information from one or many websites and process it into simple structures such as…

Highly Cited
Rousillon: Scraping Distributed Hierarchical Web Data
Sarah E. Chasins, Maria Mueller, Rastislav Bodík
ACM Symposium on User Interface Software and…
2018
Corpus ID: 52832378
Programming by Demonstration (PBD) promises to enable data scientists to collect web data. However, in formative interviews with…

Cloud Based Web Scraping for Big Data Applications
Ram Sharan Chaulagain, Santosh Pandey, S. R. Basnet, S. Shakya
International Conference on Smart Cloud
2017
Corpus ID: 25361975
With the penetration of new technologies, there is a rapid growth of internet users and data (mostly unstructured) generated by…

Highly Cited
Web Scraping with Python: Collecting Data from the Modern Web
Ryan Mitchell
2015
Corpus ID: 60876733
Learn web scraping and crawling techniques to access unlimited data from any web source in any format. With this practical guide…

Highly Cited
Automated Data Collection with R: A Practical Guide to Web Scraping and Text Mining
Simon Munzert, C. Rubba, Peter Meißner, Dominic Nyhuis
2014
Corpus ID: 61138161
A hands-on guide to web scraping and text mining for both beginners and experienced users of R. Introduces fundamental concepts of…

Highly Cited
Exploiting web scraping in a collaborative filtering-based approach to web advertising
E. Vargiu, Mirko Urru
Artificial intelligence research
2012
Corpus ID: 29662240
Web scraping is the set of techniques used to automatically get some information from a website instead of manually copying it…

Highly Cited
Harvesting and analysis of weak signals for detecting lone wolf terrorists
J. Brynielsson, Andreas Horndahl, F. Johansson, Lisa Kaati, Christian Mårtenson, Pontus Svenson
European Intelligence and Security Informatics…
2012
Corpus ID: 5658562
Lone wolf terrorists pose a large threat to modern society. The current ability to identify and stop these kinds of…

Review
Extracting article text from the web with maximum subsequence segmentation
Jeff Pasternack, D. Roth
The Web Conference
2009
Corpus ID: 346124
Much of the information on the Web is found in articles from online news outlets, magazines, encyclopedias, review collections…

Highly Cited
Nutrient, Carbon, and Mass Loss during Composting of Beef Cattle Feedlot Manure
B. Eghball, J. Power, J. Gilley, J. Doran
1997
Corpus ID: 46331489
Quantification of nutrient and mass loss during composting is needed to understand the composting process, to implement methods…