Monday, March 5, 2012

Uses and applications of web scraping

Some people wonder what web scraping is actually good for. Well, your imagination is the only limit (along with copyright notices, perhaps). There is a huge wealth of data out there, and many believe that the open Web is a real goldmine. Web data extraction tools, and DEiXTo in particular, can help you unlock this treasure and give birth to innovations, applications and new ideas.
    Public institutions, companies and organizations, entrepreneurs and professionals, as well as ordinary citizens and users, generate an enormous amount of information every single day. The question is: how effectively is it being used? To that end, web content extraction can prove a valuable ally. Together with data mining, it has much to offer in almost every field you can imagine. The following are only some of the uses of web scraping:
  • collect properties from real estate listings
  • scrape retailer sites on a daily basis
  • extract offers and discounts from deal-of-the-day websites
  • gather data for hotels and vacation rentals
  • scrape job postings and internships
  • crawl forums and social sites so as to enable analysis and post-processing of their rich data
  • power aggregators and product search engines
  • monitor your online reputation and check what is being said about you or your brand
  • quickly populate product catalogues with full specifications
  • monitor prices of the competition
  • scrape the content of digital libraries in order to transform it into suitable, structured forms
  • collect and aggregate government and public data
  • search (in real time) bibliographic databases and online sources that don't offer an API, thus powering federated search engines
  • find educational material and information spanning both traditional higher education subjects and real-life contexts, in order to help the contemporary learner
  • power mobile applications
  • help build geolocation apps (e.g. extracting addresses available on web pages and using their coordinates to build meaningful maps with points of interest)
  • prepare large, focused datasets for scientific tasks (e.g. data mining)
  • extract and summarize large volumes of text (e.g. summarizing product reviews)
  • <your scraping task goes here!>
    This list could grow very long. There are countless use cases and potential scenarios, whether business-oriented or non-profit. Access and copyright restrictions are a significant issue that has raised a lot of discussion and controversy. However, the opinion that seems to be gaining ground is that (well-intentioned) scraping of data that is publicly and freely available on the Web is legal. So, let your creativity and imagination loose; DEiXTo can probably help you achieve the goals of your scraping-based project. We would be more than happy to hear from you.
