Thursday, January 24, 2013

Cloudify your browser testing (and scraping) with Sauce!

For quite some time now, along with our DEiXTo scraping software, we have been using Selenium, which is perhaps the best web browser automation tool currently available. It's really great and has helped us a lot in a variety of web data extraction cases (we published another post about it recently). We tried it locally as well as on remote GNU/Linux servers, and we wrote code for a couple of automated tests and scraping tasks. However, it was not that easy to set everything up and get things running: we came across various difficulties, ranging from installation problems to stability issues (e.g. sporadic timeout errors), although we were eventually able to overcome most of them.
    Wouldn't it be great, though, if there were a robust framework that provided the necessary infrastructure, covered a wide range of browser/OS combinations and allowed you to run your Selenium tests in the cloud? You would not have to worry about setting a bunch of things up, installing updates, managing machines, maintenance, etc. Well, there is! And it offers a whole lot more. Its name is Sauce Labs and it provides an amazing set of tools and features. They have done awesome work and they bring great products to software developers. Moreover, their team seems to share some great values: pursuit of excellence, innovation and open source culture (among others).
    They offer a variety of pricing plans (a bit expensive in my opinion, though), while the free account includes 100 automated code minutes for Windows, Linux and Android, 40 automated code minutes for Mac and iOS, and 30 minutes of manual testing. And if you contribute to an open source project that needs testing support, the Open Sauce plan is just for you (unlimited minutes at no cost!). Please note that the Selenium project is sponsored by Sauce Labs.

    Being a Perl programmer, I could not resist signing up and writing some Perl code to run a test on their hosted infrastructure! I was already familiar with the WWW::Selenium CPAN module, so it was quite easy and straightforward. It should be noted that they provide useful guidelines and various examples online for multiple languages, e.g. Python, Java, PHP and others. Overall, my test script worked pretty well but it was a bit slow compared to running the same code locally. However, one could improve speed by running many processes in parallel (if the use case is suitable) and by disabling video (by default, the script's execution and the browser activity are recorded for easier debugging). Furthermore, Sauce's big advantage is that it can scale up, which makes it especially suited to complex cases with heavy requirements.
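
To make the two speed-ups mentioned above a bit more concrete, here is a minimal Python sketch (Python being one of the languages Sauce provides examples for). It is not the Perl script we actually ran. The endpoint ondemand.saucelabs.com:80/wd/hub and the "record-video" setting reflect our understanding of Sauce's conventions at the time of writing and should be checked against their current documentation; the helper names sauce_hub_url, sauce_capabilities and run_in_parallel are purely illustrative.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import quote

# Assumption: Sauce's hosted Selenium endpoint; verify against their docs.
SAUCE_HOST = "ondemand.saucelabs.com:80"


def sauce_hub_url(username, access_key):
    """Build the remote hub URL with the account credentials embedded."""
    return "http://{}:{}@{}/wd/hub".format(
        quote(username, safe=""), quote(access_key, safe=""), SAUCE_HOST
    )


def sauce_capabilities(browser, platform, job_name, record_video=False):
    """Describe one remote browser session.

    "record-video" is assumed to be the setting that turns off the
    screen recording; it is disabled by default here to save time.
    """
    return {
        "browserName": browser,
        "platform": platform,
        "name": job_name,
        "record-video": record_video,
    }


def run_in_parallel(job, targets, max_workers=4):
    """Call job(target) for every target concurrently.

    Threads are enough because each job mostly waits on the network.
    Results are returned in the same order as the targets.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(job, targets))
```

In practice, each job would open a remote session against that URL with those capabilities, fetch and process one page, and close the session again. That part depends on the Selenium client library and version in use, so it is left out of the sketch.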
    The bottom line is that the "Selenium - Sauce Labs" pair is remarkable and can be very useful for a wide range of cases and purposes. Sauce in particular offers developers an exciting way to cloudify and manage their automated browser testing (although we personally focus more on the scraping capabilities that these tools provide). Combining them with DEiXTo extraction patterns could be very fruitful and open up interesting new possibilities. In conclusion, the uses and applications of web scraping are practically limitless, and Selenium turns out to be a powerful arrow in our quiver!
