This talk will describe Skyscraper, a library that makes it easy to scrape entire websites. You will discover how Skyscraper grew organically as a generalization of individual scrapers tailored to different websites, how the abstractions common to these scrapers were found, and how these abstractions gave rise to Skyscraper's design.
You will learn about the problems posed by real-world sites and the solutions Daniel has implemented in Skyscraper to overcome them, including lessons learned from his biggest scraping project to date, a scraper covering 500K+ pages of the Polish parliament's website.
You will also explore the realm of data structures: how the output of Skyscraper fits the definition of a data table and how to represent these in Clojure.
Skyscraper: Restructuring the Web
Daniel has been in love with functional programming languages ever since being exposed to OCaml in his freshman year at Warsaw University in 2000. He has since worked with Standard ML, Haskell, Scheme, Common Lisp and Clojure, which is now his preferred way of expressing thoughts as code. He writes Ruby for a living, and hacks on Clojure in his spare time. When not coding, he can be found playing Scrabble, cycling or petting cats.