This document discusses web scraping and data extraction. It defines scraping as converting unstructured sources such as HTML pages or PDFs into machine-readable formats by separating the data from its formatting. Whether scraping is legal depends on the purpose of the use and on the source's terms of service: most publicly available data is still copyrighted, although fair use may apply. The document outlines the anatomy of a scraper in four stages: loading the document, parsing it, extracting the data, and transforming that data into the desired form. It also reviews several scraping tools and libraries for different programming languages.
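
The four stages named above (load, parse, extract, transform) can be illustrated with a minimal sketch. This is not taken from the document itself: the sample HTML, the `TableParser` class, and the function names `load`, `parse`, `extract`, `transform`, and `scrape` are all hypothetical, and only Python's standard-library `html.parser` module is used. The load step simply returns an in-memory string, where a real scraper would fetch a page or read a file.

```python
from html.parser import HTMLParser

# Hypothetical sample input; a real scraper would fetch this over HTTP
# or read it from disk instead of hard-coding it.
SAMPLE_HTML = """
<html><body>
<table>
  <tr><th>Name</th><th>Count</th></tr>
  <tr><td>Alpha</td><td>1,200</td></tr>
  <tr><td>Beta</td><td>345</td></tr>
</table>
</body></html>
"""


class TableParser(HTMLParser):
    """Collects the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # completed rows, each a list of cell strings
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def load(source):
    """Stage 1: load the document (here the source is already a string)."""
    return source


def parse(html):
    """Stage 2: parse the markup into rows of raw cell text."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    return parser.rows


def extract(rows):
    """Stage 3: separate data from layout by pairing cells with headers."""
    if not rows:
        return []
    header, *body = rows
    return [dict(zip(header, row)) for row in body]


def transform(records):
    """Stage 4: normalize keys and convert values to proper types."""
    return [
        {"name": r["Name"], "count": int(r["Count"].replace(",", ""))}
        for r in records
    ]


def scrape(source):
    """Run the four stages in order."""
    return transform(extract(parse(load(source))))


if __name__ == "__main__":
    print(scrape(SAMPLE_HTML))
```

Keeping each stage as its own function means the loading step can later be swapped for an HTTP request, or the parser for a third-party library, without touching the extraction and transformation logic.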