Scrapy

Python web-crawling framework
From Wikipedia, the free encyclopedia

Scrapy (/ˈskreɪpaɪ/[2] SKRAY-peye) is a free and open-source web-crawling framework written in Python. Originally designed for web scraping, it can also be used to extract data using APIs or as a general-purpose web crawler.[3] It is currently maintained by Zyte (formerly Scrapinghub), a web-scraping development and services company.

Scrapy
Developer(s): Zyte (formerly Scrapinghub)
Initial release: 26 June 2008
Stable release: 2.12.0[1] / 18 November 2024
Written in: Python
Operating system: Windows, macOS, Linux
Type: Web crawler
License: BSD License
Website: scrapy.org

The Scrapy project architecture is built around "spiders", self-contained crawlers that are each given a set of instructions. In the spirit of other don't-repeat-yourself frameworks, such as Django,[4] it makes large crawling projects easier to build and scale by allowing developers to reuse their code.

Companies and products that use Scrapy include Lyst,[5][6] Parse.ly,[7] Sayone Technologies,[8] Sciences Po Medialab,[9] and Data.gov.uk's World Government Data site.[10]

History

Scrapy originated at the London-based web-aggregation and e-commerce company Mydeco, where it was developed and maintained by employees of Mydeco and Insophia (a web-consulting company based in Montevideo, Uruguay). The first public release was in August 2008 under the BSD license, and the milestone 1.0 release followed in June 2015.[11] In 2011, Zyte (formerly Scrapinghub) became the new official maintainer.[12][13]

References
