Scrapy
Scrapy is an open-source framework written in Python for web scraping, that is, the automated retrieval and processing of data from websites.
When programming, we often rely on existing APIs to supply the data our application needs. For example, an app that shows the current weather has to get that data from somewhere, and usually a public API provides it. But what if no API offers the data we are interested in? That is when page scraping is worth considering. This article introduces a tool that helps us scrape pages.

What is page scraping?
Page scraping means extracting content from a page and saving it, for example for use in your own application. It is used by sites such as Ceneo and Google, and by portals that aggregate job listings from other portals. Keep in mind that what we later do with such data can sometimes be illegal.
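To make the idea concrete, here is a minimal sketch of "extract and save" that uses only Python's standard library. The HTML fragment, the QuoteTextParser class, and the span class name "text" are assumptions made up for this illustration; a real scraper would download the page first, and Scrapy handles this far more conveniently.

```python
from html.parser import HTMLParser


class QuoteTextParser(HTMLParser):
    """Collects the text of every <span class="text"> element."""

    def __init__(self):
        super().__init__()
        self._inside_quote = False
        self.quotes = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag.
        if tag == "span" and ("class", "text") in attrs:
            self._inside_quote = True
            self.quotes.append("")

    def handle_endtag(self, tag):
        if tag == "span":
            self._inside_quote = False

    def handle_data(self, data):
        # Only keep text that appears inside a matching <span>.
        if self._inside_quote:
            self.quotes[-1] += data


# Hypothetical page fragment standing in for a downloaded page.
SAMPLE_HTML = """
<div class="quote"><span class="text">First quote</span><small>Author A</small></div>
<div class="quote"><span class="text">Second quote</span><small>Author B</small></div>
"""

parser = QuoteTextParser()
parser.feed(SAMPLE_HTML)
print(parser.quotes)  # ['First quote', 'Second quote']
```

The extracted list can then be stored in whatever form the application needs, such as a file or a database.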
What is Scrapy?
Scrapy is a Python framework and one of the most popular and powerful tools for scraping websites. It provides the tools you need to efficiently extract data from pages, process it, and store it in your preferred structure and format. Scrapy is easy to use, sends requests asynchronously, and can adjust its crawling speed automatically with the AutoThrottle extension.
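Automatic throttling is controlled through project settings and, as far as I know, has to be switched on explicitly. The fragment below is a sketch of what that might look like in a project's settings.py; the numeric values are illustrative assumptions, not recommendations.

```python
# settings.py of a Scrapy project (illustrative values)

# Turn the AutoThrottle extension on.
AUTOTHROTTLE_ENABLED = True

# Initial download delay in seconds, before any latency has been measured.
AUTOTHROTTLE_START_DELAY = 5.0

# Upper bound for the delay (in seconds) when the server responds slowly.
AUTOTHROTTLE_MAX_DELAY = 60.0

# Average number of requests Scrapy should send in parallel to each remote site.
AUTOTHROTTLE_TARGET_CONCURRENCY = 1.0
```

With these settings Scrapy adapts the delay between requests to the response times it observes, instead of using one fixed delay for the whole crawl.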
Scrapy Spider
The most important parts of a Scrapy project are its Spider classes. Scrapy uses them to collect information from a website: each Spider defines which pages to visit and how to extract data from them.
An example of a Spider class that extracts quotes from a page:
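import scrapy


class QuotesSpider(scrapy.Spider):
    # Unique name that identifies this spider.
    name = 'quotes'
    # The crawl starts from these URLs.
    start_urls = [
        'https://quotes.toscrape.com/tag/humor/',
    ]

    def parse(self, response):
        # Each quote on the page sits in its own <div class="quote">.
        for quote in response.css('div.quote'):
            yield {
                'author': quote.xpath('span/small/text()').get(),
                'text': quote.css('span.text::text').get(),
            }

        # Follow the "next page" link, if there is one, and parse it the same way.
        next_page = response.css('li.next a::attr("href")').get()
        if next_page is not None:
            yield response.follow(next_page, self.parse)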
Save this code in a file named "quotes_spider.py" and start the scraping bot with the command:
scrapy runspider quotes_spider.py -o quotes.jl
When the bot finishes its work, we should get a file "quotes.jl" containing the scraped quotes in JSON Lines format, with one JSON object per line:
{"author": "Jane Austen", "text": "\u201cThe person, be it gentleman or lady, who has not pleasure in a good novel, must be intolerably stupid.\u201d"}
{"author": "Steve Martin", "text": "\u201cA day without sunshine is like, you know, night.\u201d"}
{"author": "Garrison Keillor", "text": "\u201cAnyone who thinks sitting in church can make you a Christian must also think that sitting in a garage can make you a car.\u201d"}
...
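Because every line of the output is a separate JSON object, the file can be read back with the standard json module. The sketch below assumes records with "author" and "text" keys, as in the output above; the load_json_lines helper and the sample records are hypothetical, and a temporary file stands in for "quotes.jl" so the example runs without crawling anything.

```python
import json
import os
import tempfile


def load_json_lines(path):
    """Return a list with one dict per non-empty line of a JSON Lines file."""
    records = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


# Stand-in data in the same shape as the spider's output.
sample = [
    {"author": "Jane Austen", "text": "sample text one"},
    {"author": "Steve Martin", "text": "sample text two"},
]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "quotes.jl")
    # Write one JSON object per line, as a .jl file expects.
    with open(path, "w", encoding="utf-8") as handle:
        for record in sample:
            handle.write(json.dumps(record) + "\n")
    quotes = load_json_lines(path)

authors = [quote["author"] for quote in quotes]
print(authors)  # ['Jane Austen', 'Steve Martin']
```

In a real project you would call load_json_lines("quotes.jl") directly on the file produced by the spider.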