Scrapy
Scrapy is an open-source framework written in Python for extracting data from websites. It is designed for web scraping, that is, the automated retrieval of data from web pages.
When programming, we often rely on existing APIs to provide the data our application needs. For example, an app that shows the current weather has to get its data from somewhere, and most often that source is a publicly available API. But what if no API offers the data we are interested in? That is when page scraping is worth considering. This article introduces a tool that helps us scrape pages.

What is page scraping?
Page scraping is simply extracting content from a page and saving it, for example for use in your own application. It is used by services such as Ceneo and Google, and by portals that aggregate job listings from other portals. Keep in mind that what we later do with such data can sometimes be illegal.
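To make the idea concrete, here is a minimal sketch of page scraping that uses only Python's standard library, before any framework is involved. The HTML snippet, the `job-title` class name, and the `JobTitleExtractor` and `extract_job_titles` names are hypothetical and exist only for this illustration; a real page would be downloaded first and would have its own markup.

```python
from html.parser import HTMLParser

# Hypothetical HTML standing in for a page downloaded from a job portal.
SAMPLE_PAGE = """
<html><body>
  <h2 class="job-title">Python Developer</h2>
  <p>Remote, full time</p>
  <h2 class="job-title">Data Engineer</h2>
</body></html>
"""


class JobTitleExtractor(HTMLParser):
    """Collects the text of every <h2 class="job-title"> element."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._inside_title = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) tuples
        if tag == "h2" and ("class", "job-title") in attrs:
            self._inside_title = True

    def handle_data(self, data):
        if self._inside_title:
            self.titles.append(data.strip())

    def handle_endtag(self, tag):
        if tag == "h2":
            self._inside_title = False


def extract_job_titles(html):
    """Return the job titles found in the given HTML string."""
    parser = JobTitleExtractor()
    parser.feed(html)
    return parser.titles


print(extract_job_titles(SAMPLE_PAGE))
# prints ['Python Developer', 'Data Engineer']
```

Hand-written parsers like this become tedious once you need to download pages, follow links, and handle many page layouts, which is the work a scraping framework takes over.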
What is Scrapy?
Scrapy is a Python framework and one of the most popular and powerful tools for scraping websites. It provides the tools you need to efficiently extract data from pages, process it, and store it in your preferred structure and format. Scrapy is easy to use, supports asynchronous requests, and can automatically adjust its crawling speed with the AutoThrottle extension.
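As a rough sketch, the throttling and concurrency behaviour mentioned above is controlled through Scrapy settings, typically placed in a project's `settings.py`. The setting names below are real Scrapy settings, but the values are illustrative assumptions and should be tuned for the site being scraped.

```python
# settings.py (sketch): the values are illustrative assumptions, not recommendations.

# Respect the robots.txt rules published by the target site.
ROBOTSTXT_OBEY = True

# Upper bound on the number of requests Scrapy keeps in flight at once.
CONCURRENT_REQUESTS = 8

# Enable the AutoThrottle extension, which adapts the download delay
# to how quickly the site responds.
AUTOTHROTTLE_ENABLED = True
# Initial download delay, in seconds.
AUTOTHROTTLE_START_DELAY = 5.0
# Maximum delay, in seconds, to use when the site responds slowly.
AUTOTHROTTLE_MAX_DELAY = 60.0
# Average number of requests to send in parallel to each remote site.
AUTOTHROTTLE_TARGET_CONCURRENCY = 1.0
```

A lower target concurrency and a higher start delay make the crawl slower but gentler on the remote server.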
Scrapy Spider
The most important parts of Scrapy are the Spider classes. Scrapy uses them to collect information from a website: they define which pages are visited and how data is extracted from each page.
An example of a Spider class that extracts quotes from a page:
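import scrapy


class QuotesSpider(scrapy.Spider):
    # Unique name that identifies this spider
    name = 'quotes'
    # The crawl starts from these URLs
    start_urls = [
        'https://quotes.toscrape.com/tag/humor/',
    ]

    def parse(self, response):
        # Each quote on the page sits in its own <div class="quote">
        for quote in response.css('div.quote'):
            yield {
                'author': quote.xpath('span/small/text()').get(),
                'text': quote.css('span.text::text').get(),
            }

        # Follow the "next page" link, if there is one, and parse it the same way
        next_page = response.css('li.next a::attr("href")').get()
        if next_page is not None:
            yield response.follow(next_page, self.parse)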
We save this code in the file "quotes_spider.py" and start our scraping bot with the command:
scrapy runspider quotes_spider.py -o quotes.jl
When our bot finishes its work, we should get a file "quotes.jl" containing the scraped quotes in JSON Lines format, that is, one JSON object per line:
{"author": "Jane Austen", "text": "\u201cThe person, be it gentleman or lady, who has not pleasure in a good novel, must be intolerably stupid.\u201d"}
{"author": "Steve Martin", "text": "\u201cA day without sunshine is like, you know, night.\u201d"}
{"author": "Garrison Keillor", "text": "\u201cAnyone who thinks sitting in church can make you a Christian must also think that sitting in a garage can make you a car.\u201d"}
...
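Because every line of the output is a separate JSON object, the file can be loaded back with the standard `json` module. The sketch below assumes the file layout shown above; the `read_json_lines` helper is a hypothetical name introduced for this example, and the sample line is copied from the output rather than read from disk.

```python
import json


def read_json_lines(lines):
    """Parse an iterable of JSON Lines text into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


# One line in the same shape as the spider's output, followed by a blank line.
sample = [
    r'{"author": "Steve Martin", "text": "\u201cA day without sunshine is like, you know, night.\u201d"}',
    '',
]

quotes = read_json_lines(sample)
print(quotes[0]["author"])
# prints Steve Martin

# With the real file produced by the spider, the same helper can be used as:
# with open("quotes.jl", encoding="utf-8") as f:
#     quotes = read_json_lines(f)
```

The `\u201c` and `\u201d` escapes in the file are ordinary JSON escapes for curly quotation marks, so they are decoded back into those characters when the line is parsed.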