How to Scrape Data from Zillow: A Step-by-Step Guide

Roberta Aukstikalnyte

2023-03-24, 5 min read

Zillow is one of the largest real estate websites in the United States, with 200+ million visits per month. With a number this big, it’s no surprise that this website contains immense amounts of valuable information for real estate professionals. But to take advantage of this data, you’ll require a reliable web scraping solution. In today’s article, we’ll give an in-depth demonstration of how to use the Zillow data API to gather real estate listings data.

Benefits of real estate data scraping

Before we get to the actual steps, let’s answer one crucial question: is scraping property listings beneficial? Yes, for several reasons:

Collecting bulk data

Automated web scraping tools allow you to easily gather large amounts of data from multiple sources. This saves you hours of repetitive work; collecting such volumes of data manually would be nearly impossible. And, as a real estate professional, you’ll definitely need large quantities of data to make informed decisions.

Accessing data from various sources

Certain trends and patterns may not be apparent from a single source of data. That said, it would be wise to scrape data from several sources, including listings sites, property portals, and individual agent or broker websites. This way, you’ll be sure to get a more comprehensive view of the real estate market.

Detecting new opportunities

Scraping real estate data can also help you identify opportunities and make more informed decisions. For example, as an investor, you can use scraped data to identify real estate properties that are undervalued or overvalued in order to make more profitable investment decisions.

Similarly, you can use scraped data to identify properties that are similar to your own listings – this way, you can determine the optimal pricing and marketing strategy.

Challenges of real estate data scraping

Nothing good ever comes easy, and the process of scraping real estate websites is no exception. Let’s take a look at some of the common obstacles you may come across during the process:

Sophisticated dynamic layouts

Often, property websites use complex and dynamic web layouts. Because of that, it may be difficult for web scrapers to adapt and extract relevant information. As a result, the extracted data may be inaccurate or incomplete, requiring you to make fixes manually.

Advanced anti-scraping measures

Another common challenge is that many property websites load their content dynamically with JavaScript and AJAX and protect their pages with CAPTCHAs. These measures may prevent you from gathering the data or even result in an IP block, so you’ll need specific techniques to bypass them.

Questionable data quality

It’s no secret that property prices change rapidly; hence, there’s a risk of receiving outdated information that doesn’t reflect the present state of the real estate market.

Copyrighted data

All in all, the legality of web scraping is a largely debated topic. And, when it comes to scraping real estate websites, it’s no exception. The rule of thumb is if the data is considered publicly available, you should be able to scrape it. On the other hand, if the data is copyrighted, you should respect the rules and not scrape it. In general, it’s best if you consult a legal professional about your specific situation so you can be sure you’re not breaching any rules.

How to scrape Zillow using Real Estate Scraper API

To gather Zillow data, we’ll be using Python to interact with the Real Estate Scraper API; however, you can choose a different programming language if you like.

1. Install Python and required libraries

We’ll begin by installing the latest version of Python. Once that’s done, you’ll need to install the following packages using Python's package manager pip:

python -m pip install requests bs4

The command above will install the `requests` and `bs4` libraries. We’ll use these modules to interact with Real Estate Scraper API and parse the extracted HTML.

2. Send a POST request to the Real Estate Scraper API with the source and URL parameters

Before we start writing the code, let’s discuss the parameters of the API. Oxylabs’ Real Estate Scraper API requires only two parameters – source and url; the rest are optional. Let’s take a look at what they do:

source – to scrape Zillow data, this parameter needs to be set to universal;

url – a valid link to any Zillow page;

user_agent_type – sets the device type and browser;

geo_location – allows acquiring data from a specific location;

locale – sets the `Accept-Language` header based on this parameter;

render – enables JavaScript rendering.
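The parameters above map onto a plain Python dictionary. The following is a minimal sketch rather than part of the API itself: `build_payload` is a hypothetical helper, and the `render` value shown is an assumption that you should check against the Oxylabs documentation.

```python
def build_payload(url, user_agent_type=None, geo_location=None,
                  locale=None, render=None):
    """Build a request payload, leaving out optional parameters that are not set."""
    # The two required parameters described above
    payload = {"source": "universal", "url": url}
    optional = {
        "user_agent_type": user_agent_type,
        "geo_location": geo_location,
        "locale": locale,
        "render": render,
    }
    # Only include the optional parameters that were actually provided
    payload.update({key: value for key, value in optional.items()
                    if value is not None})
    return payload


example = build_payload(
    "https://www.zillow.com/homes/for_sale/_rb/",
    user_agent_type="desktop",
    render="html",  # assumed value; verify it in the API documentation
)
print(example)
```

Keeping unset parameters out of the payload means the API falls back to its own defaults instead of receiving empty values.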

In this section, we’ll build a web scraper that will allow us to extract data from Zillow search results. First, let’s import the necessary dependencies:

import requests
from bs4 import BeautifulSoup

Next, we’ll insert a search query and copy the URL. For this example, we’ll search for properties on sale, which gives us this URL: https://www.zillow.com/homes/for_sale/_rb/

Using this URL, we’ll create a payload:

		 	 	 		
url = "https://www.zillow.com/homes/for_sale/_rb/"
payload = {
    'source': 'universal',
    'url': url,
    'user_agent_type': 'desktop',
}

Now, we’ll use the requests module to make a POST request to the API. We’ll store the result in the response variable:

response = requests.post(
    'https://realtime.oxylabs.io/v1/queries',
    auth=('USERNAME', 'PASSWORD'),
    json=payload,
)

Notice that we’re passing a tuple with `username` and `password`; make sure to replace those with your own Oxylabs credentials. We’ll also send the payload as `json`.

Now, we’ll print the response code to validate that the request was sent successfully:

print(response.status_code)

Here, you should get a 200 status code; if you don’t, make sure your internet connection is working and you’ve entered the correct URL and credentials.
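If a request occasionally fails, you may want to retry it before giving up. Below is a minimal sketch under stated assumptions: `post_with_retry` is a hypothetical helper, and the number of attempts and the delay between them are arbitrary choices, not Oxylabs recommendations. The `send` argument is any function that performs the request and returns an object with a `status_code` attribute.

```python
import time


def post_with_retry(send, retries=3, delay_seconds=1.0):
    """Call send() until it returns a 200 response or the attempts run out."""
    if retries < 1:
        raise ValueError("retries must be at least 1")
    response = None
    for attempt in range(1, retries + 1):
        response = send()
        if response.status_code == 200:
            return response
        # Wait a little longer after each failed attempt
        if attempt < retries:
            time.sleep(delay_seconds * attempt)
    raise RuntimeError(
        f"Request failed after {retries} attempts, "
        f"last status code: {response.status_code}"
    )
```

With the code from this tutorial, `send` could be `lambda: requests.post('https://realtime.oxylabs.io/v1/queries', auth=('USERNAME', 'PASSWORD'), json=payload)`. Note that a status code such as 401 usually points to wrong credentials, in which case retrying won’t help.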

3. Parse the HTML using BeautifulSoup

Next, we’ll parse the Zillow website’s HTML using the Beautiful Soup library. First, we need to grab the HTML from the json output of the API and then we’ll parse it:

content = response.json()['results'][0].get("content", "")
soup = BeautifulSoup(content, 'html.parser')
data = []
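The lookup above raises an error if the response has no results. A slightly more defensive version could look like the sketch below; `extract_content` is a hypothetical helper, and it assumes the response body has the same shape the code above relies on, that is, a `results` list whose first item has a `content` field.

```python
def extract_content(api_json):
    """Return the HTML string from an API response body, or '' if it is missing."""
    results = api_json.get("results") or []
    if not results:
        return ""
    content = results[0].get("content", "")
    # Guard against a non-string value so the HTML parser always gets text
    return content if isinstance(content, str) else ""


sample = {"results": [{"content": "<html><body>Hello</body></html>"}]}
print(extract_content(sample))
print(repr(extract_content({"results": []})))
```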

The soup object will contain the parsed HTML of the Zillow page. The rest of the task is easy – we’ll parse it just like any other normal HTML page. See below:

# find_all returns every matching card; find would return only the first one
for div in soup.find_all("div", {
        "class": "StyledPropertyCardDataArea-c11n-8-85-1__sc-yipmu-0"}):
    price = div.find("span", {"data-test": "property-card-price"}).text
    address = div.find("address", {"data-test": "property-card-addr"}).text
    data.append({
        "price": price,
        "address": address,
    })

Here, we’re simply looping over the search results and parsing each one to grab its price and address. To find the right tags, we inspect the page’s HTML and then select them with Beautiful Soup’s find methods. The same technique works for other properties as well.
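When an element is missing from a card, calling `.text` on the result of `find` fails, because `find` returns `None` if nothing matches. One way to handle this is sketched below; `find_text` is a hypothetical helper, not part of Beautiful Soup, and it works with any object that offers a `find(tag, attrs)` method, such as a Beautiful Soup tag.

```python
def find_text(parent, tag, attrs, default=None):
    """Return the stripped text of the first matching child, or default if none matches."""
    node = parent.find(tag, attrs)
    if node is None:
        return default
    return node.text.strip()
```

Inside the loop, this could be used as `price = find_text(div, "span", {"data-test": "property-card-price"})`, so a card without a price yields `None` instead of stopping the whole run.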

Scraping individual listings

Now, let’s see how we can extract individual listings from Zillow. For our example, we’ll be using the link below, but feel free to replace it with your own: https://www.zillow.com/homedetails/3789-Conley-Downs-Ln-Decatur-GA-30034/14427531_zpid/

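url = "https://www.zillow.com/homedetails/3789-Conley-Downs-Ln-Decatur-GA-30034/14427531_zpid/"
payload = {
    'source': 'universal',
    'url': url,
    'user_agent_type': 'desktop',
}
# Send the request the same way as for the search results page
response = requests.post(
    'https://realtime.oxylabs.io/v1/queries',
    auth=('USERNAME', 'PASSWORD'),
    json=payload,
)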

Next, we inspect the desired elements in a web browser to find their HTML tags and attributes, and then use that information to parse the listing page with Beautiful Soup.

content = response.json()['results'][0].get("content", "")
soup = BeautifulSoup(content, 'html.parser')

price = soup.find("span", {'data-testid': 'price'}).find("span").text
address = soup.find("h1", {'class': 'qxgaF'}).text
# Take the text of the first two bed-bath items
number_of_bed, size = [
    elem.find("strong").text
    for elem in soup.find_all("span", {'data-testid': 'bed-bath-item'})[:2]
]
status = soup.find("span", {'class': 'ixkFNb'}).text

property_data = {
    "link": url,
    "price": price,
    "address": address,
    "number of bed": number_of_bed,
    "size (sqft)": size,
    "status": status,
}
print(property_data)
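The scraped values are strings, so comparing prices or sizes requires converting them to numbers first. The sketch below shows one simple approach; `to_int` is a hypothetical helper, and it assumes whole-number values with an optional currency symbol and thousands separators. It is not suitable for abbreviated or decimal values such as "350.5K".

```python
import re


def to_int(text):
    """Extract the digits from strings such as '$350,000'; return None if there are none."""
    digits = re.sub(r"[^\d]", "", text or "")
    return int(digits) if digits else None


print(to_int("$350,000"))
print(to_int("1,540"))
print(to_int("--"))
```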

Scraping real estate agent data

We can also extract real estate agent data using the same API. For this, we only need to slightly modify the search result scraper we built earlier and use the correct HTML tags and attributes.

For this example, we’ll use a URL to a list of real estate agents in the Decatur GA area:
https://www.zillow.com/professionals/real-estate-agent-reviews/decatur-ga/

import requests
from bs4 import BeautifulSoup

url = "https://www.zillow.com/professionals/real-estate-agent-reviews/decatur-ga/"
payload = {
    'source': 'universal',
    'url': url,
    'user_agent_type': 'desktop',
}
response = requests.post(
    'https://realtime.oxylabs.io/v1/queries',
    auth=('USERNAME', 'PASSWORD'),
    json=payload,
)
print(response.status_code)

content = response.json()['results'][0].get("content", "")
soup = BeautifulSoup(content, 'html.parser')
agents = []
# find_all returns every matching table row; find would return only the first one
for elem in soup.find_all("tr", {"class": "cUqKEI"}):
    agent_name = elem.find("a", {"class": "jMHzWg"}).text
    agent_link = elem.find("a", {"class": "jMHzWg"}).get("href")
    phone = elem.find("div", {"class": "dlivvk"}).text
    address = elem.find("div", {"class": "bmKuCz"}).text
    agents.append({
        "Name": agent_name,
        "Link": agent_link,
        "Phone": phone,
        "Address": address,
    })
print(agents)

Once you run this code, it’ll use the Real Estate Scraper API and extract the list of real estate agents. Keep in mind that Zillow frequently changes the layout of its website and the HTML attributes. You might have to change a few class names or attributes if the above code stops working; it should be relatively simple though.
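Since the class names are likely to change, it can help to keep them in one place. The following is a minimal sketch of that idea, not a definitive implementation: `AGENT_SELECTORS` and `parse_agent_row` are hypothetical names, and the class names are copied from the code above, so they may already be outdated.

```python
# Selectors copied from the tutorial code; update them here when the markup changes
AGENT_SELECTORS = {
    "Name": ("a", {"class": "jMHzWg"}),
    "Phone": ("div", {"class": "dlivvk"}),
    "Address": ("div", {"class": "bmKuCz"}),
}


def parse_agent_row(row, selectors=AGENT_SELECTORS):
    """Extract one agent record from a table row; fields that are not found become None."""
    record = {}
    for field, (tag, attrs) in selectors.items():
        node = row.find(tag, attrs)
        record[field] = node.text.strip() if node is not None else None
    return record
```

With the `soup` object from above, the list of agents could then be built with `agents = [parse_agent_row(row) for row in soup.find_all("tr", {"class": "cUqKEI"})]`. When Zillow changes its layout, only the dictionary needs to be updated.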

Conclusion

Due to the frequent layout changes and anti-bot measures, scraping Zillow can be rather challenging. Luckily, Oxylabs’ Zillow data scraper is designed to deal with these obstacles so you can scrape Zillow data successfully.

If you run into any questions or uncertainties, don’t hesitate to reach out to our support team via email or live chat on our website. Our professional team will gladly consult you about any matter related to scraping public data from Zillow.

About the author

Roberta Aukstikalnyte

Senior Content Manager

Roberta Aukstikalnyte is a Senior Content Manager at Oxylabs. Having worked various jobs in the tech industry, she especially enjoys finding ways to express complex ideas in simple ways through content. In her free time, Roberta unwinds by reading Ottessa Moshfegh's novels, going to boxing classes, and playing around with makeup.


All information on Oxylabs Blog is provided on an "as is" basis and for informational purposes only. We make no representation and disclaim all liability with respect to your use of any information contained on Oxylabs Blog or any third-party websites that may be linked therein. Before engaging in scraping activities of any kind you should consult your legal advisors and carefully read the particular website's terms of service or receive a scraping license.
