Zillow Provides Near-Real-Time Home-Value Estimates Using Amazon Kinesis

Technology Category
  • Platform as a Service (PaaS) - Data Management Platforms
Applicable Functions
  • Sales & Marketing
  • Business Operation
Use Cases
  • Real-Time Location System (RTLS)
  • Predictive Quality Analytics
  • Remote Asset Management
Services
  • Cloud Planning, Design & Implementation Services
  • Data Science Services
The Challenge
Zillow Group, the owner and operator of the largest online real-estate and home-related brands, was struggling to provide timely and accurate home valuations, known as Zestimates, for all new homes. The company's in-house machine-learning framework ran on premises and scaled vertically, so it could not scale fast enough to keep up with the growing volume of data and the increasing complexity of the machine-learning models needed for accurate Zestimates. The company specifically sought a distributed platform that would enable the fast creation and execution of massively parallel machine-learning jobs. The existing technology took too long to compute Zestimates, sometimes more than a day, which meant that customers weren’t getting updated information fast enough.
About The Customer
Zillow Group owns and operates a portfolio of the largest online real-estate and home-related brands, including the Zillow website. Tens of millions of users search Zillow daily for information about 110 million homes and apartments across the U.S. The most popular feature of the Zillow website is the Zestimate—a home-valuation tool that provides buyers and sellers with the estimated market value for a specific home. Zillow currently offers Zestimates for more than 100 million homes in the U.S., with hundreds of attributes for each property. The company uses a wide variety of public-record data—including tax assessments, sales transactions, images of homes, MLS listing data, and other information provided by homeowners—as inputs to its Zestimate algorithm.
The Solution
Zillow decided to expand its use of Amazon Web Services (AWS) to solve the scalability and performance problems it faced with the Zestimate tool, and chose to run Apache Spark on Amazon Elastic MapReduce (Amazon EMR). By running its machine-learning algorithms with Spark on Amazon EMR, Zillow can quickly create scalable Spark clusters and use Spark’s distributed processing capabilities to process large data sets in near real time, create features, and train and score millions of machine-learning models. Zillow uses Amazon Kinesis Streams to ingest a variety of data, including public-property records, home tax assessments, sales transactions, images and video, MLS-listing data, and user-provided information. All of this data is pushed into Spark on Amazon EMR, which runs the machine-learning models and gives users near-real-time Zestimates.
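To make the ingestion step concrete, the following is a minimal sketch of how a single property event might be packaged for an Amazon Kinesis stream. It is not Zillow's code: the stream name, the event fields, and the helper `build_put_record_args` are hypothetical, chosen only for illustration. The one real interface assumed is the boto3 Kinesis client's `put_record` call, which takes `StreamName`, `Data`, and `PartitionKey`.

```python
import json


def build_put_record_args(stream_name, event):
    """Build keyword arguments for a Kinesis PutRecord request.

    Hypothetical helper: the event schema is an assumption, not Zillow's.
    Using the property ID as the partition key keeps all events for one
    home on the same shard, so they are read in the order they were written.
    """
    return {
        "StreamName": stream_name,
        "Data": json.dumps(event, sort_keys=True).encode("utf-8"),
        "PartitionKey": str(event["property_id"]),
    }


# Hypothetical event; field names and values are illustrative only.
event = {
    "property_id": "home-123",
    "type": "tax_assessment",
    "assessed_value": 350000,
}
args = build_put_record_args("property-events", event)

# With boto3 installed and AWS credentials configured, the record
# could then be sent with:
#   import boto3
#   boto3.client("kinesis").put_record(**args)
```

A downstream consumer, such as a Spark job on Amazon EMR, would decode `Data` back into the event and use it to refresh the features for that property.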
Operational Impact
  • Zillow can run massively parallel machine-learning jobs across multiple nodes of a distributed platform to calculate Zestimates.
  • Zillow can compute Zestimates faster and more frequently, because Amazon Kinesis Streams and Spark on Amazon EMR enable near-real-time data processing.
  • Zillow does not have to manage and scale a fleet of servers for ingesting real-time streaming data.
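The idea behind training many independent models in parallel can be illustrated with a small sketch. This is a stand-in under stated assumptions, not Zillow's method: it uses a local thread pool in place of a Spark cluster, a toy model (median price per square foot) in place of a real valuation algorithm, and made-up sales figures. All names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import median

# Hypothetical sales grouped by region: (square_feet, sale_price).
SALES_BY_REGION = {
    "region-a": [(1000, 200000), (1500, 330000), (2000, 400000)],
    "region-b": [(800, 320000), (1200, 450000)],
}


def train_region_model(item):
    """Fit a toy per-region model: the median price per square foot."""
    region, sales = item
    return region, median(price / sqft for sqft, price in sales)


def train_all(sales_by_region, max_workers=4):
    """Train one independent model per region in parallel.

    Each region's model depends only on that region's data, so the work
    can be split across workers with no coordination between them.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(train_region_model, sales_by_region.items()))


def estimate(models, region, sqft):
    """Score a home with its region's model."""
    return models[region] * sqft


models = train_all(SALES_BY_REGION)
print(estimate(models, "region-a", 1200))
```

Because no model needs another model's data, the same structure scales out naturally: on a distributed platform, each group of regions could be assigned to a different node rather than a different thread.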
Quantitative Benefit
  • Zillow can compute Zestimates in seconds, as opposed to hours.
  • Zillow manages petabytes of data in its Amazon S3 data lake.
