Zillow Extract Filtered Listings
Purpose
Given a location (city + state, ZIP, neighborhood, full Zillow URL, or free-form
region) and a multi-dimensional filter spec (price, beds, baths, sqft, lot size,
year built, days-on-market, property type, listing status, HOA, monthly payment,
home features), return the matching active for-sale listings as structured JSON
— including zpid, formatted + raw price, beds, baths, interior sqft, lot size
with unit, full address, property type, listing status, days on Zillow,
Zestimate (when present), HOA (when present), monthly-payment estimate (when
present), primary photo URL, and the canonical detail URL — plus region-wide
totalResultCount, totalPages, and the exact searchQueryState URL used.
Read-only — never click Save, Tour, Contact Agent, or any mutation control.
When to Use
- A user asks for "homes for sale in {region} with {filters}".
- An agent needs comparable for-sale comps for a property.
- Daily/hourly monitoring of new listings matching a complex filter (e.g., "single-family or townhouse, $400k–$750k, 3+ beds, listed in last 30 days, with garage and A/C").
- Any workflow that previously scraped Zillow's HTML: the
__NEXT_DATA__ blob is faster, structurally richer, and (via browse cloud fetch --proxies) bypasses Zillow's anti-bot wall that hard-blocks scripted browsers.
Workflow
Zillow's SRP is a Next.js app that server-renders the full search state into
a <script id="__NEXT_DATA__"> JSON blob on the initial HTML response. The
canonical filter representation is the searchQueryState URL query parameter
(URL-encoded JSON). Constructing a filtered URL upfront and reading the
response's __NEXT_DATA__ is the right path — fetching unfiltered results and
post-filtering client-side is wasteful and lossy.
The optimal transport is browse cloud fetch --proxies (Browserbase's
lightweight HTTP API path), NOT a scripted browser session. Zillow fronts
the site with PerimeterX / HUMAN's "Press & Hold" bot defense, which fires a
hard JS-challenge modal on the very first navigation from a Browserbase-
fingerprinted Chromium — even bare https://www.zillow.com/austin-tx/ gets
blocked across regions (us-east-1, us-west-2, eu-central-1) and across
--verified / --proxies / --solve-captchas flag combinations. The Fetch
API path uses a different HTTP stack + residential-proxy pool that PerimeterX
serves freely (verified 200 OK on identical URLs that the browser path got 403
on, same minute).
1. Resolve location → Zillow SEO slug
Map free-form input to one of Zillow's path-based region slugs. The slug locks
the region; filters layer on top via searchQueryState. Use the slug WITHOUT
filters first if the region resolution looks ambiguous — it's idempotent.
| Input shape | Slug pattern | Example |
|---|---|---|
| City + state | /{city-lowercased-hyphenated}-{state-2letter-lc}/ | Austin, TX → /austin-tx/ |
| ZIP code | /homes/{zip}/ | 30307 → /homes/30307/ |
| Neighborhood | /{nbhd-slug}-{city-slug}-{state}/ | Mission, San Francisco → /mission-san-francisco-ca/ |
| Full Zillow URL | Use as-is | https://www.zillow.com/austin-tx/houses/ |
| Free-form region | Fall back to homepage search resolver (see Gotchas) | "South Bay Area" → needs lookup |
After fetching, verify searchPageState.queryState.regionSelection in the
response — it returns [{regionId: N, regionType: T}] where:
- regionType: 4 (state)
- regionType: 6 (city)
- regionType: 7 (ZIP)
- regionType: 8 (neighborhood)
If the region didn't resolve as expected (e.g., a 404, or the wrong city echoed back), fall back to the free-form resolution chain: slugify heuristically and retry on 404, or go through the homepage search resolver (see Gotchas).
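The slug table above can be sketched as a small helper. This is a minimal illustration, not Zillow's own resolver: the function names (slugify, regionSlug) and the input object shape are hypothetical, and free-form regions deliberately return null so the caller falls through to the lookup chain.

```javascript
// Hypothetical helper: lowercases and hyphenates a place name.
function slugify(text) {
  return text
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse spaces/punctuation into single hyphens
    .replace(/^-+|-+$/g, "");    // trim leading/trailing hyphens
}

// Hypothetical helper: maps structured location input to a Zillow path slug.
// Assumed input shape: { url } | { zip } | { city, state, neighborhood? }.
function regionSlug(loc) {
  if (loc.url) return new URL(loc.url).pathname; // full Zillow URL: keep its path
  if (loc.zip) return `/homes/${loc.zip}/`;
  if (loc.city && loc.state) {
    const cityState = `${slugify(loc.city)}-${loc.state.toLowerCase()}`;
    return loc.neighborhood
      ? `/${slugify(loc.neighborhood)}-${cityState}/`
      : `/${cityState}/`;
  }
  return null; // free-form region: needs the fallback lookup
}

// regionType codes echoed in searchPageState.queryState.regionSelection.
const REGION_TYPES = { 4: "state", 6: "city", 7: "zip", 8: "neighborhood" };
```

After fetching, compare REGION_TYPES[regionSelection[0].regionType] against the kind of input you started with; a ZIP input that echoes back as "city" is a sign the slug resolved to the wrong region.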
2. Build searchQueryState with the full filter surface
searchQueryState is a URL-encoded JSON object. Minimum viable shape:
{
"pagination": {"currentPage": 1},
"isMapVisible": false,
"isListVisible": true,
"regionSelection": [{"regionId": 10221, "regionType": 6}],
"filterState": { /* see below */ }
}
mapBounds is optional — Zillow fills it server-side from the region. Omit
unless you specifically need to scope to a sub-bbox of the region.
filterState schema — these keys are the long-form id values (NOT the
URL-bar shortIds like con/sf). All 110 keys come from
searchPageState.filterDefinitions in any SRP response — print that map once
per Zillow build to verify.
Property type (Boolean; default = true for ALL → everything included)
To narrow to a subset, explicitly set the unwanted types to false. Setting
the wanted type to true alone does nothing because they're all-on by default.
| Filter id | shortId | Label |
|---|---|---|
| isSingleFamily | sf | Houses |
| isCondo | con | Condos/Co-ops |
| isTownhouse | tow | Townhomes |
| isMultiFamily | mf | Multi-family |
| isApartment | apa | Apartments |
| isApartmentOrCondo | apco | Apt-or-condo composite; set false when narrowing away from condos |
| isManufactured | manu | Manufactured |
| isLotLand | land | Lots/Land |
"filterState": {
"isCondo": {"value": false},
"isMultiFamily": {"value": false},
"isApartment": {"value": false},
"isManufactured": {"value": false},
"isLotLand": {"value": false},
"isApartmentOrCondo": {"value": false}
// isSingleFamily + isTownhouse stay default true → houses + townhouses only
}
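Because every type is on by default, narrowing is easier to express as "turn off everything I did not ask for". The sketch below does that; narrowPropertyTypes is a hypothetical name, and the rule for the isApartmentOrCondo composite (left on only when isCondo is wanted) is an assumption drawn from the table note, not verified for apartment-only searches.

```javascript
const PROPERTY_TYPE_KEYS = [
  "isSingleFamily", "isCondo", "isTownhouse", "isMultiFamily",
  "isApartment", "isApartmentOrCondo", "isManufactured", "isLotLand",
];

// Hypothetical helper: returns the filterState fragment that disables every
// property type NOT listed in `wanted`. Wanted types are omitted entirely,
// since their default is already true.
function narrowPropertyTypes(wanted) {
  const keep = new Set(wanted);
  // Assumption: the composite must go false whenever condos are excluded.
  if (keep.has("isCondo")) keep.add("isApartmentOrCondo");
  else keep.delete("isApartmentOrCondo");

  const fragment = {};
  for (const key of PROPERTY_TYPE_KEYS) {
    if (!keep.has(key)) fragment[key] = { value: false };
  }
  return fragment;
}
```

Calling narrowPropertyTypes(["isSingleFamily", "isTownhouse"]) produces the same six false entries as the houses-plus-townhouses example above.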
Listing status (Boolean; defaults vary)
| Filter id | shortId | Default | Label |
|---|---|---|---|
| isForSaleByAgent | fsba | true | Agent listed |
| isForSaleByOwner | fsbo | true | Owner posted |
| isNewConstruction | nc | true | New construction |
| isComingSoon | cmsn | true | Coming soon |
| isAuction | auc | true | Auctions |
| isForSaleForeclosure | fore | true | Foreclosures |
| isPreMarketForeclosure | pmf | false | Foreclosed (pre-market) |
| isPreMarketPreForeclosure | pf | false | Pre-foreclosures |
| isRecentlySold | rs | false | Recently sold |
| isPendingListingsSelected | pnd | false | Pending & under contract |
| isAcceptingBackupOffersSelected | abo | false | Accepting backup offers |
| isOpenHousesOnly | open | false | Open-house only |
"For sale" is the default — you only need to flip statuses when filtering to
something non-default. Example: for sold listings only, set isForSaleByAgent,
isForSaleByOwner, isNewConstruction, isAuction, isForSaleForeclosure, and
isComingSoon to {"value": false}, and set isRecentlySold to {"value": true}.
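The sold-only case is easy to get wrong by missing one of the default-on toggles, so a sketch that derives it from the table may help. soldOnlyStatus is a hypothetical name; the {value: ...} wrapper is assumed to apply to status filters the same way it does to the property-type filters.

```javascript
// Status toggles whose default is true, per the table above.
const FOR_SALE_DEFAULT_ON = [
  "isForSaleByAgent", "isForSaleByOwner", "isNewConstruction",
  "isComingSoon", "isAuction", "isForSaleForeclosure",
];

// Hypothetical helper: filterState fragment for "recently sold only".
function soldOnlyStatus() {
  const fragment = { isRecentlySold: { value: true } };
  for (const key of FOR_SALE_DEFAULT_ON) {
    fragment[key] = { value: false }; // switch off every for-sale variant
  }
  return fragment;
}
```

Merge the result into filterState with object spread, e.g. {...soldOnlyStatus(), price: {min: 400000}}.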
Range filters ({min, max} — both nullable)
| Filter id | shortId | Unit |
|---|---|---|
| price | (none; uses full id) | USD |
| monthlyPayment | mp | USD/month (mortgage + tax + HOA estimate) |
| beds | (none) | integer |
| baths | (none) | number (half-baths allowed → 1.5, 2.5) |
| sqft | (none) | interior living area |
| lotSize | lot | sqft by default; {min, max, units: "acre"} for acres |
| built | (none) | year |
| hoa | (none) | max USD/month (set max, leave min: null) |
| parkingSpots | parks | {min} only |
"price": {"min": 400000, "max": 750000},
"beds": {"min": 3},
"baths": {"min": 2},
"sqft": {"min": 1500},
"built": {"min": 1990},
"hoa": {"max": 200},
"lotSize":{"min": 5000, "max": 20000}
Enum / String filters
| Filter id | shortId | Type | Values |
|---|---|---|---|
| doz | (none) | Enum | "1", "7", "14", "30", "90", "6m", "12m", "24m", "36m", "any" |
| sortSelection | sort | Enum | "globalrelevanceex" (Homes for You, default on SRP), "days" (newest), "pricea" (low→high), "priced" (high→low), "beds", "baths", "size", "lot", "built", "saved", "listingstatus", "featured", "paymentaa" |
| keywords | att | String | free text; searches listing descriptions |
| ageRestricted55Plus | 55plus | Enum | "i" (include), "e" (exclude), "o" (only) |
| dataSourceSelection | dsrc | String | "all" (default) |
Home-feature toggles (Boolean; default = false → set true to require)
| Filter id | shortId | Label |
|---|---|---|
| hasGarage | gar | Must have garage |
| parkingSpots (Range) | parks | Parking spots {min} |
| hasAirConditioning | ac | Must have A/C |
| hasPool | pool | Must have pool |
| isWaterfront | wat | Waterfront |
| singleStory | sto | Single-story only |
| hasBasement | hbas | Has basement |
| isBasementFinished | basf | Finished basement |
| isBasementUnfinished | basu | Unfinished basement |
| hasDisabledAccess | disac | Accessible |
| isCityView / isMountainView / isParkView / isWaterView | cityv / mouv / parkv / watv | View attributes |
| is3dHome | 3d | Has 3D tour |
| onlyWithPhotos | (none) | Has photos |
| onlyPriceReduction | (none) | Price-reduced |
HOA hide-toggle
Setting hoa.max keeps listings whose HOA is at or below that value. On its own,
that comparison excludes listings with unknown HOA data; they are kept only while
the following flag is true, which is its default:
"includeHomesWithNoHoaData": {"value": true} // default = true
If the user wants "no HOA fee at all", combine hoa: {max: 0} with
includeHomesWithNoHoaData: {value: true} and explicitly check hdpData.homeInfo
for hoaFee after extraction — see Gotchas.
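Since hoaFee is often missing at the SRP level, a strict "fee === 0" filter would silently drop most listings. One way to handle the post-extraction check is to partition instead of filter, as sketched here. partitionByHoa is a hypothetical name; the field path hdpData.homeInfo.hoaFee is the one named above.

```javascript
// Hypothetical helper: splits raw listResults entries by what the SRP payload
// says about their HOA fee, so the caller can decide what to do with unknowns.
function partitionByHoa(listings) {
  const groups = { zero: [], unknown: [], nonZero: [] };
  for (const listing of listings) {
    const fee = listing.hdpData?.homeInfo?.hoaFee;
    if (fee === 0) groups.zero.push(listing);                       // confirmed no fee
    else if (typeof fee === "number") groups.nonZero.push(listing); // confirmed fee
    else groups.unknown.push(listing); // absent here: needs a detail-page check
  }
  return groups;
}
```

For a "no HOA at all" request, return groups.zero directly and only follow detailUrl for groups.unknown if the caller accepts the extra fetches.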
3. Construct the URL
const sqs = {
pagination: {currentPage: 1},
isMapVisible: false,
isListVisible: true,
regionSelection: [{regionId: 10221, regionType: 6}],
filterState: { /* keys above */ }
};
const url = `https://www.zillow.com/austin-tx/?searchQueryState=${encodeURIComponent(JSON.stringify(sqs))}`;
For page N > 1, BOTH set searchQueryState.pagination.currentPage = N AND
insert /N_p/ into the path:
https://www.zillow.com/austin-tx/2_p/?searchQueryState=... # page 2
https://www.zillow.com/austin-tx/3_p/?searchQueryState=... # page 3
The pagination.nextUrl / pagination.previousUrl fields in the response
(searchPageState.cat1.searchList.pagination) tell you the next path slug
verbatim — use those when paginating.
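The two-part pagination rule (path segment plus currentPage) can be kept in sync by building both from a single page number. This is a minimal sketch: buildSearchUrl is a hypothetical name, and it assumes the slug starts and ends with "/" as in the table in step 1.

```javascript
// Hypothetical helper: builds the SRP URL for a given page, setting both the
// /N_p/ path segment and searchQueryState.pagination.currentPage together.
function buildSearchUrl(slug, sqs, page = 1) {
  const state = { ...sqs, pagination: { currentPage: page } };
  const path = page > 1 ? `${slug}${page}_p/` : slug; // page 1 has no /N_p/ segment
  const query = encodeURIComponent(JSON.stringify(state));
  return `https://www.zillow.com${path}?searchQueryState=${query}`;
}

// Example: page 2 of a price-filtered Austin search.
const exampleUrl = buildSearchUrl(
  "/austin-tx/",
  {
    isMapVisible: false,
    isListVisible: true,
    regionSelection: [{ regionId: 10221, regionType: 6 }],
    filterState: { price: { min: 400000, max: 750000 } },
  },
  2
);
```

When the previous response is available, prefer its pagination.nextUrl for the path and use this builder only for the first request or as a fallback.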
4. Fetch via Browserbase Fetch API (not a browser session)
export BROWSERBASE_API_KEY="$BB_API_KEY"
browse cloud fetch "$URL" --proxies --allow-redirects --output /tmp/srp.html
Verified 200 OK, ~960KB HTML, full __NEXT_DATA__ blob on the filtered URL.
The browser path returns the PerimeterX "Press & Hold" challenge instead
(see Gotchas — this is the dominant gotcha on Zillow).
5. Parse __NEXT_DATA__
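const m = html.match(/<script id="__NEXT_DATA__" type="application\/json">([\s\S]+?)<\/script>/);
if (!m) throw new Error("__NEXT_DATA__ not found (block page or changed markup)"); // avoid a null dereference
const data = JSON.parse(m[1]);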
const sps = data.props.pageProps.searchPageState;
const listings = sps.cat1.searchResults.listResults; // 41 listings/page
const total = sps.cat1.searchList.totalResultCount; // region-wide total post-filter
const pages = sps.cat1.searchList.totalPages; // total pages
const perPage = sps.cat1.searchList.resultsPerPage; // 41 in observed responses
const nextSlug = sps.cat1.searchList.pagination?.nextUrl; // e.g. "/austin-tx/2_p/"
const filterEcho = sps.queryState.filterState; // ← verify your filter was accepted
const regionEcho = sps.queryState.regionSelection; // ← verify region resolution
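A slightly more defensive version of the parse step is sketched below. It is an illustration under assumptions: the function names and error codes are hypothetical, the block-page check relies on the title text quoted in Gotchas, and the regex is loosened so attribute order on the script tag does not matter.

```javascript
// Hypothetical helper: pulls searchPageState out of SRP HTML, failing loudly
// with a short reason code instead of a null dereference.
function extractSearchPageState(html) {
  if (html.includes("Access to this page has been denied")) {
    throw new Error("perimeterx_block"); // PerimeterX challenge page, not an SRP
  }
  const m = html.match(/<script\b[^>]*\bid="__NEXT_DATA__"[^>]*>([\s\S]*?)<\/script>/);
  if (!m) throw new Error("next_data_missing");
  const sps = JSON.parse(m[1])?.props?.pageProps?.searchPageState;
  if (!sps) throw new Error("search_page_state_missing");
  return sps;
}

// Hypothetical helper: lists requested filter keys that the server did not
// echo back, i.e. filters that were likely dropped or renamed.
function droppedFilterKeys(requested, echoed) {
  return Object.keys(requested).filter((key) => !(key in (echoed ?? {})));
}
```

Run droppedFilterKeys(yourFilterState, sps.queryState.filterState) after every fetch; a non-empty result means the listings do not reflect the full filter you asked for.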
6. Decode each listing
Each entry in listResults[] has:
| Path | Description |
|---|---|
| zpid (string); also hdpData.homeInfo.zpid (number) | Canonical property id |
| price (string, e.g. "$480,000") + unformattedPrice (number, e.g. 480000) | Display + raw price |
| beds, baths (numbers) | Bedrooms/bathrooms; baths is a number that may be 1.5, 2.5 (full + half summed) |
| area (number, sqft) | Interior living area (also hdpData.homeInfo.livingArea) |
| hdpData.homeInfo.lotAreaValue + hdpData.homeInfo.lotAreaUnit | Lot size + unit ("sqft" or "acres") |
| address (string) | "123 Main St, Austin, TX 78704" |
| addressStreet, addressCity, addressState, addressZipcode | Parts |
| latLong ({latitude, longitude}) | Lat/Lon |
| statusType ("FOR_SALE", "FOR_RENT", "RECENTLY_SOLD", ...) | Listing status enum |
| rawHomeStatusCd ("ForSale", "Pending", ...) | Raw status code |
| marketingStatusSimplifiedCd ("For Sale by Agent", "For Sale by Owner", "New Construction", "Coming Soon", "Foreclosure", "Auction") | Marketing variant |
| statusText ("Active", "Pending", ...) | Display label |
| hdpData.homeInfo.homeType ("SINGLE_FAMILY", "CONDO", "TOWNHOUSE", "MULTI_FAMILY", "APARTMENT", "MANUFACTURED", "LOT") | Property-type enum |
| hdpData.homeInfo.daysOnZillow (number) | Days on market |
| hdpData.homeInfo.taxAssessedValue (number, optional) | Tax-assessed value |
| zestimate (number, top-level, optional) | Zillow estimate when shown |
| imgSrc (string) | Primary photo URL |
| detailUrl (string; already absolute, or prepend https://www.zillow.com) | Canonical detail page |
| carouselPhotosComposable (array, optional) | Additional photo URLs |
| hdpData.homeInfo.listing_sub_type (object) | {is_FSBA: true}, {is_FSBO: true}, {is_newHome: true}, {is_foreclosure: true}, etc. |
Per-listing HOA (hoaFee) and monthly-payment-estimate fields are NOT
consistently present in listResults[]. Zillow stores those on the property's
detail page (/homedetails/.../<zpid>_zpid/) — if the user requires them,
follow the detailUrl and parse its __NEXT_DATA__ (gdpClientCache →
property keys). See Gotchas.
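The field table maps onto the output schema roughly as follows. This is a sketch, not a definitive decoder: decodeListing is a hypothetical name, the output keys follow "Expected Output", and monthlyPayment is left null because no SRP-level field for it is documented above.

```javascript
// Hypothetical helper: converts one listResults[] entry into the output shape.
// Missing fields become null rather than undefined so the JSON stays stable.
function decodeListing(raw) {
  const info = raw.hdpData?.homeInfo ?? {};
  const zpid = raw.zpid ?? info.zpid;
  const detailUrl = !raw.detailUrl
    ? null
    : raw.detailUrl.startsWith("http")
      ? raw.detailUrl
      : `https://www.zillow.com${raw.detailUrl}`; // relative path: add the origin
  return {
    zpid: zpid == null ? null : String(zpid),
    price: raw.price ?? null,
    priceRaw: raw.unformattedPrice ?? null,
    beds: raw.beds ?? null,
    baths: raw.baths ?? null,
    area: raw.area ?? info.livingArea ?? null,
    lotAreaValue: info.lotAreaValue ?? null,
    lotAreaUnit: info.lotAreaUnit ?? null,
    address: {
      full: raw.address ?? null,
      streetAddress: raw.addressStreet ?? null,
      city: raw.addressCity ?? null,
      state: raw.addressState ?? null,
      zipcode: raw.addressZipcode ?? null,
    },
    latLong: raw.latLong ?? null,
    propertyType: info.homeType ?? null,
    homeStatus: raw.statusType ?? null,
    statusText: raw.statusText ?? null,
    marketingStatus: raw.marketingStatusSimplifiedCd ?? null,
    listingSubType: info.listing_sub_type ?? null,
    daysOnZillow: info.daysOnZillow ?? null,
    zestimate: raw.zestimate ?? null,
    taxAssessedValue: info.taxAssessedValue ?? null,
    hoa: info.hoaFee ?? null,
    monthlyPayment: null, // assumption: only available from the detail page
    imgSrc: raw.imgSrc ?? null,
    detailUrl,
  };
}
```

Apply it with listings.map(decodeListing) on the listResults array from step 5.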
7. Paginate if needed
If totalPages > 1 AND the caller wants all results, fetch each /N_p/?searchQueryState=...
page (incrementing pagination.currentPage AND the /N_p/ path segment in
tandem). Zillow caps SRP results at the first ~500 listings (≤ 13 pages of
41) regardless of totalResultCount. If totalResultCount > 500, set
resultsCapped: true in the output and advise the caller to narrow filters.
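The cap logic amounts to a small calculation, sketched here. paginationPlan is a hypothetical name, and the 500-result cap and 41-per-page default are the approximate values observed above, not documented Zillow limits.

```javascript
const RESULT_CAP = 500; // approximate SRP cap observed above

// Hypothetical helper: decides which pages are worth fetching and whether the
// output should carry resultsCapped: true.
function paginationPlan(totalResultCount, totalPages, resultsPerPage = 41) {
  const maxPages = Math.ceil(RESULT_CAP / resultsPerPage); // 13 when resultsPerPage is 41
  const lastPage = Math.min(totalPages, maxPages);
  return {
    resultsCapped: totalResultCount > RESULT_CAP,
    pages: Array.from({ length: lastPage }, (_, i) => i + 1), // [1, 2, ..., lastPage]
  };
}
```

Fetch the pages in order and stop early if a page comes back with an empty listResults array.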
8. Build the response
Emit the JSON schema in "Expected Output" below. Include the exact
searchQueryState_url you used and the echoed filterState_applied from
the response (Zillow occasionally normalizes / drops unrecognized keys — the
echo is the source of truth).
Site-Specific Gotchas
- PerimeterX "Press & Hold" hard-blocks the browser path. A scripted Chromium (Browserbase browse open --remote) on https://www.zillow.com/austin-tx/ returns a page whose <title> is "Access to this page has been denied" and whose body reads "Press & Hold to confirm you are a human (and not a bot)" with a PerimeterX reference ID. Verified across:
  - --verified alone
  - --verified --proxies
  - --verified --proxies --solve-captchas
  - regions us-west-2, us-east-1, eu-central-1
  - viewport 1920x1080
  - bare homepage, bare regional SRP, path-based filters (/houses/), and full searchQueryState URLs

  The very first session occasionally renders the bare SRP successfully but shows the Press & Hold modal over the page; any subsequent navigation hard-locks at the HTTP level. PerimeterX appears to fingerprint Browserbase's Chromium fleet, and --solve-captchas does NOT solve the Press & Hold puzzle (it's a long-hold gesture, not a checkbox/click captcha). Don't fight it; use the Fetch API.
- browse cloud fetch --proxies bypasses PerimeterX entirely. Verified 200 OK on identical URLs that the browser path 403'd on, in the same minute. The Fetch API path uses a different HTTP stack + residential-proxy pool that Zillow's PerimeterX rules don't fingerprint. Always pass --proxies, since bare Fetch without proxies may also be flagged on follow-up requests. Always pass --allow-redirects, since Zillow 301's bare-IP hits to geo-localized variants.
- __NEXT_DATA__ is the full SRP state. No XHR scraping needed: props.pageProps.searchPageState contains:
  - queryState (echo of the applied searchQueryState)
  - filterDefinitions (the full 110-filter schema: IDs, types, defaults, shortIds, allowed enums)
  - cat1.searchResults.listResults (the page's listings)
  - cat1.searchList.{totalResultCount, totalPages, resultsPerPage, pagination}
- filterState keys are long-form IDs, not URL-bar shortIds. The URL bar may show e.g. ?searchQueryState=...sortSelection... but Zillow's SEO paths ALSO use shortIds like /3-_beds/. In the JSON body, use the long form (isSingleFamily, not sf; hasGarage, not gar). The shortIds are only for the path-segment SEO URLs.
- Property-type filters are all-on by default. Setting isSingleFamily: {value: true} alone does NOTHING, because every property type is included by default. To narrow to one type, set the other types to {value: false} (and remember isApartmentOrCondo is a composite that must also be false when narrowing away from condos).
- Pagination requires BOTH the path AND the query state. Page 2 of Austin is /austin-tx/2_p/?searchQueryState=...pagination.currentPage=2.... Setting only one of them silently returns page 1. Use the response's searchList.pagination.nextUrl as the authoritative path slug.
- 41 results per page, capped at ~500 total. resultsPerPage is 41 in every observed response. Zillow's SRP scroll caps at ~500 results (~13 pages); beyond that, the SRP returns the same last page and totalPages reflects the cap, not the true total. Mirror totalResultCount > 500 → resultsCapped: true in your output and advise the caller to narrow filters (tighter geography, tighter price band, etc.).
- Lot-size units toggle. lotSize defaults to sqft ({min, max, units: "sqft"}). To filter by acres, set units: "acre" AND interpret hdpData.homeInfo.lotAreaUnit in the response, because Zillow may return some listings in acres and some in sqft (typically sqft below 1 acre, acres above).
- HOA filter behavior. hoa: {max: N} keeps listings with HOA ≤ N. To also include unknown-HOA listings, ensure includeHomesWithNoHoaData: {value: true} (the default). For "zero HOA only": hoa: {max: 0} AND check hdpData.homeInfo.hoaFee post-extraction.
- Per-listing hoaFee, zestimate, monthlyPayment are inconsistently present in listResults[]. zestimate appears on some listings (~30% in Austin sample); hoaFee and monthlyPayment are usually absent at the SRP level. To fetch them reliably, follow detailUrl and parse the detail page's __NEXT_DATA__ (gdpClientCache → ForSaleShopperPlatformFullRenderQuery → property → hoaFee, zestimate, monthlyHoaFee, monthlyHoaFeeDisplay). Only do this when the caller specifically asks, since it's N extra fetches.
- Region slug for free-form input falls back to homepage search. ZIP (/homes/{zip}/), city + state (/{slug}-{state}/), and most major neighborhoods (/{nbhd}-{city}-{state}/) work via direct path. For unknown free-form regions ("South Bay Area", "DFW Metroplex"), fetch the homepage https://www.zillow.com/homes/ and submit the term to the search resolver. Zillow's autocomplete GraphQL endpoint (/zg-graph/autocomplete/results) returns 400 without Apollo client headers (x-apollo-operation-name, client-id, x-caller-id), so the easiest path is to slugify the input heuristically and fall back to a 404-detection retry chain.
- mapBounds is optional. Zillow fills it server-side from regionSelection. Including it scopes to a sub-bbox; omitting it gives the full region.
- Sold listings need isRecentlySold: true AND all for-sale toggles set to false. Otherwise the SRP returns the union of for-sale + recently-sold.
- doz is an Enum, not a Range. Days-on-Zillow is {value: "30"} (string), not {min: 0, max: 30}. Only the listed enum values ("1", "7", "14", "30", "90", "6m", "12m", "24m", "36m", "any") are accepted; unrecognized values default to "any".
- searchPageState.filterDefinitions is the canonical schema source. When in doubt about a filter's shape (Boolean? Range? Enum? what's the default?), fetch ANY SRP response and read its filterDefinitions block. Zillow updates the schema over time, so re-derive once per quarter.
- Detail URLs are canonical. detailUrl is https://www.zillow.com/homedetails/{slug}/{zpid}_zpid/; the _zpid/ suffix is mandatory and the slug is informational. Hitting just https://www.zillow.com/homedetails/{zpid}_zpid/ also redirects to the canonical.
- READ-ONLY. Do NOT click Save, Save Search, Tour, Contact Agent, Apply, or any other CTA on the SRP or detail pages. The skill never makes a state change.
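Two of the gotchas above (mixed lot-size units and the doz enum) lend themselves to small guards. The sketch below is illustrative only: lotAreaSqft and dozFilter are hypothetical names, the unit strings "sqft" and "acres" are the ones listed in step 6, and rejecting unknown doz values client-side is a design choice meant to surface typos before Zillow silently falls back to "any".

```javascript
const SQFT_PER_ACRE = 43560;

// Hypothetical helper: normalizes a listing's lot size to square feet so that
// listings reported in acres and in sqft can be compared directly.
function lotAreaSqft(homeInfo) {
  const { lotAreaValue, lotAreaUnit } = homeInfo ?? {};
  if (typeof lotAreaValue !== "number") return null;
  if (lotAreaUnit === "acres") return lotAreaValue * SQFT_PER_ACRE;
  if (lotAreaUnit === "sqft") return lotAreaValue;
  return null; // unknown unit: refuse to guess
}

const DOZ_VALUES = new Set(["1", "7", "14", "30", "90", "6m", "12m", "24m", "36m", "any"]);

// Hypothetical helper: builds the doz filter fragment, rejecting values that
// are not in the documented enum.
function dozFilter(value) {
  const v = String(value);
  if (!DOZ_VALUES.has(v)) throw new RangeError(`unsupported doz value: ${v}`);
  return { doz: { value: v } };
}
```

For example, a caller asking for "listed in the last 45 days" has no exact enum match; pick "90" and post-filter on daysOnZillow rather than sending "45".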
Browser fallback (use only if Fetch API fails)
If browse cloud fetch --proxies starts returning 4xx (e.g., Zillow extends
PX to the Fetch path):
- Spin a fresh browse cloud sessions create --keep-alive --verified --proxies --solve-captchas --region us-east-1 session.
- Open https://www.zillow.com/ (homepage, NOT the regional SRP first).
- wait timeout 6000 for any PX challenge animation to finish.
- Use browse fill to enter the location in the search box, then browse press Enter. This routes through the SPA and may bypass the direct-URL PX fingerprint.
- Read __NEXT_DATA__ via browse eval: JSON.parse(document.getElementById('__NEXT_DATA__').textContent).props.pageProps.searchPageState
- To apply filters, use the SRP's filter UI: click each filter pill, set values via browse fill / browse click, then browse click the "Apply" button. Zillow's SPA will update searchQueryState via history.pushState without triggering a hard navigation, which avoids the PX fingerprint.
- After every action, wait timeout 2000 and re-read __NEXT_DATA__ / window.__INITIAL_STATE__ for fresh results.
This path costs ~5–10× more turns than the Fetch path and is brittle to PX modal interrupts. Use Fetch API as primary; fall back to browser only when Fetch breaks.
Expected Output
{
"success": true,
"searchQueryState_url": "https://www.zillow.com/austin-tx/?searchQueryState=%7B%22pagination%22%3A%7B%22currentPage%22%3A1%7D%2C...",
"filterState_applied": {
"sortSelection": {"value": "globalrelevanceex"},
"price": {"min": 400000, "max": 750000},
"beds": {"min": 3, "max": null},
"baths": {"min": 2, "max": null},
"sqft": {"min": 1500, "max": null},
"built": {"min": 1990, "max": null},
"doz": {"value": "30"},
"isCondo": {"value": false},
"isMultiFamily": {"value": false},
"isApartment": {"value": false},
"isManufactured": {"value": false},
"isLotLand": {"value": false},
"isApartmentOrCondo": {"value": false},
"hasGarage": {"value": true},
"hasAirConditioning": {"value": true}
},
"regionSelection": [{"regionId": 10221, "regionType": 6}],
"totalResultCount": 258,
"currentPage": 1,
"totalPages": 7,
"resultsPerPage": 41,
"resultsCapped": false,
"listings": [
{
"zpid": "111969308",
"price": "$480,000",
"priceRaw": 480000,
"beds": 3,
"baths": 2,
"area": 1849,
"lotAreaValue": 7056.72,
"lotAreaUnit": "sqft",
"address": {
"full": "13933 Turkey Hollow Trl, Austin, TX 78717",
"streetAddress": "13933 Turkey Hollow Trl",
"city": "Austin",
"state": "TX",
"zipcode": "78717"
},
"latLong": {"latitude": 30.49, "longitude": -97.79543},
"propertyType": "SINGLE_FAMILY",
"homeStatus": "FOR_SALE",
"statusText": "Active",
"marketingStatus": "For Sale by Agent",
"listingSubType": {"is_FSBA": true},
"daysOnZillow": 3,
"zestimate": null,
"taxAssessedValue": 499783,
"hoa": null,
"monthlyPayment": null,
"imgSrc": "https://photos.zillowstatic.com/fp/671419edea6ca08359352874f3b8ad57-p_e.jpg",
"detailUrl": "https://www.zillow.com/homedetails/13933-Turkey-Hollow-Trl-Austin-TX-78717/111969308_zpid/"
}
]
}
Outcome shapes
// Filtered results found
{ "success": true, "totalResultCount": 258, "listings": [ /* up to resultsPerPage */ ], ... }
// Zero results for the applied filter (Zillow returns the response anyway with empty listResults)
{ "success": true, "totalResultCount": 0, "listings": [],
"zeroResultMessage": "No matching results — try widening your search radius or relaxing filters" }
// Region not resolved (404 on the constructed slug, or wrong region echoed)
{ "success": false, "reason": "region_not_resolved", "input": "South Bay Area",
"attemptedSlugs": ["/south-bay-area/", "/south-bay-area-ca/"] }
// Anti-bot wall (Fetch API also blocked)
{ "success": false, "reason": "perimeterx_block", "challenge": "press_and_hold",
"referenceId": "...", "fallback_attempted": "browser+homepage" }
// Results capped (totalResultCount > 500, Zillow won't paginate past the cap)
{ "success": true, "totalResultCount": 12450, "resultsCapped": true,
"advice": "narrow filters or use a smaller region", "listings": [ /* first ≤500 */ ] }