amazon.com

search-products

Summary

Search Amazon (any storefront TLD) for products matching a keyword, full search URL, ASIN list, or category-browse intent, with full filter-rail coverage (department, brand, rating, price, Prime, delivery speed, deals, condition, seller, apparel attrs, niche filters, sort, pagination). Returns structured per-product JSON with ASIN, title, brand, image, prices, rating, badges, and canonical /dp URL.

Amazon Product Search

Purpose

Given a search intent (a free-form keyword, a full Amazon search URL, an ASIN list, or a category-browse intent such as "Bestsellers in Coffee") plus any combination of Amazon's left-rail / top-bar filters, return a structured JSON page of products. Each item carries its ASIN, title, brand, primary + thumbnail image URLs, current price + currency, list price + discount %, star rating + review count, Prime eligibility, sponsored / Amazon's-Choice / bestseller / Climate-Pledge / coupon badges, ships-from / sold-by attribution, and the canonical /dp/{ASIN} URL. The page-level total result count (total_result_count) is captured as well, so callers know the returned slice is partial. Region-specific TLDs are honored (.com, .co.uk, .de, .ca, .co.jp, ...); each storefront uses its own filter / category ID space. The skill is read-only: it never clicks Add to Cart, Buy Now, Subscribe & Save, or Sign In. When the page presents a captcha or Akamai bot-wall, it captures a screenshot and emits the matching failure shape (captcha_wall or akamai_interstitial) rather than attempting to solve.

When to Use

  • "Find me {N} matching products on Amazon for {keyword}" with any filter set.
  • Periodic price / rating / stock-level monitoring of a specific ASIN cohort.
  • Bulk cross-category extraction (/zgbs/, /gp/bestsellers/, /gp/new-releases/).
  • Comparison-shopping agents that need a structured product feed instead of HTML.
  • Anywhere a caller would otherwise scrape /s?k=... by hand — this skill encapsulates the entire filter-encoding scheme + anti-bot wrapper.

Workflow

Amazon's /s search route is gated by Akamai Bot Manager (the bm_sz / ak_bmsc cookie family + bm-verify 5-second JS challenge interstitial). Browserless HTTP fetches — even through a residential proxy — fail closed in ~75% of query variants observed during iteration. The reliable path is a fully verified Browserbase session with --verified --proxies enabled, which executes the bot-challenge JS, sets the bm-cookies, and then re-requests the SERP cleanly.

Primary: scripted-browser path

  1. Create a Verified + residential-proxy session — both flags are mandatory.

    SID=$(browse cloud sessions create --keep-alive --verified --proxies | jq -r .id)
    export BROWSE_SESSION="$SID"
    
  2. Warm the bm-cookies before the targeted SERP by opening the homepage:

    browse open --remote "https://www.amazon.com/"
    browse wait load --remote
    browse wait timeout 1500 --remote
    

    This lets Akamai's JS run once, seeding ak_bmsc + bm_sz cookies in the session. Skipping the warm-up triples the captcha rate on the next request.

  3. Build the search URL. Pick the right storefront TLD by region (table below). Encode filters into rh= segments, sort via s=, and paginate via page=N. See "Filter encoding" in Site-Specific Gotchas for the full map.

    https://www.amazon.com/s
      ?k={URL-encoded keyword}
      &rh={filter-1},{filter-2},...      (URL-encoded comma is %2C)
      &s={sort enum}
      &page={N}
    

    Example — "wireless mechanical keyboard, 4+ stars, ≤$150, Prime-eligible, sorted by review rank":

    /s?k=wireless+mechanical+keyboard
       &rh=p_72%3A1248915011%2Cp_36%3A-15000%2Cp_85%3A2470955011
       &s=review-rank
    
  4. Open the SERP in the same session:

    browse open --remote "$URL"
    browse wait load --remote
    browse wait timeout 1500 --remote     # lazy-loaded badges / Prime icons / Climate Pledge render after `load`
    
  5. Detect the captcha wall before parsing. If the page title is "Robot Check" or the body contains <form action="/errors/validateCaptcha", stop — emit the captcha_wall shape (see Expected Output). Do not attempt to solve.

    TITLE=$(browse get title --remote)
    if [ "$TITLE" = "Robot Check" ] || browse get html body --remote | grep -q '/errors/validateCaptcha'; then
      # Capture a screenshot and emit the captcha_wall shape here, then stop.
      exit 1
    fi
    
  6. Read the rendered HTML and parse cards:

    browse get html body --remote > /tmp/serp.html
    
    • Card root: each <div data-component-type="s-search-result" data-asin="..."> is one product. A clean SERP renders 22 cards per page (typically 16 organic + 6 sponsored slots).

    • Total result header: 1-{X} of [over ]{Y,YYY} results — captures totalResultCount (the Y value, often "over 200,000").

    • Per-card selectors (use stable data-cy anchors — these are Amazon's internal test selectors, more stable than CSS classes):

      | Field | Selector / regex |
      |---|---|
      | asin | outer data-asin attribute |
      | title | h2 > a > span text inside data-cy="title-recipe" |
      | url | h2 > a href, normalized to https://www.amazon.com/dp/{ASIN} |
      | image | img.s-image src attribute |
      | price.formatted | .a-offscreen inside data-cy="price-recipe", value like "$39.99" |
      | price.raw | price.formatted with the currency symbol and thousands separators stripped, parsed as a number |
      | list_price | .a-price.a-text-price > .a-offscreen (strikethrough variant). When absent, no sale. |
      | discount_pct | round(100 × (1 - price/list_price)) when both present |
      | rating.stars | aria-label="X.Y out of 5 stars" inside data-cy="reviews-block"; extract the leading X.Y |
      | rating.review_count | adjacent aria-label="N,NNN ratings"; extract N,NNN and parse to integer |
      | prime | presence of i-aok-prime class OR aria-label="Prime" inside data-cy="delivery-recipe" |
      | sponsored | enclosing card contains AdHolder class OR s-sponsored-result data attr |
      | amazons_choice | text "Amazon's Choice" inside data-cy="s-pc-faceout-badge" |
      | bestseller | text "Best Seller" / "#1 Best Seller in ..." inside data-cy="s-pc-faceout-badge" |
      | climate_pledge | data-cy="certification-recipe" contains "Climate Pledge Friendly" |
      | deal_label | red-tagged .a-color-price.s-background-color-platinum text ("Limited time deal", "Lightning Deal", "Coupon: $5 off") |
      | ships_from / sold_by | inside data-cy="delivery-block" text content |

  7. Paginate. Amazon caps pagination somewhere between page=7 and page=20 depending on category, so plan on about 7 pages of organic results being reliably available. Detect the last page by the absence of the <a aria-label="Go to next page" anchor.

  8. Release the session:

    browse cloud sessions update "$SID" --status REQUEST_RELEASE
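
Steps 3 and 7 above can be sketched in Python as follows. This is a minimal illustration, not the skill's actual implementation: the helper names (price_segment, build_search_url, has_next_page) are hypothetical, and the filter IDs passed in are the canonical US values quoted in this document, which may differ per query or storefront.

```python
from urllib.parse import quote, quote_plus


def price_segment(min_usd=None, max_usd=None):
    """Encode a custom price range as a p_36 segment (values are in cents)."""
    lo = "" if min_usd is None else str(round(min_usd * 100))
    hi = "" if max_usd is None else str(round(max_usd * 100))
    return f"p_36:{lo}-{hi}"


def build_search_url(keyword, segments=(), sort=None, page=None,
                     storefront="amazon.com"):
    """Assemble a /s URL; segments are raw rh= entries like 'p_72:1248915011'."""
    url = f"https://www.{storefront}/s?k={quote_plus(keyword)}"
    if segments:
        # Join with a comma, then percent-encode ':' and ',' (%3A and %2C).
        url += "&rh=" + quote(",".join(segments), safe="")
    if sort:
        url += f"&s={sort}"
    if page:
        url += f"&page={page}"
    return url


def has_next_page(html):
    """Step 7: treat the next-page anchor's presence as 'more pages exist'.

    Only the start of the aria-label is matched, in case the rendered label
    carries extra text after 'Go to next page' (an assumption).
    """
    return 'aria-label="Go to next page' in html


url = build_search_url(
    "wireless mechanical keyboard",
    ["p_72:1248915011", price_segment(max_usd=150), "p_85:2470955011"],
    sort="review-rank",
    page=1,
)
print(url)
```

Keeping the rh= segments as raw strings until the final join means a per-query filter ID discovered from the filter rail can be swapped in without touching the encoding logic.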
    

Fast path for category-browse intents

For "Bestsellers in {category}" / "New Releases in {category}" / "Movers & Shakers in {category}" — skip /s entirely and use the curated browse roots. These are not Akamai-gated as aggressively as /s (verified 2026-05-18: GET /gp/bestsellers/ returned 644 KB of real HTML with data-asin markers via residential-proxy Fetch, no challenge interstitial).

https://www.amazon.com/gp/bestsellers/{category-slug}/        — top 100 bestsellers
https://www.amazon.com/gp/new-releases/{category-slug}/       — top 100 new releases
https://www.amazon.com/gp/movers-and-shakers/{category-slug}/ — biggest-gainers
https://www.amazon.com/zgbs/{category-slug}                   — alias for /gp/bestsellers

Slugs are visible in the /zgbs/ HTML: amazon-devices, appliances, arts-crafts, audible, automotive, baby-products, beauty, etc.

Fast path for ASIN lookup

When the caller already has ASINs, skip search and hit /dp/{ASIN} directly. Each /dp/{ASIN} HTML response is >1 MB, so use a Browserbase remote session — NOT cloud fetch (Fetch API is hard-capped at 1 MB response body and will 502).
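
Choosing between the two fast paths and the /s workflow could be sketched as below. The intent dictionary shape (keys asins, browse, category_slug) and the function name plan_requests are hypothetical, introduced only for illustration; the ASIN pattern assumes ASINs are 10 uppercase alphanumeric characters.

```python
import re

# Assumption: ASINs are 10 uppercase alphanumeric characters.
ASIN_RE = re.compile(r"^[A-Z0-9]{10}$")

# Browse roots listed in this document, keyed by a hypothetical intent name.
BROWSE_ROOTS = {
    "bestsellers": "/gp/bestsellers/{slug}/",
    "new-releases": "/gp/new-releases/{slug}/",
    "movers-and-shakers": "/gp/movers-and-shakers/{slug}/",
}


def plan_requests(intent, storefront="amazon.com"):
    """Return (path_kind, urls) for a caller intent.

    'search' comes back with an empty URL list: the caller is expected to
    fall through to the full /s workflow in that case.
    """
    base = f"https://www.{storefront}"
    if intent.get("asins"):
        bad = [a for a in intent["asins"] if not ASIN_RE.match(a)]
        if bad:
            raise ValueError(f"not ASIN-shaped: {bad}")
        return "asin_lookup", [f"{base}/dp/{a}" for a in intent["asins"]]
    if intent.get("browse") in BROWSE_ROOTS:
        path = BROWSE_ROOTS[intent["browse"]].format(slug=intent["category_slug"])
        return "category_browse", [base + path]
    return "search", []


print(plan_requests({"asins": ["B08XXXXX01", "B09XXXXX02"]}))
print(plan_requests({"browse": "bestsellers", "category_slug": "beauty"}))
```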

Static-HTML fallback (browserless)

When a Browserbase remote session is unavailable, browse cloud fetch <SERP-URL> --proxies can return server-side-rendered HTML for single-word, no-filter, no-pagination queries (verified: /s?k=test&ref=nb_sb_noss → 933 KB clean response). This path:

  • Returns 70-80% of the per-card fields. Static HTML does contain data-asin, h2 title, .s-image src, .a-offscreen price, .a-price-whole integer, and the rating + review-count aria-labels.
  • Does NOT contain the Prime icon (i-aok-prime), Amazon's Choice badge, Climate Pledge badge, bestseller badge, or the sort-dropdown anchor — these are JS-rendered post-load.
  • Fails closed (returns the 2 KB Akamai bm-verify interstitial HTML) when the query contains: gibberish strings, multi-segment rh= filters, i={department} shortcuts, or page>1 pagination. Detect by <meta http-equiv="refresh" + bm-verify= in the response body.
  • Is hard-capped at 1 MB response body (Browserbase Fetch API limit). Most legit SERPs are 900 KB – 1.4 MB; expect ~50% to overflow with a 502 error: "The response body exceeded the maximum allowed size of 1MB". Use the browser session for those.
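
The detection rules above (the meta-refresh + bm-verify interstitial, and the Robot Check markers from the primary workflow) could be applied to a fetched body roughly as follows. This is a sketch under the assumption that the markers appear literally as quoted in this document; classify_body is a hypothetical helper name.

```python
import re


def classify_body(html):
    """Return 'captcha_wall', 'akamai_interstitial', or 'ok' for a response body."""
    # Robot Check: distinctive <title> or the captcha form action.
    if (re.search(r"<title>\s*Robot Check\s*</title>", html, re.I)
            or "/errors/validateCaptcha" in html):
        return "captcha_wall"
    # Akamai interstitial: a meta refresh whose target carries bm-verify=.
    if re.search(r'<meta\s+http-equiv="refresh"', html, re.I) and "bm-verify=" in html:
        return "akamai_interstitial"
    return "ok"


sample = "<meta http-equiv=\"refresh\" content=\"5; URL='/s?k=x&bm-verify=AAA'\" />"
print(classify_body(sample))
```

Checking the captcha markers first keeps the two failure reasons distinct, since each maps to a different retry strategy.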

Site-Specific Gotchas

  • Akamai Bot Manager gates /s route aggressively. Symptoms: a 2-3 KB HTML body containing <meta http-equiv="refresh" content="5; URL='/s?...&bm-verify=...'" /> and an iframe src=https://m.media-amazon.com/images/S/sash/...gif instead of the SERP grid. Triggers observed during iteration: gibberish queries, multi-segment rh= filters, i={department} shortcuts, paginated requests, low-confidence query strings, and concatenated keywords (e.g. k=usbcable blocked while k=usb+cable passed). The only reliable bypass is a --verified Browserbase session with the homepage-warmed bm-cookies in place before the SERP request.

  • Captcha (Robot Check) page is a distinct outcome — DO NOT solve it. Marker: <title>Robot Check</title> + <form action="/errors/validateCaptcha". Triggered by sustained request rate from a flagged IP or a freshly minted session without a referer chain. Emit success: false, reason: "captcha_wall", ship as candidate outcome. Captcha-solving services are out of scope for read-only product extraction.

  • Filter IDs are NOT stable across queries / categories. The widely-documented "4 Stars & Up" filter is p_72:1248915011 in the canonical US storefront, but a /s?k=test fetch on 2026-05-18 returned p_72:3014475011 in its rendered filter rail — Amazon resolves IDs per-query / per-category-context, sometimes for A/B-test cohorts. The robust pattern is: (a) try the canonical IDs first; (b) if the resulting filter rail anchor href contains a different ID for the same logical filter, re-request with that ID; (c) cache the (query-prefix, category, filter-key) → ID triple per session, not globally.

  • Filter encoding scheme — verified rh= prefixes from a real SERP fetch (2026-05-18):

    | Prefix | Filter dimension | Notes |
    |---|---|---|
    | n:{id} | Department / category node | Canonical examples: 172282 Electronics, 283155 Books, 1055398 Home & Kitchen, 7141123011 Clothing, Shoes & Jewelry, 2619533011 Beauty. Combine sub-categories with comma: rh=n:172282%2Cn:172456 |
    | p_72:{id} | Customer rating threshold | Canonical: 1248915011 4★+, 1248914011 3★+, 1248913011 2★+, 1248912011 1★+. Per-query overrides happen; see "Filter IDs are NOT stable" above. |
    | p_36:-{maxCents} / p_36:{minCents}-{maxCents} / p_36:{minCents}- | Custom price range (cents) | p_36:-15000 = ≤$150. p_36:2500-5000 = $25–$50. p_36:20000- = $200+. |
    | p_n_price_fma:{id} | Preset price bucket | Eight buckets observed (10346812011–10346819011) corresponding to the displayed labels: $0 to $1, $1 to $3, $3 to $5, $5 to $10, $10 to $15, $15 to $20, Under $10, Over $20. IDs vary per-category. |
    | p_85:{id} | Prime eligibility | Canonical: 2470955011 (US). |
    | p_76:{id} | Free shipping / FBA | Per-storefront IDs. |
    | p_90:{id} | Seller: Amazon as seller | Canonical: 8308921011 (US). For third-party / specific seller, use emi= query param or the seller-id me= shortcut. |
    | p_n_deal_type:{id} | Deals | Three observed values: Today's Deals, Lightning Deals, Coupons. IDs in the 23566065011 family. |
    | p_n_condition-type:{id} | Condition | 6461716011 New, 6461717011 Used, 6461718011 Renewed/Refurbished (US). |
    | p_n_date:{id} | New arrivals | Last 30 / 90 days etc. |
    | p_n_availability:{id} | Include out-of-stock | Toggle for showing OOS items. |
    | p_n_feature_browse-bin:{id} | Apparel: Color, Size, Material, Pattern, Fit | Highly category-specific; discover from filter rail. |
    | p_n_climate_pledge_friendly:{id} | Climate Pledge Friendly | Niche filter. |
    | p_n_subscribe_save_eligibility:{id} | Subscribe & Save | Niche filter. |
    | p_n_small_business:{id} | Small Business | Niche filter. |

    Multiple filters are joined with a comma, which is URL-encoded as %2C inside rh=. Repeating the same key forms a logical OR; different keys form a logical AND.

  • Sort enum (s= query param):

    | Value | Display label |
    |---|---|
    | relevanceblender | Featured (default) |
    | price-asc-rank | Price: Low to High |
    | price-desc-rank | Price: High to Low |
    | review-rank | Avg. Customer Review |
    | date-desc-rank | Newest Arrivals |
    | exact-aware-popularity-rank | Best Sellers |

  • Pagination URL shape: ?page=N&xpid={experiment-id}&qid={timestamp}&ref=sr_pg_N. xpid and qid are surfaced by Amazon's renderer but are not required for a correct response — pass only page=N for clean re-requests. Pages 2+ are heavily Akamai-gated; a browser session warmed by an earlier page-1 fetch is the reliable path.

  • 22 cards per page, not 24. The "default 24 results" mentioned in some references is the target count, but the rendered SERP typically shows 22 cards in total, with sponsored placements (AdHolder divs) interleaved among the organic ones. Always derive results_returned from actual card count, not from a fixed assumption.

  • Sponsored cards repeat the same ASIN as organic. When parsing, the same data-asin value sometimes appears in two consecutive <div>s — once with AdHolder (sponsored placement) and once organically lower on the page. Dedupe by (asin, sponsored) tuple, or surface both with position and sponsored flags so the caller can decide.

  • Storefront TLD maps to a separate ID space. Filter IDs (p_72, p_85, etc.) and category nodes (n:...) are NOT portable across .com / .co.uk / .de / .ca / .co.jp. Each storefront uses its own integer ranges. The skill must re-discover IDs per locale via the rendered filter rail. UK uses .co.uk, Germany/Austria/Switzerland use .de, Canada uses .ca, Japan uses .co.jp, Australia uses .com.au, Mexico uses .com.mx, Brazil uses .com.br, India uses .in.

  • application/ld+json is NOT embedded on SERPs. Verified 2026-05-18: zero <script type="application/ld+json"> blocks in a clean /s?k=test response. Product LD-JSON appears only on /dp/{ASIN} detail pages. Do not waste time looking for structured data on /s.

  • JS-rendered badges are missing from browse cloud fetch output. The Browserbase Fetch API path returns server-side-rendered HTML before Amazon's lazy-loaded widgets populate. Specifically absent: i-aok-prime icons, "Amazon's Choice" / Bestseller / Climate Pledge badges, and the sort-dropdown anchor. To extract these, drive a full browser session (which executes the JS) and snapshot after wait timeout 1500.

  • 1 MB response cap on browse cloud fetch. For any response payload ≥ 1 MB, the Fetch API returns 502 with the message "The response body exceeded the maximum allowed size of 1MB. Use a browser session to handle large responses." Observed: /dp/{ASIN} pages and most full-category SERPs exceed this. /gp/bestsellers/ (~644 KB), /robots.txt (2 KB), and /s?k=test&ref=nb_sb_noss (933 KB) fit comfortably. Use a browser session as primary; reach for cloud fetch only as a fallback for known-small endpoints.

  • DO NOT click data-cy="add-to-cart", the "Buy Now" button, "Subscribe & Save", "Sign In", "Try Prime", or any other mutation control. Read-only is the contract. Stop at the SERP — never navigate into the product detail page's purchase flow.

  • Region routing on bare amazon.com. When the request IP is in a non-US country, Amazon may redirect to the local storefront mid-request (302 to /). Lock the locale by including the storefront TLD in the URL and setting &language=en_US if the response would otherwise localize text.

  • browse cloud fetch --proxies is geo-locked by Browserbase residential proxy egress. During iteration, the egress IP was US-west; results matched US storefront pricing. For non-US storefronts, the proxy region matters — Browserbase doesn't currently expose a per-storefront proxy region flag, so non-US localization may be inconsistent.

  • xpid / qid / ref=sr_* / dib=... tracking params are noise. Strip from extracted product URLs before emitting — the canonical clean form is https://www.amazon.com/dp/{ASIN} (or /{locale-slug}/dp/{ASIN} if locale is meaningful).

  • Volatility: Amazon changes selectors quarterly. The data-cy test-selector layer has been stable since early 2024 but is not contractual. If data-cy="title-recipe" returns empty, fall back to h2 a span text. Maintain a per-selector fallback chain.
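
Two of the gotchas above, stripping tracking parameters from product URLs and handling an ASIN that appears both as a sponsored and an organic card, can be sketched together. The card dictionaries and helper names here are hypothetical; the sketch assumes each parsed card already carries asin and sponsored keys, and that product hrefs contain a /dp/{ASIN} path segment.

```python
import re
from urllib.parse import urlsplit


def clean_product_url(href, storefront="amazon.com"):
    """Reduce any product href to https://www.<storefront>/dp/<ASIN>.

    Query-string noise (xpid, qid, dib) and path suffixes such as /ref=sr_*
    are dropped. Returns None when no /dp/<ASIN> segment is found.
    """
    match = re.search(r"/dp/([A-Z0-9]{10})", urlsplit(href).path)
    if not match:
        return None
    return f"https://www.{storefront}/dp/{match.group(1)}"


def dedupe_cards(cards):
    """Keep the first card per (asin, sponsored) pair, preserving page order.

    'position' records the card's 1-based slot on the page before deduping,
    so callers can still tell where each surviving card was rendered.
    """
    seen = set()
    kept = []
    for position, card in enumerate(cards, start=1):
        key = (card["asin"], card["sponsored"])
        if key in seen:
            continue
        seen.add(key)
        kept.append({**card, "position": position})
    return kept


href = "/Some-Product-Title/dp/B0XXXXXXXX/ref=sr_1_3?dib=abc&qid=1716000000&xpid=xyz"
print(clean_product_url(href))
```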

Expected Output

Successful result with full filter set applied:

{
  "success": true,
  "storefront": "amazon.com",
  "query": "wireless mechanical keyboard",
  "filters_applied": {
    "min_rating": 4,
    "price_max_usd": 150,
    "prime_only": true,
    "sort": "review-rank",
    "department": null,
    "brand": null,
    "condition": null
  },
  "url_used": "https://www.amazon.com/s?k=wireless+mechanical+keyboard&rh=p_72%3A1248915011%2Cp_36%3A-15000%2Cp_85%3A2470955011&s=review-rank&page=1",
  "total_result_count": 4000,
  "total_result_count_is_approximate": true,
  "results_returned": 22,
  "page": 1,
  "has_next_page": true,
  "products": [
    {
      "position": 1,
      "asin": "B0XXXXXXXX",
      "title": "Keychron K8 Pro QMK/VIA Wireless Mechanical Keyboard",
      "brand": "Keychron",
      "image": "https://m.media-amazon.com/images/I/81xxxxxxxxL._AC_UY218_.jpg",
      "thumbnails": [
        "https://m.media-amazon.com/images/I/71xxxxxxxxL._AC_UY218_.jpg"
      ],
      "price": { "formatted": "$129.99", "raw": 129.99, "currency": "USD" },
      "list_price": { "formatted": "$149.99", "raw": 149.99 },
      "discount_pct": 13,
      "rating": { "stars": 4.5, "review_count": 1820 },
      "prime": true,
      "sponsored": false,
      "amazons_choice": false,
      "bestseller": false,
      "climate_pledge": false,
      "deal_label": null,
      "ships_from": "Amazon.com",
      "sold_by": "Keychron Official",
      "url": "https://www.amazon.com/dp/B0XXXXXXXX"
    }
  ]
}
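
The numeric fields in the result above (price.raw, discount_pct, rating, total_result_count) are derived from display strings. A minimal sketch of those derivations, assuming US-style formatting ("." as decimal separator, "," as thousands separator) and hypothetical helper names:

```python
import re


def parse_price(formatted):
    """'$1,299.99' -> 1299.99 (assumes '.' is the decimal separator)."""
    return float(re.sub(r"[^\d.]", "", formatted))


def discount_pct(price, list_price):
    """Whole-number percentage saved relative to the list price."""
    return round(100 * (1 - price / list_price))


def parse_stars(aria_label):
    """'4.5 out of 5 stars' -> 4.5"""
    return float(aria_label.split()[0])


def parse_review_count(aria_label):
    """'1,820 ratings' -> 1820"""
    return int(re.sub(r"\D", "", aria_label))


def parse_total_results(header):
    """'1-16 of over 4,000 results' -> (4000, True); None if no match.

    The boolean marks the count as approximate when 'over' is present.
    """
    match = re.search(r"of\s+(over\s+)?([\d,]+)\s+results", header)
    if not match:
        return None
    return int(match.group(2).replace(",", "")), match.group(1) is not None


price = parse_price("$129.99")
list_price = parse_price("$149.99")
print(price, list_price, discount_pct(price, list_price))
```

With the example values, 1 - 129.99/149.99 is about 0.133, which rounds to the 13 shown in the sample output.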

Zero-result outcome (valid, not a failure):

{
  "success": true,
  "storefront": "amazon.com",
  "query": "asdfkjasdljaksdljasldkjasl",
  "total_result_count": 0,
  "results_returned": 0,
  "page": 1,
  "products": []
}

Captcha / Robot Check wall (ship as candidate, do NOT attempt to solve):

{
  "success": false,
  "reason": "captcha_wall",
  "storefront": "amazon.com",
  "query": "wireless mechanical keyboard",
  "url_attempted": "https://www.amazon.com/s?k=wireless+mechanical+keyboard&...",
  "page_title": "Robot Check",
  "screenshot_path": "screenshots/04-robot-check.png",
  "trigger_hint": "Sustained request rate from this residential-proxy IP or missing bm-verify session cookie. Retry with a fresh session + homepage warm-up."
}

Akamai bm-verify interstitial (HTML shape: 5-second meta-refresh + iframe + _sec/verify POST script):

{
  "success": false,
  "reason": "akamai_interstitial",
  "storefront": "amazon.com",
  "query": "...",
  "url_attempted": "...",
  "response_size_bytes": 2310,
  "screenshot_path": "screenshots/02-akamai-bot-wall.png",
  "trigger_hint": "Static-HTML Fetch path triggered Akamai. Retry via a verified Browserbase remote session with proxies + homepage warm-up."
}

ASIN-direct lookup (used when caller passes ASINs instead of a query):

{
  "success": true,
  "storefront": "amazon.com",
  "asins_requested": ["B08XXXXX01", "B09XXXXX02"],
  "results_returned": 2,
  "products": [
    {
      "asin": "B08XXXXX01",
      "title": "...",
      "url": "https://www.amazon.com/dp/B08XXXXX01",
      "price": { "formatted": "$24.99", "raw": 24.99, "currency": "USD" },
      "rating": { "stars": 4.6, "review_count": 12504 },
      "prime": true,
      "ships_from": "Amazon.com",
      "sold_by": "Acme Inc."
    }
  ]
}

Ambiguous query (Amazon's "Did you mean" / spell-correction interstitial):

{
  "success": true,
  "storefront": "amazon.com",
  "query": "wirless mechanikal keybord",
  "spell_correction_suggested": "wireless mechanical keyboard",
  "spell_correction_applied": true,
  "url_used": "https://www.amazon.com/s?k=wireless+mechanical+keyboard&...",
  "total_result_count": 4000,
  "results_returned": 22,
  "products": [ /* ... */ ]
}