amazon.in

browse-products

Summary

Search Amazon.in by keyword and return structured product listings — ASIN, title, current price (INR), MRP, rating, review count, canonical URL, sponsored flag, and free-delivery flag — from the first search-results page.

Amazon India — Browse Products

Purpose

Search Amazon.in (the India marketplace) for products by keyword and return the first-page search-result listings as structured records — ASIN, title, current price (INR), MRP/list price, star rating, review count, canonical product URL, primary thumbnail image, sponsored flag, and free-delivery flag. Read-only; never adds to cart, never checks out, never signs in.

When to Use

  • Capturing a price/rating snapshot for a query (price tracking, competitive intel).
  • Building a comparison table across queries (e.g. "wireless earbuds under 2000", "python programming book").
  • Feeding ASIN + canonical URL into a downstream product-detail crawler.
  • Anywhere you'd otherwise scrape Amazon.in HTML by hand — a deep-link search URL plus a single document.querySelectorAll pass beats clicking through the search form.

Workflow

There is no public Amazon Product API available without a Seller Central / Product Advertising API approval and access keys. The deep-link search URL (https://www.amazon.in/s?k=<query>) is the reliable shortcut: it is an unauthenticated GET, accepts a small set of well-known query parameters (page, s for sort, rh for refinements), and renders fully server-side, so every result card is in the initial HTML and no scroll/XHR pagination is required. The extraction below runs in one document.querySelectorAll pass. Do not drive the search form via the homepage #twotabsearchtextbox: the homepage path is heavier (~25s wall time, multiple A/B-test variants of the search box), while the deep link takes ~3s and lands on the same results page.

A Browserbase session with --verified --proxies (Indian residential proxy) is required when the outbound IP is outside India — amazon.in serves a reduced/redirect homepage to non-Indian IPs and a CSRF challenge on the search endpoint. With verified+proxies enabled, no captcha or login wall was observed across 4 distinct queries (commodity electronics, books, ascending-price sort, page 2 pagination).

  1. Create a stealth Browserbase session.

    sid=$(browse cloud sessions create --keep-alive --proxies --verified \
      | node -e "let s='';process.stdin.on('data',c=>s+=c).on('end',()=>process.stdout.write(JSON.parse(s).id))")
    export BROWSE_SESSION="$sid"
    
  2. Deep-link to the search URL. Skip the homepage entirely.

    QUERY="wireless earbuds under 2000"
    ENC=$(node -e "process.stdout.write(encodeURIComponent(process.argv[1]).replace(/%20/g,'+'))" "$QUERY")
    browse open "https://www.amazon.in/s?k=$ENC" --remote --session "$sid"
    browse wait timeout 3000 --remote --session "$sid"
    

    Optional query params:

    • page=N — pagination (1-indexed). 22 raw cards/page, ~19 unique after dedup.
    • s=price-asc-rank (low→high), s=price-desc-rank (high→low), s=review-rank (avg rating), s=date-desc-rank (newest), s=relevanceblender (default).
    • rh=p_36:50000-200000 — price range in paise (×100, so 50000-200000 = ₹500–₹2000). Combine with &dc for "department-confined".
    • i=stripbooks / i=electronics / i=fashion — department refinement (the i= value matches the slug shown in Amazon's left-nav department links).
  3. Extract product cards via one DOM pass. Pipe this script through browse eval (the IIFE pattern below returns a JSON string that browse eval will surface in its result field):

    (() => {
      const ORIGIN = 'https://www.amazon.in';
      const decodeAsin = href => {
        if (!href) return null;
        if (href.startsWith('/sspa/click')) {
          try {
            const dest = new URL(href, ORIGIN).searchParams.get('url');
            if (dest) {
              const m = decodeURIComponent(dest).match(/\/dp\/([A-Z0-9]{10})/);
              if (m) return m[1];
            }
          } catch (e) {}
        }
        const m = href.match(/\/dp\/([A-Z0-9]{10})/);
        return m ? m[1] : null;
      };
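      // Keep only the integer rupee part: "₹1,999." -> 1999, "₹7,990.00" -> 7990.
      // (Stripping the dots instead would turn "7,990.00" into 799000.)
      const parsePrice = t => {
        if (!t) return null;
        const d = t.replace(/[^\d.]/g, '').split('.')[0];
        return d ? parseInt(d, 10) : null;
      };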
      const parseReviewCount = t => {
        if (!t) return null;
        const c = t.replace(/[(),]/g, '').trim();
        const m = c.match(/^([\d.]+)\s*([KMkm]?)$/);
        if (!m) return parseInt(c.replace(/\D/g, ''), 10) || null;
        const n = parseFloat(m[1]), s = m[2].toLowerCase();
        if (s === 'k') return Math.round(n * 1000);
        if (s === 'm') return Math.round(n * 1e6);
        return Math.round(n);
      };
      const parseRating = t => {
        if (!t) return null;
        const m = t.match(/^([\d.]+)/);
        return m ? parseFloat(m[1]) : null;
      };
      const items = document.querySelectorAll('[data-component-type="s-search-result"]');
      const out = [], seen = new Set();
      items.forEach(el => {
        const rawAsin = el.getAttribute('data-asin');
        if (!rawAsin) return;
        const linkEl = el.querySelector('h2 a, a.s-line-clamp-2, a.s-no-outline');
        const href = linkEl?.getAttribute('href');
        const asin = decodeAsin(href) || rawAsin;
        if (seen.has(asin)) return;
        seen.add(asin);
        const titleEl = el.querySelector('h2 span');
        const priceWhole = el.querySelector('.a-price:not(.a-text-price) .a-price-whole');
        const priceSym = el.querySelector('.a-price:not(.a-text-price) .a-price-symbol');
        const priceOff = el.querySelector('.a-price.a-text-price .a-offscreen');
        const ratingAria = el.querySelector('[aria-label*="out of"]');
        const ratingAlt = el.querySelector('.a-icon-alt');
        const reviewsEl = el.querySelector('a span.s-underline-text, a[aria-label*="ratings"] span');
        const sponsoredEl = el.querySelector('[aria-label="Sponsored"], .puis-sponsored-label-text, .s-sponsored-label-text');
        const imageEl = el.querySelector('img.s-image');
        const ratingText = ratingAria?.getAttribute('aria-label') || ratingAlt?.textContent || null;
        out.push({
          asin,
          title: titleEl?.textContent.trim() || null,
          price_inr: parsePrice((priceSym?.textContent || '') + (priceWhole?.textContent || '')),
          mrp_inr: parsePrice(priceOff?.textContent),
          rating: parseRating(ratingText),
          review_count: parseReviewCount(reviewsEl?.textContent),
          url: `${ORIGIN}/dp/${asin}`,
          image: imageEl?.getAttribute('src') || null,
          sponsored: !!sponsoredEl,
          free_delivery: (el.textContent || '').includes('FREE delivery'),
        });
      });
      return JSON.stringify({ query: location.search, total: items.length, items: out });
    })()
    
  4. Return the top N entries of items to the caller. Amazon's ranker places sponsored items at the top, and sponsored cards re-appear inline mid-page, so apply the result_count limit only after deduping by canonical ASIN (the extractor already does this).

  5. Paginate by re-navigating with &page=N and re-running the extractor — there is no client-side incremental loading; each page is a fresh server-rendered HTML document with ~22 raw cards.

  6. Release the session.

    browse cloud sessions update "$sid" --status REQUEST_RELEASE
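
The optional query parameters from step 2 and the pagination in step 5 can be combined in a small URL builder. This is a minimal sketch, not part of the skill: buildSearchUrl and its option names are hypothetical, and it assumes amazon.in accepts the percent-encoded colon (%3A) that URLSearchParams emits in the rh value.

```javascript
// Hypothetical helper: builds an amazon.in deep-link search URL.
// Option names (minInr, maxInr, department, ...) are illustrative assumptions.
function buildSearchUrl({ query, page, sort, minInr, maxInr, department } = {}) {
  if (!query) throw new Error('query is required');
  const params = new URLSearchParams();
  params.set('k', query);                      // spaces are serialized as '+'
  if (department) params.set('i', department); // e.g. 'stripbooks', 'electronics'
  if (sort) params.set('s', sort);             // e.g. 'price-asc-rank'
  if (minInr != null && maxInr != null) {
    // p_36 expects paise, so rupees are multiplied by 100.
    const lo = Math.round(minInr * 100);
    const hi = Math.round(maxInr * 100);
    params.set('rh', `p_36:${lo}-${hi}`);
  }
  if (page && page > 1) params.set('page', String(page)); // pages are 1-indexed
  return `https://www.amazon.in/s?${params.toString()}`;
}

console.log(buildSearchUrl({ query: 'wireless earbuds under 2000' }));
// https://www.amazon.in/s?k=wireless+earbuds+under+2000
console.log(buildSearchUrl({ query: 'earbuds', sort: 'price-asc-rank', minInr: 500, maxInr: 2000, page: 2 }));
// https://www.amazon.in/s?k=earbuds&s=price-asc-rank&rh=p_36%3A50000-200000&page=2
```

To paginate, call the builder with page set to 2, 3, and so on, open each URL, and re-run the extractor on each page.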
    

Site-Specific Gotchas

  • Geo-IP filtering: amazon.in serves a degraded homepage to non-Indian IPs and rejects their search requests outright, with no captcha challenge offered. Always use --proxies --verified when running from a US/EU sandbox. With Indian residential proxies, no captcha or login wall was encountered across electronics, books, and price-sorted queries.
  • Prime badge is effectively absent from amazon.in search cards — across 4 queries in 2026, zero matches for i.a-icon-prime, .s-prime, or [aria-label*="Prime"]. The Indian site advertises delivery on the result card via the literal string "FREE delivery" instead. Do not try to extract a Prime boolean; capture free_delivery: el.textContent.includes('FREE delivery') instead.
  • Sponsored cards duplicate canonical ASINs. A sponsored card and the organic card for the same product both have data-asin="B0XXX..." on the wrapper, but the sponsored card's <a href> is a tracker redirect /sspa/click?...&url=%2F...%2Fdp%2FB0XXX%2F... while the organic card's href is the direct /dp/B0XXX/ref=sr_1_N. Decode the url query-param of the sspa redirect to extract the canonical ASIN, then dedupe by ASIN — otherwise you ship ~22 items where 2–4 are duplicates.
  • ASIN format varies by category. Electronics use 10-char B0[A-Z0-9]{8}. Books use 10-digit ISBNs (often 1XXXXXXXXX or 9XXXXXXXXX). The regex \/dp\/([A-Z0-9]{10}) covers both; don't constrain to B0\w{8}.
  • Price comes in two DOM nodes: the currency symbol and the integer whole part. Query them separately as .a-price .a-price-symbol and .a-price .a-price-whole. The integer is rendered without thousand-separator commas in the text node (CSS-injected via :before/:after for display only), so text.replace(/[^\d]/g,'') parses correctly. Beware of the .a-text-price sibling: that is the strike-through MRP/list price, exposed via .a-offscreen. Always filter the current price selector with :not(.a-text-price).
  • s= sort parameter values are kebab-rank suffixes: price-asc-rank, price-desc-rank, review-rank, date-desc-rank (recent), exact-aware-popularity-rank, relevanceblender (default). Other values silently fall back to relevance.
  • Price-filter param rh=p_36:lo-hi is in paise (₹ × 100). rh=p_36:50000-200000 does not mean ₹50,000 to ₹2,00,000; it selects ₹500–₹2000, because the prefix p_36 is the price-range refinement and its values are in paise (50000 paise = ₹500). Empirically confirmed: sort by price-asc-rank after applying this filter and the lowest result is ≥₹500.
  • Rating is in aria-label of a popover trigger, not in the .a-icon-alt of the star icon child. Selector [aria-label*="out of"] reliably hits the trigger <a aria-label="4.2 out of 5 stars, rating details">. The fallback .a-icon-alt text "4.2 out of 5 stars" works for non-sponsored cards but is missing/empty on some sponsored placements; prefer the aria-label.
  • Review count uses K/M abbreviation in parentheses — e.g. (13.4K) → 13,400, (1.5K) → 1,500. The raw integer is not in the DOM text; only the abbreviated form. Parse the suffix.
  • Don't waste time on the homepage search form. Driving #twotabsearchtextbox + pressing Enter works but adds ~20s and one extra page load with no benefit. Deep-link https://www.amazon.in/s?k=<encoded> is the canonical path.
  • No JSON API. The undocumented /s?k=<q>&format=json endpoint returns HTML, not JSON. The official Product Advertising API (webservices.amazon.in/paapi5/searchitems) requires a Seller Central account and access-key signing — not available for general scraping. Confirmed dead-end during iteration; do not waste turns probing.
  • net::ERR_ABORTED failures are benign. The browser-trace summary shows 7–64 failed requests per page navigation, all net::ERR_ABORTED on Scripts/XHRs cancelled by subsequent navigations or prefetch teardown. None affect the rendered search-result HTML. Do not treat these as anti-bot blocks.
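
The sponsored-card gotcha above can be exercised outside the browser. The following is a small worked example under stated assumptions: the hrefs and ASINs are synthetic, and canonicalAsin and dedupeByAsin are hypothetical names that mirror the logic of the extractor's decodeAsin rather than replace it.

```javascript
// Synthetic data for illustration only; these ASINs and hrefs are made up.
const ORIGIN = 'https://www.amazon.in';
const DP_ASIN = /\/dp\/([A-Z0-9]{10})/; // accepts both B0... ASINs and 10-digit ISBNs

function canonicalAsin(href, wrapperAsin) {
  if (href && href.startsWith('/sspa/click')) {
    // searchParams.get() percent-decodes the url parameter for us.
    const dest = new URL(href, ORIGIN).searchParams.get('url') || '';
    const m = dest.match(DP_ASIN);
    if (m) return m[1];
  }
  const m = href ? href.match(DP_ASIN) : null;
  return m ? m[1] : wrapperAsin; // fall back to the wrapper's data-asin
}

function dedupeByAsin(cards) {
  const seen = new Set();
  return cards.filter(card => {
    const asin = canonicalAsin(card.href, card.dataAsin);
    if (seen.has(asin)) return false;
    seen.add(asin);
    return true;
  });
}

const cards = [
  // Sponsored card: tracker redirect wrapping the product path.
  { dataAsin: 'B0EXAMPLE1', href: '/sspa/click?ie=UTF8&url=%2FSome-Product%2Fdp%2FB0EXAMPLE1%2Fref%3Dsr_1_1_sspa' },
  // Organic card for the same product: direct /dp/ link.
  { dataAsin: 'B0EXAMPLE1', href: '/Some-Product/dp/B0EXAMPLE1/ref=sr_1_3' },
  // Book with a digits-only identifier.
  { dataAsin: '1234567890', href: '/Some-Book/dp/1234567890/ref=sr_1_4' },
];

const unique = dedupeByAsin(cards);
console.log(unique.length); // 2
```

The first card wins when duplicates collide, so a product that appears both as a sponsored and an organic card keeps whichever one is earlier on the page.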

Expected Output

{
  "success": true,
  "search_query": "wireless earbuds under 2000",
  "search_url": "https://www.amazon.in/s?k=wireless+earbuds+under+2000",
  "total_cards_on_page": 22,
  "result_count": 10,
  "products": [
    {
      "asin": "B0FMDL81GS",
      "title": "OnePlus Nord Buds 3r TWS Earbuds up to 54 Hours Playback, 2-mic Clear Calls, 3D Spatial Audio, AI Translation, 12.4mm Drivers, Dual-Device Connectivity, 47ms Low Latency - Ash Black",
      "price_inr": 1999,
      "mrp_inr": null,
      "rating": 4.3,
      "review_count": 44300,
      "url": "https://www.amazon.in/dp/B0FMDL81GS",
      "image": "https://m.media-amazon.com/images/I/51nBTTG3hNL._AC_UY218_.jpg",
      "sponsored": false,
      "free_delivery": true
    },
    {
      "asin": "B0BW8TXJJ2",
      "title": "Boat Nirvana Ion, 120HRS Battery, ...",
      "price_inr": 1699,
      "mrp_inr": 7990,
      "rating": 4.1,
      "review_count": 13400,
      "url": "https://www.amazon.in/dp/B0BW8TXJJ2",
      "image": "https://m.media-amazon.com/images/I/81-TGXuOMAL._AC_UY218_.jpg",
      "sponsored": true,
      "free_delivery": true
    }
  ],
  "error_reasoning": null
}
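
The extractor in step 3 returns { query, total, items }, while the record above uses different field names. One way to bridge the two is sketched below; toSkillOutput and its parameters are hypothetical, and the default limit of 10 is only taken from the result_count in the example.

```javascript
// Hypothetical adapter from the extractor's JSON string to the output record.
// Field names on the returned object follow the Expected Output example.
function toSkillOutput(extractorJson, searchQuery, searchUrl, limit = 10) {
  const { total, items } = JSON.parse(extractorJson);
  // items are already deduped by canonical ASIN, so the limit is applied last.
  const products = items.slice(0, limit);
  return {
    success: true,
    search_query: searchQuery,
    search_url: searchUrl,
    total_cards_on_page: total, // raw card count before dedup
    result_count: products.length,
    products,
    error_reasoning: null,
  };
}

// Usage with a tiny synthetic extractor result (ASINs are made up).
const sample = JSON.stringify({
  query: '?k=earbuds',
  total: 3,
  items: [{ asin: 'B0EXAMPLE1' }, { asin: 'B0EXAMPLE2' }],
});
const record = toSkillOutput(sample, 'earbuds', 'https://www.amazon.in/s?k=earbuds', 1);
console.log(record.result_count, record.total_cards_on_page); // 1 3
```

An empty items array flows through unchanged, which matches treating an empty result page as a success with no products.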

Outcome shapes observed during 1-iter convergence across 4 queries:

  • results_ok: success is true and products[] is populated. The common path; see the example above.
  • results_ok_books — same shape; asin is a 10-digit ISBN (1636512933, 9367257651), mrp_inr is usually present (MRP is mandatory on books), image and rating populated. No structural difference — just be aware that the ASIN regex must accept digits-only IDs.
  • results_empty: total_cards_on_page is 0 and products is []. Returned for nonsense queries (?k=qwerasdf123); Amazon renders a "No results" banner. Treat as success with empty array, not as failure.
  • geo_blocked — when the session is run without --proxies --verified from a non-IN IP, the homepage redirects to a thin landing with no search-bar accessibility refs and /s?k=… returns a "Sorry, we couldn't find that page" body. total_cards_on_page: 0. Set success: false, error_reasoning: "Geo-blocked: amazon.in requires an Indian IP. Re-run session with --proxies --verified.".
  • captcha_wall — not observed under --verified --proxies during this iteration, but documented for completeness: Amazon's "Enter the characters you see" page renders #captchacharacters input. If present, abort with success: false, error_reasoning: "Captcha wall — session fingerprint flagged; rotate proxy + retry." Do not attempt to solve.
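
The outcome shapes above can be told apart from a few page-level signals. The sketch below is one possible classifier, not the skill's own logic: classifyOutcome and the signal names are hypothetical, the signals are assumed to be collected in the page first (for example hasCaptchaInput from a lookup of #captchacharacters), and the error strings paraphrase the ones listed above.

```javascript
// Hypothetical classifier over signals gathered from the loaded page.
function classifyOutcome({ hasCaptchaInput = false, bodyText = '', totalCards = 0 } = {}) {
  if (hasCaptchaInput) {
    return {
      outcome: 'captcha_wall',
      success: false,
      error_reasoning: 'Captcha wall: session fingerprint flagged; rotate proxy + retry.',
    };
  }
  // The "." tolerates either a straight or a curly apostrophe in "couldn't".
  if (totalCards === 0 && /Sorry, we couldn.t find that page/.test(bodyText)) {
    return {
      outcome: 'geo_blocked',
      success: false,
      error_reasoning: 'Geo-blocked: amazon.in requires an Indian IP. Re-run session with --proxies --verified.',
    };
  }
  if (totalCards === 0) {
    // Nonsense query: a success with an empty product list.
    return { outcome: 'results_empty', success: true, error_reasoning: null };
  }
  return { outcome: 'results_ok', success: true, error_reasoning: null };
}

console.log(classifyOutcome({ totalCards: 22 }).outcome); // results_ok
console.log(classifyOutcome({ totalCards: 0, bodyText: 'No results for qwerasdf123' }).outcome); // results_empty
```

The captcha check runs first because a captcha page also has zero result cards and would otherwise be misread as an empty result.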