Hugging Face — Browse Latest Models
Purpose
Return the most recently created models on Hugging Face — for each model: id (owner/name), author, createdAt, lastModified, downloads, likes, tags, pipeline_tag, library_name, gated flag, and canonical model URL. Optionally narrow by pipeline task (e.g. text-generation, text-to-image), library (transformers, diffusers, gguf, ...), author/org, or a free-text search. Read-only — never creates, edits, or downloads model artifacts.
When to Use
- "What are the newest models on Hugging Face right now?"
- Hourly / daily polling for newly-uploaded models matching a task or library.
- Watching a specific org (e.g. `meta-llama`, `google`, `stabilityai`) for new releases.
- Discovering new fine-tunes of a base model (combine with `search=<base>` or `filter=<task>`).
- Any flow that would otherwise scrape `huggingface.co/models?sort=created`: the JSON API is faster, cheaper, paginated cleanly, and returns richer per-model metadata.
Workflow
Hugging Face exposes a fully public, unauthenticated JSON API at https://huggingface.co/api/models. No cookies, no anti-bot, no residential proxy required. Rate limit is 500 requests / 5-minute fixed window on the api scope (advertised via Ratelimit-Policy and Ratelimit response headers). robots.txt is Allow: / for all user-agents. Lead with the API; the browser path works as a fallback but is ~50× slower because the listing page is fully JS-rendered.
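As a minimal sketch of calling the endpoint described above with only the Python standard library: `build_models_url` and `fetch_models` are hypothetical helper names, not part of any Hugging Face client, and the sketch assumes the default `urllib` user-agent is accepted by the server.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://huggingface.co/api/models"


def build_models_url(limit=50, **filters):
    """Build a newest-first listing URL; filters whose value is None are skipped."""
    params = {"sort": "createdAt", "direction": -1, "limit": limit}
    params.update({key: value for key, value in filters.items() if value is not None})
    return API_URL + "?" + urllib.parse.urlencode(params)


def fetch_models(url, timeout=30):
    """GET one page; return (list of model dicts, response headers)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp), resp.headers
```

A call such as `fetch_models(build_models_url(filter="text-generation"))` would then return one page of results plus the headers needed for pagination and rate-limit handling.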
- Fetch the most recent models (default, no filters):

  `GET https://huggingface.co/api/models?sort=createdAt&direction=-1&limit=50`

  Returns a JSON array of model objects. `direction=-1` is descending (newest first); pair it with `sort=createdAt` to order by upload time. `limit` is the page size; the observed maximum is 1000 per request, so paginate via the `Link` header for more.

- Optional query parameters (combine freely):

  | Param | Effect | Example |
  | --- | --- | --- |
  | `sort` | Sort field. Valid: `createdAt`, `lastModified`, `downloads`, `likes`, `trendingScore` | `sort=createdAt` |
  | `direction` | `-1` desc, `1` asc | `direction=-1` |
  | `limit` | Page size (≤ 1000) | `limit=50` |
  | `filter` | Pipeline tag filter: `text-generation`, `text-to-image`, `text-to-video`, `image-text-to-text`, `automatic-speech-recognition`, `feature-extraction`, `robotics`, `any-to-any`, etc. | `filter=text-generation` |
  | `library` | Library filter: `transformers`, `diffusers`, `gguf`, `mlx`, `sentence-transformers`, `transformers.js`, `pytorch`, `tf`, `jax`, `onnx`, `safetensors` | `library=diffusers` |
  | `author` | Restrict to one user/org namespace | `author=meta-llama` |
  | `search` | Free-text substring match on model id | `search=llama-3` |
  | `full` | If `true`, include `author`, `sha`, `gated`, `lastModified`, `siblings[]` (file manifest) | `full=true` |
  | `config` | If `true`, include `config.json` contents (architectures, model_type, tokenizer_config) inline | `config=true` |
  | `cardData` | If `true`, include `cardData` (model-card frontmatter: license, language, datasets, base_model) | `cardData=true` |

  Unrecognized params are silently dropped. Combine filters and `search` for narrow queries, e.g. `?filter=text-to-image&library=diffusers&search=flux&sort=createdAt&direction=-1`.

- Parse each result object. Every item is a flat JSON object (named fields, not positional). Default-mode fields:

  - `id`: `"owner/name"` (e.g. `"meta-llama/Llama-3.2-1B"`) or a single-segment legacy id (`"bert-base-uncased"`). The same value is duplicated as `modelId`.
  - `_id`: MongoDB ObjectId (12 bytes, rendered as 24 hex characters). Its first 4 bytes encode the upload timestamp; this is what `cursor=` paginates against. Don't treat this as the model identifier; use `id`.
  - `createdAt`: ISO-8601 UTC timestamp of initial upload.
  - `tags[]`: string array. Includes raw labels (`"transformers"`, `"safetensors"`, `"qwen2"`), pipeline-tag duplicates (`"text-generation"`), language codes (`"en"`, `"fr"`), license tags (`"license:apache-2.0"`), `"base_model:<id>"`, `"endpoints_compatible"`, and a trailing `"region:us"` deployment-region tag.
  - `pipeline_tag`: canonical task (e.g. `"text-generation"`, `"text-to-image"`). Absent when the uploader didn't tag the model; many fresh uploads have no `pipeline_tag` until the README is committed. Don't assume it's always present.
  - `library_name`: canonical library (e.g. `"transformers"`, `"diffusers"`). Also frequently absent on bare uploads.
  - `downloads`, `likes`: integers; both `0` for fresh uploads (counts are updated with a delay).
  - `private`: always `false` for results returned by this endpoint (private models are filtered server-side).

  With `full=true`, additionally: `author`, `gated` (`false` / `"manual"` / `"auto"`), `lastModified`, `sha` (repo commit SHA), `siblings[]` (array of `{rfilename}` entries, i.e. the file manifest).

- Construct the canonical model URL:

  `https://huggingface.co/{id}`

  `id` is used verbatim, slashes included (`"meta-llama/Llama-Prompt-Guard-2-86M"` → `https://huggingface.co/meta-llama/Llama-Prompt-Guard-2-86M`). No URL-encoding is needed for the slash. The page exists for every model in the response.

- Paginate (only if you need more than `limit` results). The response includes a `Link` header:

  `Link: <https://huggingface.co/api/models?sort=createdAt&direction=-1&limit=50&cursor=eyJfaWQiOnsiJGx0IjoiNmEwZTMxNDFmMjdlNGU0NGU5OTlhMjhhIn19>; rel="next"`

  Parse the URL between `<` and `>` and follow it. The `cursor` is opaque (a base64-encoded `{_id: {$lt: <ObjectId>}}` Mongo predicate); do not decode or construct it yourself. There is no `prev` link. Stop when the `Link` header is absent or the response array is empty.

- Honor the rate limit. Each response carries:

  `Ratelimit-Policy: "fixed window";"api";q=500;w=300`
  `Ratelimit: "api";r=<remaining>;t=<seconds-to-window-reset>`

  That is 500 requests per 300-second fixed window. Stay well below it (e.g. ≤ 1 req/s sustained) and back off when `r` in the `Ratelimit` header drops below ~50. There is no documented per-IP block on overage: the server returns `429 Too Many Requests` and you wait `t` seconds.
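The pagination and rate-limit steps above come down to parsing two response headers. The following is a minimal sketch under the header formats quoted in this section; `next_page_url`, `parse_ratelimit`, and `seconds_to_wait` are hypothetical helper names, and the `floor=50` threshold simply mirrors the "~50" guidance.

```python
import re


def next_page_url(link_header):
    """Return the rel="next" URL from a Link header, or None if there is none."""
    if not link_header:
        return None
    match = re.search(r'<([^>]*)>\s*;\s*rel="?next"?', link_header)
    return match.group(1) if match else None


def parse_ratelimit(header):
    """Parse a value like '"api";r=499;t=287' into {'r': 499, 't': 287}."""
    fields = {}
    for part in (header or "").split(";"):
        key, sep, value = part.strip().partition("=")
        if sep and value.isdigit():
            fields[key] = int(value)
    return fields


def seconds_to_wait(ratelimit_header, floor=50):
    """Return t when the remaining quota r is below floor, otherwise 0."""
    info = parse_ratelimit(ratelimit_header)
    if info.get("r", floor) < floor:
        return info.get("t", 0)
    return 0
```

A polling loop would follow `next_page_url(headers.get("Link"))` until it returns `None`, sleeping for `seconds_to_wait(headers.get("Ratelimit"))` between requests. The same parser also reads `Ratelimit-Policy`, so the quota and window need not be hardcoded.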
Browser fallback
Use only if the JSON API is unreachable from your egress (it shouldn't be — no anti-bot, no geo restrictions observed).
https://huggingface.co/models?sort=created&pipeline_tag=<task>&library=<lib>&search=<q>&p=<page>
- `sort=created` (note: the browser URL uses `created`, the API uses `createdAt`; these are not interchangeable across surfaces).
- `p=N` paginates (0-indexed, 30 results per page).
- The listing is fully JS-rendered; `browse get markdown body` parses cleanly. Each card's text is `<id> [task • params •] Updated <relative-time> ago`, with the canonical URL in the surrounding anchor `<a href="/{id}">`. `browse snapshot` returns workable refs for clicking through to a model page, but for bulk extraction prefer markdown parsing.
Site-Specific Gotchas
- `sort=createdAt` vs `sort=created`: the API endpoint uses `createdAt` (camelCase). The browser URL uses `created` (no suffix). They are not aliased: passing `sort=created` to `/api/models` is silently ignored and the API falls back to its default sort, which is not `createdAt` but `lastModified` descending, so you get stale results that look "recent" but aren't. Always use `sort=createdAt&direction=-1` on the API.
- `createdAt` ≠ `lastModified`: `createdAt` is when the repo was first pushed. `lastModified` is the most recent commit (README edit, weight reupload, etc.). For "newest models" use `createdAt`. For "recently updated models" use `lastModified`. The two diverge by hours or days for active repos.
- `pipeline_tag` and `library_name` are often absent on fresh uploads. Many models surface in `sort=createdAt&direction=-1` with no README and no auto-detected pipeline. Treat both as optional fields and key off `tags[]` if you must classify.
- `_id` is not the model identifier; `id` is. The `_id` field is the internal MongoDB ObjectId; it changes if the repo is recreated. The user-facing identifier is `id` (also exposed as `modelId`). Use `id` for canonical URLs and downstream `/api/models/{id}` lookups.
- No `total_count` is returned. Unlike Craigslist's `totalResultCount`, the HF models endpoint doesn't include a total. The total models count (~2.9M as of 2026-05) is only available from the browser listing page header text. If you need a count, scrape it from `https://huggingface.co/models` and parse the number under the `# Models` h1.
- Cursor pagination is opaque and one-way. The `Link: rel="next"` header carries a base64-encoded Mongo predicate. There is no `rel="prev"`, so you can only walk forward. If you need to resume from a known model, supply `cursor=<base64 of {"_id":{"$lt":"<the model's _id>"}}>`. The predicate is straightforward to construct if you have a prior `_id`, but the safer pattern is to walk from the start and stop when `createdAt < <cutoff>`.
- `limit` ceiling is 1000. Passing `limit=10000` clamps silently to 1000.
- Gated and private models. `private: true` repos are never returned by this endpoint regardless of auth. `gated: "manual"` / `"auto"` repos are returned (visible in the listing), but their model page may require accepting terms before download; the listing itself is public. Surface `gated` to callers when `full=true` so they know to expect a terms-gate on click-through.
- Adult / NSFW models surface in `sort=createdAt`. The default firehose includes user-uploaded LoRAs and image models with explicit names or content. Callers that render results to end users should filter on `tags[]` for `"not-for-all-audiences"` / `"nsfw"` or apply name-based filtering; this is not auto-redacted by the API.
- `tags[]` is a multi-namespace bag, not normalized. The same value can appear as a raw label and as a namespaced tag: `"text-generation"` and `"pipeline_tag:text-generation"` rarely both appear, but `"safetensors"` may appear both as a raw tag and in a `"library:safetensors"`-style namespaced form. Don't expect uniqueness or a stable schema across categories.
- Rate-limit policy header is the source of truth. The 500-per-5-minute number above is observed on the `api` scope as of 2026-05-20. Read `Ratelimit-Policy` on each response in case HF changes it; don't hardcode the window.
- No residential proxy required, no stealth required. `browse cloud fetch <url>` (the bare HTTP path) works fine, so there is no need to spin up a stealth session unless you're also doing browser interactions in the same flow. This is a cost win: a single `fetch` call costs ~$0 vs. ~$0.01–0.05 for a full session.
- `huggingface_hub` Python SDK is the official client (`from huggingface_hub import HfApi; HfApi().list_models(sort="createdAt", direction=-1, limit=50)`). It wraps this exact endpoint with auth/retry/typing. Skill consumers who can run Python should prefer it for ergonomic typed results; agents driving from a sandbox without Python should use the raw HTTP path above.
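To make the `_id` and cursor notes above concrete, here is a small sketch that decodes the sample cursor from the Workflow section and builds one from a known `_id`. It assumes the cursor is exactly compact JSON, base64-encoded, as described above; that format is observed rather than documented, so treat a hand-built cursor as a convenience for resuming, not a contract. The function names are hypothetical.

```python
import base64
import json
from datetime import datetime, timezone


def objectid_timestamp(object_id_hex):
    """Decode the creation time held in the first 4 bytes (8 hex chars) of an ObjectId."""
    seconds = int(object_id_hex[:8], 16)
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def cursor_for(object_id_hex):
    """Build a cursor meaning 'everything older than this _id' (assumed format)."""
    predicate = json.dumps({"_id": {"$lt": object_id_hex}}, separators=(",", ":"))
    return base64.b64encode(predicate.encode("ascii")).decode("ascii")


def cursor_object_id(cursor):
    """Inverse of cursor_for: extract the ObjectId hex, for inspection only."""
    padded = cursor + "=" * (-len(cursor) % 4)  # tolerate stripped padding
    return json.loads(base64.b64decode(padded))["_id"]["$lt"]
```

Applied to the sample cursor `eyJfaWQiOnsiJGx0IjoiNmEwZTMxNDFmMjdlNGU0NGU5OTlhMjhhIn19`, `cursor_object_id` yields `6a0e3141f27e4e44e999a28a`, whose leading four bytes decode to 2026-05-20 22:10:09 UTC.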
Expected Output
{
"query": {
"sort": "createdAt",
"direction": -1,
"limit": 50,
"filter": "text-generation",
"library": null,
"author": null,
"search": null
},
"count": 50,
"next_cursor_url": "https://huggingface.co/api/models?sort=createdAt&direction=-1&limit=50&filter=text-generation&cursor=eyJfaWQiOnsiJGx0IjoiNmEwZTJlZmM2Mjc4ZDhiMmU2MjNlMTk0In19",
"models": [
{
"id": "sstoica12/UAS_qwen7b_medmcqa_100_alpaca_400_proximity_0_8_diversity_0_19999999999999996",
"author": "sstoica12",
"created_at": "2026-05-20T22:00:28.000Z",
"last_modified": "2026-05-20T22:03:07.000Z",
"pipeline_tag": "text-generation",
"library_name": "transformers",
"tags": [
"transformers",
"safetensors",
"qwen2",
"text-generation",
"conversational",
"arxiv:1910.09700",
"text-generation-inference",
"endpoints_compatible",
"region:us"
],
"downloads": 0,
"likes": 0,
"gated": false,
"private": false,
"sha": "1993fd1a13a3aebc3cfb2db24c7c8f32f79b52ed",
"url": "https://huggingface.co/sstoica12/UAS_qwen7b_medmcqa_100_alpaca_400_proximity_0_8_diversity_0_19999999999999996"
},
{
"id": "longtermrisk/Olmo-3-7B-Instruct-replaydistillsftjob-306b1e549725-replay_distillation-a0.3-b0.1-s3407",
"author": "longtermrisk",
"created_at": "2026-05-20T22:10:38.000Z",
"last_modified": "2026-05-20T22:10:44.000Z",
"pipeline_tag": null,
"library_name": "transformers",
"tags": ["transformers", "safetensors", "arxiv:1910.09700", "endpoints_compatible", "region:us"],
"downloads": 0,
"likes": 0,
"gated": false,
"private": false,
"url": "https://huggingface.co/longtermrisk/Olmo-3-7B-Instruct-replaydistillsftjob-306b1e549725-replay_distillation-a0.3-b0.1-s3407"
}
]
}
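One way to map a raw API item to the snake_case record shown above is sketched below. `normalize_model` is a hypothetical helper; deriving `author` from the `id` prefix when `full=true` was not requested is an assumption of this sketch, not documented API behavior, and `gated` is left as `None` when the API did not report it.

```python
def normalize_model(raw):
    """Convert one raw /api/models item into the output record format."""
    model_id = raw["id"]
    # Assumption: without full=true there is no author field, so fall back
    # to the owner segment of "owner/name"; legacy ids have no owner.
    owner = model_id.split("/", 1)[0] if "/" in model_id else None
    record = {
        "id": model_id,
        "author": raw.get("author") or owner,
        "created_at": raw.get("createdAt"),
        "last_modified": raw.get("lastModified"),  # present only with full=true
        "pipeline_tag": raw.get("pipeline_tag"),   # often absent on fresh uploads
        "library_name": raw.get("library_name"),   # often absent on fresh uploads
        "tags": list(raw.get("tags", [])),
        "downloads": raw.get("downloads", 0),
        "likes": raw.get("likes", 0),
        "gated": raw.get("gated"),                 # None means "not reported"
        "private": raw.get("private", False),
        "url": f"https://huggingface.co/{model_id}",
    }
    if "sha" in raw:
        record["sha"] = raw["sha"]
    return record
```

Using `.get` throughout keeps the mapping tolerant of the optional fields called out in the gotchas, so a bare upload with no README still produces a complete record.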
Three shapes the caller should be prepared for:
// 1. Normal — array of model objects, plus next cursor URL.
{ "count": 50, "next_cursor_url": "...", "models": [...] }
// 2. End of pagination — empty array, no Link header.
{ "count": 0, "next_cursor_url": null, "models": [] }
// 3. Rate-limited — server returns 429, no JSON body, with Retry-After / Ratelimit headers indicating wait time.
{ "error": "rate_limited", "retry_after_seconds": 187, "ratelimit_remaining": 0 }
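The three shapes above can be produced from a single page fetch roughly as follows. This is a sketch only: `shape_result` and `ratelimit_fields` are hypothetical names, `headers` is assumed to be a plain dict with exactly these key spellings, and preferring `Retry-After` over the `t` field of `Ratelimit` is a design choice of the sketch.

```python
def ratelimit_fields(value):
    """Parse '"api";r=0;t=187' into {'r': 0, 't': 187}."""
    fields = {}
    for part in (value or "").split(";"):
        key, sep, number = part.strip().partition("=")
        if sep and number.isdigit():
            fields[key] = int(number)
    return fields


def shape_result(status, headers, models=None, next_url=None):
    """Return the normal, end-of-pagination, or rate-limited shape."""
    if status == 429:
        limits = ratelimit_fields(headers.get("Ratelimit"))
        retry_after = headers.get("Retry-After", "")
        wait = int(retry_after) if retry_after.isdigit() else limits.get("t")
        return {
            "error": "rate_limited",
            "retry_after_seconds": wait,
            "ratelimit_remaining": limits.get("r", 0),
        }
    models = models or []
    # An empty list with next_url=None is the end-of-pagination shape.
    return {"count": len(models), "next_cursor_url": next_url, "models": models}
```

Callers can then branch on the presence of the `error` key before touching `models`.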