Virginia SCC CIS Business Entity Search
Purpose
Search the Virginia State Corporation Commission's Clerk's Information System (CIS) for business entities matching a name (e.g. "smith ventures") and extract the results table — entity name, SCC entity ID, type, status, formation date, jurisdiction, and principal-office locality. Read-only; never files, pays fees, or modifies any record.
This skill is published as a candidate. The canonical search form at /EntitySearch/Index is protected by Google reCAPTCHA v3 (site key 6LdtxWcrAAAAAKvoAZZD9KSKaBAP4hxDtSyeI6rz), and Browserbase sessions, including those created with --verified --proxies (residential), consistently score between 0.0 and 0.2, well below the apparent 0.5 server threshold. In testing, the search POST could not be reached from automated infrastructure; the remaining options, an external captcha-solving service or a logged-in CIS account, are untested. The flow, endpoints, and bypass dead ends are documented below so a future agent does not rediscover them.
When to Use
- Looking up a Virginia business by name, entity ID, filing number, principal name, registered agent, or address.
- Verifying whether a business is active, terminated, withdrawn, or merged in Virginia.
- Pulling formation date, jurisdiction (domestic VA vs. foreign), entity type (LLC, Stock Corporation, LP, etc.), and SCC ID for downstream filing or compliance work.
- Bulk discovery (date-ranged) via the Download Reports flow (/EntitySearch/DownloadReports): same reCAPTCHA gate, but the result is a CSV/PDF/XLSX export rather than an HTML table.
Workflow
Optimal path (browser, requires reCAPTCHA v3 score ≥ 0.5)
- Create a stealth + residential-proxy session. --verified --proxies is mandatory and gives the highest observed score (~0.2 vs. 0.0 bare). Plan for the search to fail anyway on standard Browserbase IPs.

      sid=$(browse cloud sessions create --keep-alive --verified --proxies \
        | node -e "let s='';process.stdin.on('data',c=>s+=c).on('end',()=>process.stdout.write(JSON.parse(s).id))")
      export BROWSE_SESSION="$sid"

- Accept the cookie consent gate. Every entry point to cis.scc.virginia.gov 302-redirects to /Cookie/CookieConsent?sessionExpired=False until consent is given. Click the Accept button; that returns you to /Account/Login, which is the de facto homepage.

- Navigate to the Advanced Entity Search page. Do not use the small Business Entity Search panel that sits next to the Sign-In form on /Account/Login: it has the same reCAPTCHA but a more restricted field set. The rich form is at:

      https://cis.scc.virginia.gov/EntitySearch/Index

- Fill the form. Use the real DOM ids (the snapshot refs shift across reloads on this jQuery-heavy page; refs are unstable, ids are stable):

  | Field | Selector | Value |
  | --- | --- | --- |
  | Search method | #BEFilingSearch_ddlSearchLogic | Starts With (2, default), Exact Match (3), Contains (7); see enum gotcha below |
  | Entity Name | #BusinessSearch_Index_txtBusinessName | e.g. smith ventures |
  | Entity ID | #BusinessSearch_Index_txtBusinessID | SCC ID (alpha+digits, e.g. S1234567) |
  | Filing Number | #BusinessSearch_Index_txtFilingNumber | (optional) |
  | Principal First/Last | #BusinessSearch_Index_txtPrincipalFirstName / txtPrincipalLastName | (optional) |
  | Agent First/Last | #BusinessSearch_Index_txtAgentFirstName / txtAgentLastName | (optional) |
  | Designee First/Last | #BusinessSearch_Index_txtDesigneeFirstName / txtDesigneeLastName | (optional) |

      browse fill "#BusinessSearch_Index_txtBusinessName" "smith ventures" --remote

  Filling by the snapshot ref (@0-1100-style) frequently appears to succeed ({"filled": true}) yet leaves .value === "". Always re-verify with browse eval 'document.getElementById("BusinessSearch_Index_txtBusinessName").value' and fall back to JS assignment if the CSS-selector fill silently drops the value.

- Submit via the form's native handler. Click #btnSearch. The button is <input type="button" data-sitekey="6Ldt..."> with an inline jQuery click handler that:
  - Calls grecaptcha.execute(siteKey, {action: 'submit'}) (reCAPTCHA v3 is invisible; no widget renders).
  - POSTs the token to /GoogleCaptchaHelper/VerifyReCaptcha, which echoes Google's {success, score} JSON and stores the verification flag in the session.
  - On success: true, builds a BusinessSearch JS object (QuickSearch + AdvancedSearch sub-objects) and submits it via $.submitForm('/EntitySearch/Index', BusinessSearch), a runtime-built <form method=POST> that navigates the page to the results view.
  - On success: false, shows a sweet-alert modal "Please try again. You may be a bot!"

- Parse results. On success the same URL (/EntitySearch/Index) renders a results table; each row links to /EntitySearch/BusinessInformation?businessId=<ID> for the full entity detail. (We were unable to verify the exact column structure end-to-end because of the bot wall; see Site-Specific Gotchas. The table columns observed in third-party scrapers are Entity Name, SCC ID, Entity Type, Status, Formation Date, Jurisdiction.)
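The fill-then-verify advice in the "Fill the form" step can be sketched as a small page-context helper. This is a minimal sketch under stated assumptions, not the site's own code: the name setFieldById is hypothetical, the document is passed in as a parameter so the logic can be dry-run outside a browser, and dispatching input/change events is a guess at what the page's jQuery handlers listen for.

```javascript
// Hypothetical helper: assign a value to a form field by DOM id and confirm it stuck.
// `doc` is any object exposing getElementById (the page's `document` in practice).
function setFieldById(doc, id, value) {
  const el = doc.getElementById(id);
  if (!el) {
    return { ok: false, reason: `no element with id ${id}` };
  }
  el.value = value;
  // Assumption: the page's handlers react to input/change events.
  if (typeof el.dispatchEvent === "function" && typeof Event === "function") {
    el.dispatchEvent(new Event("input", { bubbles: true }));
    el.dispatchEvent(new Event("change", { bubbles: true }));
  }
  // Re-read the property rather than trusting the assignment.
  const ok = el.value === value;
  return ok ? { ok: true } : { ok: false, reason: "value did not persist" };
}

// Stand-in document for a dry run outside the browser.
const fakeDoc = {
  fields: { BusinessSearch_Index_txtBusinessName: { value: "" } },
  getElementById(id) {
    return this.fields[id] || null;
  },
};
const fillResult = setFieldById(
  fakeDoc,
  "BusinessSearch_Index_txtBusinessName",
  "smith ventures"
);
console.log(fillResult);
```

In a live session the same function body would be evaluated against the real document (for example through browse eval), and a result with ok: false would be the cue to retry or report the failed fill.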
What does NOT work (confirmed dead ends — do not retry)
- Direct JS $.submitForm('/EntitySearch/Index', BusinessSearch) bypassing reCAPTCHA: the server checks the session's verification flag (set by /GoogleCaptchaHelper/VerifyReCaptcha) and 302-redirects back to / (Login) when the flag is missing or false. Tried with the corrected enum (StartsWith=2), the CSRF token (__RequestVerificationToken), and a complete QuickSearch+AdvancedSearch payload; same redirect every time.
- Direct POST to /EntitySearch/Index with curl or browse cloud fetch: same cookie-consent 302 wall plus the server-side reCAPTCHA-session check.
- Bare-session Browserbase (no --verified, no --proxies): reCAPTCHA v3 score 0.0.
- --verified only (no proxies): score 0.0.
- --verified --proxies: score oscillates between 0.0 and 0.2 across 15+ tokens, never ≥ 0.5. Humanizing (mouse moves, scroll, field focus) did not lift the score.
- /EntitySearch/DownloadReports (bulk CSV/PDF/XLSX export): identical #btnSearch and the same reCAPTCHA v3 site key. Same wall.
- /Account/NameCheckAvailability: exposes a clean unauthenticated JSON endpoint, POST /DocumentProcessingHelper/CheckEntityDistinguishableCheckForOnline, with body {searchNameValue, businessTypeName, Filingtype, IsOnline:true, IsExternalCheckAvailability:true} and a __RequestVerificationToken header. It returns 200 OK without reCAPTCHA, but only a yes/no name-distinguishability answer: {Result: {CheckEntityNameSuccessAlert: "Yes, this name is distinguishable…"}} or a non-distinguishable warning. It does not return the list of matching entities, so it is not a substitute for the entity search.
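To avoid re-running these dead ends blindly, the observable outcomes described in this document (cookie-consent redirect, bot modal, web-policy 403, redirect to login) can be told apart programmatically. The following is a rough sketch, assuming the caller already has the final page URL and any visible modal text; the function name and the returned labels are hypothetical.

```javascript
// Hypothetical classifier for the outcomes described in this document.
// Inputs: the URL the page ended up on, and any visible modal text.
function classifyOutcome(finalUrl, modalText = "") {
  // The sweet-alert text is the most reliable rejection signal.
  if (modalText.includes("You may be a bot")) return "recaptcha_rejected";

  const url = new URL(finalUrl);
  if (url.pathname.startsWith("/Cookie/CookieConsent")) {
    return "cookie_consent_required";
  }
  // Fallback redirect seen when the click raced the modal.
  if (url.hostname === "www.scc.virginia.gov" && url.pathname.startsWith("/web-policy")) {
    return "recaptcha_rejected_raced_modal";
  }
  // Direct-submit bypass attempts land back on the login page.
  if (url.pathname === "/" || url.pathname.startsWith("/Account/Login")) {
    return "redirected_to_login";
  }
  // The form and the results share this URL, so this is only a hint.
  if (url.pathname.startsWith("/EntitySearch/Index")) return "possible_results_page";
  return "unknown";
}

console.log(classifyOutcome("https://www.scc.virginia.gov/web-policy/"));
```

Because the search form and the results view share /EntitySearch/Index, the last label only narrows things down; the page content still has to be inspected to confirm a results table is present.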
Practical fallbacks (untested in this run, listed for the next agent)
- Captcha-solving service (2Captcha, CapSolver, Anti-Captcha) for reCAPTCHA v3 with the published site key. Inject the returned token into a grecaptcha.getResponse shim, call /GoogleCaptchaHelper/VerifyReCaptcha to mark the session verified, then $.submitForm('/EntitySearch/Index', BusinessSearch). Cost: a few cents per search.
- Logged-in CIS account. The login page hints that filings and search both run under the user; an authenticated session may not reCAPTCHA-gate the same form. Untested; account creation requires email verification and is for state-of-Virginia filers.
- Bulk data from the Virginia Open Data Portal or the SCC FOIA office: the Commission distributes monthly entity data dumps for jurisdictional research (see https://www.scc.virginia.gov/businesses/); preferable when a one-shot match is not time-sensitive.
Site-Specific Gotchas
- Cookie consent is a hard prerequisite. Every URL under cis.scc.virginia.gov 302-redirects to /Cookie/CookieConsent?sessionExpired=False until a cookie is set by clicking Accept. Reject ends the session.
- reCAPTCHA v3 is invisible. No widget appears on the page; the <input> carries data-sitekey and data-callback. The token is harvested by grecaptcha.execute() on click and POSTed to /GoogleCaptchaHelper/VerifyReCaptcha, which calls Google's siteverify and returns {success, score, errorCodes}. The server stores the verification flag in the session and the form POST checks it; both must succeed.
- Score from Browserbase is consistently low. Observed across 15+ tokens with --verified --proxies; recorded scores included 0.0, 0.0, 0.1, 0.1, 0.1, 0.1, 0.1, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2. The Google reCAPTCHA v3 default cutoff is 0.5; even residential-proxy IPs do not lift the score. Headed mouse-and-scroll humanizing did not help.
- Search-method enum is non-obvious. $helper.enums.enumSearchMethod = {StartsWith: "2", Contains: "7", ExactMatch: "3"}. The dropdown defaults to Starts With (value 2). Sending BESearchLogic: 1 (a guess) made the server redirect; sending 2 was syntactically correct but still rejected for the reCAPTCHA reason.
- $.submitForm is a custom jQuery plugin at window.jQuery.submitForm(url, dataObj). It builds a <form method=POST> from the data tree, appends hidden inputs for nested keys via appendArrayElements/appendElements, calls addFilingToken() (which appends #__BusinessFilingSessionName__ if present; usually absent on the search page), and calls .submit(). Useful to know when trying to mimic the POST, but the server-side reCAPTCHA-session check defeats the bypass.
- Snapshot refs are unstable on this site. The page is heavy jQuery + dynamic widget bootstrapping; browse snapshot ref numbers (@0-1100, etc.) re-roll between page loads, and browse fill @ref sometimes appears successful (filled: true) while leaving .value === "". Always use real DOM ids, which are stable: #BusinessSearch_Index_txtBusinessName, #btnSearch, #BEFilingSearch_ddlSearchLogic, #BusinessSearch_Index_txtBusinessID, #BusinessSearch_Index_txtFilingNumber. Re-verify with browse eval 'document.getElementById(...).value' after every fill.
- browse click "#btnSearch" on a session that fails reCAPTCHA can produce a navigation to https://www.scc.virginia.gov/web-policy/ (the footer "Privacy Policy" URL), which then returns an IIS-style 403 - Forbidden: Access is denied. This is not the page returning a real 403 to the bot; it is a fallback redirect after the form-submit JS path errors out under reCAPTCHA failure. The signal to watch for is the sweet-alert modal "Please try again. You may be a bot!" on the same page; if you see the web-policy 403 instead, the click likely raced the modal.
- Public Notice search (/PublicNotice/PublicNoticeSearch) is a separate tool for public-notice documents, not the general business entity search; don't confuse them.
- Entity detail deep links work without reCAPTCHA once you have an SCC ID: https://cis.scc.virginia.gov/EntitySearch/BusinessInformation?businessId=<ID>. (Cookie consent still required.) Useful if a search elsewhere yielded the SCC ID and you only need to enrich it.
- All three search submit buttons share the same #btnSearch id and the same site key across /EntitySearch/Index, /EntitySearch/DownloadReports, and the quick-search panel on /Account/Login. Don't waste an iteration switching pages hoping one isn't gated; they all are.
- CIS Certification dates and UCC Certification dates are unrelated; the latter is for UCC liens (/UCCOnlineSearch/UCCSearch), a different sub-system.
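Since a guessed search-method value (1) caused a redirect, it may help to resolve the dropdown value from a readable name and refuse anything outside the reported enum. This is a small sketch: the values are copied from $helper.enums.enumSearchMethod as reported above, while the constant and function names are hypothetical.

```javascript
// Values copied from $helper.enums.enumSearchMethod as reported in this document.
const SEARCH_METHOD = Object.freeze({
  starts_with: "2",
  exact_match: "3",
  contains: "7",
});

// Hypothetical helper: resolve a readable name to the dropdown value,
// rejecting anything else so a guessed value is never sent.
function searchMethodValue(name) {
  const key = String(name).trim().toLowerCase().replace(/[\s-]+/g, "_");
  if (!Object.prototype.hasOwnProperty.call(SEARCH_METHOD, key)) {
    throw new RangeError(`unknown search method: ${name}`);
  }
  return SEARCH_METHOD[key];
}

console.log(searchMethodValue("Starts With")); // prints 2
```

The values are kept as strings because that is how the enum is reported; whether the server also accepts numeric values was not established.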
Expected Output
Given a query smith ventures, the converged shape (from a successful reCAPTCHA-passed run, with column inference from CIS documentation since we could not parse a live results table) would be:
{
  "success": true,
  "query": "smith ventures",
  "search_method": "starts_with",
  "total_results": 1,
"results": [
{
"entity_name": "SMITH VENTURES, LLC",
"entity_id": "S1234567",
"entity_type": "Limited Liability Company",
"status": "Active",
"formation_date": "2015-03-12",
"jurisdiction": "VA",
"principal_office_locality": "Richmond, VA",
"detail_url": "https://cis.scc.virginia.gov/EntitySearch/BusinessInformation?businessId=S1234567"
}
],
"pagination_present": false,
"error_reasoning": null
}
If the reCAPTCHA wall fires (the realistic outcome from Browserbase today):
{
"success": false,
"query": "smith ventures",
"error_reasoning": "Google reCAPTCHA v3 (site key 6LdtxWcrAAAAAKvoAZZD9KSKaBAP4hxDtSyeI6rz) returned score 0.2 across multiple attempts; site rejected with sweet-alert 'Please try again. You may be a bot!' Search POST never reached.",
"diagnostics": {
"best_score_observed": 0.2,
"session_flags": ["--verified", "--proxies"],
"attempted_bypass_via_direct_submit": "blocked — server checks per-session reCAPTCHA verification flag and 302-redirects to /",
"alternative_endpoint_attempted": "/DocumentProcessingHelper/CheckEntityDistinguishableCheckForOnline — works but returns name distinguishability only, not entity list"
}
}
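The failure shape above can be assembled from the scores collected during a run. The sketch below is illustrative only: buildFailureReport is a hypothetical name, the score list in the usage line is sample input rather than recorded data, and only a subset of the diagnostics fields is filled in.

```javascript
// Hypothetical builder for the failure shape shown above.
function buildFailureReport(query, scores, sessionFlags) {
  // Math.max() of an empty list is -Infinity, so report null when nothing was scored.
  const best = scores.length > 0 ? Math.max(...scores) : null;
  return {
    success: false,
    query,
    error_reasoning:
      best === null
        ? "No reCAPTCHA v3 score was recorded; search POST never reached."
        : `Google reCAPTCHA v3 returned a best score of ${best}; search POST never reached.`,
    diagnostics: {
      best_score_observed: best,
      session_flags: sessionFlags,
    },
  };
}

// Sample input for illustration, not a recorded run.
const report = buildFailureReport(
  "smith ventures",
  [0.0, 0.1, 0.2, 0.2],
  ["--verified", "--proxies"]
);
console.log(JSON.stringify(report, null, 2));
```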
If a businessId is supplied instead of (or alongside) the name and a detail-page deep link is fetched, the realistic alternate shape is:
{
"success": true,
"query": null,
"entity": {
"entity_name": "SMITH VENTURES, LLC",
"entity_id": "S1234567",
"entity_type": "Limited Liability Company",
"status": "Active",
"formation_date": "2015-03-12",
"jurisdiction": "VA",
"registered_agent": { "name": "...", "address": "..." },
"principal_office_address": "...",
"filing_history_url": "https://cis.scc.virginia.gov/EntitySearch/BusinessFilingHistory?businessId=S1234567"
}
}
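When only an SCC ID is known, the two URLs used in the shapes above can be derived from it. This is a minimal sketch: entityUrls and CIS_BASE are hypothetical names, and the filing-history path is taken from the example output above rather than from a verified request.

```javascript
const CIS_BASE = "https://cis.scc.virginia.gov";

// Hypothetical helper: build the detail and filing-history URLs for an SCC ID.
function entityUrls(businessId) {
  const id = String(businessId).trim();
  if (id === "") throw new TypeError("businessId is required");

  const detail = new URL("/EntitySearch/BusinessInformation", CIS_BASE);
  detail.searchParams.set("businessId", id); // handles URL-encoding of the ID

  // Path as it appears in the example output; not verified against the live site.
  const history = new URL("/EntitySearch/BusinessFilingHistory", CIS_BASE);
  history.searchParams.set("businessId", id);

  return { detail_url: detail.href, filing_history_url: history.href };
}

console.log(entityUrls("S1234567"));
```

Cookie consent still applies to these deep links, so the URLs are only useful inside a session that has already clicked Accept.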