Reverse-Engineering SPA APIs
When a government site hides behind a JavaScript app, the data is still there. You just need to find the API the frontend calls.
The Core Insight
Most React/Vue/Angular apps are a front end over a JSON API: the browser makes HTTP requests to a backend. Find those endpoints and call them with curl, no browser needed.
Step 1: Fetch the HTML Shell
curl -s 'https://target-site.gov/' | head -50
Look for script tags: the src attribute gives the filename of the main JS bundle.
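If scanning the HTML by eye is tedious, the same lookup can be sketched in Python. This is a minimal sketch using only the standard library; the sample HTML and the bundle name index-abc123.js are made up for illustration and do not come from any real site.

```python
from html.parser import HTMLParser


class ScriptSrcParser(HTMLParser):
    """Collect the src attribute of every <script> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def find_script_sources(html):
    """Return the script URLs referenced by an HTML document."""
    parser = ScriptSrcParser()
    parser.feed(html)
    return parser.sources


# Hypothetical HTML shell of the kind a bundler typically emits.
sample_html = (
    '<html><head><script type="module" src="/assets/index-abc123.js">'
    '</script></head><body><div id="root"></div></body></html>'
)
print(find_script_sources(sample_html))  # ['/assets/index-abc123.js']
```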
Step 2: Grep the JS Bundle for API Paths
curl -s 'https://target-site.gov/assets/index-HASH.js' \
| grep -oP '/api/[^"\s]+' | head -40
Hunt for /api/v1/something or baseURL assignments.
CIRIS result: /api/ciris/v1
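Where grep -P is unavailable (it is a GNU extension), the same search can be approximated in Python once the bundle is saved to disk. This is a sketch only; the sample bundle text and the /api/v1/widgets path are invented for the demonstration.

```python
import re

# Match /api/... up to the next quote, backtick, or whitespace.
API_PATH = re.compile(r"/api/[^\"'\s`]+")


def find_api_paths(bundle_text):
    """Return the unique API-looking paths found in bundle text, sorted."""
    return sorted(set(API_PATH.findall(bundle_text)))


# Hypothetical minified snippet, not taken from a real bundle.
sample_bundle = 'var c={baseURL:"/api/v1"};fetch("/api/v1/widgets");fetch("/api/v1/widgets")'
print(find_api_paths(sample_bundle))  # ['/api/v1', '/api/v1/widgets']
```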
Step 3: Find Service/Resource Names
curl -s '...index-HASH.js' \
| grep -oP '"[a-z][a-zA-Z/-]{2,40}"' \
| sort -u | grep -iE 'search|person|result'
CIRIS result: incarceratedPerson — actual endpoint was incarceratedpersons (lowercase, plural).
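Because the name in the bundle and the name in the URL can differ in case and number, it may help to generate the spellings to try. The helper below is a hypothetical sketch with deliberately naive pluralization (it only adds or strips a trailing "s").

```python
def endpoint_candidates(name):
    """Return de-duplicated singular/plural, original/lowercase spellings."""
    singular = name[:-1] if name.endswith("s") else name
    plural = singular + "s"
    variants = []
    for base in (singular, plural):
        for form in (base, base.lower()):
            if form not in variants:
                variants.append(form)
    return variants


print(endpoint_candidates("incarceratedPerson"))
# ['incarceratedPerson', 'incarceratedperson', 'incarceratedPersons', 'incarceratedpersons']
```

Each candidate can then be appended to the base path found in Step 2 and probed with curl.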
Step 4: Find the Function That Makes the Call
curl -s '...index-HASH.js' \
| sed 's/;/;\n/g' \
| grep 'YOUR_RESOURCE_NAME'
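Breaking on semicolons turns minified code into readable lines. This reveals the HTTP method, the parameter names, and how the URL is constructed. (The \n in the replacement works in GNU sed; BSD/macOS sed needs a literal newline instead.)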
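The same split-and-filter idea can be sketched in Python, which sidesteps differences between sed implementations. The sample statement and the http.get call are invented; a real bundle will look different.

```python
def statements_mentioning(bundle_text, needle):
    """Split minified JS on semicolons and keep statements containing needle."""
    return [part.strip() for part in bundle_text.split(";") if needle in part]


# Hypothetical minified code, for illustration only.
sample_js = 'var a=1;function f(q){return http.get(base+"/widgets",{params:q})};var b=2'
for statement in statements_mentioning(sample_js, "widgets"):
    print(statement)
```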
Step 5: Extract Parameter Names
curl -s '...index-HASH.js' \
| sed 's/,/,\n/g' \
| grep -iE 'append|param|query'
CIRIS revealed: lastName (required), firstName, ageMin, ageMax, commitmentCounties, $limit, $skip, $sort[field]
Step 6: Identify the Framework from Error Responses
curl -s 'https://target/api/v1/wrong' \
-H 'Accept: application/json'
{"name":"NotFound","className":"not-found","code":404}
That error shape = Feathers.js.
| SIGNAL (error fields or query params) | LIKELY FRAMEWORK |
|---|---|
| className, code, name | Feathers.js |
| $limit, $skip, $sort | Feathers.js |
| /api/v2/ + filter[field] | JSON:API (Rails) |
| GraphQL POST body | Apollo / GraphQL |
| _page, _limit | json-server |
| offset, limit | Django REST / FastAPI |
| pageToken / nextPageToken | Google-style APIs |
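The first row of the table can be turned into a quick check. This is a rough sketch covering only the Feathers.js error shape shown above; the function name guess_framework is hypothetical, and a match is a hint rather than proof.

```python
import json


def guess_framework(body):
    """Return 'Feathers.js' if a JSON error body has its field set, else None."""
    try:
        data = json.loads(body)
    except ValueError:
        return None  # not JSON at all
    if isinstance(data, dict) and {"name", "className", "code"}.issubset(data):
        return "Feathers.js"
    return None


print(guess_framework('{"name":"NotFound","className":"not-found","code":404}'))
# Feathers.js
```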
Step 7: Make the Call
curl -s 'https://ciris.mt.cdcr.ca.gov/api/ciris/v1/incarceratedpersons?lastName=Smith&%24limit=25&%24sort%5BfullName%5D=1' \
-H 'Accept: application/json' \
| python3 -m json.tool
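Hand-encoding $ and brackets is error-prone, so the query string can instead be built with urllib.parse.urlencode from the standard library. This sketch reuses the URL and parameters from the curl call above; the function name build_search_url is an assumption for illustration.

```python
from urllib.parse import urlencode

BASE = "https://ciris.mt.cdcr.ca.gov/api/ciris/v1/incarceratedpersons"


def build_search_url(last_name, limit=25):
    """Build the search URL, percent-encoding $, [ and ] in parameter names."""
    params = {
        "lastName": last_name,
        "$limit": limit,
        "$sort[fullName]": 1,
    }
    return f"{BASE}?{urlencode(params)}"


print(build_search_url("Smith"))
# https://ciris.mt.cdcr.ca.gov/api/ciris/v1/incarceratedpersons?lastName=Smith&%24limit=25&%24sort%5BfullName%5D=1
```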
Step 8: Parse Results
curl -s 'FULL_URL' -H 'Accept: application/json' \
| python3 -c "
import json, sys
data = json.load(sys.stdin)
print(f'Total: {data[\"total\"]}')
for p in data.get('data', []):
    print(f'  {p[\"fullName\"]} | {p[\"age\"]}')"
Tips
URL-encode special chars: $ → %24, [ → %5B, ] → %5D
Try singular AND plural resource names.
Try lowercase — JS says incarceratedPerson, endpoint is incarceratedpersons.
Check for auth: 401/403 means look in JS for token handling.
Pagination: check total vs returned count — there may be more pages.
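The pagination tip can be sketched as a loop that keeps requesting pages until the rows collected reach total. It assumes the {"total": ..., "data": [...]} response shape seen in Step 8 and $limit/$skip-style paging. The fetch_page callable is a hypothetical stand-in for the real HTTP request, and the demo uses fake records so it runs offline.

```python
def fetch_all(fetch_page, limit=25):
    """Collect every row by advancing skip until total is reached."""
    rows = []
    skip = 0
    while True:
        page = fetch_page(limit, skip)
        rows.extend(page["data"])
        skip += len(page["data"])
        # Stop on an empty page too, in case total is inaccurate.
        if not page["data"] or skip >= page["total"]:
            return rows


# Fake backend with 60 made-up records, standing in for the real API.
FAKE_RECORDS = [{"fullName": f"Person {i}"} for i in range(60)]


def fake_fetch(limit, skip):
    return {"total": len(FAKE_RECORDS), "data": FAKE_RECORDS[skip:skip + limit]}


print(len(fetch_all(fake_fetch)))  # 60
```

In real use, fetch_page would issue the request with $limit and $skip set, ideally with a pause between calls to avoid hammering the server.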