Open-source GitHub scrapers in two forms:
- copy-pasteable Olostep parser scripts for the Parsers dashboard
- a local Chrome extension that parses the active GitHub tab in the popup
This first version supports:
- repository root pages: `https://github.com/{owner}/{repo}`
- personal user profile root pages: `https://github.com/{username}`
Unsupported in v1:
- organization pages
- repository subpages like `issues`, `pulls`, `actions`
- profile tab routes like `?tab=repositories`
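
The supported and unsupported routes above can be checked before any parsing runs. The following is a minimal sketch, not code from this repository: `classifyGitHubUrl`, its return values, and the reserved-path list are all hypothetical. A URL alone cannot distinguish a personal profile from an organization page, so a single-segment path is only a candidate profile.

```javascript
// Hypothetical helper; the reserved list is an illustrative assumption, not exhaustive.
const RESERVED_TOP_LEVEL = new Set(["orgs", "settings", "marketplace", "explore", "topics", "login"]);

function classifyGitHubUrl(rawUrl) {
  let url;
  try {
    url = new URL(rawUrl);
  } catch {
    return "unsupported"; // not a parseable absolute URL
  }
  if (url.hostname !== "github.com") return "unsupported";
  // Profile tab routes such as ?tab=repositories are out of scope in v1.
  if (url.searchParams.has("tab")) return "unsupported";

  const segments = url.pathname.split("/").filter(Boolean);
  if (segments.length === 0 || RESERVED_TOP_LEVEL.has(segments[0])) return "unsupported";
  if (segments.length === 1) return "userProfile"; // could still be an organization
  if (segments.length === 2) return "repository";
  return "unsupported"; // subpages such as issues, pulls, actions
}
```

A caller would still need a page-level check (for example, in the parser itself) to reject organization pages that pass this URL test.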
- `parsers/github-repository.parser.js`: Olostep parser for repository root pages
- `parsers/github-user-profile.parser.js`: Olostep parser for user profile root pages
- `extension/`: Manifest V3 Chrome extension
- `docs/github-parser-notes.md`: parser and blog-post notes
Each parser is written to match the observed Olostep dashboard contract:

    async function parse(htmlString, pageUrl) {
      // pageUrl is optional
    }

The implementation relies on `htmlString` and `DOMParser`, so the same file stays portable between the dashboard and API execution.
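
To make the contract concrete, here is a minimal sketch of a parser body. It is not the code in `parsers/`: the `readMeta` helper and the `title` field are hypothetical, and the sketch assumes the page exposes Open Graph `<meta>` tags and that `DOMParser` is available as a global in the execution environment.

```javascript
// Hypothetical helper: reads a <meta property="..."> value, or null if absent.
function readMeta(doc, property) {
  const node = doc.querySelector(`meta[property="${property}"]`);
  const content = node ? node.getAttribute("content") : null;
  return content ? content.trim() : null;
}

async function parse(htmlString, pageUrl) {
  const doc = new DOMParser().parseFromString(htmlString, "text/html");
  const title = readMeta(doc, "og:title");
  return {
    success: title !== null,
    // pageUrl is optional, so fall back to the URL declared in the markup.
    url: pageUrl || readMeta(doc, "og:url"),
    title, // illustrative field, not one of the documented parser fields
  };
}
```

Because the function only touches its arguments and `DOMParser`, nothing in it depends on where the HTML came from.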
- Open the Parsers dashboard.
- Create a new parser.
- Paste either `parsers/github-repository.parser.js` or `parsers/github-user-profile.parser.js`.
- Add a parser name and a GitHub run target URL.
- Save and continue.
Olostep invokes saved parsers through the scrape API using the parser id.

    const endpoint = "https://api.olostep.com/v1/scrapes";
    const payload = {
      formats: ["json", "html"],
      parser: { id: "@your-parser-id" },
      url_to_scrape: "https://github.com/octocat"
    };

Observed response shape:
- parsed JSON appears in `result.json_content`
- raw HTML appears in `result.html_content`
- hosted artifacts appear in `result.json_hosted_url` and `result.html_hosted_url`
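
Putting the endpoint, payload, and response shape together, a call might look like the sketch below. The function names are hypothetical, and two details are assumptions rather than facts from this README: that the API accepts a bearer token in the `Authorization` header, and that `json_content` may arrive either as a JSON string or as an object. Check both against the Olostep API documentation.

```javascript
// Builds the fetch arguments for a saved-parser scrape.
function buildScrapeRequest(apiKey, parserId, targetUrl) {
  return {
    endpoint: "https://api.olostep.com/v1/scrapes",
    options: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`, // assumption: bearer-token auth
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        formats: ["json", "html"],
        parser: { id: parserId },
        url_to_scrape: targetUrl,
      }),
    },
  };
}

// Pulls the parsed JSON out of the observed response shape.
function readParsedJson(responseBody) {
  const result = responseBody ? responseBody.result : undefined;
  const content = result ? result.json_content : undefined;
  if (content === undefined || content === null) return null;
  // Assumption: tolerate both a JSON string and an already-decoded object.
  return typeof content === "string" ? JSON.parse(content) : content;
}

async function scrapeWithParser(apiKey, parserId, targetUrl) {
  const { endpoint, options } = buildScrapeRequest(apiKey, parserId, targetUrl);
  const response = await fetch(endpoint, options);
  if (!response.ok) throw new Error(`Olostep request failed: ${response.status}`);
  return readParsedJson(await response.json());
}
```

Reading the key from an environment variable (for example a hypothetical `OLOSTEP_API_KEY`) rather than hard-coding it keeps it out of logs and commits.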
The extension runs locally and does not call the Olostep API. It mirrors the same parsing logic in the popup.
- Open `chrome://extensions`
- Enable Developer Mode
- Click Load unpacked
- Select the `extension` folder
- Open a supported GitHub repository root or personal profile root.
- Open the extension popup.
- Click `Parse current page`.
- Review or copy the generated JSON.
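
For readers curious how a popup can reach the page it sits over, the sketch below shows one possible approach using the Manifest V3 `chrome.tabs` and `chrome.scripting` APIs. It is not necessarily how `extension/` is implemented: `getActiveTabHtml` is a hypothetical name, and the sketch assumes the manifest declares the `activeTab` and `scripting` permissions.

```javascript
// chromeApi is injected so the function can be exercised outside the browser;
// inside the popup it would be the global `chrome` object.
async function getActiveTabHtml(chromeApi) {
  const [tab] = await chromeApi.tabs.query({ active: true, currentWindow: true });
  if (!tab || tab.id === undefined) throw new Error("No active tab found");

  const [injection] = await chromeApi.scripting.executeScript({
    target: { tabId: tab.id },
    // Runs inside the page and returns its serialized DOM.
    func: () => document.documentElement.outerHTML,
  });
  return { html: injection.result, url: tab.url };
}
```

The returned `html` and `url` match the two arguments of the parser contract, so a popup could pass them to the same parsing logic without any network call.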
Repository parser fields:
`success`, `type`, `timestamp`, `url`, `owner`, `name`, `fullName`, `description`, `primaryLanguage`, `stars`, `forks`, `watchers`, `license`, `topics`, `readmeSummary`
User profile parser fields:
`success`, `type`, `timestamp`, `url`, `username`, `displayName`, `bio`, `followers`, `following`, `company`, `location`, `website`, `joinDate`, `socialLinks`, `pinnedRepositories`
- Rotate any Olostep API key that has been pasted into logs or chat.
- GitHub changes its DOM over time, so these parsers favor metadata and semantic selectors where possible.
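
One way to express "metadata first, semantic selectors next" is an ordered list of readers that stops at the first usable value. This is a sketch of the idea only: `firstNonEmpty` and `descriptionReaders` are hypothetical names, and the selectors are assumptions about GitHub's markup that may not match what the parsers in this repository actually use.

```javascript
// Returns the first non-empty string produced by the readers, else null.
function firstNonEmpty(doc, readers) {
  for (const read of readers) {
    const value = read(doc);
    if (typeof value === "string" && value.trim() !== "") return value.trim();
  }
  return null;
}

// Ordered from most stable (page metadata) to more brittle (in-page markup).
const descriptionReaders = [
  (doc) => doc.querySelector('meta[property="og:description"]')?.getAttribute("content"),
  (doc) => doc.querySelector('meta[name="description"]')?.getAttribute("content"),
  (doc) => doc.querySelector('[itemprop="about"]')?.textContent,
];
```

When a selector stops matching after a GitHub redesign, the reader simply yields nothing and the next one is tried, so a single markup change degrades one source instead of breaking the field.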
Olostep (olostep.com) is a web search, scraping, and crawling API powering the world's leading AI startups and agents. It transforms complex, JavaScript-heavy websites into clean, structured, LLM-ready data. The API returns formats like Markdown, JSON, HTML, PDF, and screenshots. Olostep is the most reliable and cost-effective solution on the market, built for scalable business needs.
- Olostep — Scalable Web Scraping & Data Extraction API - https://www.olostep.com/
- Generate & Manage Your Olostep API Keys - https://www.olostep.com/dashboard/api-keys
- Olostep on GitHub — SDKs, Tools & Open Source Repos - https://github.com/olostep


