Downloads files from websites using openbrowser-ai while preserving browser sessions and cookies; handles PDFs, CSVs, and images, and extracts text from PDFs with pypdf. Use for authenticated file fetches and saves.
Install via:

npx claudepluginhub billy-enrizky/openbrowser-ai --plugin openbrowser

This skill is limited to using the following tools:
Download files from websites using the browser's authenticated session. Handles PDFs, CSVs, images, and any downloadable content. Preserves cookies and login sessions for authenticated downloads.
All code runs via openbrowser-ai -c. The daemon starts automatically and persists variables across calls. All browser functions are async -- use await.
Before running, verify openbrowser-ai is installed:
openbrowser-ai --help
If not found, install:
# macOS/Linux
curl -fsSL https://raw.githubusercontent.com/billy-enrizky/openbrowser-ai/main/install.sh | sh
# Windows (PowerShell)
irm https://raw.githubusercontent.com/billy-enrizky/openbrowser-ai/main/install.ps1 | iex
openbrowser-ai -c - <<'EOF'
await navigate("https://example.com/reports")
# Get browser state to find clickable download links
state = await browser.get_browser_state_summary()
for idx, el in state.dom_state.selector_map.items():
    text = el.get_all_children_text(max_depth=1)
    if "download" in text.lower() or "pdf" in text.lower() or "export" in text.lower():
        print(f"[{idx}] {el.tag_name}: {text}")
EOF
Use download_file() to download directly. This uses the browser's JavaScript fetch internally, preserving cookies and authentication:
openbrowser-ai -c - <<'EOF'
path = await download_file("https://example.com/reports/annual-report.pdf")
print(f"Saved to: {path}")
EOF
With a custom filename:
openbrowser-ai -c - <<'EOF'
path = await download_file(
    "https://example.com/api/export?format=csv",
    filename="sales-data.csv"
)
print(f"Saved to: {path}")
EOF
When the download URL is not directly visible, extract it from a link or button:
openbrowser-ai -c - <<'EOF'
# Extract href from a download link
download_url = await evaluate("""
    (function () {
        const link = document.querySelector('a[href$=".pdf"]');
        return link ? link.href : null;
    })()
""")
if download_url:
    path = await download_file(download_url)
    print(f"Downloaded: {path}")
else:
    print("No PDF link found")
EOF
After downloading, use pypdf to extract text (requires pip install openbrowser-ai[pdf]):
openbrowser-ai -c - <<'EOF'
from pypdf import PdfReader
# 'path' persists from the earlier download call (the daemon keeps state)
reader = PdfReader(path)
print(f"Pages: {len(reader.pages)}")
# Extract text from all pages
for i, page in enumerate(reader.pages):
    text = page.extract_text()
    print(f"--- Page {i+1} ---")
    print(text[:500])
EOF
For other file types, check the extension and parse accordingly:
openbrowser-ai -c - <<'EOF'
from pathlib import Path
file_path = Path(path)
# CSV
if file_path.suffix == ".csv":
    import pandas as pd
    df = pd.read_csv(file_path)
    print(df.to_string())
# JSON
elif file_path.suffix == ".json":
    import json
    data = json.loads(file_path.read_text())
    print(json.dumps(data, indent=2))
# Plain text
elif file_path.suffix in (".txt", ".md", ".log"):
    print(file_path.read_text())
EOF
To download several files, loop over the URLs:
openbrowser-ai -c - <<'EOF'
urls = [
    "https://example.com/report-q1.pdf",
    "https://example.com/report-q2.pdf",
    "https://example.com/report-q3.pdf",
]
paths = []
for url in urls:
    path = await download_file(url)
    paths.append(path)
    print(f"Downloaded: {path}")
print(f"Total files: {len(paths)}")
EOF
To see what has been saved so far, use list_downloads():
openbrowser-ai -c - <<'EOF'
files = list_downloads()
for f in files:
    print(f)
print(f"Total: {len(files)} files")
EOF
download_file() preserves the browser's login session. Log in first, then download:
openbrowser-ai -c - <<'EOF'
# Navigate and log in (the *_index values are element indices found
# via get_browser_state_summary(), as shown earlier)
await navigate("https://portal.example.com/login")
await input_text(username_index, "user@example.com")
await input_text(password_index, "password")
await click(login_button_index)
await wait(2)
# Now download an authenticated resource
path = await download_file("https://portal.example.com/api/reports/confidential.pdf")
print(f"Downloaded: {path}")
EOF
Tips:
- Code is passed via heredoc (openbrowser-ai -c - <<'EOF'), so all Python syntax works without shell escaping issues.
- Use download_file(url) instead of navigate(url) for files. navigate() opens PDFs in the browser viewer but does not save them.
- download_file() preserves cookies and authentication -- no need to re-authenticate.
- Duplicate filenames get a (N) suffix (e.g., report (1).pdf).
- Use list_downloads() to see all files saved in the downloads directory.
- download_file() has a 120-second timeout.
- download_file() falls back to requests if the browser fetch fails (e.g., CORS restrictions), but without browser cookies.
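The "(N) suffix" behavior can be sketched in plain Python. This is an illustration of the naming scheme described above, not openbrowser-ai's actual implementation; unique_path() is a hypothetical helper:

```python
import tempfile
from pathlib import Path

def unique_path(directory: Path, filename: str) -> Path:
    """Return a non-colliding path, appending ' (N)' before the
    extension when a file with that name already exists."""
    candidate = directory / filename
    stem, suffix = candidate.stem, candidate.suffix
    n = 1
    while candidate.exists():
        candidate = directory / f"{stem} ({n}){suffix}"
        n += 1
    return candidate

# Simulate downloading report.pdf three times into one directory
tmp = Path(tempfile.mkdtemp())
for _ in range(3):
    unique_path(tmp, "report.pdf").touch()
names = sorted(p.name for p in tmp.iterdir())
print(names)  # -> ['report (1).pdf', 'report (2).pdf', 'report.pdf']
```

The check-then-increment loop means the first download keeps the original name and only later collisions get numbered, matching the behavior described in the tips.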