Searches UK/France job platforms, matches listings to candidate profile, checks visa sponsorship and commute costs, ranks results by viability, exports to Excel.
Install: npx claudepluginhub debytesio/claude-plugin-jobhunter
Search job platforms (UK and France), match listings to the candidate's profile, check visa sponsorship (UKVI for UK, Talent Passport for France), calculate financial viability including commute costs, and export ranked results to an Excel workbook (.xlsx).
Powered by DEB Cloud — scraping, reference data, and job caching are provided by the DEB Cloud MCP server (mcp__deb-jobhunter__* tools).
The country field in the expectations JSON drives platform selection, tax/social calculation, visa checking, and currency formatting. Supported countries: gb (default), fr.
Output language: All user-facing output (progress messages, reports, summaries, Excel headers) MUST be in the language of the country field. For fr: write in French. For gb: write in English. If the user writes in a specific language, respond in that language regardless of country.
CRITICAL — Follow ALL steps in order. DO NOT skip steps or improvise.
The pipeline has 9 mandatory steps: Step 0 → 1 → 2 → 3 → 3.5 → 4 → 5 → 5.5 → 6.
Each step depends on the previous step's output. Skipping enrichment (Step 4) degrades scoring. Skipping company checks (Step 5.5) loses ratings and visa data. Skipping commute (Step 6) loses financial viability.
Do NOT write custom Python to replace process_jobs.py stages — always use the script as documented.
Do NOT use scrape_jobs (legacy) — always use launch_scrape_jobs (async pipeline).
CRITICAL: MCP tool responses (especially scrape_jobs and scrape_url) return large payloads that saturate the context window. You MUST save them to disk immediately and reference the file path instead of keeping the content in context.
Temp directory: Create a mcp_jobhunter subdirectory under the system temp folder. Detect the OS temp path at runtime:
- Unix/macOS: /tmp/mcp_jobhunter/
- Windows: %TEMP%\mcp_jobhunter\ (e.g., C:\Users\{user}\AppData\Local\Temp\mcp_jobhunter\)

Use Bash to create the directory if it doesn't exist: mkdir -p /tmp/mcp_jobhunter (Unix) or the Windows equivalent.
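A minimal sketch of that detection (standard library only; the subdirectory name is the one fixed above):

import os
import tempfile

# tempfile.gettempdir() resolves /tmp on Unix and %TEMP% on Windows,
# so one code path covers both layouts listed above.
temp_dir = os.path.join(tempfile.gettempdir(), "mcp_jobhunter")
os.makedirs(temp_dir, exist_ok=True)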
Pattern for every MCP tool call that returns content (scrape_jobs, scrape_url, get_reputation, get_commute_results):
1. Use the Write tool to save the full response to a JSON file in the temp directory.
2. Use the Read tool to load it back from the file when needed.

File naming convention (inside mcp_jobhunter/):
scrape_{platform}_{city}_{role_slug}_{YYYYMMDD_HHMMSS}.json
detail_batch{N}_{YYYYMMDD_HHMMSS}.json
reputation_batch{N}_{YYYYMMDD_HHMMSS}.json
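A sketch of generating names that follow this convention (the lowercase/underscore slug rule is an assumption; only the overall pattern is fixed above):

from datetime import datetime

def scrape_filename(platform: str, city: str, role: str) -> str:
    # scrape_{platform}_{city}_{role_slug}_{YYYYMMDD_HHMMSS}.json
    slug = role.lower().replace(" ", "_")
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    return f"scrape_{platform}_{city.lower()}_{slug}_{stamp}.json"

For example, scrape_filename("reed", "London", "AI Engineer") yields something like scrape_reed_london_ai_engineer_20260305_143022.json.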
Example flow:
# 1. Call tool
result = scrape_jobs(query="AI Engineer", platforms=["reed"], location="London")
# 2. Save full response to temp file immediately
Write("<temp_dir>/mcp_jobhunter/scrape_reed_london_ai_engineer_20260305_143022.json", json(result))
# 3. Only keep summary in context:
# "reed-london: 12 jobs scraped, saved to <temp_dir>/mcp_jobhunter/scrape_reed_london_ai_engineer_20260305_143022.json"
# 4. Later, when parsing:
# Read("<temp_dir>/mcp_jobhunter/scrape_reed_london_ai_engineer_20260305_143022.json")
This saves large payloads to disk immediately to avoid saturating the context window.
CRITICAL: All intermediate data MUST be saved to disk at each step. This ensures that if the conversation runs out of context, the next invocation can resume from the last checkpoint without re-scraping.
Create a session working directory in the same folder as the expectations JSON:
{expectations_dir}/job-search-{YYYYMMDD_HHMMSS}/
├── state.json # Current progress tracker
├── checkpoint-raw-combined.json # All raw results from pipeline (after Step 3)
├── checkpoint-dedup.json # After deduplication (after Step 3.5)
├── checkpoint-filtered.json # After first-layer filter (after Step 3.5)
├── checkpoint-enriched.json # After JD enrichment (after Step 4)
├── checkpoint-scored.json # After LLM agent scoring (after Step 5)
├── company-checks.json # Company reputation + UKVI + agency (Step 5.5)
├── commute-data.json # Commute costs from DEB Cloud (Step 6)
├── checkpoint-final.json # After financial calc + visa check (after Step 6)
└── jobs-{YYYYMMDD_HHMMSS}.xlsx # Final Excel output (after Step 6)
The state tracker (state.json) records progress so interrupted sessions can resume:
{
"session_id": "20260211_143022",
"expectations_path": "path/to/expectations.json",
"working_dir": "path/to/job-search-20260211_143022/",
"deb_cloud_key_valid": true,
"current_step": 3,
"step_3_batch_id": null,
"step_3_complete": false,
"step_3_5_complete": false,
"step_4_batch_id": null,
"step_4_complete": false,
"step_5_complete": false,
"step_5_5_complete": false,
"step_6_complete": false,
"started_at": "2026-02-11T14:30:22",
"updated_at": "2026-02-11T14:45:12"
}
At the start of the workflow, check whether a working directory already exists for the same expectations file:
- Look for job-search-* directories in the expectations folder.
- If one contains a state.json, read it to determine where to resume.
- If state.json records a pending batch_id: poll for results, then continue from Step 3.5.
- If checkpoint-raw-combined.json already exists, continue to Step 3.5.

Autonomy: proceed through all steps without asking for user confirmation between steps. Only pause to ask the user if: (1) the DEB Cloud key is missing, (2) an error requires a decision, or (3) the search matrix exceeds 100 queries. All file writes, script executions, and MCP calls should proceed automatically.
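A sketch of that check, assuming state.json stores expectations_path exactly as in the schema above:

import glob
import json
import os

def find_resumable_session(expectations_path: str):
    """Return the newest matching state.json next to the expectations file, or None."""
    exp_dir = os.path.dirname(os.path.abspath(expectations_path))
    for d in sorted(glob.glob(os.path.join(exp_dir, "job-search-*")), reverse=True):
        state_path = os.path.join(d, "state.json")
        if not os.path.isfile(state_path):
            continue
        with open(state_path, encoding="utf-8") as f:
            state = json.load(f)
        # Naive path comparison; normalize paths in practice.
        if state.get("expectations_path") == expectations_path:
            return state  # resume from state["current_step"]
    return None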
Execute these steps in order. Save intermediate data at each step. Log progress to the user after each major step. After completing each step, update state.json with the step's completion flag and updated_at timestamp.
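One way to do that bookkeeping (illustrative helper; the field names match the state.json schema above):

import json
from datetime import datetime

def mark_step_complete(state_path: str, flag: str, next_step=None) -> None:
    # e.g. mark_step_complete(state_path, "step_3_5_complete", next_step=4)
    with open(state_path, encoding="utf-8") as f:
        state = json.load(f)
    state[flag] = True
    if next_step is not None:
        state["current_step"] = next_step
    state["updated_at"] = datetime.now().isoformat(timespec="seconds")
    with open(state_path, "w", encoding="utf-8") as f:
        json.dump(state, f, indent=2)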
Step 0 (validate the DEB Cloud key): check for the DEB_CLOUD_API_KEY environment variable, then call mcp__deb-jobhunter__ping to validate it:
- On success: set deb_cloud_key_valid: true in state. Note the plan and available platforms.
- On failure: set deb_cloud_key_valid: false and continue in degraded mode.

Step 1 (load expectations and config):
- Read the expectations JSON: candidate, target_roles, locations, current_situation.
- Read country from the expectations JSON (default: "gb" if absent).
- Read sector from the expectations JSON (default: "industry" if absent). Valid values: "industry", "academia".
- Load the shared config ${PLUGIN_ROOT}/config/job-hunter.ini.
- Load ${PLUGIN_ROOT}/config/country-{country}.ini (overlays the shared config).
- If sector=academia: load the [platforms_academia] and [platform_urls_academia] sections instead of [platforms]. Also load [academic_salary_grades] for salary parsing.

a. If candidate.resume_path is provided:
- Supported extensions: .tex, .pdf, .md, .txt, .docx, .doc.
- For .tex, .pdf, .md, .txt: read directly with the Read tool (plain text or native PDF support).
- For .docx or .doc: run the extraction utility first:
python "${PLUGIN_ROOT}/scripts/process_jobs.py" \
--stage extract-resume \
--resume "{resume_path}" \
--output-dir "{working_dir}"
Then read the resulting -extracted.txt file with the Read tool.
b. If candidate.profile_path is also provided AND the file exists: read it as supplementary context (backward compatible).
c. If neither path is provided or files don't exist: WARN the user and proceed with skills-only matching (from candidate.skills or INI [candidate_skills]).
d. Store the resolved readable path in state.json as "candidate_document_path".
- Read candidate.skills from the expectations JSON (falling back to INI [candidate_skills] for backward compat).
- Update state.json.

Step 2 (build the search matrix): for each target role, get the search_keywords array (use the first keyword as primary).
Translate keywords if needed — search keywords MUST be in the language of the target country. If the user provided English keywords for a non-English country (e.g. "Data Engineer" for FR), translate them (e.g. "Ingénieur Data"). Include both the translated and original as separate keywords for broader coverage. Common FR translations: internship=Stage, engineer=Ingénieur, developer=Développeur, senior=Senior (same), manager=Responsable.
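A sketch of the expansion, keeping both variants (the dictionary entries here are illustrative; build the real lookup from the translations above or translate on the fly):

# Whole-phrase EN -> FR lookups; extend as needed.
FR_KEYWORDS = {
    "Data Engineer": "Ingénieur Data",
    "Software Developer": "Développeur Logiciel",
}

def expand_keywords(keywords: list[str], country: str) -> list[str]:
    """Keep each original keyword and append its translation when known."""
    if country != "fr":
        return list(keywords)
    out = list(keywords)
    out += [FR_KEYWORDS[k] for k in keywords
            if k in FR_KEYWORDS and FR_KEYWORDS[k] not in out]
    return out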
Combine all roles with all cities from both P1 and P2 groups.
Select platforms — read [platforms] from the country INI first, then fall back to shared INI:
- default: priority-ordered platform list (used for paid plans).
- free: reduced platform list (used for the free plan).
- If plan=free (from the ping response): use the free list and only the first keyword per role (free_max_keywords=1). The user can request more platforms, but warn them about credit impact and show the estimate before proceeding.
- Otherwise: use the default list. The user can request additional platforms; show the estimate so they can decide.
- If sector=academia: read [platforms_academia] from the country INI instead.
- Only include platforms that are also enabled (=1) in the country INI [platforms] section.
Build the search matrix as a list of {query, platforms, location, country, min_salary} entries.
Read max_pages_per_platform from [general] in the shared INI (default 2). This value is passed as max_pages to both estimate_credits and launch_scrape_jobs. Users can increase this for broader results, but warn about the credit impact.
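A sketch of the matrix build (the shape of the role objects is an assumption; the entry fields are the documented ones). Remember the autonomy rule: pause for confirmation if the matrix exceeds 100 queries.

def build_search_matrix(roles: list[dict], cities: list[str],
                        platforms: list[str], country: str,
                        min_salary) -> list[dict]:
    # One entry per (keyword, city); every entry carries the same platform list.
    matrix = []
    for role in roles:
        for keyword in role["search_keywords"]:
            for city in cities:
                matrix.append({
                    "query": keyword,
                    "platforms": platforms,
                    "location": city,
                    "country": country,
                    "min_salary": min_salary,
                })
    return matrix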
Credit estimation (Gate 1):
Call mcp__deb-jobhunter__estimate_credits with the search matrix and max_pages (from INI). This returns a breakdown of estimated credits (scraping, enrichment, scoring, reputation, total). Show the user the estimate and confirm before proceeding.
Requires DEB Cloud key. If degraded mode, skip this step and instruct the user to provide raw job data.
Platform codes (match INI platform names — server auto-selects country-specific URLs):
- UK: linkedin, indeed, reed, totaljobs, cwjobs, cvlibrary, adzuna
- Academia: use mcp__deb-jobhunter__scrape_url for jobs.ac.uk and EURAXESS, with the specific URLs from the [platform_urls_academia] INI section
- France: linkedin, indeed, apec, hellowork

Launch scrape workers:
IMPORTANT: Use launch_scrape_jobs (async batch tool), NOT scrape_jobs (legacy single-call tool). The scrape_jobs tool is for ad-hoc single queries only — it blocks, saturates context, and doesn't track progress. The pipeline MUST use the async launch/poll/get pattern.
Call mcp__deb-jobhunter__launch_scrape_jobs with the search matrix, target_roles list, and max_pages (from INI).
The response contains {batch_id, worker_count, estimated_credits}.

Wait for completion:
Call mcp__deb-jobhunter__poll_jobs(batch_id=...) once.
- The call returns when the batch is COMPLETE, PARTIAL, FAILED, or after a timeout.
- The response contains {status, progress: {total, done, failed}, credits_consumed, tasks: [...]}.
- If timed_out=true in the response, call poll_jobs again to continue waiting.

Get results:
Call mcp__deb-jobhunter__get_scrape_results(batch_id=...).
- If the response has a jobs key: use the Write tool to save the jobs array as JSON to {working_dir}/checkpoint-raw-combined.json.
- If it has a download_url key: download to disk via curl:
curl -s --ssl-no-revoke -o "{working_dir}/checkpoint-raw-combined.json" "{download_url}"
- Each listing includes a listing_score (0-100) from the parser, used for filtering.
- Use Write for inline results or curl for downloads.

Report progress: "Scraping complete. {N} total listings from {platforms}. Credits used: {credits}."

Step 3.5 (dedup + first-layer filter): run the dedup stage:
python "${PLUGIN_ROOT}/scripts/process_jobs.py" \
--stage dedup \
--raw "{working_dir}/checkpoint-raw-combined.json" \
--output-dir "{working_dir}"
Saves checkpoint-dedup.json. Then run the first-layer filter:
python "${PLUGIN_ROOT}/scripts/process_jobs.py" \
--stage filter \
--dedup "{working_dir}/checkpoint-dedup.json" \
--expectations "{expectations_path}" \
--filter-threshold 40 \
--output-dir "{working_dir}"
Uses listing_score from the parser + salary floor heuristic. Free — no API calls. Saves checkpoint-filtered.json.
Report: "After dedup: {N} unique. After filter: {M} jobs (dropped {K} low-relevance)."
Step 4 (JD enrichment): requires the DEB Cloud key. Read [enrichment] from the INI config. If enabled=0, skip to Step 5.
- If plan=free: use free_max_enrich from the INI [enrichment] section (default 250).
- Pass the resulting limit to the script as --max-enrich.

Run the enrich-prep stage:
python "${PLUGIN_ROOT}/scripts/process_jobs.py" \
--stage enrich-prep \
--filtered "{working_dir}/checkpoint-filtered.json" \
--max-enrich {N} \
--output-dir "{working_dir}"
Outputs enrich-payload.json — JSON array of UUIDs for top N jobs by listing_score.
Launch enrichment in one of two ways:
- Small payloads: call mcp__deb-jobhunter__launch_enrich_jobs(uuids=[...]) directly.
- Large payloads: use the upload flow:
a. Get the payload size in bytes: wc -c < "{working_dir}/enrich-payload.json"
b. Call mcp__deb-jobhunter__init_enrichment(uuid_count=N, file_size=BYTES)
c. Upload the file:
curl -s --ssl-no-revoke -X PUT -H "Content-Type: application/json" \
-H "Content-Length: BYTES" \
--data-binary @"{working_dir}/enrich-payload.json" "{upload_url}"
d. Call mcp__deb-jobhunter__launch_enrich_jobs(batch_id="...")
- The launch response contains {batch_id, worker_count, jobs_to_enrich}.
- Wait for completion: mcp__deb-jobhunter__poll_jobs(batch_id=...) — single call, the server streams progress.
- Get results: mcp__deb-jobhunter__get_enrich_results(batch_id=...).
- If the response has a descriptions key: JD data is returned inline. Use Write to save it to {working_dir}/checkpoint-enrich-raw.json.
- If it has a download_url key: download via curl:
curl -s --ssl-no-revoke -o "{working_dir}/checkpoint-enrich-raw.json" "{download_url}"
- Each enriched JD includes: responsibilities, requirements_hard, requirements_soft, tech_stack, seniority_signals, yoe_required, education.
- Mark enriched jobs with jd_fetched: true, jd_status: "enriched".
- Mark failures with jd_fetched: false, jd_status: "unavailable" or "failed".
- Merge the JD data into the filtered jobs and save checkpoint-enriched.json (a merge sketch follows below).

Report: "Enriched {N} of {M} jobs. {unavailable} JDs unavailable. Credits used: {credits}."
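A sketch of the merge shape, assuming jobs and JDs share a uuid key (if a process_jobs.py stage covers this merge, prefer the script per the rule above):

import json

def merge_enrichment(filtered_path: str, enrich_raw_path: str, out_path: str) -> None:
    with open(filtered_path, encoding="utf-8") as f:
        jobs = json.load(f)
    with open(enrich_raw_path, encoding="utf-8") as f:
        jds = {d["uuid"]: d for d in json.load(f)}
    for job in jobs:
        jd = jds.get(job.get("uuid"))
        if jd:
            job.update(jd)  # responsibilities, tech_stack, ...
            job["jd_fetched"], job["jd_status"] = True, "enriched"
        else:
            job["jd_fetched"], job["jd_status"] = False, "unavailable"
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(jobs, f, ensure_ascii=False, indent=2)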
Step 5 (scoring): read scoring_mode from [scoring] in the INI config.
If scoring_mode = remote (requires DEB Cloud key):
- Profile: the candidate document and skills, or [candidate_skills] from the INI (max 8000 chars — summarize if needed).
- Expectations payload: {target_roles, p1_cities, p2_cities, requires_visa, country, sector}
- Jobs: checkpoint-enriched.json (or checkpoint-filtered.json if enrichment was skipped).
- Small payloads: call mcp__deb-jobhunter__launch_score_jobs(jobs=[...], expectations={...}, profile="...") directly.
- Large payloads: use the upload flow:
a. Get the payload size in bytes: wc -c < checkpoint-enriched.json
b. Call mcp__deb-jobhunter__init_scoring(job_count=N, file_size=BYTES, expectations={...}, profile="...")
c. Upload the file:
curl -s --ssl-no-revoke -X PUT -H "Content-Type: application/json" \
-H "Content-Length: BYTES" \
--data-binary @"{working_dir}/checkpoint-enriched.json" "{upload_url}"
d. Call mcp__deb-jobhunter__launch_score_jobs(batch_id="...")
- Wait for completion: mcp__deb-jobhunter__poll_jobs(batch_id=...).
- Get results: mcp__deb-jobhunter__get_score_results(batch_id=...).
- If the response has a scored_jobs key: use Write to save it to {working_dir}/checkpoint-scored.json.
- If it has a download_url: download via curl:
curl -s --ssl-no-revoke -o "{working_dir}/checkpoint-scored.json" "{download_url}"
- Score fields per job: skill_match, requirements_match, role_match, experience_match, seniority_match, salary_match, location_priority, sponsor_match.
- If requirements_match and experience_match are null, skill_match defaults to 50.
- Discard jobs with match_score < 30.

If scoring_mode = local:
python "${PLUGIN_ROOT}/scripts/process_jobs.py" \
--stage all \
--raw "{working_dir}/checkpoint-raw-combined.json" \
--expectations "{expectations_path}" \
--config "${PLUGIN_ROOT}/config/job-hunter.ini" \
--output-dir "{working_dir}"
- Produces checkpoint-scored.json and the Excel workbook in a single pass (--stage all).

Update state.json: Set step_5_complete: true, current_step: 5.5, updated_at: <now>.
Report: "Scored {N} jobs. {kept} with match_score >= 30, {removed} discarded."
Step 5.5 (company checks): enrich the scored jobs with employee review ratings AND visa sponsor status in one step.
Requires DEB Cloud key. If degraded mode, skip this step.
Launch company checks using the scoring batch ID:
- Call mcp__deb-jobhunter__launch_company_checks(score_batch_id="...").
- The response contains {batch_id, worker_count, companies_count}.

Wait for completion using mcp__deb-jobhunter__poll_jobs(batch_id=...).
Get results via mcp__deb-jobhunter__get_company_check_results(batch_id=...).
- If the response has a companies key: save it to {working_dir}/company-checks.json.
- If it has a download_url: download via curl.
- Each entry contains: company, rating, review_count, source, reputation_status, is_sponsor, sponsor_route.

Merge into the scored jobs via process_jobs.py:
python "${PLUGIN_ROOT}/scripts/process_jobs.py" \
--stage reputation-merge \
--scored "{working_dir}/checkpoint-scored.json" \
--reputation-data "{working_dir}/company-checks.json" \
--output-dir "{working_dir}"
Adds company_rating, rating_reviews, rating_source, is_sponsor, sponsor_route to each job.
Update state.json: Set step_5_5_complete: true, current_step: 6, updated_at: <now>.
Report: "Company checks complete. {rated} companies rated, {sponsors} sponsors found."
Step 6 (commute costs + financial viability + Excel): before running the processing script, fetch commute cost data:
Commute cost data (dynamic, replaces INI commute tables):
- Collect the destination cities from the checkpoint-scored.json location field.
- Use candidate.home_city as the origin.
- Call mcp__deb-jobhunter__launch_commute_jobs(origin, destinations, country) with the normalized list.
- Call mcp__deb-jobhunter__poll_jobs(batch_id) until complete.
- Call mcp__deb-jobhunter__get_commute_results(batch_id) to retrieve the results.
- If the response has a download_url, download and save it; otherwise save the routes array directly.
- Save to {working_dir}/commute-data.json.
- Pass it via the --commute-data flag to process_jobs.py (overrides the INI commute tables).

Then run the excel stage of the processing script. checkpoint-scored.json MUST be a flat JSON array of jobs (not wrapped in scored_jobs/stats). Each job must have ALL fields from dedup plus the score fields from scoring. If it is a dict with a scored_jobs key, the merge step (Step 5f) was skipped — go back and fix it.
python "${PLUGIN_ROOT}/scripts/process_jobs.py" \
--stage excel \
--scored "{working_dir}/checkpoint-scored.json" \
--expectations "{expectations_path}" \
--config "${PLUGIN_ROOT}/config/job-hunter.ini" \
--commute-data "{working_dir}/commute-data.json" \
--output-dir "{working_dir}"
If DEB Cloud is unavailable (degraded mode), omit --commute-data flag. The script will fall back to INI commute tables.
The script reads the LLM-scored checkpoint and:
- Drops jobs with match_score < 30.
- Computes composite_score = match_score × 0.60 + financial_score × 0.40 (a minimal sketch follows below).

The Excel workbook contains the ranked results; the final summary reports P1, P2, and All counts.
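The composite formula as code (the rounding is an assumption; the weights are the documented ones):

def composite_score(match_score: float, financial_score: float) -> float:
    # 60% candidate match, 40% financial viability.
    return round(match_score * 0.60 + financial_score * 0.40, 1)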
Update state.json: Set step_6_complete: true, updated_at: <now>.
Report to user:
Read terminal_mode from [output] in the INI config. If pipeline (default), follow the exact formatting below. If standard, skip this section entirely and use your normal output style.
When terminal_mode = pipeline: Follow this exact output format to give users a polished pipeline experience. Print each block as markdown text between tool calls. Use Unicode block characters for progress bars and emoji for status icons.
Progress bar helper — use this pattern for all bars:
filled = "█" * int(pct / 5) # 20 chars = 100%
empty = "░" * (20 - len(filled))
bar = f"[{filled}{empty}]"
🔍 Search matrix: {roles} roles × {cities} cities × {platforms} platforms = {workers} parallel workers
🔥 Estimated cost: ~{total} credits | Quota: {remaining} remaining ✅
Where remaining = ping.monthly_credits - ping.credits_used. Evaluate the conditions in order: show ❌ if remaining < total, ⚠️ if remaining < total × 1.2, otherwise ✅.
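The same logic as a helper (sketch; evaluating the strictest condition first resolves the overlap):

def quota_icon(remaining: int, total: int) -> str:
    if remaining < total:
        return "❌"  # not enough credits
    if remaining < total * 1.2:
        return "⚠️"  # under 20% headroom
    return "✅"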
Print before calling poll_jobs(batch_id):
🔄 Scraping [{bar}] {pct}% | {done}/{total} workers
{platform} / {location} {icon} {result_count} jobs {credits} cr
{platform} / {location} {icon} {result_count} jobs {credits} cr
Credits consumed: {credits_consumed}
The server streams progress via MCP notifications — the progress bar updates as workers complete.
When poll_jobs returns, render the final state with all tasks showing ✅/❌.
Where:
- pct = int(done / total * 100)
- Per-task rows come from poll_jobs.tasks[] (the result_count and credits fields).

After Step 3.5 completes, print:
🔍 Scraping complete in {elapsed} → {raw_total} jobs raw | dedup: {dedup_count} | filter: {filter_count}
Where elapsed = time since launch_scrape_jobs call (format: 42s, 1m 23s, 2m 05s).
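A sketch matching those formats (zero-padded seconds once minutes appear):

def fmt_elapsed(seconds: float) -> str:
    s = int(seconds)
    return f"{s}s" if s < 60 else f"{s // 60}m {s % 60:02d}s"

# fmt_elapsed(42) -> "42s"; fmt_elapsed(83) -> "1m 23s"; fmt_elapsed(125) -> "2m 05s"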
Print before calling poll_jobs(batch_id):
🔬 Enriching [{bar}] {pct}% | {done}/{total} workers
Worker 1 {icon} {result_count} JDs {credits} cr
Worker 2 {icon} {result_count} JDs {credits} cr
Worker 3 {icon} {result_count} JDs {credits} cr
Credits consumed: {credits_consumed}
When poll_jobs returns, render the final state with all workers showing ✅/❌:
🔬 Enriching [████████████████████] 100% | {done}/{total} workers
Worker 1 ✅ {result_count} JDs {credits} cr
Worker 2 ✅ {result_count} JDs {credits} cr
Worker 3 ✅ {result_count} JDs {credits} cr
Credits consumed: {credits_consumed}
🔬 Enrichment complete in {elapsed} → {enriched}/{total} JDs fetched ({unavailable} unavailable)
Print before calling poll_jobs(batch_id):
📊 Scoring [{bar}] {pct}% | {done}/{total} workers
Batch 1 {icon} {result_count} jobs {credits} cr
Batch 2 {icon} {result_count} jobs {credits} cr
Credits consumed: {credits_consumed}
When poll_jobs returns, render final state:
📊 Scoring [████████████████████] 100% | {done}/{total} workers
Batch 1 ✅ {result_count} jobs {credits} cr
Batch 2 ✅ {result_count} jobs {credits} cr
Credits consumed: {credits_consumed}
📊 Scoring complete in {elapsed} → {scored}/{total} jobs scored
Print before calling poll_jobs(batch_id):
🏢 Company checks [{bar}] {pct}% | {done}/{total} workers
Batch 1 {icon} {result_count} companies {credits} cr
Batch 2 {icon} {result_count} companies {credits} cr
Credits consumed: {credits_consumed}
When poll_jobs returns, render final state:
🏢 Company checks [████████████████████] 100% | {done}/{total} workers
Batch 1 ✅ {result_count} companies {credits} cr
Batch 2 ✅ {result_count} companies {credits} cr
Credits consumed: {credits_consumed}
🏢 Company checks complete in {elapsed} → {rated} rated, {sponsors} sponsors, {agencies} agencies
Print before calling poll_jobs(batch_id):
🚗 Commute [{bar}] {pct}% | {done}/{total} workers
Batch 1 {icon} {result_count} cities {credits} cr
Batch 2 {icon} {result_count} cities {credits} cr
Credits consumed: {credits_consumed}
When poll_jobs returns, render final state:
🚗 Commute [████████████████████] 100% | {done}/{total} workers
Batch 1 ✅ {result_count} cities {credits} cr
Batch 2 ✅ {result_count} cities {credits} cr
Credits consumed: {credits_consumed}
🚗 Commute complete in {elapsed} → {found} routes found
✅ Done in {total_elapsed} | Total: {total_credits} credits | Remaining: {remaining}
📊 {excel_full_path}
P1: {p1_count} jobs | P2: {p2_count} jobs | All: {all_count} jobs
🏆 Top 5:
1. {title} @ {company} — score {composite_score} ({salary_text}, {location})
2. {title} @ {company} — score {composite_score} ({salary_text}, {location})
3. {title} @ {company} — score {composite_score} ({salary_text}, {location})
4. {title} @ {company} — score {composite_score} ({salary_text}, {location})
5. {title} @ {company} — score {composite_score} ({salary_text}, {location})
Where:
- total_elapsed: wall-clock time from session start
- total_credits: sum of all credits charged across all tools
- remaining: updated quota after all charges
- excel_full_path: full absolute path to the Excel file (from process_jobs.py output)
- Top 5: the five highest composite_score jobs from the results, with salary and location

Notes:
- Always pass encoding='utf-8' when opening JSON files with Python (open(path, encoding='utf-8')). The default cp1252 encoding will fail on special characters in job data.
- When chaining a curl download with Python verification in Bash, use separate commands.
- Respect request_delay_seconds between requests to the same platform.
- Commute data comes from the launch_commute_jobs / get_commute_results MCP tools. If unavailable, the script falls back to the INI config values.

Reference files:
- references/scraping-strategy.md — Platform scraping instructions and parsing rules
- references/matching-algorithm.md — Scoring overview (local vs remote modes)
- references/csv-output-spec.md — Output column definitions and Excel formatting
- ${PLUGIN_ROOT}/config/job-hunter.ini — Shared configuration (general settings)
- ${PLUGIN_ROOT}/config/country-gb.ini — UK-specific: tax, NI, platforms, visa, commute
- ${PLUGIN_ROOT}/config/country-fr.ini — FR-specific: tax, social, platforms, visa, commute
- ${PLUGIN_ROOT}/scripts/process_jobs.py — Processing script (dedup, filter, financial, Excel export)
- ${PLUGIN_ROOT}/examples/job-expectations-example-gb.json — UK example input schema
- ${PLUGIN_ROOT}/examples/job-expectations-example-fr.json — French example input schema