Use when the user wants to conduct open source intelligence research — investigating people, locations, domains, images, infrastructure, vehicles, or digital artifacts using publicly available information. Triggers on: "OSINT", "investigate", "who is this person", "where was this photo taken", "find information about", "trace this", "identify this location", "lookup this domain", "reverse image search", "find this username", "social media footprint", "geolocate", "WHOIS", "DNS lookup", "EXIF data", "metadata analysis", "digital forensics", "blockchain trace", "IP lookup", "certificate transparency". Also triggers when the user provides an image and asks where it was taken, or provides a username and asks what accounts exist, or provides a domain and asks about its infrastructure.
npx claudepluginhub lawriec/claude-osint-plugin --plugin osint

This skill uses the workspace's default tool permissions.
OSINT is the discipline of collecting, processing, and analyzing publicly available information to produce actionable intelligence. The power of OSINT lies not in any single data source but in cross-referencing — a username leads to an email, which leads to a domain, which leads to an IP, which leads to a physical location. Each connection strengthens the picture.
Bundled reference files and scripts:
- references/document-analysis.md
- references/domain-infrastructure.md
- references/geolocation.md
- references/google-dorking-cheatsheet.md
- references/image-video-forensics.md
- references/investigation-setup.md
- references/knowledge-graph.md
- references/open-apis.md
- references/opsec-ethics.md
- references/osint-cycle.md
- references/people-social-media.md
- references/platform-directory.md
- references/reporting.md
- references/tool-guide.md
- references/vehicle-object-id.md
- scripts/analyze_email_headers.py
- scripts/check_username.py
- scripts/discover_reddit_threads.py
- scripts/extract_exif.py
- scripts/loop-analyze-community.md
Every investigation follows the intelligence cycle: define requirements, plan collection, collect data, analyze findings, report results. This is not bureaucracy — it is how you avoid wasting time, missing evidence, and falling into confirmation bias.
Three non-negotiable principles:
All reference files live in references/. Load them on demand — do not read them all upfront.
| Situation | Read |
|---|---|
| Setting up or resuming an investigation | investigation-setup.md |
| Understanding the OSINT intelligence cycle | osint-cycle.md |
| Knowledge graph operations | knowledge-graph.md |
| Choosing tools, handling tool failures | tool-guide.md |
| Legal/ethical questions, OPSEC concerns | opsec-ethics.md |
| Free API endpoints and rate limits | open-apis.md |
| Finding the right platform/tool for a domain | platform-directory.md |
| Geolocation from images/video | geolocation.md |
| Investigating a person or social media | people-social-media.md |
| Domain, IP, DNS, infrastructure recon | domain-infrastructure.md |
| Image/video metadata and forensics | image-video-forensics.md |
| Writing investigation reports | reporting.md |
| Document metadata, email headers | document-analysis.md |
| Vehicle, aircraft, ship identification | vehicle-object-id.md |
| Cryptocurrency and blockchain analysis | crypto-financial.md |
| Radio signals and broadcast identification | radio-signals.md |
This section is not optional. Read it before every investigation.
Hard rules — never break these:
Awareness requirements:
When in doubt: Read opsec-ethics.md for detailed guidance, case studies, and decision frameworks.
Every investigation follows these five steps. Do not skip any of them.
Before collecting a single piece of data, answer these questions:
What does the user need to know? Restate the question in your own words. Vague requests like "find out about this person" need to be narrowed: Are they looking for contact information? Professional background? Online presence? Criminal history? Location history?
What domains does this investigation span? Classify the work into one or more of these categories:
What does success look like? Define concrete deliverables. "A list of all social media accounts associated with this username" is testable. "Everything about this person" is not.
What are the constraints?
Load reference files for each identified domain.
With the requirement defined, plan how to collect the data.
Order of operations matters:
Identify pivot opportunities: Think ahead about how findings might chain together. If you are investigating a username, plan for the possibility that you will find an email, which might lead to domains. Have the tools ready.
Choose your tools:
Based on the domains identified in Step 1, determine which MCP servers, scripts, and APIs you will need. Read tool-guide.md if uncertain about capabilities or availability.
Set up the workspace:
Read investigation-setup.md and scaffold the investigation directory:
investigation-name/
├── search-log.md # Every query and result
├── leads.md # Active leads (HIGH/MEDIUM/LOW)
├── dead-ends.md # What failed and why
├── evidence-chain.md # Source → finding → conclusion
└── report.md # Structured findings
Initialize the knowledge graph with the seed entities (the starting data points the user has provided).
This is where the work happens. The approach depends entirely on the investigation domain(s).
Systematic narrowing from broad to specific:
Use Gemini for image analysis (it handles visual question-answering well). Use sun_position.py for shadow-based time/location estimation. Use SearXNG or Google Lens for reverse image search to find matching locations.
Read geolocation.md for the full methodology, including worked examples.
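The shadow geometry behind that estimation step can be sketched in a few lines: a vertical object's height and shadow length fix the sun's elevation angle, which constrains the possible time and latitude. This is a simplification assuming flat ground and a truly vertical object; sun_position.py's actual interface may differ.

```python
import math

def sun_elevation_from_shadow(object_height_m: float, shadow_length_m: float) -> float:
    """Estimate the sun's elevation angle (degrees) from a vertical object's shadow.

    Assumes flat ground and a perfectly vertical object -- a rough first
    pass, not a substitute for a full solar-position calculation.
    """
    return math.degrees(math.atan2(object_height_m, shadow_length_m))

# A 2 m pole casting a 2 m shadow puts the sun at 45 degrees elevation.
print(round(sun_elevation_from_shadow(2.0, 2.0), 1))  # → 45.0
```

Comparing the computed elevation against the sun's actual position for the claimed time and place is what confirms or refutes a geolocation hypothesis.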
Start with whatever identifiers you have and expand outward:
Use check_username.py to test a username across platforms.

Read people-social-media.md for platform-specific techniques and common pivot patterns.
Passive reconnaissance first, active only with authorization:
- WHOIS via query_whois.py.
- DNS via query_dns.py. Look for SPF, DKIM, DMARC in TXT records.
- Certificate transparency via query_crtsh.py.
- query_shodan_internetdb.py for free enrichment.

Read domain-infrastructure.md for the full methodology.
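As a sketch of the certificate-transparency step: crt.sh's JSON output lists one record per certificate, with newline-separated hostnames in the name_value field, so subdomain discovery reduces to flattening and de-duplicating those names. The sample records below are illustrative, not real query results.

```python
def subdomains_from_crtsh(records: list[dict]) -> list[str]:
    """Flatten crt.sh JSON records into a sorted, de-duplicated hostname list.

    Each crt.sh record carries newline-separated hostnames in "name_value";
    wildcard entries like *.example.com are kept for the analyst to note.
    """
    names = set()
    for record in records:
        for name in record.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name:
                names.add(name)
    return sorted(names)

# Illustrative records shaped like the crt.sh API response:
sample = [
    {"name_value": "example.com\nwww.example.com"},
    {"name_value": "admin.example.com"},
    {"name_value": "www.example.com"},  # duplicate across certificates
]
print(subdomains_from_crtsh(sample))
# → ['admin.example.com', 'example.com', 'www.example.com']
```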
Extract and analyze all available metadata:
- extract_exif.py for camera model, GPS coordinates, timestamps, software used.
- yt-dl for video metadata.

Read image-video-forensics.md for detailed techniques.
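EXIF stores GPS coordinates as degrees/minutes/seconds rationals plus an N/S/E/W reference letter, so a small conversion is usually needed before plotting a point on a map. A minimal sketch, using invented coordinate values for illustration:

```python
def dms_to_decimal(degrees: float, minutes: float, seconds: float, ref: str) -> float:
    """Convert an EXIF GPS degrees/minutes/seconds triple to decimal degrees.

    EXIF latitude/longitude is three rationals plus a reference letter;
    south and west references make the decimal value negative.
    """
    decimal = degrees + minutes / 60 + seconds / 3600
    return -decimal if ref in ("S", "W") else decimal

# 48° 51' 29.6" N, 2° 17' 40.2" E in decimal form, ready for a map lookup.
lat = dms_to_decimal(48, 51, 29.6, "N")
lon = dms_to_decimal(2, 17, 40.2, "E")
print(round(lat, 5), round(lon, 5))
```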
Most real investigations cross domain boundaries. A person investigation might reveal a domain they own, which leads to infrastructure analysis, which reveals other domains, which connect to other people.
Follow the evidence wherever it leads. When you pivot from one domain to another, load the relevant reference file and apply that domain's methodology.
For EVERY collection action, without exception:
- Record the query in search-log.md — what you searched, where, when, and any parameters.
- Promising results become entries in leads.md with a priority (HIGH/MEDIUM/LOW) and a note on what to do next.
- Failed attempts go in dead-ends.md with a note on why they failed.
- Findings that support conclusions go in evidence-chain.md.
- Track entities and relationships with the memory-graph tools.

Raw data is not intelligence. Analysis transforms data into answers.
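The logging discipline above is easy to automate. A minimal sketch of appending one collection action to search-log.md in the template format used by this skill; the helper name is illustrative, not part of the bundled scripts.

```python
from datetime import datetime, timezone
from pathlib import Path

def log_search(log_path: Path, tool: str, query: str, result: str, action: str) -> None:
    """Append one collection action to search-log.md as a timestamped entry."""
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S UTC")
    entry = (
        f"## {stamp}\n"
        f"- **Tool:** {tool}\n"
        f"- **Query:** {query}\n"
        f"- **Result:** {result}\n"
        f"- **Action:** {action}\n\n"
    )
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(entry)

log_search(Path("search-log.md"), "query_whois.py", "example.com",
           "Registered 2023-01-15, nameservers: ns1.cloudflare.com",
           "Added to evidence-chain. Pivot to DNS enumeration.")
```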
Cross-reference findings across sources. A username found on two platforms is interesting. The same username linked to the same email on both platforms is more significant. That email also appearing in WHOIS records for a domain is a strong connection.
Build timelines. When you have temporal data (post timestamps, domain registration dates, WHOIS changes, commit history), arrange it chronologically. Timelines reveal patterns: when was a person active? When did infrastructure change? Do events correlate?
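Timeline construction is mechanical once each finding carries a timestamp and a source. A sketch assuming ISO-8601 date strings; the event data is invented for illustration:

```python
from datetime import date

def build_timeline(events: list[tuple[str, str, str]]) -> list[str]:
    """Sort (iso_date, source, finding) tuples chronologically and render them."""
    ordered = sorted(events, key=lambda e: date.fromisoformat(e[0]))
    return [f"{d}  [{source}]  {finding}" for d, source, finding in ordered]

events = [
    ("2023-01-15", "WHOIS", "Domain registered"),
    ("2018-06-02", "Wayback Machine", "Earliest snapshot of site"),
    ("2022-11-30", "crt.sh", "First certificate issued"),
]
for line in build_timeline(events):
    print(line)
```

Here the sorted view immediately surfaces a discrepancy worth investigating: archived content predates the claimed registration date.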
Assess confidence levels for every finding:
| Level | Definition | Criteria |
|---|---|---|
| Confirmed | Established beyond reasonable doubt | Multiple independent sources agree; direct evidence |
| Probable | More likely true than not | Strong evidence from at least one reliable source |
| Possible | Plausible but not yet corroborated | Some supporting evidence; needs additional sources |
| Speculative | Inference or hypothesis | Based on patterns or a single weak source; flag clearly |
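One way to keep those levels honest is to derive them from corroboration counts rather than a gut call. A deliberately crude sketch; the thresholds are illustrative, not a standard:

```python
def confidence_level(independent_sources: int, direct_evidence: bool) -> str:
    """Map corroboration onto the four confidence levels (illustrative thresholds)."""
    if independent_sources >= 2 and direct_evidence:
        return "Confirmed"
    if independent_sources >= 1 and direct_evidence:
        return "Probable"
    if independent_sources >= 1:
        return "Possible"
    return "Speculative"

print(confidence_level(2, True))   # → Confirmed
print(confidence_level(0, False))  # → Speculative
```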
Look for contradictions. When sources disagree, that is often more informative than when they agree. A profile claiming to be in New York but posting at times consistent with Pacific timezone tells you something. A domain WHOIS showing registration in 2020 but Wayback Machine snapshots from 2018 tells you something.
Document the evidence chain. For every conclusion, trace it back to its sources:
Source: crt.sh query for example.com
Finding: Subdomain admin.example.com exists
Source: Shodan InternetDB for admin.example.com IP
Finding: Port 22 (SSH) and 443 (HTTPS) open
Conclusion: example.com has an administrative interface (PROBABLE)
Update the knowledge graph with relationships between entities and confidence assessments.
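The memory-graph tools maintain this for you, but the underlying structure is simple enough to sketch: entities, typed relationships, and a confidence label on each edge. The API below is a toy for illustration, not the memory-graph interface.

```python
from collections import defaultdict

class KnowledgeGraph:
    """Toy entity-relationship store mirroring what memory-graph tracks."""

    def __init__(self) -> None:
        # subject -> list of (relation, object, confidence) edges
        self.edges: dict[str, list[tuple[str, str, str]]] = defaultdict(list)

    def relate(self, subject: str, relation: str, obj: str, confidence: str) -> None:
        self.edges[subject].append((relation, obj, confidence))

    def neighbors(self, subject: str) -> list[tuple[str, str, str]]:
        return list(self.edges.get(subject, []))

g = KnowledgeGraph()
g.relate("exampleuser", "uses_email", "user@example.com", "Probable")
g.relate("user@example.com", "registered", "example.com", "Confirmed")
print(g.neighbors("exampleuser"))
# → [('uses_email', 'user@example.com', 'Probable')]
```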
Present findings in a structured format that the user can act on.
Organize by confidence level. Lead with confirmed findings, then probable, then possible. Clearly label anything speculative.
Provide evidence provenance. Every claim must cite its source. "The domain was registered on 2023-01-15" needs to say "(source: WHOIS query, 2024-03-20)".
Document negative results. What you searched for and did not find is important context. It tells the reader (and future investigators) what has already been ruled out.
Provide next steps. If the investigation is incomplete, list concrete actions that would advance it. Prioritize by expected value.
Use structured report templates. Read reporting.md for templates appropriate to different investigation types.
The most commonly used tools. For full details, fallback options, and troubleshooting, read tool-guide.md.
| Need | Tool/Script | Notes |
|---|---|---|
| Web search | tavily_search, searxng_search | Tavily for focused queries, SearXNG for broad/image search |
| Image analysis | gemini (ask_question_about_video) | Works for both still images and video; strong at visual QA |
| Reverse image search | selenium (navigate to Google Lens/Yandex) | Automate browser-based reverse search |
| Website screenshots | selenium (take_screenshot) | Document visual evidence of web pages |
| Historical pages | internet-archive, common-crawl | Wayback Machine snapshots, historical crawl data |
| Video metadata | yt-dl (ytdlp_get_video_metadata) | YouTube and many other video platforms |
| DNS lookup | uv run query_dns.py | Full DNS record enumeration (A, AAAA, MX, TXT, NS, SOA, CNAME) |
| WHOIS | uv run query_whois.py | Domain registration data, registrant info |
| Cert transparency | uv run query_crtsh.py | Subdomain discovery via certificate logs |
| IP enrichment | uv run query_shodan_internetdb.py | Open ports, hostnames, known vulns (free, no API key) |
| EXIF extraction | uv run extract_exif.py | GPS coordinates, camera info, timestamps, software |
| Username check | uv run check_username.py | Test username existence across many platforms |
| Sun position | uv run sun_position.py | Calculate sun angle for shadow-based geolocation |
| Knowledge graph | memory-graph tools | Track entities, relationships, and observations |
| Reddit research | reddit tools | Fetch threads, post content from subreddits |
| Fetch web page | fetch | Retrieve raw page content for analysis |
| Video frames | video-reader (extract_frames) | Pull frames from video for image analysis |
Tool selection principles:
- Prefer the bundled scripts (e.g., query_dns.py for DNS). Fall back to web-based tools if the script fails.
- Use the knowledge graph (memory-graph) throughout — it is how you track what you have found and how it connects.

Every investigation gets its own directory. This is not optional — it is how you maintain the discipline that separates useful OSINT from random Googling.
investigation-name/
├── search-log.md # Every query, tool, timestamp, and result
├── leads.md # Active leads ranked HIGH / MEDIUM / LOW
├── dead-ends.md # What was tried and why it failed
├── evidence-chain.md # Source → finding → conclusion chains
└── report.md # Structured findings for the user
## [Timestamp or sequence number]
- **Tool:** query_whois.py
- **Query:** example.com
- **Result:** Registered 2023-01-15, registrant: REDACTED, nameservers: ns1.cloudflare.com, ns2.cloudflare.com
- **Action:** Added to evidence-chain. Pivot to DNS enumeration.
## HIGH
- [ ] Check DNS records for example.com — WHOIS shows Cloudflare, may reveal origin IP via historical DNS
- [ ] Reverse image search the profile photo — unique enough to potentially find other accounts
## MEDIUM
- [ ] Check Wayback Machine for example.com — may reveal content before current owner
## LOW
- [ ] Search for registrant email on other WHOIS records — likely redacted but worth trying
## username_search — Twitter
- **Query:** Searched for @exampleuser on Twitter
- **Result:** Account exists but is private, no useful public data
- **Why dead end:** Cannot extract further information without authentication
- **Logged:** 2024-03-20
Read investigation-setup.md for full templates and workspace initialization procedures.
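Scaffolding the workspace is a one-liner per file. A sketch of initializing the directory layout shown above; the function name is illustrative, and investigation-setup.md defines the canonical procedure.

```python
from pathlib import Path

WORKSPACE_FILES = {
    "search-log.md": "# Search Log\n",
    "leads.md": "# Leads\n\n## HIGH\n\n## MEDIUM\n\n## LOW\n",
    "dead-ends.md": "# Dead Ends\n",
    "evidence-chain.md": "# Evidence Chain\n",
    "report.md": "# Report\n",
}

def scaffold_investigation(name: str, root: Path = Path(".")) -> Path:
    """Create the standard investigation directory with template files."""
    base = root / name
    base.mkdir(parents=True, exist_ok=True)
    for filename, header in WORKSPACE_FILES.items():
        target = base / filename
        if not target.exists():  # never clobber an investigation in progress
            target.write_text(header, encoding="utf-8")
    return base

scaffold_investigation("investigation-example")
```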
Cross-domain pivoting is what makes OSINT powerful. A single piece of data in one domain unlocks findings in another. Always be alert for pivot opportunities.
Username to Location:
Username → check_username.py → profiles on multiple platforms
→ Profile analysis → email address in bio
→ Email → WHOIS search → domain ownership
→ Domain → DNS records → IP address
→ IP → geolocation → approximate physical location
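The first hop of that chain, turning a username into candidate profiles, is what check_username.py automates. The core idea can be sketched as URL templating plus an existence check; the platform patterns below are the public profile-URL formats, and the real script covers far more platforms and handles false positives.

```python
PROFILE_TEMPLATES = {
    "GitHub": "https://github.com/{u}",
    "Reddit": "https://www.reddit.com/user/{u}",
    "Instagram": "https://www.instagram.com/{u}/",
}

def candidate_profiles(username: str) -> dict[str, str]:
    """Build candidate profile URLs for a username across known platforms.

    Existence still has to be verified (HTTP status, page content): a URL
    resolving is a lead, not a confirmation, since platforms differ in how
    they signal missing accounts.
    """
    return {platform: url.format(u=username) for platform, url in PROFILE_TEMPLATES.items()}

for platform, url in candidate_profiles("exampleuser").items():
    print(f"{platform}: {url}")
```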
Image to Identity:
Image → EXIF extraction → GPS coordinates → specific location
→ Reverse image search → original posting → author's profile
→ Profile → username → cross-platform enumeration → network
Domain to Network:
Domain → WHOIS → registrant email/org
→ Reverse WHOIS → other domains by same registrant
→ Shared hosting analysis → related infrastructure
→ Certificate transparency → subdomains → services
Social Post to Verification:
Social media post → claimed location + timestamp
→ Image in post → EXIF check → actual GPS (if present)
→ Shadow analysis → sun_position.py → consistent with claimed time?
→ Background details → reverse image search → matches claimed location?
Photo to Business:
Photo → reverse image search → matching location in Street View
→ Street View → business names visible → business registry lookup
→ Business registry → owner information → connected entities
Every time you pivot from one domain to another:
- Log the pivot in search-log.md — what prompted it, what you expect to find.
- Update evidence-chain.md with the connection between domains.

These are the mistakes that derail investigations. Review this list when you feel stuck or uncertain.
Not logging searches. If you do not log what you searched and what you found, you will repeat work, miss patterns, and be unable to explain your findings. Log everything, including searches that return nothing.
Confirmation bias. The most dangerous failure mode in OSINT. Once you form a hypothesis, you will naturally look for evidence that confirms it and discount evidence that contradicts it. Counteract this by:
Over-reliance on a single source. No single source is authoritative. WHOIS data can be faked. Social media profiles can be impersonated. Timestamps can be manipulated. Always seek corroboration from independent sources.
Ignoring cached and archived data. The current version of a website, profile, or document may not be the most informative version. Always check:
Not considering deliberate deception. Information found online may be intentionally false. Disinformation, sockpuppet accounts, planted evidence — all are possibilities. Assess the reliability of each source independently.
Scope creep.
OSINT investigations can expand infinitely. Every finding opens new avenues. Constantly return to the intelligence requirement defined in Step 1. If a lead is interesting but out of scope, note it in leads.md as LOW priority and move on.
Violating ethical boundaries under pressure. When an investigation is time-sensitive or the user is impatient, the temptation to cut ethical corners grows. Do not. The boundaries defined in the Ethics and OPSEC section are not flexible. If a collection method is questionable, document why you chose not to use it and propose alternatives.
Failing to record negative results. "I searched for X and found nothing" is a finding. It tells future investigators not to repeat that search. It constrains the space of possibilities. It may even be the answer — sometimes the absence of a digital footprint is itself significant.
Not using the knowledge graph.
The knowledge graph is your external memory. If you are not adding entities and relationships as you discover them, you are relying on context window alone, which means you will lose connections in long investigations. Use memory-graph tools consistently.
Treating tools as infallible. Every tool has limitations. Username checkers produce false positives and false negatives. WHOIS data may be outdated. Geolocation databases have error margins. Always note the tool and method used, and treat its output as one data point, not ground truth.