From david-skills
Audits markdown wiki vaults to find unlinked mentions of other pages in body text and suggests Wikipedia-style inline wikilinks, excluding generics and self-links, to improve graph connectivity after imports or growth.
npx claudepluginhub thedavidweng/skills
This skill uses the workspace's default tool permissions.
Find terms in wiki page body text that reference other wiki pages but aren't wikilinked. Goal: Wikipedia-style inline linking where entities are naturally linked when mentioned in flowing text — not backlink sections at the bottom.
# For each wiki page, collect: slug -> {title, aliases, kind}
# Build term_to_slug: lowercase(term) -> slug for all titles + aliases
# Exclude course pages from auto-link targets if user has course indexes
Course pages exclusion: If the wiki has course-type pages (slugs matching faculty-course-number patterns like asia-101, cpsc-110), exclude them from auto-link targets. The user likely wants other pages to link to faculty-level index pages rather than to individual courses. See "Course Index Pages" below.
Process longest terms first: Sort term_to_slug by term length descending. This prevents shorter substrings from being linked before longer complete phrases (e.g., "Jane Smith" should match before "Jane").
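The index-building step above can be sketched roughly as follows. The `PAGES` records, the `build_term_index` helper, and the `COURSE_SLUG` pattern are illustrative assumptions, not part of the skill itself; a real vault would populate the page records from frontmatter.

```python
import re

# Hypothetical page records: slug -> {"title", "aliases", "kind"}.
PAGES = {
    "jane-smith": {"title": "Jane Smith", "aliases": ["Jane"], "kind": "person"},
    "vancouver": {"title": "Vancouver", "aliases": [], "kind": "place"},
    "cpsc-110": {"title": "CPSC 110", "aliases": [], "kind": "topic"},
}

# Assumed shape of a course slug: letters, a hyphen, digits (asia-101, cpsc-110).
COURSE_SLUG = re.compile(r"^[a-z]+-\d+$")


def build_term_index(pages, exclude_courses=True):
    """Return (lowercase term, slug) pairs, longest term first."""
    term_to_slug = {}
    for slug, meta in pages.items():
        if exclude_courses and COURSE_SLUG.match(slug):
            continue  # course pages are not auto-link targets
        for term in [meta["title"], *meta.get("aliases", [])]:
            # setdefault keeps the first slug seen if two pages share a term
            term_to_slug.setdefault(term.lower(), slug)
    # Longest first, so "jane smith" is tried before "jane".
    return sorted(term_to_slug.items(), key=lambda kv: len(kv[0]), reverse=True)


index = build_term_index(PAGES)
```

Sorting once up front means every later matching pass can simply iterate the list in order.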
For each page, extract body (strip frontmatter ---...---) and find paragraphs of flowing text. Exclude:
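- Structural lines (starting with `#`, `|`, `- `, `* `, `>`, `1.`)
- Text already inside `[[wikilinks]]` (check for an unclosed `[[` before the match position)
- `## Sources` section content

Match terms with a `\b` word boundary match, case-insensitive.

Generic terms to exclude (too vague to link):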
history, professional, academic, education, student, production, media, digital,
community, design, content, analysis, technology, science, management, development,
research, network, project, study, learning, writing, creative, art, film, video,
culture, language, introduction, foundation, modern, contemporary, critical, early,
global, information, society, family, environmental, geography, music, economics,
politics, singing, photography, cinema, horror, fiction, visualization, policy,
meteorology, anatomy, physiology, chemistry, program, interactive, storytelling,
audiences, industries, religions, thought, approaches, career, applied, systematic,
methods, networks, crowds, communities, arts, visual, second language, english,
new media, human, cap, seminar
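A minimal sketch of the scanning rules above, assuming the term list is already sorted longest first. `GENERIC_TERMS` holds only a small subset of the list for illustration, and `find_mentions` and `inside_wikilink` are hypothetical helper names.

```python
import re

# A small subset of the generic list above, for illustration only.
GENERIC_TERMS = {"history", "media", "design", "research"}

# Line prefixes for headings, tables, list items, blockquotes, numbered items.
STRUCTURAL_LINE = re.compile(r"^\s*(#|\||- |\* |>|\d+\.)")


def inside_wikilink(line, pos):
    """True if an unclosed [[ precedes pos on this line."""
    return line.rfind("[[", 0, pos) > line.rfind("]]", 0, pos)


def find_mentions(body, terms):
    """Yield (line_no, start, end, slug) for unlinked mentions in prose lines.

    `terms` is a list of (lowercase term, slug) pairs, longest first.
    """
    in_sources = False
    for line_no, line in enumerate(body.splitlines(), start=1):
        if line.startswith("## "):
            in_sources = line.strip() == "## Sources"
        if in_sources or STRUCTURAL_LINE.match(line):
            continue
        taken = []  # spans already claimed by longer terms on this line
        for term, slug in terms:
            if term in GENERIC_TERMS:
                continue
            pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
            for m in pattern.finditer(line):
                overlaps = any(m.start() < e and s < m.end() for s, e in taken)
                if overlaps or inside_wikilink(line, m.start()):
                    continue
                taken.append(m.span())
                yield line_no, m.start(), m.end(), slug
```

Reporting positions rather than rewriting text keeps this pass read-only, so results can be reviewed before any file is changed.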
Matches against specific, unambiguous page names (e.g., `[[wenzhou-no22-high-school]]`) are safe to auto-fix.
Replace the term with `[[target-slug|term]]` in body text. Use the `|display` form to preserve the original text.
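One way the replacement step could look, assuming the paragraph has already been judged eligible prose and `terms` is sorted longest first. `link_terms` is a hypothetical helper; it records match positions first and rewrites afterwards, so a shorter term never claims text already covered by a longer one or by an existing link.

```python
import re


def link_terms(paragraph, terms):
    """Wrap each unlinked mention as [[slug|original text]].

    `terms` is a list of (lowercase term, slug) pairs, longest first.
    """
    spans = []  # (start, end, slug); slug is None for pre-existing links
    # Seed with existing wikilinks so nothing inside them is relinked.
    for m in re.finditer(r"\[\[.*?\]\]", paragraph):
        spans.append((m.start(), m.end(), None))
    for term, slug in terms:
        for m in re.finditer(rf"\b{re.escape(term)}\b", paragraph, re.IGNORECASE):
            if any(m.start() < e and s < m.end() for s, e, _ in spans):
                continue  # overlaps an existing link or a longer match
            spans.append((m.start(), m.end(), slug))
    # Apply right to left so earlier offsets stay valid.
    for start, end, slug in sorted(spans, key=lambda s: s[0], reverse=True):
        if slug is not None:
            original = paragraph[start:end]
            paragraph = f"{paragraph[:start]}[[{slug}|{original}]]{paragraph[end:]}"
    return paragraph
```

Because the display text is sliced from the paragraph itself, the original capitalization is preserved.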
Precautions:
- Longest match wins: when a longer term has matched (e.g., "Jane Smith" for `jane-smith`), skip shorter substrings ("Jane", "Smith") at the same position. After a batch fix, scan for `[[slug|X]] [[slug|Y]]` patterns and merge them to `[[slug|X Y]]`.
- CJK text: a character directly adjacent to `[[` without a space is correct (Chinese does not need spaces).
- Split names: when "Jane" and "Smith" are both terms for `jane-smith`, the script may link them separately, creating `[[jane-smith|Jane]] [[jane-smith|Smith]]`. Fix: when processing, skip shorter terms that are substrings of already-matched longer terms at the same position. Always run a post-fix scan with `r'\[\[([^|]+)\|([^]]+)\]\]\s*\[\[\1\|([^]]+)\]\]'` and merge to `[[slug|X Y]]`. In practice this appeared in 4 files after a 189-file batch fix.
- Table cells: when inserting `[[slug|display]]` into table cells, trailing spaces end up inside brackets (`[[slug |display]]` or `[[slug|display ]]`). These render correctly in Obsidian but are broken links in the graph. After batch linking tables, always scan and fix: `re.sub(r'\[\[([^\]|]+?)\s+(\|[^\]]+?)?\]\]', lambda m: f'[[{m.group(1)}{m.group(2) or ""}]]', content)`. This pattern only strips whitespace directly after the slug; a trailing space after the display text needs its own pass. In one case this caused 109 broken links from a single roster file.
- Unlinking a cell: to strip wikilinks from a table cell and keep only the display text, use `re.sub(r'\[\[([^\]|]+\|)?([^\]]+)\]\]', r'\2', cell)`.
- Skip table rows (`|` lines): tables are structured data, not flowing prose.
- Sandbox output may show `****` masking. The actual file content may be correct; verify with `xxd` or `hex()` before assuming data is corrupted. The sandbox masks DISPLAY only, not file content.

After a wiki grows, tag sprawl is common (30+ unique tags). Standardize to a two-layer system.
Layer 1 (`kind`): map directly to page type: person | project | place | topic | concept | map
Layer 2 (domain): broad domains relevant to the wiki owner, e.g., ai | media | startup | transit | writing

Remove:
- Tech-stack tags (`tauri`, `rust`, `nextjs`, `react-native`, `supabase`): belong in the article body
- Entity-name tags (`owonetwork`, `citang`): the page title already identifies the entity
- Meta tags (`maintenance`, `index`, `social-circle`): not useful for filtering

Merge or normalize:
- `school → school`, `media-studies → media`, `token-tracking → ai`
- `'person'` → `person` (strip stray quotes)

Build a tag map (old → new) and apply it via regex on the frontmatter `tags:` line. Document the taxonomy in `system/wiki/schema.md` so future agents follow it.
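Applying a tag map might look like the sketch below. It assumes tags are stored as an inline list on a single `tags: [...]` line; the `TAG_MAP` contents and the `normalize_tags_line` helper are illustrative only, with `None` standing for "drop this tag".

```python
import re

# Hypothetical tag map built from the rules above; None means "drop the tag".
TAG_MAP = {
    "media-studies": "media",
    "token-tracking": "ai",
    "rust": None,
    "maintenance": None,
}


def normalize_tags_line(line, tag_map):
    """Rewrite an inline `tags: [a, b]` frontmatter line using tag_map."""
    m = re.match(r"^tags:\s*\[(.*)\]\s*$", line)
    if not m:
        return line  # not an inline tags list; leave untouched
    result = []
    for raw in m.group(1).split(","):
        tag = raw.strip().strip("'\"")  # also removes stray quotes
        if not tag:
            continue
        tag = tag_map.get(tag, tag)
        if tag is not None and tag not in result:
            result.append(tag)  # keep first occurrence, drop duplicates
    return f"tags: [{', '.join(result)}]"
```

Vaults that store tags as a multi-line YAML list would need a real YAML parser instead of a single-line regex.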
When many pages are stubs (< 10 body lines), maintain a topics/stub-index.md listing them with title, kind, and line count. Group by kind. This serves as a task list for future enrichment and cross-reference point when new inbox imports arrive.
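A rough sketch of generating the stub list, assuming "body lines" means non-blank lines after the frontmatter and that page records carry `title`, `kind`, and raw `text`. The helper names and the output layout are assumptions for illustration.

```python
from collections import defaultdict

STUB_THRESHOLD = 10  # body lines, per the rule above


def body_line_count(text):
    """Count non-blank lines after the frontmatter block."""
    lines = text.splitlines()
    if lines and lines[0].strip() == "---":
        try:
            end = lines.index("---", 1)
            lines = lines[end + 1:]
        except ValueError:
            pass  # unterminated frontmatter: count everything
    return sum(1 for line in lines if line.strip())


def build_stub_index(pages):
    """pages: slug -> {"title", "kind", "text"}. Returns index text grouped by kind."""
    by_kind = defaultdict(list)
    for slug, page in pages.items():
        count = body_line_count(page["text"])
        if count < STUB_THRESHOLD:
            by_kind[page["kind"]].append((page["title"], slug, count))
    out = []
    for kind in sorted(by_kind):
        out.append(f"{kind}:")
        for title, slug, count in sorted(by_kind[kind]):
            out.append(f"- [[{slug}|{title}]] ({count} lines)")
    return "\n".join(out)
```

Regenerating the file after each import keeps the line counts current.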
Conventions discovered during wiki work MUST be written into system/wiki/schema.md and system/wiki/workflows.md. Other agents and future sessions won't know about session-only decisions. If you discover a new convention (e.g., "no ## Related sections"), immediately add it to schema.md. Otherwise the next agent will reintroduce the pattern you just removed.
Removing `## Related` sections:
When the wiki has `## Related` sections at the bottom of pages (a list of wikilinks), remove them once inline links are in place. Wikipedia doesn't have a "related pages" section; links are woven into body text.
Before removing, categorize each Related section by whether its links already appear in the body:
Batch removal: For pages where all Related links are already covered in body, remove ## Related\n...\n block directly. Clean up triple newlines.
Manual integration: For uncovered links, add them naturally to body text before removing the section.
Verify: After removal, confirm zero ## Related sections remain.
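The batch-removal and verification steps could be sketched as below. `remove_related_if_covered` is a hypothetical helper; it removes the block only when every Related target is already linked elsewhere on the page, and otherwise returns the uncovered slugs for manual integration.

```python
import re

# The block runs from the heading to the next "## " heading or end of file.
RELATED_BLOCK = re.compile(
    r"^## Related[ \t]*\n(.*?)(?=^## |\Z)", re.MULTILINE | re.DOTALL
)
LINK_TARGET = re.compile(r"\[\[([^\]|]+)")


def remove_related_if_covered(text):
    """Return (new_text, uncovered_slugs); text is unchanged if any are uncovered."""
    m = RELATED_BLOCK.search(text)
    if not m:
        return text, []
    body = text[:m.start()] + text[m.end():]
    body_targets = {t.strip() for t in LINK_TARGET.findall(body)}
    related = [t.strip() for t in LINK_TARGET.findall(m.group(1))]
    uncovered = [t for t in related if t not in body_targets]
    if uncovered:
        return text, uncovered  # needs manual integration first
    # Collapse the blank-line run left behind by the removed block.
    return re.sub(r"\n{3,}", "\n\n", body), []
```

Running the same function over every page after the batch and checking that no result still contains `## Related` covers the verification step.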
When reviewing a wiki from an encyclopedia perspective:
- `## Related` sections should not exist; links should be inline.
- Broken links: `[[links]]` pointing to non-existent pages. Check for trailing spaces inside brackets (`[[slug ]]`).
- Stubs: list them in `stub-index.md` for tracking.
- Every page needs a `## Sources` section. Course index pages are exempt.
- Date headings (e.g., `## 2024-01`) indicate chronological structure instead of thematic. Rewrite.
- No `inbox/` paths should appear in `## Sources` sections.

When a page has long data tables (email lists, phone numbers, credentials), wrap them in Obsidian collapsible callouts instead of leaving them as open tables:
> [!info]- All emails (21)
> | Address | Purpose |
> |---------|---------|
> | ... | ... |
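Wrapping an existing table this way is mechanical, so it can be scripted. The sketch below assumes the table is given as a list of markdown lines with a header row and a separator row, and that the number in the title is the data-row count; `wrap_in_callout` is a hypothetical helper name.

```python
def wrap_in_callout(title, table_lines, kind="info"):
    """Prefix a markdown table with a folded Obsidian callout header."""
    # Assumption: the first two lines are the header and separator rows.
    row_count = max(len(table_lines) - 2, 0)
    header = f"> [!{kind}]- {title} ({row_count})"
    return "\n".join([header, *(f"> {line}" for line in table_lines)])
```

The `-` after the callout type is what makes the callout collapsed by default in Obsidian.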
When the same entity exists as multiple pages (e.g., projects/bc-ai-ecosystem.md AND topics/bc-ai-ecosystem.md):
- Update `kind` and tags if needed

Social media nicknames (WeChat nicknames, Instagram handles, etc.) must NOT go in:
- The page title
- `aliases:` in frontmatter

Nicknames belong in body text under a Contact or personal info section:
# wang-wu
A classmate from a certain secondary school. Social nickname: "Dreamer".
NOT:
# wang-wu (social nickname) ← WRONG
aliases: ['wang-wu', 'social nickname'] ← WRONG
Aliases are for real name variants and common transliterations only (e.g., "James" for James Doe, "Alex" for Alex Lee).
When auditing person pages, scan for nicknames in page titles and in `aliases:` entries.
## Sources sections must NEVER reference inbox/ paths. Inbox is a temporary staging area — all content is deleted after distillation into sources/ and wiki/.
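One way to audit this rule is to extract each page's Sources block and list any `inbox/` paths in it. This is a minimal sketch; `inbox_refs_in_sources` is a hypothetical helper, and it assumes the section heading is exactly `## Sources`.

```python
import re

# The block runs from the heading to the next "## " heading or end of file.
SOURCES_BLOCK = re.compile(
    r"^## Sources[ \t]*\n(.*?)(?=^## |\Z)", re.MULTILINE | re.DOTALL
)
# The lookbehind avoids matching paths such as sources/inbox/...
INBOX_PATH = re.compile(r"(?<![\w/])inbox/\S+")


def inbox_refs_in_sources(text):
    """Return inbox/ paths referenced inside the ## Sources section."""
    m = SOURCES_BLOCK.search(text)
    return INBOX_PATH.findall(m.group(1)) if m else []
```

An empty list for every page means the rule holds across the vault.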
When importing data, copy raw files to sources/ first, then reference sources/ paths in wiki Sources sections. Common migrations:
- inbox/identity/contacts.vcf → sources/identity/contacts.vcf
- inbox/Whatsapp/*.txt → sources/chats/*.txt
- inbox/facebook-* → already absorbed into wiki, remove reference

When the wiki has many course-type pages, don't link to individual courses from other pages. Instead create a hierarchy:
courses.md (master index)
├── courses-university.md (university courses, grouped by Faculty)
│   ├── ASIA (9 courses)
│   ├── CHIN (4 courses)
│   └── ...
└── courses-school.md (school courses, grouped by type)
    ├── BC provincial courses
    └── Chinese-curriculum courses
Each course page links to its faculty/institution index in ## 相关链接 (Related Links). Other pages (people, projects) link to the index page, not to individual courses.
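Grouping course pages under a faculty index can be sketched as follows, assuming course slugs follow the letters-hyphen-digits shape (asia-101, cpsc-110). The helper names and the rendered layout are illustrative assumptions.

```python
import re
from collections import defaultdict

COURSE_SLUG = re.compile(r"^([a-z]+)-(\d+)$")  # assumed slug shape


def group_courses_by_faculty(slugs):
    """Map FACULTY code -> sorted course slugs; non-course slugs are ignored."""
    groups = defaultdict(list)
    for slug in slugs:
        m = COURSE_SLUG.match(slug)
        if m:
            groups[m.group(1).upper()].append(slug)
    return {faculty: sorted(members) for faculty, members in sorted(groups.items())}


def render_faculty_index(groups):
    """Render one labelled list of course links per faculty."""
    lines = []
    for faculty, members in groups.items():
        lines.append(f"{faculty} ({len(members)} courses)")
        lines.extend(f"- [[{slug}]]" for slug in members)
    return "\n".join(lines)
```

The index page is the one place where individual course links are expected, which is why these pages are kept out of the auto-link targets.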
When importing Flighty flight export CSV:
- Save the raw CSV to `sources/flights/` with a date slug
- Create or update the `wiki/topics/transportation.md` page with stats (total flights, common routes, cities visited)
- Update the traveler's person page (e.g., `some-person.md`) with a city list linking to existing place pages
- Update `system/wiki/index.md`

Airport codes should link to place pages where they exist (e.g., `[[vancouver|Vancouver]]` for YVR). Cities without place pages remain plain text.
When you discover a new convention during wiki work (e.g., "no ## Related sections", "WeChat nicknames not in aliases", "no inbox/ in Sources"), write it into system/wiki/schema.md or system/wiki/workflows.md immediately, in the same session. Do not defer. Other agents and future sessions have no memory of session-only decisions. A convention not written down will be violated by the next agent.