From bungkust-skills
Scrapes daily Heartopia game updates from Instagram @myheartopia.id, extracts weather, events, and resource locations via OCR on carousel images, parses the data, and appends structured entries to a Markdown file in an Obsidian vault.
npx claudepluginhub bungkust/bungkust-skills
This skill uses the workspace's default tool permissions.
> Scrape Heartopia game info from Instagram @myheartopia.id, OCR the images, and update the Obsidian vault.
1. Fetch the latest "Update Harian" (daily update) post from @myheartopia.id (insta-fetcher)
2. Extract image carousel URLs
3. Download images + OCR with tesseract
4. Parse OCR output → structured data
5. Append entry to Heartopia-Daily-Tracker.md in vault
# Install deps if needed
mkdir -p /tmp/heartopia-scrape && cd /tmp/heartopia-scrape
npm install insta-fetcher 2>/dev/null
# Install tesseract
apt-get install -y tesseract-ocr 2>/dev/null
# Get Instagram cookies from browser:
# Instagram → DevTools → Application → Cookies → Copy sessionid value
# Format: sessionid=YOUR_DS_USER_ID%3A90HYFuvCoL9HCs%3A19%3AAYjerQ...
// Full cookie string from browser (not just sessionid)
const cookieStr = 'csrftoken=...; sessionid=...; mid=...; ig_did=...; datr=...; ds_user_id=...; ps_l=1; ps_n=1;';
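Before calling the API, the cookie string above can be checked for the values that matter. The following is a minimal sketch, not part of the original script: `parseCookieString` and `missingCookies` are hypothetical helper names, and treating `sessionid` and `ds_user_id` as the required cookies is an assumption.

```javascript
// Hypothetical helper: split a "name=value; name=value" cookie string into an object.
function parseCookieString(cookieStr) {
  const cookies = {};
  for (const part of cookieStr.split(';')) {
    const trimmed = part.trim();
    if (!trimmed) continue; // skip the empty piece after a trailing ';'
    const eq = trimmed.indexOf('=');
    if (eq === -1) continue; // ignore fragments without a value
    // Split on the first '=' only, since cookie values may contain '='.
    cookies[trimmed.slice(0, eq)] = trimmed.slice(eq + 1);
  }
  return cookies;
}

// Assumption: sessionid and ds_user_id are the cookies worth checking up front.
function missingCookies(cookieStr, required = ['sessionid', 'ds_user_id']) {
  const cookies = parseCookieString(cookieStr);
  return required.filter((name) => !cookies[name]);
}

// Placeholder values, not real credentials.
const exampleCookieStr = 'csrftoken=abc; sessionid=123%3Axyz; ds_user_id=123; ps_l=1;';
console.log(missingCookies(exampleCookieStr)); // []
console.log(missingCookies('csrftoken=abc;')); // [ 'sessionid', 'ds_user_id' ]
```

A non-empty result is a cue to refresh the cookies from the browser before running the scraper.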
Carousel image URLs: items[].carousel_media[].image_versions2.candidates[].url
Convert JPG to PNG before OCR (ffmpeg -i img.jpg img.png)
Heartopia update images usually contain:
img_0: Date, Oak Tree location, Fluorite location, Weather
img_1: Dory's items (when raining), price info
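The carousel path noted above (`items[].carousel_media[].image_versions2.candidates[].url`) can be walked with a small function. This is a sketch under assumptions: `extractImageUrls` and `firstCandidateUrl` are hypothetical names, the first candidate is assumed to be the one to download, and single-image posts are assumed to carry `image_versions2` directly on the item.

```javascript
// Return the URL of the first image candidate of a media object, or null.
// Assumption: the first candidate is the version worth downloading.
function firstCandidateUrl(media) {
  const candidates = media?.image_versions2?.candidates ?? [];
  return candidates.length > 0 ? candidates[0].url : null;
}

// `post` is one entry of the `items` array in the API response.
function extractImageUrls(post) {
  const hasCarousel =
    Array.isArray(post?.carousel_media) && post.carousel_media.length > 0;
  // Fallback (assumed shape): a single-image post has no carousel_media.
  const slides = hasCarousel ? post.carousel_media : [post];
  return slides.map(firstCandidateUrl).filter((url) => url !== null);
}

// Made-up response fragments for illustration only.
const carouselPost = {
  carousel_media: [
    { image_versions2: { candidates: [{ url: 'https://example.com/0.jpg' }] } },
    { image_versions2: { candidates: [{ url: 'https://example.com/1.jpg' }] } },
  ],
};
console.log(extractImageUrls(carouselPost));
```

An empty array means the item had no usable image, which matches the "No carousel images" case in the troubleshooting table.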
ffmpeg -i img.jpg -q:v 3 img.png -y
tesseract img.png stdout 2>/dev/null
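The two commands above can also be driven from Node. This is a rough sketch rather than the original script: `ffmpegArgs`, `tesseractArgs`, and `ocrImage` are hypothetical names, and the 2x upscale via ffmpeg's `scale` filter is an assumption based on the troubleshooting advice to resize small images before OCR.

```javascript
// Build ffmpeg arguments: convert to PNG and upscale (default 2x, an assumed value).
function ffmpegArgs(jpgPath, pngPath, scale = 2) {
  return ['-y', '-i', jpgPath, '-vf', `scale=iw*${scale}:ih*${scale}`, pngPath];
}

// Build tesseract arguments: print the recognized text to stdout.
function tesseractArgs(pngPath) {
  return [pngPath, 'stdout'];
}

// Convert one downloaded JPG and return its OCR text.
// Requires ffmpeg and tesseract on PATH.
async function ocrImage(jpgPath) {
  // Dynamic import so the snippet works from both .mjs and CommonJS files.
  const { execFileSync } = await import('node:child_process');
  const pngPath = jpgPath.replace(/\.jpe?g$/i, '') + '.png';
  execFileSync('ffmpeg', ffmpegArgs(jpgPath, pngPath), { stdio: 'ignore' });
  return execFileSync('tesseract', tesseractArgs(pngPath), {
    encoding: 'utf8',
    stdio: ['ignore', 'pipe', 'ignore'], // keep stdout, drop tesseract's stderr noise
  });
}

// Usage (not executed here): const text = await ocrImage('img_0.jpg');
console.log(['ffmpeg', ...ffmpegArgs('img.jpg', 'img.png')].join(' '));
```

Passing arguments as an array to `execFileSync` avoids shell quoting problems with file names.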
## DD MMMM YYYY
**Cuaca:** [description]
**Oak Tree:** Rumah No. X
**Fluorite:** Rumah No. X
**Dory (saat hujan):** Rumah No. X / Jual: Item1, Item2, Item3
**Events:** [if any]
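Once the OCR text has been parsed, an entry in the format above can be rendered and appended to the tracker file. This is a minimal sketch: `formatEntry`, `appendEntry`, and the field names on `data` are hypothetical, and the sample values are made up.

```javascript
// Render one tracker entry in the format shown above.
// The field labels are kept exactly as the tracker file uses them.
function formatEntry(data) {
  const lines = [
    `## ${data.date}`,
    `**Cuaca:** ${data.weather}`,
    `**Oak Tree:** Rumah No. ${data.oakTree}`,
    `**Fluorite:** Rumah No. ${data.fluorite}`,
  ];
  // Dory only appears when it rains, so the line is optional.
  if (data.dory) {
    lines.push(
      `**Dory (saat hujan):** Rumah No. ${data.dory.house} / Jual: ${data.dory.items.join(', ')}`
    );
  }
  if (data.events) lines.push(`**Events:** ${data.events}`);
  return lines.join('\n') + '\n';
}

// Append the entry to the vault file, e.g. Heartopia-Daily-Tracker.md.
async function appendEntry(vaultFile, data) {
  // Dynamic import so the snippet works from both .mjs and CommonJS files.
  const { appendFile } = await import('node:fs/promises');
  await appendFile(vaultFile, '\n' + formatEntry(data), 'utf8');
}

// Made-up sample data for illustration only.
const sampleEntry = {
  date: '01 Januari 2025',
  weather: 'Hujan',
  oakTree: 3,
  fluorite: 7,
  dory: { house: 5, items: ['Item1', 'Item2'] },
  events: '',
};
console.log(formatEntry(sampleEntry));
```

Keeping the formatting in a pure function makes it easy to check the output before anything is written to the vault.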
Scripts are in /tmp/heartopia-scrape/:
scrape_heartopia.mjs: main scraper script
cookies.json: cookie storage (gitignored, do not commit!)
package.json: dependencies
cd /tmp/heartopia-scrape
node scrape_heartopia.mjs
| Problem | Solution |
|---|---|
| "Page Not Found" on IG API | sessionid expired; refresh the cookie from the browser |
| OCR returns empty | Image is too small; convert to PNG and resize before OCR |
| No carousel images | The post may be a single image, not a carousel |
| tesseract not found | apt-get install tesseract-ocr |
| Wrong home location | Check @hey.bunnyxo for reference |