Skill

convert-to-md

Batch convert documents (DOCX, DOTX, PDF, XLSX, TXT, PPTX, MSG, DOC) to markdown, preserving tracked changes and comments.

Install

Run in your terminal

npx claudepluginhub nicsuzor/aops

Tool Access

This skill is limited to using the following tools:

Bash, Read

Supporting Assets

scripts/pdf2md.py

Skill Content

Stats

Parent Repo Stars: 0

Parent Repo Forks: 0

Last Commit: Apr 4, 2026

Supported Formats

Format	Method	Notes
DOCX	pandoc `--track-changes=all`	Preserves comments & tracked changes
PDF	PyMuPDF	Text extraction
XLSX	pandas	Converts to markdown tables
TXT	rename	Direct rename to .md
PPTX	pandoc	Slide content to markdown
MSG	extract-msg	Email metadata + body
DOC	textutil	macOS native (fallback)
DOTX	pandoc	Word templates
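
The table above amounts to a lookup from file extension to conversion method. A minimal sketch of that lookup follows; the names `METHODS` and `group_by_method` are hypothetical and not part of the skill, and the method strings are simply copied from the table.

```python
from collections import defaultdict
from pathlib import Path

# Extension -> conversion method, copied from the table above.
METHODS = {
    ".docx": "pandoc --track-changes=all",
    ".pdf": "PyMuPDF",
    ".xlsx": "pandas",
    ".txt": "rename",
    ".pptx": "pandoc",
    ".msg": "extract-msg",
    ".doc": "textutil",
    ".dotx": "pandoc",
}


def group_by_method(paths):
    """Group paths by conversion method; unsupported files go under None."""
    groups = defaultdict(list)
    for path in map(Path, paths):
        # Compare extensions case-insensitively so "A.DOCX" matches ".docx".
        groups[METHODS.get(path.suffix.lower())].append(path)
    return dict(groups)
```

Grouping first makes it easy to see which files would be skipped (the `None` group) before anything is converted or deleted.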

Process

Install dependencies (if needed). This covers the Python packages only; pandoc must be installed separately:

uv add pymupdf pandas openpyxl tabulate extract-msg

Convert DOCX (preserves comments/edits):

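for f in *.docx; do
  [ -e "$f" ] || continue  # skip the literal pattern when no .docx files exist
  # Delete the source only if pandoc exits successfully
  pandoc --track-changes=all -f docx -t markdown -o "${f%.docx}.md" "$f" && rm "$f"
done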
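
For use from a Python script, the same pandoc call can be driven through `subprocess`. This is a sketch under the assumption that `pandoc` is on the `PATH`; the function names `pandoc_docx_command` and `convert_docx` are hypothetical, and the flags are exactly those of the shell loop above.

```python
import subprocess
from pathlib import Path


def pandoc_docx_command(src: Path) -> list[str]:
    """Build the same pandoc invocation as the shell loop above."""
    return [
        "pandoc", "--track-changes=all",
        "-f", "docx", "-t", "markdown",
        "-o", str(src.with_suffix(".md")), str(src),
    ]


def convert_docx(src: Path) -> Path:
    """Convert one DOCX file and delete it only if pandoc succeeded."""
    # check=True raises CalledProcessError on a non-zero exit,
    # so the unlink below is skipped when conversion fails.
    subprocess.run(pandoc_docx_command(src), check=True)
    src.unlink()
    return src.with_suffix(".md")
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.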

Convert PDF:

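import fitz  # PyMuPDF
from pathlib import Path
for pdf in Path(".").glob("*.pdf"):
    # Close the document before deleting the source file
    with fitz.open(pdf) as doc:
        text = "\n\n".join(page.get_text() for page in doc)
    pdf.with_suffix(".md").write_text(text.strip(), encoding="utf-8")
    pdf.unlink()  # remove the original PDF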
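
Text extraction yields nothing for pages that contain only scanned images, so the loop above can write an empty `.md` file and then delete the only copy of the PDF. A small guard, sketched here with the hypothetical helper name `pages_to_markdown`, could make that case explicit:

```python
from typing import Iterable, Optional


def pages_to_markdown(page_texts: Iterable[str]) -> Optional[str]:
    """Join per-page text as the loop above does; return None if nothing was extracted."""
    text = "\n\n".join(page_texts).strip()
    return text or None
```

In the PDF loop this would be called as `pages_to_markdown(page.get_text() for page in doc)`, skipping both the write and the `unlink()` when the result is `None`.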

Convert XLSX to tables:

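import pandas as pd
from pathlib import Path
for xlsx in Path(".").glob("*.xlsx"):
    content = f"# {xlsx.stem}\n\n"
    # Open the workbook once and close it before deleting the file
    with pd.ExcelFile(xlsx) as xls:
        for sheet in xls.sheet_names:
            df = xls.parse(sheet)  # one DataFrame per sheet
            # to_markdown requires the tabulate package
            content += f"## {sheet}\n\n{df.to_markdown(index=False)}\n\n"
    xlsx.with_suffix(".md").write_text(content, encoding="utf-8")
    xlsx.unlink()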

Convert TXT:

for f in *.txt; do mv "$f" "${f%.txt}.md"; done
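
Note that `mv` replaces an existing `.md` file of the same name without asking. If that matters, a Python version of the rename can skip such files. This is a sketch only; `rename_txt_to_md` is a hypothetical name, not part of the skill.

```python
from pathlib import Path


def rename_txt_to_md(directory: Path) -> list[Path]:
    """Rename *.txt to *.md, skipping any file whose .md target already exists."""
    renamed = []
    for txt in sorted(directory.glob("*.txt")):
        target = txt.with_suffix(".md")
        if target.exists():
            continue  # do not overwrite an existing markdown file
        txt.rename(target)
        renamed.append(target)
    return renamed
```

The returned list shows which files were actually renamed, so any skipped `.txt` files can be reviewed by hand.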

Convert MSG:

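import extract_msg
from pathlib import Path
msg = extract_msg.Message("file.msg")
content = f"# {msg.subject}\n\n**From:** {msg.sender}\n**Date:** {msg.date}\n\n{msg.body}"
Path("file.md").write_text(content, encoding="utf-8")  # save the converted email
msg.close()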

Clean up: Remove any *:Zone.Identifier files (metadata that Windows attaches to downloaded files).
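
The clean-up step could be scripted as follows. This sketch assumes the `:Zone.Identifier` entries show up as ordinary files in the working directory, as can happen when Windows downloads are copied onto a Linux filesystem; `remove_zone_identifiers` is a hypothetical name.

```python
from pathlib import Path


def remove_zone_identifiers(directory: Path) -> list[str]:
    """Delete files named '<name>:Zone.Identifier' and return their names."""
    removed = []
    for path in sorted(directory.iterdir()):
        if path.is_file() and path.name.endswith(":Zone.Identifier"):
            path.unlink()
            removed.append(path.name)
    return removed
```

Calling `remove_zone_identifiers(Path("."))` after the conversions leaves only the markdown output and any files that were not converted.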
