Generate PDFs from HTML and upload to Databricks Unity Catalog volumes for test data, demos, reports, or evaluation datasets.
npx claudepluginhub databricks-solutions/ai-dev-kit --plugin databricks-ai-dev-kit
This skill uses the workspace's default tool permissions.
Convert HTML content to PDF documents and upload them to Unity Catalog Volumes.
The generate_and_upload_pdf MCP tool converts HTML to PDF and uploads to a Unity Catalog Volume. You (the LLM) generate the HTML content, and the tool handles conversion and upload.
generate_and_upload_pdf(
html_content: str, # Complete HTML document
filename: str, # PDF filename (e.g., "report.pdf")
catalog: str, # Unity Catalog name
schema: str, # Schema name
volume: str = "raw_data", # Volume name (default: "raw_data")
folder: str | None = None, # Optional subfolder inside the volume
)
Returns:
{
"success": true,
"volume_path": "/Volumes/catalog/schema/volume/filename.pdf",
"error": null
}
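The return shape above suggests a simple pattern for handling results. The following is a minimal sketch, not part of the tool: `expected_volume_path` and `volume_path_or_raise` are hypothetical helper names, and placing `folder` between the volume and the filename is an assumption inferred from the documented path format.

```python
from typing import Optional


def expected_volume_path(catalog: str, schema: str, filename: str,
                         volume: str = "raw_data",
                         folder: Optional[str] = None) -> str:
    """Build the path the tool is expected to report.

    Assumption: an optional folder sits between the volume and the filename.
    """
    parts = ["/Volumes", catalog, schema, volume]
    if folder:
        parts.append(folder.strip("/"))
    parts.append(filename)
    return "/".join(parts)


def volume_path_or_raise(result: dict) -> str:
    """Return volume_path from a tool result, raising if the upload failed."""
    if not result.get("success"):
        raise RuntimeError(result.get("error") or "PDF upload failed with no error message")
    return result["volume_path"]
```

Checking `success` before reading `volume_path` keeps a failed upload from being mistaken for a usable file path.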
Generate a simple PDF:
generate_and_upload_pdf(
html_content='''<!DOCTYPE html>
<html>
<head>
<style>
body { font-family: Arial, sans-serif; margin: 40px; }
h1 { color: #1a73e8; border-bottom: 2px solid #1a73e8; padding-bottom: 10px; }
.section { margin: 20px 0; }
</style>
</head>
<body>
<h1>Quarterly Report Q1 2024</h1>
<div class="section">
<h2>Executive Summary</h2>
<p>Revenue increased 15% year-over-year...</p>
</div>
</body>
</html>''',
filename="q1_report.pdf",
catalog="my_catalog",
schema="my_schema"
)
IMPORTANT: PDF generation and upload can take 2-5 seconds per document. When generating multiple PDFs, call the tool in parallel to maximize throughput.
Make 5 simultaneous generate_and_upload_pdf calls:
# Call 1
generate_and_upload_pdf(
html_content="<html>...Employee Handbook content...</html>",
filename="employee_handbook.pdf",
catalog="hr_catalog", schema="policies", folder="2024"
)
# Call 2 (parallel)
generate_and_upload_pdf(
html_content="<html>...Leave Policy content...</html>",
filename="leave_policy.pdf",
catalog="hr_catalog", schema="policies", folder="2024"
)
# Call 3 (parallel)
generate_and_upload_pdf(
html_content="<html>...Code of Conduct content...</html>",
filename="code_of_conduct.pdf",
catalog="hr_catalog", schema="policies", folder="2024"
)
# Call 4 (parallel)
generate_and_upload_pdf(
html_content="<html>...Benefits Guide content...</html>",
filename="benefits_guide.pdf",
catalog="hr_catalog", schema="policies", folder="2024"
)
# Call 5 (parallel)
generate_and_upload_pdf(
html_content="<html>...Remote Work Policy content...</html>",
filename="remote_work_policy.pdf",
catalog="hr_catalog", schema="policies", folder="2024"
)
By calling these in parallel (not sequentially), 5 PDFs that would take roughly 10-25 seconds one after another complete in about the time of the slowest single call, around 2-5 seconds total.
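When the calls are driven from a script rather than issued directly as tool calls, the same fan-out can be sketched with a thread pool. This is an illustration under stated assumptions: `fake_generate_and_upload_pdf` is a hypothetical stand-in that only sleeps and returns the documented result shape, and `generate_many` is not part of the tool.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_generate_and_upload_pdf(html_content, filename, catalog, schema,
                                 volume="raw_data", folder=None):
    """Hypothetical stand-in for the MCP tool; sleeps to mimic conversion latency."""
    time.sleep(0.2)
    parts = ["/Volumes", catalog, schema, volume]
    if folder:
        parts.append(folder)
    parts.append(filename)
    return {"success": True, "volume_path": "/".join(parts), "error": None}


def generate_many(requests, max_workers=5):
    """Submit one call per request concurrently; results keep request order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(fake_generate_and_upload_pdf, **req) for req in requests]
        return [future.result() for future in futures]


documents = [
    ("Employee Handbook", "employee_handbook.pdf"),
    ("Leave Policy", "leave_policy.pdf"),
    ("Code of Conduct", "code_of_conduct.pdf"),
    ("Benefits Guide", "benefits_guide.pdf"),
    ("Remote Work Policy", "remote_work_policy.pdf"),
]
requests = [
    {
        "html_content": f"<html><body><h1>{title}</h1></body></html>",
        "filename": filename,
        "catalog": "hr_catalog",
        "schema": "policies",
        "folder": "2024",
    }
    for title, filename in documents
]

start = time.perf_counter()
results = generate_many(requests)
elapsed = time.perf_counter() - start
print(f"{len(results)} PDFs in {elapsed:.2f}s")
```

Because each call spends its time waiting on conversion and upload, threads are enough here; total time tracks the slowest call rather than the sum.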
Always include the full HTML structure:
<!DOCTYPE html>
<html>
<head>
<style>
/* Your CSS here */
</style>
</head>
<body>
<!-- Your content here -->
</body>
</html>
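One way to guarantee that every document carries this full structure is to wrap body fragments programmatically. A minimal sketch, assuming a hypothetical helper named `build_html_document` (not part of the tool):

```python
def build_html_document(body_html: str, css: str = "") -> str:
    """Wrap a body fragment and optional CSS in the full document skeleton."""
    return "\n".join([
        "<!DOCTYPE html>",
        "<html>",
        "<head>",
        "<style>",
        css,
        "</style>",
        "</head>",
        "<body>",
        body_html,
        "</body>",
        "</html>",
    ])


document = build_html_document(
    body_html="<h1>Quarterly Report Q1 2024</h1>",
    css="body { font-family: Arial, sans-serif; margin: 40px; }",
)
```

The resulting string can be passed as `html_content`, so only the body and the CSS vary between documents.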
PlutoPrint supports modern CSS3, including CSS variables (--var-name), as in this template:
<!DOCTYPE html>
<html>
<head>
<style>
:root {
--primary: #1a73e8;
--text: #202124;
--gray: #5f6368;
}
body {
font-family: 'Segoe UI', Arial, sans-serif;
margin: 50px;
color: var(--text);
line-height: 1.6;
}
h1 {
color: var(--primary);
border-bottom: 3px solid var(--primary);
padding-bottom: 15px;
}
h2 { color: var(--text); margin-top: 30px; }
.highlight {
background: #e8f0fe;
padding: 15px;
border-left: 4px solid var(--primary);
margin: 20px 0;
}
table {
width: 100%;
border-collapse: collapse;
margin: 20px 0;
}
th, td {
border: 1px solid #dadce0;
padding: 12px;
text-align: left;
}
th { background: #f1f3f4; }
.footer {
margin-top: 50px;
padding-top: 20px;
border-top: 1px solid #dadce0;
color: var(--gray);
font-size: 0.9em;
}
</style>
</head>
<body>
<h1>Document Title</h1>
<h2>Section 1</h2>
<p>Content here...</p>
<div class="highlight">
<strong>Important:</strong> Key information highlighted here.
</div>
<h2>Data Table</h2>
<table>
<tr><th>Column 1</th><th>Column 2</th><th>Column 3</th></tr>
<tr><td>Data</td><td>Data</td><td>Data</td></tr>
</table>
<div class="footer">
Generated on 2024-01-15 | Confidential
</div>
</body>
</html>
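When the table in a template like this is filled from real data, cell values should be escaped so that characters such as `&` or `<` do not break the markup. A minimal sketch, assuming a hypothetical helper named `render_table`:

```python
from html import escape


def render_table(headers, rows):
    """Render headers and rows as an HTML table, escaping every cell value."""
    head = "".join(f"<th>{escape(str(h))}</th>" for h in headers)
    lines = ["<table>", f"<tr>{head}</tr>"]
    for row in rows:
        cells = "".join(f"<td>{escape(str(c))}</td>" for c in row)
        lines.append(f"<tr>{cells}</tr>")
    lines.append("</table>")
    return "\n".join(lines)


table_html = render_table(
    ["Region", "Revenue"],
    [["EMEA & APAC", "<1M"], ["Americas", "2.4M"]],
)
```

The output uses the same `<table>`, `<tr>`, `<th>`, and `<td>` elements as the template, so the template's CSS applies unchanged.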
Generate API documentation, user guides, or technical specs:
generate_and_upload_pdf(
html_content='''<!DOCTYPE html>
<html>
<head><style>
body { font-family: monospace; margin: 40px; }
code { background: #f4f4f4; padding: 2px 6px; }
pre { background: #f4f4f4; padding: 15px; overflow-x: auto; }
.endpoint { background: #e3f2fd; padding: 10px; margin: 10px 0; }
</style></head>
<body>
<h1>API Reference</h1>
<div class="endpoint">
<code>GET /api/v1/users</code>
<p>Returns a list of all users.</p>
</div>
<h2>Request Headers</h2>
<pre>Authorization: Bearer {token}
Content-Type: application/json</pre>
</body>
</html>''',
filename="api_reference.pdf",
catalog="docs_catalog",
schema="api_docs"
)
Generate business reports with headline metrics:
generate_and_upload_pdf(
html_content='''<!DOCTYPE html>
<html>
<head><style>
body { font-family: Georgia, serif; margin: 50px; }
.metric { display: inline-block; text-align: center; margin: 20px; }
.metric-value { font-size: 2em; color: #1a73e8; }
.metric-label { color: #666; }
</style></head>
<body>
<h1>Q1 2024 Performance Report</h1>
<div class="metric">
<div class="metric-value">$2.4M</div>
<div class="metric-label">Revenue</div>
</div>
<div class="metric">
<div class="metric-value">+15%</div>
<div class="metric-label">Growth</div>
</div>
</body>
</html>''',
filename="q1_2024_report.pdf",
catalog="finance",
schema="reports",
folder="quarterly"
)
Generate HR policy documents:
generate_and_upload_pdf(
html_content='''<!DOCTYPE html>
<html>
<head><style>
body { font-family: Arial; margin: 40px; line-height: 1.8; }
.policy-section { margin: 30px 0; }
.important { background: #fff3e0; padding: 15px; border-radius: 5px; }
</style></head>
<body>
<h1>Employee Leave Policy</h1>
<p><em>Effective: January 1, 2024</em></p>
<div class="policy-section">
<h2>1. Annual Leave</h2>
<p>All full-time employees are entitled to 20 days of paid annual leave per calendar year.</p>
</div>
<div class="important">
<strong>Note:</strong> Leave requests must be submitted at least 2 weeks in advance.
</div>
</body>
</html>''',
filename="leave_policy.pdf",
catalog="hr_catalog",
schema="policies"
)
When asked to generate multiple PDFs:
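Make the generate_and_upload_pdf calls in parallel rather than one at a time; the volume defaults to raw_data unless another is specified.
| Issue | Solution |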
|---|---|
| "Volume does not exist" | Create the volume first or use an existing one |
| "Schema does not exist" | Create the schema or check the name |
| PDF looks wrong | Check HTML/CSS syntax, use supported CSS features |
| Slow generation | Issue the calls for multiple PDFs in parallel, not sequentially |
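For the "PDF looks wrong" case, a quick structural check of the HTML before calling the tool can catch missing or unclosed tags. The sketch below uses Python's standard `html.parser`; `find_html_problems` is a hypothetical helper, and the check is deliberately strict, so it may flag valid HTML that omits optional end tags such as `</p>` or `</li>`.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class _TagBalanceChecker(HTMLParser):
    """Track open tags and record tags that are unclosed or unexpected."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.problems = []
        self.seen = set()

    def handle_starttag(self, tag, attrs):
        self.seen.add(tag)
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if tag not in self.stack:
            self.problems.append(f"unexpected </{tag}>")
            return
        # Pop back to the matching open tag; anything skipped was left unclosed.
        while self.stack:
            open_tag = self.stack.pop()
            if open_tag == tag:
                break
            self.problems.append(f"unclosed <{open_tag}>")


def find_html_problems(html_content):
    """Return a list of structural problems; an empty list means none were found."""
    problems = []
    if not html_content.lstrip().lower().startswith("<!doctype html"):
        problems.append("missing <!DOCTYPE html>")
    checker = _TagBalanceChecker()
    checker.feed(html_content)
    checker.close()
    for required in ("html", "head", "body"):
        if required not in checker.seen:
            problems.append(f"missing <{required}>")
    problems.extend(checker.problems)
    problems.extend(f"unclosed <{tag}>" for tag in checker.stack)
    return problems
```

This only checks structure; it says nothing about whether a given CSS feature is supported by the renderer.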