This skill should be used when the user asks to "analyze DBF files", "read DBF data", "query DBF database", "convert DBF to Parquet", "analyze Thai accounting data", "explore legacy database", "run DuckDB queries on DBF", or mentions DBF, Parquet conversion, or Thai legacy accounting systems. Provides comprehensive guidance for reading, converting, and analyzing Thai legacy DBF accounting databases.
Toolkit for analyzing Thai legacy DBF accounting databases using roonpoo, Parquet, and DuckDB.
This skill enables analysis of legacy Thai accounting systems that use DBF (dBase) files with TIS-620/CP874 encoding. The workflow involves three steps: read the DBF files with roonpoo, convert them to Parquet, and query the Parquet files with DuckDB.
Before starting analysis, ensure the roonpoo library is available:

```bash
cd libs/python
uv sync
```

For Parquet conversion and DuckDB queries:

```bash
uv pip install pyarrow duckdb
```
Basic usage:

```python
from roonpoo import DBF

# Stream records (memory-efficient)
for record in DBF('path/to/file.DBF', encoding='tis-620'):
    print(record)

# Preload all records
table = DBF('path/to/file.DBF', encoding='tis-620', preload=True)
print(table.records[0])
```
| Parameter | Description |
|---|---|
| `encoding` | Use `'tis-620'` or `'cp874'` for Thai text |
| `preload` | Load all records into memory |
| `ignore_missing_memo` | Skip if the .FPT/.DBT memo file is missing |
| `char_decode_errors` | `'strict'`, `'ignore'`, or `'replace'` |
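These options combine freely. A minimal sketch putting the table above together (the file path is hypothetical):

```python
from roonpoo import DBF

# Preload a Thai-encoded table, tolerating a missing memo file and bad bytes
table = DBF(
    'DATA/ARMST.DBF',  # hypothetical path
    encoding='tis-620',
    preload=True,
    ignore_missing_memo=True,
    char_decode_errors='replace',
)
print(len(table.records))
```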
To inspect a table's structure and metadata:

```python
table = DBF('file.DBF', encoding='tis-620')

# Metadata
print(f"Version: {table.dbversion}")
print(f"Last modified: {table.date}")
print(f"Records: {table.header.numrecords}")

# Fields
for field in table.fields:
    print(f"{field.name}: type={field.type}, len={field.length}")
```
| Type | Description |
|---|---|
| C | Character (string) |
| N | Numeric |
| D | Date |
| L | Logical (boolean) |
| M | Memo (requires .FPT/.DBT) |
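Type codes are useful for a quick pre-flight check, e.g. spotting memo fields before deciding whether `ignore_missing_memo` is needed. A sketch using only the `table.fields` attributes shown above:

```python
from collections import Counter
from roonpoo import DBF

table = DBF('file.DBF', encoding='tis-620', ignore_missing_memo=True)

# Tally the type codes across all fields
print(Counter(f.type for f in table.fields))

# Memo fields ('M') require the companion .FPT/.DBT file
memo_fields = [f.name for f in table.fields if f.type == 'M']
if memo_fields:
    print(f"Memo fields present: {memo_fields}")
```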
Use the conversion script at scripts/dbf_to_parquet.py:
```bash
uv run python scripts/dbf_to_parquet.py /path/to/DATA/*.DBF -o /path/to/output/
```
Or inline:
```python
from roonpoo import DBF
import pyarrow as pa
import pyarrow.parquet as pq
from decimal import Decimal

def convert_dbf_to_parquet(dbf_path, output_path):
    table = DBF(dbf_path, encoding='tis-620', char_decode_errors='replace')
    records = list(table)
    columns = {f.name: [] for f in table.fields}
    for rec in records:
        for field in table.fields:
            val = rec.get(field.name)
            if isinstance(val, Decimal):
                val = float(val)
            columns[field.name].append(val)
    arrow_table = pa.table(columns)
    pq.write_table(arrow_table, output_path)
    return len(records)
```
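A batch driver for the function above might look like this. It is a sketch: the input directory mirrors the workflow example later in this document, and `asParquet/` is the output directory the query examples assume.

```python
from pathlib import Path

data_dir = Path('sample_company/ALLDATA/DATA2011')
out_dir = Path('asParquet')
out_dir.mkdir(exist_ok=True)

for dbf_file in sorted(data_dir.glob('*.DBF')):
    n = convert_dbf_to_parquet(dbf_file, out_dir / f'{dbf_file.stem}.parquet')
    print(f'{dbf_file.name}: {n} records')
```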
Set up a DuckDB connection:

```python
import duckdb

con = duckdb.connect()
parquet_dir = 'path/to/parquet/files'
```
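The `parquet_dir` variable can be interpolated into queries; DuckDB accepts glob patterns in quoted file paths. A small sketch:

```python
# Count rows across every converted table in the directory
df = con.execute(f"SELECT count(*) AS n FROM '{parquet_dir}/*.parquet'").fetchdf()
print(df)
```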
Query single file:
```sql
SELECT * FROM 'asParquet/ARMST.parquet' LIMIT 10
```
List all tables with row counts:
```sql
SELECT
    replace(filename, 'path/', '') as file,
    count(*) as rows
FROM parquet_scan('asParquet/*.parquet', filename=true)
GROUP BY filename
ORDER BY rows DESC
```
Cross-table JOIN:
```sql
SELECT
    a.ACCID,
    m.COMP,
    COUNT(*) as txn_count
FROM 'asParquet/ARTR.parquet' a
JOIN 'asParquet/ARMST.parquet' m ON a.ACCID = m.ACCID
GROUP BY a.ACCID, m.COMP
ORDER BY txn_count DESC
```
Schema inspection:
```sql
DESCRIBE SELECT * FROM 'asParquet/ARTR.parquet'
```
| Table | Description | Key Fields |
|---|---|---|
| ARMST | Customer master | ACCID, COMP, NAME, TEL |
| APMST | Vendor master | ACCID, COMP, NAME |
| ARTR | AR transactions | DOCNO, DATEDOC, ACCID, AMOUNT |
| APTR | AP transactions | DOCNO, DATEDOC, ACCID, AMOUNT |
| GLTR | GL transactions | GLID, DEBIT, CREDIT |
| GLTRHD | GL headers | DOCNO, DATEDOC |
| INVLOC | Inventory location | PCODE, LOCID, QTY |
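As one more illustration of the schemas above, a sketch totaling on-hand inventory per product (field names taken from the INVLOC row in the table; verify against the actual file with DESCRIBE first):

```sql
SELECT PCODE, SUM(QTY) AS total_qty
FROM 'asParquet/INVLOC.parquet'
GROUP BY PCODE
ORDER BY total_qty DESC
LIMIT 20
```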
Complete analysis workflow:
```python
from roonpoo import DBF
from pathlib import Path
import duckdb

# 1. Explore DBF structure
data_dir = Path('sample_company/ALLDATA/DATA2011')
for dbf_file in sorted(data_dir.glob('*.DBF'))[:5]:
    table = DBF(dbf_file, encoding='tis-620')
    print(f"{dbf_file.name}: {table.header.numrecords} records")

# 2. Convert key tables to Parquet
# (use scripts/dbf_to_parquet.py)

# 3. Query with DuckDB
con = duckdb.connect()
result = con.execute("""
    SELECT GLID, SUM(DEBIT) as total_debit, SUM(CREDIT) as total_credit
    FROM 'asParquet/GLTR.parquet'
    GROUP BY GLID
    ORDER BY total_debit DESC
    LIMIT 10
""").fetchdf()
print(result)
```
For files with encoding problems:
```python
# Replace invalid characters
table = DBF('file.DBF', encoding='tis-620', char_decode_errors='replace')

# Or ignore them
table = DBF('file.DBF', encoding='tis-620', char_decode_errors='ignore')
```
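With `'replace'`, undecodable bytes become the Unicode replacement character (U+FFFD), which makes affected records easy to locate. A sketch, assuming records expose dict-style `.values()` as they do `.get()` above:

```python
table = DBF('file.DBF', encoding='tis-620', char_decode_errors='replace')
bad = [
    rec for rec in table
    if any('\ufffd' in v for v in rec.values() if isinstance(v, str))
]
print(f"{len(bad)} records contained undecodable bytes")
```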
Bundled resources:

- scripts/dbf_to_parquet.py - Batch convert DBF files to Parquet
- scripts/inspect_dbf.py - Inspect DBF structure and sample data
- references/table-schemas.md - Common Thai accounting table schemas
- references/query-patterns.md - Advanced DuckDB query patterns