Skill

dewey

Queries and downloads datasets from the Dewey Data academic marketplace (POI, foot traffic, mobility, consumer, real estate) via API key, DuckDB, or MCP server.

DuckDB

data-engineering

npx claudepluginhub edwinhu/workflows --plugin workflows

Invocation

- Slash command: `/workflows:dewey`
- Not user invocable
- Model invocable
- Inline context
- Default effort

Supporting Files

- `examples/btm_safegraph_pull.py`
- `references/access-options.md`
- `references/catalog.csv`
- `references/catalog.md`
- `references/datasets.md`
- `references/deweypy-client.md`
- `references/duckdb.md`
- `references/linkage.md`
- `references/mcp.md`
- `references/safegraph-places.md`

SKILL.md

157 lines · ~2.9k tokens

Stats

- Language: Jupyter Notebook
- Stars: 16
- Forks: 5
- Maintenance: Excellent
- Last commit: Jun 6, 2026

| | WRDS | Dewey |
|---|---|---|
| Data | Finance/accounting | POI, foot traffic, mobility, consumer, real estate |
| Access | PostgreSQL / SAS on the grid | File download (Parquet/CSV.gz) via API key |
| Query engine | Server-side SQL | DuckDB over the files (local or remote presigned URLs) |
| Licensing | Per-vendor, negotiated | One platform subscription unlocks the catalog |
| AI access | None | MCP server (`api.deweydata.io/mcp`) |

| Excuse | Reality | Do Instead |
|---|---|---|
| "I'll just download everything and filter in pandas" | SafeGraph Patterns is multi-TB; you'll fill the disk | DuckDB `COPY TO` with `WHERE` on remote Parquet; pull only the rows/columns you need |
| "I don't need to check the schema first" | Column names differ by provider/release (`naics_code` vs `NAICS_CODE`; `opened_on` may not exist) | `read_sample(nrows=100)` BEFORE the full pull |
| "No date filter needed, I want all of it" | Most datasets are date-partitioned; "all" = every weekly file ever | Set `partition_key_after` / `partition_key_before` to your study window |
| "The download started, so it's correct" | A started download ≠ the right columns | Inspect a sample on disk before claiming success |
| "Presigned links are fine for a long job" | Links expire in 24h (`download_files0`) | Use `download_files1` (page-by-page, refreshes links) for large multi-day pulls |
| "I'll hardcode the product path I think it is" | Wrong `prj_` → 404 or someone else's data | Get it from Connect to API, or MCP `search_datasets` |

| Need | Method | Reference |
|---|---|---|
| Discover/search datasets, check schema, sample (from inside Claude) | MCP server (`api.deweydata.io/mcp`) | `references/mcp.md` |
| Scripted Python bulk download | deweypy (recommended) or deweydatapy (legacy, product_path API) | `references/deweypy-client.md` |
| Selective pull: specific columns/rows from huge datasets | DuckDB over presigned URLs (`read_parquet($urls)` + `COPY TO`) | `references/duckdb.md` |
| R workflow | deweyr (`download_dewey()`) | `references/deweypy-client.md` |
| One-off, dataset < 2.0 GB | UI CSV download (platform → project) | `references/access-options.md` |
| Analyze data already on disk | DuckDB / pandas / polars over `*.parquet` or `*.csv.gz` | `references/access-options.md` |
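
The selective-pull row can be illustrated with a small sketch that only builds the SQL text. The helper names (`sql_literal`, `quote_ident`, `build_copy_sql`), the example URL, and the output file name are hypothetical, not part of deweypy, deweydatapy, or DuckDB; the `where` clause is inserted as-is and is assumed to come from trusted code, not user input.

```python
def sql_literal(value):
    """Quote a string as a SQL literal, doubling embedded single quotes."""
    return "'" + str(value).replace("'", "''") + "'"


def quote_ident(name):
    """Quote a column name as a SQL identifier, doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_copy_sql(urls, columns, where, out_path):
    """Build a DuckDB COPY statement that writes only the chosen rows and columns.

    urls     -- presigned file URLs (assumed to have been obtained elsewhere)
    columns  -- column names to keep; an explicit list is required
    where    -- SQL filter text, inserted verbatim (trusted input only)
    out_path -- local Parquet file to write
    """
    if not urls:
        raise ValueError("need at least one file URL")
    if not columns:
        raise ValueError("name the columns explicitly; avoid SELECT *")
    url_list = ", ".join(sql_literal(u) for u in urls)
    col_list = ", ".join(quote_ident(c) for c in columns)
    return (
        f"COPY (SELECT {col_list} "
        f"FROM read_parquet([{url_list}]) "
        f"WHERE {where}) "
        f"TO {sql_literal(out_path)} (FORMAT PARQUET)"
    )


# Placeholder URL and filter, for illustration only.
urls = ["https://example.invalid/part-0.parquet?sig=abc"]
sql = build_copy_sql(urls, ["PLACEKEY", "NAICS_CODE"], "REGION = 'NY'", "ny_places.parquet")
print(sql)
```

With the `duckdb` package installed, a statement like this could be passed to `duckdb.connect().execute(sql)`; reading `https://` URLs relies on DuckDB's httpfs extension. Keeping the column list and the filter explicit is what keeps the pull small.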

| Provider | Dataset(s) | What it is |
|---|---|---|
| SafeGraph | Global Places (POI), Geometry, Spend, Patterns | POI master, building footprints, card spend, foot-traffic visit patterns |
| Advan Research | Monthly/Weekly Patterns, Home Panel | Foot traffic aggregated to place & census-block |
| dataplor | POI | Global POI, strong emerging-markets coverage |
| Veraset | Movement | Device-level mobility (institutional license only) |
| PassBy | Foot Traffic | Per-POI foot-traffic analytics |
| Consumer Edge / PDI | Spend / transactions | Card & product-level purchasing |
| LinkUp | Job postings | Labor-market activity |
| ATTOM / Dwellsy / RentHub | Real estate | Property records, rentals |
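
Because column names can differ by provider and release (for example `naics_code` vs `NAICS_CODE`), it may help to reconcile the columns you plan to request against a small sample before any full pull. The helper below is a hypothetical sketch, not part of deweypy or deweydatapy: it assumes you already have the sample's column names as a list, matches case-insensitively, and fails loudly when a column is absent.

```python
def resolve_columns(wanted, available):
    """Map requested column names onto the names actually present in a sample.

    Matching is case-insensitive, so `naics_code` resolves to `NAICS_CODE`
    when that is how the release spells it. Raises KeyError listing every
    requested column that the sample does not contain.
    """
    by_lower = {name.lower(): name for name in available}
    resolved, missing = [], []
    for name in wanted:
        actual = by_lower.get(name.lower())
        if actual is None:
            missing.append(name)
        else:
            resolved.append(actual)
    if missing:
        raise KeyError(f"columns not in sample: {missing}")
    return resolved


# Column names as they might appear in a small sample (illustrative values).
sample_columns = ["PLACEKEY", "LOCATION_NAME", "NAICS_CODE", "REGION"]
print(resolve_columns(["placekey", "naics_code"], sample_columns))
```

Raising on a missing column, rather than silently dropping it, surfaces a schema mismatch before the download starts instead of after.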

| Column | Meaning |
|---|---|
| `PLACEKEY` | Stable unique POI id (join key across SafeGraph products) |
| `LOCATION_NAME` | POI name |
| `BRANDS` | JSON array, not plain text: `[{"safegraph_brand_name":"…"}]` |
| `STREET_ADDRESS`, `CITY`, `REGION`, `POSTAL_CODE`, `ISO_COUNTRY_CODE` | Address (`REGION` = US state) |
| `LATITUDE`, `LONGITUDE` | Coordinates |
| `NAICS_CODE`, `NAICS_CODE_2022` | 6-digit NAICS (string) |
| `TOP_CATEGORY`, `SUB_CATEGORY` | Category labels |
| `OPENED_ON`, `CLOSED_ON`, `TRACKING_CLOSED_SINCE` | Open/close dates (exist but sparsely populated; NULL for BTMs) |
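
Two of the columns above are easy to mishandle: `BRANDS` holds a JSON array, and `NAICS_CODE` is a string. The sketch below shows one way to treat them, assuming the `BRANDS` cell arrives as a JSON string (it may already be parsed, depending on how the file is read). The function names and sample values are hypothetical.

```python
import json


def brand_names(raw):
    """Return the brand names held in a BRANDS cell.

    `raw` is the cell value as a JSON string, or None/empty for a POI
    with no brand. Entries without a `safegraph_brand_name` key are skipped.
    """
    if not raw:
        return []
    entries = json.loads(raw)
    return [e["safegraph_brand_name"] for e in entries if "safegraph_brand_name" in e]


def in_naics_prefix(code, prefix):
    """True when a NAICS code (kept as a string) starts with the given prefix."""
    return code is not None and str(code).startswith(prefix)


# Illustrative values only; not taken from any real file.
print(brand_names('[{"safegraph_brand_name": "Example Bank"}]'))
print(in_naics_prefix("522110", "5221"))
```

Comparing NAICS codes as strings allows prefix matching on industry groups without any numeric conversion.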

Contents

What Dewey Is

Credential Enforcement

IRON LAW: NEVER GUESS, INVENT, OR HARDCODE THE API KEY

Download Enforcement

IRON LAW: NO BULK DOWNLOAD WITHOUT METADATA + SAMPLE + FILTER FIRST

Rationalization Table — STOP If You Think:

Red Flags — STOP Immediately If You're About To:

Access Method Decision Table

Authentication

Quick Reference: Featured Datasets

SafeGraph Global Places Quick Reference

Additional Resources

Reference Files

Example Files
