From harness-claude
Implements API field selection (?fields=) to let clients request only needed response properties, reducing payload size and costs without GraphQL or narrow endpoints.
npx claudepluginhub intense-visions/harness-engineering --plugin harness-claudeThis skill uses the workspace's default tool permissions.
> FIELD SELECTION LETS CLIENTS REQUEST ONLY THE RESPONSE PROPERTIES THEY NEED — REDUCING PAYLOAD SIZE, ELIMINATING OVER-FETCHING, AND CUTTING BOTH BANDWIDTH AND SERIALIZATION COST ON THE SERVER WITHOUT REQUIRING A SEPARATE GRAPHQL LAYER OR BESPOKE NARROW ENDPOINTS.
Guides API resource granularity decisions: fine-grained (flexible, chatty) vs coarse-grained (fewer round trips, over-fetching) to match client patterns. Use for endpoint design, performance debugging, composites/includes.
Guides REST API design patterns for production-grade endpoints including resource naming, status codes, pagination, filtering, error responses, versioning, and rate limiting. Use when designing new APIs or reviewing contracts.
Provides REST API design patterns for resource naming, URL structures, HTTP methods/status codes, pagination, filtering, errors, versioning, and rate limiting.
Share bugs, ideas, or general feedback.
FIELD SELECTION LETS CLIENTS REQUEST ONLY THE RESPONSE PROPERTIES THEY NEED — REDUCING PAYLOAD SIZE, ELIMINATING OVER-FETCHING, AND CUTTING BOTH BANDWIDTH AND SERIALIZATION COST ON THE SERVER WITHOUT REQUIRING A SEPARATE GRAPHQL LAYER OR BESPOKE NARROW ENDPOINTS.
?fields= to allow clients to project a subset of response properties?fields=id,name,owner.login) for resources with embedded sub-objectsSparse Fieldsets — A sparse fieldset is the requested subset of fields for a resource type. The client specifies which fields it wants; the server omits all others from the response. The JSON:API specification formalizes this as ?fields[resource_type]=field1,field2. Simpler APIs use ?fields=field1,field2 without the type qualifier. Both are valid; choose based on whether your API has a single resource type per endpoint or multiple embedded types.
?fields=id,name,email → return only id, name, email
?fields[user]=id,name → JSON:API style for the user resource
?fields[user]=id,name&fields[org]=id,slug → multi-type projection
Nested Field Selection — For resources with embedded sub-objects, use dot notation to select specific sub-fields. This avoids returning the full sub-object when only one property is needed:
?fields=id,title,author.name,author.avatar_url
The server must parse the dot-path and project only the specified sub-fields from the embedded object. Fields not listed in the projection are omitted from the embedded object entirely, not set to null.
Always-Present Fields — Some fields must always be present in every response regardless of projection: the resource ID, resource type, and any fields required for link traversal (e.g., href, self). Document these as mandatory fields that clients cannot exclude. If a client requests ?fields=name but omits id, the response must still include id.
Server-Side Projection vs. Post-Processing — True field selection projects fields at the data retrieval layer (SQL SELECT col1, col2 instead of SELECT *) to avoid reading unused data from disk. A naive implementation fetches the full object and strips fields before serialization — this reduces response size but does not reduce database I/O or memory use. For maximum benefit, push the projection down to the query layer. Use SQL column lists, MongoDB projections, or equivalent for non-relational stores.
Performance Tradeoffs vs. GraphQL — Field selection via ?fields= covers 80% of over-fetching use cases with zero client-side schema knowledge and no query language. GraphQL covers the remaining 20%: deeply nested multi-resource queries, aliased field names, inline fragments, and complex query composition. For REST APIs serving heterogeneous clients, ?fields= is a low-cost, high-value addition. Introducing GraphQL for field selection alone adds significant operational complexity. Choose GraphQL when clients need rich query composition, not just field projection.
Documenting Field Names — The ?fields= parameter is only usable if the available field names are documented. Include a complete field reference in the API documentation for every resource type, noting which fields are always-present and which are optional. An undocumented field surface forces clients to discover fields by trial and error.
The Google Drive API implements fields parameter selection on nearly every endpoint, making it a canonical reference for production-grade field selection:
Without field selection (full resource — 847 bytes):
GET /drive/v3/files/1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgVE2upms
Authorization: Bearer ya29...
HTTP/1.1 200 OK
Content-Type: application/json
{
"kind": "drive#file",
"id": "1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgVE2upms",
"name": "Q1 Report",
"mimeType": "application/vnd.google-apps.spreadsheet",
"description": "",
"starred": false,
"trashed": false,
"explicitlyTrashed": false,
"parents": ["0AEoMqdKFcXE3Uk9PVA"],
"spaces": ["drive"],
"version": "1",
"webViewLink": "https://docs.google.com/spreadsheets/...",
"iconLink": "https://drive-thirdparty.googleusercontent.com/...",
"hasThumbnail": false,
"thumbnailVersion": "0",
"viewedByMe": true,
"createdTime": "2024-01-10T08:00:00.000Z",
"modifiedTime": "2024-03-15T10:22:00.000Z",
"modifiedByMeTime": "2024-03-15T10:22:00.000Z",
"owners": [{ "kind": "drive#user", "displayName": "Alice", "emailAddress": "alice@example.com" }],
"lastModifyingUser": { "kind": "drive#user", "displayName": "Alice" },
"shared": false,
"ownedByMe": true,
"capabilities": { "canEdit": true, "canComment": true, "canShare": true }
}
With field selection (only id, name, modifiedTime — 94 bytes, 89% reduction):
GET /drive/v3/files/1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgVE2upms?fields=id,name,modifiedTime
Authorization: Bearer ya29...
HTTP/1.1 200 OK
Content-Type: application/json
{
"id": "1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgVE2upms",
"name": "Q1 Report",
"modifiedTime": "2024-03-15T10:22:00.000Z"
}
List endpoint with nested field selection:
GET /drive/v3/files?fields=files(id,name,owners/displayName),nextPageToken
Authorization: Bearer ya29...
Google Drive uses parentheses for nested selection in lists (files(id,name)) and / for nested object paths (owners/displayName). The ?fields= syntax is documented exhaustively in the Google API Discovery documents.
Stripping fields after full serialization. Fetching all columns from the database, constructing the full response object in memory, serializing it to JSON, and then removing keys before sending saves bandwidth but wastes CPU, memory, and database I/O. For large objects with blob fields (markdown content, HTML, image metadata), the serialization of the unused fields may dominate request processing time. Push the projection to the SQL SELECT list or ORM query.
Silently ignoring unknown field names. If a client requests ?fields=id,nme (typo), silently returning only id makes the client believe it received all requested fields. The client may then treat the missing name field as null rather than detecting the typo. Return 400 Bad Request listing the unrecognized field names. This prevents silent data bugs in client integrations.
Allowing field selection to bypass access control. If field-level permissions exist (e.g., salary is restricted to HR roles), a generic ?fields=salary from an unprivileged caller must still return a 403 for that field, not simply include it in the projection. Field selection must be applied after, not instead of, field-level authorization. The projection layer should filter fields from the set the caller is already authorized to see.
Omitting always-present fields from the documentation. If clients discover through trial and error that id is always returned even when not in ?fields=, they may rely on undocumented behavior. Document always-present fields explicitly in the API reference so clients can reason about the projection contract without experimenting.
SQL projection (preferred):
ALLOWED_FIELDS = {"id", "name", "email", "created_at", "role"}
ALWAYS_PRESENT = {"id"}
requested = set(params.get("fields", "").split(",")) & ALLOWED_FIELDS | ALWAYS_PRESENT
columns = ", ".join(f"u.{f}" for f in requested)
cursor.execute(f"SELECT {columns} FROM users u WHERE u.id = %s", [user_id])
Post-serialization stripping (acceptable for small objects):
full_response = serialize(user)
if "fields" in params:
allowed = set(params["fields"].split(",")) | ALWAYS_PRESENT
full_response = {k: v for k, v in full_response.items() if k in allowed}
On list endpoints, apply the projection to every item in the collection. The nextPageToken or next_cursor field in the outer envelope is not subject to projection — it is metadata about the result set, not part of the resource schema.
{
"items": [
{ "id": "1", "name": "Alice" },
{ "id": "2", "name": "Bob" }
],
"next_cursor": "Y3Vyc29yMg=="
}
Contentful's Delivery API serves content to web and mobile clients from a single content graph. Mobile clients for a news app needed only sys.id, fields.title, and fields.heroImage.url from entries that carried 40+ fields including rich-text body content averaging 12KB per entry. After implementing ?select=sys.id,fields.title,fields.heroImage (Contentful's field selection syntax), the news app's feed API response size dropped from 387KB per 10-item page to 12KB — a 97% reduction. Server-side CPU for JSON serialization fell by 68% on that endpoint, and CDN cache efficiency improved because smaller, more cache-friendly responses fit in CDN edge memory more densely.
?fields= with 400 Bad Request listing valid options.SELECT column list or ORM projection from the requested fields, not from the full schema. Include always-present fields unconditionally.?fields=.?fields= return 400 Bad Request with a list of valid field names.?fields= syntax.