Runs Hugging Face ML models in JavaScript/TypeScript for Node.js or browser. Supports text, image, speech tasks via pipeline API with memory management.
Reference files: references/CACHE.md, references/CONFIGURATION.md, references/EXAMPLES.md, references/MODEL_ARCHITECTURES.md, references/PIPELINE_OPTIONS.md, references/TEXT_GENERATION.md
Transformers.js enables running state-of-the-art machine learning models directly in JavaScript, both in browsers and Node.js environments, with no server required.
Use this skill when you need to:
npm install @huggingface/transformers
<script type="module">
import { pipeline } from 'https://cdn.jsdelivr.net/npm/@huggingface/transformers';
</script>
The pipeline API is the easiest way to use models. It groups together preprocessing, model inference, and postprocessing:
import { pipeline } from '@huggingface/transformers';
// Create a pipeline for a specific task
const pipe = await pipeline('sentiment-analysis');
// Use the pipeline
const result = await pipe('I love transformers!');
// Output: [{ label: 'POSITIVE', score: 0.999817686 }]
// IMPORTANT: Always dispose when done to free memory
await pipe.dispose();
⚠️ Memory Management: All pipelines must be disposed with pipe.dispose() when finished to prevent memory leaks. See examples in Code Examples for cleanup patterns across different environments.
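One way to make disposal hard to forget is to wrap pipeline use in a try/finally block. The sketch below is illustrative only: `withPipeline` is a hypothetical helper, not part of Transformers.js, and it assumes only that the pipeline object exposes the `dispose()` method described above.

```javascript
// Hypothetical helper (not a Transformers.js API): creates a pipeline,
// runs `use` with it, and always disposes it afterwards, even if `use` throws.
async function withPipeline(createPipeline, use) {
  const pipe = await createPipeline();
  try {
    return await use(pipe);
  } finally {
    await pipe.dispose();
  }
}

// Intended usage with the real library (assumes `pipeline` has been imported):
// const result = await withPipeline(
//   () => pipeline('sentiment-analysis'),
//   (pipe) => pipe('I love transformers!'),
// );
```

This pattern suits one-off tasks; for long-lived servers it is usually cheaper to keep a pipeline loaded and dispose it at shutdown.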
You can specify a custom model as the second argument:
const pipe = await pipeline(
'sentiment-analysis',
'Xenova/bert-base-multilingual-uncased-sentiment'
);
Finding Models:
Browse available Transformers.js models on the Hugging Face Hub, filtering by task with the pipeline_tag parameter.
Tip: Filter by task type, sort by trending/downloads, and check model cards for performance metrics and usage examples.
Choose where to run the model:
// Run on CPU (default for WASM)
const pipe = await pipeline('sentiment-analysis', 'model-id');
// Run on GPU (WebGPU - experimental)
const pipe = await pipeline('sentiment-analysis', 'model-id', {
device: 'webgpu',
});
Control model precision vs. performance:
// Use quantized model (faster, smaller)
const pipe = await pipeline('sentiment-analysis', 'model-id', {
dtype: 'q4', // Options: 'fp32', 'fp16', 'q8', 'q4'
});
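To get a feel for the size trade-off between the dtype options, a back-of-the-envelope estimate can help. The sketch below assumes roughly 4, 2, 1, and 0.5 bytes per parameter for fp32, fp16, q8, and q4; real model files will differ because quantized exports often keep some layers at higher precision and include extra metadata. `estimateWeightsMB` is a hypothetical helper, not a library function.

```javascript
// Approximate bytes per parameter for each dtype (weights only; an assumption
// for rough sizing, not an exact description of any particular model file).
const BYTES_PER_PARAM = { fp32: 4, fp16: 2, q8: 1, q4: 0.5 };

// Hypothetical helper: estimates weight size in decimal megabytes.
function estimateWeightsMB(paramCount, dtype) {
  const bytes = BYTES_PER_PARAM[dtype];
  if (bytes === undefined) {
    throw new Error(`Unknown dtype: ${dtype}`);
  }
  return (paramCount * bytes) / 1e6;
}

// Example: a model with about 270 million parameters.
for (const dtype of Object.keys(BYTES_PER_PARAM)) {
  console.log(dtype, estimateWeightsMB(270e6, dtype).toFixed(0), 'MB (approx.)');
}
```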
Note: All examples below show basic usage.
const classifier = await pipeline('text-classification');
const result = await classifier('This movie was amazing!');
const ner = await pipeline('token-classification');
const entities = await ner('My name is John and I live in New York.');
const qa = await pipeline('question-answering');
const answer = await qa({
question: 'What is the capital of France?',
context: 'Paris is the capital and largest city of France.'
});
const generator = await pipeline('text-generation', 'onnx-community/gemma-3-270m-it-ONNX');
const text = await generator('Once upon a time', {
max_new_tokens: 100,
temperature: 0.7
});
For streaming and chat, see the Text Generation Guide, which covers topics including TextStreamer.
const translator = await pipeline('translation', 'Xenova/nllb-200-distilled-600M');
const output = await translator('Hello, how are you?', {
src_lang: 'eng_Latn',
tgt_lang: 'fra_Latn'
});
const summarizer = await pipeline('summarization');
const summary = await summarizer(longText, {
max_length: 100,
min_length: 30
});
const classifier = await pipeline('zero-shot-classification');
const result = await classifier('This is a story about sports.', ['politics', 'sports', 'technology']);
const classifier = await pipeline('image-classification');
const result = await classifier('https://example.com/image.jpg');
// Or with local file
const result = await classifier(imageUrl);
const detector = await pipeline('object-detection');
const objects = await detector('https://example.com/image.jpg');
// Returns: [{ label: 'person', score: 0.95, box: { xmin, ymin, xmax, ymax } }, ...]
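Detection results in the shape shown above are plain objects, so they can be post-processed with ordinary array methods. The following sketch filters by confidence and computes box dimensions; the sample values and the `filterDetections` helper are made up for illustration.

```javascript
// Sample data in the documented result shape; the numbers are invented.
const sampleDetections = [
  { label: 'person', score: 0.95, box: { xmin: 10, ymin: 20, xmax: 110, ymax: 220 } },
  { label: 'dog', score: 0.4, box: { xmin: 0, ymin: 0, xmax: 50, ymax: 40 } },
];

// Hypothetical helper: keeps detections at or above `minScore` and adds
// width, height, and area derived from the bounding box.
function filterDetections(objects, minScore) {
  return objects
    .filter((o) => o.score >= minScore)
    .map(({ label, score, box }) => {
      const width = box.xmax - box.xmin;
      const height = box.ymax - box.ymin;
      return { label, score, width, height, area: width * height };
    });
}

console.log(filterDetections(sampleDetections, 0.5));
```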
const segmenter = await pipeline('image-segmentation');
const segments = await segmenter('https://example.com/image.jpg');
const depthEstimator = await pipeline('depth-estimation');
const depth = await depthEstimator('https://example.com/image.jpg');
const classifier = await pipeline('zero-shot-image-classification');
const result = await classifier('image.jpg', ['cat', 'dog', 'bird']);
const transcriber = await pipeline('automatic-speech-recognition');
const result = await transcriber('audio.wav');
// Returns: { text: 'transcribed text here' }
const classifier = await pipeline('audio-classification');
const result = await classifier('audio.wav');
const synthesizer = await pipeline('text-to-speech', 'Xenova/speecht5_tts');
const audio = await synthesizer('Hello, this is a test.', {
speaker_embeddings: speakerEmbeddings
});
const captioner = await pipeline('image-to-text');
const caption = await captioner('image.jpg');
const docQA = await pipeline('document-question-answering');
const answer = await docQA('document-image.jpg', 'What is the total amount?');
const detector = await pipeline('zero-shot-object-detection');
const objects = await detector('image.jpg', ['person', 'car', 'tree']);
const extractor = await pipeline('feature-extraction');
const embeddings = await extractor('This is a sentence to embed.');
// Returns: tensor of shape [1, sequence_length, hidden_size]
// For sentence embeddings (mean pooling)
const extractor = await pipeline('feature-extraction', 'onnx-community/all-MiniLM-L6-v2-ONNX');
const embeddings = await extractor('Text to embed', { pooling: 'mean', normalize: true });
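Sentence embeddings are typically compared with cosine similarity. The sketch below works on plain number arrays, so it assumes the tensor returned by the extractor has already been converted (for example with its `tolist()` method). With `normalize: true` the vectors have unit length, in which case the cosine similarity reduces to the dot product.

```javascript
// Dot product of two equal-length numeric arrays.
function dot(a, b) {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += a[i] * b[i];
  }
  return sum;
}

// Cosine similarity; also works for vectors that are not normalized.
function cosineSimilarity(a, b) {
  return dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)));
}

// Toy 2-dimensional vectors standing in for real embeddings.
console.log(cosineSimilarity([0.6, 0.8], [0.8, 0.6]));
```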
Discover compatible Transformers.js models on Hugging Face Hub:
Base URL (all models):
https://huggingface.co/models?library=transformers.js&sort=trending
Filter by task using the pipeline_tag parameter:
Sort options:
- &sort=trending - Most popular recently
- &sort=downloads - Most downloaded overall
- &sort=likes - Most liked by community
- &sort=modified - Recently updated

Consider these factors when selecting a model:
1. Model Size
2. Quantization: Models are often available in different quantization levels:
   - fp32 - Full precision (largest, most accurate)
   - fp16 - Half precision (smaller, still accurate)
   - q8 - 8-bit quantized (much smaller, slight accuracy loss)
   - q4 - 4-bit quantized (smallest, noticeable accuracy loss)
3. Task Compatibility: Check the model card for:
4. Performance Metrics: Model cards typically show:
// 1. Visit: https://huggingface.co/models?pipeline_tag=text-generation&library=transformers.js&sort=trending
// 2. Browse and select a model (e.g., onnx-community/gemma-3-270m-it-ONNX)
// 3. Check model card for:
// - Model size: ~270M parameters
// - Quantization: q4 available
// - Language: English
// - Use case: Instruction-following chat
// 4. Use the model:
import { pipeline } from '@huggingface/transformers';
const generator = await pipeline(
'text-generation',
'onnx-community/gemma-3-270m-it-ONNX',
{ dtype: 'q4' } // Use quantized version for faster inference
);
const output = await generator('Explain quantum computing in simple terms.', {
max_new_tokens: 100
});
await generator.dispose();
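The Hub search URL used in step 1 is just the base models URL plus the `pipeline_tag`, `library`, and `sort` query parameters. A small helper can assemble it; `buildHubSearchUrl` is a hypothetical name used here for illustration.

```javascript
// Hypothetical helper: builds a Hugging Face Hub search URL for
// Transformers.js-compatible models, optionally filtered by task.
function buildHubSearchUrl({ task, sort = 'trending' } = {}) {
  const params = new URLSearchParams();
  if (task) {
    params.set('pipeline_tag', task);
  }
  params.set('library', 'transformers.js');
  params.set('sort', sort);
  return `https://huggingface.co/models?${params.toString()}`;
}

console.log(buildHubSearchUrl({ task: 'text-generation' }));
console.log(buildHubSearchUrl({ sort: 'downloads' }));
```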
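- Confirm that ONNX weights are available (onnx folder in model repo)
- Look for models published by Xenova (Transformers.js maintainer) or onnx-community
- Pin a specific model revision when needed:
const pipe = await pipeline('task', 'model-id', { revision: 'abc123' });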
The env object provides comprehensive control over Transformers.js execution, caching, and model loading.
Quick Overview:
import { env } from '@huggingface/transformers';
// View version
console.log(env.version); // e.g., '3.8.1'
// Common settings
env.allowRemoteModels = true; // Load from Hugging Face Hub
env.allowLocalModels = false; // Load from file system
env.localModelPath = '/models/'; // Local model directory
env.useFSCache = true; // Cache models on disk (Node.js)
env.useBrowserCache = true; // Cache models in browser
env.cacheDir = './.cache'; // Cache directory location
Configuration Patterns:
// Development: Fast iteration with remote models
env.allowRemoteModels = true;
env.useFSCache = true;
// Production: Local models only
env.allowRemoteModels = false;
env.allowLocalModels = true;
env.localModelPath = '/app/models/';
// Custom CDN
env.remoteHost = 'https://cdn.example.com/models';
// Disable caching (testing)
env.useFSCache = false;
env.useBrowserCache = false;
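The development and production patterns above can be captured as named presets and applied in one place. This is a sketch under assumptions: `ENV_PRESETS` and `applyEnvPreset` are hypothetical names, and the function only assigns the properties already shown above onto whatever object it is given.

```javascript
// Presets mirroring the development and production patterns shown above.
const ENV_PRESETS = {
  development: { allowRemoteModels: true, useFSCache: true },
  production: {
    allowRemoteModels: false,
    allowLocalModels: true,
    localModelPath: '/app/models/',
  },
};

// Hypothetical helper: copies a preset's settings onto an env-like object.
function applyEnvPreset(env, name) {
  const preset = ENV_PRESETS[name];
  if (!preset) {
    throw new Error(`Unknown preset: ${name}`);
  }
  return Object.assign(env, preset);
}

// Intended usage with the real library (assumes `env` has been imported):
// applyEnvPreset(env, process.env.NODE_ENV === 'production' ? 'production' : 'development');
```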
For complete documentation on all configuration options, caching strategies, cache management, pre-downloading models, and more, see:
import { AutoTokenizer, AutoModel } from '@huggingface/transformers';
// Load tokenizer and model separately for more control
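// Use a repo that ships ONNX weights (here, the Xenova conversion of bert-base-uncased)
const tokenizer = await AutoTokenizer.from_pretrained('Xenova/bert-base-uncased');
const model = await AutoModel.from_pretrained('Xenova/bert-base-uncased');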
// Tokenize input
const inputs = await tokenizer('Hello world!');
// Run model
const outputs = await model(inputs);
const classifier = await pipeline('sentiment-analysis');
// Process multiple texts
const results = await classifier([
'I love this!',
'This is terrible.',
'It was okay.'
]);
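For large input lists, passing everything in a single call may use more memory than desired. One option is to split the inputs into fixed-size chunks, as sketched below. `classifyInBatches` is a hypothetical helper, and it assumes the classifier returns one result per input when given an array, as in the example above.

```javascript
// Hypothetical helper: runs a classifier over `texts` in chunks of `batchSize`
// and concatenates the per-chunk results in input order.
async function classifyInBatches(classifier, texts, batchSize = 8) {
  if (batchSize < 1) {
    throw new Error('batchSize must be at least 1');
  }
  const results = [];
  for (let i = 0; i < texts.length; i += batchSize) {
    const batch = texts.slice(i, i + batchSize);
    const output = await classifier(batch);
    results.push(...output);
  }
  return results;
}

// Intended usage with the real library (assumes `classifier` is a loaded pipeline):
// const results = await classifyInBatches(classifier, manyTexts, 16);
```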
WebGPU provides GPU acceleration in browsers:
const pipe = await pipeline('text-generation', 'onnx-community/gemma-3-270m-it-ONNX', {
device: 'webgpu',
dtype: 'fp32'
});
Note: WebGPU is experimental. Check browser compatibility and file issues if problems occur.
Default browser execution uses WASM:
// Optimized for browsers with quantization
const pipe = await pipeline('sentiment-analysis', 'model-id', {
dtype: 'q8' // or 'q4' for even smaller size
});
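Choosing between the WebGPU and WASM configurations above can be done with a simple feature check. The sketch below tests for `navigator.gpu`, which is how browsers expose WebGPU; `pickPipelineOptions` is a hypothetical helper, and the dtype choices simply mirror the two examples above. Note that the presence of `navigator.gpu` does not guarantee that a usable GPU adapter exists, so a fallback on load failure may still be worthwhile.

```javascript
// Hypothetical helper: returns pipeline options based on WebGPU availability.
// `nav` defaults to the global navigator (which may be undefined outside browsers).
function pickPipelineOptions(nav = globalThis.navigator) {
  const hasWebGPU = Boolean(nav && nav.gpu);
  return hasWebGPU
    ? { device: 'webgpu', dtype: 'fp32' } // mirrors the WebGPU example above
    : { dtype: 'q8' }; // default (WASM) execution with a quantized model
}

// Intended usage with the real library (assumes `pipeline` has been imported):
// const pipe = await pipeline('sentiment-analysis', 'model-id', pickPipelineOptions());
console.log(pickPipelineOptions());
```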
Models can be large (ranging from a few MB to several GB) and consist of multiple files. Track download progress by passing a callback to the pipeline() function:
import { pipeline } from '@huggingface/transformers';
// Track progress for each file
const fileProgress = {};
function onProgress(info) {
console.log(`${info.status}: ${info.file}`);
if (info.status === 'progress') {
fileProgress[info.file] = info.progress;
console.log(`${info.file}: ${info.progress.toFixed(1)}%`);
}
if (info.status === 'done') {
console.log(`✓ ${info.file} complete`);
}
}
// Pass callback to pipeline
const classifier = await pipeline('sentiment-analysis', null, {
progress_callback: onProgress
});
Progress Info Properties:
interface ProgressInfo {
status: 'initiate' | 'download' | 'progress' | 'done' | 'ready';
name: string; // Model id or path
file: string; // File being processed
progress?: number; // Percentage (0-100, only for 'progress' status)
loaded?: number; // Bytes downloaded (only for 'progress' status)
total?: number; // Total bytes (only for 'progress' status)
}
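Because each callback reports on a single file, an overall percentage has to be aggregated across files. The sketch below does this using the `loaded` and `total` fields described above. `createProgressTracker` is a hypothetical helper, and the result only reflects files that have reported progress so far, so the overall figure can dip when a new file starts downloading.

```javascript
// Hypothetical helper: aggregates per-file progress events into one percentage.
function createProgressTracker() {
  const files = new Map(); // file name -> { loaded, total }
  return {
    // Feed every ProgressInfo event here; returns overall percent (0-100).
    update(info) {
      if (info.status === 'progress' && info.total) {
        files.set(info.file, { loaded: info.loaded ?? 0, total: info.total });
      }
      let loaded = 0;
      let total = 0;
      for (const f of files.values()) {
        loaded += f.loaded;
        total += f.total;
      }
      return total === 0 ? 0 : (loaded / total) * 100;
    },
  };
}

// Intended usage with the real library (assumes `pipeline` has been imported):
// const tracker = createProgressTracker();
// await pipeline('sentiment-analysis', null, {
//   progress_callback: (info) => console.log(`${tracker.update(info).toFixed(1)}%`),
// });
```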
For complete examples including browser UIs, React components, CLI progress bars, and retry logic, see:
→ Pipeline Options - Progress Callback
try {
const pipe = await pipeline('sentiment-analysis', 'model-id');
const result = await pipe('text to analyze');
} catch (error) {
if (error.message.includes('fetch')) {
console.error('Model download failed. Check internet connection.');
} else if (error.message.includes('ONNX')) {
console.error('Model execution failed. Check model compatibility.');
} else {
console.error('Unknown error:', error);
}
}
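Transient download failures can often be handled by retrying with a growing delay. The following is a minimal sketch, not a library feature: `loadWithRetry` is a hypothetical helper, and it reuses the same heuristic as the example above (an error message containing 'fetch') to decide whether an error looks network-related.

```javascript
// Hypothetical helper: retries `load` with exponential backoff, but only for
// errors that look network-related; other errors are rethrown immediately.
async function loadWithRetry(load, { retries = 3, baseDelayMs = 500 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await load();
    } catch (error) {
      lastError = error;
      const looksLikeNetworkError = String(error && error.message).includes('fetch');
      if (!looksLikeNetworkError || attempt === retries) {
        break;
      }
      // Wait baseDelayMs, then 2x, 4x, ... before the next attempt.
      const delay = baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}

// Intended usage with the real library (assumes `pipeline` has been imported):
// const pipe = await loadWithRetry(() => pipeline('sentiment-analysis', 'model-id'));
```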
- Use q8 or q4 for faster inference
- Limit max_new_tokens to avoid memory issues
- Call pipe.dispose() when done to free memory

IMPORTANT: Always call pipe.dispose() when finished to prevent memory leaks.
const pipe = await pipeline('sentiment-analysis');
const result = await pipe('Great product!');
await pipe.dispose(); // ✓ Free memory (100MB - several GB per model)
When to dispose:
Models consume significant memory and hold GPU/CPU resources. Disposal is critical for browser memory limits and server stability.
For detailed patterns (React cleanup, servers, browser), see Code Examples
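In a long-running process, a common arrangement is to load each pipeline once, reuse it across requests, and dispose everything at shutdown. The sketch below illustrates that idea with a small cache. `createPipelineCache` is a hypothetical helper, and it keeps things simple: a failed load stays cached as a rejected promise, which a real implementation would probably want to evict.

```javascript
// Hypothetical helper: caches pipelines by task and model so each is loaded
// once, and disposes all of them on request (for example at shutdown).
function createPipelineCache(create) {
  const cache = new Map(); // key -> Promise resolving to a pipeline
  return {
    get(task, model) {
      const key = `${task}::${model ?? 'default'}`;
      if (!cache.has(key)) {
        cache.set(key, create(task, model));
      }
      return cache.get(key);
    },
    async disposeAll() {
      const pipes = await Promise.all(cache.values());
      cache.clear();
      await Promise.all(pipes.map((pipe) => pipe.dispose()));
    },
  };
}

// Intended usage with the real library (assumes `pipeline` has been imported):
// const pipelines = createPipelineCache((task, model) => pipeline(task, model));
// const classifier = await pipelines.get('sentiment-analysis');
// process.on('SIGTERM', () => pipelines.disposeAll());
```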
- Check for ONNX weights (onnx folder in model repo)
- Use a quantized model (dtype: 'q4')
- Reduce max_length
- Try dtype: 'fp16' if fp32 fails

- pipeline() with progress_callback, device, dtype, etc.
- env configuration for caching and model loading
- pipe.dispose() when done - critical for preventing memory leaks

| Task | Task ID |
|---|---|
| Text classification | text-classification or sentiment-analysis |
| Token classification | token-classification or ner |
| Question answering | question-answering |
| Fill mask | fill-mask |
| Summarization | summarization |
| Translation | translation |
| Text generation | text-generation |
| Text-to-text generation | text2text-generation |
| Zero-shot classification | zero-shot-classification |
| Image classification | image-classification |
| Image segmentation | image-segmentation |
| Object detection | object-detection |
| Depth estimation | depth-estimation |
| Image-to-image | image-to-image |
| Zero-shot image classification | zero-shot-image-classification |
| Zero-shot object detection | zero-shot-object-detection |
| Automatic speech recognition | automatic-speech-recognition |
| Audio classification | audio-classification |
| Text-to-speech | text-to-speech or text-to-audio |
| Image-to-text | image-to-text |
| Document question answering | document-question-answering |
| Feature extraction | feature-extraction |
| Sentence similarity | sentence-similarity |
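Several rows in the table list two interchangeable task IDs. If application code accepts task names from users or config files, it can be convenient to map them to a single form first. The sketch below does that; treating the first ID in each row as the canonical one is an assumption of this example, and `normalizeTask` is a hypothetical helper rather than a library function.

```javascript
// Alternate task IDs from the table above, mapped to the first ID in each row.
// Which ID counts as "canonical" is an assumption made for this sketch.
const TASK_ALIASES = {
  'sentiment-analysis': 'text-classification',
  ner: 'token-classification',
  'text-to-audio': 'text-to-speech',
};

// Hypothetical helper: returns a single task ID for any listed alias.
function normalizeTask(task) {
  return TASK_ALIASES[task] ?? task;
}

console.log(normalizeTask('ner'));
console.log(normalizeTask('summarization'));
```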
This skill enables you to integrate state-of-the-art machine learning capabilities directly into JavaScript applications without requiring separate ML servers or Python environments.