OpenCL SDK (Khronos Group) for cross-platform GPU/CPU parallel computing in C and C++. Use when writing OpenCL kernels, managing devices/contexts/queues, allocating and transferring buffers or images, building and executing programs, or using the C++ wrapper (opencl.hpp / cl::CommandQueue, cl::Buffer, cl::KernelFunctor). Covers OpenCL C API, C++ bindings, and SDK utility libraries (OpenCLUtils, OpenCLSDK).
- Version: v2025.07.23 (Khronos Group OpenCL-SDK)
- Language: C (OpenCL 1.0–3.0) / C++ (`opencl.hpp` wrapper)
- License: Apache-2.0
- Repo: https://github.com/KhronosGroup/OpenCL-SDK
OpenCL (Open Computing Language) is a framework for parallel programming across heterogeneous platforms — GPUs, CPUs, FPGAs, and DSPs — from a single API. The SDK bundles:
- OpenCL C API headers (`<CL/cl.h>`, `<CL/cl_ext.h>`)
- C++ bindings (`<CL/opencl.hpp>`)
- SDK utility libraries (`<CL/Utils/>`, `<CL/SDK/>`)

Kernel file `saxpy.cl`:
```c
__kernel void saxpy(float a, __global float *x, __global float *y) {
    int i = get_global_id(0);
    y[i] = fma(a, x[i], y[i]);
}
```
Host (C API):

```c
#include <CL/cl.h>

cl_int err;  // receives the status code of each create call
cl_platform_id plat;
cl_device_id dev;
clGetPlatformIDs(1, &plat, NULL);
clGetDeviceIDs(plat, CL_DEVICE_TYPE_DEFAULT, 1, &dev, NULL);
cl_context ctx = clCreateContext(NULL, 1, &dev, NULL, NULL, &err);
cl_command_queue q = clCreateCommandQueueWithProperties(ctx, dev, NULL, &err);
// ... load source, clCreateProgramWithSource, clBuildProgram,
// clCreateKernel, clSetKernelArg, clEnqueueNDRangeKernel,
// clEnqueueReadBuffer, clReleaseXxx ...
```
Host (C++ wrapper):

```cpp
#define CL_HPP_ENABLE_EXCEPTIONS
#define CL_HPP_TARGET_OPENCL_VERSION 200
#include <CL/opencl.hpp>

// source_string, N, a, buf_x, and buf_y are assumed to be defined elsewhere.
cl::Context ctx{CL_DEVICE_TYPE_DEFAULT};
cl::Device dev = ctx.getInfo<CL_CONTEXT_DEVICES>()[0];
cl::CommandQueue queue{ctx, dev};
cl::Program prog{ctx, source_string};
prog.build(dev);
// Typed functor: the template arguments mirror the kernel's parameter list.
auto saxpy = cl::KernelFunctor<cl_float, cl::Buffer, cl::Buffer>(prog, "saxpy");
saxpy(cl::EnqueueArgs{queue, cl::NDRange{N}}, a, buf_x, buf_y);
```
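To check what the `saxpy` kernel writes back, it can help to compare the device output against a plain CPU loop. The following is a minimal sketch, not part of the SDK: `saxpy_reference` and `all_close` are hypothetical helper names, and the tolerance is an assumption you would tune, since the kernel's `fma` rounds once while `a * x + y` on the host may round twice.

```c
#include <stddef.h>

/* Hypothetical helper (not an SDK function): CPU reference for the
   saxpy kernel, computing y[i] = a * x[i] + y[i] in place. */
void saxpy_reference(size_t n, float a, const float *x, float *y) {
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}

/* Hypothetical helper: returns 1 when every element of got is within
   tol of want, 0 otherwise. A NaN difference counts as a mismatch. */
int all_close(size_t n, const float *got, const float *want, float tol) {
    for (size_t i = 0; i < n; ++i) {
        float diff = got[i] - want[i];
        if (diff < 0.0f) diff = -diff;
        if (!(diff <= tol)) return 0;
    }
    return 1;
}
```

A typical use would be to run `saxpy_reference` on a host copy of the inputs, read the device buffer back with `clEnqueueReadBuffer`, and pass both arrays to `all_close`.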
- Kernel: a `__kernel` function compiled from OpenCL C source or SPIR-V
- Address spaces: `__global` (buffers), `__local` (shared), `__constant` (read-only), `__private` (per-item)

| Domain | Reference File | Key Functions / Types |
|---|---|---|
| Platform & Device | references/api-platform-device.md | clGetPlatformIDs, clGetDeviceIDs, clGetDeviceInfo, cl_util_get_device |
| Context & Queue | references/api-context-queue.md | clCreateContext, clCreateCommandQueueWithProperties, clFlush, clFinish |
| Memory Objects | references/api-memory.md | clCreateBuffer, clCreateImage, clEnqueueRead/WriteBuffer, clEnqueueMapBuffer, SVM |
| Programs & Kernels | references/api-program-kernel.md | clCreateProgramWithSource, clBuildProgram, clCreateKernel, clSetKernelArg |
| Execution & Events | references/api-execution.md | clEnqueueNDRangeKernel, clWaitForEvents, clSetEventCallback, profiling |
| C++ Wrapper | references/api-cpp-wrapper.md | cl::Context, cl::Buffer, cl::KernelFunctor, cl::EnqueueArgs, exceptions |
| Workflows | references/workflows.md | Quick-start, vector add, image blur, async events, binary caching, error handling |
See references/workflows.md for complete, runnable examples:
- C++ `KernelFunctor` pattern with RAII
- Image processing with `read_imageui` / `write_imageui`

Include `<CL/Utils/Utils.h>` (C) or `<CL/Utils/Utils.hpp>` (C++) and link `OpenCLUtils` / `OpenCLUtilsCpp`.
| Header | API |
|---|---|
| `<CL/Utils/Context.h>` | `cl_util_get_device`, `cl_util_get_context`, `cl_util_print_device_info` |
| `<CL/Utils/File.h>` | `cl_util_read_text_file`, `cl_util_read_exe_relative_text_file`, `cl_util_write_binaries` |
| `<CL/Utils/Error.h>` | `OCLERROR_RET`, `OCLERROR_PAR`, `MEM_CHECK` macros, `cl_util_print_error` |
| `<CL/Utils/Event.h>` | `cl_util_get_event_duration` |
| `<CL/Utils/Device.hpp>` | `cl::util::supports_extension`, `cl::util::supports_feature` |
SDK Library (samples only, not installed): <CL/SDK/CLI.h>, <CL/SDK/Random.h>, <CL/SDK/Image.h>.
Release everything: Every `clCreate*` call must be paired with the corresponding `clRelease*`. Leaked buffers and kernels keep holding device memory, and nothing reports the leak until a later allocation fails.
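Blocking vs. non-blocking transfers: `clEnqueueReadBuffer(..., CL_TRUE, ...)` blocks the host thread until the copy completes. Pass `CL_FALSE` and wait on the returned event to overlap transfers with other work. Always call `clFlush` on the queue that produced an event before waiting on that event from another queue or thread.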
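Local work-group size: In OpenCL 1.x it must evenly divide the global work size in each dimension (OpenCL 2.0+ devices may additionally support non-uniform work-groups). Use `clGetKernelWorkGroupInfo` to query `CL_KERNEL_WORK_GROUP_SIZE` for the maximum and `CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE` for the preferred alignment. Passing NULL lets the runtime choose (portable, not always optimal).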
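When the problem size is not a multiple of the chosen local size, a common approach is to pad the global size up to the next multiple. The sketch below shows only that arithmetic; `round_up_global_size` is a hypothetical helper name, not an OpenCL or SDK function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: smallest multiple of local that is >= n.
   local is the work-group size for one dimension and must be nonzero. */
size_t round_up_global_size(size_t n, size_t local) {
    assert(local > 0);
    return ((n + local - 1) / local) * local;
}
```

For example, 1000 elements with a local size of 64 give a global size of 1024. The padded work-items fall outside the buffers, so the kernel would then need an extra length argument and a guard such as `if (i < n)` before it touches memory.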
Build log on failure: When compilation fails, `clBuildProgram` returns `CL_BUILD_PROGRAM_FAILURE`. Always query `CL_PROGRAM_BUILD_LOG` with `clGetProgramBuildInfo` to get the compiler error message. The SDK's `cl_util_build_program` does this automatically.
Image format validation: Not all cl_image_format combinations are supported on every device. Call clGetSupportedImageFormats before creating images.
Event callbacks must not block: Callbacks registered via clSetEventCallback are invoked from a runtime thread. Never call clFinish or clWaitForEvents inside a callback.
C++ exceptions: Enable with #define CL_HPP_ENABLE_EXCEPTIONS before including <CL/opencl.hpp>. Without it, check cl_int error parameters manually.
OpenCL version targeting: Set `CL_HPP_TARGET_OPENCL_VERSION` (e.g., 300, 200, 120) to control which API surface is available in the C++ wrapper. OpenCL 2.0 deprecated `clCreateCommandQueue`; use `clCreateCommandQueueWithProperties` when targeting 2.0 or later.
SVM requires OpenCL 2.0+: Shared Virtual Memory (`clSVMAlloc`) is usable only when the device reports a nonzero `CL_DEVICE_SVM_CAPABILITIES`. Query it with `clGetDeviceInfo` before use.
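One way to gate 2.0+ features at run time is to parse the `CL_DEVICE_VERSION` string returned by `clGetDeviceInfo`, which the specification formats as `OpenCL <major>.<minor> <vendor-specific information>`. The sketch below covers only the parsing step and assumes the string has already been fetched; `parse_opencl_version` and `version_at_least` are hypothetical helper names, not SDK functions.

```c
#include <stdio.h>

/* Hypothetical helper: parse "OpenCL <major>.<minor> <vendor info>".
   Returns 1 on success, 0 if the string does not match that shape. */
int parse_opencl_version(const char *version, int *major, int *minor) {
    return sscanf(version, "OpenCL %d.%d", major, minor) == 2;
}

/* Hypothetical helper: 1 when the reported version is at least
   want_major.want_minor, 0 otherwise (including on parse failure). */
int version_at_least(const char *version, int want_major, int want_minor) {
    int major = 0, minor = 0;
    if (!parse_opencl_version(version, &major, &minor)) return 0;
    return major > want_major ||
           (major == want_major && minor >= want_minor);
}
```

A version check alone is not sufficient for optional features: on OpenCL 3.0 devices SVM may be absent, so the capability query is still needed. OpenCL 3.0 also exposes `CL_DEVICE_NUMERIC_VERSION`, which avoids string parsing where it is available.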