From game-porting-skills
Creates Metal 4 pipeline state objects (render, compute, mesh) via MTL4Compiler, covering function constants, async compilation, pipeline caching, and reflection.
npx claudepluginhub apple/game-porting-toolkit --plugin game-porting-skills
Slash command
/game-porting-skills:creating-metal4-shader-pipelines
Covers Metal 4 pipeline state creation when porting games: loading pre-compiled metallibs, building render/compute/mesh PSOs via `MTL4Compiler`, function constants, flexible PSOs, color-attachment mapping, async compilation, and pipeline caching.
For compiling HLSL/DXIL shaders to metallibs, see the compiling-with-metal-shaderconverter skill.
For compiling Metal shader language (MSL) source, see MSL Source Compilation below.
Read the relevant Metal 4 SDK header before writing pipeline code — the headers are the source of truth for property names, types, and method signatures.
$(xcrun --show-sdk-path)/System/Library/Frameworks/Metal.framework/Headers/: focus on MTL4Compiler.h, MTL4RenderPipeline.h, MTL4ComputePipeline.h, MTL4MeshRenderPipeline.h, MTL4TileRenderPipeline.h, MTL4LinkingDescriptor.h

Metal 3 PSOs work on Metal 4 encoders (and vice versa). This enables incremental porting: pipeline creation can be migrated to MTL4Compiler independently of encoder migration.
Cross-API equivalents (metallib → PSO):
| Concept | D3D12 | Vulkan | Metal 4 |
|---|---|---|---|
| Bytecode → loaded library | embedded D3D12_SHADER_BYTECODE | vkCreateShaderModule(SPIR-V) | -[device newLibraryWithURL:] (load metallib) |
| Function reference for PSO | bytecode + entry-point name | VkPipelineShaderStageCreateInfo | MTL4LibraryFunctionDescriptor (library + name) |
| Render PSO create | CreateGraphicsPipelineState(D3D12_GRAPHICS_PIPELINE_STATE_DESC) | vkCreateGraphicsPipelines(VkGraphicsPipelineCreateInfo) | -[MTL4Compiler newRenderPipelineStateWithDescriptor:](MTL4RenderPipelineDescriptor) |
| Compute PSO create | CreateComputePipelineState(D3D12_COMPUTE_PIPELINE_STATE_DESC) | vkCreateComputePipelines(VkComputePipelineCreateInfo) | -[MTL4Compiler newComputePipelineStateWithDescriptor:](MTL4ComputePipelineDescriptor) |
| Mesh-shader PSO create | ID3D12Device2::CreatePipelineState (stream w/ MS subobjects) | vkCreateGraphicsPipelines (mesh stages) | render selector above — pass MTL4MeshRenderPipelineDescriptor |
| Specialization | none (ship compiled permutations) | VkSpecializationInfo | MTL4SpecializedFunctionDescriptor (function constants); flexible PSOs for format/blend |
| Pipeline cache (disk) | ID3D12PipelineLibrary | VkPipelineCache | MTL4Archive + MTL4PipelineDataSetSerializer |
Notes:
- The render selector -[MTL4Compiler newRenderPipelineStateWithDescriptor:] is polymorphic: render, mesh, and tile pipeline descriptors all derive from MTL4PipelineDescriptor and use the same selector.
- Asynchronous variants take completionHandler: and return an MTL4CompilerTask.
- For producing metallibs from HLSL/DXIL, see the compiling-with-metal-shaderconverter skill.

Metal 4 uses MTL4Compiler for explicit pipeline compilation, replacing Metal 3's [device newRenderPipelineStateWithDescriptor:error:]. Pipelines take function descriptors (MTL4FunctionDescriptor subclasses) rather than MTLFunction objects, support flexible (unspecialized) states for reduced compilation time, and support color-attachment mapping for PSO reuse across render pass configurations.
Synchronous (blocks the caller):
NSError* error = nil;
// 1. Create compiler
MTL4CompilerDescriptor* compilerDescriptor = [[MTL4CompilerDescriptor alloc] init];
id<MTL4Compiler> compiler = [device newCompilerWithDescriptor:compilerDescriptor error:&error];
// 2. Load library (most common: from pre-compiled metallib)
id<MTLLibrary> library = [device newLibraryWithURL:metallibURL error:&error];
// 3. Create function descriptors (NOT MTLFunction), one per stage
MTL4LibraryFunctionDescriptor* vertexFunction = [[MTL4LibraryFunctionDescriptor alloc] init];
vertexFunction.library = library;
vertexFunction.name = @"vertexShader";
MTL4LibraryFunctionDescriptor* fragmentFunction = [[MTL4LibraryFunctionDescriptor alloc] init];
fragmentFunction.library = library;
fragmentFunction.name = @"fragmentShader";
// 4. Create pipeline
MTL4RenderPipelineDescriptor* pipelineDescriptor = [[MTL4RenderPipelineDescriptor alloc] init];
pipelineDescriptor.vertexFunctionDescriptor = vertexFunction;
pipelineDescriptor.fragmentFunctionDescriptor = fragmentFunction;
pipelineDescriptor.colorAttachments[0].pixelFormat = MTLPixelFormatBGRA8Unorm; // width/height inferred from render pass
id<MTLRenderPipelineState> pipelineState = [compiler newRenderPipelineStateWithDescriptor:pipelineDescriptor compilerTaskOptions:nil error:&error];
// A nil result means compilation failed; `error` describes why
if (!pipelineState) { NSLog(@"Render pipeline creation failed: %@", error); }
Asynchronous (returns immediately; preferred for AAA games and large projects):
MTL4CompilerTaskOptions* taskOptions = [[MTL4CompilerTaskOptions alloc] init];
id<MTL4CompilerTask> task = [compiler newRenderPipelineStateWithDescriptor:pipelineDescriptor
compilerTaskOptions:taskOptions
completionHandler:^(id<MTLRenderPipelineState> pipelineState, NSError* error) {
if (pipelineState) { /* cache PSO */ }
}];
Multithreaded by default; configure QoS via MTL4CompilerTaskOptions.
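When many pipelines are compiled asynchronously, the caller usually needs to know when the whole batch has finished. The following is a minimal sketch of one way to do that with a GCD dispatch group. The helper name `CompilePipelinesAsync` and its `done` callback are hypothetical, and the sketch assumes completion handlers may fire concurrently on arbitrary threads, so check that against MTL4Compiler.h.

```objc
#import <Foundation/Foundation.h>
#import <Metal/Metal.h>

// Hypothetical helper: starts one async compile per descriptor and calls
// `done` on the main queue after every completion handler has run.
static void CompilePipelinesAsync(id<MTL4Compiler> compiler,
                                  NSArray<MTL4RenderPipelineDescriptor *> *descriptors,
                                  void (^done)(NSArray<id<MTLRenderPipelineState>> *pipelines))
{
    dispatch_group_t group = dispatch_group_create();
    NSMutableArray<id<MTLRenderPipelineState>> *pipelines = [NSMutableArray array];
    MTL4CompilerTaskOptions *taskOptions = [[MTL4CompilerTaskOptions alloc] init];

    for (MTL4RenderPipelineDescriptor *descriptor in descriptors) {
        dispatch_group_enter(group);
        [compiler newRenderPipelineStateWithDescriptor:descriptor
                                   compilerTaskOptions:taskOptions
                                     completionHandler:^(id<MTLRenderPipelineState> pipelineState, NSError *error) {
            if (pipelineState) {
                // Assumption: handlers can run concurrently, so guard the shared array.
                @synchronized (pipelines) {
                    [pipelines addObject:pipelineState];
                }
            } else {
                NSLog(@"Pipeline compilation failed: %@", error);
            }
            dispatch_group_leave(group);
        }];
    }

    // Runs once every enter has been balanced by a leave.
    dispatch_group_notify(group, dispatch_get_main_queue(), ^{
        done([pipelines copy]);
    });
}
```

The resulting array is in completion order, not submission order. If the caller needs to match pipelines back to descriptors, a dictionary keyed by a caller-chosen identifier is a better fit than an array.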
For native MSL, use MTL4Compiler:
MTL4LibraryDescriptor* libraryDescriptor = [[MTL4LibraryDescriptor alloc] init];
libraryDescriptor.source = mslSourceString;
id<MTLLibrary> library = [compiler newLibraryWithDescriptor:libraryDescriptor error:&error];
Offline alternative: xcrun metal.
| Type | Use Case |
|---|---|
| MTL4LibraryFunctionDescriptor | Standard: library + function name |
| MTL4SpecializedFunctionDescriptor | Function constants specialization |
| MTL4BinaryFunctionDescriptor | Pre-compiled binary functions for binary linking (see below) |
See MTL4RenderPipeline.h for the full property list (pipeline functions, color attachments, vertex input, rasterization, features, linking).
Options and reflection. MTL4PipelineOptions configures reflection capture and validation at compile time:
MTL4PipelineOptions* pipelineOptions = [[MTL4PipelineOptions alloc] init];
pipelineOptions.shaderReflection = MTL4ShaderReflectionBindingInfo | MTL4ShaderReflectionBufferTypeInfo;
pipelineOptions.shaderValidation = MTLShaderValidationEnabled;
pipelineDescriptor.options = pipelineOptions;
id<MTLRenderPipelineState> pipelineState = [compiler newRenderPipelineStateWithDescriptor:pipelineDescriptor
compilerTaskOptions:nil
error:&error];
When debugging binding mismatches, use reflection to inspect pipeline bindings at runtime.
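As a rough sketch of that inspection, the helper below logs the vertex and fragment bindings of a pipeline. It assumes the pipeline state exposes a `reflection` property that is non-nil only when the pipeline was built with reflection options. That property is an assumption to verify in the SDK headers. The helper names are hypothetical.

```objc
#import <Foundation/Foundation.h>
#import <Metal/Metal.h>

// Logs each binding of one stage: name, index, binding type, and whether
// the shader actually uses it.
static void LogBindings(NSString *stage, NSArray<id<MTLBinding>> *bindings)
{
    for (id<MTLBinding> binding in bindings) {
        NSLog(@"%@ binding '%@': index %lu, type %ld, used %d",
              stage, binding.name, (unsigned long)binding.index,
              (long)binding.type, (int)[binding isUsed]);
    }
}

// Hypothetical helper: dumps the bindings of a render pipeline.
static void DumpRenderPipelineBindings(id<MTLRenderPipelineState> pipelineState)
{
    // ASSUMPTION: `reflection` exists on the pipeline state and is nil when
    // no shaderReflection options were requested at compile time.
    MTLRenderPipelineReflection *reflection = pipelineState.reflection;
    if (!reflection) {
        NSLog(@"No reflection data; was shaderReflection set on the pipeline options?");
        return;
    }
    LogBindings(@"vertex", reflection.vertexBindings);
    LogBindings(@"fragment", reflection.fragmentBindings);
}
```

Comparing the logged indices with the slots the renderer writes into its argument tables is usually enough to spot an off-by-one binding or a resource the shader no longer uses.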
See MTL4MeshRenderPipeline.h. MTL4MeshRenderPipelineDescriptor adds object/mesh function descriptors and per-threadgroup limits on top of the render pipeline base.
See MTL4ComputePipeline.h. Set threadGroupSizeIsMultipleOfThreadExecutionWidth = YES only when the threadgroup size is guaranteed to be a multiple of execution width — this enables SIMD optimizations.
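A compute pipeline can be sketched along the same lines as the render path. The example below assumes the descriptor property is named `computeFunctionDescriptor` and that the compiler method mirrors the render one (`newComputePipelineStateWithDescriptor:compilerTaskOptions:error:`). Confirm both in MTL4ComputePipeline.h and MTL4Compiler.h. The kernel name `computeKernel` and the helper names are hypothetical.

```objc
#import <Foundation/Foundation.h>
#import <Metal/Metal.h>

// Hypothetical helper: builds a compute PSO from a function in `library`.
static id<MTLComputePipelineState> MakeComputePipeline(id<MTL4Compiler> compiler,
                                                       id<MTLLibrary> library,
                                                       NSError **error)
{
    MTL4LibraryFunctionDescriptor *kernelFunction = [[MTL4LibraryFunctionDescriptor alloc] init];
    kernelFunction.library = library;
    kernelFunction.name = @"computeKernel"; // hypothetical entry point

    MTL4ComputePipelineDescriptor *descriptor = [[MTL4ComputePipelineDescriptor alloc] init];
    descriptor.computeFunctionDescriptor = kernelFunction; // assumed property name
    // Only valid because ThreadgroupSizeForPipeline below always returns a
    // multiple of the execution width.
    descriptor.threadGroupSizeIsMultipleOfThreadExecutionWidth = YES;

    return [compiler newComputePipelineStateWithDescriptor:descriptor
                                       compilerTaskOptions:nil
                                                     error:error];
}

// Picks the largest 1D threadgroup size that is a multiple of the
// execution width and does not exceed the pipeline's limit.
static MTLSize ThreadgroupSizeForPipeline(id<MTLComputePipelineState> pipeline)
{
    NSUInteger width = pipeline.threadExecutionWidth;
    NSUInteger maxThreads = pipeline.maxTotalThreadsPerThreadgroup;
    NSUInteger threads = (maxThreads / width) * width;
    return MTLSizeMake(threads, 1, 1);
}
```

Deriving the dispatch size from the pipeline, instead of hard-coding it, keeps the promise made by the flag true across GPUs with different execution widths.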
Metal 4 offers three ways to make one PSO serve many cases: function constants (compile-time specialization on values), flexible pipeline states (deferred format/blend selection), and color-attachment mapping (output index remapping).
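Function constants are Metal's mechanism for compile-time shader specialization: branches that depend only on a function constant are resolved when the pipeline is compiled, so they cost nothing at runtime.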
MTLFunctionConstantValues* functionConstants = [[MTLFunctionConstantValues alloc] init];
BOOL enableLighting = YES;
[functionConstants setConstantValue:&enableLighting type:MTLDataTypeBool atIndex:0];
MTL4LibraryFunctionDescriptor* baseFunction = [[MTL4LibraryFunctionDescriptor alloc] init];
baseFunction.library = library;
baseFunction.name = @"fragmentShader";
MTL4SpecializedFunctionDescriptor* specializedFunction = [[MTL4SpecializedFunctionDescriptor alloc] init];
specializedFunction.functionDescriptor = baseFunction;
specializedFunction.constantValues = functionConstants;
// specializedFunction.specializedName = @"optimizedName"; // optional — names the specialized variant
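To show how the specialized descriptor is consumed, here is a small sketch that builds one fragment variant per value of the lighting constant and plugs it into a render pipeline descriptor. The helper names and the variant names are hypothetical. The MSL declaration in the comment is an assumption about what the shader side looks like.

```objc
#import <Foundation/Foundation.h>
#import <Metal/Metal.h>

// Assumed MSL side (program scope):
//   constant bool enableLighting [[function_constant(0)]];

// Hypothetical helper: returns a descriptor for "fragmentShader"
// specialized with the given value of function constant 0.
static MTL4SpecializedFunctionDescriptor *MakeLightingVariant(id<MTLLibrary> library,
                                                              BOOL enableLighting)
{
    MTLFunctionConstantValues *values = [[MTLFunctionConstantValues alloc] init];
    bool flag = enableLighting ? true : false; // one-byte value for MTLDataTypeBool
    [values setConstantValue:&flag type:MTLDataTypeBool atIndex:0];

    MTL4LibraryFunctionDescriptor *baseFunction = [[MTL4LibraryFunctionDescriptor alloc] init];
    baseFunction.library = library;
    baseFunction.name = @"fragmentShader";

    MTL4SpecializedFunctionDescriptor *variant = [[MTL4SpecializedFunctionDescriptor alloc] init];
    variant.functionDescriptor = baseFunction;
    variant.constantValues = values;
    variant.specializedName = enableLighting ? @"fragmentShader_lit" : @"fragmentShader_unlit";
    return variant;
}

// Hypothetical helper: selects the variant for a render pipeline descriptor.
static void UseLightingVariant(MTL4RenderPipelineDescriptor *pipelineDescriptor,
                               id<MTLLibrary> library,
                               BOOL enableLighting)
{
    pipelineDescriptor.fragmentFunctionDescriptor = MakeLightingVariant(library, enableLighting);
}
```

Each distinct set of constant values produces a separate pipeline, so the number of variants grows with every constant added. Keeping the set of constants small keeps compile time and cache size manageable.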
Key considerations:
- Constant indices and types must match the shader's [[function_constant(N)]] declarations.
- Declare the constant at program scope in MSL, for example constant bool flag [[function_constant(0)]]; branches on it are eliminated at pipeline creation time.

Flexible pipeline states let you compile once at launch without committing to pixel format or blend state, then specialize at runtime without recompiling shader code. Use this when the same shader must work with many pixel format or blend state combinations, which is common in engines with configurable render targets, multiple output formats, or runtime-variable blend modes. If the pipeline configuration is known and fixed, full compilation is simpler and about as fast at draw time.
// 1. Compile flexible PSO at launch
pipelineDescriptor.colorAttachments[0].pixelFormat = MTLPixelFormatUnspecialized;
pipelineDescriptor.colorAttachments[0].blendingState = MTL4BlendStateUnspecialized;
id<MTLRenderPipelineState> flexiblePipelineState = [compiler newRenderPipelineStateWithDescriptor:pipelineDescriptor
compilerTaskOptions:nil
error:&error];
// 2. Specialize at runtime (fast — no shader recompile)
MTL4RenderPipelineDescriptor* specializationDescriptor = [[MTL4RenderPipelineDescriptor alloc] init];
specializationDescriptor.colorAttachments[0].pixelFormat = MTLPixelFormatBGRA8Unorm;
specializationDescriptor.colorAttachments[0].blendingState = MTL4BlendStateEnabled;
// set blend factors...
id<MTLRenderPipelineState> specializedPipelineState = [compiler newRenderPipelineStateBySpecializationWithDescriptor:specializationDescriptor
pipeline:flexiblePipelineState
error:&error];
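Because specialization is cheap but not free, an engine will typically memoize the specialized states. The sketch below caches one specialized PSO per (pixel format, blend state) pair for a single color attachment. The helper name and the cache layout are hypothetical, and like the snippet above it assumes a fresh descriptor carrying only the attachment state is sufficient for specialization.

```objc
#import <Foundation/Foundation.h>
#import <Metal/Metal.h>

// Hypothetical helper: returns a specialized PSO for the requested
// attachment state, creating and caching it on first use.
// Not thread-safe; call from one thread or add locking.
static id<MTLRenderPipelineState> SpecializedPipeline(
    id<MTL4Compiler> compiler,
    id<MTLRenderPipelineState> flexiblePipelineState,
    MTLPixelFormat pixelFormat,
    MTL4BlendState blendingState,
    NSMutableDictionary<NSString *, id<MTLRenderPipelineState>> *cache,
    NSError **error)
{
    NSString *key = [NSString stringWithFormat:@"%lu/%ld",
                     (unsigned long)pixelFormat, (long)blendingState];
    id<MTLRenderPipelineState> cached = cache[key];
    if (cached) {
        return cached;
    }

    MTL4RenderPipelineDescriptor *descriptor = [[MTL4RenderPipelineDescriptor alloc] init];
    descriptor.colorAttachments[0].pixelFormat = pixelFormat;
    descriptor.colorAttachments[0].blendingState = blendingState;
    // Blend factors would be set here when blending is enabled.

    id<MTLRenderPipelineState> specialized =
        [compiler newRenderPipelineStateBySpecializationWithDescriptor:descriptor
                                                              pipeline:flexiblePipelineState
                                                                 error:error];
    if (specialized) {
        cache[key] = specialized;
    }
    return specialized;
}
```

If blend factors also vary at runtime, they need to become part of the cache key; otherwise two different blend setups would share one cached state.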
Allows remapping a shader's logical [[color(N)]] outputs to different physical render pass attachment indices at draw time. A single PSO works across different render pass output configurations without recompilation:
MTL4RenderPipelineDescriptor* pipelineDescriptor = [[MTL4RenderPipelineDescriptor alloc] init];
// Pipeline: inherit mapping from encoder (not baked into PSO)
pipelineDescriptor.colorAttachmentMappingState = MTL4LogicalToPhysicalColorAttachmentMappingStateInherited;
// Render pass: enable mapping
renderPassDescriptor.supportColorAttachmentMapping = YES;
// Encoder: at draw time, remap logical [[color(0)]] to physical attachment 2
MTLLogicalToPhysicalColorAttachmentMap* attachmentMap = [[MTLLogicalToPhysicalColorAttachmentMap alloc] init];
[attachmentMap setPhysicalIndex:2 forLogicalIndex:0];
[renderEncoder setColorAttachmentMap:attachmentMap];
Metal 4 provides two mechanisms for linking pre-compiled shader functions into a pipeline. These are advanced features — most ports don't need them initially, but engines with modular shader architectures (material systems, effect graphs) may benefit.
Static linking links additional shader functions at Metal IR level during PSO creation. Because linking occurs at compile time, the compiler can inline and optimize across function boundaries. Configured per-stage on the pipeline descriptor via vertexStaticLinkingDescriptor / fragmentStaticLinkingDescriptor (render) or staticLinkingDescriptor (compute/tile), using MTL4StaticLinkingDescriptor.
Binary linking links pre-compiled binary functions (MTL4BinaryFunction) to a pipeline. Since the functions are already compiled to machine code, no cross-function inlining or optimization is possible — but compilation is faster because the linked functions don't need recompilation. Configured via MTL4PipelineStageDynamicLinkingDescriptor passed to newRenderPipelineState:dynamicLinkingDescriptor: (or the compute equivalent). To later add binary functions to an existing pipeline, set supportVertexBinaryLinking / supportFragmentBinaryLinking on the pipeline descriptor at creation time.
When to use binary linking: Binary functions save compilation time when the same function is reused across many PSOs — the function is compiled to machine code once and linked without recompilation. However, because the function call cannot be inlined, there is a runtime cost: call frame maintenance and potential stack spilling. Profile to ensure the compilation time savings justify the runtime overhead for your workload. Static linking is preferred when runtime performance matters more than compilation time.
Metal maintains a system-wide shader cache, the details of which are not specified, that reuses compiled pipelines across runs of the same app. When creating pipeline descriptors, populate input arrays in the same order on every run (for example, the MTL4FunctionDescriptor arrays in MTL4StaticLinkingDescriptor). Reordering changes the cache key, which results in a cache miss and forces recompilation.
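One simple way to keep array order stable is to derive it from sorted names instead of from the iteration order of a set or dictionary, which is not guaranteed to be the same between runs. The following is a minimal sketch of that idea. The helper name is hypothetical, and the returned array would be assigned to whichever function-descriptor array the linking descriptor exposes.

```objc
#import <Foundation/Foundation.h>
#import <Metal/Metal.h>

// Hypothetical helper: builds function descriptors in a deterministic
// (lexicographic) order regardless of how the names were collected.
static NSArray<MTL4FunctionDescriptor *> *MakeOrderedFunctionDescriptors(
    id<MTLLibrary> library,
    NSSet<NSString *> *functionNames)
{
    // NSSet has no defined order, so sort before building the array.
    NSArray<NSString *> *sortedNames =
        [functionNames.allObjects sortedArrayUsingSelector:@selector(compare:)];

    NSMutableArray<MTL4FunctionDescriptor *> *descriptors =
        [NSMutableArray arrayWithCapacity:sortedNames.count];
    for (NSString *name in sortedNames) {
        MTL4LibraryFunctionDescriptor *descriptor = [[MTL4LibraryFunctionDescriptor alloc] init];
        descriptor.library = library;
        descriptor.name = name;
        [descriptors addObject:descriptor];
    }
    return [descriptors copy];
}
```

Any stable ordering works; sorting by name is just the easiest one to reproduce across launches and across machines.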
For explicit cache control: on the app's first launch, attach an MTL4PipelineDataSetSerializer to the compiler so it captures pipeline data as PSOs are built, then flush the serializer to an MTL4Archive file. On subsequent app launches, pass the archive via MTL4CompilerTaskOptions.lookupArchives; PSOs whose descriptors match load from disk instead of recompiling.
// First launch — attach serializer so the compiler captures pipeline data as PSOs are built
// serializer comes from [device newPipelineDataSetSerializerWithDescriptor:serializerDescriptor error:&error]
compilerDescriptor.pipelineDataSetSerializer = serializer;
id<MTL4Compiler> compiler = [device newCompilerWithDescriptor:compilerDescriptor error:&error];
// ... create PSOs ...
[serializer serializeAsArchiveAndFlushToURL:archiveURL error:&error];
// Subsequent launches — load the archive and let the compiler look up cached PSOs
id<MTL4Archive> archive = [device newArchiveWithURL:archiveURL error:&error];
MTL4CompilerTaskOptions* taskOptions = [[MTL4CompilerTaskOptions alloc] init];
taskOptions.lookupArchives = @[archive];
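On the very first launch the archive file does not exist yet, so the lookup path needs a fallback. The sketch below builds task options that include the archive only when it can be loaded. The helper name is hypothetical, and it assumes that leaving `lookupArchives` unset is valid and simply means every pipeline is compiled from scratch.

```objc
#import <Foundation/Foundation.h>
#import <Metal/Metal.h>

// Hypothetical helper: returns task options that consult the archive at
// `archiveURL` when it exists and loads successfully.
static MTL4CompilerTaskOptions *MakeTaskOptions(id<MTLDevice> device, NSURL *archiveURL)
{
    MTL4CompilerTaskOptions *taskOptions = [[MTL4CompilerTaskOptions alloc] init];

    // First launch: no archive on disk yet, so compile everything.
    if (![archiveURL checkResourceIsReachableAndReturnError:nil]) {
        return taskOptions;
    }

    NSError *error = nil;
    id<MTL4Archive> archive = [device newArchiveWithURL:archiveURL error:&error];
    if (archive) {
        taskOptions.lookupArchives = @[archive];
    } else {
        // A stale or corrupt archive should not block startup.
        NSLog(@"Ignoring pipeline archive at %@: %@", archiveURL, error);
    }
    return taskOptions;
}
```

The returned options are then passed as the `compilerTaskOptions:` argument of each pipeline creation call, so the same call site works on both first and later launches.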
Binary archives are compatible across Metal 3 and Metal 4.
- Validation: see man MetalValidation for environment variables. Use MTL4PipelineOptions.shaderValidation for fine-grained per-pipeline control.
- Asynchronous compilation: the completion-handler variants return an MTL4CompilerTask and keep compilation off the render thread.
- Incremental migration: Metal 3 pipeline creation keeps working, so adopt MTL4Compiler when you need its features (flexible PSOs, caching, binary linking).
- Pipeline archives: use an archive (MTL4Archive) to cache compiled pipelines across app launches. First-launch compilation can be slow; harvest pipeline states, serialize, and load on subsequent runs.