State-space model with O(n) complexity vs. Transformers' O(n²). Up to 5× higher inference throughput, million-token sequences, no KV cache. Selective SSM with hardware-aware design. Covers Mamba-1 (d_state=16) and Mamba-2 (d_state=128, multi-head). Pretrained models from 130M to 2.8B parameters on HuggingFace.
/plugin marketplace add zechenzhangAGI/AI-research-SKILLs
/plugin install mamba-architecture@zechenzhangAGI/AI-research-SKILLs
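As a quick sanity check, the pretrained checkpoints referenced above can be loaded directly through HuggingFace. A minimal sketch, assuming `transformers >= 4.39` (which added `MambaForCausalLM`) and the converted `state-spaces/mamba-130m-hf` checkpoint:

```python
# Minimal sketch: run a pretrained Mamba checkpoint via HuggingFace transformers.
# Assumes transformers >= 4.39 and the "-hf"-suffixed converted checkpoint.
from transformers import AutoTokenizer, MambaForCausalLM

tokenizer = AutoTokenizer.from_pretrained("state-spaces/mamba-130m-hf")
model = MambaForCausalLM.from_pretrained("state-spaces/mamba-130m-hf")

inputs = tokenizer("State-space models scale", return_tensors="pt")
# Unlike a Transformer, generation carries a fixed-size recurrent state
# instead of a growing KV cache, so memory stays constant in sequence length.
out = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(out[0]))
```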