From agent-almanac
Designs and simulates a minimal CPU: defines ISA, builds datapath with ALU/registers/memory, control unit, fetch-decode-execute cycle; traces programs clock-by-clock for verification. For learning architecture or custom processors.
npx claudepluginhub pjt222/agent-almanacThis skill uses the workspace's default tool permissions.
---
Creates comprehensive RTL implementation plans for hardware designs like DMA controllers, UARTs, and memory subsystems, covering blocks, interfaces, clocks, FSMs, and pipelines.
Designs sequential logic circuits like latches, flip-flops, registers, counters, and Mealy/Moore FSMs from behavioral specs. Builds state diagrams, excitation equations, gate-level logic, and verifies timing.
Generates breadboard circuit mockups and diagrams using HTML5 Canvas for retro projects with 6502, 555 timers, logic gates, LEDs, resistors, and wires.
Share bugs, ideas, or general feedback.
Design a minimal but complete CPU by defining an instruction set architecture, composing an ALU and register file into a datapath, designing a control unit that generates the correct signals for each instruction phase, implementing the fetch-decode-execute cycle, simulating a small program to completion, and verifying every clock cycle against expected register and memory state.
Specify everything a programmer needs to know to write machine code for this CPU:
## ISA Specification
- **Data width**: [N] bits
- **Address width**: [M] bits
- **Registers**: [count] general-purpose + [list of special-purpose]
- **Instruction width**: [W] bits
### Instruction Format
| Field | Bits | Width |
|----------|-----------|-------|
| Opcode | [W-1:X] | [n] |
| Rd | [X-1:Y] | [n] |
| Rs | [Y-1:Z] | [n] |
| Imm | [Z-1:0] | [n] |
### Instruction Catalog
| Mnemonic | Opcode | Format | RTL Operation | Flags |
|----------|--------|-----------|------------------------|-------|
| LOAD | 0000 | Rd, [addr]| Rd <- MEM[addr] | - |
| STORE | 0001 | Rs, [addr]| MEM[addr] <- Rs | - |
| ADD | 0010 | Rd, Rs | Rd <- Rd + Rs | Z,C,N |
| ... | ... | ... | ... | ... |
| HALT | 1111 | - | Stop execution | - |
Expected: A complete ISA specification where every instruction has a unique opcode, well-defined operand fields, an unambiguous RTL description, and documented flag effects. The instruction encoding must be decodable without ambiguity.
On failure: If the instruction word is too narrow to encode all needed fields, either widen the instruction, reduce the register count, use variable-length instructions (more complex decoding), or split instructions into sub-operations. If opcodes collide, reassign the encoding.
Build the register-transfer level hardware that moves and transforms data:
## Datapath Components
| Component | Width | Ports / Signals |
|----------------|--------|------------------------------------|
| ALU | [N]-bit| OpA, OpB, ALUop -> Result, Flags |
| Register File | [N]-bit| RdAddrA, RdAddrB, WrAddr, WrData, RegWrite -> RdDataA, RdDataB |
| PC | [M]-bit| PCnext, PCwrite -> PCout |
| Instruction Mem| [W]-bit| Addr -> Instruction |
| Data Memory | [N]-bit| Addr, WrData, MemRead, MemWrite -> RdData |
## Datapath Multiplexers
| Mux Name | Inputs | Select Signal | Output |
|-------------|----------------------|---------------|-------------|
| ALU_B_mux | RegDataB, Immediate | ALUsrc | ALU OpB |
| WrData_mux | ALU Result, MemData | MemToReg | Reg WrData |
| PC_mux | PC+1, BranchTarget | PCsrc | PC next |
Expected: A complete datapath diagram (as a component table and mux table) where every instruction in the ISA has a viable path for its data to flow from source to destination through the ALU, register file, and memory.
On failure: If an instruction cannot be executed with the current datapath (e.g., no path from memory to a register for LOAD), add the missing multiplexer or data path. Walk through each instruction's RTL operation and trace the required signal flow through the datapath.
Build the logic that orchestrates the datapath for each instruction:
## Control Signals
| Signal | Width | Function |
|-----------|-------|---------------------------------------|
| ALUop | [k] | Selects ALU operation |
| ALUsrc | 1 | 0=register, 1=immediate for ALU B |
| RegWrite | 1 | Enable register file write |
| MemRead | 1 | Enable data memory read |
| MemWrite | 1 | Enable data memory write |
| MemToReg | 1 | 0=ALU result, 1=memory data to register |
| PCsrc | 1 | 0=PC+1, 1=branch target |
| IRwrite | 1 | Enable instruction register load |
## Multi-Cycle Control FSM
| State | Active Signals | Next State |
|---------|----------------------------------------|-------------------|
| FETCH | MemRead, IRwrite, PC <- PC+1 | DECODE |
| DECODE | Read registers, sign-extend immediate | EXECUTE |
| EXECUTE | ALUop=[per instruction], ALUsrc=[...] | MEM_ACCESS or WB |
| MEM_ACC | MemRead or MemWrite | WRITE_BACK |
| WB | RegWrite, MemToReg=[...] | FETCH |
Expected: A control unit (combinational or FSM) that generates the correct control signal values for every instruction at every phase, with no conflicting signals (e.g., MemRead and MemWrite both active simultaneously on the same memory).
On failure: If a control signal conflict exists, the phases are not properly separated. Ensure that load and store instructions access memory in different phases or that the memory interface supports simultaneous read and write on separate ports. If the FSM has too many states, check whether some instructions share phases and can be merged.
Connect the datapath and control unit into a working processor:
## Cycle Execution Summary
| Instruction Type | Phases | Cycles |
|-----------------|--------------------------------|--------|
| ALU (reg-reg) | Fetch, Decode, Execute, WB | 4 |
| Load | Fetch, Decode, Execute, Mem, WB| 5 |
| Store | Fetch, Decode, Execute, Mem | 4 |
| Branch (taken) | Fetch, Decode, Execute | 3 |
| Branch (not taken)| Fetch, Decode, Execute | 3 |
| Halt | Fetch, Decode | 2 |
Expected: A fully connected processor where the control FSM drives the datapath through the correct sequence of phases for each instruction type, and all state transitions occur synchronously on the clock edge.
On failure: If the processor hangs (never reaches HALT) or produces incorrect results, the most likely cause is a control signal error in one specific phase. Use Step 5's cycle-by-cycle trace to isolate the failing cycle. If the PC does not increment correctly, check the FETCH phase wiring. If the wrong register is written, check the register address field extraction from IR.
Execute a concrete program and verify every clock cycle against expected state:
## Test Program: Fibonacci (first 8 terms)
| Addr | Instruction | Mnemonic | Comment |
|------|---------------|------------------|----------------------|
| 0x00 | [encoding] | LOAD R1, #1 | R1 = 1 (F1) |
| 0x01 | [encoding] | LOAD R2, #1 | R2 = 1 (F2) |
| 0x02 | [encoding] | LOAD R3, #6 | R3 = 6 (loop count) |
| 0x03 | [encoding] | ADD R4, R1, R2 | R4 = R1 + R2 |
| 0x04 | [encoding] | MOV R1, R2 | R1 = R2 |
| 0x05 | [encoding] | MOV R2, R4 | R2 = R4 |
| 0x06 | [encoding] | SUB R3, R3, #1 | R3 = R3 - 1 |
| 0x07 | [encoding] | BNZ 0x03 | Branch if R3 != 0 |
| 0x08 | [encoding] | HALT | Stop |
## Cycle-by-Cycle Trace (excerpt)
| Cycle | Phase | PC | IR | ALU Op | Result | RegWrite | Flags |
|-------|---------|-----|----------|----------|--------|----------|-------|
| 1 | FETCH | 0x00| LOAD R1,1| - | - | No | - |
| 2 | DECODE | 0x01| LOAD R1,1| - | - | No | - |
| 3 | EXECUTE | 0x01| LOAD R1,1| PASS #1 | 1 | No | - |
| 4 | WB | 0x01| LOAD R1,1| - | - | R1 <- 1 | - |
| ... | ... | ... | ... | ... | ... | ... | ... |
## Expected Final State
| Register | Value | Description |
|----------|-------|---------------------|
| R1 | [val] | Second-to-last Fib |
| R2 | [val] | Last computed Fib |
| R3 | 0 | Loop counter done |
| R4 | [val] | Same as R2 |
| PC | 0x09 | One past HALT |
Expected: The cycle-by-cycle trace matches the expected final state. Every instruction produces the correct register and memory updates. The program terminates at HALT with the correct Fibonacci values in registers.
On failure: Compare the first divergence between expected and actual state. Common causes: (1) ALU operation select is wrong for one instruction type -- check the control signal table. (2) Branch offset calculation is off by one -- verify whether branches are PC-relative from the current or next instruction address. (3) Write-back writes to the wrong register -- check the register address field extraction. (4) Flags not updated correctly -- trace the ALU flag logic for the specific operands that cause the mismatch.
design-logic-circuit -- design the ALU, multiplexers, decoders, and other combinational blocksbuild-sequential-circuit -- design the register file, program counter, control FSM, and other sequential blocksevaluate-boolean-expression -- simplify the control logic equations for the hardwired control unitderive-theoretical-result -- formalize performance analysis (CPI, throughput, Amdahl's law)