dcbz
Data Cache Block Set to Zero - 7C 00 07 EC
Instruction Syntax
Mnemonic | Format | Flags |
dcbz | rA,rB | - |
Instruction Encoding
Bits 0-5 | Bits 6-10 | Bits 11-15 | Bits 16-20 | Bits 21-30 | Bit 31 |
011111 | 00000 | AAAAA | BBBBB | 1111110110 | 0 |
Field | Bits | Description |
Primary Opcode | 0-5 | 011111 (0x1F) |
Reserved | 6-10 | 00000 |
rA | 11-15 | Register A (base address; a field value of 0 selects a literal zero, not the contents of r0) |
rB | 16-20 | Register B (index) |
XO | 21-30 | 1111110110 (1014) |
Reserved | 31 | 0 |
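As a cross-check of the field layout, the following minimal Python sketch packs the fields into a 32-bit instruction word and compares the result with the 7C 00 07 EC base encoding given in the title. The helper names `encode_dcbz` and `decode_dcbz` are hypothetical and exist only for this illustration; they are not part of any assembler or toolchain.

```python
PRIMARY_OPCODE = 0x1F   # bits 0-5 (IBM numbering: bit 0 is the most significant bit)
DCBZ_XO = 1014          # 10-bit extended opcode, 0b1111110110


def encode_dcbz(ra: int, rb: int) -> int:
    """Return the 32-bit instruction word for `dcbz rA,rB`."""
    if not (0 <= ra <= 31 and 0 <= rb <= 31):
        raise ValueError("register numbers must be in 0..31")
    return (
        (PRIMARY_OPCODE << 26)  # primary opcode in the top 6 bits
        | (ra << 16)            # rA field
        | (rb << 11)            # rB field
        | (DCBZ_XO << 1)        # extended opcode; the lowest bit stays 0
    )


def decode_dcbz(word: int):
    """Return (rA, rB) if `word` encodes dcbz, otherwise None."""
    # Compare every bit except the rA and rB fields against the base encoding.
    if word & 0xFFE007FF != 0x7C0007EC:
        return None
    return (word >> 16) & 0x1F, (word >> 11) & 0x1F


print(hex(encode_dcbz(0, 0)))  # 0x7c0007ec
print(hex(encode_dcbz(0, 3)))  # 0x7c001fec, i.e. dcbz 0, r3
```

With rA and rB both zero, the encoder reproduces the base encoding from the title.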
Operation
if rA = 0 then EA ← (rB) else EA ← (rA) + (rB)
Set all bytes in cache block containing EA to zero
The data cache block set to zero instruction establishes a cache block at the effective address and sets all bytes in the block to zero. If the block is already in the cache, its contents are set to zero. If not in cache, the block is allocated and initialized to zero without reading from memory.
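Note: Because a cache miss allocates the block without fetching its old contents, dcbz avoids the memory read that an ordinary store would trigger on a miss, which makes it an efficient way to zero whole cache blocks. The EA does not need to be block-aligned: the entire block containing it is zeroed, so a buffer cleared this way should start on a block boundary and span whole blocks.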
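The operation described above can be modeled in a few lines of Python. This is only a sketch of the architectural effect on memory, not of cache behavior: the function name `dcbz`, the flat `bytearray` memory, and the 32-byte `BLOCK_SIZE` are assumptions made for illustration, and the real block size depends on the processor implementation.

```python
BLOCK_SIZE = 32  # assumption: 32-byte cache blocks; the real size is implementation-specific


def dcbz(memory: bytearray, gpr: list, ra: int, rb: int) -> int:
    """Model `dcbz rA,rB` on a flat byte array; return the start address of the zeroed block."""
    base = 0 if ra == 0 else gpr[ra]      # rA = 0 means a literal zero, not the contents of r0
    ea = (base + gpr[rb]) & 0xFFFFFFFF    # 32-bit effective address
    start = ea & ~(BLOCK_SIZE - 1)        # round down to the start of the containing block
    if start + BLOCK_SIZE > len(memory):
        raise IndexError("block lies outside the modeled memory")
    memory[start:start + BLOCK_SIZE] = bytes(BLOCK_SIZE)
    return start


# Usage: r3 points 8 bytes into the second block, yet the whole block is zeroed.
mem = bytearray(b"\xff" * 128)
regs = [0] * 32
regs[3] = 40
block_start = dcbz(mem, regs, 0, 3)
```

In the usage lines, the effective address 40 falls inside the block that spans bytes 32 to 63, so all 32 bytes of that block are cleared while the neighboring blocks keep their contents.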
Affected Registers
None - This instruction does not affect any registers.
For more information on cache management see Section 3.2, "Cache Management Instructions," in the PowerPC Microprocessor Family: The Programming Environments manual.
Examples
Basic Memory Clearing
    # Clear a cache block to zero
    lis r3, clear_buffer@ha        # Load high part of buffer address
    addi r3, r3, clear_buffer@l    # Complete buffer address
    dcbz 0, r3                     # Clear 32 bytes to zero
Fast Buffer Initialization
    # Initialize entire buffer to zero efficiently
    lis r3, large_buffer@ha
    addi r3, r3, large_buffer@l
    li r4, 0                # Start offset
    li r5, 4096             # Buffer size (must be multiple of cache line size)
    li r6, 32               # Cache line size
    zero_loop:
        add r7, r3, r4      # Calculate address
        dcbz 0, r7          # Zero entire cache line
        add r4, r4, r6      # Next cache line
        cmpw r4, r5         # Check if done
        blt zero_loop       # Continue if more data
Optimized Memory Allocation
    # Efficiently clear allocated memory
    # r3 = pointer to allocated memory (must be 32-byte aligned)
    # r4 = size to clear (must be multiple of 32)
    li r5, 0                # Offset counter
    li r6, 32               # Cache line size
    clear_allocated:
        dcbz r3, r5         # Clear cache line at r3 + r5 (rA = base, rB = offset)
        add r5, r5, r6      # Next cache line
        cmpw r5, r4         # Check if all cleared
        blt clear_allocated # Continue until done
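The loops in these examples assume that the region starts on a cache-line boundary and that its size is a multiple of the line size. For an arbitrary region, only the whole lines in the middle can safely be cleared with dcbz; the partial lines at either end need ordinary stores, because dcbz would also zero the neighboring bytes outside the region. The sketch below shows that arithmetic. The function name `split_for_dcbz` is hypothetical, and the default 32-byte line size is an assumption carried over from the examples.

```python
def split_for_dcbz(addr: int, size: int, block_size: int = 32):
    """Split [addr, addr + size) into (head, body, tail) half-open address ranges.

    Only `body` consists of whole cache blocks and may be cleared with dcbz;
    `head` and `tail` must be cleared with ordinary stores.
    """
    end = addr + size
    body_start = (addr + block_size - 1) // block_size * block_size  # round up
    body_end = end // block_size * block_size                        # round down
    if body_start >= body_end:
        # The region contains no complete block: everything goes to `head`.
        return (addr, end), (end, end), (end, end)
    return (addr, body_start), (body_start, body_end), (body_end, end)


head, body, tail = split_for_dcbz(0x1010, 100)
print([hex(a) for a in head], [hex(a) for a in body], [hex(a) for a in tail])
# ['0x1010', '0x1020'] ['0x1020', '0x1060'] ['0x1060', '0x1074']
```

For a region that already meets the alignment requirements, such as 4096 bytes starting at 0x1000, the head and tail ranges come out empty and the whole region can go through the dcbz loop.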
Matrix Initialization
    # Initialize matrix to zero using cache-efficient method
    lis r3, matrix_data@ha
    addi r3, r3, matrix_data@l
    li r4, 0                    # Row counter
    li r5, 64                   # Rows in matrix
    li r6, 128                  # Bytes per row (must be multiple of 32)
    li r7, 32                   # Cache line size
    matrix_zero_loop:
        mullw r8, r4, r6        # Calculate row offset (mulli takes an immediate, not a register)
        add r9, r3, r8          # Row start address
        li r10, 0               # Column offset within row
    row_zero_loop:
        add r11, r9, r10        # Current position in row
        dcbz 0, r11             # Zero cache line
        add r10, r10, r7        # Next cache line in row
        cmpw r10, r6            # Check if row done
        blt row_zero_loop       # Continue row
        addi r4, r4, 1          # Next row
        cmpw r4, r5             # Check if all rows done
        blt matrix_zero_loop    # Continue with next row
Structure Array Initialization
    # Initialize array of structures to zero
    # struct_array must be 32-byte aligned
    lis r3, struct_array@ha
    addi r3, r3, struct_array@l
    li r4, 100                  # Number of structures
    li r5, 64                   # Size of each structure (bytes)
    li r6, 0                    # Structure counter
    li r9, 32                   # Cache line size (dcbz has no immediate offset form)
    struct_init_loop:
        mullw r7, r6, r5        # Calculate structure offset (mulli takes an immediate, not a register)
        add r8, r3, r7          # Structure address
        # Clear structure (assuming 64 bytes = 2 cache lines)
        dcbz 0, r8              # Clear first 32 bytes
        dcbz r8, r9             # Clear second 32 bytes (EA = r8 + 32)
        addi r6, r6, 1          # Next structure
        cmpw r6, r4             # Check if done
        blt struct_init_loop    # Continue initialization