dcbz
Data Cache Block Set to Zero - 7C 00 07 EC
Instruction Syntax
| Mnemonic | Format | Flags |
| dcbz | rA,rB | - |
Instruction Encoding
011111 00000 AAAAA BBBBB 1111110110 0

| Field | Bits | Description |
| Primary Opcode | 0-5 | 011111 (31, 0x1F) |
| Reserved | 6-10 | 00000 |
| rA | 11-15 | Register A (base address; 0 selects the value zero, not r0) |
| rB | 16-20 | Register B (index) |
| XO | 21-30 | 1111110110 (1014) |
| Reserved | 31 | 0 |
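The field layout can be checked by assembling the instruction word by hand. The following is a minimal sketch in C; `encode_dcbz` is a hypothetical helper written for this page, not part of any toolchain. It assumes the standard X-form placement, with the extended opcode 1014 in bits 21-30 and bit 31 clear, which is what reproduces the `7C 00 07 EC` pattern in the heading.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (illustration only): builds the 32-bit
 * instruction word for "dcbz rA,rB". */
static uint32_t encode_dcbz(unsigned ra, unsigned rb)
{
    assert(ra < 32 && rb < 32);          /* register fields are 5 bits wide */

    const uint32_t primary = 31u;        /* bits 0-5:   011111 */
    const uint32_t xo      = 1014u;      /* bits 21-30: 1111110110 */

    /* PowerPC numbers bits from the most significant end, so the field
     * in bits 11-15 sits 16 positions above the least significant bit. */
    return (primary << 26)               /* bits 0-5 */
         | ((uint32_t)ra << 16)          /* bits 11-15 */
         | ((uint32_t)rb << 11)          /* bits 16-20 */
         | (xo << 1);                    /* bits 21-30; bit 31 stays 0 */
}
```

With both register fields zero, `encode_dcbz(0, 0)` yields `0x7C0007EC`; `dcbz 0, r3` corresponds to `encode_dcbz(0, 3)`, which is `0x7C001FEC`.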
Operation
if rA = 0 then EA ← (rB) else EA ← (rA) + (rB)
Set all bytes in cache block containing EA to zero
The data cache block set to zero instruction establishes a cache block at the effective address and sets all bytes in the block to zero. If the block is already in the cache, its contents are set to zero. If not in cache, the block is allocated and initialized to zero without reading from memory.
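Note: Because the block is established without the memory read that a cache miss would normally trigger, dcbz is an efficient way to zero block-aligned memory. The cache block size is implementation-specific; the examples on this page assume 32 bytes.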
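The operation can be modeled in software to make two details concrete: an rA field of 0 contributes the value zero rather than the contents of r0, and the whole block containing EA is cleared, not only the bytes from EA onward. The sketch below is a hypothetical C model, not a description of any hardware implementation. `dcbz_model`, the flat `mem` array, and the 32-byte `CACHE_BLOCK_SIZE` are all assumptions made for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: 32-byte cache blocks, as in the examples on this page.
 * The real block size is implementation-specific. */
#define CACHE_BLOCK_SIZE 32u

/* Hypothetical software model of dcbz, for illustration only.
 * gpr:   the 32 general-purpose registers
 * mem:   a byte array standing in for memory, indexed by effective address
 * ra/rb: register numbers taken from the rA and rB instruction fields */
static void dcbz_model(const uint32_t gpr[32], uint8_t *mem,
                       unsigned ra, unsigned rb)
{
    /* rA = 0 means "use the value zero", not the contents of r0. */
    uint32_t base = (ra == 0) ? 0u : gpr[ra];
    uint32_t ea   = base + gpr[rb];

    /* Round EA down to the start of its block, then zero the whole block. */
    uint32_t block_start = ea & ~(CACHE_BLOCK_SIZE - 1u);
    memset(mem + block_start, 0, CACHE_BLOCK_SIZE);
}
```

In this model, an EA of 0x45 clears bytes 0x40 through 0x5F. On real hardware the same rounding means a dcbz aimed at an unaligned address also zeroes the bytes that precede it in the block.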
Affected Registers
None - This instruction does not affect any registers.
For more information on cache management, see Section 3.2, "Cache Management Instructions," in the PowerPC Microprocessor Family: The Programming Environments manual.
Examples
Basic Memory Clearing
# Clear a cache block to zero
lis r3, clear_buffer@ha # Load high part of buffer address
addi r3, r3, clear_buffer@l # Complete buffer address
dcbz 0, r3 # Clear 32 bytes to zero
Fast Buffer Initialization
# Initialize entire buffer to zero efficiently
lis r3, large_buffer@ha
addi r3, r3, large_buffer@l
li r4, 0 # Start offset
li r5, 4096 # Buffer size (must be multiple of cache line size)
li r6, 32 # Cache line size
zero_loop:
add r7, r3, r4 # Calculate address
dcbz 0, r7 # Zero entire cache line
add r4, r4, r6 # Next cache line
cmpw r4, r5 # Check if done
blt zero_loop # Continue if more data
Optimized Memory Allocation
# Efficiently clear allocated memory
# r3 = pointer to allocated memory
# r4 = size to clear (must be multiple of 32)
li r5, 0 # Offset counter
li r6, 32 # Cache line size
clear_allocated:
dcbz r5, r3 # Clear cache line at offset
add r5, r5, r6 # Next cache line
cmpw r5, r4 # Check if all cleared
blt clear_allocated # Continue until done
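The loop above relies on the pointer being block-aligned and the size being a multiple of 32. When a caller cannot guarantee that, one option is to split the region into an unaligned head, a run of whole blocks that is safe for dcbz, and a tail, and to clear the head and tail with ordinary stores. The following C sketch only computes that split. `plan_zero` and `struct zero_plan` are hypothetical names, and the 32-byte block size is an assumption carried over from the examples.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_BLOCK_SIZE 32u   /* assumption; implementation-specific */

/* Hypothetical helper: describes how to zero [start, start + size). */
struct zero_plan {
    size_t head;    /* leading bytes to clear with ordinary stores */
    size_t blocks;  /* whole cache blocks that are safe for dcbz */
    size_t tail;    /* trailing bytes to clear with ordinary stores */
};

static struct zero_plan plan_zero(uintptr_t start, size_t size)
{
    struct zero_plan p = {0, 0, 0};

    /* Distance from start to the next block boundary (0 if aligned). */
    size_t head = (size_t)((CACHE_BLOCK_SIZE - start % CACHE_BLOCK_SIZE)
                           % CACHE_BLOCK_SIZE);
    if (head > size)
        head = size;            /* region ends before the boundary */

    p.head   = head;
    p.blocks = (size - head) / CACHE_BLOCK_SIZE;
    p.tail   = (size - head) % CACHE_BLOCK_SIZE;
    return p;
}
```

For an aligned 4096-byte region the plan is 0 head bytes, 128 blocks, and 0 tail bytes, which matches the assumption the assembly loop makes.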
Matrix Initialization
# Initialize matrix to zero using cache-efficient method
lis r3, matrix_data@ha
addi r3, r3, matrix_data@l
li r4, 0 # Row counter
li r5, 64 # Rows in matrix
li r6, 128 # Bytes per row (must be multiple of 32)
li r7, 32 # Cache line size
matrix_zero_loop:
mullw r8, r4, r6 # Calculate row offset (mulli takes an immediate, not a register)
add r9, r3, r8 # Row start address
li r10, 0 # Column offset within row
row_zero_loop:
add r11, r9, r10 # Current position in row
dcbz 0, r11 # Zero cache line
add r10, r10, r7 # Next cache line in row
cmpw r10, r6 # Check if row done
blt row_zero_loop # Continue row
addi r4, r4, 1 # Next row
cmpw r4, r5 # Check if all rows done
blt matrix_zero_loop # Continue with next row
Structure Array Initialization
# Initialize array of structures to zero
lis r3, struct_array@ha
addi r3, r3, struct_array@l
li r4, 100 # Number of structures
li r5, 64 # Size of each structure (bytes)
li r6, 0 # Structure counter
struct_init_loop:
mullw r7, r6, r5 # Calculate structure offset (mulli takes an immediate, not a register)
add r8, r3, r7 # Structure address
# Clear structure (assuming 64 bytes = 2 cache lines)
dcbz 0, r8 # Clear first 32 bytes
li r9, 32 # dcbz has no displacement form; put the offset in a register
dcbz r9, r8 # Clear second 32 bytes (EA = r9 + r8)
addi r6, r6, 1 # Next structure
cmpw r6, r4 # Check if done
blt struct_init_loop # Continue initialization