Instruction Syntax
Mnemonic | Format | Flags |
lwaux | rD,rA,rB | - |
Instruction Encoding
Field | Bits | Description |
Primary Opcode | 0-5 | 011111 (0x1F) |
rD | 6-10 | Destination register |
rA | 11-15 | Source register A |
rB | 16-20 | Source register B |
XO | 21-30 | 373 (Extended opcode) |
Reserved | 31 | Must be 0 (lwaux has no record form) |
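The field layout above can be checked by packing an instruction word by hand. The following C sketch is illustrative only: the name `encode_lwaux` is hypothetical, and it assumes the architected extended opcode 373 for lwaux together with PowerPC's bit numbering, where bit 0 is the most significant bit of the word.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed constants: primary opcode 31 (0x1F) and extended opcode 373. */
enum { LWAUX_PRIMARY = 31, LWAUX_XO = 373 };

/* Pack an lwaux instruction word (hypothetical helper).
   Bit 0 is the MSB, so the field at bits 6-10 starts at shift 21,
   bits 11-15 at shift 16, bits 16-20 at shift 11, bits 21-30 at shift 1. */
static uint32_t encode_lwaux(unsigned rd, unsigned ra, unsigned rb)
{
    assert(rd < 32 && ra < 32 && rb < 32);   /* 5-bit register fields */
    return ((uint32_t)LWAUX_PRIMARY << 26) |
           ((uint32_t)rd << 21) |
           ((uint32_t)ra << 16) |
           ((uint32_t)rb << 11) |
           ((uint32_t)LWAUX_XO << 1);        /* bit 31 is left as 0 */
}
```

Under these assumptions, `encode_lwaux(7, 3, 6)` yields 0x7CE332EA, the word for `lwaux r7, r3, r6`.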
Operation
    EA ← (rA) + (rB)
    rD ← EXTS(MEM(EA, 4))
    rA ← EA
A word (32 bits) is loaded from memory, sign-extended to 64 bits, and placed in register rD. The effective address is computed by adding the contents of registers rA and rB. After the load, the effective address is stored back into register rA.
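The three steps described above can be modeled in a few lines of C. This is a minimal sketch, not an emulator: the `cpu_t` structure, the flat big-endian byte array standing in for memory, and the name `exec_lwaux` are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical machine state: 32 GPRs and a flat block of memory. */
typedef struct {
    uint64_t gpr[32];
    const uint8_t *mem;
    size_t mem_size;
} cpu_t;

/* Model of lwaux rD,rA,rB under the assumptions above. */
static void exec_lwaux(cpu_t *cpu, unsigned rd, unsigned ra, unsigned rb)
{
    assert(ra != 0);                          /* rA=0 is not allowed */

    /* EA <- (rA) + (rB) */
    uint64_t ea = cpu->gpr[ra] + cpu->gpr[rb];
    assert(ea + 4 <= cpu->mem_size);          /* stay inside the model's memory */

    /* MEM(EA, 4): assemble a big-endian 32-bit word. */
    const uint8_t *p = cpu->mem + ea;
    uint32_t word = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
                    ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];

    /* rD <- EXTS(word): sign-extend from 32 to 64 bits portably. */
    int64_t value = (word & 0x80000000u)
                        ? (int64_t)word - INT64_C(0x100000000)
                        : (int64_t)word;
    cpu->gpr[rd] = (uint64_t)value;

    /* rA <- EA */
    cpu->gpr[ra] = ea;
}
```

Loading the bytes FF FF FF FE through this model leaves 0xFFFFFFFFFFFFFFFE (that is, -2) in rD, which is the effect of the sign extension, and leaves the effective address in rA.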
Note: The instruction form is invalid if rA=0 or rA=rD. The update form requires a valid base register that is distinct from the destination register. lwaux is defined only for 64-bit implementations. It combines indexed addressing, base-register update, and sign extension to 64 bits in a single instruction, which suits walking signed 32-bit data whose stride is only known at run time.
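An assembler or decoder would typically reject the invalid form before it reaches the hardware. The check below is a sketch under stated assumptions: the name `lwaux_form_is_valid` is hypothetical, the opcode values 31 and 373 are those the PowerPC architecture assigns to lwaux, and in addition to rA=0 it rejects rA=rD, which the architecture also treats as an invalid form for load-with-update instructions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return true if insn is an lwaux word with a valid register form.
   Hypothetical helper; bit 0 is the MSB in PowerPC numbering. */
static bool lwaux_form_is_valid(uint32_t insn)
{
    unsigned opcd = insn >> 26;            /* bits 0-5   */
    unsigned rd   = (insn >> 21) & 31;     /* bits 6-10  */
    unsigned ra   = (insn >> 16) & 31;     /* bits 11-15 */
    unsigned xo   = (insn >> 1) & 0x3FF;   /* bits 21-30 */

    if (opcd != 31 || xo != 373)
        return false;                      /* not an lwaux instruction */

    /* The base register must be usable and must not be overwritten
       by the loaded value. */
    return ra != 0 && ra != rd;
}
```

For example, the word 0x7CE332EA (`lwaux r7, r3, r6`) passes the check, while the same word with rA cleared to 0 does not.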
Affected Registers
rD - Loaded with the sign-extended word from memory.
rA - Updated with the effective address after the load operation.
For more information on memory addressing see Section 2.1.6, "Effective Address Calculation," in the PowerPC Microprocessor Family: The Programming Environments manual.
Examples
64-bit Array Processing with Dynamic Strides
    # Process 64-bit arrays with variable stride patterns
        lis r3, data_array@ha
        addi r3, r3, data_array@l
        lis r4, stride_patterns@ha
        addi r4, r4, stride_patterns@l
        lwz r5, num_elements(r0)        # Number of elements

    # Process array with dynamic stride advancement
    array_processing_loop:
        # Load stride for current element
        lwz r6, 0(r4)                   # Load stride value

        # Load signed 32-bit data with automatic advancement
        lwaux r7, r3, r6                # Load data and advance by stride

        # Process 64-bit arithmetic (value is sign-extended)
        # Calculate running average with 64-bit precision
        lwz r8, running_sum(r0)         # Load current running sum
        add r9, r8, r7                  # Add new value (64-bit addition)
        stw r9, running_sum(r0)         # Store updated sum

        # Calculate variance components
        lwz r10, count(r0)              # Load current count
        addi r11, r10, 1                # Increment count
        stw r11, count(r0)              # Store updated count

        # Calculate mean: sum / count
        divw r12, r9, r11               # mean = sum / count

        # Calculate squared difference: (value - mean)²
        sub r13, r7, r12                # value - mean (64-bit subtraction)
        mullw r14, r13, r13             # (value - mean)² (64-bit multiplication)

        # Update variance accumulator
        lwz r15, variance_sum(r0)       # Load current variance sum
        add r16, r15, r14               # Add squared difference
        stw r16, variance_sum(r0)       # Store updated variance sum

        addi r4, r4, 4                  # Next stride pattern
        subi r5, r5, 1                  # Decrement element counter
        cmpwi r5, 0
        bne array_processing_loop       # Continue processing
Advanced Memory Management - Dynamic Allocation
    # Manage dynamic memory allocation with 64-bit addressing
        lis r3, memory_pool@ha
        addi r3, r3, memory_pool@l
        lis r4, allocation_sizes@ha
        addi r4, r4, allocation_sizes@l
        lwz r5, num_allocations(r0)     # Number of memory allocations

    # Dynamic memory allocation with automatic pointer advancement
    allocation_loop:
        # Load allocation size for current request
        lwz r6, 0(r4)                   # Load allocation size (in bytes)

        # Load current pool pointer with automatic advancement
        lwaux r7, r3, r6                # Load pool status and advance by allocation size

        # Check if allocation was successful
        cmpwi r7, 0                     # Check allocation status
        blt allocation_failed           # Branch if allocation failed

        # Successful allocation - initialize memory block
        # Calculate block header address
        sub r8, r3, r6                  # Header address = pointer before the advance
        stw r6, 0(r8)                   # Store block size in header

        # Initialize memory block with pattern, leaving the header word intact
        li r9, 4                        # Byte offset of first word after the header
        addi r10, r8, 4                 # Current address for initialization

    init_loop:
        cmpw r9, r6                     # Check if byte offset has reached block size
        bge allocation_complete         # Exit if done
        stw r9, 0(r10)                  # Store initialization pattern
        addi r10, r10, 4                # Next word address
        addi r9, r9, 4                  # Advance byte offset by one word
        b init_loop                     # Continue initialization

    allocation_complete:
        # Mark allocation as successful
        li r11, 1                       # Success status
        stw r11, allocation_status(r0)
        b next_allocation

    allocation_failed:
        # Handle allocation failure
        li r12, -1                      # Failure status
        stw r12, allocation_status(r0)
        bl handle_allocation_failure

    next_allocation:
        addi r4, r4, 4                  # Next allocation size
        subi r5, r5, 1                  # Decrement allocation counter
        cmpwi r5, 0
        bne allocation_loop             # Continue allocations
64-bit Database Record Processing
    # Process database records with variable field sizes
        lis r3, record_buffer@ha
        addi r3, r3, record_buffer@l
        lis r4, field_offsets@ha
        addi r4, r4, field_offsets@l
        lwz r5, num_records(r0)         # Number of records to process

    # Process database records with dynamic field access
    record_processing_loop:
        # Load record type to determine field layout
        lwz r6, 0(r4)                   # Load record type offset
        lwaux r7, r3, r6                # Load record type and advance

        # Process based on record type
        cmpwi r7, RECORD_TYPE_A
        beq process_type_a
        cmpwi r7, RECORD_TYPE_B
        beq process_type_b
        cmpwi r7, RECORD_TYPE_C
        beq process_type_c
        b unknown_record_type

    process_type_a:
        # Type A record: [type, id, name, value, timestamp]
        lwz r8, 4(r4)                   # Load ID field offset
        lwaux r9, r3, r8                # Load ID and advance
        lwz r10, 8(r4)                  # Load name field offset
        lwaux r11, r3, r10              # Load name pointer and advance
        lwz r12, 12(r4)                 # Load value field offset
        lwaux r13, r3, r12              # Load signed value and advance
        lwz r14, 16(r4)                 # Load timestamp field offset
        lwaux r15, r3, r14              # Load timestamp and advance
        # Process Type A record
        bl process_type_a_record
        b record_complete

    process_type_b:
        # Type B record: [type, id, data_array, checksum]
        lwz r8, 4(r4)                   # Load ID field offset
        lwaux r9, r3, r8                # Load ID and advance
        lwz r10, 8(r4)                  # Load data array offset
        lwaux r11, r3, r10              # Load data array pointer and advance
        lwz r12, 12(r4)                 # Load checksum field offset
        lwaux r13, r3, r12              # Load checksum and advance
        # Process Type B record
        bl process_type_b_record
        b record_complete

    process_type_c:
        # Type C record: [type, id, metadata, payload]
        lwz r8, 4(r4)                   # Load ID field offset
        lwaux r9, r3, r8                # Load ID and advance
        lwz r10, 8(r4)                  # Load metadata offset
        lwaux r11, r3, r10              # Load metadata and advance
        lwz r12, 12(r4)                 # Load payload offset
        lwaux r13, r3, r12              # Load payload pointer and advance
        # Process Type C record
        bl process_type_c_record
        b record_complete

    unknown_record_type:
        bl handle_unknown_record_type

    record_complete:
        addi r4, r4, 20                 # Next field offset set (5 fields * 4 bytes)
        subi r5, r5, 1                  # Decrement record counter
        cmpwi r5, 0
        bne record_processing_loop      # Continue processing
64-bit Graphics Pipeline - Vertex Processing
    # Process 3D graphics vertices with dynamic attribute strides
        lis r3, vertex_buffer@ha
        addi r3, r3, vertex_buffer@l
        lis r4, attribute_strides@ha
        addi r4, r4, attribute_strides@l
        lwz r5, num_vertices(r0)        # Number of vertices to process

    # Process vertices with variable attribute layouts
    vertex_processing_loop:
        # Load position attribute with automatic advancement
        lwz r6, 0(r4)                   # Load position stride
        lwaux r7, r3, r6                # Load X coordinate and advance
        lwaux r8, r3, r6                # Load Y coordinate and advance
        lwaux r9, r3, r6                # Load Z coordinate and advance

        # Apply 64-bit transformation matrix
        # Load transformation matrix elements
        lis r10, transform_matrix@ha
        addi r10, r10, transform_matrix@l

        # Transform X coordinate: new_x = m11*x + m12*y + m13*z + m14
        lwz r11, 0(r10)                 # Load m11
        mullw r12, r7, r11              # m11 * x
        lwz r13, 4(r10)                 # Load m12
        mullw r14, r8, r13              # m12 * y
        add r15, r12, r14               # m11*x + m12*y
        lwz r16, 8(r10)                 # Load m13
        mullw r17, r9, r16              # m13 * z
        add r18, r15, r17               # m11*x + m12*y + m13*z
        lwz r19, 12(r10)                # Load m14
        add r20, r18, r19               # new_x = m11*x + m12*y + m13*z + m14

        # Transform Y coordinate: new_y = m21*x + m22*y + m23*z + m24
        lwz r21, 16(r10)                # Load m21
        mullw r22, r7, r21              # m21 * x
        lwz r23, 20(r10)                # Load m22
        mullw r24, r8, r23              # m22 * y
        add r25, r22, r24               # m21*x + m22*y
        lwz r26, 24(r10)                # Load m23
        mullw r27, r9, r26              # m23 * z
        add r28, r25, r27               # m21*x + m22*y + m23*z
        lwz r29, 28(r10)                # Load m24
        add r30, r28, r29               # new_y = m21*x + m22*y + m23*z + m24

        # Transform Z coordinate: new_z = m31*x + m32*y + m33*z + m34
        # Reuse the X-transform temporaries (r11-r19) so that the stack
        # pointer (r1), r2, and the loop registers r3-r5 are not clobbered
        lwz r11, 32(r10)                # Load m31
        mullw r12, r7, r11              # m31 * x
        lwz r13, 36(r10)                # Load m32
        mullw r14, r8, r13              # m32 * y
        add r15, r12, r14               # m31*x + m32*y
        lwz r16, 40(r10)                # Load m33
        mullw r17, r9, r16              # m33 * z
        add r18, r15, r17               # m31*x + m32*y + m33*z
        lwz r19, 44(r10)                # Load m34
        add r31, r18, r19               # new_z = m31*x + m32*y + m33*z + m34

        # Store transformed coordinates
        stw r20, transformed_x(r0)      # Store new X
        stw r30, transformed_y(r0)      # Store new Y
        stw r31, transformed_z(r0)      # Store new Z

        # Load normal attributes if present
        lwz r9, 4(r4)                   # Load normal stride
        cmpwi r9, 0                     # Check if normals present
        beq skip_normals
        lwaux r10, r3, r9               # Load normal X and advance
        lwaux r11, r3, r9               # Load normal Y and advance
        lwaux r12, r3, r9               # Load normal Z and advance
        # Transform normals (simplified - no translation)
        # Normal transformation requires matrix inversion and transposition
        bl transform_normal_vector

    skip_normals:
        # Load texture coordinates if present
        lwz r13, 8(r4)                  # Load texture stride
        cmpwi r13, 0                    # Check if texture coords present
        beq skip_texture
        lwaux r14, r3, r13              # Load texture U and advance
        lwaux r15, r3, r13              # Load texture V and advance
        # Store texture coordinates
        stw r14, texture_u(r0)          # Store U coordinate
        stw r15, texture_v(r0)          # Store V coordinate

    skip_texture:
        # Send vertex to graphics pipeline
        bl send_vertex_to_pipeline
        addi r4, r4, 12                 # Next attribute stride set (3 attributes * 4 bytes)
        subi r5, r5, 1                  # Decrement vertex counter
        cmpwi r5, 0
        bne vertex_processing_loop      # Continue processing
64-bit Network Protocol Processing
    # Process network packets with variable header structures
        lis r3, packet_buffer@ha
        addi r3, r3, packet_buffer@l
        lis r4, header_layouts@ha
        addi r4, r4, header_layouts@l
        lwz r5, num_packets(r0)         # Number of packets to process

    # Process network packets with dynamic header parsing
    packet_processing_loop:
        # Load packet version to determine header layout
        lwz r6, 0(r4)                   # Load version field offset
        lwaux r7, r3, r6                # Load packet version and advance

        # Process based on protocol version
        cmpwi r7, PROTOCOL_V1
        beq process_v1_packet
        cmpwi r7, PROTOCOL_V2
        beq process_v2_packet
        cmpwi r7, PROTOCOL_V3
        beq process_v3_packet
        b unknown_protocol

    process_v1_packet:
        # V1 packet: [version, length, source, destination, data]
        lwz r8, 4(r4)                   # Load length field offset
        lwaux r9, r3, r8                # Load packet length and advance
        lwz r10, 8(r4)                  # Load source field offset
        lwaux r11, r3, r10              # Load source address and advance
        lwz r12, 12(r4)                 # Load destination field offset
        lwaux r13, r3, r12              # Load destination address and advance
        lwz r14, 16(r4)                 # Load data field offset
        lwaux r15, r3, r14              # Load data pointer and advance
        # Process V1 packet
        bl process_v1_packet_data
        b packet_complete

    process_v2_packet:
        # V2 packet: [version, length, source, destination, flags, data]
        lwz r8, 4(r4)                   # Load length field offset
        lwaux r9, r3, r8                # Load packet length and advance
        lwz r10, 8(r4)                  # Load source field offset
        lwaux r11, r3, r10              # Load source address and advance
        lwz r12, 12(r4)                 # Load destination field offset
        lwaux r13, r3, r12              # Load destination address and advance
        lwz r14, 16(r4)                 # Load flags field offset
        lwaux r15, r3, r14              # Load flags and advance
        lwz r16, 20(r4)                 # Load data field offset
        lwaux r17, r3, r16              # Load data pointer and advance
        # Process V2 packet
        bl process_v2_packet_data
        b packet_complete

    process_v3_packet:
        # V3 packet: [version, length, source, destination, flags, timestamp, data]
        lwz r8, 4(r4)                   # Load length field offset
        lwaux r9, r3, r8                # Load packet length and advance
        lwz r10, 8(r4)                  # Load source field offset
        lwaux r11, r3, r10              # Load source address and advance
        lwz r12, 12(r4)                 # Load destination field offset
        lwaux r13, r3, r12              # Load destination address and advance
        lwz r14, 16(r4)                 # Load flags field offset
        lwaux r15, r3, r14              # Load flags and advance
        lwz r16, 20(r4)                 # Load timestamp field offset
        lwaux r17, r3, r16              # Load timestamp and advance
        lwz r18, 24(r4)                 # Load data field offset
        lwaux r19, r3, r18              # Load data pointer and advance
        # Process V3 packet
        bl process_v3_packet_data
        b packet_complete

    unknown_protocol:
        bl handle_unknown_protocol

    packet_complete:
        addi r4, r4, 28                 # Next header layout (7 fields * 4 bytes)
        subi r5, r5, 1                  # Decrement packet counter
        cmpwi r5, 0
        bne packet_processing_loop      # Continue processing