Instruction Syntax
| Mnemonic | Format | Flags |
| lfd | frD,d(rA) | - |
Instruction Encoding
| Field | Bits | Description |
| Primary Opcode | 0-5 | 110010 (0x32) |
| frD | 6-10 | Destination floating-point register |
| rA | 11-15 | Source register A |
| d | 16-31 | 16-bit signed displacement |
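The field layout above can be checked by assembling an instruction word by hand. The following is a minimal sketch in Python, not part of any assembler; the names `encode_lfd` and `decode_lfd` are hypothetical. It assumes the PowerPC bit numbering used in the table, where bit 0 is the most significant bit of the 32-bit word.

```python
LFD_OPCODE = 0b110010  # primary opcode 50 (0x32), bits 0-5


def encode_lfd(frd, ra, d):
    """Build the 32-bit instruction word for `lfd frD, d(rA)`."""
    if not 0 <= frd <= 31:
        raise ValueError("frD must be in 0..31")
    if not 0 <= ra <= 31:
        raise ValueError("rA must be in 0..31")
    if not -0x8000 <= d <= 0x7FFF:
        raise ValueError("d must fit in a signed 16-bit field")
    # Bit 0 is the MSB, so bits 0-5 sit at shift 26, 6-10 at 21, 11-15 at 16.
    return (LFD_OPCODE << 26) | (frd << 21) | (ra << 16) | (d & 0xFFFF)


def decode_lfd(word):
    """Split an lfd instruction word back into (frD, rA, d)."""
    if (word >> 26) & 0x3F != LFD_OPCODE:
        raise ValueError("not an lfd instruction")
    frd = (word >> 21) & 0x1F
    ra = (word >> 16) & 0x1F
    d = word & 0xFFFF
    if d & 0x8000:  # sign-extend the 16-bit displacement
        d -= 0x10000
    return frd, ra, d


print(hex(encode_lfd(1, 3, 0)))    # lfd f1, 0(r3)
print(hex(encode_lfd(3, 4, -16)))  # lfd f3, -16(r4)
```

A negative displacement is stored in two's-complement form in the low 16 bits, which is why the decoder sign-extends it.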
Operation
if rA = 0 then EA ← EXTS(d)
else EA ← (rA) + EXTS(d)
frD ← MEM(EA, 8)
A double-precision floating-point value (64 bits) is loaded from memory and placed in floating-point register frD. The effective address is the sum of the sign-extended 16-bit displacement and the contents of register rA; when the rA field is 0, the value 0 is used as the base instead.
Note: The 64-bit value is copied into frD unchanged, with no format conversion, because it is already in IEEE-754 double-precision format. The effective address should be doubleword-aligned (divisible by 8) for best performance; a misaligned access may be slower or, on some implementations, cause an alignment exception. If the rA field is 0, the base is the value 0, not the contents of register r0.
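The operation can be modeled in a few lines of Python. This is only a sketch of the semantics described above under stated assumptions: memory is a flat big-endian `bytearray`, effective addresses are 32 bits wide, and the names `exts16` and `lfd` are hypothetical helpers, not an emulator API.

```python
import struct


def exts16(d):
    """Sign-extend a 16-bit displacement field (EXTS in the pseudocode)."""
    d &= 0xFFFF
    return d - 0x10000 if d & 0x8000 else d


def lfd(memory, gpr, ra, d):
    """Return the double that `lfd frD, d(rA)` would place in frD.

    memory: bytearray modeling big-endian memory (assumption)
    gpr:    list of 32 general-purpose register values
    """
    base = 0 if ra == 0 else gpr[ra]       # rA = 0 means the value 0, not r0
    ea = (base + exts16(d)) & 0xFFFFFFFF   # 32-bit effective address
    return struct.unpack_from(">d", memory, ea)[0]


memory = bytearray(64)
struct.pack_into(">d", memory, 16, 3.5)    # place 3.5 at address 16
gpr = [0] * 32
gpr[0] = 32                                # deliberately nonzero r0
gpr[3] = 32

print(lfd(memory, gpr, 3, -16))  # base r3 = 32, EA = 16
print(lfd(memory, gpr, 0, 16))   # rA field 0: base is 0, r0 is ignored
```

Both calls read the same doubleword, which shows that a zero rA field ignores whatever r0 currently holds.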
Affected Registers
None - This instruction does not affect any condition register fields or XER register bits.
For more information on floating-point operations see Section 2.1.4, "Floating-Point Status and Control Register (FPSCR)," in the PowerPC Microprocessor Family: The Programming Environments manual.
Examples
Basic Double-Precision Loading
lfd f1, 0(r3) # Load double from the address in r3
lfd f2, 8(r3) # Load next double
lfd f3, -16(r4) # Load from r4-16
Scientific Computing - Matrix Operations
# Load 3x3 matrix for scientific calculations
lis r3, matrix_data@ha
addi r3, r3, matrix_data@l
# Load first row
lfd f0, 0(r3) # m[0][0]
lfd f1, 8(r3) # m[0][1]
lfd f2, 16(r3) # m[0][2]
# Load second row
lfd f3, 24(r3) # m[1][0]
lfd f4, 32(r3) # m[1][1]
lfd f5, 40(r3) # m[1][2]
# Load third row
lfd f6, 48(r3) # m[2][0]
lfd f7, 56(r3) # m[2][1]
lfd f8, 64(r3) # m[2][2]
# Calculate determinant:
# det = a11(a22*a33 - a23*a32) - a12(a21*a33 - a23*a31) + a13(a21*a32 - a22*a31)
fmul f9, f4, f8 # a22 * a33
fmul f10, f5, f7 # a23 * a32
fsub f11, f9, f10 # (a22*a33 - a23*a32)
fmul f12, f0, f11 # a11 * (a22*a33 - a23*a32)
fmul f13, f3, f8 # a21 * a33
fmul f14, f5, f6 # a23 * a31
fsub f15, f13, f14 # (a21*a33 - a23*a31)
fmul f16, f1, f15 # a12 * (a21*a33 - a23*a31)
fmul f17, f3, f7 # a21 * a32
fmul f18, f4, f6 # a22 * a31
fsub f19, f17, f18 # (a21*a32 - a22*a31)
fmul f20, f2, f19 # a13 * (a21*a32 - a22*a31)
fsub f21, f12, f16 # First two terms
fadd f22, f21, f20 # Final determinant
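A reference version of the same cofactor expansion helps when checking the register arithmetic. This is a small Python sketch; the function name `det3` is hypothetical, and the matrix is assumed to be a row-major list of three rows, matching the memory layout read by the loads above.

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a11, a12, a13), (a21, a22, a23), (a31, a32, a33) = m
    minor1 = a22 * a33 - a23 * a32   # corresponds to f11
    minor2 = a21 * a33 - a23 * a31   # corresponds to f15
    minor3 = a21 * a32 - a22 * a31   # corresponds to f19
    return a11 * minor1 - a12 * minor2 + a13 * minor3


print(det3([[2.0, 0.0, 0.0],
            [0.0, 3.0, 0.0],
            [0.0, 0.0, 4.0]]))
```

For a diagonal matrix only the first term survives, so the result is the product of the diagonal entries.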
High-Precision Financial Calculations
# Black-Scholes option pricing with double precision
# ln_function, normal_cdf, and exp_function are helper routines not shown here;
# they are assumed to preserve every register except the one holding their result
lis r3, option_params@ha
addi r3, r3, option_params@l
lfd f1, STOCK_PRICE(r3) # Current stock price (S)
lfd f2, STRIKE_PRICE(r3) # Strike price (K)
lfd f3, RISK_FREE_RATE(r3) # Risk-free rate (r)
lfd f4, TIME_TO_EXPIRY(r3) # Time to expiration (T)
lfd f5, VOLATILITY(r3) # Volatility (σ)
# Calculate d1 = [ln(S/K) + (r + σ²/2)T] / (σ√T)
fdiv f6, f1, f2 # S/K
bl ln_function # f6 = ln(S/K)
fmul f7, f5, f5 # σ²
lfd f8, half_constant(r0) # Load 0.5
fmul f9, f7, f8 # σ²/2
fadd f10, f3, f9 # r + σ²/2
fmul f11, f10, f4 # (r + σ²/2)T
fadd f12, f6, f11 # ln(S/K) + (r + σ²/2)T
fsqrt f13, f4 # √T (fsqrt is optional; not every implementation provides it)
fmul f14, f5, f13 # σ√T
fdiv f15, f12, f14 # d1
# Calculate d2 = d1 - σ√T
fsub f16, f15, f14 # d2 = d1 - σ√T
# Calculate N(d1) and N(d2) using cumulative normal distribution
fmr f1, f15 # Pass d1
bl normal_cdf # f1 = N(d1)
fmr f17, f1 # Save N(d1)
fmr f1, f16 # Pass d2
bl normal_cdf # f1 = N(d2)
fmr f18, f1 # Save N(d2)
# Calculate call option price: C = S*N(d1) - K*e^(-rT)*N(d2)
lfd f19, STOCK_PRICE(r3) # Reload S
fmul f20, f19, f17 # S * N(d1)
fneg f21, f3 # -r
fmul f22, f21, f4 # -rT
bl exp_function # f22 = e^(-rT)
lfd f23, STRIKE_PRICE(r3) # Reload K
fmul f24, f23, f22 # K * e^(-rT)
fmul f25, f24, f18 # K * e^(-rT) * N(d2)
fsub f26, f20, f25 # Call option price
stfd f26, option_value(r3) # Store result
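The formulas in the comments above can be cross-checked with a short reference implementation. This is a sketch in Python, assuming the standard closed-form price of a European call; the function names are hypothetical, and the cumulative normal distribution is computed from `math.erf` rather than from whatever `normal_cdf` does in the assembly.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes_call(s, k, r, t, sigma):
    """European call price C = S*N(d1) - K*e^(-rT)*N(d2)."""
    sigma_sqrt_t = sigma * math.sqrt(t)                       # σ√T
    d1 = (math.log(s / k) + (r + 0.5 * sigma * sigma) * t) / sigma_sqrt_t
    d2 = d1 - sigma_sqrt_t
    return s * normal_cdf(d1) - k * math.exp(-r * t) * normal_cdf(d2)


# S = 100, K = 100, r = 5%, T = 1 year, σ = 20%
print(round(black_scholes_call(100.0, 100.0, 0.05, 1.0, 0.2), 4))
```

With these inputs d1 is 0.35 and d2 is 0.15, which makes the intermediate values easy to compare against the contents of f15 and f16.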
Digital Signal Processing - FFT
# Load complex numbers for FFT calculation
lis r3, fft_data@ha
addi r3, r3, fft_data@l
li r4, 1024 # FFT size
li r5, 0 # Current index
fft_load_loop:
slwi r6, r5, 4 # Index * 16 (8 bytes real + 8 bytes imag)
add r7, r3, r6 # Address of complex number
lfd f1, 0(r7) # Load real part
lfd f2, 8(r7) # Load imaginary part
# Store in separated real and imaginary arrays for processing
lis r8, real_array@ha
addi r8, r8, real_array@l
lis r9, imag_array@ha
addi r9, r9, imag_array@l
slwi r10, r5, 3 # Index * 8 bytes
stfdx f1, r8, r10 # Store real part
stfdx f2, r9, r10 # Store imaginary part
addi r5, r5, 1 # Next complex number
cmpw r5, r4 # Check if done
blt fft_load_loop
# Perform bit-reversal permutation
li r5, 0 # Current index
bit_reverse_loop:
mr r6, r5 # Copy index
li r7, 0 # Reversed index
li r8, 10 # log2(1024) = 10 bits
reverse_bits:
rlwinm r9, r6, 0, 31, 31 # Extract LSB
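slwi r7, r7, 1 # Shift reversed index left (immediate form; slw takes its count from a register)
or r7, r7, r9 # Insert bit
srwi r6, r6, 1 # Shift original right (immediate form; srw takes its count from a register)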
subi r8, r8, 1 # Decrement bit count
cmpwi r8, 0
bgt reverse_bits
# Swap if needed
cmpw r5, r7 # Compare indices
bge no_swap # Skip if already processed
# Load and swap real parts
lis r8, real_array@ha
addi r8, r8, real_array@l
slwi r9, r5, 3 # r5 * 8
slwi r10, r7, 3 # r7 * 8
lfdx f1, r8, r9 # Load real[r5]
lfdx f2, r8, r10 # Load real[r7]
stfdx f2, r8, r9 # Store real[r7] at r5
stfdx f1, r8, r10 # Store real[r5] at r7
# Load and swap imaginary parts
lis r8, imag_array@ha
addi r8, r8, imag_array@l
lfdx f1, r8, r9 # Load imag[r5]
lfdx f2, r8, r10 # Load imag[r7]
stfdx f2, r8, r9 # Store imag[r7] at r5
stfdx f1, r8, r10 # Store imag[r5] at r7
no_swap:
addi r5, r5, 1 # Next index
cmpw r5, r4 # Check if done
blt bit_reverse_loop
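The bit-reversal permutation performed by the two loops above can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes separate real and imaginary arrays whose length is a power of two, as in the assembly.

```python
def reverse_bits(index, bits):
    """Reverse the lowest `bits` bits of index, one bit per iteration."""
    reversed_index = 0
    for _ in range(bits):
        reversed_index = (reversed_index << 1) | (index & 1)  # slwi + or
        index >>= 1                                           # srwi
    return reversed_index


def bit_reverse_permute(real, imag):
    """Reorder both arrays in place into bit-reversed index order."""
    n = len(real)
    bits = n.bit_length() - 1        # log2(n) for a power-of-two length
    for i in range(n):
        j = reverse_bits(i, bits)
        if i < j:                    # swap each pair only once
            real[i], real[j] = real[j], real[i]
            imag[i], imag[j] = imag[j], imag[i]


real = [float(i) for i in range(8)]
imag = [0.0] * 8
bit_reverse_permute(real, imag)
print(real)
```

The `i < j` test plays the role of the `bge no_swap` branch: it skips indices that map to themselves and pairs that were already exchanged.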
3D Graphics Transformations
# Load and apply 4x4 transformation matrix to vertex
lis r3, transform_matrix@ha
addi r3, r3, transform_matrix@l
lis r4, vertex_data@ha
addi r4, r4, vertex_data@l
# Load transformation matrix (row-major order)
lfd f0, 0(r3) # m00
lfd f1, 8(r3) # m01
lfd f2, 16(r3) # m02
lfd f3, 24(r3) # m03
lfd f4, 32(r3) # m10
lfd f5, 40(r3) # m11
lfd f6, 48(r3) # m12
lfd f7, 56(r3) # m13
lfd f8, 64(r3) # m20
lfd f9, 72(r3) # m21
lfd f10, 80(r3) # m22
lfd f11, 88(r3) # m23
lfd f12, 96(r3) # m30
lfd f13, 104(r3) # m31
lfd f14, 112(r3) # m32
lfd f15, 120(r3) # m33
# Load vertex position (x, y, z, w)
lfd f16, 0(r4) # x
lfd f17, 8(r4) # y
lfd f18, 16(r4) # z
lfd f19, 24(r4) # w (usually 1.0)
# Transform vertex: result = matrix * vertex
# x' = m00*x + m01*y + m02*z + m03*w
fmul f20, f0, f16 # m00 * x
fmadd f20, f1, f17, f20 # + m01 * y
fmadd f20, f2, f18, f20 # + m02 * z
fmadd f20, f3, f19, f20 # + m03 * w
# y' = m10*x + m11*y + m12*z + m13*w
fmul f21, f4, f16 # m10 * x
fmadd f21, f5, f17, f21 # + m11 * y
fmadd f21, f6, f18, f21 # + m12 * z
fmadd f21, f7, f19, f21 # + m13 * w
# z' = m20*x + m21*y + m22*z + m23*w
fmul f22, f8, f16 # m20 * x
fmadd f22, f9, f17, f22 # + m21 * y
fmadd f22, f10, f18, f22 # + m22 * z
fmadd f22, f11, f19, f22 # + m23 * w
# w' = m30*x + m31*y + m32*z + m33*w
fmul f23, f12, f16 # m30 * x
fmadd f23, f13, f17, f23 # + m31 * y
fmadd f23, f14, f18, f23 # + m32 * z
fmadd f23, f15, f19, f23 # + m33 * w
# Store transformed vertex
stfd f20, 0(r4) # Store x'
stfd f21, 8(r4) # Store y'
stfd f22, 16(r4) # Store z'
stfd f23, 24(r4) # Store w'
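The same matrix-vector product can be written as a short Python sketch for checking results. The name `transform_vertex` is hypothetical, and the matrix is assumed to be row-major as stated in the example. Each row accumulates in the same order as the fmul/fmadd chain, although Python rounds after every multiply, whereas fmadd rounds once per fused operation, so the last bits may differ.

```python
def transform_vertex(m, v):
    """Return m * v for a row-major 4x4 matrix m and a 4-component vertex v."""
    result = []
    for row in m:
        acc = row[0] * v[0]          # fmul
        acc = row[1] * v[1] + acc    # fmadd
        acc = row[2] * v[2] + acc    # fmadd
        acc = row[3] * v[3] + acc    # fmadd
        result.append(acc)
    return result


# Translation by (5, -2, 3) applied to the point (1, 2, 3)
translate = [[1.0, 0.0, 0.0, 5.0],
             [0.0, 1.0, 0.0, -2.0],
             [0.0, 0.0, 1.0, 3.0],
             [0.0, 0.0, 0.0, 1.0]]
print(transform_vertex(translate, [1.0, 2.0, 3.0, 1.0]))
```

With w = 1.0 the fourth column of the matrix acts as a translation, which is why the example keeps the homogeneous coordinate.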
Numerical Integration
# Simpson's rule numerical integration with double precision
lis r3, function_data@ha
addi r3, r3, function_data@l
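lfd f1, start_point(r0) # Integration start (a)
fmr f20, f1 # Keep a in f20; f1 is overwritten by each function result
lfd f2, end_point(r0) # Integration end (b)
lwz r4, num_intervals(r0) # Number of intervals (must be even)
# evaluate_function (not shown) is assumed to take x in f6, return f(x) in f1,
# and preserve every other register used here
# Calculate step size: h = (b - a) / n
fsub f3, f2, f1 # b - a
# Convert n to double: build the bit pattern of 2^52 + 2^31 + n in memory,
# load it with lfd, then subtract the bias 2^52 + 2^31.
# Assumptions: temp_n is an 8-byte, doubleword-aligned stack slot and
# int_bias holds the constant 0x4330000080000000
lfd f0, int_bias(r0) # Bias 2^52 + 2^31
lis r7, 0x4330 # High word of the biased double
stw r7, temp_n(r1) # Store high word
xoris r7, r4, 0x8000 # n with its sign bit flipped
stw r7, temp_n+4(r1) # Store low word
lfd f4, temp_n(r1) # 2^52 + 2^31 + n
fsub f4, f4, f0 # n as a double
fdiv f5, f3, f4 # h = (b - a) / n
# Initialize sum with f(a) and f(b)
fmr f6, f1 # x = a
bl evaluate_function # f(a)
fmr f7, f1 # Save f(a)
fmr f6, f2 # x = b
bl evaluate_function # f(b)
fadd f8, f7, f1 # f(a) + f(b)
# Add 4 * sum of odd points and 2 * sum of even points
lfd f9, zero_constant(r0) # Sum of odd points
lfd f10, zero_constant(r0) # Sum of even points
li r5, 1 # Current interval
integration_loop:
cmpw r5, r4 # Check if done
bge integration_done
# Calculate x = a + i * h
xoris r7, r5, 0x8000 # i with its sign bit flipped
stw r7, temp_n+4(r1) # Replace low word; high word 0x43300000 is still in place
lfd f11, temp_n(r1) # 2^52 + 2^31 + i
fsub f11, f11, f0 # i as a double
fmul f12, f11, f5 # i * h
fadd f6, f20, f12 # x = a + i * h
bl evaluate_function # Evaluate f(x)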
# Check if i is odd or even
andi. r6, r5, 1 # Check LSB
beq even_point
# Odd point
fadd f9, f9, f1 # Add to odd sum
b next_point
even_point:
# Even point
fadd f10, f10, f1 # Add to even sum
next_point:
addi r5, r5, 1 # Next interval
b integration_loop
integration_done:
# Calculate final result: (h/3) * [f(a) + f(b) + 4*odd_sum + 2*even_sum]
lfd f11, four_constant(r0) # Load 4.0
lfd f12, two_constant(r0) # Load 2.0
fmul f13, f11, f9 # 4 * odd_sum
fmul f14, f12, f10 # 2 * even_sum
fadd f15, f8, f13 # f(a) + f(b) + 4*odd_sum
fadd f16, f15, f14 # + 2*even_sum
lfd f17, three_constant(r0) # Load 3.0
fdiv f18, f5, f17 # h/3
fmul f19, f18, f16 # Final integral result
stfd f19, integral_result(r0) # Store result
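For comparison, composite Simpson's rule as used above can be sketched in Python. The name `simpson` and the callable `f` are hypothetical stand-ins for `evaluate_function`; the structure follows the assembly, with separate sums for odd and even interior points.

```python
def simpson(f, a, b, n):
    """Approximate the integral of f over [a, b] with n (even) intervals."""
    if n <= 0 or n % 2 != 0:
        raise ValueError("n must be a positive even number")
    h = (b - a) / n                  # step size
    odd_sum = 0.0
    even_sum = 0.0
    for i in range(1, n):            # interior points only
        fx = f(a + i * h)
        if i & 1:
            odd_sum += fx            # weighted by 4 in the final formula
        else:
            even_sum += fx           # weighted by 2 in the final formula
    return (h / 3.0) * (f(a) + f(b) + 4.0 * odd_sum + 2.0 * even_sum)


print(simpson(lambda x: x * x, 0.0, 1.0, 4))
```

Simpson's rule is exact for polynomials up to degree three, so integrating x² over [0, 1] gives 1/3 up to rounding error, which makes a convenient sanity check.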
Monte Carlo Simulation
# Monte Carlo estimation of π using double precision
lis r3, random_seed@ha
addi r3, r3, random_seed@l
lwz r4, num_samples(r0) # Number of random samples
li r5, 0 # Count of points inside circle
li r6, 0 # Current sample
monte_carlo_loop:
cmpw r6, r4 # Check if done
bge monte_carlo_done
# Generate random x coordinate [-1, 1]
bl generate_random # Returns random value in f1
lfd f2, two_constant(r0) # Load 2.0
fmul f3, f1, f2 # Scale to [0, 2]
lfd f4, one_constant(r0) # Load 1.0
fsub f5, f3, f4 # Shift to [-1, 1]
# Generate random y coordinate [-1, 1]
bl generate_random # Returns random value in f1
fmul f6, f1, f2 # Scale to [0, 2]
fsub f7, f6, f4 # Shift to [-1, 1]
# Calculate distance from origin: d² = x² + y²
fmul f8, f5, f5 # x²
fmadd f9, f7, f7, f8 # x² + y²
# Check if point is inside unit circle (d² < 1)
fcmpu cr0, f9, f4 # Compare d² with 1.0
bge outside_circle # Branch if >= 1.0
addi r5, r5, 1 # Increment inside count
outside_circle:
addi r6, r6, 1 # Next sample
b monte_carlo_loop
monte_carlo_done:
# Estimate π = 4 * (inside_count / total_samples)
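# Convert both counts to double: the bit pattern 0x43300000 followed by
# (n XOR 0x80000000) is the double 2^52 + 2^31 + n, so load it with lfd
# and subtract the bias 2^52 + 2^31.
# Assumptions: temp_inside and temp_total are 8-byte, doubleword-aligned
# stack slots and int_bias holds the constant 0x4330000080000000
lfd f0, int_bias(r0) # Bias 2^52 + 2^31
lis r7, 0x4330 # High word shared by both biased values
stw r7, temp_inside(r1) # High word
xoris r8, r5, 0x8000 # Inside count with its sign bit flipped
stw r8, temp_inside+4(r1) # Low word
lfd f10, temp_inside(r1) # 2^52 + 2^31 + inside count
fsub f10, f10, f0 # Inside count as a double
stw r7, temp_total(r1) # High word
xoris r8, r4, 0x8000 # Total samples with its sign bit flipped
stw r8, temp_total+4(r1) # Low word
lfd f11, temp_total(r1) # 2^52 + 2^31 + total samples
fsub f11, f11, f0 # Total samples as a double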
fdiv f12, f10, f11 # Ratio of inside points
lfd f13, four_constant(r0) # Load 4.0
fmul f14, f12, f13 # π estimate
stfd f14, pi_estimate(r0) # Store π estimate
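The final step needs the integer counts as floating-point values. On 32-bit PowerPC code a common idiom for this is to build a biased double in memory and load it with lfd. The Python sketch below models that idiom at the bit level; the names `int32_to_double` and `estimate_pi` are hypothetical, and big-endian word order is assumed.

```python
import struct

BIAS_BITS = 0x4330000080000000  # bit pattern of the double 2**52 + 2**31


def int32_to_double(n):
    """Convert a signed 32-bit integer to a double using the bias trick."""
    if not -2**31 <= n < 2**31:
        raise ValueError("n must fit in a signed 32-bit integer")
    low_word = (n & 0xFFFFFFFF) ^ 0x80000000        # flip the sign bit
    raw = struct.pack(">II", 0x43300000, low_word)  # high word, then low word
    biased = struct.unpack(">d", raw)[0]            # what lfd would load
    bias = struct.unpack(">d", struct.pack(">Q", BIAS_BITS))[0]
    return biased - bias                            # exact: both fit in 53 bits


def estimate_pi(inside_count, total_samples):
    """pi is approximately 4 * (points inside the circle / total points)."""
    return 4.0 * int32_to_double(inside_count) / int32_to_double(total_samples)


print(int32_to_double(-7))
print(estimate_pi(785, 1000))
```

The high word 0x43300000 encodes an exponent of 2^52, so the low word contributes its integer value directly to the mantissa; flipping the sign bit adds 2^31, and subtracting the bias removes both offsets.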
Machine Learning - Neural Network
# Load neural network weights and biases (double precision)
lis r3, weight_matrix@ha
addi r3, r3, weight_matrix@l
lis r4, bias_vector@ha
addi r4, r4, bias_vector@l
lis r5, input_vector@ha
addi r5, r5, input_vector@l
lwz r6, layer_size(r0) # Number of neurons
li r7, 0 # Current neuron
neural_layer_loop:
cmpw r7, r6 # Check if done with layer
bge layer_done
# Load bias for current neuron
slwi r8, r7, 3 # neuron * 8 bytes
lfdx f1, r4, r8 # Load bias
# Calculate weighted sum: sum = bias + Σ(weight * input)
lwz r9, input_size(r0) # Number of inputs
li r10, 0 # Input index
weight_sum_loop:
cmpw r10, r9 # Check if done with inputs
bge weight_sum_done
# Calculate weight matrix address: weights[neuron][input]
mullw r11, r7, r9 # neuron * input_size
add r11, r11, r10 # + input_index
slwi r11, r11, 3 # * 8 bytes
lfdx f2, r3, r11 # Load weight
# Load input value
slwi r12, r10, 3 # input * 8 bytes
lfdx f3, r5, r12 # Load input
# Multiply and accumulate
fmadd f1, f2, f3, f1 # sum += weight * input
addi r10, r10, 1 # Next input
b weight_sum_loop
weight_sum_done:
# Apply activation function (sigmoid)
# sigmoid(x) = 1 / (1 + e^(-x))
fneg f4, f1 # -x
bl exp_function # e^(-x); helper not shown, assumed to take its argument in f4 and return the result in f1
lfd f5, one_constant(r0) # Load 1.0
fadd f6, f1, f5 # 1 + e^(-x)
fdiv f7, f5, f6 # 1 / (1 + e^(-x))
# Store neuron output
lis r13, output_vector@ha
addi r13, r13, output_vector@l
slwi r14, r7, 3 # neuron * 8 bytes
stfdx f7, r13, r14 # Store output
addi r7, r7, 1 # Next neuron
b neural_layer_loop
layer_done:
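The layer computation above can be summarized by a short Python sketch. The names `sigmoid` and `dense_layer` are hypothetical; the weights are assumed to be stored as one flat row-major array indexed by `neuron * input_size + input`, which matches the address calculation in the inner loop.

```python
import math


def sigmoid(x):
    """Logistic activation: 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def dense_layer(weights, biases, inputs):
    """Compute sigmoid(bias + sum(weight * input)) for every neuron."""
    input_size = len(inputs)
    outputs = []
    for neuron, bias in enumerate(biases):
        total = bias                                   # start from the bias
        for i, x in enumerate(inputs):
            total += weights[neuron * input_size + i] * x
        outputs.append(sigmoid(total))
    return outputs


# Two neurons, two inputs; weights laid out as [w00, w01, w10, w11]
print(dense_layer([1.0, -1.0, 0.0, 0.0], [0.0, 0.0], [3.0, 3.0]))
```

In this usage example the first neuron's weighted inputs cancel and the second neuron has zero weights, so both pre-activation sums are 0 and both outputs are sigmoid(0) = 0.5.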