# Higher-Order Nodes: Functional Programming in Pyiron Workflow
Pyiron Workflow supports a powerful functional programming paradigm where nodes can operate on other nodes, not just data. This advanced feature enables the creation of dynamic, reusable workflow patterns that can adapt to different computational scenarios, going beyond simple linear data flows.
## What Are Higher-Order Nodes?
In functional programming, a higher-order function is one that:

1. Takes one or more functions as arguments, or
2. Returns a function as its result

Similarly, in Pyiron Workflow, a higher-order node:

1. Takes one or more nodes as inputs, or
2. Returns a node as its output

This capability enables sophisticated workflow patterns such as:

- Looping constructs
- Conditional execution paths
- Workflow templates
- Dynamic workflow generation
- Recursive computations
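For comparison, here is the plain-Python analogue of the idea: a higher-order function that takes another function as an argument. This standard functional-programming illustration is independent of Pyiron Workflow:

```python
def apply_n_times(func, x, n):
    """Higher-order function: takes a function as an argument
    and applies it repeatedly to a starting value."""
    for _ in range(n):
        x = func(x)
    return x

def halve(x):
    return x / 2

# Applying `halve` three times to 8.0 gives 1.0
result = apply_n_times(halve, 8.0, 3)
```

A higher-order node plays the same role at the workflow level: instead of a bare Python function, it receives (or returns) a node object.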
## Defining Higher-Order Nodes
To define a higher-order node, you need to:

- Import the `Node` type for type annotations
- Use `Node` in your type hints
- Work with the node objects in your function body

Here's the basic structure:
```python
from pyiron_core.pyiron_workflow import Node  # Import the Node type

@as_function_node
def MyHigherOrderNode(input_node: Node, parameter: float) -> Node:
    # Function body that operates on input_node
    # ...
    return modified_node
```
## Output Port Naming in Higher-Order Nodes
Critical Insight: When working with higher-order nodes, output port names follow specific rules that are essential for proper connections:

1. For regular function nodes, output ports are named after:
   - The return variable name (e.g., `return result` → port name "result")
   - An explicit name in the decorator (e.g., `@as_function_node("value")` → port name "value")

2. When passing nodes to higher-order nodes, you must specify the exact output port:

   ```python
   # CORRECT - specifying the output port
   wf.result = loop_until(
       recursive_function=wf.recursive_func.outputs.new_x,
       max_steps=40,
   )

   # INCORRECT - missing output port specification
   wf.result = loop_until(
       recursive_function=wf.recursive_func,  # Missing .outputs.port_name
       max_steps=40,
   )
   ```

3. Higher-order node outputs follow the same naming rules as regular nodes:
   - Based on the return variable name or decorator specification
   - Must be accessed through the standard output structure
## Example: Looping Construct
The `loop_until` function is a perfect example of a higher-order node that implements a looping construct:
```python
from pyiron_core.pyiron_workflow import Node

@as_function_node
def loop_until(recursive_function: Node, max_steps: int = 10) -> float:
    """Executes a recursive function until a condition is met"""
    # Get the initial value from the recursive function's inputs
    x = recursive_function.inputs.x.value
    for i in range(max_steps):
        # Execute the recursive function with current value
        x, break_condition = recursive_function(x)
        if break_condition:
            break
    return x  # Output port will be named "x" (return variable name)
```
### Using the Looping Construct
Here's how to use this higher-order node in a workflow:
```python
# Define a recursive function node
@as_function_node
def recursive_step(x: float) -> tuple[float, bool]:
    """Computes next value and determines if we should stop"""
    new_x = x * 0.9  # Reduce by 10% each step
    should_stop = new_x < 0.1  # Stop when value drops below 0.1
    return new_x, should_stop  # Output ports: "new_x" and "should_stop"

# Create workflow
wf = Workflow("looping_example")

# Set up the recursive function with initial value
wf.recursive_func = recursive_step(x=1.0)

# Use the loop_until higher-order node
wf.result = loop_until(
    recursive_function=wf.recursive_func.outputs.new_x,  # Must specify output port
    max_steps=40,
)

# Execute workflow
wf.run()

# Access the result
result_value = wf.result.outputs.x.value  # Output port "x" from loop_until
```
Key points:

- The recursive function returns two values, creating output ports "new_x" and "should_stop"
- When connecting to the higher-order node, we must specify which output port to use
- The higher-order node's output port is named "x" (from the return variable)
- Results are accessed through the standard output structure
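Framework machinery aside, the numerics of this example can be checked in plain Python. The snippet below reimplements only the loop logic, not the node API:

```python
# Plain-Python check of the loop logic from the example above:
# repeatedly reduce x by 10% until it drops below 0.1.
x = 1.0
steps = 0
for _ in range(40):  # max_steps
    new_x = x * 0.9
    should_stop = new_x < 0.1
    x = new_x
    steps += 1
    if should_stop:
        break
```

Starting from 1.0, the loop terminates after 22 steps (0.9²² ≈ 0.098 < 0.1), well within the `max_steps=40` budget.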
## Benefits of Higher-Order Nodes
### 1. Workflow Reusability
Create workflow templates that can be applied to different computational problems:
```python
import numpy as np

# Define a generic optimization pattern
@as_function_node
def optimize(objective_function: Node, initial_guess: float, max_iterations: int = 100) -> float:
    """Optimizes a function using gradient descent"""
    x = initial_guess
    learning_rate = 0.1
    for i in range(max_iterations):
        # Compute gradient using finite differences
        f_x = objective_function(x)
        f_x_plus = objective_function(x + 1e-5)
        gradient = (f_x_plus - f_x) / 1e-5
        # Update position
        x = x - learning_rate * gradient
        # Check for convergence
        if abs(gradient) < 1e-6:
            break
    return x  # Output port will be named "x"

# Use with different objective functions
@as_function_node("result")
def quadratic_function(x: float) -> float:
    return x**2 + 2*x + 1

@as_function_node("result")
def sine_function(x: float) -> float:
    return np.sin(x)

# Create optimization workflows
wf_quadratic = Workflow("quadratic_opt")
wf_quadratic.objective = quadratic_function(x=0.0)
wf_quadratic.result = optimize(
    objective_function=wf_quadratic.objective.outputs.result,  # Specify output port
    initial_guess=5.0,
)

# Access result
result_value = wf_quadratic.result.outputs.x.value
```
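Stripped of the node machinery, the finite-difference gradient-descent core above can be verified in plain Python. The quadratic x² + 2x + 1 = (x + 1)² has its minimum at x = -1:

```python
def quadratic(x):
    return x**2 + 2*x + 1  # minimum at x = -1

# Same finite-difference gradient descent as the optimize node above
x = 5.0
learning_rate = 0.1
for _ in range(100):
    gradient = (quadratic(x + 1e-5) - quadratic(x)) / 1e-5
    x = x - learning_rate * gradient
    if abs(gradient) < 1e-6:
        break
```

The iterate converges to x ≈ -1 within the 100-iteration budget, confirming the update rule and stopping criterion.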
### 2. Dynamic Behavior
Generate workflows based on runtime conditions:
```python
@as_function_node
def select_computation(use_accurate: bool, x: float) -> float:
    """Selects between fast and accurate computation methods"""
    if use_accurate:
        # Create and execute accurate computation node
        accurate_node = accurate_computation(x=x)
        return accurate_node.run()
    else:
        # Create and execute fast computation node
        fast_node = fast_computation(x=x)
        return fast_node.run()
```

This node dynamically creates and executes different computation paths based on the `use_accurate` parameter.
### 3. Abstraction
Hide complex workflow patterns behind simple interfaces:
```python
@as_function_node
def monte_carlo_simulation(
    model_function: Node,
    num_samples: int = 1000,
    random_seed: int = 42,
) -> np.ndarray:
    """Runs Monte Carlo simulation using the provided model"""
    np.random.seed(random_seed)
    results = []
    for _ in range(num_samples):
        # Generate random inputs for the model
        inputs = np.random.uniform(-1, 1, size=model_function.input_dimension)
        # Execute the model with these inputs
        result = model_function(*inputs)
        results.append(result)
    return np.array(results)  # Output port will be named "results"

# Usage becomes simple despite the complex underlying pattern
wf.result = monte_carlo_simulation(
    model_function=wf.physics_model.outputs.result,  # Specify output port
    num_samples=5000,
)

# Access result
simulation_results = wf.result.outputs.results.value
```
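The sampling pattern itself is easy to sanity-check with NumPy outside the framework. Here a toy model f(x) = x² (invented for illustration) is averaged over uniform samples on [-1, 1], whose exact mean is 1/3:

```python
import numpy as np

rng = np.random.default_rng(42)

def toy_model(x):
    return x**2  # E[x^2] over uniform(-1, 1) is exactly 1/3

# Same sample-and-collect pattern as monte_carlo_simulation above
samples = rng.uniform(-1, 1, size=5000)
results = np.array([toy_model(x) for x in samples])
estimate = results.mean()  # close to 1/3 for 5000 samples
```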
## Advanced Example: Conditional Execution
Here's an example showing conditional execution paths:
```python
from pyiron_core.pyiron_workflow import Node

@as_function_node
def conditional_execute(condition: bool, true_node: Node, false_node: Node) -> float:
    """Executes true_node if condition is True, otherwise executes false_node"""
    if condition:
        return true_node.run()
    else:
        return false_node.run()

# Define two different computation paths
@as_function_node("result")
def fast_computation(x: float) -> float:
    """Fast but less accurate computation"""
    return x * 2

@as_function_node("result")
def accurate_computation(x: float) -> float:
    """Slow but more accurate computation"""
    import time
    time.sleep(0.1)  # Simulate expensive computation
    return x * x

# Create workflow
wf = Workflow("conditional_example")
wf.x = 2.0
wf.use_accurate = True  # Switch between computation methods

# Set up the computation nodes
wf.fast = fast_computation(x=wf.x)
wf.accurate = accurate_computation(x=wf.x)

# Use conditional execution
wf.result = conditional_execute(
    condition=wf.use_accurate,
    true_node=wf.accurate.outputs.result,  # Must specify output port
    false_node=wf.fast.outputs.result,  # Must specify output port
)

# Execute workflow
wf.run()

# Access result
result_value = wf.result.outputs.node.value
```
## Output Access Pattern for Higher-Order Nodes
After executing a workflow with higher-order nodes, you can access results through the structured output system:
```python
# For a higher-order node that returns a single value
result_value = wf.result.outputs.x.value  # Where "x" is the output port name

# A higher-order node that returns multiple values
# follows the same pattern as regular multiple-output nodes
```
The complete structure is:

- `wf.result` - The higher-order node instance
- `.outputs` - Contains all output ports of the higher-order node
- `.x` - The specific output port (name depends on the return variable)
- `.value` - The actual data value produced
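To make the attribute chain concrete, it can be mimicked with `types.SimpleNamespace`. This mock is purely illustrative of the access pattern's shape; it is not how the library implements ports:

```python
from types import SimpleNamespace

# Illustrative mock of the wf.result.outputs.x.value chain;
# the real library constructs these objects itself.
result = SimpleNamespace(
    outputs=SimpleNamespace(
        x=SimpleNamespace(value=42.0)  # "x" is the output port name
    )
)

result_value = result.outputs.x.value
```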
Important: The output port name for higher-order nodes follows the same rules as regular nodes:

- Based on the return variable name
- Or explicitly specified in the decorator
## Common Mistakes to Avoid
### 1. Missing Output Port Specification
```python
# ❌ Wrong - missing output port specification when connecting nodes
wf.result = loop_until(
    recursive_function=wf.recursive_func,  # Missing .outputs.port_name
    max_steps=40,
)

# ✅ Correct - specifying the exact output port
wf.result = loop_until(
    recursive_function=wf.recursive_func.outputs.new_x,
    max_steps=40,
)
```
### 2. Incorrect Output Access
```python
# ❌ Wrong - incorrect output port name
result_value = wf.result.outputs.result.value

# ✅ Correct - using the actual output port name
result_value = wf.result.outputs.x.value  # Based on return variable name
```
### 3. Inconsistent Output Port Naming
```python
# ❌ Wrong - inconsistent output port naming
@as_function_node
def process_value(x: float) -> float:
    result = x * 2
    return result  # Output port will be "result"

@as_function_node
def optimize(objective_function: Node, initial_guess: float) -> float:
    x = initial_guess
    # ...
    return x  # Output port will be "x" (inconsistent with above)

# ✅ Better - consistent output port naming
@as_function_node("value")
def process_value(x: float) -> float:
    result = x * 2
    return result  # Output port will be "value"

@as_function_node("value")
def optimize(objective_function: Node, initial_guess: float) -> float:
    x = initial_guess
    # ...
    return x  # Output port will be "value"
```
## Best Practices for Higher-Order Nodes
### 1. Explicit Output Port Naming
Use explicit output port names with the decorator for clarity and consistency:
```python
@as_function_node("value")
def compute_something(x: float) -> float:
    result = x * x
    return result
```
This ensures predictable output port names that are easier to work with in higher-order nodes.
### 2. Document Output Ports
Clearly document the output ports of your nodes:
```python
@as_function_node
def SplitData(array: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """
    Splits an array into even- and odd-indexed elements.

    Returns:
        - even_elements: Elements at even indices
        - odd_elements: Elements at odd indices
    """
    even_elements = array[::2]
    odd_elements = array[1::2]
    return even_elements, odd_elements

# Output ports: "even_elements" and "odd_elements"
```
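The splitting logic itself can be checked independently of the node wrapper:

```python
import numpy as np

# Same even/odd index split as in SplitData above
array = np.arange(6)        # [0, 1, 2, 3, 4, 5]
even_elements = array[::2]  # elements at indices 0, 2, 4
odd_elements = array[1::2]  # elements at indices 1, 3, 5
```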
### 3. Use Meaningful Return Variable Names
Choose return variable names that clearly indicate their purpose:
```python
# ❌ Unclear
return x, delta

# ✅ Clear
final_value = x
convergence_delta = delta
return final_value, convergence_delta
# Output ports: "final_value" and "convergence_delta"
```
### 4. Test with Multiple Scenarios
Test higher-order nodes with various input nodes to ensure robustness:
```python
def test_loop_until():
    # Test with different initial values
    # Test with different convergence criteria
    # Test with maximum iterations reached
    pass
```
## Real-World Applications
### 1. Parameter Optimization
```python
@as_function_node("optimized_parameters")
def optimize_parameters(
    objective_function: Node,
    initial_parameters: np.ndarray,
    learning_rate: float = 0.01,
    max_iterations: int = 100,
) -> np.ndarray:
    """Optimizes parameters using gradient descent"""
    params = initial_parameters.copy()
    for i in range(max_iterations):
        # Compute gradient by finite differences
        gradients = np.zeros_like(params)
        loss = objective_function(params)
        for j in range(len(params)):
            params_plus = params.copy()
            params_plus[j] += 1e-5
            loss_plus = objective_function(params_plus)
            gradients[j] = (loss_plus - loss) / 1e-5
        # Update parameters
        params -= learning_rate * gradients
        # Check convergence
        if np.linalg.norm(gradients) < 1e-6:
            break
    return params  # Output port will be "optimized_parameters"

# Usage
wf.objective = physics_model(parameters=initial_guess)
wf.optimized = optimize_parameters(
    objective_function=wf.objective.outputs.loss,  # Specify output port
    initial_parameters=initial_guess,
)
result = wf.optimized.outputs.optimized_parameters.value
```
### 2. Adaptive Mesh Refinement
```python
@as_function_node("final_solution")
def adaptive_mesh_refinement(
    simulation_node: Node,
    error_estimator_node: Node,
    refine_mesh_node: Node,
    initial_mesh_size: float,
    max_refinements: int = 5,
    tolerance: float = 0.01,
) -> float:
    """Refines mesh until solution converges"""
    mesh = initial_mesh_size
    for i in range(max_refinements):
        # Run simulation on current mesh
        solution = simulation_node(mesh=mesh)
        # Estimate error
        error = error_estimator_node(solution=solution, mesh_size=mesh)
        # Check if we've converged
        if error < tolerance:
            return solution
        # Refine mesh in high-error regions
        mesh = refine_mesh_node(mesh_size=mesh, error=error)
    return solution  # Output port will be "final_solution"

# Usage
wf.simulation = fluid_dynamics_simulation(mesh_size=initial_size)
wf.error_estimator = error_calculator(solution=wf.simulation)
wf.refine_mesh = mesh_refiner(mesh_size=initial_size, error=0.1)
wf.result = adaptive_mesh_refinement(
    simulation_node=wf.simulation.outputs.solution,  # Specify output ports
    error_estimator_node=wf.error_estimator.outputs.error,
    refine_mesh_node=wf.refine_mesh.outputs.refined_mesh,
    initial_mesh_size=initial_size,
)
solution = wf.result.outputs.final_solution.value
```
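The refinement control flow can be exercised with toy stand-ins for the three nodes. The quadratic error model and mesh-halving refiner below are invented purely for illustration:

```python
# Toy stand-ins for the simulation, error estimator, and refiner
def run_simulation(mesh_size):
    return 1.0 + mesh_size   # made-up "solution"

def estimate_error(mesh_size):
    return mesh_size ** 2    # made-up error model: error shrinks with the mesh

def refine(mesh_size):
    return mesh_size / 2     # halve the mesh each refinement

# Same loop structure as adaptive_mesh_refinement above
mesh, tolerance = 1.0, 0.01
for _ in range(5):  # max_refinements
    solution = run_simulation(mesh)
    error = estimate_error(mesh)
    if error < tolerance:
        break
    mesh = refine(mesh)
```

Starting from mesh size 1.0, the loop halves the mesh four times before the error (0.0625² ≈ 0.0039) drops below the 0.01 tolerance.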
## Conclusion
Higher-order nodes represent one of the most powerful features of Pyiron Workflow, enabling you to create sophisticated computational patterns that would be difficult or impossible with traditional workflow systems. By treating nodes as first-class objects that can be passed around and manipulated, Pyiron Workflow supports a truly functional approach to scientific computing.
When used appropriately, higher-order nodes can:

- Reduce code duplication
- Improve workflow readability
- Enable more sophisticated computational patterns
- Make your scientific workflows more adaptable and reusable
Critical Implementation Notes:

- Always specify exact output ports when connecting nodes (`node.outputs.port_name`)
- Be consistent with output port naming conventions
- Access results through the standard output structure (`wf.node.outputs.port.value`)
- Define all nodes at the module level (not inside functions)
As with any powerful feature, higher-order nodes should be used judiciously. Start with simple applications and gradually incorporate more complex patterns as you become comfortable with the paradigm. The ability to create dynamic, self-modifying workflows opens up new possibilities for scientific computing that go beyond traditional static workflow systems.