AWS Lambda Durable Functions Are GA for Java: When Should You Drop Step Functions?

With the Java SDK now generally available, Lambda durable functions cover all three major runtimes. Here is a side-by-side comparison with Step Functions using a real human-approval workflow, including sample code, a cost breakdown, and a decision framework for choosing between the two.

One of my customers at DoiT recently asked me a question I have heard a few times since AWS announced Lambda durable functions at re:Invent 2025: "We have three Step Functions state machines that are basically Lambda function chains with a human approval step. Can we just convert those to durable functions and simplify the infrastructure?"

The answer was yes, for two of those three workflows, and no for the third. Understanding why required a proper side-by-side comparison, which I ended up doing with working code for both approaches. This post documents that comparison.

What Lambda Durable Functions Are (and Are Not)

AWS Lambda durable functions let you write multi-step, stateful, long-running logic directly inside a Lambda function handler. The platform handles checkpointing, failure recovery, and cost-efficient suspension automatically. There is no separate orchestration service to define, no state machine JSON to write, and no visual workflow designer to maintain.

The core mechanic is the checkpoint and replay model. When your function completes a durable step (a network call, a call to another function), the Lambda runtime records the result in a checkpoint log. When the function reaches a wait (a timer, or a callback for human input), the execution suspends. When it resumes (after the callback, the timer, or an external event), Lambda re-invokes the function from the beginning but fast-forwards through already-completed steps using the stored checkpoint values, skipping the actual work. Your code sees a linear execution; the runtime handles the discontinuity underneath.
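To make the replay idea concrete, here is a toy simulation in plain Python. It is only a sketch of the concept: `ToyContext`, `Suspend`, and `wait_for` are hypothetical names invented for this illustration, not part of the AWS SDK, and the real runtime persists its checkpoint log in the service rather than in a list.

```python
class Suspend(Exception):
    """Raised by the toy runtime to end an invocation while waiting."""


class ToyContext:
    """Hypothetical stand-in for a durable context; not the AWS SDK."""

    def __init__(self, log, signals):
        self.log = log          # checkpoint log shared across invocations
        self.signals = signals  # external results delivered so far
        self.cursor = 0         # index of the next operation in this run

    def step(self, fn):
        index, self.cursor = self.cursor, self.cursor + 1
        if index < len(self.log):
            return self.log[index]   # replay: reuse the stored result
        result = fn()                # first time: do the real work
        self.log.append(result)      # checkpoint before moving on
        return result

    def wait_for(self, name):
        index, self.cursor = self.cursor, self.cursor + 1
        if index < len(self.log):
            return self.log[index]
        if name not in self.signals:
            raise Suspend(name)      # nothing to do until a signal arrives
        self.log.append(self.signals[name])
        return self.log[index]


calls = []  # records which steps really executed


def handler(ctx):
    validated = ctx.step(lambda: calls.append("validate") or "validated")
    decision = ctx.wait_for("approval")
    charged = ctx.step(lambda: calls.append("charge") or "charged")
    return validated, decision, charged


log, signals = [], {}
try:
    handler(ToyContext(log, signals))        # first invocation suspends at the wait
except Suspend:
    pass
signals["approval"] = "approved"             # the external callback arrives
result = handler(ToyContext(log, signals))   # second invocation replays, then continues
print(result, calls)
```

The handler body runs twice, but `calls` shows that the validate step did real work only once: the second invocation read its result from the log.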

The function can remain suspended for up to one year. No compute charges accumulate during a wait. Checkpoint storage incurs a cost; refer to the Lambda pricing page for current durable functions storage rates.

The Java SDK reached general availability in April 2026. Python and TypeScript/JavaScript have been available since the initial launch at re:Invent 2025. The feature recently expanded to 16 additional AWS Regions, bringing total coverage to over 30 regions. For the broader design rationale, AWS's launch post on building multi-step applications and AI workflows with Lambda durable functions is the best starting point.

AWS Step Functions is the existing managed workflow orchestration service. You define your workflow as an Amazon States Language (ASL) JSON document, where states represent tasks (Lambda invocations, SDK integrations, waits, choices), and transitions represent control flow. The workflow executes independently of any Lambda function. Visual debugging, execution history, and input/output inspection are first-class features.

Neither replaces the other completely. They solve the same problem class from different angles.

The Test Workflow: Order Processing with Human Approval

For the comparison, I used an order processing workflow that appears in a slightly different form across many of the DoiT customers I work with. The flow is:

  1. Validate the incoming order payload
  2. Check inventory availability via an external API
  3. If the order value exceeds $5,000, pause and wait for a human approval signal
  4. On approval, charge the payment method
  5. On rejection, issue a cancellation notification
  6. On the successful (charged) path, send an order confirmation

This workflow has the characteristics that stress-test both approaches: a long wait (approval can take hours or days), a conditional branch based on runtime data, and two distinct terminal paths.

Implementation with Lambda Durable Functions (Python)

The AWS Lambda Durable Execution SDK uses decorators and primitives to express the orchestration directly in your function code. The following example uses the documented programming model from the official AWS documentation.

from aws_durable_execution_sdk_python import (
    DurableContext, StepContext, durable_execution, durable_step
)
from aws_durable_execution_sdk_python.config import CallbackConfig, Duration


@durable_step
def validate_order(ctx: StepContext, order: dict) -> dict:
    # validation logic
    return {"id": order["id"], "sku": order["sku"],
            "quantity": order["quantity"], "total": order["total"]}


@durable_step
def check_inventory(ctx: StepContext, sku: str, quantity: int) -> dict:
    # call inventory API
    return {"available": True}


@durable_step
def charge_payment(ctx: StepContext, order_id: str, total: float) -> dict:
    # call payment provider
    return {"transaction_id": "txn_abc123"}


@durable_step
def send_confirmation(ctx: StepContext, order_id: str, txn_id: str) -> None:
    # send order confirmation email/notification
    pass


@durable_step
def send_cancellation(ctx: StepContext, order_id: str) -> None:
    # send cancellation notification
    pass


@durable_step
def notify_approver(ctx: StepContext, callback_id: str, order_id: str, total: float) -> None:
    # deliver callback_id to the approver (e.g., via SNS or a notification system)
    pass


@durable_execution
def handle_order(event: dict, context: DurableContext) -> dict:
    order = event

    # Step 1: validate
    validated = context.step(validate_order(order))

    # Step 2: inventory check
    inventory = context.step(check_inventory(validated["sku"], validated["quantity"]))
    if not inventory["available"]:
        return {"status": "rejected", "reason": "out_of_stock"}

    # Step 3: conditional human approval for high-value orders
    if validated["total"] > 5000:
        callback = context.create_callback(
            name="ApprovalDecision",
            config=CallbackConfig(timeout=Duration.from_seconds(259200))  # 3 days
        )
        # Notify inside a step so the send is checkpointed and not repeated on replay
        context.step(notify_approver(callback.callback_id, validated["id"], validated["total"]))
        # Execution suspends here - no compute charges during the wait
        approval = callback.result()
        if not approval.get("approved"):
            context.step(send_cancellation(validated["id"]))
            return {"status": "cancelled", "reason": "approval_rejected"}

    # Step 4: charge payment
    payment = context.step(charge_payment(validated["id"], validated["total"]))

    # Step 5: confirmation
    context.step(send_confirmation(validated["id"], payment["transaction_id"]))
    return {"status": "confirmed", "transaction_id": payment["transaction_id"]}

The entire workflow is sequential code. Unit testing is straightforward: mock the individual step functions, invoke the handler with a test event, and assert the return value. No ASL to parse, no state machine emulator needed.
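As a rough illustration of that testing style, the sketch below exercises the approval branch with a hand-rolled fake context and `unittest.mock`. Everything here is an assumption made for the example: `FakeContext`, `FakeCallback`, and `process_order` are hypothetical, `process_order` is a simplified restatement of the handler's control flow with the steps injected, and the SDK's own testing utilities (if you use them) may look different.

```python
from unittest.mock import Mock


class FakeCallback:
    def __init__(self, outcome):
        self.callback_id = "cb-test"
        self._outcome = outcome

    def result(self):
        return self._outcome


class FakeContext:
    """Runs every step inline and resolves callbacks immediately."""

    def __init__(self, approval=None):
        self._approval = approval

    def step(self, fn):
        return fn()

    def create_callback(self, name):
        return FakeCallback(self._approval)


def process_order(order, context, steps):
    # Simplified copy of the handler's control flow, with steps injected.
    validated = context.step(lambda: steps.validate(order))
    stock = context.step(lambda: steps.check_inventory(validated["sku"]))
    if not stock["available"]:
        return {"status": "rejected", "reason": "out_of_stock"}
    if validated["total"] > 5000:
        callback = context.create_callback(name="ApprovalDecision")
        context.step(lambda: steps.notify(callback.callback_id))
        if not callback.result().get("approved"):
            context.step(lambda: steps.cancel(validated["id"]))
            return {"status": "cancelled", "reason": "approval_rejected"}
    payment = context.step(lambda: steps.charge(validated["id"]))
    context.step(lambda: steps.confirm(validated["id"]))
    return {"status": "confirmed", "transaction_id": payment["transaction_id"]}


def make_steps():
    # Mocked step implementations with canned results.
    steps = Mock()
    steps.validate.side_effect = lambda order: order
    steps.check_inventory.return_value = {"available": True}
    steps.charge.return_value = {"transaction_id": "txn_test"}
    return steps


order = {"id": "o-1", "sku": "sku-1", "quantity": 1, "total": 9000}

rejected_steps = make_steps()
rejected = process_order(order, FakeContext({"approved": False}), rejected_steps)

approved_steps = make_steps()
approved = process_order(order, FakeContext({"approved": True}), approved_steps)
```

The assertions that matter are about which branch ran: a rejected order should never reach `charge`, and an approved one should never reach `cancel`.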

The external approval signal is sent by calling the SendDurableExecutionCallbackSuccess Lambda API with the callback ID. For example, via the AWS CLI:

aws lambda send-durable-execution-callback-success \
  --callback-id <callback-id> \
  --cli-binary-format raw-in-base64-out \
  --result '{"approved": true}'

Implementation with Step Functions (ASL)

The equivalent Step Functions definition requires a state machine with a Task state for each step, a Choice state for the value threshold, a Task state that uses the .waitForTaskToken integration pattern to pause for human approval, and two terminal paths.

{
  "Comment": "Order processing with human approval",
  "StartAt": "ValidateOrder",
  "States": {
    "ValidateOrder": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:validate-order",
      "ResultPath": "$.validation",
      "Next": "CheckInventory"
    },
    "CheckInventory": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:check-inventory",
      "ResultPath": "$.inventory",
      "Next": "CheckOrderValue"
    },
    "CheckOrderValue": {
      "Type": "Choice",
      "Choices": [
        {
          "Variable": "$.total",
          "NumericGreaterThan": 5000,
          "Next": "NotifyApprover"
        }
      ],
      "Default": "ChargePayment"
    },
    "NotifyApprover": {
      "Type": "Task",
      "Resource": "arn:aws:states:::lambda:invoke.waitForTaskToken",
      "Parameters": {
        "FunctionName": "notify-approver",
        "Payload": {
          "order_id.$": "$.id",
          "task_token.$": "$$.Task.Token",
          "total.$": "$.total"
        }
      },
      "TimeoutSeconds": 259200,
      "ResultPath": "$.approval",
      "Next": "ApprovalDecision"
    },
    "ApprovalDecision": {
      "Type": "Choice",
      "Choices": [
        {
          "Variable": "$.approval.approved",
          "BooleanEquals": false,
          "Next": "SendCancellation"
        }
      ],
      "Default": "ChargePayment"
    },
    "SendCancellation": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:send-cancellation",
      "End": true
    },
    "ChargePayment": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:charge-payment",
      "ResultPath": "$.payment",
      "Next": "SendConfirmation"
    },
    "SendConfirmation": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:send-confirmation",
      "End": true
    }
  }
}

That is roughly 70 lines of JSON for a workflow whose orchestration logic fits in about 30 lines of Python. Note the ResultPath on each non-terminal Task state: without it, a task's output replaces the execution state and later states lose access to fields like $.total and $.id. The JSON is correct, but it requires a different mental model: you are configuring a graph of states, not writing a program. When something goes wrong, you diagnose it by reading the execution history in the Step Functions console, not by reading a stack trace.
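For symmetry with the durable functions callback shown earlier, the paused NotifyApprover state is resumed by calling the Step Functions SendTaskSuccess API with the task token. The sketch below assumes a small helper, `complete_approval`, which is a hypothetical name for this post; the client is injected so the example runs without AWS credentials, and `RecordingClient` is only a test double.

```python
import json


def complete_approval(sfn_client, task_token, approved):
    """Resume the paused NotifyApprover state with the approver's decision."""
    sfn_client.send_task_success(
        taskToken=task_token,
        # Must be a JSON string; ResultPath places it under $.approval.
        output=json.dumps({"approved": approved}),
    )


class RecordingClient:
    """Test double that captures calls instead of reaching AWS."""

    def __init__(self):
        self.calls = []

    def send_task_success(self, **kwargs):
        self.calls.append(kwargs)


client = RecordingClient()
complete_approval(client, "example-token", approved=False)
print(client.calls)
```

In a deployed approver function you would pass `boto3.client("stepfunctions")` and the token that notify-approver received in its payload.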

Cost Comparison

For a workflow that pauses for hours waiting for human approval, the cost difference favors durable functions, though it stays modest in absolute terms at moderate volume. Let me show the math for both sides at 10,000 orders per month (assuming most take the high-value approval path).

Step Functions Cost (10,000 orders/month)

Step Functions Standard Workflows charge $0.000025 per state transition. Our workflow has between 5 and 7 state transitions per execution depending on the path: 5 for low-value orders (ValidateOrder, CheckInventory, CheckOrderValue, ChargePayment, SendConfirmation), 7 for the high-value approval path, and 6 for rejections. The calculations below use 7 transitions for every execution as a conservative upper bound, which slightly overstates the Step Functions cost.
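Those per-path counts can be checked mechanically. The following sketch walks a simplified copy of the state machine and counts the states entered on each path, following the counting convention used above. `GRAPH` and `states_entered` are illustrative names, and the lambdas stand in for the Choice rules rather than parsing real ASL.

```python
# Simplified view of the state machine: each state maps to a function
# that picks the next state (None marks a terminal state).
GRAPH = {
    "ValidateOrder": lambda o: "CheckInventory",
    "CheckInventory": lambda o: "CheckOrderValue",
    "CheckOrderValue": lambda o: "NotifyApprover" if o["total"] > 5000 else "ChargePayment",
    "NotifyApprover": lambda o: "ApprovalDecision",
    "ApprovalDecision": lambda o: "SendCancellation" if o["approved"] is False else "ChargePayment",
    "SendCancellation": lambda o: None,
    "ChargePayment": lambda o: "SendConfirmation",
    "SendConfirmation": lambda o: None,
}


def states_entered(order, start="ValidateOrder"):
    """Return the ordered list of states an execution passes through."""
    path, state = [], start
    while state is not None:
        path.append(state)
        state = GRAPH[state](order)
    return path


low = states_entered({"total": 1200, "approved": None})
approved = states_entered({"total": 9000, "approved": True})
rejected = states_entered({"total": 9000, "approved": False})
print(len(low), len(approved), len(rejected))  # 5 7 6
```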

| Component | Calculation | Cost |
| --- | --- | --- |
| State transitions | 10,000 x 7 transitions = 70,000; minus 4,000 free tier = 66,000; 66,000 x $0.000025 | $1.65 |
| Lambda invocations | 10,000 x 5 task invocations (128 MB, ~200ms avg) | ~$0.01 compute + requests |
| Total | | ~$1.66/month |

Note: The Lambda cost for Step Functions task invocations is minimal because each task function does one thing and returns quickly. Waiting on a task token does not accumulate additional charges beyond the transition into that state.

Durable Functions Cost (10,000 orders/month)

Durable functions charge across three dimensions: durable operations ($8.00/million), data written ($0.25/GB), and data retention ($0.15/GB-month). You also pay standard Lambda compute and request charges, including for the resume invocation after the callback.

| Component | Calculation | Cost |
| --- | --- | --- |
| Durable operations | 10,000 x 5 ops (1 start + 3 steps + 1 callback) = 50,000 x $0.000008 | $0.40 |
| Data written | 10,000 x 5 ops x ~2KB avg payload = ~100MB x $0.25/GB | $0.02 |
| Data retention | ~100MB retained avg 14 days = ~0.05 GB-month x $0.15 | < $0.01 |
| Lambda compute | 10,000 x 2 invocations (initial + resume after callback), 128MB, ~500ms avg including replay | ~$0.02 |
| Lambda requests | 20,000 requests | < $0.01 |
| Total | | ~$0.45/month |

The resume invocation replays from the beginning of the handler, fast-forwarding through completed checkpoints. This adds execution duration (you pay for the milliseconds of replay), but for a workflow with a few small steps the replay overhead is negligible.

At Scale (100,000 orders/month)

| | Step Functions | Durable Functions |
| --- | --- | --- |
| Orchestration charges | 696,000 transitions x $0.000025 = $17.40 | 500,000 ops x $0.000008 = $4.00 |
| Data written | - | ~1GB x $0.25 = $0.25 |
| Data retention | - | ~0.5 GB-month x $0.15 = $0.08 |
| Lambda compute | ~$0.10 (5 short tasks per order) | ~$0.17 (2 invocations, replay overhead) |
| Total | ~$17.50 | ~$4.50 |

The cost advantage for durable functions is roughly 4x at this scale, driven primarily by the lower orchestration charge: $0.000008 per durable operation against $0.000025 per state transition, and 5 billed operations per order against 7 transitions.
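If you want to rerun the orchestration math with your own volumes, a small calculator along these lines reproduces the table figures. It is a sketch under the same assumptions as the tables (7 transitions, 5 durable operations, ~2KB payloads, 14 days of retention, decimal gigabytes) and it deliberately leaves out Lambda compute and request charges, which the tables estimate separately. The function and constant names are made up for this example.

```python
SFN_PER_TRANSITION = 0.000025        # Standard Workflows rate used in this post
SFN_FREE_TRANSITIONS = 4_000         # monthly free tier used in the tables
DURABLE_PER_OP = 8.00 / 1_000_000    # $8.00 per million durable operations
WRITE_PER_GB = 0.25
RETENTION_PER_GB_MONTH = 0.15


def step_functions_orchestration(orders, transitions=7):
    """State transition charges after the free tier."""
    billable = max(orders * transitions - SFN_FREE_TRANSITIONS, 0)
    return billable * SFN_PER_TRANSITION


def durable_orchestration(orders, ops=5, payload_kb=2, retained_days=14):
    """Durable operations plus data written and data retention."""
    operations = orders * ops
    written_gb = operations * payload_kb / 1_000_000   # decimal GB, as in the tables
    retained_gb_months = written_gb * retained_days / 30
    return (operations * DURABLE_PER_OP
            + written_gb * WRITE_PER_GB
            + retained_gb_months * RETENTION_PER_GB_MONTH)


for orders in (10_000, 100_000):
    sfn = step_functions_orchestration(orders)
    durable = durable_orchestration(orders)
    print(f"{orders:>7} orders: Step Functions ${sfn:.2f}, durable ${durable:.2f}")
```

Changing `transitions`, `ops`, or `payload_kb` shows quickly how sensitive the ratio is to the shape of your own workflow.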

NOTE: This comparison uses Step Functions Standard Workflows. Express Workflows are priced by number of requests and duration (not state transitions) and are limited to executions of five minutes or less, so they do not apply here because of the multi-day approval wait. Durable functions have no equivalent "express" mode; their per-operation charge ($0.000008) is about a third of the Standard per-transition rate ($0.000025).

Developer Experience: Where Each Approach Wins

Durable functions win on:

  • Code-first development: you write and test in your language, using familiar tools
  • Unit testing: mock the step functions, assert the return value, done
  • Refactoring: change workflow logic by editing code, not JSON
  • Debuggability during development: breakpoints, local execution, familiar stack traces

Step Functions win on:

  • Visual observability in production: the execution console shows exactly which state failed, with the input and output data at each transition
  • Multi-service orchestration: Step Functions has native SDK integrations for over 220 AWS services without a Lambda intermediary
  • Cross-team visibility: non-engineers can read and understand a workflow diagram more easily than code
  • Established compliance audit trails: execution history is retained and queryable natively

The customer I mentioned at the start of this post had three workflows. Two were Lambda-to-Lambda chains with human approval waits: clear candidates for durable functions, simpler code, lower cost, easier to maintain. The third coordinated Lambda, Amazon ECS tasks, a manual Amazon SNS notification step, and an Amazon DynamoDB stream trigger: a multi-service workflow that would be awkward to implement in durable functions and would lose the visual observability the team relied on during incident response. That one stays in Step Functions.

Migration Checklist

AWS publishes its own guidance on choosing between durable functions and Step Functions, and it aligns with what I found in practice. If you are evaluating existing Step Functions state machines for migration to Lambda durable functions, run through this list:

  • Does the workflow orchestrate only Lambda functions? If it invokes non-Lambda AWS services directly (DynamoDB, SQS, SNS, ECS), keep Step Functions.
  • Does your operations team rely on the Step Functions console for incident debugging? If yes, weigh the observability loss carefully.
  • Does the workflow include waits measured in hours or days? Durable functions have a meaningful cost advantage here.
  • Is the workflow logic expressed as simple sequential steps with conditional branches? Durable functions handle this cleanly. Parallel fan-out with join is supported, but more complex in code.
  • Does your team already have Lambda deployment tooling (SAM, CDK, Serverless Framework)? Durable functions fit directly into that workflow. Step Functions requires a separate state machine resource.

Conclusion

Lambda durable functions reaching general availability for Java closes the last major runtime gap and makes them a viable default for Lambda-centric workflows. The code-first model, suspension with no compute charges, and familiar testing experience are genuine advantages over Step Functions for simple sequential orchestration.

They do not replace Step Functions for workflows that span multiple AWS services, need visual debugging in production, or require the compliance audit trail that execution history provides. Both tools will continue to coexist, and the right answer is to use each where it fits.

Key Takeaways:

  • Lambda durable functions handle checkpointing, suspension, and recovery transparently inside your existing Lambda code, with zero compute cost during waits
  • Step Functions remain the better choice for multi-service workflows, visual observability, and cross-team legibility
  • The cost advantage for durable functions comes from a lower orchestration rate ($8 per million durable operations against $25 per million state transitions) and fewer billed operations per execution. Durable functions also charge for data written and data retention, and you pay Lambda compute for replay invocations. The net savings are roughly 4x at 100K orders/month for this workflow pattern
  • The migration decision is per-workflow: Lambda-only chains with waits are strong candidates; multi-service orchestrations are not
