Correction Engine

The Correction Engine is CertusOrdo's self-healing component. When the Decision Engine chooses ROLLBACK_AND_RETRY, the Correction Engine determines what to change before the retry attempt, so that many failures can be recovered without human intervention.

Overview

Decision: ROLLBACK_AND_RETRY → [CORRECTION ENGINE] → Correction Payload → Retry
                                       ├── Anomaly Analysis
                                       ├── Strategy Selection
                                       ├── Payload Generation
                                       └── Feedback Loop

Key Insight: Most AI agent failures aren't random — they follow patterns. A timeout often needs a longer timeout. A scope violation needs constrained permissions. The Correction Engine encodes this knowledge.


Why Corrections Matter

Traditional retry logic: "Try again and hope for the best."

CertusOrdo retry logic: "Try again with these specific adjustments."

Example:

Anomaly: Transaction value exceeds limit ($15,000 > $10,000)
Traditional: Retry same transaction → Same failure
CertusOrdo: Split into two $7,500 transactions → Success
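
The split in this example can be sketched in a few lines. This is a minimal illustration rather than CertusOrdo's actual implementation: the function name `split_amount_cents` is hypothetical, amounts are held as integer cents to avoid floating-point rounding, and an equal split is assumed.

```python
def split_amount_cents(total_cents: int, limit_cents: int) -> list[int]:
    """Split a total into the fewest near-equal parts that each fit the limit."""
    if total_cents <= 0 or limit_cents <= 0:
        raise ValueError("total_cents and limit_cents must be positive")
    # Ceiling division: the smallest number of parts that can stay within the limit.
    parts = -(-total_cents // limit_cents)
    base, remainder = divmod(total_cents, parts)
    # Spread any leftover cents over the first `remainder` parts.
    return [base + (1 if i < remainder else 0) for i in range(parts)]


# $15,000 against a $10,000 limit, expressed in cents.
print(split_amount_cents(1_500_000, 1_000_000))  # [750000, 750000]
```

The parts always sum to the original total, so the full transfer still completes.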


Correction Strategies

The Correction Engine supports 20 distinct strategies:

Parameter Adjustments

| Strategy | When Used | Example |
|---|---|---|
| ADJUST_PARAMETER | Config values causing issues | Increase timeout from 30s to 60s |
| REDUCE_SCOPE | Too many operations at once | Process 100 records instead of 1000 |
| DECREASE_BATCH_SIZE | Batch processing overload | Reduce from 50 to 10 items |
| INCREASE_TIMEOUT | Operations timing out | Extend deadline |
| RATE_LIMIT_SELF | Agent moving too fast | Add 100ms delay between calls |

Content Modifications

| Strategy | When Used | Example |
|---|---|---|
| MODIFY_FIELD | Wrong value in specific field | Change currency from USD to EUR |
| ADD_CONTEXT | Missing necessary information | Include customer ID in request |
| REMOVE_AMBIGUITY | Unclear instructions | Specify exact output format |
| ENFORCE_FORMAT | Output schema issues | Add JSON schema constraint |
| CONSTRAIN_OUTPUT | Output too verbose/complex | Limit response to 500 tokens |

Behavioral Changes

| Strategy | When Used | Example |
|---|---|---|
| ADD_INSTRUCTION | Agent needs more guidance | "Verify before submitting" |
| SIMPLIFY_TASK | Task too complex | Break into 3 sequential steps |
| DECOMPOSE_TASK | Multi-part task failing | Execute parts independently |
| ADD_VALIDATION | Missing pre/post checks | Verify balance before transfer |
| REQUEST_CONFIRMATION | High-risk action flagged | Require explicit approval step |

Recovery Actions

| Strategy | When Used | Example |
|---|---|---|
| RETRY_AS_IS | Transient error | Network glitch, just retry |
| USE_FALLBACK | Primary approach failed | Switch to backup API endpoint |
| SWITCH_MODEL | Current model underperforming | Use GPT-4 instead of GPT-3.5 |
| CACHE_RESULT | Repeated expensive operations | Store and reuse intermediate results |
| ESCALATE_TO_HUMAN | Cannot auto-correct safely | Route to human operator |

API Reference

Generate Correction

POST /v1/safety/correct/generate
Content-Type: application/json
X-API-Key: your_api_key

Request Body:

{
  "decision_id": "uuid",
  "transaction_id": "uuid",
  "anomalies": [
    {
      "type": "value_bounds",
      "severity": "medium",
      "code": "VAL002",
      "message": "Transaction value $15,000 exceeds limit $10,000",
      "details": {
        "actual_value": 15000.00,
        "limit": 10000.00,
        "field": "value_usd"
      }
    }
  ],
  "retry_count": 0,
  "original_payload": {
    "action": "wire_transfer",
    "amount": 15000.00,
    "recipient": "account_xyz"
  }
}

Response:

{
  "correction_id": "uuid",
  "transaction_id": "uuid",
  "strategy": "DECOMPOSE_TASK",
  "confidence": 0.89,
  "corrections": [
    {
      "action": "SPLIT_TRANSACTION",
      "description": "Split single transaction into two within limits",
      "original_field": "amount",
      "original_value": 15000.00,
      "corrected_payloads": [
        {
          "action": "wire_transfer",
          "amount": 7500.00,
          "recipient": "account_xyz",
          "sequence": 1
        },
        {
          "action": "wire_transfer",
          "amount": 7500.00,
          "recipient": "account_xyz",
          "sequence": 2
        }
      ]
    }
  ],
  "reasoning": "Value exceeds single-transaction limit. Decomposing into two transactions of $7,500 each keeps both within bounds while completing the full transfer.",
  "estimated_success_probability": 0.94,
  "retry_delay_ms": 1000
}
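
As a rough sketch of how a client might assemble this call with only the Python standard library: the host in `BASE_URL` and the helper name `build_generate_request` are placeholders invented for illustration, not part of the documented API. Only the path, headers, and body fields come from the reference above. The request is built but not sent.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # hypothetical host; substitute your deployment


def build_generate_request(api_key, decision_id, transaction_id,
                           anomalies, original_payload, retry_count=0):
    """Build (but do not send) a POST to /v1/safety/correct/generate."""
    body = {
        "decision_id": decision_id,
        "transaction_id": transaction_id,
        "anomalies": anomalies,
        "retry_count": retry_count,
        "original_payload": original_payload,
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/safety/correct/generate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )


request = build_generate_request(
    api_key="your_api_key",
    decision_id="uuid",
    transaction_id="uuid",
    anomalies=[{"type": "value_bounds", "severity": "medium", "code": "VAL002"}],
    original_payload={"action": "wire_transfer", "amount": 15000.00,
                      "recipient": "account_xyz"},
)
# To send it: urllib.request.urlopen(request), then json.load() the response.
```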

Preview Correction (Dry Run)

POST /v1/safety/correct/preview
Content-Type: application/json
X-API-Key: your_api_key

Accepts the same request body as /generate but returns the correction without executing it, which is useful for testing and debugging.

List Available Strategies

GET /v1/safety/correct/strategies
X-API-Key: your_api_key

Response:

{
  "strategies": [
    {
      "name": "MODIFY_FIELD",
      "description": "Change a specific field value to correct an anomaly",
      "applicable_anomaly_types": ["value_bounds", "schema", "consistency"],
      "risk_level": "low",
      "requires_original_payload": true
    },
    {
      "name": "DECOMPOSE_TASK",
      "description": "Break a complex task into smaller sequential steps",
      "applicable_anomaly_types": ["value_bounds", "rate_limit", "scope"],
      "risk_level": "medium",
      "requires_original_payload": true
    }
    // ... 18 more strategies
  ]
}
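
One way a client might use this listing is to shortlist strategies for a given anomaly type and risk ceiling. The sketch below is illustrative only: the helper `applicable_strategies` is hypothetical, and the `"high"` risk level is an assumption, since the sample response shows only `"low"` and `"medium"`.

```python
# Ordering of risk levels; "high" is assumed, it does not appear in the sample response.
RISK_ORDER = {"low": 0, "medium": 1, "high": 2}


def applicable_strategies(strategies, anomaly_type, max_risk="medium"):
    """Return names of strategies that cover anomaly_type at or below max_risk."""
    ceiling = RISK_ORDER[max_risk]
    return [
        s["name"]
        for s in strategies
        if anomaly_type in s["applicable_anomaly_types"]
        # Unrecognized risk levels are excluded rather than guessed at.
        and RISK_ORDER.get(s["risk_level"], ceiling + 1) <= ceiling
    ]


# The two entries shown in the sample response.
sample = [
    {"name": "MODIFY_FIELD", "risk_level": "low",
     "applicable_anomaly_types": ["value_bounds", "schema", "consistency"]},
    {"name": "DECOMPOSE_TASK", "risk_level": "medium",
     "applicable_anomaly_types": ["value_bounds", "rate_limit", "scope"]},
]

print(applicable_strategies(sample, "value_bounds"))         # ['MODIFY_FIELD', 'DECOMPOSE_TASK']
print(applicable_strategies(sample, "value_bounds", "low"))  # ['MODIFY_FIELD']
```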

Submit Feedback

POST /v1/safety/correct/feedback
Content-Type: application/json
X-API-Key: your_api_key

Request Body:

{
  "correction_id": "uuid",
  "outcome": "success",
  "retry_count": 1,
  "final_confidence": 0.96,
  "notes": "Split transaction strategy worked on first retry"
}

Feedback improves future correction selection.


Strategy Selection Algorithm

The Correction Engine selects strategies based on anomaly type and context:

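def select_strategy(anomalies, context):
    """Return the name of the strategy to apply for the detected anomalies.

    Assumes each anomaly exposes .type, .severity, and a numeric
    .severity_weight, and that get_template(anomaly_type, org_id) returns
    a matching organization template or None.
    """
    if not anomalies:
        # max() below would fail on an empty list; fail early with a clear message.
        raise ValueError("select_strategy requires at least one anomaly")

    # Priority 1: Direct match against an organization-defined template
    for anomaly in anomalies:
        if template := get_template(anomaly.type, context.org_id):
            return template.strategy

    # Priority 2: Severity-based defaults
    if any(a.severity == "critical" for a in anomalies):
        return "ESCALATE_TO_HUMAN"

    # Priority 3: Anomaly type mapping
    strategy_map = {
        "value_bounds": "MODIFY_FIELD",
        "rate_limit": "RATE_LIMIT_SELF",
        "timeout": "INCREASE_TIMEOUT",
        "scope": "REDUCE_SCOPE",
        "schema": "ENFORCE_FORMAT",
        "behavioral": "ADD_INSTRUCTION",
        "content_quality": "CONSTRAIN_OUTPUT",
    }

    # Map the most severe anomaly; unknown types fall back to a plain retry.
    primary_anomaly = max(anomalies, key=lambda a: a.severity_weight)
    return strategy_map.get(primary_anomaly.type, "RETRY_AS_IS")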

Correction Templates

Organizations can define custom correction templates:

correction_template = {
    "name": "payment_limit_exceeded",
    "description": "Handle transactions exceeding single-payment limits",

    # When this template applies
    "trigger": {
        "anomaly_type": "value_bounds",
        "anomaly_code": "VAL002",
        "context_match": {
            "action_type": ["wire_transfer", "ach_payment"]
        }
    },

    # What correction to apply
    "strategy": "DECOMPOSE_TASK",
    "parameters": {
        "split_method": "equal",
        "max_per_transaction": 10000.00,
        "delay_between_ms": 5000
    },

    # Metadata
    "success_rate": 0.92,
    "avg_retries": 1.1,
    "last_updated": "2026-01-15"
}
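
To illustrate how a `trigger` block like this might be evaluated, here is a minimal matching sketch. The function `template_matches` and the shape of the `context` dictionary are assumptions made for the example. The source does not specify the matching rules, so this version treats `anomaly_code` and `context_match` as optional filters.

```python
def template_matches(template: dict, anomaly: dict, context: dict) -> bool:
    """Return True if the template's trigger applies to this anomaly and context."""
    trigger = template["trigger"]
    if trigger.get("anomaly_type") != anomaly.get("type"):
        return False
    # Assumption: a trigger without anomaly_code matches any code of that type.
    code = trigger.get("anomaly_code")
    if code is not None and code != anomaly.get("code"):
        return False
    # Every context_match key must hold one of its allowed values.
    for key, allowed in trigger.get("context_match", {}).items():
        if context.get(key) not in allowed:
            return False
    return True


template = {
    "name": "payment_limit_exceeded",
    "trigger": {
        "anomaly_type": "value_bounds",
        "anomaly_code": "VAL002",
        "context_match": {"action_type": ["wire_transfer", "ach_payment"]},
    },
    "strategy": "DECOMPOSE_TASK",
}
anomaly = {"type": "value_bounds", "code": "VAL002"}

print(template_matches(template, anomaly, {"action_type": "wire_transfer"}))  # True
print(template_matches(template, anomaly, {"action_type": "card_payment"}))   # False
```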

Retry Logic

The Correction Engine manages retry attempts with exponential backoff:

Retry 1: Apply correction, wait 1 second
    ↓ (if still fails)
Retry 2: Apply enhanced correction, wait 2 seconds
    ↓ (if still fails)
Retry 3: Apply aggressive correction, wait 4 seconds
    ↓ (if still fails)
Escalate to human or terminate

Correction Escalation:

| Retry | Correction Approach |
|---|---|
| 1 | Minimal adjustment (same strategy) |
| 2 | Enhanced adjustment (stronger parameters) |
| 3 | Alternative strategy |
| 4+ | Human escalation |
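
The backoff schedule and escalation tiers described here can be expressed as two small functions. This is a sketch under stated assumptions: the names `retry_delay_ms` and `correction_approach`, the tier labels, and the `MAX_RETRIES` constant are illustrative, while the 1s/2s/4s delays and the three-retry ceiling come from the schedule in this section.

```python
MAX_RETRIES = 3  # after the third retry the schedule escalates to a human


def retry_delay_ms(retry: int, base_ms: int = 1000) -> int:
    """Exponential backoff: the delay doubles with each retry, starting at base_ms."""
    if retry < 1:
        raise ValueError("retry numbers start at 1")
    return base_ms * 2 ** (retry - 1)


def correction_approach(retry: int) -> str:
    """Map a retry number to its escalation tier (labels are illustrative)."""
    if retry < 1:
        raise ValueError("retry numbers start at 1")
    if retry > MAX_RETRIES:
        return "human_escalation"
    return {
        1: "minimal_adjustment",
        2: "enhanced_adjustment",
        3: "alternative_strategy",
    }[retry]


for n in range(1, 5):
    delay = retry_delay_ms(n) if n <= MAX_RETRIES else None
    print(n, correction_approach(n), delay)
```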

Integration with Decision Engine

The Correction Engine is invoked when the Decision Engine returns ROLLBACK_AND_RETRY:

async def handle_rollback_and_retry(decision, transaction):
    # Step 1: Rollback the transaction
    await transaction.rollback()

    # Step 2: Generate correction
    correction = await correction_engine.generate(
        decision_id=decision.id,
        transaction_id=transaction.id,
        anomalies=decision.anomalies,
        retry_count=decision.retry_count,
        original_payload=transaction.payload
    )

    # Step 3: Apply correction to payload
    corrected_payload = apply_correction(
        original=transaction.payload,
        correction=correction
    )

    # Step 4: Retry with corrected payload
    retry_result = await transaction.retry(
        payload=corrected_payload,
        delay_ms=correction.retry_delay_ms
    )

    # Step 5: Submit feedback for learning
    await correction_engine.feedback(
        correction_id=correction.id,
        outcome="success" if retry_result.success else "failure",
        final_confidence=retry_result.confidence
    )

    return retry_result
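
The `apply_correction` helper is not defined in this document. One plausible shape, assuming the correction is a dictionary laid out like the Generate Correction response, is sketched below. It returns a list of payloads because a split correction yields more than one. That list return, and the fallback to the original payload when no corrected payloads are present, are both assumptions of this sketch.

```python
import copy


def apply_correction(original: dict, correction: dict) -> list[dict]:
    """Collect the corrected payloads from a correction, ordered by sequence."""
    payloads = []
    for item in correction.get("corrections", []):
        # Deep-copy so callers can mutate the result without touching the response.
        payloads.extend(copy.deepcopy(item.get("corrected_payloads", [])))
    if not payloads:
        # Assumption: no corrected payloads means the original is retried unchanged.
        payloads = [copy.deepcopy(original)]
    return sorted(payloads, key=lambda p: p.get("sequence", 0))


original = {"action": "wire_transfer", "amount": 15000.00, "recipient": "account_xyz"}
correction = {
    "strategy": "DECOMPOSE_TASK",
    "corrections": [{
        "action": "SPLIT_TRANSACTION",
        "corrected_payloads": [
            {"action": "wire_transfer", "amount": 7500.00,
             "recipient": "account_xyz", "sequence": 2},
            {"action": "wire_transfer", "amount": 7500.00,
             "recipient": "account_xyz", "sequence": 1},
        ],
    }],
}

corrected = apply_correction(original, correction)
print([p["sequence"] for p in corrected])  # [1, 2]
```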

Success Metrics

Track correction effectiveness:

| Metric | Target | Current |
|---|---|---|
| First-retry success rate | > 70% | 74% |
| Overall correction success | > 90% | 91% |
| Average retries to success | < 2.0 | 1.4 |
| Human escalation rate | < 10% | 7% |
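
These metrics could be derived from submitted feedback records along the following lines. The exact definitions are not given in this document, so the formulas here are assumptions: rates are taken over all records, and `"escalated"` is a hypothetical outcome value alongside the documented `"success"`. The sample records are made up for illustration.

```python
from statistics import mean


def correction_metrics(records: list[dict]) -> dict:
    """Summarize feedback records; each needs 'outcome' and 'retry_count'."""
    if not records:
        raise ValueError("at least one feedback record is required")
    total = len(records)
    successes = [r for r in records if r["outcome"] == "success"]
    return {
        # Share of all corrections that succeeded on the first retry.
        "first_retry_success_rate": sum(r["retry_count"] == 1 for r in successes) / total,
        "overall_success_rate": len(successes) / total,
        # Average taken over successful corrections only.
        "avg_retries_to_success": mean(r["retry_count"] for r in successes) if successes else None,
        # "escalated" is an assumed outcome label, not confirmed by the API reference.
        "human_escalation_rate": sum(r["outcome"] == "escalated" for r in records) / total,
    }


records = [
    {"outcome": "success", "retry_count": 1},
    {"outcome": "success", "retry_count": 1},
    {"outcome": "success", "retry_count": 2},
    {"outcome": "escalated", "retry_count": 3},
]
metrics = correction_metrics(records)
print(metrics["overall_success_rate"])  # 0.75
```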

Design Principles

  1. Deterministic — Same anomaly patterns produce same corrections
  2. Conservative — Start with minimal changes, escalate if needed
  3. Traceable — Every correction is logged for learning
  4. Configurable — Templates allow org-specific corrections
  5. Safe — Never make corrections that could cause harm

Failure Modes

When corrections can't be generated safely:

| Scenario | Response |
|---|---|
| Unknown anomaly type | Return RETRY_AS_IS with low confidence |
| Critical severity | Return ESCALATE_TO_HUMAN |
| No applicable template | Use default strategy mapping |
| Original payload missing | Return error, require payload |
| Max retries exceeded | Return ESCALATE_TO_HUMAN |

Next Steps

When decisions require human notification, the Notification Engine handles multi-channel delivery with escalation chains.