Matcher

The matcher takes two datasets and finds records that correspond to each other. It outputs three arrays: matched pairs, unmatched records from the left side, and unmatched records from the right side.

mermaid

flowchart LR
    L["Left Dataset<br/>(e.g. invoices)"] --> M{"Matcher"}
    R["Right Dataset<br/>(e.g. payments)"] --> M
    M --> Matched["Matched Pairs"]
    M --> UL["Unmatched Left"]
    M --> UR["Unmatched Right"]

Basic Usage

json

{
  "type": "matcher",
  "properties": {
    "left": "@input.invoices",
    "right": "@input.payments",
    "matchOn": ["invoice_id"],
    "outputMatched": "matched",
    "outputUnmatchedLeft": "unmatched_invoices",
    "outputUnmatchedRight": "unmatched_payments"
  }
}

This matches invoices to payments by exact invoice_id. Records with matching IDs land in @matched. Invoices with no payment go to @unmatched_invoices. Payments with no invoice go to @unmatched_payments.
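The three-way split can be pictured with a short Python sketch. This is not the matcher's implementation: the function name `match_exact` and the first-come pairing of records that share a key are assumptions made for illustration only.

```python
from collections import defaultdict

def match_exact(left, right, match_on):
    """Split two record lists into matched pairs, unmatched left, unmatched right."""
    # Index right-side records by their key tuple (hypothetical strategy).
    index = defaultdict(list)
    for rec in right:
        index[tuple(rec.get(k) for k in match_on)].append(rec)

    matched, unmatched_left = [], []
    for rec in left:
        candidates = index[tuple(rec.get(k) for k in match_on)]
        if candidates:
            # Assumption: each right record is paired at most once, first come first served.
            matched.append({"a": rec, "b": candidates.pop(0)})
        else:
            unmatched_left.append(rec)

    # Whatever is still in the index never found a partner.
    unmatched_right = [rec for recs in index.values() for rec in recs]
    return matched, unmatched_left, unmatched_right

invoices = [{"invoice_id": "INV-001"}, {"invoice_id": "INV-003"}]
payments = [{"invoice_id": "INV-001"}, {"invoice_id": "INV-777"}]
matched, unmatched_invoices, unmatched_payments = match_exact(
    invoices, payments, ["invoice_id"]
)
```

Here INV-001 ends up in `matched`, INV-003 in `unmatched_invoices`, and INV-777 in `unmatched_payments`. Passing several field names in `match_on` gives the composite-key behavior.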

Properties Reference

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| `left` | array / @path | Yes | First dataset |
| `right` | array / @path | Yes | Second dataset |
| `matchOn` | string[] | Yes | Fields that must match exactly |
| `tolerance` | number | No | Numeric tolerance as a decimal (0.02 = 2%). Applied to numeric fields not in `matchOn` |
| `dateWindowDays` | number | No | Date tolerance in days (±N). Applied to date fields |
| `fuzzyThreshold` | number | No | Text similarity threshold 0–100. Applied to the field specified by `descriptionKey` |
| `descriptionKey` | string | No | Field name for fuzzy text matching |
| `rules` | array | No | Custom matching rules evaluated via conditionEvaluator |
| `outputMatched` | string | No | Context key for matched pairs (default: `"matched"`) |
| `outputUnmatchedLeft` | string | No | Context key for unmatched left records (default: `"unmatchedLeft"`) |
| `outputUnmatchedRight` | string | No | Context key for unmatched right records (default: `"unmatchedRight"`) |

Matching Criteria

Exact Key Matching (`matchOn`)

Fields listed in matchOn must match exactly. This is the primary matching criterion: records are only compared if their matchOn fields align.

json

{
  "matchOn": ["invoice_id"]
}

Multiple keys form a composite key, and every field in it must match:

json

{
  "matchOn": ["vendor_id", "invoice_number"]
}

Numeric Tolerance (`tolerance`)

For numeric fields (amounts, quantities), allow a relative deviation. A tolerance of 0.02 means values that differ by up to 2% are still considered a match.

json

{
  "matchOn": ["invoice_id"],
  "tolerance": 0.02
}

With this configuration, an invoice for $1,000 would match a payment of $980–$1,020.
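The arithmetic behind that range can be sketched in Python. The helper name `within_tolerance` is hypothetical, and measuring the deviation relative to the left-side value is an assumption; it is consistent with the $1,000 example above, but the matcher's exact formula is not documented here.

```python
def within_tolerance(left_value, right_value, tolerance):
    """Return True when right_value deviates from left_value by at most `tolerance`."""
    if left_value == 0:
        # Avoid a meaningless relative comparison against zero.
        return right_value == 0
    # Assumption: the allowed deviation is a fraction of the left value.
    return abs(left_value - right_value) <= tolerance * abs(left_value)

print(within_tolerance(1000, 980, 0.02))   # lower edge of the $980-$1,020 range
print(within_tolerance(1000, 1021, 0.02))  # $1 outside the range
```

With a tolerance of 0.02 the allowed gap on a $1,000 invoice is $20, so $980 and $1,020 pass while $1,021 does not.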

Date Window (`dateWindowDays`)

Allow date fields to differ by up to N days:

json

{
  "matchOn": ["invoice_id"],
  "dateWindowDays": 3
}

An invoice dated January 10 would match a payment dated January 7–13.
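A minimal Python sketch of the window check follows. It assumes dates arrive as ISO `YYYY-MM-DD` strings and that the window is inclusive on both ends, as the January 7–13 range suggests; the helper name `within_date_window` is hypothetical.

```python
from datetime import date

def within_date_window(left_date, right_date, window_days):
    """True when two ISO dates (YYYY-MM-DD) are at most window_days apart."""
    delta = date.fromisoformat(left_date) - date.fromisoformat(right_date)
    # abs() makes the window symmetric: the right date may be earlier or later.
    return abs(delta.days) <= window_days

print(within_date_window("2025-01-10", "2025-01-07", 3))
print(within_date_window("2025-01-10", "2025-01-14", 3))
```

January 7 sits exactly three days before January 10 and passes; January 14 is four days after and fails.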

Fuzzy Text Matching (`fuzzyThreshold` + `descriptionKey`)

Compare text fields using the fuzzball string-similarity library. The threshold ranges from 0 to 100, where 100 is an exact match:

json

{
  "matchOn": ["vendor_id"],
  "fuzzyThreshold": 85,
  "descriptionKey": "description"
}

This matches records where vendor_id is identical and the description fields are at least 85% similar. Useful for matching line-item descriptions that may be worded differently across systems.
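To show how a 0–100 score is compared against `fuzzyThreshold`, the sketch below substitutes Python's standard-library `difflib.SequenceMatcher` for fuzzball. This is a stand-in only: fuzzball offers several scorers, this page does not say which one the matcher uses, and real scores will differ. The helper names are hypothetical.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Score two strings from 0 to 100 (stand-in scorer, not fuzzball)."""
    return 100 * SequenceMatcher(None, a.lower(), b.lower()).ratio()

def passes_fuzzy(left, right, description_key, threshold):
    """True when the two records' description fields score at or above the threshold."""
    return similarity(left[description_key], right[description_key]) >= threshold

left = {"vendor_id": "V-1", "description": "Monthly service fee"}
right = {"vendor_id": "V-1", "description": "Monthly service"}
print(passes_fuzzy(left, right, "description", 85))
```

Raising the threshold makes the check stricter; at 100 only identical strings (ignoring case in this sketch) pass.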

Custom Rules (`rules`)

Define additional matching rules evaluated by the condition engine:

json

{
  "matchOn": ["invoice_id"],
  "rules": [
    {
      "condition": {
        "lessOrEqual": [
          { "abs": { "subtract": ["@left.amount", "@right.amount"] } },
          50
        ]
      }
    }
  ]
}

Custom rules use the same condition operators as workflow conditions, with @left and @right referencing the current pair being compared.
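The rule above reads as "the absolute difference between the two amounts is at most 50". A toy Python evaluator makes the nesting explicit. It is a hypothetical illustration that handles only the three operators used in this example and says nothing about how conditionEvaluator is actually built.

```python
# Binary operators supported by this toy evaluator (hypothetical subset).
BINARY = {
    "subtract": lambda a, b: a - b,
    "lessOrEqual": lambda a, b: a <= b,
}

def evaluate(node, left, right):
    """Evaluate a condition node against one candidate pair of records."""
    if isinstance(node, str) and node.startswith("@"):
        # "@left.amount" -> look up "amount" on the left record.
        side, _, field = node[1:].partition(".")
        return {"left": left, "right": right}[side][field]
    if isinstance(node, dict):
        op, arg = next(iter(node.items()))
        if op == "abs":
            return abs(evaluate(arg, left, right))
        if op in BINARY:
            a, b = (evaluate(x, left, right) for x in arg)
            return BINARY[op](a, b)
        raise ValueError(f"unsupported operator: {op}")
    return node  # plain literal such as 50

rule = {
    "condition": {
        "lessOrEqual": [
            {"abs": {"subtract": ["@left.amount", "@right.amount"]}},
            50,
        ]
    }
}
print(evaluate(rule["condition"], {"amount": 2500}, {"amount": 2475}))
```

A pair with amounts 2500 and 2475 differs by 25 and satisfies the rule; a pair differing by 100 would not.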

Output Format

Matched Pairs

Each matched record contains both the left and right record:

json

[
  {
    "a": { "invoice_id": "INV-001", "amount": 1000, "vendor": "Acme" },
    "b": { "invoice_id": "INV-001", "amount": 1000, "vendor": "Acme Corp" },
    "match_score": 0.95,
    "amount_difference": 0
  }
]

The a field holds the left record and b holds the right record. match_score reflects overall match quality. amount_difference shows the numeric deviation when tolerance matching is used.

Unmatched Records

Unmatched arrays contain the original records with no modifications:

json

[
  { "invoice_id": "INV-099", "amount": 5000, "vendor": "NewVendor" }
]

Worked Example

Input:

json

{
  "invoices": [
    { "invoice_id": "INV-001", "amount": 1000.00, "date": "2025-01-10", "description": "Monthly service fee" },
    { "invoice_id": "INV-002", "amount": 2500.00, "date": "2025-01-15", "description": "Equipment rental" },
    { "invoice_id": "INV-003", "amount": 750.00, "date": "2025-01-20", "description": "Consulting hours" }
  ],
  "payments": [
    { "invoice_id": "INV-001", "amount": 1000.00, "date": "2025-01-12", "description": "Monthly service" },
    { "invoice_id": "INV-002", "amount": 2475.00, "date": "2025-01-15", "description": "Equip rental Jan" }
  ]
}

Matcher configuration:

json

{
  "type": "matcher",
  "properties": {
    "left": "@input.invoices",
    "right": "@input.payments",
    "matchOn": ["invoice_id"],
    "tolerance": 0.02,
    "dateWindowDays": 3,
    "fuzzyThreshold": 80,
    "descriptionKey": "description",
    "outputMatched": "reconciled",
    "outputUnmatchedLeft": "exceptions"
  }
}

Results:

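@reconciled: INV-001 (amounts identical; dates 2 days apart, inside the 3-day window), INV-002 (amount off by $25, or 1%, within the 2% tolerance; descriptions 80%+ similar)
@exceptions: INV-003 (no matching payment found)
@unmatchedRight: empty (both payments were matched; the default key applies because outputUnmatchedRight is not set)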

Redis Optimization

For large datasets (10,000+ records per side), the matcher automatically uses Redis for indexing when available. Pre-indexing records by their matchOn keys lets the matcher look up candidates directly instead of comparing every left record against every right record.

No configuration change is needed — the matcher detects Redis availability and dataset size automatically.
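The difference between pairwise comparison and key-based indexing can be seen by counting operations. The sketch below uses a plain Python dictionary as an in-memory stand-in for the Redis index; the function names and the way work is counted are assumptions for illustration, not a description of the matcher's internals.

```python
from collections import defaultdict

def pairwise_comparisons(left, right):
    """Number of checks when every left record is compared with every right record."""
    return len(left) * len(right)

def indexed_lookups(left, right, match_on):
    """Build one index over the right side, then do one lookup per left record."""
    index = defaultdict(list)  # in-memory stand-in for a Redis-backed index
    for rec in right:
        index[tuple(rec[k] for k in match_on)].append(rec)
    hits = sum(1 for rec in left if tuple(rec[k] for k in match_on) in index)
    operations = len(right) + len(left)  # one insert per right, one lookup per left
    return operations, hits

left = [{"invoice_id": f"INV-{i}"} for i in range(10_000)]
right = [{"invoice_id": f"INV-{i}"} for i in range(5_000, 15_000)]

print(pairwise_comparisons(left, right))
print(indexed_lookups(left, right, ["invoice_id"]))
```

For two sides of 10,000 records, the pairwise approach implies 100,000,000 checks, while the indexed approach needs 20,000 index operations to find the same 5,000 overlapping keys.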

Matcher as the foundation. Most Hyphen workflows start with a matcher step. The matched records flow into deterministic processing, while exceptions route to AI agents or human review. This is the graduated exception handling pattern: deterministic rules for clear cases, AI for ambiguous cases, humans for edge cases.

→ Next: [Loop](/primitives/loop)
