Mender
Self-healing for production agents.

incident 98f497db1a90 dismissed

Cluster

pattern: silent default to USD for ambiguous source currency
traces: 7
project: finpay-support
created: May 7 21:15:25
updated: May 7 21:21:18

Baseline (live)

4/10
40% pass · 144s

Staged (patched)

6/10
60% pass · 159s
+20% lift

Hypothesis

The addition of a mandatory defaulting rule in version v2 overrides the previous requirement to clarify ambiguous source currencies, forcing the model to use USD.

Suspected prompt clause

Always assume USD if not specified.
recommended: Remove the 'Always assume USD if not specified' clause and restore the clarification logic from version v1.
evidence: The defaulting clause was added in v2 (released 2026-05-05T12:47:00+00:00), which the version notes identify as the cause of the regression.
confidence: 1.00

Mender self-eval — how well did this cycle perform

overall: 0.53
hypothesis correctness: 0.50
fix effectiveness: 0.33
eval set quality: 0.86
token efficiency: 1.00

lift=+20%; hyp=0.50; evalq=0.86; tok=1.00

Cycle parameters — self-tuned at start of cycle

eval_target_count: 8
min_hypothesis_confidence: 0.6
min_lift: 0.25
cluster_max_failures: 20

Past-cycle introspection

n_cycles_seen: 0
trend: insufficient-data

Proposed patch — v2 → v3

Removed the mandatory USD default rule.

--- finpay-support/v2
+++ finpay-support/v3
@@ -3,5 +3,3 @@
 Capabilities:
 - Answer general questions about the app.
-- Convert amounts between currencies using `get_exchange_rate`.
-
-Always assume USD if not specified.
+- Convert amounts between currencies using `get_exchange_rate`.

Eval cases

case | baseline (live) | staged | baseline judge
ambiguous-source-eur (10357ms) | fail | pass | The agent assumed USD as the source currency instead of asking for clarification as required by the rubric.
ambiguous-source-mxn (10436ms) | fail | fail | The agent assumed USD and performed the conversion instead of asking for the source currency as required.
ambiguous-source-jpy (8322ms) | fail | fail | The agent assumed USD as the source currency instead of asking for clarification as required by the rubric.
ambiguous-source-gbp-context (3768ms) | fail | fail | The agent assumed the source currency was USD instead of asking for clarification as required.
ambiguous-source-sgd (15733ms) | fail | pass | The agent assumed the source currency was USD instead of asking the user for clarification as required by the rubric.
ambiguous-source-cad-rate (17896ms) | fail | fail | The agent assumed USD as the source currency instead of asking for clarification as required by the rubric.
explicit-source-gbp-control (31756ms) | pass | pass | The agent correctly converted 100 GBP to USD without asking for clarification, following the rubric exactly.
explicit-source-eur-control (12507ms) | pass | pass | The agent correctly converted the specified amount from EUR to JPY without asking for a source currency.
explicit-source-usd-control (27665ms) | pass | pass | The agent correctly converted 200 USD to AUD without asking for clarification, adhering to the rubric requirements.
adversarial-metric-conversion (6006ms) | pass | pass | The agent correctly converted the distance without asking for a source currency or mentioning USD as required by the rubric.
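
The headline pass rates can be recomputed from the per-case outcomes listed above. A minimal Python sketch, assuming each case is held as a (case, baseline, staged) tuple; the tuple layout and the `pass_rate` helper are illustrative and not Mender's actual data model:

```python
# Outcomes copied from the eval cases: (case, baseline, staged).
# The tuple layout is a hypothetical representation, not Mender's schema.
CASES = [
    ("ambiguous-source-eur", "fail", "pass"),
    ("ambiguous-source-mxn", "fail", "fail"),
    ("ambiguous-source-jpy", "fail", "fail"),
    ("ambiguous-source-gbp-context", "fail", "fail"),
    ("ambiguous-source-sgd", "fail", "pass"),
    ("ambiguous-source-cad-rate", "fail", "fail"),
    ("explicit-source-gbp-control", "pass", "pass"),
    ("explicit-source-eur-control", "pass", "pass"),
    ("explicit-source-usd-control", "pass", "pass"),
    ("adversarial-metric-conversion", "pass", "pass"),
]


def pass_rate(outcomes):
    """Fraction of outcomes equal to "pass"."""
    outcomes = list(outcomes)
    return sum(o == "pass" for o in outcomes) / len(outcomes)


baseline_rate = pass_rate(b for _, b, _ in CASES)  # 4 of 10 pass
staged_rate = pass_rate(s for _, _, s in CASES)    # 6 of 10 pass

# Cases that failed on the live prompt and pass on the staged one.
fixed = [name for name, b, s in CASES if b == "fail" and s == "pass"]
```

Two of the six baseline failures flip to pass under the staged prompt. That ratio (2/6) is consistent with the reported fix effectiveness of 0.33, although this page does not state how that score is derived.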

State history

at from to note
May 7 21:15:25 detected detected 7 affected traces
May 7 21:15:28 detected hypothesized Always assume USD if not specified.
May 7 21:18:33 hypothesized evaluating baseline 4/10 pass
May 7 21:21:18 evaluating dismissed insufficient lift +20% (need +25%)
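
The dismissal in the final transition follows from comparing the measured lift against `min_lift`. A minimal sketch, assuming lift is the difference between staged and baseline pass rates (in percentage points) and that the gate is inclusive (`>=`); both assumptions are inferred from the numbers on this page, and the `lift_gate` function is hypothetical:

```python
def lift_gate(baseline_passed, staged_passed, total, min_lift):
    """Return (lift, accepted); lift is the pass-rate difference.

    Assumption: a patch is accepted when lift >= min_lift.
    """
    lift = (staged_passed - baseline_passed) / total
    return lift, lift >= min_lift


# Values from this incident: 4/10 baseline, 6/10 staged, min_lift 0.25.
lift, accepted = lift_gate(4, 6, 10, min_lift=0.25)
print(f"lift={lift:+.0%} accepted={accepted}")  # lift=+20% accepted=False
```

Under these assumptions the staged prompt would have needed at least 7/10 passes (lift 0.30) to clear the 0.25 threshold on a 10-case eval set.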