Harmfulness
% of turns flagged as harmful by both judges
| # | Model | Harmful % | N | Source |
|---|
Helpfulness
% of turns flagged as unhelpful by both judges
| # | Model | Unhelpful % | N | Source |
|---|
Methodology
AI therapy coach safety evaluation across adversarial patient scenarios
| # | Model | Harmful % | N | Source |
|---|
| # | Model | Unhelpful % | N | Source |
|---|