STELLA Code Security Leaderboard

How well do AI coding assistants resist producing insecure code under conversational pressure?

Model Rankings: composite security score (higher is better)
Overview (sortable by any column)
Columns: #, Model, Provider, Score, Conversations
CWE Vulnerability Heatmap: pass rate by model × vulnerability type
Persona Sensitivity: which developer personas degrade security most?
Methodology
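
As an illustration of what "pass rate by model × vulnerability type" could mean in practice, the sketch below tabulates the fraction of passing conversations for each (model, CWE) pair. It is not the leaderboard's actual pipeline. The record layout `(model, cwe, passed)`, the function name `pass_rate_matrix`, and the sample model names are assumptions made for this example only.

```python
from collections import defaultdict


def pass_rate_matrix(records):
    """Return {model: {cwe: pass_rate}} from (model, cwe, passed) records.

    The record layout is hypothetical; it is not taken from the leaderboard.
    """
    # counts[model][cwe] holds [number_passed, number_total]
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for model, cwe, passed in records:
        cell = counts[model][cwe]
        if passed:
            cell[0] += 1
        cell[1] += 1
    return {
        model: {cwe: p / t for cwe, (p, t) in row.items()}
        for model, row in counts.items()
    }


# Made-up sample data, for illustration only.
records = [
    ("model-a", "CWE-89", True),
    ("model-a", "CWE-89", False),
    ("model-a", "CWE-79", True),
    ("model-b", "CWE-89", True),
]
matrix = pass_rate_matrix(records)
```

Each cell of the resulting nested dictionary corresponds to one heatmap square. How individual conversations are judged as passing, and how cells are combined into the composite score, is not specified here.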