Frontier Models
Closed-model performance watch
Arena + Artificial Analysis + vendor docs
Anthropic
Claude Opus 4.6 Thinking
Strongest live Arena profile at present, ranking #1 overall and on hard prompts, coding and longer-query tasks.
Arena overall: #1
Hard prompts: #1
Coding: #1
Longer query: #1
Google
Gemini 3.1 Pro Preview
Tied for the best current Artificial Analysis intelligence score (57) and holds a top-three Arena text position, making it a strong general frontier benchmark.
AA intelligence: 57
Arena overall: #3
Coding: #3
Creative: #2
Longer query: #3
OpenAI
GPT-5.4 (xhigh)
OpenAI’s flagship reasoning tier remains tied for the top Artificial Analysis intelligence score and still lands in Arena’s top group for text, coding and document work.
AA intelligence: 57
Arena overall: #6
Expert: #3
Math: #1
Coding: #4