
analysis

Modality: llm_chat · full deep dive — every ranked model, test result, and artifact.

Models: 10
Benchmark Results: 0
Media Artifacts: 0
Resolutions: 0

Ranked Models

Models are ranked by confidence-adjusted score. Non-curated scores backed by one or zero samples are floored; curated empirical scores are trusted as-is.

#   Model                       Provider       Adj. Score  Raw    Evidence
1   deepseek-chat               deepseek       0.642       0.998  n=9
2   gpt-4o                      openai         0.456       0.456  curated
3   gemini-2.5-flash            google_gemini  0.347       0.407  n=29
4   claude-sonnet-4-5-20250929  anthropic      0.332       0.332  n=11550
5   claude-sonnet-4-6           anthropic      0.292       0.292  curated
6   claude-haiku-4-5-20251001   anthropic      0.277       0.315  n=37
7   claude-opus-4-6             anthropic      0.239       0.239  curated
8   gemini-2.5-pro              google_gemini  0.017       0.017  n=8006
9   o3                          openai         0.000       0.000  provisional (n≤1)
10  o3-mini                     openai         0.000       0.000  provisional (n≤1)
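
The page does not state how the adjusted score is computed. As a rough sketch only, the sampled rows above are consistent with shrinking the raw score by n / (n + 5), to within rounding: for example 0.998 × 9 / 14 ≈ 0.642 and 0.407 × 29 / 34 ≈ 0.347. The function name `adjusted_score`, the constant `PRIOR_WEIGHT`, and the floor value of 0.0 are all assumptions inferred from the table, not a documented formula.

```python
from typing import Optional

# Assumed pseudo-count: inferred from the table rows, not documented by the source.
PRIOR_WEIGHT = 5

# Assumed floor for provisional (n <= 1) scores. The table only shows rows whose
# raw score is already 0.000, so the real floor value cannot be confirmed.
PROVISIONAL_FLOOR = 0.0


def adjusted_score(raw: float, n: Optional[int], curated: bool = False) -> float:
    """Hypothetical confidence adjustment matching the ranking rules described above.

    - curated scores are returned unchanged ("trusted as-is")
    - non-curated scores with zero or one sample are floored
    - other scores are shrunk toward zero by n / (n + PRIOR_WEIGHT)
    """
    if curated:
        return raw
    if n is None or n <= 1:
        return PROVISIONAL_FLOOR
    return raw * n / (n + PRIOR_WEIGHT)


if __name__ == "__main__":
    print(round(adjusted_score(0.998, 9), 3))       # deepseek-chat row
    print(round(adjusted_score(0.456, None, curated=True), 3))  # gpt-4o row
    print(round(adjusted_score(0.332, 11550), 3))   # claude-sonnet-4-5 row
```

Under this assumed formula, a large sample leaves the score almost untouched (n=11550 changes 0.332 by less than 0.001), while a small sample such as n=9 pulls a near-perfect raw score down by roughly a third.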

Test Results

No benchmark outputs recorded for this niche yet.