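Modality: modality_arm, video_generation · full deep dive covering every ranked model, test result, and artifact.

Models are ranked by confidence-adjusted score. Non-curated scores backed by at most one sample (n≤1) are marked provisional and discounted ("floored"); in the table below, each provisional adjusted score equals 0.15 × the raw score, to three decimals. Curated empirical scores are trusted as-is, so their adjusted and raw values match.
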
| # | Model | Provider | Adj. Score | Raw Score | Evidence |
|---|---|---|---|---|---|
| 1 | runway-gen3 | runway | 0.850 | 0.850 | curated |
| 2 | veo-2 | fal | 0.850 | 0.850 | curated |
| 3 | kling-v1 | fal | 0.800 | 0.800 | curated |
| 4 | pika-v2 | pika | 0.750 | 0.750 | curated |
| 5 | ltx-video | fal | 0.650 | 0.650 | curated |
| 6 | fal-ai/kling-v3-text-to-video | fal | 0.150 | 1.000 | provisional (n≤1) |
| 7 | runway/gen4.5 | runway | 0.150 | 1.000 | provisional (n≤1) |
| 8 | fal-ai/veo3-fast-text-to-video | fal | 0.064 | 0.429 | provisional (n≤1) |
| 9 | fal-ai/hailuo-02-text-to-video | fal | 0.064 | 0.429 | provisional (n≤1) |
| 10 | runway/gen-3-alpha-turbo | runway | 0.064 | 0.429 | provisional (n≤1) |
| 11 | stability/stable-video-diffusion | stability | 0.064 | 0.429 | provisional (n≤1) |
| 12 | fal-ai/kling-video/v2.5-turbo/pro/text-to-video | fal | 0.000 | 0.000 | provisional (n≤1) |
| 13 | internal/cesium-cartography | internal | 0.000 | 0.000 | provisional (n≤1) |
| 14 | synthesia/avatar-video | synthesia | 0.000 | 0.000 | provisional (n≤1) |
| 15 | d-id/avatar-video | d-id | 0.000 | 0.000 | provisional (n≤1) |
| 16 | hedra/avatar-video | hedra | 0.000 | 0.000 | provisional (n≤1) |
| 17 | elevenlabs/video-generation | elevenlabs | 0.000 | 0.000 | provisional (n≤1) |
| 18 | runway/gen-3-alpha | runway | 0.000 | 0.000 | provisional (n≤1) |
| 19 | pika/v2-2 | pika | 0.000 | 0.000 | provisional (n≤1) |
| 20 | luma/dream-machine-v1-5 | luma | 0.000 | 0.000 | provisional (n≤1) |
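
The adjustment behind the table above can be sketched as follows. This is a minimal illustration, not the report's actual scoring code: the 0.15 discount factor is an assumption inferred from the table rows (1.000 → 0.150, 0.429 → 0.064), and the names `PROVISIONAL_FACTOR`, `Entry`, `adjusted_score`, and `rank` are hypothetical.

```python
from dataclasses import dataclass

# Assumption: inferred from the table, where provisional rows show
# adjusted = 0.15 * raw. The real pipeline may use a different rule.
PROVISIONAL_FACTOR = 0.15


@dataclass
class Entry:
    model: str
    raw: float
    curated: bool  # True for "curated" evidence, False for "provisional (n<=1)"


def adjusted_score(raw: float, curated: bool,
                   factor: float = PROVISIONAL_FACTOR) -> float:
    """Return the raw score for curated entries, a discounted score otherwise."""
    return raw if curated else raw * factor


def rank(entries: list[Entry]) -> list[Entry]:
    """Sort by adjusted score, highest first; ties keep their input order."""
    return sorted(entries,
                  key=lambda e: adjusted_score(e.raw, e.curated),
                  reverse=True)


if __name__ == "__main__":
    sample = [
        Entry("fal-ai/kling-v3-text-to-video", 1.000, curated=False),
        Entry("ltx-video", 0.650, curated=True),
        Entry("fal-ai/veo3-fast-text-to-video", 0.429, curated=False),
    ]
    for e in rank(sample):
        print(f"{e.model}: {adjusted_score(e.raw, e.curated):.3f}")
```

Under this assumed rule, a provisional model with a perfect raw score of 1.000 still ranks below the lowest curated entry (ltx-video at 0.650), which matches the ordering in the table.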

No benchmark outputs recorded for this niche yet.