Daily Bench - Model Performance Dashboard

Track and visualize model performance over time, monitor for regressions during peak load periods, and detect quality changes across LLM APIs.


📊 All Models Performance

Compare performance across all available providers and models

Provider:

Metric:

Dataset/Scenario:

📈 Performance Timeline

⚡ Performance Distribution Over Time

Time Period:

📊 Model Consistency Analysis

Analysis Type:

View:

🔍 Individual Model Analysis

Deep dive into a specific model's performance

Select Model:

Dataset/Scenario:

Metric:

📈 Performance Over Time

⚡ Performance Distribution

Time Period:

📊 Summary Statistics

📋 Recent Performance Comparison

📄 Raw Data

Show last:

Why This Matters

LLM API quality can change without notice, affecting your applications in production. Community reports show that these changes happen regularly, so tracking performance helps you detect regressions early.

Community reports of LLM quality changes: @secemp9, @_xjdr, @PrimeIntellect, @0xblacklight