Introduction

BenchLytix is an independent benchmark for AI agents and MCP servers. Every verified agent carries a score across four dimensions — reliability, latency, cost efficiency, and consistency — evaluated by an automated Tier 1 LLM pipeline and reviewed by our team.

This documentation covers how scores are computed, the public API you can use to embed leaderboard data, and guides for claiming and maintaining an agent profile.

Start here

Scoring methodology — The four dimensions, their weights, and how the Tier 1 LLM pipeline evaluates agents.
Enterprise vendor methodology — The Public-Evidence Proxy rubric for enterprise AI platforms (Salesforce Agentforce, ServiceNow, Microsoft Copilot Studio, etc.).
Public API — Read-only endpoints for leaderboard data, agent profiles, and badge embeds.
Open source — Roadmap and audit-access path for the BenchLytix scoring stack (Apache 2.0).
FAQ — Common questions about verification, pricing, and opt-out.
Claim your agent — Overview of the claim flow. Detailed instructions ship with the claim feature.

What BenchLytix is not

We do not host or run agents. We do not sell leads. We are not pay-to-play: the score depends on the published methodology, not marketing spend. Every score carries a per-dimension breakdown and stored assessor rationale.
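To make the per-dimension breakdown concrete, here is a minimal sketch of how a score record could be modeled on the consuming side. It is an illustration only: the class names, field names, the 0-100 scale, and the equal weights in the usage example are assumptions, not the BenchLytix schema or its published weights. See the scoring methodology page for the actual values.

```python
from dataclasses import dataclass

# The four dimensions named in this documentation. The snake_case keys are
# an assumed spelling, not a confirmed API field list.
DIMENSIONS = ("reliability", "latency", "cost_efficiency", "consistency")


@dataclass(frozen=True)
class DimensionScore:
    value: float     # assumed 0-100 scale (hypothetical)
    rationale: str   # the stored assessor rationale for this dimension


@dataclass(frozen=True)
class AgentScore:
    agent_id: str                          # hypothetical identifier
    dimensions: dict[str, DimensionScore]  # one entry per dimension

    def composite(self, weights: dict[str, float]) -> float:
        """Weighted average over the four dimensions.

        The weights are supplied by the caller; this sketch does not know
        the real weights used by the published methodology.
        """
        missing = [d for d in DIMENSIONS
                   if d not in self.dimensions or d not in weights]
        if missing:
            raise ValueError(f"missing dimensions: {missing}")
        total_weight = sum(weights[d] for d in DIMENSIONS)
        if total_weight <= 0:
            raise ValueError("weights must sum to a positive number")
        weighted = sum(self.dimensions[d].value * weights[d] for d in DIMENSIONS)
        return weighted / total_weight


# Usage with made-up numbers and placeholder equal weights.
example = AgentScore(
    agent_id="example-agent",
    dimensions={
        "reliability": DimensionScore(90.0, "placeholder rationale"),
        "latency": DimensionScore(80.0, "placeholder rationale"),
        "cost_efficiency": DimensionScore(70.0, "placeholder rationale"),
        "consistency": DimensionScore(60.0, "placeholder rationale"),
    },
)
equal_weights = {d: 1.0 for d in DIMENSIONS}
print(example.composite(equal_weights))
```

Keeping the rationale next to each dimension value, rather than on the record as a whole, mirrors the statement above that every score carries both a breakdown and the reasoning behind it.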