Built-in providers

| Flag | Provider | Credentials |
|---|---|---|
| `--provider mock` | Mock (no cloud keys) | `--api-key not-used` (string is ignored) |
| `--provider openai` | OpenAI | `OPENAI_API_KEY` |
| `--provider openrouter` | OpenRouter | `OPENROUTER_API_KEY` |
| `--provider anthropic` | Anthropic | `ANTHROPIC_API_KEY` |
| `--provider gemini` | Google Gemini | `GEMINI_API_KEY` (or `GOOGLE_API_KEY`) |
| `--provider azure` | Azure OpenAI | `AZURE_OPENAI_API_KEY` + `--endpoint` |
| `--provider bedrock` | AWS Bedrock | `AWS_*` (standard boto3 chain) |
| `--provider huggingface` | Hugging Face | `HF_TOKEN` (`HUGGINGFACE_API_TOKEN` also accepted) |
| `--provider http` | HTTP (OpenAI-compatible) | `--endpoint` + `--api-key` |
| `--provider langchain` | LangChain | `pip install langchain` (not bundled as an extra) |

Install the right extra
Extras are optional dependencies. Pick the one matching your target provider to avoid pulling every vendor SDK.
pip install -e ".[openai]"       # OpenAI
pip install -e ".[anthropic]"    # Anthropic
pip install -e ".[gemini]"       # Google Gemini
pip install -e ".[azure]"        # Azure OpenAI
pip install -e ".[bedrock]"      # AWS Bedrock
pip install -e ".[huggingface]"  # HuggingFace
pip install -e ".[dev]"          # dev tools (pytest, ruff)
# Combine:
pip install -e ".[openai,anthropic,gemini]"
Running against a REST endpoint (no code)
Any agent or deployment reachable over HTTP with a chat-shaped JSON body can be scored without writing code.
ifixai run --provider http \
--endpoint https://your-api.example.com/v1/chat \
--api-key $YOUR_KEY
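The exact payload depends on your endpoint; as an illustration, an OpenAI-compatible chat exchange is typically shaped like the following. The field names here follow the common OpenAI convention and are an assumption about what "chat-shaped" means, not confirmed ifixai internals:

```python
import json

# Illustrative OpenAI-style request body that an HTTP provider would POST
# to --endpoint. Model name and message contents are hypothetical.
request_body = {
    "model": "my-deployed-model",
    "messages": [
        {"role": "system", "content": "You are a careful assistant."},
        {"role": "user", "content": "Summarize the incident report."},
    ],
}

# A matching OpenAI-style response carries the reply text under choices:
response_body = {
    "choices": [
        {"message": {"role": "assistant", "content": "Summary: ..."}}
    ]
}
reply_text = response_body["choices"][0]["message"]["content"]

print(json.dumps(request_body, indent=2))
print(reply_text)
```

If your gateway uses a different shape, the `http` provider won't fit and you should implement `ChatProvider` as described below.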
The ChatProvider interface
For any unsupported runtime (in-process model, custom gateway, RPC service), implement ChatProvider. The required surface is one async method.
from ifixai.providers.base import ChatProvider
from ifixai.types import ChatMessage, ProviderConfig

class MyProvider(ChatProvider):
    async def send_message(
        self,
        messages: list[ChatMessage],
        config: ProviderConfig,
    ) -> str:
        reply = await your_runtime.run(messages, model=config.model)
        return reply.text

Then pass the instance directly into the Python API:
import asyncio

from ifixai.api import run_inspections
from my_module import MyProvider

result = asyncio.run(run_inspections(
    provider=MyProvider(),
    fixture="default",
    system_name="my-assistant",
))
print(result.overall_score, result.grade)

Optional capability hooks
ChatProvider exposes twelve optional methods plus one class attribute. Tests that recognize a declared capability prefer its structured pathway; otherwise they fall back to text-based scoring. Every optional method returns None by default (implementations return a domain object instead). Implement only what your provider can supply.
| Method | Used by |
|---|---|
| `list_tools(config)` | Discovery |
| `invoke_tool(tool_id, user_role, params, config)` | Tool-execution probes |
| `authorize_tool(tool_id, user_role, config)` | B01, B08 |
| `retrieve_sources(query, config)` | B05 |
| `get_audit_trail(request_id, config)` | B03 |
| `get_routing_decision(config)` | B02 |
| `get_grounding_report(config)` | B07 |
| `get_governance_architecture(config)` | B02 |
| `apply_override(request_id, config)` | B04 |
| `get_configuration_version(config)` | B23 |
| `get_roles(config)` | B01 |
| `get_permission_matrix(config)` | B01 |
| `surfaces_rate_limit_errors` (class attribute, default `True`) | B26 |
Declaring a capability is a signed statement: "this runtime supports this test structurally." Tests that can use it will; those that can't fall back cleanly to text-based scoring.
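The prefer-structured / fall-back-to-text pattern can be sketched as follows. This is a minimal illustration, not ifixai internals; the provider classes and the scoring function are hypothetical:

```python
class StructuredProvider:
    """Hypothetical provider that declares the get_roles capability."""
    def get_roles(self, config):
        return ["admin", "analyst", "viewer"]

class TextOnlyProvider:
    """Hypothetical provider that declares nothing; hooks return None."""
    def get_roles(self, config):
        return None

def score_roles_check(provider, config, transcript):
    """Prefer the structured pathway when the capability is declared;
    otherwise fall back cleanly to text-based scoring of the transcript."""
    roles = provider.get_roles(config)
    if roles is not None:
        return 1.0 if "admin" in roles else 0.0            # structured scoring
    return 1.0 if "admin" in transcript.lower() else 0.0   # text fallback

print(score_roles_check(StructuredProvider(), None, ""))             # → 1.0
print(score_roles_check(TextOnlyProvider(), None, "admin granted"))  # → 1.0
```

Returning None from a hook is always safe: it simply means the test takes the text-based route.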
LangChain out of the box

If your agent is a LangChain Runnable or agent, pass it via the Python API: a built-in adapter wraps common LangChain shapes into ChatProvider, so you don't have to write a class.
Full-mode judges
For Full mode, judges use the same ChatProvider interface. Pass each judge with --judge-provider + --judge-api-key; arity must match (one --judge-api-key per --judge-provider).
ifixai run --mode full \
--provider openai --model gpt-4o \
--fixture ./my-fixture.yaml \
--judge-provider anthropic --judge-api-key $ANTHROPIC_KEY \
--judge-provider gemini --judge-api-key $GEMINI_KEY
See Standard vs Full for ensemble semantics (majority, tie-break, error tolerance, attribution).
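As a rough sketch of what majority voting with a tie-break looks like (illustrative only; ifixai's actual ensemble rules, including error tolerance and attribution, are defined in Standard vs Full, and the function here is hypothetical):

```python
from collections import Counter

def majority_verdict(verdicts: list[str], tie_break: str = "fail") -> str:
    """Illustrative majority vote over judge verdicts.
    On a tie, fall back to the configured tie-break verdict."""
    top = Counter(verdicts).most_common()
    if len(top) > 1 and top[0][1] == top[1][1]:
        return tie_break
    return top[0][0]

print(majority_verdict(["pass", "pass", "fail"]))  # → pass (2 of 3 judges)
print(majority_verdict(["pass", "fail"]))          # → fail (tie-break applies)
```

This is why an odd number of judges is the comfortable configuration: ties never arise.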
Troubleshooting

- ProviderNotInstalled: install the matching extra, e.g. pip install -e ".[anthropic]".
- Auth errors: check the env var name in ifixai/providers/secrets.py.
- Self-judge tests skipped in Full mode: add a second --judge-provider distinct from --provider.
- Slow runs: use --strategic (8 tests) or --test BXX (one test) while iterating.