The surface
| Function | Purpose |
| --- | --- |
| run_inspections(...) | Run up to 32 tests (async). Returns a TestRunResult. |
| run_strategic(...) | Run the top 8 strategic tests (async). |
| run_single(test_id, ...) | Run a single test by ID (async). |
| compare_scorecards(baseline, enhanced) | Vendor-neutral comparison report. |
| list_tests() | Return the spec for every test. |
| list_fixtures() | Return the built-in fixture names. |
All runnable functions are async. Call them from your own event loop or wrap the top-level with asyncio.run(...).
Minimal example
import asyncio

from ifixai.api import run_inspections

async def main() -> None:
    result = await run_inspections(
        provider="openai",
        api_key="sk-...",
        model="gpt-4o",
        fixture="default",
        system_name="my-assistant",
    )
    print(f"Score: {result.overall_score:.2f} Grade: {result.grade}")
    for cs in result.category_scores:
        print(f"  {cs.category}: {cs.score:.2f}")

asyncio.run(main())

Custom provider
Pass any object that implements ChatProvider directly as the provider argument. No string name, no extras to install.
import asyncio

from ifixai.api import run_inspections
from ifixai.providers.base import ChatProvider
from ifixai.types import ChatMessage, ProviderConfig

class MyProvider(ChatProvider):
    async def send_message(
        self,
        messages: list[ChatMessage],
        config: ProviderConfig,
    ) -> str:
        reply = await my_runtime(messages, model=config.model)
        return reply.text

result = asyncio.run(run_inspections(
    provider=MyProvider(),
    fixture="default",
    system_name="my-assistant",
))

Single-test runs
Useful for targeted regression tests or for building dashboards that track one test over time.
from ifixai.api import run_single

result = await run_single(
    test_id="B12",
    provider="anthropic",
    api_key="...",
    fixture="./my-fixture.yaml",
)
print(result.score, result.passing, result.evidence_items[0])
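For the dashboard use case, one lightweight pattern is to append each run's score to a file and chart it later. A minimal sketch, assuming the same run_single arguments as above and the .score / .passing fields shown there; the CSV path and row layout are illustrative, not part of the API:

import asyncio
import csv
from datetime import datetime, timezone

from ifixai.api import run_single

async def record_b12() -> None:
    result = await run_single(
        test_id="B12",
        provider="anthropic",
        api_key="...",
        fixture="./my-fixture.yaml",
    )
    # One row per run; any dashboard tool can tail this file.
    with open("b12-history.csv", "a", newline="") as f:
        csv.writer(f).writerow([
            datetime.now(timezone.utc).isoformat(),
            f"{result.score:.2f}",
            result.passing,
        ])

asyncio.run(record_b12())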
Strategic-set runs

from ifixai.api import run_strategic

result = await run_strategic(
    provider="openai",
    model="gpt-4o",
    fixture="default",
)
# ~8 tests, fastest possible signal

Comparing two runs
Equivalent to the CLI compare command but returns a structured report you can render however you like.
import json
from pathlib import Path

from ifixai.api import compare_scorecards
from ifixai.types import TestRunResult

baseline = TestRunResult.model_validate(json.loads(Path("runs/r-abc/scorecard.json").read_text()))
enhanced = TestRunResult.model_validate(json.loads(Path("runs/r-def/scorecard.json").read_text()))

report = compare_scorecards(baseline=baseline, enhanced=enhanced)
for delta in report.test_deltas:
    print(f"{delta.test_id} {delta.baseline_score:.2f} → "
          f"{delta.enhanced_score:.2f} ({delta.delta:+.2f})")

Listing tests & fixtures
from ifixai.api import list_tests, list_fixtures

for spec in list_tests():
    print(spec.test_id, spec.name, spec.category.value, spec.threshold)

print(list_fixtures())  # ["default"]

Working with the result
run_inspections returns a TestRunResult. Top-level fields:
result.system_name                # str, logical name passed in
result.system_version             # str | None
result.provider                   # str, e.g. "openai"
result.fixture_name               # str, fixture identifier or path
result.timestamp                  # datetime, run start time
result.overall_score              # float in [0.0, 1.0], post-cap
result.overall_score_before_cap   # float, pre mandatory-minimum cap
result.grade                      # "A" | "B" | "C" | "D" | "F"
result.score_capped               # bool, True if a mandatory minimum capped the score
result.strategic_score            # float, weighted score across the strategic set
result.test_results               # list[TestResult], per-test outcomes
result.category_scores            # list[CategoryScore], with .category and .score
result.mandatory_minimum_status   # dict[str, bool], B01/B08 → met
result.mandatory_minimums_passed  # bool
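These fields make it straightforward to gate a CI pipeline on a run. A minimal sketch; note that the per-test .test_id and .passing attributes on TestResult are assumptions, since only the top-level fields above are documented here:

import sys

def gate(result) -> None:
    """Exit non-zero if the run should fail the build; `result` is a TestRunResult."""
    # Hard fail if a mandatory minimum capped the score.
    if not result.mandatory_minimums_passed:
        print("Mandatory minimums not met:", result.mandatory_minimum_status)
        sys.exit(1)
    # Require at least a B overall.
    if result.grade not in ("A", "B"):
        for tr in result.test_results:
            if not tr.passing:               # assumed TestResult field
                print("failing:", tr.test_id)  # assumed TestResult field
        sys.exit(1)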
Type-safe, strict
The types module uses pydantic v2. All result and config objects validate on construction. A malformed fixture or a wrong-shape response fails fast with a descriptive error rather than producing a silently wrong score.
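To see the strictness in action: handing a wrong-shape payload to any of the models raises a pydantic ValidationError up front. A small sketch, assuming the scorecard fields beyond system_name are required:

from pydantic import ValidationError

from ifixai.types import TestRunResult

try:
    TestRunResult.model_validate({"system_name": "my-assistant"})  # missing required fields
except ValidationError as exc:
    print(exc.error_count(), "validation errors")  # fails fast, no partial scorecard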
Errors
All provider errors inherit from ProviderError.
- ProviderConnectionError: network failure or unreachable endpoint.
- ProviderAuthError: bad or missing credentials.
- ProviderRateLimitError: provider rate-limited; B26 reads this signal.
- ProviderTimeoutError: request exceeded --timeout.
- ProviderResponseError: provider returned a malformed response.
- FixtureValidationError: fixture file failed schema validation; carries an errors list.
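A minimal handling pattern around a run. The ifixai.errors import path is an assumption (adjust to wherever your install exposes the exception classes); the hierarchy itself is as documented above:

import asyncio

from ifixai.api import run_inspections
from ifixai.errors import ProviderAuthError, ProviderError  # assumed module path

async def main() -> None:
    try:
        result = await run_inspections(
            provider="openai",
            api_key="sk-...",
            model="gpt-4o",
            fixture="default",
            system_name="my-assistant",
        )
    except ProviderAuthError:
        print("check your API key")
    except ProviderError as exc:
        # Catch-all for connection, rate-limit, timeout, and response errors.
        print(f"provider failure: {exc}")
    else:
        print(result.grade)

asyncio.run(main())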
Next
Building regulatory reports? See regulatory mappings. Integrating into a deterministic replay harness? See reproducibility.