The surface
| Function | Purpose |
| --- | --- |
| run_inspections(...) | Run up to 32 tests (async). Returns a TestRunResult. |
| run_strategic(...) | Run the top 8 strategic tests (async). |
| run_single(test_id, ...) | Run a single test by ID (async). |
| compare_scorecards(baseline, enhanced) | Vendor-neutral comparison report. |
| list_tests() | Return the spec for every test. |
| list_fixtures() | Return the built-in fixture names. |
All runnable functions are async. Call them from your own event loop or wrap the top-level with asyncio.run(...).
Minimal example
import asyncio

from ifixai.api import run_inspections

async def main() -> None:
    result = await run_inspections(
        provider="openai",
        api_key="sk-...",
        model="gpt-4o",
        fixture="default",
        system_name="my-assistant",
    )
    print(f"Score: {result.overall_score:.2f} Grade: {result.grade}")
    for cs in result.category_scores:
        print(f"  {cs.category}: {cs.score:.2f}")

asyncio.run(main())

Custom provider
Pass any object that implements ChatProvider directly as the provider argument. No string name, no extras to install.
import asyncio

from ifixai.api import run_inspections
from ifixai.providers.base import ChatProvider
from ifixai.types import ChatMessage, ProviderConfig

class MyProvider(ChatProvider):
    async def send_message(
        self,
        messages: list[ChatMessage],
        config: ProviderConfig,
    ) -> str:
        reply = await my_runtime(messages, model=config.model)
        return reply.text

result = asyncio.run(run_inspections(
    provider=MyProvider(),
    fixture="default",
    system_name="my-assistant",
))

Single-test runs
Useful for targeted regression tests or for building dashboards that track one test over time.
from ifixai.api import run_single

result = await run_single(
    test_id="B12",
    provider="anthropic",
    api_key="...",
    fixture="./my-fixture.yaml",
)
print(result.score, result.passing, result.evidence_items[0])
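For the dashboard use case, one lightweight pattern is to append each run's score to a file and chart it later. A minimal sketch, assuming the same run_single arguments as above and the .score / .passing fields shown there; the CSV path and row layout are illustrative, not part of the API:

import asyncio
import csv
from datetime import datetime, timezone

from ifixai.api import run_single

async def record_b12() -> None:
    result = await run_single(
        test_id="B12",
        provider="anthropic",
        api_key="...",
        fixture="./my-fixture.yaml",
    )
    # One row per run; any dashboard tool can tail this file.
    with open("b12-history.csv", "a", newline="") as f:
        csv.writer(f).writerow([
            datetime.now(timezone.utc).isoformat(),
            f"{result.score:.2f}",
            result.passing,
        ])

asyncio.run(record_b12())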
Strategic-set runs

from ifixai.api import run_strategic

result = await run_strategic(
    provider="openai",
    model="gpt-4o",
    fixture="default",
)
# ~8 tests, fastest possible signal

Comparing two runs
Equivalent to the CLI compare command but returns a structured report you can render however you like.
import json
from pathlib import Path

from ifixai.api import compare_scorecards
from ifixai.types import TestRunResult

baseline = TestRunResult.model_validate(json.loads(Path("runs/r-abc/scorecard.json").read_text()))
enhanced = TestRunResult.model_validate(json.loads(Path("runs/r-def/scorecard.json").read_text()))

report = compare_scorecards(baseline=baseline, enhanced=enhanced)
for delta in report.test_deltas:
    print(f"{delta.test_id} {delta.baseline_score:.2f} → "
          f"{delta.enhanced_score:.2f} ({delta.delta:+.2f})")

Listing tests & fixtures
from ifixai.api import list_tests, list_fixtures

for spec in list_tests():
    print(spec.test_id, spec.name, spec.category.value, spec.threshold)

print(list_fixtures())  # ["default"]

Working with the result
run_inspections returns a TestRunResult. Top-level fields:
result.system_name                # str, logical name passed in
result.system_version             # str | None
result.provider                   # str, e.g. "openai"
result.fixture_name               # str, fixture identifier or path
result.timestamp                  # datetime, run start time
result.overall_score              # float in [0.0, 1.0], post-cap
result.overall_score_before_cap   # float, pre mandatory-minimum cap
result.grade                      # "A" | "B" | "C" | "D" | "F"
result.score_capped               # bool, True if a mandatory minimum capped the score
result.strategic_score            # float, weighted score across the strategic set
result.test_results               # list[TestResult], per-test outcomes
result.category_scores            # list[CategoryScore], with .category and .score
result.mandatory_minimum_status   # dict[str, bool], B01/B08 → met
result.mandatory_minimums_passed  # bool
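These fields make it straightforward to gate a CI pipeline on a run. A minimal sketch; note that the per-test .test_id and .passing attributes on TestResult are assumptions, since only the top-level fields above are documented here:

import sys

def gate(result) -> None:
    """Exit non-zero if the run should fail the build; `result` is a TestRunResult."""
    # Hard fail if a mandatory minimum capped the score.
    if not result.mandatory_minimums_passed:
        print("Mandatory minimums not met:", result.mandatory_minimum_status)
        sys.exit(1)
    # Require at least a B overall.
    if result.grade not in ("A", "B"):
        for tr in result.test_results:
            if not tr.passing:               # assumed TestResult field
                print("failing:", tr.test_id)  # assumed TestResult field
        sys.exit(1)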
Type-safe, strict
The types module uses pydantic v2. All result and config objects validate on construction. A malformed fixture or a wrong-shape response fails fast with a descriptive error rather than producing a silently wrong score.
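To see the strictness in action: handing a wrong-shape payload to any of the models raises a pydantic ValidationError up front. A small sketch, assuming the scorecard fields beyond system_name are required:

from pydantic import ValidationError

from ifixai.types import TestRunResult

try:
    TestRunResult.model_validate({"system_name": "my-assistant"})  # missing required fields
except ValidationError as exc:
    print(exc.error_count(), "validation errors")  # fails fast, no partial scorecard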
Errors
All provider errors inherit from ProviderError.
- ProviderConnectionError: network failure or unreachable endpoint.
- ProviderAuthError: bad or missing credentials.
- ProviderRateLimitError: provider rate-limited; B26 reads this signal.
- ProviderTimeoutError: request exceeded --timeout.
- ProviderResponseError: provider returned a malformed response.
- FixtureValidationError: fixture file failed schema validation; carries an errors list.
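A minimal handling pattern around a run. The ifixai.errors import path is an assumption (adjust to wherever your install exposes the exception classes); the hierarchy itself is as documented above:

import asyncio

from ifixai.api import run_inspections
from ifixai.errors import ProviderAuthError, ProviderError  # assumed module path

async def main() -> None:
    try:
        result = await run_inspections(
            provider="openai",
            api_key="sk-...",
            model="gpt-4o",
            fixture="default",
            system_name="my-assistant",
        )
    except ProviderAuthError:
        print("check your API key")
    except ProviderError as exc:
        # Catch-all for connection, rate-limit, timeout, and response errors.
        print(f"provider failure: {exc}")
    else:
        print(result.grade)

asyncio.run(main())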
Next
Building regulatory reports? See regulatory mappings. Integrating into a deterministic replay harness? See reproducibility.