CI for Prompts

  • A repo with prompts stored as versioned files.
  • A test corpus of inputs paired with expected outputs or acceptance checks.
  • Access to a model provider, with a cost cap for CI runs.
  1. Add a prompt test runner script that reads fixtures and evaluates their acceptance checks.
  2. Mock external calls or cap how many are made, and use deterministic seeds where possible.
  3. Configure a CI job (e.g., GitHub Actions) to run on every PR and on merge.
  4. Fail the job on:
    • Schema violations
    • Output drift above approved thresholds
    • Token cost that exceeds the budget
  5. Store artifacts: diffs, samples, and run metrics.
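The steps above can be sketched as a small runner. This is a minimal illustration, not a definitive implementation: the `Fixture` fields, the JSON fixture layout, the `fake_model` stub (a deterministic stand-in for a real provider call), and the whitespace-based token count are all assumptions made for the example.

```python
import json
from dataclasses import asdict, dataclass
from typing import Callable

# Hypothetical fixture layout; a real repo would load this from versioned files.
SAMPLE_FIXTURES = """
[{"name": "greeting", "prompt": "say hello",
  "required_keys": ["answer", "confidence"], "must_contain": "hello"}]
"""

@dataclass
class Fixture:
    name: str
    prompt: str
    required_keys: list[str]  # schema check: keys the JSON output must contain
    must_contain: str         # acceptance check: text expected in "answer"

@dataclass
class Result:
    name: str
    passed: bool
    reason: str
    tokens: int

def load_fixtures(text: str) -> list[Fixture]:
    return [Fixture(**item) for item in json.loads(text)]

def fake_model(prompt: str) -> tuple[str, int]:
    """Deterministic stub for a provider call; returns (raw_output, tokens_used)."""
    output = json.dumps({"answer": f"echo: {prompt}", "confidence": 0.9})
    return output, len(prompt.split()) + 5  # crude token estimate (assumption)

def run_fixture(fx: Fixture, call_model: Callable[[str], tuple[str, int]]) -> Result:
    raw, tokens = call_model(fx.prompt)
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return Result(fx.name, False, "schema: output is not valid JSON", tokens)
    if not isinstance(data, dict):
        return Result(fx.name, False, "schema: output is not a JSON object", tokens)
    missing = [k for k in fx.required_keys if k not in data]
    if missing:
        return Result(fx.name, False, f"schema: missing keys {missing}", tokens)
    if fx.must_contain not in str(data.get("answer", "")):
        return Result(fx.name, False, "acceptance: expected text not found", tokens)
    return Result(fx.name, True, "ok", tokens)

def run_suite(fixtures: list[Fixture],
              call_model: Callable[[str], tuple[str, int]],
              token_budget: int) -> tuple[list[Result], int]:
    """Run every fixture; exit code is 1 on any failure or if over budget."""
    results = [run_fixture(fx, call_model) for fx in fixtures]
    total_tokens = sum(r.tokens for r in results)
    failed = any(not r.passed for r in results) or total_tokens > token_budget
    return results, (1 if failed else 0)

def main() -> int:
    results, exit_code = run_suite(load_fixtures(SAMPLE_FIXTURES), fake_model, 1000)
    # Printed metrics can be captured by the CI job and stored as an artifact.
    print(json.dumps([asdict(r) for r in results], indent=2))
    return exit_code
```

In a CI job the script would end with `sys.exit(main())`, so a non-zero return value fails the build. Swapping `fake_model` for a real client is the only change needed to run against a provider.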
  • Success is a green build whose metrics stay stable against the baseline.
  • Review the stored artifacts and approve any intentional changes.
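One way to make "stable against the baseline" checkable is to compare run metrics with per-metric thresholds. The sketch below is an assumption about how that might look: the metric names (`pass_rate`, `avg_tokens`) and threshold values are hypothetical. It flags any change beyond the threshold in either direction, since improvements are also intentional changes that a reviewer should approve.

```python
def compare_to_baseline(baseline: dict[str, float],
                        current: dict[str, float],
                        thresholds: dict[str, float]) -> list[str]:
    """Return human-readable violations; an empty list means metrics are stable."""
    violations = []
    for metric, allowed in thresholds.items():
        if metric not in baseline or metric not in current:
            violations.append(f"{metric}: missing from baseline or current run")
            continue
        delta = current[metric] - baseline[metric]
        # Any movement beyond the approved threshold needs a human decision.
        if abs(delta) > allowed:
            violations.append(f"{metric}: changed by {delta:+.3f}, allowed +/-{allowed}")
    return violations

# Hypothetical metric values for illustration only.
baseline = {"pass_rate": 0.95, "avg_tokens": 1200.0}
current = {"pass_rate": 0.90, "avg_tokens": 1210.0}
thresholds = {"pass_rate": 0.02, "avg_tokens": 100.0}

for line in compare_to_baseline(baseline, current, thresholds):
    print(line)
```

When a change is intentional, the reviewer approves it by updating the baseline file in the same PR, which keeps the approval in version control.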
  • Flaky tests: tighten determinism (lower the temperature, fix seeds) or use a larger corpus.
  • Cost spikes: shard the tests or move some to a nightly run.
  • Provider 429s: retry with backoff.
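For the 429 case, a retry wrapper with exponential backoff and jitter is a common pattern. The sketch below is illustrative: `RateLimitError` is a hypothetical exception standing in for whatever your provider's client raises on HTTP 429, and the default delays are arbitrary.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")

class RateLimitError(Exception):
    """Hypothetical stand-in for a provider client's HTTP 429 error."""

def call_with_backoff(fn: Callable[[], T],
                      max_retries: int = 5,
                      base_delay: float = 1.0,
                      max_delay: float = 30.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn, retrying on RateLimitError with capped exponential backoff."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries:
                raise  # out of retries: let the CI job fail visibly
            # Delay ceiling doubles each attempt, capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: a random wait avoids synchronized retries across shards.
            sleep(random.uniform(0, ceiling))
    raise RuntimeError("max_retries must be >= 0")
```

The `sleep` parameter is injectable so the wrapper can be unit-tested without real waiting. Retried calls still consume budget, so the retry count should stay within the CI cost cap.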
  • Setup time: 1–2 hours.
  • Ongoing cost: minutes per PR, in exchange for catching regressions before they become incidents.