Testing Jaiph Workflows

Overview

Jaiph ships a small native test runner for workflow modules. You write *.test.jh (or *.test.jph) files that import workflows under test, optionally replace prompts and other symbols with mocks, run workflows through the same managed runtime as jaiph run, and assert on captured output or return values.

Why mocks matter. Real workflows call LLMs, shell, and other workflows, so their output is non-deterministic and environment-dependent. The test harness injects canned prompt responses and can substitute shell bodies for workflows, rules, and functions so runs stay fast, repeatable, and offline-friendly.

Core concepts

File naming and layout

Running tests

Passing a path to a *.test.jh or *.test.jph file directly to jaiph is treated like jaiph test on that file (see CLI).

# All *.test.jh / *.test.jph files under the detected workspace root (recursive)
jaiph test

# All tests under a directory (recursive)
jaiph test ./e2e

# One file (equivalent: jaiph ./e2e/workflow_greeting.test.jh)
jaiph test ./e2e/workflow_greeting.test.jh

Test blocks

Inside a *.test.jh file you may use:

  1. Imports — Same syntax as workflow files: import "<path>" as <alias>.
  2. Test blocks — test "description" { ... } with one or more steps.

Workflow references in run steps must be exactly <alias>.<workflow> (import alias and workflow name). The compiler validates that the alias exists and the imported module defines that workflow.

Optional shebang (#! on line 1) and full-line # comments at the top level are ignored like in other Jaiph files.

Example

The following matches the repo fixture e2e/workflow_greeting.test.jh (run from the repository root with jaiph test e2e/workflow_greeting.test.jh). The mock string is arbitrary; it is the canned reply injected for the next prompt call.

import "workflow_greeting.jh" as w

test "runs happy path and output contains expected mock" {
  mock prompt "e2e-greeting-mock"
  response = w.default
  expectContain response "e2e-greeting-mock"
  expectContain response "done"
}

Steps inside a test block

Typed prompts (declared with returns '{ ... }'): the mock text must be a single line of JSON that satisfies the declared schema, so parsing and field variables behave as they would with a real agent response (see e2e/prompt_returns_run_capture.test.jh).
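
As a minimal sketch (the import path, workflow name, and JSON field names here are hypothetical, not taken from the repo fixtures), a typed-prompt test might look like:

import "extract.jh" as x

test "typed prompt parses a one-line JSON mock" {
  # Hypothetical: x.default calls a prompt declared with returns { ... }
  # whose schema expects "name" and "ok"; quoting follows normal
  # string-literal escaping.
  mock prompt "{\"name\": \"Ada\", \"ok\": true}"
  out = x.default
  expectContain out "Ada"
}

The mock must stay on one line so the runner can treat it as the agent's complete JSON reply.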

Pass/fail reporting

Each test block runs in isolation. A failed assertion, a shell error, or a workflow exiting non-zero (without allow_failure) marks that case as failed.

The runner prints:

testing <file name>
  ▸ <description>
  ✓ <elapsed>s                          # success: checkmark line follows the ▸ line
  ▸ <description>
  ✗ <first error line> <elapsed>s       # failure
    <further stderr lines, indented>

✗ 1 / 2 test(s) failed
  - <failed test description>

When every test passes, the summary is ✓ N test(s) passed. Exit status is 0 on full success, non-zero if any test failed.
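
Because the exit status mirrors the summary, the runner slots directly into CI gates. A shell sketch (script name and layout hypothetical):

#!/bin/sh
set -e            # abort on the first failing command
jaiph test ./e2e  # exits non-zero if any test failed, failing the job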

Given / When / Then

Structure is optional; comments and blank lines are fine:

import "app.jh" as app

test "default workflow prints greeting" {
  # Given
  mock prompt "hello"

  # When
  out = app.default

  # Then
  expectContain out "hello"
}

(Use a real import path next to your test file.)

Stress and soak testing

For concurrency-sensitive behavior (for example parallel inbox dispatch), the repository includes shell-based E2E scenarios that go beyond single native tests:

Examples: e2e/tests/93_inbox_stress.sh, e2e/tests/94_parallel_shell_steps.sh (background jobs, wait, concurrent stdout capture, artifact checks). For run_summary.jsonl contracts (LOG / LOGERR, inbox events, step pairing) under run.inbox_parallel, see e2e/tests/88_run_summary_event_contract.sh (needs python3).

E2E testing

Shell harnesses and CI expectations for the full repo are described in Contributing — E2E testing.

Limitations (v1)