Jaiph ships a small native test runner for workflow modules. You write *.test.jh (or *.test.jph) files that import workflows under test, optionally replace prompts and other symbols with mocks, run workflows through the same managed runtime as jaiph run, and assert on captured output or return values.
Why mocks matter. Real workflows call LLMs, shell, and other workflows. That output is non-deterministic and environment-dependent. The test harness records mock prompt responses and can substitute shell bodies for workflows, rules, and functions so runs stay fast, repeatable, and offline-friendly.
Core concepts
- Test files: `.test.jh` or `.test.jph`, discovered by `jaiph test`. Each file lists imports and one or more `test "..." { ... }` blocks.
- Assertions: `expectContain`, `expectNotContain`, and `expectEqual` check the captured string. Capture semantics are described under Workflow run (capture) below.
- Naming: use the `.test.jh` or `.test.jph` suffix (for example `workflow_greeting.test.jh`).
- Contents: imports and `test` blocks only. The parser may accept other top-level declarations in a `*.test.jh` file, but `*.test.jh` / `*.test.jph` modules are skipped when the workspace is compiled to `.sh`, so workflows or rules defined only in a test file are never emitted as runnable shell modules. Keeping tests to imports plus `test` blocks avoids dead code and matches how the runner is meant to be used.
- Imports: `import "..." as alias` paths resolve relative to the directory of the test file, with the same extension fallback as ordinary modules (try `.jh`, then `.jph`). See Grammar - Import path.
- Discovery: `jaiph test` walks the given directory recursively (or the workspace root when no path is passed). The workspace root is found by walking up from the current directory until a `.jaiph` or `.git` directory exists; if neither is found, the current directory is used. A path to a `*.test.jh` or `*.test.jph` file is treated like `jaiph test` on that file (see CLI).
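With the relative-path resolution above, a test file typically sits next to the module it exercises. A sketch of such a layout (the names here are illustrative, not repo fixtures):

```
app/
  workflow_greeting.jh        # module under test
  workflow_greeting.test.jh   # imports it as: import "workflow_greeting.jh" as w
```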
# All *.test.jh / *.test.jph files under the detected workspace root (recursive)
jaiph test
# All tests under a directory (recursive)
jaiph test ./e2e
# One file (equivalent: jaiph ./e2e/workflow_greeting.test.jh)
jaiph test ./e2e/workflow_greeting.test.jh
Inside a *.test.jh file you may use:
- `import "<path>" as <alias>`.
- `test "description" { ... }` with one or more steps.
- Workflow references in run steps must be exactly `<alias>.<workflow>` (import alias and workflow name). The compiler validates that the alias exists and that the imported module defines that workflow.
Optional shebang (#! on line 1) and full-line # comments at the top level are ignored like in other Jaiph files.
The following matches the repo fixture e2e/workflow_greeting.test.jh (run from the repository root with jaiph test e2e/workflow_greeting.test.jh). The mock string is arbitrary; it is the canned reply injected for the next prompt call.
import "workflow_greeting.jh" as w
test "runs happy path and output contains expected mock" {
    mock prompt "e2e-greeting-mock"
    response = w.default
    expectContain response "e2e-greeting-mock"
    expectContain response "done"
}
- Shell steps: plain lines in a test body run as shell (`mkdir`, etc.). Lines whose trimmed form starts with `#` are comments and are skipped (not emitted as shell).
- `mock prompt "<response>"`: queues one response for the next `prompt` in the workflow under test. Single-quoted responses are allowed. Multiple `mock prompt` lines queue in order (one consumption per `prompt` call). Do not mix with a `mock prompt { ... }` block in the same test: if a block mock is present, inline `mock prompt "..."` steps are not emitted and have no effect (see Limitations).
- `mock prompt { ... }`: content-based dispatch. The body must follow this shape only: `if $1 contains "pattern" ; then`, then any number of `elif $1 contains "pattern" ; then` branches (`elseif` is also accepted), then an optional `else`; each branch requires a single `respond "..."` line before the next branch keyword or `fi`; then `fi`, then the closing `}` of `mock prompt`. Matching uses substring match on the prompt text (the same idea as "contains"). The first matching branch wins. Without `else`, an unmatched prompt fails the test with a short preview of the prompt text.
- `mock workflow <ref> { ... }`: replaces that workflow for this test with the given shell body (e.g. `echo ok`). `<ref>` is `<alias>` or `<alias>.<workflow>`; a single-segment ref is resolved against the first import's module symbol (prefer the two-part form for clarity).
- `mock rule <ref> { ... }`: the same for a rule.
- `mock function <ref> { ... }`: the same for a function; `<ref>` is `<name>` or `<alias>.<name>`.
- `name = <alias>.<workflow>`: runs the workflow like `jaiph run` and captures the result. Capture prefers the workflow's explicit return value when the callee wrote one; otherwise the harness stores combined stdout and stderr with lines starting with `__JAIPH_EVENT__` removed. The test fails on non-zero exit unless you add `allow_failure`. Variants: an optional single string argument (`name = w.default "arg"`) and/or `allow_failure` (`name = w.default allow_failure`, `name = w.default "arg" allow_failure`). This is the test form of managed invocation: do not wrap the workflow call in `$(…)`; see Grammar - Managed calls vs command substitution. A legacy alternate for allow-failure capture is `name=$( { alias.workflow 2>&1; } || true )`, equivalent to `allow_failure` on an assignment capture.
- `<alias>.<workflow>` or `<alias>.<workflow> "arg"`: runs without storing output; still fails on non-zero exit unless `allow_failure` is appended (same optional argument patterns as above).
- `expectContain` / `expectNotContain` / `expectEqual`: `expectContain <var> "substring"`, etc. The expected string must be double-quoted (escape `"` inside the string with `\"` if needed). Failures print expected vs. actual previews where applicable.
- Typed prompts (`returns '{ ... }'`): mock text must be one line of JSON that satisfies the schema so parsing and field variables behave like a real agent response (see `e2e/prompt_returns_run_capture.test.jh`).
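As an illustrative sketch of content-based dispatch (the module path `workflow_support.jh`, the workflow `w.default`, and the prompt contents are hypothetical, not repo fixtures; the dispatcher body follows the shape described above):

```
import "workflow_support.jh" as w

test "dispatches canned replies by prompt content" {
    # Assumption: w.default issues a prompt whose text contains "greet".
    mock prompt {
        if $1 contains "greet" ; then
            respond "hello from mock"
        elif $1 contains "farewell" ; then
            respond "goodbye from mock"
        else
            respond "fallback reply"
        fi
    }
    out = w.default
    expectContain out "hello from mock"
}
```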
Each test block runs in isolation. A failed assertion, a shell error, or a workflow exiting non-zero (without `allow_failure`) marks that case as failed.
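A sketch of the `allow_failure` semantics (the module `workflow_flaky.jh` and its behavior are hypothetical): with `allow_failure`, a non-zero exit is tolerated and the combined output is still captured for assertions.

```
import "workflow_flaky.jh" as f

test "captures output from a workflow that exits non-zero" {
    mock prompt "canned reply"
    # Without allow_failure, this step would fail the test on non-zero exit.
    out = f.default allow_failure
    expectContain out "canned reply"
}
```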
The runner prints:
testing <file name>
▸ <description>
✓ <elapsed>s # success: checkmark line follows the ▸ line
▸ <description>
✗ <first error line> <elapsed>s # failure
<further stderr lines, indented>
✗ 1 / 2 test(s) failed
- <failed test description>
When every test passes, the summary is ✓ N test(s) passed. Exit status is 0 on full success, non-zero if any test failed.
Structure is optional; comments and blank lines are fine:
import "app.jh" as app
test "default workflow prints greeting" {
    # Given
    mock prompt "hello"
    # When
    out = app.default
    # Then
    expectContain out "hello"
}
(Use a real import path next to your test file.)
For concurrency-sensitive behavior (for example parallel inbox dispatch), the repository includes shell-based E2E scenarios that go beyond single native tests:
Examples: e2e/tests/93_inbox_stress.sh, e2e/tests/94_parallel_shell_steps.sh (background jobs, wait, concurrent stdout capture, artifact checks). For run_summary.jsonl contracts (LOG / LOGERR, inbox events, step pairing) under run.inbox_parallel, see e2e/tests/88_run_summary_event_contract.sh (needs python3).
Shell harnesses and CI expectations for the full repo are described in Contributing — E2E testing.
- Prompt mocks are configured inside each test (queued `mock prompt "..."` steps or a single `mock prompt { ... }` dispatcher). Older external mock config formats are not supported.
- Do not mix `mock prompt { ... }` with `mock prompt "..."` in the same test block; only the block path is active and inline queue steps are ignored in generated bash.
- In capture, an explicit `return` wins over stdout/stderr aggregation (with internal event lines stripped from the latter).
- Expected strings must be double-quoted on `expect*` lines.
- Additional arguments after `jaiph test <file>` are currently unused by the runner.