A shared workflow needs shared expectations: which branch to target, how to build from a clone, and what evidence a change should carry. This page is that contract for Jaiph — branching, local install, code and testing philosophy, the layered test stack (TypeScript, txtar, goldens, bash E2E), and what CI enforces. It does not teach the language; for that, use Getting Started (documentation map), Setup (install and workspace), and Grammar. For how the implementation is structured (components, compile and run pipelines, buildRuntimeGraph vs validation, runtime contracts, artifact paths), use Architecture as the source of truth.
Development moves quickly and may include breaking changes. Two primary branches: main (stable) and nightly (latest).
- main branch, and check if the issue has already been addressed in nightly.
- nightly (and add or update tests in e2e/tests).
- e2e/tests).

Use the local installer wrapper script in this repo:
./docs/install-from-local.sh
After install, verify:
jaiph --version
jaiph --help
The script installs from local source (including uncommitted changes) and places the CLI in ~/.local/bin by default (or JAIPH_BIN_DIR if set).
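To confirm where the CLI landed, the documented default can be mirrored in a few lines of shell. This is a minimal sketch: jaiph_bin_dir and on_path are hypothetical helper names, and how install-from-local.sh resolves the directory internally is assumed from the sentence above, not taken from the script.

```shell
#!/usr/bin/env bash
# Sketch only: mirrors the documented default (JAIPH_BIN_DIR, else ~/.local/bin).
# jaiph_bin_dir and on_path are hypothetical helpers, not part of the installer.

jaiph_bin_dir() {
  printf '%s\n' "${JAIPH_BIN_DIR:-$HOME/.local/bin}"
}

# Succeeds when the given directory is one of the entries in PATH.
on_path() {
  case ":${PATH}:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

if ! on_path "$(jaiph_bin_dir)"; then
  echo "note: $(jaiph_bin_dir) is not on PATH; add it before running jaiph" >&2
fi
```

If the note appears, adding the printed directory to PATH should make jaiph --version resolve.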
For day-to-day work on the compiler and CLI you usually stay inside the clone: install dev dependencies once, then build and run tests from npm scripts.
Prerequisites: Node.js and npm (the installer also expects git and bash). End-to-end tests are written in bash and are run by e2e/test_all.sh.
Typical commands (from the repo root):
| Command | What it runs |
|---|---|
| npm install | Installs TypeScript and types (dev dependencies). |
| npm run build | Runs tsc, then copies src/runtime → dist/src/runtime (kernel JS for the compiled CLI) and runtime/overlay-run.sh → dist/src/runtime/overlay-run.sh (Docker overlay entrypoint). |
| npm run build:standalone | npm run build, then copies dist/src/runtime → dist/runtime and runs bun build --compile on src/cli.ts → dist/jaiph. Requires Bun. Ship the dist/ tree (binary plus the runtime directory) for a self-contained layout. |
| npm test | npm run clean, then npm run build, then the Node.js test runner with JAIPH_UNSAFE=true, NODE_OPTIONS including --enable-source-maps and a large heap limit, on every file under dist/integration/ matching *.test.js, every file under dist/src/ matching *.test.js or *.acceptance.test.js (via find), dist/test-infra/compiler-test-runner.js (txtar compiler tests), and dist/test-infra/golden-ast-runner.js (golden AST tests). |
| npm run test:compiler | npm run build, then node --test on dist/test-infra/compiler-test-runner.js. Runs txtar-based compiler test fixtures from test-fixtures/compiler-txtar/. |
| npm run test:golden-ast | npm run build, then node --test on dist/test-infra/golden-ast-runner.js. Runs golden AST tests from test-fixtures/golden-ast/. Use UPDATE_GOLDEN=1 npm run test:golden-ast to regenerate goldens after intentional parser changes. |
| npm run test:acceptance:compiler | npm run build, then node --test on only dist/src/**/*.acceptance.test.js: compiler acceptance tests without the full unit suite or E2E. |
| npm run test:acceptance:runtime | bash ./e2e/test_all.sh only: the same E2E driver as below without an implicit rebuild; ensure dist/ is up to date before running. |
| npm run test:acceptance | npm run test:acceptance:compiler, then npm run test:acceptance:runtime. |
| npm run test:e2e | npm run build, then bash ./e2e/test_all.sh. Prefer this when you want a fresh dist/ before E2E. By default this exercises the Docker sandbox when JAIPH_UNSAFE is unset. For a faster host-only run (no container), use JAIPH_UNSAFE=true npm run test:e2e. |
| npm run test:samples | npx playwright test: the Playwright suite for the docs landing page (e2e/playwright/). Uses http://127.0.0.1:4000 (see playwright.config.ts); starts Jekyll via webServer or reuses one already on that port. Requires Playwright (npx playwright install chromium once). |
| npm run test:ci | npm test followed by npm run test:e2e. Useful before pushing when you want the full local picture. |
Run a single Node test file after a build with e.g. node --test dist/src/parse/parse-core.test.js. The dist/ paths mirror the source layout under src/.
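Because dist/ mirrors the source layout, the compiled path for a given test can be derived mechanically. The following is a minimal sketch assuming only that mapping (prefix dist/, extension .ts becomes .js); dist_test_path is a hypothetical helper, not a script in the repo.

```shell
#!/usr/bin/env bash
# Sketch only: derive the compiled test path from a TypeScript test path,
# assuming dist/ mirrors the source tree and .ts becomes .js.
# dist_test_path is a hypothetical helper, not a script in the repo.

dist_test_path() {
  local src="$1"
  printf 'dist/%s.js\n' "${src%.ts}"
}

dist_test_path "src/parse/parse-core.test.ts"

# After `npm run build`, the result can be passed to the Node test runner:
#   node --test "$(dist_test_path src/parse/parse-core.test.ts)"
```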
The root .gitignore blocks common debug and temp directory patterns so they never reach version control:
| Pattern | Purpose |
|---|---|
| docker-*/ | Leftover Docker debug/experiment directories |
| nested-*/ | Nested-run debug directories |
| overlay-*/ | Overlay/fuse debug directories |
| local-*/ | Local debug directories |
| .tmp*/ | Temp build/debug directories |
| QUEUE.md.tmp.* | Stale queue temp files |
If you create throwaway directories during development, use one of these prefixes so they are automatically ignored. To track a file that matches a blocked pattern, use git add -f.
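As a quick illustration of the prefixes above, a name can be checked against them before creating a scratch directory. This is a sketch under the assumption that only the directory name matters; is_scratch_name is a hypothetical helper, and Git remains the authority on what is actually ignored.

```shell
#!/usr/bin/env bash
# Sketch only: checks a directory name against the ignored prefixes listed above.
# is_scratch_name is a hypothetical helper; git itself is the authority.

is_scratch_name() {
  case "$1" in
    docker-*|nested-*|overlay-*|local-*|.tmp*) return 0 ;;
    *) return 1 ;;
  esac
}

for name in overlay-repro local-notes scratch; do
  if is_scratch_name "$name"; then
    echo "$name: matches an ignored prefix"
  else
    echo "$name: would be tracked"
  fi
done
```

To ask Git directly which rule applies to a path, git check-ignore -v followed by the path prints the matching pattern and its source file.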
Jaiph’s codebase is maintained by both humans and AI agents. Code should be easy to read, navigate, and modify for both — which means the same thing: straightforward, flat, and explicit.
Jaiph uses several test layers. Each layer catches a different class of bug. Use the narrowest layer that covers the behavior you’re verifying.
| Layer | Location | What it catches | When to use |
|---|---|---|---|
| Module tests | src/**/*.test.ts (colocated) | Bugs in pure functions (event parsing, param formatting, path resolution, config merging) | The function is self-contained, takes input and returns output, no I/O |
| Compiler acceptance tests | src/transpile/*.acceptance.test.ts (colocated) | Cross-module compiler behavior: validation errors, resolution, and other cases that need a temp project tree or subprocess | You need a deterministic error string, multi-file buildScripts, or behavior that does not fit a tiny golden snippet |
| Compiler golden tests | src/transpile/compiler-golden.test.ts (colocated) | Regressions in the parser, validation messages, and scripts-only extraction (buildScriptFiles in emit-script.ts); expectations are inline in the test file | You changed the parser, validator, or script extraction and need to lock an exact error string, extracted script shape, or corpus behavior |
| Compiler tests (txtar) | test-fixtures/compiler-txtar/*.txt | Parse and validate outcomes (success, parse errors, validation errors) using language-agnostic txtar fixtures (hundreds of === cases across the four *.txt files) | You want a portable test case that can be reused by alternative compiler implementations; the test is a .jh input paired with an expected outcome |
| Golden AST tests | test-fixtures/golden-ast/fixtures/*.jh + test-fixtures/golden-ast/expected/*.json | Parse tree shape for successful parses, serialized to deterministic JSON with locations stripped (9 fixtures: e.g. imports, brace-if, log, match and match-multiline, params, prompt-capture, run-ensure, script-defs) | You changed the parser and need to verify the AST structure hasn’t drifted; txtar tests only check pass/fail, goldens lock in the actual tree shape |
| Integration tests | integration/*.test.ts, integration/sample-build/*.test.ts | Process-level integration behavior: signal handling, TTY rendering, run summary structure, sample builds | The test spans multiple modules or requires subprocess/PTY harnesses |
| E2E tests | e2e/tests/*.sh | Runtime behavior: does the workflow actually execute correctly end-to-end? | The behavior involves the CLI launcher, Node runtime, process lifecycle, or file artifacts |
- buildScripts / emitScriptsForModule run validateReferences before any script files are written. buildRuntimeGraph() only parses modules and follows imports; it does not re-run that validation. Lock compile errors in the compiler/validator tests; the runtime graph is the wrong layer for that (see Architecture: Transpiler / Node workflow runtime).
- Compiler golden tests run as part of npm test. Failures in src/transpile/compiler-golden.test.ts usually mean updating an explicit expected string or fixture in that file. There is no separate dump script; align expectations with intentional emitter changes and re-run npm test. Golden AST tests (test-fixtures/golden-ast/) complement this by locking in the parse tree shape. If those fail, regenerate with UPDATE_GOLDEN=1 npm run test:golden-ast and review the diff.
- E2E tests assert on both what the CLI prints (e2e::expect_stdout) and what the runtime persists (artifact files via e2e::expect_out, e2e::expect_file). A bug could break one without the other.

- Module tests are colocated under src/ (e.g. src/parse/parse-core.test.ts, src/cli/run/display.test.ts, src/transpile/compiler-golden.test.ts). Names are *.test.ts or *.acceptance.test.ts.
- Integration tests live in integration/ (see Integration tests).
- E2E tests live in e2e/tests/*.sh, driven by e2e/test_all.sh.

npm test discovers colocated files under src/, integration tests under integration/, and test infrastructure in test-infra/; see the Developing in the repository table for the exact command.

Module tests live next to the source files they validate, inside the same src/ tree. Names are *.test.ts or *.acceptance.test.ts. To list them from the repo root:
find src -type f \( -name '*.test.ts' -o -name '*.acceptance.test.ts' \) | sort
Grouping (use the find output as authoritative after refactors):
| Area | Typical location | What it usually covers |
|---|---|---|
| Parser and tokenizer helpers | src/parse/*.test.ts, src/parse/dedent.test.ts | .jh / .test.jh surface: imports, config, steps, strings, channels, fences, run async, … |
| CLI and terminal UX | src/cli/**/*.test.ts | Commands, jaiph run lifecycle, progress, hooks, resolve-env |
| Transpiler and validation | src/transpile/*.test.ts + *.acceptance.test.ts | validateReferences, emit, golden compiler (compiler-golden.test.ts), cross-module edge cases (compiler-edge.acceptance.test.ts) |
| Formatter | src/format/*.test.ts | jaiph format |
| Runtime and Docker | src/runtime/kernel/*.test.ts, src/runtime/docker.test.ts | Graph, emit, prompts, test runner, workflow launch, docker helper |
| Standalone root tests | e.g. src/inline-script-name.test.ts | Small colocated cases that are not under a feature subtree |
When adding a new source module or extending an existing one, create or extend the corresponding *.test.ts in the same directory. For kernel internals, the compile path, and artifact contracts, see Architecture.
Tests that span multiple modules, require subprocess/PTY harnesses, or exercise process-level behavior live in integration/. These do not belong to a single module:
| Test file | Kind | What it covers |
|---|---|---|
| integration/sample-build/build.test.ts | Integration | Build/transpile behavior: buildScripts, buildScriptFiles, script extraction |
| integration/sample-build/cli-tree.test.ts | Integration | CLI tree output rendering for sample workflows |
| integration/sample-build/run-core.test.ts | Integration | Core runtime execution: workflow runs, step sequencing, artifacts |
| integration/sample-build/run-prompt-agent.test.ts | Integration | Prompt and agent interaction in sample workflows |
| integration/sample-build/recover-handle.test.ts | Integration | recover / Handle<T> async behavior in sample workflows |
| integration/sample-build/test-advanced.test.ts | Integration | Advanced test harness behavior: mocks, channels, edge cases |
| integration/sample-build/test-framework.test.ts | Integration | Test framework basics: mock prompt, expect_*, test block lifecycle |
| integration/run-summary-jsonl.test.ts | Integration | Runs the CLI on a small workflow and asserts structure and fields of run_summary.jsonl under .jaiph/runs/ |
| integration/signal-lifecycle.test.ts | Acceptance | After SIGINT/SIGTERM, verifies jaiph run exits within a time bound and leaves no stale child processes |
| integration/tty-running-timer.test.ts | Acceptance | In a TTY, verifies the “RUNNING workflow” line updates over time (requires Python 3 PTY harness) |
The integration/sample-build/ directory also has a shared helpers.ts module used by the sample-build tests. Shared test fixtures (.jh source files and expected output) live in test-fixtures/sample-build/.
The project uses GitHub Actions (.github/workflows/ci.yml). The workflow defines six jobs; on a typical feature-branch push, five of them run. The sixth — Publish Docker runtime image — runs only on pushes to nightly and on v* version tags, after the other jobs succeed. It builds and pushes ghcr.io/jaiphlang/jaiph-runtime (the default runtime.docker_image / JAIPH_DOCKER_IMAGE when Docker sandboxing is on; see Docker runtime helper in Architecture).
| Job | Runner | Purpose |
|---|---|---|
| ShellCheck | ubuntu-latest | Runs shellcheck on runtime/overlay-run.sh to lint the standalone shell script shipped in the npm package. |
| Compiler and unit tests | ubuntu-latest | npm test (TypeScript unit + acceptance + golden tests), plus a curl check that the public install URL responds and a git-tag verification on main. |
| E2E install and CLI workflow | Matrix: ubuntu-latest twice + macos-latest | npm run test:e2e (full build-and-run E2E suite). In CI, the docker matrix leg builds jaiph-ci-runtime:local from runtime/Dockerfile and sets JAIPH_DOCKER_IMAGE so the job does not pull the public GHCR image during the run. Ubuntu docker leg: JAIPH_UNSAFE unset (container sandbox). Ubuntu and macOS host legs: JAIPH_UNSAFE=true (no Docker; macOS does not run the docker leg). On a developer machine, with JAIPH_UNSAFE unset, the CLI still resolves the default image (typically ghcr.io/jaiphlang/jaiph-runtime) for Docker-backed runs; see src/runtime/docker.ts and Architecture. |
| Getting started (local) | ubuntu-latest | Serves the Jekyll site from docs/ on 127.0.0.1:4000, smoke-checks key routes with curl, builds the same local runtime image as E2E for any Docker-backed sample paths, installs Playwright (Chromium), and runs npx playwright test for landing-page samples. The Playwright step builds Jaiph, checks sample source against examples/*.jh, and runs deterministic samples through the CLI. No runtime dependency on jaiph.org for the site content. |
| E2E install and CLI workflow (windows-latest + wsl) | windows-latest | Provisions or selects a WSL distro, installs Node inside it, and runs npm run test:e2e under WSL with JAIPH_UNSAFE=true. |
| Publish Docker runtime image | ubuntu-latest | Conditional (see above). Multi-arch push to GHCR. |
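The ref condition for the publish job can be pictured as a simple pattern match. This is a sketch only: publish_job_applies is a hypothetical helper, refs are written in the refs/heads/ and refs/tags/ form that GitHub exposes as GITHUB_REF, the sample refs are illustrative, and the real condition (which also waits for the other jobs to succeed) lives in .github/workflows/ci.yml.

```shell
#!/usr/bin/env bash
# Sketch only: models the ref condition described above for the publish job.
# publish_job_applies is a hypothetical helper; the real condition lives in
# .github/workflows/ci.yml and also requires the other jobs to succeed.

publish_job_applies() {
  case "$1" in
    refs/heads/nightly|refs/tags/v*) return 0 ;;
    *) return 1 ;;
  esac
}

for ref in refs/heads/nightly refs/tags/v1.2.3 refs/heads/feature/docs; do
  if publish_job_applies "$ref"; then
    echo "$ref: publish job runs"
  else
    echo "$ref: publish job skipped"
  fi
done
```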
Pushing a version tag (v*) triggers .github/workflows/release.yml, which publishes to npm using trusted publishing (OIDC). No classic NPM_TOKEN secret is stored in the repo. After a successful publish, a smoke job installs jaiph globally and verifies --version and --help match expectations. The npm package must have trusted publishing enabled for the jaiphlang/jaiph repo and release.yml workflow on npmjs.com.
The Getting started (local) CI job validates that the documentation site under docs/ can be built and served from source. It uses Ruby 3.2 with bundler-cache, runs bundle exec jekyll serve --host 127.0.0.1 --port 4000 in the background, and polls http://127.0.0.1:4000/ for up to 30 seconds before asserting HTTP 200 on /, /getting-started, /setup, /libraries, and /artifacts. The same job also prepares Node, a local jaiph-ci-runtime:local image, Playwright Chromium, and (for samples that need them) external CLIs — see the docs-local job in .github/workflows/ci.yml for the exact package list, which can change.
To run the same check locally:
cd docs
bundle install # first time only
bundle exec jekyll serve --host 127.0.0.1 --port 4000
# In another terminal:
curl -fsSL http://127.0.0.1:4000/
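The CI job polls the server before asserting on routes. A generic polling loop along those lines might look like the sketch below; wait_until is a hypothetical helper, and the actual CI step may be written differently.

```shell
#!/usr/bin/env bash
# Sketch only: retry a command once per second until it succeeds or a
# timeout (in seconds) elapses. wait_until is a hypothetical helper.

wait_until() {
  local timeout="$1"
  shift
  local deadline=$(( SECONDS + timeout ))
  until "$@"; do
    if [ "$SECONDS" -ge "$deadline" ]; then
      return 1
    fi
    sleep 1
  done
}

# Usage against a locally served site (not executed here):
#   wait_until 30 curl -fsS -o /dev/null http://127.0.0.1:4000/
#   for route in / /getting-started /setup /libraries /artifacts; do
#     curl -fsS -o /dev/null "http://127.0.0.1:4000${route}"
#   done
```

Passing the probe as a command keeps the loop independent of curl, so the same helper can wait on any readiness check.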
The Jekyll project lives entirely inside docs/ — Gemfile, _config.yml, layouts, and all Markdown pages.
After the Jekyll smoke-check, the CI job also verifies that code samples shown on the landing page match real CLI behavior. This uses Playwright (Chromium) with a test suite in e2e/playwright/landing-page.spec.ts.
The test does two things:
1. Source parity: reads each sample's source from the page ([data-sample-source] elements inside [data-sample] tab panels) and compares it byte-for-byte against the corresponding file in examples/ (identified by data-sample-file).
2. Output verification: for deterministic samples (e.g. the say_hello.jh failure path and agent_inbox.jh), runs the workflow via node dist/src/cli.js run and asserts that key output lines match what the page displays ([data-sample-output] blocks), after normalizing ANSI codes, timestamps, and trailing whitespace.

To run locally:
npm run test:samples
If a Jekyll server is already running on http://127.0.0.1:4000, Playwright reuses it (reuseExistingServer in playwright.config.ts). Otherwise it starts one.
Samples that require live agent backends (e.g. async.jh, recover_loop.jh) are verified for source parity only — output verification is limited to fully deterministic workflows.
The E2E test suite (e2e/tests/*.sh) exercises the full build-and-run pipeline from the outside: compile a workflow, run it, and assert on both the CLI tree output and the run artifact files (.out, .err) written to .jaiph/runs/.
Some scripts are contract tests: they validate persisted machine-readable output (for example e2e/tests/88_run_summary_event_contract.sh and run_summary.jsonl) in addition to or instead of golden CLI trees.
E2E tests are the outermost behavior contracts for the CLI and runtime. Each test should exercise the real pipeline and assert on two independent surfaces:
1. CLI tree output (e2e::expect_stdout with a heredoc).
2. Run artifacts under .jaiph/runs/<date>/<source>/ (e2e::expect_out, e2e::expect_file, e2e::expect_run_file). Inbox files live under the run directory when the feature touches inbox behavior.

Default contract: every assertion should compare the full expected text (stdout heredoc, artifact file contents, JSONL lines) unless there is a documented exception. Use e2e::expect_stdout, e2e::expect_out, e2e::expect_file, e2e::expect_run_file, or e2e::assert_equals / e2e::assert_output_equals for full comparisons.
e2e::assert_contains (substring check) is allowed only when full equality is not feasible. Every such use must have an inline comment explaining why. Valid reasons:
- <time> normalization.
- run_summary.jsonl with platform-dependent event counts, or live step output where line count varies.

Normalization: e2e::normalize_output (in e2e/lib/common.sh) strips ANSI codes, replaces timing values with <time>, normalizes some CLI-specific strings (<agent-command>, <script-path>), and sorts a class of async progress lines (UTF-8 subscript markers) so strict equality stays stable when parallel branches finish in different orders. This keeps full-equality heredocs usable across machines.
Where files land on disk (directory tree, sequence prefixes): Architecture — Durable artifact layout. Runtime testing with *.test.jh is covered in Testing. The run_summary.jsonl event contract is exercised in e2e/tests/88_run_summary_event_contract.sh.
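To make the normalization idea concrete, here is a reduced sketch of two of the steps performed by e2e::normalize_output (ANSI stripping and timing replacement), plus trailing-whitespace trimming. It is not the real helper: normalize_sketch is a hypothetical name, and the timing format it matches (such as "(0.3s)" or "(120ms)") is an assumption.

```shell
#!/usr/bin/env bash
# Sketch only: a reduced version of the normalization idea, not e2e::normalize_output.
# Assumes timings are printed like "(0.3s)" or "(120ms)"; the real formats may differ.

normalize_sketch() {
  local esc=$'\033'
  sed -E \
    -e "s/${esc}\[[0-9;]*[A-Za-z]//g" \
    -e 's/\([0-9]+(\.[0-9]+)?(ms|s)\)/(<time>)/g' \
    -e 's/[[:space:]]+$//'
}

# A colored line with a timing and trailing spaces, piped through the filter.
printf '\033[32m✓ script hello_impl (0.3s)\033[0m  \n' | normalize_sketch
```

After a filter like this, two runs on different machines produce the same text, which is what allows heredoc comparisons to use strict equality.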
Every E2E test follows a Given / When / Then pattern using helper functions from e2e/lib/common.sh. The helpers eliminate boilerplate so each test reads like a specification:
#!/usr/bin/env bash
set -euo pipefail
ROOT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)"
source "${ROOT_DIR}/e2e/lib/common.sh"
trap e2e::cleanup EXIT
e2e::prepare_test_env "my_test"
TEST_DIR="${JAIPH_E2E_TEST_DIR}"
e2e::section "Feature under test"
# Given — create the workflow file inline (script + workflow; same shape as e2e/tests/10_basic_workflows.sh)
e2e::file "hello.jh" <<'EOF'
script hello_impl = `echo "hello-jh"`
workflow default() {
const msg = run hello_impl()
return "${msg}"
}
EOF
# When — build and run
hello_out="$(e2e::run "hello.jh")"
# Then — assert on CLI tree output
e2e::expect_stdout "${hello_out}" <<'EOF'
Jaiph: Running hello.jh
workflow default
▸ script hello_impl
✓ script hello_impl (<time>)
✓ PASS workflow default (<time>)
EOF
# Then — assert on run artifacts
e2e::expect_out_files "hello.jh" 2
e2e::expect_out "hello.jh" "hello_impl" "hello-jh"
When you need a specific sequence-prefixed filename (e.g. 000002-module__step.out), use e2e::expect_run_file — see e2e/tests/72_docker_run_artifacts.sh and similar.
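The shape of a sequence-prefixed name can be sketched from the examples on this page: 000002-module__step.out, and dot-separated rule names written with double underscores (lib.ready becomes lib__ready). Both helpers below are hypothetical, and the six-digit zero padding is inferred from the single example; Architecture remains the authoritative description of the naming scheme.

```shell
#!/usr/bin/env bash
# Sketch only: builds names shaped like the examples on this page
# (000002-module__step.out, lib.ready -> lib__ready). Both helpers are
# hypothetical; the authoritative naming scheme is in Architecture.

# Replace every dot in a step or rule name with a double underscore.
normalize_step_name() {
  printf '%s\n' "${1//./__}"
}

# Compose <6-digit sequence>-<module>__<step>.<ext>; ext defaults to "out".
artifact_name() {
  local seq="$1" module="$2" step="$3" ext="${4:-out}"
  printf '%06d-%s__%s.%s\n' "$seq" "$module" "$step" "$ext"
}

artifact_name 2 module step
normalize_step_name lib.ready
```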
All helpers are defined in e2e/lib/common.sh.
| Helper | Description |
|---|---|
| e2e::prepare_test_env "name" | Set up a clean, isolated test environment: shared context, local install, temp test directory. Call once at the top of each test. |
| e2e::cleanup | Remove temp directories and stop any local server. Register with trap e2e::cleanup EXIT. |
| e2e::section "label" | Print a == label == header for visual grouping of assertions. |
| Helper | Description |
|---|---|
| e2e::file "name" <<'EOF' ... EOF | Write a workflow file into the test directory from a heredoc. Creates parent directories as needed. |
| e2e::run "file" [args...] | Build and run a workflow file. Returns the CLI stdout for capture. |
| e2e::expect_fail "file" [args...] | Assert that running the workflow fails (non-zero exit). |
| e2e::run_dir "file" | Return the path of the single run directory for file under .jaiph/runs/. Fails if zero or more than one match. |
| e2e::run_dir_at "base" "file" | Same as e2e::run_dir but searches under a custom base directory. |
| e2e::latest_run_dir_at "base" "file" | Return the path of the most recent run directory for file under a custom base. Useful when a file may have been run multiple times. |
| e2e::git_init | Initialize a git repo in the test directory (portable across bash versions). |
| e2e::git_current_branch | Return the current branch name (defaults to main if detached). |
| e2e::readonly_sandbox_available | Return success if Linux read-only sandboxing prerequisites (unshare, passwordless sudo) are available. Use to guard platform-dependent tests with e2e::skip. |
| Helper | Description |
|---|---|
| e2e::expect_stdout "$var" <<'EOF' ... EOF | Assert that the captured CLI output matches the expected heredoc exactly (after ANSI stripping and time normalization). Use <time> as a placeholder for timing values. |
After a workflow runs, its step outputs are written as sequenced artifact files under .jaiph/runs/. These helpers verify artifact content independently from CLI display output. For the on-disk layout and naming scheme, see Architecture — Durable artifact layout.
| Helper | Description |
|---|---|
| e2e::expect_out_files "file" N | Assert that the run directory for file contains exactly N .out files. Use 0 for steps with no stdout (e.g. touch, test, redirected output). |
| e2e::expect_out "file" "step" "expected" | Assert that the .out file for the named step (script, rule, or default workflow bucket) matches expected exactly. |
| e2e::expect_rule_out "file" "rule" "expected" | Assert that the .out file for a rule step matches expected exactly. Dot-separated rule names are normalized (e.g. lib.ready → lib__ready). |
| e2e::expect_run_file "file" "name" "expected" | Assert that a specific named file (e.g. 000002-module__step.out) in the run directory for file matches expected exactly. Use when you need to assert on a file by its sequence-prefixed name. |
| e2e::expect_run_file_at "base" "file" "name" "expected" | Same as e2e::expect_run_file but searches under a custom base directory instead of .jaiph/runs/. Use for tests with custom run.logs_dir or JAIPH_RUNS_DIR. |
| e2e::expect_run_file_count "file" N | Assert that the run directory for file contains exactly N artifact files (.out + .err combined). |
| e2e::expect_run_file_count_at "base" "file" N | Same as e2e::expect_run_file_count but under a custom base directory. |
| e2e::expect_file "glob" <<'EOF' ... EOF | Assert that exactly one file matching glob exists under .jaiph/runs/ and its content matches the heredoc. Useful for .err files or non-standard artifact names. |
| e2e::expect_no_file "glob" | Assert that no file matching glob exists under .jaiph/runs/. |
| Helper | Description |
|---|---|
| e2e::assert_contains "$actual" "$needle" "label" | Assert that actual contains needle. |
| e2e::assert_equals "$actual" "$expected" "label" | Assert exact string equality. |
| e2e::assert_file_exists "path" "label" | Assert that a file exists at path. |
| e2e::assert_file_executable "path" "label" | Assert that a file exists and is executable. |
| e2e::pass "label" | Print a [PASS] line. |
| e2e::fail "label" | Print a [FAIL] line to stderr and exit. |
| e2e::skip "label" | Print a [SKIP] line (for platform-dependent tests). |
Quick reference: default to full-equality helpers (e2e::expect_stdout, e2e::expect_out, e2e::expect_file, e2e::expect_run_file, e2e::assert_equals). e2e::assert_contains is the exception — every use needs an inline comment; rationale list in E2E philosophy above. Audit substring usage:
rg 'e2e::assert_contains' e2e/tests -n
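The search above lists every use; a slightly stricter audit can flag the uses that appear to lack a justification. The sketch below is a heuristic under stated assumptions: a comment counts if a '#' appears on the same line or the line directly above is a comment (whether the project accepts the latter is an assumption), and audit_assert_contains is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Sketch only: flags e2e::assert_contains calls with no '#' on the same line
# and no comment line directly above. Heuristic: a '#' inside a quoted string
# also counts as a comment. audit_assert_contains is a hypothetical helper.

audit_assert_contains() {
  local dir="$1"
  find "$dir" -type f -name '*.sh' -exec awk '
    FNR == 1 { prev = "" }
    /e2e::assert_contains/ && $0 !~ /#/ && prev !~ /^[ \t]*#/ {
      printf "%s:%d: %s\n", FILENAME, FNR, $0
    }
    { prev = $0 }
  ' {} +
}

# Example (from the repo root): audit_assert_contains e2e/tests
```

Each printed line has the form file:line: source, so the output can be reviewed like compiler diagnostics.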
Every .jh and .test.jh file under e2e/ must be referenced by at least one test script (e2e/tests/*.sh, e2e/test_all.sh, or e2e/lib/). Unreferenced samples confuse contributors, hide drift from the canonical examples/ corpus, and make it unclear which fixtures are load-bearing.
The guard script e2e/check_orphan_samples.sh detects orphans automatically. It scans every .jh and .test.jh file under e2e/, checks whether its basename appears in any test runner or helper, and also resolves indirect references (a file imported by another .jh that is itself referenced counts as covered). Any file that is neither directly nor indirectly referenced is reported as an orphan.
# Run manually from the repo root
bash e2e/check_orphan_samples.sh
On success the script prints OK: no orphan e2e samples detected. and exits 0. On failure it lists the unreferenced filenames and exits 1, with guidance to either wire them into a test, move them to examples/, or delete them.
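The core of the orphan check can be sketched in a few lines. This version covers direct references only: it does not resolve imports between .jh files, and the substring match on basenames is looser than a real parser would be. find_orphan_samples is a hypothetical helper, not the contents of e2e/check_orphan_samples.sh.

```shell
#!/usr/bin/env bash
# Sketch only: report .jh / .test.jh files whose basename is not mentioned in
# any file under a scripts directory. Direct references only; the real guard
# also follows imports. find_orphan_samples is a hypothetical helper.

find_orphan_samples() {
  local samples_dir="$1" scripts_dir="$2" sample status=0
  while IFS= read -r sample; do
    if ! grep -rqF -- "$(basename "$sample")" "$scripts_dir"; then
      printf '%s\n' "$sample"
      status=1
    fi
  done < <(find "$samples_dir" -type f -name '*.jh' | sort)
  return "$status"
}

# Example (paths are illustrative):
#   find_orphan_samples e2e/fixtures e2e/tests
```

The function exits non-zero when it prints anything, mirroring the exit-code behavior described above.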
When adding a new .jh fixture to e2e/, make sure it is exercised by a test in e2e/tests/ or imported by a file that is. If a sample exists purely for documentation or demonstration purposes, it belongs in examples/ instead.
Every .jh and .test.jh file under examples/ must be accounted for in e2e/tests/110_examples.sh. The script maintains three arrays that together form the example matrix:
| Array | Purpose |
|---|---|
| COVERED_RUN | Examples exercised via jaiph run with strict e2e::expect_stdout assertions. |
| COVERED_TEST | Test companions (*.test.jh) exercised via jaiph test. |
| EXCLUDED | Files that cannot run in E2E (e.g. CI-specific, require real agent backends). Each entry must have an inline comment explaining why. |
An orphan guard at the bottom of the script fails CI if any example file is not listed in one of the three arrays. To add a new example:
1. Create the .jh file in examples/.
2. Add it to COVERED_RUN, COVERED_TEST, or EXCLUDED (with a comment).
3. If it runs in E2E, add e2e::expect_stdout and artifact assertions.
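The guard over the three arrays can be pictured as follows. This is a sketch, not the code in e2e/tests/110_examples.sh: the array contents are illustrative placements of file names mentioned on this page (plus one made-up test companion), and unlisted_examples is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Sketch only: every *.jh file in a directory must appear in one of three arrays.
# Array contents are illustrative, not the real matrix in 110_examples.sh.

COVERED_RUN=("say_hello.jh" "agent_inbox.jh")
COVERED_TEST=("say_hello.test.jh")        # hypothetical test companion
EXCLUDED=("async.jh" "recover_loop.jh")   # illustrative: need live agent backends

# Print the basename of each example that is not listed in any array.
unlisted_examples() {
  local dir="$1" path name entry listed
  for path in "$dir"/*.jh; do
    [ -e "$path" ] || continue
    name="$(basename "$path")"
    listed=0
    for entry in "${COVERED_RUN[@]}" "${COVERED_TEST[@]}" "${EXCLUDED[@]}"; do
      if [ "$entry" = "$name" ]; then
        listed=1
      fi
    done
    if [ "$listed" -eq 0 ]; then
      printf '%s\n' "$name"
    fi
  done
}

# Example: unlisted_examples examples
```

Any name printed by the function corresponds to a file that would need to be added to one of the arrays.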