Testing doctrine, commands, and test layout conventions
Two types of tests are preferred:
Unit tests are colocated with the source they test as ".test.ts[x]" files.
Integration tests are located in tests/, with these primary harnesses:
tests/ipc: tests that rely on the IPC and are focussed on ensuring backend behavior.
tests/ui: frontend integration tests that use the real IPC and happy-dom full-app rendering.
tests/e2e: end-to-end tests using Playwright, which are needed to verify browser behavior that can't be easily tested with happy-dom.

Additionally, we have stories in src/browser/stories that are primarily used for human visual verification of UI changes.
Avoid mock-heavy tests that verify implementation details rather than behavior.
If you need mocks to test something, consider whether the code should be restructured to be more testable.
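As an illustration of restructuring instead of mocking, the sketch below passes the time source in as a parameter, so a test can supply a fixed value without patching Date. The names isSessionExpired, Clock, and SESSION_TTL_MS, as well as the 30-minute limit, are hypothetical and not taken from this codebase.

```typescript
// Hypothetical example: none of these names exist in this repo.
type Clock = () => number; // returns milliseconds since the epoch

const SESSION_TTL_MS = 30 * 60 * 1000; // assumed 30-minute lifetime

// The clock is a parameter with a real default, so production callers
// are unchanged and tests need no module-level mocks.
function isSessionExpired(startedAtMs: number, now: Clock = Date.now): boolean {
  return now() - startedAtMs >= SESSION_TTL_MS;
}

// In a test, pass a fixed clock and assert on behavior at the boundary.
const fixedNow: Clock = () => 1_000_000 + SESSION_TTL_MS;
console.log(isSessionExpired(1_000_000, fixedNow)); // true: exactly at the limit
console.log(isSessionExpired(1_000_001, fixedNow)); // false: 1 ms short of the limit
```

The test exercises observable behavior (expired or not) and would keep passing if the function were reimplemented, which is the property mock-heavy tests tend to lose.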
There is at least one exception to this rule: we have a mockAiRouter that can be used to simulate
LLM responses. Broadly, the use of LLMs in tests follows these rules:
Avoid tautological tests (simple mappings, identical copies of implementation); focus on invariants and boundary failures.
Ideally, all new features and bug fixes are well tested. Do not implement a feature or fix without a robust testing strategy.
When fixing bugs, always start with the test (practice TDD). Reproduce the bug in the test, then fix the production code, then verify the test passes.
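A minimal sketch of what invariant and boundary checks can look like, as opposed to restating the implementation. The chunk function and the check helper are hypothetical. In this repo the assertions would normally sit in a colocated ".test.ts" file and use the test runner's own assertions; plain throws are used here only to keep the sketch self-contained. When fixing a bug, the failing boundary case would be added first and the production code changed afterwards.

```typescript
// Hypothetical function under test; not part of this repo.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError(`size must be a positive integer, got ${size}`);
  }
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Minimal assertion helper so the sketch has no test-runner dependency.
function check(condition: boolean, message: string): void {
  if (!condition) throw new Error(message);
}

// Invariants: nothing is lost or reordered, and no chunk is empty or oversized.
function checkChunkInvariants(items: number[], size: number): void {
  const chunks = chunk(items, size);
  const rejoined = ([] as number[]).concat(...chunks);
  check(rejoined.join(",") === items.join(","), "chunks must concatenate back to the input");
  check(chunks.every((c) => c.length > 0 && c.length <= size), "chunk length out of bounds");
}

// Boundary cases, where off-by-one bugs tend to live.
checkChunkInvariants([], 3); // empty input
checkChunkInvariants([1, 2, 3], 3); // length equals size
checkChunkInvariants([1, 2, 3, 4], 3); // one element spills over
checkChunkInvariants([1, 2], 5); // size exceeds length

// Boundary failure: an invalid size is rejected instead of looping forever.
let threw = false;
try {
  chunk([1], 0);
} catch (error) {
  threw = error instanceof RangeError;
}
check(threw, "size 0 must throw RangeError");
```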
All tests in tests/ are run under bun x jest with TEST_INTEGRATION=1 set.
Otherwise, tests that live in src/ run under bun test (generally these are unit tests).
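One way a test file can honor the TEST_INTEGRATION=1 flag is to gate its suites on the environment. The sketch below is an assumption about how such a guard could look; integrationEnabled and Env are hypothetical names, and the repo's own guard may be implemented differently.

```typescript
// Hypothetical sketch; the repo's own guard may differ.
type Env = Readonly<Record<string, string | undefined>>;

// True only when the flag is exactly "1", matching TEST_INTEGRATION=1.
function integrationEnabled(env: Env): boolean {
  return env["TEST_INTEGRATION"] === "1";
}

// In a Jest file this could gate a whole suite, for example:
//   const describeIntegration = integrationEnabled(process.env) ? describe : describe.skip;
console.log(integrationEnabled({ TEST_INTEGRATION: "1" })); // true
console.log(integrationEnabled({})); // false
```

Taking the environment as a parameter keeps the guard itself trivially testable without touching the real process environment.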
make typecheck
make static-check (fast local lint/typecheck/fmt path)
make static-check-full (adds docs link and bench-agent validation used in CI)
bun x jest tests/ipc/sendMessage.test.ts -t "pattern"
/tmp scripts) and commit them.

Only add full-app stories (App.*.stories.tsx). Do not add isolated component stories, even for small UI changes (they are not used/accepted in this repo).
Use @storybook/test utilities (within, userEvent, waitFor) to interact with the UI and set up the desired visual state. Do not add props to production components solely for storybook convenience.
Avoid Math.random(), Date.now(), or other non-deterministic values in story setup. Pass explicit values when ordering or timing matters for visual stability.
Wait for useAutoScroll's ResizeObserver RAF to complete. Use double-RAF: await new Promise(r => requestAnimationFrame(() => requestAnimationFrame(r))).

UI tests (tests/ui)
Tests in tests/ui must render the full app via AppLoader and drive interactions from the user's perspective (clicking, typing, navigating).
Use the renderReviewPanel() helper or similar patterns that render <AppLoader client={apiClient} />.
*.test.ts).
Do not call backend APIs directly (e.g., env.orpc.workspace.remove()) to trigger actions that you're testing; always simulate the user action (click the delete button, etc.). Calling the API bypasses frontend logic like navigation, state updates, and error handling, which is often where bugs hide.
Use tests/ipc if backend logic needs granular testing.
Do not use updatePersistedState to change UI state; go through the UI to trigger the desired behavior.
These tests require TEST_INTEGRATION=1; use the shouldRunIntegrationTests() guard.
Call validateApiKeys() only in tests that actually make AI API calls. Pure UI interaction tests (clicking buttons, selecting items) don't need API keys.
Dialog/popover content renders in document.body. happy-dom doesn't place it under view.container, so queries scoped to the app root will miss it.
Query view.container.ownerDocument.body (or within(document.body)) and drive interactions there. Prefer userEvent for typing to ensure controlled inputs update.
{isOpen && <div>...}) like AgentModePicker, or fall back to tests/e2e (~2min startup time).
Query view.container (not document.body) when using non-portal components.
Use waitFor() with explicit error messages to aid debugging: if (!el) throw new Error("Element not found").
openBaseSelectorDropdown(), selectSuggestion(), not implementation details.

IPC tests (tests/ipc)
Strive to test the backend entirely via IPC interactions. Avoid directly asserting or modifying backend state here.
Exceptions include:
Strive to minimize raciness in tests. They run in a variety of environments, including bogged-down CI runners.
Prefer explicit synchronization over arbitrary sleeps.
When explicit synchronization is not feasible, use patterns such as waitFor which can complete quickly in the common case.
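To make the contrast with arbitrary sleeps concrete, here is a minimal polling helper in the spirit of waitFor. It is a hypothetical sketch (waitForCondition and WaitOptions are illustrative names, and the default timings are assumptions), not the implementation shipped by any testing library. It returns as soon as the condition holds and fails with an explicit message when the deadline passes.

```typescript
// Hypothetical helper for illustration; testing libraries ship their own waitFor.
interface WaitOptions {
  timeoutMs?: number; // give up after this long
  intervalMs?: number; // delay between polls
}

async function waitForCondition<T>(
  probe: () => T | undefined | null,
  description: string,
  { timeoutMs = 2000, intervalMs = 10 }: WaitOptions = {},
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const value = probe();
    // Resolve as soon as the probe yields a value, so the common case is fast.
    if (value !== undefined && value !== null) return value;
    if (Date.now() >= deadline) {
      // An explicit message makes a timeout on a slow runner easier to diagnose.
      throw new Error(`Timed out after ${timeoutMs}ms waiting for ${description}`);
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Usage: the value appears after about 30 ms, and the wait ends shortly
// after that instead of sleeping for a fixed worst-case duration.
let ready: string | undefined;
setTimeout(() => {
  ready = "loaded";
}, 30);
waitForCondition(() => ready, "ready flag to be set").then((value) => console.log(value));
```

A generous timeout costs nothing in the common case because the helper returns on the first successful poll; only a genuinely stuck test pays the full wait.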
In some cases, due to infrastructure or performance constraints, we may opt to diverge from these guidelines.
In such cases, ensure the test code (or production code which lacks tests) is well commented with the rationale behind the exception.