
Calibrate.WTF Documentation

Zero-config TypeScript & Next.js code-quality runner. Scores a project across six weighted checks, runs SRP-style module assessments, and can score historical commits without touching your workspace.

What it is

@decoperations/calibrate is a code-quality engine. Calibrate doesn't hard-code a fixed quality bar — it runs whatever bar your team decides to set, then reduces every signal to one weighted score from 0 to 100 and a CI gate. Three things compose into a calibrate run:

The Check contract: a small TypeScript interface every rule implements. Same shape whether the rule ships in the bundled std-lib, a third-party preset package, or your project's ./calibrate-rules/ directory.
The standard library (@calibrate/standard): the transitional floor of TypeScript / ESLint / Prettier / security / tests / coverage + 6 SRP rules. Bundled in v0.9; extracted into its own package in v1.0.
The ecosystem: sharable preset packs (@calibrate/react, @calibrate/node-cli, @acme/calibrate-rules, …) plus your project's own local rules.

On top of the engine, the same package ships an auto-fix orchestrator, a module SRP assessor, a per-file scorer with diff support, a historical commit scorer (via temporary git worktrees), a local Express dashboard, GitHub Actions integration, and a publishing pipeline — but every one of those is a consumer of the same Check / Layer machinery.

Calibrate is package-manager agnostic (pnpm / npm / yarn) and monorepo-aware. Configuration lives in .calibrate/config.json and can be overridden per workspace.

Engine + ecosystem

Product commitment

Every project that uses calibrate is expected to define its own rules and checks. Calibrate is not a fixed quality bar — it is the engine that runs whatever bar your team sets, plus the dashboard / scorer / reporter that turns those signals into one score.

v0.9 ships the bundled std-lib so existing users see no behaviour change. v1.0 extracts @calibrate/standard into its own package; the engine then ships with zero opinionated checks. Projects pick presets the same way they pick TS or ESLint configs today.

Three kinds of check

The same registry / scheduler / scorer handles all three. The kind drives only cost and determinism handling — the result shape is identical.

`deterministic`

Pure code analysis. Reads files, runs in-process, returns a score. The existing SRP rules; AST walkers; Biome/ESLint wrappers. Default weight 1.0. Always runs under --strict.

`external`

Wraps any CLI that emits structured output. Lighthouse, bundlesize, knip, depcheck, golangci-lint, ruff, clippy, custom shell scripts — anything that prints JSON. Default weight 0.5.

`llm`

Prompt + Zod schema + model. Adapter handles content-hash caching, USD-budget ledger, structured parsing. Advisory by default — opt in to weighting.
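To make the "kind drives only cost and determinism handling" point concrete, here is a minimal sketch of how default weights and the --strict filter could be derived from a check's kind. The names (CheckKind, RegisteredCheck, effectiveWeight, selectForRun) are hypothetical and are not calibrate's actual API. Treating an advisory check as weight 0 is an assumption.

```typescript
// Hypothetical types for illustration; not the real calibrate API.
type CheckKind = 'deterministic' | 'external' | 'llm'

interface RegisteredCheck {
  id: string
  kind: CheckKind
  weight?: number    // explicit override from config, if any
  advisory?: boolean // explicit override from config, if any
}

// Defaults stated above: deterministic 1.0, external 0.5.
// No default is stated for llm, so an opted-in llm check must set its own weight (assumption).
const DEFAULT_WEIGHT: Record<CheckKind, number> = {
  deterministic: 1.0,
  external: 0.5,
  llm: 0,
}

// Assumption: an advisory check still reports, but contributes weight 0 to the score.
function effectiveWeight(check: RegisteredCheck): number {
  const advisory = check.advisory ?? (check.kind === 'llm') // llm is advisory by default
  if (advisory) return 0
  return check.weight ?? DEFAULT_WEIGHT[check.kind]
}

// Under --strict only deterministic checks are scheduled.
function selectForRun(checks: RegisteredCheck[], strict: boolean): RegisteredCheck[] {
  return strict ? checks.filter(c => c.kind === 'deterministic') : checks
}
```

With this shape, an override such as { "advisory": false, "weight": 0.5 } is what moves an llm check from advisory to weighted.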

What the ecosystem covers

Because external checks wrap any CLI that emits structured output, the engine is fundamentally language-agnostic. The TS-specific bits ship in @calibrate/standard because that's today's audience; the ecosystem is designed to grow across:

Area	Examples
Languages	TypeScript, JavaScript, Python (ruff/mypy), Go (golangci-lint), Rust (clippy)
Frameworks	Next.js, React, Vue, Svelte, Express, Fastify, NestJS, Django, Rails
Tools	Biome, ESLint, Prettier, Lighthouse, bundlesize, knip, depcheck, audit-ci
Domains	Accessibility, security, performance, i18n, API contracts, schema migrations
House rules	Internal arch invariants, naming conventions, deprecation deadlines, license policy

Today's bundled std-lib is TS-first. The wider ecosystem above is the v1.0+ direction — see specs/extensibility.md and ROADMAP.md for the phased plan.

How a real config composes

{
  "checks": {
    "extends": [
      "@calibrate/standard",           // TS / ESLint / Prettier / sec / tests / cov + SRP
      "@calibrate/react",              // React-specific rules
      "@acme/calibrate-rules"          // your team's org-wide rules
    ],
    "include": [
      "./calibrate-rules/no-direct-db.ts",        // a project-local rule
      "./calibrate-rules/clarity-llm.ts"          // a project-local LLM rule
    ],
    "overrides": {
      "srp/god-file":           { "weight": 2.0 },
      "llm/clarity":            { "advisory": false, "weight": 0.5 },
      "external/bundlesize":    { "config": { "budget": "200kb" } }
    },
    "disable": ["org/legacy-rule"]
  },
  "budgets": {
    "maxLlmCostUSD": 0.50,
    "maxRuntimeSeconds": 300
  }
}

Zero-bundled mode is supported: extends: [] + include: [] runs cleanly with nothing but the rules a project writes itself. Score aggregation, CI gating, and the dashboard all work the same on a list of 1 check as on a list of 50.

Installation

Requires Node.js 18+. TypeScript 4+ projects only.

1. One-shot (no install)

Drop into any TS / Next.js repo and run calibrate without adding it as a dependency. First run auto-detects the project, writes .calibrate/config.json and a calibrate-rules/ directory, then analyzes.

npx --yes @decoperations/calibrate          # interactive
npx --yes @decoperations/calibrate ci       # non-interactive (for CI)

2. As a devDependency

pnpm add -D @decoperations/calibrate
pnpm calibrate                # auto-init + analyze + report

3. As a GitHub Action

Drop the composite action into any workflow — see the GitHub Action section for the full reference.

- uses: actions/checkout@v4
  with: { fetch-depth: 0 }
- uses: decoperations/calibrate.wtf@v1
  with:
    thresholds: standard

4. Global install (for local dev)

npm install -g @decoperations/calibrate
calibrate                     # available on $PATH

Next.js and React are optional peer dependencies — calibrate uses them for framework-aware checks when present, but works on plain Node TypeScript projects too. The package is currently published to GitHub Packages (npm.pkg.github.com) — see the Publishing section for npm authentication.

GitHub Action

decoperations/calibrate.wtf@v1 is a composite action that installs the CLI, runs a quality scan, surfaces the score as a workflow output, and uploads the analysis directory as an artifact. The action is published from this same repo — no separate publish step.

Quick start

name: Quality
on:
  pull_request:
  push:
    branches: [main]

permissions:
  contents: read
  packages: read            # needed to install from GitHub Packages
  pull-requests: write      # only if you comment on PRs

jobs:
  calibrate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0    # required for --diff-against / commit scoring
      - uses: decoperations/calibrate.wtf@v1
        with:
          thresholds: standard
          fail-on-error: true

Inputs

Name	Default	Description
`directory`	.	Working directory to run calibrate in.
`command`	ci	`ci` (GitHub-aware) or `check`.
`workspace`	—	Single workspace (monorepos only).
`all-workspaces`	false	Check every workspace in a monorepo.
`fix`	false	Run auto-fixers before scoring. `check` command only.
`thresholds`	standard	`strict` | `standard` | `relaxed` | `custom`.
`specific`	—	Comma-separated check ids (e.g. `typescript,eslint`).
`output-dir`	calibrate-out	Directory for raw analysis output.
`fail-on-error`	true	Fail the workflow if calibrate exits non-zero.
`version`	latest	CLI version to install.
`node-version`	22	Node.js version used to run the CLI.
`upload-artifact`	true	Upload analysis output as a workflow artifact.
`artifact-name`	calibrate-report	Name for the uploaded artifact.
`github-token`	github.token	Token used to install from GitHub Packages (needs `read:packages`).

Outputs

score — quality score 0–100.
passed — "true" or "false".
report-path — path to the raw analysis output directory.

Recipes

Comment on PRs with the score:

- id: calibrate
  uses: decoperations/calibrate.wtf@v1
  with:
    fail-on-error: false

- if: github.event_name == 'pull_request'
  uses: actions/github-script@v7
  with:
    script: |
      const score  = '${{ steps.calibrate.outputs.score }}';
      const passed = '${{ steps.calibrate.outputs.passed }}' === 'true';
      const emoji  = passed ? '✅' : '❌';
      await github.rest.issues.createComment({
        issue_number: context.issue.number,
        owner: context.repo.owner,
        repo:  context.repo.repo,
        body:  `${emoji} Calibrate: **${score}/100**`,
      });

Strict gate on main:

- uses: decoperations/calibrate.wtf@v1
  with:
    command: check
    thresholds: strict
    fail-on-error: true

Single workspace in a monorepo:

- uses: decoperations/calibrate.wtf@v1
  with:
    workspace: packages/api

Auto-fix on a fix/* branch:

- uses: decoperations/calibrate.wtf@v1
  with:
    command: check
    fix: true

Full reference: docs/github-action.md and action.yml in the repo root.

CLI commands

The full top-level command surface. Run calibrate <cmd> --help for any command's flags.

Command	What it does
`calibrate init`	Generate .calibrate/config.json, install hooks, set up CI workflows
`calibrate init-interactive`	Legacy interactive init flow with prompts
`calibrate check`	Run the six weighted quality checks
`calibrate assess`	Module-level SRP assessment across 5 progressive levels
`calibrate serve`	Local Express dashboard for analysis history
`calibrate report`	Render reports from raw analysis output
`calibrate aggregate`	Aggregate analysis output across commits / time ranges
`calibrate ci`	CI quality-gate flow with structured exit codes
`calibrate config`	Inspect / update .calibrate/config.json
`calibrate status`	Show current calibrate state for the project
`calibrate workspaces`	List monorepo workspaces calibrate sees
`calibrate workspace:configure`	Per-workspace configuration override
`calibrate self-update / upgrade`	Update calibrate itself
`calibrate publishing:*`	12 commands for npm/GitHub-Packages publishing setup, validation, and lifecycle

calibrate check

The main command. Runs each enabled quality check, weights the results, and produces a 0–100 score. Pass threshold is ≥ 80.

Check	Runs
TypeScript	tsc --noEmit
ESLint	eslint . --format json
Security	audit-ci / npm audit
Tests	vitest / jest --run
Coverage	coverage % vs threshold
Prettier	prettier --list-different

Score formula: round(sum(passed × weight) / sum(weight) × 100). When a check has an infrastructure failure (e.g. tsc OOM even at 8 GB heap), it's flagged with infraFailure: true and given full credit so the score doesn't penalise broken tooling. Prettier supports partial credit: 1 dirty file → 80%, 2–5 → 50%, 6–20 → 25%, 21+ → 0%.
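The formula and the Prettier tiers can be sketched as follows. This is an illustration, not the code in QualityRunner.ts: the CheckOutcome shape and function names are hypothetical, and `credit` generalises `passed` to a fraction between 0 and 1 so partial credit fits the same formula.

```typescript
// Hypothetical result shape. `credit` is 1 for a pass, 0 for a fail,
// or a fraction in between for partial credit.
interface CheckOutcome {
  id: string
  weight: number
  credit: number
  infraFailure?: boolean
}

// Prettier partial credit by number of unformatted files, per the tiers above.
function prettierCredit(dirtyFiles: number): number {
  if (dirtyFiles <= 0) return 1
  if (dirtyFiles === 1) return 0.8
  if (dirtyFiles <= 5) return 0.5
  if (dirtyFiles <= 20) return 0.25
  return 0
}

// round(sum(credit × weight) / sum(weight) × 100).
// A check flagged infraFailure gets full credit regardless of its own result.
function overallScore(outcomes: CheckOutcome[]): number {
  const totalWeight = outcomes.reduce((sum, o) => sum + o.weight, 0)
  if (totalWeight === 0) return 0 // assumption: an empty run scores 0
  const earned = outcomes.reduce(
    (sum, o) => sum + (o.infraFailure ? 1 : o.credit) * o.weight,
    0,
  )
  return Math.round((earned / totalWeight) * 100)
}

// Example with the bundled weights (30 / 25 / 15 / 15 / 10 / 5):
// ESLint fails, one file is unformatted, everything else passes.
const exampleRun: CheckOutcome[] = [
  { id: 'typescript', weight: 30, credit: 1 },
  { id: 'eslint', weight: 25, credit: 0 },
  { id: 'security', weight: 15, credit: 1 },
  { id: 'tests', weight: 15, credit: 1 },
  { id: 'coverage', weight: 10, credit: 1 },
  { id: 'prettier', weight: 5, credit: prettierCredit(1) },
]
const exampleScore = overallScore(exampleRun) // 74, below the 80 pass threshold
```

Because the denominator is the sum of weights, the same function works whether the run contains six checks or one.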

Common flags

calibrate                                # default subcommand — same as 'calibrate check'
                                         # auto-inits .calibrate/ + calibrate-rules/
                                         # on first run, then analyzes
calibrate check                          # all six checks, score, recommendations
calibrate check --fix                    # run auto-fixers first
calibrate check --ci                     # non-interactive, CI-friendly output
calibrate check --workspace packages/x   # single workspace in a monorepo
calibrate check --root-only              # skip workspace iteration
calibrate check --specific typescript,eslint  # run a subset of legacy checks
calibrate check --diff-against HEAD~1    # per-file score delta vs another commit
calibrate check --raw                    # write structured analysis to ./calibrate-out
calibrate check --thresholds strict      # strict | standard | relaxed | custom
calibrate check --output json            # machine-readable output

# v0.9 Check contract (bypasses legacy QualityRunner)
calibrate check --list-checks            # list registered Check ids
calibrate check --only srp/god-file,srp/coupling   # filter to specific Checks
calibrate check --strict                 # only deterministic Checks (reproducible CI)

Output lands in ./calibrate-out/summary.json with category sub-scores. With --raw, full per-issue analysis is written to ./calibrate-out/<commit>.json plus a .meta.json.

Configuration

calibrate init writes .calibrate/config.json with auto-detected defaults. Every section is optional.

{
  "framework": "nextjs",
  "typescript": true,
  "eslint":   { "enabled": true, "rules": "recommended", "plugins": [] },
  "prettier": { "enabled": true, "semi": false, "singleQuote": true },
  "husky":    { "enabled": true,
                "hooks": { "pre-commit": ["calibrate check --ci"] } },
  "github":   { "enabled": true,
                "workflows": { "ci": true, "codeql": true,
                               "dependabot": true } },
  "testing":  { "framework": "vitest", "coverage": true,
                "threshold": { "statements": 80 } },
  "quality":  { "typescript": true, "linting": true, "formatting": true,
                "security": true, "performance": true,
                "accessibility": true, "bundleSize": true,
                "dependencies": true, "documentation": true },
  "ci":       { "provider": "github", "nodeVersions": ["18","20"],
                "packageManager": "pnpm", "cacheEnabled": true,
                "parallelJobs": true },
  "monorepo": { "enabled": true, "type": "pnpm",
                "workspaces": ["packages/*", "apps/*"],
                "selectedWorkspaces": ["packages/calibrate"] },
  "autoTracking":  { "enabled": true, "preCommitHooks": true,
                     "trackInGit": true,
                     "reportsDirectory": ".calibrate/reports" },
  "qualityGates":  { "enabled": false, "minimumScore": 80 }
}

In a monorepo, selectedWorkspaces controls which workspaces are checked by default. Private packages ("private": true in their package.json) are skipped unless --workspace or --all-workspaces is passed explicitly.
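One way to read the selection rule above is sketched below. The Workspace and SelectionFlags shapes are hypothetical, and the precedence between flags (an explicit --workspace wins over --all-workspaces) is an assumption rather than documented behaviour.

```typescript
// Hypothetical shapes for illustration.
interface Workspace {
  path: string
  private: boolean // "private": true in the workspace's package.json
}

interface SelectionFlags {
  workspace?: string      // --workspace <path>
  allWorkspaces?: boolean // --all-workspaces
}

function workspacesToCheck(
  all: Workspace[],
  selectedWorkspaces: string[],
  flags: SelectionFlags = {},
): Workspace[] {
  // An explicit --workspace targets that workspace even if it is private.
  if (flags.workspace !== undefined) {
    return all.filter(w => w.path === flags.workspace)
  }
  // --all-workspaces includes private packages too.
  if (flags.allWorkspaces) return all
  // Default: only the configured selection, minus private packages.
  return all.filter(w => selectedWorkspaces.includes(w.path) && !w.private)
}
```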

Writing your first custom check

Product commitment

Calibrate is the engine — registry, scheduler, scorer, LLM & external adapters, dashboard. The bundled checks are a starting floor. Every project that uses calibrate is expected to define its own rules. v1.0 extracts the bundled std-lib into a separately-published @calibrate/standard package; the engine ships with zero opinionated checks. See specs/extensibility.md.

A Check is a small object that takes a file (or module / project) and returns a 0–100 score plus evidence. The same contract powers calibrate's built-in rules, npm-published preset packs, and your project's local rules; everything goes through the same scheduler and scorer.

The contract

import type { Check } from '@decoperations/calibrate'

export const noConsoleLog: Check = {
  id: 'org/no-console-log',
  category: 'org',
  scope: 'file',
  kind: 'deterministic',

  async run({ rootDir, files, metrics }) {
    const fs = await import('fs/promises')
    const path = await import('path')
    const filePath = files[0]
    const src = await fs.readFile(filePath, 'utf8')
    const hits = [...src.matchAll(/\bconsole\.log\b/g)]

    if (hits.length === 0) {
      return { score: 100, severity: 'info', message: 'OK' }
    }

    return {
      score: Math.max(0, 100 - hits.length * 20),
      severity: hits.length > 3 ? 'high' : 'medium',
      message: `${hits.length} console.log call(s)`,
      evidence: hits.map(m => ({
        file: path.relative(rootDir, filePath),
        line: src.slice(0, m.index).split('\n').length,
        quote: 'console.log(...)',
      })),
    }
  },
}

Discovery via config

Drop the file under ./calibrate-rules/ and reference it from .calibrate/config.json:

{
  "checks": {
    "extends": ["bundled:srp"],
    "include": [
      "./calibrate-rules/no-console-log.ts",
      "@acme/calibrate-rules"
    ],
    "overrides": {
      "srp/god-file": { "weight": 2.0 },
      "org/no-console-log": { "advisory": false, "weight": 1.0 }
    },
    "disable": ["srp/coupling"]
  },
  "budgets": {
    "maxLlmCostUSD": 0.50
  }
}

Three kinds of check

deterministic

Pure code analysis. Runs in-process. Default weight 1.0. Always runs under --strict.

external

Wraps any CLI that emits structured output (Lighthouse, bundlesize, custom shell). Default weight 0.5.

llm

Prompt + Zod schema + model. Adapter handles content-hash caching, USD-budget ledger, structured parsing. Advisory by default — opt in to weighting.
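As a rough sketch of what "content-hash caching" plus a "USD-budget ledger" could look like, the runner below caches verdicts by a hash of prompt and content and refuses new model calls once the budget is spent. All names here (LlmVerdict, ModelCall, createLlmRunner) are hypothetical, the model call is a synchronous stand-in, and schema validation is left out. The real adapter may differ.

```typescript
import { createHash } from 'node:crypto'

// Hypothetical shapes; the real adapter's types are not shown in this document.
interface LlmVerdict {
  score: number
  message: string
}

// Stand-in for a model call, synchronous for brevity.
// Returns the parsed verdict and what the call cost.
type ModelCall = (prompt: string, content: string) => { verdict: LlmVerdict; costUSD: number }

function createLlmRunner(call: ModelCall, maxLlmCostUSD: number) {
  const cache = new Map<string, LlmVerdict>()
  let spentUSD = 0

  return {
    run(prompt: string, content: string): LlmVerdict | 'over-budget' {
      // Key on a hash of prompt + content so unchanged files cost nothing on re-runs.
      const key = createHash('sha256').update(prompt).update('\0').update(content).digest('hex')
      const cached = cache.get(key)
      if (cached !== undefined) return cached

      // Refuse new calls once the ledger has reached the budget.
      if (spentUSD >= maxLlmCostUSD) return 'over-budget'

      const { verdict, costUSD } = call(prompt, content)
      spentUSD += costUSD
      cache.set(key, verdict)
      return verdict
    },
    spent(): number {
      return spentUSD
    },
  }
}
```

In this sketch the budget is checked before each call, so a single call can overshoot the limit slightly; a stricter ledger would need a cost estimate up front.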

CLI flags (v0.9)

# List the checks calibrate would run
calibrate check --list-checks

# Run only specific checks
calibrate check --only srp/god-file,org/no-console-log

# Run only deterministic checks (skip LLM/external — reproducible CI)
calibrate check --strict

Phase 1 of v0.9 ships the contract and the six SRP rules as built-in Check instances. Phase 2 adds the LlmCheck adapter; Phase 3 adds ExternalCheck and the preset-package ecosystem. Track progress in specs/extensibility.md and ROADMAP.md.

Presets & shareable configs

A preset is just an npm package that exports Check instances. Adding one is a single line in checks.extends. Any team can publish one; any ecosystem (language, framework, tool, internal platform) can have one.

Anatomy of a preset

// @acme/calibrate-rules/index.ts
import type { Check } from '@decoperations/calibrate'
import { NoDirectDbAccess } from './rules/no-direct-db'
import { RequireFeatureFlagJira } from './rules/feature-flag-jira'
import { ServiceBoundaryEnforcer } from './rules/service-boundary'

export const checks: Check[] = [
  new NoDirectDbAccess(),
  new RequireFeatureFlagJira(),
  new ServiceBoundaryEnforcer(),
]

Consume it in any project that wants those rules:

{
  "checks": {
    "extends": ["@calibrate/standard", "@acme/calibrate-rules"]
  }
}

First-party presets

Package	Status	What it ships
`@calibrate/standard`	bundled (v0.9), separately published in v1.0	TS, ESLint, Prettier, security, tests, coverage + 6 SRP rules. The transitional floor.
`@calibrate/strict`	planned (v1.0)	Tighter thresholds and additional rules for repos that want a higher bar by default.
`@calibrate/react`	planned (v1.0)	React-specific rules (hooks misuse, key props, accessibility, perf anti-patterns).
`@calibrate/node-cli`	planned (v1.0)	CLI / Node service rules (long-running process patterns, signal handling, log hygiene).
`@acme/calibrate-rules`	your org	Your team's house rules — e.g. "no direct DB access outside services/db/", "every feature flag must reference a Jira ticket".

How presets compose

Order matters. Later extends entries can override earlier ones via the registry's last-write-wins rule.
Overrides are local. A consumer adjusts weight / advisory / config of any imported check without forking the source.
Disable is explicit. Don't fork to drop a rule — "disable": ["org/legacy-rule"] turns it off.
Local rules are first-class. Anything under ./calibrate-rules/ goes through the same scheduler and scorer; calibrate init scaffolds an example to make this obvious from day one.
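The composition rules above can be sketched as a small merge function. The CheckDef and Override shapes and the composeChecks name are hypothetical. The order of operations (register, then disable, then override) is an assumption.

```typescript
// Hypothetical shapes for illustration.
interface CheckDef {
  id: string
  weight: number
  advisory: boolean
  source: string // which preset or local file registered the check
}

interface Override {
  weight?: number
  advisory?: boolean
}

function composeChecks(
  presets: CheckDef[][],               // in "extends" order, local rules last
  overrides: Record<string, Override>, // "overrides" from config
  disable: string[],                   // "disable" from config
): CheckDef[] {
  const registry = new Map<string, CheckDef>()

  // Order matters: a later preset replaces an earlier check with the same id.
  for (const preset of presets) {
    for (const check of preset) registry.set(check.id, check)
  }

  // Disable is explicit: drop the check without touching its source.
  for (const id of disable) registry.delete(id)

  // Overrides are local: patch weight / advisory on whatever is registered.
  for (const [id, patch] of Object.entries(overrides)) {
    const existing = registry.get(id)
    if (existing !== undefined) registry.set(id, { ...existing, ...patch })
  }

  return Array.from(registry.values())
}
```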

Publishing your own preset

# 1. Scaffold a package
mkdir my-calibrate-rules && cd my-calibrate-rules
npm init -y

# 2. Add the contract as a peer dep
npm pkg set peerDependencies.@decoperations/calibrate=">=0.9.0"

# 3. Export an array of Check instances from the package entry
#    (see "Anatomy of a preset" above)

# 4. Publish
npm publish --access public

# 5. Consumers add one line to .calibrate/config.json:
#    "checks": { "extends": ["my-calibrate-rules"] }

A managed marketplace / registry is an explicit non-goal for v0.9. The path is npm — same way TS configs, ESLint configs, and Prettier configs already distribute.

Bundled standard library

These six weighted checks plus the six SRP rules ship as the bundled std-lib in v0.9. They're the transitional floor: in v1.0 they move into a separately-published @calibrate/standard package and the engine ships with zero opinionated checks. Until then, they're what you get on calibrate init by default. Every one of them is implemented as a Check under the same contract as your custom rules.

TypeScript (weight 30)

Runs tsc --noEmit. Includes an automatic OOM recovery: if the heap exhausts, calibrate retries with --max-old-space-size=8192 and, if that also fails, marks the check as an infra failure (no score penalty).
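The retry-then-infra-failure flow can be sketched like this. The runner is injected so the logic stays testable. How calibrate actually detects an out-of-memory exit is not described here, so the `outOfMemory` flag and every name below are assumptions.

```typescript
// Hypothetical result of one tsc invocation.
interface TscRun {
  exitCode: number
  outOfMemory: boolean // assumption: the caller can tell an OOM crash from a type error
}

// Runs `tsc --noEmit` with the given extra Node options and reports the result.
type RunTsc = (nodeOptions: string[]) => TscRun

interface TypeScriptOutcome {
  passed: boolean
  infraFailure: boolean
}

function runTypeScriptCheck(runTsc: RunTsc): TypeScriptOutcome {
  // First attempt with default heap settings.
  const first = runTsc([])
  if (!first.outOfMemory) return { passed: first.exitCode === 0, infraFailure: false }

  // Heap exhausted: retry once with a larger heap.
  const retry = runTsc(['--max-old-space-size=8192'])
  if (!retry.outOfMemory) return { passed: retry.exitCode === 0, infraFailure: false }

  // Still out of memory: flag as an infra failure, which carries no score penalty.
  return { passed: true, infraFailure: true }
}
```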

ESLint (weight 25)

eslint . --format json. Specific error patterns (ESM vs CJS mismatch, missing config, missing plugins) get tailored remediation suggestions.

Security (weight 15)

Prefers audit-ci with audit-ci.json (auto-created with --fix); falls back to npm/pnpm/yarn audit at moderate+ severity.

Tests + Coverage (15 + 10)

Auto-detects vitest or jest from package.json; falls back to a passthrough that runs npm/pnpm/yarn test as written. Coverage is parsed from the runner's text output and compared to testing.threshold.statements.
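A minimal sketch of parsing statement coverage from text output follows. It assumes the Istanbul-style text table that jest and vitest print, where the "All files" row carries the statement percentage in its first numeric column. The function names are hypothetical and calibrate's own parser may handle more formats.

```typescript
// Extracts the statement percentage from the "All files" row of an
// Istanbul-style text coverage table. Returns null when no such row exists.
function parseStatementCoverage(output: string): number | null {
  const match = /^\s*All files\s*\|\s*(\d+(?:\.\d+)?)\s*\|/m.exec(output)
  return match ? Number(match[1]) : null
}

// Compares parsed coverage to testing.threshold.statements.
// Assumption: missing coverage output counts as not meeting the threshold.
function meetsCoverageThreshold(output: string, statementsThreshold: number): boolean {
  const pct = parseStatementCoverage(output)
  return pct !== null && pct >= statementsThreshold
}
```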

Prettier (weight 5)

prettier --list-different across **/*.{ts,tsx,js,jsx,json,md}. Returns partial credit so a single dirty file doesn't crater the score.

Module quality assessment

calibrate assess detects modules by scanning your project for named groups of files that span multiple architectural layers (components/foo/, hooks/foo/, lib/foo/) and scores each module on five progressive SRP levels.

L1 External SRP — Module Containment

weight 25

Are component/hook/service files actually inside their module's namespace, or scattered?

L2 Internal SRP — Submodule Architecture

weight 20

Coverage, cohesion, size balance, co-located tests within a module.

L3 Cohesion SRP — Ownership Audit

weight 20

Domain leakage, feature creep, shared infra in disguise, external coupling.

L4 Granularity SRP — Responsibility Factoring

weight 20

Concerns per file, change coupling, size uniformity, composability.

L5 Unit Test Coverage

weight 15

Critical workflow coverage, layer coverage, test breadth, co-location.

calibrate assess                 # all modules, all 5 levels
calibrate assess -m meeting      # one module
calibrate assess -l 4            # cap at level 4
calibrate assess -r              # show prioritized recommendations (P0–P3)
calibrate assess -v              # show per-dimension breakdown
calibrate assess -f json         # JSON output

Each level reports weighted dimensions, severity-tagged issues, and effort-tagged recommendations (low / medium / high). Recommendations include estimated impact (e.g. L4 +20) so you can prioritize work by score lift.
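The five level weights sum to 100, so a module's overall score can be read as a weighted average of its level scores. The sketch below shows that reading. It is an assumption about how the levels combine, and moduleScore is a hypothetical name.

```typescript
// Level weights from the list above: L1 25, L2 20, L3 20, L4 20, L5 15.
const LEVEL_WEIGHTS = [25, 20, 20, 20, 15]

// Weighted average of five 0–100 level scores, rounded to an integer.
function moduleScore(levelScores: number[]): number {
  if (levelScores.length !== LEVEL_WEIGHTS.length) {
    throw new Error(`expected ${LEVEL_WEIGHTS.length} level scores`)
  }
  const totalWeight = LEVEL_WEIGHTS.reduce((sum, w) => sum + w, 0) // 100
  const weighted = LEVEL_WEIGHTS.reduce(
    (sum, w, i) => sum + w * (levelScores[i] ?? 0),
    0,
  )
  return Math.round(weighted / totalWeight)
}
```

Under this reading, a recommendation tagged "L4 +20" would lift the module score by about 4 points, since L4 carries 20% of the weight.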

Behavior / UI QA (v0.10, planned)

Planned — tracked by issue #71

v0.10 adds product-side QA on top of code quality: a BehaviorLayer that boots your app, drives it via Playwright, and scores whether the running product matches its product spec.

Calibrate's existing spec-rigor engine (core/spec/) already scores nine layers — requirements, ontology, interfaces, data model, state machines, policy, formal, traceability, architecture. v0.10 adds a tenth: behavior. Same two-axis scoring as the other layers:

Coverage — every required product feature has at least one tagged scenario.
Adherence — scenarios pass when Playwright drives the actual running app.

Three phases

#72 — Dogfood Playwright on apps/web — config + smoke tests for /, /docs, /dashboard. No calibrate code yet.
#73 — BehaviorLayer runs Playwright as a check — discovers tagged scenarios, runs them, emits coverage-gap / adherence-drift / orphan-spec DiffEvents.
#74 — behavior.spec.yaml DSL — given/when/then scenarios in YAML, translated to Playwright actions automatically. The product spec becomes the source of truth.

How it composes with everything else

Because the behavior layer is just another Check (well, a spec-rigor Layer exposed as Checks), all the engine machinery applies: it lands in summary.json, shows on the dashboard, respects --only, contributes to the 0–100 score. No new top-level concept.

# Once v0.10 lands, you'll be able to:
calibrate check --only spec/behavior   # run UI scenarios against the live app
calibrate assess --layer behavior      # coverage + adherence breakdown

Clean code analyzer

File-level violation detector. Walks your source tree (skipping node_modules, dist, .next, tests, configs, .d.ts) and flags the following structural problems:

Violation	Trigger
`god-file / god-component`	≥ 500 lines & ≥ 3 distinct concerns
`oversized-file`	> 300 lines (suppressed if already a god-file)
`oversized-component`	> 200 lines for .tsx/.jsx files
`oversized-function`	longest function > 50 lines
`high-coupling`	cross-module imports / internal imports > 2:1
`deep-nesting`	brace/paren nesting > 4 levels
`mixed-concerns`	> 100 lines & ≥ 3 detected responsibilities

Concern detection uses regex heuristics to classify a file's responsibilities into 12 buckets. A file over 100 lines that touches three or more is flagged as mixed-concerns:

state management
side effects
data fetching
rendering
event handling
form handling
routing
authentication
error handling
animation
real-time communication
storage

Note: this analyzer is being renamed (likely SingleResponsibilityAnalyzer) and split into per-rule modules. The behavior won't change — the public CLI and SDK output are stable.
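The trigger table can be expressed as a pure function over pre-computed file metrics. This is a sketch of the thresholds only. The FileMetrics shape is hypothetical, metric extraction is out of scope, and two details are assumptions: .tsx/.jsx files get god-component instead of god-file, and mixed-concerns is not suppressed for god files.

```typescript
// Hypothetical pre-computed metrics for one source file.
interface FileMetrics {
  path: string
  lines: number
  concerns: number           // distinct concern buckets detected
  longestFunction: number    // lines in the longest function
  crossModuleImports: number
  internalImports: number
  maxNesting: number         // deepest brace/paren nesting level
}

function detectViolations(m: FileMetrics): string[] {
  const violations: string[] = []
  const isComponent = /\.(tsx|jsx)$/.test(m.path)
  const isGod = m.lines >= 500 && m.concerns >= 3

  if (isGod) violations.push(isComponent ? 'god-component' : 'god-file')
  // oversized-file is suppressed when the file is already flagged as a god file.
  if (m.lines > 300 && !isGod) violations.push('oversized-file')
  if (isComponent && m.lines > 200) violations.push('oversized-component')
  if (m.longestFunction > 50) violations.push('oversized-function')
  // Ratio above 2:1, written without division so zero internal imports is handled.
  if (m.crossModuleImports > 2 * m.internalImports) violations.push('high-coupling')
  if (m.maxNesting > 4) violations.push('deep-nesting')
  if (m.lines > 100 && m.concerns >= 3) violations.push('mixed-concerns')

  return violations
}
```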

Per-file scoring & diffs

The per-file analyzer scores every changed file on a 0–100 scale from these metrics:

line count
longest function length
average + max cyclomatic complexity (decision-point counting)
public exports
imports

The --diff-against <commit> flag writes a per-file-delta.json to the output directory that lists every changed file with scoreBefore, scoreAfter, delta, and full metric snapshots — so you can see which files improved, regressed, or didn't move at a per-file granularity.

calibrate check --diff-against main
# ... writes calibrate-out/per-file-delta.json:
# [
#   { "path": "src/foo.ts",
#     "scoreBefore": 72, "scoreAfter": 88, "delta": 16,
#     "metricsBefore": {...}, "metricsAfter": {...} }
# ]
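Once per-file-delta.json exists, a small consumer can surface regressions. The sketch below works on already-parsed entries and uses only the fields shown above (path, scoreBefore, scoreAfter, delta). The helper names are hypothetical and the metric snapshots are omitted.

```typescript
// Subset of one per-file-delta.json entry, limited to the fields shown above.
interface FileDelta {
  path: string
  scoreBefore: number
  scoreAfter: number
  delta: number
}

// Builds an entry from two scores; delta is after minus before.
function toDelta(path: string, scoreBefore: number, scoreAfter: number): FileDelta {
  return { path, scoreBefore, scoreAfter, delta: scoreAfter - scoreBefore }
}

// Files whose score dropped, worst regression first.
function regressions(deltas: FileDelta[]): FileDelta[] {
  return deltas.filter(d => d.delta < 0).sort((a, b) => a.delta - b.delta)
}

// Typical use, assuming the default output directory:
//   const deltas: FileDelta[] = JSON.parse(
//     fs.readFileSync('calibrate-out/per-file-delta.json', 'utf8'))
//   for (const r of regressions(deltas)) console.log(r.path, r.delta)
```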

Auto-fixers

calibrate check --fix runs five specialized fixers in dependency order. Each fixer extends BaseAutoFixer with a shouldRun() guard so they skip silently when not applicable.

1. LockfileAutoFixer

pnpm lockfile config & version-mismatch fixes (runs first so deps install cleanly)

2. EslintAutoFixer

ESLint v8 → v9 migration, common config issues, eslint --fix

3. PrettierAutoFixer

prettier --write across the project

4. SecurityAutoFixer

audit-ci config bootstrap, npm audit fix

5. TestCoverageAutoFixer

Test scaffolding for uncovered files
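The ordered, guard-based orchestration can be sketched as a simple loop. AutoFixerLike is a hypothetical stand-in for whatever BaseAutoFixer actually exposes. Only the shouldRun() guard is taken from the description above.

```typescript
// Hypothetical minimal fixer shape; the real BaseAutoFixer may expose more.
interface AutoFixerLike {
  name: string
  shouldRun(): boolean // guard: false means "not applicable here, skip silently"
  fix(): void
}

// Runs fixers in the order given (dependency order) and returns the names that ran.
function runFixers(fixers: AutoFixerLike[]): string[] {
  const ran: string[] = []
  for (const fixer of fixers) {
    if (!fixer.shouldRun()) continue
    fixer.fix()
    ran.push(fixer.name)
  }
  return ran
}
```

Keeping the order in the caller's array is what lets the lockfile fixer run before anything that needs dependencies installed.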

Historical commit scoring

The CommitScorer SDK class is calibrate's differentiator: it analyses any historical commit by spinning up a temporary git worktree, running the full quality pipeline inside it, then cleaning up, so the analysis never touches your working tree, your branch, or your stash.

import { CommitScorer } from '@decoperations/calibrate';

const scorer = new CommitScorer(process.cwd());
const result = await scorer.scoreCommit('abc1234');

console.log(result.quality.score);          // 0–100
console.log(result.metadata.gitInfo.author);
console.log(result.performance.qualityAnalysis); // ms

Worktrees land in .calibrate/tmp-workspace-* by default and are torn down on success or failure. Active worktrees are tracked, so concurrent scoring is safe.
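The underlying worktree pattern looks roughly like the sketch below, which shells out to the standard `git worktree add` and `git worktree remove` commands. It is not CommitScorer's implementation: the function names are hypothetical and the tracking of active worktrees is left out.

```typescript
import { execFileSync } from 'node:child_process'

// Arguments for creating a detached worktree at `dir` for the given commit.
function worktreeAddArgs(dir: string, commit: string): string[] {
  return ['worktree', 'add', '--detach', dir, commit]
}

// Arguments for removing the worktree, even if the analysis left files behind.
function worktreeRemoveArgs(dir: string): string[] {
  return ['worktree', 'remove', '--force', dir]
}

// Checks out `commit` into a temporary worktree, runs `analyze` inside it,
// and always tears the worktree down, whether the analysis succeeds or throws.
function withWorktree<T>(
  repoDir: string,
  worktreeDir: string,
  commit: string,
  analyze: (dir: string) => T,
): T {
  execFileSync('git', worktreeAddArgs(worktreeDir, commit), { cwd: repoDir })
  try {
    return analyze(worktreeDir)
  } finally {
    execFileSync('git', worktreeRemoveArgs(worktreeDir), { cwd: repoDir })
  }
}
```

Because the checkout lives in its own directory, the main working tree, index, and stash are never modified.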

Dashboard / calibrate serve

calibrate serve starts a local Express server that reads .calibrate/reports/ and serves a dashboard plus a JSON API.

calibrate serve --open --watch
# defaults: localhost:8888, watches reports dir for new analyses

# API:
# GET /api/trends             → trends time-series
# GET /api/commits            → all analyzed commits
# GET /api/commits/:commitId  → full report for one commit

The dashboard ships pre-built; with --watch it auto-refreshes when a new analysis lands.

GitHub init workflows

In addition to the composite GitHub Action you can drop into any workflow, calibrate init with GitHub integration enabled writes a full set of project files into your repo:

.github/workflows/calibrate.yml — runs calibrate check --ci on PRs across the configured Node versions
.github/workflows/codeql.yml — CodeQL analysis (optional)
.github/dependabot.yml — weekly dependency updates
Issue + PR templates in .github/ISSUE_TEMPLATE/

The CI workflow exits non-zero when the score falls below 80 (or your qualityGates.minimumScore), so PRs failing the gate cannot merge under branch protection.

Publishing pipeline

The publishing integration is a full npm / GitHub-Packages publication system with stage-based lifecycle gates. It's composed of eight specialised modules:

NpmAuthManager + NpmrcValidator — token / .npmrc validation
GitHubPackagesDetector — detects GitHub Packages registry config
PnpmMonorepoDetector — workspace-aware publishing
DependencyValidator — dependency graph + workspace deps validation
PublishingQualityChecker — pre-publish quality gates (build, tests, README, semver)
LifecycleManager — stages pre-alpha → alpha → beta → ga with progression rules
PublishWorkflowGenerator — generates publish-on-tag / publish-on-release GH Actions
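The stage order handled by LifecycleManager can be sketched as a linear progression. The actual progression rules (what must be true before a promotion is allowed) are not described here, so this only models which stage follows which. The names are hypothetical.

```typescript
// Stage order from the LifecycleManager description above.
const STAGES = ['pre-alpha', 'alpha', 'beta', 'ga'] as const
type Stage = (typeof STAGES)[number]

// Returns the stage that follows `current`, or null when already at ga.
function nextStage(current: Stage): Stage | null {
  const index = STAGES.indexOf(current)
  return STAGES[index + 1] ?? null
}

// A promotion is only valid to the immediately following stage (assumption: no skipping).
function canPromote(from: Stage, to: Stage): boolean {
  return nextStage(from) === to
}
```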

calibrate publishing:setup                   # one-shot setup
calibrate publishing:auth-setup              # configure auth
calibrate publishing:auth-status             # verify auth is wired
calibrate publishing:validate-config         # validate package.json + .npmrc
calibrate publishing:validate-deps           # workspace deps OK?
calibrate publishing:analyze-dependencies    # dep graph audit
calibrate publishing:lifecycle-status        # current stage + version
calibrate publishing:promote-stage           # alpha → beta → ga
calibrate publishing:generate-ci             # publish workflow
calibrate publishing:diagnose                # full health check
calibrate publishing:init-github-packages    # GitHub Packages init
calibrate publishing:ci-instructions         # CI setup instructions

CI / pipeline mode

--raw turns calibrate into a structured data emitter. Combined with --ci and a commit ID, you get auditable analysis output you can persist, diff, and ship to external dashboards.

calibrate check \
  --ci \
  --raw \
  --output-dir ./out \
  --commit-id $GITHUB_SHA

# Produces:
#   out/<sha>.json        full EnhancedCalibrateOutput
#   out/<sha>.meta.json   summary metadata
#   out/<sha>.ndjson      one issue per line (with --format ndjson)

Output schema is EnhancedCalibrateOutput — see packages/calibrate/src/types.ts for the full TypeScript shape. calibrate aggregate rolls multiple commits into a single trend report; calibrate report renders human-readable output from the raw data.

Exit codes

0 — score ≥ 80 (or quality gate threshold)
1 — score below threshold / hard failure
2 — configuration error (missing workspace selection in CI, etc.)
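The exit-code contract can be summarised as a small mapping. The RunResult shape and exitCodeFor name are hypothetical. The default threshold of 80 and the three codes come from the list above.

```typescript
// Hypothetical summary of how a run ended.
type RunResult =
  | { kind: 'scored'; score: number }
  | { kind: 'hard-failure' }
  | { kind: 'config-error' }

// Maps a run result to the documented exit codes.
function exitCodeFor(result: RunResult, minimumScore = 80): 0 | 1 | 2 {
  if (result.kind === 'config-error') return 2 // e.g. missing workspace selection in CI
  if (result.kind === 'hard-failure') return 1
  return result.score >= minimumScore ? 0 : 1
}
```

A CI wrapper can then treat 2 as "fix the pipeline" and 1 as "fix the code".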

Programmatic API

Everything the CLI does is exposed as TypeScript classes from the @decoperations/calibrate package.

import {
  CalibrateSDK,
  ConfigManager,
  QualityRunner,
  ProjectDetector,
  ModuleAnalyzer,
  CleanCodeAnalyzer,
  GitHubIntegration,
} from '@decoperations/calibrate';

// Top-level SDK
const calibrate = new CalibrateSDK();
await calibrate.init({ skipPrompts: true });
const result = await calibrate.check({ raw: true });
console.log(result.score, result.passed);

// Direct module access
const detector  = new ProjectDetector();
const project   = await detector.detect();
const analyzer  = new ModuleAnalyzer(process.cwd());
const assessment = await analyzer.assess();

const ccAnalyzer = new CleanCodeAnalyzer(process.cwd());
const violations = ccAnalyzer.analyze();

All the public types — QualityResult, ModuleQualityResult, CleanCodeViolation, FileQualityScore, EnhancedCalibrateOutput, etc. — are re-exported from the package root.

Source of truth

Types live in packages/calibrate/src/types.ts

CLI definitions in packages/calibrate/src/cli.ts

Score weights in packages/calibrate/src/core/QualityRunner.ts