Calibrate.WTF Documentation

Zero-config TypeScript & Next.js code-quality runner. Scores a project across six weighted checks, runs SRP-style module assessments, and can score historical commits without touching your workspace.

What it is

@decoperations/calibrate is a code-quality engine. Calibrate doesn't hard-code a fixed quality bar — it runs whatever bar your team decides to set, then reduces every signal to one weighted score from 0 to 100 and a CI gate. Three things compose into a calibrate run:

  1. The Check contract: a small TypeScript interface every rule implements. Same shape whether the rule ships in the bundled std-lib, a third-party preset package, or your project's ./calibrate-rules/ directory.
  2. The standard library (@calibrate/standard): the transitional floor of TypeScript / ESLint / Prettier / security / tests / coverage + 6 SRP rules. Bundled in v0.9; extracted into its own package in v1.0.
  3. The ecosystem: sharable preset packs (@calibrate/react, @calibrate/node-cli, @acme/calibrate-rules, …) plus your project's own local rules.

On top of the engine, the same package ships an auto-fix orchestrator, a module SRP assessor, a per-file scorer with diff support, a historical commit scorer (via temporary git worktrees), a local Express dashboard, GitHub Actions integration, and a publishing pipeline — but every one of those is a consumer of the same Check / Layer machinery.

Calibrate is package-manager agnostic (pnpm / npm / yarn) and monorepo-aware. Configuration lives in .calibrate/config.json and can be overridden per workspace.

Engine + ecosystem

Product commitment

Every project that uses calibrate is expected to define its own rules and checks. Calibrate is not a fixed quality bar — it is the engine that runs whatever bar your team sets, plus the dashboard / scorer / reporter that turns those signals into one score.

v0.9 ships the bundled std-lib so existing users see no behaviour change. v1.0 extracts @calibrate/standard into its own package; the engine then ships with zero opinionated checks. Projects pick presets the same way they pick TS or ESLint configs today.

Three kinds of check

The same registry / scheduler / scorer handles all three. The kind drives only cost and determinism handling — the result shape is identical.

deterministic

Pure code analysis. Reads files, runs in-process, returns a score. Examples: the existing SRP rules, AST walkers, and Biome/ESLint wrappers. Default weight 1.0. Always runs under --strict.

external

Wraps any CLI that emits structured output. Lighthouse, bundlesize, knip, depcheck, golangci-lint, ruff, clippy, custom shell scripts — anything that prints JSON. Default weight 0.5.

llm

Prompt + Zod schema + model. Adapter handles content-hash caching, USD-budget ledger, structured parsing. Advisory by default — opt in to weighting.
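
One way to picture the three kinds is as a tag on each check that only influences its default weight and whether it survives --strict. The sketch below is illustrative, not calibrate's actual types: CheckKind, CheckInfo, defaultWeight, effectiveWeight, and strictSubset are hypothetical names, and modelling an advisory LLM check as weight 0 is an assumption about how "advisory" affects scoring.

```typescript
// Hypothetical shapes for illustration; not the real @decoperations/calibrate types.
type CheckKind = 'deterministic' | 'external' | 'llm';

interface CheckInfo {
  id: string;
  kind: CheckKind;
  weight?: number; // explicit weight from config overrides, if any
}

// Default weights as described above. LLM checks are advisory by default,
// modelled here (as an assumption) as weight 0 until a project opts in.
function defaultWeight(kind: CheckKind): number {
  switch (kind) {
    case 'deterministic':
      return 1.0;
    case 'external':
      return 0.5;
    case 'llm':
      return 0;
  }
}

// An explicit override wins over the kind's default.
function effectiveWeight(check: CheckInfo): number {
  return check.weight ?? defaultWeight(check.kind);
}

// --strict keeps only deterministic checks so CI runs are reproducible.
function strictSubset(checks: CheckInfo[]): CheckInfo[] {
  return checks.filter((c) => c.kind === 'deterministic');
}
```

Under these assumptions, an llm check with an override of { "weight": 0.5 } contributes like an external check, while the same check without an override is reported but does not move the score.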

What the ecosystem covers

Because external checks wrap any CLI that emits structured output, the engine is fundamentally language-agnostic. The TS-specific bits ship in @calibrate/standard because that's today's audience; the ecosystem is designed to grow across:

Area         Examples
Languages    TypeScript, JavaScript, Python (ruff/mypy), Go (golangci-lint), Rust (clippy)
Frameworks   Next.js, React, Vue, Svelte, Express, Fastify, NestJS, Django, Rails
Tools        Biome, ESLint, Prettier, Lighthouse, bundlesize, knip, depcheck, audit-ci
Domains      Accessibility, security, performance, i18n, API contracts, schema migrations
House rules  Internal arch invariants, naming conventions, deprecation deadlines, license policy

Today's bundled std-lib is TS-first. The wider ecosystem above is the v1.0+ direction — see specs/extensibility.md and ROADMAP.md for the phased plan.

How a real config composes

{
  "checks": {
    "extends": [
      "@calibrate/standard",           // TS / ESLint / Prettier / sec / tests / cov + SRP
      "@calibrate/react",              // React-specific rules
      "@acme/calibrate-rules"          // your team's org-wide rules
    ],
    "include": [
      "./calibrate-rules/no-direct-db.ts",        // a project-local rule
      "./calibrate-rules/clarity-llm.ts"          // a project-local LLM rule
    ],
    "overrides": {
      "srp/god-file":           { "weight": 2.0 },
      "llm/clarity":            { "advisory": false, "weight": 0.5 },
      "external/bundlesize":    { "config": { "budget": "200kb" } }
    },
    "disable": ["org/legacy-rule"]
  },
  "budgets": {
    "maxLlmCostUSD": 0.50,
    "maxRuntimeSeconds": 300
  }
}

Zero-bundled mode is supported: extends: [] + include: [] runs cleanly with nothing but the rules a project writes itself. Score aggregation, CI gating, and the dashboard all work the same on a list of 1 check as on a list of 50.

Installation

Requires Node.js 18+. TypeScript 4+ projects only.

1. One-shot (no install)

Drop into any TS / Next.js repo and run calibrate without adding it as a dependency. First run auto-detects the project, writes .calibrate/config.json and a calibrate-rules/ directory, then analyzes.

npx --yes @decoperations/calibrate          # interactive
npx --yes @decoperations/calibrate ci       # non-interactive (for CI)

2. As a devDependency

pnpm add -D @decoperations/calibrate
pnpm calibrate                # auto-init + analyze + report

3. As a GitHub Action

Drop the composite action into any workflow — see the GitHub Action section for the full reference.

- uses: actions/checkout@v4
  with: { fetch-depth: 0 }
- uses: decoperations/calibrate.wtf@v1
  with:
    thresholds: standard

4. Global install (for local dev)

npm install -g @decoperations/calibrate
calibrate                     # available on $PATH

Next.js and React are optional peer dependencies — calibrate uses them for framework-aware checks when present, but works on plain Node TypeScript projects too. The package is currently published to GitHub Packages (npm.pkg.github.com) — see the Publishing section for npm authentication.

GitHub Action

decoperations/calibrate.wtf@v1 is a composite action that installs the CLI, runs a quality scan, surfaces the score as a workflow output, and uploads the analysis directory as an artifact. The action is published from this same repo — no separate publish step.

Quick start

name: Quality
on:
  pull_request:
  push:
    branches: [main]

permissions:
  contents: read
  packages: read            # needed to install from GitHub Packages
  pull-requests: write      # only if you comment on PRs

jobs:
  calibrate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0    # required for --diff-against / commit scoring
      - uses: decoperations/calibrate.wtf@v1
        with:
          thresholds: standard
          fail-on-error: true

Inputs

Name             Default           Description
directory        .                 Working directory to run calibrate in.
command          ci                ci (GitHub-aware) or check.
workspace        (none)            Single workspace (monorepos only).
all-workspaces   false             Check every workspace in a monorepo.
fix              false             Run auto-fixers before scoring. check command only.
thresholds       standard          strict | standard | relaxed | custom.
specific         (none)            Comma-separated check ids (e.g. typescript,eslint).
output-dir       calibrate-out     Directory for raw analysis output.
fail-on-error    true              Fail the workflow if calibrate exits non-zero.
version          latest            CLI version to install.
node-version     22                Node.js version used to run the CLI.
upload-artifact  true              Upload analysis output as a workflow artifact.
artifact-name    calibrate-report  Name for the uploaded artifact.
github-token     github.token      Token used to install from GitHub Packages (needs read:packages).

Outputs

  • score: quality score 0–100.
  • passed: "true" or "false".
  • report-path: path to the raw analysis output directory.

Recipes

Comment on PRs with the score:

- id: calibrate
  uses: decoperations/calibrate.wtf@v1
  with:
    fail-on-error: false

- if: github.event_name == 'pull_request'
  uses: actions/github-script@v7
  with:
    script: |
      const score  = '${{ steps.calibrate.outputs.score }}';
      const passed = '${{ steps.calibrate.outputs.passed }}' === 'true';
      const emoji  = passed ? '✅' : '❌';
      await github.rest.issues.createComment({
        issue_number: context.issue.number,
        owner: context.repo.owner,
        repo:  context.repo.repo,
        body:  `${emoji} Calibrate: **${score}/100**`,
      });

Strict gate on main:

- uses: decoperations/calibrate.wtf@v1
  with:
    command: check
    thresholds: strict
    fail-on-error: true

Single workspace in a monorepo:

- uses: decoperations/calibrate.wtf@v1
  with:
    workspace: packages/api

Auto-fix on a fix/* branch:

- uses: decoperations/calibrate.wtf@v1
  with:
    command: check
    fix: true

Full reference: docs/github-action.md and action.yml in the repo root.

CLI commands

The full top-level command surface. Run calibrate <cmd> --help for any command's flags.

Command                          What it does
calibrate init                   Generate .calibrate/config.json, install hooks, set up CI workflows
calibrate init-interactive       Legacy interactive init flow with prompts
calibrate check                  Run the six weighted quality checks
calibrate assess                 Module-level SRP assessment across 5 progressive levels
calibrate serve                  Local Express dashboard for analysis history
calibrate report                 Render reports from raw analysis output
calibrate aggregate              Aggregate analysis output across commits / time ranges
calibrate ci                     CI quality-gate flow with structured exit codes
calibrate config                 Inspect / update .calibrate/config.json
calibrate status                 Show current calibrate state for the project
calibrate workspaces             List monorepo workspaces calibrate sees
calibrate workspace:configure    Per-workspace configuration override
calibrate self-update / upgrade  Update calibrate itself
calibrate publishing:*           12 commands for npm/GitHub-Packages publishing setup, validation, and lifecycle

calibrate check

The main command. Runs each enabled quality check, weights the results, and produces a 0–100 score. Pass threshold is ≥ 80.

Check       Weight  Runs
TypeScript  30      tsc --noEmit
ESLint      25      eslint . --format json
Security    15      audit-ci / npm audit
Tests       15      vitest / jest --run
Coverage    10      coverage % vs threshold
Prettier    5       prettier --list-different

Score formula: round(sum(passed × weight) / sum(weight) × 100). When a check has an infrastructure failure (e.g. tsc OOM even at 8 GB heap), it's flagged with infraFailure: true and given full credit so the score doesn't penalise broken tooling. Prettier supports partial credit: 1 dirty file → 80%, 2–5 → 50%, 6–20 → 25%, 21+ → 0%.
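
The formula and its two special cases can be sketched in a few lines. This is an illustration under stated assumptions rather than the QualityRunner source: CheckOutcome, prettierCredit, and weightedScore are hypothetical names, only infraFailure comes from the text above, and returning 0 for an empty check list is an arbitrary choice.

```typescript
// Hypothetical result shape; field names other than infraFailure are assumptions.
interface CheckOutcome {
  label: string;
  weight: number;
  passed: boolean;
  infraFailure?: boolean; // broken tooling: counted as full credit
  credit?: number; // optional partial credit in [0, 1], e.g. for Prettier
}

// Partial-credit tiers for Prettier, keyed by the number of unformatted files.
function prettierCredit(dirtyFiles: number): number {
  if (dirtyFiles <= 0) return 1;
  if (dirtyFiles === 1) return 0.8;
  if (dirtyFiles <= 5) return 0.5;
  if (dirtyFiles <= 20) return 0.25;
  return 0;
}

// round(sum(passed × weight) / sum(weight) × 100)
function weightedScore(outcomes: CheckOutcome[]): number {
  let earned = 0;
  let total = 0;
  for (const o of outcomes) {
    // Infra failures get full credit; otherwise use partial credit if given.
    const credit = o.infraFailure ? 1 : o.credit ?? (o.passed ? 1 : 0);
    earned += credit * o.weight;
    total += o.weight;
  }
  return total === 0 ? 0 : Math.round((earned / total) * 100);
}
```

With the six default weights, a run where only ESLint fails earns 75 of 100 weight points and so falls below the pass threshold of 80, while a run whose only blemish is a single unformatted file scores 99.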

Common flags

calibrate                                # default subcommand — same as 'calibrate check'
                                         # auto-inits .calibrate/ + calibrate-rules/
                                         # on first run, then analyzes
calibrate check                          # all six checks, score, recommendations
calibrate check --fix                    # run auto-fixers first
calibrate check --ci                     # non-interactive, CI-friendly output
calibrate check --workspace packages/x   # single workspace in a monorepo
calibrate check --root-only              # skip workspace iteration
calibrate check --specific typescript,eslint  # run a subset of legacy checks
calibrate check --diff-against HEAD~1    # per-file score delta vs another commit
calibrate check --raw                    # write structured analysis to ./calibrate-out
calibrate check --thresholds strict      # strict | standard | relaxed | custom
calibrate check --output json            # machine-readable output

# v0.9 Check contract (bypasses legacy QualityRunner)
calibrate check --list-checks            # list registered Check ids
calibrate check --only srp/god-file,srp/coupling   # filter to specific Checks
calibrate check --strict                 # only deterministic Checks (reproducible CI)

Output lands in ./calibrate-out/summary.json with category sub-scores. With --raw, full per-issue analysis is written to ./calibrate-out/<commit>.json plus a .meta.json.

Configuration

calibrate init writes .calibrate/config.json with auto-detected defaults. Every section is optional.

{
  "framework": "nextjs",
  "typescript": true,
  "eslint":   { "enabled": true, "rules": "recommended", "plugins": [] },
  "prettier": { "enabled": true, "semi": false, "singleQuote": true },
  "husky":    { "enabled": true,
                "hooks": { "pre-commit": ["calibrate check --ci"] } },
  "github":   { "enabled": true,
                "workflows": { "ci": true, "codeql": true,
                               "dependabot": true } },
  "testing":  { "framework": "vitest", "coverage": true,
                "threshold": { "statements": 80 } },
  "quality":  { "typescript": true, "linting": true, "formatting": true,
                "security": true, "performance": true,
                "accessibility": true, "bundleSize": true,
                "dependencies": true, "documentation": true },
  "ci":       { "provider": "github", "nodeVersions": ["18","20"],
                "packageManager": "pnpm", "cacheEnabled": true,
                "parallelJobs": true },
  "monorepo": { "enabled": true, "type": "pnpm",
                "workspaces": ["packages/*", "apps/*"],
                "selectedWorkspaces": ["packages/calibrate"] },
  "autoTracking":  { "enabled": true, "preCommitHooks": true,
                     "trackInGit": true,
                     "reportsDirectory": ".calibrate/reports" },
  "qualityGates":  { "enabled": false, "minimumScore": 80 }
}

In a monorepo, selectedWorkspaces controls which workspaces are checked by default. Private packages ("private": true in their package.json) are skipped unless --workspace or --all-workspaces is passed explicitly.
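
The selection rule above can be read as a three-way decision. The sketch below is one interpretation, not calibrate's implementation: WorkspaceInfo, SelectionFlags, and workspacesToCheck are hypothetical names, and selectedWorkspaces is assumed to hold exact workspace paths rather than globs.

```typescript
// Hypothetical shapes; only selectedWorkspaces and "private" come from the text above.
interface WorkspaceInfo {
  path: string; // e.g. "packages/api"
  isPrivate: boolean; // "private": true in that workspace's package.json
}

interface SelectionFlags {
  workspace?: string; // --workspace <path>
  allWorkspaces?: boolean; // --all-workspaces
}

function workspacesToCheck(
  all: WorkspaceInfo[],
  selectedWorkspaces: string[],
  flags: SelectionFlags = {},
): string[] {
  // An explicit --workspace wins and may name a private package.
  if (flags.workspace !== undefined) {
    return all.filter((w) => w.path === flags.workspace).map((w) => w.path);
  }
  // --all-workspaces includes private packages too.
  if (flags.allWorkspaces) {
    return all.map((w) => w.path);
  }
  // Default: the configured selection, minus private packages.
  return all
    .filter((w) => selectedWorkspaces.includes(w.path) && !w.isPrivate)
    .map((w) => w.path);
}
```

Under this reading, a private package listed in selectedWorkspaces is still skipped on a plain calibrate check and only runs when named with --workspace or swept in by --all-workspaces.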

Writing your first custom check

Product commitment

Calibrate is the engine — registry, scheduler, scorer, LLM & external adapters, dashboard. The bundled checks are a starting floor. Every project that uses calibrate is expected to define its own rules. v1.0 extracts the bundled std-lib into a separately-published @calibrate/standard package; the engine ships with zero opinionated checks. See specs/extensibility.md.

A Check is a small object that takes a file (or module / project) and returns a 0–100 score plus evidence. The same contract powers calibrate's built-in rules, npm-published preset packs, and your project's local rules; everything goes through the same scheduler and scorer.

The contract

import type { Check } from '@decoperations/calibrate'

export const noConsoleLog: Check = {
  id: 'org/no-console-log',
  category: 'org',
  scope: 'file',
  kind: 'deterministic',

  async run({ rootDir, files, metrics }) {
    const fs = await import('fs/promises')
    const path = await import('path')
    const filePath = files[0]
    const src = await fs.readFile(filePath, 'utf8')
    const hits = [...src.matchAll(/\bconsole\.log\b/g)]

    if (hits.length === 0) {
      return { score: 100, severity: 'info', message: 'OK' }
    }

    return {
      score: Math.max(0, 100 - hits.length * 20),
      severity: hits.length > 3 ? 'high' : 'medium',
      message: `${hits.length} console.log call(s)`,
      evidence: hits.map(m => ({
        file: path.relative(rootDir, filePath),
        line: src.slice(0, m.index).split('\n').length,
        quote: 'console.log(...)',
      })),
    }
  },
}

Discovery via config

Drop the file under ./calibrate-rules/ and reference it from .calibrate/config.json:

{
  "checks": {
    "extends": ["bundled:srp"],
    "include": [
      "./calibrate-rules/no-console-log.ts",
      "@acme/calibrate-rules"
    ],
    "overrides": {
      "srp/god-file": { "weight": 2.0 },
      "org/no-console-log": { "advisory": false, "weight": 1.0 }
    },
    "disable": ["srp/coupling"]
  },
  "budgets": {
    "maxLlmCostUSD": 0.50
  }
}

Three kinds of check

deterministic

Pure code analysis. Runs in-process. Default weight 1.0. Always runs under --strict.

external

Wraps any CLI that emits structured output (Lighthouse, bundlesize, custom shell). Default weight 0.5.

llm

Prompt + Zod schema + model. Adapter handles content-hash caching, USD-budget ledger, structured parsing. Advisory by default — opt in to weighting.

CLI flags (v0.9)

# List the checks calibrate would run
calibrate check --list-checks

# Run only specific checks
calibrate check --only srp/god-file,org/no-console-log

# Run only deterministic checks (skip LLM/external — reproducible CI)
calibrate check --strict

Phase 1 of v0.9 ships the contract and the six SRP rules as built-in Check instances. Phase 2 adds the LlmCheck adapter; Phase 3 adds ExternalCheck and the preset-package ecosystem. Track progress in specs/extensibility.md and ROADMAP.md.

Presets & shareable configs

A preset is just an npm package that exports Check instances. Adding one is a single line in checks.extends. Any team can publish one; any ecosystem (language, framework, tool, internal platform) can have one.

Anatomy of a preset

// @acme/calibrate-rules/index.ts
import type { Check } from '@decoperations/calibrate'
import { NoDirectDbAccess } from './rules/no-direct-db'
import { RequireFeatureFlagJira } from './rules/feature-flag-jira'
import { ServiceBoundaryEnforcer } from './rules/service-boundary'

export const checks: Check[] = [
  new NoDirectDbAccess(),
  new RequireFeatureFlagJira(),
  new ServiceBoundaryEnforcer(),
]

Consume it in any project that wants those rules:

{
  "checks": {
    "extends": ["@calibrate/standard", "@acme/calibrate-rules"]
  }
}

First-party presets

Package                Status                                        What it ships
@calibrate/standard    bundled (v0.9), separately published in v1.0  TS, ESLint, Prettier, security, tests, coverage + 6 SRP rules. The transitional floor.
@calibrate/strict      planned (v1.0)                                Tighter thresholds and additional rules for repos that want a higher bar by default.
@calibrate/react       planned (v1.0)                                React-specific rules (hooks misuse, key props, accessibility, perf anti-patterns).
@calibrate/node-cli    planned (v1.0)                                CLI / Node service rules (long-running process patterns, signal handling, log hygiene).
@acme/calibrate-rules  your org                                      Your team's house rules, e.g. "no direct DB access outside services/db/", "every feature flag must reference a Jira ticket".

How presets compose

  • Order matters. Later extends entries can override earlier ones via the registry's last-write-wins rule.
  • Overrides are local. A consumer adjusts weight / advisory / config of any imported check without forking the source.
  • Disable is explicit. Don't fork to drop a rule — "disable": ["org/legacy-rule"] turns it off.
  • Local rules are first-class. Anything under ./calibrate-rules/ goes through the same scheduler and scorer; calibrate init scaffolds an example to make this obvious from day one.
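
The composition rules above can be sketched as a small merge function. This is a hedged illustration, not the engine's registry: RegisteredCheck, Override, and composeChecks are hypothetical names, and registering local rules after presets is an assumption about ordering.

```typescript
// Hypothetical shapes for illustration only.
interface RegisteredCheck {
  id: string;
  weight: number;
  advisory: boolean;
  source: string; // which preset or local file registered it
}

type Override = Partial<Pick<RegisteredCheck, 'weight' | 'advisory'>>;

function composeChecks(
  layers: RegisteredCheck[][], // presets in `extends` order, then local rules
  overrides: Record<string, Override>, // checks.overrides
  disable: string[], // checks.disable
): RegisteredCheck[] {
  const registry = new Map<string, RegisteredCheck>();
  // Last write wins: a later layer replaces an earlier check with the same id.
  for (const layer of layers) {
    for (const check of layer) registry.set(check.id, check);
  }
  // Overrides adjust an imported check without forking its source.
  for (const id of Object.keys(overrides)) {
    const existing = registry.get(id);
    if (existing) registry.set(id, { ...existing, ...overrides[id] });
  }
  // Disable is explicit: the check is dropped from the final list.
  for (const id of disable) registry.delete(id);
  return Array.from(registry.values());
}
```

If both @calibrate/standard and @acme/calibrate-rules register srp/god-file, the later entry in extends supplies the implementation, and a consumer's { "weight": 2.0 } override is then applied on top of it.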

Publishing your own preset

# 1. Scaffold a package
mkdir my-calibrate-rules && cd my-calibrate-rules
npm init -y

# 2. Add the contract as a peer dep
npm pkg set peerDependencies.@decoperations/calibrate=">=0.9.0"

# 3. Export an array of Check instances from the package entry
#    (see "Anatomy of a preset" above)

# 4. Publish
npm publish --access public

# 5. Consumers add one line to .calibrate/config.json:
#    "checks": { "extends": ["my-calibrate-rules"] }

A managed marketplace / registry is an explicit non-goal for v0.9. The path is npm — same way TS configs, ESLint configs, and Prettier configs already distribute.

Bundled standard library

These six weighted checks plus the six SRP rules ship as the bundled std-lib in v0.9. They're the transitional floor: in v1.0 they move into a separately-published @calibrate/standard package and the engine ships with zero opinionated checks. Until then, they're what you get on calibrate init by default. Every one of them is implemented as a Check under the same contract as your custom rules.

TypeScript (weight 30)

Runs tsc --noEmit. Includes an automatic OOM recovery: if the heap exhausts, calibrate retries with --max-old-space-size=8192 and, if that also fails, marks the check as an infra failure (no score penalty).

ESLint (weight 25)

eslint . --format json. Specific error patterns (ESM vs CJS mismatch, missing config, missing plugins) get tailored remediation suggestions.

Security (weight 15)

Prefers audit-ci with audit-ci.json (auto-created with --fix); falls back to npm/pnpm/yarn audit at moderate+ severity.

Tests + Coverage (15 + 10)

Auto-detects vitest or jest from package.json; falls back to a passthrough that runs npm/pnpm/yarn test as written. Coverage is parsed from the runner's text output and compared to testing.threshold.statements.

Prettier (weight 5)

prettier --list-different across **/*.{ts,tsx,js,jsx,json,md}. Returns partial credit so a single dirty file doesn't crater the score.

Module quality assessment

calibrate assess detects modules by scanning your project for named groups of files that span multiple architectural layers (components/foo/, hooks/foo/, lib/foo/) and scores each module on five progressive SRP levels.

L1 External SRP — Module Containment

weight 25
Are component/hook/service files actually inside their module's namespace, or scattered?

L2 Internal SRP — Submodule Architecture

weight 20
Coverage, cohesion, size balance, co-located tests within a module.

L3 Cohesion SRP — Ownership Audit

weight 20
Domain leakage, feature creep, shared infra in disguise, external coupling.

L4 Granularity SRP — Responsibility Factoring

weight 20
Concerns per file, change coupling, size uniformity, composability.

L5 Unit Test Coverage

weight 15
Critical workflow coverage, layer coverage, test breadth, co-location.

calibrate assess                 # all modules, all 5 levels
calibrate assess -m meeting      # one module
calibrate assess -l 4            # cap at level 4
calibrate assess -r              # show prioritized recommendations (P0–P3)
calibrate assess -v              # show per-dimension breakdown
calibrate assess -f json         # JSON output

Each level reports weighted dimensions, severity-tagged issues, and effort-tagged recommendations (low / medium / high). Recommendations include estimated impact (e.g. L4 +20) so you can prioritize work by score lift.
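
If the five level weights are combined as a weighted average, a module's overall score could be computed as below. This is a sketch under that assumption: LEVEL_WEIGHTS mirrors the weights listed above, while moduleScore and the renormalisation used when a run is capped with -l are hypothetical.

```typescript
// Level weights from the list above (they sum to 100).
const LEVEL_WEIGHTS = { L1: 25, L2: 20, L3: 20, L4: 20, L5: 15 } as const;
type Level = keyof typeof LEVEL_WEIGHTS;

// Weighted average of per-level scores (each 0-100). Levels missing from
// `scores` (for example when capped with -l 4) are left out and the
// remaining weights are renormalised; that renormalisation is an assumption.
function moduleScore(scores: Partial<Record<Level, number>>): number {
  let earned = 0;
  let total = 0;
  for (const level of Object.keys(LEVEL_WEIGHTS) as Level[]) {
    const s = scores[level];
    if (s === undefined) continue;
    earned += s * LEVEL_WEIGHTS[level];
    total += LEVEL_WEIGHTS[level];
  }
  return total === 0 ? 0 : Math.round(earned / total);
}
```

For example, level scores of 100, 80, 60, 40, and 100 for L1 through L5 combine to 76, and the same module assessed only up to L4 comes out at 72 under this renormalisation.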

Behavior / UI QA (v0.10, planned)

Planned — tracked by issue #71

v0.10 adds product-side QA on top of code quality: a BehaviorLayer that boots your app, drives it via Playwright, and scores whether the running product matches its product spec.

Calibrate's existing spec-rigor engine (core/spec/) already scores nine layers — requirements, ontology, interfaces, data model, state machines, policy, formal, traceability, architecture. v0.10 adds a tenth: behavior. Same two-axis scoring as the other layers:

  • Coverage — every required product feature has at least one tagged scenario.
  • Adherence — scenarios pass when Playwright drives the actual running app.

Three phases

  1. #72: Dogfood Playwright on apps/web. Config + smoke tests for /, /docs, /dashboard. No calibrate code yet.
  2. #73: BehaviorLayer runs Playwright as a check. Discovers tagged scenarios, runs them, and emits coverage-gap / adherence-drift / orphan-spec DiffEvents.
  3. #74: behavior.spec.yaml DSL. given/when/then scenarios in YAML, translated to Playwright actions automatically. The product spec becomes the source of truth.

How it composes with everything else

Because the behavior layer is just another Check (well, a spec-rigor Layer exposed as Checks), all the engine machinery applies: it lands in summary.json, shows on the dashboard, respects --only, contributes to the 0–100 score. No new top-level concept.

# Once v0.10 lands, you'll be able to:
calibrate check --only spec/behavior   # run UI scenarios against the live app
calibrate assess --layer behavior      # coverage + adherence breakdown

Clean code analyzer

File-level violation detector. Walks your source tree (skipping node_modules, dist, .next, tests, configs, .d.ts) and flags the following kinds of structural problems:

Violation                 Trigger
god-file / god-component  ≥ 500 lines & ≥ 3 distinct concerns
oversized-file            > 300 lines (suppressed if already a god-file)
oversized-component       > 200 lines for .tsx/.jsx files
oversized-function        longest function > 50 lines
high-coupling             cross-module imports / internal imports > 2:1
deep-nesting              brace/paren nesting > 4 levels
mixed-concerns            > 100 lines & ≥ 3 detected responsibilities

Concern detection uses regex heuristics to classify a file's responsibilities into 12 buckets. A file longer than 100 lines that touches three or more is flagged as mixed-concerns:

state management
side effects
data fetching
rendering
event handling
form handling
routing
authentication
error handling
animation
real-time communication
storage
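
The trigger table translates fairly directly into threshold checks over per-file metrics. The sketch below is not the analyzer's code: FileMetrics and detectViolations are hypothetical, the metrics are assumed to be computed elsewhere, and choosing god-component for .tsx/.jsx files is a guess at how the two names in the first row are split.

```typescript
// Hypothetical per-file metrics; the thresholds mirror the trigger table above.
interface FileMetrics {
  path: string;
  lines: number;
  concerns: number; // distinct responsibilities detected
  longestFunctionLines: number;
  crossModuleImports: number;
  internalImports: number;
  maxNesting: number;
}

function detectViolations(m: FileMetrics): string[] {
  const found: string[] = [];
  const isComponent = /\.(tsx|jsx)$/.test(m.path);
  const isGod = m.lines >= 500 && m.concerns >= 3;

  if (isGod) found.push(isComponent ? 'god-component' : 'god-file');
  // oversized-file is suppressed when the file is already a god-file.
  if (m.lines > 300 && !isGod) found.push('oversized-file');
  if (isComponent && m.lines > 200) found.push('oversized-component');
  if (m.longestFunctionLines > 50) found.push('oversized-function');
  // The 2:1 ratio is written as a multiplication to avoid dividing by zero.
  if (m.crossModuleImports > 2 * m.internalImports) found.push('high-coupling');
  if (m.maxNesting > 4) found.push('deep-nesting');
  if (m.lines > 100 && m.concerns >= 3) found.push('mixed-concerns');
  return found;
}
```

Read literally, the table lets a 600-line file with three concerns trigger both god-file and mixed-concerns; whether the real analyzer deduplicates those is not stated, so the sketch reports both.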

Note: this analyzer is being renamed (likely SingleResponsibilityAnalyzer) and split into per-rule modules. The behavior won't change — the public CLI and SDK output are stable.

Per-file scoring & diffs

The per-file analyzer scores every changed file on a 0–100 scale from these metrics:

  • line count
  • longest function length
  • average + max cyclomatic complexity (decision-point counting)
  • public exports
  • imports
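
Decision-point counting is the usual shortcut for cyclomatic complexity: start at 1 and add one for every branch. The sketch below shows the idea with a regex scan; approxCyclomaticComplexity is a hypothetical helper, the set of tokens counted is an assumption, and calibrate's own counting rules may differ.

```typescript
// Rough cyclomatic complexity: 1 + number of decision points.
// A regex scan is an approximation (it also counts matches inside strings
// and comments); an AST walk would be more precise.
const DECISION_POINT = /\b(?:if|for|while|case|catch)\b|&&|\|\||\?(?!:)/g;

function approxCyclomaticComplexity(source: string): number {
  // Drop `??` and `?.` first so they are not mistaken for a ternary `?`.
  const cleaned = source.replace(/\?\?|\?\./g, ' ');
  const matches = cleaned.match(DECISION_POINT);
  return 1 + (matches ? matches.length : 0);
}
```

A function body containing one if, one && and one ternary scores 4 under this count, while straight-line code scores 1.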

The --diff-against <commit> flag writes a per-file-delta.json to the output directory that lists every changed file with scoreBefore, scoreAfter, delta, and full metric snapshots — so you can see which files improved, regressed, or didn't move at a per-file granularity.

calibrate check --diff-against main
# ... writes calibrate-out/per-file-delta.json:
# [
#   { "path": "src/foo.ts",
#     "scoreBefore": 72, "scoreAfter": 88, "delta": 16,
#     "metricsBefore": {...}, "metricsAfter": {...} }
# ]
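
The delta entries can be thought of as a join between two per-file score snapshots. The sketch below assumes such snapshots already exist as path-to-score maps; FileDelta follows the field names in the example above (minus the metric snapshots), while perFileDeltas and its handling of added or deleted files are hypothetical.

```typescript
// Field names follow the per-file-delta.json example above; metric
// snapshots are omitted to keep the sketch short.
interface FileDelta {
  path: string;
  scoreBefore: number;
  scoreAfter: number;
  delta: number;
}

function perFileDeltas(
  before: Record<string, number>, // path -> score at the base commit
  after: Record<string, number>, // path -> score at the current commit
  changedPaths: string[],
): FileDelta[] {
  const out: FileDelta[] = [];
  for (const path of changedPaths) {
    const scoreBefore = before[path];
    const scoreAfter = after[path];
    // Files that exist on only one side are skipped here (an assumption).
    if (scoreBefore === undefined || scoreAfter === undefined) continue;
    out.push({ path, scoreBefore, scoreAfter, delta: scoreAfter - scoreBefore });
  }
  // Largest regressions first.
  return out.sort((a, b) => a.delta - b.delta);
}
```

Sorting by delta puts regressions at the top, which is usually what a PR reviewer wants to see first.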

Auto-fixers

calibrate check --fix runs five specialized fixers in dependency order. Each fixer extends BaseAutoFixer with a shouldRun() guard so they skip silently when not applicable.

1. LockfileAutoFixer

pnpm lockfile config & version-mismatch fixes (runs first so deps install cleanly)

2. EslintAutoFixer

ESLint v8 → v9 migration, common config issues, eslint --fix

3. PrettierAutoFixer

prettier --write across the project

4. SecurityAutoFixer

audit-ci config bootstrap, npm audit fix

5. TestCoverageAutoFixer

Test scaffolding for uncovered files
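
The shouldRun() guard pattern can be sketched as follows. SketchAutoFixer stands in for BaseAutoFixer, whose real API is not shown here; StubFixer, runFixers, and the synchronous fix() method are all assumptions made to keep the example small.

```typescript
// Hypothetical base class; the real BaseAutoFixer API may differ.
abstract class SketchAutoFixer {
  constructor(public readonly label: string) {}
  abstract shouldRun(): boolean; // guard: false means "not applicable here"
  abstract fix(): void;
}

// Runs fixers in the order given and returns the labels that actually ran.
function runFixers(fixers: SketchAutoFixer[]): string[] {
  const ran: string[] = [];
  for (const fixer of fixers) {
    if (!fixer.shouldRun()) continue; // skip silently
    fixer.fix();
    ran.push(fixer.label);
  }
  return ran;
}

// Minimal concrete fixer used to demonstrate ordering and skipping.
class StubFixer extends SketchAutoFixer {
  constructor(label: string, private readonly applicable: boolean) {
    super(label);
  }
  shouldRun(): boolean {
    return this.applicable;
  }
  fix(): void {
    // A real fixer would edit files or run a tool here.
  }
}
```

Passing the fixers as an ordered array keeps the dependency order explicit: the lockfile fixer goes first so that later fixers see a cleanly installed dependency tree.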

Historical commit scoring

The CommitScorer SDK class is calibrate's differentiator: it analyses any historical commit by spinning up a temporary git worktree, running the full quality pipeline inside it, then cleaning up, so the analysis never touches your working tree, your branch, or your stash.

import { CommitScorer } from '@decoperations/calibrate';

const scorer = new CommitScorer(process.cwd());
const result = await scorer.scoreCommit('abc1234');

console.log(result.quality.score);          // 0–100
console.log(result.metadata.gitInfo.author);
console.log(result.performance.qualityAnalysis); // ms

Worktrees land in .calibrate/tmp-workspace-* by default and are torn down on success or failure. Active worktrees are tracked, so concurrent scoring is safe.
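
The worktree technique itself is plain git: add a detached worktree, analyse inside it, and remove it in a finally block. The sketch below illustrates that pattern and is not CommitScorer's implementation; Git and withTemporaryWorktree are hypothetical names, and the command runner is injected so the logic stays independent of how git is invoked.

```typescript
// `Git` is an injected command runner (hypothetical); in a real script it
// could wrap child_process.execFileSync('git', args, { cwd: repoDir }).
type Git = (args: string[]) => void;

// Checks out `commit` into a throwaway worktree, runs `analyze` there, and
// removes the worktree whether the analysis succeeds or throws.
function withTemporaryWorktree<T>(
  git: Git,
  commit: string,
  worktreeDir: string,
  analyze: (dir: string) => T,
): T {
  git(['worktree', 'add', '--detach', worktreeDir, commit]);
  try {
    return analyze(worktreeDir);
  } finally {
    git(['worktree', 'remove', '--force', worktreeDir]);
  }
}
```

Because the checkout lives in its own directory, the main working tree, the current branch, and any stash are left alone, and the finally block gives the torn-down-on-success-or-failure behaviour described above.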

Dashboard / calibrate serve

calibrate serve starts a local Express server that reads .calibrate/reports/ and serves a dashboard plus a JSON API.

calibrate serve --open --watch
# defaults: localhost:8888, watches reports dir for new analyses

# API:
# GET /api/trends             → trends time-series
# GET /api/commits            → all analyzed commits
# GET /api/commits/:commitId  → full report for one commit

The dashboard ships pre-built; with --watch it auto-refreshes when a new analysis lands.

GitHub init workflows

In addition to the composite GitHub Action you can drop into any workflow, calibrate init with GitHub integration enabled writes a full set of project files into your repo:

  • .github/workflows/calibrate.yml — runs calibrate check --ci on PRs across the configured Node versions
  • .github/workflows/codeql.yml — CodeQL analysis (optional)
  • .github/dependabot.yml — weekly dependency updates
  • Issue + PR templates in .github/ISSUE_TEMPLATE/

The CI workflow exits non-zero when the score falls below 80 (or your qualityGates.minimumScore), so PRs failing the gate cannot merge under branch protection.

Publishing pipeline

The publishing integration is a full npm / GitHub-Packages publication system with stage-based lifecycle gates. It's composed of eight specialised modules:

  • NpmAuthManager + NpmrcValidator: token / .npmrc validation
  • GitHubPackagesDetector: detects GitHub Packages registry config
  • PnpmMonorepoDetector: workspace-aware publishing
  • DependencyValidator: dependency graph + workspace deps validation
  • PublishingQualityChecker: pre-publish quality gates (build, tests, README, semver)
  • LifecycleManager: stages pre-alpha → alpha → beta → ga with progression rules
  • PublishWorkflowGenerator: generates publish-on-tag / publish-on-release GH Actions

calibrate publishing:setup                   # one-shot setup
calibrate publishing:auth-setup              # configure auth
calibrate publishing:auth-status             # verify auth is wired
calibrate publishing:validate-config         # validate package.json + .npmrc
calibrate publishing:validate-deps           # workspace deps OK?
calibrate publishing:analyze-dependencies    # dep graph audit
calibrate publishing:lifecycle-status        # current stage + version
calibrate publishing:promote-stage           # alpha → beta → ga
calibrate publishing:generate-ci             # publish workflow
calibrate publishing:diagnose                # full health check
calibrate publishing:init-github-packages    # GitHub Packages init
calibrate publishing:ci-instructions         # CI setup instructions

CI / pipeline mode

--raw turns calibrate into a structured data emitter. Combined with --ci and a commit ID, you get auditable analysis output you can persist, diff, and ship to external dashboards.

calibrate check \
  --ci \
  --raw \
  --output-dir ./out \
  --commit-id $GITHUB_SHA

# Produces:
#   out/<sha>.json        full EnhancedCalibrateOutput
#   out/<sha>.meta.json   summary metadata
#   out/<sha>.ndjson      one issue per line (with --format ndjson)

Output schema is EnhancedCalibrateOutput — see packages/calibrate/src/types.ts for the full TypeScript shape. calibrate aggregate rolls multiple commits into a single trend report; calibrate report renders human-readable output from the raw data.

Exit codes

  • 0 — score ≥ 80 (or quality gate threshold)
  • 1 — score below threshold / hard failure
  • 2 — configuration error (missing workspace selection in CI, etc.)
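
The exit-code contract is small enough to state as a function. This is a sketch of the mapping listed above, not the CLI's source: RunSummary and exitCodeFor are hypothetical names, and checking for a configuration error before anything else is an assumption about precedence.

```typescript
// Hypothetical run summary; only the exit-code meanings come from the list above.
interface RunSummary {
  score: number;
  configError?: boolean; // e.g. no workspace selected in CI
  hardFailure?: boolean; // the run itself could not complete
}

function exitCodeFor(run: RunSummary, minimumScore = 80): 0 | 1 | 2 {
  if (run.configError) return 2;
  if (run.hardFailure || run.score < minimumScore) return 1;
  return 0;
}
```

A wrapper script can branch on these values, for example retrying nothing on 2 (the configuration needs fixing) while treating 1 as an ordinary failed gate.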

Programmatic API

Everything the CLI does is exposed as TypeScript classes from the @decoperations/calibrate package.

import {
  CalibrateSDK,
  ConfigManager,
  QualityRunner,
  ProjectDetector,
  ModuleAnalyzer,
  CleanCodeAnalyzer,
  GitHubIntegration,
} from '@decoperations/calibrate';

// Top-level SDK
const calibrate = new CalibrateSDK();
await calibrate.init({ skipPrompts: true });
const result = await calibrate.check({ raw: true });
console.log(result.score, result.passed);

// Direct module access
const detector  = new ProjectDetector();
const project   = await detector.detect();
const analyzer  = new ModuleAnalyzer(process.cwd());
const assessment = await analyzer.assess();

const ccAnalyzer = new CleanCodeAnalyzer(process.cwd());
const violations = ccAnalyzer.analyze();

All the public types — QualityResult, ModuleQualityResult, CleanCodeViolation, FileQualityScore, EnhancedCalibrateOutput, etc. — are re-exported from the package root.

Source of truth

Types live in packages/calibrate/src/types.ts
CLI definitions in packages/calibrate/src/cli.ts
Score weights in packages/calibrate/src/core/QualityRunner.ts