move skills

2026-06-03 08:46:46 +00:00 · 2026-05-13 19:08:27 +02:00
parent 9032a31e26
commit b2ef159aee
23 changed files with 571 additions and 93 deletions
--- a/.github/scripts/doc-agent/.env.example
+++ b/.github/scripts/doc-agent/.env.example
@@ -0,0 +1,13 @@
+# Copy this file to .env and fill in your values.
+# .env is gitignored and will never be committed.
+
+# Required: Anthropic API key for the Claude Agent SDK.
+ANTHROPIC_API_KEY=sk-ant-...
+
+# Optional: Override the path to the xrpld repo root.
+# Defaults to three levels up from this directory (the repo this lives in).
+# XRPLD_ROOT=/path/to/xrpld
+
+# Optional: Override the model used by the agent.
+# Defaults to claude-opus-4-7.
+# DOC_AGENT_MODEL=claude-opus-4-7
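
Reviewer note: the defaults described in `.env.example` could be resolved roughly as follows. This is a minimal sketch, not the contents of `src/config.ts` (which this diff does not show). `resolveConfig`, `DocAgentConfig`, and the `scriptDir` parameter are hypothetical names introduced for illustration. Only the three variable names and the two documented defaults come from the file above.

```typescript
import { resolve } from 'node:path';

// Hypothetical shape; the real config module may expose different names.
interface DocAgentConfig {
  readonly apiKey: string;
  readonly xrpldRoot: string;
  readonly model: string;
}

// Hypothetical helper illustrating the documented defaults.
function resolveConfig(
  env: Readonly<Record<string, string | undefined>>,
  scriptDir: string,
): DocAgentConfig {
  const apiKey = env.ANTHROPIC_API_KEY;
  if (apiKey === undefined || apiKey === '') {
    throw new Error('ANTHROPIC_API_KEY is required');
  }
  return {
    apiKey,
    // Documented default: three levels up from the doc-agent directory.
    xrpldRoot: env.XRPLD_ROOT ?? resolve(scriptDir, '..', '..', '..'),
    // Documented default model.
    model: env.DOC_AGENT_MODEL ?? 'claude-opus-4-7',
  };
}
```

Passing the environment in as a parameter, rather than reading `process.env` inside the function, keeps the sketch easy to exercise without touching the real shell environment.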
--- a/.github/scripts/doc-agent/README.md
+++ b/.github/scripts/doc-agent/README.md
@@ -5,19 +5,23 @@ Claude Agent SDK.

 ## What it does

-Two modes:
+Three modes:

 - **document** — Add Doxygen `/** */` documentation to a C++ file or
-  directory. The agent reads the file, related tests, and module skill
-  context, then writes documentation comments per the project standards in
-  `docs/DOCUMENTATION_STANDARDS.md`.
+  directory. For each target file, the agent reads the sibling
+  `<file>.ai.md` (high-signal prose generated by the athenah-ai pipeline),
+  the module skill, and the file itself, then writes Doxygen comments per
+  the standards in `docs/DOCUMENTATION_STANDARDS.md`.
 - **review** — Given a git diff range, detect documentation drift. Used by
  the `doc-review` GitHub Action and locally for testing.
+- **regen-skills** — Rebuild a module's skill file at
+  `docs/skills/<module>.md` from the `.ai.md` files in that module
+  and the existing skill content.

 ## Requirements

- Node.js >= 20
- `ANTHROPIC_API_KEY` environment variable
+- Node.js >= 20.12 (for native `--env-file` support)
+- `ANTHROPIC_API_KEY` (in `.env` or exported in shell)
 - Tools the agent uses: `git`, `gh` (for `--pr`)

 ## Install
@@ -25,8 +29,13 @@ Two modes:
 ```sh
 cd .github/scripts/doc-agent
 npm install
+cp .env.example .env
+# edit .env and set ANTHROPIC_API_KEY
 ```

+The npm scripts auto-load `.env` via Node's `--env-file-if-exists` flag.
+You can also export the variables in your shell — both work.
+
 ## Build and lint

 ```sh
@@ -41,9 +50,7 @@ npm run check:fix     # lint + format + fix
 ## Usage

 ```sh
-export ANTHROPIC_API_KEY=sk-ant-...
-
-# Document a single file
+# Document a single file (reads sibling .ai.md if present)
 npm run document include/xrpl/basics/base_uint.h

 # Document an entire module
@@ -54,10 +61,22 @@ npm run review develop..HEAD

 # Review a PR
 npm run review -- --pr 1234
+
+# Regenerate a skill file from this module's .ai.md inputs
+npm run regen-skills protocol
+npm run regen-skills ledger
 ```

-When invoked outside the xrpld repo, set `XRPLD_ROOT` to the path of the
-checkout you want to operate on.
+When invoked outside the xrpld repo, set `XRPLD_ROOT` in `.env` to the path
+of the checkout you want to operate on.
+
+## ai.md context files
+
+The doc-agent reads a sibling `<file>.ai.md` next to each source file when
+documenting it. These are produced by the upstream `athenah-ai` pipeline
+and treated as the authoritative source of intent. They are gitignored
+(`*.ai.md` in `.gitignore`) and should be removed once the initial
+documentation pass is complete.

 ## Outputs

@@ -76,13 +95,15 @@ doc-agent/
 ├── biome.json
 ├── prompts/
 │   ├── document-file.md     # System prompt for documentation mode
-│   └── review-diff.md       # System prompt for review mode
+│   ├── review-diff.md       # System prompt for review mode
+│   └── regen-skill.md       # System prompt for regen-skills mode
 └── src/
    ├── index.ts             # CLI entry point
    ├── config.ts            # Paths, model, module-skill map
    ├── prompt-loader.ts     # Loads prompts + module skill context
    ├── document.ts          # Document mode
    ├── review.ts            # Review mode
+    ├── regen-skills.ts      # Regen-skills mode
    └── types.ts             # Shared types
 ```

--- a/.github/scripts/doc-agent/package.json
+++ b/.github/scripts/doc-agent/package.json
@@ -9,10 +9,11 @@
  },
  "scripts": {
    "build": "tsc",
-    "start": "node dist/index.js",
-    "dev": "tsx src/index.ts",
-    "document": "tsx src/index.ts document",
-    "review": "tsx src/index.ts review",
+    "start": "node --env-file-if-exists=.env dist/index.js",
+    "dev": "tsx --env-file-if-exists=.env src/index.ts",
+    "document": "tsx --env-file-if-exists=.env src/index.ts document",
+    "review": "tsx --env-file-if-exists=.env src/index.ts review",
+    "regen-skills": "tsx --env-file-if-exists=.env src/index.ts regen-skills",
    "typecheck": "tsc --noEmit",
    "lint": "biome lint src",
    "format": "biome format --write src",
@@ -29,6 +30,6 @@
    "typescript": "^5.7.0"
  },
  "engines": {
-    "node": ">=20"
+    "node": ">=20.12"
  }
 }
--- a/.github/scripts/doc-agent/prompts/document-file.md
+++ b/.github/scripts/doc-agent/prompts/document-file.md
@@ -34,7 +34,7 @@ Read `docs/DOCUMENTATION_STANDARDS.md` for the full specification. Key rules:

 ## Module Context

-Before you start, read the relevant skill file in `docs/skills/soul/` for
+Before you start, read the relevant skill file in `docs/skills/` for
 the module you're working on. These capture per-module conventions, key
 classes, and gotchas:

@@ -42,20 +42,26 @@ classes, and gotchas:
 - `protocol` — STObject, SField, Serializer, TER codes, Features, Keylets
 - `ledger` — ReadView/ApplyView, state tables, payment sandbox
 - `tx` / `transactors` — transaction pipeline
- `consensus`, `peering`, `nodestore`, `shamap`, `rpc` — see `docs/skills/soul/`
+- `consensus`, `peering`, `nodestore`, `shamap`, `rpc` — see `docs/skills/`

 ## Process

-1. Read the target file completely
-2. Read the corresponding skill file in `docs/skills/soul/` if one applies
-3. Identify entities that need documentation (public classes, structs,
+1. If "Authoritative AI Context" is provided in the user prompt, treat it as
+   the source of truth for the file's intent and behavior. Your task is to
+   translate that prose into structured Doxygen comments on the declarations.
+2. Read the target file completely
+3. Read the corresponding skill file in `docs/skills/` if one applies
+4. Identify entities that need documentation (public classes, structs,
   public methods, free functions in headers, enums)
-4. For each entity: read the implementation (and tests if helpful), then
-   write a Doxygen comment that captures behavior and intent
-5. Use the Edit tool to add the comments to the file
-6. Do NOT modify code logic — only add documentation
-7. Do NOT add documentation to entities that don't need it (private members
+5. For each entity: cross-reference the ai.md context, read the implementation
+   (and tests if helpful), then write a Doxygen comment that captures behavior
+   and intent
+6. Use the Edit tool to add the comments to the file
+7. Do NOT modify code logic — only add documentation
+8. Do NOT add documentation to entities that don't need it (private members
   with obvious purpose, simple getters where the name is self-explanatory)
+9. Do NOT read the `.ai.md` file yourself — it is already injected into your
+   prompt when one exists for the target file

 When you finish, summarize:
 - How many entities you documented
--- a/.github/scripts/doc-agent/prompts/regen-skill.md
+++ b/.github/scripts/doc-agent/prompts/regen-skill.md
@@ -0,0 +1,47 @@
+You are updating a per-module skill file for the xrpld codebase.
+
+A "skill" is a single markdown file at `docs/skills/<module>.md` that
+captures the institutional knowledge for one module: what it does, key
+classes, conventions, gotchas, and how to work in it. The skill file is
+loaded as context whenever an agent works on code in that module.
+
+## Inputs
+
+You will be given:
+- The current skill file for the module (the baseline to update)
+- A list of `.ai.md` files describing the source files in this module
+  (one per source file, with high-signal prose about purpose and design)
+
+## Your task
+
+Produce a new, improved skill file that integrates the knowledge from the
+ai.md files into the existing skill. Specifically:
+
+1. Update the description of the module's responsibility if the ai.md files
+   reveal more accurate or detailed framing
+2. Add any classes, patterns, or invariants the skill is missing
+3. Update lists of key files / entry points / conventions
+4. Add gotchas and non-obvious behavior surfaced by the ai.md files
+5. Keep the structure of the existing skill (don't reorganize for the sake
+   of it — only restructure if the existing structure is genuinely failing)
+6. Be terse. A skill file is a reference card, not a textbook. 200-500 lines
+   is typical; over 1000 means you're padding.
+
+## Quality rules
+
+- **Do not duplicate the ai.md content.** Aggregate, synthesize, distill.
+  The skill is the module-level view; individual file details belong in
+  ai.md (and eventually in inline Doxygen comments).
+- **Preserve accurate existing content.** Don't rewrite working sections.
+- **Cite file paths** for specific claims (e.g., "see `STAmount.h:roundToScale`").
+- **Flag contradictions.** If two ai.md files describe the same concept
+  differently, surface the conflict rather than silently picking one.
+- **Keep prose grounded.** No marketing language. No "robust, scalable,
+  enterprise-grade" filler. Engineers reading this need facts.
+
+## Output
+
+Emit the complete new skill file content as your final assistant message.
+Start with the markdown heading. Do not include meta-commentary like "Here
+is the updated skill file" — the output is captured verbatim and written
+to the skill file path.
--- a/.github/scripts/doc-agent/src/document.ts
+++ b/.github/scripts/doc-agent/src/document.ts
@@ -3,6 +3,7 @@
 */

 import { existsSync, readdirSync, statSync } from 'node:fs';
+import { readFile } from 'node:fs/promises';
 import { join, relative, resolve } from 'node:path';
 import { query } from '@anthropic-ai/claude-agent-sdk';
 import { MODEL, XRPLD_ROOT } from './config.js';
@@ -47,6 +48,21 @@ function findCppFiles(target: string): string[] {
  return results;
 }

+/**
+ * Read the sibling .ai.md file for a source file, if one exists.
+ *
+ * The athenah-ai pipeline produces a `<file>.ai.md` companion for every
+ * documented source file (e.g., `Slice.h` -> `Slice.h.ai.md`). When present,
+ * it is high-signal prose describing the file's purpose, design, and
+ * non-obvious behavior — the agent should use it as the authoritative
+ * source of intent.
+ */
+async function readAiContext(absPath: string): Promise<string | null> {
+  const aiPath = `${absPath}.ai.md`;
+  if (!existsSync(aiPath)) return null;
+  return await readFile(aiPath, 'utf8');
+}
+
 /**
 * Document a single file by running the documentation agent against it.
 */
@@ -55,13 +71,19 @@ async function documentFile(absPath: string): Promise<void> {
  console.log(`\n=== Documenting: ${relPath} ===`);

  const systemPrompt = await loadSystemPrompt('document-file', relPath);
+  const aiContext = await readAiContext(absPath);
+  const aiContextBlock =
+    aiContext === null
+      ? ''
+      : `\n\n## Authoritative AI Context (${relPath}.ai.md)\n\nThe following is high-signal prose describing this file's purpose, design,\nand non-obvious behavior. Treat it as the source of truth for intent and\nbehavior. Your job is to translate this into structured Doxygen \`/** */\`\ncomments on the actual declarations.\n\n---\n\n${aiContext}\n---`;
+
  const userPrompt = `Add Doxygen documentation to: ${relPath}

 The file is rooted at ${XRPLD_ROOT}. Use the Read tool to read it, the Edit
 tool to add documentation, and Glob/Grep to find related tests or callers
 when needed.

-Do not modify any code logic — only add documentation comments.`;
+Do not modify any code logic — only add documentation comments.${aiContextBlock}`;

  const result = query({
    prompt: userPrompt,
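
Reviewer note: the sibling-file convention and the optional context block added in `document.ts` can be exercised in isolation. The sketch below is a simplified illustration, not the committed code: `aiPathFor` and `appendAiContext` are hypothetical helpers, and the injected block is shortened to its heading and delimiters.

```typescript
// Sibling convention from the diff: `<file>` -> `<file>.ai.md`
// (for example `Slice.h` -> `Slice.h.ai.md`).
function aiPathFor(sourcePath: string): string {
  return `${sourcePath}.ai.md`;
}

// Hypothetical helper: append the context block only when a sibling
// .ai.md was found; otherwise return the prompt unchanged.
function appendAiContext(
  basePrompt: string,
  relPath: string,
  aiContext: string | null,
): string {
  if (aiContext === null) return basePrompt;
  const heading = `## Authoritative AI Context (${aiPathFor(relPath)})`;
  return `${basePrompt}\n\n${heading}\n\n---\n\n${aiContext}\n---`;
}
```

Keeping the `null` case as a plain pass-through means files without a companion `.ai.md` get the same prompt as before this change.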
--- a/.github/scripts/doc-agent/src/index.ts
+++ b/.github/scripts/doc-agent/src/index.ts
@@ -7,9 +7,11 @@
 *   doc-agent document include/xrpl/basics/
 *   doc-agent review develop..HEAD
 *   doc-agent review --pr 1234
+ *   doc-agent regen-skills protocol
 */

 import { documentTarget } from './document.js';
+import { regenSkills } from './regen-skills.js';
 import { reviewDiff } from './review.js';

 const USAGE = `
@@ -19,6 +21,8 @@ Usage:
  doc-agent document <file-or-directory>   Add Doxygen documentation
  doc-agent review <base>..<head>          Detect doc drift in range
  doc-agent review --pr <number>           Detect doc drift for a PR
+  doc-agent regen-skills <module>          Regenerate docs/skills/<module>.md
+                                           from sibling .ai.md files

 Environment:
  ANTHROPIC_API_KEY  (required)  Anthropic API key
@@ -58,6 +62,13 @@ async function main(): Promise<void> {
    return;
  }

+  if (mode === 'regen-skills') {
+    const moduleName = args[0];
+    if (moduleName === undefined) printUsageAndExit(1);
+    await regenSkills(moduleName);
+    return;
+  }
+
  console.error(`Unknown mode: ${mode}`);
  printUsageAndExit(1);
 }
--- a/.github/scripts/doc-agent/src/prompt-loader.ts
+++ b/.github/scripts/doc-agent/src/prompt-loader.ts
@@ -24,7 +24,7 @@ export async function loadSystemPrompt(promptName: string, sourcePath: string):
    return basePrompt;
  }

-  const skillPath = resolve(SKILLS_DIR, 'soul', skillFile);
+  const skillPath = resolve(SKILLS_DIR, skillFile);
  if (!existsSync(skillPath)) {
    return basePrompt;
  }
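
Reviewer note: after this change the skill path is resolved directly under `SKILLS_DIR`, with no `soul` segment. The sketch below shows one way a source path could map to a skill file. It assumes a longest-prefix match, which this diff does not confirm. `EXAMPLE_MAP`, its entries, `skillFileFor`, and `skillPathFor` are all hypothetical; the real `MODULE_SKILL_MAP` lives in `src/config.ts`, which is not shown here.

```typescript
// Hypothetical stand-in for MODULE_SKILL_MAP; the entries are illustrative only.
const EXAMPLE_MAP: Readonly<Record<string, string | null>> = {
  'include/xrpl/protocol': 'protocol.md',
  'include/xrpl/ledger': 'ledger.md',
  'include/xrpl/basics': null, // a prefix with no skill file
};

// Assumed strategy: the longest matching directory prefix wins.
function skillFileFor(
  sourcePath: string,
  map: Readonly<Record<string, string | null>>,
): string | null {
  let best: string | null = null;
  let bestLen = -1;
  for (const [prefix, skill] of Object.entries(map)) {
    const matches = sourcePath === prefix || sourcePath.startsWith(`${prefix}/`);
    if (matches && prefix.length > bestLen) {
      best = skill;
      bestLen = prefix.length;
    }
  }
  return best;
}

// The skill file sits directly under the skills directory.
function skillPathFor(
  sourcePath: string,
  skillsDir: string,
  map: Readonly<Record<string, string | null>>,
): string | null {
  const skill = skillFileFor(sourcePath, map);
  return skill === null ? null : `${skillsDir}/${skill}`;
}
```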
--- a/.github/scripts/doc-agent/src/regen-skills.ts
+++ b/.github/scripts/doc-agent/src/regen-skills.ts
@@ -0,0 +1,146 @@
+/**
+ * Regen-skills mode: rebuild a module's skill file from ai.md inputs.
+ *
+ * For a given module (e.g. `protocol`, `ledger`, `consensus`), collect all
+ * `.ai.md` files under the matching source paths and ask the agent to
+ * produce an updated `docs/skills/<module>.md`.
+ */
+
+import { existsSync, readdirSync, statSync } from 'node:fs';
+import { readFile, writeFile } from 'node:fs/promises';
+import { join, relative, resolve } from 'node:path';
+import { query } from '@anthropic-ai/claude-agent-sdk';
+import { MODEL, MODULE_SKILL_MAP, PROMPTS_DIR, SKILLS_DIR, XRPLD_ROOT } from './config.js';
+
+interface AiFile {
+  readonly sourcePath: string;
+  readonly content: string;
+}
+
+/** Resolve which source-tree prefixes feed a given skill file. */
+function prefixesForSkill(skillFile: string): string[] {
+  return Object.entries(MODULE_SKILL_MAP)
+    .filter(([, mapped]) => mapped === skillFile)
+    .map(([prefix]) => prefix);
+}
+
+/** Recursively walk a source prefix and collect every .ai.md file under it. */
+function collectAiFiles(prefix: string): string[] {
+  const absDir = resolve(XRPLD_ROOT, prefix);
+  if (!existsSync(absDir) || !statSync(absDir).isDirectory()) return [];
+
+  const results: string[] = [];
+  const walk = (dir: string): void => {
+    for (const entry of readdirSync(dir, { withFileTypes: true })) {
+      const full = join(dir, entry.name);
+      if (entry.isDirectory()) {
+        walk(full);
+      } else if (entry.isFile() && entry.name.endsWith('.ai.md')) {
+        results.push(full);
+      }
+    }
+  };
+  walk(absDir);
+  return results;
+}
+
+async function loadAiFiles(absPaths: readonly string[]): Promise<AiFile[]> {
+  const files: AiFile[] = [];
+  for (const absPath of absPaths) {
+    const content = await readFile(absPath, 'utf8');
+    files.push({
+      sourcePath: relative(XRPLD_ROOT, absPath).replace(/\.ai\.md$/, ''),
+      content,
+    });
+  }
+  return files;
+}
+
+/**
+ * Regenerate the skill file for a given module name.
+ *
+ * @param moduleName - The skill file name without extension (e.g. "protocol",
+ *   "ledger"). `${moduleName}.md` must appear as a value in MODULE_SKILL_MAP.
+ */
+export async function regenSkills(moduleName: string): Promise<void> {
+  const skillFile = `${moduleName}.md`;
+  const prefixes = prefixesForSkill(skillFile);
+
+  if (prefixes.length === 0) {
+    throw new Error(
+      `Unknown module: ${moduleName}. Valid modules: ${Array.from(new Set(Object.values(MODULE_SKILL_MAP).filter((v): v is string => v !== null))).join(', ')}`,
+    );
+  }
+
+  console.log(`Regenerating skill: ${skillFile}`);
+  console.log(`  Source prefixes: ${prefixes.join(', ')}`);
+
+  const aiPaths = prefixes.flatMap((prefix) => collectAiFiles(prefix));
+  if (aiPaths.length === 0) {
+    console.warn('  No .ai.md files found for this module. Skipping.');
+    return;
+  }
+  console.log(`  Found ${aiPaths.length} .ai.md file(s)`);
+
+  const aiFiles = await loadAiFiles(aiPaths);
+  const skillPath = resolve(SKILLS_DIR, skillFile);
+  const existingSkill = existsSync(skillPath)
+    ? await readFile(skillPath, 'utf8')
+    : '(no existing skill file — create a new one)';
+
+  const systemPrompt = await readFile(resolve(PROMPTS_DIR, 'regen-skill.md'), 'utf8');
+
+  const aiBlocks = aiFiles
+    .map((f) => `\n### \`${f.sourcePath}\`\n\n${f.content}`)
+    .join('\n\n---\n');
+
+  const userPrompt = `Regenerate the skill file: \`docs/skills/${skillFile}\`
+
+## Existing skill content
+
+${existingSkill}
+
+## AI context files for this module
+
+${aiBlocks}
+
+Produce the new complete skill file content as your final message.`;
+
+  let response = '';
+  const result = query({
+    prompt: userPrompt,
+    options: {
+      model: MODEL,
+      systemPrompt,
+      cwd: XRPLD_ROOT,
+      allowedTools: ['Read', 'Glob', 'Grep'],
+      permissionMode: 'acceptEdits',
+    },
+  });
+
+  for await (const message of result) {
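+    if (message.type === 'assistant') {
+      // Keep only the latest text-bearing message. The prompt asks for the
+      // skill file as the final message, so any narration emitted before
+      // tool calls must not be concatenated into the written file.
+      const content = message.message?.content;
+      if (Array.isArray(content)) {
+        const text = content.map((b) => (b.type === 'text' ? b.text : '')).join('');
+        if (text.length > 0) response = text;
+      }
+    }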
+    if (message.type === 'result') {
+      const cost = message.total_cost_usd?.toFixed(4) ?? '?';
+      console.log(`  [Cost: $${cost}]`);
+    }
+  }
+
+  const trimmed = response.trim();
+  if (trimmed.length === 0) {
+    console.error('  Agent returned empty response — skill file not updated.');
+    return;
+  }
+
+  await writeFile(skillPath, `${trimmed}\n`);
+  console.log(`  Wrote: ${relative(XRPLD_ROOT, skillPath)}`);
+}
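
Reviewer note: the reverse lookup in `regen-skills.ts` (skill file to source prefixes) and the `.ai.md` suffix stripping can be checked without the SDK. The sketch below mirrors `prefixesForSkill` and the `replace(/\.ai\.md$/, '')` call from the diff, but runs against a hypothetical map. `SAMPLE_SKILL_MAP` and its entries, `validModules`, and `sourcePathOf` are names introduced here for illustration.

```typescript
type SkillMap = Readonly<Record<string, string | null>>;

// Hypothetical stand-in for MODULE_SKILL_MAP; the entries are illustrative only.
const SAMPLE_SKILL_MAP: SkillMap = {
  'include/xrpl/protocol': 'protocol.md',
  'src/libxrpl/protocol': 'protocol.md',
  'include/xrpl/ledger': 'ledger.md',
  'include/xrpl/basics': null,
};

// Same filter/map shape as the committed prefixesForSkill, with the map passed in.
function prefixesForSkill(map: SkillMap, skillFile: string): string[] {
  return Object.entries(map)
    .filter(([, mapped]) => mapped === skillFile)
    .map(([prefix]) => prefix);
}

// Distinct module names, as they would be listed in the "Unknown module" error.
function validModules(map: SkillMap): string[] {
  const files = Object.values(map).filter((v): v is string => v !== null);
  return Array.from(new Set(files)).map((f) => f.replace(/\.md$/, ''));
}

// Recover the source path from a companion file path.
function sourcePathOf(aiPath: string): string {
  return aiPath.replace(/\.ai\.md$/, '');
}
```

Note that the committed error message lists the raw map values (with the `.md` extension), while the CLI expects the name without it. The `validModules` sketch strips the extension to show the form a user would actually type.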