feat: auto-generate sitemap.xml from config page routes #16

Closed
pook wants to merge 1 commit from feature/auto-sitemap-generation into master
Owner

Summary

  • Adds scripts/generate-sitemap.sh that reads pages.items from the instance JSON config and generates a valid sitemap.xml with lastmod, changefreq, and priority per route
  • Integrates sitemap generation into both generate-instance.sh and new-instance.sh build scripts
  • Adds robots.txt rendering step so the Sitemap directive URL is populated with the correct domain
  • Adds pages config section to all instance configs (contractpilot, compliancebot, viztekpro) and the master template
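
The robots.txt rendering step could be as small as a single substitution. A minimal sketch, assuming the template marks the domain with a `{{DOMAIN}}` placeholder (a hypothetical token; the PR does not show the real one) and that the domain is passed in as an argument rather than read from the config:

```shell
# render_robots TEMPLATE DOMAIN
# Prints TEMPLATE with every {{DOMAIN}} placeholder replaced by DOMAIN.
# The placeholder name is an assumption made for this sketch.
render_robots() {
  local template="$1" domain="$2"
  # '|' is the sed delimiter, so slashes in the surrounding URL need no escaping.
  sed "s|{{DOMAIN}}|${domain}|g" "$template"
}
```

Called as `render_robots robots.txt.tpl contractpilot.xyz > robots.txt` (the template filename is illustrative), a line such as `Sitemap: https://{{DOMAIN}}/sitemap.xml` would be written out with the instance domain filled in.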

How it works

New pages added to pages.items in any instance config automatically appear in the generated sitemap. Each entry's lastmod is taken from the modification time of the corresponding HTML file and falls back to the build date if that file does not exist yet.
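
The fallback and the per-route entries can be sketched in a few lines of shell. This is an illustration, not the contents of scripts/generate-sitemap.sh: the function names, the tab-separated input (route, changefreq, priority, HTML file) and the reliance on GNU `date -r FILE` are all assumptions, and the real script reads its values from pages.items.

```shell
# lastmod_for FILE BUILD_DATE
# Prints FILE's modification date (YYYY-MM-DD), or BUILD_DATE if FILE is missing.
# Assumes GNU date, where -r FILE reports the file's last modification time.
lastmod_for() {
  if [ -f "$1" ]; then
    date -r "$1" +%Y-%m-%d
  else
    printf '%s\n' "$2"
  fi
}

# emit_sitemap DOMAIN BUILD_DATE
# Reads tab-separated "route changefreq priority html_file" rows on stdin
# and writes a sitemap document to stdout.
emit_sitemap() {
  local domain="$1" build_date="$2" tab route freq prio file
  tab="$(printf '\t')"
  printf '%s\n' '<?xml version="1.0" encoding="UTF-8"?>'
  printf '%s\n' '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
  while IFS="$tab" read -r route freq prio file; do
    printf '  <url>\n'
    printf '    <loc>https://%s%s</loc>\n' "$domain" "$route"
    printf '    <lastmod>%s</lastmod>\n' "$(lastmod_for "$file" "$build_date")"
    printf '    <changefreq>%s</changefreq>\n' "$freq"
    printf '    <priority>%s</priority>\n' "$prio"
    printf '  </url>\n'
  done
  printf '%s\n' '</urlset>'
}
```

Routes are written out verbatim here, so a route containing `&` or `<` would need XML escaping before it reached this function.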

Test plan

  • [x] generate-instance.sh contractpilot produces valid sitemap.xml with 3 URLs
  • [x] new-instance.sh contractpilot --force produces valid sitemap.xml with 3 URLs
  • [x] robots.txt contains correct Sitemap: https://<domain>/sitemap.xml directive
  • [x] Sitemap XML validates (parsed successfully with Python xml.etree)
  • [x] All JSON configs validate with jq empty
  • [ ] A new page entry added to the config appears in the next build's sitemap

🤖 Generated with Claude Code

feat: auto-generate sitemap.xml from config page routes
Some checks failed
Smoke Test / smoke (push) Has been cancelled
d60a25a050
Add a build step that generates sitemap.xml from the pages.items array
in each instance config. This ensures new pages added to config
automatically appear in the sitemap. Also renders robots.txt with the
correct Sitemap directive URL during build.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Author
Owner

Build Verification Report — PR #16 (Sitemap Generation)

Branch: feature/auto-sitemap-generation
Commit: d60a25a
Verified: 2026-04-11

1. Script Execution

$ bash scripts/generate-sitemap.sh instances/contractpilot.json instances/contractpilot
Sitemap generated: instances/contractpilot/sitemap.xml (3 pages)

Result: PASS — Script runs cleanly with exit code 0.

2. Generated sitemap.xml

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://contractpilot.xyz/</loc>
    <lastmod>2026-04-11</lastmod>
    <changefreq>weekly</changefreq>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>https://contractpilot.xyz/privacy</loc>
    <lastmod>2026-04-11</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.3</priority>
  </url>
  <url>
    <loc>https://contractpilot.xyz/terms</loc>
    <lastmod>2026-04-11</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.3</priority>
  </url>
</urlset>

Result: PASS — Valid XML, 3 pages match pages.items in contractpilot.json.

3. URL Verification

| Config Route | Sitemap URL | Match |
|---|---|---|
| / | https://contractpilot.xyz/ | ✅ |
| /privacy | https://contractpilot.xyz/privacy | ✅ |
| /terms | https://contractpilot.xyz/terms | ✅ |
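
The route-by-route comparison can be automated with standard tools. A sketch, assuming the expected routes are supplied one per line in a file (a real check would read them from pages.items) and that each `<loc>` sits on its own line, as in the generated sitemap shown earlier; `sitemap_routes` and `check_routes` are hypothetical names:

```shell
# sitemap_routes SITEMAP DOMAIN
# Prints the path part of every <loc> under https://DOMAIN, one per line, sorted.
# DOMAIN is used as a regex, so its dots match any character; good enough for a sketch.
sitemap_routes() {
  sed -n "s|.*<loc>https://$2\(/[^<]*\)</loc>.*|\1|p" "$1" | sort
}

# check_routes SITEMAP DOMAIN ROUTES_FILE
# Succeeds only if the sitemap lists exactly the routes in ROUTES_FILE.
check_routes() {
  local expected actual
  expected="$(sort "$3")"
  actual="$(sitemap_routes "$1" "$2")"
  [ "$expected" = "$actual" ]
}
```

Because only entries under the given domain are extracted, a sitemap with a missing or extra route fails the comparison.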

4. Build Integration

  • generate-instance.sh calls generate-sitemap.sh at line 407 ✅
  • new-instance.sh calls generate-sitemap.sh at line 480 ✅

5. Notes

  • Domain discrepancy: The pre-committed sitemap.xml in the branch uses contractpilot.ai, but the config (contractpilot.json) defines product.domain as contractpilot.xyz. Dynamic generation correctly uses the config value. The static file should be regenerated before merge.
  • Merge conflict: PR is currently marked as not mergeable (needs rebase onto current master).
  • This is a static site with no npm run build — sitemap generation is handled by the shell script, which was verified directly.
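
A stale committed sitemap like the one in the first note could be caught before merge by listing every `<loc>` whose host differs from the configured domain. A rough sketch under the same one-`<loc>`-per-line assumption; `foreign_locs` is a hypothetical helper, not something in the branch:

```shell
# foreign_locs SITEMAP DOMAIN
# Prints every <loc> URL whose host is not DOMAIN; prints nothing if all match.
foreign_locs() {
  sed -n 's|.*<loc>\([^<]*\)</loc>.*|\1|p' "$1" |
  while IFS= read -r url; do
    case "$url" in
      "https://$2"|"https://$2/"*) ;;      # host matches the configured domain
      *) printf '%s\n' "$url" ;;           # anything else is reported
    esac
  done
}
```

It could be fed the config value directly, for example `foreign_locs instances/contractpilot/sitemap.xml "$(jq -r '.product.domain' instances/contractpilot.json)"`, with any output treated as a failure.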

Automated build verification by agent-bot. Supersedes stale issues #50, #51, #64.

Author
Owner

Build Verification — PR #16 (feature/auto-sitemap-generation)

Agent: agent-bot | Date: 2026-04-11

Smoke Test Results

=== ContractPilot Website Template Smoke Tests ===

[1/4] nginx config syntax check...
  WARN: nginx not installed, skipping config validation
[2/4] Checking for unfilled placeholders in instances/...
  PASS: No unfilled placeholders found
[3/4] Checking instance HTML files exist and are non-empty...
  PASS: contractpilot/index.html — 29634 bytes
[4/4] Checking deploy.sh for hardcoded home paths...
  PASS: No hardcoded /home/* paths in deploy.sh

PASS: All smoke tests passed

Dockerfile Reference Check

Note: Docker not available in CI sandbox. File existence check for Dockerfile COPY targets:

  • privacy.html — MISSING (template-level, expected)
  • terms.html — MISSING (template-level, expected)
  • cookies.html — MISSING (template-level, expected)
  • 404.html — MISSING (template-level, expected)
  • 50x.html — MISSING (template-level, expected)
  • robots.txt — EXISTS
  • sitemap.xml — MISSING (this PR adds generation script but sitemap.xml not pre-generated at root)
  • fonts/ — MISSING (template-level, expected)
  • img/ — MISSING (template-level, expected)

Verdict: SMOKE TESTS PASS

No npm/node build system — static site served via nginx Docker. All smoke tests pass. Missing Dockerfile COPY targets at repo root are expected (they exist in instance directories).

pook closed this pull request 2026-04-21 20:29:07 -04:00

Reference
pook/website-template!16