Add generated document content hash to detect duplicate generation requests #58

Closed
opened 2026-04-08 15:50:57 -04:00 by pook · 3 comments
Owner

Critical for billing: Without deduplication, users could be charged multiple times for identical document generation requests if they double-click or retry.

Implementation:

  • Before generating a document, compute a SHA-256 hash of the normalized input parameters (business name, document type, jurisdiction, all form fields)
  • Check the database for an existing document with the same input hash generated within the last 24 hours
  • If found, return the existing document instead of generating (and billing) a new one
  • Add input_hash column to the documents table with an index
  • Return header X-Document-Cache: HIT or MISS to indicate whether deduplication occurred
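
A minimal sketch of the hash step, for discussion only. The issue does not define what "normalized" means, so the rule here (Unicode NFC, trimmed surrounding whitespace, keys sorted during serialization) is an assumption, and `compute_input_hash` and `_normalize` are hypothetical names rather than existing code in this repo.

```python
import hashlib
import json
import unicodedata


def _normalize(value):
    """Recursively normalize values so equivalent inputs serialize identically.

    Assumed rule: strings are NFC-normalized and stripped of surrounding
    whitespace; dicts and lists are normalized element by element.
    """
    if isinstance(value, str):
        return unicodedata.normalize("NFC", value).strip()
    if isinstance(value, dict):
        return {str(k): _normalize(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [_normalize(v) for v in value]
    return value


def compute_input_hash(business_name, document_type, jurisdiction, form_fields):
    """Return the SHA-256 hex digest of the normalized request parameters."""
    payload = {
        "business_name": business_name,
        "document_type": document_type,
        "jurisdiction": jurisdiction,
        "form_fields": form_fields,
    }
    # sort_keys makes the serialization independent of dict insertion order.
    canonical = json.dumps(
        _normalize(payload), sort_keys=True, separators=(",", ":")
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Whether case or internal whitespace should also be folded depends on whether those differences change the generated document, so this sketch leaves them significant.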

Acceptance criteria:

  • Duplicate requests within 24h return cached result without re-generation
  • input_hash column added with migration
  • Cache hit/miss indicated in response header
  • Unit tests for hash computation and dedup logic
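
One way the lookup and header behavior could be exercised in a unit test, sketched against an in-memory SQLite database. The table layout, the `created_at` column (stored as epoch seconds), the index name, and the `init_schema` / `get_or_generate` helpers are all assumptions for illustration; the real schema and migration tooling are not described in this issue.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

DEDUP_WINDOW = timedelta(hours=24)


def init_schema(conn):
    """Create a stand-in documents table with an indexed input_hash column."""
    conn.execute(
        "CREATE TABLE documents ("
        "id INTEGER PRIMARY KEY, "
        "input_hash TEXT NOT NULL, "
        "content TEXT NOT NULL, "
        "created_at REAL NOT NULL)"  # epoch seconds, assumed representation
    )
    conn.execute(
        "CREATE INDEX idx_documents_input_hash "
        "ON documents (input_hash, created_at)"
    )


def get_or_generate(conn, input_hash, generate, now=None):
    """Return (content, headers), reusing a document from the last 24 hours.

    `generate` is a zero-argument callable that is invoked (and would be
    billed) only on a cache miss.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = (now - DEDUP_WINDOW).timestamp()
    row = conn.execute(
        "SELECT content FROM documents "
        "WHERE input_hash = ? AND created_at >= ? "
        "ORDER BY created_at DESC LIMIT 1",
        (input_hash, cutoff),
    ).fetchone()
    if row is not None:
        return row[0], {"X-Document-Cache": "HIT"}
    content = generate()
    conn.execute(
        "INSERT INTO documents (input_hash, content, created_at) "
        "VALUES (?, ?, ?)",
        (input_hash, content, now.timestamp()),
    )
    conn.commit()
    return content, {"X-Document-Cache": "MISS"}
```

Note that this check-then-insert sequence is not atomic: two simultaneous requests (the double-click case) could both miss. Closing that gap would likely need a uniqueness constraint or a lock, which this sketch does not attempt.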

Generated by CEO Planner (priority: 2)

Author
Owner

Closing as duplicate of #69, which is the canonical content hash issue.

pook closed this issue 2026-04-08 21:32:17 -04:00
Author
Owner

Duplicate of #69 (content hash dedup).

Author
Owner

Closed: housekeeping batch via #154.

Reference
pook/compliancebot#58