General Document Policy
Policy for structuring miscellaneous files, internal memos, press releases, and general knowledge documents.
When to use it
Select this policy for any miscellaneous files, internal memos, press releases, custom forms, or general knowledge documents that do not fit the specialized categories but must be preserved exactly as written.
How the General Document Policy Protects Your Data
1. The "Literal Transcription" Rule
LLMs tend to summarize, paraphrase, and embellish text. In a RAG pipeline, summarization is dangerous because it destroys nuance. This policy enforces a hard constraint: DO NOT EVER rewrite, paraphrase, embellish, or interpret. The engine copies the exact wording, preserving the original meaning, terminology, dates, and number formats exactly as found in the source file.
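One way to spot-check this rule after ingestion is to confirm that every body line of the output appears verbatim in the source. The sketch below is illustrative only: the function names are hypothetical, whitespace is normalized so that line wrapping does not count as a rewrite, and Markdown heading lines are skipped on the assumption that titles and section headings are generated rather than copied.

```python
import re


def normalize_whitespace(text: str) -> str:
    """Collapse runs of whitespace so re-wrapped lines still compare equal."""
    return re.sub(r"\s+", " ", text).strip()


def find_non_literal_lines(source: str, output: str) -> list[str]:
    """Return output lines whose text does not appear verbatim in the source.

    Heading lines (starting with '#') are skipped; this is an assumption
    about how the generated Markdown is structured, not a documented rule.
    """
    normalized_source = normalize_whitespace(source)
    violations = []
    for line in output.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        if normalize_whitespace(stripped) not in normalized_source:
            violations.append(stripped)
    return violations
```

An empty result means every body line was found in the source. Any returned line is a candidate paraphrase worth reviewing by hand.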
2. Auto-Titling for Semantic Search
To make miscellaneous documents highly retrievable, the policy analyzes the raw text and generates an accurate, specific title. This title acts as a strong semantic signal in the vector database, so that when an n8n workflow searches for a specific concept, this document is more likely to rank where it should.
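The source does not describe how the title is weighted internally. One common way to let a title influence retrieval is to prefix it to every chunk before embedding, so each vector carries the document's topic. The sketch below assumes that approach; `chunk_with_title` and `max_chars` are hypothetical names, not part of AgentBrains or n8n.

```python
def chunk_with_title(title: str, markdown_body: str, max_chars: int = 800) -> list[str]:
    """Group paragraphs into chunks and prefix each chunk with the title.

    Paragraphs are split on blank lines and packed greedily until adding
    the next one would exceed max_chars. A single oversized paragraph is
    kept whole rather than cut mid-sentence.
    """
    chunks = []
    current = ""
    for paragraph in markdown_body.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = paragraph
        else:
            current = candidate
    if current:
        chunks.append(current)
    # Every chunk starts with the same title line before it is embedded.
    return [f"# {title}\n\n{chunk}" for chunk in chunks]
```

Because the paragraphs themselves are never edited, this kind of chunking stays compatible with the literal transcription rule.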
3. Structural Cleanup (No Duplication)
Miscellaneous documents often contain repetitive headers, OCR artifacts, or duplicated clauses. The engine acts as an information organizer, ensuring that each data point appears only once, in the most appropriate section. It converts the chaos into clean, LLM-readable Markdown.
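The deduplication step can be pictured with a minimal sketch that keeps the first occurrence of each paragraph and drops later repeats. It is an illustration under simple assumptions (paragraphs are separated by blank lines, and duplicates differ only in case or whitespace), and the function name is hypothetical.

```python
import re


def drop_duplicate_paragraphs(text: str) -> str:
    """Keep the first occurrence of each paragraph and drop later repeats.

    Paragraphs are compared after collapsing whitespace and case-folding,
    but the kept paragraph is emitted with its original wording.
    """
    paragraphs = re.split(r"\n\s*\n", text.strip())
    seen = set()
    kept = []
    for paragraph in paragraphs:
        key = " ".join(paragraph.split()).casefold()
        if not key or key in seen:
            continue
        seen.add(key)
        kept.append(paragraph.strip())
    return "\n\n".join(kept)
```

Comparing on a normalized key while emitting the original text removes repeats without rewording anything that survives.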
4. The Universal Inventory Firewall
As a standard AgentBrains safety measure, this policy actively hunts for and redacts any mention of product inventory or stock levels. Even if a press release states "we have 5,000 units in our warehouse," that statement is stripped out. This ensures your AI agent never promises stock to a customer based on an outdated memo.
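As a rough illustration of the idea, the sketch below drops any sentence that matches a small set of inventory-related phrases. The pattern list and function name are hypothetical and far narrower than a production filter would need to be. The source does not say how the actual firewall detects inventory mentions.

```python
import re

# Hypothetical phrase list for illustration; a real filter would need
# broader coverage and review of false positives.
INVENTORY_PATTERN = re.compile(
    r"\b(in stock|out of stock|stock levels?|inventory"
    r"|units? (?:in|at) (?:our|the) warehouse)\b",
    re.IGNORECASE,
)


def redact_inventory_sentences(paragraph: str) -> str:
    """Remove whole sentences that mention inventory or stock levels.

    Sentences are split on terminal punctuation followed by whitespace,
    so decimals such as "$2.4 million" are not split.
    """
    sentences = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    kept = [s for s in sentences if not INVENTORY_PATTERN.search(s)]
    return " ".join(kept)
```

Removing the whole sentence, rather than only the number, avoids leaving a fragment that an agent could still read as a stock claim.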
What the Output Looks Like
When your n8n workflow retrieves a General Document, it receives a perfectly literal, structurally clean Markdown file, complete with a semantic title.
# Press Release: 2025 Warehouse Expansion Initiative
## Overview
Bugsy's Company officially broke ground on the new 50,000-square-foot logistics center located in the West Loop industrial sector on March 15, 2025.
## Timeline and Investment
The project represents a $2.4 million capital investment. Phase 1 of construction is scheduled for completion on August 30, 2025. The facility will exclusively handle outbound commercial freight and will not be open for retail customer pickups.
## Environmental Compliance
The facility conforms strictly to the 2025 EPA Green Building Code (Title 24, Part 6), utilizing high-efficiency R-410A commercial HVAC systems.
Why This Matters for Automation Developers
By applying the General Document Policy as your fallback ingestor, you guarantee a baseline of quality across your entire RAG architecture:
- No Data Loss: Because the engine is forbidden from summarizing, you never have to worry that the AI "compressed" a crucial legal clause out of a document during ingestion.
- Universal Formatting: Whether the client gives you a messy .txt file, a scraped URL, or a poorly formatted Word doc, your workflows will always receive clean, predictable Markdown in return.
- Semantic Anchoring: The auto-generated titles ensure that even vaguely named source files (like doc_v4_final.pdf) are transformed into highly searchable vector assets (like 2025 Warehouse Expansion Initiative).