Data Classification Administration¶
OpenClaw Enterprise enforces a four-level data classification system. Classification travels with data at all times -- summaries, derivatives, cached results, and agent-to-agent exchanges all carry the classification of their source material. This is a constitutional requirement of the system.
Classification Levels¶
| Level | Label | Description | Example Data |
|---|---|---|---|
| 1 | public |
Information intended for public access. No restrictions on model routing or sharing. | Public GitHub issues, open-source documentation, published blog posts |
| 2 | internal |
General business information. Default level for most connectors. | Internal emails, calendar events, Jira tickets, Google Drive documents |
| 3 | confidential |
Sensitive business information requiring access controls and restricted model routing. | Financial reports, HR documents, customer PII, security configurations |
| 4 | restricted |
Highest sensitivity. Strictest controls on access, model routing, and sharing. | Credentials, encryption keys, legal holds, M&A documents, medical records |
Key Rule: Data classification can only be upgraded (moved to a higher level), never downgraded, through automated processes. Only an admin override with a logged reason can change classification in any direction.
Three-Layer Assignment Pipeline¶
Data classification is assigned through a three-layer pipeline. Each layer can only upgrade (increase) the classification set by the previous layer, with one exception: admin overrides can set any level.
Layer 1: Connector Default
│
▼
Layer 2: AI Reclassification (upgrade only)
│
▼
Layer 3: Admin Override (any direction, reason required)
│
▼
Final Classification
Layer 1: Connector Defaults¶
Each connector assigns a default classification to all data it ingests:
| Connector | Default Classification | Rationale |
|---|---|---|
| Gmail | internal |
Email content is business-internal by default |
| Google Calendar | internal |
Calendar events contain meeting details and attendees |
| Jira | internal |
Tickets contain project details and business logic |
| GitHub (public repos) | public |
Public repository data is publicly accessible |
| GitHub (private repos) | internal |
Private repository data is business-internal |
| Google Drive | internal |
Documents are business-internal by default |
These defaults are configurable via the data policy domain. See Policy Engine for details.
Layer 2: AI Reclassification¶
After connector ingestion, an AI classifier analyzes the content and may upgrade the classification. The AI classifier:
- Can only upgrade the classification (e.g.,
internaltoconfidential). It can never downgrade. - Scans for patterns indicating higher sensitivity: PII (names, emails, SSNs, phone numbers), financial data, credentials, API keys, legal language, medical information.
- Logs every reclassification decision to the audit trail with the reason for upgrade.
- Can be disabled per-policy by setting
allow_ai_reclassification: falsein thedatapolicy domain.
Example reclassification:
An email ingested by the Gmail connector is initially classified as internal (connector default). The AI classifier detects a Social Security number in the email body and upgrades the classification to confidential.
Layer 3: Admin Override¶
Administrators can override the classification of any data item to any level. This is the only mechanism that can downgrade a classification.
Requirements for admin override:
- Caller must have
org_adminorenterprise_adminrole. - A
reasonfield is required and cannot be empty. - The override, including the administrator identity, previous classification, new classification, and reason, is logged to the immutable audit trail.
curl -X PUT https://openclaw.example.com/api/v1/data/item_abc123/classification \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"classification": "public",
"reason": "Document has been approved for public release by legal team (LEGAL-2026-0142)"
}'
Response:
{
"item_id": "item_abc123",
"previous_classification": "internal",
"new_classification": "public",
"override_by": "admin@example.com",
"override_at": "2026-03-13T14:30:00Z",
"reason": "Document has been approved for public release by legal team (LEGAL-2026-0142)"
}
Classification Propagation¶
When data is processed, transformed, or summarized, the resulting output inherits the classification of its source material. This is enforced automatically and cannot be bypassed.
| Operation | Propagation Rule | Example |
|---|---|---|
| Summarization | Output inherits source classification | Summary of a confidential document is confidential |
| Multi-source aggregation | Output inherits the highest source classification | Summary of internal email + confidential doc = confidential summary |
| Cached/derived data | Inherits source classification | Cached search results retain original classification |
| Agent-to-agent exchange | Classification travels with the data envelope | OCIP message carries source classification |
| Copy/export | Retains original classification | Exported data retains its classification label |
Important: There is no concept of "declassification by transformation." Summarizing a confidential document does not make the summary internal. The summary is confidential.
Model Routing¶
Data classification directly controls which AI models can process the data. This is enforced by the policy engine at the point of model invocation.
| Classification | Default Model Routing | Override Available |
|---|---|---|
public |
Any allowed model (external or self-hosted) | N/A |
internal |
Any allowed model (external or self-hosted) | N/A |
confidential |
Self-hosted models only | Enterprise policy can explicitly allow specific external models |
restricted |
Self-hosted models only | Enterprise policy can explicitly allow specific external models |
Default Behavior¶
By default, confidential and restricted data is only sent to self-hosted models. This prevents sensitive data from being transmitted to third-party model providers.
The models policy domain controls which models are approved for classified data:
{
"domain": "models",
"rules": {
"allowed_models": ["gpt-4", "claude-3-opus", "llama-3-70b"],
"confidential_data_models": ["llama-3-70b"],
"restricted_data_models": ["llama-3-70b"]
}
}
In this example, only the self-hosted llama-3-70b model can process confidential or restricted data. Public and internal data can use any of the three allowed models.
Overriding Model Routing¶
An enterprise_admin can explicitly allow external models for classified data by adding them to the confidential_data_models or restricted_data_models arrays. This should only be done when the external model provider has been vetted and contractual data protection agreements are in place.
{
"domain": "models",
"rules": {
"confidential_data_models": ["llama-3-70b", "claude-3-opus"],
"restricted_data_models": ["llama-3-70b"]
},
"change_reason": "Allow Claude for confidential data per DPA signed 2026-03-01 (CONTRACT-2026-0089)"
}
Agent-to-Agent Exchange Enforcement¶
When OpenClaw instances communicate via the OCIP protocol, data classification is enforced at the sender side:
- The sender constructs an OCIP envelope containing data items with their classifications.
- The policy engine checks the receiver's maximum allowed classification (configured in the
agent_to_agentpolicy domain viamax_classification_outbound). - Any data item with a classification above the receiver's clearance is filtered out of the envelope before sending.
- The filtering action is logged to the audit trail.
Example:
Sender policy: max_classification_outbound = "internal"
OCIP envelope contains:
- Item A: public → INCLUDED
- Item B: internal → INCLUDED
- Item C: confidential → FILTERED OUT
- Item D: restricted → FILTERED OUT
The receiver never sees Items C and D. The sender's audit log records that these items were filtered due to classification policy.
Configuration via Policy¶
Data classification behavior is controlled through the data policy domain:
{
"domain": "data",
"scope": { "level": "enterprise" },
"status": "active",
"rules": {
"default_classification": "internal",
"allow_ai_reclassification": true,
"ai_reclassification_direction": "upgrade_only",
"retention_days": 365,
"ephemeral_data_ttl_hours": 24,
"allow_data_export": true,
"export_max_classification": "internal"
},
"change_reason": "Enterprise baseline data classification policy"
}
| Field | Description | Default |
|---|---|---|
default_classification |
Classification assigned when no connector default or AI classification applies | internal |
allow_ai_reclassification |
Whether the AI classifier is enabled | true |
ai_reclassification_direction |
Direction AI can reclassify (upgrade_only is the only supported value) |
upgrade_only |
retention_days |
Minimum data retention period in days | 365 |
ephemeral_data_ttl_hours |
Time-to-live for ephemeral/cached data in hours | 24 |
allow_data_export |
Whether data export is permitted | true |
export_max_classification |
Maximum classification level that can be exported via the export API | internal |
Troubleshooting¶
Data Classified Higher Than Expected¶
- Check the connector default for the data source (see table above).
- Check the audit log for AI reclassification events on the data item.
- The AI classifier may have detected sensitive patterns. Review the reclassification reason in the audit entry.
- If the classification is incorrect, use an admin override to set the correct level (requires a reason).
Confidential Data Sent to Wrong Model¶
- Verify the
confidential_data_modelslist in the activemodelspolicy. - Check for org-level or team-level policies that may have modified the model list (hierarchy merge uses intersection -- a lower-level policy cannot add models not in the parent).
- Review the audit log for the model invocation to see which policy was applied.
Agent Exchange Missing Data¶
If a receiving OpenClaw instance reports missing data from an OCIP exchange:
- Check the sender's
max_classification_outboundin theagent_to_agentpolicy. - Review the sender's audit log for classification filtering events.
- The missing data items likely had a classification above the outbound maximum and were filtered.