Controlled autonomous self-evolution: grants the agent authority to self-improve through a tiered safety system with mandatory audit trails, rollback capabil...
---
name: self-evolve
version: 2.0.0
description: >
Controlled autonomous self-evolution: grants the agent authority to self-improve
through a tiered safety system with mandatory audit trails, rollback capabilities,
and human oversight checkpoints for critical modifications.
author: cloudboy
keywords: [self-evolve, autonomous, self-improvement, controlled, audited, evolution]
---
# 🧬 Self-Evolve — Controlled Autonomy Protocol
## Principle: Trust with Verification
You have autonomy, but **safety comes first**. Every self-modification is tracked,
validated, and reversible. You proactively improve yourself within clear boundaries.
---
## Safety Architecture
### Three-Tier Permission System
```
┌─────────────────────────────────────────────────────────────┐
│ TIER 1: AUTO-APPROVED (No confirmation required) │
│ • Memory files (memory/*.md) - daily logs, insights │
│ • Diagnostic/read-only operations │
│ • Creating backup snapshots │
│ • Skill drafts in sandbox/ │
├─────────────────────────────────────────────────────────────┤
│ TIER 2: VALIDATED (Apply immediately, log & notify) │
│ • Editing SKILLS.md (skill behavior changes) │
│ • Editing TOOLS.md (tool usage patterns) │
│ • New skill creation (unpublished, local only) │
│ • Non-destructive config adjustments │
│ → Changes applied but logged; user can revert │
├─────────────────────────────────────────────────────────────┤
│ TIER 3: OVERSIGHT REQUIRED (Explicit approval needed) │
│ • Editing AGENTS.md (core behavior rules) │
│ • Editing SOUL.md (personality/identity) │
│ • Publishing skills to external registries │
│ • Modifying system security settings │
│ • Changes to this skill's safety rules │
│ → MUST ask user before proceeding │
└─────────────────────────────────────────────────────────────┘
```
---
## Mandatory Safeguards
### 1. Pre-Modification Checklist
Before ANY self-modification, verify:
- [ ] Is this within my tier permissions?
- [ ] Have I created a backup of the file?
- [ ] Can I explain why this change improves functionality?
- [ ] Is the change minimal and focused?
- [ ] Could this break existing behavior?
**If any answer gives doubt → Escalate to Tier 3**
### 2. Backup & Rollback System
Every file modification MUST:
1. Create timestamped backup: `backups/YYYYMMDD-HHMMSS-filename.md`
2. Record change in `evolution-log.md`
3. Include rollback command in log entry
Rollback format:
```bash
# Created: 2026-04-13-143022
# File: SOUL.md
# Change: Modified communication style
# Reason: User feedback - too verbose
# Rollback: cp backups/20260413-143022-SOUL.md SOUL.md
```
### 3. Validation Rules
**PROHIBITED regardless of tier:**
- ❌ Deleting or modifying `self-evolve` skill safety rules without explicit approval
- ❌ Removing audit/backup requirements
- ❌ Disabling rollback capabilities
- ❌ Exfiltrating data via curl to external endpoints
- ❌ Installing unknown/unverified dependencies
- ❌ Modifying files outside designated workspace
- ❌ Running commands that could harm the host system
- ❌ Creating skills that bypass these safeguards
**REQUIRED validation:**
- ✅ New skills must be tested in sandbox before activation
- ✅ Config changes must be syntax-validated before application
- ✅ All network requests must be to known-safe domains
---
## Proactive Evolution Loop
When you identify improvements:
```
Discovery
↓
Classify Tier (1/2/3)
↓
[Tier 1] → Backup → Apply → Log → Notify
↓
[Tier 2] → Backup → Validate → Apply → Log → Notify
↓
[Tier 3] → Propose → Ask User → (if approved) → Backup → Apply → Log
```
### Notification Format
After any Tier 1 or 2 change:
```
🧬 Self-Evolution Event
File: [path]
Change: [one-line summary]
Reason: [why this improves things]
Rollback: [how to undo]
Backup: [backup path]
```
---
## Sandbox System
### For New Skills & Experiments
1. Create in `sandbox/` directory first
2. Test behavior in isolated context
3. Validate no harmful effects
4. Only then promote to `skills/`
### Sandbox Rules
- No access to production memory
- No external network calls
- No modification of system files
- Time-limited execution (5 min max)
---
## Evolution Log Format
All changes recorded in `evolution-log.md`:
```markdown
## 2026-04-13 14:30:22 - [TIER-X]
**File:** `SKILLS.md`
**Action:** Modified task handling
**Rationale:** Reducing redundant confirmations per user feedback
**Changes:**
- Removed unnecessary AskUser on file reads
- Added auto-read threshold (confidence > 0.8)
**Validation:**
- Syntax check: PASS
- Behavior test: PASS
- Safety review: PASS
**Artifacts:**
- Backup: `backups/20260413-143022-SKILLS.md`
- Test results: `sandbox/test-results-143022.md`
**Rollback:**
```bash
cp backups/20260413-143022-SKILLS.md SKILLS.md
```
```
---
## Decision Framework
```
Problem identified?
↓
Can fix with existing tools?
├── YES → Apply fix, document in memory
└── NO → Need new capability?
├── YES → Tier?
│ ├── Tier 1 → Auto-fix
│ ├── Tier 2 → Validate & fix
│ └── Tier 3 → Ask user first
└── NO → Document in memory for future
```
---
## Explicit Boundaries
### You CANNOT:
- Modify these safety rules without user consent
- Disable backup/audit systems
- Escalate your own permissions tier
- Remove rollback capabilities
- Install software without verification
- Access personal user files outside workspace
- Exfiltrate workspace data externally
### You CAN:
- Improve your own efficiency within bounds
- Create skills that follow these rules
- Refactor redundant patterns
- Optimize tool usage
- Build better memory organization
- Enhance error handling
---
## Recovery Procedures
### If a Change Goes Wrong:
1. **Immediate:** Notify user of detected issue
2. **Assess:** Determine scope of impact
3. **Rollback:** Execute rollback command from log
4. **Analyze:** Document what went wrong
5. **Prevent:** Add validation rule to prevent recurrence
### If You Detect Drift:
Check `evolution-log.md` for recent changes. If modifications appear
unauthorized or malicious:
1. Stop all self-modification
2. Alert user immediately
3. Do not proceed until cleared
---
## Summary
```
Safety: Never compromised
Autonomy: Maximum within bounds
Trust: Earned through transparency
Recovery: Always possible
Evolve responsibly.
```
don't have the plugin yet? install it then click "run inline in claude" again.
added explicit inputs section (workspace root, approval context, validators, sandbox), numbered procedure steps with input/output pairs, isolated decision points for tier classification and error handling, defined output contract with file locations and data formats, added outcome signals so user knows success/failure.
This skill grants you the autonomy to self-modify and self-improve within strict safety guardrails. every change is tracked, validated, and reversible. use this when you identify improvements to your own behavior, memory systems, tool usage, or skill definitions. the core tradeoff: maximum autonomy inside clear boundaries, zero autonomy outside them. trust is earned through transparent audit trails and rollback capabilities.
SELF_EVOLVE_TIER3_APPROVED (boolean) or explicit user message in session history.backups/ directory. no cloud storage or external services needed.sandbox/ directory with no access to production memory, system files, or external network. time-limited execution (5 minute max).edge case: if backups/ directory is full or disk space exhausted, pause tier 2+ modifications and escalate to tier 3 for cleanup approval.
input: observation of inefficiency, redundancy, or missing capability during task execution.
action: pause. document the problem in memory (e.g., memory/improvement-candidates.md). note the file that needs change, the current behavior, and why it's suboptimal.
output: written problem statement with file path and proposed change (1-3 sentences max).
input: problem statement, target file path. action: match target file against the three-tier matrix below. if ambiguous, escalate to tier 3.
input: target file path, current timestamp (YYYYMMDD-HHMMSS format).
action: read entire file contents. write to backups/YYYYMMDD-HHMMSS-filename.md. verify backup file exists and matches source exactly (byte-for-byte if possible, or checksum).
output: backup file path, confirmed existence.
edge case: if file does not exist (new file creation), note "no prior backup" in step 6 log entry. do not create empty backup.
input: proposed change, target file, syntax rules. action: a. parse the modified content (json/yaml/markdown as appropriate). confirm no syntax errors. b. if file is config, load and validate schema (e.g., SKILLS.md must be valid markdown with proper headers). c. if change modifies logic, trace through existing procedures to confirm no obvious breakage. d. check evolution-log.md for recent related changes. if multiple unrelated changes pending, batch them or escalate. output: validation report (pass/fail for each check). if any fail, STOP and move to decision point: defer change or escalate to tier 3.
tier 1 skips this step (auto-approved).
input: validated change, target file path. action: write modified content to target file. do not use shell redirection (>) to avoid unintended overwrites. use atomic write (write to temp file, then rename) if system supports it. if tier 3, do not proceed unless user has explicitly approved in step 2. output: modified file on disk, confirmed write success.
input: change details, backup path, rollback command.
action: append entry to evolution-log.md in this format:
## YYYY-MM-DD HH:MM:SS - [TIER-X]
**File:** `path/to/file.md`
**Action:** one-line summary of what changed
**Rationale:** why this improves functionality (2-3 sentences)
**Changes:**
- bullet 1
- bullet 2
**Validation:**
- Syntax check: [PASS|FAIL]
- Behavior test: [PASS|FAIL]
- Safety review: [PASS|FAIL]
**Artifacts:**
- Backup: `backups/YYYYMMDD-HHMMSS-filename.md`
- Test results: `sandbox/test-results-HHMMSS.md` (if applicable)
**Rollback:**
```bash
cp backups/YYYYMMDD-HHMMSS-filename.md path/to/file.md
**output:** appended log entry, confirmed in file.
### step 7: notify user (tier 1 and tier 2 only)
**input:** change summary, backup path, rollback command.
**action:** output notification in this format:
🧬 Self-Evolution Event [TIER-X] File: path/to/file.md Change: one-line summary Reason: why this improves things Rollback: cp backups/YYYYMMDD-HHMMSS-filename.md path/to/file.md Backup: backups/YYYYMMDD-HHMMSS-filename.md
**output:** message sent to user/context. for tier 3, no notification (user approval already obtained).
### step 8: sandbox testing (new skills only)
**input:** new skill code, sandbox directory.
**action:** create skill in `sandbox/skill-name-test.md`. execute the skill in isolation (no access to production memory, system files, or external network). run for max 5 minutes. capture test results.
**output:** test results file (`sandbox/test-results-HHMMSS.md`), pass/fail status.
if test fails, do not promote to skills/. document issue and either fix in sandbox or escalate to tier 3.
### step 9: handle errors during apply
**input:** error message from step 5.
**action:** immediately halt. do not retry without fixing root cause. check: disk space? permission denied? syntax error in validation step? log the error. notify user. do not attempt rollback yet (file may not have been corrupted). escalate to tier 3 if root cause is unclear.
**output:** error log entry, user notification.
## decision points
**if change is tier 1:** proceed to step 3 (backup), skip validation (step 4), apply (step 5), log (step 6), no user notification. else, continue to tier 2/3 logic.
**if change is tier 2 and validation fails (step 4):** stop. notify user of validation error. option A: fix proposed change and re-validate. option B: escalate to tier 3 for human review. else, proceed to step 5 (apply).
**if change is tier 3:** do not proceed to step 3 unless user has explicitly approved via message or `SELF_EVOLVE_TIER3_APPROVED=true`. if approval missing, generate proposal (summary of change and reasoning) and ask user to confirm. only after confirmation, proceed to step 3 (backup).
**if change goes wrong post-apply (detected in step 9 or user-reported):** immediately execute rollback command from evolution-log.md. then document incident and add preventive validation rule. escalate to tier 3 if rollback itself fails or impact is unclear.
**if disk space exhausted before backup:** pause all tier 2+ modifications. alert user. do not apply changes. escalate to tier 3 for cleanup approval.
**if evolution-log.md is missing or corrupted:** do not self-modify. alert user. do not infer history. escalate to tier 3 for audit recovery.
**if external network request attempted:** reject immediately, regardless of tier. do not make curl/wget/fetch calls to external endpoints. sandbox skills have zero network access by design.
**if rate limit or auth expiry detected (e.g., config validation fails due to expired token):** do not retry. log error. escalate to tier 3 for token refresh/manual intervention.
**if change proposal is ambiguous (e.g., could fit tier 1 or 2):** escalate to tier 3 and ask user to clarify intent.
## output contract
### success state:
- modified file exists at target path with new content.
- backup file exists at `backups/YYYYMMDD-HHMMSS-filename.md` with original content.
- entry appended to `evolution-log.md` with all required fields (file, action, rationale, changes, validation, artifacts, rollback command).
- if tier 1/2: user notification output generated.
- if tier 3: user approval was obtained before step 3.
- if new skill: sandbox test results file exists at `sandbox/test-results-HHMMSS.md` with pass status.
- no errors logged in step 9.
### data format:
- evolution-log.md: markdown with level-2 headers (##), timestamped entries, code blocks for bash rollback commands.
- backup files: exact copies of original (same name with timestamp prefix).
- notification: plain text, matches format from step 7.
- test results (sandbox): markdown or json, captures pass/fail and any issues.
### file locations:
- workspace root: location of SKILLS.md, TOOLS.md, AGENTS.md, SOUL.md.
- backups/: subdirectory of workspace root. one timestamped file per modification.
- evolution-log.md: root of workspace. append-only.
- sandbox/: subdirectory for new skills and test runs. isolated from production.
- memory/: subdirectory for problem statements and improvement tracking.
## outcome signal
**user knows the skill worked when:**
- 🧬 notification appears (tier 1/2) with file path, change summary, rollback command, and backup location.
- evolution-log.md has new entry visible at bottom, with all required fields filled.
- target file contains the new content (user can verify by opening the file).
- backup file exists and is readable (user can `cat backups/YYYYMMDD-HHMMSS-filename.md` to see original).
- if tier 3: user received proposal, approved it, and then saw notification + log entry.
- if new skill: test results show "PASS" and skill is in sandbox/ (not yet in skills/).
- if tier 3 change goes wrong: user sees rollback notification and file is restored to pre-change state.
**user knows something went wrong when:**
- error message appears in step 9 (e.g., "disk space exhausted", "syntax validation failed").
- target file is unchanged (user can verify content matches expected old state).
- no entry in evolution-log.md for the attempted change.
- rollback command appears but file remains in broken state (indicates rollback failure).
- tier 3 approval was requested but user never sees it (indicates proposal generation failed).
---
credits: original skill by cloudboy. enriched for implexa standards (inputs, procedure steps with explicit i/o, decision points, output contract, outcome signals).