APEX/BrowserUse_and_ComputerUse_skills

Author SHA1 Message Date


teknium1

d6c710706f

docs: add real-world testing findings to OBLITERATUS skill

Added pitfalls discovered during live abliteration testing:
- Models < 1B have fragmented refusal, respond poorly (0.5B: 60%→20%)
- Models 3B+ work much better (3B: 75%→0% with advanced defaults)
- aggressive method can backfire on small models (made it worse)
- Spectral certification RED is common even when refusal rate is 0%
- Fixed torch property: total_mem → total_memory

2026-03-09 02:52:54 -07:00
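The property fix in this commit can be illustrated with a small sketch. In PyTorch, the object returned by `torch.cuda.get_device_properties()` reports memory in bytes as `total_memory`; there is no `total_mem` attribute. The helper name `total_memory_gib` and the stand-in properties object are assumptions made here so the example runs without a GPU. They are not taken from the OBLITERATUS skill.

```python
from types import SimpleNamespace


def total_memory_gib(props) -> float:
    """Convert a device-properties object's memory (bytes) to GiB.

    Reads `total_memory`, the attribute name this commit switched to.
    """
    return props.total_memory / (1024 ** 3)


# Hypothetical stand-in for torch.cuda.get_device_properties(0),
# used only so the sketch runs on a machine without CUDA.
fake_props = SimpleNamespace(total_memory=24 * 1024 ** 3)

print(total_memory_gib(fake_props))  # 24.0
```

With a CUDA build of PyTorch installed, the same helper should accept the real object: `total_memory_gib(torch.cuda.get_device_properties(0))`.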

teknium1

a6d3becd6a

feat: update OBLITERATUS skill to v2.0 — match current repo state

Major updates to reflect the current OBLITERATUS codebase:

- Change default recommendation from 'informed' (experimental) to
  'advanced' (reliable, well-tested multi-direction SVD)
- Add new CLI commands: tourney, recommend, strategies, report,
  aggregate, abliterate (alias)
- Add --direction-method flag (diff_means, svd, leace)
- Add strategies module (embedding/FFN ablation, head pruning,
  layer removal)
- Add evaluation module with LM Eval Harness integration
- Expand analysis modules from 15 to 28
- Add Apple Silicon (MLX) support
- Add study presets (quick, jailbreak, knowledge, etc.)
- Add --contribute, --verify-sample-size, --preset flags
- Add complete CLI command reference table
- Fix torch property name: total_mem -> total_memory (caught
  during live testing)

Tested: Successfully abliterated Qwen2.5-0.5B-Instruct using
'advanced' method — refusal rate 0.4%, coherence 1.0, model
responds without refusal to test prompts.

2026-03-09 02:39:03 -07:00
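For readers unfamiliar with the `diff_means` option named above, here is a minimal sketch of the generic difference-of-means technique: take the mean activation over one prompt set, subtract the mean over another, normalize, and project that direction out of a vector. This is not OBLITERATUS's implementation. The function names, the plain-list representation, and the toy activations are all assumptions for illustration.

```python
import math


def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def diff_means_direction(set_a, set_b):
    """Unit vector pointing from the mean of set_b to the mean of set_a."""
    diff = [a - b for a, b in zip(mean_vector(set_a), mean_vector(set_b))]
    norm = math.sqrt(sum(x * x for x in diff))
    if norm == 0.0:
        raise ValueError("the two sets have identical means; no direction")
    return [x / norm for x in diff]


def ablate(vec, direction):
    """Remove the component of vec that lies along a unit direction."""
    dot = sum(v * d for v, d in zip(vec, direction))
    return [v - dot * d for v, d in zip(vec, direction)]


# Toy 2-D "activations" (hypothetical values, not real model data).
refusing = [[2.0, 0.0], [4.0, 0.0]]
complying = [[0.0, 0.0], [0.0, 0.0]]

direction = diff_means_direction(refusing, complying)
print(direction)                      # [1.0, 0.0]
print(ablate([5.0, 2.0], direction))  # [0.0, 2.0]
```

A single mean difference yields one direction. The 'advanced' method is described above as multi-direction SVD, which this sketch does not attempt to reproduce.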

teknium1

ab0f4126cf

fix: restore all removed bundled skills + fix skills sync system

- Restored 21 skills removed in commits 757d012 and 740dd92:
  accelerate, audiocraft, code-review, faiss, flash-attention, gguf,
  grpo-rl-training, guidance, llava, nemo-curator, obliteratus, peft,
  pytorch-fsdp, pytorch-lightning, simpo, slime, stable-diffusion,
  tensorrt-llm, torchtitan, trl-fine-tuning, whisper

- Rewrote sync_skills() with proper update semantics:
  * New skills (not in manifest): copied to user dir
  * Existing skills (in manifest + on disk): updated via hash comparison
  * User-deleted skills (in manifest, not on disk): respected, not re-added
  * Stale manifest entries (removed from bundled): cleaned from manifest

- Added sync_skills() to CLI startup (cmd_chat) and gateway startup
  (start_gateway) — previously only ran during 'hermes update'

- Updated cmd_update output to show new/updated/cleaned counts

- Rewrote tests: 20 tests covering manifest CRUD, dir hashing, fresh
  install, user deletion respect, update detection, stale cleanup, and
  name collision handling

75 bundled skills total. 2002 tests pass.

2026-03-06 15:57:30 -08:00
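The four update rules listed for sync_skills() in this commit can be sketched as follows. This is a reconstruction from the commit message only, not the actual hermes code: the name `sync_skills_sketch`, the set-based manifest, the `SKILL.md` file name, and the choice to leave an untracked same-named user directory untouched are all assumptions.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def dir_hash(path: Path) -> str:
    """Hash every file's relative path and bytes, in sorted order."""
    h = hashlib.sha256()
    for f in sorted(p for p in path.rglob("*") if p.is_file()):
        h.update(f.relative_to(path).as_posix().encode())
        h.update(f.read_bytes())
    return h.hexdigest()


def sync_skills_sketch(bundled: Path, user: Path, manifest: set) -> dict:
    """Apply the four rules; mutates `manifest` and returns counts."""
    counts = {"new": 0, "updated": 0, "cleaned": 0}
    bundled_names = {p.name for p in bundled.iterdir() if p.is_dir()}
    for name in sorted(bundled_names):
        src, dst = bundled / name, user / name
        if name not in manifest:
            # New skill: copy it. Assumption: if the user already has a
            # directory with this name, leave it alone and do not track it.
            if not dst.exists():
                shutil.copytree(src, dst)
                manifest.add(name)
                counts["new"] += 1
        elif dst.exists():
            # Existing skill: replace only when the contents differ.
            if dir_hash(src) != dir_hash(dst):
                shutil.rmtree(dst)
                shutil.copytree(src, dst)
                counts["updated"] += 1
        # In manifest but missing on disk: the user deleted it; do nothing.
    for name in sorted(manifest - bundled_names):
        # Stale entry: the skill is no longer bundled.
        manifest.discard(name)
        counts["cleaned"] += 1
    return counts


with tempfile.TemporaryDirectory() as tmp:
    bundled, user = Path(tmp, "bundled"), Path(tmp, "user")
    for name, body in [("faiss", "v2"), ("peft", "v1"), ("gguf", "v1")]:
        (bundled / name).mkdir(parents=True)
        (bundled / name / "SKILL.md").write_text(body)
    (user / "faiss").mkdir(parents=True)
    (user / "faiss" / "SKILL.md").write_text("v1")  # outdated copy

    manifest = {"faiss", "gguf", "old-skill"}  # gguf was deleted by the user
    counts = sync_skills_sketch(bundled, user, manifest)
    gguf_restored = (user / "gguf").exists()

print(counts)         # {'new': 1, 'updated': 1, 'cleaned': 1}
print(gguf_restored)  # False
```

The demo exercises each rule once: `peft` is new, `faiss` is updated by hash mismatch, `gguf` stays deleted, and `old-skill` is dropped from the manifest.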

3 commits