Skills can now declare runtime prerequisites (env vars, CLI binaries) via YAML frontmatter. Skills with unmet prerequisites are excluded from the system prompt so the agent never claims capabilities it can't deliver, and skill_view() warns the agent about what's missing.

Three layers of defense:
- build_skills_system_prompt() filters out unavailable skills
- _find_all_skills() flags unmet prerequisites in metadata
- skill_view() returns prerequisites_warning with actionable details

Tagged 12 bundled skills that have hard runtime dependencies: gif-search (TENOR_API_KEY), notion (NOTION_API_KEY), himalaya, imessage, apple-notes, apple-reminders, openhue, duckduckgo-search, codebase-inspection, blogwatcher, songsee, mcporter.

Closes #658
Fixes #630

82 lines
2.4 KiB
Markdown
---
name: songsee
description: Generate spectrograms and audio feature visualizations (mel, chroma, MFCC, tempogram, etc.) from audio files via CLI. Useful for audio analysis, music production debugging, and visual documentation.
version: 1.0.0
author: community
license: MIT
metadata:
  hermes:
    tags: [Audio, Visualization, Spectrogram, Music, Analysis]
    homepage: https://github.com/steipete/songsee
prerequisites:
  commands: [songsee]
---

# songsee

Generate spectrograms and multi-panel audio feature visualizations from audio files.

## Prerequisites

Requires [Go](https://go.dev/doc/install):

```bash
go install github.com/steipete/songsee/cmd/songsee@latest
```

Optional: `ffmpeg` for formats beyond WAV/MP3.

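Before running the tool, it can help to confirm that the required and optional binaries are on `PATH`. The following is a minimal sketch using the standard `command -v` builtin; the `report_cmd` helper name is hypothetical and not part of songsee or the skill loader.

```bash
# Hypothetical helper: print whether a command is available on PATH.
# Returns 0 when found, 1 when missing.
report_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: missing"
    return 1
  fi
}

# Example usage (uncomment to run):
# report_cmd songsee   # required
# report_cmd ffmpeg    # optional, only needed for formats beyond WAV/MP3
```

A missing `songsee` means the skill cannot run at all, while a missing `ffmpeg` only limits the input formats.
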
## Quick Start

```bash
# Basic spectrogram
songsee track.mp3

# Save to specific file
songsee track.mp3 -o spectrogram.png

# Multi-panel visualization grid
songsee track.mp3 --viz spectrogram,mel,chroma,hpss,selfsim,loudness,tempogram,mfcc,flux

# Time slice (start at 12.5s, 8s duration)
songsee track.mp3 --start 12.5 --duration 8 -o slice.jpg

# From stdin
cat track.mp3 | songsee - --format png -o out.png
```

## Visualization Types

Use `--viz` with comma-separated values:

| Type | Description |
|------|-------------|
| `spectrogram` | Standard frequency spectrogram |
| `mel` | Mel-scaled spectrogram |
| `chroma` | Pitch class distribution |
| `hpss` | Harmonic/percussive separation |
| `selfsim` | Self-similarity matrix |
| `loudness` | Loudness over time |
| `tempogram` | Tempo estimation |
| `mfcc` | Mel-frequency cepstral coefficients |
| `flux` | Spectral flux (onset detection) |

Multiple `--viz` types render as a grid in a single image.

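When the `--viz` value is assembled programmatically, a typo in one type name is easy to miss. The sketch below checks a comma-separated list against the types listed above before anything is passed to the CLI. `VALID_VIZ` and `validate_viz` are hypothetical names for illustration; how songsee itself handles an unknown type is not covered here.

```bash
# Visualization types documented for --viz.
VALID_VIZ="spectrogram mel chroma hpss selfsim loudness tempogram mfcc flux"

# Hypothetical helper: succeed only if every comma-separated entry
# in $1 is one of the documented visualization types.
validate_viz() {
  local rest="$1" item known found
  [ -n "$rest" ] || return 1
  while [ -n "$rest" ]; do
    item="${rest%%,*}"              # text before the first comma
    case "$rest" in
      *,*) rest="${rest#*,}" ;;     # drop the entry just read
      *)   rest="" ;;               # that was the last entry
    esac
    found=0
    for known in $VALID_VIZ; do
      if [ "$item" = "$known" ]; then found=1; fi
    done
    if [ "$found" -eq 0 ]; then
      echo "unknown viz type: $item" >&2
      return 1
    fi
  done
  return 0
}
```

A typical use would be `validate_viz "mel,chroma" && songsee track.mp3 --viz mel,chroma`.
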
## Common Flags

| Flag | Description |
|------|-------------|
| `--viz` | Visualization types (comma-separated) |
| `--style` | Color palette: `classic`, `magma`, `inferno`, `viridis`, `gray` |
| `--width` / `--height` | Output image dimensions |
| `--window` / `--hop` | FFT window and hop size |
| `--min-freq` / `--max-freq` | Frequency range filter |
| `--start` / `--duration` | Time slice of the audio |
| `--format` | Output format: `jpg` or `png` |
| `-o` | Output file path |

## Notes

- WAV and MP3 are decoded natively; other formats require `ffmpeg`
- Output images can be inspected with `vision_analyze` for automated audio analysis
- Useful for comparing audio outputs, debugging synthesis, or documenting audio processing pipelines
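
For the comparison use case, rendering several files with identical settings keeps the images directly comparable. This is a minimal sketch that uses only the flags documented above; `render_all` and the `DRY_RUN` variable are hypothetical, and the choice of `spectrogram,mel` is just an example.

```bash
# Hypothetical helper: render each input file to <out_dir>/<name>.png
# with the same visualization settings.
# Set DRY_RUN=1 to print the commands instead of running them.
render_all() {
  local out_dir="$1" f base
  shift
  for f in "$@"; do
    base="$(basename "${f%.*}")"    # file name without directory or extension
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "songsee $f --viz spectrogram,mel --format png -o $out_dir/$base.png"
    else
      songsee "$f" --viz spectrogram,mel --format png -o "$out_dir/$base.png"
    fi
  done
}
```

For example, `DRY_RUN=1 render_all out before.wav after.wav` prints the two commands that would run, which is a cheap way to check paths before generating images. The output directory is assumed to exist already.
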