---
name: internet-radio-music-db
version: 2.4.0
description: Internet radio music stream database — collect, store, and manage a database of music streams from internet-radio.com. ~1000+ streams, 29 genres, auto-population and availability checking. Works together with the Internet Radio Music Player skill — use both for full internet radio playback.
metadata:
  openclaw:
    requires:
      bins:
        - python3
    emoji: "🎵"
    homepage: https://clawhub.ai/skills/internet-radio-music-db
---
# Internet Radio Music DB — Music Stream Database (Internet Radio)
Skill for collecting and managing an internet radio stream database.
## Data source
**https://www.internet-radio.com/**, a large catalog of internet radio stations.
### How database population works
1. **Parallel station collection**: 10 threads run simultaneously, one per genre, each parsing up to 10 pages (`/stations/{genre}/`) and stopping at the first empty page (1000+ stations in total)
2. **Playlist extraction**: extracts the `.pls` playlist link from each station
3. **URL resolution**: builds a direct stream URL from the playlist (`http://server:port/stream`)
4. **Speed check**: all new streams are checked in parallel (60 workers) with a 4-second download; a stream passes with ≥50 KB received and ≥20 KB/s
5. **Saving**: data is saved to `state.json`
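Steps 2 and 3 above can be sketched as follows. This is a minimal illustration, not the code in `build_db.py`: the function name `first_stream_url` is hypothetical, and it assumes the playlist follows the common INI-style `.pls` layout with a `[playlist]` section and `File1=`, `File2=`, ... entries.

```python
import configparser
from typing import Optional


def first_stream_url(pls_text: str) -> Optional[str]:
    """Return the lowest-numbered FileN entry that is an HTTP(S) URL, or None."""
    # interpolation=None keeps '%' in URLs and titles literal;
    # strict=False tolerates playlists that repeat a key;
    # only '=' is a delimiter, so ':' inside URLs is never split on.
    parser = configparser.ConfigParser(
        interpolation=None, strict=False, delimiters=("=",)
    )
    try:
        parser.read_string(pls_text)
    except configparser.Error:
        return None  # not parseable as an INI-style playlist

    for section in parser.sections():
        if section.lower() != "playlist":
            continue
        # configparser lowercases option names, so File1 arrives as "file1".
        numbered = sorted(
            (int(key[4:]), value.strip())
            for key, value in parser[section].items()
            if key.startswith("file") and key[4:].isdecimal()
        )
        for _, url in numbered:
            if url.startswith(("http://", "https://")):
                return url
    return None


sample = (
    "[playlist]\n"
    "NumberOfEntries=2\n"
    "File1=http://server:8000/stream\n"
    "Title1=Station Name\n"
    "Length1=-1\n"
    "File2=http://backup:8000/stream\n"
    "Version=2\n"
)
print(first_stream_url(sample))
```

Taking the lowest-numbered entry is a design choice of this sketch; the real script may prefer a different entry or try several.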
### Availability checking
- All streams are checked in parallel (120 workers)
- A stream passes with ≥50 KB received in 4 sec and ≥20 KB/s (the same thresholds as population)
- Streams with `failed_checks >= 3` are automatically removed
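The rules above can be sketched as a few small functions. This is an illustration under stated assumptions, not the code in `check_availability.py`: all names are hypothetical, 1 KB is taken as 1024 bytes, and the `probe` callable (which would download from a stream for 4 seconds and return bytes received and seconds elapsed) is injected so the logic can be tried without network access. Whether the real script resets `failed_checks` after a successful check is not documented; this sketch leaves the counter unchanged.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime, timezone

MIN_BYTES = 50 * 1024      # >= 50 KB received (assuming 1 KB = 1024 bytes)
MIN_RATE = 20 * 1024       # >= 20 KB/s
MAX_FAILED_CHECKS = 3      # records reaching this count are dropped


def passes_speed_check(bytes_received: int, elapsed_seconds: float) -> bool:
    """Apply both thresholds to one measured download."""
    if elapsed_seconds <= 0:
        return False
    rate = bytes_received / elapsed_seconds
    return bytes_received >= MIN_BYTES and rate >= MIN_RATE


def apply_check(record: dict, ok: bool) -> bool:
    """Update a record in place; return True if it should stay in the database."""
    record["available"] = ok
    record["last_checked"] = datetime.now(timezone.utc).isoformat(timespec="seconds")
    if not ok:
        record["failed_checks"] = record.get("failed_checks", 0) + 1
    return record.get("failed_checks", 0) < MAX_FAILED_CHECKS


def check_all(records: list, probe, workers: int = 120) -> list:
    """Probe every record in parallel and return the records to keep.

    `probe(record)` must return (bytes_received, elapsed_seconds) and should
    catch its own network errors, returning (0, 0.0) on failure.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(probe, records))
    kept = []
    for record, (nbytes, seconds) in zip(records, results):
        if apply_check(record, passes_speed_check(nbytes, seconds)):
            kept.append(record)
    return kept
```

For example, `check_all(records, probe=my_probe)` with a network-backed `my_probe` would return the surviving records, ready to be written back to `state.json`.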
### Stream record format
```json
{
"url": "http://server:8000/stream",
"name": "Station Name",
"genre": "rock",
"language": "en",
"available": true,
"source": "internet-radio.com",
"station_url": "https://www.internet-radio.com/station/xxx/",
"bitrate": 128,
"listeners": 42,
"audio_type": "mpeg",
"genres": ["classic rock", "blues"],
"added_at": "2026-05-23T18:00:00+00:00",
"last_checked": "2026-05-23T19:00:00+00:00",
"failed_checks": 0
}
```
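For code that consumes these records, a typed wrapper can make the fields explicit. The sketch below is illustrative only: the `StreamRecord` class and its default values are assumptions, not part of the skill's scripts, and the top-level layout of `state.json` is not described here, so only a single record is parsed.

```python
import json
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class StreamRecord:
    # Field names mirror the JSON keys above; defaults are assumptions.
    url: str
    name: str
    genre: str
    language: str = "en"
    available: bool = True
    source: str = "internet-radio.com"
    station_url: str = ""
    bitrate: int = 0
    listeners: int = 0
    audio_type: str = ""
    genres: List[str] = field(default_factory=list)
    added_at: str = ""
    last_checked: str = ""
    failed_checks: int = 0

    def last_checked_at(self) -> datetime:
        """Parse the ISO 8601 timestamp into a timezone-aware datetime."""
        return datetime.fromisoformat(self.last_checked)


EXAMPLE_JSON = """
{
  "url": "http://server:8000/stream",
  "name": "Station Name",
  "genre": "rock",
  "language": "en",
  "available": true,
  "source": "internet-radio.com",
  "station_url": "https://www.internet-radio.com/station/xxx/",
  "bitrate": 128,
  "listeners": 42,
  "audio_type": "mpeg",
  "genres": ["classic rock", "blues"],
  "added_at": "2026-05-23T18:00:00+00:00",
  "last_checked": "2026-05-23T19:00:00+00:00",
  "failed_checks": 0
}
"""

# Unknown keys would raise TypeError, which surfaces schema drift early.
record = StreamRecord(**json.loads(EXAMPLE_JSON))
print(record.name, record.bitrate, record.last_checked_at().isoformat())
```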
## Commands
```bash
# Populate the database (29 genres in parallel, ~1000+ streams)
python3 ~/.openclaw/skills/internet-radio-music-db/scripts/build_db.py
# Check availability of all streams (120 workers)
python3 ~/.openclaw/skills/internet-radio-music-db/scripts/check_availability.py
# Show database statistics (genres, languages, speed, top)
python3 ~/.openclaw/skills/internet-radio-music-db/scripts/show_stats.py
# Genre statistics (top-10)
python3 ~/.openclaw/skills/internet-radio-music-db/scripts/show_stats.py --genres --top 10
# Language distribution
python3 ~/.openclaw/skills/internet-radio-music-db/scripts/show_stats.py --lang
# Speed distribution
python3 ~/.openclaw/skills/internet-radio-music-db/scripts/show_stats.py --speed
# Top-10 fastest streams
python3 ~/.openclaw/skills/internet-radio-music-db/scripts/show_stats.py --top-speed 10
# List streams (by genre)
python3 ~/.openclaw/skills/internet-radio-music-db/scripts/cli.py list rock
# Add stream manually
python3 ~/.openclaw/skills/internet-radio-music-db/scripts/cli.py add <url> <name> <genre> [lang]
# Remove stream
python3 ~/.openclaw/skills/internet-radio-music-db/scripts/cli.py remove <url>
# Export database to JSON
python3 ~/.openclaw/skills/internet-radio-music-db/scripts/cli.py export backup.json
```
## Files
| File | Purpose |
|------|---------|
| `scripts/build_db.py` | Main database population script (duplicate protection by URL) |
| `scripts/check_availability.py` | Periodic availability check (removes on `failed_checks >= 3`) |
| `scripts/cli.py` | Stream management (list/add/remove/export) |
| `scripts/show_stats.py` | Database statistics (genres, languages, speed, top) |
| `state.json` | Stream database (JSON, not included in publication) |
## Cron tasks
Recommended setup via `openclaw cron`:
- **Database population** — every 4 hours (`0 */4 * * *`)
- **Availability check** — every 4 hours at 30 min offset (`30 */4 * * *`)
Example:
```bash
openclaw cron add --name "DB Population" --schedule "0 */4 * * *" --tz "Europe/Samara" \
--message "Run: python3 ~/.openclaw/skills/internet-radio-music-db/scripts/build_db.py"
openclaw cron add --name "Availability Check" --schedule "30 */4 * * *" --tz "Europe/Samara" \
--message "Run: python3 ~/.openclaw/skills/internet-radio-music-db/scripts/check_availability.py"
```
## Features
- During the availability check, slow streams are marked `available: false` and their `failed_checks` counter is incremented
- Streams with `failed_checks >= 3` are **automatically removed** from the database
- Speed criteria are the same for population and checking: ≥50 KB in 4 sec, ≥20 KB/s
- Language is determined by keywords in the station name
- Supports 29 genres, up to 10 pages per genre
- No duplicate URLs: each new URL is checked against the database during population and against `seen_urls` within the current run
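The duplicate protection in the last bullet can be sketched as below. This is a minimal illustration of the idea, not the code in `build_db.py`: the function name `merge_new_streams` is hypothetical, and records are assumed to be plain dicts with a `url` key. Seeding `seen_urls` from the existing database and updating it as candidates are accepted covers both cases, URLs already stored and URLs repeated within one run.

```python
def merge_new_streams(existing: list, candidates: list) -> list:
    """Return the candidates whose URL is neither stored nor already accepted."""
    # Start from every URL already in the database...
    seen_urls = {record["url"] for record in existing}
    added = []
    for candidate in candidates:
        url = candidate["url"]
        if url in seen_urls:
            continue  # already stored, or already accepted earlier in this run
        # ...and remember each accepted URL so a repeat in the same run is skipped.
        seen_urls.add(url)
        added.append(candidate)
    return added


existing = [{"url": "http://server:8000/stream", "name": "Station Name"}]
candidates = [
    {"url": "http://server:8000/stream", "name": "Same URL as stored"},
    {"url": "http://other:8000/stream", "name": "New station"},
    {"url": "http://other:8000/stream", "name": "Repeat within this run"},
]
print([c["name"] for c in merge_new_streams(existing, candidates)])
```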
## Changelog
### v2.3.0 (2026-05-27)
- Full English translation of SKILL.md, cli.py, show_stats.py, build_db.py
- Translated all user-facing output strings in cli.py
- Added `.clawhubignore` (excludes `state.json`, `__pycache__`, `*.pyc`, `.clawhub/`)
### v2.2.0 (2026-05-27)
- **Fixed duplicate bug in `build_db.py`** — added `seen_urls` protection to prevent re-adding the same URL within a single run (previously `url_exists` only checked against `state.json`, not new URLs from the current cycle)
- **Set `MAX_PAGES = 10`** — return to a reasonable limit (was 15)
- **Extended `.clawhubignore`** — added `__pycache__` and `*.pyc` (excluded from publication)
- Updated documentation
### v2.1.0 (2026-05-27)
- Increased page limit from 6 to 10 — ~2700+ stations per cycle
- Synchronized speed criteria — `check_availability.py` now uses the same values as `build_db.py`: ≥50 KB in 4 sec, ≥20 KB/s
- Removed `check_all.py` — legacy script with hardcoded path to old database
- Removed `ambient_boost.py` — legacy manual ambient station search script
- Added `.clawhubignore` for `state.json`
### v2.0.0
- Initial release
by @clawhub