Gemini Voice Assistant

Voice-to-voice AI assistant using Gemini Live API. Speak to the AI and get spoken responses. Use when you want to have natural voice conversations with an AI...

SkillRank score: 4.2 / 10

evaluated by implexa, claude-haiku-4-5 · 2026-07-05

gemini-voice-assistant wraps the gemini live api for bidirectional voice conversations. text mode and audio input are both supported through a command-line handler, which returns json containing an audio file path and the response text.

- structure: 3.0
- trigger phrases: 2.0
- procedure: 5.0
- edge cases: 2.0
- documentation: 5.0

SKILL.md

---
name: gemini-voice-assistant
description: Voice-to-voice AI assistant using Gemini Live API. Speak to the AI and get spoken responses. Use when you want to have natural voice conversations with an AI assistant powered by Google's Gemini models.
metadata:
  openclaw:
    emoji: "🎙️"
---

# Gemini Voice Assistant

A voice-to-voice AI assistant powered by Google's Gemini Live API. Speak to the AI and it responds with natural-sounding voice.

## Usage

### Text Mode

```bash
cd ~/.openclaw/agents/kashif/skills/gemini-assistant && python3 handler.py "Your question or message"
```

### Voice Mode

```bash
cd ~/.openclaw/agents/kashif/skills/gemini-assistant && python3 handler.py --audio /path/to/audio.ogg "optional context"
```
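
The two invocations above can also be driven from another Python script. The following is a minimal sketch, assuming the handler prints its JSON response to standard output and accepts the arguments in the order shown above; `build_command` and `run_handler` are hypothetical helper names, not part of the skill.

```python
import subprocess
from pathlib import Path
from typing import List, Optional

# Skill directory taken from the usage examples above.
SKILL_DIR = Path.home() / ".openclaw/agents/kashif/skills/gemini-assistant"


def build_command(message: Optional[str] = None, audio: Optional[str] = None) -> List[str]:
    """Build the argv list for text mode (message only) or voice mode (--audio)."""
    if not message and audio is None:
        raise ValueError("provide a message, an audio file, or both")
    cmd = ["python3", "handler.py"]
    if audio is not None:
        cmd += ["--audio", audio]
    if message:
        # In voice mode the message acts as the optional context string.
        cmd.append(message)
    return cmd


def run_handler(message: Optional[str] = None, audio: Optional[str] = None) -> str:
    """Run the handler in the skill directory and return whatever it printed.

    Assumption: the JSON response is written to stdout.
    """
    result = subprocess.run(
        build_command(message, audio),
        cwd=SKILL_DIR,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems when the message contains spaces or quotes.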

## Response Format

The handler returns a JSON response:

```json
{
  "message": "[[audio_as_voice]]\nMEDIA:/tmp/gemini_voice_xxx.ogg",
  "text": "Text response from Gemini"
}
```
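
A caller usually needs the audio path and the text separately. The sketch below shows one way to split the response, assuming the `message` field always follows the two-line layout shown above (a `[[audio_as_voice]]` marker line and a `MEDIA:` line); `parse_handler_response` is a hypothetical helper, not part of the skill.

```python
import json

VOICE_MARKER = "[[audio_as_voice]]"
MEDIA_PREFIX = "MEDIA:"


def parse_handler_response(raw: str) -> dict:
    """Extract the audio path, the voice flag, and the text from the handler's JSON."""
    payload = json.loads(raw)
    audio_path = None
    as_voice = False
    for line in payload.get("message", "").splitlines():
        line = line.strip()
        if line == VOICE_MARKER:
            as_voice = True
        elif line.startswith(MEDIA_PREFIX):
            audio_path = line[len(MEDIA_PREFIX):]
    return {
        "audio_path": audio_path,  # None if no MEDIA: line was present
        "as_voice": as_voice,
        "text": payload.get("text", ""),
    }
```

If the handler ever returns a text-only response without a `MEDIA:` line, `audio_path` stays `None`, so callers can branch on that.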

## Configuration

Set your Gemini API key:

```bash
export GEMINI_API_KEY="your-api-key-here"
```

Or create a `.env` file in the skill directory:

```
GEMINI_API_KEY=your-api-key-here
```
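
How `handler.py` actually resolves the key is not shown here. As an illustration only, the lookup could work like the sketch below, which assumes the environment variable takes precedence over the `.env` file and that the file uses plain `NAME=value` lines; `resolve_api_key` is a hypothetical name.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_api_key(skill_dir: str = ".",
                    environ: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return GEMINI_API_KEY from the environment, else from <skill_dir>/.env."""
    environ = os.environ if environ is None else environ
    key = environ.get("GEMINI_API_KEY")
    if key:
        return key  # assumed precedence: exported variable wins

    env_file = Path(skill_dir) / ".env"
    if not env_file.is_file():
        return None
    for line in env_file.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not NAME=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name.strip() == "GEMINI_API_KEY":
            # Tolerate optional surrounding quotes around the value.
            return value.strip().strip("\"'") or None
    return None
```

Keep the `.env` file out of version control so the key is not committed by accident.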

## Model Options

The default model is `gemini-2.5-flash-native-audio-preview-12-2025`, which supports audio input and output.

To use a different model, edit `handler.py`:

```python
MODEL = "gemini-2.0-flash-exp"  # For text-only
```

## Requirements

- `google-genai>=1.0.0`
- `numpy>=1.24.0`
- `soundfile>=0.12.0`
- `librosa>=0.10.0` (for audio input)
- FFmpeg (for audio conversion)
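
A quick preflight check can confirm that the requirements above are present before running the handler. This is a sketch that only tests whether the modules are importable and whether an `ffmpeg` binary is on `PATH`; it does not verify the minimum versions, and the import names (for example `google.genai` for the `google-genai` package) are listed as assumptions.

```python
import importlib.util
import shutil

# Import names assumed for the packages listed above.
REQUIRED_MODULES = ("google.genai", "numpy", "soundfile", "librosa")
REQUIRED_BINARIES = ("ffmpeg",)


def missing_requirements(modules=REQUIRED_MODULES, binaries=REQUIRED_BINARIES):
    """Return the names of modules and executables that cannot be found."""
    missing = []
    for name in modules:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec raises if a parent package (e.g. "google") is absent.
            found = False
        if not found:
            missing.append(name)
    for binary in binaries:
        if shutil.which(binary) is None:
            missing.append(binary)
    return missing


if __name__ == "__main__":
    problems = missing_requirements()
    print("missing:", ", ".join(problems) if problems else "nothing")
```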

## Features

- 🎙️ Voice input/output support
- 💬 Text conversations
- 🔧 Configurable system instructions
- ⚡ Fast responses with Gemini Flash
