130 lines
4.1 KiB
Markdown
130 lines
4.1 KiB
Markdown
# EcoBot - Asistent Vocal AI pentru Scoala Verde
|
|
|
|
EcoBot este o aplicatie AI care ruleaza local pe calculatorul tau. Ii vorbesti, iar el iti raspunde cu voce, in timp ce un caracter animat se misca pe ecran.
|
|
|
|
## Cum functioneaza
|
|
|
|
```
|
|
Microfon → faster-whisper (voce → text)
|
|
→ LM Studio / LLM local (generare raspuns)
|
|
→ Edge TTS / Piper TTS (text → voce)
|
|
→ Redare audio + caracter animat
|
|
```
|
|
|
|
## Cerinte
|
|
|
|
- **Python 3.10+**
|
|
- **LM Studio** (sau alt server LLM compatibil OpenAI API)
|
|
- **Windows 10/11** (testat pe Windows 11)
|
|
- **Conexiune internet** (doar pentru Edge TTS si prima descarcare Whisper)
|
|
|
|
### Hardware recomandat
|
|
|
|
| Componenta | Minim | Recomandat |
|
|
|-----------|-------|------------|
|
|
| RAM | 8 GB | 16+ GB |
|
|
| GPU | - | NVIDIA 4+ GB VRAM |
|
|
| CPU | Orice modern | 6+ cores |
|
|
|
|
## Instalare
|
|
|
|
### 1. Cloneaza repo-ul
|
|
|
|
```bash
|
|
git clone https://github.com/<user>/scoala-verde.git
|
|
cd scoala-verde
|
|
```
|
|
|
|
### 2. Creeaza mediul virtual si instaleaza dependintele
|
|
|
|
```bash
|
|
python -m venv venv
|
|
venv\Scripts\activate
|
|
pip install -r backend/requirements.txt
|
|
```
|
|
|
|
### 3. Configureaza LM Studio
|
|
|
|
1. Deschide LM Studio
|
|
2. Descarca un model (recomandat: **Llama 3.1 8B Instruct** sau **Mistral 7B Instruct**)
|
|
3. Du-te la **Local Server** → incarca modelul → **Start Server**
|
|
|
|
### 4. Configureaza variabilele de mediu
|
|
|
|
```bash
|
|
copy backend\.env.example backend\.env
|
|
```
|
|
|
|
Editeaza `backend/.env` cu setarile tale (portul LM Studio, modelul Whisper, vocea TTS, etc.)
|
|
|
|
### 5. (Optional) Descarca vocea Piper pentru mod offline
|
|
|
|
Descarca fisierele vocii romanesti de la [Piper Voices](https://github.com/rhasspy/piper/blob/master/VOICES.md) si pune `.onnx` + `.onnx.json` in `backend/voices/`.
|
|
|
|
## Pornire
|
|
|
|
Dublu-click pe `start.bat` sau:
|
|
|
|
```bash
|
|
venv\Scripts\activate
|
|
cd backend
|
|
python main.py
|
|
```
|
|
|
|
Deschide browserul la **http://localhost:8000**
|
|
|
|
## Utilizare
|
|
|
|
- **Butonul albastru** — activeaza ascultarea continua. Spune **"EcoBot"** ca sa activezi, apoi pune intrebarea
|
|
- **Butonul verde** — tine apasat si vorbeste direct
|
|
- **Butonul ⚙** — setari audio (selectie microfon/difuzor, amplificare, test)
|
|
|
|
## Structura proiectului
|
|
|
|
```
|
|
scoala-verde/
|
|
├── backend/
|
|
│ ├── main.py # FastAPI server (WebSocket)
|
|
│ ├── stt.py # Speech-to-Text (faster-whisper)
|
|
│ ├── llm.py # Comunicare LM Studio API
|
|
│ ├── tts.py # Text-to-Speech (Edge TTS / Piper)
|
|
│ ├── voices/ # Modele voce Piper (.onnx)
|
|
│ ├── requirements.txt
|
|
│ ├── .env # Configuratie locala (nu se comite)
|
|
│ └── .env.example # Template configuratie
|
|
├── frontend/
|
|
│ ├── index.html # Pagina principala
|
|
│ ├── css/style.css
|
|
│ └── js/
|
|
│ ├── app.js # Logica principala + WebSocket
|
|
│ ├── audio.js # Captura microfon + redare audio
|
|
│ ├── character.js # Caracter animat (copac)
|
|
│ └── settings.js # Pagina setari audio
|
|
├── dev/ # Documentatie dezvoltare
|
|
├── start.bat # Script pornire Windows
|
|
└── README.md
|
|
```
|
|
|
|
## Configuratie (.env)
|
|
|
|
| Variabila | Default | Descriere |
|
|
|-----------|---------|-----------|
|
|
| `LLM_BASE_URL` | `http://localhost:1234/v1` | URL-ul serverului LLM |
|
|
| `LLM_MODEL` | `local-model` | Numele modelului in LM Studio |
|
|
| `WHISPER_MODEL` | `small` | Dimensiunea modelului Whisper (tiny/small/medium) |
|
|
| `WHISPER_DEVICE` | `cuda` | Dispozitiv Whisper (cuda/cpu) |
|
|
| `TTS_ENGINE` | `edge` | Engine TTS: `edge` (online) sau `piper` (offline) |
|
|
| `TTS_VOICE` | `ro-RO-EmilNeural` | Vocea Edge TTS |
|
|
| `SERVER_PORT` | `8000` | Portul serverului |
|
|
|
|
## Tehnologii
|
|
|
|
- **[faster-whisper](https://github.com/SYSTRAN/faster-whisper)** — Speech-to-Text
|
|
- **[LM Studio](https://lmstudio.ai/)** — Server LLM local
|
|
- **[Edge TTS](https://github.com/rany2/edge-tts)** — Text-to-Speech (Microsoft Neural)
|
|
- **[Piper TTS](https://github.com/rhasspy/piper)** — Text-to-Speech offline
|
|
- **[FastAPI](https://fastapi.tiangolo.com/)** — Backend Python
|
|
|
|
## Licenta
|
|
|
|
MIT
|