Provide examples to use alternative Text-to-Speech services #26
Comments
@Fu-u0718 says: |
Hi @Fu-u0718 ,
Here is an example for Azure:
import aiohttp
import asyncio
import io
from logging import getLogger, NullHandler
import traceback
import wave
import numpy
import sounddevice
from . import SpeechController

class VoiceClip:
    def __init__(self, text: str):
        self.text = text
        self.download_task = None
        self.audio_clip = None

class AzureSpeechController(SpeechController):
    def __init__(self, api_key: str, region: str, speaker_name: str="ja-JP-AoiNeural", speaker_gender: str="Female", lang="ja-JP", device_index: int=-1, playback_margin: float=0.1):
        self.logger = getLogger(__name__)
        self.logger.addHandler(NullHandler())
        self.api_key = api_key
        self.region = region
        self.speaker_name = speaker_name
        self.speaker_gender = speaker_gender
        self.lang = lang
        self.device_index = device_index
        self.playback_margin = playback_margin
        self.voice_clips = {}
        self._is_speaking = False

    async def download(self, voice: VoiceClip):
        url = f"https://{self.region}.tts.speech.microsoft.com/cognitiveservices/v1"
        headers = {
            "X-Microsoft-OutputFormat": "riff-16khz-16bit-mono-pcm",
            "Content-Type": "application/ssml+xml",
            "Ocp-Apim-Subscription-Key": self.api_key
        }
        ssml_text = f"<speak version='1.0' xml:lang='{self.lang}'><voice xml:lang='{self.lang}' xml:gender='{self.speaker_gender}' name='{self.speaker_name}'>{voice.text}</voice></speak>"
        data = ssml_text.encode("utf-8")
        async with aiohttp.ClientSession() as session:
            async with session.post(url, headers=headers, data=data) as response:
                if response.status == 200:
                    voice.audio_clip = await response.read()

    def prefetch(self, text: str):
        v = self.voice_clips.get(text)
        if v:
            return v
        v = VoiceClip(text)
        v.download_task = asyncio.create_task(self.download(v))
        self.voice_clips[text] = v
        return v

    async def speak(self, text: str):
        voice = self.prefetch(text)
        if not voice.audio_clip:
            await voice.download_task
        with wave.open(io.BytesIO(voice.audio_clip), "rb") as f:
            try:
                self._is_speaking = True
                data = numpy.frombuffer(
                    f.readframes(f.getnframes()),
                    dtype=numpy.int16
                )
                framerate = f.getframerate()
                sounddevice.play(data, framerate, device=self.device_index, blocking=False)
                await asyncio.sleep(len(data) / framerate + self.playback_margin)
            except Exception as ex:
                self.logger.error(f"Error at speaking: {str(ex)}\n{traceback.format_exc()}")
            finally:
                self._is_speaking = False

    def is_speaking(self) -> bool:
        return self._is_speaking
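A side note on the SSML string in download(): the text is interpolated into the XML as-is, so a response containing characters such as & or < would likely produce an invalid request body. The following is only a minimal sketch of one way to escape it with the standard library; build_ssml is a hypothetical helper name and is not part of AIAvatar or the Azure SDK.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(text: str, lang: str = "en-US",
               speaker_name: str = "en-US-AvaNeural",
               speaker_gender: str = "Female") -> str:
    """Build an SSML document with the text and attribute values escaped.

    Hypothetical helper: it mirrors the SSML layout used in download()
    above, but is not part of AIAvatar itself.
    """
    # quoteattr() escapes the value and wraps it in quotes;
    # escape() replaces &, < and > in the element text.
    return (
        f"<speak version='1.0' xml:lang={quoteattr(lang)}>"
        f"<voice xml:lang={quoteattr(lang)} "
        f"xml:gender={quoteattr(speaker_gender)} "
        f"name={quoteattr(speaker_name)}>"
        f"{escape(text)}</voice></speak>"
    )
```

If this approach suits your setup, the f-string in download() could be swapped for a call such as build_ssml(voice.text, self.lang, self.speaker_name, self.speaker_gender).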
app.avatar_controller.speech_controller = AzureSpeechController(
    AZURE_SUBSCRIPTION_KEY, AZURE_REGION,
    speaker_name="en-US-AvaNeural",
    speaker_gender="Female",
    lang="en-US",
    device_index=2 # Set output device number on your PC
)
However, I've found that AIAvatar has an issue handling English responses from ChatGPT. I will fix it soon. |
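The device_index value in the snippet above depends on the machine. As a rough sketch, the output devices and their numbers can be listed with sounddevice.query_devices(); the helper names output_devices and print_output_devices below are hypothetical and exist only for illustration.

```python
from typing import Iterable, List, Mapping, Tuple


def output_devices(devices: Iterable[Mapping]) -> List[Tuple[int, str]]:
    """Return (index, name) for every device that has output channels.

    `devices` is expected to look like the result of
    sounddevice.query_devices(): a sequence of dicts with the keys
    "name" and "max_output_channels", where the position in the
    sequence is the device number.
    """
    return [
        (index, str(dev["name"]))
        for index, dev in enumerate(devices)
        if dev.get("max_output_channels", 0) > 0
    ]


def print_output_devices() -> None:
    """Print the output devices found on this machine."""
    # Imported here so output_devices() can be used without PortAudio.
    import sounddevice

    for index, name in output_devices(sounddevice.query_devices()):
        print(index, name)
```

Calling print_output_devices() once and passing the number of the desired speaker or headset as device_index should be enough in most setups.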
I've fixed it👍 |
Thank you! I learned a lot from this. I would also like to enjoy conversations in English. Thank you for taking the time out of your busy schedule to respond! |
Hi, I tried the OpenAI speech service, but it got stuck at [INFO] 2024-07-15 17:28:44,009 : Listening... (OpenAIWakewordListener) |
Hi @mosu7, |
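I would like to have conversations with the avatar not only in Japanese but also in English, but I learned that VOICEVOX, which the code uses, cannot speak English. Have you made any programs that use, for example, Google or Azure Text-to-Speech instead?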