Hi, I am trying to build an AI voice agent using only open-source tools. I plan to use Kokoro TTS for text-to-speech and the code linked below in place of an LLM, but I am not sure about STT or what else might be needed. Should I proceed with this approach or not? I considered Wav2Vec2 for STT, but I don't know whether it will be fast enough for customer care calls on a Ryzen 5 5600G with 16 GB RAM and no graphics card. For TTS, I will use OCR to extract the patient name and other changeable information in advance and convert it to audio ahead of time, but I am not sure about the STT part.
GitHub repository link for the code: https://github.com/hey12301/replaceHumanity_project/blob/main/testing_calling_process.py
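
One way to check whether an STT model is fast enough on this CPU would be to measure its real-time factor (processing time divided by audio duration); a value below 1.0 means transcription finishes faster than the audio plays. Below is a minimal sketch of that measurement. The names `real_time_factor` and `dummy_transcribe` are hypothetical, and the stand-in function would need to be replaced with the actual Wav2Vec2 (or other STT) call, which I have not benchmarked.

```python
import time
from typing import Callable, Sequence


def real_time_factor(
    transcribe: Callable[[Sequence[float]], str],
    audio: Sequence[float],
    sample_rate: int,
    runs: int = 3,
) -> float:
    """Return the best (lowest) real-time factor over several runs.

    RTF = processing time / audio duration. Below 1.0 means the
    transcriber is faster than real time on this machine.
    """
    if sample_rate <= 0 or len(audio) == 0 or runs < 1:
        raise ValueError("need non-empty audio, a positive sample rate, and runs >= 1")

    audio_seconds = len(audio) / sample_rate
    best_elapsed = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        transcribe(audio)
        best_elapsed = min(best_elapsed, time.perf_counter() - start)
    return best_elapsed / audio_seconds


def dummy_transcribe(samples: Sequence[float]) -> str:
    # Hypothetical stand-in: replace with the real STT model call.
    return ""


if __name__ == "__main__":
    # Five seconds of silence at 16 kHz as placeholder input (assumption:
    # the real test should use recorded call audio instead).
    silence = [0.0] * (16000 * 5)
    rtf = real_time_factor(dummy_transcribe, silence, sample_rate=16000)
    print(f"real-time factor: {rtf:.4f}")
```

For a live call the number that matters is probably the RTF on short utterances of a few seconds, since that determines how long the caller waits before the agent replies.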
