netlinux-ai

Notes from a TTS fine-tuning project: getting F5-TTS and StyleTTS2 running on a 2010-era CPU (AMD Phenom II X6, no AVX2) with an RTX 3060, fine-tuning each on a small Northern English corpus, and comparing what each architecture actually learns.

The work produced two production scripts, tts.sh (StyleTTS2, for clean phonetics) and tts-f5.sh (F5-TTS, for stronger accent commitment), plus several upstream patches and documentation pieces published as gists and PRs.

Posts
