This site collects notes, patches, and write-ups from a small TTS fine-tuning project. The hardware (“grahams-brain”) is a 2010-era AMD Phenom II X6 with an RTX 3060 12 GB — a deliberately constrained rig that forces every piece of the modern Python AI stack to be examined for compatibility.

The corpus is ~3 hours of clean British (Bolton/Lancashire) audio from three speakers (Sara Cox, Maxine Peake, Diane Morgan), with a single voice per clip. Both F5-TTS and StyleTTS2 were fine-tuned on it; the resulting models ship as separate production scripts because they hit different sweet spots on the accent-strength vs phonetic-stability trade-off.

Code, patches, gists

Upstream contributions

Contact

GitHub: netlinux-ai