Switching to bytes is the ultimate fix, but in the interim, if you want reliable rhyming from an LLM, you need filter-assisted decoding: https://paperswithcode.com/paper/most-language-models-can-be... Replicate also has a post about this work: https://replicate.com/blog/turn-your-llm-into-a-poet
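To make the idea concrete, here is a rough sketch of what filtering at decode time can look like: before each token is chosen, every candidate that fails a constraint has its score set to negative infinity, so it can never be picked. This is only an illustration of the general masking idea, not the implementation from the linked paper or post. The toy vocabulary, the made-up scores, and the helpers `rhymes_with`, `filter_logits`, and `greedy_pick` are all hypothetical, and the rhyme check just compares spelling suffixes rather than actual pronunciation.

```python
import math

def rhymes_with(word, target, n=3):
    """Crude rhyme test (assumption): same last n letters, but not the same word."""
    w, t = word.strip().lower(), target.strip().lower()
    return w != t and w[-n:] == t[-n:]

def filter_logits(logits, vocab, keep):
    """Mask every token that fails `keep` by setting its logit to -inf."""
    return [score if keep(tok) else -math.inf for score, tok in zip(logits, vocab)]

def greedy_pick(logits, vocab):
    """Return the highest-scoring token, or fail if everything was masked."""
    best = max(range(len(vocab)), key=lambda i: logits[i])
    if logits[best] == -math.inf:
        raise ValueError("no token satisfies the filter")
    return vocab[best]

# Toy vocabulary and made-up next-token scores standing in for a real model.
vocab = ["night", "day", "light", "moon", "bright"]
logits = [0.5, 2.0, 1.2, 1.8, 0.9]

# Without the filter, the model's favourite token wins even though it does not rhyme.
unconstrained = greedy_pick(logits, vocab)

# With the filter, only tokens that rhyme with "night" remain eligible.
masked = filter_logits(logits, vocab, lambda tok: rhymes_with(tok, "night"))
constrained = greedy_pick(masked, vocab)

print(unconstrained, constrained)
```

With a real model the same masking step would run on the full logit vector at every position where the constraint applies, and a pronunciation dictionary would be a better rhyme test than suffix matching.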