Hacker News

Cool! I had been thinking about trying this as well, after reading about the idea in one of Cosma Shalizi's notebooks [0]. I'd love to see how something like this performs when "trained" on a corpus the size of the web, given the same kind of computational resources used to train modern LLMs.

[0] http://bactra.org/notebooks/nn-attention-and-transformers.ht...
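(Assuming the idea referenced is Shalizi's observation that attention is essentially Nadaraya-Watson kernel smoothing: a minimal numerical sketch of the correspondence. With an exponential dot-product kernel, the classical kernel smoother and scaled dot-product attention produce identical outputs. All names here are illustrative, not from any particular library.)

```python
import numpy as np

def softmax_attention(q, K, V):
    # Scaled dot-product attention for a single query vector q.
    scores = K @ q / np.sqrt(q.shape[0])
    w = np.exp(scores - scores.max())  # subtract max for numerical stability
    w /= w.sum()
    return w @ V

def nadaraya_watson(q, K, V, kernel):
    # Classical kernel smoother: a weighted average of the values V,
    # with weights given by kernel similarity between query and keys.
    w = np.array([kernel(q, k) for k in K])
    w /= w.sum()
    return w @ V

rng = np.random.default_rng(0)
d = 4
q = rng.normal(size=d)
K = rng.normal(size=(8, d))
V = rng.normal(size=(8, 3))

# With an exponential dot-product kernel, the two coincide exactly:
# softmax weights exp(s_i)/sum_j exp(s_j) are the kernel weights.
exp_kernel = lambda a, b: np.exp(a @ b / np.sqrt(d))
print(np.allclose(softmax_attention(q, K, V),
                  nadaraya_watson(q, K, V, exp_kernel)))  # True
```

The "training" the comment imagines would then amount to fitting the kernel smoother's representation (keys and values) at web scale, rather than learning transformer weights by gradient descent.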


