
> Giant models are slowing us down. In the long run, the best models are the ones which can be iterated upon quickly. We should make small variants more than an afterthought, now that we know what is possible in the <20B parameter regime.

Maybe this is true for the median query/conversation people are having with these agents, but it certainly does not match what I have observed in technical/research work.

GPT-4 is legitimately very useful. But none of the models below it (including the one powering ChatGPT) can perform complex tasks up to snuff.



My understanding was that most of the current research effort is toward trimming models and/or producing smaller models with the power of larger ones. Is that not true?


That doesn't mean the smaller models are anywhere close to the capabilities of GPT-4.



