Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

As far as I've seen, local OSS video understanding models just really aren't there yet. I briefly looked at facial recognition models but a good amount of signal was actually in the video's audio instead of the raw video frames. Depends on the accuracy you're looking for at the end of the day.


Thanks for the reply. Let's hope local models catch up.




Consider applying for YC's Summer 2026 batch! Applications are open till May 4

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: