Whether this is real or not, multiple commenters here look like astroturfers: accounts created in the past year (or hours) with very low karma.
I’m very surprised this isn’t getting more attention. Am I missing something?
It seems at or above SOTA on the given benchmarks, doesn’t have context rot, is orders of magnitude faster, and uses less compute than current transformer models. I suppose it’s just an announcement and we can’t test it ourselves yet.
The proof is in the pudding. At this point, there have been plenty of models that overperformed on benchmarks and underperformed on real work. So my stance is that I'm curious, I'm excited to see where it goes, and I won't believe it until I can try it.
I agree, it's a real architectural breakthrough if true
This is pretty remarkable. We've spent a lot of time finding workarounds for LLMs reading long docs. Now that's gone.
Looks like long context isn’t a problem anymore
Neither are cost and latency, in the long term. LLMs will ultimately become more economically viable than they are now, broadening the scope of every existing LLM-driven application (particularly STS, conversational AI, etc.).
If it's true, then it's a breakthrough.
Optimizing AI in general. How cool is that?