Reflections on AI at the End of 2025

(antirez.com)

57 points | by danielfalbo 2 hours ago

19 comments

piker 1 hour ago
> There are certain tasks, like improving a given program for speed, for instance, where in theory the model can continue to make progress with a very clear reward signal for a very long time.
Super skeptical of this claim. Yes, if I have some toy, poorly optimized Python example or maybe a sorting algorithm in ASM, but this won’t work in any non-trivial case. My intuition is that the LLM will spin its wheels at a local minimum whose performance is overdetermined by millions of black-box optimizations in the interpreter or compiler, the signal from which is not fed back to the LLM.
- andy99 1 hour ago
  There was a discussion the other day where someone asked Claude to improve a code base 200x https://news.ycombinator.com/item?id=46197930
  - exitb 25 minutes ago
    That’s most definitely not the same thing, as "improving a codebase" is an open-ended task with no reliable metrics the agent could work against.
- dist-epoch 1 hour ago
  https://github.com/algorithmicsuperintelligence/openevolve
  - piker 50 minutes ago
    https://chatgpt.com/backend-api/estuary/public_content/enc/e...
dhpe 1 hour ago
I have programmed 30K+ hours. Do LLMs make bad code: yes all the time (at the moment zero clue about good architecture). Are they still useful: yes, extremely so. The secret sauce is that you'd know exactly what to do without them.
- qsort 53 minutes ago
  One of the mental frameworks that convinced me is how much of a "free action" it is. Have the LLM (or the agent) churn on some problem and do something else. Come back and review the result. If you had to put significant effort into each query, I agree it wouldn't be worth it, but you can just type something into the textbox and wait.
- _rpxpx 49 minutes ago
  OK, maybe. But how many programmers will know this in 10 years' time as use of LLMs is normalized? I'd like to hear what employers are already saying about recent graduates.
  - bartread 4 minutes ago
    They’d have to be hiring recent graduates for you to hear that perspective.
    And, as much as what I’ve just said is hyperbolically pessimistic, there is some truth to it.
    In the UK a bunch of factors have coincided to put the brakes on hiring, especially at smaller and mid-size businesses. AI is the obvious one that gets all the press (although how much it’s really to blame is open to question in my view), but the recent rise in employer NI contributions, and now (anecdotally) the employee rights bill, have come together to make companies quite gun-shy when it comes to hiring.
- feverzsj 50 minutes ago
  So, it's like taking off your pants to fart.
danielfalbo 2 hours ago
> There are certain tasks, like improving a given program for speed, for instance, where in theory the model can continue to make progress with a very clear reward signal for a very long time.
This makes me think: I wonder if Goodhart's law[1] may apply here. I wonder if, for instance, optimizing for speed may produce code that is faster but harder to understand and extend. Should we care or would it be ok for AI to produce code that passes all tests and is faster? Would the AI become good at creating explanations for humans as a side effect?
And if Goodhart's law doesn't apply, why is that? Is it because we're only doing RLVR fine-tuning on the last layers of the network so all the generality of the pre-training is not lost? And if this is the case, could this be a limitation in not being able to be creative enough to come up with move 37?
[1] https://wikipedia.org/wiki/Goodhart's_law
- lemming 1 hour ago
  > I wonder if, for instance, optimizing for speed may produce code that is faster but harder to understand and extend.
  This is generally true for code optimised by humans, at least for the sort of mechanical low level optimisations that LLMs are likely to be good at, as opposed to more conceptual optimisations like using better algorithms. So I suspect the same will be true for LLM-optimised code too.
- username223 1 hour ago
  > I wonder if, for instance, optimizing for speed may produce code that is faster but harder to understand and extend.
  Superoptimizers have been around since 1987: https://en.wikipedia.org/wiki/Superoptimization
  They generate fast code that is not meant to be understood or extended.
  - progval 1 hour ago
    But their output is (usually) executable code, and is not committed to a VCS. So the source code is still readable.
    When people use LLMs to improve their code, they commit their output to Git to be used as source code.
abricq 45 minutes ago
> * Programmers resistance to AI assisted programming has lowered considerably. Even if LLMs make mistakes, the ability of LLMs to deliver useful code and hints improved to the point most skeptics started to use LLMs anyway: now the return on the investment is acceptable for many more folks.
Could not agree more. I myself started 2025 being very skeptical, and finished it very convinced about the usefulness of LLMs for programming. I have also seen multiple colleagues and friends go through the same change of appreciation.
I noticed that for certain tasks, our productivity can be multiplied by 2 to 4. Hence my doubts: are we going to be too many developers / software engineers? What will happen to the rest of us?
I assume that other fields (other than software-related) should also benefit from the same productivity boosts. I wonder if our society is ready to accept that people should work less. I think the more likely continuation is that companies will either hire less or fire more, instead of accepting to pay the same for fewer hours of human work.
- danielfalbo 42 minutes ago
  > Are we going to be too many developers / software engineers? What will happen to the rest of us?
  I propose that we should raise the bar for the quality of software now.
  - abricq 29 minutes ago
    Yes, certainly agree. A few days ago there was a blog post here claiming that formal verification would become much more widely used with AI. The author claimed that AI will help us overcome the difficulty barrier of writing formal proofs.
- antihipocrat 9 minutes ago
  I like to think of it as adding new lanes to a highway. More will be delivered until it all jams up again.
torlok 1 hour ago
This is a bunch of "I believe" and "I think" with no sources by a random internet person.
- ctoth 1 hour ago
  Ah, I see you have discovered blogs! They're a cool form of writing from like ~20 years ago which are still pretty great. Good thing they show up on this website, it'd be rather dull with only newspapers and journal articles doncha think?
- ajoseps 1 hour ago
  He’s not a “random internet person”; he created Redis. Despite that, I don’t know how authoritative a figure he is with respect to AI research. He’s definitely a prolific programmer though.
  - nurettin 35 minutes ago
    To be fair, you may find equally capable random people in this thread, doesn't mean they speak with any kind of authority.
  - megous 1 hour ago
    That still qualifies as a random internet person, wrt the topic. And I think the emphasis is on the lack of sources and on the "I believe"s and "I think"s, in any case :)
  - XorNot 1 hour ago
    There are plenty of Nobel laureates who, well, do rest on their laurels and dive deep into pseudoscience after that.
    Accomplishment in one field does not make one an expert, nor even particularly worth listening to, in any other. Certainly it doesn't remove the burden of proof or the necessity to make an actual argument based on more than simply insisting something is true.
- desbo 1 hour ago
  Yeah, it’s called “Reflections”.
- matthewmacleod 1 hour ago
  That is what a blog post is. Someone documenting what they think about a topic.
  It's not the case that every form of writing has to be an academic research paper. Sometimes people just think things, and say them – and they may be wrong, or they may be right. And they sometimes have some ideas that might change how you think about an issue as a result.
- echelon 1 hour ago
  > by a random internet person.
  The creator of Redis.
  - cinntaile 1 hour ago
    Sure but quite a few claims in the article are about AI research. He does not have any qualifications there. If the focus was more on usefulness, that would be a different discussion and then his experience does add weight.
- dist-epoch 59 minutes ago
  What is a "source"? Isn't it just "another random internet person"?
register 19 minutes ago
Where can I learn more about how chain of thought really affects LLM performance? I read the seminal paper, but all it says is that it's basically another prompt engineering technique that improves accuracy.
a_bonobo 1 hour ago
>* For years, despite functional evidence and scientific hints accumulating, certain AI researchers continued to claim LLMs were stochastic parrots: probabilistic machines that would: 1. NOT have any representation about the meaning of the prompt. 2. NOT have any representation about what they were going to say. In 2025 finally almost everybody stopped saying so.
Man, Antirez and I walk in very different circles! I still feel like LLMs fall over backwards once you give them an 'unusual' or 'rare' task that isn't likely to be presented in the training data.
- oersted 1 hour ago
  LLMs certainly struggle with tasks that require knowledge that is not provided to them (at significant enough volume/variance to retain it). But this is to be expected of any intelligent agent, it is certainly true of humans. It is not a good argument to support the claim that they are Chinese Rooms (unthinking imitators). Indeed, the whole point of the Chinese Room thought experiment was to consider if that distinction even mattered.
  When it comes to being able to do novel tasks on known knowledge, they seem to be quite good. One also needs to consider that problem-solving patterns are also a kind of (meta-)knowledge that needs to be taught, either through imitation/memorisation (Supervised Learning) or through practice (Reinforcement Learning). They can be logically derived from other techniques to an extent, just like new knowledge can be derived from known knowledge in general, and again LLMs seem to be pretty decent at this, but only to a point. Regardless, all of this is definitely true of humans too.
  - feverzsj 1 hour ago
    In most cases, LLMs have the knowledge (data). They just can't generalize it like humans do. They can only reflect explicit things that are already there.
    - oersted 52 minutes ago
      I don't think that's true. Consider that the "reasoning" behaviour trained with Reinforcement Learning in the last generation of "thinking" LLMs is trained on quite narrow datasets of olympiad math / programming problems and various science exams, since exact unambiguous answers are needed to have a good reward signal, and you want to exercise it on problems that require non-trivial logical derivation or calculation. Then this reasoning behaviour gets generalised very effectively to a myriad of contexts the user asks about that have nothing to do with that training data. That's just one recent example.
      Generally, I use LLMs routinely on queries definitely no-one has written about. Are there similar texts out there that the LLM can put together and get the answer by analogy? Sure, to a degree, but at what point are we gonna start calling that intelligent? If that's not generalisation I'm not sure what is.
      To what degree can you claim as a human that you are not just imitating knowledge patterns or problem-solving patterns, abstract or concrete, that you (or your ancestors) have seen before? Either via general observation or through intentional trial-and-error. It may be a conscious or unconscious process; many such patterns get baked into what we call intuition.
      Are LLMs as good as humans at this? No, of course, sometimes they get close. But that's a question of degree, it's no argument to claim that they are somehow qualitatively lesser.
- barnabee 42 minutes ago
  I don’t think this is quite true.
  I’ve seen them do fine on tasks that are clearly not in the training data, and it seems to me that they struggle when some particular type of task or solution or approach might be something they haven’t been exposed to, rather than the exact task.
  In the context of the paragraph you quoted, that’s an important distinction.
  It seems quite clear to me that they are getting at the meaning of the prompt and are able, at least somewhat, to generalise and connect aspects of their training to “plan” and output a meaningful response.
  This certainly doesn’t seem all that deep (at times frustratingly shallow) and I can see how at first glance it might look like everything was just regurgitated training data, but my repeated experience (especially over the last ~6-9 months) is that there’s something more than that happening, which feels like what Antirez was getting at.
- jmfldn 1 hour ago
  "In 2025 finally almost everybody stopped saying so."
  I haven't.
  - dist-epoch 53 minutes ago
    Some people are slower to understand things.
    - jmfldn 51 minutes ago
      Well exactly ;)
agumonkey 1 hour ago
There are videos about Diffusion LLMs too, apparently getting rid of the linear token generation. But I'm no ML engineer.
Fraterkes 1 hour ago
It’s interesting that half the comments here are talking about the extinction line when, now that we’re nearly entering 2026, I feel the 2027 predictions have been shown to be pretty wrong so far.
fleebee 1 hour ago
> The fundamental challenge in AI for the next 20 years is avoiding extinction.
That's a weird thing to end on. Surely it's worth more than one sentence if you're serious about it? As it stands, it feels a bit like the fearmongering Big Tech CEOs use to drive up the AI stocks.
If AI is really that powerful and I should care about it, I'd rather hear about it without the scare tactics.
- Recursing 1 hour ago
  I think https://en.wikipedia.org/wiki/Existential_risk_from_artifici... has much better arguments than the LessWrong sources in other comments, and they weren't written by Big Tech CEOs.
  Also "my product will kill you and everyone you care about" is not as great a marketing strategy as you seem to imply, and Big Tech CEOs are not talking about risks anymore. They currently say things like "we'll all be so rich that we won't need to work and we will have to find meaning without jobs"
- grodriguez100 1 hour ago
  I would say yes, everyone should care about it.
  There is plenty of material on the topic. See for example https://ai-2027.com/ or https://www.lesswrong.com/posts/uMQ3cqWDPHhjtiesc/agi-ruin-a...
  - dkdcio 1 hour ago
    fear mongering science fiction, you may as well cite Dune or Terminator
    - lm28469 51 minutes ago
      Lesswrong looks like a forum full of terminally online neckbeards who discovered philosophy 48 hours ago, you can dismiss most of what you read there don't worry
    - defrost 1 hour ago
      There's arguably more dread and quiet constrained horror in With Folded Hands ... (1947)
      Despite the humanoids' benign appearance and mission, Underhill soon realizes that, in the name of their Prime Directive, the mechanicals have essentially taken over every aspect of human life. No humans may engage in any behavior that might endanger them, and every human action is carefully scrutinized. Suicide is prohibited. Humans who resist the Prime Directive are taken away and lobotomized, so that they may live happily under the direction of the humanoids.
      ~ https://en.wikipedia.org/wiki/With_Folded_Hands_...
      - XorNot 1 hour ago
        This hardly disproves the point: no one is taking this topic seriously. They're just making up a hostile scenario from science fiction and declaring that's what'll happen.
- dist-epoch 49 minutes ago
  Yeah, well known marketing trick that Big Companies do.
  Oil companies: we are causing global warming with all these carbon emissions, are you scared yet? so buy our stock
  Pharma companies: our drugs are unsafe, full of side effects, and kill a lot of people, are you scared yet? so buy our stock
  Software companies: our software is full of bugs, will corrupt your files and make you lose money, are you scared yet? so buy our stock
  Classic marketing tactics, very effective.
- VladimirGolovin 1 hour ago
  This has been well discussed before, for example in this book: https://ifanyonebuildsit.com/
ctoth 1 hour ago
> The fundamental challenge in AI for the next 20 years is avoiding extinction.
So nice to see people who think about this seriously converge on this. Yes. Creating something smarter than you was always going to be a sketchy prospect.
All of the folks insisting it just couldn't happen or ... well, there have just been so many objections. The goalposts have walked from one side of the field to the other, and then left the stadium, went on a trip to Europe, got lost in a beautiful little village in Norway, and decided to move there.
All this time though, the prospect of instantiating something smarter than you (and yes, it will be smarter than you even if it's at human level because of electronic speeds...) This whole idea is just cursed and we should not do the thing.
- cheschire 1 hour ago
  "Your scientists were so preoccupied with whether or not they could, they didn't stop to think if they should."
alexgotoi 1 hour ago
> * The fundamental challenge in AI for the next 20 years is avoiding extinction.
This reminded me of the movie Don’t Look Up, where they basically gambled with humanity's extinction.
ur-whale 2 hours ago
Not sure I understand the last sentence:
> The fundamental challenge in AI for the next 20 years is avoiding extinction.
- danielfalbo 1 hour ago
  I think he's referring to AI safety.
  https://lesswrong.com/posts/uMQ3cqWDPHhjtiesc/agi-ruin-a-lis...
  - grodriguez100 1 hour ago
    For a perhaps easier to read intro to the topic, see https://ai-2027.com/
    - dkdcio 1 hour ago
      or read your favorite sci-fi novel, or watch Terminator. this is pure bs by a charlatan
- chrishare 1 hour ago
  He's referring to humanity, I believe
  - A_D_E_P_T 1 hour ago
    It's ambiguous. It could go the other way. He could be referring to that oldest of science fiction tropes: The Butlerian Jihad, the human revolt against thinking machines.
rckt 1 hour ago
> Even if LLMs make mistakes, the ability of LLMs to deliver useful code and hints improved to the point most skeptics started to use LLMs anyway
Here we go again. Statements with the single source in the head of the speaker. And it’s also not true. The LLMs still produce bad/irrelevant code at such a rate that you can spend more time prompting than doing things yourself.
I’m tired of this overestimation of LLMs.
- xiconfjs 1 hour ago
  My personal experience: if I can find a solution on Stack Overflow etc., the LLM will produce working and fundamentally correct code. If I can’t find an already worked-out solution on these sites, the LLM hallucinates like crazy (never-existing functions/modules/plugins, protocol features which aren’t specified, and even GitHub repos which never existed). So, as stated by many people online before: for low-hanging fruit, LLMs are a totally viable solution.
- barnabee 56 minutes ago
  Even where they are not directly using LLMs to write the most critical or core code, nearly every skeptic I know has started using LLMs at very least to do things like write tests, build tools, write glue code, help to debug or refactor, etc.
  Your statement suffers not only from also coming only from your brain, with no evidence that you've actually tried to learn to use these tools, but it also goes against the weight of evidence that I see both in my professional network and online.
  - rckt 17 minutes ago
    I just want people making statements like the author's to be more specific about how exactly the LLMs are being used. Otherwise they contribute to this belief that LLMs are a magical tool that can do anything.
    I am aware of simple routine tasks that LLMs can do. This doesn’t change anything about what I said.
- iamflimflam1 1 hour ago
  But you have just repeated what you are complaining about.
  - rckt 12 minutes ago
    Do you want me to spend time coming up with a quality response to a lazy statement? It’s like fighting windmills. I’m fine with having my say the way I did.
HellDunkel 49 minutes ago
Tldr: AI bro wrote pro-AI piece revealing nothing new under the sun.
Aiisnotabubble 1 hour ago
What also happens, irrespective of AGI: global RL
Around the world people ask an LLM and get a response.
Just grouping and analysing these questions and solving them once centrally and then making the solution available again is huge.
Linearly solving the most asked questions, then the next one, then the next, will make whatever system is behind it smarter every day.
- danielfalbo 1 hour ago
  Exactly. The singularity is already here. It's just "programmers + AI" as a whole, rather than independent self-improvements of the AI.
  I wonder how a "programmers + AI" self-improving loop is different from an "AI only" one.
  - bryanrasmussen 1 hour ago
    The AI only one presumably has a much faster response time. The singularity is thus not here because programmer time is still the bottleneck, whereas, as I understand it, in the singularity time is no longer a bottleneck.
  - Aiisnotabubble 44 minutes ago
    AGI will be faster as it doesn't need an initial question.
    AGI will also be generic.
    LLM is already very impressive though
feverzsj 1 hour ago
Seems they also want some AI money[0]. Guess I'll keep using Valkey.
[0] https://redis.io/redis-for-ai/
- danielfalbo 1 hour ago
  > they
  I'm not sure antirez is involved in any business decision making process at Redis Ltd.
  He may not be part of "they".
seu 1 hour ago
> And I've vibe coded entire ephemeral apps just to find a single bug because why not - code is suddenly free, ephemeral, malleable, discardable after single use. Vibe coding will terraform software and alter job descriptions.
I'm not super up-to-date on all that's happening in AI-land, but in this quote I can find something that most techno-enthusiasts seem to have decided to ignore: no, code is not free. There are immense resources (energy, water, materials) that go into these data centers in order to produce this "free" code. And the material consequences are terribly damaging to thousands of people. With the further construction of data centers to feed this free vibe coding style, we're further destroying parts of the world. Well done, AGI loverboys.
- Hendrikto 1 hour ago
  You know what uses roughly 80 times more water in the US alone than water used by AI data centers world wide? Corn.
  - raddan 55 minutes ago
    Assuming your fact is true, that corn merely uses an order of magnitude or two more water than AI is surprising, given the utility of corn. It feeds the entire US (hundreds of millions of people), is used as animal feed (thus also feeding us), and is widely exported to feed other people. In the spirit of the “I think”s and “I believe”s of this blog post, I think that corn has a lot more utility than AI.