virgilp · 2026-05-19T08:49:59 1779180599

If we ignore cost (which is kinda hard to ignore), I feel Codex kinda' does it for me. Sure it's not really an editor but I find I don't need that _that much_ and it's easy to launch an external editor (they actually have the feature).

The ironic thing is that half a year ago, after trying factory.ai, I thought a chat-first interface was a stupid idea that would never work.

virgilp · 2026-05-17T19:10:18 1779045018

I wonder how knowledgeable about compilers the engineer who attempted this was. I'm pretty confident that I could produce a decent C compiler in a few weeks (or less), if given Opus 4.7 + unlimited tokens + a good test suite. (and this is not blind unsubstantiated belief in AI, I've recently rewritten a somewhat sophisticated interpreter in a week with AI; and have worked on several C++ compilers in the past, including a GCC port to a custom DSP, so I have a bit of an idea about what this would take).

But yeah, this is not a "one shot" project, none of it is. One shot doesn't work even with humans - after all, this is exactly what killed waterfall as a methodology.

pron · 2026-05-17T19:38:06 1779046686

> I'm pretty confident that I could produce a decent C compiler in a few weeks (or less), if given Opus 4.7 + unlimited tokens + a good test suite.

Of course. The point is that a full, detailed spec isn't enough (even in the rare situations it does exist, like for a C compiler). At least for the moment, you need expert humans to supervise and direct the agents.

Vibe coders usually also let the agents write the tests, which means that the only independent human validation of the software is some cursory manual inspection. That also obviously isn't enough to validate software.

> One shot doesn't work even with humans - after all, this is exactly what killed waterfall as a methodology.

You can one-shot a C compiler with humans. LLMs' software development ability is impressive and helpful, but it is not human-level yet, even if at some tasks the agents are better than most human programmers. And while many waterfall projects failed, many succeeded (although perhaps not as efficiently as they could have). So far I don't believe agents have been able to produce any non-trivial production software autonomously.

zem · 2026-05-17T19:25:21 1779045921

yeah, the key part is that there be a human in the loop, directing and course-correcting the ai while it produces code in reasonably small and well defined stages.

virgilp · 2026-05-17T08:49:50 1779007790

you can absolutely know. they do suspiciously well. you just give harder problems until they can't solve it. how they react/approach a problem that they can't immediately solve _is_ the interview - not the "how many things they solved correctly" part.

That said - I seldom need people to be hardcore algorithm solvers. What I typically did was a variation of fizzbuzz (can the candidate code very basic logic?) and then finding a bug or minor requirements extension in their online screening test/"homework" and asking them to solve that on the spot (did they write the code themselves/can they modify it). It's typically enough; there are diminishing returns to testing programming skills in more depth - for the rest you can discuss domain knowledge, general experience, working style etc.

virgilp · 2026-05-12T06:20:05 1778566805

To be fair it doesn't say "you can't score a goal" or "you can't kick the ball", it says you can only decide to _try_ to do that. But I agree it's not as deep as they seem to think, you can take this line of thinking as far as you want (you can try to move your leg but perhaps due to injury or external blockage it won't move; so all you can really do is "desire" not "decide"? Is there some very deep meaning to this? And what even is this "you" that we are talking about?)

virgilp · 2026-05-07T13:11:20 1778159480

One thing that is worth pondering is what parts of the "old wisdom" (if any) are no longer true. Because the set of "common sense knowledge" has a tendency to mutate in time. Take the first statement:

> Impactful software tends to be written by many humans that need to collaborate.

This was definitely true. Is it still true to the same extent/ in the same way? Not obvious...

virgilp · 2026-05-04T10:16:39 1777889799

I kinda' like Ryanair as a low-cost airline? They're fairly efficient (boarding, serving etc), they _actually fly_ the advertised flights (with relatively few exceptions), and the food is reasonably priced. During COVID they would just give your money back, no shenanigans like "they're in our company wallet". Sure they have their quirks but they don't seem to go out of their way to deceive you, they're pretty open about what you pay and what you get.

Now Wizzair is "mostly not an airline" for me, because they have all the negative traits I hinted at above. E.g. they'll happily advertise flights they have no intention of flying, make refunds hard, are as misleading as they can be about pricing, make it impossible to check in online a few hours before the flight so that you have to pay their high fees, etc.

I wouldn't want the Ryanair experience for long-haul flights; but for short 2-3h ones within Europe, they're fine, I'm always considering them. Not for the perceived cheapness, but for the "I expect them to actually fly AND be on time" part.

notahacker · 2026-05-04T11:06:51 1777892811

> During COVID they would just give your money back, no shenanigans like "they're in our company wallet"

Generally I agree with your view that Ryanair is decent at what it does, but COVID refunds happened only after the regulator stepped in to threaten them over their original "no refunds" and then "refund in the form of a voucher, with a short expiry date on it" policies actually being unlawful, and even allowing for the scale of its operations it received more complaints to the UK CAA than anyone else about refund handling during COVID.

virgilp · 2026-05-04T12:46:37 1777898797

In Romania I think they just gave back the money (or maybe it was on a voucher with "if you don't use the voucher by date X, we'll refund the money"), which is in stark contrast with how other low-cost airlines like WizzAir behaved. Perhaps it was regional policy; or perhaps it was due to their previous interactions with UK regulators? But they gained a lot of respect from me back then (whereas WizzAir is on the "only if absolutely no other choice" list - and I think I only used it once, for a business trip where it had a good direct flight AND I didn't care if I actually made it to the destination, or if I got stranded there for a few days - since the company would've been paying)

reillyse · 2026-05-04T16:35:19 1777912519

Ryanair have been regulated into compliance very effectively by the European authorities - everyone knows they are scumbags and makes sure they don’t get away with nonsense.

Literally how regulators should work. They look at the outrageous things they try to do and make laws to prevent them. It’s worked very well and also hasn’t ended Ryanair (which is the usual anti-regulation argument, that we can’t have cheap things with regulations).

I personally never fly Ryanair because I’ve had to sue them (and won) in the past, they really do suck.

SOLAR_FIELDS · 2026-05-04T18:29:24 1777919364

When's the last time you tried to claw back your EU mandated clawback for things like delayed flights from one of these airlines without fighting tooth and nail and threatening legal action? Perhaps this has improved in recent years, but when I was flying EU regularly several years ago getting that refund has always been an uphill battle.

peyton · 2026-05-04T20:25:31 1777926331

Hah, I tried when BA had to cancel my flight due to a computer system outage. Coincidentally, some “activists” handcuffed themselves to a fence on the runway at the exact same time, which was an act of terror or whatever and thus not covered, so I did not receive my money back.

FridayoLeary · 2026-05-04T13:53:02 1777902782

I actually like wizz. They are dirt cheap which is the only thing i care about. The ground crew don't openly despise you, unlike easyjet, they tolerate you and their cabins are all right. Just they don't have any customer service if anything goes wrong.

"Wizz; Not the worst airline you've ever flown on"

virgilp · 2026-04-15T09:56:50 1776247010

Waterfall was bad due to the excessively long feedback loops (months-to-years from "planning" to "customer gets to see it/ we receive feedback on it"). It was NOT bad because it forced people to think before writing code! That part we should recover, it's not problematic at all.

kown7 · 2026-04-15T11:29:55 1776252595

If people actually read the original paper by Royce 1970 they would see that it's an iterative process with short feedback-loops.

The bad rep comes from (defense|gov.) contracting, where PRDs were connected to money and CRs were expensive, see http://www.bawiki.com/wiki/Waterfall.html for better details.

paganel · 2026-04-15T11:34:11 1776252851

When you do most of the thinking before you start implementing the whole thing, and if you think that that's enough, then you've missed the unknown unknowns part, which was a big talking point in the mid 2000s, back when the anti-waterfall discourse got going (and for good reason).

But I expect the AI zealots to start (re-)integrating XProgramming (later rebranded as Agile) back into their workflow, somehow.

sersi · 2026-04-16T08:04:40 1776326680

Thinking before you start implementing the entire project is doomed to fail. Thinking before you implement each feature/user story is usually rather important.

A waterfall model with short feedback loops iterating on small tasks is not the worst thing in the world

virgilp · 2026-04-20T07:53:04 1776671584

If your feedback loop is hours or days, I don't think it's bad you spend some time thinking ahead of doing. Oh, you missed the unknown unknowns? You'll hit them soon enough anyway, this is not a model that encourages abstract planning with no action taken, for extended periods of time....

The problem with "AI zealots" is seldom that they spend too much time planning ahead. If anything, it's the opposite.

virgilp · 2026-04-13T08:48:05 1776070085

QA has long since merged with programming in "unified engineering". Also with SRE ("devops") and now the trend is to merge with CSE and product management too ("product mindset", forward-deployed engineers). So yeah, pretty much, that's the trend. What would you trust more - an engineer doing project management too - or a project manager doing the engineering job?

ben_w · 2026-04-13T08:51:26 1776070286

The PMs and QAs I know would disagree with that assessment.

> What would you trust more - an engineer doing project management too - or a project manager doing the engineering job?

If one of the three, {PM, QA, coder}, was replaced by AI, as a customer I'd prefer to pick the team missing the coder. But for teams replacing two roles with AI, I'd rather keep the coder.

But a deeper problem now is, as a customer, perhaps I can skip the team entirely and do it all myself? That way, no game of telephone from me to the PM to the coder and QA and back to me saying "no" and having another expensive sprint.

throwway120385 · 2026-04-13T13:42:27 1776087747

If I'm managing a company of about 10 people to do something in the physical world, I'd probably skip the PM & QA and hire the engineer and have the engineer task the LLM with QA given a clear set of requirements and then manage the projects given a clear set of deadlines. A good SE can do a "good enough" job at QA and PM in a small company that you won't notice the PM & QA are missing. But the PM & QA can always be added or QA can be augmented with a specialist assuming you're LLM-driven.

Of course if none of your software projects are business-critical to the degree that downtime costs money pretty directly then you can skip it all and just manage it yourself.

The other thing you should probably understand is that the feedback cycle for an LLM is so fast that you don't need to think of it in terms of sprints or "development cycles": in many cases, if you're iterating on something, your own work to acceptance-test what you're getting is actually the long pole, especially if you're multitasking.

pdimitar · 2026-04-13T14:51:05 1776091865

> If one of the three, {PM, QA, coder}, was replaced by AI, as a customer I'd prefer to pick the team missing the coder.

I am curious: why? In all the years of my career I've seen engineers take on extra responsibilities and do anywhere from a decent to a fantastic job at it, while people who tend to start much more specialized (like QA / sysadmins / managers) I have historically observed struggling more -- obviously there are many talented exceptions, they just never were the majority, is my anecdotal evidence.

In many situations I'd bet on the engineer becoming a T-shaped employee (wide area of surface-to-decent level of skills + a few where deep expertise exists).

robertlagrant · 2026-04-13T11:14:36 1776078876

> The PMs and QAs I know would disagree with that assessment.

It just depends on the org structure and what the org calls different skills. In lots of places now PM (as in project, not product) is in no way a leadership role.

_heimdall · 2026-04-13T13:48:30 1776088110

QA is still alive and well in many companies, including manual QA. I'm sure there's a wide range these days based on industry and scale, but you simply don't ship certain products without humans manually testing them against specs, especially if it's a highly regulated industry.

I also wouldn't be so sure that programming is the hardest of the three roles for someone to learn. Each role requires a different skill set, and plenty of people will naturally be better at or more drawn to only one of those.

chaboud · 2026-04-13T11:09:32 1776078572

From my experience with modern software and services, the actual practice of QA has plainly atrophied.

In my first gig (~30 years ago), QA could hold up a release even if our CTO and President were breathing down their necks, and every SDE bug-hunted hard throughout the programs.

Now QA (if they even exist) are forced to punt thousands of issues and live with inertial debt. Devs are hostile to QA and reject responsibility constantly.

Back to the OP, these things aren't calculable, but they'll kill businesses every time.

Sevii · 2026-04-13T15:06:14 1776092774

Continuous delivery really killed QA.

philk10 · 2026-04-13T12:20:47 1776082847

that's not the role of QA to be a gatekeeper, they give the CTO and President information on the bugs and testing but it's a business decision to ship or not

skydhash · 2026-04-13T12:57:50 1776085070

I’m not a native English speaker, but isn’t gatekeeping exactly that? Blocking suspicious entities unless they’re allowed through by someone higher in the hierarchy?

wouldbecouldbe · 2026-04-13T10:32:14 1776076334

QA merged originally out of programming.

paulryanrogers · 2026-04-13T12:16:25 1776082585

emerged?

virgilp · 2026-03-26T20:37:38 1774557458

Actually, no. We always needed good checks - that's why you have techniques like automated canary analysis, extensive testing, checking for coverage - these are forms of "executable oracles". If you wanted to be able to do continuous deployment - you had to be very thorough in your validation.

LLMs just take this to the extreme. You can no longer rely on human code reviews (well you can but you give away all the LLM advantages) so then if you take out "human judgement" *from validation*[1], you have to resort to very sophisticated automated validation. This is it - it's not about "inventing a new language", it's about being much more thorough (and innovative, and efficient) in the validation process.

[1] never from design, or specification - you shouldn't outsource that to AI, I don't think we're close to an AI that can do that even moderately effectively without human help.
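
To make "executable oracle" concrete, here is a minimal sketch of one common form, differential testing: run the code under validation against a trusted reference on many random inputs and report the first disagreement. Everything named here is an assumption for illustration (`candidate_sort` stands in for generated code, `reference_sort` for the trusted oracle, `check_against_oracle` is a hypothetical helper); it is not canary analysis and not a description of any particular tool.

```python
import random


def reference_sort(xs):
    """Trusted oracle (assumed): the standard library sort."""
    return sorted(xs)


def candidate_sort(xs):
    """Hypothetical stand-in for generated code under validation: insertion sort."""
    out = []
    for x in xs:
        i = len(out)
        # Walk left past every element larger than x, then insert there.
        while i > 0 and out[i - 1] > x:
            i -= 1
        out.insert(i, x)
    return out


def check_against_oracle(candidate, oracle, trials=500, seed=0):
    """Return the first random input where candidate and oracle disagree, else None."""
    rng = random.Random(seed)  # fixed seed so a failure is reproducible
    for _ in range(trials):
        xs = [rng.randint(-50, 50) for _ in range(rng.randint(0, 20))]
        if candidate(list(xs)) != oracle(list(xs)):
            return xs
    return None


counterexample = check_against_oracle(candidate_sort, reference_sort)
if counterexample is None:
    print("no disagreement found")
else:
    print(f"mismatch on {counterexample}")
```

A check like this only finds disagreements on the inputs it happens to generate, so it complements coverage checks and canaries rather than replacing them.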

nitwit005 · 2026-03-26T21:31:14 1774560674

If the LLM generates code exactly matching a specification, the specification becomes a conventional programming language. The LLM is just transforming from one language to another.

sanxiyn · 2026-03-26T21:48:55 1774561735

Yes, but a programming language with a proverbial sufficiently smart compiler. That is very useful.

Quekid5 · 2026-03-26T22:11:59 1774563119

Try writing an exhaustive spec for anything non-trivial and you might see the problem.

scuff3d · 2026-03-27T04:21:11 1774585271

Been saying this for a while now. I work in aerospace, and I can tell you from first hand experience software engineers don't know what designing a spec is.

Aero, mechanical, and electrical engineers spend years designing a system. Design, requirements, reviews, redesign, more reviews, more requirements. Every single corner of the system is well understood before anything gets made. It's a detailed, time consuming, arduous process.

Software engineers think they can duplicate that process with a few skills and a weekend planning session with Claude Code. Because implementation is cheaper we don't have to go as hard as the mechanical and electrical folks, but to properly spec a system is still a massive amount of up front effort.

sn9 · 2026-03-28T23:34:12 1774740852

And software isn't as constrained by physics as hardware, which massively expands both the design space as well as how many ways things can go wrong.

whattheheckheck · 2026-03-27T00:53:49 1774572829

Llm boys discover the halting problem!

virgilp · 2026-03-28T18:42:28 1774723348

I honestly don't see how this is related? Nothing says "one shot a full system from a perfect specification", I don't think this was ever a goal (or that it will be practical to do so)

virgilp · 2026-03-23T12:56:10 1774270570

Also: if that one particular AI-produced compiler has nothing innovative, that only means that the human "director" behind the AI didn't ask it to produce anything innovative; what it does not mean is that AI can never produce anything innovative in a compiler.

daveguy · 2026-03-23T20:05:21 1774296321

> if that one particular AI-produced compiler has nothing innovative, that only means that the human "director" behind the AI didn't ask it to produce anything innovative

Couldn't it also be true that the AI didn't produce innovative output even though the human asked it to produce something innovative?

Otherwise you're saying an AI always produces innovative output, if it is asked to produce something innovative. And I don't think that is a perfection that AI has achieved. Sometimes AI can't even produce correct output even when non-innovative output is requested.

vidarh · 2026-03-24T09:29:35 1774344575

> Couldn't it also be true that the AI didn't produce innovative output even though the human asked it to produce something innovative?

It could have been, but unless said human in this case was lying, there is no indication that they did. In fact, what they have said is that they steered it towards including things that make for a very conventional compiler architecture at this point, such as telling it to use SSA.

> Otherwise you're saying an AI always produces innovative output

They did not say that. They suggested that the AI output closely matches what the human asks for.

> And I don't think that is a perfection that AI has achieved.

I won't answer for the person you replied to, but while I think AI can innovate, I would still 100% agree with this. It is of course by no means perfect at it. Arguably often not even good.

> Sometimes AI can't even produce correct output even when non-innovative output is requested.

Sometimes humans can't either. And that is true for innovation as well.

But on this subject, let me add that in one of my first chats with GPT 5.1, I think it was, I asked it a question on parallelised parsing. That in itself is not entirely new, but it came up with a particular scheme for parallelised (GPU friendly) parsing and compiler transformations I have not found in the literature (I wouldn't call myself an expert, but I have kept tabs on the field for ~30 years). I might have missed something, so I intend to do a further literature search. It's also not clear how practical it is, but it is interesting enough that when I have time, I'll set up a harness to let it explore it further and write it up, as irrespective of whether it'd be applicable for a production compiler, the ideas are fascinating.