Hacker News | aakresearch's comments

In my opinion review will always be a bottleneck, in OS as well as in commercial development.

To my understanding Code Review is first and foremost a trust-building exercise, seeking to establish common understanding behind the piece of code which is to be delivered. That it also may lead to improvements or "catch some bugs" is a distant tertiary side-effect, not the primary goal. At least this is the vantage point I reviewed any code from in the last two decades, and found that team morale and overall quality of teamwork - and delivered software, and customer satisfaction, as a result - responds very well to such interpretation. Regardless of the side you are on and competency level of your vis-a-vis. I would put "elevating competency level of both partners" as a secondary goal of Code Review, and very closely connected to the primary.

With the current crop of LLMs there is simply nothing on their side to which the words "trust" and "understanding" can be meaningfully applied. Hence the "review" takes a drastically different shape and implies very different goals, as neither the primary nor the secondary goal described above can apply. The only remaining goal of "improving and bug-catching", in the absence of trust, understanding and learning, now requires much more work, which is also much more exhausting.


And a follow-up to my comment above, as I wanted to reply to the "create an issue before filing a PR" remark too.

I actually find it an extremely useful practice. We engineers tend to center our thoughts around "code" and, with such a code-centric mindset, to accumulate knowledge as "code-adjacent" - in the repo's commits, PRs, markdown files of all sorts. But in reality most if not all projects extend past the code, and it makes much more sense to have a "project-centered" mindset. As such, an external "issue" captures much more context and provides more useful insight about project impact than a PR description alone. Love or hate JIRA, beyond microscopic solo-projects it makes full sense to use broader-scope external tools for project management.

As a nice cosmetic effect it also removes (some) bickering about whether to put "type" or "scope" first in the commit message :). Simply provide a reference to the ticket! (No, really, I hate JIRA, honestly!)


I grant we are - and also to facilitate understanding of one's thinking by other team members. But what I write "between commits" is no more evidence of "thinking", or conducive to "understanding", than the NSFW mouse gestures I make or the rubbish I hum to myself.

Indeed, I hear you 100%. It may be painful to watch "your" military engage in wars that are not righteous, from one's perspective. Are there even "righteous" conflicts these days? But short of a demand to abolish all state military, what is the appropriate way to express one's indignation? Hopefully all can agree that such a demand would be insane; but why, then, are those "pacifist" performances, which effectively are calls to deprive the military of the best weapons, people, thoughts, strategies, not considered insane?

Agree or disagree with a particular foreign policy or military action, why do people forget that the bulk of the military is staffed with their fellow citizens? Many of whom aren't terribly privileged to enjoy ample alternative choices to elevate themselves socially or financially. It is exactly this lot who benefit the most from DEI policies, cherished by "pacifists", is it not? It is they who are the first and most massive direct casualties, caused by not having access to the best, superior materiel, doctrine and training on and beyond the battlefield.

I'll be the first to point out that military and paramilitary forces attract many with an unchecked lust for violence. That "pride", "honor" and "patriotism" are often terribly misused, to uphold the goals of those with impure, malicious ambitions. Who, I grant it, are also disproportionately represented in the command echelons of the military and beyond. But if we are honest, that scum won't be shaken or taught a lesson by SotA technology being withheld from their use or a corporation refusing cooperation. It is their subordinates, who, maybe naively, subscribe to the "ideal", unquoted interpretation of Pride, Honor and Patriotism, who will bear the brunt of being crippled (by the consequences of the withholding and refusal) on the battlefield, and pay with their lives. Don't their lives matter?


- A little script that is best described as "symbolic submodule" workflow enabler for git: https://github.com/fontmaniac/symgit. As in "symbolic link" vs "hard link" - where hash tying host repo and pseudo-submodule is not git-generated, which makes "submodule" completely substitutable. For the description of creation process see the relevant section of Readme.md

- Companion script named Dr.Emmett - to retrospectively time-walk git commit history and rebuild the repo in a new place. It is not opened yet, and may not be ever as it doesn't satisfy LLM/human ratio worthy of publishing.

- Both scripts were made in service to my FNA-"game" project VibeSopwith: https://github.com/fontmaniac/VibeSopwith.Game. A more or less comprehensive description, and treatise on my stance about vibe- and agentic-coding can be found in the project's Readme.md

- VibeSopwith catalysed creation (accretion?) of Nage.Strata: https://github.com/fontmaniac/Nage.Strata - "Not-a-game-engine" collection of useful "primitives" for FNA-based game development.

- VibeSopwith Readme.md has a notable mention of "Aether Bodybuilder" - a helper tool for visually coupling Aether.Physics2D "bodies" to specific sprite textures. Fully vibe-coded in a couple of hours, and as such won't ever be publicly opened, having an LLM/human ratio approaching infinity.


> I fear that workflow is just chat, search, and being a rubber duck for my thoughts

This is exactly what I settled upon after my own trying really hard. It is liberating, I have no fear at all!


>>At the same time, do you really want every conversation you have with your doctor recorded

>Yes. This is what medical records are.

No. Medical records are limited extracts from conversations, which your doctor and only your doctor is qualified to make, using "semantic analysis applied to your unique situation", not "linguistic probabilistic inference applied to conversation about your situation using token weights averaged over a billion unrelated samples".

> It's not like the doctor is talking to you about which anime series are the best. You're talking about your health, your body, your disease, your treatment.

No jokes, no banter, no chit-chat, no compliments on the doctor's new Tesla?

> It's important to keep track of that.

Same fallacy Meta fell into when it started tracking employees' keystrokes and mouse gestures. 90% of my mouse movements are just fidgeting, with no relation to the task at hand - and it is not a crime! But if I knew my mouse fidgeting was being watched, I'd make sure that percentage goes up to 99% - for the LLM which is gonna be trained off it to self-immolate over its NSFW nature.


Hey, I'm in agreement with you.

I meant that these limited extracts do need to be recorded, that's all.

Read the rest of the comment :)


Oops... I am deeply sorry, thank you for the heads up! It seems I've myself committed a cardinal sin that I am usually quick to point out in others - rushing to reply without comprehending the full message. (Meta-oops: I realized how LLM-ish it sounds. Quick, reboot before my cover is blown!)

I happen to believe that the flaw being discussed IS fundamental and inherent in the design and architecture of LLM - this is why I always put "AI" in scare quotes. I've spoken about it in some of my other comments, namely this https://news.ycombinator.com/item?id=47162553 and to some extent this https://news.ycombinator.com/item?id=48046333. And as you do, I, too, hope that I am wrong about the hype and its eventual clash with reality, but do not hold my breath.


No problem, this is all human and understandable. You don't sound LLM-ish, you sound like you care a lot about this.

Which both of us do.


I may have different perception facilities in the brain, but to me the images in the article are horrifying. Such bad, Frankenstein-level, criminal butchering of absolutely everything!


I am by no means an expert, but I'd like to offer my mental model - up to you to decide if it is solid or not, but it works for me.

I think the core intuition is that, like any other "rasterized" system with finite memory that cannot encode an absence of anything - relation, concept, entity - an LLM cannot encode an absence of something through its internal weights. Say, you can have "Product" or "Order" tables in your database, but you cannot have "NotAProduct" or "NotAnOrder" tables - for the obvious reason that such relations are infinite and uncountable. So, to establish an absence of Product or Order your application must execute a "search" operation through the relevant tables. But in LLM-space a "search" operation does not exist. It is mathematically undefined. An LLM arrives at output (or "what to do") through a sequence of projections of the input token vector through its "latent space". It "moves toward" high-probability clusters, fundamentally unable to "move away". So, the success of any "negation" in the prompt ("don't touch this file", "draw me a ballot box without a flag on it") depends on how heavily such a scenario is represented in the training data/model space. And again, the absence-of-something may be hard-to-impossible to usefully encode, especially if "something" is not fixed. Therefore, to expect the "don't touch this file" sentence to result in, well, not touching the file is pure gambling. Sometimes it may look like it is working, albeit for the wrong reasons, and some other times the LLM may do exactly the opposite - because its weight matrix statistically pushes it towards "touch this file", completely ignoring the (nonexistent in its latent space) "don't".
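To make the database half of that analogy concrete, here is a minimal sketch, assuming an in-memory SQLite database. The `product` table, its rows and the `is_product` helper are made up for illustration. It only shows that absence is never stored anywhere; it is established by searching what is stored. It says nothing about how an LLM works internally.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical product table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE product (name TEXT PRIMARY KEY)")
    conn.executemany(
        "INSERT INTO product (name) VALUES (?)",
        [("widget",), ("gadget",)],
    )
    return conn


def is_product(conn: sqlite3.Connection, name: str) -> bool:
    """There is no 'not_a_product' table; absence is the result of a search."""
    row = conn.execute(
        "SELECT 1 FROM product WHERE name = ? LIMIT 1", (name,)
    ).fetchone()
    return row is not None


conn = make_db()
print(is_product(conn, "widget"))  # a stored row is found
print(is_product(conn, "gizmo"))   # nothing stored, so the search comes back empty
```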

There is no way to reliably know what will work, and no "skill" or "art" in this. Well, no more than in dice rolling or horoscope casting.

I'd like to add that for the above reason I find the usefulness of "agentic development" on par with reading avian remains. But when I explored it, two pieces of practical advice seemed helpful in nudging the LLM around the negation problem:

- Omit the "don't" prompt completely, thus not creating a false "attractor" for LLM; and

- Provide an alternate positive directive ("what to DO", not "what to NOT DO") to act as "escape hatch" when LLM might "want" to touch the sacred file or drop the production DB.
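The two points above could be sketched roughly like this. Everything here is hypothetical: the helper names, the file names and the list of negation markers are mine, and nothing in it guarantees that a model will follow the resulting prompt. It only shows the mechanical part, i.e. flagging "don't"-style rules and building the prompt from positive directives.

```python
# Hypothetical, deliberately crude list of phrases that signal a negated rule.
NEGATION_MARKERS = ("don't", "do not", "never")


def find_negations(directives: list[str]) -> list[str]:
    """Return the directives that are phrased as 'what NOT to do'."""
    return [
        d for d in directives
        if any(marker in d.lower() for marker in NEGATION_MARKERS)
    ]


def build_prompt(task: str, directives: list[str]) -> str:
    """Assemble a prompt from a task and a list of positive directives."""
    lines = [task, "", "Rules:"]
    lines += [f"- {d}" for d in directives]
    return "\n".join(lines)


# Assumed example: the "sacred file" is config.lock, the escape hatch is NOTES.md.
directives = [
    "Write all changes to files under src/ only.",
    "When a change to config.lock seems necessary, stop and describe it in NOTES.md.",
]

assert find_negations(directives) == []  # no "don't" attractor left in the rules
prompt = build_prompt("Fix the failing date-parsing test.", directives)
print(prompt)
```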

While it looked like somewhat working, I think it is trivially obvious that trying to predict all the nonsense LLM might want to perform and coming up with possible "escape hatches" for everything very quickly becomes utterly impractical.


It seems that author unironically advises to write your commit messages like this: "Restructured Claude’s module architecture, rejected initial state management approach, rewrote error handling from scratch", to have a chance at defense in potential court hearing. I find it funny, if vindicating for my personal approach. If the expectation is to "restructure, reject, rewrite" what "AI" spits out, why use "AI" at all at this point???


To my understanding, LLM, by design, is unable to encode negation semantics. Neither negation "operation", nor any other "subtractive" operations are computable in LLM machinery. Thinking out loud, in your example the "Read code" and "Form hypothesis" seem to be useful instructions for what you want, while "Do not write any code" and "Not to fix the bug" might actually be misleading for the model. Intuitively (in human terms) one would imagine that, when given such "instruction", LLM would be repelled from latent-space region associated with "write any code" or "fix the bug". But in reality LLM cannot be "repelled", it is just attracted to the region associated with full, negated "DO NOT <xxxx>". And this region probably either has a significant overlap with the former ("DO <xxx>") or even includes it wholesale. This may explain why it sometimes seems to "work" as intended, albeit accidentally. My 2c.

