
If a counterexample to a specific claim doesn't disprove the claim, that sometimes suggests the claim is unfalsifiable and therefore suspect.


The claim is that GPT-4 can reason sometimes. Evidence that GPT-4 fails to reason sometimes isn't a counterexample.


The people who made GPT-4 have said it does not reason. Please, for the love of god, drop this nonsense.


I never said that it does. I was pointing out a logical flaw in that person's argument. Also, why are the creators of GPT-4 authorities on what does or doesn't count as reasoning?


It’s suspect until it’s demonstrated. Once someone has demonstrated it, counterexamples are meaningless.

I claim I can juggle. I pick up three tennis balls and juggle them. You hand me three basketballs. I try and fail. My original claim, that I can juggle, still stands.


I disagree -- that would disprove your claim, as your claim was too broad. Same if they handed you chainsaws or elephants, or seventy-two tennis balls.

The more correct claim is you can juggle [some small number of items with particular properties].


Normal English implies that you can do something, not everything. It’s an any versus all distinction, and all is totally unreasonable except for the most formal circumstances.

“Can you ride a bike?”

“Yeah.”

“Prove it. Here I have the world’s smallest bicycle.” <- this person is not worth your time and attention.


It can't even count reliably. And this is a computer, not a human. That is one of the simplest things a computer should be able to do. It can't count because it doesn't know what counting is, not because it's unreliable in the way a human would be when counting. You cannot reason if you do not understand the concepts you are working with. The result is not the measure of success here, because it is good at mimicking, but when it fails at such a basic computing task, you can reasonably conclude it has no idea what it's doing.


Think about it step by step. There are people not able to count. We still say they can reason. A low ability to count does not disprove reasoning.


A child may not be able to count, because they don't yet understand the concept, but may be able to reason at a more basic level, yes. But GPT has ingested most of the books in the world and the entire internet. So if it hasn't learned to count, or what counting is by now, what is going to change? A child can learn the concept, GPT cannot. It doesn't understand concepts at all, it only seems like it does because the output is so close to what beings that do understand concepts generate themselves, and it's mimicking that.


There are adults who can't count as well as ChatGPT.


Ok, but how does that change the argument?


Inability to count does not prove inability to reason.


It can't count because it doesn't know what counting is. Otherwise it would be able to count like a computer can. After ingesting all the literature in the world, why can't it count yet?


It can, just not very well. Often the issue is that people have it count letters and words, but those aren’t the abstractions it deals with. It would be like if I told you to pick up the ball primarily reflecting 450 nm light and called you colorblind if you couldn’t do it.

GPT is built on a computer. It is not a computer itself. I am made of cells, but I am not able to perform mitosis.

It’s also very capable of offloading counting tasks, since computers are so good at that. Just suggest it format things in code and evaluate it.

Here’s an example of it counting. I’m pretty confident it’s never seen this exact problem before.

How many fruits are in this list? banana, hammer, wrench, watermelon, screwdriver, pineapple, peach

There are 4 fruits in the list you provided: banana, watermelon, pineapple, and peach. The other items are tools, not fruits.
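
For anyone wondering what "format things in code and evaluate it" could look like for the fruit question above, here is a minimal sketch. The `KNOWN_FRUITS` set and the `count_fruits` helper are made up for illustration; they are not anything GPT actually produced or runs internally. The classification step (which words are fruits) is exactly the part the model has to supply. Only the counting is handed off to the computer.

```python
# Hypothetical lookup table: deciding which words are fruits is the part
# a language model would have to supply; this set is an assumption.
KNOWN_FRUITS = {"banana", "watermelon", "pineapple", "peach"}


def count_fruits(text):
    """Split a comma-separated list and count the entries that are known fruits."""
    items = [item.strip().lower() for item in text.split(",")]
    fruits = [item for item in items if item in KNOWN_FRUITS]
    return len(fruits), fruits


count, fruits = count_fruits(
    "banana, hammer, wrench, watermelon, screwdriver, pineapple, peach"
)
print(count, fruits)  # 4 ['banana', 'watermelon', 'pineapple', 'peach']
```

Once the list is in this form, the count comes from `len`, which is deterministic, rather than from generated text.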


It's not counting, it's just outputting something that is likely to sound right. The reason it can't count is that it doesn't know that it should run a counting algorithm when you ask it to count, or what a counting algorithm is. There is no reason why an AI should not be able to run an algorithm. Coming up with appropriate algorithms and running them is one definition of AI. This is not an AI. It is an algorithm itself, but it cannot create or run algorithms, though it can output things that look like them. The "algorithms" it outputs are not even wrong: correctness and truth are orthogonal to its output, and without that you don't have real intelligence.


> It's not counting, it's just outputting something that is likely to sound right.

What does this mean? How can it possibly come up with the number 4 in a way that doesn’t involve counting?


How does it come up with comprehensible sentences that make logical sense about everyday things without understanding what these things are in the real world or even understanding what logic is? I don't have a stack trace for you, but if you want to hang on "the proof's in the pudding" and say the output is evidence for what it's doing, here's Chat GPT's output to your question:

There are two fruits in this list: banana and watermelon.

So what is the difference between GPT 4 and Chat GPT? Is GPT 4 all of a sudden running a counting algorithm over the appropriate words in the sentence because it understands that's what it needs to do to get the correct answer? Can anyone explain how you would get from whatever Chat GPT is doing to that by the changes to GPT 4? And more to the point, how does Chat GPT get to the answer 2 without counting? Apparently somehow, because if it can't even count to 4, I'm not sure how you could call whatever it's doing 'counting'.


> How does it come up with comprehensible sentences that make logical sense about everyday things without understanding what these things are in the real world or even understanding what logic is?

This is begging the question.

> Can anyone explain how you would get from whatever Chat GPT is doing to that by the changes to GPT 4?

I don't know. I can't explain how a 2-year-old can't count but an 8-year-old can, either. 3.5 generally can't count[0]. 4 generally can.

You've really sidestepped the question. I suggest you seriously consider it. What's the difference between something that looks like it's counting versus actually counting when it comes to data it's never seen before?

Thanks for discussing this with me.

[0] Add "Solve this like George Polya" to any problem and it will do a better job. When I do that, it's able to get to 4.


That's because

> I can juggle

is here shorthand for

> I can juggle at all; I can juggle at least some things

and the basketball case is only a counterexample to the much stronger claim

> I can juggle anything

But the argument about AIs reasoning has little to do with such examples, because juggling is about the ability to complete the task alone. When it comes to reasoning, there are questions about authenticity that don't have analogs in determining whether a person can juggle.


This thread is exactly such examples.

What would “not alone” mean? Do you think someone is passing it the answers? Of course it was trained, but that’s like cheating on a test by reading the material so you can keep a cheat sheet in your brain.



