
Why AI Struggles with Common Sense: Exploring the Gap in Cognitive Leaps

The Limitations of Machine Learning in Logic and Reasoning

In the following excerpt, we examine the workings of generative AI systems like GPT-3. These systems mimic the predictive function of the human neocortex, yet they struggle with the aspects of language that demand reasoning.

GPT-3 is trained on a single task: predicting the next word. Throughout extensive training, it tries to forecast the next word in a continuous stream of text. With each prediction, the neural network’s weights are nudged slightly toward the correct answer. Over countless repetitions, GPT-3 learns to predict the next word from the preceding context. In essence, this mirrors a fundamental aspect of how human language operates. Think about how naturally you can anticipate the next word in familiar phrases:

  • “One plus one equals _____”
  • “Roses are red, violets are _____”

You’ve encountered such sentences countless times, and your neocortex effortlessly predicts the following word. What distinguishes GPT-3 is not its capacity to predict the next word in sequences it has seen extensively; that could be achieved through mere memorization. What’s truly impressive is that GPT-3 can handle novel sequences it has never encountered and still accurately forecast the next word. This mirrors an aspect of human language comprehension: faced with a sentence you had never read before, could you still predict that the next word is “blue”? Most likely, even though you hadn’t seen that precise sentence before. The shared attribute here is prediction. Both GPT-3 and the neocortical regions responsible for language exhibit predictive abilities. They can extrapolate from past experiences, apply that knowledge to new sentences, and estimate what comes next.
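To make the idea of next-word prediction concrete, here is a deliberately minimal sketch. It is not GPT-3’s architecture (GPT-3 is a large neural network trained by gradient descent); instead it is a toy count-based bigram model that simply tallies which word follows which in its training text. The training phrases are taken from the examples above.

```python
from collections import Counter, defaultdict

# Toy training text built from the familiar phrases discussed above.
training_text = (
    "one plus one equals two . "
    "roses are red violets are blue . "
    "one plus one equals two ."
).split()

# For each word, count which words have followed it.
follows = defaultdict(Counter)
for prev, nxt in zip(training_text, training_text[1:]):
    follows[prev][nxt] += 1  # each observation nudges the counts

def predict_next(word):
    """Return the most frequently observed successor of `word`."""
    return follows[word].most_common(1)[0][0]

print(predict_next("equals"))   # two
print(predict_next("violets"))  # are
```

Note what this toy model cannot do: it only memorizes pairs it has literally seen, so it fails on any novel sequence. That limitation is exactly why GPT-3’s ability to generalize to sentences it has never encountered is notable.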

However, when confronted with questions requiring simulation or reasoning, the differences between GPT-3 and the human brain become apparent. For instance:

  • “If 3x + 1 = 3, then x equals _____”

With this question, you likely paused to perform some mental arithmetic before arriving at an answer. In contrast, GPT-3 provides an incorrect answer, reflecting its failure to grasp even elementary algebra.
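The mental arithmetic a person performs here can be written out as two explicit steps, which is precisely the kind of simulation that next-word prediction alone does not perform. A short sketch of those steps:

```python
from fractions import Fraction

# Solve 3x + 1 = 3 for x, step by step.
a, b, c = 3, 1, 3          # coefficients from the equation in the text
# Step 1: subtract 1 from both sides -> 3x = 2
# Step 2: divide both sides by 3    -> x = 2/3
x = Fraction(c - b, a)
print(x)  # 2/3
```

The answer, x = 2/3, is reached by executing a procedure, not by recalling a frequently seen continuation.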

  • “I am in my windowless basement, and I look toward the sky, and I see _____”

In this case, GPT-3 envisions stars, while human cognition readily grasps that a basement ceiling would obscure such a view. The inadequacies of GPT-3’s responses highlight its lack of common sense.