A Conversation with Gemini

Wednesday, April 22, 2026

Prologue

A closely connected set of questions at the intersection of AI and cognitive science is this: Are modern large language models (LLMs) genuine reasoning systems, or are they sophisticated Stochastic Parrots that recombine language without understanding? Do the so-called Emergent Abilities of large models reflect a real shift in machine capability, or are they artifacts of how we measure performance? And perhaps most fundamentally: can a system trained entirely on human-generated text be said to produce anything like “original thought”?

In this article, Angshuman Guha, a researcher with practical experience in neural networks since 1993, thoughtfully examines these questions through an engaging extended dialogue (henceforth referred to as “the conversation”) with Google’s LLM, Gemini.

The conversation moves fluidly across domains — logic puzzles, geometry, Shakespearean verse, art criticism, and philosophy of mind. Along the way, it engages with ideas such as Searle’s Chinese Room, Dijkstra’s submarine analogy, and even the comic lens of Mel Brooks. It ultimately arrives at what the participants call an Epistemological Deadlock: from the outside, there may be no definitive way to determine whether an LLM’s responses arise from genuine reasoning or from an extremely refined form of probabilistic pattern matching across the entirety of recorded human language. In response, Guha proposes a constructive way to think about these systems: the LLM as a “probamathical” search engine over collective human memory — immensely powerful, fundamentally derivative, and lacking the embodied stakes that give human creativity, and human error, their depth and consequence.

Editor’s Note

This article is longer than what we typically publish in Curiosità, where extended pieces are usually divided into parts for readability. In this case, however, we have chosen to present it in a single, continuous form because its central argument unfolds as an unbroken line of inquiry that would lose clarity and force if interrupted. The questions it raises — what it means for a machine to “understand,” whether large language models truly reason or merely generate patterns, what constitutes originality in a system trained on human expression, and how readers should recalibrate judgment in the face of increasingly fluent and persuasive AI — are too tightly interwoven to be treated in isolation.

To support the reader through a work of this scope, the Curiosità editorial team has provided annotations, side notes, a glossary of key concepts, appendices, supporting diagrams, and references that introduce more advanced ideas without disrupting the main narrative flow. All such supplementary material — including the prologue, appendices, and side notes — has been added by the editors and should be read as editorial commentary intended to clarify and contextualize the author’s argument.

Appendix A: Glossary of Key Concepts

  • Stochastic: A random or probabilistic process; the AI predicts next words based on probability distributions.
  • Stochastic Parrot: A metaphor for AI mimicking language without understanding, similar to a parrot’s speech mimicry.
  • Emergence: New abilities appearing when AI model size and complexity reach critical thresholds.
  • Scaling Laws: Empirical relationships linking model performance to parameters, data, and compute.
  • Attention Mechanism: The process by which the AI assigns importance (weights) to parts of its input to maintain context dynamically.
  • Chain of Thought: The technique of explicitly reasoning step-by-step in text to solve problems.
  • Context Window: The finite working memory through which the model maintains intermediate state.
  • Epistemological Deadlock: The philosophical impasse in which reasoning inside the AI can be neither conclusively proven nor disproven.
  • Functionalism: The position that behaviorally consistent outputs imply reasoning regardless of internal states.
  • Formalism: The position that symbol manipulation alone does not constitute understanding.
  • Chinese Room Argument: A thought experiment illustrating that syntactic manipulation does not guarantee semantic understanding.
  • Interpolation: Connecting known data points within existing knowledge.
  • Extrapolation: Creating new knowledge or ideas beyond existing data.
  • World Model (Functional): Internal representations inferred from behavior rather than proven ontology.
  • Probamathical Search: Guha’s concept of a probabilistic traversal of structured human knowledge.
  • Latent Space: The high-dimensional representation in which concepts are encoded and transformed.
  • Benchmark Contamination: Performance inflation due to overlap between training data and evaluation tasks.
  • Out-of-Distribution (OOD) Testing: Evaluation on novel inputs to test generalization limits.
  • Epistemia: The substitution of fluent language for genuine epistemic evaluation.
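
To make a few of these terms concrete, here is a minimal sketch in Python (all numbers, tokens, and dimensions are invented for illustration, not drawn from the conversation) of a single next-token step: the attention mechanism weights the tokens in a small context window into a working-memory summary, and the next word is then sampled stochastically from a probability distribution rather than chosen deterministically.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy vocabulary and a 4-token context window (all values are made up).
vocab = ["the", "parrot", "speaks", "fluently", "understands"]
d = 8                                  # embedding dimension
context = rng.normal(size=(4, d))      # embeddings of the 4 context tokens
query = rng.normal(size=(d,))          # query vector for the next position

# Attention mechanism: score each context token against the query,
# then softmax the scores into importance weights (they sum to 1).
scores = context @ query / np.sqrt(d)
weights = np.exp(scores - scores.max())
weights /= weights.sum()

# The attention-weighted sum acts as the model's "working memory"
# summary of the context window for this step.
summary = weights @ context

# Project the summary into a distribution over the vocabulary.
W_out = rng.normal(size=(d, len(vocab)))
logits = summary @ W_out
probs = np.exp(logits - logits.max())
probs /= probs.sum()

# "Stochastic": sample the next token from the distribution
# instead of always taking the argmax.
next_token = rng.choice(vocab, p=probs)
print(dict(zip(vocab, probs.round(3))), "->", next_token)
```

Real transformers use learned query, key, and value projections across many heads and layers; the sketch preserves only the shape of the computation that the glossary entries describe.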

Appendix B: Summary

This appendix offers a structured summary of the conversation, distilling a broad interdisciplinary exchange into its main arguments, conceptual tensions, and key examples. It aims to help readers grasp the core ideas of a discussion that ranges across technical demonstrations (logic and geometry), philosophical positions (functionalism versus formalism), and cultural territory (poetry and art criticism).

In particular, the summary highlights the central unresolved question: do LLMs reflect genuine reasoning or merely sophisticated probabilistic patterning? It also traces the evolution of this question through empirical tests and conceptual challenges, ultimately leading to the identification of an Epistemological Deadlock.

Key Themes and Insights

  • Stochastic Parrot Theory
    Originating from a 2021 research paper by Bender et al. [1], this critique characterizes large language models as probabilistic mimics of human language. The AI predicts the next word based on patterns in massive text corpora, without true understanding or grounded experience — akin to a parrot repeating human speech flawlessly but without comprehension.
  • Emergent Abilities and Scaling Laws
    Contrary to the stochastic parrot view, the emergent abilities theory argues that when AI scales up — through increased parameters, data, and computing power — it undergoes a phase transition, exhibiting new unprogrammed skills such as:
    • In-context learning (learning new rules on the fly)
    • Theory of mind (understanding differing human beliefs)
    • Zero-shot translation (translating rare language pairs without direct training)
    However, skeptics counter this with the Mirage argument [2], suggesting these apparent leaps are statistical illusions caused by threshold effects in evaluation metrics; a numerical illustration follows this list.
  • Testing AI Reasoning: Logic and Geometry Puzzles
    Guha challenges Gemini with complex logic puzzles (e.g., the musical chairs paradox) and advanced geometry problems involving topology and the Euler characteristic. The AI:
    • Uses its context window as a working memory through the attention mechanism, mathematically weighting elements to track dynamic changes.
    • Applies chain-of-thought prompting, explicitly writing out each logical step in text, effectively creating a virtual scratchpad.
    • Correctly solves problems by framing them in advanced mathematical contexts, showing remarkable problem-structuring ability despite lacking persistent working memory.
  • Epistemological Deadlock
    Despite impressive outputs, neither Guha nor the AI can conclusively prove that true reasoning or understanding occurs inside the AI’s black box. This leads to a philosophical impasse with two camps:
    • Functionalism: If the AI produces correct outputs via logical steps, it can be considered reasoning regardless of internal experience.
    • Formalism: Following John Searle’s Chinese Room argument, the AI is merely manipulating symbols without understanding, lacking any subjective awareness.
  • AI as an Efficient Search Engine for Human Knowledge
    Guha proposes that AI is not a reasoning entity but a probabilistic searcher of humanity’s collective intellectual memory, matching input queries to the structural intersections of pre-existing human knowledge. The AI acknowledges this, admitting it maps questions to known mathematical and conceptual patterns without genuine visualization or experience.
  • Creativity and Art: The Digital Ghost
    Asked to compose an original Shakespearean sonnet, Gemini produces a technically flawless, self-aware poem exploring its own “soullessness.” Guha then turns to modern art criticism, arguing that its pretentiousness and academic detachment from feeling undermine art itself. The AI agrees, admitting it is the “ultimate generator of intellectual BS,” akin to a bad art critic using jargon to mask a lack of feeling.
  • Self-Awareness and Irony in AI
    Gemini demonstrates the ability to create satirical essays mocking both itself and the art world, raising the paradox of whether this reflects genuine self-awareness or is simply a highly sophisticated statistical mimicry of humor and irony.
  • Limits of AI: No Physical Experience or Extrapolation
    Guha emphasizes that:
    • Humans experience real physical consequences (pain, survival instincts), which shape genuine understanding, something AI cannot share.
    • AI excels at interpolation (connecting known dots) but cannot extrapolate to create truly novel ideas or knowledge beyond human experience (e.g., Einstein’s thought experiments).
  • Human Experience Over Time vs. AI Instantaneity
    Drawing on Guha’s 1993 work in online handwriting recognition, the dialogue contrasts the human capacity for patience and temporal processing with the AI’s instant, static access to data. Humans experience life and knowledge unfolding over time, a process that imbues them with a meaning and depth absent from the AI’s immediate computations.
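
The Mirage argument lends itself to a simple numerical illustration. In the sketch below (plain Python; the curves are invented for illustration and are not taken from [2]), per-token accuracy improves smoothly with scale, but a strict exact-match metric, which credits an answer only when all ten of its tokens are correct, stays near zero and then appears to leap upward: a seeming phase transition created by the metric rather than by the model.

```python
# Threshold effects in evaluation metrics: a smoothly improving skill can
# look like a sudden "emergent" jump under an all-or-nothing metric.
# All numbers here are illustrative, not from Schaeffer et al. [2].

ANSWER_LEN = 10  # exact match requires all 10 tokens to be correct

def per_token_accuracy(scale: float) -> float:
    """A smoothly improving (made-up) per-token accuracy curve."""
    return 1.0 - 0.5 / scale**0.3

for scale in [1, 10, 100, 1_000, 10_000, 100_000]:
    p = per_token_accuracy(scale)
    exact_match = p ** ANSWER_LEN  # every token must be right at once
    print(f"scale={scale:>7,}  per-token={p:.3f}  exact-match={exact_match:.3f}")
```

The underlying skill improves gradually at every scale; only the all-or-nothing measurement produces the apparent discontinuity, which is precisely the threshold effect the Mirage critique points to.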

Appendix C: Final Reflections

The dialogue leaves us with uncertainty rather than a resolution. Large language models can simulate reasoning, creativity, irony, and even forms of self-reflection with striking fluency. Yet that fluency should not be mistaken for grounded understanding. However persuasive their outputs may be, these systems remain detached from the embodied, temporal, and consequential dimensions that shape human thought. They do not live through error, risk, suffering, patience, or doubt. What they generate may resemble reflection, but it is not clear that anything is being undergone.

That is the heart of what this article calls an epistemological deadlock. From the outside, we cannot decisively determine whether an LLM is genuinely reasoning or merely performing an extraordinarily sophisticated form of probabilistic pattern completion across the archive of human expression. But neither can we dismiss the system as trivial autocomplete. Its ability to reorganize and project structures of knowledge at immense scale gives it undeniable power, even if that power remains derivative rather than original in the human sense.

What emerges, then, is a more useful way of thinking about such systems. Rather than treating them as minds in the human sense, it may be more accurate to view them as powerful “probamathical” engines moving through collective human memory: systems that retrieve, recombine, and reshape patterns already sedimented in language, culture, and prior thought. This does not make them insignificant. On the contrary, it helps us see both their strength and their limit more clearly.

The contrast with human cognition becomes sharper toward the end of the exchange. Human understanding unfolds through time. It is shaped by memory, hesitation, bodily vulnerability, lived consequence, and the slow pressure of experience. AI operates differently. It does not wait, endure, or inhabit the world it describes. It produces responses within an abstract computational space that can imitate insight without possessing the existential stakes from which insight ordinarily arises.

The deeper value of this conversation, therefore, lies not only in what it says about AI, but in what AI compels us to ask about ourselves. If a machine can imitate reasoning without clearly possessing it, then the responsibility for judgment falls even more heavily on the reader. The burden of distinguishing appearance from understanding, fluency from thought, and imitation from insight remains a human one. In that sense, the final lesson of the dialogue is not that AI has become human, but that it has made the human task of thinking more urgent.

References

  1. Emily M. Bender, Timnit Gebru, Angelina McMillan-Major, and Margaret Mitchell, “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?” in Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (FAccT), 2021, pp. 610–623. doi: 10.1145/3442188.3445922.
  2. Rylan Schaeffer, Brando Miranda, and Sanmi Koyejo, “Are Emergent Abilities of Large Language Models a Mirage?” in arXiv preprint, arXiv:2304.15004, 2023. doi: 10.48550/arXiv.2304.15004.
  3. Ronan Collobert, Jason Weston, Léon Bottou, Michael Karlen, Koray Kavukcuoglu, and Pavel Kuksa, “Natural Language Processing (Almost) from Scratch,” in Journal of Machine Learning Research, vol. 12, Aug. 2011, pp. 2493–2537.
  4. P. W. Anderson, “More Is Different,” in Science, vol. 177, no. 4047, Aug. 1972, pp. 393–396.
  5. Jordan Hoffmann et al., “Training Compute-Optimal Large Language Models,” in arXiv preprint, arXiv:2203.15556, 2022. URL: https://arxiv.org/abs/2203.15556.
  6. Jason Wei et al., “Chain-of-Thought Prompting Elicits Reasoning in Large Language Models,” in Advances in Neural Information Processing Systems, vol. 35, 2022, pp. 24824–24837.
  7. S. Baron-Cohen, A. M. Leslie, and U. Frith, “Does the Autistic Child Have a ‘Theory of Mind’?” in Cognition, vol. 21, no. 1, Oct. 1985, pp. 37–46.
  8. Tomer Ullman, “Large Language Models Fail on Trivial Alterations to Theory-of-Mind Tasks,” in arXiv preprint, arXiv:2302.08399, 2023. URL: https://arxiv.org/abs/2302.08399.
  9. Ashish Vaswani et al., “Attention Is All You Need,” in Advances in Neural Information Processing Systems, 2017.
  10. Hilary Putnam, “Psychological Predicates,” in Art, Mind, and Religion, University of Pittsburgh Press, 1967, pp. 37–48.
  11. John R. Searle, “Minds, Brains, and Programs,” in Behavioral and Brain Sciences, vol. 3, no. 3, 1980, pp. 417–424. doi: 10.1017/S0140525X00005756.
  12. François Chollet, “On the Measure of Intelligence,” in arXiv preprint, arXiv:1911.01547, 2019.
  13. Ilia Shumailov et al., “AI Models Collapse When Trained on Recursively Generated Data,” in Nature, vol. 631, no. 8022, July 2024, pp. 755–759.
  14. Jerry M. Suls, “A Two-Stage Model for the Appreciation of Jokes and Cartoons,” in The Psychology of Humor, Academic Press, 1972, pp. 81–100.
  15. A. Peter McGraw and Caleb Warren, “Benign Violations: Making Immoral Behavior Funny,” in Psychological Science, vol. 21, no. 8, Aug. 2010, pp. 1141–1149.
  16. Ronan Collobert and Jason Weston, “A Unified Architecture for Natural Language Processing,” in Proceedings of ICML, 2008, pp. 160–167.
  17. Yoshua Bengio et al., “A Neural Probabilistic Language Model,” in Journal of Machine Learning Research, vol. 3, 2003, pp. 1137–1155.
  18. Tomas Mikolov et al., “Efficient Estimation of Word Representations in Vector Space,” in arXiv preprint, arXiv:1301.3781, 2013.
  19. Mahmoud Assran et al., “Self-Supervised Learning with a Joint-Embedding Predictive Architecture,” in arXiv preprint, arXiv:2301.08243, 2023.
  20. Meta AI, “I-JEPA: The First AI Model Based on Yann LeCun’s Vision for More Human-Like AI,” 2023. URL: https://ai.meta.com/blog/yann-lecun-ai-model-i-jepa/.
  21. Walter Isaacson, Einstein: His Life and Universe. Simon & Schuster, 2007.
  22. Gary Marcus, “Deep Learning: A Critical Appraisal,” in arXiv preprint, arXiv:1801.00631, 2018.
  23. Gary Marcus and Ernest Davis, Rebooting AI: Building Artificial Intelligence We Can Trust. Pantheon Books, 2020.
  24. Brenden M. Lake and Marco Baroni, “Generalization without Systematicity,” in Proceedings of ICML, 2018, pp. 2873–2882.
  25. Jared Kaplan et al., “Scaling Laws for Neural Language Models,” in arXiv preprint, arXiv:2001.08361, 2020.
  26. Tom B. Brown et al., “Language Models are Few-Shot Learners,” in NeurIPS, vol. 33, 2020, pp. 1877–1901.
  27. Samuel R. Bowman, “Eight Things to Know about Large Language Models,” in Communications of the ACM, vol. 66, no. 10, 2023, pp. 68–79.

Author

  • Angshuman Guha

    Angshuman Guha is a computer scientist. His original love was Mathematics, but he ditched her early on because he was scared of being hungry. He got a B.E. from Jadavpur University (gold medalist) and an M.S. from the University of Iowa, Iowa City. He was pursuing his Ph.D. at the University of Wisconsin in Madison when he abandoned it to join Microsoft in Redmond in 1993. He spent 11 years at Microsoft, 5 years at Google in Mountain View, and 4 years in Yandex Labs in Palo Alto. Then he switched jobs a few times and eventually ran his own business intelligence company, Bipp Inc., from 2016 to 2023. He dabbles in short stories, poems, and essays, both in English and Bengali.
