On AI, Part 1: Computation, Not Cognition

Written on 31st March 2026

This post has been written with the help of AI tools: Wispr Flow for transcription, and ChatGPT to workshop the draft. I, however, take responsibility for all the words on this page. The thoughts are mine.


Back in 2007, when I was still in junior college, I had to write an independent study for a newfangled subject called Knowledge and Inquiry. For this independent study, we had to pick a field or a context and examine how knowledge is constructed within that context. So, with the brazenness of youth, I chose to write my independent study on artificial intelligence.

I had no formal background in computer science, philosophy of mind, psychology, nothing. I had done some light programming in Visual Basic and PHP, but I knew very little about software engineering, and certainly nothing about how artificial intelligence was actually implemented. The point of this independent study wasn't really to produce original research. The point was to learn how to think, how to structure an argument, and how to engage seriously with academic material.

The question that I formulated for that paper was:

If machines are computationally faster and more accurate than humans at some tasks, what is the role of human intelligence in constructing knowledge?

In 2007, the artificial intelligence landscape was very different from what it is now. Deep Blue beat Garry Kasparov in 1997, but Watson wouldn't compete in a public game of Jeopardy until 2011, and the AlphaGo project didn't begin until 2014.

It's no coincidence that games were and are an appealing domain for AI researchers. Games have structured rules. Many games have state spaces that are intractably large for the human brain to reason about, but that are still finite in theory, so a computer with the right algorithm can conceivably reason through them better than a human can. We even have a way to turn decision-making into math: it's called game theory. If you want to take something as vague, open-ended and context-dependent as decision-making, and translate it into something constrained enough for a computer to handle, games are a very natural fit.

My independent study made heavy reference to Claude Shannon's seminal 1950 paper "Programming a Computer for Playing Chess." Shannon talks about two different approaches that you can use when trying to create a computer program to play chess:

  • Type A, "in which all variations are considered out to a definite number of moves and the move then determined from a formula..." – basically a brute force search.
  • Type B, "... evaluate only at reasonable positions... select the variations to be explored by some process..." – narrowing the search field by some kind of heuristic.

Early on, when compute was very limited and expensive, most researchers focused on creating a Type B search for chess. In the 1970s, the calculus flipped: Northwestern University's chess program used a Type A strategy to successfully beat its opponents, and modern chess engines still essentially use a Type A strategy with alpha-beta pruning.
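To make the idea concrete, here's a minimal sketch of a Type A search with alpha-beta pruning. The game tree is a hypothetical nested list (not a real chess position), with leaf values representing board evaluations from the maximizing player's point of view:

```python
# Minimal sketch of a Type A search: exhaustive minimax, with alpha-beta
# pruning to skip branches the opponent would never allow.
# The "game tree" is a made-up nested list; leaves are evaluation scores.

def alphabeta(node, alpha, beta, maximizing):
    # Leaves are numeric evaluations; interior nodes are lists of children.
    if not isinstance(node, list):
        return node
    if maximizing:
        best = float("-inf")
        for child in node:
            best = max(best, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, best)
            if beta <= alpha:  # opponent already has a better option elsewhere
                break
        return best
    else:
        best = float("inf")
        for child in node:
            best = min(best, alphabeta(child, alpha, beta, True))
            beta = min(beta, best)
            if beta <= alpha:
                break
        return best

tree = [[3, 5], [6, [9, 2]], [1, 2]]
print(alphabeta(tree, float("-inf"), float("inf"), True))  # → 6
```

The pruning is the "little selection" Shannon mentions: the search is still brute force in spirit, but whole subtrees are skipped once they provably can't change the outcome.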

This, I think, is not an uncommon trajectory in the history of artificial intelligence. Early research begins with attempts to build a formal model of human-like reasoning, with limited success, followed by a gradual shift towards approaches that take advantage of what the computers are actually good at: speed, scale, and systematic exploration.

Shannon himself suggested as much:

It is not being suggested that we should design the strategy in our own image. Rather it should be matched to the capacities and weakness of the computer. The computer is strong in speed and accuracy and weak in analytical abilities and recognition. Hence, it should make more use of brutal calculation than humans, but with possible variations increasing by a factor of 10^3 every move, a little selection goes a long way toward improving blind trial and error.

Fine, but what does this have to do with LLMs?

Today, when somebody talks about AI, they're not talking about a chess-playing program. Almost invariably, they're talking about one of two things: either a machine learning model, or generative AI. So before I continue, I want to define my terms.

My usage of the term "AI" is a fairly big tent. Under the AI umbrella, I include:

  • Classical AI: these are systems that are fundamentally deterministic and rule-based. These AIs run input through a set of explicitly defined transformations, and the idea is that those transformations represent a model of human reasoning.
  • Machine Learning Models: these are probabilistic models defined with parameters and fed training data. Training an ML model means giving it inputs alongside the corresponding expected outputs, and the model adjusts its parameter weights to reduce the gap between its prediction and the expected answer. These models work well for classification problems like image recognition, speech recognition, and transcription.
  • Generative AI Models: like ML models, these are probabilistic in nature. Unlike ML models, the goal isn't classification, it's generation. A generative AI model processes the text, the images, the audio it's trained on, and probabilistically predicts the next word, the next pixel, the next element in the chain.
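To illustrate the ML bullet above, here's a toy version of that train-on-labeled-examples loop: a perceptron, the simplest possible classifier, nudging its weights whenever it gets a labeled example wrong. The dataset and learning rate are made up for illustration; real ML models have vastly more parameters and more sophisticated update rules.

```python
# Toy sketch of supervised training: a perceptron adjusts its weights
# whenever its prediction disagrees with the labeled answer.
# Data and learning rate are hypothetical, chosen for illustration.

def predict(weights, bias, x):
    s = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if s > 0 else 0

def train(examples, epochs=20, lr=0.1):
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for x, expected in examples:
            error = expected - predict(weights, bias, x)
            # Only adjust when the model got the example wrong (error != 0).
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias

# Hypothetical labeled data: the logical AND of two inputs.
data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
w, b = train(data)
print([predict(w, b, x) for x, _ in data])  # → [0, 0, 0, 1]
```

The key point is the feedback loop: nobody writes a rule saying "output 1 when both inputs are 1"; the rule emerges from repeated correction against labeled examples.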

There are, of course, other categorizations of AI models that I cannot really speak about with any degree of confidence, like models using deep learning techniques, or neural networks. Regardless, I think it's important to distinguish the different types of AI: I think that when we have discussions about "AI", we often talk past each other, especially in this time when the most visible incarnation of AI is large language models.

As an undergrad, I majored in Spanish and minored in linguistics, so I jumped at the opportunity to take a Natural Language Processing class in graduate school, hoping to combine my love of language with my professional interest in artificial intelligence. This was a recipe for disappointment, because NLP techniques involve very little linguistics.

Linguists try to build a model of language: what is the phonetic and phonemic inventory of this language? What are its morphological and syntactical rules? How does this language form words and sentences? How do languages create meaning and reduce ambiguity?

This is not at all how an AI language model works. Rather than taking the rules that linguists have distilled from natural language and encoding those rules into a language model, modern LLMs distill the structure of language from a massive amount of data. What an LLM is doing is building a map of probabilities in text. What words tend to appear together? In what order? Under what conditions? In what contexts?

Essentially, this is a version of the Type A strategy: LLMs bring a tremendous amount of computational power to bear on the problem of language, and end up being able to mimic human language use with a relatively high degree of verisimilitude.
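A toy illustration of that "map of probabilities": a bigram model over a made-up corpus, which counts which word tends to follow which and turns the counts into a distribution over next words. Real LLMs learn billions of parameters over vast corpora rather than raw counts, but the underlying question, "given this context, what comes next?", is the same.

```python
# Toy "map of probabilities in text": count which word follows which,
# then turn the counts into a next-word probability distribution.
# The corpus is made up for illustration; real LLMs learn parameters,
# not raw bigram counts, over enormously larger datasets.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat and the cat slept on the mat".split()

# For each word, count how often each other word immediately follows it.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word):
    counts = following[word]
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

print(predict_next("the"))  # → {'cat': 0.5, 'mat': 0.5}
```

Nothing in this model knows what a cat or a mat is; it only knows what tends to appear after "the" in the text it has seen. Scale that idea up enormously and you get something that looks much more fluent, but is built on the same foundation.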

However, that does not mean that an LLM understands language.

The Chinese Room

The philosopher John Searle proposed a thought experiment known as the Chinese Room: suppose we put a man inside a room and give him a magical rulebook that perfectly models the Chinese language. Someone outside the room slips a note written in Chinese under the door, and the man in the room looks through his magical Chinese rulebook and writes a response, in perfect Chinese, to the person outside the door. In this way, the man inside the room appears capable of conversing in Chinese. (We don't have to get hung up about whether such a model of any language is practical – we are only interested in the thought experiment.)

Now, let's say the man in the room is an English speaker. Instead of slipping in a Chinese note under the door, we slip in English language notes under the door. This man does not need a magical English rulebook to produce a response. He can read the text in English, and produce his own response in English.

Is there a qualitative difference between his Chinese responses, generated by the rule book, and his English responses? I think most of us would say that there is: the man in the room can generate Chinese, but he understands English.

In other words, the man means what he says when he responds in English, in a way that he does not when he responds in Chinese.

Whether or not you agree with John Searle's broader argument about artificial intelligence not being truly intelligent, I do think that the Chinese Room argument lays bare a common pitfall of LLM use: because LLMs generate sentences like a human, we ascribe human qualities to LLMs, especially understanding and reasoning. From there, it's a short jump from looking at an argument that mimics reasoning, to assuming correctness and intent on the part of the LLM.

To be clear, I think there's a tremendous amount of value to LLMs: an LLM's ability to process massive amounts of text, images, audio, any sort of input, far more than a human will see in many lifetimes, makes it very good at synthesizing output. However, I think it is very important to understand that the meaning-making comes from us.

Where does this leave humans?

So, we come back to the original question. If a machine can process data at a far greater scale than humans can, and if, from that scale, emerges something that looks like human reasoning, what's left for us? What type of knowledge work can humans do that machines cannot?

I think the answer comes precisely from the fact that progress in artificial intelligence has not come from trying to make the machine think more like a human – that has historically not worked. Progress in AI has come from taking advantage of what computers are best at, which is computation. But judgement in open-ended problems, decisions involving trade-offs, tricky situations requiring creative solutions – these are still areas where human judgement is and will always be essential.

In a world where content is abundant, taste and curation become scarce resources. When code is cheap, the value of human judgement lies in determining what code is valuable and what is noise.

I believe that AI, and particularly generative AI, is a tool, but one that we have not collectively learned to use well. If I use a calculator to calculate a tip, I would say that I did the math, even if the calculator did the calculating. If I load and unload the washer, I would say that I did the laundry, even if the washer did the washing. We accept the use of these tools because there is no benefit to us to do the work by hand, as long as we accomplish the task we set out to do.

The problem arises when we conflate different ideas of what our real task is supposed to be. If I am teaching a college class and set an assignment for my students, I might ask them to submit a piece of writing or a snippet of code, but what I'm really asking for is not the writing or the code, it is evidence of thinking and learning. A learner does not yet know what good looks like, so there is value in the doing. The task is the learning.

On the other hand, if I'm building a piece of software, the task is to make sure the software works, works well, and continues working well. It doesn't matter whether I typed out the code myself – but it does matter that I know what it's doing there. Yet, in order for me to take responsibility for that code in any meaningful way, I need to know what good software looks like, and that taste-building takes time, experience and yes – human judgement.

I don't think there will ever come a time when we completely abdicate this responsibility to AI tools. Nor should we.

P.S. all em-dashes in this post are mine – all mine. As a long-term, devoted em-dash user, I've become hyper-conscious of using it now that it's become a marker of ChatGPT usage, but no – I put them there myself. By hand. Like an artisan.