It’s All About the Context: Coding, Conversations, Books and Priming Capacity


Overview

I have often felt that working memory does not quite capture our entire context as we drift through the world. The common understanding of working memory claims that we can juggle 5-9 items in our heads, and does not seem to differentiate between intentional and unintentional use of these items. Yet when holding a conversation or writing an article, what I say or do is constrained and informed by an expansive swath of the context so far, often without my doing so intentionally (for example, not repeating what’s already been said).

This post runs through some of my thoughts on this matter, beginning with two prerequisite concepts of utterances and the Priming Effect. These lead into my working model of the above observation: the priming capacity. The post concludes with some thought experiments about savants with perfect priming capacity and why AI probably won’t just be a “copycat of what’s already out there”.

Utterances

A non-intuitive way I’ve come to see things is that we’re essentially generative models. Deconstructing a human as an engineer-philosopher might, it seems that a lot of our operation can be described as follows:

  1. We get some kind of cue. It can come from the environment or yourself - for example, a thought.
  2. Through some black-box magic, the brain takes that cue as an input and outputs an utterance.

As two examples:

  1. The sensory cues from a muffin induce a complex utterance, starting with the sequence of motor commands to pick it up.
  2. The cue of “make a pancake” induces you to generate a progression of steps to do so, or just the first step, depending on working style.

The Priming Effect

Not too long ago, I asked my roommate’s partner - a psychologist-in-the-making - what her favourite psychological phenomenon was. She mentioned the Priming Effect. I had heard of it ages ago in Kahneman’s “Thinking, Fast and Slow”, but that night it brought me an exciting thought!

Briefly, the effect is that our utterances are influenced by immediate context without us being aware of it. A classic experiment goes something like this:

The experiment subject rolls a die and it lands on side N. Ask the subject to guess the population of Stockholm. Odds are, the first digit of their guess will be closer to N than it would have been without the die roll.

The Priming Capacity

The main topic of this post is our fascinating ability to ground what we do or say with respect to what has already been done or said.

An example is a book: a fabric of hundreds of interwoven statements, each of which must avoid logical contradiction with, and maintain aesthetic coherence relative to, the ones prior. The most familiar example is probably when you're quite tired and/or overwhelmed with the topic at hand and feel unconfident speaking, since you know you'll probably "say something dumb".

This ability seems to be related to the priming effect. When, for instance, we converse, we do not explicitly recall every item we have just discussed and every item leading up to the conversation in order to validate every phrase that we say. And yet it seems that we can stay relevant to the context automatically. Coders may have noticed a similar effect. However, this ability to thread our utterances into the space left by the prior ones appears to be limited. Conversations that go in circles. Bugs in code. Repeating yourself in a report. Let’s call this limitation the priming capacity.

Some scattered thoughts about this phenomenon:

  1. We typically try to overcome our limited priming capacity through systematic approaches. An example is structuring a report before you dive into writing it. Unorganized workers might really be folks who suffered an unusually high priming capacity during development …
  2. In general, as the sequence of utterances progresses, there is a decreasing sleeve of valid utterances. This sleeve is likely massive, and I'm guessing that people navigate it largely via "common sense", including shared norms.
    • Surely you’ve had the experience where what someone says sounds completely wrong, yet you realize it’s logically correct! This might have resulted from different guiding forces within this sleeve of logical correctness.
  3. For the theoretically inclined:
    • A priming capacity of 1 reminds me of two things: a Markov Chain, since the distribution relies only on the most recent thing, and myself late at night at a party.
    • A priming capacity of N reminds me of the N-gram, which has a breathtaking rendition in Shannon’s original Information Theory paper. (A toy sketch follows this list.)
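
To make the analogy concrete, here is a minimal sketch in Python - my own illustration, not anything from Shannon's paper - of an N-gram text generator: each new token is chosen by looking only at the previous N tokens, which is the literal version of a priming capacity of N. Setting N = 1 recovers the Markov-chain case.

```python
import random
from collections import defaultdict

def build_ngram_model(tokens, n):
    """Count which words follow each length-n context in a corpus."""
    model = defaultdict(list)
    for i in range(len(tokens) - n):
        context = tuple(tokens[i:i + n])      # the previous n "utterances"
        model[context].append(tokens[i + n])  # what followed them
    return model

def generate(model, seed, length, n):
    """Generate text where every new word depends only on the previous n words."""
    out = list(seed)
    for _ in range(length):
        context = tuple(out[-n:])
        followers = model.get(context)
        if not followers:                     # no observed continuation: stop
            break
        out.append(random.choice(followers))
    return " ".join(out)

corpus = "the cat sat on the mat and the dog slept on the mat".split()
n = 1                                         # n = 1 is the Markov-chain case
model = build_ngram_model(corpus, n)
print(generate(model, seed=corpus[:n], length=10, n=n))
```

With N = 1 the output forgets everything but the last word and quickly wanders in circles; raising N keeps it more faithful to its own past, which is roughly the intuition behind the capacity.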

Markov Schmarkov: Infinite Priming Capacity

Meet Joe, your average family man. Works 9-5 as a coder, plays Saturday baseball, has a beer with dinner. Except one thing: everything he does is done in the context of everything he’s ever done. Is he a savant, tortured by a useless ability, or the next Newton?

Making Music: This subject reminds me of Bach, whose fugues reference themselves in clever ways throughout, or of a long orchestral piece. However, if enjoying a 4-minute fugue performance needs a priming capacity of 5N relative to my own N, Joe can tuck whole worlds of clever reference into the passage of notes with not a kindred soul in sight!

Debate: Joe is invited to debate Mehdi Hasan at the Oxford Union. I’d suspect that in the first half-hour, Mehdi would dominate with his eloquence and rhetoric. But as time goes on, Joe begins annihilating Mehdi by pointing out contradiction after contradiction… Joe’s victory is a cold statistical certainty.

Technical Work: A fun distinction is between active and passive work, where the former seeks to build and the latter seeks to understand systems. For passive work, say physics research, Joe’s uncanny ability would likely have limited use - we’ve moved from philosophizing the nature of matter to measuring it for a reason. However, active work such as coding and engineering complicated systems may be primarily constrained by the engineers’ abilities to build on existing work. Bugs and system vulnerabilities may be nothing but artifacts of the priming limits of engineers.

What this Means for AI

When we build AI that can understand the impressions of the world, it is reasonable to assume that it will be bundled with near-infinite priming capacity and inhumanly fast utterance generation. Now I pay homage to the proselytizers of the AI singularity with two quick thoughts on mathematics and engineering.

Polya’s book on problem solving and my time spent with math enthusiasts have led me to view solving difficult math problems as an evolutionary process (a toy sketch follows the list):

  1. Generate a random idea
  2. Execute it
  3. Repeat until getting to the solution
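
As a caricature of that loop, here is a sketch in Python. The names (generate_idea, execute, is_solution) are placeholders of my own, not anything from Polya, and the tried set is a crude stand-in for priming capacity keeping us from repeating ideas.

```python
import random

def solve(problem, generate_idea, execute, is_solution, max_tries=10_000):
    """Caricature of problem solving as generate-and-test."""
    tried = set()                      # crude 'priming buffer': remember what was attempted
    for _ in range(max_tries):
        idea = generate_idea(problem)
        if idea in tried:              # a working priming capacity: don't repeat ideas
            continue
        tried.add(idea)
        result = execute(problem, idea)
        if is_solution(problem, result):
            return result
    return None                        # gave up

# Toy usage: find an integer whose square is 144 by blind guessing.
answer = solve(
    problem=144,
    generate_idea=lambda p: random.randint(-20, 20),
    execute=lambda p, x: x,            # "executing" the idea is trivial here
    is_solution=lambda p, x: x * x == p,
)
print(answer)                          # 12 or -12
```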

Here, the distinction between priming capacity and working memory gets hazy. Perhaps working memory determines the complexity of an idea that you can execute, for instance one requiring the simultaneous manipulation of 5 symbols versus 20. Priming capacity might be what people use to not repeat the same idea again. Regardless, the above properties may have AI solving conjectures before long.

Additionally, systems engineered by humans seem to suffer from the phenomenon of gravitational implosion. For example, code-based projects tend to become buggier as they grow larger. We modularize our systems to counteract this, but modularization can be artificial and lead to significant overhead. If our man-made systems are like companies of one-person departments, perhaps the systems designed by an AI would be like agile start-ups.

The human-centric manner in which we engineer our systems limits what we can design to a small, distant corner in the space of possible designs. A reasoning AI, equipped with all the benefits of Von Neumann silicon, could perhaps overcome the limits of our feeble brain-hardware and psychology and could engineer systems of unimaginable complexity and grace.

Appendix: Pending Thinking

There’s still lots to be worked through before this post can present a legitimate model. For instance,

  • Respecting logical constraints means you know the sleeve of non-contradiction. However, it might be a separate question whether one who sees the sleeve of non-contradiction can also generate utterances within it. From observation, much of the fuel for conversation probably comes from people not being fully relevant to what’s been said already. This invokes the inverse picture of the person who is too cognizant of the context and thus can think of nothing good to say.
    • Would a system with a large enough priming capacity just get stuck?
  • In a similar vein, the priming capacity has been defined with respect to utterances of the past. However, it seems in life that there are those who are primed by the future, in that their utterances appear to be “future-proof”. In the black box of the mind which generates our utterances, are we simulating future trajectories and choosing utterances that maximize results across them?

Appendix 2: Relationship with Short-term and Long-term Memory

Working memory, as defined by Alan Baddeley, one of the pioneers of the field, appears to tackle a different phenomenon than Priming Capacity. Working memory, which is distinct from long-term memory, is measured by one's ability to recite items while maintaining their sequential order, through tests such as digit span, i.e. how many uncorrelated digits one can recall. Such measures are confounded by factors like chunking, e.g. we can recite more numbers when we memorize them in groups of three rather than one at a time.

With the concept of Priming Capacity, I try to explain generation and not just recitation. People place items into their Priming Buffer, and without a transparent formula, these items get converted into actions like what people say and write. It's possible, as in the above case of the Priming Effect, that people cannot recite what's in their buffer, despite it affecting what they do.

In practice, people who say very relevant things appear to have good context, which comes from a mixture of loading a priori knowledge (stuff they already know, say, about the conversation partner) into their priming buffer and cycling new information through it over the course of a conversation. Much as our eyes typically dart around over a hundred times a minute without our knowing it, under the control of particular brain areas, this management of the priming buffer is probably also automatic and differs in efficacy between people.
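
A toy model of this priming buffer - my own sketch, not a claim about neural machinery - might be a bounded queue: a priori knowledge is preloaded, new items evict old ones once the capacity is full, and generation only "sees" whatever is currently inside.

```python
from collections import deque

class PrimingBuffer:
    """Toy model: a bounded buffer of recent context that conditions generation.
    Old items fall out once capacity is exceeded, mirroring a finite priming capacity."""

    def __init__(self, capacity, prior_knowledge=()):
        self.buffer = deque(prior_knowledge, maxlen=capacity)  # preload a priori context

    def observe(self, item):
        """Cycle new information through the buffer (oldest items are evicted)."""
        self.buffer.append(item)

    def utter(self, candidates):
        """Prefer a candidate utterance that has not already appeared in the buffer."""
        for candidate in candidates:
            if candidate not in self.buffer:
                self.observe(candidate)   # what we say becomes part of the context
                return candidate
        return None                       # nothing new to add: the conversation stalls

buf = PrimingBuffer(capacity=3, prior_knowledge=["hello"])
for remark in ["hello", "how are you?", "hello", "nice weather"]:
    print(buf.utter([remark]))
```

The first and third remarks come back as None because "hello" is already in the buffer - the toy analogue of not repeating what has already been said.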

Further Reading

  1. How Emotions are Made: The Secret Life of the Brain (book)
  2. Cognitive Load Theory (field)
  3. Thinking, Fast and Slow (book)

Acknowledgements

A big thanks to Anton, Auguste, Nathan and Serene for the feedback! And I thought I could get away with having ChatGPT write the first version of this blog for me…

January 11, 2023
