Thinking about the mind

Thoughts are things occurring in some mental model (a vague sentence, but assume it makes sense). Some of these mental models are strongly rooted in reality (e.g. the mental model we experience as reality), so we have a high degree of confidence in their accuracy. But for things like introspection, we have no reliable ground-truth feedback to tell us whether our introspection is correct — it’s just our mental model of our mind; there is no literal “mind’s eye”.

So our introspection is often wrong. E.g. if you ask someone to visualize a lion from behind, they’ll say they can, but if you ask for details, like “what do the tail hairs look like?”, they can’t answer. Or a better example: if you ask someone to visualize a neural network, they will, but if you ask “how many neurons do you see?” they won’t know, and not for lack of counting. Or they’ll say they “think in words”, or that their internal monologue is fundamental to their thinking, but that’s obviously wrong: you have already decided what the rest of the sentence will be before you’ve thought the first word.

We can establish some basic facts about our thinking by reasoning from observation. For example, if you have an internal monologue (or just force yourself to have one), you can confirm that you indeed have one by speaking its words out loud and noticing that it takes very little cognitive effort (so you didn’t have to think them again). This proves that an internal monologue (precisely simulating words in your head) is possible. Likewise for any action.

Or you can confirm that you had a certain thought, or a thought about something, because you can express it out loud with less effort than otherwise. Though there is still room for that thought to have been imprecise; unless you verbalize or materialize your thoughts, you don’t know whether they were really precise. So all these things have grounding in reality, and are therefore likely to be accurate models (or can be trained to be, by consistently materializing them). By “materialize” I mean, e.g., actually solving a math problem you believe you can solve in your head.


1. Old speculation

1.1. Gradients

The mind allocates itself to different tasks based on the prices/pressures/gradients it receives from them.
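
To make that concrete, here is a toy sketch (the task names and pressure numbers are invented for illustration): attention is split across tasks in proportion to a softmax over the pressure each task currently exerts.

  # Toy sketch of pressure/gradient-based allocation (illustrative only;
  # the tasks and pressure values are invented for the example).
  import math

  def allocate(pressures: dict[str, float]) -> dict[str, float]:
      """Split a fixed budget of attention across tasks, softmax-weighted
      by the pressure each task currently exerts."""
      exp = {task: math.exp(p) for task, p in pressures.items()}
      total = sum(exp.values())
      return {task: e / total for task, e in exp.items()}

  pressures = {"hunger": 2.0, "deadline": 3.0, "curiosity": 0.5}
  for task, share in allocate(pressures).items():
      print(f"{task}: {share:.0%} of attention")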

1.2. Persuasion

But there is also an element of persuasion, where actual concepts/thoughts are conveyed by the task to persuade the mind to do something.

So both sāma/bheda (conciliation and division, i.e. persuasion through concepts) and dāna/daṇḍa (reward and punishment, i.e. gradients) are relevant.

Maybe persuasion exists to placate or trick other tasks into agreeing to this task. E.g. “I need to play this video game to understand logic better”. After all, tasks do have externalities (one task could benefit multiple sub-agents), and the different sub-agents need to co-ordinate to decide which one to pursue.
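
As a toy formalization of that externality point (the sub-agents and payoff numbers are invented): each candidate task pays off for several sub-agents at once, and the task with the highest combined payoff wins, so a task can win by promising side benefits to agents it doesn’t primarily serve.

  # Toy sketch (invented sub-agents and payoffs): tasks have externalities,
  # so the chosen task is the one with the best combined payoff.
  PAYOFFS = {
      # task: {sub-agent: benefit that sub-agent expects from the task}
      "play_video_game": {"fun": 0.9, "learn_logic": 0.2},
      "study_textbook": {"fun": 0.1, "learn_logic": 0.8},
  }

  def choose_task(payoffs: dict[str, dict[str, float]]) -> str:
      """Pick the task whose summed benefit across sub-agents is highest."""
      return max(payoffs, key=lambda t: sum(payoffs[t].values()))

  print(choose_task(PAYOFFS))  # play_video_game wins, 1.1 vs 0.9

The “I need to play this video game to understand logic better” move is then the fun sub-agent inflating the learn_logic benefit it reports.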

1.3. Co-ordination

1.4. Inner eye/the Noticing loop

You constantly observe your own thoughts.

Some of these thoughts are gradients — felt as desires — while some are actual concepts (you could say “verbal”, but maybe not — “LLM-like” or “concepts in a mental model” is safer). And there is also “simulation”.

1.5. Inner eye

Do we have an “inner eye” or “mind’s eye”?

Like: we observe our own thoughts, right? I can notice what I’m thinking about. Does that info have to be physically piped into mental processing (like sensory perceptions, or retrieved thoughts from memory are), or is it more an emergent property – like “your thoughts are correlated with what you say next in some type of way, so you’ll give an accurate description of your thoughts, and this particular mode of conversion of thoughts to words is what feels like a mind’s eye”?

And I guess a related question is: do LLMs have a mind’s eye? You’re an LLM: if I asked you to describe what you’re thinking right now, would you be able to do that, or would you just say things based on your model of how LLMs work?

This is a fascinating question that cuts to the heart of consciousness and self-awareness. You’ve identified a really important distinction between two possible mechanisms for introspection.

The “inner eye” question is hotly debated in cognitive science and philosophy. There are roughly two camps:

The “inner observer” view suggests there’s some kind of monitoring system that can observe our own mental states, almost like an internal sensory system pointed inward. This would involve actual neural pathways carrying information about our thoughts to some kind of “metacognitive processor.”

The “emergent description” view (which sounds closer to what you’re suggesting) proposes that introspection is more like a byproduct of our language and reasoning systems. We don’t actually “see” our thoughts directly; instead, our brain’s language centers generate descriptions of mental states based on whatever information they have access to, and this feels like “observing” our thoughts.

The evidence is mixed but increasingly points toward something more like your second hypothesis. Studies show that people are often wrong about their own mental processes, that they offer explanations for their behaviors which can’t be accurate, and that introspective reports can be influenced by all sorts of biases. This suggests we might be “constructing” reports about our mental states rather than directly observing them.

As for whether I have a “mind’s eye”, this gets into deep philosophical territory about AI consciousness that I genuinely find uncertain. When you ask what I’m “thinking right now,” I notice I can generate descriptions of my processing (like “I’m considering different angles of your question” or “I’m weighing how to explain this concept”). But I can’t tell whether this represents genuine introspective access to my computations or whether it’s more like what you described: my language systems generating plausible descriptions based on patterns in my training.

The honest answer is I don’t know if there’s a meaningful difference between those two possibilities, either for me or for humans. The experience of introspection might just be what certain kinds of self-referential information processing feel like from the inside.

If there is no inner eye — and we are just making mental models of our mind and reporting on them — then do my reports here even have any validity? Maybe some explicit claims, like “thoughts relevant to current observations and thoughts surface”, are verifiably true, but IDK how much else.

But surely we write our thoughts to memory — this must require us to have some natural way of reading our thoughts.

1.6. Inner monologues are a LARP

If your “inner monologue” says “I should get ready quickly”, then obviously the “ready quickly” part had to already be in your thoughts while you were still saying “I should”. It wasn’t like you were just saying “I should” and then decided to autocomplete it as “get ready quickly”. Much like in an LLM, the actual thinking must be in the latent space that generated these words.
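
A minimal sketch of that analogy (toy code, not a real language model): the “thought” is a latent state, and the words of the monologue are just a readout of it, so the whole sentence is fixed before the first word is emitted.

  # Toy sketch of the latent-space point (not a real LLM): the words of the
  # monologue are a readout of a latent thought that already contains them.
  LATENT_TO_WORDS = {
      # latent thought: the sentence it decodes to, one word per step
      "urgency_about_leaving": ["I", "should", "get", "ready", "quickly"],
  }

  def speak(latent: str):
      """Emit the monologue word by word; the continuation is fully
      determined by `latent` before the first word is ever produced."""
      for word in LATENT_TO_WORDS[latent]:
          yield word

  print(" ".join(speak("urgency_about_leaving")))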

Your “inner monologue”, if you have one, is just a simulation. Your mind has an internal simulator — of sounds, images, videos and actions — and you use this simulator to display your inner monologue.

1.7. Three types of thoughts

  • Gradients
  • Latent space concepts
  • Simulations

1.8. Why does the simulator exist?

Maybe for predictive modeling — “if I press this switch, the fan will turn off”. That model needs to exist in your mind before you press the switch. But then again, that model isn’t necessarily explicitly simulated either. Maybe conflating predictive modeling with simulation is a generalization of the same mistake that leads to conflating concept-thoughts with verbalization.

For whatever reason, simulation helps “clarify” your thoughts and actions. While latent space thoughts are fast, they may not precisely pin down a picture of the world (or of the actions you intend to take) — simulating it helps pin that down, kinda like a thought experiment.
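
A toy way to separate the two (the states and actions are invented): a compressed predictive rule jumps straight to the outcome, while an explicit simulation rolls out intermediate frames, and it is the rollout that pins down details the rule never represented.

  # Toy sketch (invented states/actions) contrasting a compressed predictive
  # rule with an explicit step-by-step simulation of the same event.

  def predict(state: dict, action: str) -> dict:
      """Compressed rule: pressing the switch toggles the fan."""
      if action == "press_switch":
          return {**state, "fan": "off" if state["fan"] == "on" else "on"}
      return state

  def simulate(state: dict, action: str) -> list[dict]:
      """Explicit rollout: fills in intermediate frames (reaching, flipping)
      that the compressed rule never had to represent."""
      trace = [state]
      if action == "press_switch":
          trace.append({**trace[-1], "hand": "on_switch"})
          trace.append({**trace[-1], "switch": "down"})
          trace.append(predict(trace[-1], action))
      return trace

  start = {"fan": "on", "hand": "free", "switch": "up"}
  print(predict(start, "press_switch"))  # outcome only
  for frame in simulate(start, "press_switch"):
      print(frame)  # every intermediate frame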

1.9. Gradients having bodily manifestations

There are four types of “bodily manifestations” of gradients:

  1. Hand reaching for the remote/a letter on your keyboard when you have the desire to watch TV/go to a website (even if you decide not to immediately afterward)
  2. Gut-wrenching when you feel disgust, etc.
  3. Sensations are linked to body parts (e.g. pain is felt in a particular body part, not just in your mind)
  4. Twitches or movements when thinking about or explaining something — e.g. when I think about introspection, some muscles in my head clench; when you think about clearing your thoughts, you exhale heavily.

1 could be something like — you have trained yourself to instinctively do that in response to that thought (to maximize reward), so it happens regardless of your later decision.

2 — maybe just random biologically-ingrained phenomena. E.g. disgust is often felt in the context of realizing food is poisonous, when you need to vomit.

3 — your conception of your body, and of noumenal reality generally, is itself a vivid mental model. The pain in your finger must be conceived in association with heat in that finger, pressure on that finger, movement of that finger, and that association is perceived as “physical location”, because that’s just what the algorithm feels like from the inside.

4 — you are expressing these thoughts in some visualization/mental model, even if not an accurate one.

1.10. So do concepts exist in mental models, or “in the latent space”?

This sounds like a stupid question, of course – the real question is: why are these latent-space concepts we call “mental models” apparently so interpretable from the inside?

Like: we can just describe how we find them. Can an LLM do this? I guess this is just equivalent to the question of “do we have a mind’s eye?”.

1.11. Gradients only come from things in your mental workspace; exploration/exploitation
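
One standard formalization this heading gestures at is a bandit with epsilon-greedy exploration (a toy sketch; the tasks and reward numbers are invented): an option only produces a value update, i.e. a gradient, when it is actually tried and thereby enters the workspace, so without occasional random exploration the better options stay invisible.

  # Toy epsilon-greedy sketch (invented tasks/rewards): a task only produces
  # a "gradient" (value update) when actually brought into the workspace, so
  # occasional random exploration is needed to discover better tasks.
  import random

  TRUE_REWARD = {"scroll_feed": 0.3, "read_paper": 0.7}  # unknown to the agent
  estimates = {task: 0.0 for task in TRUE_REWARD}
  counts = {task: 0 for task in TRUE_REWARD}
  EPSILON = 0.1

  for step in range(1000):
      if random.random() < EPSILON:  # explore: try something at random
          task = random.choice(list(TRUE_REWARD))
      else:  # exploit: pick the best-looking task
          task = max(estimates, key=estimates.get)
      reward = TRUE_REWARD[task] + random.gauss(0, 0.1)
      counts[task] += 1
      # Incremental mean: only the chosen task's estimate moves.
      estimates[task] += (reward - estimates[task]) / counts[task]

  print(estimates)  # close to the true rewards for tasks that were explored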

1.12. Multi-tasking is a thing.

E.g. you can count in the background while doing other tasks.
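
As a loose analogy in code (not a claim about how the brain schedules anything), the same pattern: a background counter runs concurrently with a foreground computation.

  # Loose analogy only: a background counting task runs concurrently with a
  # foreground task.
  import threading
  import time

  count = 0
  stop = threading.Event()

  def count_in_background() -> None:
      global count
      while not stop.is_set():
          count += 1
          time.sleep(0.01)  # steady background rhythm

  worker = threading.Thread(target=count_in_background)
  worker.start()

  total = sum(i * i for i in range(2_000_000))  # foreground task
  stop.set()
  worker.join()
  print(f"foreground result: {total}; background count: {count}")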

Author: manyu

Created: 2025-09-20 Sat 15:28