The Geometry of Concepts: Sparse Autoencoder Feature Structure

Yuxiao Li, Eric J. Michaud, David D. Baek, Joshua Engels, Xiaoqing Sun, Max Tegmark

Sparse autoencoders have recently produced dictionaries of high-dimensional vectors corresponding to the universe of concepts represented by large language models. We find that this concept universe has interesting structure at three levels: 1) The “atomic” small-scale structure contains “crystals” whose faces are parallelograms or trapezoids, generalizing well-known examples such as (man-woman-king-queen). We find that the quality of such parallelograms and associated function vectors improves greatly when projecting out global distractor directions such as word length, which is efficiently done with linear discriminant analysis. 2) The “brain” intermediate-scale structure has significant spatial modularity; for example, math and code features form a “lobe” akin to functional lobes seen in neural fMRI images. We quantify the spatial locality of these lobes with multiple metrics and find that clusters of co-occurring features, at coarse enough scale, also cluster together spatially far more than one would expect if feature geometry were random. 3) The “galaxy” large-scale structure of the feature point cloud is not isotropic, but instead has a power law of eigenvalues with steepest slope in middle layers. We also quantify how the clustering entropy depends on the layer.
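To make the "power law of eigenvalues" claim concrete, here is a minimal sketch of how one might estimate such a slope for a point cloud. It is not the authors' procedure: the function names (`eigenvalue_spectrum`, `power_law_slope`), the synthetic Gaussian data, and the choice of a plain least-squares fit in log-log space are all assumptions made for illustration. In practice the rows of `points` would be the sparse autoencoder's feature vectors for a given layer.

```python
import numpy as np


def eigenvalue_spectrum(points):
    """Return the covariance eigenvalues of a point cloud, largest first."""
    centered = points - points.mean(axis=0)
    cov = centered.T @ centered / (len(points) - 1)
    # eigvalsh returns eigenvalues in ascending order for symmetric matrices.
    return np.linalg.eigvalsh(cov)[::-1]


def power_law_slope(eigvals):
    """Fit log(eigenvalue) against log(rank) and return the slope."""
    ranks = np.arange(1, len(eigvals) + 1)
    slope, _intercept = np.polyfit(np.log(ranks), np.log(eigvals), 1)
    return slope


# Synthetic stand-ins for a feature point cloud (hypothetical data).
rng = np.random.default_rng(0)
n_points, dim = 5000, 64

# Isotropic cloud: every direction has the same variance.
isotropic = rng.standard_normal((n_points, dim))

# Anisotropic cloud: the variance along axis k decays as 1/k,
# so the covariance eigenvalues follow a power law with exponent -1.
scales = np.arange(1, dim + 1) ** -0.5
anisotropic = rng.standard_normal((n_points, dim)) * scales

slope_iso = power_law_slope(eigenvalue_spectrum(isotropic))
slope_aniso = power_law_slope(eigenvalue_spectrum(anisotropic))

print(f"isotropic slope:   {slope_iso:.2f}")
print(f"anisotropic slope: {slope_aniso:.2f}")
```

An isotropic cloud gives a nearly flat spectrum (slope close to zero), while the anisotropic one gives a slope near the exponent built into the data. Repeating the fit per layer is one way to compare how steep the spectrum is across layers, under the assumptions above.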
