{"id":1617,"date":"2025-06-15T10:59:23","date_gmt":"2025-06-15T01:59:23","guid":{"rendered":"https:\/\/www.aicritique.org\/us\/?p=1617"},"modified":"2025-06-15T10:59:23","modified_gmt":"2025-06-15T01:59:23","slug":"physics-of-intelligence-a-physics-based-approach-to-understanding-ai-and-the-brain","status":"publish","type":"post","link":"https:\/\/www.aicritique.org\/us\/2025\/06\/15\/physics-of-intelligence-a-physics-based-approach-to-understanding-ai-and-the-brain\/","title":{"rendered":"Physics of Intelligence: A Physics-Based Approach to Understanding AI and the Brain"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\">Dr. Hidenori Tanaka\u2019s <strong>\u201cPhysics of Intelligence\u201d<\/strong> project (also called the <strong>Physics of Artificial Intelligence<\/strong>) is an ambitious research initiative aiming to apply concepts from physics \u2013 such as symmetry, conservation laws, and phase transitions \u2013 to the study of intelligence in neural networks. Launched in the early 2020s and evolving through collaborations between industry and academia, this program treats artificial intelligence as a phenomenon to be understood with the same rigor as a natural science<a href=\"https:\/\/news.harvard.edu\/gazette\/story\/newsplus\/gift-establishes-new-program-in-physics-of-intelligence-at-center-for-brain-science\/#:~:text=The%20two,meetings%2C%20and%20other%20associated%20costs\" target=\"_blank\" rel=\"noreferrer noopener\">news.harvard.edu<\/a>. 
By integrating physics, neuroscience, computer science, and psychology, the project seeks fundamental <em>laws of intelligence<\/em> that could make AI systems more interpretable, trustworthy, and energy-efficient<a href=\"https:\/\/ntt-research.com\/pai-group\/#:~:text=AI%20is%20quite%20possibly%20the,be%20implemented%20to%20further%20humankind\" target=\"_blank\" rel=\"noreferrer noopener\">ntt-research.com<\/a><a href=\"https:\/\/news.harvard.edu\/gazette\/story\/newsplus\/gift-establishes-new-program-in-physics-of-intelligence-at-center-for-brain-science\/#:~:text=%E2%80%9CWe%20are%20thrilled%20to%20support,%E2%80%9D\" target=\"_blank\" rel=\"noreferrer noopener\">news.harvard.edu<\/a>. Below, we delve into the core hypotheses, methods, collaborations, findings, and broader impacts of this cutting-edge initiative, clearly distinguishing speculative visions from empirical results.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Core Scientific Hypotheses and Principles<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">At the heart of the Physics of Intelligence project is the hypothesis that <strong>principles from physics can explain and predict the behavior of learning systems (both artificial and biological)<\/strong>. 
In particular, Tanaka\u2019s team proposes that key phenomena in deep neural networks \u2013 such as how they learn, generalize, and exhibit emergent abilities \u2013 can be elucidated by analogies to physical laws:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Symmetry and Conservation in Learning Dynamics:<\/strong> Just as symmetry in physics leads to conserved quantities (via Noether\u2019s theorem), neural network architectures contain symmetries that impose invariants on the training dynamics<a href=\"https:\/\/ai.stanford.edu\/blog\/neural-mechanics\/#:~:text=Strikingly%20similar%20to%20Noether%E2%80%99s%20theorem%2C,constant%20under%20gradient%20flow%20dynamics\" target=\"_blank\" rel=\"noreferrer noopener\">ai.stanford.edu<\/a><a href=\"https:\/\/ai.stanford.edu\/blog\/neural-mechanics\/#:~:text=%5C%5B%5Cbegin,%7C%5Ctheta_%7B%5Cmathcal%7BA%7D_2%7D%280%29%7C%5E2%20%5Cend%7Baligned\" target=\"_blank\" rel=\"noreferrer noopener\">ai.stanford.edu<\/a>. For example, modern networks often have <em>translation symmetries<\/em> (adding a constant bias to certain weights doesn\u2019t change the loss), <em>scale symmetries<\/em> (scaling weights before a normalization layer), or <em>rescaling symmetries<\/em> between layers. <strong>Tanaka et al. 
showed that under <em>gradient flow<\/em> (continuous idealized training), each such symmetry yields a conserved quantity \u2013 a combination of parameters that remains constant during learning<a href=\"https:\/\/ai.stanford.edu\/blog\/neural-mechanics\/#:~:text=Consider%20some%20subset%20of%20the,results%20in%20the%20conservation%20laws\" target=\"_blank\" rel=\"noreferrer noopener\">ai.stanford.edu<\/a><a href=\"https:\/\/ai.stanford.edu\/blog\/neural-mechanics\/#:~:text=%5C%5B%5Cbegin,%7C%5Ctheta_%7B%5Cmathcal%7BA%7D_2%7D%280%29%7C%5E2%20%5Cend%7Baligned\" target=\"_blank\" rel=\"noreferrer noopener\">ai.stanford.edu<\/a>.<\/strong> This is directly analogous to Noether\u2019s theorem linking symmetry to conserved energy or momentum in physical systems. Concretely, for a network with translation symmetry, the <strong>sum of those parameters stays fixed<\/strong>; with scale symmetry, the <strong>overall weight norm is fixed<\/strong>; with rescaling symmetry, the <strong>difference of two weight norms is fixed<\/strong><a href=\"https:\/\/ai.stanford.edu\/blog\/neural-mechanics\/#:~:text=this%20equation%20through%20time%20results,in%20the%20conservation%20laws\" target=\"_blank\" rel=\"noreferrer noopener\">ai.stanford.edu<\/a><a href=\"https:\/\/ai.stanford.edu\/blog\/neural-mechanics\/#:~:text=effectively%20restricting%20the%20possible%20trajectory,their%20dynamics%20to%20a%20hyperbola\" target=\"_blank\" rel=\"noreferrer noopener\">ai.stanford.edu<\/a>. These conservation laws constrain the trajectory of learning to specific geometric surfaces (e.g. a hyperplane for translation symmetry, a spherical surface for scale symmetry, or a hyperboloid for rescaling) as illustrated in <strong>Figure 1<\/strong> below. 
<strong>Importantly, these invariants help simplify the high-dimensional \u201cblack box\u201d of training into tractable directions of analysis<a href=\"https:\/\/ar5iv.labs.arxiv.org\/html\/2012.04728#:~:text=Figure%203%3A%20Visualizing%20conservation,black%20lines%20are%20level%20sets\" target=\"_blank\" rel=\"noreferrer noopener\">ar5iv.labs.arxiv.org<\/a><a href=\"https:\/\/ar5iv.labs.arxiv.org\/html\/2012.04728#:~:text=Application%20of%20this%20general%20theorem,following%20conservation%20law%20of%20learning\" target=\"_blank\" rel=\"noreferrer noopener\">ar5iv.labs.arxiv.org<\/a>.<\/strong> Just as in physics, ideal conservation laws are <strong>broken<\/strong> by real-world effects (analogous to friction or external forces) \u2013 here, finite learning rates, stochastic noise, weight decay, and other optimizer tricks violate the perfect conservation<a href=\"https:\/\/ai.stanford.edu\/blog\/neural-mechanics\/#:~:text=A%20realistic%20continuous%20model%20for,stochastic%20gradient%20descent\" target=\"_blank\" rel=\"noreferrer noopener\">ai.stanford.edu<\/a><a href=\"https:\/\/ai.stanford.edu\/blog\/neural-mechanics\/#:~:text=Modeling%20weight%20decay,the%20origin%20in%20parameter%20space\" target=\"_blank\" rel=\"noreferrer noopener\">ai.stanford.edu<\/a>. 
A major hypothesis of the project is that understanding how these <em>approximate<\/em> conservation laws break can reveal the \u201cforces\u201d driving neural networks to generalize.<\/li>\n<\/ul>\n\n\n\n<figure class=\"wp-block-image size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"831\" height=\"773\" src=\"https:\/\/www.aicritique.org\/us\/wp-content\/uploads\/2025\/06\/image-4.png\" alt=\"\" class=\"wp-image-1618\" style=\"width:235px;height:auto\" srcset=\"https:\/\/www.aicritique.org\/us\/wp-content\/uploads\/2025\/06\/image-4.png 831w, https:\/\/www.aicritique.org\/us\/wp-content\/uploads\/2025\/06\/image-4-300x279.png 300w, https:\/\/www.aicritique.org\/us\/wp-content\/uploads\/2025\/06\/image-4-768x714.png 768w\" sizes=\"auto, (max-width: 831px) 100vw, 831px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\"><em>Figure 1:<\/em> Geometric interpretation of a conservation law in neural network training (rescaling symmetry case). Due to a symmetry in the model, the learning trajectory is constrained to a <em>surface<\/em> (here a hyperbola) defined by a constant difference between two groups of weight norms. The heatmap color (red to blue) indicates the conserved quantity\u2019s value, and black lines show its level sets. 
Such <em>physics-inspired invariants<\/em> restrict how a network\u2019s parameters evolve<a href=\"https:\/\/ar5iv.labs.arxiv.org\/html\/2012.04728#:~:text=Figure%203%3A%20Visualizing%20conservation,black%20lines%20are%20level%20sets\" target=\"_blank\" rel=\"noreferrer noopener\">ar5iv.labs.arxiv.org<\/a><a href=\"https:\/\/ar5iv.labs.arxiv.org\/html\/2012.04728#:~:text=Application%20of%20this%20general%20theorem,following%20conservation%20law%20of%20learning\" target=\"_blank\" rel=\"noreferrer noopener\">ar5iv.labs.arxiv.org<\/a>.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Emergence and Phase Transitions in Learning:<\/strong> Another core idea is that <strong>\u201cemergent\u201d abilities in AI can be studied like phase transitions or percolation in physics<\/strong>. As AI models scale up or get more data, they often display <em>sudden jumps in capability<\/em> \u2013 for example, a language model might abruptly learn to do arithmetic once it passes a certain size. Tanaka\u2019s team draws an analogy to how matter can abruptly change phase (like water freezing) when an underlying parameter crosses a threshold. In a 2024 study, they define emergence in neural networks as the moment when a model <strong>acquires a general underlying structure that causes a sharp rise in performance on specific tasks<\/strong><a href=\"https:\/\/arxiv.org\/abs\/2408.12578#:~:text=,We%20empirically\" target=\"_blank\" rel=\"noreferrer noopener\">arxiv.org<\/a><a href=\"https:\/\/arxiv.org\/abs\/2408.12578#:~:text=context,when%20changing%20the%20data%20structure\" target=\"_blank\" rel=\"noreferrer noopener\">arxiv.org<\/a>. They built a controllable experimental setup using a formal language task: a Transformer is trained on strings generated by a context-sensitive grammar. 
They observed that once the model internalized the <em>grammar\u2019s structure<\/em>, its accuracy on related \u201cnarrow\u201d tasks suddenly jumped \u2013 a clear emergent behavior<a href=\"https:\/\/arxiv.org\/abs\/2408.12578#:~:text=cause%20of%20sudden%20performance%20growth,when%20changing%20the%20data%20structure\" target=\"_blank\" rel=\"noreferrer noopener\">arxiv.org<\/a>. By <strong>analogy with percolation theory<\/strong>, they modeled the learning process as a graph that gradually \u201cconnects\u201d pieces of knowledge. The onset of emergence corresponds to a <strong>phase transition<\/strong> in this graph\u2019s connectivity<a href=\"https:\/\/arxiv.org\/abs\/2408.12578#:~:text=narrower%20tasks%20suddenly%20begins%20to,when%20changing%20the%20data%20structure\" target=\"_blank\" rel=\"noreferrer noopener\">arxiv.org<\/a>. This <em>percolation model of emergence<\/em> provided a <em>quantitative prediction<\/em> for when the performance jump will occur as training data or structure varies<a href=\"https:\/\/arxiv.org\/abs\/2408.12578#:~:text=Specifically%2C%20we%20show%20that%20once,predicting%20emergence%20in%20neural%20networks\" target=\"_blank\" rel=\"noreferrer noopener\">arxiv.org<\/a>. Such results support the hypothesis that seemingly mysterious leaps in AI ability can be demystified by <em>phase transition models<\/em>, lending a physicist\u2019s understanding to questions of generalization and capability growth.<\/li>\n\n\n\n<li><strong>Noether\u2019s Learning Dynamics and Symmetry Breaking:<\/strong> While perfect symmetry yields conservation, <em>broken symmetry<\/em> can be even more illuminating in learning. 
In their NeurIPS 2021 paper <strong>\u201cNoether\u2019s Learning Dynamics\u201d<\/strong>, Tanaka and collaborators extended Noether\u2019s theorem to realistic neural network training, which includes <em>kinetic symmetry breaking<\/em> (KSB)<a href=\"https:\/\/openreview.net\/forum?id=fiPtD7iXuhn#:~:text=discrete%20learning%20dynamics%20of%20gradient,of%20implicit%20adaptive%20optimization%2C%20establishing\" target=\"_blank\" rel=\"noreferrer noopener\">openreview.net<\/a><a href=\"https:\/\/openreview.net\/forum?id=fiPtD7iXuhn#:~:text=energy%20explicitly%20breaks%20the%20symmetry,learning%20dynamics%20of%20neural%20networks\" target=\"_blank\" rel=\"noreferrer noopener\">openreview.net<\/a>. They formulated the learning process in the language of Lagrangian mechanics, treating the loss function as analogous to potential energy and the learning rule (such as gradient descent) as analogous to kinetic energy<a href=\"https:\/\/openreview.net\/forum?id=fiPtD7iXuhn#:~:text=of%20symmetry%20breaking%20is%20not,account%20KSB%20and%20derive%20the\" target=\"_blank\" rel=\"noreferrer noopener\">openreview.net<\/a>. In this view, KSB is the condition in which the \u201ckinetic energy\u201d of learning <strong>explicitly breaks a symmetry<\/strong> of the loss, such as the scale symmetry that normalization layers introduce. This broken symmetry is not a bug but a feature: the theory predicts it can introduce beneficial forces in parameter space. 
Indeed, they found that <strong>normalization layers induce a form of symmetry breaking that acts like an adaptive optimizer (similar to RMSProp) built into the dynamics<\/strong><a href=\"https:\/\/openreview.net\/forum?id=fiPtD7iXuhn#:~:text=energy%20explicitly%20breaks%20the%20symmetry,learning%20dynamics%20of%20neural%20networks\" target=\"_blank\" rel=\"noreferrer noopener\">openreview.net<\/a><a href=\"https:\/\/openreview.net\/forum?id=fiPtD7iXuhn#:~:text=resulting%20motion%20of%20the%20Noether,Lagrangian%20mechanics%2C%20we%20have%20established\" target=\"_blank\" rel=\"noreferrer noopener\">openreview.net<\/a>. In other words, what looks like a mere architectural tweak (batch normalization) has a <em>physics analog<\/em>: it breaks a conservation law in a way that makes learning more efficient and stable, much as friction can help a system settle to equilibrium. The broader hypothesis is that by systematically identifying when and how training <em>violates<\/em> ideal symmetries \u2013 through weight decay, noise, etc. \u2013 one can derive <em>exact equations for the broken conservation laws<\/em> that govern real neural networks<a href=\"https:\/\/ar5iv.labs.arxiv.org\/html\/2012.04728#:~:text=5%20A%20Realistic%20Continuous%20Model,for%20Stochastic%20Gradient%20Descent\" target=\"_blank\" rel=\"noreferrer noopener\">ar5iv.labs.arxiv.org<\/a><a href=\"https:\/\/ar5iv.labs.arxiv.org\/html\/2012.04728#:~:text=by%20random%20batches%2C%20and%20discretization,model%20for%20stochastic%20gradient%20descent\" target=\"_blank\" rel=\"noreferrer noopener\">ar5iv.labs.arxiv.org<\/a>. 
This yields analytic predictions for phenomena like parameter norm growth or decay under various training regimes<a href=\"https:\/\/ar5iv.labs.arxiv.org\/html\/2012.04728#:~:text=\" target=\"_blank\" rel=\"noreferrer noopener\">ar5iv.labs.arxiv.org<\/a><a href=\"https:\/\/ar5iv.labs.arxiv.org\/html\/2012.04728#:~:text=Image%3A%20Refer%20to%20caption%20,Modified%20Loss\" target=\"_blank\" rel=\"noreferrer noopener\">ar5iv.labs.arxiv.org<\/a>. Such insights ground the often heuristic practice of deep learning in a firmer theoretical framework, akin to how breaking of physical symmetries (e.g. in crystal defects or particle masses) leads to deeper understanding in physics.<\/li>\n\n\n\n<li><strong>\u201cLaws of AI\u201d as a Scientific Goal:<\/strong> Underlying all these hypotheses is a unifying vision: <strong>just as physics produced laws like <em>F = ma<\/em> or <em>E = mc\u00b2<\/em>, there may exist concise laws governing intelligence and learning<\/strong><a href=\"https:\/\/ntt-research.com\/pai-group\/#:~:text=AI%20is%20quite%20possibly%20the,be%20implemented%20to%20further%20humankind\" target=\"_blank\" rel=\"noreferrer noopener\">ntt-research.com<\/a>. The Physics of Intelligence project explicitly seeks general principles that apply across different substrates \u2013 whether silicon neural nets or biological brains. For instance, a conjectured <em>law of generalization<\/em> might relate a network\u2019s architecture and training conditions to its ability to transfer knowledge, analogous to a conservation law or equation of state. 
While such laws are still speculative, <strong>Tanaka often emphasizes that AI\u2019s rapid advances present an opportunity much like past scientific revolutions: AI is \u201ca new subject of study for the science of intelligence\u201d that could yield new physics<a href=\"https:\/\/news.harvard.edu\/gazette\/story\/newsplus\/gift-establishes-new-program-in-physics-of-intelligence-at-center-for-brain-science\/#:~:text=Shaping%20this%20ongoing%20academic%20cross,for%20the%20science%20of%20intelligence\" target=\"_blank\" rel=\"noreferrer noopener\">news.harvard.edu<\/a>.<\/strong> This outlook is partly philosophical \u2013 treating intelligence itself as a natural phenomenon \u2013 and partly pragmatic, aiming to tame AI\u2019s complexity so that engineers can design systems with <em>predictable<\/em> and <em>safe<\/em> behavior. The project\u2019s hypotheses push beyond viewing neural networks as only engineering artifacts, instead regarding them as objects of scientific inquiry governed by emergent laws.<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">In summary, the core scientific stance of the project is that <strong>intelligence can be understood through the lens of physics<\/strong>. 
Symmetries in networks lead to constraints just like in physical systems<a href=\"https:\/\/ai.stanford.edu\/blog\/neural-mechanics\/#:~:text=Strikingly%20similar%20to%20Noether%E2%80%99s%20theorem%2C,constant%20under%20gradient%20flow%20dynamics\" target=\"_blank\" rel=\"noreferrer noopener\">ai.stanford.edu<\/a><a href=\"https:\/\/ai.stanford.edu\/blog\/neural-mechanics\/#:~:text=%5C%5B%5Cbegin,%7C%5Ctheta_%7B%5Cmathcal%7BA%7D_2%7D%280%29%7C%5E2%20%5Cend%7Baligned\" target=\"_blank\" rel=\"noreferrer noopener\">ai.stanford.edu<\/a>; sudden learning behaviors can be mapped to phase transitions<a href=\"https:\/\/arxiv.org\/abs\/2408.12578#:~:text=Specifically%2C%20we%20show%20that%20once,predicting%20emergence%20in%20neural%20networks\" target=\"_blank\" rel=\"noreferrer noopener\">arxiv.org<\/a>; and introducing certain architectural elements is akin to adding forces or breaking invariances in a physical system<a href=\"https:\/\/openreview.net\/forum?id=fiPtD7iXuhn#:~:text=energy%20explicitly%20breaks%20the%20symmetry,learning%20dynamics%20of%20neural%20networks\" target=\"_blank\" rel=\"noreferrer noopener\">openreview.net<\/a>. These hypotheses have driven a series of theoretical models and experiments over 2020\u20132025, described next. While the notion of \u201claws of AI\u201d remains aspirational, the work to date provides concrete examples (conservation laws, percolation thresholds, etc.) where physics-style reasoning yields <strong>testable predictions<\/strong> about neural networks\u2019 behavior. 
Such predictions begin to bridge the gap between the <em>black box<\/em> complexity of deep learning and the <em>transparent<\/em> explanations scientists seek.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Techniques: Experimental and Mathematical Approaches<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">To investigate these hypotheses, Tanaka\u2019s Physics of Intelligence group employs a <strong>blend of theoretical physics methods, analytical modeling, and controlled experimental simulations<\/strong>. Their approach is highly interdisciplinary: they treat neural networks as <em>experimental subjects<\/em> (much like one would study an organism or a physical system) and use mathematical tools from physics to derive insights. Key techniques include:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Continuous Dynamical Systems Analysis:<\/strong> One hallmark of the project is recasting discrete training processes (like iterative weight updates in SGD) into <em>continuous-time equations<\/em> that can be analyzed with calculus and differential equations<a href=\"https:\/\/ai.stanford.edu\/blog\/neural-mechanics\/#:~:text=In%20order%20to%20make%20headway,that%20can%20be%20solved%20exactly\" target=\"_blank\" rel=\"noreferrer noopener\">ai.stanford.edu<\/a><a href=\"https:\/\/ai.stanford.edu\/blog\/neural-mechanics\/#:~:text=Strikingly%20similar%20to%20Noether%E2%80%99s%20theorem%2C,constant%20under%20gradient%20flow%20dynamics\" target=\"_blank\" rel=\"noreferrer noopener\">ai.stanford.edu<\/a>. By taking the <em>gradient flow<\/em> limit (infinitesimal learning rate), the team writes down <strong>ordinary differential equations (ODEs)<\/strong> for weight evolution<a href=\"https:\/\/ai.stanford.edu\/blog\/neural-mechanics\/#:~:text=Strikingly%20similar%20to%20Noether%E2%80%99s%20theorem%2C,constant%20under%20gradient%20flow%20dynamics\" target=\"_blank\" rel=\"noreferrer noopener\">ai.stanford.edu<\/a>. 
In this formulation, training resembles a particle moving in a force field defined by the loss function. Classic physics tools can then be applied: for example, identifying conserved quantities via inner products of the ODE with symmetry generators<a href=\"https:\/\/ai.stanford.edu\/blog\/neural-mechanics\/#:~:text=Consider%20some%20subset%20of%20the,results%20in%20the%20conservation%20laws\" target=\"_blank\" rel=\"noreferrer noopener\">ai.stanford.edu<\/a>, or using <strong>modified equation analysis<\/strong> from numerical analysis to account for finite step sizes<a href=\"https:\/\/ai.stanford.edu\/blog\/neural-mechanics\/#:~:text=steepest%20descent%20path%20given%20by,discretization%20on%20the%20learning%20dynamics\" target=\"_blank\" rel=\"noreferrer noopener\">ai.stanford.edu<\/a>. In practice, they developed <strong>\u201cmodified gradient flow\u201d<\/strong> equations that include correction terms for finite learning rates and momentum, improving the match between theory and actual training trajectories<a href=\"https:\/\/ai.stanford.edu\/blog\/neural-mechanics\/#:~:text=equations,discretization%20on%20the%20learning%20dynamics\" target=\"_blank\" rel=\"noreferrer noopener\">ai.stanford.edu<\/a><a href=\"https:\/\/ai.stanford.edu\/blog\/neural-mechanics\/#:~:text=Fig,the%20original%20gradient%20flow%20dynamics\" target=\"_blank\" rel=\"noreferrer noopener\">ai.stanford.edu<\/a>. 
This continuous modeling lets them solve for certain weight combinations exactly (e.g., how a particular norm decays over time) and visualize training as trajectories in a potential landscape<a href=\"https:\/\/ai.stanford.edu\/blog\/neural-mechanics\/#:~:text=Combining%20symmetry%20and%20modified%20gradient,to%20derive%20exact%20learning%20dynamics\" target=\"_blank\" rel=\"noreferrer noopener\">ai.stanford.edu<\/a><a href=\"https:\/\/ai.stanford.edu\/blog\/neural-mechanics\/#:~:text=gradient%20descent%2C%20we%20can%20derive,parameter%20combinations%20tied%20to%20the\" target=\"_blank\" rel=\"noreferrer noopener\">ai.stanford.edu<\/a>. Such analysis revealed, for instance, how adding momentum in SGD introduces an effective <em>inertia<\/em> and rescales time without changing the path (akin to adding mass to a particle)<a href=\"https:\/\/ai.stanford.edu\/blog\/neural-mechanics\/#:~:text=Modeling%20momentum,the%20gradient%20flow%20trajectory%20intact\" target=\"_blank\" rel=\"noreferrer noopener\">ai.stanford.edu<\/a>. It also showed that gradient noise due to mini-batches has a special low-rank structure that <em>does not<\/em> perturb the symmetry-constrained directions of motion<a href=\"https:\/\/ai.stanford.edu\/blog\/neural-mechanics\/#:~:text=Modeling%20stochasticity,flow%20dynamics%20in%20the%20directions\" target=\"_blank\" rel=\"noreferrer noopener\">ai.stanford.edu<\/a><a href=\"https:\/\/ai.stanford.edu\/blog\/neural-mechanics\/#:~:text=gradient%20observe%20the%20same%20geometric,the%20directions%20associated%20with%20symmetry\" target=\"_blank\" rel=\"noreferrer noopener\">ai.stanford.edu<\/a> \u2013 a nontrivial insight into why certain parameter combinations remain predictable despite stochastic training. 
This continuous dynamics approach is a powerful theoretical technique for deriving <strong>closed-form expressions for learning curves<\/strong> under various conditions<a href=\"https:\/\/ar5iv.labs.arxiv.org\/html\/2012.04728#:~:text=Symmetry%20and%20conservation%20laws%20in,has%20the%20corresponding%20conservation%20law\" target=\"_blank\" rel=\"noreferrer noopener\">ar5iv.labs.arxiv.org<\/a><a href=\"https:\/\/ar5iv.labs.arxiv.org\/html\/2012.04728#:~:text=Application%20of%20this%20general%20theorem,following%20conservation%20law%20of%20learning\" target=\"_blank\" rel=\"noreferrer noopener\">ar5iv.labs.arxiv.org<\/a>. While these calculations often rely on idealizations (infinitesimal steps, infinite data, etc.), the group validates them on real networks (e.g., a VGG-16 on Tiny ImageNet) to ensure they capture real behavior<a href=\"https:\/\/ar5iv.labs.arxiv.org\/html\/2012.04728#:~:text=associated%20conservation%20law%20in%20the,our%20work%20demonstrates%20that%20we\" target=\"_blank\" rel=\"noreferrer noopener\">ar5iv.labs.arxiv.org<\/a><a href=\"https:\/\/ar5iv.labs.arxiv.org\/html\/2012.04728#:~:text=gradient%20flow%20with%20our%20framework,architectures%20trained%20on%20any%20dataset\" target=\"_blank\" rel=\"noreferrer noopener\">ar5iv.labs.arxiv.org<\/a>. The ability to predict aspects of training dynamics analytically is a significant step toward a <em>mechanistic understanding<\/em> of deep learning, analogous to solving equations of motion in physics rather than just numerically simulating them.<\/li>\n\n\n\n<li><strong>Synthetic \u201cModel Systems\u201d for Experiments:<\/strong> In parallel with mathematical analysis, the team conducts <strong>experiments on simplified or synthetic tasks<\/strong> to isolate phenomena of interest. This is akin to a physicist designing a clean experiment to reveal a specific effect. 
For example, to study emergent abilities and phase transitions, they constructed a <strong>context-free and context-sensitive grammar task<\/strong> for Transformers<a href=\"https:\/\/arxiv.org\/abs\/2408.12578#:~:text=cause%20of%20sudden%20performance%20growth,when%20changing%20the%20data%20structure\" target=\"_blank\" rel=\"noreferrer noopener\">arxiv.org<\/a>. By training networks on strings generated from a known grammar, they can measure exactly when the network grasps the underlying rules. This level of control is impossible with a massive language model trained on the entire internet, but in the synthetic setup, they observed clear phase-transition-like behavior (a sudden jump in performance once the grammar was learned)<a href=\"https:\/\/arxiv.org\/abs\/2408.12578#:~:text=context,when%20changing%20the%20data%20structure\" target=\"_blank\" rel=\"noreferrer noopener\">arxiv.org<\/a>. Another example is the use of <strong>formal languages and logical tasks<\/strong> to probe compositional generalization: in one 2024 study, the group examined how Transformers learn <em>concepts<\/em> and <em>rules<\/em> by training them on synthetic data where the ground-truth compositional structure is known<a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=Y,Tanaka\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a><a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=C.%20F.%20Park,E.S.%20Lubana\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a>. By tracking internal representations during training on these tasks, they identified distinct <strong>\u201calgorithmic phases\u201d<\/strong> \u2013 regimes in which the model appears to use one strategy vs. another<a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=C.F.%20Park,Tanaka\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a>. 
They even observed switching dynamics suggesting competition between strategies until one dominates (much like phases competing in a physical system). Furthermore, the team built toy models of neural networks (sometimes as simple as a single-neuron or one-dimensional system) to derive intuition. The <strong>\u201cghost mechanism\u201d<\/strong> study (2025) is a good example: they crafted a minimal recurrent network task (a \u201cdelayed activation\u201d toy problem) that produces an <em>abrupt learning<\/em> curve \u2013 long plateaus then sudden improvement<a href=\"https:\/\/ar5iv.labs.arxiv.org\/html\/2501.02378#:~:text=Abrupt%20learning%20is%20commonly%20observed,learning\" target=\"_blank\" rel=\"noreferrer noopener\">ar5iv.labs.arxiv.org<\/a><a href=\"https:\/\/ar5iv.labs.arxiv.org\/html\/2501.02378#:~:text=zone%20and%20an%20oscillatory%20minimum,redundancy%20in%20stabilizing%20learning%20dynamics\" target=\"_blank\" rel=\"noreferrer noopener\">ar5iv.labs.arxiv.org<\/a>. By analyzing this one-dimensional system, they discovered a \u201cghost\u201d fixed point causing the delay (related to ghost states in dynamical systems theory)<a href=\"https:\/\/ar5iv.labs.arxiv.org\/html\/2501.02378#:~:text=underlying%20mechanisms,We%20demonstrate%20two%20complementary\" target=\"_blank\" rel=\"noreferrer noopener\">ar5iv.labs.arxiv.org<\/a><a href=\"https:\/\/ar5iv.labs.arxiv.org\/html\/2501.02378#:~:text=accompany%20the%20destabilization%20of%20learning,redundancy%20in%20stabilizing%20learning%20dynamics\" target=\"_blank\" rel=\"noreferrer noopener\">ar5iv.labs.arxiv.org<\/a>. 
This insight then guided them to identify similar ghost effects in larger recurrent neural nets, along with methods to mitigate the plateaus (like lowering confidence of outputs or increasing model redundancy)<a href=\"https:\/\/ar5iv.labs.arxiv.org\/html\/2501.02378#:~:text=destabilizes%20learning%20dynamics,free%20mechanism%20for%20abrupt%20learning\" target=\"_blank\" rel=\"noreferrer noopener\">ar5iv.labs.arxiv.org<\/a><a href=\"https:\/\/ar5iv.labs.arxiv.org\/html\/2501.02378#:~:text=accompany%20the%20destabilization%20of%20learning,redundancy%20in%20stabilizing%20learning%20dynamics\" target=\"_blank\" rel=\"noreferrer noopener\">ar5iv.labs.arxiv.org<\/a>. In summary, <em>the group uses simplified experimental setups \u2013 from formal languages to few-neuron models \u2013 to uncover mechanisms that would be hidden in more complex tasks<\/em>. These controlled experiments yield phenomena (emergence, ghost instabilities, etc.) that can be quantitatively measured and then linked back to theoretical models (like percolation graphs or bifurcation analysis). 
It\u2019s a marriage of simulation and theory reminiscent of early computational physics or systems biology: <strong>the simulation \u201cexperiments\u201d generate data to be explained, and the physics-based theory provides the explanatory framework<\/strong><a href=\"https:\/\/arxiv.org\/abs\/2408.12578#:~:text=narrower%20tasks%20suddenly%20begins%20to,when%20changing%20the%20data%20structure\" target=\"_blank\" rel=\"noreferrer noopener\">arxiv.org<\/a><a href=\"https:\/\/ar5iv.labs.arxiv.org\/html\/2501.02378#:~:text=Yet%2C%20despite%20its%20common%20occurrence%2C,these%20predictions%20in%20recurrent%20neural\" target=\"_blank\" rel=\"noreferrer noopener\">ar5iv.labs.arxiv.org<\/a>.<\/li>\n\n\n\n<li><strong>Mathematical Modeling &amp; Proofs:<\/strong> Many results of the Physics of Intelligence project come in the form of <strong>mathematical derivations or proofs<\/strong>, borrowing techniques from statistical mechanics, linear algebra, and beyond. For instance, in the <strong>Noether\u2019s Learning Dynamics<\/strong> work, the team proved a generalized Noether theorem for learning: under <em>kinetic symmetry breaking<\/em>, the usual conservation law acquires an extra term (a \u201cNoether charge motion\u201d) that they derived explicitly<a href=\"https:\/\/openreview.net\/forum?id=fiPtD7iXuhn#:~:text=discrete%20learning%20dynamics%20of%20gradient,of%20implicit%20adaptive%20optimization%2C%20establishing\" target=\"_blank\" rel=\"noreferrer noopener\">openreview.net<\/a><a href=\"https:\/\/openreview.net\/forum?id=fiPtD7iXuhn#:~:text=energy%20explicitly%20breaks%20the%20symmetry,learning%20dynamics%20of%20neural%20networks\" target=\"_blank\" rel=\"noreferrer noopener\">openreview.net<\/a>. 
They then applied this theorem to derive an <em>exact correspondence<\/em> between a network with batch normalization and a form of the RMSProp update rule<a href=\"https:\/\/openreview.net\/forum?id=fiPtD7iXuhn#:~:text=energy%20explicitly%20breaks%20the%20symmetry,learning%20dynamics%20of%20neural%20networks\" target=\"_blank\" rel=\"noreferrer noopener\">openreview.net<\/a><a href=\"https:\/\/openreview.net\/forum?id=fiPtD7iXuhn#:~:text=resulting%20motion%20of%20the%20Noether,Lagrangian%20mechanics%2C%20we%20have%20established\" target=\"_blank\" rel=\"noreferrer noopener\">openreview.net<\/a> \u2013 an analytical insight bridging architecture and optimization. In the <strong>Synaptic Flow pruning algorithm<\/strong> (NeurIPS 2020), Tanaka et al. first <strong>mathematically formulated a conservation law for network connectivity at initialization<\/strong><a href=\"https:\/\/papers.nips.cc\/paper\/2020\/hash\/46a4378f835dc8040c8057beb6a2da52-Abstract.html#:~:text=ever%20training%2C%20or%20indeed%20without,at%20initialization%20subject%20to%20a\" target=\"_blank\" rel=\"noreferrer noopener\">papers.nips.cc<\/a>. They showed that naive weight pruning methods break a \u201cflow conservation\u201d across layers, leading to entire layers dying (layer-collapse)<a href=\"https:\/\/papers.nips.cc\/paper\/2020\/hash\/46a4378f835dc8040c8057beb6a2da52-Abstract.html#:~:text=ever%20training%2C%20or%20indeed%20without,at%20initialization%20subject%20to%20a\" target=\"_blank\" rel=\"noreferrer noopener\">papers.nips.cc<\/a>. By enforcing a <em>conservation of total synaptic strength<\/em> through the network, they derived a new pruning criterion (SynFlow) that provably avoids layer-collapse<a href=\"https:\/\/papers.nips.cc\/paper\/2020\/hash\/46a4378f835dc8040c8057beb6a2da52-Abstract.html#:~:text=first%20mathematically%20formulate%20and%20experimentally,art\" target=\"_blank\" rel=\"noreferrer noopener\">papers.nips.cc<\/a>. 
This was a theoretical contribution that immediately yielded a practical algorithm \u2013 one that prunes networks <strong>without any training data<\/strong>, yet achieves competitive performance by preserving an invariant quantity through the sparsification process<a href=\"https:\/\/papers.nips.cc\/paper\/2020\/hash\/46a4378f835dc8040c8057beb6a2da52-Abstract.html#:~:text=entirely%20avoided%2C%20motivating%20a%20novel,used%20to%20quantify%20which%20synapses\" target=\"_blank\" rel=\"noreferrer noopener\">papers.nips.cc<\/a>. The team also uses tools like <strong>Hessian eigenvalue analysis and mode connectivity<\/strong> to explore loss landscapes. In one 2023 paper, they examined how fine-tuning changes a model\u2019s internal representations by studying the connectivity of minima in weight space (mode connectivity), providing a <em>mechanistic<\/em> view of why fine-tuning sometimes drastically shifts behavior<a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=,tuning%20on%20procedurally%20defined%20tasks\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a>. Additionally, they employ techniques from <strong>information theory and statistics<\/strong>: e.g., analyzing <em>shattering of representations<\/em> in transformer models when knowledge is edited<a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=K,E.S.%20Lubana\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a>, or using <em>percolation theory<\/em> equations to predict threshold points<a href=\"https:\/\/arxiv.org\/abs\/2408.12578#:~:text=Specifically%2C%20we%20show%20that%20once,predicting%20emergence%20in%20neural%20networks\" target=\"_blank\" rel=\"noreferrer noopener\">arxiv.org<\/a>. 
Many of these models are backed by rigorous proofs or derivations in appendices of their papers, underscoring the emphasis on theoretical soundness.<\/li>\n\n\n\n<li><strong>Neuroscience and Psychology Experiments:<\/strong> A distinctive aspect of the Physics of Intelligence program is that it doesn\u2019t study AI in isolation \u2013 it actively seeks parallels in biological intelligence. The team has collaborated with neuroscientists to test AI-driven hypotheses about brains. For example, Tanaka co-authored a <strong>Neuron 2023<\/strong> paper analyzing how the retina encodes natural scenes<a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=N.%20Maheswaranathan,Baccus\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a>. They used deep learning models to infer the \u201ccode\u201d of retinal neurons and discovered interpretable computations, bridging from <em>deep learning to mechanistic understanding in neuroscience<\/em><a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=H.%20Tanaka,Ganguli\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a><a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=H,Ganguli\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a>. In another collaborative effort, members of his group (including former colleagues now at Yale and Princeton) studied <strong>behavioral sequences in animals<\/strong>: a 2022 PLoS Computational Biology paper introduced a \u201clexical\u201d method to identify <em>action sequences<\/em> in animal behavior, akin to parsing sentences<a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=G,Wyart%20%282022\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a>. Here the physics of intelligence approach \u2013 looking for structure and rules in complex sequences \u2013 was applied to <strong>biological data<\/strong>, showing the versatility of their techniques. 
Moreover, the group is interested in psychological applications; their site mentions exploring AI + psychology for education and psychiatry<a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=,AI%27s%20generalization%20behaviors\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a>. While details are sparse, this likely involves using insights from AI learning dynamics to model human learning or mental processes, or vice versa. The <strong>interdisciplinary experiments<\/strong> are facilitated by the project\u2019s residence at Harvard\u2019s Center for Brain Science, where computational neuroscientists and cognitive scientists interact with the team. By designing experiments that compare <em>artificial neural networks and real neural circuits<\/em>, the project aims to find common principles of intelligence. This is still an emerging area, but it aligns with the project\u2019s foundational claim that <em>natural and artificial intelligence share underlying scientific principles<\/em><a href=\"https:\/\/ntt-research.com\/ntt-research-launches-new-physics-of-artificial-intelligence-group\/#:~:text=to%20industry%20applications%20and%20governance,physicists%20have%20done%20over%20many\" target=\"_blank\" rel=\"noreferrer noopener\">ntt-research.com<\/a><a href=\"https:\/\/ntt-research.com\/ntt-research-launches-new-physics-of-artificial-intelligence-group\/#:~:text=leading%20academic%20researchers%2C%20the%20Physics,is%20to%20obtain%20a%20better\" target=\"_blank\" rel=\"noreferrer noopener\">ntt-research.com<\/a>. 
If true, this could lead to AI models that <strong>not only mimic performance<\/strong> but also mimic <em>mechanisms<\/em> of human cognition, potentially offering better interpretability and robustness.<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">In summary, the Physics of Intelligence initiative employs a toolkit reminiscent of a scientific field rather than pure engineering: <strong>analytical equations, controlled experiments, toy models, and cross-species comparisons<\/strong>. This dual emphasis on theory <em>and<\/em> empirical validation is critical. Many of their predictions (e.g., conserved quantities, emergent thresholds) have been <strong>validated on actual neural networks<\/strong> or in simulation<a href=\"https:\/\/ai.stanford.edu\/blog\/neural-mechanics\/#:~:text=While%20the%20conservation%20laws%20derived,realistic%20continuous%20models%20of%20SGD\" target=\"_blank\" rel=\"noreferrer noopener\">ai.stanford.edu<\/a><a href=\"https:\/\/arxiv.org\/abs\/2408.12578#:~:text=Specifically%2C%20we%20show%20that%20once,predicting%20emergence%20in%20neural%20networks\" target=\"_blank\" rel=\"noreferrer noopener\">arxiv.org<\/a>, lending credibility to the approach. At the same time, some techniques are largely theoretical (e.g., Lagrangian formulations) and still need more empirical corroboration in large-scale AI systems. The combination of approaches \u2013 from pencil-and-paper math to GPU-powered training runs \u2013 exemplifies the interdisciplinary spirit of the project. By <strong>treating neural networks like physical systems for experimentation<\/strong>, Tanaka\u2019s group can derive insights that purely empirical deep learning or purely abstract theory might miss. This strategy has begun to yield a library of techniques and results (conservation laws, phase models, pruning algorithms, etc.) 
that pave the way toward a more <em>scientific understanding of AI<\/em>.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Collaborators, Institutions, and Programs<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">The Physics of Intelligence project is fundamentally a collaborative and multi-institutional effort, spanning a network of research labs and academic centers. Key players and partnerships from 2020\u20132025 include:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>NTT Research (PHI Lab and Physics of AI Group):<\/strong> The project originated within NTT Research\u2019s Physics &amp; Informatics (PHI) Lab in Silicon Valley. Dr. Hidenori Tanaka joined NTT\u2019s PHI Lab in 2020 as a Senior Research Scientist and led its <strong>Intelligent Systems Group<\/strong><a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=Group%20Leader%2C%20Sci%20ence%20of,of%20Intelligence%20%2C%20Harvard%20University\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a><a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=Approaches%3A%20We%20bring%20a%20unique,insights%20with%20practical%20engineering%20impact\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a>. The PHI Lab\u2019s mission was to explore new computing paradigms by fusing physics and information (for example, the lab is known for optical computing and the Coherent Ising Machine). Early on, NTT PHI Lab recognized that <strong>understanding the \u201cblack box\u201d of AI was crucial for building next-gen, energy-efficient systems<\/strong><a href=\"https:\/\/ntt-research.com\/ntt-research-launches-new-physics-of-artificial-intelligence-group\/#:~:text=Early%20on%2C%20the%20PHI%20Lab,is%20to%20obtain%20a%20better\" target=\"_blank\" rel=\"noreferrer noopener\">ntt-research.com<\/a>. This vision had strong support from NTT\u2019s leadership. 
By 2021, NTT Research entered a joint research agreement with Harvard\u2019s Center for Brain Science to collaborate on natural and artificial intelligence<a href=\"https:\/\/news.harvard.edu\/gazette\/story\/newsplus\/gift-establishes-new-program-in-physics-of-intelligence-at-center-for-brain-science\/#:~:text=The%20new%20program%20will%20amplify,neuroscientists%20and%20psychologists%20at%20Harvard\" target=\"_blank\" rel=\"noreferrer noopener\">news.harvard.edu<\/a>. In April 2025, NTT <strong>formally spun off<\/strong> the Physics of Artificial Intelligence (PAI) Group as an independent research group, elevating Tanaka to Group Head<a href=\"https:\/\/ntt-research.com\/ntt-research-launches-new-physics-of-artificial-intelligence-group\/#:~:text=SUNNYVALE%2C%20Calif,Physics%20of%20Artificial%20Intelligence%20Group\" target=\"_blank\" rel=\"noreferrer noopener\">ntt-research.com<\/a>. The <strong>Physics of AI Group<\/strong> (within NTT Research) now continues the work with a focused mandate: <em>to enhance understanding, trust, and control of advanced AI<\/em><a href=\"https:\/\/ntt-research.com\/ntt-research-launches-new-physics-of-artificial-intelligence-group\/#:~:text=,ongoing%20collaboration%20with%20academic%20partners\" target=\"_blank\" rel=\"noreferrer noopener\">ntt-research.com<\/a>. NTT framed this as transitioning from foundational studies to a broader pursuit of <strong>human-AI collaboration<\/strong><a href=\"https:\/\/ntt-research.com\/ntt-research-launches-new-physics-of-artificial-intelligence-group\/#:~:text=academic%20partners.%20,broader%20pursuit%20of%20human%2FAI%20collaboration\" target=\"_blank\" rel=\"noreferrer noopener\">ntt-research.com<\/a>. The PAI Group builds on the \u201cPhysics of Intelligence\u201d vision developed over the past five years and retains close ties to academia. 
NTT\u2019s support has been not just financial but strategic: their <strong>Upgrade 2024\/2025<\/strong> summits highlighted Physics of Intelligence for <em>trustworthy and green AI<\/em> as a pillar of innovation<a href=\"https:\/\/www.unite.ai\/ntt-research-launches-new-physics-of-artificial-intelligence-group-at-harvard\/#:~:text=The%20newly,for%20the%20past%20five%20years\" target=\"_blank\" rel=\"noreferrer noopener\">unite.ai<\/a><a href=\"https:\/\/www.unite.ai\/ntt-research-launches-new-physics-of-artificial-intelligence-group-at-harvard\/#:~:text=Early%20on%20in%20their%20research%2C,governance%20decisions%20on%20AI%20adoption\" target=\"_blank\" rel=\"noreferrer noopener\">unite.ai<\/a>. NTT Research\u2019s CEO, Kazuhiro Gomi, often emphasizes that AI\u2019s rise is akin to inventions like the steam engine \u2013 a new force driving physics research \u2013 and that NTT sees this as an opportunity to foster <strong>\u201ctrustworthy and green AI\u201d through basic science collaborations<\/strong><a href=\"https:\/\/news.harvard.edu\/gazette\/story\/newsplus\/gift-establishes-new-program-in-physics-of-intelligence-at-center-for-brain-science\/#:~:text=%E2%80%9CWe%20are%20thrilled%20to%20support,%E2%80%9D\" target=\"_blank\" rel=\"noreferrer noopener\">news.harvard.edu<\/a>. In terms of personnel, <strong>Maya Okawa<\/strong> (visiting scientist) and <strong>Ekdeep Singh Lubana<\/strong> (postdoctoral fellow) are part of the core NTT team with Tanaka<a href=\"https:\/\/ntt-research.com\/ntt-research-launches-new-physics-of-artificial-intelligence-group\/#:~:text=Princeton%20University%20Assistant%20Professor%20,Previous%20contributions%20to%20date%20include\" target=\"_blank\" rel=\"noreferrer noopener\">ntt-research.com<\/a>. The PHI Lab as a whole has partnered with many institutions (Caltech, Cornell, MIT, etc. 
<a href=\"https:\/\/ntt-research.com\/ntt-research-launches-new-physics-of-artificial-intelligence-group\/#:~:text=Since%202019%2C%20the%20PHI%20Lab,Swinburne%20University%20of%20Technology%2C%20the\" target=\"_blank\" rel=\"noreferrer noopener\">ntt-research.com<\/a>), and some of those collaborations (e.g., with MIT and Stanford) feed directly into the Physics of Intelligence project\u2019s research on AI. Notably, <strong>NTT Research Foundation<\/strong> provided a philanthropic gift to Harvard (see below), and NTT\u2019s PHI\/PAI groups facilitate the industry side of the research with funding, computational resources, and translation of findings into potential applications (such as the bias removal algorithm being recognized by NIST, or exploring optical hardware to cut AI\u2019s energy use<a href=\"https:\/\/ntt-research.com\/ntt-research-launches-new-physics-of-artificial-intelligence-group\/#:~:text=The%20Physics%20of%20Artificial%20Intelligence,new%20group%20will%20also%20explore\" target=\"_blank\" rel=\"noreferrer noopener\">ntt-research.com<\/a>).<\/li>\n\n\n\n<li><strong>Harvard University \u2013 Center for Brain Science (CBS):<\/strong> Since 2022, Dr. Tanaka has been an Associate at Harvard\u2019s Center for Brain Science (CBS)<a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=2022%20,Science%2C%20Harvard%20University%2C%20MA%2C%20USA\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a>, where his group is physically based. The Harvard CBS provides an academic environment for the project, embedding it among neuroscientists and cognitive scientists. 
A significant development was the <strong>establishment of the CBS-NTT Program in Physics of Intelligence in 2024<\/strong>, enabled by a gift of up to $1.7M from the NTT Research Foundation<a href=\"https:\/\/news.harvard.edu\/gazette\/story\/newsplus\/gift-establishes-new-program-in-physics-of-intelligence-at-center-for-brain-science\/#:~:text=April%2011%2C%202024%20%203,min%20read\" target=\"_blank\" rel=\"noreferrer noopener\">news.harvard.edu<\/a><a href=\"https:\/\/www.thecrimson.com\/article\/2024\/4\/17\/harvard-cbs-awarded-ntt-grant\/#:~:text=Harvard%20University%E2%80%99s%20Center%20for%20Brain,Foundation%2C%20the%20foundation%20announced%20Thursday\" target=\"_blank\" rel=\"noreferrer noopener\">thecrimson.com<\/a>. This program funds <strong>postdoctoral fellowships<\/strong> and joint research activities in the physics of intelligence, effectively formalizing the collaboration between Harvard and NTT. It supports two postdoc researchers at a time, plus seminars, travel, and other collaborative events<a href=\"https:\/\/news.harvard.edu\/gazette\/story\/newsplus\/gift-establishes-new-program-in-physics-of-intelligence-at-center-for-brain-science\/#:~:text=The%20two,meetings%2C%20and%20other%20associated%20costs\" target=\"_blank\" rel=\"noreferrer noopener\">news.harvard.edu<\/a>. Harvard faculty (like Prof. Venkatesh Murthy, CBS director) have embraced this as a way to \u201cdevelop ideas around the Physics of Intelligence\u201d in an interdisciplinary setting<a href=\"https:\/\/news.harvard.edu\/gazette\/story\/newsplus\/gift-establishes-new-program-in-physics-of-intelligence-at-center-for-brain-science\/#:~:text=%E2%80%9CWe%E2%80%99re%20grateful%20to%20the%20NTT,%E2%80%9D\" target=\"_blank\" rel=\"noreferrer noopener\">news.harvard.edu<\/a>. 
The Harvard side of the collaboration is interested in how this physics-driven approach can <em>enhance neuroscience<\/em>: CBS researchers study neural circuits, computation in the brain, development, and disorders<a href=\"https:\/\/news.harvard.edu\/gazette\/story\/newsplus\/gift-establishes-new-program-in-physics-of-intelligence-at-center-for-brain-science\/#:~:text=Scientists%20affiliated%20with%20the%20Center,or%20disordered%2C%20yet%20potentially%20ameliorated\" target=\"_blank\" rel=\"noreferrer noopener\">news.harvard.edu<\/a>, and they see value in the fresh theoretical approaches brought by Tanaka\u2019s team. For instance, neuroscientists can use AI both as a tool and as a model of brains; the Physics of Intelligence program provides a formal framework to do this, potentially leading to new insights in brain science as well<a href=\"https:\/\/www.thecrimson.com\/article\/2024\/4\/17\/harvard-cbs-awarded-ntt-grant\/#:~:text=With%20the%20gift%20and%20more,%E2%80%9D\" target=\"_blank\" rel=\"noreferrer noopener\">thecrimson.com<\/a><a href=\"https:\/\/www.thecrimson.com\/article\/2024\/4\/17\/harvard-cbs-awarded-ntt-grant\/#:~:text=One%20core%20focus%20of%20the,applied%20to%20fields%20like%20neuroscience\" target=\"_blank\" rel=\"noreferrer noopener\">thecrimson.com<\/a>. 
Some <strong>key collaborators at Harvard<\/strong> include cognitive scientist <strong>Tomer Ullman<\/strong> (who co-authored work on neural text generation with Tanaka<a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=ICLR%20,Representations\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a>) and others in psychology and neuroscience departments who are part of Tanaka\u2019s extended group (e.g., <strong>Eric Bigelow<\/strong>, a PhD student in Psychology, co-advised on projects bridging human cognition and AI text models<a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=Language\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a>). The CBS-NTT program also supports <em>alignment with the broader Harvard community<\/em>, for example by hosting talks (Tanaka gave seminars at Harvard\u2019s math and ML groups in 2022<a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=Nov,and%20Applications%2C%20Harvard%20U%20niversity\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a>) and integrating with student training. Harvard undergrads and grads (such as <strong>Kento Nishi<\/strong> and <strong>Core Francisco Park<\/strong>, who co-authored papers on in-context learning<a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=C,Tanaka\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a><a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=M.%20Okawa,Tanaka\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a>) have worked in the group. This provides a pipeline of young researchers from a variety of fields into the physics-of-AI endeavor. 
In essence, Harvard CBS provides the academic <strong>hub<\/strong> where the natural science of intelligence is explored, ensuring the project is grounded in biological reality and cognitive science as well as in theory.<\/li>\n\n\n\n<li><strong>MIT and IAIFI:<\/strong> Dr. Tanaka became an <strong>Affiliate of the MIT Institute for Artificial Intelligence and Fundamental Interactions (IAIFI)<\/strong> in 2024<a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=2024%20,MIT%29%2C%20MA%2C%20USA\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a>. IAIFI is an NSF-funded institute focused on the intersection of AI and physics (originally emphasizing using AI for physics and vice versa). Being an affiliate suggests active collaboration or co-supervision of fellows. Indeed, one of Tanaka\u2019s group members, <strong>Dr. Sam Bright-Thonney<\/strong>, is an <strong>IAIFI Postdoctoral Fellow in physics at MIT<\/strong> working with the group<a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=co\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a>. Through IAIFI, the project connects to a network of physicists thinking about AI from first principles. This likely facilitates joint workshops, idea exchange, and possibly joint appointments of researchers. MIT\u2019s strengths in both theoretical physics and computer science make it an ideal partner. The <strong>IAIFI affiliation<\/strong> indicates that Tanaka\u2019s approaches (like symmetry in deep learning) resonated with the fundamental questions IAIFI tackles (such as: what are the fundamentals of learning, or how can physics insights improve AI?). 
Moreover, collaborators such as <strong>Mikail Khona<\/strong> (a physics PhD student at MIT in Tanaka\u2019s group<a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=,CSE%2C%20U%20of%20Michigan\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a>) and <strong>Max Aalto<\/strong> (an EECS PhD at MIT in the group<a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=,MS%20student%2C%20CS%2C%20Harvard\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a>) foster cross-pollination between Harvard\/NTT and MIT. The IAIFI and MIT connections also bring in expertise from physics heavyweights \u2013 for instance, <strong>Prof. Max Tegmark<\/strong> or others at MIT who are known for physics approaches to AI could be informal collaborators, though not explicitly listed. Additionally, Tanaka\u2019s group joining the <strong>ML Alignment &amp; Theory Scholars (MATS) program<\/strong> as a mentor in late 2024<a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=Jan,at%20ICLR%202025\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a> shows engagement with the broader AI alignment community, which often has ties to MIT and Harvard via initiatives like the Center for Brains, Minds &amp; Machines. In short, <strong>MIT IAIFI<\/strong> provides another institutional pillar, emphasizing the <em>fundamental interactions<\/em> part of the project \u2013 connecting AI with the laws of nature.<\/li>\n\n\n\n<li><strong>University of Tokyo \u2013 Institute for Physics of Intelligence:<\/strong> In 2023, Tanaka became a Visiting Researcher at the <strong>Institute for Physics of Intelligence (IPI) at the University of Tokyo<\/strong><a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=%28MIT%29%20iaifi\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a>. 
The very name of this institute mirrors the \u201cPhysics of Intelligence\u201d concept, suggesting a parallel initiative in Japan. This likely resulted in an exchange of ideas and possibly joint workshops or student visits between Cambridge, MA and Tokyo. <strong>Ziyin Liu<\/strong>, a University of Tokyo physics PhD student, was co-advised by Tanaka on a publication about loss landscapes in self-supervised learning<a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=,Physics%2C%20U%20of%20Tokyo\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a><a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=L,Tanaka\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a>, reflecting this collaboration. The University of Tokyo IPI focuses on understanding intelligence through physics and math (founded by Prof. Masaki Aono and others), so Tanaka\u2019s involvement there helps globalize the project\u2019s reach. It also ties into NTT\u2019s roots in Japan \u2013 showing that while Tanaka\u2019s group sits at Harvard\/NTT in the US, it maintains strong links to Japanese research efforts. The exchange of ideas is two-way: for example, co-authors like <strong>Prof. Masahito Ueda<\/strong> (University of Tokyo) worked with Tanaka\u2019s group on understanding loss landscapes<a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=L,Tanaka\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a>. Through IPI, the project gains access to a broader talent pool and complementary perspectives from the Japanese scientific community.<\/li>\n\n\n\n<li><strong>Other Key Collaborators:<\/strong> The project is inherently collaborative, and many individuals have contributed. A few notable ones:\n<ul class=\"wp-block-list\">\n<li><strong>Prof. 
Surya Ganguli (Stanford University):<\/strong> Tanaka was a postdoc with Surya Ganguli and co-authored several foundational papers with him (including the synaptic flow pruning<a href=\"https:\/\/papers.nips.cc\/paper\/2020\/hash\/46a4378f835dc8040c8057beb6a2da52-Abstract.html#:~:text=\" target=\"_blank\" rel=\"noreferrer noopener\">papers.nips.cc<\/a> and Noether\u2019s dynamics work). Ganguli, a Stanford physicist\/neuroscientist, remains a close collaborator \u2013 the 2025 NTT press release explicitly states plans to continue collaborating with <strong>Surya Ganguli<\/strong> on the project<a href=\"https:\/\/ntt-research.com\/ntt-research-launches-new-physics-of-artificial-intelligence-group\/#:~:text=Princeton%20University%20Assistant%20Professor%20,Scientist%20Maya%20Okawa%20and%20NTT\" target=\"_blank\" rel=\"noreferrer noopener\">ntt-research.com<\/a>. Ganguli\u2019s lab brings expertise in theoretical neuroscience and deep learning theory, likely collaborating on projects like the ghost dynamics (which had many Stanford co-authors<a href=\"https:\/\/ar5iv.labs.arxiv.org\/html\/2501.02378#:~:text=Fatih%20Dinc%20%20CNC%20Program%2C,Sunnyvale%2C%20CA\" target=\"_blank\" rel=\"noreferrer noopener\">ar5iv.labs.arxiv.org<\/a>) and mechanistic mode connectivity. This Stanford-Harvard-NTT triangle has been very productive, blending Ganguli\u2019s theoretical savvy with Tanaka\u2019s cross-domain approach.<\/li>\n\n\n\n<li><strong>Dr. Daniel Kunin:<\/strong> Kunin was a Stanford PhD student who co-led the \u201cNeural Mechanics\u201d and \u201cNoether\u2019s dynamics\u201d papers<a href=\"https:\/\/ai.stanford.edu\/blog\/neural-mechanics\/#:~:text=URL%3A%20https%3A%2F%2Fai.stanford.edu%2Fblog%2Fneural,The%20Stanford%20AI%20Lab%20Blog\" target=\"_blank\" rel=\"noreferrer noopener\">ai.stanford.edu<\/a>. He was a driving force in those early theoretical works and worked with Tanaka at NTT PHI Lab. 
Kunin is listed as a former member of Tanaka\u2019s group (now finishing PhD at Stanford)<a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=,Computational%20%26%20Mathematical%20Engineering%2C%20Stanford\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a>. His contribution is central to the symmetry\/conservation law line of research. The continuity of collaboration is evident: Tanaka and Kunin published multiple papers in 2020\u20132021, establishing much of the theoretical foundation<a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=D.%20Kunin%2A%2C%20J.%20Sagastuy,Tanaka\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a><a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=H\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a>.<\/li>\n\n\n\n<li><strong>Dr. Gautam Reddy and Dr. Logan Wright:<\/strong> Both were colleagues of Tanaka in the NTT PHI Lab\u2019s Intelligent Systems group and are now professors (Reddy at Princeton, Wright at Yale). They contributed to early interdisciplinary projects \u2013 for instance, Reddy is co-author on the animal behavior sequences paper<a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=G,Wyart%20%282022\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a>. The 2025 press release mentions <strong>collaboration with Gautam Reddy (Princeton)<\/strong> as ongoing<a href=\"https:\/\/ntt-research.com\/ntt-research-launches-new-physics-of-artificial-intelligence-group\/#:~:text=The%20new%20group%20will%20continue,Previous%20contributions%20to%20date%20include\" target=\"_blank\" rel=\"noreferrer noopener\">ntt-research.com<\/a>. This indicates that even after leaving NTT, Reddy remains part of the physics-of-intelligence endeavor, perhaps focusing on links to biological physics and behavior.<\/li>\n\n\n\n<li><strong>Dr. 
Kyogo Kawaguchi (NTT &amp; RIKEN):<\/strong> Kawaguchi co-authored the <em>percolation model of emergence<\/em> paper<a href=\"https:\/\/arxiv.org\/abs\/2408.12578#:~:text=,Lubana%2C%20Kyogo%20Kawaguchi%2C%20Robert%20P\" target=\"_blank\" rel=\"noreferrer noopener\">arxiv.org<\/a>. This collaboration (including Robert Dick from the University of Michigan<a href=\"https:\/\/arxiv.org\/abs\/2408.12578#:~:text=Authors%3AEkdeep%20Singh%20Lubana%20%2C%20,Dick%20%2C%20%2013\" target=\"_blank\" rel=\"noreferrer noopener\">arxiv.org<\/a>) shows the project\u2019s network extending to specialists in different subfields, from the theory of deep learning to hardware (Dick\u2019s background is in computer engineering).<\/li>\n\n\n\n<li><strong>Interdisciplinary Students:<\/strong> A number of PhD students from various universities have been part of the project (often as visiting researchers or co-advised). For example, <strong>Ekdeep Lubana<\/strong> (U. Michigan) has co-authored many papers on in-context learning and emergence<a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=C,Tanaka\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a><a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=E.%20S.%20Lubana,Tanaka\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a> and is now a postdoc at NTT; <strong>Yongyi Yang<\/strong> (Michigan), <strong>Bhavya Vasudeva<\/strong> (USC), <strong>Rahul Ramesh<\/strong> (UPenn), <strong>Bo Zhao<\/strong> (UCSD) all contributed to publications in 2023\u20132024 dealing with transformers\u2019 capabilities<a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=Y,Tanaka\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a><a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=R,Tanaka\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a>. 
Their diverse home institutions highlight the collaborative web Tanaka\u2019s group has spun \u2013 an extended \u201clab\u201d that crosses university boundaries. This also reflects the <strong>CBS-NTT fellowship program\u2019s role<\/strong> in bringing in fresh talent from different fields (e.g., a psychology student working alongside a physics student).<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">In essence, the Physics of Intelligence project thrives on a <strong>consortium of academia and industry<\/strong>: <strong>NTT Research provides a platform and funding, Harvard provides an interdisciplinary scientific environment, MIT IAIFI and others provide intellectual cross-fertilization, and a host of collaborators bring expertise from theoretical physics to neuroscience.<\/strong> The formal CBS-NTT Program at Harvard<a href=\"https:\/\/news.harvard.edu\/gazette\/story\/newsplus\/gift-establishes-new-program-in-physics-of-intelligence-at-center-for-brain-science\/#:~:text=The%20new%20program%20will%20amplify,neuroscientists%20and%20psychologists%20at%20Harvard\" target=\"_blank\" rel=\"noreferrer noopener\">news.harvard.edu<\/a> and the new NTT PAI Group<a href=\"https:\/\/ntt-research.com\/ntt-research-launches-new-physics-of-artificial-intelligence-group\/#:~:text=,ongoing%20collaboration%20with%20academic%20partners\" target=\"_blank\" rel=\"noreferrer noopener\">ntt-research.com<\/a> ensure that this collaboration is sustained through joint appointments and funding streams. Such a structure is somewhat novel \u2013 it is neither a purely academic lab nor a siloed corporate lab, but a hybrid. This allows the project to pursue long-term fundamental questions (something academia excels at) while keeping an eye on real-world impact and applications (the forte of industry labs). It also encourages <strong>students and postdocs to move fluidly<\/strong> between academic and industrial research settings. 
The result is a <strong>global effort<\/strong> (U.S., Japan, etc.) with a shared vision.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">One can see the influence of this collaborative network in the direction of research: for instance, the emphasis on <em>trustworthy AI<\/em> and <em>green AI<\/em> comes partly from NTT\u2019s priorities (and its clients\u2019 needs), whereas the push to unify with neuroscience comes from the Harvard side and Tanaka\u2019s own physics\/neuro background. With stakeholders like NTT\u2019s CEO explicitly naming the goals of <em>unbiased, trustworthy, and green AI<\/em><a href=\"https:\/\/news.harvard.edu\/gazette\/story\/newsplus\/gift-establishes-new-program-in-physics-of-intelligence-at-center-for-brain-science\/#:~:text=the%20Physics%20of%20Intelligence%2C%E2%80%9D%20NTT,%E2%80%9D\" target=\"_blank\" rel=\"noreferrer noopener\">news.harvard.edu<\/a>, the project aligns its scientific questions with broader societal and technological priorities, as discussed in the section on broader impact below.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Publications, Communications, and Key Results (2020\u20132025)<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Over the past few years, Tanaka\u2019s team has produced a rich body of <strong>academic papers, blog articles, and talks<\/strong> that articulate the vision and report results of the Physics of Intelligence initiative. 
Here we <strong>review some of the prominent outputs<\/strong> and their significance:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Foundational Papers (2020\u20132021):<\/strong> The groundwork was laid with a series of theoretical papers:\n<ul class=\"wp-block-list\">\n<li><em>NeurIPS 2020:<\/em> <strong>\u201cPruning neural networks without any data by iteratively conserving synaptic flow\u201d<\/strong> \u2013 Introduced the <strong>SynFlow<\/strong> algorithm<a href=\"https:\/\/papers.nips.cc\/paper\/2020\/hash\/46a4378f835dc8040c8057beb6a2da52-Abstract.html#:~:text=ever%20training%2C%20or%20indeed%20without,at%20initialization%20subject%20to%20a\" target=\"_blank\" rel=\"noreferrer noopener\">papers.nips.cc<\/a>. This paper\u2019s <em>vision<\/em>: even at initialization, a network has a \u201cflow of synaptic strengths\u201d that should be conserved when pruning to avoid losing capacity<a href=\"https:\/\/papers.nips.cc\/paper\/2020\/hash\/46a4378f835dc8040c8057beb6a2da52-Abstract.html#:~:text=first%20mathematically%20formulate%20and%20experimentally,art\" target=\"_blank\" rel=\"noreferrer noopener\">papers.nips.cc<\/a>. It verified (experimentally and theoretically) a <em>conservation law<\/em> that explained failures of prior methods and demonstrated a new pruning technique that achieved up to 99.99% sparsity without training data<a href=\"https:\/\/papers.nips.cc\/paper\/2020\/hash\/46a4378f835dc8040c8057beb6a2da52-Abstract.html#:~:text=entirely%20avoided%2C%20motivating%20a%20novel,used%20to%20quantify%20which%20synapses\" target=\"_blank\" rel=\"noreferrer noopener\">papers.nips.cc<\/a>. 
This was one of the earliest concrete successes of the physics-of-AI approach (conservation law \u2192 algorithm).<\/li>\n\n\n\n<li><em>ICLR 2021:<\/em> <strong>\u201cNeural Mechanics: Symmetry and Broken Conservation Laws in Deep Learning Dynamics\u201d<\/strong> \u2013 A highly cited work by Kunin, Tanaka, et al., accompanied by a <strong>Stanford AI Lab Blog post<\/strong> explaining it in accessible terms<a href=\"https:\/\/ai.stanford.edu\/blog\/neural-mechanics\/#:~:text=Just%20like%20the%20fundamental%20laws,world%20datasets\" target=\"_blank\" rel=\"noreferrer noopener\">ai.stanford.edu<\/a>. This paper systematically identified symmetries in common network architectures (translation invariance in softmax weights, scale invariance in batchnorm, etc.) and derived the associated gradient constraints and conservation laws<a href=\"https:\/\/ai.stanford.edu\/blog\/neural-mechanics\/#:~:text=Symmetries%20in%20the%20loss%20shape,gradient%20and%20Hessian%20geometry\" target=\"_blank\" rel=\"noreferrer noopener\">ai.stanford.edu<\/a><a href=\"https:\/\/ai.stanford.edu\/blog\/neural-mechanics\/#:~:text=%5C%5B%5Cbegin,%7C%5Ctheta_%7B%5Cmathcal%7BA%7D_2%7D%280%29%7C%5E2%20%5Cend%7Baligned\" target=\"_blank\" rel=\"noreferrer noopener\">ai.stanford.edu<\/a>. It then extended the theory to <em>modified gradient flow<\/em> for finite learning rates, providing exact formulas for how those conserved quantities evolve when the ideal is broken<a href=\"https:\/\/ai.stanford.edu\/blog\/neural-mechanics\/#:~:text=A%20realistic%20continuous%20model%20for,stochastic%20gradient%20descent\" target=\"_blank\" rel=\"noreferrer noopener\">ai.stanford.edu<\/a><a href=\"https:\/\/ai.stanford.edu\/blog\/neural-mechanics\/#:~:text=Combining%20symmetry%20and%20modified%20gradient,to%20derive%20exact%20learning%20dynamics\" target=\"_blank\" rel=\"noreferrer noopener\">ai.stanford.edu<\/a>. 
The key result was showing that even <em>state-of-the-art networks on real data<\/em> respect these physics-derived dynamics to a large degree<a href=\"https:\/\/ar5iv.labs.arxiv.org\/html\/2012.04728#:~:text=architecture%20that%20are%20present%20for,symmetries%20to%20derive%20exact%20integral\" target=\"_blank\" rel=\"noreferrer noopener\">ar5iv.labs.arxiv.org<\/a><a href=\"https:\/\/ar5iv.labs.arxiv.org\/html\/2012.04728#:~:text=associated%20conservation%20law%20in%20the,our%20work%20demonstrates%20that%20we\" target=\"_blank\" rel=\"noreferrer noopener\">ar5iv.labs.arxiv.org<\/a> (and deviations can be predicted by their continuous models). The SAIL blog article drew parallels to classical mechanics and highlighted how <strong>\u201ceach symmetry of a network architecture has a corresponding \u2018conserved quantity\u2019 through training\u201d<\/strong><a href=\"https:\/\/ai.stanford.edu\/blog\/neural-mechanics\/#:~:text=Strikingly%20similar%20to%20Noether%E2%80%99s%20theorem%2C,constant%20under%20gradient%20flow%20dynamics\" target=\"_blank\" rel=\"noreferrer noopener\">ai.stanford.edu<\/a>. This helped seed the idea that understanding learning dynamics is like discovering the laws of motion for neural networks.<\/li>\n\n\n\n<li><em>NeurIPS 2021:<\/em> <strong>\u201cNoether\u2019s Learning Dynamics: Role of Symmetry Breaking in Neural Networks\u201d<\/strong> \u2013 This follow-up (Tanaka &amp; Kunin) pushed the envelope by incorporating Kinetic Symmetry Breaking (KSB)<a href=\"https:\/\/openreview.net\/forum?id=fiPtD7iXuhn#:~:text=discrete%20learning%20dynamics%20of%20gradient,of%20implicit%20adaptive%20optimization%2C%20establishing\" target=\"_blank\" rel=\"noreferrer noopener\">openreview.net<\/a>. 
It framed gradient descent in a Lagrangian mechanics picture and derived <strong>Noether\u2019s Learning Dynamics (NLD)<\/strong>, an equation describing the motion of the conserved quantities when symmetries are broken<a href=\"https:\/\/openreview.net\/forum?id=fiPtD7iXuhn#:~:text=discrete%20learning%20dynamics%20of%20gradient,of%20implicit%20adaptive%20optimization%2C%20establishing\" target=\"_blank\" rel=\"noreferrer noopener\">openreview.net<\/a><a href=\"https:\/\/openreview.net\/forum?id=fiPtD7iXuhn#:~:text=energy%20explicitly%20breaks%20the%20symmetry,learning%20dynamics%20of%20neural%20networks\" target=\"_blank\" rel=\"noreferrer noopener\">openreview.net<\/a>. The paper applied NLD to show how <em>normalization layers act as an implicit optimizer<\/em>, analytically linking a design choice to an optimization benefit<a href=\"https:\/\/openreview.net\/forum?id=fiPtD7iXuhn#:~:text=energy%20explicitly%20breaks%20the%20symmetry,learning%20dynamics%20of%20neural%20networks\" target=\"_blank\" rel=\"noreferrer noopener\">openreview.net<\/a><a href=\"https:\/\/openreview.net\/forum?id=fiPtD7iXuhn#:~:text=resulting%20motion%20of%20the%20Noether,Lagrangian%20mechanics%2C%20we%20have%20established\" target=\"_blank\" rel=\"noreferrer noopener\">openreview.net<\/a>. An NTT blog summary called it \u201cThe Role of Kinetic Symmetry Breaking in Deep Learning\u201d and emphasized how this theoretical framework can identify <em>geometric design principles<\/em> for neural network training<a href=\"https:\/\/openreview.net\/forum?id=fiPtD7iXuhn#:~:text=of%20symmetry%20breaking%20is%20not,account%20KSB%20and%20derive%20the\" target=\"_blank\" rel=\"noreferrer noopener\">openreview.net<\/a><a href=\"https:\/\/openreview.net\/forum?id=fiPtD7iXuhn#:~:text=energy%20explicitly%20breaks%20the%20symmetry,geometric%20design%20principles%20for%20the\" target=\"_blank\" rel=\"noreferrer noopener\">openreview.net<\/a>. 
Together, the Neural Mechanics and NLD papers form a one-two punch: the former identifies conservation laws; the latter explains what happens when you break them on purpose for better learning. These works garnered attention in the ML theory community and are <em>empirically supported<\/em> by experiments on networks like ResNets and VGGs<a href=\"https:\/\/ai.stanford.edu\/blog\/neural-mechanics\/#:~:text=Fig,dynamics%20are%20smooth%20and%20patterned\" target=\"_blank\" rel=\"noreferrer noopener\">ai.stanford.edu<\/a><a href=\"https:\/\/ar5iv.labs.arxiv.org\/html\/2012.04728#:~:text=Image%3A%20Refer%20to%20caption\" target=\"_blank\" rel=\"noreferrer noopener\">ar5iv.labs.arxiv.org<\/a>.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Interdisciplinary Papers (2022\u20132023):<\/strong> As the program grew, outputs diversified:\n<ul class=\"wp-block-list\">\n<li><em>NeurIPS 2022:<\/em> Papers like <strong>\u201cBeyond BatchNorm: Towards a Unified Understanding of Normalization in Deep Learning\u201d<\/strong><a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=E,Tanaka\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a> attempted to demystify <em>why<\/em> tricks like BatchNorm, LayerNorm, etc. work, using a unifying theoretical lens. 
This likely drew on the symmetry concepts to show commonalities between normalization methods (though details are beyond our scope here).<\/li>\n\n\n\n<li><em>PLoS Comp Bio 2022:<\/em> <strong>\u201cA lexical approach for identifying behavioural action sequences\u201d<\/strong><a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=G,Wyart%20%282022\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a> \u2013 an application of AI methods to neuroscience data (zebrafish or rodent behavior), showing the project\u2019s expanding reach to <em>natural intelligence<\/em>.<\/li>\n\n\n\n<li><em>Neuron 2023:<\/em> <strong>\u201cInterpreting the retinal neural code for natural scenes: from computations to neurons\u201d<\/strong><a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=N.%20Maheswaranathan,Baccus\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a> \u2013 This high-profile neuroscience paper used deep learning models and theory to interpret how the retina encodes natural scenes, exemplifying the <em>two-way street<\/em> of the collaboration (AI helping neuroscience). Tanaka was a <em>co-first author<\/em> on the theory side, highlighting his group\u2019s role in the theory behind the analyses.<\/li>\n\n\n\n<li><em>Neural Computation 2023:<\/em> <strong>\u201cRethinking the limiting dynamics of SGD: modified loss, phase space oscillations and anomalous diffusion\u201d<\/strong><a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=D.%20Kunin%2A%2C%20J.%20Sagastuy,Yamins\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a> \u2013 Another theoretical work (with Ganguli\u2019s lab) that studied SGD\u2019s behavior as a dynamical system, finding phenomena like oscillatory modes and anomalous diffusion in the weight trajectory. 
This connects with the idea of understanding training at a fundamental level (possibly identifying regimes where training dynamics resemble physical processes like diffusion).<\/li>\n\n\n\n<li><em>NeurIPS 2023:<\/em> <strong>\u201cCORNN: Convex Optimization of Recurrent Neural Networks for rapid inference of neural dynamics\u201d<\/strong><a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=F.%20Dinc,Tanaka\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a> \u2013 This paper (Dinc et al.) introduces a method to optimally configure recurrent nets to emulate neural dynamics quickly, potentially useful for brain-machine interfaces. It shows the practical offshoots of understanding neural mechanics: if you know the dynamics, you can <em>design networks to meet them<\/em>. It\u2019s an interesting blend of convex optimization and neuroscience, again reflecting the interdisciplinary nature of the lab.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Emergent Behavior and In-Context Learning (2023\u20132025):<\/strong> A major thrust in the past two years has been investigating <em>emergent abilities and in-context learning<\/em> in large models (especially Transformers). Some notable outputs:<ul><li><em>ICLR 2024:<\/em> <strong>\u201cDynamics of Concept Learning and Compositional Generalization\u201d<\/strong><a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=Y,Tanaka\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a> \u2013 Studied how networks acquire abstract concepts and compositional skills over training, using synthetic tasks. 
The authors likely identified distinct stages or sudden improvements, contributing to understanding <em>grokking<\/em> (where test performance jumps after a delay).<\/li><li><em>ICLR 2024:<\/em> <strong>\u201cIn-Context Learning Dynamics with Random Binary Sequences\u201d<\/strong><a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a> \u2013 Focused on how Transformers can learn to do tasks <em>within<\/em> their forward pass (in-context learning) without gradient updates. Using tasks built on random binary sequences, the authors analyzed what mechanisms allow models to learn from prompts. This bears directly on how large language models can perform new tasks just from examples in a prompt \u2013 a capability that is still poorly understood. The team\u2019s work provides <em>algorithmic phase<\/em> interpretations: e.g., showing that in-context learning might switch from a \u201cmemorization phase\u201d to a \u201cgeneralization phase\u201d depending on prompt diversity<a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=C,Tanaka\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a>.<\/li><li><em>NeurIPS 2024:<\/em> <strong>\u201cEmergence of Hidden Capabilities: Exploring Learning Dynamics in Concept Space\u201d<\/strong><a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=C.%20F.%20Park,E.S.%20Lubana\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a> \u2013 Likely an extension of emergence studies, possibly identifying <em>when<\/em> hidden skills surface during training.<\/li><li><em>ICML 2024:<\/em> Two papers on reasoning in Transformers<a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=R,Tanaka\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a><a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=M,H.%20Tanaka\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a> 
\u2013 one on stepwise reasoning in a graph navigation task, another on effects of fine-tuning on procedural tasks. These works try to open the black box of <em>reasoning processes<\/em> in AI: e.g., how a Transformer\u2019s internal states evolve when doing multi-step inference, and how fine-tuning alters that.<\/li><li><em>ICLR 2025:<\/em> As per the site\u2019s news, <strong>5 works were accepted at ICLR 2025<\/strong><a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=News%3A\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a>, indicating a significant volume of contributions. Among these is <em>likely<\/em> the \u201cPercolation model of emergence\u201d (which was on arXiv Aug 2024 and would align with ICLR\u201925 timing). Indeed, the arXiv for <strong>\u201cA Percolation Model of Emergence: Analyzing Transformers on a Formal Language\u201d<\/strong> shows it was updated in Sep 2024<a href=\"https:\/\/arxiv.org\/abs\/2408.12578#:~:text=arXiv%3A2408\" target=\"_blank\" rel=\"noreferrer noopener\">arxiv.org<\/a><a href=\"https:\/\/arxiv.org\/abs\/2408.12578#:~:text=,experimental%20system%20grounded%20in%20a\" target=\"_blank\" rel=\"noreferrer noopener\">arxiv.org<\/a> \u2013 presumably now accepted to ICLR 2025. This paper is particularly noteworthy: it formalizes the emergence concept and confirms it empirically with controlled tasks<a href=\"https:\/\/arxiv.org\/abs\/2408.12578#:~:text=,experimental%20system%20grounded%20in%20a\" target=\"_blank\" rel=\"noreferrer noopener\">arxiv.org<\/a><a href=\"https:\/\/arxiv.org\/abs\/2408.12578#:~:text=Specifically%2C%20we%20show%20that%20once,predicting%20emergence%20in%20neural%20networks\" target=\"_blank\" rel=\"noreferrer noopener\">arxiv.org<\/a>, as discussed earlier. 
It also stands out as bridging to questions of <em>AI governance<\/em>, noting that understanding emergence is crucial for risk management of AI<a href=\"https:\/\/arxiv.org\/abs\/2408.12578#:~:text=,generating%20process%20as%20a\" target=\"_blank\" rel=\"noreferrer noopener\">arxiv.org<\/a>.<\/li><li>Another ICLR 2025 work likely from the list is <strong>\u201cCompetition Dynamics Shape Algorithmic Phases of In-Context Learning\u201d<\/strong><a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=ICLR%20,Representations\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a>. The title suggests they found that as a model tries to learn in-context, different \u201calgorithms\u201d (perhaps pattern-matching vs reasoning) compete, and the dominant one shifts in phases. This is a deep insight for interpretability of LLMs and connects to the physics idea of <em>phase transitions<\/em> or <em>phase competition<\/em> in a system.<\/li><\/ul>The <strong>common thread<\/strong> in these recent papers is <strong>demystifying emergent and higher-order behaviors of AI<\/strong> (like in-context learning, reasoning, compositionality) using simplified tasks and theoretical analogies. These are exactly the behaviors that have captured public attention (e.g., how GPT-4 suddenly can do multi-step math) and also worry experts (unpredictable emergence of capabilities). By publishing in top venues (NeurIPS, ICLR, ICML) and writing accessible summaries (e.g., Stanford blog, Twitter threads by team members), Tanaka\u2019s team is actively disseminating their findings to both specialists and the broader AI community. 
For instance, the team often posts <em>Twitter highlight threads<\/em> (the site links a tweet for the Neural Mechanics paper<a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=,Laws%20in%20Deep%20Learning%20Dynamics\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a> and others), indicating a conscious effort to communicate results widely.<\/li>\n\n\n\n<li><strong>Talks and Media:<\/strong> The project has been covered in press releases and talks targeted to broader audiences:\n<ul class=\"wp-block-list\">\n<li><strong>Harvard Gazette (Apr 2024):<\/strong> The announcement of the Harvard\u2013NTT program gave a concise summary of the project\u2019s goals: <em>using physics to tackle fundamental questions in intelligence, bridging multiple disciplines<\/em><a href=\"https:\/\/news.harvard.edu\/gazette\/story\/newsplus\/gift-establishes-new-program-in-physics-of-intelligence-at-center-for-brain-science\/#:~:text=The%20two,meetings%2C%20and%20other%20associated%20costs\" target=\"_blank\" rel=\"noreferrer noopener\">news.harvard.edu<\/a>, and pursuing urgent goals like unbiased, trustworthy, green AI<a href=\"https:\/\/news.harvard.edu\/gazette\/story\/newsplus\/gift-establishes-new-program-in-physics-of-intelligence-at-center-for-brain-science\/#:~:text=%E2%80%9CWe%20are%20thrilled%20to%20support,%E2%80%9D\" target=\"_blank\" rel=\"noreferrer noopener\">news.harvard.edu<\/a>. 
It included quotes framing AI as potentially launching a new field in physics, much as historic inventions did<a href=\"https:\/\/news.harvard.edu\/gazette\/story\/newsplus\/gift-establishes-new-program-in-physics-of-intelligence-at-center-for-brain-science\/#:~:text=%E2%80%9CWe%20are%20thrilled%20to%20support,%E2%80%9D\" target=\"_blank\" rel=\"noreferrer noopener\">news.harvard.edu<\/a>.<\/li>\n\n\n\n<li><strong>Harvard Crimson (Apr 2024):<\/strong> The student newspaper\u2019s piece reinforced that narrative, noting that Harvard was chosen after a \u201cbidding process\u201d among institutions because its approach aligned with NTT\u2019s vision<a href=\"https:\/\/www.thecrimson.com\/article\/2024\/4\/17\/harvard-cbs-awarded-ntt-grant\/#:~:text=Before%20selecting%20Harvard%E2%80%99s%20Center%20for,we%20dream%20of%2C%E2%80%9D%20Gomi%20said\" target=\"_blank\" rel=\"noreferrer noopener\">thecrimson.com<\/a>. It highlighted the interdisciplinary hiring (bringing in new PhDs from varied fields) and quoted Tanaka on the need for people from diverse backgrounds to build this new field<a href=\"https:\/\/www.thecrimson.com\/article\/2024\/4\/17\/harvard-cbs-awarded-ntt-grant\/#:~:text=Hidenori%20Tanaka%2C%20a%20CBS%20associate,%E2%80%9D\" target=\"_blank\" rel=\"noreferrer noopener\">thecrimson.com<\/a>. 
Tellingly, Murthy (CBS director) admitted <em>\u201cthis is new for all of us\u2026 How do you explain intelligent behavior in equations or physics terms?\u201d<\/em><a href=\"https:\/\/www.thecrimson.com\/article\/2024\/4\/17\/harvard-cbs-awarded-ntt-grant\/#:~:text=Despite%20the%20Center%E2%80%99s%20broader%20goals,are%20yet%20to%20be%20determined\" target=\"_blank\" rel=\"noreferrer noopener\">thecrimson.com<\/a> \u2013 capturing the excitement and uncertainty of this venture.<\/li>\n\n\n\n<li><strong>NTT Upgrade 2025 Summit (Mar 2025):<\/strong> Tanaka\u2019s group was featured in NTT\u2019s annual R&amp;D event, with a talk titled <em>\u201cPhysics of Intelligence for Trustworthy and Green AI\u201d<\/em>. In a <strong>Unite.AI article covering that event<\/strong>, Tanaka is quoted reflecting on profound questions: <em>\u201cMathematically, how can you think of the concept of creativity? \u2026 kindness? These concepts would have remained abstract if not for AI\u2026 now, if we want to make AI kind, we have to tell it in the language of mathematics what kindness is.\u201d<\/em><a href=\"https:\/\/www.unite.ai\/ntt-research-launches-new-physics-of-artificial-intelligence-group-at-harvard\/#:~:text=%E2%80%9CAs%20a%20physicist%20I%20am,sidelines%20of%20the%20Upgrade%20conference\" target=\"_blank\" rel=\"noreferrer noopener\">unite.ai<\/a>. This quote exemplifies the philosophical angle of the project \u2013 using AI as a <strong>testbed to formalize fuzzy concepts<\/strong> (creativity, kindness) that physics and math traditionally sidestep. The talk and article also reiterated the black box problem and how <em>applying scientific methods from physics can demystify AI\u2019s learning processes<\/em><a href=\"https:\/\/www.unite.ai\/ntt-research-launches-new-physics-of-artificial-intelligence-group-at-harvard\/#:~:text=Early%20on%20in%20their%20research%2C,governance%20decisions%20on%20AI%20adoption\" target=\"_blank\" rel=\"noreferrer noopener\">unite.ai<\/a>. 
The Unite.AI piece did a nice job contextualizing Physics of AI as a response to real incidents and concerns (self-driving car failures, biased hiring algorithms) that motivate the need for <em>trust and safety<\/em><a href=\"https:\/\/www.unite.ai\/ntt-research-launches-new-physics-of-artificial-intelligence-group-at-harvard\/#:~:text=From%20an%20AI,humans%20to%20achieve%20higher%20understanding\" target=\"_blank\" rel=\"noreferrer noopener\">unite.ai<\/a><a href=\"https:\/\/www.unite.ai\/ntt-research-launches-new-physics-of-artificial-intelligence-group-at-harvard\/#:~:text=and%20biases%20exhibited%20by%20AI,humans%20to%20achieve%20higher%20understanding\" target=\"_blank\" rel=\"noreferrer noopener\">unite.ai<\/a>. It effectively communicated to a general audience why merging physics, psychology, neuroscience, and AI is a timely pursuit to ensure AI benefits society.<\/li>\n\n\n\n<li><strong>YouTube and Conference Talks:<\/strong> Hidenori Tanaka has also given talks at various venues explaining pieces of this research. For example, at <strong>Stanford Symsys (2023)<\/strong> he talked about <em>\u201cPhysics of intelligence for trustworthy and green AI\u201d<\/em><a href=\"https:\/\/neuroscience.stanford.edu\/events\/hidenori-tanaka-physics-intelligence-trustworthy-and-green-ai#:~:text=Hidenori%20Tanaka%20,condensed%20matter%20physics%20from\" target=\"_blank\" rel=\"noreferrer noopener\">neuroscience.stanford.edu<\/a><a href=\"https:\/\/www.youtube.com\/watch?v=sKeJ_tuhGso#:~:text=,at\" target=\"_blank\" rel=\"noreferrer noopener\">youtube.com<\/a> (the title suggests aligning with the key impact themes). 
On YouTube, one can find his presentation <em>\u201c3 Mechanisms Underlying Emergent Abilities in Generative Models\u201d<\/em><a href=\"https:\/\/www.youtube.com\/watch?v=sKeJ_tuhGso#:~:text=Hidenori%20Tanaka%20%E2%80%93%203%20Mechanisms,at\" target=\"_blank\" rel=\"noreferrer noopener\">youtube.com<\/a> \u2013 likely a summary of recent findings on emergence, perhaps given at a workshop or symposium. These talks are important for disseminating the project\u2019s vision beyond written papers, allowing interactive discussion with both experts in AI and in other fields (physics, cognitive science).<\/li>\n\n\n\n<li><strong>Media Recognition:<\/strong> The press release in April 2025 about the launch of the NTT Physics of AI Group was picked up by outlets like Yahoo Finance, BusinessWire, and tech blogs<a href=\"https:\/\/finance.yahoo.com\/news\/ntt-research-launches-physics-artificial-150000925.html#:~:text=NTT%20Research%20Launches%20New%20Physics,AM%206%20min%20read\" target=\"_blank\" rel=\"noreferrer noopener\">finance.yahoo.com<\/a><a href=\"https:\/\/www.linkedin.com\/posts\/georgehowellrain_ntt-research-launches-new-physics-of-artificial-activity-7316547860638973954-Ne-r#:~:text=NTT%20Research%20Launches%20New%20Physics,this%20author%20%C2%B7%20Explore%20topics\" target=\"_blank\" rel=\"noreferrer noopener\">linkedin.com<\/a>. Forbes even published an article \u201cThe Laws of Automation: NTT details \u2018Physics of AI\u2019\u201d \u2013 indicating mainstream interest in the idea of laws governing AI. In that coverage, NTT\u2019s framing of <em>\u201cAI as a new force in physics\u201d<\/em> and the focus on <em>understanding AI\u2019s black box for trust<\/em> were echoed. 
The Forbes piece (by Adrian Bridgwater) likely discussed how this group\u2019s creation formalizes a trend of applying scientific rigor to AI, and might have mentioned the bias-removal and pruning contributions as examples of early wins.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">In reviewing these communications, it\u2019s clear the team is <strong>actively shaping the narrative<\/strong> that AI can and should be studied like a natural phenomenon. They underscore both <em>vision<\/em> (grand questions about intelligence, even kindness, in mathematical terms) and <em>results<\/em> (like algorithms that reduce bias or predict training outcomes). <strong>Speculative ideas<\/strong> \u2013 such as finding a unifying theory of intelligence or encoding ethics into AI equations \u2013 are openly discussed in talks and press, but always alongside <strong>empirical progress<\/strong> that builds confidence (e.g., \u201cwe have found exact conserved quantities,\u201d \u201cwe created a bias fix that NIST noted\u201d<a href=\"https:\/\/ntt-research.com\/ntt-research-launches-new-physics-of-artificial-intelligence-group\/#:~:text=,of%20how%20AI%20learns%20concepts\" target=\"_blank\" rel=\"noreferrer noopener\">ntt-research.com<\/a><a href=\"https:\/\/ntt-research.com\/ntt-research-launches-new-physics-of-artificial-intelligence-group\/#:~:text=4%20years%29%20%2A%20A%20bias,scientific%20and%20practical%20insights%3B%20and\" target=\"_blank\" rel=\"noreferrer noopener\">ntt-research.com<\/a>). 
By maintaining this balance, Tanaka\u2019s team has lent credibility to a field that could otherwise sound highly speculative.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">One concrete outcome highlighted in press materials is a <strong>\u201cbias-removal algorithm for large language models (LLMs) recognized by NIST\u201d<\/strong><a href=\"https:\/\/ntt-research.com\/ntt-research-launches-new-physics-of-artificial-intelligence-group\/#:~:text=%2A%20A%20bias,of%20how%20AI%20learns%20concepts\" target=\"_blank\" rel=\"noreferrer noopener\">ntt-research.com<\/a>. Although details aren\u2019t given in the press release, this presumably refers to a method developed by the group to identify and remove latent biases in an LLM\u2019s outputs. NIST (the U.S. National Institute of Standards and Technology) has been working on AI bias standards, so recognition from it suggests the method offers both <em>scientific insight and practical utility<\/em>. It might be linked to one of the group\u2019s works on <em>knowledge editing or representation shattering<\/em> in transformers<a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=K,E.S.%20Lubana\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a>, where altering internal representations can mitigate undesired outputs. This example illustrates how a line of theoretical inquiry (e.g., understanding model internals as physical systems) can yield a tool for a pressing real-world problem (AI fairness).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In summary, the project\u2019s outputs from 2020\u20132025 span theoretical breakthroughs, practical algorithms, cross-disciplinary studies, and public-facing communications. The <strong>academic papers<\/strong> form the backbone, presenting peer-reviewed evidence of the approach\u2019s validity. The <strong>blog posts and press articles<\/strong> translate those findings for broader consumption and link them to the bigger picture (AI safety, ethics, efficiency). 
The <strong>talks and conferences<\/strong> help build a community and influence how other researchers think about AI (for instance, inspiring others to consider physics analogies or to use synthetic tasks to probe their models). Collectively, these outputs have started to <strong>outline the \u201cphysics of intelligence\u201d paradigm<\/strong>: we see its foundational principles (invariance, dynamics, emergence), its methodologies (analytical and experimental), and glimpses of its impact (bias reduction, explanation of black-box behavior).<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Broader Technological and Philosophical Impact<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">One of the most important aspects of the Physics of Intelligence initiative is its potential impact on how we <strong>trust, interpret, and efficiently implement AI systems<\/strong>. By uncovering fundamental principles of intelligence, the project aims to address key societal and technical challenges of modern AI:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Interpretability and Understanding (\u201cDemystifying the Black Box\u201d):<\/strong> AI models, especially deep neural networks, have long been criticized as opaque \u201cblack boxes.\u201d By applying physics-style analysis, Tanaka\u2019s team is chipping away at that opacity. 
If we know, for example, that <em>certain combinations of weights follow a conserved quantity or a simple dynamical law<\/em>, we have a handle on the model\u2019s internal state that is both interpretable and predictive<a href=\"https:\/\/ai.stanford.edu\/blog\/neural-mechanics\/#:~:text=effectively%20restricting%20the%20possible%20trajectory,their%20dynamics%20to%20a%20hyperbola\" target=\"_blank\" rel=\"noreferrer noopener\">ai.stanford.edu<\/a><a href=\"https:\/\/ar5iv.labs.arxiv.org\/html\/2012.04728#:~:text=Figure%203%3A%20Visualizing%20conservation,black%20lines%20are%20level%20sets\" target=\"_blank\" rel=\"noreferrer noopener\">ar5iv.labs.arxiv.org<\/a>. This contributes to <strong>mechanistic interpretability<\/strong> \u2013 understanding <em>how<\/em> a network\u2019s parameters and activations lead to its outputs. For instance, discovering an analogy between batch normalization and an adaptive optimizer<a href=\"https:\/\/openreview.net\/forum?id=fiPtD7iXuhn#:~:text=energy%20explicitly%20breaks%20the%20symmetry,learning%20dynamics%20of%20neural%20networks\" target=\"_blank\" rel=\"noreferrer noopener\">openreview.net<\/a> interprets the role of that layer in a human-understandable way (it\u2019s as if the network is tuning its learning rate for each feature). Similarly, identifying <em>phases<\/em> in a model\u2019s learning (say, a \u201cmemorization phase\u201d versus a \u201cgeneralization phase\u201d in in-context learning) means we can, in principle, detect which phase a model is in by observing certain metrics or internal signals<a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=ICLR%20,Representations\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a>. That helps practitioners know whether a model is likely to generalize or is merely reciting memorized data. 
In the long run, the hope is to achieve <strong>trust through understanding<\/strong>: if the behavior of an AI can be explained by a set of scientific principles (like we explain an airplane\u2019s flight with aerodynamics), users and regulators can trust the AI more. This aligns with NTT\u2019s stated goal of <em>building trust that leads to a harmonious fusion of human and AI<\/em><a href=\"https:\/\/ntt-research.com\/ntt-research-launches-new-physics-of-artificial-intelligence-group\/#:~:text=to%20industry%20applications%20and%20governance,physicists%20have%20done%20over%20many\" target=\"_blank\" rel=\"noreferrer noopener\">ntt-research.com<\/a><a href=\"https:\/\/ntt-research.com\/ntt-research-launches-new-physics-of-artificial-intelligence-group\/#:~:text=leading%20academic%20researchers%2C%20the%20Physics,is%20to%20obtain%20a%20better\" target=\"_blank\" rel=\"noreferrer noopener\">ntt-research.com<\/a>. Rather than just saying \u201cthe network weights did something,\u201d physics of intelligence might let us say \u201cthe network made that decision because it\u2019s conserving X and has entered Y regime of operation.\u201d Such explanations could be audited and debated much like scientific theories, thereby <strong>increasing transparency<\/strong>.<\/li>\n\n\n\n<li><strong>Ethics, Alignment, and Trustworthiness:<\/strong> On a philosophical level, Tanaka\u2019s musings about defining concepts like <em>\u201ckindness\u201d mathematically<\/em><a href=\"https:\/\/www.unite.ai\/ntt-research-launches-new-physics-of-artificial-intelligence-group-at-harvard\/#:~:text=%E2%80%9CAs%20a%20physicist%20I%20am,sidelines%20of%20the%20Upgrade%20conference\" target=\"_blank\" rel=\"noreferrer noopener\">unite.ai<\/a> speak to the <strong>AI alignment problem<\/strong> \u2013 how to ensure AI goals align with human values. 
The approach here is novel: instead of treating ethics as an external set of rules to impose on AI, <em>embed ethical principles into the fundamental understanding of the AI\u2019s operation<\/em>. The PAI group\u2019s mission explicitly mentions <em>integrating ethics from within, rather than through patchwork fine-tuning<\/em><a href=\"https:\/\/ntt-research.com\/ntt-research-launches-new-physics-of-artificial-intelligence-group\/#:~:text=Going%20forward%2C%20the%20Physics%20of,improved%20operations%20and%20data%20control\" target=\"_blank\" rel=\"noreferrer noopener\">ntt-research.com<\/a>. This could mean designing training dynamics that inherently conserve or optimize for fairness metrics, or identifying symmetry principles that correspond to fairness (e.g., requiring that swapping demographic identifiers is a symmetry of the loss, which would enforce unbiased behavior as a conserved \u201ccharge\u201d). The bias-removal algorithm recognized by NIST is a concrete outcome on this front \u2013 it suggests the team found a systematic way to adjust a model to remove biases<a href=\"https:\/\/ntt-research.com\/ntt-research-launches-new-physics-of-artificial-intelligence-group\/#:~:text=%2A%20A%20bias,of%20how%20AI%20learns%20concepts\" target=\"_blank\" rel=\"noreferrer noopener\">ntt-research.com<\/a>, likely informed by understanding of the model\u2019s internal geometry or conservation laws. In addition, trustworthiness comes from <strong>predictability<\/strong>: if emergent behaviors can be predicted (via percolation models or phase diagrams<a href=\"https:\/\/arxiv.org\/abs\/2408.12578#:~:text=Specifically%2C%20we%20show%20that%20once,predicting%20emergence%20in%20neural%20networks\" target=\"_blank\" rel=\"noreferrer noopener\">arxiv.org<\/a>), AI developers can anticipate and mitigate undesirable behaviors <em>before<\/em> deploying the model. 
This proactive stance could inform AI governance; for example, if we know that scaling the training data by some factor will suddenly make the model capable of some dangerous task, we can decide to withhold that scaling. The team\u2019s work already engages with risk considerations: the emergence paper ties understanding emergence to the need to <em>\u201cenable risk regulation frameworks for AI\u201d<\/em><a href=\"https:\/\/arxiv.org\/abs\/2408.12578#:~:text=,generating%20process%20as%20a\" target=\"_blank\" rel=\"noreferrer noopener\">arxiv.org<\/a>. Philosophically, this project suggests that <em>to trust AI, we must first understand its laws<\/em>, similar to how we trust bridges because we understand physics, not just because we tested the bridge many times. If successful, it could shift AI safety from a reactive, empirical field to a principled, scientific one.<\/li>\n\n\n\n<li><strong>Energy Efficiency and Green AI:<\/strong> Modern AI models, especially large ones, consume enormous energy in training and inference. The Physics of Intelligence project addresses this in two ways. First, by <strong>pruning and optimizing networks<\/strong> \u2013 the SynFlow algorithm is an example of striving for <em>sparsity without performance loss<\/em>, which can translate into faster, lower-energy inference<a href=\"https:\/\/papers.nips.cc\/paper\/2020\/hash\/46a4378f835dc8040c8057beb6a2da52-Abstract.html#:~:text=Pruning%20the%20parameters%20of%20deep,verify%20a%20conservation%20law%20that\" target=\"_blank\" rel=\"noreferrer noopener\">papers.nips.cc<\/a><a href=\"https:\/\/papers.nips.cc\/paper\/2020\/hash\/46a4378f835dc8040c8057beb6a2da52-Abstract.html#:~:text=entirely%20avoided%2C%20motivating%20a%20novel,used%20to%20quantify%20which%20synapses\" target=\"_blank\" rel=\"noreferrer noopener\">papers.nips.cc<\/a>. If you can prune 99% of a model\u2019s weights using a physics-informed criterion and still solve the task, you\u2019ve made that model dramatically greener. 
This technique doesn\u2019t rely on data, so it could be applied at initialization to large models to cut down their size before the costly training even begins. Second, by seeking <strong>biologically inspired efficiency<\/strong>. Human brains operate on ~20 watts, while a large AI might use megawatts in a data center. Tanaka\u2019s group, especially via NTT PHI Lab, is aware of this vast gap<a href=\"https:\/\/ntt-research.com\/ntt-research-launches-new-physics-of-artificial-intelligence-group\/#:~:text=The%20Physics%20of%20Artificial%20Intelligence,new%20group%20will%20also%20explore\" target=\"_blank\" rel=\"noreferrer noopener\">ntt-research.com<\/a>. The press release notes that <em>other PHI Lab groups are working on optical computing and novel hardware (like thin-film lithium niobate photonics) to reduce AI\u2019s energy consumption<\/em>, and that <em>the Physics of AI group will look to leverage similarities between brains and neural networks in pursuit of efficiency<\/em><a href=\"https:\/\/ntt-research.com\/ntt-research-launches-new-physics-of-artificial-intelligence-group\/#:~:text=The%20Physics%20of%20Artificial%20Intelligence,new%20group%20will%20also%20explore\" target=\"_blank\" rel=\"noreferrer noopener\">ntt-research.com<\/a><a href=\"https:\/\/ntt-research.com\/ntt-research-launches-new-physics-of-artificial-intelligence-group\/#:~:text=efforts%20to%20reduce%20the%20energy,new%20group%20will%20also%20explore\" target=\"_blank\" rel=\"noreferrer noopener\">ntt-research.com<\/a>. This suggests the group might study features like sparse firing, analog computation, or event-driven processing in brains to inform new architectures that use energy more sparingly. Already, by connecting with neuroscientists, they might identify which computations the brain performs exactly (which might hint at what\u2019s unnecessary in current AI models). 
On the hardware side, if the team\u2019s theoretical insights yield simpler models or new algorithms, those could be implemented in low-power analog or photonic hardware being developed by PHI Lab. In broad terms, <strong>a scientific understanding of AI could reveal redundancies or more efficient pathways<\/strong> that engineers alone might miss. A simple example: if a conservation law implies some weights are effectively unused (conserved in a way that doesn\u2019t affect output), those weights could be pruned or quantized to lower precision, saving energy. Judging by the NTT and Harvard press materials, \u201cgreen AI\u201d is a major promised outcome<a href=\"https:\/\/news.harvard.edu\/gazette\/story\/newsplus\/gift-establishes-new-program-in-physics-of-intelligence-at-center-for-brain-science\/#:~:text=the%20Physics%20of%20Intelligence%2C%E2%80%9D%20NTT,%E2%80%9D\" target=\"_blank\" rel=\"noreferrer noopener\">news.harvard.edu<\/a><a href=\"https:\/\/www.unite.ai\/ntt-research-launches-new-physics-of-artificial-intelligence-group-at-harvard\/#:~:text=The%20newly,for%20the%20past%20five%20years\" target=\"_blank\" rel=\"noreferrer noopener\">unite.ai<\/a>.<\/li>\n\n\n\n<li><strong>Fusion of Human and AI Collaboration:<\/strong> The project\u2019s materials often mention creating a <em>\u201charmonious coexistence\u201d<\/em> or <em>\u201cfusion\u201d<\/em> of human and AI<a href=\"https:\/\/ntt-research.com\/ntt-research-launches-new-physics-of-artificial-intelligence-group\/#:~:text=to%20industry%20applications%20and%20governance,physicists%20have%20done%20over%20many\" target=\"_blank\" rel=\"noreferrer noopener\">ntt-research.com<\/a><a href=\"https:\/\/ntt-research.com\/ntt-research-launches-new-physics-of-artificial-intelligence-group\/#:~:text=establishment%20of%20NTT%20Research%E2%80%99s%20Physics,%E2%80%9D\" target=\"_blank\" rel=\"noreferrer noopener\">ntt-research.com<\/a>. 
This goes beyond just trust \u2013 it imagines AI systems that can integrate into human workflows and society in a seamless, <em>predictable<\/em> way. A physics of intelligence could provide a common language for human cognition and AI cognition. For example, if both brains and networks are described by similar equations or principles, one could design interfaces where they meet optimally (think brain-computer interfaces guided by theory, or AI assistants that truly understand human mental models). While this is a speculative long-term impact, it aligns with philosophical questions: <em>What is the nature of intelligence? Is an AI\u2019s problem-solving fundamentally similar to a human\u2019s?<\/em> If yes, we might use that to make AI decisions more relatable or to enhance human intelligence (by learning from the efficient strategies of algorithms). Tanaka\u2019s quote about everyone being willing to talk about AI and learning from each such conversation<a href=\"https:\/\/www.unite.ai\/ntt-research-launches-new-physics-of-artificial-intelligence-group-at-harvard\/#:~:text=%E2%80%9CCurrently%2C%20AI%20is%20the%20one,Tanaka%20concluded\" target=\"_blank\" rel=\"noreferrer noopener\">unite.ai<\/a> hints at the idea that AI is a unifying subject \u2013 if scientifically understood, it could tie together insights from psychology (how humans think) and computer science (how machines think) into one framework. The philosophical payoff would be enormous: a theory of intelligence could reshape how we view our own minds (just as understanding thermodynamics reshaped our view of heat and life processes in the 19th century).<\/li>\n\n\n\n<li><strong>Limitations and Responsible Innovation:<\/strong> It\u2019s worth noting that while the aspirations are high, the team is careful to validate and not over-claim. They distinguish speculation from supported findings. 
For instance, they <em>do not yet have a single equation<\/em> that explains \u201cintelligent behavior\u201d fully \u2013 Murthy\u2019s question <em>\u201cHow do you explain intelligent behavior in equations?\u201d<\/em> remains partially open<a href=\"https:\/\/www.thecrimson.com\/article\/2024\/4\/17\/harvard-cbs-awarded-ntt-grant\/#:~:text=Despite%20the%20Center%E2%80%99s%20broader%20goals,are%20yet%20to%20be%20determined\" target=\"_blank\" rel=\"noreferrer noopener\">thecrimson.com<\/a>. The project\u2019s impact so far has been more about frameworks and pieces of the puzzle (like explaining one aspect of training or one emergent feature) rather than a grand unified theory. However, even these pieces have practical implications (SynFlow for efficiency, bias removal for fairness). By continuing to chip away, they could gradually build a comprehensive picture. An important philosophical stance here is <strong>humility in the face of complexity<\/strong> \u2013 they approach intelligence with the assumption it <em>can<\/em> be understood (not mystical), but also with respect for its complexity (hence borrowing approaches from fields that handle complexity, like statistical physics). In doing so, they contribute to an AI narrative that is <em>less hyperbolic<\/em> and <em>more scientific<\/em>: rather than \u201cjust trust the deep net\u201d or conversely \u201cAI is incomprehensible and dangerous,\u201d they offer a middle ground of <em>studying AI like we study any complex natural system<\/em>. This attitude could influence regulators and the public to demand more explainable and principled AI. 
It moves the conversation from fearing an alien mind to figuring out its \u201csource code\u201d in nature\u2019s language.<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">To summarize the broader impact: <strong>Physics of Intelligence is pioneering a path to make AI understood, safe, and efficient by treating it as an object of scientific inquiry.<\/strong> If successful, this could transform AI development from an artisanal engineering endeavor into a rigorous discipline grounded in laws and principles. That, in turn, means AI systems might come with guarantees (like how bridges come with stress tolerances), biases might be identifiable and correctable <em>a priori<\/em>, and new AI designs might be discovered by reasoning (instead of trial-and-error). Such a transformation is inherently philosophical too, as it forces us to ask: <em>What does it mean for a machine to \u201cunderstand\u201d or \u201cdecide\u201d? Can those processes be described in the language of physics and math?<\/em> The Tanaka team is betting that the answer is yes \u2013 and that pursuing these questions will not only yield better AI, but also <strong>deeper insights into intelligence itself<\/strong>, including our own.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Visual Overview of the Physics of Intelligence Approach<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\"><em>Figure 2:<\/em> Conceptual flowchart of the Physics of Intelligence approach. The project combines <strong>physics-inspired approaches<\/strong> (left oval) \u2013 such as applying symmetry analysis (Noether\u2019s theorem) and designing controlled experiments \u2013 to derive <strong>scientific findings and principles<\/strong> about learning (center oval), including conserved quantities during training, emergent phase transitions in ability, and analogies between neural network behavior and algorithms. 
These findings in turn inform <strong>outcomes and impacts<\/strong> (right oval): more interpretable AI models (with known invariants and dynamics), <strong>trustworthy systems<\/strong> that embed ethical principles and are predictable, and <strong>energy-efficient algorithms<\/strong> optimized by understanding which computations are essential. In essence, the flowchart shows how foundational science is leveraged to achieve practical AI goals.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The flowchart described in Figure 2 provides a high-level summary: the <strong>input<\/strong> is a set of interdisciplinary scientific techniques; the <strong>output<\/strong> is a set of improvements in AI technology and understanding. This visual underscores that Tanaka\u2019s team doesn\u2019t view AI development as separate from science \u2013 rather, <strong>AI progress can be driven by scientific inquiry<\/strong>, and reciprocally, <strong>AI is a new domain to discover scientific laws<\/strong>. Each arrow (\u2192) in the flowchart can be thought of as <em>research translating into impact<\/em>: e.g., finding a conservation law (middle) leads to a pruning algorithm that makes AI more efficient (right), or using a physics experiment approach (left) leads to discovering emergent behavior rules (middle) that make AI more interpretable (right).<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Summary Table: Goals, Methods, Collaborators, and Outcomes<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Finally, we consolidate the key aspects of Dr. 
Hidenori Tanaka\u2019s Physics of Intelligence project in the table below, summarizing its driving goals, the methodologies employed, the major collaborators\/institutions involved, and the expected outcomes and impacts:<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><thead><tr><th><strong>Goals<\/strong> (Scientific and Societal)<\/th><th><strong>Methodologies<\/strong> (Techniques &amp; Approaches)<\/th><th><strong>Key Collaborators &amp; Institutions<\/strong><\/th><th><strong>Expected Outcomes<\/strong> (Vision &amp; Results)<\/th><\/tr><\/thead><tbody><tr><td>&#8211; <strong>Uncover fundamental laws<\/strong> of learning and intelligence (analogous to physical laws)<a href=\"https:\/\/ntt-research.com\/pai-group\/#:~:text=AI%20is%20quite%20possibly%20the,be%20implemented%20to%20further%20humankind\" target=\"_blank\" rel=\"noreferrer noopener\">ntt-research.com<\/a>.<br>&#8211; <strong>Understand and align<\/strong> AI\u2019s <em>emergent abilities<\/em> with human values<a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=,emergent%20abilities\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a>.<br>&#8211; Develop <strong>mathematical descriptions<\/strong> of AI generalization and decision-making<a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=Goals%3A%20Our%20research%20aims%20to,Our%20current%20interests%20include\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a>.<br>&#8211; <strong>Integrate insights<\/strong> from AI and human cognition to aid education and mental health<a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=,emergent%20abilities\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a>.<br>&#8211; Address urgent needs for <strong>unbiased, trustworthy, and \u201cgreen\u201d AI<\/strong> systems<a 
href=\"https:\/\/news.harvard.edu\/gazette\/story\/newsplus\/gift-establishes-new-program-in-physics-of-intelligence-at-center-for-brain-science\/#:~:text=%E2%80%9CWe%20are%20thrilled%20to%20support,%E2%80%9D\" target=\"_blank\" rel=\"noreferrer noopener\">news.harvard.edu<\/a>.<\/td><td>&#8211; <strong>Theoretical physics tools<\/strong> applied to neural networks (symmetry analysis, Noether\u2019s theorem, Lagrangian dynamics) to derive invariant quantities and dynamics<a href=\"https:\/\/ai.stanford.edu\/blog\/neural-mechanics\/#:~:text=Strikingly%20similar%20to%20Noether%E2%80%99s%20theorem%2C,constant%20under%20gradient%20flow%20dynamics\" target=\"_blank\" rel=\"noreferrer noopener\">ai.stanford.edu<\/a><a href=\"https:\/\/ai.stanford.edu\/blog\/neural-mechanics\/#:~:text=%5C%5B%5Cbegin,%7C%5Ctheta_%7B%5Cmathcal%7BA%7D_2%7D%280%29%7C%5E2%20%5Cend%7Baligned\" target=\"_blank\" rel=\"noreferrer noopener\">ai.stanford.edu<\/a>.<br>&#8211; <strong>Continuous dynamical modeling<\/strong> of learning (gradient flow ODEs, modified equations for SGD) to solve training behavior analytically<a href=\"https:\/\/ar5iv.labs.arxiv.org\/html\/2012.04728#:~:text=Symmetry%20leads%20to%20conservation,through%20training%20under%20gradient%20flow\" target=\"_blank\" rel=\"noreferrer noopener\">ar5iv.labs.arxiv.org<\/a><a href=\"https:\/\/ar5iv.labs.arxiv.org\/html\/2012.04728#:~:text=Application%20of%20this%20general%20theorem,following%20conservation%20law%20of%20learning\" target=\"_blank\" rel=\"noreferrer noopener\">ar5iv.labs.arxiv.org<\/a>.<br>&#8211; <strong>Controlled experimental setups<\/strong> (synthetic tasks, formal languages, toy models) to observe phase transitions, abrupt learning, and test hypotheses under clean conditions<a href=\"https:\/\/arxiv.org\/abs\/2408.12578#:~:text=cause%20of%20sudden%20performance%20growth,when%20changing%20the%20data%20structure\" target=\"_blank\" rel=\"noreferrer noopener\">arxiv.org<\/a><a 
href=\"https:\/\/arxiv.org\/abs\/2408.12578#:~:text=narrower%20tasks%20suddenly%20begins%20to,when%20changing%20the%20data%20structure\" target=\"_blank\" rel=\"noreferrer noopener\">arxiv.org<\/a>.<br>&#8211; <strong>Interdisciplinary data analysis<\/strong>, linking AI models to neuroscience\/psychology experiments (e.g. comparing network representations to brain data) to find common principles<a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=N.%20Maheswaranathan,Baccus\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a><a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=G,Wyart%20%282022\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a>.<br>&#8211; <strong>Algorithm design via theory:<\/strong> using derived principles to create new methods (e.g. SynFlow pruning conserving synaptic flow<a href=\"https:\/\/papers.nips.cc\/paper\/2020\/hash\/46a4378f835dc8040c8057beb6a2da52-Abstract.html#:~:text=ever%20training%2C%20or%20indeed%20without,at%20initialization%20subject%20to%20a\" target=\"_blank\" rel=\"noreferrer noopener\">papers.nips.cc<\/a>, bias removal via identified geometry).<\/td><td>&#8211; <strong>NTT Research \u2013 PHI Lab &amp; Physics of AI (PAI) Group:<\/strong> Industry research lab funding and driving the project; led by H. Tanaka (Group Head) with core team (M. Okawa, E.S. Lubana)<a href=\"https:\/\/ntt-research.com\/ntt-research-launches-new-physics-of-artificial-intelligence-group\/#:~:text=Lab%20,Physics%20of%20Artificial%20Intelligence%20Group\" target=\"_blank\" rel=\"noreferrer noopener\">ntt-research.com<\/a><a href=\"https:\/\/ntt-research.com\/ntt-research-launches-new-physics-of-artificial-intelligence-group\/#:~:text=Princeton%20University%20Assistant%20Professor%20,Previous%20contributions%20to%20date%20include\" target=\"_blank\" rel=\"noreferrer noopener\">ntt-research.com<\/a>. 
Provides computational resources and translation to applications (energy-efficient hardware, etc.).<br>&#8211; <strong>Harvard University \u2013 Center for Brain Science (CBS):<\/strong> Academic host of the CBS-NTT Physics of Intelligence Program<a href=\"https:\/\/news.harvard.edu\/gazette\/story\/newsplus\/gift-establishes-new-program-in-physics-of-intelligence-at-center-for-brain-science\/#:~:text=The%20new%20program%20will%20amplify,neuroscientists%20and%20psychologists%20at%20Harvard\" target=\"_blank\" rel=\"noreferrer noopener\">news.harvard.edu<\/a>. Director Venkatesh Murthy and others collaborate, bringing neuroscience and psychology expertise. Joint program funds postdocs and fosters cross-pollination on campus.<br>&#8211; <strong>MIT \u2013 IAIFI (Institute for AI and Fundamental Interactions):<\/strong> Collaboration via affiliate role and IAIFI postdocs<a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=co\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a><a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=2024%20,MIT%29%2C%20MA%2C%20USA\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a>. Connects to MIT physics\/AI researchers and resources; contributes theoretical physics perspectives (e.g. quantum analogies, fundamental math).<br>&#8211; <strong>University of Tokyo \u2013 Institute for Physics of Intelligence:<\/strong> International partnership with similar goals<a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=%28MIT%29%20iaifi\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a>. Facilitates exchange of researchers (e.g. Z. Liu, M. 
Ueda co-authors) and globalizes the research agenda.<br>&#8211; <strong>Stanford University (Ganguli Lab) &amp; Princeton University (Reddy):<\/strong> Key academic collaborators<a href=\"https:\/\/ntt-research.com\/ntt-research-launches-new-physics-of-artificial-intelligence-group\/#:~:text=The%20new%20group%20will%20continue,Previous%20contributions%20to%20date%20include\" target=\"_blank\" rel=\"noreferrer noopener\">ntt-research.com<\/a>. Ganguli\u2019s lab co-develops theory (symmetry, dynamics), Reddy and others link to biological physics and behavior. Also includes students from Michigan, USC, Yale, etc. working within Tanaka\u2019s group, reflecting a broad academic network.<\/td><td>&#8211; <strong>Scientific breakthroughs<\/strong> in understanding AI: e.g. identification of conserved quantities in deep learning<a href=\"https:\/\/ai.stanford.edu\/blog\/neural-mechanics\/#:~:text=%5C%5B%5Cbegin,%7C%5Ctheta_%7B%5Cmathcal%7BA%7D_2%7D%280%29%7C%5E2%20%5Cend%7Baligned\" target=\"_blank\" rel=\"noreferrer noopener\">ai.stanford.edu<\/a>, phase transition models of emergent skills<a href=\"https:\/\/arxiv.org\/abs\/2408.12578#:~:text=Specifically%2C%20we%20show%20that%20once,predicting%20emergence%20in%20neural%20networks\" target=\"_blank\" rel=\"noreferrer noopener\">arxiv.org<\/a>, and general theoretical frameworks that unify architecture and optimization<a href=\"https:\/\/openreview.net\/forum?id=fiPtD7iXuhn#:~:text=energy%20explicitly%20breaks%20the%20symmetry,learning%20dynamics%20of%20neural%20networks\" target=\"_blank\" rel=\"noreferrer noopener\">openreview.net<\/a>.<br>&#8211; <strong>Practical algorithms and tools:<\/strong> data-agnostic network pruning (SynFlow) for efficient models<a href=\"https:\/\/papers.nips.cc\/paper\/2020\/hash\/46a4378f835dc8040c8057beb6a2da52-Abstract.html#:~:text=first%20mathematically%20formulate%20and%20experimentally,art\" target=\"_blank\" rel=\"noreferrer noopener\">papers.nips.cc<\/a>; methods to <strong>mitigate 
bias<\/strong> in language models (one recognized by NIST for its impact)<a href=\"https:\/\/ntt-research.com\/ntt-research-launches-new-physics-of-artificial-intelligence-group\/#:~:text=%2A%20A%20bias,of%20how%20AI%20learns%20concepts\" target=\"_blank\" rel=\"noreferrer noopener\">ntt-research.com<\/a>; techniques for mechanistic interpretability (diagnosing \u201calgorithmic phases\u201d within a model) that help debug and improve AI<a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=ICLR%20,Representations\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a>.<br>&#8211; <strong>Improved AI trust and safety:<\/strong> Ability to predict and control AI behavior, reducing \u201cblack box\u201d surprises. For example, embedding ethical constraints as symmetries or invariants in training (a speculative yet desired outcome) so that AI systems are <em>aligned by design<\/em> rather than retrofitted<a href=\"https:\/\/ntt-research.com\/ntt-research-launches-new-physics-of-artificial-intelligence-group\/#:~:text=Going%20forward%2C%20the%20Physics%20of,improved%20operations%20and%20data%20control\" target=\"_blank\" rel=\"noreferrer noopener\">ntt-research.com<\/a>. Greater transparency through physics-style explainability builds human trust in AI decisions<a href=\"https:\/\/ntt-research.com\/ntt-research-launches-new-physics-of-artificial-intelligence-group\/#:~:text=establishment%20of%20NTT%20Research%E2%80%99s%20Physics,%E2%80%9D\" target=\"_blank\" rel=\"noreferrer noopener\">ntt-research.com<\/a>.<br>&#8211; <strong>Energy-efficient AI systems:<\/strong> Leaner models via principled pruning and architecture insights; inspiration from brain efficiency to guide new hardware (optical\/neuromorphic) and algorithms. 
The <strong>\u201cgreen AI\u201d<\/strong> goal is AI that achieves more with less computational power<a href=\"https:\/\/news.harvard.edu\/gazette\/story\/newsplus\/gift-establishes-new-program-in-physics-of-intelligence-at-center-for-brain-science\/#:~:text=the%20Physics%20of%20Intelligence%2C%E2%80%9D%20NTT,%E2%80%9D\" target=\"_blank\" rel=\"noreferrer noopener\">news.harvard.edu<\/a><a href=\"https:\/\/ntt-research.com\/ntt-research-launches-new-physics-of-artificial-intelligence-group\/#:~:text=The%20Physics%20of%20Artificial%20Intelligence,new%20group%20will%20also%20explore\" target=\"_blank\" rel=\"noreferrer noopener\">ntt-research.com<\/a>, guided by scientific understanding of which computations are necessary.<br>&#8211; <strong>Cross-disciplinary knowledge:<\/strong> A unifying theory of intelligence that informs neuroscience (e.g. explaining neural coding with deep learning models<a href=\"https:\/\/sites.google.com\/view\/htanaka\/home#:~:text=N.%20Maheswaranathan,Baccus\" target=\"_blank\" rel=\"noreferrer noopener\">sites.google.com<\/a>) and vice versa. Training a new generation of researchers fluent in both physics and AI, capable of further breakthroughs. Ultimately, a <strong>conceptual bridge between human and machine intelligence<\/strong> that could transform technology and cognitive science.<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">In the table above, each element underscores how the Physics of Intelligence project is <strong>not just about theory for theory\u2019s sake<\/strong>, but tightly interweaves its scientific pursuits with real-world outcomes. 
The goals drive the choice of methods (e.g., to achieve trustworthy AI, they examine symmetries related to fairness); the collaborations provide the breadth of expertise needed; and the outcomes are both measurable (algorithms, papers) and aspirational (a future of more human-compatible AI).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>In conclusion,<\/strong> Dr. Hidenori Tanaka\u2019s Physics of Intelligence project represents a bold and comprehensive effort to <em>scientifically decode<\/em> the nature of intelligence \u2013 in machines and organisms \u2013 using the language of physics. In the past few years, it has made substantial strides: formulating exact analogies between neural network training and physical laws<a href=\"https:\/\/ai.stanford.edu\/blog\/neural-mechanics\/#:~:text=Strikingly%20similar%20to%20Noether%E2%80%99s%20theorem%2C,constant%20under%20gradient%20flow%20dynamics\" target=\"_blank\" rel=\"noreferrer noopener\">ai.stanford.edu<\/a><a href=\"https:\/\/ai.stanford.edu\/blog\/neural-mechanics\/#:~:text=%5C%5B%5Cbegin,%7C%5Ctheta_%7B%5Cmathcal%7BA%7D_2%7D%280%29%7C%5E2%20%5Cend%7Baligned\" target=\"_blank\" rel=\"noreferrer noopener\">ai.stanford.edu<\/a>, revealing why and when AI systems undergo phase changes in capability<a href=\"https:\/\/arxiv.org\/abs\/2408.12578#:~:text=Specifically%2C%20we%20show%20that%20once,predicting%20emergence%20in%20neural%20networks\" target=\"_blank\" rel=\"noreferrer noopener\">arxiv.org<\/a>, and delivering tools that improve AI\u2019s efficiency and fairness (SynFlow, bias mitigation)<a href=\"https:\/\/papers.nips.cc\/paper\/2020\/hash\/46a4378f835dc8040c8057beb6a2da52-Abstract.html#:~:text=first%20mathematically%20formulate%20and%20experimentally,art\" target=\"_blank\" rel=\"noreferrer noopener\">papers.nips.cc<\/a><a href=\"https:\/\/ntt-research.com\/ntt-research-launches-new-physics-of-artificial-intelligence-group\/#:~:text=%2A%20A%20bias,of%20how%20AI%20learns%20concepts\" 
target=\"_blank\" rel=\"noreferrer noopener\">ntt-research.com<\/a>. It has also built an ecosystem of collaboration spanning industry and academia, which is accelerating progress in this nascent field<a href=\"https:\/\/news.harvard.edu\/gazette\/story\/newsplus\/gift-establishes-new-program-in-physics-of-intelligence-at-center-for-brain-science\/#:~:text=The%20new%20program%20will%20amplify,neuroscientists%20and%20psychologists%20at%20Harvard\" target=\"_blank\" rel=\"noreferrer noopener\">news.harvard.edu<\/a><a href=\"https:\/\/ntt-research.com\/ntt-research-launches-new-physics-of-artificial-intelligence-group\/#:~:text=SUNNYVALE%2C%20Calif,Physics%20of%20Artificial%20Intelligence%20Group\" target=\"_blank\" rel=\"noreferrer noopener\">ntt-research.com<\/a>. Many of the ideas are still <em>taking shape<\/em> \u2013 the quest to fully explain \u201cintelligent behavior in equations\u201d continues, and some implications remain speculative. However, the distinctive approach of treating AI as a natural phenomenon is already yielding <em>fresh insights that neither standard deep learning research nor traditional neuroscience alone could achieve<\/em>. By clearly distinguishing what is known (e.g., specific conservation laws, empirically observed emergent thresholds) versus what is conjectured (e.g., eventual unification of AI and cognitive science under physics), the project maintains scientific rigor while pushing the boundaries of our understanding.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Moving forward, the Physics of Intelligence initiative is poised to influence not only how we build AI, but also how we <strong>conceptualize intelligence itself<\/strong> \u2013 potentially leading to AI that is more interpretable, more aligned with human values, and more energy efficient, because it would be designed on a foundation of <em>scientific principles<\/em> rather than trial-and-error. 
In a time when AI is becoming ever more powerful (and sometimes unpredictable), this convergence of physics and AI offers a path toward <strong>\u201cupgrading reality\u201d with AI that we can truly trust and harmonize with<a href=\"https:\/\/www.unite.ai\/ntt-research-launches-new-physics-of-artificial-intelligence-group-at-harvard\/#:~:text=The%20newly,for%20the%20past%20five%20years\" target=\"_blank\" rel=\"noreferrer noopener\">unite.ai<\/a><a href=\"https:\/\/ntt-research.com\/ntt-research-launches-new-physics-of-artificial-intelligence-group\/#:~:text=establishment%20of%20NTT%20Research%E2%80%99s%20Physics,%E2%80%9D\" target=\"_blank\" rel=\"noreferrer noopener\">ntt-research.com<\/a>.<\/strong> The coming years will reveal how far this physics-of-AI paradigm can go, but its early successes already suggest that understanding intelligence through physics is not only possible but deeply rewarding \u2013 yielding insights that are as intellectually fascinating as they are practically important.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Dr. 
Hidenori Tanaka\u2019s \u201cPhysics of Intelligence\u201d project (also called the Physics of Artificial Intelligence) is an ambitious research initiative aiming to apply concepts from physics \u2013 such as symmetry, conservation laws, and phase transitions \u2013 to the study of intelligence&hellip;<\/p>\n","protected":false},"author":4,"featured_media":1619,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[23,68],"tags":[],"class_list":["post-1617","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-academic","category-creativity"],"_links":{"self":[{"href":"https:\/\/www.aicritique.org\/us\/wp-json\/wp\/v2\/posts\/1617","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.aicritique.org\/us\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aicritique.org\/us\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aicritique.org\/us\/wp-json\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aicritique.org\/us\/wp-json\/wp\/v2\/comments?post=1617"}],"version-history":[{"count":1,"href":"https:\/\/www.aicritique.org\/us\/wp-json\/wp\/v2\/posts\/1617\/revisions"}],"predecessor-version":[{"id":1620,"href":"https:\/\/www.aicritique.org\/us\/wp-json\/wp\/v2\/posts\/1617\/revisions\/1620"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.aicritique.org\/us\/wp-json\/wp\/v2\/media\/1619"}],"wp:attachment":[{"href":"https:\/\/www.aicritique.org\/us\/wp-json\/wp\/v2\/media?parent=1617"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aicritique.org\/us\/wp-json\/wp\/v2\/categories?post=1617"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aicritique.org\/us\/wp-json\/wp\/v2\/tags?post=1617"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}