Mamba and State Space Models, Simply Explained
AI, But Simple Issue #58

The transformer has become the dominant architecture for text processing and NLP in recent years. It is behind some of the most widely used LLMs, like ChatGPT, Claude, and Gemini.
While the transformer brought on a huge wave of development and growth, researchers have started developing new architectures that might outperform it.
Don’t understand transformers? Learn about transformers or brush up on your knowledge in our previous issue, “Transformers and the Attention Mechanism.”
One of these new architectures is Mamba, and it is what we call a “state space” model (SSM).

Mamba was introduced by researchers at Carnegie Mellon University and Princeton University in their 2023 paper, “Mamba: Linear-Time Sequence Modeling with Selective State Spaces.”
State space models have been around for quite a while (they originated in control theory in the 1960s), so why were they brought into NLP? The reason is that state space models like Mamba solve key problems with both transformers and RNNs.
Essentially, transformers have attention mechanisms that model long-range dependencies well, but their computation and memory costs scale quadratically with sequence length, making long sequences too computationally expensive.
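To see where that quadratic cost comes from, here is a minimal NumPy sketch of the attention score computation (toy dimensions, not any particular model’s). The L × L score matrix is exactly what grows quadratically with sequence length:

```python
import numpy as np

L, d = 1024, 64                 # sequence length and head dimension (toy values)
Q = np.random.randn(L, d)       # one query vector per token
K = np.random.randn(L, d)       # one key vector per token

scores = Q @ K.T / np.sqrt(d)   # (L, L): every token attends to every other token
print(scores.shape)             # (1024, 1024), so memory and compute grow as O(L^2)
```

Double the sequence length and this matrix quadruples in size, which is why transformers struggle with very long contexts.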
RNNs scale better, processing sequences in linear time, but they compress everything they have seen into a small, fixed-size hidden state, which limits their comprehension of long sequences.
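For contrast, here is an equally minimal sketch of a vanilla RNN pass (untrained weights, toy dimensions). Compute grows only linearly with sequence length, but the hidden state never grows, which is exactly why long-range information gets squeezed out:

```python
import numpy as np

L, d, h_dim = 1024, 64, 128                  # sequence length, input dim, hidden dim (toy values)
x = np.random.randn(L, d)                    # input sequence, one vector per token
W_h = np.random.randn(h_dim, h_dim) * 0.01   # recurrent weights (illustrative, untrained)
W_x = np.random.randn(h_dim, d) * 0.01       # input weights (illustrative, untrained)

h = np.zeros(h_dim)                          # fixed-size hidden state: memory is O(1) in L
for t in range(L):                           # one pass over the tokens: compute is O(L)
    h = np.tanh(W_h @ h + W_x @ x[t])        # the entire history is compressed into h
```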
Due to these constraints, there was a need for an architecture that is both memory-efficient and able to manage long-range dependencies well.
State space models for NLP were the researchers’ answer to this problem, and Mamba is one of the standout SSMs, having shown fantastic performance.
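To preview the idea, a basic (linear, time-invariant) state space model can be run as a recurrence much like an RNN: a fixed-size state updated once per token, giving linear-time processing. Below is a minimal sketch of that discretized recurrence with made-up matrices; it is not Mamba’s actual selective mechanism, just the plain SSM skeleton it builds on:

```python
import numpy as np

L, d_state = 1024, 16               # sequence length and state size (toy values)
A = 0.9 * np.eye(d_state)           # state transition matrix (placeholder values)
B = 0.1 * np.random.randn(d_state)  # input projection (placeholder values)
C = np.random.randn(d_state)        # output projection (placeholder values)

x = np.random.randn(L)              # a one-dimensional input sequence
h = np.zeros(d_state)               # fixed-size state, just like an RNN
y = np.empty(L)
for t in range(L):
    h = A @ h + B * x[t]            # h_t = A h_{t-1} + B x_t
    y[t] = C @ h                    # y_t = C h_t
```

The state h stays the same size no matter how long the sequence is, so an SSM gets RNN-like efficiency; the rest of the design then works on making that fixed-size state remember the right things.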