Energy-Based Transformers (EBTs), Simply Explained

AI, But Simple Issue #77

 

Hello from the AI, But Simple team! If you enjoy our content, consider supporting us so we can keep doing what we do.

Our newsletter is no longer sustainable to run at no cost, so we’re relying on different measures to cover operational expenses. Thanks again for reading!


Recent advances in deep learning have brought us models that can generate text, write code, and describe images with remarkable accuracy and fluency.

Yet these models, as powerful as they are, remain mostly System 1 thinkers: fast, pattern-recognizing models that excel at recall and imitation. They still struggle with slow, deliberate reasoning, known in cognitive science as System 2 thinking.

The paper Energy-Based Transformers are Scalable Learners and Thinkers by Gladstone et al. introduces a new approach to this problem by merging two previously separate ideas: Energy-Based Models (EBMs) and Transformers.
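To make that pairing concrete, here is a minimal sketch of the energy-based half of the idea: a network assigns a scalar energy to a (context, candidate prediction) pair, and inference refines the candidate by gradient descent on that energy, so extra "thinking" is simply extra optimization steps. All names, shapes, and hyperparameters below are illustrative assumptions, not the paper's implementation, and the toy energy network stands in for the Transformer used in actual EBTs.

```python
import torch
import torch.nn as nn

class ToyEnergyModel(nn.Module):
    """Maps a (context, candidate prediction) pair to a scalar energy.

    Lower energy means the candidate is more compatible with the context.
    In an EBT, a Transformer would play this role; here a tiny MLP stands in.
    """
    def __init__(self, dim: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * dim, 64),
            nn.GELU(),
            nn.Linear(64, 1),
        )

    def forward(self, context: torch.Tensor, candidate: torch.Tensor) -> torch.Tensor:
        # One scalar energy per (context, candidate) pair.
        return self.net(torch.cat([context, candidate], dim=-1)).squeeze(-1)

def think(model: nn.Module, context: torch.Tensor,
          steps: int = 20, lr: float = 0.1) -> torch.Tensor:
    """'Thinking' as optimization: start from a random candidate and
    descend the energy landscape, rather than emitting a single
    forward-pass guess."""
    candidate = torch.randn(context.shape, requires_grad=True)
    for _ in range(steps):
        energy = model(context, candidate).sum()
        (grad,) = torch.autograd.grad(energy, candidate)
        with torch.no_grad():
            candidate -= lr * grad  # more steps = more deliberation
    return candidate.detach()

model = ToyEnergyModel()
context = torch.randn(1, 32)
prediction = think(model, context)
print(prediction.shape)  # torch.Size([1, 32])
```

The design choice this illustrates is that inference-time compute becomes a dial: running more gradient steps buys more deliberation on a hard problem, which is the paper's route to System 2-style thinking.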
