# Autoregressive MicroSim

<iframe src="https://dmccreary.github.io/tracking-ai-course/sims/autoregressive/main.html"  height="450px" scrolling="no"
  style="overflow: hidden;"></iframe>

[Run the Autoregressive MicroSim](https://dmccreary.github.io/tracking-ai-course/sims/autoregressive/main.html)

## Prompt

Please create a new MicroSim that simulates the prediction of the next token from a sequence of words using a neural network. The simulation works in phases, with five steps per phase.

### Layout of the neural network graph

1. The animation has five horizontal rows, each with 20 columns of circles.
2. The label for each row is on the left.
3. The `leftMargin` is 150 for drawing the labels.
4. The labels in the left column are:
    1. "Output" at y = 100
    2. "Hidden Layer" at y = 200, 300, and 400
    3. "Input" at y = 500
5. The "Output" layer has light orange filled circles with r = 8.
6. The "Hidden Layer" rows have gray filled circles with r = 8.
7. The "Input" layer has light blue filled circles with r = 8.
8. All circles have a thin 1px black border.
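
The layout above can be sketched as a table of circle centers. This is a minimal Python sketch, not the MicroSim's actual code: the horizontal spacing `COL_SPACING`, the placement of the first column, and the helper name `circle_centers` are assumptions, since the prompt does not specify them.

```python
# Sketch of the grid layout described in the prompt.
# Values marked "assumed" are illustrative and not given in the prompt.
LEFT_MARGIN = 150      # from the prompt: space reserved for row labels
NUM_COLUMNS = 20       # from the prompt
RADIUS = 8             # from the prompt
COL_SPACING = 30       # assumed: horizontal distance between circle centers

# Row labels, y positions, and fill colors, top to bottom (from the prompt).
ROWS = [
    ("Output", 100, "light orange"),
    ("Hidden Layer", 200, "gray"),
    ("Hidden Layer", 300, "gray"),
    ("Hidden Layer", 400, "gray"),
    ("Input", 500, "light blue"),
]

def circle_centers():
    """Return {(row_index, column): (x, y)} using 1-based column numbers."""
    centers = {}
    for row_index, (_label, y, _fill) in enumerate(ROWS):
        for col in range(1, NUM_COLUMNS + 1):
            # Assumed: column 1 sits one spacing to the right of the label margin.
            x = LEFT_MARGIN + col * COL_SPACING
            centers[(row_index, col)] = (x, y)
    return centers

centers = circle_centers()
print(len(centers))        # 5 rows x 20 columns = 100 circles
print(centers[(0, 17)])    # center of column 17 in the output row
```

With these assumed values, row index 0 is the output row and row index 4 is the input row, so an arrow between layers is just a line between two entries of `centers`.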

Only 16 of the 20 columns are active at any time, showing the context window of 16 tokens in the input row.
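
One way to express the active columns is as a function of the phase number. The sketch below assumes 0-based phases and 1-based columns, and uses the one-column shift per phase that the prompt describes further down; the function name `active_columns` is hypothetical.

```python
NUM_COLUMNS = 20       # from the prompt
CONTEXT_WINDOW = 16    # from the prompt

def active_columns(phase):
    """Return the 1-based input columns that are active in a 0-based phase."""
    first = 1 + phase
    last = first + CONTEXT_WINDOW - 1
    # The generated token lands in column last + 1, which must exist on screen.
    if phase < 0 or last + 1 > NUM_COLUMNS:
        raise ValueError(f"phase {phase} does not fit in {NUM_COLUMNS} columns")
    return list(range(first, last + 1))

print(active_columns(0)[0], active_columns(0)[-1])   # first and last active column
```

Under these assumptions, 20 columns leave room for four phases (0 through 3) before the window reaches the right edge.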

Step 1: Draw 16 arrows from the left-most 16 circles of the bottom input layer up to the lower hidden layer. The arrows merge into 8 alternating circles on the lower hidden layer. Keep the arrows visible for the rest of the phase.

Step 2: Draw 8 arrows from the lower hidden layer to the middle hidden layer. Draw them to alternate nodes so the left-most nodes shift one to the right.

Step 3: Draw 4 arrows from the middle hidden layer to the top hidden layer.

Step 4: Draw 2 arrows from the top hidden layer to the node in column 17 on the top output row.

Step 5: Animate the circle just generated in the 17th column of the output row moving down to the 17th circle on the bottom row. Then erase all the arrows on the screen.
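
The arrow counts in Steps 1 to 4 (16, 8, 4, 2) can be reproduced with a simple pairing rule. The sketch below is one possible reading of "alternating" nodes, not necessarily what the MicroSim draws: it assumes that each layer pairs adjacent active nodes and keeps the right-hand node of each pair. The function name `build_arrows` is hypothetical.

```python
def build_arrows(first_col=1, window=16):
    """Return four lists of (from_col, to_col) arrows, one list per step.

    Assumed rule: adjacent active nodes are paired, and both nodes in a pair
    send an arrow to the right-hand node's column in the layer above.
    """
    last_col = first_col + window - 1
    active = list(range(first_col, last_col + 1))   # 16 active input columns
    steps = []

    # Steps 1-3: each hidden layer halves the number of active nodes.
    for _ in range(3):
        arrows = []
        survivors = []
        for i in range(0, len(active), 2):
            left, right = active[i], active[i + 1]
            arrows.append((left, right))    # arrow from the left node of the pair
            arrows.append((right, right))   # arrow straight up from the right node
            survivors.append(right)
        steps.append(arrows)
        active = survivors

    # Step 4: the two remaining nodes point at the next column of the output row.
    target = last_col + 1
    steps.append([(col, target) for col in active])
    return steps

steps = build_arrows()
print([len(arrows) for arrows in steps])   # arrows drawn in Steps 1 to 4
print(steps[3])                            # the two arrows into the output node
```

Calling `build_arrows(first_col=2)` gives the same pattern shifted one column to the right, with the output landing in column 18.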

Repeat this animation three times, shifting one column to the right each time.
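
The full timeline can be sketched as a generator of (phase, step) entries. This sketch assumes that "repeat three times" means one initial pass plus three repeats, for four phases in total, which happens to place the last generated token in column 20. The function name `schedule` is hypothetical.

```python
def schedule(num_phases=4, steps_per_phase=5, window=16):
    """Yield (phase, step, first_col, target_col) for every animation step.

    Assumed: phases are 0-based, steps and columns are 1-based, and the
    context window shifts one column to the right per phase.
    """
    for phase in range(num_phases):
        first_col = 1 + phase              # left edge of the context window
        target_col = first_col + window    # column that receives the new token
        for step in range(1, steps_per_phase + 1):
            yield (phase, step, first_col, target_col)

timeline = list(schedule())
print(len(timeline))    # 4 phases x 5 steps
print(timeline[0])      # first step of the first phase
print(timeline[-1])     # last step of the last phase
```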

Add buttons for Start/Stop and Reset in the control area at the bottom of the animation.
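
The button behavior can be modeled independently of any drawing code. The following is a minimal sketch of the state a Start/Stop and Reset pair might control; the class name `AnimationController`, its methods, and the default of four phases are assumptions rather than the MicroSim's actual implementation.

```python
class AnimationController:
    """Tracks play state and position for a phase/step animation."""

    def __init__(self, num_phases=4, steps_per_phase=5):
        self.num_phases = num_phases
        self.steps_per_phase = steps_per_phase
        self.reset()

    def reset(self):
        """Reset button: stop and return to phase 0, step 1."""
        self.running = False
        self.finished = False
        self.phase = 0
        self.step = 1

    def toggle(self):
        """Start/Stop button: flip the running flag unless the run is over."""
        if not self.finished:
            self.running = not self.running

    def tick(self):
        """Advance one step if running; return the current (phase, step)."""
        if self.running:
            if self.step < self.steps_per_phase:
                self.step += 1
            elif self.phase < self.num_phases - 1:
                self.phase += 1
                self.step = 1
            else:
                # Last step of the last phase: stop automatically.
                self.running = False
                self.finished = True
        return (self.phase, self.step)

controller = AnimationController()
controller.toggle()                 # Start
for _ in range(5):
    position = controller.tick()
print(position)                     # (1, 1): first step of the second phase
controller.reset()                  # Reset
print(controller.tick())            # (0, 1): stopped, so tick() does not advance
```

In a drawing loop, `tick()` would be called on a timer and the returned position used to decide which arrows to render.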

## References

This MicroSim was inspired by the animated GIF in the following article:

Tony Jesuthasan, "Autoregressive (AR) Language Modeling," Medium.com, Jul 31, 2021.

## Self-Assessment Quiz

Test your understanding of autoregressive language models.

Question 1: What does "autoregressive" mean in the context of language models?

A) The model automatically registers new users
B) The model predicts each token based on the previous tokens it has generated
C) The model only processes text once
D) The model corrects its own grammar

Answer

B) The model predicts each token based on the previous tokens it has generated - Autoregressive models generate text one token at a time, using the previously generated tokens as context for predicting the next token.
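
For readers who prefer notation, this idea is commonly written as a chain-rule factorization of the probability of a sequence. The symbols are introduced here for illustration and do not appear in the animation: $x_t$ is the token at position $t$, $T$ is the sequence length, and $k$ is the context window size (16 in this MicroSim).

```latex
p(x_1, \dots, x_T) = \prod_{t=1}^{T} p(x_t \mid x_1, \dots, x_{t-1})
\qquad\text{and, with a context window of } k \text{ tokens,}\qquad
p(x_t \mid x_1, \dots, x_{t-1}) \approx p(x_t \mid x_{t-k}, \dots, x_{t-1})
```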

Question 2: What does the "context window" represent in the animation?

A) A graphical user interface window
B) The set of previous tokens the model considers when predicting the next token
C) The screen size of the display
D) The time limit for generating text

Answer

B) The set of previous tokens the model considers when predicting the next token - The context window (16 tokens in this animation) shows how many previous tokens influence the prediction of the next token.

Question 3: Why does the animation show arrows converging from many inputs to fewer hidden layer nodes?

A) It saves computer memory
B) Neural networks compress information as it flows through hidden layers
C) The animation is broken
D) Fewer nodes mean faster processing

Answer

B) Neural networks compress information as it flows through hidden layers - In this animation, information from the 16 input tokens is combined through successive hidden layers of 8, 4, and 2 active nodes to produce a single output prediction.

Question 4: What happens after a new token is generated in autoregressive generation?

A) The process stops
B) The new token is added to the context and the process repeats
C) All previous tokens are deleted
D) The model retrains itself

Answer

B) The new token is added to the context and the process repeats - The newly generated token becomes part of the input context for generating the next token, creating a sequential generation process.
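
The generate-and-append loop can be sketched in a few lines of Python. The lookup table `TOY_NEXT` and the function `predict_next` are hypothetical stand-ins for a trained neural network; only the loop structure and the 16-token context window reflect the animation.

```python
import random

CONTEXT_WINDOW = 16   # matches the context window in the animation

# Hypothetical stand-in for a trained model: candidate next tokens per token.
TOY_NEXT = {
    "the": ["cat", "dog"],
    "cat": ["sat"],
    "dog": ["sat"],
    "sat": ["on"],
    "on": ["the"],
}

def predict_next(context, rng):
    """Pick a next token using only the last token of the context (toy rule)."""
    last = context[-1] if context else None
    candidates = TOY_NEXT.get(last, ["the"])
    return rng.choice(candidates)

def generate(prompt, num_new_tokens, seed=0):
    """Autoregressive loop: predict, append, and repeat."""
    rng = random.Random(seed)
    tokens = list(prompt)
    for _ in range(num_new_tokens):
        context = tokens[-CONTEXT_WINDOW:]            # keep only the most recent tokens
        tokens.append(predict_next(context, rng))     # new token joins the context
    return tokens

print(generate(["the"], 5))
```

Each pass through the loop corresponds to one phase of the animation: a prediction is made from the current window, and the result becomes part of the input for the next prediction.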

Question 5: Which modern AI systems use autoregressive generation?

A) Only simple calculators
B) GPT, Claude, and other large language models (LLMs)
C) Only image recognition systems
D) Only speech-to-text systems

Answer

B) GPT, Claude, and other large language models (LLMs) - Modern LLMs like GPT, Claude, Gemini, and others use autoregressive generation to produce text one token at a time.