Tool Usage
Use MicroGPT training and generation as tools inside any Strands agent.
Available Tools
microgpt_train
Train a model from scratch:
from strands import Agent
from strands_microgpt import microgpt_train
agent = Agent(tools=[microgpt_train])
agent("Train a MicroGPT on the names dataset for 500 steps")
Parameters:
| Param | Default | Description |
|---|---|---|
| `dataset_url` | names.txt | Training data URL |
| `dataset_path` | | Local file override |
| `num_steps` | 1000 | Training steps |
| `n_layer` | 1 | Transformer layers |
| `n_embd` | 16 | Embedding dim |
| `block_size` | 16 | Context window |
| `n_head` | 4 | Attention heads |
| `learning_rate` | 0.01 | Initial LR |
| `checkpoint_path` | | Save path |
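Because the agent is driven by natural language, one way to set these parameters explicitly is to spell them out in the request. The sketch below is illustrative only: `build_train_prompt` and `TRAIN_DEFAULTS` are hypothetical names, not part of `strands_microgpt`, and the divisibility check reflects the usual multi-head attention convention rather than documented behavior of the tool.

```python
# Numeric defaults copied from the parameter table above.
TRAIN_DEFAULTS = {
    "num_steps": 1000,
    "n_layer": 1,
    "n_embd": 16,
    "block_size": 16,
    "n_head": 4,
    "learning_rate": 0.01,
}


def build_train_prompt(**overrides):
    """Hypothetical helper: turn explicit settings into a request string."""
    unknown = set(overrides) - set(TRAIN_DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")

    config = {**TRAIN_DEFAULTS, **overrides}
    # Assumption: the embedding is split evenly across attention heads.
    if config["n_embd"] % config["n_head"] != 0:
        raise ValueError("n_embd must be divisible by n_head")

    settings = ", ".join(f"{name}={value}" for name, value in overrides.items())
    if not settings:
        return "Train a MicroGPT with the default settings"
    return f"Train a MicroGPT with {settings}"


prompt = build_train_prompt(num_steps=500, n_embd=32)
print(prompt)
```

The resulting string can be passed to an agent created with `microgpt_train`, in the same way as the request in the example above.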
microgpt_generate
Generate from a trained checkpoint:
from strands import Agent
from strands_microgpt import microgpt_generate
agent = Agent(tools=[microgpt_generate])
agent("Generate 10 names with temperature 0.7")
Parameters:
| Param | Default | Description |
|---|---|---|
| `checkpoint_path` | /tmp/microgpt_checkpoint.json | Model path |
| `num_samples` | 20 | Samples to generate |
| `temperature` | 0.5 | Sampling temperature |
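For intuition about `temperature`, the following sketch shows the standard way a sampling temperature reshapes a probability distribution: logits are divided by the temperature before the softmax. It assumes `microgpt_generate` follows this common convention; the function name and the logits are made up for illustration.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Scale logits by 1/temperature, then normalize with a stable softmax."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max so exp() cannot overflow
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    return [w / total for w in weights]


logits = [2.0, 1.0, 0.0]  # made-up scores for three candidate tokens
for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the highest-scoring token, which tends to give more conservative samples; higher temperatures flatten the distribution and give more varied ones.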
Both Tools Together
from strands import Agent
from strands_microgpt import microgpt_train, microgpt_generate
agent = Agent(tools=[microgpt_train, microgpt_generate])
# The agent can train then generate in one conversation
agent("Train a small GPT on names, then generate 10 creative names")
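Generation depends on a checkpoint written by an earlier training run, so it can help to check for one before asking the agent to generate on its own. The sketch below is a hypothetical pre-flight check, not part of `strands_microgpt`: `checkpoint_ready` is an illustrative name, the path is the `microgpt_generate` default from the table above, and the JSON check is an assumption based only on the `.json` extension.

```python
import json
from pathlib import Path

# Default taken from the microgpt_generate parameter table.
DEFAULT_CHECKPOINT = Path("/tmp/microgpt_checkpoint.json")


def checkpoint_ready(path=DEFAULT_CHECKPOINT):
    """Hypothetical helper: True if path is a non-empty, parseable JSON file."""
    path = Path(path)
    if not path.is_file() or path.stat().st_size == 0:
        return False
    try:
        json.loads(path.read_text())
    except (json.JSONDecodeError, UnicodeDecodeError):
        return False
    return True


if checkpoint_ready():
    print("Checkpoint found; generation can run on its own.")
else:
    print("No usable checkpoint; ask the agent to train first.")
```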
→ Next: Architecture | API Reference