Chain-of-Thought Chat Demo
79.7M params · 12 attention layers · Qwen2.5 tokenizer · Apache-2.0
Pretrained from scratch on wop/XXXXXL-chain-of-thought · Model card: wop/Cosmos-T-80M
⚠️ Research / demo model. It was trained on only 840 conversations, so it is
heavily overfit and will hallucinate confidently outside its training
distribution. Treat it as a stylish parrot, not a fact source.
| Control | Range |
|---|---|
| System prompt | (text) |
| Temperature | 0 to 2 |
| Top-K | 1 to 200 |
| Context window | 64 to 1028 |
| Max new tokens | 16 to 1028 |
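**Tips:** Keep `temp = 0.1` and `top_k = 1` for the most coherent output. For more creative (but messier) replies, raise `temp` to 0.8 or higher; since `top_k = 1` normally leaves only the single most likely token to choose from, raise `top_k` as well. Clear the chat if responses start looping.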
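To make the effect of the two sampling knobs concrete, here is a minimal sketch of temperature plus top-k sampling in plain Python. It is not the demo's actual code: the function name `sample_next_token` and the toy logits are hypothetical, and it assumes the common convention that top-k filtering is applied before a temperature-scaled softmax.

```python
import math
import random


def sample_next_token(logits, temperature=0.1, top_k=1, rng=random):
    """Pick a token index from raw logits (hypothetical helper, not the demo's code)."""
    if top_k < 1:
        raise ValueError("top_k must be at least 1")
    # Rank token indices by score and keep only the top_k candidates.
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    kept = ranked[:top_k]
    # Greedy case: a single candidate, or temperature 0 (avoids dividing by zero).
    if len(kept) == 1 or temperature <= 0:
        return kept[0]
    # Temperature-scaled softmax weights over the kept candidates.
    scaled = [logits[i] / temperature for i in kept]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(kept, weights=weights, k=1)[0]


toy_logits = [2.0, 1.0, 0.5, -1.0]  # made-up scores for a 4-token vocabulary
print(sample_next_token(toy_logits, temperature=0.1, top_k=1))  # always 0 (greedy)
print(sample_next_token(toy_logits, temperature=0.8, top_k=3))  # one of 0, 1, 2
```

Under these assumptions, `top_k = 1` always returns the highest-scoring token regardless of temperature, which is why that setting gives the most repeatable output. Higher temperatures flatten the weights among the kept candidates, so lower-ranked tokens are picked more often.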