Renato Candido

https://github.com/Mega4alik/ollm is a lightweight Python library for large-context LLM inference. It enables running models such as gpt-oss-20B, qwen3-next-80B, or Llama-3.1-8B-Instruct with a 100k-token context on a ~$200 consumer GPU with 8 GB of VRAM.