I am using aichat and a local LLM inside Ghostty to replace Warp. Why?
- Warp’s app icon is ugly, and Ghostty’s app icon is beautiful.
- I'm challenging myself to work with a local LLM instead of the free tier of a paid LLM product, which means wiring the local LLM into Ghostty via aichat.
At the time of writing, the local LLM is qwen3-4b-2507, served by LM Studio; to me, it's the best model of its size. I used Grok (x.ai) to help install Ghostty, aichat (via rustup + cargo), and LM Studio. I recommend the model config below.
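To connect aichat to the model LM Studio serves, I point aichat at LM Studio's OpenAI-compatible endpoint. A minimal sketch of the aichat config, assuming LM Studio's local server is running on its default port (1234) and the model identifier matches what LM Studio reports:

```yaml
# ~/.config/aichat/config.yaml — a sketch, not a definitive config;
# the client name and model id below are assumptions, adjust to your setup
model: lmstudio:qwen3-4b-2507
clients:
  - type: openai-compatible
    name: lmstudio
    api_base: http://localhost:1234/v1   # LM Studio's local server endpoint
    models:
      - name: qwen3-4b-2507              # must match the id LM Studio serves
```

With that in place, `aichat "hello"` in Ghostty should hit the local model instead of a hosted API.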
6-bit quantization is recommended by an Apple staff member working on their AI efforts:
> Some data to help decide on what the right precision is for Qwen3 4B (Instruct 2507).
>
> I ran the full MMLU Pro eval, plus some efficiency benchmarks with the model at every precision from 4-bit to bf16.
>
> TLDR 6-bit is a very decent option at < 1% gap in quality to the full… pic.twitter.com/YQsOL9ZPw1
>
> — Awni Hannun (@awnihannun) October 11, 2025