llm-box-uk

main DESKTOP-SSAQG5J cc 2.1.167 cli

/home/megaproxy/claude/research/llm-box-uk

126 events

38 tool calls

2.2M tokens

$2.77 notional cost

6/6/2026, 11:14:26 PM → 24d ago

in 7.5K out 41.9K cache-write 102.3K cache-read 2.1M

Transcript

126 turns · chronological (UTC timestamps shown local)

attachment 6/6/2026, 11:14:26 PM
attachment 6/6/2026, 11:14:26 PM
user 6/6/2026, 11:14:26 PM
Is Intel Arc good for AI?
assistant Opus 4.8 6/6/2026, 11:14:37 PM

assistant Opus 4.8 27.6K tok $0.10 6/6/2026, 11:14:48 PM

Short answer: **fine for inference, bad for the fine-tuning half of your brief.**

For this project specifically (inference + light LoRA/QLoRA), Intel Arc is a half-fit:

**Where Arc works — inference**
- **Arc A770 16GB** is the interesting card: ~£280–330 used/new in the UK, 16GB GDDR6, ~560 GB/s bandwidth. That's a lot of VRAM per pound — cheaper than a 3090 on a £/GB basis.
- Software is actually usable now: **IPEX-LLM** (Intel's PyTorch LLM extension) and **llama.cpp** with the SYCL or Vulkan backend both run quantised models on Arc reasonably well. Ollama works via the IPEX-LLM fork.
- Battlemage **B580 12GB** (~£250) is the newer gen but less VRAM, so less relevant for "biggest model per pound."

**Where Arc falls down — fine-tuning**
- The LoRA/QLoRA ecosystem is CUDA-locked. `bitsandbytes` (the backbone of 4-bit QLoRA) has no real Arc support; PEFT/Axolotl/Unsloth all assume CUDA. You *can* do training via oneAPI/IPEX, but it's a constant fight against the toolchain, and Unsloth — the thing that makes "light QLoRA on one card" pleasant — is CUDA-only.
- So Arc fails the second half of your "done" criteria. The whole reason the **used RTX 3090 24GB** is the safe pick is that it does both halves with zero ecosystem friction.

**Verdict for the catalogue:** Arc A770 16GB belongs in the same honest-but-caveated bucket as the P40/MI50 cheap routes — a viable *inference-only* budget play, explicitly NOT recommended if you actually want the fine-tuning capability the brief calls for. It's arguably a cleaner cheap route than MI50 (no ROCm-on-ancient-card pain) but worse than 3090 for anything training-shaped.

Want me to add an Arc A770 inference-only build to `index.html` as a sibling to the other cheap-card routes, with that caveat called out? I'd price it live first.

system 6/6/2026, 11:14:48 PM
attachment 6/6/2026, 11:15:57 PM

user 6/6/2026, 11:15:57 PM

If I don't really want to do training, would a 24GB VRAM Intel Arc be a good fit?

assistant Opus 4.8 6/6/2026, 11:16:17 PM

assistant Opus 4.8 6/6/2026, 11:16:19 PM

The card you're thinking of is the **Intel Arc Pro B60 24GB** (Battlemage, the "Project Battlematrix" workstation line) — the consumer A770 tops out at 16GB, so 24GB Arc means the Pro B60. Let me check current UK availability/pricing before I commit to a verdict, since that's the whole point of this catalogue.

user 6/6/2026, 11:16:20 PM
assistant Opus 4.8 29.1K tok $0.07 6/6/2026, 11:16:20 PM
- ToolSearch select:WebSearch,WebFetch 4ms 100 B
assistant Opus 4.8 6/6/2026, 11:16:22 PM
- WebSearch Intel Arc Pro B60 24GB price UK availability 2026 7.4s 2.8 KB
assistant Opus 4.8 29.9K tok $0.10 6/6/2026, 11:16:23 PM
- WebSearch Intel Arc Pro B60 24GB memory bandwidth llama.cpp vLLM inference benchmark 9.1s 2.5 KB
user 6/6/2026, 11:16:30 PM
user 6/6/2026, 11:16:32 PM
assistant Opus 4.8 6/6/2026, 11:17:00 PM

assistant Opus 4.8 34.8K tok $0.09 6/6/2026, 11:17:10 PM

Yes — drop the training requirement and the B60 goes from "half-fit" to **genuinely competitive**. Here's the honest read, now grounded in current pricing:

## Intel Arc Pro B60 24GB — what it actually is
- 24GB GDDR6, **456 GB/s** bandwidth, 192-bit bus, PCIe 5.0 **x8**, 160 XMX AI engines.
- Pricing: listed as low as **$599** in the US; Europe **€769–€1,025** depending on model (Sparkle/ASRock). That's roughly **£650–£900 inc VAT** landed in the UK — and crucially it's *new with warranty and UK stock* (Newegg UK, ASRock, Sparkle), not a used eBay gamble.
- It's explicitly marketed for ~20B-class local LLMs, and the software is real now: **vLLM has first-class Intel Arc support** (Nov 2025), llama.cpp runs via SYCL, plus IPEX-LLM/Ollama. For inference-only, the CUDA-lock problem that killed it for training no longer applies.

## The one tradeoff that matters: bandwidth
The used **RTX 3090 24GB** has **~936 GB/s** — roughly **2× the B60's 456 GB/s**. LLM token generation is memory-bandwidth-bound, so on a model that fills 24GB, expect the 3090 to push **roughly 2× the single-stream tokens/sec**. That's the headline cost of going Arc.
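
The "roughly 2×" figure follows from simple arithmetic. As a rough sketch, assuming a dense model whose weights are all read once per generated token, the single-stream ceiling is bandwidth divided by model size. The 20 GB model size below is a hypothetical value chosen to fit a 24GB card, and real throughput will land below these ceilings.

```python
def decode_ceiling_tok_s(bandwidth_gb_s: float, model_gb: float) -> float:
    """Upper bound on single-stream tokens/sec for a dense model.

    Each generated token streams every weight from VRAM once, so the
    ceiling is memory bandwidth divided by the size of the weights.
    """
    return bandwidth_gb_s / model_gb


MODEL_GB = 20.0  # hypothetical: a quantised model filling most of a 24GB card

rtx_3090 = decode_ceiling_tok_s(936.0, MODEL_GB)  # bandwidth quoted above
arc_b60 = decode_ceiling_tok_s(456.0, MODEL_GB)   # bandwidth quoted above

print(f"3090 ceiling: {rtx_3090:.1f} tok/s")        # 46.8
print(f"B60 ceiling:  {arc_b60:.1f} tok/s")         # 22.8
print(f"ratio:        {rtx_3090 / arc_b60:.2f}x")   # 2.05
```

The ratio does not depend on the model size chosen, which is why the bandwidth gap translates directly into the speed gap for single-stream generation.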

Where the B60 claws it back:
- **Power/heat:** ~120–200W TBP vs the 3090's 350W. For a 24/7 garage box at UK electricity prices, that's real money and far less cooling drama. Blower and passive variants exist — basically built for racking.
- **Batched serving:** the XMX engines scale well concurrently (~1000 tok/s aggregate, 50–64 concurrent requests in vLLM benches). If you're serving a few users/agents rather than chasing one fast stream, the gap narrows a lot.
- **New + warranty + in stock** vs used-3090 roulette.
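
To put a number on the power point, here is a minimal sketch of the annual electricity cost. The 25p/kWh tariff is a hypothetical placeholder, not a quoted UK rate, and the `duty` parameter (fraction of time at full load) is an illustration-only knob, since an inference box idles most of the day.

```python
HOURS_PER_YEAR = 24 * 365  # 8760


def annual_cost_gbp(watts: float, pence_per_kwh: float, duty: float = 1.0) -> float:
    """Yearly electricity cost for a device drawing `watts` for a fraction `duty` of the year."""
    kwh = watts / 1000 * HOURS_PER_YEAR * duty
    return kwh * pence_per_kwh / 100


TARIFF_P_PER_KWH = 25.0  # hypothetical tariff; substitute your own unit rate

for name, watts in [("RTX 3090 (350W)", 350), ("Arc Pro B60 (200W)", 200)]:
    full = annual_cost_gbp(watts, TARIFF_P_PER_KWH)
    light = annual_cost_gbp(watts, TARIFF_P_PER_KWH, duty=0.2)
    print(f"{name}: £{full:.2f}/yr at full load, £{light:.2f}/yr at 20% duty")
```

At full load the gap is 150W around the clock. At a realistic duty cycle the saving shrinks in proportion, and idle draw (not modelled here) starts to matter more than TBP.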

## Verdict for your case (inference, no training)
- **Want max single-user speed per pound** → used 3090 still wins on raw tok/s, similar price (~£600–700 used).
- **Want quiet-ish, low-power, new-with-warranty, and you serve batched/agent workloads or are fine with ~half the stream speed** → the B60 24GB is a legitimately good fit, and arguably the *better* fit for an always-on garage box you don't want to babysit.
- **Bonus path:** two B60s (or the single-card **B60 Dual 48GB**) gets you 48GB for roughly the price of one used 3090-and-a-bit — that's where Arc gets genuinely interesting, since 48GB lets you run ~70B quants the 3090 can't hold alone.

Want me to **live-price a UK B60 build** (single 24GB, and a dual-48GB variant) and add it to `index.html` as an inference-only sibling to the 3090 build? I'd pull current Newegg UK / ASRock stock prices and date-stamp it per the catalogue convention.

Sources:
- [Arc Pro B60 24GB Europe launch from €769 — VideoCardz](https://videocardz.com/newz/intel-arc-pro-b60-24gb-workstation-gpu-to-launch-in-europe-mid-to-late-november-starting-at-e769)
- [Arc Pro B60 listed at $599, in stock — VideoCardz](https://videocardz.com/newz/intel-arc-pro-b60-24gb-professional-gpu-listed-at-599-in-stock-and-shipping)
- [Intel Arc Pro B60 product specs — Intel](https://www.intel.com/content/www/us/en/products/sku/243916/intel-arc-pro-b60-graphics/specifications.html)
- [Benchmarking LLM inference on Arc Pro B60 (vLLM) — EmbeddedLLM](https://embeddedllm.com/blog/benchmarking-llm-inference-intel-arc-pro-b60)
- [Fast, affordable LLM serving on Arc Pro B-Series with vLLM — vLLM blog](https://vllm.ai/blog/2025-11-11-intel-arc-pro-b)
- [ASRock Arc Pro B60 24GB — Newegg UK](https://www.newegg.com/global/uk-en/asrock-intel-arc-pro-b60-24gb-graphics/p/N82E16814930148)

system 6/6/2026, 11:17:10 PM

user 6/6/2026, 11:18:20 PM

check websites like Scan or overclockers.co.uk and eBay to be sure of prices. The 3090 costs a lot more than you say, etc.

assistant Opus 4.8 6/6/2026, 11:18:25 PM

You're right to push on this — let me pull actual UK retailer and eBay numbers rather than trusting the MSRP-to-GBP conversion. Checking Scan, Overclockers, and eBay sold.

assistant Opus 4.8 6/6/2026, 11:18:25 PM
assistant Opus 4.8 6/6/2026, 11:18:25 PM
- WebSearch Intel Arc Pro B60 24GB scan.co.uk price 6.6s 2.6 KB
assistant Opus 4.8 6/6/2026, 11:18:26 PM
- WebSearch RTX 3090 24GB used price UK 2026 ebay sold 6.4s 2.1 KB
assistant Opus 4.8 35.3K tok $0.03 6/6/2026, 11:18:26 PM
- WebSearch RTX 3090 price UK scan overclockers 2026 7.5s 3.0 KB
assistant Opus 4.8 6/6/2026, 11:18:26 PM
- WebSearch Intel Arc Pro B60 24GB overclockers.co.uk 9.6s 2.9 KB
user 6/6/2026, 11:18:32 PM
user 6/6/2026, 11:18:33 PM
user 6/6/2026, 11:18:34 PM
attachment 6/6/2026, 11:18:35 PM
user 6/6/2026, 11:18:35 PM
assistant Opus 4.8 6/6/2026, 11:18:45 PM

assistant Opus 4.8 6/6/2026, 11:18:45 PM

Good leads. Let me pull the actual numbers from the trackers and product pages.

assistant Opus 4.8 6/6/2026, 11:18:45 PM
- WebFetch https://www.newegg.com/global/uk-en/asrock-intel-arc-pro-b60-24gb-graphics/p/N82E16814930148 6.8s 665 B
assistant Opus 4.8 6/6/2026, 11:18:45 PM
- WebFetch https://bestvaluegpu.com/en-gb/history/new-and-used-rtx-3090-price-history-and-specs/ 5.6s 407 B
assistant Opus 4.8 41.5K tok $0.07 6/6/2026, 11:18:47 PM
- WebSearch "Arc Pro B60" 24GB UK in stock buy price £ scan OR overclockers OR cclonline 9.8s 2.7 KB
assistant Opus 4.8 6/6/2026, 11:18:47 PM
- WebFetch https://e-catalog.co.uk/SPARKLE-INTEL-ARC-PRO-B60-BLOWER.htm 4.4s 215 B
user 6/6/2026, 11:18:50 PM
user 6/6/2026, 11:18:51 PM
user 6/6/2026, 11:18:52 PM
user 6/6/2026, 11:18:57 PM
assistant Opus 4.8 6/6/2026, 11:19:21 PM
system 6/6/2026, 11:19:28 PM

assistant Opus 4.8 45.4K tok $0.11 6/6/2026, 11:19:28 PM

Thanks — you're right, and the live numbers change the verdict. I undersold the 3090. Corrected picture:

## Actual UK prices (June 2026)

| Card | UK price | Notes |
|---|---|---|
| **RTX 3090 24GB (used)** | **~£820** | bestvaluegpu tracker, 6 Jun 2026. *Not* the £600–700 I quoted — I was wrong. New is irrelevant (~£2,180, discontinued). |
| **Arc Pro B60 24GB** (Maxsun, Destello Tech) | **£899.99** | Proper UK stock, free UK delivery, UK warranty. |
| **Arc Pro B60 24GB** (ASRock via Newegg) | £639.59 + £42.88 ship = **~£682** | Ships from US — **no manufacturer warranty out-of-region**, and the £682 likely doesn't include import handling on top. |
| **Arc Pro B60 Dual 48GB** (Maxsun) | **~£1,400** + £40 ship | Single card, 48GB, real UK buyer paid this. |

Caveat on the 3090: that £820 is the tracker's snapshot figure. Per the catalogue's own rule, eBay blocks automated fetching, so the honest anchor is to spot-check completed *sold* listings by hand, using an eBay search URL with `LH_Sold=1`. Even so, £800-ish is clearly the right ballpark, not £700.
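
For the manual spot-check, a small helper can build the link to open in a browser. This is a sketch only: `LH_Sold=1` comes from the catalogue rule, while the `/sch/i.html` path and the `_nkw` and `LH_Complete` parameters are assumptions about eBay's search URL format and should be confirmed by opening the result. The function name `ebay_sold_url` is made up for illustration.

```python
from urllib.parse import urlencode


def ebay_sold_url(query: str, site: str = "www.ebay.co.uk") -> str:
    """Build a search URL filtered to sold listings, for opening by hand."""
    params = {
        "_nkw": query,     # assumed name of the keyword parameter
        "LH_Sold": 1,      # sold-listings filter named in the catalogue rule
        "LH_Complete": 1,  # assumed companion filter for completed listings
    }
    return f"https://{site}/sch/i.html?{urlencode(params)}"


print(ebay_sold_url("rtx 3090 24gb"))
```

The helper only constructs the link. It does not fetch anything, which keeps it consistent with the rule that eBay prices are checked by a human rather than scraped.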

## What this does to the verdict

The price gap I implied has basically evaporated. At the **UK-supported** prices it's ~£820 (3090) vs ~£900 (B60) — near parity. So:

- **Single card, like-for-like ~£850:** the **used 3090 wins** for inference. Same 24GB, but ~2× the memory bandwidth (936 vs 456 GB/s) = roughly 2× single-stream tokens/sec, and zero CUDA friction. The B60 only pulls ahead on power (200W vs 350W) and warranty — real, but not enough to overcome 2× speed at the same money.
- **The grey-import B60 at ~£682** undercuts the 3090, but you're trading away warranty and eating import-handling risk for a slower card. Hard to recommend over a used 3090.
- **Where Arc actually wins: the B60 Dual 48GB at ~£1,400.** Two used 3090s for 48GB is ~£1,640 *plus* a dual-GPU headache (two slots, ~700W, cooling, splitting layers). One B60 Dual card gives you 48GB in one slot at ~200–300W for less money. That's the genuinely compelling Arc story for a quiet always-on garage box — it lets you hold ~70B-class quants the single 3090 can't.

So: **for one 24GB card, no — the 3090 is still the better buy at today's prices.** The Arc case is the 48GB dual card, not the single.
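
The comparison is easier to see as pounds per gigabyte of VRAM. A quick sketch using only the prices from the table above, with shipping included where it was quoted; it deliberately ignores bandwidth, warranty, and import risk, so it is one input to the verdict, not the verdict itself.

```python
# (price in GBP, capacity in GB), copied from the price table above
options = {
    "RTX 3090 24GB (used)": (820.00, 24),
    "Arc Pro B60 24GB (UK stock)": (899.99, 24),
    "Arc Pro B60 24GB (US import)": (639.59 + 42.88, 24),
    "2x RTX 3090 (used)": (2 * 820.00, 48),
    "Arc Pro B60 Dual 48GB": (1400.00 + 40.00, 48),
}


def gbp_per_gb(price_gbp: float, capacity_gb: int) -> float:
    """Pounds paid per gigabyte of VRAM."""
    return price_gbp / capacity_gb


# Cheapest per GB first
for name, (price, gb) in sorted(options.items(), key=lambda kv: gbp_per_gb(*kv[1])):
    print(f"{name:<30} £{price:>8.2f}  {gb}GB  £{gbp_per_gb(price, gb):.2f}/GB")
```

On these numbers the Dual 48GB comes in at about £30/GB against roughly £34/GB for used 3090s, which is the gap the verdict leans on.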

Want me to add both an **Arc B60 24GB (inference-only)** and an **Arc B60 Dual 48GB** build to `index.html`, with these dated prices and the "single = buy the 3090 instead, dual = genuinely interesting" verdict baked in? I'll re-verify the 3090 sold price by clicking the eBay `LH_Sold=1` URL first so the comparison anchor is solid.

Sources:
- [RTX 3090 UK price tracker — BestValueGPU](https://bestvaluegpu.com/en-gb/history/new-and-used-rtx-3090-price-history-and-specs/)
- [ASRock Arc Pro B60 24GB — Newegg UK](https://www.newegg.com/global/uk-en/asrock-intel-arc-pro-b60-24gb-graphics/p/N82E16814930148)
- [Maxsun Arc Pro B60 24GB £899.99 — Destello Tech](https://destellotech.com/products/maxsun-arc-pro-b60-24gb)
- [Arc Pro B60 in-stock/price tracker — TrackaLacker](https://www.trackalacker.com/products/showcase/arc-pro-b60)
- [MAXSUN Arc Pro B60 Dual 48GB — VideoCardz](https://videocardz.com/newz/maxsun-arc-pro-b60-dual-arrives-to-first-users-two-battlemage-gpus-on-one-board-and-48gb-memory)

user 6/6/2026, 11:21:14 PM

are there any other tricks or hardware we can consider?

assistant Opus 4.8 6/6/2026, 11:22:02 PM

assistant Opus 4.8 49.2K tok $0.12 6/6/2026, 11:22:18 PM

Plenty — and some of them sidestep the "discrete GPU" framing entirely. Since you've dropped training, the door opens to a few routes that are *inference-only* plays. Rough price anchors below are approximate and need live-verification before they go in the catalogue (flagging that explicitly given how the 3090 number bit us).

## Hardware routes worth a look

**1. Unified-memory machines (the big one for "model size per pound")**
- **Apple Mac Studio (M-series Ultra).** Up to 192GB unified memory on the M2 Ultra, most of it usable as VRAM (macOS holds back a share for the system by default), and ~800 GB/s bandwidth on the Ultra: *better* than the B60 and near a 3090. Runs on MLX or llama.cpp Metal, sips power, near-silent. A used **M1 Ultra 128GB** or **M2 Ultra** is a genuinely strong "run a 70B–120B quant in one quiet box" option. The catch: macOS, no CUDA, and weaker prompt-processing (compute-bound) speed than a 3090.
- **AMD Strix Halo / Ryzen AI Max+ 395** (Framework Desktop, GMKtec/Bosgame mini-PCs). Up to **128GB LPDDR5X unified**, ~96GB allocatable to the GPU, ~256 GB/s. ~£1,500–2,000. Low power, x86, runs llama.cpp/Vulkan + ROCm-ish. The current darling for cheap large-capacity local inference.
- **NVIDIA DGX Spark (GB10).** 128GB unified, CUDA, ~£3,000. Pricey but it's the only "big unified memory *and* CUDA" box.

**2. CPU + lots of RAM for MoE models — the cleverest trick right now**
Modern frontier open models are **Mixture-of-Experts** (DeepSeek-V3, Qwen3-235B-A22B, etc.): huge total params but only a small slice active per token, so they're *compute-light, memory-heavy*. A used **Epyc/Xeon board with 256–512GB DDR4/5** can run these at usable tok/s with **ktransformers** or **llama.cpp**, no big GPU needed — or a single 24GB GPU holding the hot experts/KV cache. This gets you "runnable model size" far beyond any single GPU's VRAM, for the price of cheap server RAM. Strongly worth a BOM.
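
The arithmetic behind the MoE trick, as a rough sketch. The parameter counts come from the model name (Qwen3-235B-A22B: about 235B total, about 22B active per token). The 4.5 bits per weight and the 200 GB/s RAM bandwidth are assumptions for illustration: a typical 4-bit quant with its scales, and roughly the theoretical peak of 8-channel DDR4-3200.

```python
def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights in GB for a given quantisation."""
    return params_billion * bits_per_weight / 8


BITS_PER_WEIGHT = 4.5    # assumption: 4-bit quant including scale overhead
RAM_BANDWIDTH_GB_S = 200.0  # assumption: ~8-channel DDR4-3200 peak

total_gb = weights_gb(235, BITS_PER_WEIGHT)   # must fit in RAM
active_gb = weights_gb(22, BITS_PER_WEIGHT)   # read per generated token

print(f"RAM needed for weights: ~{total_gb:.0f} GB")
print(f"Weights read per token: ~{active_gb:.1f} GB")
print(f"Decode ceiling:         ~{RAM_BANDWIDTH_GB_S / active_gb:.0f} tok/s")
```

Capacity is set by the total parameter count, while speed is set only by the active slice, which is why a RAM-heavy box can be usable here when a dense model of the same size would crawl. Sustained bandwidth sits below the theoretical peak, so treat the ceiling as optimistic.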

**3. VRAM-modded cards (China)**
- **RTX 2080 Ti 22GB** (modded) — 22GB for often less than a 3090, decent bandwidth. Cult favourite for cheap VRAM.
- **RTX 4090 48GB** (modded, blower) — expensive but real; 48GB CUDA in one slot.
- Both carry the usual grey-import + reliability caveats your catalogue already applies to China cards.

**4. Cheaper datacenter cards beyond your existing P40/MI50/V100 set**
- **Tesla P100 16GB / V100 32GB PCIe** (HBM2 = high bandwidth), **MI100** coming down. Same import-VAT and software-pain caveats you already document.

**5. Tenstorrent (Wormhole/Blackhole)** — interesting, lots of memory per pound, but the software stack is still immature. Honest "watch this space," not "buy now."

## Software/config tricks (stretch whatever GPU you pick)
- **Quantization is the master lever**: GGUF Q4_K_M, AWQ, or EXL2 fit roughly 3–4× the model in the same VRAM vs FP16, since 4-bit formats land a little over 4 bits per weight once scales are counted.
- **KV-cache quantization** (q8/q4) to fit long context without buying more VRAM.
- **Partial offload** (`-ngl`) — split layers GPU/CPU so a model that's slightly too big still runs.
- **Speculative decoding** — small draft model + big model for a free-ish speedup.
- **Tensor parallelism** across cheap cards (vLLM/ExLlama) — two 3090s acting as 48GB.
- **Mining-frame multi-GPU + x1 PCIe risers** — for *inference* the model loads once and PCIe traffic is tiny, so cheap x1 risers and an open frame let you cram 3–4 cards on a modest board. Perfect for a garage.
- **Power-cap the 3090s to ~250W** — ~5–10% perf loss, big thermal/PSU/noise win for an always-on box.
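
To show why KV-cache quantization matters for long context, here is a minimal sizing sketch. The model shape is an assumption (a Llama-3-70B-style layout with 80 layers, 8 KV heads via grouped-query attention, head dimension 128), and the byte sizes for q8 and q4 ignore the small per-block scale overhead that real quantised caches carry.

```python
def kv_cache_gib(n_layers: int, n_kv_heads: int, head_dim: int,
                 context_len: int, bytes_per_elem: float) -> float:
    """KV-cache size in GiB for one sequence.

    The leading 2 accounts for storing both a K and a V tensor per layer.
    """
    total_bytes = 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem
    return total_bytes / 2**30


# Assumed Llama-3-70B-style shape
LAYERS, KV_HEADS, HEAD_DIM = 80, 8, 128
CONTEXT = 32_768

for label, bytes_per_elem in [("f16", 2.0), ("q8", 1.0), ("q4", 0.5)]:
    size = kv_cache_gib(LAYERS, KV_HEADS, HEAD_DIM, CONTEXT, bytes_per_elem)
    print(f"{label:>3} KV cache at {CONTEXT} tokens: {size:.1f} GiB")
```

Under these assumptions a 32K context costs 10 GiB at f16, 5 GiB at q8, and 2.5 GiB at q4, so on a 24GB card the cache format can decide whether a long-context session fits at all.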

## My read on what's actually worth adding to the catalogue
The three that meaningfully expand your "best value" frontier beyond the current 10 BOMs:
1. **Mac Studio Ultra (used)** — quiet big-model-per-pound, inference-only.
2. **Strix Halo 128GB mini-PC** — same idea, x86, lower entry price.
3. **Epyc + 384GB RAM MoE box** — the "run a 235B model for the price of RAM" trick.

Want me to **live-price those three as UK BOMs** (plus the B60 24GB/Dual we already discussed) and fold them into `index.html` as a new "unified-memory & MoE routes" section? I'd verify each price at source and date-stamp per the catalogue convention before committing.

system 6/6/2026, 11:22:19 PM

user 6/6/2026, 11:22:35 PM

yes, price those three and add to index.html

assistant Opus 4.8 6/6/2026, 11:22:42 PM

Let me read the current catalogue structure and memory before pricing, so the new builds match conventions.

assistant Opus 4.8 6/6/2026, 11:22:42 PM
user 6/6/2026, 11:22:43 PM
assistant Opus 4.8 49.8K tok $0.04 6/6/2026, 11:22:43 PM
- Read /home/megaproxy/claude/research/llm-box-uk/index.html 239ms 50.3 KB
attachment 6/6/2026, 11:22:43 PM
user 6/6/2026, 11:22:43 PM
assistant Opus 4.8 6/6/2026, 11:22:43 PM
- Read /home/megaproxy/claude/research/llm-box-uk/memory.md 18ms 4.7 KB
assistant Opus 4.8 6/6/2026, 11:22:54 PM

assistant Opus 4.8 6/6/2026, 11:22:54 PM

Structure understood. Now let me live-price the three routes. Running the price research in parallel.

assistant Opus 4.8 6/6/2026, 11:22:55 PM
- WebSearch used Mac Studio M1 Ultra 128GB price UK 2026 ebay 7.3s 2.4 KB
assistant Opus 4.8 6/6/2026, 11:22:56 PM
- WebSearch Ryzen AI Max+ 395 128GB mini PC UK price Framework Desktop GMKtec EVO-X2 8.3s 3.4 KB
assistant Opus 4.8 6/6/2026, 11:22:56 PM
- WebSearch Mac Studio M2 Ultra 128GB 192GB used price UK refurbished 2026 7.0s 2.7 KB
assistant Opus 4.8 77.2K tok $0.21 6/6/2026, 11:22:57 PM
- WebSearch used Epyc 7002 7003 motherboard 384GB DDR4 RAM price UK ebay LLM 7.8s 2.6 KB
user 6/6/2026, 11:23:02 PM
user 6/6/2026, 11:23:03 PM
user 6/6/2026, 11:23:05 PM
user 6/6/2026, 11:23:05 PM
assistant Opus 4.8 6/6/2026, 11:23:16 PM
- WebFetch https://www.gmktec.uk/products/gmktec-evo-x2-amd-ryzen%e2%84%a2-ai-max-395-ai-mini-pc 4.9s 538 B

assistant Opus 4.8 6/6/2026, 11:23:16 PM

Need firmer numbers on each. Fetching specific product/listing pages.

assistant Opus 4.8 6/6/2026, 11:23:16 PM
- WebFetch https://www.apple.com/uk/shop/refurbished/mac/mac-studio 4.8s 366 B
assistant Opus 4.8 6/6/2026, 11:23:16 PM
assistant Opus 4.8 6/6/2026, 11:23:18 PM
- WebFetch https://www.hoxtonmacs.co.uk/collections/refurbished-mac-studio 8.1s 1.1 KB
assistant Opus 4.8 83.8K tok $0.10 6/6/2026, 11:23:19 PM
- WebFetch https://www.techradar.com/pro/there-are-15-amd-ryzen-ai-max-395-mini-pcs-in-the-world-right-now-heres-where-you-can-buy-… 4.5s 691 B
user 6/6/2026, 11:23:21 PM
user 6/6/2026, 11:23:21 PM
user 6/6/2026, 11:23:24 PM
attachment 6/6/2026, 11:23:26 PM
user 6/6/2026, 11:23:26 PM
assistant Opus 4.8 6/6/2026, 11:23:37 PM

assistant Opus 4.8 6/6/2026, 11:23:38 PM

Strix Halo nailed (£1,619.99 UK). Need to firm up the M1 Ultra 128GB used price and the Epyc bundle. Two more rounds.

assistant Opus 4.8 6/6/2026, 11:23:39 PM
- WebSearch Epyc 7532 OR 7702 H12SSL-i 256GB DDR4 bundle ebay UK price £ 7.1s 2.4 KB
assistant Opus 4.8 6/6/2026, 11:23:39 PM
- WebSearch Mac Studio M1 Ultra 128GB 1TB used price UK £ 2026 sold 7.2s 2.2 KB
assistant Opus 4.8 86.1K tok $0.09 6/6/2026, 11:23:40 PM
- WebSearch 384GB DDR4 ECC RDIMM 2666 3200 12x32GB price UK ebay used 8.7s 2.5 KB
user 6/6/2026, 11:23:46 PM
user 6/6/2026, 11:23:46 PM
user 6/6/2026, 11:23:48 PM
assistant Opus 4.8 6/6/2026, 11:26:39 PM

assistant Opus 4.8 6/6/2026, 11:26:41 PM

All three priced from live UK sources:
- **Strix Halo (GMKtec EVO-X2 128GB+2TB):** £1,619.99 direct from gmktec.uk — complete machine.
- **Mac Studio M1 Ultra 128GB:** ~£1,600 (eBay auction) to ~£2,455 (refurb dealer); ~£1,950 typical used buy-it-now.
- **Epyc + 384GB MoE box:** ~£1,375 assembled from eBay UK server parts.

Now adding builds K/L/M to the catalogue. Editing the hero, index table, a new section, the verdict, and footer.

user 6/6/2026, 11:26:44 PM
assistant Opus 4.8 103.6K tok $0.42 6/6/2026, 11:26:44 PM
- Edit /home/megaproxy/claude/research/llm-box-uk/index.html 34ms 161 B
user 6/6/2026, 11:26:50 PM
assistant Opus 4.8 104.1K tok $0.14 6/6/2026, 11:26:50 PM
- Edit /home/megaproxy/claude/research/llm-box-uk/index.html 14ms 161 B
user 6/6/2026, 11:26:55 PM
assistant Opus 4.8 104.5K tok $0.06 6/6/2026, 11:26:55 PM
- Edit /home/megaproxy/claude/research/llm-box-uk/index.html 39ms 161 B
assistant Opus 4.8 104.9K tok $0.06 6/6/2026, 11:26:59 PM
- Edit /home/megaproxy/claude/research/llm-box-uk/index.html 12ms 161 B
user 6/6/2026, 11:26:59 PM

assistant Opus 4.8 6/6/2026, 11:27:04 PM

Now the index table — add K/L/M rows and update caption:

assistant Opus 4.8 105.2K tok $0.06 6/6/2026, 11:27:05 PM
- Edit /home/megaproxy/claude/research/llm-box-uk/index.html 21ms 161 B
user 6/6/2026, 11:27:05 PM
attachment 6/6/2026, 11:27:05 PM
user 6/6/2026, 11:27:16 PM
assistant Opus 4.8 106.4K tok $0.08 6/6/2026, 11:27:16 PM
- Edit /home/megaproxy/claude/research/llm-box-uk/index.html 14ms 161 B
user 6/6/2026, 11:27:21 PM
assistant Opus 4.8 106.8K tok $0.07 6/6/2026, 11:27:21 PM
- Edit /home/megaproxy/claude/research/llm-box-uk/index.html 21ms 161 B
assistant Opus 4.8 6/6/2026, 11:27:52 PM

assistant Opus 4.8 6/6/2026, 11:27:52 PM

Now insert the new Unified-memory & MoE section between the builds and verdict sections:

assistant Opus 4.8 111.2K tok $0.17 6/6/2026, 11:28:05 PM
- Edit /home/megaproxy/claude/research/llm-box-uk/index.html 36ms 161 B
user 6/6/2026, 11:28:05 PM

assistant Opus 4.8 6/6/2026, 11:28:12 PM

Now add the three routes to the verdict table and update the footer methodology:

assistant Opus 4.8 112K tok $0.10 6/6/2026, 11:28:14 PM
- Edit /home/megaproxy/claude/research/llm-box-uk/index.html 31ms 161 B
user 6/6/2026, 11:28:14 PM
user 6/6/2026, 11:28:23 PM
assistant Opus 4.8 112.6K tok $0.07 6/6/2026, 11:28:23 PM
- Edit /home/megaproxy/claude/research/llm-box-uk/index.html 25ms 161 B

assistant Opus 4.8 6/6/2026, 11:28:30 PM

Edits in. Now updating memory.md per the working agreement, then committing.

assistant Opus 4.8 6/6/2026, 11:28:30 PM
assistant Opus 4.8 113.2K tok $0.07 6/6/2026, 11:28:32 PM
- Edit /home/megaproxy/claude/research/llm-box-uk/memory.md 30ms 160 B
user 6/6/2026, 11:28:32 PM
assistant Opus 4.8 114.6K tok $0.09 6/6/2026, 11:28:51 PM
- Edit /home/megaproxy/claude/research/llm-box-uk/memory.md 49ms 160 B
attachment 6/6/2026, 11:28:52 PM
user 6/6/2026, 11:28:52 PM
user 6/6/2026, 11:29:01 PM
assistant Opus 4.8 115.4K tok $0.08 6/6/2026, 11:29:01 PM
- Edit /home/megaproxy/claude/research/llm-box-uk/memory.md 27ms 160 B
assistant Opus 4.8 6/6/2026, 11:29:05 PM

Now commit and push:

assistant Opus 4.8 115.7K tok $0.07 6/6/2026, 11:29:06 PM
- Bash git add -A && git commit -q -m "Add unified-memory & MoE builds (K-M): Mac Studio, Strix Halo, EPYC MoE" && git push -q … 14.4s 9 B
user 6/6/2026, 11:29:20 PM
system 6/6/2026, 11:29:38 PM

assistant Opus 4.8 116.5K tok $0.08 6/6/2026, 11:29:38 PM

Done. Committed and pushed.

## What I added — builds K, L, M (live-priced 2026-06-07)

| # | Build | Capacity | Bandwidth | Total | The angle |
|---|---|---|---|---|---|
| **K** | Mac Studio M1 Ultra (used) | 128GB unified | ~800 GB/s | **~£1,950** | Quiet, near-3090 bandwidth, 5× the capacity. 70B fast, 120B Q4. |
| **L** | GMKtec EVO-X2 (Ryzen AI Max+ 395) | 128GB unified | ~256 GB/s | **£1,620** | Cheapest 128GB box, ~120W, x86. Best on MoE + mid models. |
| **M** | EPYC 7532 + 384GB DDR4 | 384GB system RAM | ~200 GB/s | **~£1,375** | The MoE trick — runs a 235B model with no GPU at all. |

**Price confidence:** L is rock-solid (£1,619.99 direct from gmktec.uk). K is a range (~£1,600 eBay auction → ~£2,300+ certified refurb; I anchored ~£1,950 for a typical used buy-it-now). M is assembled from eBay UK server parts, so the £1,375 is a build estimate, not a single SKU — I flagged that and the ~20% import-VAT risk on China RAM/CPU bundles.

**Where they landed in the doc:** new `[3·5]` section between the GPU builds and the verdict, with a framing box explaining the unified-memory vs MoE distinction; rows added to the `[00]` index table (with a divider and `*`/`†` footnote making clear this is shared/system memory, not discrete VRAM); three rows added to the verdict table; hero, nav, and footer methodology updated; `CHECKED 2026-06-05` kept with a `+UNIFIED/MoE 2026-06-07` tag since I didn't re-verify the A–J GPU prices this pass.

Two things worth flagging:
- The honest caveat baked into all three cards: these are **inference-first** — they buy capacity and quiet, not training or raw speed. Bandwidth, not capacity, caps token rate (Apple 800 / Strix Halo 256 / DDR4 200 GB/s).
- I did **not** add the **Arc B60 24GB / Dual 48GB** builds we discussed earlier — you only asked for these three. Want me to add the B60 Dual 48GB as build N? It's the one Arc config that genuinely competes (48GB one-slot, ~£1,400, vs 2× used 3090 at ~£1,640).