llm-box-uk

main DESKTOP-SSAQG5J cc 2.1.167 cli

/home/megaproxy/claude/research/llm-box-uk

126 events
38 tool calls
2.2M tokens
$2.77 notional cost
6/6/2026, 11:14:26 PM → 24d ago
in 7.5K out 41.9K cache-write 102.3K cache-read 2.1M

Transcript

126 turns · chronological (UTC timestamps shown local)
  1. attachment 6/6/2026, 11:14:26 PM
  2. attachment 6/6/2026, 11:14:26 PM
  3. user 6/6/2026, 11:14:26 PM
    are intel arc good for ai?
  4. assistant Opus 4.8 6/6/2026, 11:14:37 PM
  5. assistant Opus 4.8 27.6K tok $0.10 6/6/2026, 11:14:48 PM
    Short answer: **fine for inference, bad for the fine-tuning half of your brief.**
    
    For this project specifically (inference + light LoRA/QLoRA), Intel Arc is a half-fit:
    
    **Where Arc works — inference**
    - **Arc A770 16GB** is the interesting card: ~£280–330 used/new in the UK, 16GB GDDR6, ~560 GB/s bandwidth. That's a lot of VRAM per pound — cheaper than a 3090 on a £/GB basis.
    - Software is actually usable now: **IPEX-LLM** (Intel's PyTorch LLM extension) and **llama.cpp** with the SYCL or Vulkan backend both run quantised models on Arc reasonably well. Ollama works via the IPEX-LLM fork.
    - Battlemage **B580 12GB** (~£250) is the newer gen but less VRAM, so less relevant for "biggest model per pound."
    
    **Where Arc falls down — fine-tuning**
    - The LoRA/QLoRA ecosystem is CUDA-locked. `bitsandbytes` (the backbone of 4-bit QLoRA) has no real Arc support; PEFT/Axolotl/Unsloth all assume CUDA. You *can* do training via oneAPI/IPEX, but it's a constant fight against the toolchain, and Unsloth — the thing that makes "light QLoRA on one card" pleasant — is CUDA-only.
    - So Arc fails the second half of your "done" criteria. The whole reason the **used RTX 3090 24GB** is the safe pick is that it does both halves with zero ecosystem friction.
    
    **Verdict for the catalogue:** Arc A770 16GB belongs in the same honest-but-caveated bucket as the P40/MI50 cheap routes — a viable *inference-only* budget play, explicitly NOT recommended if you actually want the fine-tuning capability the brief calls for. It's arguably a cleaner cheap route than MI50 (no ROCm-on-ancient-card pain) but worse than 3090 for anything training-shaped.
    
    Want me to add an Arc A770 inference-only build to `index.html` as a sibling to the other cheap-card routes, with that caveat called out? I'd price it live first.
  6. system 6/6/2026, 11:14:48 PM
  7. attachment 6/6/2026, 11:15:57 PM
  8. user 6/6/2026, 11:15:57 PM
    if i dont really want to do training.. would a 24gb vram intel arc be a good fit?
  9. assistant Opus 4.8 6/6/2026, 11:16:17 PM
  10. assistant Opus 4.8 6/6/2026, 11:16:19 PM
    The card you're thinking of is the **Intel Arc Pro B60 24GB** (Battlemage, the "Project Battlematrix" workstation line) — the consumer A770 tops out at 16GB, so 24GB Arc means the Pro B60. Let me check current UK availability/pricing before I commit to a verdict, since that's the whole point of this catalogue.
  11. user 6/6/2026, 11:16:20 PM
  12. assistant Opus 4.8 29.1K tok $0.07 6/6/2026, 11:16:20 PM
    • ToolSearch select:WebSearch,WebFetch 4ms 100 B
  13. assistant Opus 4.8 6/6/2026, 11:16:22 PM
    • WebSearch Intel Arc Pro B60 24GB price UK availability 2026 7.4s 2.8 KB
  14. assistant Opus 4.8 29.9K tok $0.10 6/6/2026, 11:16:23 PM
    • WebSearch Intel Arc Pro B60 24GB memory bandwidth llama.cpp vLLM inference benchmark 9.1s 2.5 KB
  15. user 6/6/2026, 11:16:30 PM
  16. user 6/6/2026, 11:16:32 PM
  17. assistant Opus 4.8 6/6/2026, 11:17:00 PM
  18. assistant Opus 4.8 34.8K tok $0.09 6/6/2026, 11:17:10 PM
    Yes — drop the training requirement and the B60 goes from "half-fit" to **genuinely competitive**. Here's the honest read, now grounded in current pricing:
    
    ## Intel Arc Pro B60 24GB — what it actually is
    - 24GB GDDR6, **456 GB/s** bandwidth, 192-bit bus, PCIe 5.0 **x8**, 160 XMX AI engines.
    - Pricing: listed as low as **$599** in the US; Europe **€769–€1,025** depending on model (Sparkle/ASRock). That's roughly **£650–£900 inc VAT** landed in the UK — and crucially it's *new with warranty and UK stock* (Newegg UK, ASRock, Sparkle), not a used eBay gamble.
    - It's explicitly marketed for ~20B-class local LLMs, and the software is real now: **vLLM has first-class Intel Arc support** (Nov 2025), llama.cpp runs via SYCL, plus IPEX-LLM/Ollama. For inference-only, the CUDA-lock problem that killed it for training no longer applies.
    
    ## The one tradeoff that matters: bandwidth
    The used **RTX 3090 24GB** has **~936 GB/s** — roughly **2× the B60's 456 GB/s**. LLM token generation is memory-bandwidth-bound, so on a model that fills 24GB, expect the 3090 to push **roughly 2× the single-stream tokens/sec**. That's the headline cost of going Arc.
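
The bandwidth argument above can be turned into a back-of-envelope ceiling. This is a minimal sketch, assuming every generated token streams the full set of weights from memory once, so single-stream tokens/sec is at most bandwidth divided by model size. The 20 GB model size is an illustrative assumption, not a measured figure, and real throughput will land below these ceilings.

```python
def max_tokens_per_sec(bandwidth_gb_s: float, model_gb: float) -> float:
    """Upper bound on single-stream decode speed for a memory-bound model."""
    return bandwidth_gb_s / model_gb

MODEL_GB = 20.0  # assumed: a quantised model that nearly fills a 24GB card

# Memory bandwidth in GB/s, taken from the figures quoted above.
cards = {"RTX 3090": 936.0, "Arc Pro B60": 456.0}
bounds = {name: max_tokens_per_sec(bw, MODEL_GB) for name, bw in cards.items()}

for name, tps in bounds.items():
    print(f"{name}: <= {tps:.1f} tok/s")

speedup = bounds["RTX 3090"] / bounds["Arc Pro B60"]
print(f"3090 advantage: {speedup:.2f}x")
```

The ratio does not depend on the assumed model size, which is why the "roughly 2×" figure holds for any model that fits on both cards.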
    
    Where the B60 claws it back:
    - **Power/heat:** ~120–200W TBP vs the 3090's 350W. For a 24/7 garage box at UK electricity prices, that's real money and far less cooling drama. Blower and passive variants exist — basically built for racking.
    - **Batched serving:** the XMX engines scale well concurrently (~1000 tok/s aggregate, 50–64 concurrent requests in vLLM benches). If you're serving a few users/agents rather than chasing one fast stream, the gap narrows a lot.
    - **New + warranty + in stock** vs used-3090 roulette.
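
To put a number on the power point, here is a rough sketch of yearly electricity cost for a constant load. The 0.25 GBP/kWh tariff is an assumption (substitute your own unit rate), and running at full board power around the clock is a worst case, since an idle inference box draws far less.

```python
HOURS_PER_YEAR = 24 * 365
PRICE_PER_KWH_GBP = 0.25  # assumed tariff, not a quoted UK price

def annual_cost_gbp(watts: float, price_per_kwh: float = PRICE_PER_KWH_GBP) -> float:
    """Yearly electricity cost of a constant load running 24/7."""
    return watts / 1000 * HOURS_PER_YEAR * price_per_kwh

b60 = annual_cost_gbp(200)      # top of the B60's ~120-200W range
rtx3090 = annual_cost_gbp(350)  # 3090 board power quoted above

print(f"B60: {b60:.0f} GBP/yr, 3090: {rtx3090:.0f} GBP/yr, "
      f"gap: {rtx3090 - b60:.0f} GBP/yr")
```

Scaling the result by the fraction of time the card is actually busy gives a more realistic figure for a lightly used box.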
    
    ## Verdict for your case (inference, no training)
    - **Want max single-user speed per pound** → used 3090 still wins on raw tok/s, similar price (~£600–700 used).
    - **Want quiet-ish, low-power, new-with-warranty, and you serve batched/agent workloads or are fine with ~half the stream speed** → the B60 24GB is a legitimately good fit, and arguably the *better* fit for an always-on garage box you don't want to babysit.
    - **Bonus path:** two B60s (or the single-card **B60 Dual 48GB**) gets you 48GB for roughly the price of one used 3090-and-a-bit — that's where Arc gets genuinely interesting, since 48GB lets you run ~70B quants the 3090 can't hold alone.
    
    Want me to **live-price a UK B60 build** (single 24GB, and a dual-48GB variant) and add it to `index.html` as an inference-only sibling to the 3090 build? I'd pull current Newegg UK / ASRock stock prices and date-stamp it per the catalogue convention.
    
    Sources:
    - [Arc Pro B60 24GB Europe launch from €769 — VideoCardz](https://videocardz.com/newz/intel-arc-pro-b60-24gb-workstation-gpu-to-launch-in-europe-mid-to-late-november-starting-at-e769)
    - [Arc Pro B60 listed at $599, in stock — VideoCardz](https://videocardz.com/newz/intel-arc-pro-b60-24gb-professional-gpu-listed-at-599-in-stock-and-shipping)
    - [Intel Arc Pro B60 product specs — Intel](https://www.intel.com/content/www/us/en/products/sku/243916/intel-arc-pro-b60-graphics/specifications.html)
    - [Benchmarking LLM inference on Arc Pro B60 (vLLM) — EmbeddedLLM](https://embeddedllm.com/blog/benchmarking-llm-inference-intel-arc-pro-b60)
    - [Fast, affordable LLM serving on Arc Pro B-Series with vLLM — vLLM blog](https://vllm.ai/blog/2025-11-11-intel-arc-pro-b)
    - [ASRock Arc Pro B60 24GB — Newegg UK](https://www.newegg.com/global/uk-en/asrock-intel-arc-pro-b60-24gb-graphics/p/N82E16814930148)
  19. system 6/6/2026, 11:17:10 PM
  20. user 6/6/2026, 11:18:20 PM
    check websites like scan or overclockers.co.uk and ebay to be sure of prices. 3090 costs a lot more than you say etc
  21. assistant Opus 4.8 6/6/2026, 11:18:25 PM
    You're right to push on this — let me pull actual UK retailer and eBay numbers rather than trusting the MSRP-to-GBP conversion. Checking Scan, Overclockers, and eBay sold.
  22. assistant Opus 4.8 6/6/2026, 11:18:25 PM
  23. assistant Opus 4.8 6/6/2026, 11:18:25 PM
    • WebSearch Intel Arc Pro B60 24GB scan.co.uk price 6.6s 2.6 KB
  24. assistant Opus 4.8 6/6/2026, 11:18:26 PM
    • WebSearch RTX 3090 24GB used price UK 2026 ebay sold 6.4s 2.1 KB
  25. assistant Opus 4.8 35.3K tok $0.03 6/6/2026, 11:18:26 PM
    • WebSearch RTX 3090 price UK scan overclockers 2026 7.5s 3.0 KB
  26. assistant Opus 4.8 6/6/2026, 11:18:26 PM
    • WebSearch Intel Arc Pro B60 24GB overclockers.co.uk 9.6s 2.9 KB
  27. user 6/6/2026, 11:18:32 PM
  28. user 6/6/2026, 11:18:33 PM
  29. user 6/6/2026, 11:18:34 PM
  30. attachment 6/6/2026, 11:18:35 PM
  31. user 6/6/2026, 11:18:35 PM
  32. assistant Opus 4.8 6/6/2026, 11:18:45 PM
  33. assistant Opus 4.8 6/6/2026, 11:18:45 PM
    Good leads. Let me pull the actual numbers from the trackers and product pages.
  34. assistant Opus 4.8 6/6/2026, 11:18:45 PM
    • WebFetch https://www.newegg.com/global/uk-en/asrock-intel-arc-pro-b60-24gb-graphics/p/N82E16814930148 6.8s 665 B
  35. assistant Opus 4.8 6/6/2026, 11:18:45 PM
    • WebFetch https://bestvaluegpu.com/en-gb/history/new-and-used-rtx-3090-price-history-and-specs/ 5.6s 407 B
  36. assistant Opus 4.8 41.5K tok $0.07 6/6/2026, 11:18:47 PM
    • WebSearch "Arc Pro B60" 24GB UK in stock buy price £ scan OR overclockers OR cclonline 9.8s 2.7 KB
  37. assistant Opus 4.8 6/6/2026, 11:18:47 PM
    • WebFetch https://e-catalog.co.uk/SPARKLE-INTEL-ARC-PRO-B60-BLOWER.htm 4.4s 215 B
  38. user 6/6/2026, 11:18:50 PM
  39. user 6/6/2026, 11:18:51 PM
  40. user 6/6/2026, 11:18:52 PM
  41. user 6/6/2026, 11:18:57 PM
  42. assistant Opus 4.8 6/6/2026, 11:19:21 PM
  43. system 6/6/2026, 11:19:28 PM
  44. assistant Opus 4.8 45.4K tok $0.11 6/6/2026, 11:19:28 PM
    Thanks — you're right, and the live numbers change the verdict. I undersold the 3090. Corrected picture:
    
    ## Actual UK prices (June 2026)
    
    | Card | UK price | Notes |
    |---|---|---|
    | **RTX 3090 24GB (used)** | **~£820** | bestvaluegpu tracker, 6 Jun 2026. *Not* the £600–700 I quoted — I was wrong. New is irrelevant (~£2,180, discontinued). |
    | **Arc Pro B60 24GB** (Maxsun, Destello Tech) | **£899.99** | Proper UK stock, free UK delivery, UK warranty. |
    | **Arc Pro B60 24GB** (ASRock via Newegg) | £639.59 + £42.88 ship = **~£682** | Ships from US — **no manufacturer warranty out-of-region**, and the £682 likely doesn't include import handling on top. |
    | **Arc Pro B60 Dual 48GB** (Maxsun) | **~£1,400** + £40 ship | Single card, 48GB, real UK buyer paid this. |
    
    Caveat on the 3090: that £820 is the tracker's snapshot figure. Per the catalogue's own rule, eBay blocks automated fetch, so the honest anchor is to spot-check completed *sold* listings by hand, opening a search with `LH_Sold=1` set. But £800-ish is clearly the right ballpark, not £700.
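
Since the sold-listings check has to be done by hand, a small helper can at least build the link to click. This is a sketch only: the `/sch/i.html` path, the `_nkw` keyword parameter, and the companion `LH_Complete=1` flag are assumptions about eBay's search URL layout, and `ebay_sold_url` is a hypothetical helper name. Only `LH_Sold=1` comes from the discussion above.

```python
from urllib.parse import urlencode

def ebay_sold_url(query: str, site: str = "https://www.ebay.co.uk") -> str:
    """Build a search URL restricted to sold listings, for manual checking."""
    params = {
        "_nkw": query,        # search keywords (assumed parameter name)
        "LH_Sold": "1",       # sold items only, the flag named above
        "LH_Complete": "1",   # assumed companion flag: completed listings
    }
    return f"{site}/sch/i.html?{urlencode(params)}"

url = ebay_sold_url("rtx 3090 24gb")
print(url)
```

The helper only produces a URL for a person to open in a browser; it does not fetch anything, which keeps it consistent with the no-automated-fetch rule.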
    
    ## What this does to the verdict
    
    The price gap I implied has basically evaporated. At the **UK-supported** prices it's ~£820 (3090) vs ~£900 (B60) — near parity. So:
    
    - **Single card, like-for-like ~£850:** the **used 3090 wins** for inference. Same 24GB, but ~2× the memory bandwidth (936 vs 456 GB/s) = roughly 2× single-stream tokens/sec, and zero CUDA friction. The B60 only pulls ahead on power (200W vs 350W) and warranty — real, but not enough to overcome 2× speed at the same money.
    - **The grey-import B60 at ~£682** undercuts the 3090, but you're trading away warranty and eating import-handling risk for a slower card. Hard to recommend over a used 3090.
    - **Where Arc actually wins: the B60 Dual 48GB at ~£1,400.** Two used 3090s for 48GB is ~£1,640 *plus* a dual-GPU headache (two slots, ~700W, cooling, splitting layers). One B60 Dual card gives you 48GB in one slot at ~200–300W for less money. That's the genuinely compelling Arc story for a quiet always-on garage box — it lets you hold ~70B-class quants the single 3090 can't.
    
    So: **for one 24GB card, no — the 3090 is still the better buy at today's prices.** The Arc case is the 48GB dual card, not the single.
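
The comparison above can be checked with a quick cost-per-GB calculation. A minimal sketch using the prices from the table: the dual-card figure adds the ~£40 shipping, and the two-3090 figure is simply twice the tracker price. Cost per GB ignores bandwidth, power, and warranty, so it is only one axis of the verdict.

```python
# (price in GBP, VRAM in GB), taken from the price table above.
options = {
    "RTX 3090 24GB (used)": (820.00, 24),
    "Arc Pro B60 24GB (UK stock)": (899.99, 24),
    "2x RTX 3090 (used)": (2 * 820.00, 48),
    "Arc Pro B60 Dual 48GB": (1400.00 + 40.00, 48),  # card plus shipping
}

per_gb = {name: price / gb for name, (price, gb) in options.items()}

# Print cheapest first.
for name, cost in sorted(per_gb.items(), key=lambda kv: kv[1]):
    print(f"{name}: {cost:.2f} GBP/GB")

cheapest = min(per_gb, key=per_gb.get)
```

On these numbers the dual card is the cheapest route per GB, while the single B60 at UK-stock pricing is the most expensive, which matches the "dual yes, single no" conclusion.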
    
    Want me to add both an **Arc B60 24GB (inference-only)** and an **Arc B60 Dual 48GB** build to `index.html`, with these dated prices and the "single = buy the 3090 instead, dual = genuinely interesting" verdict baked in? I'll re-verify the 3090 sold price by clicking the eBay `LH_Sold=1` URL first so the comparison anchor is solid.
    
    Sources:
    - [RTX 3090 UK price tracker — BestValueGPU](https://bestvaluegpu.com/en-gb/history/new-and-used-rtx-3090-price-history-and-specs/)
    - [ASRock Arc Pro B60 24GB — Newegg UK](https://www.newegg.com/global/uk-en/asrock-intel-arc-pro-b60-24gb-graphics/p/N82E16814930148)
    - [Maxsun Arc Pro B60 24GB £899.99 — Destello Tech](https://destellotech.com/products/maxsun-arc-pro-b60-24gb)
    - [Arc Pro B60 in-stock/price tracker — TrackaLacker](https://www.trackalacker.com/products/showcase/arc-pro-b60)
    - [MAXSUN Arc Pro B60 Dual 48GB — VideoCardz](https://videocardz.com/newz/maxsun-arc-pro-b60-dual-arrives-to-first-users-two-battlemage-gpus-on-one-board-and-48gb-memory)
  45. user 6/6/2026, 11:21:14 PM
    are there any other tricks or hardware we can consider?
  46. assistant Opus 4.8 6/6/2026, 11:22:02 PM
  47. assistant Opus 4.8 49.2K tok $0.12 6/6/2026, 11:22:18 PM
    Plenty — and some of them sidestep the "discrete GPU" framing entirely. Since you've dropped training, the door opens to a few routes that are *inference-only* plays. Rough price anchors below are approximate and need live-verification before they go in the catalogue (flagging that explicitly given how the 3090 number bit us).
    
    ## Hardware routes worth a look
    
    **1. Unified-memory machines (the big one for "model size per pound")**
    - **Apple Mac Studio (M-series Ultra).** Up to 192GB unified memory, most of it addressable by the GPU (macOS holds back a share for the system by default), at ~800 GB/s on the Ultra. That is well above the B60 and close to a 3090. Runs on MLX or llama.cpp Metal, sips power, near-silent. A used **M1 Ultra 128GB** or **M2 Ultra** is a genuinely strong "run a 70B–120B quant in one quiet box" option. The catch: macOS, no CUDA, weaker prompt-processing (compute-bound) speed than a 3090.
    - **AMD Strix Halo / Ryzen AI Max+ 395** (Framework Desktop, GMKtec/Bosgame mini-PCs). Up to **128GB LPDDR5X unified**, ~96GB allocatable to the GPU, ~256 GB/s. ~£1,500–2,000. Low power, x86, runs llama.cpp on Vulkan, with ROCm as a less settled alternative. The current darling for cheap large-capacity local inference.
    - **NVIDIA DGX Spark (GB10).** 128GB unified, CUDA, ~£3,000. Pricey but it's the only "big unified memory *and* CUDA" box.
    
    **2. CPU + lots of RAM for MoE models — the cleverest trick right now**
    Modern frontier open models are **Mixture-of-Experts** (DeepSeek-V3, Qwen3-235B-A22B, etc.): huge total params but only a small slice active per token, so they're *compute-light, memory-heavy*. A used **Epyc/Xeon board with 256–512GB DDR4/5** can run these at usable tok/s with **ktransformers** or **llama.cpp**, no big GPU needed — or a single 24GB GPU holding the hot experts/KV cache. This gets you "runnable model size" far beyond any single GPU's VRAM, for the price of cheap server RAM. Strongly worth a BOM.
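
The "compute-light, memory-heavy" point can be sketched numerically. The parameter counts come from the Qwen3-235B-A22B name (235B total, 22B active per token). The 4.5 bits per weight and the 200 GB/s RAM bandwidth are assumptions for a ~4-bit quant on 8-channel DDR4, so treat the output as a rough ceiling rather than a benchmark.

```python
def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8

BITS = 4.5             # assumed average bits per weight for a ~4-bit quant
RAM_BANDWIDTH = 200.0  # GB/s, assumed for 8-channel DDR4

total_gb = weights_gb(235, BITS)   # every expert must be resident in RAM
active_gb = weights_gb(22, BITS)   # only the active slice is read per token
ceiling_tps = RAM_BANDWIDTH / active_gb

print(f"resident: {total_gb:.0f} GB, read per token: {active_gb:.1f} GB")
print(f"decode ceiling: {ceiling_tps:.0f} tok/s")
```

The gap between the two sizes is the whole trick: capacity has to cover the total, but speed is set by the much smaller active slice.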
    
    **3. VRAM-modded cards (China)**
    - **RTX 2080 Ti 22GB** (modded) — 22GB for often less than a 3090, decent bandwidth. Cult favourite for cheap VRAM.
    - **RTX 4090 48GB** (modded, blower) — expensive but real; 48GB CUDA in one slot.
    - Both carry the usual grey-import + reliability caveats your catalogue already applies to China cards.
    
    **4. Cheaper datacenter cards beyond your existing P40/MI50/V100 set**
    - **Tesla P100 16GB / V100 32GB PCIe** (HBM2 = high bandwidth), plus the **MI100**, which is coming down in price. Same import-VAT and software-pain caveats you already document.
    
    **5. Tenstorrent (Wormhole/Blackhole)** — interesting, lots of memory per pound, but the software stack is still immature. Honest "watch this space," not "buy now."
    
    ## Software/config tricks (stretch whatever GPU you pick)
    - **Quantization is the master lever:** GGUF Q4_K_M, AWQ, or EXL2 fit roughly 3–4× the model in the same VRAM vs FP16 (4-bit formats carry some per-block overhead, so the gain is a little under 4×).
    - **KV-cache quantization** (q8/q4) to fit long context without buying more VRAM.
    - **Partial offload** (`-ngl`) — split layers GPU/CPU so a model that's slightly too big still runs.
    - **Speculative decoding** — small draft model + big model for a free-ish speedup.
    - **Tensor parallelism** across cheap cards (vLLM/ExLlama) — two 3090s acting as 48GB.
    - **Mining-frame multi-GPU + x1 PCIe risers** — for *inference* the model loads once and PCIe traffic is tiny, so cheap x1 risers and an open frame let you cram 3–4 cards on a modest board. Perfect for a garage.
    - **Power-cap the 3090s to ~250W** — ~5–10% perf loss, big thermal/PSU/noise win for an always-on box.
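
The quantization lever is easy to sanity-check before buying anything. A minimal sketch, assuming ~4.8 effective bits per weight for a Q4_K_M-style quant and a flat 2 GB allowance for KV cache and runtime buffers; both numbers are assumptions, and long contexts need considerably more than 2 GB.

```python
def quant_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight footprint in GB for a quantised model."""
    return params_billion * bits_per_weight / 8

def fits(params_billion: float, bits_per_weight: float,
         vram_gb: float, overhead_gb: float = 2.0) -> bool:
    """True if weights plus an assumed runtime allowance fit in VRAM."""
    return quant_size_gb(params_billion, bits_per_weight) + overhead_gb <= vram_gb

Q4_BITS = 4.8  # assumed effective bits per weight for a Q4_K_M-style quant

print(f"70B at FP16: {quant_size_gb(70, 16):.0f} GB")
print(f"70B at ~4.8 bpw: {quant_size_gb(70, Q4_BITS):.0f} GB")
for vram in (24, 48):
    print(f"fits in {vram}GB: {fits(70, Q4_BITS, vram)}")
```

Under these assumptions a 70B quant needs the 48GB tier, and a 24GB card would have to fall back on partial offload.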
    
    ## My read on what's actually worth adding to the catalogue
    The three that meaningfully expand your "best value" frontier beyond the current 10 BOMs:
    1. **Mac Studio Ultra (used)** — quiet big-model-per-pound, inference-only.
    2. **Strix Halo 128GB mini-PC** — same idea, x86, lower entry price.
    3. **Epyc + 384GB RAM MoE box** — the "run a 235B model for the price of RAM" trick.
    
    Want me to **live-price those three as UK BOMs** (plus the B60 24GB/Dual we already discussed) and fold them into `index.html` as a new "unified-memory & MoE routes" section? I'd verify each price at source and date-stamp per the catalogue convention before committing.
  48. system 6/6/2026, 11:22:19 PM
  49. user 6/6/2026, 11:22:35 PM
    yes, price those three and add to index.html
  50. assistant Opus 4.8 6/6/2026, 11:22:42 PM
    Let me read the current catalogue structure and memory before pricing, so the new builds match conventions.
  51. assistant Opus 4.8 6/6/2026, 11:22:42 PM
  52. user 6/6/2026, 11:22:43 PM
  53. assistant Opus 4.8 49.8K tok $0.04 6/6/2026, 11:22:43 PM
    • Read /home/megaproxy/claude/research/llm-box-uk/index.html 239ms 50.3 KB
  54. attachment 6/6/2026, 11:22:43 PM
  55. user 6/6/2026, 11:22:43 PM
  56. assistant Opus 4.8 6/6/2026, 11:22:43 PM
    • Read /home/megaproxy/claude/research/llm-box-uk/memory.md 18ms 4.7 KB
  57. assistant Opus 4.8 6/6/2026, 11:22:54 PM
  58. assistant Opus 4.8 6/6/2026, 11:22:54 PM
    Structure understood. Now let me live-price the three routes. Running the price research in parallel.
  59. assistant Opus 4.8 6/6/2026, 11:22:55 PM
    • WebSearch used Mac Studio M1 Ultra 128GB price UK 2026 ebay 7.3s 2.4 KB
  60. assistant Opus 4.8 6/6/2026, 11:22:56 PM
    • WebSearch Ryzen AI Max+ 395 128GB mini PC UK price Framework Desktop GMKtec EVO-X2 8.3s 3.4 KB
  61. assistant Opus 4.8 6/6/2026, 11:22:56 PM
    • WebSearch Mac Studio M2 Ultra 128GB 192GB used price UK refurbished 2026 7.0s 2.7 KB
  62. assistant Opus 4.8 77.2K tok $0.21 6/6/2026, 11:22:57 PM
    • WebSearch used Epyc 7002 7003 motherboard 384GB DDR4 RAM price UK ebay LLM 7.8s 2.6 KB
  63. user 6/6/2026, 11:23:02 PM
  64. user 6/6/2026, 11:23:03 PM
  65. user 6/6/2026, 11:23:05 PM
  66. user 6/6/2026, 11:23:05 PM
  67. assistant Opus 4.8 6/6/2026, 11:23:16 PM
    • WebFetch https://www.gmktec.uk/products/gmktec-evo-x2-amd-ryzen%e2%84%a2-ai-max-395-ai-mini-pc 4.9s 538 B
  68. assistant Opus 4.8 6/6/2026, 11:23:16 PM
    Need firmer numbers on each. Fetching specific product/listing pages.
  69. assistant Opus 4.8 6/6/2026, 11:23:16 PM
    • WebFetch https://www.apple.com/uk/shop/refurbished/mac/mac-studio 4.8s 366 B
  70. assistant Opus 4.8 6/6/2026, 11:23:16 PM
  71. assistant Opus 4.8 6/6/2026, 11:23:18 PM
    • WebFetch https://www.hoxtonmacs.co.uk/collections/refurbished-mac-studio 8.1s 1.1 KB
  72. assistant Opus 4.8 83.8K tok $0.10 6/6/2026, 11:23:19 PM
    • WebFetch https://www.techradar.com/pro/there-are-15-amd-ryzen-ai-max-395-mini-pcs-in-the-world-right-now-heres-where-you-can-buy-… 4.5s 691 B
  73. user 6/6/2026, 11:23:21 PM
  74. user 6/6/2026, 11:23:21 PM
  75. user 6/6/2026, 11:23:24 PM
  76. attachment 6/6/2026, 11:23:26 PM
  77. user 6/6/2026, 11:23:26 PM
  78. assistant Opus 4.8 6/6/2026, 11:23:37 PM
  79. assistant Opus 4.8 6/6/2026, 11:23:38 PM
    Strix Halo nailed (£1,619.99 UK). Need to firm up the M1 Ultra 128GB used price and the Epyc bundle. Two more rounds.
  80. assistant Opus 4.8 6/6/2026, 11:23:39 PM
    • WebSearch Epyc 7532 OR 7702 H12SSL-i 256GB DDR4 bundle ebay UK price £ 7.1s 2.4 KB
  81. assistant Opus 4.8 6/6/2026, 11:23:39 PM
    • WebSearch Mac Studio M1 Ultra 128GB 1TB used price UK £ 2026 sold 7.2s 2.2 KB
  82. assistant Opus 4.8 86.1K tok $0.09 6/6/2026, 11:23:40 PM
    • WebSearch 384GB DDR4 ECC RDIMM 2666 3200 12x32GB price UK ebay used 8.7s 2.5 KB
  83. user 6/6/2026, 11:23:46 PM
  84. user 6/6/2026, 11:23:46 PM
  85. user 6/6/2026, 11:23:48 PM
  86. assistant Opus 4.8 6/6/2026, 11:26:39 PM
  87. assistant Opus 4.8 6/6/2026, 11:26:41 PM
    All three priced from live UK sources:
    - **Strix Halo (GMKtec EVO-X2 128GB+2TB):** £1,619.99 direct from gmktec.uk — complete machine.
    - **Mac Studio M1 Ultra 128GB:** ~£1,600 (eBay auction) to ~£2,455 (refurb dealer); ~£1,950 typical used buy-it-now.
    - **Epyc + 384GB MoE box:** ~£1,375 assembled from eBay UK server parts.
    
    Now adding builds K/L/M to the catalogue. Editing the hero, index table, a new section, the verdict, and footer.
  88. user 6/6/2026, 11:26:44 PM
  89. assistant Opus 4.8 103.6K tok $0.42 6/6/2026, 11:26:44 PM
    • Edit /home/megaproxy/claude/research/llm-box-uk/index.html 34ms 161 B
  90. user 6/6/2026, 11:26:50 PM
  91. assistant Opus 4.8 104.1K tok $0.14 6/6/2026, 11:26:50 PM
    • Edit /home/megaproxy/claude/research/llm-box-uk/index.html 14ms 161 B
  92. user 6/6/2026, 11:26:55 PM
  93. assistant Opus 4.8 104.5K tok $0.06 6/6/2026, 11:26:55 PM
    • Edit /home/megaproxy/claude/research/llm-box-uk/index.html 39ms 161 B
  94. assistant Opus 4.8 104.9K tok $0.06 6/6/2026, 11:26:59 PM
    • Edit /home/megaproxy/claude/research/llm-box-uk/index.html 12ms 161 B
  95. user 6/6/2026, 11:26:59 PM
  96. assistant Opus 4.8 6/6/2026, 11:27:04 PM
    Now the index table — add K/L/M rows and update caption:
  97. assistant Opus 4.8 105.2K tok $0.06 6/6/2026, 11:27:05 PM
    • Edit /home/megaproxy/claude/research/llm-box-uk/index.html 21ms 161 B
  98. user 6/6/2026, 11:27:05 PM
  99. attachment 6/6/2026, 11:27:05 PM
  100. user 6/6/2026, 11:27:16 PM
  101. assistant Opus 4.8 106.4K tok $0.08 6/6/2026, 11:27:16 PM
    • Edit /home/megaproxy/claude/research/llm-box-uk/index.html 14ms 161 B
  102. user 6/6/2026, 11:27:21 PM
  103. assistant Opus 4.8 106.8K tok $0.07 6/6/2026, 11:27:21 PM
    • Edit /home/megaproxy/claude/research/llm-box-uk/index.html 21ms 161 B
  104. assistant Opus 4.8 6/6/2026, 11:27:52 PM
  105. assistant Opus 4.8 6/6/2026, 11:27:52 PM
    Now insert the new Unified-memory & MoE section between the builds and verdict sections:
  106. assistant Opus 4.8 111.2K tok $0.17 6/6/2026, 11:28:05 PM
    • Edit /home/megaproxy/claude/research/llm-box-uk/index.html 36ms 161 B
  107. user 6/6/2026, 11:28:05 PM
  108. assistant Opus 4.8 6/6/2026, 11:28:12 PM
    Now add the three routes to the verdict table and update the footer methodology:
  109. assistant Opus 4.8 112K tok $0.10 6/6/2026, 11:28:14 PM
    • Edit /home/megaproxy/claude/research/llm-box-uk/index.html 31ms 161 B
  110. user 6/6/2026, 11:28:14 PM
  111. user 6/6/2026, 11:28:23 PM
  112. assistant Opus 4.8 112.6K tok $0.07 6/6/2026, 11:28:23 PM
    • Edit /home/megaproxy/claude/research/llm-box-uk/index.html 25ms 161 B
  113. assistant Opus 4.8 6/6/2026, 11:28:30 PM
    Edits in. Now updating memory.md per the working agreement, then committing.
  114. assistant Opus 4.8 6/6/2026, 11:28:30 PM
  115. assistant Opus 4.8 113.2K tok $0.07 6/6/2026, 11:28:32 PM
    • Edit /home/megaproxy/claude/research/llm-box-uk/memory.md 30ms 160 B
  116. user 6/6/2026, 11:28:32 PM
  117. assistant Opus 4.8 114.6K tok $0.09 6/6/2026, 11:28:51 PM
    • Edit /home/megaproxy/claude/research/llm-box-uk/memory.md 49ms 160 B
  118. attachment 6/6/2026, 11:28:52 PM
  119. user 6/6/2026, 11:28:52 PM
  120. user 6/6/2026, 11:29:01 PM
  121. assistant Opus 4.8 115.4K tok $0.08 6/6/2026, 11:29:01 PM
    • Edit /home/megaproxy/claude/research/llm-box-uk/memory.md 27ms 160 B
  122. assistant Opus 4.8 6/6/2026, 11:29:05 PM
    Now commit and push:
  123. assistant Opus 4.8 115.7K tok $0.07 6/6/2026, 11:29:06 PM
    • Bash git add -A && git commit -q -m "Add unified-memory & MoE builds (K-M): Mac Studio, Strix Halo, EPYC MoE" && git push -q … 14.4s 9 B
  124. user 6/6/2026, 11:29:20 PM
  125. system 6/6/2026, 11:29:38 PM
  126. assistant Opus 4.8 116.5K tok $0.08 6/6/2026, 11:29:38 PM
    Done. Committed and pushed.
    
    ## What I added — builds K, L, M (live-priced 2026-06-07)
    
    | # | Build | Capacity | Bandwidth | Total | The angle |
    |---|---|---|---|---|---|
    | **K** | Mac Studio M1 Ultra (used) | 128GB unified | ~800 GB/s | **~£1,950** | Quiet, near-3090 bandwidth, 5× the capacity. 70B fast, 120B Q4. |
    | **L** | GMKtec EVO-X2 (Ryzen AI Max+ 395) | 128GB unified | ~256 GB/s | **£1,620** | Cheapest 128GB box, ~120W, x86. Best on MoE + mid models. |
    | **M** | EPYC 7532 + 384GB DDR4 | 384GB system RAM | ~200 GB/s | **~£1,375** | The MoE trick — runs a 235B model with no GPU at all. |
    
    **Price confidence:** L is rock-solid (£1,619.99 direct from gmktec.uk). K is a range (~£1,600 eBay auction → ~£2,300+ certified refurb; I anchored ~£1,950 for a typical used buy-it-now). M is assembled from eBay UK server parts, so the £1,375 is a build estimate, not a single SKU — I flagged that and the ~20% import-VAT risk on China RAM/CPU bundles.
    
    **Where they landed in the doc:** new `[3·5]` section between the GPU builds and the verdict, with a framing box explaining the unified-memory vs MoE distinction; rows added to the `[00]` index table (with a divider and `*`/`†` footnote making clear this is shared/system memory, not discrete VRAM); three rows added to the verdict table; hero, nav, and footer methodology updated; `CHECKED 2026-06-05` kept with a `+UNIFIED/MoE 2026-06-07` tag since I didn't re-verify the A–J GPU prices this pass.
    
    Two things worth flagging:
    - The honest caveat baked into all three cards: these are **inference-first** — they buy capacity and quiet, not training or raw speed. Bandwidth, not capacity, caps token rate (Apple 800 / Strix Halo 256 / DDR4 200 GB/s).
    - I did **not** add the **Arc B60 24GB / Dual 48GB** builds we discussed earlier — you only asked for these three. Want me to add the B60 Dual 48GB as build N? It's the one Arc config that genuinely competes (48GB one-slot, ~£1,400, vs 2× used 3090 at ~£1,640).