AI | Analysis | SUZUKI SHOTEN

3 analyses tagged with "AI"

Gemma 4 MTP on GB10 — Inference 1.83x faster on 26B, 3.52x on 31B Dense

Independent measurement of Google's Gemma 4 MTP drafter on NVIDIA GB10 (GX10): a 1.83x speedup on 26B-A4B and a 3.52x speedup on 31B Dense. No public independent benchmark of 31B Dense with MTP on a single GPU could be found, so this appears to be the first such measurement. The 31B Dense result reproduces and exceeds Google's 'up to 3x' claim on real hardware.

2026-05-07

Decomposing the AI Value Chain in the Agentic Era

Building on SemiAnalysis's 'AI Value Capture', this analysis traces how value in the Agentic AI era propagates upstream — from model labs and NVIDIA to HBM, advanced packaging, equipment, and power/cooling — using detailed causal subgraphs from an internal investment-theme model. Source attribution is kept explicit throughout.

2026-05-02

How DeepSeek V4's 90% KV Cache Reduction Reshapes HBM Demand — From Capacity to Bandwidth, Packaging, and Thermal Control

DeepSeek V4 claims a 90% KV cache reduction at 1M tokens. This analysis argues that HBM demand is not destroyed; rather, its center of value shifts from raw capacity to bandwidth, packaging, and thermal control. The argument is examined across the technology, market, and supply chain layers.

2026-04-27