Allen Kuo (kwyshell) – Medium

Allen Kuo (kwyshell)

6d ago

Speculative Decoding for Local LLMs Was a Mixed Bag. Then DFlash Landed in vLLM

An audit of MTP, DFlash, PFlash, and CPU-MoE offload across vLLM, llama.cpp, and Ollama on a desktop Blackwell.

May 3

Choosing a Real-Time Whisper Engine

Testing CT2, TheWhisper, and whisper.cpp for live speech-to-text, and why the pipeline mattered as much as the engine

Apr 29

Building My Own Spec-Driven (SDD) Workflow (Traditional Chinese edition)

Spec-Driven Development · Agentic Engineering

Apr 29

Building My Own Spec-Driven Workflow

Spec-Driven Development · Agentic Engineering

Apr 27

Bringing Up a Secure Embedded Linux Platform on Arm Cortex Chip

From BL1 to OP-TEE, U-Boot, Linux, NAND, recovery, and production delivery

Apr 26

GPU-Resident Frame Interpolation on Android

RIFE, NCNN, Vulkan, and the Real Pipeline Problem

Apr 26

Driverless USB Proxy over SCSI and CD-ROM Emulation

A historical note on turning a constrained CD-ROM path into a client/server redirect route

Apr 18

Qwen3.6-35B-A3B on Desktop Blackwell: The First Time vLLM Beats Ollama on Decode

Third in a series. Previous articles: Gemma 4 on vLLM vs Ollama · NVFP4 Completion

Apr 16

Finishing What We Started: Gemma 4 NVFP4 on vLLM, Desktop Blackwell, WSL2

A follow-up to Gemma 4 on vLLM vs Ollama: Benchmarks on a 96 GB Blackwell GPU.

Apr 12

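When Visual Studio 2026 Stole My vcpkg

A deep investigation that began with VCPKG_ROOT being quietly swapped out and ended at the CodeView binary level. And one thing I learned along the way: headers are the ABI contract, not cl.exe.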
