Background & Announcement

On April 23, 2026, Hsinchu-based Skymizer Taiwan Inc. publicly unveiled the HTX301 inference chip ahead of Computex 2026. The company describes the HTX301 as the first reference chip of its HyperThought™ IP, a software/hardware co-design platform originally introduced at Computex 2025. Tech outlets including TechRadar, Wccftech, and Digital Citizen Life picked up the story in the weeks that followed, with most coverage appearing around May 7–10, 2026.

The announcement positions the HTX301 as a direct challenge to GPU-centric inference stacks from Nvidia and AMD, built on a different architectural philosophy: instead of chasing peak compute with bleeding-edge nodes and expensive HBM, Skymizer pairs a mature 28 nm process with commodity LPDDR4 and LPDDR5 memory in a card optimized specifically for the memory-bandwidth-intensive decode phase of LLM inference.

Architecture & Design Philosophy

The HTX301 PCIe card integrates six HTX301 chips working collaboratively, delivering a combined memory capacity of up to 384 GB. That is enough to hold the weights of a 700-billion-parameter model in full only at roughly 4 bits per weight (about 350 GB); at 16-bit precision the same weights would need about 1.4 TB. The underlying LPU (Language Processing Unit) IP is purpose-built for large-model inference, with core optimizations around decode acceleration and unified prefill/decode orchestration.
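The capacity claim can be sanity-checked with simple arithmetic. The sketch below is illustrative only: it counts weights alone (no KV cache, activations, or runtime overhead), uses decimal gigabytes, and the bit widths it tries are assumptions, since the article does not state the precision Skymizer has in mind. The function names are hypothetical.

```python
# Weight-only footprint check for the figures quoted in the article.
# Assumptions (not from Skymizer): decimal GB, uniform bits per weight,
# no allowance for KV cache, activations, or runtime overhead.

CARD_CAPACITY_GB = 384   # combined capacity per card, from the article
CHIPS_PER_CARD = 6       # chips per card, from the article
PARAMS = 700e9           # 700-billion-parameter model, from the article


def weight_footprint_gb(params: float, bits_per_weight: float) -> float:
    """Size of the weights in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


def max_bits_per_weight(params: float, capacity_gb: float) -> float:
    """Largest average bit width at which the weights still fit."""
    return capacity_gb * 1e9 * 8 / params


per_chip_gb = CARD_CAPACITY_GB / CHIPS_PER_CARD
print(f"Memory per chip: {per_chip_gb:.0f} GB")

for bits in (16, 8, 4):
    size = weight_footprint_gb(PARAMS, bits)
    verdict = "fits" if size <= CARD_CAPACITY_GB else "does not fit"
    print(f"{bits:>2}-bit weights: {size:7.0f} GB -> {verdict}")

print(f"Break-even precision: {max_bits_per_weight(PARAMS, CARD_CAPACITY_GB):.2f} bits/weight")
```

Under these assumptions, only the 4-bit case fits within 384 GB, and the headroom left for KV cache would be small.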

Critically, Skymizer forgoes HBM (High Bandwidth Memory) and GDDR entirely in favor of LPDDR4/LPDDR5. While these memory standards offer far lower raw bandwidth per device, the company offsets this through dedicated weight and KV-cache compression technology, claiming less than 0.06% perplexity loss from compression. The result, per official figures, is inference performance 9% to 17.8% higher than the open-source llama.cpp framework, at a nominal bandwidth of 100 GB/s.

“Inference has become the dominant AI workload, and infrastructure needs to reflect that reality.”
— William Wei, Chief Marketing Officer, Skymizer

Skymizer also claims its LPU design reaches 30 tokens per second with only 0.5 TOPS of compute at 100 GB/s of bandwidth — a figure that underlines the decode-first, memory-centric character of the chip. For Llama 2 7B prefill, the company states an octa-core LPU configuration can achieve 240 tokens per second, scaling to 1,200 tokens per second through multi-chip orchestration.
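A rough bandwidth-bound estimate shows why these numbers are plausible for a memory-centric design. The sketch below assumes that decode reads every weight once per token and costs about two operations (multiply and add) per weight. It also assumes a 7-billion-parameter model at about 4 bits per weight; the article does not say which model or precision the 30 tokens per second figure refers to, so treat both as illustrative assumptions. The function names are hypothetical.

```python
# Back-of-the-envelope decode estimate. Assumptions (not from Skymizer):
# every weight is read once per generated token, ~2 ops per weight per token,
# and a 7e9-parameter model stored at ~4 bits per weight.

BANDWIDTH_GB_S = 100     # nominal bandwidth, from the article
CLAIMED_TOK_S = 30       # claimed decode rate, from the article
CLAIMED_TOPS = 0.5       # claimed compute, from the article
ASSUMED_PARAMS = 7e9     # assumption: Llama-2-7B-sized model
ASSUMED_BITS = 4         # assumption: ~4-bit compressed weights


def decode_tokens_per_s(bandwidth_gb_s: float, params: float, bits: float) -> float:
    """Upper bound on decode rate when memory bandwidth is the only limit."""
    bytes_per_token = params * bits / 8
    return bandwidth_gb_s * 1e9 / bytes_per_token


def compute_tops_needed(tokens_per_s: float, params: float, ops_per_weight: float = 2) -> float:
    """Approximate compute required to sustain a given decode rate."""
    return tokens_per_s * params * ops_per_weight / 1e12


bound = decode_tokens_per_s(BANDWIDTH_GB_S, ASSUMED_PARAMS, ASSUMED_BITS)
needed = compute_tops_needed(CLAIMED_TOK_S, ASSUMED_PARAMS)
implied_gb_per_token = BANDWIDTH_GB_S / CLAIMED_TOK_S
prefill_scaling = 1200 / 240  # multi-chip vs. octa-core prefill, from the article

print(f"Bandwidth-bound decode rate: {bound:.1f} tokens/s")
print(f"Compute needed for {CLAIMED_TOK_S} tokens/s: {needed:.2f} TOPS (claimed: {CLAIMED_TOPS})")
print(f"Data read per token implied by the claim: {implied_gb_per_token:.2f} GB")
print(f"Prefill scaling from multi-chip orchestration: {prefill_scaling:.0f}x")
```

Under these assumptions the bandwidth bound lands just below 30 tokens per second, and the compute needed stays under 0.5 TOPS. That is consistent with a design where decode is limited by memory traffic rather than arithmetic, though it does not verify the vendor figures.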

Power & Deployment Simplicity

At a TDP of approximately 240 W, the HTX301 card is positioned as substantially more power-efficient than comparable high-end AI accelerators when performing similar large-model inference tasks. Because it uses a standard PCIe form factor and operates within conventional thermal envelopes, the HTX301 can be installed directly in ordinary air-cooled servers without modifications to data center power distribution or cooling infrastructure — a meaningful deployment advantage for enterprises considering on-premise AI.

Skymizer frames this around two enterprise pain points: data sovereignty (eliminating the privacy exposure that comes with cloud inference) and predictable infrastructure cost (fixed hardware investment versus variable cloud billing).

Competitive Context

| Product | Memory | TDP | Memory Type | Status |
| --- | --- | --- | --- | --- |
| Skymizer HTX301 | 384 GB | ~240 W | LPDDR4/5 | Announced |
| NVIDIA RTX PRO 6000 Blackwell | 96 GB | ~600 W* | GDDR7 | Available |
| AMD Instinct MI350P | 288 GB | >480 W* | HBM3 | Available |

* Power figures for competing products are based on Skymizer’s own comparison materials and have not been independently benchmarked for equivalent inference workloads.
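One way to read the comparison is memory capacity per watt of TDP. The sketch below derives that ratio from the figures in the table and nothing else. It is a rough indicator rather than a benchmark: the competitor power figures come from Skymizer's own materials, the AMD figure is listed as ">480 W" (so 480 W is used as a lower bound on power), and capacity per watt says nothing about throughput. The `gb_per_watt` helper is a hypothetical name.

```python
# Memory capacity per watt, derived only from the comparison table.
# Caveats: TDP values are vendor-supplied; the AMD entry is ">480 W",
# so using 480 W makes its ratio an upper bound.

cards = {
    "Skymizer HTX301": (384, 240),
    "NVIDIA RTX PRO 6000 Blackwell": (96, 600),
    "AMD Instinct MI350P": (288, 480),
}


def gb_per_watt(memory_gb: float, tdp_w: float) -> float:
    """Gigabytes of on-card memory per watt of TDP."""
    return memory_gb / tdp_w


ratios = {name: gb_per_watt(mem, tdp) for name, (mem, tdp) in cards.items()}

# Print from highest to lowest capacity per watt.
for name, ratio in sorted(ratios.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<32} {ratio:.2f} GB/W")
```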

⚠ Verification Status

All performance data cited for the HTX301 currently originates solely from Skymizer. No independent third-party benchmarks have been published as of May 10, 2026. Skymizer has stated it will provide a live demonstration at Computex Taipei 2026 (late May) and will open the card to independent verification at that time. Until such results are available, all claims should be treated as vendor-provided figures pending confirmation.

Road to Computex

Skymizer has positioned Computex Taipei 2026 as the first public live demonstration of the HTX301. The company has opened a preview request program (HTX301 Evaluation Platform, P/N HTX301-EVB-R01A-P01) for prospective enterprise customers. Pricing and general commercial availability have not been announced; the Computex showcase is expected to provide a more complete picture of both.

Whether the HTX301 lives up to its headline claims will become clearer once independent reviewers gain hands-on access. If the specifications hold, the card represents a compelling proposition for enterprises seeking frontier-scale LLM inference on-premise, without the cost and complexity of GPU clusters. If they don’t, it joins a long list of AI hardware announcements that looked better on paper. Computex will be the first real test.