Pliops expands AI's context windows with 3D NAND-based accelerator – can accelerate certain inference workflows by up to eight times
As language models grow in complexity and their context windows expand, GPU-attached high-bandwidth memory (HBM) becomes a bottleneck, forcing systems to repeatedly recompute context data that no longer fits in onboard HBM. Pliops addresses this challenge with its XDP LightningAI device and FusIOnX software, which store precomputed context on fast SSDs and retrieve it when needed, reports Blocks and Files.


Pliops claims its XDP LightningAI card and FusIOnX software accelerate large language model inference by offloading context data to SSDs, reducing redundant computation, and boosting vLLM throughput by up to eight times while avoiding the need for additional GPUs.
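The idea of avoiding redundant computation by persisting precomputed context can be sketched in a few lines. The example below is an illustrative toy, not Pliops' actual API or on-device format: it keys cached context state by a hash of the prompt prefix, writes it to disk (standing in for a fast SSD tier), and recomputes only on a cache miss. The names `DiskKVCache` and `run_inference` are hypothetical.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path


class DiskKVCache:
    """Toy prefix-keyed cache: persists "precomputed context" (a stand-in
    for transformer KV tensors) on disk so it can be fetched instead of
    recomputed. Hypothetical sketch, not the XDP LightningAI interface."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, prompt_prefix: str) -> Path:
        # Content-address the cache entry by hashing the prompt prefix.
        key = hashlib.sha256(prompt_prefix.encode("utf-8")).hexdigest()
        return self.root / f"{key}.kv"

    def put(self, prompt_prefix: str, kv_state) -> None:
        self._path(prompt_prefix).write_bytes(pickle.dumps(kv_state))

    def get(self, prompt_prefix: str):
        p = self._path(prompt_prefix)
        return pickle.loads(p.read_bytes()) if p.exists() else None


def run_inference(prompt, cache, compute_fn, stats):
    """Serve from the disk cache when possible; otherwise compute the
    context state once, store it, and return it."""
    kv = cache.get(prompt)
    if kv is None:
        stats["recomputes"] += 1  # miss: pay the compute cost once
        kv = compute_fn(prompt)
        cache.put(prompt, kv)
    return kv
```

In a real deployment the cached object would be the model's key/value attention state and the storage tier would be NVMe-class flash behind the accelerator; the throughput gain comes from the same miss-once, hit-thereafter pattern shown here.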