DSP chip and IP company VSORA SA (Paris, France) has announced Jotunn, its processor platform for generative AI inferencing.
Jotunn is expected to become available in 2024. The architecture is intended to overcome the “memory wall” that leaves current processors idle most of the time, waiting for data, when running generative AI software. No information was given about the manufacturing process node VSORA is targeting.
The recently introduced generative AI model GPT-3.5 has 175 billion parameters, and GPT-4 reportedly has almost 2 trillion. With a traditional hierarchical memory model, latency increases as data moves through the hierarchy; VSORA asserts that the efficiency of running GPT-4 drops to around 3 percent, leaving thousands of processors idle 97 percent of the time.
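The memory-wall effect can be illustrated with a rough roofline-style calculation: when a model's weights must be streamed from off-chip memory for every decode step, achievable utilization is bounded by memory bandwidth, not peak compute. The sketch below is illustrative only; the parameter count, bandwidth, batch size and peak-compute figures are assumptions chosen to be plausible for a GPT-4-scale model on a current accelerator, not numbers from VSORA.

```python
# Roofline-style sketch of the "memory wall" (all figures are assumptions).
PARAMS = 1.8e12         # ~GPT-4-scale parameter count (assumed)
BYTES_PER_PARAM = 2     # fp16 weights (assumed)
PEAK_TFLOPS = 1000      # accelerator peak fp16 throughput (assumed)
HBM_BW_TBPS = 3.35      # off-chip memory bandwidth, TB/s (assumed)
BATCH = 8               # concurrent sequences sharing one weight read (assumed)

# Per decode step, a dense model does ~2 FLOPs per parameter per sequence
# (multiply + accumulate), while the weights are read from memory once.
flops_per_step = 2 * PARAMS * BATCH
bytes_per_step = PARAMS * BYTES_PER_PARAM

time_compute = flops_per_step / (PEAK_TFLOPS * 1e12)  # seconds if compute-bound
time_memory = bytes_per_step / (HBM_BW_TBPS * 1e12)   # seconds if bandwidth-bound

# Utilization: fraction of peak compute actually usable when the slower
# of the two (here, memory) sets the pace.
utilization = time_compute / max(time_compute, time_memory)
print(f"compute-bound time : {time_compute * 1e3:.1f} ms/step")
print(f"memory-bound time  : {time_memory * 1e3:.1f} ms/step")
print(f"peak utilization   : {utilization:.1%}")
```

Under these assumptions the memory-bound time dominates by a wide margin and utilization lands in the low single digits, which is the regime VSORA's 3 percent figure describes; keeping the working set on-chip removes the bandwidth bottleneck from this equation.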
Jotunn is a scalable chip architecture designed to accompany a host processor and interface to high-bandwidth memory (HBM). The Jotunn-4 implementation, with four processor cores, will provide 192Gbytes of on-chip memory and be capable of between 12 and 3,200 TFLOPS of performance, depending on the data type. Power consumption is rated at 100W peak in a 45mm-by-45mm package.
ChatGPT, based on GPT-3.5, can be handled by the Jotunn-4 entirely on-chip, cutting power consumption by more than an order of magnitude compared with competitors. VSORA claims the Jotunn-4 achieves efficiencies of more than 50 percent for both GPT-3.5 and GPT-4.
Jotunn was initially designed as a low-power, low-cost, high-performance chip architecture for Level 3 to Level 5 autonomous vehicles (see VSORA introduces Tyr chip for autonomous driving). VSORA has since extended the architecture to accelerate generative AI applications.