Myrtle.ai Achieves 5.1 Microsecond Latency in Financial LSTM Inference Benchmark with VOLLO

PRNewswire

Myrtle.ai, a recognized leader in accelerating machine learning inference, today announced that a stack featuring its VOLLO™ product has recently been audited by STAC®, a leading benchmark authority for the finance industry.[1] This is the first FPGA-based solution with published results for the Tacana Suite of STAC-ML™ and it achieved incredible latency results. The STAC-ML Markets (Inference) benchmark represents the needs of capital markets firms using machine learning inference to respond rapidly to changes in the markets.

Created by the STAC Benchmark Council™, which includes the world’s largest global banks, brokerages, exchanges, hedge funds, proprietary trading shops, and asset managers, as well as more than 50 leading technology vendors, the STAC-ML™ Markets (Inference) benchmarks are a tool used to objectively compare different platforms for latency, throughput, efficiency and quality in machine learning (ML) inference.

VOLLO achieved latencies as low as 5.08 microseconds with a throughput over 800k inferences/second.[2] Such low latency and high throughput enable users to make more intelligent decisions using more complex models faster than in the past, giving them a competitive advantage in trading, risk analysis, quotes and many other trading-related activities. Designed for simple and quick installation on co-located servers, VOLLO also enables reductions in rack space and energy consumption, two premium resources in such locations.
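As a rough illustration of how the two headline figures relate, the sketch below converts the quoted throughput into an average interval between completed inferences and applies Little's law to estimate how many inferences would be in flight at once. It assumes, purely for illustration, that the 5.08 microsecond latency and the 800k inferences/second throughput describe the same configuration, which the release does not state; the function names are hypothetical.

```python
# Figures quoted in the release. Treating them as belonging to the same
# benchmark configuration is an ASSUMPTION made only for this illustration.
LATENCY_US = 5.08            # lowest reported latency, in microseconds
THROUGHPUT_PER_S = 800_000   # "over 800k inferences/second" (lower bound)


def mean_interval_us(throughput_per_s: float) -> float:
    """Average time between completed inferences, in microseconds."""
    return 1e6 / throughput_per_s


def implied_in_flight(latency_us: float, throughput_per_s: float) -> float:
    """Little's law: average items in flight = throughput x latency."""
    return throughput_per_s * latency_us * 1e-6


if __name__ == "__main__":
    print(f"mean interval: {mean_interval_us(THROUGHPUT_PER_S):.2f} us")
    print(f"implied in-flight inferences: "
          f"{implied_in_flight(LATENCY_US, THROUGHPUT_PER_S):.2f}")
```

Under that assumption, a completed inference every 1.25 microseconds alongside a latency of about 5 microseconds would imply roughly four inferences being processed concurrently. The audited STAC Report remains the authoritative source for how each figure was measured.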

At the STAC Summit London on 4th May, where the results were announced by STAC, Myrtle.ai gave a short talk about how financial firms with no FPGA skills whatsoever can benefit from the extremely low latencies achievable with an FPGA-based product such as VOLLO.

VOLLO runs on a standard form factor PCIe accelerator card with Intel Agilex® 7 FPGAs, up to 4 of which can be installed in a 1U server. VOLLO can be configured with customer-trained models, utilizing model architectures from the LSTM model zoo, enabling users to deploy a range of workloads specific to their application requirements via an export process from standard ML tool flows. It can support up to 12 parallel models per FPGA accelerator card installed in the system, enabling a maximum of 48 parallel models for a system with 4 cards installed.
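The capacity figures above follow from simple multiplication, sketched here for clarity. The constants come from the release; the function name is hypothetical and is not part of any VOLLO tooling.

```python
# Limits stated in the release.
MODELS_PER_CARD = 12   # parallel models supported per FPGA accelerator card
MAX_CARDS_PER_1U = 4   # accelerator cards that fit in a 1U server


def max_parallel_models(cards: int) -> int:
    """Maximum parallel models for a 1U server with `cards` cards installed."""
    if not 1 <= cards <= MAX_CARDS_PER_1U:
        raise ValueError(f"cards must be between 1 and {MAX_CARDS_PER_1U}")
    return cards * MODELS_PER_CARD


if __name__ == "__main__":
    for cards in range(1, MAX_CARDS_PER_1U + 1):
        print(f"{cards} card(s): up to {max_parallel_models(cards)} parallel models")
```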

The FPGA at the heart of VOLLO is the Intel Agilex 7 FPGA F-Series, the industry’s first FPGA family containing hardened, native bfloat16 support. For AI applications requiring low latency, like those in the STAC-ML benchmark, use of hardened bfloat16 decreases latency and increases throughput.

Peter Baldwin, CEO of Myrtle.ai, said, “Since announcing our first set of STAC-ML benchmark results in December, we’re now engaging with firms wishing to gain a competitive edge in responding quickly to real-time market data. These companies know they want the latency advantage that Intel FPGAs provide but often they don’t have the FPGA designers they would need to create their own solutions. VOLLO solves that problem.”
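For readers unfamiliar with the bfloat16 format mentioned above, the following minimal sketch shows its relationship to standard 32-bit floats: a bfloat16 value keeps the sign bit, the 8 exponent bits and the top 7 mantissa bits of a float32. The sketch converts by simple truncation, which is an assumption chosen for brevity; hardware implementations commonly round instead, and nothing here describes how VOLLO or the Agilex 7 device performs the conversion. The function names are hypothetical.

```python
import struct


def float32_to_bfloat16_bits(x: float) -> int:
    """Return the 16-bit bfloat16 pattern for x, using truncation.

    ASSUMPTION: the low 16 bits of the float32 encoding are simply dropped;
    real hardware typically applies a rounding mode instead.
    """
    (bits32,) = struct.unpack(">I", struct.pack(">f", x))
    return bits32 >> 16


def bfloat16_bits_to_float(bits16: int) -> float:
    """Expand a 16-bit bfloat16 pattern back to a Python float."""
    (value,) = struct.unpack(">f", struct.pack(">I", (bits16 & 0xFFFF) << 16))
    return value


if __name__ == "__main__":
    for x in (1.0, 3.14159, -0.001):
        bits = float32_to_bfloat16_bits(x)
        print(f"{x!r:>10} -> 0x{bits:04X} -> {bfloat16_bits_to_float(bits)!r}")
```

Because bfloat16 keeps the full float32 exponent, it covers the same numeric range as float32 with reduced precision, which is one reason the format is widely used for ML inference.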

“Myrtle.ai’s outstanding results from the STAC-ML Markets (Inference) benchmark attest to its combined FPGA and ML expertise, and we’re pleased that they chose Intel Agilex 7 FPGAs to showcase the value of the VOLLO product,” says Jim Dworkin, Vice President and General Manager of the Cloud & Enterprise Acceleration Division in Intel’s Programmable Solutions Group. “All of the financial services firms in the vast ecosystem that STAC cultivates will benefit from the low latency and high throughput efficiency that VOLLO offers, enabling those firms to make more intelligent trading decisions.”

“STAC benchmarks are specified by financial firms based on their business needs, and they designed this benchmark to compare inference performance across multiple architectures,” commented STAC President, Peter Nabicht. “Myrtle.ai has now delivered low-latency results in both STAC-ML inference benchmark suites, highlighting the capabilities of FPGA technology and providing valuable performance data to financial firms dealing with rapidly evolving markets.”

VOLLO is available today and evaluations can be arranged. For more details go to myrtle.ai/fintech or contact Myrtle.ai today at fintech@myrtle.ai.

More information about STAC and the STAC Benchmark Council can be found at www.STACresearch.com. The full benchmark results are available in the STAC Report (SUT ID MRTL230426) at www.STACresearch.com/MRTL230426.