Abstract: Latency is one of the critical performance metrics for Networks-on-Chips (NoCs). When designing an NoC, designers have to explore an enormous number of design parameters and various traffic patterns, ...
Abstract: Driven by the tremendous demand for real-time data processing in the Internet of Vehicles (IoV), edge computing is envisioned as a promising solution to alleviate the resource limitation on ...
Sarathi-Serve is a high-throughput and low-latency LLM serving framework. Please refer to our OSDI'24 paper for more details. @article{agrawal2024taming, title={Taming Throughput-Latency Tradeoff in ...