On Latency

What is latency?

Traditional approach to latency in a multi-tier web application

Latency profiles with generative AI

Input Latency

Commit Latency

Resource Acquisition Latency

Inference Startup Latency

Chunking Latency

Display Latency