The engineering challenges of designing for low-latency execution include tightly controlling the time it takes to detect the onset of a latency excursion and to diagnose its most likely cause. In modern X-as-a-service (XaaS) forms of distributed applications, the points at which a service consumer experiences latency are separated from the underlying system hardware by many layers of modular abstraction. This separation makes it difficult to pinpoint the causes of latency pushouts and to apply corrective actions in a timely manner. The classic performance methodology of profiling 'cycles' of work may be broadly successful in explaining sustained increases in latency, but it is not very effective in determining the causes of short-duration latency surges. To determine those causes, it is frequently necessary to:
• trace execution
• pinpoint when a significant latency stretch-out occurs
• establish its correlation with a nearby precursor event or a set of precursor events

Each of these steps can incur significant overhead; further, even modest overheads from tracing risk contributing to tail latencies. Both the detection of the onset of a latency excursion and the identification of why it occurs must be completed quickly so that, if a corrective action is possible, it can be taken promptly. Similarly, if no recourse to curb the latency of a slice of computation is available at some point in time, then steps to minimize the impact of the exception should be put into effect as early as possible.

In our talk, we present an approach that complements the very low overhead software tracing provided by KUtrace. It uses eBPF to trigger the collection of additional data, at very low overhead, from the hardware performance monitoring unit (PMU) so that latency excursions within a span of execution can be examined in a timely manner.
We will describe the use of PMU capabilities like precise event-based sampling (PEBS) and timed last branch records (Timed LBRs) in close proximity to events of interest to extract critical clues. We will further discuss planned future work to integrate in-band network telemetry (INT) into these tracing flows.
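
As a rough illustration of the pinpoint and correlate steps named earlier, the following minimal sketch works offline on an already collected trace. It is not the KUtrace, eBPF, or PMU mechanism from the talk: the function names, the median-based threshold, the look-back window, and the sample data are all hypothetical choices made for this example.

```python
from bisect import bisect_left, bisect_right
from statistics import median


def find_excursions(spans, factor=3.0):
    """Return the (start, end) spans whose duration exceeds
    `factor` times the median duration of all spans.
    The median baseline and the factor are assumptions of this sketch."""
    baseline = median(end - start for start, end in spans)
    return [(s, e) for s, e in spans if (e - s) > factor * baseline]


def precursor_events(span, events, window):
    """Return events timestamped from `window` time units before the
    span starts up to the span's end. `events` must be a list of
    (timestamp, name) tuples sorted by timestamp."""
    start, end = span
    times = [t for t, _ in events]
    lo = bisect_left(times, start - window)
    hi = bisect_right(times, end)
    return events[lo:hi]


# Hypothetical trace: request spans and system events, in microseconds.
spans = [(0, 100), (200, 310), (400, 1400), (1500, 1595), (1700, 1805)]
events = [
    (150, "timer_irq"),
    (390, "page_fault"),
    (450, "cpu_migration"),
    (1600, "timer_irq"),
]

for span in find_excursions(spans):
    print(span, precursor_events(span, events, window=50))
```

With this sample data, only the span (400, 1400) is flagged, and the events at 390 and 450 are reported as candidate precursors. A production detector would have to run online and within a tight overhead budget, which is the problem the approach in the talk addresses.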