This is the official site for the book BPF Performance Tools: Linux System and Application Observability, published by Addison-Wesley (2019).
Today’s software systems are arguably robust at logging and recovering from fail-stop hardware – there is a clear, binary signal that is fairly easy to recognize and interpret. We believe fail-slow hardware is a fundamentally harder problem to solve. It is very hard to distinguish such cases from ones that are caused by software performance issues. It is also evident that many modern, advanced deployed systems do not anticipate this failure mode. We hope that our study can influence vendors, operators, and systems designers to treat fail-slow hardware as a separate class of failures and start addressing them more robustly in future systems.
This post explains Transparent Hugepages (THP) in a nutshell, describes techniques that can be used to measure the performance impact, and shows the effect on a real-world application.
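Before measuring any THP impact, it helps to know which mode the kernel is in. The sketch below is not taken from the linked post; it only parses the text that Linux exposes in /sys/kernel/mm/transparent_hugepage/enabled, where the active mode is the one in square brackets. The function name `active_thp_mode` is a hypothetical helper chosen for this illustration.

```python
import re


def active_thp_mode(text):
    """Return the bracketed (active) THP mode from the sysfs 'enabled' file text.

    The kernel lists the available modes on one line and marks the active
    one with square brackets, e.g. "always [madvise] never".
    Returns None if no bracketed entry is present.
    """
    match = re.search(r"\[(\w+)\]", text)
    return match.group(1) if match else None


# Illustrative input; on a real system, read the sysfs file instead.
sample = "always [madvise] never"
print(active_thp_mode(sample))
```

On a Linux host with THP support, the same function can be fed the result of `open("/sys/kernel/mm/transparent_hugepage/enabled").read()`.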
Easy flamegraphs for Rust projects and everything else, without Perl or pipes.
Network performance and utilization will affect the general application throughput.
Check if you are hitting network bandwidth limits.
Protocol compression can improve the results if you are limited by network bandwidth, but it can also make things worse if you are not.
SSL encryption carries a penalty of roughly 10% with a low number of threads, and it does not scale for high-concurrency workloads.
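One way to check whether a host is close to its bandwidth limit is to sample the interface byte counters twice and compare the resulting rate with the link speed. The sketch below is an illustration, not the method of the linked article: it assumes the standard Linux /proc/net/dev layout (two header lines, then per-interface rows with received bytes in the first column and transmitted bytes in the ninth). The function names, the sample text, and the 1 Gbit/s link speed are assumptions made for the example.

```python
def parse_net_dev(text):
    """Map interface name -> (rx_bytes, tx_bytes) from /proc/net/dev text."""
    counters = {}
    # The first two lines of /proc/net/dev are column headers.
    for line in text.splitlines()[2:]:
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = data.split()
        # Field 0 is received bytes, field 8 is transmitted bytes.
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters


def utilization(before, after, interval_s, link_bits_per_s):
    """Return (rx, tx) utilization as fractions of the link speed."""
    rx_bits = (after[0] - before[0]) * 8 / interval_s
    tx_bits = (after[1] - before[1]) * 8 / interval_s
    return rx_bits / link_bits_per_s, tx_bits / link_bits_per_s


# Illustrative snapshot text, not real measurements.
sample = (
    "Inter-|   Receive                 |  Transmit\n"
    " face |bytes packets errs drop fifo frame compressed multicast"
    "|bytes packets errs drop fifo colls carrier compressed\n"
    "    lo: 1000 10 0 0 0 0 0 0 2000 20 0 0 0 0 0 0\n"
    "  eth0: 5000 50 0 0 0 0 0 0 7000 70 0 0 0 0 0 0\n"
)
snapshot = parse_net_dev(sample)

# Two hypothetical samples taken 1 second apart on an assumed 1 Gbit/s link.
rx_util, tx_util = utilization((0, 0), (125_000_000, 12_500_000), 1.0, 1e9)
print(snapshot["eth0"], rx_util, tx_util)
```

A receive or transmit fraction that stays near 1.0 suggests the link itself is the bottleneck, which is the case where protocol compression is most likely to help.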
Critical but oft-neglected service metrics that every SRE and product owner should care about.
Resource pressure metrics from the Linux kernel.
Gain insight into resource utilization with new Linux kernel pressure metrics and related tools.
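The kernel exposes these pressure metrics (PSI) as small text files under /proc/pressure/ (cpu, memory, io). Each line starts with `some` or `full`, followed by `avg10`, `avg60`, `avg300`, and `total` key=value pairs. The sketch below shows one way to parse that format; `parse_pressure` is a hypothetical helper name and the sample values are made up for illustration.

```python
def parse_pressure(text):
    """Parse the text of a /proc/pressure/<resource> file.

    Returns a dict such as {"some": {"avg10": 1.5, ...}, "full": {...}}.
    avg10/avg60/avg300 are percentages of time stalled over the last
    10, 60 and 300 seconds; total is cumulative stall time in microseconds.
    """
    result = {}
    for line in text.strip().splitlines():
        kind, *fields = line.split()
        result[kind] = {
            key: float(value)
            for key, value in (field.split("=", 1) for field in fields)
        }
    return result


# Illustrative content; on a PSI-enabled kernel, read /proc/pressure/memory.
sample = (
    "some avg10=1.50 avg60=0.75 avg300=0.10 total=123456\n"
    "full avg10=0.00 avg60=0.00 avg300=0.00 total=0\n"
)
pressure = parse_pressure(sample)
print(pressure["some"]["avg10"])
```

On a host with PSI enabled, passing `open("/proc/pressure/memory").read()` to the same function gives the live values.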