Page-level memory management for NVIDIA GPUs

My objective is to monitor the memory access patterns of generative AI workloads and then develop optimization strategies for managing memory cost across multiple memory tiers, for example: GPU memory > CPU/system memory > CXL memory > disk swap.

On the CPU side, DAMON enables address-level observation of memory access patterns for data-intensive workloads, which is particularly useful when DRAM capacity is limited. I would like to observe memory access patterns at the same address-level granularity for generative AI workloads running on the GPU. Is there a method or tool with capabilities comparable to DAMON's that is tailored to NVIDIA GPUs?
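For context on the management side (not the monitoring side I am asking about): the closest existing knobs I know of for steering pages between the GPU and CPU tiers are CUDA's unified-memory hint APIs, `cudaMemAdvise` and `cudaMemPrefetchAsync`. A minimal sketch of the kind of tier placement I would eventually want to drive with access-pattern data, assuming a single CUDA-capable device, might look like:

```cuda
#include <cuda_runtime.h>
#include <cstdio>

int main() {
    const size_t bytes = 1 << 26;  // 64 MiB managed (unified-memory) allocation
    float *buf = nullptr;
    if (cudaMallocManaged(&buf, bytes) != cudaSuccess) {
        fprintf(stderr, "cudaMallocManaged failed\n");
        return 1;
    }

    int device = 0;
    cudaGetDevice(&device);

    // Hint: prefer keeping this region resident in GPU memory (top tier)...
    cudaMemAdvise(buf, bytes, cudaMemAdviseSetPreferredLocation, device);
    // ...and eagerly migrate it there.
    cudaMemPrefetchAsync(buf, bytes, device);

    // Later, when the region turns cold, demote it to host memory
    // (the next tier down) by re-advising and prefetching to the CPU.
    cudaMemAdvise(buf, bytes, cudaMemAdviseSetPreferredLocation, cudaCpuDeviceId);
    cudaMemPrefetchAsync(buf, bytes, cudaCpuDeviceId);

    cudaDeviceSynchronize();
    cudaFree(buf);
    return 0;
}
```

These hints only control placement; deciding *when* to demote a region is exactly where DAMON-style access monitoring would come in, which is what I am missing on the GPU side.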