Intel(r) Performance Counter Monitor
CPU Performance Monitor. More...
#include <cpucounters.h>
Classes | |
struct | CustomCoreEventDescription |
Custom Core event description. More... | |
struct | ExtendedCustomCoreEventDescription |
Extended custom core event description. More... | |
Public Types | |
enum | { MAX_C_STATE = 10 } |
enum | ProgramMode { DEFAULT_EVENTS = 0, CUSTOM_CORE_EVENTS = 1, EXT_CUSTOM_CORE_EVENTS = 2, INVALID_MODE } |
Mode of programming (parameter in the program() method) More... | |
enum | ErrorCode { Success = 0, MSRAccessDenied = 1, PMUBusy = 2, UnknownError } |
Return codes (e.g. for program(..) method) | |
enum | SupportedCPUModels { NEHALEM_EP = 26, NEHALEM = 30, ATOM = 28, ATOM_2 = 53, ATOM_CENTERTON = 54, ATOM_BAYTRAIL = 55, ATOM_AVOTON = 77, CLARKDALE = 37, WESTMERE_EP = 44, NEHALEM_EX = 46, WESTMERE_EX = 47, SANDY_BRIDGE = 42, JAKETOWN = 45, IVY_BRIDGE = 58, HASWELL = 60, HASWELL_ULT = 69, HASWELL_2 = 70, IVYTOWN = 62, HASWELLX = 63, BROADWELL = 61, END_OF_MODEL_LIST = 0x0ffff } |
Identifiers of supported CPU models. | |
enum | PCIeEventCode { PCIeRdCur = 0x19E, PCIeNSRd = 0x1E4, PCIeWiLF = 0x194, PCIeItoM = 0x19C, PCIeNSWr = 0x1E5, PCIeNSWrF = 0x1E6, RFO = 0x180, CRd = 0x181, DRd = 0x182, PRd = 0x187, WiL = 0x18F, ItoM = 0x1C8 } |
enum | CBoEventTid { RFOtid = 0x3E, ItoMtid = 0x3E } |
Public Member Functions | |
bool | isCoreCStateResidencySupported (int state) |
Returns true if the specified core C-state residency metric is supported. | |
bool | isPackageCStateResidencySupported (int state) |
Returns true if the specified package C-state residency metric is supported. | |
void | setOutput (const std::string filename) |
Redirects output destination to provided file, instead of std::cout. | |
void | restoreOutput () |
Restores output, closes output file if opened. | |
void | setRunState (int new_state) |
Set Run State. | |
int | getRunState (void) |
Returns program's Run State. | |
bool | isBlocked (void) |
void | setBlocked (const bool new_blocked) |
void | allowMultipleInstances () |
Call it before program() to allow multiple running instances of PCM on the same system. | |
bool | L3CacheOccupancyMetricAvailable () |
Checks if cache monitoring is present. More... | |
unsigned | getMaxRMID () const |
Returns the maximum number of RMIDs supported by the socket. More... | |
bool | good () |
Checks the status of PCM object. More... | |
const std::string & | getErrorMessage () const |
Returns the error message. More... | |
ErrorCode | program (const ProgramMode mode_=DEFAULT_EVENTS, const void *parameter_=NULL) |
Programs performance counters. More... | |
ErrorCode | programServerUncorePowerMetrics (int mc_profile, int pcu_profile, int *freq_bands=NULL) |
Programs uncore power/energy counters on microarchitectures codename SandyBridge-EP and IvyTown. More... | |
void | freezeServerUncoreCounters () |
Freezes uncore event counting (works only on microarchitecture codename SandyBridge-EP and IvyTown) | |
void | unfreezeServerUncoreCounters () |
Unfreezes uncore event counting (works only on microarchitecture codename SandyBridge-EP and IvyTown) | |
ServerUncorePowerState | getServerUncorePowerState (uint32 socket) |
Reads the power/energy counter state of a socket (works only on microarchitecture codename SandyBridge-EP) More... | |
void | cleanup () |
Cleans up resources and stops performance counting. More... | |
void | resetPMU () |
Forces PMU reset. More... | |
void | getAllCounterStates (SystemCounterState &systemState, std::vector< SocketCounterState > &socketStates, std::vector< CoreCounterState > &coreStates) |
Reads all counter states (including system, sockets and cores) More... | |
bool | isCoreOnline (int32 os_core_id) const |
Returns true if the core is online. More... | |
SystemCounterState | getSystemCounterState () |
Reads the counter state of the system. More... | |
SocketCounterState | getSocketCounterState (uint32 socket) |
Reads the counter state of a socket. More... | |
CoreCounterState | getCoreCounterState (uint32 core) |
Reads the counter state of a (logical) core. More... | |
uint32 | getNumCores () |
Reads number of logical cores in the system. More... | |
uint32 | getNumOnlineCores () |
Reads number of online logical cores in the system. More... | |
uint32 | getNumSockets () |
Reads number of sockets (CPUs) in the system. More... | |
uint32 | getThreadsPerCore () |
Reads how many hardware threads a physical core has. A "hardware thread" is a logical core in different terminology. If Intel(r) Hyperthreading(tm) is enabled, this function returns 2. More... | |
bool | getSMT () |
Checks if SMT (HyperThreading) is enabled. More... | |
uint64 | getNominalFrequency () |
Reads the nominal core frequency. More... | |
uint32 | getL3ScalingFactor () |
Runs CPUID.0xF.0x01 to get the L3 upscaling factor used to calculate L3 occupancy. The scaling factor is returned in the EBX register after running the CPU instruction. More... | |
uint32 | getCPUModel () |
Reads CPU model id. More... | |
uint32 | getOriginalCPUModel () |
Reads original CPU model id. More... | |
int32 | getSocketId (uint32 core_id) |
Determines socket of given core. More... | |
uint64 | getQPILinksPerSocket () const |
Returns the number of Intel(r) Quick Path Interconnect(tm) links per socket. More... | |
uint32 | getMCPerSocket () const |
Returns the number of detected integrated memory controllers per socket. | |
uint32 | getMCChannelsPerSocket () const |
Returns the total number of detected memory channels on all integrated memory controllers per socket. | |
uint32 | getMaxIPC () const |
Returns the max number of instructions per cycle. More... | |
uint64 | getPCUFrequency () const |
Returns the frequency of Power Control Unit. | |
uint64 | getTickCount (uint64 multiplier=1000, uint32 core=0) |
Return TSC timer value in time units. More... | |
uint64 | getTickCountRDTSCP (uint64 multiplier=1000) |
Return TSC timer value in time units using rdtscp instruction from current core. More... | |
uint64 | getQPILinkSpeed (uint32 socketNr, uint32 linkNr) const |
Return QPI Link Speed in GBytes/second. More... | |
double | getJoulesPerEnergyUnit () const |
Returns how many joules are in an internal processor energy unit. | |
int32 | getPackageThermalSpecPower () const |
Returns thermal specification power of the package domain in Watt. | |
int32 | getPackageMinimumPower () const |
Returns minimum power derived from electrical spec of the package domain in Watt. | |
int32 | getPackageMaximumPower () const |
Returns maximum power derived from electrical spec of the package domain in Watt. | |
void | disableJKTWorkaround () |
void | programPCIeCounters (const PCIeEventCode event_, const uint32 tid_=0, const uint32 miss_=0) |
Program uncore PCIe monitoring event(s) More... | |
void | programPCIeMissCounters (const PCIeEventCode event_, const uint32 tid_=0) |
PCIeCounterState | getPCIeCounterState (const uint32 socket_) |
Get the state of PCIe counter(s) More... | |
uint64 | extractCoreGenCounterValue (uint64 val) |
uint64 | extractCoreFixedCounterValue (uint64 val) |
uint64 | extractUncoreGenCounterValue (uint64 val) |
uint64 | extractUncoreFixedCounterValue (uint64 val) |
uint64 | extractL3CacheOccupancy (uint64 val) |
const char * | getUArchCodename (int32 cpu_model_=-1) const |
Get a string describing the codename of the processor microarchitecture. More... | |
bool | packageEnergyMetricsAvailable () const |
bool | dramEnergyMetricsAvailable () const |
bool | packageThermalMetricsAvailable () const |
bool | outgoingQPITrafficMetricsAvailable () const |
bool | qpiUtilizationMetricsAvailable () const |
bool | memoryTrafficMetricsAvailable () const |
bool | memoryIOTrafficMetricAvailable () const |
bool | hasBecktonUncore () const |
bool | hasPCICFGUncore () const |
Static Public Member Functions | |
static PCM * | getInstance () |
Returns PCM object. More... | |
static bool | initWinRing0Lib () |
Loads and initializes Winring0 third party library for access to processor model specific and PCI configuration registers. More... | |
static std::string | getCPUBrandString () |
Get Brand string of processor. | |
Friends | |
class | BasicCounterState |
class | UncoreCounterState |
CPU Performance Monitor.
This singleton object needs to be instantiated for each process before accessing counting and measuring routines.
enum PCM::ProgramMode |
Mode of programming (parameter in the program() method)
Enumerator | |
---|---|
DEFAULT_EVENTS |
Default choice of events, the additional parameter is not needed and ignored |
CUSTOM_CORE_EVENTS |
Custom set of core events specified in the parameter to the program method. The parameter must be a pointer to array of four |
EXT_CUSTOM_CORE_EVENTS |
Custom set of core events specified in the parameter to the program method. The parameter must be a pointer to a |
INVALID_MODE |
Non-programmed mode |
void PCM::cleanup | ( | ) |
Cleans up resources and stops performance counting.
Call this method when your program finishes or when you no longer need the performance counting routines.
Referenced by exit_cleanup().
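One way to make sure a cleanup call is not forgotten on early returns is a scope guard. The sketch below is only an illustration: `ScopeGuard` and `runMeasuredSection` are hypothetical names that are not part of PCM, and the callback merely increments a counter so the example compiles without the PCM headers. In a real program the callback would call `PCM::getInstance()->cleanup()`.

```cpp
#include <functional>
#include <utility>

// Hypothetical helper (not part of PCM): runs a callback once
// when the guard object goes out of scope.
class ScopeGuard {
public:
    explicit ScopeGuard(std::function<void()> onExit) : onExit_(std::move(onExit)) {}
    ~ScopeGuard() { if (onExit_) onExit_(); }
    ScopeGuard(const ScopeGuard&) = delete;
    ScopeGuard& operator=(const ScopeGuard&) = delete;
private:
    std::function<void()> onExit_;
};

// Returns how many times the cleanup callback fired for one guarded scope.
// The lambda stands in for a call such as PCM::getInstance()->cleanup().
inline int runMeasuredSection() {
    int cleanups = 0;
    {
        ScopeGuard guard([&cleanups] { ++cleanups; });
        // ... measured work would go here ...
    }   // guard is destroyed here and the callback runs
    return cleanups;
}
```

The guard is non-copyable so the callback cannot run twice by accident.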
void PCM::getAllCounterStates | ( | SystemCounterState & | systemState, |
std::vector< SocketCounterState > & | socketStates, | ||
std::vector< CoreCounterState > & | coreStates | ||
) |
Reads all counter states (including system, sockets and cores)
systemState | system counter state (return parameter) |
socketStates | socket counter states (return parameter) |
coreStates | core counter states (return parameter) |
References isCoreOnline().
CoreCounterState PCM::getCoreCounterState | ( | uint32 | core | ) |
Reads the counter state of a (logical) core.
Be aware that during the measurement other threads may be scheduled on the same core by the operating system (this is called context-switching). The performance events caused by these threads will be counted as well.
core | core id |
Returns the state of the counters in the core.
Referenced by getCoreCounterState(), and getTickCount().
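Counter states are typically read twice, before and after the code of interest, and a metric is derived from the difference. The sketch below shows that pattern with a hypothetical `CoreSnapshot` struct and `ipcBetween` function. These are stand-ins invented for illustration and are not PCM types. As noted above, the difference also includes events from any other threads scheduled on the core during the interval.

```cpp
#include <cstdint>

// Hypothetical stand-in for a counter snapshot (not the PCM CoreCounterState).
struct CoreSnapshot {
    std::uint64_t instructions;  // retired instructions at snapshot time
    std::uint64_t cycles;        // core clock cycles at snapshot time
};

// Instructions per cycle over the interval between two snapshots.
// Returns 0.0 when no cycles elapsed, to avoid dividing by zero.
inline double ipcBetween(const CoreSnapshot& before, const CoreSnapshot& after) {
    const std::uint64_t cycles = after.cycles - before.cycles;
    if (cycles == 0) {
        return 0.0;
    }
    const std::uint64_t instructions = after.instructions - before.instructions;
    return static_cast<double>(instructions) / static_cast<double>(cycles);
}
```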
uint32 PCM::getCPUModel | ( | ) |
inline |
Reads CPU model id.
Referenced by getCyclesLostDueL2CacheMisses(), getCyclesLostDueL3CacheMisses(), getDRAMConsumedJoules(), getL2CacheHitRatio(), getL2CacheHits(), getL2CacheMisses(), getL3CacheHitRatio(), getL3CacheHits(), getL3CacheHitsNoSnoop(), getL3CacheHitsSnoop(), getL3CacheMisses(), and ServerPCICFGUncore::ServerPCICFGUncore().
const std::string & PCM::getErrorMessage | ( | ) | const |
inline |
Returns the error message.
Call this when good() returns false; otherwise an empty string is returned.
PCM * PCM::getInstance | ( | ) |
static |
Returns PCM object.
If the PCM object has not been created yet, an instance is created. PCM is a singleton.
Referenced by ServerPCICFGUncore::computeQPISpeed(), exit_cleanup(), getActiveAverageFrequency(), getAllIncomingQPILinkBytes(), getAllOutgoingQPILinkBytes(), getAverageFrequency(), getConsumedJoules(), getCoreCounterState(), getCoreCStateResidency(), getCoreIPC(), getCyclesLostDueL2CacheMisses(), getCyclesLostDueL3CacheMisses(), getDRAMConsumedJoules(), getIncomingQPILinkUtilization(), getL2CacheHitRatio(), getL2CacheHits(), getL2CacheMisses(), getL3CacheHitRatio(), getL3CacheHits(), getL3CacheHitsNoSnoop(), getL3CacheHitsSnoop(), getL3CacheMisses(), getOutgoingQPILinkBytes(), getOutgoingQPILinkUtilization(), getSocketCounterState(), getSocketIncomingQPILinkBytes(), getSystemCounterState(), getTotalExecUsage(), MySystem(), sigINT_handler(), and sigSTOP_handler().
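The create-on-first-use behavior described for getInstance() can be illustrated with a generic lazy singleton. The `Monitor` class below is hypothetical and only demonstrates the pattern. It makes no claim about how PCM implements its singleton internally.

```cpp
// Hypothetical class illustrating lazy singleton creation; it is not PCM itself.
class Monitor {
public:
    // Creates the instance on the first call and returns the same pointer afterwards.
    static Monitor* getInstance() {
        static Monitor instance;  // initialized once; thread-safe since C++11
        return &instance;
    }
    Monitor(const Monitor&) = delete;
    Monitor& operator=(const Monitor&) = delete;
private:
    Monitor() = default;  // private constructor: only getInstance() can create it
};
```

Every caller receives the same pointer, so state set through one call site is visible from all others in the process.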
uint32 PCM::getL3ScalingFactor | ( | ) |
Runs CPUID.0xF.0x01 to get the L3 upscaling factor used to calculate L3 occupancy. The scaling factor is returned in the EBX register after running the CPU instruction.
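As a rough illustration of how an upscaling factor is applied, the sketch below multiplies a raw occupancy counter reading by the factor. The function name `l3OccupancyBytes` is hypothetical, and the assumption that the product is expressed in bytes follows the CPUID leaf 0xF description in Intel's architecture manual rather than anything stated in this reference.

```cpp
#include <cstdint>

// Hypothetical helper: scales a raw occupancy counter reading by the
// upscaling factor (the kind of value getL3ScalingFactor() returns).
// Assumption: the scaled result is an occupancy in bytes.
inline std::uint64_t l3OccupancyBytes(std::uint64_t rawCount, std::uint32_t scalingFactor) {
    // Widen the factor first so the multiplication is done in 64 bits.
    return rawCount * static_cast<std::uint64_t>(scalingFactor);
}
```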
uint32 PCM::getMaxIPC | ( | ) | const |
inline |
Returns the max number of instructions per cycle.
unsigned PCM::getMaxRMID | ( | ) | const |
Returns the maximum number of RMIDs supported by the socket.
uint64 PCM::getNominalFrequency | ( | ) |
Reads the nominal core frequency.
Referenced by getActiveAverageFrequency(), getAverageFrequency(), getIncomingQPILinkUtilization(), getOutgoingQPILinkBytes(), getOutgoingQPILinkUtilization(), getTickCount(), and getTickCountRDTSCP().
uint32 PCM::getNumCores | ( | ) |
Reads number of logical cores in the system.
Referenced by getCoreIPC(), getIncomingQPILinkUtilization(), getOutgoingQPILinkBytes(), getOutgoingQPILinkUtilization(), and getTotalExecUsage().
uint32 PCM::getNumOnlineCores | ( | ) |
Reads number of online logical cores in the system.
Referenced by getCoreIPC(), and getTotalExecUsage().
uint32 PCM::getNumSockets | ( | ) |
Reads number of sockets (CPUs) in the system.
Referenced by getAllIncomingQPILinkBytes(), getAllOutgoingQPILinkBytes(), and ServerPCICFGUncore::ServerPCICFGUncore().
uint32 PCM::getOriginalCPUModel | ( | ) |
inline |
Reads original CPU model id.
PCIeCounterState PCM::getPCIeCounterState | ( | const uint32 | socket_ | ) |
Get the state of PCIe counter(s)
socket_ | socket of the PCIe controller |
uint64 PCM::getQPILinkSpeed | ( | uint32 | socketNr, |
uint32 | linkNr | ||
) | const |
inline |
Return QPI Link Speed in GBytes/second.
References ServerPCICFGUncore::getQPILinkSpeed().
Referenced by getIncomingQPILinkUtilization(), getOutgoingQPILinkBytes(), and getOutgoingQPILinkUtilization().
uint64 PCM::getQPILinksPerSocket | ( | ) | const |
inline |
Returns the number of Intel(r) Quick Path Interconnect(tm) links per socket.
References ServerPCICFGUncore::getNumQPIPorts().
Referenced by getAllIncomingQPILinkBytes(), getAllOutgoingQPILinkBytes(), and getSocketIncomingQPILinkBytes().
ServerUncorePowerState PCM::getServerUncorePowerState | ( | uint32 | socket | ) |
Reads the power/energy counter state of a socket (works only on microarchitecture codename SandyBridge-EP)
socket | socket id |
References ServerPCICFGUncore::freezeCounters(), ServerPCICFGUncore::getDRAMClocks(), ServerPCICFGUncore::getMCCounter(), ServerPCICFGUncore::getNumMCChannels(), ServerPCICFGUncore::getNumQPIPorts(), ServerPCICFGUncore::getQPIClocks(), ServerPCICFGUncore::getQPIL0pTxCycles(), ServerPCICFGUncore::getQPIL1Cycles(), and ServerPCICFGUncore::unfreezeCounters().
bool PCM::getSMT | ( | ) |
Checks if SMT (HyperThreading) is enabled.
SocketCounterState PCM::getSocketCounterState | ( | uint32 | socket | ) |
Reads the counter state of a socket.
socket | socket id |
References isCoreOnline().
Referenced by getSocketCounterState().
int32 PCM::getSocketId | ( | uint32 | core_id | ) |
inline |
Determines socket of given core.
core_id | core identifier |
SystemCounterState PCM::getSystemCounterState | ( | ) |
Reads the counter state of the system.
The system consists of one or more sockets. Each socket holds one CPU, and each CPU consists of several (logical) cores.
Referenced by getSystemCounterState().
uint32 PCM::getThreadsPerCore | ( | ) |
Reads how many hardware threads a physical core has. A "hardware thread" is a logical core in different terminology. If Intel(r) Hyperthreading(tm) is enabled, this function returns 2.
Referenced by getCoreIPC(), and getTotalExecUsage().
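Because the core counts in this class are logical cores, the number of physical cores can be estimated by dividing by the threads-per-core value. The helper below is a hypothetical sketch of that arithmetic. It assumes every physical core exposes the same number of hardware threads.

```cpp
#include <cstdint>

// Hypothetical helper: estimates physical cores from the logical core count
// (as from getNumCores()) and the threads per core (as from getThreadsPerCore()).
// Returns 0 if threadsPerCore is 0, to avoid dividing by zero.
inline std::uint32_t physicalCores(std::uint32_t logicalCores, std::uint32_t threadsPerCore) {
    return threadsPerCore == 0 ? 0 : logicalCores / threadsPerCore;
}
```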
uint64 PCM::getTickCount | ( | uint64 | multiplier = 1000, |
uint32 | core = 0 | ||
) |
Return TSC timer value in time units.
multiplier | use 1 for seconds, 1000 for milliseconds, 1000000 for microseconds, etc. (default is 1000: milliseconds) |
core | core to read on-chip TSC value (default is 0) |
References getCoreCounterState(), getInvariantTSC(), and getNominalFrequency().
Referenced by ServerPCICFGUncore::computeQPISpeed().
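The multiplier semantics suggest a conversion of the form time = multiplier * TSC / nominal frequency. The sketch below works through that arithmetic under this assumption. `tscToTimeUnits` is a hypothetical name, and the actual PCM implementation may differ.

```cpp
#include <cstdint>

// Hypothetical helper: converts a TSC reading to time units, assuming
// time = multiplier * tsc / nominalHz (multiplier 1 = seconds, 1000 = ms, ...).
// The division is split into whole seconds and a remainder so that the
// intermediate product does not overflow for large TSC values.
inline std::uint64_t tscToTimeUnits(std::uint64_t tsc,
                                    std::uint64_t nominalHz,
                                    std::uint64_t multiplier = 1000) {
    if (nominalHz == 0) {
        return 0;  // no valid frequency: nothing meaningful to report
    }
    const std::uint64_t wholeSeconds = tsc / nominalHz;
    const std::uint64_t remainderTicks = tsc % nominalHz;
    return wholeSeconds * multiplier + (remainderTicks * multiplier) / nominalHz;
}
```

For example, 3,000,000,000 ticks at a nominal 3 GHz is one second, which the default multiplier reports as 1000 milliseconds.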
uint64 PCM::getTickCountRDTSCP | ( | uint64 | multiplier = 1000 | ) |
Return TSC timer value in time units using rdtscp instruction from current core.
multiplier | use 1 for seconds, 1000 for milliseconds, 1000000 for microseconds, etc. (default is 1000: milliseconds) |
References getNominalFrequency().
const char * PCM::getUArchCodename | ( | int32 | cpu_model_ = -1 | ) | const |
Get a string describing the codename of the processor microarchitecture.
cpu_model_ | cpu model (if no parameter provided the codename of the detected CPU is returned) |
bool PCM::good | ( | ) |
Checks the status of PCM object.
bool PCM::initWinRing0Lib | ( | ) |
static |
Loads and initializes Winring0 third party library for access to processor model specific and PCI configuration registers.
bool PCM::isCoreOnline | ( | int32 | os_core_id | ) | const |
Returns true if the core is online.
os_core_id | OS core id |
Referenced by getAllCounterStates(), and getSocketCounterState().
bool PCM::L3CacheOccupancyMetricAvailable | ( | ) |
Checks if cache monitoring is present.
PCM::ErrorCode PCM::program | ( | const ProgramMode | mode_ = DEFAULT_EVENTS, |
const void * | parameter_ = NULL | ||
) |
Programs performance counters.
mode_ | mode of programming, see ProgramMode definition |
parameter_ | optional parameter for some of the programming modes |
Call this method before you start using the performance counting routines.
References ServerPCICFGUncore::computeQPISpeed(), CUSTOM_CORE_EVENTS, EXT_CUSTOM_CORE_EVENTS, and ServerPCICFGUncore::program().
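Callers usually branch on the returned error code before reading any counters. The sketch below mirrors the `ErrorCode` values listed in this reference in a local enum so that it compiles without the PCM headers. The `describe` helper and its message strings are hypothetical and are not part of PCM.

```cpp
#include <string>

// Local mirror of the ErrorCode values listed in this reference
// (declared here only so the sketch is self-contained).
enum ErrorCode { Success = 0, MSRAccessDenied = 1, PMUBusy = 2, UnknownError };

// Hypothetical helper: maps a program() result to a short diagnostic message.
inline std::string describe(ErrorCode code) {
    switch (code) {
        case Success:
            return "counters programmed";
        case MSRAccessDenied:
            return "access to model specific registers was denied";
        case PMUBusy:
            return "PMU is in use by another application";
        case UnknownError:
            break;
    }
    return "unknown error";
}
```

With the real headers, the value passed to such a helper would be the result of `PCM::getInstance()->program()`.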
void PCM::programPCIeCounters | ( | const PCIeEventCode | event_, |
const uint32 | tid_ = 0, | ||
const uint32 | miss_ = 0 | ||
) |
Program uncore PCIe monitoring event(s)
event_ | a PCIe event to monitor |
tid_ | tid filter (PCM supports it only on Haswell server) |
PCM::ErrorCode PCM::programServerUncorePowerMetrics | ( | int | mc_profile, |
int | pcu_profile, | ||
int * | freq_bands = NULL | ||
) |
Programs uncore power/energy counters on microarchitectures codename SandyBridge-EP and IvyTown.
mc_profile | profile for integrated memory controller PMU. See possible profile values in pcm-power.cpp example |
pcu_profile | profile for power control unit PMU. See possible profile values in pcm-power.cpp example |
freq_bands | array of three integer values for core frequency band monitoring. See usage in pcm-power.cpp example |
Call this method before you start using the power counter routines on microarchitecture codename SandyBridge-EP.
References ServerPCICFGUncore::program_power_metrics().
void PCM::resetPMU | ( | ) |
Forces PMU reset.
If there is no way to free up the PMU from other applications, you can try calling this method at your own risk.