SHORT Packed AVX MFLOP/s

EVENTSET
FIXC0 INSTR_RETIRED_ANY
FIXC1 CPU_CLK_UNHALTED_CORE
FIXC2 CPU_CLK_UNHALTED_REF
PMC0   AVX_INSTS_CALC

METRICS
Runtime (RDTSC) [s] time
Runtime unhalted [s] FIXC1*inverseClock
Clock [MHz]  1.E-06*(FIXC1/FIXC2)/inverseClock
CPI  FIXC1/FIXC0
Packed SP [MFLOP/s]  1.0E-06*(PMC0*8.0)/time
Packed DP [MFLOP/s]  1.0E-06*(PMC0*4.0)/time

LONG
Formulas:
Packed SP [MFLOP/s] = 1.0E-06*(AVX_INSTS_CALC*8)/runtime
Packed DP [MFLOP/s] = 1.0E-06*(AVX_INSTS_CALC*4)/runtime
-
Packed 32b AVX FLOP/s rates. Approximate counts of AVX & AVX2 256-bit instructions.
May count non-AVX instructions that employ 256-bit operations, including (but
not necessarily limited to) rep string instructions that use 256-bit loads and
stores for optimized performance, XSAVE* and XRSTOR*, and operations that
transition the x87 FPU data registers between x87 and MMX.
Caution: The event AVX_INSTS_CALC counts the insertf128 instruction often used
by the Intel C compilers for (unaligned) vector loads.
