summaryrefslogtreecommitdiffstats
path: root/tools/perf/Documentation
diff options
context:
space:
mode:
Diffstat (limited to 'tools/perf/Documentation')
-rw-r--r--tools/perf/Documentation/intel-pt.txt78
-rw-r--r--tools/perf/Documentation/itrace.txt8
-rw-r--r--tools/perf/Documentation/perf-ftrace.txt33
-rw-r--r--tools/perf/Documentation/perf-script.txt18
-rw-r--r--tools/perf/Documentation/perf-stat.txt14
5 files changed, 144 insertions, 7 deletions
diff --git a/tools/perf/Documentation/intel-pt.txt b/tools/perf/Documentation/intel-pt.txt
index b0b3007d3c9c0..4b6cdbf8f935e 100644
--- a/tools/perf/Documentation/intel-pt.txt
+++ b/tools/perf/Documentation/intel-pt.txt
@@ -108,6 +108,9 @@ approach is available to export the data to a postgresql database. Refer to
script export-to-postgresql.py for more details, and to script
call-graph-from-postgresql.py for an example of using the database.
+There is also script intel-pt-events.py which provides an example of how to
+unpack the raw data for power events and PTWRITE.
+
As mentioned above, it is easy to capture too much data. One way to limit the
data captured is to use 'snapshot' mode which is explained further below.
Refer to 'new snapshot option' and 'Intel PT modes of operation' further below.
@@ -364,6 +367,42 @@ cyc_thresh Specifies how frequently CYC packets are produced - see cyc
CYC packets are not requested by default.
+pt Specifies pass-through which enables the 'branch' config term.
+
+ The default config selects 'pt' if it is available, so a user will
+ never need to specify this term.
+
+branch Enable branch tracing. Branch tracing is enabled by default so to
+ disable branch tracing use 'branch=0'.
+
+ The default config selects 'branch' if it is available.
+
+ptw Enable PTWRITE packets which are produced when a ptwrite instruction
+ is executed.
+
+ Support for this feature is indicated by:
+
+ /sys/bus/event_source/devices/intel_pt/caps/ptwrite
+
+ which contains "1" if the feature is supported and
+ "0" otherwise.
+
+fup_on_ptw Enable a FUP packet to follow the PTWRITE packet. The FUP packet
+ provides the address of the ptwrite instruction. In the absence of
+ fup_on_ptw, the decoder will use the address of the previous branch
+ if branch tracing is enabled, otherwise the address will be zero.
+ Note that fup_on_ptw will work even when branch tracing is disabled.
+
+pwr_evt Enable power events. The power events provide information about
+ changes to the CPU C-state.
+
+ Support for this feature is indicated by:
+
+ /sys/bus/event_source/devices/intel_pt/caps/power_event_trace
+
+ which contains "1" if the feature is supported and
+ "0" otherwise.
+
new snapshot option
-------------------
@@ -674,13 +713,15 @@ Having no option is the same as
which, in turn, is the same as
- --itrace=ibxe
+ --itrace=ibxwpe
The letters are:
i synthesize "instructions" events
b synthesize "branches" events
x synthesize "transactions" events
+ w synthesize "ptwrite" events
+ p synthesize "power" events
c synthesize branches events (calls only)
r synthesize branches events (returns only)
e synthesize tracing error events
@@ -699,7 +740,40 @@ and "r" can be combined to get calls and returns.
'flags' field can be used in perf script to determine whether the event is a
tranasaction start, commit or abort.
-Error events are new. They show where the decoder lost the trace. Error events
+Note that "instructions", "branches" and "transactions" events depend on code
+flow packets which can be disabled by using the config term "branch=0". Refer
+to the config terms section above.
+
+"ptwrite" events record the payload of the ptwrite instruction and whether
+"fup_on_ptw" was used. "ptwrite" events depend on PTWRITE packets which are
+recorded only if the "ptw" config term was used. Refer to the config terms
+section above. perf script "synth" field displays "ptwrite" information like
+this: "ip: 0 payload: 0x123456789abcdef0" where "ip" is 1 if "fup_on_ptw" was
+used.
+
+"Power" events correspond to power event packets and CBR (core-to-bus ratio)
+packets. While CBR packets are always recorded when tracing is enabled, power
+event packets are recorded only if the "pwr_evt" config term was used. Refer to
+the config terms section above. The power events record information about
+C-state changes, whereas CBR is indicative of CPU frequency. perf script
+"event,synth" fields display information like this:
+ cbr: cbr: 22 freq: 2189 MHz (200%)
+ mwait: hints: 0x60 extensions: 0x1
+ pwre: hw: 0 cstate: 2 sub-cstate: 0
+ exstop: ip: 1
+ pwrx: deepest cstate: 2 last cstate: 2 wake reason: 0x4
+Where:
+ "cbr" includes the frequency and the percentage of maximum non-turbo
+ "mwait" shows mwait hints and extensions
+ "pwre" shows C-state transitions (to a C-state deeper than C0) and
+ whether initiated by hardware
+ "exstop" indicates execution stopped and whether the IP was recorded
+ exactly,
+ "pwrx" indicates return to C0
+For more details refer to the Intel 64 and IA-32 Architectures Software
+Developer Manuals.
+
+Error events show where the decoder lost the trace. Error events
are quite important. Users must know if what they are seeing is a complete
picture or not.
diff --git a/tools/perf/Documentation/itrace.txt b/tools/perf/Documentation/itrace.txt
index e2a4c5e0dbe5b..a3abe04c779d0 100644
--- a/tools/perf/Documentation/itrace.txt
+++ b/tools/perf/Documentation/itrace.txt
@@ -3,13 +3,15 @@
c synthesize branches events (calls only)
r synthesize branches events (returns only)
x synthesize transactions events
+ w synthesize ptwrite events
+ p synthesize power events
e synthesize error events
d create a debug log
g synthesize a call chain (use with i or x)
l synthesize last branch entries (use with i or x)
s skip initial number of events
- The default is all events i.e. the same as --itrace=ibxe
+ The default is all events i.e. the same as --itrace=ibxwpe
In addition, the period (default 100000) for instructions events
can be specified in units of:
@@ -26,8 +28,8 @@
Also the number of last branch entries (default 64, max. 1024) for
instructions or transactions events can be specified.
- It is also possible to skip events generated (instructions, branches, transactions)
- at the beginning. This is useful to ignore initialization code.
+ It is also possible to skip events generated (instructions, branches, transactions,
+ ptwrite, power) at the beginning. This is useful to ignore initialization code.
--itrace=i0nss1000000
diff --git a/tools/perf/Documentation/perf-ftrace.txt b/tools/perf/Documentation/perf-ftrace.txt
index 6e6a8b22c8594..721a447f046ea 100644
--- a/tools/perf/Documentation/perf-ftrace.txt
+++ b/tools/perf/Documentation/perf-ftrace.txt
@@ -48,6 +48,39 @@ OPTIONS
Ranges of CPUs are specified with -: 0-2.
Default is to trace on all online CPUs.
+-T::
+--trace-funcs=::
+ Only trace functions given by the argument. Multiple functions
+ can be given by using this option more than once. The function
+ argument also can be a glob pattern. It will be passed to
+ 'set_ftrace_filter' in tracefs.
+
+-N::
+--notrace-funcs=::
+ Do not trace functions given by the argument. Like -T option,
+ this can be used more than once to specify multiple functions
+ (or glob patterns). It will be passed to 'set_ftrace_notrace'
+ in tracefs.
+
+-G::
+--graph-funcs=::
+ Set graph filter on the given function (or a glob pattern).
+ This is useful for the function_graph tracer only and enables
+ tracing for functions executed from the given function.
+ This can be used more than once to specify multiple functions.
+ It will be passed to 'set_graph_function' in tracefs.
+
+-g::
+--nograph-funcs=::
+ Set graph notrace filter on the given function (or a glob pattern).
+ Like -G option, this is useful for the function_graph tracer only
+ and disables tracing for function executed from the given function.
+ This can be used more than once to specify multiple functions.
+ It will be passed to 'set_graph_notrace' in tracefs.
+
+-D::
+--graph-depth=::
+ Set max depth for function graph tracer to follow
SEE ALSO
--------
diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index 3517e204a2b30..5ee8796be96eb 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -116,8 +116,9 @@ OPTIONS
--fields::
Comma separated list of fields to print. Options are:
comm, tid, pid, time, cpu, event, trace, ip, sym, dso, addr, symoff,
- srcline, period, iregs, brstack, brstacksym, flags, bpf-output, brstackinsn,
- callindent, insn, insnlen. Field list can be prepended with the type, trace, sw or hw,
+ srcline, period, iregs, brstack, brstacksym, flags, bpf-output, brstackinsn, brstackoff,
+ callindent, insn, insnlen, synth.
+ Field list can be prepended with the type, trace, sw or hw,
to indicate to which event type the field list applies.
e.g., -F sw:comm,tid,time,ip,sym and -F trace:time,cpu,trace
@@ -130,6 +131,14 @@ OPTIONS
i.e., the specified fields apply to all event types if the type string
is not given.
+ In addition to overriding fields, it is also possible to add or remove
+ fields from the defaults. For example
+
+ -F -cpu,+insn
+
+ removes the cpu field and adds the insn field. Adding/removing fields
+ cannot be mixed with normal overriding.
+
The arguments are processed in the order received. A later usage can
reset a prior request. e.g.:
@@ -185,6 +194,9 @@ OPTIONS
instruction bytes and the instruction length of the current
instruction.
+ The synth field is used by synthesized events which may be created when
+ Instruction Trace decoding.
+
Finally, a user may not set fields to none for all event types.
i.e., -F "" is not allowed.
@@ -203,6 +215,8 @@ OPTIONS
is printed. This is the full execution path leading to the sample. This is only supported when the
sample was recorded with perf record -b or -j any.
+ The brstackoff field will print an offset into a specific dso/binary.
+
-k::
--vmlinux=<file>::
vmlinux pathname
diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index bd0e4417f2be6..698076313606a 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -239,6 +239,20 @@ taskset.
--no-merge::
Do not merge results from same PMUs.
+--smi-cost::
+Measure SMI cost if msr/aperf/ and msr/smi/ events are supported.
+
+During the measurement, the /sys/device/cpu/freeze_on_smi will be set to
+freeze core counters on SMI.
+The aperf counter will not be effected by the setting.
+The cost of SMI can be measured by (aperf - unhalted core cycles).
+
+In practice, the percentages of SMI cycles is very useful for performance
+oriented analysis. --metric_only will be applied by default.
+The output is SMI cycles%, equals to (aperf - unhalted core cycles) / aperf
+
+Users who wants to get the actual value can apply --no-metric-only.
+
EXAMPLES
--------