Abstract
Unified Performance Tool, or uperf for short, is a network performance measurement tool that supports the execution of workload profiles.
Microbenchmarks rarely measure real-world performance. This is especially true in networking, where applications may use multiple protocols, use different types of communication, interleave CPU processing with communication, and so on. However, popular microbenchmarks like iPerf and netperf are very simplistic: they support only one protocol at a time, use fixed-size messages, provide no way to interleave CPU processing with communication, and so on. Thus there is a need for a tool that more closely models real-world applications.
Uperf (Unified performance tool for networking) solves this problem by allowing the user to model a real-world application in a high-level description (called a profile) and then run that model over the network. It allows the user to use multiple protocols, vary message sizes, use a 1xN communication model, collect CPU counter statistics, and much more.
uperf was developed by the Performance Availability Engineering group at Sun Microsystems. It was originally developed by Neelakanth Nadgir and Nitin Rammanavar. Jing Zhang added support for the uperf harness. Joy added SSL support, and Eric He ported it to Windows and is currently a core contributor. Charles Suresh, Alan Chiu, and Jan-Lung Sung have contributed significantly to its design and development.
Some of the features supported by uperf include multiple transport protocols (TCP, UDP, SSL, SCTP, VSOCK), fixed and randomly sized messages, asymmetrical message exchange, connection setup and teardown measurement, sendfile/sendfilev transfers, and collection of CPU, kstat, and per-flowop statistics.
uperf is open source software released under the GNU General Public License v2. You can download it from http://uperf.org. Binaries are available for Solaris and Linux.
uperf can be run as either a master (active) or a slave (passive). When run as the master, it requires the -m flag along with a profile describing the test application; when run as a slave, only the -s flag is needed.
Uperf Version 1.0.8
Usage:  uperf [-m profile] [-hvV] [-ngtTfkpaeE:X:i:P:RS:]
        uperf [-s] [-hvV]

        -m <profile>     Run uperf with this profile
        -s               Slave
        -S <protocol>    Protocol type for the control Socket [def: tcp]
        -n               No statistics
        -T               Print Thread statistics
        -t               Print Transaction averages
        -f               Print Flowop averages
        -g               Print Group statistics
        -k               Collect kstat statistics
        -p               Collect CPU utilization for flowops [-f assumed]
        -e               Collect default CPU counters for flowops [-f assumed]
        -E <ev1,ev2>     Collect CPU counters for flowops [-f assumed]
        -a               Collect all statistics
        -X <file>        Collect response times
        -i <interval>    Collect throughput every <interval>
        -P <port>        Set the master port (defaults to 20000)
        -R               Emit raw (not transformed), time-stamped (ms) statistics
        -v               Verbose
        -V               Version
        -h               Print usage

More information at http://www.uperf.org
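A typical run starts uperf in slave mode on one host, then launches the master with a profile on another host; the profile file name below is just an example:

# On the slave (passive) host
$ uperf -s

# On the master (active) host
$ uperf -m netperf.xml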
uperf comes bundled with quite a few sample profiles in the workloads directory. You can always tweak them to suit your needs or write your own profile. Several of these profiles pick up values (like remotehost or protocol) from the ENVIRONMENT. These variables begin with the $ sign in the profile. You can either set them (via export h=192.168.1.4) or hardcode them in the profile.
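For example, to run one of the bundled profiles against a slave at 192.168.1.4, the environment variables referenced by the profile can be exported before invoking uperf (the profile path below assumes the bundled workloads directory and may differ on your installation):

$ export h=192.168.1.4
$ export proto=tcp
$ export nthr=4
$ uperf -m workloads/iperf.xml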
The profiles bundled with uperf are described below.
netperf.xml: This profile represents request-response traffic. One thread on the master reads and writes 90 bytes of data to and from the slave. The remote end (slave) address is specified via the $h environment variable, and $proto specifies the protocol to be used.
iperf.xml: In this profile, multiple threads simulate one-way traffic (8K messages) between two hosts (similar to the iperf networking tool) for 30 seconds. $h specifies the remote host, $proto specifies the protocol, and $nthr specifies the number of threads.
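A sketch of what such a bulk-transfer profile might look like is shown below; it is illustrative, and the bundled profile may differ in detail:

<?xml version="1.0"?>
<profile name="iperf">
  <group nthreads="$nthr">
    <transaction iterations="1">
      <flowop type="connect" options="remotehost=$h protocol=$proto"/>
    </transaction>
    <!-- one-way traffic: write 8k messages for 30 seconds -->
    <transaction duration="30s">
      <flowop type="write" options="size=8k"/>
    </transaction>
    <transaction iterations="1">
      <flowop type="disconnect"/>
    </transaction>
  </group>
</profile>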
In this profile, multiple threads repeatedly connect to and disconnect from the remote host. This can be used to measure connection setup performance. $nthr specifies the number of threads, and $iter determines the number of connects and disconnects each thread will perform.
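A sketch of such a connection-setup profile, built from the documented transaction and flowop constructs (the profile name here is illustrative), could look like this:

<?xml version="1.0"?>
<profile name="connect">
  <group nthreads="$nthr">
    <!-- each thread connects and disconnects $iter times -->
    <transaction iterations="$iter">
      <flowop type="connect" options="remotehost=$h protocol=tcp"/>
      <flowop type="disconnect"/>
    </transaction>
  </group>
</profile>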
This profile demonstrates an application in which each thread opens one connection to each of two hosts, then reads 200 bytes from the first connection and writes them to the second connection (a sketch of such a profile, using named connections, is shown in the discussion of named connections below).
uperf is based on the idea that you can describe your application or workload in very general terms, and the framework will run that workload for you. For example, if you are familiar with netperf or request-response microbenchmarks, such a description would be "each thread sends 100 bytes and receives 100 bytes using UDP". For a more complex application, we may have to specify the number of connections, the number of threads, whether the threads all perform the same kind of operation, which protocols are used, whether the traffic is bursty, and so on. As you can see, it gets quite complicated for any real-world application. uperf defines a language to specify all of this information in a machine-understandable format (XML) called a profile. uperf then parses the profile and runs whatever it specifies. The user has to specify the profile for the master only; uperf automatically transforms the profile for the slaves and uses it.
The profile needs to be a valid XML file. Variables that begin with a '$' are picked up from the ENVIRONMENT.
A sample profile for the request-response microbenchmark is shown below.
<?xml version="1.0"?>
<profile name="netperf">
  <group nthreads="1">
    <transaction iterations="1">
      <flowop type="accept" options="remotehost=$h protocol=$proto wndsz=50k tcp_nodelay"/>
    </transaction>
    <transaction duration="30s">
      <flowop type="write" options="size=90"/>
      <flowop type="read" options="size=90"/>
    </transaction>
    <transaction iterations="1">
      <flowop type="disconnect" />
    </transaction>
  </group>
</profile>
Every profile begins with an XML header specifying that it is an XML file. A profile has a name; this identifies the profile and is not otherwise used by uperf. The major parts of a profile are groups, transactions, and flowops: a profile contains one or more groups, each group contains transactions, and each transaction contains flowops.
Let's look at each of these in detail. A transaction can be repeated either a fixed number of times or for a fixed duration. If <transaction iterations="1000"> is specified, the contents of the transaction are executed 1000 times. If <transaction duration="30s"> is specified, the contents of the transaction are executed for 30 seconds. By default, a transaction executes its contents only once. All threads or processes start executing transactions at the same time.
Every flowop has a set of options. In the XML file, these are space separated. The supported options are listed below.
count | The number of times this flowop will be executed.
duration | The amount of time this flowop will be executed. Example: duration=100ms. This option will no longer be supported in future versions of uperf; specify the duration on the enclosing transaction instead.
rate | Experimental: this option causes uperf to execute this flowop at the specified rate for iterations or duration seconds.
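As a sketch of how these options attach to a flowop (the count value here is illustrative, not taken from a bundled profile):

<transaction iterations="1">
  <!-- execute the write flowop 100 times per transaction iteration -->
  <flowop type="write" options="size=8k count=100"/>
</transaction>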
The connect flowop specifies that a connection needs to be opened. The options parameter specifies more details regarding the connection. The following keys are supported:

remotehost | The remote host to connect to or accept a connection from.
protocol | The protocol used to connect to the remote host. Valid values are tcp, udp, ssl, sctp, and vsock.
tcp_nodelay | Controls whether TCP_NODELAY is set or not.
wndsz | Size of the socket send and receive buffers. This parameter is used to set the SO_SNDBUF and SO_RCVBUF options using setsockopt().
engine | SSL engine.
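For example, a connect flowop on the master corresponding to the accept flowop in the sample profile above could be written as follows (illustrative):

<flowop type="connect" options="remotehost=$h protocol=tcp wndsz=50k tcp_nodelay"/>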
The read and write flowops read or write data on an open connection. The following keys are supported:

size | Amount of data that is either read or written. uperf supports exchange of asymmetrically sized messages via the rsize parameter; the master still uses the size parameter. For a random-sized message, a uniformly distributed value between the user-specified min and max is used by the transmitting side, and the receiving side uses the max as the message size. Example: size=64k or size=rand(4k,8k).
rsize | See the description of asymmetrical messages above.
canfail | Indicates that a failure of this flowop will not cause uperf to abort. This is especially useful with UDP, where a packet drop does not constitute a fatal error. It can also be used, for example, to test a SYN flood attack (threads connect() repeatedly, ignoring errors).
non_blocking | Use non-blocking IO. The socket/file descriptor is set to non-blocking mode (O_NONBLOCK).
poll_timeout | If this option is set, the thread will first poll for the specified duration before trying to carry out the operation. A poll timeout is returned as an error back to uperf.
conn | Every open connection is assigned a connection name. Currently the name can be any valid integer, although uperf could accept a string in the future. conn identifies the connection to use with this flowop. The connection name is thread private.
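To make these options concrete, here is a sketch of a UDP request-response pair with a randomly sized request (the sizes are illustrative):

<!-- request of random size between 4k and 8k; canfail tolerates UDP packet drops -->
<flowop type="write" options="size=rand(4k,8k) canfail"/>
<!-- fixed 90-byte response; canfail again tolerates drops -->
<flowop type="read" options="size=90 canfail"/>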
The sendfile flowop uses the sendfile(3EXT) function call to transfer a single file. The sendfilev flowop transfers a set of files using the sendfilev(3EXT) interface. Multiple files are randomly picked from all transferable files (see dir below) and transferred to the slave.
dir | This parameter identifies the directory from which the files will be transferred. The directory is searched recursively to generate a list of all readable files. Example: dir=/space
nfiles | This parameter identifies the number of files that will be transferred with each call to sendfilev(3EXT); it is used as the third argument to the sendfilev(3EXT) function. nfiles is assumed to be 1 for the sendfile flowop. Example: nfiles=10
size | This parameter identifies the chunk size for the transfer. Instead of sending the whole file, uperf will send size-sized chunks one at a time. This is used only if nfiles=1.
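A sendfilev flowop combining these options might look like the following sketch (the directory path is an example, not a requirement):

<!-- transfer 10 randomly chosen readable files per call from /space -->
<flowop type="sendfilev" options="dir=/space nfiles=10"/>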
uperf collects a wide variety of statistics. By default, uperf prints the throughput every second while the test is running, and then prints the total throughput at the end. uperf also prints network statistics, calculated independently from system counters, to verify the throughput it reports, and it prints statistics from all hosts involved in the test to validate the output.
Sample output illustrating the statistics collected by uperf is shown below.
bash$ ./framework/uperf -m netperf.xml -a -e -p
Starting 4 threads running profile:netperf ...   0.01 seconds
Txn0         0B /  1.01(s) =      0b/s        3txn/s   254.89ms/txn
Txn1   195.31MB / 30.30(s) = 54.07Mb/s    13201txn/s     2.30ms/txn
Txn2         0B /  0.00(s) =      0b/s
--------------------------------------------------------------------------------
netperf  195.31MB/32.31(s) = 50.70Mb/s  (CPU 21.42s)

Section: Group details
--------------------------------------------------------------------------------
           Elapsed(s)     CPU(s)     DataTx   Throughput
Group0          32.31      21.40    195.31M       50.70M

Group 0 Thread details
--------------------------------------------------------------------------------
Thread     Elapsed(s)     CPU(s)     DataTx   Throughput
0               32.31       5.30     48.83M       12.68M
1               32.31       5.31     48.83M       12.68M
2               32.31       5.44     48.83M       12.68M
3               32.31       5.36     48.83M       12.68M

Group 0 Txn details
--------------------------------------------------------------------------------
Txn           Avg(ms)    CPU(ms)    Min(ms)      Max(ms)
0                5.45       0.51       5.37         5.68
1                0.29       0.00       0.23       408.63
2                0.32       0.16       0.07         0.81

Group 0 Flowop details (ms/Flowop)
--------------------------------------------------------------------------------
Flowop        Avg(ms)    CPU(ms)    Min(ms)      Max(ms)
Connect          5.41       0.49       5.31         5.66
Write            0.02       0.00       0.01         0.53
Read             0.25       0.00       0.05       408.59
Disconnect       0.30       0.14       0.06         0.79

Netstat statistics for this run
--------------------------------------------------------------------------------
Nic         opkts/s    ipkts/s    obits/s      ibits/s
ce0           12380      12391     30.68M       30.70M
ce1               0          0          0        84.67
--------------------------------------------------------------------------------

Waiting to exchange stats with slave[s]...

Error Statistics
--------------------------------------------------------------------------------
Slave           Total(s)     DataTx   Throughput   Operations   Error %
192.9.96.101       32.25   195.31MB    50.80Mbps       800008      0.00
Master             32.31   195.31MB    50.70Mbps       800008      0.00
--------------------------------------------------------------------------------
Difference(%)       0.20%      0.00%       -0.20%        0.00%     0.00%
Q: What is the history behind uperf?
A: uperf was developed by the Performance Availability Engineering group at Sun Microsystems circa 2004. It was originally inspired by Filebench, and developed by Neelakanth Nadgir and Nitin Rammanavar.

Q: Where can I submit bugs/feedback?
A: Until we have something better, please email ...

Q: How do I specify which interface to use?
A: uperf just specifies the host to connect to. It is up to the OS to determine which interface to use. You can change the default interface to that host by changing the routing tables.

Q: Does the use of ...?
A: Since ...

Q: Does uperf support socket autotuning on Linux?
A: uperf currently always calls ...

Q: Where can I get the uperf harness?
A: The harness is not open source, although if there is sufficient interest, we would definitely consider open-sourcing it. For more details, please contact Jing Zhang.

Q: Why do you even have a ...?
A: uperf uses a global variable to count the number of bytes transferred. This is updated using atomic instructions ...

Q: Why do we have an option to do sendfilev with chunks?
A: Pallab identified an issue where chunked sendfilev transfers were faster than transferring the whole file in one go. This option will help debug the issue.
uperf supports named connections. To specify a name, add the conn=X option to the options of a connect or accept flowop. For example:

<flowop type="connect" options="conn=2 remotehost=$h protocol=tcp"/>
If a name is not specified, the connection is an anonymous connection. For any flowop, if a connection is not specified, it uses the first anonymous connection.
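Named connections enable profiles such as the two-host example described earlier. The following sketch (the $h1 and $h2 variable names are illustrative) opens one connection to each of two hosts, then reads 200 bytes from the first connection and writes them to the second:

<transaction iterations="1">
  <flowop type="connect" options="conn=1 remotehost=$h1 protocol=tcp"/>
  <flowop type="connect" options="conn=2 remotehost=$h2 protocol=tcp"/>
</transaction>
<transaction duration="30s">
  <!-- read 200 bytes from connection 1, then write them to connection 2 -->
  <flowop type="read" options="conn=1 size=200"/>
  <flowop type="write" options="conn=2 size=200"/>
</transaction>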
uperf can generate data that can be post-processed by Fenxi. To use this feature, use the -x option of uperf. The output should be stored in a file whose name has the uperf prefix. For example:

$ uperf -m iperf.xml -x > uperf-iperf.out
$ fenxi process uperf-iperf.out outdir iperf

The processed output is now stored in outdir.