Processes may vary in number. Fewer processes result in faster context switches. More than 20 processes are not supported.
Processes may vary in size. A size of zero is the baseline process that does nothing except pass the token on to the next process. A process size greater than zero means that the process does some work before passing on the token. The work is simulated as the summing up of an array of the specified size. The summing is an unrolled loop of about 2.7 thousand instructions.
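The per-process step described above can be sketched in C. This is a minimal illustration, not the benchmark's own source: it assumes the token is a single byte passed over pipes, uses a plain loop where the benchmark uses an unrolled one, and the names sum_array and token_step are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Simulated work: sum an array of n ints. The benchmark is described as
 * using an unrolled loop; a plain loop is used here for brevity. */
static long sum_array(const int *data, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += data[i];
    return sum;
}

/* One hop: wait for the token on in_fd, do the work (none when n is 0),
 * then pass the token on through out_fd.
 * Returns 0 on success, -1 on an I/O error. */
static int token_step(int in_fd, int out_fd,
                      const int *data, size_t n, long *sum_out)
{
    char token;

    assert(sum_out != NULL);
    if (read(in_fd, &token, 1) != 1)
        return -1;
    *sum_out = (n > 0) ? sum_array(data, n) : 0;
    if (write(out_fd, &token, 1) != 1)
        return -1;
    return 0;
}
```

With n set to 0 the step only forwards the token, which corresponds to the size-zero baseline process. A full ring would connect one such step per process, each in its own process.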
The effect is that both the data and the instruction cache get polluted by some amount before the token is passed on. The data cache gets polluted by approximately the process "size". The instruction cache gets polluted by a constant amount, approximately 2.7 thousand instructions.
The pollution of the caches results in larger context switching times for the larger processes. This may be confusing because the benchmark takes pains to measure only the context switch time, not including the overhead of doing the work. The subtle point is that the overhead is measured using hot caches. As the number and size of the processes increase, the caches are more and more polluted until the set of processes no longer fits. The context switch times go up because a context switch is defined as the switch time plus the time it takes to restore all of the process state, including cache state. This means that the switch includes the time for the cache misses on larger processes.
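The accounting can be illustrated with a small C helper. This is a sketch under assumptions: the function name and the exact formula (average time per token pass minus the hot-cache overhead) are hypothetical and are not taken from the benchmark source.

```c
#include <assert.h>

/* Estimated cost of one context switch.
 *   total_time:        elapsed time for all token passes
 *   passes:            number of token passes timed
 *   overhead_per_pass: cost of the work alone, measured with hot caches
 * All times are in the same unit. Because the overhead is measured with
 * hot caches, any extra cache-miss time ends up in the result. */
static double ctx_switch_cost(double total_time, unsigned long passes,
                              double overhead_per_pass)
{
    assert(passes > 0);
    return total_time / (double)passes - overhead_per_pass;
}
```

For example, with made-up inputs of 1000 time units over 4 passes and an overhead of 150 units per pass, the helper reports 100 units per switch. If cache misses lengthened each pass by 30 units, the total would rise to 1120 and the reported switch cost to 130, even though the work itself is unchanged.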
"size=0 ovr=179 2 71 4 104 8 134 16 333 20 438
The inaccuracies are possibly due to interaction between the VM system and the processor caches. It is possible that the benchmark processes are sometimes laid out in memory such that there are fewer TLB/cache conflicts than at other times. This is pure speculation on our part.
Comments, suggestions, and bug reports are always welcome.