Determining instruction coverage using Pin
May 28, 2012

Pin is an extendable tool for the dynamic instrumentation of programs. It supports many execution platforms and allows the injection of instrumentation code at arbitrary locations in an executable at runtime.
I created a Pin extension tool to determine coverage information for instrumented programs. The tool is open source and can be found on GitHub.
The Pin coverage tool calculates coverage based on executed instructions: using Pin, every instruction inside every routine is instrumented and monitored. More precisely, only those routines are instrumented for which (1) symbol information is available and (2) the source code file is stored inside a specific directory. This restriction reduces the instrumentation overhead and the required memory.
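The two filter conditions can be pictured with a small sketch that does not depend on Pin. The `RoutineInfo` struct, the `shouldInstrument` function and the prefix-based directory check are assumptions made for illustration only. They are not taken from the tool's source, and a real Pin tool would query the routine name and source location through Pin's API instead.

```cpp
#include <string>

// Hypothetical description of a routine as seen by the coverage tool.
struct RoutineInfo {
    std::string name;        // empty if no symbol information is available
    std::string sourceFile;  // empty if no source location is known
};

// Both path separator styles are accepted, since the example targets Windows.
static bool isSeparator(char c) {
    return c == '/' || c == '\\';
}

// Returns true if the routine satisfies both conditions from the text:
// (1) symbol information is available and
// (2) its source file lies inside sourceDir (simple prefix match).
bool shouldInstrument(const RoutineInfo& rtn, const std::string& sourceDir) {
    // Condition (1): routines without symbol or source information are skipped.
    if (rtn.name.empty() || rtn.sourceFile.empty() || sourceDir.empty()) {
        return false;
    }
    // Condition (2): the source file path must start with the directory ...
    if (rtn.sourceFile.size() <= sourceDir.size() ||
        rtn.sourceFile.compare(0, sourceDir.size(), sourceDir) != 0) {
        return false;
    }
    // ... and the match must end at a path separator, so that
    // "C:/proj/src" does not accept "C:/proj/srcfoo/a.cpp".
    const char last = sourceDir[sourceDir.size() - 1];
    const char next = rtn.sourceFile[sourceDir.size()];
    return isSeparator(last) || isSeparator(next);
}
```

With such a filter, runtime library routines such as those behind `std::cout` would be skipped, because their source files lie outside the project directory.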
After the instrumentation, the number of instructions inside each routine is known (the base metric), and the number of instructions actually executed is determined at runtime. The coverage of a routine is the ratio of these two values. Summing both values over all routines located in the same source code file yields the coverage at the source code file level.
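The calculation can be sketched as follows. This is a minimal illustration and not the tool's actual code: the names `RoutineCoverage`, `Counts`, `coveragePercent` and `sumPerFile` are hypothetical, and reporting 0% for a routine without instructions is an arbitrary choice.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical per-routine result of one instrumented run.
struct RoutineCoverage {
    std::string sourceFile;
    std::string name;
    unsigned executed;  // instructions executed at least once
    unsigned total;     // instructions found during instrumentation
};

// Summed counts for one source file.
struct Counts {
    unsigned executed = 0;
    unsigned total = 0;
};

// Coverage in percent; an empty routine is reported as 0% by assumption.
double coveragePercent(unsigned executed, unsigned total) {
    assert(executed <= total);
    return total == 0 ? 0.0 : 100.0 * executed / total;
}

// Sums executed and total instruction counts of all routines per source file.
std::map<std::string, Counts> sumPerFile(const std::vector<RoutineCoverage>& routines) {
    std::map<std::string, Counts> perFile;
    for (const RoutineCoverage& r : routines) {
        Counts& c = perFile[r.sourceFile];
        c.executed += r.executed;
        c.total += r.total;
    }
    return perFile;
}
```

For example, two routines in the same file with 68 of 82 and 40 of 40 executed instructions sum to 108 of 122, which gives a file-level coverage of about 88.5%. Note that the file-level value is computed from the summed counts and is not the average of the per-routine percentages.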
The “instruction coverage” calculated by the tool depends on the compiler settings and returns different results than tools determining line coverage or block coverage. To analyze the compiler settings dependency, a simple FizzBuzz test application was used. The application is also available in the tool repository.
main.cpp

    #include "test.h"

    int main(int argc, char* argv[])
    {
        FizzBuzz fb;

        fb.sayFizzBuzz(14);
        fb.sayFizz(10);

        return 0;
    }
test.h

    #pragma once

    #include <string>

    class FizzBuzz
    {
    public:

        static const std::string FIZZ;
        static const std::string BUZZ;

        FizzBuzz();

        void sayFizzBuzz(int limit) const;

        void sayFizz(int limit) const;
        void sayBuzz(int limit) const;
    };
test.cpp

     1  #include "test.h"
     2  #include <iostream>
     3
     4  const std::string FizzBuzz::FIZZ = "fizz";
     5  const std::string FizzBuzz::BUZZ = "buzz";
     6
     7  FizzBuzz::FizzBuzz() { }
     8
     9  void FizzBuzz::sayFizzBuzz(int limit) const
    10  {
    11      for (int i=1; i<limit; ++i)
    12      {
    13          if ((i%3 == 0) && (i%5 == 0))
    14          {
    15              std::cout << FIZZ << BUZZ << std::endl;
    16              continue;
    17          }
    18          if (i%3 == 0)
    19          {
    20              std::cout << FIZZ << std::endl;
    21              continue;
    22          }
    23          if (i%5 == 0)
    24          {
    25              std::cout << BUZZ << std::endl;
    26              continue;
    27          }
    28          std::cout << i << std::endl;
    29      }
    30  }
    31
    32  void FizzBuzz::sayFizz(int limit) const
    33  {
    34      for (int i=1; i<limit; ++i)
    35      {
    36          if (i%3 == 0)
    37          {
    38              std::cout << FIZZ << std::endl;
    39              continue;
    40          }
    41          std::cout << i << std::endl;
    42      }
    43  }
    44
    45  void FizzBuzz::sayBuzz(int limit) const
    46  {
    47      // .. not shown .. not called
    48  }
Lines 15-16 in test.cpp are not executed: sayFizzBuzz(14) only iterates from 1 to 13, so no value is divisible by both 3 and 5. For the initialization of the static class attributes, pseudo routines are generated by the compiler.
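To check which parts of the loop body in sayFizzBuzz are reached for a given limit, the branch decisions can be replayed without any output. The following is only an illustrative sketch: `BranchCounts` and `countBranches` are hypothetical helpers that are not part of the test application, and the line numbers in the comments refer to test.cpp.

```cpp
// How often each branch of the loop body in sayFizzBuzz is taken.
struct BranchCounts {
    int fizzBuzz = 0;  // lines 15-16
    int fizz = 0;      // lines 20-21
    int buzz = 0;      // lines 25-26
    int number = 0;    // line 28
};

// Replays the branch decisions of sayFizzBuzz(limit) and counts them.
BranchCounts countBranches(int limit) {
    BranchCounts c;
    for (int i = 1; i < limit; ++i) {
        if ((i % 3 == 0) && (i % 5 == 0)) {
            ++c.fizzBuzz;
        } else if (i % 3 == 0) {
            ++c.fizz;
        } else if (i % 5 == 0) {
            ++c.buzz;
        } else {
            ++c.number;
        }
    }
    return c;
}
```

For a limit of 14, the counts are 0 (fizzbuzz), 4 (fizz), 2 (buzz) and 7 (plain number). A limit of at least 16 would be needed for the first branch to be taken once.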
Instruction counts and coverage results on the routine and file level are shown in the following table. The application was compiled using Visual Studio 2008 and different optimization settings. Other compiler flags might also have an influence on the results.
Column groups: debug build, followed by release builds with no optimization (/Od), minimum size (/O1), maximum speed (/O2) and full optimization (/Ox). Each group lists executed instructions, total instructions and coverage in percent.

| File/Routine | Debug exec. | Debug total | Debug % | /Od exec. | /Od total | /Od % | /O1 exec. | /O1 total | /O1 % | /O2 exec. | /O2 total | /O2 % | /Ox exec. | /Ox total | /Ox % |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| main.cpp | 35 | 35 | 100.0 | 15 | 15 | 100.0 | 29 | 29 | 100.0 | 4 | 4 | 100.0 | 4 | 4 | 100.0 |
| main | 35 | 35 | 100.0 | 15 | 15 | 100.0 | 29 | 29 | 100.0 | 4 | 4 | 100.0 | 4 | 4 | 100.0 |
| test.cpp | 244 | 334 | 73.5 | 136 | 150 | 90.7 | 70 | 78 | 89.7 | 102 | 114 | 89.5 | 102 | 114 | 89.5 |
| initializer for FIZZ | 28 | 28 | 100.0 | 10 | 10 | 100.0 | 7 | 7 | 100.0 | 7 | 7 | 100.0 | 7 | 7 | 100.0 |
| initializer for BUZZ | 28 | 28 | 100.0 | 10 | 10 | 100.0 | 7 | 7 | 100.0 | 7 | 7 | 100.0 | 7 | 7 | 100.0 |
| sayFizzBuzz | 101 | 124 | 81.5 | 68 | 82 | 82.9 | 56 | 64 | 87.5 | 56 | 68 | 82.4 | 56 | 68 | 82.4 |
| sayFizz | 67 | 67 | 100.0 | 40 | 40 | 100.0 | - | - | - | 7 | 7 | 100.0 | 7 | 7 | 100.0 |
| sayBuzz | 0 | 67 | 0.0 | - | - | - | - | - | - | - | - | - | - | - | - |
| ctor FizzBuzz | 20 | 20 | 100.0 | 8 | 8 | 100.0 | - | - | - | - | - | - | - | - | - |
The compiler settings clearly influence the coverage values: for the example application, coverage values between 74% and 91% were calculated.
For comparable results, it is important to use the same compiler settings across different runs.
I think release mode without any optimizations works best: routines then do not contain additional debug-related instructions, and fewer routines are inlined.
On the other hand, uncalled methods (e.g. sayBuzz in the example) are not considered in release mode.
The tool offers an easy-to-use way to calculate coverage information for C and C++ applications. For languages that are interpreted or run on a virtual machine (e.g. Java, C# or Python), better-suited tools for determining coverage information are usually available.