Friday, July 13, 2007

Surprised?

Unit Testing Effective?

Apparently there is an inverse correlation between the methods used in the field and unit test coverage. That is, the methods that the unit tests cover were not used in the field, and the methods that were used had no coverage.

A larger sample and verification of the results are necessary; however, it is an interesting result so far.
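One way to verify the claim on a larger sample would be to compute a correlation coefficient between field usage and coverage. The following is only a sketch: the `pearson` helper and the `samples` values are made up for illustration and are not taken from the FieldStat results. Methods with no test data would have to be dropped before running it.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical (field frequency, coverage) pairs, NOT real results.
samples = [(120, 0.0), (80, 0.1), (10, 0.9), (2, 1.0)]

r = pearson([freq for freq, _ in samples], [cov for _, cov in samples])
print(f"r = {r:.2f}")  # a negative r means usage and coverage move in opposite directions
```

Because usage counts are likely to be heavily skewed, a rank-based measure may be a better fit than Pearson; that choice is left open here.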

Data Collection Methods

Over 500 projects were downloaded from code.google.com. The projects were selected based on the label Mono or CSharp. In addition, 2 projects from a company were included. The projects were built manually, or a binary distribution was acquired. Some projects had to be excluded due to immaturity (not building), misclassification, or a lack of the resources needed to build the project.

Results

For this run, only 35 of the projects were included in the results.









The information shown includes the type.method and corresponding:
  • Frequency across all applications.
  • Number of applications that the type.method appears in.
  • Coverage for that type.method. (1 = 100%, 0.50 = 50%, 0 = 0%, -1 = no test data)
Note: Coverage data was not collected for Managed.Windows.Forms at this time.
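The three columns above can be derived from per-application call counts plus a coverage table. Here is a minimal sketch of that aggregation. The `summarize` function, the input shapes, and the sample data are assumptions for illustration, not FieldStat's actual implementation; only the -1 "no test data" convention comes from the results themselves.

```python
from collections import defaultdict

NO_TEST_DATA = -1  # same sentinel the results use for "no test data"


def summarize(usage_by_app, coverage):
    """Build one row per type.method.

    usage_by_app: {app_name: {type_method: call_count}}  (assumed shape)
    coverage:     {type_method: fraction between 0 and 1} (assumed shape)
    """
    frequency = defaultdict(int)  # total occurrences across all applications
    apps = defaultdict(set)       # which applications use each method
    for app, calls in usage_by_app.items():
        for method, count in calls.items():
            frequency[method] += count
            apps[method].add(app)

    rows = [
        {
            "method": method,
            "frequency": frequency[method],
            "app_frequency": len(apps[method]),
            "coverage": coverage.get(method, NO_TEST_DATA),
        }
        for method in frequency
    ]
    # Sort by number of applications first (descending order is an assumption).
    rows.sort(key=lambda r: (-r["app_frequency"], -r["frequency"], r["method"]))
    return rows


# Hypothetical input, not real field data.
usage = {
    "AppA": {"System.String.Concat": 40, "System.IO.File.Open": 3},
    "AppB": {"System.String.Concat": 12},
}
rows = summarize(usage, {"System.String.Concat": 0.5})
for row in rows:
    print(row)
```

Keeping -1 as a sentinel separate from 0 matters when reading the data: 0 means the tests ran and never reached the method, while -1 means nothing was measured at all.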

One thing I think would help is displaying the lines of code and the complexity of each method, in order to estimate how hard the method is to test and whether it is worth testing.

Viewing the Results Yourself

I uploaded the results here:
http://groups.google.com/group/mono-soc-2007/web/FieldStatResults.xml

You can get the Mono student projects from here:
svn checkout http://mono-soc-2007.googlecode.com/svn/trunk/ mono-soc-2007

Compile from source or run the executable (relies on Windows Forms) located at:
christopher/FieldStat/FieldStat/bin/Release/FieldStat.exe

You can easily load the results under the Results tab using the (Import Results) button. Load the XML file from the web, or from christopher/FieldStat/Data/Results/FieldStatResults.xml

The default sort is on AppFrequency; you can re-sort on the other fields by clicking the column headers.

Lessons and Design Issues

My original focus was just on accurately getting usage results from one executable and comparing them to the coverage data. However, once I was working with real data, I soon realized that the real challenge was dealing with many applications and how the information varies across them. There are some interesting emergent problems. For instance, suppose an application uses the log4net library. Is that treated as part of the application? What if several applications link against that library?
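The log4net question can be made concrete by comparing two counting policies: fold a shared library's calls into every application that ships it, or count the library once across the whole sample. The sketch below is illustrative only. `count_usage`, the input shape, and the sample data are hypothetical, and matching libraries by file name (ignoring versions) is a simplifying assumption.

```python
def count_usage(apps, shared_libraries, fold_libraries):
    """Total call counts per method across applications.

    apps: {app_name: {assembly_name: {type_method: call_count}}}  (assumed shape)
    shared_libraries: assembly names treated as third-party libraries
    fold_libraries: True  -> a library's calls count for every app that ships it
                    False -> each shared library is counted only once overall
    """
    totals = {}
    seen_libraries = set()
    for assemblies in apps.values():
        for assembly, calls in assemblies.items():
            if assembly in shared_libraries and not fold_libraries:
                if assembly in seen_libraries:
                    continue  # this library was already counted for another app
                seen_libraries.add(assembly)
            for method, count in calls.items():
                totals[method] = totals.get(method, 0) + count
    return totals


# Hypothetical sample: two applications that both bundle log4net.
apps = {
    "AppA": {
        "AppA.exe": {"System.Console.WriteLine": 5},
        "log4net.dll": {"System.IO.File.Open": 2},
    },
    "AppB": {
        "AppB.exe": {"System.Console.WriteLine": 1},
        "log4net.dll": {"System.IO.File.Open": 2},
    },
}

folded = count_usage(apps, {"log4net.dll"}, fold_libraries=True)
counted_once = count_usage(apps, {"log4net.dll"}, fold_libraries=False)
print(folded)
print(counted_once)
```

With folding, a popular library inflates the frequency of whatever it calls; counting it once avoids that but understates how often those methods actually ship. Neither policy is obviously right, which is the design issue.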

Collecting the applications for sampling field data should be a project in itself. There should be an easy, automated way to scan code.google.com, SourceForge, etc. and gather executables as samples.
