So if you wrote a unit test like this...

for (int i = 0; i < 100; i++) {
    assertEquals(i * i, square(i));
}

you would be given 100 points.

I would give this person 0 points (even if the test checked something actually relevant), because assertions within a loop make little sense, and tests with multiple asserts (especially in the form of a loop or a map) are difficult to work with.
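
If every input really deserves a check, the idiomatic alternative to an assertion loop is a parameterized test, where each value is reported as its own case and one failure does not hide the rest. A minimal JUnit 5 sketch, assuming the junit-jupiter-params artifact is on the classpath (square() here is a self-contained stand-in for the method under test):

import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.ValueSource;

class SquareTest {

    // Each value becomes its own reported test case: a failure at
    // i = 7 does not abort the runs for the remaining values, and
    // the offending input appears directly in the test report.
    @ParameterizedTest
    @ValueSource(ints = {0, 1, 2, 7, 99})
    void squareEqualsValueTimesItself(int i) {
        assertEquals(i * i, square(i));
    }

    // Stand-in implementation so the sketch is self-contained.
    static int square(int x) {
        return x * x;
    }
}

With the loop version, a failure at the first value stops the test and the remaining 99 cases never run; the parameterized version runs and reports all of them.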

The problem is essentially to have a metric which cannot [easily] be cheated. A metric based exclusively on the number of asserts is exactly the same as paying developers per LOC written. Just as pay-per-LOC leads to huge, impossible-to-maintain code, your current company policy leads to useless and possibly badly written tests.

If the number of asserts is irrelevant, the number of tests is irrelevant as well. The same goes for many other metrics (including combined ones) that one could imagine for this sort of situation.

Ideally, you would apply a systemic approach. In practice, this can hardly work in most software development companies, so I can suggest a few other things:

  1. Use pair reviews for tests, together with something similar to the "WTFs per minute" metric.

  2. Measure the impact of those tests on the number of bugs over time. This has several benefits:

  • Seems fair,
  • Can actually be measured if you collect enough data about bug reports and their fate,
  • Is actually worth it!
  3. Use branch coverage, but combine it with other metrics (as well as a review). Branch coverage has its benefits, but testing CRUD code just to get a better grade is not the best way to spend developers' time. (A minimal sketch of such a combined gate follows this list.)

  4. Decide together which metrics you want to enforce for the moment (such decisions may not be welcome, or even possible, in some companies and teams). Review and change the metrics often, picking the ones that become more relevant, and make sure everyone clearly understands what is measured and how.
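
On the third point, a combined gate is easy to express: branch coverage becomes one input among several rather than the score itself. The sketch below is purely illustrative; TestSuiteMetrics, the review score, and both thresholds are hypothetical stand-ins for whatever signals your team actually collects (it uses a Java 16+ record for brevity):

public class QualityGate {

    // Hypothetical per-suite signals: a branch-coverage ratio from a
    // coverage tool, and a 1-5 rating from a pair review of the tests.
    record TestSuiteMetrics(double branchCoverage, double reviewScore) {}

    // Branch coverage alone is easy to game (e.g. by covering trivial
    // CRUD code), so the gate also requires the human review signal.
    static boolean passes(TestSuiteMetrics m) {
        return m.branchCoverage() >= 0.75   // illustrative threshold
            && m.reviewScore() >= 3.0;      // illustrative threshold
    }

    public static void main(String[] args) {
        // High coverage with a poor review still fails the gate.
        System.out.println(passes(new TestSuiteMetrics(0.95, 1.5))); // false
        System.out.println(passes(new TestSuiteMetrics(0.80, 4.0))); // true
    }
}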
