edited Dec 2, 2014 at 6:49

Short answer: you need both:

- fake data, with well-defined input X and output Y
- real-world data, probably with the modifications you suggested

Use the first kind especially when doing TDD (as your tag indicates). Once the basic algorithm is in place, use the second kind for integration or acceptance tests. The fast tests on fake data spare you from running the (probably slow) tests on real-world data more often than necessary.
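To make the split concrete, here is a minimal sketch using Python's `unittest`. Everything specific in it is an assumption for illustration: `moving_average` is only a stand-in for your real algorithm, and the `REAL_DATA_FILE` environment variable is a made-up switch for the slow tests.

```python
import os
import unittest


def moving_average(values, window):
    """Hypothetical stand-in for the algorithm under test."""
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


class FakeDataTests(unittest.TestCase):
    """Fast tests: hand-built input X with a known output Y."""

    def test_known_input_gives_known_output(self):
        self.assertEqual(moving_average([1, 2, 3, 4], 2), [1.5, 2.5, 3.5])

    def test_invalid_window_is_rejected(self):
        with self.assertRaises(ValueError):
            moving_average([1, 2, 3], 0)


# Assumption: real-world samples are only exercised when this (made-up)
# environment variable points at a file with one number per line.
REAL_DATA = os.environ.get("REAL_DATA_FILE")


@unittest.skipUnless(REAL_DATA, "set REAL_DATA_FILE to run the slow tests")
class RealDataTests(unittest.TestCase):
    """Slow integration-style test: checks properties, not exact values."""

    def test_output_stays_plausible(self):
        with open(REAL_DATA) as handle:
            values = [float(line) for line in handle if line.strip()]
        result = moving_average(values, 3)
        self.assertEqual(len(result), len(values) - 2)
        # Small tolerance, because float rounding can push an average
        # marginally outside the input range.
        low, high = min(values) - 1e-9, max(values) + 1e-9
        self.assertTrue(all(low <= r <= high for r in result))
```

Running `python -m unittest` on this module executes only the fake-data tests; the real-data class is skipped unless the environment variable is set, so the slow run can be reserved for a nightly or pre-release job.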

> Something tells me there is either a nice middle ground which allows for testing flexibility without deviating from how the application works in real life or a completely different testing approach

Sorry, but there is no "magic bullet" so far. Testing complex algorithms is hard work that demands analytical skill. Whole books have been written about how to construct test cases efficiently, and the techniques Glenford Myers described in his book on software testing, first published in 1979 as far as I know, are still valid today.

answered Dec 2, 2014 at 5:37

Doc Brown
