Revisions to Design a test framework for validating static files

Wording improved

edited Jul 5, 2024 at 10:59

I don't think your comparison of these tests with "unit tests" and "integration tests" matches the popular understanding of these terms.

A unit test would be a test which takes a single (partial) transformation/calculation function or module from your Python scripts and validates it in isolation, probably without using any input and output files.
Any test which runs a fully combined complex transformation / calculation step, as described in the question, is something I would call an integration test, regardless of whether it is using approach #1 or #2.
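
To make the distinction concrete, here is a minimal sketch. The functions `item_to_element` and `convert_file` are hypothetical stand-ins, since the question shows no actual code; only the shape of the two tests matters.

```python
import json
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def item_to_element(item: dict) -> ET.Element:
    """Hypothetical partial transformation: one JSON object -> one XML element."""
    elem = ET.Element("item", id=str(item["id"]))
    ET.SubElement(elem, "total").text = str(item["price"] * item["quantity"])
    return elem


def convert_file(json_path: Path, xml_path: Path) -> None:
    """Hypothetical full step: read a JSON file, write an XML file."""
    items = json.loads(json_path.read_text(encoding="utf-8"))
    root = ET.Element("items")
    for item in items:
        root.append(item_to_element(item))
    ET.ElementTree(root).write(str(xml_path), encoding="utf-8", xml_declaration=True)


def test_item_to_element_unit():
    # Unit test: a single function in isolation, no files involved.
    elem = item_to_element({"id": 7, "price": 2, "quantity": 3})
    assert elem.get("id") == "7"
    assert elem.findtext("total") == "6"


def test_convert_file_integration():
    # Integration test: the combined step, from input file to output file.
    with tempfile.TemporaryDirectory() as tmp:
        src, dst = Path(tmp) / "in.json", Path(tmp) / "out.xml"
        src.write_text(json.dumps([{"id": 1, "price": 5, "quantity": 2}]), encoding="utf-8")
        convert_file(src, dst)
        root = ET.parse(str(dst)).getroot()
        assert [e.findtext("total") for e in root.findall("item")] == ["10"]
```

The first test never touches the file system; the second one exercises reading, transforming and writing together, which is what makes it an integration test in the usual sense.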

All of those test types can make sense, and for most scenarios I can think of, tests of all three categories (unit test, integration test of type #1 or #2) will be useful. Note that your approach #1 is not suitable for testing the semantic correctness of an XML file, only the syntactic correctness: looking at an output file while ignoring the input it was produced from cannot tell you whether the content is correct with regard to that input.
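
A small illustration of that difference, assuming a made-up transformation which should write `price * quantity` into a `<total>` element (the element names and the rule are invented for the example, not taken from the question):

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Syntactic check: looks at the output only."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def matches_input(order: dict, xml_text: str) -> bool:
    """Semantic check: is the output correct with regard to this input?"""
    root = ET.fromstring(xml_text)
    expected = order["price"] * order["quantity"]
    return root.findtext("total") == str(expected)


order = {"price": 4, "quantity": 3}
wrong_output = "<order><total>7</total></order>"  # price + quantity: a plausible bug

assert is_well_formed(wrong_output)            # the syntactic check is satisfied
assert not matches_input(order, wrong_output)  # only the input reveals the error
```

The buggy output is perfectly valid XML, so a check which sees only the output file has no way to notice the wrong total.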

Still, which tests make the most sense, and which category one should focus on, ultimately depends on the specific transformation rules and the specific requirements - which are not mentioned, not even sketched, in the question.

And that is the real warning sign to me: my impression is you believe this question could be answerable without knowing more about the semantics of the described processes. The question does not give readers a clue, and there is not even a small example, hence I guess you believe this isn't necessary to understand what's going on.

Don't get me wrong, but I think that does not work. Even when you want your testing framework to stay somewhat generic, you still need to analyse your testing requirements first! Maybe not in all the gory details, but at least at a level of abstraction where you have a firm understanding of the problem domain.

Here are some questions you may start with. Ask yourself:

are there transformations or calculations which can be tested on their own? Or could be, after some minor refactorings?
what are the typical use cases or individual transformations your test code may have to deal with?
roughly - how many JSON files are required as input for each individual use case, and how many XML files will the case produce? Is it always 1 JSON file which leads to 1 XML file? Or do you expect different numbers?
which means are available to verify the semantic correctness of a certain output? Can this be done manually? Are there some tools available, or do you have to create such tools? Will a sample examination be sufficient, or do you need a full verification?
are certain transformations reversible, so one might apply the transformation, apply the reverse one and compare the original input with the result?
what is the order of magnitude of cases you have to deal with, the typical size of those JSON or XML files and the typical numbers? Are the numbers small enough so test data can be constructed manually, or are they so large you need to think of a programmatic solution?
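
For the question about reversible transformations, a round-trip test could look roughly like this. Both `to_xml` and `from_xml` are hypothetical; whether your real transformations have an inverse at all is exactly what you need to find out first.

```python
import xml.etree.ElementTree as ET


def to_xml(record: dict) -> str:
    """Hypothetical forward transformation: flat dict of strings -> XML text."""
    root = ET.Element("record")
    for key, value in record.items():
        ET.SubElement(root, key).text = value
    return ET.tostring(root, encoding="unicode")


def from_xml(xml_text: str) -> dict:
    """Hypothetical reverse transformation: XML text -> flat dict of strings."""
    return {child.tag: child.text or "" for child in ET.fromstring(xml_text)}


def test_round_trip():
    # Apply the transformation, apply the reverse one, compare with the input.
    original = {"name": "widget", "colour": "red", "note": "a < b & c"}
    assert from_xml(to_xml(original)) == original
```

Such a test needs no hand-written expected XML file, which helps when the number of cases is too large to construct test data manually.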

Find out what kinds of examples and data scenarios, and how many of them, you need to run through your framework to feel confident in the code. Then you will probably reach the point where you know which of the tests sketched above will serve you best, and whether your overall approach makes sense.

added 45 characters in body

edited Jul 4, 2024 at 9:40

Doc Brown

I don't think your comparison of these tests with "unit tests" and "integration tests" matches the popular understanding of these terms.

A unit test would be a test which takes a single (partial) transformation/calculation function or module from your Python scripts and validates it in isolation, probably without using any input and output files.
Any test which runs a fully combined complex transformation / calculation step, as described in the question, is something I would call an integration test, regardless of whether it is using approach #1 or #2.

All of those test types can make sense, and for most scenarios I can think of, tests of all three categories (unit test, integration test of type #1 or #2) will be useful. Still, how much sense they really make, and where to put the focus, ultimately depends on the specific transformation rules and the specific requirements - which are not even sketched in the question.

And that is the real warning sign to me: my impression is you believe this question could be answerable without knowing more about the semantics of the described processes. The question does not even give readers a clue by posting some example, hence I guess you believe this isn't necessary to understand what's going on.

Don't get me wrong, but I think that does not work. Even when you want your testing framework to stay somewhat generic, you still need to analyse your testing requirements first! Maybe not in all the gory details, but at least at a level of abstraction where you have a firm understanding of the problem domain.

Here are some questions you may start with. Ask yourself:

are there transformations or calculations which can be tested on their own? Or could be, after some minor refactorings?
what are the typical use cases or individual transformations your test code may have to deal with?
roughly - how many JSON files are required as input for each individual use case, and how many XML files will the case produce? Is it always 1 JSON file which leads to 1 XML file? Or do you expect different numbers?
which means are available to verify the semantic correctness of a certain output? Can this be done manually? (Note that your approach #1 does not seem suitable for testing this, since looking at an output file with no idea what input it was produced from can only tell you whether the output is syntactically correct, not whether it is what you semantically expected.)
are certain transformations reversible, so one might apply the transformation, apply the reverse one and compare the original input with the result?
what is the order of magnitude of cases you have to deal with, the typical size of those JSON or XML files and the typical numbers? Are the numbers small enough so test data can be constructed manually, or are they so large you need to think of a programmatic solution?

Find out what kinds of examples and data scenarios, and how many of them, you need to run through your framework to feel confident in the code. Then you will probably reach the point where you know which of the tests sketched above will serve you best, and whether your overall approach makes sense.

added 577 characters in body

edited Jul 4, 2024 at 7:52

Doc Brown

I don't think your idea of a comparison with "unit tests" and "integration tests" is very useful the way you described it.

A unit test would be a test which takes a single transformation/calculation function or module from your Python scripts and validates it in isolation, probably without using any input and output files.
Any test which runs a combined complex transformation + calculation step is something I would call an integration test, regardless of whether it is using approach #1 or #2.

All of those test types can make sense, and for most scenarios I can think of, tests of all three categories will be useful. Still, how much sense they really make, and where to put the focus, ultimately depends on the specific transformation rules and the specific requirements - which are not stated in the question.

And that is the real warning sign to me: you really seem to believe this question could be answerable without knowing more about the semantics of the described processes, not even giving readers a clue by posting some example. Don't get me wrong, but I think that does not work. You need to analyse your testing requirements first! Ask yourself:

are there transformations or calculations which can be tested on their own?
what are the different use cases or individual transformations your code can serve?
how many JSON files are required as input for each individual use case, and how many XML files will the case produce?
which means are available to verify the semantic correctness of a certain output?
are certain transformations reversible, so one might apply the transformation, apply the reverse one and compare the original input with the result?
what is the order of magnitude of cases you have to deal with, the typical size of those JSON or XML files and the typical numbers? Are the numbers small enough so test data can be constructed manually, or are they so large you need to think of a programmatic solution?

Find out what kinds of examples and data scenarios, and how many of them, you need to run through your framework to feel confident in the code. Then you will probably reach the point where you know which of the tests sketched above will serve you best, and whether your overall approach makes sense.

I don't think your comparison of these tests with "unit tests" and "integration tests" matches the popular understanding of these terms.

A unit test would be a test which takes a single (partial) transformation/calculation function or module from your Python scripts and validates it in isolation, probably without using any input and output files.
Any test which runs a fully combined complex transformation / calculation step, as described in the question, is something I would call an integration test, regardless of whether it is using approach #1 or #2.

All of those test types can make sense, and for most scenarios I can think of, tests of all three categories (unit test, integration test of type #1 or #2) will be useful. Still, how much sense they really make, and where to put the focus, ultimately depends on the specific transformation rules and the specific requirements - which are not even sketched in the question.

And that is the real warning sign to me: my impression is you believe this question could be answerable without knowing more about the semantics of the described processes. The question does not even give readers a clue by posting some example, hence I guess you believe this isn't necessary to understand what's going on.

Don't get me wrong, but I think that does not work. Even when you want your testing framework to stay somewhat generic, you still need to analyse your testing requirements first! Maybe not in all the gory details, but at least at an intermediate level of abstraction.

Here are some questions you may start with. Ask yourself:

are there transformations or calculations which can be tested on their own? Or could be, after some minor refactorings?
what are the typical use cases or individual transformations your test code may have to deal with?
roughly - how many JSON files are required as input for each individual use case, and how many XML files will the case produce? Is it always 1 JSON file which leads to 1 XML file? Or do you expect different numbers?
which means are available to verify the semantic correctness of a certain output? Can this be done manually? (Note that your approach #1 does not seem suitable for testing this, since looking at an output file with no idea what input it was produced from can only tell you whether the output is syntactically correct, not whether it is what you semantically expected.)
are certain transformations reversible, so one might apply the transformation, apply the reverse one and compare the original input with the result?
what is the order of magnitude of cases you have to deal with, the typical size of those JSON or XML files and the typical numbers? Are the numbers small enough so test data can be constructed manually, or are they so large you need to think of a programmatic solution?

Find out what kinds of examples and data scenarios, and how many of them, you need to run through your framework to feel confident in the code. Then you will probably reach the point where you know which of the tests sketched above will serve you best, and whether your overall approach makes sense.

added 11 characters in body

edited Jul 4, 2024 at 6:51

Doc Brown

added 834 characters in body

edited Jul 4, 2024 at 6:28

Doc Brown

added 17 characters in body

edited Jul 3, 2024 at 22:40

Doc Brown

answered Jul 3, 2024 at 22:34

Doc Brown
