Do these validations make sense? Am I missing a better way of doing it?
Both ways involve rewriting the transformation code as a test. Sometimes you’re stuck testing that way, but I don’t think you are here.
Let me propose a simple regression test strategy that works like this:
compare(transform(test1_json), test1_xml)
Done like this you can have many tests that hit all your corner cases.
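Here's a minimal sketch of that idea in Python. Everything in it is my assumption, not your code: `transform` is a toy stand-in for your real transformation, `compare` is one possible implementation that canonicalizes both documents so whitespace differences don't cause false failures, and the cases are made up. It only shows the shape of the test: a table of input/expected pairs.

```python
import json
import xml.etree.ElementTree as ET


def transform(json_text: str) -> str:
    """Hypothetical stand-in for the real transformation (flat JSON object -> XML)."""
    data = json.loads(json_text)
    root = ET.Element("record")
    for key, value in data.items():
        child = ET.SubElement(root, key)
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")


def compare(actual_xml: str, expected_xml: str) -> bool:
    """Pure comparison: canonicalize both sides so formatting differences don't matter."""
    return (ET.canonicalize(actual_xml, strip_text=True)
            == ET.canonicalize(expected_xml, strip_text=True))


# Made-up cases: each pair is (input JSON, expected XML). Add one per corner case.
CASES = [
    ('{"name": "Ada"}', "<record><name>Ada</name></record>"),
    ('{"name": "Ada", "age": 36}',
     "<record>\n  <name>Ada</name>\n  <age>36</age>\n</record>"),
    ("{}", "<record/>"),
]


def run_cases() -> list:
    """Run every case and report pass/fail for each one."""
    return [compare(transform(json_text), expected) for json_text, expected in CASES]
```

Adding a corner case then means adding one pair to `CASES`. None of the transformation logic gets rewritten in the test.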
Whether I’d call that a unit test depends mostly on how fast it is and whether it involves I/O. You don’t actually have to hit the file system; you could do this with strings. How reasonable that is depends on the size of the JSON and XML.
The best argument for calling this a unit test is that compare is a pure function (at least it could be). The test certainly doesn’t have to be an integration test that deals with peripherals. But that depends on how you wrote the transform function and where you're keeping the json and xml. It doesn't have to know the file system even exists.
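One way to get there, sketched under the assumption that your transformation can be split this way: keep the core as string in, string out, and put the file handling in a thin wrapper. The names `transform` and `transform_file` are hypothetical, and the toy body does no XML escaping.

```python
import json
from pathlib import Path


def transform(json_text: str) -> str:
    """Hypothetical pure core: string in, string out, no I/O (toy logic, no escaping)."""
    data = json.loads(json_text)
    items = "".join(f"<{key}>{value}</{key}>" for key, value in data.items())
    return f"<record>{items}</record>"


def transform_file(source: Path, target: Path) -> None:
    """Thin I/O shell: the only place that knows the file system exists."""
    xml_text = transform(source.read_text(encoding="utf-8"))
    target.write_text(xml_text, encoding="utf-8")
```

The fast tests only ever call `transform`. `transform_file` is small enough that one slower test covering it should be plenty.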
I like to separate my slow tests from my fast ones. Mixing them together ruins the fast ones. Call them what you will.
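As one possible way to keep them apart, here is a sketch using only the standard `unittest` module. The environment variable name `RUN_SLOW_TESTS` is something I made up, and the test bodies are placeholders.

```python
import os
import unittest

# Assumed convention: slow tests only run when this variable is set to "1".
RUN_SLOW = os.environ.get("RUN_SLOW_TESTS") == "1"


class FastTests(unittest.TestCase):
    """In-memory tests: no files, no network, run on every change."""

    def test_small_string_case(self):
        self.assertEqual("abc".upper(), "ABC")  # placeholder assertion


@unittest.skipUnless(RUN_SLOW, "set RUN_SLOW_TESTS=1 to run slow tests")
class SlowTests(unittest.TestCase):
    """Tests that read large fixtures or touch the file system."""

    def test_large_fixture_case(self):
        self.assertTrue(True)  # placeholder for a test over big files
```

The default run stays fast, and the slow group is opt-in. Other test runners offer their own ways to do the same split.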