I have a number of tests where the Pandas dataframe output needs to be compared with a static baseline file. My preferred option for the baseline file format is the csv format for its readability and easy maintenance within Git. But if I were to load the csv file into a dataframe, and use
A.equals(B)
where A is the output dataframe and B is the dataframe loaded from the CSV file, inevitably there will be errors as the csv file does not record datatypes and what-nots. So my rather contrived solution is to write the dataframe A into a CSV file and load it back out the same way as B then ask whether they are equal.
Does anyone have a better solution that they have been using for some time without any issues?
sum((A != B).any(1))Let me know if this works, I can't test this myself as re-creating your situation is not easy