I'm trying to load the columns of a file with a strange encoding. Windows appears to have no issues opening it, but Linux complains, and the only editor I have been able to open it with is Atom (other editors show either a blank file or garbled data).
The command:
file -i data_file.tit
returns:
application/octet-stream; charset=binary
Opening the file in binary mode and reading the first 400 bytes gives:
'0905077U1- a\r\nIntegration time: 19,00 ms\r\nAverage: 25 scans\r\nNr of pixels used for smoothing: 2\r\nData measured with spectrometer name: 0905077U1\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\r\nWave ;Dark ;Ref ;Sample ;Absolute Irradiance ;Photon Counts\r\n[nm] ;[counts] ;[counts] ;[counts] ;[\xb5Watt/cm\xb2/nm] ;[\xb5Mol/s/m\xb2/nm]\r\n247,40;-1,0378;18,713;10,738;21,132;0,4369\r\n247,'
The rest of the file consists only of ASCII numbers separated by semicolons.
I tried the following ways to load the file:
with open('data_file.tit') as f:
    bytes = f.read()  # (1)
    # bytes = f.read().decode('???')  # (2)
    # bytes = np.genfromtxt(f)  # (3)
    print bytes
(1) Sort of works but skips the first several hundred lines.
(2) Failed with every encoding I tried, with the error: codec can't decode byte 0xb5 in position 315: unexpected special character
(3) Complains about ValueError: Some errors were detected ! and shows for each line something similar to Line #3 (got 3 columns instead of 2).
How can I load this data file?
repr() can give you a Python representation of the data. Open the file in binary mode ('rb') and perhaps give us a sample.
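Following the binary-mode suggestion, here is a minimal Python 3 sketch of one way the sample above could be parsed. It rests on several assumptions that are not confirmed by the file's vendor: that the bytes 0xb5 and 0xb2 are Latin-1 (they decode to µ and ² there), that the NUL bytes are only padding, that the data rows begin right after the units line starting with "[nm]", and that numbers use a decimal comma. The function name parse_spectrum is hypothetical.

```python
def parse_spectrum(raw):
    """Split raw file bytes into (header_lines, rows_of_floats).

    Assumes Latin-1 text, NUL padding, ';' as the field separator
    and ',' as the decimal mark, as seen in the posted sample.
    """
    # Latin-1 maps every byte to a character, so this decode cannot fail.
    text = raw.decode('latin-1').replace('\x00', '')
    lines = text.splitlines()

    # Assumption: the units line starting with "[nm]" is the last header line.
    start = next(i for i, line in enumerate(lines)
                 if line.startswith('[nm]')) + 1

    rows = []
    for line in lines[start:]:
        if not line.strip():
            continue  # ignore blank lines
        # Turn the decimal comma into a point before converting.
        rows.append([float(field.replace(',', '.'))
                     for field in line.split(';')])
    return lines[:start], rows
```

To use it on the real file, read the whole file in binary mode, for example with open('data_file.tit', 'rb') as f: header, rows = parse_spectrum(f.read()). The list of rows can then be passed to np.array if a NumPy array is wanted.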