I'm running into some oddities with a dictionary file in .dsl format that I'm trying to convert. It's essentially a text file containing the dictionary entries. The dictionary software I use is GoldenDict, which requires UTF-16 dictionaries in order to render them properly.
All the dictionaries I have are in UTF-16LE format, but one stands out: it is encoded in iso-8859-1. An entry looks like this when I open it in vim:
abandonarse
[m2][c crimson][b]Sinónimos[/b][/c][/m]
[m2][i][c green]verbo[/c][/i][/m]
[m1][trn][b]desanimarse:[/b] <<desanimarse>>, <<abatirse>>, <<tumbarse>>, <<plegarse>>, <<entregarse>>, <<desligarse>>[/trn][/m]
I have to convert it to UTF-16LE because GoldenDict renders some Cyrillic characters instead of the Spanish accented characters. So I tried:
iconv -f iso-8859-1 -t utf-16le dictionary.dsl -o test.dsl
The new test.dsl dictionary is rendered correctly by GoldenDict, but I can see a couple of peculiar things I would love to get rid of. First, the encoding of the freshly converted file is not recognized the way it usually is for my other dictionaries:
aleksandr@desktop:~/windoc/Dic/Es extra/dictionary.dsl> file dictionary.dsl
dictionary: data
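One likely explanation is that the converted file carries no byte order mark (BOM), since iconv does not write one when the target names an explicit endianness such as utf-16le. The following is only a sketch of one way to add the BOM by hand; the file names sample-latin1.dsl and sample-utf16.dsl are made up for illustration, and the one-word sample stands in for the real dictionary.

```shell
# Create a tiny ISO-8859-1 sample; \363 is the octal byte for "ó" in that encoding.
printf 'Sin\363nimos\n' > sample-latin1.dsl

# Write the UTF-16LE byte order mark (bytes FF FE) first ...
printf '\377\376' > sample-utf16.dsl

# ... then append the converted text after it.
iconv -f ISO-8859-1 -t UTF-16LE sample-latin1.dsl >> sample-utf16.dsl

# Show the leading bytes so the BOM can be checked by eye.
od -An -tx1 sample-utf16.dsl | head -n 1
```

With the BOM in place, running file on the result should have a signature to go on, although the exact wording of its report may differ between versions.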
Second, when I open test.dsl in vim, every character in it has ^@ added to it. Here is the same entry:
^@<^@<^@e^@n^@t^@r^@e^@g^@a^@r^@s^@e^@>^@>^@,^@ ^@<^@<^@d^@e^@s^@l^@i^@g^@a^@r^@s^@e^@>^@>^@[^@/^@t^@r^@n^@]^@[^@/^@m^@]^@
^@ ^@[^@m^@2^@]^@[^@c^@ ^@c^@r^@i^@m^@s^@o^@n^@]^@[^@b^@]^@A^@n^@t^@ó^@n^@i^@m^@o^@s^@[^@/^@b^@]^@[^@/^@c^@]^@[^@/^@m^@]^@
^@ ^@[^@m^@2^@]^@[^@i^@]^@[^@c^@ ^@g^@r^@e^@e^@n^@]^@v^@e^@r^@b^@o^@[^@/^@c^@]^@[^@/^@i^@]^@[^@/^@m^@]^@
I tried removing these characters in vim:
%s/<Ctrl-V><Ctrl-J>//g
However, when I save the file, its encoding is iso-8859-1 again. I would like this file to be shown without the ^@ characters, because I may need to edit some headings in the dictionary manually.
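The ^@ marks are the zero bytes that make up half of each UTF-16LE character, so deleting them turns the file back into single-byte text. A possible workaround, sketched here under assumptions, is to edit a UTF-8 copy and convert it back afterwards. All file names below are hypothetical, and the sed call merely stands in for a manual edit.

```shell
# Stand-in for a BOM-less UTF-16LE file like the one produced in the question.
printf 'Sin\363nimos\n' | iconv -f ISO-8859-1 -t UTF-16LE > test-sample.dsl

# Decode to UTF-8 so that any editor or text tool can work on it.
iconv -f UTF-16LE -t UTF-8 test-sample.dsl > editable.dsl

# Hypothetical edit: change the heading text (replace with real manual editing).
sed 's/Sin/Ant/' editable.dsl > edited.dsl

# Encode back to UTF-16LE, writing the BOM (bytes FF FE) before the text.
printf '\377\376' > final-sample.dsl
iconv -f UTF-8 -t UTF-16LE edited.dsl >> final-sample.dsl
```

Alternatively, vim can be told to reread the current file as UTF-16LE with :e ++enc=utf-16le, which should display the text without the ^@ marks and keep that encoding when writing.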
Note that iconv does not add a BOM when it is explicitly told the endianness to use. Getting file to correctly recognize a text file as UTF-16LE when the file does not have a BOM is really a separate question.