I have a Garmin Nuvi which uses OpenStreetMap (OSM) maps. Garmin does postcodes, but they are usually 2-3 years out of date for Scotland. OSM does not do British postcodes, but the Post Office does, and the data can be downloaded for free. The file is just under 1GB and has 16 columns, of which I only want the first 3.
I used cut to remove the extraneous columns, so I now have Postcode, Latitude and Longitude. Unfortunately the POI file needs to be Latitude, Longitude and Postcode, i.e. column 1 has to become column 3. To add to the problem, the postcode must be in quotes, e.g. EH9 1QG and SW12 1AB have to be "EH9 1QG" and "SW12 1AB".
I used awk rather awkwardly (see what I did there?) with:
awk 'BEGIN {FS="\t"; OFS=","} {print $2, $3, $1}' pc0.csv > pc.csv
and all it did was add 2 empty columns to the front.
It would be nice to be able to use a spreadsheet on it but there are over 3 million rows.
Any ideas?
The awk command gives the same rows as the cut output below, but with two commas at the front of each row, giving 2 empty columns.
This is what I get from the output of cut (pc0.csv):
Postcode Latitude Longitude
AB1 0AA,57.101474,-2.242851
AB1 0AB,57.102554,-2.246308
AB1 0AD,57.100556,-2.248342
AB1 0AE,57.084444,-2.255708
AB1 0AF,57.096656,-2.258102
AB1 0AG,57.097085,-2.267513
AB1 0AJ,57.099011,-2.252854
AB1 0AL,57.101765,-2.254688
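The two leading commas are consistent with the field separator not matching the file. If the data rows in pc0.csv are comma-separated, as they appear in the listing above, then awk with FS="\t" treats each whole line as $1 and leaves $2 and $3 empty. A small sketch that reproduces this on a single sample row, assuming comma-separated input (the variable names are only for illustration):

```shell
# One sample row, comma-separated as in the listing (an assumption about the real file)
line='AB1 0AA,57.101474,-2.242851'

# Same awk program as in the question: with FS set to a tab, the whole line
# becomes $1 and $2, $3 are empty, so the output starts with two commas.
wrong=$(printf '%s\n' "$line" | awk 'BEGIN {FS="\t"; OFS=","} {print $2, $3, $1}')
printf '%s\n' "$wrong"

# With FS set to a comma, the three fields are split and reordered as intended.
right=$(printf '%s\n' "$line" | awk 'BEGIN {FS=","; OFS=","} {print $2, $3, $1}')
printf '%s\n' "$right"
```

If the real file does turn out to contain tabs between the columns, the separator would need to be chosen to match what is actually there.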
So using the "cut" file above, which is now only 73MB, I need to convert it to:
Latitude,Longitude,Postcode
57.101474,-2.242851,"AB1 0AA"
57.102554,-2.246308,"AB1 0AB"
57.100556,-2.248342,"AB1 0AD"
57.084444,-2.255708,"AB1 0AE"
57.096656,-2.258102,"AB1 0AF"
57.097085,-2.267513,"AB1 0AG"
57.099011,-2.252854,"AB1 0AJ"
57.101765,-2.254688,"AB1 0AL"
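The conversion above could be sketched in awk roughly as follows. This is a sketch under assumptions rather than a tested solution for the full file: it assumes the data rows are comma-separated, that the first line is a header to be replaced, and it strips a trailing carriage return in case the file has Windows line endings. The names sample_in and sample_out are placeholders for illustration.

```shell
# Hypothetical sample standing in for pc0.csv (header plus two data rows)
sample_in=$(mktemp)
sample_out=$(mktemp)
printf 'Postcode,Latitude,Longitude\nAB1 0AA,57.101474,-2.242851\nAB1 0AB,57.102554,-2.246308\n' > "$sample_in"

awk 'BEGIN { FS = ","; OFS = "," }
     { sub(/\r$/, "") }                  # drop a trailing CR if present
     NR == 1 {                           # rewrite the header in the new column order
         print "Latitude", "Longitude", "Postcode"
         next
     }
     { print $2, $3, "\"" $1 "\"" }      # reorder columns and quote the postcode
' "$sample_in" > "$sample_out"

cat "$sample_out"
```

On the real file the same awk program would be given pc0.csv as input and redirected to pc.csv. Because awk reads one line at a time, the 3 million rows should not be a problem.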
Note that I had to remove the tabs to display these lines, so that is another problem: the final file may contain only commas as separators and nothing else, not even spaces unless they are inside quotes.
P.S. Linux (Ubuntu Mate) 22.04 LTS
SW12_1AB long1 lat1 (tabs between columns) - your script generates -long1,lat1,SW12_1AB- when the file has unix line endings (\n); when the file has windows line endings (\r\n) your script generates -SW12_1AB1; I'm guessing your file may have windows line endings, in which case you'll either want to remove them (e.g. dos2unix pc0.csv) or modify the awk script to factor in the trailing \r character; providing us with sample inputs and (expected, wrong) outputs will help us to determine the issue(s) and how to proceed
perl, awk, or even bash, but without a fuller description of your system, and a short example of your input (are there TABs?) and the desired output, I can't pick a solution.
{FS="\t"; OFS=","}, then the input file is not a CSV, it's a TSV. Am I right?
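The line-ending question raised in the comments can be checked directly. A minimal sketch, using a hypothetical two-line tab-separated sample with Windows line endings (the file and variable names are made up for illustration): it counts the lines that end in a carriage return, then strips the carriage returns with tr, which is one alternative to dos2unix.

```shell
# Hypothetical two-line sample with Windows (CRLF) line endings
crlf_file=$(mktemp)
printf 'AB1 0AA\t57.101474\t-2.242851\r\nAB1 0AB\t57.102554\t-2.246308\r\n' > "$crlf_file"

# Count lines that end in a carriage return; 0 would mean Unix line endings
cr_lines=$(awk '/\r$/ { n++ } END { print n + 0 }' "$crlf_file")
echo "lines ending in CR: $cr_lines"

# Strip the carriage returns into a new file and count again
clean_file=$(mktemp)
tr -d '\r' < "$crlf_file" > "$clean_file"
cr_after=$(awk '/\r$/ { n++ } END { print n + 0 }' "$clean_file")
echo "lines ending in CR after stripping: $cr_after"
```

Running the same count against pc0.csv would show whether the trailing \r is a factor at all.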