13 events
when what by license comment
May 10, 2021 at 14:33 history edited YMGenesis CC BY-SA 4.0
deleted 272 characters in body
May 10, 2021 at 14:22 comment added YMGenesis @cas That also makes a lot of sense. I guess my head isn't at that stage yet where I can think of these kinds of solutions, so I appreciate the input.
May 10, 2021 at 14:22 comment added YMGenesis @icarus That's very helpful to start 'sanitizing' my data for the range.
May 10, 2021 at 14:15 vote accept YMGenesis
May 10, 2021 at 6:41 comment added cas Then you either don't need to use -F'[- ]' as the field separator, or you can make awk split the input fields again with {$1 = $1 "-" $1; $0=$0}. The point is to have awk do the transformation on the fly, while it's reading the file(s).
May 10, 2021 at 6:33 comment added cas @YMGenesis Changing $1 if it doesn't include a range is a good idea, but you don't have to change all your input files. Just do something like $1 !~ /-/ {$1 = $1 "-" $1}, or $1 ~ /^[[:digit:]]+$/ {$1 = $1 "-" $1} in your awk script.
May 10, 2021 at 5:26 answer added Kamil Maciorowski timeline score: 1
May 10, 2021 at 4:44 comment added icarus sed '/^[0-9]* /s/^\([0-9]*\)/\1-\1/' addresses.txt > newaddresses.txt might be a good starting point.
May 9, 2021 at 23:22 history edited YMGenesis CC BY-SA 4.0
added 272 characters in body
May 9, 2021 at 23:16 comment added YMGenesis continued: But, I think if it makes more sense to add 1-1 ranges (two people have suggested it now), maybe I'll just bite the bullet and start changing the data. I appreciate the input.
May 9, 2021 at 23:12 comment added YMGenesis Yes there will be an overlap in some cases. Kamil did suggest the 1-1 range solution. My addresses.txt data is around 2000 lines long, so I'd have to go in and change each single number to a single 1-1 range. I agree with the philosophy you mention, clean and simple. Would just take a while to change the data. But if it works, it works. The data itself doesn't change much. It's supposed to be a database of addresses which specify which route a letter carrier delivers to (the number after the colon), so it doesn't change much at all. Maybe once a year or less. Little changes here and there.
May 9, 2021 at 22:58 comment added icarus Will there ever be an overlap between a range and the single numbers (or even a different range)? E.g. "3 fastest rd: 99" and "1-58 fastest rd: 98". In general there are two ways to approach this: (1) have clean data and a simple program, or (2) have dirty data and a complicated program. For this case I think clean data is a good approach, so perhaps you can change your "1 test st: 1" line to "1-1 test st: 1" so there are never any cases where you don't have a range (even if the range is just 1 long). This is not a good approach if the data changes frequently, so can you tell us how often it changes as well?
May 9, 2021 at 22:33 history asked YMGenesis CC BY-SA 4.0
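
The normalization suggested in the comments above (turning a bare house number N into the range N-N while the file is read, instead of editing the data by hand) might look roughly like the sketch below. The function name normalize_ranges is hypothetical, and the sample lines are only the examples quoted in the comments, not the real addresses.txt.

```shell
# Hypothetical helper: rewrite a bare leading number N as the range N-N.
# Lines whose first field is already a range (e.g. "1-58") pass through unchanged.
# With no file arguments, awk reads standard input.
normalize_ranges() {
    awk '$1 ~ /^[0-9]+$/ { $1 = $1 "-" $1 } { print }' "$@"
}

# Sample lines taken from the examples quoted in the comments.
printf '%s\n' '1 test st: 1' '1-58 fastest rd: 98' '3 fastest rd: 99' | normalize_ranges
```

Assigning to $1 makes awk rebuild the line with single spaces between fields, so runs of whitespace in the input would be collapsed. If the original spacing matters, the sed variant from icarus's comment leaves the rest of the line untouched.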