I am trying to sort a file called data for learning purposes. It's given in my textbook.

5 27
2 12
3 33
23 2
-5 11
15 6
14 -9

Q1) What order does plain sort data produce in this case?

Q2) I am working in one folder. sort data works, but sort +1n data does not. Why? I typed it exactly as in the book and I get this error:

sort: cannot read: +1n: No such file or directory

EDIT - The book wants to skip column 1 and sort by column 2. That's why +1n might be used.

I use Lubuntu 13 to learn Unix bash scripting.

PS - Here is the output of sort data:

14 -9
15 6
2 12
23 2
3 33
-5 11
5 27
  • You're lacking the flag -k, to define key ranges. Note they are ranges! Not columns! Commented Aug 14, 2013 at 23:27
  • Why could Q1 be a question? can't you just try it? Commented Aug 14, 2013 at 23:31
  • @blasto you must have a really old book Commented Aug 14, 2013 at 23:47
  • @blasto that argument style (the +1n) was eliminated in modern versions of sort. See my response Commented Aug 14, 2013 at 23:59
  • In part, RTFM for sort, but the number before the comma is the start column, and the number after is the end column. Using sort -k1 data is the same as sort data and means sort by column 1, then by column 2, then by column 3, ... You can also use sort -k1.2 data which sorts starting with the second character in column 1, etc. Note that on any specific system, including Linux in particular, the sort command often has options not defined by POSIX. Commented Aug 15, 2013 at 1:09

1 Answer

sort by default sorts the entire line lexicographically, so the output of the first command will be:

-5 11
14 -9
15 6
2 12
23 2
3 33
5 27

- comes before 1 (check the ASCII codes for each)
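As a rough check of that claim, the sketch below prints the two character codes and then sorts the question's data bytewise. It assumes a POSIX-style shell with mktemp available; the temporary directory and the variable names are only for illustration. LC_ALL=C is set explicitly because, as the comments under this answer point out, the default order depends on the locale.

```shell
# Print the ASCII codes of '-' and '1'.
# A leading quote in the argument makes printf emit the code of the next character.
minus_code=$(printf '%d' "'-")
one_code=$(printf '%d' "'1")
echo "$minus_code $one_code"

# Recreate the question's file in a scratch directory (hypothetical location).
tmpdir=$(mktemp -d)
printf '%s\n' '5 27' '2 12' '3 33' '23 2' '-5 11' '15 6' '14 -9' > "$tmpdir/data"

# Sort whole lines by byte value, independent of the user's locale.
c_sorted=$(LC_ALL=C sort "$tmpdir/data")
echo "$c_sorted"

rm -r "$tmpdir"
```

With the locale forced to C, the line starting with -5 should come first, matching the listing above.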

According to the POSIX standard, the aforementioned order is correct. GNU sort (the version used in Ubuntu) appears to deviate.

The +1n argument also stems from older versions of sort:

Earlier versions of this standard also allowed the - number and + number options. These options are no longer specified by POSIX.1-2008 but may be present in some implementations.

First, the zero-based counting used by sort is not consistent with other utility conventions.

http://pubs.opengroup.org/onlinepubs/9699919799/utilities/sort.html

Putting the facts together, older versions of sort treated +1 as if it were -k 2, so on Ubuntu you should use sort -k2 -n data.


6 Comments

In my ubuntu, -5 comes between 3 and 5 by default.
@MartinZhang digging through the POSIX standard, it's clear that Ubuntu is wrong here
It is locale-dependent. Set LC_ALL=C and the output of sort is different (as claimed here).
My LC-related environment variables are zh_CN.UTF-8. After exporting LC_ALL=C, I get exactly the same order as yours. Thanks.
@blasto: I think Nirk is over-claiming — not accounting for locale-sensitivity. You find your locale by looking at the environment variables that affect it — LANG, LC_ALL, LC_COLLATE, LC_NUMERIC, LC_TIME, LC_MONETARY, LC_CTYPE, LC_MESSAGES (and possibly others). Try man locale to find out more (works on Mac OS X, listing information for locale command); or man setlocale.