I cannot figure out which method I should use to find:

  • frequency of occurrence
  • values that appear in the file

For example my file is:

  xxxxx, yyyy , 79
  xxxxx, yyyy , 80
  xxxxx, yyyy , 79
  xxxxx, yyyy , 81
  xxxxx, yyyy , 80

and I want to find that 79 and 80 each account for 40% of the occurrences and 81 for 20%. How can I do that (without R, if possible)?

I need this because I want to plot a histogram using gnuplot. Can you also show me how to use the calculated values to plot it?

1 Answer

Some combination of sort and uniq might do the trick. You could start with

cut -d ',' -f 3 file | sort -n | uniq -c > file.1
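The pipeline above yields raw counts, while the question asks for percentages. A minimal sketch of one way to get them with awk follows; the sample data is copied from the question, and the output name file.pct is only an illustrative choice, not something from the original answer.

```shell
# Sample data from the question, stored under the name "file" used in the answer.
cat > file <<'EOF'
xxxxx, yyyy , 79
xxxxx, yyyy , 80
xxxxx, yyyy , 79
xxxxx, yyyy , 81
xxxxx, yyyy , 80
EOF

# Count every value in column 3, then print "value percent" after the last line.
# $3 + 0 forces a numeric conversion, which also drops the surrounding spaces.
# NF >= 3 skips blank or short lines. LC_ALL=C keeps "." as the decimal separator.
LC_ALL=C awk -F ',' '
    NF >= 3 { count[$3 + 0]++; total++ }
    END { for (v in count) printf "%d %.1f\n", v, 100 * count[v] / total }
' file | sort -n > file.pct

cat file.pct
```

For the sample data this writes the lines "79 40.0", "80 40.0" and "81 20.0". Because the value comes first here, the matching gnuplot column selection would be using 1:2 rather than 2:1.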

To plot in gnuplot, do

gnuplot
plot [78:82][0:3] "file.1" using 2:1 with boxes

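The first bracket pair, [78:82], sets the xrange; the second, [0:3], sets the yrange. Both could be determined automatically, but a quick look at the file to find the min/max values works well in this demo case. The using 2:1 part puts the value (column 2 of file.1) on the x axis and its count (column 1) on the y axis.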

Depending on your OS and configuration, this could be enough. You might also need to use set terminal and set output. (Start gnuplot and type help; there is built-in help.)
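If no plot window appears, writing the plot to an image file is one option. The following is a rough sketch, assuming your gnuplot build includes the png terminal (pngcairo is a common alternative); the names plot.gp and histogram.png are made up for illustration.

```shell
# Recreate the count file in the "count value" layout that uniq -c produces,
# so this sketch runs on its own.
printf '2 79\n2 80\n1 81\n' > file.1

# Write a small gnuplot script (plot.gp and histogram.png are illustrative names).
cat > plot.gp <<'EOF'
set terminal png size 640,480
set output "histogram.png"
set xrange [78:82]
set yrange [0:3]
set xtics 1
set boxwidth 0.8
set style fill solid
plot "file.1" using 2:1 with boxes notitle
EOF

# Render only when gnuplot is installed; report a failure instead of aborting.
if command -v gnuplot >/dev/null 2>&1; then
    gnuplot plot.gp || echo "gnuplot could not render the plot" >&2
fi
```

Keeping the commands in a script file means the plot can be regenerated with gnuplot plot.gp whenever file.1 changes.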

2 Comments

Sorry, I can't figure out how to use that, and I'm still stuck.
Hopefully this helps; just say if you still need help.
