
How can I split the output of a command on a delimiter, in this case a new line, and then select one line at random?

An example of what I'm trying to do is:

curl -s http://www.compciv.org/files/pages/nyt-sample/ | pup 'article p text{}' \
        | sed 's/^[ \r\n\t]*//'

which outputs:


The fine of $35 million in each of two civil penalties, for a total of $70 million, is a record for the National Highway Traffic Safety Administration.


After becoming a grandmaster at the tender age of 13, Sam Sevian is getting some help from the chess champion Garry Kasparov.


In a small stand against gentrification, the nonprofit Wildflowers Institute has helped San Francisco’s gritty Tenderloin neighborhood define its cultural assets.


The currency has fallen because investors fear that the eurozone is stuck in a quagmire and its leaders are not doing much to pull it out.


President Obama is facing opposition from his party to one of his top priorities: winning the power to negotiate international trade pacts and speed them through Congress.

How can I split on the newlines and then select just one line, so that I get something like:

The currency has fallen because investors fear that the eurozone is stuck in a quagmire and its leaders are not doing much to pull it out.
  • You may want to take a look at the shuffle command shuf. Commented May 7, 2019 at 15:29
  • @RakeshSharma Could you be more specific? Just running shuf randomizes all the lines, but I want to select just one. Commented May 7, 2019 at 15:44
  • head and tail are other helpful commands in this situation. The Unix philosophy is about combining simple tools to do complex tasks, and head and tail are useful in a wide variety of situations like yours. Commented May 7, 2019 at 16:24
  • Relating unix.stackexchange.com/a/326614/117549 Commented May 7, 2019 at 16:30

1 Answer


The simple answer is to pipe your command into shuf -n1.  This shuffles the input (lines) into a random order and then outputs one of them (i.e., a random selection).
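As a quick way to see this without hitting the network, here is a minimal sketch that feeds three made-up lines to shuf -n1. The pick_one wrapper and the sample lines are illustrative names only, and it assumes GNU coreutils shuf is installed:

```shell
# pick_one is a hypothetical helper name, not part of any tool.
pick_one() {
    # shuf -n1 shuffles the lines on stdin and prints only one of them.
    shuf -n1
}

# Made-up sample input standing in for the real command's output.
printf '%s\n' 'first line' 'second line' 'third line' | pick_one
```

Each run prints exactly one of the three lines, and which one it is varies from run to run.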

But your (curl … | pup …) command outputs blank lines.  I guess that you don’t want to get one of those blank lines in your final output.  shuf doesn’t seem to have an option to ignore blank lines; you will have to remove them before shuf sees them.  A generic approach would be

curl -s http://www.compciv.org/files/pages/nyt-sample/ | pup 'article p text{}' \
        | sed 's/^[ \r\n\t]*//' | grep '.' | shuf -n1

but, since your existing command line already ends with a sed command, we can piggyback on that:

curl -s http://www.compciv.org/files/pages/nyt-sample/ | pup 'article p text{}' \
        | sed -e 's/^[ \r\n\t]*//' -e '/^$/d' | shuf -n1
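To check that the blank lines really are gone before shuf sees them, the following sketch runs the same two sed steps on made-up input. The sample and clean function names and the three words are assumptions for illustration, and [[:space:]] is swapped in for the bracket list above because it behaves the same way across sed implementations:

```shell
# Made-up input that mimics the pup output: indented text separated by blank lines.
sample() {
    printf '\n\n  alpha\n\n\n\tbeta\n\n  gamma\n'
}

# Strip leading whitespace from each line, then delete lines that are now empty.
clean() {
    sed -e 's/^[[:space:]]*//' -e '/^$/d'
}

# Only the three non-blank lines reach shuf, so a blank line can never be picked.
sample | clean | shuf -n1
```

Running sample | clean on its own shows the three remaining lines, which makes it easy to confirm the filter before adding shuf -n1 at the end.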

P.S. It doesn’t make sense to include \n in your sed s command: sed processes its input one line at a time, unless you tell it to do otherwise, and the line it works on never includes the trailing \n.
