JonDeg

As mentioned in the accepted answer, GNU shuf supports simple random sampling (shuf -n) quite well. If sampling methods beyond those supported by shuf are needed, consider tsv-sample from eBay's TSV Utilities. It supports several additional sampling modes, including weighted random sampling, Bernoulli sampling, and distinct sampling. Performance is similar to GNU shuf (both are quite fast). Disclaimer: I am the author.
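
For illustration, here is a rough sketch of what these sampling modes might look like on the command line. The file name sample.tsv, its three columns, and the sample sizes and probabilities are all made up for this example. The tsv-sample options shown are as I understand them from the TSV Utilities documentation, so check tsv-sample --help for your installed version before relying on them.

```shell
# Build a small tab-separated file: a header plus 100 rows of id, group, weight.
# The file name and column layout are hypothetical, chosen only for this sketch.
{
  printf 'id\tgroup\tweight\n'
  seq 1 100 | awk '{ printf "%d\tg%d\t%d\n", $1, $1 % 5, ($1 % 10) + 1 }'
} > sample.tsv

# Simple random sampling with GNU shuf: pick 10 data lines, skipping the header.
tail -n +2 sample.tsv | shuf -n 10 > shuf_sample.txt

# The tsv-sample calls run only if the tool is installed.
if command -v tsv-sample >/dev/null 2>&1; then
  # Simple random sampling: 10 lines, header passed through (-H).
  tsv-sample -H -n 10 sample.tsv

  # Weighted random sampling: field 3 holds the line weights.
  tsv-sample -H -w 3 -n 10 sample.tsv

  # Bernoulli sampling: each line is kept independently with probability 0.1.
  tsv-sample -H -p 0.1 sample.tsv

  # Distinct sampling: keep roughly 40% of the distinct keys in field 2,
  # along with every line that carries one of the selected keys.
  tsv-sample -H -k 2 -p 0.4 sample.tsv
fi
```

Note that shuf has no notion of a header line, which is why the sketch strips it with tail first. The tsv-sample output varies from run to run unless a fixed seed is requested.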

Clarification of performance. Comments have conflicting statements about shuf performance. Current versions are very fast.