Selecting N random lines from a text file (Perl)
I often need to sample a few lines from a big file. The basic approach is to take just the first N lines, but that is not a real sample: avoiding bias requires selecting lines at random. Here is my solution in Perl:

#!/usr/bin/perl

=head1 NAME

randomLines.pl

=head1 DESCRIPTION

Subsample some random lines from a text file.

=head1 USAGE

    perl randomLines.pl [PARAM]

    Parameter      Description                     Value     Default
    -i --in        Input file                      File      STDIN
    -o --out       Output file                     File      STDOUT
    -n --num       Number of lines to sample       Integer   1000
    -t --total     Total lines in file             Integer   1000000
    -f --first     Include first line              Bool      No
    -h --help      Print this screen and exit
    -v --verbose   Verbose mode
       --version   Print version number and exit

=head1 EXAMPLES

1. S...
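
The -n and -t options above suggest a sampler that is told the total line count up front. As a minimal sketch of one way to use those two numbers (not necessarily how randomLines.pl itself works), the following applies selection sampling: it reads the input once and keeps each line with probability "lines still needed" divided by "lines still unread". The sub name sample_lines and the in-memory demo data are hypothetical, and the sketch assumes the total is exact and at least as large as the sample size.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Selection sampling: one pass over the input, keeping each line with
# probability (lines still needed) / (lines still unread).
# Assumption: $total is the exact line count. If it is larger than the
# real count, fewer than $n lines may be returned.
sub sample_lines {
    my ($fh, $n, $total) = @_;
    my @picked;
    my $seen = 0;
    while (my $line = <$fh>) {
        my $needed = $n - @picked;
        last if $needed <= 0;
        my $remaining = $total - $seen;
        $seen++;
        # rand($remaining) is uniform in [0, $remaining)
        push @picked, $line if rand($remaining) < $needed;
    }
    return @picked;
}

# Demo on an in-memory "file" of 100 lines (hypothetical data).
my $text = join '', map { "line $_\n" } 1 .. 100;
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
print sample_lines($fh, 5, 100);
close $fh;
```

The sampled lines come out in their original file order, and nothing beyond the sample itself is held in memory. If the total is not known in advance, reservoir sampling is the usual single-pass alternative.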