Random lines in a text file
Sometimes I need to sample large data sets, so I randomly select some lines from the file. My files are generally text records, one record per line, so I wrote this small script to do the task:

#!/usr/bin/perl -w
use strict;

=head1 NAME

selectRandomLines.pl

=head1 DESCRIPTION

Select random lines in a file.

=cut

$ARGV[2] or die "Usage: selectRandomLines.pl TOTAL_LINES_IN_FILE NUM_LINES_WANTED FILE_NAME\n";

my $total  = shift @ARGV;                    # Total lines in the file
my $want   = shift @ARGV;                    # Total lines to select
my $file   = shift @ARGV;                    # The file
my $line   = 0;                              # Line counter
my %select = randomSelect($total, $want);    # Hash with selected lines

open FILE, "$file" or die "cannot open $file\n";
while (<FILE>) {
    print "$_" if (defined $select{$line++});
}
close FILE;

=head1 SUBROUTINES

randomSelect()

CALL: randomSelect(TOTAL_ELEMENTS, TOTAL_WANTED) [NUM, NUM]

RETURN: %s [HASH]

=cut

sub randomSelect {
    my $t = shift @_;
    my $n = shift...
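
The script depends on a helper that returns a hash keyed by the selected line numbers. As a rough illustration of how such a hash could be built, here is a minimal sketch. The name pick_random_indices and its draw-until-distinct loop are assumptions made for this example, not the original randomSelect() code.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (an assumption, not the original randomSelect):
# pick $n distinct 0-based line numbers out of $t and return them as hash keys.
sub pick_random_indices {
    my ($t, $n) = @_;
    $n = $t if $n > $t;    # cannot select more lines than the file has
    my %s;
    # Keep drawing until the hash holds $n distinct keys; a repeated draw
    # just overwrites an existing key, so the count does not grow.
    while (keys(%s) < $n) {
        $s{ int(rand($t)) } = 1;    # int(rand($t)) lies in 0 .. $t-1
    }
    return %s;
}

# Example: choose 10 line numbers out of a 1000-line file and list them.
my %select = pick_random_indices(1000, 10);
print join(",", sort { $a <=> $b } keys %select), "\n";
```

The keys are 0-based, which matches a line counter that starts at 0 and is incremented after each lookup. Redrawing on collisions is cheap when the sample is small relative to the file, but it slows down as the number of wanted lines approaches the total; in that case shuffling the full index list would probably be a better fit.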