Information Technology Grimoire

Version .0.0.1

IT Notes from various projects because I forget, and hopefully they help you too.

Count Lines in CSV Using Perl

We have hundreds of CSV files and need to know how many lines are in each one. A simple way to count the lines in a file is to open it, increment a counter for each line read, then close it.

CSV Output Count Example

Counting lines in files
combined1.csv,13875256
combined2.csv,1234
combined3.csv,144
combined4.csv,13801
DONE!!
 3 wallclock secs ( 2.06 usr +  0.39 sys =  2.45 CPU) seconds

Perl Script to Count Lines in CSV Files

use warnings;
use strict;
use Benchmark;

# tell user what's going on
print "Counting lines in files\n";

# start timer
my $t0 = Benchmark->new;

# get an array of the CSV files in the current directory
my @files = <*.csv>;

# loop over that array
foreach my $file (@files) {

	# restart counter for each file (start at 0 so an empty file prints 0)
	my $cnt = 0;

	# open the file for reading
	open(my $fh, '<', $file) or die "Cannot open $file: $!";

	# then count lines
	$cnt++ while <$fh>;

	# then close the file
	close $fh;

	# tell us the count for that file
	print "$file,$cnt\n";

}

# time the whole deal
my $t1 = Benchmark->new;
my $td = timediff($t1,$t0);
print "DONE!!\n",timestr($td), " seconds\n";

Why Was It Benchmarked?

A similar program was written in Python, and the Perl version was about twice as fast. Because this program was run frequently on very large CSV files, it was important to know which was faster.
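The comparison described above was between Perl and Python. As a rough sketch of how the same kind of question can be asked inside Perl alone, the core Benchmark module also provides cmpthese, which runs several code blocks and prints a comparison table. The example below is only an illustration, not the original benchmark: the sub names count_by_line and count_by_block, the 64 KB buffer size, and the generated sample file are all assumptions. The block-read variant counts newline characters, so a final line with no trailing newline would not be counted, unlike the line-by-line loop.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use File::Temp qw(tempfile);

# Line-by-line count, the same approach as the script above.
sub count_by_line {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    my $cnt = 0;
    $cnt++ while <$fh>;
    close $fh;
    return $cnt;
}

# Block-read count (assumed alternative): read fixed-size chunks
# and count the newline characters in each chunk.
sub count_by_block {
    my ($path) = @_;
    open(my $fh, '<:raw', $path) or die "Cannot open $path: $!";
    my ($cnt, $buf) = (0, '');
    while (sysread($fh, $buf, 65536)) {
        $cnt += ($buf =~ tr/\n//);
    }
    close $fh;
    return $cnt;
}

# Build a throwaway sample file (hypothetical data, 100,000 rows).
my ($tmp_fh, $sample) = tempfile(SUFFIX => '.csv', UNLINK => 1);
print {$tmp_fh} "id,value\n" x 100_000;
close $tmp_fh;

# Run each sub for at least 1 CPU second and print a comparison table.
cmpthese(-1, {
    by_line  => sub { count_by_line($sample) },
    by_block => sub { count_by_block($sample) },
});
```

Which variant wins, and by how much, depends on the Perl build, the disk, and the file sizes, so it is worth running this against a copy of real data before drawing conclusions.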

Last updated on 29 Jul 2019
Published on 29 Jul 2019