Information Technology Grimoire

Version .0.0.1

IT Notes from various projects because I forget, and hopefully they help you too.

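Keyword density

The script below counts each word in the text pasted after __DATA__, skips a handful of HTML tag names, and prints how often every word occurs relative to the total, most frequent first.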

#!/usr/bin/perl
use strict;
use warnings;
# Count how often each word appears, ignoring bare HTML tag names
my %seen  = ();
my $total = 0;
# Tag names left behind when the pasted text still contains markup
my %skip = map { $_ => 1 } qw(h2 h3 p li ul ol a);
while (<DATA>) {
  while ( /(\w['\w-]*)/g ) {
    my $word = lc $1;    # fold case so "The" and "the" count together
    next if $skip{$word};
    $seen{$word}++;
    $total++;
  }
}
# Print words in descending order of frequency
print "TOTAL WORDS: $total\n";
foreach my $word ( sort { $seen{$b} <=> $seen{$a} } keys %seen ) {
  my $appears = $seen{$word};
  # Density is the word's share of all counted words, as a percentage
  my $percent = 100 * $appears / $total;
  # Pass values as printf arguments so a stray % in the data cannot break the format
  printf "\t%.2f%%\t%d\t%s\n", $percent, $appears, $word;
}
__DATA__
Replace this line with the text you want to test for keyword density.
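
As a quick check of the arithmetic, here is a minimal sketch that wraps the same word-matching pattern in a function and runs it on a short sample string. The function name keyword_density and the sample sentence are assumptions made up for illustration; they are not part of the script above, and the tag skip list is left out to keep the example short.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): returns the total
# number of words and a reference to a hash of word => count.
sub keyword_density {
  my ($text) = @_;
  my %seen;
  my $total = 0;
  while ( $text =~ /(\w['\w-]*)/g ) {
    $seen{ lc $1 }++;    # fold case, as in the main script
    $total++;
  }
  return ( $total, \%seen );
}

# Sample sentence chosen for illustration only
my ( $total, $seen ) = keyword_density("The cat sat on the mat");
printf "%s: %d of %d words = %.2f%%\n",
  'the', $seen->{the}, $total, 100 * $seen->{the} / $total;
```

In this sample "the" accounts for 2 of 6 words, so its density is 2/6, or about 33.33%.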