Reorder Text With Perl Split and Splice
We were hired to edit some transcriptions. The transcriptions were in different formats. Sometimes the “time” was first, and sometimes the “who” was first. Here is one way to reorder text programmatically.
Original Text to Reorder
The client gave us text in the format of “who”,“when”,“what was said”
Female: '00.0s' "Hello!"
Male: '01.0s' "Hello!"
Female: '02.0s' "Do you like my hat?"
Male: '03.5s' "I do not."
Female: '04.9s' "Good-by!"
Male: '06.0s' "Good-by!"
Desired Output
The client wanted text in the format of “when”,“who”,“what was said”
'00.0s',Female:,"Hello!"
'01.0s',Male:,"Hello!"
'02.0s',Female:,"Do you like my hat?"
'03.5s',Male:,"I do not."
'04.9s',Female:,"Good-by!"
'06.0s',Male:,"Good-by!"
Reorder Text in Arrays
The real job required a bit more than what we are showing (commas needed to be escaped), but overall, a simple Perl splice was all that was needed:
#!/usr/bin/perl
use warnings;
use strict;
# loop over data handle
while (<DATA>) {
# for readability, make new variable
my $line = $_;
# split the line on single spaces, as that is our delimiter
my @line = split(/ /,$line);
# the very first part is "who"
my $who = $line[0];
# the next chunk of our data is "when"
my $time = $line[1];
# put the rest of the data into a new array using splice
# splice takes the array, an offset, and an optional length;
# it removes those elements from @line and returns them
# with no length given, it takes everything from the offset to the end
my @restof = splice(@line, 2);
# Print it out; interpolating the array in double quotes joins its elements with spaces
# no newline is added, as the last element still carries the original one
print "$time,$who,@restof";
}
# paste into data handle
__DATA__
Female: '00.0s' "Hello!"
Male: '01.0s' "Hello!"
Female: '02.0s' "Do you like my hat?"
Male: '03.5s' "I do not."
Female: '04.9s' "Good-by!"
Male: '06.0s' "Good-by!"
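As a variation on the script above, the same reordering can probably be done without splice by giving split a limit. This is only a minimal sketch: reorder_line is a hypothetical helper name, not something from the client job, and it assumes every line is "who", one space, "when", one space, then "what was said". It also leaves out the comma escaping mentioned earlier.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# reorder_line is a hypothetical helper for illustration only.
# Assumption: fields are separated by single spaces, and only the
# third field ("what was said") may contain further spaces.
sub reorder_line {
    my ($line) = @_;
    # a limit of 3 stops splitting after the second space, so the
    # third field keeps its internal spaces and its trailing newline
    my ($who, $time, $rest) = split(/ /, $line, 3);
    return "$time,$who,$rest";
}

# sample lines taken from the original text
my @sample = (
    qq{Female: '02.0s' "Do you like my hat?"\n},
    qq{Male: '03.5s' "I do not."\n},
);

# each line already ends in a newline, so none is added here
print reorder_line($_) for @sample;
```

Because the limit keeps the quoted speech in one piece, there is nothing left over to splice off and rejoin. The splice version is still handy when the leftover words need to be handled as a list.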