Monday, April 21, 2014

Perl quick reference

Scalar Data

String concatenation

"hello" . "world" -> "helloworld"

----------------------------------------------------
Numeric and String Comparison Operators
----------------------------------------------------
Comparison                  Numeric   String
----------------------------------------------------
Equal                       ==        eq
Not equal                   !=        ne
Less than                   <         lt
Greater than                >         gt
Less than or equal to       <=        le
Greater than or equal to    >=        ge
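
The two operator columns can give different answers for the same values. The following is a minimal sketch of that difference; the variable names are made up for illustration.

```perl
use strict;
use warnings;

# Numeric comparison looks at the numeric value.
my $num_equal = (10 == 10.0) ? 1 : 0;        # 1: same number
# String comparison looks at the characters.
my $str_equal = ("10" eq "10.0") ? 1 : 0;    # 0: different strings

# 10 is greater than 9 as a number, but "10" sorts before "9" as a string.
my $num_less = (10 < 9) ? 1 : 0;             # 0
my $str_less = ("10" lt "9") ? 1 : 0;        # 1

print "num_equal=$num_equal str_equal=$str_equal\n";
print "num_less=$num_less str_less=$str_less\n";
```

Picking the operator from the wrong column usually does not produce an error, only a surprising result, so it is worth checking which kind of comparison is intended.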

File tests and their meanings
--------------------------------------
File test      Meaning
-------------------------------------
-r             File or directory is readable by this (effective) user or group
-w             File or directory is writable by this (effective) user or group
-x             File or directory is executable by this (effective) user or group
-o             File or directory is owned by this (effective) user
-R             File or directory is readable by this real user or group
-W             File or directory is writable by this real user or group
-X             File or directory is executable by this real user or group
-O             File or directory is owned by this real user
-e             File or directory name exists
-z             File exists and has zero size (always false for directories)
-s             File or directory exists and has nonzero size (the value is the size in bytes)
-f             Entry is a plain file
-d             Entry is a directory
-l             Entry is a symbolic link
-S             Entry is a socket
-p             Entry is a named pipe (a “fifo”)
-b             Entry is a block-special file (like a mountable disk)
-c             Entry is a character-special file (like an I/O device)
-u             File or directory is setuid
-g             File or directory is setgid
-k             File or directory has the sticky bit set
-t             The filehandle is a TTY (as reported by the isatty() system function; filenames can’t be tested by this test)
-T             File looks like a “text” file
-B             File looks like a “binary” file
-M             Modification age (measured in days)
-A             Access age (measured in days)
-C             Inode-modification age (measured in days)
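
A few of the file tests above can be tried on a scratch file. This is only a sketch: it assumes the core File::Temp module is available, and the variable names are illustrative.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile tempdir);

# Create a scratch directory and an empty file inside it.
my $dir = tempdir(CLEANUP => 1);
my ($fh, $path) = tempfile(DIR => $dir);
close($fh);

my $exists   = (-e $path) ? 1 : 0;   # the file exists
my $is_file  = (-f $path) ? 1 : 0;   # it is a plain file
my $is_dir   = (-d $dir)  ? 1 : 0;   # the scratch directory is a directory
my $is_empty = (-z $path) ? 1 : 0;   # nothing written yet, so zero size

# Write five bytes; -s then reports the size in bytes.
open(my $out, '>', $path) or die "cannot open $path: $!";
print $out "hello";
close($out);
my $size = -s $path;                 # 5

print "exists=$exists file=$is_file dir=$is_dir empty=$is_empty size=$size\n";
```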

String repetition operator

"fred" x 3  -->   is "fredfredfred"

The chop and chomp Functions

$x = "hello world";
chop($x); # $x is now "hello worl"
$a = "hello world\n";
chomp ($a); # $a is now "hello world"
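
The difference between the two functions shows up when the string has no trailing newline. A small sketch, with illustrative variable names:

```perl
use strict;
use warnings;

my $with_newline    = "hello world\n";
my $without_newline = "hello world";

# chomp removes a trailing newline only, and returns the number of
# characters it removed.
my $removed_a = chomp($with_newline);      # 1
my $removed_b = chomp($without_newline);   # 0, the string is unchanged

# chop always removes the last character, whatever it is.
my $chopped = "hello world";
chop($chopped);                            # $chopped is now "hello worl"

print "$with_newline|$without_newline|$chopped\n";
```

This is why chomp is the safer choice for cleaning up input lines: calling it twice by accident does no harm.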

Reading from standard input

$a = <STDIN>;  # get the text
chomp($a);     # get rid of that pesky newline
or
chomp($a = <STDIN>);

Output with print
print("hello world\n"); # say hello world, followed by newline
print "hello world\n";  # same thing

Arrays

(1,2,3)             # array of three values 1, 2, and 3
() # the empty list (zero elements)

List constructor operator

(1 .. 5)            # same as (1, 2, 3, 4, 5)

"quote word" function

 @a = ("fred","barney","betty","wilma"); # ugh!
 @a = qw(fred barney betty wilma); # better!
 @a = qw(
    fred
    barney
    betty
    wilma
 );                                # same thing

Assignment

@fred = (1,2,3); # The fred array gets a three-element literal
@barney = @fred; # now that is copied to @barney
@huh = 1; # 1 is promoted to the list (1) automatically
@barney = (4,5,@fred,6,7); # @barney becomes 4,5,1,2,3,6,7
@barney = (8,@barney);     # puts 8 in front of @barney
@barney = (@barney,"last");# and a "last" at the end

($a,$b,$c) = (1,2,3);    # give 1 to $a, 2 to $b, 3 to $c
($a,$b) = ($b,$a);       # swap $a and $b
($d,@fred) = ($a,$b,$c); # give $a to $d, and ($b,$c) to @fred

Length of the array

@fred = (4,5,6);   # initialize @fred
$a = @fred;        # $a gets 3, the current length of @fred
$last = $#fred;    # $last gets 2, the index value of the last element of @fred
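
The length and the last index always differ by one. A short sketch of how the two relate (variable names are illustrative):

```perl
use strict;
use warnings;

my @fred = (4, 5, 6);

my $count      = @fred;            # 3: an array in scalar context gives its length
my $same_count = scalar(@fred);    # 3: scalar() forces scalar context explicitly
my $last_index = $#fred;           # 2: index of the last element
my $last_value = $fred[$#fred];    # 6: the last element itself

print "count=$count last_index=$last_index last_value=$last_value\n";
```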

Array Element Access

@fred = (7,8,9);
$b = $fred[0];  # give 7 to $b (first element of @fred)
$fred[2]++;     # increment the third element of @fred

Slice

Accessing a list of elements from the same array is called a slice
@fred[0,1];                 # same as ($fred[0],$fred[1])
@fred[0,1] = @fred[1,0];    # swap the first two elements

If you access an array element beyond the end of the current array, the undef value is returned without warning. 
For example:
@fred = (1,2,3);
$barney = $fred[7]; # $barney is now undef

Assigning a value beyond the end of the current array automatically extends the array
For example:
@fred = (1,2,3);
$fred[3] = "hi"; # @fred is now (1,2,3,"hi")
$fred[6] = "ho"; # @fred is now (1,2,3,"hi",undef,undef,"ho")

The push and pop Functions

push(@mylist,$newvalue);    # like @mylist = (@mylist,$newvalue)
$oldvalue = pop(@mylist);   # removes the last element of @mylist

The shift and unshift Functions

The push and pop functions do things to the "right" side of a list. Similarly, the unshift and shift functions perform the corresponding actions on the "left" side of a list. Here are a few examples:
unshift(@fred,$a);       # like @fred = ($a,@fred);
$x = shift(@fred);       # like ($x,@fred) = @fred;

The reverse Function

@a = (7,8,9);
@b = reverse(@a);    # gives @b the value of (9,8,7)

The sort Function

@y = (1,2,4,8,16,32,64);
@y = sort(@y); # @y gets 1,16,2,32,4,64,8
# Note that sort does not compare the numbers numerically, but by the string value of each number
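
To sort numerically, sort accepts a comparison block. The block form and the <=> operator are standard Perl but are not covered in the notes above, so treat this as a supplementary sketch:

```perl
use strict;
use warnings;

my @y = (1, 2, 4, 8, 16, 32, 64);

my @as_strings = sort @y;                  # default: string order
my @as_numbers = sort { $a <=> $b } @y;    # numeric, ascending
my @descending = sort { $b <=> $a } @y;    # numeric, descending

print "@as_strings\n";
print "@as_numbers\n";
print "@descending\n";
```

Inside the block, $a and $b are the two elements being compared; swapping them reverses the order.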

Control Structures


if ($a < 18) {
    print "So, you're not old enough to vote, eh?\n";
} else {
    print "Old enough!  Cool!  So go vote!\n";    
}

unless ($a < 18) {
    print "Old enough!  Cool!  So go vote!\n";
}
# Replacing if with unless is in effect saying "If the control expression is false, do...."

if (some_expression_one) {
    one_true_statement_1; 
} elsif (some_expression_two) {
    two_true_statement_1; 
} else {
    all_false_statement_1;
    all_false_statement_2;
    all_false_statement_3;
}

while ($a > 0) {
    print "At one time, you were $a years old.\n";
    $a--;
}

until (some_expression) {
    statement_1; 
}
# Read this as "until something is true" rather than "while this is not true."

$stops = 0;
do {
    $stops++;
    print "Next stop? ";
    chomp($location = <STDIN>);
} until $stops > 5 || $location eq 'home';

for ($i = 1; $i <= 10; $i++) {
    print "$i ";
}

@a = (3,5,7,9);
foreach $one (@a) {
    $one *= 3;
}
# @a is now (9,15,21,27)

Hash (key/value pair)

$fred{"aaa"} = "bbb"; # creates key "aaa", value "bbb"
$fred{234.5} = 456.7; # creates key "234.5", value 456.7

print $fred{"aaa"}; # prints "bbb"

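@fred_list = %fred;
# @fred_list gets ("aaa","bbb","234.5",456.7) or
# ("234.5",456.7,"aaa","bbb"); the order of the pairs is not guaranteed
%barney = @fred_list;  # create %barney like %fred
%barney = %fred;       # a faster way to do the same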

%smooth = ("aaa","bbb","234.5",456.7);

%backwards = reverse %normal;
# This constructs a hash with keys and values swapped, using the reverse operator.
# If %normal has two identical values, they end up as only a single element in %backwards, so this is best performed only on hashes whose values are unique.
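
The collision described above is easy to see with a small hash. This is a sketch with invented data; which of the two colliding keys survives is not something to rely on.

```perl
use strict;
use warnings;

# Two fruits share the value "red".
my %color_of = ("apple", "red", "banana", "yellow", "cherry", "red");

my %fruit_of = reverse %color_of;

my $pairs_before = keys %color_of;   # 3
my $pairs_after  = keys %fruit_of;   # 2: the two "red" entries collapsed into one

print "before=$pairs_before after=$pairs_after\n";
print "red is now $fruit_of{red}\n"; # either apple or cherry
```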

The keys Function

foreach $key (keys (%fred)) { 
    print "at $key we have $fred{$key}\n"; 
}

if (keys(%somehash)) { # if keys() is not zero:
    ...; # hash is non-empty
}

In fact, merely using %somehash in a scalar context will reveal whether the hash is empty or not:

if (%somehash) { # if true, then something's in it 
    # do something with it
}

The values Function

values(%hashname)
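
values returns the values of the hash in the same (unspecified) order that keys returns the keys. A brief sketch, using invented data and sorting the result so the output is predictable:

```perl
use strict;
use warnings;

my %lastname = ("fred", "flintstone", "barney", "rubble");

# List context: all the values, here sorted for a stable order.
my @names = sort(values(%lastname));   # ("flintstone", "rubble")

# Scalar context: the number of values.
my $how_many = values(%lastname);      # 2

print "@names ($how_many values)\n";
```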

The each Function

while (($first,$last) = each(%lastname)) {
    print "The last name of $first is $last\n";
}

The delete Function

%fred = ("aaa","bbb",234.5,34.56); # give %fred two elements
delete $fred{"aaa"};
# %fred is now just one key-value pair

Basic I/O

Command line arguments.
Command-line arguments are passed to the program in the @ARGV array. Each command-line argument goes into a separate element of @ARGV.
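
A minimal sketch of walking through @ARGV. The stand-in arguments and the script name args.pl are hypothetical, added only so the example runs without any arguments.

```perl
use strict;
use warnings;

# Stand-in arguments so the sketch also runs when none are given,
# as if invoked like: perl args.pl alpha beta gamma
@ARGV = ("alpha", "beta", "gamma") unless @ARGV;

my $arg_count = @ARGV;      # number of command-line arguments
my $first_arg = $ARGV[0];   # the first argument (not the program name)

my $i = 0;
foreach my $arg (@ARGV) {
    print "argument $i: $arg\n";
    $i++;
}
```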

printf for Formatted Output
printf "%15s %5d %10.2f\n", $s, $n, $r;
This prints $s in a 15-character field, then a space, then $n as a decimal integer in a 5-character field, then another space, then $r as a floating-point value with 2 decimal places in a 10-character field, and finally a newline.
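
The field widths can be checked with sprintf, which takes the same format string as printf but returns the result instead of printing it. The sample values below are invented for illustration.

```perl
use strict;
use warnings;

my ($s, $n, $r) = ("apple", 42, 3.14159);

# Same format as above; fields are right-justified by default.
my $line = sprintf("%15s %5d %10.2f\n", $s, $n, $r);

# 15 + 1 + 5 + 1 + 10 characters, plus the newline
my $length = length($line);   # 33

print $line;
print "length is $length\n";
```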

Regular Expressions


if (/abc/) {
    print $_;
}

while (<>) {
    if (/ab*c/) {
        print $_;
    }
}

Single-Character Patterns

The dot (.) matches any single character except newline (\n). For example, the pattern /a./ matches any two-character sequence that starts with a and is not "a\n".

/[abcde]/       # matches a string containing any one of the first five letters of the lowercase alphabet
/[aeiouAEIOU]/  # matches any of the five vowels in either lower- or uppercase
[0-9]           # match any single digit
[0-9\-]         # match 0-9, or minus
[a-z0-9]        # match any single lowercase letter or digit
[a-zA-Z0-9_]    # match any single letter, digit, or underscore
[^0-9]          # match any single non-digit
[^aeiouAEIOU]   # match any single non-vowel
[^\^]           # match any single character except an up-arrow (caret)

Construct          Equivalent class    Negated construct    Equivalent negated class
\d (a digit)       [0-9]               \D (digits, not!)    [^0-9]
\w (word char)     [a-zA-Z0-9_]        \W (words, not!)     [^a-zA-Z0-9_]
\s (space char)    [ \r\t\n\f]         \S (space, not!)     [^ \r\t\n\f]

The \d pattern matches one digit, the \w pattern matches one "word character" (a letter, digit, or underscore), and the \s pattern matches one whitespace character. The uppercase versions match the opposite: \D matches any single nondigit character, \W matches one character that can't be in an identifier, and \S matches one character that is not whitespace (letters, punctuation, control characters, and so on).

Grouping Patterns

 Multipliers
  The asterisk (*) means zero or more of the immediately previous character (or character class).
  The plus sign (+) means one or more of the immediately previous character (or character class).
  The question mark (?) means zero or one of the immediately previous character (or character class).
 For example, the regular expression /fo+ba?r/ matches an f followed by one or more o's, followed by a b, followed by an optional a, followed by an r.
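
The /fo+ba?r/ example can be tried against a handful of strings. The candidate strings below are invented for illustration.

```perl
use strict;
use warnings;

my @candidates = ("fobar", "foobr", "fooobar", "fbar", "fobaar");
my @matched;

foreach my $string (@candidates) {
    # one or more o's are required, the a is optional (at most one)
    push(@matched, $string) if $string =~ /fo+ba?r/;
}

print "@matched\n";
```

"fbar" fails because + needs at least one o, and "fobaar" fails because ? allows at most one a.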

Parentheses as memory
/fred(.)barney\1/;
matches a string consisting of fred, followed by any single non-newline character, followed by barney, followed by that same single character. So, it matches fredxbarneyx, but not fredxbarneyy. 
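
A short sketch of the same pattern in use, plus the common trick of finding a doubled character with /(.)\1/. Variable names are illustrative.

```perl
use strict;
use warnings;

# \1 must match exactly what the first pair of parentheses matched.
my $same_char = ("fredxbarneyx" =~ /fred(.)barney\1/) ? 1 : 0;   # 1
my $diff_char = ("fredxbarneyy" =~ /fred(.)barney\1/) ? 1 : 0;   # 0

# Any character followed immediately by itself.
my $doubled = "";
if ("hello world" =~ /(.)\1/) {
    $doubled = $1;    # "l", from the double l in hello
}

print "same=$same_char diff=$diff_char doubled=$doubled\n";
```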

Alternation
Alternation is written a|b|c. This means to match exactly one of the alternatives (a or b or c in this case).

Anchoring Patterns

The \b anchor requires a word boundary at the indicated point for the pattern to match.
/fred\b/;     # matches fred, but not frederick
/\bmo/;       # matches moe and mole, but not Elmo
/\bFred\b/;   # matches Fred but not Frederick or alFred
/\b\+\b/;     # matches "x+y" but not "++" or " + "
/abc\bdef/;   # never matches (impossible for a boundary there)

\B requires that there not be a word boundary at the indicated point.
For example: /\bFred\B/; # matches "Frederick" but not "Fred Flintstone"

The caret (^) anchors the pattern to the beginning of the string: ^a matches an a if, and only if, the a is the first character of the string.
The dollar sign ($) anchors the pattern to the end of the string: c$ matches a c only if it occurs at the end of the string.
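
A small sketch of both anchors. One detail worth knowing, which the notes above do not mention: $ also matches just before a newline at the very end of the string, which is convenient for lines that have not been chomped. Variable names are illustrative.

```perl
use strict;
use warnings;

my $starts_with_a = ("abc" =~ /^a/) ? 1 : 0;      # 1
my $a_not_first   = ("cba" =~ /^a/) ? 1 : 0;      # 0: the a is not first

my $ends_with_c   = ("abc" =~ /c$/) ? 1 : 0;      # 1
my $before_nl     = ("abc\n" =~ /c$/) ? 1 : 0;    # 1: $ allows a final newline

# Both anchors together require the whole string to match.
my $whole_string  = ("xabcx" =~ /^abc$/) ? 1 : 0; # 0

print "$starts_with_a $a_not_first $ends_with_c $before_nl $whole_string\n";
```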

Matching Operator: the =~ Operator

This operator takes a regular expression operator on the right side, and changes the target of the operator to something besides the $_ variable - namely, some value named on the left side of the operator.
It looks like this:
$a = "hello world";
$a =~ /^he/;         # true
$a =~ /(.)\1/;       # also true (matches the double l)
if ($a =~ /(.)\1/) { # true, so yes...
                     # some stuff
}

if (<STDIN> =~ /^[yY]/) { # does the input begin with a y?
    # yes! deal with it
}

Ignoring Case

/somepattern/i
if (<STDIN> =~ /^y/i) { # does the input begin with a y?
    # yes! deal with it
}

Using a Different Delimiter

$path = <STDIN>; # read a pathname (from "find" perhaps?)
if ($path =~ /^\/usr\/etc/) {  # awkward: every slash needs a backslash
    # begins with /usr/etc...
}

m@^/usr/etc@      # using @ for a delimiter
m#^/usr/etc#      # using # for a delimiter

Using Variable Interpolation

$what = "bird";
$sentence = "Every good bird does fly.";
if ($sentence =~ /\b$what\b/) {
    print "The sentence contains the word $what!\n";
}

Special Read-Only Variables

After a successful pattern match, the variables $1, $2, $3, and so on are set to the same values as \1, \2, \3, and so on. You can use this to look at a piece of the match in later code.
For example:
$_ = "this is a test";
/(\w+)\W+(\w+)/; # match first two words
                 # $1 is now "this" and $2 is now "is"

$_ = "this is a test";
($first, $second) = /(\w+)\W+(\w+)/; # match first two words
     # $first is now "this" and $second is now "is"

Other predefined read-only variables include $&, which is the part of the string that matched the regular expression; $`, which is the part of the string before the part that matched; and $', which is the part of the string after the part that matched.
For example:
$_ = "this is a sample string";
/sa.*le/; # matches "sample" within the string
          # $` is now "this is a "
          # $& is now "sample"
          # $' is now " string"

Substitutions

s/old-regex/new-string/
$_ = "foot fool buffoon";
s/foo/bar/g; # $_ is now "bart barl bufbarn"

$_ = "hello, world";
$new = "goodbye";
s/hello/$new/; # replaces hello with goodbye

$_ = "this is a test";
s/(\w+)/<$1>/g; # $_ is now "<this> <is> <a> <test>"

The split and join Functions

The split Function

$line = "merlyn::118:10:Randal:/home/merlyn:/usr/bin/perl";
@fields = split(/:/,$line); # split $line, using : as delimiter
# now @fields is ("merlyn","","118","10","Randal","/home/merlyn","/usr/bin/perl")
Note how the empty second field became an empty string. If you don't want this, match all of the colons in one fell swoop:
@fields = split(/:+/, $line);
This matches one or more adjacent colons together, so there is no empty second field.
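
The two delimiters can be compared side by side on the same line. A sketch, with illustrative variable names:

```perl
use strict;
use warnings;

my $line = "merlyn::118:10:Randal:/home/merlyn:/usr/bin/perl";

my @with_empty = split(/:/,  $line);   # the empty second field is kept
my @collapsed  = split(/:+/, $line);   # "::" is treated as one delimiter

my $count_with_empty = @with_empty;    # 7
my $count_collapsed  = @collapsed;     # 6

print "second field with /:/  is '$with_empty[1]'\n";   # ''
print "second field with /:+/ is '$collapsed[1]'\n";    # '118'
```

Collapsing delimiters shifts every later field one position to the left, so it only makes sense when the field positions do not matter.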

$_ = "some string";
@words = split; # same as @words = split(' ', $_); splits on whitespace like /\s+/, but skips leading whitespace

$line = "merlyn::118:10:Randal:/home/merlyn:";
($name,$password,$uid,$gid,$gcos,$home,$shell) = split(/:/,$line); # split $line, using : as delimiter
This simply leaves $shell without a useful value (empty or undef) when the last field is empty; the same happens to the trailing variables if the line isn't long enough. (Extra fields are silently ignored.)

The join Function

$bigstring = join($glue,@list);
$outline = join(":", @fields);
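
join and split are roughly inverses of each other. A small round-trip sketch with invented data:

```perl
use strict;
use warnings;

my @fields = ("merlyn", "", "118", "10");

# Glue the fields together with a colon between each pair.
my $outline = join(":", @fields);     # "merlyn::118:10"

# Splitting on the same delimiter recovers the fields.
my @again = split(/:/, $outline);
my $field_count = @again;             # 4

print "$outline\n";
print "recovered $field_count fields\n";
```

Note that the first argument of join is a plain string, not a regular expression, while the first argument of split is a pattern.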