
Name

Text::Wigwam - A user-extensible template parser.


Synopsis

 # Quick file parse
 use Text::Wigwam;
 my $wwobj = Text::Wigwam->new( file => 'path/to/filename.txt' )
  or die $Text::Wigwam::ERROR;
 print $wwobj->execute;
 # Quick string parse
 use Text::Wigwam; 
 my $wwobj = Text::Wigwam->new(
   text =>  'foo is [!!foo!!]',
   varspace =>  { foo => 'FOO' },
 ) or die $Text::Wigwam::ERROR;
 print $wwobj->execute;                     # foo is FOO
 # Detailed usage with debug support
 use Text::Wigwam; 
 my $restrict = {
   config_open => '<<',
   config_term => '>>',
   default_engine => 'Fusion',
 };
 my $defaults = {
   engine => 'Fusion',
   strict_tags => 1,
   code_open => '[!!',
   code_term => '!!]',
   text_term => '<!!',
   text_open => '!!>',
   code_open_trim => '[!~',
   code_term_trim => '~!]',
   text_term_trim => '<!~',
   text_open_trim => '~!>',
   numbers => 'float',
   directive_root => undef,
   directive_path => undef,
 };
 my $settings = {
   plugins => 'DirectiveSet',
   modules => 'Core/Cgi, Core/Html',
   drip_cache => 0,
 };
 my $varspace = { greet => "Hello", entity => "World" };
 my $string = '[!!greet!!], [!!entity!!]';
 my $wwobj = Text::Wigwam->new(
   text => $string,
   Restrict => $restrict,
   Defaults => $defaults,
   Settings => $settings,
   Varspace => $varspace,
 ) or die $Text::Wigwam::ERROR;
 $wwobj->set_path( template => qw( ~/wigwam/templates/ /wigwam/templates/ ) );
 my( $error, $text ) = $wwobj->execute;
 if( $error ){
   # Debug errors
   my( $derr, $dtxt ) = $wwobj->debug;
   print $dtxt and die $error unless $derr;
   warn "Debug error: $derr";
   die $error;
 }
 print $text; # outputs "Hello, World"


Description

Wigwam is a general-purpose Perl template parsing module designed with an emphasis on ease of use, reusability, flexibility and extensibility. Its purpose is to provide the basic framework for building custom template scripting environments with varying degrees of functionality, ranging from strict variable interpolation to Turing-complete scripting languages.


Interface

Wigwam employs an object-oriented interface. It is the mechanism through which the user invokes Wigwam to construct template objects, set up parameters, and execute templates.

Constructor

The new() constructor creates and returns a new Wigwam template object. It requires at least two arguments, the first of which indicates whether to parse a given string or a file. The second parameter is the string or filename which contains the root template to be processed. These parameters must be the first two passed in to the new() constructor - if either is missing, an error will result.

 use Text::Wigwam;
 my $stemplate = Text::Wigwam->new( text => $string )
  or die $Text::Wigwam::ERROR;
 my $ftemplate = Text::Wigwam->new( file => $filename )
  or die $Text::Wigwam::ERROR;

If an error occurs during this phase, an undef value is returned and the error can be retrieved by accessing the package variable $Text::Wigwam::ERROR, or Text::Wigwam->error.

Optional named parameters

The following optional named parameters consist of a key identifier (case-insensitive) followed by a hash reference. They may be placed in any arbitrary order after the first two arguments.

 use Text::Wigwam;
 my $wwobj = Text::Wigwam->new(
   file => 'path/to/root_template.txt',
   # Optional named parameters follow
   Restrict => \%restrict,
   Defaults => \%defaults,
   Settings => \%settings,
   Varspace => \%varspace,
 ) or die $Text::Wigwam::ERROR;

Varspace

This parameter can be used to specify a hash reference which is to be used as the template's variable space.

 use Text::Wigwam;
 my $wwobj = Text::Wigwam->new(
   text => '[!! greet !!]',
   Varspace => { greet => 'Welcome!' },
 ) or die $Text::Wigwam::ERROR;
 print $wwobj->execute;      # Welcome!

The following named parameters are used to control configuration options of varying scope and rank. Each requires a hash reference whose key/value pairs reflect the various options and their values.

Defaults

Config option values defined within this parameter represent the initial values for the root template as well as all child templates. These option values hold the lowest priority and are easily overridden, even from within a template.

Restrict

Config option values defined within this parameter are global in scope and cannot be overridden throughout the life of the template object. It is used to fix config options to specific values not just for the root template, but any given child template as well. This is a handy mechanism for locking out features, such as specifying a fixed set of modules for use by templates, or limiting templates to specific branches of the directive tree, and so on.

Settings

Config option values defined within this parameter only apply to the root template, and are not passed to child templates. These option values will override conflicting option values in the defaults parameter, but they can be overridden by conflicting option values within the restrict parameter, or conflicting option values within the options class of the root template's configuration tag.
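
The priority rules for the root template can be pictured as a layered hash merge. The following is only a sketch of the ordering described above, not Wigwam's actual implementation; the resolve_root_options() helper and the sample option values are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: later layers win, so the order of the merge
# encodes the priority described above (lowest first).
sub resolve_root_options {
    my (%layer) = @_;
    return {
        %{ $layer{defaults} || {} },    # lowest priority
        %{ $layer{settings} || {} },    # root template only
        %{ $layer{options}  || {} },    # root template's configuration tag
        %{ $layer{restrict} || {} },    # can never be overridden
    };
}

my $effective = resolve_root_options(
    defaults => { engine => 'Fusion', strict_tags => 1, drip_cache => 1 },
    settings => { drip_cache => 0, modules => 'Core/Cgi' },
    options  => { strict_tags => 0, modules => 'Core/Cgi, Core/Html' },
    restrict => { modules => 'Core/Html' },
);

print "$_ = $effective->{$_}\n" for sort keys %$effective;
# drip_cache = 0
# engine = Fusion
# modules = Core/Html
# strict_tags = 0
```

In this model a child template would start again from the defaults and restrict layers only, since settings are not passed on.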

Binding

Typically, the new() constructor is invoked on the Text::Wigwam class, but it can also be invoked on a blessed Text::Wigwam object reference. Doing so creates new template objects that are bound to a common set of attributes, which enables them to interact with one another. These attributes consist of the globals, the directory paths and the varspace hash (unless explicitly overridden).

 use Text::Wigwam;
 my $primary = Text::Wigwam->new(
   text => q/[!! #parameters < secondary > &secondary() !!]/
 ) or die $Text::Wigwam::ERROR;
 my $secondary = $primary->new(
    text => '[!! #return "The external template was here" !!]'
 ) or die $Text::Wigwam::ERROR;
 print $primary->execute( [ $secondary ] ); # The external template was here

Binding is also an ideal mechanism for populating a template's variable space with external templates prior to template execution.

 my $varspace = {};
 my $primary = Text::Wigwam->new(
    text => q/[!! #parameters < action > &dispatch.[action]() !!]/,
    varspace => $varspace,
 ) or die $Text::Wigwam::ERROR;
  # populate the primary template's varspace hash
 $varspace->{dispatch} = {
   secondary => $primary->new(
     text => q/[!! #return "The secondary template was here" !!]/,
     varspace => {}, # Use a private varspace hash
   ),
   ternary => $primary->new(
     text => q/[!! #return "The ternary template was here" !!]/,
     varspace => {}, # Use a private varspace hash
   ),
 };
 print $primary->execute(  # The ternary template was here
    { action => 'ternary' }
 );
 print $primary->execute(  # The secondary template was here
    { action => 'secondary' }
 );

Methods

Many of the methods which are available in a Text::Wigwam object are also available from the API. So, rather than document them in both places, we will only document those methods which are typically invoked on a Text::Wigwam object in this section, and simply refer all others to the relevant API sub-section.

General

execute( )

The execute() method executes the template object and returns a list consisting of an error message (if any), and the resulting text of the parsed template.

 use Text::Wigwam;
 my $varspace = {};
 my $wwobj = Text::Wigwam->new( file => 'path/to/filename.txt', varspace => $varspace )
  or die $Text::Wigwam::ERROR;
 my( $error, $text ) = $wwobj->execute();
 die $error if $error;
 print $text;

debug( $beautifier, $debug_template )

The debug() method can be called anytime after the execute() method has been invoked. It displays runtime errors, debug information, and warnings in context within a beautified redraw of the relevant templates. It returns a list much the same as execute(), consisting of an error message (if the debug template generated an error) followed by the resulting text produced by the debug template.

Common usage is to call it after execute() has reported an error message.

 my $wwobj = Text::Wigwam->new( file => 'path/to/template/file' );
 my( $error, $text ) = $wwobj->execute;
 if( $error ){
   # Debug errors
   my( $derr, $dtxt ) =  $wwobj->debug;
   print $dtxt and die $error unless $derr;
   warn "Debug error: $derr";
   die $error;
 }

The $beautifier argument can be used to specify a beautifier object which is designed to render the template for display in a specific format. Wigwam is packaged with two beautifiers: a text beautifier (the default) and an HTML beautifier.

 my($derr, $dtxt) = $wwobj->debug( 'Text::Wigwam::Beautify::Html' );
 my($derr, $dtxt) = $wwobj->debug( 'Text::Wigwam::Beautify::Text' );

Beautifiers typically contain an embedded debug template that is used by default unless you supply your own debug template as the second parameter. This debug template need only be a valid Text::Wigwam object.

 my($derr, $dtxt) = $wwobj->debug(
   'Text::Wigwam::Beautify::Html',
   Text::Wigwam->new( file => '/path/to/debug_template' ),
 );

spawn( $filename, \%varspace )

Creates a child template object after successfully locating the specified file within the template path. The resulting child template object will inherit the relevant configuration tag classes as well as the globals from the parent template object. The varspace will also be inherited unless the optional hash reference parameter is provided.

Directory paths

Wigwam maintains several independent lists of directory paths which are used to locate various types of files. Whenever a file is required, Wigwam will attempt to locate it by searching the relevant ordered path list. Each list can be manipulated by invoking one of the following methods with the list class to be modified as the first argument, followed by a list of directory paths.

set_path( $class, @paths )

The set_path() method is used to provide Wigwam with an ordered list of directory paths in which to search for files of the class specified by $class. Any pre-existing list within the current template object is lost and replaced with the @paths items. The new path list is then returned.

add_path( $class, @paths )

This is similar to set_path() except that add_path() is non-destructive to any pre-existing list. The add_path() method appends @paths items to any pre-existing list items in the specified $class, thereby yielding priority to any existing directory paths. The updated list is then returned.

Where $class holds one of the following case-insensitive list class names:

template
Specifies where Wigwam will look for child templates. Empty by default. Ordinarily, no child templates can be processed unless you define at least one directory path in this list.
module
Specifies where Wigwam will look for modules. Defaults to a single entry which is the equivalent of Text::Wigwam::Modules on any given system. (e.g. /usr/lib/perl5/site_perl/5.8.4/Text/Wigwam/Modules)
plugin
Specifies where Wigwam will look for plug-ins. Defaults to a single entry equivalent to Text::Wigwam::Plugins.
engine
Specifies where Wigwam will look for engines. Defaults to a single entry which translates to the equivalent of Text::Wigwam::Engines.

Unrecognized classes will be quietly ignored, and an empty list will be returned as a result.

 use Text::Wigwam;
 my $wwobj = Text::Wigwam->new( text => '[!!#template "blah" !!]' );
 $wwobj->add_path( template => '/var/www/templates' );
 print $wwobj->execute;

A copy of any given list can be retrieved by calling the (non-destructive) add_path() list manipulation method with an empty list.

 print "Template paths: " . join( ', ', $wwobj->add_path( template => () ) );
 print "Module paths: ". join( ', ', $wwobj->add_path( module => () ) );

Directory paths can be given priority by placing them first in the list. This can be accomplished by retrieving the existing list items using add_path(), manipulating it, then storing it back with the set_path() method.

 $wwobj->set_path( module => '/priority', $wwobj->add_path( 'module' ) );
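
The list semantics described in this section can be modelled in a few lines of plain Perl. This is a hypothetical stand-in that mimics the documented behaviour of set_path() and add_path() as ordinary functions over a %paths hash; it is not the code Wigwam uses, and the directory names are made up.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a template object's path lists.
my %paths = ( template => [], module => ['/default/modules'] );

# Replace the list for a class and return the new list.
sub set_path {
    my ( $class, @new ) = @_;
    my $list = $paths{ lc $class } or return;    # unknown class: empty list
    @$list = @new;
    return @$list;
}

# Append to the list for a class and return the updated list.
sub add_path {
    my ( $class, @new ) = @_;
    my $list = $paths{ lc $class } or return;    # unknown class: empty list
    push @$list, @new;
    return @$list;
}

add_path( module => '/extra/modules' );                     # searched last
set_path( module => '/priority', add_path( 'module' ) );    # searched first

print join( ', ', add_path( 'Module' ) ), "\n";
# /priority, /default/modules, /extra/modules
```

Calling add_path() with no paths leaves the list unchanged, which is why it doubles as a getter.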

Varspace management

All methods described in the Variables section are available, making it possible to pre-populate, or otherwise condition, the template's variable space prior to template execution, or to retrieve data from the template's variable space after the template has executed. The function of each of these methods is explained in detail in the API section. It's also helpful to understand variables, which are explained in detail in the Variables section.

Globals management

All globals methods are available as well, and can be used to pre-populate, or otherwise condition, the template's globals prior to template execution, or to retrieve data from the globals facility after template execution. The function of each of these methods is explained in more detail in the API section. It's also helpful to understand globals, which are explained in the globals section.


Templates

Wigwam templates are plain-text documents interspersed with special tags that act as placeholders where data is to be inserted. These tags contain tokens which can either represent data or manipulate it in some way. When the template is parsed, each tag is processed in order from top to bottom and replaced by the text that it produces.

Code tags

Code tags are the elements of templates that invoke Wigwam code. They are denoted by the default delimiters [!! and !!] and typically encompass one or more tokens. The Wigwam engine replaces each tag, along with its contents, with the string of text that is produced after all of its tokens have been executed.

 Here is the gratuitous "[!!"Hello"!!], [!!"World"!!]" example.

After the above example is parsed by the engine, it becomes:

 Here is the gratuitous "Hello, World" example.

Tags are not limited to a single expression. The Wigwam engine will concatenate the results of each expression in a given tag, as demonstrated below.

 [!! "Hello"  ", "  "World" !!]

produces:

 Hello, World

Blocks

Blocks are typically used to group a number of tokens together so that their combined values may be joined into a single string which can then be used to represent a single argument for any given directive. Internally, blocks are nothing more than a list directive and its associated terminator - in fact, both block types are processed by the same built-in list directive.

There are two types of blocks, code blocks and text blocks. These block types are fully interchangeable as the difference is merely syntactical. Either block type may also contain an unlimited number of nested blocks.

Code blocks

Code blocks are represented by the brace characters { and }, and may contain any number of tokens and/or nested blocks.

 [!!
  #if foo {
    "This is some text which will only be output if foo has a true value.\n"
    "foo's value is '" foo "'.\n"
    #if bar { "bar is also true. Its value is " bar ".\n" }
  }
 !!]

Text blocks

Text blocks are denoted by the default tag delimiters !!> and <!!. As the name implies, text blocks are expressed in text form and may contain embedded tags with nested blocks.

 [!! #if foo !!>
  This is some text which will only be output if foo has a true value.
  foo's value is [!! foo !!].
  [!! #if bar !!>bar is also true. Its value is [!! bar !!].<!! !!]
 <!! !!]

Comments

Any text existing between the character sequences /* and */ within a Wigwam tag will be ignored by the engine. This is useful for commenting your template code.

 [!! /* A simple greeting */ "Welcome!" !!]

produces

 Welcome!

White-space trimming

White-space surrounding any given tag can be eliminated by using the white-space trimming tags. These alternate tag delimiters will remove all tabs and spaces directly outside the tag up to and including the first new-line (if encountered before a non-tab or non-space). Anything beyond that will be left untouched.

This example removes the leading white-space to the left of the white-space trimming open code tag delimiter.

 Howdy,     [!~ " Stranger" !!].
 [!!
  /* output:
   Howdy, Stranger.
  */
 !!]

The white-space trimming code tag terminator can be used to remove white-space up to and including the first encountered new-line character to the right of the tag.

 [!! "Hi, " ~!]
 There!
 [!!
  /* output:
   Hi, There!
  */
 !!]

White-space outside of the first encountered new-line character is left untouched, however. The purpose of this feature is to remove white-space used in formatting (indenting nested tags & such).

 Welcome 
 [!~
   #if 1 !!>
     [!~ "foobar" !!]
   <!!
  /* output (Assuming there's a single space after "Welcome"):
   Welcome foobar
  */
 !!]

White-space produced by adjacent tags is unaffected by trimming tags.

 [!! "left " !!]        [!~ "middle" ~!]  [!! " right" !!]
 [!!
  /* output:
   left middle right
  */
 !!]

Empty white-space trimming code tags can also be used to remove leading white-space in front of indented plain text.

 [!! #define some_text !!>
    [!~ !!]Plain-text here
 <!~ !!]
 "[!! some_text !!]"
 [!! /* output: "Plain-text here" */ !!]
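
The trimming rule (tabs and spaces next to the tag, plus at most one new-line) can be approximated with two anchored substitutions. This is a sketch of the rule as described in this section, not Wigwam's own code; trim_before() and trim_after() are hypothetical names, and only "\n" is treated as a new-line.

```perl
use strict;
use warnings;

# Text to the left of a trimming open delimiter such as [!~ :
# drop trailing tabs/spaces and, if one is reached, a single new-line.
sub trim_before {
    my ($text) = @_;
    $text =~ s/\n?[ \t]*\z//;
    return $text;
}

# Text to the right of a trimming terminator such as ~!] :
# drop leading tabs/spaces and, if one is reached, a single new-line.
sub trim_after {
    my ($text) = @_;
    $text =~ s/\A[ \t]*\n?//;
    return $text;
}

print trim_before("Howdy,     ") . " Stranger" . ".\n";    # Howdy, Stranger.
print "Hi, " . trim_after("\nThere!\n");                    # Hi, There!
```

Because each substitution removes at most one new-line, a space that sits before that new-line (as after "Welcome" in the earlier example) survives.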

Syntax

Wigwam code is written using prefix syntax, in which operators precede their operands.

 A conventional expression:
 a = b
 Expressed in prefix syntax:
 = a b
 Expressed in Wigwam code:
 [!! #define a b !!]

This not only allows for well-defined expressions without the need for parentheses or other grouping delimiters, but also offers a consistent syntax and contributes to Wigwam's parsing efficiency.

 A conventional expression:
 1 + 2
 Expressed in prefix syntax:
 + 1 2
 Expressed in Wigwam code:
 [!! #add 1 2 !!]

Tokens

Tokens are the basic elements that make up templates. Wigwam views any given template as a series of tokens drawn from the following token classes: variables, literals, directives, and terminators. Most of these token types can take on several forms within templates, but ultimately everything within the template becomes a token - even the plain text (a literal) and tag delimiters (list directive and associated terminator) are translated to tokens.

Literals

Literals represent values that do not change, such as the plain text outside of code blocks, or the text within a text block. Literals also take on several forms inside of code blocks: two quoted types and an un-quoted numeric type.

Quoted

Quoted literals are identified by their surrounding quotation marks:

 [!! "My Dog Has Fleas" !!]
 [!! "37" !!]
 [!! "Quote marks can be \"escaped\" with the backslash character" !!]

Quoted literals can also be defined using the single quotation marks:

 [!! 'My Dog Has Ticks' !!]
 [!! 'The "double quote" marks don\'t need to be escaped here' !!]
 [!! 'However, the \'single quotes\' do' !!]

The metacharacters \n, \r, \t, and \\ in either quoted literal will produce a new-line, carriage return, tab, and single backslash respectively. The only difference between the two quoted literal types is the quote character that requires the backslash escape when used within the string itself.

Numeric

An un-quoted numeric literal is any string of characters that equates to a valid floating point numeric value:

 [!! -42 !!]
 [!! 98.6 !!]
 [!! +186E3 !!]

Variables

Wigwam variables are identified by their lack of surrounding decorations (e.g. var1). They represent values which can be examined, defined and undefined. They're used to access values within Wigwam's variable space (which is simply a hash reference) represented in the following examples as $vs.

    Variables     Perl Equivalents
      foobar      $vs->{foobar}
      fooBAR      $vs->{fooBAR}
       quux       $vs->{quux}

Variables are case sensitive, so foobar and Foobar represent two completely independent variables.

Variables can be made up of any number of upper or lower case alphanumeric characters, the underscore, the plus and minus characters, as well as any of the metacharacters, which consist of the dot, colon, backtick and square bracket characters.

Metacharacters

In brief, metacharacters consist of the dot, colon, backtick and square bracket characters. Each carries special meaning within variable names, but aside from the operations they perform, metacharacters tend to visually divide variables into distinct components which represent individual data elements that, when viewed as a whole, form a sort of path to a value. For this reason, we sometimes refer to variables as "path names" when they contain one or more metacharacters.

Because the variable space is merely a hash reference, the first component of any variable always represents a hash key of the root varspace hash. Each subsequent component represents an element of the data type obtained by its preceding component and is accessed by the metacharacter which conjoins them.

The dot metacharacter performs an explicit hash look-up operation.

    Variables     Perl Equivalents
     foo.var1     $vs->{foo}->{var1}
     foo.bar      $vs->{foo}->{bar}
   foo.bar.baz    $vs->{foo}->{bar}->{baz}

The colon metacharacter performs an explicit array look-up operation, and should be followed by an integer value.

    Variables     Perl Equivalents
     foo:23       $vs->{foo}->[23]
     bar:-1       $vs->{bar}->[-1]
   baz:2:4.quux   $vs->{baz}->[2]->[4]->{quux}
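
To make the mapping in the two tables above concrete, here is a minimal resolver for path names that use only the dot and colon metacharacters. It is a hypothetical illustration, not Wigwam's parser: the resolve() function and the sample data are assumptions, and brackets, backticks and escapes are not handled.

```perl
use strict;
use warnings;

# Hypothetical resolver covering only the dot and colon metacharacters.
sub resolve {
    my ( $vs, $path ) = @_;

    # Splitting on a capture group keeps the metacharacters in the list.
    my ( $first, @steps ) = split /([.:])/, $path;
    my $node = $vs->{$first};
    while (@steps) {
        my ( $meta, $key ) = splice @steps, 0, 2;
        if ( $meta eq '.' ) {    # explicit hash look-up
            return undef unless ref $node eq 'HASH';
            $node = $node->{$key};
        }
        else {                   # explicit array look-up
            return undef unless ref $node eq 'ARRAY';
            $node = $node->[$key];
        }
    }
    return $node;
}

my $vs = {
    foo => { bar => { baz => 'deep' } },
    bar => [ 'first', 'last' ],
    baz => [ 0, 1, [ 0, 1, 2, 3, { quux => 'found' } ] ],
};

print resolve( $vs, 'foo.bar.baz' ),  "\n";    # deep
print resolve( $vs, 'bar:-1' ),       "\n";    # last
print resolve( $vs, 'baz:2:4.quux' ), "\n";    # found
```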

The backtick metacharacter can be used in place of either the dot or colon metacharacters to dynamically determine the appropriate look-up operation necessary to retrieve a value. Besides simple hash and array look-ups, the backtick metacharacter will also perform the additional tasks of executing code references, invoking object methods, or performing some built-in operations on data.

These metacharacters make a trivial task of traversing even the most complex data structures. The result is much more readable than their Perl equivalents, especially when you throw interpolation into the mix...

Variable interpolation

Wigwam variables can themselves be interpolated using the square-brackets, [ and ], allowing values to be accessed indirectly.

    Variables     Perl Equivalents
    foo.[quux]    $vs->{foo}->{$vs->{quux}}
    foo[quux]     $vs->{"foo$vs->{quux}"}
  fum:[baz.quux]  $vs->{fum}->[$vs->{baz}->{quux}]
  [fe:[fi.[fo]]]  $vs->{$vs->{fe}->[$vs->{fi}->{$vs->{fo}}]}

All of the brackets are processed before any other metacharacters. They're evaluated left-innermost first.

  [fifth[third[first][second]][fourth]]
  Perl equivalent:
  $vs->{'fifth'.$vs->{'third'.$vs->{first}.$vs->{second}}.$vs->{fourth}}
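
The left-innermost evaluation order can be reproduced by repeatedly substituting the first bracket pair that contains no other brackets. The sketch below is an assumption about how such an order can be obtained, not a description of Wigwam's internals; walk() and interpolate() are hypothetical helpers, escapes are not handled, and an interpolated value that itself contains metacharacters would be re-interpreted.

```perl
use strict;
use warnings;

# Hypothetical walker for dot (hash) and colon (array) look-ups.
sub walk {
    my ( $vs, $path ) = @_;
    my ( $first, @steps ) = split /([.:])/, $path;
    my $node = $vs->{$first};
    while (@steps) {
        my ( $meta, $key ) = splice @steps, 0, 2;
        $node = $meta eq '.' ? $node->{$key} : $node->[$key];
    }
    return $node;
}

# Replace the leftmost innermost [ ... ] pair with its value until no
# brackets remain, then resolve what is left.
sub interpolate {
    my ( $vs, $path ) = @_;
    1 while $path =~ s/\[([^\[\]]*)\]/walk( $vs, $1 )/e;
    return walk( $vs, $path );
}

my $vs = {
    quux   => 'bar',
    foo    => { bar => 'via dot' },
    foobar => 'via concatenation',
    fo     => 'key',
    fi     => { key => 1 },
    fe     => [ 'zero', 'one' ],
    one    => 'nested result',
};

print interpolate( $vs, 'foo.[quux]' ),     "\n";    # via dot
print interpolate( $vs, 'foo[quux]' ),      "\n";    # via concatenation
print interpolate( $vs, '[fe:[fi.[fo]]]' ), "\n";    # nested result
```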

Of course, you'd never use such hideous variables in the real world.

Escaping metacharacters

For maximum versatility, any one of these metacharacters can be escaped by preceding it with a backslash character. Doing so cancels any special meaning that metacharacter carried, and interprets that occurrence of the metacharacter as just an ordinary character.

    Variables     Perl Equivalents
  foo\.bar\.baz   $vs->{'foo.bar.baz'}
   baz\:2:4.quux  $vs->{'baz:2'}->[4]->{quux}
  fum\:[baz.quux] $vs->{'fum:'.$vs->{baz}->{quux}}
 \[fe\]:[fi.[fo]] $vs->{'[fe]'}->[$vs->{fi}->{$vs->{fo}}]

Global variables (Experimental)

Global variables that are stored in the globals facility can be accessed directly by prefixing the name of the global with a colon character. Only globals whose names are entirely lower-case are accessible in this manner.

 [!!
   #repeat 4 {
    #given #random 1 20 {
     #when ( 1, 2, 3, 5, 7, 11, 13, 17, 19 ){
        "(" :given ")"
     }
     #default :given
    }
    "\n"
   }
 !!]

Wigwam stores a hash reference in a global called 'global' which can be used to share global data among templates.

 [!! :global.CGI->param( ... ) !!]

Directives

Directive tokens are identified by the hash mark # preceding them and are not case sensitive. They take the following form:

 [!! #directive !!]

Operations such as defining variables, weighing values against each other, comparing strings, and connecting to a database, can be performed by directives. Directives are in reality just Perl subroutines (which are typically organized into various modules or plug-ins) and are called by the parsing engine whenever a directive token is encountered in a template. Once executed, the directive and all of its associated arguments are interpolated by the parsing engine with the directive's return value.

Arguments

Each directive requires its own fixed number of arguments. As an example, the #define directive requires two arguments, a variable name and a value.

 [!! #define varx "value" !!]

Using directives as arguments is commonplace. In the following example, the #define directive's arguments consist of the variable name varx, and an expression consisting of the #add directive and its associated arguments (4 and 5 in the example below).

 [!!
  #define varx #add 4 5 /* stores 9 in the varx variable */
  /* Let's break it down:
   #define  (requires 2 args)
     varx   (arg #1 for #define)
     #add   (arg #2 for #define, requires 2 args of its own)
      4     (arg #1 for #add)
      5     (arg #2 for #add)
  */
 !!]

When writing templates, it is crucial to always provide the exact number of arguments required by any given directive, otherwise errors or unpredictable results are bound to follow.
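
The reason the argument count matters becomes clear from a toy prefix evaluator: each directive simply consumes the next N expressions, whatever they are. The following sketch is hypothetical and far simpler than Wigwam's engine; the %directive table, the evaluate() function and the token splitting are assumptions, and bare names are passed through as-is rather than looked up in a variable space.

```perl
use strict;
use warnings;

my %vars;

# Hypothetical directive table: each entry fixes the argument count and
# supplies the Perl subroutine that does the work.
my %directive = (
    'add'    => { args => 2, code => sub { $_[0] + $_[1] } },
    'sub'    => { args => 2, code => sub { $_[0] - $_[1] } },
    'define' => { args => 2, code => sub { $vars{ $_[0] } = $_[1]; '' } },
);

# Consume exactly one expression from the front of the token list.
sub evaluate {
    my ($tokens) = @_;
    die "ran out of tokens\n" unless @$tokens;
    my $token = shift @$tokens;
    return $token unless $token =~ /^#(\w+)$/;    # literal or bare name
    my $d = $directive{ lc $1 } or die "unknown directive $token\n";
    my @args = map { evaluate($tokens) } 1 .. $d->{args};
    return $d->{code}->(@args);
}

my @tokens = split ' ', '#define varx #add 4 5';
evaluate( \@tokens );
print "varx = $vars{varx}\n";    # varx = 9
```

Dropping the 5 from the token list makes #add run out of tokens, while adding a stray token leaves it behind as a separate expression.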

Terminators

Terminators are used to terminate list arguments for directives. In their native (un-aliased) form, they are relatively hideous and make for rather ugly template code, so most often they are dealt with in their aliased forms as code block terminators } and the list (array) terminators > and ). The code tag terminators !!] and text block terminators <!! are also terminators for their respective inherent list-argument directive counterparts.

Explicit terminator

An explicit terminator is used to terminate list arguments for its associated directive, exclusively. During runtime, it will verify that it is not being used to terminate a list argument for some other directive, in which case it will generate a parse error.

Explicit terminators look like their associated directive with the addition of a forward slash character inserted after the initial hash mark.

 [!! #directive ... #/directive !!]

It's possible that a given directive might require more than one list argument, in which case a terminator must be used to terminate each one. However, if only for the sake of aesthetics, writing custom directives that take several list arguments like this is not recommended.

 [!! #directive ... #/directive ... #/directive !!]

Hideous, isn't it?

Generic terminator

We now introduce the generic terminator which takes on the unsightly form of #/*. Its function is to terminate any list argument regardless of the directive it's used against. Its use is not recommended, as its primary purpose is for backward compatibility with legacy templates and for debugging purposes.

 [!! #directive ... #/* !!]

Yuck.

Expressions

Arguments and expressions are closely related, and essentially refer to the same thing only in slightly different contexts. When we speak of an argument, we are simply referring to the resulting value of an expression. When we speak of an expression, we refer to the one or more tokens which evaluate to a single value.

An expression can be a single token such as a literal, variable, or a single directive, but often it is a directive and its arguments (and their arguments, and so on). The following example consists of a code tag which contains two expressions - it attempts to demonstrate how it can sometimes be difficult to tell where the expression boundaries are.

 [!! #define varx #sub #add x y z #if #ne varx n #define varx 0 !!]

The best way to improve readability is simply to format the code so that the expression boundaries stand out. This is also the most efficient solution in terms of parsing speed: white-space adds no tokens, and parsing time grows with the number of tokens involved.

 [!!
  #define varx
    #sub
      #add x y
      z
  #if #ne varx n
    #define varx 0
 !!]

Another way of clarifying the boundaries is to put the expressions in separate code tags, or to add code blocks where they help distinguish the boundaries.

 [!! #define varx { #sub #add x y z } !!]
 [!! #if #ne varx n { #define varx 0 } !!]

Another alternative is to use expression terminators, though they are still considered experimental.

Expression terminators (experimental)

Expression terminators are experimental in that the specifics of their behavior will most likely change in future versions.

Expression terminators consist of the semicolon and comma characters. They are an optional tool which can be used to police expression boundaries, or to just make template code easier to read. Currently there is no functional difference between the comma and the semicolon other than the superficial aspect, but future revisions may introduce some minor behavior differences between the two.

Should you decide to use the expression terminators in your templates, it is recommended (in the interest of remaining compatible with future Wigwam versions) that you use the comma to separate elements in a list, and to use the semicolon to mark the end of a statement. This is also a good rule of aesthetics.

 [!!
  #define list ( 1 2 3 #add x y 5 )
  /* can optionally be written as... */
  #define list ( 1, 2, 3, #add x y, 5 );
 !!]

Because expression terminators are active tokens, there is a minor performance penalty associated with their use in templates. Their job is simply to verify that they are being processed as an argument for a (non-terminated-list) directive. If they are not, the parser simply ignores them. Otherwise, they determine whether the current argument is optional for the requesting directive, and then either return a null value of the appropriate type or generate a parse error.

They are particularly useful when a group of expressions are bound together within a single code tag, list or code block. Consider the following example:

 [!! #sizeof ( var_x #add var_x #sub var_y #div var_z 4 var_y ) !!]

The #sizeof directive produces 3, which is the number of elements in the list and consequently reflects the number of expressions within the list. To make this a little more readable, we can place expression terminators between the expressions.

 [!! #sizeof ( var_x, #add var_x #sub var_y #div var_z 4, var_y ); !!]

Expression terminators can be used to omit optional arguments for directives that allow such a thing (currently, there are only a few that allow this). If allowed, an empty value of the requested type is generated by the parser and passed to the directive.

 [!!
  #call procedure;
  /* equivalent to:
   #call procedure ()
  */
 !!]

Terminating expressions in this manner applies to all directives within a given expression that still require arguments. Once the terminator is encountered, no further tokens will be processed for that expression. Consider this:

 [!! #if #call procedure; { "D'Oh!" } !!]

The preceding example produces an exception because, even though we are legitimately terminating the final argument for the #call directive, we are (perhaps unintentionally) terminating the final argument for the #if directive as well, as it is part of the overall expression. In this case, we have no choice but to explicitly supply the #call directive with all of its arguments.

 [!! #if #call procedure() { "Ah, there we go" }; !!]

Contrary to what you might expect, expression terminators do not give you an empty value within a list. For example, #sizeof (a,b,,d) will produce 3. This behavior is subject to change in the future, and may eventually become part of the behavior specification for the comma character, thereby distinguishing it functionally from the semicolon.

Method invocation (experimental)

Wigwam supports method calling on Perl objects within its varspace, and will pass along any specified parameters.

 [!!
  path.to.object->method( arg1 arg2 )
  /* The parameter list can also be omitted using an expression terminator */
  path.to.object->method;
 !!]

Interpolation can be used on the object path, just as with any variable. The method portion is treated as a literal, although any square brackets it contains will be interpolated.

 [!!
  #define which 'this'
  #define thing 'dad'
  /*
   Attempts to invoke the doodad method on the object stored in the
   this.object variable and passes it an empty list.
  */
  [which].object->doo[thing]( )
 !!]

Wigwam will determine the appropriate list or scalar context in which to invoke the object method based upon the data type that the calling directive has requested. If the method call returns a single value, Wigwam will simply return that value. If a list of values is returned, Wigwam will return an array reference containing all of these values.
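
The return-value rule can be modeled in plain Perl. The following is only a minimal sketch and not Wigwam's actual implementation: the Demo class and the call_method helper are hypothetical names, and the sketch always calls the method in list context rather than choosing a context from the requested data type.

```perl
use strict;
use warnings;

# Hypothetical class standing in for an object stored in the varspace.
package Demo;
sub new  { return bless {}, shift }
sub one  { return 'solo' }
sub many { return ( 'a', 'b', 'c' ) }

package main;

# Call $method in list context, then collapse the result: a single
# value is passed through, anything else becomes an array reference.
sub call_method {
    my ( $obj, $method, @args ) = @_;
    my @result = $obj->$method(@args);
    return @result == 1 ? $result[0] : \@result;
}

my $obj = Demo->new;
print call_method( $obj, 'one' ), "\n";                 # prints "solo"
print scalar @{ call_method( $obj, 'many' ) }, "\n";    # prints "3"
```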

Configuration tag

Each template may contain its own configuration tag, which is processed prior to template execution. It can be used to specify parsing options, debugging features, and template characteristics (such as redefined tag delimiters), or to load plug-ins and modules. It may also be used to specify defaults for, or to restrict, the config option values of subsequently spawned child templates. These options are defined via a uniquely identified and specially formatted configuration tag of the following form:

 << Wigwam class: key1=value1; key2=value2; >>

Here, class represents the rank or scope of the options that follow and must be one of Options, Restrict, Defaults, or Settings (case insensitive). The key=value pairs that follow the class declaration become members of that class, and represent options and their respective values.

Configuration tags are very passive, as any unsupported class definitions and all key=value pairs defined therein will be quietly ignored. Furthermore, no checks are made to determine whether any given option key serves a purpose, nor whether its value is of the required type or format.

The Options class exclusively affects the current template, whereas the remainder of the classes are used to define defaults or restrictions for subsequently spawned child templates. They directly correlate to their counterparts described in the Interface section. Note also that any previously declared Restrict settings still take precedence over all others.

Only the first encountered configuration tag is processed for any given template. Any other configuration tags within the template will be treated as plain text. This is a useful feature when dealing with templates that generate other templates.

The various classes may occur within a single configuration tag more than once, although only the first occurrence of a key within a given class is honored; later duplicates are ignored:

 << Wigwam
   Options:
     code_open=[!!;
     code_term=!!];
   Restrict:
     engine=Foomatic;
   Options:
     code_open=[%; /* ignored */
     strict_tags=0;
     code_term=%]; /* ignored */
 >>

The colon, semicolon, and equals characters must be escaped using the backslash escape character '\' when used as part of a key or value in a configuration tag in order to prevent them from being interpreted as metacharacters.

 << Wigwam
   Options:
     directive_root=bazz\:\:quux;
     directive_path=foo\:\:bar;
     some\=crazy\:key=some\;wacky\=value;
 >>
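
The parsing rules described in this section (class declarations, escaped metacharacters, ignored duplicates and quietly ignored unknown classes) can be sketched in a few lines of Perl. This is an illustration under stated assumptions, not the parser Wigwam ships with: parse_config and unescape are hypothetical helpers, the input is assumed to be the text between the "<< Wigwam" opener and the ">>" terminator, and comments inside the tag are not handled.

```perl
use strict;
use warnings;

# Trim surrounding white space, then drop the escaping backslashes.
sub unescape {
    my ($text) = @_;
    $text =~ s/^\s+|\s+$//g;
    $text =~ s/\\(.)/$1/gs;
    return $text;
}

# Parse a configuration tag body into { class => { key => value } }.
sub parse_config {
    my ($body) = @_;
    my %known = map { $_ => 1 } qw(options restrict defaults settings);
    # A run of escaped characters and characters other than : ; = \
    my $atom = qr/(?:\\.|[^\\:;=])+/s;
    my ( %config, $class );
    while ( $body =~ /\G\s*($atom)([:=])/gc ) {
        my ( $name, $sep ) = ( $1, $2 );
        $name = unescape($name);
        if ( $sep eq ':' ) { $class = lc $name; next }
        $body =~ /\G($atom)?;/gc or last;    # value runs to the next bare ;
        my $value = defined $1 ? unescape($1) : '';
        next unless defined $class && $known{$class};    # unknown class
        $config{$class}{$name} = $value
            unless exists $config{$class}{$name};        # first one wins
    }
    return \%config;
}

my $config = parse_config(<<'END');
Options:
  directive_root=bazz\:\:quux;
  some\=crazy\:key=some\;wacky\=value;
  directive_root=ignored;
Bogus:
  anything=dropped;
END
print "$config->{options}{directive_root}\n";      # prints "bazz::quux"
print "$config->{options}{'some=crazy:key'}\n";    # prints "some;wacky=value"
```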

Options

The following options' values can only be altered via the Restrict parameter during the object construction phase.

config_open
(Default: <<) Wigwam configuration tag open delimiter.
config_term
(Default: >>) Wigwam configuration tag terminator.
default_engine
(Default: Fusion) A comma delimited list of fall-back engines which are to be used in the event that the engines specified within the engine parameter fail to initialize.

The following options' values may be altered via any class parameter.

engine
(Default: Fusion) - A comma delimited list of parsing engines (in order of preference) to be used to parse a given template. Should the first one fail to initialize, an attempt to initialize the next entry will be made. This goes on until an engine is successfully initialized or until all entries are exhausted, after which the default_engine parameters are processed.
plugins
(Default: DirectiveSet) - Loads a comma delimited list of plug-ins.
modules
(Default: null) - Loads a comma delimited list of modules.
directive_path
(Default: null) - Used to specify a list of branches within the directive tree in which to search for directives.
directive_root
(Default: null) - Used to specify a single branch within the directive tree which is to act as the root for all directive searches.
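
The fall-back order described for the engine and default_engine options can be pictured as follows. This is a rough sketch only: pick_engine and its $try_init callback are hypothetical, and "Turbo" is a made-up engine name used purely to show a failed initialization.

```perl
use strict;
use warnings;

# Return the first engine that initializes, trying the engine list
# first and the default_engine list afterwards. $try_init is a
# hypothetical callback that reports whether an engine could be set up.
sub pick_engine {
    my ( $engine, $default_engine, $try_init ) = @_;
    for my $list ( $engine, $default_engine ) {
        for my $name ( grep { length } split /\s*,\s*/, $list ) {
            return $name if $try_init->($name);
        }
    }
    return;    # every entry failed
}

my %installed = ( Fusion => 1 );
print pick_engine( 'Foomatic, Turbo', 'Fusion', sub { $installed{ $_[0] } } ), "\n";
# prints "Fusion"
```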

Fusion specific:

strict_tags
(Default: 1) - When true, helps to detect template coding issues such as missing arguments and/or unexpected terminators by generating an exception when a problem is detected. It is recommended to leave this enabled; the ability to disable it is kept only for engine debugging purposes.
code_open
(Default: [!!) - Defines the code tag open delimiter.
code_term
(Default: !!]) - Defines the code tag terminator.
text_open
(Default: !!>) - Defines the open text block delimiter.
text_term
(Default: <!!) - Defines the text block terminator.
code_open_trim
(Default: [!~) - Defines the white-space trimming code tag open delimiter.
code_term_trim
(Default: ~!]) - Defines the white-space trimming code tag terminator.
text_open_trim
(Default: ~!>) - Defines the white-space trimming open text block delimiter.
text_term_trim
(Default: <!~) - Defines the white-space trimming text block terminator.
numbers
(Default: float) - Controls what constitutes an unquoted numeric literal. Valid options are: FLOAT, REAL, INTEGER, WHOLE, and OFF. This option is useful when you must use variable names which would ordinarily be recognized by the tokenizer as numeric literals. If disabled via the OFF option, numeric literals must be surrounded by quotation marks so that they will not be mistaken for variables.
hexadecimal
(Default: 1) - When true, Wigwam will recognize hexadecimal values, such as 0x1234, as numeric literals.

Here's an example that demonstrates some available config options and their respective default or typical values:

 << Wigwam
   Options: 
     engine=Fusion;
     strict_tags=1;
     code_open=[!!;
     code_term=!!];
     text_term=<!!;
     text_open=!!>;
     code_open_trim=[!~;
     code_term_trim=~!];
     text_term_trim=<!~;
     text_open_trim=~!>;
     numbers=float;
     plugins=DirectiveSet;
     modules=Core/Cgi, Core/Html;
 >>

The default Wigwam configuration tag delimiters, << and >>, can be changed to whatever characters you see fit by using the config_open and config_term options exclusively within the Restrict hash reference parameter during the object construction phase. This is a global alteration and applies not only to the root template, but to all spawned child templates as well. They cannot be changed during the life of the template object. As an example, a CGI wrapper may alter the Wigwam configuration tag delimiters so that they take on a form similar to SSI in order to better blend in with HTML documents (e.g. <!--Wigwam Options: numbers=off; -->).

Directive tree

The directive tree is a name space hierarchy into which directives can be loaded, defined, or imported. Directives may be organized into various branches of the directive tree (nested name spaces) and accessed from within templates either explicitly, or indirectly by way of an ordered path mechanism. Templates can also be restricted to specific branches, thereby limiting the functions available to them. This allows Wigwam to simultaneously provide template environments with varying degrees of functionality ranging from strict variable interpolation, to Turing-complete scripting languages.

The internal base of the directive tree is fixed in the Text::Wigwam::Directives name space. Any simple directive such as #foo is expected to be located in the Text::Wigwam::Directives name space by default. Subsequent nested name spaces represent branches of the directive tree and can be accessed explicitly from within templates by using the double colon character sequence in the directive name (e.g. #branch::here::directive_name_here). In the case of #cgi::encode, Wigwam would expect to find the #encode directive within the "cgi" branch of the directive tree (i.e. Text::Wigwam::Directives::cgi).

The directive_root config option can be used to provide a reference point within the directive tree in which to search for directives. The following examples demonstrate how the engine will attempt to locate a specific directive handler given the respective directive_root config option values.

 << Wigwam Options: directive_root=; >>
 [!! #foo /* &Text::Wigwam::Directives::_foo */ !!]
 << Wigwam Options: directive_root=quux; >>
 [!! #foo /* &Text::Wigwam::Directives::quux::_foo */ !!]
 << Wigwam Options: directive_root=blah; >>
 [!! #fe::fi::foo /* &Text::Wigwam::Directives::blah::fe::fi::_foo */ !!]
 << Wigwam Options: directive_root=roe\:\:sham; >>
 [!! #boe::foo /* &Text::Wigwam::Directives::roe::sham::boe::_foo */ !!]

The directive_path config option can be used to define a comma delimited list of branches in which to search for directives. The following examples attempt to demonstrate how the engine searches through the directive tree for directive handlers given various directive_path settings.

 << Wigwam Options: directive_root=; directive_path=baz,baz\:\:quux; >>
 [!!
   #foo
  /* 
    Directive handler search order
   &Text::Wigwam::Directives::baz::_foo
   &Text::Wigwam::Directives::baz::quux::_foo
   &Text::Wigwam::Directives::_foo  (searches root branch last)
  */
 !!]
 << Wigwam Options: directive_root=; directive_path=baz,quux; >>
 [!!
   #fe::fi::foo
  /* 
    Directive handler search order
   &Text::Wigwam::Directives::baz::fe::fi::_foo
   &Text::Wigwam::Directives::quux::fe::fi::_foo
   &Text::Wigwam::Directives::fe::fi::_foo  (searches root branch last)
  */
 !!]

As the previous examples indicate, the root branch is searched last by default. It can be placed elsewhere in the search order by adding a null entry to the directive_path list, as the following examples demonstrate.

 << Wigwam Options: directive_root=; directive_path=,baz,baz\:\:quux; >>
 [!!
   #foo
  /* 
    Directive handler search order
   &Text::Wigwam::Directives::_foo  (searches root branch first)
   &Text::Wigwam::Directives::baz::_foo
   &Text::Wigwam::Directives::baz::quux::_foo
  */
 !!]
 << Wigwam Options: directive_root=; directive_path=baz,,baz\:\:quux; >>
 [!!
   #fe::fi::foo
   /*
    Directive handler search order
   &Text::Wigwam::Directives::baz::fe::fi::_foo
   &Text::Wigwam::Directives::fe::fi::_foo  (searches root branch second)
   &Text::Wigwam::Directives::baz::quux::fe::fi::_foo
  */
 !!]

The following examples demonstrate how Wigwam searches for directives given various directive_path and directive_root settings.

 << Wigwam Options: directive_root=foobar; directive_path=baz,quux; >>
 [!!
  #foo
  /*
    Directive handler search order
   &Text::Wigwam::Directives::foobar::baz::_foo
   &Text::Wigwam::Directives::foobar::quux::_foo
   &Text::Wigwam::Directives::foobar::_foo
  */
 !!]
 << Wigwam Options: directive_root=bar; directive_path=baz,baz\:\:quux; >>
 [!!
  #fe::fi::foo
  /*
    Directive handler search order
   &Text::Wigwam::Directives::bar::baz::fe::fi::_foo
   &Text::Wigwam::Directives::bar::baz::quux::fe::fi::_foo
   &Text::Wigwam::Directives::bar::fe::fi::_foo
  */
 !!]
 << Wigwam Options: directive_root=foobar; directive_path=baz,,baz\:\:quux; >>
 [!!
  #foo
  /*
    Directive handler search order
   &Text::Wigwam::Directives::foobar::baz::_foo
   &Text::Wigwam::Directives::foobar::_foo
   &Text::Wigwam::Directives::foobar::baz::quux::_foo
  */
 !!]
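
The search orders listed in the examples above follow a simple rule, which the sketch below reproduces in Perl. It is derived from those examples alone and is not taken from the engine's source: search_order is a hypothetical helper, it expects the unescaped option values, and its handling of a trailing comma in the path is an assumption.

```perl
use strict;
use warnings;

my $BASE = 'Text::Wigwam::Directives';

# List the handler names to try, in order, for one directive.
# $root and $path are the unescaped directive_root/directive_path values.
sub search_order {
    my ( $directive, $root, $path ) = @_;
    $directive =~ s/^#//;
    my @branch  = split /::/, lc $directive;
    my $handler = '_' . pop @branch;           # #foo is handled by _foo
    my @path    = split /,/, $path, -1;        # keep null entries
    push @path, '' unless grep { $_ eq '' } @path;    # root last by default
    my @order;
    for my $entry (@path) {
        push @order, join '::',
            grep { length } $BASE, $root, $entry, @branch, $handler;
    }
    return @order;
}

print "$_\n" for search_order( '#foo', 'foobar', 'baz,,baz::quux' );
# prints:
#   Text::Wigwam::Directives::foobar::baz::_foo
#   Text::Wigwam::Directives::foobar::_foo
#   Text::Wigwam::Directives::foobar::baz::quux::_foo
```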

Directive interpolation

As with variable interpolation, the square brackets can be used to interpolate all or part of any directive.

 [!!
  #define db "pgsql"
  #[db]::connect /* translates to #pgsql::connect */
 !!]

As the following example demonstrates, directive interpolation is not limited to the branch portion of the directive.

 [!!
  #define action "add"
  #[action] 2 3 /* translates to #add 2 3 */
  /* output: 5 */
 !!]

Unfortunately there is a performance penalty for using this feature, as some of the internal caching must be disabled while processing interpolated directives.


Development environment

Modules

Modules are simply Perl external library files which are used to organize the subroutines that make up custom directives. The directives contained within these modules become available to templates by loading them via the 'modules' config option. Wigwam locates modules via the module directory path.

 << Wigwam Options: modules=Core/Cgi, Core/Html; >>

As with all Perl external library files, the last executable statement in a Wigwam module should result in a true value. Simply putting 1; or return 1; as the last executable statement in the file will suffice. However, sometimes it is necessary for a module or plug-in to declare some globals upon initialization. This can be accomplished by returning a reference to a subroutine instead of simply 1; as the last executable statement in the file. This code will be executed and passed an API object as its only argument each time the module is initialized by a new template object.
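
The initializer mechanism relies on the fact that evaluating a Perl file yields the value of its last statement. The sketch below demonstrates that idea with Perl's built-in do FILE; it is not Wigwam's loader. The init_module helper, the temporary stand-in module and the hash used in place of the API object are all hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write a stand-in module whose last statement is a subroutine
# reference rather than a plain 1;
my ( $fh, $file ) = tempfile( SUFFIX => '.pl', UNLINK => 1 );
print {$fh} <<'END_MODULE';
package Text::Wigwam::Directives::demo;
sub { my ($Api) = @_; $Api->{inits}++; return };
END_MODULE
close $fh or die "close: $!";

my %loaded;    # file => value returned by the file's last statement

# Hypothetical loader: evaluate the file once, then run its initializer
# (if it returned one) for every new template object.
sub init_module {
    my ( $path, $api ) = @_;
    $loaded{$path} = do $path unless exists $loaded{$path};
    my $result = $loaded{$path}
        or die "cannot load $path: " . ( $@ || $! ) . "\n";
    $result->($api) if ref($result) eq 'CODE';
    return 1;
}

my $api = { inits => 0 };    # stand-in for the real API object
init_module( $file, $api ) for 1 .. 2;
print "$api->{inits}\n";     # prints "2"
```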

Package declarations within Wigwam plug-ins and modules must begin with the base name space Text::Wigwam::Directives (the base of the directive tree). Subsequent name spaces represent branches of the directive tree, and must consist of all lower case characters (i.e. package Text::Wigwam::Directives::blah::blah). Modules which do not explicitly declare a package will be loaded into the base name space (Text::Wigwam::Directives) by default, as this is the name space in which modules (and plug-ins) are evaluated.

Plug-ins

Plug-ins are functionally equivalent to modules, but differ in that they may associate directives with single-character aliases, and that they are given priority over modules in terms of loading and initialization. Plug-ins are loaded via the 'plugins' config option from a location defined within the plugin directory path.

 << Wigwam Options: plugins=DirectiveSet, CustomLingo; >>

Writing directives

Ease of extensibility was one of the primary goals driving Wigwam's development. Since directives are the most basic means of extension, adding them is quite simple. All that any single directive requires is two specifically named subroutines: a prototype and a handler.

Handlers use the following naming convention:

 sub _directive { }

Prototypes are named like their handler counterpart but have "_proto" prepended:

 sub _proto_directive { [ ] }

These two Perl subroutines make up a directive named #directive. The existence of these subroutines is detected at runtime when a directive is encountered while parsing a template; the absence of either subroutine will result in a parse error. Note also that, although the directive name can be written in mixed upper and lower case within the template, the prototype and handler subroutines associated with it are expected to be all lower case.

Prototypes

A prototype provides information about its corresponding handler's arguments. The parsing engine uses this information at runtime to ensure that the handler receives the type of data expected for each argument, performing any necessary type-casting. Prototypes need do nothing more than return a simple array reference, the elements of which represent each argument required by the directive's handler. For example: if the directive requires one argument, then the prototype should return an array reference containing a single element; if the directive requires no arguments, then the prototype should return an empty array reference; and so on.

 use Text::Wigwam::Const qw(STRING BOOLEAN);    # Import required constants
 sub _proto_directive_a { [ STRING ] }          # One argument required
 sub _proto_directive_b { [ BOOLEAN, STRING ] } # Two arguments required
 sub _proto_directive_c { [ ] }                 # No arguments required

The values in this array reference are numeric constants which represent the various data types, qualifiers and optional modifiers for each argument. Each argument required by the directive handler is represented by its corresponding element in this array, and the value stored in that element specifies the argument's requested data type.

The constants which represent the various data types, qualifiers and modifiers are typically imported from the Text::Wigwam::Const class (see the constants section for details), as demonstrated in the previous example. However, they're also accessible via methods provided by the object passed to the prototype upon its invocation. We could alternatively write the previous example as follows, though it's not as readable and offers no real advantage over the former.

 sub _proto_directive_a { [ $_[0]->STRING ] }
 sub _proto_directive_b { [ $_[0]->BOOLEAN, $_[0]->STRING ] }

As you'd expect, the prototypes are consulted prior to handler invocation at runtime. Unlike handlers, which are often called many times, prototypes are typically only called the first time the directive is encountered in a template; the information is typically pulled from a cache upon subsequent encounters. This cache can be deactivated via the drip_cache config option, thereby forcing the engine to call the prototype each time a directive is encountered. This will undoubtedly degrade performance and is generally reserved for engine debugging purposes.

These constants can also be strung together using Perl's bitwise-or operator '|' to specify more than one acceptable data type, or to add a qualifier and/or modifier attribute for an argument. The following are some real world examples; the meaning of each is discussed in more detail in the constants section.

 use Text::Wigwam::Const qw(ARRAY HASH STRICT BOOLEAN EXPR BLOCK);
 sub _proto_pop     { [ ARRAY|STRICT ] }
 sub _proto_reverse { [ ARRAY|HASH ] }
 sub _proto_if      { [ BOOLEAN, EXPR|BLOCK ] }
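
Because the constants are combined with bitwise-or, each prototype element behaves like a small set of flags. The sketch below shows how such a value can be built and inspected. The bit values are hypothetical and chosen only for this illustration (the real ones live in Text::Wigwam::Const and may differ), and has_flag is not part of Wigwam.

```perl
use strict;
use warnings;

# Hypothetical bit values, for illustration only.
use constant {
    ARRAY  => 1 << 0,
    HASH   => 1 << 1,
    STRICT => 1 << 2,
};

# True if $spec (one prototype element) has every bit of $flag set.
sub has_flag {
    my ( $spec, $flag ) = @_;
    return ( $spec & $flag ) == $flag;
}

my $spec = ARRAY | STRICT;    # same combination as _proto_pop
printf "ARRAY:%d HASH:%d STRICT:%d\n",
    map { has_flag( $spec, $_ ) ? 1 : 0 } ARRAY, HASH, STRICT;
# prints "ARRAY:1 HASH:0 STRICT:1"
```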

Aside from the programmatic role prototypes play internally, they also provide a visual overview of the directive's usage in templates. They reveal the number of arguments the directive requires, as well as detailed information about each argument's type requirements. This degree of readability for both programmer and casual observer is a necessity considering Wigwam's extensible nature and its use by the presumed non-Perl-programmers who are likely to be utilizing these directives in their templates.

Handlers

A handler performs the function of a given directive. It is called each time the parsing engine encounters its associated directive in a template during runtime. Its task is simply to pull all of the arguments declared in its associated prototype (if any), perform an operation, and return a value.

Retrieving arguments

Arguments are retrieved by calling methods of the Wigwam API object which is passed to the handler upon its invocation. These methods are explained in more detail later, but typically you'll only need to use the methods: get_arg() which retrieves the next argument; and kill_arg(n) which skips the next n arguments.

This example introduces a #fnord directive which merely concatenates the values in a given list while inserting " (fnord) " between them:

 use Text::Wigwam;
 my $wwobj = Text::Wigwam->new( text => join( '', <DATA> ) )
   or die $Text::Wigwam::ERROR;
 my( $err, $txt ) = $wwobj->execute;
 die $err if $err;
 print $txt; # Principia (fnord) Discordia
 # Add our #fnord directive to the directive tree base.
 package Text::Wigwam::Directives;
 # use Text::Wigwam::Const qw(ARRAY);
 sub _proto_fnord { [ ARRAY ] }
 sub _fnord { return join( " (fnord) " => @{$_[0]->get_arg} ) }
 __END__
 [!! #fnord ( "Principia" "Discordia" ) !!]

Because the #fnord directive in this example lives in the directive tree base, it is not required to import any constants, therefore the prototype simply uses the ARRAY constant without worry. In any nested branch of the directive tree it would be necessary to import the required constants via use Text::Wigwam::Const qw(ARRAY), or rewrite the prototype to access them via method calls (i.e. sub _proto_fnord{ [ $_[0]->ARRAY ] }).

Skipping arguments

It's important to understand why kill_arg() is so useful in skipping arguments as opposed to simply using get_arg() and throwing away its value. This becomes an issue when, for example, an argument is made up of a directive which performs a physical operation on data. Consider the following example:

 sub _proto_unless { [ BOOLEAN, ANY ] } 
 sub _unless { 
  my ($Api) = @_;
  if( $Api->get_arg ){
    $Api->get_arg; # retrieve the next argument, * UH-OH!
    return;        # and return null.
  }
  return $Api->get_arg;
 }

This works fine in situations where you are dealing with simple string values, like [!! #unless foo "bar" !!], but you must also take into consideration the possibility that the second argument might consist of a directive which operates on data, such as [!! #unless foo #undefine bar !!]. In the latter case, bar will be undefined regardless of the value of foo, and is most probably not the behavior sought. For this reason, the directive handler ought to be rewritten to use kill_arg(n), instead.

 sub _proto_unless { [ BOOLEAN, ANY ] } 
 sub _unless { 
   my ($Api) = @_;
   if( $Api->get_arg ){
     $Api->kill_arg(1); # skip the next argument, * WHEE!
     return;            # and return null. 
   }
   return $Api->get_arg; 
 }
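
The difference between the two versions comes down to arguments being evaluated lazily. The following sketch imitates that behaviour with closures so that both handlers can be compared outside of Wigwam. MockApi and bar_survives are hypothetical and only mimic get_arg() and kill_arg() as described here; the second closure plays the part of #undefine bar.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the API object: each pending argument is a
# closure that only runs when get_arg asks for it.
package MockApi;
sub new      { my ( $class, @args ) = @_; return bless { args => [@args] }, $class }
sub get_arg  { my ($self) = @_; return shift( @{ $self->{args} } )->() }
sub kill_arg { my ( $self, $n ) = @_; splice @{ $self->{args} }, 0, $n; return }

package main;

sub unless_get {
    my ($Api) = @_;
    if ( $Api->get_arg ) { $Api->get_arg; return }    # runs the argument
    return $Api->get_arg;
}

sub unless_kill {
    my ($Api) = @_;
    if ( $Api->get_arg ) { $Api->kill_arg(1); return }    # skips it
    return $Api->get_arg;
}

# Run a handler against "foo #undefine bar"; report whether bar survived.
sub bar_survives {
    my ($handler) = @_;
    my %vars = ( foo => 1, bar => 'still here' );
    my $api  = MockApi->new(
        sub { $vars{foo} },
        sub { delete $vars{bar}; return },
    );
    $handler->($api);
    return exists $vars{bar} ? 1 : 0;
}

print bar_survives( \&unless_get ),  "\n";    # prints "0"
print bar_survives( \&unless_kill ), "\n";    # prints "1"
```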

Handler/prototype policy

As a means to ensure that directives are well coded, a strict policy is enforced between a handler and its prototype. The parser will not tolerate attempts by a handler to pull more arguments than what is declared in its prototype - this sort of thing will generate a parse error. Similarly, if a handler does not pull all of its arguments as declared in its prototype, a parse error will occur. The only time this is forgiven is when a handler raises an exception (presumably before it had the opportunity to pull all of its arguments), in which case the parser will perform a cleanup by skipping all remaining arguments so that it can make a clean exit.
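
The policy can be pictured as a wrapper that checks how many arguments are left once a handler returns. This is a simplified model rather than the parser's real bookkeeping: StrictApi, invoke and the two sample handlers are hypothetical, and a Perl die stands in for both parse errors and handler exceptions.

```perl
use strict;
use warnings;

package StrictApi;
sub new       { my ( $class, @args ) = @_; return bless { args => [@args] }, $class }
sub remaining { my ($self) = @_; return scalar @{ $self->{args} } }
sub get_arg {
    my ($self) = @_;
    die "parse error: handler pulled too many arguments\n"
        unless $self->remaining;
    return shift @{ $self->{args} };
}
sub kill_arg {
    my ( $self, $n ) = @_;
    die "parse error: handler pulled too many arguments\n"
        if $n > $self->remaining;
    splice @{ $self->{args} }, 0, $n;
    return;
}

package main;

# Run $handler against @args and enforce the handler/prototype policy.
sub invoke {
    my ( $handler, @args ) = @_;
    my $api    = StrictApi->new(@args);
    my $result = eval { $handler->($api) };
    if ( my $err = $@ ) {
        $api->kill_arg( $api->remaining );    # cleanup, then re-raise
        die $err;
    }
    die "parse error: handler left arguments unpulled\n" if $api->remaining;
    return $result;
}

sub pulls_both { my ($Api) = @_; my $x = $Api->get_arg; my $y = $Api->get_arg; return "$x$y" }
sub pulls_one  { my ($Api) = @_; return $Api->get_arg }

print invoke( \&pulls_both, 'a', 'b' ), "\n";    # prints "ab"
my $out = eval { invoke( \&pulls_one, 'a', 'b' ) };
print defined $out ? "$out\n" : $@;    # prints the "unpulled" parse error
```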

Constants

The Text::Wigwam::Const class provides several constants which are to be used within directive prototypes to specify argument attributes required by their corresponding directive handlers.

 Data type
 Constant   Instructs the engine to execute the next argument, and return...
   STRING     a string of text.
   SCALAR     a scalar reference.
   ARRAY      an array reference.
   HASH       a hash reference.
   VAR        a string of text, unless the given argument is a variable name,
              in which case, return the raw variable name rather than its
              value. When used in conjunction with ARRAY or HASH, the Wigwam
              engine will vivify this variable's value with an empty array or
              hash, respectively, if it was previously undefined.
   The following are to be used exclusively and should not be used in
   conjunction with any other data type constant.
   BOOLEAN    a boolean value, either 1 or 0 (true or false, respectively).
   NUM        a numeric string (or else generate a parse error).
   BLESSED    a blessed reference (or else generate a parse error).
   ANY        any value - no casting or type-checking is performed.

In most cases, the following argument modifiers should be used in conjunction with at least one of the above data type constants. Multiple modifiers can be specified per argument, so long as it makes sense.

 Modifier
 Constant   Signals the Wigwam engine that we wish to...
  BLOCK     Execute this argument in a new block scope.
            (see the globals section).
  EXPR      Retrieve the next argument in the form of an executable object so
            that it may be executed at a later time, multiple times, or
            perhaps never. The EXPR type is explained in more detail below.
  ITERATE   Keep executing this argument on every call to get_arg until
            kill_arg is invoked. The same thing can be accomplished with EXPRs
            but ITERATE offers a slight advantage in performance.
  STRICT    Disallow casting (accepting only the specified data type(s)).
            Generates an exception if the value encountered is of a different
            type than what is specified for this argument.
  TERM      Keep executing expressions until a terminator token is encountered
            and return an array reference in which each element represents the
            resulting value of each expression encountered.
  OPTIONAL  Indicates that this argument may be truncated by an expression
            terminator. Use sparingly, and only on the final argument. It can
            theoretically be used on several consecutive arguments leading up
            to and including the last argument, but that would be silly.
            (experimental)
  RUNONCE   Execute this argument only once.
  VOID      Signal void context while processing this argument.

Access to the constants can be achieved in several ways. They can be imported into the current name space selectively, like this: use Text::Wigwam::Const qw(ARRAY HASH STRING), or all at once with use Text::Wigwam::Const qw(:all). Constants can also be accessed as methods of the constants object which is passed to each prototype upon invocation, in which case all that is required is to code your prototypes like this: sub _proto_blah { [ $_[0]->STRING, $_[0]->ARRAY ] }.

These attributes can also be strung together using Perl's bitwise-or operator '|' to indicate a number of acceptable data types for your directive handler. To demonstrate, the code for the #reverse directive follows.

 sub _proto_reverse{ [ ARRAY|HASH ] }
 sub _reverse{
   my ($Api) = @_;
   my $arg = $Api->get_arg;
   return { reverse( %$arg ) } if $Api->is_hash( $arg ); # Hash ref?
   return [ reverse( @$arg ) ];           # Assume it's an array ref
 }

The prototype in the above example guarantees that we will receive either an array reference or a hash reference upon calling the get_arg() method, therefore the handler needs only to check the reference type once. This not only simplifies things & saves us some code, but it also helps avoid Perl runtime errors.

The EXPR data type

The EXPR type is a proprietary pseudo-type inherent to Wigwam. Wigwam templates themselves are EXPR compliant objects and are treated the same as EXPRs generated by way of directives in the manner described here.

The EXPR attribute can be specified for an argument in a prototype to instruct the parser to return that argument in the form of an EXPR object which can be executed once, many times, or not at all. This EXPR object can then be stored away somewhere such as the varspace or globals facility, or simply discarded once it has served its purpose. This makes EXPRs ideal for use as macros, and handy for coding such things as iterator and conditional directives.

An EXPR will cast its return value in accordance with the prototype from which it originated. If no type was specified, ANY is assumed.

 sub _proto_badong { [ EXPR ] }
 # is functionally equivalent to:
 sub _proto_badong { [ EXPR|ANY ] }

The following simplified version of the #while directive attempts to serve as an example of how an EXPR object can be forced to return a specific data type upon execution.

 sub _proto_while { [ EXPR|BOOLEAN, EXPR|STRING ] }
 sub _while {
   my ($Api) = @_;
   my $cond = $Api->get_arg;
   my $loop = $Api->get_arg;
   my $text;
   while ($cond->execute()) {    # guaranteed a boolean value here
      $text .= $loop->execute(); # guaranteed a text string here
   }
   return $text;
 }
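
To try the loop above without the engine, an EXPR can be approximated by a closure paired with the data type its prototype requested. The MockExpr class below is a hypothetical stand-in, and its casting rules (undef becomes an empty string, truth becomes 1 or 0) are assumptions made for this sketch.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an EXPR object: a closure plus the data
# type its prototype asked for.
package MockExpr;
sub new {
    my ( $class, $cast, $code ) = @_;
    return bless { cast => $cast, code => $code }, $class;
}
sub execute {
    my ($self) = @_;
    my $value = $self->{code}->();
    return $value ? 1 : 0 if $self->{cast} eq 'BOOLEAN';
    return defined $value ? "$value" : '' if $self->{cast} eq 'STRING';
    return $value;    # ANY: passed through untouched
}

package main;

my $n    = 0;
my $cond = MockExpr->new( BOOLEAN => sub { $n < 3 } );
my $loop = MockExpr->new( STRING  => sub { $n++ } );

# Same shape as the _while handler: test, run the body, accumulate.
my $text = '';
$text .= $loop->execute while $cond->execute;
print "$text\n";    # prints "012"
```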

EXPRs execute within the same environment as the template from which they originate. This means that if an expression object is executed from within a template other than the one where it was defined, the EXPR may not have access to the same variable space, directives, etc. as the template that invoked it. Conversely, the EXPR may have access to directives and resources that the calling template doesn't. This feature can be used as a means for providing some advanced functions to templates which are otherwise restricted.

The standard EXPR methods follow:

execute( )
When no arguments are provided, the EXPR is simply executed in the current scope and its return value is passed back.
execute( $args )
When an argument value is provided to the execute method, it's stored in the globals facility where it can easily be retrieved by the template code within the EXPR itself. Then, the EXPR type is executed, after which the global facility restores the argument stack to its previous state, and finally the EXPR returns its value.
size( )
Returns the number of tokens that make up the expression.
doc( )
Returns the filename of the template from which it originates.
beautify( $beautifier )
Returns a beautified representation of the expression. You may optionally specify a beautifier object which is designed to render the template for display in a specific format. Wigwam is packaged with two beautifiers, a text beautifier (the default) and an HTML beautifier.
 print $expr->beautify( Text::Wigwam::Beautify::Html->new() )
engine( )
Returns the name of the engine that generated the expression.

API

The Wigwam API is an object that is passed by the engine as the first argument to any given directive handler upon invocation. It provides methods which can be called by directive handlers to perform basic functions, such as handling arguments, manipulating variables, managing globals, and other useful functions. They're typically invoked using one of the following techniques:

 my $arg = $_[0]->get_arg; 
 # do something with $arg... 
 #   ..or.. 
 my ($Api) = @_;
 my $arg = $Api->get_arg;
 # do something with $arg...

Argument related methods

get_arg( )
Returns the next argument in the form of the data type indicated in the prototype. When retrieving a terminated list (TERM type) argument, get_arg() returns an array reference whose elements hold the resulting values of each expression encountered prior to encountering the terminator.
kill_arg( num )
Skips the next num arguments without executing them. When called on a terminated list (TERM type) argument, kill_arg(1) will skip the entire list.

Varspace related methods

These functions perform operations on variables stored within Wigwam's variable space hash.

get_value( $variable )
Retrieves and returns the value specified by $variable from the varspace hash.
set_value( $variable, $value )
Stores $value into the varspace hash location specified by $variable and returns the stored value. If $value happens to contain an unblessed reference to an array or hash, set_value makes a copy of $value and uses the new reference as the value to be stored and returned.
set_alias( $variable, $value )
Stores $value into the varspace hash location specified by $variable and returns the stored value. Unlike set_value, no reference copying is performed.
is_defined( $variable )
Returns true if the value within the varspace hash specified by $variable contains a defined value.
undefine( $variable )
Deletes the key or element within the varspace hash specified by $variable.
escape_var( $variable )
Returns an escaped version of $variable - backslashes are added in front of meta characters, such as the dot, colon, backslash and square-bracket characters. This is useful for escaping arbitrary hash keys which are to be used to look up their respective values using variable interpolation.
interpolate( $variable )
Returns the resulting string after interpolating any existing square brackets within $variable.
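
The practical difference between set_value and set_alias is whether the stored reference is shared with the caller. The sketch below models that difference with an ordinary hash. The two subs are simplified stand-ins and not the real API methods: they take a flat key instead of a variable path, and the one-level copy is an assumption, since the depth of the copy made by set_value is not stated here.

```perl
use strict;
use warnings;

my %varspace;

# Model of set_value: unblessed array and hash references are copied
# (one level deep, by assumption) before being stored.
sub set_value {
    my ( $name, $value ) = @_;
    my $type = ref $value;
    $value = [@$value] if $type eq 'ARRAY';
    $value = {%$value} if $type eq 'HASH';
    return $varspace{$name} = $value;
}

# Model of set_alias: the reference is stored as-is.
sub set_alias {
    my ( $name, $value ) = @_;
    return $varspace{$name} = $value;
}

my @list = ( 1, 2 );
set_value( copied  => \@list );
set_alias( aliased => \@list );
push @list, 3;    # later change made by the caller

print scalar @{ $varspace{copied} },  "\n";    # prints "2"
print scalar @{ $varspace{aliased} }, "\n";    # prints "3"
```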

Data type detection methods

is_num( $value )
Returns true if $value holds a valid numeric value.
is_string( $thing )
Returns true if $thing holds a text string.
is_list( $thing )
Returns true if the underlying referent type of $thing is an array reference. (Returns false for all EXPR types, regardless of their true underlying referent type.)
is_hash( $thing )
Returns true if the underlying referent type of $thing is a hash reference. (Returns false for all EXPR types.)
is_scalar( $thing )
Returns true if the underlying referent type of $thing is a scalar reference. (Returns false for all EXPR types.)
is_expr( $thing )
Returns true if $thing holds a valid Wigwam expression object. (i.e. $thing->isa('Text::Wigwam::Expression'))
ref_type( $thing )
Returns $thing's underlying referent data type. (Returns 'EXPR' for all EXPR types regardless of their true underlying referent type.)
is_blessed( $thing )
Returns the name of the package that $thing is blessed into, otherwise returns false. Also returns false for all EXPR types.
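
Perl's core Scalar::Util module draws the same kind of distinction between the class a reference is blessed into and its underlying referent type. The sketch below uses Scalar::Util directly. It is only an analogy for ref_type(), is_blessed(), and is_num(), not Wigwam's implementation, and it ignores the special handling of EXPR types. My::Point is a hypothetical class.

```perl
use strict;
use warnings;
use Scalar::Util qw( blessed reftype looks_like_number );

# Hypothetical class used only for this demonstration.
{
    package My::Point;
    sub new { my ($class) = @_; return bless { x => 0, y => 0 }, $class }
}

my $point = My::Point->new;
my $list  = [ 1, 2, 3 ];

my $pkg        = blessed($point);    # package name of a blessed reference
my $underlying = reftype($point);    # referent type beneath the blessing
my $list_pkg   = blessed($list);     # undef for an unblessed reference
my $list_type  = reftype($list);
my $is_num     = looks_like_number('3.14') ? 1 : 0;
my $not_num    = looks_like_number('abc')  ? 1 : 0;

print "pkg=$pkg underlying=$underlying list_type=$list_type\n";
print "is_num=$is_num not_num=$not_num\n";
```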

Miscellaneous functions

doc( )
Returns the current template filename, or null if it was provided as text.
exception( $message )
Stores the contents of $message into the _DIE global along with some brief text indicating which template caused the exception.
 $Api->exception( 'Something horrible just happened!' );
debug( $message )
Adds the contents of $message as a comment that will appear at the current location in the template when rendered by the template debugger.
load_modules( @module_list )
Attempts to load all modules specified in the list. Returns undef upon success, or an error message if there was an error.
 my $err = $Api->load_modules( qw( Core/Html Core/Cgi ) );
 $Api->exception( $err ) if defined $err && length $err;
 return;
spawn( $filename, \%varspace )
This method is used for spawning child templates. It attempts to locate $filename within the template path and generates an EXPR compliant object (of the Text::Wigwam class) from it. The second parameter is an optional varspace hash reference to be used as the variable space for the template. If the varspace parameter is omitted, the child template will share a common varspace hash with the parent.
 sub _proto_include { [ STRING ] }
 sub _include {
     my ($Api) = @_;
     return $Api->spawn( $Api->get_arg );
 }

When called in scalar context, the spawn method will throw an exception upon failure. When called in list context, it returns a list consisting of an error message (or null string) followed by the EXPR object if one was successfully generated.

  # A directive that returns the first available template
  # given a list of filenames
 sub _proto_first_available { [ ARRAY ] }
 sub _first_available {
    my ($Api) = @_;
    my $list = $Api->get_arg;
    for my $template (@$list) {
      my( $error, $expr ) = $Api->spawn( $template );
      next if $error;
      return $expr;
    }
    $Api->exception( "Unable to locate any templates" );
    return;
 }
get_options( @optlist )
Returns the values of configuration options specified in the list for the currently executing template.

Returns an ordered list of configuration values when called in list context.

 my ($root, $path) = $Api->get_options( qw/directive_root directive_path/ );
 return "Template is unrestricted" unless $root;
 return "Template is restricted to root: $root, path: $path";

Call in scalar context to retrieve a single value.

 return "Template is parsed by the ".$Api->get_options('engine')." engine.";

When no parameters are given in @optlist, the entire configuration hash is returned as a list of name/value pairs. This is handy for making a copy of the entire configuration hash.

 my %config_hash = $Api->get_options();
 return \%config_hash;
set_path( $class, @paths )
See Directory Paths
add_path( $class, @paths )
See Directory Paths

Globals methods

The globals facility is primarily a stack management tool, so most values stored there are not static and will vary depending upon the current scope. The following global stack manipulation methods operate only on the value for the current scope (the only one that matters). For a more detailed description of what globals are and how to use them, see the Globals section.

eflag( )
Returns the name of the global responsible if an eflag global stack has forced the parser into passive mode.
set_global( $name, $value )
Sets the global specified by $name to $value within the current scope, and returns $value.
get_global( $name )
Returns the value of the $name global within the current scope.
get_global_ref( $name )
Returns the value of the $name global for the current scope unless it is a string, in which case it returns a scalar reference to the string.
add_global( @names )
Adds scope to (pushes) each specified global stack and sets their default value.
del_global( @names )
Deletes scope from (pops) each specified global stack.
inc_scope( $scope )
Increments the scope of (pushes) all stacks within the scope specified by $scope.
dec_scope( $scope )
Decrements (pops) all stacks within the scope specified by $scope.
push_global( $name, $value )
Performs an add_global() on $name, followed by a set_global() on $name using $value, and returns $value (merely a convenience method).
new_global( $scope, $name, $default )
Creates a new global stack with the specified $name, $scope, and $default parameters.
new_eflag( $scope, $name, $default )
Creates a new eflag enabled global stack with the specified $name, $scope, and $default parameters.
global_exists( $name )
Returns the scope of the global specified by $name, or undef if it is not defined.
kill_global( @names )
Removes the global or list of globals specified by @names.
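
The push/pop behaviour of these methods can be pictured with a small plain-Perl model. This is only an illustrative sketch: the stack_* helpers are hypothetical, they ignore the $scope argument, and they assume that each new scope starts at the declared default value, as described for add_global().

```perl
use strict;
use warnings;

my %stack;      # name => array of values, current scope last
my %default;    # name => default value for each new scope

# Declare a stack (compare new_global, minus the scope argument).
sub stack_new {
    my ( $name, $def ) = @_;
    $default{$name} = $def;
    $stack{$name}   = [$def];
    return;
}

# Push a new scope holding the default (compare add_global).
sub stack_push { push @{ $stack{$_} }, $default{$_} for @_; return }

# Pop the current scope (compare del_global).
sub stack_pop { pop @{ $stack{$_} } for @_; return }

# Read or write the value for the current scope only.
sub stack_get { return $stack{ $_[0] }[-1] }
sub stack_set { return $stack{ $_[0] }[-1] = $_[1] }

stack_new( Depth => 0 );
stack_set( Depth => 1 );               # outer scope now holds 1
stack_push('Depth');                   # inner scope starts at the default
my $inner_start = stack_get('Depth');
stack_set( Depth => 5 );               # affects the inner scope only
stack_pop('Depth');                    # outer value is visible again
my $outer_after = stack_get('Depth');

print "inner started at $inner_start, outer is still $outer_after\n";
```

Setting a value inside the inner scope leaves the outer value untouched, which is why the methods above only ever need to address the current scope.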

Advanced varspace methods

The following varspace methods are generally only used by Wigwam's varspace facility, but they've been made available via the API for rare situations where it's necessary to traverse arbitrary data structures which are not necessarily stored within the normal varspace.

explicit_get_value( ref, meta, var )
explicit_set_alias( ref, meta, var, val )
explicit_is_defined( ref, meta, var )
explicit_undefine( ref, meta, var )
ref
The (array or hash) reference which will serve as the base of the data structure to be traversed.
meta
Consists of a single valid meta character (i.e. dot, colon, or backtick) and determines how the base reference, ref, is to be de-referenced.
var
The variable name to be used to traverse the data structure.
val
The value to be assigned to the final data element (explicit_set_alias() only).

Caveat:

If a variable is interpolated via square brackets (i.e. foo.[bar]), that interpolated element's value is always retrieved based on the root varspace. As an example, the following routine probably will not return "bazz" as you might expect.

 my $ref = { foo => { quux => 'bazz' }, bar => 'quux' }; 
 return $Api->explicit_get_value( $ref, '.', 'foo.[bar]' );

The call to explicit_get_value() will behave similarly to the following pseudo-code to retrieve the value of the given path name, foo.[bar].

 return $ref->{foo}{$vs->{bar}};
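
The effect of this caveat can be reproduced with ordinary hashes. In the sketch below, $vs stands for the root varspace and $ref for the separate structure passed to explicit_get_value(). The values 'corge' and 'from root key' are made up for illustration, and the two lookups are plain Perl, not calls into Wigwam.

```perl
use strict;
use warnings;

# Root varspace (hypothetical contents).
my $vs = { bar => 'corge' };

# A separate structure, as would be handed to explicit_get_value().
my $ref = {
    foo => { quux => 'bazz', corge => 'from root key' },
    bar => 'quux',
};

# What one might expect: [bar] resolved against $ref itself.
my $expected = $ref->{foo}{ $ref->{bar} };

# What the pseudo-code above does: [bar] resolved against the root varspace.
my $actual = $ref->{foo}{ $vs->{bar} };

print "expected: $expected\n";
print "actual:   $actual\n";
```

Here the expected lookup yields "bazz", while the lookup that consults the root varspace yields "from root key".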

Casting

Prototype information is used to determine the type of data required by the directive which is requesting the argument. Casting is performed only if needed, so if a particular directive requests several acceptable types and the retrieved value matches any one of those types, no casting is performed and the value is returned as is.

All blessed references are treated according to their underlying referent type unless the BLESSED attribute is set, in which case the raw blessed reference value is returned. If the BLESSED attribute is set and a non-blessed reference is encountered, a parse error is generated.

Should the cast facility encounter an EXPR type, the expression is executed and its resulting value is cast and returned accordingly.

  Want     Got          Cast (Perl equiv.)      Returns
  STRING   STRING       N/A                     The unaltered string.
  STRING   SCALAR       $$value                 The de-referenced scalar value.
  STRING   ARRAY        scalar @$a              The number of array elements.
  STRING   HASH         scalar keys %$h         The number of hash keys.
  BOOLEAN  (see STRING) N/A                     1 or 0 - casts value to a
                                                string, then returns 1 if true,
                                                or 0 if false (false == null
                                                string, undefined, or a 0).
  NUM      null string  N/A                     0 (zero), unless used with
                                                STRICT, in which case a parse
                                                error is generated.
  NUM      (see STRING) N/A                     The unaltered string as long as
                                                it is numeric, otherwise a parse
                                                error results.
  ARRAY    STRING       [ $string ]             An array reference with a single
                                                string element, or...
  ARRAY    null string  []                      an empty array reference if the
                                                string is empty or undefined.
  ARRAY    SCALAR       [ $$scal_ref ]          The value is de-referenced and
                                                cast as a STRING.
  ARRAY    ARRAY        N/A                     The unaltered array reference.
  ARRAY    HASH         [ %$hash ]              An array reference whose
                                                elements take on the keys/values
                                                of the hash in some arbitrary
                                                order.
  HASH     STRING       { value => $string }    A hashref with the string as its
                                                only value keyed by 'value',
                                                or...
  HASH     null string  {}                      an empty hash reference if the
                                                string is empty or undefined.
  HASH     SCALAR       { value => $$scal_ref } The value is de-referenced and
                                                cast as a STRING.
  HASH     ARRAY        { @$a }                 A hash reference whose key/value
                                                pairs are populated by the array
                                                elements.
  HASH     HASH         N/A                     The unaltered hash reference.
  ANY      (any)        N/A                     The unaltered value - no
                                                casting.
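
Several rows of the table translate almost directly into Perl. The helpers below (cast_string, cast_array, cast_hash, all hypothetical names) are a minimal sketch of those rows only. They are not Wigwam's cast routine, and they leave out NUM, BOOLEAN, EXPR execution, the BLESSED attribute, and error handling. Scalar::Util's reftype() is used so that blessed references are treated by their underlying referent type.

```perl
use strict;
use warnings;
use Scalar::Util qw( reftype );

# Want STRING.
sub cast_string {
    my ($v) = @_;
    my $type = reftype($v) || '';
    return $$v                if $type eq 'SCALAR';
    return scalar(@$v)        if $type eq 'ARRAY';
    return scalar( keys %$v ) if $type eq 'HASH';
    return $v;
}

# Want ARRAY.
sub cast_array {
    my ($v) = @_;
    my $type = reftype($v) || '';
    return $v     if $type eq 'ARRAY';
    return [%$v]  if $type eq 'HASH';
    return [$$v]  if $type eq 'SCALAR';
    return []     if !defined $v || $v eq '';
    return [$v];
}

# Want HASH.
sub cast_hash {
    my ($v) = @_;
    my $type = reftype($v) || '';
    return $v                 if $type eq 'HASH';
    return {@$v}              if $type eq 'ARRAY';
    return { value => $$v }   if $type eq 'SCALAR';
    return {}                 if !defined $v || $v eq '';
    return { value => $v };
}

my $n_keys  = cast_string( { a => 1, b => 2 } );
my $as_list = cast_array('hello');
my $empty   = cast_array('');
my $as_hash = cast_hash( [ color => 'red' ] );

print "keys=$n_keys list=@$as_list empty=", scalar(@$empty),
    " color=$as_hash->{color}\n";
```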

Any time an exception is generated while an argument is being processed for any given directive, the cast routine will return a null version of the requested data type so that the parser can make a clean exit. That is, it will return: an empty array, hash, or string; a null code reference; or a dummy blessed object which intercepts and ignores all method calls (for both EXPR and BLESSED types).

In cases where the casting of a value is required but multiple types are specified in the directive's prototype as acceptable types for a given argument, the cast mechanism will always cast the value to the least complex of the acceptable types.

 STRING         least complex castable type
 SCALAR
 ARRAY
 HASH           most complex castable type

This means that if a given directive will accept either an array or hash as its argument but a text string is encountered at runtime, that string will be cast to an array.
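
The selection rule can be expressed in a few lines of Perl. This is a sketch of the rule as stated above, with a hypothetical helper name (cast_target); how Wigwam represents prototypes internally is not shown here.

```perl
use strict;
use warnings;

# Complexity order taken from the list above, least complex first.
my @order = qw( STRING SCALAR ARRAY HASH );

# Given the types a directive accepts, return the one that a
# non-matching value would be cast to.
sub cast_target {
    my %accepted = map { $_ => 1 } @_;
    for my $type (@order) {
        return $type if $accepted{$type};
    }
    return;    # none of the castable types is accepted
}

my $target = cast_target(qw( HASH ARRAY ));
print "cast to: $target\n";
```

For a directive accepting HASH or ARRAY, the sketch selects ARRAY, matching the example in the preceding paragraph.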

Globals

The globals facility is a stack management and parse control tool. It is indispensable for writing directives that control the parsing process, it reduces the complexity of writing directives that cooperate with other directives within a common scope, and it is ultimately responsible for the proper handling of nested directives in templates. It's also useful for storing global data shared among all templates with a common globals object.

Declaring globals

Globals can be declared by calling one of several methods of the API: new_global(), new_subst(), new_fatal(), or new_eflag(). Once a global has been declared by one of these methods, it is henceforth referred to by its name.

 $Api->new_global( $scope, $name, $default );
 $Api->new_eflag ( $scope, $name, $default );
 $Api->new_fatal ( $scope, $name, $default );
 $Api->new_subst ( $scope, $name, $default );

The arguments are the same for all global declaration methods:

scope
The range in which any given value is retained (TEMPLATE, BLOCK, GLOBAL, or LOCAL).
name
The name by which the global is to be referenced. Globals with all-lower-case names may be freely accessed from within templates. If this is not desirable, select a name which has at least one capital letter.
default
The default value given to this global upon entering a new scope. This argument is optional. If omitted, the default value will simply be undef.

The manipulation methods are the same for all globals:

Manipulating globals

A global's value within the current scope is manipulated by way of the set_global() and get_global() API methods, and is always referenced by its case-sensitive name.

 $Api->set_global( $name, $value ); 
 $foo = $Api->get_global( $name );

Globals declared using the new_eflag(), new_subst(), and new_fatal() methods will suspend the token execution process during runtime while their value is true, making them useful for writing directives that control the parsing process, such as #break and #continue. Such a global does not affect the parsing process so long as its value remains false. Execution proceeds normally until its value becomes true, after which token execution ceases until the true value loses scope and is popped off the stack, or until it is forced false again.

Globals declared with new_global() do not affect the parsing process, but are useful for maintaining state between a set of directives within a common scope, such as: #if / #elsif / #else, #switch / #case, and #given / #when.
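
The way a true eflag suspends token execution can be illustrated with a toy executor. Everything in this sketch is hypothetical: tokens are modelled as code references, the Break flag is a plain hash entry rather than a real global stack, and resetting it by hand stands in for the value losing scope.

```perl
use strict;
use warnings;

my %eflag = ( Break => 0 );    # hypothetical eflag global

# Execute tokens in order; while any eflag is true, tokens are skipped.
sub run_tokens {
    my @tokens = @_;
    my $out    = '';
    for my $token (@tokens) {
        next if grep { $_ } values %eflag;    # passive mode
        my $text = $token->();
        $out .= $text if defined $text;
    }
    return $out;
}

my $result = run_tokens(
    sub { 'one ' },
    sub { $eflag{Break} = 1; return },    # acts like a #break directive
    sub { 'two ' },                       # never executed
);
$eflag{Break} = 0;    # stands in for the true value losing scope
print "[$result]\n";
```

Only the first token contributes output, so the sketch prints "[one ]".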

Block scope

BLOCK scope globals must be declared before use by calling the new_global() method, using the format:

 $Api->new_global( BLOCK => $name, $default );

The directives #if, #elsif, and #else make use of a BLOCK scoped global stack to maintain state. These directives take the following generalized form:

 #if expression block 
 #elsif expression block 
 #else block

The BLOCK scoped global stack Do_else is pre-declared in the module init routine to maintain state:

 return sub { $_[0]->new_global( BLOCK => 'Do_else' ) }

BLOCK scope is declared on a per-argument basis in the directive's argument prototype:

 sub _proto_if    { [ BOOLEAN, ANY|BLOCK ] }
 sub _proto_elsif { [ BOOLEAN, ANY|BLOCK ] }
 sub _proto_else  { [ ANY|BLOCK ] }

When the Wigwam engine encounters a BLOCK argument, a new BLOCK scope is generated automatically, and thus a new instance of Do_else is vivified to its default value. Once that BLOCK argument has been processed, the previous BLOCK scope is restored immediately before the result is returned.
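
Why a fresh Do_else per BLOCK matters for nesting can be seen in a small simulation. This is a sketch under assumptions, not the actual Core module code: the logic by which #if and #else read and write Do_else is guessed for illustration, blocks are modelled as code references, and dir_if, dir_else, and run_block are hypothetical names.

```perl
use strict;
use warnings;

my @do_else = (undef);    # model of the Do_else stack, current scope last

# Evaluate a BLOCK argument inside a fresh BLOCK scope.
sub run_block {
    my ($body) = @_;
    push @do_else, undef;    # new scope starts at the default
    my $text = $body->();
    pop @do_else;            # restore the previous scope
    return $text;
}

# Assumed logic: #if records whether a later #else may still run.
sub dir_if {
    my ( $cond, $body ) = @_;
    $do_else[-1] = $cond ? 0 : 1;
    return $cond ? run_block($body) : '';
}

sub dir_else {
    my ($body) = @_;
    return '' unless $do_else[-1];
    $do_else[-1] = 0;
    return run_block($body);
}

# The inner #if fails, but it sets Do_else only in its own scope, so the
# outer #else still sees the value recorded by the outer #if.
my $out = dir_if( 1, sub { dir_if( 0, sub { 'inner ' } ) . 'body ' } )
    . dir_else( sub { 'outer-else ' } );
print "[$out]\n";
```

The sketch prints "[body ]". With a single shared flag instead of a per-block stack, the failed inner #if would have left the flag set and the outer #else would have run as well.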

Local scope

LOCAL scope globals can be pre-declared within a module init routine or manually through the API by calling the new_global() method, using the format:

 $Api->new_global( LOCAL => $name, $default );

LOCAL scope is invoked by the engine on specified variable names before the directive handler is called. These variable names are declared in an array reference existing as the second argument of the prototype:

 sub _proto_call{ [ EXPR, ARRAY ], [ qw/ Return / ] }

This signals the engine to push a new default value onto the Return stack before calling the directive handler. Once the directive handler has completed execution, the Wigwam engine restores the original value by popping the stack.

LOCAL scoped global stacks need not be pre-declared using the new_global() method unless you want to specify a default value other than undef. If a previously undeclared variable name appears in the locals prototype, a stack is automatically vivified using a default value of undef.

 sub _proto_bazz{ return( [ ], [ 'bazzvar' ] ); }

Assuming bazzvar was never pre-declared using new_global(), it would be vivified for you by the Wigwam engine, in essence like this:

 $Api->new_global( LOCAL => 'bazzvar', undef );
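
The push-before, pop-after behaviour described for LOCAL scope can be modelled in plain Perl. The call_with_locals helper below is hypothetical and only imitates what the engine is described as doing around a handler call. It always pushes undef, so it does not model non-default values declared through new_global().

```perl
use strict;
use warnings;

my %stack = ( Return => ['outer'] );    # an existing stack with one value

# Hypothetical engine step: push undef onto each named stack (vivifying
# missing ones), call the handler, then pop to restore the old values.
sub call_with_locals {
    my ( $locals, $handler ) = @_;
    push @{ $stack{$_} ||= [] }, undef for @$locals;
    my $result = $handler->();
    pop @{ $stack{$_} } for @$locals;
    return $result;
}

my $seen_inside = call_with_locals(
    [qw( Return bazzvar )],
    sub {
        $stack{Return}[-1] = 'inner';    # visible only during the call
        return $stack{Return}[-1];
    },
);
my $seen_after = $stack{Return}[-1];
print "inside: $seen_inside, after: $seen_after\n";
```

The handler sees and changes only its own value, and the caller's value is back in place once the handler returns.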

Template scope

TEMPLATE scope variables must be pre-declared by invoking the new_global() method, using the format:

 $Api->new_global( TEMPLATE => $name, $default );

TEMPLATE scope is like BLOCK scope in that all variables declared as such are updated as a whole whenever a template is invoked or exits.

As an example, the #stop directive uses the TEMPLATE scoped Stop eflag global stack to halt further processing of a template.

 sub _proto_stop{ return [ ]; } 
 sub _stop{ $_[0]->set_global( Stop => 1 ); return; } 
 # module init routine
 return sub { $_[0]->new_eflag( TEMPLATE => 'Stop' ); }

Global scope

GLOBAL scope variables must be pre-declared by calling the new_global() method, using the format:

 $Api->new_global( GLOBAL => $name, $default );

GLOBAL scope is just that. The Wigwam engine never generates a new GLOBAL scope. Any global stack declared with GLOBAL scope will retain its value until altered by set_global() or until all tokens in all templates are exhausted. The #try directive, on the other hand, generates a new GLOBAL scope in order to trap and recover from errors.

The GLOBAL scoped _DIE eflag stack is one of the few global stacks inherently declared by the Wigwam engine. All processing of the current template and parent templates (if any) will cease if its value becomes true.

 $Api->new_eflag( GLOBAL => '_DIE', undef );

The API sets this global's value upon invocation of the exception() method.


Export

None by default.


See Also

HTML::Template, Template::Toolkit


Author

DJOHNSTON, <djohnston@cpan.org> WINGNUT, <wingnut@cpan.org>

The latest Wigwam version and documentation are available at http://www.wigwamhq.org


Copyright and License

Copyright 2004-2007 by Scot Woodward and Daniel Johnston

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
