Text::Wigwam

NAME
Synopsis
- Simple
- Verbose
About
Description
- Overview
  - Mechanics
- Features
Interface
Templates
Development environment
Engine
- Active parsing mode
- Passive parsing mode
Export
See Also
Author
Copyright and License

NAME

Text::Wigwam - A user-extensible template parser.

Synopsis

Simple

 # Parse a file
 use Text::Wigwam; 
 my $wwobj = Text::Wigwam->new( file => 'path/to/filename.txt' )
  or die $Text::Wigwam::ERROR;
 print $wwobj->parse; 
 #
 # Parse a string
 use Text::Wigwam; 
 my $wwobj = Text::Wigwam->new( text =>  'foo=[!!foo!!] bar=[!!bar!!]' )
  or die $Text::Wigwam::ERROR;
 print $wwobj->parse;

Verbose

 use Text::Wigwam;
 my $Restrict = {
     config_open    => '<<',
     config_term    => '>>',
     default_engine => 'Fusion, Totem',
 };
 my $Defaults = {
     engine         => 'Fusion, Totem',
     plugins        => 'DirectiveSet',
     modules        => 'CgiTools, HtmlTools',
     config_open    => '<<',
     config_term    => '>>',
     code_open      => '[!!',
     code_term      => '!!]',
     text_open      => '!!>',
     text_term      => '<!!',
     code_open_trim => '[!~',
     code_term_trim => '~!]',
     text_open_trim => '~!>',
     text_term_trim => '<!~',
     strict_tags    => 1,
     numbers        => 'float',
     hexadecimal    => 'off',
     directive_root => '',
     directive_path => '',
     drip_cache     => 0,
 };
 my $Settings = {
     plugins    => 'DirectiveSet',
     modules    => 'CgiTools, HtmlTools',
     drip_cache => 0,
 };
 my $varspace = { greet => "Hello", entity => "World" };
 my $string = '[!!greet!!], [!!entity!!]';
 my $wwobj = Text::Wigwam->new(
     text => $string,
     $Restrict,
     $Defaults,
     $Settings,
 ) or die $Text::Wigwam::ERROR;
 $wwobj->set_path( template => '~/wigwam/templates/', '/wigwam/templates/' );
 my( $error, $text ) = $wwobj->parse( $varspace );
 die $error if $error;
 print $text; # outputs "Hello, World"

About

The purpose of this documentation is to provide a categorized and detailed reference into the specifics of the Wigwam templating system.

Many of the examples throughout this documentation use directives from the DirectiveSet plug-in, which is loaded by default unless it is explicitly overridden.

Description

Wigwam is a user-extensible/customizable template processor which provides a framework for intermingling text with dynamic content in a manner which is flexible, extensible, reusable, and efficient.

What differentiates Wigwam from most other template parsers we've come across is its extensibility, its versatile data handling features, its embedded parsing options, and the fact that it doesn't depend on any modules outside of the Perl core.

Mechanics

Wigwam can be divided into three fundamental components:

Interface

This is the mechanism through which a user invokes Wigwam to construct a template object, define parameters, and parse the template.

Engine

The engine provides the basic template processing framework. It understands only the rudimentary layout of Wigwam templates, and the three basic token types: literals, variables, and directives.

Directive Tree

The directive tree is a name space based hierarchy into which groups of directives (in the form of plug-ins and modules) are loaded by the interface, when instructed. These directives collectively contribute to the overall templating environment. Templates can be restricted to specific branches of the directive tree hierarchy as a method of limiting their capability.

Features

Dynamic parsing options

Redefine tag identifiers, load plug-ins and modules, and more from within the template itself using embedded parsing options.

Versatile data handling

Accessing data within a complex structure from within a template is child's play.

Extensibility

Write your own custom directives targeted to specific tasks, and organize them into plug-ins or modules for easy reuse.

Template debugging features

The facilities for debugging are built-in.

Wigwam's default DirectiveSet plug-in extends this templating environment to include:

macros/subroutines
local variables
external templates (reusability)
if/elsif/else blocks w/unlimited nesting
given/when/default tree directives w/unlimited nesting
loop directives w/unlimited nesting (while, foreach, for, ...)
process control directives (break, continue, exit, ...)
exception handling directives (try, throw, catch, ...)

Flexibility

Although Wigwam provides for some fairly complex code in templates by default, it can easily be reduced to nothing more than a variable interpolator, or to anything in between.

Extensibility

The primary goal throughout Wigwam's development was to make it as easily user-extensible as possible. Custom directives can be as robust as any of those found in the DirectiveSet plug-in, yet they require only a relatively simple API and a minimal amount of code and effort.

Efficiency

For template coding and parsing efficiency, we've adopted a straightforward Polish notation style syntax, as it requires the simplest of algorithms to process and is easy to learn. Under this scheme, template code is well-defined, parsed efficiently, and requires a minimal amount of pre-processing time during tokenization, so templates execute with minimal initial overhead.

Interface

Wigwam employs an object oriented interface which is the mechanism through which a user invokes Wigwam to construct a template object, set up parameters, and execute (parse) the template.

Constructor

The new constructor returns a Wigwam template object, or undef upon error. It requires at least two arguments, the first of which indicates whether to parse a given string or a file. The second parameter is the string or filename which contains the root template (the topmost template) to be processed.

 use Text::Wigwam;
 my $wwobj = Text::Wigwam->new( text => $string )
  or die $Text::Wigwam::ERROR;
 #	...or...
 my $wwobj = Text::Wigwam->new( file => $filename )
  or die $Text::Wigwam::ERROR;

Optionally, three hashref parameters can be passed to the constructor. The key/value pairs (or "options") in each are used to control some of the behavioral characteristics of the parsing engine, among other things. Most of these options can be set using any of the three hash reference parameters described below; however, each hash reference holds its own unique scope and/or priority.

 use Text::Wigwam;
 my $wwobj = Text::Wigwam->new(
     file => 'path/to/root_template.txt',
     \%Restrict,
     \%Defaults,
     \%Settings,
 ) or die $Text::Wigwam::ERROR;
 print $wwobj->parse( \%varspace );

Restrict

Options defined within this parameter are global in scope and cannot be overridden throughout the life of the template object. It is used to fix options to a specific value not just for the root template, but any given child template as well. This is a handy mechanism for locking out features, such as specifying a fixed set of modules for use by templates, or limiting templates to specific branches of the directive tree, and so on.

Defaults

Options defined within this parameter represent the initial settings for the root template as well as all child templates, if any. These settings hold the lowest priority and are easily overridden, even from within a template.

Settings

Options defined within this parameter only apply to the root template, and are not passed to child templates. These settings will override conflicting settings in the Defaults parameter, but they can be overridden by conflicting options within the Restrict parameter, or a Wigwam configuration tag within the root template.
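
The priority among these sources, as it applies to the root template, can be pictured as a simple ordered lookup. The following is only a sketch under that reading of the text above: resolve_option and the "tag" level (standing in for a configuration tag within the root template) are hypothetical names and are not part of Wigwam.

```perl
use strict;
use warnings;

# Illustrative sketch only: resolve_option() is a hypothetical helper, not part
# of Text::Wigwam. It models the priority described above for the root template:
# Restrict, then the template's own configuration tag, then Settings, then Defaults.
sub resolve_option {
    my ( $key, %src ) = @_;
    for my $level (qw( restrict tag settings defaults )) {
        my $opts = $src{$level} or next;
        return $opts->{$key} if exists $opts->{$key};
    }
    return;    # option not set anywhere
}

my %sources = (
    restrict => { engine  => 'Fusion' },
    tag      => { engine  => 'Totem', strict_tags => 0 },
    settings => { plugins => 'DirectiveSet', strict_tags => 1 },
    defaults => { plugins => 'MyCustomSet', numbers => 'float' },
);

print "$_ = ", resolve_option( $_, %sources ), "\n"
    for qw( engine strict_tags plugins numbers );
```

Here the Restrict value wins for engine, the configuration tag wins for strict_tags, Settings wins for plugins, and numbers falls through to Defaults.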

Standard Wigwam Options

The following options (shown with their default values) can only be defined via the Restrict parameter because they are used during the tokenization phase.

     config_open    => '<<',
     config_term    => '>>',
     default_engine => 'Fusion',

config_open & config_term: These options can be used to redefine the embedded parsing options identifier tags (<< and >> by default).
default_engine name1, name2, ...: A comma-delimited list of fall-back engines (i.e. "Fusion,Totem") which are to be used in the event that the engine(s) specified by the engine parameter fail to initialize.

The following options (shown with their default values) can be defined via any parameter.

     engine         => 'Fusion',
     plugins        => 'DirectiveSet',
     modules        => 'CgiTools, HtmlTools',
     config_open    => '<<',
     config_term    => '>>',
     code_open      => '[!!',
     code_term      => '!!]',
     text_open      => '!!>',
     text_term      => '<!!',
     code_open_trim => '[!~',
     code_term_trim => '~!]',
     text_open_trim => '~!>',
     text_term_trim => '<!~',
     strict_tags    => 1,
     numbers        => 'float',
     hexadecimal    => 'off',
     directive_root => '',
     directive_path => '',
     drip_cache     => 1,

engine name1, name2, ...: A comma delimited list of preferred parsing engines to be used to parse a given template. Should the first one fail to initialize, an attempt to initialize the next entry will be made. This goes on until an engine is successfully initialized or until all entries are exhausted, after which the default_engine parameters are invoked.
plugins plugin1, plugin2, ...: Loads a comma delimited list of plug-ins.
modules module1, module2, ...: Loads a comma delimited list of modules.
strict_tags 1: When true, helps to detect template coding issues such as missing arguments and/or unexpected terminators by generating an exception when a problem is detected. It is typically okay to set this value to false, but it is recommended to leave it enabled while coding templates.
code_open [!!: Defines the tag to open code-mode.
code_term !!]: Defines the tag to terminate code-mode.
text_open !!>: Defines the tag used to begin a text-mode block.
text_term <!!: Defines the tag to terminate a text-mode block.
numbers FLOAT|REAL|INTEGER|WHOLE|OFF: (Default: FLOAT) This option declares the complexity of the numeric literals in your template, lest they be confused with variables. Numeric literals are disabled when set to OFF, thereby requiring that all literals be surrounded by quotation marks. This is useful when a template must use variable names that the tokenizer might mistake for numeric literals, such as 3.14E0.
hexadecimal ON|OFF: As an extension of the numbers option, when this option is enabled, hexadecimals will be tokenized as numeric literals.
directive_root namespace: The directive_root option key is used to restrict templates to a specified namespace within the directive tree. This option is null by default, but when set, it restricts templates to a specific branch within the Directive tree.
directive_path namespace1, namespace2, ...: The directive_path option key can be used to define a comma delimited list of namespaces within the Directive tree in which to search for directives.
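
The fall-back behavior described for the engine and default_engine options amounts to walking two comma-delimited lists in order. The sketch below only illustrates that idea: pick_engine, split_list, and the $try_load callback are hypothetical names, and the callback stands in for whatever initialization Wigwam performs internally.

```perl
use strict;
use warnings;

# Illustrative sketch only: these helpers are hypothetical and are not taken
# from Wigwam's source.
sub split_list {
    my ($list) = @_;
    return grep { length }
        map { my $s = $_; $s =~ s/^\s+//; $s =~ s/\s+$//; $s }
        split /,/, $list;
}

# Try every name in the engine list first, then every name in default_engine.
sub pick_engine {
    my ( $engine, $default_engine, $try_load ) = @_;
    for my $list ( $engine, $default_engine ) {
        next unless defined $list;
        for my $name ( split_list($list) ) {
            return $name if $try_load->($name);
        }
    }
    return;    # nothing could be initialized
}

# Pretend that only the Totem engine can be initialized.
my %installed = ( Totem => 1 );
my $chosen = pick_engine( 'Foomatic, Fusion', 'Fusion, Totem',
    sub { $installed{ $_[0] } } );
print "Using engine: $chosen\n";
```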

Methods

The following methods can be invoked on a Wigwam template object.

General

parse( hashref )

The parse method executes the template object and returns a list consisting of (potentially) an error message, and the resulting text of the parsed template. If the optional hashref parameter is provided, it will be used as the initial variable space for the template and will override any existing variable space data, including anything which was defined using the varspace related methods described below.

 use Text::Wigwam;
 my $varspace = {};
 my $wwobj = Text::Wigwam->new( file => 'path/to/filename.txt' )
  or die $Text::Wigwam::ERROR;
 my( $error, $text ) = $wwobj->parse( $varspace );
 die $error if $error;
 print $text;

Directory Paths

Wigwam maintains several independent lists of directory paths which are used to locate various types of files. Whenever a file is required, Wigwam will attempt to locate it by searching the relevant path list in left-to-right order. Each list can be manipulated by invoking one of the following methods with the list class to be modified, followed by a list of directory paths.

set_path( class, path1, path2, ... ): The set_path method is used to provide Wigwam with an ordered list of directory paths in which to search for files of the specified class type. Any pre-existing list within the current template object is lost and replaced with the path items. The new list is then returned in list form.

add_path( class, path1, path2, ... ): This is similar to set_path except that add_path is non-destructive to any pre-existing list. The add_path method appends path items to any pre-existing list items in the specified class, thereby yielding priority to any existing directory paths. The updated list is returned in list form.

Where class (the first argument) is one of the following non-case-sensitive list types:

template: Specifies where Wigwam will look for child templates. Empty by default - no child templates can be processed unless you define at least one directory path in this list.
module: Specifies where Wigwam will look for modules. Defaults to a single entry which is the equivalent of Text::Wigwam::Modules on any given system. (e.g. /usr/lib/perl5/site_perl/5.8.4/Text/Wigwam/Modules)
plugin: Specifies where Wigwam will look for plug-ins. Defaults to a single entry equivalent to Text::Wigwam::Plugins.
engine: Specifies where Wigwam will look for engines. Defaults to a single entry which translates to the equivalent of Text::Wigwam::Engines.

Unrecognized class types will be silently ignored, and an empty list will always be returned as a result.

 use Text::Wigwam;
 my $wwobj = Text::Wigwam->new( text => '[!!#template "blah" !!]' )
  or die $Text::Wigwam::ERROR;
 $wwobj->add_path( template => '/var/www/templates' );
 print $wwobj->parse;

A copy of any given list can be retrieved by calling the non-destructive add_path list manipulation method with an empty list.

 print "Template paths: " . join( ', ', $wwobj->add_path( template => () ) );
 print "Module paths: ". join( ', ', $wwobj->add_path( module => () ) );

Directory paths can be given priority by placing them first in the list. This can be accomplished by first retrieving any existing list items (which is done by passing an empty list to the add_path method) and inserting the new items ahead of them.

 $wwobj->set_path( module => '/path/priority', $wwobj->add_path( 'module' ) );
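
The list semantics described in this section can be modelled in a few lines of plain Perl. This is a sketch only: PathLists is a hypothetical stand-in class that mimics the documented behavior of set_path and add_path, it starts with empty lists rather than Wigwam's defaults, and it makes no claim about how Wigwam stores its paths.

```perl
use strict;
use warnings;

# Illustrative sketch only: PathLists is a hypothetical stand-in that mimics the
# documented behaviour of set_path/add_path; it is not Wigwam's implementation.
package PathLists;

my %KNOWN = map { $_ => 1 } qw( template module plugin engine );

sub new {
    my ($class) = @_;
    my %lists = map { $_ => [] } keys %KNOWN;
    return bless \%lists, $class;
}

sub set_path {
    my ( $self, $class, @paths ) = @_;
    $class = lc $class;                 # class names are not case-sensitive
    return () unless $KNOWN{$class};    # unrecognized classes are ignored
    $self->{$class} = [@paths];         # replaces any existing list
    return @{ $self->{$class} };
}

sub add_path {
    my ( $self, $class, @paths ) = @_;
    $class = lc $class;
    return () unless $KNOWN{$class};
    push @{ $self->{$class} }, @paths;    # existing entries keep priority
    return @{ $self->{$class} };
}

package main;

my $lists = PathLists->new;
$lists->add_path( module => '/usr/lib/wigwam/modules' );

# Give a new directory priority by placing it ahead of the existing entries.
$lists->set_path( module => '/path/priority', $lists->add_path('module') );
print join( ', ', $lists->add_path( Module => () ) ), "\n";
```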

Varspace related

The following methods can be used to pre-populate, or otherwise condition, the template's variable space prior to template execution, or to retrieve data from the template's variable space after the template has executed. The function of each of the following methods is explained in more detail in the API section. It's also helpful to understand variables, which are explained in detail in the Variables section.

get_value( var )
set_value( var, val )
set_alias( var, val )
undefine( var )
is_defined( var )
generalized_get_value( ref, arr, var )
generalized_set_alias( ref, arr, var, val )
generalized_is_defined( ref, arr, var )
generalized_undefine( ref, arr, var )

Globals related

The following methods can be used to pre-populate, or otherwise condition the template's globals prior to template execution, or to retrieve data from the globals facility after template execution.

new_global( scope, name, default )
get_global( name )
set_global( name, value )
push_global( name, value )
add_global( name1, name2, ... )
del_global( name )
global_exists( name )
kill_global( name1, name2, ... )

Globals and their associated methods are explained in more detail in Text::Wigwam::Globals.

Templates

Wigwam templates consist of plain text enhanced with a customizable scripting language which is embedded within the text between a set of predefined tags.

Comments

Any text existing between the character sequences /* and */ within a Wigwam tag will be ignored by the engine. This is useful for commenting your template code.

 [!! /* Greeting */ "Welcome!" !!]

produces

 Welcome!

Syntax

Wigwam code is written using Polish notation in which operators precede their operands. This allows for expressions without the need for parentheses or other grouping delimiters, and also offers a consistent syntax.

Conventional Expression:

 1 + 2

Polish Notation:

 + 1 2

Wigwam:

 #add 1 2

Example

 use Text::Wigwam;
 my $text = 'one plus two equals [!!#add 1 2!!]';
 my $wwobj = Text::Wigwam->new( text => $text )
  or die $Text::Wigwam::ERROR;
 print $wwobj->parse();

produces:

 one plus two equals 3
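
To see why Polish notation needs no grouping delimiters, consider the following toy evaluator. It is a sketch only and is not Wigwam's engine: eval_prefix and its operator table are hypothetical, and plain arithmetic symbols are used in place of directives.

```perl
use strict;
use warnings;

# Illustrative sketch only: a tiny prefix (Polish notation) evaluator. It is
# not Wigwam's engine; it just shows why no grouping delimiters are needed.
my %OPS = (
    '+' => sub { $_[0] + $_[1] },
    '-' => sub { $_[0] - $_[1] },
    '*' => sub { $_[0] * $_[1] },
);

# Consumes tokens from the front of @$tokens and returns the resulting value.
sub eval_prefix {
    my ($tokens) = @_;
    my $tok = shift @$tokens;
    die "unexpected end of expression\n" unless defined $tok;
    return $tok unless exists $OPS{$tok};    # a number evaluates to itself
    my $left  = eval_prefix($tokens);
    my $right = eval_prefix($tokens);
    return $OPS{$tok}->( $left, $right );
}

# Conventional form: (1 + 2) * (7 - 3)
print eval_prefix( [ split ' ', '* + 1 2 - 7 3' ] ), "\n";
```

Because every operator knows how many operands it takes, the evaluator always knows where one sub-expression ends and the next begins.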

Tokens

Wigwam tokens come in only three varieties: literals, variables, and directives.

Literals

There are two forms of literals...

Quoted

Quoted literals are identified by their surrounding quotation marks:

 [!! "My Dog Has Fleas" !!]
 [!! "37" !!]
 [!! "Quote marks can be \"escaped\" with the backslash character" !!]

Quoted literals can also be defined using the single quotation marks:

 [!! 'My Dog Has Ticks' !!]
 [!! 'The "double quote" marks don\'t need to be escaped here' !!]
 [!! 'However, the \'single quotes\' do' !!]

Both methods are identical in terms of function, the only difference being the character which requires the escape character when used within the string itself.

The special character sequences \n, \r, and \t in any quoted literal will produce a new-line, carriage return, and tab, respectively.

Numeric

Unquoted numeric literals are any string of characters that equate to a valid numeric value:

 [!! 42 !!]
 [!! 98.6 !!]
 [!! 186E3 !!]

Variables

Variables are identified by their lack of surrounding decorations (e.g. var1). They represent values which can be examined, defined and undefined. They're used to access values within Wigwam's variable space, which is simply a hash reference as the following example demonstrates.

 Wigwam Variables      Perl Equivalents
      foobar        $varspace->{foobar}
       var1         $varspace->{var1}
       quux         $varspace->{quux}

Path names

Path names are simply Wigwam variables with the introduction of dot and colon characters. These characters are part of a simplified syntax which is useful for directly accessing data within complex structures.

Each individual component of a path name represents either a hash key or an array element. The first component of any path name is always a hash key; each subsequent component is identified by the dot or colon character which precedes it.

Components preceded by a dot character represent hash keys, and are restricted to valid Perl hash key characters.

 Wigwam Variables      Perl Equivalents
     foo.var1       $varspace->{foo}->{var1}
     foo.bar        $varspace->{foo}->{bar}
   foo.bar.baz      $varspace->{foo}->{bar}->{baz}

Components preceded by a colon character represent array elements, and should always equate to a numeric value.

 Wigwam Variables      Perl Equivalents
     foo:23         $varspace->{foo}->[23]
     bar:1          $varspace->{bar}->[1]
   baz:2:4.quux     $varspace->{baz}->[2]->[4]->{quux}
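
The mapping shown in these tables can be expressed as a short walk over a nested data structure. The following is a sketch only: resolve_path is a hypothetical helper, not a Wigwam method, and it ignores the self-interpolation and escaping features described in the sections that follow.

```perl
use strict;
use warnings;

# Illustrative sketch only: resolve_path() is a hypothetical helper showing how
# a dot/colon path name maps onto nested Perl data. Self-interpolation and
# escaped characters are not handled here.
sub resolve_path {
    my ( $vars, $path ) = @_;

    # The first component is always a hash key; the rest carry a '.' or ':' prefix.
    my ( $first, $rest ) = $path =~ /^([^.:]+)(.*)$/ or return;
    my $node = $vars->{$first};

    while ( $rest =~ /\G([.:])([^.:]+)/gc ) {
        my ( $sep, $name ) = ( $1, $2 );
        if    ( $sep eq '.' && ref $node eq 'HASH' )  { $node = $node->{$name} }
        elsif ( $sep eq ':' && ref $node eq 'ARRAY' ) { $node = $node->[$name] }
        else                                          { return }
    }
    return $node;
}

my $varspace = {
    foo => { bar => { baz => 'deep' } },
    baz => [ 0, 1, [ 0, 1, 2, 3, { quux => 'found' } ] ],
};

print resolve_path( $varspace, 'foo.bar.baz' ),  "\n";
print resolve_path( $varspace, 'baz:2:4.quux' ), "\n";
```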

Self-interpolation

Wigwam variables can, themselves, be interpolated using the square-brackets, '[' and ']', allowing values to be accessed indirectly.

 Wigwam Variables    Perl Equivalents
    foo.[quux]    $varspace->{foo}->{$varspace->{quux}}
    foo[quux]     $varspace->{"foo$varspace->{quux}"}
  fum:[baz.quux]  $varspace->{fum}->[$varspace->{baz}->{quux}]
  [fe:[fi.[fo]]]  $varspace->{$varspace->{fe}->[$varspace->{fi}->{$varspace->{fo}}]}

Escaped Characters

Any of these identifying characters can be escaped using the backslash character '\'.

 Wigwam Variables    Perl Equivalents
   foo\.bar.baz      $varspace->{'foo.bar'}->{baz}
   baz\:2:4.quux     $varspace->{'baz:2'}->[4]->{quux}
  fum\:[baz.quux]    $varspace->{'fum:'.$varspace->{baz}->{quux}}
 \[fi\]:[fi.[fo]]    $varspace->{'[fi]'}->[$varspace->{fi}->{$varspace->{fo}}]
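
The effect of escaping on component boundaries can be illustrated with a small splitter. This is a sketch only: split_path is a hypothetical helper that handles the backslash escapes for the dot and colon characters, and it deliberately leaves bracket interpolation out to stay short.

```perl
use strict;
use warnings;

# Illustrative sketch only: split_path() is a hypothetical helper that splits a
# path name on unescaped '.' and ':' characters. Bracket interpolation is
# deliberately left out to keep the example short.
sub split_path {
    my ($path) = @_;
    my @parts = ( [ '', '' ] );    # [ separator, name ]; the first has no separator
    while ( $path =~ /\G(?:\\(.)|([.:])|([^\\.:]+))/gcs ) {
        if    ( defined $1 ) { $parts[-1][1] .= $1 }        # escaped character
        elsif ( defined $2 ) { push @parts, [ $2, '' ] }    # start a new component
        else                 { $parts[-1][1] .= $3 }        # ordinary characters
    }
    return @parts;
}

for my $path ( 'foo\.bar.baz', 'baz\:2:4.quux' ) {
    print join( ' ', map { "$_->[0]<$_->[1]>" } split_path($path) ), "\n";
}
```

Each printed component shows its leading separator followed by the component name in angle brackets, so an escaped dot or colon appears inside the name rather than as a separator.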

Directives

Directive tokens are identified by the hash mark '#' preceding them. They take the following form:

 #directive arg1 arg2 ...

All operations including defining variables, weighing values against each other, comparing strings, and connecting to a database, are performed by directives.

Directives are, in reality, just Perl subroutines which are typically organized in modules or plug-ins and are invoked by the parsing engine whenever a directive token is encountered in a template. Once executed, the directive and all of its associated arguments are interpolated by the parsing engine with the directive's return value.

Arguments

When writing templates, it is essential to provide the exact number of arguments required by any given directive so as to avoid errors and unpredictable results. Each directive requires its own fixed number of arguments. For example, the #define directive in the DirectiveSet plug-in requires two arguments, a variable name and a value.

 #define var "value"

It is perfectly acceptable to use directives as arguments. In the following example, the #define directive's arguments consist of the variable name var, and the #add directive. The #add directive requires two arguments (1 and 2, in the example below), and since its function is to return the sum of its arguments, it returns '3', which then replaces the expression #add 1 2. Ultimately, the value used as the second argument for the #define directive is '3', where it is assigned to the variable var.

 #define var #add 1 2

...essentially becomes...

 #define var 3
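
The way a fixed argument count lets directives nest without delimiters can be sketched with a toy evaluator. This is not Wigwam's engine: the %DIRECTIVES table, eval_token, and the simplified treatment of the variable name as a plain string are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Illustrative sketch only: a toy evaluator, not Wigwam's engine. Each directive
# declares a fixed argument count, so nested directives need no delimiters.
my %vars;
my %DIRECTIVES = (
    add    => { argc => 2, code => sub { $_[0] + $_[1] } },
    define => { argc => 2, code => sub { $vars{ $_[0] } = $_[1]; '' } },
);

sub eval_token {
    my ($tokens) = @_;
    my $tok = shift @$tokens;
    die "missing argument\n" unless defined $tok;
    return $tok unless $tok =~ /^#(\w+)$/;    # literals and names pass through
    my $name = lc $1;
    my $d = $DIRECTIVES{$name} or die "unknown directive #$name\n";

    # Evaluate exactly as many arguments as the directive declares.
    my @args = map { eval_token($tokens) } 1 .. $d->{argc};
    return $d->{code}->(@args);
}

eval_token( [ split ' ', '#define var #add 1 2' ] );
print "var = $vars{var}\n";
```

Supplying too few arguments makes the evaluator run out of tokens, which is the kind of template coding problem the strict_tags option is described as detecting.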

Directive tree

The directive tree is a Perl namespace hierarchy used to organize directives into categories. The internal base of the directive tree is fixed in the Text::Wigwam::Directives name space. By default, any simple directive such as #foo is expected to be located in the Text::Wigwam::Directives namespace. Subsequent nested packages represent branches of the directive tree and can be accessed explicitly using the following directive syntax:

 #name::space::here::directive_name_here

In the following example, the Wigwam engine would expect to find the #connect directive within the Text::Wigwam::Directives::mysql name space.

 #mysql::connect ...

Certain features are available which can be used to temporarily overload directives, or restrict templates from accessing certain directives within the directive tree. These features are controlled by the directive_root and directive_path options.

The directive_root option restricts templates from calling directives stored outside its specified branch of the directive tree.

The following table lists some examples of how the engine will attempt to locate a given directive.

 directive_root    directive     Name space
    (null)          #foo        Text::Wigwam::Directives
     quux           #foo        Text::Wigwam::Directives::quux
     blah      #fee::fye::foo   Text::Wigwam::Directives::blah::fee::fye
  roe::sham       #boe::foo     Text::Wigwam::Directives::roe::sham::boe

The directive_path is a list of namespaces which are used to search for directives. The following table lists some examples of how the engine searches through the directive tree for directives given various directive_path settings.

 directive_root  directive_path    directive     name space search order
    (null)     bazz,bazz::quux     #foo        Text::Wigwam::Directives::bazz
                                               Text::Wigwam::Directives::bazz::quux
                                               Text::Wigwam::Directives

 directive_root  directive_path    directive     name space search order
    (null)        bazz,quux    #fe::fi::foo    Text::Wigwam::Directives::bazz::fe::fi
                                               Text::Wigwam::Directives::quux::fe::fi
                                               Text::Wigwam::Directives::fe::fi

As the previous examples demonstrate, the root name space is searched last by default. It can be placed elsewhere in the search order by adding an empty entry to the path, as the following tables demonstrate.

 directive_root  directive_path     directive     name space search order
    (null)     ,bazz,bazz::quux     #foo        Text::Wigwam::Directives
                                                Text::Wigwam::Directives::bazz
                                                Text::Wigwam::Directives::bazz::quux

 directive_root  directive_path     directive     name space search order
    (null)        bazz,,quux    #fe::fi::foo    Text::Wigwam::Directives::bazz::fe::fi
                                                Text::Wigwam::Directives::fe::fi
                                                Text::Wigwam::Directives::quux::fe::fi

The following tables list some examples of how the engine searches for directives given various directive_path and directive_root settings.

 directive_root  directive_path     directive     name space search order
   foobar         bazz,quux         #foo        Text::Wigwam::Directives::foobar::bazz
                                                Text::Wigwam::Directives::foobar::quux
                                                Text::Wigwam::Directives::foobar

 directive_root  directive_path     directive     name space search order
     bar          bazz,quux     #fe::fi::foo    Text::Wigwam::Directives::bar::bazz::fe::fi
                                                Text::Wigwam::Directives::bar::quux::fe::fi
                                                Text::Wigwam::Directives::bar::fe::fi

 directive_root  directive_path     directive     name space search order
   foobar         bazz,,quux        #foo        Text::Wigwam::Directives::foobar::bazz
                                                Text::Wigwam::Directives::foobar
                                                Text::Wigwam::Directives::foobar::quux

This hierarchy could be used to implement a set of strategically designed plug-ins and modules which, when used with these restrictive option settings, limit a template coder to a specific set of directives. It can also be used to enable any given child template to temporarily overload directives.
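
The search orders listed in the tables above follow a regular pattern, which the sketch below reproduces. It is only an illustration derived from those tables: search_order is a hypothetical helper and is not taken from Wigwam's source.

```perl
use strict;
use warnings;

# Illustrative sketch only: search_order() is a hypothetical helper that
# reproduces the lookup tables above; it is not taken from Wigwam's source.
sub search_order {
    my ( $root, $path, $directive ) = @_;
    my $base = 'Text::Wigwam::Directives';
    $base .= "::$root" if defined $root && length $root;

    # Split off any explicit name space given in the directive itself.
    ( my $name = $directive ) =~ s/^#//;
    my @ns = split /::/, $name;
    pop @ns;    # drop the directive name, keep its name space

    # An empty path entry stands for the root; append one if none is given.
    my @entries = map { my $e = $_; $e =~ s/^\s+|\s+$//g; $e }
        split /,/, ( defined $path ? $path : '' ), -1;
    push @entries, '' unless grep { !length } @entries;

    my @order;
    for my $entry (@entries) {
        push @order, join '::', $base, ( length($entry) ? $entry : () ), @ns;
    }
    return @order;
}

print "$_\n" for search_order( 'foobar', 'bazz,,quux', '#foo' );
```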

Interpolation

Self-interpolation is supported in directive names. The motivation for this feature was to add a layer of abstraction in calling directives which exist within a sub-level of the current branch of the directive tree.

 [!!
	#define db "pgsql"
	#[db]::connect
 !!]

translates to

 [!!
 	#pgsql::connect
 !!]

This feature is not limited to the name space portion of the directive handle, as demonstrated in the following example.

 [!!
 	#define action "add"
 	#[action] 2 3
 !!]

translates to

 [!!
 	#add 2 3
 !!]

which produces

 5

Embedded parsing options

Embedded parsing options reside within the template itself and can be used to define the behavioral characteristics of the parsing engine. They are manipulated via a specially formatted configuration tag. This mechanism can be used to override pre-specified default values for the current template, or to restrict (lock) or alter the default settings for child templates. They take the following form:

  << Wigwam class: key1=value1; key2=value2; >>

Where class is one of Options, Restrict, Defaults, or Settings (case insensitive). The key=value pairs that follow the class declaration are made members of that class.

Configuration tags are very passive. Any unsupported class definitions and all key=value pairs defined therein will be silently ignored. Furthermore, no checks are made to determine whether any given option key serves a purpose, nor whether its value is of the required type or format.

The Options class is the only one that will affect the current template. The remainder of the classes are used to define defaults or restrictions for subsequently invoked child templates, and they directly correlate to their counterparts described in the Interface section. Note also that any predeclared Restrict settings still have precedence over all.

Only a single configuration tag may exist in any given template file, as only the first encountered configuration tag will be parsed. Any other configuration tags within the template will be ignored and treated as plain text. However, the various classes may be cascaded within this single configuration tag:

 << Wigwam Options: code_open=[!!; code_term=!!]; Restrict: engine=Foomatic; >>

Duplicate keys within a common class will be ignored.

 << Wigwam
     Options:
      code_open=[!!;
      code_term=!!];
     Restrict:
      engine=Foomatic;
     Options:
      code_open=[%; /* ignored */
      strict_tags=0;
      code_term=%]; /* ignored */
 >>

The colon, semicolon, and equals characters must be escaped using the backslash character '\' when used as part of a key or value in a configuration tag.

 << Wigwam
 	Options:
 	 directive_root=bazz\:\:quux;
 	 directive_path=foo\:\:bar;
 	 some\=crazy\:key=some\;wacky\=value;
 >>
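
The rules described so far (cascaded classes, ignored duplicates, silently ignored unknown classes, and backslash escapes) can be sketched as a small scanner. This is only an illustration: parse_config_body is a hypothetical helper that reads the text between the opening "Wigwam" keyword and the closing tag, it does not handle comments, and it is not Wigwam's own parser.

```perl
use strict;
use warnings;

# Illustrative sketch only: parse_config_body() is a hypothetical helper that
# reads the text found inside a configuration tag. It is not Wigwam's parser.
sub parse_config_body {
    my ($body) = @_;
    my %supported = map { $_ => 1 } qw( options restrict defaults settings );
    my ( %conf, $class, $key );
    my $buf = '';

    while ( $body =~ /\G(?:\\(.)|([:=;])|([^\\:=;]+))/gcs ) {
        my ( $esc, $sep, $text ) = ( $1, $2, $3 );
        if ( defined $esc )  { $buf .= $esc;  next }    # escaped char, kept literally
        if ( defined $text ) { $buf .= $text; next }    # ordinary text

        ( my $item = $buf ) =~ s/^\s+|\s+$//g;
        $buf = '';
        if    ( $sep eq ':' ) { $class = lc $item; undef $key }
        elsif ( $sep eq '=' ) { $key = $item }
        else {    # ';' ends a key=value pair; the first definition wins
            $conf{$class}{$key} = $item
                if defined $key
                && defined $class
                && $supported{$class}
                && !exists $conf{$class}{$key};
            undef $key;
        }
    }
    return \%conf;
}

my $conf = parse_config_body(
    'Options: directive_root=bazz\:\:quux; code_open=[!!; code_open=[%; Bogus: x=1;'
);
print "$_ = $conf->{options}{$_}\n" for sort keys %{ $conf->{options} };
```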

Here's an example that demonstrates some available option keys and their respective standard values:

 << Wigwam
     Options:
      engine=Fusion;
      strict_tags=1;
      code_open=[!!;
      code_term=!!];
      code_open_trim=[!~;
      code_term_trim=~!];
      numbers=float;
      plugins=DirectiveSet;
      modules=CgiTools, HtmlTools;
 >>

The default Wigwam configuration tag entities, << and >>, can be changed to whatever characters you see fit by using the config_open & config_term options exclusively within the Restrict hash reference parameter during the object construction phase. This is a global alteration and applies not only to the root template, but to all child templates as well. They cannot be changed throughout the life of the template object.

As an example, a CGI wrapper may alter the Wigwam configuration tag identifiers so that they take on a form similar to SSI in order to better blend in with HTML documents:

 <!--Wigwam Options: set_engine=Fusion; set_priv_vspace=0; -->

Development environment

Modules

Modules are simply Perl external library files which are used to organize the subroutines that make up custom directives. The directives contained within these modules become available to templates by loading them via the modules option. Where Wigwam searches for these library files is determined by the module directory path (see Directory Paths).

 << Wigwam Options: modules=CgiTools, HtmlTools; >>

As with all Perl external library files, the last executable statement in a Wigwam module should result in a true value. Simply putting "1;" or "return 1;" as the last executable statement in the file will suffice. However, sometimes it is necessary for a module or plug-in to declare some Globals upon initialization. This can be accomplished by returning a reference to a subroutine instead of simply "1;" as the last executable statement in the file. This code will be executed and passed an API object as its only argument each time the module is initialized by a new template object.
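
The return-value convention described above can be sketched as follows. This is an illustration only: load_module_source is a hypothetical loader working on source strings rather than files, and the plain hash passed as $api merely stands in for the API object that Wigwam would supply.

```perl
use strict;
use warnings;

# Illustrative sketch only: load_module_source() is a hypothetical loader that
# mimics the convention described above. The $api argument is a plain hash
# here, standing in for the API object Wigwam would pass.
sub load_module_source {
    my ( $source, $api ) = @_;
    my $result = eval $source;
    die "module failed to load: $@" if $@;
    die "module did not return a true value\n" unless $result;
    $result->($api) if ref $result eq 'CODE';    # optional initializer
    return 1;
}

my %api = ( log => [] );

# A plain module that simply ends with a true value.
load_module_source( '1;', \%api );

# A module that ends with a subroutine reference used as its initializer.
load_module_source( 'sub { push @{ $_[0]{log} }, "initialized" };', \%api );

print "log: @{ $api{log} }\n";
```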

Package declarations within Wigwam plug-ins and modules must begin with the base name space Text::Wigwam::Directives (the base of the directive tree). Subsequent nested name spaces must consist entirely of lower-case characters (e.g. Text::Wigwam::Directives::blah::blah). Modules which do not explicitly declare a package will be loaded into the base name space, Text::Wigwam::Directives, as this is the default name space in which the Wigwam engine looks for directives.

Plug-ins

Plug-ins are functionally equivalent to modules, but differ in that they may associate directives with single-character symbols (e.g. '{' => #do, '(' => #list, etc.) and are given priority over modules in terms of loading and initialization. Plug-ins are stored separately from modules in the 'Text/Wigwam/Plugins' folder and are loaded via the plugins option.

 << Wigwam Options: plugins=DirectiveSet, MyCustomSet; >>

Writing directives

Writing a directive requires two subroutines: a prototype and a handler.

Handlers use the following naming convention:

 sub _directive{ }

Prototypes are named like their handler counterpart but have "_proto" prepended:

 sub _proto_directive{ }

These two Perl subroutines make up a directive named #directive. The existence of these subroutines is detected at run-time when a directive is encountered by the engine during the template parsing process. If either of these subroutines is absent, the engine will throw an exception.

Note that both subroutine names must use all lower case, as the engine will associate #directive, #Directive, #DIRective, #DIRECTIVE, etc. to the same all-lower-case handler/prototype pair.
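
To illustrate the naming convention and the case-insensitive matching, here is a rough sketch of how such a lookup could work. The find_directive helper and the #shout directive are hypothetical and are not part of Wigwam's API; the real engine's lookup may differ.

```perl
use strict;
use warnings;

# A handler/prototype pair living in the default directive name space.
package Text::Wigwam::Directives;
sub _proto_shout { [] }        # takes no arguments
sub _shout       { 'HEY!' }

package main;

# Hypothetical lookup helper: lower-cases the token, strips the '#',
# and requires both subroutines to exist.
sub find_directive {
    my ($token) = @_;
    ( my $name = lc $token ) =~ s/^#//;
    my $pkg     = 'Text::Wigwam::Directives';
    my $proto   = $pkg->can("_proto_$name");
    my $handler = $pkg->can("_$name");
    die "Directive #$name is missing its prototype or handler\n"
        unless $proto && $handler;
    return ( $proto, $handler );
}

for my $token ( '#shout', '#Shout', '#SHOUT' ) {
    my ( undef, $handler ) = find_directive($token);
    print "$token => ", $handler->(), "\n";
}
```

All three spellings resolve to the same all-lower-case pair, which is why the subroutine names themselves must be lower case.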

Prototypes

The Wigwam engine uses the information gleaned from prototypes to ensure that directive handlers receive the type of data that they expect for each of their arguments. The engine will perform the necessary type-casting when required to ensure this. This technique puts much of the burden of housekeeping on the Wigwam engine, which contributes to simplifying the task of directive coding.

Prototypes should do nothing more than return a simple array reference. Each argument required by the directive handler is represented by its corresponding element in this array. Each argument's requested data type is represented by the value stored in its corresponding array element. Consequently, the number of arguments required by any given directive is determined by the number of elements in this array.

The values stored within this array consist of numeric constants which represent the various data types. These constants are accessible via methods provided by an object which is passed as an argument to the prototype upon its invocation, or they can be imported from the Text::Wigwam::Const class. The various constants are described in detail below.

As you might expect, prototypes are called by the Wigwam engine before the handler is called. However, prototypes are generally only called once per parsing session upon first encountering any given directive. The information is typically pulled from a cache upon subsequent encounters. This cache can be de-activated via the drip_cache option, thereby forcing the engine to call the prototype at each encounter of a directive. This will degrade performance and is reserved for debugging and engine test purposes.
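
The effect of the prototype cache can be pictured with a small sketch. Everything here is an assumption made for illustration: prototype_for, %proto_cache and the $use_cache flag are hypothetical names, the strings in the prototype stand in for real constants, and the flag does not model the actual values accepted by the drip_cache option.

```perl
use strict;
use warnings;

my $proto_calls = 0;

# Counts its own invocations so the caching effect is visible.
sub _proto_fnord { $proto_calls++; return [ 'SCALAR', 'SCALAR' ] }

my %proto_cache;

# Hypothetical helper: consults the cache first when $use_cache is true.
sub prototype_for {
    my ( $name, $use_cache ) = @_;
    return $proto_cache{$name} if $use_cache && $proto_cache{$name};
    my $proto = main->can("_proto_$name")
        or die "No prototype for #$name\n";
    return $proto_cache{$name} = $proto->();
}

# Three encounters with caching enabled.
prototype_for( 'fnord', 1 ) for 1 .. 3;
my $cached_calls = $proto_calls;

# Three encounters with caching disabled.
( $proto_calls, %proto_cache ) = (0);
prototype_for( 'fnord', 0 ) for 1 .. 3;
my $uncached_calls = $proto_calls;

print "with cache: $cached_calls call(s), without: $uncached_calls call(s)\n";
```

With the cache in play the prototype runs once; without it, the prototype runs at every encounter, which is the cost mentioned above.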

Handlers

Handlers perform the function of the directive. The Wigwam engine replaces the directive and its associated arguments in the template with the directive handler's return value.

Directive handlers are required to pull exactly the arguments declared in their corresponding prototype, no more and no fewer. Pulling too few arguments, or attempting to pull too many, will cause the Wigwam engine to throw an exception.

Arguments are processed by calling methods of the Wigwam API object which is provided as an argument by the engine upon invocation of the directive handler. These methods are explained in more detail later, but for typical module coding, you'll rarely need anything besides two methods: get_arg, which retrieves the next argument, and kill_arg( n ), which destroys the next n arguments.

Examples

Here's a simple directive called #fnord, which concatenates two given scalars and inserts "(fnord)" between them:

#fnord's prototype:

 sub _proto_fnord{ [ $_[0]->SCALAR, $_[0]->SCALAR ] }

The prototype tells Wigwam that it requires two arguments of type SCALAR.

#fnord's handler:

 sub _fnord{ my $API=shift; return( $API->get_arg." (fnord) ".$API->get_arg ); }

The handler in the preceding example makes two calls to the get_arg method in order to retrieve each of its two arguments. This handler simply returns these arguments with "(fnord)" inserted between them.

The following example demonstrates an alternative method of coding the same thing.

 use Text::Wigwam::Const qw(:all); # Imports Wigwam constants
 sub _proto_fnord{ [ SCALAR, SCALAR ] } 
 sub _fnord{ my $API=shift; return( $API->get_arg." (fnord) ".$API->get_arg ); }

Let's see our #fnord directive in action in a template - we'll assume that the module containing our #fnord directive routines was already saved as "Fnord.pm" into the appropriate modules folder...

 << Wigwam Options: modules=Fnord; >>
 [!!#fnord "Wigwam modules" "are fun!"!!]

produces:

 Wigwam modules (fnord) are fun!
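
When developing a directive, it can be handy to exercise its handler outside of a template. The sketch below does this with a hypothetical MockAPI class that merely hands out pre-supplied values; it is a stand-in for illustration and does not reproduce the real API object's casting or error handling.

```perl
use strict;
use warnings;

# MockAPI: hypothetical stand-in for the Wigwam API object.
package MockAPI;
sub new { my ( $class, @args ) = @_; return bless { args => [@args] }, $class }
sub get_arg {
    my $self = shift;
    die "handler pulled too many arguments\n" unless @{ $self->{args} };
    return shift @{ $self->{args} };
}
sub remaining { scalar @{ $_[0]{args} } }   # arguments not yet pulled

package main;

# The #fnord handler from the example above, unchanged.
sub _fnord{ my $API=shift; return( $API->get_arg." (fnord) ".$API->get_arg ); }

my $api    = MockAPI->new( 'Wigwam modules', 'are fun!' );
my $output = _fnord($api);
print "$output\n";

# Mimic the rule that a handler must pull all of its declared arguments.
die "handler left arguments unpulled\n" if $api->remaining;
```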

Caveat

It is important that you use the kill_arg( n ) method to pull arguments that are being ignored by your directive, as opposed to simply calling get_arg and throwing away its return value. This becomes an issue when, for example, an argument is made up of a directive which performs a physical operation on data. Consider the following example:

 sub _proto_unless{ my $WWC=shift; return [ $WWC->SCALAR, $WWC->ANY ]; } 
 sub _unless { 
 	my $API=shift;
 	if( $API->get_arg ){
		$API->get_arg; # retrieve the next argument,
		return;        # and return null.
	}
 	return $API->get_arg;
 }

This works fine in situations like this:

 [!!#unless foo bar!!]

But consider situations like this:

 [!!#unless foo #undefine bar!!]

In the latter example, #undefine bar will be executed regardless of the value of foo, which is probably not the behavior you want.

The directive handler ought to be rewritten to use kill_arg( n ), instead...

 sub _unless { 
 	my $API=shift; 
 	if( $API->get_arg ){
		$API->kill_arg(1); # destroy the next argument,
		return;            # and return null. 
	}
 	return $API->get_arg; 
 }
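
The difference between the two versions can be demonstrated without the engine. In this sketch, LazyAPI is a hypothetical stand-in in which each argument is a closure that only runs when get_arg pulls it, and the second argument imitates "#undefine bar" by deleting a key from a plain hash. None of these names come from Wigwam itself.

```perl
use strict;
use warnings;

# LazyAPI: hypothetical stand-in. Arguments are closures, executed on get_arg
# and discarded unexecuted on kill_arg.
package LazyAPI;
sub new      { my ( $class, @thunks ) = @_; return bless { args => [@thunks] }, $class }
sub get_arg  { my $self = shift; my $thunk = shift @{ $self->{args} }; return $thunk->() }
sub kill_arg { my ( $self, $n ) = @_; splice @{ $self->{args} }, 0, $n; return }

package main;

sub unless_with_get_arg {
    my $API = shift;
    if ( $API->get_arg ) { $API->get_arg; return }        # executes the argument
    return $API->get_arg;
}

sub unless_with_kill_arg {
    my $API = shift;
    if ( $API->get_arg ) { $API->kill_arg(1); return }    # skips the argument
    return $API->get_arg;
}

my %vars;

# Builds the arguments for "#unless foo #undefine bar" with foo set to true.
sub make_args {
    %vars = ( foo => 1, bar => 'keep me' );
    return ( sub { $vars{foo} }, sub { delete $vars{bar}; '' } );
}

unless_with_get_arg( LazyAPI->new( make_args() ) );
my $bar_after_get_arg = exists $vars{bar} ? 'intact' : 'deleted';

unless_with_kill_arg( LazyAPI->new( make_args() ) );
my $bar_after_kill_arg = exists $vars{bar} ? 'intact' : 'deleted';

print "get_arg version:  bar is $bar_after_get_arg\n";
print "kill_arg version: bar is $bar_after_kill_arg\n";
```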

Constants

The Text::Wigwam::Const class provides several constants which are to be used within directive prototypes to specify argument attributes required by their corresponding directive handlers.

 Data type
 Constant   Instructs the Wigwam engine to execute the next argument, and return...

   HASH       a hash reference.
   ARRAY      an array reference.
   SCALAR_REF a scalar reference.
 * BLESSED    a blessed reference (or else throw an exception).
 * NUM        a numeric scalar value (or else throw an exception).
 * ANY        any value - no casting or type-checking is performed.
   SCALAR     a scalar value.
   VAR        a scalar value, unless the given argument is a variable name, in which case,
              return the raw variable name rather than its value. When used in conjunction
              with ARRAY or HASH, the Wigwam engine will vivify this variable's value with
              an empty array or hash, respectively, if it was previously undefined.
 * To be used exclusively. Do not combine with any other data type constant.

In most cases, the following argument modifiers should be used in conjunction with at least one of the above data type constants. Multiple modifiers can be specified per argument, so long as it makes sense.

 Modifier
 Constant   Signals the Wigwam engine that we wish to...
  BLOCK     Execute this argument in a new block scope.
            (see the Globals section).
  EXPR      Retrieve the next argument in the form of an expression so that it may be
            executed at a later time, multiple times, or perhaps not at all.
            Note: An EXPR object is executed by invoking its 'execute' method.
  STRICT    Disallow casting (accepting only the specified data type(s)). Generates
            an exception if the value encountered is of a different type than what is
            specified for this argument.
  TERM      Keep pulling arguments until an #end terminator token is encountered.
            (rarely needed)

The constants can be accessed in several ways. They can all be imported into the current name space, like so:

 use Text::Wigwam::Const qw(:all);

Or, more selectively...

 use Text::Wigwam::Const qw( ARRAY HASH SCALAR );

Importing constants is the recommended technique, as it makes prototypes more readable.

 use Text::Wigwam::Const qw( ARRAY HASH SCALAR );
 sub _proto_mydirective { [ SCALAR, HASH, ARRAY ] }

Alternatively, for convenience, the constants can be accessed as methods of the constants object which is passed to each prototype upon invocation.

 sub _proto_mydirective { [ $_[0]->SCALAR, $_[0]->HASH, $_[0]->ARRAY ] }

These attributes can also be strung together using Perl's bitwise-or operator '|' to indicate a number of acceptable data types for your directive handler. To demonstrate, here is the code for the #reverse directive straight out of the DirectiveSet plug-in...

 sub _proto_reverse{ [ $_[0]->ARRAY | $_[0]->HASH ] }
 sub _reverse{
 	my $arg = $_[0]->get_arg;
 	return { reverse( %{$arg} ) } if $_[0]->is_hash( $arg ); # Hash ref?
 	return [ reverse( @{$arg} ) ];                          # Assume it's an array ref
 }

The prototype in the above example guarantees that we will receive either an array reference or a hash reference upon calling the get_arg method. Therefore, the handler needs only to check the reference type once. This not only simplifies things and saves us some code, but also avoids Perl runtime errors, as the argument's type is assured to be one of two possible types.
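
One way to picture why the bitwise-or operator can combine data types is to treat each constant as a distinct bit. The values below are hypothetical and chosen only for this sketch; the real values are defined by Text::Wigwam::Const and should never be hard-coded. The accepts helper is likewise an invention for illustration.

```perl
use strict;
use warnings;

# Hypothetical bit values, for illustration only.
use constant {
    SCALAR => 1 << 0,
    ARRAY  => 1 << 1,
    HASH   => 1 << 2,
};

# Returns 1 if $type is among the types combined into $mask, else 0.
sub accepts {
    my ( $mask, $type ) = @_;
    return ( $mask & $type ) ? 1 : 0;
}

my $mask = ARRAY | HASH;    # "either an array or a hash reference"

printf "ARRAY: %d, HASH: %d, SCALAR: %d\n",
    accepts( $mask, ARRAY ), accepts( $mask, HASH ), accepts( $mask, SCALAR );
```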

The EXPR data type

EXPR data types are ideally suited for macro and iterator directives, as they can be executed once, many times, or not at all. An EXPR object is executed by calling its execute method, after which it will return the appropriate data type as defined in the prototype where it was declared - if no data type was specified, ANY is assumed.

 sub _proto_badong { [ EXPR ] }
 # is identical to:
 sub _proto_badong { [ EXPR | ANY ] }
 # Or we can force the EXPR type to return a specific data type when executed:
 sub _proto_badong { [ EXPR | ARRAY ] }
 sub _badong {
   my $expr = $_[0]->get_arg;
   my $array = $expr->execute; # We're guaranteed an array reference here.
 }
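
As a sketch of the iterator use case, the following handler executes one expression several times. The #repeat directive, MockExpr and MockAPI are all hypothetical stand-ins written for this example; a real prototype for such a directive might look like [ NUM, EXPR | SCALAR ], but that is an assumption rather than something taken from the DirectiveSet plug-in.

```perl
use strict;
use warnings;

# MockExpr: hypothetical stand-in for an EXPR object; only 'execute' is modelled.
package MockExpr;
sub new     { my ( $class, $code ) = @_; return bless { code => $code }, $class }
sub execute { $_[0]{code}->() }

# MockAPI: hypothetical stand-in that hands out pre-supplied arguments.
package MockAPI;
sub new     { my ( $class, @args ) = @_; return bless { args => [@args] }, $class }
sub get_arg { shift @{ $_[0]{args} } }

package main;

# Hypothetical #repeat handler: executes the expression $count times
# and concatenates the results.
sub _repeat {
    my $API   = shift;
    my $count = $API->get_arg;
    my $expr  = $API->get_arg;
    return join '', map { $expr->execute } 1 .. $count;
}

my $n      = 0;
my $expr   = MockExpr->new( sub { ++$n . ' ' } );   # yields "1 ", "2 ", ...
my $result = _repeat( MockAPI->new( 3, $expr ) );
print "$result\n";
```

Because the expression is re-executed on every pass, each iteration sees the side effects of the previous one, which is what distinguishes an EXPR argument from an ordinary, already-evaluated argument.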

EXPRs execute within the same environment as the template from which they originate. This means that if an expression object is executed from within a template other than the one where it was defined, the EXPR may not have access to the same varspace, directives, etc. as the template that invoked it. Conversely, the EXPR may have access to directives and resources that the calling template doesn't. This feature can be used as a means for providing some advanced functions to templates which are restricted to a very limited branch of the directive tree.

Some other standard methods available in EXPRs:

num: Returns a numeric representation of the size of the expression (usually the number of tokens).
doc: Returns the filename of the template in which it originated.
beautify: Returns a beautified representation of the expression.
engine: Returns the name of the engine that generated the expression.

API

The Wigwam API is an object that is passed by the engine as the first (and only) argument to any given directive handler upon invocation. It provides methods which can be called by directive handlers to perform basic functions, such as handling arguments, manipulating variables, managing Globals, and other useful functions. They are invoked using one of the following techniques:

 my $arg = $_[0]->get_arg; 
 # do something with $arg... 
 #   ..or.. 
 my $API=shift;
 my $arg = $API->get_arg;
 # do something with $arg...

Argument functions

get_arg: Returns the next argument in the form of the data type indicated in the prototype.
kill_arg( n ): Skips the next n arguments without executing them. When called on an #end terminated list (TERM type) argument, kill_arg(1) will skip over the entire list.

Basic variable functions

These functions perform operations on variables stored within Wigwam's variable space.

get_value( path_name ): Returns the value stored in path_name.
set_value( path_name, value ): Stores the specified value into path_name and returns value. If value happens to be a reference to an array or hash, set_value creates a new reference from value and uses that as the value to be stored into path_name and returned.
set_alias( path_name, value ): Stores value into path_name and returns value. Unlike set_value, set_alias blindly stores value into path_name. No copies of references are made.
is_defined( path_name ): Returns true if path_name is defined.
undefine( path_name ): Deletes the specified path_name key, or element.
escape_var( path_name ): Returns an escaped version of a variable name - backslashes are added in front of meta characters, such as the dot, colon, and square-bracket characters.
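
The practical difference between set_value and set_alias can be shown with plain Perl data. The sketch below does not call the Wigwam API; sketch_set_value and sketch_set_alias are hypothetical helpers that model the descriptions above against an ordinary hash, and the copy is assumed to be shallow.

```perl
use strict;
use warnings;

my %varspace;    # plain hash standing in for Wigwam's variable space

# Models set_value: array and hash references are copied (shallowly,
# by assumption) before being stored.
sub sketch_set_value {
    my ( $name, $value ) = @_;
    $value = [ @$value ] if ref $value eq 'ARRAY';
    $value = { %$value } if ref $value eq 'HASH';
    return $varspace{$name} = $value;
}

# Models set_alias: the value is stored as is, so references are shared.
sub sketch_set_alias {
    my ( $name, $value ) = @_;
    return $varspace{$name} = $value;
}

my $list = [ 1, 2, 3 ];
sketch_set_value( copied  => $list );
sketch_set_alias( aliased => $list );

push @$list, 4;    # modify the original after storing it

print "copied:  @{ $varspace{copied} }\n";
print "aliased: @{ $varspace{aliased} }\n";
```

The copied entry is unaffected by the later push, whereas the aliased entry changes along with the original list.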

Data type detection functions

is_num( value ): Returns true if value is a valid numeric value.
is_scalar( thing ): Returns true if thing is a scalar value.
is_list( thing ): Returns true if thing is an array reference.
is_hash( thing ): Returns true if thing is a hash reference.
is_scalar_ref( thing ): Returns true if thing is a scalar reference.
is_blessed( thing ): Returns thing's underlying referent type if thing is a blessed reference, otherwise returns false.
ref_type( thing ): Returns the thing's underlying referent data type.
is_expr( thing ): Returns true if thing is a valid Wigwam expression object.
Note also that the is_expr method is the only one of those listed here that will return true when passed an EXPR object, regardless of its true underlying referent type.

Miscellaneous functions

doc

Returns the current template filename, or an empty string if the template was provided as a text string.

exception( string )

Prepends some brief text to string indicating which template caused the exception and stores it in the die Global.

 $API->exception( 'An error occurred!' );

debug ( string )

Adds string as a comment that will appear at the current location in the template when reconstructed by Wigwam's template debugger facility.

load_modules( string )

Attempts to load all modules specified in the comma-delimited string. Returns undef upon success, or an error message if there was an error.

 my $err = $_[0]->load_modules( 'HtmlTools, CgiTools' );
 $_[0]->exception( $err ) if $err;
 return;

template( filename )

Attempts to locate and execute the child template specified by filename, and returns a list consisting of an error message (or null string) followed by the resulting text produced by the template.

 my( $error, $result ) = $API->template( 'ChildTemplate' );
 if( $error ){ $API->exception( $error ); return undef }
 return $result;

get_options( list )

Returns the values of configuration options specified in list for the currently executing template.

Returns a list of configuration values when called in list context.

 my ( $engine, $default_engine ) = $API->get_options( qw( engine default_engine ) );
 return if $engine ne $default_engine;

When called in scalar context, only the value of the first parameter in list is returned.

 return if $API->get_options( qw( engine ) ) ne $API->get_options( qw( default_engine ) );

When no parameters are given in list, the entire configuration hash is returned in the form of a key/value list. This is handy for making a copy of the entire configuration hash.

 my %config_hash = $API->get_options();
 return if $config_hash{engine} ne $config_hash{default_engine};
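
The three calling conventions can be summarized in one sketch. The MockAPI class below is a hypothetical stand-in that imitates the documented behaviour using Perl's wantarray; it says nothing about how the real method is implemented, and the option values are made up for the example.

```perl
use strict;
use warnings;

# MockAPI: hypothetical stand-in imitating the documented get_options behaviour.
package MockAPI;
sub new { my ( $class, %config ) = @_; return bless { config => {%config} }, $class }

sub get_options {
    my ( $self, @names ) = @_;
    return %{ $self->{config} } unless @names;     # no names: key/value list
    my @values = @{ $self->{config} }{@names};     # hash slice, in request order
    return wantarray ? @values : $values[0];       # list vs. scalar context
}

package main;

my $API = MockAPI->new( engine => 'Fusion', default_engine => 'Totem' );

# List context: one value per requested option.
my ( $engine, $default ) = $API->get_options( qw( engine default_engine ) );

# Scalar context: only the first requested option's value.
my $first = $API->get_options( qw( engine default_engine ) );

# No parameters: a copy of the whole configuration.
my %config = $API->get_options();

print "list: $engine, $default; scalar: $first; keys: ",
    join( ', ', sort keys %config ), "\n";
```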

set_path( class, list )

See Directory Paths

add_path( class, list )

See Directory Paths

Globals functions

For a more detailed description of what globals are and how to use them, visit the Text::Wigwam::Globals module.

new_global to create a new global.
get_global to retrieve a global value.
add_global to add scope to specified globals.
set_global to set the value of a global.
push_global to perform an add_global function followed by a set_global function on a specified global.
del_global to delete scope from specified globals.
inc_scope to push default values into all stacks within the specified scope.
dec_scope to pop all stacks within the scope.
eflag to retrieve the status on eflag enabled globals.
global_exists to retrieve the scope of a specific global.
kill_global to remove globals.

Advanced variable functions

The following variable functions are made available to directive handlers for rare situations where it's required to traverse arbitrary data structures which are not necessarily stored within the normal varspace. They are primarily used by the Wigwam engine and are not typically used in directive handlers, since the basic variable functions will suffice 99% of the time.

generalized_get_value( ref, arr, var )
generalized_set_alias( ref, arr, var, val )
generalized_is_defined( ref, arr, var )
generalized_undefine( ref, arr, var )

Where...

ref is the array or hash reference which will serve as the root level of the data structure to be traversed.

arr indicates whether the root reference, ref, is to be accessed as an array (true) or as a hash (false), i.e. $ref->[ ] vs. $ref->{ }

var is the path name to be traversed.

val is the value to be assigned to the final data element (generalized_set_alias only).

Caveat:

If a path name is embedded within a var inside of square brackets (e.g. foo.[bar]), that embedded element's value is retrieved based on the root varspace regardless of the ref value. As an example, the following routine probably will not return "bazz" as you might expect...

 my $ref = { foo => { quux => 'bazz' }, bar => 'quux' }; 
 return $API->generalized_get_value( $ref, 0, 'foo.[bar]' );

...instead, the call to generalized_get_value will behave similarly to the following semi-pseudo code to retrieve the value of the given path name, foo.[bar] ...

 return $ref->{foo}{$API->get_value('bar')};

Casting

Prototype information is used to determine the type of data required by the directive which is requesting the argument. Casting is performed by the Wigwam engine only if needed, so if a particular directive requests several acceptable types and the retrieved value matches any one of those types, no casting is performed and the value is returned as is.

All blessed references are treated according to their underlying referent data type unless the BLESSED attribute is set, in which case the raw blessed reference value is returned. Should the BLESSED attribute be set and a non-blessed reference is encountered, an exception is generated.

Should the cast facility encounter an EXPR type, it will be executed and its resulting value cast, if required, and then returned.

   Type       Type       Casting 
 Requested  Retrieved  Method (Perl)     Which Returns
  SCALAR     SCALAR         N/A          The unaltered scalar value.
  SCALAR   SCALAR_REF     $$value        The dereferenced scalar value.
  SCALAR     ARRAY       scalar @$a      The number of array elements.
  SCALAR     HASH      scalar keys %$h   The number of hash keys.

  ARRAY      SCALAR      [ $scalar ]     An array reference with a single scalar element, or
          null string        []          an empty array reference if the scalar is a null string.
  ARRAY    SCALAR_REF  [ $$scal_ref ]    The value is dereferenced and cast as a SCALAR.
  ARRAY      ARRAY          N/A          The unaltered array reference.
  ARRAY      HASH        [ %$hash ]      An array reference whose elements take on the
                                         keys/values of the hash in some arbitrary order.

  HASH       SCALAR      { $scalar }     A hashref with the scalar value as its only key, or
          null string        {}          an empty hash reference if the scalar is a null string.
  HASH     SCALAR_REF  { $$scal_ref }    The value is dereferenced and cast as a SCALAR.
  HASH       ARRAY         { @$a }       A hash reference whose key/value pairs are populated by
                                         the array elements.
  HASH       HASH           N/A          The unaltered hash reference.

  ANY         any           N/A          The unaltered value - no casting.

Any time an exception is generated while an argument is being processed for any given directive, the cast routine will return a null version of the requested data type so that its calling directive can make a clean exit.
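
The casting table can be read as three small conversion routines. The sketch below is an illustration in plain Perl, not the engine's cast facility: cast_scalar, cast_array and cast_hash are hypothetical names, blessed references and EXPR objects are ignored, an undefined value is treated like a null string, and the lone hash key produced from a scalar is given an undef value (the table does not say what the value is).

```perl
use strict;
use warnings;

# SCALAR requested.
sub cast_scalar {
    my ($v) = @_;
    my $type = ref $v;
    return $$v             if $type eq 'SCALAR';   # dereferenced scalar
    return scalar @$v      if $type eq 'ARRAY';    # number of elements
    return scalar keys %$v if $type eq 'HASH';     # number of keys
    return $v;                                     # already a scalar
}

# ARRAY requested.
sub cast_array {
    my ($v) = @_;
    my $type = ref $v;
    return $v      if $type eq 'ARRAY';            # unaltered
    return [ %$v ] if $type eq 'HASH';             # keys/values, arbitrary order
    $v = $$v       if $type eq 'SCALAR';           # dereference, then treat as scalar
    return ( defined $v && length $v ) ? [ $v ] : [];
}

# HASH requested.
sub cast_hash {
    my ($v) = @_;
    my $type = ref $v;
    return $v       if $type eq 'HASH';            # unaltered
    return +{ @$v } if $type eq 'ARRAY';           # elements become key/value pairs
    $v = $$v        if $type eq 'SCALAR';          # dereference, then treat as scalar
    return ( defined $v && length $v ) ? { $v => undef } : {};
}

my $count = cast_scalar( [ 'a', 'b', 'c' ] );      # array cast to scalar
my $empty = cast_array('');                        # null string cast to array
my $pairs = cast_hash( [ color => 'red' ] );       # array cast to hash

print "count: $count, empty size: ", scalar @$empty,
    ", color: $pairs->{color}\n";
```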

Engine

The Wigwam engine is responsible for tokenizing templates, executing tokens, maintaining the stacks, and casting data types when required as the template is parsed. Wigwam ships with two engines (Fusion and Totem), which are functionally identical and differ only in processing technique. Having both also serves to demonstrate that engines are interchangeable.

For purposes of writing your own directives, it's helpful to understand the engine's parsing modes: