Overview

CodeWorker is a scripting language distributed under the GNU Lesser General Public License. It is devoted to making the many aspects of generative programming as easy and intuitive as possible. Generative programming is a software engineering approach for producing reusable, tailor-made, evolvable and reliable IT systems with a high level of automation.

The scripting language adapts its syntax to the subject it has to handle:

- an extended-BNF syntax (declarative part of the language) for recognizing the format of the specifications to parse,
- a procedural language for easily manipulating parse trees (the only structured type admitted by CodeWorker), strings, files and directories,
- a JSP-like syntax (imperative part of the language), which facilitates the writing of template-based code generation.

Thanks to this syntax adaptation, the scripting language is able to easily:

- acquire any kind of specification of the IT system to produce (often XML, but not necessarily),
- generate source code in a classical way (like Rational ROSE), managing protected areas of text that accept hand-typed code,
- expand a source file like the class-wizard of Visual C++ (generated text is inserted at specified markups),
- translate from one format to another (LaTeX to HTML, XSL to CodeWorker, ... no limit),
- transform a source file (to instrument a source file with profiling features, ...).

These tasks are executed in a straightforward process, with no binding to an external programming language and with no translation of requirements specification.

1 Building a parse tree

CodeWorker provides two methods for parsing: extended-BNF parse scripts, and procedural scripts that read tokens explicitly.

While parsing files, CodeWorker feeds a data structure called a parse tree. A tree is a convenient structure for representing a hierarchical set of nodes, as in XML for instance. The parse tree is shared by the parse task, which is in charge of populating it, and by the source code generation, which walks through it to generate text.

We suggest using the file extension ".cwp" for extended-BNF parse scripts.

2 A universal source code/text generation

Given a specification provided in any kind of format, CodeWorker will generate source code or text as required in template-based scripts.

The source code generation can use three modes: generation, expansion or translation.

We suggest using the file extension ".cwt" for template-based scripts.

3 About the manual

Efforts are focused on improving the reliability of the examples and of the reference manual in this documentation (but not of the English text, I'm afraid!).

A formal representation describes every function and procedure that CodeWorker provides, with its prototype, a short explanation, an example, and the list of similar functions and procedures. This formal representation is used to generate the CodeWorker source code that handles the parsing, the C++ mapping and the execution of each function and procedure of the scripting language. Because it conforms to what CodeWorker expects in terms of function and procedure prototypes, the same representation is reused to generate the LaTeX part of the reference manual that describes each of them. Examples are executed while the documentation is generated, to make sure they are correct and to report up-to-date output.

The chapter "Getting started" is partially generated too, which guarantees that every script runs successfully and that every example file carries its latest annotations. To ensure that, scripts are executed while the documentation is generated, and example and script files contain formatted comments just before the lines to annotate. When these files are included into the chapter, their content is numbered line by line and the notes are extracted. Notes are written just after the content and refer to the line they explain.

The documentation is written in LaTeX. The great advantage of LaTeX is that it offers powerful text processing and is easy to manipulate for source code generation (it is a text format rather than a binary one, and it accepts comments). Markups are inserted into the documentation at the points where generated text must be included. A markup is a special comment that CodeWorker recognizes. This way of generating is an illustration of the expansion mode mentioned above.

Getting started

This chapter is intended to help you discover the scripting language and how it may serve your software development process.

CodeWorker is delivered with:

Binaries are available in the "bin" directory.

The scripting language adapts its syntax to the nature of the tasks to handle:

Example:

CodeWorker saves time when implementing source code, provided that a detailed design is available. Let us start with a tiny modeling language that only understands object types and that we create just for this example:

      // file "GettingStarted/Tiny.tml":
      1 class A {
      2 }
      3
      4 class B : A {
      5 }
      6
      7 class C {
      8     B[] b
      9 }
      10
      11 class D {
      12     A a
      13     C[] c
      14 }

line 1: we declare the class A, without attributes,
line 4: we declare the class B, which inherits from A,
line 7: we declare the class C that encapsulates an array of B instances,
line 11: we declare the class D that encapsulates an association to an instance of class A and an array of C instances,

4 The parse tree

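The role of the parsing is to populate the parse tree. Let us suppose that, for each class, we need the following attributes:

- name: the name of the class,
- parent: a reference to the parent class, if any,
- listOfAttributes: the array of attributes that the class encapsulates.

The description of an encapsulated attribute will require:

- name: the name of the attribute,
- class: a reference to the class of the attribute,
- isArray: true if the attribute is an array of instances.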

To discover the parse tree, we'll first populate it by hand. To do that, let us run CodeWorker in console mode:

CodeWorker -console

Type the following lines into the console, and be careful not to forget the final semicolon:

insert listOfClasses["A"].name = "A";
traceObject(project);

The insert keyword is used to create new branches in the parse tree. The root is named project, but it doesn't have to be specified, and a sub-node (or attribute) listOfClasses has been added to it. This sub-node is quite special: it contains an array of nodes that describe classes. Items are indexed by a string and are stored in their order of insertion; so, the node in charge of describing the class A is accessed via listOfClasses["A"]. The string "A" is assigned to the attribute listOfClasses["A"].name.

The procedure traceObject(project) shows us the first-level content of the root: the attribute listOfClasses and all its entries (only "A" for the moment). Let us populate the tree with the description of the class B:

set listOfClasses["B"].name = "B";

The set keyword is used to assign a value to an existing branch of the parse tree. If this branch doesn't exist yet, a warning tells you that you have perhaps made a spelling mistake, which helps to avoid inserting unwanted nodes. The node is nevertheless inserted despite the warning. As the language isn't typed, this check avoids some trouble. Let's continue:

ref listOfClasses["B"].parent = listOfClasses["A"];
traceLine(listOfClasses["B"].parent.name);

The node listOfClasses["B"].parent refers to the node listOfClasses["A"], so listOfClasses["B"].parent.name designates the same node as listOfClasses["A"].name. Let us start filling in the tree for class C:

insert listOfClasses["C"].name = "C";
pushItem listOfClasses["C"].listOfAttributes;
local myAttribute;
ref myAttribute = listOfClasses["C"].listOfAttributes#back;

The pushItem command is another way to add a new node to an array, where the item is indexed by the position of the node, starting at 0. The local keyword declares a variable on the stack. This variable is also a parse tree, but it is not attached to the main parse tree project. For convenience, this variable will refer to the last element of the attribute list: myAttribute is shorter to type than listOfClasses["C"].listOfAttributes#back. Notice that the last element of an array is accessed via '#back'. Let us complete the attribute b of class C:

insert myAttribute.name = "b";
ref myAttribute.class = listOfClasses["B"];
insert myAttribute.isArray = true;

The keyword true is a predefined constant string whose value is "true". The keyword false also exists and its value is an empty string.
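To picture the structure that these console commands build, here is a minimal Python sketch. It is not CodeWorker code and says nothing about how CodeWorker stores its tree; the class Node and its methods attr, item, push_item and back are hypothetical names chosen for the illustration. It only assumes what the text above states: every node holds a string, may carry named sub-nodes, and may carry an array whose entries are indexed by a string and kept in insertion order.

```python
class Node:
    """Toy model of a parse-tree node (hypothetical, for illustration only)."""

    def __init__(self):
        self.value = ""   # every node holds a string; "" plays the role of false
        self.attrs = {}   # named sub-nodes, e.g. name or parent
        self.items = {}   # array entries, indexed by a string, in insertion order

    def attr(self, name):
        """Return the named sub-node, creating it if it is missing."""
        return self.attrs.setdefault(name, Node())

    def item(self, key):
        """Return the array entry stored under a string key, creating it if needed."""
        return self.items.setdefault(key, Node())

    def push_item(self):
        """Append an entry indexed by its position, starting at 0 (like pushItem)."""
        return self.item(str(len(self.items)))

    def back(self):
        """Return the last entry of the array (like '#back')."""
        return list(self.items.values())[-1]


project = Node()
classes = project.attr("listOfClasses")

# insert listOfClasses["A"].name = "A";  and the same for B
classes.item("A").attr("name").value = "A"
classes.item("B").attr("name").value = "B"

# ref listOfClasses["B"].parent = listOfClasses["A"];  (the same node, not a copy)
classes.item("B").attrs["parent"] = classes.item("A")

# class C with its attribute b, an array of B instances
c = classes.item("C")
c.attr("name").value = "C"
my_attribute = c.attr("listOfAttributes").push_item()
my_attribute.attr("name").value = "b"
my_attribute.attrs["class"] = classes.item("B")
my_attribute.attr("isArray").value = "true"

print(list(classes.items))
print(classes.item("B").attr("parent").attr("name").value)
```

Storing the parent as a reference to the existing node, rather than as a copy, is what makes a later change to class A visible through B's parent.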

Exercise:

Populate the parse tree with the description of class D.

5 Scanning our design with a BNF-driven script

Now, we'll describe the format of our tiny modeling language with a BNF grammar, as CodeWorker recognizes it (see paragraph BNF syntax for more about it):

      // file "GettingStarted/Tiny-BNF.cwp":
      1 TinyBNF ::=
      2     #ignore(JAVA)
      3     [classDeclaration]*
      4     #empty
      5     => { traceLine("this file is valid"); };
      6 classDeclaration ::=
      7     IDENT:"class"
      8     IDENT
      9     [':' IDENT ]?
      10     classBody;
      11 classBody ::= '{' [attributeDeclaration]* '}';
      12 attributeDeclaration ::= IDENT ['[' ']']? IDENT;
      13 IDENT ::= #!ignore ['a'..'z'|'A'..'Z']+;

line 1: the clause TinyBNF is in charge of reading our design,
line 2: blanks and comments are allowed between tokens, conforming to the JAVA syntax ('/*' '*/' and '//'),
line 3: the clause classDeclaration is repeated as long as class declarations are encountered in the design,
line 4: once no class remains, the end of file must have been reached,
line 5: the '=>' operator allows executing instructions of the scripting language within the BNF-driven script; this one is interpreted once the file has been matched successfully,
line 6: the clause classDeclaration is in charge of reading a class,
line 7: the clause IDENT reads an identifier, and the matched sequence must be equal to "class",
line 8: the name of the class is expected here,
line 9: the declaration of the parent is optional and is announced by a colon,
line 11: the clause classBody reads attributes as long as attributeDeclaration matches,
line 12: the clause attributeDeclaration expects a class identifier, optionally the symbol of an array, and the name of the attribute,
line 13: the clause IDENT reads an identifier, composed of one or more letters, which cannot be separated by blanks or comments (required by the directive #!ignore).

This BNF-driven script only scans the design; it doesn't parse the data. Type the following line into the console to scan the design "Tiny.tml":


parseAsBNF("Scripts/Tutorial/GettingStarted/Tiny-BNF.cwp", project, 
        "Scripts/Tutorial/GettingStarted/Tiny.tml");

Output:

this file is valid

But this script isn't sufficient to complete the parse tree.
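The difference between scanning and parsing may be easier to see with a recognizer written by hand. The following Python sketch mirrors the clauses of "Tiny-BNF.cwp" one by one, under the assumption that each clause either matches and advances, or fails and leaves the position untouched. It is an illustration in another language, not the way CodeWorker executes BNF scripts, and the names Scanner, tiny_bnf and so on are hypothetical. Like the script, it only answers whether the file is valid and keeps no data.

```python
import re

# Blanks plus '//' and '/* */' comments, as allowed by #ignore(JAVA).
IGNORED = re.compile(r"(?:\s+|//[^\n]*|/\*.*?\*/)*", re.DOTALL)
# IDENT ::= #!ignore ['a'..'z'|'A'..'Z']+;
IDENT = re.compile(r"[A-Za-z]+")


class Scanner:
    def __init__(self, text):
        self.text = text
        self.pos = 0

    def skip(self):
        self.pos = IGNORED.match(self.text, self.pos).end()

    def ident(self, expected=None):
        """Match IDENT, optionally constrained to one word (IDENT:"class")."""
        self.skip()
        m = IDENT.match(self.text, self.pos)
        if m is None or (expected is not None and m.group() != expected):
            return None
        self.pos = m.end()
        return m.group()

    def symbol(self, char):
        self.skip()
        if self.text.startswith(char, self.pos):
            self.pos += len(char)
            return True
        return False

    def at_end(self):
        """Counterpart of #empty: nothing but blanks and comments remain."""
        self.skip()
        return self.pos == len(self.text)


def attribute_declaration(s):
    # attributeDeclaration ::= IDENT ['[' ']']? IDENT;
    start = s.pos
    ok = s.ident() is not None
    if ok and s.symbol("["):
        ok = s.symbol("]")
    ok = ok and s.ident() is not None
    if not ok:
        s.pos = start
    return ok


def class_declaration(s):
    # classDeclaration ::= IDENT:"class" IDENT [':' IDENT]? classBody;
    start = s.pos
    ok = s.ident("class") is not None and s.ident() is not None
    if ok and s.symbol(":"):
        ok = s.ident() is not None
    ok = ok and s.symbol("{")
    while ok and attribute_declaration(s):
        pass
    ok = ok and s.symbol("}")
    if not ok:
        s.pos = start
    return ok


def tiny_bnf(text):
    # TinyBNF ::= #ignore(JAVA) [classDeclaration]* #empty
    s = Scanner(text)
    while class_declaration(s):
        pass
    return s.at_end()


DESIGN = """
class A {
}
class B : A {
}
class C {
    B[] b
}
class D {   // comments are skipped
    A a
    C[] c
}
"""
print(tiny_bnf(DESIGN))
```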

6 Parsing our design with a BNF-driven script

We have to improve the previous script, now called "Tiny-BNFparsing.cwp", so that it builds the parse tree that represents the relevant data of the design:

      // file "GettingStarted/Tiny-BNFparsing.cwp":
      1 TinyBNF ::= #ignore(JAVA) [classDeclaration]* #empty
      2     => { traceLine("this file has been parsed successfully"); };
      3 classDeclaration ::=
      4     IDENT:"class"
      5     IDENT:sName
      6     => insert project.listOfClasses[sName].name = sName;
      7     [
      8         ':'
      9         IDENT:sParent
      10         => {
      11             if !findElement(sParent, project.listOfClasses)
      12                 error("class '" + sParent + "' should have been declared before");
      13             ref project.listOfClasses[sName].parent = project.listOfClasses[sParent];
      14         }
      15     ]?
      16     classBody(project.listOfClasses[sName]);
      17 classBody(myClass : node) ::=
      18     '{' [attributeDeclaration(myClass)]* '}';
      19 attributeDeclaration(myClass : node) ::=
      20     IDENT
      21     ['[' ']']?
      22     IDENT;
      23 IDENT ::= #!ignore ['a'..'z'|'A'..'Z']+;

line 5: the name of the class is put into the local variable sName. Note that the first time a variable is encountered after a token, it is declared as local automatically.
line 6: we populate the parse tree as we have proceeded manually,
line 9: the name of the parent class is put into the local variable sParent,
line 11: the parent class must have been declared before: the item is searched into the list of classes,
line 13: we populate the parse tree as we have proceeded manually,
line 16: clauses may accept parameters; here, the current class is passed to classBody that will populate it with attributes,
line 17: the clause classBody expects a parameter as a node; a parameter may be passed as value or node or reference,
line 19: little exercise: complete the clause attributeDeclaration, which is in charge of parsing an attribute of the class passed to the argument myClass,
line 20: remember that you must parse the class name of the association here (attribute myClass.listOfAttributes#back.class refers to the associated class),
line 21: remember that you must parse the multiplicity of the association here (attribute myClass.listOfAttributes#back.isArray is worth true if '[]' is present),
line 22: remember that you must parse the name of the association here (to put into attribute myClass.listOfAttributes#back.name).

Exercise:

Complete the previous clause attributeDeclaration so that it populates an attribute. You'll find the solution in the file "Scripts/Tutorial/GettingStarted/Tiny-BNFparsing1.cwp".

Solution:

      // file "GettingStarted/Tiny-BNFparsing1.cwp":
      1 classBody(myClass : node) ::=
      2     '{' [attributeDeclaration(myClass)]* '}';
      3 attributeDeclaration(myClass : node) ::=
      4     IDENT:sClass
      5     => local myAttribute;
      6     => {
      7         pushItem myClass.listOfAttributes;
      8         ref myAttribute = myClass.listOfAttributes#back;
      9         if !findElement(sClass, project.listOfClasses)
      10             error("class '" + sClass + "' should have been declared before");
      11         ref myAttribute.class = project.listOfClasses[sClass];
      12     }
      13     ['[' ']' => insert myAttribute.isArray = true;]?
      14     IDENT:sName => {insert myAttribute.name = sName;};
      15
      16 IDENT ::= #!ignore ['a'..'z'|'A'..'Z']+;

line 4: the name of the class for the association is assigned to the local variable sClass,
line 5: for convenience, we'll need a local variable that points to the attribute's node,
line 7: the local variable myAttribute is not declared inside this block, because it would disappear at the end of the scope (the closing brace); a new node is added to the list of attributes,
line 8: the local variable myAttribute points to the last item of the list,
line 9: the class specifier of the association must have been declared,
line 11: we populate the parse tree as done by hand,
line 13: the attribute isArray is added only if the type of the association is an array,
line 14: we complete the attribute description by assigning its name.

Type the following line into the console to parse the design "Tiny.tml":


parseAsBNF("Scripts/Tutorial/GettingStarted/Tiny-BNFparsing1.cwp", project, 
        "Scripts/Tutorial/GettingStarted/Tiny.tml");

Output:

this file has been parsed successfully
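For readers who want to compare with a general-purpose language, the same parsing can be sketched in Python. This is an illustration under simplifying assumptions, not a translation of the CodeWorker script: the design is first split into tokens, plain dictionaries stand in for parse-tree nodes, and the names tokens, parse_tiny, peek, take and declared are hypothetical. The checks follow the script: a parent or an attribute type must name a class that has already been declared.

```python
import re

# Comments first, then identifiers, then the symbols of the tiny language.
TOKEN = re.compile(r"//[^\n]*|/\*.*?\*/|[A-Za-z]+|\S", re.DOTALL)
IDENT = re.compile(r"[A-Za-z]+")


def tokens(text):
    """Split the design into identifiers and symbols, dropping comments."""
    return [t for t in TOKEN.findall(text) if not t.startswith(("//", "/*"))]


def parse_tiny(text):
    """Return {class name: node}; each node is a dict standing in for a tree node."""
    toks = tokens(text)
    pos = 0
    classes = {}

    def peek():
        return toks[pos] if pos < len(toks) else None

    def take(expected=None):
        """Consume one token: a given symbol or word, or any identifier."""
        nonlocal pos
        tok = peek()
        wanted = expected if expected is not None else "identifier"
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"{wanted} expected at token {pos}")
        if expected is None and not IDENT.fullmatch(tok):
            raise SyntaxError(f"{wanted} expected at token {pos}")
        pos += 1
        return tok

    def declared(name):
        if name not in classes:
            raise SyntaxError(f"class '{name}' should have been declared before")
        return classes[name]

    while peek() is not None:
        take("class")
        name = take()
        current = classes[name] = {"name": name, "parent": None, "attributes": []}
        if peek() == ":":
            take(":")
            current["parent"] = declared(take())   # a reference, like 'ref'
        take("{")
        while peek() != "}":
            attribute = {"class": declared(take()), "isArray": False}
            if peek() == "[":
                take("[")
                take("]")
                attribute["isArray"] = True
            attribute["name"] = take()
            current["attributes"].append(attribute)
        take("}")
    return classes


DESIGN = "class A {} class B : A {} class C { B[] b } class D { A a C[] c }"
tree = parse_tiny(DESIGN)
for cls in tree.values():
    print(cls["name"], [a["name"] for a in cls["attributes"]])
```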

7 Implementing a leader script

Now, we'll implement a little function that displays the content of our parse tree. We stop using the console here: the call to the parsing and the function go into a leader script. This script will be run from the command line, as shown below.

We suggest using the file extension ".cws" for non-template and non-BNF scripts.

CodeWorker command line to execute:
-script Scripts/Tutorial/GettingStarted/Tiny-leaderScript0.cws

      // file "GettingStarted/Tiny-leaderScript0.cws":
      1 parseAsBNF("Tiny-BNFparsing1.cwp", project, "Scripts/Tutorial/GettingStarted/Tiny.tml");
      2
      3
      4 function displayParsingTree() {
      5     foreach i in project.listOfClasses {
      6         traceLine("class '" + i.name + "'");
      7         if existVariable(i.parent)
      8             traceLine("\tparent = '" + i.parent.name + "'");
      9         foreach j in i.listOfAttributes {
      10             traceLine("\tattribute '" + j.name + "'");
      11             traceLine("\t\tclass = '" + j.class.name + "'");
      12             if existVariable(j.isArray)
      13                 traceLine("\t\tarray = '" + j.isArray + "'");
      14         }
      15     }
      16 }
      17
      18 displayParsingTree();

line 4: a user-defined function without parameters,
line 5: the foreach statement iterates all items of an array; here, all classes are explored,
line 7: check whether the attribute parent exists or not,
line 9: all attributes of the current class i are iterated,
line 12: perhaps the association is multiple,
line 18: a call to the user-defined function,

Output:

this file has been parsed successfully
class 'A'
class 'B'
    parent = 'A'
class 'C'
    attribute 'b'
        class = 'B'
        array = 'true'
class 'D'
    attribute 'a'
        class = 'A'
    attribute 'c'
        class = 'C'
        array = 'true'

8 Generating code with a pattern script

The source code generation exploits the parse tree to generate any kind of output files: HTML, SQL, C++, ...

A pattern script is written in the scripting language of CodeWorker, extended so that the text to put into the output file can be mixed with the instructions to interpret. It makes template-based generation possible. Such a script looks like a JSP template: the script is embedded between the tags '<%' and '%>', or between '@' characters.
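The alternation between text mode and script mode at each '@' can be pictured with a few lines of Python. This sketch is not how CodeWorker is implemented, and it is deliberately limited: it assumes that every script segment is a single dotted expression such as this.name, whose value is written to the output, whereas a real pattern script may contain any instruction. The function name render is hypothetical, and dictionaries stand in for tree nodes.

```python
def render(template, this):
    """Copy text segments verbatim and replace each '@expression@' by its value."""
    output = []
    for index, chunk in enumerate(template.split("@")):
        if index % 2 == 0:
            output.append(chunk)            # text mode: written as is
        else:
            node = {"this": this}           # script mode: follow the dotted path
            for part in chunk.strip().split("."):
                node = node[part]
            output.append(str(node))
    return "".join(output)


b = {"name": "B", "parent": {"name": "A"}}
print(render("public class @this.name@ extends @this.parent.name@ {", b))
```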

We'll start by generating a short JAVA class for each class of the design. The script translates the attributes into JAVA and generates their accessors:

      // file "Scripts/Tutorial/GettingStarted/Tiny-JAVA.cwt":
      1 package tiny;
      2
      3 public class @this.name@ @
      4 if existVariable(this.parent) {
      5     @ extends @this.parent.name@ @
      6 }
      7 @{
      8     // attributes:
      9 @
      10 function getJAVAType(myAttribute : node) {
      11     local sType = myAttribute.class.name;
      12     if myAttribute.isArray {
      13         set sType = "java.util.ArrayList/*<" + sType + ">*/";
      14     }
      15     return sType;
      16 }
      17
      18 foreach i in this.listOfAttributes {
      19     @ private @getJAVAType(i)@ _@i.name@ = null;
      20 @
      21 }
      22 @
      23     //constructor:
      24     public @this.name@() {
      25     }
      26
      27     // accessors:
      28 @
      29 foreach i in this.listOfAttributes {
      30     @ public @getJAVAType(i)@ get@toUpperString(i.name)@() { return _@i.name@; }
      31     public void set@toUpperString(i.name)@(@getJAVAType(i)@ @i.name@) { _@i.name@ = @i.name@; }
      32 @
      33 }
      34 setProtectedArea("Methods");
      35 @}

line 3: swapping to script mode: the value of this.name is put into the output file, knowing that the variable this is determined by the second parameter passed to the procedure generate (see section generate() and below). If the notation appears confusing to you (where does the text mode end and the script mode start, or the contrary?), you can choose to enclose the variables in the tags '<%' and '%>' instead.
line 4: swapping once again to script mode for writing the inheritance, if any,
line 7: swapping to text mode,
line 10: we'll need a function to convert a type specifier of the tiny modeling language to JAVA; it expects the attribute's node (the parameter mode is node, instead of value),
line 13: we have chosen java.util.ArrayList to represent an array, why not?
line 18: swapping to script mode for declaring the attributes of the class,
line 22: swapping to text mode for putting the constructor into the output file,
line 29: swapping to script mode for implementing the accessors to the attributes of the class,
line 30: the predefined function toUpperString converts its parameter to uppercase,
line 34: the procedure setProtectedArea (see section setProtectedArea()) adds a protected area that is intended for the user and that is preserved during a generation process,
line 35: swapping to text mode for writing the closing brace.

The leader script must be changed to request the generation of each class in JAVA:

CodeWorker command line to execute:
-script Scripts/Tutorial/GettingStarted/Tiny-leaderScript1.cws

      // file "Scripts/Tutorial/GettingStarted/Tiny-leaderScript1.cws":
      1 parseAsBNF("Scripts/Tutorial/GettingStarted/Tiny-BNFparsing1.cwp", project, "Scripts/Tutorial/GettingStarted/Tiny.tml");
      2
      3 foreach i in project.listOfClasses {
      4     generate("Scripts/Tutorial/GettingStarted/Tiny-JAVA.cwt", i, "Scripts/Tutorial/GettingStarted/tiny/" + i.name + ".java");
      5 }
      6

line 4: the second argument expects a tree node, which the pattern script accesses via the predefined variable this, encountered above,

Output:

this file has been parsed successfully

Let us have a look at the following generated file:

      // file "Scripts/Tutorial/GettingStarted/tiny/D.java":
      package tiny;
     
      public class D {
          // attributes:
          private A _a = null;
          private java.util.ArrayList/*<C>*/ _c = null;
     
          //constructor:
          public D() {
          }
     
          // accessors:
          public A getA() { return _a; }
          public void setA(A a) { _a = a; }
          public java.util.ArrayList/*<C>*/ getC() { return _c; }
          public void setC(java.util.ArrayList/*<C>*/ c) { _c = c; }
      //##protect##"Methods"
      //##protect##"Methods"
      }
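The two //##protect##"Methods" lines in the generated file delimit the protected area. A rough Python sketch of the idea follows: before rewriting the file, the generator collects the text found between each pair of markers in the previous version and puts it back at the same key. This is an assumption-laden illustration, not CodeWorker's implementation; the names generate and java_type are hypothetical, class types are plain strings here, and the previous file content is passed as a string instead of being read from disk.

```python
import re

MARKER = '//##protect##"Methods"\n'
# A protected area: an opening marker, the hand-typed text, the same marker again.
PROTECT = re.compile(r'//##protect##"([^"]*)"\n(.*?)//##protect##"\1"\n', re.DOTALL)


def java_type(attribute):
    """Tiny-language type to JAVA, as getJAVAType does in the pattern script."""
    if attribute["isArray"]:
        return f"java.util.ArrayList/*<{attribute['class']}>*/"
    return attribute["class"]


def generate(cls, previous=""):
    """Rewrite the whole class, keeping what the user typed in protected areas."""
    protected = dict(PROTECT.findall(previous))
    header = f"public class {cls['name']}"
    if cls.get("parent"):
        header += f" extends {cls['parent']}"
    lines = ["package tiny;", "", header + " {", "    // attributes:"]
    for a in cls["attributes"]:
        lines.append(f"    private {java_type(a)} _{a['name']} = null;")
    lines += ["", "    //constructor:", f"    public {cls['name']}() {{", "    }",
              "", "    // accessors:"]
    for a in cls["attributes"]:
        t, n = java_type(a), a["name"]
        # upper() gives the same result as capitalizing for these one-letter names
        lines.append(f"    public {t} get{n.upper()}() {{ return _{n}; }}")
        lines.append(f"    public void set{n.upper()}({t} {n}) {{ _{n} = {n}; }}")
    text = "\n".join(lines) + "\n"
    return text + MARKER + protected.get("Methods", "") + MARKER + "}\n"


D = {"name": "D", "parent": None, "attributes": [
    {"name": "a", "class": "A", "isArray": False},
    {"name": "c", "class": "C", "isArray": True},
]}

first = generate(D)
# The user types a method by hand inside the protected area...
edited = first.replace(MARKER + MARKER, MARKER + "    void hello() {}\n" + MARKER)
# ...and a second generation keeps it.
second = generate(D, edited)
print(second)
```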

9 Expanding text with a pattern script

We'll learn about another mode of generation: expanding a file. Let us suppose that you want to insert generated code into an existing file. The first step is to insert a special comment at the expected place. This comment begins with ##markup## and is followed by a sequence of characters written between double quotes and called the markup key.

Here is a little HTML file that is going to be expanded:

      // file "Scripts/Tutorial/GettingStarted/Tiny.html":
      <HTML>
          <HEAD>
          </HEAD>
          <BODY>
      <!--##markup##"classes"-->
          </BODY>
      </HTML>

The markup key is called "classes" and is put into the file like this: <!--##markup##"classes"-->.

Now, we'll implement a short script that is intended to populate the markup area with all classes of the design, displayed into tables:

      // file "Scripts/Tutorial/GettingStarted/Tiny-HTML.cwt":
      1 @
      2 if getMarkupKey() == "classes" {
      3     foreach i in project.listOfClasses {
      4         @ <TABLE>
      5             <TR>
      6                 <TD colspan=3><B>@i.name@</B></TD>
      7             </TR>
      8             <TR>
      9                 <TD><EM>Attribute</EM></TD><TD><EM>Type</EM></TD> <TD><EM>Description</EM></TD>
      10             </TR>
      11 @
      12         foreach j in i.listOfAttributes {
      13             @ <TR>
      14                 <TD><I>@j.name@</I></TD><TD><CODE>@
      15             @@j.class.name@@
      16             if j.isArray {
      17                 @[]@
      18             }
      19             @</CODE></TD><TD>@
      20             setProtectedArea(i.name + "::" + j.name);
      21             @</TD>
      22             </TR>
      23 @
      24         }
      25         @ </TABLE>
      26 @
      27     }
      28 }

line 2: the function getMarkupKey() returns the key of the markup that is currently being expanded,
line 3: all classes are presented sequentially in tables of 3 columns, whose title is the name of the class and whose rows are populated with attributes,
line 12: the name, the type and the description of every attribute of the class are presented in the table,
line 15: the type is expressed in the syntax of our tiny modeling language,
line 20: the description of an attribute must be filled in by the user in a protected area, so that it is preserved from one expansion to the next.

The leader script has to take into account the expansion of the HTML file:

CodeWorker command line to execute:
-script Scripts/Tutorial/GettingStarted/Tiny-leaderScript2.cws

      // file "Scripts/Tutorial/GettingStarted/Tiny-leaderScript2.cws":
      1 parseAsBNF("Scripts/Tutorial/GettingStarted/Tiny-BNFparsing1.cwp", project, "Scripts/Tutorial/GettingStarted/Tiny.tml");
      2
      3 foreach i in project.listOfClasses {
      4     generate("Scripts/Tutorial/GettingStarted/Tiny-JAVA.cwt", i, "Scripts/Tutorial/GettingStarted/tiny/" + i.name + ".java");
      5 }
      6
      7 traceLine("expanding file 'Tiny0.html'...");
      8 setCommentBegin("<!--");
      9 setCommentEnd("-->");
      10 expand("Scripts/Tutorial/GettingStarted/Tiny-HTML.cwt", project, "Scripts/Tutorial/GettingStarted/Tiny0.html");
      11 //normal;

line 8: to expand a file, the interpreter has to know the format of comments used for declaring the markups. If the format isn't correct, the file will not be expanded.
line 10: be careful to call the procedure expand() and not to confuse with generate()! Remember that a classic generation rewrites all according to the directives of the pattern script and preserves protected areas, but doesn't recognize markup keys.

Output:

this file has been parsed successfully
expanding file 'Tiny0.html'...

There is little point in showing here the content of the HTML file once it has been expanded, but you can display it (file "Scripts/Tutorial/GettingStarted/Tiny0.html") in your browser. You'll notice in the source code that the expanded text is put between the tags <!--##begin##"classes"--> and <!--##end##"classes"-->. Don't type text into this tagged part, except into protected areas, because the next expansion will destroy the tagged part.
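The expansion mechanism can be approximated in a few lines of Python. The sketch below is only a model built on assumptions: the function name expand and its parameters are hypothetical, the begin tag is assumed to follow the markup comment immediately, and protected areas are ignored. What it does show is why text typed by hand between the begin and end tags is lost: the whole tagged part is replaced on every run, while the rest of the file is left untouched.

```python
import re


def expand(text, produce, begin="<!--", end="-->"):
    """Insert produce(key) after each markup, replacing any earlier expansion."""
    b, e = re.escape(begin), re.escape(end)
    markup = re.compile(
        rf'({b}##markup##"([^"]*)"{e})'
        rf'(?:{b}##begin##"\2"{e}.*?{b}##end##"\2"{e})?',
        re.DOTALL,
    )

    def replace(match):
        tag, key = match.group(1), match.group(2)
        opening = f'{begin}##begin##"{key}"{end}'
        closing = f'{begin}##end##"{key}"{end}'
        return tag + opening + produce(key) + closing

    return markup.sub(replace, text)


def tables(key):
    """Stand-in for the pattern script: one TABLE for the 'classes' markup."""
    return "<TABLE></TABLE>" if key == "classes" else ""


page = '<BODY>\n<!--##markup##"classes"-->\n</BODY>'
once = expand(page, tables)
twice = expand(once, tables)
print(once)
```

The begin and end parameters play the part of setCommentBegin and setCommentEnd in the leader script: with the wrong comment delimiters, no markup is found and the file comes back unchanged.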

To discover more about CodeWorker through a more complex example, please read the next chapter. You'll learn how to translate from one format to another, how to use template functions or BNF clauses (very efficient for readability and extension!), and many other things. But it is recommended to practice a little first.

Discovering more with an example

If this is your first reading, we recommend reading the previous chapter, which is more approachable, before this one.

Let us imagine that we have a design expressed in a simple modeling language, like this:

      // file "GettingStarted/SolarSystem0.sml":
      1 class Planet {
      2     double diameter;
      3     double getDistanceToSun(int day, int month, int year);
      4 }
      5
      6 class Earth : Planet {
      7     string[] countryNames;
      8 }
      9
      10 class SolarSystem {
      11     aggregate Planet[] planets;
      12 }

line 1: a class is declared with the keyword class,
line 2: declaration of attributes in a syntax close to C++ or JAVA,
line 3: declaration of methods in a syntax close to C++ or JAVA,
line 6: a class may inherit from another; the syntax looks like C++, see ':',
line 7: an attribute may be an array; the syntax looks like JAVA,
line 11: an attribute may be an object or an array of objects, and an object may be an aggregation (meaning that it belongs to the instance).

This simple modeling language conforms to a BNF grammar (see paragraph BNF syntax to obtain information about the elements of a BNF syntax):

      world ::= [class_declaration]*
      class_declaration ::= "class" IDENT [':' IDENT]? class_body
      class_body ::= '{' [attribute_decl | method_decl]* '}'
      attribute_decl ::= type_specifier IDENT ';'
      method_decl ::= type_specifier IDENT '(' [parameters_decl]? ')' ';'
      parameters_decl ::= parameter [',' parameters_decl]*
      parameter ::= [parameter_mode]? type_specifier IDENT
      parameter_mode ::= "in" | "inout" | "out"
      type_specifier ::= basic_type ['[' ']']?
      basic_type ::= "int" | "double" | "string" | "boolean" | class_specifier
      class_specifier ::= ["aggregate"]? IDENT
      IDENT ::= ['a'..'z'|'A'..'Z'|'_'] ['a'..'z'|'A'..'Z'|'_'|'0'..'9']*

Starting from the design file "SolarSystem0.sml" seen before, which conforms to the Simple Modeling Language described just above, we propose to implement the source code for classes and a light documentation.

10 The parse tree

CodeWorker doesn't belong to the category of typed languages. It recognizes only the tree as structured type and the string as basic type (that may however represent an integer or a boolean, ...). Each node may contain a string as a value, and/or an array of nodes. The main tree is called project, which is the name of its root node, accessible everywhere into scripts.

Now, the best way to understand how to handle the tree is to run the console, and to practice some examples.

Type CodeWorker at the shell prompt to enter console mode. A cursor waits for your commands.

Type set a = "little"; and press enter. Don't forget the semicolon at the end of the line. If it is missing, the console waits for more input: type the expected semicolon, and the statement is accepted.

What is the impact of the line you typed? You assigned "little" to the variable a, which didn't exist. So, a node named 'a' has been added to the main parse tree (called project, remember), and the variable a points to it. You noticed that a warning occurred. It means that you assigned a value to a node that didn't exist yet. In fact, the instruction set supposes that the variable to assign already exists, and the warning is raised to protect you from a spelling error (perhaps you intended to type another variable that already exists?) or from a logic mistake (at this point of the program, the variable should exist, so what happened?). This protection is important because the language isn't typed, so many errors can only be reported at runtime.

The variable a has been added even though the warning occurred, but we prefer the instruction insert to add a new node properly: type insert b = "big"; and press enter. No warning is displayed. Now, the root node project contains two sub-nodes, called 'a' and 'b', and we check it by typing traceObject(project);. The following lines are displayed:


Tracing variable 'project':
        a = "little"
        b = "big"
End of variable's trace 'project'.
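The difference between set and insert can be mimicked in Python as follows. This is only a model of the behavior described above, assuming that set warns when the node is missing and still creates it, while insert creates it silently; the class Root and the wording of the warning are hypothetical.

```python
import warnings


class Root(dict):
    """Toy stand-in for the root node: names map to string values."""

    def insert(self, name, value):
        # 'insert' creates the node without complaint.
        self[name] = value

    def set(self, name, value):
        # 'set' expects an existing node: it warns, but still assigns.
        if name not in self:
            warnings.warn(f"'{name}' did not exist before this assignment")
        self[name] = value


project = Root()
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    project.set("a", "little")     # warns, yet the node is created
    project.insert("b", "big")     # no warning
    project.set("a", "tiny")       # the node exists now: no warning

print(len(caught), dict(project))
```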

Let's go further. What about storing a list of items?
Type insert classes["Planet"].name = "Planet";. A node called 'classes' has been added to project, and then an array entry called "Planet" has been pushed. This entry points to a node, to which 'name' is added, and the value of the node 'name' is "Planet".

Type insert classes["Earth"].name = "Earth"; and then ask for tracing node 'project'. The following lines are displayed:


Tracing variable 'project':
        a = "little"
        b = "big"
        classes = ""
        classes["Planet", "Earth"]
End of variable's trace 'project'.

Notice that the node 'classes' has no value (but could have!) and contains an array of nodes where entries are "Planet" and "Earth".

To iterate items of array 'classes', type foreach i in classes traceLine("handling class '" + i.name + "'..."); and see the result:


handling class 'Planet'...
handling class 'Earth'...

Variable 'i' is an iterator and is declared locally for processing the foreach instruction. We'll see later that the statement local declares a tree on the stack.

What you know about the parse tree in CodeWorker is sufficient to tackle the next section.

11 Parsing our design

CodeWorker provides two different approaches for parsing files.

11.1 The parsing scripts that read tokens

Those who aren't familiar with a BNF representation will perhaps feel more confident using procedure-driven parsing, where control resides within the implementation and where every token is explicitly read by a dedicated operation. But it means, for instance, that skipping blanks and comments must be requested explicitly between the reading of tokens.
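To give an idea of what "explicitly" means, here is a Python sketch of procedure-driven reading for one method parameter of the form [mode] type name. It is not CodeWorker code: the class Input and the function read_parameter are hypothetical, the type is reduced to a single identifier, and the method names merely echo operations such as skipEmptyCpp, readIdentifier and readIfEqualTo. Every skip of blanks and comments is written out, and the optional mode is handled by remembering the position and going back to it when the word read turns out to be the type.

```python
import re

EMPTY = re.compile(r"(?:\s+|//[^\n]*|/\*.*?\*/)*", re.DOTALL)
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z_0-9]*")


class Input:
    """Cursor over the text being parsed; nothing is skipped implicitly."""

    def __init__(self, text):
        self.text = text
        self.pos = 0

    def skip_empty(self):
        """Skip blanks and C++ comments; return True while input remains."""
        self.pos = EMPTY.match(self.text, self.pos).end()
        return self.pos < len(self.text)

    def read_identifier(self):
        """Return the identifier at the cursor, or "" if there is none."""
        m = IDENTIFIER.match(self.text, self.pos)
        if m is None:
            return ""
        self.pos = m.end()
        return m.group()

    def read_if_equal_to(self, expected):
        if self.text.startswith(expected, self.pos):
            self.pos += len(expected)
            return True
        return False


def read_parameter(inp):
    """Read '[mode] type name' and return (mode, type, name)."""
    inp.skip_empty()
    position = inp.pos                 # remember where the word starts
    mode = inp.read_identifier()
    if mode not in ("in", "out", "inout"):
        inp.pos = position             # not a mode: go back, it was the type
        mode = ""
    inp.skip_empty()
    type_name = inp.read_identifier()
    inp.skip_empty()
    name = inp.read_identifier()
    if not type_name or not name:
        raise SyntaxError("parameter type and name expected")
    return mode, type_name, name


print(read_parameter(Input("int /* day of month */ day")))
print(read_parameter(Input("inout double x")))
```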

The parsing scripts that read tokens are the oldest way to parse in CodeWorker, and the fastest too. But they don't offer the same flexibility as BNF scripts, which are syntax-oriented.

Below is an example of what a script that reads tokens looks like:

      // file "GettingStarted/SimpleML-token-reading.cws":
      1 declare function readType();
      2
      3 while skipEmptyCpp() {
      4     if !readIfEqualToIdentifier("class") error("'class' expected");
      5     skipEmptyCpp();
      6     local sClassName = readIdentifier();
      7     if !sClassName error("class name expected");
      8     skipEmptyCpp();
      9     if readIfEqualTo(":") {
      10         skipEmptyCpp();
      11         local sParentName = readIdentifier();
      12         if !sParentName error("parent name expected for class '" + sClassName + "'");
      13         skipEmptyCpp();
      14     }
      15     if !readIfEqualTo("{") error("'{' expected");
      16     skipEmptyCpp();
      17     while !readIfEqualTo("}") {
      18         skipEmptyCpp();
      19         readType();
      20         skipEmptyCpp();
      21         local sMemberName = readIdentifier();
      22         if !sMemberName error("attribute or method name expected");
      23         skipEmptyCpp();
      24         if readIfEqualTo("(") {
      25             skipEmptyCpp();
      26             if !readIfEqualTo(")") {
      27                 do {
      28                     skipEmptyCpp();
      29                     local iPosition = getInputLocation();
      30                     local sMode = readIdentifier();
      31                     if !sMode error("parameter type or mode expected");
      32                     if (sMode != "in") && (sMode != "out") && (sMode != "inout") {
      33                         setInputLocation(iPosition);
      34                         set sMode = "";
      35                     }
      36                     skipEmptyCpp();
      37                     readType();
      38                     skipEmptyCpp();
      39                     local sParameterName = readIdentifier();
      40                     if !sParameterName error("parameter name expected");
      41                     skipEmptyCpp();
      42                 } while readIfEqualTo(",");
      43                 if !readIfEqualTo(")") error("')' expected");
      44             }
      45             skipEmptyCpp();
      46         }
      47         if !readIfEqualTo(";") {
      48             error("';' expected to close an attribute, instead of '" + readChar() + "'");
      49         }
      50         skipEmptyCpp();
      51     }
      52 }
      53 traceLine("the file has been read successfully");
      54
      55 function readType() {
      56     local sType = readIdentifier();
      57     if !sType error("type modifier or name expected, instead of '" + readChar() + "'");
      58     if sType == "aggregate" {
      59         skipEmptyCpp();
      60         sType = readIdentifier();
      61         if !sType error("aggregated class name expected");
      62     }
      63     skipEmptyCpp();
      64     if readIfEqualTo("[") {
      65         skipEmptyCpp();
      66         if !readIfEqualTo("]") error("']' expected to close an array declaration");
      67     }
      68 }

line 1: forward declaration of function readType(), so that the explanations can start with how to implement the BNF clause world ::= [class_declaration]*,
line 3: loop until the end of file is reached, skipping blanks and C++ comments: skipEmptyCpp() returns false only if an error occurs while reading the stream or if the end of file has been reached,
line 4: expects token "class" as an identifier (doesn't accept "class" as the beginning of another identifier, such as "classes"). If not found, an error occurs. This token announces a class declaration.
line 5: a disadvantage of procedure-driven reading/parsing: you must not forget to skip blanks and comments explicitly,
line 6: populates a local variable with an identifier token that represents the name of the class
line 7: if an identifier token hasn't been found (token is empty), an error is thrown,
line 9: if the file location points to ":", announcing the inheritance, function readIfEqualTo(":") returns true, and the location moves after the matched expression. If it fails, the file location remains the same.
line 15: body of the class declaration expected
line 17: while inside the class body, reading of attribute and method members,
line 19: we don't conform exactly to the BNF: beginning of method and attribute declaration is factorized,
line 21: name of the attribute or method member,
line 24: no more ambiguity: it starts with a parenthesis when the member is a method,
line 27: the method expects at least one parameter,
line 29: we keep the current file position, to be able to come back if the next token isn't an access mode ("in", "out" or "inout"),
line 33: we were reading a basic type instead of a parameter access mode: we come back to the beginning of this token and the mode is set to empty (no mode). Of course, it is possible not to waste time like this, and to optimize function readType() by passing the token as a parameter. But this is an opportunity to discover how to handle the file position.
line 37: type of the current parameter is expected,
line 39: name of the current parameter is expected,
line 42: parameters are separated by commas,
line 47: both attributes and methods must end with a semicolon,
line 48: function readChar() reads just one character, or returns an empty string if the end of file has been reached,
line 53: once the file has been read completely, a success message is written,
line 55: user-defined function; it may return a value or not. The declaration always starts with keyword function, even if it announces a procedure (no return value). Reading a type is needed at several points of the grammar, so the code is factorized in the procedure readType(). It doesn't return any value about success or failure, because an error is thrown in case of syntax mismatch.
line 58: is the keyword a modifier? If not, sType contains a basic type or a class name,
line 60: reads the name of the aggregated class,
line 64: the type may be an array, represented by [],

This script seems quite far from the BNF of our simple modeling language, although it implements it in a procedural way. It is able to read a well-formed design file, such as our solar system presented at the beginning of the chapter. It doesn't populate a parse tree yet, but it produces contextual error messages when the design file doesn't conform to the BNF.
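The mechanics described above (explicit skipping of blanks and comments, reading an identifier, and rewinding the input position when a token turns out not to be an access mode) can be sketched as follows. This is a Python analogue under stated assumptions, not the CodeWorker implementation: the class TokenReader and the function read_mode are hypothetical names that only mimic the behaviour of skipEmptyCpp(), readIdentifier(), readIfEqualTo(), getInputLocation() and setInputLocation() as explained in this section.

```python
import re

_BLANKS_AND_COMMENTS = re.compile(r"(?:\s|//[^\n]*|/\*.*?\*/)*", re.DOTALL)
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z_0-9]*")


class TokenReader:
    """Hypothetical analogue of the token-reading functions used above."""

    def __init__(self, text):
        self.text = text
        self.pos = 0

    def skip_empty_cpp(self):
        # Skip blanks plus // and /* */ comments; return False at end of input.
        self.pos = _BLANKS_AND_COMMENTS.match(self.text, self.pos).end()
        return self.pos < len(self.text)

    def read_identifier(self):
        # Return the identifier at the current position, or "" if there is none.
        m = _IDENTIFIER.match(self.text, self.pos)
        if not m:
            return ""
        self.pos = m.end()
        return m.group()

    def read_if_equal_to(self, literal):
        # Consume 'literal' if the input starts with it; otherwise do not move.
        if self.text.startswith(literal, self.pos):
            self.pos += len(literal)
            return True
        return False


def read_mode(reader):
    # Try to read an access mode; rewind if the identifier was something else.
    saved = reader.pos                  # plays the role of getInputLocation()
    word = reader.read_identifier()
    if word not in ("in", "out", "inout"):
        reader.pos = saved              # plays the role of setInputLocation()
        return ""
    return word


reader = TokenReader("/* a type, no mode */ int  count")
reader.skip_empty_cpp()
mode = read_mode(reader)        # "int" is not a mode, so the reader rewinds
type_name = reader.read_identifier()
reader.skip_empty_cpp()
param_name = reader.read_identifier()
print(repr(mode), type_name, param_name)
```

Saving and restoring the position keeps the sketch close to the script; passing the already-read token on to the type reader would avoid reading it twice.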

Let's apply the script to the design file:


parseFree("GettingStarted/SimpleML-token-reading.cws",
        project, "GettingStarted/SolarSystem0.sml");

Output:

the file has been read successfully

Now, let's improve the script so that it populates a parse tree:

      // file "GettingStarted/SimpleML-token-parsing.cws":
      1 declare function readType(myType : node);
      2
      3 while skipEmptyCpp() {
      4     if !readIfEqualToIdentifier("class") error("'class' expected");
      5     skipEmptyCpp();
      6     local sClassName = readIdentifier();
      7     if !sClassName error("class name expected");
      8     insert project.listOfClasses[sClassName].name = sClassName;
      9     skipEmptyCpp();
      10     if readIfEqualTo(":") {
      11         skipEmptyCpp();
      12         local sParentName = readIdentifier();
      13         if !sParentName error("parent name expected for class '" + sClassName + "'");
      14         insert project.listOfClasses[sClassName].parent = sParentName;
      15         skipEmptyCpp();
      16     }
      17     if !readIfEqualTo("{") error("'{' expected");
      18     skipEmptyCpp();
      19     local myClass;
      20     ref myClass = project.listOfClasses[sClassName];
      21     while !readIfEqualTo("}") {
      22         skipEmptyCpp();
      23         local myType;
      24         readType(myType);
      25         skipEmptyCpp();
      26         local sMemberName = readIdentifier();
      27         if !sMemberName error("attribute or method name expected");
      28         skipEmptyCpp();
      29         if readIfEqualTo("(") {
      30             insert myClass.listOfMethods[sMemberName].name = sMemberName;
      31             if myType.name != "void" {
      32                 setall myClass.listOfMethods[sMemberName].type = myType;
      33             }
      34             skipEmptyCpp();
      35             if !readIfEqualTo(")") {
      36                 local myMethod;
      37                 ref myMethod = myClass.listOfMethods[sMemberName];
      38                 do {
      39                     skipEmptyCpp();
      40                     local iPosition = getInputLocation();
      41                     local sMode = readIdentifier();
      42                     if !sMode error("parameter type or mode expected");
      43                     if (sMode != "in") && (sMode != "out") && (sMode != "inout") {
      44                         setInputLocation(iPosition);
      45                         set sMode = "";
      46                     }
      47                     skipEmptyCpp();
      48                     local myParameterType;
      49                     readType(myParameterType);
      50                     skipEmptyCpp();
      51                     local sParameterName = readIdentifier();
      52                     if !sParameterName error("parameter name expected");
      53                     insert myMethod.listOfParameters[sParameterName].name = sParameterName;
      54                     setall myMethod.listOfParameters[sParameterName].type = myParameterType;
      55                     if sMode {
      56                         insert myMethod.listOfParameters[sParameterName].mode = sMode;
      57                     }
      58                     skipEmptyCpp();
      59                 } while readIfEqualTo(",");
      60                 if !readIfEqualTo(")") error("')' expected");
      61             }
      62             skipEmptyCpp();
      63         } else {
      64             insert myClass.listOfAttributes[sMemberName].name = sMemberName;
      65             setall myClass.listOfAttributes[sMemberName].type = myType;
      66         }
      67         if !readIfEqualTo(";") error("';' expected to close an attribute, instead of '" + readChar() + "'");
      68         skipEmptyCpp();
      69     }
      70 }
      71 traceLine("the file has been parsed successfully");
      72
      73 function readType(myType : node) {
      74     local sType = readIdentifier();
      75     if !sType error("type modifier or name expected, instead of '" + readChar() + "'");
      76     if sType == "aggregate" {
      77         insert myType.isAggregation = true;
      78         skipEmptyCpp();
      79         sType = readIdentifier();
      80         if !sType error("aggregated class name expected");
      81     }
      82     insert myType.name = sType;
      83     if (sType != "int") && (sType != "double") && (sType != "boolean") && (sType != "string") {
      84         insert myType.isObject = true;
      85     }
      86     skipEmptyCpp();
      87     if readIfEqualTo("[") {
      88         skipEmptyCpp();
      89         if !readIfEqualTo("]") error("']' expected to close an array declaration");
      90         insert myType.isArray = true;
      91     }
      92 }

line 8: about parsing, classes are modeled into node project.listOfClasses[sClassName]. Its attribute name contains the value of sClassName.
line 14: this class inherits from a parent, so the optional attribute parent of the class is populated with the value of sParentName,
line 19: to work more easily with the current class node project.listOfClasses[sClassName], we define a reference to it, called myClass,
line 23: the class is populated with the characteristics of the member only once its declaration has finished. Until then, an attribute declaration can't be told apart from a method declaration, which is why the type declaration and the name of the member are read in a common part first.
line 30: about parsing, methods are modeled into node myClass.listOfMethods[sMemberName],
line 31: attribute name is compulsory into a type node, so if myType.name returns "void", there is no return type,
line 36: to work more easily with the current method node myClass.listOfMethods[sMemberName], we define a reference to it, called myMethod,
line 53: about parsing, parameters are modeled into node myMethod.listOfParameters[sParameterName],
line 64: about parsing, attributes are modeled into node myClass.listOfAttributes[sMemberName],
line 65: the type is allocated on the stack, so it is copied in full into branch type (not as a node reference),
line 71: once the parsing of the file has completed, a success message is written,
line 73: function readType() requires a node into which the description of the type will be populated,
line 77: about parsing, myType.isAggregation contains true if the type is aggregated,
line 82: about parsing, myType.name contains the name of basic type,
line 83: check whether the type is a basic one or a class specifier,
line 84: about parsing, myType.isObject contains true because we suppose that this type is a class specifier (by default: it isn't a basic type),
line 90: about parsing, myType.isArray contains true if type is an array,

The first version of the script was only able to read a well-formed design file written in the simple modeling language. The second version validates the file and populates the parse tree:


parseFree("GettingStarted/SimpleML-token-parsing.cws",
        project, "GettingStarted/SolarSystem0.sml");

Output:

the file has been parsed successfully
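To make the shape of a populated type node more concrete, here is a hedged Python sketch of what readType(myType) records. The function describe_type and its token-list input are hypothetical simplifications (the real script reads directly from the input stream); only the attribute names name, isObject, isArray and isAggregation come from the script above.

```python
BASIC_TYPES = {"int", "double", "boolean", "string"}


def describe_type(tokens):
    """Build a type description from a list of already-split tokens.

    Hypothetical helper: 'tokens' stands in for the input stream, e.g.
    ["aggregate", "Planet", "[", "]"].
    """
    tokens = list(tokens)
    my_type = {}
    if not tokens:
        raise ValueError("type modifier or name expected")
    name = tokens.pop(0)
    if name == "aggregate":
        my_type["isAggregation"] = True
        if not tokens:
            raise ValueError("aggregated class name expected")
        name = tokens.pop(0)
    my_type["name"] = name
    if name not in BASIC_TYPES:
        # Anything that is not a basic type is assumed to be a class specifier.
        my_type["isObject"] = True
    if tokens and tokens[0] == "[":
        if tokens[1:2] != ["]"]:
            raise ValueError("']' expected to close an array declaration")
        my_type["isArray"] = True
    return my_type


print(describe_type(["double"]))
print(describe_type(["aggregate", "Planet", "[", "]"]))
```

Optional attributes are simply absent when they don't apply, which mirrors how the script only inserts isObject, isArray and isAggregation when needed.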

11.2 The parsing scripts that describe a BNF syntax

A BNF is more flexible and more synthetic than a procedural description of parsing. CodeWorker accepts parsing scripts that conform to a BNF.

For more information about the syntax elements of a BNF, have a look at paragraph BNF syntax.

Below is an example of what a BNF script looks like:

      // file "GettingStarted/SimpleML-reading.cwp":
      1 // syntactical clauses:
      2 world ::= #ignore(C++) [class_declaration]* #empty
      3             => { traceLine("file read successfully"); };
      4 class_declaration ::= IDENT:"class" IDENT [':' IDENT]? class_body;
      5 class_body ::= '{' [attribute_decl | method_decl]* '}';
      6 attribute_decl ::= type_specifier IDENT ';';
      7 method_decl ::= [IDENT:"void" | type_specifier] IDENT
      8                 '(' [parameters_decl]? ')' ';';
      9 parameters_decl ::= parameter [',' parameters_decl]*;
      10 parameter ::= [parameter_mode]? type_specifier IDENT;
      11 parameter_mode ::= IDENT:{"in", "inout", "out"};
      12 type_specifier ::= basic_type ['[' ']']?;
      13 basic_type ::= "int" | "boolean" | "double" | "string" | class_specifier;
      14 class_specifier ::= ["aggregate"]? IDENT;
      15
      16 // lexical clauses:
      17 IDENT ::= #!ignore ['a'..'z'|'A'..'Z'|'_']
      18                     ['a'..'z'|'A'..'Z'|'_'|'0'..'9']*;

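line 2: the world to model is composed of classes; some special commands are used: #ignore(C++) means that blanks and C++ comments are ignored between tokens, and #empty checks that the end of file has been reached,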
line 4: a class declaration begins with identifier "class", and IDENT:"class" means that an identifier is expected, and that this identifier must be equal to "class". This instruction isn't identical to "class" IDENT, which validates the expression "classes", where IDENT matches "es". A class has a name, read by the IDENT clause call that follows the keyword, and may inherit from a parent, read by the IDENT clause call after ':',
line 5: the body of a class is composed of attributes and methods
line 6: the attribute is preceded by its type, and IDENT reads the name of the attribute
line 7: the method has a return type or expects void keyword, and may expect some parameters ; IDENT reads the name of the method
line 9: a comma separates parameters
line 10: an access mode may be specified to the parameter ; the type is then specified, and IDENT reads the name
line 11: a parameter may be passed with one of the access modes in, inout or out. The pattern IDENT:{"in", "inout", "out"} means that the identifier must match one of the constant strings listed between braces. It isn't identical to the pattern "in" | "inout" | "out", which validates the beginning of "int".
line 12: a type is a basic type or an array of basic types
line 13: some basic types, including object types
line 14: IDENT reads the class name, and the object may be aggregated
line 17: this clause reads an identifier, such as pretty_pig1 ; #!ignore means that no character is ignored, even if it matches a C++ comment or a blank. If we forget clause #!ignore, then IDENT will validate pretty/*comment*/_pig 1 as an identifier.

This BNF script is very close to the BNF of our simple modeling language, and is able to read a well-formed design file, such as our solar system presented at the beginning of the chapter. It doesn't populate a parse tree yet, and doesn't produce a contextual error message when the design file doesn't conform to the BNF.

Let's apply the BNF script to the design file:


parseAsBNF("GettingStarted/SimpleML-reading.cwp",
        project, "GettingStarted/SolarSystem0.sml");

Output:

file read successfully

About differences, note that each BNF rule must end with a semicolon, and that rules have to indicate how they behave when encountering blanks and comments.
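The distinction drawn in the explanations between IDENT:"class" and the bare constant "class" (and likewise between IDENT:{"in", "inout", "out"} and "in" | "inout" | "out") comes down to comparing a whole identifier versus comparing a prefix. A small Python sketch, with hypothetical function names, illustrates it:

```python
import re

IDENT = re.compile(r"[A-Za-z_][A-Za-z_0-9]*")


def match_keyword_as_identifier(text, keyword):
    # Analogue of IDENT:"keyword": read a whole identifier, then compare it.
    m = IDENT.match(text)
    return bool(m) and m.group() == keyword


def match_keyword_as_prefix(text, keyword):
    # Analogue of the bare constant "keyword": only leading characters are compared.
    return text.startswith(keyword)


print(match_keyword_as_identifier("classes", "class"))  # whole identifier differs
print(match_keyword_as_prefix("classes", "class"))      # prefix matches
```

The same reasoning explains why a prefix comparison would wrongly accept "in" at the start of "int".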

Now, let's improve the BNF script so that it populates a parse tree and throws an error when a syntax error occurs:

      // file "GettingStarted/SimpleML-parsing.cwp":
      1 // syntactical clauses:
      2 world ::= #ignore(C++) [class_declaration]* #empty
      3             => {
      4                 traceLine("file parsed successfully");
      5                 saveProject("Scripts/Tutorial/SolarSystem0.xml");
      6             };
      7 class_declaration ::= IDENT:"class" #continue
      8             IDENT:sClassName
      9                 => insert project.listOfClasses[sClassName].name = sClassName;
      10             [':' #continue IDENT:sParentName
      11                 => insert project.listOfClasses[sClassName].parent = sParentName;
      12             ]?
      13             class_body(project.listOfClasses[sClassName]);
      14 class_body(myClass : node) ::= '{'
      15         [attribute_decl(myClass) | method_decl(myClass)]* '}';
      16 attribute_decl(myClass : node) ::=
      17             => local myType;
      18             type_specifier(myType) IDENT:sAttributeName ';'
      19             => {
      20                 insert myClass.listOfAttributes[sAttributeName].name = sAttributeName;
      21                 setall myClass.listOfAttributes[sAttributeName].type = myType;
      22             };
      23 method_decl(myClass : node) ::=
      24             => local myType;
      25             [IDENT:"void" | type_specifier(myType)]
      26             IDENT:sMethodName '('
      27             #continue
      28                 => {
      29                     insert myClass.listOfMethods[sMethodName].name = sMethodName;
      30                     if myType.name
      31                         setall myClass.listOfMethods[sMethodName].type = myType;
      32                 }
      33             [parameters_decl(myClass.listOfMethods[sMethodName])]? ')' ';';
      34 parameters_decl(myMethod : node) ::=
      35                 parameter(myMethod)
      36                 [',' #continue parameters_decl(myMethod)]*;
      37 parameter(myMethod : node) ::=
      38             [parameter_mode]?:sMode
      39             => local myType;
      40             type_specifier(myType)
      41             IDENT:sParameterName
      42                 => {
      43                     insert myMethod.listOfParameters[sParameterName].name = sParameterName;
      44                     setall myMethod.listOfParameters[sParameterName].type = myType;
      45                     if sMode {
      46                         insert myMethod.listOfParameters[sParameterName].mode = sMode;
      47                     }
      48                 };
      49 parameter_mode ::= IDENT:{"in", "inout", "out"};
      50 type_specifier(myType : node) ::=
      51     basic_type(myType)
      52     ['[' #continue ']' => insert myType.isArray = true; ]?;
      53 basic_type(myType : node) ::=
      54     ["int" | "boolean" | "double" | "string"]:myType.name
      55         |
      56     class_specifier(myType);
      57 class_specifier(myType : node) ::=
      58     ["aggregate" => insert myType.isAggregation = true; ]?
      59     IDENT:myType.name => {insert myType.isObject = true; };
      60
      61 IDENT ::= #!ignore ['a'..'z'|'A'..'Z'|'_']
      62                     ['a'..'z'|'A'..'Z'|'_'|'0'..'9']*;

line 2: the pattern [class_declaration]* always matches the parsed file, so the rule continues in sequence in any case (supposing that no error has occurred in clause class_declaration) and the end of file is checked. If it hasn't been reached, the message "file parsed successfully" isn't written,
line 7: once keyword "class" has been matched, there is no ambiguity: we are handling a class declaration and the rule must continue in sequence. To require that, instruction #continue is written after pattern "class". If a pattern of the sequence doesn't match the parsed file, the parser throws a syntax error automatically.
line 8: the identifier that matches clause call IDENT is assigned to the local variable sClassName: unlike in other types of script, a new variable is considered local, instead of a new attribute added to the current node this,
line 9: about parsing, classes are modeled into node project.listOfClasses[sClassName]. Its attribute name contains the value of sClassName.
line 10: if the class inherits from a parent, ':' must be followed by an identifier (pattern #continue), and the identifier that matches clause call IDENT is assigned to the local variable sParentName,
line 11: this class inherits from a parent, so the optional attribute parent of the class is populated with the value of sParentName,
line 14: clause class_body expects an argument: the class node into which the class members must be described (myClass : node),
line 16: the class is populated with the characteristics of the attribute once its declaration has finished. Otherwise, it may confuse with the beginning of a method declaration. To avoid this ambiguity, we should have factorized the type declaration and the name of the member into a common clause, for example.
line 20: about parsing, attributes are modeled into node myClass.listOfAttributes[sAttributeName],
line 21: the type is allocated on the stack, so it is copied in full into branch type (not as a node reference),
line 23: the class is populated with the characteristics of the method once the opened parenthesis is recognized,
line 27: from here, there is no doubt that we are parsing a method declaration,
line 29: about parsing, methods are modeled into node myClass.listOfMethods[sMethodName],
line 30: attribute name is compulsory into a type node, so if condition myType.name returns false, there is no return type (void),
line 36: a parameter declaration is expected after the comma,
line 43: about parsing, parameters are modeled into node myMethod.listOfParameters[sParameterName],
line 52: about parsing, myType.isArray contains true if type is an array,
line 54: about parsing, myType.name contains the name of basic type,
line 58: about parsing, myType.isAggregation contains true if the object is aggregated,
line 59: about parsing, myType.isObject contains true because this type is a class specifier,
line 61: the lexical clause IDENT recognizes identifiers and might be replaced by the predefined clause #readIdentifier, which does the same work,

The first version of the script was only able to read a well-formed design file written in the simple modeling language. The second version validates the file and populates the parse tree:


parseAsBNF("GettingStarted/SimpleML-parsing.cwp",
        project, "GettingStarted/SolarSystem0.sml");

Output:

file parsed successfully
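The role of #continue described for lines 7 and 10 can be pictured as a point of commitment: before it, a mismatch only means that the rule doesn't apply; after it, a mismatch is a syntax error. The following Python sketch illustrates that idea only. The function parse_class_header, the exception ParseError and the pre-split token list are hypothetical and do not reflect how CodeWorker implements the directive.

```python
class ParseError(Exception):
    """Raised when a committed sequence fails."""


def parse_class_header(tokens):
    """Return (class_name, parent_name or None), or None if not a class declaration.

    'tokens' is a hypothetical pre-split list such as
    ["class", "Earth", ":", "Planet"].
    """
    if not tokens or tokens[0] != "class":
        return None          # no commitment yet: the caller may try another rule
    # From here on we are committed, as after '#continue' in the rule.
    if len(tokens) < 2 or not tokens[1].isidentifier():
        raise ParseError("class name expected")
    name, parent = tokens[1], None
    if len(tokens) > 2 and tokens[2] == ":":
        # Second commitment point: ':' must be followed by a parent name.
        if len(tokens) < 4 or not tokens[3].isidentifier():
            raise ParseError(f"parent name expected for class '{name}'")
        parent = tokens[3]
    return name, parent


print(parse_class_header(["class", "Earth", ":", "Planet"]))
print(parse_class_header(["struct", "Earth"]))
try:
    parse_class_header(["class", "{"])
except ParseError as exc:
    print("syntax error:", exc)
```

Committing early is what lets the BNF script report an error at the place where the input stops conforming, instead of silently failing the whole rule.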

12 Decorating the parse tree

Once our design file has been parsed (either procedure-driven or BNF-driven, we don't care), there is sometimes a little more work to accomplish on the parse tree. It may be verifying the consistency of the whole, such as checking the existence of each class referenced as an association or a parent. It may also be reorganizing the graph differently, so as to simplify the tasks of source code generation. We call it decorating the parse tree in the CodeWorker vocabulary.

The next script checks the existence of each class specifier type and keeps a reference to the node that describes this class specifier. Some nodes change their nature (myClass.parent becomes a reference to the parent node, for example), others are added (for object types, the new node myType.class keeps a reference to the class):

      // file "GettingStarted/TreeDecoration.cws":
      1 foreach myClass in project.listOfClasses {
      2     if myClass.parent {
      3         if !findElement(myClass.parent, project.listOfClasses)
      4             error("class '" + myClass.parent + "' doesn't exist while class '"
      5                   + myClass.name + "' intends to inherit from it");
      6         ref myClass.parent = project.listOfClasses[myClass.parent];
      7     }
      8     foreach myAttribute in myClass.listOfAttributes {
      9         local myType;
      10         ref myType = myAttribute.type;
      11         if myType.isObject {
      12             if !findElement(myType.name, project.listOfClasses)
      13                 error("class '" + myType.name + "' doesn't exist while attribute '"
      14                       + myClass.name + "::" + myAttribute.name + "' refers to it");
      15             ref myType.class = project.listOfClasses[myType.name];
      16         }
      17     }
      18     foreach myMethod in myClass.listOfMethods {
      19         if existVariable(myMethod.type) && myMethod.type.isObject {
      20             localref myType = myMethod.type;
      21             if !findElement(myType.name, project.listOfClasses)
      22                 error("class '" + myType.name + "' doesn't exist while method '"
      23                       + myClass.name + "::" + myMethod.name + "' refers to it");
      24             ref myType.class = project.listOfClasses[myType.name];
      25         }
      26         foreach myParameter in myMethod.listOfParameters {
      27             localref myType = myParameter.type;
      28             if myType.isObject {
      29                 if !findElement(myType.name, project.listOfClasses)
      30                     error("class '" + myType.name
      31                           + "' doesn't exist while method '"
      32                           + myClass.name + "::" + myMethod.name
      33                           + "' refers to it");
      34                 ref myType.class = project.listOfClasses[myType.name];
      35             }
      36         }
      37     }
      38 }

line 1: we iterate all classes,
line 2: if field parent is filled, we check that the parent exists and then turn the field into a reference to the parent class,
line 8: we iterate all attributes of each class,
line 11: only object attributes are of interest,
line 12: check whether the class exists or not into the array node that contains all classes: does the key myType.name exist as an array entry of node project.listOfClasses?
line 15: to optimize navigating into the parse tree later, we keep a reference to the class into new node myType.class,
line 18: we iterate all methods of each class,
line 26: we iterate all parameters of each method,

Now, we have a parsing script that loads well-formed Simple-Modeling designs, and a script that decorates the parse tree. It is time to write a leader script that will take charge of calling the tasks of parsing, tree decoration and source code generation:

CodeWorker command line to execute:
-I Scripts/Tutorial/GettingStarted -define DESIGN_FILE=SolarSystem0.sml -script LeaderScript0.cws

      // file "GettingStarted/LeaderScript0.cws":
      1 if !getProperty("DESIGN_FILE")
      2     error("'-define DESIGN_FILE=file' expected on the command line");
      3 traceLine("'Simple Modeling' design file to parse = \""
      4           + getProperty("DESIGN_FILE") + "\"");
      5 parseAsBNF("SimpleML-parsing.cwp",
      6            project, getProperty("DESIGN_FILE"));
      7 #include "TreeDecoration.cws"

line 1: we expect the design as a file that conforms to our Simple-Modeling Language ; the file name is given to the preprocessor definition DESIGN_FILE on the command line by typing -define DESIGN_FILE=SolarSystem0.sml,
line 5: the file is parsed thanks to our previous BNF script,
line 7: the source code for decorating tree is included here, and its content will be executed just after the parsing,
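The decoration step of this section, replacing a name by a reference after checking that the target exists, can be sketched in Python as follows. The function decorate and the dictionary layout are hypothetical, and the two sample classes are illustrative data, not the content of the design file.

```python
def decorate(classes):
    """Replace parent names by references to class descriptions.

    'classes' is a hypothetical dict mapping class names to dicts with an
    optional "parent" entry holding the parent's name.
    """
    for cls in classes.values():
        parent_name = cls.get("parent")
        if parent_name:
            if parent_name not in classes:
                raise KeyError(
                    f"class '{parent_name}' doesn't exist while class "
                    f"'{cls['name']}' intends to inherit from it"
                )
            cls["parent"] = classes[parent_name]   # now a reference, not a name


# Illustrative data only.
classes = {
    "Planet": {"name": "Planet"},
    "Earth": {"name": "Earth", "parent": "Planet"},
}
decorate(classes)
print(classes["Earth"]["parent"] is classes["Planet"])
```

After decoration, later steps can navigate from a class to its parent directly, without looking the name up again.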

13 Generating code

A script that is intended for source code generation is called a pattern script in the CodeWorker vocabulary. The output file is rewritten completely after the protected areas of user's source code have been preserved.

Such a script begins with a sequence of characters written exactly as they must appear in the output file, until it encounters the special character '@' or the JSP-like tag '<%'. Then it swaps into script mode, and everything is interpreted as script instructions, until the special character '@' or the JSP-like tag '%>' is encountered. The content of the script file is then again understood as a sequence of characters to write into the output file, up to the next special character. And it continues swapping from one mode to the other...

For convenience, the script mode may be restricted to a single expression (often the name of a variable) whose value is written into the output file.
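The alternation between text mode and script mode can be illustrated with a deliberately tiny Python sketch. It is an assumption-laden simplification: the function expand is hypothetical, it only handles the '@' marker, and it only supports the convenience form where the script part is a variable name.

```python
def expand(template, variables):
    """Expand a template where text between two '@' is a variable name.

    Hypothetical and much simplified: real pattern scripts accept full
    script instructions between the markers, not only variable names.
    """
    parts = template.split("@")
    if len(parts) % 2 == 0:
        raise ValueError("unbalanced '@' in template")
    output = []
    for index, part in enumerate(parts):
        if index % 2 == 0:
            output.append(part)                          # text mode: copied verbatim
        else:
            output.append(str(variables[part.strip()]))  # script mode: value written
    return "".join(output)


result = expand(
    "class @name@ : public @parent@ {};",
    {"name": "Earth", "parent": "Planet"},
)
print(result)
```

Pieces at even positions are emitted as they are, pieces at odd positions are evaluated, which is the swapping from one mode to the other described above.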

To do source code generation, we'll need some useful functions, such as converting a Simple-Modeling type to its C++ representation. These functions might be included into the leader script, so as to be shared by all pattern scripts.

We'll discover a new kind of function, called template functions, which bring a little generic programming to the language. Let's imagine that we need a function getType(myType : node), to be specialized for every language we could have to generate (C++ and JAVA in this chapter). You plan to generate an object library from the design you have written in the Simple Modeling Language. This object library will be delivered both in C++ and JAVA, and a technical documentation will come with each of these implementations. This technical documentation will give the signature of methods and the type of attributes in the language the developer chooses. So the C++ documentation will differ slightly from the JAVA one, only in the spelling of types. Normally, you would write the following lines to obtain the type depending on the language for which you are producing the documentation:


if doc_language == "C++" {
    sType = getCppType(myParameterType);
} else if doc_language == "JAVA" {
    sType = getJAVAType(myParameterType);
} else {
    error("unrecognized language '" + doc_language + "'");
}

Thanks to the template functions, you may replace the preceding lines with the following ones:


sType = getType<doc_language>(myParameterType);
...
function getType<"JAVA">(myType : node) {
    ... // implementation for returning a Java type
}

function getType<"C++">(myType : node) {
    ... // implementation for returning a C++ type
}

During the execution, the call to getType<T>(myType : node) resolves which instantiated function to dispatch to: either getType<"JAVA">(myType : node) or getType<"C++">(myType : node), depending on the value assigned to variable doc_language.

Trying to call an instantiated function that doesn't exist raises an error at runtime. However, one might provide a default implementation. For instance:


function getType<T>(myType : node) {
    ... // implementation for any unrecognized language
}
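
For readers more familiar with mainstream languages, the runtime dispatch described above can be approximated in Python with a lookup table keyed on the template argument. This is only a sketch of the idea, not CodeWorker code: the names template_function and get_type, the simplified type mappings, and the dictionary used as a type node are all assumptions made for the illustration.

```python
# Registry of instantiated functions, keyed on the template argument.
_instances = {}

def template_function(key):
    """Register the decorated function as the instance for `key`."""
    def register(func):
        _instances[key] = func
        return func
    return register

@template_function("JAVA")
def _get_type_java(my_type):
    return "String" if my_type["name"] == "string" else my_type["name"]

@template_function("C++")
def _get_type_cpp(my_type):
    return "std::string" if my_type["name"] == "string" else my_type["name"]

def _get_type_default(my_type):
    # Plays the role of getType<T>: used for any unrecognized language.
    return my_type["name"]

def get_type(language, my_type):
    """Dispatch on the runtime value of `language`, like getType<doc_language>."""
    return _instances.get(language, _get_type_default)(my_type)

print(get_type("C++", {"name": "string"}))
print(get_type("JAVA", {"name": "string"}))
print(get_type("Ada", {"name": "string"}))
```

Without the default function, the lookup for an unregistered key would have to fail, which mirrors the runtime error raised when no matching instantiated function exists.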

For those who know generic programming with C++ templates, here is a classic example of template functions:


function f<1>() { return 1; }
function f<N>() { return $N*f<$N - 1$>()$; }
local f10 = f<10>();
if $f10 != 3628800$ error("10! should be worth 3628800");
traceLine("10! = " + f10);

Output:

10! = 3628800

Below are all the utility functions we'll need for source code generation, including the template function getType<T>(myType : node) discussed above:

      // file "GettingStarted/SharedFunctions.cws":
      1 function normalizeIdentifier(sName) {
      2     if sName {
      3         if startString(sName, "_")
      4             return "_" + normalizeIdentifier(subString(sName, 1));
      5         set sName = toUpperString(charAt(sName, 0))
      6                     + subString(sName, 1);
      7         local iIndex = findFirstChar(sName, "_.");
      8         if !isNegative(iIndex) {
      9             local sNext = subString(sName, add(iIndex, 1));
      10             return leftString(sName, iIndex)
      11                     + normalizeIdentifier(sNext);
      12         }
      13     }
      14     return sName;
      15 }
      16
      17 function getType<"C++">(myType : node) {
      18     local sType;
      19     if myType.isObject set sType = myType.name + "*";
      20     else if myType.name == "boolean" set sType = "bool";
      21     else if myType.name == "string" set sType = "std::string";
      22     else set sType = myType.name;
      23     if myType.isArray set sType = "std::vector<" + sType + ">";
      24     return sType;
      25 }
      26
      27 function getParameterType<"C++">(myType : node, sMode) {
      28     local sType = getType<"C++">(myType);
      29     if endString(sMode, "out") set sType += "&";
      30     else if (sMode == "in") set sType = "const " + sType + "&";
      31     return sType;
      32 }
      33
      34 function getType<"JAVA">(myType : node) {
      35     local sType;
      36     if myType.name == "string" set sType = "String";
      37     else set sType = myType.name;
      38     if myType.isArray set sType = "java.util.ArrayList/*<" + sType + ">*/";
      39     return sType;
      40 }
      41
      42 function getParameterType<"JAVA">(myType : node, sMode) {
      43     return getType<"JAVA">(myType);
      44 }
      45
      46 function getVariableName(sName, myType : node) {
      47     local sPrefix;
      48     if myType.isArray set sPrefix = "t";
      49     if myType.isObject set sPrefix += "p";
      50     else {
      51         switch(myType.name) {
      52             case "int": set sPrefix += "i";break;
      53             case "double": set sPrefix += "d";break;
      54             case "boolean": set sPrefix += "b";break;
      55             case "string": set sPrefix += "s";break;
      56         }
      57     }
      58     return sPrefix + normalizeIdentifier(sName);
      59 }
      60
      61 function getMethodID(myMethod : node) {
      62     local sMethodID = myMethod.name;
      63     foreach i in myMethod.listOfParameters {
      64         set sMethodID += "." + i.type.name;
      65         if i.type.isArray set sMethodID += "[]";
      66     }
      67     return sMethodID;
      68 }

line 1: this function normalizes identifiers: it capitalizes the first letter, removes each '_' or dot, and capitalizes the letter that follows it. For example, average_speed becomes AverageSpeed. This function is applied to attribute names, for instance,
line 3: if the identifier starts with an underscore, the underscore is preserved,
line 7: iIndex receives the position of the first underscore or dot found in the name,
line 17: this function returns the C++ type of a Simple-Modeling type node: an object becomes a pointer to its class, boolean becomes bool, string becomes std::string, any other type keeps its name, and an array wraps the result in std::vector<...>,
line 27: this function returns the C++ type of a Simple-Modeling type node as expected when it is passed to a method as a parameter type (sMode is worth "in", "out", "inout" or the empty string),
line 34: this function returns the JAVA type of a Simple-Modeling type node: string becomes String, any other type keeps its name, and an array becomes java.util.ArrayList with the element type kept in a comment,
line 42: this function returns the JAVA type of a Simple-Modeling type node as expected when it is passed to a method as a parameter type (sMode is worth "in", "out", "inout" or the empty string, but we don't care about "inout" or "out" for the moment),
line 46: this function returns a variable name whose prefix depends on its type,
line 51: the switch statement selects one of several sections of code, depending on the value of the expression myType.name, enclosed in parentheses. If no case label matches the value and no default label is present, CodeWorker throws an error,
line 61: this function returns a unique method ID, composed of the name of the method and the types of its parameters, so that the protected areas of one method cannot be confused with those of another,
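
To make the behavior of these naming functions easier to check, here is a Python transcription of normalizeIdentifier, getVariableName and getMethodID. It is a sketch written for illustration, not part of the CodeWorker distribution: parse-tree nodes are represented by plain dictionaries, which is an assumption, and the Python function names are ours.

```python
def normalize_identifier(name):
    """average_speed -> AverageSpeed; a leading underscore is preserved."""
    if not name:
        return name
    if name.startswith("_"):
        return "_" + normalize_identifier(name[1:])
    name = name[0].upper() + name[1:]
    # Position of the first '_' or '.', or -1 when there is none.
    index = next((i for i, c in enumerate(name) if c in "_."), -1)
    if index >= 0:
        return name[:index] + normalize_identifier(name[index + 1:])
    return name

_PREFIXES = {"int": "i", "double": "d", "boolean": "b", "string": "s"}

def get_variable_name(name, my_type):
    prefix = "t" if my_type.get("isArray") else ""
    if my_type.get("isObject"):
        prefix += "p"
    else:
        # Like the CodeWorker switch without a default label, an unknown
        # type name is an error (a KeyError here).
        prefix += _PREFIXES[my_type["name"]]
    return prefix + normalize_identifier(name)

def get_method_id(method):
    method_id = method["name"]
    for parameter in method["listOfParameters"]:
        method_id += "." + parameter["type"]["name"]
        if parameter["type"].get("isArray"):
            method_id += "[]"
    return method_id

planets = {"name": "Planet", "isObject": True, "isArray": True}
print(get_variable_name("planets", planets))
```

The prefixes explain the names that appear in the generated files further down, such as _tpPlanets for an array of objects and _dDiameter for a double.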
The next two examples implement the same functionality in two different languages (C++ and JAVA). They describe the skeleton of our objects.

13.1 C++ classes

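A pattern script is launched with the procedure generate, which expects three parameters: the file name of the pattern script, the node that the script will see as the variable this, and the path of the output file.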

The next pattern script describes the pattern of a C++ header file:

      // file "GettingStarted/CppObjectHeader.cwt":
      1 #ifndef _@this.name@_h_
      2 #define _@this.name@_h_
      3
      4 @
      5 newFloatingLocation("include files");
      6 @
      7 // this line separates the two insertion points, so as to distinguish them!
      8 @
      9 newFloatingLocation("class declarations");
      10
      11 function populateHeaderDeclarations(myType : node) {
      12     if myType.isObject insertTextOnce(getFloatingLocation("class declarations"), "class " + myType.name + ";" + endl());
      13     if myType.isArray insertTextOnce(getFloatingLocation("include files"), "#include <vector>" + endl());
      14     if myType.name insertTextOnce(getFloatingLocation("include files"), "#include <string>" + endl());
      15 }
      16
      17 @
      18 class @this.name@ @
      19 if existVariable(this.parent) {
      20     insertTextOnce(getFloatingLocation("include files"), "#include \"" + this.parent.name +".h\"" + endl());
      21     @: public @this.parent.name@ @
      22 }
      23 @{
      24     private:
      25 @
      26 foreach i in this.listOfAttributes {
      27     populateHeaderDeclarations(i.type);
      28     @ @getType<"C++">(i.type)@ _@getVariableName(i.name, i.type)@;
      29 @
      30 }
      31 @
      32     public:
      33         @this.name@();
      34         ~@this.name@();
      35
      36         // accessors:
      37 @
      38 foreach i in this.listOfAttributes {
      39     local sVariableName = getVariableName(i.name, i.type);
      40     %> inline <%getType<"C++">(i.type)%> get<%normalizeIdentifier(i.name)%>() const { return _<%sVariableName%>; }
      41         inline void set<%normalizeIdentifier(i.name)@(<%getType<"C++">(i.type)%> <%sVariableName@) { _<%sVariableName%> = <%sVariableName%>; }
      42 @
      43 }
      44 @
      45         // methods:
      46 @
      47 foreach i in this.listOfMethods {
      48     @ virtual @
      49     if existVariable(i.type) {
      50         populateHeaderDeclarations(i.type);
      51         @@getType<"C++">(i.type)@@
      52     } else {
      53         @void@
      54     }
      55     @ @i.name@(@
      56     foreach j in i.listOfParameters {
      57         if !first(j) {
      58             @, @
      59         }
      60         populateHeaderDeclarations(j.type);
      61         @@getParameterType<"C++">(j.type, j.mode)@ @getVariableName(j.name, j.type)@@
      62     }
      63     @);
      64 @
      65 }
      66 @
      67     private:
      68         @this.name@(const @this.name@&);
      69         @this.name@& operator =(const @this.name@&);
      70 };
      71
      72 #endif

line 1: the value of the attribute this.name is written to the output file, where this points to the node that describes the current class. Note that the keyword this is optional, and that the node it refers to is assigned by the caller of the procedure generate that runs this script,
line 5: an anchor is placed here for all the include files that turn out to be required while iterating over attributes and methods. Example: if an attribute is an array, the STL header vector must be included at this position of the file: #include <vector>. This insertion point is called "include files",
line 6: so that the two floating locations "include files" and "class declarations" (described just below) don't point to the same file position, a line of text is written between them,
line 9: an anchor is placed here for the declarations of all classes that turn out to be referenced while iterating over attributes and methods. Example: if an attribute is an object Planet, class Planet; must be written at this position of the file. This insertion point is called "class declarations",
line 11: this function is called on every type encountered while iterating over attributes and methods. Its role is to populate the "include files" and "class declarations" areas,
line 12: an object type must be declared at the beginning of the header, otherwise the compiler will not recognize it: the class is declared only once, at the insertion point called "class declarations". The function insertTextOnce ensures that a class that has already been inserted is not inserted a second time,
line 13: this type is an array, so the header that declares std::vector must be included at the insertion point called "include files",
line 14: this type is a string, so the header that declares std::string must be included at the insertion point called "include files",
line 19: if the class inherits from a parent class, this relationship must be written,
line 20: the header that declares the parent class must be included,
line 26: declaration of all attributes,
line 27: does the type of the attribute need some forward declarations?
line 38: accessors to each attribute,
line 40: two notations are available for swapping between writing a sequence of characters and interpreting script; so far we have used the symbol '@', and here we illustrate the tags '<%' and '%>',
line 41: the two notations can be mixed, but the result is harder to read, so it is not recommended,
line 47: declaration of all methods,
line 48: each method may be overridden by subclasses, so it is declared virtual,
line 49: the return type of the method is translated to C++,
line 50: does the return type of the method need some forward declarations?
line 51: the expression getType<"C++">(i.type) to evaluate is enclosed between double '@'. The first '@' swaps to the sequence-of-characters mode, although there are no characters to write. The second one swaps back to script mode, which here consists only of evaluating the expression. The two final '@' play the same roles,
line 56: the parameters of the method are iterated to be written in C++,
line 57: if the iterator j doesn't point to the first parameter, a comma separates it from the preceding one,
line 60: does the type of the parameter need some forward declarations?
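
The mechanism of floating locations and insertTextOnce used by this header pattern can be pictured with the following Python sketch. It is an illustration under our own assumptions (the class Output and its method names are hypothetical and do not describe how CodeWorker is implemented): the output is kept as a list of chunks, a floating location reserves an empty chunk that can still be filled after later text has been written, and each location remembers what it has already received.

```python
class Output:
    """Text output with named insertion points ("floating locations")."""

    def __init__(self):
        self._chunks = []      # pieces of text, in output order
        self._locations = {}   # location name -> index of its chunk
        self._inserted = {}    # location name -> texts already inserted

    def write(self, text):
        self._chunks.append(text)

    def new_floating_location(self, name):
        self._locations[name] = len(self._chunks)
        self._inserted[name] = set()
        self._chunks.append("")  # empty for now, filled in later

    def insert_text_once(self, name, text):
        if text not in self._inserted[name]:
            self._inserted[name].add(text)
            self._chunks[self._locations[name]] += text

    def text(self):
        return "".join(self._chunks)

out = Output()
out.write("#ifndef _SolarSystem_h_\n#define _SolarSystem_h_\n\n")
out.new_floating_location("include files")
out.write("// separator between the two insertion points\n")
out.new_floating_location("class declarations")
out.write("\nclass SolarSystem {\n")
# Two attributes of type Planet[]: each declaration is inserted only once.
for _ in range(2):
    out.insert_text_once("include files", "#include <vector>\n")
    out.insert_text_once("class declarations", "class Planet;\n")
out.write("};\n")
print(out.text())
```

Although the insertions are requested while the class body is being written, they land above it, which is what allows the pattern script to discover required headers while iterating over attributes and methods.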
Let's continue with the pattern that describes the skeleton of a C++ body file:

      // file "GettingStarted/CppObjectBody.cwt":
      1 #ifdef WIN32
      2 #pragma warning(disable : 4786)
      3 #endif
      4
      5 @
      6 setProtectedArea("include files");
      7 @
      8 #include "@this.name@.h"
      9
      10 @this.name@::@this.name@()@
      11 local bAtLeastOne = false;
      12 foreach i in this.listOfAttributes {
      13     if !i.type.isArray && (i.type.name != "string") {
      14         if bAtLeastOne {
      15             @, @
      16         } else {
      17             @ : @
      18             set bAtLeastOne = true;
      19         }
      20         @_@getVariableName(i.name, i.type)@(@
      21         if i.type.isObject {
      22             @0L@
      23         } else {
      24             switch(i.type.name) {
      25                 case "int":
      26                     @0@
      27                     break;
      28                 case "double":
      29                     @0.0@
      30                     break;
      31                 case "boolean":
      32                     @false@
      33                     break;
      34             }
      35         }
      36         @)@
      37     }
      38 }
      39 @ {
      40 }
      41
      42 @this.name@::~@this.name@() {
      43 @
      44 foreach i in this.listOfAttributes {
      45     if i.type.isAggregation && i.type.isObject {
      46         local sAttributeName = "_" + getVariableName(i.name, i.type);
      47         local sIndex = "iterate" + normalizeIdentifier(i.name);
      48         if i.type.isArray {
      49             @ for (std::vector<@i.type.name@*>::const_iterator @sIndex@ = @sAttributeName@.begin(); @sIndex@ != @sAttributeName@.end(); ++@sIndex@) {
      50         delete *@sIndex@;
      51     }
      52 @
      53         } else {
      54             @ delete @sAttributeName@;
      55 @
      56         }
      57     }
      58 }
      59 @}
      60
      61 @
      62 foreach i in this.listOfMethods {
      63     if existVariable(i.type) {
      64         @@getType<"C++">(i.type)@@
      65     } else {
      66         @void@
      67     }
      68     @ @this.name@::@i.name@(@
      69     foreach j in i.listOfParameters {
      70         if !first(j) {
      71             @, @
      72         }
      73         @@getParameterType<"C++">(j.type, j.mode)@ @getVariableName(j.name, j.type)@@
      74     }
      75     @) {
      76 @
      77         setProtectedArea(getMethodID(i));
      78 @}
      79 @
      80 }

line 1: a Visual C++-specific pragma is added to suppress spurious warnings about the instantiation of the template class std::vector<T> in DEBUG mode,
line 6: the developer will add here all the include files needed for the implementation of the methods,
line 8: including the header of this body file is compulsory,
line 11: this part concerns the initialization of attributes. Some attributes, such as STL strings and vectors, don't need to be initialized explicitly. This justifies the declaration of the variable bAtLeastOne, which is worth false as long as no attribute has been initialized yet. We'll see why below,
line 13: arrays and strings are skipped,
line 15: if it isn't the first attribute to be initialized, a comma separates it from the preceding one,
line 17: if it is the first attribute to be initialized, a colon is expected to announce the beginning of the initializations,
line 18: now, at least one attribute has been initialized,
line 21: the attribute is initialized with the default value corresponding to its type,
line 44: aggregated objects must be deleted before this instance is destroyed,
line 49: all elements of an aggregated array must be deleted,
line 54: the aggregated object is deleted,
line 62: implementation of all methods,
line 63: the return type of the method is translated to C++,
line 69: the parameters of the method are iterated to be written in C++,
line 70: if the iterator j doesn't point to the first parameter, a comma separates it from the preceding one,
line 77: a protected area is inserted, whose key is the method ID,
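
The role of a protected area during regeneration can be sketched in Python as follows. This is an illustration, not the CodeWorker implementation: the helper names read_protected_areas and protected_area are hypothetical, and we assume the C++ comment delimiter // and the ##protect##"key" tag pairs that appear in the generated files shown later in this chapter.

```python
import re

# A protected area is the text between two identical ##protect## tags.
PROTECT = re.compile(
    r'//##protect##"(?P<key>[^"]*)"(?P<body>.*?)//##protect##"(?P=key)"',
    re.DOTALL,
)

def read_protected_areas(old_text):
    """Collect the hand-typed code of every protected area of an existing file."""
    return {m.group("key"): m.group("body") for m in PROTECT.finditer(old_text)}

def protected_area(key, saved):
    """Emit a protected area, restoring its previous content if there was any."""
    body = saved.get(key, "\n")
    return f'//##protect##"{key}"{body}//##protect##"{key}"\n'

old_file = (
    "double Planet::getDistanceToSun(int iDay, int iMonth, int iYear) {\n"
    '//##protect##"getDistanceToSun.int.int.int"\n'
    "    return 149.6e6;\n"
    '//##protect##"getDistanceToSun.int.int.int"\n'
    "}\n"
)
saved = read_protected_areas(old_file)
print(protected_area("getDistanceToSun.int.int.int", saved))
```

Because the key is the method ID, hand-typed code follows its method even if the order of methods changes in the design, while a method seen for the first time receives an empty area.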
The leader script has to be extended to request the generation of the C++ files:

CodeWorker command line to execute:
-I Scripts/Tutorial -path . -define DESIGN_FILE=GettingStarted/SolarSystem0.sml -script GettingStarted/LeaderScript1.cws

      // file "GettingStarted/LeaderScript1.cws":
      1 if !getProperty("DESIGN_FILE")
      2     error("'-define DESIGN_FILE=file' expected on the command line");
      3 traceLine("'Simple Modeling' design file to parse = \""
      4           + getProperty("DESIGN_FILE") + "\"");
      5 parseAsBNF("GettingStarted/SimpleML-parsing.cwp",
      6            project, getProperty("DESIGN_FILE"));
      7 #include "TreeDecoration.cws"
      8
      9 #include "SharedFunctions.cws"
      10 foreach myClass in project.listOfClasses {
      11     traceLine("generating class '" + myClass.name + "' ...");
      12     generate("GettingStarted/CppObjectHeader.cwt", myClass,
      13              getWorkingPath() + "Scripts/Tutorial/GettingStarted/Cpp/"
      14              + myClass.name + ".h");
      15     generate("GettingStarted/CppObjectBody.cwt", myClass,
      16              getWorkingPath() + "Scripts/Tutorial/GettingStarted/Cpp/"
      17              + myClass.name + ".cpp");
      18 }

line 9: all the utility functions for source code generation are loaded here,
line 10: all classes are iterated, and their C++ header and body are generated,
line 12: the instruction generate is applied to a pattern script, and its second argument expects a node that will be seen as the variable 'this' in the pattern script,
line 13: getWorkingPath() returns the output path passed on the command line via the option '-path',

Output:

'Simple Modeling' design file to parse = "GettingStarted/SolarSystem0.sml"
file parsed successfully
generating class 'Planet' ...
generating class 'Earth' ...
generating class 'SolarSystem' ...

Let's have a look at some of the generated files:

      // file "GettingStarted/Cpp/SolarSystem.h":
      #ifndef _SolarSystem_h_
      #define _SolarSystem_h_
     
      #include <vector>
      #include <string>
     
      // this line separates the two insertion points, so as to distinguish them!
      class Planet;
     
      class SolarSystem {
          private:
              std::vector<Planet*> _tpPlanets;
     
          public:
              SolarSystem();
              ~SolarSystem();
     
              // accessors:
              inline std::vector<Planet*> getPlanets() const { return _tpPlanets; }
              inline void setPlanets(std::vector<Planet*> tpPlanets) { _tpPlanets = tpPlanets; }
     
              // methods:
     
          private:
              SolarSystem(const SolarSystem&);
              SolarSystem& operator =(const SolarSystem&);
      };
     
      #endif

      // file "GettingStarted/Cpp/Planet.cpp":
      1 #ifdef WIN32
      2 #pragma warning(disable : 4786)
      3 #endif
      4
      5 //##protect##"include files"
      6 //##protect##"include files"
      7
      8 #include "Planet.h"
      9
      10 Planet::Planet() : _dDiameter(0.0) {
      11 }
      12
      13 Planet::~Planet() {
      14 }
      15
      16 double Planet::getDistanceToSun(int iDay, int iMonth, int iYear) {
      17 //##protect##"getDistanceToSun.int.int.int"
      18 //##protect##"getDistanceToSun.int.int.int"
      19 }

line 1: a Visual C++-specific pragma is added to suppress spurious warnings about the instantiation of the template class std::vector<T> in DEBUG mode.

13.2 JAVA classes

Some modelers don't clearly separate the design from its implementation but, in theory, no language-dependent data should be included in the design. The modeling language should instead be improved to take into account the finer modeling aspects that determine the mapping (of parameter types, for example) to the implementation language. The logic of a source code generation process is to factor as much knowledge as possible into the design level. We'll come back to this point later.

Our design is totally independent of the implementation: a string isn't explicitly a const std::string& or a std::string in C++; instead, the pattern script decides from the context which of the two C++ mappings is more appropriate.
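
The context-dependent choice between the two C++ mappings can be illustrated with a Python transcription of getType<"C++"> and getParameterType<"C++"> from "SharedFunctions.cws". It is a sketch for illustration only: type nodes are represented by plain dictionaries, which is an assumption, and the Python function names are ours.

```python
def get_cpp_type(my_type):
    """C++ type of a design type, as used for attributes and return values."""
    name = my_type["name"]
    if my_type.get("isObject"):
        cpp = name + "*"
    elif name == "boolean":
        cpp = "bool"
    elif name == "string":
        cpp = "std::string"
    else:
        cpp = name
    if my_type.get("isArray"):
        cpp = "std::vector<" + cpp + ">"
    return cpp

def get_cpp_parameter_type(my_type, mode):
    """C++ type of a design type when it is used as a method parameter."""
    cpp = get_cpp_type(my_type)
    if mode.endswith("out"):   # "out" or "inout": passed by reference
        return cpp + "&"
    if mode == "in":           # read-only: passed by const reference
        return "const " + cpp + "&"
    return cpp                 # empty mode: plain type

string_type = {"name": "string"}
print(get_cpp_type(string_type))
print(get_cpp_parameter_type(string_type, "in"))
```

The design only says "string"; whether it is spelled std::string or const std::string& is decided here, from the place where the type is used and from the parameter mode.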

Because the design is independent of the implementation language, we can now implement the same functionality in JAVA as we did in C++:

      // file "GettingStarted/JAVAObject.cwt":
      1 package solarsystem;
      2
      3 public class @this.name@ @
      4 if existVariable(this.parent) {
      5     @extends @this.parent.name@ @
      6 }
      7 @{
      8 @
      9 foreach i in this.listOfAttributes {
      10     @ private @getType<"JAVA">(i.type)@ _@getVariableName(i.name, i.type)@;
      11 @
      12 }
      13 @
      14     public @this.name@() {}
      15
      16     // accessors:
      17 @
      18 foreach i in this.listOfAttributes {
      19     local sVariableName = getVariableName(i.name, i.type);
      20     @ public @getType<"JAVA">(i.type)@ get@normalizeIdentifier(i.name)@() { return _@sVariableName@; }
      21     public void set@normalizeIdentifier(i.name)@(@getType<"JAVA">(i.type)@ @sVariableName@) { _@sVariableName@ = @sVariableName@; }
      22 @
      23 }
      24 @
      25         // methods:
      26 @
      27 foreach i in this.listOfMethods {
      28     @ public @
      29     if existVariable(i.type) {
      30         @@getType<"JAVA">(i.type)@@
      31     } else {
      32         @void@
      33     }
      34     @ @i.name@(@
      35     foreach j in i.listOfParameters {
      36         if !first(j) {
      37             @, @
      38         }
      39         @@getParameterType<"JAVA">(j.type, j.mode)@ @getVariableName(j.name, j.type)@@
      40     }
      41     @) {
      42 @
      43     setProtectedArea(getMethodID(i));
      44 @ }
      45
      46 @
      47 }
      48 @}

line 4: if the class inherits from a parent class, this relationship must be written,
line 9: declaration of all attributes,
line 18: accessors to each attribute,
line 27: declaration of all methods,
The leader script has to be extended to request the generation of the JAVA files:

CodeWorker command line to execute:
-I Scripts/Tutorial -path . -define DESIGN_FILE=GettingStarted/SolarSystem0.sml -script GettingStarted/LeaderScript2.cws

      // file "GettingStarted/LeaderScript2.cws":
      1 if !getProperty("DESIGN_FILE")
      2     error("'-define DESIGN_FILE=file' expected on the command line");
      3 traceLine("'Simple Modeling' design file to parse = \""
      4           + getProperty("DESIGN_FILE") + "\"");
      5 parseAsBNF("GettingStarted/SimpleML-parsing.cwp",
      6            project, getProperty("DESIGN_FILE"));
      7 #include "TreeDecoration.cws"
      8
      9 #include "SharedFunctions.cws"
      10 foreach myClass in project.listOfClasses {
      11     traceLine("generating class '" + myClass.name + "' ...");
      12     generate("GettingStarted/CppObjectHeader.cwt", myClass, getWorkingPath() + "Scripts/Tutorial/GettingStarted/Cpp/" + myClass.name + ".h");
      13     generate("GettingStarted/CppObjectBody.cwt", myClass, getWorkingPath() + "Scripts/Tutorial/GettingStarted/Cpp/" + myClass.name + ".cpp");
      14     generate("GettingStarted/JAVAObject.cwt", myClass, getWorkingPath() + "Scripts/Tutorial/GettingStarted/JAVA/solarsystem/" + myClass.name + ".java");
      15 }

line 14: generates the JAVA implementation of the current design class,

Output:

'Simple Modeling' design file to parse = "GettingStarted/SolarSystem0.sml"
file parsed successfully
generating class 'Planet' ...
generating class 'Earth' ...
generating class 'SolarSystem' ...

Let's have a look at some of the generated files:

      // file "GettingStarted/JAVA/solarsystem/SolarSystem.java":
      package solarsystem;
     
      public class SolarSystem {
          private java.util.ArrayList/*<Planet>*/ _tpPlanets;
     
          public SolarSystem() {}
     
          // accessors:
          public java.util.ArrayList/*<Planet>*/ getPlanets() { return _tpPlanets; }
          public void setPlanets(java.util.ArrayList/*<Planet>*/ tpPlanets) { _tpPlanets = tpPlanets; }
     
              // methods:
      }

      // file "GettingStarted/JAVA/solarsystem/Planet.java":
      package solarsystem;
     
      public class Planet {
          private double _dDiameter;
     
          public Planet() {}
     
          // accessors:
          public double getDiameter() { return _dDiameter; }
          public void setDiameter(double dDiameter) { _dDiameter = dDiameter; }
     
              // methods:
          public double getDistanceToSun(int iDay, int iMonth, int iYear) {
      //##protect##"getDistanceToSun.int.int.int"
      //##protect##"getDistanceToSun.int.int.int"
          }
     
      }

14 Expanding a file

Expanding a file consists of generating code at predetermined points of the file. These points are called markups and are written ##markup##"name-of-the-markup", surrounded by comment delimiters.

For example, a valid markup embedded in a C++ file could be:
//##markup##"factory"
and a valid markup embedded in an HTML file could be:
<!--##markup##"classes"-->

Some data may accompany the markup. The block of data is put between tags ##data##:
//##markup##"switch(sText)"
//##data##
//Customer
//Videostore
//##data##
You obtain the data attached to the current markup key by calling the function getMarkupValue() (see the description of getMarkupValue()). The example above extends C++/Java with a switch statement that works on a string expression.

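A pattern script intended to expand code is launched with the procedure expand, which expects three parameters: the file name of the pattern script, the node that the script will see as the variable this, and the path of the file to expand.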

Each time CodeWorker encounters a markup, it calls the pattern script, which decides how to populate it. The code generated by the pattern script for this markup is surrounded by the tags ##begin##"name-of-the-markup" and ##end##"name-of-the-markup", which the interpreter adds automatically. If protected areas were put into the generated code, they are preserved the next time the file is expanded.

Note that CodeWorker doesn't change what is written outside the markups and their begin/end delimiters.
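
The expansion mechanism can be sketched in Python as follows. This is a simplified illustration under our own assumptions, not the CodeWorker implementation: the function expand below and its parameters are hypothetical, the pattern script is replaced by an ordinary callback that receives the markup key, and the preservation of protected areas and of ##data## blocks is left out.

```python
import re

def expand(text, generate, begin="<!--", end="-->"):
    """Regenerate the section attached to each markup found in `text`.

    `generate(key)` stands in for the pattern script and returns the text
    to place between the ##begin## and ##end## tags of markup `key`.
    """
    b, e = re.escape(begin), re.escape(end)
    # A markup, optionally followed by a previously generated section.
    pattern = re.compile(
        b + r'##markup##"(?P<key>[^"]*)"' + e
        + r'(?:' + b + r'##begin##"(?P=key)"' + e
        + r'.*?' + b + r'##end##"(?P=key)"' + e + r')?',
        re.DOTALL,
    )

    def replace(match):
        key = match.group("key")
        return (f'{begin}##markup##"{key}"{end}'
                f'{begin}##begin##"{key}"{end}'
                f'{generate(key)}'
                f'{begin}##end##"{key}"{end}')

    return pattern.sub(replace, text)

page = '<BODY>\n<!--##markup##"classes presentation"-->\n</BODY>\n'
expanded = expand(page, lambda key: "\n<H2>Planet</H2>\n")
print(expanded)
```

Running expand again on its own result replaces the previously generated section instead of appending a second copy, and everything outside the markup and its begin/end delimiters is left untouched.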

Starting from a (very simple) HTML canvas, we'll generate HTML documentation for our SolarSystem project. Here is the canvas that we would like to reuse for all our projects:

      // file "GettingStarted/defaultDocumentation.html":
      <HTML>
          <HEAD>
              <TITLE>some title...</TITLE>
          </HEAD>
          <BODY>
              <H1>some title...</H1>
              some global documentation...
      <!--##markup##"classes presentation"-->
          </BODY>
      </HTML>

We'll copy it to "GettingStarted/SolarSystem0.html" to populate it with the characteristics of our current project. The pattern script that will be launched to expand "GettingStarted/SolarSystem0.html" is:

      // file "GettingStarted/HTMLDocumentation.cwt":
      1 @
      2 if getMarkupKey() == "classes presentation" {
      3     foreach i in project.listOfClasses {
      4         @
      5         <H2><A href="#@i.name@">@i.name@</A></H2>
      6 @
      7         setProtectedArea(i.name + ":presentation");
      8         if !isEmpty(i.listOfAttributes) {
      9             @
      10         <TABLE border="1" cellpadding="3" cellspacing="0" width="100%">
      11             <TR BGCOLOR="#CCCCFF">
      12                 <TD><B>Type</B></TD>
      13                 <TD><B>Attribute name</B></TD>
      14                 <TD><B>Description</B></TD>
      15             </TR>
      16 @
      17             foreach j in i.listOfAttributes {
      18                 @ <TR>
      19                 <TD>@composeHTMLLikeString(getType<this.language>(j.type))@</TD>
      20                 <TD>@j.name@</TD>
      21                 <TD>
      22 @
      23                 setProtectedArea(i.name + "::" + j.name + ":description");
      24                 @
      25                 </TD>
      26             </TR>
      27 @
      28             }
      29             @ </TABLE>
      30 @
      31         }
      32         if !isEmpty(i.listOfMethods) {
      33             @
      34         <UL>
      35 @
      36             foreach j in i.listOfMethods {
      37                 @ <LI>@
      38                 if existVariable(j.type) {
      39                     @function @composeHTMLLikeString(getType<this.language>(j.type))@ @
      40                 } else {
      41                     @procedure@
      42                 }
      43                 @<B>@j.name@</B>(@
      44                 foreach k in j.listOfParameters {
      45                     if !first(k) {
      46                         @, @
      47                     }
      48                     @@composeHTMLLikeString(getParameterType<this.language>(k.type, k.mode))@ <I>@getVariableName(k.name, k.type)@</I>@
      49                 }
      50                 @)
      51                 <BR>
      52 @
      53                 setProtectedArea(i.name + "::" + getMethodID(j) + ":description");
      54                 @
      55             </LI>
      56 @
      57             }
      58             @ </UL>
      59 @
      60         }
      61     }
      62 }

line 2: the predefined function getMarkupKey() returns the name of the markup to expand,
line 3: the markup is worth "classes presentation", so we'll describe all classes,
line 7: a protected area is embedded here; it has to be populated by hand in the expanded file to describe the class,
line 9: attributes are presented in a table,
line 18: the language in which types have to be expressed is given by this.language, and is worth "C++" or "JAVA"; don't forget to convert the type to HTML syntax, because '<' and '>' must be converted to '&lt;' and '&gt;' respectively, for instance. Use the predefined function composeHTMLLikeString() for this conversion,
line 23: a protected area is embedded here; it has to be populated by hand in the expanded file to describe the attribute,
line 37: methods are presented in unordered lists,
line 53: a protected area is embedded here; it has to be populated by hand in the expanded file to describe the method,
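
To see why the conversion to HTML syntax matters for types such as std::vector<Planet*>, here is a Python sketch that assumes the conversion amounts to standard HTML escaping. The function compose_html_like_string is our own stand-in; the exact set of characters that the predefined composeHTMLLikeString() converts is not shown in this chapter.

```python
import html

def compose_html_like_string(text):
    # html.escape replaces &, < and > (and quotes) with character references.
    return html.escape(text)

print(compose_html_like_string("std::vector<Planet*>"))
print(compose_html_like_string("java.util.ArrayList/*<Planet>*/"))
```

Without this step, a browser would read <Planet*> as an unknown tag and the type would be displayed incompletely.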
Now, we have to change the leader script, so as to take into account the generation of the documentation:

CodeWorker command line to execute:
-I Scripts/Tutorial -path . -define DESIGN_FILE=GettingStarted/SolarSystem0.sml -script GettingStarted/LeaderScript3.cws

      // file "GettingStarted/LeaderScript3.cws":
      1 if !getProperty("DESIGN_FILE")
      2     error("'-define DESIGN_FILE=file' expected on the command line");
      3 traceLine("'Simple Modeling' design file to parse = \""
      4           + getProperty("DESIGN_FILE") + "\"");
      5 parseAsBNF("GettingStarted/SimpleML-parsing.cwp",
      6            project, getProperty("DESIGN_FILE"));
      7 #include "TreeDecoration.cws"
      8
      9 #include "SharedFunctions.cws"
      10 foreach myClass in project.listOfClasses {
      11     traceLine("generating class '" + myClass.name + "' ...");
      12     generate("GettingStarted/CppObjectHeader.cwt", myClass, getWorkingPath() + "Scripts/Tutorial/GettingStarted/Cpp/" + myClass.name + ".h");
      13     generate("GettingStarted/CppObjectBody.cwt", myClass, getWorkingPath() + "Scripts/Tutorial/GettingStarted/Cpp/" + myClass.name + ".cpp");
      14     generate("GettingStarted/JAVAObject.cwt", myClass, getWorkingPath() + "Scripts/Tutorial/GettingStarted/JAVA/solarsystem/" + myClass.name + ".java");
      15 }
      16 if !existFile("Scripts/Tutorial/GettingStarted/SolarSystem0.html") {
      17     copyFile("Scripts/Tutorial/GettingStarted/defaultDocumentation.html", "Scripts/Tutorial/GettingStarted/SolarSystem0.html");
      18 }
      19
      20 local myDocumentationContext;
      21 insert myDocumentationContext.language = "C++";
      22 traceLine("generating the HTML documentation...");
      23 setCommentBegin("<!--");
      24 setCommentEnd("-->");
      25 expand("GettingStarted/HTMLDocumentation.cwt",
      26         myDocumentationContext, getWorkingPath()
      27         + "Scripts/Tutorial/GettingStarted/SolarSystem0.html");

line 16: the default empty HTML documentation is copied to "SolarSystem0.html" if that file doesn't exist yet,
line 20: the variable myDocumentationContext will be passed to the procedure expand(),
line 21: an attribute language is added to the variable myDocumentationContext; it specifies whether types must be expressed in C++ or in JAVA in the HTML documentation,
line 23: don't forget to specify the comment delimiters that are expected in an HTML file,
line 25: the procedure expand() automatically populates "SolarSystem0.html" with the characteristics of the project,

Output:

'Simple Modeling' design file to parse = "GettingStarted/SolarSystem0.sml"
file parsed successfully
generating class 'Planet' ...
generating class 'Earth' ...
generating class 'SolarSystem' ...
generating the HTML documentation...

After executing this script, we obtain the following HTML documentation, where protected areas have to be populated to describe classes, attributes and methods:

      // file "GettingStarted/SolarSystem0.html":
      <HTML>
          <HEAD>
              <TITLE>some title...</TITLE>
          </HEAD>
          <BODY>
              <H1>some title...</H1>
              some global documentation...
      <!--##markup##"classes presentation"--><!--##begin##"classes presentation"-->
              <H2><A href="#Planet">Planet</A></H2>
      <!--##protect##"Planet:presentation"--><!--##protect##"Planet:presentation"-->
              <TABLE border="1" cellpadding="3" cellspacing="0" width="100%">
                  <TR BGCOLOR="#CCCCFF">
                      <TD><B>Type</B></TD>
                      <TD><B>Attribute name</B></TD>
                      <TD><B>Description</B></TD>
                  </TR>
                  <TR>
                      <TD>double</TD>
                      <TD>diameter</TD>
                      <TD>
      <!--##protect##"Planet::diameter:description"--><!--##protect##"Planet::diameter:description"-->
                      </TD>
                  </TR>
              </TABLE>
     
              <UL>
                  <LI>function double <B>getDistanceToSun</B>(int <I>iDay</I>, int <I>iMonth</I>, int <I>iYear</I>)
                      <BR>
      <!--##protect##"Planet::getDistanceToSun.int.int.int:description"--><!--##protect##"Planet::getDistanceToSun.int.int.int:description"-->
                  </LI>
              </UL>
     
              <H2><A href="#Earth">Earth</A></H2>
      <!--##protect##"Earth:presentation"--><!--##protect##"Earth:presentation"-->
              <TABLE border="1" cellpadding="3" cellspacing="0" width="100%">
                  <TR BGCOLOR="#CCCCFF">
                      <TD><B>Type</B></TD>
                      <TD><B>Attribute name</B></TD>
                      <TD><B>Description</B></TD>
                  </TR>
                  <TR>
                      <TD>std::vector&lt;std::string&gt;</TD>
                      <TD>countryNames</TD>
                      <TD>
      <!--##protect##"Earth::countryNames:description"--><!--##protect##"Earth::countryNames:description"-->
                      </TD>
                  </TR>
              </TABLE>
     
              <H2><A href="#SolarSystem">SolarSystem</A></H2>
      <!--##protect##"SolarSystem:presentation"--><!--##protect##"SolarSystem:presentation"-->
              <TABLE border="1" cellpadding="3" cellspacing="0" width="100%">
                  <TR BGCOLOR="#CCCCFF">
                      <TD><B>Type</B></TD>
                      <TD><B>Attribute name</B></TD>
                      <TD><B>Description</B></TD>
                  </TR>
                  <TR>
                      <TD>std::vector&lt;Planet*&gt;</TD>
                      <TD>planets</TD>
                      <TD>
      <!--##protect##"SolarSystem::planets:description"--><!--##protect##"SolarSystem::planets:description"-->
                      </TD>
                  </TR>
              </TABLE>
      <!--##end##"classes presentation"-->
          </BODY>
      </HTML>
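
The markers in the listing above suggest how expansion can be repeated safely: generated text sits between a ##begin## and an ##end## comment placed right after the ##markup## comment, and everything outside stays untouched. The Python sketch below only imitates that idea under stated assumptions. The function name expand_markup is hypothetical, and the re-expansion behavior is inferred from the marker layout, not taken from CodeWorker's implementation.

```python
def expand_markup(text, key, generated):
    """Insert (or refresh) generated text right after the markup named key."""
    markup = '<!--##markup##"%s"-->' % key
    begin = '<!--##begin##"%s"-->' % key
    end = '<!--##end##"%s"-->' % key
    start = text.index(markup) + len(markup)
    if text.startswith(begin, start):
        # Already expanded once: the old generated region is replaced.
        stop = text.index(end, start) + len(end)
    else:
        # First expansion: nothing to replace yet.
        stop = start
    return text[:start] + begin + generated + end + text[stop:]
```

Calling the function twice with different generated text changes only the region between the begin and end markers; hand-written text before and after the markup is preserved.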

We'll suppose that the skeleton of the HTML documentation is acceptable to us. It will evolve with our design "SolarSystem0.sml": if some classes or members are added or removed, the skeleton will take these changes into account. When the reference to a protected area disappears, because the member it was linked to has been renamed or removed, the protected area is kept at the end of the file.
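
The behavior described above (hand-typed text survives regeneration, and orphaned areas move to the end of the file) can be sketched in a few lines of Python. This is an illustration only, assuming the ##protect## marker layout visible in the listing; the function names collect_protected, marker and regenerate are hypothetical and this is not CodeWorker's implementation.

```python
import re

# One protected area: opening marker, hand-typed body, identical closing marker.
PROTECT = re.compile(
    r'<!--##protect##"([^"]+)"-->(.*?)<!--##protect##"\1"-->', re.DOTALL)

def collect_protected(text):
    """Return {key: hand-typed body} for every protected area found in text."""
    return {m.group(1): m.group(2) for m in PROTECT.finditer(text)}

def marker(key):
    return '<!--##protect##"%s"-->' % key

def regenerate(old_text, keys_in_new_skeleton):
    """Rebuild the areas: reuse old bodies, then append the orphaned ones."""
    saved = collect_protected(old_text)
    parts = []
    for key in keys_in_new_skeleton:
        parts.append(marker(key) + saved.pop(key, "") + marker(key))
    # Areas whose key no longer exists in the design are kept at the end.
    for key, body in saved.items():
        parts.append(marker(key) + body + marker(key))
    return "\n".join(parts)
```

A new key gets an empty area, an existing key keeps its body, and a key that vanished from the design is not lost: its area is appended after the others.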

Now, we have to populate the protected areas and the parts of text located outside the markups, so as to complete our documentation. This work has been done in "SolarSystem1.html".

15 Translating a file

So far, we have seen parsing on one side and source code generation on the other. The translation mode merges the two: it parses a file conforming to a BNF and translates it into another format, all in the same translation script.

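A translation script looks like a BNF-driven parsing script, except that output-writing constructs (for instance the writeText() calls and the raw text written between '@' symbols in the script further down) are allowed inside compound statements announced by '=>'.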

Output is written to another file, so the input file is preserved. The procedure in charge of the translation is called translate().
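
To make the idea of parsing and writing in a single pass concrete, here is a minimal Python sketch of a translator for a tiny HTML subset (bold, italic, and the two escapes &lt; and &gt;). It is not how translate() is implemented; the class Translator and the function translate below are hypothetical names used only for illustration. Output is emitted while the input is being recognized, and the input string is never modified.

```python
class Translator:
    OPEN = {"b": "\\textbf{", "i": "\\textit{"}   # tag name -> LaTeX opening
    ESCAPES = {"lt": "<", "gt": ">"}              # HTML escape -> character

    def __init__(self, source):
        self.src = source
        self.pos = 0
        self.out = []          # separate output: the input stays intact

    def text(self):
        """Copy characters until a closing tag or the end of the input."""
        while self.pos < len(self.src):
            if self.src.startswith("</", self.pos):
                return         # lookahead only: nothing is consumed here
            ch = self.src[self.pos]
            if ch == "&":
                end = self.src.index(";", self.pos)
                self.out.append(self.ESCAPES[self.src[self.pos + 1:end]])
                self.pos = end + 1
            elif ch == "<":
                self.tag()
            else:
                self.out.append(ch)
                self.pos += 1

    def tag(self):
        """Read '<name>' text '</name>' and write the LaTeX equivalent."""
        end = self.src.index(">", self.pos)
        name = self.src[self.pos + 1:end].lower()
        self.pos = end + 1
        self.out.append(self.OPEN[name])
        self.text()
        closing = "</" + name + ">"
        if self.src[self.pos:self.pos + len(closing)].lower() != closing:
            raise SyntaxError("expected %s at offset %d" % (closing, self.pos))
        self.pos += len(closing)
        self.out.append("}")

def translate(source):
    t = Translator(source)
    t.text()
    if t.pos != len(source):
        raise SyntaxError("unexpected closing tag at offset %d" % t.pos)
    return "".join(t.out)
```

For example, translate("a <B>b &lt; <i>c</i></B>") returns the string a \textbf{b < \textit{c}}. The check for "</" plays the same role as a negative lookahead in a grammar: it stops the text loop without consuming any input.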

A small practical example: all our documentation has been written in HTML, but we would like to translate it to LaTeX, in our own format. Why not?

First step: we must be able to read an HTML file according to a BNF representation. The corresponding BNF-driven script only covers the subset of HTML needed to read our file "SolarSystem1.html":

      // file "GettingStarted/HTML-parsing.cwp":
      1 #noCase
      2
      3 HTML ::= #ignore(HTML) #continue '<' "HTML" '>' HTMLHeader HTMLBody '<' '/' "HTML" '>' #empty;
      4 HTMLHeader ::= '<' #continue "HEAD" '>' [~['<' '/' "HEAD" '>']]* '<' '/' "HEAD" '>';
      5 HTMLBody ::= '<' #continue "BODY" '>' HTMLText '<' '/' "BODY" '>';
      6 HTMLText ::=
      7         [
      8             ~'<'
      9                 |
      10             !['<' '/'] #continue '<'
      11                 #readIdentifier:sTag HTMLNextOfTag<sTag>
      12         ]*;
      13 HTMLNextOfTag<"H1"> ::= #continue '>' HTMLText '<' '/' "H1" '>';
      14 HTMLNextOfTag<"H2"> ::= #continue '>' HTMLText '<' '/' "H2" '>';
      15 HTMLNextOfTag<"A"> ::= [HTMLAttribute]* #continue '>' HTMLText '<' '/' 'A' '>';
      16 HTMLNextOfTag<"TABLE"> ::= [HTMLAttribute]* #continue '>' [HTMLTag("TR")]* '<' '/' "TABLE" '>';
      17 HTMLTag(sTag : value) ::= '<' #readText(sTag) #continue HTMLNextOfTag<sTag>;
      18 HTMLNextOfTag<"TR"> ::= [HTMLAttribute]* #continue '>' [HTMLTag("TD")]* '<' '/' "TR" '>';
      19 HTMLNextOfTag<"TD"> ::= [HTMLAttribute]* #continue '>' HTMLText '<' '/' "TD" '>';
      20 HTMLNextOfTag<"UL"> ::= [HTMLAttribute]* #continue '>' [HTMLTag("LI")]* '<' '/' "UL" '>';
      21 HTMLNextOfTag<"LI"> ::= [HTMLAttribute]* #continue '>' HTMLText '<' '/' "LI" '>';
      22 HTMLNextOfTag<"B"> ::= #continue '>' HTMLText '<' '/' "B" '>';
      23 HTMLNextOfTag<"I"> ::= #continue '>' HTMLText '<' '/' "I" '>';
      24 HTMLNextOfTag<"FONT"> ::= [HTMLAttribute]* #continue '>' HTMLText '<' '/' "FONT" '>';
      25 HTMLNextOfTag<"BR"> ::= ['/']? #continue '>';
      26 HTMLAttribute ::= #readIdentifier ['=' #continue [STRING_LITERAL | WORD_LITERAL]]?;
      27
      28
      29 STRING_LITERAL ::= #!ignore '\"' [~'\"']* '\"';
      30 WORD_LITERAL ::= #!ignore [~['>' | '/' | ' ' | '\t']]+;

line 1: case is ignored: <BODY> and <Body> must be recognized as identical, for instance,
line 6: the clause HTMLText reads the text between tags,
line 11: the best way to keep the grammar easy to extend is to declare a template clause that describes how each tag is read,
line 17: a clause that reads a given tag: the token #readText matches the input stream against the evaluated expression passed as a parameter, and the rest is read by the template clause that describes the reading of a tag.
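
The extension mechanism mentioned in the note on line 11 (one template clause per tag, selected by the tag name read at runtime) is close to a dispatch table. The Python sketch below is an analogy only: HANDLERS, next_of_tag and read_tag are hypothetical names, and the handlers return placeholder strings instead of parsing anything.

```python
HANDLERS = {}

def next_of_tag(name):
    """Register the handler for one tag name, ignoring case."""
    def register(func):
        HANDLERS[name.upper()] = func
        return func
    return register

@next_of_tag("H1")
def _h1(body):
    return "heading1(%s)" % body

@next_of_tag("BR")
def _br(body):
    return "linebreak"

def read_tag(name, body=""):
    """Dispatch on the tag name; an unknown tag is a syntax error."""
    try:
        handler = HANDLERS[name.upper()]
    except KeyError:
        raise SyntaxError("no clause for tag <%s>" % name)
    return handler(body)
```

Supporting a new tag means adding one decorated function; the dispatching code in read_tag does not change, which is the property the note on line 11 is after.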

Second step: we have to extend the BNF-driven script with the features needed to generate the LaTeX code properly. Don't be put off by the length of the source code; go straight to the notes that follow it:

      // file "GettingStarted/HTML2LaTeX.cwp":
      1 #noCase
      2
      3 HTML2LaTeX ::= #ignore(HTML) #continue '<' "HTML" '>' HTMLHeader HTMLBody '<' '/' "HTML" '>' #empty;
      4 HTMLHeader ::= '<' #continue "HEAD" '>' [~['<' '/' "HEAD" '>']]* '<' '/' "HEAD" '>';
      5 HTMLBody ::= '<' #continue "BODY" '>' HTMLText '<' '/' "BODY" '>';
      6 HTMLText ::= #!ignore
      7         [
      8             '&' #continue #readIdentifier:sEscape HTMLEscape<sEscape> ';'
      9                 |
      10             ~'<':cChar => writeText(cChar);
      11                 |
      12             !['<' blanks '/']
      13             [
      14                 "<!--" #continue [~"-->"]* "-->"
      15                     |
      16                 '<' #continue #ignore(HTML) #readIdentifier:sTag HTMLNextOfTag<sTag>
      17             ]
      18         ]*;
      19 HTMLEscape<"lt"> ::= => {@<@};
      20 HTMLEscape<"gt"> ::= => {@>@};
      21 HTMLTag(sTag : value) ::= '<' #readText(sTag) #continue HTMLNextOfTag<sTag>;
      22 HTMLNextOfTag<"H1"> ::=
      23         #continue '>' => {@\subsection{@}
      24         HTMLText
      25         '<' '/' "H1" '>' => {@}@};
      26 HTMLNextOfTag<"H2"> ::=
      27         #continue '>' => {@\subsubsection{@}
      28         HTMLText
      29         '<' '/' "H2" '>' => {@}@};
      30 HTMLNextOfTag<"A"> ::= [HTMLAttribute]* #continue '>' HTMLText '<' '/' 'A' '>';
      31 HTMLNextOfTag<"TABLE"> ::=
      32         [HTMLAttribute]* #continue '>' => {
      33             @\begin{table@
      34             newFloatingLocation("table PDF suffix");
      35             @}{@
      36             newFloatingLocation("table columns");
      37             @}{.5}@
      38         }
      39         => local sPDFTableSuffix;
      40         HTMLTableTitle(sPDFTableSuffix)
      41         [HTMLTableLine(sPDFTableSuffix)]*
      42         '<' '/' "TABLE" '>' => {@\end{table@sPDFTableSuffix@}
      43 @};
      44 HTMLTableTitle(sPDFTableSuffix : node) ::=
      45     '<' "TR" [HTMLAttribute]*
      46     #continue '>'
      47     [HTMLTableCol(sPDFTableSuffix)]*
      48     '<' '/' "TR" '>' => {
      49         insertText(getFloatingLocation("table PDF suffix"), sPDFTableSuffix);
      50         writeText(endl());
      51     };
      52 HTMLTableCol(sPDFTableSuffix : node) ::=
      53     '<' "TD" [HTMLAttribute]* #continue '>' => {
      54         @{@
      55         if !sPDFTableSuffix insertText(getFloatingLocation("table columns"), "l");
      56         else insertText(getFloatingLocation("table columns"), "|l");
      57         set sPDFTableSuffix += "i";
      58     }
      59     '<' 'B' '>' [#!ignore [~'<':cChar => writeText(cChar);]*] '<' '/' 'B' '>'
      60     '<' '/' "TD" '>' => {@}@};
      61 HTMLTableLine(sPDFTableSuffix : value) ::=
      62         '<' "TR" [HTMLAttribute]* #continue '>' => {@\line@sPDFTableSuffix@@}
      63         [HTMLTag("TD")]* '<' '/' "TR" '>' => {writeText(endl());};
      64 HTMLNextOfTag<"TD"> ::=
      65         [HTMLAttribute]* #continue '>' => {@{@}
      66         HTMLCellText '<' '/' "TD" '>' => {@}@};
      67 HTMLCellText ::= #!ignore
      68         [
      69             '&' #continue #readIdentifier:sEscape HTMLEscape<sEscape> ';'
      70                 |
      71             ['\r']? ['\n'] => {@ @}
      72                 |
      73             ~'<':cChar => writeText(cChar);
      74                 |
      75             !['<' blanks '/']
      76             [
      77                 "<!--" #continue [~"-->"]* "-->"
      78                     |
      79                 '<' #continue #ignore(HTML) #readIdentifier:sTag HTMLNextOfTag<sTag>
      80             ]
      81         ]*;
      82 HTMLNextOfTag<"UL"> ::=
      83         [HTMLAttribute]* #continue '>' => {@\begin{itemize}
      84 @}
      85         [HTMLTag("LI")]*
      86         '<' '/' "UL" '>' => {@\end{itemize}
      87 @};
      88 HTMLNextOfTag<"LI"> ::=
      89         [HTMLAttribute]* #continue '>' => {@\item @}
      90         HTMLText
      91         '<' '/' "LI" '>' => {writeText(endl());};
      92 HTMLNextOfTag<"B"> ::=
      93         #continue '>' => {@\textbf{@}
      94         HTMLText
      95         '<' '/' "B" '>' => {@}@};
      96 HTMLNextOfTag<"I"> ::=
      97         #continue '>' => {@\textit{@}
      98         HTMLText
      99         '<' '/' "I" '>' => {@}@};
      100 HTMLNextOfTag<"FONT"> ::= [HTMLAttribute]* #continue '>' HTMLText '<' '/' "FONT" '>';
      101 HTMLNextOfTag<"BR"> ::= ['/']? #continue '>' => { writeText(endl());};
      102 HTMLAttribute ::= #readIdentifier ['=' #continue [STRING_LITERAL | WORD_LITERAL]]?;
      103
      104
      105 blanks ::= [' '| '\t' | '\r' | '\n']*;
      106 STRING_LITERAL ::= #!ignore '\"' [~'\"']* '\"';
      107 WORD_LITERAL ::= #!ignore [~['>' | '/' | ' ' | '\t']]+;

line 6: blank characters are interesting, so we refuse to ignore HTML blanks and comments,
line 8: handling of HTML escape sequences, announced by character '&',
line 10: if not the beginning of a tag, the current character of the input stream is put to the output stream,
line 12: token operator '!' doesn't move the position of the input stream, and it continues in sequence only if the token expression that follows doesn't match; here, we check whether we have reached an end of tag or not,
line 14: we do not ignore comments anymore, so we have to skip them ourselves,
line 16: an embedded tag has been encountered,
line 19: template clauses HTMLEscape<T> are always valid and just convert special characters to their LaTeX representation,
line 22: in real life, the HTML tag <H1> could represent a chapter, but the LaTeX output file is intended to be included in the reference manual of CodeWorker as an illustration; it will be part of a section, so chapters are translated as subsections,
line 26: in real life, the HTML tag <H2> could represent a section, but for the same reason as above, it is translated as a sub-subsection,
line 34: in HTML, the number of columns of a table is only known once its rows have been read. However, a LaTeX table (well-formed for a PDF conversion) must state explicitly how many columns it has. So, a floating location is attached to the current position of the output file; text is inserted there later, while columns are being discovered,
line 36: the format of each column is specified at this place,
line 39: we consider that the first line of the table gives the name of the columns, and we'll take the PDF table suffix ('ii' for 2 columns, 'iii' for 3 columns, ...) to write lines of the table correctly,
line 41: we translate as many lines of the table as we can read, knowing the PDF suffix,
line 52: the clause is intended to read the name of a column of a table, and to translate it to LaTeX, knowing that some text must be inserted into the declarative part of the LaTeX table,
line 67: the text in a cell of a table shouldn't contain paragraph breaks (an empty line in LaTeX),
line 71: the simplest way to avoid empty lines is to ignore ends of line and to replace them with a space.
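
The floating locations used on lines 34 to 57 of the script can be pictured as named bookmarks in the output that move forward as text is inserted at or before them. The Python sketch below mimics that behavior for the table header. It is a model built from the notes above, not CodeWorker's implementation; the class Output and the function table_header are hypothetical.

```python
class Output:
    """Text buffer with named insertion points that follow inserted text."""

    def __init__(self):
        self.text = ""
        self.locations = {}

    def write(self, s):
        self.text += s

    def new_floating_location(self, key):
        # Bookmark the current end of the output under the given name.
        self.locations[key] = len(self.text)

    def insert_text(self, key, s):
        pos = self.locations[key]
        self.text = self.text[:pos] + s + self.text[pos:]
        # Locations at or after the insertion point shift right, so
        # successive inserts at the same key appear in insertion order.
        for k, p in list(self.locations.items()):
            if p >= pos:
                self.locations[k] = p + len(s)

def table_header(column_names):
    out = Output()
    out.write("\\begin{table")
    out.new_floating_location("table PDF suffix")
    out.write("}{")
    out.new_floating_location("table columns")
    out.write("}{.5}")
    suffix = ""
    for _ in column_names:
        # First column is "l", the following ones are "|l".
        out.insert_text("table columns", "|l" if suffix else "l")
        suffix += "i"
    out.insert_text("table PDF suffix", suffix)
    return out.text
```

With three column names, table_header returns \begin{tableiii}{l|l|l}{.5}: the header was written before the number of columns was known, and both gaps were filled afterwards.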

Last step: we have to change the leader script so that it translates the HTML documentation to LaTeX:

CodeWorker command line to execute:
-I Scripts/Tutorial -path . -define DESIGN_FILE=GettingStarted/SolarSystem0.sml -script GettingStarted/LeaderScript4.cws

      // file "GettingStarted/LeaderScript4.cws":
      1 if !getProperty("DESIGN_FILE")
      2     error("'-define DESIGN_FILE=file' expected on the command line");
      3 traceLine("'Simple Modeling' design file to parse = \""
      4           + getProperty("DESIGN_FILE") + "\"");
      5 parseAsBNF("GettingStarted/SimpleML-parsing.cwp",
      6            project, getProperty("DESIGN_FILE"));
      7 #include "TreeDecoration.cws"
      8
      9 #include "SharedFunctions.cws"
      10 foreach myClass in project.listOfClasses {
      11     traceLine("generating class '" + myClass.name + "' ...");
      12     generate("GettingStarted/CppObjectHeader.cwt", myClass, getWorkingPath() + "Scripts/Tutorial/GettingStarted/Cpp/" + myClass.name + ".h");
      13     generate("GettingStarted/CppObjectBody.cwt", myClass, getWorkingPath() + "Scripts/Tutorial/GettingStarted/Cpp/" + myClass.name + ".cpp");
      14     generate("GettingStarted/JAVAObject.cwt", myClass, getWorkingPath() + "Scripts/Tutorial/GettingStarted/JAVA/solarsystem/" + myClass.name + ".java");
      15 }
      16
      17 local myDocumentationContext;
      18 insert myDocumentationContext.language = "C++";
      19 traceLine("generating the HTML documentation...");
      20 setCommentBegin("<!--");
      21 setCommentEnd("-->");
      22 expand("GettingStarted/HTMLDocumentation.cwt",
      23         myDocumentationContext, getWorkingPath()
      24         + "Scripts/Tutorial/GettingStarted/SolarSystem1.html");
      25 translate("GettingStarted/HTML2LaTeX.cwp", project, "GettingStarted/SolarSystem1.html", getWorkingPath() + "Scripts/Tutorial/GettingStarted/SolarSystem.tex");

line 22: the procedure expand() populates "SolarSystem1.html" with the characteristics of the project,
line 25: an execution context (project here) is passed as the 'this' variable, although no data is collected during the translation: it only reads and writes, and there is nothing to keep.

Output:

'Simple Modeling' design file to parse = "GettingStarted/SolarSystem0.sml"
file parsed successfully
generating class 'Planet' ...
generating class 'Earth' ...
generating class 'SolarSystem' ...
generating the HTML documentation...

It generates the LaTeX file that makes up the next subsection:

15.1 Design of a solar system

We dispose of some classes both in C++ and JAVA that allow building applications working on notions of planets, stars and solar systems.

15.1.1 Planet

This class represents the characteristics of a planet.

Type    Attribute name  Description
double  diameter        the average diameter of the planet

15.1.2 Earth

This class represents our planet, for instantiating our particular solar system for instance, and working on geopolitical data perhaps!

Type                      Attribute name  Description
std::vector<std::string>  countryNames    the name of all countries are put into

15.1.3 SolarSystem

This class represents the solar system, with its constituents, the sun excluded for the moment.

Type                  Attribute name  Description
std::vector<Planet*>  planets         the planets that compose the solar system.

16 The debugger

The -debug option passed on the command line runs the interpreter in debug mode. See the chapter on the integrated debugger for more information about its features. We'll apply it to our previous leader script:

CodeWorker command line to execute:
-I Scripts/Tutorial -path . -define DESIGN_FILE=GettingStarted/SolarSystem0.sml -script GettingStarted/LeaderScript5.cws -stdin GettingStarted/Debugger.cmd -debug

      // file "GettingStarted/LeaderScript5.cws":
      if !getProperty("DESIGN_FILE")
          error("'-define DESIGN_FILE=file' expected on the command line");
      traceLine("'Simple Modeling' design file to parse = \""
                + getProperty("DESIGN_FILE") + "\"");
      parseAsBNF("GettingStarted/SimpleML-parsing.cwp",
                 project, getProperty("DESIGN_FILE"));
      #include "TreeDecoration.cws"
     
      #include "SharedFunctions.cws"
      foreach myClass in project.listOfClasses {
          traceLine("generating class '" + myClass.name + "' ...");
          generate("GettingStarted/CppObjectHeader.cwt", myClass, getWorkingPath() + "Scripts/Tutorial/GettingStarted/Cpp/" + myClass.name + ".h");
          generate("GettingStarted/CppObjectBody.cwt", myClass, getWorkingPath() + "Scripts/Tutorial/GettingStarted/Cpp/" + myClass.name + ".cpp");
          generate("GettingStarted/JAVAObject.cwt", myClass, getWorkingPath() + "Scripts/Tutorial/GettingStarted/JAVA/solarsystem/" + myClass.name + ".java");
      }
     
      local myDocumentationContext;
      insert myDocumentationContext.language = "C++";
      traceLine("generating the HTML documentation...");
      setCommentBegin("<!--");
      setCommentEnd("-->");
      expand("GettingStarted/HTMLDocumentation.cwt",
              myDocumentationContext, getWorkingPath()
              + "Scripts/Tutorial/GettingStarted/SolarSystem1.html");
      translate("GettingStarted/HTML2LaTeX.cwp", project, "GettingStarted/SolarSystem1.html", getWorkingPath() + "Scripts/Tutorial/GettingStarted/SolarSystem.tex");

Output:

"LeaderScript5.cws" at 5: if !getProperty("DESIGN_FILE")
// The controlling sequence stops on the first statement of the leader script.
// We go to the next instruction:
n
"LeaderScript5.cws" at 7: traceLine("'Simple Modeling' design file to parse = \""
// twice more:
n2
'Simple Modeling' design file to parse = "GettingStarted/SolarSystem0.sml"
"LeaderScript5.cws" at 11: parseAsBNF("GettingStarted/SimpleML-parsing.cwp",
//let's plunge into the BNF-driven script:
s
parsed file is "c:/Projects/generator/Scripts/Tutorial/GettingStarted/SolarSystem0.sml":1,1
"SimpleML-parsing.cwp" at 6: world ::= #ignore(C++) [class_declaration]* #empty
//We are pointing to the beginning of the rule. Let's execute '#ignore(C++)':
s
parsed file is "c:/Projects/generator/Scripts/Tutorial/GettingStarted/SolarSystem0.sml":1,1
"SimpleML-parsing.cwp" at 6: world ::= #ignore(C++) [class_declaration]* #empty
//Let's go to the unbounded expression '[class_declaration]*':
s
parsed file is "c:/Projects/generator/Scripts/Tutorial/GettingStarted/SolarSystem0.sml":1,1
"SimpleML-parsing.cwp" at 6: world ::= #ignore(C++) [class_declaration]* #empty
//Now, we have a look at 'class_declaration':
s
parsed file is "c:/Projects/generator/Scripts/Tutorial/GettingStarted/SolarSystem0.sml":2,1
"SimpleML-parsing.cwp" at 16: class_declaration ::= IDENT:"class" #continue
//We visit 'IDENT:"class"' and we step over immediately. In a BNF-driven script, tokens of a
//sequence are iterated step by step, and 'next' runs the whole sequence in one shot:
s
parsed file is "c:/Projects/generator/Scripts/Tutorial/GettingStarted/SolarSystem0.sml":2,1
"SimpleML-parsing.cwp" at 112: IDENT ::= #!ignore ['a'..'z'|'A'..'Z'|'_']
n
parsed file is "c:/Projects/generator/Scripts/Tutorial/GettingStarted/SolarSystem0.sml":2,6
"SimpleML-parsing.cwp" at 21: IDENT:sClassName
//We visit 'IDENT:sClassName' and we step over immediately:
s
parsed file is "c:/Projects/generator/Scripts/Tutorial/GettingStarted/SolarSystem0.sml":2,7
"SimpleML-parsing.cwp" at 112: IDENT ::= #!ignore ['a'..'z'|'A'..'Z'|'_']
n
parsed file is "c:/Projects/generator/Scripts/Tutorial/GettingStarted/SolarSystem0.sml":2,13
"SimpleML-parsing.cwp" at 25: => insert project.listOfClasses[sClassName].name = sClassName;
//What about all local variables available on the stack?
l
sClassName
//What is the value of 'sClassName'?
t sClassName
Planet
//Now, we are looking at a classical statement of the language, an 'insert' assignment. But
//it might be more convenient to see more source code:
d 4
parsed file is "c:/Projects/generator/Scripts/Tutorial/GettingStarted/SolarSystem0.sml":2,13
21: IDENT:sClassName
22: //note: about parsing, classes are modeled into node
23: //note: \textbf{project.listOfClasses[}\textit{sClassName}\textbf{]}. Its attribute
24: //note: \samp{name} contains the value of \textit{sClassName}.
25: => insert project.listOfClasses[sClassName].name = sClassName;
26: //note: if the class inherits from a parent, \samp{\textbf{':'}} is necessary followed by
27: //note: an identifier (pattern \samp{\#continue}), and the identifier that matches with
28: //note: clause call \textit{IDENT} is assigned to the local variable \samp{sClassName},
29: [':' #continue IDENT:sParentName
//What about the call stack?
stack
parsed file is "c:/Projects/generator/Scripts/Tutorial/GettingStarted/SolarSystem0.sml":2,13
"SimpleML-parsing.cwp" at 25: => insert project.listOfClasses[sClassName].name = sClassName;
parsed file is "c:/Projects/generator/Scripts/Tutorial/GettingStarted/SolarSystem0.sml":2,13
"SimpleML-parsing.cwp" at 6: world ::= #ignore(C++) [class_declaration]* #empty
parsed file is "c:/Projects/generator/Scripts/Tutorial/GettingStarted/SolarSystem0.sml":2,13
"LeaderScript5.cws" at 11: parseAsBNF("GettingStarted/SimpleML-parsing.cwp",
//Exiting the debug session:
q
file parsed successfully
generating class 'Planet' ...
generating class 'Earth' ...
generating class 'SolarSystem' ...
generating the HTML documentation...

17 Scripts coverage and time consuming

The -quantify option passed on the command line runs the interpreter in profiling mode. See the chapter on quantifying scripts for more information about its features. We'll apply it to our previous leader script:

CodeWorker command line to execute:
-I Scripts/Tutorial -path . -define DESIGN_FILE=GettingStarted/SolarSystem0.sml -script GettingStarted/LeaderScript6.cws -quantify Scripts/Tutorial/GettingStarted/quantify.html

      // file "GettingStarted/LeaderScript6.cws":
      if !getProperty("DESIGN_FILE")
          error("'-define DESIGN_FILE=file' expected on the command line");
      traceLine("'Simple Modeling' design file to parse = \""
                + getProperty("DESIGN_FILE") + "\"");
      parseAsBNF("GettingStarted/SimpleML-parsing.cwp",
                 project, getProperty("DESIGN_FILE"));
      #include "TreeDecoration.cws"
     
      #include "SharedFunctions.cws"
      foreach myClass in project.listOfClasses {
          traceLine("generating class '" + myClass.name + "' ...");
          generate("GettingStarted/CppObjectHeader.cwt", myClass, getWorkingPath() + "Scripts/Tutorial/GettingStarted/Cpp/" + myClass.name + ".h");
          generate("GettingStarted/CppObjectBody.cwt", myClass, getWorkingPath() + "Scripts/Tutorial/GettingStarted/Cpp/" + myClass.name + ".cpp");
          generate("GettingStarted/JAVAObject.cwt", myClass, getWorkingPath() + "Scripts/Tutorial/GettingStarted/JAVA/solarsystem/" + myClass.name + ".java");
      }
     
      local myDocumentationContext;
      insert myDocumentationContext.language = "C++";
      traceLine("generating the HTML documentation...");
      setCommentBegin("<!--");
      setCommentEnd("-->");
      expand("GettingStarted/HTMLDocumentation.cwt",
              myDocumentationContext, getWorkingPath()
              + "Scripts/Tutorial/GettingStarted/SolarSystem1.html");
      translate("GettingStarted/HTML2LaTeX.cwp", project, "GettingStarted/SolarSystem1.html", getWorkingPath() + "Scripts/Tutorial/GettingStarted/SolarSystem.tex");

Output:

'Simple Modeling' design file to parse = "GettingStarted/SolarSystem0.sml"
file parsed successfully
generating class 'Planet' ...
generating class 'Earth' ...
generating class 'SolarSystem' ...
generating the HTML documentation...

Profiling results:

-- quantify session --
quantify execution time = 427ms
User defined functions:
  populateHeaderDeclarations(...) file "c:/Projects/generator/Scripts/Tutorial/GettingStarted/CppObjectHeader.cwt" at 29: 7 occurences in 0ms
  getMethodID(...) file "c:/Projects/generator/Scripts/Tutorial/GettingStarted/SharedFunctions.cws" at 98: 3 occurences in 0ms
  getParameterType(...) file "c:/Projects/generator/Scripts/Tutorial/GettingStarted/SharedFunctions.cws" at 44: 3 occurences in 0ms
  getType(...) file "c:/Projects/generator/Scripts/Tutorial/GettingStarted/SharedFunctions.cws" at 31: 13 occurences in 0ms
  getVariableName(...) file "c:/Projects/generator/Scripts/Tutorial/GettingStarted/SharedFunctions.cws" at 76: 26 occurences in 1ms
  normalizeIdentifier(...) file "c:/Projects/generator/Scripts/Tutorial/GettingStarted/SharedFunctions.cws" at 5: 39 occurences in 0ms
Predefined functions:
  charAt(...): 39 occurrences
  composeHTMLLikeString(...): 7 occurrences
  endString(...): 9 occurrences
  endl(...): 19 occurrences
  executeStringQuiet(...): 1 occurrences
  existVariable(...): 11 occurrences
  findElement(...): 2 occurrences
  findFirstChar(...): 39 occurrences
  first(...): 12 occurrences
  getFloatingLocation(...): 23 occurrences
  getMarkupKey(...): 1 occurrences
  getProperty(...): 3 occurrences
  getWorkingPath(...): 11 occurrences
  isEmpty(...): 6 occurrences
  isNegative(...): 39 occurrences
  newFloatingLocation(...): 12 occurrences
  not(...): 72 occurrences
  startString(...): 39 occurrences
  subString(...): 39 occurrences
  toUpperString(...): 39 occurrences
Procedures:
  __RAW_TEXT_TO_WRITE(...): 498 occurrences
  clearVariable(...): 1 occurrences
  expand(...): 1 occurrences
  generate(...): 9 occurrences
  insertTextOnce(...): 24 occurrences
  parseAsBNF(...): 1 occurrences
  setCommentBegin(...): 1 occurrences
  setCommentEnd(...): 1 occurrences
  setProtectedArea(...): 19 occurrences
  traceLine(...): 5 occurrences
  translate(...): 1 occurrences
  writeText(...): 325 occurrences
Covered source code: 83%
-- end of quantify session --

When the -quantify option isn't followed by an HTML file name, the summary profiling results are reported to the console.

If a file name was specified, the HTML output file highlights all visited script lines, to show which parts of the code are executed often and which are executed rarely. Each visited line is prefixed by the number of times the controlling sequence has run on it.
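
The console summary shown earlier has a regular shape, so it is easy to post-process, for instance to rank the most frequently called functions. The Python sketch below assumes only the line format visible in that listing (a name followed by "(...)", then a colon and a count). The names call_counts, top_calls and SAMPLE are hypothetical, and SAMPLE is a short excerpt of the listing, not a complete report.

```python
import re

# Accepts both spellings printed in the report ("occurences", "occurrences").
LINE = re.compile(r'^\s*(\w+)\(\.\.\.\).*?:\s(\d+) occurr?ences')

def call_counts(report):
    """Map each function or procedure name to its number of calls."""
    counts = {}
    for line in report.splitlines():
        m = LINE.match(line)
        if m:
            counts[m.group(1)] = int(m.group(2))
    return counts

def top_calls(report, n):
    """Return the n most frequently called entries, highest first."""
    counts = call_counts(report)
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)[:n]

SAMPLE = """\
User defined functions:
  getVariableName(...) file "c:/Projects/generator/Scripts/Tutorial/GettingStarted/SharedFunctions.cws" at 76: 26 occurences in 1ms
Predefined functions:
  not(...): 72 occurrences
Procedures:
  __RAW_TEXT_TO_WRITE(...): 498 occurrences
  writeText(...): 325 occurrences
"""
```

On this excerpt, top_calls(SAMPLE, 2) yields __RAW_TEXT_TO_WRITE (498 calls) followed by writeText (325 calls), which matches the intuition that a translation script spends most of its calls writing text.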

Some points to notice:

18 Translating interpreted scripts to C++ source code

Once the scripts are considered stable, it may be worthwhile to convert the interpreter and all necessary scripts into an executable, for many reasons:

The executable is built from the C++ source code generated for the script files. There are two ways to request compilation of the CodeWorker script files to C++: the -c++ switch on the command line, or the compileToCpp() function, which does the same job.

Compiling the project to C++ converts the leader script and all its dependencies (meaning every script that may be required by the leader) and then creates two makefiles (a DSP for Visual C++ and a classical makefile intended for Linux). The project takes the name of the leader script.

To compile our Simple Modeling Language project to C++, we may choose to proceed as one of the following:

The directory called "Scripts/Tutorial/GettingStarted/bin" contains the C++ source files and the makefiles:

    Scripts/Tutorial/GettingStarted/bin/CGExternalHandling.h
    Scripts/Tutorial/GettingStarted/bin/CGRuntime.h
    Scripts/Tutorial/GettingStarted/bin/CppObjectBody_cwt.cpp
    Scripts/Tutorial/GettingStarted/bin/CppObjectBody_cwt.h
    Scripts/Tutorial/GettingStarted/bin/CppObjectHeader_cwt.cpp
    Scripts/Tutorial/GettingStarted/bin/CppObjectHeader_cwt.h
    Scripts/Tutorial/GettingStarted/bin/CppParsingTree.h
    Scripts/Tutorial/GettingStarted/bin/DynPackage.h
    Scripts/Tutorial/GettingStarted/bin/HTML2LaTeX_cwp.cpp
    Scripts/Tutorial/GettingStarted/bin/HTML2LaTeX_cwp.h
    Scripts/Tutorial/GettingStarted/bin/HTMLDocumentation_cwt.cpp
    Scripts/Tutorial/GettingStarted/bin/HTMLDocumentation_cwt.h
    Scripts/Tutorial/GettingStarted/bin/JAVAObject_cwt.cpp
    Scripts/Tutorial/GettingStarted/bin/JAVAObject_cwt.h
    Scripts/Tutorial/GettingStarted/bin/LeaderScript6.dsp
    Scripts/Tutorial/GettingStarted/bin/LeaderScript6_cws.cpp
    Scripts/Tutorial/GettingStarted/bin/LeaderScript6_cws.h
    Scripts/Tutorial/GettingStarted/bin/Makefile
    Scripts/Tutorial/GettingStarted/bin/SimpleML-parsing_cwp.cpp
    Scripts/Tutorial/GettingStarted/bin/SimpleML-parsing_cwp.h
    Scripts/Tutorial/GettingStarted/bin/UtlException.h

The main C++ source file is "LeaderScript6.cpp" and the executable will be called "LeaderScript6.exe".
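
Judging only from the listing above, each script appears to produce a pair of C++ files whose names are built from the script name, with the dot of the extension replaced by an underscore. The Python sketch below encodes that observation; it is inferred from the file names shown and is not a documented rule, and the function name generated_cpp_files is hypothetical.

```python
import os

def generated_cpp_files(script_name):
    """Guess the C++ file names generated for one script (inferred pattern)."""
    base, ext = os.path.splitext(os.path.basename(script_name))
    stem = base + "_" + ext.lstrip(".")
    return [stem + ".cpp", stem + ".h"]
```

Such a helper could be used to check that every script of the project has its two generated files in the output directory before launching the C++ build.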

The scripting language

CodeWorker must be seen as a script interpreter that is intended to parse and to generate any kind of text or source code. This interpreter admits some options on the command line. Some of them look like those of a compiler.

CodeWorker doesn't provide any Graphical User Interface, but a console mode allows interactivity with the user.

19 Command line of the interpreter

The leader script is the name given to the script that is executed first by the interpreter. There are six ways to pass this leader script to the interpreter via the command line:

To find easier a file to open for reading among some directories, the option -I specifies a path to explore. It gives more flexibility in sharing input files (both scripts and user files, excepting generated or expanded files) between directories, and it avoids relative or absolute paths into scripts.

It is possible to define some properties on the command line with the option -define (or -D). These properties are intended to be used in scripts.
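
The handling of -define can be pictured with a small Python model. This is only a sketch of the behaviour described in this manual, not CodeWorker's implementation: the names parse_defines and get_property are hypothetical, splitting at the first '=' is an assumption, and so is returning an empty string for an unknown property. The default value "true" comes from the description of the -define switch in the table of switches.

```python
def parse_defines(argv):
    """Collect -define/-D properties from a command line into a dict."""
    properties = {}
    i = 0
    while i < len(argv):
        if argv[i] in ("-define", "-D") and i + 1 < len(argv):
            # Assumption: the name ends at the first '=' sign.
            name, sep, value = argv[i + 1].partition("=")
            # A property given without '=value' receives "true".
            properties[name] = value if sep else "true"
            i += 2
        else:
            i += 1
    return properties


def get_property(properties, name):
    # Hypothetical stand-in for the script function getProperty();
    # returning "" for an unknown name is an assumption.
    return properties.get(name, "")


props = parse_defines(["-script", "gen.cws", "-D", "DEBUG", "-define", "LANG=java"])
print(get_property(props, "DEBUG"), get_property(props, "LANG"))
```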

It is recommended to specify a working directory with the option -path. The assigned value is accessible in scripts via the function getWorkingPath(). This working directory generally indicates the output path for copying or generating files. The developer of scripts decides how to use it.

CodeWorker interprets scripts efficiently for speed. However, it is more convenient to run a standalone executable than the interpreter plus some script files. Moreover, once scripts are stable, why not compile them as an executable to run the project a few times faster? Option -c++ translates the leader script and all its dependencies to ready-to-compile C++ source code.

To facilitate the tracking of errors, an integrated debugger is started with the option -debug. It runs in the console, and some classical commands allow taking control of the execution and exploring the stack and the variables.

The following table presents all switches that are allowed on the command line:

Switch    Description
-args [arg]* Pass some arguments to the command line. The list of arguments stops at the end of the command line or as soon as an option is encountered. The arguments are stored in a global array variable called _ARGS.
-autoexpand file-to-expand The file file-to-expand is explored for expanding code at markups, executing a template-based script inserted just below each markup. It is identical to executing the script function autoexpand(file-to-expand, project).
-c++ generated-project-path CodeWorker-path? To translate the leader script and all its dependencies into C++ source code, once the execution of the leader script has completed (same job as compileToCpp()). The CodeWorker-path is optional and gives the path to the includes and libraries of the software. However, it is now recommended to specify CodeWorker-path with the switch -home.
-c++2target script-file generated-project-path target-language? To translate the leader script and all its dependencies into C++ source code. The C++ is then translated to a target language, all of this once the execution of the leader script has completed. Do not forget to give the path to the includes and libraries of CodeWorker by setting the switch -home.
A preprocessor definition called "c++2target-path" is automatically created. It contains the path of the generated project. Call getProperty("c++2target-path") to retrieve the path value.
target-language is optional if at least one script of the project holds the target into its filename, just before the extension. Example: "myscript.java.cwt" means that the target language of this script is "java".
A property can follow the name of the target language, separated by a '=' symbol. The property is accessible via getProperty("c++2target-property"), and its nature depends on the target. For instance, in Java, this property represents the package the generated classes will belong to. Example: java=org.landscape.mountains.
-c++external filename To generate C++ source code for implementing all functions declared as external into scripts.
-commentBegin format To specify the format of a beginning of comment.
-commentEnd format To specify the format of a comment's end.
-compile scriptFile To compile a script file, just to check whether the syntax is correct.
-commands commandFile To load from commandFile all the arguments ordinarily processed on the command line. It must be the only switch passed on the command line.
-console To open a console session (default mode if no script to interpret is specified via -script or -compile or -generate or -expand).
-debug [remote]? To debug a script in a console while executing it. The optional argument remote defines parameters for a remote socket control of the debugging session. remote looks like <hostname>:<port>. If <hostname> is empty, CodeWorker runs as a socket server.
-define VAR=value or -D ... To define some variables, as when using the C++ preprocessor or when passing properties to the JAVA compiler. These variables are similar to properties, insofar as they aren't exploited during the preprocessing of scripts to interpret. This option also accepts the form -define VAR when no value has to be assigned; in that case, "true" is assigned by default to variable VAR. The script function getProperty("VAR") gives the value of variable VAR.
-expand pattern-script file-to-expand Script file pattern-script is executed to expand file file-to-expand at its markups. It is identical to executing the script function expand(pattern-script, project, file-to-expand).
-fast To optimize speed. While processing generation, the output file is built into memory, instead of into a temporary file.
-generate pattern-script file-to-generate Script file pattern-script is executed to generate file file-to-generate. It is identical to executing the script function generate(pattern-script, project, file-to-generate).
-genheader text Adds a header at the beginning of all generated files, followed by a text (see procedure setGenerationHeader()).
-help or ? Help about the command line.
-home CodeWorker-path Specifies the path to the home directory of CodeWorker.
-I path Specify a path to explore when trying to find a file while invoking include or parseFree or parseAsBNF or generate or expand or ... This option may be repeated to specify more than one path.
-insert variable_expression value Creates a new node in the main parse tree project and assigns a constant value to it. It is identical to executing the statement insert variable_expression = "value";.
-nologo The interpreter doesn't write the copyright in the shell at the beginning.
-nowarn warnings Specified warning types are ignored. They are separated by pipe symbols. Today, the only recognized type is undeclvar, which guards the developer against the use of an undeclared variable.
-parseBNF BNF-parsing-script source-file The script file BNF-parsing-script parses source-file from an extended BNF grammar. It is identical to executing the script function parseAsBNF(BNF-parsing-script, project, source-file).
-path path Output directory, returned by the script function getWorkingPath(), and ordinarily used to specify where to generate or copy a file.
-quantify [outputFile]? To execute scripts in quantify mode, which measures coverage and time consumption. Results are saved to the HTML file outputFile, or displayed on the console if outputFile is not given.
-report report-file request-flag To generate a report once the execution has completed. The report is saved to file report-file, and the nature of the information depends on the flag request-flag. This flag must be built by computing a bitwise OR of one or several of the following integer constants:
  • 1: provides every output file written by a template-based script (generate(), expand() or translate())
  • 2: provides every input file scanned by a BNF parse script (parseAsBNF() or translate())
  • 4: provides details of coverage recording for every output file using the #coverage directive
  • 8: provides details of coverage recording for every input file using the #matching directive
  • 16: provides details of coverage recording for every output file written by a template-based script
  • 32: provides details of coverage recording for every input file scanned by a BNF parse script
Notice that flags 16 and 32 may become highly time and memory consuming, depending both on how many input/output files you have to process and on their size.
-script script-file Defines the leader script, which will be executed first.
-stack depth To limit the depth of recursive function calls, so as to avoid a stack overflow. By default, the depth is set to 1000.
-stdin filename To change the standard input for reading from an existing file. It may be useful for running a scenario.
-stdout filename To change the standard output for writing it to a file.
-time To display the execution time expressed in milliseconds, just before exiting.
-translate translation-script source-file file-to-generate Script file translation-script performs a source-to-source translation. It is identical to executing the script function translate(translation-script, project, source-file, file-to-generate).
-varexist To trigger a warning when the value of a variable that doesn't exist is required into a script.
-verbose To display internal messages of the interpreter (information).
-version version-name To force scripts to be interpreted as if written for the previous version given by version-name.
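
The request-flag of the -report switch is a bitwise OR of the integer constants listed in its description. The Python sketch below shows how such a value might be composed and decoded. The constant names and the two helper functions are hypothetical and chosen for illustration only; the integer values are those given in the table.

```python
# Integer constants taken from the description of -report.
REPORT_OUTPUT_FILES = 1        # output files written by template-based scripts
REPORT_INPUT_FILES = 2         # input files scanned by BNF parse scripts
REPORT_COVERAGE_DIRECTIVE = 4  # coverage for output files using #coverage
REPORT_MATCHING_DIRECTIVE = 8  # coverage for input files using #matching
REPORT_OUTPUT_COVERAGE = 16    # coverage for every generated output file
REPORT_INPUT_COVERAGE = 32     # coverage for every parsed input file


def request_flag(*flags):
    """Combine several report constants with a bitwise OR."""
    value = 0
    for flag in flags:
        value |= flag
    return value


def is_requested(value, flag):
    """Tell whether a given constant is part of a combined request-flag."""
    return (value & flag) != 0


flag = request_flag(REPORT_OUTPUT_FILES, REPORT_INPUT_FILES, REPORT_MATCHING_DIRECTIVE)
print(flag)  # value to pass as request-flag on the command line
```

With these three constants the combined value is 11, so the command line would contain -report followed by the report file name and 11.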

Note that the interpreter offers a convenient way of running a common script with arguments:

codeworker <script-file> <arg1> ... <argN> [<switch>]*

This writing replaces the more verbose:

codeworker -script <script-file> -args <arg1> ... <argN> [<switch>]*
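
The equivalence between the two forms can be sketched in Python as a rewriting of the argument list. This is a simplified model, not the interpreter's actual parsing: the function name expand_shorthand is hypothetical, and the rule that any token starting with '-' is a switch is an assumption drawn from the description of -args (the list of arguments stops as soon as an option is encountered).

```python
def expand_shorthand(argv):
    """Rewrite '<script-file> <arg>* <switch>*' into the verbose form."""
    # Nothing to do if the command line is empty or already starts with a switch.
    if not argv or argv[0].startswith("-"):
        return list(argv)
    script, rest = argv[0], argv[1:]
    # Assumption: arguments run until the first token that starts with '-'.
    count = 0
    while count < len(rest) and not rest[count].startswith("-"):
        count += 1
    args, switches = rest[:count], rest[count:]
    verbose = ["-script", script]
    if args:
        verbose += ["-args"] + args
    return verbose + switches


print(expand_shorthand(["gen.cws", "model.xml", "out", "-nologo"]))
```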

A console mode is launched when the command line is empty. The console only accepts scripts written in the common syntax, with common functions and procedures. So, parsing and generation scripts aren't typed directly on the console.

20 Syntax generalities and statements

A script in CodeWorker consists of a series of statements that are organized into blocks (also known as compound statements). A statement is an instruction the interpreter has to execute.

A single statement must close with a semicolon (';'). A compound statement is defined by enclosing instructions between braces ('{}'). A block can be used everywhere you can use a single statement and must never end with a semicolon after the trailing brace.

Comments are indicated either by surrounding the text with '/*' and '*/' or by preceding the rest of the line to ignore with a double slash ('//').

There are three families of scripts. To facilitate their syntax highlighting in editors, or to indicate briefly the type of the script, we suggest using file extensions that depend on the nature of the script. The next table lists the extensions commonly used in CodeWorker.

Extension    Description
".cwt" a template-based script, for text generation
".cwp" an extended-BNF parse script, for parsing text
".cws" a common script, none of the above

The structure of the grammar is so rich that it is a challenge to find an editor that offers a syntax highlighting engine powerful enough. JEdit supports writing production rules to describe a grammar, so it is possible to express the syntax highlighting of the scripting language.

You'll find a package dedicated to JEdit on the Web site, for the inclusion of these new highlighting modes. Many thanks to Patrick Brannan for this contribution.

20.1 Preprocessor directives

A preprocessor directive always starts with a '#' symbol and is followed by the name of the directive.

20.1.1 Including a file

The #include filename directive tells the preprocessor to replace the directive at the point where it appears by the contents of the file specified by the constant string filename. The preprocessor looks for the file in the current directory and then searches along the path specified by the -I option on the command line.
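
The search order described above (current directory first, then the -I paths) can be modelled with a few lines of Python. This is a sketch of the lookup rule only; resolve_include is a hypothetical name, and trying the -I paths in the order they were given is an assumption.

```python
import os


def resolve_include(filename, include_paths, current_dir="."):
    """Return the first existing candidate for filename, or None."""
    # The current directory is tried before any -I path.
    for directory in [current_dir] + list(include_paths):
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate):
            return candidate
    return None


# Example: look for a hypothetical "common.cws" with two -I paths.
print(resolve_include("common.cws", ["Scripts/shared", "Scripts/lib"]))
```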

20.1.2 Extending the language via a package

A package is an extension of the scripting language that allows adding new functions in CodeWorker at runtime. A package is implemented as an executable module, which exports all new functions the developer wants to make available in the interpreter.

Loading of a package

The preprocessor directive #use tells the interpreter that it must extend itself with the functions exposed by a package.

The syntax is: #use package-name

Loading a package more than once has no effect.

The name of the package must prefix the name of the function, when calling it: package-name::my-function(parameters...)

Example:

#use PGSQL
PGSQL::connect("-U pilot -d emergencyDB");
local sRequest = "SELECT solution FROM average_adjustment WHERE damage = 'broken wing'";
local listOfSolutions;
PGSQL::selectList(sRequest, listOfSolutions);
if listOfSolutions.empty()
  traceLine("No solution. Suggestion: parachute jump?");
else {
  traceLine("Solutions:");
  foreach i in listOfSolutions
    traceLine(" -" + i);
}
PGSQL::disconnect(); // if the plane hasn't crashed yet

The PGSQL package serves here for connecting to and querying a PostgreSQL database. For this example, the package exports three functions: PGSQL::connect, PGSQL::selectList and PGSQL::disconnect.

The executable module

CodeWorker expects a dynamic library, whose name is deduced from the package name and from the platform the interpreter is running on.
The short name of the dynamic library concatenates "cw" at the end of the package name. The extension of the dynamic library must be ".dll" under Microsoft Windows, and ".so" under Linux.

You must put the dynamic library at a place where CodeWorker will find it at runtime.
Microsoft Windows proceeds in the following order to locate the library:

Under Unix, a relative path for the shared object refers to the current directory (according to the man description of dlopen(3C)).

So, when CodeWorker reads #use PGSQL, it searches a dynamic library called "PGSQLcw.dll" under Windows or "PGSQLcw.so" under Linux.
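
The naming rule can be summarized by the following Python sketch. The function package_library_name and the platform keys "windows" and "linux" are hypothetical; only the "cw" suffix and the two extensions come from the text above.

```python
def package_library_name(package, platform):
    """Build the dynamic library file name expected for a package."""
    # Extensions stated in this section for the two supported platforms.
    extensions = {"windows": ".dll", "linux": ".so"}
    # "cw" is appended to the package name, then the platform extension.
    return package + "cw" + extensions[platform]


print(package_library_name("PGSQL", "windows"))  # PGSQLcw.dll
print(package_library_name("PGSQL", "linux"))    # PGSQLcw.so
```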

Building a package

This section is intended for those who want to build their own packages, for binding to a database or to a graphical library ... or just for gluing with their own libraries.

When the interpreter finds the preprocessor directive #use package-name in a script, it loads the executable module and executes the exported C-like function CW4DL_EXPORT_SYMBOL void package-name_Init(CW4dl::Interpreter*).

The preprocessor definition CW4DL_EXPORT_SYMBOL and the namespace CW4dl are both declared in the C++ header file "CW4dl.h". This header file is located in the "include" directory if you downloaded binaries, and at the root of the project if you downloaded sources.

The C-like function 'package-name_Init()' MUST be present! C-like means that it is declared extern "C" (done by CW4DL_EXPORT_SYMBOL).

Initializing the module that way is useful for registering new functions in the engine, via the function createCommand() of the interpreter (see the header file "CW4dl.h" in the declaration of the class Interpreter for learning more about it).

Every function to export must start its declaration with the preprocessor definition CW4DL_EXPORT_SYMBOL (means 'extern "C"', but a little more under Windows).

Every function returns const char*. The CodeWorker's keyword null designates an atypical tree node. It doesn't accept navigation and reference, only passing by parameter to a function. On the C++ side, this null tree node is seen as a null pointer of kind CW4dl::Tree*.

The interpreter CW4dl::Interpreter represents the runtime context of CodeWorker. It is the unavoidable intermediary between the module you are building and CodeWorker.
Use it for:

The #line directive forces the line counter of the script file being parsed to another number. The line just after the directive is assumed to have the number specified after #line.

20.1.3 Changing the syntax of the scripting language

The #syntax directive tells the preprocessor not to parse the following instructions as classical statements of the scripting language, but as conforming to another syntax. It allows adapting the syntax to what you are programming. The directive accepts the following form:
"#syntax" [parsing-mode [':' BNF-script-file]? | BNF-script-file]

How does it work? The piece of source code that doesn't conform to the syntax of the script language is put between the directives #syntax ... and #end syntax. If the trailing directive isn't found, the remainder of the script is considered as written in a foreign syntax. Be careful: to be recognized, the trailing directive must start at the beginning of the line, and no spaces are allowed between # and end.
At runtime, this piece of source code is parsed and processed via the BNF script file.

Note that it is possible to attach an identifier (called parsing-mode above) to a script file, and to specify later, in any other script, the parsing mode only; CodeWorker will find the corresponding BNF script file. This avoids handling the physical name of the BNF parsing file where a logical parsing-mode name is more convenient.

Example:

     // the first time, a parsing mode may be attached to the BNF script file
     #syntax shell:"TinyShell.cwp"
     ...
     #end syntax
     
     // at the second call, it isn't recommended to use the path of the parsing file
     // it is better to use the parsing mode registered previously
     #syntax shell
     ...
     #end syntax
     
     // here, I know that I'll call it once only, so I don't care about a parsing mode
     #syntax "MakeFile.cwp"
     ...
     #end syntax

where the parsing script "TinyShell.cwp" might look like:

      // file "GettingStarted/TinyShell.cwp":
      tinyShell ::=
              #ignore(C++)
              #continue
              [
                  #readIdentifier:sCommand
                  #ignore(blanks) #continue
                  command<sCommand>
              ]* #empty;
     
      //----------------------------//
      // commands of the tiny shell //
      //----------------------------//
      command<"copy"> ::=
              #continue parameter:sSource parameter:sDestination
              => {copyFile(sSource, sDestination);};
     
      command<"rmdir"> ::=
              #continue parameter:sDirectory
              => {removeDirectory(sDirectory);};
     
      command<"del"> ::=
              #continue parameter:sFile
              => {deleteFile(sFile);};
     
     
      //--------------------
      // Some useful clauses
      //--------------------
      parameter:value ::=
              #readCString:parameter
                  |
              #!ignore #continue [~[' ' | '\t' | '\r' | '\n']]+:parameter;

Of course, the parsing and the processing are implemented in the scripting language, so changing the syntax will be slower than keeping the default one. However, it allows writing code that is easy to maintain and to understand.

20.1.4 Managing changes in a multi-language generation

The directives #reference and #attach serve to notify you when a change has been made in a script that generates code in a given language, but has not been carried over to another language. For example, you are writing a framework both in C++ and JAVA. You are adding some new features in C++ or correcting some mistakes. One day, you must take care not to forget to update the JAVA generation. Thanks to these directives, a warning is produced until the changes have been made in the other script.

How does it work? Directives must delimit the piece of script you have changed:
"#reference" key
...
"#end" key

The key is an identifier that allows putting more than one reference area into a script file. A #reference area may enclose one or more #reference directives without ambiguity about boundaries. The directive must be put at the beginning of the line.

Here are the directives delimiting the piece of script that should be updated later in another file:
"#attach" reference-file ':' reference-key
...
"#end" reference-key

A #attach area might cover one or more #reference or #attach directives, as a #reference area. The directive must be put at the beginning of the line.

The first time CodeWorker encounters the reference script file, it computes a number that depends on the content of the area. The first time CodeWorker encounters an attached script file, it retrieves the magic number of the reference area, located by both the file name and the key of the reference. So, at the beginning, the reference and attached areas are considered similar. CodeWorker stores the magic number of the reference just after the #attach directive:
"#attach" reference-file ':' reference-key ',' reference-number

A script file that must be updated, so as to store the magic numbers for some attached areas, is rewritten at the end of the parsing, and only if no error was encountered. If the writefileHook() function (see writefileHook) is implemented, it is called, and the script file doesn't change if it returns false. If the script file is read-only, the corresponding readonlyHook() function is called (see readonlyHook). If it isn't possible to save the script file, an error is thrown.

When a change occurs in the reference area, the magic number is recomputed the next time CodeWorker encounters it. When an attached piece of script is encountered after the change, the old magic number of the reference is compared to the new one. If they aren't the same, a warning is displayed to notify that the attached area hasn't been updated yet.

Once the changes have been carried over into the attached area, the magic number of the reference must be removed by hand (don't forget the comma too!). The next time the interpreter encounters this attached area, it retrieves the magic number of the reference area again, and the reference area and the attached area are once more considered similar.

Of course, the use of these directives is quite constraining. However, it is the only way in CodeWorker to ensure that features and corrections have been carried over to all generated languages.
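
The life cycle of a magic number can be pictured with the following Python model. It is a sketch under stated assumptions: this manual does not say how CodeWorker computes the number, so CRC-32 is used here purely as a stand-in, and the function names are hypothetical.

```python
import zlib


def magic_number(area_text):
    # Assumption: CRC-32 stands in for CodeWorker's own computation,
    # which is not described in this manual.
    return zlib.crc32(area_text.encode("utf-8"))


def check_attached(reference_text, stored_number):
    """Return (number to keep after #attach, whether a warning is due)."""
    current = magic_number(reference_text)
    if stored_number is None:
        # No number after #attach: record the current one, no warning.
        return current, False
    # A stored number that differs means the reference area has changed
    # since the attached area was last synchronized.
    return stored_number, stored_number != current


stored, warn = check_attached("generate the C++ accessor", None)
print(warn)  # first encounter: areas considered similar
stored, warn = check_attached("generate the C++ accessor, now const", stored)
print(warn)  # reference changed: warning until the number is removed
```

Removing the stored number by hand corresponds to passing None again, which records the new number and silences the warning.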

20.2 Constant literals

CodeWorker handles all basic types as strings, and doesn't distinguish a double from a boolean or a date. A string literal is a sequence of characters from the source character set enclosed in double quotation marks (" "). String literals are used to represent a sequence of characters which, taken together, form a null-terminated string. The interpretation of the data depends on the context: the function increment(index) expects its argument index to contain a number, stored as a string.

A constant tree describes a tree as a list of constant trees and expressions, intended to be assigned to a variable. Example:

local aVariable = "a"{["yellow", "red":"or"{.alternative="orange"}], .vehicle="submarine"};

You'll find more information in the sub section Scope below.

20.3 Variables, declaration and assignment

Variables serve as containers for the data you use in scripts. The data type is a tree, which may be reduced to a single leaf node that contains only a value.

20.3.1 Declaring variables

It isn't necessary to declare a variable before using it for the first time. A variable that is assigned without being declared is understood as a new sub-node to be added to the current tree context. The current context is obtained by the read-only variable called this. It corresponds to the main parse tree, whose root name is project, when you are in the leader script, and to the variable passed by parameter when calling a parsing or pattern script.

The next table exposes all pre-defined variable names (accessible from anywhere) and their meaning:

Variable Name    Description
project The main parse tree, always present.
this It points to the current context variable.
_ARGS An array of all custom command-line arguments. Custom arguments are following the script file name or the switch -args on the command-line.
_REQUEST If the interpreter works as a CGI program, it stores all parameters of the request in an association table. The key is the parameter name, which is associated with the corresponding value.

A variable that is read without being declared returns an empty string, but doesn't cause the creation of a sub-node. The danger is that you aren't safe from a spelling mistake. To prevent it, put the option -varexist on the command line and use the function existVariable() to check whether a variable exists or not.

20.3.2 Scope

When you declare a local variable, it is valid for use within a specific area of code, called the scope. When the flow of execution leaves the scope, the content of the variable, a subtree specially allocated during its declaration, is deleted and disappears forever from the stack. A scope is delimited by a block.

To declare a variable on the stack, use the following declaration statement:
local-variable-statement ::= "local" local-variable-declaration ';'
local-variable-declaration ::= variable [ '=' assignment-expression ]?
assignment-expression ::= constant-tree | expression
constant-tree ::= [tree-value]? '{' [tree-array-or-attribute [',' tree-array-or-attribute]* ]? '}'
tree-value ::= expression
tree-array-or-attribute ::= tree-array | tree-attribute
tree-attribute ::= '.' attribute-name '=' assignment-expression
tree-array ::= '[' tree-array-item [',' tree-array-item]* ']'
tree-array-item ::= expression ':' assignment-expression | assignment-expression

An extension of the syntax allows the declaration of more than one variable in one shot. A comma separates the variable declarations:
local-variable-statement ::= "local" local-variable-declaration [ ',' local-variable-declaration ]* ';'

The local variable points to a new empty tree, pushed into the stack.

To assign a reference to another variable, instead of either the result of evaluating an expression or a constant tree, use rather the following declaration statement:
local-ref-statement ::= "localref" local-ref-declaration [ ',' local-ref-declaration ]* ';'
local-ref-declaration ::= variable '=' reference

In CodeWorker versions strictly older than 1.13, local variables declared in the body of a script or in the scope of a function could be accessed further down, in the scope of called functions, during their lifetime. So a different behaviour may occur with a more recent CodeWorker interpreter.

This stack management had historical reasons, but it is now obsolete and often reflects an implementation error. To protect you from this kind of mistake, a warning may be displayed, so that scripts written for a version strictly older than 1.13 may continue to run. Specify a version strictly older than 1.13 on the command line (option -version) to request that CodeWorker checks and generates a warning.

To correct this kind of mistake in old scripts, the variable should be propagated in an argument for functions that refer to it.

To declare a global variable, use the global statement. The declaration of a global variable can be specified anywhere in scripts. The first time the declaration of a global variable is encountered, the interpreter registers it as accessible from any point into scripts. The second time the interpreter encounters a global declaration for the variable, the latter remains global but its content is cleared.
Note that if a local variable or an attribute of the current node (this) is identical to the name of an existing global variable, the global variable remains hidden while the flow of control hasn't left the scope that contains the homonym.

The global declaration statement looks like:
global-variable-statement ::= "global" global-variable-declaration [ ',' global-variable-declaration ]* ';'
global-variable-declaration ::= variable [ '=' assignment-expression ]?

20.3.3 Navigating along branches

It is possible to navigate along a branch of the subtree put into the variable. A branch points to a node of the subtree. The syntax looks generally like:
branch ::= variable ['.' sub-node]*

If the branch isn't known before runtime, it may be built during the execution.

Example: while parsing an XML file, each time an XML attribute is encountered, one creates the corresponding attribute into the parse tree. But the name of the attribute is discovered during the parsing. The directive #evaluateVariable(expression) allows doing it. expression is evaluated at runtime and provides a branch:

#evaluateVariable("a.b.c")

will resolve the path "a.b.c" at runtime and navigate from a to c.
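
To make the idea of resolving a branch at runtime concrete, here is a Python sketch that walks a dotted path through a tree of nodes. It is only a model: each node is represented as a dict with a "value" and a "children" table, which is an assumption made for illustration, and evaluate_variable is a hypothetical helper rather than the directive itself. Unlike CodeWorker, this simplified model raises an error on a missing node.

```python
def make_node(value="", **children):
    """Build a tree node holding a string value and named sub-nodes."""
    return {"value": value, "children": dict(children)}


def evaluate_variable(scope, path):
    """Follow a dotted path such as "a.b.c" from a variable to a sub-node."""
    names = path.split(".")
    node = scope[names[0]]              # the variable at the head of the branch
    for name in names[1:]:
        node = node["children"][name]   # one step down per '.' in the path
    return node


# A variable 'a' whose sub-node 'b' has a sub-node 'c' worth "leaf".
scope = {"a": make_node(b=make_node(c=make_node("leaf")))}
print(evaluate_variable(scope, "a.b.c")["value"])  # leaf
```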

A node may contain an array of nodes, which are indexed by a key that is a constant string. A branch allows navigating through arrays, and the definitive syntax of branches conforms to:
branch ::= "#evaluateVariable" '(' expression ')'
                ::= variable ['.' sub-node | array-access]*
array-access ::= '[' expression ']'
                ::= '#' ["front" | "back" | "parent" | "root"]
                ::= '#' '[' integer-expression ']'

We see that there are some ways to access an item node of an array or to change how to navigate from nodes to nodes:

20.3.4 Assignments

CodeWorker provides several ways to put data into a variable or into the node pointed to by a branch:

20.4 Expressions

20.4.1 Presentation

The BNF representation of an expression looks like:
expression ::= boolean-expr | ternary-expr
boolean-expr ::= comparison-expr [boolean-op comparison-expr]
boolean-op ::= '&' | '&&' | '|' | '||' | '^' | '^^'
ternary-expr ::= comparison-expr '?' expression ':' expression
comparison-expr ::= concatenation-expr [comparison-op concatenation-expr | "in" constant-set]
constant-set ::= '{' constant-string [',' constant-string]* '}'
comparison-op ::= '<' | '<=' | '==' | '=' | '!=' | '<>' | '>' | '>='
concatenation-expr ::= stdliteral-expr ['+' stdliteral-expr]*
stdliteral-expr ::= literal-expr
                ::= '$' arithmetic-expr '$'
literal-expr ::= constant-string | number
                ::= "true" | "false"
                ::= '(' expression ')'
                ::= '!' literal-expr
                ::= preprocessor-expr
                ::= function-call
                ::= variable-or-branch

arithmetic-expr ::= comparith-expr [boolean-op comparith-expr]*
comparith-expr ::= sum-expr [comparison-op sum-expr]
sum-expr ::= shift-expr [['+' | '-'] shift-expr]*
shift-expr ::= factor-expr [["<<" | ">>"] factor-expr]*
factor-expr ::= literal-expr [['*' | '/' | '%'] literal-expr]*
unary-expr ::= literal-expr ["++" | "--"]
literal-expr ::= string | variable-expr | number | unary-expr
                ::= '~' literal-expr
preprocessor-expr ::= '#' ["LINE" | "FILE"]

where:

20.4.2 Arithmetic expressions

The classical syntax of the interpreter forces expressions to work on sequences of characters. So comparison operators apply lexicographical order, the '+' operator concatenates two strings, and the '*' operator doesn't exist.

Of course, there are functions that handle strings as numbers and execute an arithmetic operation (the 'add()' or 'mult()' functions for instance) or a comparison (the 'isPositive()' or 'inf()' functions for instance).

However, it is clearly more convenient to write arithmetic operations and comparisons in a natural way, using operators instead of the corresponding functions. So, CodeWorker provides an escape mode that draws its inspiration from LaTeX for expressing mathematical formulas: arithmetic expressions are delimited by the symbol '$'.

Example:


local a = 11;
local b = 7;
traceLine("Classical mode = '"
    + inf(add(mult(5, a), 3), sub(mult(a, a), mult(b, b))) + "'");
traceLine("Escape mode = '" + $5*a + 3 < a*a - b*b$ + "'");

Output:

Classical mode = 'true'
Escape mode = 'true'
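
The difference between comparing sequences of characters and comparing numbers is easy to see in a Python sketch, because Python string comparison is also lexicographical. The two helper functions are hypothetical and only model the two modes; they are not CodeWorker functions.

```python
def less_as_strings(left, right):
    """Model of the classical mode: lexicographical comparison."""
    return left < right


def less_as_numbers(left, right):
    """Model of a numeric comparison on values stored as strings."""
    return float(left) < float(right)


print(less_as_strings("9", "10"))  # False: the character '9' sorts after '1'
print(less_as_numbers("9", "10"))  # True

# The example above, with a = 11 and b = 7: 5*a + 3 = 58 and a*a - b*b = 72.
a, b = 11, 7
print(5 * a + 3 < a * a - b * b)   # True
```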

20.5 Common statements

20.5.1 The 'if' statement

The BNF representation of the if statement is:
if-statement ::= "if" expression then-statement ["else" else-statement]?

The if statement evaluates the expression following immediately. The expression must be of arithmetic, text, variable or condition type. In both forms of the if syntax, if the expression evaluates to a nonempty string, the statement dependent on the evaluation is executed; otherwise, it is skipped.

In the if...else syntax, the second statement is executed if the result of evaluating the expression is an empty string. The else clause of an if...else statement is associated with the closest previous if statement that does not have a corresponding else statement.
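
The rule that decides which branch runs can be modelled in Python as follows. This is a sketch of the rule stated above (any nonempty string selects the first statement, the empty string selects the else clause); is_true and run_if are hypothetical helpers written for this illustration.

```python
def is_true(value):
    """A condition holds when its string value is not empty."""
    return value != ""


def run_if(condition, then_branch, else_branch=None):
    """Run the branch selected by the string value of the condition."""
    if is_true(condition):
        return then_branch()
    if else_branch is not None:
        return else_branch()
    return None


print(run_if("yes", lambda: "then", lambda: "else"))  # then
print(run_if("", lambda: "then", lambda: "else"))     # else
# Under this rule the string "0" is nonempty, so it selects 'then'.
print(run_if("0", lambda: "then", lambda: "else"))    # then
```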

20.5.2 The 'while'/'do' statements

The BNF representation of the while statement is:
while_statement ::= "while" expression statement

The while statement lets you repeat a statement or compound statement until a specified expression becomes an empty string. The expression in a while statement is evaluated before the body of the loop is executed. Therefore, the body of the loop may never be executed. If expression returns an empty string, the while statement terminates and control passes to the next statement in the program. If expression is non-empty, the body is executed and the process is repeated. The while statement can also terminate when a break or return statement is executed within the statement body. When a continue statement is encountered, control jumps directly to the evaluation of the expression.

Note that the break and continue statements apply to the innermost enclosing loop statement (foreach/forfile/select, do/while), that is, the first loop they encounter when leaving the surrounding instruction blocks.

The BNF representation of the do statement is:
do_statement ::= "do" statement "while" expression ';'

The do-while statement lets you repeat a statement or compound statement until a specified expression becomes an empty string. The expression in a do-while statement is evaluated after the body of the loop is executed. Therefore, the body of the loop is always executed at least once. If expression returns an empty string, the do-while statement terminates and control passes to the next statement in the program. If expression is non-empty, the process is repeated. The do-while statement can also terminate when a break or return statement is executed within the statement body. When a continue statement is encountered, control is transferred to the evaluation of the expression.

20.5.3 The 'switch' statement

The BNF representation of this statement is:
switch_statement ::= "switch" '(' expression ')' '{' (label_declaration)* ("default" ':' statement)? '}'
label_declaration ::= ["case" | "start"] constant_string ':' statement

The switch statement allows selection among multiple sections of code, depending on the value of an expression. The expression enclosed in parentheses, the controlling expression, must be of string type.

The switch statement causes an unconditional jump to, into, or past the statement that is the switch body, depending on the value of the controlling expression, the constant string values of the case or start labels, and the presence or absence of a default label. The switch body is normally a compound statement (although this is not a syntactic requirement). Usually, some of the statements in the switch body are labeled with case labels or with start labels or with the default label. The default label can appear only once.

The constant-string in the case label is compared for equality with the controlling expression. The constant-string in the start label is compared for equality with the first characters of the controlling expression. In a given switch statement, no two constant strings in start or case statements can evaluate to the same value.

The behaviour of the switch statement depends on how the controlling expression matches the labels. If a case label exactly matches the controlling expression, control is transferred to the statement following that label. Otherwise, start labels are tried in lexicographical order, and control is transferred to the statement following the first label that matches the beginning of the controlling expression. If no start label matches either, control is transferred to the default statement or, if there is none, an error is thrown.

A switch statement can be nested. In such cases, case or start or default labels associate with the most deeply nested switch statements that enclose them.

Control is not impeded by case or start or default labels. To stop execution at the end of a part of the compound statement, insert a break statement. This transfers control to the statement after the switch statement.
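
The label-matching rules described above can be modelled in a few lines. This Python sketch is an illustration of the rules as stated in this section, not CodeWorker's actual implementation; the function name select_label and its parameters are hypothetical.

```python
def select_label(value, case_labels, start_labels, has_default):
    """Return the label a switch would jump to, following the stated rules."""
    # 1. An exact match with a case label wins.
    if value in case_labels:
        return ("case", value)
    # 2. Otherwise start labels are tried in lexicographical order; the
    #    first one that is a prefix of the controlling value is chosen.
    for label in sorted(start_labels):
        if value.startswith(label):
            return ("start", label)
    # 3. Otherwise use the default label, or raise an error if absent.
    if has_default:
        return ("default", None)
    raise ValueError("no label matches %r" % value)


print(select_label("prod", ["prod"], ["pro"], True))
print(select_label("product", ["prod"], ["pro", "p"], True))
print(select_label("xyz", ["prod"], ["pro"], True))
```

Under these rules, when two start labels both match (here "p" and "pro"), the one that comes first in lexicographical order is selected, so the shorter prefix is chosen.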

20.5.4 The 'foreach' statement

The BNF representation of this statement is:
foreach_statement ::= "foreach" iterator "in" [direction]?
                [sorted_declaration]? [cascading_declaration]? list-node body_statement
direction ::= "reverse"
sorted_declaration ::= "sorted" ["no_case"]? ["by_value"]?
cascading_declaration ::= "cascading" ["first" | "last"]?

A foreach statement iterates over all items of the list owned by the node list-node. The iterator refers to the current item of the list, and the body statement is executed on it.

Items are iterated either in order of insertion, or in alphabetical order if the option sorted is set. The sort operates on keys, unless the option by_value is set. The order is inverted if the option reverse is chosen. To ignore case, the option sorted must be followed by no_case. Otherwise, uppercase letters are considered smaller than any lowercase letter.

      // file "Documentation/ForeachSampleSorted.cws":
      local list;
      insert list["silverware"] = "tea spoon";
      insert list["Mountain"] = "Everest";
      insert list["SilverWare"] = "Tea Spoon";
      insert list["Boat"] = "Titanic";
      insert list["acrobat"] = "Circus";
     
      traceLine("Sorted list in a classical order:");
      foreach i in sorted list {
          traceLine("\t" + key(i));
      }
      traceLine("Note that uppercases are listed before lowercases." + endl());
     
      traceLine("Sorted list where the case is ignored:");
      foreach i in sorted no_case list {
          traceLine("\t" + key(i));
      }
     
      traceLine("Reverse sorted list:");
      foreach i in reverse sorted list {
          traceLine("\t" + key(i));
      }
     
      traceLine("Reverse sorted list where the case is ignored:");
      foreach i in reverse sorted no_case list {
          traceLine("\t" + key(i));
      }

Output:

Sorted list in a classical order:
    Boat
    Mountain
    SilverWare
    acrobat
    silverware
Note that uppercases are listed before lowercases.

Sorted list where the case is ignored:
    acrobat
    Boat
    Mountain
    SilverWare
    silverware
Reverse sorted list:
    silverware
    acrobat
    SilverWare
    Mountain
    Boat
Reverse sorted list where the case is ignored:
    silverware
    SilverWare
    Mountain
    Boat
    acrobat

Control flow within the body statement need not be sequential: break and return exit the loop definitively, and continue transfers control to the head of the foreach statement for the next iteration.

Option cascading allows propagating foreach on item nodes. The way it works is illustrated by an example:


    foreach i in cascading myObjectModeling.packages ...

At the beginning, i points to myObjectModeling.packages#front and the body is executed. Before iterating i to the next item, the foreach checks whether the item node myObjectModeling.packages#front owns attribute packages or not. If yes, it applies recursively foreach on myObjectModeling.packages#front.packages.

Option cascading avoids writing the following code:


function propagateOnPackages(myPackage : node) {
    foreach i in myPackage {
        // my code to apply on this package
        // recurse if the current item owns a 'packages' attribute
        if existVariable(i.packages)
            propagateOnPackages(i.packages);
    }
}
propagateOnPackages(myObjectModeling.packages);

Option cascading offers two behaviours:

20.5.5 The 'forfile' statement

The BNF representation of this statement is:
forfile_statement ::= "forfile" iterator "in" [sorted_declaration]? [cascading_declaration]? file-pattern body_statement
sorted_declaration ::= "sorted" ["no_case"]?
cascading_declaration ::= "cascading" ["first" | "last"]?

A forfile statement iterates over the names of all files that match the pattern file-pattern. The iterator refers to the current item of the list composed of the retained file names, and the body statement is executed on it. Note that the file pattern may begin with a path, which cannot contain wildcard characters ('*' and '?').

As with the foreach statement, items are iterated either in order of insertion, or in alphabetical order of keys if the option sorted is set. To ignore case, the option must be followed by no_case. Otherwise, uppercase letters are considered smaller than any lowercase letter.

Control flow within the body statement need not be sequential: break and return exit the loop definitively, and continue transfers control to the head of the forfile statement for the next iteration.

The option cascading allows propagating forfile on directories recursively. The way it works is illustrated by an example:

      // file "Documentation/ForfileSample.cws":
      local iIndex = 0;
      forfile i in cascading "*.html" {
          if $findString(i, "manual_") < 0$ &&
              $findString(i, "Bugs") < 0$ {
                  traceLine(i);
          }
          // if too long, stop the iteration
          if $iIndex > 15$ break;
          increment(iIndex);
      }

Output:

cs/DOTNET.html
cs/tests/data/MatchingTest/example.csv.html
Documentation/LastChanges.html
java/JAVAAPI.html
java/data/MatchingTest/example.csv.html
Scripts/Tutorial/GettingStarted/defaultDocumentation.html
WebSite/AllDownloads.html
WebSite/examples/basicInformation.html
WebSite/highlighting/basicInformation.html
WebSite/repository/highlighting.html
WebSite/repository/JEdit/Entity.java.cwt.html
WebSite/serewin/ExempleIllustre.html
WebSite/tutorials/DesignSpecificModeling/tutorial.html
WebSite/tutorials/DesignSpecificModeling/highlighting/demo.cws.html
WebSite/tutorials/overview/tinyDSL_spec.html
WebSite/tutorials/overview/scripts2HTML/CodeWorker_grammar.html

At the beginning, i points to the first HTML file of the current directory and the body is executed. Before iterating i to the next item, the forfile checks whether the directory of the current file owns subfolders or not. If yes, it applies recursively forfile on subfolders.

Option cascading offers two behaviours:

20.5.6 The 'select' statement

The BNF representation of this statement is:
select_statement ::= "select" iterator "in" [sorted_declaration]? node-motif body_statement
sorted_declaration ::= "sorted" first-key [',' other-key]*
first-key ::= branch
other-key ::= branch

A select statement iterates a list of nodes that match a motif expression. The iterator refers to the current item of the list composed of retained nodes, and the body statement is executed on it.

      // file "Documentation/SelectSample.cws":
      local a;
      pushItem a.b;
      pushItem a.b#back.c = "01";
      pushItem a.b#back.c = "02";
      pushItem a.b#back.c = "03";
      pushItem a.b;
      pushItem a.b#back.c = "11";
      pushItem a.b#back.c = "12";
      pushItem a.b#back.c = "13";
      pushItem a.b;
      pushItem a.b#back.c = "21";
      pushItem a.b#back.c = "22";
      pushItem a.b#back.c = "23";
      select i in a.b[].c[] {
          traceLine("i = "+ i);
      }

Output:

i = 01
i = 02
i = 03
i = 11
i = 12
i = 13
i = 21
i = 22
i = 23

As with the foreach statement, items are iterated either in order of insertion, or according to the sorting result if the option sorted is set.

Control flow within the body statement need not be sequential: break and return exit the loop definitively, and continue transfers control to the head of the select statement for the next iteration.

20.5.7 The 'try'/'catch' statement

The BNF representation of this statement is:
try-catch-statement ::= "try" try-statement "catch" '('error_message_variable')' catch-statement

Error handling is implemented by using the try, catch, and error keywords. With error handling, your program can communicate unexpected events to a higher execution context that is better able to recover from such abnormal events. These errors are handled by code that is outside the normal flow of control.

The compound statement after the try clause is the guarded section of code. An error is thrown (or raised) when command error(message-text) is called or when CodeWorker encounters an internal error. The compound statement after the catch clause is the error handler, and catches (handles) the error thrown. The catch clause statement indicates the name of the variable that must receive the error message.

20.5.8 The 'exit' statement

The BNF representation of this statement is:
exit_statement ::= "exit" integer-expression ";"

An exit statement leaves the application and returns an error code, given by the integer-expression.

Example:

exit -1;

20.6 User-defined functions

The BNF representation of a user-defined function to implement is:
user-function ::= classical-function-definition | template-function-definition
classical-function-definition ::= classical-function-prototype compound-statement
classical-function-prototype ::= "function" function-name '(' parameters ')'
template-function-definition ::= see the section on template functions for more information
parameters ::= parameter [',' parameter]*
parameter ::= argument [':' parameter-mode [':' default-value]? ]?
parameter-mode ::= "value" | "node" | "reference" | "index"
default-value ::= "project" | "this" | "null" | "true" | "false" | constant-string

The scripting language allows users to implement their own functions. Parameters may be passed to the body of the function. A value may be returned by the function and, if so, the return type is necessarily a sequence of characters. Functions manage their own stack, and so accept recursive calls.

An argument may have a default value, used if the parameter is missing in a call. All following arguments must then have default values too. A node argument can't have a constant string as its default value, but it may default to a global variable.

20.6.1 Parameters and return value

Arguments passed by parameter must be chosen among the following modes:

If you omit to return a value from a function, it returns an empty string; in that case, you presumably intend to call this function as a procedure, without using the result. The special procedure nop takes a function call as parameter and allows executing the function and ignoring the result. It isn't compulsory to use nop for calling a function as a procedure. As in C or C++, you can type the function call followed by a semicolon and the result is lost.

There are two ways to return a value:

If you wish to execute a particular process in any case before leaving a function and:

20.6.2 The 'finally' statement

The finally statement guarantees that the block of instructions following the keyword is always executed before leaving the function. This declaration may be placed anywhere in the body of the function. Its syntax conforms to:
finally-statement ::= "finally" compound-statement

Example:

      // file "Documentation/FinallySample.cws":
      1 function f(v : value) {
      2     traceLine("BEGIN f(v)");
      3     finally {
      4         traceLine("END f(v)");
      5     }
      6     // the body of the function, with more than
      7     // one way to exit the function, for example:
      8     if !v return "empty";
      9     if v == "1" return "first";
      10     if v == "2" return "second";
      11     if v == "3" return "third";
      12     return "other";
      13 }
      14
      15 traceLine("...f(1) has been executed and returned '" + f(1) + "'");

line 3: the finally statement may be placed anywhere in the body,
line 4: this statement will be executed while exiting the function, even if an exception was raised.

Output:

BEGIN f(v)
END f(v)
...f(1) has been executed and returned 'first'
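
For readers more familiar with other languages, the closest analogy is a try/finally block wrapped around the whole function body. The Python sketch below is only that analogy, under the assumption that the behaviour matches the description above; unlike CodeWorker's finally, which is a declaration placed anywhere in the body, Python's form must enclose the guarded code. The trace list is a hypothetical stand-in for traceLine.

```python
def f(v, trace):
    trace.append("BEGIN f(v)")
    try:
        # Several ways to leave the function, as in the example above.
        if not v:
            return "empty"
        if v == "1":
            return "first"
        if v == "2":
            return "second"
        if v == "3":
            return "third"
        return "other"
    finally:
        # Runs on every exit path, including when an exception propagates.
        trace.append("END f(v)")


trace = []
result = f("1", trace)
trace.append("...f(1) has been executed and returned '" + result + "'")
print("\n".join(trace))
```

The trace lines appear in the same order as in the output above: the END line is recorded before the caller uses the returned value.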

20.6.3 Unusual function declarations

It may happen that a function prototype must be declared before being implemented, because of a cross-reference with another function for instance. The scripting language offers the forward declaration to answer this need. To do that, the prototype of the function is written, preceded by the declare keyword:
forward-declaration ::= "declare" function-prototype ';'

If the body of the function must be implemented in another library, in C++ for example, the prototype of the function is preceded by the external keyword (see section C++ binding):
external-declaration ::= "external" function-prototype ';'

20.6.4 Template functions

CodeWorker offers a special category of functions called template functions. Because CodeWorker doesn't provide a typed scripting language, template must not be understood in the sense commonly used in C++, for instance.

A template function represents a set of functions with the same prototype, except for the dispatching constant. The dispatching constant is a constant string that extends the name of the function. These functions instantiate the template function for a particular dispatching constant. Each instantiated function implements its own body.

The BNF representation of a template function to implement is:
template-function-definition ::= instantiated-function-definition | generic-function-definition
instantiated-function-definition ::= instantiated-function-prototype compound-statement
instantiated-function-prototype ::= "function" function-name '<' dispatching-constant '>' '(' parameters ')'
dispatching-constant ::= a constant string between double quotes
generic-function-definition ::= generic-function-prototype [compound-statement | template-based-body]
generic-function-prototype ::= "function" function-name '<' generic-key '>' '(' parameters ')'
generic-key ::= an identifier that matches any dispatching constant with no attached prototype
template-based-body ::= "{{" template-based-script "}}"
template-based-script ::= a piece of template-based script describing the generic implementation

A call to a template function must provide a dispatching expression to determine the dispatching constant. The dispatching expression is evaluated during execution, and CodeWorker resolves which instantiated function of this template to call: the result of the dispatching expression must match the dispatching constant of the instantiated function. The BNF representation of a call to a template function is:
instantiated-function-call ::= function-name '<' dispatching-expression '>' '(' parameters ')'
parameters ::= expression [',' expression]*

Note that a dispatching constant may be empty and such an instantiated function can be called as a classical function. In fact, classical functions are considered as instantiated functions where the dispatching constant is empty.

Template functions bring generic programming to the language. Imagine that we need a function getType(myType : node) with a variant for every language we might have to generate (C++, Java, ...). Normally, you would write the following lines to obtain the type depending on the language for which you are producing the source code:


if doc_language == "C++" {
    sType = getCppType(myParameterType);
} else if doc_language == "JAVA" {
    sType = getJAVAType(myParameterType);
} else {
    error("unrecognized language '" + doc_language + "'");
}

Thanks to template functions, you may replace the preceding lines with the following one:


sType = getType<doc_language>(myParameterType);

with:


function getType<"JAVA">(myType : node) {
    ... // implementation for returning a Java type
}

function getType<"C++">(myType : node) {
    ... // implementation for returning a C++ type
}

During the execution, the function getType<T>(myType : node) resolves on what instantiated function it has to dispatch: either getType<"JAVA">(myType : node) or getType<"C++">(myType : node), depending on what value is assigned to variable doc_language.

Trying to call an instantiated function that doesn't exist raises an error at runtime. However, a default implementation can be provided through a generic key. For instance:


function getType<T>(myType : node) {
    ... // common implementation for any unrecognized language
}
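
The dispatch mechanism can be pictured as a lookup table keyed by dispatching constant, with the generic function as a fallback. The Python sketch below models that idea only; it is not how CodeWorker is implemented, and the names instances, generic and get_type, as well as the returned strings, are hypothetical placeholders.

```python
# Instantiated functions, keyed by their dispatching constant.
instances = {
    "JAVA": lambda my_type: "Java type for " + my_type,
    "C++": lambda my_type: "C++ type for " + my_type,
}


def generic(key, my_type):
    # Plays the role of getType<T>: used for any unrecognized constant.
    return "default type for " + my_type + " in " + key


def get_type(dispatch_value, my_type):
    # The dispatching expression is evaluated at run time, then matched
    # against the dispatching constants of the instantiated functions.
    instance = instances.get(dispatch_value)
    if instance is not None:
        return instance(my_type)
    return generic(dispatch_value, my_type)


print(get_type("JAVA", "string"))
print(get_type("Eiffel", "string"))
```

Without the generic fallback, the second call would have no function to dispatch to, which corresponds to the runtime error mentioned above.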

For those who know generic programming with C++ templates, here is a classical example of using template functions:


function f<1>() { return 1; }
function f<N>() { return $N*f<$N - 1$>()$; }
local f10 = f<10>();
if $f10 != 3628800$ error("10! should be worth 3628800");
traceLine("10! = " + f10);

Output:

10! = 3628800

To provide more flexibility in the implementation of the template function, depending on the generic key <T>, the body admits a template-based script to implement the source code of the function. The specialization of the function for a given template instantiation key is then resolved at runtime.

Example:
The template function f inserts a new attribute in a tree node. The attribute has the name passed to the generic key for instantiation, and the value of the instantiation key is assigned to the new attribute. Then, the function calls itself recursively on the instantiation key without the last character.
For instance, the source code of f<"field"> should be:

function f<"field">(x : node) {
      insert x.field = "field";
      f<"fiel">(x); // cut the last character
}

Code:

//a synonym of f<"">(x : node), terminal condition for recursive calls
function f(x : node) {/*does nothing*/}

function f<T>(x : node) {{
      // '{{' announces a template-based script, which
      // will generate the correct implementation during the instantiation
      insert x.@T@ = "@T@";
      f<"@T.rsubString(1)@">(x);
@
      // '}}' announces the end of the template-based script
}}

f<"field">(project);
traceObject(project);

Output:

Tracing variable 'project':
      field = "field"
      fiel = "fiel"
      fie = "fie"
      fi = "fi"
      f = "f"
End of variable's trace 'project'.

20.6.5 Methods

For more readability, syntactical facilities are offered to call functions on a node as if this function was a method of the node. For example, it is possible to call function leftString on the node a like this: a.leftString(2), instead of the classical functional form: leftString(a, 2).

The rule is that every function (user-defined included) whose first argument is passed either by value or by node or by index (but never by reference) can propose a method call.

In that case, the method call applies on the first argument, which has to be a node. The BNF representation of a method call is:
method-call ::= variable '.' function-name '(' parameters ')'
parameters ::= expression [',' expression]*
where parameters omits the first argument of the function called function-name.

There are some exceptions where the method doesn't apply to the first argument:

The following methods offer a synonym to the function name:

20.6.6 The 'readonly' hook

The BNF representation of this statement is:
readonlyHook-statement ::= "readonlyHook" '(' filename ')' compound-statement

The token filename is the argument name that the user chooses for passing the name of the file to the body of the hook.

This special function allows implementing a hook that is called each time a read-only file is encountered while generating the output file through the generate or expand instruction.

Limitations: only one declaration of this hook is authorized, and it can't be declared inside a parsing or pattern script.

Example:

Common usage: file to generate has to be checked out from a source code control system (see system command to run executables).

readonlyHook(sFilename) {
  if !getProperty("SSProjectFolder") || !getProperty("SSWorkingFolder") || !getProperty("SSExecutablePath") || !getProperty("SSArchiveDir") {
    traceLine("WARNING: properties 'SSProjectFolder' and 'SSWorkingFolder' and 'SSExecutablePath' and 'SSArchiveDir' should be passed to the command line for checking out read-only files from Source Safe");
  } else {
    if startString(sFilename, getProperty("SSWorkingFolder")) {
      local sourceSafe;
      insert sourceSafe.fileName = sFilename;
      generate("SourceSafe.cwt", sourceSafe, getEnv("TMP") + "/SourceSafe.bat");
      if sourceSafe.isOk {
        putEnv("SSDIR", getProperty("SSArchiveDir"));
        traceLine("checking out '" + sFilename + "' from Source Safe archive '" + getProperty("SSArchiveDir") + "'");
        local sFailed = system(getEnv("TMP") + "/SourceSafe.bat");
        if sFailed {
          traceLine("Check out failed: '" + sFailed + "'");
        }
      }
    } else {
      traceLine("Unable to check out '" + sFilename + "': working folder starting with '" + getProperty("SSWorkingFolder") + "' expected");
    }
  }
}

20.6.7 The 'write file' hook

This special function allows implementing a hook that will be called just before writing a file, after ending a text generation process such as expanding or generating or translating text.

It is very important to notice that it returns a boolean value. A true value means that the generated text must be written into the file. A false boolean value means that the generated text doesn't have to be written into the file.

CodeWorker always interprets a function that doesn't explicitly return a value as returning an empty string. If you forget to return a value, the generated text will not be written into the file!

The BNF representation of this statement is:
writefileHook-statement ::= "writefileHook" '(' filename ',' position ',' creation ')' compound-statement

Argument    Type      Description
filename    string    The argument name that the user chooses for passing the file name to the body of the hook.
position    int       The argument name that the user chooses for passing the position where a difference occurs between the newly generated version of the file and the previous one. If the files don't have the same size, the position is worth -1.
creation    boolean   The argument name that the user chooses for passing whether the file is created or updated. The argument is worth true if the file doesn't exist yet.

Limitations: only one declaration of this hook is authorized, and it can't be declared inside a parsing or pattern script.

Example:

writefileHook(sFilename, iPosition, bCreation) {
    if bCreation {
        traceLine("Creating file '" + sFilename + "'!");
    } else {
        traceLine("Updating file '" + sFilename + "', difference at " + iPosition + "!");
    }
    return true;
}
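
The pitfall of a missing return value can be illustrated with a small model. This Python sketch assumes only what this section states (a missing return value is an empty string, and an empty string means false); the names must_write, forgetful_hook and explicit_hook are hypothetical, and the code is not CodeWorker's implementation.

```python
def must_write(hook, filename, position, creation):
    """Decide whether the generated text is written, as described above."""
    result = hook(filename, position, creation)
    # A hook that returns nothing is treated as returning an empty string.
    if result is None:
        result = ""
    # Only a non-empty result counts as true.
    return result != ""


def forgetful_hook(filename, position, creation):
    pass  # no return statement: the file will not be written


def explicit_hook(filename, position, creation):
    return "true"


print(must_write(forgetful_hook, "a.txt", -1, True))
print(must_write(explicit_hook, "a.txt", -1, True))
```

The first hook silently suppresses the write, which is why the example above ends with an explicit return true.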

20.6.8 The 'step into' hook

This special function is automatically called before the extended BNF engine resolves the production rule of a BNF non-terminal. Combined with stepoutHook(), it is very useful for trace and debug tasks.

This hook can be implemented in parse scripts only.

The BNF representation of this statement is:
stepintoHook-statement ::= "stepintoHook" '(' sClauseName ',' localScope ')' compound-statement

Argument      Type      Description
sClauseName   string    The name of the non-terminal.
localScope    tree      The scope of parameters used in the production rule.

20.6.9 The 'step out' hook

This special function is automatically called once the extended BNF engine has finished the resolution of a BNF non-terminal. Combined with stepintoHook(), it is very useful for trace and debug tasks.

This hook can be implemented in parse scripts only.

The BNF representation of this statement is:
stepoutHook-statement ::= "stepoutHook" '(' sClauseName ',' localScope ',' bSuccess ')' compound-statement

Argument      Type      Description
sClauseName   string    The name of the non-terminal.
localScope    tree      The scope of local variables and parameters used in the production rule.
bSuccess      boolean   Whether the resolution of the production rule has succeeded or not.

20.7 Statement's modifiers

A statement's modifier is a directive that stands just before a statement, meaning an instruction or a compound statement.

This directive performs some actions within the scope of the statement and then restores the previous behaviour.

This action may be:

20.7.1 Statement's modifier 'delay'

This keyword stands just before an instruction or a compound statement. It executes the statement and then, it measures the time it has consumed.

The function getLastDelay() returns the last measured duration.

Example:


local list;
local iIndex = 4;
delay while isPositive(decrement(iIndex)) {
    pushItem list = "element " + iIndex;
    traceLine("creating node '" + list#back + "'");
}
traceLine("time of execution = " + getLastDelay() + " seconds");

Output:

creating node 'element 3'
creating node 'element 2'
creating node 'element 1'
time of execution = 0.000037079177335661762 seconds

20.7.2 Statement modifier 'quiet'

This keyword stands just before an instruction or a compound statement. It executes the statement, and all messages intended for the console are concatenated into a string instead of being displayed. The variable that receives the concatenation of messages is specified after the quiet keyword.

The BNF representation of the quiet statement modifier looks like:
quiet_modifier ::= "quiet" '(' variable ')' statement

Note that the variable must have been declared before, as a local one or as an attribute of the parse tree. If this variable doesn't exist while executing the statement, an error is raised.
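
As an analogy for readers coming from other languages, the effect resembles redirecting standard output into a string buffer for the duration of a block. The Python sketch below uses the standard library's contextlib.redirect_stdout to show the idea; it is not CodeWorker code, and the variable names are illustrative.

```python
import contextlib
import io

# The buffer plays the role of the variable passed to 'quiet'.
buffer = io.StringIO()

with contextlib.redirect_stdout(buffer):
    # Nothing printed here reaches the console.
    print("creating node 'element 1'")
    print("creating node 'element 2'")

# Once the block is left, console output behaves as before and the
# captured messages are available as a single string.
messages = buffer.getvalue()
print(repr(messages))
```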

20.7.3 Statement modifier 'new project'

This keyword stands just before an instruction or a compound statement. A new, empty project parse tree is created and temporarily replaces the current one. The statement is executed and, once control leaves the statement, the temporary parse tree is removed and the previous project becomes the current one again.

The BNF representation of the new_project statement modifier looks like:
new_project_modifier ::= "new_project" statement

This statement modifier is useful to handle a task that doesn't have to interact with the main parse tree.

20.7.4 Statement modifier 'file as standard input'

This keyword stands just before an instruction or a compound statement. A new standard input is opened for reading data. Generally, the keyboard is the standard input, but here, it will be the content of the file whose name is passed to the argument filename. Once the execution of the statement has completed, the previous standard input is restored.

The BNF representation of the file_as_standard_input statement's modifier looks like:
file_as_standard_input_modifier ::= "file_as_standard_input" '(' filename ')' statement

This statement modifier is useful to replay a sequence of commands for the debugger or to drive the standard input from an external module that puts its instructions into a file for a batch mode or anything else.

20.7.5 Statement modifier 'string as standard input'

This keyword stands just before an instruction or a compound statement. A new standard input is opened for reading data. Generally, the keyboard is the standard input, but here, it will be the content of the string that is passed as argument. Once the execution of the statement has completed, the previous standard input is restored.

The BNF representation of the string_as_standard_input statement's modifier looks like:
string_as_standard_input_modifier ::= "string_as_standard_input" '(' expression ')' statement

The standard input is the result of evaluating expression.

This statement modifier is useful to drive the standard input of CodeWorker from an external module, such as a JNI library or an external C++ application (see chapter external bindings).

20.7.6 Statement modifier 'parsed file'

This keyword stands just before an instruction or a compound statement that belongs to a parsing/translation script exclusively. A new input file is opened for source scanning, and temporarily replaces the previous one during the execution of the statement. The statement is executed and, once control leaves the statement, the input file is closed properly and the previous one is restored.

The BNF representation of the parsed_file statement modifier looks like:
parsed_file_modifier ::= "parsed_file" '(' filename ')' statement

The token filename is an expression that is evaluated to give the name of the input file.

This statement modifier is useful to handle a task that must redirect the text to parse into another input file. An example could be to emulate the C++ preprocessing on #include directives.

20.7.7 Statement modifier 'parsed string'

This keyword stands just before an instruction or a compound statement that belongs to a parsing/translation script exclusively. The result of an expression is taken as the source to scan, and temporarily replaces the previous input during the execution of the statement. The statement is executed and, once the controlling sequence leaves the statement, the previous input comes back.

The BNF representation of the parsed_string statement modifier looks like:
parsed_string_modifier ::= "parsed_string" '(' expression ')' statement

The token expression is an expression that is evaluated to give the text to scan.

This statement modifier is useful to handle a task that must temporarily parse a string.

20.7.8 Statement modifier 'generated file'

This keyword stands just before an instruction or a compound statement that belongs to a pattern script exclusively. A new output file is opened for source code generation, preserving protected areas as usual, and temporarily replaces the current one during the execution of the statement. The statement is executed and, once the controlling sequence leaves the statement, the output file is closed properly and the previous one takes its place.

The BNF representation of the generated_file statement modifier looks like:
generated_file_modifier ::= "generated_file" '(' filename ')' statement

The token filename is an expression that is evaluated to give the name of the output file.

This statement modifier is useful when a task must redirect the generated text to another output file. An example is splitting generated HTML text across a few files that implement a frame set.

20.7.9 Statement modifier 'generated string'

This keyword stands just before an instruction or a compound statement that belongs to a pattern script exclusively. The output stream is redirected into a variable that temporarily replaces the current output stream during the execution of the statement. The statement is executed and, once the controlling sequence leaves the statement, the variable is populated with the content of the output produced during this scope and the previous output stream takes its place.

The BNF representation of the generated_string statement modifier looks like:
generated_string_modifier ::= "generated_string" '(' variable ')' statement

The variable argument gives the name of the variable that will be populated with the generated text. This variable must already exist, declared on the stack or referring to a node of the current parse tree.
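
The capture of generated text into an existing variable can be pictured with the following Python sketch. It is an analogy, not CodeWorker code: the dictionary variables stands in for the stack or the parse tree, and the helper generated_string only borrows the modifier's name.

```python
import contextlib
import io


@contextlib.contextmanager
def generated_string(variables, name):
    """Capture everything printed inside the block into variables[name]."""
    if name not in variables:
        # Mirrors the requirement that the variable must already exist.
        raise KeyError(f"variable {name!r} must exist before redirecting output")
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            yield
    finally:
        # The variable is populated once control leaves the block, and the
        # previous output stream has been restored by redirect_stdout.
        variables[name] = buffer.getvalue()


variables = {"html": ""}
with generated_string(variables, "html"):
    print("<p>generated text</p>")
```

After the block, variables["html"] holds the text produced inside it, and later output goes to the previous stream again.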

20.7.10 Statement modifier 'appended file'

This keyword stands just before an instruction or a compound statement that belongs to a pattern script exclusively. A new output file is opened for appending generated source code at the end of the file, and temporarily replaces the current one during the execution of the statement. The statement is executed and, once the controlling sequence leaves the statement, the output file is closed properly and the previous one takes its place.

The BNF representation of the appended_file statement modifier looks like:
appended_file_modifier ::= "appended_file" '(' filename ')' statement

The token filename is an expression that is evaluated to give the name of the output file to append to.

21 Common functions and procedures

All functions and procedures that are described below may be encountered in any kind of script: parsing, source code generation and file expanding, process driving, included script files.

Category interpreter: Function for running a CodeWorker script
autoexpand Expands a file on markups, following the directives self-contained in the file.
executeString Executes a script given in a string.
executeStringQuiet Interprets a string as a script and returns all traces intended to the console.
expand Expands a file on markups, following the directives of a template-based script.
extendExecutedScript Extends the currently executed script dynamically with the content of the string.
generate Generates a file, following the directives of a template-based script.
generateString Generates a string, following the directives of a template-based script.
parseAsBNF Parses a file with a BNF script.
parseFree Parses a file with an imperative script.
parseFreeQuiet Parses a file with an imperative script, reroutes all console messages and returns them as a string.
parseStringAsBNF Parses a string with a BNF script.
traceEngine Displays the state of the interpreter.
translate Performs a source-to-source translation or a program transformation.
translateString Performs a source-to-source translation or a program transformation on strings.

Category string: Functions for handling strings
charAt Returns the character present at a given position of a string.
completeLeftSpaces Completes a string with spaces to the left so that it reaches a given size.
completeRightSpaces Completes a string with spaces to the right so that it reaches a given size.
composeAdaLikeString Converts a sequence of characters to an Ada-like string without double quote delimiters.
composeCLikeString Converts a sequence of characters to a C-like string without double quote delimiters.
composeHTMLLikeString Converts a sequence of characters to an HTML-like text.
composeSQLLikeString Converts a sequence of characters to a SQL-like string without single quote delimiters.
coreString Extracts the core of a string, leaving the beginning and the end.
countStringOccurences Counts the occurrences of a string in another.
cutString Cuts a string at each separator encountered.
endString Compares the end of the string.
endl Returns an end-of-line, depending on the operating system.
equalsIgnoreCase Compares two strings, ignoring the case.
executeString Executes a script given in a string.
executeStringQuiet Interprets a string as a script and returns all traces intended to the console.
findFirstChar Returns the position of the first character amongst a set, encountered in a string.
findLastString Returns the position of the last occurrence of a string in another.
findNextString Returns the next occurrence of a string in another.
findString Returns the first occurrence of a string in another.
generateString Generates a string, following the directives of a template-based script.
joinStrings Joins a list of strings, adding a separator between them.
leftString Returns the beginning of a string.
lengthString Returns the length of a string.
midString Returns a substring starting at a point for a given length.
parseStringAsBNF Parses a string with a BNF script.
repeatString Returns the concatenation of a string repeated a few times.
replaceString Replaces a substring with another.
replaceTabulations Replaces tabulations with spaces.
rightString Returns the end of a string.
rsubString Returns the left part of a string, ignoring last characters.
startString Checks the beginning of a string.
subString Returns a substring, ignoring the first characters.
toLowerString Converts a string to lowercase.
toUpperString Converts a string to uppercase.
trim Eliminates leading and trailing whitespaces.
trimLeft Eliminates the leading whitespaces.
trimRight Eliminates the trailing whitespaces.
truncateAfterString Special truncation of a string.
truncateBeforeString Special truncation of a string.

Category array: Functions handling arrays
findElement Checks the existence of an entry key in an array.
findFirstSubstringIntoKeys Returns the first entry key of an array, containing a given string.
findNextSubstringIntoKeys Returns the next entry key of an array, containing a given string.
getArraySize Returns the number of items in an array.
insertElementAt Inserts a new element to a list, at a given position.
invertArray Inverts the order of items in an array.
isEmpty Checks whether a node has items or not.
removeAllElements Removes all items of the array.
removeElement Removes an item, given its entry key.
removeFirstElement Removes the first item of the array.
removeLastElement Removes the last item of the array.

Category node: Functions handling a node
clearVariable Removes the subtree and assigns an empty value.
equalTrees Compares two subtrees.
existVariable Checks the existence of a node.
getVariableAttributes Extracts all attribute names of a tree node.
removeRecursive Removes a given attribute from the subtree.
removeVariable Removes a given variable.
slideNodeContent Moves the subtree elsewhere on a branch.
sortArray Sorts an array, considering the entry keys.

Category iterator: Functions handling an iterator
createIterator Creates an iterator pointing to the beginning of a list.
createReverseIterator Creates a reverse iterator pointing to the end of a list.
duplicateIterator Duplicates an iterator.
first Returns true if the iterator points to the first item.
index Returns the position of an item in a list.
key Returns the entry key of the item pointed to by the iterator.
last Returns true if the iterator points to the last item.
next Moves an iterator to the next item of a list.
prec Moves an iterator to the previous item of a list.

Category file: Functions handling files
appendFile Writes the content of a string to the end of a file.
canonizePath Builds an absolute path, starting from the current directory.
changeFileTime Changes the access and modification times of a file.
chmod Changes the permissions of a file.
copyFile Copies a file.
copyGenerableFile Copies a file with protected areas or expandable markups, only if the hand-typed code differs between source and destination.
copySmartFile Copies a file only if the destination differs.
createVirtualFile Creates a transient file in memory.
createVirtualTemporaryFile Creates a transient file in memory, CodeWorker choosing its name.
deleteFile Deletes a file on the disk.
deleteVirtualFile Deletes a transient file from memory.
existFile Checks the existence of a file.
existVirtualFile Checks the existence of a transient file, created in memory.
exploreDirectory Browses all files of a directory, recursively or not.
fileCreation Returns the creation date of a file.
fileLastAccess Returns the last access date of a file.
fileLastModification Returns the last modification date of a file.
fileLines Returns the number of lines in a file.
fileMode Returns the permissions of a file.
fileSize Returns the size of a file.
getGenerationHeader Returns the comment to put into the header of generated files.
getShortFilename Returns the short name of a file.
indentFile Indents a file, depending on the target language.
loadBinaryFile Loads a binary file and stores each byte in a hexadecimal representation of 2 digits.
loadFile Returns the content of a file or raises an error if not found.
loadVirtualFile Returns the content of a transient file or raises an error if not found.
pathFromPackage Converts a package path to a directory path.
relativePath Returns the relative path, which allows going from a path to another.
resolveFilePath Gives the location of a file with no ambiguity.
saveBinaryToFile Saves binary data to a file.
saveToFile Saves the content of a string to a file.
scanDirectories Explores a directory, filtering filenames.
scanFiles Returns a flat list of all filenames matching with a filter.

Category directory: Functions handling directories
changeDirectory Changes the current directory (chdir() in C).
copySmartDirectory Copies files of a directory recursively only when destination files differ from source files.
createDirectory Creates a new directory.
existDirectory Checks the existence of a directory.
exploreDirectory Browses all files of a directory, recursively or not.
getCurrentDirectory Returns the current directory (getcwd() in C).
removeDirectory Removes a directory from the disk.
scanDirectories Explores a directory, filtering filenames.
scanFiles Returns a flat list of all filenames matching with a filter.

Category URL: Functions working on URL transfers (HTTP,...)
decodeURL Decodes an HTTP URL.
encodeURL Encodes a URL to HTTP.
getHTTPRequest Sends an HTTP GET request.
postHTTPRequest Sends an HTTP POST request.
sendHTTPRequest Sends an HTTP request.

Category datetime: Functions handling date-time
addToDate Changes a date by shifting its internal fields days/months/years or time.
compareDate Compares two dates.
completeDate Extends an incomplete date with today's characteristics.
fileCreation Returns the creation date of a file.
fileLastAccess Returns the last access date of a file.
fileLastModification Returns the last modification date of a file.
formatDate Changes the format of a date.
getLastDelay Returns the time consumed to execute a statement.
getNow Returns the current date-time.
setNow Fixes the current date-time.

Category numeric: Functions handling numbers
add Equivalent admitted writing is $a + b$.
ceil Returns the smallest integer greater than or equal to a number.
decrement Equivalent admitted writing is set a = $a - 1$;.
div Equivalent admitted writing is $a / b$.
equal Equivalent admitted writing is $a == b$.
exp Returns the exponential of a value.
floor Returns the largest integer less than or equal to a number.
increment Equivalent admitted writing is set a = $a + 1$;.
inf Equivalent admitted writing is $a < b$.
isNegative Equivalent admitted writing is $a < 0$.
isPositive Equivalent admitted writing is $a > 0$.
log Returns the Neperian logarithm.
mod Equivalent admitted writing is $a % b$.
mult Equivalent admitted writing is $a * b$.
pow Raises a number to the power of another.
sqrt Calculates the square root.
sub Equivalent admitted writing is $a - b$.
sup Equivalent admitted writing is $a > b$.

Category standard: Classical functions of any standard library
UUID Generates a UUID.
error Raises an error message.
inputKey If any, returns the last key pressed on the standard input.
inputLine Waits for a line typed on the standard input at the console.
isIdentifier Checks whether a string is a C-like identifier or not.
isNumeric Checks whether a string is a floating-point number or not.
randomInteger Generates a pseudorandom number.
randomSeed Changes the seed of the pseudorandom generator.
traceLine Displays a message to the console, adding a carriage return.
traceObject Displays the content of a node to the console.
traceStack Displays the stack to the console.
traceText Displays a message to the console.

Category conversion: Type conversion
byteToChar Converts a byte (hexadecimal representation of 2 digits) to a character.
bytesToLong Converts a 4-byte sequence to an unsigned long integer in its decimal representation.
bytesToShort Converts a 2-byte sequence to an unsigned short integer in its decimal representation.
charToByte Converts a character to a byte (hexadecimal representation of 2 digits).
charToInt Converts a character to the integer value of the corresponding ASCII code.
hexaToDecimal Converts a hexadecimal representation to an integer.
hostToNetworkLong Converts a 4-byte representation of a long integer to the network byte order.
hostToNetworkShort Converts a 2-byte representation of a short integer to the network byte order.
longToBytes Converts an unsigned long integer in decimal base to its 4-byte representation.
networkLongToHost Converts a 4-byte representation of a long integer to the host byte order.
networkShortToHost Converts a 2-byte representation of a short integer to the host byte order.
octalToDecimal Converts an octal representation to a decimal integer.
shortToBytes Converts an unsigned short integer in decimal base to its 2-byte representation.

Category system: Functions relative to the operating system
computeMD5 Computes the MD5 of a string.
environTable Equivalent of environ() in C.
existEnv Checks the existence of an environment variable.
getEnv Returns an environment variable, or raises an error if it does not exist.
openLogFile Opens a log file for logging every console trace.
putEnv Puts a value to an environment variable.
sleep Suspends the execution for millis milliseconds.
system Equivalent to the C function system().

Category command: Relative to the command line
compileToCpp Translates a script to C++.
getIncludePath Returns the include path passed via the option -I.
getProperty Returns the value of a property passed via the option -D.
getVersion Returns the version of the interpreter.
getWorkingPath Returns the output directory passed via option -path.
setIncludePath Changes the option -I while running.
setProperty Adds/changes a property (option -D) while running.
setVersion Gives the version of scripts currently interpreted by CodeWorker.
setWorkingPath Does the job of the option -path.

Category generation: Functions relative to generation
addGenerationTagsHandler Adds your own CodeWorker's tags handler.
autoexpand Expands a file on markups, following the directives self-contained in the file.
expand Expands a file on markups, following the directives of a template-based script.
extractGenerationHeader Gives the generation header of a generated file, if any.
generate Generates a file, following the directives of a template-based script.
generateString Generates a string, following the directives of a template-based script.
getCommentBegin Returns the current format of a comment's beginning.
getCommentEnd Returns the current format of a comment's end.
getGenerationHeader Returns the comment to put into the header of generated files.
getTextMode Returns the text mode amongst "DOS", "UNIX" and "BINARY".
getWriteMode Returns how text is written during a generation (insert/overwrite).
listAllGeneratedFiles Gives the list of all generated files.
removeGenerationTagsHandler Removes a custom generation tags handler.
selectGenerationTagsHandler Selects your own CodeWorker's tags handler for processing generation tasks.
setCommentBegin Changes what a beginning of comment looks like, perhaps before expanding a file.
setCommentEnd Changes what an end of comment looks like, perhaps before expanding a file.
setGenerationHeader Specifies a comment to put at the beginning of every generated file.
setTextMode Changes the text mode to "DOS", "UNIX" or "BINARY".
setWriteMode Selects how to write text during a generation (insert/overwrite).
translate Performs a source-to-source translation or a program transformation.
translateString Performs a source-to-source translation or a program transformation on strings.

Category parsing: Functions relative to scanning/parsing
parseAsBNF Parses a file with a BNF script.
parseFree Parses a file with an imperative script.
parseFreeQuiet Parses a file with an imperative script, reroutes all console messages and returns them as a string.
parseStringAsBNF Parses a string with a BNF script.
translate Performs a source-to-source translation or a program transformation.
translateString Performs a source-to-source translation or a program transformation on strings.

Category socket: Socket operations
acceptSocket Listens for a client connection and accepts it.
closeSocket Closes a socket descriptor.
createINETClientSocket Creates a stream socket connected to the specified port and IP address.
createINETServerSocket Creates a server stream socket bound to a specified port.
receiveBinaryFromSocket Reads binary data from the socket, knowing the size.
receiveFromSocket Reads text or binary data from a socket.
receiveTextFromSocket Reads text from a socket, knowing the size.
sendBinaryToSocket Writes binary data to a socket.
sendTextToSocket Writes text to a socket.

Category unknown: Various types of function
loadProject Loads a parse tree previously saved thanks to saveProject().
not The boolean negation, equivalent to !a.
produceHTML
saveProject Saves a parse tree to XML or to a particular text format.
saveProjectTypes Factorizes nodes of the projects to distinguish implicit types for node and saves it to XML.

21.1 acceptSocket

21.2 add

21.3 addGenerationTagsHandler

21.4 addToDate

21.5 appendFile

21.6 autoexpand

21.7 bytesToLong

21.8 bytesToShort

21.9 byteToChar

21.10 canonizePath

21.11 ceil

21.12 changeDirectory

21.13 changeFileTime

21.14 charAt

21.15 charToByte

21.16 charToInt

21.17 chmod

21.18 clearVariable

21.19 closeSocket

21.20 compareDate

21.21 compileToCpp

21.22 completeDate

21.23 completeLeftSpaces

21.24 completeRightSpaces

21.25 composeAdaLikeString

21.26 composeCLikeString

21.27 composeHTMLLikeString

21.28 composeSQLLikeString

21.29 computeMD5

21.30 copyFile

21.31 copyGenerableFile

21.32 copySmartDirectory

21.33 copySmartFile

21.34 coreString

21.35 countStringOccurences

21.36 createDirectory

21.37 createINETClientSocket

21.38 createINETServerSocket

21.39 createIterator

21.40 createReverseIterator

21.41 createVirtualFile

21.42 createVirtualTemporaryFile

21.43 cutString

21.44 decodeURL

21.45 decrement

21.46 deleteFile

21.47 deleteVirtualFile

21.48 div

21.49 duplicateIterator

21.50 encodeURL

21.51 endl

21.52 endString

21.53 environTable

21.54 equal

21.55 equalsIgnoreCase

21.56 equalTrees

21.57 error

21.58 executeString

21.59 executeStringQuiet

21.60 existDirectory

21.61 existEnv

21.62 existFile

21.63 existVariable

21.64 existVirtualFile

21.65 exp

21.66 expand

21.67 exploreDirectory

21.68 extendExecutedScript

21.69 extractGenerationHeader

21.70 fileCreation

21.71 fileLastAccess

21.72 fileLastModification

21.73 fileLines

21.74 fileMode

21.75 fileSize

21.76 findElement

21.77 findFirstChar

21.78 findFirstSubstringIntoKeys

21.79 findLastString

21.80 findNextString

21.81 findNextSubstringIntoKeys

21.82 findString

21.83 first

21.84 floor

21.85 formatDate

21.86 generate

21.87 generateString

21.88 getArraySize

21.89 getCommentBegin

21.90 getCommentEnd

21.91 getCurrentDirectory

21.92 getEnv

21.93 getGenerationHeader

21.94 getHTTPRequest

21.95 getIncludePath

21.96 getLastDelay

21.97 getNow

21.98 getProperty

21.99 getShortFilename

21.100 getTextMode

21.101 getVariableAttributes

21.102 getVersion

21.103 getWorkingPath

21.104 getWriteMode

21.105 hexaToDecimal

21.106 hostToNetworkLong

21.107 hostToNetworkShort

21.108 increment

21.109 indentFile

21.110 index

21.111 inf

21.112 inputKey

21.113 inputLine

21.114 insertElementAt

21.115 invertArray

21.116 isEmpty

21.117 isIdentifier

21.118 isNegative

21.119 isNumeric

21.120 isPositive

21.121 joinStrings

21.122 key

21.123 last

21.124 leftString

21.125 lengthString

21.126 listAllGeneratedFiles

21.127 loadBinaryFile

21.128 loadFile

21.129 loadProject

21.130 loadVirtualFile

21.131 log

21.132 longToBytes

21.133 midString

21.134 mod

21.135 mult

21.136 networkLongToHost

21.137 networkShortToHost

21.138 next

21.139 not

21.140 octalToDecimal

21.141 openLogFile

21.142 parseAsBNF

21.143 parseFree

21.144 parseFreeQuiet

21.145 parseStringAsBNF

21.146 pathFromPackage

21.147 postHTTPRequest

21.148 pow

21.149 prec

21.150 produceHTML

21.151 putEnv

21.152 randomInteger

21.153 randomSeed

21.154 receiveBinaryFromSocket

21.155 receiveFromSocket

21.156 receiveTextFromSocket

21.157 relativePath

21.158 removeAllElements

21.159 removeDirectory

21.160 removeElement

21.161 removeFirstElement

21.162 removeGenerationTagsHandler

21.163 removeLastElement

21.164 removeRecursive

21.165 removeVariable

21.166 repeatString

21.167 replaceString

21.168 replaceTabulations

21.169 resolveFilePath

21.170 rightString

21.171 rsubString

21.172 saveBinaryToFile

21.173 saveProject

21.174 saveProjectTypes

21.175 saveToFile

21.176 scanDirectories

21.177 scanFiles

21.178 selectGenerationTagsHandler

21.179 sendBinaryToSocket

21.180 sendHTTPRequest

21.181 sendTextToSocket

21.182 setCommentBegin

21.183 setCommentEnd

21.184 setGenerationHeader

21.185 setIncludePath

21.186 setNow

21.187 setProperty

21.188 setTextMode

21.189 setVersion

21.190 setWorkingPath

21.191 setWriteMode

21.192 shortToBytes

21.193 sleep

21.194 slideNodeContent

21.195 sortArray

21.196 sqrt

21.197 startString

21.198 sub

21.199 subString

21.200 sup

21.201 system

21.202 toLowerString

21.203 toUpperString

21.204 traceEngine

21.205 traceLine

21.206 traceObject

21.207 traceStack

21.208 traceText

21.209 translate

21.210 translateString

21.211 trim

21.212 trimLeft

21.213 trimRight

21.214 truncateAfterString

21.215 truncateBeforeString

21.216 UUID

22 The extended BNF syntax for parsing

A BNF description of a grammar is more flexible and more concise than a procedural description of parsing. CodeWorker accepts parsing scripts that conform to a BNF.

BNF is the acronym of Backus-Naur Form, and consists of describing a grammar with production rules. When the BNF-driven script is executed, the first production rule encountered in the script that isn't a special one (special rules begin with a '#', like the #empty clause) is chosen as the main non-terminal to match against the input stream.

A non-terminal (often called a clause in the documentation) breaks down into terminals and other non-terminals. Defining how to break down a non-terminal is called a production rule. A clause is valid as soon as the production rule matches its part of the input stream.

The syntax of a clause looks like:
["#overload"]? <clause_specifier> <preprocessing> "::=" <sequence> ['|' <sequence>]* ';'

where:
<preprocessing> ::= "#!ignore" | "#ignore" ['(' <ignore-mode> ')']? ';'
<ignore-mode> ::= "blanks" | "C++" | "JAVA" | "HTML" | "LaTeX";
<sequence> ::= non-terminal | terminal;
<terminal> ::= symbol of the language: a constant character or string

A sequence is a set of terminals and non-terminals that must match the input stream, starting at the current position. A production rule may propose alternatives: if a sequence doesn't match, the engine tries the next one (the alternation symbol '|' separates the sequences).

A regular expression asks for reading tokens from the input stream. If tokens are put in sequence, one after the other, they are evaluated from left to right and all of them must match the input stream. For example, "class" '{' is a sequence of 2 terminals, which requires that the input stream first matches "class" and is then followed by '{'.
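
The left-to-right evaluation of a sequence and the fallback to the next alternative can be sketched in a few lines of Python. This is an illustration of the principle under simplifying assumptions, not the CodeWorker engine: only constant strings are handled, blanks are not skipped, and the names match_sequence and match_rule are hypothetical.

```python
def match_sequence(tokens, text, pos):
    """Match every token left to right; return the new position or None."""
    for token in tokens:
        if not text.startswith(token, pos):
            return None
        pos += len(token)
    return pos


def match_rule(alternatives, text, pos=0):
    """Try each sequence in order from the same start position."""
    for tokens in alternatives:
        end = match_sequence(tokens, text, pos)
        if end is not None:
            return end
    return None


# rule ::= "class" '{' | "struct" '{';
rule = [["class", "{"], ["struct", "{"]]
end_class = match_rule(rule, "class{")
end_struct = match_rule(rule, "struct{ int x; }")
end_union = match_rule(rule, "union{")
```

A failed sequence consumes nothing: the next alternative starts again at the position where the rule was entered.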

Putting #overload just before the declaration of a production rule means that the non-terminal was already defined and that it must be replaced by this new rule when called. Example:

nonterminal ::= "bye";
...
#overload nonterminal ::= "bye" | "quit" | "exit";

Now, calling nonterminal executes the second production rule. Use the directive #super to call the overloaded clause. The previous overloading could be written:

...
#overload nonterminal ::= #super::nonterminal | "quit" | "exit";

#overload plays an important role in the reuse of BNF scripts. A parser might be built by reusing a scanner, where some non-terminals only have to be extended, for populating a parse tree for instance.
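
The relation between an overloaded rule and the rule it replaces can be modelled as follows. This is a Python sketch of the idea only, not how CodeWorker is implemented: the class Grammar and its methods are hypothetical, a rule is reduced to a predicate on a whole string, and the check in overload is an assumption drawn from the requirement that the non-terminal was already defined.

```python
class Grammar:
    """Toy rule table: each rule is a function taking a text and returning a bool."""

    def __init__(self):
        self.rules = {}

    def define(self, name, rule):
        self.rules[name] = rule

    def overload(self, name, make_rule):
        """Replace an existing rule; make_rule receives the old rule as `parent`."""
        if name not in self.rules:
            # Assumption of the sketch: overloading requires a prior definition.
            raise ValueError(f"{name} must be defined before it is overloaded")
        parent = self.rules[name]
        self.rules[name] = make_rule(parent)

    def match(self, name, text):
        return self.rules[name](text)


grammar = Grammar()
# nonterminal ::= "bye";
grammar.define("nonterminal", lambda text: text == "bye")
# #overload nonterminal ::= #super::nonterminal | "quit" | "exit";
grammar.overload(
    "nonterminal",
    lambda parent: lambda text: parent(text) or text in ("quit", "exit"),
)
```

Calling grammar.match("nonterminal", ...) now runs the new rule, which still reaches the original one through parent, in the way #super::nonterminal does.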

The statement #transformRules also provides a convenient way to reuse a BNF script.

It defines a rule that describes how to transform the header (left member) and the production rule (right member) of a non-terminal declaration.

Example:

INTEGER ::= #ignore ['0'..'9']*;

INTEGER is the header and #ignore ['0'..'9']* is the production rule.

During the compilation of a BNF parse script, before processing the declaration of a non-terminal, the compiler checks whether a transforming rule validates the name of the non-terminal. If so, both the header of the declaration and the production rule are translated, following the directives of the rule.

The #transformRules statement must be put in the BNF script, before the production rules to transform.

The syntax of the statement #transformRules looks like:
transform-rules ::= "#transformRules" filter header-transformation prod-rule-transformation
filter ::= expression
header-transformation ::= '{' translation-script '}'
prod-rule-transformation ::= '{' translation-script '}'

The filter is a boolean expression, applied on the name of the production rule. The variable x contains the name of the production rule.

header-transformation consists of a translation script, which describes how to transform the header. If the block remains empty, the header doesn't change.

prod-rule-transformation consists of a translation script, which describes how to transform the production rule. If the block remains empty, the production rule doesn't change.

Example:

This example describes how to transform each production rule, whose name ends with "expr".

or_expr ::= and_expr ["&&" and_expr]*;
becomes
or_expr(myExpr : node) ::= and_expr(myExpr.left) ["&&":myExpr.operator and_expr(myExpr.right)]*;

The original production rules are just scanning the input, and the example shows how to transform them for populating a node of the parse tree.

#transformRules
    // The filter accepts production rules that have a name
    // ending with "expr" only.
    // Note that the variable x holds the name
    // of the production rule.
    x.endString("expr")
    
    
    // A script for transforming the header of the production rule:
    {
        // By default, copies the input to the output
        #implicitCopy
        // Writes the declaration of the parameter myExpr
        // after the non-terminal and copies the rest.
        header ::= #readIdentifier
            => {@(myExpr : node)@}
            ->#empty;
    }
    
    
    // A script for transforming the production rule itself:
    {
        #implicitCopy
        // - Pass the left member of the expression to populate,
        // to the first non-terminal,
        // - assign the operator to the expression,
        // - pass the right member of the expression to populate,
        // to the second non-terminal.
        // In any case, the rest of the production rule remains
        // invariant.
        prodrule ::= [
                #readIdentifier
                =>{@(myExpr.left)@}
                ->[
                    "'" #readChar "'" => {@:myExpr.operator@}
                  |
                    #readCString => {@:myExpr.operator@}
                ]
                #readIdentifier
                =>{@(myExpr.right)@}
            ]?
            ->#empty;
    }
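
The effect of the two translation scripts on a declaration such as or_expr can be cross-checked with a short Python sketch. It is only an approximation written for this purpose, not CodeWorker code: the function transform_rule and its regular expression are assumptions that cover rules shaped like the example (an identifier, an operator given as a constant string or character, then another identifier).

```python
import re

# identifier, anything, quoted operator, blanks, identifier
PATTERN = re.compile(r'(\w+)(.*?)("[^"]*"|\'[^\']\')(\s*)(\w+)', re.S)


def transform_rule(declaration):
    """Rewrite `name ::= body` when name ends with "expr"; else return it unchanged."""
    header, body = declaration.split("::=", 1)
    name = header.strip()
    if not name.endswith("expr"):  # the filter: x.endString("expr")
        return declaration
    # Header transformation: add the parameter after the non-terminal name.
    new_header = f"{name}(myExpr : node)"
    # Production-rule transformation: left operand, operator, right operand.
    # If the pattern is absent, the body is left invariant.
    new_body = PATTERN.sub(
        r"\1(myExpr.left)\2\3:myExpr.operator\4\5(myExpr.right)",
        body.strip(),
        count=1,
    )
    return f"{new_header} ::= {new_body}"


transformed = transform_rule('or_expr ::= and_expr ["&&" and_expr]*;')
untouched = transform_rule("INTEGER ::= #ignore ['0'..'9']*;")
```

With the declaration of or_expr given earlier, transformed holds the rewritten declaration shown after "becomes", while INTEGER is returned as is because its name does not pass the filter.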

22.1 BNF tokens

Below are described all BNF tokens that CodeWorker recognizes: