CREATE statement and the ability to assign
to special registers (to control the runtime operating mode).
This document describes the syntax of a CREATE
statement, and shows an example of a simple parser for a subset of
INTERCAL-72.
A CREATE statement has the form:
    CREATE grammar class template AS code
The grammar is optional, and can only be used when compiling a compiler. Normal programs can only modify their own compiler by leaving grammar at the default value (_1); compilers can
modify other compilers, including but not limited to the grammar used
to compile the program (compiler modifying itself) and the one used
to compile the compiler itself. This advanced use of the CREATE
statement is discussed elsewhere.
The class specifies a syntactic class (some other languages might call it a nonterminal). Usually, this takes the form of a what ("?") followed by some alphanumerics, although anything which evaluates to a number will do. Please note that in CLC-INTERCAL the what does not introduce a unary logical operator as in C-INTERCAL, and it always produces a named constant (of sorts).
The template is composed of a sequence of terminals and nonterminals
which vaguely resemble the syntax you are trying to define. Nonterminals
are specified as a special type of constant, usually introduced by the
"what" discussed before. Terminals are specified as "array slices", that
is sequences of numbers enclosed in tails or hybrids and representing a
16- or 32-bit array. The text which would be produced by READing OUT these
arrays specifies the syntax to define. For example, consider the following
line from 1972.iacc:
    DO CREATE _2 ?VERB ?LVALUE ,#91 + #91 + #79, ,#95 + #91 + #67, ?EXPRESSION AS STO + ?EXPRESSION #1 + ?LVALUE #1
Ignore for now the bit after the AS. We are creating
something in class ?VERB of grammar _2 (which
by default contains the compiler - the compiler compiler is in grammar
_1). There is a nonterminal (?LVALUE), which
presumably will match a register and anything else you can assign to; then
there are two terminals. To see what these are, you can run the following
program:
    PLEASE ,1 <- #3
    DO ,1 SUB #1 <- #91
    DO ,1 SUB #2 <- #91
    DO ,1 SUB #3 <- #72
    PLEASE ,2 <- #3
    DO ,2 SUB #1 <- #95
    DO ,2 SUB #2 <- #91
    DO ,2 SUB #3 <- #67
    PLEASE READ OUT ,1 + ,2
    DO GIVE UP
We assign the values to 16-bit arrays because the slices are enclosed in tails. If they had been hybrids, we'd use 32-bit arrays. Running the program, we get "<-". This is not a coincidence: the above statement CREATEs the assignment. It should not surprise anybody that the next nonterminal is called ?EXPRESSION.
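To make the four parts of a CREATE statement concrete, here is a minimal Python sketch that splits a statement of the shape shown above into grammar, class, template and code. It is a toy model, not the CLC-INTERCAL parser: the function name split_create is hypothetical, only terminals enclosed in tails are recognised (hybrids are not handled), and the fallback to grammar _1 simply follows the default described earlier.

```python
import re

# Toy tokenizer: a terminal is everything between two tails (commas),
# anything else is a whitespace-delimited word. Hybrids are NOT handled.
TOKEN = re.compile(r",[^,]*,|\S+")


def split_create(statement):
    """Split one CREATE statement into (grammar, class, template, code).

    Hypothetical helper for illustration only; it understands just the
    shapes used in this document.
    """
    text = statement.strip()
    # Drop the statement prefix and the CREATE keyword.
    text = re.sub(r"^(PLEASE\s+DO|PLEASE|DO)\s+CREATE\s+", "", text)
    head, code = text.split(" AS ", 1)
    tokens = TOKEN.findall(head)

    grammar = "_1"  # default value, as stated in the text
    if tokens and tokens[0].startswith("_"):
        grammar = tokens.pop(0)
    syntactic_class = tokens.pop(0)

    template = []
    for tok in tokens:
        if tok.startswith(","):
            # Terminal: collect the numbers of the array slice.
            numbers = [int(n) for n in re.findall(r"#(\d+)", tok)]
            template.append(("terminal", numbers))
        else:
            template.append(("nonterminal", tok))
    return grammar, syntactic_class, template, code.strip()


line = ("DO CREATE _2 ?VERB ?LVALUE ,#91 + #91 + #79, ,#95 + #91 + #67, "
        "?EXPRESSION AS STO + ?EXPRESSION #1 + ?LVALUE #1")
grammar, syntactic_class, template, code = split_create(line)
print(grammar, syntactic_class)
for kind, value in template:
    print(kind, value)
print(code)
```

For the line from 1972.iacc this yields grammar _2, class ?VERB, a template of one nonterminal, two terminals and a final nonterminal, and the code that follows the AS.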
CREATE statements can define the syntax
for all the known variants of INTERCAL. The semantics is defined by the
code part of the CREATE statements. Before detailing how
the semantics is specified, we must take a detour.
The INTERCAL Common Bytecode Machine (ICBM) is an assembler-like language
which is used by the CLC-INTERCAL compiler as an intermediate language.
It can be interpreted directly by the bytecode interpreter, or run directly
on any computer with the INTERCAL Operating System; alternatively, a second
level compiler will be made available to convert ICBM programs to other
languages, for example assembler, Perl, or DD/SH. Please note that it is
currently not possible to create a code generator for C, FORTRAN, COBOL or
Pascal, but this may change in future - however you will need to have the
C, FORTRAN, COBOL or Pascal compiler available at runtime if the program
happens to use CREATE in a way which cannot be optimised away
(all the examples in this documentation can, so you won't need a compiler
at runtime to run them).
ICBM assumes a memory model with multiple stashes: any value can be stashed and retrieved on any of the stashes, which function independently of each other. There is no limit to the size of an element on a stash, or to the number of elements in each stash. Memory allocated to a stash element is automatically freed when the element is removed from the stash.
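The memory model just described can be pictured with a small Python sketch. This is only a model under stated assumptions, not the ICBM implementation: the class StashModel and its method names are hypothetical, stashes are identified by an arbitrary key, and retrieval is assumed to be last-in first-out, following the usual behaviour of INTERCAL's STASH and RETRIEVE.

```python
from collections import defaultdict


class StashModel:
    """Toy model of independent, unbounded stashes (hypothetical names)."""

    def __init__(self):
        # One Python list per stash; a new stash springs into existence
        # the first time it is used.
        self._stashes = defaultdict(list)

    def stash(self, name, value):
        # Any value, of any size, can be pushed on any stash.
        self._stashes[name].append(value)

    def retrieve(self, name):
        # ASSUMPTION: last-in first-out. The popped element is dropped
        # from the stash, so its memory can be reclaimed automatically.
        return self._stashes[name].pop()

    def depth(self, name):
        return len(self._stashes[name])


model = StashModel()
model.stash("a", 1)
model.stash("b", "some text")
model.stash("a", [2, 3])        # elements need not have the same size
print(model.retrieve("a"))      # touches only stash "a"
print(model.depth("b"))         # stash "b" is unaffected
```

Python's garbage collector plays the part of the automatic freeing mentioned above: once an element is popped and no longer referenced, its memory is released without any explicit call.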
ICBM programs are inherently multi-threaded. Memory between the threads is shared unless otherwise specified. When a thread terminates, it remains in memory until it is absolutely clear that the thread cannot come back to life. This is important, since threads in quantum programs can have their behaviour modified after they have completed execution.
ICBM programs also contain a number of compilers. Usually, there will be a "program compiler" and a "compiler compiler", provided by the system. When compiling a compiler, however, just the "compiler compiler" is provided: the "program compiler" is built during the execution. A compiler compiler is a special type of compiler, which replaces the original compiler compiler after (re-)building itself. In other words, the compiler compiler coincides with the compiler compiler compiler; this is necessary to avoid infinite regression.
Each block of ICBM code in a program has an associated source code and a compiler used to compile it. This makes it possible to avoid recompiling a block if the compiler has not changed since the last time the block was compiled. A separate document describes this "just too late" compiler in some more detail.
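The idea of recompiling a block only when its compiler has changed can be sketched in Python as follows. This is an illustration, not the "just too late" compiler itself: the names Block and ensure_compiled are hypothetical, and representing "the compiler has changed" by a version value that differs after every change is an assumption made for the sketch.

```python
class Block:
    """A block of code together with its source (hypothetical model)."""

    def __init__(self, source):
        self.source = source
        self.code = None            # compiled form, None until first use
        self.compiled_with = None   # compiler version used last time


def ensure_compiled(block, compiler_version, compile_fn):
    """Recompile the block only if the compiler changed since last time.

    Returns True if a compilation actually happened.
    ASSUMPTION: compiler_version differs whenever the compiler has been
    modified, for example by a CREATE statement.
    """
    if block.code is None or block.compiled_with != compiler_version:
        block.code = compile_fn(block.source)
        block.compiled_with = compiler_version
        return True
    return False


block = Block("DO .1 <- #2")
fake_compile = lambda source: "compiled:" + source  # stand-in compiler

print(ensure_compiled(block, 1, fake_compile))  # first use: compiles
print(ensure_compiled(block, 1, fake_compile))  # same compiler: reused
print(ensure_compiled(block, 2, fake_compile))  # compiler changed: recompiles
```

Keeping the source next to the compiled code is what makes the third call possible: the block can always be rebuilt from its source once the compiler is found to have changed.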
ICBM code is described in detail in the documentation provided with the
Perl module Language::INTERCAL::ByteCode. Use the perldoc
program to access it, or man if your version of Perl installs
prebuilt manpages from POD.
The code is the part of the CREATE statement after the AS
keyword. It consists of a list of
mnemonics (separated by intersection symbols). If the template refers to
n other templates, the codes generated by these templates will be
available in the code by repeating the template identifier (with
its own "what" or whatever was used to introduce it) followed by an expression.
If there is only one instance of each different template, the expression must
be #1, otherwise it indicates which instance to use. For example, considering
again the assignment statement from 1972.iacc:
    DO CREATE _2 ?VERB ?LVALUE ,#91 + #91 + #79, ,#95 + #91 + #67, ?EXPRESSION AS STO + ?EXPRESSION #1 + ?LVALUE #1
We see that we refer to both the ?EXPRESSION and the ?LVALUE: since there is one of each, these references are
followed by #1. Suppose the statement to execute is DO .1 <- #2.
Then the ?LVALUE will generate SPO + #1 and the ?EXPRESSION generates #2. So, replacing what needs
to be replaced, the assignment produces:
    STO + #2 + SPO + #1
Executing it, we first check whether calculation is to be ABSTAINed FROM; if
it is not, we STOre #2 to register SPOt #1. Just as we required.
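The replacement step used in this example can be mimicked with a few lines of Python. The sketch works on the textual mnemonics only, whereas the real compiler produces bytecode; the function expand_code and the dictionary layout are hypothetical and exist only for illustration.

```python
import re


def expand_code(code, generated):
    """Replace every '?NAME #n' reference in the code part of a CREATE
    statement with the code generated by the n-th instance of ?NAME.

    'generated' maps a template name to the list of code fragments
    produced by its instances, in order (hypothetical layout).
    """
    def substitute(match):
        name, index = match.group(1), int(match.group(2))
        return generated[name][index - 1]   # instances count from #1

    return re.sub(r"(\?\w+)\s+#(\d+)", substitute, code)


# Fragments generated while parsing DO .1 <- #2, as described above.
generated = {
    "?LVALUE": ["SPO + #1"],
    "?EXPRESSION": ["#2"],
}
result = expand_code("STO + ?EXPRESSION #1 + ?LVALUE #1", generated)
print(result)   # prints: STO + #2 + SPO + #1
```

If a template contained two instances of the same nonterminal, the second would be referenced with #2 and would pick the second fragment from the corresponding list.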
Example
See the various compiler parsers provided with the distribution for plenty
of examples.
These examples should be self-explanatory. If not, reread this document.