The Perl INTERCAL compiler

... Input/Output

We agree with the designers of C-INTERCAL that the I/O capabilities of INTERCAL-72 are insufficient. We don't agree with their implementation, though, so we shamelessly go and do something entirely incompatible. If you select the "C-INTERCAL compatibility mode", you get their I/O behaviour back.

Note that numeric and alphanumeric I/O assume the input to be an EBCDIC punched card, although this can be changed with a runtime option at the price of slower performance.

In addition, if you request compatibility with C-INTERCAL or Threaded-INTERCAL, all array I/O will imitate C-INTERCAL's Turing Tapes modes.

Numeric I/O

This is the standard INTERCAL-72 I/O. It is used with numeric variables (one spot or two spots).

On input, one card is read and its contents must be a sequence of digits spelled out in full in a selection of popular languages, for example "ONE THREE TWO ONE ZERO" (English for 13210) or "COIG CEITHIR TRI" (Scottish Gaelic for 543). Spaces between digits are ignored.
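The input rule above can be sketched in a few lines of Python. This is only an illustration, not the compiler's own code: the function name read_number and the DIGITS table are hypothetical, and the table holds just the English digits plus the three Scottish Gaelic words quoted in the example.

```python
# Hypothetical digit table: English plus the three Gaelic words from the text.
DIGITS = {
    "ZERO": 0, "ONE": 1, "TWO": 2, "THREE": 3, "FOUR": 4,
    "FIVE": 5, "SIX": 6, "SEVEN": 7, "EIGHT": 8, "NINE": 9,
    "TRI": 3, "CEITHIR": 4, "COIG": 5,
}


def read_number(card):
    """Turn a card of spelled-out digits into an integer.

    Words are separated by any amount of whitespace, which is ignored.
    An unknown word raises KeyError.
    """
    value = 0
    for word in card.split():
        value = value * 10 + DIGITS[word.upper()]
    return value
```

With this table, read_number("ONE THREE TWO ONE ZERO") gives 13210 and read_number("COIG CEITHIR TRI") gives 543, matching the two examples in the text.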

On output, butchered Roman numerals are used. However, we differ slightly from INTERCAL-72 in that we put a backslash before a numeral to multiply it by 1,000,000 or, if the terminal can do it, a runtime option lets you underline such numerals instead.

For example, 1,234,567,890 is written \M\C\C\X\X\X\I\VdlxviiDCCCXC or, with the underlining option, MCCXXXIVdlxviiDCCCXC with the leading MCCXXXIV underlined.
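A minimal Python sketch that reproduces the backslash form of this example follows. It assumes, from the example alone, that lowercase numerals stand for thousands and backslashed numerals for millions. The names roman and butchered_roman are hypothetical, and the handling of zero and other corner cases by the real compiler is not covered.

```python
ROMAN_PAIRS = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def roman(n):
    """Plain Roman numeral for a non-negative integer (empty string for 0)."""
    out = []
    for value, symbol in ROMAN_PAIRS:
        while n >= value:
            out.append(symbol)
            n -= value
    return "".join(out)


def butchered_roman(n):
    """Backslash form: millions backslashed, thousands lowercase, units plain."""
    if not 0 < n < 2 ** 32:
        raise ValueError("this sketch only handles 1 .. 4294967295")
    millions, rest = divmod(n, 1_000_000)
    thousands, units = divmod(rest, 1000)
    # Assumption: each letter of the millions group gets its own backslash.
    high = "".join("\\" + letter for letter in roman(millions))
    return high + roman(thousands).lower() + roman(units)
```

Calling butchered_roman(1234567890) yields the first string shown in the example.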

Alphanumeric I/O

This is the kind of input/output the compiler itself uses to read the program source and produce program listings. It applies to 16-bit arrays (tails), which must be dimensioned. Multidimensional arrays are "flattened".

On input, EBCDIC text is obtained from the next card, and it is converted to an extended Baudot code. This is essentially the same as standard Baudot, except that selecting "letters" when you are already in "letters" will cause a shift to lowercase, and selecting "figures" when you are already in "figures" causes a shift to other symbols. See the chapter on Character Sets for a complete description of extended Baudot. Please note that the insertion of shift codes means that the Baudot code can be as much as three times as long as the original EBCDIC, so you must dimension your array accordingly.

On output, extended Baudot is converted to ASCII and sent to the virtual line printer.

Binary I/O

This is the simplest, yet most powerful, form of input/output. It applies to 32-bit arrays, which must be dimensioned. Multidimensional arrays are "flattened".

The input is assumed to be a stream of bytes, which is not interpreted in any way. The number of bytes read in from the input is the same as the size of the array specified. To compute the data to store in the array, a simple algorithm is applied to every pair of consecutive elements. Since this would give one element too few, a #172 is inserted at the start. If the two numbers in a pair are in .1 (left) and .2 (right), the following fragment produces the value to be stored and leaves it in :1

DO :1 <- '.2~.1'¢"'"¥'#65535¢.2'"~"#0¢#65535"'~'"¥'#65535¢.1'"~"#0¢#65535"'"

We know it's an insult to the reader's intelligence to explain what it does, but maybe somebody is in a hurry and doesn't have the two milliseconds to parse that, so here goes: the result is obtained by computing two numbers and interleaving them. The first number is the second input element selected by the first input element (.2~.1). The second number is the same selection applied to the bit-complements of the two input elements. Before the algorithm is applied, the numbers are extended from 8 to 16 bits by padding them with a random value. It is clear that the result has all the bits of the corresponding input value, permuted in a predictable order, so all the information is there.
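For readers who would rather spend the two milliseconds elsewhere, here is a rough Python rendering of the pair step described above. It is a sketch under stated assumptions: the function names select, mingle and encode_pair are hypothetical, both inputs are taken to be 16-bit values that have already been padded, and the random padding itself is left out because the text does not say where the padding bits go.

```python
def select(value, mask):
    """INTERCAL select (~): keep the bits of value where mask has a 1,
    packed towards the low end of the result."""
    result = 0
    out_bit = 0
    for i in range(16):
        if (mask >> i) & 1:
            result |= ((value >> i) & 1) << out_bit
            out_bit += 1
    return result


def mingle(left, right):
    """INTERCAL interleave: alternate the bits of two 16-bit values,
    with the left operand supplying the higher bit of each pair."""
    result = 0
    for i in range(16):
        result |= ((left >> i) & 1) << (2 * i + 1)
        result |= ((right >> i) & 1) << (2 * i)
    return result


def encode_pair(left, right):
    """Value stored for one pair: (.2~.1) interleaved with (~.2)~(~.1)."""
    not_left = ~left & 0xFFFF
    not_right = ~right & 0xFFFF
    return mingle(select(right, left), select(not_right, not_left))
```

For instance, encode_pair(0x00FF, 0x1234) gives 0x5E71: the low byte 0x34 of the right element is selected directly, the complemented high byte 0xED comes from the second selection, and the two are interleaved.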

A value of zero is inserted to indicate end-of-file. The padding is guaranteed to have at least one bit set, so the value stored for any input value is never zero.

For output, the same algorithm is applied in reverse, except that zeros are skipped. If you need to use a value zero, pad it to 32 bits with ones. The current implementation does not check that the padding is sufficiently random, although this might well change in future.
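One way to picture "in reverse" is the Python sketch below, which recovers the right-hand element of a pair from the stored 32-bit value when the left-hand element is known. This is an assumption about how the inversion could work, not a description of the compiler's code. The name decode_pair is hypothetical, and skipping zeros and stripping the padding are left out.

```python
def decode_pair(left, stored):
    """Recover the 16-bit right element from a stored 32-bit value,
    given the 16-bit left element it was paired with."""
    # Undo the interleave: odd bits hold the direct selection,
    # even bits hold the selection made on the complements.
    direct = 0
    complemented = 0
    for i in range(16):
        direct |= ((stored >> (2 * i + 1)) & 1) << i
        complemented |= ((stored >> (2 * i)) & 1) << i

    # Undo the two selections: where left has a 1 the bit came from the
    # direct half; where it has a 0 it came, inverted, from the other half.
    right = 0
    for i in range(16):
        if (left >> i) & 1:
            right |= (direct & 1) << i
            direct >>= 1
        else:
            right |= ((complemented & 1) ^ 1) << i
            complemented >>= 1
    return right
```

As a check, decode_pair(0x00FF, 0x5E71) returns 0x1234, which is the pair that produces 0x5E71 under the input algorithm.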

