In theory the implementation should follow a model, but Viua is in such an early stage of development that no model exists for it yet. This document is an attempt at creating an abstract model of the Viua VM.
This document is meant to explain and describe the general concepts and ideas behind Viua, for example what a process, a call frame, and a stack are, and how they are related. Apart from being a glossary of a kind, it also aims to explain how various mechanisms are intended to work in Viua, e.g. exception handling (throw-catch), function calls, and concurrency.
Descriptions of individual instructions are not included in this document, and are provided on the ISA description page.
Viua is a register-based VM. Programs running on it manipulate values held in registers instead of on a stack.
On the lowest level, Viua VM programs are sequences of instructions. Instructions fetch, manipulate (modify, delete, move, copy, etc.), and produce values. Values are stored in registers. This section describes what a register and a register set are, and how values can be stored in and fetched from registers.
A "register set" is an array of registers with limited capacity. The size of a register set is determined statically, at compile time. There are three main register sets: local, static, and global.
Values held in registers from these register sets can be manipulated directly by instructions, and can be accessed by copy, move, swap, and delete instructions.
There are also a few "special" register sets (that are not really register sets). Values in their registers cannot be manipulated directly, and must be first brought into one of the three main register sets. These sets are:
receive-d before use
draw-ed before use
All register sets are process-local, meaning that no register set is shared between processes.
This register set's lifetime is bound to the call frame for which it has been spawned, or to the closure for which it has been created. At any point in time there may be many local register sets spawned.
In the first case, the lifetime of the register set (and its registers and their contents) can be statically determined by analysing when the call frame will be popped off the stack.
In the second case, the lifetime is more dynamic since the closure can be returned as a value from a function and thus outlive its original environment. Determining the lifetime of the local register set of a closure requires analysing the lifetime of the closure, which can almost always be done.
Local register sets are spawned either by the frame instruction or by the closure instruction.
The capacity of a register set spawned by the frame instruction is stated explicitly.
A register set spawned using the closure instruction inherits its capacity from its enclosing environment.
-- spawn a frame whose local register set will contain 20 registers
-- the frame will accept no parameters
--
-- frame {no-of-parameters} {no-of-local-registers}
frame %0 %20
-- closure stored in register 1, using body of function foo/0
--
-- closure {index-of-register-where-closure-is-stored} {function-implementing-body-of-the-closure}
closure %1 foo/0
Local register sets spawned using the frame instruction are destroyed when the frame they are associated with is popped off the call stack by a return or tailcall instruction, or during stack unwinding when an exception is thrown.
Closure-local register sets are destroyed when their closure is destroyed. During its lifetime a closure-local register set may be pushed onto the stack and popped off it many times without being destroyed. Closures obey the lifetime rules of Viua values, which are described in another section.
The user program has access to the local register set of only the top-most frame on the stack. It is not possible to access registers in local register set of any frame lower on the stack. Also, contents of local register sets of lower frames do not have any effect on contents of local register sets of the upper frames. Local register sets are isolated from each other, and disposable - they are created anew for every call (closure-local register sets being the exception).
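The isolation rule above can be modelled in a few lines. This is only a Python sketch, not Viua code: the names Frame, call, ret, and current_local are hypothetical, and None stands in for an empty register.

```python
class Frame:
    """Hypothetical call frame owning its own local register set."""
    def __init__(self, function, capacity):
        self.function = function
        # Every call gets a brand-new, empty local register set.
        self.local = [None] * capacity

stack = []

def call(function, capacity):
    stack.append(Frame(function, capacity))

def current_local():
    # Only the top-most frame's local register set is reachable.
    return stack[-1].local

def ret():
    # Popping the frame destroys its local register set.
    stack.pop()

call("main/0", 4)
current_local()[1] = "set by main/0"
call("foo/0", 2)
callee_view = list(current_local())  # foo/0 starts with empty registers
ret()
caller_view = current_local()[1]     # main/0's register is untouched
```

The sketch leaves out closure-local register sets, which would have to survive being popped off the stack.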
Capacity of each local register set may be different and is determined by the user program. Capacity of local register sets is limited to 4'294'967'296 (2^32) registers.
Tail calls do not inherit local register sets of their original frames. They start with a fresh register set.
Static register sets live as long as the process inside which they have been spawned.
Static register sets are created before their first use. This means that they may be spawned eagerly when a process is created for every function that uses a static register set, or lazily - when the runtime detects a function is about to access its static register set. It does not make a difference to the function when the static register set is allocated as long as it can use it.
Static register sets are destroyed after the last frame of a process is popped off the stack, i.e. when the process is no longer able to run any more instructions.
Static register sets are assigned per-function and are local to a single process.
A function foo/0 does not have access to the static registers of function bar/0.
If function foo/0 inside process A puts 42 in its first static register, the same function foo/0 inside process B will not see that value inside its first static register.
The user program also has access only to the static register set of the function that is currently being executed by the top-most frame on the stack.
Capacity of static register sets is currently fixed at 16 registers.
User functions should always check if their static registers are empty before using them.
It is a function's responsibility to initialise its own static registers - a static register set will always be provided when the function requests it, but will be completely empty during the first access.
Checking for empty registers can be done using the isnull instruction.
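The check-then-initialise pattern described above might look as follows in a Python model. The helper names (static_set, count_calls) and the lazy allocation strategy are assumptions made for illustration; None stands in for an empty register, and the `is None` test plays the role of isnull.

```python
STATIC_CAPACITY = 16  # capacity stated in the text

# One static register set per (process, function) pair, created lazily.
static_sets = {}

def static_set(process, function):
    key = (process, function)
    if key not in static_sets:
        static_sets[key] = [None] * STATIC_CAPACITY
    return static_sets[key]

def count_calls(process, function):
    """Hypothetical function body keeping a call counter in static register 0."""
    regs = static_set(process, function)
    if regs[0] is None:   # the check isnull would perform
        regs[0] = 0       # first access: initialise the register
    regs[0] += 1
    return regs[0]

count_calls("A", "foo/0")
second_in_a = count_calls("A", "foo/0")  # the counter survives between calls
first_in_b = count_calls("B", "foo/0")   # process B has its own static set
```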
The global register set's lifetime is bound to the process for which it has been spawned.
The user program has access to the global register set at all times, and from any call frame on the stack. There are no restrictions similar to those of static or local register sets, except that global register set is spawned per-process and isolated between processes.
Capacity of global register sets is currently fixed at 255 registers.
There are no special notes related to the global register set.
Registers are "slots" that are used to hold the values Viua VM instructions operate on. A register can hold any value representable by a Viua program, or be empty.
Registers are indexed slots in a register set. Register indexes start from 0, and go to register set's capacity minus 1. For example, a register set with capacity 16 has registers numbered from 0 to 15.
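As a rough illustration of the indexing rules, a register set can be modelled as a fixed-size array with checked access. This is a Python sketch under stated assumptions: the class and method names are hypothetical, Python exceptions stand in for VM exceptions, and None marks an empty register.

```python
class RegisterSet:
    """Hypothetical model of a fixed-capacity register set."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = [None] * capacity  # None marks an empty register

    def _check(self, index):
        # Valid indexes run from 0 to capacity - 1.
        if not 0 <= index < self.capacity:
            raise IndexError(f"register index out of range: {index}")

    def put(self, index, value):
        self._check(index)
        self.slots[index] = value

    def get(self, index):
        self._check(index)
        if self.slots[index] is None:
            raise LookupError(f"read from empty register: {index}")
        return self.slots[index]

    def is_null(self, index):
        self._check(index)
        return self.slots[index] is None

registers = RegisterSet(16)
registers.put(15, "last register")  # highest valid index for capacity 16
```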
When an instruction wants to fetch a value held in a register, or to put a value in a register, it must properly address the register it wants to access. If the register address supplied by one of instruction's operands is not valid the VM throws an exception. To be valid the address must consist of three parts:
Register address must include the register set which should be used.
Local register set: identified by the local keyword.
This register set is resolved at compile time.
Static register set: identified by the static keyword.
This register set is resolved at compile time.
Global register set: identified by the global keyword.
This register set is resolved at compile time.
Current register set: identified by the current keyword.
The register set to use is determined at runtime, based on what the "current" register set means at that exact moment in the program.
Depending on the state the program is in, "current" may mean the local, static, or global register set.
Fetch mode instructs the VM how it should fetch the value an instruction requests. There are three fetch modes: plain, pointer-dereference, and register-indirect. In source code they are identified by sigils.
Identified by the percent sign - "%".
The simplest fetch mode. It involves just fetching a value from a register at given index from a given register set. For example:
-- print contents of register 1 from local register set
print %1 local
-- copy contents of register 4 from static register set
-- into register 2 from local register set
copy %2 local %4 static
-- store text in register 1 from global register set
text %1 global "Hello World!"
If registers were boxes and values were balls the "plain" fetch mode would mean just taking the ball from a box.
Identified by the star sign - "*".
This mode is composed of two phases.
The first one involves fetching a value of a register at specified index from a specified register set.
In the second phase the VM dereferences the pointer.
The value obtained after dereferencing is the one supplied to the instruction.
Value fetched by the first phase of this mode MUST be a pointer. Otherwise the VM throws an exception. An exception is also thrown if the pointer is expired.
An example:
-- store text in register 1 from local register set
text %1 local "Hello World!"
-- store pointer to a value in register 1 from local register set
-- in register 2 from local register set
ptr %2 local %1 local
-- print the pointer
-- "TextPointer" will be printed to standard output
print %2 local
-- print the value pointed-to by the pointer
-- "Hello World!" will be printed to standard output
print *2 local
It is important to note that Viua pointers point to values. The code below works even though the text value was moved between taking the pointer to it and dereferencing the pointer.
text %1 local "Hello World!"
ptr %2 local %1 local
-- move the value from register 1 from local register set
-- to register 4 from local register set
move %4 local %1 local
-- this still works and prints "Hello World!"
print *2 local
If registers were boxes and values were balls, the "pointer-dereference" fetch mode would mean putting your hand in a box and getting a piece of string instead of a ball. You can pull the string to get the ball that is attached to the other end, no matter where the ball currently is. Be wary, though, as there is no guarantee that there actually will be a ball attached to the other end of the string (in which case you get an exception)!
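The two phases of this fetch mode, and the fact that a pointer follows its value rather than a register, can be sketched in Python. Everything here is a hypothetical model: Value, Pointer, fetch_deref, move, and delete are illustrative names, and the `alive` flag is just one possible way to track expiry.

```python
class Value:
    """Hypothetical boxed value; registers hold these objects."""
    def __init__(self, payload):
        self.payload = payload
        self.alive = True

class Pointer:
    """Refers to a value itself, not to the register that holds it."""
    def __init__(self, target):
        self.target = target

def fetch_deref(registers, index):
    # Phase 1: plain fetch; the result must be a pointer.
    fetched = registers[index]
    if not isinstance(fetched, Pointer):
        raise TypeError("fetched value is not a pointer")
    # Phase 2: dereference; an expired pointer is an error.
    if not fetched.target.alive:
        raise RuntimeError("expired pointer")
    return fetched.target

def move(registers, dst, src):
    registers[dst], registers[src] = registers[src], None

def delete(registers, index):
    registers[index].alive = False  # every pointer to the value expires
    registers[index] = None

local = [None] * 8
local[1] = Value("Hello World!")
local[2] = Pointer(local[1])
move(local, 4, 1)                            # the value changes registers
after_move = fetch_deref(local, 2).payload   # the pointer still reaches it
delete(local, 4)                             # now the pointer is expired
```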
Identified by the "at" sign - "@".
This mode is composed of two phases.
The first one involves fetching a value of a register at specified index from a specified register set.
In the second phase the VM fetches a value from the register index specified by the integer fetched in the first phase.
The second phase fetches from the same register set as the first one.
The value obtained after the second fetch is the one supplied to the instruction.
Value fetched by the first phase of this mode MUST be an integer.
Otherwise the VM throws an exception.
An exception is also thrown if the register that would be accessed in the second phase does not exist (i.e. the index is out of bounds), or is empty (i.e. there is no value to be fetched; this is not true for the isnull instruction).
An example:
-- store text in register 1 from local register set
text %1 local "Hello World!"
-- store integer 1 in register 2 from local register set
istore %2 local 1
-- print the value using register-indirect fetch mode
-- "Hello World!" will be printed to standard output
print @2 local
If registers were boxes and values were balls, the "register-indirect" fetch mode would mean putting your hand in a box and getting a piece of paper with a number written on it instead of a ball, and then fetching the ball from the box with that number.
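A minimal Python model of the register-indirect mode follows. The function name fetch_indirect is hypothetical, Python exceptions stand in for VM exceptions, and None marks an empty register.

```python
def fetch_indirect(registers, index):
    # Phase 1: plain fetch; the result must be an integer.
    target = registers[index]
    if isinstance(target, bool) or not isinstance(target, int):
        raise TypeError("register-indirect fetch needs an integer")
    # Phase 2: fetch from the same register set at that index.
    if not 0 <= target < len(registers):
        raise IndexError("register index out of range")
    if registers[target] is None:
        raise LookupError("read from empty register")
    return registers[target]

local = [None] * 4
local[1] = "Hello World!"
local[2] = 1                       # register 2 names register 1
result = fetch_indirect(local, 2)
```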
Using empty registers as source operands will result in an exception being thrown by the VM.
The isnull instruction may be used to check if a register is empty.
Accessing "out of range" registers either as destination or source operands will result in an exception being thrown by the VM. There is no instruction that can be used to check if a register index is "in range".
Using an incorrect fetch mode for either destination or source operands will result in an exception being thrown by the VM. Sometimes several fetch modes are correct from the VM's point of view. It is the programmer's responsibility to ensure that the fetch mode which is right from their point of view is used. For example, for pointers either the "plain" or the "pointer dereference" fetch mode can be used, but the value supplied to the instruction will be different.
Values are instances of types supported by Viua VM.
From the user's point of view values are "whole pieces" - the VM does not provide instructions to access, for example, the third byte of a floating point value.
If a value is an instance of a compound type (e.g. a vector) the VM may expose instructions to access individual elements of the compound value, but each accessed element will be a Viua value.
There is no way a program may learn about the internal structure of a value short of invoking an FFI function and then casting the value into an unsigned char* to access the bytes that form the value.
Also, there is no concept of "memory" in the traditional sense. Viua VM programs do not request "eight bytes for an integer", they just request an integer. The concept of "memory" is abstracted away in Viua VM. Programs cannot put values in "memory", values can only exist in registers.
Viua programs see values as "values", not as "references to values", and this rule is applied consistently. It is probably most visible during function calls: vectors are passed by value, and neither decay into pointers (a la C arrays) nor are passed by reference (a la Python lists).
To avoid the frequent copying which would be implied by the no-references rule, many Viua instructions and mechanisms use move semantics instead. For example, function returns and exception throwing are always done by moving values.
The thinking behind this is that if you want a copy you can create it yourself before the move happens, but it is not easy the other way around, that is: if you wanted a move but the language by default gives you a copy, what can you do about it?
Moves in Viua are real moves instead of "steal-my-guts" moves. A moved value really is moved from one place to another, e.g. from one register to another, from a register into a slot in a vector, etc.
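The difference between a move and a copy can be shown with a small Python model. The helpers move and copy_value are hypothetical and do not mirror the signatures of the actual instructions; None marks an empty register.

```python
import copy

def move(registers, dst, src):
    # The value itself changes place; the source register ends up empty.
    registers[dst], registers[src] = registers[src], None

def copy_value(registers, dst, src):
    # An independent duplicate is created; the source keeps its value.
    registers[dst] = copy.deepcopy(registers[src])

local = [None] * 4
local[1] = [1, 2, 3]
copy_value(local, 2, 1)   # make the copy first, while the source still exists
move(local, 3, 1)         # then move the original away
local[3].append(4)        # changing the moved value does not affect the copy
```

This mirrors the reasoning given earlier: a copy can always be made before a move, but a move cannot be recovered from a default copy.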
Places where moves are not used are message passing and other forms of inter-process communication.
Although, in theory, values could be moved between processes without violating the no-sharing-between-processes rule, they are copied when sent as messages. This provides consistency: there are no differences between sending a value to a process running on the same VM instance (i.e. in the same underlying address space) and sending it over a network to a process running in a different VM instance (and a different address space).
The same thinking is applied when values are transferred between processes as exceptions or return values during process joins. Even though a value is normally moved in such cases, it is copied when the return or catch involves crossing process boundaries.
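Copy-on-send can be modelled like this in Python. The Process class, its mailbox, and the send helper are hypothetical stand-ins and say nothing about how the VM actually implements message queues.

```python
import copy
from collections import deque

class Process:
    """Hypothetical process with its own registers and a mailbox."""
    def __init__(self):
        self.registers = [None] * 4
        self.mailbox = deque()

def send(receiver, value):
    # The receiver gets its own copy; the sender keeps the original.
    receiver.mailbox.append(copy.deepcopy(value))

a, b = Process(), Process()
a.registers[1] = ["shared?", "no"]
send(b, a.registers[1])
b.registers[1] = b.mailbox.popleft()
b.registers[1].append("changed in B")   # invisible to process A
```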
Pointers in Viua point to values, not to locations. For example, a pointer to a vector can be taken, and when this vector is moved (e.g. to another register, or moved as a parameter to a nested call) the pointer is still valid. This is in contrast to pointers known from C or C++.
Pointers in Viua may be considered safe by virtue of being aware of the fact whether the value they point to exists or not. While it is not possible to create a null pointer, it is possible that the value to which the pointer has been taken was destroyed. In such an event the pointer becomes "expired" and dereferencing it produces an exception (which can be handled).
Pointers become invalid when either the value they point to is destroyed (then they are said to have expired), or when they are sent as a message to another process as pointers are only valid inside the process in which they were created. When a pointer crosses process boundaries any dereference of it will produce an exception.
Viua does not prevent pointers from being sent as messages, but enforces the isolation of processes by making dereferences of pointers taken in process A illegal in process B. If such dereferences were allowed, processes could be made to share values, which is not permitted in Viua.
It is possible to create pointers to pointers in Viua. Dereferencing must be done one nest-level at a time. This means that if a value is hidden behind a three-level pointer, three Viua instructions will be needed to get it.
However, it is impossible to make a pointer point immediately to itself, because pointers are taken to existing values and a pointer does not exist before its creation. Also, pointers cannot be rebound, so it is impossible to take a pointer and then rebind it to point to itself.
Description of basic (primitive and complex) data types available in Viua, and of the way user-defined types work in Viua.
All numeric data types support basic arithmetic operations, and can be compared with each other.
Integers have unlimited size, and are always signed. They are intended for arithmetic (basic math) operations.
Typical, 64 bit, floating point numbers. Fast enough but come with their set of imperfections.
Slower but more accurate than floating point numbers. No rounding issues known from floats, and with unlimited precision.
Represent true, or false values.
Can be obtained by comparison (e.g. eq), logic (e.g. and), and some other, rarer instructions (e.g. isnull).
Every Viua value is implicitly castable to the Boolean type.
By default a value is cast to false.
If a value should be cast to true, the reasons why are explained in the documentation for said data type.
Viua provides a basic text data type. Text is visible to users as a sequence of characters (each character being a Unicode codepoint). Internally, text is encoded as UTF-8.
Common simple operations are provided as instructions: equality comparison, substring extraction, obtaining the character at an index, extracting common suffixes and prefixes, and concatenation.
A byte string is a string whose elements are 8-bit unsigned integers. A bit string is a string whose elements are single bits.
Byte strings are useful for storing a variable amount of bytes. They are the typical unstructured bag-o'-bytes.
Byte strings may be freely converted to bit strings.
Bit strings are useful for storing bitmasks, and fixed-size signed or unsigned integers.
The VM provides instructions to perform (bitwise) or, and, not, and xor operations.
Shifts and rotates are provided in "naive" and arithmetic variants: bitshl, bitshr, bitrol, bitror, bitashl, bitashr, bitarol, bitaror.
Shifts modify bit strings in place, and the shifted-out part is produced as the output of the instruction.
Rotates modify bit strings in place.
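The in-place behaviour of the "naive" left shift and rotate can be approximated in Python on a list of bits. The function names are hypothetical, and the bit order (index 0 is the most significant bit) and the zero fill are assumptions made for this sketch; the arithmetic variants are not modelled.

```python
def shift_left(bits, n):
    """Logical left shift, in place; returns the bits pushed out."""
    n = min(n, len(bits))
    shifted_out = bits[:n]
    bits[:] = bits[n:] + [0] * n   # in-place modification, zero fill
    return shifted_out

def rotate_left(bits, n):
    """Left rotate, in place; nothing is lost, so nothing is returned."""
    if bits:
        n %= len(bits)
        bits[:] = bits[n:] + bits[:n]

word = [1, 0, 1, 1, 0, 0, 0, 1]
out = shift_left(word, 3)          # the three leading bits are produced

ring = [1, 0, 0, 1]
rotate_left(ring, 1)
```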
Access to individual bits and substrings is provided by the bitat and bitsbetween instructions.
Bit strings whose length is a multiple of 8 may be freely converted to Integers. When the conversion is performed, it is assumed that the bit string represents a big-endian integer.
Integers may be converted to bit strings without additional constraints. Big endian encoding is used when converting integers to bit strings.
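The big-endian convention can be illustrated in Python. The helper names are hypothetical, and the sketch assumes non-negative (unsigned) integers, since the text does not say how signs are handled here.

```python
def bits_to_int(bits):
    """Read a list of bits (most significant first) as an unsigned integer."""
    if len(bits) % 8 != 0:
        raise ValueError("bit string length must be a multiple of 8")
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value

def int_to_bits(value, width):
    """Write a non-negative integer as `width` bits, most significant first."""
    return [(value >> shift) & 1 for shift in range(width - 1, -1, -1)]

bits = int_to_bits(258, 16)   # 258 is 0x0102: bytes 0x01 then 0x02
round_trip = bits_to_int(bits)
```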
PIDs represent process identifiers. PIDs are used to send messages to processes.
Calling foreign functions from Viua code.
Calling Viua code from foreign functions.
Foreign functions calling Viua code should be split into two parts. After calling Viua code, the foreign function should return in order to free the FFI scheduler. Otherwise, some kind of suspension mechanism must be devised to allow suspending foreign functions mid-call while allowing the Viua code called from them to run.