=pod

=head1 NAME

oh

=head1 Command Line Interface:

Evaluate code:

    oh e "'oh... print"

Evaluate code and launch a REPL:

    oh e "'oh 24 var" i

Load files:

    oh file.oh file2.oh

Load files and launch a REPL:

    oh file1 i file2 i

Load files named i and e:

    oh f i f e

Launch a REPL:

    oh

=head1 stack and rpn

This is a stack-based concatenative language using reverse Polish notation. Operands come first, then operators.

    1 2 3 4 + + +

This pushes 1, 2, 3 and 4 on the stack. Then the first + takes the 3 and 4 from the stack, sums them and pushes the result on the stack. Now the stack is

    1 2 7

Then the second + executes, taking the 2 and 7 from the stack, summing them and pushing the result on the stack. Now the stack is

    1 9

The third + sums the 1 and 9 and pushes 10 on the stack.

This behavior is consistent across all the operations in the language, except for words that read from the source code. Most words take their arguments from the stack and return values by pushing them on the stack. Even if a word reads data from the source code, it can take additional arguments from the stack, and if it returns something, the result will be on the stack.

The word 'r' resets the stack, which is useful when working in the REPL.

    1 2 3

Now we have 1 2 3 on the stack.

    r

Now the stack is empty.

=head1 colon words

There are several ways to define functions in the language. Functions are called words in this language, since the interpreter just reads a word separated by spaces, tabs or newlines and evaluates it.

The most common way to define a word is the compiling word ':', which is why functions are also called colon definitions or colon words.

    : oh 1 2 3 ;

This defines a new word named 'oh' that, when executed, evaluates its code, which in this case pushes the numbers 1, 2 and 3 on the stack. In order to evaluate the word we just have to name it.

    oh

That would evaluate the word.

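
The stack evaluation and the ':' definition described so far can be modelled with a few lines of Python. This is only a rough sketch and not the Perl implementation of oh: the class name C<Interp> and its methods are hypothetical, and the model only understands integers, '+', 'r' and flat ': name ... ;' definitions without nesting.

```python
class Interp:
    """Toy model of a stack evaluator; not the real oh interpreter."""

    def __init__(self):
        self.stack = []
        # Built-in words: '+' adds the top two items, 'r' empties the stack.
        self.words = {"+": self._add, "r": self.stack.clear}

    def _add(self):
        right = self.stack.pop()
        left = self.stack.pop()
        self.stack.append(left + right)

    def _compile(self, tokens):
        """Turn tokens into a list of callables, resolving words right now."""
        ops = []
        for tok in tokens:
            if tok in self.words:
                ops.append(self.words[tok])
            else:
                # Anything that is not a known word must be an integer.
                ops.append(lambda n=int(tok): self.stack.append(n))
        return ops

    def run(self, source):
        """Evaluate whitespace-separated source and return a copy of the stack."""
        tokens = source.split()
        i = 0
        while i < len(tokens):
            if tokens[i] == ":":
                # ": name body ;" (flat definitions only, no nesting).
                end = tokens.index(";", i)
                name = tokens[i + 1]
                body = self._compile(tokens[i + 2:end])
                self.words[name] = lambda body=body: [op() for op in body]
                i = end + 1
            else:
                for op in self._compile(tokens[i:i + 1]):
                    op()
                i += 1
        return list(self.stack)


interp = Interp()
print(interp.run("1 2 3 4 + + +"))      # [10]
print(interp.run("r : oh 1 2 3 ; oh"))  # [1, 2, 3]
```

In this sketch an unknown word fails when it is compiled, because the token is neither a known word nor an integer. That is a simplification chosen for the model, not a statement about how oh reports errors.
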
The word is created in the current environment and goes through a simple compilation step that converts the code into a list of Perl subroutines. This compilation step allows the word to retain its definition: if the word being created depends on other words, it keeps the definition those words had at the time it was compiled.

    : meh 4 5 6 ;
    : oh 1 2 3 meh ;

So even if we redefine the word 'meh', the word 'oh' will retain the previous definition:

    : meh 7 8 9 ;
    oh

That would still push 1 2 3 4 5 6. If we wanted, we could redefine oh so that it picks up the new definition. Alternatively, we can use the variant of ':' that creates a colon definition without compiling it.

The word '::' is similar to ':', but does not compile the definition, so the word will always evaluate the latest definitions. Since it does not compile anything, the words it uses do not need to exist until the word is evaluated.

    :: oh 1 2 3 meh ;
    : meh 4 5 6 ;
    oh

That will push 1 2 3 4 5 6. If we redefine the word meh, oh will reflect the changes.

    : meh 7 8 9 ;
    oh

Now oh will push 1 2 3 7 8 9.

A colon definition can also be created at runtime with the word 'colon', which expects a name and a list of code to be evaluated.

    'oh (1 2 3) colon

It is equivalent to:

    : oh 1 2 3 ;

And:

    'oh (1 2 3) colon*

Is equivalent to:

    :: oh 1 2 3 ;

This means that colon creates a compiled word and colon* creates a word without compiling it. Note that ':' and '::' are what is known as immediate words, which means that they execute at read or compile time, while colon and colon* execute at runtime.

Colon definitions have their own environment, which provides lexical scope. Colon definitions can be nested, and the inner definitions become children of the outer ones: they exist only in the environment of the outer definition, and that environment is the parent of their own environment.


    : oh
      : meh 1 2 3 ;
      meh ;

The word meh is defined inside the environment of the word oh. It only exists there, so it will not be found outside it: trying to execute meh outside oh triggers an error saying that the word was not found.

The word meh is a child of oh and inherits from the scope of oh, which means that it can access all its siblings and the variables of oh.

    : oh
      : ah 1 2 3 ;
      : meh ah ;
      meh ;

Since we are using ':' instead of '::', the definition of ah has to appear before the definition of meh, so that ah can be found when meh is compiled. Changing the order will trigger an error:

    : oh
      : meh ah ;
      : ah 1 2 3 ;
      meh ;

That will say:

    compiling: atom not recognized ah

Because meh was defined before ah. We could use '::' so that the order does not matter.

    : oh
      :: meh ah ;
      : ah 1 2 3 ;
      meh ;

But then meh will not be a compiled word.

=head1 lambda

The word '{' is similar to ':' in the sense that it also compiles a colon definition, but instead of creating it in the environment it pushes it on the stack.

    { 1 2 3 } eval

Is almost equivalent to:

    : oh 1 2 3 ;
    oh

With the difference that '{' does not create a named word.

It might seem similar to evaluating a list:

    (1 2 3) eval

But lists do not have environments, while '{' does. Also, '{' compiles the code and can capture ':' and '::' definitions in its environment, which become children of '{'.

    {: oh 1 2 3 ; oh}

The word oh only exists in the scope of this lambda and will not be found outside. A list cannot capture ':' or '::' like '{' does.

    (: oh 1 2 3 ; oh)

The list will contain one element, which is the string 'oh', and the word 'oh' will be defined in the current environment, so it is actually equivalent to:

    : oh 1 2 3 ;
    (oh)

Just as ':' has the non-compiling variant '::', the word '{' has the non-compiling variant '['. The word '[' is similar to '::' because it does not compile the lambda.


    [ 1 2 3 ] eval

Is almost equivalent to:

    :: oh 1 2 3 ;
    oh

With the difference that '[' does not create a word in the environment, but pushes it on the stack instead, like '{' does.

There are also runtime versions that create a lambda from a list, similar to colon and colon*. These are lambda and lambda*.

lambda creates a compiled lambda like '{' does, but it does so at runtime and takes a list from the stack.

    (1 2 3) lambda

Is similar to:

    {1 2 3}

With the difference that lambda executes at runtime and '{' at compile time. Because it executes at runtime and uses a list as code, it cannot capture ':' and '::' (colon and colon* are not able to do that either).

The word lambda* does not compile the lambda, so it is similar to '['.

    (1 2 3) lambda*

Is similar to:

    [1 2 3]

But at runtime.

=head1 variables

The word var creates a variable by taking the name and value from the stack. In order to quote the name of a word we can use ' followed by the name of the word, or enclose the word between double quotes.

    'some-var 24 var
    "some-var" 24 var

Both will push the string 'some-var' on the stack, then push the number 24. The var word takes the string as the name of the variable and the number 24 as its value.

The word var also creates a new word named 'some-var', which we can use to push the value of the variable on the stack:

    some-var

This word is created automatically by the var word. We can also use the word get-var to push the contents of the variable on the stack:

    'some-var get-var

The word var, which creates the variable, can also be used to update its value, but this only works for the current environment.

    'some-var 42 var

To update a variable we can use set-var instead.

    'some-var 42 set-var

set-var will trigger an error if the variable was not found. The variable has to be defined first.

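
The difference between var and set-var can be pictured as a chain of environments. The following Python sketch is only a model under stated assumptions, not the actual implementation: the class C<Env> and its method names are hypothetical, and it assumes that get-var, like set-var, looks through the parent environments.

```python
class Env:
    """Toy model of an environment with an optional parent."""

    def __init__(self, parent=None):
        self.parent = parent
        self.vars = {}

    def var(self, name, value):
        # Like 'var': create or overwrite in THIS environment only.
        self.vars[name] = value

    def _find(self, name):
        # Walk from this environment up through its parents.
        env = self
        while env is not None:
            if name in env.vars:
                return env
            env = env.parent
        raise KeyError(f"variable not found: {name}")

    def get_var(self, name):
        # Like 'get-var' (assumed to search the parents as well).
        return self._find(name).vars[name]

    def set_var(self, name, value):
        # Like 'set-var': update wherever the variable lives, or fail.
        self._find(name).vars[name] = value


outer = Env()
inner = Env(parent=outer)
outer.var("some-var", 24)
inner.var("some-var", 42)          # creates a second variable in inner
print(outer.get_var("some-var"))   # 24
inner.vars.clear()
inner.set_var("some-var", 42)      # now updates the variable of outer
print(outer.get_var("some-var"))   # 42
```

The model raises C<KeyError> when set_var cannot find the variable, which stands in for the error that set-var triggers.
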
set-var updates a variable and tries to find it in the chain of parents, which means that we can mutate an existing variable as long as it is found in the current scope or in one of its parents.

After we define a variable, like in:

    'some-var 24 var

We can access it by using the shorthand word that var created for us or by using get-var.

    'some-var get-var

Is equivalent to:

    some-var

Because the var word creates the word named 'some-var' in the current environment, where the variable also resides.

There is a problem, though: var is not an immediate word, so the shorthand word it creates does not exist at compile time. It will not be found if we are compiling an inner definition that wants to access that variable using the shorthand form. So this code would fail:

    : oh
      'some-var 24 var
      : meh some-var ;
      meh ;

Because var is not immediate, it creates the shorthand word at runtime. When we compile the word meh and find the word 'some-var' in its definition, that word does not exist yet, and it will not exist until we evaluate the word oh.

There are two options. One is using get-var:

    : oh
      'some-var 24 var
      : meh 'some-var get-var ;
      meh ;

The other is declaring the variable at compile time with the word declare:

    : oh
      declare some-var
      'some-var 24 var
      : meh some-var ;
      meh ;

The word declare is immediate and takes the name of the variable to declare from the source code:

    declare some-var

Instead of:

    'some-var declare

It just reads the next word, initializes the variable to 0, and creates the shorthand word 'some-var' at compile time, so the interpreter will be able to find this word when compiling the code of the word meh.

If we do not plan to use the shorthand word, we can just use get-var, which takes the name of the variable from the stack.

Note that set-var and var are different. var creates a new variable in the current environment, while set-var updates an existing variable in whatever environment the variable was found.
If set-var does not find a variable, it will trigger an error instead of creating a new one.

    : oh
      'oh 24 var
      : meh 'oh 42 var ;
      meh
      'oh get-var ;

When evaluated, this word creates a variable named 'oh' in its own environment. The word meh also uses var. Since var only creates variables in the current environment, the variable in meh is created in the environment of meh, which means that we have two variables defined in different environments.

The word oh will push the value 24 when evaluated, since meh was not updating any variable but creating its own.

If we want meh to mutate the variable of oh, we use set-var:

    : oh
      'oh 24 var
      : meh 'oh 42 set-var ;
      meh
      'oh get-var ;

Now when oh is executed it will execute meh, which mutates the variable of the word oh, so oh will return the value 42 instead of 24.

=head1 temporary variables

Temporary variables are auxiliary variables that help manage stack elements. Since the stack can easily become hard to manage, temporary variables help avoid stack juggling.

Temporary variables have syntax sugar: a colon is prefixed or appended to the name of the temporary variable.

    24 :oh

This creates a temporary variable named 'oh' with the value 24. In order to push the value of that variable on the stack we append the colon instead:

    oh:

This will push the number 24 on the stack.

If we want to both set the temporary variable and push its value, we can add the colon on both sides.

    24 :oh:

That would set the temporary variable named 'oh' to the value 24 and also push the 24 on the stack.

Temporary variables cannot be updated by inner scopes. They are meant to store data temporarily, and if they are defined inside a colon word, they disappear once the colon word ends execution. Inner definitions are able to use the temporary variables of the outer words as long as they execute inside the context of the outer word.


    : oh
      24 :oh
      : meh oh: ;
      meh ;

This will push the value 24, since meh is an inner definition and was executed inside the code of oh. Once oh finishes execution, the temporary variable ceases to exist. Also, meh will never be able to update that variable.

    : oh
      24 :oh
      : meh 42 :oh ;
      meh
      oh: ;

The value will still be 24, because meh created its own temporary variable in its own environment.

We can also use the word 'temp' to create a new temporary variable, in a similar way to 'var', and 'get-temp' to read it, in a similar way to 'get-var'.

    : oh
      'oh 24 temp
      'oh get-temp ;

Temporary variables are only removed when they live in the environment of a colon definition. That means that if we create a temporary variable outside any definition, it will persist.

    24 :oh
    : oh oh: ;
    oh

That temporary variable was defined in the root environment, and we are not evaluating any colon definition that has the root environment as its own environment. The word oh we define has its own environment, which inherits from the root environment but is a new environment, not the root environment. So the temporary variable we defined will persist.

We could manipulate the environment of a colon definition and set the root environment as the environment of that colon definition.

    24 :oh
    :: oh oh: ;
    'oh find-word 'env root set
    oh

We are creating a non-compiled colon definition with '::', which is just a hash table that we can push on the stack. The word 'find-word' pushes the word oh on the stack, 'root' pushes the root environment on the stack, and 'env root set sets the environment of the word oh to the root environment. This means that when oh finishes execution, the interpreter will remove all the temporary variables of the environment of oh, which we have set to be the root environment.

Note that 'root' is not actually a word, but a special operator, so we can always find the root environment unless we create a word named 'root' that shadows this operator.

=head1 lists

Lists are created by the word '(' or the word 'list'.

The word '(' reads elements until it finds a terminating ')' and pushes a list with those elements on the stack.

    (oh my cat is very nice)

That will push a list on the stack with those elements as strings. '(' is an immediate word, which means that it executes at read and compile time and executes other immediate words when it finds them. Since it is an immediate word, if it finds another '(' it will evaluate it, which gives it a recursive nature.

    (oh (my (cat is) very) nice)

The word '(' is also a special character detected by the reader, which means that when the interpreter reads a word and finds a '(' or ')', it returns them as a complete word, not requiring any surrounding space.

The word list expects a number on the stack and creates a list by taking that number of elements from the stack.

    1 2 3 3 list

The word list receives the number 3, which tells it to take 3 elements from the stack and build a list with them. In this example it returns the list (1 2 3).

Lists support interpolation through the word '~' and special directives in the elements of a list.

The word '~' takes a list from the stack and iterates recursively over all its elements, looking for special symbols or syntax that make it perform certain actions.

The comma ',' inside a list makes '~' insert an element from the stack into the list.

    3 (1 2 ,) ~

That would return the list (1 2 3).

If the comma precedes a word, '~' will evaluate that word and insert one element from the stack.

    : oh 3 ;
    (1 2 ,oh) ~

This evaluates the word 'oh' and inserts the last element on the stack. Since oh pushes a 3 on the stack, the list will be (1 2 3).

We can also reverse the current list at any point by using '!':

    (1 2 3 !) ~

That returns (3 2 1).

We can flatten a list inside the list:

    (1 2 3 (4 5 6) @) ~

Returns (1 2 3 4 5 6).

And also get a list from the stack and flatten it:

    (4 5 6) (1 2 3 ,@) ~

Returns the same list as before.

If we add a word name, it will evaluate the word, which should return a list, and then flatten it.

    : oh (4 5 6) ;
    (1 2 3 ,@oh) ~

The '#' directive evaluates a list inside the list and inserts the last element from the stack.

    (1 2 (1 2 +) #) ~

Returns (1 2 3).

=head1 hash tables

Hash tables are created from a list:

    (oh my cat is very nice) hash

The word 'hash' expects a list of key/value pairs. It will trigger an error if the number of elements in the list is not a multiple of two.

=head1 get and set

To get an element from a list or hash table we can use get.

    (1 2 3) 0 get

Will push the first element, which is 1.

It works the same with hash tables:

    (oh my) hash 'oh get

The word set sets an element in a list or hash table.

    (oh my) hash :oh: 'meh 24 set oh:

Now the hash is

    { oh: my, meh: 24 }

Both set and get accept negative indices in lists, which access or set the element counting from the end.

    (1 2 3) -1 get

Will push the last element, which is 3.

set and get have syntax sugar. For accessing elements we can use the dot notation '.':

    (1 2 3) 0 get

Is equivalent to:

    (1 2 3) .0

And:

    (oh my) hash .oh

Is equivalent to:

    (oh my) hash 'oh get

For setting elements we can use '!' instead:

    (1 2 3) 24 !0

Is equivalent to:

    (1 2 3) 0 24 set

And:

    (oh my) hash 24 !meh

Is equivalent to:

    (oh my) hash 'meh 24 set

=head2 persistent hash tables

Persistent hash tables are stored in a Berkeley DB database (through the DB_File Perl module) and are used like normal hash tables. They are created by the word 'magic-hash', which expects the filename of the database to create or tie.

    'some.db magic-hash

That will return a hash table that is tied to the Berkeley database named 'some.db'.

We can use that hash table as a normal one without any differences, except that now the values are persistent.

Those hash tables are managed by the Perl tie mechanism, which syncs to the database when the hash table is no longer referenced, for example when it leaves the stack and we did not save it anywhere. In order to force it to sync with the database, we can use 'magic-flush':

    'some.db magic-hash :db
    db: 24 !oh
    db: magic-flush

That will force the database to sync with the hash table contents.

=cut