Slang Programming Reference
---------------------------

2.0 Overview: compiler, assembler, linker
2.1 Xlang vs. Slang (Cross-compiling)
2.2 Compiler
    2.2.1 Numbers and misc
    2.2.2 Variable types and declarations
    2.2.3 Strings
    2.2.4 Subroutines
    2.2.5 Operators
    2.2.6 Functions
    2.2.7 Flow control, loops
    2.2.8 Printing and IO commands
    2.2.9 Misc commands
    2.2.10 Interrupts
    2.2.11 VIC commands
2.3 Assembler
2.4 Linker


2.0 Overview: compiler, assembler, linker

From a programming perspective, Slang consists of a compiler, assembler,
and linker.

The compiler is a full-featured compiler, including subroutines, loops,
etc., and has a fairly straightforward syntax. It also contains many
6502-specific features. The explanations below assume a familiarity with
compiled languages (like C or Pascal).

The assembler is a full, powerful 65816 assembler; in fact, it is the
assembler used to write Slang. It is possible to turn the compiler off
and use the pure assembler (without label conflicts and such), and it is
possible to mix and match Slang and assembly. The section on the
assembler is for assembly programmers who want to get the most out of it.

The linker is a special program which is used for large projects (for
small projects it's not needed). When using the linker, programs are
broken up into smaller, more manageable modules. Modules, and the
connections between them, are then brought together by the linker to
make the final program. Using reusable linker modules instead of a huge,
monolithic file keeps the compile times down and maintains programmer
sanity (whatever was there to begin with, that is).


2.1 Xlang vs. Slang (Cross-compiling)

For cross-compiling purposes, the xlang program is used. "Xlang" is
actually a 65816 emulator, with a little bit of C64 and SCPU emulation
thrown in, that runs the Slang program on a PC. Xlang is written in C
and is fairly portable.
There are a few little quirks to using Xlang:

- Xlang programs use _ (underscore) in place of <- (backarrow) for
  subroutine variables.

- The first two bytes of a standard C64 file are the program address.
  When using PUT files, be sure to have a few blank lines at the
  beginning of the file.

- ummm...

Now is a great time to point out that I was keeping all my notes on the
old BBS, which I lost after it was hacked, which means I'm having to
leave some stuff out of these docs until we re-discover them! In all of
these docs I've tried to remember everything, but there are probably
things I've forgotten (it has been 2 1/2 years, you know), so please
email me if you encounter something strange or have any questions.


2.2 Compiler
------------

Speaking in general, a program starts with variable declarations
followed by program statements. The only real rule is that variables
must be declared before using them.

Slang writes programs like an assembler. Many times in the docs which
follow, it will help to think of how it would be done in assembly,
because that's how Slang more or less does it.

For example, the structure of a program goes like a main program
followed by subroutines. There is no "main" declaration -- like an
assembler, statements are compiled in the order they appear, and the
program will start running on the first line (or wherever you SYS to).

Variable scoping is straightforward: variables declared in the main
program area are global variables, accessible by subroutines as well as
the main program. Variables declared within subroutines are local,
although subroutine variables can be made "public" -- more on this
below, in the subroutine section. When using assembly code with Slang
variables, first the local scope is checked, then the entire variable
space is checked and the first match is used.

The compiler automatically performs type conversion in expressions,
using the general rule of "promoting types up".
For example, if you add a byte to a float, the byte will be converted to
a float before addition, even if the eventual result will be a byte.

Temporary results: When evaluating expressions, the compiler makes use
of temporary locations for three separate events: numerical operations,
string operations, and zero-page operations. By default, the temp zp
stack starts at $02, the temp eval stack starts at $0110, and the temp
string buffer is at $0200. These may be changed using the SetZPTemp,
SetEvalTemp, and SetStrBuf commands, for programs needing a different
memory arrangement.

Optimization: Without going into too much detail, the compiler performs
local (peephole) optimization. In general, the compiler looks backwards,
not forwards -- it doesn't see what is coming, but to some extent it
does see where it was. Moreover, optimization is only applied to
individual expressions -- basically, one line at a time. So when you
look at the code produced you will sometimes see some odd constructs, or
some code that looks like an obvious optimization, and that's why.

The compiler will output both error messages (things like syntax errors,
which are fatal) and warning messages (like using fewer array dimensions
than expected, which are not), and will try to let you know where the
problem is.


2.2.1 Numbers and misc
----------------------

Because individual commands change with type, or are added, these
sections are going to be a little general: for the list of specific
commands, see the "quickref_tiny" document, which will stay updated with
all the latest commands.

In Slang -- like assembly -- the semicolon is used for comments. You can
also use * if it's in the first column, another assembly convention.

Slang reads one line at a time. There is no line terminator; there are
no multiple statements per line.
Sometimes, of course, it's nice to break up a long line across multiple
lines, and the & character is used for this:

        a = a + &
            b           ; break the line across two lines

Again like assembly, Slang accepts numbers in decimal, hex, and binary:

        a = 32          ; decimal
        a = $20         ; hex
        a = %100000     ; binary

Floating-point variables use the BASIC routines, so all the usual
conventions apply, including e-notation (1.2e-3).

Strings in Slang are specified using either ' or " -- there is no
difference. Using one allows the other to be embedded:

        a$ = "don't do that"    ;the ' doesn't break the string

To embed specific chr$ codes in a string, ! is used as the escape
character:

        a$ = !5"hola"!13        ;like chr$(5)+"hola"+chr$(13)

The following special escape codes are also defined:

        !s      ;use screen codes instead of PETSCII codes
        !r      ;use regular (PETSCII) codes
        !n      ;negate (ora #$80) following codes
        !z      ;do not null-terminate string

Finally, Slang is case insensitive: everything is converted to
lower-case.


2.2.2 Variable types and declarations
-------------------------------------

In Slang, a variable is defined by:

- whether it is fixed or float
- whether it is signed or unsigned
- how many bytes it occupies

At the time of this writing, the following types are supported:

        byte/ubyte      ;signed/unsigned byte
        int/uint        ;signed/unsigned int (2-bytes)
        float           ;five byte MFLPT

Variables are declared the way you'd expect:

        byte b,c,d      ;can have a list of vars
        uint blah

Normally a variable is allocated at the end of the code; in an assembly
program, the above would be like placing

b       dfb 00          ;alloc one byte, assign address to b
c       dfb 00
d       dfb 00
blah    ds 2            ;alloc two bytes

at the end of the code.

What's cool is that Slang also allows you to specify the variable
address if you want:

        ubyte border@$d020      ;create a 1-byte variable, located at 53280
        uint irqvec@$0314       ;create a 2-byte var at irq vector location
        uint freq1@$d400        ;you get the idea...

This type of variable is referred to as an @-var, for obvious reasons.
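
For readers more used to decimal addresses, here is a small Python sketch
(not part of Slang) that converts the decimal, $hex, and %binary integer
notations described in 2.2.1, and prints the @-var addresses above in
decimal. The function name parse_slang_number is made up for this
illustration, and the sketch ignores floats and e-notation.

```python
def parse_slang_number(text):
    """Convert a Slang-style integer literal to an int.

    Handles plain decimal, $ for hex, and % for binary.
    (Hypothetical helper for illustration only.)
    """
    text = text.strip()
    if text.startswith("$"):
        return int(text[1:], 16)
    if text.startswith("%"):
        return int(text[1:], 2)
    return int(text, 10)


# The three spellings of the same value from section 2.2.1.
print(parse_slang_number("32"),
      parse_slang_number("$20"),
      parse_slang_number("%100000"))

# Addresses used in the @-var examples, shown in decimal.
for name, addr in [("border", "$d020"), ("irqvec", "$0314"), ("freq1", "$d400")]:
    print(name, parse_slang_number(addr))
```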
;
; Arrays
;

Declaring an array of some variable type is also simple:

        byte b(20,30)   ;2D array of bytes

Arrays are done in "row, column" order, which means that when combined
with @-vars you can do things like

        byte screen(25,40)@$0400        ;treat screen as 2D array of bytes

Array indices start at zero. This means that array indices go from
0..N-1, and you can get in trouble if you stop paying attention:

        ubyte blah(10)

        blah(0) = 1     ;1st array element
        blah(9) = 10    ;Last array element is 9, not 10
        blah(10) = 0    ;Oops -- out of array, overwriting another variable

No array boundary checking is done (at this time -- maybe a future
option for debugging) which means you can happily overwrite other
variables if you exceed array dimensions.

Another array quirk is that you don't have to specify all array
dimensions when using a multidimensional array:

        byte screen(25,40)@$0400

        screen(3)=0     ;treated as screen(3,0)

The compiler will generate a warning, but will just treat everything
else as zero.

Note that, as with variables, arrays are allocated at the end of the
code if you don't use an @-var. Thus, large arrays can lead to large
binaries.

;
; Pointers
;

Pointers are one of those things that every CS major has to get confused
by, and that take a while to get used to. Fortunately, if you have some
6502 programming experience, you're already familiar with them. Consider
the following assembly program:

zp      = $fe

        lda #$00
        sta zp
        lda #$c0
        sta zp+1        ;set the value of zp to $c000
        ldy #00
        lda #$ff
        sta (zp),y      ;set the indirect value of zp to $ff

The idea is that zp is an indirect pointer: there's where it points, and
what it points to. Same with Slang pointers; in fact, the above is
pretty much how they are implemented.
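
If a model helps, the array addressing and the (zp),y store above can be
sketched in Python. This is only an illustration under stated
assumptions: memory is a flat 64K bytearray, elements are laid out
row-major (inferred from the "row, column" order and the screen
example), and the helper names array_address and store_indirect are
invented here, not Slang internals.

```python
mem = bytearray(65536)  # flat 64K address space standing in for C64 RAM


def array_address(base, dims, indices, elem_size=1):
    """Address of an element in a row-major array (assumed layout).

    Missing trailing indices are treated as zero, mirroring the
    screen(3) -> screen(3,0) behaviour described above.
    """
    indices = list(indices) + [0] * (len(dims) - len(indices))
    offset = 0
    for dim, idx in zip(dims, indices):
        offset = offset * dim + idx
    return base + offset * elem_size


def store_indirect(zp, y, value):
    """Model of STA (zp),y: read a 16-bit address from zp/zp+1, add Y, store."""
    target = (mem[zp] | (mem[zp + 1] << 8)) + y
    mem[target & 0xFFFF] = value & 0xFF


# byte screen(25,40)@$0400 : where does screen(12,5) live?
print(hex(array_address(0x0400, (25, 40), (12, 5))))

# The assembly example: point zp at $c000, then store $ff through it.
ZP = 0xFE
mem[ZP] = 0x00          # low byte of $c000
mem[ZP + 1] = 0xC0      # high byte of $c000
store_indirect(ZP, 0, 0xFF)
print(hex(mem[0xC000]))
```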
        ubyte ^test             ;create a 16-bit pointer to a byte
        float ^^blah            ;create a 24-bit pointer to a float
        byte ^yak(10,20)        ;create a pointer to an array

Just as in the example above, you can set where it points, and what it
points to:

        test = $c000    ;Set the address to $c000
        ^test = $ff     ;Set the indirect value: (test)=$ff

The procedure using arrays is similar, with one exception:

        yak = $8000     ;Set the array base to $8000
        yak(3,8)=5      ;Use as normal: no leading ^

namely, the leading ^ is not used -- an array is already a pointer. By
using pointers it is possible to create "movable arrays", i.e. the base
address may be moved around.

To create a 24-bit pointer, a double up-arrow is used: one arrow for
16-bits, two arrows for 24-bits. This is true during definition as well
as during de-referencing:

        ubyte ^ptr16    ;16-bit pointer
        ubyte ^^ptr24   ;24-bit pointer

        ^ptr16 = 10     ;use (zp),y addressing mode
        ^^ptr16 = 10    ;use [zp],y addressing mode
        ^ptr24 = $ff    ;use (zp),y addressing mode

Note in the above that it is possible to use 16-bit addressing mode with
a 24-bit pointer (in which case the bank is discarded), and it is
possible to use 24-bit addressing with a 16-bit pointer (in which case
the bank is forced to zero).

Normally, the way a pointer works is that the address contained in the
pointer variable is copied to zero page, and that zero page location is
then used as an indirect pointer. To skip the first step, a pointer may
be declared explicitly in zp as an @-var:

        ubyte ^ptr@$fe

;
; Compound variables
;

In Pascal it's a record; in C it's a struct. In Slang it's a "compound
variable". The idea is to create a new variable type that contains
multiple basic variable types, and the syntax is straightforward:

        deftype blah    ;create a new type "blah"
          int .a        ;note leading period
          byte .b(10)
          float .x
        defend          ;end of definition

This creates a new type -- not a new variable.
Once the type is created, new variables of that type are defined using

        type blah var1, test2   ;create two variables of type "blah"

which contain all the member types:

        var1.a = 10
        test2.b(3) = -13

Currently, pointers do not work with compound variables (that is,
pointers can be inside a compound variable, but you cannot declare a
pointer to a compound variable at this time), and similarly compound
variables may contain arrays, but you cannot create an array of compound
variables.

;
; Varblock
;

Sometimes, when writing code, it's nice to locate all variables in a
common area without having to specify the address of each and every
variable using @-addressing. For example, a block of common variables
might be shared between two separate programs. In Slang, this can be
done using the varblock command:

        Varblock @$c000 ;Define following variables starting at $c000
          ubyte blah
          int yak
        EndVarblock

Varblock is like changing the ORG for the variable table; instead of
being allocated at the end of the code, variables are allocated starting
at the varblock address. Varblock may be used as many times as needed;
it's a bookkeeping convenience.

;
; Variable addresses
;

To get the address of a variable, place a # before the variable name:

        int b,c

        c = b           ;set c to the _contents_ of b
        c = #b          ;set c to the _address_ of b

This is analogous to assembly language (as usual):

        lda b           ;lda with the contents of b
        lda #b          ;lda with the address of b

This type of mode can be used in expressions

        c = #b+6

and works with compound variables and arrays as well:

        ubyte screen(25,40)@$0400
        uint row

        row = #screen(12)       ;set row = address of screen row 13

(Remember that array dimensions start at zero...).


2.2.3 Strings
-------------

In Slang, strings are simply a byte array. (Just like -- ta da --
assembly.)
The difference is purely in how the array is addressed:

        byte str(20)            ;declare an array of bytes

        str(3) = 1              ;treat it as a normal array
        str$ = "Hey!!"!13       ;Using $ treats it as a string

As before, !13 embeds a chr$(13) into the string. Strings in Slang are
null-terminated, so after executing the above code str will look in
memory like

        |H|e|y|!|!|13|00

As usual, make sure you've allocated enough room for that terminating
zero.

It is possible to specify substrings, and to use strings in comparisons:

        if str$(2:4) = "y!!"    ;compare substring range
          str$(2) = "blah"      ;write a string, starting at offset 2
        endif

Executing the above code results in

        |H|e|b|l|a|h|00

Strings are concatenated using the + operator, and as stated earlier the
following special escape codes are available for embedded strings:

        !s      ;use screen codes instead of petscii
        !r      ;use regular (petscii) codes
        !n      ;negate (ora #$80) following codes
        !z      ;do not null-terminate string


2.2.4 Subroutines
-----------------

Slang implements subroutines the 6502 way. Parameters are passed
directly; instead of passing parameters on the stack, they are copied
directly to the relevant memory locations. Return parameters are
explicitly defined, and again copied directly. Therefore, things like
recursion are not supported directly: you'll have to push stuff
yourself. On the plus side, as will be seen, this makes it very easy to
interface with existing ML routines.

A subroutine definition begins with the "sub" keyword and ends with the
"endsub" keyword:

        sub blah()      ;Subroutine with no input or output variables
          ubyte b
          int c

          do something
        endsub

The "endsub" command is really just an rts; the compiler won't complain
if you leave it out. The parentheses () are required, even if no input
variables are specified. Variables declared within a subroutine are
_local_, i.e. visible only within the subroutine.
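
To see why fixed-location parameters rule out naive recursion, and what
"push stuff yourself" amounts to, here is a small Python model. It is a
sketch, not Slang: the slots dict stands in for a subroutine's fixed
parameter and return locations, and the names fact_static, fact_pushed,
and run are invented for the illustration.

```python
# Fixed storage standing in for a subroutine's parameter/return locations.
slots = {"n": 0, "result": 0}
stack = []  # explicit stack that the programmer manages by hand


def fact_static():
    """Naive recursion with a single fixed copy of n: nested calls clobber it."""
    if slots["n"] <= 1:
        slots["result"] = 1
        return
    slots["n"] -= 1                  # "pass" n-1 by overwriting the only copy
    fact_static()
    slots["result"] *= slots["n"]    # n no longer holds this call's value


def fact_pushed():
    """Same routine, but saving n on an explicit stack around the nested call."""
    if slots["n"] <= 1:
        slots["result"] = 1
        return
    stack.append(slots["n"])         # push our own n
    slots["n"] -= 1
    fact_pushed()
    slots["n"] = stack.pop()         # restore it after the call returns
    slots["result"] *= slots["n"]


def run(routine, n):
    """Load the input slot, call the routine, read the output slot."""
    slots["n"], slots["result"] = n, 0
    routine()
    return slots["result"]


print(run(fact_static, 4))   # wrong answer: n was overwritten
print(run(fact_pushed, 4))   # correct factorial
```

The first call gives the wrong answer because every level of the
"recursion" shares one copy of n; the second saves and restores it by
hand, which is the kind of bookkeeping a stack-based language would do
for you.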
Input variables are specified within the parentheses:

        sub blah2(int x, byte r)        ;Subroutine with two input vars

and, naturally, the subroutine is called using

        blah2(12,5)

The first var is copied to subroutine variable "x", and the second is
copied to subroutine variable "r".

Subroutines don't return variables per se, but instead it is possible to
declare "output variables":

        sub blah3(int z)<-byte s, int t

The above declaration has one input variable (z) and two output
variables (s and t). A piece of code making use of this subroutine might
look like

        blah3(12)
        x = blah3<-s
        y = blah3<-t + 10       ;can use in expressions like any variable

That is, the subroutine is called, and the return variables are then
accessed. It is possible to combine statements, as

        x = blah3(12)<-s        ;call before accessing return var
        y = blah3<-t            ;don't call; just access return var

This is one of those "make sure you know what you're doing" things. The
first statement, because of the () parentheses, calls blah3 before
accessing the return variable. The second statement, because it doesn't
use (), doesn't call the subroutine.

When used in an expression, the first return parameter is the default
value:

        x = blah3(12)   ;use first return parameter as default value

As it turns out, "return" variables aren't really just return variables.
You can set them _before_ calling a subroutine as well:

        blah3<-s = 10
        blah3(13)

What's really going on with return parameters is a scoping issue.
Normally, variables created within a subroutine are local to the
subroutine, and hence not visible outside the subroutine. Return
variables are subroutine variables that are accessible outside of the
subroutine: you can read from them _and_ write to them. If it helps,
it's analogous to a "public" variable versus a "private" variable.

;
; Regvars
;

Subroutines can also pass parameters directly through 6502 registers.
In Slang this is termed a "regvar":

        sub blah4(@ax)

The above declaration says to pass one input var in the .A (lo) and .X
(hi) registers. The following regvars are available:

        @a              ;8-bit
        @x
        @y
        @ax  @xa        ;16-bits
        @ay  @ya
        @xy  @yx

with the convention that the first reg contains the low byte and the
second contains the high byte, i.e. @xy will place the low byte in x and
the high byte in y.

;
; Subroutine interfaces
;

Slang implements subroutines the way a 6502 programmer would. This means
in turn that it can call existing ML subroutines in a very
straightforward way. This is done using a "subroutine interface":

        sub asm chrout@$ffd2(@a)

The "asm" keyword denotes that this is a subroutine interface. It
contains no code; its whole purpose is simply to tell Slang how to call
an outside routine: where to copy stuff, where to call, where to find
return variables.

Here's a more concrete example. Years ago I wrote a set of bitmap
drawing routines called "grlib". These are ML routines for drawing lines
and circles, clearing the bitmap, etc. using a jump table for the
different routines. Now Slang does include an assembler, so it's
possible to call them directly using asm. But interfaces provide a much
cleaner, slangy way of doing it; here's the entire set of grlib
interfaces:

        sub asm InitGr@$c000()
        sub asm SetScreenOrg@$c003(@x, @y)
        sub asm GRON@$c006(@a)
        sub asm GROFF@$c009()
        sub asm GrSetColor@$c00c(@a)
        sub asm GrMode@$c00f(@x)
        sub asm GrSetBuf@$c012(@x)
        sub asm GrSwapBuf@$c015()
        sub asm GrPlot@$c018(int x1@$02, int y1@$04)
        sub asm GrPlotAbs@$c01b(int x1@$02, int y1@$04)
        sub asm GrLine@$c01e(int x1@$02, int y1@$04, &
                             int x2@$06, int y2@$08)
        sub asm GrCircle@$c021(int x1@$02, int y1@$04, &
                               ubyte radius@$10)

That's it -- the entire library is now usable from Slang using commands
like

        GrCircle(20,30,r)

as if the command was a Slang command all along.
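
Roughly speaking, a call through such an interface has to copy each
argument to the address named in the declaration and then jump to the
routine. The Python sketch below models that bookkeeping for GrCircle
and for splitting a 16-bit regvar into two registers. It is an
illustration only: the helper names are invented, memory is a plain
bytearray, and storing ints as two's-complement low byte first is an
assumption based on normal 6502 convention rather than something taken
from the Slang source.

```python
mem = bytearray(65536)  # stand-in for C64 memory


def store_int16(addr, value):
    """Store a 16-bit value low byte first (assumed 6502 convention)."""
    value &= 0xFFFF                 # two's complement for negative values
    mem[addr] = value & 0xFF
    mem[addr + 1] = value >> 8


def split_regvar(value):
    """Split a 16-bit value into (first reg = low, second reg = high),
    as for @ax or @xy."""
    value &= 0xFFFF
    return value & 0xFF, value >> 8


def prepare_grcircle(x1, y1, radius):
    """Copy GrCircle's parameters to the zero-page addresses named in its
    interface and return the address that would then be called."""
    store_int16(0x02, x1)           # int x1@$02
    store_int16(0x04, y1)           # int y1@$04
    mem[0x10] = radius & 0xFF       # ubyte radius@$10
    return 0xC021                   # GrCircle@$c021


target = prepare_grcircle(20, -30, 7)
print(hex(target), list(mem[0x02:0x06]), mem[0x10])
print(split_regvar(0x1234))
```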
Let's examine the interface for GrCircle:

        sub asm GrCircle@$c021(int x1@$02, int y1@$04, &
                               ubyte radius@$10)

The sub asm tells Slang to create a subroutine interface called
GrCircle. The @$c021 says that the routine is located at $c021. The
routine has three input parameters: two signed 16-bit values, and an
unsigned radius value. These parameters are located in zero page at $02,
$04, and $10 respectively.

Slang now has everything it needs to know: where to copy variables, what
type they are, and what routine to call. Looking at some of the other
interfaces above you'll see that some use regvars, and some have no
parameters at all. Using this technique it's possible to call pretty
much any existing 6502 routine in a natural way.


2.2.5 Operators
---------------

There are actually several different kinds of operators in Slang. The
first type is the "assignment operator". There used to just be one
assignment operator, "=":

        a=1

As of Slang v1.3, there are increment and decrement operators,

        a++
        b--

along with shorthand operators +=, -=, *=, /=

        b+=a            ;like b = b+a

In general, these new operators produce more efficient (faster and
smaller) code than the longer versions. Note that the ++ and --
operators do not (currently) work inside of expressions (things like
c=b++).

There are bitwise operators, such as "bitand", "bitor", etc.

There are regular math operators, such as +, -, *, etc. Integer
multiplication and division are done using the core library; if you
don't include the core library in some way, they will generate an error.
Floating point routines use the BASIC routines, and hence require that
BASIC is switched in (and floats are the only part of Slang that are
really C64 specific).

There are shift operators, << and >>, which are only used for
fixed-point variables.

There are comparison operators, like <, >=, etc. These return 0 or 1,
which can be used in an expression:

        x = y>3         ;x=0 or x=1

Finally, there are logical comparison operators "and", "or", "eor".
These are actually identical to the bitwise operators but have the
lowest precedence, so that

        if x>y and y>3

works the way it should.


2.2.6 Functions
---------------

The standard BASIC math functions like sin() and cos() are available and
work the way you'd expect:

        x = sin(y)

These all use the BASIC routines, and as such are done using floating
point.

The function "not" is also available, which negates a comparison, i.e.
converts 0 to 1 and nonzero to zero.

All of the basic types perform type conversion functions as well:

        int a,c
        float x

        a = x+c         ;promote c to float, add to x, convert to int
        a = int(x)+c    ;convert x to int, integer add to c

That is, byte(), ubyte(), int(), uint(), float() may be used to perform
explicit type conversions.

Finally, some misc functions are available for things like VIC
functions. For example, the function "SpriteColSpr" may be used in
statements to check for sprite collisions.


2.2.7 Flow control, loops
-------------------------

The standard flow control statements like if, while, repeat, etc. are
all available. Normally, a statement like

        if (expression)

is parsed as "if expression is true, then". The first twist is that !
may be used to flip this around:

        if! (expression)

is parsed as "if expression is _false_, then...". Why not just use

        if not(expression)

instead? The reason is that the not() command involves some extra
computations, whereas if! simply changes the branching opcode (BEQ vs
BNE). That is, it generates more efficient code.

Along similar lines, the compiler in general has no a-priori way of
knowing whether to use a branch or a JMP, and so has to use a JMP by
default. This can lead to inefficient code. When writing assembly, a
programmer normally uses a branch and if the assembler generates an
error he changes it to a JMP. In Slang, you can place a + after these
commands to try using a branch instead of a JMP:

        if+ expression
          ...do stuff
        endif

If the compiler generates a "bad branch" error, you then remove the +.
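
For background on why a branch can fail where a JMP cannot: 6502
relative branches carry a signed one-byte offset, measured from the
instruction after the two-byte branch, so the target must lie within
-128..+127 bytes of that point. The Python sketch below checks that
range. It is an illustration of the 6502 rule, not of Slang's code
generator; that the "bad branch" error corresponds to exactly this check
is an assumption, and branch_offset is a made-up name.

```python
def branch_offset(branch_addr, target_addr):
    """Offset byte for a 6502 relative branch, or None if out of range.

    The offset is relative to the address of the instruction that
    follows the two-byte branch instruction.
    """
    offset = target_addr - (branch_addr + 2)
    if -128 <= offset <= 127:
        return offset & 0xFF        # encode as an unsigned byte
    return None                     # too far: a JMP is needed instead


print(branch_offset(0x1000, 0x1010))   # short forward branch: fits
print(branch_offset(0x1005, 0x1000))   # short backward branch: fits
print(branch_offset(0x1000, 0x1100))   # a page away: does not fit
```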
Both modifiers ! and + are simply code-optimization features. They are
optional, so if none of this makes sense, don't worry about it. They may
also be used together, so that

        if!+ expression

is valid. (if+! is also valid.)

The test-case structure is available as a shorthand for if/elsif type
stuff. It's a little different than the switch/case type of thing in C,
and goes like the following:

        testall         ;test ALL of the cases which follow
          case expr1
            do something
          case expr2
            do something else
          ...
        endtest

The above is identical to

        if expr1
          do something
        endif
        if expr2
          do something else
        endif

The key difference here from switch/case is that expr1 and expr2 can be
full, totally independent expressions.

If "testelse" is used instead, it's like

        if expr1
          do something
        elsif expr2
          do something else
        ...
        endif

i.e. the cases are exclusive, instead of checking all cases.


2.2.8 Printing and IO commands
------------------------------

Printing is straightforward and works more or less the way you'd expect.
The "sprint" command -- for string print -- does not require any
libraries and can be used for printing simple strings. "sprintln" will
add a chr$(13) at the end, and you can print to a specific row/column
using

        sprint(10,3) "yo!"

The more powerful command is print/println. This command requires the
core library, which means you have to "PUT" the library in or else link
to it using the linker. The print command will print strings, numbers,
expressions, the works.

When printing a series of expressions, commas or spaces may be used to
separate expressions. That is,

        print "x="x
        print "y=",y

are both valid -- whichever looks better to you is the one to use.

As with sprint, you can move to a specific part of the screen via

        println(12,3) !5"yo!"

;
; Input
;

Slang has commands for inputting characters, strings, and numbers. The
simplest commands are the character commands:

        getchar b       ;get char, don't wait.  Think "GET" in basic.
        waitchar b      ;wait for char, return in b
        waitchar        ;wait for a keypress

These get, obviously, a single character.

The next step up are the full blown input commands. The first, cheesy
command is

        input b$        ;input a string

This uses the kernal INPUT routine, and does not require any libraries.
The remaining commands all use the core library:

        InputStr(maxlen) b$     ;input a string
        InputFloat x            ;input a float
        InputInt d              ;input an integer (signed/unsigned 16-bit)

The InputStr command uses a custom input routine. The maxlen argument is
optional. This input routine disables the cursor keys, enforces the
maximum input length, you can use SHIFT-CLR to erase the current text,
and R/S to abort.

The InputFloat and InputInt use the InputStr command and then convert
the string to a number. The InputInt command accepts decimal, hex, and
binary number input.

;
; Load/Save
;

Finally, there's load and save.

        load "filename",dev,address
        save "filename",dev,start,end

In principle these commands are 24-bit aware, for loading into SuperRAM,
but I can't remember if it's enabled or not.


2.2.9 Misc commands
-------------------

These are handy commands that don't really belong in another category.

The first is MemConfig. MemConfig sets the memory configuration -- sets
location $01 -- to one of the 8 configurations (0-7). So you can use
this to switch out BASIC, etc. The memory configurations are:

#       01      $A000   $D000   $E000
#       --      -----   -----   -----
#       0       RAM     RAM     RAM
#       4       RAM     RAM     RAM
#       1       RAM     ChrROM  RAM
#       5       RAM     I/O     RAM
#       2       RAM     ChrROM  Kernal
#       6       RAM     I/O     Kernal
#       3       BASIC   ChrROM  Kernal
#       7       BASIC   I/O     Kernal

Another useful command is the "wait" command:

        wait $d012,255

This simply waits for address $d012 to be 255. There are a flock of wait
commands for different conditions: waitne (wait until address _isn't_
equal to something), waitbits (wait for specific bits to be set/clear),
and so forth.

FillMem is used to fill an area of memory with a specific byte -- handy
for clearing large swaths of memory.
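
When picking a MemConfig value it can help to search the table rather
than read it. The Python sketch below copies the table above into a dict
and looks up which configurations satisfy a requirement (for example
"BASIC out, I/O and Kernal in"). It is a lookup aid only; the names
MEM_CONFIGS and find_configs are invented for the illustration.

```python
# What is visible at ($A000, $D000, $E000) for each MemConfig value,
# copied from the table above.
MEM_CONFIGS = {
    0: ("RAM", "RAM", "RAM"),
    1: ("RAM", "ChrROM", "RAM"),
    2: ("RAM", "ChrROM", "Kernal"),
    3: ("BASIC", "ChrROM", "Kernal"),
    4: ("RAM", "RAM", "RAM"),
    5: ("RAM", "I/O", "RAM"),
    6: ("RAM", "I/O", "Kernal"),
    7: ("BASIC", "I/O", "Kernal"),
}


def find_configs(a000=None, d000=None, e000=None):
    """Return the config numbers matching every requirement that is given."""
    wanted = (a000, d000, e000)
    return [
        number
        for number, regions in sorted(MEM_CONFIGS.items())
        if all(w is None or w == r for w, r in zip(wanted, regions))
    ]


# BASIC switched out, but I/O and the kernal still visible:
print(find_configs(a000="RAM", d000="I/O", e000="Kernal"))

# Configurations where BASIC is available (needed for floats):
print(find_configs(a000="BASIC"))
```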
The "done" command is used to end a program: all it does is an rts. You
can also use the "donebrk" to issue a BRK instead of an RTS, for use
with a monitor.

The PUT command is used to insert files.

        PUT 'filename'

will assemble the named file at the current location in the text. You
can't nest PUT files.

Two commands for automatically running/saving are

        saveobj 'filename'      ;auto-save object code if compile successful
        autorun                 ;auto-run program if compile successful

Finally, sometimes it's nice to have a pure assembler around, without
worrying about label conflicts (using "next" as a label, say) and such.
For that purpose, the command SlangOut switches the compiler off, with
SlangIn used to switch it back on.


2.2.10 Interrupts
-----------------

Slang adds a number of commands for performing interrupts. There's
nothing magic about them; the idea is just to take some of the tedium
out of the process. They do assume the kernal ($E000) is switched in.

An interrupt routine is a subroutine that gets called when an interrupt
occurs. The main difference from a subroutine is that it exits through
an interrupt handler.

        irq name(varlist)       ;declare as a normal sub, using "irq"
          do stuff
        endirq                  ;exit irq
        JmpKernalIRQ            ;alternatively, exit through normal kernal

If the routine exits with "endirq" control will simply resume from where
the interrupt occurred. If JmpKernalIRQ is instead used the routine will
exit through the system IRQ, meaning that the keyboard will get scanned,
TI will be updated, the cursor will get flashed, etc.

To enable the interrupt, you simply do

        SetIRQRoutine(name)
        EnableInterrupts

This sets the $0314 irq vector to your routine, and off you go.

There are two primary sources of an interrupt on the C64: VIC (when a
raster line is reached, when a collision occurs, etc.) and CIA Timer A
(the normal system interrupt). These are the only two sources I bothered
adding commands for.

There are two crucial things to understand about interrupts.
The first is that they must be acknowledged: when VIC signals an
interrupt, he continues signalling that interrupt until you tell him to
stop. If you don't acknowledge the interrupt, it will never stop, and
your interrupt routine will get called continuously.

The second is that there is just one pin on the CPU for interrupts,
which means the CPU doesn't know where an interrupt comes from. What
this means for a program is that if you set up a VIC interrupt but
forget to disable the CIA Timer A interrupt, oops: another infinite
interrupt. Your program acknowledges VIC, and he turns off his
interrupt, but Timer A is still signalling an interrupt.

So the general procedure is: if you want to simply wedge in to the
normal Timer A interrupt, no problem, just use SetIRQRoutine. Then
somewhere in your irq routine be sure to place an

        AckTimerAIRQ

command to acknowledge the IRQ. You can also change the timer A value
using

        SetTimerA(value)

If instead you want to use a VIC interrupt, you need to disable the
Timer A IRQ:

        DisableTimerAIRQ

You can then enable a raster IRQ using

        SetRaster(value)
        EnableRasterIRQ

and use

        AckRasterIRQ

to acknowledge the IRQ in your routine.

There are several example IRQ programs on the slang homepage
(spritedemo.e.s, slangdemo1.p.s, etc.).


2.2.11 VIC commands
-------------------

Finally, some handy VIC commands have been added for manipulating
sprites and the screen. To use these commands, you must first add the
command

        UseSpriteStuff

to your program. This sets up some tables and variables for use by the
sprite commands.

The sprite commands let you set the position of individual sprites
(SetSpriteX and SetSpriteY), as well as their color and which data block
to use. Currently this last command is sort-of dumb, because it assumes
the screen is at $0400. The SpriteOn and SpriteOff commands are used to,
well, turn sprites on and off.

There are also commands for setting the X fine scroll register, VIC
memory configurations, and so on.
See the quickref for more details.

That brings us to the end of the compiler overview. The assembler and
linker follow.


=============
2.3 Assembler
=============

The assembler included with Slang is the Sirius assembler. Full docs are
available on the webpage; what follows is a more compact summary. There
is also a full ML monitor included with the SCPU version. Be sure to
check out the !d debugging mode, which allows highlighted
single-stepping through code.

The assembler is fully integrated with Slang. This means that, for
example, assembly language instructions may be applied to Slang
variables, and inserted into the middle of routines. In short, assembly
should be considered part of the language, and nobody should fear using
asm to manipulate Slang things like arrays or strings. In the end, it's
all just 6502 assembly.

THE RULE, though, is that normal assembly rules must be followed. Most
importantly, this means that labels begin in column 0, and opcodes must
be indented to at least column 1. If you put "lda #00" in the first
column you will get an error.

To eliminate label name conflicts with reserved keywords, the "SlangOut"
command may be used to disable the compiler. "SlangIn" is used to
re-enable it.

Assembler
---------

R/S     - Halt assembly
<-      - Toggle screen output

label   opcode argument         comment

* Fields must be separated by at least one space
* Labels and opcodes are case insensitive
* Arguments may be 24-bit
* Comments
        ;this is a comment
* This is also a comment, provided * is in column 1

* Alternate mnemonics:

        BCC     BLT
        BCS     BGE
        DEC A   DEA
        INC A   INA
        JSR     JSL     (Long JSR)
        JMP     JML     (Force long JMP)

  BRK and COP may specify optional one-byte argument.

* Quotes: Upper-case letters are 96-127 within single quotes, 192-223
  within double quotes; space also has high bit clear/set.

        CMP #' '        ;CMP #32
        CMP #" "        ;CMP #160
        TXT 'Hola'      ;104 79 76 65
        TXT "Hola"      ;200 79 76 65

* Label '*' refers to address of current opcode. e.g. BCC *-3

* Local labels begin with : (e.g. :LOOP).
  Local labels are attached to the last global label, and hence may
  be reused. Example:

glob1   lda #$ff
:loop   sta $1000,y
        iny
        bne :loop
glob2   lda #00         ;new global
:loop   sta $1100,y     ;new local, different scope
        iny
        bne :loop

  Make copious use of local labels.

* Re-definable labels begin with ], e.g. ]label. ]labels must be
  defined before they are used, and can be re-defined. Example:

]addr   = $c000
        lda ]addr,x
]addr   = ]addr+$0100
        sta ]addr,x

  When combined with LUP, below, this allows for automatic code
  generation. For example:

]src    = $0400
]dest   = $0800
        LUP 25
        lda ]src,x
        sta ]dest,x
]src    = ]src+40
]dest   = ]dest+40
        --^

* Use "DO 0/FIN" to comment out large sections of code.

* Prefixes

  Immediate mode:

    Operand         One-byte result   Two-byte result
    #$01020304      04                04 03
    #<$01020304     04                04 03
    #>$01020304     03                03 02
    #^$01020304     02                02 01

  (Sirius currently supports only 24 bits, not 32)

  Absolute mode:

    <   Force one byte (direct a.k.a. zero page)
    !   Force two bytes (absolute)
    >   Force three bytes (abs long)

  Example: LDA $0203 and LDA !$010203 are equivalent.

* Pseudo-Opcodes

ORG address   Set program ORiGin        ORG $C000
ORG           Re-ORG
*=            * Alternate syntax        *= $C000
EQU or =      EQUate label              CHROUT = $FFD2
DFB or DB     DeFine Byte               DFB 100,$64,%1100100
DA or DW      Define Address            DA $FFD2        ;D2 FF
DLA           Define Long Address       DA $0102        ;00 01 02
HEX           Define hex bytes          HEX 20D2FF      ;20 D2 FF
DS            Define Storage            DS 5            ;00 00 00 00 00
DS ^          Fill to page boundary     DS ^,$3D        ;$3D to boundary
TXT           TeXT                      TXT 'Hola'      ;68 4F 4C 41
                                        TXT "Hola",0d   ;C8 4F 4C 41 0D
DO arg        Conditional assembly      DO 0            ;Don't assemble
#ELSE         Reverse last DO
FIN           End DO/#ELSE constructs
PUT 'file'    Assemble from disk        PUT 'test,s'    ;SEQ file
                                        PUT 'test',9    ;PRG file, dev 9
BIN 'file'    Include binary file       BIN 'tables',9
LUP arg       Repeat enclosed code "arg" times.
 code         Combine with ]labels to generate code.
--^
PRT           Redirect screen output to printer
PRT 'file'    Redirect screen output to disk file
REG #arg      Set 8/16-bit assembly (set status reg)
REG ON        Automatically follow REP and SEP (default)
REG OFF       Don't track REP and SEP
              Note: 8/16 tracking doesn't track E!
REL           Assemble as relocatable file
EXT           Label is external to file
ENT           Label is entry point

==========
2.4 Linker
==========

You've probably heard me harp on about the linker. The linker is the
single most important feature of the assembler, and the key to writing
large projects. But let me start with a story.

Years ago, when I started doing C64 coding, by dumb luck I wound up
using the Merlin 128 assembler. I noted that it had this thing called a
linker, and never used it. Some time later, when writing polygonamy, I
ran out of memory with big source files, and started using the linker
as a way around that. But I still did not appreciate it. But once I got
to even bigger projects, and started using it more, and started using
it correctly, the value became apparent.

Pretty dull story, I know. But since the point of the linker wasn't
obvious to me until after using it several times, I won't try to
convince you of it either. Instead, I will just suggest that if you're
interested in writing large programs, you should give it a try; it's
really easy to use and pays big dividends. (For small programs, it's
pointless.)

The linker is fully integrated into Slang.

So, the linker. There are just three commands that are used with the
linker:

REL - REL is placed at the top of the code, before any lines are
      assembled/compiled, and states that the file is to be assembled
      as a RELocatable module. The object code produced is in a
      special format for use by the linker, and will have a .l suffix.

ENT - Mark label/variable/subroutine as an ENTry point. Things marked
      as ENT are visible to outside modules.

EXT - Mark label/variable/subroutine as EXTernal to the module. These
      labels will be resolved at link time.

Once all your modules are in place, you create a file along the
following lines:

        link $8000      ;can use any address
        module1.l
        module2.l
        ...

That is, the first line invokes the linker, and says what address to
assemble/link to. The modules listed after it are then loaded in, and
all the variables and addresses are fixed up, resulting in the final
object code. To change the target address, simply change the first
line.

That's it -- that's all there is to it. Piece of cake. Just three
commands.

Okay, a really really simple example. Let's say module2 has a routine
in it called PRHEX:

PRHEX   ENT
        ...             ;details of the routine invisible

Now let's say module1 is a really simple application of this routine,
and the entire code is

        REL
PRHEX   EXT
        lda #$ff
        jsr PRHEX
        lda #$d2
        jmp PRHEX

As the code is being assembled, the actual address of PRHEX is unknown;
it's used as normal, but it's somewhere outside of the module, to be
resolved at link time. The contents are saved to a REL .l file. At link
time, the modules are loaded in, and the address of PRHEX is resolved
and fixed up in the code, and poof, object code is ready. Simple and
transparent.

For a Slang example, consider the GDK sprite editor, available on the
web page. The linker file looks like this:

        link $7000
        spred.l
        joymenu.l
        core.l

There are three modules total, and the code is all linked to $7000. The
first module, spred, contains the main sprite editor code. The second
module, joymenu, is a collection of routines for creating joystick
menus. The third module, core, is the core library.

Each module contains routines available for use by the other modules.
Spred uses routines from both joymenu.l (routines for setting up menus,
etc.) and the core.l library (printing, math). Joymenu uses core
library routines -- routines for printing text, getting input, etc. The
ENT and EXT definitions look something like:

core:
        sub ent core]InputString(...)   ;ENTry point for other modules

joymenu:
        sub ext core]InputString(...)   ;EXTernal routine
        sub ent DisplayMenu(...)        ;ENTry point

spred:
        sub ext core]InputString(...)   ;EXTernal
        sub ext DisplayMenu(...)        ;EXTernal

With these modules in place, the linker goes to work. It loads in the
first module to $7000, fixes up all the variable addresses, and makes a
note of all ENTry points and EXTernals. It then does the same thing
with the next two modules. Because the modules are pre-compiled, this
goes very quickly. Once all modules are loaded, the linker figures out
all the EXT locations, looks up their ENT points, and fixes up the
addresses. Poof, the program is now assembled.

So that's it: in any module, if it uses a routine or variable that is
outside the module, it defines the variable as an EXT. If it has
routines or variables to make available to outside modules, they are
marked as ENT. And the linker takes care of the rest.

If you're familiar with OOP, you can kinda sorta think of a rel module
as a class: all routines and variables are internal and private except
those made explicitly public.
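
For readers who like to see the bookkeeping spelled out, here is a
rough Python model of the load / note ENT / resolve EXT steps described
above, applied to the PRHEX example. It is only a sketch: the dictionary
layout, the link() function, and the 32-byte size given to module2 are
all made up for illustration, and none of it reflects the real .l file
format or how the Slang linker is actually written. It also assumes
8-bit registers, so "lda #$ff" takes two bytes.

```python
def link(base, modules):
    """Place modules back to back from `base` and resolve EXT references.

    Each module is a dict (a hypothetical stand-in for a .l file):
      name - module name, used only in error messages
      size - length of the module's code in bytes
      ent  - {label: offset of that label inside the module}
      ext  - {label: [offsets of 2-byte operands that refer to it]}
    Returns (entry_points, fixups); fixups maps the absolute address of
    an operand to the absolute address that must be stored there.
    """
    entry_points = {}
    placed = []
    addr = base
    # Pass 1: give each module a load address and note its ENT labels.
    for mod in modules:
        for label, offset in mod["ent"].items():
            if label in entry_points:
                raise ValueError(f"duplicate ENT label: {label}")
            entry_points[label] = addr + offset
        placed.append((addr, mod))
        addr += mod["size"]
    # Pass 2: look up each EXT label and record where its address goes.
    fixups = {}
    for load_addr, mod in placed:
        for label, operand_offsets in mod["ext"].items():
            if label not in entry_points:
                raise KeyError(f"unresolved EXT {label} in {mod['name']}")
            for offset in operand_offsets:
                fixups[load_addr + offset] = entry_points[label]
    return entry_points, fixups


# module1 from the text: lda #$ff / jsr PRHEX / lda #$d2 / jmp PRHEX.
# That is 2 + 3 + 2 + 3 = 10 bytes; the PRHEX operands sit at offsets
# 3 (after the jsr opcode) and 8 (after the jmp opcode).
module1 = {"name": "module1", "size": 10, "ent": {}, "ext": {"PRHEX": [3, 8]}}

# module2: PRHEX is its first routine; the 32-byte size is invented.
module2 = {"name": "module2", "size": 32, "ent": {"PRHEX": 0}, "ext": {}}

entry_points, fixups = link(0x8000, [module1, module2])
for operand_addr, target in sorted(fixups.items()):
    print(f"${operand_addr:04X} <- ${target:04X}")
```

Changing the first argument of link() plays the same role as changing
the address on the "link" line: every entry point and fixup moves with
it, and the module descriptions themselves stay untouched.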