Processing Arrays and Using Functions
by Jacqueline A. Jones, Brooklyn College, CUNY
Using an Index Register
Your text doesn’t show any examples of true use of an index register. An index register holds consecutive offsets from the beginning of an array. If the array consists of doublewords, the items are 4 bytes apart, and the index register will hold first 0, then 4, then 8, then C (hexadecimal, that is, 12), etc. An index register is like an array subscript in C++. Traditionally, the legal index registers are esi and edi, though 32-bit addressing allows other general registers to be used in this fashion.
Example 1: Using an index register.
Sum the values in the arr array using an index register.
arr dword 9, 8, 15, -5
n dword 4
...
mov ecx,n ; loop counter
sub esi,esi ; index reg
sub eax,eax ; sum = 0
add_t: add eax,[arr+esi]
add esi, 4
loop add_t
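The amount added to the index register has to match the element size. As a sketch of the same loop over an array of 2-byte words (warr and add_w are hypothetical names, not from the text), esi advances by 2 and the 16-bit register ax holds the sum:

```asm
warr    word   9, 8, 15, -5    ; hypothetical array of 2-byte words
n       dword  4
...
        mov    ecx,n           ; loop counter
        sub    esi,esi         ; index reg starts at offset 0
        sub    ax,ax           ; sum = 0 (16-bit, to match the element size)
add_w:  add    ax,[warr+esi]   ; add the word at offset esi
        add    esi,2           ; words are 2 bytes apart
        loop   add_w
```

With these values, ax ends up holding 27, the same total that Example 1 leaves in eax.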
Using a Base Register
Using an index register to process an array is clear, because it allows you to mention the name of the array in the address reference. However, this notation cannot be used in a procedure that cannot see the names of variables declared in main. Therefore, another method is needed: using a base register. A base register initially holds the address of the array and is incremented by the size of an array element each time through the loop. Traditionally, the base register can be ebx or ebp, though esi and edi can be used similarly. Note that ebp points into the stack, not into the data segment, so there are extra steps needed to use it to process an array.
Example 2: Using a base register.
Sum the values in the arr array using a base register (the summation is done in main).
arr dword 9, 8, 15, -5
n dword 4
...
mov ecx,n ; loop counter
lea ebx,arr ; base reg
sub eax,eax ; sum = 0
add_t: add eax,[ebx]
add ebx, 4
loop add_t
Passing Parameters in Registers
When calling a procedure, you can pass parameters in registers. Obviously, this is limited by the number of available registers (eax, ebx, ecx, edx, esi, edi, ebp). For our purposes, it is simpler than passing parameters on the stack.
When possible, put the item in the register in which it will be used; this saves time in the procedure. Pass the address of an array in ebx (though using other registers is legal), and pass the number of values in an array in ecx, if you plan to use the loop instruction.
Example 3: Passing parameters in registers.
main:
arr dword 9, 8, 15, -5
n dword 4
...
mov ecx,n ; number of values in the array
lea ebx,arr ; address of the array
call myproc
In the procedure, simply use the registers as you would if you had loaded them in the procedure.
; modify array in proc, using base register notation
myproc proc near32
push eax
push ebx
push ecx
mov eax,9
top: add [ebx],eax ; add value to each array element
add ebx,4
loop top
pop ecx
pop ebx
pop eax
ret
myproc endp
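A procedure can also hand a result back in a register. The following is a minimal sketch, not taken from the text: sumarr and addnext are hypothetical names, and returning the sum in eax is an assumed convention. Because eax carries the result, it is deliberately not saved and restored. The loop instruction decrements ecx before testing it, so the sketch assumes ecx is greater than 0 on entry.

```asm
; sum array in proc, using base register notation
; assumes: ebx = address of array, ecx = number of values (ecx > 0)
; returns: sum in eax (assumed convention, so eax is not preserved)
sumarr  proc near32
        push ebx
        push ecx
        sub  eax,eax         ; sum = 0
addnext: add eax,[ebx]       ; add current element to sum
        add  ebx,4           ; move to next doubleword
        loop addnext
        pop  ecx             ; restore caller's values
        pop  ebx
        ret
sumarr  endp
```

Called with the registers loaded as in Example 3 (mov ecx,n, then lea ebx,arr, then call sumarr), this would leave 27 in eax for the array 9, 8, 15, -5.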
Returning to the Top of an Array in a Procedure
Sometimes, it is useful to be able to keep a pointer to the beginning of the array while still processing through the array. For example, in Example 3, what if you wanted to print the changed array after the end of the loop? The ebx register points past the last element in the array (and ecx is 0). You can’t use lea to access the address of arr, because the function may not refer to names declared in main.
You can solve the problem by popping registers from the stack, as shown in Example 4.
Example 4: Popping the stack to retrieve address and counter values
; modify array in proc, using base register notation;
; pop stack to call another proc
myproc proc near32
push eax
push ebx
push ecx
mov eax,9
top: add [ebx],eax ; add value to each array element
add ebx,4
loop top
pop ecx ; retrieve values from stack
pop ebx
push ebx ; push them back onto stack in case printarr changes them
push ecx
call printarr
pop ecx ; popping is always the last event in a proc
pop ebx
pop eax
ret
myproc endp
Note that the procedure pops the necessary registers and then pushes them back onto the stack. That is a precaution in case printarr is badly-behaved and fails to preserve register values. Each procedure should take charge of preserving the original values of the registers it changes.
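For comparison, here is a sketch of what a well-behaved printarr might look like. The text does not show printarr, so everything here is an assumption: the parameters are taken to arrive in ebx and ecx as in Example 3, and the course's io.h is assumed to provide a dtoa macro (doubleword to 11-character string) alongside the output macro used elsewhere in these notes. The names numstr and pnext are hypothetical.

```asm
; print each element of a doubleword array, one per line
; assumes: ebx = address of array, ecx = number of values (ecx > 0)
; assumes: io.h supplies dtoa and output (not confirmed by the text)
printarr proc near32
.data
numstr  byte 11 dup(?)       ; dtoa is assumed to fill these 11 characters
        byte 13,10,0         ; CR, LF, null terminator for output
.code
        push eax             ; save every register this proc changes
        push ebx
        push ecx
pnext:  mov  eax,[ebx]       ; current element
        dtoa numstr,eax      ; convert to characters
        output numstr        ; print number followed by CR/LF
        add  ebx,4
        loop pnext
        pop  ecx             ; popping is always the last event in a proc
        pop  ebx
        pop  eax
        ret
printarr endp
```

If printarr is written this way, the extra push instructions before the call in Example 4 are not strictly required, though they remain a harmless precaution.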
Base-Indexed Addressing
Another way to return to the top of an array is to use a different form of addressing, one which does not change the base register at all while processing the array. Base-indexed addressing uses both a base and an index register, changing only the index register. The base register continues to point to the top of the array, as shown in Example 5, while esi changes. Starting over inside the proc requires only resetting esi to 0; ebx remains unchanged.
Base-indexed notation in its most traditional form allows one base and one index register per address: ebx+esi, or ebx+edi, or ebp+esi or ebp+edi. Note that ebp points into the stack, not into the data segment, so there are extra steps needed to use it to process an array.
Example 5: Using base-indexed addressing to process an array
; modify array in proc, using base-indexed register notation
myproc proc near32
push eax
push esi ; esi changes, but ebx doesn’t
push ecx
mov eax,9
sub esi,esi ; index register
top: add [ebx+esi],eax ; add value to each array element
add esi,4 ; modify esi rather than ebx
loop top
pop ecx ; retrieve counter from stack
push ecx ; push ecx back in case printarr changes it
call printarr
pop ecx ; popping is always the last event in a proc
pop esi
pop eax
ret
myproc endp
Base+Displacement or Base-Indexed+Displacement Notation
There are other forms of addressing that are useful. The addition of a displacement to base address notation allows processing adjacent values in an array without changing the base address. For example, if [ebx] points to an element in an array of doublewords, [ebx+4] points to the next element. The extra displacement can also be used with base-indexed addressing, as in [ebx+esi+4]. Example 6 shows using this notation to find whether any two adjacent elements in an array have the same value. If the array contains 4, 5, 5, -2, -2, 9, the message will print twice, once because of the adjacent 5s and once because of the adjacent -2s.
Example 6: Using base+displacement addressing to print a message when adjacent array elements have the same value
; using base+displacement notation to compare array elements
findequal proc near32
.data
msg byte "adjacent elements are equal",13,10,0
.code
push eax
push ebx
push ecx
dec ecx ; process only to element n - 1
top: mov eax,[ebx]
cmp eax,[ebx+4]
jne next
output msg ; special case
next: add ebx,4 ; all cases
loop top
pop ecx
pop ebx
pop eax
ret
findequal endp
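The base-indexed+displacement form mentioned above, [ebx+esi+4], can be used for the same comparison while leaving ebx pointing at the top of the array. This is a sketch only; findeq2, msg2, top2, and next2 are hypothetical names. Because ecx is decremented before the loop starts, the sketch assumes the array holds at least 2 values.

```asm
; using base-indexed+displacement notation to compare array elements
; assumes: ebx = address of array, ecx = number of values (ecx >= 2)
findeq2 proc near32
.data
msg2    byte "adjacent elements are equal",13,10,0
.code
        push eax
        push ecx
        push esi              ; esi changes, but ebx doesn't
        dec  ecx              ; process only to element n - 1
        sub  esi,esi          ; index register
top2:   mov  eax,[ebx+esi]    ; current element
        cmp  eax,[ebx+esi+4]  ; compare with the next element
        jne  next2
        output msg2           ; special case
next2:  add  esi,4            ; all cases
        loop top2
        pop  esi
        pop  ecx
        pop  eax
        ret
findeq2 endp
```

Since ebx is never changed, it does not need to be saved, and it is still usable after the loop if the array has to be processed again.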
Loop Organization
Example 6 also shows how to organize a loop that has a jump. Do not repeat code. If there is a jump, it jumps over the code which is not done in all cases; I’ve labelled this code as "special case". It jumps forward to the code that is done in all cases (which I’ve labelled "all cases"). The "all cases" code is the necessary indexing and jumping that must be done every time through the loop. It should not be repeated, but instead, you should jump to it.
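The same organization applies to any loop that does something extra for only some elements. As a further sketch (countneg, ctop, and cnext are hypothetical names, and returning the count in eax is an assumed convention), this procedure counts the negative values in a doubleword array. The jump skips the "special case" and lands on the "all cases" code, which appears only once.

```asm
; count the negative values in an array
; assumes: ebx = address of array, ecx = number of values (ecx > 0)
; returns: count in eax (assumed convention, so eax is not preserved)
countneg proc near32
        push ebx
        push ecx
        sub  eax,eax            ; count = 0
ctop:   cmp  dword ptr [ebx],0  ; is the current element negative?
        jge  cnext              ; no: jump over the special case
        inc  eax                ; special case
cnext:  add  ebx,4              ; all cases
        loop ctop
        pop  ecx
        pop  ebx
        ret
countneg endp
```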
Location of Procedures
Internal procedures are placed in the same file as the main program. They are placed at the end of the main program, between the statement "invoke exit_process,0" and the end statement. They must go before the directive end, because end stops assembly of the code. They must go after "invoke exit_process" so that they will not be executed when they are encountered in the code and so that you won't have to jump to that statement.
External (Separately-Assembled) Procedures
Often procedures are written to be used frequently. A procedure of this sort should not be placed in the same file as the main program. It is written in a separate file, assembled separately, and linked together with the main program file during the link step.
The extrn Directive
A main program that calls an external procedure must contain an extrn statement, as shown below, where "sample" is the name of the external procedure.
.code
extrn sample: near32 ; extrn says that sample is defined
; outside this file
...
call sample
Without the extrn statement, the assembler would not know what to do when it encountered the call to sample. The extrn statement is a promise that a sample function will be made available later, outside this file.
A separately assembled file must have some of the scaffolding of a main program, but not all. It does not need a stack segment, because the stack is declared in the main program. It does not specify the program's starting position, since the main program does that as well.
Example 7: Format of an external procedure
.386
.model flat
public sample ; make name available to linker
include io.h ; needed if proc does I/O
.data ; data goes ABOVE proc header
string1 byte 40 dup(?), 0 ; variables used must be declared in proc
.code
sample proc near32 ; proc goes inside .code segment
... ; processing goes here
ret
sample endp
end
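To make the format concrete, here is a sketch of a complete external file that follows the same layout. The procedure cleararr, the file name cleararr.asm, and the label zloop are hypothetical. Since the procedure does no I/O and declares no variables, the include line and the .data section are left out.

```asm
; cleararr.asm (hypothetical file name)
.386
.model flat
public cleararr                 ; make name available to linker
.code
; set every element of a doubleword array to 0
; assumes: ebx = address of array, ecx = number of values (ecx > 0)
cleararr proc near32
        push ebx
        push ecx
zloop:  mov  dword ptr [ebx],0  ; store 0 in current element
        add  ebx,4
        loop zloop
        pop  ecx
        pop  ebx
        ret
cleararr endp
end
```

A main program that uses it might contain the following, with the parameters passed in registers as in Example 3:

```asm
.code
extrn cleararr: near32          ; cleararr is defined outside this file
...
        mov  ecx,n              ; number of values in the array
        lea  ebx,arr            ; address of the array
        call cleararr
```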
The public Directive
The external procedure contains the line "public sample", where "sample" is the name of the procedure. The public directive makes the name "sample" available to the linker. The name can be seen outside the file; any names that have not been made public can’t be seen outside the file.
Assembling and Linking an External File
After writing and saving the external procedures, you must assemble them separately from the main program, in the same way that you assemble main. The assembly process will produce two object files. If the main program is in main.asm and the external procedure is in sample.asm, you will end up with main.obj and sample.obj.
In the link step, you must add the file sample.obj to the list of files being linked together. The link step links together main.obj and other files to produce an executable file. (You’ve already done this with io.obj.) The link step "resolves external references." That is, it looks for a label matching the one specified in the extrn statement in main. It finds it in the item specified in the public statement, and that allows it to provide the appropriate addresses for the function calls.