Sunday, February 23, 2014

Architecture of 80386



The internal architecture of the 80386DX is divided into the following sections:
  • Central processing unit
    • Execution unit
    • Instruction decode unit
  • Memory management unit
    • Segmentation unit
    • Paging unit
  • Bus control unit

The central processing unit is further divided into an execution unit and an instruction decode unit. The memory management unit consists of a segmentation unit and a paging unit. These units operate in parallel: fetching, decoding, memory management and bus access for several instructions are performed simultaneously. This parallel operation is called pipelined instruction processing.

Execution unit: The execution unit reads instructions from the instruction queue and executes them. It consists of three subunits: the control unit, the data unit and the protection test unit.

  1. Control unit: It contains microcode and special hardware. Together they allow the 80386DX to reduce the time required to execute multiply and divide instructions. They also speed up effective address calculation.
  2. Data unit: The data unit contains the ALU, eight 32-bit general purpose registers and a 64-bit barrel shifter. The barrel shifter performs multiple-bit shifts in one clock, which speeds up all shift and rotate operations. The multiply/divide logic implements a bit-shift-rotate algorithm to complete these operations in minimum time. The data unit as a whole carries out the data operations requested by the control unit.
  3. Protection test unit: The protection test unit checks for segmentation violations under the control of the microcode.

Instruction decode unit: The instruction decode unit takes instruction bytes from the code prefetch queue and translates them into microcode. The decoded instructions are then stored in the instruction queue, from where they are passed to the control section to derive the necessary control signals.

Segmentation unit: The segmentation unit translates logical addresses into linear addresses at the request of the execution unit. It compares the effective address against the length limit specified in the segment descriptor, and it checks access rights before the linear address is calculated. It then adds the segment base and the effective address to generate the linear address. It provides a 4-level protection mechanism for protecting and isolating system code and data from those of application programs.
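The limit check and base addition described above can be sketched in a few lines of Python. This is only a rough model, not 80386 code: the names `SegmentFault` and `to_linear` are hypothetical, the limit is assumed to be the last valid offset of a byte-granular segment, and the access-rights check is left out.

```python
class SegmentFault(Exception):
    """Raised when an offset lies beyond the segment limit (hypothetical name)."""


def to_linear(base, limit, offset):
    """Model of segment translation: check the limit, then add the base."""
    if offset > limit:  # assumption: limit is the last valid offset
        raise SegmentFault(f"offset {offset:#x} exceeds limit {limit:#x}")
    return (base + offset) & 0xFFFFFFFF  # linear addresses are 32 bits wide


# A segment based at 1 MB with a 64 KB limit, offset 1234H inside it.
print(hex(to_linear(base=0x00100000, limit=0xFFFF, offset=0x1234)))  # 0x101234
```

An offset of 10000H with the same limit would raise `SegmentFault` instead of producing an address, which is the role the length-limit comparison plays in the real unit.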

Paging unit: When the 80386DX paging mechanism is enabled, the paging unit translates linear addresses generated by the segmentation unit or the code prefetch unit into physical addresses. If paging is not enabled, the physical address is the same as the linear address and no translation is necessary. The paging unit passes the physical address to the bus interface unit to perform memory and I/O accesses. It organizes physical memory in pages of 4 Kbytes each.
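To make the 4 Kbyte page idea concrete, here is a small Python sketch of how a linear address can be split and mapped. It is a simplified model under stated assumptions: the 10/10/12 bit split is the standard 80386 two-level layout (it is not spelled out in the text above), and the dictionary `page_frames` is a hypothetical stand-in for the real page directory and page tables.

```python
PAGE_SIZE = 4096  # 4 Kbyte pages


def split_linear(linear):
    """Split a 32-bit linear address into (directory index, table index, offset)."""
    directory_index = (linear >> 22) & 0x3FF  # top 10 bits
    table_index = (linear >> 12) & 0x3FF      # next 10 bits
    offset = linear & (PAGE_SIZE - 1)         # low 12 bits
    return directory_index, table_index, offset


def to_physical(linear, page_frames):
    """Translate using a hypothetical {(dir, table): frame_base} mapping."""
    directory_index, table_index, offset = split_linear(linear)
    frame_base = page_frames[(directory_index, table_index)]
    return frame_base | offset


page_frames = {(1, 3): 0x0009A000}  # made-up mapping for illustration
print(split_linear(0x00403025))                    # (1, 3, 37)
print(hex(to_physical(0x00403025, page_frames)))   # 0x9a025
```

Only the upper 20 bits are translated; the low 12 bits pass through unchanged, which is why the page size and the offset width go together.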

The control and attribute PLA checks privileges at the page level. Each page maintains the paging information of the task. The limit and attribute PLA checks segment limits and attributes at the segment level to avoid invalid accesses to code and data in the memory segments.

Bus control unit: The bus control unit is the 80386DX's link to the outside world. It provides a full 32-bit bi-directional data bus and a 32-bit address bus. The bus control unit is responsible for the following operations:

  1. It accepts internal requests for code fetches and data transfers from the code fetch unit and from the execution unit. It then prioritizes the requests with the help of a prioritizer and generates the signals needed to perform bus cycles.
  2. It sends address, data and control signals to communicate with memory and I/O devices. The address driver drives the byte enable signals BE0#-BE3# and the address signals A2-A31, and the transceiver interfaces the internal data bus with the system bus.
  3. It controls the interface to external bus masters and coprocessors.
  4. It also provides the address relocation facility.

Instruction prefetch unit: The instruction prefetch unit sequentially fetches the instruction byte stream from memory. It uses the bus control unit to fetch instruction bytes whenever the bus control unit is not performing a bus cycle for an executing instruction. The prefetched bytes are stored in the 16-byte code queue, which holds them until the decoder needs them. The prefetcher always fetches instructions in the order in which they appear in memory. In fact, the prefetcher simply reads code one double word at a time, not caring whether it is bringing in complete instructions or only parts of them. When a control transfer instruction such as a jump or call is executed, the contents of the prefetch and decode queues are cleared out. In this case, the prefetcher starts filling its queue again.

Instruction predecode unit: The instruction predecode unit takes instruction bytes from the instruction prefetch queue and translates them into microcode. The decoded instructions are then stored in the instruction queue.

References:
  1. Wikipedia page on 80386.
  2. “Advanced 80386 Programming Techniques” by James L. Turley, TMH Publications.

Friday, January 17, 2014

What is 10,13 in Assembly Language variable declaration?



In many assembly language programs written for the x86 architecture, the values 10, 13 appear in data segment declarations.
For example:

    .data
     message db 10, 13, 'abcd$'

Here the string is declared with the values 10 and 13 placed in front of it. The actual array in memory will be:

message:

    Value:     0AH    0DH    61H    62H    63H    64H    24H
    Meaning:   10     13     'a'    'b'    'c'    'd'    '$'

The ASCII characters for these values are printed up to '$', which is treated as the end-of-string character. 'a', 'b', 'c' and 'd' are printable characters, but 10 and 13 are non-printable characters.
In short, they are control characters (reference: Wikipedia).


10 is called LF (Line Feed, or new line) and 13 is called CR (Carriage Return).
These characters are used to control the cursor position. 10 moves the cursor to the next line in the same column, and 13 returns the cursor to the start of the current line.

So, every time you use these two control characters together as part of a string or array, the cursor moves to the start of a new line.

(Note: the term carriage return (CR) is taken from printer operation. When a printer finishes printing one line, it feeds the paper to a new line and the print head moves back to the start of the line.)

LF and CR are equivalent to \n and \r used in the print statements of high level programming languages.

Let's take an example.

Declare variable as

   .data
      message db 10,13,'Welcome$'

and print using,

    MOV AH, 09H
    LEA DX, MESSAGE
    INT 21H

Now make some changes in declaration as,

   .data
   message db 10, 13, 'Wel', 10, 13, 'come$'

Check the output by printing this message.
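If no DOS machine is at hand, the cursor movement can be previewed with a small Python model. This is only a sketch of the behaviour described above, not a DOS emulator: the function name `render` is hypothetical, and line wrapping and scrolling are ignored.

```python
def render(data):
    """Return the screen lines produced by writing `data` up to the '$' byte."""
    lines = [""]
    row, col = 0, 0
    for byte in data:
        if byte == ord("$"):      # end-of-string marker, not printed
            break
        if byte == 10:            # LF: next line, same column
            row += 1
            if row == len(lines):
                lines.append("")
        elif byte == 13:          # CR: back to the start of the line
            col = 0
        else:                     # printable character: write and advance
            line = lines[row].ljust(col)
            lines[row] = line[:col] + chr(byte) + line[col + 1:]
            col += 1
    return lines


message = bytes([10, 13]) + b"Wel" + bytes([10, 13]) + b"come$"
print(render(message))                            # ['', 'Wel', 'come']
print(render(b"Wel" + bytes([10]) + b"come$"))    # ['Wel', '   come']
```

The second call leaves out the 13: the text drops to the next line but keeps its column, which is why LF and CR are normally used as a pair.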

Thursday, January 16, 2014

Batch file for execution of multiple commands


There are only three kinds of files in DOS that can be run directly: .exe, .com and .bat. Of these, .exe and .com are executable program files, while .bat is a batch (script) file.


Whenever we want to execute multiple commands in one go, we can use a batch file. We just need to put all the commands in a file whose extension is .bat. This is very useful when working with an assembly language program.
For example, suppose you have written a program named myprog.asm and you want to assemble, link and execute it with one command. Just create a file called run.bat and type the following in it.

   masm myprog.asm;
   link myprog;
   myprog

This file must be present in your current working directory, e.g. /masm. Now, typing 'run' (the name of the batch file) at the command prompt will execute all of the above commands.


Wednesday, January 15, 2014

How to set video mode in X86 assembly?


Interrupt number 10H is dedicated to all kinds of video operations on an 8086/8088 DOS system. Function number 00H of interrupt 10H is used to set the video mode.
The mode value must be placed in the AL register.


The values 00H to 03H specify the text modes of DOS; the values 04H to 06H listed below are graphics modes. The default text mode is 80 X 25, that is, 80 characters per line X 25 lines per screen page.
This text resolution can be changed with the help of interrupt number 10H.

Setting any video mode automatically clears the display screen.

For example:

    MOV AH,00H
    MOV AL,00H
    INT 10H

This sets the text video mode of 40 characters per line X 25 lines per page, changing the default resolution. Likewise, we may use other values in the AL register.

These are:

00H – 40 X 25 (text) color burst OFF
01H – 40 X 25 (text)
02H – 80 X 25 (text) color burst OFF
03H – 80 X 25 (text)
04H – 320 X 200 pixels (graphics)
05H – 320 X 200 pixels (graphics) color burst OFF
06H – 640 X 200 (graphics)
... and so on.

To know more about these values, refer to the book “Advanced MS-DOS Programming” by Ray Duncan (BPB Publications).

Sunday, January 12, 2014

Printing a String using x86 assembly under MASM/TASM


A string is an array of characters stored one after another. From the computer's point of view, a string is an array of bytes stored in contiguous memory. I use the term 'byte' here because the computer does not distinguish between integers and characters. It knows only bytes.

 
To print a string, we need an array of bytes or characters stored in memory with an 'end of string' character. Let's see how it is done.
Generally, arrays are stored in the data segment. For example,
 
    .data
    char db 'a','b','c','d','$'

Here, 'char' is the name of the array variable. The characters 'abcd' are stored in contiguous memory locations. The '$' is the end-of-string character: it marks that there are no more characters after it. Every string printed this way must end with '$'. Since, as I already said, the computer only knows bytes, the following declaration is equivalent to the previous one.

    .data
    char db 61H,62H,63H,64H,'$'

Here, 61H, 62H, 63H and 64H are the ASCII values of 'a', 'b', 'c' and 'd' respectively. To print a string on screen, x86 programs under DOS use interrupt number 21H with function number 09H. So,

    INT 21H & AH=09H

will do the task for us.
Before this, we must store the offset address (effective address) of the string, that is, the address of its first character, in the data register (DX). Function 09H takes this address from DX and prints the characters on screen from the first byte up to '$'.
The following code will print 'WELCOME' on screen.

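.MODEL SMALL
.STACK 100H

.DATA
     MESSAGE DB 'WELCOME$'

.CODE
START:
     MOV AX, @DATA      ; segment registers cannot be loaded directly,
     MOV DS, AX         ; so the data segment address goes through AX
     LEA DX, MESSAGE    ; DX = offset of the string
     MOV AH, 09H        ; function 09H: print string up to '$'
     INT 21H

     MOV AH, 4CH        ; function 4CH: terminate the program
     INT 21H
END START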

The instruction 'LEA' loads the offset (effective address) of the variable 'MESSAGE' into the DX register. Remember, it is mandatory to have '$' at the end!
You may try removing '$' from the string to see what happens.
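What happens without the '$' can be previewed with a small Python model of function 09H. This is a sketch under assumptions, not DOS itself: `dos_print_string` is a hypothetical name, a `bytes` object stands in for the data segment, and the `cp437` codec is used as the usual DOS code page.

```python
def dos_print_string(memory, offset):
    """Return the text function 09H would print, starting at `offset`.

    Scans forward to the first '$' byte. memory.index raises ValueError
    if there is none; real DOS would simply keep reading memory.
    """
    end = memory.index(b"$", offset)
    return memory[offset:end].decode("cp437")


# Properly terminated string.
print(dos_print_string(b"WELCOME$", 0))                 # WELCOME

# '$' removed from the first string: printing runs on into
# whatever bytes happen to follow it in the data segment.
print(dos_print_string(b"WELCOME" + b"OTHER DATA$", 0)) # WELCOMEOTHER DATA
```

The second call shows why the terminator matters: the function has no length to go by, so it stops only at the next '$' it finds.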

Monday, January 6, 2014

Print a number using Assembly Language


Let's look at something simple but interesting in assembly language. To print a value on screen, high level programming languages use their library functions and statements. For example, C uses 'printf', C++ uses 'cout', Java uses 'println' and Python uses 'print'. A single-line statement does the task. The microprocessor, however, goes through several sequential steps to print a number on screen.
 

Here, I have used the widely used MASM to demonstrate this. The microprocessor does not have any function to print a number directly. It is only possible to print a single character, using function number 02 of interrupt number 21h. The character to be printed must be present in the DL register.
For example,

MOV DL,'a'
MOV AH,02H
INT 21H

This code will print 'a' on the screen (without the quotes!). Here the DL register of the processor stores the ASCII value of the character 'a'. So the following code will do the same task, as the ASCII value of 'a' is 61 in hexadecimal.

MOV DL,61H
MOV AH,02H
INT 21H

Here, 02 is the function number, and it must be stored in the AH register before invoking the interrupt. Printing a number directly is not possible in assembly language; we need to do it using the ASCII values of its hexadecimal digits. Remember that the ASCII values of 0 to 9 are 30H to 39H respectively. So, to print 5 on screen, we need to store 35H in the DL register before invoking the interrupt. The following code will print 5 on screen.
 
MOV DL,35H
MOV AH,02H
INT 21H

It means that we need a code conversion: add 30 in hex to each single-digit value. To print a two-digit number, the same procedure is followed for both digits, one after another, by rotating the number. The following algorithm will do the task for us.
  1. Get a two digit number in temporary register say BH.
  2. Rotate the bits of the number by 4 position so hex digit would be swapped.
  3. Copy the rotated number in DL register.
  4. Mask high order 4 bits to zero.
  5. Add 30h in DL.
  6. Use function 02h and int 21h to print the number.
  7. Repeat the steps 2 to 6 for second digit also.
Code:
  1. MOV BH,96H ;number to print
  2. MOV CH,02H ;number of digits
  3. MOV CL,04H ;rotation count
  4. UP:ROL BH,CL ;swap digits
  5. MOV DL,BH
  6. AND DL,0FH ;mask MSB digit
  7. ADD DL,30H ;add 30 in DL
  8. MOV AH,02H ;function number
  9. INT 21H
  10. DEC CH ;do twice
  11. JNZ UP
Here, we simulate the code stepwise. The numbers refer to the statements above.
First pass:
  1. BH=96H
  2. CH=02H
  3. CL=04H
  4. BH=69H
  5. DL=69H
  6. DL=09H
  7. DL=39H
  8. AH=02H
  9. Prints the character with code DL=39H, i.e. 9, on screen.
  10. CH=01H
  11. Condition true as ZF=0. Jump to the 4th statement.
Second pass:
  4. BH=96H
  5. DL=96H
  6. DL=06H
  7. DL=36H
  8. AH=02H
  9. Prints the character with code DL=36H, i.e. 6, on screen.
  10. CH=00H
  11. Condition false as ZF=1, so the loop terminates.
Here we have printed only hex digits between 0 and 9. If a digit is between A and F, we need to check for that as well. The ASCII codes of '9' (39H) and 'A' (41H) are not adjacent: adding 30H to 0AH gives 3AH, which is 07H short of 41H. So, if the hex digit is greater than 9, we have to add 07H to the DL register as well. The following code will do the task.
 
MOV BH,5CH ;number to print
MOV CH,02H
MOV CL,04H
UP:
ROL BH,CL
MOV DL,BH
AND DL,0FH
CMP DL,09H
JBE NEXT
ADD DL,07H
NEXT:
ADD DL,30H
MOV AH,02H
INT 21H
DEC CH
JNZ UP

You may use a debugger to check the execution of the code. Note that this code works on the x86 architecture using MASM or TASM.
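The same rotate, mask and adjust steps can also be checked outside DOS with a short Python model. This is only a sketch that mirrors the register operations; the helper names `rol8` and `hex_digits` are hypothetical, and the result is returned as a string instead of being printed through INT 21H.

```python
def rol8(value, count):
    """Rotate an 8-bit value left by `count` bits, like ROL on a byte register."""
    count %= 8
    return ((value << count) | (value >> (8 - count))) & 0xFF


def hex_digits(number):
    """Return the two hex digits of a byte using the steps from the listing."""
    bh = number & 0xFF
    printed = []
    for _ in range(2):          # CH = 02H: two digits
        bh = rol8(bh, 4)        # ROL BH,CL: swap the digits
        dl = bh & 0x0F          # AND DL,0FH: keep the low digit
        if dl > 0x09:           # CMP DL,09H / JBE NEXT
            dl += 0x07          # ADD DL,07H: skip from '9'+1 to 'A'
        dl += 0x30              # ADD DL,30H: convert to ASCII
        printed.append(chr(dl))
    return "".join(printed)


print(hex_digits(0x96))  # 96
print(hex_digits(0x5C))  # 5C
```

Because each pass rotates first and then masks, the high digit comes out first, and after two rotations BH is back to its original value.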

Sunday, January 5, 2014

Input in Assembly Language Programming


You probably think of the function 'scanf' first. How simple it is! This C function reads data from the keyboard. As it is a high level language function, it is very simple to understand. But behind the scenes, a lot is going on to make 'scanf' work. All of these steps can be seen in an assembly language program. Reading something from the keyboard in the microprocessor's own programming language gives a real demonstration of how input works in computing. Let's see how it is done. I am describing the procedure here from the DOS point of view.

 
Most of these operations are based on interrupts. To read data from the keyboard, we need to invoke an interrupt! The DOS interrupt whose number is 21 in hexadecimal is required for this operation. An interrupt also needs a function number. To read a character, DOS interrupt 21H needs function number 01, which must be stored in the AH register of the processor.
So function number 01 of interrupt 21H reads a character from the keyboard, and reading a character is this simple:

MOV AH,01H
INT 21H

This code waits for a key press, echoes the character on screen, and then the processor continues with the next statement. The character pressed by the user is stored in the AL register as its ASCII value. In short, this function acts like the getche() function of C. For example, if the user presses 'a' on the keyboard, the value of AL will be 61 in hexadecimal! Reading a 2 or 4 digit hex number, however, takes more work.

In this case, we may use the following procedure for a two-digit number.
  1. Read the first digit as a character.
  2. As the ASCII value of a digit and the digit itself differ by 30H, subtract 30H from the value read. For example, the ASCII value of 5 is 35H.
  3. Swap the digits of the two-digit hex value in the register. For example, 05 will become 50.
  4. Store this value in some other register such as BL.
  5. To read the second digit of the number, repeat steps 1 and 2, and finally add the contents of AL and BL. This gives the actual number that was entered.

The following code will do the task for achieving this two digit number input.

1. MOV AH,01H
2. INT 21H ;reads first digit
3.
4. MOV BL,AL
5. SUB BL,30H
6. MOV CL,04H
7. ROL BL,CL
8.
9. MOV AH,01H
10. INT 21H ;reads second digit
11.
12. SUB AL,30H
13. ADD AL,BL ;AL will have number

For example, if the user enters 75 from the keyboard, then:
4. BL=37H
5. BL=07H
6. CL=04H
7. BL=70H
10. AL=35H
12. AL=05H
13. AL=75H ;AL with final value.

This code works only for digits between 0 and 9.

A hex number may also contain digits greater than 9. When the user enters a hex digit between A and F, we need one more step in the algorithm: subtract an additional 07H from the input value. This is because the ASCII code of 'A' is 41H, which is 07H more than 3AH, the code that would follow '9' (39H). So we compare the input with 39H; if it is below or equal, the extra subtraction is skipped. The conditional jump JBE/JNA can be used for this. The following code does the task for uppercase A to F.

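MOV AH,01H
INT 21H         ;reads first digit
MOV BL,AL
CMP BL,39H
JBE NEXT1       ;'0'-'9': skip the extra subtraction
SUB BL,07H      ;'A'-'F': adjust first
NEXT1:
SUB BL,30H
MOV CL,04H
ROL BL,CL       ;move first digit to the high nibble
MOV AH,01H
INT 21H         ;reads second digit
CMP AL,39H
JBE NEXT2
SUB AL,07H
NEXT2:
SUB AL,30H
ADD AL,BL       ;AL will have number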

Note: All the values used in this article are in hexadecimal form. It will work for x86 programming using MASM/TASM.
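The two-digit input procedure can also be traced with a short Python model. This is a sketch only: the names `hex_value` and `parse_hex_byte` are hypothetical, the two key presses are passed in as a string, and, as with the assembly procedure, only 0-9 and uppercase A-F are handled.

```python
def hex_value(ch):
    """Convert one typed character ('0'-'9' or 'A'-'F') to its value 0-15."""
    code = ord(ch)          # what INT 21H function 01H would leave in AL
    if code > 0x39:         # compare with 39H: above '9' means 'A'-'F'
        code -= 0x07        # extra subtraction for letters
    return code - 0x30      # subtract 30H to get the digit value


def parse_hex_byte(keys):
    """Combine two typed hex digits into one byte."""
    high = hex_value(keys[0]) << 4   # same effect as rotating 0xH left by 4
    low = hex_value(keys[1])
    return high + low                # add the two halves together


print(hex(parse_hex_byte("75")))  # 0x75
print(hex(parse_hex_byte("5C")))  # 0x5c
```

A lowercase letter such as 'c' (63H) would give a wrong value here, just as it would in the assembly version, because only 07H is subtracted for letters.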