ECE291 Computer Engineering II Lockwood, Fall 1997

Machine Problem 2: Data Compression

Due Date: Friday 10/3/97
Purpose: Math, User I/O, Subroutines
Points: 50

Introduction

Compression algorithms reduce the size of data by eliminating redundancy. These algorithms often allow the information content of a message, image, or picture to be preserved using fewer bits. Run Length Encoding (RLE) is one such compression algorithm, and it preserves the exact content of the original information.

Many everyday devices use compression. Modems encode data redundancies to increase throughput. Fax machines encode blank areas to reduce facsimile transmission time. Graphics programs save disk space by encoding the redundancy in photos.

A run length encoder looks for runs of identical symbols. When it finds one, the encoder transmits the data element once, followed by a special repeat (REP) symbol to indicate redundancy and a count that indicates how many times the symbol should be repeated.
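The idea in the paragraph above can be sketched in C as a readable reference model. This is only an illustration, not the encoding this MP requires: the function name rle_sketch, the textual "<REP>" marker, the choice to emit REP only for runs longer than one symbol, and the choice to write the count as the total run length in decimal are all assumptions made here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical readable form: a run of n > 1 identical characters c is
   written as c, then the marker "<REP>", then n in decimal (assumed to be
   the total run length). Single characters are copied unchanged.
   Returns 0 on success, -1 if the output buffer is too small. */
static int rle_sketch(const char *in, char *out, size_t outsz)
{
    size_t used = 0;

    assert(in != NULL && out != NULL && outsz > 0);
    out[0] = '\0';

    while (*in != '\0') {
        size_t run = 1;
        int n;

        /* Measure the run; the terminating NUL never equals in[0] here. */
        while (in[run] == in[0])
            run++;

        if (run > 1)
            n = snprintf(out + used, outsz - used, "%c<REP>%zu", in[0], run);
        else
            n = snprintf(out + used, outsz - used, "%c", in[0]);

        if (n < 0 || (size_t)n >= outsz - used) {
            out[used] = '\0';   /* drop the partially written token */
            return -1;
        }
        used += (size_t)n;
        in += run;
    }
    return 0;
}
```

With these assumptions, the input AAAABCC becomes A<REP>4BC<REP>2. The output only shrinks when runs are long enough to outweigh the cost of the marker and count.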

In this MP, we will encode and decode English text messages. The English alphabet contains 26 letters (A..Z). Five bits of data can uniquely represent 2^5 = 32 symbols. This is enough to encode all of the letters and still leave a few extra symbols for characters such as the space and asterisk (*).
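The arithmetic above can be checked with a small C sketch. The code assignment used here (A..Z as 0..25, space as 26, asterisk as 27) is purely hypothetical and chosen for illustration, as are the names to_symbol and from_symbol; the actual assignment is whatever the Encoding Rules specify.

```c
#include <assert.h>

enum {
    SYMBOL_BITS  = 5,
    SYMBOL_COUNT = 1 << SYMBOL_BITS,   /* 2^5 = 32 distinct codes */
    LETTER_COUNT = 26                  /* A..Z */
};

/* Hypothetical mapping from a character to a 5-bit code.
   Returns -1 for characters that have no code in this sketch. */
static int to_symbol(char c)
{
    if (c >= 'A' && c <= 'Z')
        return c - 'A';                /* 0..25, assumes ASCII */
    if (c == ' ')
        return 26;
    if (c == '*')
        return 27;
    return -1;
}

/* Inverse of to_symbol for the codes assigned above. */
static char from_symbol(int code)
{
    assert(code >= 0 && code < SYMBOL_COUNT);
    if (code < LETTER_COUNT)
        return (char)('A' + code);
    if (code == 26)
        return ' ';
    if (code == 27)
        return '*';
    return '?';                        /* 28..31 unassigned in this sketch */
}
```

Since 32 - 26 = 6, six codes remain after the letters. This sketch spends two of them on the space and the asterisk and leaves four unassigned.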

Encoding Rules

User Interface

Sample Input & Output

Data Structures

Procedures

Preliminary Procedure

Final Steps

  1. Demonstrate MP2.EXE to a TA or to the instructor. You will then be asked to recompile and demonstrate MP2 with different input files. Your program must work with all given input. Once approved, you are ready to turn in your program.
  2. Be prepared to answer questions about any aspect of the operation of your program. The TAs will not accept an MP if you cannot fully explain the operation of your code.
  3. Copy your programs to the handin floppy:
    A:\Handin YourWindowsLogin
  4. Print MP2.ASM
  5. Take your printout and disk with MP2 to the same TA who approved your demonstration. Be sure that your name is on the disk and on the printout.

MP2.ASM (Program framework)

        PAGE    75, 132
        TITLE   ECE291:MP2:MP2-Compress - Your Name - Date

COMMENT *
        Data Compression.

        The world contains a great deal of data. Luckily, a great deal
        of it is redundant (i.e., repeats itself or has repeating
        patterns). Using compression algorithms, one can encode such
        data using a smaller number of bits.

        For this MP, you will write a program which uses Run-Length
        Encoding (RLE) to compress textual data. As you will see, RLE
        is most effective on data which has long runs of identical
        characters.

        ECE291: Machine Problem 2
        Prof. John W. Lockwood
        Dept. of Electrical & Computer Engineering
        University of Illinois
        Fall 1997
        Ver 1.0
        *

;====== Constants =========================================================

BEEP    EQU     7
BS      EQU     8
CR      EQU     13
LF      EQU     10
ESCKEY  EQU     27
SPACE   EQU     32

BufferMaxLength     EQU 35                      ; Bytes
BufferMaxLengthBits EQU BufferMaxLength * 8     ; Bits
TextMsgMaxLength    EQU 56                      ; Bytes

;====== Externals =========================================================

; -- LIB291 Routines (Free) ---

        extrn   kbdine:near, kbdin:near, dspout:near    ; LIB291 Routines
        extrn   dspmsg:near, binasc:near, ascbin:near   ; (Always Free)
        extrn   mp2xit:near     ; Exit program with a call to this procedure

; -- LIBMP2 Routines (Replace these with your own code) ---

        extrn   PrintBuffer:near        ; Print contents of Buffer
        extrn   ReadBuffer:near         ; Read Buffer from keyboard
        extrn   ReadTextMsg:near        ; Read TextMsg from keyboard
        extrn   PrintTextMsg:near       ; Print contents of TextMsg
        extrn   Encode:near             ; Encode ASCII -> 5-bit
        extrn   AppendBuffer:near       ; Add a character to Buffer
        extrn   EncodeRLE:near          ; Run Length Encode TextMsg -> Buffer
        extrn   DecodeRLE:near          ; Run Length Decode Buffer -> TextMsg

;====== SECTION 3: Define stack segment ===================================

stkseg  segment stack                   ; *** STACK SEGMENT ***
        db      64 dup ('STACK   ')     ; 64*8 = 512 Bytes of Stack
stkseg  ends

;====== SECTION 4: Define code segment ====================================

cseg    segment public 'CODE'           ; *** CODE SEGMENT ***
        assume  cs:cseg, ds:cseg, ss:stkseg, es:nothing

;====== SECTION 5: Variables ==============================================

Buffer          db BufferMaxLength dup(0)           ; Data Buffer for encoded Message
TextMsg         db TextMsgMaxLength dup('$'), '$'   ; Text Message
BufferLength    dw 0                                ; Number of bits in buffer
crlf            db CR,LF,'$'    ; DOS uses carriage return + Linefeed for new line

PUBLIC Buffer, TextMsg, BufferLength

;====== Procedures ========================================================

; Your Subroutines go here !
; ---- ----------- -- ----

;====== Main procedure ====================================================

MenuMessage db CR,LF, \
        '------------- MP2 Menu --------------',CR,LF,\
        ' Enter (T)ext / (B)inary',CR,LF, \
        ' Print (M)essage / (R)buffeR',CR,LF, \
        ' RLE (E)ncode / (D)ecode',CR,LF, \
        '------ [ESC] or (Q)uit to exit ------',CR,LF,'$'

main    proc    far

        mov     ax, cseg
        mov     ds, ax

        MOV     DX, Offset MenuMessage
        CALL    DSPMSG                  ; Display Menu

MainLoop:
        MOV     DX, Offset CRLF
        CALL    DSPMSG

MainRead:
        CALL    KBDIN                   ; Read Input
        CMP     AL,'a'
        JB      MainOpt
        CMP     AL,'z'                  ; Convert Lowercase to Uppercase
        JA      MainOpt
        SUB     AL,'a'-'A'

MainOpt:
        CMP     AL,'T'
        JNE     MainNotT
        Call    ReadTextMsg             ; Read in a text message
        JMP     MainLoop

MainNotT:
        CMP     AL,'B'
        JNE     MainNotB
        Call    ReadBuffer              ; Read in a binary message
        JMP     MainLoop

MainNotB:
        CMP     AL,'M'
        JNE     MainNotM
        Call    PrintTextMsg            ; Print TextMsg
        JMP     MainLoop

MainNotM:
        CMP     AL,'R'
        JNE     MainNotR                ; Print Buffer
        Call    PrintBuffer             ; (show least significant bit first)
        JMP     MainLoop

MainNotR:
        CMP     AL,'E'
        JNE     MainNotE
        Call    EncodeRLE               ; Run Length Encode Message
        Call    PrintBuffer             ; and print result
        JMP     MainLoop

MainNotE:
        CMP     AL,'D'
        JNE     MainNotD
        Call    DecodeRLE               ; Run Length Decode Message
        Call    PrintTextMsg            ; and show result
        JMP     MainLoop

MainNotD:
        CMP     AL,ESCKEY
        JE      MainDone                ; Quit program
        CMP     AL,'Q'
        JE      MainDone
        JMP     MainRead                ; Ignore any other character

MainDone:
        call    MP2xit                  ; Exit to DOS

main    endp

cseg    ends
        end     main
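
The framework's constants fit together: Buffer holds 35 bytes, or 35 * 8 = 280 bits, which is exactly 56 five-bit symbols, the same as TextMsgMaxLength. The C sketch below models a bit buffer with that capacity and a counter playing the role of BufferLength. It is not the required AppendBuffer: the names bit_buffer, buffer_clear, and append_symbol are hypothetical, and storing each code least significant bit first is an assumption taken from the PrintBuffer comment, not a stated rule.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum {
    BUFFER_MAX_LENGTH      = 35,                     /* bytes, as BufferMaxLength */
    BUFFER_MAX_LENGTH_BITS = BUFFER_MAX_LENGTH * 8,  /* 280 bits */
    CODE_BITS              = 5                       /* bits per symbol */
};

struct bit_buffer {
    unsigned char bytes[BUFFER_MAX_LENGTH];
    size_t length_bits;                 /* number of bits currently stored */
};

/* Reset the buffer to empty with all bits cleared. */
static void buffer_clear(struct bit_buffer *b)
{
    assert(b != NULL);
    memset(b, 0, sizeof *b);
}

/* Append the low 5 bits of code, least significant bit first (assumed
   order). Returns 0 on success, -1 if fewer than 5 free bits remain. */
static int append_symbol(struct bit_buffer *b, unsigned code)
{
    int i;

    assert(b != NULL);
    if (b->length_bits + CODE_BITS > BUFFER_MAX_LENGTH_BITS)
        return -1;

    for (i = 0; i < CODE_BITS; i++) {
        size_t pos = b->length_bits++;
        if ((code >> i) & 1u)
            b->bytes[pos / 8] |= (unsigned char)(1u << (pos % 8));
    }
    return 0;
}
```

After buffer_clear, 56 calls to append_symbol succeed and the 57th is rejected. Because 5 does not divide 8, most symbols straddle a byte boundary, which is why the length is tracked in bits and not bytes.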