Data Compression

From TLoZ: ALTTP Hacking Resources
Jump to: navigation, search

See also: Standard Notation

ALTTP (and some other SNES games) use a simple form of run-length compression to reduce the ROM consumption of data that often have repeated sequences within them (such as a row of pixels all the same color, or a row of tiles all the same type). Decompressing data that are compressed in this way takes more time than simply copying them from ROM bytewise, so compression should be used only when necessary.

Compressed Format

A compressed datum is a sequence of single-byte command headers with optional arguments. The compressed sequence is terminated by the $FF header with no arguments.

Header byte -- bits 7..5
command
Header byte -- bits 4..0
length parameter

If the command within the header byte is %111, then the byte after the header byte is a header extension byte and the extended header is as follows.

Header byte -- bits 4..2
command
Header byte -- bits 1..0
bits 9..8 of length parameter
Header extension byte
bits 7..0 of length parameter

NOTE: The length parameter is implied to be one more than actually stored in the header, because you would never want a command to generate 0 bytes, so once the length has been calculated, increment it by 1.

Commands 0-4 are valid. Commands 5-7 are unused and invalid.

Command 0
the next (length parameter + 1) bytes of the input are copied to the output bytewise
Command 1
the next byte of the input is copied to the output (length parameter + 1) times
Command 2
let the next byte of the input be A
let the byte of the input after A be B
A and B are copied (alternating, as in ABABABAB...) to the output until (length parameter + 1) bytes have been copied to the output
Command 3
let the next byte of the input be A
A is copied and incremented (as in A,A+1,A+2,A+3...) to the output until (length parameter + 1) bytes have been copied to the output
Command 4
let the next byte of the input be A
let the byte of the input after A be B
A and B form an offset into the current output buffer (the bytes that have already been decompressed and copied to the output)
let this offset be X
X = A | ( B << 8 )
(length parameter + 1) bytes are copied from within the current output buffer and appended to the end of the current output buffer

Sample C# Code

NOTE: This code assumes the compressed input is well-formed. If the input is garbage, corrupted, or the wrong format, the code may hang, raise an exception, or generate garbage output.

public static class AlttpDataCompression
{
	static System . Collections . Generic . List < byte > Output = new System . Collections . Generic . List < byte > ( ) ;

	public static byte [ ] Decompress ( byte [ ] Input , int InputOffset )
	{
		Output . Clear ( ) ;

		while ( true )
		{
			int Header = Input [ InputOffset ] ;

			InputOffset ++ ;

			if ( Header == 0xFF )
			{
				return ( Output . ToArray ( ) ) ;
			}

			int Command = Header >> 5 ;

			int Length = Header & 0x1F ;

			if ( Command == 0x7 )
			{
				Command = ( Header >> 2 ) & 0x7 ;

				Length = ( Header & 0x3 ) << 8 ;

				Length |= Input [ InputOffset ] ;

				InputOffset ++ ;
			}

			Length ++ ;

			if ( Command == 0 )
			{
				for ( int Index = 0 ; Index < Length ; Index ++ )
				{
					Output . Add ( Input [ InputOffset + Index ] ) ;
				}

				InputOffset += Length ;
			}
			else if ( Command == 1 )
			{
				for ( int Index = 0 ; Index < Length ; Index ++ )
				{
					Output . Add ( Input [ InputOffset ] ) ;
				}

				InputOffset ++ ;
			}
			else if ( Command == 2 )
			{
				int AlternatingOffset = 0 ;

				for ( int Index = 0 ; Index < Length ; Index ++ )
				{
					Output . Add ( Input [ InputOffset + AlternatingOffset ] ) ;

					AlternatingOffset = ( AlternatingOffset + 1 ) % 2 ;
				}

				InputOffset += 2 ;
			}
			else if ( Command == 3 )
			{
				int IncrementingValue = Input [ InputOffset ] ;

				InputOffset ++ ;

				for ( int Index = 0 ; Index < Length ; Index ++ )
				{
					Output . Add ( ( byte ) ( IncrementingValue & 0xFF ) ) ;

					IncrementingValue ++ ;
				}
			}
			else if ( Command == 4 )
			{
				int CopyOffset = Input [ InputOffset ] | ( Input [ InputOffset + 1 ] << 8 ) ;

				InputOffset += 2 ;

				for ( int Index = 0 ; Index < Length ; Index ++ )
				{
					Output . Add ( Output [ CopyOffset + Index ] ) ;
				}
			}
		}
	}
}