Assembly vs C - how many bytes does it take to change a single byte in a file?
How many bytes does it take to change a single byte in a file? Well, a simple patch program just needs to make a few system calls to the OS (to make the change and to print some output for the user). Such an executable should never exceed a kilobyte, right? Right?
Well, it very much depends on your choice of programming language…
A couple of weeks ago I was gifted an original box with Need For Speed Special Edition, and I decided to play the DOS version from the disc on my retro computer. The game started if DOS was loaded into lower memory, but the in-game video playback was completely broken. Fortunately, there is a solution described in this brilliant post by Michal Necasek, "Need for Speed SE video glitch". The solution is extremely simple - you need to change the value of one byte in the game executable.
As an exercise I decided to write two implementations of the patch program - one in Assembly and another in C (using the standard C library). I was curious how big the compiled programs would be.
Here is my assembly implementation. patch.asm:
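        org 100h                    ; COM program: DOS loads it at offset 100h

        mov dx, FILE_NAME           ; DS:DX -> zero-terminated file name
        mov al, 010b                ; access mode 2: read/write
        mov ah, 3Dh                 ; DOS: open existing file
        int 21h
        jc file_open_error          ; carry flag is set on error
        mov bx, ax                  ; keep the file handle in BX

        mov ax, 4200h               ; DOS: seek (AH=42h) from start of file (AL=00h)
        mov cx, 0006h               ; high word of the offset
        mov dx, 07E6h               ; low word of the offset (CX:DX = 607E6h)
        int 21h
        jc patch_error

        mov ah, 40h                 ; DOS: write to file
        mov cx, 1                   ; one byte
        mov dx, NEW_BYTE_VALUE      ; DS:DX -> data to write
        int 21h
        jc patch_error

        mov ah, 3Eh                 ; DOS: close file (handle is still in BX)
        int 21h

        mov dx, MSG_SUCCESS
        mov ah, 9                   ; DOS: print "$"-terminated string
        int 21h
        mov al, 0                   ; exit code 0
        mov ah, 4Ch                 ; DOS: terminate program
        int 21h

patch_error:
        mov dx, MSG_ERROR_P
        mov ah, 9
        int 21h
        mov ah, 3Eh                 ; close the file before leaving
        int 21h
        jmp exit_with_err

file_open_error:
        mov dx, MSG_CANT_OPEN       ; printing runs on through FILE_NAME up to the "$"
        mov ah, 9
        int 21h

exit_with_err:
        mov al, -1                  ; exit code 255
        mov ah, 4Ch
        int 21h

NEW_BYTE_VALUE: db 08h
MSG_SUCCESS:    db "The file has been successfully patched!", 0Dh, 0Ah, "$"
MSG_ERROR_P:    db "Error patching file", 0Dh, 0Ah, "$"
MSG_CANT_OPEN:  db "Can not open: "
FILE_NAME:      db "NFS.EXE", 0h
MSG_LINE_END:   db 0Dh, 0Ah, "$"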
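The seek call in the listing is worth a second look: DOS function 42h takes the 32-bit file offset split across two 16-bit registers, with the high word in CX and the low word in DX, which is why 607E6h shows up as 0006h and 07E6h. A minimal C sketch of that split (the helper name split_offset is made up for illustration and is not part of the patch):

```c
#include <stdint.h>

/* Hypothetical helper: split a 32-bit file offset into the two 16-bit
   halves that INT 21h, AH=42h expects in CX (high) and DX (low). */
static void split_offset(uint32_t offset, uint16_t *cx, uint16_t *dx)
{
    *cx = (uint16_t)(offset >> 16);      /* high word */
    *dx = (uint16_t)(offset & 0xFFFFu);  /* low word  */
}
```

For the offset used here, split_offset(0x607E6, &cx, &dx) leaves 0x0006 in cx and 0x07E6 in dx.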
As expected it is just a series of system calls and several jumps for error handling. A makefile to compile it with NASM:
patch.com : patch.asm
	nasm -o patch.com patch.asm
The result is a COM executable of 171 bytes. Nice!
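Part of what keeps it this small is the data layout at the end of the listing: "Can not open: " has no "$" terminator of its own, so DOS function 09h keeps printing through FILE_NAME and its zero byte until it reaches the "$" in MSG_LINE_END (the zero byte goes to the console too, where it typically shows up as a blank). The file name is therefore stored only once. A rough C model of that behaviour, where dos_print_length, TAIL and FILE_NAME_OFFSET are illustrative names rather than anything from the patch:

```c
#include <stddef.h>

/* The same bytes as MSG_CANT_OPEN, FILE_NAME and MSG_LINE_END in patch.asm,
   laid out back to back. */
static const char TAIL[] = "Can not open: " "NFS.EXE" "\0" "\r\n" "$";

/* Offset of FILE_NAME inside TAIL: the length of the message prefix. */
#define FILE_NAME_OFFSET (sizeof("Can not open: ") - 1)

/* Illustrative model of INT 21h, AH=09h: count the bytes that would be
   printed starting at s, i.e. everything before the first '$'. */
static size_t dos_print_length(const char *s)
{
    size_t n = 0;
    while (s[n] != '$')
        n++;
    return n;
}
```

Here dos_print_length(TAIL) covers the prefix, the name, the zero byte and the line ending (24 bytes), while TAIL + FILE_NAME_OFFSET is an ordinary zero-terminated "NFS.EXE" of the kind the open call needs.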
Now let’s see what C will get us.
Here is my C equivalent, patch.c:
#include <stdio.h>

#define FILE_NAME "NFS.EXE"
#define BYTE_POS 0x607E6
#define BYTE_VAL 0x08

int main(void) {
    FILE* fp;

    fp = fopen(FILE_NAME, "rb+");
    if (fp == NULL) {
        fputs("Can not open: ", stdout); // fputs does not append a line ending, unlike puts
        puts(FILE_NAME); // puts has a smaller footprint than printf
        return -1;
    }
    if ((fseek(fp, BYTE_POS, SEEK_SET) != 0) || (fputc(BYTE_VAL, fp) == EOF)) {
        puts("Error patching file");
        fclose(fp);
        return -1;
    }
    fclose(fp);
    puts("The file has been successfully patched!");
    return 0;
}
Here is the makefile to compile and link it using Open Watcom:
patch.obj : patch.c
	wcc -0 -d0 -ms patch.c
patch.com : patch.obj
	wlink system com file patch.obj
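One caveat about the C version: stdio buffers writes, so fputc can report success while the byte only reaches the file during fclose. A stricter variant would also check the result of fclose. Below is a sketch of that idea as a standalone helper; the name patch_with_stdio is hypothetical, and this is not the program whose size is measured in this post.

```c
#include <stdio.h>

/* Hypothetical helper: overwrite one byte at offset pos in the file at path.
   Returns 0 on success, -1 on any failure, including a failed flush
   when the stream is closed. */
static int patch_with_stdio(const char *path, long pos, unsigned char val)
{
    FILE *fp = fopen(path, "rb+");   /* existing file, read/write, binary */
    if (fp == NULL)
        return -1;

    if (fseek(fp, pos, SEEK_SET) != 0 || fputc(val, fp) == EOF) {
        fclose(fp);
        return -1;
    }

    /* fclose flushes the buffered byte; a write error can surface here. */
    return (fclose(fp) == 0) ? 0 : -1;
}
```

A call such as patch_with_stdio("NFS.EXE", 0x607E6, 0x08) would then do the same job as the program above.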
Before analysing the results, I compiled an empty C program for DOS as a COM executable using Open Watcom. The resulting COM file, without any useful payload, slightly exceeded a kilobyte, and most of it was occupied by the C Runtime Library.
The role of the C Standard Library is even bigger - the program with the actual code from above compiles to a COM file of 7490 bytes. Different functions from the standard library had different impact; for example, using the simpler puts instead of printf helped to decrease the size by several kilobytes.
The difference in compiled size between the programs written in ASM and vanilla C is almost 44x (7490 / 171 ≈ 43.8). Most of the difference is, of course, code from the Standard Library, most of which is never actually executed by the program. This illustrates the need for projects such as Nolibc.
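Short of a full libc replacement, one way to pull in less of the library might be to bypass the buffered FILE layer and use descriptor-level calls instead. The sketch below assumes POSIX-style open, lseek, write and close; Open Watcom's library is assumed to offer equivalents, possibly under a different header such as io.h, and the resulting COM size is not measured here. The helper name patch_byte is hypothetical.

```c
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>

#ifndef O_BINARY
#define O_BINARY 0   /* only DOS/Windows toolchains define it; no-op elsewhere */
#endif

/* Hypothetical helper: overwrite one byte at offset pos without stdio.
   Returns 0 on success, -1 on failure. */
static int patch_byte(const char *path, long pos, unsigned char val)
{
    int fd = open(path, O_RDWR | O_BINARY);
    if (fd < 0)
        return -1;

    if (lseek(fd, (off_t)pos, SEEK_SET) == (off_t)-1 ||
        write(fd, &val, 1) != 1) {
        close(fd);
        return -1;
    }

    return (close(fd) == 0) ? 0 : -1;
}
```

The calls map almost one to one onto the DOS functions used in patch.asm (open, seek, write, close), which is what makes this variant an interesting candidate for a size comparison.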
Conclusion
As a next experiment I should probably try building it with some minimal libc replacement to see how much smaller I can get it with C, but honestly, my impression is that I don’t actually want to use it for such a simple task. To do a job as trivial as patching, I can instruct the computer directly, get a predictable size for the compiled binary, and enjoy the minimalism of pure Assembly. Yes, it’s not cross-platform, but this patch would not be used on other platforms anyway, and seeing a humble 171 bytes next to the name of an executable is something very satisfying, something nice, long forgotten and lost in the past decades. I’m declaring Assembly the winner :-)