Last modified 6/3/2004, Edward Keyes, ed-at-chibinochoco-dot-org

INTRODUCTION
------------

These utilities are used to extract and re-package data for the Wind
game by Minori. They have been developed under Linux and OS X but should
run without modification on other platforms equipped with a standard C
compiler and library, on either little- or big-endian processors.

Although originally developed using the game files from the demo version
of Wind, these utilities have been subsequently updated so they work on
the full game data files as well, including the v1.02 update.

Thanks to sagara for his original elucidation of the PAK file format,
and zalas for his reverse-engineering of the Wind executable.

UTILITIES
---------

wind-unpak -- Extracts a .PAK file into the current directory. It also
              generates a file list for easy repacking later.

wind-pak   -- Accepts a list of files and encodes them into a .PAK file.

wind-unmis -- Extracts the text elements from a .MIS script file (such
              as those contained in scr.pak) to an annotated text file.

wind-mis   -- Accepts an original .MIS script file and an annotated text
              file, and merges the two, creating a new script file with
              the text elements substituted in at the corresponding
              places.

All of the .PAK files from the demo version of Wind have been
successfully unpacked and repacked into byte-for-byte exact duplicates
of the originals, so there should be no compatibility issues with
repacked files. In addition, all of the .MIS files from both the demo
and full versions have been extracted and recombined into byte-for-byte
duplicates as well, so the command parser seems to be reasonably robust.

The .MIS utilities are limited in that they are mostly only able to
perform a one-to-one substitution of text blocks, rather than any
modification to sound and graphic elements or gameplay.
However, the wind-mis utility can accommodate splitting long text blocks
into multiple smaller ones, and does the job of recalculating internal
file offsets to incorporate changed text lengths (not necessary in the
demo).

FORMAT OF .PAK FILES
--------------------

The .PAK format is a very simple concatenated file format with a small
bit of obfuscation. Unless otherwise noted, all integers in the header
are stored in little-endian four-byte format.

To read the file, every byte must be replaced by its two's complement
(i.e. negated when treated as an 8-bit signed integer). This is most
likely a minimal security measure to keep people from easily looking at
the pictures or the script text with simple tools.

Once decoded, the .PAK file format begins with an integer count of the
number of files contained. For each file there is then the following
header entry:

   int     header_entry_length  (not including this field)
   string  filename             (null-terminated)
   int     file_size
   int     file_position        (measured from the end of the whole
                                 header)

The first field is basically the string length plus 9 for the null byte
and the two other fields, so it's pretty much redundant unless you're
trying to scan through the header quickly to get to the Nth entry. It
may just be included to account for the possibility of different header
entry formats.

Files are stored immediately after the header, with no compression,
although they do have the same two's-complement encoding as the header.
So far all PAK examples have had the files exactly concatenated and in
the same order as in the header list: it is not known whether this is
technically required or not. All examples have also been in reverse
alphabetical order, and this may or may not be required as well.

FORMAT OF .MIS FILES
--------------------

The script files are a concatenated series of variable-length commands,
each identified by an initial command byte.
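As an aside on the .PAK layout from the previous section, the decoding
step and one header entry can be sketched in C as follows. This is only
an illustration written from the description above, not code taken from
wind-unpak: the names pak_negate, read_le32, struct pak_entry and
pak_parse_entry are hypothetical, and the sketch assumes the header has
already been read into memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Replace every byte by its two's complement. Applying this twice
   restores the original, so it both decodes and encodes. */
static void pak_negate(uint8_t *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++)
        buf[i] = (uint8_t)(0u - buf[i]);
}

/* Read a little-endian four-byte integer on any host endianness. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical in-memory view of one decoded header entry. */
struct pak_entry {
    const char *name;      /* points into the decoded header buffer */
    uint32_t    size;      /* file_size */
    uint32_t    position;  /* file_position, from the end of the header */
};

/* Parse one already-decoded header entry at p, with avail bytes left.
   Returns the number of bytes consumed, or 0 if the entry is malformed. */
static size_t pak_parse_entry(const uint8_t *p, size_t avail,
                              struct pak_entry *out)
{
    if (avail < 4)
        return 0;
    uint32_t entry_len = read_le32(p);      /* excludes this field */
    if (entry_len < 9 || entry_len > avail - 4)
        return 0;

    const uint8_t *name = p + 4;
    /* The name and its null byte occupy all but the last 8 bytes. */
    const uint8_t *nul = memchr(name, 0, entry_len - 8);
    if (nul == NULL || (size_t)(nul - name) + 9 != entry_len)
        return 0;                           /* length != strlen + 9 */

    out->name     = (const char *)name;
    out->size     = read_le32(nul + 1);
    out->position = read_le32(nul + 5);
    return 4 + (size_t)entry_len;
}
```

The consistency check against "string length plus 9" is a choice made
for this sketch; a more lenient reader could simply trust the length
field and skip ahead.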
Some of the script commands are understood, and others are mysteries --
however, as long as we know the length of the unknown commands, we can
jump ahead to the next until we find the ones that we're looking for and
do understand. Unless otherwise noted, all integers are stored in
little-endian four-byte format.

Each file begins with an unknown four-byte sequence, possibly a command
count. After this, the command stream begins.

The most important command for our purposes is 0x08, the text block.
This byte is followed by:

   int     sequence_identifier   (an incremented text block ID)
   string  voice_filename        (from voice.pak, null-terminated)
   int     speaker_code          (from charnames.png in sys.pak, or -1
                                  for none)
   int     number_of_text_lines
   string  text[number_of_text_lines]

Each line is null-terminated and in the Shift-JIS character set. If
there is no filename for a voice sound file, just the null byte is
included for an empty string. The text block ID number is probably used
for identifying save-game locations in the script, and duplicating them
or skipping numbers doesn't seem to adversely affect the game.

If you need to manually remove or insert a new text block, you will also
need to account for the sequence 0x01 0x00 which often appears next to a
text block. 0x01 is the 'pause for the player' command, and 0x00 clears
the text area in preparation for the next block.

Command 0x23 incorporates the text of an in-game choice. It has the
format:

   int     choice_number   (zero-based)
   string  choice_text     (null-terminated)

Each option is only one line long, so care with the wording of any
translation may be necessary.

The other important commands control jumps within the files, based on
player choices or other considerations. There are several of them: 0x09,
0x24, 0x2A, 0x2B, 0x2C, and 0x2E (and hopefully no more than that). All
of these unfortunately use exact byte positions within the file, so
substituting text blocks destroys the positioning.
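To make the problem concrete, here is a minimal sketch in C of how a
stored jump target could be moved after text blocks change size. It is
not the actual wind-mis code: struct resize and remap_offset are
hypothetical names, and the sketch assumes every jump target lands on a
command boundary rather than inside a replaced block.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of one replaced block: where it started in the
   original file, and how many bytes it grew (positive) or shrank
   (negative) after substitution. */
struct resize {
    long old_start;
    long delta;
};

/* Map a byte position in the original file to the matching position in
   the rewritten file, by adding the size change of every replaced block
   that started before that position. */
static long remap_offset(long old_target, const struct resize *edits,
                         size_t n)
{
    long shift = 0;
    assert(edits != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (edits[i].old_start < old_target)
            shift += edits[i].delta;
    return old_target + shift;
}
```

For example, if a block at position 100 grows by 12 bytes and a block at
position 300 shrinks by 5, a jump that used to point at 400 has to point
at 407 afterwards, while a jump to position 100 itself is unaffected.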
The wind-mis utility recalculates all of these jump offsets to preserve
the game structure.

The details of the rest of the commands are not terribly interesting and
not terribly well known... have a look at the mis.h header file for
various comments about them if you're interested. They all seem to take
a mixture of strings (mostly filenames) and integers as arguments so
far, although some commands may have an extra terminating null which
isn't accounted for: since 0x00 is a single-byte command anyway, the
distinction is rarely important.

FORMAT OF THE SCRIPT TEXT FILE
------------------------------

The wind-unmis utility creates a text file for editing containing all of
the text blocks from a .MIS file. Each text block is preceded by a
comment (denoted by # at the beginning of a line) indicating the block
number, the associated voice file, and the speaker, if any.

A single text block can consist of multiple lines. Lines can be added or
subtracted without problem -- the game's text area can contain 4 lines.
You will have to perform word-wrapping manually: the default font in
Wind can fit exactly 60 English characters per line. Note that Japanese
symbols and some special characters (curly quotes, etc.) are
double-width if you need to mix and match with English.

The wind-mis utility ignores the contents of the comment lines, but does
use them as separators for identifying text block boundaries. So if you
want, you can comment out the Japanese text (by putting a # at the start
of those lines) and leave it present as a reference. Note that changing
the block number, voice file, or speaker is not currently supported: all
this utility does is swap in the new text in the place of any existing
block. If more than 4 lines have to be substituted in, the utility
creates multiple text blocks with four or fewer lines each, separated by
user pauses.

The text from in-game choices is also included in the script file,
similarly separated by a comment line.
In this case, however, only one line of choice text is allowed,
similarly limited to 60 characters.

Jump points are also noted, with the approximate text block of the
destination given. A choice block will be followed by a number of jump
points equal to the number of options, but additional jumps occur in the
game. A jump point which seems to point past the end of the text
probably lands on an internal "go to another script file" command.

ABOUT THE WIND EXECUTABLE
-------------------------

There is something of a bug in the 'log' feature of the game, wherein
any English text is offset vertically by about half a character height
compared to any Japanese text. The best guess is that the half-width
characters are throwing off the game's computation of the text height.
In any case, this means that lines with mixed English and Japanese (or
other Shift-JIS special characters) will look a bit odd when using the
log function. The lines will look normal while just playing the game,
though.

Since the voice files are played from within a text block command, the
game would ordinarily have a problem if it needed to be able to play a
voice file without showing the text area. In fact, at the end of the
demo, there's an instance of this. To get around this problem, the Wind
programmers did an unfortunate thing: they coded that a text block
consisting only of a special character would not display the text area.
Sadly, the "special" character they chose is the space character 0x20,
which isn't used in Japanese text ordinarily but is used a lot in the
English text we wanted to insert: any instances of spaces were ignored
by the text engine.

In the demo executable, this particular check is controlled by the 0x20
byte at file offset 0x0122FC. Changing this to the value of some other
(unused) character will change the behavior of the game.
For the NNL patched demo executable we chose the nonprintable character
0x1F, and the last two text blocks had to contain that value in order to
hide the text area. Note that the full game executable is different from
the demo version, and it will need to be patched differently.
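
The single-byte change to the demo executable could be made with a few
lines of C along these lines. This is only a sketch under the
assumptions above, not the tool actually used for the NNL patch: the
function name patch_byte and the macro names are hypothetical, and the
offset applies to the demo executable only. The function refuses to
write unless the byte currently holds the expected value, which guards
against patching the wrong file or patching twice.

```c
#include <assert.h>
#include <stdio.h>

#define DEMO_CHECK_OFFSET 0x0122FCL  /* demo executable only */
#define ORIGINAL_CHAR     0x20       /* space */
#define REPLACEMENT_CHAR  0x1F       /* nonprintable stand-in */

/* Replace one byte in a file opened for update ("r+b"), but only if it
   currently holds the expected value. Returns 0 on success, -1 if the
   byte does not match or an I/O call fails. */
static int patch_byte(FILE *f, long offset, int expected, int replacement)
{
    int c;

    assert(f != NULL);
    if (fseek(f, offset, SEEK_SET) != 0)
        return -1;
    c = fgetc(f);
    if (c != expected)
        return -1;          /* wrong file, short file, or already patched */

    /* A positioning call is required between reading and writing. */
    if (fseek(f, offset, SEEK_SET) != 0)
        return -1;
    if (fputc(replacement, f) == EOF)
        return -1;
    return fflush(f) == 0 ? 0 : -1;
}
```

A caller would open a backup copy of the demo executable in "r+b" mode
and call patch_byte(f, DEMO_CHECK_OFFSET, ORIGINAL_CHAR,
REPLACEMENT_CHAR). The full game executable needs a different offset,
which is not given here.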