So this post is a bit of a follow-up to a post I made last year (man, time sure does fly) about expanding the tag data section of a cache file. A lot of work has been done since then (especially fairly recently), and things have changed enough that I feel the topic needs revisiting. Everything that I will cover here has already been implemented in the injection branch of Assembly, so the intent of this is to document the changes that were made and how they work.
Also, most of the stuff covered in here wasn't my work, so don't credit me for it.
But first...story time!
So...why revisit this?
Good question. Last year, I discovered a way to expand the tag data section of a cache file. However, of course, that only works on the tag data section. You can't use it to expand the resource table, you can't use it to expand the debug string data (tag names and stringIDs), and you can't use it to expand the localization data (even though I figured out a hackish way and never documented it).
The problem with this is that it makes tag injection difficult. Injected tags can't have names (because we can't rely upon free space being available) and stringIDs have to get approximated when injecting. On top of that, Gamecheat has been yelling at me because locale stringIDs can't be edited for the same reason (sorry). So some work needed to be done.
Blazing a trail
Well, Zedd figured out a way to expand the resource table by comparing a leaked version of a map he had with the release version and then coming up with a list of offsets that need to be changed. It worked and he was able to (sort of) inject a custom bitmap. Pretty cool.
This leaves the debug string section (the section containing tag names and stringIDs). OrangeMohawk made a tutorial describing how to add new stringIDs and tag names, but this relies upon the fact that the section sizes are rounded to multiples of 0x1000 bytes and that there's a high chance of there being free space. But this doesn't guarantee that space will be available and it isn't consistent. So I challenged him to find a way to expand the section.
Again, he was successful. However, something didn't make sense. He found that the value at 0x474 had to be increased by the amount of extra space injected in order for the map to load, and this value had been previously undocumented by me. I checked the BlamLib project (part of Open-Sauce) to see if Kornman knew anything about this value, and he had it labeled as "RuntimeBaseAddress," describing it as "Base address for runtime (Debug, Tag and Language pack) sections" but then also left a comment stating that it's unused in release builds. Wait, what?
Considering that there was an obvious contradiction here, I contacted Korn and told him about what Mohawk found. He found this interesting and decided to look in the XEX. He came across this bit of code in the Reach beta:
.text:82551218 loc_82551218:                    # CODE XREF: cache_file_read_tag_section+148j
.text:82551218     lwz     r6, s_map_file_info.header.memory_buffer_size(r30)
.text:8255121C     clrlwi  r11, r6, 20
.text:82551220     cmpwi   cr6, r11, 0
.text:82551224     beq     cr6, loc_82551230
.text:82551228     ori     r11, r6, 0xFFF
.text:8255122C     addi    r6, r11, 1           # buffer size
.text:82551230
.text:82551230 loc_82551230:                    # CODE XREF: cache_file_read_tag_section+164j
.text:82551230     lwz     r11, 0(r29)
.text:82551234     li      r4, 2                # section type (tags)
.text:82551238     mr      r3, r29              # sub_8260F390
.text:8255123C     lwz     r7, s_map_file_info(r30)    # buffer (tag memory address)
.text:82551240     lwz     r5, s_map_file_info.header.memory_buffer_offset(r30)    # cache offset
.text:82551244     lwz     r10, 0(r11)
.text:82551248     mtspr   CTR, r10
Now, I get it, most of you probably can't understand that. But what it essentially does is determine the offset of the tag data in the file by taking the memory buffer offset (located at offset 0x14) and adding the value at 0x474 (accounting for 32-bit integer overflow). This essentially makes the value at 0x474 an offset mask. Or more formally:
tag data offset = memory buffer offset + tag data offset mask
tag data address mask = tag data address - tag data offset
So, yes, that's right. The method that's been used to calculate "map magic" for the past six years is totally wrong.
Putting it all together
Given this, we can look around the tag data offset mask to find that starting at 0x46C, there's an array of offset masks, one for each section of the file. In order, they are for the debug, resource, tag, and localization sections:
0x46C debug offset mask
0x470 resource offset mask
0x474 tag offset mask
0x478 localization offset mask
Now, following this, we have the "section interop" table. This table lists virtual address and size pairs for each section of the file. (Note: the virtual addresses are not memory addresses. Don't even try.) The section addresses follow after each other (with a few exceptions, which I'll explain below), and adding an offset mask to a section's corresponding address yields its offset in the file. So:
0x47C debug virtual address
0x480 debug size
0x484 resource virtual address
0x488 resource size
0x48C tag data virtual address
0x490 tag data size
0x494 localization virtual address
0x498 localization size
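As a rough illustration of how those two header blocks fit together, here is a small Python sketch that reads them and turns each section's virtual address into a file offset. This is not Assembly's actual code. The names `read_sections`, `file_offset`, and `Section` are hypothetical, and the big-endian default is my assumption (these are Xbox 360 files), so flip it if your data disagrees.

```python
import struct
from collections import namedtuple

SECTION_NAMES = ("debug", "resource", "tag", "localization")
OFFSET_MASKS_AT = 0x46C    # four 32-bit offset masks, in the order above
INTEROP_TABLE_AT = 0x47C   # four (virtual address, size) pairs, same order

Section = namedtuple("Section", "name offset_mask virtual_address size")


def read_sections(header, big_endian=True):
    """Read the offset masks and the section interop table from raw header bytes."""
    endian = ">" if big_endian else "<"   # byte order is an assumption
    masks = struct.unpack_from(endian + "4I", header, OFFSET_MASKS_AT)
    pairs = struct.unpack_from(endian + "8I", header, INTEROP_TABLE_AT)
    return [
        Section(name, masks[i], pairs[2 * i], pairs[2 * i + 1])
        for i, name in enumerate(SECTION_NAMES)
    ]


def file_offset(section):
    """Virtual address + offset mask, wrapped to 32 bits, gives the offset in the file."""
    return (section.virtual_address + section.offset_mask) & 0xFFFFFFFF
```

Something like `read_sections(open("forge_halo.map", "rb").read(0x49C))` would be the obvious way to try it, since 0x49C is where the table ends.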
Finally, this leaves rebuilding that data. If this can be done, then any section of the file can be resized (provided that all pointers are recalculated accordingly, which Assembly's backend takes care of automatically). Based off of some observations I made about how these values are set up, I came up with the following formulas:
resource virtual address = 0
localization virtual address = resource virtual address + resource size
tag virtual address = resource virtual address + resource size + localization size
debug virtual address = resource virtual address + resource size + localization size + tag data size
if tag data size > 0:
    debug virtual address -= first tag partition size

debug offset mask = debug data offset - debug virtual address
resource offset mask = resource data offset - resource virtual address
tag offset mask = tag data offset - tag data virtual address
localization offset mask = localization data offset - localization virtual address
Of course, there are a few things to note about this:
- If a section's size is zero, then its offset mask and virtual address should also be zero (however, the converse is not true - a virtual address or offset mask of zero does not imply that the section doesn't exist).
- The virtual addresses in Halo 3 are allocated differently, but it doesn't matter.
- The official cache files have a "gap" between the resource and localization sections which is likely due to removed debugging data. It doesn't matter and doesn't need to be accounted for.
- Doing this requires rebasing any string or localization pointers in the file. Deal with it.
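To make the formulas and the zero-size rule a bit more concrete, here is a minimal Python sketch of the rebuild step. It is not Assembly's implementation. The function name, the dictionary layout, and the `first_tag_partition_size` parameter name are all hypothetical, and it ignores the Halo 3 allocation difference and the pointer rebasing mentioned in the notes.

```python
MASK32 = 0xFFFFFFFF
SECTION_NAMES = ("debug", "resource", "tag", "localization")


def rebuild_section_table(sizes, file_offsets, first_tag_partition_size=0):
    """Return {name: (offset mask, virtual address, size)} for the four sections.

    sizes and file_offsets are dicts keyed by the names in SECTION_NAMES.
    """
    va = {}
    va["resource"] = 0
    va["localization"] = va["resource"] + sizes["resource"]
    va["tag"] = va["localization"] + sizes["localization"]
    va["debug"] = va["tag"] + sizes["tag"]
    if sizes["tag"] > 0:
        va["debug"] -= first_tag_partition_size

    table = {}
    for name in SECTION_NAMES:
        if sizes[name] == 0:
            # Empty section: offset mask and virtual address are both zero.
            table[name] = (0, 0, 0)
        else:
            # offset mask = data offset - virtual address, wrapped to 32 bits
            mask = (file_offsets[name] - va[name]) & MASK32
            table[name] = (mask, va[name] & MASK32, sizes[name])
    return table
```

A quick sanity check for any result: for every non-empty section, (virtual address + offset mask) wrapped to 32 bits should land back on the file offset you passed in.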
So...there we have it. Using this, any part of a cache file can be expanded, and any part of a cache file can be loaded. Properly.
Hey look, that program again!
Mapexpand has been updated to accommodate this. A new feature allows you to optionally specify a section to expand (instead of just expanding the tag data, which is the default). For example:
mapexpand forge_halo.map stringiddata 1
adds one page to the string ID data in forge_halo.map. Valid sections include "stringidindex", "stringiddata", "tagnameindex", "tagnamedata", "resource", and "tag". If no section is specified, the tag data is expanded (so usage is backwards-compatible). Should be pretty self-explanatory.
Virus scan here.
Source code for this is available here.
So, let's see what people can do with this.
Love you guys.
Here's to the next year.