You may have heard @jeb_ talking about implementing a new file format for Minecraft. You may be wondering what he means by the new format being “faster” and “easier to move around”, and why the current system is inefficient. The following is divided into three parts: Part 1: a brief explanation of how hard drives save files; Part 2: file size efficiency; and Part 3: increasing 'copy and paste' speed. It is aimed at those who are not too familiar with technology and disk file systems.
(Note: The following assumes you are running Windows with the NTFS file system. If not, it shouldn't be too different anyway.)
Open Notepad and type one character. Doesn’t matter which. For instance type the letter “a”. Save this somewhere, such as the desktop. Right click the file and open the 'Properties' menu. You will notice it says:
Size: 1 bytes (1 bytes)
But look at the line below it:
Size on disk: 4.00 KB (4,096 bytes)
What's the difference between “Size” and “Size on disk”? And why do the two come to such different amounts? Well, to answer that you need to know a bit about how your hard drive saves files.
It says the size is only one byte because there is only one character in the file. But it says “Size on disk: 4.00 KB (4,096 bytes)” because that is the size of one cluster. To make managing data on your hard drive more efficient, with less overhead, the drive is divided into these 4 KB clusters. Old hard drives used to be divided into 512-byte sectors. But as hard drives started to expand into the gigabytes, the number of sectors per drive grew as well, and keeping track of every sector individually became too much bookkeeping. The solution was to group the sectors into indivisible “clusters”. The average home computer's hard drive is formatted with 8 sectors per cluster, or 4 KB. The cluster size can be changed when you reformat the drive.
What this means is that every file must take up at least one cluster. Even files that only have one byte in them will still take up the entire cluster, and files cannot share clusters. The space in a cluster not used by a file is known as “slack space”. For the example above, our file has 1 byte of data with 4,095 bytes of slack space. That ends up being 99.975% slack space. If our text file had exactly 4,096 characters in it, there would be 0% slack space. However, if we added one more character and saved it, the number of clusters needed would increase to two (8,192 bytes). With a file of 4,097 bytes but 8,192 bytes being used to save it to the drive, the file has 4,095 bytes, or 49.99%, slack space. Now, you shouldn't have to worry about slack space wasting vast portions of your hard drive. For instance, I downloaded a YouTube video to my computer and the file size comes to this:
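If you want to check the cluster arithmetic above yourself, here is a minimal Python sketch. It assumes the simple model described in this post (4,096-byte clusters, every file takes at least one whole cluster); the function names are just made up for illustration.

```python
CLUSTER_SIZE = 4096  # bytes; the 4 KB cluster size assumed throughout this post


def size_on_disk(file_size, cluster_size=CLUSTER_SIZE):
    """Bytes taken up on the drive when a file must occupy whole clusters."""
    # Round up to a whole number of clusters; even a tiny file gets one cluster.
    clusters = max(1, (file_size + cluster_size - 1) // cluster_size)
    return clusters * cluster_size


def slack_percent(file_size, cluster_size=CLUSTER_SIZE):
    """Percentage of the allocated space that holds no file data."""
    allocated = size_on_disk(file_size, cluster_size)
    return 100 * (allocated - file_size) / allocated


# The file sizes used as examples in this post.
for size in (1, 4096, 4097, 28_188_389):
    print(size, size_on_disk(size), f"{slack_percent(size):.3f}%")
```

Try it with a few sizes of your own: slack is worst for files just over a multiple of 4,096 bytes, and it stops mattering once a file is many clusters long.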
Size: 26.8 MB (28,188,389 bytes)
Size on disk: 26.8 MB (28,188,672 bytes)
In this case, the slack space comes to only 0.001%.
Yet we are not as fortunate with the Minecraft save files.
Go to “C:\Users\*****\AppData\Roaming\.minecraft\saves”, and right click on one of your world save folders. Go down to “Properties” and click on it. Notice the size of the folder. For example, my world5 folder says:
Size: 207 MB (217,511,973 bytes)
But at the line below it:
Size on disk: 303 MB (318,201,856 bytes)
That comes in at a staggering 31.64% slack space! How can this be? Let's look closer.
This world's save is made up of 72,615 files, each one about 2-3 KB. This means that each file has 1-2 KB of slack space. And some files are just over the 4,096-byte limit for fitting in one cluster. If a file has 4,100 bytes then it needs two clusters and would have 4,092 bytes (or about 4 KB) of slack space. So if each file wastes 1-4 KB of disk space, and there are between 10,000 and 100,000 (or more) files, then the amount of slack space can add up to many megabytes. In this case, almost 100. 100 MB may not seem like much, but it's drive space still worth recovering.
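To see how this adds up for one of your own worlds, a rough Python sketch like the one below walks a folder and totals the real file sizes against an estimated size on disk. It assumes 4,096-byte clusters and that every file takes at least one full cluster; the name `folder_slack` is hypothetical, and the file system may treat very small or compressed files differently, so don't expect an exact match with the “Size on disk” figure in the Properties window.

```python
import os

CLUSTER_SIZE = 4096  # assumed cluster size in bytes; your drive may differ


def folder_slack(root, cluster_size=CLUSTER_SIZE):
    """Return (file_count, total_size, estimated_size_on_disk) for a folder tree."""
    count = total = on_disk = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            size = os.path.getsize(os.path.join(dirpath, name))
            # Round each file up to whole clusters, with a minimum of one.
            clusters = max(1, (size + cluster_size - 1) // cluster_size)
            count += 1
            total += size
            on_disk += clusters * cluster_size
    return count, total, on_disk


def describe(root):
    """Build a one-line summary of the estimated slack space under root."""
    count, total, on_disk = folder_slack(root)
    slack = on_disk - total
    percent = 100 * slack / on_disk if on_disk else 0.0
    return f"{count} files, {total} bytes, ~{on_disk} bytes on disk, {percent:.2f}% slack"
```

To use it, call `print(describe(path))` with the path to one of your world folders under `.minecraft\saves`.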
Obviously, creating backups of your worlds is important. Or perhaps you want to transfer your world to a thumb drive. How would the current file format of Minecraft be a problem? Let's try a little experiment.
Right click on one of your world folders and go to “7-Zip”. If you don't have 7-Zip, install it. Put your world into a 7z archive. Set the compression level to “store”. This means that 7-Zip won't even try to compress the data, but will instead just put it into a .7z file. Since you put those thousands and thousands of little game files into one file (the .7z file), almost all the slack space is gone. Even with the compression level set to “store” we still reduced the overall size on disk by about one third! Anyway, not the point. Copy the new .7z file to your thumb drive (or copy and paste it to the same drive) and it will likely take only a few seconds. Try to do the same with the regular game files. All 90,000 (or however many you have) of them. Suddenly, it takes 30 minutes to an hour or more to transfer your world. Why? You could say it was because putting the world into a .7z file eliminated almost all the slack space, and that's why it transferred faster than the non-.7z version. Well, if a file is 3 KB then 1 KB of the cluster it is in is slack. When you transfer this file, your computer only has to move 3 KB. If your Minecraft world is 300 MB and 100 MB of it is slack, then the computer still only has to transfer 200 MB of actual data. So eliminating slack space with 7-Zip isn't the reason for the faster transfer, since your computer already ignores the slack space and just transfers the actual data. And besides, the .7z file transfers disproportionately faster than the regular game files.
The reason it's faster is that with the .7z file, your computer only has to move one file. If you instead try to copy and paste your regular world data files, your computer has to move 100,000 or more. Every time the computer goes to transfer a file, it has to look up the size of the file (number of clusters), then locate free clusters on the receiving drive, move the data for the file, and make note of the file in the receiving drive's file table. Doing this costs time. And if you are doing it 100,000 times then it can really add up. Whereas with the .7z file you put your world into, the computer only has to do this process once. Imagine a street with one stop light compared to a street with 50,000 stop lights.
So with the new file format that Mojang (Jens in particular, I believe) is working on, all the data for your Minecraft world will be in a few files, possibly one, instead of several thousand, making it much easier to copy and move your worlds. According to Jens, the new format will also be much faster in some aspects. Can't wait!
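The stop light idea can be put into rough numbers with a toy model: total copy time is a fixed cost per file plus the time to move the actual bytes. The Python sketch below is only an illustration. The 20 ms per-file overhead and 20 MB/s transfer speed are made-up assumptions, not measurements, and real drives will vary a lot.

```python
def estimate_copy_seconds(file_count, total_bytes,
                          per_file_overhead_s=0.02,           # assumed: 20 ms of bookkeeping per file
                          throughput_bytes_per_s=20_000_000):  # assumed: 20 MB/s raw transfer speed
    """Toy model: fixed overhead for every file, plus time to move the data itself."""
    return file_count * per_file_overhead_s + total_bytes / throughput_bytes_per_s


WORLD_BYTES = 217_511_973  # actual data in the example world from this post

many_files = estimate_copy_seconds(72_615, WORLD_BYTES)  # the world as loose files
one_file = estimate_copy_seconds(1, WORLD_BYTES)         # the same data in one archive

print(f"loose files: about {many_files / 60:.1f} minutes")
print(f"one archive: about {one_file:.1f} seconds")
```

The amount of data is identical in both cases; only the file count changes. Under these assumed numbers, the per-file overhead is nearly all of the copy time for the loose files.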
As nice as it's going to be, I think it's too limited in scope. There will still be hundreds if not thousands of files for any large world. While this will limit the slack space, it will still take an absurd amount of time to move or archive the world. As for chunk errors, they just became worse.
I'd really like to see an automated backup system. It would save the world in two different places, and if one chunk is unreadable it would check the backup before generating a new one. It would make saving take longer, but I think it would be worth it.
It has helped and it has also not helped my game performance issues. My framerates are marginally better, but not really. However, playing the game doesn't cause my computer to be as noisy as it would before. So that's nice. It's obviously easier on my computer, but the game doesn't want to run with better framerates anyway.