As you know, Java 9 is out! I can't wait to start optimizing a totally different garbage collector instead of CMS: the Garbage-First collector, or G1GC (it is the default in Java 9)!
Welcome to the Vanilla Optimizations thread! This thread is intended to provide optimizations for the community without mods, though it is recommended that Optifine be used with these optimizations for best results.
Introduction
Instead of just giving you a list of optimizations that do something you don't understand, I'll teach you a little bit about Java and Minecraft so that you can learn how to optimize the game yourself and know the reasons behind these modifications.
Note: before you start, do realize that these optimizations are for client only! These are not guaranteed to optimize server performance!
Audience
If you have a computer that can barely manage 30fps consistently and want to get a smooth 60fps, or you have a computer that gets a nice 60fps and that's it, but you want some more eye candy, or your computer happens to be a supercomputer and you just want the game to be the best version of itself, then this thread is for you!
About Java
Minecraft, as you know, runs on Java. Because of this, adequate understanding of the Java Virtual Machine is essential for optimization. This section will teach you about the JVM and the reasons why Minecraft may not be running so smoothly on your machine.
How Java Works
Before I can explain anything about the Java 8 HotSpot 64-bit Server VM, you first have to understand what the JVM actually does, and before that, what your computer does.
Your computer is made up of many modular hardware components that are the result of decades of research. Inside your computer, you have a Central Processing Unit (CPU) and Random Access Memory (RAM). The CPU is simply a logical processor that executes instructions represented in binary. Each CPU can execute a specific set of instructions, called its instruction set. Instruction sets range from a few dozen instructions on simple designs to well over a thousand on modern desktop CPUs. The instruction set varies from one CPU architecture to another rather than from one brand to another: Intel and AMD desktop CPUs both implement x86-64, while ARM (Advanced RISC Machines, RISC: reduced instruction set computer) is the architecture found in most phones and tablets.
A program is a file containing instructions for a CPU to execute. Programs are first written in a human-readable format by programmers, then compiled to machine code, the instructions for a CPU. Because the instruction set varies from one architecture to the next, and each operating system has its own conventions, programmers have to build a separate version of the same program for each target. This is a problem that is, in most cases, easy to overcome, but it has its consequences.
The solution to this problem is Java, or at least one of them, .NET being another. With Java, you develop for a single instruction set, an intermediary form of instructions called bytecode. The bytecode is the same for every JVM, so you can write your program once, compile it to bytecode, and it will run on any JVM. The JVM translates the bytecode into native instructions for your CPU and executes them.
The JVM starts out interpreting bytecode, but code that runs often is compiled to optimized native instructions by the JIT, or Just-In-Time, compiler. From there, that code basically runs like a native application that doesn't run on Java.
The Heap
Memory is split into different segments. One of these segments is called the Heap. This area specifically stores objects. Briefly, an object is a data structure: a block of memory holding a set of fields. Objects can hold references to other objects, and they can also drop those references. When an object becomes orphaned, that is, when it can no longer be reached through any chain of references from running code, the object is eligible for garbage collection since it won't be used any more. You know why Java appears so slow? This is the main reason why: Garbage Collection.
Garbage Collectors
Usually, when an application runs poorly, the program itself is the problem, and that can be bad news. In this case, doubly so, because you have two problems: Minecraft is poorly optimized, and doesn't even follow OOP in some cases, and on top of that, it runs on the JVM, whose GC is the main cause of lag spikes. I'll talk about Minecraft bottlenecks later, but generally speaking, once a programmer has exhausted all efforts to make the program efficient, the next thing to consider is the environment. That option is usually not available with native applications because it forces users of the program to modify their environment. However, the JVM is its own environment. In this case, I'm talking specifically about Garbage Collection.
There are a number of garbage collectors available with their own strengths and weaknesses, but the best one to use for Minecraft (which Oracle has considered making the default over the throughput collector) is the Concurrent Mark Sweep (CMS) collector. You can read about it at https://docs.oracle.com/javase/8/docs/technotes/guides/vm/gctuning/cms.html . In Java 9, the Garbage-First collector (G1GC) is the default.
Minecraft Bottlenecks
You may notice that Minecraft lags while chunks are generating. This is for multiple reasons. The first is that terrain generation is computationally expensive: it samples several layers of noise functions for every column of blocks in a chunk.
The second is that these chunks, with a size of width*height*depth=16*256*16, are very large. They are cached in memory for a time, and when more chunks need to be loaded, the cached chunks are written to disk. In fact, chunks are written to disk almost immediately. Writing to disk, as everyone knows, is a very expensive operation.
The third reason is that these two operations are executed synchronously in the main loop, which runs on the main thread, which means that you have to wait until the chunks are written to disk before calculations can continue. Thus, you get tremendous amounts of unnecessary lag. It would be more efficient to simply have a chunk queue on a separate thread that periodically saves chunks to disk from main memory in proportion to the rate at which chunks are generated (otherwise memory would fill up too quickly and an OutOfMemoryError would be thrown). In another thread, chunk generation could utilize a ForkJoinPool to generate chunks in a more efficient manner. When the generated chunks are ready, they could be put into the chunk cache and rendered.
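To make that idea concrete, here is a minimal sketch of the design I have in mind. Everything in it is hypothetical: the class, the Chunk type, and the method names are mine, not Minecraft's, and the "generator" and "saver" are placeholders. It only shows the shape: chunks are generated as tasks on a ForkJoinPool, and a separate thread drains a bounded queue of chunks to save.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch; none of these names come from Minecraft's code.
class ChunkPipelineSketch {
    /** Stand-in for a chunk: 16*256*16 block IDs. */
    static final class Chunk {
        final int x, z;
        final int[] blocks = new int[16 * 256 * 16];
        Chunk(int x, int z) { this.x = x; this.z = z; }
    }

    /** Placeholder generator; real terrain generation does far more work. */
    static Chunk generate(int x, int z) {
        Chunk chunk = new Chunk(x, z);
        Arrays.fill(chunk.blocks, 1);
        return chunk;
    }

    /** Generates a size*size square of chunks as parallel tasks on the pool. */
    static List<Chunk> generateArea(ForkJoinPool pool, int size) {
        List<ForkJoinTask<Chunk>> tasks = new ArrayList<>();
        for (int i = 0; i < size * size; i++) {
            final int x = i % size, z = i / size;
            Callable<Chunk> job = () -> generate(x, z);
            tasks.add(pool.submit(job));
        }
        List<Chunk> chunks = new ArrayList<>();
        for (ForkJoinTask<Chunk> task : tasks) {
            chunks.add(task.join()); // waits for that chunk to finish generating
        }
        return chunks;
    }

    /** Hands chunks to a background saver thread; returns how many it saved. */
    static int saveInBackground(List<Chunk> chunks) {
        // Bounded queue: if saving falls behind, put() blocks instead of filling the heap.
        BlockingQueue<Chunk> queue = new LinkedBlockingQueue<>(64);
        Chunk endMarker = new Chunk(Integer.MIN_VALUE, Integer.MIN_VALUE);
        int[] saved = {0};
        Thread saver = new Thread(() -> {
            try {
                for (Chunk c = queue.take(); c != endMarker; c = queue.take()) {
                    saved[0]++; // a real saver would write the chunk to disk here
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "chunk-saver");
        saver.start();
        try {
            for (Chunk c : chunks) queue.put(c);
            queue.put(endMarker);
            saver.join(); // join() makes the saver's count visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while saving", e);
        }
        return saved[0];
    }

    public static void main(String[] args) {
        ForkJoinPool pool = new ForkJoinPool();
        List<Chunk> chunks = generateArea(pool, 4);
        System.out.println("generated " + chunks.size() + ", saved " + saveInBackground(chunks));
        pool.shutdown();
    }
}
```

In this sketch the caller waits for the saver to finish so the demo can print a count. A game loop would keep the saver thread running and only ever call put(), which is the whole point: the main thread never waits on the disk unless the queue is full.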
Particles are particularly slow to render for whatever reason. I suspect it has to do with transparency, since fancy leaves are very slow to render as well, and lighting is actually implemented in software by the actual Minecraft engine. These bottlenecks can only be overcome with faster hardware, or with mods (so consider installing Optifine, I recommend it).
Another problem is that Minecraft actually runs on the old fixed-function pipeline in OpenGL instead of the core OpenGL profile (version 3.2+) with modern functionality such as GLSL shaders. Shaders are faster and more flexible than the fixed-function pipeline, which dates back to OpenGL 1.x, and on modern GPUs the fixed-function pipeline is itself emulated with shaders by the driver.
Those are the primary bottlenecks. There are other, smaller ones, but they are either compensated for by the JIT compiler or negligible.
Optimizations
Fortunately, Minecraft is built on the JVM, and the Garbage Collector is fully configurable. It is the largest bottleneck, and it is very hairy to tune. This section will provide some JVM arguments that can be specified at initialization in the JVM Options area in the Minecraft Launcher.
-server
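Recommended. This runs the JVM in server mode. 64-bit JVMs only ship the server VM and select it by default, so there the flag is harmless but redundant; it only changes behaviour on 32-bit JVMs that include both the client and server VMs.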
-Xverify:none
Improves startup time. At JVM startup, when the JAR files are loaded, the class files are verified for security reasons, and this adds to the time it takes to start the application. This option turns that bytecode verification off, which noticeably reduces startup time at the cost of skipping the safety check.
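-da -dsa
"da" and "dsa" disable assertions and system assertions, respectively. Assertions are runtime checks written with Java's assert statement; system assertions are the ones inside the Java API's own classes. The JVM already leaves assertions disabled unless they are switched on with -ea or -esa, so these flags only matter if something else has enabled them. In that case, disabling assertions is a minor optimization since most assertions are not very expensive, but it may add a few extra fps.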
-Xms210828288
This option sets the minimum Heap size to 210,828,288 bytes (210.8MB). Minecraft does not need much memory to run at all. In fact, it can run smoothly with a max heap of only 512MB. This value is calculated based on the maximum vanilla chunk rendering distance at a radius of 16 chunks. A block is stored in a chunk by its numerical ID, which is an integer (32-bit value), which is 4 bytes. A vanilla chunk is width*height*depth=16*256*16=65,536 blocks. 4 bytes * 65,536 blocks = 262,144 bytes (262.144 kilobytes). To find the number of bytes used for a radius of 16 chunks, we use PI*r^2 (the area of a circle). 262,144 * (PI * 16^2) = ~210,828,714.13 bytes. The value of -Xms must be a multiple of 1024, so 210,828,714.13 / 1024 = ~205,887.42. We round this value down to 205,887 and multiply by 1024 to get 210,828,288 bytes for a radius of 16 chunks.
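If you want to redo this arithmetic for a different render distance, a small helper like the one below does it. The class and method names are mine, and it keeps the same assumptions as the calculation above (4 bytes per block, a circular area of loaded chunks, rounding down to a multiple of 1024); it is an estimate, not a measurement of what the game really allocates.

```java
// Hypothetical helper; it reproduces the estimate above, nothing more.
class MinHeapEstimate {
    static final long BYTES_PER_BLOCK = 4;                  // assumes a 32-bit block ID
    static final long BLOCKS_PER_CHUNK = 16L * 256 * 16;    // 65,536 blocks
    static final long BYTES_PER_CHUNK = BYTES_PER_BLOCK * BLOCKS_PER_CHUNK; // 262,144 bytes

    /** Bytes for a circle of chunks with the given radius, rounded down to a multiple of 1024. */
    static long xmsBytes(int radiusChunks) {
        double raw = BYTES_PER_CHUNK * Math.PI * radiusChunks * radiusChunks;
        return ((long) raw / 1024) * 1024;
    }

    public static void main(String[] args) {
        int radius = args.length > 0 ? Integer.parseInt(args[0]) : 16;
        System.out.println("-Xms" + xmsBytes(radius));
    }
}
```

Run without arguments it uses a radius of 16 chunks; pass a different radius as the first argument to get the matching value.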
The Chunk Cache
Note: you can also add a certain number of bytes to this number. The game itself has a chunk cache, and the maximum size of this cache depends on the size of the heap. You only need to work out the number of bytes for however many chunks you want to cache in the heap, using the number of bytes in a chunk, and add that to 210,828,288 bytes. That means you can use the following function to calculate the number of bytes for Xms:
210828288 + ( number of chunks to cache * 262144 )
For the chunk cache, I personally would have at least four times as many chunks as the square of the rendering distance. If you keep your distance at 8 chunks, that means your chunk cache would be 4 * 8^2 = 256 chunks, which is 67,108,864 bytes. For 16 chunks, that's 1,024 chunks, or 268,435,456 bytes! This additional value on top of 210,828,288 bytes can definitely affect performance. If the chunk cache is too small, chunks will be generated more often, which is an expensive operation. If the chunk cache is too big, every periodic GC may have a larger and larger lag spike in proportion to the heap size. It also depends on how many chunks you tend to explore in one session of Minecraft.
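The chunk cache formula can be written out the same way. This is only a sketch of my own rule of thumb (four times the square of the render distance); the names are made up for the example, and the base value is the 210,828,288 bytes worked out earlier.

```java
// Hypothetical helper for the chunk cache rule of thumb described above.
class ChunkCacheEstimate {
    static final long BASE_XMS_BYTES = 210_828_288L; // base value for a 16-chunk radius
    static final long BYTES_PER_CHUNK = 262_144L;    // 16*256*16 blocks, 4 bytes each

    /** Rule of thumb: cache four times the square of the render distance. */
    static long cachedChunks(int renderDistance) {
        return 4L * renderDistance * renderDistance;
    }

    /** Heap bytes needed to hold that many cached chunks. */
    static long cacheBytes(int renderDistance) {
        return cachedChunks(renderDistance) * BYTES_PER_CHUNK;
    }

    /** The base value plus the chunk cache, i.e. the formula above. */
    static long xmsWithCache(int renderDistance) {
        return BASE_XMS_BYTES + cacheBytes(renderDistance);
    }

    public static void main(String[] args) {
        for (int distance : new int[] {8, 16}) {
            System.out.println("distance " + distance + ": cache " + cacheBytes(distance)
                    + " bytes, -Xms" + xmsWithCache(distance));
        }
    }
}
```

Because both terms are multiples of 1024, the sum can be passed to -Xms as is.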
-Xmx1024m
This option sets the maximum Heap size to 1024 MB. Minecraft has 16x16 textures, but the textures are only stored in graphics memory, so those aren't important. However, for whatever reason, if you open debug (F3), you'll notice that the Heap fills quickly, especially while you move. If you make the Heap too large, this can affect GC performance by making GC passes take longer, thus introducing unnecessary latency. The time one GC pass takes grows roughly with the size of the Heap, so a smaller heap is better, and that goes for any machine: just because you have 16GB of RAM doesn't mean you should allocate 8192MB for Minecraft to use. I suggest not going higher than 4096MB.
-XX:ReservedCodeCacheSize=512m
Sets the code cache size to 512MB for compiled methods. No more than 512MB are required for Minecraft. This goes even for mods, but if you have a lot of mods, increasing this value to 768m may compensate for the extra methods added by mods. Lower values may decrease performance since the JIT compiler may run out of memory for compiled methods, and those methods will remain interpreted because they didn't fit in the compiled code cache. The absolute limit is 2GB.
-XX:+AggressiveOpts
Enables aggressive optimizations that are expected to become final in a future release. The performance increase is very noticeable.
-XX:+DisableExplicitGC
This option makes the JVM ignore calls to System.gc(). Mojang places that call in the game loop, so GC cycles are executed after each frame to keep memory usage low. This is a latency bottleneck, however. On my machine, this option adds 10-12fps, bringing me to ~42fps on average without input lag.
-XX:+UseConcMarkSweepGC
This option enables the CMS garbage collector. This is a better collector for low-latency requirements, especially realtime applications. This adds 2-5fps on its own, so now I would have 44-47fps on average. To learn what the CMS collector does, see https://docs.oracle.com/javase/8/docs/technotes/guides/vm/gctuning/cms.html .
-XX:+CMSScavengeBeforeRemark
This option enables scavenging (collection) of the young generation before the remark stage of the CMS collector. Remark is the second and final marking stage of a CMS cycle, and it runs as a stop-the-world pause before the concurrent sweep. During marking, CMS traces the objects that are still reachable; whatever is left unmarked is garbage that the sweep reclaims. Scavenging first leaves fewer young objects to scan during remark, which shortens that pause.
-XX:+CMSParallelRemarkEnabled
This option enables parallel remark, so that the final marking stage of the CMS collector uses multiple GC threads. This reduces the remark pause, and the benefit scales with the number of cores available to the JVM.
-XX:+AlwaysPreTouch
This option makes the JVM touch every page of the heap while it starts up. Your system uses a paged memory model, which splits memory into fixed-size "pages" (typically 4KB on x86 systems). When an application asks for memory, the operating system usually only reserves the address range and hands over real pages the first time each one is touched, which takes some time. Without this option, that cost is paid during gameplay as the heap fills up for the first time. With it, the JVM touches all heap pages at initialization, so startup is slower but those page faults don't happen later. The gain is negligible (less than 0.1fps), but depending on the system and how large the heap is, it could save more.
-Xnoclassgc
Disables garbage collection of classes. This shortens GC latency and adds maybe 1-2fps, or even more depending on how many mods are installed. Generally, a class is loaded only when its functionality is needed, so OutOfMemoryErrors aren't likely to be a problem. Even if something is only used once or twice, the size of a class is negligible, so preventing GC of classes is usually beneficial.
-XX:-CMSClassUnloadingEnabled
This option disables unloading of classes with the CMS collector, which saves GC time.
-XX:CMSInitiatingOccupancyFraction=90
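Sets the initiating occupancy percent of the old generation to 90%. This is one criterion for the CMS collector to start a CMS collection cycle (a concurrent collection of the old generation, not a stop-the-world full GC). When the old generation reaches 90% occupancy, a CMS collection cycle will begin. This is one of many triggers for the CMS collector; unless -XX:+UseCMSInitiatingOccupancyOnly is also set, the JVM treats this value as a starting point and adjusts the trigger from its own runtime statistics.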
-XX:InitiatingHeapOccupancyPercent=90
WARNING: THIS VALUE MUST NOT CHANGE. Sets the percentage of the entire heap that must be occupied before a concurrent marking cycle starts to 90%. Note that HotSpot documents this flag for the G1 collector; under CMS the old-generation trigger is CMSInitiatingOccupancyFraction, so this option mainly matters if you switch to G1. I use it in combination with MinHeapFreeRatio and MaxHeapFreeRatio, and I have found 90% to be a good balance for reducing lag spikes and stutters, since a major collection only begins when the total heap occupancy has reached 90% of capacity.
-XX:MaxHeapFreeRatio=87
WARNING: THIS VALUE MUST NOT CHANGE. Sets the maximum percent of the entire heap that is allowed to be free after a GC cycle. If the amount of free memory after a GC cycle is greater than this value, the heap will be shrunk. This value, along with MinHeapFreeRatio and MaxGCPauseMillis, makes the heap tend to expand only when necessary. Expansions are more likely to occur after long sessions of Minecraft, and during initialization. These values also affect the minor GC latency.
-XX:MinHeapFreeRatio=13
WARNING: THIS VALUE MUST NOT CHANGE. Sets the minimum percent of the entire heap that is allowed to be free after a GC cycle. If the amount of free memory after a GC cycle is smaller than this value, the heap will expand. This value, along with MaxHeapFreeRatio and MaxGCPauseMillis, makes the heap tend to expand only when necessary. Expansions are more likely to occur after long sessions of Minecraft, and during initialization. These values also affect the minor GC latency.
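The rule that MinHeapFreeRatio and MaxHeapFreeRatio describe can be sketched as a simple decision. This is a simplified model written for illustration, with names I made up; the real JVM also respects Xms and Xmx and has collector-specific details that are not modelled here.

```java
// Simplified, hypothetical model of the heap resize rule; not HotSpot's actual code.
class HeapResizeRule {
    enum Action { EXPAND, SHRINK, KEEP }

    /** Decides what happens after a GC cycle, given free and total heap bytes. */
    static Action decide(long freeBytes, long capacityBytes, int minFreePercent, int maxFreePercent) {
        double freePercent = 100.0 * freeBytes / capacityBytes;
        if (freePercent < minFreePercent) return Action.EXPAND; // too little free space left
        if (freePercent > maxFreePercent) return Action.SHRINK; // too much of the heap is idle
        return Action.KEEP;
    }

    public static void main(String[] args) {
        long capacity = 512L * 1024 * 1024;
        // With 13 and 87, the heap is only resized at the extremes.
        System.out.println(decide(capacity / 20, capacity, 13, 87));     // 5% free
        System.out.println(decide(capacity / 2, capacity, 13, 87));      // 50% free
        System.out.println(decide(capacity * 9 / 10, capacity, 13, 87)); // 90% free
    }
}
```

With the wide 13/87 window used in this thread, most GC cycles end in KEEP, which is what keeps resizing rare.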
-XX:MaxGCPauseMillis=1
WARNING: THIS VALUE MUST NOT CHANGE. Sets the target maximum allowed time for GC. This is a soft goal, and the JVM makes a best effort to reach this goal. Along with MinHeapFreeRatio, MaxHeapFreeRatio, and InitiatingHeapOccupancyPercent, this option minimizes GC lag spikes (stutters). On my machine, the average GC latency is 3-13 ms with all of these parameters. If you have more CPU cores or a faster CPU, the latency will be lower. I only have two cores at a max clock speed of 2200MHz (2.2GHz), 1400MHz on my newer laptop.
-XX:NewRatio=3
WARNING: THIS VALUE MUST NOT CHANGE. Sets the size ratio of the young generation to the old generation to 1:3. This means that, as the heap shrinks and grows, the young generation will be 1/4 of the entire heap and the old generation the other 3/4. Out of all the options specified here related to GC and heap ratios, this option has the largest impact on GC latency, and thus directly affects frame rate and GC-induced stutters.
-XX:SurvivorRatio -EXPERIMENTAL
WARNING: DO NOT ENABLE YET UNLESS YOU KNOW WHAT YOU ARE DOING. Sets the ratio between the Eden space and each of the two survivor spaces in the young generation. For example, a value of 6 gives a 6:1 ratio of Eden to one survivor space. That means each survivor space would be 1/8 of the total young generation size, and Eden the remaining 6/8.
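To see what NewRatio and SurvivorRatio mean in bytes, here is a small worked example. The helper names are mine, and the numbers are nominal: the real JVM rounds each space to its own alignment, so actual sizes can differ slightly.

```java
// Hypothetical helper: nominal generation sizes for a given heap size.
class GenerationSizes {
    /** NewRatio=N means young:old is 1:N, so young is heap/(N+1). */
    static long youngBytes(long heapBytes, int newRatio) {
        return heapBytes / (newRatio + 1);
    }

    static long oldBytes(long heapBytes, int newRatio) {
        return heapBytes - youngBytes(heapBytes, newRatio);
    }

    /** SurvivorRatio=N means eden:survivor is N:1, with two survivor spaces. */
    static long survivorBytes(long youngBytes, int survivorRatio) {
        return youngBytes / (survivorRatio + 2);
    }

    static long edenBytes(long youngBytes, int survivorRatio) {
        return youngBytes - 2 * survivorBytes(youngBytes, survivorRatio);
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024;
        long heap = 1024 * mb;              // matches -Xmx1024m
        long young = youngBytes(heap, 3);   // -XX:NewRatio=3
        System.out.println("young    " + young / mb + " MB");
        System.out.println("old      " + oldBytes(heap, 3) / mb + " MB");
        System.out.println("eden     " + edenBytes(young, 6) / mb + " MB");
        System.out.println("survivor " + survivorBytes(young, 6) / mb + " MB (each)");
    }
}
```

For a fully grown 1024MB heap this works out to a 256MB young generation and a 768MB old generation, and with a SurvivorRatio of 6 the young generation splits into a 192MB Eden and two 32MB survivor spaces.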
NOTES:
- options that I have labeled with warnings indicating the value should not be changed, should not be changed.
- options that I have labeled as experimental should not be used unless you are an expert on the matter. These options are being experimented with for their potential performance improvements.
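If you want to check which options a JVM actually received, the standard java.lang.management API can report them. The class below is a small sketch of my own (the class and method names are made up); it prints the arguments of the JVM it runs in, so to test a set of flags you start it with those same flags rather than expecting it to inspect a running game.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical checker: prints what the current JVM was started with.
class JvmFlagCheck {
    /** The JVM arguments (-X and -XX options and so on) passed at startup. */
    static List<String> inputArguments() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    /** The most heap this JVM will try to use, as reported by the runtime. */
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    public static void main(String[] args) {
        for (String argument : inputArguments()) {
            System.out.println("arg: " + argument);
        }
        System.out.println("max heap: " + maxHeapBytes() / (1024 * 1024) + " MB");
        // One bean per active collector; the names depend on which GC is selected.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println("collector: " + gc.getName()
                    + ", collections: " + gc.getCollectionCount()
                    + ", total time: " + gc.getCollectionTime() + " ms");
        }
    }
}
```

Compile it and run it with the same options you put in the launcher; a misspelled -XX option normally stops the JVM at startup with an error, so you find typos right away.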
System Optimizations
- use a faster persistent storage device: as mentioned before, Minecraft relies heavily on disk to save and load chunks. This is one bottleneck that can be lessened or removed by using a faster drive. Consider an SSD, or at least a fast hard drive, and if the drive is external, connect it over USB 3.1. You could even set up a temporary drive that uses a portion of your RAM if you have the right software.
- on Windows, open the "Performance Options" dialog by searching for it or googling how to open it. A window with many desktop visual effect checkboxes should come up. Modify that as you like and exit. With effects disabled, Windows will not render certain aesthetics, which speeds up your system in general. I recommend keeping "compositing" enabled, or vsync will not be available.
- on Windows 7, consider switching to a non-Aero theme or disabling Aero glass to make window borders opaque. It is an aesthetic that is more demanding than you think.
- on Windows, open the "Performance Options" dialogue and click into the "Advanced" tab. Adjust the first setting for best performance of "Background services". In turn, the Windows operating system will be prioritized. As backwards as that sounds, it means the OS will stall foreground programs less when the OS decides to do something, and reduces the cost of system signals from Minecraft which means better input latency from mice, and other peripherals, while also improving desktop (explorer.exe) and application responsiveness (fewer "Not responding" problems).
- on Windows, consider deleting the C: drive and installing Arch Linux to get a more efficient OS with better latency for foreground applications and full configurability of your system from scratch. Or just dual boot Linux.
- on Windows, consider reverse engineering kernel32.dll, user32.dll, and other Windows components, and hack Windows to make Windows run, you know, efficiently? By switching to GCD as the primary threading model and a UNIX-based networking model, then having Windows distribute across multiple cores to improve OS throughput and latency.
- on Linux, consider using a lighter Desktop Environment, or move to just a lightweight Window Manager to improve performance.
- on Linux, consider faster kernels such as zen (faster) or Liquorix (fastest)
- on Linux, consider KMS and Realtime priority support to improve audio latency
- on Linux, consider making a more efficient kernel than all the rest by developing a minimalist modular exokernel written in D.
- on any OS, consider making an OS specifically for running Minecraft
Note: some of the above suggestions are actually sane.
Authority: If there is any reason you should question whatever I put in this thread or reply to, you should know that I have been programming in Java since 2010. If there seems to be an error, please tell me; it is a logical fallacy to appeal to my experience and expect me to be perfect and take everything I say as correct. As of writing, I have 7 years of Java experience.
I ultimately skipped through most of that, I got a bit lost in the second category. The information my feeble mind was able to process seemed well said though!
I recommend adding spoilers for each category to clean it up a bit. Just seeing the length of this post made it even harder to follow.
Nice job though, I can tell a lot of time and effort went into making this collection of optimizations!
-Xms210828288: isn't -Xms211MB better? Also, if you crash with an out of memory error when this is set, increase it (although to avoid the GC having to run too much in the heap I would leave it at 256M or 512M)
Some of them are already in the game
-server would make it slow at the start while it warms up
I will rewrite the thread to improve the structure whenever I can. I admit I had a better version of this article before, but Chrome didn't save it on this iPhone. This one was a bit more hasty, and properly introducing/teaching concepts that affect performance to players who probably barely know all the keyboard shortcuts to their OS nor touched a lick of code is a bit challenging but the knowledge that people receive from this on-going optimization thread for Minecraft is invaluable. Instead of me just giving a list of JVM arguments that you have no clue about what any of it does or means, I'll teach you a bit about GC, objects, the JVM, and those arguments so you can learn how to optimize Minecraft for yourself.
Some words I use are a bit of esoteric computer jargon, but I'll do my best to explain everything without sounding too verbose.
Rounding ~210.8MB up to 211MB makes little difference, but if you want to spare every byte that you can, an exact value is good.
In my experience, the JVM will not throw OutOfMemoryError for Minecraft unless Xmx is less than 512M. Xms is the minimum heap size, not the maximum. The heap shrinks and grows between the two as memory is needed.
An option does not add overhead, and the only difference between -client and -server is how the JVM handles memory and JIT compilation. The server VM also has Tiered compilation. The server VM is needed for some features. Adding this option does not increase startup time. Generally speaking, the server VM is best to use for Minecraft. In fact, the World is a server in itself. Optimization of Minecraft is entirely centered around memory management, so GC is one of the biggest performance impacts. The other thing is how quickly Minecraft creates new objects. I will measure this and adjust the JVM arguments as necessary.
Mods: Custom modpack based on FTB Beyond but with 121 mods instead of ~191. (No biome mods).
OS: Windows 8
CPU: I7 2600k OC'ed to 4,4 ghz
Memory: 16 gb DDR3 1333 mhz
Gfx: AMD HD 6950
Disk: Crucial M4 128 gb
With lower video settings without vsync, the BetterFPS mod using the algorithm optimized for my hardware, javaw.exe set to high priority, and these Java arguments, I went from an average of ~25 fps in a new world without a single block placed by me to an average of ~150 fps, peaking at 300 fps.
I really thought something was wrong with my computer, but it turns out that Minecraft must be optimized to yield the best FPS, and that vsync in Minecraft is useless unless you have screen tearing, of course.
Java arguments that didn't work together with the arguments list above
Now - I'm currently building my own dedicated machine for hosting the modpack mentioned above.
Would all the same arguments have a benefit server side? Which arguments can you recommend for optimizing the server?
I will be running it on Windows 10. I know Linux is better, but if I were to learn that as well, I wouldn't be hosting the server before 2019.
I'll probably use some kind of server management software as well.
These arguments are generally client-side. A server handles more than just one player at a time, so memory management is different, especially since you usually have different and more powerful hardware.
If you would like some tips for how to handle GC on your server, I suggest going with the parallel (throughput) collector. You'll have to look into it for yourself, however. Tuning it is a whole different ball game on servers.
If you are making a server, you need to make memory and network latency your priorities. Minecraft is a memory-heavy game, and it needs low network latency for smooth play. First, optimize the local environment as best you can (memory, hardware, and running only what is necessary for your Minecraft server). Next, optimize for network latency. There are probably plenty of registry hacks out there, but look up how to disable something called "nagling" (Nagle's algorithm): it is a feature on your system that combines small packets into one packet to increase throughput at the cost of latency. I don't know whether Windows 10 servers already disable it or not.
If you want to use any of the arguments listed above, I suggest using the ones unrelated to memory management, as tuning that for your server is going to be the job of an expert, but for now at least pick the parallel (throughput) garbage collector for starters. For example, you can use AggressiveOpts (and definitely use AggressiveHeap, see JVM arguments list here), but don't use GC tuning parameters specifically for the CMS collector. You can, however, keep the ReservedCodeCacheSize at 768m because you are using a modpack and that doesn't change during runtime.
Besides that, there isn't much else to do besides become a Java expert and learn how to tune it yourself, but that is why I made this thread: it teaches you enough to know the JVM and GC to optimize Minecraft on your own, and if you are optimizing a server, that really depends on your own commitment and effort. Technically I'd say that a server is well off for running the Minecraft server itself, but really the biggest and simplest optimization anyone can do is just keep Java up to date on the latest version. Oracle always has some new optimizations in each minor release. But not Minecraft. Those guys think they know what they're doing, but in reality they seem to be ignoring what the heck the Minecraft Snooper anonymous data statistics are for. They probably make Minecraft for middle- to high-end hardware because that is what they make the game on! In fact, the recommended specs on the back of Minecraft account cards in the store are probably Markus Persson's computer specs!!!
Coming Soon: Java 9 Optimizations!
As you know, java 9 came out! I can't wait to start optimizing a totally different garbage collector instead of the CMS: the garbage first collector, or G1GC (it will be default in Java 9)!
Welcome to the Vanilla Optimizations thread! This thread is intended to provide optimizations for the community without mods, though it is recommended that Optifine be used with these optimizations for best results.
Introduction
Instead of just giving you a list of optimizations that do something you don't understand, I'll teach you a little bit about Java and Minecraft so you that you can learn how to optimize the game yourself and know the reasons behind these modifications.
Note: before you start, do realize that these optimizations are for client only! These are not guaranteed to optimize server performance!
Audience
If you have a computer that can barely manage 30fps consistently and want to get a smooth 60fps, or you have a computer that gets a nice 60fps and that's it, but you want some more eye candy, or your computer happens to be a supercomputer and you just want the game to be the best version of itself, then this thread is for you!
About Java
Minecraft, as you know, runs on Java. Because of this, adequate understanding of the Java Virtual Machine is essential for optimization. This section will teach you about the JVM and the reasons why Minecraft may not be running so smoothly on your machine.
How Java Works
Before I can explain anything about the Java 8 HotSpot 64-bit Server VM, you first have to understand what the JVM actually does; before that, your computer.
Your computer is made up of many modular hardware components that are the result of decades of research. Inside your computer, you have a Central Processing Unit (CPU), and Random Access Memory (RAM). The CPU is simply a logical processor that interprets instructions represented in binary. Each CPU can execute a specific set of instructions, and this is called an instruction set, and these can have as few as 10 instructions, to over 100 instructions for a single CPU. The instruction set of a CPU varies from one brand to another. The most widely known are AMD and Intel. A more uncommon one is ARM (Advanced RISC Machines, RISC: reduced instruction set computer).
A program is a file containing instructions for a CPU to execute. These programs first have to be written in a human-readable format by programmers, then compiled to machine code, the instructions for a CPU. As mentioned above, the instruction set varies from one brand of CPU to the next, so programmers have to make multiple versions of the same program for each instruction set. This is a problem that is, in most cases, easy to overcome, but it has its consequences.
The solution to this problem is Java, or at least one of them, .NET being the other. With Java, you develop for a single instruction set which is an intermediary form of instructions called bytecode. The bytecode for Java is the same for every JVM, so you can write your program once and compile to this bytecode and it will run on any JVM. The JVM interprets bytecode to the native instructions for your CPU and executes it.
In the JVM, the bytecode is then optimized to native instructions rather than interpreted. It is optimized by the JIT or Just-In-Time compiler. From there, it basically runs like a native application that doesn't run on Java.
The Heap
Memory is split into different segments. One of these segments is called the Heap. This area specifically stores objects. Briefly, an object is an abstract data structure; semantically, just an array of data. Objects can reference other objects, and they can also dereference objects. When an object becomes orphaned, that is, all references to the object no longer exist, the object is available for garbage collection since the object won't be used any more. You know why Java appears so slow? This is the whole reason why: Garbage Collection.
Garbage Collectors
As mentioned before, usually the program is the problem. If the program is the problem, that can usually be bad news. In this case, doubly so, because you have two problems: Minecraft is not optimized at all, and doesn't even follow OOP at all in some cases, and on top of that, it's on the JVM using GC which is the main cause of lag spikes. I'll talk about Minecraft bottlenecks later, but generally speaking, when a programmer has exhausted all efforts to make sure the program is efficient, then you can consider the environment. This latter option is not available with native applications because that then forces users of the program to modify their environment. However, the JVM is its own environment. In this case, I'm talking specifically about Garbage Collection.
There are a number of garbage collectors available with their own strengths and weaknesses, but the best one to use for Minecraft (which Oracle has considered making the default over the throughput collector), is the Concurrent Mark Sweep (CMS) collector. You can read about it here. In Java 9, the Garbage First Collector (G1GC) will be made default.
Minecraft Bottlenecks
You may notice that Minecraft lags while chunks are generating. This is for multiple reasons. The first is that terrain generation uses a very intense polynomial equation to generate terrain.
The second is that each chunk is very large (width*height*depth = 16*256*16 blocks). Chunks are cached in memory for a time, and when more chunks need to be loaded, the cached chunks are written to disk. In fact, chunks are written to disk almost immediately. Writing to disk, as everyone knows, is a very expensive operation.
The third reason is that these two operations are executed synchronously in the main loop, which runs on the main thread, which means that you have to wait until the chunks are written to disk before calculations can continue. Thus, you get tremendous amounts of unnecessary lag. It would be more efficient to simply have a chunk queue on a separate thread that periodically saves chunks to disk from main memory in proportion to the rate at which chunks are generated (otherwise memory would fill up too quickly and an OutOfMemoryError would be thrown). In another thread, chunk generation could utilize a ForkJoinPool to generate chunks in a more efficient manner. When the generated chunks are ready, they could be put into the chunk cache and rendered.
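The design suggested above could look roughly like the sketch below. This is not Minecraft's code: every name here (ChunkPipelineSketch, generateChunk, saveAll) is hypothetical, the "terrain" is a dummy fill, and the "disk write" is just a counter. It only shows the shape of the idea: chunk generation is handed to a ForkJoinPool, and saving happens on its own thread behind a bounded queue so memory cannot fill up faster than chunks are saved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

final class ChunkPipelineSketch {
    static final int BLOCKS_PER_CHUNK = 16 * 256 * 16;

    /** Hypothetical stand-in for terrain generation: fills the chunk with one block ID. */
    static int[] generateChunk(int x, int z) {
        int[] blocks = new int[BLOCKS_PER_CHUNK];
        Arrays.fill(blocks, Math.floorMod(31 * x + z, 4));
        return blocks;
    }

    /** Generates a size*size square of chunks as parallel tasks on the given pool. */
    static List<int[]> generateArea(ForkJoinPool pool, int size) {
        List<Callable<int[]>> tasks = new ArrayList<>();
        for (int x = 0; x < size; x++) {
            for (int z = 0; z < size; z++) {
                final int cx = x, cz = z;
                tasks.add(() -> generateChunk(cx, cz));
            }
        }
        List<int[]> chunks = new ArrayList<>();
        try {
            for (Future<int[]> f : pool.invokeAll(tasks)) {
                chunks.add(f.get());
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("chunk generation failed", e);
        }
        return chunks;
    }

    /** Hands chunks to a dedicated saver thread and returns how many it "saved". */
    static int saveAll(List<int[]> chunks) {
        // Bounded queue: if the saver falls behind, put() blocks instead of filling the heap.
        BlockingQueue<int[]> saveQueue = new LinkedBlockingQueue<>(64);
        int[] poison = new int[0];                  // sentinel that tells the saver to stop
        AtomicInteger saved = new AtomicInteger();

        Thread saver = new Thread(() -> {
            try {
                for (int[] c = saveQueue.take(); c != poison; c = saveQueue.take()) {
                    saved.incrementAndGet();        // real code would write the chunk to disk here
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "chunk-saver");
        saver.start();

        try {
            for (int[] c : chunks) {
                saveQueue.put(c);                   // the caller only enqueues; it never touches the disk
            }
            saveQueue.put(poison);
            saver.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException("interrupted while saving", e);
        }
        return saved.get();
    }

    public static void main(String[] args) {
        ForkJoinPool pool = new ForkJoinPool();
        List<int[]> chunks = generateArea(pool, 4);
        pool.shutdown();
        System.out.println("generated " + chunks.size() + ", saved " + saveAll(chunks));
    }
}
```

The bounded queue is the part that covers the "in proportion to the rate at which chunks are generated" requirement: generation is throttled automatically whenever saving cannot keep up.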
Particles are particularly slow to render for whatever reason. I suspect it has to do with transparency, since fancy leaves are very slow to render as well, and lighting is actually implemented in software by the actual Minecraft engine. These bottlenecks can only be overcome with faster hardware, or with mods (so consider installing Optifine, I recommend it).
Another problem is that Minecraft actually runs on the old fixed function pipeline in OpenGL instead of the core OpenGL profile (Version 3.1+) with modern functionality such as GLSL Shaders. Shaders are faster and more flexible than using the Fixed Function Pipeline of OpenGL which is from version 1.1, and the fixed function pipeline itself is implemented in shaders.
Those are the main bottlenecks. There are other, smaller ones, but they are either compensated for by the JIT compiler anyway or negligible.
Optimizations
Fortunately, Minecraft is built on the JVM, and the Garbage Collector is fully configurable. It is the largest bottleneck, and it is very hairy to tune. This section will provide some JVM arguments that can be specified at initialization in the JVM Options area in the Minecraft Launcher.
-server
required. This runs the JVM in server mode. 64-bit JVMs always use the server VM, so there the flag only makes the choice explicit.
-Xverify:none
improves startup time. At JVM startup, when the JAR files are loaded, the class files are normally verified for security reasons, which adds noticeably to the time it takes to start the application. This option turns that bytecode verification off, so the game starts faster, at the cost of skipping a safety check on the loaded classes.
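-da -dsa
"da" and "dsa" disable assertions and system assertions, respectively. Assertions are sanity checks that a programmer writes with the assert keyword; system assertions are the same kind of checks inside the Java API. Java already leaves assertions disabled unless they are explicitly enabled, so these flags mostly make sure nothing has turned them on. Disabling assertions is a negligible optimization since most assertions are not very expensive, but it may add a few extra fps.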
-Xms210828288
This option sets the minimum Heap size to 210,828,288 bytes (210.8MB). Minecraft does not need much memory to run at all. In fact, it can run smoothly at only a max heap of 512MB. This value is calculated based on the maximum vanilla chunk rendering distance at a radius of 16 chunks. A block is stored in a chunk by its numerical ID, which is an integer (32-bit value), which is 4 bytes. A vanilla chunk is width*height*depth=16*256*16=65,536 blocks. 4 bytes * 65536 blocks = 262,144 bytes (262.144 kilobytes). To find the amount of bytes used for a radius of 16 chunks, we use PI*r^2 (the area of a circle). 262144 * (PI * 16^2) = ~210,828,714.1331565 bytes. The JVM only takes values in multiples of 1024, so 210,828,714.1331565 / 1024 = 205,887.4161456607. We round this value to 205,887 and multiply by 1024 to get 210,828,288 bytes for a radius of 16 chunks.
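The arithmetic above can be reproduced with a few lines of Java. This is only a sketch of the calculation as described in this thread (4 bytes per block, a circular area of chunks, rounded down to a multiple of 1024); the class and method names are made up for illustration.

```java
final class HeapSizing {
    static final long BYTES_PER_BLOCK = 4;                       // one 32-bit block ID
    static final long BLOCKS_PER_CHUNK = 16L * 256 * 16;         // 65,536 blocks
    static final long BYTES_PER_CHUNK = BYTES_PER_BLOCK * BLOCKS_PER_CHUNK; // 262,144 bytes

    /** Bytes for a circle of chunks with the given radius, rounded down to a multiple of 1024. */
    static long minHeapBytes(int radiusInChunks) {
        double raw = BYTES_PER_CHUNK * Math.PI * radiusInChunks * radiusInChunks;
        return (long) Math.floor(raw / 1024) * 1024;
    }

    public static void main(String[] args) {
        System.out.println("-Xms" + minHeapBytes(16));   // prints -Xms210828288
    }
}
```

Calling minHeapBytes with a smaller radius gives the matching value for a shorter render distance.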
The Chunk Cache
Note: you can also add a certain number of bytes to this number. The game itself has a chunk cache, and the maximum size of this cache depends on the size of the heap, so you only need to calculate the number of bytes for however many chunks you want to cache in the heap and add that to 210,828,288 bytes using the number of bytes in a chunk, so that means you can use the following function to calculate the number of bytes for Xms:
210828288 + ( number of chunks to cache * 262144 )
For the chunk cache, I personally would have at least four times as many chunks as the square of the rendering distance. If you keep your distance at 8 chunks, that means your chunk cache would be 4 * 8^2 = 256 chunks, which is 67,108,864 bytes. For 16 chunks, that's 268,435,456 bytes! This additive value on top of 210,828,288 bytes can definitely affect performance. If the chunk cache is too small, chunks will be generated more often, which is an expensive operation. If the chunk cache is too big, every periodic GC may have a larger and larger lag spike in proportion to the heap size. It also depends on how many chunks you tend to explore in one session of Minecraft.
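If you want to plug in your own render distance, the formula can be written out like this. It is just a sketch of the rule of thumb from this thread (four times the square of the render distance, 262,144 bytes per chunk, added to the 210,828,288-byte base); the names are hypothetical.

```java
final class ChunkCacheSizing {
    static final long BYTES_PER_CHUNK = 4L * 16 * 256 * 16;   // 262,144 bytes per chunk
    static final long BASE_XMS_BYTES = 210_828_288L;          // base value for a 16-chunk radius

    /** Rule of thumb: cache four times the square of the render distance. */
    static long cachedChunks(int renderDistance) {
        return 4L * renderDistance * renderDistance;
    }

    static long cacheBytes(int renderDistance) {
        return cachedChunks(renderDistance) * BYTES_PER_CHUNK;
    }

    /** Base heap plus the chunk cache: the value to pass to -Xms. */
    static long xmsBytes(int renderDistance) {
        return BASE_XMS_BYTES + cacheBytes(renderDistance);
    }

    public static void main(String[] args) {
        for (int distance : new int[] {8, 16}) {
            System.out.println("distance " + distance + ": cache = " + cacheBytes(distance)
                    + " bytes, -Xms" + xmsBytes(distance));
        }
    }
}
```

For a render distance of 8 this gives -Xms277937152, and for 16 it gives -Xms479263744.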
-Xmx1024m
This option sets the maximum Heap size to 1024 MB. Minecraft has 16x16 textures, but the textures are only stored in graphics memory, so they don't matter for the Heap. However, for whatever reason, if you open debug (F3), you'll notice that the Heap fills quickly, especially while you move. If you make the Heap too large, this can affect GC performance by making GC passes take longer, thus introducing unnecessary latency. The time it takes to do one GC pass grows roughly in proportion to the size of the Heap, so a smaller heap is better, and that goes for any machine: just because you have 16GB of RAM doesn't mean you should allocate 8192MB for Minecraft to use. I suggest not going higher than 4096MB.
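The heap figures on the F3 screen are the kind of numbers any Java program can read from the standard Runtime class. Here is a small sketch (the class name and output format are my own, not Minecraft's) that prints used, committed and maximum heap, which is handy for checking what -Xms and -Xmx actually did.

```java
final class HeapReport {
    /** Bytes currently occupied by objects (committed heap minus free space). */
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    static String describe() {
        Runtime rt = Runtime.getRuntime();
        long mb = 1024L * 1024L;
        return String.format("used %d MB / committed %d MB / max %d MB",
                usedBytes() / mb, rt.totalMemory() / mb, rt.maxMemory() / mb);
    }

    public static void main(String[] args) {
        System.out.println(describe());   // "max" reflects -Xmx
    }
}
```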
-XX:ReservedCodeCacheSize=512m
Sets the code cache size to 512MB for compiled methods. No more than 512MB are required for Minecraft. This goes even for mods, but if you have a lot of mods, increasing this value to 768m may compensate for the extra methods added by mods. Lower values may decrease performance since the JIT compiler may run out of memory for compiled methods, and those methods will remain interpreted because they didn't fit in the compiled code cache. The absolute limit is 2GB.
-XX:+AggressiveOpts
Enables aggressive optimizations that are expected to become final in a future release. The performance increase is very noticeable.
-XX:+DisableExplicitGC
This option disables calls to System.gc(). As mentioned earlier, Mojang places that call in the game loop, so GC cycles are executed after each frame to keep memory usage low. This is a latency bottleneck, however. On my machine, this option adds 10-12fps, so I would now be getting ~42fps on average without input lag.
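You can see what this flag does with a small probe. The sketch below (the class name is hypothetical) counts collections through the standard GarbageCollectorMXBean API before and after a System.gc() call. Run normally, the count usually goes up; run with -XX:+DisableExplicitGC, the call should be ignored and the count should stay the same.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class ExplicitGcProbe {
    /** Total number of collections reported by all collectors in this JVM. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();   // -1 if the collector does not report a count
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long before = totalCollections();
        System.gc();                                // the kind of explicit request the flag disables
        long after = totalCollections();
        System.out.println("collections triggered by System.gc(): " + (after - before));
    }
}
```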
-XX:+UseConcMarkSweepGC
This option enables the CMS garbage collector. This is a better collector for low-latency requirements, especially realtime applications. This adds 2-5fps on its own, so now I would have 44-47fps on average. To learn what the CMS collector does, see https://docs.oracle.com/javase/8/docs/technotes/guides/vm/gctuning/cms.html .
-XX:+CMSScavengeBeforeRemark
This option enables scavenging (collection) of the young generation before the remark stage of the CMS collector. The remark is the second (final) mark stage of a CMS cycle, a stop-the-world pause that runs before the sweep. During marking, the CMS collector traces objects on the heap to find the ones that are unreachable so the sweep can reclaim them. Emptying the young generation first leaves the remark with fewer objects to scan.
-XX:+CMSParallelRemarkEnabled
This option enables parallel remark, so the final marking stage of a CMS cycle is spread across multiple threads. This reduces the length of the pause, and the benefit scales with the number of physical cores available to the JVM.
-XX:+AlwaysPreTouch
This option makes the JVM touch every page of the heap at startup. Your system uses a paged memory model which splits memory into fixed-size "pages". When an application asks for memory, the system only hands over the actual physical pages once they are first used, which takes some time. Without this option, that cost is paid bit by bit while the game runs, whenever the heap grows into memory it has not used before. With it, the cost is paid once at initialization. The gain is very small, so this makes your game run a very small tad faster (less than 0.1fps), but how much it saves depends on the system, since the page size differs between systems (4KB is typical, but larger pages exist).
-Xnoclassgc
Disables garbage collection of classes. This shortens GC latency and adds maybe 1-2fps, or even more depending on how many mods are installed. Generally, a class is only loaded when its functionality is needed, so OutOfMemoryErrors aren't going to be a problem. Even if something is only used once or twice, the size of a class is negligible, so it is usually beneficial to prevent GC of classes.
-XX:-CMSClassUnloadingEnabled
This option disables unloading of classes with the CMS collector, which saves GC time.
-XX:CMSInitiatingOccupancyFraction=90
Sets the initiating occupancy of the old generation to 90%. This is one criterion for the CMS collector to start a collection cycle of the old generation: when the old generation reaches 90% occupancy, a CMS cycle will begin. This is one of many triggers for the CMS collector.
-XX:InitiatingHeapOccupancyPercent=90
WARNING: THIS VALUE MUST NOT CHANGE. Sets the percent of the entire heap to be filled before a full GC cycle to 90%. This is one option, in combination with MinHeapFreeRatio and MaxHeapFreeRatio, that is essential to the performance of Minecraft. I have found this to be a good balance for reducing lag spikes and stutters for minor GCs throughout execution since this means a full GC only occurs when the total heap occupancy has reached 90% capacity. Be aware that the JVM documentation describes this flag for collectors that trigger on the occupancy of the whole heap, such as G1, so it may have little or no effect while CMS is selected.
-XX:MaxHeapFreeRatio=87
WARNING: THIS VALUE MUST NOT CHANGE. Sets the maximum percent of the entire heap that is allowed to be free after a GC cycle. If after a GC cycle the amount of free memory is greater than this value, the heap will be shrunk. This value, along with MinHeapFreeRatio and MaxGCPauseMillis, make the heap tend to expand only when necessary. Expansions are more likely to occur after long sessions of Minecraft, and during initialization. These values also affect the minor GC latency.
-XX:MinHeapFreeRatio=13
WARNING: THIS VALUE MUST NOT CHANGE. Sets the minimum percent of the entire heap that is allowed to be free after a GC cycle. If after a GC cycle the amount of free memory is smaller than this value, the heap will expand. This value, along with MaxHeapFreeRatio and MaxGCPauseMillis, make the heap tend to expand only when necessary. Expansions are more likely to occur after long sessions of Minecraft, and during initialization. These values also affect the minor GC latency.
-XX:MaxGCPauseMillis=1
WARNING: THIS VALUE MUST NOT CHANGE. Sets the target maximum allowed time for GC. This is a soft goal, and the JVM makes a best effort to reach this goal. Along with MinHeapFreeRatio, MaxHeapFreeRatio, and InitiatingHeapOccupancyPercent, this option minimizes GC lag spikes (stutters). On my machine, the average GC latency is 3-13 ms with all of these default parameters. If you have more CPU cores or a faster CPU, the latency will be lower. I only have two cores at a max clock speed of 2200MHz (2.2GHz), 1400MHz on my newer laptop.
-XX:NewRatio=3
WARNING: THIS VALUE MUST NOT CHANGE. Sets the size ratio of the young generation to the old generation to 1:3. This means that, as the heap shrinks and grows, the young generation will be 1/4 of the entire heap and the old generation the remaining 3/4. Out of all the options specified here related to GC and heap ratios, this option has the largest impact on GC latency, and thus directly affects frame rate and GC-induced lag stutters.
-XX:SurvivorRatio -EXPERIMENTAL
WARNING: DO NOT ENABLE YET UNLESS YOU KNOW WHAT YOU ARE DOING. Sets the ratio between the Eden space and each of the two survivor spaces in the young generation. For example, a value of 6 sets each survivor space to 1:6 relative to Eden. That means each survivor space would be 1/8 the total young generation size.
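To see how NewRatio and SurvivorRatio carve up the heap, here is the arithmetic as a small sketch. The names are made up, and the real JVM aligns and adjusts these sizes, so treat the numbers as approximations of what the flags ask for rather than exact values.

```java
final class GenerationSizing {
    /** Young generation size for a heap size and -XX:NewRatio (old : young). */
    static long youngBytes(long heapBytes, int newRatio) {
        return heapBytes / (newRatio + 1);
    }

    /** Size of one survivor space for a young size and -XX:SurvivorRatio (Eden : survivor). */
    static long survivorBytes(long youngBytes, int survivorRatio) {
        return youngBytes / (survivorRatio + 2);    // Eden parts plus two survivor spaces
    }

    /** Eden is whatever is left of the young generation after both survivor spaces. */
    static long edenBytes(long youngBytes, int survivorRatio) {
        return youngBytes - 2 * survivorBytes(youngBytes, survivorRatio);
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long heap = 1024 * mb;                      // -Xmx1024m
        long young = youngBytes(heap, 3);           // -XX:NewRatio=3
        System.out.println("young    = " + young / mb + " MB");
        System.out.println("old      = " + (heap - young) / mb + " MB");
        System.out.println("eden     = " + edenBytes(young, 6) / mb + " MB");
        System.out.println("survivor = " + survivorBytes(young, 6) / mb + " MB each");
    }
}
```

With a 1024 MB heap, NewRatio=3 and SurvivorRatio=6, this works out to a 256 MB young generation, a 768 MB old generation, 192 MB of Eden and two 32 MB survivor spaces.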
NOTES:
- options that I have labeled with warnings indicating the value should not be changed, should not be changed.
- options that I have labeled as experimental should not be used unless you are an expert on the matter. These options are being experimented with for their potential performance improvements.
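After pasting arguments into the launcher, it is worth checking that the JVM actually received them. The sketch below uses the standard RuntimeMXBean to list the arguments the running JVM was started with; the class name and the hasFlag helper are hypothetical, and the helper is a simple prefix match, nothing smarter.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

final class ActiveJvmFlags {
    /** The arguments this JVM was started with (for example -Xmx1024m). */
    static List<String> inputArguments() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    /** True if any argument starts with the given prefix, e.g. "-XX:NewRatio". */
    static boolean hasFlag(List<String> args, String prefix) {
        for (String arg : args) {
            if (arg.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> jvmArgs = inputArguments();
        for (String arg : jvmArgs) {
            System.out.println(arg);
        }
        System.out.println("CMS selected: " + hasFlag(jvmArgs, "-XX:+UseConcMarkSweepGC"));
    }
}
```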
System Optimizations
- use a faster persistent storage device: as mentioned before, Minecraft heavily relies on disk to save chunks to and from memory. This is one bottleneck that can be lessened or removed by using a faster hard drive. If not that, consider at least a large hard drive, an SSD, and USB 3.1, or any combination of the three of those. You could even setup a temporary drive that uses a portion of your RAM if you have the right software.
- on Windows, open the "Performance Options" dialog by searching for it or googling how to open it. A window with many desktop visual performance checkboxes should come up. Modify that as you like and exit. Your system will have better performance; Windows will not process certain aesthetics in Windows which will speed up your system in general. I recommend keeping "compositing" enabled, or vsync will not be available.
- on Windows 7, consider switching to a non-Aero theme or disabling Aero glass to make sides of Windows opaque. It is an aesthetic that is more demanding than you think.
- on Windows, open the "Performance Options" dialogue and click into the "Advanced" tab. Adjust the first setting for best performance of "Background services". In turn, the Windows operating system will be prioritized. As backwards as that sounds, it means the OS will stall foreground programs less when the OS decides to do something, and reduces the cost of system signals from Minecraft which means better input latency from mice, and other peripherals, while also improving desktop (explorer.exe) and application responsiveness (fewer "Not responding" problems).
- on Windows, consider deleting C: drive and installing Arch Linux to get a more efficient OS with better latency for foreground applications and full configurability of your system from scratch. Or just dual boot Linux.
- on Windows, consider reverse engineering kernel32.dll, user32.dll, and other Windows components, and hack Windows to make Windows run, you know, efficiently? By switching to GCD as the primary threading model and a UNIX-based networking model, then having Windows distribute across multiple cores to improve OS throughput and latency.
- on Linux, consider using a lighter Desktop Environment, or move to just a lightweight Window Manager to improve performance.
- on Linux, consider faster kernels such as zen (faster) or Liquorix (fastest)
- on Linux, consider KMS and Realtime priority support to improve audio latency
- on Linux, consider making a more efficient kernel than all the rest by developing a minimalist modular exokernel written in D.
- on any OS, consider making an OS specifically for running Minecraft
Note: some of the above suggestions are actually sane.
Authority: If there is any reason you should question whatever I put in this thread or reply to, you should know that I have been programming in Java since 2010. If there seems to be an error, please tell me; it is a logical fallacy to appeal to my experience and expect me to be perfect and take everything I say as correct. As of writing, I have 7 years of Java experience.
I ultimately skipped through most of that, I got a bit lost in the second category. The information my feeble mind was able to process seemed well said though!
I recommend adding spoilers for each category to clean it up a bit. Just seeing the length of this post made it even harder to follow.
Nice job though, I can tell a lot of time and effort went into making this collection of optimizations!
Isn't -Xms211MB better than -Xms210828288? Also, if you crash with an out of memory error when this is set, increase it (although to avoid the GC having to run too much in the heap I would leave it at 256M or 512M)
Some of them are already in the game
-server would make it slow at the start while it warms up
I will rewrite the thread to improve the structure whenever I can. I admit I had a better version of this article before, but Chrome didn't save it on this iPhone. This one was a bit more hasty, and properly introducing/teaching concepts that affect performance to players who probably barely know all the keyboard shortcuts to their OS nor touched a lick of code is a bit challenging but the knowledge that people receive from this on-going optimization thread for Minecraft is invaluable. Instead of me just giving a list of JVM arguments that you have no clue about what any of it does or means, I'll teach you a bit about GC, objects, the JVM, and those arguments so you can learn how to optimize Minecraft for yourself.
Some words I use are a bit of esoteric computer jargon, but I'll do my best to explain everything without sounding too verbose.
Approximating ~210.8MB as ~211MB makes little difference, but if you want to spare every byte that you can, an exact value is good.
The JVM will not throw OutOfMemoryError unless Xmx is less than 512M. Xms is the minimum heap size, not the maximum. The heap shrinks and grows as more memory is needed.
An option does not add overhead, and the only difference between -client and -server is how the JVM handles memory and JIT compilation. The server VM also has Tiered compilation. The server VM is needed for some features. Adding this option does not increase startup time. Generally speaking, the server VM is best to use for Minecraft. In fact, the World is a server in itself. Optimization of Minecraft is entirely centered around memory management, so GC is one of the biggest performance impacts. The other thing is how quickly Minecraft creates new objects. I will measure this and adjust the JVM arguments as necessary.
Thank you so much for not only performance optimization tips, but also an in depth description! Just what I was looking for.
I looked through this thread as well and combined all the arguments. This is what I ended up with:
Memory allocation: 8192Mb (because I use a Sphax resource pack which eats a lot of memory)
Minecraft version: 1.10.2 (modded) forge-12.18.3.2488
Java version: 8 Update 144 (build 1.8.0_144-b01)
Mods: Custom modpack based on FTB Beyond but with 121 mods instead of ~191. (No biome mods).
OS: Windows 8
CPU: I7 2600k OC'ed to 4,4 ghz
Memory: 16 gb DDR3 1333 mhz
Gfx: AMD HD 6950
Disk: Crucial M4 128 gb
With lower video settings without vsync, the BetterFPS mod using the algorithm optimized for my hardware, javaw.exe set to high priority, and these Java arguments, I went from an average of ~25 fps in a new world without a single block placed by me to an average of ~150 fps, peaking at 300 fps.
I really thought something was wrong with my computer, but it turns out, that Minecraft must be optimized to yield the best FPS and that vsync in Minecraft is useless unless you have screen tearing of course.
Java arguments that didn't work together with the arguments list above
-XX:+UseCompressedStrings
-XX:+CMSIncrementalMode
Now - I'm currently building my own dedicated machine for hosting the modpack mentioned above.
Would all the same arguments have a benefit server side? Which arguments can you recommend for optimizing the server?
I will be running it on Windows 10. I know Linux is better, but if I were to learn that as well, I wouldn't be hosting the server before 2019.
I'll probably use some kind of server management software as well.
These arguments are generally client-side. A server handles more than just one player at a time, so memory management is different, especially since you usually have different and more powerful hardware.
If you would like some tips for how to handle GC on your server, I suggest going with the parallel and throughput collectors. You'll have to look into them for yourself, however. Tuning them is a whole different ball game on servers.
If you are making a server, make balancing memory and network latency your priority. Minecraft is a memory-heavy game, and it needs low network latency for smooth play. First, optimize the local environment as best you can (memory, hardware, and running only what is necessary for your Minecraft server). Next, optimize for network latency. There are probably plenty of registry hacks out there, but look up how to disable something called "nagling": it is a feature on your system that combines small packets into one packet to increase throughput, at the cost of latency. I don't know whether Windows 10 servers already disable it or not.
If you want to use any of the arguments listed above, I suggest using the ones unrelated to memory management as that is going to be the job of an expert to tune for your server, but for now at least pick a parallel or throughput garbage collector for starters. For example, you can use AggressiveOpts (and definitely use AggressiveHeap, see JVM arguments list here), but don't use GC tuning parameters specifically for the CMS collector. You can, however, keep the ReservedCodeCacheSize at 768m because you are using a modpack and that doesn't change during runtime.
Besides that, there isn't much else to do besides become a Java expert and learn how to tune it yourself, but that is why I made this thread: it teaches you enough to know the JVM and GC to optimize Minecraft on your own, and if you are optimizing a server, that really depends on your own commitment and effort. Technically I'd say that a server is well off for running the Minecraft server itself, but really the biggest and simplest optimization anyone can do is just keep Java up to date on the latest version. Oracle always has some new optimizations in each minor release. But not Minecraft. Those guys think they know what they're doing, but in reality they seem to be ignoring what the heck the Minecraft Snooper anonymous data statistics are for. They probably make Minecraft for middle- to high-end hardware because that is what they make the game on! In fact, the recommended specs on the back of Minecraft account cards in the store are probably Markus Persson's computer specs!!!
Thanks for some great info!
I've learned a lot by reading this thread and will look into further optimizations myself.
Wow, this looks great! You are a man of magic! I will be sure to try these out!
Interesting. How can we use a different JVM then?