About the Author: A little over a year ago, Gabe “lazyeye” Weiner discovered the joy of programming video games, which quickly spiralled into a passion powerful enough to make him drop out of college to pursue the career professionally. He has since released a variety of tools and jam games, with Forager now being his first commercial credit. You can follow him at his Twitter to stay up to date with his work, and get Forager now on Steam.
HOW FORAGER HAPPILY HANDLES THOUSANDS OF INSTANCES
Sometimes you’re lucky enough to be hired onto a project from the start, having full control over what your codebase looks like. Other times, you’re handed a mildly troublesome project of 50,000 lines and told: “Please fix this”. Hi, I’m lazyeye, the lead programmer of Forager.
Before I was made the lead programmer of the game, I was contracted for one task: optimization. Forager is a massive crafting game where the player can collect resources and build structures on an enormous map, meaning that there could easily be 5000 instances active at a time, if not more. This particular genre of game raises a major problem for an optimizer: the player has the ability to create seemingly infinite instances, and somehow we still have to ensure that the game runs smoothly across all platforms.
Instance count is a common problem in GameMaker games; in nearly every project I have worked on, excessive instance counts have been the worst performance drag. People often fall into the trap of using an object for everything rather than learning about other tools such as particle systems, data structures, and asset layers. Understanding how to apply these other methods is important since they can improve performance and make some complex tasks easier. If something isn't moving around, isn't being depth sorted, and isn't colliding with stuff, it probably shouldn’t be an object.
But, Forager is different. Unlike most projects I've seen, almost every single instance in the game seemed to be justified and would be a nightmare to simulate without instances. Frustrated, I realized I would have to roll up my sleeves and think a bit more outside the box.
When I say a lot of instances, I mean… a lot
DRAW OPTIMIZATION
So, we’re stuck with these instances. However, there are still a plethora of ways to improve the code they are running each frame. Performance drops often stem from a lack of understanding of how drawing actually works under the hood. Unfortunately, this process is a very loaded subject (and one I am not an expert on), but a basic understanding is critical to avoid common mistakes.
You may have heard the term “batch breaks” before. A batch is a set of instructions that GameMaker sends to the GPU to draw things. GPUs do a great job at drawing things very quickly when they come grouped up in a single batch, but the more batches you send to the GPU, the more time it has to spend switching between instruction sets, hurting your performance.
A batch break can occur when you update one of GameMaker’s draw settings, such as changing color/alpha, font, blendmode, among others. Here are some common “batch breakers”:
- Primitives (draw_rectangle, draw_circle, draw_primitive_begin, etc.)
- Surfaces (surface_set_target, surface_reset_target)
- Shaders (shader_set, shader_reset)
- Draw settings (draw_set_font, draw_set_colour, etc.)
- GPU settings (gpu_set_blendmode, gpu_set_alphaenable, etc.)
Batch breaks are inevitable, and some platforms handle them better than others. The key is to structure your code so that you minimize these breaks. Often this can be done by grouping similar instructions together. For example, rather than having lots of instances use the following Draw event:
// Outline
shader_set(shd_outline);
draw_self();
shader_reset();
See if you can get away with doing them all at once inside a controller’s Draw event:
// Instance outlines
shader_set(shd_outline);
with (par_outlined) {
    draw_self();
}
shader_reset();
Often depth sorting gets in the way of this trick, but sometimes you’ll find exceptions. The use of layer scripts can assist in this process greatly by allowing you to perform code exactly before and after a layer is drawn.
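As a rough sketch of the layer script idea, the outline shader could be set once before a layer draws and reset once after it. This assumes GameMaker 2.3+ function syntax and a room layer named "Outlined", which is a hypothetical name; the two script function names are also made up for illustration.

```gml
// Script asset (function names are hypothetical)
function OutlineLayerBegin() {
    // Layer scripts fire for several draw events; only act on the main Draw event
    if (event_type == ev_draw && event_number == 0) {
        shader_set(shd_outline);
    }
}

function OutlineLayerEnd() {
    if (event_type == ev_draw && event_number == 0) {
        shader_reset();
    }
}

// Controller Create event: attach the scripts to the layer
var _layer = layer_get_id("Outlined"); // hypothetical layer name
layer_script_begin(_layer, OutlineLayerBegin);
layer_script_end(_layer, OutlineLayerEnd);
```

Every instance on that layer is then drawn with a single shader switch, and the instances keep their normal Draw events.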
STEP OPTIMIZATION
Optimizing the Step event is a constant question of “does this need to be running every step?” Usually, I will ask myself this question quite a few times, the first answers always being, "Yes, there is no way they can work without running every frame," until perhaps the eighth time, when I realize, "Oh. That might work."
Is an instance fetching information from some global data structure every frame? Perhaps you can just do it once in a Create event. Are you iterating through the entire player's inventory each step? Maybe it just needs to be done when an item is added or removed. None of those things? Maybe nobody will notice if that code is just running every other step.
A neat trick is to utilize GML’s short-circuiting. Short-circuiting means GML stops evaluating a chain of && conditions as soon as it reaches a false value. For example, given the following code:
if (1 + 1 == 3) && (instance_place(x, y, obj_enemy)) {
// stuff
}
Because 1 + 1 is not 3, GameMaker won’t even bother evaluating the instance_place call. Since you are saying both statements must be true, there is no way that the conditional could return true if the first part is false. Use this to your advantage when ordering your conditionals. If you have checks that are more performance heavy than others, put the lighter ones first! Additionally, keep in mind which conditionals are most likely to be false -- having five conditionals that will almost always be true, followed by one that will almost always be false, is a waste of processing time.
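To make the ordering advice concrete, here is a small sketch. The variables attack_cooldown and hp are hypothetical and exist only for this example.

```gml
// Slower: the collision check runs every step, even while the cooldown is active
if (place_meeting(x, y, obj_enemy) && attack_cooldown <= 0) {
    hp -= 1;
}

// Faster: the cheap variable comparison comes first, so the collision
// check is skipped whenever the cooldown has not expired
if (attack_cooldown <= 0 && place_meeting(x, y, obj_enemy)) {
    hp -= 1;
}
```

Both versions behave the same; only the amount of work done on the frames where the condition fails is different.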
TAKING IT FURTHER: INSTANCE CULLING
All these optimizations are great, but for Forager, we’re going to need a big picture solution. As an optimizer, part of your job is to find places where you can trick the player into thinking one thing is happening while something else entirely is being processed behind the scenes. These kinds of tricks are the backbone of optimization: clever illusions that the player won't know are happening, but that enable things that previously would not have been possible.
While we have already established that all of our instances are justified in their use, this doesn’t mean that each of them is always important. Take objTree, for example; trees are depth sorted, have collisions, visual effects, and all sorts of processes when the player interacts with them. However, if the player can only interact with the tree if they are near it... perhaps we don't need to process it 95% of the time. If a tree vanishes in a forest, and the player isn't there to see it, do they notice?
This is where our culling system emerges. If the player can't see an instance, we're simply going to deactivate it. If the player moves to a place where the instance would now be visible, we'll quickly reactivate it before it pops into view. This gif shows the process -- the white rectangle marks our cull region.
Watch the borders!
SOME ACTUAL CODE
The following script, CullObject, is used in a Step event to check each active instance of an object to see if it should be culled:
All code screenshots are using the Dracula theme created by TonyStr
We supply the object to be culled and check each of its instances to see if its sprite is outside the view. If it is, we create an array with its ID and boundary box and stick the array in a list of our deactivated instances. Note that this boundary box is based on the actual sprite drawn with its scaling, not the boundary box of the instance's collision. We add a small amount of padding due to the fact that Forager occasionally applies minor scaling to draws with other variables.
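The description above could be sketched roughly as follows. This is not Forager's actual script. It assumes GameMaker 2.3+ function syntax, a ds_list called global.culled created elsewhere, view 0 as the game camera, and a padding of 16 pixels; the list name and padding value are hypothetical. Extra bookkeeping such as an active flag is left out.

```gml
function CullObject(_obj) {
    // View rectangle of the camera (assumes view 0 is in use)
    var _cam = view_camera[0];
    var _vx1 = camera_get_view_x(_cam);
    var _vy1 = camera_get_view_y(_cam);
    var _vx2 = _vx1 + camera_get_view_width(_cam);
    var _vy2 = _vy1 + camera_get_view_height(_cam);
    var _pad = 16; // hypothetical padding for small draw-time scaling

    with (_obj) {
        // Bounds of the scaled sprite, not the collision mask
        var _ax = x - sprite_xoffset;
        var _ay = y - sprite_yoffset;
        var _bx = _ax + sprite_width;
        var _by = _ay + sprite_height;
        // min/max keeps the box valid when the scale is negative
        var _x1 = min(_ax, _bx) - _pad;
        var _y1 = min(_ay, _by) - _pad;
        var _x2 = max(_ax, _bx) + _pad;
        var _y2 = max(_ay, _by) + _pad;

        // rectangle_in_rectangle returns 0 when the rectangles do not touch
        if (rectangle_in_rectangle(_x1, _y1, _x2, _y2, _vx1, _vy1, _vx2, _vy2) == 0) {
            ds_list_add(global.culled, [id, _x1, _y1, _x2, _y2]);
            instance_deactivate_object(id);
        }
    }
}
```

A controller would call something like CullObject(objTree) from its Step event for each cullable object.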
The next script, ProcessCulls, is what handles "reviving" our deactivated instances:
Note: Forager has a little bit more going on in all these scripts, but for the sake of example, I’ve cut it all down to the bare system.
We simply run through our list of deactivated instances, check if the camera has now moved within view of it, and reactivate it if necessary, deleting it from our list.
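A minimal sketch of that loop might look like the following. It assumes a ds_list named global.culled (a hypothetical name) whose entries are arrays of the form [id, x1, y1, x2, y2], and view 0 as the game camera.

```gml
function ProcessCulls() {
    var _cam = view_camera[0];
    var _vx1 = camera_get_view_x(_cam);
    var _vy1 = camera_get_view_y(_cam);
    var _vx2 = _vx1 + camera_get_view_width(_cam);
    var _vy2 = _vy1 + camera_get_view_height(_cam);

    // Walk backwards so deleting an entry does not shift the ones still to visit
    for (var i = ds_list_size(global.culled) - 1; i >= 0; i--) {
        var _entry = global.culled[| i];

        // Any overlap between the stored box and the view means it is visible again
        if (rectangle_in_rectangle(_entry[1], _entry[2], _entry[3], _entry[4],
                                   _vx1, _vy1, _vx2, _vy2) != 0) {
            instance_activate_object(_entry[0]);
            ds_list_delete(global.culled, i);
        }
    }
}
```

Storing the box alongside the ID means the check never has to touch the deactivated instance itself.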
WAIT, I BROKE EVERYTHING
It was only moments after I pushed this commit to the repository that I had the thought, “Hm, I wonder if the game uses instance_exists, instance_number, and other instance functions a lot. That might be messed up by the culling”.
I decided to do a quick search for instance_find, instance_exists, and instance_number and was met with over 500 results.
Oops...
This is suddenly a very tricky situation -- these functions will not return proper results because they can only check for active instances. If the game is constantly using them for various pieces of game logic… then the culling was going to be a major issue.
Rather than give up on it, I decided to try to add another layer to the culling system. I needed a way to check if an instance exists, active or not. I also needed to be able to get an accurate count of all those instances and to be able to retrieve them. One challenge is that with instance_exists, we can supply our argument three different ways -- an ID, an object name, or a parent name.
THE TRUE INSTANCE FUNCTIONS
The first step is to add every cullable instance to a global data structure when it is created:
We add this instance’s ID to its object’s list in our cache of instances. Next, we iterate through an array of all of our possible parent object IDs, and check if this instance is a descendant of that parent. If it is, we add it to that parent’s list as well. This is because in GameMaker, running an instance function with a parent as the supplied argument will include all of that parent’s children, so our scripts need to as well.
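One way this registration could look is sketched below. The names global.true_instances (a ds_map from object index to a ds_list of instance IDs), global.cull_parents (an array of parent object indices) and TrueInstanceAdd are all hypothetical, and the code assumes GameMaker 2.3+ syntax.

```gml
// Script asset: add an instance ID to the list kept for one object index
function TrueInstanceAdd(_obj, _inst) {
    if (!ds_map_exists(global.true_instances, _obj)) {
        global.true_instances[? _obj] = ds_list_create();
    }
    ds_list_add(global.true_instances[? _obj], _inst);
}

// Create event of a cullable object
active = true; // flag used to tell culled instances from live ones

// Register under the instance's own object...
TrueInstanceAdd(object_index, id);

// ...and under every tracked parent it descends from
for (var i = 0; i < array_length(global.cull_parents); i++) {
    var _par = global.cull_parents[i];
    if (object_is_ancestor(object_index, _par)) {
        TrueInstanceAdd(_par, id);
    }
}
```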
Next, we need to clean up our instances from our cache when they are destroyed, so in a Clean Up event we have:
This is the same process as before, just reversed.
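A matching removal sketch, under the same assumptions: global.true_instances maps an object index to a ds_list of instance IDs, global.cull_parents is an array of parent object indices, and TrueInstanceRemove is a hypothetical helper name.

```gml
// Script asset: remove an instance ID from the list kept for one object index
function TrueInstanceRemove(_obj, _inst) {
    if (!ds_map_exists(global.true_instances, _obj)) exit;

    var _list = global.true_instances[? _obj];
    var _pos = ds_list_find_index(_list, _inst);
    if (_pos != -1) {
        ds_list_delete(_list, _pos);
    }
}

// Clean Up event of a cullable object
TrueInstanceRemove(object_index, id);

for (var i = 0; i < array_length(global.cull_parents); i++) {
    var _par = global.cull_parents[i];
    if (object_is_ancestor(object_index, _par)) {
        TrueInstanceRemove(_par, id);
    }
}
```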
Now that our instances are set up, we’re ready to write our replacement functions.
You’ll notice that in all of these functions, I’ve included compatibility for instances that are not a part of our true instance system. Remember, there are over 500 occurrences of these scripts in the source code and I’m trying to be as time-efficient as possible. Being able to do a quick find and replace to implement this is crucial.
Now we can see the purpose of our active variable -- this way, we can check if an instance is currently active before returning it, and if it's not, we activate it and mark it on our temporary activations list. However, we don’t flip its active flag, because this way we can tell the difference between an instance actually being activated versus it only being temporarily activated.
Temporarily reactivating the instance before we return it is necessary because it is possible that the source code will want to write values to the instance it retrieves. Deactivated instances can be read from, but cannot be written to. In an ideal system, this activation would be optional, but since I needed to keep the argument format to maintain compatibility with the old instance functions, I left the activation as a requirement.
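Replacements for instance_number and instance_find along these lines might look like the sketch below. The function name TrueInstanceNumber and the structures global.true_instances (object index to ds_list of IDs) and global.temp_activated (a ds_list) are hypothetical. It also relies on two behaviours described above rather than verified here: that a deactivated instance's active variable can still be read, and that the instance can be used right after it is activated.

```gml
function TrueInstanceNumber(_obj) {
    // Objects outside the system fall back to the built-in function
    if (!ds_map_exists(global.true_instances, _obj)) {
        return instance_number(_obj);
    }
    return ds_list_size(global.true_instances[? _obj]);
}

function TrueInstanceFind(_obj, _n) {
    if (!ds_map_exists(global.true_instances, _obj)) {
        return instance_find(_obj, _n);
    }

    var _list = global.true_instances[? _obj];
    if (_n < 0 || _n >= ds_list_size(_list)) {
        return noone;
    }

    var _inst = _list[| _n];
    if (!_inst.active) {
        // Wake it so the caller can write to it, and remember to put it back.
        // The active flag is left untouched on purpose.
        instance_activate_object(_inst);
        ds_list_add(global.temp_activated, _inst);
    }
    return _inst;
}
```

Because the signatures match the built-in functions, a find and replace of instance_number and instance_find is enough to switch call sites over.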
With TrueInstanceExists, we can tell whether an instance ID or an object ID was supplied by checking if it is above or below 100000. As in TrueInstanceFind, we make sure to activate the object before returning it.
Finally, we have to deactivate our temporarily activated instances, which has one small issue -- GameMaker rebuilds its event queues in between each Step and Draw event. This means that we have to make sure our controller object runs the following script in each of those events, otherwise we could risk an instance running one of its events when it should not.
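A sketch of that cleanup step, assuming the temporarily activated IDs were collected in a ds_list called global.temp_activated; both that name and the function name are hypothetical.

```gml
function ClearTempActivations() {
    for (var i = 0; i < ds_list_size(global.temp_activated); i++) {
        var _inst = global.temp_activated[| i];
        // The instance may have been destroyed since it was woken up
        if (instance_exists(_inst)) {
            instance_deactivate_object(_inst);
        }
    }
    ds_list_clear(global.temp_activated);
}
```

The controller would call ClearTempActivations() from each of its Step and Draw events, so nothing that was only woken temporarily survives into the next event.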
MAKING IT EVEN FASTER
Recall how I said earlier that we must always question if the code in our Step events must run every single frame? The same applies here. We have a macro, SYSTEM_CHECK_INTERVAL, which controls how often our system scripts (such as culling) run. The scripts scale accordingly, so for example, in the system script that controls plant growth, the growth value will be multiplied by our interval. If the system is only running checks every 20 frames, the growth value will be multiplied by 20. The scripts are all in a switch statement, meaning the work is evenly spread across 20 frames.
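The interval idea can be sketched with a frame counter and a switch in the controller's Step event. Here system_tick, objPlant, growth and growth_rate are hypothetical names, and which system runs on which frame is arbitrary; CullObject and ProcessCulls stand for the article's scripts.

```gml
#macro SYSTEM_CHECK_INTERVAL 20

// Controller Create event
system_tick = 0;

// Controller Step event: one system per frame, each running once per interval
switch (system_tick) {
    case 0:
        CullObject(objTree);
        break;

    case 1:
        ProcessCulls();
        break;

    case 2:
        // Runs once every 20 frames, so apply 20 frames of growth at once
        with (objPlant) {
            growth += growth_rate * SYSTEM_CHECK_INTERVAL;
        }
        break;

    // The remaining frames would hold the other systems
}

system_tick = (system_tick + 1) mod SYSTEM_CHECK_INTERVAL;
```

Spreading the systems over different frames avoids one expensive frame every interval, at the cost of each system reacting up to 20 frames late.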
CLOSING THOUGHTS
Optimization is an impossibly large subject, and we've only covered fragments here. Instance count is only a part of what hurts most projects; what goes on under the hood of GameMaker, the process of rendering textures, and many other intricacies of game development are critical to understanding how to properly optimize your project.
However, the advice "don't prematurely optimize your game" still holds plenty of truth. Don't hinder your progress by worrying about performance issues before you know you'll have them. The truth is that GameMaker caters very well to the projects it’s built for -- most developers will never have to worry about these things.
That said, if you do end up creating a large game which faces performance issues, I encourage you to be clever and curious, to experiment, and to do research.
Or, you know, just give me a call. ;)