When it comes to developing a game engine (whether it be reusable or the engine powering a single specific game) there are many alternative approaches that can be taken at almost every single stage of the design.
For this reason (and I say this from personal experience) the single most critical part of the development process is designing the overall architecture before a single line of code is ever written. If you don’t decide upon an overall architecture before development begins, you will almost certainly find yourself scrapping the codebase and starting over each time you realize that there’s a better way your engine could have been designed.
So, in this series of articles, I’m going to share with you the various “architectural rethinks” that have occurred during the development of my newest game engine (called “AGE” which stands for “Another Game Engine”).
The single most significant of these decisions is covered in this first part of the series.
It is my hope that you can benefit from my otherwise-wasted time, and skip the series of poor decisions I made along the way; particularly those which necessitated a complete ground-up redevelopment.
I would like to point out that there will not be any usable code provided in this series of articles. This is because the principles discussed apply universally (to any object-oriented programming language). Some generic code examples may be used where appropriate, but they cannot be assembled into a functional game engine as they will merely be individual, unrelated snippets of illustrative code (as opposed to segments of a working example). These code samples will be written in Pascal, as it is my favourite programming language, and is easily translated into other languages.
Flowcharts will be used extensively to illustrate the relationships between subsystems, and individual logical operations.
Please be sure to read the whole article before you begin your engine design. The reason will become apparent as you read further!
What is a “Game Loop”?
Put simply, the “Game Loop” advances the Physics, Logic (including AI) and Rendering. Engines utilizing the Game Loop approach place the Game Loop at the absolute heart of their architecture.
Game Loops, as with any approach to just about anything when it comes to game development, have advantages and disadvantages.
On the plus-side, a Game Loop can be laughably simple to implement. There are several different kinds of Game Loop, but they all operate on the principle of Differential Time (or Delta Time), which is the amount of time the previous cycle of the Loop took to execute. Any Game Loop that doesn’t take Delta Time into consideration is doomed to complete and total failure, as there is no way to ensure that the Physics, Logic/AI, and Rendering progress at a consistent rate across different hardware (resulting in very poor gameplay, and making multiplayer an absolute impossibility).
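To make the role of Delta Time concrete, here is a minimal sketch (not taken from AGE). The AdvancePosition and SimulateOneSecond helpers, and the figure of 100 units per second, are assumptions invented purely for illustration. The point is that an object scaled by Delta Time covers the same distance per second whether the Loop cycles 30 or 120 times a second.

```pascal
program DeltaTimeSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFDEF MSWINDOWS}{$APPTYPE CONSOLE}{$ENDIF}

// Hypothetical helper: advances a position by a velocity (units per second)
// scaled by Delta Time (seconds elapsed since the previous cycle).
function AdvancePosition(const APosition, AVelocity, ADeltaTime: Double): Double;
begin
  Result := APosition + (AVelocity * ADeltaTime);
end;

// Simulates one second of movement at the given number of cycles per second.
function SimulateOneSecond(const ACyclesPerSecond: Integer): Double;
var
  I: Integer;
begin
  Result := 0.0;
  for I := 1 to ACyclesPerSecond do
    Result := AdvancePosition(Result, 100.0, 1.0 / ACyclesPerSecond);
end;

begin
  // Both lines should report roughly 100 units travelled, whatever the loop rate
  WriteLn('30 cycles/sec:  ', SimulateOneSecond(30):0:3);
  WriteLn('120 cycles/sec: ', SimulateOneSecond(120):0:3);
end.
```

Without the multiplication by Delta Time, the faster loop would move the object four times as far in the same second.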
A simple Game Loop itself can be implemented in mere minutes (or less).
A Simple Game Loop
Note that the order of execution is not arbitrary. Since Physics dictates the Position and Angle of world objects, and the Logic and AI need the latest information to provide accurate behaviour, Physics needs to be updated first. Again, since the Logic and AI could very well have an influence over the rendered appearance of the game world (such as the current Sprite state and Particle Effects), we need to update Logic and AI second. Rendering therefore must be performed last.
The Game Loop illustrated above operates at an unfixed rate (so, as fast as the hardware possibly can). Of course, the advantage is that this Game Loop is laughably easy to produce:
var
  LLastTime, LCurrentTime, LDeltaTime: Double;
begin
  LLastTime := GetReferenceTime;
  while GameRunning do
  begin
    // Sample the clock once per cycle so that Delta Time spans the whole previous cycle
    LCurrentTime := GetReferenceTime;
    LDeltaTime := LCurrentTime - LLastTime;
    LLastTime := LCurrentTime;
    UpdatePhysics(LDeltaTime);
    UpdateLogic(LDeltaTime);
    RenderFrame(LDeltaTime);
  end;
end;
GetReferenceTime returns the current time, in seconds, with an extremely high precision (typically a hardware counter value divided by that counter’s frequency). Different programming languages and standardized libraries will provide their own counterpart to do the same thing.
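As a rough idea of what such a routine might look like, here is a stand-in written for Free Pascal. It is an assumption for illustration only, not the routine AGE uses: it relies on the GetTickCount64 function from Free Pascal’s SysUtils unit, which only has millisecond resolution, so a real engine would want a finer-grained counter (Delphi, for instance, offers TStopwatch in System.Diagnostics).

```pascal
program ReferenceTimeSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFDEF MSWINDOWS}{$APPTYPE CONSOLE}{$ENDIF}

uses
  SysUtils;

// Stand-in for GetReferenceTime: seconds read from a millisecond tick counter.
// GetTickCount64 is the Free Pascal SysUtils routine (a toolchain assumption).
function GetReferenceTime: Double;
begin
  Result := GetTickCount64 / 1000.0;
end;

var
  LStart, LElapsed: Double;
begin
  LStart := GetReferenceTime;
  Sleep(50); // pretend to do one cycle's worth of work
  LElapsed := GetReferenceTime - LStart;
  WriteLn('Elapsed: ', LElapsed:0:3, ' seconds');
end.
```

The value only has meaning as a difference between two readings, which is exactly how the Game Loop uses it.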
GameRunning is an external boolean value, function or flag to dictate whether the Loop should continue or break.
One disadvantage of this particular Game Loop is that, while it accounts for the time differential (Delta Time) between each cycle, every cycle can only be as fast as the combined cost of all three Simulation components (Physics, Logic and Rendering), because each cycle is made to update all of them.
Rendering needs to occur (at minimum) at the same rate as that at which the screen is refreshed (Refresh Rate or Vertical Sync Rate). Typically this is 60 times each second for the vast majority of computer monitors (120 times a second for 3D monitors). Ideally, you want to Render at double the refresh rate of the monitor. This will provide the smoothest possible appearance, and any framerate greater than this will simply not be visible to a player (meaning you’re wasting cycles, reducing performance unnecessarily, and unduly burdening the player’s hardware).
Physics and Logic, on the other hand, need only occur at a suitable base rate. For the majority of 2D games, and even some 3D games, 30 updates per second is usually very adequate. Modern 3D First-Person Shooter [FPS] games ideally want to update the Physics and Logic 60 times every second.
So, what we can do to improve this Game Loop is modify it to fix the rate of the Physics and Logic updates. This will free up more cycles for Rendering.
A Rate-Limited Game Loop
Okay, so the most common implementation of a Rate-Limited Game Loop enforces a limit on both the Physics and Logic, leaving Rendering unlimited. This way, more CPU time is given for Rendering, without necessarily impeding the performance of the Physics and Logic processing.
Here’s what this looks like in code:
const
  PHYSICS_RATE_LIMIT = 1 / 30.00;
var
  LLastTime, LCurrentTime, LDeltaTime: Double;
  LLastPhysicsTime, LPhysicsUpdateTime: Double;
begin
  LLastTime := GetReferenceTime;
  LLastPhysicsTime := LLastTime;
  LPhysicsUpdateTime := LLastTime; // We want the first Physics and Logic update to occur immediately
  while GameRunning do
  begin
    LCurrentTime := GetReferenceTime;
    LDeltaTime := LCurrentTime - LLastTime;
    LLastTime := LCurrentTime;
    if LCurrentTime >= LPhysicsUpdateTime then
    begin
      // Physics and Logic advance by the time elapsed since their own last update
      UpdatePhysics(LCurrentTime - LLastPhysicsTime);
      UpdateLogic(LCurrentTime - LLastPhysicsTime);
      LLastPhysicsTime := LCurrentTime;
      LPhysicsUpdateTime := LCurrentTime + PHYSICS_RATE_LIMIT;
    end;
    RenderFrame(LDeltaTime);
  end;
end;
Okay, this code snippet (above) adds a simple Rate Limiter to both the Physics and Logic. Now each time the Loop cycles, it will check whether the current Reference Time is equal to or later than the indicated time for the next Physics and Logic update (LPhysicsUpdateTime).
Not so difficult, right? Okay, but this still leaves potential for significant problems! For one thing, the Rate Limit can only be an advantage if the time it takes to Render a frame does not exceed that limit.
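That last point can be checked with a small simulation. The sketch below is an illustration under assumed numbers rather than engine code: it replaces the real clock with a counter in microseconds, treats rendering as the only cost in each cycle, and counts how many Physics and Logic updates fit into one simulated second. CountPhysicsUpdates and both constants are hypothetical names.

```pascal
program RateLimitSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFDEF MSWINDOWS}{$APPTYPE CONSOLE}{$ENDIF}

const
  PHYSICS_INTERVAL_US = 33333;  // roughly 1/30 of a second, in microseconds
  ONE_SECOND_US = 1000000;

// Counts the Physics/Logic updates that occur in one simulated second when
// every cycle also renders a frame costing ARenderCostUS microseconds.
function CountPhysicsUpdates(const ARenderCostUS: Int64): Integer;
var
  LNow, LNextPhysics: Int64;
begin
  Result := 0;
  LNow := 0;
  LNextPhysics := 0; // the first update happens immediately
  while LNow < ONE_SECOND_US do
  begin
    if LNow >= LNextPhysics then
    begin
      Inc(Result);
      LNextPhysics := LNow + PHYSICS_INTERVAL_US;
    end;
    Inc(LNow, ARenderCostUS); // rendering is the only cost modelled here
  end;
end;

begin
  WriteLn('5 ms frames:  ', CountPhysicsUpdates(5000), ' updates');
  WriteLn('50 ms frames: ', CountPhysicsUpdates(50000), ' updates');
end.
```

With these assumed figures, 5 ms frames allow 29 updates in the second, while 50 ms frames allow only 20: once a frame takes longer than the limit, Rendering sets the pace for Physics and Logic.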
An “Improved” Rate-Limited Game Loop
One way we can further improve the previous Game Loop example would be to introduce a Rate-Limit on Rendering.
Rate-Limiting the Rendering is a good idea because it ensures that our game isn’t forcing the player’s hardware to work harder than it needs to. Since we need only render at an absolute maximum rate of double the refresh rate of the player’s monitor, it makes absolutely no sense to allow the Game Loop to Render at any higher rate than that.
However, since different monitors have different refresh rates, we should not consider the rate limit for Rendering to be a fixed (constant) number, and should instead make it a setting (or “property”) that can be specified either automatically by having the game engine retrieve the refresh rate of the monitor on initialization, or by allowing the player to specify their own rate limit in the game’s “Advanced Graphics Options” menu.
This diagram (above) illustrates the logical progression of this Game Loop. Everything in yellow occurs on every Tick, everything in green only occurs if a logical operation evaluates as True, and everything in red occurs only if a logical operation evaluates as False.
Here’s what this game loop could look like in code:
const
  PHYSICS_RATE_LIMIT = 1 / 30.00;
var
  LCurrentTime: Double;
  LLastPhysicsTime, LPhysicsUpdateTime: Double;
  LLastRenderTime, LRenderUpdateTime: Double;
begin
  LCurrentTime := GetReferenceTime;
  LLastPhysicsTime := LCurrentTime;
  LLastRenderTime := LCurrentTime;
  LPhysicsUpdateTime := LCurrentTime; // We want the first Physics and Logic update to occur immediately
  LRenderUpdateTime := LCurrentTime; // We want the first Rendering update to occur immediately too
  while GameRunning do
  begin
    LCurrentTime := GetReferenceTime;
    if LCurrentTime >= LPhysicsUpdateTime then
    begin
      UpdatePhysics(LCurrentTime - LLastPhysicsTime);
      UpdateLogic(LCurrentTime - LLastPhysicsTime);
      LLastPhysicsTime := LCurrentTime;
      LPhysicsUpdateTime := LCurrentTime + PHYSICS_RATE_LIMIT;
    end;
    if (FPSLimit = 0) or (LCurrentTime >= LRenderUpdateTime) then
    begin
      RenderFrame(LCurrentTime - LLastRenderTime);
      LLastRenderTime := LCurrentTime;
      // FPSLimit is in frames per second, so the wait between frames is its reciprocal
      if FPSLimit > 0 then
        LRenderUpdateTime := LCurrentTime + (1 / FPSLimit);
    end;
  end;
end;
FPSLimit represents the variable or property setting specifying how many frames per second the player desires. This implementation also accounts for the possibility of an unlimited rate for Rendering by checking whether the FPSLimit is set to zero.
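Because FPSLimit is expressed in frames per second while the Loop compares times in seconds, the limit has to be converted into an interval before it is used. The following sketch shows one way that might be done; FrameInterval and DefaultFPSLimit are hypothetical helpers, and the refresh rate is simply passed in because retrieving it from the monitor is platform-specific.

```pascal
program FrameIntervalSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFDEF MSWINDOWS}{$APPTYPE CONSOLE}{$ENDIF}

// Hypothetical helper: converts an FPS limit into the minimum number of
// seconds between frames. A limit of zero means "unlimited" (no waiting).
function FrameInterval(const AFPSLimit: Integer): Double;
begin
  if AFPSLimit > 0 then
    Result := 1.0 / AFPSLimit
  else
    Result := 0.0;
end;

// Hypothetical helper: a default limit of double the monitor's refresh rate,
// following the guideline given earlier in this article.
function DefaultFPSLimit(const ARefreshRateHz: Integer): Integer;
begin
  Result := ARefreshRateHz * 2;
end;

begin
  WriteLn('Limit for a 60 Hz monitor: ', DefaultFPSLimit(60), ' FPS');
  WriteLn('Seconds between frames:    ', FrameInterval(DefaultFPSLimit(60)):0:5);
  WriteLn('Unlimited:                 ', FrameInterval(0):0:5);
end.
```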
There are countless other ways you can arguably improve on the Rate-Limited Game Loop, however…
The reason why Game Loops are a really bad idea, regardless of their complexity.
Regardless of how sophisticated your Game Loop becomes, you simply cannot escape one critically limiting and fundamental flaw: by using a Game Loop, you’re limiting all three processes to a single CPU core.
This means that your engine completely ignores the other cores available on the device. These days, even relatively inexpensive mobile devices have at least two CPU cores, most Desktop systems now have 4, and there are even 6 and 8 core CPUs already on the market… my main development system has 16 cores [2x 8 core CPUs].
This is why I have since chosen to reject the Game Loop entirely, which pretty much renders this entire section pointless. Well, it would be pointless were it not for the lessons you’ve hopefully learned from reading it.
A Parallel Game Engine
So, we’ve discussed the most common (somewhat lazy) method of moving a game engine along by way of a Game Loop, and we now know why that method is a bad idea. Let’s take a look at a much better approach, shall we?
Rather than having a single processing Thread looping through the three separate processes, let’s divide the processes up onto separate Threads.
This diagram (above) illustrates the point, with Physics and Logic rolled into one Thread, and Rendering on another.
The reason we bundle Physics and Logic together is that they are fundamentally two parts of the same process. Since we only ever need to update the game’s Logic when the Physical State of the game advances (such as when two objects collide, or cease colliding, and of course when World Object Positions and/or Angles change), it makes all the sense in the world to bundle them together on a single Thread. I name this combined Thread the “Simulation Thread”.
Rendering, on the other hand, has to be performed regardless of the Game State. For example, we need to render the Title Screens when the game launches, the Main Menu before you begin playing the game etc.
Have you spotted a problem yet?
If we’re Simulating on one Thread, and Rendering on another, how do we ensure we’re Rendering a consistent Simulation State?
Rendering a consistent Simulation State
Any competent game engine should employ a comprehensive Event Engine in order to facilitate communication between modules. Not only modules, in fact, as the Event Engine should also handle communications between Game Entities within the game itself.
Rather than me explaining once again what an Event Engine is and how it works, if you don’t already know, you can read my series of articles on Event-Driven, Asynchronous Development with Delphi and the LKSL.
Anyway, a good game engine will employ an Event Engine of some sort, and it is through the use of this Event Engine that we can ensure that our game engine is always rendering a consistent Simulation State.
This flowchart (above) shows how the Simulation Thread communicates with the Rendering Thread using our Event Engine.
It is critical that our Simulation Thread never communicates directly with the Rendering Thread, as we cannot assume that the binary contains the Rendering Thread. This is because the “dedicated server” platform is compiled from the same codebase as the game itself, except that the Rendering Thread is not compiled into the “dedicated server” executable.
So, by using the Event Engine to handle communication between Threads, we eliminate any potential conflict between the game’s executable and the server’s.
Now, once we’ve progressed the Physics and the Logic on a Tick in the Simulation Thread, we assemble an Event containing all of the Game Entities’ current render data (position, angle, linear and angular velocity) and dispatch that Event through the Event Engine. This Event also includes the Reference Time at which that render data was assembled (that’s actually very important information).
The Event Engine then passes the event along to the Rendering Thread, where those values will be used in the next Tick.
This way, the Rendering Thread is always operating on a complete (and consistent) set of data.
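To illustrate why handing over whole sets of data keeps things consistent, here is a heavily simplified sketch written for Free Pascal (Delphi may need System.-prefixed unit names). It is not the Event Engine described above: a single lock-protected “latest snapshot” stands in for it, the main thread plays the part of the Rendering Thread, and every name in it (TSnapshot, PublishSnapshot, LatestSnapshot and so on) is hypothetical.

```pascal
program SnapshotHandoffSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFDEF MSWINDOWS}{$APPTYPE CONSOLE}{$ENDIF}

uses
  {$IFDEF UNIX}cthreads,{$ENDIF} // Free Pascal needs this unit first for threads on Unix-like systems
  Classes, SysUtils, SyncObjs;

type
  // Hypothetical render data for one entity, stamped with the Tick that produced it
  TSnapshot = record
    Tick: Integer;
    X, Y: Double;
  end;

  TSimulationThread = class(TThread)
  protected
    procedure Execute; override;
  end;

var
  // Crude stand-in for the Event Engine: a lock guarding the latest complete snapshot
  GLock: TCriticalSection;
  GLatest: TSnapshot;

procedure PublishSnapshot(const ASnapshot: TSnapshot);
begin
  GLock.Acquire;
  try
    GLatest := ASnapshot; // the whole record is replaced under the lock
  finally
    GLock.Release;
  end;
end;

function LatestSnapshot: TSnapshot;
begin
  GLock.Acquire;
  try
    Result := GLatest;
  finally
    GLock.Release;
  end;
end;

procedure TSimulationThread.Execute;
var
  LSnapshot: TSnapshot;
begin
  LSnapshot.Tick := 0;
  while not Terminated do
  begin
    Inc(LSnapshot.Tick);
    LSnapshot.X := LSnapshot.Tick; // X and Y always advance together,
    LSnapshot.Y := LSnapshot.Tick; // so a consistent snapshot has X = Y
    PublishSnapshot(LSnapshot);
    Sleep(1); // crude Rate Limit that also yields spare time
  end;
end;

var
  LSimulation: TSimulationThread;
  LSeen: TSnapshot;
  LInconsistent, I: Integer;
begin
  GLock := TCriticalSection.Create;
  LInconsistent := 0;
  LSimulation := TSimulationThread.Create(False); // start running immediately
  try
    // The main thread plays the part of the Rendering Thread here
    for I := 1 to 20 do
    begin
      LSeen := LatestSnapshot;
      if LSeen.X <> LSeen.Y then
        Inc(LInconsistent);
      Sleep(5);
    end;
  finally
    LSimulation.Terminate;
    LSimulation.WaitFor;
    LSimulation.Free;
    GLock.Free;
  end;
  WriteLn('Inconsistent snapshots seen: ', LInconsistent);
end.
```

Because the reader only ever copies a snapshot that was written in one piece, it should never observe an X from one Tick paired with a Y from another, no matter how the two threads interleave.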
The Rendering Thread can then interpolate or extrapolate, based on the last complete set of data (and the Reference Time at which that data was assembled), where on the screen to display each Game Entity, and at what angle.
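As a minimal sketch of the extrapolation half of that idea, the snippet below projects an entity’s last known state forward to the moment a frame is drawn, using its velocities and the Reference Time stored with the data. The record layout, field names and the Extrapolate function are assumptions for illustration and are not AGE’s actual types.

```pascal
program ExtrapolationSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFDEF MSWINDOWS}{$APPTYPE CONSOLE}{$ENDIF}

type
  // Hypothetical layout for the render data carried by the Event
  TEntityRenderData = record
    X, Y: Double;                  // position
    Angle: Double;                 // radians
    VelocityX, VelocityY: Double;  // linear velocity, units per second
    AngularVelocity: Double;       // radians per second
    ReferenceTime: Double;         // when the Simulation Thread assembled this data
  end;

// Projects the last known state forward to the moment the frame is rendered.
function Extrapolate(const AData: TEntityRenderData;
  const ARenderTime: Double): TEntityRenderData;
var
  LElapsed: Double;
begin
  LElapsed := ARenderTime - AData.ReferenceTime;
  Result := AData;
  Result.X := AData.X + (AData.VelocityX * LElapsed);
  Result.Y := AData.Y + (AData.VelocityY * LElapsed);
  Result.Angle := AData.Angle + (AData.AngularVelocity * LElapsed);
  Result.ReferenceTime := ARenderTime;
end;

var
  LData, LDrawn: TEntityRenderData;
begin
  LData.X := 10.0;
  LData.Y := 5.0;
  LData.Angle := 0.0;
  LData.VelocityX := 2.0;
  LData.VelocityY := 0.0;
  LData.AngularVelocity := 1.0;
  LData.ReferenceTime := 100.0;
  // Render half a second after the data was assembled
  LDrawn := Extrapolate(LData, 100.5);
  WriteLn('X: ', LDrawn.X:0:2, '  Y: ', LDrawn.Y:0:2, '  Angle: ', LDrawn.Angle:0:2);
end.
```

This is why the Reference Time travels with the render data: without it, the Rendering Thread would have no way of knowing how far forward to project.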
Since we’re now using two separate Threads, we have given our game engine the ability to use two separate CPU cores, which means both processes can occur at different rates, entirely asynchronously, and we can smoothly display the Simulation on the screen.
Note that it is also best practice to set realistic Rate Limits on each respective processing Thread so that you yield “spare time” for other processes.
The only downside to this method is that it is architecturally more complex than the Game Loop method… however, this one downside (which I personally consider to be minor) is massively outweighed by its significant advantages.
What we have seen here is just one case in which there are many alternative approaches you can take when designing and developing your game engine. While the Game Loop approach might seem like the quickest option, and that might strike you as preferable in the beginning, you will eventually encounter one or more of the many shortcomings of that approach, and end up wishing you’d opted for the more complicated – but ultimately superior – approach from the start.
It is my personal opinion that the Game Loop is an architectural concept we should finally retire from the world of game development, and that it should be viewed as inherently flawed. Design your game engine to be Event-Driven and fully Asynchronous by the use of separate Threads. It may add a few hours to your development time, but this investment pays off in the long run.
I hope you have learned something from this article. The point of this series is to educate others based on my own experiences, and in so doing “justify” the time I would otherwise have wasted through the flawed design principles I investigated when developing my game engine.
While I’m aware that there aren’t many of us out there designing game engines these days, there are parallels you may recognize in other types of development projects… and the lessons game development teaches us have relevance throughout the world of programming.
As always, questions are welcome. If you would like clarification on any of the subject-matter raised in this article, please leave a comment and I shall try to answer as best I can.
Thank you for taking the time to read this article.