Introduction
It’s no secret that DayZ and the Arma franchise are Bohemia Interactive’s most popular titles. While they are both deep sandboxes with endless possibilities, they consist, at their core, of a few primary entities that players interact with the most. For this article’s purposes, we’ll be focusing on vehicles.
The vehicles in Arma Reforger represent years of continuous improvement in simulation, modeling, and configuration. They are the product of the combined efforts of many different departments and roles such as gameplay, network, physics, animation, render, and audio programmers, as well as our QA, designers, scripters, technical designers, sound designers, animators, and 2D / 3D artists. The vehicles are undeniably one of the best parts of the entire experience. We want to keep improving them and ensure a solid foundation for their future, be that in the air, water, or on the tracks. But it's impossible to move forward without reflecting on the past.
Without going too deep into the technical side of things, it's important to acknowledge that there were flaws with our previous implementations that our communities have been dealing with for a long time. Arma Reforger vehicles, as great as they are, have a networking architecture that mostly stems from previous Arma titles, which has limited our ability to fix some of their issues and improve them further. DayZ vehicles were the first step into a new architecture and were reworked with a new simulation in 2022, but the networking solution still has its own limitations.
While we have always aimed to create an optimal in-game experience, it became clear that our previous implementations were preventing us from making the vehicles the best that they could be. The choice was clear: we had to improve the architecture.
This improvement is going to be a significant change for the Arma Reforger community. We'd like to help you understand why we had to do this and why the vehicles will behave differently. It is a change that will take some time to fully realize, but we believe we will deliver something in the end that was never possible in our previous titles.
Vocabulary
Let’s begin with some basic terminology to ensure that everyone understands the new architecture.
Lerping - short for linear interpolation; the process of smoothly moving an object from one position to another over a short time instead of teleporting it.
Client (Owner Proxy) - The client driving the vehicle.
Server (Authority) - The representation of the vehicle on the server.
Client (Non-Owner Proxy) - The representation of all other clients and AI driving all other vehicles (these are not simulated but are lerped to their last reported position).
Move - A set of inputs to any simulation and its outputs (position, velocity of the vehicle, etc.).
Internal state - The value of the variables that make up the memory of all systems inside the vehicle (gearbox, suspensions, wheels, etc.).
Signals - The value of the variables that make up the visual representation of a vehicle (the rotation of wheels or RPM of an engine).
Source of truth - Where the responsibility for defining the internal state and position of the vehicle is located.
Client authoritative architecture - The client simulates the vehicle and sends the server the signals of their vehicle; the server distributes it to all other clients.
Server authoritative architecture - Synchronization between the server and client about the vehicle's state, with the server making the final decision.
Network delay - Everything the client sees around them is in the past due to the nature of the speed of light; information takes time to arrive and be processed.
Network unreliability - Packets get dropped when they travel over the internet.
Networked environment - Simulation running over a network, which is by its nature delayed and unreliable.
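To make the first term concrete, here is a minimal sketch of lerping in Python. The Vec2 type, the function name, and the numbers are hypothetical and chosen only for illustration; they are not taken from the engine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vec2:
    x: float
    y: float


def lerp(a: Vec2, b: Vec2, t: float) -> Vec2:
    """Linear interpolation: t=0 gives a, t=1 gives b; t is clamped to [0, 1]."""
    t = max(0.0, min(1.0, t))
    return Vec2(a.x + (b.x - a.x) * t, a.y + (b.y - a.y) * t)


old_pos = Vec2(0.0, 0.0)    # where the proxy is currently drawn
new_pos = Vec2(10.0, 4.0)   # last position reported for the vehicle
halfway = lerp(old_pos, new_pos, 0.5)
print(halfway)
```

Calling lerp every frame with a growing t moves the object gradually towards the reported position rather than snapping it there.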
AS IS
Let's begin with the technical details of our current implementations.
Arma Reforger
Like all previous Arma titles, Arma Reforger runs vehicles on a client-authoritative architecture. As previously defined, this means that the client simulates the vehicle they are driving, sends the server the signals of their vehicle, and the server distributes it to all the other clients. In an attempt to fix some inconsistencies, these signals were sent from the (owner-proxy) client to the server using reliable transmission and were distributed to all other (non-owner proxy) clients using reliable broadcasts, causing network strain on weak connections for both uploads and downloads.
Advantages
The benefits of this approach are:
Immediate and authoritative control. Any input you make locally is guaranteed to be executed and the output it produces is taken as the truth.
Consistent behavior on the client regardless of connection issues, non-deterministic behavior, and machine or server performance.
Simplicity of implementation. The vehicle is only simulated on the client and its state is sent to the server for distribution.
Disadvantages
Let’s discuss the disadvantages.
Cheating
Cheating is a serious threat to the integrity of any game. It's a never-ending fight that we take very seriously. Anything with a client-side architecture is much easier to exploit.
Unsolvable issues
There are a number of unsolvable issues, which can be traced to the fact that the source of truth for the simulation is spread across multiple sources. Let’s take a look at a few examples.
Unsolvable collisions
This occurs when you are able to teleport yourself inside another vehicle. This “desync ramming” should be a familiar sight to those of you who’ve played our previous Arma titles. Here’s an example from booglog on YouTube. Here’s another from PsySin on YouTube.
Unsolvable collision responses
The response to such collisions is often catastrophic. This is most obvious when collision damage is implemented.
Simulation handover
When a vehicle is spawned, it is simulated by the server. When a client enters the vehicle, they start simulating it. In order for the client to properly simulate the vehicle, they need the internal state of the vehicle. This is a process called handover.
The same occurs when a client exits a vehicle, meaning they send the vehicle’s internal state to the server. This is an unreliable process that can often end up with the vehicle in an undefined state. For example, the position of the vehicle may be slightly compressed in order to save data, the client may disconnect and never transmit the state, etc. This can result in the burrowing of vehicles underground upon entering them, which is a well-known issue in our community.
Continual simulation
The client can lag, suffer packet loss, or simply take time to complete the handover process. This can result in the vehicle behaving in an undefined way.
Who can forget the infamous flying car duels that were possible in the past due to this issue? Here’s an example from TheTimidShade on YouTube.
Inability to implement further features
We are prevented from implementing further features in terms of physics simulation and networking. For example:
Towing - imagine two clients pulling on a rope. One client would report one result, while the other would report something completely different. Such an issue is impossible to solve.
Vehicles as cargo - imagine an actively controlled car being loaded onto a boat or plane. It would be impossible to make this feature consistent in this architecture due to unsolvable collisions and their responses.
Damage handling - as we have already seen, the damage is completely inconsistent.
Impossible to limit bandwidth - A given connection only has so much bandwidth. Any packet that exceeds this limit would be dropped. Due to reliable transmission, it would have to be sent again. This prevents us from scaling our networking solution. Furthermore, the fact that the data sent in each direction is reliable makes it impossible to limit the bandwidth without causing delays.
DayZ
We want to briefly reflect on DayZ since it uses server authoritative architecture, which was a significant improvement over the Real Virtuality implementation. The algorithm, which we will call "delta vector", worked like this:
Client sends only inputs and simulates their vehicle locally.
Server simulates with given inputs and continually sends Client their resulting position.
If the Client detects a discrepancy in position, it calculates the difference (i.e. the vector that, if applied to the Client's position, would give the server's position).
It then applies this delta vector to its entire history.
Lastly, it lerps to the last position in history.
This method works relatively well for characters because they move slowly. It does not work well for vehicles, however, which are physically simulated, fast, complex, and chaotic. Note that only the position is sent to the client, not the entire internal state of the vehicle, so any system inside the vehicle would now be irreparably desynchronized.
Systems inside vehicles have complex interactions. For example, wheels have points of contact with the ground, which then affect the steering of the vehicle, etc. The error in the internal state and position of the vehicle would continue to accumulate, causing the client to always reconcile incorrectly and forcing them to lerp to the server's position. All of this accumulates, causing the infamous "floating" feeling and loss of control.
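The "delta vector" steps can be sketched as follows. This is a simplified illustration in Python, not DayZ code: the function name, the tick-indexed history, and the numbers are assumptions.

```python
def apply_delta_vector(history, tick, server_pos):
    """Shift every predicted position by (server - client) measured at `tick`.

    history: dict mapping tick -> (x, y) predicted position, mutated in place.
    Returns the position the client should lerp towards.
    Only positions are shifted; the vehicle's internal state (gearbox,
    wheels, suspension) is never touched, so it stays desynchronized.
    """
    cx, cy = history[tick]
    dx, dy = server_pos[0] - cx, server_pos[1] - cy
    for t in history:
        x, y = history[t]
        history[t] = (x + dx, y + dy)
    return history[max(history)]


# The client predicted three ticks; the server reports a different tick 1.
history = {1: (1.0, 0.0), 2: (2.0, 0.0), 3: (3.0, 0.0)}
target = apply_delta_vector(history, tick=1, server_pos=(0.5, 0.0))
print(target)
```

The whole history slides by the same offset, which is cheap but assumes that everything after the corrected tick would have played out identically. For a chaotic vehicle simulation, that assumption rarely holds.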
TO BE
To summarize, the challenge was immense. On one side, client-side vehicles were perceived as good and as a highlight of Arma Reforger’s gameplay. On the other, we knew certain issues with them were unsolvable and that it was only a matter of time before the existential problem of cheating came up.
Architecture requirements
Let's establish some requirements for our architecture.
As a controlled entity operating in a networked environment, the vehicle must follow some rules and principles:
Immediate control - when the client makes an input, the vehicle must react immediately, irrespective of ping, etc. This means we must simulate the vehicle on the client.
Server authority - we consider the server to be the only source of canonical truth; it is considered the final, true state of the simulation. This means that we must also simulate the vehicle on the server.
What happens when the client and server don't agree on where the vehicle should be in the world, or in its internal state? This brings us to our next requirement:
The ability to reconcile differences in simulation: differences should be resolved, with the server taken as the final truth while ensuring immediate control. This is the problem we are trying to solve.
These differences may occur:
Because of the networked environment; the fact that one party sees everything in the past and that information can be lost.
When either party experiences packet loss or performance degradation.
Due to non-deterministic behavior - given the same inputs to a vehicle, the output (internal state or position) may be different.
Finally, we must ensure that we utilize unreliable transmission and replicated variables at every step of the process in order to have efficient networking and still allow for bandwidth limiting.
"Full Circle" server authoritative architecture
To satisfy these requirements, we will use the following algorithm, which we’ll refer to as "full circle replay".
(Client - Owner Proxy) Produce move on the client from key presses
Predict the move of the client by simulating the vehicle (also called client side prediction).
Send the move to the server.
Store the move into the client's history.
(Server - Authority) Simulate the move received from the client
If the server's internal state and position match those predicted by the client, the server approves the move.
If the internal state or position doesn't match, then the server issues a correction containing the server's internal state of the vehicle.
The server informs all other Non-Owner Proxies of the position and signals of the vehicle.
(Client - Owner Proxy) Reconciliation
If the client receives approval, it deletes the given move and all those before it from its history.
If the client receives a correction, it starts the Reconciliation:
Rewinds its simulation to the internal state sent by the server.
Replays all the moves in its client history.
Smoothly teleports from its current position to the newly predicted position.
(Client - Non-Owner Proxy) Upon receiving an update lerp between the old and the new position.
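The owner-proxy and server steps above can be sketched in Python as a toy one-dimensional vehicle. All class names, the time step, and the tolerance are hypothetical. The sketch compares only position, whereas the real system also compares internal state. It assumes a correction describes the server's state after the corrected move, so only later moves are replayed.

```python
from dataclasses import dataclass

DT = 0.5            # hypothetical fixed time step in seconds
TOLERANCE = 1e-6    # hypothetical allowed client/server difference


@dataclass
class State:
    position: float = 0.0
    velocity: float = 0.0


@dataclass
class Move:
    move_id: int
    throttle: float   # input
    position: float   # output predicted by the client


def simulate(state: State, throttle: float) -> State:
    """Toy vehicle: throttle changes velocity, velocity changes position."""
    velocity = state.velocity + throttle * DT
    return State(state.position + velocity * DT, velocity)


class Server:
    def __init__(self) -> None:
        self.state = State()

    def receive(self, move: Move):
        """Simulate the move, then approve it or answer with a correction."""
        self.state = simulate(self.state, move.throttle)
        if abs(self.state.position - move.position) <= TOLERANCE:
            return ("approve", move.move_id, None)
        snapshot = State(self.state.position, self.state.velocity)
        return ("correct", move.move_id, snapshot)


class OwnerClient:
    def __init__(self) -> None:
        self.state = State()
        self.history: list[Move] = []
        self.next_id = 0

    def produce_move(self, throttle: float) -> Move:
        """Predict locally, store the move in history, return it for sending."""
        self.state = simulate(self.state, throttle)
        move = Move(self.next_id, throttle, self.state.position)
        self.next_id += 1
        self.history.append(move)
        return move

    def on_reply(self, kind: str, move_id: int, server_state) -> None:
        # Moves up to and including move_id are settled either way.
        self.history = [m for m in self.history if m.move_id > move_id]
        if kind == "correct":
            # Rewind to the server's state, then replay the pending moves.
            state = server_state
            for m in self.history:
                state = simulate(state, m.throttle)
                m.position = state.position
            self.state = state  # a real client would lerp here, not snap


client, server = OwnerClient(), Server()
in_flight = [client.produce_move(1.0) for _ in range(3)]
for move in in_flight:
    client.on_reply(*server.receive(move))
print(client.history, client.state.position, server.state.position)
```

With identical simulations and no disturbance, every move is approved and the history empties out. A correction only appears when the server's result differs from the client's prediction.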
Let's examine the details of the implementation in the following paragraphs.
Replaying
I would like to draw your attention to what happens when the client receives a correction. This is where a significant amount of our effort has been spent. Let's remember that a correction is intended for some move produced in the past, as it takes time for inputs to arrive on the server and be simulated and corrected.
Replaying is necessary because the client must correctly guess what the internal state and position of the vehicle on the server will be once the server has received all the moves still to come. Otherwise, the requirement of "immediate control" would collapse, since all we can do is put the vehicle in the server’s position, which you would then perceive as the vehicle being constantly pulled back.
Example
Let's look at an example.
Imagine a simple 2D world for our vehicle. We record the move with the following data:
Each of these inputs was simulated, recorded in our local history, and then sent to the server as per the algorithm. Remember that it takes some time for these inputs to arrive on the server. As the server receives the first input, it consumes it and simulates the vehicle. In this example, however, someone has collided with you on the server and forcibly moved you back.
This means that the server will issue a correction for the first move, which looks like this:
We must now reconcile. We have already made two inputs and sent them to the server. Now we have to correctly guess what the position will be on the server when it simulates them.
Let's now replay all the moves we have in our history and lerp to the resulting position.
As the server continues to receive and simulate the client's input, it now gets the same results and issues no more corrections, meaning that our replay was accurate.
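The scenario above can be traced with made-up numbers. In this Python sketch the grid positions, the inputs, and the two-unit push from the collision are all hypothetical; each input simply moves the vehicle by one step.

```python
def simulate(position, step):
    """Toy 2D simulation: an input is a step added to the position."""
    return (position[0] + step[0], position[1] + step[1])


# Client: predict three moves and record them in the local history.
inputs = [(1, 0), (1, 0), (0, 1)]
position = (0, 0)
history = []  # entries are (move_id, input, predicted position)
for move_id, step in enumerate(inputs):
    position = simulate(position, step)
    history.append((move_id, step, position))

# Server: a collision pushed the vehicle back two units before move 0
# was simulated, so its result for move 0 differs from the prediction.
server_position = simulate((-2, 0), inputs[0])
correction = (0, server_position)  # (corrected move id, server position)

# Client: rewind to the corrected position, replay the moves after it.
corrected_id, replayed = correction
for move_id, step, _ in history:
    if move_id > corrected_id:
        replayed = simulate(replayed, step)

# Server: the two remaining moves arrive and are simulated later.
for step in inputs[1:]:
    server_position = simulate(server_position, step)

print(position, replayed, server_position)
```

The original prediction ends at (2, 1), while both the replayed prediction and the server end at (0, 1). The client lerps from the former to the latter and no further corrections are needed.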
Physically simulated vehicles
Where all this becomes significantly more difficult is with physically simulated vehicles, which not only operate in a 3D space but are also complex and chaotic simulations.
Determinism
Determinism was a significant hurdle to overcome. The goal was that, given the same initial state of the vehicle and exactly the same inputs, the internal state and position of the vehicle should be the same. Otherwise, we would be constantly issuing expensive corrections.
The following images show our struggles during early development. Notice how much divergence there was between different attempts, even though the exact same inputs were simulated for all of them.
The result was that we were able to play the simulation for many hours at a time and still had 100% determinism in the internal state and position of the vehicle. This is true even with simple collisions with the static world across different machines and platforms.
The left is the client, the right is the server. Follow the topmost positional distance graph on the right (white value). The client and server are always in sync.
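As a small illustration of what determinism means here, the Python sketch below runs a toy simulation twice with the same inputs and time step, then once more with a different time step. It does not describe how the engine achieved determinism; the integrator and the numbers are assumptions, and step size is only one of many possible sources of divergence.

```python
def run(inputs, dt):
    """Toy integrator: returns (position, velocity) after all inputs."""
    position, velocity = 0.0, 0.0
    for throttle in inputs:
        velocity += throttle * dt
        position += velocity * dt
    return position, velocity


inputs = [1.0] * 10

first = run(inputs, 0.1)
second = run(inputs, 0.1)
# Same initial state, same inputs, same step: identical result.
print(first == second)

# Same inputs over the same total time, but ticked in half-size steps.
doubled = [throttle for throttle in inputs for _ in range(2)]
finer = run(doubled, 0.05)
# The velocity agrees up to rounding, but the position has drifted apart.
print(first[0], finer[0])
```

Even in this tiny example, ticking the same inputs on a different schedule yields a different position, which is the kind of difference that would force a correction.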
Replaying physically simulated vehicles
To give you some context, all physics engines, for the sake of optimization and consistency, tick their physics scenes/worlds at very specific times. This fact prevented us from replaying the vehicle properly in previous titles. Without replaying, we cannot recover from lost packets, stutters, or any desynchronization events.
The following video shows how the vehicle behaves when we disable corrections altogether. Notice how the positional distance spikes, and that we are completely desynchronized when we hit the vehicle (a non-deterministic event).
So we send a correction in an attempt to recover from our desynchronization. Remember, however, that the client is always in the future compared to the server, and that any correction is perceived as the vehicle being pulled back constantly. We can simulate how that would look by enabling corrections.
Notice how we can never zero out the positional difference. It always remains, and the snapping is clearly visible. This is what the lack of replaying looks like.
During our development, we were able to find a way to tick the simulation of the vehicle away from the main loop. This allows us to fully reset the vehicle to the server's position, replay its inputs, and accurately predict the vehicle's internal state and position, thereby closing the loop in the algorithm. Note that any difference in this replayed state of the vehicle would eventually accumulate into a difference in position. Replaying allows the client to correctly predict what the state will be on the server after it plays the same inputs (once they arrive) and therefore remain in sync.
To demonstrate this in action, the following video simulates a 3% packet loss. As a result, the server will lose some inputs and the simulation will deviate. The crucial thing to look for is the eventual zeroing out of position. This means that the client predicted correctly.
Advantages of server authoritative vehicles and "full circle replay"
The advantages can be summarized as meeting all requirements and allowing further feature development.
Cheating prevention
Cheating is simply impossible and the experience cannot be compromised. The client sends their inputs and the server simulates them. Inconsistencies are impossible, as the server is the only source of truth.
Here’s an example of our internal "un-flip cheat" where we flip the car onto its wheels on the client. Notice how the server authority prevents us from doing this and puts us back on the roof.
Solvable collisions and collision responses
Collisions are not catastrophic like they were before, and collision responses can finally be resolved properly. Here’s an example of vehicle-on-vehicle collisions.
Here’s an example of how the collision response is handled. As you can see, the appropriate amount of damage is applied.
Continual simulation
There is no handover of the simulation. The vehicles are always simulated, no matter if the client disconnects, has a bad connection, or has low fps. What happens on the server is taken as the final state of the simulation.
Here’s an example of a client with low fps, high packet loss, and high ping. Notice that the server is still playing the simulation smoothly.
Networking impact
The impact on network traffic is much less severe with our new architecture, and bandwidth limiting is possible.
Only inputs are sent from the (owner proxy) client to the server, and this is done unreliably. Previously, all signals were sent, and they were sent reliably. This means that the server doesn't send acknowledgment packets for each of the client’s inputs. The client also doesn't resend them if they are lost. Lastly, inputs are much smaller than signals.
The same is true for traffic from the server to the (non-owner proxy) clients, where signals are distributed via replicated states (compared to all signals being sent as reliable broadcasts). This means that each client receives only the latest signals and does not need to send an acknowledgment packet for each. Finally, a great deal of work has gone into compressing the vehicle signals, which has significantly reduced their size.
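The "only the latest signals" idea can be sketched as a receiver that keeps the newest state and ignores anything older, with no acknowledgments or resends. This Python sketch is an assumption about the general technique; the class, the sequence numbers, and the signal names are hypothetical and do not describe the engine's replication API.

```python
class ReplicatedSignals:
    """Keeps only the newest signal snapshot received for one vehicle."""

    def __init__(self) -> None:
        self.sequence = -1
        self.signals: dict[str, float] = {}

    def receive(self, sequence: int, signals: dict[str, float]) -> bool:
        """Apply a snapshot unless a newer one has already been applied."""
        if sequence <= self.sequence:
            return False  # stale or duplicate packet: drop it silently
        self.sequence = sequence
        self.signals = signals
        return True


proxy = ReplicatedSignals()
proxy.receive(1, {"rpm": 1200.0})
# Snapshot 2 is lost on the way; snapshot 3 arrives and is applied anyway.
proxy.receive(3, {"rpm": 1800.0})
# A delayed copy of snapshot 2 shows up late and is ignored.
late_applied = proxy.receive(2, {"rpm": 1500.0})
print(proxy.signals, late_applied)
```

Because every snapshot carries the full current state, a lost packet costs nothing beyond a slightly older picture until the next one arrives.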
Disadvantages of server authoritative vehicles and "full circle replay"
We want to be transparent about the shortcomings of our architecture. Some issues can be improved in the future, while others need to be managed.
Collisions during replaying
The collision detection of the vehicle’s chassis is not performed when the client is in the process of replaying. This can lead to a number of visual problems, such as temporarily burrowing through terrain (particularly visible when the vehicle is on its roof), driving through trees, etc.
This can be solved by resolving collisions during replay.
Complex collisions
Complex collisions include any collision with two or more objects which, due to the nature of our physics engine, are non-deterministic and will therefore cause a deviation in the state of the vehicle. This can happen when intentionally driving over branches and similar complex colliders.
The following video demonstrates what the consequences of such desynchronization would look like by using packet dropping to cause it.
Collision with dynamic objects
For now, the client does not simulate other vehicles. They simply lerp them to their last known position, while the server simulates all vehicles. This causes the client to react to a collision with another vehicle as if it were a solid wall, meaning that the rebound force will be greater on the client than on the server. This can lead to some unsavory snapping when combined with the inability to solve collisions during replaying.
It is possible to improve this by simulating other vehicles on the client.
Water physics
The implementation of water physics is non-deterministic and doesn't allow for proper replaying. A temporary fix has been implemented that increases the distance allowance when a vehicle is in water, allowing for a slightly smoother experience for the client while still maintaining server authority. We acknowledge it's an issue that we'll fully address later.
Notice the snapping when entering and exiting the water, or when colliding with a rock.
This can be solved by making the water physics deterministic.
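The temporary fix can be pictured as a larger distance allowance while the vehicle is in water. The Python sketch below is illustrative only; the function name and both threshold values are hypothetical.

```python
import math

BASE_ALLOWANCE = 0.05    # metres, hypothetical value on land
WATER_ALLOWANCE = 0.50   # metres, hypothetical relaxed value in water


def needs_correction(client_pos, server_pos, in_water: bool) -> bool:
    """Server-side check: correct only when the gap exceeds the allowance."""
    allowance = WATER_ALLOWANCE if in_water else BASE_ALLOWANCE
    return math.dist(client_pos, server_pos) > allowance


client_pos = (100.0, 2.0, 50.0)
server_pos = (100.2, 2.0, 50.0)
print(needs_correction(client_pos, server_pos, in_water=False))
print(needs_correction(client_pos, server_pos, in_water=True))
```

The server stays authoritative in both cases. In water it simply tolerates a larger gap before forcing the client to snap.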
Low fps or packet loss
Naturally, if the client or server runs below 5-10 fps, then the physics simulation will be unable to tick the same number of times in the same time frame and will start giving different results, forcing a correction.
Here’s an example of what that looks like. This clearly demonstrates why server performance is very important.
Server performance is a broad topic that is being actively addressed. It should be noted that vehicle simulation itself has a negligible impact on server performance.
If the client experiences packet loss, their inputs will not arrive. Similarly, if the server's corrections do not arrive in a timely manner, then the correction on the client will result in snapping huge distances.
Here’s a short example of what a 10% packet loss and around 120 ms of round trip time look like.
Removal of non-deterministic features
Wheel surface noise has been removed as it introduced a large amount of non-determinism. Here’s a comparison:
Engine startup randomization has been removed because the type of randomness it uses is not deterministic. Here’s a comparison:
Both can eventually be reintroduced.
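One general way to make randomness replayable is to derive it from a seed that both sides share. The Python sketch below illustrates that idea only; it is not a statement of how these features will be reintroduced, and the function name, the seed source, and the delay range are assumptions.

```python
import random


def startup_delay(seed: int) -> float:
    """Pseudo-random engine startup delay in seconds, fixed by the seed."""
    rng = random.Random(seed)          # private generator, no global state
    return 0.5 + rng.random() * 0.5    # somewhere in [0.5, 1.0)


shared_seed = 42  # hypothetical value known to both client and server
client_delay = startup_delay(shared_seed)
server_delay = startup_delay(shared_seed)
print(client_delay == server_delay)
```

The result still looks random from one seed to the next, but a replay with the same seed reproduces it exactly.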
Future development plans
Scheduling simulation of proxies
The biggest improvement to the overall experience could be made by scheduling the simulation of other vehicles around the client so that we get a better collision response. To be clear, this issue can't be fully resolved, as the client operates in a networked environment.
Resolving collisions during replaying
Another big potential improvement could be in the resolution of collisions during replaying. This would allow for better replaying and a smoother experience for the client when a correction occurs. It's a complex topic but there are multiple implementations we can build up to.
Lerping and API
The current lerping is quite simple and can be improved. The camera can snap quite badly when lerping over long distances, so a floating camera is necessary to smooth out the experience. There is also no API yet for further features that need to integrate with this system. This is a long-term goal we want to follow.
Adjacent development plans
The helicopters use this method as well and are actively being developed with it. As the vehicles now operate in a state-based networking system and have custom compression, we can further develop the scheduling of networking states and "network LODs". This allows us to have proper bandwidth limiting and further improve server performance.
Mod guide
Unfortunately, this change breaks current vehicle mods. The good news is that this problem is easy to fix. In order for vehicle mods to work with server authoritative architecture, you must change the following components:
CarControllerComponent to CarControllerComponent_SA
VehicleWheeledSimulation to VehicleWheeledSimulation_SA
NwkVehicleMovementComponent to NwkCarMovementComponent
If you don’t make these changes, you will receive warnings and vehicles will not work as intended.
Conclusion
Having weighed the pros and cons of each architecture, we believe this is the right choice. We are aware of its shortcomings and have a clear path towards improvement.
We have opened the door to many exciting possibilities and features while closing the door to many old bugs that have plagued both Arma and DayZ vehicles for a very long time.
We also want to be transparent about this big change with our community. Please try this implementation when our upcoming update is released and let us know what you think.