Future server optimizations and multi-threading
Personally, I think that threads are the root of all evil, especially in an MMO server. They make debugging very difficult, can cause serious and hard-to-trace bugs if not done properly, and the speed benefit is not that great (about 30-40% in practice, if done right).
All this being said, there are some cases where multi-threading is a good idea for an MMO server:
1. When you are processing a lot of data that does not depend on another thread's result.
2. When you are CPU limited.
3. When you want the fastest response time possible (lowest latency).
Some MMO servers use a blocking, threaded model. That means you have 1 thread for each player (or 1 thread for a number of players), and the server uses blocking sockets (the execution of the thread is suspended until there is incoming activity on the socket).
Other MMOs (like Eternal Lands) use a non-blocking, non-threaded model. That means a socket will not block program execution when it has no data; instead, you move on to the next socket to process the next player.
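To make the non-blocking, non-threaded model concrete, here is a minimal sketch of one pass over the player sockets. It is not the actual Eternal Lands code: the names set_nonblocking and poll_players_once are hypothetical, and message handling is reduced to counting the sockets that had data. It assumes a POSIX system.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Put a socket into non-blocking mode. Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* One pass over all player sockets (hypothetical helper).
 * Returns the number of sockets that had data waiting. */
int poll_players_once(const int *player_fds, int count,
                      char *buf, size_t buf_size)
{
    int served = 0;

    for (int i = 0; i < count; i++) {
        ssize_t n = recv(player_fds[i], buf, buf_size, 0);

        if (n > 0) {
            /* A real server would parse and handle the message here. */
            served++;
        } else if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            /* Nothing to read: do not wait, move on to the next player. */
            continue;
        }
        /* n == 0 (peer closed) or another error: a real server would
         * disconnect the player here. */
    }
    return served;
}
```

The important part is the EAGAIN/EWOULDBLOCK branch: a socket with no data costs one system call and the loop moves on, so a single thread can service every player.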
A third category of MMO servers uses a hybrid model: non-blocking, threaded.
This is what I plan to work on, after we are done with the update.
So, how will it work?
Right now, there are two routines that take most of the CPU time: the path finding, and the range calculations (the part that determines who sees who).
And it so happens that those two routines do not depend on previous results; they can be done in parallel.
The range calculation is basically done like this:
for each map
    for each player on the map
        test to see if you can see each other player on the same map
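The loop above can be sketched in C roughly as follows. The player_t layout, the VIEW_RANGE constant and the square-distance visibility test are all assumptions made for illustration; the real server's visibility rule is not described here. The sketch only counts who sees whom, where a real server would update each player's list of visible actors.

```c
#include <stdlib.h>

/* Hypothetical player record: only what the sketch needs. */
typedef struct {
    int map_id;
    int x;
    int y;
} player_t;

#define VIEW_RANGE 10 /* assumed view distance, in tiles */

/* Assumed visibility rule: same map and within a square of VIEW_RANGE. */
int can_see(const player_t *a, const player_t *b)
{
    return a->map_id == b->map_id &&
           abs(a->x - b->x) <= VIEW_RANGE &&
           abs(a->y - b->y) <= VIEW_RANGE;
}

/* Range calculation for a single map: counts (observer, observed) pairs. */
int count_visible_pairs_on_map(const player_t *players, int count, int map_id)
{
    int pairs = 0;

    for (int i = 0; i < count; i++) {
        if (players[i].map_id != map_id)
            continue;
        for (int j = 0; j < count; j++) {
            if (j == i || players[j].map_id != map_id)
                continue;
            if (can_see(&players[i], &players[j]))
                pairs++;
        }
    }
    return pairs;
}

/* Outer loop: every map is handled independently of the others. */
int count_visible_pairs(const player_t *players, int count, int map_count)
{
    int total = 0;

    for (int m = 0; m < map_count; m++)
        total += count_visible_pairs_on_map(players, count, m);
    return total;
}
```

Note that count_visible_pairs_on_map reads only the players of one map and writes nothing shared, which is what makes the per-map work independent.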
How can this be switched to multi-threading?
Well, we can have multiple threads (as many as the number of physical cores in the system), and each one will do one map. Once it finishes, it will move to the next unprocessed map. Of course, there will be some state table so that two or more threads won't do the same map, wasting time and causing conflicts. This table needs to be locked each time it is accessed (read or write) to prevent other threads from accessing it at the same time and messing things up.
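A minimal sketch of that state table, using POSIX threads and a mutex, could look like this. The names map_table_t, claim_next_map, range_worker and process_all_maps are hypothetical, the fixed limits are arbitrary, and the actual range calculation is replaced by a counter so the example stays short.

```c
#include <pthread.h>
#include <string.h>

#define MAX_MAPS 64    /* arbitrary limits for the sketch */
#define MAX_THREADS 16

/* The shared state table: one "claimed" flag per map, guarded by a mutex. */
typedef struct {
    pthread_mutex_t lock;
    int map_count;
    int claimed[MAX_MAPS];
    int times_processed[MAX_MAPS]; /* stands in for the real range work */
} map_table_t;

/* Claim the next unprocessed map, or return -1 if none are left.
 * The table is locked for the whole read-and-update. */
int claim_next_map(map_table_t *t)
{
    int map_id = -1;

    pthread_mutex_lock(&t->lock);
    for (int m = 0; m < t->map_count; m++) {
        if (!t->claimed[m]) {
            t->claimed[m] = 1;
            map_id = m;
            break;
        }
    }
    pthread_mutex_unlock(&t->lock);
    return map_id;
}

/* Worker thread: keep claiming maps until the table is exhausted. */
void *range_worker(void *arg)
{
    map_table_t *t = arg;
    int m;

    while ((m = claim_next_map(t)) != -1) {
        /* The range calculation for map m would go here. Only the thread
         * that claimed m touches this slot, so no lock is needed. */
        t->times_processed[m]++;
    }
    return NULL;
}

/* Run thread_count workers over map_count maps. Returns 0 on success. */
int process_all_maps(map_table_t *t, int map_count, int thread_count)
{
    pthread_t threads[MAX_THREADS];
    int started = 0;

    if (map_count < 0 || map_count > MAX_MAPS ||
        thread_count < 1 || thread_count > MAX_THREADS)
        return -1;

    memset(t->claimed, 0, sizeof t->claimed);
    memset(t->times_processed, 0, sizeof t->times_processed);
    t->map_count = map_count;
    pthread_mutex_init(&t->lock, NULL);

    for (int i = 0; i < thread_count; i++) {
        if (pthread_create(&threads[started], NULL, range_worker, t) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);

    pthread_mutex_destroy(&t->lock);
    return started > 0 ? 0 : -1;
}
```

The lock is held only while a map is being claimed, not while it is being processed, so the threads spend almost all of their time working in parallel.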
The path finding is slightly more complicated. Why? Because right now, we use an "as you go" model, where the path finding routine is called whenever a path needs to be determined.
However, most of the time it is not necessary to have a path right away, although in some cases it is (such as for determining whether you can access a certain location or not).
So the path finding function, which currently looks like:
int find_path(int player_id, int target_x, int target_y)
will be modified to look like:
int find_path(int player_id, int target_x, int target_y, int urgency)
The urgency value is a boolean: 1 means the result is needed right now, while 0 means it can wait for a while.
If the value is 0, the server will not attempt to calculate the path; it will just set a flag on the player structure to indicate that a path is needed.
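Here is a rough sketch of what the modified function could look like. Only the find_path signature comes from the text above; the player_t fields, the players array, the return codes (1 = computed now, 0 = deferred, -1 = bad player id) and the compute_path_now stand-in are assumptions. The stand-in just stores a Manhattan distance instead of running a real path finder.

```c
#include <stdlib.h>

#define MAX_PLAYERS 16 /* arbitrary size for the sketch */

/* Hypothetical player structure: only the fields the sketch needs. */
typedef struct {
    int x, y;
    int target_x, target_y;
    int path_needed; /* 1 while a deferred path request is pending */
    int path_length; /* result of the stand-in path calculation */
} player_t;

player_t players[MAX_PLAYERS];

/* Stand-in for the real path finder (assumption): Manhattan distance. */
void compute_path_now(player_t *p)
{
    p->path_length = abs(p->target_x - p->x) + abs(p->target_y - p->y);
    p->path_needed = 0;
}

/* Returns 1 if the path was computed immediately, 0 if it was deferred,
 * and -1 if player_id is out of range (return codes are assumptions). */
int find_path(int player_id, int target_x, int target_y, int urgency)
{
    player_t *p;

    if (player_id < 0 || player_id >= MAX_PLAYERS)
        return -1;

    p = &players[player_id];
    p->target_x = target_x;
    p->target_y = target_y;

    if (urgency) {
        compute_path_now(p); /* the caller needs the result right now */
        return 1;
    }

    p->path_needed = 1; /* just mark it; the path is calculated later */
    return 0;
}
```

Callers that only need the player to start walking eventually would pass 0, while a reachability check would pass 1 and read the result immediately.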
And after everything has been read from the sockets, we can have a global path finding routine, which checks every actor and calculates the path for those who need it. Then each thread can work on a different path, in a similar manner to the range calculation, since the paths do not depend on each other.
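The deferred pass could be sketched as below. This version splits the actors between workers by index (worker 0 takes actors 0, N, 2N, ..., worker 1 takes 1, N+1, ...) rather than using a locked table, which is a simplification chosen for the example and not necessarily how the server will do it. The actor_t fields and resolve_pending_paths are hypothetical, and the path calculation is again a Manhattan-distance stand-in.

```c
#include <stdlib.h>

/* Hypothetical actor record: only what the sketch needs. */
typedef struct {
    int x, y;
    int target_x, target_y;
    int path_needed; /* set earlier when a non-urgent path was requested */
    int path_length; /* result of the stand-in path calculation */
} actor_t;

/* Resolve the pending paths that belong to one worker.
 * Worker `worker` of `worker_count` handles actors worker,
 * worker + worker_count, worker + 2 * worker_count, and so on, so no two
 * workers ever touch the same actor.
 * Returns the number of paths calculated, or -1 on bad arguments. */
int resolve_pending_paths(actor_t *actors, int count,
                          int worker, int worker_count)
{
    int resolved = 0;

    if (worker_count < 1 || worker < 0 || worker >= worker_count)
        return -1;

    for (int i = worker; i < count; i += worker_count) {
        actor_t *a = &actors[i];

        if (!a->path_needed)
            continue; /* nothing was requested for this actor */

        /* Stand-in for the real path finder (assumption). */
        a->path_length = abs(a->target_x - a->x) + abs(a->target_y - a->y);
        a->path_needed = 0;
        resolved++;
    }
    return resolved;
}
```

With a single thread you would call resolve_pending_paths(actors, count, 0, 1) once per server loop; with several threads, each one would call it with its own worker index.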
Currently there is really no need for this, because even with 750 player/bot connections and over 1300 AI entities, the server never went above 15% CPU.
However, this will slightly improve the response time (by a few ms), and will allow us to host even more players (maybe even up to 10-20K, depending on how many CPUs we have).