28 May 2007

Future server optimizations and multi-threading

Personally, I think that threads are the root of all evil, especially in an MMO server. They make debugging very difficult, can cause serious and hard-to-trace bugs if not done properly, and the speed benefit is not that great (about 30-40% in practice, if done right).

All this being said, there are some cases where multi-threading is a good idea for an MMO server:
1. When you are processing a lot of data that does not depend on other threads' results.
2. When you are CPU limited.
3. When you want the fastest response time possible (lowest latency).

Some MMO servers use a blocking/threading model. What that means is that you have 1 thread for each player (or 1 thread for a number of players), and the server uses a blocking socket (the execution of the thread is interrupted until there is some incoming activity on the socket).
Other MMOs (like Eternal Lands) use a non-blocking, non-threaded model. That means that a socket will not block the program execution when it doesn't have data, and, instead, you move to the next socket to process the next player.
A third category of MMO servers uses a hybrid model: non-blocking, threaded.
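
To make the non-blocking, non-threaded model more concrete, here is a minimal sketch in C using POSIX sockets. It is not the actual Eternal Lands code; the function names (set_nonblocking, read_player, service_players) and the buffer size are made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Puts a socket into non-blocking mode. Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Tries to read from one player's socket without waiting.
   Returns the number of bytes read, 0 if nothing is available right now,
   and -1 on error or if the peer closed the connection. */
long read_player(int fd, char *buf, size_t size)
{
    ssize_t n = recv(fd, buf, size, 0);
    if (n > 0)
        return (long)n;
    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return 0;   /* no data yet: just move on to the next player */
    return -1;
}

/* One pass of the main loop: visit every socket in turn.
   Returns how many sockets had data during this pass. */
int service_players(const int *fds, int count)
{
    char buf[256];  /* hypothetical message buffer size */
    int active = 0;

    assert(fds != NULL && count >= 0);
    for (int i = 0; i < count; i++) {
        if (read_player(fds[i], buf, sizeof buf) > 0)
            active++;   /* a real server would parse the message here */
    }
    return active;
}
```

The point is that a socket with no data costs one failed recv() call and nothing more, so a single thread can walk through all the players in a loop.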

This is what I plan to work on, after we are done with the update.

So, how will it work?
Right now, there are two routines that take most of the CPU time: the path finding, and the range calculations (the part that determines who sees who).

And it so happens that those two routines are not dependent on previous results; they can be done in parallel.

The range calculation is basically done like this:
for each map,
    for each player on the map
        test to see if you can see each player on the same map
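
In C, the loop above could look roughly like the sketch below. The player_t structure, the MAX_PLAYERS and VIEW_RANGE constants, and the square-distance visibility test are all assumptions made for the example, not the real server code.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_PLAYERS 64  /* hypothetical cap, for illustration only */
#define VIEW_RANGE  12  /* hypothetical view distance, in tiles */

typedef struct {
    int x, y;
    int map_id;
} player_t;

/* Assumed visibility rule: both players are within VIEW_RANGE tiles
   of each other on both axes. */
int can_see(const player_t *a, const player_t *b)
{
    return abs(a->x - b->x) <= VIEW_RANGE && abs(a->y - b->y) <= VIEW_RANGE;
}

/* Fills visible[i][j] for every pair of distinct players on one map.
   Returns the number of (viewer, target) pairs that can see each other. */
int update_ranges_for_map(const player_t *players, int count, int map_id,
                          unsigned char visible[MAX_PLAYERS][MAX_PLAYERS])
{
    int pairs = 0;

    assert(count >= 0 && count <= MAX_PLAYERS);
    for (int i = 0; i < count; i++) {
        if (players[i].map_id != map_id)
            continue;
        for (int j = 0; j < count; j++) {
            if (i == j || players[j].map_id != map_id)
                continue;
            visible[i][j] = (unsigned char)can_see(&players[i], &players[j]);
            pairs += visible[i][j];
        }
    }
    return pairs;
}
```

Note that one map's result never depends on another map's result, which is what makes it possible to hand different maps to different threads.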

How can this be switched to multi-threading?
Well, we can have multiple threads (as many as the number of physical cores in the system), and each one will do one map. Once it finishes, it will move to the next unprocessed map. Of course, there will be some state table so two or more threads won't do the same map, wasting time and causing conflicts. This table needs to be locked each time it is accessed (read/write) to prevent other threads from doing the same and messing things up.
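
Here is a minimal sketch of that idea using POSIX threads. Nothing here is implemented yet, so all the names (map_table_t, claim_next_map, range_worker, run_range_pass) and the NUM_MAPS / NUM_THREADS values are hypothetical, and the actual range calculation is replaced by a counter.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NUM_MAPS    16  /* hypothetical number of maps */
#define NUM_THREADS 4   /* hypothetical; ideally the number of physical cores */

enum { MAP_PENDING = 0, MAP_IN_PROGRESS, MAP_DONE };

typedef struct {
    pthread_mutex_t lock;           /* protects state[] */
    int state[NUM_MAPS];            /* the state table */
    int times_processed[NUM_MAPS];  /* stand-in for the real per-map work */
} map_table_t;

/* Claims the first unprocessed map under the lock.
   Returns its index, or -1 if no map is left. */
int claim_next_map(map_table_t *t)
{
    int claimed = -1;

    pthread_mutex_lock(&t->lock);
    for (int i = 0; i < NUM_MAPS; i++) {
        if (t->state[i] == MAP_PENDING) {
            t->state[i] = MAP_IN_PROGRESS;
            claimed = i;
            break;
        }
    }
    pthread_mutex_unlock(&t->lock);
    return claimed;
}

/* Worker thread: keeps claiming maps until none are left. */
void *range_worker(void *arg)
{
    map_table_t *t = arg;
    int map;

    while ((map = claim_next_map(t)) != -1) {
        /* Placeholder for the range calculation of this map. Only the
           thread that claimed the map touches this slot. */
        t->times_processed[map]++;

        pthread_mutex_lock(&t->lock);
        t->state[map] = MAP_DONE;
        pthread_mutex_unlock(&t->lock);
    }
    return NULL;
}

/* Runs one full pass over all maps. Returns 0 on success, or -1 if
   no worker thread could be started. */
int run_range_pass(map_table_t *t)
{
    pthread_t threads[NUM_THREADS];
    int started = 0;

    assert(t != NULL);
    pthread_mutex_init(&t->lock, NULL);
    for (int i = 0; i < NUM_MAPS; i++) {
        t->state[i] = MAP_PENDING;
        t->times_processed[i] = 0;
    }
    for (int i = 0; i < NUM_THREADS; i++) {
        if (pthread_create(&threads[i], NULL, range_worker, t) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);
    pthread_mutex_destroy(&t->lock);
    return started > 0 ? 0 : -1;
}
```

The lock is held only while the table is read or written, not during the calculation itself, so the threads spend very little time waiting on each other.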

The path finding is slightly more complicated. Why? Because right now, we use an "as you go" model, where the path finding routine is called whenever a path needs to be determined.
However, most of the time it is not necessary to have a path right away, although in some cases it is (such as for determining if you can access a certain location or not).
So then the path finding function, which currently looks like: int find_path(int player_id, int target_x, int target_y) will be modified to look like: int find_path(int player_id, int target_x, int target_y, int urgency)
The urgency value is a boolean, and 1 means I need the result right now, while 0 means that it can wait for a while.
If the value is 0, the server will not attempt to calculate the path, but just set a variable on the player structure that a path is needed.
And after all the stuff is read from the sockets, we can have a global path finding routine, which checks every actor and calculates the path for those who need it. Then each thread can do a different path, in a similar manner to the range calculation, since the paths are not dependent on each other.
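
A single-threaded sketch of the deferred model could look like this. Only the find_path signature comes from the plan above; the actor_t structure, the path_needed flag, process_pending_paths, and the return values are assumptions, and the real path finder is replaced by a simple Manhattan distance.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_ACTORS 8  /* hypothetical, for illustration only */

typedef struct {
    int x, y;
    int target_x, target_y;
    int path_needed;  /* set when a path was requested with urgency 0 */
    int path_length;  /* result of the stand-in path finder */
} actor_t;

actor_t actors[MAX_ACTORS];

/* Stand-in for the real path finder: stores the Manhattan distance. */
void compute_path(actor_t *a)
{
    a->path_length = abs(a->target_x - a->x) + abs(a->target_y - a->y);
}

/* Assumed return value: 1 if the path was calculated right away,
   0 if it was only marked as needed. */
int find_path(int player_id, int target_x, int target_y, int urgency)
{
    actor_t *a;

    assert(player_id >= 0 && player_id < MAX_ACTORS);
    a = &actors[player_id];
    a->target_x = target_x;
    a->target_y = target_y;

    if (urgency) {
        a->path_needed = 0;
        compute_path(a);
        return 1;
    }
    a->path_needed = 1;  /* will be handled by process_pending_paths() */
    return 0;
}

/* Called once per loop, after all the sockets have been read.
   Returns the number of paths calculated. */
int process_pending_paths(void)
{
    int done = 0;

    for (int i = 0; i < MAX_ACTORS; i++) {
        if (actors[i].path_needed) {
            compute_path(&actors[i]);
            actors[i].path_needed = 0;
            done++;
        }
    }
    return done;
}
```

In the threaded version, the loop in process_pending_paths is the part that would be split between the worker threads, with each thread taking a different actor.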

Currently there is really no need for this, because even with 750 player/bot connections and over 1300 AI entities, the server never went up more than 15% CPU.
However, this will slightly improve the response time (by a few ms), and will allow us to host even more players (maybe even up to 10-20K, depending on how many CPUs we have).

26 May 2007

Problems with the Intel video cards

Everything about the update is ready and tested, but what holds us back is a problem with the Intel video cards under Windows.
Under some circumstances (not often, that is), they will not display any 3D stuff. That happens at night only, in some areas.

Given the fact that none of the developers or programmers have such a video card, it is very difficult to fix it.
While we have a 'no one left behind' policy, that is, we try not to release a client until it works fine on everyone's computers, this time we might have to make an exception, which means releasing the client and then, if we find the bug later, issuing a patch. This is not a critical bug, because the game is still pretty enjoyable, except that in some areas you need to turn the camera around to see stuff. And given the fact that it affects few players, I decided to just go on with the update.
Probably we'll have a new client for pre-download on Monday, then have the update on Tuesday.

17 May 2007

Status report

We released an RC a few days ago, and, as expected, there were a few problems.
Now most of them are fixed, and we'll have another RC tomorrow.
There is at least one client crashing bug, which happens when fighting armed orcs, and possibly some other monsters, but the problem is that it doesn't affect everyone.
In fact, very few people are affected, so that makes debugging quite hard.
The good thing is that KarenRei's client can crash when fighting armed orcs, so last night we worked on this problem (well, she did the debugging while I did the orc summoning/fighting). Hopefully she will be able to fix it tonight, and if not, prior to the update.

One thing is certain: the client is looking much better with the new special effects, but there is a severe cost in the frame rate when there are too many effects in the game at once. When we are ready with this update, we'll be looking into using vertex shaders for some of them, which should significantly improve the speed. There are a lot of new plans for the future clients, and I'll post some of them here in the near future.

12 May 2007

The Release Candidate is almost there

Most of the bugs and problems have been fixed, so this coming week there should be an RC available, and depending on how it goes, we should have the update before the end of the month.

Meanwhile, I did a lot of server stuff, like adding new monsters, changing some spawns around, the new books for the Engineering skill, new manufacture formulas, a new harvesting resource, adding new items and changing some item images (for the items that didn't have images before), and all kinds of other behind-the-scenes work.

Of course, they will need to be tested on the test server before we have the update, so, unfortunately, we can't surprise the players (at least not those that bother to test stuff on the test server). But then again, we have a pretty open development policy, and we don't have many secrets about upcoming releases. For example, yesterday's update had all the new engineering books, so people know what to expect from this skill.

05 May 2007

Getting closer to the update

The update was planned for sometime in May, and we are making good progress.
It seems that each new client has fewer problems than the one before it, although some people still report isolated crashes and performance problems with the new particle systems (the eye candy).
If everything goes well, I plan to have an RC (Release Candidate) by the end of next week, and an update around the 20th of May.

This update will add a new skill (Engineering), new armors and weapons, new animals, and, for the first time in a long time, no public new maps (we do have some guild maps, and fixes to existing maps).

Meanwhile, while waiting for the client to be fixed, I am doing some behind-the-scenes server fixes and code cleanup, plus adding the new monsters, items and books. The manufacturing formulas for the new things will be added after the update.

Once we are done with it, we will be working on implementing the missile system on the client, land mines and caltrops, new special effects, and various other improvements. It is still too early to have a date for the client update coming after this one, but I would like to have it in October (longer release times allow the client developers to add more reliable code, since no one likes to work under pressure).