25 October 2008

Profiling and optimizations

In the last two or three weeks, I spent a lot of time optimizing the server.
I rewrote all kinds of functions, and fixed a few bugs while at it. The profiling results were kind of surprising: for example, the most expensive function in my code was a big loop, mostly empty, that looks something like this:
for i = 0 to 6K
    if (some condition)   //only maybe 30% of the time the execution gets past here.
        do some stuff, not very time expensive

Now, this function is called 8 times a second, but even so, it was a little bit unexpected that such a little innocent function would take about 0.7 ms at each run.
So, I focused all my m4d hax0r skillz on optimizing this function, which meant making a few lists to get rid of some of the ifs. But this required reordering some networking code, which took a while.
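
To give a rough idea of what "making a few lists" can look like, here is a minimal sketch in C. It is not the actual server code: the names (`active_list`, `active_add`, `active_remove`, `process_active`, `count_visit`) and the fixed `MAX_SLOTS` size are all made up for illustration. The idea is simply to keep a dense list of the slots that need work, so the periodic function visits only those instead of testing all 6K entries with an if.

```c
#include <stddef.h>

#define MAX_SLOTS 6000   /* assumption: roughly the "6K" of the loop above */

/* Hypothetical bookkeeping: a dense array with the indices of the slots
   that are in use, plus each slot's position in that array
   (-1 when the slot is not in the list). */
struct active_list {
    int ids[MAX_SLOTS];
    int pos[MAX_SLOTS];
    int count;
};

void active_init(struct active_list *a)
{
    a->count = 0;
    for (int i = 0; i < MAX_SLOTS; i++)
        a->pos[i] = -1;
}

/* Put a slot on the list; does nothing if it is already there. */
void active_add(struct active_list *a, int slot)
{
    if (a->pos[slot] != -1)
        return;
    a->pos[slot] = a->count;
    a->ids[a->count++] = slot;
}

/* Take a slot off the list by moving the last entry into its place,
   so removal costs the same no matter how long the list is. */
void active_remove(struct active_list *a, int slot)
{
    int p = a->pos[slot];
    if (p == -1)
        return;
    int last = a->ids[--a->count];
    a->ids[p] = last;
    a->pos[last] = p;
    a->pos[slot] = -1;
}

/* The periodic function: no per-slot "if", only listed slots are visited.
   Returns how many slots were processed. */
int process_active(const struct active_list *a, void (*work)(int slot))
{
    for (int i = 0; i < a->count; i++)
        work(a->ids[i]);
    return a->count;
}

/* Stand-in for "do some stuff": just counts the calls. */
int visited = 0;
void count_visit(int slot) { (void)slot; visited++; }
```

The price is that every place that activates or deactivates a slot has to keep the list in sync, which is the kind of reordering that touches other code (in my case, the networking code).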

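After doing this, I was expecting the CPU usage to drop a lot. But, to my surprise, it did not drop at all (nothing noticeable, anyway). In hindsight the numbers say as much: 8 runs a second at about 0.7 ms each is only about 5.6 ms of CPU time per second, or roughly half a percent of one core.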
And then I finally realized it:
The problem was not in the server code, the problem was on the networking side!

For example, a server with no players, but 1300+ AI (which do most of the things players would do) takes about 1% of the CPU.
But a server with 100 players takes about 2%, and a server with 800 players (and player-run bots) takes 18% of the CPU.

From the very beginning, I relied on SDL_net for the networking, but until a few days ago, I didn't even look at the SDL_net source code to see how it does things. So I took a look, and didn't find any problems, except that it has to go through the list of all the connections once, and through the list of active connections twice. Then the server must go through the list of all the connections once again, to check for data.
Obviously, this is not optimal.
After doing some research, I found out that the select() call, which SDL_net is using, scales very poorly when there are a lot of connections, because on every call the kernel has to scan the whole set of descriptors as well.
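
To make the cost visible, here is a minimal sketch of a select()-based check in plain POSIX C. It is not SDL_net's code and not the server's; `poll_readable` is a hypothetical helper written only for this illustration, and it assumes every descriptor is below FD_SETSIZE. Note how the full list gets walked twice per call in user space, on top of the scan the kernel does inside select().

```c
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: checks `n` descriptors for readable data without
   blocking. Sets ready[i] to 1 or 0 and returns how many are readable,
   or -1 on error. Assumes every descriptor is below FD_SETSIZE. */
int poll_readable(const int *fds, int n, int *ready)
{
    fd_set readset;
    int maxfd = -1;

    /* Pass 1: the set has to be rebuilt on every call, because select()
       overwrites it with the result. */
    FD_ZERO(&readset);
    for (int i = 0; i < n; i++) {
        FD_SET(fds[i], &readset);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }

    /* Inside this call the kernel examines the descriptors up to maxfd,
       whether or not anything happened on them. */
    struct timeval timeout = { 0, 0 };   /* return immediately */
    int count = select(maxfd + 1, &readset, NULL, NULL, &timeout);
    if (count < 0)
        return -1;

    /* Pass 2: walk the whole list again to see which ones are ready. */
    for (int i = 0; i < n; i++)
        ready[i] = FD_ISSET(fds[i], &readset) ? 1 : 0;

    return count;
}
```

With a handful of connections none of this matters, but the work per call grows with the total number of connections, not with the number that actually have data.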

Well, Learner told me today that on FreeBSD there is another way to check whether the sockets have any data: kqueue.
After doing some more research on the matter, I found this. It explains in detail how kqueue works, and why it is so much faster than the good ol' select().
So I am going to implement kqueue on the server, to be used on the 'production' server, but also leave the SDL_net way in there as well, because my development is done on Windows.
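
A rough sketch of how the two paths could sit side by side, chosen at compile time. This is only an illustration under a few assumptions: the `watcher` type and the `watcher_init`, `watcher_add` and `watcher_poll` functions are made-up names, the fallback uses plain POSIX select() as a stand-in for the SDL_net path (so it is not the Windows code), and `MAX_WATCHED` is kept tiny on purpose. The kqueue(), kevent() and EV_SET calls are the standard BSD ones.

```c
#include <unistd.h>
#include <time.h>

#if defined(__FreeBSD__) || defined(__NetBSD__) || defined(__OpenBSD__) || defined(__APPLE__)
#define USE_KQUEUE 1
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#else
#define USE_KQUEUE 0
#include <sys/select.h>
#include <sys/time.h>
#endif

#define MAX_WATCHED 64   /* kept tiny for the sketch */

/* Hypothetical wrapper; all names here are made up for this example. */
struct watcher {
    int kq;                 /* kqueue descriptor (kqueue backend only) */
    int fds[MAX_WATCHED];   /* watched descriptors (select backend only) */
    int count;
};

int watcher_init(struct watcher *w)
{
    w->count = 0;
#if USE_KQUEUE
    w->kq = kqueue();
    return w->kq < 0 ? -1 : 0;
#else
    w->kq = -1;
    return 0;
#endif
}

/* Register a descriptor once. With kqueue the kernel remembers it, so it
   does not have to be passed in again on every check. */
int watcher_add(struct watcher *w, int fd)
{
    if (w->count >= MAX_WATCHED)
        return -1;
#if USE_KQUEUE
    struct kevent change;
    EV_SET(&change, fd, EVFILT_READ, EV_ADD, 0, 0, NULL);
    if (kevent(w->kq, &change, 1, NULL, 0, NULL) < 0)
        return -1;
#else
    w->fds[w->count] = fd;
#endif
    w->count++;
    return 0;
}

/* Non-blocking check: writes up to `max` readable descriptors into `out`
   and returns how many were written, or -1 on error. */
int watcher_poll(struct watcher *w, int *out, int max)
{
#if USE_KQUEUE
    struct kevent events[MAX_WATCHED];
    struct timespec zero = { 0, 0 };
    if (max > MAX_WATCHED)
        max = MAX_WATCHED;
    int n = kevent(w->kq, NULL, 0, events, max, &zero);
    for (int i = 0; i < n; i++)
        out[i] = (int)events[i].ident;   /* only ready descriptors come back */
    return n;
#else
    fd_set readset;
    struct timeval zero = { 0, 0 };
    int maxfd = -1, n = 0;
    FD_ZERO(&readset);
    for (int i = 0; i < w->count; i++) {
        FD_SET(w->fds[i], &readset);
        if (w->fds[i] > maxfd)
            maxfd = w->fds[i];
    }
    if (select(maxfd + 1, &readset, NULL, NULL, &zero) < 0)
        return -1;
    for (int i = 0; i < w->count && n < max; i++)
        if (FD_ISSET(w->fds[i], &readset))
            out[n++] = w->fds[i];
    return n;
#endif
}
```

The rest of the server would only ever call the three `watcher_` functions, so it does not need to know which backend was compiled in. In the kqueue branch the loop runs over the ready descriptors only, which is where the difference against select() comes from.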
Right now, I am downloading Desktop BSD and will run it on a virtual machine, to help me with my local testing. Once I actually start implementing the new system, I will let you know how it went.


Anonymous said...

Absolutely brilliant.

25/10/08 04:22  
Anonymous said...

Also, using sendfile(...) instead of send(...) is much faster, because it copies the buffers in kernel space only.

6/11/08 00:19  
