C# performance vs C++ performance
Some of you know that I have been playing around with writing poker calculators and poker game simulations. These days my preferred language is C#, so I decided to undertake my project in C#. I happened to run across Cactus Kev’s Poker Hand Evaluator, and it looked pretty slick. I had already written my own evaluator in C#, and it was pretty fast, but this one looked like a screamer.
I ported the evaluator, including the modification written by Paul Senzee, over to C#. This did not prove to be too difficult, and it was done in about 4 hours. Then I did some timings. It could rank all 2.5 million (2,598,960) hands in about 500 milliseconds on my 3 GHz P4 Xeon running XP Pro with 2 GB of RAM. This seemed a bit slow to me.
So I became curious about the performance in C++. I decided to port the code to MS VC++ (the original code was written in C). I even used some libraries for srand48 and drand48 so I could leave the code as untouched as possible. This was pretty simple as well and only took about 2 hours (including the time to find the rand code).
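For anyone who wants to reproduce the timing, the speed test boils down to a loop of roughly this shape. This is only a sketch under a couple of assumptions: `rank_stub` is a made-up stand-in for the real evaluator, `Timing` and `time_all_hands` are names invented for this example, and the idea that the test walks every 5-card combination of a 52-card deck is inferred from the 2,598,960 figure, which is C(52,5).

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical stand-in for the real hand evaluator. It only mixes the
// card indices so there is some work to accumulate into a checksum.
static std::uint32_t rank_stub(int a, int b, int c, int d, int e) {
    return static_cast<std::uint32_t>(a ^ (b << 1) ^ (c << 2) ^ (d << 3) ^ (e << 4));
}

// Visits every 5-card combination of a 52-card deck exactly once and
// returns how many hands were ranked.
std::uint64_t rank_all_hands(std::uint32_t& checksum) {
    std::uint64_t hands = 0;
    for (int a = 0; a < 48; ++a)
        for (int b = a + 1; b < 49; ++b)
            for (int c = b + 1; c < 50; ++c)
                for (int d = c + 1; d < 51; ++d)
                    for (int e = d + 1; e < 52; ++e) {
                        checksum += rank_stub(a, b, c, d, e);
                        ++hands;
                    }
    return hands;
}

// Result of one timed pass (illustrative type, not from the original code).
struct Timing {
    std::uint64_t hands;     // number of hands ranked
    std::uint32_t checksum;  // returned so the work stays observable
    long long ms;            // elapsed wall-clock milliseconds
};

// Times one full pass over all hands with a monotonic clock.
Timing time_all_hands() {
    std::uint32_t checksum = 0;
    const auto start = std::chrono::steady_clock::now();
    const std::uint64_t hands = rank_all_hands(checksum);
    const auto stop = std::chrono::steady_clock::now();
    const long long ms =
        std::chrono::duration_cast<std::chrono::milliseconds>(stop - start).count();
    return Timing{hands, checksum, ms};
}
```

Calling `time_all_hands()` a few times and taking the best run gives a more stable number than a single pass.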
I ran the speed test and it took about 50 milliseconds to rank all 2.5 million hands. I was pretty shocked; that's 10 times faster than C#. I figured it would be faster, but not that much faster. This is when I had my stroke of genius: I thought I would just add some dllexports and P/Invoke the C++ DLL from C#.
That took all of about 15 minutes to do, and I was ready to go. Now, hold on to your hats for this timing. Ranking all 2.5 million hands took 27,166 milliseconds. Yes, that's twenty-seven thousand milliseconds, which is really slow. I knew that P/Invoke added the overhead of creating managed thunks and so on; however, I did not realize just how much slower it would be. I will grant you that this is an extreme example: I am calling a function 2.5 million times in a loop.
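To put that number in perspective, here is the back-of-the-envelope arithmetic as a small sketch. It uses only the figures quoted in this post and assumes that the whole difference between the two runs is per-call overhead, which is a simplification.

```cpp
// Figures quoted in this post.
constexpr double kHands = 2598960.0;        // hands ranked, one native call each
constexpr double kPInvokeLoopMs = 27166.0;  // C# loop calling the DLL once per hand
constexpr double kNativeLoopMs = 50.0;      // the same loop run entirely in C++

// Extra time attributed to each managed-to-native call, in microseconds.
// Assumes everything beyond the native run time is call overhead.
constexpr double per_call_overhead_us() {
    return (kPInvokeLoopMs - kNativeLoopMs) * 1000.0 / kHands;
}
```

That works out to a little over 10 microseconds per call, which is tiny on its own but adds up over 2.5 million calls.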
Then I decided I would create a test wrapper in the C++ DLL that would get called once from C# and execute the loop in C++, returning the number of hands actually ranked. With this method, I was able to run the test from C# in 60 milliseconds. So all the extra time is definitely coming from the P/Invoke calls.
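A wrapper along these lines might look like the sketch below. The names `POKER_API`, `run_speed_test`, and `rank_stub` are invented for this example and are not the ones in my DLL, and the stub only stands in for the real evaluator. The point is simply that managed code crosses into native code once, and the whole loop runs on the C++ side.

```cpp
#include <cstdint>

// Export macro: __declspec(dllexport) on Windows, plain C linkage elsewhere.
#if defined(_WIN32)
#define POKER_API extern "C" __declspec(dllexport)
#else
#define POKER_API extern "C"
#endif

// Hypothetical stand-in for the real hand evaluator.
static std::uint32_t rank_stub(int a, int b, int c, int d, int e) {
    return static_cast<std::uint32_t>(a ^ (b << 1) ^ (c << 2) ^ (d << 3) ^ (e << 4));
}

// Written once at the end so the accumulated work is not discarded.
static volatile std::uint32_t g_sink = 0;

// Single entry point for the managed caller: one call crosses the
// managed/native boundary, and every evaluation happens natively.
// Returns the number of hands actually ranked.
POKER_API std::int32_t run_speed_test() {
    std::int32_t hands = 0;
    std::uint32_t acc = 0;
    for (int a = 0; a < 48; ++a)
        for (int b = a + 1; b < 49; ++b)
            for (int c = b + 1; c < 50; ++c)
                for (int d = c + 1; d < 51; ++d)
                    for (int e = d + 1; e < 52; ++e) {
                        acc += rank_stub(a, b, c, d, e);
                        ++hands;
                    }
    g_sink = acc;
    return hands;
}
```

On the C# side, a function like this would be declared once with `DllImport` and called a single time, instead of importing the per-hand evaluator and calling it in a loop.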
I then decided that I would hook up the C# DLL and find out where the bottleneck was. For this, I used Red Gate ANTS Profiler. I was again surprised by the results. The bottleneck was in this piece of code:
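private static uint find_fast(uint u)
{
    uint a, b, r;

    // Scramble the bits of the key.
    u += 0xe91aaa35;
    u ^= u >> 16;
    u += u << 8;
    u ^= u >> 4;

    // b picks an entry in the hash_adjust lookup table (0..511).
    b = (u >> 8) & 0x1ff;
    // a is the top 13 bits of u * 5 (with 32-bit wraparound).
    a = (u + (u << 2)) >> 19;
    r = a ^ hash_adjust[b];

    return r;
}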
This is the hash lookup from the modification written by Paul Senzee. It seems that C# does not perform bitwise operations as fast as C++. This is something I need to look into; these should be fairly cheap operations. I need to figure out why they are more expensive in C# than they are in C++. I would have thought that once the code was compiled, the performance would have been nearly identical.
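For a side-by-side look, here is a sketch of what the C++ counterpart of that function looks like. The `hash_adjust` table below is a zero-filled placeholder, not the real precomputed table, and its size of 512 entries is inferred from the `0x1ff` mask. The shifts and XORs are the same in both languages, so the source code itself gives no hint about the gap. One difference that might be worth measuring (I have not verified it) is that the C# array access is bounds-checked by the runtime, while the C++ lookup is not.

```cpp
#include <array>
#include <cstdint>

// Placeholder only: the real table holds specific precomputed values
// that are not reproduced here. Size inferred from the 0x1ff mask.
static const std::array<std::uint16_t, 512> hash_adjust = {};

// Same operations as the C# find_fast, using fixed-width unsigned types
// so the arithmetic wraps at 32 bits.
std::uint32_t find_fast(std::uint32_t u) {
    u += 0xe91aaa35u;
    u ^= u >> 16;
    u += u << 8;
    u ^= u >> 4;
    const std::uint32_t b = (u >> 8) & 0x1ffu;     // table index, always 0..511
    const std::uint32_t a = (u + (u << 2)) >> 19;  // top 13 bits of u * 5
    return a ^ hash_adjust[b];                     // unchecked array access
}
```

Timing this function on its own in both languages, outside of the profiler, would show whether the lookup really accounts for the difference.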
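So here is the summary of what I learned from this little experiment:
- P/Invoke is very slow when it is called in a tight loop (27,166 milliseconds for 2.5 million calls, versus 60 milliseconds for a single call that runs the loop in C++).
- C# is much slower than C++ at bitwise operations, at least in this test (although it is still the fastest way to complete this kind of operation).
- Cactus Kev’s Poker Hand Evaluator really does scream.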