Last week, a colleague forwarded me the link to one of the most blatantly incorrect computing articles he had ever seen. It's entitled "C# vs C/C++ Performance," and after reading it, I must agree with my colleague. It isn't often that we get to see an article so full of misinformation.
We see the first major mistakes within the "Point 1" discussion. The author clearly has some serious misunderstandings about hyper-threading and instruction sets. One of the gems we see is: "... a C++ program will not be able to take the advantages of the 'Hyper Threading' instruction set of the Pentium 4 HT processor."
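Of course, no such instruction set exists. Hyper-threading, as implemented by Intel up to this point, is transparent to userland applications: each physical core simply appears to the operating system as two logical processors, and a program benefits by running ordinary threads on them.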
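To make that concrete, here is a minimal sketch of a C++ program using more than one logical processor. It uses `std::thread` from the C++11 standard library, which is my choice for illustration and postdates the article under discussion; the function name `parallel_sum` is hypothetical. Note that nothing in it refers to hyper-threading or to any special instruction set. The operating system's scheduler decides which logical processor each thread runs on.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Sums `data` by splitting it across `workers` ordinary threads.
// Whether those threads land on separate cores or on the two logical
// processors of one hyper-threaded core is up to the OS scheduler.
long long parallel_sum(const std::vector<int>& data, unsigned workers) {
    if (workers == 0) workers = 1;  // hardware_concurrency() may report 0
    std::vector<long long> partial(workers, 0);
    std::vector<std::thread> pool;
    const std::size_t chunk = (data.size() + workers - 1) / workers;

    for (unsigned w = 0; w < workers; ++w) {
        pool.emplace_back([&, w] {
            // Clamp both ends so trailing workers get an empty range.
            const std::size_t begin = std::min(data.size(), w * chunk);
            const std::size_t end = std::min(data.size(), begin + chunk);
            partial[w] = std::accumulate(data.begin() + begin,
                                         data.begin() + end, 0LL);
        });
    }
    for (auto& t : pool) t.join();
    return std::accumulate(partial.begin(), partial.end(), 0LL);
}
```

A caller would typically pass `std::thread::hardware_concurrency()` as the worker count. On a hyper-threaded machine that value generally counts logical processors, not physical cores.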
Immediately after that, we read: "Of course HT is outdated now...."
That is absolutely false. Intel's recent Core i7 processor makes use of hyper-threading, with each of its four cores supporting two simultaneous threads. It's not an "outdated" technique.
It gets better as we read on: "It will also not be able to take advantages of the Core 2 duo or Core 2 Quad's 'true multi-threaded' instruction set as the compiler generated native code does not even know about these instruction sets."
Again, we see more ignorance regarding instruction sets and hyper-threading. While newer CPUs do often include new instructions, the supposed "true multi-threaded" instruction set that the author of that article writes about is bunk.
The next misleading claims we see are as follows: "In the earlier days, not much changes were introduced to the instruction set with every processor release. The advancement in the processor was only in the speed and very few additional instruction sets with every release. Intel or AMD normally expects game developers to use these additional instruction sets."
We can see how wrong this is by looking at Wikipedia's x86 Instruction Listings page. It shows the original 8086/8088 instruction set, along with the instructions added with each processor generation. Based on that information, we can see that older processors such as the 80386 and the Pentium Pro each added quite a few new instructions. And the claim that new instructions are typically added for "game developers" is laughable. Games are just a small subset of the multimedia applications which benefit the most from the newer instruction sets, and scientific and engineering applications benefit significantly as well.
Next we move on to the "Point 2" discussion. It almost immediately starts off with a five-line snippet of completely unrealistic code. It's not even remotely a valid microbenchmark, and microbenchmarks are often misleading enough even when they actually compile and run. Yet from this code, which consists of an undefined function that performs a "really time consuming operation" being called 100,000,000 times, the author of that article comes to the conclusion that "C++ is faster by a order of magnitude." Huh?
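For contrast, here is a rough sketch of the bare minimum I'd expect from a microbenchmark: a workload that is actually defined, a monotonic clock, repeated runs, and a result that is consumed so the compiler can't discard the work. The `work` function is a hypothetical stand-in of my own, since the original article never defines its "really time consuming operation," and `best_of_ms` and `sink` are likewise names I made up. Even this says little about real applications.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <limits>

// Hypothetical workload: some arithmetic whose result depends on n.
std::uint64_t work(std::uint64_t n) {
    std::uint64_t acc = 0;
    for (std::uint64_t i = 0; i < n; ++i) acc += (i * i) % 7;
    return acc;
}

// Writing the result to a volatile keeps the optimizer from
// eliminating the call to work() as dead code.
volatile std::uint64_t sink = 0;

// Runs the workload `runs` times and returns the fastest wall-clock
// time in milliseconds, which reduces noise from other processes.
double best_of_ms(std::uint64_t n, int runs) {
    double best = std::numeric_limits<double>::max();
    for (int r = 0; r < runs; ++r) {
        const auto start = std::chrono::steady_clock::now();
        sink = work(n);
        const auto stop = std::chrono::steady_clock::now();
        const double ms =
            std::chrono::duration<double, std::milli>(stop - start).count();
        if (ms < best) best = ms;
    }
    return best;
}
```

A fair comparison would need an equivalent C# harness, matching optimization settings on both sides, and warm-up runs so that JIT compilation isn't counted against the managed code.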
The next sentence reads: "Nearly all the threads I've seen that claims C++ is faster writes a small application like this a prove that C++ is atleast n times faster than an equivalent c++ program and yes it's true."
So we find out that the article's claims aren't even based on personal experience, let alone more rigorous approaches. They're based on what was read on some message board or mailing list. And beyond that, we can see statements that don't make even the slightest bit of sense as written, such as "C++ is atleast n times faster than an equivalent c++ program." The second "c++" should apparently read "C#".
The "Point 3" discussion focuses on memory management. While the author is somewhat correct in pointing out that memory management is more involved when using C++, there is no mention of Boost's smart pointers, the Boehm-Demers-Weiser garbage collector, Valgrind, or the various other tools that greatly help to prevent or track down memory leaks in C and C++ applications. I've seen first-hand how these tools can be used to develop long-running systems in C++ that contain millions of lines of source code, yet don't suffer from obvious or significant memory leaks.
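As a small illustration of the smart pointer idea, here is a sketch using `std::unique_ptr` and `std::shared_ptr`. These are the standard library descendants of Boost's smart pointers, so substituting them is my assumption, not something the original article or this rebuttal relies on. The `Tracked` type and both function names are hypothetical; the live counter exists only to make the cleanup observable.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Counts how many instances are currently alive.
struct Tracked {
    static int live;
    Tracked() { ++live; }
    ~Tracked() { --live; }
    Tracked(const Tracked&) = delete;
    Tracked& operator=(const Tracked&) = delete;
};
int Tracked::live = 0;

// The object is freed when `owned` goes out of scope, including
// when the function exits early through an exception.
void may_throw(bool fail) {
    auto owned = std::make_unique<Tracked>();  // requires C++14
    if (fail) throw std::runtime_error("early exit");
}

// With shared ownership, the object lives until the last owner lets go.
// Returns the live count while one of two owners still holds it.
int live_with_one_owner_left() {
    auto first = std::make_shared<Tracked>();
    auto second = first;  // two owners, one object
    first.reset();        // one owner remains, object still alive
    return Tracked::live;
}
```

There is no `delete` anywhere in this code, and no path through it leaks. That covers a large share of the leaks people attribute to C++ itself.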
Further along, we get to read: "Everyone knows that page fault is one of the most time-consuming operation as it requires a hard disk access. One page fault and you are dead."
While excessive page faults are typically bad for performance, they're not the evil that the author of that article portrays them to be. A single page fault won't typically harm performance as badly as the author suggests, and not every page fault requires disk access at all: a fault on a page that is already in memory but not yet mapped into the process is resolved without touching the disk. And with most modern operating systems, we often see demand paging used. In such a scenario, we don't load a page from disk until it's actually referenced, which can lead to improved application startup times and reduced memory usage. So some page faults are to be expected.
The next misunderstanding is: "A lot of classical applications including Google Picasa suffers from memory management problems. After about two or three days, you can notice that these applications become slower necessitating a Windows Restart. This problem is completely alleviated in C#. the Framework comes with a broom behind you and sweeps your drop during the course of the execution and as a result your working set never grows (unless you really use it) which means lesser page faults."
While many desktop applications today do leak memory, it doesn't make sense for the author of that article to suggest that we need to perform a full reboot of the operating system. Killing the application process should, under most modern operating systems (including most versions of Windows still in use), be sufficient to free whatever memory it may have been using. Furthermore, it's incorrect to suggest that such problems will be "completely alleviated" by using a language like C#. It's quite possible for an application to have code that maintains references to objects that are no longer needed, thus preventing them from being garbage collected. Carelessness can cause problems regardless of which programming language is being used.
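The "forgotten reference" problem can be sketched in C++ as an analogy, with `std::shared_ptr` standing in for a garbage-collected reference. This is not how the .NET collector works internally; it only shows the same logical mistake, where a long-lived container keeps otherwise dead objects reachable. All of the names here (`Session`, `StrongRegistry`, `WeakRegistry`, `live_after_requests`) are hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Counts how many sessions are currently alive.
struct Session {
    static int live;
    Session() { ++live; }
    ~Session() { --live; }
    Session(const Session&) = delete;
    Session& operator=(const Session&) = delete;
};
int Session::live = 0;

// Keeps a strong reference to everything it has ever seen.
struct StrongRegistry {
    std::vector<std::shared_ptr<Session>> seen;
};

// Only observes sessions; it does not keep them alive.
struct WeakRegistry {
    std::vector<std::weak_ptr<Session>> seen;
};

// Simulates `requests` short-lived sessions that each get recorded in
// the registry, then reports how many sessions are still alive.
template <class Registry>
int live_after_requests(Registry& registry, int requests) {
    for (int i = 0; i < requests; ++i) {
        auto session = std::make_shared<Session>();
        registry.seen.push_back(session);
        // `session` goes out of scope here; only the registry may remain.
    }
    return Session::live;
}
```

With the strong registry, every session stays alive for as long as the registry does, no matter how the memory is managed underneath. A C# program that adds objects to a static list or leaves event handlers subscribed behaves the same way.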
The author of that article comes to the conclusion that the best thing to do is take a hybrid approach: write most of the application in C#, and have it call out to performance-critical code written in C++. While this is an option, a better approach is to first profile your code to see where and why it is actually slow. Don't just assume it's the language. Often, we see poor algorithms being used, or unnecessary computation being performed. Based on my years of experience, I'd say that fixing such issues will typically give a much greater performance boost than changing programming languages.
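Here is a hypothetical example of the kind of fix I mean. Both functions below answer the same question, whether a list contains a duplicate, and both function names are my own invention. The first compares every pair of elements, so its running time grows quadratically with the input size. The second uses a hash set and grows roughly linearly on average.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Compares every pair of elements: O(n^2) comparisons in the worst case.
bool has_duplicate_quadratic(const std::vector<int>& values) {
    for (std::size_t i = 0; i < values.size(); ++i)
        for (std::size_t j = i + 1; j < values.size(); ++j)
            if (values[i] == values[j]) return true;
    return false;
}

// Remembers what it has seen in a hash set: O(n) on average.
bool has_duplicate_hashed(const std::vector<int>& values) {
    std::unordered_set<int> seen;
    for (int value : values) {
        // insert() reports false in .second if the value was already present.
        if (!seen.insert(value).second) return true;
    }
    return false;
}
```

For large inputs, the gap between these two is likely to dwarf any difference between a C# and a C++ implementation of the same algorithm. A profiler is what tells you which of your functions look like the first one.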
Every author, myself included, will no doubt make minor mistakes here and there while writing. They're expected, and forgiven. However, the article that my colleague linked me to was incorrect from top to bottom. What is perhaps the most disturbing thing about that article is that some people may very well believe what it is saying to be true. Perhaps articles like that one are why so much software is so poorly written. To those who don't know better, such articles sound legitimate and sensible. But after even the slightest bit of analysis, we see such articles fall apart almost completely. Unfortunately, there are a lot of people out there who can't or won't perform such analysis.