Pinderkent

Pain and glory from the trenches of the IT world.

The Haskell Platform sounds very promising!

Posted on Thursday, July 30, 2009 at 1:43 AM.

Although it's still in its infancy, with its first release coming just at the beginning of May 2009, the Haskell Platform is apparently making quite a splash, especially amongst Windows users. In many ways this isn't surprising, as the Haskell Platform offers just what Haskell has been lacking for some time now.

By providing a convenient and standardized Haskell environment, the Haskell Platform helps make Haskell more accessible and practical to a much wider developer audience. Indeed, part of the reason languages and platforms like Java, .NET, Perl and Python are so popular and widely used is that they offer a good all-in-one platform, so developers can focus on developing their software rather than on putting together a suitable development environment.

Solid platforms of this style are essential for larger, real-world software systems like those commonly fulfilling critical tasks for businesses of all sizes. By having such a platform, especially one with a vibrant community backing, developers can begin to trust Haskell more and more. And in some cases it will become essential to make use of functional programming techniques if we want to effectively make use of the massively multi-core CPUs of the near future. Efforts like the Haskell Platform will help get us there quicker, and will allow us to produce higher-quality and higher-performance software more reliably and efficiently.

As the Haskell Platform matures, I don't doubt that it will garner much support throughout the Haskell community, which will in turn help it improve even further. I'm very interested to see how quickly the Haskell Platform can build momentum, and how quickly it'll be able to help bring Haskell to the forefront of modern software development. Given our current situation, we very badly need the power of a strong, statically-typed functional language. It looks like Haskell may just be the language to provide that to us.

Permalink: http://pinderkent.phumblog.com/post/2009/07/the_haskell_platform_sounds_very_promising

Keeping commented-out code is justifiable.

Posted on Sunday, May 24, 2009 at 12:10 AM.

Today I read an article that suggested the following:

Commented out code are not comments - Use version control, don't track code changes by commenting them out. Commented out code is schizophrenic code.

To some extent, this is true. It is poor practice to try to maintain extensive source code history through commented-out code. As the author of that article correctly points out, there are numerous software version control systems out there, and most developers are familiar with at least one of them.

But we shouldn't go so far as to say that there's no place for commented-out code. Contrary to what the author of that article suggests, a few lines of commented code can say more to a developer than paragraphs of prose comments. One case where we want to retain such code is when it has a serious flaw that we (or more realistically, a maintenance programmer) don't want to accidentally repeat in the future. By leaving the flawed code there, albeit commented-out and with a quick note describing why the code should not be used, we can leave an effective reminder of the problem.
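To make that concrete, here's a small made-up sketch in Python. The function and the bug are hypothetical, invented purely for illustration, and aren't taken from the article or from any real project. The flawed line stays in the source, commented out, with a short note saying why it must not come back.

```python
def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    # DO NOT restore the line below (hypothetical example of a past bug).
    # It looks equivalent, but floor division silently truncates the
    # result: for [1, 2] it returned 1 instead of 1.5.
    #
    #     return sum(values) // len(values)
    #
    return sum(values) / len(values)
```

A maintenance programmer tempted to "simplify" the function sees the failed attempt and the reason for its removal right where it matters, without having to dig through the version control history.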

One other situation I've seen where we legitimately have code in a comment involved a Perl script that was used to generate code for a C array containing certain values. Instead of putting the Perl script in a separate file, where its purpose may not be fully understood, it was stored within the C source file just above the array code that it had generated.
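The original case involved Perl and C, but the same idea can be sketched in Python. Everything here (the table name, its contents, and the generator one-liner) is an assumption made up for illustration. The point is only that the code which produced the data lives in a comment directly above the data.

```python
# Generated table: do not edit by hand. To regenerate it, run the
# snippet below and paste its output over the assignment that follows.
# (Hypothetical generator, kept next to the data it produced.)
#
#     print("POWERS_OF_TWO = [" + ", ".join(str(2 ** n) for n in range(8)) + "]")
#
POWERS_OF_TWO = [1, 2, 4, 8, 16, 32, 64, 128]
```

Anyone who later needs a longer table can see at a glance where the values came from and how to reproduce them.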

As with goto statements, we shouldn't conclude that code within comments should never be used just because it can be abused in some cases. It is a technique that has appropriate uses.

Permalink: http://pinderkent.phumblog.com/post/2009/05/keeping_commentedout_code_is_justifiable

Losing developer time to performance problems hidden by high-level languages.

Posted on Saturday, May 23, 2009 at 11:48 PM.

One of the main purposes of high-level programming languages is to save developer time by abstracting away the onerous and tedious aspects of the underlying hardware. In general, most high-level languages do a good job of this. Unfortunately, these same languages can also waste significant amounts of developer time, often because of performance problems. What makes this especially troublesome is that properly diagnosing and fixing many of these problems requires the developers involved to understand the language's implementation in considerable depth.

A good example of this is a performance issue described recently with IronPython, an implementation of Python for Microsoft's .NET platform. In short, a very innocuous line of code was apparently responsible for the poor performance.

This incident highlights several main problems. The first is that high-level code can lead to some very unexpected interactions within the high-level language's implementation. This can obviously cause problems by misleading the developer or developers dealing with the performance problems. What appears on the surface to be a simple and likely very fast operation ends up being the culprit. A lot of developer time can be spent looking in the wrong places.

The second concern is that tracking down the problem requires in-depth knowledge about the high-level language's implementation. To some extent, we use such high-level languages in the first place to avoid needing to acquire such lower-level knowledge. We want to focus on the application we're writing, not on dealing with issues pertaining to the platform we're building upon. Time spent learning about the high-level language's implementation is time not spent on developing the application at hand.

This particular situation seems to have had a "happy" ending. The victim of the poor performance got a rapid response from somebody who did have inside knowledge about IronPython's implementation. Unfortunately, this isn't always the case. I've seen far too many times when developers have spun their wheels trying to track down obscure performance problems of that type. And it isn't a problem associated just with programming languages like Python, Ruby, or Perl, either. We often see it happen with SQL. A minor change to a query can result in a huge performance gain or loss.
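Here's a rough sketch of the SQL case, using Python's built-in sqlite3 module. The table, index and queries are hypothetical ones I've made up for illustration, and other database systems may plan these queries differently. The two queries look nearly identical, yet wrapping the indexed column in a function call is enough to stop SQLite from using the plain index on that column.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's description of how it would execute the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each row.
    return " | ".join(row[-1] for row in rows)


# Hypothetical schema, used only for this demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.execute("CREATE INDEX idx_users_name ON users (name)")

# Direct comparison on the indexed column: the index can be used.
fast_plan = query_plan(
    conn, "SELECT email FROM users WHERE name = ?", ("alice",))

# Same column wrapped in a function: the plain index no longer applies,
# so SQLite falls back to scanning the whole table.
slow_plan = query_plan(
    conn, "SELECT email FROM users WHERE lower(name) = ?", ("alice",))

print("fast:", fast_plan)
print("slow:", slow_plan)
```

The exact wording of the plans varies between SQLite versions, but the first should report a search using idx_users_name and the second a full scan. On a large table that's the difference between a quick lookup and reading every row, and nothing in the query text itself hints at it.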

As we start using high-level programming language implementations like IronPython, Scala, Clojure and JRuby, which are themselves often implemented in high-level programming languages like Java or C#, which in turn run on some sort of a virtual machine, we'll run into these sorts of problems more and more frequently. Each additional layer of software abstraction that we add in makes the situation more and more difficult. Soon we may need to look in two or three very different layers of software, assuming we even have source access, to track down performance issues. This could very well lead to a serious waste of developer time and effort.

Permalink: http://pinderkent.phumblog.com/post/2009/05/losing_developer_time_to_performance_problems_hidden_by_highlevel_languages

C and C++ play a very crucial role in most Web application systems.

Posted on Friday, May 15, 2009 at 2:21 AM.

Today, over at Hacker News, I saw a topic asking why C++ isn't commonly used for Web applications. The question itself is quite valid; we typically don't see Web applications themselves developed in C++. But that doesn't mean that C and C++ don't have an integral role within a Web-based system. Their use isn't as visible as that of Ruby, PHP, Python or Perl, but it's important nevertheless.

Admittedly, the back-end of many Web applications really isn't all that complex. In many cases, it's basically just a friendlier interface to a datastore of some sort, maybe offering some caching, and usually some basic data manipulation. And although C++ libraries like the STL and Boost allow for such tasks to be performed with relative ease, there's essentially little benefit in using C++. Scripting languages are often sufficient.

That said, C and C++ still do have a huge role in most Web application stacks today. We shouldn't forget that most of the popular server operating systems, Web servers and database systems today, as well as the most widely used implementations of most scripting languages, are typically written in C or C++. This is quite apparent within the popular open source Web stacks.

At the very core, we have C playing an integral role in virtually all of the popular server operating systems today, especially UNIX-like systems like Linux, FreeBSD, and Solaris. On top of that, we have popular Web servers like Apache, nginx, and lighttpd that are all written in C. And for database systems, PostgreSQL and SQLite are written in C, while MySQL uses both C and C++.

C and C++ are also critical to the programming languages used to implement many Web applications. The most widely used implementations of Python, Ruby, Perl and PHP all use C. Even Sun's HotSpot Java virtual machine makes very extensive use of C and C++.

So when we take a more holistic view of Web applications, we see that C and C++ prove to be very widely used. They're used for some of the most critical aspects of Web-based systems, where performance and reliability truly matter. Even if they get more of the attention, languages like PHP, Python, Ruby, Java and Perl end up being little more than glue languages, tying together the software implemented in C or C++. It becomes easy to forget the importance of C and C++, but this may just be because the software developed using them has matured to the point where it provides such stable interfaces that we can totally ignore its implementation language. Nevertheless, C and C++ are very critical to the vast, vast majority of Web applications that exist today.

Permalink: http://pinderkent.phumblog.com/post/2009/05/c_and_c_play_a_very_crucial_role_in_most_web_application_systems

Putting Stack Overflow's hardware usage in perspective.

Posted on Saturday, May 09, 2009 at 2:34 AM.

Anand Iyer recently wrote an article that transcribes a portion of a video in which Joel Spolsky discusses the hardware and software backing the very useful and increasingly popular Stack Overflow Web site. It's mentioned that there is one Web server and one database server, both running on "eight core Xeons" and serving "16 million" pages a month. At first, that sounds impressive. But thinking about it more, I'm not so sure it really is.

First, we should convert that value of 16 million pageviews into something we can comprehend better. Assuming a month of just 30 days, a quick bit of math shows that to be 2,592,000 seconds. Sixteen million pages over that number of seconds ends up being a mere 6.2 pages per second. Now, that's probably not a totally accurate picture. There are no doubt times when the traffic is much higher than that, and other times when it's lower. But even if their overall pageview traffic were to triple or quadruple, we're still not seeing huge numbers of simultaneous page requests.
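For anyone who wants to check that arithmetic or try other traffic figures, here's a minimal sketch in Python. The function name and the 30-day default are just my assumptions for illustration, and the result is only an average that says nothing about peaks.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def pages_per_second(pages_per_month, days_in_month=30):
    """Average page rate, assuming traffic is spread evenly over the month."""
    return pages_per_month / (days_in_month * SECONDS_PER_DAY)


average = pages_per_second(16_000_000)
print(f"{average:.1f} pages/second on average")                     # 6.2
print(f"{4 * average:.1f} pages/second at four times the traffic")  # 24.7
```

Even at four times the reported traffic, the average works out to fewer than 25 pages per second.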

I don't think people today truly realize the power of today's hardware. Even the low-cost, consumer-grade PCs you can buy for a few hundred dollars are significantly more powerful than the servers of just a few years ago. So I don't think we should get too excited about Stack Overflow being able to serve 6 or 7 pages per second, if not many times that during periods of heavy load, over what's essentially 16 very powerful CPUs.

Thinking back to some of the database-backed intranet Web sites I've worked on in the past, we were able to reasonably handle sustained traffic of 30+ pages per second at times on far inferior hardware. This was even when we still used CGI scripts written in Perl, which have not only the overhead of starting up the interpreter process with each request, but also the overhead of the program interpretation itself.

I recall one job in particular because of how rushed it was. The company had several call centers located throughout the world, and was moving towards a custom Web-based solution for the call center operators to use. Expecting up to 30 simultaneous users per second at peak hours, they had placed an order for some significantly powerful hardware at that time. The order was delayed for some reason, but management wanted the site to go live. So the decision was made to temporarily use some older, unused Sun workstations as servers. I recall spending a night getting two workstations set up as Web servers, and one as a database server, so the system could go live the following morning. It went live, and everyone was very surprised to find that even under higher than expected load, running on older Sun workstation hardware and using Perl CGI scripts, the responsiveness of the Web site was quite acceptable.

Now, I don't expect that to be the case all of the time. As Joel points out in his talk, there are many sites even today that use significantly more hardware than they probably should. But with some sensible caching policies and a small degree of care while programming, it really wasn't overly difficult to get high-traffic sites running on a small amount of lower-end hardware. Even in Stack Overflow's case, it sounds like they could get by very easily with a small fraction of their current infrastructure. However, it is good to have room to grow, as the traffic to that site likely will.

Computer hardware today is extremely powerful. For most Web sites, even those getting millions upon millions of pageviews per month, scalability just shouldn't be an issue. If it is, it's likely that there have been some pretty significant programming mistakes made when developing the software powering the Web site. And with low-end servers today typically coming with eight or more logical processors and many gigabytes of memory, servicing hundreds of requests per second from a single system should be considered routine.

Permalink: http://pinderkent.phumblog.com/post/2009/05/putting_stack_overflows_hardware_usage_in_perspective