Pinderkent

Pain and glory from the trenches of the IT world.

CGI scripts are often a perfectly fine approach.

Posted on Monday, May 24, 2010 at 1:21 PM.

Today I noticed a submission at reddit about modern Web development. Much of the discussion there currently centers around technologies like PHP, Ruby on Rails, and Django. One commenter, however, brought up the acceptability of CGI scripts. As is usually expected when the topic of CGI scripts comes up, somebody replied and mentioned how they have "terrible overhead".

In some cases, this is absolutely true. If you have a site getting substantial traffic, or your site could experience an unexpected spike in traffic, then using CGI scripts clearly isn't a viable option. However, very few sites fall into this category. With the vast majority of sites online getting less than one hit per second, CGI scripts can actually prove to be a very versatile technology. The development overhead is minimal compared to other techniques, especially those involving complex frameworks. Just about any programming language can be used to write a CGI script. There is much flexibility when it comes to the implementation, as no particular templating engine or O/R mapper is required. Almost all modern web servers have excellent support for CGI scripts, and they're very easy to deploy. And if a performance boost is ever needed, it's almost trivial to convert a CGI script to use FastCGI.
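To give a sense of how little is involved, here is a minimal sketch of a CGI script in Python. It assumes the web server passes the query string in the QUERY_STRING environment variable, as CGI specifies. The "name" parameter and the build_response helper are made up for illustration.

```python
#!/usr/bin/env python3
"""Minimal CGI script sketch: read the query string, write a response."""
import os
import sys
from html import escape
from urllib.parse import parse_qs


def build_response(environ):
    """Return the whole CGI response: header, blank line, then the body."""
    params = parse_qs(environ.get("QUERY_STRING", ""))
    # "name" is a hypothetical parameter used only for this example.
    name = params.get("name", ["world"])[0]
    # Escape user input before placing it in HTML.
    body = "<html><body><p>Hello, {}!</p></body></html>".format(escape(name))
    return "Content-Type: text/html; charset=utf-8\n\n" + body


if __name__ == "__main__":
    # The web server sets the environment and captures standard output.
    sys.stdout.write(build_response(os.environ))
```

There is no framework, templating engine or O/R mapper here. How the script gets mapped to a URL depends on the web server's own configuration.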

A lot of developers today don't truly understand the power of modern server hardware and software. They don't realize how cheap it is to start up a new process. Indeed, when it comes to CGI scripts, the process start-up time is often negligible compared to the time it takes to perform database queries, for example. This is true even for interpreted scripting languages. When you factor in the caching done by virtually all server-grade operating systems today, as well as the caching of bytecode by an interpreted scripting language like Python, process start-up time becomes a non-issue.
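Rather than taking this on faith, the start-up cost is easy to measure on your own hardware. The following is a rough sketch, not a rigorous benchmark: it times how long it takes to launch a fresh Python interpreter that does nothing and exits, averaged over a number of runs. The function name and the default of 20 runs are arbitrary choices for illustration.

```python
import subprocess
import sys
import time


def average_startup_seconds(runs=20):
    """Average wall-clock time to start and exit a fresh Python interpreter."""
    command = [sys.executable, "-c", "pass"]
    start = time.perf_counter()
    for _ in range(runs):
        # Each iteration spawns a brand-new process, as a CGI request would.
        subprocess.run(command, check=True)
    return (time.perf_counter() - start) / runs


if __name__ == "__main__":
    millis = average_startup_seconds() * 1000
    print("average start-up: {:.1f} ms".format(millis))
```

Comparing the resulting figure against the time a typical database query takes in your application should show which of the two dominates for your workload.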

Something else to keep in mind is that, well over a decade ago, we used CGI scripts alone to power sites that even today would be considered high-traffic sites. This was done on hardware with a mere fraction of the power of the hardware we have available to us today. Now, it wasn't unusual to see such servers become saturated with requests, and have an extremely high load, so technologies like NSAPI and Apache modules were eventually developed to combat this. Nevertheless, many sites were unable to make use of those approaches, so CGI scripts still remained widely used, and helped create what became today's Web.

Contrary to what some people misleadingly suggest, CGI scripts are still a viable, acceptable and even optimal approach for many dynamic Web sites today. They provide a high degree of flexibility when it comes to the programming language used, any templating engine that may be used, any ORM system that may be used, the web server software used, the operating systems they run on, and so forth. To immediately write off CGI scripts due to misconceptions about process start-up overhead is absurd. In reality, CGI scripts are a very acceptable approach for most Web sites today, and should certainly be considered as an option.

Permalink: http://pinderkent.phumblog.com/post/2010/05/cgi_scripts_are_often_a_perfectly_fine_approach

Parrot just can't compete with LLVM, the JVM, and the .NET CLR.

Posted on Sunday, May 23, 2010 at 11:51 AM.

I read an article today, written by Andrew Whitworth, that discusses Parrot and its fitness as a target platform. His article, along with other recent developments, may very well answer the question I asked nearly three years ago, Will Parrot ever truly deliver? Unfortunately, the answer appears to be a resounding No.

For those who might not be aware, Parrot is, according to their web site, a "virtual machine designed to efficiently compile and execute bytecode for dynamic languages." Although it has been in development for about a decade, there has been comparatively little to show for all of the effort that has gone into it. Sure, there have been frequent releases, but in the end we still don't have a platform that garners much attention, and we still don't see anyone really putting forth a lot of effort to target it.

Andrew's article helps highlight why both language implementors and users may be hesitant to spend time targeting Parrot. Towards the end of his article, he covers parts of the system that he thinks will be seeing major changes within the next few months. Throughout these seven points, we see some very unsettling things. The very first point, for instance, mentions that, "GC is a very internal thing, when it works properly, you don't even need to know it exists." Now, garbage collection isn't a trivial task, but it has been very well studied and implemented many times over in real-world systems. Although we can't expect any such system to be perfect, it is unsettling when we read statements like "when it works properly" regarding a ten-year-old virtual machine platform. There just shouldn't be so much doubt about such a fundamental part of a virtual machine.

The second point is no better. Just-in-time compilation, like garbage collection, is another one of those cornerstones of a VM that we should expect to be mature and robust after 10 years. It's very worrisome to read that Parrot is lacking so badly in this area, even after two major releases.

The third point is perhaps the worst of all. In it, he states, "We don't really have a good, working, reliable threads implementation now and HLLs are generally not using them." It's currently 2010, and the situation today is that almost all new desktop PCs, and even many notebooks and netbooks, have a CPU with at least two cores. Most server-grade computer systems offer several times that, with multiple CPUs, with multiple cores per CPU, and even multiple threads of simultaneous execution per core. Efficiently using these systems to their full potential currently means writing multithreaded software. Like garbage collection and just-in-time compilation, threading has been well-studied, implemented repeatedly, and is one of the major pieces of any virtual machine platform. There's just no excuse for Parrot not to have better multithreading support.

The fifth point is pretty serious, as well. It discusses packfiles, which are the files that contain Parrot bytecode, debug data, and so forth. This is one more essential part of any VM implementation that should be very mature after a decade's worth of development. It's disappointing to hear that there are still portability issues with these files after so many years.

After reading about those rather serious deficiencies, I have a hard time understanding how Andrew can suggest that, "In summary, Parrot is a good, stable platform for HLL developers to use." From what I can see, Parrot is a platform that has had a lot of time and opportunity to make something of itself, but due to various problems, from internal developer strife, to a bad reputation, to a lack of serious users, it just hasn't matured.

Since I wrote my other article about Parrot almost three years ago, we've seen major developments out of the other major VM providers. We're seeing the Java platform get better support for dynamic languages in the upcoming JDK 7 release. We've also seen Microsoft's Dynamic Language Runtime become available for their .NET platform, allowing for mature and usable language implementations like IronRuby and IronPython to be developed.

Perhaps the biggest threat of all to Parrot is LLVM. LLVM has become widely accepted by industry, and even significant open source projects like FreeBSD are integrating and supporting it. In addition to having excellent support for C, C++ and Objective-C, we're even seeing it used as the back-end for dynamic programming language implementations. Rubinius and MacRuby are two examples of Ruby implementations that support LLVM. Then there are Python implementations like Unladen Swallow and PyPy.

I just don't think that Parrot can compete with these other platforms. Parrot has spun its wheels for far too long, and just isn't as mature as the JVM, the .NET CLR, or LLVM have become. Aside from casual or hobby development, I don't see why anyone would develop a software system specifically targeting Parrot. Its future seems extremely bleak at this point.

Permalink: http://pinderkent.phumblog.com/post/2010/05/parrot_just_cant_compete_with_llvm_the_jvm_and_the_net_clr

NoSQL, the next big mess we'll get to clean up.

Posted on Sunday, April 04, 2010 at 9:56 PM.

Over the past couple of years, we've been hearing more and more about the so-called "NoSQL" movement. In short, its adherents advocate the use of various data management systems that do away with some of the features that most relational database systems have come to offer, in favor of supposedly offering better performance for large data sets. The hype has become particularly strong lately, with it being revealed less than a month ago that Digg has started using Cassandra heavily. A few days later, a similar announcement was made regarding reddit.

There has been a fair amount of discussion regarding this topic. Dennis Forbes, for instance, discusses Digg's transition, and explains how properly using some of the most integral features of virtually all existing relational database systems, along with solid-state drives, can help alleviate many performance issues. We've also seen Ted Dziuba write about the risk and unnecessity of NoSQL-esque approaches for most situations, while Royans Tharakan has suggested the opposite to be true. Jeremy Zawodny describes NoSQL as "software Darwinism".

Regardless of how one personally feels about NoSQL or relational database systems, I can think of a few things that will likely hold true:

  1. NoSQL techniques and systems will continue to get the sort of hype that misleads many developers and managers into thinking it's an approach that's much better than it actually is.
  2. Numerous existing software systems currently using relational databases very successfully will be transitioned to using a NoSQL approach.
  3. Many new software systems will use NoSQL technologies, especially when it isn't necessary or even suitable to do so.
  4. These new and modified systems will fail horribly. Expected performance gains won't materialize, data will be lost or badly mangled due to NoSQL's general lack of focus on data consistency, and codebases will be ruined by these transitions.
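
To make the consistency point concrete, here is a minimal sketch of the kind of guarantee a relational database provides without any extra effort from the developer. It uses Python's built-in sqlite3 module with an in-memory database. The accounts table, the names and the amounts are all made up for illustration. The credit is deliberately applied before the debit, so that a failed debit shows the earlier credit being rolled back.

```python
import sqlite3


def make_demo_db():
    """Create a throwaway in-memory database with a CHECK constraint."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "name TEXT PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as a single all-or-nothing transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?",
            (amount, dst),
        )
        # If this debit violates the CHECK constraint, the credit above
        # is rolled back along with it.
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?",
            (amount, src),
        )


def balances(conn):
    """Return the current balances as a dict keyed by account name."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

Calling transfer with an amount larger than the source balance raises sqlite3.IntegrityError and leaves both rows untouched. A system that gives up transactions and constraints has to reimplement this sort of protection in application code, or do without it.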

This is a mixed blessing. On one hand, it will ensure a lot of work for those of us who often get called in to deal with software blunders. But this isn't truly productive work, of course. It's mainly just fixing mistakes that shouldn't have been made in the first place. Techniques and software that worked for NoSQL users like Facebook, Google, Digg or reddit just won't work across the board, and it's quite unfortunate that so many developers and development managers won't realize this until it's far too late.

Permalink: http://pinderkent.phumblog.com/post/2010/04/nosql_the_next_big_mess_well_get_to_clean_up

Not all tutorials target the same audience.

Posted on Sunday, April 04, 2010 at 6:12 PM.

Today I read an article by James Hague that suggests we re-think how programming language tutorials are written. Mid-way through the article, I think he sums up his point when he writes, "Programming language tutorials shouldn't be about learning languages. They should be about something interesting, and you learn the language in the process."

For tutorials targeting people who are completely new to computer programming itself, this is a sensible approach to take. It can help strike a good balance between informing the person about the language or languages being used, while also showing how to apply the concepts to somewhat realistic problems, but at the same time it doesn't bury them in terminology or concepts they might not yet understand.

However, this is clearly not an approach that caters well to the needs of experienced programmers who want to (or need to) get up to speed quickly with a new language. If I have a client coming to me with a problem that involves a language I haven't used before, the last thing I want to do is spend time reading such scenarios. I don't necessarily want a reference, but rather a quick, to-the-point book that summarizes the features that are available. I'd rather such texts use field-specific terminology like "associative array" if that will communicate the language's features more rapidly.

I don't think we should fault Programming in Lua as being a badly written tutorial; it's actually quite decent for experienced programmers who want to rapidly learn what Lua offers. Admittedly, it probably isn't the best book for novices. But reading its Audience section should make that clear. It does proceed on the assumption that the reader has at least some programming experience. Hopefully any computer programming novice who does think of reading that book does glance at the Audience section beforehand, and realizes on their own that some other resource may be a better introduction to programming itself.

It's probably best not to try to target both audiences with a single tutorial. Let there be tutorials that actively target inexperienced programmers, but the bulk of the tutorials should probably be of a format that is useful to programmers who have some prior experience and knowledge. Most programmers learn the basic concepts common to most programming languages once, yet need to learn the features and functionality offered by a specific language time and time and time again.

Permalink: http://pinderkent.phumblog.com/post/2010/04/not_all_tutorials_target_the_same_audience

Will Chrome OS be the most innovative consumer-grade operating system since BeOS?

Posted on Thursday, November 19, 2009 at 2:04 AM.

Earlier this year Google announced Google Chrome OS. Subsequently, some early indications of what it may offer came to light. And now there will apparently be an event held soon, where further details pertaining to Chrome OS may be made available.

I am interested in seeing what Chrome OS may offer us. Based on the original announcement, it sounded like it would bring some fresh ideas to the table. This is something we really haven't seen for well over a decade now. Modern mainstream desktop operating systems like Mac OS X and Windows 7 aren't overly different from their equivalent releases of 10 to 15 years ago. Mac OS X is still remarkably similar to NeXTSTEP and Mac OS 9 and earlier, while Windows 7 still follows the concepts introduced with Windows 95.

Looking back, the last truly innovative desktop operating system is likely BeOS. I covered many of the excellent design decisions behind BeOS in an article earlier this year. In short, it was far too many years ahead of its time. It's only today that we're getting the hardware that it would excel on. And although the original BeOS implementation can best be considered dead, the Haiku project has been making good strides creating an operating system inspired by it.

If Chrome OS can bring even just a fraction of the innovation that BeOS brought, I think we should be able to consider it a success. Unlike BeOS, Chrome OS has a powerful backer, which may very well be what it needs to become a mainstream competitor to the existing consumer-grade operating systems that are widely used today. So I'm looking forward to the upcoming announcements regarding it, and hopefully we'll be able to start trying it out quite soon.

Permalink: http://pinderkent.phumblog.com/post/2009/11/will_chrome_os_be_the_most_innovative_consumergrade_operating_system_since_beos