Pinderkent

Pain and glory from the trenches of the IT world.

Parrot just can't compete with LLVM, the JVM, and the .NET CLR.

Posted on Sunday, May 23, 2010 at 11:51 AM.

I read an article today, written by Andrew Whitworth, that discusses Parrot and its fitness as a target platform. His article, along with other recent developments, may very well answer the question I asked nearly three years ago, Will Parrot ever truly deliver? Unfortunately, the answer appears to be a resounding No.

For those who might not be aware, Parrot is, according to their web site, a "virtual machine designed to efficiently compile and execute bytecode for dynamic languages." Although it has been in development for about a decade, there has been comparatively little to show for all of the effort that has gone into it. Sure, there have been frequent releases, but in the end we still don't have a platform that garners much attention, and we still don't see anyone really putting forth a lot of effort to target it.

Andrew's article helps highlight why both language implementors and users may be hesitant to spend time targeting Parrot. Towards the end of his article, he covers parts of the system that he thinks will be seeing major changes within the next few months. Throughout these seven points, we see some very unsettling things. The very first point, for instance, mentions that, "GC is a very internal thing, when it works properly, you don't even need to know it exists." Now, garbage collection isn't a trivial task, but it has been very well studied and implemented many times over in real-world systems. Although we can't expect any such system to be perfect, it is unsettling when we read statements like "when it works properly" regarding a ten-year-old virtual machine platform. There just shouldn't be so much doubt about such a fundamental part of a virtual machine.

The second point is no better. Just-in-time compilation, like garbage collection, is another one of those cornerstones of a VM that we should expect to be mature and robust after 10 years. It's very worrisome to read that Parrot is lacking so badly in this area, even after two major releases.

The third point is perhaps the worst of all. In it, he states, "We don't really have a good, working, reliable threads implementation now and HLLs are generally not using them." It's currently 2010, and the situation today is that almost all new desktop PCs, and even many notebooks and netbooks, have a CPU with at least two cores. Most server-grade computer systems offer several times that, with multiple CPUs, with multiple cores per CPU, and even multiple threads of simultaneous execution per core. Efficiently using these systems to their full potential currently means writing multithreaded software. Like garbage collection and just-in-time compilation, threading has been well-studied, implemented repeatedly, and is one of the major pieces of any virtual machine platform. There's just no excuse for Parrot not to have better multithreading support.

The fifth point is pretty serious, as well. It discusses packfiles, which are the files that contain Parrot bytecode, debug data, and so forth. This is one more essential part of any VM implementation that should be very mature after a decade's worth of development. It's disappointing to hear that there are still portability issues with these files after so many years.

After reading about those rather serious deficiencies, I have a hard time understanding how Andrew can suggest that, "In summary, Parrot is a good, stable platform for HLL developers to use." From what I can see, Parrot is a platform that has had a lot of time and opportunity to make something of itself, but due to various problems, from internal developer strife, to a bad reputation, to a lack of serious users, it just hasn't matured.

Since I wrote my other article about Parrot almost three years ago, we've seen major developments out of the other major VM providers. We're seeing the Java platform get better support for dynamic languages in the upcoming JDK 7 release. We've also seen Microsoft's Dynamic Language Runtime become available for their .NET platform, allowing for mature and usable language implementations like IronRuby and IronPython to be developed.

Perhaps the biggest threat of all to Parrot is LLVM. LLVM has become widely accepted by industry, and even significant open source projects like FreeBSD are integrating and supporting it. In addition to having excellent support for C, C++ and Objective-C, we're even seeing it used as the back-end for dynamic programming language implementations. Rubinius and MacRuby are two examples of Ruby implementations that support LLVM. Then there are Python implementations like Unladen Swallow and PyPy.

I just don't think that Parrot can compete with these other platforms. Parrot has spun its wheels for far too long, and just isn't as mature as the JVM, the .NET CLR, or LLVM have become. Aside from casual or hobby development, I don't see why anyone would develop a software system specifically targeting Parrot. Its future seems extremely bleak at this point.

Permalink: http://pinderkent.phumblog.com/post/2010/05/parrot_just_cant_compete_with_llvm_the_jvm_and_the_net_clr

Python versus PHP is just professionalism versus amateurism.

Posted on Saturday, September 05, 2009 at 11:55 PM.

Joe Stump recently wrote about why he switched from PHP to Python. What he says is absolutely true. He nails down many of the problems with PHP, and also lists many of the benefits of Python. In the end, however, I think we can sum up the comparison between the two languages as Python's professionalism versus PHP's amateurism.

When I describe Python as being professional in nature, I am referring to the emphasis on care, contemplation, quality, and doing-the-right-thing-even-if-it's-difficult we generally see embraced by the Python community. As Joe mentions in his article, the language itself exhibits a high degree of sensibility, consistency and predictability. It has clearly been carefully evolved and grown, rather than having features and functionality tossed on here and there. It's a language developed by people who know what they're doing, and it's a language used by people who know what they're doing.

On the other hand, we've seen much of the opposite out of the PHP community. The language itself is a good testament to the general preference of that community towards doing things quick-and-dirty, rather than correctly and with care. It's full of quirks, it's inconsistent, and generally a jumble of differing ideas and philosophies. We've seen this happen with its standard library, as well. With the core platform having such poor quality, any software built upon it typically ends up suffering from quality and security problems, too.

It's no wonder that so many developers are moving away from PHP towards better languages and environments like Python and Ruby whenever possible. PHP is deceptively attractive. Sure, it's widely supported, and yes, it's got a large standard library. But it has many hidden and not-so-hidden costs, including horrid maintainability of applications written in it, numerous security flaws within critical code, and numerous hurdles towards developing decent software. Even if the perceived costs seem higher when using a more professional language like Python, in the long run it becomes the only viable choice when the alternative is something as amateurish as PHP.

Permalink: http://pinderkent.phumblog.com/post/2009/09/python_versus_php_is_just_professionalism_versus_amateurism

Losing developer time to performance problems hidden by high-level languages.

Posted on Saturday, May 23, 2009 at 11:48 PM.

One of the main purposes of high-level programming languages is to save developer time by abstracting away the onerous and tedious aspects of the underlying hardware. In general, most high-level languages do a good job of this. Unfortunately, these same languages can also waste significant amounts of developer time, often because of performance problems. The difficulty is that properly diagnosing and fixing many of these problems requires the developers involved to understand, in considerable depth, the implementation of the high-level language they're using.

A good example of this is a performance issue described recently with IronPython, an implementation of Python for Microsoft's .NET platform. In short, a very innocuous line of code was apparently responsible for the poor performance.

This incident highlights several main problems. The first is that high-level code can lead to some very unexpected interactions within the high-level language's implementation. This can obviously cause problems by misleading the developer or developers dealing with the performance problems. What appears on the surface to be a simple and likely very fast operation ends up being the culprit. A lot of developer time can be spent looking in the wrong places.
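As a hypothetical illustration in Python (this is not the IronPython case mentioned above, whose details aren't reproduced here), consider a membership test. The line `p in container` reads identically whether the container is a list or a set, yet the list version scans elements one by one while the set version is a hash lookup. The names and sizes below are made up for this sketch.

```python
import timeit

# Hypothetical data: the same IDs held in two different containers.
ids_list = list(range(5_000))
ids_set = set(ids_list)

# Half of these probes are present (4900..4999), half are absent.
probes = range(4_900, 5_100)


def count_hits(container, probes):
    """Count how many probes are present; the `in` test looks innocuous."""
    return sum(1 for p in probes if p in container)


# Both containers give the same answer...
assert count_hits(ids_list, probes) == count_hits(ids_set, probes)

# ...but the list does a linear scan for every probe.
list_time = timeit.timeit(lambda: count_hits(ids_list, probes), number=3)
set_time = timeit.timeit(lambda: count_hits(ids_set, probes), number=3)
print(f"list: {list_time:.4f}s  set: {set_time:.4f}s")
```

Nothing in the source text of `count_hits` hints at the difference; one has to know how each container is implemented to guess where the time goes.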

The second concern is that tracking down the problem requires in-depth knowledge about the high-level language's implementation. To some extent, we use such high-level languages in the first place to avoid needing to acquire such lower-level knowledge. We want to focus on the application we're writing, not on dealing with issues pertaining to the platform we're building upon. Time spent learning about the high-level language's implementation is time not spent on developing the application at hand.

This particular situation seems to have had a "happy" ending. The victim of the poor performance got a rapid response from somebody who did have inside knowledge about IronPython's implementation. Unfortunately, this isn't always the case. I've seen far too many times when developers have spun their wheels trying to track down obscure performance problems of that type. And it isn't a problem associated just with programming languages like Python, Ruby, or Perl, either. We often see it happen with SQL. A minor change to a query can result in a huge performance gain or loss.
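The SQL case can be sketched with Python's bundled sqlite3 module. The table, index and queries here are hypothetical, and the exact wording of the plan output varies between SQLite versions, but the general shape should hold: wrapping an indexed column in a function call is a small textual change that typically turns an index search into a full scan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute("CREATE INDEX idx_users_email ON users (email)")
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    [(f"user{i}@example.com",) for i in range(1000)],
)


def plan(sql):
    """Return SQLite's query-plan detail text for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


# Comparing the column directly lets SQLite use the index.
direct = plan("SELECT id FROM users WHERE email = 'user42@example.com'")

# Wrapping the column in lower() hides it from the plain index.
wrapped = plan("SELECT id FROM users WHERE lower(email) = 'user42@example.com'")

print(direct)
print(wrapped)
```

Both queries look equally reasonable to someone focused on the application, which is exactly why this sort of problem eats so much time.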

As we start using high-level programming language implementations like IronPython, Scala, Clojure and JRuby, which are themselves often implemented in high-level programming languages like Java or C#, which in turn run on some sort of a virtual machine, we'll run into these sorts of problems more and more frequently. Each additional layer of software abstraction that we add in makes the situation more and more difficult. Soon we may need to look in two or three very different layers of software, assuming we even have source access, to track down performance issues. This could very well lead to a serious waste of developer time and effort.

Permalink: http://pinderkent.phumblog.com/post/2009/05/losing_developer_time_to_performance_problems_hidden_by_highlevel_languages

C and C++ play a very crucial role in most Web application systems.

Posted on Friday, May 15, 2009 at 2:21 AM.

Today, over at Hacker News, I saw a topic asking why C++ isn't commonly used for Web applications. The question itself is quite valid; we typically don't see Web applications themselves developed in C++. But that doesn't mean that C and C++ don't have an integral role within a Web-based system. Their use isn't as visible as that of Ruby, PHP, Python or Perl, but it's important nevertheless.

Admittedly, the back-end of many Web applications really isn't all that complex. In many cases, it's basically just a friendlier interface to a datastore of some sort, maybe offering some caching, and usually some basic data manipulation. And although C++ libraries like the STL and Boost allow for such tasks to be performed with relative ease, there's essentially little benefit in using C++. Scripting languages are often sufficient.

That said, C and C++ still do have a huge role in most Web application stacks today. We shouldn't forget that most of the popular server operating systems, Web servers and database systems today, as well as the most widely used implementations of most scripting languages, are typically written in C or C++. This is quite apparent within the popular open source Web stacks.

At the very core, we have C playing an integral role in virtually all of the popular server operating systems today, especially UNIX-like systems like Linux, FreeBSD, and Solaris. On top of that, we have popular Web servers like Apache, nginx, and lighttpd that are all written in C. And for database systems, PostgreSQL and SQLite are written in C, while MySQL uses both C and C++.

C and C++ are also critical to the programming languages used to implement many Web applications. The most widely used implementations of Python, Ruby, Perl and PHP all use C. Even Sun's HotSpot Java virtual machine makes very extensive use of C and C++.
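A quick way to see this from inside a scripting language is to ask Python's standard library about the native code it wraps. This is only a small sketch, and the output depends on the interpreter and how it was built; on the reference interpreter, CPython, the first line reports an implementation written in C, and the other two report the versions of the SQLite and zlib C libraries sitting underneath the `sqlite3` and `zlib` modules.

```python
import platform
import sqlite3
import zlib

# On the reference interpreter this reports "CPython", which is written in C.
print("interpreter:", platform.python_implementation())

# sqlite3 wraps the SQLite C library; this is that library's version string.
print("SQLite C library:", sqlite3.sqlite_version)

# zlib wraps the zlib C library; this is the version the module was built against.
print("zlib C library:", zlib.ZLIB_VERSION)

# The Python code here is only glue: the compression work happens in C.
payload = b"glue code " * 100
compressed = zlib.compress(payload)
assert zlib.decompress(compressed) == payload
print(len(payload), "->", len(compressed), "bytes")
```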

So when we take a more holistic view of Web applications, we see that C and C++ prove to be very widely used. They're used for some of the most critical aspects of Web-based systems, where performance and reliability truly matter. Even if they get more of the attention, languages like PHP, Python, Ruby, Java and Perl end up being little more than glue languages, tying together the software implemented in C or C++. It becomes easy to forget the importance of C and C++, but this may just be because the software developed using them has matured to the point where it provides such stable interfaces that we can totally ignore its implementation language. Nevertheless, C and C++ are very critical to the vast, vast majority of Web applications that exist today.

Permalink: http://pinderkent.phumblog.com/post/2009/05/c_and_c_play_a_very_crucial_role_in_most_web_application_systems

Do programming languages for code embedded in Web pages need to be dynamic?

Posted on Saturday, April 11, 2009 at 9:41 PM.

Towards the end of his "On programming language design" article, which does a very good job of pointing out the benefits and necessity of statically-typed and statically-checked programming languages, Andrej Bauer writes the following:

"There are situations in which a statically checked language is better, for example if you're writing a program that will control a laser during eye surgery. But there are also situations in which maximum flexibility is required of a program, for example programs that are embedded in web pages. The web as we know it would not exist if every javascript error caused the browser to reject the entire web page (try finding a page on a major web site that does not have any javascript errors)."

This opens the door for some interesting speculation. One thing to consider is whether successful languages meant for embedding within Web pages need to be dynamic. Another thing to consider is how the current situation would differ if browser-based languages were more static than JavaScript is.

Anyone who has worked extensively with languages like Haskell, SML, and OCaml will be aware that a more static-oriented mindset itself typically doesn't negatively limit the development of an application. It may make certain software development techniques more difficult, but usually this is just a case of those techniques being a bad idea in the first place, regardless of the language being used.

A good example of such functionality is JavaScript's eval function, which allows for a string to be executed at runtime as if it were code. It's the sort of functionality that's abused far more than it's ever used appropriately. Some of the JavaScript community's more enlightened individuals have recognized this. Douglas Crockford, for instance, appropriately describes eval as "the most misused feature of JavaScript." His advice to "avoid it" makes perfect sense. Wladimir Palant has also written an article about eval. His article is very practical, giving five real-world abuses of eval. In the end, he concludes that there really aren't that many valid reasons for using eval.
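The pattern isn't unique to JavaScript. As a sketch in Python, whose built-in eval is comparable in spirit, here is one common kind of misuse, looking up a value by a name held in a string, next to the ordinary alternative. The settings dictionary, the function names and the crafted input are all hypothetical, and the example isn't taken from either of the articles mentioned above.

```python
settings = {"theme": "dark", "font_size": 12}


def lookup_with_eval(name):
    # Misuse: builds source code out of a string and executes it.
    # Whatever ends up in `name` becomes part of the program.
    return eval("settings['" + name + "']")


def lookup_plain(name):
    # The ordinary way: the name is treated as data, never as code.
    return settings[name]


# For well-behaved input the two agree.
print(lookup_with_eval("theme"))
print(lookup_plain("theme"))

# A crafted (here harmless) input changes what the eval version executes.
crafted = "theme'] + '-injected' + settings['theme"
print(lookup_with_eval(crafted))

try:
    lookup_plain(crafted)
except KeyError:
    print("plain lookup rejects the crafted name")
```

The plain lookup is shorter, and it fails loudly on bad input instead of running it.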

So just because a language allows for certain dynamic techniques to be employed, often they're not what is wanted. The natural way of obtaining the same outcomes using static languages like Haskell, Standard ML and OCaml may require slightly more work on the part of the developer, but the end result is typically much safer and much more reliable than the dynamic language's equivalent. In short, using a static language for code embedded within Web pages shouldn't prevent any legitimate and sensible programming activity from being performed.

There's nothing about static languages that would prevent them from being embedded, in source form, into a Web page. Hugs and GHCi are good examples of how Haskell, for instance, can be used in a manner similar to that of many interpreted scripting languages.

We'll have to resort more to speculation when considering how things would be different today and tomorrow were static languages, rather than JavaScript, more widely available for embedding within Web pages. One of the most significant changes, I think, would be the performance of such code. Until this past year, the performance of most JavaScript implementations could best be described as horrible. Even then, it took the initiative and involvement of Google with their Chrome Web browser, and Apple with Safari, before we even began to see reasonable performance.

Many of JavaScript's performance problems arise from its dynamic nature. This inherently makes the development of optimizing implementations difficult. Of course, this isn't a problem associated just with JavaScript. Other dynamic languages, like Perl, Python, and especially Ruby, offer poor performance as well. Being an embedded, interpreted language doesn't help JavaScript, either. Static languages, on the other hand, allow for implementations to safely perform a greater amount of analysis and optimization, which can lead to better performance than we'd see out of dynamic languages for the same task.

Greater reliability and increased security are other areas where static languages tend to excel. Andrej's article mentions a number of the common language features that help with this. In essence, static languages help eliminate techniques that are known to typically lead to flaws, and they usually provide greater syntax and type checking to catch human errors more readily. The developer mindset one develops by using such languages also inherently helps encourage the development of better software.
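To make the type-checking point concrete, here is a small hypothetical sketch in Python, standing in for any dynamic language. The function and its bug are invented for illustration. The mistake sits in a rarely taken branch, so the program runs fine until that branch is finally exercised; the type checker of a language like Haskell or OCaml would reject the equivalent code before it ever ran.

```python
def describe(order_total, is_refund=False):
    if is_refund:
        # Bug: concatenating a string and a number. Nothing flags it
        # until this branch actually executes.
        return "Refund of " + order_total
    return f"Charge of {order_total:.2f}"


# The common path works, so the bug can go unnoticed for a long time.
print(describe(19.5))

# The error only surfaces when the rare path is taken.
try:
    describe(19.5, is_refund=True)
except TypeError as exc:
    print("only found at run time:", exc)
```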

Many have touted JavaScript's accessibility as a key to it being widely used. That is indisputable, of course. Its shortcomings as a language have made it widely usable by people who probably shouldn't be using it. Many Web developers who can do a fine job designing a page that looks good have gotten away with using it (along with PHP, another poor language) to perform programming tasks with which they don't have as much experience and knowledge. These are the sort of users who typically create code rife with security holes and other problems. So aside from the natural ability of static languages to encourage higher-quality applications, making programming slightly more difficult may bring additional benefits, by weeding out users who probably shouldn't be performing programming-related tasks.

It's unlikely that we'll see a static language like Haskell, or one of the languages from the ML family, available within all popular browsers any time soon. Even today, Web developers spend an inordinate amount of time dealing with cross-browser issues when writing JavaScript code. Instead, we'll likely see the current trends continue, with JavaScript still having relatively poor performance even after much work funded by powerful industry backers, with JavaScript still being used to develop poor-quality and insecure software, often by developers who have little real programming knowledge or background.

Permalink: http://pinderkent.phumblog.com/post/2009/04/do_programming_languages_for_code_embedded_in_web_pages_need_to_be_dynamic