Pinderkent

Pain and glory from the trenches of the IT world.

Losing developer time to performance problems hidden by high-level languages.

Posted on Saturday, May 23, 2009 at 11:48 PM.

One of the main purposes of high-level programming languages is to save developer time by abstracting away the onerous and tedious aspects of the underlying hardware. In general, most high-level languages do a good job of this. Unfortunately, we also see these same languages wasting significant amounts of developer time, often because of performance problems. What makes this troublesome is that properly diagnosing and fixing many of these problems requires the developers involved to understand the implementation of the high-level language in considerable depth.

A good example of this is a performance issue described recently with IronPython, an implementation of Python for Microsoft's .NET platform. In short, a very innocuous line of code was apparently responsible for the poor performance.

This incident highlights several main problems. The first is that high-level code can lead to some very unexpected interactions within the high-level language's implementation. This can obviously cause problems by misleading the developer or developers dealing with the performance problems. What appears on the surface to be a simple and likely very fast operation ends up being the culprit. A lot of developer time can be spent looking in the wrong places.
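
As a generic illustration (this is not the IronPython issue itself, just a hypothetical Python example with made-up function names), consider a membership test. The line "if item in seen" reads the same whether seen is a list or a set, but with a list every test scans the container, while with a set it is a hash lookup.

```python
import timeit


def count_repeats_list(items):
    """Count repeated values, remembering seen values in a list."""
    seen = []
    repeats = 0
    for item in items:
        if item in seen:      # looks innocent; scans the list each time
            repeats += 1
        else:
            seen.append(item)
    return repeats


def count_repeats_set(items):
    """Same logic, but remembering seen values in a set."""
    seen = set()
    repeats = 0
    for item in items:
        if item in seen:      # same line; a hash lookup instead of a scan
            repeats += 1
        else:
            seen.add(item)
    return repeats


if __name__ == "__main__":
    data = list(range(2000)) * 2
    for func in (count_repeats_list, count_repeats_set):
        elapsed = timeit.timeit(lambda: func(data), number=1)
        print(f"{func.__name__}: {elapsed:.4f} s")
```

Both functions return the same answer, so nothing in the output hints at the difference. Only timing the code, and knowing how the two containers are implemented, reveals it.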

The second concern is that tracking down the problem requires in-depth knowledge about the high-level language's implementation. To some extent, we use such high-level languages in the first place to avoid needing to acquire such lower-level knowledge. We want to focus on the application we're writing, not on dealing with issues pertaining to the platform we're building upon. Time spent learning about the high-level language's implementation is time not spent on developing the application at hand.

This particular situation seems to have had a "happy" ending. The victim of the poor performance got a rapid response from somebody who did have inside knowledge about IronPython's implementation. Unfortunately, this isn't always the case. I've seen far too many times when developers have spun their wheels trying to track down obscure performance problems of that type. And it isn't a problem associated just with programming languages like Python, Ruby, or Perl, either. We often see it happen with SQL. A minor change to a query can result in a huge performance gain or loss.
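
To make the SQL point concrete, here is a minimal sketch using Python's sqlite3 module. The table, index and query are hypothetical, and the exact wording of the plan output varies between SQLite versions. It asks SQLite how it would run two nearly identical queries: one compares an indexed column directly, the other wraps that column in a function call.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's plan description for a query as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each plan row.
    return "; ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.execute("CREATE INDEX idx_users_name ON users (name)")

# Comparing the indexed column directly lets SQLite search the index.
direct = query_plan(conn, "SELECT email FROM users WHERE name = ?", ("alice",))

# Wrapping the column in a function hides it from that index,
# so SQLite falls back to scanning the whole table.
wrapped = query_plan(conn, "SELECT email FROM users WHERE lower(name) = ?", ("alice",))

print(direct)
print(wrapped)
```

On a table with a handful of rows the two queries are indistinguishable. On a table with millions of rows, the second one reads every row, and nothing in the query text warns the developer about that.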

As we start using high-level programming language implementations like IronPython, Scala, Clojure and JRuby, which are themselves often implemented in high-level programming languages like Java or C#, which in turn run on some sort of a virtual machine, we'll run into these sorts of problems more and more frequently. Each additional layer of software abstraction that we add in makes the situation more and more difficult. Soon we may need to look in two or three very different layers of software, assuming we even have source access, to track down performance issues. This could very well lead to a serious waste of developer time and effort.

Permalink: http://pinderkent.phumblog.com/post/2009/05/losing_developer_time_to_performance_problems_hidden_by_highlevel_languages

C and C++ play a very crucial role in most Web application systems.

Posted on Friday, May 15, 2009 at 2:21 AM.

Today, over at Hacker News, I saw a topic asking why C++ isn't commonly used for Web applications. The question itself is quite valid; we typically don't see Web applications themselves developed in C++. But that doesn't mean that C and C++ don't have an integral role within a Web-based system. Their use isn't as visible as that of Ruby, PHP, Python or Perl, but it's important nevertheless.

Admittedly, the back-end of many Web applications really isn't all that complex. In many cases, it's basically just a friendlier interface to a datastore of some sort, maybe offering some caching, and usually some basic data manipulation. And although C++ libraries like the STL and Boost allow for such tasks to be performed with relative ease, there's essentially little benefit in using C++. Scripting languages are often sufficient.

That said, C and C++ still do have a huge role in most Web application stacks today. We shouldn't forget that most of the popular server operating systems, Web servers and database systems today, as well as the most widely used implementations of most scripting languages, are typically written in C or C++. This is quite apparent within the popular open source Web stacks.

At the very core, we have C playing an integral role in virtually all of the popular server operating systems today, especially UNIX-like systems like Linux, FreeBSD, and Solaris. On top of that, we have popular Web servers like Apache, nginx, and lighttpd that are all written in C. And for database systems, PostgreSQL and SQLite are written in C, while MySQL uses both C and C++.

C and C++ are also critical to the programming languages used to implement many Web applications. The most widely used implementations of Python, Ruby, Perl and PHP all use C. Even Sun's HotSpot Java virtual machine makes very extensive use of C and C++.

So when we take a more holistic view of Web applications, we see that C and C++ prove to be very widely used. They're used for some of the most critical aspects of Web-based systems, where performance and reliability truly matter. Even if they get more of the attention, languages like PHP, Python, Ruby, Java and Perl end up being little more than glue languages, tying together the software implemented in C or C++. It becomes easy to forget the importance of C and C++, but this may just be because the software developed using them has matured to the point where it provides such stable interfaces that we can totally ignore its implementation language. Nevertheless, C and C++ are critical to the vast majority of Web applications that exist today.

Permalink: http://pinderkent.phumblog.com/post/2009/05/c_and_c_play_a_very_crucial_role_in_most_web_application_systems

Some Django tips and tricks pages that I've found helpful.

Posted on Saturday, April 25, 2009 at 4:14 PM.

My current work involves some Web applications developed using Django. Although I've used Python a great deal in the past, my experience with Django was quite limited. So I recently did some research to become more proficient with it, and will list below the Web pages that I found provided the most useful tips and tricks for using Django.

  • Some Django tips: Although an older article, it also makes some good non-Django suggestions, like installing IPython and ensuring your project has a test suite.
  • Small Django tips from one newbie to another: Another older article, this one also emphasizes the need for unit testing, and gives some examples (with code) about how to go about this. It also discusses ways to manage frequent model changes during development.
  • Usefull tips to start a new project with Django: A slightly dated article that summarizes how to get started with Django, as well as some suggestions for deploying a production Django application.
  • Django Tips: UTF-8, ASCII Encoding Errors, Urllib2, and MySQL: Gives useful tips about handling UTF-8 encoded strings. Although the project I'm working on thankfully didn't make the mistake of using MySQL, this article does include some tips relating to string encoding and MySQL, which may be useful for some people.
  • Big list of Django tips (and some python tips too): This offers perhaps the greatest quantity of Django tips in a single page. It's quite complete, covering areas such as deployment, configuration, templating and views, the model, testing, and so forth.
  • Tips for Scaling a Web App: While not completely Django-specific, it lists some good ideas for how to develop a database-backed Web application that scales well.
  • Django Tips: PIL, ImageField and Unit Tests: Gives some time-saving suggestions about using the Python Imaging Library with Django and unit tests.
  • Django Image Uploading: Tips and Tricks: Outlines how to upload images in Django apps, with some suggestions about how to solve some common problems.
  • 10 Insanely Useful Django Tips: I think the title of this article overhypes it somewhat. The tips are useful, but they are somewhat common-sense tips, as well. Although I haven't tried it yet, this article did point me to django-debug-toolbar, which sounds like it might be useful.
  • 'Practical' tips for working with Django: Includes some suggestions regarding developing custom managers, wrapping generic views, and converting text to HTML before rendering the template.
  • Debugging Django: One of the more detailed articles I found suggesting some strategies for debugging Django applications.
  • Django development tips: Some ideas for setting up a long-running Django development server in a UNIX-like environment using GNU Screen. More advanced users of UNIX-like systems are probably familiar with this technique, but this article is still a useful reference and tutorial for newer users.
  • Django Tips - Unique Date Querysets: A quick suggestion about how to get all of the unique years and months for a data set such as the posts in a blog, or other timestamped data.
  • Favorite Django Tips & Features: This thread from Stack Overflow contains a variety of user-contributed tips. Some of them suggest software to use in conjunction with Django, including Jinja2.
  • Tips to keep your Django/mod_python memory usage down: Some deployment and configuration suggestions to reduce Django's memory usage when using Apache and mod_python.
  • Django Doctest Tips: Some tips for testing Django applications using doctest. Suggests better ways to locate failures, to use conditionals, to check context variables, and to check content type relations.
  • djangotips on Twitter: I didn't find the quality of these user-contributed tips as good as those from Stack Overflow or the other pages, but there were a few that seemed like they might be useful.
  • Django Tips, Vol. 1: Contains five tips covering topics like the difference between 'blank' and 'null', displaying multiple fields on the same line in the admin, and so on.
  • Django cheat sheet: Although this is a cheat sheet, and not really a Web page, this is one of the cleaner cheat sheets that I've seen. I've found it to be a useful reference so far.
  • Django performance tips: This article is also older, but many of its suggestions are very sensible and apply even when not using Django, such as using separate database and Web servers if possible, using PostgreSQL, and putting as much RAM as possible into the servers.

Of course, those are just a small sample of the many useful Django resources out there. But for those new to Django, reading through the articles above may help avoid some common pitfalls, as well as offer ideas to help become more productive while getting accustomed to Django.

Permalink: http://pinderkent.phumblog.com/post/2009/04/some_django_tips_and_tricks_pages_that_ive_found_helpful

Do programming languages for code embedded in Web pages need to be dynamic?

Posted on Saturday, April 11, 2009 at 9:41 PM.

Towards the end of his "On programming language design" article, which does a very good job of pointing out the benefits and necessity of statically-typed and statically-checked programming languages, Andrej Bauer writes the following:

There are situations in which a statically checked language is better, for example if you're writing a program that will control a laser during eye surgery. But there are also situations in which maximum flexibility is required of a program, for example programs that are embedded in web pages. The web as we know it would not exist if every javascript error caused the browser to reject the entire web page (try finding a page on a major web site that does not have any javascript errors).

This opens the door for some interesting speculation. One thing to consider is whether successful languages meant for embedding within Web pages need to be dynamic. Another thing to consider is how the current situation would differ if browser-based languages were more static than JavaScript is.

Anyone who has worked extensively with languages like Haskell, SML, and OCaml will be aware that a more static-oriented mindset itself typically doesn't negatively limit the development of an application. It may make certain software development techniques more difficult, but usually this is just a case of those techniques being a bad idea in the first place, regardless of the language being used.

A good example of such functionality is JavaScript's eval function, which allows for a string to be executed at runtime as if it were code. It's the sort of functionality that's abused far more than it's ever used appropriately. Some of the JavaScript community's more enlightened individuals have recognized this. Douglas Crockford, for instance, appropriately describes eval as "the most misused feature of JavaScript." His advice to "avoid it" makes perfect sense. Wladimir Palant has also written an article about eval. His article is very practical, giving five real-world abuses of eval. In the end, he concludes that there really aren't that many valid reasons for using eval.
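
The same pattern exists outside JavaScript. As a rough analogue in Python (not JavaScript, and not taken from either article), the sketch below reads a piece of data with the built-in eval and with json.loads. The function names and payload strings are hypothetical.

```python
import json


def parse_with_eval(text):
    """Treat incoming text as code. Works, but runs whatever it is given."""
    return eval(text)


def parse_with_json(text):
    """Treat incoming text strictly as data."""
    return json.loads(text)


payload = '{"user": "alice", "visits": 3}'
hostile = '__import__("os").getcwd()'  # stands in for something far worse

print(parse_with_eval(payload))   # same dictionary either way
print(parse_with_json(payload))

print(parse_with_eval(hostile))   # the "data" was executed as code
try:
    parse_with_json(hostile)
except json.JSONDecodeError as exc:
    print("rejected:", exc)
```

For well-formed input the two approaches agree, which is exactly why eval is tempting. The difference only shows up when the input is not what the developer expected.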

So just because a language allows for certain dynamic techniques to be employed, often they're not what is wanted. The natural way of obtaining the same outcomes using static languages like Haskell, Standard ML and OCaml may require slightly more work on the part of the developer, but the end result is typically much safer and much more reliable than the dynamic language's equivalent. In short, using a static language for code embedded within Web pages shouldn't prevent any legitimate and sensible programming activity from being performed.

There's nothing about static languages that would prevent them from being embedded, in source form, into a Web page. Hugs and GHCi are good examples of how Haskell, for instance, can be used in a manner similar to that of many interpreted scripting languages.

We'll have to resort more to speculation when considering how things would be different today and tomorrow were static languages, rather than JavaScript, more widely available for embedding within Web pages. One of the most significant changes, I think, would be the performance of such code. Until this past year, the performance of most JavaScript implementations can best be described as horrible. Even then, it took the initiative and involvement of Google with their Chrome Web browser, and Apple with Safari, before we even began to see reasonable performance.

Many of JavaScript's performance problems arise from its dynamic nature, which inherently makes the development of optimizing implementations difficult. Of course, this isn't a problem associated just with JavaScript. Other dynamic languages, like Perl, Python, and especially Ruby, offer poor performance as well. Being an embedded, interpreted language doesn't help JavaScript, either. Static languages, on the other hand, allow implementations to safely perform a greater amount of analysis and optimization, which can lead to better performance than we'd see out of dynamic languages for the same task.

Greater reliability and increased security are other areas where static languages tend to excel. Andrej's article mentions a number of the common language features that help with this. In essence, static languages help eliminate techniques that are known to typically lead to flaws, and they usually provide greater syntax and type checking to catch human errors more readily. The developer mindset one develops by using such languages also inherently helps encourage the development of better software.

Many have touted JavaScript's accessibility as a key to it being widely used. That is indisputable, of course. Its shortcomings as a language have made it widely usable by people who probably shouldn't be using it. Many Web developers who can do a fine job designing a page that looks good have gotten away with using it (along with PHP, another poor language) to perform programming tasks with which they don't have as much experience and knowledge. These are the sort of users who typically create code rife with security holes and other problems. So aside from the natural ability of static languages to encourage higher-quality applications, making programming slightly more difficult may bring additional benefits, by weeding out users who probably shouldn't be performing programming-related tasks.

It's unlikely that we'll see a static language like Haskell, or one of the languages from the ML family, available within all popular browsers any time soon. Even today, Web developers spend an inordinate amount of time dealing with cross-browser issues when writing JavaScript code. Instead, we'll likely see the current trends continue, with JavaScript still having relatively poor performance even after much work funded by powerful industry backers, with JavaScript still being used to develop poor-quality and insecure software, often by developers who have little real programming knowledge or background.

Permalink: http://pinderkent.phumblog.com/post/2009/04/do_programming_languages_for_code_embedded_in_web_pages_need_to_be_dynamic

Static typing is a necessity for quality software.

Posted on Wednesday, April 08, 2009 at 10:55 AM.

When unit testing is used in conjunction with a dynamic programming language, it's typical to see many unit tests whose sole purpose is to test for errors that a language employing static typing would have caught at compile time. Justin Etheredge recently discussed this, and it's something that I have written about in the past in my "Unit testing is not a substitute for static typing" article.

This isn't necessarily an argument about whether statically-typed or dynamically-typed languages are better. There is no single answer to that argument, as what's best really depends very much on the situation at hand. When we want to write software really quickly and are willing to accept a low degree of quality, then dynamic languages are quite suitable. A good example of this is a system administrator writing a short Perl script to scan through certain log files, for instance. Chances are a reasonably competent administrator will be able to write such a script with no significant errors. But once we start moving beyond that scale, we need the help of a compiler, and often static typing.

While proponents of dynamic languages are often quick to point out that such languages allow for more rapid development, they often neglect to see the greater picture. Initially writing code is a very small portion of the lifetime of a typical software product. This is especially true as the software systems get larger and more complex. Significantly more time is often spent testing the software initially, testing for regressions as changes are made to other parts of the system, and debugging user-discovered problems after it has been deployed.

Automated unit tests have become a popular way of reducing this post-initial-development burden. While using a statically-typed (and often, but not always, compiled) language, such tests are usually about testing the actual functionality of the software. More trivial checks, such as whether we're assigning textual strings to variables we expect to be purely numeric, are left to the compiler to perform. And this makes perfect sense; developers shouldn't be wasting their time writing unit tests to essentially perform checks that a compiler could do far more thoroughly and efficiently.
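
A small Python sketch of the string-versus-number case follows. The function and test names are hypothetical. Python ignores type annotations at runtime, so the annotations below only matter to an external static checker such as mypy, which is assumed here to play the role of the compiler.

```python
import unittest


def total_cents(quantity: int, unit_cents: int) -> int:
    """Price of an order line, in cents."""
    return quantity * unit_cents


class TotalCentsTypeTest(unittest.TestCase):
    def test_quantity_read_as_text(self):
        # "3" * 250 is valid Python: it repeats the string 250 times.
        # Nothing fails at runtime, so only a test like this notices.
        result = total_cents("3", 250)
        self.assertNotIsInstance(result, int)
        self.assertEqual(len(result), 250)
```

The test can be run with "python -m unittest". It says nothing about whether the pricing logic is right; it only documents what happens when a string slips in where a number was expected. A static checker would reject the call total_cents("3", 250) before the program ever ran, making this kind of test unnecessary.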

There have been numerous times now when I've had to work with larger software systems developed using languages like Ruby or Python. Thankfully, there have been extensive unit test suites available in the majority of those cases. But the value of those test suites is diminished by the numerous tests that would be rendered immediately unnecessary by strong, static typing. Whatever time the developers might have saved by using the dynamic languages ended up being used instead to write trivial unit tests.

This isn't to suggest that we shouldn't use dynamic languages. Like I mentioned earlier, they're often practical and suitable for very small scripts and applications. But when we're writing serious software that's expected to perform reliably, it's in everyone's best interest to use a more static language. That way the developers don't have to waste their time writing unit tests that essentially test for typos and minor programming mistakes, rather than testing the functionality of the software. Likewise, the testers don't have to file numerous bug reports about the inevitable mistakes that the programmers made, but which were missed by their unit tests. And most importantly, the end-users don't have to fall victim to typing errors that both the developers and the testers missed. In short, it just makes more sense to use static languages.

Permalink: http://pinderkent.phumblog.com/post/2009/04/static_typing_is_a_necessity_for_quality_software