Pinderkent

Pain and glory from the trenches of the IT world.

Do programming languages for code embedded in Web pages need to be dynamic?

Posted on Saturday, April 11, 2009 at 9:41 PM.

Towards the end of his "On programming language design" article, which does a very good job of pointing out the benefits and necessity of statically-typed and statically-checked programming languages, Andrej Bauer writes the following:

There are situations in which a statically checked language is better, for example if you're writing a program that will control a laser during eye surgery. But there are also situations in which maximum flexibility is required of a program, for example programs that are embedded in web pages. The web as we know it would not exist if every javascript error caused the browser to reject the entire web page (try finding a page on a major web site that does not have any javascript errors).

This opens the door for some interesting speculation. One thing to consider is whether successful languages meant for embedding within Web pages need to be dynamic. Another thing to consider is how the current situation would differ if browser-based languages were more static than JavaScript is.

Anyone who has worked extensively with languages like Haskell, SML, and OCaml will be aware that a more static-oriented mindset itself typically doesn't negatively limit the development of an application. It may make certain software development techniques more difficult, but usually this is just a case of those techniques being a bad idea in the first place, regardless of the language being used.

A good example of such functionality is JavaScript's eval function, which allows for a string to be executed at runtime as if it were code. It's the sort of functionality that's abused far more than it's ever used appropriately. Some of the JavaScript community's more enlightened individuals have recognized this. Douglas Crockford, for instance, appropriately describes eval as "the most misused feature of JavaScript." His advice to "avoid it" makes perfect sense. Wladimir Palant has also written an article about eval. His article is very practical, giving five real-world abuses of eval. In the end, he concludes that there really aren't that many valid reasons for using eval.
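To make the point concrete, here is a minimal sketch in Python, which has an eval of its own. It is not taken from either of the articles mentioned above; the settings table and both function names are hypothetical, chosen only to show how an eval-based lookup compares with the ordinary alternative.

```python
# Hypothetical settings table, used only for illustration.
settings = {"timeout": 30, "retries": 3}


def get_setting_with_eval(name):
    # Abuse: source code is assembled from a string and then executed,
    # so whatever is smuggled into `name` gets executed as well.
    return eval("settings['" + name + "']")


def get_setting(name):
    # Plain lookup: `name` is only ever treated as data.
    return settings[name]


# A crafted "name" turns the eval version into an arbitrary expression.
crafted = "timeout'] + settings['retries"
```

Both functions return 30 for "timeout". Given the crafted string, though, the eval version happily evaluates an expression the caller was never meant to run, while the plain lookup simply fails with a KeyError because no such key exists.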

So just because a language allows for certain dynamic techniques to be employed, often they're not what is wanted. The natural way of obtaining the same outcomes using static languages like Haskell, Standard ML and OCaml may require slightly more work on the part of the developer, but the end result is typically much safer and much more reliable than the dynamic language's equivalent. In short, using a static language for code embedded within Web pages shouldn't prevent any legitimate and sensible programming activity from being performed.

There's nothing about static languages that would prevent them from being embedded, in source form, into a Web page. Hugs and GHCi are good examples of how Haskell, for instance, can be used in a manner similar to that of many interpreted scripting languages.

We'll have to resort more to speculation when considering how things would be different today and tomorrow were static languages, rather than JavaScript, more widely available for embedding within Web pages. One of the most significant changes, I think, would be the performance of such code. Until this past year, the performance of most JavaScript implementations could best be described as horrible. Even then, it took the initiative and involvement of Google with their Chrome Web browser, and Apple with Safari, before we even began to see reasonable performance.

Many of JavaScript's performance problems arise from its dynamic nature. This inherently makes the development of optimizing implementations difficult. Of course, this isn't a problem associated just with JavaScript. Other dynamic languages, like Perl, Python, and especially Ruby, offer poor performance as well. Being an embedded, interpreted language doesn't help JavaScript, either. Static languages, on the other hand, allow for implementations to safely perform a greater amount of analysis and optimization, which can lead to better performance than we'd see out of dynamic languages for the same task.

Greater reliability and increased security are other areas where static languages tend to excel. Andrej's article mentions a number of the common language features that help with this. In essence, static languages help eliminate techniques that are known to typically lead to flaws, and they usually provide greater syntax and type checking to catch human errors more readily. The developer mindset one develops by using such languages also inherently helps encourage the development of better software.

Many have touted JavaScript's accessibility as a key to it being widely used. That is indisputable, of course. Its shortcomings as a language have made it widely usable by people who probably shouldn't be using it. Many Web developers who can do a fine job designing a page that looks good have gotten away with using it (along with PHP, another poor language) to perform programming tasks with which they don't have as much experience and knowledge. These are the sort of users who typically create code rife with security holes and other problems. So aside from the natural ability of static languages to encourage higher-quality applications, making programming slightly more difficult may bring additional benefits, by weeding out users who probably shouldn't be performing programming-related tasks.

It's unlikely that we'll see a static language like Haskell, or one of the languages from the ML family, available within all popular browsers any time soon. Even today, Web developers spend an inordinate amount of time dealing with cross-browser issues when writing JavaScript code. Instead, we'll likely see the current trends continue, with JavaScript still having relatively poor performance even after much work funded by powerful industry backers, with JavaScript still being used to develop poor-quality and insecure software, often by developers who have little real programming knowledge or background.

Permalink: http://pinderkent.phumblog.com/post/2009/04/do_programming_languages_for_code_embedded_in_web_pages_need_to_be_dynamic

The role of nesting in limiting the size and complexity of functions and methods.

Posted on Wednesday, March 11, 2009 at 1:53 AM.

I read an article today that discussed the optimal size of methods or functions. The article references a number of books and academic studies regarding this topic. As with most things, it appears that what's needed is a healthy balance. In this case, it's a matter of distributing complexity in a way that makes the entire application more understandable and easier to work with, while at the same time not making individual methods or functions too complex.

I've worked with a number of codebases over the years that have taken very different approaches to problems such as these. Some have set a very strict limit on the size of functions or methods, in terms of lines of code. Others have exhibited much more flexibility.

All in all, I think that strict lines-of-code limits typically aren't beneficial. When using an OO language, this often translates into classes with a large number of small private methods. While each method may be less than, say, 25 lines of code, the class as a whole isn't necessarily easier to understand or work with. Following the flow of execution soon becomes an act of skipping between these small methods. In many cases we find that 10 methods of 20 lines each can be more difficult to follow than one method of 200 lines.

It doesn't help that many mainstream languages don't allow for nested functions. When using a language like Java, for instance, we often run into cases where we have a large method that we want to separate out into smaller methods, but our only choice is really to create one or more private methods in the same class. If the class consists of several such methods that we'd like to break up, then things can become more awkward as we have a proliferation of private methods that are really only used by one other method. Ideally, we could define these functions or methods within the "parent" method itself. This doesn't reduce the overall size of that method, but it does allow for it to be broken up into smaller pieces of logic.

Noted C++ expert Herb Sutter gives some examples of how to simulate nested functions using C++. I found another article showing a technique using C# delegates. Arguably, these techniques make the code more difficult to follow. GCC's extension allowing for nested C functions is somewhat cleaner, but at the cost of portability.

One benefit of using a functional programming language is that most allow for, if they don't outright encourage, the use of nested functions. The Standard ML of New Jersey homepage specifically mentions this as a benefit of Standard ML in its summary of the language, for instance. Some examples can be seen within the 'Merge Sort' section of the Wikipedia article on Standard ML. Functions exist within the context where they are most needed and most useful, but otherwise remain out of the way.
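As a rough illustration of the idea, here is a merge sort sketched in Python, which also permits nested functions. It is my own sketch rather than the Standard ML code from the Wikipedia article, and the helper names split and merge are simply ones I've picked. The helpers are defined inside the one function that needs them, so they never clutter the enclosing module or class.

```python
def merge_sort(items):
    """Return a new sorted list. The helpers live inside the function that uses them."""

    def split(xs):
        # Cut the list into two halves of (nearly) equal length.
        mid = len(xs) // 2
        return xs[:mid], xs[mid:]

    def merge(left, right):
        # Combine two already-sorted lists into one sorted list.
        merged = []
        i = j = 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                merged.append(left[i])
                i += 1
            else:
                merged.append(right[j])
                j += 1
        merged.extend(left[i:])
        merged.extend(right[j:])
        return merged

    if len(items) <= 1:
        return list(items)
    left, right = split(items)
    return merge(merge_sort(left), merge_sort(right))
```

The function as a whole is no shorter than it would be with everything inlined, but each piece of logic has a name and a clear boundary, and nothing outside merge_sort can see or depend on split or merge.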

A good rule of thumb might be to make methods or functions just as long as they need to be to complete one task, and to do that task well. Immutability, which is core to functional programming, also helps encourage this sort of coding. We end up writing short, local functions that return values rather quickly, rather than writing longer functions or methods that manipulate state stored in global variables or class member variables. So in the end, it may come down to using a language or at least a language implementation that allows for nested functions or methods. If that's not a possibility, there are workarounds to simulate such functionality while using languages like Java and C++. And if that's not a possibility, then it may just be best to have slightly longer methods.

Permalink: http://pinderkent.phumblog.com/post/2009/03/the_role_of_nesting_in_limiting_the_size_and_complexity_of_functions_and_methods

"Adaptive PHP" techniques help ensure bugs, unmaintainability, and other problems.

Posted on Friday, March 06, 2009 at 2:34 AM.

The recent 6 Signs of Adaptive PHP article gives some examples of different PHP coding techniques. Unfortunately, it only bothers to cover the supposed "benefits" of each, without any consideration of how such techniques can prove to be problematic.

The first technique suggests passing all arguments to a function within a single associative array. The supposed "benefits" all promote laziness, namely in that all parameters become potentially optional, and less or no change is needed to any invocations of the function if parameter changes are made.

Not mentioned in that article were the drawbacks. One obvious problem is the vastly increased verbosity, both when it comes to handling default values, and when it comes to invoking the function. Default parameter values, as offered by languages like C++ and even VB.NET, are syntactically superior to this approach.

Even the very concept itself is flawed, as it's much better in terms of maintainability and clarity to have an explicit list of parameters. For one thing, this technique makes it much less obvious to other programmers how to call the function. Given the general lack of documentation associated with most PHP applications, it's likely that anybody wishing to use the function would have to consult the code of the function itself. It also inhibits the ability of the interpreter to check that the correct number of arguments has been passed to the function.
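The difference is easy to see in a small sketch. I've written it in Python rather than PHP, and the connect functions and their host and port parameters are hypothetical, but the same contrast applies: one version takes a single dictionary of options, the other declares its parameters and their defaults explicitly.

```python
def connect_with_dict(options):
    # "Adaptive" style: every argument arrives in one dictionary, so
    # defaults are filled in by hand and misspelled keys go unnoticed.
    host = options.get("host", "localhost")
    port = options.get("port", 5432)
    return f"{host}:{port}"


def connect(host="localhost", port=5432):
    # Explicit parameters with defaults: the signature documents itself,
    # and the interpreter rejects unknown or surplus arguments.
    return f"{host}:{port}"


# A misspelled option ("prot" instead of "port") is silently ignored...
silently_wrong = connect_with_dict({"prot": 6000})

# ...whereas the explicit version refuses the call outright.
try:
    connect(prot=6000)
    typo_rejected = False
except TypeError:
    typo_rejected = True
```

With the dictionary version the caller gets the default port back and no hint that anything went wrong. With the explicit version the mistake is reported at the call site, which is exactly the checking that the single-array approach throws away.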

Furthermore, if one has a function that has so many optional parameters that such a technique is needed, perhaps too much is being done in that single function. It would appear that such a function is a good candidate for heavy refactoring.

The second suggestion recommends checking for functions and classes before making use of them, such as those exposed by a plugin module. This sounds risky. When it comes to plugins, the host application should insist that any plugins conform to a strict API. Working with plugins that may just decide not to export a certain required function, for instance, is a recipe for disaster. If a plugin module doesn't provide the proper interface, it should not be loaded.
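A stricter approach might look something like the following Python sketch. The required function names, the PluginError exception and the load_plugin function are all hypothetical. The point is only that the check happens once, when the plugin is loaded, instead of being scattered across every place the plugin is used.

```python
import types

# Hypothetical plugin API: every plugin must export these functions.
REQUIRED_FUNCTIONS = ("activate", "deactivate")


class PluginError(Exception):
    """Raised when a plugin does not provide the required interface."""


def load_plugin(module):
    # Refuse the plugin up front if any required function is absent.
    missing = [name for name in REQUIRED_FUNCTIONS
               if not callable(getattr(module, name, None))]
    if missing:
        raise PluginError(f"{module.__name__} is missing: {', '.join(missing)}")
    return module


# Two stand-in plugin modules, built in memory for the example.
good = types.ModuleType("good_plugin")
good.activate = lambda: "on"
good.deactivate = lambda: "off"

bad = types.ModuleType("bad_plugin")
bad.activate = lambda: "on"

try:
    load_plugin(bad)
    error_message = None
except PluginError as error:
    error_message = str(error)
```

Once load_plugin has returned, the host application can call activate and deactivate without checking for them first, and an incomplete plugin is rejected with a message that names what it failed to provide.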

The third suggestion is perhaps the only one that's sensible. It suggests using require_once() to prevent multiple inclusions of some external PHP code. Indeed, this typically is a good idea.

The fourth suggestion recommends that associative arrays be used to return multiple values from a function. While this isn't as bad an idea as the use of an associative array to store parameter values, this seems to indicate that PHP should offer a more lightweight construct such as the tuples found in languages like Python and Standard ML. Sticking with what PHP already offers, perhaps even an object could be returned, rather than an associative array.
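For comparison, here is roughly how the three options look in Python. The min_max functions and the Range type are hypothetical names of my own; the first mimics the associative-array approach, the second returns a plain tuple, and the third returns a small named type, which is about as close as Python gets to the "return an object" alternative.

```python
from typing import NamedTuple


def min_max_dict(values):
    # Associative-array style: callers must know the key names.
    return {"min": min(values), "max": max(values)}


def min_max(values):
    # Tuple style: lightweight, and unpacked directly by the caller.
    return min(values), max(values)


class Range(NamedTuple):
    # A small named type, for when the fields deserve names.
    low: int
    high: int


def min_max_named(values):
    return Range(min(values), max(values))


low, high = min_max([3, 1, 2])
span = min_max_named([3, 1, 2])
```

The tuple version needs no lookups by string at all, and the named version keeps the readability of named fields while still letting a misspelled field name fail loudly with an AttributeError.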

The fifth suggestion recommends the use of an __autoload function. The very existence of this function suggests that PHP has some serious shortcomings when it comes to allowing for the sensible separation of code. For one thing, this sort of dynamic loading is the sort of functionality that PHP should offer transparently.

Furthermore, it indicates a laziness on the part of PHP developers who wish to write code making use of classes defined in other source files, but who are unwilling to put forth the small amount of effort needed to explicitly include such files.

In addition, the three notes in the __autoload documentation should make developers hesitant to use such functionality. It appears that it interferes with the semantics of exception handling, it's not available when using PHP in its interactive mode, and it has risks associated with the validity of the class names passed to it. Frankly, it sounds like yet another PHP "bandage" meant to patch over problems in the language and programmer laziness, while at the same time introducing far more severe problems.

The final suggestion recommends that the directory of the script be determined by calling dirname(__FILE__), especially for the purposes of locating other PHP files to include. For the given example, the use of a relative path should be sufficient.

The PHP community has a long history of not fixing the problems with their language and its most common implementation. Far too often we've seen "solutions" like these, which end up being messier and more convoluted than had the problem with PHP itself just been fixed sensibly. It's no wonder that so many PHP Web apps are buggy, full of security holes and essentially unmaintainable; the language itself is inherently broken, the workarounds are just as broken and full of caveats, and together they result in nothing but problems.

Permalink: http://pinderkent.phumblog.com/post/2009/03/adaptive_php_techniques_help_ensure_bugs_unmaintainability_and_other_problems