Pinderkent

Pain and glory from the trenches of the IT world.

Higher-level languages aren't about making experts more productive. They're about allowing average programmers to do the otherwise impossible.

Posted on Saturday, March 28, 2009 at 11:41 PM.

I read an article today about whether higher-level programming languages like Python, Perl and C# are really that much more productive than a comparatively lower-level language like C. This is not a new line of discussion, by any means. But we're getting to the point where we've been using such higher-level languages for over a decade, and thus have had more of an opportunity to observe and analyze how successfully (or not) they've been used.

In my view, that article comes to the general conclusion that many of the popular claims regarding the benefits of high-level languages versus low-level languages don't hold true. It's suggested that languages like C do indeed have aspects that hamper developer productivity, but high-level languages bring their own, albeit different, set of problems. While a C developer may run into problems with pointers, a Perl programmer might lose a similar amount of time optimizing regular expressions.

I think the conclusions of that article are correct to some extent, but I also think that the greater picture may have been missed. The real impact of languages like Python, Ruby, Perl, JavaScript and PHP isn't that they allowed expert programmers to be marginally more productive. Their greatest "benefit" (or arguably their greatest disadvantage) is that they have allowed average and even poor programmers to accomplish things they couldn't have reasonably done in C.

PHP and JavaScript are good examples of this. As anyone who has used them knows, they are very unremarkable languages. Conceptually and syntactically, they're much like C in many ways, but without some of the aspects of C that average developers often find bothersome. They aim to eliminate manual memory management, for instance. They offer slightly nicer string handling. Their execution environments aren't as tied to the native hardware. But otherwise, the core PHP and JavaScript languages generally offer the same basic functionality that C offers.

By eliminating some of the more difficult aspects of C, even if they're not as flexible or as powerful in many ways, they've made programming accessible to people who otherwise would not have been able to handle it. I've had the misfortune of working with people like this. They can understand the concepts of variables, constants, loops, conditionals, functions and even the basics of OO to some extent. But they're totally unable to understand some of the basic, yet essential, concepts of C. Pointers, for some reason, are a common stumbling block. But luckily for such average developers, languages like PHP and JavaScript make their lives easier by getting rid of such constructs and functionality.

So we soon enough see these average developers using languages like PHP and JavaScript to develop applications. In many cases, there's little to nothing preventing the same application from having been developed using C, aside from the inability of the average developer or developers to use C. Anyone who has worked in the industry knows why businesses opt to go with such solutions. Sometimes it is cheaper and easier to hire several PHP and JavaScript developers, instead of just one or two expert C developers. Other times it's because inexperienced or unknowledgeable managers just don't know any better, or have bought into hype and marketing. Regardless, the outcome is typically a system that just barely works, assuming it's not outright broken. Whatever costs might have been saved initially end up becoming far more costly in the long run.

Had those JavaScript and PHP developers been forced to use C, it's likely that no software system would have been produced at all. They would've still been struggling with significant memory leaks, segfaults, and sometimes even just getting their code to compile. So we can see the actual main benefit of such higher-level languages: they've reduced the complexity of an otherwise difficult skill down to something that is more palatable to non-experts.

Now, we need to ensure that we don't lump high-level languages like Haskell, Erlang, and Common LISP in with other high-level languages like PHP and JavaScript. They are clearly very different. Haskell, Erlang and Common LISP, for instance, use abstraction to empower the developer. They offer advanced features and techniques that expert developers can build upon to great benefit. This is very different from languages like PHP and JavaScript, which clearly took the C model of computing, and stripped out the parts that make C more awkward for the less-skilled programmers.

Even though they are significantly higher-level languages than C, languages like Erlang, Haskell and Common LISP haven't become as popular because they still require a high level of knowledge and expertise to use for even the most basic of tasks. So they highlight the important difference between a language being "high-level" and a language being "accessible". Functional languages increase expert programmer productivity with more powerful abstractions; PHP and JavaScript increase average programmer productivity via simplification.

The whole debate with respect to whether high-level languages are better than low-level languages will likely rage for many more years. There are some tasks that just can't be done in languages like JavaScript and PHP, so we will surely see C remain around for a long time. But we likely will see languages like PHP and JavaScript remain around for a similar reason. Unfortunately, that reason won't be about allowing good developers to develop more advanced software more quickly, but rather about letting poor developers continue to put out just barely suitable software systems.

Permalink: http://pinderkent.phumblog.com/post/2009/03/higherlevel_languages_arent_about_making_experts_more_productive_theyre_about_allowing_average_programmers_to_do_the_otherwise_impossible

Mistakes are prevalent within PHP- and MySQL-based software systems.

Posted on Saturday, March 14, 2009 at 6:13 PM.

There was recently a posting at The Daily WTF site entitled The Quest for the Unique ID. It gives an example of a software system that generated unique invoice identifiers by randomly generating a value, checking if that identifier had already been used by an existing database record, and repeating until an unused value was found.
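For contrast, here is a minimal sketch of the conventional alternative: letting the database itself hand out the identifier. It is written in Python against an in-memory SQLite database purely for illustration. The invoice table, its columns and the create_invoice function are assumptions made up for this example, not anything taken from the article.

```python
import sqlite3

# Hypothetical schema for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoice ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " customer TEXT NOT NULL)"
)

def create_invoice(customer):
    """Insert an invoice and return the identifier the database assigned."""
    cur = conn.execute("INSERT INTO invoice (customer) VALUES (?)", (customer,))
    conn.commit()
    return cur.lastrowid

first = create_invoice("Acme")
second = create_invoice("Globex")
print(first, second)
```

There is no guessing loop and no separate "is this value taken?" query. The primary key constraint guarantees uniqueness, and the database is responsible for producing the next value.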

Some people may laugh and doubt that software like this exists, but after years in the industry, one sees enough mistakes of that type to know they are a very real problem. No matter what platforms or technologies are used, software will be written incorrectly. Usually this is unintentional, and due to unclear specifications, typing mistakes, misunderstandings, and so forth. But there are other problems, like that of The Daily WTF article, that go far beyond being bugs.

Almost any software developer, regardless of training or experience, should be able to see the obvious problems with the approach that was described in the article. Unfortunately, there are some software developers who do not, for whatever reason. It's difficult to even consider such people as "developers" or "programmers", because they lack even the most basic of knowledge of the craft. Having worked with a very wide variety of software systems, platforms, programming languages, and database systems, I have to say that I've seen most of these types of mistakes in software developed using PHP and MySQL.

I suspect this has to do with the general attitude within both of those communities. Namely, they have focused on quickly developing software, all without putting much emphasis on quality, security, and reliability. Both implementations, for instance, have a long history of having poor performance, numerous security flaws, numerous bugs (that often aren't fixed for years), poor architecture decisions, and a lack of critical features. Much of the software that is written in PHP and uses MySQL tends to absorb these negative traits, and then somehow manages to amplify them into the creation of spectacularly horrible software systems like that of The Daily WTF article.

Far too often I've been at the bookstore and seen books that promise to teach both PHP development and MySQL development in just a few hundred pages. Unfortunately, such books appeal to those without much, if any, software development experience. And once they have read one such book, they come away mistakenly believing that they're on par with professionals who have spent years studying database and software development. Soon enough they've convinced somebody to hire them, and soon after a business software system has been developed that generates unique identifiers by random trial-and-error, avoids the use of primary keys, avoids the use of foreign keys and other constraints, is vulnerable to SQL injection attacks, performs extremely poorly, and is full of bugs.

Over time, I have become more and more hesitant to take consulting jobs that involve PHP and/or MySQL. The systems we see are often so broken that there is little that can be salvaged. From the database model to the highest levels of the UI, it's not a matter of fixing minor bugs or architectural deficiencies. Most of the time, the entire system is in dire need of replacement. Unfortunately, this often proves to be a process that most clients can neither afford nor justify. But perhaps if things had been done more correctly in the first place, they wouldn't be in such a bad position. Thus the lesson we can usually take away is that PHP and MySQL should be avoided whenever possible.

Permalink: http://pinderkent.phumblog.com/post/2009/03/mistakes_are_prevalent_within_php_and_mysqlbased_software_systems

"Adaptive PHP" techniques help ensure bugs, unmaintainability, and other problems.

Posted on Friday, March 06, 2009 at 2:34 AM.

The recent 6 Signs of Adaptive PHP article gives some examples of different PHP coding techniques. Unfortunately, it only bothers to cover the supposed "benefits" of each, without any consideration of how such techniques can prove to be problematic.

The first technique suggests passing all arguments to a function within a single associative array. The supposed "benefits" all promote laziness, namely in that all parameters become potentially optional, and less or no change is needed to any invocations of the function if parameter changes are made.

Not mentioned in that article were the drawbacks. One obvious problem is the vastly increased verbosity, both when it comes to handling default values, and when it comes to invoking the function. Default parameter values, as offered by languages like C++ and even VB.NET, are syntactically superior to this approach.

Even the very concept itself is flawed, as it's much better in terms of maintainability and clarity to have an explicit list of parameters. For one thing, this technique makes it much less obvious to other programmers how to call the function. Given the general lack of documentation associated with most PHP applications, it's likely that anybody wishing to use the function would have to consult the code of the function itself. It also inhibits the ability of the interpreter to check that the correct number of arguments has been passed to the function.
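The difference can be sketched in a few lines. The example below uses Python rather than PHP, and the send_mail functions and their parameters are hypothetical, invented only to show the two styles side by side.

```python
def send_mail_explicit(to, subject, body="", cc=None, retries=3):
    """Explicit parameter list: names and defaults are visible in the signature."""
    return {"to": to, "subject": subject, "body": body,
            "cc": cc or [], "retries": retries}

def send_mail_bag(args):
    """Single 'bag of arguments': defaults have to be merged in by hand."""
    defaults = {"body": "", "cc": None, "retries": 3}
    merged = {**defaults, **args}
    merged["cc"] = merged["cc"] or []
    return merged

# A misspelled argument name is rejected immediately by the explicit version...
try:
    send_mail_explicit("a@example.com", "Hi", retrys=5)
    explicit_error = None
except TypeError as exc:
    explicit_error = exc

# ...but the bag version accepts it, and the real "retries" setting
# silently stays at its default.
bag_result = send_mail_bag({"to": "a@example.com", "subject": "Hi", "retrys": 5})
print(type(explicit_error).__name__, bag_result["retries"])
```

With the explicit signature, the language itself does the checking. With the single array, every such check has to be written, and remembered, by the programmer.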

Furthermore, if one has a function that has so many optional parameters that such a technique is needed, perhaps too much is being done in that single function. It would appear that such a function is a good candidate for heavy refactoring.

The second suggestion recommends checking for functions and classes before making use of them, such as those exposed by a plugin module. This sounds risky. When it comes to plugins, the host application should insist that any plugins conform to a strict API. Working with plugins that may just decide not to export a certain required function, for instance, is a recipe for disaster. If a plugin module doesn't provide the proper interface, it should not be loaded.
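A minimal sketch of that stricter policy follows, in Python rather than PHP. The required function names, the PluginError exception and the two example plugin classes are all hypothetical. The point is only that validation happens once, at load time, instead of being scattered through the host application as "does this function exist?" checks.

```python
# Hypothetical plugin API: every plugin must provide these callables.
REQUIRED_FUNCTIONS = ("name", "activate", "deactivate")

class PluginError(Exception):
    """Raised when a plugin does not implement the required interface."""

def load_plugin(plugin, registry):
    """Register the plugin only if it provides the complete interface."""
    missing = [fn for fn in REQUIRED_FUNCTIONS
               if not callable(getattr(plugin, fn, None))]
    if missing:
        raise PluginError("plugin rejected, missing: " + ", ".join(missing))
    registry.append(plugin)
    return plugin

class GoodPlugin:
    def name(self): return "good"
    def activate(self): return True
    def deactivate(self): return True

class BrokenPlugin:
    def name(self): return "broken"  # activate/deactivate are not provided

registry = []
load_plugin(GoodPlugin(), registry)
try:
    load_plugin(BrokenPlugin(), registry)
    rejected = False
except PluginError:
    rejected = True
print(rejected, len(registry))
```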

The third suggestion is perhaps the only one that's sensible. It suggests using require_once() to prevent multiple inclusions of some external PHP code. Indeed, this typically is a good idea.

The fourth suggestion recommends that associative arrays be used to return multiple values from a function. While this isn't as bad an idea as the use of an associative array to store parameter values, this seems to indicate that PHP should offer a more lightweight construct such as the tuples found in languages like Python and Standard ML. Sticking with what PHP already offers, perhaps even an object could be returned, rather than an associative array.
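For comparison, this is roughly what the tuple approach looks like in Python. The parse_address function, the ParseResult type and the default port of 80 are hypothetical and chosen only for the example.

```python
from typing import NamedTuple

class ParseResult(NamedTuple):
    """A lightweight, named return type: usable as a tuple or by field name."""
    host: str
    port: int

def parse_address(text):
    """Split 'host:port' into its parts; the port falls back to 80 if absent."""
    host, _, port = text.partition(":")
    return ParseResult(host, int(port) if port else 80)

result = parse_address("example.com:8080")
host, port = result          # positional unpacking, as with a plain tuple
print(result.host, result.port)
```

The caller gets both values without building or indexing an array by string keys, and a misspelled field name fails immediately instead of quietly producing an empty value.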

The fifth suggestion recommends the use of an __autoload function. The very existence of this function suggests that PHP has some serious shortcomings when it comes to allowing for the sensible separation of code. For one thing, this sort of dynamic loading is the sort of functionality that PHP should offer transparently.

Furthermore, it indicates a laziness on the part of PHP developers who wish to write code making use of classes defined in other source files, but who are unwilling to put forth the small amount of effort needed to explicitly include such files.

In addition, the three notes in the __autoload documentation should make developers hesitant to use such functionality. It appears that it interferes with the semantics of exception handling, it's not available when using PHP in its interactive mode, and has risks associated with the validity of the class names passed to it. Frankly, it sounds like yet another PHP "bandage" meant to patch over problems in the language and programmer laziness, while at the same time introducing far more severe problems.

The final suggestion recommends that the directory of the script be determined by calling dirname(__FILE__), especially for the purposes of locating other PHP files to include. For the given example, the use of a relative path should be sufficient.

The PHP community has a long history of not fixing the problems with their language and its most common implementation. Far too often we've seen "solutions" like these, which end up being messier and more convoluted than had the problem with PHP itself just been fixed sensibly. It's no wonder that so many PHP Web apps are buggy, full of security holes and essentially unmaintainable; the language itself is inherently broken, the workarounds are just as broken and full of caveats, and together they result in nothing but problems.

Permalink: http://pinderkent.phumblog.com/post/2009/03/adaptive_php_techniques_help_ensure_bugs_unmaintainability_and_other_problems

Frameworks are useful in the short-term, but become a burden in the long run.

Posted on Wednesday, February 18, 2009 at 2:32 AM.

Frameworks have become a familiar sight within the field of software development. They exist for virtually every language, every platform, and for all sorts of domains. One area where frameworks have really taken hold is that of Web development. Some people are very much in favor of using frameworks. Others suggest that frameworks are not such a good thing. In the end, the truth is a mix of both viewpoints.

Frameworks prove to be quite useful when initially developing a Web site. In many cases, it is unknown how much interest there will be in the site. So it proves to be very helpful when one can use a framework to help quickly get a basic version of the site functional and deployed. The effort of initially developing the Web site is minimized to better account for the greater risk. This becomes even more important if the site is a business venture, and funding is limited.

If all goes well, the Web site will become popular. This is often the point when the framework that initially was very useful starts to become a burden. One of the first problems that developers typically encounter is the need to add highly-specialized functionality to their site, only to run head-first into the framework. What they want to do doesn't mesh well with the philosophy of the framework. Unfortunately, the only course of action is to either circumvent what the framework tries to enforce, or twist the implementation of the feature in a way that'll fit with the framework. Either way, this usually results in numerous hacks, which in turn lead to code that is far more difficult to debug and maintain.

If the site is really lucky, performance will start to become a problem. This is one area where frameworks really start to become a hassle. Depending on the framework and its architecture, it may not scale very well. This is especially true for frameworks that integrate with a database system of some sort. We often run into a situation where the database access, often in the case of automatically-generated queries, is suboptimal, if not outright horrid. Many times the fixes are easy, but require some custom SQL code to be written, for instance. Soon, the developers may need to be continually avoiding the framework in order to maintain a suitable, or even just reasonable, level of performance.
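One common shape of this problem is a query issued once per row where a single hand-written join would do. The sketch below illustrates that in Python with an in-memory SQLite database. The author and post tables, the sample data and both functions are assumptions made up for the example, and the first function only imitates the kind of access pattern a framework might generate. It is not the output of any particular framework.

```python
import sqlite3

blog_db = sqlite3.connect(":memory:")
blog_db.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE post (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO author VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO post VALUES (1, 1, 'First'), (2, 1, 'Second'), (3, 2, 'Third');
""")

def titles_one_query_per_author():
    """One query for the authors, then one more query for each author."""
    queries = 1
    out = {}
    authors = blog_db.execute("SELECT id, name FROM author ORDER BY id").fetchall()
    for author_id, name in authors:
        rows = blog_db.execute(
            "SELECT title FROM post WHERE author_id = ? ORDER BY id",
            (author_id,),
        ).fetchall()
        queries += 1
        out[name] = [title for (title,) in rows]
    return out, queries

def titles_single_join():
    """The same data fetched with one hand-written JOIN."""
    out = {}
    rows = blog_db.execute(
        "SELECT a.name, p.title FROM author a "
        "JOIN post p ON p.author_id = a.id ORDER BY a.id, p.id"
    )
    for name, title in rows:
        out.setdefault(name, []).append(title)
    return out, 1

print(titles_one_query_per_author())
print(titles_single_join())
```

Both functions return the same titles for this data, but the first issues a number of queries that grows with the number of authors, while the second always issues one. Note that the inner join would omit any author without posts, which is one of the details that has to be handled deliberately once the SQL is written by hand.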

By the time an application or system has become more advanced and mature, we often see a trend towards moving away from whatever framework was originally used. Sometimes this is easy to do, but usually it is not. The framework has become tightly woven with the software system, typically at the most fundamental levels, including user and session management. Any attempt to rip out and replace that framework-provided functionality will quickly cascade throughout the rest of the system. The scope of the necessary changes soon becomes too great, to the point of being too risky and requiring far too many resources to adequately test.

This is a trend I have seen for years now, regardless of the programming language or environment used. Frameworks really started to become widely used as Java gained in popularity, so this is where we see a lot of these problems arising. For many organizations in the aforementioned position, they really have a limited set of choices. On one hand, they need to continually develop their software to remain competitive, but this has become difficult to do because the initial use of a framework has contributed to a convoluted, unmaintainable code base. And on the other hand, they don't have the resources to partake in the significant refactoring and rewriting necessary to fix or replace the code base they currently have.

Some have tried to move from one framework to another as a way of solving their problems. One notable example is that of CD Baby, who tried to move to Ruby on Rails from PHP, but ended up falling back on PHP after Rails proved to not be flexible enough.

The best thing to do may be to initially start with a framework, but as soon as the site proves itself viable, every effort should be taken to move away from the framework and towards custom solutions that are crafted to tackle the specific problems at hand. This allows the development team to make the best use of the framework at the start of the project when they really need it, but it also gets it out of their way early on in the lifespan of the code base, to prevent too much other code from depending on it. This compromise may not always be possible, but should likely be chosen if possible.

Permalink: http://pinderkent.phumblog.com/post/2009/02/frameworks_are_useful_in_the_shortterm_but_become_a_burden_in_the_long_run

JavaScript has no place on the server.

Posted on Friday, January 30, 2009 at 2:10 AM.

Although it's not a new concept by any means, the use of JavaScript for server-side development has gotten some attention recently. This is unfortunate, as JavaScript is not the sort of technology you want to use when developing the back-end for a Web site.

Even as a client-side language, JavaScript has proven to be of a very questionable quality. I think Jonathan Edwards was correct when he described JavaScript as being "quick" and "dirty", and only just "good enough" for most users. For years now, even the major JavaScript implementations have exhibited horrible performance. Only after 15 years of use are we seeing that start to change, with faster VMs from Adobe, Mozilla and Google. As a language, JavaScript itself does very little to encourage the development of secure, high-quality code.

Server-side development calls for a more rigorous programming language and environment than JavaScript can offer. Considering JavaScript's performance problems on the client-side, it's likely they'd be magnified when it comes to server-side development. And JavaScript really doesn't offer anything beyond more traditional server-side languages like Java, C#, PHP, Perl, Python and Ruby. As is mentioned in the Ajaxian article, there is a huge amount of basic Web back-end infrastructure that will need to be built to make server-side JavaScript even barely comparable to any of the aforementioned programming languages.

What's worse, however, is the general "attitude" associated with JavaScript. It has always been a language that appeals to those who want to get their scripts written quickly, even if the quality is terrible and there are major security flaws. It has often been used by Web designers to quickly add some interactivity to the pages they have designed, without needing an extensive programming background. Indeed, this amateurish attitude becomes very dangerous on the server.

PHP is a good example of such danger. For years many PHP developers generated SQL queries directly using data sent to their script by the user, with little to no filtering. This resulted in SQL injection attacks becoming very common for PHP-based software. We still even see this sort of development happening today, in some very sad cases. Meanwhile, more mature server-side languages and environments like Java and .NET have for years provided support for parameterized SQL queries. And even the most inexperienced of Java and .NET developers know better than to leave user input unfiltered.
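The difference between the two approaches is easy to demonstrate. The sketch below uses Python's sqlite3 module instead of PHP, Java or .NET, and the account table, its contents and the two lookup functions are hypothetical.

```python
import sqlite3

users_db = sqlite3.connect(":memory:")
users_db.execute("CREATE TABLE account (name TEXT, secret TEXT)")
users_db.executemany(
    "INSERT INTO account VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)

def find_unsafe(name):
    # Anti-pattern: user input is pasted straight into the SQL text.
    sql = "SELECT name FROM account WHERE name = '" + name + "'"
    return users_db.execute(sql).fetchall()

def find_safe(name):
    # Parameterized query: the value is passed separately from the SQL text.
    return users_db.execute(
        "SELECT name FROM account WHERE name = ?", (name,)
    ).fetchall()

malicious = "nobody' OR '1'='1"
leaked = find_unsafe(malicious)   # the injected OR clause matches every row
blocked = find_safe(malicious)    # treated as a literal name; matches nothing
print(len(leaked), len(blocked))
```

In the concatenated version the attacker's quote characters become part of the query itself. In the parameterized version the same input is only ever treated as data.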

While there will of course be some professionals using it, and using it well, I fear that the rise of server-side JavaScript would lead to nothing but problems. Just as the PHP community is starting to get its collective act together, we might see a new generation of inexperienced, amateurish developers start with server-side development using JavaScript.

Then again, maybe that's not a bad thing. It'll guarantee a steady stream of work for those of us who have made a career out of fixing such mistakes. But it is unfortunate that we couldn't stop such problems in the first place, by using better languages for server-side development. With some work, Haskell could become very useful for server-side development where quality is concerned, while Erlang plays its part in reliable and highly-scalable systems. Regardless, the main thing we need to keep in mind is that JavaScript should be avoided for server-side development.

Permalink: http://pinderkent.phumblog.com/post/2009/01/javascript_has_no_place_on_the_server