Pinderkent

Pain and glory from the trenches of the IT world.

Analyzing existing databases and their relationships with applications.

Posted on Saturday, October 17, 2009 at 3:24 AM.

Anybody working on business applications these days will undoubtedly have to familiarize himself or herself with one or more existing databases. These databases have often been "grown" rather than designed in any meaningful way, and thus will be littered with unused tables, unused functions or stored procedures, missing constraints, poor normalization, and a host of other problems.

Developers in such a situation will often look for an easy way out, such as the use of tools to automatically reverse-engineer various parts of the existing database or databases. While these tools can be useful, I don't think they are ever a replacement for just stepping through the code, line-by-line, and observing exactly what queries are executed.

Depending on the application, it may even be a bad idea to try to think of the application and database as separate. Many times we find that neither can exist without the other. For instance, we find hard-coded SQL queries within application code written in languages like Java, PHP or C#. In such situations, one literally has to take a debugger and step through the application code in order to get even a basic understanding of how the system works.
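As a complement to stepping through with a debugger, the queries an application actually issues can sometimes be captured at the database connection. The following is only a minimal sketch in Python, assuming an SQLite database reached through the standard `sqlite3` module. The `customers` table, `legacy_lookup` and `run_with_query_log` are hypothetical names invented for illustration, and other database drivers expose query logging in different ways.

```python
import sqlite3


def run_with_query_log(action):
    """Run action(conn) against a throwaway database and return the SQL it executed."""
    executed = []
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema and data, standing in for an existing database.
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO customers (name) VALUES ('Ada')")
    conn.commit()
    # Start recording only after setup, so the log holds just the
    # statements issued by the code being studied.
    conn.set_trace_callback(executed.append)
    try:
        action(conn)
    finally:
        conn.set_trace_callback(None)
        conn.close()
    return executed


def legacy_lookup(conn):
    # Stand-in for application code containing a hard-coded query.
    return conn.execute("SELECT name FROM customers WHERE id = 1").fetchall()


log = run_with_query_log(legacy_lookup)
for statement in log:
    print(statement)
```

A log like this shows which statements ran, but not why. It narrows down where to set breakpoints rather than replacing the line-by-line reading.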

Another thing that may be worth avoiding is trying to understand the system all at once. Often, it will take many months to truly grasp even a moderately sized application and the database behind it. As changes are made or bugs are fixed, take a moment or two to study and document the code paths that are involved. Doing this on a daily basis will eventually expose a developer to large portions of the software system they're working with.

Regardless of the approach, one thing to keep in mind is that becoming familiar with existing codebases and databases is not an easy task, especially when they're as ugly as so many real-world systems are. Give it time, remain patient, and eventually the system will start to feel much smaller and more manageable.

Permalink: http://pinderkent.phumblog.com/post/2009/10/analyzing_existing_databases_and_their_relationships_with_applications

Python versus PHP is just professionalism versus amateurism.

Posted on Saturday, September 05, 2009 at 11:55 PM.

Joe Stump recently wrote about why he switched from PHP to Python. What he says is absolutely true. He nails down many of the problems with PHP, and also lists many of the benefits of Python. In the end, however, I think we can just sum up the comparison between the two languages as one of Python's professionalism versus PHP's amateurism.

When I describe Python as being professional in nature, I am referring to the emphasis on care, contemplation, quality, and doing-the-right-thing-even-if-it's-difficult we generally see embraced by the Python community. Like Joe mentions in his article, the language itself exhibits a high degree of sensibility, consistency and predictability. It has clearly been carefully evolved and grown, rather than having features and functionality tossed on here and there. It's a language developed by people who know what they're doing, and it's a language used by people who know what they're doing.

On the other hand, we've seen much of the opposite out of the PHP community. The language itself is a good testament to the general preference of that community towards doing things quick-and-dirty, rather than correctly and with care. It's full of quirks, it's inconsistent, and it's generally a jumble of differing ideas and philosophies. We've seen this happen with its standard library, as well. With the core platform having such poor quality, any software built upon it typically ends up suffering from quality and security problems, too.

It's no wonder that so many developers are moving away from PHP towards better languages and environments like Python and Ruby whenever possible. PHP is deceptively attractive. Sure, it's widely supported, and yes, it's got a large standard library. But it has many hidden and not-so-hidden costs, including horrid maintainability of applications written in it, numerous security flaws within critical code, and numerous hurdles towards developing decent software. Even if the perceived costs seem higher when using a more professional language like Python, in the long run it becomes the only viable choice when the alternative is something as amateurish as PHP.

Permalink: http://pinderkent.phumblog.com/post/2009/09/python_versus_php_is_just_professionalism_versus_amateurism

Programming languages should not try to guess the programmer's intentions.

Posted on Sunday, May 24, 2009 at 12:39 AM.

A common trait among some of the poorer-quality programming languages, namely PHP and JavaScript, is their use of weak typing. While some developers are convinced that it's acceptable, it's generally a bad idea to have a programming language essentially guess at what the programmer means.

Recently, I saw an article describing some problems within a PHP script caused by automatic conversion. Frankly, these kinds of issues should just not exist. Strong, static typing is clearly a better approach. Although it puts slightly more of a burden on the programmer, the act of manually specifying type conversions leads to higher-quality software, especially if any errors are caught at compile-time, rather than run-time.

JavaScript is another language that employs weak, dynamic typing. I recently saw another article that gives some good examples (under the "2. Plus operator overloading" section) of how this behavior may result in unexpected results, especially for novice developers. But even seasoned professionals still make mistakes, and such conversions should at least be flagged with warnings, if not outright disallowed.
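To make the alternative concrete, here is a small sketch in Python. Python is dynamically typed, so it is not the strong, static typing I'd prefer, but it does refuse to guess when a number and a string are mixed, and it makes the programmer spell out the conversion. The `add_quantity` function is a hypothetical name used only for illustration.

```python
def add_quantity(total, raw_value):
    """Add user-supplied text to a running total, converting explicitly."""
    # int() raises ValueError for text such as "12abc" instead of guessing a value.
    return total + int(raw_value)


mixed_error = None
try:
    1 + "2"  # no implicit conversion between int and str
except TypeError as exc:
    mixed_error = type(exc).__name__

print(mixed_error)
print(add_quantity(1, "2"))
```

The error here still appears only at run-time. A statically checked language would reject the mixed expression before the program ever ran.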

Even though we often deal with fuzzy and incomplete specifications when developing software, we shouldn't bring such uncertainty and guesswork to our communication with the computer itself. We should specify exactly what we mean, even if it does take slightly more typing. Then again, when using languages like Haskell and OCaml, we can clearly see how strong, static typing and type inference can be implemented without overly burdening programmers. Any type conversions that must be manually specified help to force the programmers to think about what they're doing, which in some cases may be quite wrong, especially if a type conversion is necessary.

For the sake of trying to achieve even a moderately reasonable level of quality in our software, especially when programming for a hostile environment like the Internet, we shouldn't resort to languages like JavaScript and PHP that allow for type-related errors to occur so easily. It's even worse when they try to make automatic conversions that result in unexpected behavior. That's just plain unacceptable.

Permalink: http://pinderkent.phumblog.com/post/2009/05/programming_languages_should_not_try_to_guess_the_programmers_intentions

C and C++ play a very crucial role in most Web application systems.

Posted on Friday, May 15, 2009 at 2:21 AM.

Today, over at Hacker News, I saw a topic asking why C++ isn't commonly used for Web applications. The question itself is quite valid; we typically don't see Web applications themselves developed in C++. But that doesn't mean that C and C++ don't have an integral role within a Web-based system. Their use isn't as visible as that of Ruby, PHP, Python or Perl, but it's important nevertheless.

Admittedly, the back-end of many Web applications really isn't all that complex. In many cases, it's basically just a friendlier interface to a datastore of some sort, maybe offering some caching, and usually some basic data manipulation. And although C++ libraries like the STL and Boost allow for such tasks to be performed with relative ease, there's essentially little benefit in using C++. Scripting languages are often sufficient.

That said, C and C++ still do have a huge role in most Web application stacks today. We shouldn't forget that most of the popular server operating systems, Web servers and database systems today, as well as the most widely used implementations of most scripting languages, are typically written in C or C++. This is quite apparent within the popular open source Web stacks.

At the very core, we have C playing an integral role in virtually all of the popular server operating systems today, especially UNIX-like systems like Linux, FreeBSD, and Solaris. On top of that, we have popular Web servers like Apache, nginx, and lighttpd that are all written in C. And for database systems, PostgreSQL and SQLite are written in C, while MySQL uses both C and C++.

C and C++ are also critical to the programming languages used to implement many Web applications. The most widely used implementations of Python, Ruby, Perl and PHP all use C. Even Sun's HotSpot Java virtual machine makes very extensive use of C and C++.

So when we take a more holistic view of Web applications, we see that C and C++ prove to be very widely used. They're used for some of the most critical aspects of Web-based systems, where performance and reliability truly matter. Even if they get more of the attention, languages like PHP, Python, Ruby, Java and Perl end up being little more than glue languages, tying together the software implemented in C or C++. It becomes easy to forget the importance of C and C++, but this may just be because the software developed using them has matured to the point where it provides such stable interfaces that we can totally ignore its implementation language. Nevertheless, C and C++ are critical to the vast majority of Web applications that exist today.

Permalink: http://pinderkent.phumblog.com/post/2009/05/c_and_c_play_a_very_crucial_role_in_most_web_application_systems

Do programming languages for code embedded in Web pages need to be dynamic?

Posted on Saturday, April 11, 2009 at 9:41 PM.

Towards the end of his "On programming language design" article, which does a very good job of pointing out the benefits and necessity of statically-typed and statically-checked programming languages, Andrej Bauer writes the following:

There are situations in which a statically checked language is better, for example if you're writing a program that will control a laser during eye surgery. But there are also situations in which maximum flexibility is required of a program, for example programs that are embedded in web pages. The web as we know it would not exist if every javascript error caused the browser to reject the entire web page (try finding a page on a major web site that does not have any javascript errors).

This opens the door for some interesting speculation. One thing to consider is whether successful languages meant for embedding within Web pages need to be dynamic. Another thing to consider is how the current situation would differ if browser-based languages were more static than JavaScript is.

Anyone who has worked extensively with languages like Haskell, SML, and OCaml will be aware that a more static-oriented mindset itself typically doesn't negatively limit the development of an application. It may make certain software development techniques more difficult, but usually this is just a case of those techniques being a bad idea in the first place, regardless of the language being used.

A good example of such functionality is JavaScript's eval function, which allows for a string to be executed at runtime as if it were code. It's the sort of functionality that's abused far more than it's ever used appropriately. Some of the JavaScript community's more enlightened individuals have recognized this. Douglas Crockford, for instance, appropriately describes eval as "the most misused feature of JavaScript." His advice to "avoid it" makes perfect sense. Wladimir Palant has also written an article about eval. His article is very practical, giving five real-world abuses of eval. In the end, he concludes that there really aren't that many valid reasons for using eval.
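The usual replacements for eval are simple: parse data as data, and look behavior up by name in a fixed table. The sketch below shows the idea in Python rather than JavaScript, since Python has an eval of its own with the same hazards. `parse_settings`, `HANDLERS` and `apply_named` are hypothetical names, not taken from either article.

```python
import json


def parse_settings(text):
    """Parse untrusted text as data; json.loads never executes it as code."""
    return json.loads(text)


# A fixed table of permitted operations, instead of building an
# expression string and handing it to eval().
HANDLERS = {
    "upper": str.upper,
    "lower": str.lower,
}


def apply_named(name, value):
    """Look a handler up by name; unknown names are rejected outright."""
    try:
        handler = HANDLERS[name]
    except KeyError:
        raise ValueError(f"unknown handler: {name!r}") from None
    return handler(value)


print(parse_settings('{"retries": 3}'))
print(apply_named("upper", "abc"))
```

Anything that isn't valid JSON, or isn't in the table, is refused instead of being run.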

So just because a language allows for certain dynamic techniques to be employed, often they're not what is wanted. The natural way of obtaining the same outcomes using static languages like Haskell, Standard ML and OCaml may require slightly more work on the part of the developer, but the end result is typically much safer and much more reliable than the dynamic language's equivalent. In short, using a static language for code embedded within Web pages shouldn't prevent any legitimate and sensible programming activity from being performed.

There's nothing about static languages that would prevent them from being embedded, in source form, into a Web page. Hugs and GHCi are good examples of how Haskell, for instance, can be used in a manner similar to that of many interpreted scripting languages.

We'll have to resort more to speculation when considering how things would be different today and tomorrow were static languages, rather than JavaScript, more widely available for embedding within Web pages. One of the most significant changes, I think, would be the performance of such code. Until this past year, the performance of most JavaScript implementations could best be described as horrible. Even then, it took the initiative and involvement of Google with their Chrome Web browser, and Apple with Safari, before we even began to see reasonable performance.

Many of JavaScript's performance problems arise from its dynamic nature. This inherently makes the development of optimizing implementations difficult. Of course, this isn't a problem associated just with JavaScript. Other dynamic languages, like Perl, Python, and especially Ruby, offer poor performance, as well. Being an embedded, interpreted language doesn't help JavaScript, either. Static languages, on the other hand, allow for implementations to safely perform a greater amount of analysis and optimization, which can lead to better performance than we'd see out of dynamic languages for the same task.

Greater reliability and increased security are other areas where static languages tend to excel. Andrej's article mentions a number of the common language features that help with this. In essence, static languages help eliminate techniques that are known to typically lead to flaws, and they usually provide greater syntax and type checking to catch human errors more readily. The developer mindset one develops by using such languages also inherently helps encourage the development of better software.

Many have touted JavaScript's accessibility as a key to it being widely used. That is indisputable, of course. Its shortcomings as a language have made it widely usable by people who probably shouldn't be using it. Many Web developers who can do a fine job designing a page that looks good have gotten away with using it (along with PHP, another poor language) to perform programming tasks with which they don't have as much experience and knowledge. These are the sort of users who typically create code rife with security holes and other problems. So aside from the natural ability of static languages to encourage higher-quality applications, making programming slightly more difficult may bring additional benefits, by weeding out users who probably shouldn't be performing programming-related tasks.

It's unlikely that we'll see a static language like Haskell, or one of the languages from the ML family, available within all popular browsers any time soon. Even today, Web developers spend an inordinate amount of time dealing with cross-browser issues when writing JavaScript code. Instead, we'll likely see the current trends continue, with JavaScript still having relatively poor performance even after much work funded by powerful industry backers, with JavaScript still being used to develop poor-quality and insecure software, often by developers who have little real programming knowledge or background.

Permalink: http://pinderkent.phumblog.com/post/2009/04/do_programming_languages_for_code_embedded_in_web_pages_need_to_be_dynamic