mahiwaga

I'm not really all that mysterious

recapitulation of the ontogeny of computer languages

Steve Yegge’s rant about huge code bases and how Java exacerbates the problem is definitely circulating the internets. Jeff Atwood at Coding Horror chimes in and agrees wholeheartedly.

I recall running into static typing issues that made my code unnecessarily verbose back when I was using Turbo Pascal. (Mostly because you had to declare how long strings were, and because of the way you had to declare records/structures, you might even have to create types for particular string lengths, like String80 for an 80-character string and String78 for a 78-character string.) But the issue really isn’t typing. True, I do believe static typing tends to add nearly meaningless, semantically sparse lines to your code, but if you think about it, most of the popular interpreted (i.e., scripting) languages are actually strongly typed as well. In Microsoft BASIC 2.0 (the interpreter the Commodore 64 used), you declared type by appending a special character, so you always knew that A$ was a string, A% was an integer, and A was a float, and that was all you really had, unless you counted arrays, which were built out of those base types. In Perl, $var is a scalar (which can be a string or a number, to be sure, so maybe it’s not that strong in terms of typing), @var is an array, and %var is a hash. And while Ruby doesn’t use these markers, you have to explicitly convert Integers to Strings (using to_s) or to Floats (using to_f).
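By way of illustration, a minimal Ruby sketch (my own example, not from any particular codebase) of what that explicit conversion looks like:

    # Ruby is dynamically but strongly typed: mixing types doesn't silently
    # coerce, so you convert explicitly.
    count = 42
    label = "items: " + count.to_s   # without to_s this raises a TypeError
    total = "3.14".to_f + 1          # to_f turns the string into a Float
    puts label, total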

So typing doesn’t have anything to do with it.

I think the problem is that most people don’t construct sane objects.

Sometimes the problem lies in the base classes of the language. There may be too many redundant classes—similar but not-quite-the-same—and you end up having to figure out which methods can take which objects, and sometimes you end up writing all sorts of kludgery to use the objects you want and process them with the methods you want—because polymorphism in C++ isn’t all that it’s cracked up to be.
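For contrast, here’s a rough, hypothetical Ruby sketch of how duck typing sidesteps the which-class-goes-with-which-method problem (the LogFile class and count_lines helper are made up for illustration):

    # count_lines doesn't care what class it gets,
    # only that the object answers each_line.
    class LogFile
      def each_line
        ["first\n", "second\n"].each { |line| yield line }
      end
    end

    def count_lines(source)
      count = 0
      source.each_line { |_line| count += 1 }
      count
    end

    puts count_lines(LogFile.new)   # => 2
    puts count_lines("x\ny\n")      # a plain String also answers each_line => 2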

Sometimes the language isn’t quite completely object-oriented, and a lot of the idiom is still procedural in style. A lot of these languages had OOP grafted onto them even though their lineages are clearly procedural. (Perl is certainly evidence of this, as are ObjC and C++.)


On the other hand, it seems like most of the coders who disagree with Yegge and Atwood are more into procedural code, and some are even overtly hostile to OOP. I think part of the problem lies in the fact that a lot of people ended up learning OOP through C++, which, from what I remember, isn’t much fun. It was easier to write kludgy C than it was to deal with C++’s class system.

The other thing is that it seems like it’s really difficult to tailor your classes to the functionality you want. While most OO languages support polymorphism, that isn’t always exactly what you need. I feel like the right solution to the problem of creating subclasses is to either go the ObjC/Smalltalk way and utilize message passing, or go the Ruby way (which also allows message passing) and utilize mix-ins.
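To make that concrete, here’s a small Ruby sketch (the Version class is hypothetical) showing a mix-in plus runtime message passing:

    # Comparable is a mix-in: define <=> and you get <, >, ==, between? for free.
    # send dispatches a message by name at runtime, ObjC/Smalltalk style.
    class Version
      include Comparable
      attr_reader :major, :minor

      def initialize(major, minor)
        @major, @minor = major, minor
      end

      def <=>(other)
        [major, minor] <=> [other.major, other.minor]
      end
    end

    puts Version.new(1, 2) < Version.new(1, 10)         # => true, courtesy of the mix-in
    puts Version.new(1, 2).send(:<, Version.new(2, 0))  # => true, via message passing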


The way to decrease the number of lines a developer needs to write is to come up with intelligent (not just intelligently designed) base classes and interfaces, so that when you throw a not-quite-right object at them, they won’t just crash out, and they won’t perform completely unexpected and often destructive operations. Reflection is a step in this direction. Rails uses it to its advantage (as do ObjC and Smalltalk, from what I understand).
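Here’s a hedged sketch of the idea in Ruby, using reflection to check whether an object understands a message before sending it (the render helper and to_html protocol are made up for illustration):

    # Ask the object whether it understands a message before sending it,
    # and degrade gracefully instead of crashing or doing something destructive.
    def render(obj)
      if obj.respond_to?(:to_html)   # to_html is a made-up protocol
        obj.to_html
      elsif obj.respond_to?(:to_s)
        "<p>#{obj}</p>"
      else
        "<p>[unrenderable]</p>"
      end
    end

    puts render(42)   # Integer has no to_html, but to_s works => "<p>42</p>"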

Ultimately, you have to choose: do you want to control the means, or are you more interested in the ends? What assembly and C allow you to do is control explicitly, or almost explicitly, what the machine does. I believe this necessarily comes at the expense of the end results. In contrast, semantically dense OO systems like Smalltalk and Ruby are good at delivering the end results as you intended, but you have very little control over the algorithms actually used to deliver them. This is why C is often touted as the epitome of speed: every statement translates readily into the appropriate machine code. Interpreted languages, on the other hand, inherently carry a lot of overhead and indirection, but they make it easier to get from point A to point B.
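As a purely illustrative example, the same computation written both ways in Ruby:

    nums = [3, 1, 4, 1, 5, 9]

    # Controlling the means: spell out every step yourself.
    sum = 0
    i = 0
    while i < nums.length
      sum += nums[i] * nums[i]
      i += 1
    end

    # Stating the ends: say what you want, let the runtime decide how.
    sum_dense = nums.sum { |n| n * n }

    puts sum == sum_dense   # => true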
