Tuesday, June 1, 2010

Objects Considered Harmful


It's time for a programming vent.

As many of you know, I wrote the "Early Access" version of what would become the Project Darkstar server. When that project transferred to its eventual team in labs, the team there spent well over a year debating and re-creating what I had already created. As I watched this, I comforted myself with the belief that this team of specialists would each produce a better result in their own area than I had the time or ability to produce while doing it all myself.

Well, I am now deep in the client/server transport and protocol sections of the code... and I have never seen a more overly complex, totally obfuscated mess in my life.

The thing about protocol stacks is, they map beautifully to a simple, proper, structured coding approach: what we used to call top-down design/bottom-up implementation. Each layer of the protocol is a layer of structured code with a well-defined interface, calling the level below it. PDS (now RedDwarf) has two layers of fundamental abstraction: a transport that moves packets around and a protocol that interprets them. To be fair and give credit where credit is due, that idea was implicit in my original implementation, and the author of the rewrite did pull it out as an explicit organizing principle and observe that there should be a plug-in interface for each.
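To make the layering concrete, here is a minimal sketch in Python. It is not the RedDwarf code: the names LoopbackTransport and MessageProtocol, the in-memory queue, and the 3-byte header (opcode plus payload length) are all assumptions invented for illustration. The only point is the shape: the protocol layer calls down into the transport, and the transport never calls back up.

```python
import struct
from collections import deque


class LoopbackTransport:
    """Bottom layer (hypothetical): moves opaque packets, never inspects them."""

    def __init__(self):
        self._queue = deque()

    def send_packet(self, packet: bytes) -> None:
        self._queue.append(packet)

    def receive_packet(self) -> bytes:
        return self._queue.popleft()


class MessageProtocol:
    """Upper layer (hypothetical): interprets packets, calls only the layer below."""

    # Assumed wire format: 1-byte opcode, 2-byte payload length, network byte order.
    _HEADER = struct.Struct("!BH")

    def __init__(self, transport):
        self._transport = transport

    def send_message(self, opcode: int, text: str) -> None:
        payload = text.encode("utf-8")
        header = self._HEADER.pack(opcode, len(payload))
        self._transport.send_packet(header + payload)

    def receive_message(self):
        packet = self._transport.receive_packet()
        opcode, length = self._HEADER.unpack_from(packet)
        start = self._HEADER.size
        return opcode, packet[start:start + length].decode("utf-8")


# Usage: control flows strictly downward, so a call can be traced on paper.
protocol = MessageProtocol(LoopbackTransport())
protocol.send_message(1, "hello")
print(protocol.receive_message())  # prints (1, 'hello')
```

Because each layer depends only on the interface beneath it, either one could be swapped out (a socket transport, a different message format) without touching the other, which is what a plug-in interface per layer amounts to.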

That, however, is where my praise of this code ends. It is an unholy mess of calls and callbacks on passed objects running up and down the stack in higgledy-piggledy fashion, to the point where so much of the logic is spread out in so many places that the total execution is virtually untraceable.

This is not the first time I've seen this in code in recent years. I think the culprits are primarily university professors and CS programs who are so in love with concepts of "Object Oriented" programming that they are failing to teach the basics, which still come down to data structures, interfaces, and layers of code. Those of us who WERE taught such concepts recognize an "object" as just a convenient packaging of data structure + interface and continue to write clean, clear, encapsulated code.

But it seems the kids these days don't have those clear organizing principles in their heads. As a result they write their code as a whirling cloud of disorganized interacting objects. This chaotic swirl is virtually impossible to statically trace on paper as we had to. Instead, they count on debuggers to show them run-time behavior and pray that what they saw in this limited sample really represents most if not all possible interactions.

I think it's time for a harsh remedy. I am calling for teachers of coding everywhere to rip those Java and C++ books out of your students' hands. Give them C, or if you're nice, Pascal, to learn their basics on. Teach them what data structures are and how to do top-down design/bottom-up implementation. Take away their debuggers and make them debug with trace calls.

When they can do that with aplomb, they are ready for the objects. But when you put power tools in the hands of someone who has never used a saw or screwdriver, you get messy accidents. And that's what we're getting in code today.

Or to paraphrase a common witticism I never agreed with anyway as a statement I CAN agree with...
"There is no problem in computer science that cannot be totally obfuscated by the addition of too many levels of abstraction."


Keshlam said...

Actually, I don't disagree (though the headline is a bit hyperbolic, I'm sure you'll agree). OOP is a tool, just as structured programming was a tool. Used well, it can help organize your code into useful conceptual units. But the key word is "help".

What I encountered at MIT was, I think, the right approach. The first classes in the CS sequence actually introduced students not only to coding principles, but to several different languages selected specifically because they exemplified the principles being taught. Algol, to let folks learn the principles of basic procedural structured programming and to provide a vehicle for discussing details of language implementation such as calling conventions. (Algol is unusual in supporting call-by-name, which can be considered a poor man's lambda in disguise.) Lisp, to illustrate how a minimal interpreter can be grown into a complex language, and to give recursive programming and data structure skills a heavy workout. And others.

Later in the program, some of us (the lucky ones, in my opinion) were introduced to Barbara Liskov's CLUster language, which is the base from which much of current OOP sprang. (Alas, OOP still hasn't absorbed all the lessons that CLU taught; it's still uncommon to find direct support for continuations/coroutines, or for multivalue expressions, and people have forgotten that one of CLU's principles was that the documentation structure was linked to the language structure.) Assembler (for those who hadn't picked it up elsewhere) was mostly deferred until the class which dealt with machine architecture and the levels of abstraction it embodies, so the relationship between instruction sets and how a machine actually implements them was made clear.

Having said that: my take is that OOP is like many of the other higher-level abstractions in CS. It can help good programmers become better. But bad programmers, unless assisted, will only become obvious.

All that said, I haven't seen this particular system, so I have no opinion on its design. I will observe, however, that all code tends toward spaghetti unless actively (and somewhat viciously) pruned on a regular basis, and that when rushing to market with scant resources such code hygiene activities may be pushed aside for longer than they should be. Also, once code hits customers it can be hard to get permission to change any existing behavior, for fear of breaking a customer who has bet their business on it; this tends to spawn parallel paths unless very carefully controlled, and passing additional objects through the system may be a symptom of that.

I'm a firm believer that most code can benefit from being rewritten from first principles periodically, so the architecture reflects the actual needs. But that too tends to scare customers, unless it happens at very clean break points.

Comp sci != Software Engineering != production coding. In theory, practice follows theory, but in practice...

So: I don't think this is an OO problem per se. I think it's a combination of education (which you called out) and available resources and discipline. Or lack thereof. OO just makes it obvious.

CyberQat said...

Yup, it's my soapbox, so I tend to wax dramatic ;)