Saturday, April 28, 2012

Why OO programmers need to understand structured code

It's a natural tendency of people to want to imagine that what they do is somehow revolutionary and breaks with all that came before.  It makes them feel important and justifies laziness in learning history.  But nothing occurs in a vacuum, and those who do not study history DO repeat it.

There are many places I take people to task for this today, but the focus of this blog is on object-oriented coding and how, far from being revolutionary, it descends in a direct evolutionary path from structured code.  (Note that I am talking about code design and architecture, not so-called "Object Oriented Design", which is actually a fad development methodology.  I'll leave that for another rant.)

There are two fundamentals of structured coding that define it. The first is the data structure. A data structure is defined as an organization of data in memory and a set of procedures to act upon it.

The second tenet is top-down design/bottom-up implementation. You start with what your code needs to accomplish and design the data structure for it. This data structure will then require further data structures beneath it to implement, so you design those, and so forth until you get to the bottommost level. This creates a clean, multi-layered, encapsulated design. You then "bottom-up implement", starting with the most primitive data structures in your design and working your way up, testing each structure as it is built.
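To make the layering concrete, here is a minimal sketch in C. Everything in it is hypothetical and mine, not taken from any real code base: IntList is the primitive structure at the bottom, and IntSet is the higher-level structure designed first but implemented second, written only against IntList's procedures.

```c
#include <assert.h>
#include <stddef.h>

/* Bottom layer (hypothetical): a fixed-capacity list of ints. */
#define INT_LIST_CAPACITY 8

typedef struct {
    int items[INT_LIST_CAPACITY];
    size_t count;
} IntList;

void int_list_init(IntList *l)
{
    assert(l != NULL);
    l->count = 0;
}

/* Returns 0 on success, -1 if the list is full. */
int int_list_append(IntList *l, int value)
{
    assert(l != NULL);
    if (l->count == INT_LIST_CAPACITY)
        return -1;
    l->items[l->count++] = value;
    return 0;
}

/* Returns 1 if value is present, 0 otherwise. */
int int_list_contains(const IntList *l, int value)
{
    size_t i;
    assert(l != NULL);
    for (i = 0; i < l->count; i++)
        if (l->items[i] == value)
            return 1;
    return 0;
}

size_t int_list_count(const IntList *l)
{
    assert(l != NULL);
    return l->count;
}

/* Top layer (hypothetical): a set of ints built only on IntList's procedures. */
typedef struct {
    IntList members;
} IntSet;

void int_set_init(IntSet *s)
{
    assert(s != NULL);
    int_list_init(&s->members);
}

/* Returns 0 if value is in the set afterwards, -1 if there was no room. */
int int_set_add(IntSet *s, int value)
{
    assert(s != NULL);
    if (int_list_contains(&s->members, value))
        return 0; /* already a member: nothing to do */
    return int_list_append(&s->members, value);
}

int int_set_contains(const IntSet *s, int value)
{
    assert(s != NULL);
    return int_list_contains(&s->members, value);
}

size_t int_set_size(const IntSet *s)
{
    assert(s != NULL);
    return int_list_count(&s->members);
}
```

In the bottom-up spirit, you would write and test the int_list_ procedures first, and only then build and test IntSet on top of them.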

The first "high level" languages had no direct support for the concept of data structures, so we built them ourselves. In C code this generally meant a .h file to define the public interface and then one or more structs and procedures in a matching .c file. (We did similar things in Pascal.) Restricting yourself to a data structure's intended public API was a convention that good coders knew to follow, but it was not enforced by the language.
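Here is a rough sketch of what that convention looked like. The names (Stack, stack_push and so on) are made up for illustration, and the header and implementation are shown in one listing with comments marking where the file split would fall.

```c
#include <assert.h>
#include <stddef.h>

/* ---- stack.h (hypothetical): the public interface ---- */
#define STACK_CAPACITY 16

typedef struct {
    int items[STACK_CAPACITY]; /* visible to callers, private only by convention */
    size_t count;
} Stack;

void stack_init(Stack *s);
int  stack_push(Stack *s, int value); /* returns 0 on success, -1 if full */
int  stack_pop(Stack *s, int *out);   /* returns 0 on success, -1 if empty */

/* ---- stack.c (hypothetical): the implementation ---- */
void stack_init(Stack *s)
{
    assert(s != NULL);
    s->count = 0;
}

int stack_push(Stack *s, int value)
{
    assert(s != NULL);
    if (s->count == STACK_CAPACITY)
        return -1;
    s->items[s->count++] = value;
    return 0;
}

int stack_pop(Stack *s, int *out)
{
    assert(s != NULL && out != NULL);
    if (s->count == 0)
        return -1;
    *out = s->items[--s->count];
    return 0;
}
```

Nothing stops a caller from writing s.count = 0 directly and bypassing the procedures. The compiler accepts it without complaint, which is exactly the gap that was left to discipline.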

In response to this came the modular languages, most notably Modula2. Modula2 added two very important ideas. The first was import/export control. The .h file was replaced by the interface file. The interface file defined the publicly visible API to a code "module", which was just a data structure with some syntactic sugar. Other modules could only see what was published through that interface file. In addition, a module could control what other modules it saw with an import command. (In C and Pascal, code was globally linked, and by default each file could see the global symbols of every other file.)

The other important thing Modula2 brought was the concept of an opaque type. This was a publicly usable struct whose definition was hidden inside a module. To the rest of the code it was simply a reference to an instance of a data structure. As you can see, Modula2 was really the half-step between the procedural languages and OOP languages.
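I won't attempt Modula2 syntax from memory, but the same idea can be approximated in C with an incomplete struct type, so here is a sketch in C instead. Counter and its procedures are hypothetical names. In a real project the struct definition would live only in the .c file; it appears in the same listing here just so the example compiles on its own.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- counter.h (hypothetical): callers see only the name of the type ---- */
typedef struct Counter Counter; /* incomplete type: the layout is hidden */

Counter *counter_create(int start);   /* returns NULL if allocation fails */
void     counter_increment(Counter *c);
int      counter_value(const Counter *c);
void     counter_destroy(Counter *c);

/* ---- counter.c (hypothetical): the only place that knows the layout ---- */
struct Counter {
    int value;
};

Counter *counter_create(int start)
{
    Counter *c = malloc(sizeof *c);
    if (c != NULL)
        c->value = start;
    return c;
}

void counter_increment(Counter *c)
{
    assert(c != NULL);
    c->value++;
}

int counter_value(const Counter *c)
{
    assert(c != NULL);
    return c->value;
}

void counter_destroy(Counter *c)
{
    free(c); /* free(NULL) is a no-op, so no check is needed */
}
```

Code that includes only the header can hold and pass around a Counter pointer, but it cannot touch value or even take sizeof(Counter). Squint a little and counter_create looks like a constructor and the other procedures look like methods.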

OOP languages came next and, just as Modula2 built on simple structured languages, OOP built on the modular languages. Modules became "classes" and the opaque type was formalized into the idea of class instances. More syntactic sugar was added to make such classes easier to define and use, and the level of control of imports was increased.

Fundamentally, however, an object is just a formalization of a data structure, and good OOP code still has to follow the rules of good data structure design and use. Unfortunately, those rules are not getting taught as much as they used to be.
