Whenever you see yourself writing the same thing down more than once, there’s something wrong and you shouldn’t be doing it, and the reason is not because it’s a waste of time to write something down more than once. It’s because there’s some idea here, a very simple idea, which has to do with the Sigma notation…not depending upon what it is I’m adding up. And I would like to be able to always…divide the things up into as many pieces as I can, each of which I understand separately. I would like to understand the way of adding things up, independently of what it is I’m adding up.
– Gerald Sussman, SICP Lecture 2a, “Higher-order Procedures” (emphasis added)
The purpose of abstracting is not to be vague, but to create a new semantic level in which one can be absolutely precise.
– Edsger W. Dijkstra, The Humble Programmer
What Larry Wall said about Perl holds true: “When you say something in a small language, it comes out big. When you say something in a big language, it comes out small.” The same is true for English. The reason that biologist Ernst Haeckel could say “Ontogeny recapitulates phylogeny” in only three words was that he had these powerful words with highly specific meanings at his disposal. We allow inner complexity of the language because it enables us to shift the complexity away from the individual utterance.
– Hal Fulton, The Ruby Way, Introduction (emphasis added)
Programming is our thoughts, and with better ways to express them, we can spend more time thinking them, and less time expressing them.
3 + 3 + 3 + 3 + 3 + 3 is hard…hard to read (how many threes?), hard to get right (I lost count!), hard to reason about (piles of operations!). 3 × 6 is easy, once you learn multiplication. This is a good trade-off. We should look for ways to add abstractions, new semantic levels, to our programs.
If you’re doing the same thing twice, stop, and look for the common idea. Peel the idea away from the context, from the details. Grasp the idea, and then use it over and over. As a bonus, you’ll type less, re-use code, and debug less.
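As a rough sketch of what peeling the idea away can look like, here is the summation example from the Sussman quote, written in Python. The names `sum_integers`, `sum_squares`, and `summation` are made up for this illustration; they are not taken from the lecture.

```python
# Two near-duplicates: only the term being added differs.
def sum_integers(a, b):
    total = 0
    for n in range(a, b + 1):
        total += n
    return total


def sum_squares(a, b):
    total = 0
    for n in range(a, b + 1):
        total += n * n
    return total


# The common idea, pulled out: "add things up",
# independent of what is being added.
def summation(term, a, b):
    return sum(term(n) for n in range(a, b + 1))


print(summation(lambda n: n, 1, 10))      # same result as sum_integers(1, 10)
print(summation(lambda n: n * n, 1, 10))  # same result as sum_squares(1, 10)
```

Once `summation` exists, a new kind of sum is one short line rather than another copy of the loop.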
“But I can’t find ways to do that!”
When you look at similar bits of code, and can’t find a good way to remove the duplication, you’re hitting the limits of either your language, or your knowledge.
Programming languages put up very real walls. They force you down their paths, often by leaving out features. A language without recursion puts up a wall in front of recursive solutions; a language without first-class functions makes it tough to write higher-order functions. Language limitations are the cause of Greenspun’s Tenth Rule.
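To make the wall concrete, here is a hedged sketch of the workaround people tend to reach for when functions cannot be passed around as values: wrap the varying behaviour in an object. Python does have first-class functions, so the first half only pretends it doesn’t. The names `Term`, `Square`, `summation_with_object`, and `summation` are hypothetical, chosen for this illustration.

```python
# Pretend functions are not values: the varying step has to live in a class.
class Term:
    def apply(self, n):
        raise NotImplementedError


class Square(Term):
    def apply(self, n):
        return n * n


def summation_with_object(term, a, b):
    total = 0
    for n in range(a, b + 1):
        total += term.apply(n)
    return total


# With first-class functions, the wrapper classes disappear.
def summation(term, a, b):
    return sum(term(n) for n in range(a, b + 1))


print(summation_with_object(Square(), 1, 3))  # 1 + 4 + 9
print(summation(lambda n: n * n, 1, 3))       # same sum, no wrapper class
```

Both versions compute the same thing; the difference is how much ceremony the language demands before you can say it.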
Sometimes, the language is not the problem. Sometimes you just can’t find your way through. This is why you read Refactoring, and Design Patterns, but really, this is why you learn other programming languages. Think about the right way to factor the problem.
If you can’t remove the duplication, you need to work around your language, or learn some new tricks.
12 thoughts on “Why We Abstract, and What To Do When We Can’t”
I don’t want to nitpick Hal Fulton, but the concept of “Ontogeny recapitulates phylogeny” has been obsolete for many years.
Besides, Haeckel only stated an “obvious” fact; he did not offer any explanations. I am aware that biology back then was more descriptive than explanatory (as it is these days, for countless reasons), but what _we_ can do is look at the _current_ information, and not think about _past_ information sets.
If one wants to abstract these days and needs to use biology, I’d rather point to BioBricks and encourage people to build virtual organisms.
The point isn’t what he said, it’s that he said it in three words. The accuracy of the statement is irrelevant.
Wow, please get rid of “Snap Shots.”
The annoyance of seeing a screenshot of every page I was about to visit made me stop reading what you had to say.
@she, I agree with Logan…in fact, while I’d heard Haeckel’s phrase before, I never could have told you who said it.
@Max, you really don’t like them? I’m not generally one for flashy UIs, but I like them. Maybe I keep my mouse on the side of the text when I scroll.
“Any problem in computer science can be solved with another layer of indirection. But that usually will create another problem.” – David Wheeler
I like abstractions in the language (properties, events, closures, enums etc.) but I hate to be buried in complex stacks and frameworks of the kind extremely pervasive in the Java space. Patterns can be handy as a communication mechanism but really they belong to the latter type of indirection. Thank goodness we don’t have to roll our own OO pattern complete with VTABLE lookups etc.
I sometimes think that a program is like some strange sort of waterbed with multiple cells. Complexity is the water – when you squeeze the water out of one cell it doesn’t disappear, it just goes somewhere else. Our job as programmers is to choose abstractions (distribute the water) in such a way that we’ve made ourselves a comfortable bed.
Okay, so, not the best analogy ever, but I think there might be something there worth saving.
There’s another Dijkstra quote on these lines.
“the main challenge of computer science is how not to get lost in complexities of their own making”
It says almost the opposite, doesn’t it? Adding abstraction ideally moves you farther away from the machine and closer to “reality”, but reality is complicated enough in its own right.
In the worst case scenario, abstraction moves you away from the machine and away from “reality” into some new mathematical formalism. And there’s no guarantee that your formalism is a net win, instead of a bunch of overhead. Instead it’s the wisdom lost in knowledge lost in information lost in data.
Now that I think about it, I feel remiss in not mentioning Joel’s leaky abstractions.
@Dan Lewis, that’s a good point, you want abstractions that remove UNNEEDED detail, to let you focus on the IMPORTANT detail. Still, while it’s a problem if you abstract badly, abstracting has value. You don’t stop driving a car because people sometimes have accidents, you learn to drive carefully.
I tend to think of abstraction as a ‘higher level’ of expressing something. A while back, I took a shot at trying to be a little more specific about it, which might help :-)
Abstraction is simply leaving irrelevant things behind.
We all know nothing is irrelevant once it fails. The only problem with programming nowadays is that we have gone back to programming in assembly code, but instead of sending codes (control characters) to the printers and to the screens, we are doing the same to the databases and the web browsers.
Some frameworks (i.e., abstractions) have been created, but they are based too heavily on the past. Look, for example, at Hibernate: it tries to do away with SQL, but you end up writing a lot more code and the SQL generated is terrible. One query may end up doing 16,000 SQL queries. Then you wonder why the system is so slow…
The same occurs with HTML. Most programmers think that they need to know HTML, that’s like learning the opcodes of the LaserJet and sending the individual characters to the printer (why use drivers? real men know the machine details…).
Obviously this nonsense will end with new frameworks. I’m building mine. You never see SQL or HTML. And it is far more efficient than carefully handwritten code.
@Real State Agent, post a link to your framework!