Out of the Tar Pit: Analysis of Software Complexity
November 17, 2014 1:35 PM
Out of the Tar Pit (SL-GitHub to PDF) by Ben Moseley and Peter Marks. Abstract:
Complexity is the single major difficulty in the successful development of large-scale software systems. Following Brooks we distinguish accidental from essential difficulty, but disagree with his premise that most complexity remaining in contemporary systems is essential. We identify common causes of complexity and discuss general approaches which can be taken to eliminate them where they are accidental in nature. To make things more concrete we then give an outline for a potential complexity-minimizing approach based on functional programming and Codd’s relational model of data.
Related talk is "Simple Made Easy" from Rich Hickey.
These were mentioned by Kevin Dangoor in his talk at 1DevDay Detroit (11/17/2014): "Simplifying JavaScript with ReactJS and Friends"
One would think software development is the domain where AI would first become useful. A significant need exists. It's a better-defined domain than most: terms actually have definitions, as opposed to the 'real world,' where synonyms and euphemisms are everywhere but precise definitions are not. A computer anywhere with network access could 'work' on a software project. But no.
posted by sammyo at 1:45 PM on November 17, 2014
Thanks for noting that, Harvey, and my apologies if it was confusing at first. For one reason or another the PDF automatically downloaded instead of opening in the browser, so I deferred to the GitHub link.
posted by JoeXIII007 at 1:53 PM on November 17, 2014
E. F. Codd. 1970. A relational model of data for large shared data banks. Commun. ACM 13, 6 (June 1970), 377-387.
posted by mikelieman at 2:35 PM on November 17, 2014 [2 favorites]
I like the update that replaces Brooks's "accidental" complexity with "incidental": so, incidental and essential complexity instead of accidental and essential complexity.
posted by zeek321 at 2:37 PM on November 17, 2014 [1 favorite]
This really is the thing I hate most about working in IT: you simply cannot convince people that the complexity they're dealing with is not the essential kind. I've lost count of the number of times I've argued for the simpler solution, only to be met with blank stares. I might as well have been speaking Martian.
Even the classic CS essay, "Worse Is Better", is an example of this. The author of the paper is presented with overwhelming evidence that the simpler solution is the better solution, yet still clings to the idea that Simple == Worse. This leads to a large amount of cognitive dissonance on his part, hence the title Worse Is Better.
posted by 1970s Antihero at 2:39 PM on November 17, 2014 [4 favorites]
This is 66 pages long, so there's no way I'm going to finish it before tomorrow at the earliest. Nonetheless the distinction made in this paper has already given me an idea about writing comments in source code:
When I've written a large & complex module that fits into a big system, my instinct is to write a long & comprehensive comment explaining how my module works. I've got the whole functionality in my mind at that point, and I want to do a brain dump so the next person who has to take over that area can get up to speed. We could call that a "what this code does" comment.
But in fact it would be just as useful to write a "what this code doesn't do" comment! When people plan to add new features to an existing code-base, they want to minimize the amount of "state" they've got to think about. They don't really want to know what my output formatting code does so much as they want to know that it won't interfere with their output-limiting project, for instance.
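For example, a module header along these lines (the package and project names here are invented, just to show the shape):

```go
// Package reportfmt renders already-fetched query results as fixed-width
// text reports.
//
// What this code does: formatting only. It takes rows that have already
// been fetched and truncated upstream and lays them out as columns.
//
// What this code does NOT do:
//   - It never limits, filters, or paginates the result set; row limiting
//     lives entirely in the query layer.
//   - It keeps no state between calls; every call is independent.
//   - It never touches the network or the database connection.
package reportfmt
```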
From now on, I'm going to write internal documentation for future not-me as much as I do for future me.
posted by Harvey Kilobit at 2:46 PM on November 17, 2014 [2 favorites]
Oops: I misdated that Detroit conference. It was this past Saturday, 11/15.
Carry on
posted by JoeXIII007 at 2:59 PM on November 17, 2014
My current boss is a big fan of complexity for complexity's sake. The worst part is that he simply mocks your programming skills if you say something is hard to read.
Example from today: in order to save 14 bytes of static strings, he has four tiny character strings sharing the same buffer and then code to extract those tiny substrings.
Pointing out that we recommend machines with 16 gigs of RAM for this project, and that it's not even clear we're saving any net memory (the extra code to extract those substrings probably costs more than the 14 bytes), has no effect.
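Roughly the shape of it, with the details changed (the strings and names here are invented, not the actual code), versus the boring version:

```go
package main

import "fmt"

// The "clever" version: four labels packed into one buffer, plus index
// bookkeeping and a helper just to get them back out.
const packed = "OKERRWARNINFO"

var offsets = [...]struct{ start, end int }{
	{0, 2},  // "OK"
	{2, 5},  // "ERR"
	{5, 9},  // "WARN"
	{9, 13}, // "INFO"
}

func label(i int) string { return packed[offsets[i].start:offsets[i].end] }

// The plain version: four constants the compiler handles on its own.
const (
	labelOK   = "OK"
	labelErr  = "ERR"
	labelWarn = "WARN"
	labelInfo = "INFO"
)

func main() {
	fmt.Println(label(2), labelWarn) // both print "WARN"
}
```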
Many people really want to prove how smart they are with their code.
By comparison, when I was at Google I got to read a lot of code by Jeff Dean and Sanjay Ghemawat. I was initially quite disappointed, because it didn't use any advanced language features, it seemed conservative and too simple. Later on I realized the code was super-efficient and obviously correct... they weren't trying to be fancy or clever but simply do the best job.
posted by lupus_yonderboy at 4:16 PM on November 17, 2014 [13 favorites]
Sometimes I think programmers are too fascinated with "automagically", to the point where they are trying to turn code that could be straightforward into magic tricks. They think they have clean and elegant designs, but really they are creating obfuscated solutions.
A friend of mine called this style, "Why use one function, when three classes and a couple interfaces will do?"
posted by fleacircus at 5:41 PM on November 17, 2014 [2 favorites]
Even the classic CS essay, "Worse Is Better", is an example of this. The author of the paper is presented with overwhelming evidence that the simpler solution is the better solution, yet still clings to the idea that Simple == Worse.
That is the opposite of how I would characterize Richard Gabriel's "Worse is Better" concept.
posted by jjwiseman at 5:49 PM on November 17, 2014 [1 favorite]
1970s Antihero: "Even the classic CS essay, "Worse Is Better", is an example of this. The author of the paper is presented with overwhelming evidence that the simpler solution is the better solution, yet still clings to the idea that Simple == Worse. This leads to a large amount of cognitive dissonance on his part, hence the title Worse Is Better."
Well, in the case of the Worse Is Better essay, the simple solution actually *is* worse. Or to be more precise, solutions with simple implementations are often inferior to ones with simple interfaces. A simple implementation often provides poor abstractions and leaks implementation details into any client code that uses it; now your client code is more complicated (or wrong) because the library or OS you've built it on took a worse-is-better approach.
The paradox of worse-is-better is that the simple implementation is often evolutionarily and economically more successful than the one with greater user-level simplicity (consider the main example in the essay of the success of Unix vs. the failure of Lisp machines). Now we're paying the price for that success. Every null pointer error, every buffer overflow, every SQL injection error we've faced in the last few decades comes from an engineering culture that favored simplicity of implementation over safety, correctness, and simplicity of interface. Those engineering decisions may have been reasonable back in the days of slow, expensive machines, but in an age of multi-gigahertz machines connected to untrusted networks, it's disgraceful that we're still building database queries by appending strings together and writing non-performance critical applications in unsafe languages like C and C++.
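To make the string-appending point concrete, here's a rough Go sketch (the table, columns, and function names are invented); the fix is as old as parameterized queries:

```go
// Package storage sketches string-built versus parameterized queries.
package storage

import "database/sql"

// findUserUnsafe builds the query by appending untrusted input straight
// into the SQL text, so input like  x' OR '1'='1  rewrites the query itself.
func findUserUnsafe(db *sql.DB, name string) (*sql.Rows, error) {
	return db.Query("SELECT id, email FROM users WHERE name = '" + name + "'")
}

// findUser keeps the SQL text fixed and sends the input separately as a
// bind parameter, so the database never interprets it as SQL.
// (The '?' placeholder style assumes a MySQL/SQLite-type driver;
// Postgres drivers use $1, $2, ...)
func findUser(db *sql.DB, name string) (*sql.Rows, error) {
	return db.Query("SELECT id, email FROM users WHERE name = ?", name)
}
```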
It's telling, I think, that the linked paper itself argues for functional languages and extending the relational model to essential program state. Functional languages typically have more complex implementations, but they genuinely abstract away from the accidental complexity of things like machine pointers and memory management. The same goes for relational databases — the relational model requires a more complex storage and query engine behind the scenes, but the trade-off is that now you can isolate the concerns of data storage and querying from the essential difficulty of processing and using the data.
posted by Wemmick at 5:49 PM on November 17, 2014 [7 favorites]
Even the classic CS essay, "Worse Is Better", is an example of this. The author of the paper is presented with overwhelming evidence that the simpler solution is the better solution, yet still clings to the idea that Simple == Worse. This leads to a large amount of cognitive dissonance on his part, hence the title Worse Is Better.
Well, I don't know that it's that...simple. The lesson that I took from "Worse Is Better" is that people who prefer the good and attainable to the perfect and harder-to-attain are going to build systems that actually get used. In the essay the comparison is between C and Unix on the one hand and Lisp and...I dunno, maybe Lisp machines on the other. Lisp is an amazing programming language. It introduced a bunch of new things to the programming community, like garbage collection and functions as first-class objects, that C didn't bother with because they weren't simple things to implement. But they are features that make it possible to write simpler code. There has always been a tension between programming languages that favor the machine and languages that favor the programmer. As Moore's Law has done its work, language designers have been able to add in these more complex features because languages can now have them and still run reasonably quickly.
One example of a newer language with features that Lisp introduced is Go, which is interesting because some of its designers come directly from the New Jersey school of Worse Is Better. Go uses garbage collection and its functions are first-class objects, which is great. So in that sense, we are moving closer to the perfect (whatever that is). But I just looked again at The Rise of "Worse Is Better" and I'll be damned if the canonical example of the New Jersey approach wasn't the "PC loser-ing" problem. Go's approach to error handling is to return an error as a separate value from a function call, and the user (the programmer) has to deal with it manually. Most other languages these days raise an exception, which also isn't great, but I think I prefer it because it provides more direct feedback that something isn't right in the code. Go's solution is straight out of Worse Is Better.
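For anyone who hasn't written Go, the pattern being described looks roughly like this (a toy example; the function is made up):

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// parsePort illustrates Go's convention: the error comes back as an
// ordinary second return value, and the caller decides what to do with it.
func parsePort(s string) (int, error) {
	n, err := strconv.Atoi(s)
	if err != nil {
		return 0, fmt.Errorf("bad port %q: %v", s, err)
	}
	if n < 1 || n > 65535 {
		return 0, errors.New("port out of range")
	}
	return n, nil
}

func main() {
	// The caller has to handle the error explicitly, every time;
	// nothing propagates automatically the way an exception would.
	port, err := parsePort("8080x")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("listening on", port)
}
```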
Lisp also introduced map and reduce, and it was with great frustration that I recently read this Go thread in which the designers made it clear that they weren't going to add those functions. It wasn't clear to me why they were against them; it was either because of performance or because it would force them to introduce generics to the language, making it more complex. But I think it was a mistake, because I can write simpler code with map and reduce, and I enjoy it more when I can use them. That said, after having written multithreaded code in Java and Objective-C, I think goroutines are a significant advance in concurrent programming, and the next time I need to max out multiple cores, I'll try it with Go.
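And for contrast, the kind of thing map and reduce buy you. In Go as it stands you have to write them per element type, which is exactly where the generics question comes in; a sketch:

```go
package main

import "fmt"

// mapInts and reduceInts are the []int-only versions; without generics,
// every element type needs its own copy of these.
func mapInts(xs []int, f func(int) int) []int {
	out := make([]int, len(xs))
	for i, x := range xs {
		out[i] = f(x)
	}
	return out
}

func reduceInts(xs []int, init int, f func(int, int) int) int {
	acc := init
	for _, x := range xs {
		acc = f(acc, x)
	}
	return acc
}

func main() {
	nums := []int{1, 2, 3, 4}
	squares := mapInts(nums, func(x int) int { return x * x })
	sum := reduceInts(squares, 0, func(a, b int) int { return a + b })
	fmt.Println(squares, sum) // [1 4 9 16] 30
}
```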
As for complexity, I'm pretty much in agreement with this paper. There's been an idea floating around recently that an imperative style of coding is appropriate at the boundaries, where you're doing I/O and connecting to various devices and servers, but that when everything's in memory, the functional style is best. One of my favorite talks about this is Boundaries by Gary Bernhardt, the Wat guy; Boundaries is serious and quite interesting. What's really interesting to me is that John Carmack has been trying out functional programming and seems in broad agreement with this idea of using a functional style whenever possible. I really like the Carmack essay because he talks about the tradeoffs as well as the benefits of functional programming, while being overall in favor of it.
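A tiny sketch of that "functional core, imperative shell" split, with a made-up example: the decision logic is a pure function over plain values, and all the I/O stays in a thin outer layer:

```go
package main

import (
	"fmt"
	"io/ioutil"
	"os"
	"strings"
)

// Functional core: a pure function from values to values. No I/O, no
// hidden state, trivial to test in isolation.
func wordCounts(text string) map[string]int {
	counts := make(map[string]int)
	for _, w := range strings.Fields(strings.ToLower(text)) {
		counts[w]++
	}
	return counts
}

// Imperative shell: all the I/O stays out here at the boundary.
func main() {
	data, err := ioutil.ReadAll(os.Stdin)
	if err != nil {
		fmt.Fprintln(os.Stderr, "read error:", err)
		os.Exit(1)
	}
	for w, n := range wordCounts(string(data)) {
		fmt.Println(w, n)
	}
}
```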
Anyway, sorry, I agree that it's great to strive for simplicity whenever possible, but sometimes making things simple for one person or machine makes it harder for another.
posted by A dead Quaker at 6:19 PM on November 17, 2014 [3 favorites]
A friend of mine called this style, "Why use one function, when three classes and a couple interfaces will do?"
"Architecture Astronauts"
posted by mikelieman at 6:33 PM on November 17, 2014 [1 favorite]
"Architecture Astronauts"
posted by mikelieman at 6:33 PM on November 17, 2014 [1 favorite]
1970s Antihero: "This leads to a large amount of cognitive dissonance on his part, hence the title Worse Is Better."
It's funny that you use that term, because if you read the historical perspective, the guy wrote papers on both sides of the argument.
posted by pwnguin at 6:47 PM on November 17, 2014 [1 favorite]
I recently finished working through Learn You a Haskell for Great Good! I was left with the feeling that, while I understood the syntax, I had no idea what I'd use functional programming for in writing the business systems I create for a living. This abstract gives me hope that perhaps this is the article that will finally help me wrap my head around what in the hell FP people are driving at.
Thanks for posting it, JoeXIII007.
posted by ob1quixote at 6:49 PM on November 17, 2014 [2 favorites]
> ... I had no idea what I'd use functional programming for in writing the business systems I create for a living.
I don't know what kind of business systems you create, but there's the classic Haskell paper Composing contracts: an adventure in financial engineering.
posted by benito.strauss at 7:50 PM on November 17, 2014 [3 favorites]
fleacircus: "A friend of mine called this style, "Why use one function, when three classes and a couple interfaces will do?""
Wait - I thought that was just how Java worked?
posted by symbioid at 7:22 AM on November 18, 2014 [1 favorite]
Example from today: in order to save 14 bytes of static strings, he has four tiny character strings sharing the same buffer and then code to extract those tiny substrings.
Ah, the sophomoric fetish for "optimization" that you see so many CS students develop. Which usually isn't actually improving a damn thing; it's about showing off. You're supposed to move beyond thinking of programming as mostly being about showing how clever you are.
posted by thelonius at 8:12 AM on November 18, 2014 [4 favorites]
This thread has been archived and is closed to new comments