I was reminded of this excellent blog post by Leon Fayer of OmniTI.
As software developers, we often think our job is to develop software, but, really, that is just the means to an end, and the end is to empower business to reach their goals. Your code may be elegant, but if it doesn't meet the objectives (be they time or business) it doesn't f*ing work.
Likewise I've seen sysadmin projects that spent so much time in the planning stage that they never were going to ship unless someone stood up and said, "we've planned enough. I'm going to start coding whether you like it or not". Yes, that means that some aspect of the design wasn't perfect. Yet, the suggestion that more planning would lead to the elimination of all design imperfections is simply hubris. (If not hubris, it is a sign that one's OCD or OCD-like tendencies is being used as a cowardly excuse to not get started.)
But what I really want to write about today is...
"The big project that won't ship".
There once was a team that had a large software base. One part of it was obsolete and needed to be rewritten. It was written in an unsupported language. It didn't have half the features it needed. It didn't even have a GUI.
There were two proposals:
One was to refactor and recode bits of it until the system was replaced. Along the way every few weeks the results would see the light of day. There were many milestones: add a read-only "viewer" GUI, build a better data storage system, refactor the old code to use the new GUI, enhance the GUI to include full editing, etc.
The competing proposal was to assign 4 developers to build a replacement system. They'd be given 2 years to write the new system from scratch. During that time they'd be protected and, essentially, hidden. The justification for this was that the old system was so broken that doing any kind frankenstein half-old half-new system would be flatly impossible or would be a drag to efficiency. It would be more efficient to code it "pure" and not constantly be dealing with the old system.
Management approved the competing proposal. 1.5 years later the project hadn't gotten anywhere. When people were needed for other projects, management looked around and decided to steal the 4 engineers. This is because it is good management to take resources away from low priority projects and put it on high priority projects. Any project, no matter how noble, with no results for 18 months, is lower priority than a project with a burning need. In fact, the definition of a low priority project is that you can wait 2 years for the results.
The project was cancelled and 1.5 years of work was thrown away. 4 engineers times 18 months... at least a million dollars down the tube.
Meanwhile the person that proposed the incremental project had gone forward in parallel with the first milestone: a simple enhancement to the existing system that solved the biggest complaint of the system. It talked to the old datastore and would have to be re-engineered when the new datastore was finally available, but it worked and solved a very serious problem. It was a "half measure" but served its purpose.
The person that created the "half measure" had been scolded for wasting time on a parallel project. Yet, the "big" project was cancelled and this "half measure" is still in use today. At least he had the gravitas to not say "I told you so".
The biggest "cost" to a company is opportunity cost. That is, the loss of $$$ from not taking action. By shipping early and often you grab opportunity.
Imagine a factory that made widgets for 24 months, stored them in a warehouse, and then started shipping them all at once. That would be crazy. A factory sells what they make as soon as they are manufactured. Software companies used to write code for years and then ship it. That was crazy. Now you make a minimum viable product, ship that, and use the knowledge gained to make the next iteration.
My career advice is to only do projects that produce usable output every few weeks or months. Being on a project that will not show any results for a year or more is a good way to hide from management. Being invisible is a career killer. For software projects this means setting early milestones of some kind of minimal viable product. For purely operational projects be able to announce milestones or progress (number of machines converted, number of ms latency improvement, etc.)
At StackExchange there is a big project coming up related to how we provision new machines. While a "green field" approach would be nice, I'm looking into how we can refactor the current cruddy bits so that we can do this project incrementally. The biggest problem is that we have a crappy CMDB with no API. Everything seems to touch that one element and replacing it is going to be a pain. (I'd like to evaluate Flipkart/HostDb if anyone has opinions, let me know.) However I think we can restructure the project into 5 independent milestones. By "independent" I mean they can be done in any order with the other 4 requiring minimal refactoring as a result.
This will have a few benefits: We'll get the benefit of each milestone as it happens. Certain milestones can be done in parallel by different sub-teams. If the first few completed milestones make the process "good enough", we don't have to do the other milestones.
I think you have clearly demonstrated yet another example of the "Second System Effect" which is well described in Fred Brooks' "The Mythical Man-Month." This pattern emerges again and again in the industry. The temptation of a greenfield design leads you to develop OS/2, with all it's internal elegance and end-user horror. Your career advice is sound and backed by years of industry successes and disasters; focus on delivering customer value before worrying about the perfect underpinning system.