Skunkworks in the Clouds

April 23, 2009

Hal Pomeranz, Deer Run Associates

Skunks... clouds... get it?

I was recently asked to make a guest appearance on a podcast related to information security in “the cloud”.  One of the participants brought up an interesting anecdote from one of his clients.  Apparently the IT group at this company had been approached by a member of their marketing team who was looking for some compute resources to tackle a big data crunching exercise.  The IT group responded that they were already overloaded and it would be months before they could get around to providing the necessary infrastructure.  Rebuffed but undeterred, the marketing person used their credit card to purchase sufficient resources from Amazon’s EC2 to process the data set and got the work done literally overnight for a total cost of approximately $1800.
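
As a back-of-envelope illustration (my numbers, not anything from the podcast): at roughly 2009-era on-demand rates, about $1800 buys you something on the order of a hundred extra-large instances for a full overnight run.  The instance count, hours, and hourly rate below are all assumptions for illustration.

```python
# Back-of-envelope EC2 cost check -- every figure here is an illustrative assumption.
HOURLY_RATE = 0.80   # rough 2009 on-demand rate for an m1.xlarge instance (assumed)
INSTANCES = 100      # size of the ad hoc cluster (assumed)
HOURS = 22           # "literally overnight", plus setup and teardown (assumed)

print(f"Estimated bill: ${HOURLY_RATE * INSTANCES * HOURS:,.2f}")   # ~ $1,760
```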

There ensued the predictable horrified gasping from us InfoSec types on the podcast.  Nothing is more terrifying than skunkworks IT, especially on infrastructure not under our direct control.  “Didn’t they realize how insecure it was to do that?” “What will happen when all of our users realize how easily and conveniently they can do this?” “How can an organization control this type of risky behavior?” We went to bed immersed in our own paranoid but comfortable world-view.

Since then, however, I’ve had the chance to talk with other people about this situation.  In particular, my friend John Sechrest delivered an intellectual “boot to the head” that’s caused me to consider the situation in a new light.  Apparently getting the data processed in a timely fashion was so critical to the marketing department that they figured out their own self-service plan for obtaining the IT resources they needed. If the project was that critical, John asked, was it reasonable from a business perspective for the IT group to effectively refuse to help their marketing department crunch this data?

Maybe the IT group really was overloaded; most of them are these days.  However, the business of the company still needs to move forward, and the clever problem-solving monkeys in various parts of the organization will figure out ways to get their jobs done even without IT support. “Didn’t they realize how insecure it was to do that?”  No, and they didn’t care.  They needed to accomplish a goal, and they did.

“What will happen when all of our users realize how easily and conveniently they can do this?” My guess is they’re going to start doing it a lot more.  Maybe that’s a good thing.  If the IT group is really overloaded, then perhaps it should think about actually empowering its users to do these kinds of “one off” or prototype projects on their own without draining the resources of the core IT group.  Remember that if you let a thousand IT projects bloom, 999 of them are going to wither and die shortly thereafter.  Perhaps IT doesn’t need to waste time managing the death of the 999.

“How can an organization control this type of risky behavior?” You probably can’t.  So perhaps your IT group should provide a secure offering that’s so compelling that your users will want to use your version rather than the commodity offerings that are so readily available.  This solution will have to be tailored to each company, but I think it starts with things like:

  • Pre-configured images with known baseline configurations and relevant tools so that groups can get up and running quickly without having to build and upload their own images (a rough sketch of what this might look like follows this list).
  • Easy toolkits for migrating data into and out of these images in a secure fashion, with some sort of DLP solution baked in.
  • Secure back-end storage to protect the data at rest in these images with no extra work on the part of the users.
  • Integration with the organization’s existing identity management and/or AAA framework so that users don’t have to re-implement their own solutions.
  • Integration with the organization’s auditing and logging infrastructures so you know what’s going on.
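
To make the first item a little more concrete, here is a minimal sketch of what a self-service “launch from the blessed baseline” helper might look like, written against today’s boto3 library (the 2009-era tooling would obviously differ).  Every identifier in it (the baseline AMI, the IAM instance profile, the central log host) is a placeholder assumption, not a reference to any real environment.

```python
# Sketch of a self-service "launch from the blessed baseline" helper.
# Assumes boto3 is installed and AWS credentials are already configured.
# All identifiers below (AMI ID, instance profile, log host) are placeholders.
import boto3

BASELINE_AMI = "ami-00000000"          # org-maintained baseline image (placeholder)
INSTANCE_PROFILE = "corp-selfservice"  # IAM role tying into central identity (placeholder)
LOG_HOST = "loghost.example.com"       # central syslog/auditing endpoint (placeholder)

# User data that points the instance at central logging, so the auditing
# integration happens with no extra work on the part of the user.
USER_DATA = f"""#!/bin/sh
echo '*.* @{LOG_HOST}' >> /etc/rsyslog.d/90-central.conf
service rsyslog restart
"""

def launch_baseline(project, instance_type="m1.large", count=1):
    """Launch pre-configured instances tagged with the requesting project."""
    ec2 = boto3.resource("ec2")
    return ec2.create_instances(
        ImageId=BASELINE_AMI,
        InstanceType=instance_type,
        MinCount=count,
        MaxCount=count,
        IamInstanceProfile={"Name": INSTANCE_PROFILE},
        UserData=USER_DATA,
        TagSpecifications=[{
            "ResourceType": "instance",
            "Tags": [{"Key": "Project", "Value": project},
                     {"Key": "Baseline", "Value": BASELINE_AMI}],
        }],
    )

if __name__ == "__main__":
    # e.g. the marketing data-crunching job from the anecdote above
    launch_baseline("marketing-crunch", count=4)
```

The point is not the specific API but the shape of the offering: the user asks for capacity and a project tag, and the baseline image, identity integration, and logging come along for free.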

Putting together the kind of framework described above is a major IT project, and will require input and participation from your user community.  But once accomplished, it could provide massive leverage to overtaxed IT organizations.  Rather than IT having to engineer everything themselves, they provide secure self-service building blocks to their customers and let them have at it.

Providing architecture support and guidance in the early stages of each project is probably prudent.  After all, the one hardy little flower that blooms and refuses to die may become a critical resource that eventually needs to be moved back “in house”.  The fact that the service was built from blocks already well-integrated with the organization’s centralized IT infrastructure will help, but a reasonable architectural design from the start will also be a huge help when it comes time to migrate and continue scaling the service.

Am I advocating skunkworks IT?  No, I like to think I’m advocating self-service IT on a grand scale.  You’ll see what skunkworks IT looks like if you ignore this issue and just let your users develop their own solutions because you’re too busy to help them.

Hal Pomeranz, Deer Run Associates

Recently my pal Bill Schell and I were gassing on about the current and future state of IT employment, and he brought up the topic of IT jobs being “lost to the Cloud”.  In other words, if we’re to believe in the marketing hype of the Cloud Computing revolution, a great deal of processing is going to move out of the direct control of the individual organizations where it is currently being done.  One would expect IT jobs within those organizations that had previously been supporting that processing to disappear, or at least migrate over to the providers of the Cloud Computing resources.

I commented that the whole Cloud Computing story felt just like another turn in the epic cycle between centralized and decentralized computing.  He and I had both lived through the end of the mainframe era, into “Open Systems” on user desktops, back into centralized computing with X terminals and other “thin clients”, back out onto the desktops again with the rise of extremely powerful, extremely low cost commodity hardware, and now we’re harnessing that commodity hardware into giant centralized clusters that we’re calling “Clouds”.  It’s amazingly painful for the people whose jobs and lives are dislocated by these geologic shifts in computing practice, but the wheel keeps turning.

Bill brought up an economic argument that seems to crop up every time the wheel starts turning back towards centralized computing.  Essentially the argument is summarized as follows (a toy worked example appears after the list):

  • As the capital cost of computing power declines, support costs tend to predominate.
  • Centralized support costs less than decentralized support.
  • Therefore centralized computing models will ultimately win out.
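
To see how the arithmetic plays out, here is a toy cost model.  Every number in it is invented purely for illustration; the only structural assumption is the one the argument itself makes, namely that centralized support is cheaper per seat than decentralized support.

```python
# Toy illustration of the "support costs predominate" argument.
# All figures are invented for illustration only.
def total_cost(hardware_per_seat, support_per_seat, seats=1000):
    """Total annual cost for a fleet: hardware plus support, per seat."""
    return seats * (hardware_per_seat + support_per_seat)

for hw in (5000, 2000, 500):  # declining capital cost per seat
    decentralized = total_cost(hw, support_per_seat=1200)
    centralized = total_cost(hw, support_per_seat=700)  # assumed cheaper support
    savings = 100 * (decentralized - centralized) / decentralized
    print(f"hw=${hw:>4}: decentralized=${decentralized:,}  "
          f"centralized=${centralized:,}  centralized saves {savings:.0f}%")
```

As the hardware term shrinks, the fixed gap in support costs becomes a larger and larger share of the total (roughly 8%, 16%, and 29% in the made-up numbers above), which is exactly the conclusion the argument draws.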

If you believe this argument, by now we should have all embraced a centralized computing model.  Yet instead we’ve seen this cycle between centralized and decentralized computing.  What’s driving the cycle?  It seems to me that there are other factors that work in opposition and keep the wheel turning.

First, it’s generally been a truism that centralized computing power costs more than decentralized computing.  In other words, it’s more expensive to hook 64 processors and 128GB of RAM onto the same backplane than it is to purchase 64 uniprocessor machines each with 2GB of RAM.  The Cloud Computing enthusiasts are promising to crack that problem by “loosely coupling” racks of inexpensive machines into a massive computing array. Though when “loose” is defined as Infiniband switch fabrics and the like, you’ll forgive me if I suspect they may be playing a little Three Card Monte with the numbers on the cost spreadsheets.  The other issue to point out here is that if your “centralized” computing model is really just a rack of “decentralized” servers, you’re giving up some of the savings in support costs that the centralized computing model was supposed to provide.

Another issue that rises to the fore when you move to a centralized computing model is the cost of maintaining the organization’s access to the centralized resource.  One obvious cost area is basic “plumbing” like network access: how much is it going to cost you to get all the bandwidth you need (in both directions) at appropriately low latency?  Similarly, when your compute power is decentralized it’s easier to hide environmental costs like power and cooling than when all of those machines are racked up together in the same room.  A less obvious cost is keeping the centralized resource up and available all the time, because with all of your “eggs in one basket”, as it were, your entire business can be impacted by the same outage.  “Five-nines” uptime is really, really expensive.  Back when your eggs were spread out across multiple baskets, you didn’t necessarily care as much about the uptime of any single basket, and the aggregate cost of keeping all the baskets available when needed was lower.
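
For a sense of scale on the “five nines” remark, the availability percentages translate into allowed downtime like this (simple arithmetic, no assumptions beyond a 365.25-day year):

```python
# Allowed downtime per year at various availability targets.
MINUTES_PER_YEAR = 365.25 * 24 * 60

for label, availability in [("three nines", 0.999),
                            ("four nines", 0.9999),
                            ("five nines", 0.99999)]:
    downtime = MINUTES_PER_YEAR * (1 - availability)
    print(f"{label} ({availability:.3%}): about {downtime:.0f} minutes of downtime per year")
```

Three nines allows roughly nine hours of outage a year; five nines allows about five minutes, and squeezing out those last few hours is where the real money goes.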

The centralized vs. decentralized cycle keeps turning because in any given computing epoch the costs of all of the above factors rise and fall.  This leads IT folks to optimize one factor over another, which promotes shifts in computing strategy, and the wheel turns again.

Despite what the marketeers would have you believe, I don’t think the Cloud Computing model has proven itself to the point where there is a massive impact on the way mainstream business is doing IT.  This may happen, but then again it may not.  The IT job loss we’re seeing now has a lot more to do with the general problems in the world-wide economy than jobs being “lost to the Cloud”.  But it’s worth remembering that massive changes in computing practice do happen on a regular basis, and IT workers need to be able to read the cycles and position themselves appropriately in the job market.