I was talking with a buddy of mine yesterday, and he was telling me a story that made me laugh, and made me cry. This is what seems to inevitably happen when the virtualization team creates a stable environment. But, I’m getting ahead of myself…

My buddy, let’s call him Dude, was at a client site helping assess an organization’s processes around their virtualized computing environments. They were tackling this from an operational readiness perspective, as well as for security and compliance requirements. One of the things they do is run a network scan to independently discover what is connected to the network.

“Huh, that’s funny…” says Dude. He’s looking at a report that shows about fifty Oracle instances, all running on the VMware ESX cluster.

“Hey, DBA Manager, how many Oracle database instances did you say there were in production on those VMware ESX boxes?”

“Four.”

“That’s odd. Here’s a network scan that shows fifty of them running.”

“That’s inconceivable. I own all the database administration and deployment activities, and it’s just not possible that there are fifty of them out there!”

(Let’s ignore for now the fact that he used the word “inconceivable” incorrectly, just like in The Princess Bride.)

“Well, here’s the report. There are fifty of them, and here are the IP addresses.”

“Wow, that is just highly unlikely. How did this happen?”

Of course, the most likely scenario is that someone who didn’t want to roll their own VM took one that was already in production. This is great, because this is what virtualization enables, saving everyone hours of watching an OS install.

The problem is, the VM that they chose to clone had Oracle installed. Here’s a quick analysis of what went wrong:

  • There was probably no repository of known, authorized VMs to clone: images that had been hardened, had gone through a security review, and had all unneeded components (like, say, Oracle databases) removed
  • Sysadmins took a production Oracle VM and used it as the basis of many more VMs, most likely without authorization and without understanding what they were cloning
  • Roles and permissions were set up incorrectly, which allowed VM admins to clone the Oracle VM
  • Changes to the VM inventory were made without anyone noticing
  • Tons of VMM resources were used unnecessarily, perhaps resulting in frantic capacity expansion (e.g., what’s another $40K to add more servers to the ESX cluster, because all the available RAM keeps being used up in 3GB chunks by the jumbo VMs?)
  • Licensing headaches and disasters (e.g., production systems stop working after a reboot, because all the license keys are being used by cloned VMs; that’s never happened in real life, right?)
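
For the curious, the check that caught all this boils down to comparing what the scan found against what is supposed to be there. Here is a minimal sketch in Python, assuming you already have both lists as plain IP address strings. The function name `reconcile` and every address in it are made up for illustration; this is not Dude's actual tooling or data, just the four-versus-fifty shape of the story.

```python
import ipaddress


def reconcile(discovered, authorized):
    """Compare scan results against the authorized inventory.

    Both arguments are iterables of IP address strings. Returns a dict with
    the addresses found on the network but not authorized, and the
    authorized addresses that the scan did not find.
    """
    discovered_set = set(discovered)
    authorized_set = set(authorized)
    return {
        # Sort numerically by address rather than as plain strings.
        "unauthorized": sorted(discovered_set - authorized_set,
                               key=ipaddress.ip_address),
        "missing": sorted(authorized_set - discovered_set,
                          key=ipaddress.ip_address),
    }


# Hypothetical inventory: the four instances the DBA manager knows about.
authorized_instances = ["10.0.0.11", "10.0.0.12", "10.0.0.13", "10.0.0.14"]

# Hypothetical scan results: those four plus 46 clones, fifty in total.
scan_results = authorized_instances + [
    "10.0.0.%d" % n for n in range(101, 147)
]

report = reconcile(scan_results, authorized_instances)
print(len(report["unauthorized"]), "instances nobody authorized")
```

Run on a schedule, a comparison like this would at least stop changes to the inventory from going unnoticed, though it does nothing about the permissions that allowed the cloning in the first place.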

Of course, Dude and I had a laugh about this. After all, it’s so easy to do. It just takes one right-click and about 15 seconds to do! The sad thing is, both of us could point to ten other people (including ourselves) who had this happen to them, too.

One right-click and 15 seconds. Voila. A new General Ledger system! (And a new instance of Great Plains, a new SQL Server instance, complete with live interfaces to Softrax revenue manager, ADP payroll, etc…)