Category Archives: Big Data

Rapidly Deploy Hadoop on vSphere: Part 2 – Distros

In Rapidly Deploy Hadoop on vSphere: Part 1 – Basics, I discussed how to set up vSphere Big Data Extensions and deploy a Hadoop cluster using the built-in Apache Hadoop 1.2.1 distribution. While this will get a basic cluster up quickly, you don’t get any of the other features that come with commercial distributions. VMware has addressed this by allowing you to add other distros to vBDE manually. Once they have… Read More »

Rapidly Deploy Hadoop on vSphere: Part 1 – Basics

During his remarks in the General Session at PEX 2014, VMware CTO Ben Fathi said that Hadoop clusters were part of the 20% of the enterprise data center not being virtualized. This makes a lot of sense in a use case where you have a very large data set that is either static or slowly growing. However, due to the adoption of ideas like “The Internet of Things”, more companies are… Read More »

Visualizing vscsiStats with RStudio

After yesterday’s post, we have gathered valuable data about VM drive activity with the vscsiStats tool. It is time to put that data to use. The CSV file that was generated is broken down into 16 different metrics for each drive collected, reported as a histogram. While the numbers themselves are helpful, they have a much larger impact when they are visualized. This can be easily done… Read More »