Creating an HDInsight cluster from the Azure portal is easy. Sometimes, though, you want the choices and best practices explained along with the “how to.” I have created a series of slides with audio recordings to walk you through the process and the choices. They are available as sessions 1-8 of “Create HDInsight Cluster in Azure Portal” on my YouTube channel, Small Bites of Big Data.
Playlist Getting Started with HDInsight: https://www.youtube.com/playlist?list=PLAD2dOpGM3s1R2L5HgPMX4MkTGvSza7gv
- Why HDInsight: http://youtu.be/J9KzIShLeD8
- Azure Subscription: http://youtu.be/lSxMtmRE114
- Azure Storage – WASB: http://youtu.be/6OdDDmdaVVE
- Metastore: http://youtu.be/1Og_eftYVpA
- Create HDInsight: http://youtu.be/SysIo3LwONk
- Hive Query: http://youtu.be/DRAuOXsuec0
- Load Demo Data: http://youtu.be/XyiOpRPjfUs
- Pricing, Automation, and Wrapup: http://youtu.be/78YowrOnNGM
PowerPoint deck: http://www.slideshare.net/cindygross1/create-hd-insightfeb2015
HDInsight is Hadoop on Azure as a service.
- Easy, cost-effective, changeable scale-out data processing
- Lower TCO – easily add/remove/scale
- Separation of storage and compute allows data to exist across clusters
- Hortonworks HDP is one of the 3 major Hadoop distributions, and the most purely open source
- HDInsight *IS* Hortonworks HDP as a service in Azure (cloud)
- Metastore (HCatalog) exists independently across clusters via Azure SQL Database
- The number, size, and type of clusters are flexible, and all can access the same data
- Hive is a Hadoop component that makes data look like rows/columns for data warehouse type activities
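To make the storage/compute separation and the Hive point a bit more concrete, here is a minimal Python sketch that builds a WASB path and the HiveQL for an external table over it. The container, storage account, folder, table, and column names are all made up for illustration, and the comma-delimited text layout is an assumption about the files. Any cluster attached to the same storage account could run the same statement and see the same data.

```python
from typing import Mapping


def wasb_uri(container: str, account: str, path: str = "") -> str:
    """Build a WASB URI for a blob path in an Azure storage account."""
    return f"wasb://{container}@{account}.blob.core.windows.net/{path.lstrip('/')}"


def external_table_ddl(table: str, columns: Mapping[str, str], location: str) -> str:
    """Return HiveQL that maps comma-delimited files at `location` to rows and columns."""
    cols = ", ".join(f"{name} {hive_type}" for name, hive_type in columns.items())
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} ({cols}) "
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' "
        f"STORED AS TEXTFILE LOCATION '{location}'"
    )


# Hypothetical names: replace with your own container, account, and schema.
location = wasb_uri("demodata", "mystorageaccount", "/weblogs/")
ddl = external_table_ddl(
    "weblogs",
    {"log_date": "STRING", "url": "STRING", "hits": "INT"},
    location,
)
print(ddl)
```

Because the table is EXTERNAL, dropping it (or deleting the cluster) leaves the files in blob storage untouched, which is what lets the data outlive any one cluster.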
Continue reading on SQLCindy’s Small Bites of Big Data blog.