H2O Install and Run Guide
Note: Vanilla Apache Hadoop is not supported. Only the Hortonworks, Cloudera, and MapR Hadoop distributions are supported.
- Install and run Hortonworks HDP 2.2 on all nodes. (Contact Steve Capper for the ARM64 source for HDP; you can build it the same way you build vanilla Hadoop.)
- Make sure there are no errors in the Hadoop logs, and check Hadoop health (especially the resource manager and node manager logs).
- Make sure the system time is in sync across all Hadoop nodes.
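One way to check the time-sync requirement is to collect each node's epoch time and look at the spread. The following is a minimal sketch, not part of the original setup: it assumes passwordless SSH to every node and a hypothetical file, nodes.txt, with one hostname per line. The function names are illustrative.

```shell
#!/usr/bin/env bash
# Sketch: report the largest clock difference across Hadoop nodes.

# max_skew reads "host epoch_seconds" lines on stdin and prints the
# difference between the largest and smallest epoch value.
max_skew() {
  awk '{ t = $2 + 0 }
       NR == 1 { min = t; max = t }
       { if (t < min) min = t; if (t > max) max = t }
       END { if (NR == 0) print 0; else print max - min }'
}

# collect_times prints "host epoch_seconds" for each host listed in the
# file given as $1. Assumes passwordless SSH (an assumption of this sketch).
collect_times() {
  while read -r host; do
    if [ -n "$host" ]; then
      echo "$host $(ssh -n "$host" date +%s)"
    fi
  done < "$1"
}

# Example (nodes.txt is a hypothetical file name):
# collect_times nodes.txt | max_skew
```

Because the nodes are queried one after another, the reported value includes SSH latency, so treat a difference of a second or two as noise rather than real drift.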
Download H2O on your master machine from http://h2o.ai/download/
- Go to the above link
- Under the Download H2O section, click the latest stable release link
- Then click the 'Install on Hadoop' tab
- Follow the instructions on that page and download H2O for Hadoop using the link that matches your Hadoop distro
- Run H2O with the following command from H2O's directory:
$HADOOP_HOME/bin/hadoop jar h2odriver.jar -nodes 6 -mapperXmx 25g -timeout 1800 -network '10.123.234.0/24' -output hdfsOutput_h2o_01
- This starts a REST API on the cluster.
- H2O is started as a mapper on each node.
- Set -mapperXmx to suit your system. This is the memory allocated to the mapper process on each node; a minimum of 6g per node is recommended.
- With multiple nodes, bringing up the H2O cluster usually takes longer than the default timeout of 120s, so set a high -timeout value.
- The -output directory works as it does in Hadoop: use a new directory for each run so that old results are not overwritten.
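The points above about -timeout and a fresh -output directory per run can be folded into a small wrapper. This is a sketch under stated assumptions: the timestamp suffix and the function names new_output_dir and h2o_launch_cmd are illustrative conventions, not part of H2O. It only prints the command so you can review it before running it yourself.

```shell
#!/usr/bin/env bash
# Sketch: build the h2odriver launch command with a fresh output directory.

# new_output_dir prints a directory name that differs on each run by
# appending a timestamp to a prefix (the naming scheme is hypothetical).
new_output_dir() {
  local prefix="${1:-hdfsOutput_h2o}"
  echo "${prefix}_$(date +%Y%m%d_%H%M%S)"
}

# h2o_launch_cmd prints the launch command using the flags from this guide.
# Arguments: node count, mapper memory, timeout in seconds, network, output dir.
h2o_launch_cmd() {
  local nodes="$1" mem="$2" timeout="$3" network="$4" output="$5"
  echo "\$HADOOP_HOME/bin/hadoop jar h2odriver.jar" \
       "-nodes $nodes -mapperXmx $mem -timeout $timeout" \
       "-network '$network' -output $output"
}

# Example: print the command for the 6-node cluster used in this guide.
# h2o_launch_cmd 6 25g 1800 '10.123.234.0/24' "$(new_output_dir)"
```

Printing rather than executing keeps the sketch safe to try on a machine without Hadoop; paste the printed line into a shell in H2O's directory to launch.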
Flow Guide and Airlines Demo
Once H2O is running, open your web browser and go to port 54321 of the master machine on which H2O is running, for example http://10.123.234.23:54321. (Note: H2O will not necessarily start its REST API on the master machine. Check the standard output in your terminal when you start H2O.)
- The above step loads H2O Flow, an open source web UI, in the browser.
Video for H2O flow: https://www.youtube.com/watch?v=wzeuFfbW7WE
Video for Airlines demo: https://www.youtube.com/watch?v=5UCZngHX7EI
An older video for airlines demo: https://www.youtube.com/watch?v=bInMSgZhDd4
- Play around with the Cluster Status and Water Meter tools available in the 'Admin' toolbar menu
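The cluster status shown in Flow can also be fetched from the command line through the REST API. The sketch below assumes that the H2O version in use exposes a /3/Cloud status endpoint and that curl is installed; the function names are illustrative. Verify the endpoint against your H2O release before relying on it.

```shell
#!/usr/bin/env bash
# Sketch: query the H2O REST API for cluster status.

# h2o_cloud_url prints the status URL for a host and optional port.
# 54321 is the port used in this guide; /3/Cloud is an assumed endpoint.
h2o_cloud_url() {
  local host="$1" port="${2:-54321}"
  echo "http://${host}:${port}/3/Cloud"
}

# h2o_cloud_status fetches the status response and returns non-zero if
# the node is unreachable or answers with an HTTP error.
h2o_cloud_status() {
  curl --silent --fail --max-time 10 "$(h2o_cloud_url "$@")"
}

# Example, using the address from this guide:
# h2o_cloud_status 10.123.234.23
```

If the call fails against the master machine, try the address printed in the terminal when H2O started, since the REST API may be running on a different node.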
LEG/Engineering/BigData/H2OInstallAndRunGuide (last modified 2016-03-21 23:03:09)