1. You will investigate how Hive works: create, load, query, and store data in Apache Hive on
both our HU cloud platform and the MapR sandbox.
2. You will compare Hive performance between our HU cloud platform and the MapR sandbox.
* Most of the content comes from https://learn.mapr.com/ and is used with the permission of MapR
Technologies.
Prerequisite:
For our HU cloud platform,
For the Hadoop cluster overview,
http://hdfs-namenode-hadoop.apps.myhu.cloud/dfshealth.html#tab-overview
To access the Hadoop cluster nodes,
https://master1.myhu.cloud:8443/console/project/hadoop/browse/pods
User name: hadoop, password: hadoop
To access the terminal of the name node, click “hdfs-namenode-0” and then click
“Terminal”.
1. Create a folder named with your student ID and work inside that folder.
2. To upload files, use “wget” or any other command you prefer.
3. If you have any problem with the HU cloud, report it in your submission and
use Google Cloud or Amazon cloud instead.
4. If you have any problem with the HU cloud, Google Cloud, or Amazon cloud,
report it in your submission and work only with the MapR sandbox.
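Steps 1 and 2 above can be sketched in the name node terminal roughly as follows. This is only a minimal sketch: the student ID s1234567, the WORK_DIR location, and the DATA_URL variable are hypothetical placeholders, not values given in this lab, so replace them with your own.

```shell
#!/usr/bin/env bash
# Hypothetical placeholders (assumptions, not given in the lab):
#   STUDENT_ID - your own student ID
#   WORK_DIR   - where the working folder is created (default: current directory)
#   DATA_URL   - URL of a file you want to upload; left empty here
STUDENT_ID="${STUDENT_ID:-s1234567}"
WORK_DIR="${WORK_DIR:-$PWD/$STUDENT_ID}"

# Step 1: create the folder named with the student ID and work inside it.
mkdir -p "$WORK_DIR"
cd "$WORK_DIR"

# Step 2: fetch a file with wget, but only if a URL was actually supplied.
if [ -n "${DATA_URL:-}" ]; then
  wget "$DATA_URL"
fi

# Show where we are now working.
pwd
```

For example, running it with STUDENT_ID set to your own ID and DATA_URL set to the address of a data file leaves you inside the new folder with the file downloaded there.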
To use the MapR sandbox,
Please download one of the MapR sandboxes listed below.
• VMware Course Sandbox: http://package.mapr.com/releases/v5.1.0/sandbox/MapR-Sandbox-For-Hadoop-5.1.0-vmware.ova
• VirtualBox Course Sandbox: http://package.mapr.com/releases/v5.1.0/sandbox/MapR-Sandbox-For-Hadoop-5.1.0.ova
For the installation, please refer to https://mapr.com/docs/52/SandboxHadoop/c_sandbox_overview.html
Logging in to the Command Line
● Before you get started, you'll want to have the IP address handy for your Sandbox VM.
See the screenshot below for an example of where to find that.
● Next, use an SSH client such as PuTTY (Windows) or Terminal (Mac) to log in. See below
for an example:
● Use user ID user01 and password mapr.
● For VMware use: $ ssh user01@ipaddress
● For VirtualBox use: $ ssh user01@127.0.0.1 -p 2222
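As a convenience, the two login commands above could be wrapped in a small helper that prints the right ssh command for your hypervisor. This is a sketch only: the function name sandbox_ssh_cmd and the sample address 192.168.56.101 are hypothetical, and the real IP address is the one shown by your own Sandbox VM.

```shell
#!/usr/bin/env bash
# sandbox_ssh_cmd is a hypothetical helper (not part of the MapR sandbox).
# Usage: sandbox_ssh_cmd vmware <ip-address>
#        sandbox_ssh_cmd virtualbox
sandbox_ssh_cmd() {
  case "$1" in
    vmware)
      # VMware: connect directly to the IP address of the Sandbox VM.
      if [ -z "${2:-}" ]; then
        echo "vmware requires the Sandbox VM IP address" >&2
        return 1
      fi
      echo "ssh user01@$2"
      ;;
    virtualbox)
      # VirtualBox: connect through the forwarded port 2222 on localhost.
      echo "ssh user01@127.0.0.1 -p 2222"
      ;;
    *)
      echo "usage: sandbox_ssh_cmd vmware <ip-address> | virtualbox" >&2
      return 1
      ;;
  esac
}

# Print the command for a VirtualBox sandbox; the password is mapr when prompted.
sandbox_ssh_cmd virtualbox
```

The helper only prints the command rather than running it, so you can check it before connecting.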
For the MapR sandbox,
Connect to the Hive CLI
The lab files contain the data and source code you will use to complete the lab exercises.
1. Log in to your cluster as user01 (password is mapr).
2. Position yourself in the /user/user01 directory in the cluster file system:
$ cd /mapr/MyCluster/user/user01
3. Then, download and unzip the lab files:
$ wget http://course-files.mapr.com/DA4400-R1/DA440