How To Install Hadoop In Redhat Linux
12.2. Installing the Hadoop FileSystem Plugin for Red Hat Storage
12.2.1. Adding the Hadoop Installer for Red Hat Storage
You must have the big-data channel added and the Hadoop components installed on all the servers to use the Hadoop feature on Red Hat Storage. Run the following command on the Ambari Management Server, the YARN Master Server and all the servers within the Red Hat Storage trusted storage pool:
# yum install rhs-hadoop rhs-hadoop-install
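If you manage several nodes, the same install can be pushed out in one pass after running it locally on the Ambari Management Server. This is only a convenience sketch, assuming password-less SSH as root; the hostnames are the illustrative ones used later in this chapter, so substitute your own YARN Master Server and storage nodes:
# Install the Hadoop packages on every remaining node in the pool (hostnames are examples).
for node in yarn.hdp rhs-1.hdp rhs-2.hdp rhs-3.hdp rhs-4.hdp; do
    ssh root@"$node" "yum -y install rhs-hadoop rhs-hadoop-install"
done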
On the YARN Master Server
The YARN Master Server is required to FUSE mount all Red Hat Storage volumes that are used with Hadoop. It must have the Red Hat Storage Client Channel enabled so that the setup_cluster script can install the Red Hat Storage Client Libraries on it.
-
If you have registered your machine using Red Hat Subscription Manager, enable the channel by running the following command:
# subscription-manager repos --enable=rhel-6-server-rhs-client-1-rpms
-
If you have registered your machine using Satellite server, enable the channel by running the following command:
# rhn-channel --add --channel=rhel-x86_64-server-rhsclient-6
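Whichever registration method you use, it is worth confirming that the client channel is visible to yum before continuing. A quick check (a sketch only; the repository id shown is the Subscription Manager one from above):
# The Red Hat Storage client repository should appear in the enabled list.
yum repolist enabled | grep -i rhs-client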
12.2.2. Configuring the Trusted Storage Pool for use with Hadoop
Red Hat Storage provides a series of utility scripts that allow you to quickly set up Red Hat Storage for use with Hadoop and install the Ambari Management Server. You must first run the Hadoop cluster configuration initial script to install the Ambari Management Server, set up the YARN Master Server to host the Resource Manager and Job History Server services for Red Hat Storage, and build a trusted storage pool if it does not exist.
You must run the script given below irrespective of whether you have an existing Red Hat Storage trusted storage pool or not.
To run the Hadoop configuration initial script:
-
Open the terminal window of the server designated to be the Ambari Management Server and navigate to the
/usr/share/rhs-hadoop-install/
directory. -
Run the hadoop cluster configuration script as given below:
setup_cluster.sh [-y] [--hadoop-mgmt-node <node>] [--yarn-master <node>] <node-list-spec>
where <node-list-spec> is
<node1>:<brickmnt1>:<blkdev1> <node2>[:<brickmnt2>][:<blkdev2>] [<node3>[:<brickmnt3>][:<blkdev3>]] ... [<nodeN>[:<brickmntN>][:<blkdevN>]]
where
-
<brickmnt> is the name of the XFS mount for the above <blkdev>, for example,
/mnt/brick1
or/external/HadoopBrick
. When a Red Hat Storage volume is created, its bricks have the volume name appended, so <brickmnt> is a prefix for the volume's bricks. For example, if a new volume is named HadoopVol
then its brick list would be: <node>:/mnt/brick1/HadoopVol
or<node>:/external/HadoopBrick/HadoopVol
. -
<blkdev> is the name of a Logical Volume device path, for example,
/dev/VG1/LV1
or/dev/mapper/VG1-LV1
. Since LVM is a prerequisite for Red Hat Storage, the <blkdev> is not expected to be a raw block path, such as /dev/sdb
.
Given below is an example of running the setup_cluster.sh script on the YARN Master Server and four Red Hat Storage nodes that have the same logical volume and mount point intended to be used as a Red Hat Storage brick.
./setup_cluster.sh --yarn-master yarn.hdp rhs-1.hdp:/mnt/brick1:/dev/rhs_vg1/rhs_lv1 rhs-2.hdp rhs-3.hdp rhs-4.hdp
If a brick mount is omitted, the brick mount of the first node is used, and if a block device is omitted, the block device of the first node is used. (A sketch of preparing a brick's logical volume and XFS mount follows this procedure.)
-
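If a storage node does not yet have a logical volume and XFS brick mount to pass to setup_cluster.sh, the following is a minimal sketch of preparing one. The device name /dev/sdb and the size are illustrative, and the commands omit the thin-provisioning and alignment tuning recommended elsewhere in the Red Hat Storage documentation, so treat this as orientation rather than a complete procedure:
# Carve a logical volume out of /dev/sdb (illustrative device name and size).
pvcreate /dev/sdb
vgcreate rhs_vg1 /dev/sdb
lvcreate -L 500G -n rhs_lv1 rhs_vg1
# Format it with XFS (512-byte inodes are recommended for bricks) and mount it persistently.
mkfs.xfs -i size=512 /dev/rhs_vg1/rhs_lv1
mkdir -p /mnt/brick1
mount /dev/rhs_vg1/rhs_lv1 /mnt/brick1
echo '/dev/rhs_vg1/rhs_lv1 /mnt/brick1 xfs defaults 0 0' >> /etc/fstab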
12.2.3. Creating Volumes for use with Hadoop
To use an existing Red Hat Storage volume with Hadoop, skip this section and continue with the section Adding the User Directories for the Hadoop Processes on the Red Hat Storage Volume.
Whether you have a new or existing Red Hat Storage trusted storage pool, to create a volume for use with Hadoop, the volume needs to be created in such a way as to support Hadoop workloads. The supported volume configuration for Hadoop is a Distributed Replicated volume with replica count 2. You must not name the Hadoop-enabled Red Hat Storage volume hadoop
or mapredlocal
.
Run the script given below to create new volumes that you intend to use with Hadoop. The script provides the necessary configuration parameters to the volume as well as updates the Hadoop configuration to make the volume accessible to Hadoop.
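For orientation, the volume layout the script produces on four servers corresponds roughly to the following manual gluster commands. This is a hedged sketch using the example hostnames and brick mount from this chapter; prefer create_vol.sh, which also applies the Hadoop-specific volume options and configuration updates:
# Distributed Replicated volume, replica count 2, across four nodes (illustrative only).
gluster volume create HadoopVol replica 2 rhs-1.hdp:/mnt/brick1/HadoopVol rhs-2.hdp:/mnt/brick1/HadoopVol rhs-3.hdp:/mnt/brick1/HadoopVol rhs-4.hdp:/mnt/brick1/HadoopVol
gluster volume start HadoopVol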
-
Open the terminal window of the server designated to be the Ambari Management Server and navigate to the
/usr/share/rhs-hadoop-install/
directory. -
Run the Hadoop cluster configuration script as given below:
create_vol.sh [-y] <volName> <volMountPrefix> <node-list>
where
-
<node-list> is: <node1>:<brickmnt1> <node2>[:<brickmnt2>] <node3>[:<brickmnt3>] ... [<nodeN>[:<brickmntN>]]
-
<brickmnt> is the name of the XFS mount for the block devices used by the above nodes, for example,
/mnt/brick1
or/external/HadoopBrick
. When a RHS volume is created, its bricks will have the volume name appended, so <brickmnt> is a prefix for the volume's bricks. For example, if a new volume is named HadoopVol
then its brick list would be: <node>:/mnt/brick1/HadoopVol
or<node>:/external/HadoopBrick/HadoopVol
.
The node-list for
create_vol.sh
is similar to the node-list-spec
used by setup_cluster.sh
except that a block device is not specified in create_vol
. Given below is an example of how to create a volume named HadoopVol, using four Red Hat Storage servers, each with the same brick mount, and mount the volume on
/mnt/glusterfs
./create_vol.sh HadoopVol /mnt/glusterfs rhs-1.hdp:/mnt/brick1 rhs-2.hdp rhs-3.hdp rhs-4.hdp
-
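After the script completes, you can sanity-check the result from any storage node. A small verification sketch (HadoopVol is the example volume name used above):
# The volume type should read Distributed-Replicate with a replica count of 2, and it should be started.
gluster volume info HadoopVol
gluster volume status HadoopVol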
12.2.4. Adding the User Directories for the Hadoop Processes on the Red Hat Storage Volume
After creating the volume, you need to set up the user directories for all the Hadoop ecosystem component users that you created in the prerequisites section. This is required for completing the Ambari deployment successfully.
Perform the steps given below only when the volume is created and enabled to be used with Hadoop.
Open the terminal window of the Red Hat Storage server within the trusted storage pool and run the following commands:
# mkdir /mnt/glusterfs/HadoopVol/user/mapred
# mkdir /mnt/glusterfs/HadoopVol/user/yarn
# mkdir /mnt/glusterfs/HadoopVol/user/hcat
# mkdir /mnt/glusterfs/HadoopVol/user/hive
# mkdir /mnt/glusterfs/HadoopVol/user/ambari-qa
# chown ambari-qa:hadoop /mnt/glusterfs/HadoopVol/user/ambari-qa
# chown hive:hadoop /mnt/glusterfs/HadoopVol/user/hive
# chown hcat:hadoop /mnt/glusterfs/HadoopVol/user/hcat
# chown yarn:hadoop /mnt/glusterfs/HadoopVol/user/yarn
# chown mapred:hadoop /mnt/glusterfs/HadoopVol/user/mapred
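The same directories can also be created and assigned in one pass. A minimal sketch, assuming the volume is FUSE mounted at /mnt/glusterfs/HadoopVol as in the commands above:
# Create a user directory for each Hadoop service account and hand ownership to that account.
for u in mapred yarn hcat hive ambari-qa; do
    mkdir -p /mnt/glusterfs/HadoopVol/user/"$u"
    chown "$u":hadoop /mnt/glusterfs/HadoopVol/user/"$u"
done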
12.2.5. Deploying and Configuring the HDP 2.0.6 Stack on Red Hat Storage using Ambari Manager
Perform the following steps to deploy and configure the HDP stack on Red Hat Storage:
This section describes how to deploy HDP on Red Hat Storage. Selecting HDFS
as the storage option in the HDP 2.0.6.GlusterFS stack is not supported. If you want to deploy HDFS, then you must select the HDP 2.0.6 stack (not HDP 2.0.6.GlusterFS) and follow the instructions in the Hortonworks documentation.
-
Launch a web browser and enter
http://hostname:8080
in the URL by replacing hostname with the hostname of your Ambari Management Server. If the Ambari console fails to load in the browser, it is usually because iptables is still running. Stop iptables by opening a terminal window and running the
service iptables stop
command (a sketch for disabling it persistently follows this procedure). -
Enter
admin
and admin
for the username and password. -
Assign a name to your cluster, such as
MyCluster
. -
Select the
HDP 2.0.6.GlusterFS Stack
(if not already selected by default) and click Next
. -
On the
Install Options
screen:-
For
Target Hosts
, add the YARN server and all the nodes in the trusted storage pool. -
Select
Perform manual registration on hosts and do not use SSH
option. -
Accept any warnings you may see and click
Register and Confirm
button. -
Click
OK
on the Before you proceed
warning. The Ambari Agents have all been installed for you during the setup_cluster.sh
script.
-
-
For
Confirm Hosts
, the progress must show as green for all the hosts. Click Next
and ignore the Host Check
warning. -
For
Choose Services
, unselect HDFS and, at a minimum, select GlusterFS, Ganglia, YARN+MapReduce2 and ZooKeeper.-
Do not select the Nagios service, as it is not supported. For more information, see subsection 21.1. Deployment Scenarios of chapter 21. Administering the Hortonworks Data Platform on Red Hat Storage in the Red Hat Storage 3.0 Administration Guide.
-
The use of HBase has not been extensively tested and is not yet supported.
-
This section describes how to deploy HDP on Red Hat Storage. Selecting
HDFS
as the storage option in the HDP 2.0.6.GlusterFS stack is not supported. If users wish to deploy HDFS, then they must select the HDP 2.0.6 stack (not HDP 2.0.6.GlusterFS) and follow the instructions in the Hortonworks documentation.
-
-
For
Assign Masters
, set all the services to your designated YARN Master Server. For ZooKeeper, select at least three separate nodes within your cluster. -
For
Assign Slaves and Clients
, select all the nodes as NodeManagers
except the YARN Master Server. You must also ensure that you click the Client
checkbox for each node. -
On the
Customize Services
screen:-
Click the YARN tab, scroll down to the yarn.nodemanager.log-dirs and yarn.nodemanager.local-dirs properties and remove any entries that begin with
/mnt/glusterfs/
. -
Click the MapReduce2 tab, scroll down to the
Advanced
section, and modify the following property:
yarn.app.mapreduce.am.staging-dir = glusterfs:///user
-
Click the MapReduce2 tab, scroll down to the bottom, and under the custom mapred-site.xml, add the following four custom properties and then click on the
Next
button:
mapred.healthChecker.script.path = glusterfs:///mapred/jobstatus
mapred.task.tracker.history.completed.location = glusterfs:///mapred/history/done
mapred.system.dir = glusterfs:///mapred/system
mapreduce.jobtracker.staging.root.dir = glusterfs:///user
-
Review other tabs that are highlighted in red. These require you to enter additional information, such as passwords for the respective services.
-
-
Review your configuration and then click
Deploy
button. Once the deployment is complete, it will state that the deployment is 100% complete and the progress bars will be colored in orange. The deployment process is susceptible to network and bandwidth issues. If the deployment fails, try clicking "Retry" to attempt the deployment again. This often resolves the issue.
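If the Ambari console fails to load because iptables is still running (see the note in the first step of this procedure), the firewall can be stopped for the session and kept from returning at boot. A sketch for RHEL 6 style init scripts, to be run on the Ambari Management Server (and on any other node whose firewall blocks the Ambari agents):
# Stop the firewall now and prevent it from starting on the next boot.
service iptables stop
chkconfig iptables off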
12.2.6. Enabling Existing Volumes for use with Hadoop
This section is mandatory for every volume you intend to use with Hadoop. It is not sufficient to run the create_vol.sh
script; you must follow the steps listed in this section as well.
If you have an existing Red Hat Storage trusted storage pool with volumes that contain data that you would like to analyze with Hadoop, the volumes need to be configured to support Hadoop workloads. Execute the script given below on every volume that you intend to use with Hadoop. The script provides the necessary configuration parameters for the volume and updates the Hadoop configuration to make the volume accessible to Hadoop.
The supported volume configuration for Hadoop is a Distributed Replicated volume with replica count 2.
-
Open the terminal window of the server designated to be the Ambari Management Server and navigate to the
/usr/share/rhs-hadoop-install/
directory. -
Run the Hadoop Trusted Storage Pool configuration script as given below:
# enable_vol.sh [-y] [--hadoop-mgmt-node <node>] [--user <admin-user>] [--pass <admin-password>] [--port <mgmt-port-num>] [--yarn-master <node>] [--rhs-node <storage-node>] <volName>
For example:
./enable_vol.sh --yarn-master yarn.hdp --rhs-node rhs-1.hdp HadoopVol
If the --yarn-master and/or --rhs-node options are omitted, then the default of localhost (the node from which the script is being executed) is assumed.
--rhs-node
is the hostname of any of the storage nodes in the trusted storage pool. This is required to access the gluster command. The default is localhost, and it must have gluster CLI access.
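Once enable_vol.sh finishes, a quick way to confirm that the volume is FUSE mounted where Hadoop expects it is to check the mount table on the YARN Master Server and the storage nodes. A sketch, assuming the mount prefix and volume name from the earlier examples:
# A glusterfs FUSE mount of the volume should be present.
mount | grep glusterfs
df -h /mnt/glusterfs/HadoopVol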
12.2.7. Configuring the Linux Container Executor
The Container Executor program used by the YARN framework defines how any container
is launched and controlled. The Linux Container Executor sets up restricted permissions and the user/group ownership of local files and directories used by the containers, such as the shared objects, jars, intermediate files, log files, and so on. Perform the following steps to configure the Linux Container Executor program:
-
In the Ambari console, click Stop All in the Services navigation panel. You must wait until all the services are completely stopped.
-
On each server within the Red Hat Storage trusted storage pool:
-
Open the terminal and navigate to
/usr/share/rhs-hadoop-install/
directory: -
Execute the
setup_container_executor.sh
script.
-
-
On each server inside the Red Hat Storage trusted storage pool and the YARN Master server:
-
Open the terminal and navigate to
/etc/hadoop/conf/
directory. -
Replace the contents of
container-executor.cfg
file with the following:
yarn.nodemanager.linux-container-executor.group=hadoop
banned.users=yarn
min.user.id=1000
allowed.system.users=tom
Ensure that there is no additional whitespace at the end of each line and at the end of the file. Also,
tom
is an example user. Hadoop ignores the allowed.system.users
parameter, but we recommend having at least one valid user. You can modify this file on one server and then use Secure Copy (or any other approach, such as the sketch below) to copy the modified file to the same location on each server.
-
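As noted above, the edited container-executor.cfg can be pushed to the remaining nodes instead of being edited by hand on each one. A minimal sketch, assuming password-less SSH as root and the illustrative hostnames used earlier in this chapter:
# Copy the edited file from this node to the YARN Master Server and every other node in the pool.
for node in yarn.hdp rhs-2.hdp rhs-3.hdp rhs-4.hdp; do
    scp /etc/hadoop/conf/container-executor.cfg root@"$node":/etc/hadoop/conf/
done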
Source: https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3/html/installation_guide/sect-installing_the_hadoop_filesystem_plugin_for_red_hat_storage