1. Prepare three machines
Prepare three CentOS 7 machines (in this walkthrough their IPs are 192.168.19.172, 192.168.19.173, and 192.168.19.174), each with at least 4 GB of RAM if possible. Download the JDK and Hadoop packages: jdk-8u261-linux-x64.rpm and hadoop-2.10.1.tar.gz.
1.1 Install JDK 8
Install the RPM; assume it places the JDK under /usr/java/jdk1.8.0_261-amd64 (the default for this package):
rpm -ivh jdk-8u261-linux-x64.rpm
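A quick sanity check that the JDK is installed (the exact build string may differ):
java -version
# Expect something like: java version "1.8.0_261"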
1.2 Set the hostnames
Run the matching command on each machine:
hostnamectl set-hostname hadoop-master   # on 192.168.19.172
hostnamectl set-hostname hadoop-node1    # on 192.168.19.173
hostnamectl set-hostname hadoop-node2    # on 192.168.19.174
1.3 Configure /etc/hosts
On all three machines, add the entries below:
vi /etc/hosts
192.168.19.172 hadoop-master
192.168.19.173 hadoop-node1
192.168.19.174 hadoop-node2
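With the hosts entries in place, a one-line reachability check from any of the machines:
for host in hadoop-master hadoop-node1 hadoop-node2; do ping -c 1 $host; done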
1.4 Configure environment variables
On each machine, edit the profile:
vi ~/.bash_profile
Append at the end:
#HADOOP
export HADOOP_HOME=/opt/hadoop
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
Reload the profile so the variables take effect:
source ~/.bash_profile
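A quick check that the variable is set (Hadoop itself is only installed in section 2, so only the path can be verified here):
echo $HADOOP_HOME
# Expect: /opt/hadoop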
1.5 Set up passwordless SSH (mutual trust)
On hadoop-master, generate a key pair (accept the defaults), then copy the public key to every host, the master included:
ssh-keygen
for host in hadoop-master hadoop-node1 hadoop-node2 ; do ssh-copy-id -i ~/.ssh/id_rsa.pub $host; done
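To confirm the trust is in place, the following should print all three hostnames without prompting for a password:
for host in hadoop-master hadoop-node1 hadoop-node2; do ssh $host hostname; done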
2. Configure the master server
2.1 Extract the package to /opt
tar -xvf hadoop-2.10.1.tar.gz
mv hadoop-2.10.1 hadoop
mv hadoop /opt
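The extracted tree should show the usual Hadoop 2.x layout:
ls /opt/hadoop
# Expect: bin  etc  include  lib  libexec  sbin  share  ...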
2.2 Set JAVA_HOME in hadoop-env.sh
vi /opt/hadoop/etc/hadoop/hadoop-env.sh
# Find the line "export JAVA_HOME", which sets the JDK path
# Change it to: export JAVA_HOME=/usr/java/jdk1.8.0_261-amd64
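With JAVA_HOME set and the PATH from section 1.4, the hadoop command should now work:
hadoop version
# Expect the first line: Hadoop 2.10.1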
2.3 Configure core-site.xml
vi /opt/hadoop/etc/hadoop/core-site.xml
Add the following properties inside the <configuration> element:
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop-master:9000</value>
</property>
<property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoopdata</value>
</property>
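fs.defaultFS is the filesystem URI clients fall back to, so once the cluster is running (section 5) these two commands are equivalent:
hdfs dfs -ls /
hdfs dfs -ls hdfs://hadoop-master:9000/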
2.4 Configure hdfs-site.xml
vi /opt/hadoop/etc/hadoop/hdfs-site.xml
Add the following inside the <configuration> element (replication is kept at 1 here; with two datanodes it could be raised to 2):
<property>
    <name>dfs.replication</name>
    <value>1</value>
</property>
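To confirm the value Hadoop actually picks up from this file:
hdfs getconf -confKey dfs.replication
# Expect: 1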
2.5 Configure yarn-site.xml
vi /opt/hadoop/etc/hadoop/yarn-site.xml
Add the following inside the <configuration> element; the ResourceManager ports are moved to the 18xxx range here:
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>
<property>
    <name>yarn.resourcemanager.address</name>
    <value>hadoop-master:18040</value>
</property>
<property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>hadoop-master:18030</value>
</property>
<property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>hadoop-master:18025</value>
</property>
<property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>hadoop-master:18141</value>
</property>
<property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>hadoop-master:18088</value>
</property>
2.6 Configure mapred-site.xml
cp /opt/hadoop/etc/hadoop/mapred-site.xml.template /opt/hadoop/etc/hadoop/mapred-site.xml
vi /opt/hadoop/etc/hadoop/mapred-site.xml
Add the following inside the <configuration> element:
<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>
2.7 Configure slaves
vi /opt/hadoop/etc/hadoop/slaves
Add the following lines:
hadoop-node1
hadoop-node2
3. Copy the configuration to node1 and node2
scp -r /opt/hadoop root@hadoop-node1:/opt
scp -r /opt/hadoop root@hadoop-node2:/opt
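A quick check that both copies landed:
for host in hadoop-node1 hadoop-node2; do ssh $host ls -d /opt/hadoop; done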
4. Create the Hadoop data directory (master only)
mkdir /opt/hadoopdata
# Format the HDFS filesystem (hdfs namenode -format is the non-deprecated form in 2.x; hadoop namenode -format still works):
hdfs namenode -format
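If formatting succeeded, the log ends with a "successfully formatted" message and the name directory (created under hadoop.tmp.dir by default) is populated:
ls /opt/hadoopdata/dfs/name/current
# Expect: VERSION, fsimage_*, seen_txid, ...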
5. Start and stop the Hadoop cluster (master only)
cd /opt/hadoop/sbin
start-all.sh
To shut down the cluster, use: stop-all.sh
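Besides jps (section 6), the web UIs are a quick health check; 50070 is the default NameNode HTTP port in Hadoop 2.x, and 18088 was set in yarn-site.xml above:
# NameNode UI:        http://hadoop-master:50070
# ResourceManager UI: http://hadoop-master:18088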
6. Verify that the cluster started successfully
On the master node, run: jps
If it lists the four processes SecondaryNameNode, ResourceManager, NameNode, and Jps, the master started successfully.
Then run jps on node1 and node2:
If it lists the three processes NodeManager, DataNode, and Jps, the worker nodes (node1 and node2) started successfully.
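As a final smoke test, confirm HDFS sees both datanodes and run the bundled pi example (the jar ships under share/hadoop/mapreduce in the 2.10.1 tarball):
hdfs dfsadmin -report
# Expect: Live datanodes (2)
yarn jar /opt/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.10.1.jar pi 2 10
# Expect the job to finish and print an estimated value of Pi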