Thursday, May 1, 2014

Installing Hadoop on Ubuntu

Installing Hadoop (2.2.0) on Ubuntu (12.04 LTS)


Downloading Hadoop


I went to the Hadoop releases page and downloaded 2.2.x.
Copy the archive to a suitable path and extract it there.
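The download and extract step could look roughly like the sketch below. The archive.apache.org URL layout and the helper names `hadoop_archive_url` and `extract_hadoop` are my assumptions; the post does not say which mirror was used.

```shell
# Sketch: build the download URL for a Hadoop release and unpack the archive.
# Assumption: the archive.apache.org directory layout; helper names are made up.

hadoop_archive_url() {
    local version="$1"
    printf 'https://archive.apache.org/dist/hadoop/common/hadoop-%s/hadoop-%s.tar.gz\n' \
        "$version" "$version"
}

# Unpack a downloaded .tar.gz into a target directory (created if missing).
extract_hadoop() {
    local archive="$1" dest="$2"
    mkdir -p "$dest" && tar -xzf "$archive" -C "$dest"
}

# Example usage (not run automatically):
#   wget "$(hadoop_archive_url 2.2.0)"
#   extract_hadoop hadoop-2.2.0.tar.gz "$HOME/dev/bin/hadoop"
```

With the destination above, the archive unpacks to the same `hadoop-2.2.0` directory that the `.bashrc` settings later in this post point at.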

Installing Java

It goes without saying, but if Java is not installed, install it first.
If the Java environment variables (such as JAVA_HOME) are not set, set them as well.
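One way to come up with a value for JAVA_HOME is to resolve the `java` binary on the PATH and strip the trailing `/bin/java`. This is only a sketch: the helper name `java_home_from_bin` is made up, and the `openjdk-7-jdk` package in the comment is an assumption, since the post does not say which JDK was installed.

```shell
# Sketch: derive a JAVA_HOME candidate from the full path of the java binary.
# Assumption: the path ends in /bin/java; the helper name is hypothetical.

java_home_from_bin() {
    local java_bin="$1"
    # Remove the trailing "/bin/java" to get the installation directory.
    printf '%s\n' "${java_bin%/bin/java}"
}

# Example usage (not run automatically):
#   sudo apt-get install openjdk-7-jdk        # assumed package, pick your own JDK
#   export JAVA_HOME="$(java_home_from_bin "$(readlink -f "$(command -v java)")")"
```

Putting the `export` line in `.bashrc` keeps the setting across logins.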

Installing Hadoop

$HADOOP_INSTALL/etc/hadoop/hadoop-env.sh

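Set the Java location in this Hadoop configuration file. (If JAVA_HOME is already set as an environment variable, there is nothing more to do here.)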
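If JAVA_HOME does have to be written into `hadoop-env.sh`, a small helper like the one below can do it. It is a sketch under assumptions: the function name `set_java_home_in_env` is made up, it relies on GNU `sed -i` as found on Ubuntu, and it expects a JAVA_HOME path without `|` or `&` characters.

```shell
# Sketch: make sure hadoop-env.sh contains "export JAVA_HOME=<path>".
# Assumptions: GNU sed (-i without a suffix); hypothetical helper name.

set_java_home_in_env() {
    local env_file="$1" java_home="$2"
    if grep -q '^export JAVA_HOME=' "$env_file"; then
        # Rewrite the existing export line in place.
        sed -i "s|^export JAVA_HOME=.*|export JAVA_HOME=${java_home}|" "$env_file"
    else
        # No export line yet, so append one.
        printf 'export JAVA_HOME=%s\n' "$java_home" >> "$env_file"
    fi
}

# Example usage (not run automatically):
#   set_java_home_in_env "$HADOOP_INSTALL/etc/hadoop/hadoop-env.sh" "$JAVA_HOME"
```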




Run hadoop version to confirm that Hadoop starts.
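The same check can be scripted. The sketch below uses a made-up helper, `hadoop_version_line`, that prints only the line starting with "Hadoop " so that JVM warnings mixed into the output do not get in the way.

```shell
# Sketch: print the "Hadoop x.y.z" line from `hadoop version`, or fail clearly.
# The helper name is hypothetical.

hadoop_version_line() {
    if ! command -v hadoop >/dev/null 2>&1; then
        echo "hadoop not found on PATH" >&2
        return 1
    fi
    # Ignore stderr and skip any warning lines printed before the version.
    hadoop version 2>/dev/null | grep -m1 '^Hadoop '
}

# Example usage (not run automatically):
#   hadoop_version_line || echo "check PATH and JAVA_HOME"
```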




 .bashrc

export HADOOP_INSTALL=/home/eyeopener/dev/bin/hadoop/hadoop-2.2.0
export PATH=$PATH:$HADOOP_INSTALL/bin:$HADOOP_INSTALL/sbin
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"

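Line 1: the Hadoop install path.
Line 2: adds Hadoop's bin and sbin directories to PATH.
Lines 3-4: if you install Hadoop on a 64-bit system, you run into the error below when starting Hadoop, because the native library that ships with Hadoop is built for 32-bit by default. These two settings are the workaround. (Reference - http://stackoverflow.com/questions/20011252/hadoop-2-2-0-64-bit-installing-but-cannot-start)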

eyeopener@ubuntu:~$ start-dfs.sh
14/05/02 00:48:45 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [Java HotSpot(TM) 64-Bit Server VM warning: You have loaded library /home/eyeopener/dev/bin/hadoop/hadoop-2.2.0/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
localhost]
sed: -e expression #1, char 6: unknown option to `s'
HotSpot(TM): ssh: Could not resolve hostname HotSpot(TM): Name or service not known
Java: ssh: Could not resolve hostname Java: Name or service not known
have: ssh: Could not resolve hostname have: Name or service not known
warning:: ssh: Could not resolve hostname warning:: Name or service not known
loaded: ssh: Could not resolve hostname loaded: Name or service not known
have: ssh: Could not resolve hostname have: Name or service not known
VM: ssh: Could not resolve hostname VM: Name or service not known
The: ssh: Could not resolve hostname The: Name or service not known
disabled: ssh: Could not resolve hostname disabled: Name or service not known
library: ssh: Could not resolve hostname library: Name or service not known
You: ssh: Could not resolve hostname You: Name or service not known
stack: ssh: Could not resolve hostname stack: Name or service not known
guard.: ssh: Could not resolve hostname guard.: Name or service not known
fix: ssh: Could not resolve hostname fix: Name or service not known
will: ssh: Could not resolve hostname will: Name or service not known
-c: Unknown cipher type 'cd'
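To see for yourself whether the bundled native library is 32-bit or 64-bit, you can look at the class byte in its ELF header (the `file` command reports the same thing). The helper below is a sketch with a made-up name, `elf_class_bits`; it reads the fifth byte of the file, where 1 means 32-bit and 2 means 64-bit.

```shell
# Sketch: report whether an ELF file (e.g. libhadoop.so.1.0.0) is 32- or 64-bit.
# The helper name is hypothetical; byte 5 of an ELF header is the class field.

elf_class_bits() {
    local lib="$1" class
    # Read one byte at offset 4 as an unsigned decimal and strip whitespace.
    class="$(od -An -tu1 -j4 -N1 "$lib" | tr -d ' \n')"
    case "$class" in
        1) echo 32 ;;
        2) echo 64 ;;
        *) echo unknown; return 1 ;;
    esac
}

# Example usage (not run automatically):
#   elf_class_bits "$HADOOP_INSTALL/lib/native/libhadoop.so.1.0.0"
```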

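SSH setup

This installation is for development, so I am going to run Hadoop in pseudo-distributed mode. Pseudo-distributed mode requires the daemon processes to be started, and for that SSH must be installed.

Reference: http://sidcode.tistory.com/213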
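In practice the start scripts log in to localhost over SSH, so a key-based login without a passphrase is convenient. The sketch below shows one way to set that up; the helper name `authorize_key` is made up, and the key type and file locations in the comments are common defaults that I am assuming, not something the post specifies.

```shell
# Sketch: add a public key to authorized_keys only if it is not there yet.
# The helper name is hypothetical.

authorize_key() {
    local pub_key="$1" auth_file="$2"
    mkdir -p "$(dirname "$auth_file")"
    touch "$auth_file"
    # -x whole line, -F fixed string, -f read the pattern from the key file.
    grep -qxFf "$pub_key" "$auth_file" || cat "$pub_key" >> "$auth_file"
    chmod 600 "$auth_file"
}

# Example usage (not run automatically):
#   sudo apt-get install openssh-server
#   ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
#   authorize_key ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
#   ssh localhost      # should log in without asking for a password
```

Running the helper twice leaves a single copy of the key in the file.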


Hadoop configuration


core-site.xml 


<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
        <property>
                <name>hadoop.tmp.dir</name>
                <value>/home/eyeopener/dev/bin/hadoop/hadoop-2.2.0/tmp</value>
        </property>
        <property>
                <name>fs.default.name</name>
                <value>hdfs://localhost</value>
        </property>
</configuration>



hdfs-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
        <property>
                <name>dfs.replication</name>
                <value>1</value>
        </property>
        <property>
                <name>dfs.name.dir</name>
                <value>/home/eyeopener/dev/bin/hadoop/hadoop-2.2.0/dfs/name</value>
        </property>
        <property>
                <name>dfs.name.edits.dir</name>
                <value>${dfs.name.dir}</value>
        </property>
        <property>
                <name>dfs.data.dir</name>
                <value>/home/eyeopener/dev/bin/hadoop/hadoop-2.2.0/dfs/data</value>
        </property>
</configuration>
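core-site.xml and hdfs-site.xml point at three directories under the install path: tmp, dfs/name and dfs/data. Creating them by hand before the first start is a cheap way to surface permission problems early. The helper name `make_hadoop_dirs` is made up for this sketch.

```shell
# Sketch: create the directories referenced by hadoop.tmp.dir, dfs.name.dir
# and dfs.data.dir in the configuration files. Hypothetical helper name.

make_hadoop_dirs() {
    local base="$1"
    mkdir -p "$base/tmp" "$base/dfs/name" "$base/dfs/data"
}

# Example usage (not run automatically):
#   make_hadoop_dirs /home/eyeopener/dev/bin/hadoop/hadoop-2.2.0
```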

 

mapred-site.xml


<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
        <property>
                <name>mapred.job.tracker</name>
                <value>localhost:8021</value>
        </property>
</configuration>
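As far as I know, the 2.2.0 tarball ships only `mapred-site.xml.template` in `etc/hadoop`, so the file may have to be created from the template before editing it. The sketch below does that without overwriting an existing file; the helper name `ensure_mapred_site` is made up.

```shell
# Sketch: create mapred-site.xml from the bundled template if it is missing.
# Assumption: the template sits next to the other files in etc/hadoop.

ensure_mapred_site() {
    local conf_dir="$1"
    if [ ! -f "$conf_dir/mapred-site.xml" ]; then
        cp "$conf_dir/mapred-site.xml.template" "$conf_dir/mapred-site.xml"
    fi
}

# Example usage (not run automatically):
#   ensure_mapred_site "$HADOOP_INSTALL/etc/hadoop"
```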


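Formatting the HDFS filesystem

A brand-new HDFS installation has to be formatted before use. Formatting creates an empty filesystem by creating the storage directories and the initial versions of the namenode's persistent data structures. Datanodes take no part in the initial format, because the namenode manages all of the filesystem metadata and datanodes can join or leave the cluster dynamically.

Enter the following. (In Hadoop 2.x this form still works but prints a deprecation notice; hdfs namenode -format is the current equivalent.)
$ hadoop namenode -format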
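Formatting wipes the namenode metadata, so it is worth guarding against running it twice by accident. A formatted name directory contains a `current/VERSION` file, which the sketch below checks for; the helper name `namenode_is_formatted` is made up.

```shell
# Sketch: succeed only if the given namenode directory was already formatted.
# Hypothetical helper name; the check looks for current/VERSION.

namenode_is_formatted() {
    [ -f "$1/current/VERSION" ]
}

# Example usage (not run automatically):
#   name_dir=/home/eyeopener/dev/bin/hadoop/hadoop-2.2.0/dfs/name
#   namenode_is_formatted "$name_dir" || hadoop namenode -format
```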


Starting and stopping the daemon processes


Start
eyeopener@ubuntu:~/dev/bin/hadoop/hadoop-2.2.0/etc/hadoop$ start-all.sh 
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
14/05/02 02:19:06 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [localhost]
localhost: starting namenode, logging to /home/eyeopener/dev/bin/hadoop/hadoop-2.2.0/logs/hadoop-eyeopener-namenode-ubuntu.out
localhost: Java HotSpot(TM) 64-Bit Server VM warning: You have loaded library /home/eyeopener/dev/bin/hadoop/hadoop-2.2.0/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
localhost: It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
localhost: starting datanode, logging to /home/eyeopener/dev/bin/hadoop/hadoop-2.2.0/logs/hadoop-eyeopener-datanode-ubuntu.out
localhost: Java HotSpot(TM) 64-Bit Server VM warning: You have loaded library /home/eyeopener/dev/bin/hadoop/hadoop-2.2.0/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
localhost: It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /home/eyeopener/dev/bin/hadoop/hadoop-2.2.0/logs/hadoop-eyeopener-secondarynamenode-ubuntu.out
0.0.0.0: Java HotSpot(TM) 64-Bit Server VM warning: You have loaded library /home/eyeopener/dev/bin/hadoop/hadoop-2.2.0/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
0.0.0.0: It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
14/05/02 02:19:21 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
starting yarn daemons
starting resourcemanager, logging to /home/eyeopener/dev/bin/hadoop/hadoop-2.2.0/logs/yarn-eyeopener-resourcemanager-ubuntu.out
localhost: starting nodemanager, logging to /home/eyeopener/dev/bin/hadoop/hadoop-2.2.0/logs/yarn-eyeopener-nodemanager-ubuntu.out
localhost: Java HotSpot(TM) 64-Bit Server VM warning: You have loaded library /home/eyeopener/dev/bin/hadoop/hadoop-2.2.0/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
localhost: It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
eyeopener@ubuntu:~/dev/bin/hadoop/hadoop-2.2.0/etc/hadoop$


Check in a web browser

http://localhost:50070
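Besides the web UI, the JDK's `jps` tool lists the running Java processes, which makes it easy to see whether any daemon failed to come up. The sketch below reads `jps` output from standard input and prints the daemons that are missing; the helper name `missing_daemons` is made up, and the list of expected names is taken from the startup log above.

```shell
# Sketch: print each expected Hadoop daemon that does not appear in jps output.
# Reads "PID Name" lines from stdin. Hypothetical helper name.

missing_daemons() {
    local running name
    running="$(awk '{print $2}')"    # keep only the process names
    for name in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        # -x matches the whole line, so NameNode does not match SecondaryNameNode.
        printf '%s\n' "$running" | grep -qx "$name" || echo "$name"
    done
}

# Example usage (not run automatically):
#   jps | missing_daemons
```

No output means all five daemons are running.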




Stop
eyeopener@ubuntu:~/dev/bin/hadoop/hadoop-2.2.0/etc/hadoop$ stop-all.sh
This script is Deprecated. Instead use stop-dfs.sh and stop-yarn.sh
14/05/02 02:22:43 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Stopping namenodes on [localhost]
localhost: stopping namenode
localhost: stopping datanode
Stopping secondary namenodes [0.0.0.0]
0.0.0.0: stopping secondarynamenode
14/05/02 02:23:00 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
stopping yarn daemons
stopping resourcemanager
localhost: stopping nodemanager
no proxyserver to stop
eyeopener@ubuntu:~/dev/bin/hadoop/hadoop-2.2.0/etc/hadoop$ 
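Both logs point out that start-all.sh and stop-all.sh are deprecated in favor of the separate dfs and yarn scripts. Two small wrappers keep the one-command convenience while calling the recommended scripts; the function names `start_hadoop` and `stop_hadoop` are made up for this sketch, and the order follows the logs above.

```shell
# Sketch: wrappers around the non-deprecated start/stop scripts.
# Hypothetical function names; the scripts must be on PATH (see .bashrc above).

start_hadoop() {
    start-dfs.sh && start-yarn.sh
}

stop_hadoop() {
    stop-dfs.sh && stop-yarn.sh
}

# Example usage (not run automatically):
#   start_hadoop
#   stop_hadoop
```

These can go into `.bashrc` next to the exports.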



1 comment:

  1. Ah, thank you. I was on a Mac, but I followed your steps and it works fine.
