![apache hadoop installation on linux apache hadoop installation on linux](https://linuxhint.com/wp-content/uploads/2018/04/download-hadoop.png)
To decompress that file, use the following command: tar xzf hadoop-3.3.1.tar.gz

This will create a directory named hadoop-3.3.1 and place all files and directories inside it. Because we’re installing Hadoop on our local machine, we’re going to do a single-node deployment, which is also known as pseudo-distributed mode. Next, we have to set a bunch of environment variables. The best part is, you have to customize only one variable. Anyway, following are the variables I’m talking about:

export HADOOP_HOME=/mnt/d/bigdata/hadoop-3.3.1

export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native

export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin

export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"

As you can see, you only have to change the value of the first environment variable, HADOOP_HOME. Set it to reflect the path where you have placed the Hadoop directory.
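Collected in one place, the exports above can be pasted into a shell startup file as a single block. The /mnt/d/bigdata path is just the example location used in this article; adjust HADOOP_HOME to wherever you extracted the tarball:

```shell
# Hadoop environment variables. Only HADOOP_HOME needs customizing;
# the other three are derived from it.
# The path below is this article's example install location.
export HADOOP_HOME=/mnt/d/bigdata/hadoop-3.3.1
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
```

Because PATH gains both $HADOOP_HOME/bin and $HADOOP_HOME/sbin, the hadoop client commands and the start/stop scripts become available in every new shell.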
This is especially important because this is what Hadoop expects. At least I haven’t seen an option to change this behavior. So, run the following command to cat the contents of the key file we just created and copy them to the authorized_keys file: cat ~/.ssh/id_rsa.pub > ~/.ssh/authorized_keys

Now, make sure that the public key file has the right permissions. This is because if the key file has more public access than needed, the system assumes that the key could have been copied, duplicated, or tampered with, which would mean the key is not secure. That would make the system refuse the key and not allow SSH login. So run the following command to set the right permissions: chmod 0600 ~/.ssh/id_rsa.pub

Next, start the SSH service so that we can test if the server is working fine. For this, run the following command:

Awesome. You can exit back to your previous session by hitting the key combination CTRL + d. This concludes the dependency installation phase. Let’s now move on to installing Hadoop.

The first step in installing Hadoop is to actually download it. As of this writing, the latest version of Hadoop is version 3.3.1, and you can download it from here.
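The permission tightening above can be tried out in isolation. The sketch below uses a throwaway file rather than the real ~/.ssh/id_rsa.pub, so it is safe to experiment with:

```shell
# Tighten a key file's mode to 0600 (owner read/write only), which is
# what SSH expects. A scratch file stands in for ~/.ssh/id_rsa.pub here.
keyfile=$(mktemp)
chmod 0600 "$keyfile"
stat -c '%a' "$keyfile"   # prints 600
```

Mode 0600 means only the owning user can read or write the file; group and world get no access, which is exactly the restriction SSH checks for before trusting a key file.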
Once we have both the server and the client installed for SSH, we have to generate keys for authentication. For this, run the following command and go through the instructions that you’ll get: ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa

Once the keys are generated, you have to copy them over to the list of authorized keys so that you don’t have to enter a password each time you SSH into the machine.
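The key generation and authorization steps can be sketched end to end. This version runs against a scratch directory instead of ~/.ssh so nothing real is touched, and it appends to authorized_keys rather than overwriting it:

```shell
# Generate a passphrase-less RSA key pair and authorize it, using a
# scratch directory in place of ~/.ssh so no real keys are modified.
tmpssh=$(mktemp -d)
ssh-keygen -q -t rsa -P '' -f "$tmpssh/id_rsa"
# Appending (>>) rather than redirecting (>) preserves any keys that
# were already authorized on the machine.
cat "$tmpssh/id_rsa.pub" >> "$tmpssh/authorized_keys"
chmod 0600 "$tmpssh/authorized_keys"
```

Note the design choice: on a machine that already has entries in ~/.ssh/authorized_keys, appending with >> is the safer habit, since a plain > would clobber them.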
Without SSH access to localhost, most components of Hadoop wouldn’t work. To install OpenSSH, run the following command in the terminal: sudo apt install openssh-server openssh-client -y
export JAVA_HOME=/usr/lib/jvm/adoptopenjdk-8-hotspot-amd64

Note that JAVA_HOME should point at the JDK directory itself, not at the java binary inside it. If you just run this command in your terminal, the variable will be exported only for the current session. To make it permanent, you’ll have to add this command to your shell’s startup file (e.g. ~/.bashrc).

The next dependency to install is OpenSSH so that Hadoop can SSH into the localhost.
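Making the export permanent can be sketched as follows. A scratch file stands in for the shell startup file here, and the JDK path is the one used in this article; substitute your own:

```shell
# Persist an export by appending it to a startup file, then re-source
# that file. A scratch file stands in for ~/.bashrc in this sketch.
rcfile=$(mktemp)
echo 'export JAVA_HOME=/usr/lib/jvm/adoptopenjdk-8-hotspot-amd64' >> "$rcfile"
. "$rcfile"
echo "$JAVA_HOME"
```

In real use, the same echo line would target the startup file your login shell actually reads, so every new terminal session picks up JAVA_HOME automatically.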