Using HDFS

    To format the configured HDFS file system, open the NameNode (HDFS server) host and execute the following command:

    $ hadoop namenode -format

    Start the distributed file system. The following command starts the NameNode as well as the DataNodes in the cluster:

    $ start-dfs.sh
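    To confirm that the HDFS daemons came up, you can run the jps command shipped with the JDK; on a typical single-node setup the listing should include NameNode, DataNode, and SecondaryNameNode processes.

    $ jps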

    Listing Files in HDFS

    Use the ls command to list the files in a directory or to check the status of a single file. ls accepts a directory or a file path as an argument:

    $ $HADOOP_HOME/bin/hadoop fs -ls <args>
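    For example, to list the contents of the /user directory (assuming it already exists on your cluster):

    $ $HADOOP_HOME/bin/hadoop fs -ls /user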

    Inserting Data into HDFS

    Follow the steps below to insert a file into the Hadoop file system.

    Step 1: Create an input directory.

    $ $HADOOP_HOME/bin/hadoop fs -mkdir /user/input

    Step 2: Use the put command to transfer the data file from the local file system to HDFS, using the following command in the terminal.

    $ $HADOOP_HOME/bin/hadoop fs -put /home/intellipaat.txt /user/input
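    The copyFromLocal command can be used as an alternative here; it behaves like put, except that the source is restricted to the local file system:

    $ $HADOOP_HOME/bin/hadoop fs -copyFromLocal /home/intellipaat.txt /user/input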

    Step 3: Verify the file using the ls command.

    $ $HADOOP_HOME/bin/hadoop fs -ls /user/input

    Retrieving Data from HDFS

    For instance, suppose you have a file in HDFS called intellipaat. Retrieve it from the Hadoop file system by carrying out the following steps.

    Step 1: View the data in HDFS using the cat command.

    $ $HADOOP_HOME/bin/hadoop fs -cat /user/output/intellipaat

    Step 2: Get the file from HDFS to the local file system using the get command, as shown below.

    $ $HADOOP_HOME/bin/hadoop fs -get /user/output/ /home/hadoop_tp/
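    After the copy completes, the retrieved data can be inspected with ordinary local commands; the local path below simply mirrors the destination used in the get command above:

    $ ls /home/hadoop_tp/output
    $ cat /home/hadoop_tp/output/intellipaat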

    Shutting Down HDFS

    Shut down HDFS with the following command:

    $ stop-dfs.sh

    Multi-Node Cluster

    Installing Java

    Verify the installed Java version with the following command:

    $ java -version

    If Java is installed, output similar to the following is presented:

    java version "1.7.0_71"
    Java(TM) SE Runtime Environment (build 1.7.0_71-b13)
    Java HotSpot(TM) Client VM (build 25.0-b02, mixed mode)
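    If Java is not installed or an unsupported version is reported, install a suitable JDK on every node and point JAVA_HOME at it; the path below is only a hypothetical example and depends on where your JDK is actually installed:

    $ export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
    $ export PATH=$PATH:$JAVA_HOME/bin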

    Creating User Account

    Create a system user account on both the master and slave systems for the Hadoop installation:

    # useradd hadoop
    # passwd hadoop

    Mapping the Nodes

    The hosts file in the /etc/ folder must be edited on every node, and the IP address of each system followed by its host name must be specified:

    # vi /etc/hosts

    Enter the following lines in the /etc/hosts file.

    192.168.1.109 hadoop-master
    192.168.1.145 hadoop-slave-1
    192.168.56.1 hadoop-slave-2
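    To confirm that the mapping works, each node should now be reachable by host name, for example:

    # ping -c 1 hadoop-master
    # ping -c 1 hadoop-slave-1
    # ping -c 1 hadoop-slave-2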

    Configuring Key-Based Login

    SSH must be set up on each node so that the nodes can communicate with one another without being prompted for a password:

    # su hadoop
    $ ssh-keygen -t rsa
    $ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@hadoop-master
    $ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@hadoop-slave-1
    $ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@hadoop-slave-2
    $ chmod 0600 ~/.ssh/authorized_keys
    $ exit
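    As the hadoop user, you can then confirm that each node accepts the key and no longer prompts for a password, for example by running a remote command such as:

    $ ssh hadoop@hadoop-slave-1 hostname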