Using HDFS

    Format the configured HDFS file system. On the namenode (the HDFS server), execute the following command:

    $ hadoop namenode -format

    Start the distributed file system with the command below, which starts the namenode as well as the datanodes in the cluster.

    $ start-dfs.sh

    Listing Files in HDFS

    To list the files in a directory, or to check the status of a file, use the ls command. ls takes either a directory or a filename as its argument:

    $ $HADOOP_HOME/bin/hadoop fs -ls <args>

    Inserting Data into HDFS

    Follow the steps below to insert a file into the Hadoop file system.

    Step 1: Create an input directory.

    $ $HADOOP_HOME/bin/hadoop fs -mkdir /user/input

    Step 2: Use the put command to transfer the data file from the local system to HDFS:

    $ $HADOOP_HOME/bin/hadoop fs -put /home/intellipaat.txt /user/input

    Step 3: Verify the file using the ls command.

    $ $HADOOP_HOME/bin/hadoop fs -ls /user/input
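    The three insert steps above can be collected into one minimal shell sketch. The guard that skips the transfer when the hadoop client is absent, and the -p flag on mkdir, are additions here; the paths are the tutorial's examples.

```shell
#!/bin/sh
# Sketch of the insert workflow: create the directory, upload, verify.
SRC=/home/intellipaat.txt   # local file to upload (tutorial's example path)
DEST=/user/input            # target HDFS directory

if command -v hadoop >/dev/null 2>&1; then
  hadoop fs -mkdir -p "$DEST"     # -p: do not fail if it already exists
  hadoop fs -put "$SRC" "$DEST"   # copy local -> HDFS
  hadoop fs -ls "$DEST"           # confirm the file arrived
else
  echo "hadoop client not found on PATH; skipping transfer"
fi
```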

    Retrieving Data from HDFS

    For instance, suppose you have a file in HDFS called intellipaat. Retrieve it from the Hadoop file system as follows:

    Step 1: View the data from HDFS using the cat command.

    $ $HADOOP_HOME/bin/hadoop fs -cat /user/output/intellipaat

    Step 2: Get the file from HDFS to the local file system using the get command, as shown below.

    $ $HADOOP_HOME/bin/hadoop fs -get /user/output/ /home/hadoop_tp/
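    The two retrieve steps above can likewise be sketched as one script. The hadoop-client guard is an addition here; the paths are the tutorial's examples.

```shell
#!/bin/sh
# Sketch of the retrieve workflow: inspect the file, then copy it locally.
SRC=/user/output/intellipaat   # file in HDFS (tutorial's example path)
DEST=/home/hadoop_tp/          # local target directory

if command -v hadoop >/dev/null 2>&1; then
  hadoop fs -cat "$SRC"          # print the file's contents
  hadoop fs -get "$SRC" "$DEST"  # copy HDFS -> local
fi
```

    Note that hadoop fs -copyToLocal is an equivalent of -get.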

    Shutting Down the HDFS

    Shut down HDFS with the command below.

    $ stop-dfs.sh

    Multi-Node Cluster

    Installing Java

    Verify the Java installation with the java -version command:

    $ java -version

    Output similar to the following is presented:

    java version "1.7.0_71"
    Java(TM) SE Runtime Environment (build 1.7.0_71-b13)
    Java HotSpot(TM) Client VM (build 25.0-b02, mixed mode)

    Creating User Account

    Create a dedicated system user account on both the master and slave systems for the Hadoop installation:

    # useradd hadoop
    # passwd hadoop

    Mapping the nodes

    Edit the hosts file in the /etc/ folder on every node, specifying the IP address of each system followed by its hostname.

    # vi /etc/hosts

    Enter one line per node in the /etc/hosts file, pairing each IP address with its hostname: hadoop-master, hadoop-slave-1, and hadoop-slave-2.
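    As a sketch, the entries might look like the following. The IP addresses here are placeholders invented for illustration; substitute each node's real address. The final line only previews the entries, while the commented tee line shows the actual (root-required) append.

```shell
#!/bin/sh
# Placeholder IP addresses -- replace with your cluster's real ones.
HOSTS_ENTRIES='192.168.1.10 hadoop-master
192.168.1.11 hadoop-slave-1
192.168.1.12 hadoop-slave-2'

# To apply on a node (requires root):
# printf '%s\n' "$HOSTS_ENTRIES" | sudo tee -a /etc/hosts

printf '%s\n' "$HOSTS_ENTRIES"   # preview the entries
```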

    Configuring Key Based Login

    SSH must be set up on each node so that the machines can communicate with one another without a password prompt.

    # su hadoop
    $ ssh-keygen -t rsa
    $ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@hadoop-master
    $ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@hadoop-slave-1
    $ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@hadoop-slave-2
    $ chmod 0600 ~/.ssh/authorized_keys
    $ exit
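    The per-host key copies can also be written as a loop. This sketch assumes the default public key file id_rsa.pub produced by ssh-keygen -t rsa; adjust the path if your key lives elsewhere.

```shell
#!/bin/sh
# Distribute the hadoop user's public key to every node, master included.
for host in hadoop-master hadoop-slave-1 hadoop-slave-2; do
  ssh-copy-id -i ~/.ssh/id_rsa.pub "hadoop@$host"
done
chmod 0600 ~/.ssh/authorized_keys
```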