Configure Hadoop and start cluster services using Ansible Playbook

Dec 14, 2020

Let's get to know Hadoop and Ansible, then we will jump into our challenge.

Hadoop

Big data is the collection and analysis of large data sets that hold valuable intelligence and raw information drawn from user data, sensor data, medical data, and enterprise data. The Hadoop platform is used to store, manage, and distribute big data across several server nodes. Among the issues big data raises, security is a particular concern in the base layer of the Hadoop architecture, the Hadoop Distributed File System (HDFS).

Ansible

Ansible is an agentless software platform that automates cloud provisioning, configuration management, application deployment, intra-service orchestration, and many other IT needs. It can configure both Unix-like systems and Microsoft Windows.

Ansible offers a wide range of features that can enhance current processes, migrate applications for better optimization, and provide a single language for DevOps practices across the organization.

Challenge Statement:

Configure Hadoop and start the Hadoop cluster using an Ansible playbook.

Prerequisites:

Ansible controller node
Two more VMs:
1. One to configure as the namenode
2. One to configure as the datanode
JDK and Hadoop rpm files

Steps:

Write the Ansible configuration file and the inventory file
Write a playbook for the namenode
Write a playbook for the datanode
Run the playbooks

ansible-playbook playbook.yml # for configuring namenode
ansible-playbook datanode.yml  # for configuring datanode
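
Both commands rely on the inventory from the first step. As a rough sketch, a YAML inventory could look like the one below. The host aliases, IP addresses, and login user are placeholders, not values from the original setup, and the file name `inventory.yml` is also an assumption.

```yaml
# inventory.yml: hypothetical inventory; replace every value with your own
all:
  children:
    namenode:
      hosts:
        nn1:                           # hypothetical alias for the namenode VM
          ansible_host: 192.168.1.10   # assumed address
    datanode:
      hosts:
        dn1:                           # hypothetical alias for the datanode VM
          ansible_host: 192.168.1.11   # assumed address
  vars:
    ansible_user: root                 # assumed login user
```

The Ansible configuration file (`ansible.cfg`) would then point at this file through the `inventory` setting in its `[defaults]` section, so the commands above need no `-i` option.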

#namenode
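
A minimal sketch of what a namenode playbook might contain. It assumes a Hadoop 1.x style rpm that puts its configuration under `/etc/hadoop` and provides `hadoop-daemon.sh`, an inventory group named `namenode`, and rpm files sitting next to the playbook on the controller. The rpm file names, the `/nn` directory, and port 9001 are illustrative placeholders; the playbooks in the linked repository may differ.

```yaml
# playbook.yml: namenode sketch. File names, paths, and the port are assumptions.
- name: Configure and start the Hadoop namenode
  hosts: namenode                           # assumes an inventory group with this name
  become: true
  vars:
    jdk_rpm: jdk-8u171-linux-x64.rpm        # hypothetical file name
    hadoop_rpm: hadoop-1.2.1-1.x86_64.rpm   # hypothetical file name
    conf_dir: /etc/hadoop                   # assumed Hadoop 1.x rpm layout
    name_dir: /nn                           # hypothetical metadata directory
    hdfs_port: 9001                         # hypothetical port
  tasks:
    - name: Copy the rpm files to the node
      ansible.builtin.copy:
        src: "{{ item }}"
        dest: "/root/{{ item }}"
        mode: "0644"
      loop:
        - "{{ jdk_rpm }}"
        - "{{ hadoop_rpm }}"

    - name: Install the JDK
      ansible.builtin.command:
        argv: ["rpm", "-i", "/root/{{ jdk_rpm }}"]
        creates: /usr/java                  # assumed install path; skips a re-install

    - name: Install Hadoop
      ansible.builtin.command:
        argv: ["rpm", "-i", "/root/{{ hadoop_rpm }}"]
        creates: "{{ conf_dir }}"           # skips the task once Hadoop is installed

    - name: Write hdfs-site.xml
      ansible.builtin.copy:
        dest: "{{ conf_dir }}/hdfs-site.xml"
        mode: "0644"
        content: |
          <?xml version="1.0"?>
          <configuration>
            <property>
              <name>dfs.name.dir</name>
              <value>{{ name_dir }}</value>
            </property>
          </configuration>

    - name: Write core-site.xml
      ansible.builtin.copy:
        dest: "{{ conf_dir }}/core-site.xml"
        mode: "0644"
        content: |
          <?xml version="1.0"?>
          <configuration>
            <property>
              <name>fs.default.name</name>
              <value>hdfs://0.0.0.0:{{ hdfs_port }}</value>
            </property>
          </configuration>

    - name: Format the namenode once
      ansible.builtin.shell: echo Y | hadoop namenode -format
      args:
        creates: "{{ name_dir }}/current"   # present after a successful format

    - name: Start the namenode daemon
      ansible.builtin.command: hadoop-daemon.sh start namenode
```

The `creates` arguments keep the install and format steps from running twice. The final start task is not guarded, so re-running the playbook while the daemon is already up may report a failure on that task.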

#Datanode
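
A matching sketch for the datanode, under the same assumptions as for the namenode (Hadoop 1.x style rpm, configuration under `/etc/hadoop`, rpm files next to the playbook). The group name `datanode`, the `/dn` directory, the namenode address, and the port are placeholders to adapt.

```yaml
# datanode.yml: datanode sketch. File names, paths, and addresses are assumptions.
- name: Configure and start the Hadoop datanode
  hosts: datanode                           # assumes an inventory group with this name
  become: true
  vars:
    jdk_rpm: jdk-8u171-linux-x64.rpm        # hypothetical file name
    hadoop_rpm: hadoop-1.2.1-1.x86_64.rpm   # hypothetical file name
    conf_dir: /etc/hadoop                   # assumed Hadoop 1.x rpm layout
    data_dir: /dn                           # hypothetical block storage directory
    namenode_ip: 192.168.1.10               # assumed namenode address
    hdfs_port: 9001                         # must match the namenode's port
  tasks:
    - name: Copy the rpm files to the node
      ansible.builtin.copy:
        src: "{{ item }}"
        dest: "/root/{{ item }}"
        mode: "0644"
      loop:
        - "{{ jdk_rpm }}"
        - "{{ hadoop_rpm }}"

    - name: Install the JDK, then Hadoop
      ansible.builtin.command:
        argv: ["rpm", "-i", "/root/{{ item.rpm }}"]
        creates: "{{ item.marker }}"        # skips the install if the path exists
      loop:
        - { rpm: "{{ jdk_rpm }}", marker: /usr/java }
        - { rpm: "{{ hadoop_rpm }}", marker: "{{ conf_dir }}" }

    - name: Write hdfs-site.xml
      ansible.builtin.copy:
        dest: "{{ conf_dir }}/hdfs-site.xml"
        mode: "0644"
        content: |
          <?xml version="1.0"?>
          <configuration>
            <property>
              <name>dfs.data.dir</name>
              <value>{{ data_dir }}</value>
            </property>
          </configuration>

    - name: Point core-site.xml at the namenode
      ansible.builtin.copy:
        dest: "{{ conf_dir }}/core-site.xml"
        mode: "0644"
        content: |
          <?xml version="1.0"?>
          <configuration>
            <property>
              <name>fs.default.name</name>
              <value>hdfs://{{ namenode_ip }}:{{ hdfs_port }}</value>
            </property>
          </configuration>

    - name: Start the datanode daemon
      ansible.builtin.command: hadoop-daemon.sh start datanode
```

The only real differences from the namenode are the storage property (`dfs.data.dir` instead of `dfs.name.dir`), the absence of a format step, and `fs.default.name` pointing at the namenode rather than at a local listening address.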

Let's check from the namenode VM and the datanode VM.
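
The same check can also be run from the controller. The sketch below assumes a Hadoop 1.x style `hadoop dfsadmin -report` command and an inventory group named `namenode`; the file name `check.yml` is hypothetical.

```yaml
# check.yml: hypothetical verification playbook
- name: Check the cluster from the namenode
  hosts: namenode                  # assumes an inventory group with this name
  become: true
  tasks:
    - name: Ask HDFS for a cluster report
      ansible.builtin.command: hadoop dfsadmin -report
      register: report
      changed_when: false          # read-only command, never reports a change

    - name: Show the report
      ansible.builtin.debug:
        var: report.stdout_lines
```

If the datanode has connected, the report should list it among the available datanodes together with its configured capacity.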

Here is the GitHub link for the playbooks:

Venkateshsandupatla/Hadoop-cluster-by-ansible

github.com

Thank you for reading!