Configure Hadoop and start cluster services using Ansible Playbook

Venky
Dec 14, 2020

Let's look at Hadoop and Ansible first, then jump into our challenge.

Hadoop

  • Big data is the collection and analysis of large data sets that hold valuable raw information, drawn from sources such as user data, sensor data, medical data, and enterprise data. The Hadoop platform is used to store, manage, and distribute big data across several server nodes. Its base storage layer is the Hadoop Distributed File System (HDFS).

Ansible

Ansible is an agentless software platform that automates cloud provisioning, configuration management, application deployment, intra-service orchestration, and many other IT needs. It can configure both Unix-like systems and Microsoft Windows.

Ansible offers a wide range of features that can enhance current processes, migrate applications for better optimization, and provide a single language for DevOps practices across the organization.

Challenge Statement:

  • Configure Hadoop and start the Hadoop cluster using an Ansible playbook.

Prerequisites:

  • Ansible Controller node
  • Two more VMs:
  • 1. One to configure as the namenode
  • 2. One to configure as the datanode
  • JDK and Hadoop rpm files
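
To make the setup concrete, here is a minimal sketch of how the two VMs above could be listed in an inventory, written in Ansible's YAML inventory format. The host aliases, IP addresses, group names, and login user are all assumptions for illustration; they are not taken from the original playbooks.

```yaml
# inventory.yml: hypothetical inventory for the two managed VMs
all:
  vars:
    ansible_user: root              # assumed login user on both VMs
  children:
    namenode:                       # group targeted by the namenode playbook
      hosts:
        nn1:                        # hypothetical alias
          ansible_host: 192.168.1.10   # hypothetical IP of the namenode VM
    datanode:                       # group targeted by the datanode playbook
      hosts:
        dn1:                        # hypothetical alias
          ansible_host: 192.168.1.11   # hypothetical IP of the datanode VM
```

The `inventory` setting in the `[defaults]` section of `ansible.cfg` can point at this file, so the playbooks can be run without passing `-i` each time.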

Steps:

  • Write the Ansible configuration file and the inventory file
  • Write a playbook for the namenode
  • Write a playbook for the datanode
  • Run the playbooks
ansible-playbook playbook.yml # for configuring namenode
ansible-playbook datanode.yml # for configuring datanode

#namenode
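
For readers who want a starting point, a namenode playbook might look roughly like the sketch below. It is not the original playbook: the rpm file names, the `/etc/hadoop` config directory, the `/nn` metadata directory, and port 9001 are assumptions, and it uses the older property names (`dfs.name.dir`, `fs.default.name`) on the assumption of an rpm-based Hadoop install with `hadoop` and `hadoop-daemon.sh` on the PATH.

```yaml
# playbook.yml: namenode sketch. File names, paths, and the port
# are assumptions, not values from the original playbooks.
- name: Configure and start the Hadoop namenode
  hosts: namenode
  become: true
  vars:
    jdk_rpm: jdk.rpm          # hypothetical rpm file next to the playbook
    hadoop_rpm: hadoop.rpm    # hypothetical rpm file next to the playbook
    conf_dir: /etc/hadoop     # assumed config directory of the rpm install
    name_dir: /nn             # hypothetical namenode metadata directory
    hdfs_port: 9001           # hypothetical namenode port
  tasks:
    - name: Copy the JDK and Hadoop rpm files to the node
      ansible.builtin.copy:
        src: "{{ item }}"
        dest: "/root/{{ item }}"
      loop:
        - "{{ jdk_rpm }}"
        - "{{ hadoop_rpm }}"

    # rpm -i fails if the package is already installed,
    # so this sketch is meant for a fresh VM.
    - name: Install the JDK first, then Hadoop
      ansible.builtin.command: "rpm -i /root/{{ item }}"
      loop:
        - "{{ jdk_rpm }}"
        - "{{ hadoop_rpm }}"

    - name: Create the namenode metadata directory
      ansible.builtin.file:
        path: "{{ name_dir }}"
        state: directory

    - name: Write hdfs-site.xml
      ansible.builtin.copy:
        dest: "{{ conf_dir }}/hdfs-site.xml"
        content: |
          <?xml version="1.0"?>
          <configuration>
            <property>
              <name>dfs.name.dir</name>
              <value>{{ name_dir }}</value>
            </property>
          </configuration>

    - name: Write core-site.xml
      ansible.builtin.copy:
        dest: "{{ conf_dir }}/core-site.xml"
        content: |
          <?xml version="1.0"?>
          <configuration>
            <property>
              <name>fs.default.name</name>
              <value>hdfs://0.0.0.0:{{ hdfs_port }}</value>
            </property>
          </configuration>

    # Skipped on later runs once the metadata directory has been formatted.
    - name: Format the namenode once
      ansible.builtin.shell: "echo Y | hadoop namenode -format"
      args:
        creates: "{{ name_dir }}/current"

    - name: Start the namenode daemon
      ansible.builtin.command: hadoop-daemon.sh start namenode
```

Formatting is guarded with `creates` so that a second run does not wipe existing metadata; the start task is left unguarded for brevity and may report a failure if the daemon is already running.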

#Datanode
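
A datanode playbook can follow the same pattern. The sketch below is again only an illustration under assumptions: the rpm file names, the `/etc/hadoop` config directory, the `/dn` storage directory, and the namenode address `192.168.1.10:9001` are hypothetical, and the older property names (`dfs.data.dir`, `fs.default.name`) are used on the assumption of an rpm-based Hadoop install.

```yaml
# datanode.yml: datanode sketch. File names, paths, and the namenode
# address are assumptions, not values from the original playbooks.
- name: Configure and start the Hadoop datanode
  hosts: datanode
  become: true
  vars:
    jdk_rpm: jdk.rpm            # hypothetical rpm file next to the playbook
    hadoop_rpm: hadoop.rpm      # hypothetical rpm file next to the playbook
    conf_dir: /etc/hadoop       # assumed config directory of the rpm install
    data_dir: /dn               # hypothetical datanode storage directory
    namenode_ip: 192.168.1.10   # hypothetical IP of the namenode VM
    hdfs_port: 9001             # hypothetical namenode port
  tasks:
    - name: Copy the JDK and Hadoop rpm files to the node
      ansible.builtin.copy:
        src: "{{ item }}"
        dest: "/root/{{ item }}"
      loop:
        - "{{ jdk_rpm }}"
        - "{{ hadoop_rpm }}"

    # rpm -i fails if the package is already installed,
    # so this sketch is meant for a fresh VM.
    - name: Install the JDK first, then Hadoop
      ansible.builtin.command: "rpm -i /root/{{ item }}"
      loop:
        - "{{ jdk_rpm }}"
        - "{{ hadoop_rpm }}"

    - name: Create the datanode storage directory
      ansible.builtin.file:
        path: "{{ data_dir }}"
        state: directory

    - name: Write hdfs-site.xml
      ansible.builtin.copy:
        dest: "{{ conf_dir }}/hdfs-site.xml"
        content: |
          <?xml version="1.0"?>
          <configuration>
            <property>
              <name>dfs.data.dir</name>
              <value>{{ data_dir }}</value>
            </property>
          </configuration>

    # The datanode finds the namenode through this address.
    - name: Write core-site.xml
      ansible.builtin.copy:
        dest: "{{ conf_dir }}/core-site.xml"
        content: |
          <?xml version="1.0"?>
          <configuration>
            <property>
              <name>fs.default.name</name>
              <value>hdfs://{{ namenode_ip }}:{{ hdfs_port }}</value>
            </property>
          </configuration>

    - name: Start the datanode daemon
      ansible.builtin.command: hadoop-daemon.sh start datanode
```

The only real differences from a namenode play are the storage property, the `core-site.xml` address pointing at the namenode, and the absence of a format step.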

  • Let's check the result from the namenode VM and the datanode VM
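
The same check could also be scripted instead of logging in to each VM. The sketch below assumes inventory groups named `namenode` and `datanode` (hypothetical names) and that `jps` and `hadoop` are on the PATH of the managed nodes; it only reads state and changes nothing.

```yaml
# verify.yml: hypothetical verification playbook
- name: List Hadoop Java processes on every node
  hosts: namenode:datanode
  become: true
  tasks:
    - name: Run jps
      ansible.builtin.command: jps
      register: jps_out
      changed_when: false      # read-only check

    - name: Show the running Java processes
      ansible.builtin.debug:
        var: jps_out.stdout_lines

- name: Ask the namenode for a cluster report
  hosts: namenode
  become: true
  tasks:
    - name: Run dfsadmin report
      ansible.builtin.command: hadoop dfsadmin -report
      register: report_out
      changed_when: false      # read-only check

    - name: Show the report
      ansible.builtin.debug:
        var: report_out.stdout_lines
```

If the cluster is up, the `jps` output should list a NameNode process on one VM and a DataNode process on the other, and the report should show the datanode as available.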

Here is the GitHub link for the playbooks

Thank you for reading!
