Introduction to Big Data and Hadoop



Overview:

‘Introduction to Big Data and Hadoop’ is an ideal course package for individuals who want to understand the basic concepts of Big Data and Hadoop. On completing this course, trainees will be able to explain what goes on behind the processing of huge volumes of data as the industry moves from Excel-based analytics with long processing times to real-time analytics with much faster processing.

The course focuses on the basics of Big Data and Hadoop. It blends theory with hands-on demonstrations and exercises, giving trainees working-level experience of the operational process and associated benefits of Big Data.

 

Big Data Hands-on Training Syllabus:

Week 1: Introduction to Big Data & Hadoop (2 Hours, Theory)
• What is Big Data?
• How Big is Big Data?
• What are we trying to solve in Big Data?
• Types of Data Structures
• Hadoop System Principles
• History of Hadoop
• Comparison with RDBMS
• Hadoop Ecosystem
• Hadoop Distributions
• Supported Operating Systems, Hardware and Resources

Week 2: Understanding Hadoop HDFS & MapReduce (2 Hours, Theory)
• HDFS Concepts
• HDFS Architecture
• Introduction to MapReduce
• Working Methodology of MapReduce

Week 3: Basic Hadoop Configuration, Setup, Administration and Command Reference (2 Hours, Theory & Practical)
• Designing a Hadoop Cluster
• Procedure for Setting Up a Basic Hadoop Cluster
• Setting Up a Hadoop Cluster
• Basic Administration of Hadoop
• Basic Command Reference

Week 4: Understanding Spark Essentials and Architecture (2 Hours, Theory)
• Introduction to Apache Spark
• Spark Architecture
• Introduction to Spark RDD, Dataset, DataFrame and DAG
• Introduction to Spark Components
• Understanding the Spark Execution Model
• Why Spark when we have Hadoop?

Week 5: Spark Configuration, Administration & Setup (2 Hours, Theory & Practical)
• Designing a Spark Cluster
• Procedure for Setting Up a Basic Spark Cluster
• Setting Up a Spark Cluster
• Basic Administration of Spark
• Understanding how Spark runs our job

Week 6: Fundamentals of Python and Shell Scripting (2 Hours, Practical)
• Introduction to Python
• Installing & Using New Python Packages
• Introduction to Shell Scripting
• Running a Python Script from a Shell Script on a Schedule

Week 7: Programming with HDFS & Spark (2 Hours, Theory & Practical)
• Accessing HDFS Files using Python
• Loading HDFS Files in Spark
• Running basic transformations and actions in Spark
• Understanding how Spark runs our job, with an example

Week 8: Practical Experiment with a Real-Life Problem (2 Hours, Practical)
• Understand the problem
• Design the solution
• Implement the solution
• Q&A
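
Sample exercise (illustrative only):

As a taste of the map, shuffle and reduce phases covered in Week 2, the sketch below counts words in a few lines of text using plain Python. It is a minimal single-machine illustration, not Hadoop code and not part of the official course material; the function names (map_phase, shuffle_phase, reduce_phase, word_count) and the sample sentences are assumptions made for this example. A real Hadoop job would distribute the same steps across a cluster.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in order on an in-memory list of lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, chosen only for illustration.
    sample = ["big data is big", "hadoop stores big data"]
    print(word_count(sample))
```

Each phase is kept as a separate function so trainees can see where the work would be split across machines: map calls are independent per line, and reduce calls are independent per key.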