HADOOP COURSE – LEARN @ YOUR CONVENIENCE
Unit 1 : Hadoop Architecture
Learning Objectives – In this module, you will understand what is Big Data, What are the limitations of the existing solutions for Big Data problem, How Hadoop solves the Big Data problem, What are the common Hadoop ecosystem components, Hadoop Architecture, HDFS and Map Reduce Framework, and Anatomy of File Write and Read.
Topics – What is Big Data, Hadoop Architecture, Hadoop ecosystem components, Hadoop Storage: HDFS, Hadoop Processing: MapReduce Framework, Hadoop Server Roles: NameNode, Secondary NameNode, and DataNode, Anatomy of File Write and Read.
Unit 2: Hadoop Cluster Configuration and Data Loading
Learning Objectives – In this module, you will learn the Hadoop Cluster Architecture and Setup, Important Configuration files in a Hadoop Cluster, Data Loading Techniques.
Topics – Hadoop Cluster Architecture, Hadoop Cluster Configuration files, Hadoop Cluster Modes, Multi-Node Hadoop Cluster, A Typical Production Hadoop Cluster, MapReduce Job execution, Common Hadoop Shell commands, Data Loading Techniques: FLUME, SQOOP, Hadoop Copy Commands, Hadoop Project: Data Loading.
Unit 3: Hadoop MapReduce framework
Learning Objectives – In this module, you will understand Hadoop MapReduce framework and how MapReduce works on data stored in HDFS. Also, you will learn what are the different types of Input and Output formats in MapReduce framework and their usage.
Topics – Hadoop Data Types, Hadoop MapReduce paradigm, Map and Reduce tasks, MapReduce Execution Framework, Partitioners and Combiners, Input Formats (Input Splits and Records, Text Input, Binary Input, Multiple Inputs), Output Formats (TextOutput, BinaryOutPut, Multiple Output), Hadoop Project: MapReduce Programming.
Unit 4: Advance MapReduce
Learning Objectives – In this module, you will learn Advance MapReduce concepts such as Counters, Schedulers, Custom Writables, Compression, Serialization, Tuning, Error Handling, and how to deal with complex MapReduce programs.
Topics – Counters, Custom Writables, Unit Testing: JUnit and MRUnit testing framework, Error Handling, Tuning, Advance MapReduce, Hadoop Project: Advance MapReduce programming and error handling.
Unit 5: Pig and Pig Latin
Learning Objectives – In this module, you will learn what is Pig, in which type of use case we can use Pig, how Pig is tightly coupled with MapReduce, and Pig Latin scripting.
Topics – Installing and Running Pig, Grunt, Pig’s Data Model, Pig Latin, Developing & Testing Pig Latin Scripts, Writing Evaluation, Filter, Load & Store Functions, Hadoop Project: Pig Scripting.
Unit 6: Hive and HiveQL
Learning Objectives – This module will help you in understanding Apache Hive Installation, Loading and Querying Data in Hive and so on.
Topics – Hive Architecture and Installation, Comparison with Traditional Database, HiveQL: Data Types, Operators and Functions, Hive Tables(Managed Tables and External Tables, Partitions and Buckets, Storage Formats, Importing Data, Altering Tables, Dropping Tables), Querying Data (Sorting And Aggregating, Map Reduce Scripts, Joins & Subqueries, Views, Map and Reduce side Joins to optimize Query).
Unit 7: Advance Hive, NoSQL Databases and HBase
Learning Objectives – In this module, you will understand Advance Hive concepts such as UDF. You will also acquire in-depth knowledge of what is HBase, how you can load data into HBase and query data from HBase using client.
Topics – Hive: Data manipulation with Hive, User Defined Functions, Appending Data into existing Hive Table, Custom Map/Reduce in Hive, Hadoop Project: Hive Scripting, HBase: Introduction to HBase, Client API’s and their features, Available Client, HBase Architecture, MapReduce Integration.
Unit 8: Advance HBase and ZooKeeper
Learning Objectives – This module will cover Advance HBase concepts. You will also learn what Zookeeper is all about, how it helps in monitoring a cluster, why HBase uses Zookeeper and how to Build Applications with Zookeeper.
Topics – HBase: Advanced Usage, Schema Design, Advance Indexing, Coprocessors, Hadoop Project: HBase tables The ZooKeeper Service: Data Model, Operations, Implementation, Consistency, Sessions, States.
Unit 9: Hadoop 2.0, MRv2 and YARN
Learning Objectives – In this module, you will understand the newly added features in Hadoop 2.0, namely, YARN, MRv2, NameNode High Availability, HDFS Federation, support for Windows etc.
Topics – Schedulers:Fair and Capacity, Hadoop 2.0 New Features: NameNode High Availability, HDFS Federation, MRv2, YARN, Running MRv1 in YARN, Upgrade your existing MRv1 code to MRv2, Programming in YARN framework.