Pig Hadoop Training

Pig Hadoop Tutorial

Pig Hadoop Tutorial Introduction:

Pig Hadoop Tutorial Course Content

Introduction to Pig
  • What Is Pig?
  • Features of Pig
  • Pig Use Cases
  • Interacting with Pig
Basic Data Analysis with Pig
  • Pig Latin Syntax
  • Loading the Data
  • Simple Data Types
  • Field Definitions
  • Data Output
  • Viewing the Schema
  • Filtering & Sorting Data
  • Commonly Used Functions
  • Hands-On Exercise: Using Pig for ETL Processing
Processing Complex Data with Pig
  • Storage Formats
  • Complex/Nested Data Types
  • Grouping
  • Built-In Functions for Complex Data
  • Iterating Grouped Data
  • Hands-On Exercise: Analyzing Ad Campaign Data with Pig
Multi-Dataset Operations with Pig
  • Techniques for Combining Data Sets
  • Joining Data Sets in Pig
  • Set Operations
  • Splitting Data Sets
  • Hands-On Exercise: Analyzing Disparate Data Sets with Pig
Extending Pig
  • Adding Flexibility with Parameters
  • Macros & Imports
  • UDFs
  • Contributed Functions
  • Using Other Languages to Process Data with Pig
  • Hands-On Exercise: Extending Pig with Streaming & UDFs
Pig Troubleshooting & Optimization
  • Troubleshooting Pig
  • Logging
  • Using Hadoop’s Web UI
  • Optional Demo: Troubleshooting a Failed Job with the Web UI
  • Data Sampling & Debugging
  • Performance Overview
  • The Execution Plan
  • Tips for Improving the Performance of Your Pig Jobs
Introduction to Hive
  • What Is Hive?
  • Hive Schema & Data Storage
  • Comparing Hive to Traditional Databases
  • Hive vs. Pig
  • Hive Use Cases
  • Interacting with Hive
Relational Data Analysis with Hive
  • Hive Databases & Tables
  • Basic HiveQL Syntax
  • Data Types
  • Joining Data Sets
  • Common Built-In Functions
  • Hands-On Exercise: Running Hive Queries from the Shell, Scripts, & Hue
Hive Data Management
  • Hive Data Formats
  • Creating Databases & Hive-Managed Tables
  • Loading Data into Hive
  • Altering Databases & Tables
  • Self-Managed Tables
  • Simplifying Queries with Views
  • Storing Query Results
  • Controlling Access to Data
  • Hands-On Exercise: Data Management with Hive