I provide Hadoop and Spark training across India and can also run on-site sessions at your company. Course content can be customized to your needs. If you are interested, drop me a line.

The training details follow.

Hadoop Developer training

This training provides an in-depth understanding of the MapReduce framework, the Hadoop Distributed File System (HDFS), and their associated technologies.

Duration

30 hours

Audience

Developers looking to understand Big Data and Hadoop technologies.

Frameworks covered

  • HDFS
  • Map/Reduce
  • Hive
  • Pig

Prerequisites

  • Programming experience in any language.
  • Basic Java knowledge is highly recommended.
  • Prior knowledge about Hadoop is not required.

Course highlights

  • Hands-on experience with each topic
  • Real-world Map/Reduce use cases
  • Excellent material with exercises and quizzes

Course content

  • Introduction
    • Why Big Data?
    • Introduction to Hadoop
    • Hadoop Architecture
    • Introduction to Hadoop 2.0
    • Introduction to Map/Reduce
  • HDFS
    • Hadoop installation
    • Introduction to HDFS
    • HDFS interfaces
    • HDFS ETL
    • HDFS API Java examples
  • Map/Reduce
    • Mapper and Reducer API
    • Configuration API
    • Custom writables
    • Chain mapper and chain reducer
    • Combiners
    • Reduce side Join
    • Distributed Cache
    • Map side join
    • Job dependency management
    • Sequence files
    • Custom input format

    All of the above APIs are covered with real-world example code.
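As a preview of the Mapper/Reducer contract covered in this module, here is a minimal word-count sketch in plain Java. It has no Hadoop dependency; the `map` and `reduce` methods only mirror the shape of the Hadoop API (emit (word, 1) pairs, then sum per key) and are not the actual framework classes.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class WordCountSketch {
    // Map phase: emit a (word, 1) pair for every word in every input line.
    static List<Map.Entry<String, Integer>> map(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            for (String word : line.toLowerCase().split("\\s+")) {
                if (!word.isEmpty()) {
                    pairs.add(new SimpleEntry<>(word, 1));
                }
            }
        }
        return pairs;
    }

    // Reduce phase: group pairs by key and sum the values, just as a
    // Hadoop Reducer receives (key, [values]) groups and folds them.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> counts = new HashMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            counts.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> input = List.of("hello hadoop", "hello world");
        Map<String, Integer> counts = reduce(map(input));
        System.out.println(counts.get("hello")); // 2
        System.out.println(counts.get("world")); // 1
    }
}
```

In real Hadoop code the map and reduce phases run on different machines, with the framework doing the grouping (shuffle) in between.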

  • Hive
    • Introduction to Hive
    • Hive installation
    • Hive Query Language hands-on
    • Partitions
    • Bucketing
    • Indexing
    • UDF / UDAF
    • Hive SerDe
  • Pig
    • Introduction to Pig
    • Pig installation
    • Pig Latin hands-on
    • Pig UDF
Apache Spark Developer training

This training provides an in-depth understanding of the Spark framework and its ecosystem.

Duration

30 hours

Audience

Developers with Hadoop experience who want to understand Apache Spark and its ecosystem.

Frameworks covered

  • Spark
  • Spark Streaming
  • Spark on YARN
  • Shark
  • MLlib

Prerequisites

  • Programming experience in any language.
  • Basic Java/Scala knowledge is highly recommended.
  • Prior knowledge of Hadoop is highly recommended.

Course highlights

  • Hands-on experience with each topic
  • Real-world Spark use cases
  • Excellent material with exercises and quizzes

Course content

  • Introduction
    • Why second-generation frameworks?
    • Introduction to Spark
    • Spark Architecture
    • Spark on a cluster
  • Scala session
    • Why Scala?
    • Hands-on Scala features
    • Type inference
    • Higher-order functions
    • Collections and combinators
    • Lazy evaluation
    • Implicits
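    Two of the Scala features above, higher-order functions and lazy evaluation, have rough Java analogues that can preview the ideas before the Scala session. A hedged sketch in plain Java (illustrative only; the course itself uses Scala):

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Stream;

public class HigherOrderDemo {
    // A higher-order function: takes a function and returns a new function
    // that applies the original one twice.
    static Function<Integer, Integer> twice(Function<Integer, Integer> f) {
        return f.andThen(f);
    }

    public static void main(String[] args) {
        // twice(x -> x + 3) applied to 4 gives (4 + 3) + 3 = 10.
        int result = twice(x -> x + 3).apply(4);
        System.out.println(result); // 10

        // Lazy evaluation: the map below only runs when the terminal
        // operation (toList) demands elements, and limit stops it early.
        List<Integer> firstSquare = Stream.of(1, 2, 3, 4)
                .map(x -> x * x)
                .limit(1)
                .toList();
        System.out.println(firstSquare); // [1]
    }
}
```

    Scala's versions are more concise and pervasive, which is why the course dedicates a session to them.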
  • Spark API hands-on
    • RDD
    • map, flatMap, filter
    • Hadoop RDD
    • Pair RDD
    • Double RDD
    • Caching
    • Join
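    The core RDD transformations in this module (map, flatMap, filter) behave much like their Java Stream counterparts, except that Spark evaluates them lazily across a cluster. A local plain-Java sketch of the same semantics (no Spark dependency; the data is made up for illustration):

```java
import java.util.Arrays;
import java.util.List;

public class RddStyleTransformations {
    public static void main(String[] args) {
        List<String> lines = List.of("spark makes", "big data simple");

        // flatMap: one input line expands into many words;
        // filter:  keep only words longer than 4 characters;
        // map:     transform each surviving element.
        List<String> result = lines.stream()
                .flatMap(line -> Arrays.stream(line.split(" ")))
                .filter(word -> word.length() > 4)
                .map(String::toUpperCase)
                .toList();

        System.out.println(result); // [SPARK, MAKES, SIMPLE]
    }
}
```

    In Spark the same chain would be written on an RDD, with each transformation recorded lazily and executed only when an action is called.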
  • Advanced Spark operations
    • aggregate
    • fold
    • mapPartitions
    • glom
    • Accumulators
    • Broadcast variables
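    fold and aggregate both collapse a collection to a single value; fold keeps the element type, while aggregate allows a different result type. Their sequential shape can be sketched in plain Java (a simplified local sketch only; Spark's versions additionally take a combine step that merges per-partition results):

```java
import java.util.List;

public class FoldAggregateSketch {
    public static void main(String[] args) {
        List<Integer> nums = List.of(1, 2, 3, 4);

        // fold-like: combine elements with a zero value of the SAME type.
        int sum = nums.stream().reduce(0, Integer::sum);
        System.out.println(sum); // 10

        // aggregate-like: the result type (String) differs from the
        // element type (Integer). This loop plays the role of the seqOp;
        // Spark would merge such partial results with a separate combOp.
        StringBuilder acc = new StringBuilder();
        for (int n : nums) {
            acc.append(n);
        }
        System.out.println(acc); // 1234
    }
}
```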
  • Anatomy of a Spark RDD
    • Splits
    • Localization
    • Serialization
    • Transformations vs. Actions
  • Integration with HDFS
  • Shark and other ecosystem projects