Apache Spark developer training

Sreyobhilashi IT is one of the best big data training service providers in Hyderabad, specializing in next-generation big data technologies. We plan to start new online and offline PySpark training batches in Hyderabad every week.

 

If you are looking for Spark training, please fill in this form:
https://forms.gle/msxPF7VAbPDP53C39

Fee: 30,000/- (if you practice well and earn a Cloudera/Databricks/Hortonworks certificate)

Mode: online

Call: +91-8500002025 (please WhatsApp me)

Training time:
Morning: 7-9 AM IST, 10 AM-12 PM IST
Evening: 7-9 PM IST
Daily morning batch: 6:30-8:30 AM IST (9-11 PM EST)

Trainer: Venu


To attend this training, WhatsApp or call me at +91-8500002025 to get the link to join.

Spark training for students without a Hadoop background.



Note: Every training session is recorded. I'll share the video after training for revision purposes.
Daily tasks based on real-time scenarios are assigned.

Recorded Spark Demo

Please download the full Spark syllabus:
Spark_Training_90_hours_syllabus

Course content:

Hadoop Overview

  • Lecture
    • How HDFS reads and writes data
    • YARN internal architecture
    • HDFS internal architecture
  • Hands-On
    • HDFS Shell Commands
    • Install Hadoop & Spark on Ubuntu
    • Configure the Hadoop/Spark environment in Eclipse

Hive Overview

  • Lecture
    • How Hive works
    • Optimize Hive queries
    • Using Sqoop
  • Hands-On
    • Process CSV and JSON data
    • Bucketing and partitioning tables
    • Import MySQL/Oracle data using Sqoop

Scala Basics

  • Lecture
    • Functional language
    • Scala Vs Java
  • Hands-On
    • Strings, Numbers
    • List, Array, Map, Set
    • Control Statements, collections
    • Functions, methods
    • Pattern matching

Spark Overview

  • Lecture
    • The power of Spark
    • Spark Ecosystem
    • Spark Components vs Hadoop
  • Hands-On
    • Installation & Eclipse configuration
    • Programs in the command-line interface & Eclipse
    • Process local and HDFS files

RDD Fundamentals

  • Lecture
    • Purpose and Structure of RDDs
    • Transformations, Actions, and DAG
    • Key-Value Pair RDDs
  • Hands-On
    • Creating RDDs from Data Files
    • Reshaping Data to Add Structure
    • Interactive Queries Using RDDs

SparkSQL and DataFrames

  • Lecture
    • Spark SQL and DataFrame Uses
    • DataFrame / SQL APIs
    • Catalyst Query Optimization
  • Hands-On
    • Creating (CSV, JSON) DataFrames
    • Querying with DataFrame API and SQL
    • Caching and Re-using DataFrames
    • Process Hive data in Spark

Spark Dataset API

  • Lecture
    • Power of the Dataset API in Spark 2.0
    • Serialization concept in Datasets
  • Hands-On
    • Creating Datasets
    • Process CSV, JSON, XML, and text data
    • Dataset operations

Spark Job Execution

  • Lecture
    • Jobs, Stages, and Tasks
    • Partitions and Shuffles
    • Broadcast Variables and Accumulators
    • Job Performance
  • Hands-On
    • Visualizing DAG Execution
    • Observing Task Scheduling
    • Understanding Performance
    • Measuring Memory Usage
    • Shared variables usage

Clustering Architecture

  • Lecture
    • Cluster Managers for Spark: Spark Standalone, YARN, and Mesos
    • Understanding Spark on YARN
    • What happens in a cluster when you submit a job
  • Hands-On
    • Tracking Jobs through the Cluster UI
    • Understanding Deploy Modes
    • Submit a sample job and monitor it

Spark Streaming

  • Lecture
    • Streaming Sources and Tasks
    • DStream APIs and Stateful Streams
    • Flink Introduction
    • Kafka architecture
  • Hands-On
    • Creating DStreams from Sources
    • Operating on DStream Data
    • Viewing Streaming Jobs in the Web UI
    • Sample Flink streaming program
    • Kafka sample program

AWS with Spark

  • Lecture
    • AWS architecture
    • Redshift, EMR, and EC2 functionalities
    • How to minimize AWS cost
  • Hands-On
    • Submit a sample JAR to an AWS cluster
    • Create a cluster using EMR
    • Read and write data in Redshift

Advanced concepts in Spark

  • Lecture
    • Memory management in Spark
    • How to optimize Spark Applications
    • How to integrate Spark with other applications
  • Hands-On
    • Spark with Cassandra integration
    • Spark, Kafka, and NiFi integration
    • Automate Spark jobs using Oozie

Sample Spark Project

  • Lecture
    • End-to-end project overview
    • Complicated problems in a project
    • Common steps in any project
  • Hands-On
    • Implement Spark SQL Mini project
    • Kafka, Cassandra, Spark Streaming project
    • Pull Twitter data and analyze the data
    • Oozie scheduling & shell script

Important notes:

  • A task is assigned daily after training.
  • Those who complete all of these tasks get 5000/- back.
  • A solution to each task is provided after training.
  • Minimum 3 months of online support & job assistance
  • Training in Spark 2.4.5 and Spark 3.0.0 using Python (PySpark) and Scala
  • Excellent materials: all major Spark and Scala books
  • Guidance for Cloudera/MapR/Databricks Spark certification

Recommendations: You don't need to learn Hadoop to learn Apache Spark, but Hadoop knowledge is a huge plus when implementing a production-level project.
To learn Spark, basic core Java (to learn Scala) and knowledge of SQL queries are mandatory.
This training is intentionally designed for students without a Hadoop background.

If you are interested in online Spark training, just fill in the form and we will send Scala materials to familiarize you with Scala and Spark.
