Course Batch Starts, Timing, Price & Enroll

Program    Duration  Batch Starts  Time           Price #
Weekend    32 Hrs    Weekend       Morning-Batch  USD 400 / INR 20000
Weekend    32 Hrs    Weekend       Evening-Batch  USD 400 / INR 20000
Weekdays   32 Hrs    Weekdays      Morning-Batch  USD 400 / INR 20000
Weekdays   32 Hrs    Weekdays      Evening-Batch  USD 400 / INR 20000

# Cloud lab charges will be extra. Our technical consultant will share actual lab charges with you.

About Course

The target audience for this course includes:
Software Engineers
ETL Developers
Data Scientists
Analytics Professionals
Professionals looking for a career in Big Data
To become an expert in the Big Data Hadoop ecosystem, you need an in-depth understanding of Spark applications written in Scala. This course is designed to help you understand the core concepts of Apache Spark, such as Spark Streaming, RDDs, Spark SQL, DataFrames, Datasets, Spark MLlib, Spark GraphX and the Spark Shell. In this course you will learn how to customize Spark applications using Scala.
After completing this course, you will be able to:

Understand the core concepts of Apache Spark
Use Scala to write programs
Work with Spark on a cluster
Understand different features of Spark, such as Spark Streaming, RDDs and Spark SQL
Program with Spark MLlib and Spark GraphX
There is no formal prerequisite for this course, but fundamental knowledge of any programming language, databases, SQL queries and the basics of Linux will help you cover the material more quickly.

CURRICULUM

Apache Spark and Scala

  • 1.1 Introduction to Scala
  • 1.2 Install and configure Scala
  • 1.3 First program using Scala
  • 1.4 Different operators in Scala
  • 1.5 Functions and Loops
  • 1.6 Arrays, Maps, Lists, Tuples
  • 1.7 Collections
  • 1.8 OOP concepts and their use
  • 1.9 Traits as Interfaces
  • 2.1 Interactive Analysis with the Spark Shell
  • 2.2 RDD Operations
  • 2.3 Caching
  • 3.1 Linking with Spark
  • 3.2 Initializing Spark
  • 3.3 Resilient Distributed Datasets (RDDs)
  • 3.4 Parallelized Collections
  • 3.5 External Datasets
  • 3.6 RDD Operations
  • 3.7 Working with Key-Value Pairs
  • 3.8 Transformations
  • 3.9 Actions
  • 3.10 Shuffle operations
  • 3.11 RDD Persistence
  • 3.12 Shared Variables
  • 3.13 Deploying to a Cluster
  • 3.14 Unit Testing
  • 4.1 Linking
  • 4.2 Initializing StreamingContext
  • 4.3 Discretized Streams (DStreams)
  • 4.4 Input DStreams and Receivers
  • 4.5 Transformations on DStreams
  • 4.6 Output Operations on DStreams
  • 4.7 Accumulators and Broadcast Variables
  • 4.8 DataFrame and SQL Operations
  • 4.9 MLlib Operations
  • 4.10 Caching / Persistence
  • 4.11 Checkpointing
  • 4.12 Deploying Applications
  • 4.13 Monitoring Applications
  • 4.14 Reducing the Batch Processing Times
  • 4.15 Setting the Right Batch Interval
  • 4.16 Memory Tuning
  • 4.17 Fault-tolerance Semantics
  • 5.1 SQL
  • 5.2 Datasets and DataFrames
  • 5.3 Starting Point: SparkSession
  • 5.4 Creating DataFrames
  • 5.5 Running SQL Queries Programmatically
  • 5.6 Creating Datasets
  • 5.7 Data Sources
  • 5.8 Generic Load/Save Functions
  • 5.9 Parquet Files
  • 5.10 JSON Datasets
  • 5.11 Hive Tables
  • 5.12 JDBC To Other Databases
  • 5.13 Troubleshooting
  • 5.14 Performance Tuning
  • 5.15 Distributed SQL Engine
  • 6.1 Data types
  • 6.2 Basic statistics
  • 6.3 Classification and regression
  • 6.4 Collaborative filtering
  • 6.5 Clustering
  • 6.6 Dimensionality reduction
  • 6.7 Feature extraction and transformation
  • 6.8 Frequent pattern mining
  • 6.9 Evaluation metrics
  • 6.10 PMML model export
  • 7.1 The Property Graph
  • 7.2 Graph Operators
  • 7.3 Pregel API
  • 7.4 Graph Builders
  • 7.5 Vertex and Edge RDDs
  • 7.6 Optimized Representation
  • 7.7 Graph Algorithms - PageRank, Connected Components and Triangle Counting

Exam & Certification

Cloudera offers a certification exam, “CCA Spark and Hadoop Developer”, that demonstrates an individual’s knowledge of Spark and Big Data terminology.
Exam Name: CCA Spark and Hadoop Developer
Exam Code: CCA175
Number of Questions: 10–12 performance-based tasks on a CDH5 cluster
Time Limit: 120 minutes
Passing Score: 70%
Language: English, Japanese

Select Trainer for Demo


Archana Jaiswal
Certification: Cloudera Certified Developer - Hadoop, Hortonworks Certified Developer (HDPCD)

Qualification
MCA

Skills
Big Data, Cassandra, Hadoop, MongoDB, Apache Spark, Hortonworks Certified Developer (HDPCD), Cloudera Certified Developer - Hadoop

Profile
Archana Jaiswal is a freelance corporate trainer, blogger, and consultant with international experience. Archana currently conducts Hadoop Developer, Spark, HBase, MongoDB and Cassandra training programs and executive coaching for various large organizations. She is a Cloudera Certified Trainer for Hadoop Developer and Hadoop Spark as well as a Hortonworks Certified Trainer for Data Analyst, and has successfully completed the Cloudera Hadoop Developer certification (CCDH) and the Hortonworks Developer Certification. Archana holds a Master’s in Computer Applications from Sikkim Manipal University and is a graduate in Human Psychology from the University of Delhi.

Shiva Reddy
Certification: Big Data, IBM Big Data & Analytics, IBM Spark

Qualification
MCA

Skills
Hadoop, Qlik Sense, QlikView, Talend Open Studio, Apache Spark

Profile
He has 6+ years of experience.

syamkakumani
Certification: IBM DataScience Foundations, SCALA

Qualification
Master of Computer Applications

Skills
Apache Sqoop, Big Data, Hadoop, Hibernate, Java, Java EE, SOAP, Spring AOP, MVC, Apache Hadoop MapReduce, Apache Hadoop YARN, Apache Hive, Apache Pig, Apache Spark, Java EE Web Services

Profile
Hadoop / Java. Continuous learner! Passionate about sharing knowledge!

Disclaimer

** The above course information is taken from The Apache Software Foundation

* Money Back Guarantee till demo and 1st class of the course.


Copyright ©2015 Hub4Tech.com, All Rights Reserved. Hub4Tech™ is registered trademark of Hub4tech Portal Services Pvt. Ltd.
All trademarks and logos appearing on this website are the property of their respective owners.