C2090-103 Apache Spark 1.6 Developer

Test information:
Number of questions: 60
Time allowed in minutes: 120
Required passing score: 65%
Languages: English

This test will certify that the successful candidate has the necessary skills to work with, transform, and act on data at a very large scale. The candidate will be able to build data pipelines and derive viable insights from the data using Apache Spark. The candidate is proficient in using streaming, machine learning, SQL, and graph processing on Spark. This candidate may be a member of a Data Scientist team and/or Analytics team, has applied knowledge of deployment architectures, and can assist in tuning, troubleshooting, and optimization.

Section 1 – Architecture (12%)
Compare and contrast Spark with Hadoop MapReduce
Explain memory management in Spark
Explain concepts such as master, drivers, executors, stages and tasks
Explain Spark transformations and actions with respect to lazy evaluation
Configure your application to run on a cluster

Section 2 – Performance and Troubleshooting (22%)
Manage partitions to improve RDD performance and apply different partition strategies
Identify what operations cause shuffling
Optimize memory usage with serialization options
Use caching, checkpoint, and persistence in appropriate situations
Debug Spark code
Monitor Spark applications
Manage runtime issues and performance bottlenecks in Spark

Section 3 – Core Skills (48%)
Read/write data from multiple data sources and file types
Create and work with RDDs and related APIs
Create and work with DataFrames and related APIs
Create Spark config contexts for different requirements
Work with key-value pairs and the associated Spark APIs
Work with Spark SQL
Define and work with accumulators
Define and work with broadcast variables
Launch applications with spark-submit

Section 4 – Advanced Skills (18%)
Build a pipeline with Streaming, MLlib, SQL, and GraphX on Spark
Work with Spark Streaming APIs
Work with Spark ML and MLlib APIs
Work with GraphX

IBM Certified Developer – Apache Spark 1.6

Recommended Prerequisite Skills
Read and write Python code
Read and write Scala code
Create and work with RDDs and related APIs
Create and work with DataFrames and related APIs
Create and work with DStreams and related APIs
Read/write data from multiple data sources and file types
Read and write SQL statements
Compare and contrast Spark with Hadoop MapReduce
Manage partitions to improve RDD performance and apply different partition strategies
Identify what operations cause shuffling
Optimize memory usage with serialization options
Create Spark config contexts for different requirements
Use caching, checkpoint, and persistence in appropriate situations
Explain memory management in Spark
Work with key-value pairs and the associated Spark APIs
Define and work with accumulators
Configure your application to run on a cluster
Explain core concepts such as master, drivers, executors, stages and tasks
Debug your Spark code
Define and work with broadcast variables
Explain Spark transformations and actions with respect to lazy evaluation
Monitor Spark applications
Launch applications with spark-submit
Manage runtime issues and performance bottlenecks in Spark
Build a pipeline with Streaming, MLlib, SQL, and GraphX on Spark
Work with Spark Streaming APIs
Work with Spark ML and MLlib APIs
Work with GraphX
Work with Spark SQL
