
BIG DATA ANALYTICS: ALL 3 MODULES
Quiz by rashmi theertha
- Q1
Which of the following are properties of Spark transformations?
They are computed right away
None of the above
They are vulnerable to machine failures
They are not computed right away
30s - Q2
Which of the following is not a property of Spark Actions?
They are lazily evaluated
They cause Spark to execute the recipe to transform the source data
They are the primary mechanism for getting results out of Spark
The results are returned to the driver
30s - Q3
S1: Actions cause parallel computation to be immediately executed
S2: Transformations lazily create new RDDs
S1: True S2: False
S1: True S2: True
S1: False S2: True
S1: False S2: False
30s - Q4
In which language is Spark written?
R
Scala
Python
Java
30s - Q5
Complete the following:
Spark stores data in ___________, which can make it up to 100 times faster than MapReduce.
Disk
Memory
30s - Q6
Which of the following is not a Spark library?
Spark SQL
Spark Streaming
GraphX (graph)
Kafka
MLlib (machine learning)
30s - Q7
Which of the following is not a component built on top of Spark Core?
Spark RDD
None of the above
MLlib
Spark Streaming
30s - Q8
Which cluster managers does Spark support?
Pseudo Cluster manager
YARN
Standalone Cluster manager
Mesos
All of the above
30s - Q9
The primary Machine Learning API for Spark is now the _____ based API.
DataFrame
RDD
All of the mentioned
Dataset
30s - Q10
Which of the following is a module for Structured data processing?
Spark SQL
GraphX
SparkR
MLlib
30s - Q11
Spark SQL translates commands into code. This code is processed by
Driver nodes
None of the mentioned
Executor Nodes
Cluster manager
30s - Q12
Spark SQL plays the main role in the optimization of queries.
True
False
30s - Q13
A DataFrame in Apache Spark prevails over an RDD and does not contain any features of an RDD.
False
True
30s - Q14
Which of the following is not true for DataFrame?
We can build a DataFrame from different data sources, such as structured data files and tables in Hive
DataFrame in Apache Spark is behind RDD
In both Scala and Java, we represent a DataFrame as a Dataset of rows.
The DataFrame application programming interface (API) is available in various languages
30s - Q15
We can create a DataFrame using
All of the above
External databases
Structured data files
Tables in Hive
30s