Quiz - 2 (18.07.2023)
Quiz by Anitha PSGRKCW
- Q1
Spark was initially started by ____________ at UC Berkeley AMPLab in 2009.
Mahek Zaharia
Matei Zaharia
Stonebraker
Doug Cutting
30s - Q2
Users can easily run Spark on top of Amazon’s __________
Infosphere
EC2
None of the mentioned
EMR
30s - Q3
Spark runs on top of ___________, a cluster manager that provides efficient resource isolation across distributed applications.
All of the mentioned
Mesus
Mesjs
Mesos
30s - Q4
______________ leverages Spark Core's fast scheduling capability to perform streaming analytics.
GraphX
RDDs
MLlib
Spark Streaming
30s - Q5
Which of the following is a module for structured data processing?
GraphX
Spark SQL
MLlib
SparkR
30s - Q6
Which of the following are common features of RDD and DataFrame?
All of the above
In-memory
Resilient
Immutability
30s - Q7
Which of the following is the fundamental data structure of Spark?
Dataset
DataFrame
None of the above
RDD
30s - Q8
Which of the following is not true for DataFrame?
The DataFrame API is available in various languages.
In both Scala and Java, a DataFrame is represented as a Dataset of rows.
We can build a DataFrame from different data sources, such as structured data files and tables in Hive.
DataFrame in Apache Spark is behind RDD
30s - Q9
Scala stands for ___.
Scalable language
None
Sequential language
Scripted advanced language
30s - Q10
Which of the following organizes data into named columns?
a. RDD
b. DataFrame
c. Dataset
Both b and c
Both a and b
Both a and c
30s - Q11
Does the Dataset API support Python and R?
No
Yes
30s - Q12
Which of the following is good for low-level transformations and actions?
DataFrame
RDD
All of the above
Dataset
30s - Q13
The Dataset API is accessible in
Scala and R
Java and Scala
Java, Scala, and Python
Scala and Python
30s - Q14
Which of the following is an incorrect way to deploy Spark?
Spark SQL
Spark in MapReduce
Standalone
Hadoop YARN
30s - Q15
Spark SQL translates commands into code. This code is processed by ______
None of the above
Cluster manager
Executor Nodes
Driver nodes
30s