
JCP SET 4
Quiz by akash rajput
Q1
You store historic data in Cloud Storage. You need to perform analytics on the historic data. You want to use a solution to detect invalid data entries and perform data transformations that will not require programming or knowledge of SQL. What should you do?
Use Cloud Dataprep with recipes to detect errors and perform transformations.
Use federated tables in BigQuery with queries to detect errors and perform transformations.
Use Cloud Dataflow with Beam to detect errors and perform transformations.
Use Cloud Dataproc with a Hadoop job to detect errors and perform transformations.
30s - Q2
Your company needs to upload their historic data to Cloud Storage. The security rules don't allow access from external IPs to their on-premises resources. After an initial upload, they will add new data from existing on-premises applications every day. What should they do?
Execute gsutil rsync from the on-premises servers.
Use Dataflow and write the data to Cloud Storage.
Write a job template in Dataproc to perform the data transfer.
Install an FTP server on a Compute Engine VM to receive the files and move them to Cloud Storage.
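To illustrate the gsutil rsync option, a minimal sketch of a daily sync script, assuming gsutil is installed and authenticated on the on-premises server; the local directory and bucket names are hypothetical.

```python
# Daily sync of an on-premises directory to Cloud Storage using gsutil rsync.
# rsync only transfers new or changed files, which suits the daily incremental
# uploads described in the question. Paths below are hypothetical placeholders.
import subprocess

LOCAL_DIR = "/data/historic"                  # hypothetical on-premises directory
DEST = "gs://historic-data-bucket/historic"   # hypothetical destination bucket/prefix

# -m runs the transfer in parallel; -r recurses into subdirectories.
subprocess.run(["gsutil", "-m", "rsync", "-r", LOCAL_DIR, DEST], check=True)
```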
30s - Q3
You have a query that filters a BigQuery table using a WHERE clause on timestamp and ID columns. By using bq query --dry_run you learn that the query triggers a full scan of the table, even though the filter on timestamp and ID selects a tiny fraction of the overall data. You want to reduce the amount of data scanned by BigQuery with minimal changes to existing SQL queries. What should you do?
Use the LIMIT keyword to reduce the number of rows returned.
Create a separate table for each ID.
Use the bq query --maximum_bytes_billed flag to restrict the number of bytes billed.
Recreate the table with a partitioning column and clustering column.
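To illustrate the partitioning-and-clustering option, a sketch using the BigQuery Python client: recreate the table partitioned on the timestamp column and clustered by ID, then use a dry run to check how many bytes a filtered query would scan. Dataset, table, and column names are hypothetical.

```python
# Recreate the table with time partitioning and clustering, then verify with a
# dry run that a filtered query scans only the matching partitions/blocks.
from google.cloud import bigquery

client = bigquery.Client()

# Hypothetical dataset, table, and column names.
client.query("""
CREATE TABLE mydataset.events_partitioned
PARTITION BY DATE(event_ts)
CLUSTER BY id AS
SELECT * FROM mydataset.events
""").result()

# Dry run: reports bytes that would be scanned without actually running the query.
job_config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
job = client.query(
    "SELECT * FROM mydataset.events_partitioned "
    "WHERE event_ts >= TIMESTAMP '2024-01-01' AND id = 'sensor-42'",
    job_config=job_config,
)
print(f"Bytes that would be processed: {job.total_bytes_processed}")
```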
30s - Q4
You have a requirement to insert minute-resolution data from 50,000 sensors into a BigQuery table. You expect significant growth in data volume and need the data to be available within 1 minute of ingestion for real-time analysis of aggregated trends. What should you do?
Use the MERGE statement to apply updates in batch every 60 seconds.
Use a Cloud Dataflow pipeline to stream data into the BigQuery table.
Use the INSERT statement to insert a batch of data every 60 seconds.
Use bq load to load a batch of sensor data every 60 seconds.
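To illustrate the Dataflow streaming option, a sketch of an Apache Beam pipeline that reads sensor messages from Pub/Sub and streams them into BigQuery so rows are queryable within seconds; the project, topic, table, and schema are hypothetical.

```python
# Streaming Dataflow (Apache Beam) pipeline: Pub/Sub -> BigQuery streaming inserts.
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(streaming=True)

with beam.Pipeline(options=options) as p:
    (
        p
        | "ReadFromPubSub" >> beam.io.ReadFromPubSub(
            topic="projects/my-project/topics/sensor-data")   # hypothetical topic
        | "ParseJson" >> beam.Map(json.loads)
        | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
            "my-project:sensors.readings",                    # hypothetical table
            schema="sensor_id:STRING,reading:FLOAT,event_ts:TIMESTAMP",
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
            create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
            method=beam.io.WriteToBigQuery.Method.STREAMING_INSERTS,
        )
    )
```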
30s - Q5
You need to copy millions of sensitive patient records from a relational database to BigQuery. The total size of the database is 10 TB. You need to design a solution that is secure and time-efficient. What should you do?
Export the records from the database as an Avro file. Create a public URL for the Avro file, and then use Storage Transfer Service to move the file to Cloud Storage. Load the Avro file into BigQuery using the BigQuery web UI in the GCP Console.
Export the records from the database into a CSV file. Create a public URL for the CSV file, and then use Storage Transfer Service to move the file to Cloud Storage. Load the CSV file into BigQuery using the BigQuery web UI in the GCP Console.
Export the records from the database as an Avro file. Upload the file to GCS using gsutil, and then load the Avro file into BigQuery using the BigQuery web UI in the GCP Console.
Export the records from the database as an Avro file. Copy the file onto a Transfer Appliance and send it to Google, and then load the Avro file into BigQuery using the BigQuery web UI in the GCP Console.
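To illustrate the Avro-plus-gsutil option, a sketch that loads an Avro export already copied to Cloud Storage into BigQuery with the Python client; the bucket and table names are hypothetical.

```python
# Load an Avro export from Cloud Storage into BigQuery (after `gsutil -m cp`
# has uploaded the files). URIs and table names are hypothetical.
from google.cloud import bigquery

client = bigquery.Client()

job_config = bigquery.LoadJobConfig(source_format=bigquery.SourceFormat.AVRO)

load_job = client.load_table_from_uri(
    "gs://patient-exports/records-*.avro",    # hypothetical bucket/prefix
    "my-project.clinical.patient_records",    # hypothetical destination table
    job_config=job_config,
)
load_job.result()  # wait for the load job to complete

table = client.get_table("my-project.clinical.patient_records")
print(f"Loaded {table.num_rows} rows")
```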
30s - Q6
You need to create a near real-time inventory dashboard that reads the main inventory tables in your BigQuery data warehouse. Historical inventory data is stored as inventory balances by item and location. You have several thousand updates to inventory every hour. You want to maximize performance of the dashboard and ensure that the data is accurate. What should you do?
Use BigQuery streaming to stream changes into a daily inventory movement table. Calculate balances in a view that joins it to the historical inventory balance table. Update the inventory balance table nightly.
Use the BigQuery bulk loader to batch load inventory changes into a daily inventory movement table. Calculate balances in a view that joins it to the historical inventory balance table. Update the inventory balance table nightly.
Leverage BigQuery UPDATE statements to update the inventory balances as they are changing.
Partition the inventory balance table by item to reduce the amount of data scanned with each inventory update.
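To illustrate the streamed-movements-plus-view option, a sketch that creates a balance view joining the daily movement table to the historical balance table; the table and column names are hypothetical.

```python
# Create a view that combines the nightly balance snapshot with the streamed
# daily movements. Table and column names are hypothetical.
from google.cloud import bigquery

client = bigquery.Client()

client.query("""
CREATE OR REPLACE VIEW inventory.current_balances AS
SELECT
  b.item,
  b.location,
  b.balance + IFNULL(SUM(m.quantity_delta), 0) AS current_balance
FROM inventory.historical_balances AS b
LEFT JOIN inventory.daily_movements AS m
  USING (item, location)
GROUP BY b.item, b.location, b.balance
""").result()
```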
30s - Q7
You have data stored in BigQuery. The data in the BigQuery dataset must be highly available. You need to define a storage, backup, and recovery strategy for this data that minimizes cost. How should you configure the BigQuery table to have a recovery point objective (RPO) of 30 days?
Set the BigQuery dataset to be multi-regional. In the event of an emergency, use a point-in-time snapshot to recover the data.
Set the BigQuery dataset to be multi-regional. Create a scheduled query to make copies of the data to tables suffixed with the time of the backup. In the event of an emergency, use the backup copy of the table.
Set the BigQuery dataset to be regional. In the event of an emergency, use a point-in-time snapshot to recover the data.
Set the BigQuery dataset to be regional. Create a scheduled query to make copies of the data to tables suffixed with the time of the backup. In the event of an emergency, use the backup copy of the table.
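To illustrate the backup-copies-with-a-time-suffix idea, a sketch of the copy step: duplicate the table to a name suffixed with the backup date (in practice this would be driven by a scheduled query or scheduler). Project and table names are hypothetical.

```python
# Copy the main table to a date-suffixed backup table. Backups older than the
# 30-day RPO window can be expired to control cost. Names are hypothetical.
from datetime import date

from google.cloud import bigquery

client = bigquery.Client()

suffix = date.today().strftime("%Y%m%d")
copy_job = client.copy_table(
    "my-project.analytics.main_table",
    f"my-project.analytics.main_table_backup_{suffix}",
)
copy_job.result()  # wait for the copy to finish
```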
30s - Q8
You used Dataprep to create a recipe on a sample of data in a BigQuery table. You want to reuse this recipe on a daily upload of data with the same schema, after the load job with variable execution time completes. What should you do?
Export the recipe as a Dataprep template, and create a job in Cloud Scheduler.
Export the Dataprep job as a Dataflow template, and incorporate it into a Composer job.
Create a cron schedule in Dataprep.
Create an App Engine cron job to schedule the execution of the Dataprep job.
30s - Q9
You want to automate execution of a multi-step data pipeline running on Google Cloud. The pipeline includes Dataproc and Dataflow jobs that have multiple dependencies on each other. You want to use managed services where possible, and the pipeline will run every day. Which tool should you use?
Cloud Scheduler
Cloud Composer
cron
Workflow Templates on Dataproc
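To illustrate the Cloud Composer option, a sketch of an Airflow DAG that runs a Dataproc job and then a Dataflow template on a daily schedule; the operator classes come from the Google provider package, and the project, cluster, jar, and template names are all hypothetical.

```python
# Airflow DAG (for Cloud Composer) chaining a Dataproc Spark job and a
# Dataflow template with an explicit dependency and a daily schedule.
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.dataproc import DataprocSubmitJobOperator
from airflow.providers.google.cloud.operators.dataflow import DataflowTemplatedJobStartOperator

with DAG(
    dag_id="daily_multi_step_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    spark_step = DataprocSubmitJobOperator(
        task_id="dataproc_spark_step",
        project_id="my-project",            # hypothetical project
        region="us-central1",
        job={
            "placement": {"cluster_name": "etl-cluster"},
            "spark_job": {
                "main_class": "com.example.Transform",
                "jar_file_uris": ["gs://my-bucket/jobs/transform.jar"],
            },
        },
    )

    dataflow_step = DataflowTemplatedJobStartOperator(
        task_id="dataflow_template_step",
        project_id="my-project",
        location="us-central1",
        template="gs://my-bucket/templates/enrich-template",   # hypothetical template
        parameters={"inputTable": "my-project:staging.raw"},
    )

    spark_step >> dataflow_step  # Dataflow runs only after the Dataproc job succeeds
```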
30s - Q10
You are managing a Cloud Dataproc cluster. You need to make a job run faster while minimizing costs, without losing work in progress on your clusters. What should you do?
Increase the cluster size with preemptible worker nodes, and configure them to use graceful decommissioning.
Increase the cluster size with more non-preemptible workers.
Increase the cluster size with preemptible worker nodes, and use Cloud Stackdriver to trigger a script to preserve work.
Increase the cluster size with preemptible worker nodes, and configure them to forcefully decommission.
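To illustrate the graceful-decommissioning option, a sketch using the Dataproc Python client to add secondary (preemptible) workers with a graceful decommission timeout so in-flight work can finish before nodes are reclaimed; the project and cluster names are hypothetical, and the exact request shape should be checked against the installed client-library version.

```python
# Scale up secondary (preemptible) workers with a graceful decommission timeout.
from google.cloud import dataproc_v1

client = dataproc_v1.ClusterControllerClient(
    client_options={"api_endpoint": "us-central1-dataproc.googleapis.com:443"}
)

operation = client.update_cluster(
    request={
        "project_id": "my-project",          # hypothetical project
        "region": "us-central1",
        "cluster_name": "etl-cluster",       # hypothetical cluster
        "cluster": {
            "config": {"secondary_worker_config": {"num_instances": 10}}
        },
        "update_mask": {"paths": ["config.secondary_worker_config.num_instances"]},
        # Give running work up to an hour to finish before nodes are removed.
        "graceful_decommission_timeout": {"seconds": 3600},
    }
)
operation.result()
```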
30s - Q11
Your organization has hundreds of Cloud SQL for MySQL instances. You want to follow Google-recommended practices to optimize platform costs. What should you do?
Use Query Insights to identify idle instances.
Build indexes on heavily accessed tables.
Remove inactive user accounts.
Run the Recommender API to identify overprovisioned instances.
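To illustrate the Recommender API option, a sketch that lists Cloud SQL rightsizing recommendations with the Recommender client library; the recommender ID shown is an assumption, so verify it against the published list of Cloud SQL recommenders, and the project and location are hypothetical.

```python
# List rightsizing recommendations for Cloud SQL instances in one location.
from google.cloud import recommender_v1

client = recommender_v1.RecommenderClient()

# Hypothetical project/location; recommender ID assumed for illustration.
parent = (
    "projects/my-project/locations/us-central1/recommenders/"
    "google.cloudsql.instance.OverprovisionedRecommender"
)

for recommendation in client.list_recommendations(parent=parent):
    print(recommendation.name, recommendation.description)
```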
30s - Q12
You work for a shipping company that uses handheld scanners to read shipping labels. Your company has strict data privacy standards that require scanners to only transmit tracking numbers when events are sent to Kafka topics. A recent software update caused the scanners to accidentally transmit recipients' personally identifiable information (PII) to analytics systems, which violates user privacy rules. You want to quickly build a scalable solution using cloud-native managed services to prevent exposure of PII to the analytics systems. What should you do?
Create an authorized view in BigQuery to restrict access to tables with sensitive data.
Use Cloud Logging to analyze the data passed through the total pipeline to identify transactions that may contain sensitive information.
Build a Cloud Function that reads the topics and makes a call to the Cloud Data Loss Prevention (Cloud DLP) API. Use the tagging and confidence levels to either pass or quarantine the data in a bucket for review.
Install a third-party data validation tool on Compute Engine virtual machines to check the incoming data for sensitive information.
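To illustrate the Cloud Function plus Cloud DLP option, a Python sketch (triggered here from Pub/Sub rather than Kafka for simplicity) that inspects each scanner event and quarantines anything that appears to contain PII; the project, bucket, info types, and downstream helper are illustrative assumptions.

```python
# Pub/Sub-triggered Cloud Function: inspect each event with Cloud DLP and
# quarantine likely PII in a review bucket instead of forwarding it.
import base64

from google.cloud import dlp_v2, storage

dlp = dlp_v2.DlpServiceClient()
gcs = storage.Client()

PROJECT = "projects/my-project"            # hypothetical project
QUARANTINE_BUCKET = "scanner-quarantine"   # hypothetical review bucket

def check_scanner_event(event, context):
    """Pass clean events downstream, quarantine events containing PII."""
    payload = base64.b64decode(event["data"]).decode("utf-8")

    response = dlp.inspect_content(
        request={
            "parent": PROJECT,
            "inspect_config": {
                "info_types": [{"name": "PERSON_NAME"}, {"name": "EMAIL_ADDRESS"}],
                "min_likelihood": dlp_v2.Likelihood.LIKELY,
            },
            "item": {"value": payload},
        }
    )

    if response.result.findings:
        # Likely PII: hold the event in a bucket for manual review.
        gcs.bucket(QUARANTINE_BUCKET).blob(context.event_id).upload_from_string(payload)
    else:
        forward_to_analytics(payload)  # hypothetical downstream publish

def forward_to_analytics(payload):
    ...  # placeholder: publish the clean event to the analytics pipeline
```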
30s - Q13
You have developed three data processing jobs. One executes a Cloud Dataflow pipeline that transforms data uploaded to Cloud Storage and writes results to BigQuery. The second ingests data from on-premises servers and uploads it to Cloud Storage. The third is a Cloud Dataflow pipeline that gets information from third-party data providers and uploads the information to Cloud Storage. You need to be able to schedule and monitor the execution of these three workflows and manually execute them when needed. What should you do?
Use Stackdriver Monitoring and set up an alert with a Webhook notification to trigger the jobs.
Create a Directed Acyclic Graph in Cloud Composer to schedule and monitor the jobs.
Set up cron jobs in a Compute Engine instance to schedule and monitor the pipelines using GCP API calls.
Develop an App Engine application to schedule and request the status of the jobs using GCP API calls.
30s - Q14
You want to migrate your on-premises PostgreSQL database to Compute Engine. You need to migrate this database with the minimum downtime possible. What should you do?
Create a read replica on Cloud SQL, and then promote it to a read/write standalone instance.
Perform a full backup of your on-premises PostgreSQL, and then, in the migration window, perform an incremental backup.
Create a hot standby on Compute Engine, and use PgBouncer to switch over the connections.
Use Database Migration Service to migrate your database.
30s - Q15
You have Cloud Functions written in Node.js that pull messages from Cloud Pub/Sub and send the data to BigQuery. You observe that the message processing rate on the Pub/Sub topic is orders of magnitude higher than anticipated, but there is no error logged in Stackdriver Log Viewer. What are the two most likely causes of this problem? (Choose two.)
A. Publisher throughput quota is too small.
B. Total outstanding messages exceed the 10-MB maximum.
C. Error handling in the subscriber code is not handling run-time errors properly.
D. The subscriber code cannot keep up with the messages.
E. The subscriber code does not acknowledge the messages that it pulls.
DE
AE
CE
AB
30s
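Options C and E both come down to how the subscriber handles and acknowledges messages. A short Python sketch of a pull subscriber (the question's functions are in Node.js, but the logic is the same) showing where the acknowledgement belongs; the project, subscription name, and handler are hypothetical.

```python
# Pull subscriber: unacknowledged or silently failing messages are redelivered
# by Pub/Sub, inflating the processing rate without producing error logs.
from google.cloud import pubsub_v1

subscriber = pubsub_v1.SubscriberClient()
subscription_path = subscriber.subscription_path("my-project", "scanner-events-sub")

def callback(message):
    try:
        process_and_write_to_bigquery(message.data)  # hypothetical handler
        message.ack()    # option E: without this ack, the message is redelivered
    except Exception:
        # Option C: swallowing runtime errors here leaves nothing in the logs
        # while redeliveries keep piling up; at minimum, nack and log.
        message.nack()

def process_and_write_to_bigquery(data):
    ...  # placeholder: insert the payload into BigQuery

streaming_pull = subscriber.subscribe(subscription_path, callback=callback)
streaming_pull.result()  # block and keep pulling messages
```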