
PDF Only

$35.00 Free Updates Up to 90 Days

  • DP-203 Dumps PDF
  • 316 Questions
  • Updated On April 19, 2024

PDF + Test Engine

$55.00 Free Updates Up to 90 Days

  • DP-203 Question Answers
  • 316 Questions
  • Updated On April 19, 2024

Test Engine

$45.00 Free Updates Up to 90 Days

  • DP-203 Practice Questions
  • 316 Questions
  • Updated On April 19, 2024
Check Our Free Microsoft DP-203 Online Test Engine Demo.

How to pass Microsoft DP-203 exam with the help of dumps?

DumpsPool offers the quality resources you have been searching for. So stop stressing and get ready for the exam. Our Online Test Engine gives you the guidance you need to pass the certification exam. We guarantee top-grade results because we cover each topic in a precise and understandable manner. Our expert team prepared the latest Microsoft DP-203 Dumps to meet your training needs. Plus, they come in two formats: Dumps PDF and Online Test Engine.

How Do I Know Microsoft DP-203 Dumps are Worth it?

Did we mention that our latest DP-203 Dumps PDF is also available as an Online Test Engine? And that is just where things get started. Of all the features offered at DumpsPool, the money-back guarantee has to be the best one. Now that you know you don’t have to worry about payments, let us explore the other reasons you would want to buy from us. Besides affordable Real Exam Dumps, you are offered three months of free updates.

You can easily scroll through our large catalog of certification exams and pick any exam to start your training. That’s right, DumpsPool isn’t limited to Microsoft exams. We believe our customers need the support of an authentic and reliable resource, so we make sure there is never any outdated content in our study resources. Our expert team keeps everything up to the mark by keeping an eye on every single update. Our main focus is that you understand the real exam format, so you can pass the exam more easily!

IT Students Are Using our Data Engineering on Microsoft Azure Dumps Worldwide!

It is a well-established fact that certification exams can’t be conquered without some help from experts. The point of using Data Engineering on Microsoft Azure Practice Question Answers is exactly that. You are constantly surrounded by IT experts who’ve been through what you are about to face and know better. The 24/7 customer service of DumpsPool ensures you are in touch with these experts whenever needed. Our 100% success rate and validity around the world make us the most trusted resource candidates use. The updated Dumps PDF helps you pass the exam on the first attempt. And, with the money-back guarantee, you feel safe buying from us. You can claim a refund if you do not pass the exam.

How to Get DP-203 Real Exam Dumps?

Getting access to the real exam dumps is as easy as pressing a button, literally! There are various resources available online, but the majority of them sell scams or copied content. So, if you are going to attempt the DP-203 exam, you need to be sure you are buying the right kind of Dumps. All the Dumps PDF available on DumpsPool are as unique and up to date as they can be. Plus, our Practice Question Answers are tested and approved by professionals, making them the most authentic resource available on the internet. Our experts have made sure the Online Test Engine is free from outdated and fake content, repeated questions, and false or vague information. We make every penny count, and you leave our platform fully satisfied!

Microsoft DP-203 Sample Question Answers

Question # 1

You have the Azure Synapse Analytics pipeline shown in the following exhibit. You need to add a set variable activity to the pipeline to ensure that after the pipeline’s completion, the status of the pipeline is always successful. What should you configure for the set variable activity?

A. a success dependency on the Business Activity That Fails activity
B. a failure dependency on the Upon Failure activity
C. a skipped dependency on the Upon Success activity
D. a skipped dependency on the Upon Failure activity

Question # 2

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution. After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen. You have an Azure Data Lake Storage account that contains a staging zone. You need to design a daily process to ingest incremental data from the staging zone, transform the data by executing an R script, and then insert the transformed data into a data warehouse in Azure Synapse Analytics. Solution: You schedule an Azure Databricks job that executes an R notebook, and then inserts the data into the data warehouse. Does this meet the goal?

A. Yes
B. No

Question # 3

You are implementing a star schema in an Azure Synapse Analytics dedicated SQL pool. You plan to create a table named DimProduct. DimProduct must be a Type 3 slowly changing dimension (SCD) table that meets the following requirements:

  • The values in two columns named ProductKey and ProductSourceID will remain the same.
  • The values in three columns named ProductName, ProductDescription, and Color can change.

You need to add additional columns to complete the following table definition.

A. Option A
B. Option B
C. Option C
D. Option D
E. Option E
F. Option F

Question # 4

You plan to use an Apache Spark pool in Azure Synapse Analytics to load data to an Azure Data Lake Storage Gen2 account. You need to recommend which file format to use to store the data in the Data Lake Storage account. The solution must meet the following requirements:

  • Column names and data types must be defined within the files loaded to the Data Lake Storage account.
  • Data must be accessible by using queries from an Azure Synapse Analytics serverless SQL pool.
  • Partition elimination must be supported without having to specify a specific partition.

What should you recommend?

A. Delta Lake
B. JSON
C. CSV
D. ORC

Question # 5

You have an Azure Synapse Analytics dedicated SQL pool named Pool1 that contains a table named Sales. Sales has row-level security (RLS) applied. RLS uses the following predicate filter. A user named SalesUser1 is assigned the db_datareader role for Pool1. Which rows in the Sales table are returned when SalesUser1 queries the table?

A. only the rows for which the value in the User_Name column is SalesUser1
B. all the rows
C. only the rows for which the value in the SalesRep column is Manager
D. only the rows for which the value in the SalesRep column is SalesUser1

Question # 6

You are designing a solution that will use tables in Delta Lake on Azure Databricks. You need to minimize how long it takes to perform the following:

  • Queries against non-partitioned tables
  • Joins on non-partitioned columns

Which two options should you include in the solution? Each correct answer presents part of the solution.

A. Z-Ordering
B. Apache Spark caching
C. dynamic file pruning (DFP)
D. the clone command

Question # 7

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution. After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen. You are designing an Azure Stream Analytics solution that will analyze Twitter data. You need to count the tweets in each 10-second window. The solution must ensure that each tweet is counted only once. Solution: You use a tumbling window, and you set the window size to 10 seconds. Does this meet the goal?

A. Yes
B. No
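For readers unfamiliar with the windowing terms, the following plain-Python sketch mimics how a tumbling window divides a stream: the windows are fixed-size, contiguous, and non-overlapping, so every event lands in exactly one window. This is not Stream Analytics code. The function name `tumbling_window_counts` and the sample timestamps are illustrative assumptions, and the half-open window boundaries are a simplification.

```python
from collections import Counter


def tumbling_window_counts(timestamps, size_seconds=10):
    """Count events per fixed, non-overlapping window.

    Each timestamp (seconds since an arbitrary origin) maps to exactly one
    window, identified here by the window's start time. Boundaries are
    treated as [start, start + size), which is a simplification.
    """
    counts = Counter()
    for ts in timestamps:
        window_start = (ts // size_seconds) * size_seconds
        counts[window_start] += 1
    return dict(sorted(counts.items()))


# Hypothetical arrival times of six tweets, in seconds.
tweets = [1, 4, 9, 10, 15, 27]
window_counts = tumbling_window_counts(tweets)
print(window_counts)
```

Because the windows never overlap, the per-window counts always add up to the total number of events.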

Question # 8

You have an Azure subscription that contains an Azure Blob Storage account named storage1 and an Azure Synapse Analytics dedicated SQL pool named Pool1. You need to store data in storage1. The data will be read by Pool1. The solution must meet the following requirements:

  • Enable Pool1 to skip columns and rows that are unnecessary in a query.
  • Automatically create column statistics.
  • Minimize the size of files.

Which type of file should you use?

A. JSON
B. Parquet
C. Avro
D. CSV

Question # 9

You have an Azure Databricks workspace that contains a Delta Lake dimension table named Table1. Table1 is a Type 2 slowly changing dimension (SCD) table. You need to apply updates from a source table to Table1. Which Apache Spark SQL operation should you use?

A. CREATE
B. UPDATE
C. MERGE
D. ALTER
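As background on what applying updates to a Type 2 dimension involves, here is a minimal in-memory Python sketch. It is not Spark SQL or Delta Lake code. It only shows the two things a Type 2 update does in one pass: expire the current version of a changed row, and append a new current version. The function `apply_scd2` and the column names `key`, `name`, `start`, `end`, and `current` are assumptions made for illustration.

```python
from datetime import date


def apply_scd2(dimension, source_rows, today):
    """Apply Type 2 changes: expire changed current rows, append new versions.

    `dimension` is a list of dicts with keys: key, name, start, end, current.
    `source_rows` is a list of dicts with keys: key, name.
    """
    current = {row["key"]: row for row in dimension if row["current"]}
    for src in source_rows:
        existing = current.get(src["key"])
        if existing is not None and existing["name"] == src["name"]:
            continue  # unchanged row: nothing to do
        if existing is not None:
            # Changed row: close the old version instead of overwriting it.
            existing["end"] = today
            existing["current"] = False
        # New key or changed attribute: add a fresh current version.
        dimension.append({"key": src["key"], "name": src["name"],
                          "start": today, "end": None, "current": True})
    return dimension


# Hypothetical dimension content and incoming changes.
dim = [{"key": 1, "name": "Widget", "start": date(2023, 1, 1),
        "end": None, "current": True}]
updates = [{"key": 1, "name": "Widget Pro"}, {"key": 2, "name": "Gadget"}]
result = apply_scd2(dim, updates, date(2024, 4, 19))
for row in result:
    print(row)
```

The history of key 1 is preserved as an expired row, which is what distinguishes Type 2 from a plain overwrite.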

Question # 10

You are performing exploratory analysis of the bus fare data in an Azure Data Lake Storage Gen2 account by using an Azure Synapse Analytics serverless SQL pool. You execute the Transact-SQL query shown in the following exhibit. What do the query results include?

A. Only CSV files in the tripdata_2020 subfolder.
B. All files that have file names that begin with "tripdata_2020".
C. All CSV files that have file names that contain "tripdata_2020".
D. Only CSV files that have file names that begin with "tripdata_2020".
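The exhibit is not reproduced here, so the following Python sketch only illustrates, by analogy, how a trailing-wildcard file pattern selects files. The pattern `tripdata_2020*.csv` and the file names are hypothetical, and Python's shell-style `fnmatchcase` stands in for the storage path matching; it is not the serverless SQL pool's own implementation.

```python
from fnmatch import fnmatchcase

# Hypothetical pattern and file names; the real exhibit is not shown here.
PATTERN = "tripdata_2020*.csv"
files = [
    "tripdata_2020-01.csv",
    "tripdata_2020.csv",
    "tripdata_2020-01.parquet",
    "yellow_tripdata_2020-01.csv",
    "tripdata_2019-12.csv",
]

# The pattern is anchored at both ends: a name must start with the prefix
# and end with the .csv extension, with anything (or nothing) in between.
matched = [name for name in files if fnmatchcase(name, PATTERN)]
print(matched)
```

Under these assumptions, a name that merely contains the prefix somewhere in the middle, or that has a different extension, is not selected.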

Question # 11

You are designing an inventory updates table in an Azure Synapse Analytics dedicated SQL pool. The table will have a clustered columnstore index and will include the following columns: You identify the following usage patterns:

  • Analysts will most commonly analyze transactions for a warehouse.
  • Queries will summarize by product category type, date, and/or inventory event type.

You need to recommend a partition strategy for the table to minimize query times. On which column should you partition the table?

A. ProductCategoryTypeID
B. EventDate
C. WarehouseID
D. EventTypeID

Question # 12

You have an Azure Synapse Analytics dedicated SQL pool named Pool1. Pool1 contains a table named table1. You load 5 TB of data into table1. You need to ensure that columnstore compression is maximized for table1. Which statement should you execute?

A. ALTER INDEX ALL ON table1 REORGANIZE
B. ALTER INDEX ALL ON table1 REBUILD
C. DBCC DBREINDEX (table1)
D. DBCC INDEXDEFRAG (pool1, table1)

Question # 13

You have two Azure Blob Storage accounts named account1 and account2. You plan to create an Azure Data Factory pipeline that will use scheduled intervals to replicate newly created or modified blobs from account1 to account2. You need to recommend a solution to implement the pipeline. The solution must meet the following requirements:

  • Ensure that the pipeline only copies blobs that were created or modified since the most recent replication event.
  • Minimize the effort to create the pipeline.

What should you recommend?

A. Create a pipeline that contains a flowlet.
B. Create a pipeline that contains a Data Flow activity.
C. Run the Copy Data tool and select Metadata-driven copy task.
D. Run the Copy Data tool and select Built-in copy task.

Question # 14

You have an Azure Data Factory pipeline named pipeline1 that is invoked by a tumbling window trigger named Trigger1. Trigger1 has a recurrence of 60 minutes. You need to ensure that pipeline1 will execute only if the previous execution completes successfully. How should you configure the self-dependency for Trigger1?

A. offset: "-00:01:00" size: "00:01:00"
B. offset: "01:00:00" size: "-01:00:00"
C. offset: "01:00:00" size: "01:00:00"
D. offset: "-01:00:00" size: "01:00:00"
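To make the `offset` and `size` settings easier to reason about, the sketch below computes which time span a window depends on. It assumes the usual reading of these settings: `offset` shifts the start of the dependency window relative to the current window's start, and `size` is a positive length. The helper `dependency_window` is a hypothetical name, and this is plain Python rather than a trigger definition.

```python
from datetime import datetime, timedelta


def dependency_window(window_start, offset, size):
    """Return the (start, end) of the window that a run depends on.

    Assumption: the dependency window starts at window_start + offset and
    lasts for `size`, which must be a positive time span.
    """
    if size <= timedelta(0):
        raise ValueError("size must be a positive time span")
    start = window_start + offset
    return start, start + size


# Hypothetical hourly window starting at 10:00.
current_start = datetime(2024, 4, 19, 10, 0)
dep = dependency_window(current_start,
                        offset=timedelta(hours=-1),
                        size=timedelta(hours=1))
print(dep)
```

With a negative one-hour offset and a one-hour size, the dependency span is exactly the preceding hourly window, ending where the current window begins.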

Question # 15

You are building a data flow in Azure Data Factory that upserts data into a table in an Azure Synapse Analytics dedicated SQL pool. You need to add a transformation to the data flow. The transformation must specify logic indicating when a row from the input data must be upserted into the sink. Which type of transformation should you add to the data flow?

A. join
B. select
C. surrogate key
D. alter row

Question # 16

You have an Azure Data Lake Storage account that contains a staging zone. You need to design a daily process to ingest incremental data from the staging zone, transform the data by executing an R script, and then insert the transformed data into a data warehouse in Azure Synapse Analytics. Solution: You use an Azure Data Factory schedule trigger to execute a pipeline that executes an Azure Databricks notebook, and then inserts the data into the data warehouse. Does this meet the goal?

A. Yes
B. No

Question # 17

You are designing an Azure Data Lake Storage solution that will transform raw JSON files for use in an analytical workload. You need to recommend a format for the transformed files. The solution must meet the following requirements:

  • Contain information about the data types of each column in the files.
  • Support querying a subset of columns in the files.
  • Support read-heavy analytical workloads.
  • Minimize the file size.

What should you recommend?

A. JSON
B. CSV
C. Apache Avro
D. Apache Parquet

Question # 18

You have an Azure subscription that contains an Azure Synapse Analytics workspace named ws1 and an Azure Cosmos DB database account named Cosmos1. Cosmos1 contains a container named container1 and ws1 contains a serverless SQL pool. You need to ensure that you can query the data in container1 by using the serverless SQL pool. Which three actions should you perform? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point.

A. Enable Azure Synapse Link for Cosmos1.
B. Disable the analytical store for container1.
C. In ws1, create a linked service that references Cosmos1.
D. Enable the analytical store for container1.
E. Disable indexing for container1.

Question # 19

You are designing a folder structure for the files in an Azure Data Lake Storage Gen2 account. The account has one container that contains three years of data. You need to recommend a folder structure that meets the following requirements:

  • Supports partition elimination for queries by Azure Synapse Analytics serverless SQL pools
  • Supports fast data retrieval for data from the current month
  • Simplifies data security management by department

Which folder structure should you recommend?

A. \YYYY\MM\DD\Department\DataSource\DataFile_YYYYMMDD.parquet
B. \Department\DataSource\YYYY\MM\DataFile_YYYYMMDD.parquet
C. \DD\MM\YYYY\Department\DataSource\DataFile_DDMMYY.parquet
D. \DataSource\Department\YYYYMM\DataFile_YYYYMMDD.parquet
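To show how the order of folder levels affects what a single path prefix can select, the sketch below builds file paths for one of the candidate layouts (department, then data source, then year and month) and filters them by prefix. It is an illustration only. The helpers `data_file_path` and `in_folder` and the department and source names are assumptions, and no storage service is called.

```python
from datetime import date
from pathlib import PurePosixPath


def data_file_path(department, source, day):
    """Build Department/DataSource/YYYY/MM/DataFile_YYYYMMDD.parquet."""
    return PurePosixPath(department, source, f"{day:%Y}", f"{day:%m}",
                         f"DataFile_{day:%Y%m%d}.parquet")


def in_folder(path, *prefix_parts):
    """True if `path` sits under the folder named by `prefix_parts`."""
    return path.parts[:len(prefix_parts)] == prefix_parts


# Hypothetical departments, sources, and dates.
paths = [
    data_file_path("Sales", "CRM", date(2024, 3, 31)),
    data_file_path("Sales", "CRM", date(2024, 4, 19)),
    data_file_path("HR", "Payroll", date(2024, 4, 19)),
]

# One prefix isolates a department (for access control) ...
sales_only = [p for p in paths if in_folder(p, "Sales")]
# ... and a longer prefix isolates one month of one source.
sales_april = [p for p in paths if in_folder(p, "Sales", "CRM", "2024", "04")]
print(sales_april)
```

When the department is the top-level folder, one folder-level permission covers all of that department's data, while the year and month folders still narrow a query to the current month.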

Question # 20

You have an Azure Synapse Analytics dedicated SQL pool. You need to create a pipeline that will execute a stored procedure in the dedicated SQL pool and use the returned result set as the input for a downstream activity. The solution must minimize development effort. Which type of activity should you use in the pipeline?

A. Notebook
B. U-SQL
C. Script
D. Stored Procedure

Question # 21

You have an Azure Synapse Analytics dedicated SQL pool that contains a table named Table1. Table1 contains the following:

  • One billion rows
  • A clustered columnstore index
  • A hash-distributed column named Product Key
  • A column named Sales Date that is of the date data type and cannot be null

Thirty million rows will be added to Table1 each month. You need to partition Table1 based on the Sales Date column. The solution must optimize query performance and data loading. How often should you create a partition?

A. once per month
B. once per year
C. once per day
D. once per week
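A quick way to compare partition granularities is to work out how many rows each partition holds per distribution. The sketch below does that arithmetic for the load rate stated in the question. It relies on two facts about dedicated SQL pools: data is spread across 60 distributions, and clustered columnstore guidance is to aim for roughly one million rows per distribution per partition. The variable names and the approximate day and week conversions are assumptions made for illustration.

```python
DISTRIBUTIONS = 60                    # a dedicated SQL pool always uses 60
TARGET_ROWS = 1_000_000               # guideline per distribution and partition
ROWS_PER_MONTH = 30_000_000           # load rate stated in the question

# Approximate rows landing in one partition at each granularity.
rows_per_partition = {
    "day": ROWS_PER_MONTH // 30,      # assumes a 30-day month
    "week": ROWS_PER_MONTH * 12 // 52,
    "month": ROWS_PER_MONTH,
    "year": ROWS_PER_MONTH * 12,
}

per_distribution = {
    granularity: rows // DISTRIBUTIONS
    for granularity, rows in rows_per_partition.items()
}

for granularity, rows in per_distribution.items():
    verdict = "meets" if rows >= TARGET_ROWS else "is below"
    print(f"{granularity}: {rows:,} rows per distribution, {verdict} the guideline")
```

A monthly partition works out to 500,000 rows per distribution and a yearly partition to 6,000,000, so only the yearly granularity clears the one-million-row guideline at this load rate.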

Question # 22

You have an Azure Databricks workspace named workspace1 in the Standard pricing tier. Workspace1 contains an all-purpose cluster named cluster1. You need to reduce the time it takes for cluster1 to start and scale up. The solution must minimize costs. What should you do first?

A. Upgrade workspace1 to the Premium pricing tier.
B. Create a cluster policy in workspace1.
C. Create a pool in workspace1.
D. Configure a global init script for workspace1.

Question # 23

What should you recommend to prevent users outside the Litware on-premises network from accessing the analytical data store?

A. a server-level virtual network rule
B. a database-level virtual network rule
C. a database-level firewall IP rule
D. a server-level firewall IP rule

Question # 24

What should you recommend using to secure sensitive customer contact information?

A. data labels
B. column-level security
C. row-level security
D. Transparent Data Encryption (TDE)

Question # 25

You have an Azure subscription that contains an Azure Data Lake Storage account named myaccount1. The myaccount1 account contains two containers named container1 and container2. The subscription is linked to an Azure Active Directory (Azure AD) tenant that contains a security group named Group1. You need to grant Group1 read access to container1. The solution must use the principle of least privilege. Which role should you assign to Group1?

A. Storage Blob Data Reader for container1 
B. Storage Table Data Reader for container1 
C. Storage Blob Data Reader for myaccount1 
D. Storage Table Data Reader for myaccount1 

Question # 26

You are designing a database for an Azure Synapse Analytics dedicated SQL pool to support workloads for detecting ecommerce transaction fraud. Data will be combined from multiple ecommerce sites and can include sensitive financial information such as credit card numbers. You need to recommend a solution that meets the following requirements:

  • Users must be able to identify potentially fraudulent transactions.
  • Users must be able to use credit cards as a potential feature in models.
  • Users must NOT be able to access the actual credit card numbers.

What should you include in the recommendation?

A. Transparent Data Encryption (TDE) 
B. row-level security (RLS) 
C. column-level encryption 
D. Azure Active Directory (Azure AD) pass-through authentication 

Question # 27

You have an Azure Synapse Analytics dedicated SQL pool. You need to create a fact table named Table1 that will store sales data from the last three years. The solution must be optimized for the following query operations:

  • Show order counts by week.
  • Calculate sales totals by region.
  • Calculate sales totals by product.
  • Find all the orders from a given month.

Which data should you use to partition Table1?

A. region 
B. product 
C. week
D. month

Question # 28

You plan to create a dimension table in Azure Synapse Analytics that will be less than 1 GB. You need to create the table to meet the following requirements:

  • Provide the fastest query time.
  • Minimize data movement during queries.

Which type of table should you use?

A. hash distributed 
B. heap
C. replicated
D. round-robin

Question # 29

You are designing an Azure Databricks interactive cluster. The cluster will be used infrequently and will be configured for auto-termination. You need to ensure that the cluster configuration is retained indefinitely after the cluster is terminated. The solution must minimize costs. What should you do? 

A. Clone the cluster after it is terminated. 
B. Terminate the cluster manually when processing completes.
C. Create an Azure runbook that starts the cluster every 90 days.
D. Pin the cluster.

Question # 30

You have an Azure Data Lake Storage Gen2 account that contains two folders named Folder1 and Folder2. You use Azure Data Factory to copy multiple files from Folder1 to Folder2. You receive the following error. What should you do to resolve the error?

A. Add an explicit mapping. 
B. Enable fault tolerance to skip incompatible rows. 
C. Lower the degree of copy parallelism 
D. Change the Copy activity setting to Binary Copy 

Question # 31

You have an Azure Databricks workspace and an Azure Data Lake Storage Gen2 account named storage1. New files are uploaded daily to storage1. You need to recommend a solution that configures storage1 as a structured streaming source. The solution must meet the following requirements:

  • Incrementally process new files as they are uploaded to storage1.
  • Minimize implementation and maintenance effort.
  • Minimize the cost of processing millions of files.
  • Support schema inference and schema drift.

What should you include in the recommendation?

A. Auto Loader 
B. Apache Spark FileStreamSource 
C. COPY INTO 
D. Azure Data Factory 

Question # 32

You have an Azure Synapse Analytics dedicated SQL pool named SA1 that contains a table named Table1. You need to identify tables that have a high percentage of deleted rows. What should you run?

A. Option 
B. Option
C. Option 
D. Option 

Question # 33

You have an activity in an Azure Data Factory pipeline. The activity calls a stored procedure in a data warehouse in Azure Synapse Analytics and runs daily. You need to verify the duration of the activity when it ran last. What should you use?

A. activity runs in Azure Monitor 
B. Activity log in Azure Synapse Analytics 
C. the sys.dm_pdw_wait_stats data management view in Azure Synapse Analytics 
D. an Azure Resource Manager template 

Question # 34

You have an Azure subscription linked to an Azure Active Directory (Azure AD) tenant that contains a service principal named ServicePrincipal1. The subscription contains an Azure Data Lake Storage account named adls1. Adls1 contains a folder named Folder2 that has a URI of https://adls1.dfs.core.windows.net/container1/Folder1/Folder2/. ServicePrincipal1 has the access control list (ACL) permissions shown in the following table. You need to ensure that ServicePrincipal1 can perform the following actions:

  • Traverse child items that are created in Folder2.
  • Read files that are created in Folder2.

The solution must use the principle of least privilege. Which two permissions should you grant to ServicePrincipal1 for Folder2? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point.

A. Access - Read 
B. Access - Write
C. Access - Execute
D. Default - Read
E. Default - Write
F. Default - Execute

Question # 35

You are designing a highly available Azure Data Lake Storage solution that will include geo-zone-redundant storage (GZRS). You need to monitor for replication delays that can affect the recovery point objective (RPO). What should you include in the monitoring solution?

A. Last Sync Time 
B. Average Success Latency 
C. Error errors 
D. availability 

Question # 36

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution. After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen. You have an Azure Synapse Analytics dedicated SQL pool that contains a table named Table1. You have files that are ingested and loaded into an Azure Data Lake Storage Gen2 container named container1. You plan to insert data from the files in container1 into Table1 and transform the data. Each row of data in the files will produce one row in the serving layer of Table1. You need to ensure that when the source data files are loaded to container1, the DateTime is stored as an additional column in Table1. Solution: You use an Azure Synapse Analytics serverless SQL pool to create an external table that has an additional DateTime column. Does this meet the goal?

A. Yes 
B. No 

Question # 37

A. Option 
B. Option 
C. Option 
D. Option 

Question # 38

You have an Azure Stream Analytics job. You need to ensure that the job has enough streaming units provisioned. You configure monitoring of the SU % Utilization metric. Which two additional metrics should you monitor? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point.

A. Backlogged Input Events 
B. Watermark Delay 
C. Function Events 
D. Out of order Events 
E. Late Input Events 

Question # 39

You need to alter the table to meet the following requirements:

  • Ensure that users can identify the current manager of employees.
  • Support creating an employee reporting hierarchy for your entire company.
  • Provide fast lookup of the managers’ attributes such as name and job title.

Which column should you add to the table?

A. [ManagerEmployeeID] [int] NULL 
B. [ManagerEmployeeID] [smallint] NULL 
C. [ManagerEmployeeKey] [int] NULL 
D. [ManagerName] [varchar](200) NULL 

Question # 40

A company uses Azure Stream Analytics to monitor devices. The company plans to double the number of devices that are monitored. You need to monitor a Stream Analytics job to ensure that there are enough processing resources to handle the additional load. Which metric should you monitor?

A. Early Input Events 
B. Late Input Events 
C. Watermark delay 
D. Input Deserialization Errors 

Question # 41

You are designing an enterprise data warehouse in Azure Synapse Analytics that will contain a table named Customers. Customers will contain credit card information. You need to recommend a solution to provide salespeople with the ability to view all the entries in Customers. The solution must prevent all the salespeople from viewing or inferring the credit card information. What should you include in the recommendation?

A. data masking 
B. Always Encrypted 
C. column-level security 
D. row-level security 

Question # 42

A. Access - Read 
B. Access - Write 
C. Access - Execute 
D. Default - Read
E. Default - Write 
F. Default - Execute 

Question # 43

You are designing the folder structure for an Azure Data Lake Storage Gen2 account. You identify the following usage patterns:

  • Users will query data by using Azure Synapse Analytics serverless SQL pools and Azure Synapse Analytics serverless Apache Spark pools.
  • Most queries will include a filter on the current year or week.
  • Data will be secured by data source.

You need to recommend a folder structure that meets the following requirements:

  • Supports the usage patterns
  • Simplifies folder security
  • Minimizes query times

Which folder structure should you recommend?

A. Option A 
B. Option B 
C. Option C 
D. Option D 
E. Option E 

Question # 44

You have an Azure Databricks resource. You need to log actions that relate to changes in compute for the Databricks resource. Which Databricks services should you log?

A. clusters 
B. workspace 
C. DBFS 
D. SSH 
E. jobs

Question # 45

You need to implement a Type 3 slowly changing dimension (SCD) for product category data in an Azure Synapse Analytics dedicated SQL pool. You have a table that was created by using the following Transact-SQL statement. Which two columns should you add to the table? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point.

A. [EffectiveStartDate] [datetime] NOT NULL,
B. [CurrentProductCategory] [nvarchar] (100) NOT NULL,
C. [EffectiveEndDate] [datetime] NULL,
D. [ProductCategory] [nvarchar] (100) NOT NULL,
E. [OriginalProductCategory] [nvarchar] (100) NOT NULL,
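As a reminder of how a Type 3 dimension differs from Type 2, the sketch below keeps limited history in extra columns on the same row rather than adding new rows. It is a plain-Python illustration, not Transact-SQL. The function `apply_scd3`, the `ProductCategoryKey` field, and the sample values are assumptions; a Type 3 design can equally keep a "previous" value instead of an "original" one.

```python
def apply_scd3(row, new_category):
    """Update a Type 3 row in place.

    The original value stays in its own column and only the current-value
    column is overwritten, so the row count never grows.
    """
    if new_category != row["CurrentProductCategory"]:
        row["CurrentProductCategory"] = new_category
    return row


# Hypothetical dimension row: both columns start with the same value.
product = {
    "ProductCategoryKey": 7,
    "OriginalProductCategory": "Bikes",
    "CurrentProductCategory": "Bikes",
}
apply_scd3(product, "Cycling")
print(product)
```

After the change, the row still shows where the category started and what it is now, but any intermediate values would be lost, which is the trade-off of Type 3.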

Question # 46

You have an Azure Data Lake Storage account that contains a staging zone. You need to design a daily process to ingest incremental data from the staging zone, transform the data by executing an R script, and then insert the transformed data into a data warehouse in Azure Synapse Analytics. Solution: You use an Azure Data Factory schedule trigger to execute a pipeline that executes an Azure Databricks notebook, and then inserts the data into the data warehouse. Does this meet the goal?

A. Yes 
B. No 

Question # 47

You plan to build a structured streaming solution in Azure Databricks. The solution will count new events in five-minute intervals and report only events that arrive during the interval. The output will be sent to a Delta Lake table. Which output mode should you use?

A. complete 
B. update 
C. append 

Question # 48

You have an enterprise data warehouse in Azure Synapse Analytics. Using PolyBase, you create an external table named [Ext].[Items] to query Parquet files stored in Azure Data Lake Storage Gen2 without importing the data to the data warehouse. The external table has three columns. You discover that the Parquet files have a fourth column named ItemID. Which command should you run to add the ItemID column to the external table?

A. Option A 
B. Option B 
C. Option C 
D. Option D 

Question # 49

You need to trigger an Azure Data Factory pipeline when a file arrives in an Azure Data Lake Storage Gen2 container. Which resource provider should you enable?

A. Microsoft.Sql 
B. Microsoft.Automation
C. Microsoft.EventGrid 
D. Microsoft.EventHub 

Question # 50

You are designing an Azure Databricks interactive cluster. The cluster will be used infrequently and will be configured for auto-termination. You need to ensure that the cluster configuration is retained indefinitely after the cluster is terminated. The solution must minimize costs. What should you do?

A. Clone the cluster after it is terminated. 
B. Terminate the cluster manually when processing completes. 
C. Create an Azure runbook that starts the cluster every 90 days. 
D. Pin the cluster. 

Question # 51

You have an enterprise data warehouse in Azure Synapse Analytics named DW1 on a server named Server1. You need to verify whether the size of the transaction log file for each distribution of DW1 is smaller than 160 GB. What should you do?

A. On the master database, execute a query against the sys.dm_pdw_nodes_os_performance_counters dynamic management view.
B. From Azure Monitor in the Azure portal, execute a query against the logs of DW1. 
C. On DW1, execute a query against the sys.database_files dynamic management view. 
D. Execute a query against the logs of DW1 by using the Get-AzOperationalInsightSearchResult PowerShell cmdlet. 

Question # 52

You are designing a financial transactions table in an Azure Synapse Analytics dedicated SQL pool. The table will have a clustered columnstore index and will include the following columns:

  • TransactionType: 40 million rows per transaction type
  • CustomerSegment: 4 million per customer segment
  • TransactionMonth: 65 million rows per month
  • AccountType: 500 million per account type

You have the following query requirements:

  • Analysts will most commonly analyze transactions for a given month.
  • Transactions analysis will typically summarize transactions by transaction type, customer segment, and/or account type.

You need to recommend a partition strategy for the table to minimize query times. On which column should you recommend partitioning the table?

A. CustomerSegment 
B. AccountType 
C. TransactionType 
D. TransactionMonth 

Question # 53

You plan to ingest streaming social media data by using Azure Stream Analytics. The data will be stored in files in Azure Data Lake Storage, and then consumed by using Azure Databricks and PolyBase in Azure Synapse Analytics. You need to recommend a Stream Analytics data output format to ensure that the queries from Databricks and PolyBase against the files encounter the fewest possible errors. The solution must ensure that the files can be queried quickly and that the data type information is retained. What should you recommend?

A. Parquet 
B. Avro 
C. CSV 
D. JSON 

Question # 54

You are performing exploratory analysis of the bus fare data in an Azure Data Lake Storage Gen2 account by using an Azure Synapse Analytics serverless SQL pool. You execute the Transact-SQL query shown in the following exhibit. What do the query results include?

A. Only CSV files in the tripdata_2020 subfolder. 
B. All files that have file names that begin with "tripdata_2020".
C. All CSV files that have file names that contain "tripdata_2020".
D. Only CSV files that have file names that begin with "tripdata_2020".

Question # 55

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution. After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen. You plan to create an Azure Databricks workspace that has a tiered structure. The workspace will contain the following three workloads:

  • A workload for data engineers who will use Python and SQL.
  • A workload for jobs that will run notebooks that use Python, Scala, and SQL.
  • A workload that data scientists will use to perform ad hoc analysis in Scala and R.

The enterprise architecture team at your company identifies the following standards for Databricks environments:

  • The data engineers must share a cluster.
  • The job cluster will be managed by using a request process whereby data scientists and data engineers provide packaged notebooks for deployment to the cluster.
  • All the data scientists must be assigned their own cluster that terminates automatically after 120 minutes of inactivity. Currently, there are three data scientists.

You need to create the Databricks clusters for the workloads. Solution: You create a Standard cluster for each data scientist, a High Concurrency cluster for the data engineers, and a Standard cluster for the jobs. Does this meet the goal?

A. Yes 
B. No 

Question # 56

You have an Azure Stream Analytics job. You need to ensure that the job has enough streaming units provisioned. You configure monitoring of the SU % Utilization metric. Which two additional metrics should you monitor? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point.

A. Out of order Events 
B. Late Input Events 
C. Backlogged Input Events 
D. Function Events 

Question # 57

You are creating an Azure Data Factory data flow that will ingest data from a CSV file, cast columns to specified types of data, and insert the data into a table in an Azure Synapse Analytics dedicated SQL pool. The CSV file contains three columns named username, comment, and date.

The data flow already contains the following:

  • A source transformation.
  • A Derived Column transformation to set the appropriate types of data.
  • A sink transformation to land the data in the pool.

You need to ensure that the data flow meets the following requirements:

  • All valid rows must be written to the destination table.
  • Truncation errors in the comment column must be avoided proactively.
  • Any rows containing comment values that will cause truncation errors upon insert must be written to a file in blob storage.

Which two actions should you perform? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point.

A. To the data flow, add a sink transformation to write the rows to a file in blob storage. 
B. To the data flow, add a Conditional Split transformation to separate the rows that will cause truncation errors. 
C. To the data flow, add a filter transformation to filter out rows that will cause truncation errors. 
D. Add a select transformation to select only the rows that will cause truncation errors. 
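
The requirements describe routing each row to one of two destinations based on the length of its comment value. A minimal Python sketch of that routing logic follows, not of the Data Factory data flow itself. The 20-character limit, the function name split_rows, and the sample rows are assumptions made for illustration; the real limit would come from the destination column definition.

```python
# Assumed maximum length of the destination comment column (hypothetical value).
MAX_COMMENT_LENGTH = 20


def split_rows(rows, max_len=MAX_COMMENT_LENGTH):
    """Separate rows that fit the column from rows that would be truncated."""
    valid, too_long = [], []
    for row in rows:
        if len(row["comment"]) > max_len:
            too_long.append(row)   # would go to the file in blob storage
        else:
            valid.append(row)      # would go to the destination table
    return valid, too_long


# Hypothetical sample rows using the three column names from the question.
rows = [
    {"username": "ana", "comment": "short note", "date": "2024-01-01"},
    {"username": "ben", "comment": "x" * 50, "date": "2024-01-02"},
]

valid, too_long = split_rows(rows)
print(len(valid), len(too_long))
```

Every input row ends up in exactly one of the two lists, so no valid row is dropped and no oversized row reaches the table.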

Question # 58

You are developing a solution that will stream to Azure Stream Analytics. The solution will have both streaming data and reference data. Which input type should you use for the reference data?

A. Azure Cosmos DB 
B. Azure Blob storage 
C. Azure IoT Hub 
D. Azure Event Hubs 

Question # 59

You have an Azure Synapse Analytics dedicated SQL pool that contains a table named Table1. You have files that are ingested and loaded into an Azure Data Lake Storage Gen2 container named container1. You plan to insert data from the files into Table1 and transform the data. Each row of data in the files will produce one row in the serving layer of Table1. You need to ensure that when the source data files are loaded to container1, the DateTime is stored as an additional column in Table1.

Solution: You use a dedicated SQL pool to create an external table that has an additional DateTime column.

Does this meet the goal?

A. Yes 
B. No 

Question # 60

You plan to perform batch processing in Azure Databricks once daily. Which type of Databricks cluster should you use?

A. High Concurrency 
B. Automated 
C. Interactive 

Question # 61

You have an Azure Synapse Analytics dedicated SQL pool named Pool1 and a database named DB1. DB1 contains a fact table named Table1. You need to identify the extent of the data skew in Table1. What should you do in Synapse Studio?

A. Connect to the built-in pool and query sys.dm_pdw_sys_info. 
B. Connect to Pool1 and run DBCC CHECKALLOC. 
C. Connect to the built-in pool and run DBCC CHECKALLOC. 
D. Connect to Pool1 and query sys.dm_pdw_nodes_db_partition_stats. 
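
Data skew here means that rows are spread unevenly across the distributions of a table. As a minimal sketch, assuming the row count per distribution has already been obtained, the Python below summarizes how uneven the spread is. The function name skew_report, the max-over-mean ratio as a measure, and the sample counts are all illustrative assumptions, not output from a real pool.

```python
def skew_report(rows_per_distribution):
    """Summarize how unevenly rows are spread across distributions."""
    counts = list(rows_per_distribution.values())
    mean = sum(counts) / len(counts)
    return {
        "min": min(counts),
        "max": max(counts),
        "mean": mean,
        # A ratio near 1.0 suggests an even spread; larger values suggest skew.
        "max_over_mean": max(counts) / mean,
    }


# Hypothetical counts keyed by distribution id (a real dedicated pool has 60).
sample_counts = {1: 100, 2: 100, 3: 100, 4: 300}

report = skew_report(sample_counts)
print(report)
```

In this made-up sample, one distribution holds twice the average number of rows, which is the kind of imbalance a skew check is meant to surface.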

Question # 62

You are creating a new notebook in Azure Databricks that will support R as the primary language but will also support Scala and SQL. Which switch should you use to switch between languages?

A. @<Language> 
B. %<Language> 
C. \\(<Language>) 
D. \\(<Language>) 

Question # 63

You create an external table named ExtTable that has LOCATION='/topfolder/'. When you query ExtTable by using an Azure Synapse Analytics serverless SQL pool, which files are returned?

A. File2.csv and File3.csv only 
B. File1.csv and File4.csv only 
C. File1.csv, File2.csv, File3.csv, and File4.csv 
D. File1.csv only