Showing posts with label SSIS Intreview Questions. Show all posts
Showing posts with label SSIS Intreview Questions. Show all posts

Jul 28, 2015

SSIS Interview Questions - Part 4

Q:  How do you eliminate quotes from being uploaded from a flat file to SQL Server?
This can be done using Text Qualifier property. In the SSIS package on the Flat File Connection Manager Editor, enter quotes into the Text qualifier field then preview the data to ensure the quotes are not included.

Q: What is Data Viewer and what are the different types of Data Viewers in SSIS?
A Data Viewer allows viewing data at a point of time at run-time. If data viewer is placed before and after the Aggregate transform, we can see data flowing to the transformation at the run-time and how it looks like after the transformation occurred.
The different types of data viewers are:
  • Grid
  • Histogram
  • Scatter Plot
  • Column Chart.

Q: What are variables and what is variable scope?
A variable is used to store values. There are basically two types of variables:
  • System Variable: (like ErrorCode, ErrorDescription, PackageName etc) whose values you can use but cannot change.
  • User Variable: which you create, assign values and read as needed. A variable can hold a value of the data type you have chosen when you defined the variable.
Variables can have a different scope depending on where it was defined. For example you can have package level variables which are accessible to all the tasks in the package and there could also be container level variables which are accessible only to those tasks that are within the container.

Q: Difference between Union All and Merge Join?
  • Merge transformation can accept only two Inputs whereas Union All can take more than two Inputs
  • Data has to be sorted before Merge Transformation whereas Union All doesn't have any condition like that.
Q: Can we add our custom code in SSIS?
We can customize SSIS through code by using Script Task. The main purpose of this task is to control the flow of the package. This is very useful in the scenario where the functionality you want to implement is not available in existing control flow item.
To add your own code:-
  • In Control Flow Tab, drag and drop Script Task from Toolbox.
  • Double click on Script Task to open and select edit to open Script Task Editor.
  • In Script Task Editor, there are 3 main properties
    • General – Here you can specify name and description
    • Script – through this we can add our code by clicking on Design Script button. The scripting language present is Net only.
    • Expression

Q What is Ignore Failure option in SSIS?
In Ignore Failure option, the error will be ignored and the data row will be directed to continue on the next transformation. Let’s say you have some JUNK data(wrong type of data or JUNK data) flowing from source, then using this option in SSIS we can REDIRECT the junk data records to another transformation instead of FAILING the package. This helps to MOVE only valid data to destination and JUNK can be captured into separate file.

Q: What is a breakpoint in SSIS? How is it setup? How do you disable it?
A breakpoint is a stopping point in the code. The breakpoint can give the Developer\DBA an
opportunity to review the status of the data, variables and the overall status of the SSIS package. 10 unique conditions exist for each breakpoint.
Breakpoints are setup in BIDS. In BIDS, navigate to the control flow interface. Right click on the object where you want to set the breakpoint and select the ‘Edit Breakpoints…’ option.

Q: Error Handling in SSIS?
An error handler allows us to create flows to handle errors in the package in quite an easy way. Through event handler tab, we can name the event on which we want to handle errors and the task that needs to be performed when such an error arises. We can also add sending mail functionality in event of any error through SMTP Task in Event handler. This is quite useful in event of any failure in office non-working hours. In Data flow, we can handle errors for each connection through following failure path or red arrow.

Q: How to log SSIS Executions?
SSIS includes logging features that write log entries when run-time events occur and can also write custom messages. This is not enabled by default. Integration Services supports a diverse set of log providers, and gives you the ability to create custom log providers. The Integration Services log providers can write log entries to text files, SQL Server Profiler, SQL Server, Windows Event Log, or XML files. Logs are associated with packages and are configured at the package level. Each task or container in a package can log information to any package log. The tasks and containers in a package can be enabled for logging even if the package itself is not.

Q: Can you explain how to setup a checkpoint file in SSIS?
The following items need to be configured on the properties tab for SSIS package:
CheckpointFileName: Specify the full path to the Checkpoint file that the package uses to save the value of package variables and log completed tasks. Rather than using a hard-coded path as shown above, it’s a good idea to use an expression that concatenates a path defined in a package variable and the package name.
  • CheckpointUsage: Determines if/how checkpoints are used. Choose from these options: Never(default), IfExists, or Always. Never indicates that you are not using Checkpoints. IfExists is the typical setting and implements the restart at the point of failure behavior. If a Checkpoint file is found it is used to restore package variable values and restart at the point of failure. If a Checkpoint file is not found the package starts execution with the first task. The Always choice raises an error if the Checkpoint file does not exist.
  • SaveCheckpoints: Choose from these options: True or False (default). You must select True to implement the Checkpoint behavior.

Q: What are the different values you can set for CheckpointUsage property ?
There are three values, which describe how a checkpoint file is used during package execution:
  • Never: The package will not use a checkpoint file and therefore will never restart.
  • If Exists: If a checkpoint file exists in the place you specified for the CheckpointFilename property, then it will be used, and the package will restart according to the checkpoints written.
  • Always: The package will always use a checkpoint file to restart, and if one does not exist, the package will fail.

Q: How would you restart package from previous failure point?What are Checkpoints and how can we implement in SSIS?
When a package is configured to use checkpoints, information about package execution is written to a checkpoint file. When the failed package is rerun, the checkpoint file is used to restart the package from the point of failure. If the package runs successfully, the checkpoint file is deleted, and then re-created the next time that the package is run.

Q: How do you deploy SSIS packages?
  • Through Manifest
    • Create deployment utility by setting its propery as true .
    • It will be created in the bin folder of the solution as soon as package is build.
    • Copy all the files in the utility and use manifest file to deply it on the Prod.
  • Using DtsExec.exe utility
    • Import Package directly in MSDB from SSMS by logging in Integration Services.

Q Where are SSIS package stored in the SQL Server?
MSDB.sysdtspackages90 stores the actual content and ssydtscategories, sysdtslog90, sysdtspackagefolders90, sysdtspackagelog, sysdtssteplog, and sysdtstasklog do the supporting roles.

Q How would you schedule a SSIS packages?
Using SQL Server Agent. Read about Scheduling a job on Sql server Agent

Q Can you name five of the Perfmon counters for SSIS and the value they provide?
  • SQLServer:SSIS Service
  • SSIS Package Instances
  • SQLServer:SSIS Pipeline
  • BLOB bytes read
  • BLOB bytes written
  • BLOB files in use
  • Buffer memory
  • Buffers in use
  • Buffers spooled
  • Flat buffer memory
  • Flat buffers in use
  • Private buffer memory
  • Private buffers in use
  • Rows read
  • Rows written

SSIS Interview Questions - Part 3

Q: Explain Aggregate Transformation in SSIS?
It aggregates data, similar you do in applying T-SQL functions like Group By, Min, Max, Avg, and Count. For example you get total quantity and Total line item for each product in Aggregate Transformation Editor. First you determine input columns, then output column name in Output Alias table in data-grid, and also operations for each Output Alias in Operation columns of the same data-grid.
Some of operation functions listed below :
  • Group By
  • Average
  • Count
  • Count Distinct : count distinct and non null column value
  • Min
  • Max
  • Sum
Step by Step Tutorial on Aggregate Transformation in SSIS 

Q: Explain Conditional Split transformation in SSIS?
This is just like IF condition which checks for the given condition and based on the condition evaluation, the output will be sent to the appropriate OUTPUT path. It has ONE input and MANY outputs. Conditional Split transformation is used to send paths to different outputs based on some conditions.

Step by Step Tutorial on Conditional Transformation in SSIS

Q: Explain Data Conversion Transformation in SSIS?
This component does conversion data type, similar to T-SQL function CAST or CONVERT. It is used to convert the data from one type to another.

Data Conversion Transformation in detail with Example

Q: Explain Derived column Transformation?
Derived column creates new column or put manipulation of several columns into new column. You can directly copy existing or create a new column using more than one column also.

Step by Step Tutorial on Derived Column Transformation in SSIS

Q: Explain Audit Transformation in SSIS?
It allows you to add auditing information as required in auditing world specified by HIPPA and Sarbanes-Oxley (SOX). Auditing options that you can add to transformed data through this transformation are :
Execution of Instance GUID: ID of execution instance of the package
  • PackageID: ID of the package
  • PackageName
  • VersionID: GUID version of the package
  • Execution StartTime
  • MachineName
  • UserName
  • TaskName
  • TaskID: uniqueidentifier type of the data flow task that contains audit transformation.

Q: Explain Character Map Transformation in SSIS?
It transforms some character. It gives options whether output result will override the existing column or add to new column. If you define it as new column, specify new column name. Operations available here are:
  • Uppercase
  • Lowercase
  • Byte reversal
  • Hiragana
  • Katakana
  • Half width
  • Full width
  • Linguistic casing
  • Simplified Chinese
  • Traditional Chinese

Q: Explain Copy column Transformation in SSIS?
This component simply copies a column to another new column. Just like ALIAS Column in T-SQL.

Q: Explain Merge Transformation in SSIS?
Merge transformation merges two paths into single path. It is useful when you want to break out data into path that handles errors after the errors are handled, the data are merge back into downstream or you want to merge 2 data sources. It is similar with Union All transformation, but Merge has some restrictions :
  • Data should be in sorted order
  • Data type , data length and other meta data attribute must be similar before merged.

Q: Explain Merge Join Transformation in SSIS?
Merge Join transformation will merge output from 2 inputs and doing INNER or OUTER join on the data. But if you the data come from 1 OLE DB data source, it is better you join through SQL query rather than using Merge Join transformation. Merge Join is intended to join 2 different data source.

Q: Explain Multicast Transformation in SSIS?
This transformation sends output to multiple output paths with no conditional as Conditional Split does. Takes ONE Input and makes the COPY of data and passes the same data through many outputs. In simple Give one input and take many outputs of the same data.

Q: Explain Percentage and row sampling Transformations in SSIS?
This transformation will take data from source and randomly sampling data. It gives you 2 outputs. First is selected data and second one is unselected data. It is used in situation where you train data mining model. These two are used to take the SAMPLE of data from the input data.

Q: Explain Sort Transformation in SSIS?
This component sorts the data similar to T-SQL command ORDER BY.

Q: Explain Union all Transformation in SSIS?
It works in opposite way to Merge transformation. It can take output from more than 2 input paths and combines into single output path.

SSIS Interview Questions - Part 2

Q: What is the function of Event handlers tab in SSIS?
On the Event handlers tab, workflows can be configured to respond to package events.
For example, we can configure Work Flow when any task fails, stops or starts ..

Q: What is the function of Package explorer tab in SSIS?
This tab provides an explorer view of the package. You can see what is happening in the package. The Package is a container at the top of the hierarchy.

Q:  Can you name 5 or more of the native SSIS connection managers?
  • OLE DB connection – Used to connect to any data source requiring an OLE DB connection (i.e.,SQL Server 2000). OLE DB Connection Manager Example
  • Flat file connection – Used to make a connection to a single file in the File System. Required for reading information from a File System flat file. Tutorial: Import Flat File to SQL Server
  • ADO.Net connection – Uses the .Net Provider to make a connection to SQL Server 2005 or other connection exposed through managed code (like C#) in a custom task
  • Analysis Services connection – Used to make a connection to an Analysis Services database or project. Required for the Analysis Services DDL Task and Analysis Services Processing Task
  • File connection – Used to reference a file or folder. The options are to either use or create a file or folder
  • Excel

Q: What are the different types of data sources available in SSIS?
Q: What are the different types of data destinations available in SSIS?
Q:What is a container and how many types of containers are there?
A container is a logical grouping of tasks which allows you to manage the scope of the tasks together. These are the types of containers in SSIS:
  • Sequence Container - Used for grouping logically related tasks together
  • For Loop Container - Used when you want to have repeating flow in package
  • For Each Loop Container - Used for enumerating each object in a collection; for example a record set or a list of files.
Apart from the above mentioned containers, there is one more container called the Task Host Container which is not visible from the IDE, but every task is contained in it (the default container for all the tasks).

Q: What are the possible locations to save SSIS package?
You can save a package wherever you want.
  • SQL Server
  • Package Store
  • File System

Q: How to provide security to packages?
We can provide security in two ways
  • Package encryption
  • Password protection.

Q:  How do we convert data type in SSIS?
The Data Conversion Transformation in SSIS converts the data type of an input column to a different data type.
Refer this tutorial where Data Conversion Transformation is used in a Real Time Example
Import Flat File to SQL Server

Q: What is the use of Bulk Insert Task in SSIS?
Bulk Insert Task is used to upload large amount of data from flat files into Sql Server. It supports only OLE DB connections for destination database.

Q: How to quickly load data into sql server table?
Fast Load option. This option is not set by default so most developers know this answer as otherwise the load is very slow.

Q: What are the Transformations available in SSIS?
  • Aggregate: It applies aggregate functions to Record Sets to produce new output records from aggregated values. Example
  • Conditional Split:  Separates available input into separate output pipelines based on Boolean Expressions configured for each output. Example
  • Data Conversion: Converts columns data types from one to another type. It stands for Explicit Column Conversion. Example
  • Derived Column:  Create a new (computed) column from given expressions. Example
  • Lookup: Performs the lookup (searching) of a given reference object set against a data source. It is used for exact matches only.
  • Merge: Merges two sorted data sets into a single data set into a single data flow.
  • Merge Join: Merges two data sets into a single data-set using a join junction.
  • Multicast: Sends a copy of supplied Data Source onto multiple Destinations.
  • Row Count: Stores the resulting row count from the data flow / transformation into a variable.
  • Union All: Merge multiple data sets into a single data-set.
  • Audit: Adds Package and Task level Metadata - such as Machine Name, Execution Instance, Package Name, Package ID, etc..
  • Character Map: Performs SQL Server column level string operations such as changing data from lower case to upper case.
  • Copy Column: Add a copy of column to the output we can later transform the copy keeping the original for auditing.
  • Data Mining Query: Used to perform data mining query against analysis services and manage Predictions Graphs and Controls.
  • Export Column: Used to export a Image specific column from the database to a flat file.
  • Fuzzy Grouping: Used for data cleansing by finding rows that are likely duplicates.
  • Fuzzy Lookup: Used for Pattern Matching and Ranking based on fuzzy logic.
  • Import Column: Reads image specific column from database onto a flat file.
  • Row Sampling: Captures sample data by using a row count of the total rows in data-flow specified by rows or percentage.
  • Pivot: Used for Normalization of data sources to reduce anomalies by converting rows into columns
  • Unpivot: Used for de-normalize the data structure by converts columns into rows in case of building Data Warehouses.

SSIS Interview Questions - Part 1

Q: What is SSIS?
SSIS was first introduced in SQL Server 2005, which was the next generation of SQL Server software after SQL Server 2000. SSIS is a form of ETL (Extraction, Transformation and Load) used to perform operations like loading the data performing transformations/caluculations on data to define workflow of the process flow. The SSIS engine automation perform many data maintenance tasks, so you can update data without manually firing procedures and imports.

Q: What are the tools associated with SSIS?
  • Business Intelligence Development Studio (BIDS):  to work with Development of SSIS Projects.
  • SQL Server Management Studio (SSMS): to manage the SSIS Packages and Projects.

Q: How is SSIS different from DTS?
Older versions of SQL Server used DTS, which was similar to SSIS. DTS let you create steps that you would then assign a priority order. With SSIS, you can separate data from work flow, and SSIS has significantly better performance than older DTS packages. While performance isn’t always an issue when running jobs during off-peak hours, it’s a problem when you must run jobs during normal business hours. You can run SSIS during business hours without too much performance degradation.

Q: What is a package?
A discrete executable unit of work composed of a collection of control flow and other objects, including data sources, transformations, process sequence, and rules, errors and event handling, and data destinations.

Q: Different components in SSIS package?
  • Control flow
  • Data flow
  • Event handler
  • Package explorer

Q: What is Control Flow?
A Control Flow consists of one or more tasks and containers that execute when the package runs. To control order or define the conditions for running the next task or container in the package control flow, we use Precedence Constraints to connect the tasks and containers in a package. A subset of tasks and containers can also be grouped and run repeatedly as a unit within the package control flow. SQL Server Integration Services (SSIS) provides three different types of control flow elements:
  • Containers that provide structures in packages
  • Tasks that provide functionality
  • Precedence Constraints that connect the executable, containers, and tasks into an ordered control flow.

Q: What is a Precedence Constraint and what types of Precedence Constraint are there?
SSIS allows you to place as many as tasks you want to be placed in control flow. You can connect all these tasks using connectors called Precedence Constraints. Precedence Constraints allow you to define the logical sequence of tasks in the order they should be executed. You can also specify a condition to be evaluated before the next task in the flow is executed.
Types of precedence constraints and the condition could be either a constraint, an expression or both
  • Success: Next task will be executed only when the last task completed successfully
  • Failure:  Next task will be executed only when the last task failed
  • Complete: Next task will be executed no matter the last task was completed or failed.

Q:  Which are the different types of Control Flow components in SSIS?
The different types of Control Flow components are:
  • Data Flow Tasks
  • SQL Server Tasks
  • Data Preparation Tasks
  • Work flow Tasks
  • Scripting Tasks
  • Analysis Services Tasks
  • Maintenance Tasks
  • Containers.

Q: What is the data flow?
Data flow is nothing but the flow of data from the corresponding sources to the referred destinations. In this process, the data transformations make changes to the data to make it ready for the data warehouse.

Q:  What are the different types of Data flow components in SSIS?
  • Sources
  • Transformations
  • Destinations