May 6, 2010

DFT 2 DFT


In SSIS Data Flow Task (DFT) is the most used task as almost all the ETL is done in DFTs. Situation do arise when we need to use data between two or more DFTs. So how can this be achieved? Well there are a number of ways you can save you data in one DFT to so that you access the same data in another DFT. Below are the different ways to store the data in the intermediate process, to be used again in another DFT or even another package.
  • Raw File
  • Flat File
  • SQL Table
  • Excel File
  • Recorset destination with Script component as the source
For details check this out: DFT 2 DFT

Apr 7, 2010

SSIS - Delete files in a folder older than a specified number of days

Scenario: How to delete files created a number of days before today. The files will have the date appended to the file name and we will not check the file properties to decide the file age. The file name would follow the naming convention: File_ddmmyyyy.txt

This can be achieved very easily in SSIS. Create an integer variable to store the value stating how old files do you wish to delete. Suppose you wish to delete all files older than 3 days, the variable would have the value 3. I name this variable intFileMaxAge. Make this variable configurable so that it can be changed as and when required.

Next I show the layout of the package.

Control Flow

                                                       Fig. 1

The Foreach Loop Container is configured to pick the files from a particular folder(you should make this connection configurable) and the “Retrieve file name” property is set to “Name and extension”. Refer fig. 2

ForEach loop

                                                       Fig. 2

Next we need to save the file name for each iteration to a string variable strFileName. For this go to the Variable Mappings tab and select the variable strFileName from the drop down box and set the index to zero. Refer fig.  3

 

ForEach loop_VarMapping

                                                       Fig. 3

Once the above is done put a script task in the Foreach Loop Container. Do not make any changes to the script task. It is just there so that we can have Precedence Constraint before the File system task to control the flow. We need the File System Task to be triggered only when the condition is met i.e. files created before certain days (Refer fig. 1). Double click the precedence  and set the Evaluation operation to Expression. In the Expression put the below expression:

DATEDIFF("dd",(DT_Date)(SUBSTRING(@[User::strFileName],6,2)+"-"+SUBSTRING(@[User::strFileName],8,2)+"-"+SUBSTRING(@[User::strFileName],10,4)),GETDATE()) > @[User::intFileMaxAge]

The above expression extracts the date part from the file name and converts it to date type. Once that’s done it gets the difference of the between the file date and today’s date. After this the expression checks whether the difference is greater than the value specified in the variable intFileMaxAge.

If the above expression will be evaluated to true the control will go to the File Sytem Task.

Setting up the File System Task. (refer fig. 4) The operation is set to Delete file  as we need to delete files. Set the IsSourcePathVariable to False as I create a connection to connect to the file and use expression to modify the connection at run time.

FileSystem Task

                                                       Fig. 4

In the Connection Managers select the connection manager configured above(refer fig. 4) go to its properties and in the expression set connection string to the following expression:

@[User::strFolderPath]+ @[User::strFileName]

Where the variable strFolderPath contains the folder in which the files are placed. Again make this configurable.

That completes the package.

Mar 10, 2010

Using DB Mail in SSIS


In the SSIS Forum I often find people asking if it’s possible to send mail in HTML format or Fetch data from Table and use it in the mail body. In the Send Mail Task we can only send the mails in text format or send the query result as an attachment.

Today I will explain the way this can be achieved.
If we have a table with columns as ID, CustName, Amount and we need to have the mail sent in the following format. The rows will be colored based on a particular condition which can be set in the query.




Mail Header





IDCustomerNameAmount
1Sudeep100
2Ankur-10
3Manju20


Source Table Script

CREATE TABLE [dbo].[tblCust](
                
[ID] [int] IDENTITY(1,1) NOT NULL,
                
[CustName] [nvarchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
                
[Amount] [int] NULL
    )
ON [PRIMARY]


Insert Records

INSERT INTO [RAJ].[dbo].[tblCust]([CustName],[Amount])
    (
    
SELECT 'Sudeep',100
        
UNION ALL
    
SELECT 'Ankur',-10
        
UNION ALL
    
SELECT 'Manju',3 )
            
First we set up the Database Mail Profile with the following query in our SQL server.

USE msdb
GO
DECLARE @ProfileName VARCHAR(255)
DECLARE @AccountName VARCHAR(255)
DECLARE @SMTPAddress VARCHAR(255)
DECLARE @EmailAddress VARCHAR(128)
DECLARE @DisplayUser VARCHAR(128)

--Here I am setting up our Profile Name, Account Name, STMP server name, and the name that will display in the from field in the e-mail.

SET @ProfileName = 'Sudeep';
SET @AccountName = 'Sudeep';
SET @SMTPAddress = '192.168.1.101';
SET @EmailAddress = 'rajsudeep@gmail.com';
SET @DisplayUser = 'SQL Mail';

--The following section adds our Account, Profile, and Profile-Account association to the system.

EXECUTE msdb.dbo.sysmail_add_account_sp
@account_name = @AccountName,
@email_address = @EmailAddress,
@display_name = @DisplayUser,
@mailserver_name = @SMTPAddress

EXECUTE msdb.dbo.sysmail_add_profile_sp
@profile_name = @ProfileName

EXECUTE msdb.dbo.sysmail_add_profileaccount_sp
@profile_name = @ProfileName,
@account_name = @AccountName,
@sequence_number = 1 ;


Next is the query that provides us our data in HTML format which will be used in the mail body.

DECLARE @tableHTML NVARCHAR(MAX) ;
    
SET @tableHTML =
    
N'<html><body><h1>Mail Header</h1>' +
    
N'<table border="1" width="100%">' +
    
N'<tr bgcolor="gray"><td>ID</td><td>CustomerName</td><td>Amount</td></tr>' +
            
CAST((
        
SELECT
                    
td = ID, '',
                    
td = CustName, '',
            
'td/@bgcolor'=CASE WHEN Amount>0 THEN 'Green'
            
ELSE 'Red' END,
                    
td = Amount, ''
            
FROM tblCust
                    
FOR XML PATH('tr'), TYPE) AS NVARCHAR(MAX)) + N'</table></body></html>'EXEC msdb.dbo.sp_send_dbmail @recipients='<MailID@maildomain.Com>',
    
@subject = 'SQL Errors Report',
    
@body = @tableHTML,
    
@body_format = 'HTML',
    
@profile_name = 'SudeepTest';
            
 
In this query the HTML body is
formed in the variable @tableHTML, and the body format is set to ‘HTML’.
Depending on the number of records fetched in the query the HTML body that is created from the above query would looks like:




<html>
<body>
    <h1>
        Mail Header</h1>
    <table border="1" width="100%">
        <tr bgcolor="gray">
            <td>
                ID</td>
            <td>
                CustomerName</td>
            <td>
                Amount</td>
        </tr>
        <tr>
            <td>
                1</td>
            <td>
                Sudeep</td>
            <td bgcolor="Green">
                100</td>
        </tr>
        <tr>
            <td>
                2</td>
            <td>
                Ankur</td>
            <td bgcolor="Red">
                -10</td>
        </tr>
        <tr>
            <td>
                3</td>
            <td>
                Manju</td>
            <td bgcolor="Green">
                3</td>
        </tr>
    </table>
</body>
</html>

If you need any kind of modification in your table look and feel you could think about the related HTML that you need and modify the above query accordingly.
Now in the SSIS Package that you have use the 2nd query in Execute SQL Task and check the mail that you receive.
It is not necessary that it can be used for tables. You can modify the select statement in the above query to return the data as HTML format without FROM clause or simply put your 
HTML code in the @tableHTML variable.

Prerequisites:
  • ·         The SMTP server should be installed and configured
  • ·         The user in the profile should have access to the SMTP Server
  • ·         Check the SMTP server for the domains to which it can broadcast mails.

Feb 26, 2010

Sample Packages

I am uploading a few sample packages that you could use and modify as per your requirement. Let me know if you need any specific sample or any change in these packages.

This package shows an example where the file connection is changed dynamically based on inputs from the database table. This package includes Execute SQL Task, Foreach Loop with ADO Enumerator, Data Flow Task including Flat File Connection Manager, Derived Column for date transformation, OLEDB Destination and Flat File destination for error records. For step by step description check this post.

This package picks files from folder and deletes files based on the file name. It makes use of Foreach Loop, dummy Script Task, Precedence Constraint and File System Task.

This is a simple package where I just show how we remove duplicates from the records and perform lookup with no caching. This package includes Sort Transform, Lookup Transform, Flat File Source and OLEDB Destination.


QueryFromFile.dtsx
This package executes a set of queries kept in a folder using For Each Loop and Execute SQL Task 

I will be uploading more samples soon.


Update1:
SQLXMLtoFile.dtsx
This package takes a XML stream from Execute SQL task as output and uploads the file to a XML File. The SQL Query could be of any form like:
SELECT * FROM Table1 FOR XML Auto
or
SELECT * FROM Table1 FOR XML PATH('Test')


Update2:
DeleteOldestFile.dtsx
This package will delete the oldest file from a given folder. The filename is of the format File_MMDDYYYY.txt For more details on this package I will be posting a blog soon.

Jan 12, 2010

Assigning variables in Script Component

One of the sore points of using Script component is assigning value to variables.
Read Write variables can only be altered in Post Execute phase. Script Component runs the code for each record and this would affect performance adversely. For details check "Coding and Debugging the Script Component".
In case I need to assign the value of system variable "PackageName" to a readWrite user variable "PkgNm"

Here is a sample code to assign value to Read write variable on the post execute phase:

Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper
ImportsMicrosoft.SqlServer.Dts.Runtime.Wrapper

Public Class ScriptMain
    Inherits UserComponent

    Public Overrides Sub CreateNewOutputRows()
       ' I have not added any code here
    End Sub
    Public Overrides Sub PostExecute()
        Me.Variables.PkgNm = Me.Variables.PackageName
        Msgbox(Me.Variables.PkgNm)
        MyBase.PostExecute()
    End Sub
End Class