SQL Server Integration Services, Data Type Mapping
SQL Server Integration Services (SSIS) utilize several type
systems – some depending on the different data providers supported by the SSIS,
some depending on the environment where the service is used, etc. When
designing SSIS packages, especially in heterogeneous environments,
where different data management platforms and tools are used, appropriate data
type mapping is highly critical – just think of quantities, precision and
scale. This article should provide you with some of the basics of how data
types of the several type systems supported by SSIS can be used when
integrating data from diverse source and type systems. The article also mentions
some of the type-related shortcomings of SSIS and tries to help
you correct work around them.
The SQL Server Integration Services Data Type Map
The following table provides a mapping of all data types
used in SQL Server Integration Services; the mapping is represented from
the SQL Server perspective. Data types that are, in fact,
supported by other type systems represented in the table, but which do not map
to a corresponding SQL Server data type, are not shown here.
Obviously, when I say all, I actually mean all,
except the type system used by the SQL Server Native Client ODBC provider;
I plan to cover that in a later post.
Also note that, rather than providing a data type
compatibility matrix (which, to be honest, could have been even more useful),
I've instead tried to provide a simplified, one-to-one data type compatibility
map, that you can use to safely cast a value using a data type of one type
system to the nearest data type in another type system.
Data Type
|
|||||
SQL Server
|
SSIS
Variables |
SSIS
Pipeline Buffer |
OLE DB
|
ADO.NET
|
|
bigint
|
Int64
|
DT_I8
|
LARGE_INTEGER
|
Int64
|
|
binary
|
DT_BYTES
|
n/a
|
Binary
|
||
bit
|
Boolean
|
DT_BOOL
|
VARIANT_BOOL
|
Boolean
|
|
char
|
String
|
DT_STR
|
VARCHAR
|
StringFixedLength
|
|
date
|
DT_DBDATE
|
DBDATE
|
Date
|
||
datetime
|
DateTime
|
DT_DBTIMESTAMP
|
DateTime
|
||
datetime2
|
DT_DBTIMESTAMP2
|
DBTIME2
|
DateTime2
|
||
datetimeoffset
|
DT_DBTIMESTAMPOFFSET
|
DBTIMESTAMPOFFSET
|
DateTimeOffset
|
||
decimal
|
DT_NUMERIC
|
NUMERIC
|
Decimal
|
||
float
|
Double
|
DT_R8
|
FLOAT
|
Double
|
|
image
|
DT_IMAGE
|
n/a
|
Binary
|
||
int
|
Int32
|
DT_I4
|
LONG
|
Int32
|
|
money
|
DT_CY
(OLE DB) DT_NUMERIC (ADO.NET) |
CURRENCY
|
Currency
|
||
nchar
|
String
|
DT_WSTR
|
NVARCHAR
|
StringFixedLength
|
|
ntext
|
String
|
DT_NTEXT
|
n/a
|
String
|
|
numeric
|
DT_NUMERIC
|
NUMERIC
|
Decimal
|
||
nvarchar
|
String
|
DT_WSTR
|
NVARCHAR
|
String
|
|
nvarchar(max)
|
Object
|
DT_NTEXT
|
n/a
|
n/a
|
|
real
|
Single
|
DT_R4
|
FLOAT, DOUBLE
|
Single
|
|
rowversion
|
DT_BYTES
|
n/a
|
Binary
|
||
smalldatetime
|
DateTime
|
DT_DBTIMESTAMP
|
DateTime
|
||
smallint
|
Int16
|
DT_I2
|
SHORT
|
Int16
|
|
smallmoney
|
DT_CY
(OLE DB) DT_NUMERIC (ADO.NET) |
CURRENCY
|
Currency
|
||
sql_variant
|
|||||
n/a
|
|||||
text
|
DT_TEXT
|
n/a
|
n/a
|
||
time
|
DT_DBTIME2
|
DBTIME2
|
Time
|
||
timestamp
|
DT_BYTES
|
n/a
|
Binary
|
||
tinyint
|
Byte
|
DT_UI1
|
BYTE
|
Byte
|
|
uniqueidentifier
|
DT_GUID
|
GUID
|
Guid
|
||
varbinary
|
DT_BYTES
|
n/a
|
Binary
|
||
varbinary(max)
|
DT_IMAGE
|
n/a
|
Binary
|
||
varchar
|
String
|
DT_STR
|
VARCHAR
|
String
|
|
varchar(max)
|
DT_TEXT
|
n/a
|
n/a
|
||
DT_NTEXT
|
Type Systems
The following type systems are supported by SSIS:
- SQL
Server Data Types – these are the essential data types supported
by SQL Server; also known as built-in or system data
types. Custom, CLR-based data types, are not discussed in this article,
mostly because they are specific to SQL Server, specific to a
particular version of SQL Server, or could be represented
using system data types (e.g. by character or binary data types, or by
XML);
- SSIS Variable Data
Types – the data types of the underlying type system used by the SSIS service
are exposed to the programming environment as .NET Framework data
types. However, be aware that not all .NET Framework data types can be
used by the SSIS variables. You can find a list of
supported SSISvariable data types in the TypeCode Enumeration article on
MSDN;
- SSIS Pipeline
Buffer Data Types – the data types used by the pipeline buffer,
the essential element of the Data Flow task, are exposed using a dedicated
type system. This type system is different from the one used by the SSIS variables.
As long as the Data Flow definition metadata corresponds to the metadata
of the data sources involved in the Data Flow, you should not experience
any problems with this particular type system, regardless of the data
providers used by the Data Source or the Data Destination components. For
additional information on SSIS pipeline buffer data types consult
the Integration Services Data Types article
on MSDN;
- OLE DB and ADO.NET Data
Types – the data providers use their own type systems, which are exposed
to the SSIS programming interface. These are also different from
the rest of the type systems. When referring to these two data providers
in this article, I specifically refer to their use in the Execute SQL
Task. You can find more information on OLE DB data types and data
type conversions on MSDN, beginning withData Types in OLE DB (OLE DB).
For more information about ADO.NET data types start with Data Type Mappings in ADO.NET.
The Base of All Bases
In the .NET Framework all data types are derived from Object.
Therefore, all .NET data types are convertible to Object. In other words:
wherever .NET data types are used and no other more appropriate data type is
available, Object can be used instead.
SSIS variables utilize a subset of .NET data types,
including Object; if the most appropriate data type is not available for
your SSIS variable, use Object.
Why not use Object for just any SSIS variable?
Think of debugging and logging. While most other data types implement at least
one standard accessor that allows you to view or display the actual value, or
to write it to a log, with Object you would have to add your own
programmatic logic to convert the values to their actual type before they could
be displayed or written to the log.
Exceptional Cases
DATE
Even though OLE DB documentation – for instance,
the Data Type Support
for OLE DB Date/Time Improvements article on MSDN (also available for
SQL Server versions 2008 and 2008 R2) – suggests that the OLE DB DBTIMESTAMP be
used for SQL Server DATETIME or SMALLDATETIME values, this
data type does not seem to be supported in the Execute SQL Task. Attempts to
pass SQL ServerDATETIME or SMALLDATETIME values to (or from) an
OLE DB DBTIMESTAMP parameter will result in the following error:
Executing the query "..." failed with the
following error: "Invalid time format".
Possible failure reasons: Problems with the query,
"ResultSet" property not set
correctly, parameters not set correctly, or connection not
established correctly.
Alternatively, you can use the OLE DB DATE data
type for DATETIME or SMALLDATETIME values. Be warned,
though, that the precision of DATE is one second, which
means that DATETIME values might be truncated. On the other
hand, SMALLDATETIME values, whose precision is one minute,
will not be affected.
ADO.NET is a more appropriate option in such cases, as it
does not suffer from these limitations.
DECIMAL
Prior to SSIS 2012, DECIMAL was not a
supported SSIS variable data type. Various workarounds have so far
been proposed by other users, but to me the only sensible option for SSIS variables
holding decimal values is the use of the Object data type; the values
will be implicitly converted to the appropriate destination data type – for
instance, when passed to an Execute SQL Task parameter. Starting with SQL
Server 2012, DECIMAL is (finally) available for use with SSISvariables;
so, at least this particular workaround is now a thing of the past.
The Decimal data type is available in OLE DB
as well as ADO.NET data providers; however, only ADO.NET actually supports
it in SSIS. The following exception is raised by SSIS when
trying to use the DECIMAL data type in an Execute SQL Task using
OLE DB (regardless of the fact that DECIMALis listed as a supported
data type in the Execute SQL Task editor in SSDT or BIDS):
The type is not supported.DBTYPE_DECIMAL
The OLE DB NUMERICAL data type is available
and compatible; its precision and scale match the ones used for SQL
Server DECIMAL (and NUMERIC) data types.
UNIQUEIDENTIFIER
Even though Guid is a native .NET data type, it is
not part of the SSIS variable type system. You can create UNIQUEIDENTIFIER or Guid values
using SSIS (e.g. using the Script Task or the Execute SQL Task), but
in order to pass them to other SSIS components, you must resort to a
"trick". For instance, to create a Guid value in a Script
Task and then assign it to a SSIS variable of typeString, use the
following assignment:
Dts.Variables["my_guid_string_variable"].Value
= "{" +
Guid.NewGuid().ToString() + "}";
A Guid value cast to String and
formatted this way can then be passed as a parameter to an Execute SQL Task, or
to the SSIS Pipeline Buffer (e.g. in a Derived Column data flow
component) – mapped to a Guid parameter, or a DT_GUID column.
It will be implicitly cast to the appropriate type.
When UNIQUEIDENTIFIER values are returned from the
Execute SQL Task, their type depends on the data provider used by the task.
Attempts to assign a Guid value returned from ADO.NET to a StringSSIS variable,
will result in the usual type mismatch error:
The type of the value being assigned to variable
"User::string_guid_variable"
differs from the current variable type. Variables may not
change type during
execution. Variable types are strict, except for variables
of type Object.
Again, you can prevent this by using Object as
the SSIS variable data type.
SQL_VARIANT
It would be wrong to say that the SQL_VARIANT data
type is not supported by the Execute SQL Task. On the other hand, you should be
very careful with this; any OLE DB type, which is compatible with a SQL
Server data type, which is in turn compatible with the SQL_VARIANT data
type, can be used in the Execute SQL Task. However, having to know the type in
advance contradicts the principal purpose of SQL_VARIANT – its
ability to support a variety of data types in one variable, or in the same
column. I would advise against using the OLE DB provider with the Execute
SQL Task if you have to deal with SQL_VARIANT.
There is no equivalent data type for SQL_VARIANT in
the type system used by the SSIS Pipeline Buffer. If you're using an
OLE DB data source and destination, the Data Flow designer will
automatically use the DT_WSTR type for SQL_VARIANT columns
with the following warning:
The output "OLE DB Source Output" references an
external data type that cannot
be mapped to a Data Flow task data type. The Data Flow task
data type DT_WSTR
will be used instead.
Also note that the size of the column in the data flow
metadata will be set by the editor based on the sample of rows provided at
design time. At run time, you might encounter truncation errors, should the
size of the actual data exceed the one set at design time.
If you're using an ADO.NET source, the Data Flow designer
will automatically use the DT_NTEXTdata type for SQL_VARIANT columns
with the following warning:
The data type "System.Object" found on column
"sql_variant_column" is not
supported for the component "ADO NET Source"
(201). This column will be
converted to DT_NTEXT.
Personally, I would advise against using SQL_VARIANT columns
in data flows at all, unless your own tests conclusively show that
the data in your environment is read from the source and written to the
destination correctly.
TABLE and XML
Even though the SQL Server TABLE type is
supported by the .NET Framework, OLE DB, and ADO.NET – that is, via
Table-valued Parameters – it is not supported as a data type in SSIS.
Typically, in SSIS, in-flight set-oriented processing is performed using
the Data Flow task; of course, if you can stage your data, and are not required
to perform all data processing in a single data flow, you can move some of that
set-oriented logic outside the SSIS process (for instance, updates or
merges can be performed using the UPDATE and/or MERGE statements,
executed fromSSIS by using the Execute SQL Task).
XML is a complex native data type in SQL Server,
and it can also be represented by a string. Unfortunately, in SSIS, things
are not quite as simple as that. The .NET Framework implements several types
based on the W3C XML Recommendation, ADO.NET also supports XML natively, and
OLE DB supports the use of XML data – to some extent. I believe XML
deserves special attention in SSIS, and I've covered it in more detail in
my earlier post on SSIS
and complex parameters.
Large Object Data
Large-object data types, such as VARCHAR(MAX), NVARCHAR(MAX) OR VARBINARY(MAX),
are not fully supported by the Execute SQL Task, by neither the OLE DB,
nor the ADO.NET, providers. By using the ADO.NET provider it is possible to
pass IMAGE, NTEXT or VARBINARY(MAX) data to and from
the Execute SQL Task, but neither VARCHAR(MAX), nor NVARCHAR(MAX),
parameters are supported. It is, however, possible to work around these
limitations by using more elaborate techniques, which I feel deserve a
dedicated article, so keep watching this blog for more information on that
particular subject.
This was a very helpful reference and helped me get my SSIS job aligned. Thanks very much!
ReplyDelete