

                                 Arrays and Lists in SQL Server 2008
                                          Using Table-Valued Parameters
     An SQL text by Erland Sommarskog, SQL Server MVP. Latest revision: 2012-07-01.



     Introduction
     In the public forums for SQL Server, you often see people asking How do I use arrays in SQL Server? Or Why does SELECT * FROM tbl WHERE col IN (@list) not work? The short answer to the first question is that SQL Server does not have arrays – SQL Server has tables. Up to SQL Server 2005, there was no way to pass a table from a client, but you had to pass a comma-separated string or similar to SQL Server, and then you would unpack that list into a table in your stored procedure.

     This changed with SQL 2008. The advent of table-valued parameters makes it dirt simple to pass a comma-separated list to SQL Server.
     In this article I will introduce a simple and perfectly reusable class for this task. Table-valued parameters are also great when you want to
     load data to SQL Server through a stored procedure; you no longer have to build XML documents that you shred in SQL Server. In this
     article I show you how to load a master-detail file into SQL Server tables in two different ways: read the entire file into memory or stream it
     directly. What I am not showing – because it's so simple – is that if you already have your data in a DataTable object, you can pass that
     DataTable as the value for your TVP.
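
     For what it is worth, here is a minimal sketch of that case in C#. It assumes a table type integer_list_tbltype with a single int column n and a procedure get_product_names that takes it (both are created later in this article), a connection string of your own in connstr, and using directives for System, System.Data and System.Data.SqlClient:

        // Build a DataTable whose single column matches the table type.
        DataTable tbl = new DataTable();
        tbl.Columns.Add("n", typeof(int));
        foreach (int id in new int[] {9, 12, 27, 37}) {
           tbl.Rows.Add(id);
        }

        using (SqlConnection cn = new SqlConnection(connstr))
        using (SqlCommand cmd = cn.CreateCommand()) {
           cmd.CommandType = CommandType.StoredProcedure;
           cmd.CommandText = "dbo.get_product_names";

           // The DataTable goes straight into the Value property of a Structured parameter.
           SqlParameter p = cmd.Parameters.Add("@prodids", SqlDbType.Structured);
           p.TypeName = "dbo.integer_list_tbltype";
           p.Value = tbl;

           cn.Open();
           using (SqlDataReader reader = cmd.ExecuteReader()) {
              while (reader.Read()) {
                 Console.WriteLine("{0}  {1}", reader.GetInt32(0), reader.GetString(1));
              }
           }
        }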

     The examples are in C# and VB .NET, using the SqlClient API, and the main body of the article covers this environment. For other
     environments such as Java or Entity Framework, there is a quick overview of what is possible at the end of the article.

     There is an accompanying article: Arrays and Lists in SQL Server 2005 and Beyond (and an even older one for SQL 2000) where I describe in detail various methods to pass a list of values in a string and unpack them into a table in SQL Server. For the hard-core geeks there are two performance appendixes, one labelled SQL 2008 and an older labelled SQL 2005, where I relate data from performance tests of the methods described in the articles, including table-valued parameters.

     Contents:

          Introduction
          Background
          Table-Valued Parameters in T‑SQL
            Declarations
            Invoking an SP with a TVP
            Permissions
            Restrictions
          Passing Table-Valued Parameters from ADO .NET
            About the Sample Code
          Sending a Comma-Separated List to SQL Server
            Using the Class with a Stored Procedure
            Inside the CSV_splitter Class
          What About Performance?
            Performance in SQL Server
            Client-side Performance
            Primary Keys and Sorted Data – Looking Closer at the SqlMetaData constructors
          Loading Data through Table-Valued Parameters
            The Setup
            Take One: Reading the File Into a List
            Take Two: Streaming the File
            Performance Considerations
          Using Table-Valued Parameters from Other APIs
           Acknowledgements, Feedback and Further Reading
          Revision History


     Some of the sample code in this article refers to the Northwind database. This database does not ship with SQL Server, but you can
     download the script to install it from Microsoft's web site.



     Background
     You have a number of key values, identifying a couple of rows in a table, and you want to retrieve these rows. If you are the sort of person
     who composes your SQL statements in client code, you might have something that looks like this:

        cmd.CommandText = "SELECT ProductID, ProductName FROM Northwind.dbo.Products " & _
                          "WHERE ProductID IN (" & List & ")"
        reader = cmd.ExecuteReader()

     List is here a string variable that you somewhere have assigned a comma-separated list, for instance "9,12,27,39".

     This sort of code is bad practice, because you should never interpolate parameter values into your query string. (Why, is beyond the scope
     of this article, but I discuss this in detail in my article The Curse and Blessings of Dynamic SQL, particularly in the sections on SQL
     Injection and Caching Query Plans.)

     Since this is bad practice, you want to use stored procedures. However, at first glance there does not seem to be any apparent way to do this. Many have tried with:

        CREATE PROCEDURE get_product_names @ids varchar(50) AS
           SELECT ProductID, ProductName
           FROM   Northwind.dbo.Products
           WHERE  ProductID IN (@ids)

     But when they test:

        EXEC get_product_names '9,12,27,37'

     The reward is this error message:

        Server: Msg 245, Level 16, State 1, Procedure get_product_names, Line 2
        Syntax error converting the varchar value '9,12,27,37' to a column
        of data type int.

     This fails, because we are no longer composing an SQL statement dynamically, and @ids is just one value in the IN clause. An IN clause that could also read:

        ... WHERE col IN (@a, @b, @c)

     Or more to the point, consider this little script:

        CREATE TABLE #csv (a varchar(20) NOT NULL)
        go
        INSERT #csv (a) VALUES ('9,12,27,37')
        INSERT #csv (a) VALUES ('something else')
        SELECT a FROM #csv WHERE a IN ('9,12,27,37')   -- Returns one row.

     So now you know why col IN (@list) does not work. Or rather: you know now that it works differently from your expectations. In the following we will look into how to solve this kind of problem with table-valued parameters.

     Before I go on, I should add that sometimes you may find yourselves in the (very unfortunate) situation when you have a delimited list in a
     table column in your database. To unpack such a list, you would need any of the methods that I discuss in Arrays and Lists for SQL 2005
     and Beyond; TVPs cannot help you here.



     Table-Valued Parameters in T-SQL
     Declarations


     Let's first look at how to use TVPs in T‑SQL without involving a client. To be able to declare a TVP, you first need to create a table type
     like this:

        CREATE TYPE integer_list_tbltype AS TABLE (n int NOT NULL PRIMARY KEY)

     That is, after CREATE TYPE you specify the type name followed by AS TABLE and then comes the table definition, using the same syntax
     as CREATE TABLE. You cannot use everything you can use with CREATE TABLE, but you can define PRIMARY KEY, UNIQUE and
     CHECK constraints, you can use IDENTITY and DEFAULT definitions, and you can define computed columns. Once you have this table
     type, you can use it to declare table variables:

        DECLARE @mylist integer_list_tbltype

     However, you cannot use the type with CREATE TABLE (it could have been nifty with temp tables!), nor can you use it for the declaration
     of the return table in a multi-step table function. The raison d'être for table types is to make it possible to declare table-valued parameters
     for stored procedures or user-defined functions. Here is one example:

        CREATE PROCEDURE get_product_names @prodids integer_list_tbltype READONLY AS
           SELECT p.ProductID, p.ProductName
           FROM   Northwind.dbo.Products p
           WHERE  p.ProductID IN (SELECT n FROM @prodids)

     The body of the procedure brings no surprises. The code looks just as it would have if @prodids had been a local table variable. The
     parameter declaration on the other hand includes a keyword hitherto not seen in this context: READONLY. This keyword means what it
     says: you cannot modify the contents of the table parameter in any way in the procedure. As an aside, this restriction makes TVPs far less
     useful than they could have been; often you want to pass data between stored procedures as I discuss in my article How to Share Data
     Between Stored Procedures. However, for the task at hand, passing data from a client, the READONLY restriction is no major obstacle.

     Invoking an SP with a TVP
     Calling this procedure is straightforward:

        DECLARE @mylist integer_list_tbltype
        INSERT @mylist(n) VALUES(9),(12),(27),(37)
        EXEC get_product_names @mylist

     (Here I use the new syntax for INSERT that permits me to specify values for more than one row in the VALUES clause.) You can also use
     TVPs with sp_executesql:

        DECLARE @mylist integer_list_tbltype,
                @sql    nvarchar(MAX)
        SELECT @sql = N'SELECT p.ProductID, p.ProductName
                        FROM   Northwind..Products p
                        WHERE  p.ProductID IN (SELECT n FROM @prodids)'
        INSERT @mylist VALUES(9),(12),(27),(37)
        EXEC sp_executesql @sql, N'@prodids integer_list_tbltype READONLY', @mylist

     There are a few peculiarities, though. This does not work:

        EXEC get_product_names NULL

     but results in this error message:

        Msg 206, Level 16, State 2, Procedure get_product_names, Line 0
        Operand type clash: void type is incompatible with integer_list_tbltype

     It is quite logical when you think of it: NULL is a scalar value, not a table value. But what do you think about this:

        EXEC get_product_names

     You may expect this to result in an error due to the missing parameter, but instead this runs and produces an empty result set! The same
     happens with:

        EXEC get_product_names DEFAULT



     The scoop is that a table-valued parameter always has the implicit default value of an empty table. Whether this is good or bad can be
     disputed, but if there were to be explicit default values, Microsoft would have to invent a lot of syntax for it. And in most cases, the default
     value you want is probably the empty table, so it is not entirely unreasonable.

     Permissions
     One thing with table types which is not apparent is that you need permission to use a table type. This can be demonstrated in this script:

        CREATE USER testuser WITHOUT LOGIN
        go
        EXECUTE AS USER = 'testuser'
        go
        DECLARE @p integer_list_tbltype
        go
        REVERT
        go
        DROP USER testuser

     (What we do here is to create a loginless user, and then impersonate that user. This is a quick way to test permissions. For more details on
     impersonation, see my article Giving Permissions through Stored Procedures.)

     The output from this script is puzzling:

        Msg 229, Level 14, State 5, Line 1
        The EXECUTE permission was denied on the object 'integer_list_tbltype',
        database 'tempdb', schema 'dbo'.

     But the message is to be taken to the letter. To be able to use a table type, you need to have EXECUTE permission on the type. (This does
     not apply to normal scalar types, but it does apply to user-defined CLR types.) To grant permission on a type, the syntax is:

        GRANT EXECUTE ON TYPE::integer_list_tbltype TO testuser

     The TYPE:: prefix is needed to specify the object class.

     Restrictions
     It is maybe not so surprising that you cannot use table-valued parameters across linked servers, given that there are many restrictions with
     linked servers. But it does not stop there: you cannot even use table-valued parameters across databases. If you try something like:

        USE tempdb
        go
        CREATE TYPE tbltype AS TABLE (a int NOT NULL PRIMARY KEY)
        go
        CREATE PROCEDURE tbltype_sp @t tbltype READONLY AS
           SELECT a FROM @t
        go
        USE otherdb
        go
        CREATE TYPE tbltype AS TABLE (a int NOT NULL PRIMARY KEY)
        go
        DECLARE @t tbltype
        EXEC tempdb..tbltype_sp @t

     The error message is:

        Msg 206, Level 16, State 2, Procedure tbltype_sp, Line 0
        Operand type clash: tbltype is incompatible with tbltype

     And if you try

        CREATE PROCEDURE tbltype_sp @t otherdb.dbo.tbltype READONLY AS
           SELECT a FROM @t

     You get the message:


        Msg 117, Level 15, State 2, Procedure tbltype_sp, Line 1
        The type name 'otherdb.dbo.tbltype' contains more than the maximum number of prefixes. The maximum is 1.

     This somewhat awkward message informs us that the data type for a parameter cannot be a three-part name with a database component.

     You cannot create stored procedures or user-defined functions in the CLR that take a table-valued parameter. But the other way works:
     you can call a T‑SQL procedure with a TVP from a CLR procedure, using the same mechanisms you use from a client and which is what
     we will look at next.



     Passing Table-Valued Parameters from ADO .NET
     Passing values to TVPs from ADO .NET is very straightforward, and requires very little extra code compared to passing data to regular
     parameters. You need .NET Framework 3.5 SP1 or higher to have support for TVPs. You can only use TVPs with SqlClient; you cannot
     use TVPs with the classes in the System.Data.OleDb or System.Data.Odbc namespaces.

     The specifics can be summarised as:

          For the data type you specify SqlDbType.Structured.
          You specify the name of the table type in the TypeName property of the parameter.
          You set the Value property of the parameter to something suitable.

     Exactly what is suitable then? There is an MSDN topic that suggests that the three choices are List<SqlDataRecord>, a DataTable or a
     DbDataReader. It turns out that this is not the full story. I have not been able – and nor have I really tried – to find the exact
     requirements, but it seems that you can pass anything that implements IEnumerable and IDataRecord, and then DataTable is a special
     case that goes beyond that. Exactly what you can use and not use is not particularly interesting. I would suggest that in practice you will use
     one of these four:

          List<SqlDataRecord>.
          Custom-written classes that implement IEnumerable<SqlDataRecord> and IEnumerator<SqlDataRecord>.
          DataTable.
          A DbDataReader of some sort.

     Of these, you would use the first two for general-purpose programming. The only reason to pass a DataTable is that you already have the
     data in such an object. If you have the data somewhere else – in a file, on a wire etc – there is no reason to fill a DataTable when you can
     use a List which is more lightweight. By the same token, the only reason you would use a DbDataReader is that you have a
     DbDataReader anyway. That is, if the data for your TVP comes from an Oracle database, you can pass an OracleDataReader directly
     – no need to populate a List or a DataTable.

     For this reason, in this article I focus on the first two alternatives, and all examples use either a custom-written class or a
     List<SqlDataRecord>.
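
     As a small sketch of the List<SqlDataRecord> alternative (assuming the integer_list_tbltype type from the T‑SQL section above, and using directives for System.Data, System.Collections.Generic and Microsoft.SqlServer.Server), building the list could look like this; the SqlMetaData and SqlDataRecord classes are covered in more detail later in the article:

        // One SqlMetaData entry per column in the table type.
        SqlMetaData[] metadata = { new SqlMetaData("n", SqlDbType.Int) };

        List<SqlDataRecord> records = new List<SqlDataRecord>();
        foreach (int id in new int[] {9, 12, 27, 37}) {
           // Each row in the TVP is one SqlDataRecord built from the same metadata.
           SqlDataRecord rec = new SqlDataRecord(metadata);
           rec.SetInt32(0, id);
           records.Add(rec);
        }

     The list then goes into the Value property of a Structured parameter, exactly as the CSV_splitter object does in the examples that follow.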

     One caveat about DataTable and DbDataReader: if your TVP has an IDENTITY column or a computed column, you may not be able
     to get these columns to work with your objects. In this case, you can always use a List or a custom-written iterator, since this gives you
     access to some more options to define the metadata for the TVP, and I will briefly cover how to do this later.

     About the Sample Code
     Before we move on, I like to give a quick introduction to the demo files that accompany this article. They are compiled in two zip files, one
     with the demos in C# and one with the same demos in Visual Basic .NET. Use the language that is the most convenient to you. In the text
     itself, I sometimes show the code in C# and sometimes in VB .NET. If you are more comfortable with the language I'm not using for the
     moment, please refer to the corresponding file in the other language. For instance, if I refer to TVPDemo.DemoHelper.cs, you can rely
     on that there is also a TVPDemo.DemoHelper.vb file in the zip file with the VB code.

     Beside the source code in C# and VB .NET, the zip files also include an SQL script and a file with sample data for one of the demos. There
     is also a file compile.bat you can use to compile the files. There are however no project/solution files for Visual Studio, as that goes a little
     above my head.


     I will cover the code in the zip files as we arrive to the examples where they belong. There is however one class I like to highlight here and
     now, and that is the TVPDemo.DemoHelper class. This class includes some utility routines that are of little interest for the article as such.
     There is one thing I like to highlight, though, to wit the connection string:

        private const string connstr =
            "Application Name=TVPdemo;Integrated Security=SSPI;" +
            "Data Source=.;Initial Catalog=tempdb";

     You may have to change it to fit your environment. Particularly, if you only have Express Edition installed, you should probably use .\SQLEXPRESS for Data Source instead of the single dot.

     In the article, the code mainly appears without comments, since I explain the code in the text. However, in the source files, the code is
     thoroughly commented.

     Disclaimer: My expertise is in SQL Server, and I only write .NET code left-handedly. While I have done my best to adhere to what I think is best practice, you may see things which make you think "I would never do something like that". It is not unlikely that you will be right.
     Please let me know in such case!



     Sending a Comma-Separated List to SQL Server
     The Arrays and Lists articles take their base in the problem of using a comma-separated list in SQL Server. Programmers often encounter
     them, because there are form controls that produce such lists. (Or so the .NET programmers I know keep telling me.) The other articles in
     this series present solutions to transform these lists into a table in SQL Server. Here I'm showing you a much better solution: transform the
     list in the client and pass it to a table-valued parameter. SQL Server should spend its resources on reading and writing tabular data, not
     string processing. Not only are the resources better spent this way, the solution is also much simpler and cleaner with the help of the class
     CSV_splitter that I will introduce.

     Using the CSV_splitter Class
     Using the CSV_splitter class is extremely simple. All your application code sees is a call to the constructor and that's it. Here is an
     example where we call a stored procedure with a TVP. We use the table type and the stored procedure I used in the T‑SQL section above:

        CREATE TYPE integer_list_tbltype AS TABLE (n int NOT NULL PRIMARY KEY)
        go
        CREATE PROCEDURE get_product_names @prodids integer_list_tbltype READONLY AS
           SELECT p.ProductID, p.ProductName
           FROM   Northwind.dbo.Products p
           WHERE  p.ProductID IN (SELECT n FROM @prodids)

     In the set of demo files you find CSVDemo.vb which includes the procedure CSV_to_SP that calls get_product_names, and it is no more complicated than this:

        Private Sub CSV_to_SP()

           Using cn As SqlConnection = TVPDemo.DemoHelper.SetupConnection(), _
                 cmd As SqlCommand = cn.CreateCommand()

              cmd.CommandType = CommandType.StoredProcedure
              cmd.CommandText = "dbo.get_product_names"

              cmd.Parameters.Add("@prodids", SqlDbType.Structured)
              cmd.Parameters("@prodids").Direction = ParameterDirection.Input
              cmd.Parameters("@prodids").TypeName = "integer_list_tbltype"
              cmd.Parameters("@prodids").Value = _
                   New TVPDemo.CSV_splitter("9,12,27,37")

              Using da As New SqlDataAdapter(cmd), _
                    ds As New DataSet()
                 da.Fill(ds)
                 TVPDemo.DemoHelper.PrintDataSet(ds)
              End Using
           End Using
        End Sub

     To a large extent, very typical code to call a stored procedure. We first set up a connection and create a command object. We move on to
     state which stored procedure to call, and we define the single parameter that get_product_names accepts. Finally, we invoke the
     procedure, and in this example I have chosen to use DataAdapter.Fill together with a method in my DemoHelper class that prints the
     result set. In a real-world scenario you may prefer to use ExecuteReader or whatever fits you.

     (I assume that most readers are acquainted with Using, but in case you are not: this statement permits you to declare a variable that is accessible in the enclosed block, and when the block exits, any Dispose method of the class will be invoked. This is highly recommendable for SqlConnection and SqlCommand objects. If you just leave it to garbage collection to take care of them, you may burden SQL Server with a lot of extra connections. Using is available in C# as well, but spelt using.)

     The interesting part is the four statements that set up the parameter. The first adds the parameter and defines the type:

        cmd.Parameters.Add("@prodids", SqlDbType.Structured)

     For a table-valued parameter, you always specify SqlDbType.Structured here.

        cmd.Parameters("@prodids").Direction = ParameterDirection.Input

     Specifying the direction of the parameter is somewhat superfluous; since TVPs are read-only, Input is the only choice. Nevertheless, I
     have included it for clarity. Next we introduce the name of the table type in SQL Server, by setting the special parameter property
     TypeName:

        cmd.Parameters("@prodids").TypeName = "integer_list_tbltype"

     Strictly speaking, this is not necessary when calling a stored procedure, since SQL Server knows the type anyway. However, it is definitely
     best practice to always specify the type. For one thing, this will give you a clearer error message when there is a mismatch between the
     structure you pass from the client and the table type in SQL Server.

     And now – drum roll! – it's time to pass an actual value to the TVP:

        cmd.Parameters("@prodids").Value = New TVPDemo.CSV_splitter("9,12,27,37")

     You create a new CSV_splitter object which you pass as the parameter value. And as the parameter to the constructor you pass your list
     of comma-separated integers. Since this is a sample, the list is a constant; in practical code you would of course have a variable here.

     All you need to do to get this to work is to put the CSV_splitter class in place. Which is very simple, since the code is included in the
     download files. You only need to change the namespace to fit your local conventions. The joy is that this class is perfectly reusable, and
     while I will cover the internals of the class in a second, all you really need to know if you are in a hurry is this:

          The class assumes a list of integers – more precisely Int64. If you want a list of strings, you will need to clone the class.
           The constructor takes an optional parameter which permits you to specify a different delimiter, as in the example below this list. (But it has to be a
           single-character delimiter.)
          Empty elements in the string are ignored.
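
     For instance, a semicolon-separated list could be passed by giving the delimiter to the two-argument constructor (a hypothetical call in C# syntax, not taken from the demo files):

        cmd.Parameters["@prodids"].Value =
            new TVPDemo.CSV_splitter("9;12;27;37", ';');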

     You might ask: what if I don't use stored procedures? Can I still use TVPs and the CSV_splitter class? Sure enough. The file
     CSVDemo.vb also includes this routine:

        Private Sub CSV_to_SQL()

           Using cn As SqlConnection = TVPDemo.DemoHelper.SetupConnection(), _
                 cmd As SqlCommand = cn.CreateCommand()

              cmd.CommandType = CommandType.Text
              cmd.CommandText = " SELECT p.ProductID, p.ProductName " & _
                                " FROM   Northwind.dbo.Products p " & _
                                " WHERE  p.ProductID IN (SELECT n FROM @prodids)"

              cmd.Parameters.Add("@prodids", SqlDbType.Structured)
              cmd.Parameters("@prodids").Direction = ParameterDirection.Input
              cmd.Parameters("@prodids").TypeName = "integer_list_tbltype"
              cmd.Parameters("@prodids").Value = _
                   New TVPDemo.CSV_splitter("1, 11, 76, 34")

              Using da As New SqlDataAdapter(cmd), _
                    ds As New DataSet()
                 da.Fill(ds)
                 TVPDemo.DemoHelper.PrintDataSet(ds)
              End Using
           End Using
        End Sub

     It is very similar to CSV_to_SP. The one thing to observe is this line:

        cmd.Parameters("@prodids").TypeName = "integer_list_tbltype"

     When you use CommandType.Text, it is compulsory to specify the name of the table type. For stored procedures you can leave it out,
     but as I noted above, best practice is to always include the type name.

     Inside the CSV_splitter Class
     As I discussed above, the object you pass as the value for a TVP must implement IEnumerable<SqlDataRecord> and
     IEnumerator<SqlDataRecord>. I guess most .NET programmers understand what this means. In case you don't: an interface consists of
     a number of members with well-defined signatures, but without any code. To implement an interface you write a class that includes the
     members of the interface with exactly those signatures – now with code added. To this you can add other members as you like. You don't
     have to worry too much about inadvertently leaving something out – the compiler will inform you of any small detail you forget. Some
     interfaces feature over 20 members, but these two interfaces are quite slender with in total four methods and one property.

     While the main purpose is to feed a table-valued parameter, it is worth noting that since CSV_splitter implements IEnumerable, you can
     use the class in this way.

        foreach (SqlDataRecord rec in new TVPDemo.CSV_splitter("1,2,3,4")) {
           Console.WriteLine(rec.GetInt64(0).ToString());
        }

     Not that this is particularly useful, but it gives an understanding of what this is all about.

     I will now walk you through the inside of the CSV_splitter class, which you find in TVPDemo.CSV_splitter.cs. For reference, here is the using section:

        using System;
        using System.Collections.Generic;
        using Microsoft.SqlServer.Server;

     Nothing startling here. (Most people would probably add a few more namespaces, but I refer to some classes in full for clarity.) The class
     declaration looks like this:

        public class CSV_splitter : IEnumerable<SqlDataRecord>,
                                    IEnumerator<SqlDataRecord>

     That is, this class implements both IEnumerable and IEnumerator. This is possibly disputable; some people may prefer to have one
     class per interface, but I could not really see the point in this. (I did say that I'm normally not a .NET programmer, didn't I?)

     The class has a few private member variables:

        string   input;        // The input string.
        char     delim;        // The delimiter.
        int      start_ix;     // Start position for current list element.
        int      end_ix;       // Position for the next list delimiter.
        SqlDataRecord outrec;  // The record we use to return data.

     input is the comma-separated list itself and delim is the delimiter. start_ix and end_ix keep track of where in the string we are. The most interesting member is outrec. As the snippet above shows, each iteration produces an instance of SqlDataRecord and as we shall see, it comes from this outrec variable.


     To permit for an alternate delimiter, the class has two constructors which in the C# version fork off to a common private method.

        public CSV_splitter (string str,
                             char   delimiter) {
           constructor(str, delimiter);
        }

        public CSV_splitter (string str) {
           constructor(str, ',');
        }

        private void constructor(string str,
                                 char   delimiter) {
           this.input = str;
           this.delim = delimiter;

           this.outrec = new SqlDataRecord(
               new SqlMetaData("nnnn", System.Data.SqlDbType.BigInt));

           this.Reset();
        }

     First the constructor saves the input parameters into the private members. Next comes the key part: the constructor creates an instance of
     SqlDataRecord that matches the table type for the table-valued parameter. The constructor for SqlDataRecord accepts an array of
     SqlMetaData objects. (Both these classes are in the Microsoft.SqlServer.Server namespace.) These classes are closely related: the
     raison d'être for SqlMetaData is exactly to describe a single column in an SqlDataRecord and ultimately a column in SQL Server.

     SqlMetaData has a whole slew of constructors to accommodate the various data types in SQL Server and I cover some of the variations
     as we encounter them. For the CSV splitter we use the simplest constructor of them all and pass only the column name and the data type.
     You may note that the column we define in SqlMetaData differs from the column in integer_list_tbltype on two accounts:

      1. The column names are not the same. I have made them different on purpose to show that what names you put in the column definition
         with SqlMetaData has no importance in the context of passing data to TVPs.
      2. The data type is BigInt, while in the table type the column is int. I chose to use BigInt to make the class as general as possible. That
         is, you can use the class with any integer data type. (Of course, I could have used bigint in the table type as well, but since product
         IDs in Northwind are int, I used that type.) As we shall see later, this generalism comes with a price.

     The last line in the constructor is a call to Reset which is one of the methods required by the IEnumerator interface. Its task is to initiate start_ix and end_ix, and we set them to values that indicate that we have not started scanning the string yet:

        public void Reset() {
           this.start_ix = -1;
           this.end_ix   = -1;
        }

     Next comes the part of the class that implements IEnumerable. This interface requires the implementation of a single method:
     GetEnumerator, which should return an object that implements IEnumerator. Since the class implements both interfaces, it returns itself:

        System.Collections.IEnumerator
            System.Collections.IEnumerable.GetEnumerator() {
           return this;
        }

        public System.Collections.Generic.IEnumerator<SqlDataRecord>
            GetEnumerator() {
           return this;
        }

     What is a little tricky is that the interface IEnumerable<T> requires that you also implement the non-generic version (and for some reason, the latter cannot be public).

     The interface IEnumerator requires you to implement three methods MoveNext, Reset and Dispose and one read-only property,
     Current. We have already looked at Reset, and Dispose is only there to permit you to explicitly close files or SQL connections without
     waiting for garbage collection. That leaves MoveNext and Current as the two interesting members.
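
     For completeness, a Dispose that has nothing to release can simply be left empty; here is a minimal sketch (check the source file for the exact code in the demo class):

        public void Dispose() {
           // Nothing to clean up: the class only holds a string and an SqlDataRecord.
        }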


     The purpose of MoveNext is to permit the caller to move to the next value in the iteration which the caller can retrieve with Current. MoveNext is a boolean function and should return true as long as there is a new item to retrieve with Current. If the caller moves past the last item, the method should return false.

     Here is what CSV_splitter.MoveNext looks like:

        public bool MoveNext() {
           this.start_ix = this.end_ix + 1;

           while (this.start_ix < this.input.Length &&
                  this.input[this.start_ix] == this.delim) {
              this.start_ix++;
           }

           if (this.start_ix >= this.input.Length) {
              return false;
           }

           this.end_ix = this.input.IndexOf(this.delim, this.start_ix);
           if (this.end_ix == -1) {
              this.end_ix = this.input.Length;
           }

           return true;
        }

     The first action is to set start_ix to be one step ahead of end_ix. This is followed by a while loop of which the purpose is to skip adjacent delimiters. (Imagine that you have a string like "1,2,,4".) If we at this point find that start_ix is equal to the length of the string, we are past the last character in the string, and we return false to the caller to indicate that the iteration is over.

     Else start_ix is now at the first character in the next value, and we set end_ix to be at the delimiter following this value. If there is no delimiter after the last value, we pretend that there is one anyway. Since there is at least one more value in this case, we return true.

     That is, all we do here is to position start_ix and end_ix. In the Current property we make use of these values. This property should return the same type as IEnumerable<T> was instantiated with, that is, SqlDataRecord. Here is what our Current property looks like:

        public SqlDataRecord Current {
           get {
              string str = this.input.Substring(this.start_ix,
                                                this.end_ix - this.start_ix);
              this.outrec.SetInt64(0, Convert.ToInt64(str));
              return this.outrec;
           }
        }

     We first extract the substring between start_ix and end_ix - 1, and then we convert that value to Int64 to set the only column in outrec, which we then return. From a logical point of view, the code could just as well have read:

        public SqlDataRecord Current {
           get {
              string str = this.input.Substring(this.start_ix,
                                                this.end_ix - this.start_ix);
              SqlDataRecord outrec = new SqlDataRecord(
                  new SqlMetaData("nnnn", System.Data.SqlDbType.BigInt));
              outrec.SetInt64(0, Convert.ToInt64(str));
              return outrec;
           }
        }

     And there would have been no need to have outrec as a variable on class level, but it seemed to me slightly more efficient to create the record once and reuse it.

     When you implement IEnumerable<T> you must also implement a non-generic version, and we just let it invoke the generic version.

        Object System.Collections.IEnumerator.Current {
           get {
              return this.Current;
           }
        }

     What you have seen in MoveNext and Current is fairly normal string-parsing code. There is certainly room for all sorts of improvements:
     multi-character delimiters, alternate delimiters, trim blanks. (A string like "1,2, ,3" will cause a run-time error in Convert.ToInt64.) If
     you want to handle comma-separated lists of strings, you can easily clone the class – or make the type a parameter or make the class
     generic. I leave all these ideas as exercises to the reader.
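
     As a hint of what a string variant could look like, here is a sketch of a Current property for a class that returns strings instead of Int64 values (this is not part of the demo files; it assumes that outrec was created with an nvarchar column, for instance new SqlMetaData("str", SqlDbType.NVarChar, 100)):

        public SqlDataRecord Current {
           get {
              string str = this.input.Substring(this.start_ix,
                                                this.end_ix - this.start_ix);
              // Store the element as a string; Trim removes surrounding blanks.
              this.outrec.SetString(0, str.Trim());
              return this.outrec;
           }
        }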

     To conclude, you can see that implementing a custom iterator to feed a TVP is by no means an advanced matter, and we will leverage this later in this article.



     What About Performance?
     Before we move on to the next demo, we will learn some more theory, mainly from the perspective of performance. The other Arrays and Lists articles discuss performance, so why not this one? In the first two subsections we will look at how TVPs perform compared to methods where we send a comma-separated list or similar to SQL Server. In the last subsection, we will discuss whether the TVP should have a primary key, and how we can improve performance when we know that the data is sorted. In passing, we will also learn how to work with IDENTITY columns and computed columns in the table type.

     Performance in SQL Server
     As you have understood from the fact that I devoted an article solely to table-valued parameters, this is the preferred method for passing a
     list of values to SQL Server. One important reason is simplicity: writing a stored procedure that accepts a table-valued parameter is
     straightforward. Not that using a list-to-table function is a big deal, but relational databases are centred around tables. And as you have
     seen, passing a value to a TVP from ADO .NET is a very simple affair. TVPs also have the advantage that you can add constraints to the
     table type to enforce uniqueness or some other type of contract. Nor do you have to worry in your database code about format errors in a
     comma-separated list.

     Does this also mean that this method gives you the best performance? In general, yes. In each and every case? No. When running the tests for the performance appendix, I did find situations where other methods outperformed TVPs. However, I believe that in the long run TVPs will give you better performance than any other method. There are two reasons for this:

     Data is already in table format. With all other methods, cycles need to be spent on parsing a character string to get the data into table
     format. In my tests, the fixed-length method performed better in some tests with integer data. Indeed, this method just chops up a fixed-
     length string reading from a table of numbers, so it is very similar to reading from a table variable. However, TVPs have one more ace up
     the sleeve:

     The optimizer gets a clue. For all other methods, the optimizer has no understanding of what is going on. In many situations you get a useful plan nevertheless, but with methods based on inline T‑SQL functions the optimizer often loses grip entirely and produces a plan that is nothing but a disaster. And even if the plan is useful, it may not be the most optimal, because the optimizer has no idea how many rows your list of values will produce, which means that if you use the list in a query with other tables, row estimates are likely to be way off.

     This is different for table-valued parameters. Just like table variables, table-valued parameters do not have distribution statistics, but there
     is nevertheless one piece of information: cardinality. That is, the first time you call a procedure that takes a TVP, the optimizer sniffs the TVP – as it sniffs all other parameters – and the optimizer sees that the TVP has so and so many rows. This gives the optimizer better odds for good estimates for the number of rows in the rest of the query.

     Not that it is perfect: There is the general problem that the sniffed value may be atypical. (For a closer discussion on parameter sniffing, see
     my article Slow in the Application, Fast in SSMS?.) And it is not always the case that correct information leads to the best plan; in the performance appendix for SQL 2008 you can read about a case where SQL Server chooses an incorrect plan when it has more accurate cardinality information. But as I discuss in the appendix, this concerns only a window of the input size. Furthermore, cardinality is far from sufficient in
     all cases. Consider the query:

        SELECT * FROM Orders WHERE CustomerID IN (SELECT custid FROM @custids)

     Say that there are four values in @custids. If they are just four plain customers, seeking the non-clustered index on CustomerID is good.



     But if they are the top four customers accounting for 40 % of the volume, you want a table scan. But since a TVP does not have distribution statistics, the optimizer cannot distinguish the cases. The workaround is simple: bounce the data over a temp table and take advantage of the fact that temp tables have distribution statistics. Since that workaround is the same as for all list-to-table functions, you may argue that when you need to do this, there is no special performance advantage of TVPs.

     Client-side Performance
     A reasonable question is: do TVPs incur more calling overhead than regular parameters? The answer is yes. In my tests I found that
     passing 50 000 integer values to an unindexed TVP from ADO .NET took 40-50 ms compared to 20-35 ms for a comma-separated list.
     (Note that these numbers apply to the specific hardware that I used for the tests.) For a TVP with a primary key, the overhead was around
     150 ms.

     While this overhead may seem considerable, you need to put it in perspective and look at the total execution time, and in most cases, the server-side execution time exceeds the numbers in the previous paragraph by a wide margin. As just one data point: in my test, the server-side execution time for my join test over 50 000 list elements was 213 ms for a non-indexed TVP, and the best non-TVP method (fixed-length binary input) needed 420 ms. The performance appendix for SQL 2008 has more details.

     As for the extra overhead when there is a primary key, we will discuss this more closely in the next section.

     Primary Keys and Sorted Data – Looking Closer at the SqlMetaData constructors
     The SqlMetaData class has no less than 15 constructors. They control in total 17 read-only properties – i.e., once set you can't change them. To a great extent which constructor to use depends on the data type. For a string or a binary column you use a constructor that includes the maxLength parameter, for a decimal column you need one that exposes scale and precision etc. I am not covering all constructors and properties here, but I refer you to the .NET documentation.

     Here I will discuss four parameters to control special properties for table-valued parameters. They appear in several constructors, and a
     constructor either has all four or none of these parameters. Here is the C# declaration for the simplest of these constructors:

        public SqlMetaData(
           string    name,
           SqlDbType dbType,
           bool      useServerDefault,
           bool      isUniqueKey,
           SortOrder columnSortOrder,
           int       sortOrdinal)

     The first of these parameters, useServerDefault, serves a different purpose than the other three. You may guess from the name what it is all about, but your guess may not be exactly right. When you specify this parameter as true, SQL Server will ignore any value you set for the column but always set the column to its default value. Sounds corny? Here is the scoop: the SqlDataRecord must have exactly as many columns as your table type has. But what if your table type includes an IDENTITY column or a computed column which you cannot assign values to? It is for that sort of column that you specify useServerDefault as true. It's also useful for columns with a default of
     newid() or NEXT VALUE FOR. (The latter is for sequences, a feature added in SQL 2012.)
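
     As a sketch with hypothetical column names (and assuming using directives for System.Data, System.Data.SqlClient and Microsoft.SqlServer.Server), a record definition for a table type whose first column is an IDENTITY column could look like this; with useServerDefault set to true, SQL Server generates the value for that column itself, and the remaining three parameters are discussed next:

        SqlMetaData[] tvpdef = {
           // id maps to the IDENTITY column: useServerDefault = true, not part of any sort key.
           new SqlMetaData("id",   SqlDbType.Int,
                           true,  false, SortOrder.Unspecified, -1),
           // name is a plain data column that we set ourselves.
           new SqlMetaData("name", SqlDbType.NVarChar, 50,
                           false, false, SortOrder.Unspecified, -1)
        };
        SqlDataRecord rec = new SqlDataRecord(tvpdef);
        rec.SetString(1, "Alpha");   // Only the name column gets a value from us.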

     The other three parameters, isUniqueKey, columnSortOrder and sortOrdinal, are related and they exist in order to permit a performance enhancement. But before we can discuss what purpose they serve and how they work, we need to take one step back and look at the declaration for the table type we used with get_product_names.

        CREATE TYPE integer_list_tbltype AS TABLE (n int NOT NULL PRIMARY KEY)

     The table type has a primary key, and thus it assumes that the values in the TVP are unique. Is this a good thing? To start with, when you
     design your tables, you should always look for a natural primary key, and this includes table variables and temp tables. One reason is that if
     you write your code under the assumption that a certain column or a set of columns is unique, you should also state this in the table
     declaration as an assertion. If your assumption is incorrect, your code will die early and not produce incorrect results.

     But there is also a performance aspect. Let's look at the code for get_product_names again:

        CREATE PROCEDURE get_product_names @prodids integer_list_tbltype READONLY AS
           SELECT p.ProductID, p.ProductName
           FROM   Northwind.dbo.Products p
           WHERE  p.ProductID IN (SELECT n FROM @prodids)

     For SQL Server, the query is equivalent to:

        SELECT p.ProductID, p.ProductName
        FROM   Northwind.dbo.Products p
        JOIN   @prodids ps ON ps.n = p.ProductID

     If the table type did not have a primary key, this would not be true. Instead the equivalent query would be:

        SELECT DISTINCT p.ProductID, p.ProductName
        FROM   Northwind.dbo.Products p
        JOIN   @prodids ps ON ps.n = p.ProductID

     That is, SQL Server would have to add an operator somewhere to remove duplicate values. This comes with an extra cost. Of course, for four values in the TVP this is entirely negligible, but assume that there are 50 000 values. Now the difference is starting to be measurable, and you can see this in the performance appendix.

     However, as I noted above, I found in my performance tests that there is a considerable difference in overhead between passing data to a TVP with a primary key and to one without. And indeed, from what I have said this far, this is a zero-sum game. If the TVP has a primary key, there is no need for a Sort or Hash operator in the query above to remove duplicates. But when the data arrives, SQL Server must sort it so that it can be stored according to the index. Only if the TVP is used in more than one query is there a performance gain with the primary key.

     If we don't know anything about the data we are passing to SQL Server, we can't do any better. But what if we know that the data already is sorted according to the index? This is where the three parameters isUniqueKey, columnSortOrder and sortOrdinal come into play. They permit you to specify that the data is sorted and how. isUniqueKey should be true in this case. columnSortOrder can take any of the values SortOrder.Unspecified, SortOrder.Ascending and SortOrder.Descending. (The SortOrder enum is in the System.Data.SqlClient namespace.) For sorted data you would use any of the latter two; Unspecified is the value you use when you use a constructor with these parameters only to be able to specify true for useServerDefault. Finally, sortOrdinal specifies where in the unique key the column appears. Use 0 for the first column in the key, 1 for the second etc. Use ‑1 for SortOrder.Unspecified. (If you want to see an example of this, stay tuned. They will be coming.)

     You need to use these parameters with care. It goes without saying that you need to ensure that the data you have really is sorted. If you
     sort the data or create the sort keys yourself, you have control, but it may be precarious to rely on data coming from an outside source to
     be sorted. If you are mistaken, SQL Server will not let you get away with it, but produce an error message like this one.

           Msg 4819, Severity 16, State: 1, Procedure , Line no: 0
           Cannot bulk load. The bulk data stream was incorrectly specified as sorted or the
           data violates a uniqueness constraint imposed by the target table. Sort order
           incorrect for the following two rows: primary key of first row: (gamma), primary
           key of second row: (delta).
           Msg 3621, Severity 0, State: 0, Procedure , Line no: 1
           The statement has been terminated.
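
      If you take the route of sorting the data yourself before sending the rows, a minimal sketch of that idea (my own, not part of the demo code) could look like this; it assumes the rows sit in a List&lt;SqlDataRecord&gt; whose first column is the int key that the sort hints describe.

           using System.Collections.Generic;
           using Microsoft.SqlServer.Server;

           static class TvpSortHelper
           {
              // Hypothetical helper: sort the rows on the first (int) column so that an
              // Ascending/isUniqueKey hint in the SqlMetaData actually holds when the
              // list is passed as the value of the table-valued parameter.
              public static void SortByIntKey(List<SqlDataRecord> rows)
              {
                 rows.Sort((a, b) => a.GetInt32(0).CompareTo(b.GetInt32(0)));
              }
           }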

      There is another thing to watch out for, and in this case SQL Server will stay silent. Inspired by what we have read, we may get the idea to
      change the constructor for the CSV_splitter class, so that outrec is created in this way:

           this.outrec = new SqlDataRecord(
              new SqlMetaData("nnnn", SqlDbType.BigInt,
                              false, true, SortOrder.Ascending, 0));

      If you make this change and then run the CSVdemo program, you will find that it runs just fine. But wait! In the procedure CSV_to_SQL
      there is this line:

           cmd.Parameters("@prodids").Value = New TVPDemo.CSV_splitter("1, 11, 76, 34")

     Data is out of order, so an error message is to be expected. Still we did not get any. Why? Recall that CSV_splitter uses BigInt to be as
     reusable as possible, while the table type has an integer column. Because of the data-type mismatch, SQL Server decides to ignore the
     information that the data is sorted and sorts it anyway. If you change the type to SqlDbType.Int and try again, you will get the error
     message above.

     Thus, to be sure that SQL Server does not decide to sort behind your back, you should make sure that you create the SqlDataRecord
     object so that it matches your table type exactly. To be precise, you can have a mismatch as long as SQL Server feels that it can trust the
     conversion to not affect the sort order. If you want to be sure, the simplest way to test is to send data out of order. If you get error
      message 4819, the plot worked, else it did not. You can also use Profiler, include the event Performance: Showplan XML Statistics
      Profile and run the application. If you also add SP:StmtCompleted you will see the insertion into the TVP as encrypted text. This helps
     you to locate the query plan, and it should not include a Sort or Hash operator.
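
      To illustrate the out-of-order test, here is a minimal sketch of my own (not part of the demo files). It assumes cmd is an SqlCommand already set up with a TVP parameter, as in the earlier examples, and that its rows are deliberately out of order.

           using System.Data.SqlClient;

           static class SortTrustCheck
           {
              // Returns true if SQL Server raised error 4819, i.e. it trusted the sort
              // hints and detected that the stream was not sorted. No error means the
              // hints were ignored and the data was sorted server-side anyway.
              public static bool SortHintsAreTrusted(SqlCommand cmd)
              {
                 try
                 {
                    cmd.ExecuteNonQuery();
                    return false;
                 }
                 catch (SqlException ex)
                 {
                    if (ex.Number == 4819)
                    {
                       return true;
                    }
                    throw;
                 }
              }
           }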

     Character data is particularly difficult in this context. The first thing to note is that the length must match. That is, if you define the column in
     .NET as

           new SqlMetaData("charcol", SqlDbType.NVarChar, 120,
                           false, true, SortOrder.Ascending, 0);

     but the target column is nvarchar(20), SQL Server will ignore your sorting parameters and sort the data. Another complication is that
     character data can be sorted in many ways, that is, according to different collations. If you look through the constructors for
      SqlMetaData you will find two parameters localeID and compareOptions which seem like they could be used to specify the
     collation. I tested this, but I found that they had no effect. From what I can tell, SQL Server assumes that character data is always sorted
     according to the database collation. If the data you send with the TVP is sorted according to a different collation, you will get an error once
     there is a deviation. You can of course specify an explicit collation for the column in the table type, and it may save you from error
     messages about data being non-unique. However, my testing indicates that if a key column has a different collation from the database
     collation, SQL Server will ignore the sorting parameters and always sort the incoming data stream.
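
      As a concrete illustration (my own sketch, not from the demo files), here is what a declaration for the hypothetical charcol column could look like when the length matches the nvarchar(20) column in the table type exactly, which is a precondition for the sort hints being honoured at all.

           using System.Data;
           using System.Data.SqlClient;
           using Microsoft.SqlServer.Server;

           static class CharColumnMetaData
           {
              // "charcol" is the hypothetical nvarchar(20) key column; the length 20
              // matches the table type, and the hints say the data is unique and ascending.
              public static readonly SqlMetaData CharCol =
                 new SqlMetaData("charcol", SqlDbType.NVarChar, 20,
                                 false, true, SortOrder.Ascending, 0);
           }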


     Loading Data through Table-Valued Parameters
     We will look at one more example. This time we will see how we can use table-valued parameters to easily load lots of data to SQL
     Server. There are several other options for this task: BCP, BULK INSERT, SQL Server Integration Services and the SqlBulkCopy class.
      But none of these options permits you to send data directly to a stored procedure. We will learn two ways to do this: the plain way, where we
      read the file into memory, and a more efficient way, where we stream the file to the TVP. I've taken the opportunity to cover some ground
     beyond the topic of TVPs, so you may learn some other tricks in this chapter as well.

     The Setup
     For this example we will look at loading data into these two tables:

           CREATE TABLE Albums (AlbumID     int            IDENTITY,
                                Artist      nvarchar(200)  NOT NULL,
                                Title       nvarchar(200)  NOT NULL,
                                ReleaseDate date           NULL,
                                Length      time(0)        NULL,
                                CONSTRAINT pk_Albums PRIMARY KEY (AlbumID)
           )

           CREATE TABLE Tracks (AlbumID   int            NOT NULL,
                                TrackNo   tinyint        NOT NULL,
                                Title     nvarchar(200)  NOT NULL,
                                Length    time(0)        NULL,
                                CONSTRAINT pk_Tracks PRIMARY KEY (AlbumID, TrackNo),
                                CONSTRAINT fk_Tracks_Albums FOREIGN KEY (AlbumID)
                                   REFERENCES Albums (AlbumID)
           )

      We have a music collection: Albums includes information about an album, and Tracks details the tracks for the albums. All in all, a
      fairly typical master-detail scenario. These table definitions, as well as other SQL code in this chapter, are included in the file
      fileloaddemo.sql which you find among the demo files.

     Our task is to load new albums with their tracks into the database, from the file Albums.csv which is also included in the demo files. Here
     are some sample lines from this file:

           A,Adrian Belew,Desire Caught By the Tail,,33:25
           T,1,Tango Zebra,458553,
           T,2,Laughing Man,332460,
           T,3,The Gypsy Zurna,187141,
           T,4,Portrait of Margaret,240718,
           T,5,Beach Creatures Dancing Like Cranes,197564,
           T,6,At the Seaside Cafe,113319,
           T,7,Guernica,127216,
           T,8,"""Z""",338416,
           A,"Al di Meola, John McLaughlin, Paco de Lucia",Friday Night in San Francisco,8/10/1981,42:09
           T,1,A. Mediterranean Sundance-B. Rio Ancho,708780,
           T,2,Short Tales of the Black Forest,535484,
           T,3,Frevo Rasgado,486608,
           T,4,Fantasia Suite,541492,
           T,5,Guardian Angel (McLaughlin),247066,
           A,David Bowie,"""Heroes""",14/10/1977,40:56
           T,1,Beauty and the Beast,217182,

      The first field defines whether the line contains an album (A) or a track (T). On an Album line, the fields are Artist, Album title, Release
      date and Length in minutes and seconds. On a Track line, the fields are Track number, Track title and Length in milliseconds. That is,
      the fields are the same as in the tables, except for one thing: there is no AlbumID. It is part of our loading task to assign this id.

     As for the format, you can note that some fields are quoted in double quotes, but this happens only when the field includes a comma or a
     double quote. Some fields include a plethora of double quotes; this happens when the double quotes are part of the value. (You may recall
     that the name of David Bowie's classic album from 1977 really is "Heroes" with quotes and all.)

      An aside: you cannot load this file with BCP or BULK INSERT in a simple way. To start with, they cannot really cope with master-detail
      formats at all, but you would have to load the data into a staging table to be able to separate albums and tracks. And this is only possible if
      there is an equal number of fields on each line. As it happens, Excel – which I used to create this file – was kind enough to add an extra comma at
      the end of the Tracks lines, so this is not an issue here. Instead, the real killer is the inconsistent quoting. As long as a field is consistently
      quoted through a file, you can load quoted fields with BCP or BULK INSERT, if you use a format file that specifies delimiters that include the
      quotes. But in this file, where only some values are in quotes and where these values include the field delimiter, BCP and BULK INSERT are
      completely lost. These tools are designed to read a binary stream, and they do not do string parsing.

     We need two table types and a stored procedure. The table types mirror the file with one addition:

           CREATE TYPE Albums_tbltype AS TABLE
             (TempID      int            NOT NULL,
              Artist      nvarchar(200)  NOT NULL,
              Title       nvarchar(200)  NOT NULL,
              ReleaseDate date           NULL,
              Length      time(0)        NULL,
              PRIMARY KEY (TempID)
           )

           CREATE TYPE Tracks_tbltype AS TABLE
             (TempID    int            NOT NULL,
              TrackNo   tinyint        NOT NULL,
              Title     nvarchar(200)  NOT NULL,
              Length    time(0)        NULL,
              PRIMARY KEY (TempID, TrackNo)
           )
          go

      Since there is no album ID in the file, the loading process must assign new ids. As long as we have the file, we know which tracks go
      with which albums, since the file is ordered. But when we load the data into different tables that order is lost, since tables are unordered
      objects by definition. For this reason, both table types include a column TempID, which is a temporary ID that uniquely identifies an
      album during the loading process.

     The stored procedure is worth dwelling on for an extra second:

           CREATE PROCEDURE LoadAlbums @Albums Albums_tbltype READONLY,
                                       @Tracks Tracks_tbltype READONLY AS

           DECLARE @idmap TABLE (TempID  int NOT NULL PRIMARY KEY,
                                 AlbumID int NOT NULL UNIQUE)

           SET XACT_ABORT ON
           BEGIN TRANSACTION


           MERGE Albums A
           USING @Albums T ON 1 = 0
           WHEN NOT MATCHED BY TARGET THEN
              INSERT(Artist, Title, ReleaseDate, Length)
                 VALUES(T.Artist, T.Title, T.ReleaseDate, T.Length)
           OUTPUT T.TempID, inserted.AlbumID INTO @idmap(TempID, AlbumID)
           ;

           INSERT Tracks(AlbumID, TrackNo, Title, Length)
              SELECT i.AlbumID, T.TrackNo, T.Title, T.Length
              FROM   @Tracks T
              JOIN   @idmap i ON i.TempID = T.TempID

           COMMIT TRANSACTION
           go

     To start with, the procedure sets up a user-defined transaction so that we don't end up loading only the albums. While I am a strong
     advocate of error handling, I don't use TRY-CATCH here. In the interest of brevity, I let it suffice with SET XACT_ABORT ON to make sure
     that any error aborts and rolls back the transaction. (If you want directions for error handling, please see my article Error Handling in
     SQL Server 2005 and Later.)

     What may surprise the reader is the MERGE statement. This is a pure insert operation (for the sake of the example, I am completely
     ignoring that the album may already be in the database), so why use MERGE? And with that weird condition 1 = 0? By using this condition
      we make sure that no rows in the source match the target. That is, all rows in @Albums will match the condition WHEN NOT MATCHED
      BY TARGET, and thus all rows in @Albums will be inserted into Albums. Or in other words, this is a complicated way of saying:

           INSERT Albums(Artist, Title, ReleaseDate, Length)
              SELECT Artist, Title, ReleaseDate, Length
              FROM   @Albums
                 Abm

      Why all this? The answer lies in the OUTPUT clause. We need to map the TempID in @Albums to the IDENTITY values generated for
      AlbumID in Albums, so that we can insert the correct AlbumID values into Tracks, and this is the purpose of the table variable @idmap.
     If you try to make the mapping with INSERT, you will find that this does not work, because in the OUTPUT clause for INSERT you only
     have access to the columns in the target table. This is different with MERGE; with MERGE you have access to both target and source
     columns in the OUTPUT clause.

      When inserting into Tracks there is no need for extra fireworks, and we can use a plain INSERT where we pick up the album IDs from the
      @idmap table.

     Take One: Reading the File Into a List
      There are two example programs to load the file, and we will first look at fileloaddemo1.cs which reads the file into two
      List<SqlDataRecord>, one for albums and one for tracks. This program starts off with a number of using clauses, of which one may be
      surprising:

           using System;
           using System.Data;
           using System.Data.SqlClient;
           using System.Collections.Generic;
           using Microsoft.SqlServer.Server;
           using Microsoft.VisualBasic.FileIO;

      System is needed as always of course, and System.Data includes SqlDbType and more. SqlClient is what this text is all about. We
     need System.Collections.Generic for the class List<T>, and as noted previously we get SqlDataRecord and SqlMetaData from
     Microsoft.SqlServer.Server. But staunch fans of C# may be appalled by the appearance of Visual Basic here. As I pointed out above,
     the format of this file is somewhat complex. While I could have written the code to parse the lines on my own, I said to myself "this file has
     been generated by Excel; there must be code out there that performs this task". So I did a search on Google, and I was quickly pointed to
     the class TextFieldParser that exists in the namespace Microsoft.VisualBasic.FileIO.

     If you want to use this class from C#, you need to add a reference to Microsoft.VisualBasic.dll. VB programmers get this DLL
     automatically.

      Fileloaddemo1 includes two routines of interest, read_file and load_albums. The latter first calls read_file and then calls the
      stored procedure LoadAlbums. We will look at read_file first. Here is the declaration:

           private static void read_file(string filename,
                                         out List<SqlDataRecord> albums,
                                         out List<SqlDataRecord> tracks) {

      It accepts a file name and returns album and track data in the two List parameters. The first few lines in read_file are pretty dull:

           System.Globalization.DateTimeFormatInfo no_culture =
              new System.Globalization.DateTimeFormatInfo();
           System.Globalization.DateTimeStyles no_datetime_style =
              System.Globalization.DateTimeStyles.None;

           int album_no = 0;

      The two items from System.Globalization are some jazz needed when we parse the date and time fields; I'll return to them later. The
      variable album_no is more interesting: this variable will feed the TempID columns in the table parameters.

     The next two statements are significantly hotter, because this is where we set up the SqlMetaData definitions that map to our table types:

           SqlMetaData[] albums_tbltype =
             {new SqlMetaData("id", SqlDbType.Int, false,
                              true, SortOrder.Ascending, 0),
              new SqlMetaData("artist", SqlDbType.NVarChar, 200),
              new SqlMetaData("album", SqlDbType.NVarChar, 200),
              new SqlMetaData("released", SqlDbType.Date),
              new SqlMetaData("length", SqlDbType.Time, 0, 0)};

           SqlMetaData[] tracks_tbltype =
             {new SqlMetaData("id", SqlDbType.Int, false,
                              true, SortOrder.Ascending, 0),
              new SqlMetaData("trackno", SqlDbType.TinyInt, false,
                              true, SortOrder.Ascending, 1),
              new SqlMetaData("title", SqlDbType.NVarChar, 200),
              new SqlMetaData("length", SqlDbType.Time, 0, 0)};

      Unlike the CSV_splitter class, we don't create any SqlDataRecord at this point; since we are adding to a List, we will need a
      new SqlDataRecord object each time. Hence, we only create the SqlMetaData arrays in advance.

      Here we see some more examples of using the special parameters for the SqlMetaData constructor to specify that the data is sorted. For
      albums_tbltype there is a single column in the sort key, while for tracks_tbltype there is a composite key, and as you see we specify
      that both columns are unique. We set the parameter sortOrdinal to 0 and 1 respectively. Admittedly, to some extent this contradicts
      what I said previously about ensuring that the data is sorted. The id columns are no problem; we are generating the id values in our code
      and we have full control over them. But the track numbers come from the file, and in a real-world scenario we may not be able to rely on
      the track numbers coming in numeric order.

     Here are also examples of SqlMetaData constructors where we specify the length for the string columns. For the time columns we need
     to use a constructor that exposes scale and precision, even if time only has one of them.

     In the C# version we create the lists at this point:

           albums = new List<SqlDataRecord>();
           tracks = new List<SqlDataRecord>();

      (In the VB code this happens in load_albums since VB did not seem to like it when I passed uninitialised variables.)

     Next we open the file by creating a TextFieldParser object:

           TextFieldParser fp = new TextFieldParser(filename,
                                    System.Text.Encoding.Default);

      In total, this class offers eight different constructors. For this demo, we use one where we pass the name of the file (there are also
      constructors that accept a Stream object instead) and the encoding. The default for the TextFieldParser is UTF-8, but CSV files from Excel
      appear to always be ANSI files. (And since one of the tracks from "Heroes" is called Neuköln, it matters for the sample file.)
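
      As an aside, if you prefer not to depend on the ANSI code page of the machine, you can name the code page explicitly when you open the parser. This is a sketch of my own, assuming the Windows-1252 (Western European) code page; it is not what the demo program does.

           using System.Text;
           using Microsoft.VisualBasic.FileIO;

           static class ParserFactory
           {
              // Open the parser with an explicit code page instead of Encoding.Default.
              public static TextFieldParser OpenAnsiCsv(string filename)
              {
                 return new TextFieldParser(filename, Encoding.GetEncoding(1252));
              }
           }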


     We need to configure our newly created object:

           fp.TextFieldType = FieldType.Delimited;
           fp.Delimiters = new String[] {","};
           fp.HasFieldsEnclosedInQuotes = true;

      The TextFieldParser can handle both delimited files and fixed-length formats. Here we set up the file to be comma-delimited. We also
      specify that there are fields enclosed in quotes. Yet another option, which we don't make use of, is to specify comment tokens.

     Once this is done, we have completed the preparations and can read the file.

           while (! fp.EndOfData) {
              String[] fields = fp.ReadFields();

     The ReadFields method consumes the next set of fields and returns them in a string array. (If the file does not comply with the expected
     format, the method will throw an exception, but I have not included any error handling to keep the example down in length.) Depending on
      fields[0] we take different paths:

           if (fields[0] == "A") {
              SqlDataRecord album_rec = new SqlDataRecord(albums_tbltype);

              album_rec.SetInt32(0, ++album_no);
              album_rec.SetString(1, fields[1]);
              album_rec.SetString(2, fields[2]);

      If we have an A in the first field, we create an SqlDataRecord that aligns with the table type for albums, and then we go on and populate
      the fields, using various Set methods of the SqlDataRecord class. Above, we save the temporary id (which we first increment), the artist
      name and the album title. As you see, we refer to the columns by number, starting at 0. If you prefer to access the columns by name, you
      need to use the GetOrdinal method:

           album_rec.SetInt32(album_rec.GetOrdinal("id"), ++album_no);
           album_rec.SetString(album_rec.GetOrdinal("artist"), fields[1]);
           album_rec.SetString(album_rec.GetOrdinal("album"), fields[2]);

     Note here that you need to use the name you specified in the SqlMetaData constructor; you cannot use the names in the table type.

     The ReleaseDate column is a little more complex for two reasons: it is permitted to be NULL, and date formats are always problematic.
     Here is the code:

           DateTime releasedate;
           bool date_ok = DateTime.TryParseExact(
                          fields[3], "d/MM/yyyy", no_culture, no_datetime_style,
                          out releasedate);
           if (date_ok) {
              album_rec.SetDateTime(3, releasedate);
           }
           else {
              album_rec.SetDBNull(3);
           }

      We use TryParseExact to see if there is a legit date in the field. It just so happens that the dates in the file are in the format
      DD/MM/YYYY, because I generated the CSV file with my regional settings set to English (Australia). (Had I used my regular Swedish
      settings, the CSV file would have had semicolon as delimiter, which would have been less interesting with regards to the double quotes.)
      When reading dates from text input – be that a file or a text box – you should never assume that all dates are well-formed. There may be a
      mix of different date formats, and there may be completely bogus dates like 1992-02-30. If the parsing succeeds, we set the date column
      in album_rec, else we set it to NULL.

      When I composed this demo program, the release date proved to be the most difficult to get right. It turned out that it is not sufficient to
      specify an exact date format. When I tested the program, I had switched back to Swedish settings where the date format is
      YYYY-MM-DD. Eventually I found that I could not leave the third parameter null, but I had to use an explicit value to state that I wanted
      to ignore regional settings, whence this no_culture. no_datetime_style is the value for an enum parameter which is mandatory with
      TryParseExact.
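
      An alternative I could have used (a sketch of my own, not what fileloaddemo1 does) is CultureInfo.InvariantCulture, which likewise ignores the regional settings of the machine:

           using System;
           using System.Globalization;

           static class DateParsing
           {
              // Parse a release date on the d/MM/yyyy form, ignoring regional settings.
              public static bool TryParseReleaseDate(string field, out DateTime releasedate)
              {
                 return DateTime.TryParseExact(field, "d/MM/yyyy",
                                               CultureInfo.InvariantCulture,
                                               DateTimeStyles.None,
                                               out releasedate);
              }
           }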



     The last album field is the length, which is handled similarly:

           DateTime length;
           bool length_ok = DateTime.TryParseExact(
                            fields[4], new string [] {"h:m:ss", "m:ss"},
                            no_culture, no_datetime_style,
                            out length);
           if (length_ok) {
              album_rec.SetTimeSpan(4, length.TimeOfDay);
           }
           else {
              album_rec.SetDBNull(4);
           }

      The only thing that is different is that I permit time formats both with and without hours, since an album may exceed one hour in length.

      When all this is done, we add the record to the albums list:

           albums.Add(album_rec);

     The code for dealing with the tracks data is similar with a sanity check added.

           else if (fields[0] == "T") {
              if (album_no == 0) {
                 throw new Exception("Bad file format: track rows before the first album row!");
              }
              SqlDataRecord track_rec = new SqlDataRecord(tracks_tbltype);

              track_rec.SetInt32(0, album_no);
              track_rec.SetByte(1, Convert.ToByte(fields[1]));
              track_rec.SetString(2, fields[2]);

              if (fields[3] != "") {
                 TimeSpan length = TimeSpan.FromMilliseconds(Convert.ToInt32(fields[3]));
                 track_rec.SetTimeSpan(3, length);
              }
              else {
                 track_rec.SetDBNull(3);
              }

              tracks.Add(track_rec);
           }
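
      As I noted earlier, ReadFields itself throws an exception if a line in the file is malformed, and the demo program does not handle that. A minimal sketch of my own (not part of fileloaddemo1) of catching the exception and skipping the bad line could look like this:

           using System;
           using Microsoft.VisualBasic.FileIO;

           static class SafeReader
           {
              // Returns the fields of the next line, or null if the line was malformed.
              public static string[] TryReadFields(TextFieldParser fp)
              {
                 try
                 {
                    return fp.ReadFields();
                 }
                 catch (MalformedLineException)
                 {
                    Console.Error.WriteLine("Skipping malformed line {0}: {1}",
                                            fp.ErrorLineNumber, fp.ErrorLine);
                    return null;
                 }
              }
           }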

      Here is the code for load_albums:

           private static void load_albums() {

              List<SqlDataRecord> albums;
              List<SqlDataRecord> tracks;

              read_file("albums.csv", out albums, out tracks);

              using (SqlConnection cn = TVPDemo.DemoHelper.setup_connection())
              using (SqlCommand cmd = cn.CreateCommand()) {

                 cmd.CommandType = CommandType.StoredProcedure;
                 cmd.CommandText = "dbo.LoadAlbums";

                 cmd.Parameters.Add("@Albums", SqlDbType.Structured);
                 cmd.Parameters["@Albums"].Direction = ParameterDirection.Input;
                 cmd.Parameters["@Albums"].TypeName = "Albums_tbltype";
                 cmd.Parameters["@Albums"].Value = albums;

                 cmd.Parameters.Add("@Tracks", SqlDbType.Structured);
                 cmd.Parameters["@Tracks"].Direction = ParameterDirection.Input;
                 cmd.Parameters["@Tracks"].TypeName = "Tracks_tbltype";
                 cmd.Parameters["@Tracks"].Value = tracks;



                 cmd.ExecuteNonQuery();
              }
           }

      It first calls read_file to fill albums and tracks, and then it calls the stored procedure LoadAlbums. The only difference to the code
      we saw for the comma-separated list is this line:

           cmd.Parameters["@Albums"].Value = albums;

      That is, we pass a List object and not a custom-written iterator.

      While this code may seem trivial, it is worth emphasising the flexibility. Here we pass a List object, but if we were to change our mind and
      want to pass something else, we could do that very easily. In fact, this is exactly what we will do in a second.

     Take Two: Streaming the File
     The sample file that comes with this article is short; there are only five albums. But imagine that you have a very large file, tens of
     megabytes in size. With the solution above, you would have to read the entire file into memory before you start sending the data to SQL
      Server. Is that really necessary? No, and you might already have guessed how we can approach this. If we can write a custom iterator for
      a comma-separated list, we should be able to write an iterator that reads the file, so that SqlClient can send a row to SQL Server as soon
      as we have read it.

      Now, the fact that this is a master-detail file makes this a little more complicated. If we have two TVPs, we would need two iterator
      classes, one for albums and one for tracks. And these classes would both have to read the file, which thus would be read twice. To avoid
      this, I decided to use a single table type that can accommodate both row types. Depending on how different headers and details are from
      each other, this can be quite messy. Thankfully, our albums-and-tracks example is quite forgiving in this sense. (At this point I can sense
      objections from some readers who think that it is possible to have two table types and still only read the file once. Permit me to come back
      to this idea after I have gone through the streaming example.)

     Here is the table type (which you also find in fileloaddemo.sql):

           CREATE TYPE AlbumTracks_tbltype AS TABLE
             (TempID      int            NOT NULL,
              TrackNo     tinyint        NOT NULL,
              Artist      nvarchar(200)  NULL,
              Title       nvarchar(200)  NOT NULL,
              ReleaseDate date           NULL,
              Length      time(0)        NULL,
              PRIMARY KEY (TempID, TrackNo),
              CHECK (TrackNo = 0 AND Artist IS NOT NULL
                  OR TrackNo > 0 AND Artist IS NULL
                                 AND ReleaseDate IS NULL)
           )

     I did not add the A/T field to the table type; instead I use TrackNo as the distinguishing column; TrackNo = 0 indicates that this is a
     header row. I've also added constraints to state rules that are unique for album rows (Artist must be present) and track rows (must not
     have Artist and ReleaseDate). Such CHECK constraints help to detect errors in the client program.

     To use this table type, there is a second stored procedure, similar to the one we looked at previously:

           CREATE PROCEDURE LoadAlbums_2 @AlbumTracks AlbumTracks_tbltype READONLY AS

           DECLARE @idmap TABLE (TempID  int NOT NULL PRIMARY KEY,
                                 AlbumID int NOT NULL UNIQUE)

           SET XACT_ABORT ON
           BEGIN TRANSACTION

           MERGE Albums A
           USING (SELECT TempID, Artist, Title, ReleaseDate, Length
                  FROM   @AlbumTracks
                  WHERE  TrackNo = 0) AT ON 1 = 0
           WHEN NOT MATCHED BY TARGET THEN
              INSERT(Artist, Title, ReleaseDate, Length)
                 VALUES(AT.Artist, AT.Title, AT.ReleaseDate, AT.Length)
           OUTPUT AT.TempID, inserted.AlbumID INTO @idmap(TempID, AlbumID)
           ;

           INSERT Tracks(AlbumID, TrackNo, Title, Length)
              SELECT i.AlbumID, AT.TrackNo, AT.Title, AT.Length
              FROM   @AlbumTracks AT
              JOIN   @idmap i ON i.TempID = AT.TempID
              WHERE  AT.TrackNo > 0

           COMMIT TRANSACTION

      In the demo files, there is a sample program fileloaddemo2.vb that calls this procedure. Here is the code that calls LoadAlbums_2:

           Private Sub LoadAlbums

              Using cn As SqlConnection = TVPDemo.DemoHelper.SetupConnection(), _
                    cmd As SqlCommand = cn.CreateCommand()

                 cmd.CommandType = CommandType.StoredProcedure
                 cmd.CommandText = "dbo.LoadAlbums_2"

                 cmd.Parameters.Add("@AlbumTracks", SqlDbType.Structured)
                 cmd.Parameters("@AlbumTracks").Direction = ParameterDirection.Input
                 cmd.Parameters("@AlbumTracks").TypeName = "AlbumTracks_tbltype"

                 cmd.Parameters("@AlbumTracks").Value = _
                     New TVPDemo.AlbumReader("Albums.csv")

                 cmd.ExecuteNonQuery()
              End Using
           End Sub

      You have seen this pattern a couple of times now. What is different from fileloaddemo1 is that there is no call to read_file, but instead
      there is an instantiation of the class TVPDemo.AlbumReader. This is analogous to when we worked with comma-separated strings of
      integers, and when we look inside TVPDemo.AlbumReader.vb there is a mix of what we saw in CSV_splitter.cs and
      fileloaddemo1.cs. The most startling difference may be that this time I show the code in Visual Basic... So I will rush through the code
      fairly quickly. The important takeaway is that writing a class that streams a file to a TVP is by no means complicated.

      Here is the Imports section:

           Imports System
           Imports System.Data
           Imports System.Collections.Generic
           Imports Microsoft.SqlServer.Server
           Imports Microsoft.VisualBasic.FileIO

      Again Microsoft.VisualBasic.FileIO is featured, but I would like to remind you that the choice of the TextFieldParser class is due to
      the specific file format. While it is likely to be useful for CSV files in general, you may have a file format for which it is less suitable.
      In particular, it is not the case that you need to use this class just because you are streaming to a TVP.

     The class declaration:

           Public Class AlbumReader
              Implements IEnumerable(Of SqlDataRecord), _
                         IEnumerator(Of SqlDataRecord)

     is no different from the CSV_splitter. I implement both interfaces in the same class. There are some global members:

           Dim AlbumNo As Integer                     ' Current album.
           Dim fp As TextFieldParser                  ' Our file-reading class.
           Dim OutRec As SqlDataRecord                ' The record we use to return data.

           Dim NoCulture As New System.Globalization.DateTimeFormatInfo
           Dim NoDateTimeStyle As System.Globalization.DateTimeStyles = _
               System.Globalization.DateTimeStyles.None


www.sommarskog.se/arrays-in-sql-2008.html#Invoking                                                                                              21/27
Arrays and lists in sql server 2008
Arrays and lists in sql server 2008
Arrays and lists in sql server 2008
Arrays and lists in sql server 2008
Arrays and lists in sql server 2008
Arrays and lists in sql server 2008

Contenu connexe

Tendances

1 - Introduction to PL/SQL
1 - Introduction to PL/SQL1 - Introduction to PL/SQL
1 - Introduction to PL/SQLrehaniltifat
 
11 Understanding and Influencing the PL/SQL Compilar
11 Understanding and Influencing the PL/SQL Compilar11 Understanding and Influencing the PL/SQL Compilar
11 Understanding and Influencing the PL/SQL Compilarrehaniltifat
 
PL/SQL Complete Tutorial. All Topics Covered
PL/SQL Complete Tutorial. All Topics CoveredPL/SQL Complete Tutorial. All Topics Covered
PL/SQL Complete Tutorial. All Topics CoveredDanish Mehraj
 
DBA Brasil 1.0 - DBA Commands and Concepts That Every Developer Should Know
DBA Brasil 1.0 - DBA Commands and Concepts That Every Developer Should KnowDBA Brasil 1.0 - DBA Commands and Concepts That Every Developer Should Know
DBA Brasil 1.0 - DBA Commands and Concepts That Every Developer Should KnowAlex Zaballa
 
02 Writing Executable Statments
02 Writing Executable Statments02 Writing Executable Statments
02 Writing Executable Statmentsrehaniltifat
 
[Www.pkbulk.blogspot.com]dbms07
[Www.pkbulk.blogspot.com]dbms07[Www.pkbulk.blogspot.com]dbms07
[Www.pkbulk.blogspot.com]dbms07AnusAhmad
 
07 Using Oracle-Supported Package in Application Development
07 Using Oracle-Supported Package in Application Development07 Using Oracle-Supported Package in Application Development
07 Using Oracle-Supported Package in Application Developmentrehaniltifat
 
06 Using More Package Concepts
06 Using More Package Concepts06 Using More Package Concepts
06 Using More Package Conceptsrehaniltifat
 
6 integrity and security
6 integrity and security6 integrity and security
6 integrity and securityDilip G R
 
OOW16 - Oracle Database 12c - The Best Oracle Database 12c New Features for D...
OOW16 - Oracle Database 12c - The Best Oracle Database 12c New Features for D...OOW16 - Oracle Database 12c - The Best Oracle Database 12c New Features for D...
OOW16 - Oracle Database 12c - The Best Oracle Database 12c New Features for D...Alex Zaballa
 
Database development coding standards
Database development coding standardsDatabase development coding standards
Database development coding standardsAlessandro Baratella
 
OTN TOUR 2016 - DBA Commands and Concepts That Every Developer Should Know
OTN TOUR 2016 - DBA Commands and Concepts That Every Developer Should KnowOTN TOUR 2016 - DBA Commands and Concepts That Every Developer Should Know
OTN TOUR 2016 - DBA Commands and Concepts That Every Developer Should KnowAlex Zaballa
 

Tendances (20)

SQL- Introduction to PL/SQL
SQL- Introduction to  PL/SQLSQL- Introduction to  PL/SQL
SQL- Introduction to PL/SQL
 
1 - Introduction to PL/SQL
1 - Introduction to PL/SQL1 - Introduction to PL/SQL
1 - Introduction to PL/SQL
 
11 Understanding and Influencing the PL/SQL Compilar
11 Understanding and Influencing the PL/SQL Compilar11 Understanding and Influencing the PL/SQL Compilar
11 Understanding and Influencing the PL/SQL Compilar
 
PL/SQL Complete Tutorial. All Topics Covered
PL/SQL Complete Tutorial. All Topics CoveredPL/SQL Complete Tutorial. All Topics Covered
PL/SQL Complete Tutorial. All Topics Covered
 
DBA Brasil 1.0 - DBA Commands and Concepts That Every Developer Should Know
DBA Brasil 1.0 - DBA Commands and Concepts That Every Developer Should KnowDBA Brasil 1.0 - DBA Commands and Concepts That Every Developer Should Know
DBA Brasil 1.0 - DBA Commands and Concepts That Every Developer Should Know
 
02 Writing Executable Statments
02 Writing Executable Statments02 Writing Executable Statments
02 Writing Executable Statments
 
Plsql
PlsqlPlsql
Plsql
 
Oracle SQL Basics
Oracle SQL BasicsOracle SQL Basics
Oracle SQL Basics
 
[Www.pkbulk.blogspot.com]dbms07
[Www.pkbulk.blogspot.com]dbms07[Www.pkbulk.blogspot.com]dbms07
[Www.pkbulk.blogspot.com]dbms07
 
07 Using Oracle-Supported Package in Application Development
07 Using Oracle-Supported Package in Application Development07 Using Oracle-Supported Package in Application Development
07 Using Oracle-Supported Package in Application Development
 
06 Using More Package Concepts
06 Using More Package Concepts06 Using More Package Concepts
06 Using More Package Concepts
 
Oracle SQL Advanced
Oracle SQL AdvancedOracle SQL Advanced
Oracle SQL Advanced
 
6 integrity and security
6 integrity and security6 integrity and security
6 integrity and security
 
SQL Tunning
SQL TunningSQL Tunning
SQL Tunning
 
Pl sql chapter 1
Pl sql chapter 1Pl sql chapter 1
Pl sql chapter 1
 
OOW16 - Oracle Database 12c - The Best Oracle Database 12c New Features for D...
OOW16 - Oracle Database 12c - The Best Oracle Database 12c New Features for D...OOW16 - Oracle Database 12c - The Best Oracle Database 12c New Features for D...
OOW16 - Oracle Database 12c - The Best Oracle Database 12c New Features for D...
 
Database development coding standards
Database development coding standardsDatabase development coding standards
Database development coding standards
 
Less08 Schema
Less08 SchemaLess08 Schema
Less08 Schema
 
Pl sql-ch3
Pl sql-ch3Pl sql-ch3
Pl sql-ch3
 
OTN TOUR 2016 - DBA Commands and Concepts That Every Developer Should Know
OTN TOUR 2016 - DBA Commands and Concepts That Every Developer Should KnowOTN TOUR 2016 - DBA Commands and Concepts That Every Developer Should Know
OTN TOUR 2016 - DBA Commands and Concepts That Every Developer Should Know
 

Similaire à Arrays and lists in sql server 2008

Introduction to Threading in .Net
Introduction to Threading in .NetIntroduction to Threading in .Net
Introduction to Threading in .Netwebhostingguy
 
OOW16 - Oracle Database 12c - The Best Oracle Database 12c New Features for D...
OOW16 - Oracle Database 12c - The Best Oracle Database 12c New Features for D...OOW16 - Oracle Database 12c - The Best Oracle Database 12c New Features for D...
OOW16 - Oracle Database 12c - The Best Oracle Database 12c New Features for D...Alex Zaballa
 
Oracle Database 12c New Features for Developers and DBAs - OTN TOUR LA 2015
Oracle Database 12c  New Features for Developers and DBAs - OTN TOUR LA 2015Oracle Database 12c  New Features for Developers and DBAs - OTN TOUR LA 2015
Oracle Database 12c New Features for Developers and DBAs - OTN TOUR LA 2015Alex Zaballa
 
Slides11
Slides11Slides11
Slides11jayalo
 
Database COMPLETE
Database COMPLETEDatabase COMPLETE
Database COMPLETEAbrar ali
 
SQL – The Natural Language for Analysis - Oracle - Whitepaper - 2431343
SQL – The Natural Language for Analysis - Oracle - Whitepaper - 2431343SQL – The Natural Language for Analysis - Oracle - Whitepaper - 2431343
SQL – The Natural Language for Analysis - Oracle - Whitepaper - 2431343Edgar Alejandro Villegas
 
Oracle Database 12c - New Features for Developers and DBAs
Oracle Database 12c - New Features for Developers and DBAsOracle Database 12c - New Features for Developers and DBAs
Oracle Database 12c - New Features for Developers and DBAsAlex Zaballa
 
Oracle Database 12c - New Features for Developers and DBAs
Oracle Database 12c  - New Features for Developers and DBAsOracle Database 12c  - New Features for Developers and DBAs
Oracle Database 12c - New Features for Developers and DBAsAlex Zaballa
 
Sql tuning guideline
Sql tuning guidelineSql tuning guideline
Sql tuning guidelineSidney Chen
 
Oracle Database 12c - The Best Oracle Database 12c Tuning Features for Develo...
Oracle Database 12c - The Best Oracle Database 12c Tuning Features for Develo...Oracle Database 12c - The Best Oracle Database 12c Tuning Features for Develo...
Oracle Database 12c - The Best Oracle Database 12c Tuning Features for Develo...Alex Zaballa
 
Performance tuning
Performance tuningPerformance tuning
Performance tuningami111
 
Introduction to sql server
Introduction to sql serverIntroduction to sql server
Introduction to sql serverVinay Thota
 
SQL Server 2008 Performance Enhancements
SQL Server 2008 Performance EnhancementsSQL Server 2008 Performance Enhancements
SQL Server 2008 Performance Enhancementsinfusiondev
 
Overview of query evaluation
Overview of query evaluationOverview of query evaluation
Overview of query evaluationavniS
 
Inexpensive Datamasking for MySQL with ProxySQL — Data Anonymization for Deve...
Inexpensive Datamasking for MySQL with ProxySQL — Data Anonymization for Deve...Inexpensive Datamasking for MySQL with ProxySQL — Data Anonymization for Deve...
Inexpensive Datamasking for MySQL with ProxySQL — Data Anonymization for Deve...Ontico
 

Similaire à Arrays and lists in sql server 2008 (20)

Introduction to Threading in .Net
Introduction to Threading in .NetIntroduction to Threading in .Net
Introduction to Threading in .Net
 
Merging data (1)
Merging data (1)Merging data (1)
Merging data (1)
 
OOW16 - Oracle Database 12c - The Best Oracle Database 12c New Features for D...
OOW16 - Oracle Database 12c - The Best Oracle Database 12c New Features for D...OOW16 - Oracle Database 12c - The Best Oracle Database 12c New Features for D...
OOW16 - Oracle Database 12c - The Best Oracle Database 12c New Features for D...
 
Oracle Database 12c New Features for Developers and DBAs - OTN TOUR LA 2015
Oracle Database 12c  New Features for Developers and DBAs - OTN TOUR LA 2015Oracle Database 12c  New Features for Developers and DBAs - OTN TOUR LA 2015
Oracle Database 12c New Features for Developers and DBAs - OTN TOUR LA 2015
 
Slides11
Slides11Slides11
Slides11
 
Database COMPLETE
Database COMPLETEDatabase COMPLETE
Database COMPLETE
 
SQL – The Natural Language for Analysis - Oracle - Whitepaper - 2431343
SQL – The Natural Language for Analysis - Oracle - Whitepaper - 2431343SQL – The Natural Language for Analysis - Oracle - Whitepaper - 2431343
SQL – The Natural Language for Analysis - Oracle - Whitepaper - 2431343
 
Oracle Database 12c - New Features for Developers and DBAs
Oracle Database 12c - New Features for Developers and DBAsOracle Database 12c - New Features for Developers and DBAs
Oracle Database 12c - New Features for Developers and DBAs
 
Oracle Database 12c - New Features for Developers and DBAs
Oracle Database 12c  - New Features for Developers and DBAsOracle Database 12c  - New Features for Developers and DBAs
Oracle Database 12c - New Features for Developers and DBAs
 
Module02
Module02Module02
Module02
 
Sql tuning guideline
Sql tuning guidelineSql tuning guideline
Sql tuning guideline
 
Oracle Database 12c - The Best Oracle Database 12c Tuning Features for Develo...
Oracle Database 12c - The Best Oracle Database 12c Tuning Features for Develo...Oracle Database 12c - The Best Oracle Database 12c Tuning Features for Develo...
Oracle Database 12c - The Best Oracle Database 12c Tuning Features for Develo...
 
Performance tuning
Performance tuningPerformance tuning
Performance tuning
 
Sq lite
Sq liteSq lite
Sq lite
 
Sql tutorial
Sql tutorialSql tutorial
Sql tutorial
 
Introduction to sql server
Introduction to sql serverIntroduction to sql server
Introduction to sql server
 
SQL Server 2008 Performance Enhancements
SQL Server 2008 Performance EnhancementsSQL Server 2008 Performance Enhancements
SQL Server 2008 Performance Enhancements
 
Overview of query evaluation
Overview of query evaluationOverview of query evaluation
Overview of query evaluation
 
Inexpensive Datamasking for MySQL with ProxySQL — Data Anonymization for Deve...
Inexpensive Datamasking for MySQL with ProxySQL — Data Anonymization for Deve...Inexpensive Datamasking for MySQL with ProxySQL — Data Anonymization for Deve...
Inexpensive Datamasking for MySQL with ProxySQL — Data Anonymization for Deve...
 
Beg sql
Beg sqlBeg sql
Beg sql
 

Arrays and lists in sql server 2008

  • 1. 11/5/12 Arrays and Lists in SQL Server 2008 Arrays and Lists in SQL Server 2008 Using Table-Valued Parameters An SQL text by Erland Sommarskog, SQL Server MVP. Latest revision: 2012-07-01. Introduction In the public forums for SQL Server, you often see people asking How do I use arrays in SQL Server? Or Why does S L C * F O EET RM t l W E E c l I ( l s )not work? The short answer to the first question is that SQL Server does not have arrays – SQL Server b HR o N @it has tables. Upto SQL Server 2005, there was no way to pass a table from a client, but you had to pass a comma-separated string or similar to SQL Server, and then you would unpack that list into a table in your stored procedure. This changed with SQL 2008. The advent of table-valued parameters makes it dirt simple to pass a comma-separated list to SQL Server. In this article I will introduce a simple and perfectly reusable class for this task. Table-valued parameters are also great when you want to load data to SQL Server through a stored procedure; you no longer have to build XML documents that you shred in SQL Server. In this article I show you how to load a master-detail file into SQL Server tables in two different ways: read the entire file into memory or stream it directly. What I am not showing – because it's so simple – is that if you already have your data in a DataTable object, you can pass that DataTable as the value for your TVP. The examples are in C# and VB .NET, using the SqlClient API, and the main body of the article covers this environment. For other environments such as Java or Entity Framework, there is a quick overview of what is possible at the end of the article. There is an accompanying article: Arrays and Lists in SQL Server 2005 and Beyond (and an even older for SQL 2000) where I in detail describe various methods to pass a list of values in a string and unpack them into a table in SQL Server. For the hard-core geeks there are two performance indexes, one labelled SQL 2008 and an older labelled SQL 2005, where I relate data from performance tests of the methods described in the articles, including table-valued parameters. Contents: Introduction Background Table-Valued Parameters in T‑SQL Declarations Invoking an SP with a TVP Permissions Restrictions Passing Table-Valued Parameters from ADO .NET About the Sample Code Sending a Comma-Separated List to SQL Server Using the Class with a Stored Procedure Inside the CSV_splitter Class What About Performance? Performance in SQL Server Client-side Performance Primary Keys and Sorted Data – Looking Closer at the SqlMetaData constructors Loading Data through Table-Valued Parameters The Setup Take One: Reading the File Into a List Take Two: Streaming the File Performance Considerations Using Table-Valued Parameters from Other APIs Acknowledgements, Feedback and Further Readeing Revision History www.sommarskog.se/arrays-in-sql-2008.html#Invoking 1/27
  • 2. 11/5/12 Arrays and Lists in SQL Server 2008 Some of the sample code in this article refers to the Northwind database. This database does not ship with SQL Server, but you can download the script to install it from Microsoft's web site. Background You have a number of key values, identifying a couple of rows in a table, and you want to retrieve these rows. If you are the sort of person who composes your SQL statements in client code, you might have something that looks like this: cdCmadet="EETPoutD PoutaeFO NrhiddoPout "&_ m.omnTx SLC rdcI, rdcNm RM otwn.b.rdcs "HR PoutDI ( &Ls &"" WEE rdcI N " it ) rae =cdEeueedr) edr m.xctRae( L s is here a string variable that you somewhere have assigned it a comma-separated list, for instance "9,12,27,39". This sort of code is bad practice, because you should never interpolate parameter values into your query string. (Why, is beyond the scope of this article, but I discuss this in detail in my article The Curse and Blessings of Dynamic SQL, particularly in the sections on SQL Injection and Caching Query Plans.) Since this is bad practice, you want to use stored procedures. However, at first glance you don't seem to find that any apparent way to do this. Many have tried with: CET POEUEgtpoutnms@d vrhr5)A RAE RCDR e_rdc_ae is aca(0 S SLC PoutD Poutae EET rdcI, rdcNm FO RM NrhiddoPout otwn.b.rdcs WEE PoutDI (is HR rdcI N @d) But when they test: EE gtpoutnms',22,7 XC e_rdc_ae 91,73' The reward is this error message: Sre:Mg25 Lvl1,Sae1 Poeuegtpoutnms Ln 2 evr s 4, ee 6 tt , rcdr e_rdc_ae, ie Sna errcnetn tevrhrvle',22,7 t aclm ytx ro ovrig h aca au 91,73' o oun o dt tp it f aa ye n. This fails, because we are no longer composing an SQL statement dynamically, and @ d is just one value in the IN clause. An IN clause is that could also read: ..WEEclI (a @,@) . HR o N @, b c Or more to the point, consider this little script: CET TBE#s ( vrhr2)NTNL) RAE AL cv a aca(0 O UL go ISR #s ()VLE (91,73' NET cv a AUS ',22,7) ISR #s ()VLE (smtiges' NET cv a AUS 'oehn le) SLC aFO #s WEEaI (91,73' - Rtrsoerw EET RM cv HR N ',22,7) - eun n o. So now you know why c l I ( l s )does not work. Or rather: you know now that it works differently from your expectations. In o N @it the following we will look into how to solve this kind of problem with table-valued parameters. Before I go on, I should add that sometimes you may find yourselves in the (very unfortunate) situation when you have a delimited list in a table column in your database. To unpack such a list, you would need any of the methods that I discuss in Arrays and Lists for SQL 2005 and Beyond; TVPs cannot help you here. Table-Valued Parameters in T-SQL Declarations www.sommarskog.se/arrays-in-sql-2008.html#Invoking 2/27
Let's first look at how to use TVPs in T-SQL without involving a client. To be able to declare a TVP, you first need to create a table type like this:

   CREATE TYPE integer_list_tbltype AS TABLE (n int NOT NULL PRIMARY KEY)

That is, after CREATE TYPE you specify the type name followed by AS TABLE and then comes the table definition, using the same syntax as CREATE TABLE. You cannot use everything you can use with CREATE TABLE, but you can define PRIMARY KEY, UNIQUE and CHECK constraints, you can use IDENTITY and DEFAULT definitions, and you can define computed columns.

Once you have this table type, you can use it to declare table variables:

   DECLARE @mylist integer_list_tbltype

However, you cannot use the type with CREATE TABLE (it could have been handy with temp tables!), nor can you use it for the declaration of the return table in a multi-step table function. The raison d'être for table types is to make it possible to declare table-valued parameters for stored procedures or user-defined functions. Here is one example:

   CREATE PROCEDURE get_product_names @prodids integer_list_tbltype READONLY AS
      SELECT p.ProductID, p.ProductName
      FROM   Northwind.dbo.Products p
      WHERE  p.ProductID IN (SELECT n FROM @prodids)

The body of the procedure brings no surprises. The code looks just as it would have if @prodids had been a local table variable. The parameter declaration on the other hand includes a keyword hitherto not seen in this context: READONLY. This keyword means what it says: you cannot modify the contents of the table parameter in any way in the procedure. As an aside, this restriction makes TVPs far less useful than they could have been; often you want to pass data between stored procedures, as I discuss in my article How to Share Data Between Stored Procedures. However, for the task at hand, passing data from a client, the READONLY restriction is no major obstacle.

Invoking an SP with a TVP

Calling this procedure is straightforward:

   DECLARE @mylist integer_list_tbltype
   INSERT @mylist(n) VALUES(9), (12), (27), (37)
   EXEC get_product_names @mylist

(Here I use the new syntax for INSERT that permits me to specify values for more than one row in the VALUES clause.) You can also use TVPs with sp_executesql:

   DECLARE @mylist integer_list_tbltype,
           @sql    nvarchar(MAX)
   SELECT @sql = N'SELECT p.ProductID, p.ProductName
                   FROM   Northwind..Products p
                   WHERE  p.ProductID IN (SELECT n FROM @prodids)'
   INSERT @mylist VALUES(9), (12), (27), (37)
   EXEC sp_executesql @sql, N'@prodids integer_list_tbltype READONLY', @mylist

There are a few peculiarities, though. This does not work:

   EXEC get_product_names NULL

but results in this error message:

   Msg 206, Level 16, State 2, Procedure get_product_names, Line 0
   Operand type clash: void type is incompatible with integer_list_tbltype

It is quite logical when you think of it: NULL is a scalar value, not a table value. But what do you think about this:

   EXEC get_product_names

You may expect this to result in an error due to the missing parameter, but instead this runs and produces an empty result set! The same happens with:

   EXEC get_product_names DEFAULT
The scoop is that a table-valued parameter always has the implicit default value of an empty table. Whether this is good or bad can be disputed, but if there were to be explicit default values, Microsoft would have to invent a lot of syntax for it. And in most cases, the default value you want is probably the empty table, so it is not entirely unreasonable.

Permissions

One thing with table types which is not apparent is that you need permission to use a table type. This can be demonstrated in this script:

   CREATE USER testuser WITHOUT LOGIN
   go
   EXECUTE AS USER = 'testuser'
   go
   DECLARE @p integer_list_tbltype
   go
   REVERT
   go
   DROP USER testuser

(What we do here is to create a loginless user, and then impersonate that user. This is a quick way to test permissions. For more details on impersonation, see my article Giving Permissions through Stored Procedures.) The output from this script is puzzling:

   Msg 229, Level 14, State 5, Line 1
   The EXECUTE permission was denied on the object 'integer_list_tbltype',
   database 'tempdb', schema 'dbo'.

But the message is to be taken by the letter. To be able to use a table type, you need to have EXECUTE permission on the type. (This does not apply to normal scalar types, but it does apply to user-defined CLR types.) To grant permission on a type, the syntax is:

   GRANT EXECUTE ON TYPE::integer_list_tbltype TO testuser

The TYPE:: prefix is needed to specify the object class.

Restrictions

It is maybe not so surprising that you cannot use table-valued parameters across linked servers, given that there are many restrictions with linked servers. But it does not stop there: you cannot even use table-valued parameters across databases. If you try something like:

   USE tempdb
   go
   CREATE TYPE tobbe AS TABLE (a int NOT NULL PRIMARY KEY)
   go
   CREATE PROCEDURE tobbe_sp @t tobbe READONLY AS
      SELECT a FROM @t
   go
   USE otherdb
   go
   CREATE TYPE tobbe AS TABLE (a int NOT NULL PRIMARY KEY)
   go
   DECLARE @t tobbe
   EXEC tempdb..tobbe_sp @t

The error message is:

   Msg 206, Level 16, State 2, Procedure tobbe_sp, Line 0
   Operand type clash: tobbe is incompatible with tobbe

And if you try:

   CREATE PROCEDURE tobbe_sp @t otherdb.dbo.tobbe READONLY AS
      SELECT a FROM @t

You get the message:
   Msg 117, Level 15, State 2, Procedure tobbe_sp, Line 1
   The type name 'otherdb.dbo.tobbe' contains more than the maximum number of prefixes.
   The maximum is 1.

This somewhat awkward message informs us that the data type for a parameter cannot be a three-part name with a database component.

You cannot create stored procedures or user-defined functions in the CLR that take a table-valued parameter. But the other way works: you can call a T-SQL procedure with a TVP from a CLR procedure, using the same mechanisms you use from a client, and that is what we will look at next.

Passing Table-Valued Parameters from ADO .NET

Passing values to TVPs from ADO .NET is very straightforward, and requires very little extra code compared to passing data to regular parameters. You need .NET Framework 3.5 SP1 or higher to have support for TVPs. You can only use TVPs with SqlClient; you cannot use TVPs with the classes in the System.Data.OleDb or System.Data.Odbc namespaces.

The specifics can be summarised as:

   For the data type you specify SqlDbType.Structured.
   You specify the name of the table type in the TypeName property of the parameter.
   You set the Value property of the parameter to something suitable.

Exactly what is suitable then? There is an MSDN topic that suggests that the three choices are List<SqlDataRecord>, a DataTable or a DbDataReader. It turns out that this is not the full story. I have not been able – and nor have I really tried – to find the exact requirements, but it seems that you can pass anything that implements IEnumerable and IDataRecord, and then DataTable is a special case that goes beyond that. Exactly what you can use and not use is not particularly interesting. I would suggest that in practice you will use one of these four:

   List<SqlDataRecord>.
   Custom-written classes that implement IEnumerable<SqlDataRecord> and IEnumerator<SqlDataRecord>.
   DataTable.
   A DbDataReader of some sort.

Of these, you would use the first two for general-purpose programming. The only reason to pass a DataTable is that you already have the data in such an object. If you have the data somewhere else – in a file, on a wire etc – there is no reason to fill a DataTable when you can use a List which is more lightweight. By the same token, the only reason you would use a DbDataReader is that you have a DbDataReader anyway. That is, if the data for your TVP comes from an Oracle database, you can pass an OracleDataReader directly – no need to populate a List or a DataTable. For this reason, in this article I focus on the first two alternatives, and all examples use either a custom-written class or a List<SqlDataRecord>.

One caveat about DataTable and DbDataReader: if your TVP has an IDENTITY column or a computed column, you may not be able to get these columns to work with your objects. In this case, you can always use a List or a custom-written iterator, since this gives you access to some more options to define the metadata for the TVP, and I will briefly cover how to do this later.
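Since I don't come back to the DataTable alternative later, here is a minimal C# sketch of what it can look like. It assumes the integer_list_tbltype type and the get_product_names procedure from the T-SQL section, that connstr holds a valid connection string, and the usual using directives for System, System.Data and System.Data.SqlClient; this is my own illustration, not code from the demo files:

   DataTable prodids = new DataTable();
   prodids.Columns.Add("n", typeof(int));    // One column, lined up with the table type.
   prodids.Rows.Add(9);
   prodids.Rows.Add(12);
   prodids.Rows.Add(27);
   prodids.Rows.Add(37);

   using (SqlConnection cn = new SqlConnection(connstr))
   using (SqlCommand cmd = cn.CreateCommand()) {
      cmd.CommandType = CommandType.StoredProcedure;
      cmd.CommandText = "dbo.get_product_names";

      SqlParameter p = cmd.Parameters.Add("@prodids", SqlDbType.Structured);
      p.Direction = ParameterDirection.Input;
      p.TypeName  = "integer_list_tbltype";
      p.Value     = prodids;                  // The DataTable goes in directly as the value.

      cn.Open();
      using (SqlDataReader reader = cmd.ExecuteReader()) {
         while (reader.Read()) {
            Console.WriteLine("{0} {1}", reader[0], reader[1]);
         }
      }
   }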
About the Sample Code

Before we move on, I'd like to give a quick introduction to the demo files that accompany this article. They are compiled in two zip files, one with the demos in C# and one with the same demos in Visual Basic .NET. Use the language that is the most convenient to you. In the text itself, I sometimes show the code in C# and sometimes in VB .NET. If you are more comfortable with the language I'm not using for the moment, please refer to the corresponding file in the other language. For instance, if I refer to TVPDemo.DemoHelper.cs, you can rely on there also being a TVPDemo.DemoHelper.vb file in the zip file with the VB code.

Beside the source code in C# and VB .NET, the zip files also include an SQL script and a file with sample data for one of the demos. There is also a file compile.bat you can use to compile the files. There are however no project/solution files for Visual Studio, as that goes a little above my head.
I will cover the code in the zip files as we arrive at the examples where they belong. There is however one class I would like to highlight here and now, and that is the TVPDemo.DemoHelper class. This class includes some utility routines that are of little interest for the article as such. There is one thing I like to highlight, though, to wit the connection string:

   private const string connstr =
      "Application Name=TVPdemo;Integrated Security=SSPI;" +
      "Data Source=.;Initial Catalog=tempdb";

You may have to change it to fit your environment. Particularly, if you only have Express Edition installed, you should probably use .\SQLEXPRESS for Data Source instead of the single dot.

In the article, the code mainly appears without comments, since I explain the code in the text. However, in the source files, the code is thoroughly commented.

Disclaimer: My expertise is in SQL Server, and I only write .NET code left-handedly. While I have done my best to adhere to what I think is best practice, you may see things which make you think "I would never do something like that". It is not unlikely that you will be right. Please let me know in such case!

Sending a Comma-Separated List to SQL Server

The Arrays and Lists articles take their base in the problem of using a comma-separated list in SQL Server. Programmers often encounter them, because there are form controls that produce such lists. (Or so the .NET programmers I know keep telling me.) The other articles in this series present solutions to transform these lists into a table in SQL Server. Here I'm showing you a much better solution: transform the list in the client and pass it to a table-valued parameter. SQL Server should spend its resources on reading and writing tabular data, not string processing. Not only are the resources better spent this way, the solution is also much simpler and cleaner with help of the class CSV_splitter that I will introduce.

Using the CSV_splitter Class

Using the CSV_splitter class is extremely simple. All your application code sees is a call to the constructor and that's it. Here is an example where we call a stored procedure with a TVP. We use the table type and the stored procedure from the T-SQL section above:

   CREATE TYPE integer_list_tbltype AS TABLE (n int NOT NULL PRIMARY KEY)
   go
   CREATE PROCEDURE get_product_names @prodids integer_list_tbltype READONLY AS
      SELECT p.ProductID, p.ProductName
      FROM   Northwind.dbo.Products p
      WHERE  p.ProductID IN (SELECT n FROM @prodids)

In the set of demo files you find CSVDemo.vb which includes the procedure CSV_to_SP that calls get_product_names, and it is no more complicated than this:

   Private Sub CSV_to_SP()
      Using cn As SqlConnection = TVPDemo.DemoHelper.SetupConnection(), _
            cmd As SqlCommand = cn.CreateCommand()
         cmd.CommandType = CommandType.StoredProcedure
         cmd.CommandText = "dbo.get_product_names"

         cmd.Parameters.Add("@prodids", SqlDbType.Structured)
         cmd.Parameters("@prodids").Direction = ParameterDirection.Input
         cmd.Parameters("@prodids").TypeName = "integer_list_tbltype"
         cmd.Parameters("@prodids").Value = _
             New TVPDemo.CSV_splitter("9,12,27,37")

         Using da As New SqlDataAdapter(cmd), _
               ds As New DataSet()
            da.Fill(ds)
            TVPDemo.DemoHelper.PrintDataSet(ds)
         End Using
      End Using
   End Sub

To a large extent, this is very typical code to call a stored procedure. We first set up a connection and create a command object. We move on to state which stored procedure to call, and we define the single parameter that get_product_names accepts. Finally, we invoke the procedure, and in this example I have chosen to use DataAdapter.Fill together with a method in my DemoHelper class that prints the result set. In a real-world scenario you may prefer to use ExecuteReader or whatever fits you.

(I assume that most readers are acquainted with Using, but in case you are not: this statement permits you to declare a variable that is accessible in the enclosed block, and when the block exits, any Dispose method of the class will be invoked. This is highly recommendable for SqlConnection and SqlCommand objects. If you just leave it to garbage collection to take care of them, you may spew SQL Server with a lot of extra connections. Using is available in C# as well, but spelt using.)

The interesting part is the four statements that set up the parameter. The first adds the parameter and defines the type:

   cmd.Parameters.Add("@prodids", SqlDbType.Structured)

For a table-valued parameter, you always specify SqlDbType.Structured here.

   cmd.Parameters("@prodids").Direction = ParameterDirection.Input

Specifying the direction of the parameter is somewhat superfluous; since TVPs are read-only, Input is the only choice. Nevertheless, I have included it for clarity. Next we introduce the name of the table type in SQL Server, by setting the special parameter property TypeName:

   cmd.Parameters("@prodids").TypeName = "integer_list_tbltype"

Strictly speaking, this is not necessary when calling a stored procedure, since SQL Server knows the type anyway. However, it is definitely best practice to always specify the type. For one thing, this will give you a clearer error message when there is a mismatch between the structure you pass from the client and the table type in SQL Server. And now – drum roll! – it's time to pass an actual value to the TVP:

   cmd.Parameters("@prodids").Value = New TVPDemo.CSV_splitter("9,12,27,37")

You create a new CSV_splitter object which you pass as the parameter value. And as the parameter to the constructor you pass your list of comma-separated integers. Since this is a sample, the list is a constant; in practical code you would of course have a variable here.
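For readers who prefer C#, here is a sketch of what the corresponding call can look like. It mirrors the VB routine above; the method name is mine, and I am assuming the setup_connection helper and the CSV_splitter class from the demo files, with the helper returning an opened connection as the demo code relies on:

   private static void csv_to_sp() {
      using (SqlConnection cn = TVPDemo.DemoHelper.setup_connection())
      using (SqlCommand cmd = cn.CreateCommand()) {
         cmd.CommandType = CommandType.StoredProcedure;
         cmd.CommandText = "dbo.get_product_names";

         cmd.Parameters.Add("@prodids", SqlDbType.Structured);
         cmd.Parameters["@prodids"].Direction = ParameterDirection.Input;
         cmd.Parameters["@prodids"].TypeName = "integer_list_tbltype";
         cmd.Parameters["@prodids"].Value =
             new TVPDemo.CSV_splitter("9,12,27,37");

         // setup_connection is assumed to return an opened connection,
         // so we can run the command directly and print the rows.
         using (SqlDataReader reader = cmd.ExecuteReader()) {
            while (reader.Read()) {
               Console.WriteLine("{0,6} {1}", reader[0], reader[1]);
            }
         }
      }
   }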
All you need to do to get this to work is to put the CSV_splitter class in place. Which is very simple, since the code is included in the download files. You only need to change the namespace to fit your local conventions. The joy is that this class is perfectly reusable, and while I will cover the internals of the class in a second, all you really need to know if you are in a hurry is this:

   The class assumes a list of integers – more precisely Int64. If you want a list of strings, you will need to clone the class.
   The constructor takes an optional parameter which permits you to specify a different delimiter. (But it has to be a single-character delimiter.)
   Empty elements in the string are ignored.

You might ask: what if I don't use stored procedures? Can I still use TVPs and the CSV_splitter class? Sure enough. The file CSVDemo.vb also includes this routine:

   Private Sub CSV_to_SQL()
      Using cn As SqlConnection = TVPDemo.DemoHelper.SetupConnection(), _
            cmd As SqlCommand = cn.CreateCommand()
         cmd.CommandType = CommandType.Text
         cmd.CommandText = " SELECT p.ProductID, p.ProductName " & _
                           " FROM   Northwind.dbo.Products p " & _
                           " WHERE  p.ProductID IN (SELECT n FROM @prodids)"

         cmd.Parameters.Add("@prodids", SqlDbType.Structured)
         cmd.Parameters("@prodids").Direction = ParameterDirection.Input
         cmd.Parameters("@prodids").TypeName = "integer_list_tbltype"
         cmd.Parameters("@prodids").Value = _
             New TVPDemo.CSV_splitter("1, 11, 76, 34")

         Using da As New SqlDataAdapter(cmd), _
               ds As New DataSet()
            da.Fill(ds)
            TVPDemo.DemoHelper.PrintDataSet(ds)
         End Using
      End Using
   End Sub

It is very similar to CSV_to_SP. The one thing to observe is this line:

   cmd.Parameters("@prodids").TypeName = "integer_list_tbltype"

When you use CommandType.Text, it is compulsory to specify the name of the table type. For stored procedures you can leave it out, but as I noted above, best practice is to always include the type name.

Inside the CSV_splitter Class

As I discussed above, the object you pass as the value for a TVP must implement IEnumerable<SqlDataRecord> and IEnumerator<SqlDataRecord>. I guess most .NET programmers understand what this means. In case you don't: an interface consists of a number of members with well-defined signatures, but without any code. To implement an interface you write a class that includes the members of the interface with exactly those signatures – now with code added. To this you can add other members as you like. You don't have to worry too much about inadvertently leaving something out – the compiler will inform you of any small detail you forget. Some interfaces feature over 20 members, but these two interfaces are quite slender with in total four methods and one property.

While the main purpose is to feed a table-valued parameter, it is worth noting that since CSV_splitter implements IEnumerable, you can use the class in this way:

   foreach (SqlDataRecord rec in new TVPDemo.CSV_splitter("1,2,3,4")) {
      Console.WriteLine(rec.GetInt64(0).ToString());
   }

Not that this is particularly useful, but it gives an understanding of what this is all about.

I will now walk you through the inside of the CSV_splitter class, which you find in TVPDemo.CSV_splitter.cs. For reference, here is the using section:

   using System;
   using System.Collections.Generic;
   using Microsoft.SqlServer.Server;

Nothing startling here. (Most people would probably add a few more namespaces, but I refer to some classes in full for clarity.) The class declaration looks like this:

   public class CSV_splitter : IEnumerable<SqlDataRecord>,
                               IEnumerator<SqlDataRecord>

That is, this class implements both IEnumerable and IEnumerator. This is possibly disputable; some people may prefer to have one class per interface, but I could not really see the point in this. (I did say that I'm normally not a .NET programmer, didn't I?)

The class has a few private member variables:

   string        input;      // The input string.
   char          delim;      // The delimiter.
   int           start_ix;   // Start position for current list element.
   int           end_ix;     // Position for the next list delimiter.
   SqlDataRecord outrec;     // The record we use to return data.

input is the comma-separated list itself and delim is the delimiter. start_ix and end_ix keep track of where in the string we are. The most interesting member is outrec. As the snippet above shows, each iteration produces an instance of SqlDataRecord and, as we shall see, it comes from this outrec variable.
To permit an alternate delimiter, the class has two constructors, which in the C# version fork off to a common private method.

   public CSV_splitter (string str,
                        char   delimiter) {
      constructor(str, delimiter);
   }

   public CSV_splitter (string str) {
      constructor(str, ',');
   }

   private void constructor(string str, char delimiter) {
      this.input  = str;
      this.delim  = delimiter;
      this.outrec = new SqlDataRecord(
                    new SqlMetaData("nnn", System.Data.SqlDbType.BigInt));
      this.Reset();
   }

First the constructor saves the input parameters into the private members. Next comes the key part: the constructor creates an instance of SqlDataRecord that matches the table type for the table-valued parameter. The constructor for SqlDataRecord accepts an array of SqlMetaData objects. (Both these classes are in the Microsoft.SqlServer.Server namespace.) These classes are closely related: the raison d'être for SqlMetaData is exactly to describe a single column in an SqlDataRecord and ultimately a column in SQL Server. SqlMetaData has a whole slew of constructors to accommodate the various data types in SQL Server and I cover some of the variations as we encounter them. For the CSV splitter we use the simplest constructor of them all and pass only the column name and the data type.

You may note that the column we define in SqlMetaData differs from the column in integer_list_tbltype on two accounts:

   1. The column names are not the same. I have made them different on purpose to show that the names you put in the column definition with SqlMetaData have no importance in the context of passing data to TVPs.
   2. The data type is BigInt, while in the table type the column is int. I chose to use BigInt to make the class as general as possible. That is, you can use the class with any integer data type. (Of course, I could have used bigint in the table type as well, but since product IDs in Northwind are int, I used that type.) As we shall see later, this generality comes with a price.

The last line in the constructor is a call to Reset, which is one of the methods required by the IEnumerator interface. Its task is to initiate start_ix and end_ix, and we set them to values that indicate that we have not started scanning the string yet:

   public void Reset() {
      this.start_ix = -1;
      this.end_ix   = -1;
   }

Next comes the part of the class that implements IEnumerable. This interface requires the implementation of a single method: GetEnumerator, which should return an object that implements IEnumerator. Since the class implements both interfaces, it returns itself:

   System.Collections.IEnumerator
          System.Collections.IEnumerable.GetEnumerator() {
      return this;
   }

   public System.Collections.Generic.IEnumerator<SqlDataRecord>
          GetEnumerator() {
      return this;
   }

What is a little tricky is that the interface IEnumerable<T> requires that you also implement the non-generic version (and for some reason, the latter cannot be public).

The interface IEnumerator requires you to implement three methods, MoveNext, Reset and Dispose, and one read-only property, Current. We have already looked at Reset, and Dispose is only there to permit you to explicitly close files or SQL connections without waiting for garbage collection. That leaves MoveNext and Current as the two interesting members.
The purpose of MoveNext is to permit the caller to move to the next value in the iteration, which the caller can then retrieve with Current. MoveNext is a boolean function and should return true as long as there is a new item to retrieve with Current. If the caller moves past the last item, the method should return false. Here is what CSV_splitter.MoveNext looks like:

   public bool MoveNext() {
      this.start_ix = this.end_ix + 1;
      while (this.start_ix < this.input.Length &&
             this.input[this.start_ix] == this.delim) {
         this.start_ix++;
      }

      if (this.start_ix >= this.input.Length) {
         return false;
      }

      this.end_ix = this.input.IndexOf(this.delim, this.start_ix);
      if (this.end_ix == -1) {
         this.end_ix = this.input.Length;
      }

      return true;
   }

The first action is to set start_ix to be one step ahead of end_ix. This is followed by a while loop the purpose of which is to skip adjacent delimiters. (Imagine that you have a string like "12,,4".) If we at this point find that start_ix is equal to the length of the string, we are past the last character in the string, and we return false to the caller to indicate that the iteration is over.

Else start_ix is now at the first character in the next value, and we set end_ix to be at the delimiter following this value. If there is no delimiter after the last value, we pretend that there is one anyway. Since there is at least one more value in this case, we return true.

That is, all we do here is to position start_ix and end_ix. In the Current property we make use of these values. This property should return the same type as IEnumerable<T> was instantiated with, that is, SqlDataRecord. Here is how our Current property looks:

   public SqlDataRecord Current {
      get {
         string str = this.input.Substring(this.start_ix,
                                           this.end_ix - this.start_ix);
         this.outrec.SetInt64(0, Convert.ToInt64(str));
         return this.outrec;
      }
   }

We first extract the substring between start_ix and end_ix - 1, and then we convert that value to Int64 to set the only column in the outrec record, which we then return. From a logical point of view, the code could just as well have read:

   public SqlDataRecord Current {
      get {
         string str = this.input.Substring(this.start_ix,
                                           this.end_ix - this.start_ix);
         SqlDataRecord outrec = new SqlDataRecord(
             new SqlMetaData("nnn", System.Data.SqlDbType.BigInt));
         outrec.SetInt64(0, Convert.ToInt64(str));
         return outrec;
      }
   }

And there would have been no need to have outrec as a variable on class level, but it seemed to me slightly more efficient to create the record once and reuse it.

When you implement IEnumerable<T> you must also implement a non-generic version, and we just let it invoke the generic version.

   Object System.Collections.IEnumerator.Current {
      get {
         return this.Current;
      }
   }

What you have seen in MoveNext and Current is fairly normal string-parsing code. There is certainly room for all sorts of improvements: multi-character delimiters, alternate delimiters, trimming of blanks. (A string like "1,2, ,3" will cause a run-time error in Convert.ToInt64.) If you want to handle comma-separated lists of strings, you can easily clone the class – or make the type a parameter, or make the class generic. I leave all these ideas as exercises to the reader; a sketch of the first of them follows below.
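To give an idea of what the string-list exercise involves, here is a hedged sketch of a string variant. The scanning logic is lifted unchanged from CSV_splitter; only the record definition and Current differ. The class name and the column length are my own choices, and the assumption is that the matching table type has a single nvarchar(100) column:

   using System;
   using System.Collections;
   using System.Collections.Generic;
   using System.Data;
   using Microsoft.SqlServer.Server;

   public class CSV_string_splitter : IEnumerable<SqlDataRecord>,
                                      IEnumerator<SqlDataRecord> {
      private string        input;      // The input string.
      private char          delim;      // The delimiter.
      private int           start_ix;   // Start position for the current element.
      private int           end_ix;     // Position of the next delimiter.
      private SqlDataRecord outrec;     // The record we use to return data.

      public CSV_string_splitter(string str) : this(str, ',') {}

      public CSV_string_splitter(string str, char delimiter) {
         this.input  = str;
         this.delim  = delimiter;
         this.outrec = new SqlDataRecord(
                       new SqlMetaData("str", SqlDbType.NVarChar, 100));
         this.Reset();
      }

      public void Reset() {
         this.start_ix = -1;
         this.end_ix   = -1;
      }

      public bool MoveNext() {
         this.start_ix = this.end_ix + 1;
         while (this.start_ix < this.input.Length &&
                this.input[this.start_ix] == this.delim) {
            this.start_ix++;
         }
         if (this.start_ix >= this.input.Length) {
            return false;
         }
         this.end_ix = this.input.IndexOf(this.delim, this.start_ix);
         if (this.end_ix == -1) {
            this.end_ix = this.input.Length;
         }
         return true;
      }

      public SqlDataRecord Current {
         get {
            string str = this.input.Substring(this.start_ix,
                                              this.end_ix - this.start_ix);
            this.outrec.SetString(0, str.Trim());   // Trim blanks around the element.
            return this.outrec;
         }
      }

      Object IEnumerator.Current {
         get { return this.Current; }
      }

      public IEnumerator<SqlDataRecord> GetEnumerator() { return this; }
      IEnumerator IEnumerable.GetEnumerator() { return this; }

      public void Dispose() {}
   }

On the SQL Server side you would pair this with a table type along the lines of a single nvarchar(100) NOT NULL PRIMARY KEY column.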
To conclude, you can see that implementing a custom iterator to feed a TVP is by no means an advanced matter, and we will leverage this later in this article.

What About Performance?

Before we move on to the next demo, we will learn some more theory, mainly from the perspective of performance. The other Arrays and Lists articles discuss performance, so why not this one? In the first two subsections we will look at how TVPs perform compared to methods where we send a comma-separated list or similar to SQL Server. In the last subsection, we will discuss whether the TVP should have a primary key, and how we can improve performance when we know that the data is sorted. In passing, we will also learn how to work with IDENTITY columns and computed columns in the table type.

Performance in SQL Server

As you have understood from the fact that I devoted an article solely to table-valued parameters, this is the preferred method for passing a list of values to SQL Server. One important reason is simplicity: writing a stored procedure that accepts a table-valued parameter is straightforward. Not that using a list-to-table function is a big deal, but relational databases are centred around tables. And as you have seen, passing a value to a TVP from ADO .NET is a very simple affair. TVPs also have the advantage that you can add constraints to the table type to enforce uniqueness or some other type of contract. Nor do you have to worry in your database code about format errors in a comma-separated list.

Does this also mean that this method gives you the best performance? In general, yes. In each and every case? No. When running the tests for the performance appendix, I did find situations where other methods outperformed TVPs. However, I believe that in the long run TVPs will give you better performance than any other method. There are two reasons for this:

Data is already in table format. With all other methods, cycles need to be spent on parsing a character string to get the data into table format. In my tests, the fixed-length method performed better in some tests with integer data. Indeed, this method just chops up a fixed-length string reading from a table of numbers, so it is very similar to reading from a table variable. However, TVPs have one more ace up the sleeve:

The optimizer gets a clue. For all other methods, the optimizer has no understanding of what is going on. In many situations you get a useful plan nevertheless, but with methods based on inline T-SQL functions the optimizer often loses grip entirely and produces a plan that is nothing but a disaster. And even if the plan is useful, it may not be the most optimal, because the optimizer has no idea how many rows your list of values will produce, which means that if you use the list in a query with other tables, row estimates are likely to be way off. This is different for table-valued parameters.

Just like table variables, table-valued parameters do not have distribution statistics, but there is nevertheless one piece of information: cardinality. That is, the first time you call a procedure that takes a TVP, the optimizer sniffs the TVP – as it sniffs all other parameters – and the optimizer sees that the TVP has so and so many rows. This gives the optimizer better odds for good estimates of the number of rows in the rest of the query. Not that it is perfect: there is the general problem that the sniffed value may be atypical. (For a closer discussion on parameter sniffing, see my article Slow in the Application, Fast in SSMS?.) And it is not always the case that correct information leads to the best plan; in the performance appendix for SQL 2008 you can read about a case where SQL Server chooses an incorrect plan when it has more accurate cardinality information. But as I discuss in the appendix, this concerns only a window of the input size.

Furthermore, cardinality is far from sufficient in all cases. Consider the query:

   SELECT * FROM Orders WHERE CustomerID IN (SELECT custid FROM @custids)

Say that there are four values in @custids. If they are just four plain customers, seeking the non-clustered index on CustomerID is good.
But if they are the top four customers, accounting for 40 % of the volume, you want a table scan. But since a TVP does not have distribution statistics, the optimizer cannot distinguish the cases. The workaround is simple: bounce the data over a temp table and take benefit of the fact that temp tables have distribution statistics. Since that workaround is the same as for all list-to-table functions, you may argue that when you need to do this, there is no special performance advantage of TVPs.

Client-side Performance

A reasonable question is: does a TVP incur more calling overhead than regular parameters? The answer is yes. In my tests I found that passing 50 000 integer values to an unindexed TVP from ADO .NET took 40-50 ms, compared to 20-35 ms for a comma-separated list. (Note that these numbers apply to the specific hardware that I used for the tests.) For a TVP with a primary key, the overhead was around 150 ms.

While this overhead may seem considerable, you need to put it in perspective and look at the total execution time, and in most cases, the server-side execution time exceeds the numbers in the previous paragraph with a wide margin. As just one data point: in my test, the server-side execution time for my join test over 50 000 list elements was 213 ms for a non-indexed TVP, and the best non-TVP method (fixed-length binary input) needed 420 ms. The performance appendix for SQL 2008 has more details. As for the extra overhead when there is a primary key, we will discuss this more closely in the next section.

Primary Keys and Sorted Data – Looking Closer at the SqlMetaData Constructors

The SqlMetaData class has no less than 15 constructors. They control in total 17 read-only properties – i.e., once set you can't change them. To a great extent which constructor to use depends on the data type. For a string or a binary column you use a constructor that includes the maxLength parameter, for a decimal column you need one that exposes scale and precision, etc. I am not covering all constructors and properties here, but I refer you to the .NET documentation. Here I will discuss four parameters that control special properties for table-valued parameters. They appear in several constructors, and a constructor either has all four or none of these parameters. Here is the C# declaration for the simplest of these constructors:

   public SqlMetaData(
      string    name,
      SqlDbType dbType,
      bool      useServerDefault,
      bool      isUniqueKey,
      SortOrder columnSortOrder,
      int       sortOrdinal)

The first of these parameters, useServerDefault, serves a different purpose than the other three. You may guess from the name what it is all about, but your guess may not be exactly right. When you specify this parameter as true, SQL Server will ignore any value you set for the column but always set the column to its default value. Sounds corny? Here is the scoop: the SqlDataRecord must have exactly as many columns as your table type has. But what if your table type includes an IDENTITY column or a computed column which you cannot assign values to? It is for that sort of columns you specify useServerDefault as true. It's also useful for columns with a default of newid() or NEXT VALUE FOR. (The latter is for sequences, a feature added in SQL 2012.)
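As a hedged illustration (the type and column names here are invented for the example): say that a table type has an IDENTITY column first and an nvarchar(50) column after it. Then the record can be set up like this, and only the second column is ever assigned a value:

   SqlMetaData[] with_identity =
      { new SqlMetaData("id",   SqlDbType.Int, true,           // useServerDefault = true
                        false, SortOrder.Unspecified, -1),
        new SqlMetaData("name", SqlDbType.NVarChar, 50) };

   SqlDataRecord rec = new SqlDataRecord(with_identity);
   rec.SetString(1, "some value");   // Column 0 is left alone; SQL Server fills it in.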
The other three parameters, isUniqueKey, columnSortOrder and sortOrdinal, are related and they exist in order to permit a performance enhancement. But before we can discuss what purpose they serve and how they work, we need to take one step back and look at the declaration for the table type we used with get_product_names:

   CREATE TYPE integer_list_tbltype AS TABLE (n int NOT NULL PRIMARY KEY)

The table type has a primary key, and thus it assumes that the values in the TVP are unique. Is this a good thing? To start with, when you design your tables, you should always look for a natural primary key, and this includes table variables and temp tables. One reason is that if you write your code under the assumption that a certain column or a set of columns is unique, you should also state this in the table declaration as an assertion. If your assumption is incorrect, your code will die early and not produce incorrect results.

But there is also a performance aspect. Let's look at the code for get_product_names again:

   CREATE PROCEDURE get_product_names @prodids integer_list_tbltype READONLY AS
      SELECT p.ProductID, p.ProductName
      FROM   Northwind.dbo.Products p
      WHERE  p.ProductID IN (SELECT n FROM @prodids)
For SQL Server, the query is equivalent to:

   SELECT p.ProductID, p.ProductName
   FROM   Northwind.dbo.Products p
   JOIN   @prodids ps ON ps.n = p.ProductID

If the table type did not have a primary key, this would not be true. Instead the equivalent query would be:

   SELECT DISTINCT p.ProductID, p.ProductName
   FROM   Northwind.dbo.Products p
   JOIN   @prodids ps ON ps.n = p.ProductID

That is, SQL Server would have to add an operator somewhere to remove duplicate values. This comes with an extra cost. Of course, for four values in the TVP this is entirely negligible, but assume that there are 50 000 values. Now the difference starts to be measurable, and you can see this in the performance appendix.

However, as I noted above, I found in my performance tests that there is a considerable difference in overhead between passing data to a TVP with a primary key and one without. And indeed, from what I have said this far, this is a zero-sum game. If the TVP has a primary key, there is no need for a Sort or Hash operator in the query above to remove duplicates. But when the data arrives, SQL Server must sort it so that it can be stored according to the index. Only if the TVP is used in more than one query is there a performance gain with the primary key.

If we don't know anything about the data we are passing to SQL Server, we can't do any better. But what if we know that the data already is sorted according to the index? This is where the three parameters isUniqueKey, columnSortOrder and sortOrdinal come into play. They permit you to specify that the data is sorted and how. isUniqueKey should be true in this case. columnSortOrder can take any of the values SortOrder.Unspecified, SortOrder.Ascending and SortOrder.Descending. (The SortOrder enum is in the System.Data.SqlClient namespace.) For sorted data you would use either of the latter two; Unspecified is the value you use when you use a constructor with these parameters only to be able to specify true for useServerDefault. Finally, sortOrdinal specifies where in the unique key the column appears. Use 0 for the first column in the key, 1 for the second etc. Use -1 for SortOrder.Unspecified. (If you want to see an example of this, stay tuned. They will be coming.)

You need to use these parameters with care. It goes without saying that you need to ensure that the data you have really is sorted. If you sort the data or create the sort keys yourself, you have control, but it may be precarious to rely on data coming from an outside source to be sorted. If you are mistaken, SQL Server will not let you get away with it, but will produce an error message like this one:

   Msg 4819, Severity 16, State: 1, Procedure , Line no: 0
   Cannot bulk load. The bulk data stream was incorrectly specified as sorted or the
   data violates a uniqueness constraint imposed by the target table. Sort order
   incorrect for the following two rows: primary key of first row: (gamma), primary
   key of second row: (delta).
   Msg 3621, Severity 0, State: 0, Procedure , Line no: 1
   The statement has been terminated.

There is another thing to watch out for, and in this case SQL Server will stay silent.
Inspired by what we have read, we may get the idea to change the constructor for the CSV_splitter class, so that outrec is created in this way:

   this.outrec = new SqlDataRecord(
                 new SqlMetaData("nnn", SqlDbType.BigInt,
                                 false, true, SortOrder.Ascending, 0));

If you make this change and then run the CSVdemo program, you will find that it runs just fine. But wait! In the procedure CSV_to_SQL there is this line:

   cmd.Parameters("@prodids").Value = New TVPDemo.CSV_splitter("1, 11, 76, 34")

Data is out of order, so an error message is to be expected. Still we did not get any. Why? Recall that CSV_splitter uses BigInt to be as reusable as possible, while the table type has an integer column. Because of the data-type mismatch, SQL Server decides to ignore the information that the data is sorted and sorts it anyway. If you change the type to SqlDbType.Int and try again, you will get the error message above.

Thus, to be sure that SQL Server does not decide to sort behind your back, you should make sure that you create the SqlDataRecord object so that it matches your table type exactly. To be precise, you can have a mismatch as long as SQL Server feels that it can trust the
conversion to not affect the sort order. If you want to be sure, the simplest way to test is to send data out of order. If you get error message 4819, the plot worked, else it did not. You can also use Profiler, include the event Performance: Showplan XML Statistics Profile and run the application. If you also add SP:StmtCompleted you will see the insertion into the TVP as encrypted text. This helps you to locate the query plan, and it should not include a Sort or Hash operator.

Character data is particularly difficult in this context. The first thing to note is that the length must match. That is, if you define the column in .NET as

   new SqlMetaData("charcol", SqlDbType.NVarChar, 120,
                   false, true, SortOrder.Ascending, 0);

but the target column is nvarchar(20), SQL Server will ignore your sorting parameters and sort the data. Another complication is that character data can be sorted in many ways, that is, according to different collations. If you look through the constructors for SqlMetaData you will find two parameters, localeID and compareOptions, which seem like they could be used to specify the collation. I tested this, but I found that they had no effect. From what I can tell, SQL Server assumes that character data is always sorted according to the database collation. If the data you send with the TVP is sorted according to a different collation, you will get an error once there is a deviation. You can of course specify an explicit collation for the column in the table type, and it may save you from error messages about data being non-unique. However, my testing indicates that if a key column has a different collation from the database collation, SQL Server will ignore the sorting parameters and always sort the incoming data stream.

Loading Data through Table-Valued Parameters

We will look at one more example. This time we will see how we can use table-valued parameters to easily load lots of data to SQL Server. There are several other options for this task: BCP, BULK INSERT, SQL Server Integration Services and the SqlBulkCopy class. But none of these options permit you to send data directly to a stored procedure. We will learn two ways to do this: the plain way, where we read the file into memory, and a more efficient way, where we stream the file to the TVP. I've taken the opportunity to cover some ground beyond the topic of TVPs, so you may learn some other tricks in this chapter as well.

The Setup

For this example we will look at loading data into these two tables:

   CREATE TABLE Albums (AlbumID     int           IDENTITY,
                        Artist      nvarchar(200) NOT NULL,
                        Title       nvarchar(200) NOT NULL,
                        ReleaseDate date          NULL,
                        Length      time(0)       NULL,
                        CONSTRAINT pk_Albums PRIMARY KEY (AlbumID)
   )

   CREATE TABLE Tracks (AlbumID  int           NOT NULL,
                        TrackNo  tinyint       NOT NULL,
                        Title    nvarchar(200) NOT NULL,
                        Length   time(0)       NULL,
                        CONSTRAINT pk_Tracks PRIMARY KEY (AlbumID, TrackNo),
                        CONSTRAINT fk_Tracks_Albums FOREIGN KEY (AlbumID)
                           REFERENCES Albums (AlbumID)
   )

We have a music collection: Albums includes information about an album, and Tracks details the tracks of the albums. All in all, a fairly typical master-detail scenario. These table definitions, as well as other SQL code in this chapter, are included in the file fileloaddemo.sql which you find among the demo files.

Our task is to load new albums with their tracks into the database, from the file Albums.csv which is also included in the demo files.
Here are some sample lines from this file:

   A,Adrian Belew,Desire Caught By the Tail,,33:25
   T,1,Tango Zebra,458553,
   T,2,Laughing Man,332460,
   T,3,The Gypsy Zurna,187141,
   T,4,Portrait of Margaret,240718,
   T,5,Beach Creatures Dancing Like Cranes,197564,
   T,6,At the Seaside Cafe,113319,
   T,7,Guernica,127216,
   T,8,"""Z""",338416,
   A,"Al di Meola, John McLaughlin, Paco de Lucia",Friday Night in San Francisco,8/10/1981,42:09
   T,1,A. Mediterranean Sundance-B. Rio Ancho,708780,
   T,2,Short Tales of the Black Forest,535484,
   T,3,Frevo Rasgado,486608,
   T,4,Fantasia Suite,541492,
   T,5,Guardian Angel (McLaughlin),247066,
   A,David Bowie,"""Heroes""",14/10/1977,40:56
   T,1,Beauty and the Beast,217182,
The first field defines whether the line contains an album (A) or a track (T). On an Album line, the remaining fields are Artist, Album title, Release date and Length in minutes and seconds. On a Track line, the fields are Track number, Track title and Length in milliseconds. That is, the fields are the same as in the tables, except for one thing: there is no AlbumID. It is part of our loading task to assign this id.

As for the format, you can note that some fields are quoted in double quotes, but this happens only when the field includes a comma or a double quote. Some fields include a plethora of double quotes; this happens when the double quotes are part of the value. (You may recall that the name of David Bowie's classic album from 1977 really is "Heroes", with quotes and all.)

An aside: you cannot load this file with BCP or BULK INSERT in a simple way. To start with, they cannot really cope with master-detail formats at all, but you would have to load the data into a staging table to be able to separate albums and tracks. And this is only possible if there is an equal number of fields on each line. As it happens, Excel – which I used to create this file – was kind enough to add an extra comma at the end of the Track lines, so this is not an issue here. Instead, the real killer is the inconsistent quoting. As long as a field is consistently quoted through a file, you can load quoted fields with BCP or BULK INSERT, if you use a format file that specifies delimiters that include the quotes. But in this file, where only some values are in quotes and where these values include the field delimiter, BCP and BULK INSERT are completely lost. These tools are designed to read a binary stream, and they do not do string parsing.

We need two table types and a stored procedure. The table types mirror the file with one addition:

   CREATE TYPE Albums_tbltype AS TABLE
        (TempID      int            NOT NULL,
         Artist      nvarchar(200)  NOT NULL,
         Title       nvarchar(200)  NOT NULL,
         ReleaseDate date           NULL,
         Length      time(0)        NULL,
         PRIMARY KEY (TempID)
   )

   CREATE TYPE Tracks_tbltype AS TABLE
        (TempID  int            NOT NULL,
         TrackNo tinyint        NOT NULL,
         Title   nvarchar(200)  NOT NULL,
         Length  time(0)        NULL,
         PRIMARY KEY (TempID, TrackNo)
   )
   go

Since there is no album ID in the file, the loading process must assign new ids. As long as we have the file, we know which tracks go with which albums, since the file is ordered. But when we load the data into different tables, that order is lost, since tables are unordered objects by definition. For this reason, both table types include a column TempID, which is a temporary ID that uniquely identifies an album during the loading process.
The stored procedure is worth dwelling on for an extra second:

   CREATE PROCEDURE LoadAlbums @Albums Albums_tbltype READONLY,
                               @Tracks Tracks_tbltype READONLY AS

   DECLARE @idmap TABLE (TempID  int NOT NULL PRIMARY KEY,
                         AlbumID int NOT NULL UNIQUE)

   SET XACT_ABORT ON

   BEGIN TRANSACTION
   MERGE Albums A
   USING @Albums T ON 1 = 0
   WHEN NOT MATCHED BY TARGET THEN
      INSERT(Artist, Title, ReleaseDate, Length)
         VALUES(T.Artist, T.Title, T.ReleaseDate, T.Length)
      OUTPUT T.TempID, inserted.AlbumID INTO @idmap(TempID, AlbumID)
   ;

   INSERT Tracks(AlbumID, TrackNo, Title, Length)
      SELECT i.AlbumID, T.TrackNo, T.Title, T.Length
      FROM   @Tracks T
      JOIN   @idmap i ON i.TempID = T.TempID

   COMMIT TRANSACTION
   go

To start with, the procedure sets up a user-defined transaction so that we don't end up loading only the albums. While I am a strong advocate of error handling, I don't use TRY-CATCH here. In the interest of brevity, I let it suffice with SET XACT_ABORT ON to make sure that any error aborts and rolls back the transaction. (If you want directions for error handling, please see my article Error Handling in SQL Server 2005 and Later.)

What may surprise the reader is the MERGE statement. This is a pure insert operation (for the sake of the example, I am completely ignoring that the album may already be in the database), so why use MERGE? And with that weird condition 1 = 0? By using this condition we make sure that no rows in the source match the target. That is, all rows in @Albums will match the condition WHEN NOT MATCHED BY TARGET, and thus all rows in @Albums will be inserted into Albums. Or in other words, this is a complicated way of saying:

   INSERT Albums(Artist, Title, ReleaseDate, Length)
      SELECT Artist, Title, ReleaseDate, Length
      FROM   @Albums

Why all this? The answer lies in the OUTPUT clause. We need to map the TempID in @Albums to the IDENTITY values generated for AlbumID in Albums, so that we can insert the correct AlbumID values into Tracks, and this is the purpose of the table variable @idmap. If you try to make the mapping with INSERT, you will find that this does not work, because in the OUTPUT clause for INSERT you only have access to the columns in the target table. This is different with MERGE; with MERGE you have access to both target and source columns in the OUTPUT clause.

When inserting into Tracks there is no need for extra fireworks, and we can use a plain INSERT where we pick up the album IDs from the @idmap table.

Take One: Reading the File Into a List

There are two example programs to load the file, and we will first look at fileloaddemo1.cs, which reads the file into two List<SqlDataRecord>, one for albums and one for tracks. This program starts off with a number of using clauses, of which one may be surprising:

   using System;
   using System.Data;
   using System.Data.SqlClient;
   using System.Collections.Generic;
   using Microsoft.SqlServer.Server;
   using Microsoft.VisualBasic.FileIO;

System is needed as always of course, and System.Data includes SqlDbType and more. SqlClient is what this text is all about. We need System.Collections.Generic for the class List<T>, and as noted previously we get SqlDataRecord and SqlMetaData from Microsoft.SqlServer.Server. But staunch fans of C# may be appalled by the appearance of Visual Basic here.

As I pointed out above, the format of this file is somewhat complex. While I could have written the code to parse the lines on my own, I said to myself "this file has been generated by Excel; there must be code out there that performs this task". So I did a search on Google, and I was quickly pointed to the class TextFieldParser that exists in the namespace Microsoft.VisualBasic.FileIO.
If you want to use this class from C#, you need to add a reference to Microsoft.VisualBasic.dll. VB programmers get this DLL automatically.

Fileloaddemo1 includes two routines of interest, read_file and load_albums. The latter first calls read_file and then calls the
stored procedure LoadAlbums. We will look at read_file first. Here is the declaration:

   private static void read_file(string filename,
                                 out List<SqlDataRecord> albums,
                                 out List<SqlDataRecord> tracks) {

It accepts a file name and returns album and track data in the two List parameters. The first few lines in read_file are pretty dull:

   System.Globalization.DateTimeFormatInfo no_culture =
          new System.Globalization.DateTimeFormatInfo();
   System.Globalization.DateTimeStyles no_datetime_style =
          System.Globalization.DateTimeStyles.None;
   int albumno = 0;

The two items from System.Globalization are some jazz needed when we parse the date and time fields; I'll return to them later. The variable albumno is more interesting: this variable will feed the TempID columns in the table parameters.

The next two statements are significantly hotter, because this is where we set up the SqlMetaData definitions that map to our table types:

   SqlMetaData[] albums_tbltype =
       { new SqlMetaData("id", SqlDbType.Int, false,
                         true, SortOrder.Ascending, 0),
         new SqlMetaData("artist", SqlDbType.NVarChar, 200),
         new SqlMetaData("album", SqlDbType.NVarChar, 200),
         new SqlMetaData("released", SqlDbType.Date),
         new SqlMetaData("length", SqlDbType.Time, 0, 0)};

   SqlMetaData[] tracks_tbltype =
       { new SqlMetaData("id", SqlDbType.Int, false,
                         true, SortOrder.Ascending, 0),
         new SqlMetaData("trackno", SqlDbType.TinyInt, false,
                         true, SortOrder.Ascending, 1),
         new SqlMetaData("title", SqlDbType.NVarChar, 200),
         new SqlMetaData("length", SqlDbType.Time, 0, 0)};

In difference to the CSV_splitter class, we don't create any SqlDataRecord at this point; since we are adding to a List, we will need a new SqlDataRecord object each time. Hence, we only create the SqlMetaData arrays in advance.

Here we see some more examples of using the special parameters of the SqlMetaData constructor to specify that the data is sorted. For albums_tbltype there is a single column in the sort key, while for tracks_tbltype there is a composite key, and as you see we specify that both columns are unique. We set the parameter sortOrdinal to 0 and 1 respectively. Admittedly, to some extent this contradicts what I said previously about ensuring that the data is sorted. The id columns are no problem; we are generating the id values in our code and we have full control over them. But the track numbers come from the file, and in a real-world scenario we may not be able to rely on the track numbers coming in numeric order.

Here are also examples of SqlMetaData constructors where we specify the length for the string columns. For the time columns we need to use a constructor that exposes scale and precision, even if time only has one of them.

In the C# version we create the lists at this point:

   albums = new List<SqlDataRecord>();
   tracks = new List<SqlDataRecord>();

(In the VB code this happens in load_albums, since VB did not seem to like it when I passed uninitialised variables.)

Next we open the file by creating a TextFieldParser object:

   TextFieldParser fp = new TextFieldParser(filename,
                                            System.Text.Encoding.Default);

In total, this class offers eight different constructors. For this demo, we use one where we pass the name of the file (there are also constructors that accept a Stream object instead) and the encoding. The default for the TextFieldParser is UTF-8, but CSV files from Excel appear to always be ANSI files.
(And since one of the tracks from "Heroes" is called Neuköln, it matters for the sample file.)
We need to configure our newly created object:

   fp.TextFieldType = FieldType.Delimited;
   fp.Delimiters = new String[] {","};
   fp.HasFieldsEnclosedInQuotes = true;

The TextFieldParser can handle both delimited files and fixed-length formats. Here we set up the file to be comma-delimited. We also specify that there are fields enclosed in quotes. Yet another option, which we don't make use of, is to specify comment tokens.

Once this is done, we have completed the preparations and can read the file:

   while (! fp.EndOfData) {
      String[] fields = fp.ReadFields();

The ReadFields method consumes the next set of fields and returns them in a string array. (If the file does not comply with the expected format, the method will throw an exception, but I have not included any error handling to keep the example down in length.) Depending on fields[0] we take different paths:

   if (fields[0] == "A") {
      SqlDataRecord album_rec = new SqlDataRecord(albums_tbltype);
      album_rec.SetInt32(0, ++albumno);
      album_rec.SetString(1, fields[1]);
      album_rec.SetString(2, fields[2]);

If we have an A in the first field, we create an SqlDataRecord that aligns with the table type for albums, and then we go on and populate the fields, using various Set methods of the SqlDataRecord class. Above, we save the temporary id (which we first increment), the artist name and the album title. As you see, we refer to the columns by number, starting at 0. If you prefer to access the columns by name, you need to use the GetOrdinal method:

   album_rec.SetInt32(album_rec.GetOrdinal("id"), ++albumno);
   album_rec.SetString(album_rec.GetOrdinal("artist"), fields[1]);
   album_rec.SetString(album_rec.GetOrdinal("album"), fields[2]);

Note here that you need to use the name you specified in the SqlMetaData constructor; you cannot use the names in the table type.

The ReleaseDate column is a little more complex, for two reasons: it is permitted to be NULL, and date formats are always problematic. Here is the code:

   DateTime releasedate;
   bool date_ok = DateTime.TryParseExact(
        fields[3], "d/MM/yyyy", no_culture, no_datetime_style,
        out releasedate);
   if (date_ok) {
      album_rec.SetDateTime(3, releasedate);
   }
   else {
      album_rec.SetDBNull(3);
   }

We use TryParseExact to see if there is a legit date in the field. It just so happens that the dates in the file are in the format DD/MM/YYYY, because I generated the CSV file with my regional settings set to English (Australia). (Had I used my regular Swedish settings, the CSV file would have had semicolon as delimiter, which would have been less interesting with regard to the double quotes.) When reading dates from text input – be that a file or a text box – you should never assume that all dates are well-formed. There may be a mix of different date formats, and there may be completely bogus dates like 1992-02-30. If the parsing succeeds, we set the date column in album_rec, else we set it to NULL.

When I composed this demo program, the release date proved to be the most difficult thing to get right. It turned out that it is not sufficient to specify an exact date format. When I tested the program, I had switched back to Swedish settings where the date format is YYYY-MM-DD. Eventually I found that I could not leave the third parameter null, but I had to use an explicit value to state that I wanted to ignore regional settings, whence this no_culture.
no_datetime_style is the value for an enum parameter which is mandatory with TryParseExact.
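The actual declarations of no_culture and no_datetime_style live elsewhere in fileloaddemo1.cs. For readers who want to follow along, a minimal sketch of how they could be declared in C# – an assumption based on the corresponding Visual Basic declarations shown later in the article – is:

   // Sketch only: a culture-independent format provider and a "no special
   // styles" value, suitable as the third and fourth arguments to
   // DateTime.TryParseExact.
   System.Globalization.DateTimeFormatInfo no_culture =
       new System.Globalization.DateTimeFormatInfo();
   System.Globalization.DateTimeStyles no_datetime_style =
       System.Globalization.DateTimeStyles.None;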
The last album field is the length, which is handled similarly:

         DateTime length;
         bool length_ok = DateTime.TryParseExact(
                             fields[4], new string [] {"h:m:ss", "m:ss"},
                             no_culture, no_datetime_style,
                             out length);
         if (length_ok) {
            album_rec.SetTimeSpan(4, length.TimeOfDay);
         }
         else {
            album_rec.SetDBNull(4);
         }

The only thing that is different is that I permit time formats both with and without hours, since an album may exceed one hour in length. When all this is done, we add the record to the albums list:

         albums.Add(album_rec);

The code for dealing with the tracks data is similar, with a sanity check added:

      else if (fields[0] == "T") {
         if (album_no == 0) {
            throw new Exception("Bad file format: track rows before the first album row!");
         }
         SqlDataRecord track_rec = new SqlDataRecord(tracks_tbltype);
         track_rec.SetInt32(0, album_no);
         track_rec.SetByte(1, Convert.ToByte(fields[1]));
         track_rec.SetString(2, fields[2]);

         if (fields[3] != "") {
            TimeSpan length = TimeSpan.FromMilliseconds(Convert.ToInt32(fields[3]));
            track_rec.SetTimeSpan(3, length);
         }
         else {
            track_rec.SetDBNull(3);
         }
         tracks.Add(track_rec);
      }

Here is the code for load_albums:

   private static void load_albums() {
      List<SqlDataRecord> albums;
      List<SqlDataRecord> tracks;

      read_file("albums.csv", out albums, out tracks);

      using (SqlConnection cn = TVPDemo.DemoHelper.setup_connection())
      using (SqlCommand cmd = cn.CreateCommand()) {
         cmd.CommandType = CommandType.StoredProcedure;
         cmd.CommandText = "dbo.LoadAlbums";

         cmd.Parameters.Add("@Albums", SqlDbType.Structured);
         cmd.Parameters["@Albums"].Direction = ParameterDirection.Input;
         cmd.Parameters["@Albums"].TypeName = "Albums_tbltype";
         cmd.Parameters["@Albums"].Value = albums;

         cmd.Parameters.Add("@Tracks", SqlDbType.Structured);
         cmd.Parameters["@Tracks"].Direction = ParameterDirection.Input;
         cmd.Parameters["@Tracks"].TypeName = "Tracks_tbltype";
         cmd.Parameters["@Tracks"].Value = tracks;
         cmd.ExecuteNonQuery();
      }
   }

It first calls read_file to fill albums and tracks, and then it calls the stored procedure LoadAlbums. The only difference to the code we saw for the comma-separated list is this line:

         cmd.Parameters["@Albums"].Value = albums;

That is, we pass a List object and not a custom-written iterator. While this code may seem trivial, it is worth emphasising the flexibility. Here we pass a List object, but if we change our mind and want to pass something else, we can do that very easily. In fact, this is exactly what we will do in a second.

Take Two: Streaming the File

The sample file that comes with this article is short; there are only five albums. But imagine that you have a very large file, tens of megabytes in size. With the solution above, you would have to read the entire file into memory before you start sending the data to SQL Server. Is that really necessary? No, and you might already have guessed how we can approach this. If we can write a custom iterator for a comma-separated list, we should be able to write an iterator that reads the file, so that SqlClient can send a row to SQL Server as soon as we have read it.

Now, the fact that this is a master-detail file makes it a little more complicated. If we have two TVPs, we would need two iterator classes, one for albums and one for tracks. And these classes would both have to read the file, which thus would be read twice. To avoid this, I decided to use a single table type that can accommodate both row types. Depending on how different headers and details are from each other, this can be quite messy. Thankfully, our albums-and-tracks example is quite forgiving in this sense. (At this point I can sense an objection from some readers who think that it is possible to have two table types and still only read the file once. Permit me to come back to this idea after I have gone through the streaming example.)

Here is the table type (which you also find in fileloaddemo.sql):

   CREATE TYPE AlbumTracks_tbltype AS TABLE
       (TempID       int            NOT NULL,
        TrackNo      tinyint        NOT NULL,
        Artist       nvarchar(200)  NULL,
        Title        nvarchar(200)  NOT NULL,
        ReleaseDate  date           NULL,
        Length       time(0)        NULL,
        PRIMARY KEY (TempID, TrackNo),
        CHECK (TrackNo = 0 AND Artist IS NOT NULL
               OR TrackNo > 0 AND Artist IS NULL
                  AND ReleaseDate IS NULL)
   )

I did not add the A/T field to the table type; instead I use TrackNo as the distinguishing column; TrackNo = 0 indicates that this is a header row. I've also added constraints to state rules that are unique for album rows (Artist must be present) and track rows (must not have Artist and ReleaseDate). Such CHECK constraints help to detect errors in the client program.

To use this table type, there is a second stored procedure, similar to the one we looked at previously:

   CREATE PROCEDURE LoadAlbums_2 @AlbumTracks AlbumTracks_tbltype READONLY AS
   DECLARE @idmap TABLE (TempID  int NOT NULL PRIMARY KEY,
                         AlbumID int NOT NULL UNIQUE)

   SET XACT_ABORT ON
   BEGIN TRANSACTION

   MERGE Albums A
   USING (SELECT TempID, Artist, Title, ReleaseDate, Length
          FROM   @AlbumTracks
          WHERE  TrackNo = 0) AT ON 1 = 0
   WHEN NOT MATCHED BY TARGET THEN
      INSERT(Artist, Title, ReleaseDate, Length)
      VALUES(AT.Artist, AT.Title, AT.ReleaseDate, AT.Length)
   OUTPUT AT.TempID, inserted.AlbumID INTO @idmap(TempID, AlbumID)
   ;
   INSERT Tracks(AlbumID, TrackNo, Title, Length)
      SELECT i.AlbumID, AT.TrackNo, AT.Title, AT.Length
      FROM   @AlbumTracks AT
      JOIN   @idmap i ON i.TempID = AT.TempID
      WHERE  AT.TrackNo > 0

   COMMIT TRANSACTION

Among the demo files, there is a sample program fileloaddemo2.vb that calls this procedure. Here is the code that calls LoadAlbums_2:

   Private Sub LoadAlbums()
      Using cn As SqlConnection = TVPDemo.DemoHelper.SetupConnection(), _
            cmd As SqlCommand = cn.CreateCommand()
         cmd.CommandType = CommandType.StoredProcedure
         cmd.CommandText = "dbo.LoadAlbums_2"

         cmd.Parameters.Add("@AlbumTracks", SqlDbType.Structured)
         cmd.Parameters("@AlbumTracks").Direction = ParameterDirection.Input
         cmd.Parameters("@AlbumTracks").TypeName = "AlbumTracks_tbltype"
         cmd.Parameters("@AlbumTracks").Value = _
             New TVPDemo.AlbumReader("Albums.csv")

         cmd.ExecuteNonQuery()
      End Using
   End Sub

You have seen this pattern a couple of times now. What is different from fileloaddemo1 is that there is no call to read_file; instead there is an instantiation of the class TVPDemo.AlbumReader. This is analogous to when we worked with comma-separated strings of integers, and when we look inside TVPDemo.AlbumReader.vb there is a mix of what we saw in CSV_splitter.cs and fileloaddemo1.cs. The most startling difference may be that this time I show the code in Visual Basic... So I will rush through the code fairly quickly. The important takeaway is that writing a class that streams a file to a TVP is by no means complicated.

Here is the Imports section:

   Imports System
   Imports System.Data
   Imports System.Collections.Generic
   Imports Microsoft.SqlServer.Server
   Imports Microsoft.VisualBasic.FileIO

Again Microsoft.VisualBasic.FileIO is featured, but I would like to remind you that the choice of the TextFieldParser class is due to the specific file format. While it is likely to be useful for CSV files in general, you may have a file format for which it is less suitable. In particular, it is not the case that you must use this class just because you are streaming to a TVP.

The class declaration:

   Public Class AlbumReader
      Implements IEnumerable(Of SqlDataRecord), _
                 IEnumerator(Of SqlDataRecord)

is no different from the CSV_splitter: I implement both interfaces in the same class. There are some global members:

   Dim AlbumNo As Integer        ' Current album.
   Dim fp As TextFieldParser     ' Our file-reading class.
   Dim OutRec As SqlDataRecord   ' The record we use to return data.

   Dim NoCulture As New System.Globalization.DateTimeFormatInfo
   Dim NoDateTimeStyle As System.Globalization.DateTimeStyles = _
       System.Globalization.DateTimeStyles.None