When it comes to data access performance of a .NET application, many people focus on tuning and optimizing the backend data sources. But there is a lot to gain from simply tuning your connectivity solution to the needs of your application.
This presentation focuses on how to tune your ADO.NET connectivity solution to get the best performance out of it, with examples.
Introduction of Presenter
10 Years of Experience in Database driver development
Product Owner of ODBC, JDBC and ADO.NET Drivers. Owning the ADO.NET products for the last 3 years.
The ADO.NET product line currently supports connectivity to Oracle, DB2, Microsoft SQL Server and Sybase ASE.
Entity Framework 6.0 Support for Oracle and DB2 for i.
What we are going to cover
We are not covering Entity Framework in this session; if you are interested, please get back to us and we will try to schedule something.
We will stay focused on core ADO.NET connectivity to relational databases for this session.
Different components of performance and how each of them impacts it.
Database – Database tuning, indexing and normalization are still an important part of performance improvement. CPU, memory, network bandwidth and sharing of resources all become important.
Network – This is one of the critical resources that impact performance. The fewer the network round trips, the better the performance.
Data provider – Provider architecture, configuration and how the provider handles system resources are key factors.
Data Access Code – Well-written data access code helps the data provider optimize executions and improve performance; we will touch on this.
Client Infrastructure – Not only the database server, but also the resources available on the client play an important part in the overall performance of the application.
Performance Strategy:
Usually we build an application to meet a business requirement, and then, once we notice a performance issue, we reactively try to fix the performance problems.
This is very costly as opposed to proactively designing the application with performance as an important design criterion. Here are some design guidelines on how to make the best use of your data adapter in your application...
Managed vs. Unmanaged
Managed architecture provides a lot more than just performance; the garbage collector, type safety and code access security are just a few advantages to mention.
Unmanaged calls (P/Invoke) require a security context switch, which has a high cost.
The data provider should be smart enough to utilize the resources on both the server and the client wisely to improve performance; each of these sub-items can impact the performance of your application.
Distributed-transaction-enabled connections take more time and server resources to create on some backends, e.g. DB2.
We will discuss each of them in the coming slides.
Connections – Opening a Connection is very expensive
Sets up the environment on both the client and server side with the required memory buffers.
Opens sockets/communication channels and performs certificate/key validation in the case of SSL/encryption.
Connection pooling is a mechanism to reuse an existing set of connections and avoid repeated open and close calls for better performance.
Fixed size of pool
Strategy 1: Fill up the pool with connections at application startup. Drawback: slow start.
Strategy 2: Open a connection only when needed; if a matching connection is found in the pool it is loaded from there, otherwise a new one is created. Close it as soon as the work is done so that it is available for the next request (see the sketch after this list).
More complex strategies such as FIFO, LIFO, etc.
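A minimal sketch of strategy 2, using the SQL Server provider for illustration; the server, database, table name and pool-size values are placeholders, and the pooling keywords differ between providers:

```csharp
using System;
using System.Data.SqlClient;

class PoolingSketch
{
    // Pooling is configured through the connection string; "Pooling",
    // "Min Pool Size" and "Max Pool Size" are the SQL Server provider's keywords.
    const string ConnStr =
        "Server=myServer;Database=myDb;Integrated Security=true;" +
        "Pooling=true;Min Pool Size=5;Max Pool Size=50";

    static int CountOrders()
    {
        // Open late, close early: the connection is taken from the pool on Open()
        // and returned to it on Dispose(), so it is free for the next request.
        using (var conn = new SqlConnection(ConnStr))
        using (var cmd = new SqlCommand("SELECT COUNT(*) FROM Orders", conn))
        {
            conn.Open();
            return (int)cmd.ExecuteScalar();
        }
    }
}
```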
The protocol packet size defines the number of bytes that can be transferred in a single network round trip. All database instances and data providers can be configured with a maximum value for this packet size.
This maximum packet size is used by the database server and data providers to initialize the internal arrays/structures required for communication.
The actual communication packet size is the minimum of the server's maximum packet size and the client's maximum packet size.
The larger the packet size, the fewer network round trips are needed to transfer the same data, and hence the better the performance.
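As an illustration, the SQL Server provider exposes this setting through the "Packet Size" connection string keyword (other providers expose a similar option under a different name); the server and database names below are placeholders:

```csharp
using System.Data.SqlClient;

// Request a larger packet size so large result sets travel in fewer round trips.
// The server may negotiate the value down to its own configured maximum.
var builder = new SqlConnectionStringBuilder
{
    DataSource = "myServer",          // placeholder
    InitialCatalog = "myDb",          // placeholder
    IntegratedSecurity = true,
    PacketSize = 32767                // SqlClient's maximum
};

using (var conn = new SqlConnection(builder.ConnectionString))
{
    conn.Open();
    // ... execute queries that return large result sets
}
```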
Distributed transaction: a transaction spanning multiple connections to the same database or across multiple databases.
Suppose a company has different databases for payroll and accounting. An employee salary operation needs to update both of them as a single atomic operation, as sketched below.
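A minimal sketch of such an atomic update using System.Transactions.TransactionScope; the connection strings, table and column names are illustrative placeholders:

```csharp
using System.Data.SqlClient;
using System.Transactions;

string payrollConnStr    = "Server=payrollServer;Database=Payroll;Integrated Security=true";
string accountingConnStr = "Server=acctServer;Database=Accounting;Integrated Security=true";

// Both updates commit or roll back together; with two different servers involved,
// the transaction is escalated to a distributed transaction.
using (var scope = new TransactionScope())
{
    using (var payroll = new SqlConnection(payrollConnStr))
    {
        payroll.Open();
        new SqlCommand("UPDATE Salaries SET Amount = Amount + 500 WHERE EmpId = 42", payroll)
            .ExecuteNonQuery();
    }

    using (var accounting = new SqlConnection(accountingConnStr))
    {
        accounting.Open();
        new SqlCommand("UPDATE Ledger SET Debit = Debit + 500 WHERE Account = 'PAYROLL'", accounting)
            .ExecuteNonQuery();
    }

    scope.Complete();   // omit this call and everything rolls back
}
```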
Auto-commit TRUE means a commit is performed after every SQL statement that is sent as a request to the server, i.e. INSERT, UPDATE, DELETE and SELECT.
Because database objects/rows/pages are locked by transactions, transactions have a negative impact on throughput. The choice of isolation level also determines this impact.
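For example, grouping several statements under one explicit transaction avoids a commit after every statement and lets you pick the least restrictive isolation level that is still correct; a minimal sketch using the SQL Server provider and placeholder names:

```csharp
using System.Data;
using System.Data.SqlClient;

string connStr = "Server=myServer;Database=myDb;Integrated Security=true";  // placeholder

using (var conn = new SqlConnection(connStr))
{
    conn.Open();
    // One explicit transaction instead of an auto-commit after each statement.
    using (var tx = conn.BeginTransaction(IsolationLevel.ReadCommitted))
    {
        var cmd = conn.CreateCommand();
        cmd.Transaction = tx;

        cmd.CommandText = "INSERT INTO AuditLog(Msg) VALUES('step 1')";
        cmd.ExecuteNonQuery();

        cmd.CommandText = "INSERT INTO AuditLog(Msg) VALUES('step 2')";
        cmd.ExecuteNonQuery();

        tx.Commit();   // a single commit for the whole unit of work
    }
}
```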
The server can reuse these cached query plans for multiple executions of the same statement without extra overhead.
Benefit 1: It avoids the extra round trip of preparing and executing separately.
Benefit 2: You don’t have to keep track of which statements you have prepared for multiple executions and which you have not.
Dynamic queries (which use parameters) perform better than static queries (which use literals), because a dynamic query can be prepared once and executed multiple times with different parameter values.
If your application executes the same statements multiple times, statement caching/pooling can help a lot with performance improvement.
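A minimal sketch of the prepare-once, execute-many pattern with a parameterized statement (SQL Server provider; the table and connection string are placeholders):

```csharp
using System.Data;
using System.Data.SqlClient;

string connStr = "Server=myServer;Database=myDb;Integrated Security=true";  // placeholder

using (var conn = new SqlConnection(connStr))
using (var cmd = new SqlCommand(
    "UPDATE Employees SET Salary = Salary * 1.03 WHERE DeptId = @dept", conn))
{
    conn.Open();
    var dept = cmd.Parameters.Add("@dept", SqlDbType.Int);
    cmd.Prepare();                      // explicit prepare; statement pooling can also do this implicitly

    foreach (int deptId in new[] { 10, 20, 30 })
    {
        dept.Value = deptId;            // only the parameter value changes
        cmd.ExecuteNonQuery();          // the prepared statement/plan is reused
    }
}
```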
In array binding, all the parameter values are passed to the server in a single go.
It reduces network round trips compared to the one-row-at-a-time approach.
If you set the DataAdapter.UpdateBatchSize property, data providers will internally use the array-binding mechanism to speed up execution.
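A minimal sketch of batched updates through UpdateBatchSize (SQL Server provider; table, column and connection string values are placeholders):

```csharp
using System.Data;
using System.Data.SqlClient;

string connStr = "Server=myServer;Database=myDb;Integrated Security=true";  // placeholder

var adapter = new SqlDataAdapter();
adapter.InsertCommand = new SqlCommand(
    "INSERT INTO Customers(Id, Name) VALUES(@Id, @Name)", new SqlConnection(connStr));
adapter.InsertCommand.Parameters.Add("@Id", SqlDbType.Int, 0, "Id");
adapter.InsertCommand.Parameters.Add("@Name", SqlDbType.VarChar, 100, "Name");
adapter.InsertCommand.UpdatedRowSource = UpdateRowSource.None;  // required when batching
adapter.UpdateBatchSize = 100;                                  // up to 100 rows per round trip

var table = new DataTable("Customers");
table.Columns.Add("Id", typeof(int));
table.Columns.Add("Name", typeof(string));
for (int i = 0; i < 1000; i++)
    table.Rows.Add(i, "Customer " + i);

adapter.Update(table);   // the new rows are sent in batches instead of one by one
```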
Bulk load is done through specialized protocols exposed by the database vendor for fast uploading of data.
Not all database vendors or data providers support this functionality.
Microsoft tools like SSIS can also benefit from bulk-load functionality to insert large amounts of data.
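For SQL Server this surfaces as SqlBulkCopy; other vendors expose their own bulk-load APIs. A minimal sketch with placeholder names:

```csharp
using System.Data;
using System.Data.SqlClient;

string connStr = "Server=myServer;Database=myDb;Integrated Security=true";  // placeholder

// Build (or stream) the data to load; any DataTable or IDataReader with matching columns will do.
var table = new DataTable();
table.Columns.Add("Id", typeof(int));
table.Columns.Add("Name", typeof(string));
for (int i = 0; i < 100000; i++)
    table.Rows.Add(i, "Customer " + i);

using (var bulk = new SqlBulkCopy(connStr))
{
    bulk.DestinationTableName = "Customers";
    bulk.BatchSize = 10000;        // rows per batch sent over the bulk-load protocol
    bulk.WriteToServer(table);
}
```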
Writing SQL Statements to Fetch limited LOB data
Example: the "Read More…" pattern (as on Facebook) – fetch only the first portion of a large value and load the rest on demand.
Oracle - LOB prefetch: this prefetches the LOB contents, chunk size and actual size along with the statement execution, without additional round trips. It is especially beneficial when LOB sizes are small.
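A minimal sketch of the "Read More" approach, written here with SQL Server's SUBSTRING/DATALENGTH syntax; table and column names are placeholders, and only a preview plus the total size are fetched up front:

```csharp
using System.Data.SqlClient;

string connStr = "Server=myServer;Database=myDb;Integrated Security=true";  // placeholder

using (var conn = new SqlConnection(connStr))
using (var cmd = new SqlCommand(
    "SELECT Id, SUBSTRING(Story, 1, 200) AS Preview, " +
    "       CAST(DATALENGTH(Story) AS BIGINT) AS FullSize FROM Posts", conn))
{
    conn.Open();
    using (var reader = cmd.ExecuteReader())
    {
        while (reader.Read())
        {
            string preview  = reader.GetString(1);   // first 200 characters only
            long   fullSize = reader.GetInt64(2);    // used to decide whether to show "Read More"
            // fetch the full LOB with a second, keyed query only when the user asks for it
        }
    }
}
```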
Boxing and unboxing are a performance hit compared to normal assignment operations. If you are fetching a high volume of data in your application, the performance hit from this can quickly become considerable.
The data types are listed in the order of the post-processing needed by the data providers after fetching them from the server.
If your application is fetching a huge amount of data from the server, choosing the right data types can save you valuable CPU cycles.
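For example, the typed accessors on a data reader return value types directly, while GetValue() returns object and therefore boxes every value; a minimal sketch with placeholder names:

```csharp
using System.Data.SqlClient;

string connStr = "Server=myServer;Database=myDb;Integrated Security=true";  // placeholder

using (var conn = new SqlConnection(connStr))
using (var cmd = new SqlCommand("SELECT Amount FROM Orders", conn))
{
    conn.Open();
    using (var reader = cmd.ExecuteReader())
    {
        decimal total = 0;
        while (reader.Read())
        {
            total += reader.GetDecimal(0);            // typed accessor: no boxing
            // total += (decimal)reader.GetValue(0);  // returns object: box + unbox per row
        }
    }
}
```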
ExecuteNonQuery
- Does not fetch a result set description, which saves valuable network round trips.
- Does not allocate/deallocate buffers to store result set descriptions on the client/application server side.
ExecuteScalar
- As it needs just the first column of the first row, it can limit the amount of data retrieved.
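A minimal sketch of picking the lightest execution method for the statement (SQL Server provider, placeholder table names):

```csharp
using System.Data.SqlClient;

string connStr = "Server=myServer;Database=myDb;Integrated Security=true";  // placeholder

using (var conn = new SqlConnection(connStr))
{
    conn.Open();

    // No result set expected: ExecuteNonQuery skips the result set description entirely.
    new SqlCommand("DELETE FROM Sessions WHERE Expired = 1", conn).ExecuteNonQuery();

    // Single value expected: ExecuteScalar reads only the first column of the first row.
    int active = (int)new SqlCommand("SELECT COUNT(*) FROM Sessions", conn).ExecuteScalar();
}
```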
CommandBuilder
You can manually write far better-performing UPDATE, INSERT or DELETE commands than CommandBuilder can possibly generate.
Though CommandBuilder is tempting because it speeds up development, it hampers application performance.
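As an illustration, a hand-written UpdateCommand can set only the columns that actually change and filter on the key alone, whereas a CommandBuilder-generated UPDATE typically sets every column and compares every original value; the names below are placeholders:

```csharp
using System.Data;
using System.Data.SqlClient;

string connStr = "Server=myServer;Database=myDb;Integrated Security=true";  // placeholder
var conn = new SqlConnection(connStr);

var adapter = new SqlDataAdapter("SELECT Id, Name, Address FROM Customers", conn);

// Hand-written command: only the changed column, keyed on the primary key.
var update = new SqlCommand("UPDATE Customers SET Name = @Name WHERE Id = @Id", conn);
update.Parameters.Add("@Name", SqlDbType.VarChar, 100, "Name");
update.Parameters.Add("@Id", SqlDbType.Int, 0, "Id");
adapter.UpdateCommand = update;      // used instead of new SqlCommandBuilder(adapter)
```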
DataReader vs. DataSet
If your application is fetching a huge amount of read-only data, a DataReader is the better choice, since it is more memory efficient.
If your application needs to work in a disconnected environment, with non-sequential updates, then a DataSet is the better choice.
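A minimal sketch contrasting the two (SQL Server provider, placeholder names):

```csharp
using System;
using System.Data;
using System.Data.SqlClient;

string connStr = "Server=myServer;Database=myDb;Integrated Security=true";  // placeholder

// Connected, forward-only streaming: the DataReader keeps only one row in memory at a time.
using (var conn = new SqlConnection(connStr))
using (var cmd = new SqlCommand("SELECT Id, Name FROM Customers", conn))
{
    conn.Open();
    using (var reader = cmd.ExecuteReader())
    {
        while (reader.Read())
            Console.WriteLine("{0} {1}", reader.GetInt32(0), reader.GetString(1));
    }
}

// Disconnected, updatable snapshot: the DataSet holds all rows in memory and can be
// modified offline, then synchronized back later through a DataAdapter.
var adapter = new SqlDataAdapter("SELECT Id, Name FROM Customers", connStr);
var ds = new DataSet();
adapter.Fill(ds, "Customers");
// ... edit ds while disconnected; adapter.Update(ds, "Customers") pushes the changes
// back once insert/update/delete commands are configured on the adapter
```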