Tuesday, June 26, 2012

Query to find the Database Name and File Size/Space Info

SELECT
    a.FILEID,
    CONVERT(decimal(12,2), ROUND(a.size/128.000, 2)) AS [FILESIZEINMB],
    CONVERT(decimal(12,2), ROUND(FILEPROPERTY(a.name, 'SpaceUsed')/128.000, 2)) AS [SPACEUSEDINMB],
    CONVERT(decimal(12,2), ROUND((a.size - FILEPROPERTY(a.name, 'SpaceUsed'))/128.000, 2)) AS [FREESPACEINMB],
    DB_NAME() AS [DATABASENAME],
    a.name AS [LOGICALFILENAME],
    a.FILENAME AS [PHYSICALFILENAME]
FROM dbo.sysfiles a

Note that sysfiles.name is the logical file name, not the database name, so DB_NAME() is used for the [DATABASENAME] column. Since size is stored in 8-KB pages, dividing by 128 converts it to MB.
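
On SQL Server 2005 and later, the catalog view sys.database_files supersedes dbo.sysfiles, which is kept only for backward compatibility. A minimal equivalent of the query above, as a sketch:

SELECT
    file_id,
    CONVERT(decimal(12,2), ROUND(size/128.000, 2)) AS [FILESIZEINMB],
    CONVERT(decimal(12,2), ROUND(FILEPROPERTY(name, 'SpaceUsed')/128.000, 2)) AS [SPACEUSEDINMB],
    CONVERT(decimal(12,2), ROUND((size - FILEPROPERTY(name, 'SpaceUsed'))/128.000, 2)) AS [FREESPACEINMB],
    DB_NAME() AS [DATABASENAME],
    name AS [LOGICALFILENAME],
    physical_name AS [PHYSICALFILENAME]
FROM sys.database_files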

Sunday, June 24, 2012

An Overview of Performance Tuning in SSIS - Buffer Usage and Execution Trees

When you architect data integration solutions, your design decisions not only determine how successfully your solution meets functional requirements, but also how well your solution meets performance requirements. To make the right performance design decisions, you need to understand the performance architecture of your data integration tool and, just as importantly, the techniques that enable you to maximize the tool’s utilization of system resources such as memory and CPU.
SSIS provides a variety of optimization opportunities to help you maximize resource utilization while successfully meeting the needs of your specific data integration scenario.
Before you can take advantage of specific tuning techniques in SSIS packages, it is important to familiarize yourself with the SSIS architecture. This architecture consists of two major components: the run-time engine and the data flow engine.

The run-time engine

The run-time engine is a highly parallel control flow engine that coordinates the execution of tasks or units of work within SSIS and manages the engine threads that carry out those tasks. For the most part, the performance of the run-time engine is most heavily influenced by conditions external to SSIS, such as network bandwidth, and interaction with external systems such as database servers, FTP servers, or email servers. When SSIS runs an Execute SQL Task, for example, it sends a call to the target database and then waits for a response from the database server before it continues. In this scenario, the performance of the Execute SQL Task is more dependent on the performance of the query execution than on the SSIS run-time engine.

The data flow engine

When you use SSIS for data integration, in addition to the run-time engine, you use the data flow engine that manages the data pipeline. The data flow engine is invoked by a special task in SSIS called the Data Flow task. When the Data Flow task executes, the SSIS data flow engine extracts data from one or more data sources, performs any necessary transformations on the extracted data, and then delivers that data to one or more destinations.
With data integration solutions, you will likely spend a large part of your performance tuning time optimizing the data flow engine. Like the run-time engine, the data flow engine is influenced by external conditions; however, within SSIS, there are a variety of settings that you can manipulate to tune the data flow engine’s performance based on the requirements of the data integration operations.

Buffer Usage

Behind the scenes, the data flow engine uses a buffer-oriented architecture to efficiently load and manipulate datasets in memory. The benefit of this in-memory processing is that you do not need to physically copy and stage data at each step of the data integration. Rather, the data flow engine manipulates data as it is transferred from source to destination.
As this data flows through the pipeline, SSIS attempts to reuse data from prior buffers as much as possible when additional operations are performed. How buffers are used and reused depends on the type of transformations that you use in a pipeline.
Row Transformations - Row transformations either manipulate data or create new fields using the data that is available in that row. Examples of SSIS components that perform row transformations include Derived Column, Data Conversion, Multicast, and Lookup. While these components might create new columns, row transformations do not create any additional records. Because each output row has a 1:1 relationship with an input row, row transformations are also known as synchronous transformations. Row transformations have the advantage of reusing existing buffers and do not require data to be copied to a new buffer to complete the transformation.
Partially blocking transformations - Partially blocking transformations are often used to combine datasets. They tend to have multiple data inputs. As a result, their output may have the same, greater, or fewer records than the total number of input records. Since the number of input records will likely not match the number of output records, these transformations are also called asynchronous transformations. Examples of partially blocking transformation components available in SSIS include Merge, Merge Join, and Union All. With partially blocking transformations, the output of the transformation is copied into a new buffer and a new thread may be introduced into the data flow.
Blocking transformations - Blocking transformations must read and process all input records before creating any output records. Of all of the transformation types, these transformations perform the most work and can have the greatest impact on available resources. Example components in SSIS include Aggregate and Sort. Like partially blocking transformations, blocking transformations are also considered to be asynchronous. Similarly, when a blocking transformation is encountered in the data flow, a new buffer is created for its output and a new thread is introduced into the data flow.
Transformations are not the only components that can be categorized as synchronous or asynchronous. Sources are a special type of asynchronous component. For example, an RDBMS source component creates two types of buffers: one for the Success output and one for the Error output. By contrast, destinations are a special type of synchronous component. You will see the interactions of source and destination components when you examine the Execution Trees of a package.

Execution Trees

Execution trees demonstrate how your package uses buffers and threads. At run time, the data flow engine breaks down Data Flow task operations into execution trees. These execution trees specify how buffers and threads are allocated in the package. Each tree creates a new buffer and may execute on a different thread. When a new buffer is created, such as when a partially blocking or blocking transformation is added to the pipeline, additional memory is required to handle the data transformation; however, it is important to note that each new tree may also give you an additional worker thread.
Examine the execution trees in the example depicted in Figure 1 and Table 1, where two Employee datasets are combined and then aggregated to load into a common destination table.
In Execution Tree 2, SSIS reads data from the Employee OLE DB Source into the pipeline, a Derived Column transformation adds another column, and SSIS passes data to the Union All transformation. All of the operations in this execution tree use the same buffer; data is not copied again once it is read into the OLE DB Source Output.
In Execution Tree 3, SSIS creates a buffer to hold error records from the asynchronous Employee OLE DB Source before loading them into a destination error table.
In Execution Tree 4, SSIS reads data from the Employee Flat File Source and passes it to the Union All. These two operations use the same buffer.
In Execution Tree 5, a buffer is created to hold errors from the asynchronous Employee Flat File Source before loading them into a destination error table.
In Execution Tree 0, the Partially Blocking Union All transformation is executed; a new buffer is created to store the combined data, and the aggregate is calculated.
In Execution Tree 1, after the Fully Blocking Aggregate transformation is completed, the output from the Aggregate operation is copied into a new buffer and data is loaded into the OLE DB Destination.
This example demonstrates how execution trees can help you understand buffer usage in a common SSIS package. This example also highlights how Partially Blocking transformations like Union All and Fully Blocking transformations like Aggregate create new buffers and threads whereas Row Transformations like Derived Column do not.
Execution trees are enormously valuable in understanding buffer usage. You can display execution trees for your own packages by turning on package logging, enabling logging for the Data Flow task, and then selecting the PipelineExecutionTrees log event. Note that you will not see the execution trees until you execute the package. When you do, the execution trees appear in the Log Events window in Business Intelligence (BI) Development Studio.

Writers: Elizabeth Vitt, Intellimentum and Hitachi Corporation
Contributors: Donald Farmer, Microsoft Corporation; Ashvini Sharma, Microsoft Corporation; Stacia Misner, Hitachi Consulting

Wednesday, June 20, 2012

Filtered Index

Filtered Index is a new feature in SQL Server 2008 that indexes only a portion of the rows in a table. Compared with a full-table index, it can improve query performance while reducing index maintenance and index storage costs.

If we have a table with thousands of records but our queries are concerned with only a few of those rows, we should use a filtered index.

When we see an index created with a filter, i.e. a WHERE clause in the index definition, then it is a FILTERED INDEX.

Consider these points while creating Filtered Index:

- They can be created only as nonclustered indexes.
- They cannot be created on views (the optimizer can still use them for queries that reference views).
- They cannot be created on full-text indexes.


Example:

CREATE NONCLUSTERED INDEX NCI_Dept
ON HR.Emp(EmpID)
WHERE Title = 'Manager'
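
A query whose predicate matches the filter can be satisfied by this much smaller index. A minimal sketch against the same illustrative HR.Emp table:

-- The WHERE clause matches the index filter, so the optimizer can
-- seek the compact filtered index instead of scanning the whole table.
SELECT EmpID
FROM HR.Emp
WHERE Title = 'Manager'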

Have a look at the Execution Plan and compare the performance before and after the Filtered Index!!

An Introduction to Data Compression in SQL Server 2008 - By Brad M. McGehee

Hi everyone, today I came across this wonderful article by Brad M. McGehee, Director of DBA Education, Red Gate Software, and I would suggest going through this post for an insight into DATA COMPRESSION.

There is one thing every DBA knows with certainty, and that is that databases grow with time. MDFs grow, backups grow, and it never stops. The more data we have, the more work SQL Server has to perform in order to deal with it all; whether it's executing a query on a table with 10 million rows, or backing up a 5 TB database. Whether we like it or not, we are fighting a losing battle, and DBAs can't reverse the information explosion. Or can we?

While we can't stop growth, SQL Server 2008 (Enterprise Edition only) gives us some new tools to help us better deal with all this data, and that is the promise of compression. Given the right circumstances, DBAs can use data compression to reduce the size of our MDFs, and backup compression can help us reduce the amount of space our backups take. Not only does compression reduce physical file sizes, it reduces disk I/O, which can greatly enhance the performance of many database applications, along with database backups.
When we discuss SQL Server compression, we need to think of it two different ways. First, there is data compression, which includes row-level and page-level compression that occurs within the MDF files of our databases. Second, there is backup compression, which occurs only when data is backed up. While both of these are forms of compression, they are architected differently. Because of this, it is important to treat them separately.
Data Compression comes in two different forms:
  • Row-level Data Compression: Row-level data compression is essentially turning fixed length data types into variable length data types, freeing up empty space. It also has the ability to ignore zero and null values, saving additional space. In turn, more rows can fit into a single data page.
  • Page-level Data Compression: Page-level data compression starts with row-level data compression, then adds two additional compression features: prefix and dictionary compression. We will take a look at what this means a little later in this chapter. As you can imagine, page-level compression offers increased data compression over row-level compression alone.
Backup Compression comes in a single form:
  • Backup Compression: Backup compression does not use row-level or page-level data compression. Instead, backup compression occurs only at the time of a backup, and it uses its own proprietary compression technique. Backup compression can be used when using, or not using, data compression, although using backup compression on a database that is already compressed using data compression may not offer additional benefits.
In the next section, we will take a high-level overview of data compression, and then we will drill down into the detail of the different types of compression available with SQL Server 2008.

Data Compression Overview

Data compression has been around for years. For example, who hasn’t zipped a file at some point in their career? While compression isn’t a new technology, it’s new to SQL Server. Unlike zip compression, SQL Server’s data compression does not automatically compress an entire database; instead, data compression can only be used for these database objects:
  • A table stored as a heap
  • A table stored as a clustered index
  • A non-clustered index
  • An indexed view
  • Partitioned tables and indexes
In other words, as the DBA, you must evaluate each of the above objects in your database, decide if you want to compress it, and then decide whether to use row-level or page-level compression. Once you have completed this evaluation, you must turn on compression for that object. There is no single switch you can flip that will turn compression on or off for all the objects listed above, although you could write a Transact-SQL script to accomplish this task.
Fortunately, other than turning compression on or off for the above objects, you don't have to do anything else to enable compression. You don't have to re-architect your database or your application, as data compression is entirely handled under the covers by the SQL Server Storage Engine. When data is passed to the Storage Engine, it is compressed and stored in the designated compressed format (on disk and in the Buffer Cache). When the Storage Engine passes the information to another component of SQL Server, the Storage Engine has to uncompress it. In other words, every time data has to be passed to or from the Storage Engine, it has to be compressed or uncompressed. While this does take extra CPU overhead to accomplish, in many cases the amount of disk I/O saved by compression more than makes up for the CPU costs, boosting the overall performance of SQL Server.
Here’s a simplified example. Let’s say that we want to update a row in a table, and that the row we want to update is currently stored on disk in a table that is using row-level data compression. When we execute the UPDATE statement, the Relational Engine (Query Processor) parses, compiles, and optimizes the UPDATE statement, ready to execute it. Before the statement can be executed, the Relational Engine needs the row of data that is currently stored on disk in the compressed format, so the Relational Engine requests the data by asking the Storage Engine to go get it. The Storage Engine (with the help of the SQLOS) goes and gets the compressed data from disk and brings it into the Buffer Cache, where the data continues to remain in its compressed format.
Once the data is in the Buffer Cache, the row is handed off to the Relational Engine from the Storage Engine. During this pass off, the compressed row is uncompressed and given to the Relational Engine to UPDATE. Once the row has been updated, it is then passed back to the Storage Engine, where it is again compressed and stored in the Buffer Cache. At some point, the row will be flushed to disk, where it is stored in its compressed format.
Data compression offers many benefits. Besides the obvious one of reducing the amount of physical disk space required to store data (and the disk I/O needed to write and read it), it also reduces the amount of Buffer Cache memory needed to hold the data. This in turn allows more data to be stored in the Buffer Cache, reducing the need for SQL Server to access the disk to get data, as the data is now more likely to be in memory than on disk, further reducing disk I/O.
Just as data compression offers benefits, it also has some disadvantages. Compression uses up additional CPU cycles. If your server has plenty to spare, then you have no problem, but if your server is already experiencing a CPU bottleneck, then perhaps compression is better left turned off.

Row-Level Data Compression

The simplest method of data compression, row-level compression, works by:
  • Reducing the amount of metadata used to store a row.
  • Storing fixed-length numeric data types as if they were variable-length data types. For example, if you store the value 1 in a bigint column, storage will only take 1 byte, not the 8 bytes the bigint data type normally takes.
  • Storing CHAR data types as variable-length data types. For example, if you have a CHAR(100) column and only store 10 characters in it, the trailing blanks are not stored, thus reducing the space needed to store the data.
  • Not storing NULL or 0 values
Row-level data compression offers less compression than page-level data compression, but it also incurs less overhead, reducing the amount of CPU resources required to implement it.
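
Row-level compression is enabled per object, either when the table is created or afterwards with a rebuild. A minimal sketch (the table name and columns here are illustrative):

-- New table created with row compression
CREATE TABLE dbo.OrderArchive
(
    OrderID   bigint   NOT NULL,
    OrderDate datetime NOT NULL,
    Status    char(10) NULL
)
WITH (DATA_COMPRESSION = ROW)

-- Existing table rebuilt with row compression
ALTER TABLE dbo.OrderArchive REBUILD WITH (DATA_COMPRESSION = ROW)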

Page-Level Data Compression

Page-level data compression offers greater compression, but at the expense of greater CPU utilization. It works using these techniques:
  • It starts out by using row-level data compression to get as many rows as it can on a single page.
  • Next, prefix compression is run. Essentially, repeating patterns of data at the beginning of the values of a given column are removed and substituted with an abbreviated reference that is stored in the compression information (CI) structure that immediately follows the page header of a data page.
  • And last, dictionary compression is used. Dictionary compression searches for repeated values anywhere on a page and stores them in the CI. One of the major differences between prefix and dictionary compression is that prefix compression is restricted to one column, while dictionary compression works anywhere on a data page.
The amount of compression provided by page-level data compression is highly dependent on the data stored in a table or index. If a lot of the data repeats itself, then compression is more efficient. If the data is more random, then little benefit can be gained using page-level compression.

Data Compression Demo

Data compression can be performed using either SQL Server Management Studio (SSMS) or by using Transact-SQL. For this demo, we will take a look at how you can compress a table that uses a clustered index, using SSMS.

Let’s say that we want to compress the Sales.Customer table (which has a clustered index) in the AdventureWorks database. The first step is to right-click on the table in SSMS, select “Storage,” and then select “Manage Compression.”
Figure 1: SSMS can be used to manage compression.

This brings up the Data Compression Wizard, displayed below.
Figure 2: The Data Compression Wizard, or Transact-SQL commands, can be used to manage data compression.

After clicking “Next,” the wizard displays the following screen, which allows you not only to select the compression type, but it also allows you to calculate how much space you will save once compression has been turned on.
Figure 3: Use this screen to select the compression type, and to calculate how much space will be saved.

First, let’s see how much space we will save if we use row-level compression on this table. To find out, click on the drop-down box below “Compression Type,” select “Row,” and then click “Calculate.”
Figure 4: For this table, row-level compression doesn’t offer much compression.

After clicking “Calculate,” the wizard runs and calculates how much space is currently being used, and how much space would be used after row-level compression. As we can see, very little space will be saved, about 1.6%.
Now, let’s see how much compression savings page-level compression offers us for this particular table. Again, I go to the drop-down menu under “Compression Type,” select “Page,” then press “Calculate.”
Figure 5: Page-level compression is higher than row-level compression.

After pressing “Calculate,” we see that compression has improved greatly, now saving about 20% of the space. At this point, if you decide to turn on page-level compression for this table, click the “Next” button.
Figure 6: The wizard allows you several options in which to turn on compression.

At the above screen, you can choose to perform the compression now (not a good idea during busy production times because the initial compression process can be very CPU and disk I/O intensive), schedule it to run later, or just to script the Transact-SQL code so you can run it at your convenience.
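
If you choose to script it, the heart of the generated code is simply a rebuild of the table with the chosen compression setting. A sketch of what to expect (the exact script SSMS emits may differ):

ALTER TABLE Sales.Customer REBUILD WITH (DATA_COMPRESSION = PAGE)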

Once you have compressed this table (its clustered index), keep in mind that any non-clustered indexes on the table are not automatically compressed for you. Remember, compression is applied on a per-object basis. If you want to compress the non-clustered indexes for this table, you will have to compress each one, one at a time.
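
For example, to compress a single non-clustered index (the index name here is illustrative):

ALTER INDEX IX_Customer_TerritoryID ON Sales.Customer
REBUILD WITH (DATA_COMPRESSION = PAGE)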

While this wizard helps you to see how much compression either method offers, it does not suggest which compression method should be used, nor does it recommend whether compression should be used at all for this object. As the DBA, it will be your job to evaluate each compressible object to determine if it should have compression enabled or not. In other words, you must decide if the benefits of compression outweigh the negatives.
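
The wizard's space calculation is also available directly in Transact-SQL through the sp_estimate_data_compression_savings procedure, which is handy when evaluating many objects in a script. A sketch:

-- Estimate page-compression savings for Sales.Customer (all indexes, all partitions)
EXEC sp_estimate_data_compression_savings
    @schema_name      = 'Sales',
    @object_name      = 'Customer',
    @index_id         = NULL,
    @partition_number = NULL,
    @data_compression = 'PAGE'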

Backup Compression

For years, there have been third-party programs that allow you to compress and speed up SQL Server backups. In most regards, the backup compression included with the Enterprise Edition of SQL Server is very plain vanilla. In fact, if you already are using a third-party backup program, I would suggest you continue using it, because SQL Server 2008 backup compression offers fewer features. In fact, the only option SQL Server 2008 backup compression offers you is to turn it off or on.
SQL Server 2008 backup compression, like the third-party add-ons, compresses backups, which not only saves backup space, but it can substantially reduce backup times. Unlike data compression, there is very little downside to using backup compression, other than the additional CPU resources required to perform the compression (or decompression during a restore). Assuming that you perform backups during slower times of the day, the additional CPU resources used will not be noticeable.
The time and space savings offered by backup compression depends on the data in your database. If you are heavily using data compression in your databases, or are using Transparent Data Encryption, then using backup compression probably won’t offer you many benefits, as already compressed data, or encrypted data, is not very compressible.
Let’s take a brief look at how you turn on SQL Server 2008 backup compression. While our example will use SSMS, you can use Transact-SQL to perform the same task. To backup AdventureWorks, right-click on the database in SSMS, select “Tasks,” and then select “Back Up,” and the backup dialog box appears.
Figure 7: As with any backup, you can use the backup dialog box to make your selections.

Once you have selected your backup options, next click on “Options,” and the following screen appears.
Figure 8: Backup compression options are limited.
At the top of Figure 8 are the standard backup options, while at the bottom of the screen you see the options for backup compression. Notice that you have only three choices.
The first option, “Use the default server settings” tells the backup to use the server’s default backup compression setting. In SQL Server 2008, there is a new sp_configure option called “backup compression default.” By default, it is set to have backup compression off. If you want, you can set this option so that backup compression is turned on by default. So if you choose the “Use the default server settings” option above, then whatever option is set for the “backup compression default” will be used for the backup.
The “Compress backup” option turns backup compression on, and the “Do not compress backup” option turns it off. Both of these options override the “backup compression default” server setting, whatever it happens to be.
Once you have chosen your backup compression method, you proceed with the backup just like any other SQL Server backup.
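The same choices can be expressed in Transact-SQL; a sketch (the backup path is illustrative):

-- Turn the server-wide default on (it is off out of the box)
EXEC sp_configure 'backup compression default', 1
RECONFIGURE

-- Or override the default for a single backup
BACKUP DATABASE AdventureWorks
TO DISK = N'C:\Backups\AdventureWorks.bak'
WITH COMPRESSION   -- or NO_COMPRESSION to force it off
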
If you need to restore a compressed backup, you don't have to do anything special; it uncompresses automatically. Although you can only create compressed backups using the Enterprise Edition of SQL Server 2008, you can restore a compressed backup to any edition of SQL Server 2008. On the other hand, you cannot restore a compressed SQL Server 2008 backup to any previous version of SQL Server.

Summary

In this article, we have learned about the two forms of data compression, and about backup compression. While data compression might seem like a seductive new feature of SQL Server, I highly recommend that it is only used by experienced DBAs. While it offers lots of promise for increased performance, it can just as easily cause performance problems if misused. Backup compression, on the other hand, can be used by DBAs of all skill levels.

Tuesday, June 19, 2012

A Brief Note on VIEWS

A view is nothing but an image table, or virtual table, created for a base table. A view can be created by taking all values from the base table or by taking only selected values from the base table.

Note: If we perform any modifications in the base table, those modifications are automatically reflected in the view, and vice versa.

Syntax:

CREATE VIEW VIEWNAME [WITH ENCRYPTION]
AS SELECT * FROM TABLENAME [WHERE CONDITION]
[WITH CHECK OPTION]


E.g.:
CREATE VIEW V1 AS SELECT * FROM EMP
INSERT INTO V1 VALUES (45, 'ARJUN', 10000, 10)


The above insert statement inserts the values into the base table EMP as well as into view V1.

E.g.:

CREATE VIEW V2 AS SELECT * FROM EMP WHERE DEPTNO=10
INSERT INTO V2 VALUES (55, 'NAG', 10000, 10)


The above insert statement inserts the values into the base table EMP as well as into view V2.

INSERT INTO V2 VALUES (65, 'NAGARJUN', 20000, 20)


The above insert statement inserts the values into the base table EMP only, not into view V2, because the supplied values do not satisfy the view's definition (DEPTNO=20 does not match the view's WHERE DEPTNO=10 condition). In other words, rows that the view can never show are still being inserted into the base table EMP through it.

To avoid such operations we need to create the view WITH CHECK OPTION.
E.g.:

CREATE VIEW V3 AS SELECT * FROM EMP WHERE DEPTNO=10
WITH CHECK OPTION


INSERT INTO V3 VALUES (75, 'ARUN', 25000, 20)

The above insert statement fails: WITH CHECK OPTION rejects any row that does not satisfy the view's WHERE condition, so nothing is inserted into the base table EMP or into view V3.

SP_HELPTEXT: This Stored Procedure is used to display the definition of a specific view.

Syntax: SP_HELPTEXT VIEWNAME
E.g.: SP_HELPTEXT V1
Output: CREATE VIEW V1 AS SELECT * FROM EMP


If we want to hide the definition of a view, use 'WITH ENCRYPTION'.

WITH ENCRYPTION: Once we create a view using 'WITH ENCRYPTION', we cannot find the definition of that view using SP_HELPTEXT.

E.g.:

CREATE VIEW V4 WITH ENCRYPTION
AS SELECT * FROM EMP WHERE DEPTNO=20

SP_HELPTEXT V4

Output: The text for object 'V4' is encrypted.

To open the definition of view V4 again, follow the approach below (this assumes the original view definition is still available, since SQL Server will not decrypt it for us):

1. Replace CREATE with ALTER
2. Remove the WITH ENCRYPTION keywords
3. Select the query and press F5
E.g.:

ALTER VIEW V4
AS SELECT * FROM EMP WHERE DEPTNO =20

Creating views based on other views:
SQL Server enables users to create views based on other views, nested up to 32 levels.


Syntax:


CREATE VIEW VIEWNAME [WITH ENCRYPTION]
AS SELECT * FROM VIEWNAME [WHERE CONDITION]
[WITH CHECK OPTION]

E.g.: CREATE VIEW V5 AS SELECT * FROM V1 WHERE DEPTNO=10

Syntax to drop the views:
DROP VIEW VIEWNAME [, ...n]
E.g.:  DROP VIEW V1,V2,V3,V4,V5