Saturday, July 13, 2013

Partitioning

Now that we have a good learning experience on Slowly changing dimensions (SCD) and how to design  tables as per the requirement to maintain history or not for the important business columns. We now move on to a new feature Introduced in SQL server 2005 and is being used extensively for large volume tables from SQL server 2005 to SQL server 2008.

Introduction

Partitioning is an important feature which helps improve the performance of the queries against a very large table. The reason for this speedy retrieval of the data is that all the rows are not directly stored in the table, but are distributed in different partitions of this table. Due to the presence of the data in different subsets the query results retrieval will be faster on the server.

In this article, we will concentrate on the Range partitions , specifically horizontal range partitions. Based on the range that the value of the partitioned column falls in.

simply saying when we partition a table we are setting a criteria of where  a particular row is being stored and in which portion of the table is the row being stored. How do we set the criteria to fill the table partitions is through the Range that we define. The range is based on a particular column of the table and we call this the partitionkey. We set the range through Partition function and Partition Scheme.

When tables and indexes become very large, partitioning can help by partitioning the data into smaller, more manageable sections. This article focuses on horizontal partitioning, in which large groups of rows will be stored in multiple separate partitions. The definition of the partitioned set is customized, defined, and managed by your needs. Microsoft SQL Server 2008 allows you to partition your tables based on specific data usage patterns using defined ranges or lists. SQL Server 2008 also offers numerous options for the long-term management of partitioned tables and indexes by the addition of features designed around the new table and index structure.

Furthermore, if a large table exists on a system with multiple CPUs, partitioning the table can lead to better performance through parallel operations. The performance of large-scale operations across extremely large data sets (for instance many million rows) can benefit by performing multiple operations against individual subsets in parallel. example, instead of aggregating a single large table, SQL Server can work on partitions independently, and then aggregate the aggregates. In SQL Server 2008, queries joining large datasets can benefit directly from partitioning.

Below example simplifies the idea of partitioning a yearly sales table. The data is partitioned horizontally based on the month of the year making the MONTH as the partitionkey.  The range here would be January, February, March…..December.

In next post we shall discuss on implementation of partition using partition functions and partition schemes.



No comments:

Post a Comment