SQL Server: How to create an index on a table
Goal
Having the right indexes on your tables is critical to keeping queries fast, especially as your data grows. Missing indexes typically result in high CPU usage on the database server, slow response times, and ultimately unhappy users. Indexes also do more than speed up lookups: unique indexes prevent duplicate records, indexes on join columns speed up JOIN operations, and indexes underpin features such as table partitioning and full-text search. In this tutorial, you will learn how to create an index in SQL Server and the main types of index you can create.
Instructions
Creating an index
To create an index, use the CREATE INDEX command:
-- syntax
create index index_name
on table_name(column1, column2, .., columnN);
-- create index on one column
create index products_category
on products(category);
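To confirm that an index was created, one option is to list the indexes on the table from the sys.indexes catalog view. This is a minimal sketch that assumes the products table from the example above lives in the current database and default schema:

```sql
-- List every index on the products table, with its type and
-- whether it is unique or filtered.
select
    i.name,
    i.type_desc,
    i.is_unique,
    i.has_filter,
    i.filter_definition
from sys.indexes as i
where i.object_id = object_id('products');
```

A table without a clustered index also shows one row with a NULL name and a type_desc of HEAP; that row describes the table itself, not an index you created.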
Creating an index on multiple columns
You can create an index on multiple columns. When columns are frequently used together as filters, one multi-column index usually performs better than several single-column indexes:
-- create index on multiple columns
create index products_category_brand
on products(category, brand_id);
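Column order matters in a multi-column index: the index is sorted by the first column, then by the second within each value of the first. The queries below sketch how that typically affects the index on (category, brand_id). The literal 'shoes' is an assumption about the data in the category column; the actual plan always depends on the optimizer and your data.

```sql
-- Filters on the leading column: the index can be used for a seek.
select category, brand_id
from products
where category = 'shoes';

-- Filters on both key columns: the index can seek on the full key.
select category, brand_id
from products
where category = 'shoes'
  and brand_id = 1;

-- Filters only on the second key column: SQL Server cannot seek
-- on brand_id alone and will typically scan instead.
select category, brand_id
from products
where brand_id = 1;
```

If filtering on brand_id alone is common in your workload, a separate index that leads with brand_id may be worth considering.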
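Creating a unique index
A unique index prevents duplicate values for the indexed key. The index below uses a different name from the previous example, because index names must be unique within a table:
-- two rows cannot have the same category and brand_id key values
create unique index products_category_brand_unique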
on products(category, brand_id);
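To see a unique index reject a duplicate, the sketch below uses a throwaway table. The table dbo.products_unique_demo, its columns, and the index name are hypothetical and exist only for this illustration; the script assumes no table with that name already exists and that you have permission to create one.

```sql
-- Hypothetical demo table; not the tutorial's products table.
create table dbo.products_unique_demo (
    id int identity(1, 1) primary key,
    category varchar(50) not null,
    brand_id int not null
);

create unique index products_unique_demo_category_brand
on dbo.products_unique_demo(category, brand_id);

insert into dbo.products_unique_demo (category, brand_id)
values ('shoes', 1);

begin try
    -- Same (category, brand_id) pair as the row above, so this insert fails.
    insert into dbo.products_unique_demo (category, brand_id)
    values ('shoes', 1);
end try
begin catch
    -- A duplicate key in a unique index is typically reported as error 2601.
    select error_number() as err_number, error_message() as err_message;
end catch;

-- Clean up the demo table.
drop table dbo.products_unique_demo;
```

Creating a unique index on a table that already contains duplicate key values fails, so existing duplicates have to be removed first.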
Creating filtered indexes
Create a filtered index to index only the rows that meet a certain condition:
create index products_brand
on products(brand_id)
where brand_id = 1;
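A filtered index only helps queries whose predicate falls inside the index filter. The queries below are a sketch of how that usually plays out for the products_brand index above; the variable @brand_id is a hypothetical parameter added for illustration.

```sql
-- Matches the index filter exactly, so products_brand can be used.
select brand_id
from products
where brand_id = 1;

-- Rows with brand_id = 2 are not stored in products_brand,
-- so this query cannot use it.
select brand_id
from products
where brand_id = 2;

-- With a variable, the cached plan must be valid for any value,
-- so the filtered index is typically not chosen.
declare @brand_id int = 1;

select brand_id
from products
where brand_id = @brand_id;
```

Filtered indexes are most useful when the filter selects a small, frequently queried subset of a large table.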
Creating a unique filtered index
A filtered index can also be unique. For example, suppose you want to prevent rows with duplicate category and brand_id key values, but only for one particular brand:
-- this will prevent duplicate categories within a brand that has id=1
create unique index products_brand_1_category
on products (category, brand_id)
where brand_id = 1;
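The sketch below shows the effect of a unique filtered index on a throwaway table: duplicates are rejected only for rows that match the filter. The table dbo.products_filtered_demo, its columns, and the index name are hypothetical, and the script assumes the session uses the default SET options (for example QUOTED_IDENTIFIER ON), which filtered indexes require.

```sql
-- Hypothetical demo table; not the tutorial's products table.
create table dbo.products_filtered_demo (
    id int identity(1, 1) primary key,
    category varchar(50) not null,
    brand_id int not null
);

create unique index products_filtered_demo_brand_1_category
on dbo.products_filtered_demo (category, brand_id)
where brand_id = 1;

-- Allowed: brand_id = 2 is outside the filter, so duplicates are not checked.
insert into dbo.products_filtered_demo (category, brand_id)
values ('shoes', 2), ('shoes', 2);

-- Allowed: first 'shoes' row for brand_id = 1.
insert into dbo.products_filtered_demo (category, brand_id)
values ('shoes', 1);

begin try
    -- Rejected: second 'shoes' row for brand_id = 1.
    insert into dbo.products_filtered_demo (category, brand_id)
    values ('shoes', 1);
end try
begin catch
    select error_number() as err_number, error_message() as err_message;
end catch;

-- Clean up the demo table.
drop table dbo.products_filtered_demo;
```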
Troubleshooting
Troubleshooting indexing issues in SQL Server involves identifying problems with existing indexes and optimizing them for better performance. Here are some common indexing issues and how to address them with short code examples:
Missing indexes
Most workloads filter and join on a small set of columns far more often than on the rest. If those columns are not indexed, queries against them slow down as the table grows. When a query is too slow, check whether a suitable index exists for the columns it filters on.
SQL Server records the indexes the optimizer would have liked to use, and you can query these suggestions to find candidates for new indexes on frequently used columns:
-- Identify missing indexes using SQL Server's built-in tools
SELECT
migs.avg_total_user_cost * (migs.avg_user_impact / 100.0) * (migs.user_seeks + migs.user_scans) AS improvement_measure,
OBJECT_NAME(mid.[object_id]) AS TableName,
'CREATE INDEX [IX_' + OBJECT_NAME(mid.[object_id]) + '_' + REPLACE(REPLACE(REPLACE(ISNULL(mid.equality_columns, ''), ', ', '_'), '[', ''), ']', '')
+ CASE WHEN mid.[equality_columns] IS NOT NULL AND mid.[inequality_columns] IS NOT NULL THEN '_' ELSE '' END
+ REPLACE(REPLACE(REPLACE(ISNULL(mid.inequality_columns, ''), ', ', '_'), '[', ''), ']', '') + ']'
+ ' ON ' + mid.statement
+ ' (' + ISNULL (mid.equality_columns,'')
+ CASE WHEN mid.equality_columns IS NOT NULL AND mid.inequality_columns IS NOT NULL THEN ',' ELSE '' END
+ ISNULL (mid.inequality_columns, '')
+ ')'
+ ISNULL (' INCLUDE (' + mid.included_columns + ')', '') AS create_index_statement,
migs.*, mid.database_id, mid.[object_id]
FROM sys.dm_db_missing_index_groups mig
INNER JOIN sys.dm_db_missing_index_group_stats migs ON migs.group_handle = mig.index_group_handle
INNER JOIN sys.dm_db_missing_index_details mid ON mig.index_handle = mid.index_handle
WHERE mid.database_id = DB_ID() -- DB_ID() returns the current database's ID, so run this in the database you want to check
ORDER BY improvement_measure DESC;
Overlapping indexes
The opposite problem is overlapping indexes: several indexes that cover the same columns. They take up redundant storage and add maintenance overhead to every write, so they should be cleaned up. Here’s how you can identify and remove overlapping indexes:
-- List each index's columns in key order; indexes that share the same leading columns overlap
SELECT
OBJECT_NAME(i.[object_id]) AS TableName,
i.name AS IndexName,
ic.key_ordinal AS KeyOrdinal, -- position in the index key (0 for included columns)
ic.is_included_column AS IsIncludedColumn,
COL_NAME(ic.[object_id], ic.column_id) AS ColumnName
FROM sys.indexes i
INNER JOIN sys.index_columns ic ON i.[object_id] = ic.[object_id] AND i.index_id = ic.index_id
WHERE i.[object_id] = OBJECT_ID('YourTableName') -- Replace with your table name
ORDER BY TableName, IndexName, ic.is_included_column, ic.key_ordinal;
-- Remove redundant indexes (use DROP INDEX statement)
-- Example: DROP INDEX [IX_RedundantIndex] ON [YourTableName];
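Before dropping an index that looks redundant, it can help to check whether queries actually use it. The query below is a sketch based on the sys.dm_db_index_usage_stats dynamic management view; 'YourTableName' is the same placeholder as above. The counters reset when the SQL Server instance restarts, so treat low numbers with caution on a recently restarted server.

```sql
-- Show how often each index on the table has been read and written
-- since the counters were last reset.
SELECT
    OBJECT_NAME(i.[object_id]) AS TableName,
    i.name AS IndexName,
    ISNULL(us.user_seeks, 0) AS UserSeeks,
    ISNULL(us.user_scans, 0) AS UserScans,
    ISNULL(us.user_lookups, 0) AS UserLookups,
    ISNULL(us.user_updates, 0) AS UserUpdates
FROM sys.indexes i
LEFT JOIN sys.dm_db_index_usage_stats us
    ON us.[object_id] = i.[object_id]
    AND us.index_id = i.index_id
    AND us.database_id = DB_ID()
WHERE i.[object_id] = OBJECT_ID('YourTableName') -- Replace with your table name
ORDER BY IndexName;
```

An index with many updates but no seeks, scans, or lookups costs write time without helping any reads, which makes it a reasonable candidate for removal.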
Fragmented indexes
Fragmented indexes also cause performance issues. Here’s how you can rebuild or reorganize fragmented indexes:
-- Rebuild: recreates the index from scratch, removing fragmentation (heavier operation)
ALTER INDEX [YourIndexName] ON [YourTableName] REBUILD;
-- Reorganize: defragments the leaf level in place (lighter, always online)
ALTER INDEX [YourIndexName] ON [YourTableName] REORGANIZE;
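To decide which indexes need attention, you can measure fragmentation first. The query below is a sketch that uses the sys.dm_db_index_physical_stats function in its fastest mode, LIMITED; 'YourTableName' is a placeholder. A commonly cited rule of thumb, not a hard rule, is to reorganize at roughly 5 to 30 percent fragmentation and rebuild above that, and to ignore very small indexes.

```sql
-- Report fragmentation and size for every index on one table.
SELECT
    OBJECT_NAME(ips.[object_id]) AS TableName,
    i.name AS IndexName,
    ips.avg_fragmentation_in_percent,
    ips.page_count
FROM sys.dm_db_index_physical_stats(
        DB_ID(),                     -- current database
        OBJECT_ID('YourTableName'),  -- Replace with your table name
        NULL,                        -- all indexes
        NULL,                        -- all partitions
        'LIMITED') ips
INNER JOIN sys.indexes i
    ON i.[object_id] = ips.[object_id]
    AND i.index_id = ips.index_id
ORDER BY ips.avg_fragmentation_in_percent DESC;
```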
Conclusion
Indexing is a crucial performance optimization technique in SQL Server (and other relational database management systems) that speeds up querying and retrieving data from database tables. You learned how to create an index, create an index on multiple columns, and create unique and filtered indexes. You also learned how to troubleshoot missing, overlapping, and fragmented indexes to improve query performance and avoid redundancy.
Next steps
Now that you learned how to create an index in SQL Server, it is time to undo it (only if you need to)! Learn how to drop an index in the next tutorial.
FAQs
1. What is the primary purpose of indexing in SQL Server?
Indexing in SQL Server primarily serves to improve query performance by providing a faster way to locate and retrieve data. It does this by creating a data structure that allows SQL Server to quickly find the rows that match a query's search criteria.
2. When should I consider creating a clustered index, and what is its impact on data storage?
A clustered index determines the order in which a table's rows are stored, so it should be chosen carefully. It is typically best suited to columns with unique values, like a primary key. A table can have only one clustered index, because the rows themselves are organized by the clustered index key.
3. How do I identify missing indexes in SQL Server, and what should I do when I find them?
You can identify missing indexes by querying the missing index dynamic management views (as shown in the Troubleshooting section), by using tools like SQL Server's Database Engine Tuning Advisor, or by analyzing execution plans. When you find missing indexes, carefully evaluate their potential impact on query performance before implementing them. Consider factors like query workload and the balance between improving reads and the cost of maintaining the indexes.
4. What is index fragmentation, and why is it a concern?
Index fragmentation occurs when the logical order of an index's pages no longer matches their physical order, or when pages contain a lot of empty space. This can slow down query performance because SQL Server needs to read more pages to satisfy a query. To address fragmentation, you can rebuild or reorganize the affected indexes.
5. Can too many indexes on a table be detrimental to performance?
Yes, having too many indexes on a table can lead to performance issues. While indexes improve read performance, they come with storage and maintenance overhead. Excessive indexes can lead to increased storage requirements, slower data modification operations, and potentially suboptimal query plans. It's essential to strike a balance between indexing for reads and considering the impact on data modification and overall database performance.
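As a concrete illustration of the clustered index answer (question 2), the statement below is a minimal sketch. It assumes the products table has a unique id column and does not already have a clustered index; both are assumptions, since the tutorial does not show the table definition.

```sql
-- Hypothetical: assumes products has a unique id column
-- and no existing clustered index.
create unique clustered index products_id
on products(id);
```

If the table already has a primary key, it very likely has a clustered index already, because SQL Server creates one for a primary key by default when the table has none.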