Three Pages Of Summary About
-Basic database performance-tuning concepts
-How a DBMS processes SQL queries
-About the importance of indexes in query processing
-About the types of decisions the query optimizer has to make
-Some common practices used to write efficient SQL code
-How to formulate queries and tune the DBMS for optimal performance
APA Style 3 pages assignment
Chapter 11
Database Performance Tuning and Query Optimization
©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
Learning Objectives
In this chapter, the student will learn:
Basic database performance-tuning concepts
How a DBMS processes SQL queries
About the importance of indexes in query processing
About the types of decisions the query optimizer has to make
Some common practices used to write efficient SQL code
How to formulate queries and tune the DBMS for optimal performance
*
©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
Database Performance-Tuning Concepts
Goal of database performance is to execute queries as fast as possible
Database performance tuning: Set of activities and procedures that reduce response time of database system
Fine-tuning the performance of a system requires that all factors must operate at optimum level with minimal bottlenecks
*
©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
Table 11.1 - General Guidelines for Better System Performance
*
©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
Performance Tuning: Client and Server
Client side
SQL performance tuning: Generates SQL query that returns correct answer in least amount of time
Using minimum amount of resources at server
Server side
DBMS performance tuning: DBMS environment configured to respond to clients’ requests as fast as possible
Optimum use of existing resources
*
©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
DBMS Architecture
All data in a database are stored in data files
Data files automatically expand in predefined increments known as extends
Data files are grouped in file groups or table spaces
Table space or file group: Logical grouping of several data files that store data with similar characteristics
Data cache or buffer cache: Shared, reserved memory area
Stores most recently accessed data blocks in RAM
*
©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
DBMS Architecture
SQL cache or procedure cache: Stores most recently executed SQL statements or PL/SQL procedures
DBMS retrieves data from permanent storage and places them in RAM
Input/output request: Low-level data access operation that reads or writes data to and from computer devices
Data cache is faster than working with data files
Majority of performance-tuning activities focus on minimizing I/O operations
*
©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
Figure 11.1 - Basic DBMS Architecture
*
©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
Database Query Optimization Modes
Algorithms proposed for query optimization are based on:
Selection of the optimum order to achieve the fastest execution time
Selection of sites to be accessed to minimize communication costs
Evaluated on the basis of:
Operation mode
Timing of its optimization
*
©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
Classification of Operation Modes
Automatic query optimization: DBMS finds the most cost-effective access path without user intervention
Manual query optimization: Requires that the optimization be selected and scheduled by the end user or programmer
*
©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
Classification Based on Timing of Optimization
Static query optimization: best optimization strategy is selected when the query is compiled by the DBMS
Takes place at compilation time
Dynamic query optimization: Access strategy is dynamically determined by the DBMS at run time, using the most up-to-date information about the database
Takes place at execution time
*
©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
Classification Based on Type of Information Used to Optimize the Query
Statistically based query optimization algorithm: Statistics are used by the DBMS to determine the best access strategy
Statistical information is generated by DBMS through:
Dynamic statistical generation mode
Manual statistical generation mode
Rule-based query optimization algorithm: based on a set of user-defined rules to determine the best query access strategy
*
©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
Table 11.2 - Sample Database Statistics Measurements
*
©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
Query Processing
*
©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
SQL Parsing Phase
Query is broken down into smaller units
Original SQL query is transformed into slightly different version of the original SQL code which is fully equivalent and more efficient
Query optimizer: Analyzes SQL query and finds most efficient way to access data
Access plans: DBMS-specific and translate client’s SQL query into a series of complex I/O operations
*
©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
SQL Parsing Phase
If access plan already exists for query in SQL cache, DBMS reuses it
If not, optimizer evaluates various plans and chooses one to be placed in SQL cache for use
*
©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
SQL Execution Phase
All I/O operations indicated in the access plan are executed
Locks are acquired
Data are retrieved and placed in data cache
Transaction management commands are processed
*
©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
SQL Fetching Phase
Rows of resulting query result set are returned to client
DBMS may use temporary table space to store temporary data
Database server coordinates the movement of the result set rows from the server cache to the client cache
*
©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
Query Processing Bottlenecks
Delay introduced in the processing of an I/O operation that slows the system
Caused by the:
CPU
RAM
Hard disk
Network
Application code
*
©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
Indexes and Query Optimization
Indexes
Help speed up data access
Facilitate searching, sorting, using aggregate functions, and join operations
Ordered set of values that contain the index key and pointers
More efficient than a full table scan
*
©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
Indexes and Query Optimization
Data sparsity: Number of different values a column could have
Data structures used to implement indexes:
Hash indexes
B-tree indexes
Bitmap indexes
DBMSs determine best type of index to use
*
©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
Optimizer Choices
Rule-based optimizer: Uses preset rules and points to determine the best approach to execute a query
Cost-based optimizer: Uses algorithms based on statistics about objects being accessed to determine the best approach to execute a query
*
©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
Using Hints to Affect Optimizer Choices
Optimizer might not choose the best execution plan
Makes decisions based on existing statistics, which might be old
Might choose less-efficient decisions
Optimizer hints: Special instructions for the optimizer, embedded in the SQL command text
*
©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
Table 11.5 - Optimizer Hints
*
©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
SQL Performance Tuning
Evaluated from client perspective
Most current relational DBMSs perform automatic query optimization at the server end
Most SQL performance optimization techniques are DBMS-specific and thus rarely portable
Majority of performance problems are related to poorly written SQL code
*
©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
Index Selectivity
Measure of the likelihood that an index will be used in query processing
Indexes are used when a subset of rows from a large table is to be selected based on a given condition
Index cannot always be used to improve performance
Function-based index: Based on a specific SQL function or expression
*
©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
Conditional Expressions
Expressed within WHERE or HAVING clauses of a SQL statement
Restricts the output of a query to only rows matching conditional criteria
Guidelines to write efficient conditional expressions in SQL code
Use simple columns or literals as operands
Numeric field comparisons are faster than character, date, and NULL comparisons
*
©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
Conditional Expressions
Equality comparisons are faster than inequality comparisons
Transform conditional expressions to use literals
Write equality conditions first when using multiple conditional expressions
When using multiple AND conditions, write the condition most likely to be false first
When using multiple OR conditions, put the condition most likely to be true first
Avoid the use of NOT logical operator
*
©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
Query Formulation
Identify what columns and computations are required
Identify source tables
Determine how to join tables
Determine what selection criteria are needed
Determine the order in which to display the output
*
©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
DBMS Performance Tuning
Managing DBMS processes in primary memory and the structures in physical storage
DBMS performance tuning at server end focuses on setting parameters used for:
Data cache
SQL cache
Sort cache
Optimizer mode
In-memory database: Store large portions of the database in primary storage
*
©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
DBMS Performance Tuning
Recommendations for physical storage of databases:
Use RAID (Redundant Array of Independent Disks) to provide a balance between performance improvement and fault tolerance
Minimize disk contention
Put high-usage tables in their own table spaces
Assign separate data files in separate storage volumes for indexes, system, and high-usage tables
*
©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
DBMS Performance Tuning
Take advantage of the various table storage organizations in the database
Index-organized table or clustered index table: Stores the end-user data and the index data in consecutive locations in permanent storage
Partition tables based on usage
Use denormalized tables where appropriate
Store computed and aggregate attributes in tables