Instructions
post 1 thread of at least 400 words and 2 replies of at least 200 words each. For each thread, you must support your assertions with at least 4 citations in current APA format. Each reply must incorporate at least 2 citations. Acceptable sources include peer-reviewed journal articles, books, the course textbook, and the Bible.
Thread: Select 1 topic from among the 6 listed below, and in at least 400 words, write a post that explains your chosen topic.
Additional resources are listed in the Reading & Study folder of this Module/Week.
1. Query Optimizer Overview
Query Optimization Steps
*Be sure to address the question “What are the algorithm categories?”
2. Algorithms Introduction
External Sort/Merge
*Be sure to explain how the merge/sort algorithm works.
3. Select Algorithms - Simple
*Be sure to discuss Algorithms S1-S7.
4. Select Algorithms - Complex
*Be sure to discuss Algorithms S8-S11.
5. Join Operation Algorithms
*Be sure to explain Algorithms J1-J5
6. Optimization Techniques
*Be sure to discuss Cost vs. Heuristic based optimization
Replies: Reply to each thread of at least 200 words each.
Reply #1
When dealing with databases, an administrator will sometimes be handling incomprehensible amounts of data. Even for smaller sets of data, query optimization is an essential part of creating a high-performance solution. Query optimization with databases can be defined as constructing a query that “can be sped up through many different means, each of which has an associated time and resource cost” (SQL Shack, 2020). Across every use of a database system, optimization can deliver the organization with valuable gains in performance and save resources.
The process of query optimization can be broken into various steps. First, the query optimizer will parse the command to check for syntax errors, and then the optimizer checks all objects for binding and results in a query tree. This leads to an execution plan for the optimizer. This is viewable for queries in most SQL database tools, and this execution plan can be a valuable tool for administrators to check when they notice query is running slowly (Wenzel, 2018). The query optimizer in turn uses algorithms to evaluate these execution plans.
There are various execution (or “access”) plans based on different algorithms, but regardless, the query optimizer uses this information to evaluate the best path to take to perform the specific query. “Each operation in an access plan has a corresponding cost formula that estimates the physical record accesses and CPU operations” (Mannino, 2014, p. 273). These cost formulas are the key to query optimization. Once the best choice is determined, the query optimizer translates the query into machine code for execution, or the optimizer may interpret the code at runtime. The optimizer can also use other tricks, such as caching successful execution plans for speed increases.
The power of the query optimizer comes from the algorithms it uses. There are a few broad categories of algorithms used by a query optimizer. These might be used by the query optimizer, or the algorithm categories might include algorithms to conduct functions on the data sought out through the query. For example, some algorithm categories include searching, conversion, data mining, averaging, ranking, counting, analytic, and modelling algorithms (Maurya, 2019). To provide a specific example, consider some of the common join algorithms (which are some of the most common frequently used algorithms) include nested loops, sort merge, hybrid join, hash join, star join (Mannino, 2014, p. 274). The details of all these algorithms are out of scope, but more details can be found in the textbook.
The process of query optimization is both a theoretical process and a physical process within database management systems. It helps database administrators ensure that they are able to deliver data performance that satisfies a need. The key to physical query optimization is the query optimizer, and it utilizes important algorithms to determine what “performance” means in a given query.
#Reply #2
When it comes to creating and executing a successful database, it is important to ensure that there are great ways to access and implement that data as well. This is why many database management systems employ a query optimizer in order to retrieve the information that they desire in the most simple and cost-effective way possible. The optimization of a query is a process by which many companies design their database in order to retrieve information by taking into account the cost of the optimization as well as the space required to do so. (Ribeiro et al, 1997) In order to access the information that has been optimized, we need to use optimization techniques that have been implemented in place to assist in retrieving the information that we seek to gather. Our textbook discusses a few optimization techniques in regard to Relational On Line Analytic Processing, or ROLAP. ROLAP is an extension of database management systems that are used to support multidimensional data as well as supporting a variety of optimization techniques. One such technique is that of “bitmap join indexes”. A bitmap join index will provide a precomputed join of columns and rows of dimension and facts tables. (Mannino, 2015)
Another such technique is a “star join” query optimization. This type of optimization technique will use the bitmap join indexes in three phases to produce results. Star joins will combine results from the bitmap indexes on each of the three phases and are able to exponentially shorten execution times in comparison to older join commands that only would combine two tables. “Query rewriting” that uses materialized views is a technique that cuts back on the need to access large tables by using such views as indexes to greatly increase the performance to retrieve them for the user. Using query writing will take into account whether or not a user will benefit from the use of a materialized view or the large “fact and dimension” tables. (Mannino, 2015)
Oracle DBMS provides some other optimization techniques such as Bulk collect, Forall, and Returning. The Bulk collect technique helps to improve performance by sending the whole collection of the query to the database server instead of each individual piece. Forall allows many different write statements to be offered to the database in a single message which aids a user in performance gains. Lastly, the Returning technique saves time by giving the user information on whether or not their code has been implemented and saved to the server. This technique can replace using a “select” statement to find out if your information was processed successfully. (Almeida et al, 2019) In relation to cost versus heuristic optimization, within their article, Niu et. al. found that, although cost optimization finds the most cost-effective way to process data requests and heuristic follows a set of rules, data from their tests ran found great progressive results using either measure. Their findings seemed to point to greater efficiency and performance improvement for both techniques. (Niu et al, 2018)