Joins

How a sort-merge is performed

For a sort-merge join, there are three steps, since the inputs to the sort-merge joins are both sorted worktables:

A table or set of tables is scanned and the results are inserted into one worktable. This will be the outer table in the merge.

Another table is scanned and the results are inserted into another worktable. This will be the inner table in the merge.

Each of the worktables is sorted, then the two sorted result sets are merged

How a right-merge or left-merge is performed

A right-merge or left-merge join always operates on a user table and a worktable created for the merge join. There are two steps:

A table or set of tables is scanned, and the results are inserted into a worktable.

The worktable is sorted and then merged with the other table in the join, using the index.

The optimization criteria are:

hash_join – determines whether the query processor may use the hash join algorithm. Hash joins may consume more runtime resources, but are valuable when the joining columns do not have useful indexes or when a relatively large number of rows satisfy the join condition, compared to the product of the number of rows in the joined tables.

hash_union_distinct – determines whether the query processor may use the hash union distinct algorithm, which is not efficient if most rows are distinct.

merge_join – determines whether the query processor may use the merge join algorithm, which relies on ordered input. merge_join is most valuable when input is ordered on the merge key—for example, from an index scan. merge_join is less valuable if sort operators are required to order input.

merge_union_all – determines whether the query processor may use the merge algorithm for union all. merge_union_all maintains the ordering of the result rows from the union input. merge_union_all is particularly valuable if the input is ordered and a parent operator (such as merge join) benefits from that ordering. Otherwise, merge_union_all may require sort operators that reduce efficiency.

merge_union_distinct – determines whether the query processor may use the merge algorithm for union. merge_union_distinct is similar to merge_union_all, except that duplicate rows are not retained. merge_union_distinct requires ordered input and provides ordered output.

multi_table_store_ind – determines whether the query processor may use reformatting on the result of a multiple table join. Using multi_tablet_store_ind may increase the use of worktables.

nl_join – determines whether the query processor may use the nested-loop-join algorithm.

opportunistic_distinct_view – determines whether the query processor may use a more flexible algorithm when enforcing distinctness.

parallel_query – determines whether the query processor may use parallel query optimization.

store_index – determines whether the query processor may use reformatting, which may increase the use of worktables.

append_union_all – determines whether the query processor may use the append union all algorithm.

bushy_search_space – determines whether the query processor may use bushy-tree-shaped query plans, which may increase the search space, but provide more query plan options to improve performance.

distinct_hashing – determines whether the query processor may use a hashing algorithm to eliminate duplicates, which is very efficient when there are few distinct values compared to the number of rows.

distinct_sorted – determines whether the query processor may use a single-pass algorithm to eliminate duplicates. distinct_sorted relies on an ordered input stream, and may increase the number of sort operators if its input is not ordered.

group-sorted – determines whether the query processor may use an on-the-fly grouping algorithm. group-sorted relies on an input stream sorted on the grouping columns, and it preserves this ordering in its output.

distinct_sorting – determines whether the query processor may use the sorting algorithm to eliminate duplicates. distinct_sorting is useful when the input is not ordered (for example, if there is no index) and the output ordering generated by the sorting algorithm could benefit; for example, in a merge join.

group_hashing – determines whether the query processor may use a group hashing algorithm to process aggregates.

index_intersection – determines whether the query processor may use the intersection of multiple index scans as part of the query plan in the search space.

Subscribe to: Posts (Atom)