Performance Tuning :: NESTED LOOPS JOIN And Distributed Operations?
Oct 30, 2012
I want to make sure I am describing correctly what happens in a query where there is distributed database access and it is participating in a NESTED LOOPS JOIN. Below is an example query, the query plan output, and the remote SQL information for such a case. Of particular note are line#4 (NESTED LOOPS) and line#11 (REMOTE TABLE_0002).
What I want to know is more detail on how this NESTED LOOPS JOIN handles the remote operation. For example, for each row that comes out of line#5 and is thus going into the NESTED LOOPS JOIN operation @line#4, does the database jump across the network to do the remote loopkup? Thus if there are 1 million rows, does that mean 1 million network hops? Does batchsize play a role? For example, if the database batches in groups of 100 then does that mean 10 thousand network hops?
I think each row that comes out of line#5 means a network hop to the remote database. But I do not know for a fact.I have done some abbreviating in the plan in an attempt to make it fit on the page (line#7 TA = TABLE ACCESS).
Predicate Information (identified by operation id): ---------------------------------------------------
4 - access("A"."DEPTNO"="B"."DEPTNO")
Nested loop by defintion means,for every row returned by the outer query,the inner query is executed that many times.
In the above example,oracle does a full table scan and returned 14 rows.Now for dept table,it does a index unique scan and applies the predicate a.deptno=b.deptno and returns 1 row.
My question is why it is returning only 1 row? That measn for every 14 rows,this one row is fetched 14 times.
explain plan for select count(*) from orders, lineitem where o_orderkey= l_orderkey.
The trace 10053 (as shown below) for this query shows nested loop join with Lineitem as outer table and Orders as inner table. It is effectively join on composite index (pk_lineitem) of Lineitem and unique index(Pk_orderkey) of Orders table. The cost calculation formula as given in the book as "outer table cost + cardinality of outer table * inner table cost " fails here. I am not able to understand this.
I want to know how the Oracle optimizer choose joins and apply them while executing the query. So that I will insure about optimizer join before writing any query.
Doing some data conversion at the moment and using V$SESSION_LONGOPS to predict when the current task will be finished so I can run the next one.
V$SESSION_LONGOPS seems to have only the last 16 long operations for the session. Older operations are automatically purged. My bigger tables have 32 partitions, so after the first 16 are processed, I cannot tell which partition I am up to.
Googling "old longops" and "longops history" didn't work, nor did the same searches on this site. The Oracle Reference manual section on V$SESSION_LONGOPS did not mention that older entries are purged.
I am executing a sql statement which is doing FTS in parallel mode The server has 8 cpus and threads_per_cpu is 2
The v$sql shows PX_SERVERS_EXECUTIONS as 8
select PX_SERVERS_EXECUTIONS, sql_text from v$sql where sql_id='0q0nk5117yth2' 8, select /*+ full(a) parallel(a)....
however the px_sessions shows 17 sessions (16 parallel session + 1 parent session (where sid = qcsid) Now in px_sessions, these 16 parallel session are divided in 2 server sets 1 and 2 and values for degree and required degree are 8 and 16 respectively
However, all the time, only 8 sessions which belong to server set = 1, were active though its state was waiting with event "PX Deq Credit: send blkd"
The other session which belong to server set = 2 were never active and always had waint event ='PX Deq: Execution Msg'
what could be the reason that 16 parallel session could not be started though I am the only person using the server, there aren't any batch jobs, dbms_jobs,even archivelogs (not a prod system)?
Note that paralel_max_servers setting is 16
Another issue being the duing start of the query approximately 100-115 blocks were read for the query (checked from longops) however after 60-70% blocks are read the number of blocks read / seconds falls down to 10-20 blocks / second across all parallel sessions.
In the example below I believe I have created a Nested Table of PL/SQL type and have tried various references to get the SET operation to work, line containing MEMBER OF. Taking the example below from the oracle documentation I have two questions.
1) As I understand it I should be able to use SET operations on Nested tables of PL/SQL types, (I am not using the CREATE OR REPLACE DDL statement prior to the DECLARE block.). Is this correct? 2) I am assuming that I have to reference the record, can I reference by its type / row instance or can I only retrieve the record like a Cursor Fetch solution, (which would defeat the purpose.).
BANNER ---------------------------------------------------------------- Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi PL/SQL Release 10.2.0.4.0 - Production CORE 10.2.0.4.0 Production TNS for Solaris: Version 10.2.0.4.0 - Production NLSRTL Version 10.2.0.4.0 - Production
5 rows selected.
I have a problem with views and nested selects which I cannot explain. Here is a trimed down version of the research I have done. notice the following:
1) all code is executed from the same user CDRNORMALCODE. this user has all views and procedural code 2) all data is owned by a different user CDRDATA. This user has no views and no code.
My problem is this:
If I reference the table directly with a delete statement that uses a nested select (i.e. IN clause with select), the index I expect and want is used.But if I execute the same delete but reference even the most simple of views (select * from <table>) instead of the table itself, then a full table scan is done of the table.
Here is an execute against the table directly (owned by cdrdata). Notice the reference to the table in the table schema on line 3. Also please notice INDEX RANGE SCAN BSNSS_CLSS_CASE_RULE_FK1 at the bottom of the plan.
SQL> show user USER is "CDRNORMALCODE" SQL> SQL> explain plan for 2 delete
[code]...
OK, here is an update. The views I am useing normally have instead of triggers on them. If I remove the instead of trigger the problem looks like it goes away, when I put the trigger back the problem comes back.But why would an instead-of-trigger change the query plan for a view?
SQL> DELETE FROM PLAN_TABLE;
5 rows deleted.
SQL> explain plan for 2 delete 3 from BSNSS_CLSS_MNR_CASE_RULE_SV
I'm joinging two tables event_types and tmp_acc tables.
event_types contains 2 Billion records tmp_acc contains 20,000 records.
Resulting rows are about 300,000 records in event_types table end_t and account_obj_id0 are joined indexed
no indexs in tmp_acc.
When I run below query with nexted loop it takes 6 hrs to complete. But when I run with hash join even after 4 days it was still running. what is wrong with hash join here. Why it takes so long. I'm joining only 20000 rows. So I think there should be a way to get result rows quickly.
show parameters hash_area_size
NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ hash_area_size integer 2097152
explain plan for select --+ parallel(e,6) [code]....
I have to do the optimization of a query that has the following characteristics:
- Takes 3 hours to process - Performs the inner join with 30 tables - Produces an output of 280 million records with 450 fields
First of all it is not feasible to make 30 updates (one for each table) to 280 million records.
The best solution that I had found so far was to create 3 temporary tables, where each of them to do the join with 1/3 of the 30 tables, and in the end I make the join between the main table and these three tables temporary.
I know that you will ask (or maybe not) to the query and samples, but it is impossible to create 30 examples.
how to optimize this type of querys that perform the join with multiple tables and produce a large output with (too) many columns.
I'm looking to see if there is a way (fully expecting it to be an underscore, or two...) to force the optimizer to keep churning until all permutations are exhausted.I'm aware that it, to paraphrase, cuts out when it's spent more time parsing than it would just running it based on it's estimates.
I've got some irritating problems with xml rewrite, xml indexes and access paths/cardinalities etc and I'm really needing the entire thing considered as a one off for debugging this. I've already cranked up the maximum permutations to the max but it's not enough, it shorts out after 5041 permutations (I'd set that to 80000 max).
I know you'd not want to do this in the real world but I cant get the damned thing to run the plan I want in a 10053 so I can see the values it has there. I know I can hint it, but I'm trying to ascertain why it's not even considering it in a "normal" parse.
The product I work on requires a query to tell us what tables are dependent on certain types.
SELECT dba_tab_cols.owner, dba_tab_cols.table_name, dba_tab_cols.data_type_owner, dba_tab_cols.data_type FROM dba_tab_cols JOIN dba_types ON dba_types.owner = dba_tab_cols.data_type_owner AND dba_types.type_name = dba_tab_cols.data_type WHERE (dba_types.owner IN ('SCHEMA1', 'SCHEMA2'......))
I find this query to be pretty slow. I think it is because data_type_owner in dba_tab_cols is not indexed. Adding an index is not an option because users expect our product to read-only.
however I was able to identify a poorly performing query that seemed to be maxing out our CPU. I have been trying to understand the Explain Plan. The plan below is from our test system which has considerably less information in the tables than our PROD system.
I can see there are a bunch of table scans at the end which may indicate missing indexes, but I am unclear on whether this is actually a problem as the %CPU seems to be worse for the JOIN near the top of the plan.
------------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | Inst |IN-OUT| ------------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1870M| 3018G| | 677M (1)|999:59:59 | | | | 1 | SORT ORDER BY | | 1870M| 3018G| 3567G| 677M (1)|999:59:59 | | |
Having production system: 11.2.0.1 on Windows Server x64 Test system: 9.2.0.1 on Windows XP
Problem preface: to get all unique CASEID which should be checked up by biometric system.What i should check - all CASEs for different PERSONs having same PHONEs at least among one phone type (1..4).Real table contains little bit more than 10 million records.I made test scripts.
Below the DDL for test table creation: ------------------------------------------ -- Create CASEINFO test table ------------------------------------------ DROP TABLE CASEINFO; CREATE TABLE CASEINFO
[code]...
Below i've put SQL/DLL to make test data.number of records inserted 2 millions. PERSON_COUNT := #/8; ------------------------------------------ -- fill CASEINFO with sample data ------------------------------------------ DECLARE I INTEGER;
[code]...
Below SQL select to check the data in created table. ------------------------------------------ -- Check test data counters ------------------------------------------ SELECT 'TOTAL',count(*) from CASEINFO UNION ALL SELECT 'LEGAL',count(*) from CASEINFO where
[code]...
The PROBLEM is that i am experiencing HUGE perfomance problems on both test and production systems with that query:
select distinct b.caseid from CASEINFO a, CASEINFO b where (a.person<>b.person) and (a.sex=b.sex) and ( (a.phone1=b.phone1) or (a.phone1=b.phone2) or (a.phone1=b.phone3) or
[code]...
This query takes almost 90 minutes to execute.And i do not know how to avoid this.Full SQL file to make test attached.
CREATE OR REPLACE TYPE CourseList AS TABLE OF VARCHAR2(64); CREATE TABLE department ( courses CourseList) NESTED TABLE courses STORE AS courses_tab; INSERT INTO department (courses)VALUES (CourseList('1','2','3'));
[code]....
The query returns the correct data, CourseList that are not subset of any other CourseList of the table.
I am trying to convert this not exists in a left outer join query to check if the performance is better, but I don't know how to do it.
I was making some variations of this code :
select d1.courses c_1, d2.courses c_2 from department d1,department d2 where d1.courses<>d2.courses(+);
CODECREATE OR REPLACE TYPE ADDR_T AS OBJECT ( ADDR1 VARCHAR2 (50), ADDR2 VARCHAR2 (50) ); CREATE OR REPLACE TYPE t_ADDr AS TABLE OF ADDR_T;
[code]....
I have added some records and created index on ID column. I want to get result of CODEselect id, p.addr1,p.addr2 from nested_table n,table(n.COL1) p where id=1
Explain plan for that is:
CODESELECT STATEMENT ALL_ROWSCost: 8 Bytes: 231 Cardinality: 3 4 HASH JOIN Cost: 8 Bytes: 231 Cardinality: 3 2 TABLE ACCESS BY INDEX ROWID TABLE SYS.NESTED_TABLE Cost: 2 Bytes: 13 Cardinality: 1 1 INDEX RANGE SCAN INDEX SYS.FDSFAS Cost: 1 Cardinality: 1 3 TABLE ACCESS FULL TABLE (NESTED) SYS.COL1_TAB Cost: 5 Bytes: 163,840 Cardinality: 2,560
How to avoid full table scan on nested table? Cardinality is sum of all records in nested column in all rows in main table, why?
Looking to understand the difference between instance tuning and database tuning.
What is the difference between these two tuning exercises? I understand that an instance is memory based structures (logical) where as database consists of physical structures.
However, how does one tune a database the physical structure? Does it have to do with file placements/block sizes etc. Would you agree that a lot of that is taken care by ASM now in 11g? What tools are required/available (third party as well as oracle supplied) for these types of tuning scenarios?
I have two tables with 113M records in DWH_BILL_DET & 103M in prd_rerate_chg_que and Im running following merge query, which is running for 13 hrs to update records, which is quiet longer time.
SQL> explain plan for MERGE /*+ parallel (rq, 16) */ INTO DWH_BILL_DET rq USING (SELECT rated_que_rowid, detail_rerate_flag_code, rerate_sel_key,
How the length of column width effects index performance?
For example if i had IOT table emp_iot with columns: (id number, job varchar2(20), time date, plan number)
Table key consist of(id, job, time)
Column JOB has fixed list of distinct values ('ANALYST', 'NIGHT_WORKED', etc...).
What performance increase i could expect if in column "job" i would store not names but concrete numbers identifying job names. For e.g. i would store "1" instead 'ANALYST' and "2" instead 'NIGHT_WORKED'.
I have a question about database fragmentation.I know that fragmentation can reduce performance in query times. The blocks are distributed in many extents and scans process takes a long time. Oracle engine have to locate the address of the next extent..
I want to know if there is any system view in which you can check if your table or index has high fragmentation. If it's needed I will have to re-create, move or rebulid the table or index, but before I want to know if the degree of fragmentation is high.
Any useful script or query to do this, any interesting oracle system view?
There is a simple way to increase the performance of a query by reducing the row-size of the table it hits. I used it in the past by dividing the table into smaller parts and querying respective smaller table in each query.
what is this method called ? just forgot the method and can't recall it. what this type of row-reduction optimization is called ?
How many records could I have in a single table without performance degradation with Standard Edition without partitioning with cutting-edge server (8 or 12 cores, 72 GB RAM, FC 4 Gbit, etc...) and good storage?
300 Millions in only one table with 500K transactions / day is too much?
Testing our 9i to 11g upgrade, we've imported the entire DB into the new machine.We've found that certain procedures are really suffering performance problems. BUT, we've also found, that if we check out a production copy of the procedure from our source code control, and reinstall it, the performance issue goes away. Just alter the procedure and recompiling does NOT work.
The new machine where the 11g database exists is slightly different than the source, but it's not like we have this problem with every procedure. It's only a couple.
any possible reason that we'd have to re-install a procedure to correct a performance problem?