Performance Tuning :: Undo Tablespace Full / Which Is Rectified But Now Having Big WAIT
Mar 7, 2012
we have a situation where both undo tablespaces were almost filled i.e UNDOTBS1 99% and UNDOTBS2 100% filled so i add data files to it and then i found a lot of blocking session and was just killing them through EM then i stop my front end listener and also down the service, now i don't have any blocking session but on EM a big WAIT is coming. alert log shows nothing serious, it was showing deadlock but now it is over as well.
i'm facing a problem while i'm inserting millions of record from table to table that undo tablespace reach 100% full and execution aborted. , how can free the undo tablespace ??? many of extendes are offline. will it flush automatically ??? or what i should do
I noticed oracle background process ora_fbda_padwsdpr is suffering from buffer busy wait. When i further finding the object, it was on SYS_FBA_FA tables.
what is this is causing BUFFER BUSY WAIT. Also to add we have disabled flashback database.
I am using 11.2.0.3.0 version of oracle. As told by the DBAs that two days back the system realised high wait_class i.e 'Application' wait, and its a lot high than regular value as per our system and we need to dig it down, to avoid any future issue. now using below query , i found that the high wait time due to wait class 'Application' is actually belongs to particular event 'SQL*Net break/reset to client', and the sample time was '9 AM' morning.
select time, round(max(case when event = 'SQL*Net break/reset to client' then time_waited_delta/1e3/decode(total_waits_delta,0,1,nvl(total_waits_delta,1)) end) ,2) SQL_break_reset_client, round(max(case when event = 'Wait for Table Lock' then time_waited_delta/1e3/decode(total_waits_delta,0,1,nvl(total_waits_delta,1)) end),2) Wait_for_Table_Lock , round(max(case when event = 'enq: KO - fast object checkpoint' then [code]....
Now i want to track it down further to the 'session/sql query/application' level resulting into such high value of wait time. so i queried, dba_hist_ active_ session_history, for the same sample duration (8.30 to 9 am) having event 'SQL*Net break/reset to client', and got two sessions(123,154) and their serial# , but there i got one sql_id(3ahgrey10ogh i.e a 'SELECT' query) specific to one session(123) but for other(125) no sqlid, and also SUM (wait_time + time_waited) is showing '0'.
Then i just removed the sampletime filter from the below query and observed that the same 'same session(123)+session serial#' was activated since 4 days back and was experiencing same waitevent 'SQL*Net break/reset to client', (it was having normal wait event 'db file sequential read' for sometime then after it went for this event ).
I have a partitioned table with degree for parallelism defined as 10.
I am getting maximum wait events on PX Deq: Execute Reply.For 2 hours trace the wait event is almost 1.5 hour.
I have done some seraching and i found this.
Quote: In principle:A parallel query against a partitioned table will use one slave per partition if the query is thought to span multiple partitions, and it can use all slaves on a single partition if the query is thought to target just one partition. Unfortunately, this is NOT strictly true.
It is possible for the optimizer to decide to use parallelism at degree M when accessing N partitions. Sometimes this can lead to very inefficient, brute-force, processing when a more efficient path is available. This can be a particular problem with multi-table joins that should be partition-wise joins. You may be better off leaving the tables defined as non-parallel and adding explicit parallel hints to the code for critical queries.
So i have following questions 1) What is the meaning of PX Deq: Execute Reply. 2) Is this not recommended to use DEGREE clause in Partitioning. 3) Defining DEGREE clause in partioning of table will automatically executes DML on table PARALLELY same a PARALLEL hint clause
I ran an AWR report. The database looks fine, but a data load that loaded 1 Million rows an hour is now doing 500K per hour.
Wait Class Waits %Time -outs Total Wait Time (s) Avg wait (ms) %DB time DB CPU 224 80.70 Other 2,668 0 28 10 9.99 System I/O 4,753 0 9 2 3.23 Administrative 1 0 6 5543 2.00 Commit 357 0 4 11 1.46
[code]....
The network value for wait: 630,601. What does this mean? Anything I should look at? When it was 1million per hour, the value was 4,563,000.
Top 5 Timed Foreground Events
Event Waits Time(s) Avg wait (ms) % DB time Wait Class DB CPU 224 80.70 unspecified wait event 2,666 28 10 9.99 Other control file sequential read 4,753 9 2 3.23 System I/O switch logfile command 1 6 5543 2.00 Administrative log file sync 357 4 11 1.46 Commit
We are using 11.2.0.3.0 on solaris 10 facing slow performance, following are the Wait Events in AWR report, Also if any specific document to analyze AWR report and to pin point the performance bottleneck.
Foreground Wait Events ********************** Avg %Time Total Wait wait Waits % DB Event Waits -outs Time (s) (ms) /txn time -------------------------- ------------ ----- ---------- ------- -------- ------ direct path read 308,729 0 21,191 69 58.0 39.5 db file sequential read 208,754 0 3,742 18 39.2 7.0 cursor: pin S 19,541,899 0 2,561 0 3,668.5 4.8 [code]....
I'm an application developer of an automotive company and developing a lot of database-based applications with either oracle forms or c#.Since we've moved from a 10g rac to 11g using a shared server configuration, the prevailing and overwhelming topic of addm performance analysis is "unusual network wait event" caused by virtual circuit waits. Therefore I cannot use grid control to detect bad sql as I could in 10g anymore, because all "tunable" sql is wiped out by virtual circuit wait.In top activity, I see virtual circuit wait on every type of statement (select, insert...) and pl/sql execution.
What do I have to do as an application developer to avoid virtual circuit waits? Especially in C#: we normally use auto committed dml statements and selects to fill either a datatable or generic list with a data reader. Usually we close a connection after each statement, but/and we are using connection pooling. How can such a activity cause virtual circuit waits? In Oracle Forms: Seems that we have a virtual circuit wait if we show sorted data in a block where not all records are fetched from database. It doesn't make sense to us to rewrite all blocks to always get all records due to performance reasons.
How do I have to write and execute my statements in C#, oracle forms and/or pl/sql to avoid virtual circuit wait?
I am getting the below error in alert log file,when my application calling a procedure.
ORA-01555 caused by SQL statement below (Query Duration=1576 sec, SCN: 0x09a2.5dda3165): Fri Sep 16 16:33:40 2011 UPDATE SSPT_NETWORK_DETAILS SET INCLUDE_OFFERS = 'Yes' WHERE SESS_ID = SESS_ID
There is no ROLLBACK statement in my procedure. As per my understanding, the ORA-1555 error will occur,
1. The required old image is not in the undo,when we rollback the trasaction. 2. the select query may face this error because of delayed block cleanout concept.
But I don't know why this update statement causing this 1555 error?
regarding sizing undo tablespace and undo_retention parameter.we have to implement the database in production system with 40 users but how much space should be allocated to undo tablespace is there any propotions related to virtual memory and the parameter.i have gone thru oracle doc's and some related sites.its an ERP aplications that contains 20 modules .I am an new one to this dba level
NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ undo_management string AUTO undo_retention integer 96000 undo_tablespace string undo
Following are the details in AWR report (00:00 til 01:00 of 21-Apr-2013) .... not thet the error was produced at 00:42
Undo Segment Summary DB/Inst: DBCPY/dbcpy01 Snaps: 18853-18854 -> Min/Max TR (mins) - Min and Max Tuned Retention (minutes) -> STO - Snapshot Too Old count, OOS - Out of Space count -> Undo segment block stats: -> uS - unexpired Stolen, your - unexpired Released, uU - unexpired reUsed
[code]....
Undo Advisor information taken 'now' is as following
SQL> select dbms_undo_adv.longest_query(sysdate-2,sysdate) from dual; DBMS_UNDO_ADV.LONGEST_QUERY(SYSDATE-2,SYSDATE) ---------------------------------------------- 379650 SQL> select dbms_undo_adv.required_retention from dual;
[code]....
In above situation what should be my first choice (assuming increasing space is not an issue) - increase undo tablespace or increase undo retention?
If latter is the choice then what should be the value? Because as I understand present 96000 value is taken as lower limit and because of auto tuning the actual value (TUNED_UNDORETENTION) being used was 345600 In that case shall I set it to something > max(maxquerylen) i.e 379,650 + X?Or I shall increase the undo tablespace size?
From Undo Advisor output it looks to me that even if I increase the undo retention to 379650 current undo size will be able to support it (may be at the expense of DMLs)Is that right?
Below query is degrading the performance of database. As we know that, without where clause, query do full table scan.Now, it is written to generate the sequence no.
SQL> explain plan for 2 SELECT NVL(MAX(P.NUM_SERIAL_NO), 0) + 1 FROM CNFGTR_IRDA_ENVELOPE_DTLS P 3 / Explained. SQL> select * from table(dbms_xplan.display()); PLAN_TABLE_OUTPUT ---------------------------------------------------------------------------------------------------- Plan hash value: 3345343365 ------------------------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------------
I have created an non unique index lk_fein on lookup_fein( code,map_id,trash). When I check the explain plan it does a full table scan on lookup_fein. if I force it to use index by it does and the cost also decreases.
on our 10.2.0.5 database, when we run full backup, my system performance comes to an halt. we run full backup and then do a validate backup to validate the structure of the database etc. Database performance takes a hit and all of the application connections goes in wait mode: On ASH or AWR - this is the top wait i see:
i am using 11.2.0.3.0 version of oracle. We are planning to move some ~40 tables/indexes to new encrypted tablespace as a part of TDE(transparent data encryption). Currently three tables are having size ~30GB and one having ~800GB other have <2GB in size. And tables/indexes are altogether placed in different tablespaces.
whether i should create as many no of encrypted table spaces as it was before as unencrypted tablespace? or I should create one encrypted tablespace and move all the tables/indexes into that?
All the analysis till now on our system proves that our system is clearly I/O bound and db sequential read is the biggest culprit.
We have even identified the index which is being affected by sequential read. I am thinking of creating a new tablespace with 32K blocksize (currently all table spaces are 8k) and migrate this index to the new space. That way, Oracle will have to do less number of reads to get the required data.
But is there anything wrong in having just one tablespace with a differnt block size? Or is there anything that I have to be watchful about while doing it?
I have a long running transaction (more than 3 hours), and at the same time other operations are occurring on different tables, using the same UNDO space.
Sometimes we see ORA-30036, but this error occurs very late in the process. The transaction normally takes 3 hours, but when UNDO space is full, we do not get ORA-30036 upto 8 hours or 9 hours of process.
I am wondering what could be happening in the background, when UNDO space is full, which makes the transaction to extend upto 8 hours or 9 hours (pl. note, this transaction gets completed within 3 hours normally). This is in 11g, UNDO space is managed manually.
At a time my 20 GB undo tablespace was full. So i increased the tablespace size upto 48 GB. Then i saw 45 GB was used. Then i changed undo_retention=60. After that am seeing that 48 GB is full.
1) why it's happened? 2) Here what is the effect of undo_retention=60 3) How to resolve?.
Can an undo tablespace be too large and actually hurt performance? I have seen a system with a dedicated 1 TB drive for undo tablespace (no guarantee) with an undo_retention of 7 days. Would this hurt performance? What about setting an undo_retention of 24 hours with no guarantee? The only mention I could find online said that it would not hurt performance but I wanted to double check. You would think that Oracle does not care if it deletes the undo at 15 minutes or if it deletes the undo at a later date such as 7 days later and the performance should stay the same.
I am trying to drop 90 columns from a big partitioned table. I was trying a physical drop first and since it is taking longer time I decided make the columns unused state and drop them. However I was able to set them to unused state.
Now I am trying to drop those unused columns from the table, it is running since 22hours Apporox. I am keep increasing the undo tablespace to retain the undo data.
I also have decreased the undo_retention to 300 from 900.
My question is there any better way to drop these columns. And is there any way to flush out the data from undo.
i Cannot drop old undo tablespace. While dropping the old undo tablespace we get an error
ERROR at line 1: ORA-01548: active rollback segment '_SYSSMU77$' found, terminate dropping tablespace
SQL> select tablespace_name, status, segment_name from dba_rollback_segs where status != 'OFFLINE';
TABLESPACE_NAME STATUS SEGMENT_NAME ------------------------------ ---------------- ------------------------------ SYSTEM ONLINE SYSTEM APPS_UNDO NEEDS RECOVERY _SYSSMU77$