Performance Tuning :: Data Type Conversion Impact?
Nov 28, 2011
My SQL query has three tables in the FROM clause, so it has two join conditions and one WHERE condition.
account_no is of NUMBER data type and v_account_no is of VARCHAR2 data type.
The WHERE clause is:
"where account_no = to_number(v_account_no)" — with this condition the query has a cost of 392.
We then modified the WHERE clause to "where v_account_no = to_char(account_no)", and with this condition the query has a cost of 11.
What is the impact of this data type conversion, and what is the difference between to_number() and to_char() performance-wise in reducing the cost of the query?
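The cost difference usually comes down to which column's index the optimizer can still use: wrapping a column in TO_NUMBER() or TO_CHAR() prevents a normal B-tree index on that column from being used for the predicate. A minimal sketch (tables t1/t2 and the indexes are assumptions, not from the original post):

-- Form 1: the function is applied to V_ACCOUNT_NO, so an index on
--         T1.ACCOUNT_NO can be used, but an index on T2.V_ACCOUNT_NO cannot.
SELECT *
FROM   t1, t2
WHERE  t1.account_no = TO_NUMBER(t2.v_account_no);

-- Form 2: the function is applied to ACCOUNT_NO, so an index on
--         T2.V_ACCOUNT_NO can be used, but an index on T1.ACCOUNT_NO cannot.
SELECT *
FROM   t1, t2
WHERE  t2.v_account_no = TO_CHAR(t1.account_no);

So the lower cost most likely reflects a switch from one access path to another, not any intrinsic speed difference between the two conversion functions.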
I have always been under the impression that we should declare a VARCHAR2 column with the minimum length that can hold the data we need to manipulate. But recently I was told that assigning a much longer size has little impact on performance.
We have a huge table in production with a LONG column, and we are trying to change its datatype to CLOB. The table has 120 million records and is 270 GB in size.
We tried the Oracle expdp/impdp route to test the conversion in our performance environment. With PARALLEL=32 the export completed in 1.5 hours, but the import took 13 hours.
I also tried the TO_LOB option using inserts; it ran for 20 hours before I killed the process. Are there any ways to improve the performance of a LONG to CLOB conversion on huge tables?
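Two approaches are commonly discussed for this; a hedged sketch only, with placeholder table/column names and no claim about how it will perform at 270 GB:

-- In-place conversion; Oracle still rewrites every row, so it remains a big job:
ALTER TABLE big_table MODIFY (long_col CLOB);

-- Alternative: rebuild into a new table with a direct-path, NOLOGGING CTAS
-- using TO_LOB, then re-create indexes/constraints and rename the tables.
-- (Reads on LONG columns are serial, so the gain comes mainly from NOLOGGING
-- and from avoiding conventional-path inserts.)
CREATE TABLE big_table_new NOLOGGING AS
SELECT pk_col,
       other_col,
       TO_LOB(long_col) AS long_col
FROM   big_table;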
I'm trying to find some information on the performance impact of a trigger on a heavily updated table when the condition to fire the update trigger is NOT met. In other words, what I'm really trying to find out is the cost of the system evaluating the trigger's condition to decide whether it should fire.
For example, I have a batch job that inserts into and updates a table heavily. The batch job almost never updates the trigger's column to the value that would cause it to fire, but it does update that column to other values often.
I know about the many downsides of using triggers in general, but I'm working with a third-party application, so more optimal solutions aren't an option.
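For reference, the kind of trigger being described typically uses a WHEN clause, which Oracle evaluates per affected row before deciding whether to run the trigger body. A hedged sketch; the table, column, and value names are made up:

CREATE OR REPLACE TRIGGER trg_orders_status
AFTER UPDATE OF status ON orders
FOR EACH ROW
WHEN (NEW.status = 'CLOSED')   -- condition checked for every affected row;
                               -- the body below only runs when it is TRUE
BEGIN
  INSERT INTO orders_audit (order_id, closed_date)
  VALUES (:NEW.order_id, SYSDATE);
END;
/

The per-row cost when the condition is not met is essentially just the evaluation of that WHEN expression, which is usually small compared with the body itself.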
In my Oracle Enterprise Manager, under "User I/O", I basically have four categories.
If we rank them out of ten, it would be: read by other session 2/10, db file scattered read 1/10, direct path read 0.5/10, db file sequential read 6.5/10.
All of these come from the same 2 tables almost all of the time. Is there some way to handle "read by other session" and "db file sequential read"?
I rebuild the indexes on these tables once every 10 days, and statistics for these tables are collected every day using "analyze table xxx compute statistics". Tell me the in-depth approach I should take to minimize the impact, as users are complaining about performance.
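As a side note, ANALYZE ... COMPUTE STATISTICS is deprecated for gathering optimizer statistics; DBMS_STATS is the supported route. A minimal sketch (schema and table names are placeholders):

BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => 'APP_SCHEMA',                      -- placeholder schema
    tabname          => 'XXX',                             -- placeholder table
    estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,
    method_opt       => 'FOR ALL COLUMNS SIZE AUTO',
    cascade          => TRUE);                             -- gather index stats too
END;
/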
What are the principal things to look at when the same query gives different performance results? I have 2 different databases: the plan and the data are the same, but the performance results are very different.
How can I avoid PLW-07204 "conversion away from column type may result in sub-optimal query plan" (or rather the problem Oracle is flagging here) in my PL/SQL code? My code looks as follows:
create table T(D date);

create or replace procedure P is
  cursor C(D_IN IN DATE) is
    select D from T where trunc(D) = SYSDATE;
begin
  null;
end P;
[code]...
I don't want to disable the warning - it is there for a reason.
Additional description of the issue: it seems that D is of typ=12, while trunc(D) is of typ=13. You can use DUMP to read this info. This is described at URL....
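One way this kind of problem is commonly worked around (a sketch only; whether it silences the warning in your exact version is something to verify) is to rewrite the predicate as a range on the bare column, which also keeps any index on D usable:

create or replace procedure P is
  cursor C(D_IN IN DATE) is
    select D
    from   T
    where  D >= trunc(SYSDATE)
    and    D <  trunc(SYSDATE) + 1;   -- equivalent to trunc(D) = trunc(SYSDATE);
                                      -- note the original trunc(D) = SYSDATE only
                                      -- matches when SYSDATE is exactly midnight
begin
  null;
end P;
/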
1) Split values from the "INST" column: suppose 23.
2) Find all values from the "NUM" column for the above split value, i.e. 23.
Eg:
For INST 23, its corresponding "NUM" values are: 1234, 1298.
3) Save these values into a table Y, with INST and NUM as the column names:
INST  NUM
23    1234,1298
1) I have a thousand records in table X, and for all of those records I need to split and save the data into table Y. Hence, I need to do this task with the best possible performance (see the sketch after this list).
2) After this, whenever new data comes into table X, the above 'split & save' operation should automatically be called and append the corresponding data wherever applicable.
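A hedged sketch of one way to do this, assuming table X has one row per (INST, NUM) pair and you are on 11gR2 or later where LISTAGG is available (table and column names follow the description above):

-- One-off load: aggregate all NUM values per INST into a comma-separated list.
INSERT INTO y (inst, num)
SELECT inst,
       LISTAGG(num, ',') WITHIN GROUP (ORDER BY num) AS num
FROM   x
GROUP  BY inst;

-- For new rows arriving in X afterwards, a scheduled job (or a call made after
-- each batch load) can rebuild the affected INST values with a MERGE:
MERGE INTO y tgt
USING (SELECT inst,
              LISTAGG(num, ',') WITHIN GROUP (ORDER BY num) AS num
       FROM   x
       GROUP  BY inst) src
ON    (tgt.inst = src.inst)
WHEN MATCHED THEN UPDATE SET tgt.num = src.num
WHEN NOT MATCHED THEN INSERT (inst, num) VALUES (src.inst, src.num);

A set-based statement like this will generally beat row-by-row splitting for a thousand (or a million) source rows.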
Sometimes when I re-run a query a few times, it becomes much faster after the first run. This is a problem for me when I'm trying to optimize a query. Is there some sort of cache? Can it be disabled?
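Yes: the first run typically pays for physical reads and a hard parse, while later runs hit the buffer cache and reuse the cached plan. On a test system (never production) the caches can be flushed between runs to compare like with like; a minimal sketch:

-- Test systems only: flushing these caches hurts every other session.
ALTER SYSTEM FLUSH BUFFER_CACHE;   -- discard cached data blocks
ALTER SYSTEM FLUSH SHARED_POOL;    -- discard cached SQL plans (forces re-parse)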
CREATE OR REPLACE PROCEDURE fast_proc (p_rows OUT NUMBER) IS
  TYPE object_id_tab IS TABLE OF all_objects.object_name%TYPE
    INDEX BY BINARY_INTEGER;   -- a semicolon is required here; it is missing in the
                               -- original one-line paste and is a likely cause of the
                               -- compilation error reported below
  lt_object_id object_id_tab;
  CURSOR c IS
[Code]....
Warning: Procedure created with compilation errors.
We have a few tables in our production database which are huge in size and will keep growing in the future, so as part of the corrective measures we have jotted down the below 3 methods to manage the size of those tables:
1> Partition the table, export the identified partitions, and after that truncate those partitions (see the sketch after this list). 2> Create history tables and move the not-so-current data from the original table into the history tables.
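A hedged sketch of what option 1 might look like; the directory, dump file, schema, table, and partition names are all placeholders:

-- Export a single partition with Data Pump (run from the OS shell):
--   expdp system/*** directory=DUMP_DIR dumpfile=sales_2011.dmp tables=APP.SALES:SALES_2011
-- Then remove the exported data from the live table:
ALTER TABLE app.sales TRUNCATE PARTITION sales_2011;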
I'm extracting/retrieving data from the Oracle database using a Java application, and it's a bit slow. However, when I retrieve from SQL Server it's faster than Oracle.
We have data migration scripts written for Oracle. The data is not huge, but we are observing that the migration is fast in the development labs and 5x slower at the production site.
The development Oracle setup is on Windows and the production setup is on Solaris. I have attached the AWR report generated for a period where the migration ran for 3 hours and was stopped due to slow performance.
Here is my initial analysis.
1) The top timed event is DB CPU. Hence I feel the migration scripts can be modified to run in parallel so that they finish faster. However, the question then arises: why does it run faster in the development environment if this is the issue?
2) I tried increasing:
a. large_pool_size to 512M (from 0)
b. sga_max_size to 8G (from 4G)
c. sga_target to 8G (from 4G)
I have attached the AWR, and below are the /etc/system contents for the Solaris settings.
* Begin MDD root info (do not edit)
rootdev:/pseudo/md@0:0,1,blk
* End MDD root info (do not edit)
set noexec_user_stack=1
set noexec_user_stack_log=1
* IBMdpo vpath_START (do not remove)
* default SCSI timeout is 60 seconds
* uncomment to change SCSI timeout
* set sd:sd_io_time=0x1e
forceload: drv/vpathdd
* IBMdpo vpath_END (do not remove)
set noexec_user_stack=1
set semsys:seminfo_semmni=100
set semsys:seminfo_semmns=1024
set semsys:seminfo_semmsl=256
set semsys:seminfo_semvmx=32767
set shmsys:shminfo_shmmax=4294967295
set shmsys:shminfo_shmmin=1
set shmsys:shminfo_shmmni=100
set shmsys:shminfo_shmseg=10
P.S. The AWR report was renamed from .html to .txt to be able to upload the file.
We are on Oracle 10.2.0.4 on Solaris 10. There is a table in my production DB that has 872,944 rows. Most of its data is now unnecessary: based on a date column, we need to retain just the last one month's data and delete the rest. After that the table will have just 3,000 rows.
However, as the table was huge earlier (872k rows prior to the delete), does deleting the data release its Oracle blocks and reduce the size of the table? If not, should I rebuild the table online (online redefinition) so that the query that does a full scan on this table goes faster?
I checked using an example table that a delete alone does not remove the Oracle blocks - they still show against the table in USER_TABLES and the cost of the full table scan remains the same. We have a query that does a full table scan on this table, so I am thinking that after this delete I should do an online table redefinition. Is that the right decision?
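For context, the common alternatives to online redefinition for reclaiming space after a large delete are a segment shrink (needs ASSM and row movement) or ALTER TABLE ... MOVE followed by index rebuilds. A hedged sketch with placeholder names:

-- Option 1: shrink in place (table must be in an ASSM tablespace).
ALTER TABLE big_tab ENABLE ROW MOVEMENT;
ALTER TABLE big_tab SHRINK SPACE CASCADE;   -- CASCADE also shrinks the indexes

-- Option 2: rebuild the segment; indexes are left UNUSABLE and must be rebuilt.
ALTER TABLE big_tab MOVE;
ALTER INDEX big_tab_pk REBUILD;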
I created a view on the production server which takes almost 10 to 12 minutes to show data. This view joins 3 or 4 tables, on which all primary key and unique columns have indexes. Which kind of index would be better for fast retrieval of the data?
I have a table which contains 821,177 rows in total. Now I am trying to delete around 484,000 rows from this table using just one filter, i.e. my query is something like the one below:
DELETE /*+ parallel(resource,4) */ FROM resource where created_by = 'MIGN'
This is going to delete 484,000 rows of data, but my current issue is that it is taking a lot of time; to be precise, almost 25 hours to delete this data. The created_by column is indexed.
Execution Plan
----------------------------------------------------------
Plan hash value: 2389236532

--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------
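One thing worth noting: a PARALLEL hint on a DELETE only parallelises the DML itself if parallel DML has been enabled in the session; otherwise the delete runs serially and at best the scan is parallelised. A hedged sketch reusing the statement above:

ALTER SESSION ENABLE PARALLEL DML;

DELETE /*+ PARALLEL(resource, 4) */
FROM   resource
WHERE  created_by = 'MIGN';

COMMIT;   -- parallel DML must be committed before the session touches the table again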
We are copying our transaction tables' data into another database for our reporting applications (say, a refresh happens every day at midnight).
The transaction database has some 30 tables. The existing system follows the steps below and takes 2 hours to complete.
1) Truncate the data in the reporting database (or schema).
2) Direct-path insert into the reporting database (or schema) as select * from the transaction tables.
3) Rebuild indexes and enable constraints.
Note: each table's data varies from 3 million to 5 million rows (30 to 50 lakhs). Dump/export/import is not advised by the client.
I want to cut the time down to below 2 hours. Instead of the above method, can I go for a field in each table specifying the time of each record's insert/update, and then pick only the modified records and copy them into the reporting DB?
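A hedged sketch of that incremental approach, assuming each source table gets a LAST_MODIFIED column maintained by the application or a trigger; the table, column, database link, and control-table names are all placeholders:

-- Run on the reporting database; copy only rows changed since the last refresh.
MERGE INTO rpt.orders tgt
USING (SELECT *
       FROM   orders@txn_db                       -- placeholder database link
       WHERE  last_modified > (SELECT last_refresh_time
                               FROM   rpt.refresh_control
                               WHERE  table_name = 'ORDERS')) src
ON    (tgt.order_id = src.order_id)
WHEN MATCHED THEN UPDATE SET tgt.status        = src.status,
                             tgt.amount        = src.amount,
                             tgt.last_modified = src.last_modified
WHEN NOT MATCHED THEN INSERT (order_id, status, amount, last_modified)
                      VALUES (src.order_id, src.status, src.amount, src.last_modified);

One caveat: rows deleted in the source are not picked up by this pattern, so deletes need either soft-delete flags or a periodic full rebuild.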
I am inserting data for the years 2012 and 2013 using a procedure that loads partitioned tables; each partition holds crores (tens of millions) of rows, and the load takes a very long time, even months. Is there any other way I can insert the data from our query faster?
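A hedged sketch of a direct-path, parallel load; the table, partition, and source names are placeholders, and APPEND/parallel trade recoverability and concurrency for speed, so check against your backup strategy first:

ALTER SESSION ENABLE PARALLEL DML;

INSERT /*+ APPEND PARALLEL(target_tab, 8) */
INTO   target_tab PARTITION (p_2012)
SELECT /*+ PARALLEL(s, 8) */ *
FROM   source_tab s
WHERE  s.txn_date >= DATE '2012-01-01'
AND    s.txn_date <  DATE '2013-01-01';

COMMIT;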
I have created a materialized view and also a normal view; the same 3 tables are used in both. When new records are inserted they show up in the normal view, but when I select from the materialized view I can't see the updated data.
Here is the materialized view I created:
CREATE MATERIALIZED VIEW pct_sales_materialized
  BUILD IMMEDIATE
  REFRESH ON DEMAND
  ENABLE QUERY REWRITE
AS
SELECT A.DEP_NAME, B.EMP_ID, C.EMP_NAME
FROM   department_head A, department_child B, emp_detail C
WHERE  A.DEP_ID = B.DEP_ID
AND    B.EMP_ID = C.EMP_ID
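Because the view is defined with REFRESH ON DEMAND, its contents only change when a refresh is requested explicitly, for example:

-- Complete refresh of the materialized view created above.
BEGIN
  DBMS_MVIEW.REFRESH('PCT_SALES_MATERIALIZED', method => 'C');
END;
/

Automatic refresh on commit (REFRESH FAST ON COMMIT) is possible instead, but it requires materialized view logs on the base tables.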
We have a table with a huge amount of data that is skewed on a 'status' column. The 'status' column has 6 distinct values, with one particular value accounting for 80-85% of the records.
In the batch process we query the data by status and process the retrieved records. My senior is insisting on partitioning, which I don't see as very feasible considering the cost implications for just one part of the functionality.
There are 6 statuses: 'A','B','C','D','E','F',
with 'A' occupying 80% of the records and 'B' to 'F' occupying between 2% and 14% of the records each (approx.).
1) Create a conditional (function-based) index on status (using CASE) that only contains records with statuses other than 'A', then use an IF-ELSE structure (see the sketch after this list):
IF the input parameter is 'A' THEN
  select /*+ FULL(t) PARALLEL(t) */ * from t where status = 'A';
ELSE
  select /*+ INDEX(t conditional_index) */ * from t where status in ('B','C');
END IF;
I want to create the conditional index here for 2 reasons:
1] Since it only holds values for statuses other than 'A', there is no chance this index will be picked when status = 'A' is queried, which would make performance worse (status = 'A' covers 80% of the records); the IF-ELSE is additional protection. 2] Less impact on DML, as the index does not cover status = 'A', which contributes the large chunk of records.
2) Populate a dummy table containing ROWID and status. Since the business closes at 21:00 and the batch process starts at 21:30, refresh the dummy table every day between those times using MERGE (to catch the business transactions made during the day).
Then, during the batch process, retrieve records from the main table using the ROWIDs in the dummy table, depending on the input status value.
3) Create an index on status, make sure hard-coded status values are used in the database procedures, gather stats with histograms, and leave it to the optimizer to choose the best possible path.
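For reference, a hedged sketch of the conditional index mentioned in option 1 (table and column names as above); the query must use exactly the same CASE expression for the index to be eligible:

-- Rows with status 'A' produce NULL for the expression and are not stored in the index.
CREATE INDEX conditional_index
  ON t (CASE WHEN status <> 'A' THEN status END);

-- The predicate has to match the indexed expression:
SELECT *
FROM   t
WHERE  CASE WHEN status <> 'A' THEN status END IN ('B', 'C');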
I am getting back into Oracle (after a long haul in an MS-only environment) and am now testing Oracle installs. I have been given the task of comparing 12c and 10.2g. I set up 2 VMs (exactly the same configs) and used the same dmp file on both environments to restore the data and settings for our jobs to run. We have some aggregated data, and cubes with DIM tables, on each VM. We run nightly jobs to rebuild our cubes.
I am supposed to see/analyze the value of 12c, and I understand things might vary from company to company, but I am perplexed by my result: 12c is half the speed of 10.2g, even though both environments are the same out of the box, with the same dmp file and the same hardware.
I am using the same dmp file and the same jobs on each machine, with the VMs having 10.2g and 12c respectively installed out of the box as-is. What default Oracle settings might have changed from 10.2g to 12c that could make the exact same environment run twice as slowly on 12c?
The expectation was that, out of the box, with both machines running the same jobs on the same data (from the dmp files), 10.2g would be slower than 12c; instead, 12c takes 2 times as long to run the jobs. I have reviewed every possibility, as I know the problem is usually the person sitting in the chair and not the PC, but I confirmed everything was identical from one VM environment to the other except the version of Oracle out of the box.
What could be done to bring the defaults back to at least equal time between the two? That would give me a great starting point. Otherwise, I would have to chalk this up to bloatware.
I read up a bit on the CBO and know this might have changed in 12c. Is there a way to set it back to an earlier configuration, so as to at least match the execution plans in both environments?
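For the execution-plan part specifically, the optimizer can be asked to behave like an older release via OPTIMIZER_FEATURES_ENABLE; it does not roll back every 12c change, but it is a common first test. A hedged sketch:

-- Session-level test first; compare the plans before changing anything instance-wide.
ALTER SESSION SET optimizer_features_enable = '10.2.0.4';

-- Or for the whole instance (spfile):
ALTER SYSTEM SET optimizer_features_enable = '10.2.0.4' SCOPE = BOTH;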
In order to improve the performance of our live server, I am doing an exhaustive comparison with our test environment, which is quite quick despite the fact that we port the data from live every month.
There are no obviously slow queries appearing in the top SQLs of the AWR; we have optimised such things already. Right now it is about a general uplift rather than SQL-based tuning.
I picked some random SQLs and noticed a marked difference in execution time: typically 3 to 4 times slower, and in some cases much more than that.
1. I observed that, while the explain plans of the queries are the same, traces of the queries give a different picture: the recursive calls, consistent gets and sorts (memory) are quite high on live.
2. I have no solid reasons to say this, but my instinct tells me that the recursive calls are the major contributing factor; it is sometimes 2000+ for a single SQL.
3. Googling further on that finally led me to compare the data dictionary sections of the AWR reports from test and live.
The dc_objects row caught my eye. In that 4-hour AWR there were about 10 million get requests with a pct miss of ~10, while for a similar load the test server had 5 million gets with a 0.08 pct miss over 4 hours.
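To watch that dictionary-cache behaviour outside of an AWR snapshot window, the V$ROWCACHE view exposes the same counters; a quick sketch to run on both live and test:

-- Dictionary cache activity since instance startup.
SELECT parameter,
       gets,
       getmisses,
       ROUND(100 * getmisses / NULLIF(gets, 0), 2) AS pct_miss
FROM   v$rowcache
ORDER  BY gets DESC;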
A website needs to display consolidated data from databases located in different geographical regions (India, London and New York). The application server for the website is hosted in only one location, India. What techniques can be used for faster retrieval of data from all 3 databases?
Note: there is no need for real-time data retrieval from the different regions; however, the user should be able to view updated data at predefined intervals.
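Given that real time is not required, one common pattern is materialized views in the India database that pull from the remote databases over database links on a schedule. A hedged sketch; the link, table, and refresh interval are placeholders:

-- Refresh every hour from the London database; repeat per region/table.
CREATE MATERIALIZED VIEW mv_sales_london
  REFRESH COMPLETE
  START WITH SYSDATE
  NEXT SYSDATE + 1/24
AS
SELECT *
FROM   sales@london_link;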
We are working on a data warehouse (around 50 GB) architecture with the following acquired environment:
Single server X3650 M4, dual CPU (16 cores in total), with 48G RAM
Oracle Standard Edition 10g x64
Windows 2008 x64
128 SSD x 8
IBM ServeRAID M5110e SAS/SATA controller
Due to budget concerns, we will be running the app server (BusinessObjects 4.0 with Tomcat) and the DB server on the same machine. We have a user base of around 30 people on the app server.
We intend to have external redundancy using the IBM RAID card in a RAID 10 configuration. I wonder what kind of disk config would yield better performance, given that we only have writes/updates in the morning and 95% reads for the rest of the day?
RAID 1 for OS (128 SSD x 2, including the DB logfiles)
RAID 10 for the DB server (128 SSD x 6)
I have heard that ASM provides better disk management, but I wonder whether it increases performance in any way.
What could be the reasons that some queries execute fast in SQL*Plus on the server, whereas the same queries run slower with the same input values fed from the application screen?
One issue, I guess, would be bind variable peeking when using the application, whereas executing from SQL*Plus with literal values causes a hard parse and thus avoids the "peeked" plan.
If displaying the data on the application screen takes time after the data has been fetched, where can I see this delay?
I understand that the elapsed time under 'Fetch' in tkprof shows the time taken to fetch from the database, not the time taken to display the data in the application GUI.
Finally, how do I set the array size (fetch size) in JDBC to improve performance by reducing round trips?
The query below takes more than 30 minutes to return data. All the objects used are views; there is no direct reference to any table. The views with _mnth_ in their names have data for 7 distinct months. The base table of each view has a composite PK on the columns AR_ID (or ACCT_AR_ID) and MSRMNT_PRD_ID.
I need the ORDER BY, as the query is part of Informatica code and the ORDER BY is relied on in the further processing.
SELECT ac.ar_id AS acct_ar_id,
       m.msrmnt_prd_dt AS msrmnt_prd_dt
       --removed the rest of the column list to reduce the size of the code.
FROM   edxf.ar_rsrv_mnth_v ac,
       edxf.crdt_acct_mnth_v c,
       edxf.crdt_acct_v ca,
       (SELECT msrmnt_prd_id, msrmnt_prd_dt
        FROM   edxf.msrmnt_prd_v
        WHERE  msrmnt_prd_id = [code]....
Also, the count of data in the views is as below.
View               Total count    Count for 1 msrmnt_prd_id
------------------------------------------------------------
ar_rsrv_mnth_v     1,841,892      281,945
crdt_acct_mnth_v   66,494,145     7,087,369
crdt_acct_v        12,258,728     NA
The production stats have been implemented in development. The stats were gathered 2 months back on dev, while in production the stats were gathered 2 weeks back.
My question: shouldn't the high volume of data cause plan changes in both environments? My thinking is that the plans can differ because the high volume of data is changing in prod, which may lead to a different plan.
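One quick way to see how far the two environments have drifted is to compare when statistics were last gathered and whether Oracle flags them as stale; a sketch with a placeholder schema name, to run on both dev and prod:

SELECT table_name,
       num_rows,
       last_analyzed,
       stale_stats
FROM   dba_tab_statistics
WHERE  owner = 'APP_SCHEMA'
ORDER  BY last_analyzed;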