Use Range-hash Partitioning Of A Large Dimension Table
Apr 12, 2013
At the moment we use range-hash partitioning on a large dimension table (dimensional-model warehouse) with 2 levels. It is range partitioned on columns that are only available at the bottom level of the hierarchy: date and issue_id.
The result is a partition with null values. I assume we would also get a null partition in the large fact table if it were partitioned by reference to the large dimension. The large fact table is similarly partitioned (date range-hash) with local bitmap indexes.
It was suggested that we would get automatic partition-wise joins if we used reference partitioning. I would have thought we would get that with range-hash on both tables.
Database version:
DB  : Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
OS  : HP-UX nduhi18 B.11.31 U ia64 1022072414 unlimited-user license
APP : SAP - ERP

I have to RANGE partition a table on either UPDATED_ON or PROFILE. The table has the structure below:

Name                 Null?    Type
-------------------- -------- --------------------------------
MANDT                NOT NULL VARCHAR2(9)
MR_ID                NOT NULL VARCHAR2(60)
PROFILE              NOT NULL VARCHAR2(54)
REGISTER_ID          NOT NULL VARCHAR2(30)
INTERVAL_DATE        NOT NULL VARCHAR2(24)
AGGR_CONSUMPTION     NOT NULL NUMBER(21,6)
MDM_VERS_NO          NOT NULL VARCHAR2(9)
MDP_UPDATE_DATE      NOT NULL VARCHAR2(24)
MDP_UPDATE_TIME      NOT NULL VARCHAR2(18)
NMI_CONFIG           NOT NULL VARCHAR2(120)
NMI_CONFIG_FLAG      NOT NULL VARCHAR2(3)
MDM_DATA_STRM_ID     NOT NULL VARCHAR2(6)
NSRD                 NOT NULL VARCHAR2
[Code]....
As far as I know, RANGE is better suited to DATE or NUMBER columns, and INTERVAL partitioning is only possible on DATE or NUMBER.
Column PROFILE: it is of VARCHAR2 datatype. I know I can still range partition on it, as I understand Oracle internally converts VARCHAR2 to NUMBER while inserting data, but INTERVAL is not possible. How do I RANGE partition on PROFILE?
Column CREATED_ON: it is a NUMBER with decimals
Can I add range subpartitions to a hash-partitioned table? For example:
CREATE TABLE test (
  test_id     VARCHAR2(10),
  test_TYPE   VARCHAR2(5),
  CREATE_DATE date
)
partition by hash (test_id, test_type) Partitions 3
SUBPARTITION BY RANGE (CREATE_DATE);
When I tried this, I got a syntax error reporting an invalid option.
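One thing the statement above is missing is the subpartition bounds: a range subpartition needs VALUES LESS THAN clauses, either per partition or in a template. Whether hash can be the top level of a composite at all depends on the release, which I have not confirmed for this case. A minimal sketch of the reverse layout, range-hash, which 11g does support, might look like this. The partition names, the bounds, and the subpartition count of 4 are assumptions for illustration.

```sql
-- Sketch only: range partitions on the date, hash subpartitions on the two keys.
-- Partition names, bounds and the subpartition count are illustrative assumptions.
CREATE TABLE test (
  test_id     VARCHAR2(10),
  test_type   VARCHAR2(5),
  create_date DATE
)
PARTITION BY RANGE (create_date)
SUBPARTITION BY HASH (test_id, test_type) SUBPARTITIONS 4
(
  PARTITION p2012 VALUES LESS THAN (TO_DATE('2013-01-01', 'YYYY-MM-DD')),
  PARTITION pmax  VALUES LESS THAN (MAXVALUE)
);
```

Queries that filter on create_date can then prune to a single range partition, while the hash subpartitions spread rows within it.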
Other than the obvious benefit that interval partitioning creates partitions as needed, is there any performance benefit from using interval partitions versus date range partitions?
One drawback for me is that developers reference the partition name in some of their queries, so if I use date range partitioning their code will not break. I could not find a way to assign a name to a partition when using intervals. Is the name always system generated, or can this be overridden?
I am running Oracle 11.1.0.7, soon to be running on 11.2.0.0.
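As far as I know, queries do not have to use the system-generated name at all: from 11g onward a partition can be addressed by a value that falls inside it, using PARTITION FOR. A minimal sketch, assuming a hypothetical table orders that is interval partitioned on a DATE column order_date:

```sql
-- Hypothetical table and column names; only the PARTITION FOR clause matters here.
CREATE TABLE orders (
  order_id   NUMBER,
  order_date DATE
)
PARTITION BY RANGE (order_date)
INTERVAL (NUMTOYMINTERVAL(1, 'MONTH'))
(PARTITION p_first VALUES LESS THAN (DATE '2013-01-01'));

-- Addresses the partition that contains 15 April 2013 by value, not by name.
SELECT COUNT(*)
FROM   orders PARTITION FOR (DATE '2013-04-15');
```

If fixed names are still needed, a system-generated partition can be renamed afterwards with ALTER TABLE ... RENAME PARTITION ... TO ....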
I am trying to create a partitioned table so that a new partition is created automatically when a row is inserted with a new release_date value. The column holds a date converted to a number.
Note that the release_date column has a NUMBER data type (as per the design) and people want an interval-based partition on it.
They do NOT want the data type to be altered.
create table product (
  prod_id      number,
  prod_code    varchar2(3),
  release_date number
)
partition by range (release_date)
interval (NUMTOYMINTERVAL(1,'MONTH'))
(partition p0 values less than (20120101))
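The statement above is likely to be rejected because NUMTOYMINTERVAL returns an interval type, while a NUMBER partition key needs a plain numeric interval. A minimal sketch, assuming release_date stores dates as YYYYMMDD numbers (the p0 bound of 20120101 suggests this, but it is an assumption): a numeric interval of 100 lines up with month boundaries, since 20120101 + 100 = 20120201.

```sql
-- Assumes release_date holds YYYYMMDD values such as 20120315.
-- INTERVAL (100) gives one partition per month: [20120101, 20120201), [20120201, 20120301), ...
CREATE TABLE product (
  prod_id      NUMBER,
  prod_code    VARCHAR2(3),
  release_date NUMBER
)
PARTITION BY RANGE (release_date)
INTERVAL (100)
(PARTITION p0 VALUES LESS THAN (20120101));
```

Ranges that match no real month (for example the one starting at 20121301) are never materialized as long as only valid dates are inserted, because interval partitions are created on first insert.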
$x being a range of non-consecutive values like so: 1,3,5-9,13,18,21 and so on...
I realize I can query using an array of operands and such, but these ranges will contain upwards of 100 items. I want to minimize both the number of queries I have to run and their length. Is there any resource you can point me to for optimizing something like this?
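One way to keep this to a single, shorter query is to put the isolated values in one IN list and express each consecutive run as a BETWEEN. A sketch for the sample values above, assuming a hypothetical table items with a numeric column id:

```sql
-- Hypothetical table "items" with numeric column "id".
-- The list 1,3,5-9,13,18,21 becomes one IN list plus one BETWEEN per consecutive run.
SELECT *
FROM   items
WHERE  id IN (1, 3, 13, 18, 21)
   OR  id BETWEEN 5 AND 9;
```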
I have a problem transferring data from a non-partitioned table to a partitioned table.
I have a non-partitioned table, and I created a new partitioned table with the same columns and types. How can I transfer the data from the non-partitioned table to the partitioned one?
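A minimal sketch of one common approach, a direct-path INSERT ... SELECT. The table names old_table and new_table are placeholders, and the sketch assumes both tables have the same columns in the same order:

```sql
-- Hypothetical names: old_table (non-partitioned), new_table (partitioned).
-- The APPEND hint requests a direct-path insert; Oracle routes each row to its partition.
INSERT /*+ APPEND */ INTO new_table
SELECT * FROM old_table;

-- After a direct-path insert the session must commit before it can query new_table.
COMMIT;
```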
I have a flattened customer dimension table and I would like to query it together with another dimension table, such as address. I wrote a query where I join the address table twice to get the permanent, secondary, and work addresses, but the customer and address tables are huge and that is causing a performance issue. Is there any other way to join a flattened table with the address dimension than joining it twice?
CREATE TABLE CUSTOMER (
  cust_sk     NUMBER       NOT NULL,
  cust_src_id VARCHAR2(20) NOT NULL,
  rec_eff_dt  DATE         NOT NULL,
  last_name   VARCHAR2(75) NULL,
  first_name  VARCHAR2(30) NULL,
  brth_dt     DATE         NULL,
I am facing a problem fetching and updating records in a customer details table that has around 20 million records. The table contains around 30 fields, with 'MOBILE_NO' as the primary key. Most of the queries have 'mobile_no' in the WHERE clause. I am planning to hash partition the table on the mobile_no column, as there is no other column available that can be used for partitioning.
Please clarify whether creating hash partitions on such a key would improve data extraction performance, as I have read on the net that hash partitioning is not effective for performance tuning.
create tablespace mssm datafile 'c:appmssm01.dbf' size 100m segment space management manual;

create cluster hash_cluster_4k ( id number(2) )
  size 8192 single table hash is id hashkeys 4 tablespace mssm;

-- Created a table in cluster with row size such that only one record fits one block and inserted 5 records each with a distinct key value
CREATE TABLE hash_cluster_tab_8k (
  id   number(2),
  txt1 char(2000),
  txt2 char(2000),
  txt3 char(2000)
) CLUSTER hash_cluster_8k( id );
[code]....
If I issue the same query after creating a unique index on hash_cluster_tab(id), the execution plan shows hash access and a single I/O (cr = 1). Does this mean that to get a single I/O in a single-table hash cluster, we have to create a unique index? Won't that add the overhead of maintaining an index?
What is the second I/O needed for when the unique index is absent?
I created 32 hash partitions on a fact table. According to the hash partitioning technique, the data should be distributed evenly across the partitions. But when I analyze the table and check the distribution, it is not even at all.
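Since 32 is a power of two, the partition count itself should not be the cause; skew in the key values is a more likely suspect. A sketch for checking both, assuming a hypothetical table FACT_TABLE that is hash partitioned on a column hash_key and has up-to-date statistics:

```sql
-- Rows per partition, as recorded by the last statistics gathering.
SELECT partition_name, num_rows
FROM   user_tab_partitions
WHERE  table_name = 'FACT_TABLE'
ORDER  BY partition_position;

-- The 20 most frequent key values; a few dominant values would explain uneven partitions.
SELECT *
FROM  (SELECT hash_key, COUNT(*) AS row_count
       FROM   fact_table
       GROUP  BY hash_key
       ORDER  BY COUNT(*) DESC)
WHERE  ROWNUM <= 20;
```

All rows with the same key value always land in the same partition, so a key with few distinct values or a few very frequent values cannot spread evenly however many partitions there are.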
While trying the partition exchange feature of Oracle with 2 hash-partitioned tables, I learned that I can't directly exchange partitions between 2 partitioned tables.
I have two hash-partitioned tables, so moving partition data from one table to the other will involve:
1) Exchange from the partitioned table to a non-partitioned table.
2) Exchange from the non-partitioned table to the new partitioned table.
But I am not sure which hash partition my data will go to in the new partitioned table (the data that needs to be moved has a single value for the key on which the tables are partitioned).
I partitioned a source table of around 100 million rows (62GB) on the DEV server. The target database was newly created. The table was range partitioned on a date column as follows:
PARTITION BY RANGE (ENTRY_DATE_TIME)
(
  PARTITION ppre2012 VALUES LESS THAN (TO_DATE('01/01/2012','DD/MM/YYYY')) TABLESPACE WST_LRG_D,
  PARTITION p2012    VALUES LESS THAN (TO_DATE('01/01/2013','DD/MM/YYYY')) TABLESPACE WST_LRG_D,
  PARTITION p2013    VALUES LESS THAN (TO_DATE('01/01/2014','DD/MM/YYYY')) TABLESPACE WST_LRG_D,
  PARTITION p2014    VALUES LESS THAN (MAXVALUE) TABLESPACE WST_LRG_D
)
That is, on a yearly basis. Anything before 2012 went to ppre2012, then p2012, p2013 and so forth. There are 20 million rows in p2012 and around 75 million rows in ppre2012. We needed both the source (unpartitioned) and target (partitioned) tables in DEV for comparison. The queries are normally on the current year's partition. I should state that I am a developer and don't have full visibility of the production instance.
Now that our tests are complete, we would like to promote this to production. Obviously, in production we would not need both source and target tables. In all probability this will be performed over a weekend window. Therefore I would like to suggest the following:
1) use expdp to export the source table
2) drop the source table
3) create a new "partitioned" source table with no indexes
4) use impdp to load the data back into the table
5) create the global index (it is a unique index to enforce uniqueness) and the rest of the indexes as local
6) run dbms_stats.gather_table_stats(user,'SOURCE', cascade=>true). This takes around 2 hours in dev.
My question is whether importing 100 million rows will cause issues with undo segments. Could we first import data into just the current partition, p2012 (20 million rows)?
I recently started working with legacy code and noticed that some huge tables (5 years' worth of data; I don't have more details on me right now but can post them later if needed) are partitioned on a time sequence number column, while the majority of queries filter on time (a different column). Query performance is degrading, and I'd like to try modifying the partitioning and run some tests to evaluate the performance improvement.
My only concern is that, with so much live data, I have to come up with a way to switch the partitioning with the least impact on applications running 24 x 7. Is there something you have done in the same situation that worked?
DECLARE
  v_name VARCHAR2(256);
BEGIN
  SELECT sys_context('userenv', 'current_user') INTO v_name FROM dual;
  DBMS_REDEFINITION.CAN_REDEF_TABLE(v_name, 'SO33070_ORIGINAL', dbms_redefinition.CONS_USE_ROWID);
END;
Success
3. Creating a duplicate table
CREATE TABLE SO33070_NEW (
  SERIAL_ID     NUMBER(15,0),
  INSERTED_TIME DATE DEFAULT SYSDATE
)
PARTITION BY RANGE ("INSERTED_TIME")
INTERVAL (NUMTODSINTERVAL(1,'DAY'))
(
  PARTITION "p1_1" VALUES LESS THAN (TO_DATE(' 2012-01-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN'))
)
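For context, the steps that usually follow in an online redefinition, as I understand the DBMS_REDEFINITION package, are start, copy dependents, sync, and finish. A sketch that reuses the table names and the rowid option from the steps above; the error counter v_errors is my own addition, and this is not necessarily how the original walkthrough continued:

```sql
DECLARE
  v_name   VARCHAR2(256);
  v_errors PLS_INTEGER;  -- assumption: added here to receive the OUT parameter
BEGIN
  SELECT sys_context('userenv', 'current_user') INTO v_name FROM dual;

  -- Start copying rows from the original into the partitioned interim table.
  DBMS_REDEFINITION.START_REDEF_TABLE(
    uname        => v_name,
    orig_table   => 'SO33070_ORIGINAL',
    int_table    => 'SO33070_NEW',
    options_flag => DBMS_REDEFINITION.CONS_USE_ROWID);

  -- Clone indexes, triggers, constraints and grants onto the interim table.
  DBMS_REDEFINITION.COPY_TABLE_DEPENDENTS(
    uname        => v_name,
    orig_table   => 'SO33070_ORIGINAL',
    int_table    => 'SO33070_NEW',
    copy_indexes => DBMS_REDEFINITION.CONS_ORIG_PARAMS,
    num_errors   => v_errors);

  -- Apply changes made to the original since the copy started.
  DBMS_REDEFINITION.SYNC_INTERIM_TABLE(v_name, 'SO33070_ORIGINAL', 'SO33070_NEW');

  -- Swap the two tables; the original name now points at the partitioned data.
  DBMS_REDEFINITION.FINISH_REDEF_TABLE(v_name, 'SO33070_ORIGINAL', 'SO33070_NEW');
END;
/
```

The original table stays available for DML while the copy runs, and it is locked only briefly during the final swap, which is the point of doing this online on a 24 x 7 system.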
I have a big table into which we load about 37M records. We have an Informatica ETL job that loads the data in bulk mode and creates the index after completion. The data load takes about 1 hour and the index creation takes about half an hour. In total it takes about 90 to 95 minutes.
I thought that if we partitioned the table and loaded in parallel, it would improve performance. We created 4 partitions, each with about 9M records. The bulk-mode data load now completes in 25 minutes. But creating the index on top of it takes about 40 minutes, so the total load time is 65 minutes.
Is there a way to improve performance so that the load completes in half an hour?
I want to add a column to a table that holds a huge amount of data and fill it with data from another table. What is the best way to do it? Is it faster to use CTAS instead of ALTER TABLE ADD COLUMN?
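For comparison, a sketch of what the CTAS route could look like. All names are hypothetical: big_table is the existing table, other_table supplies the new column extra_col, and the two are assumed to join one-to-one on a key column id. Which approach is faster is worth timing on a copy of the data rather than assuming.

```sql
-- Hypothetical names throughout; assumes other_table.id is unique so no rows are duplicated.
CREATE TABLE big_table_new AS
SELECT b.*, o.extra_col
FROM   big_table b
LEFT JOIN other_table o ON o.id = b.id;
```

With this route the indexes, constraints, and grants have to be recreated on big_table_new before it is renamed into place, so that work belongs in any timing comparison against ALTER TABLE ADD COLUMN followed by an UPDATE or MERGE.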
I need to dump the contents of a very large table into text files for archiving as we retire this old DB. The table has about 16 million rows, and a few of the columns are up to 4000 characters wide (varchar2(4000)). I've got 2 problems:
1) How can I select records that occur in a certain month of a year (there is a date column) and put the selected records into a file?
2) I don't have access to the server OS, so UTL_FILE is not possible. The output is also so large that I'm having trouble with the DBMS_OUTPUT.PUT_LINE.
I'm trying to get the first block of the IF working first, so the rest is just placeholders.
DECLARE
  v_mm     number (2);
  v_yyyy   number (4);
  min_mm   number (2);
  min_yyyy number (4);
  max_mm   number (2);
  max_yyyy number (4);
  min_date date;
[code]....
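An alternative that avoids both UTL_FILE and DBMS_OUTPUT is to spool the query result from SQL*Plus, which writes the file on the client machine. A minimal sketch for one month, with hypothetical names: big_table, a date column created_dt, and a text column txt. The output file name is also an assumption.

```sql
-- SQL*Plus script run from the client machine; no server OS access is needed.
-- Hypothetical names: big_table, date column created_dt, text column txt.
SET PAGESIZE 0
SET LINESIZE 32767
SET TRIMSPOOL ON
SET FEEDBACK OFF
SET TERMOUT OFF

SPOOL archive_2012_03.txt

-- Half-open range: everything from 1 March 2012 up to, but not including, 1 April 2012.
SELECT txt
FROM   big_table
WHERE  created_dt >= DATE '2012-03-01'
AND    created_dt <  DATE '2012-04-01';

SPOOL OFF
```

The half-open date range keeps rows with a time component on the last day of the month, and it lets an index on the date column be used, which wrapping the column in a function would prevent.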
I have two large tables (rptbody and rpthead), each of which has millions of records or more. Below is the table schema:
describe rpthead
Name                        Null     Type
--------------------------- -------- -------------
RPTNO                       NOT NULL NUMBER
RPTDATE                     NOT NULL DATE
RPTD_BY                     NOT NULL VARCHAR2(25)
PRODUCT_ID                  NOT NULL NUMBER
[code]...
What I want is to get all rows from the rptbody table whose referenced RPTNO belongs to a particular product_id. Here is the SQL:
SELECT t0.LINENO, t0.COMMENTS, t0.RPTNO, t0.UPD_DATE
FROM RPTBODY t0
WHERE ( t0.RPTNO IN ( SELECT t1.RPTNO FROM RPTHEAD t1 where t1.PRODUCT_ID IN ('4647') ) )
ORDER BY t0.LINENO
Since the result set is pretty large, my application (think of it as a couple of jobs, each of which must finish within a time window) can only process a subset of the data at a time. I therefore need pagination so that the next job can continue the processing until all the data is processed. Below is the SQL with pagination:
select * from (
  select a.*, ROWNUM rnum from (
    SELECT t0.LINENO, t0.COMMENTS, t0.RPTNO, t0.UPD_DATE
    FROM RPTBODY t0
    WHERE (
[code]....
As you can see, each query takes 100 rows from the db. The problem for now is that the query takes too much time (10+ mins). I know the slowness is due to "ORDER BY t0.LINENO", but it's required for pagination.
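One option that may be worth testing is keyset ("seek") pagination: instead of numbering all rows and skipping to an offset, each job remembers the last row it processed and asks for the next 100 after it. A sketch, assuming (RPTNO, LINENO) identifies a row in RPTBODY and that the job keeps the last LINENO and RPTNO it saw. The bind variables :last_lineno and :last_rptno are hypothetical names.

```sql
-- Assumption: (RPTNO, LINENO) is unique in RPTBODY, so ordering by both is deterministic.
-- :last_lineno and :last_rptno hold the values from the final row of the previous page.
SELECT *
FROM (
  SELECT t0.LINENO, t0.COMMENTS, t0.RPTNO, t0.UPD_DATE
  FROM   RPTBODY t0
  WHERE  t0.RPTNO IN (SELECT t1.RPTNO
                      FROM   RPTHEAD t1
                      WHERE  t1.PRODUCT_ID = 4647)
  AND   (t0.LINENO > :last_lineno
         OR (t0.LINENO = :last_lineno AND t0.RPTNO > :last_rptno))
  ORDER  BY t0.LINENO, t0.RPTNO
)
WHERE ROWNUM <= 100;
```

For the first page, the keyset predicate on :last_lineno and :last_rptno is simply left out. With an index on RPTBODY (LINENO, RPTNO), the database can start reading at the remembered position rather than sorting the whole result for every page, though whether the optimizer does so here would need to be checked in the execution plan.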