SQL & PL/SQL :: Removing Duplicate Rows When Condition Is Matched
Mar 17, 2010
My requirement: if the id, join_date, join_time, and result (status) of a table1 row match table2 at least once, then the repeating rows associated with that id should not be returned. Here is the test case.
create table table1
( id number , join_date varchar2(8), join_time varchar2(6), status varchar2(10));
create table table2
( id number , join_date varchar2(8), join_time varchar2(6), status varchar2(10));
insert into table1 values (01, '20010101', '0500', 'PASS');
insert into table1 values (01, '20010102', '0501', 'FAIL');
insert into table1 values (02, '20010103', '0502', 'PASS');
insert into table1 values (03, '20010104', '0503', 'FAIL');
insert into table1 values (04, '20010105', '0504', 'PASS');
insert into table1 values (05, '20010106', '0505', 'FAIL');
[code]...
I have tried the query below. Is there a better query than this? The real data has 2 million records in table1 and 60 thousand in table2.
select distinct a.id, a.join_date, a.join_time, a.status
from table1 a, table2 b
where a.id = b.id
and (a.id, a.join_date, a.join_time, a.status) not in (select b.id, b.join_date, b.join_time, b.status
from table2 b)
and a.id = (
select distinct a.id
[code]....
Trying to delete duplicate rows from a table. The problem is, they aren't exactly duplicate rows. Let me explain.
I am migrating data from an Oracle 8.1.7 db to a 10.2.1 db. In the older db, this certain table does not have a PK/Unique Index, but in the new db there is a unique index. The fields that the index is unique on are:
In the old db, when I run this query I get 1229 rows, each with a count of 2.
select SUBSCR_NO, SUBSCR_NO_RESETS, EXTERNAL_ID, EXTERNAL_ID_TYPE, ACTIVE_DATE, count(*)
from customer_id_equip_map
group by SUBSCR_NO, SUBSCR_NO_RESETS, EXTERNAL_ID, EXTERNAL_ID_TYPE, ACTIVE_DATE
having count(*)>1;
They are duplicates on those fields, but they are not totally duplicate rows because there is a field called is_current that has 0 in one row and 1 in the other. What I need to do is delete the 1229 rows with is_current=0.
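One possible approach, as a minimal sketch: delete an is_current=0 row only when a row with the same five column values and is_current=1 exists. This assumes none of those five columns are NULL (an equality test never matches NULLs) and that every duplicate pair really has one 0 and one 1, so it is worth checking the deleted row count before committing.

```sql
-- Sketch: remove the is_current = 0 half of each duplicate pair.
-- Assumes the five key columns are NOT NULL in practice.
DELETE FROM customer_id_equip_map m
 WHERE m.is_current = 0
   AND EXISTS (
         SELECT 1
           FROM customer_id_equip_map d
          WHERE d.subscr_no        = m.subscr_no
            AND d.subscr_no_resets = m.subscr_no_resets
            AND d.external_id      = m.external_id
            AND d.external_id_type = m.external_id_type
            AND d.active_date      = m.active_date
            AND d.is_current       = 1
       );
-- Compare the reported row count with the 1229 expected above,
-- and ROLLBACK instead of COMMIT if it differs.
```

A correlated EXISTS is used here rather than analytic functions so that the statement should also run on the 8.1.7 source database.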
INSERT INTO NODE_LVL VALUES('TBL_APL','TBL_AFL');
INSERT INTO NODE_LVL VALUES('TBL_APP','TBL_ACS');
INSERT INTO NODE_LVL VALUES('TBL_ADD','TBL_ADW');
INSERT INTO NODE_LVL VALUES('TBL_ADP','TBL_ADV');
INSERT INTO NODE_LVL VALUES('TBL_AOP','TBL_AOV');
[code]......
Table 'TBL_APP' has 2 parent nodes, i.e. 'TBL_AOV' and 'TBL_ADV':
SELECT * FROM node_lvl WHERE child_node = 'TBL_APP';
At level 5 there are duplicate nodes, i.e. 'TBL_APP' and 'TBL_ACS' as parent_node and child_node respectively.
SELECT PARENT_NODE, CHILD_NODE, LEVEL
FROM NODE_LVL
START WITH PARENT_NODE = 'TBL_ACF'
CONNECT BY PRIOR CHILD_NODE = PARENT_NODE;
I want to suppress such duplicates, so I added DISTINCT:
SELECT DISTINCT PARENT_NODE, CHILD_NODE, LEVEL
FROM NODE_LVL
START WITH PARENT_NODE = 'TBL_ACF'
CONNECT BY PRIOR CHILD_NODE = PARENT_NODE;
But the requirement is to maintain the same order (of hierarchy) as it was before adding DISTINCT.
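A commonly used workaround, sketched here under the assumption that "same order" means the order in which the hierarchical query first returns each row: capture ROWNUM inside the hierarchical query, collapse duplicates with GROUP BY, and sort by the first position at which each row appeared. The aliases lvl and rn are illustrative names only.

```sql
-- Sketch: de-duplicate the hierarchy output while keeping first-seen order.
SELECT parent_node, child_node, lvl
  FROM (
        SELECT parent_node,
               child_node,
               LEVEL  AS lvl,
               ROWNUM AS rn      -- position in the original hierarchical output
          FROM node_lvl
         START WITH parent_node = 'TBL_ACF'
       CONNECT BY PRIOR child_node = parent_node
       )
 GROUP BY parent_node, child_node, lvl
 ORDER BY MIN(rn);               -- first occurrence decides the position
```

As with the DISTINCT version, the same parent/child pair reached at two different levels still counts as two distinct rows.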
When I use NVL in one condition, the query takes a long time to execute, but when I remove the NVL function the query executes in 2 minutes. The condition is given below.
SSD@ermd> desc person_pos_history
 Name                         Null?    Type
 ---------------------------- -------- ------------
 PERSON_POSITION_HISTORY_ID   NOT NULL NUMBER(10)
 POSITION_TYPE_ID             NOT NULL NUMBER(10)
 PERSON_ID                    NOT NULL NUMBER(10)
 EVENT_ID                     NOT NULL NUMBER(10)
 USER_INFO_ID                          NUMBER(10)
 TIMESTAMP                    NOT NULL DATE
We found out that a few person_ids are repeating for a particular event (3):
select PERSON_ID, count(*)
from person_pos_history
group by PERSON_ID, EVENT_ID
having event_id=3 and count(*) > 1
order by 2
If we look at the 1st person id "217045", we can see that it is repeating 356 times for event id 3.
SSD@ermd> select POSITION_ASSIGNMENT_HISTORY_ID, POSITION_TYPE_ID, PERSON_ID, EVENT_ID, to_char(timestamp, 'YYYY-MM-DD HH24:MI:SS')
from person_pos_history
where EVENT_ID=3
and person_id=217045
order by timestamp;
356 rows selected.
It is safe to assume that the person id/event id with the earliest timestamp is the one that was loaded 1st, hence, the one we want to keep and the rest should be deleted.
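A minimal sketch of that "keep the earliest, delete the rest" rule, using ROW_NUMBER to rank the rows of each person_id/event_id pair by timestamp. Breaking ties on ROWID is an assumption added here so that exactly one row survives when two rows share the same timestamp.

```sql
-- Sketch: keep the earliest row per (person_id, event_id), delete the others.
DELETE FROM person_pos_history
 WHERE rowid IN (
         SELECT rid
           FROM (
                 SELECT rowid AS rid,
                        ROW_NUMBER() OVER (
                            PARTITION BY person_id, event_id
                            ORDER BY timestamp, rowid   -- rowid only breaks ties
                        ) AS rn
                   FROM person_pos_history
                )
          WHERE rn > 1          -- rn = 1 is the earliest row, which is kept
       );
```

To limit the cleanup to the event discussed above, a `WHERE event_id = 3` filter can be added to the innermost query. Checking the row count before committing is advisable.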
insert into table1(field1,field2)values('A','1');
insert into table1(field1,field2)values('A','1');
insert into table1(field1,field2)values('A','1');
insert into table1(field1,field2)values('B','2');
insert into table1(field1,field2)values('B','2');
insert into table1(field1,field2)values('B','1');
insert into table1(field1,field2)values('B','1');
SELECT field1 FROM table1 WHERE field2=all(select '1' from dual)
FIELD1
CREATE TABLE A(EMP_ID NUMBER, EMP_NAME VARCHAR2(100));
CREATE TABLE B(EMP_ID NUMBER, EMP_ATT1 VARCHAR2(10), EMP_ATT2 VARCHAR2(10));
INSERT INTO A VALUES(1, 'ONE');
INSERT INTO A VALUES(2, 'TWO');
INSERT INTO A VALUES(3, 'THREE');
[Code]....
This query returns all the matching rows of A and B:
SELECT A.EMP_ID, A.EMP_NAME, B.EMP_ATT1, B.EMP_ATT2 FROM A INNER JOIN B ON A.EMP_ID=B.EMP_ID
The output for this shows:
EMP_ID  EMP_NAME  EMP_ATT1  EMP_ATT2
1       ONE       1ATT1     1ATT2
2       TWO       2ATT1     2ATT2
2       TWO       2ATT1.1   2ATT2.1
3       THREE     3ATT1     3ATT2
The requirement is to avoid duplicate rows even if matched:
EMP_ID  EMP_NAME  EMP_ATT1  EMP_ATT2
1       ONE       1ATT1     1ATT2
2       TWO       2ATT1     2ATT2
3       THREE     3ATT1     3ATT2
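One way to get a single row per EMP_ID, as a sketch: number the joined rows within each EMP_ID and keep only the first. Which B row counts as "first" is not stated in the question; ordering by EMP_ATT1, EMP_ATT2 is an assumption that happens to keep 2ATT1 rather than 2ATT1.1 for the sample data.

```sql
-- Sketch: one joined row per emp_id; the ORDER BY inside OVER decides which one.
SELECT emp_id, emp_name, emp_att1, emp_att2
  FROM (
        SELECT a.emp_id,
               a.emp_name,
               b.emp_att1,
               b.emp_att2,
               ROW_NUMBER() OVER (
                   PARTITION BY a.emp_id
                   ORDER BY b.emp_att1, b.emp_att2   -- assumed tie-break rule
               ) AS rn
          FROM a
         INNER JOIN b ON a.emp_id = b.emp_id
       )
 WHERE rn = 1
 ORDER BY emp_id;
```

If a different row should win (for example the most recently loaded one), only the ORDER BY inside the OVER clause needs to change.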
I am trying to find the sum for one record for each partition, but the timestamp is giving me a bit of trouble. I have tried to reproduce the table with a little data:
CREATE TABLE TEST_COUNT
(END_TIME DATE
,SUCCESSFUL_ROWS NUMBER
,FAILED_ROWS NUMBER
,TBL_NAME VARCHAR (4)
,PARTITION_NAME VARCHAR (240)
)
column sid format 'a5'
column serial# format 'a10'
column mins_running format 'a15'
column sql_text format 'a100'
set linesize 200
set pagesize 30
[Code]..
I am running this code, and the output shows multiple lines.
TRIM(S.SID) TRIM(S.SERIAL#) MINS_RUNNING SUBSTR(Q.SQL_TEXT,1,70)
----------- --------------- ------------ ----------------------------------------------------------------
700         46592           242.08       Select count(*) as count, case when count(*)>0 then 'FAIL' else
700         46592           242.08       'PASS' end as result from (SELECT cv.code_value FROM code_valu
[Code]...
Is there a way to wrap up the column for SQL_TEXT VARCHAR2(64) so that I can get 1 row for the output?
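The original query is not shown in full, so the following is only a guess at an alternative. A VARCHAR2(64) SQL_TEXT column suggests the text comes from V$SQLTEXT, which stores each statement in 64-character pieces, one row per piece. V$SQLAREA holds the first 1000 characters of the statement in a single row, so joining to it instead should give one line per session. The mins_running expression based on LAST_CALL_ET and the ACTIVE filter are assumptions, not the poster's code.

```sql
-- Sketch: one row per active session, with the statement text in one piece.
SELECT TRIM(s.sid)                    AS sid,
       TRIM(s.serial#)                AS serial#,
       ROUND(s.last_call_et / 60, 2)  AS mins_running,  -- assumed definition
       SUBSTR(q.sql_text, 1, 100)     AS sql_text
  FROM v$session s
  JOIN v$sqlarea q
    ON q.address    = s.sql_address
   AND q.hash_value = s.sql_hash_value
 WHERE s.status = 'ACTIVE';
```

With linesize 200 and the sql_text column formatted as a100, SQL*Plus may still wrap the text, but it stays within one result row.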
I can get it through this query:
select PARTY_ID from XXWFS_CUSTOMER_EXT group by PARTY_ID having count (PARTY_ID) > 1;
Now, for each duplicate I found, I want to update the second row with a specific value so that duplicate rows no longer exist.
Example: party IDs 12, 14, 16, and 18 each appear two times.
Since 12 appears twice, I want to update the second row of 12 with some value x. The same applies to the other values such as 14, 16, etc.
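A hedged sketch of one way to do this. The question does not say which column receives the new value or what "x" is, so this assumes PARTY_ID itself is replaced and that a sequence, here the hypothetical xxwfs_party_id_seq, supplies values that do not collide with existing IDs. "Second row" is taken to mean every row after the first in ROWID order.

```sql
-- Sketch: give every repeated party_id row (all but the first) a fresh value.
-- xxwfs_party_id_seq is a hypothetical sequence, not from the original post.
UPDATE xxwfs_customer_ext t
   SET t.party_id = xxwfs_party_id_seq.NEXTVAL
 WHERE t.rowid IN (
         SELECT rid
           FROM (
                 SELECT rowid AS rid,
                        ROW_NUMBER() OVER (
                            PARTITION BY party_id
                            ORDER BY rowid      -- assumed meaning of "second row"
                        ) AS rn
                   FROM xxwfs_customer_ext
                )
          WHERE rn > 1
       );
```

Setting every duplicate to the same fixed value would simply create a new set of duplicates, which is why a generated value is used in this sketch.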
I have a view, and I need to remove duplicate rows from its output. To do that, I need to run a SELECT query in the WHERE clause of the view; if that query returns true, then the second condition should be applied.
My requirement in the view looks like this:
And.......... And ((select count(*) from table A where conditions)=1 )then name is null AND
In that code, the first SELECT condition must be checked first, and then the name is null condition applied. I tried to run it, but the SELECT query does not run properly because tables are used in the view.
Name
_____
Smith Street
Smith Street
John Street
Ed Street
Ed Street
Ed Street
and need to assign sequence numbers only when the record (Name) changes, e.g. :
Name          Seq
_____         ____
Smith Street  1
Smith Street  1
John Street   2
Ed Street     3
Ed Street     3
Ed Street     3
I have experimented with ROW_NUMBER and PARTITION BY, but then the sequence just returns to 1 whenever the name value changes.
If I grouped the records by Name, I would like unique, sequential numbers (1, 2, 3), but where the name is the same I would like the sequence to stop and the number to repeat.
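A possible sketch: flag each row where the name differs from the previous row, then take a running total of the flags. The desired order (Smith, John, Ed) is not alphabetical, so some column must define the row order; the table name street_names and its ordering column id are hypothetical and assumed unique.

```sql
-- Sketch: the sequence increases only when name changes from the previous row.
-- street_names and id are hypothetical; id is assumed to define the row order.
SELECT name,
       SUM(is_new) OVER (ORDER BY id) AS seq   -- running count of changes
  FROM (
        SELECT id,
               name,
               CASE
                 WHEN name = LAG(name) OVER (ORDER BY id) THEN 0
                 ELSE 1                          -- first row or changed name
               END AS is_new
          FROM street_names
       )
 ORDER BY id;
```

Note that a name which reappears later, after a different name, gets a new number in this sketch. If each name should always map to the same number, DENSE_RANK over a per-name ordering value would be the alternative.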
I have one table in which I want to restrict some records from being inserted. I don't want to add any check constraints. For example, consider the following table:
transaction( id number primary key, txn_date timestamp(7), payee varchar2(40), amount number, memo varchar2(40), ref_num number )
I want to write SQL which should not insert a duplicate record.
e.g.
I have written one as below:
insert into transaction
select 1, to_date('2009-12-12','YYYY-MM-DD'), 'Payee1', 12, 'Test', 212
from dual
where (select count(*)
       from transaction
       where txn_date=to_date('2009-12-12','YYYY-MM-DD')
       and payee='Payee1'
       and amount=12)=0;
Can I use EXISTS/NOT EXISTS, and which query would be more appropriate? (Please note that the fields I am using to filter out duplicate transactions do not include the primary key.)
Can I write such SQL, or do I have to check for duplicate rows one by one and then filter out the duplicate records?
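For comparison, a sketch of the same insert written with NOT EXISTS, using the columns from the table definition above. NOT EXISTS can stop at the first matching row instead of counting all of them, so it is usually the more natural form for this kind of check.

```sql
-- Sketch: insert the row only if no transaction with the same
-- txn_date, payee and amount already exists.
INSERT INTO transaction (id, txn_date, payee, amount, memo, ref_num)
SELECT 1, TO_DATE('2009-12-12', 'YYYY-MM-DD'), 'Payee1', 12, 'Test', 212
  FROM dual
 WHERE NOT EXISTS (
         SELECT 1
           FROM transaction t
          WHERE t.txn_date = TO_DATE('2009-12-12', 'YYYY-MM-DD')
            AND t.payee    = 'Payee1'
            AND t.amount   = 12
       );
```

Neither form protects against two sessions inserting the same row at the same time, because each session cannot see the other's uncommitted insert. Only a unique constraint or unique index on the three columns would enforce that.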
My need is to perform a merge: update when the id column is matched but one of the other columns is not. When the id column is not matched, I perform an insert.
It works fine for matched or not matched id column.
The commented code is my attempt to check the other columns. The code should not update when all columns match. It should update only when one of the columns doesn't match (except the id column of course, because it's the key column).
begin
merge into copy.table1 rr
using ( select ID ,
DEALID ,
ESTIMATIONDATE ,
BOUNDOVERESTIMATDATE ,
ESTIMATIONTYPEID ,
MARKETAMOUNT ,
LIQUIDATINGAMOUNT ,
[code]....
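A minimal sketch of one way to express "update only when something differs", assuming Oracle 10g or later, where the UPDATE branch of MERGE accepts its own WHERE clause. The source name source_table1 is hypothetical, and only two of the non-key columns from the post are shown; the real statement would list all of them.

```sql
-- Sketch only: source_table1 is a hypothetical name, and the column lists
-- are shortened; extend them to the full set of columns being merged.
MERGE INTO copy.table1 rr
USING (SELECT id, dealid, estimationdate
         FROM source_table1) src
   ON (rr.id = src.id)
 WHEN MATCHED THEN
      UPDATE SET rr.dealid         = src.dealid,
                 rr.estimationdate = src.estimationdate
       -- DECODE treats two NULLs as equal, so each test is 1 only on a real difference
       WHERE DECODE(rr.dealid, src.dealid, 0, 1) = 1
          OR DECODE(rr.estimationdate, src.estimationdate, 0, 1) = 1
 WHEN NOT MATCHED THEN
      INSERT (id, dealid, estimationdate)
      VALUES (src.id, src.dealid, src.estimationdate);
```

DECODE is used instead of `<>` because a plain inequality returns unknown when either side is NULL, which would silently skip rows where a value changed to or from NULL.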
What would cause Oracle to insert duplicate rows into a table? Could a join of two tables in the initial query assigned to an application page cause Oracle to insert an extra row into a table when an update to a data value occurs? I have no insert triggers and no foreign keys assigned to the table. I am not sure what would cause Oracle to assume that an insert of a row must occur. I want to prevent that insert.
I have a table of addresses where the indexed column consists of the city, an optional area name, the street name and the street number. For example 'Stockholm Drottninggatan 2'.
The users must enter the full city name and the beginning of the street name. So if the user wants to find all the addresses of both the streets Stockrosvägen and Stockbergsvägen which are in Stockholm, the query would look something like this:
Select * From AddressSearch Where Contains(AddressSearch.Address, 'Stockholm AND Stock%') > 0;
But this will select all the addresses of Stockholm. Is there a way to make the part after the AND not match the already matched first part?
DB and dev 10g rel2. Suppose that I have a table with a lot of duplicate rows. What I need is to delete the duplicates and retain one row of each set of duplicates, like a column with those values... How do I delete two of the (hi's) and retain the third? The same applies to all the duplicate values in the column.
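A common pattern for this, sketched with hypothetical names (my_table and col_a stand in for the real table and column, which are not shown): keep the row with the lowest ROWID for each distinct value and delete the rest.

```sql
-- Sketch: keep one row per distinct col_a value, delete every other copy.
-- my_table and col_a are hypothetical placeholders.
DELETE FROM my_table
 WHERE rowid NOT IN (
         SELECT MIN(rowid)      -- the single survivor for each value
           FROM my_table
          GROUP BY col_a
       );
```

If rows count as duplicates only when several columns match, all of those columns go into the GROUP BY. Which copy survives is arbitrary here, since the copies are identical on the grouped columns.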