Server Utilities :: SQL Loader - High Volume Of Date?
Jun 6, 2013
I have 15 million of records as csv, want to load through sqlloader Is sqlloader is the right option to load high volume of data? I have loaded with 2.5 lac records which has taken 4 mins to load.
I am in the very early planning stages of a project the goal of which is to identify separate organizations which may in fact be the same organization.
Our first implementation of this task was a process designed to look for a few thousand organizations in a pool of a few hundred thousand organizations. To accomplish this we made heavy use of Oracle's Text index as well as a custom index type we created which utilized n-grams. This approach worked quite well for on-demand editing of the organizations, in which a user might log in and say in addition to what we already know about organization A we also know x, y and z does that change anything and worked acceptably well for the bulk processing we did on our "known" information once a week running for a couple of hours on the weekend.
We have now been tasked with reworking this initial implementation only now we want to look at a set consisting of several million organizations for potential matches which exist within the set. As in our initial implementation we will be breaking what we know about organizations into groupings so we aren't comparing a phone number to an email address and normalizing the data as much as we can so we ignore things like case and punctuation. Even after all this we are still talking about looking for similar values in a group which might be in the tens of millions (some types of data will have more than one value per organization).
My initial thought on the problem is to use n-grams though not in the way we did in the past. The basic idea here is that we break the search values up into all the substrings it is made of and look for other values which have a high number of those substrings in common.
SQL & PL/SQL was the best place for the question, but I could not think of a better one.
I want to load a delimited file that contains many records which contained within the table where I'm going to load a date type field and I need to do this by concatenating three fields
field1 = 1 - this is the day field2 = 11 - this corresponds to the month field3 = 5 - this corresponds to the year
I need is in the field Save as type date 01/11/2005 i donĀ“t know how to do it but I tried as follows but I get error loading.
I have another interesting SQL Loader issue. I have created 2 prior topics RE: "Trapping SQL Loader summary counts" and "Using Sql Loader issue". This new issue is along the same lines, but unfortunately the same input csv file has a date field that is has a different input format..
The 10G Oracle Table looks like... --------------------------- CREATE TABLE PTLIVE.MODULE_CSV_LOADS ( MODULE_ID NUMBER (20), MODULE_TAGNUMBER VARCHAR2(100), MODULE_SERIAL_NUMBER NUMBER (20,0),
[code]...
My initial control file looks like... ----------------- OPTIONS (SKIP=1) load data infile 'J:GrowerRound_ModulesCrop2011ProcessedBatch_2011Jul12_081326_746563.csv' BADFILE 'J:GrowerRound_ModulesCrop2011LogsBatch_2011Jul12_081326_746563.bad'
[code]...
The input csv data file (for this example) contains 3 records. The first is a heading record that is skipped. The 2nd record is correct and the field entitled "GMT Date" contains a value of "16/05/2011". The 3rd record, however, has a date of "2011/5/18", which get rejected by the above control file, as the output SQL Loader log indicates...
Record 2: Rejected - Error on table PTLIVE.MODULE_CSV_LOADS, column DATEPICKED. ORA-01861: literal does not match format string
Now we will be receiving literally hundreds of these csv files and in testing I have found that when I open the csv file (the default is Excel), and Excel must alter its display, as all date appear 100% okay. However, upon opening the csv file with Wordpad, I discovered the dates in the same file had these 2 different formats. So the control file was attempting to concat the input csv file "GMT Date" in one format with the input "GMT_TIME" to load the output value into the Oracle table column "DATEPICKED" ... but different date formats cause some records to be rejected.
It would be super if SQl Loader could use a IF clause or an OR clause to execute loading the date in one or more input formats. We only expect the 2 foramts DD/MM/YYYY or YYYY/MM/DD....however, there could be 6 different combinations in theory.
I thought about writing a Function or Procedure to analyst the input date and output a standard one to load into the Oracle column
The prod stats has been implemented in development. The stats has been gathered 2 months back on dev while in production the stats has been gathered 2 weeks back.
My question shouldn't the high volume of data causes changes in plan in both the environment? My thinking is that plan can be different as the high volume of data are changing in prod it may lead to a different plan.
the control file code in this path (c:externalctrl.ctl)
load data infile 'C:externalmy_data.txt' into table emp2 fields terminated by ',' (empno, ename, hiredate, etime, ejob, deptno)
this is the error :
C:>sqlldr scott/tiger control=C:externalctrl.ctl
SQL*Loader: Release 10.2.0.1.0 - Production on Mon May 31 09:45:10 2010 Copyright (c) 1982, 2005, Oracle. All rights reserved. Commit point reached - logical record count 5 C:>
I want to load data from a file using sqlldr.I have a table commissions ( technician_id char(5) , tech_name char(30) , Comm_rcd_date DATE , Comm_Paid_date DATE , comm_amt number(10,2) )
my file is 00001,TIMOTHY TROENDLY,2011-03-04T01:45:12+0006,2011-03-04T01:45:12+0007,123.56 00002,KENNETH KLEMENZ,2011-03-04T01:45:12+0006,2011-03-04T01:45:12+0009,123.56 00003,SHUNDAR ARDERY,2011-03-04T01:45:12+0006,2011-03-04T01:45:12+0005,123.56 write a ctl file to load this data.
I installed oracle 10G complete so I can have everything. But now I cannot run sql loader. I check my oracle devsuitehome directory and I cannot find sqlldr.exe
I need to install sql loader separately? I can't find sql loader installer on web.
- 1 - M or P: indicates which table to insert: MASTER or PARENT; - 2 - M or A or B: indicates MASTER, PARENT_A, PARENT_B; - 3:18 - DATA.
Based on the values above, what I need to do is:
1. Load a line to MASTER_TABLE; 2. Load a line to PARENT_TABLE_A pointing to its relative line in MASTER_TABLE; 3. Load a line to PARENT_TABLE_B pointing to its relative line in MASTER_TABLE; 4. In the original file line, there is nothing I can use to join a MASTER line with a PARENT line.
The result would be: MASTER_ID PARENT_DATA 1 PARENT_TABLE_A1 1 PARENT_TABLE_B1 2 PARENT_TABLE_A2 2 PARENT_TABLE_B2
I tried to use both: SEQUENCE and Sequence.NextVall (CurrVal) but they only work when using ROWS=1 and the file I need to load has millions of rows, so I need direct path loading.Also, I read about External Table, but it does not suit my needs because the Application server is not the same as Database server, which is needed by external tables.
in this case is better load the data to a temporary table and then insert to the other tables, I found almost the same question in the topic pointed by the link below: URL....
I want to load geometry into a table using sql*loader. My datafile contains geometry defined as WKT URL...., a standard for geometry and also Oracle has a function called sdo_util.from_wktgeometry.If I'm using a separate 'insert into' statement using this function in sql*plus, there's no problem. But if I'm using the same function in my control-file for sql*loader import, I get a sql*loader-418 error: "bad datafile for column geometrie".
My data file contains records with RECORD_TYPE '01', '02', '03' and '04' in position (30:31) I use sql loader to load this data into a table. I need to load ONLY rows with RECORD_TYPE ='04'. I have to output all the other rows into a file, so I decided to use DISCARD file. Now, those with RECORD_TYPE ='4' have to be loaded into different columns depending on the value in position (267:268).
So, my ctl file should look something like:
WHEN (30:31) = '04' into table MYDATA WHEN (267:268) != 'O ' into table MYDATA WHEN (267:268) = 'O '
and whatever is not '04' goes to discard file.
I tried to use
into table MYDATA WHEN (30:31) = '04' and (267:268) != 'O ' into table MYDATA WHEN (30:31) = '04' and (267:268) = 'O '
but I don't get the right result in terms of the discard file.
Is there any way to put together all these conditions?
The SQL loader somehow is loading only first record. The data file is a csv and the end of line character is a new line. Some text fields have multiple new lines.
Here is my control file
load data infile '/home/devo/c0397105/RuleImport/testLoad/dummyLoad.csv' Truncate into table DUMMY_LOAD_TABLE fields terminated by "," optionally enclosed by '"' ( ID "to_number(:ID)",REQUESTED_GROUP,PURPOSE,COMMENTS) [code]........
We don't have Retail resource type as Dependent System now; the rule will be changed later when Dependent Systems can accept resource types other than application"
I am using sqlloader for loading the data into database by using csv file.My csv file is delimited by comma in that i am having a column which is having the , and line feeds targeted to load into a long data type.for example as below
descri,dfdfdfd,dfdfdf, sdfsdf, dfsdfd,
i want to move this column data into a single table column.But due to because of delimited "," it is splitting into number of columns
I had a requirement of loading flatfile into staging table using SQL Loader, One of the columns in the the Flat file is having values FALSE or TRUE and my requirement is that I load 0 for FALSE and 1 for TRUE which can be achieved by simple DECODE function...I did use decode and tried to load several times but did not work.
INFILE 'sql_4ODS.txt' BADFILE 'SQL_4ODS.badtxt' APPEND INTO TABLE members FIELDS TERMINATED BY "|"
[code]...
I did try putting a trim as well as SUBSTR but did not work....the cloumn just doent get any values in the output (just null or say free space)
I have a small problem when I am trying to load data into a table using SQL Loader. The data I am trying to load should be a number, but it is in the format '999,999,999 USD'. When I try to load the data, I am getting an invalid number error, due to the USD (I have already accounted for the thousands seperators). My question is, how can I load the data as a number with USD in the format?
LOAD DATA APPEND INTO TABLE IPGITLREDATA WHEN ITL_REC_TYPE = 'D' FIELDS TERMINATED BY ',' TRAILING NULLCOLS (
[code].....
The data file might have a value of " D " instead of "D" for ITL_REC_TYPE and ITL_REC_TYPE is in the WHEN clause. How can I check for the trimmed value of ITL_REC_TYPE in the WHEN clause ?
I want to populate totale number of record in the file. Usually i get 10000 records per file and i load them using sql loader.I want to also insert the number of records in file while loading the data in table.
How can i achive it.
structure of control file is
load data BADFILE '/backup/temp/rajesh/RIO/BadFiles/FILENAME' append into table ERS_RIO_SRC TRAILING NULLCOLS ( INSTALLATION_ID CHAR
I have loaded 14324590 rows into target tables using sql*loader.I used below consideration during the load process.
1) direct=true,parllel=true 2) unrecoverable 3) disable all indexes and triggers.
But, sql loader takes 21 minutes to load 14324590 rows in database? tuning sql loader process? we cannot change data file because it has given by client.
from Control file, data is getting popupalated in TEST_USER and TEST_TITLE similarly if remove
INTO TABLE TEST_TITLE( TX_SID POSITION(1:3) CHAR, ID_TITLE "get_title_id(:TITLE_NAME)" ,
[code]...
from Control file, TEST_USER and TEST_ROLE is getting populated.
Here RD_ROLE_MASTER script
CREATE TABLE RD_ROLE_MASTER ( "ID_ROLE" NUMBER(38,0) NOT NULL ENABLE, "TX_ROLE_NAME" VARCHAR2(20 BYTE) NOT NULL ENABLE
[code]...
Here is RD_TITLE_MASTER script CREATE TABLE RD_TITLE_MASTER( "ID_TITLE" NUMBER(38,0) NOT NULL ENABLE, "TX_TITLE_NAME" VARCHAR2(25 BYTE) NOT NULL ENABLE);
Insert into RD_TITLE_MASTER (ID_TITLE,TX_TITLE_NAME) values (7,'RED_LOB_ESCALATION_L1'); what is the problem?
I Have one csv file.i want to load to a table trough sql*loader.but in table 3 column is there.but in the csv file some record hav one semicolumn in last filed like this
My requirement is to load the data from feed file into two tables based on the value of a column in feed file.
say column in feed file is "Activity_Value" . If the Activity_value is 10, the data from feed file should be loaded into Table A and If the Activity Value is 20 the data from feed file should be loaded into Table B.