|
 |
filling gaps - Piotr Lipski |
11-Feb-08 06:41:15
|
I have to import xls data into a sql 2000 table.
The headings in xls and in the destination table are identical (names,
types, order).
The problem is that one of the source columns has gaps that have to be
filled in during the load process.
Example of input (headings in 1st row):
COL1, <other columns>
'aaa', <some other data>
, <some other data>
, <some other data>
'bbb', <some other data>
, <some other data>
, <some other data>
'ccccc', <some other data>
'dddd', <some other data>
, <some other data>
The COL1 is blank for some of the the records.
I need to fill the gaps with values from previous records so the
destination data would be like:
COL1, <other columns>
'aaa', <some other data>
'aaa', <some other data>
'aaa', <some other data>
'bbb', <some other data>
'bbb', <some other data>
'bbb', <some other data>
'ccccc', <some other data>
'dddd', <some other data>
'dddd', <some other data>
At the moment I load the whole file into a stage table containing
additional autoincrementing field (that allows me to access the rows in
the same order as in the source file), then I fill the gaps using a
cursor and finally I write the data to a destination table. But this
approach comes from procedural programming I'm familiar with. I feel
this could be done better - without cursors. Any hints?
Btw, is there any 'smart' way of importing large number of xls files
into a sql table? Each file has identical structure. I use VB Script +
DTS to achieve this but I think there is better/easier way.
--
PL |
 |
| |
| |
|
| |
|
filling gaps - Uri Dimant |
11-Feb-08 07:33:09
|
Piotr
Just a guess
create table #t (id int not null, c int, c1 char(1))
insert into #t values (1,1,'a')
insert into #t values (2,1,null)
insert into #t values (3,1,null)
insert into #t values (4,2,'b')
insert into #t values (5,2,null)
insert into #t values (6,3,'c')
insert into #t values (7,3,null)
insert into #t values (8,3,null)
select coalesce(#t.c1,t1.c1) from #t
left join #t t1 on t1.id<#t.id and t1.c=#t.c
and t1.c1 is not null |
 |
| |
filling gaps - Piotr Lipski |
11-Feb-08 08:32:43
|
Uri,
nice try, thanks. But there is no 'c' field in my scenario.
Maybe there isn't any cursorless solution...
--
PL |
 |
| |
filling gaps - Uri Dimant |
11-Feb-08 08:45:34
|
Piotr
Of cause, you dont as it is my test. I tried to give you an idea. |
 |
| |
filling gaps - Steven Wilmot |
11-Feb-08 09:26:31
|
This all depends on how man files you import, and how frequently
However, the approach below gives you an alternative method to consider.
Firstly, the table needs to have a way of determining row order
- I've assumed an ID column
Importing the data should give something like this
ID Col1 Other
1 abc ....
2 -- ....
3 -- ....
4 def ....
5 -- ....
6 ghi ....
7 jkl ....
8 mno ....
9 -- ....
-----
use dummydb
go
set nocount on
go
create table TestMe
(
id int not null identity(1,1) primary key,
col1 varchar(5) null,
col2 varchar(5) null
)
go
create nonclustered index ix_col1 on testme(col1)
go
insert into TestMe(col1,col2) values ('abc','...')
insert into TestMe(col1,col2) values (null,'...')
insert into TestMe(col1,col2) values (null,'...')
insert into TestMe(col1,col2) values ('def','...')
insert into TestMe(col1,col2) values (null,'...')
insert into TestMe(col1,col2) values ('ghi','...')
insert into TestMe(col1,col2) values ('jkl','...')
insert into TestMe(col1,col2) values ('jkl','...')
insert into TestMe(col1,col2) values (null,'...')
set nocount off
go
select A.*,
Best_ID = (select max(B.ID) from testme B where b.ID < a.ID and b.col1 is
not null )
from testme A
where col1 is null
go
update testme
set col1 = ChangeTo.col1
from
testme
inner join
(
select
B.ID as Match_ID,
Best_ID = (select max(C.ID) from testme C where C.ID < B.ID and C.col1 is
not null )
from testme B
where B.Col1 is null
) TMP
on testme.id = TMP.Match_ID
inner join testme ChangeTo
on ChangeTo.ID=TMP.Best_ID
go
select * from testme
go
drop table TestMe
go |
 |
| |
|
|
| DATEDIFF rounding |
| |