Saturday, February 25, 2012

Attribute/Tags style search

Hi,

I am trying to write a query that will take a table with arbitrary key-value pairs and perform an inclusive search (ie, everything must match) on a second table with an arbitrary set of key-value search criteria pairs. The first table can get rather large, the search table will remain very small (<20 items).

Any help would be appreciated.

Thanks,

Mike

--
-- defines a source table that holds part attributes. the attribute names are
-- completely arbitrary
declare @.source table (partId int, attrKey varchar(max), attrVal varchar(max))
insert into @.source (partId, attrKey, attrVal)
(
select 1, 'STOCKNUM', '12345'
union
select 1, 'STOCKED', 'Y'
union
select 1, 'WEIGHT', '12'
union
select 2, 'STOCKNUM', '34567'
union
select 2, 'STOCKED', 'Y'
union
select 2, 'WEIGHT', '12'
union
select 3, 'STOCKNUM', '67890'
union
select 3, 'STOCKED', 'Y'
union
select 3, 'WEIGHT', '14'
)

--
-- define some inclusive search criteria to locate the partIds against. Every attribute
-- in the search table must match the record in the source table. ie, find all part ids that have
-- WEIGHT=12 and STOCKED=Y
declare @.search table (attrKey varchar(max), attrVal varchar(max))
insert into @.search (attrKey, attrVal)
(
select 'WEIGHT', '12'
union
select 'STOCKED', 'Y'
-- this could continue with more search values
)

--
-- would like this to return only part ids that matches all the criteria (in this case 1 & 2)
select a.partid from
@.source a join @.search s on a.attrKey=s.attrKey
where a.attrVal = s.attrVal

I took a stab at it... I do not claim this is the best way to do this, but you have an inherent problem, which is multiple rows in your "search" table that end up matching all of the rows in the "source" table. So, with that said, you need to take the first search condition and filter the source, save the results, and then filter those results with the next search condition and so on.

I'm sure performance would suck with my approach, not to mention memory consumption if you have a very large source table.

create table #temp (
partId int
)
create table #temp2 (
partid int
)
declare @.myKey varchar(max), @.myVal varchar(max)
declare c cursor
for select * from @.search
open c
fetch c into @.myKey, @.myVal

insert into #temp select distinct partid from @.source

while (@.@.FETCH_STATUS=0) BEGIN
truncate table #temp2
insert into #temp2 select * from #temp
truncate table #temp
insert into #temp select distinct a.partid from #temp2 a, @.source b where a.partid = b.partid and b.attrkey = @.myKey and b.attrVal = @.myVal
fetch c into @.myKey, @.myVal
end
close c
deallocate c
select distinct partid from #temp
|||

Thanks for the try. Performance would be a major concern with that approach though.

After putting more thought into it, I came up with this simple solution:

select a.partId
from @.source a join @.search s on a.attrKey=s.attrKey
where a.attrVal = s.attrVal
group by partId
having count(a.partId) = (select count(attrKey) from @.search)

Now for the perf testing over the actual db

Thanks again,

Mike

|||This is the generic form of query for this problem known as relational division. This will work fine.|||Thank you.

No comments:

Post a Comment