[PRL] Re: [plt-internal] cisco router problem

Mitchell Wand wand at ccs.neu.edu
Wed Jan 14 15:27:34 EST 2004


I'm assuming the following is a reasonable abstraction of the problem:

Given two long lists of integers (possibly with duplicates), one much
longer than the other, find their intersection.  

Doing intersection (or DB join) is usually a n^2 problem.

Some standard solutions:

1.  Read the big list once and the smaller list many times.  I'm
    guessing he's already thought of this.

2.  Keep the lists sorted.  That way it turns into a O(n) problem.  If
    you have to sort the lists, that makes it O(n log n).

3.  Preprocess the smaller list once to get a hashtable of possible
    hits.  That way you only scan through the small list for big-list
    elements

4.  Indices are Good Things. (Generalization of #3).

--Mitch 



More information about the PRL mailing list