Saturday, 27 June 2015

How to get n elements of a list not contained in another one?

I have two lists of different sizes (either one can be the larger), with some elements in common. I would like to get n elements from the first list that are not in the second one.

I see two families of solutions (the example below is for n=3):

a = list(range(2, 10))
b = [i * 2 for i in range(1, 10)]
# a = [2, 3, 4, 5, 6, 7, 8, 9]
# b = [2, 4, 6, 8, 10, 12, 14, 16, 18]

# solution 1: generate the whole list, then slice
s1 = list(set(a) - set(b))
s2 = [i for i in a if i not in b]

for s in (s1, s2):
    print(s[:3])

# solution 2: the simple loop solution
c = 0
s3 = []
for i in a:
    if i not in b:
        s3.append(i)
        c += 1
        if c == 3:
            break
print(s3)

All of them are correct; the output is

[9, 3, 5]
[3, 5, 7]
[3, 5, 7]

(The first solution does not return the first three matches because a set does not preserve order, but that is fine in my case: my lists will be unsorted, even explicitly shuffled, anyway.)

Are these the most Pythonic and reasonably efficient options?

Solution 1 computes the whole difference first and only then slices, which I find quite inefficient (my lists will have ~100k elements, and I will be looking for the first 100).

Solution 2 looks more efficient, but it is ugly (which is a matter of taste, but I have learned that when something looks ugly in Python, there is usually a more Pythonic solution).

I will settle for solution 2 if there are no better alternatives.
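
For comparison, one lazy variant of solution 2 can be sketched with a generator expression and itertools.islice, which stops as soon as n matches have been found. This is only a minimal sketch, not a definitive answer: the helper name first_n_missing and its parameter names are made up for illustration, and converting the second list to a set assumes its elements are hashable.

```python
from itertools import islice


def first_n_missing(source, excluded, n):
    """Return up to n items of source that are not in excluded, in source order.

    Hypothetical helper for illustration; assumes the items are hashable.
    """
    # Build the set once so each membership test is O(1) on average,
    # instead of scanning the whole excluded list for every item.
    excluded_set = set(excluded)
    # The generator is consumed lazily: islice stops after n matches,
    # so the rest of source is never examined.
    return list(islice((x for x in source if x not in excluded_set), n))


a = list(range(2, 10))
b = [i * 2 for i in range(1, 10)]
print(first_n_missing(a, b, 3))  # [3, 5, 7]
```

Unlike the set-difference version, this keeps the order of the first list, and it returns fewer than n items if not enough matches exist. Building the set still walks the whole second list once, so the saving is in the early exit over the first list and in the cheaper lookups.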
