The book is indeed correct, and it provides a good argument. Note that timings are not a usefulreliable indicator of algorithmic complexity. The timings might happen withonly consider a special data distribution, andor the test cases might be too small: algorithmic complexity only describes how resource usage or runtime scales beyond some suitably large input size.
The book makes the argument that complexity is O(n²) because the if a == b branch is entered at most n times. This is non-obvious because the loops are still written as nested. It is more obvious if we extract it:
def disjoint(A, B, C):
AB = (a
for a in A
for b in B
if a == b)
ABC = (a
for a in AB
for c in C
if a == c)
for a in ABC:
return False
return True
This variant uses generators to represent intermediate results.
- In the generator
AB, we will have at most n elements (because of the guarantee that input lists won't contain duplicates), and producing the generator takes O(n²) complexity. - Producing the generator
ABCinvolves a loop over the generatorABof length n and overCof length n, so that its algorithmic complexity is O(n²) as well. - These operations are not nested but happen independently, so that the total complexity is O(n² + n²) = O(n²).
Because pairs of input lists can be checked sequentially, it follows that determining whether any number of lists are disjoint can be done in O(n²) time.
This analysis is imprecise because it assumes that all lists have the same length. We can say more precisely that AB has at most length min(|A|, |B|) and producing it has complexity O(|A|•|B|). Producing ABC has complexity O(min(|A|, |B|)•|C|). Total complexity then depends how the input lists are ordered. With |A| ≤ |B| ≤ |C| we get total worst-case complexity of O(|A|•|C|).
Note that efficiency wins are possible if the input containers allow for fast membership tests rather than having to iterate over all elements. This could be the case when they are sorted so that a binary search can be done, or when they are hash sets. Without explicit nested loops, this would look like:
for a in A:
if a in B: # might implicitly loop
if a in C: # might implicitly loop
return False
return True
or in the generator-based version:
AB = (a for a in A if a in B)
ABC = (a for a in AB if a in C)
for a in ABC:
return False
return True