0

I have tried to implement a flatten function to even flatten strings but got an error for Recursion. Could someone help resolve this puzzle?

def flatten(items):
  for x in items:
      if isinstance(x, Iterable):
         yield from flatten(x)
      else:
         yield x

items = [2, [3, 4, [5, 6], 7], 8, 'abc']

for x in flatten(items):
    print(x)

I was expecting to print '2, 3, 4, 5, 6, 7, 8, a, b, c'; but instead, I got '2, 3, 4, 5, 6, 7, 8 and a RecursionError. I think the 'abc' is also 'Iterable', so why the code doesn't work?

Thank you!

4
  • 5
    'abc' is a sequence that contains 'a' as its first element, which is a sequence that contains 'a' as its first element, which is a sequence that contains 'a' as its first element... Basically, you have to special-case strings whenever you recursively explore any structure that might contain them. Commented Nov 30, 2022 at 19:46
  • I would think that flattening those items should give [2, 3, 4, 5, 6, 7, 8, 'abc'] since "to flatten" means to remove nesting and one doesn't typically think of a list of strings as being a nested data structure. Commented Nov 30, 2022 at 19:58
  • A related problem: how would you want to handle a cyclic reference like x = []; x.append(x)? Or worse, x = []; y = []; x.append(y); y.append(x)? Commented Nov 30, 2022 at 20:37
  • @jasonharper, very good insight! I didn't realize that 'a' itself is also iterable Commented Dec 5, 2022 at 5:27

2 Answers 2

3

The problem as jasonharper pointed is that 'a' is an iterable element which contains 'a' and so on. You can however, rewrite the code with another if before the yield from flatten(x) something like

from collections.abc import Iterable
def flatten(items):
  for x in items:
      if isinstance(x, Iterable):
         if len(x)==1:
             yield next(iter(x))
         else: 
             yield from flatten(x)
      else:
         yield x

items = [2, [3, 4, [5, 6], 7], 8, 'abc']

for x in flatten(items):
    print(x)
Sign up to request clarification or add additional context in comments.

9 Comments

Not all iterables support len. You'd have to check isinstance(x, Sized) first.
(This also would incorrectly yield [1], not 1, when x == [1].)
The idea that x == next(iter(x)) is weird enough that it's probably safe to assume that str and bytes are the only classes that satisfy the equality.
@chepner I agree, I did not thought that there are iterables with no len but you are right, I should have written x[1] and not x😅. I don't get the last comment.
And this is why I think it's easier (and probably good enough) to simply hard-code a check for str or bytes, rather than try to figure out what more general class of str-like types need to be special-cased.
|
1

This happens because you're exceeding the limit of the call stack I won't get into the nitty-gritty here but, you can read this article: https://towardsdatascience.com/python-stack-frames-and-tail-call-optimization-4d0ea55b0542

Recursion is a tricky problem to get right, it's often best to avoid it when possible in my own personal opinion. If you refactor your code to use a minimal amount of recursion and use the built-in iter() function on string values it works without exiting the call stack like so.

    from collections.abc import Iterable

def flatten(items):
  for x in items:
      if isinstance(x, str):
         yield from iter(x)
      elif isinstance(x, Iterable):
         yield from flatten(x)
      else:   
         yield x

items = [2, [3, 4, [5, 6], 7], 8, 'abc']

for x in flatten(items):
    print(x)

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.