0

I'm iterating over a list of hyphen-separated strings that I'm trying to extract data from. Splitting each item looks like this:

for i in lst: 
    print(i.split('-'))


... output

['a', 'doa', 'a', 'h5t']
['a', 'rne']
['a', 'ece']
['a', 'ece']
['a', 'tnt', 'c', 'doa', 'd', 'nvc', 'a', 'nnm', 'a', 'h5t']
['a', 'tps']

My goal is to extract all the strings within each list that are 3 characters long. If I do

len(i.split('-'))

in the loop, then I just get the number of substrings in each list:

4
2
2
2
10
2

My question is: how can I get a count of the characters in each string in each list?

EDIT:

The output should look like:

['doa', 'h5t']
['rne']
['ece']
['ece']
['tnt', 'doa', 'nvc', 'nnm', 'h5t']
['tps']
1
  • What would be the required output given the example data you have implied? Ok, got it! Commented Nov 27, 2017 at 15:59

3 Answers

3

My goal is to extract all the strings within each list that are 3 characters long.

A nested list comprehension will do the trick.

>>> l = ['a-bc-def-ghij-klm', 'abc-de', 'fg-hi']
>>> [[x for x in s.split('-') if len(x) == 3] for s in l]
[['def', 'klm'], ['abc'], []]
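
Applied to the data from the question, the same comprehension can be sketched as follows. The contents of `lst` are an assumption, reconstructed from the split output shown in the question, and the name `three_char_parts` is only illustrative.

```python
# Assumed input, reconstructed from the split output shown in the question.
lst = [
    'a-doa-a-h5t',
    'a-rne',
    'a-ece',
    'a-ece',
    'a-tnt-c-doa-d-nvc-a-nnm-a-h5t',
    'a-tps',
]

# Outer comprehension: one result list per input string.
# Inner comprehension: keep only the pieces that are exactly 3 characters long.
three_char_parts = [[part for part in s.split('-') if len(part) == 3] for s in lst]

for parts in three_char_parts:
    print(parts)
```

Each printed line is a list of the 3-character pieces of one input string, which is the shape requested in the question's edit.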

6 Comments

@schwobaseggl yup, fixed.
While I like the list-comp approach, a list comp within a list comp tends to be ugly. Also since OP edited his original post, it doesn't look like he wants a nested list.
@BrianM.Sheldon I disagree. This is well within PEP8 line length and very readable. More so, I'd suggest, than a nested loop where the list variable declaration and the append calls are 3 lines apart.
An alternative would be [list(filter(lambda x: len(x) == 3, s.split('-'))) for s in l] but I personally don't like it.
@timgeb haha, but reality is that filter is faster in py3 =))
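
For comparison, the `filter` variant from the comments can be checked against the comprehension, and `timeit` gives a rough way to measure both on a given machine. This is only a sketch: the sample list `l`, the function names, and the repetition count are assumptions, and no timing result is claimed here.

```python
import timeit

# Assumed sample data for illustration.
l = ['a-bc-def-ghij-klm', 'abc-de', 'fg-hi']

def with_comprehension(strings):
    # Nested list comprehension: filter inside, one list per string outside.
    return [[x for x in s.split('-') if len(x) == 3] for s in strings]

def with_filter(strings):
    # Same selection using filter() and a lambda predicate.
    return [list(filter(lambda x: len(x) == 3, s.split('-'))) for s in strings]

# Both forms produce the same nested lists.
assert with_comprehension(l) == with_filter(l)

# Rough timing; results vary by Python version, input size, and machine.
print('comprehension:', timeit.timeit(lambda: with_comprehension(l), number=10_000))
print('filter:       ', timeit.timeit(lambda: with_filter(l), number=10_000))
```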
2

This code:

lst = ['a-doa-a-h2t','a-rne','a-ece','a-ece','a-tnt-c-doa-d-nvc-a-nnm-a-h5t','a-tps']
for item in lst:
    words = item.split('-')
    print([word for word in words if len(word) == 3])

produces output something like your requirement:

['doa', 'h2t']
['rne']
['ece']
['ece']
['tnt', 'doa', 'nvc', 'nnm', 'h5t']
['tps']

3 Comments

Really ugly, with a list comprehension inside print.
Sure is. This seems to be the way the kids are writing code these days! But, hey, that is the requirement of the OP.
I don't think it's ugly and I will punish you with an upvote.
-1

You need another loop to extract each word in the split like so:

for item in lst:
    for word in item.split('-'):
        if len(word) == 3:
            print(word)
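
The loop above prints each matching word on its own line. If one list per item is wanted, as in the question's edit, the matches can be collected before printing. A minimal sketch, assuming `lst` holds the hyphen-separated strings implied by the question (the helper name `words_of_length` is hypothetical):

```python
def words_of_length(item, length=3, sep='-'):
    """Return the pieces of `item` (split on `sep`) that have exactly `length` characters."""
    matches = []
    for word in item.split(sep):
        if len(word) == length:
            matches.append(word)
    return matches

# Assumed input, reconstructed from the question's split output.
lst = ['a-doa-a-h5t', 'a-rne', 'a-ece', 'a-ece', 'a-tnt-c-doa-d-nvc-a-nnm-a-h5t', 'a-tps']

for item in lst:
    print(words_of_length(item))
```

Keeping the explicit loop inside a small function leaves room to reuse the split pieces or change the target length later.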

1 Comment

I chose to save the split strings into a substrings variable on the assumption that he might need them later; if not, then your solution is definitely cleaner.
