
This is my current code:

import re

def poisci_pare(besedilo):
    # Split the text on literal periods.
    seznam = re.split("[.]", besedilo)
    return seznam

This returns the following (we assume every sentence ends with a period):

poisci_pare("Hello world. This is great.")
>>>output: ["Hello world", "This is great"]

What would I have to write to get Python to split the string like this:

poisci_pare("Hello world. This is great.")
>>>output: [["Hello", "world"], ["This", "is", "great"]]
  • I'm actually surprised that worked. A . typically means "any character" in a regex; I guess when it's inside square brackets it's treated as a literal. Commented Nov 5, 2014 at 20:10
  • Yeah, I didn't think it would work in the first place, but after some experimenting with re.split I got it to work perfectly. Commented Nov 5, 2014 at 20:15
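
To illustrate the point raised in the comments, here is a minimal sketch comparing an unescaped dot with a dot inside a character class and an escaped dot. The sample string and the variable names are assumptions made for illustration only; they are not from the original question.

```python
import re

text = "a.b"  # illustrative sample string, not from the question

# Unescaped dot: matches any character, so every character acts as a separator.
any_char = re.split(".", text)

# Dot inside a character class, or escaped with a backslash:
# both match only a literal period.
in_class = re.split("[.]", text)
escaped = re.split(r"\.", text)

print(any_char)
print(in_class)
print(escaped)
```

Under these assumptions, the first call should leave only empty strings, while the other two should split the string into its two parts.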

2 Answers

def poisci_pare(text):
    sents = text.split('.')
    answer = [sent.split() for sent in sents if sent]
    return answer

Output:

In [8]: poisci_pare("Hello world. This is great.")
Out[8]: [['Hello', 'world'], ['This', 'is', 'great']]

This will also do the trick:

text = "Hello world. This is great."
print([s.split() for s in text.split('.') if s.split()])
[['Hello', 'world'], ['This', 'is', 'great']]
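
For anyone who prefers to keep re.split from the question, the same idea might be sketched as below. The function name split_sentences_into_words is a hypothetical one chosen for illustration, and the trailing space in the sample input is an assumption added to show that whitespace-only fragments are dropped.

```python
import re

def split_sentences_into_words(text):
    # Hypothetical helper name, used here only for illustration.
    # Split on literal periods, as in the question's re.split call.
    sentences = re.split(r"[.]", text)
    # Drop empty or whitespace-only fragments, then split each sentence on whitespace.
    return [sentence.split() for sentence in sentences if sentence.strip()]

# Sample input with a trailing space after the final period (an assumption).
result = split_sentences_into_words("Hello world. This is great. ")
print(result)
```

Filtering with strip() rather than a plain truthiness check is a design choice here: a fragment containing only spaces is truthy and would otherwise produce an empty inner list.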
