Copy a section of text from a file in Python

Question

I need to extract values from the text file below:

fdsjhgjhg
fdshkjhk
Start
Good Morning
Hello World
End
dashjkhjk
dsfjkhk

The values I need to extract are from Start to End.

with open('path/to/input') as infile, open('path/to/output', 'w') as outfile:
    copy = False
    for line in infile:
        if line.strip() == "Start":
            copy = True
        elif line.strip() == "End":
            copy = False
        elif copy:
            outfile.write(line)

The code above comes from this question: Extract Values between two strings in a text file using python

This code does not include the "Start" and "End" lines themselves, only what lies between them. How would I include the delimiter lines as well?

I would use a multiline RegExp for that; the code will also look much simpler. – MaxU - stand with Ukraine, commented Mar 2, 2016 at 21:36

Dan H · Accepted Answer · 2016-03-04 04:04:32Z

@en_Knight has it almost right. Here's a fix to meet the OP's request that the delimiters ARE included in the output:

with open('path/to/input') as infile, open('path/to/output', 'w') as outfile:
    copy = False
    for line in infile:
        if line.strip() == "Start":
            copy = True
        if copy:
            outfile.write(line)
        # move this AFTER the "if copy"
        if line.strip() == "End":
            copy = False

OR simply include the write() in the case it applies to:

with open('path/to/input') as infile, open('path/to/output', 'w') as outfile:
    copy = False
    for line in infile:
        if line.strip() == "Start":
            outfile.write(line) # add this
            copy = True
        elif line.strip() == "End":
            outfile.write(line) # add this
            copy = False
        elif copy:
            outfile.write(line)

Update: to answer the question in the comment ("only use the 1st occurrence of 'End' after 'Start'"), change the elif line.strip() == "End" branch to:

        elif line.strip() == "End" and copy:
            outfile.write(line) # add this
            copy = False

This works if there is only ONE "Start" but multiple "End" lines... which sounds odd, but that is what the questioner asked.
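
To make the effect of the extra "and copy" check easier to see, here is a minimal sketch that runs the same loop over an in-memory list instead of files. The function name copy_section and the sample data are made up for illustration; they are not part of the original answer.

```python
def copy_section(lines, start="Start", end="End"):
    """Collect lines from the start marker through the first end marker
    that follows it, both markers included (hypothetical helper)."""
    out = []
    copy = False
    for line in lines:
        stripped = line.strip()
        if stripped == start:
            out.append(line)
            copy = True
        elif stripped == end and copy:
            # Only an "End" seen while copying closes the section.
            out.append(line)
            copy = False
        elif copy:
            out.append(line)
    return out


# Sample input with stray "End" lines before and after the section.
sample = ["junk\n", "End\n", "Start\n", "Good Morning\n",
          "Hello World\n", "End\n", "more junk\n", "End\n"]
result = copy_section(sample)
print("".join(result), end="")
```

The stray "End" lines outside the section are skipped because copy is False when they are read. A second "Start" later in the input would still reopen copying.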

That makes a lot of sense. Is it possible to be selective and stop copying at only the first occurrence of 'End' after 'Start'? My file contains a number of 'End' strings.
@Dan H What if there is a Start after End? How can I prevent that Start from being copied, and stop copying immediately?
@Catalina: options: 1) call exit() after you see "End". 2) Count the number of Starts you see; only set copy to True if this is the first one.

MaxU - stand with Ukraine · Accepted Answer · 2016-03-02 21:44:33Z

RegExp approach:

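import re

# Read the whole file into memory so the pattern can span several lines.
with open('input.txt') as f:
    data = f.read()

# re.S lets '.' match newlines; '.*?' is non-greedy, so the match stops at the first End line.
# The pattern needs a newline before Start and after End, so a block that begins on the
# first line of the file, or an End line with no trailing newline, will not match.
match = re.search(r'\n(Start\n.*?\nEnd)\n', data, re.M | re.S)
if match:
    with open('output.txt', 'w') as f:
        f.write(match.group(1))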

This is probably the more robust solution, but for someone who was unclear on elif vs. if, maybe you could include some textual description? – en_Knight

This is better: (^Start[\s\S]+^End) (or (^Start[\s\S]+?^End) if there is more than one End...) – dawg
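
The difference between the two patterns in the comment above can be checked on a small string. This is only a sketch: the sample data is invented, and re.M is assumed so that ^ matches at the start of every line.

```python
import re

# Invented sample with two Start..End blocks.
data = "junk\nStart\nfirst\nEnd\nmiddle\nStart\nsecond\nEnd\ntail\n"

# Greedy '+' runs on to the LAST line that begins with End.
greedy = re.search(r'^Start[\s\S]+^End', data, re.M)

# Lazy '+?' stops at the FIRST line that begins with End.
lazy = re.search(r'^Start[\s\S]+?^End', data, re.M)

print(repr(greedy.group(0)))
print(repr(lazy.group(0)))
```

Note that ^Start and ^End also match lines that merely begin with those words, such as "Ending"; adding $ after each word (still with re.M) is one way to require the whole line to match.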

Community · Accepted Answer · 2017-05-23 12:07:49Z

The "elif" means "do this only if the preceding conditions failed". It is the equivalent of "else if" in C-like languages. Without it, each check runs independently, so the fall-through takes care of including "Start" and "End":

with open('path/to/input') as infile, open('path/to/output', 'w') as outfile:
    copy = False
    for line in infile:
        if line.strip() == "Start":
            copy = True
        if copy: # flipped to include end, as Dan H pointed out
            outfile.write(line)
        if line.strip() == "End":
            copy = False
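
For a side-by-side comparison, the sketch below runs the elif chain from the question and the independent if checks over the same in-memory lines. The function names with_elif and with_if are made up for this illustration.

```python
def with_elif(lines):
    """elif chain: a marker line matches its own branch, so it is never written."""
    out, copy = [], False
    for line in lines:
        if line.strip() == "Start":
            copy = True
        elif line.strip() == "End":
            copy = False
        elif copy:
            out.append(line)
    return out


def with_if(lines):
    """Independent ifs: every check runs on every line, so markers fall through."""
    out, copy = [], False
    for line in lines:
        if line.strip() == "Start":
            copy = True
        if copy:
            out.append(line)
        if line.strip() == "End":
            copy = False
    return out


sample = ["fdsjhgjhg\n", "Start\n", "Good Morning\n",
          "Hello World\n", "End\n", "dashjkhjk\n"]
print(with_elif(sample))  # only the two lines between the markers
print(with_if(sample))    # the markers plus the two lines between them
```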
