I'm trying to extract specific text from a text file using batch code. The file from which I need to extract data will have multiple lines of text and the number of lines will vary, which means the position of the indicators will change as well. Here's a sample of the text file:

File 1:

<File>
<General>
   <Primary_1>1.2.3.4.5</Primary_1>
   <Secondary_2>9.8.7.6.5</Secondary_2>
</General>
<Main_List>
   <Details="Title" One="C:\Folder1\Folder2\Folder3\Folder4\Folder5" Two="I" Three="4"/>
</Main_List>
</File>

I've already done some manipulation: I extracted the lines that contain the data I need from the text file and saved them to two separate text files, so I end up with this:

File 2:

   <Primary_1>1.2.3.4.5</Primary_1>

File 3:

<Details="Title" One="C:\folder1\folder2\folder3\folder4" Two="A" Three="5"/>

So, from the two files above (File 2 and File 3), I need to extract two values. The first is the text between the <Primary_1> and </Primary_1> indicators; in this case I would need to pull the "1.2.3.4.5" value. The second is the text after the Details=" indicator and before the " One= indicator; in this case I would need to pull the "Title" value.

I searched around and couldn't find anything that quite fit the bill. The closest I found was the "...on the same line..." code (Extract part of a text file using batch dos), but I kept getting errors. Any help would be greatly appreciated. Thank you.

1 Answer

Try this with both lines in file.txt. It works for the text as given, provided the file contains no TAB characters.

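@echo off
rem Primary_1 value: split on the angle brackets and space, take the 2nd token
for /f "tokens=2 delims=<> " %%a in ('find "<Primary_1>" ^< "file.txt"') do echo "%%a"
rem Details value: split on the double quote, so the 2nd token is the quoted text
rem This works with or without a space before the equals sign
for /f tokens^=2^ delims^=^" %%a in ('find "<Details" ^< "file.txt"') do SET "ntitle=%%a"
SET xtitle="%ntitle%"
ECHO +%ntitle%+ or +%xtitle%+ - your choice...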

There is a more robust method using a helper batch file if your wanted text contains spaces.

(Little tickle by Magoo: this allows spaces in the quoted "Title" string. I don't know whether the requirement is for quoted or unquoted variable contents, so you get both, at no extra charge.)
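
To make the token splitting easier to follow, here is a minimal sketch that stores both values in variables rather than echoing them straight away. It assumes the full sample (File 1) is saved as file.txt in the current folder. The variable names primary and title, and the use of findstr instead of find, are my own choices for illustration, not part of the original request.

```batch
@echo off
setlocal
rem Assumption: file.txt holds the full sample (File 1) in the current folder.
set "primary="
set "title="

rem The Primary_1 line is split on the angle brackets and on spaces.
rem Tokens are: Primary_1, then 1.2.3.4.5, then /Primary_1
rem so token 2 is the value we want.
for /f "tokens=2 delims=<> " %%a in ('findstr /c:"<Primary_1>" "file.txt"') do set "primary=%%a"

rem The Details line is split on the double quote character.
rem Token 1 is everything up to the first quote and token 2 is the quoted text,
rem so a space before the equals sign or inside the title does not matter.
for /f tokens^=2^ delims^=^" %%a in ('findstr /c:"<Details" "file.txt"') do set "title=%%a"

echo Primary_1 = %primary%
echo Title     = %title%
endlocal
```

The option string of the second loop is left unquoted and its special characters are escaped with carets, because a double quote cannot be listed as a delimiter inside a quoted option string. If a tag appears on more than one line, each loop keeps the last match.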

2 Comments

The first "for" line worked perfectly and pulled the correct information...the "1.2.3.4.5" data. However, the second "for" line pulled "Details" instead of "Title". Am I missing something? Thanks.
My bad. Looks like I missed a space in the third file as follows: <Details ="Title" One="C:\folder1\folder2\folder3\folder4" Two="A" Three="5"/> How would I modify the delims in the second for statement to accommodate this space? The space is between the word Details and the equals sign. Thanks again.