1

I have a file full of strings which I read into a list. Now I'd like to find a specific line (for example the first line below) by looking for .../002/..., then add 5 to this 002 to get /007/, so that I can find the next line containing /007/.

The file looks like this

https://ladsweb.modaps.eosdis.nasa.gov/archive/allData/6/MYD021KM/2018/002/MYD021KM.A2018002.1345.006.2018003152137.hdf
https://ladsweb.modaps.eosdis.nasa.gov/archive/allData/6/MYD021KM/2018/004/MYD021KM.A2018004.1345.006.2018005220045.hdf

With this I could identify, for example, the first line:

match = re.findall(r"/(\d{3})/", data_time_filtered[i])

The problem now is: how do I convert the string to an integer while keeping the format 00X? Is this approach correct?

match_conv = ["{WHAT's in HERE?}".format(int(i)) for i in match]

EDIT according to suggested answers below:

So apparently there's no way to directly read the numbers in the string and keep them as they are?

Adding 0s to the number with zfill and the other suggested functions makes it more complicated, as /00x/ should remain at most 3 digits (they represent days of the year). So I was looking for an efficient way to keep the numbers from the string as they are and make them "math-able".

2
  • Can you try this: ["{WHAT's in HERE?}".format(int(i)) before the for loop, and then try to iterate? Commented Sep 14, 2018 at 7:26
  • An int does not have a format. Commented Sep 14, 2018 at 8:55

4 Answers

1

We can first define a function that adds an integer to a string and returns a string, padded with zeros to keep the same length:

def add_to_string(s, n):
    total = int(s)+n
    return '{:0{}}'.format(total, len(s))

add_to_string('003', 2)
#'005'
add_to_string('00030', 12)
#'00042'

We can then use re.sub with a replacement function. We use the regex r"(?<=/)\d{3}(?=/)" that matches a group of 3 digits, preceded and followed by /, without including them in the match.

The replacement function takes a match as its parameter and returns a string. You could hardcode it, like this:

import re

def add_5_and_replace(match):
    return add_to_string(match.group(0), 5)

url = 'https://nasa.gov/archive/allData/6/MYD021KM/2018/002/MYD021KM.hdf'

new = re.sub(r"(?<=/)\d{3}(?=/)", add_5_and_replace, url)
print(new)
# https://nasa.gov/archive/allData/6/MYD021KM/2018/007/MYD021KM.hdf

But it could be better to pass the value to add. Either use a lambda:

def add_and_replace(match, n=1):
    return add_to_string(match.group(0), n)

url = 'https://nasa.gov/archive/allData/6/MYD021KM/2018/002/MYD021KM.hdf'

new = re.sub(r"(?<=/)\d{3}(?=/)", lambda m: add_and_replace(m, n=5), url)

Or a partial function. A complete solution could then be:

import re
from functools import partial

def add_to_string(s, n):
    total = int(s)+n
    return '{:0{}}'.format(total, len(s))

def add_and_replace(match, n=1):
    return add_to_string(match.group(0), n)

url = 'https://nasa.gov/archive/allData/6/MYD021KM/2018/002/MYD021KM.hdf'

new = re.sub(r"(?<=/)\d{3}(?=/)", partial(add_and_replace, n=3), url)
print(new)

# https://nasa.gov/archive/allData/6/MYD021KM/2018/005/MYD021KM.hdf

If you only want to add the default value 1 to your number, you can simply write

new = re.sub(r"(?<=/)\d{3}(?=/)", add_and_replace, url)
print(new)

# https://nasa.gov/archive/allData/6/MYD021KM/2018/003/MYD021KM.hdf
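Since the three digits are a day of the year, one case the zero-padding approach does not cover is running past the end of the year (for example 363 + 5). A minimal sketch of one way to handle it, assuming the four-digit year is always the path segment right before the day segment, as in the URLs from the question. `shift_day_of_year` and `shift_url` are hypothetical helper names, not part of the answer above:

```python
import re
from datetime import datetime, timedelta

def shift_day_of_year(year, doy, days):
    """Return (year, day-of-year) strings after adding `days`, rolling over at year end."""
    # %j parses and formats the zero-padded day of the year (001-366).
    date = datetime.strptime(year + " " + doy, "%Y %j") + timedelta(days=days)
    return date.strftime("%Y"), date.strftime("%j")

def shift_url(url, days):
    # Assumption: the URL contains .../YYYY/DDD/... as in the question's examples.
    def repl(match):
        year, doy = shift_day_of_year(match.group(1), match.group(2), days)
        return "/{}/{}/".format(year, doy)
    return re.sub(r"/(\d{4})/(\d{3})/", repl, url, count=1)

url = 'https://nasa.gov/archive/allData/6/MYD021KM/2018/002/MYD021KM.hdf'
print(shift_url(url, 5))
# https://nasa.gov/archive/allData/6/MYD021KM/2018/007/MYD021KM.hdf

print(shift_day_of_year("2018", "363", 5))
# ('2019', '003')
```

Note that this only rewrites the year and day path segments. The date embedded in the file name itself is left untouched.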

2 Comments

It's a very neat idea with those two functions :) How could I use add_5.. and add_to_string as an input? That is, add 5 to the match and use the output to look for a string containing the new number in the file, eventually adding the newly found whole string to a new list. re.search maybe? Or use it as an input for my match function above?
You can get the string containing the value using the same regex, and add whatever you want, to get the new number as a string. Something like old = re.findall(r"(?<=/)\d{3}(?=/)", url)[0] ; new = add_to_string(old, 5)
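The suggestion in the comment above could be turned into a small lookup. This is only a sketch: it assumes the lines are already in a list (called `data_time_filtered` in the question), `find_line_with_offset` is a hypothetical helper name, and the third URL in the sample list is made up for illustration.

```python
import re

DAY_PATTERN = re.compile(r"(?<=/)\d{3}(?=/)")

def add_to_string(s, n):
    # Add n to the number in s and pad with zeros to the original width.
    return '{:0{}}'.format(int(s) + n, len(s))

def find_line_with_offset(lines, start_line, offset):
    """Return the first line whose /DDD/ segment equals start_line's plus offset, or None."""
    match = DAY_PATTERN.search(start_line)
    if match is None:
        return None
    target = "/" + add_to_string(match.group(0), offset) + "/"
    for line in lines:
        if target in line:
            return line
    return None

lines = [
    "https://nasa.gov/archive/allData/6/MYD021KM/2018/002/MYD021KM.hdf",
    "https://nasa.gov/archive/allData/6/MYD021KM/2018/004/MYD021KM.hdf",
    "https://nasa.gov/archive/allData/6/MYD021KM/2018/007/MYD021KM.hdf",  # illustrative only
]
print(find_line_with_offset(lines, lines[0], 5))
# https://nasa.gov/archive/allData/6/MYD021KM/2018/007/MYD021KM.hdf
```

The returned line can then be appended to a new list. The function returns None when no line with the shifted number exists.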
1

Read about the format specification mini-language in the Python documentation:

c = "{:03}".format(25) # format a number to 3 digits, fill with 0
print(c)

Output:

025
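Applied to the question, the digits can be converted to `int` for the arithmetic and formatted back afterwards. A short sketch using the same format spec (the variable names are only illustrative):

```python
day = "002"

# Convert to int for the addition, then pad back to 3 digits.
next_day = "{:03}".format(int(day) + 5)
print(next_day)
# 007

# Equivalent f-string (Python 3.6+).
print(f"{int(day) + 5:03}")
# 007
```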


1

You can't get int to be 001, 002. They can only be 1, 2.

You can do something similar with strings.

>>> "3".zfill(3)
'003'
>>> "33".zfill(3)
'033'
>>> "33".rjust(3, '0')
'033'
>>> int('033')
33

>>> a = 3
>>> a.zfill(3)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'int' object has no attribute 'zfill'


1

Or you can use rjust and ljust:

>>> '2'.ljust(3,'0')
'200'
>>> '2'.rjust(3,'0')
'002'

Or:

>>> '{0:03d}'.format(2)
'002'

Or:

>>> format(2, '03')
'002'

Or:

>>> "%03d" % 2
'002'
