As far as I can tell, the only way to do this is to subclass Formatter
and override the parse()
method. This breaks the string down into a set of tuples, each of which represents some (optional) literal text followed by an (optional) field to be expanded.
Unfortunately, the default parse()
implementation isn't exposed and is non-trivial, involving some regex magic. It's particularly bad if you want to implement anything using balanced delimiters, which can't be done with regexes. Consider braceField={someDict["}"]}
.
For my specific use case, I want to expand fields containing valid Python expressions. To do this, I'm splitting the string on the opening delimiters and then using the ast
module to grab a valid Python expression before parsing the rest of the string. This means that balanced delimiters work automatically.
import ast
class BracketedFormatter(string.Formatter):
def parse(self, format_string):
while format_string:
left, *right = format_string.split("{", 1)
if not right:
yield (left, None, None, None)
break
right = right[0]
# Handle doubled delimiters for literals.
if right.startswith("{"):
yield (left + "{", None, None, None)
format_string = right[1:]
continue
# Use ast.parse to find the length of the expression.
offset = len(right) + 1
try:
ast.parse(right)
except SyntaxError as e:
if not str(e).startswith("unmatched '}'"):
raise e
offset = e.offset
expr = right[0 : offset - 1]
format_string = right[offset:]
yield (left if left else None, expr, None, None)
And the tests:
from hamcrest import (
assert_that,
contains_exactly,
)
def parse(s):
return list(BracketedFormatter().parse(s))
assert_that(parse(""), contains_exactly())
assert_that(parse("1234"), contains_exactly(("1234", None, None, None)))
assert_that(parse("1234{5678}"), contains_exactly(("1234", "5678", None, None)))
assert_that(
parse("1234{5678}9abc"),
contains_exactly(("1234", "5678", None, None), ("9abc", None, None, None)),
)
assert_that(parse("{5678}"), contains_exactly((None, "5678", None, None)))
assert_that(parse("{}"), contains_exactly((None, "", None, None)))
assert_that(
parse("{1}{2}"),
contains_exactly((None, "1", None, None), (None, "2", None, None)),
)
assert_that(parse("123}456"), contains_exactly(("123}456", None, None, None)))
assert_that(
parse("{1}123}456"),
contains_exactly((None, "1", None, None), ("123}456", None, None, None)),
)
assert_that(parse("{'}'}"), contains_exactly((None, "'}'", None, None)))
assert_that(parse("{'{}'}"), contains_exactly((None, "'{}'", None, None)))
assert_that(
parse("abc{{def"),
contains_exactly(("abc{", None, None, None), ("def", None, None, None)),
)
This doesn't handle format specifiers at all, because I don't need them, and no doubt there are many nasty edge cases (I'll try and come back and edit this as I find them). But even though this doesn't quite fit the OP's requirements, I haven't found any other examples allowing this online after some time of looking, so it's worth posting here.
Changing it to work with a different delimiter is left as an exercise for the reader (there are nasty edge cases).