unmatcher
unmatcher tries to solve the following problem:
Given a regular expression, find any string that matches the expression.
Why? Mostly just because. But one possible application is to generate test data for string processing functions.
Status
Most typical elements of regexes are supported:
- multipliers: *, +
- capture groups: |, ( ) (including backreferences)
- character classes (\d, \w, \s, etc.) and character sets ([])
API
The unmatcher module exposes a single function, reverse.
It takes a regular expression - either in text or compiled form - and returns a random string that matches it:
>>> import unmatcher
>>> print unmatcher.reverse(r'\d')
7
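Since the output is random, one way to use it in tests is to check that a returned string really matches the whole expression. The sketch below uses only the standard re module; fully_matches is a hypothetical helper name, not part of unmatcher. Like reverse, it accepts the pattern in text or compiled form. It assumes the pattern does not use inline global flags such as (?i).

```python
import re

def fully_matches(pattern, text):
    """Return True if the whole of `text` matches `pattern`.

    `pattern` may be a string or a compiled regular expression.
    Hypothetical helper for illustration; not part of unmatcher.
    """
    # re.compile() hands back an already compiled pattern unchanged,
    # so both input forms end up as a compiled pattern object here.
    regex = re.compile(pattern)
    # Wrap the pattern in a non-capturing group and anchor the end,
    # so that matching only a prefix of `text` is not enough.
    anchored = re.compile(r'(?:%s)\Z' % regex.pattern, regex.flags)
    return anchored.match(text) is not None
```

For example, fully_matches(r'\d', '7') is true, while fully_matches(r'\d', '7a') is false because of the trailing character.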
Additional arguments can be provided, specifying predefined values for capture groups
inside the expression. Use positional arguments for numbered groups ('\1', etc.):
>>> import unmatcher
>>> print unmatcher.reverse(r'<(\w+)>.*</\1>', 'h1')
<h1>1NLNVlrOT4YGyHV3vD7cHvrAl8OHVWDPKgmaE4gUsctboyFYUx</h1>
and keyword arguments for named groups:
>>> import unmatcher
>>> print unmatcher.reverse('(?P<foo>\w+)__(?P=foo)', foo='bar')
bar__bar
Note that a predefined value is not validated against the actual subexpression for the capture group.
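Because predefined values are not validated, a caller who needs that guarantee may want to check the result afterwards. A minimal sketch of such a check follows, using only the standard re module. The name groups_hold is hypothetical and not part of unmatcher. It takes a text pattern, a candidate string, and the same positional and keyword group values that were passed to reverse.

```python
import re

def groups_hold(pattern, text, *numbered, **named):
    """Check that `text` fully matches `pattern` (given as text) and
    that the listed capture groups captured the expected values.

    Hypothetical helper for illustration; not part of unmatcher.
    """
    # The non-capturing wrapper leaves group numbering untouched;
    # \Z forces the match to reach the end of the string.
    match = re.match(r'(?:%s)\Z' % pattern, text)
    if match is None:
        return False
    # Positional values correspond to groups \1, \2, ...
    for index, expected in enumerate(numbered, 1):
        if match.group(index) != expected:
            return False
    # Keyword values correspond to named groups (?P<name>...).
    for name, expected in named.items():
        if match.group(name) != expected:
            return False
    return True
```

With the examples above, groups_hold(r'(?P<foo>\w+)__(?P=foo)', 'bar__bar', foo='bar') is true. A string built from a value that does not fit its group, such as 'abc-abc' for r'(?P<n>\d+)-(?P=n)' with n='abc', is rejected.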



