I am trying to parse a file of street names for a project, and need to remove modifiers (Upper / Lower / Old / New / North / East / South / West ...) and endings (street / road / way / lane ...), but I am having no luck with a regular expression.
The way it is set up at the moment, the program parses the file one line (i.e. one street) at a time, and checks it
I think the problem is word boundaries - what I need, for example, are the following transformations...
Old Harrow Way -> Harrow (i.e. remove the 'Old' prefix and the 'Way' ending)
Chittock Mead -> Chittock (Remove the ending 'Mead')
- But modifiers should be left alone when they are only part of a longer word:
Gold Lane -> Gold (just remove ending)
Eastley Avenue -> Eastley (just remove ending)
Upper Western Avenue -> Western (remove prefix and ending)
Obviously, things like "South Street" would remove both - This is ok, because I can discard an empty string.
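For illustration, the behaviour described above could be sketched in Python roughly as follows. This is only a sketch under assumptions: the word lists are incomplete samples, the function name `strip_street_name` is made up for the example, and it relies on the `\b` word-boundary anchor from Python's `re` module so that 'Old' is not matched inside 'Gold' and 'East' is not matched inside 'Eastley'.

```python
import re

# Sample word lists (assumptions for this sketch; extend as needed).
MODIFIERS = ["Upper", "Lower", "Old", "New", "North", "East", "South", "West"]
ENDINGS = ["Street", "Road", "Way", "Lane", "Avenue", "Mead"]

# One or more modifiers at the start of the name. The \b after each word
# requires a word boundary, so "Old" cannot match the start of "Oldfield"
# and "West" cannot match the start of "Western".
PREFIX_RE = re.compile(
    r"^(?:(?:" + "|".join(MODIFIERS) + r")\b\s*)+", re.IGNORECASE
)

# A single ending as the last whole word of the name.
ENDING_RE = re.compile(
    r"\b(?:" + "|".join(ENDINGS) + r")\s*$", re.IGNORECASE
)


def strip_street_name(name):
    """Remove a trailing ending and any leading modifiers from one street name."""
    name = ENDING_RE.sub("", name.strip())
    name = PREFIX_RE.sub("", name)
    return name.strip()


if __name__ == "__main__":
    for street in ["Old Harrow Way", "Gold Lane", "Upper Western Avenue", "South Street"]:
        print(repr(strip_street_name(street)))
    # prints 'Harrow', 'Gold', 'Western' and '' (one per line)
```

Names such as "South Street" come back as an empty string, which can then be discarded by the caller.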
Can anyone give me an idea of how to do this - I've been reading up on regular expressions and trying things for hours!