I've got a variable "Variable" in VBScript that will receive different values, based on names that come from XML files I don't trust. I can't let "Variable" contain forbidden characters (<, >, :, ", /, \, |, ?, *) or accented characters such as (Á, á, É, é, Â, â, Ê, ê, ñ, ã).

So, my question is: how can I write a script that finds and replaces any of these characters in my variable? I'm using the Replace function from the MSDN Library, but the way I'm using it won't let me replace more than one kind of character.

Example:

(Assuming a Node.Text value of "Example A/S")

For Each Node In xmlDoc.SelectNodes("//NameUsedToRenameFile")
     Variable = Node.Text
Next

Result = Replace(Variable, "<", "-")
Result = Replace(Variable, "/", "-")

WScript.Echo Result

The Echo above returns "Example A-S", but if I change the order of my Replace calls, like:

Result = Replace(Variable, "/", "-")
Result = Replace(Variable, "<", "-")

I get "Example A/S". How should I write it so that it handles any of these characters? Thanks!

  • You should define a list of allowed characters. This makes it easier to put things under control. Commented Jan 16, 2014 at 14:39
  • You'd have to create some sort of mapping array/object I believe. Commented Jan 16, 2014 at 14:39
  • And String is immutable, so you need to pass the Result to the next Replace, not Variable. Commented Jan 16, 2014 at 14:40
  • Why does it work in the first example, but not in the second? Commented Jan 16, 2014 at 14:45
  • @CharlieVelez: Like I said, you need to pass Result to the next Replace. Otherwise, the result you see is only Variable with one replacement applied, the one from the last line. Commented Jan 16, 2014 at 14:46

2 Answers

As discussed, it might be easier to do things the other way around and create a list of allowed characters, as VBScript is not so good at handling Unicode characters. Whilst the characters you have listed may be fine, you may run into issues with certain character sets. Here's an example routine that could help your cause:

Consider this command:

wscript.echo ValidateStr("This393~~_+'852Is0909A========Test|!:~@$%#@@#")

Using the sample routine below, it should produce the following results:

This393852Is0909ATest

The sample routine:

Function ValidateStr(vsVar)
    ' Returns vsVar with every character that is not in vsAllowed removed.
    Dim vsAllowed, vsScan, vsValid, vsaCount
    vsAllowed = "ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890"
    ValidateStr = ""
    If VarType(vsVar) = vbString Then
        If Len(vsVar) > 0 Then
            For vsScan = 1 To Len(vsVar)
                vsValid = False
                vsaCount = 1
                ' Compare the upper-cased character with each allowed character,
                ' stopping as soon as a match is found.
                Do While vsValid = False And vsaCount <= Len(vsAllowed)
                    If UCase(Mid(vsVar, vsScan, 1)) = Mid(vsAllowed, vsaCount, 1) Then vsValid = True
                    vsaCount = vsaCount + 1
                Loop
                ' Keep the original character (with its original case) if allowed.
                If vsValid Then ValidateStr = ValidateStr & Mid(vsVar, vsScan, 1)
            Next
        End If
    End If
End Function

I hope this helps you with your quest. Enjoy!

EDIT: If you wish to continue with your original approach, you will need to fix your Replace calls. They are not working because each line starts again from Variable and so discards the previous result. Pass in Variable the first time, then use Result for every subsequent call.

You had:

Result = Replace(Variable, "/", "-")
Result = Replace(Variable, "<", "-")

You need to change this to:

Result = Replace(Variable, "/", "-")
Result = Replace(Result, "<", "-")
Result = Replace(Result, ...etc..)
Result = Replace(Result, ...etc..)
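
If the list of characters grows, the same chaining can be written as a loop. The following is only a sketch: ReplaceForbidden and its parameters are hypothetical names, and the character list is the one given in the question.

```vbscript
' Hypothetical helper: replaces each forbidden character with "replacement".
Function ReplaceForbidden(text, replacement)
    Dim forbiddenChars, ch, result
    ' The forbidden characters from the question: < > : " / \ | ? *
    forbiddenChars = Array("<", ">", ":", """", "/", "\", "|", "?", "*")
    result = text
    For Each ch In forbiddenChars
        ' Feed the previous result back in so no earlier replacement is lost.
        result = Replace(result, ch, replacement)
    Next
    ReplaceForbidden = result
End Function

WScript.Echo ReplaceForbidden("Example A/S", "-")   ' Example A-S
```

Each pass works on the output of the previous one, which is the same fix as above applied once per character.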

Edit: You could try Ansgar's regex, as the code is far simpler, but I am not sure it will work if, for example, your string contains Simplified Chinese characters.

1 Comment

It functioned perfectly. Thanks a lot, mate!
I agree with Damien that replacing everything but known-good characters is the better approach. I would, however, use a regular expression for this, because it greatly simplifies the code. I would also recommend not removing "bad" characters, but replacing them with a known-good placeholder (an underscore for instance), because removing characters might yield undesired results.

Function SanitizeString(str)
  Dim re  ' keep the RegExp object local to the function
  Set re = New RegExp
  re.Pattern = "[^a-zA-Z0-9]"
  re.Global  = True

  SanitizeString = re.Replace(str, "_")
End Function
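
With this pattern the accented letters from the question also become underscores. If you would rather keep them as plain letters, one option is to map them first. This is a sketch only: FoldAccents is a hypothetical helper, and it covers just the ten letters listed in the question.

```vbscript
' Hypothetical helper: maps a few accented letters to unaccented ones.
Function FoldAccents(str)
    Dim codes, plain, i, result
    ' Unicode code points for: Á á É é Â â Ê ê ñ ã
    ' ChrW is used so the script does not depend on the .vbs file encoding.
    codes = Array(193, 225, 201, 233, 194, 226, 202, 234, 241, 227)
    plain = Array("A", "a", "E", "e", "A", "a", "E", "e", "n", "a")
    result = str
    For i = 0 To UBound(codes)
        ' Chain on result so every mapping is applied.
        result = Replace(result, ChrW(codes(i)), plain(i))
    Next
    FoldAccents = result
End Function

WScript.Echo FoldAccents("Jos" & ChrW(233))   ' Jose
```

The output of FoldAccents can then be passed to SanitizeString, so that only the characters that are still outside the allowed set get the placeholder.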

1 Comment

Hello again, @Ansgar! I'll try it too!