219

I have a task to match floating point numbers. I have written the following regular expression for it:

[-+]?[0-9]*\.?[0-9]*

But, it returns an error:

Invalid escape sequence (valid ones are  \b  \t  \n  \f  \r  \"  \'  \\ )

As per my knowledge, we need to use an escape character for the . also. Please correct me where I am wrong.

8
  • 13
    What language is this regex used in? Commented Sep 28, 2012 at 15:34
  • 6
    @JDB - Why are you giving away 100 points for a number/float regex? The standard has always been (?:\d+(?:\.\d*)?|\.\d+) and has been posted ad infinitum on SO... Commented Feb 22, 2018 at 17:42
  • 1
    see also stackoverflow.com/questions/638565/… Commented Feb 27, 2018 at 4:24
  • 6
    [-+]?([0-9]*[.])?[0-9]+([eE][-+]?\d+)? if you want to catch exponential notation too, e.g. 3.023e-23 Commented Aug 2, 2019 at 17:26
  • In some languages like Java or C++, the backslash must be escaped. So to get the regex "\.", you would use the string "\\.". Python gets around this by using raw strings. Commented Jan 10, 2020 at 20:32

22 Answers

470

TL;DR

Use [.] instead of \. and [0-9] instead of \d to avoid escaping issues in some languages (like Java).

Thanks to the nameless one for originally recognizing this.

One relatively simple pattern for matching a floating point number in a larger string is:

[+-]?([0-9]*[.])?[0-9]+

This will match:

  • 123
  • 123.456
  • .456

See a working example

If you also want to match 123. (a period with no decimal part), then you'll need a slightly longer expression:

[+-]?([0-9]+([.][0-9]*)?|[.][0-9]+)

See pkeller's answer for a fuller explanation of this pattern

If you want to include a wider spectrum of numbers, including scientific notation and non-decimal numbers such as hex and octal, see my answer to How do I identify if a string is a number?.

If you want to validate that an input is a number (rather than finding a number within the input), then you should surround the pattern with ^ and $, like so:

^[+-]?([0-9]+([.][0-9]*)?|[.][0-9]+)$

Irregular Regular Expressions

"Regular expressions", as implemented in most modern languages, APIs, frameworks, libraries, etc., are based on a concept developed in formal language theory. However, software engineers have added many extensions that take these implementations far beyond the formal definition. So, while most regular expression engines resemble one another, there is actually no standard. For this reason, a lot depends on what language, API, framework or library you are using.

(Incidentally, to help reduce confusion, many have taken to using "regex" or "regexp" to describe these enhanced matching languages. See Is a Regex the Same as a Regular Expression? at RexEgg.com for more information.)

That said, most regex engines (actually, all of them, as far as I know) would accept \.. Most likely, there's an issue with escaping.

The Trouble with Escaping

Some languages have built-in support for regexes, such as JavaScript. For those languages that don't, escaping can be a problem.

This is because you are basically coding in a language within a language. Java, for example, uses \ as an escape character within its strings, so if you want to place a literal backslash character within a string, you must escape it:

// creates a single character string: "\"
String x = "\\";

However, regexes also use the \ character for escaping, so if you want to match a literal \ character, you must escape it for the regex engine, and then escape it again for Java:

// Creates a two-character string: "\\"
// When used as a regex pattern, will match a single character: "\"
String regexPattern = "\\\\";

In your case, you have probably not escaped the backslash character in the language you are programming in:

// will most likely result in an "Illegal escape character" error
String wrongPattern = "\.";
// will result in the string "\."
String correctPattern = "\\.";

All this escaping can get very confusing. If the language you are working with supports raw strings, then you should use those to cut down on the number of backslashes, but not all languages do (most notably: Java). Fortunately, there's an alternative that will work some of the time:

String correctPattern = "[.]";

For a regex engine, \. and [.] mean exactly the same thing. Note that the character-class trick doesn't cover every case: newline (\\n), open square bracket (\\[) and backslash (\\\\ or [\\\\]) still need backslashes in a Java string.
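As a rough illustration of the point above, the following sketch compares the two spellings in Java. The class name EscapeDemo and the helper fullMatch are made up for this example; only java.util.regex.Pattern comes from the standard library.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class EscapeDemo {
    // Returns true when the whole input is matched by the given regex.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        String escaped = "\\.";   // Java source "\\." is the two-character regex \.
        String bracketed = "[.]"; // no backslash needed inside a character class

        System.out.println(escaped.length());          // 2
        System.out.println(fullMatch(escaped, "."));   // true
        System.out.println(fullMatch(bracketed, ".")); // true
        System.out.println(fullMatch(escaped, "a"));   // false
        System.out.println(fullMatch(bracketed, "a")); // false
        System.out.println(fullMatch(".", "a"));       // true: a bare . is a wildcard
    }
}
```

Both forms reject "a", while an unescaped . accepts it, which is why the period has to be escaped one way or the other.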

A Note about Matching Numbers

(Hint: It's harder than you think)

Matching a number is one of those things you'd think is quite easy with regex, but it's actually pretty tricky. Let's take a look at your approach, piece by piece:

[-+]?

Match an optional - or +

[0-9]*

Match 0 or more sequential digits

\.?

Match an optional .

[0-9]*

Match 0 or more sequential digits

First, we can clean up this expression a bit by using a character class shorthand for the digits (note that this is also susceptible to the escaping issue mentioned above):

[0-9] = \d

I'm going to use \d below, but keep in mind that it means the same thing as [0-9]. (Well, actually, in some engines \d will match digits from all scripts, so it'll match more than [0-9] will, but that's probably not significant in your case.)

Now, if you look at this carefully, you'll realize that every single part of your pattern is optional. This pattern can match a 0-length string; a string composed only of + or -; or, a string composed only of a .. This is probably not what you've intended.
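To see this concretely, here is a minimal sketch in Java that runs the pattern from the question (with the backslash doubled so it compiles) against a few non-numbers. The class AllOptionalDemo and its accepts helper are hypothetical names for this example.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class AllOptionalDemo {
    // The pattern from the question, with the backslash doubled for Java.
    static final Pattern ORIGINAL = Pattern.compile("[-+]?[0-9]*\\.?[0-9]*");

    // True when the entire input is matched by the original pattern.
    static boolean accepts(String input) {
        return ORIGINAL.matcher(input).matches();
    }

    public static void main(String[] args) {
        // Every one of these prints true, including the empty string.
        for (String s : new String[] {"", "+", "-", ".", "-.", "1.5"}) {
            System.out.println("\"" + s + "\" -> " + accepts(s));
        }
    }
}
```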

To fix this, it's helpful to start by "anchoring" your regex with the bare-minimum required string, probably a single digit:

\d+

Now we want to add the decimal part, but it doesn't go where you think it might:

\d+\.?\d* /* This isn't quite correct. */

This will still match values like 123.. Worse, it's got a tinge of evil about it. The period is optional, meaning that you've got two repeated classes side-by-side (\d+ and \d*). This can actually be dangerous if used in just the wrong way, opening your system up to DoS attacks.

To fix this, rather than treating the period as optional, we need to treat it as required (to separate the repeated character classes) and instead make the entire decimal portion optional:

\d+(\.\d+)? /* Better. But... */

This is looking better now. We require a period between the first sequence of digits and the second, but there's a fatal flaw: we can't match .123 because a leading digit is now required.

This is actually pretty easy to fix. Instead of making the "decimal" portion of the number optional, we need to look at it as a sequence of characters: 1 or more numbers that may be prefixed by a . that may be prefixed by 0 or more numbers:

(\d*\.)?\d+

Now we just add the sign:

[+-]?(\d*\.)?\d+

Of course, those slashes are pretty annoying in Java, so we can substitute in our long-form character classes:

[+-]?([0-9]*[.])?[0-9]+
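The progression above can be checked with a small Java sketch that applies each intermediate pattern to the same inputs as whole-string matches. StepwiseDemo and full are assumed names for this example, not part of any library.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative, not from the answer.
class StepwiseDemo {
    // True when the whole input is matched by the given regex.
    static boolean full(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        String[] patterns = {
            "\\d+\\.?\\d*",            // first attempt: accepts "123."
            "\\d+(\\.\\d+)?",          // rejects "123." but also ".123"
            "(\\d*\\.)?\\d+",          // accepts ".123", rejects "123."
            "[+-]?([0-9]*[.])?[0-9]+"  // final form, no backslashes needed
        };
        String[] inputs = {"123", "123.45", "123.", ".123", "-.5"};
        for (String p : patterns) {
            StringBuilder row = new StringBuilder(p + " :");
            for (String s : inputs) {
                row.append(' ').append(s).append('=').append(full(p, s));
            }
            System.out.println(row);
        }
    }
}
```

Only the last pattern accepts the signed input -.5, since the earlier steps have no sign part yet.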

Matching versus Validating

This has come up in the comments a couple times, so I'm adding an addendum on matching versus validating.

The goal of matching is to find some content within the input (the "needle in a haystack"). The goal of validating is to ensure that the input is in an expected format.

Regexes, by their nature, only match text. Given some input, they will either find some matching text or they will not. However, by "snapping" an expression to the beginning and ending of the input with anchors (^ and $), we can ensure that no match is found unless the entire input matches the expression, effectively using regexes to validate.

The regex described above ([+-]?([0-9]*[.])?[0-9]+) will match one or more numbers within a target string. So given the input:

apple 1.34 pear 7.98 version 1.2.3.4

The regex will match 1.34, 7.98, 1.2, .3 and .4.
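A minimal Java sketch of this "needle in a haystack" use, assuming a helper named findAll in a made-up class FindFloatsDemo, could look like this:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class FindFloatsDemo {
    static final Pattern FLOAT = Pattern.compile("[+-]?([0-9]*[.])?[0-9]+");

    // Collects every non-overlapping match, scanning left to right.
    static List<String> findAll(String input) {
        List<String> found = new ArrayList<>();
        Matcher m = FLOAT.matcher(input);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(findAll("apple 1.34 pear 7.98 version 1.2.3.4"));
        // prints [1.34, 7.98, 1.2, .3, .4]
    }
}
```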

To validate that a given input is a number and nothing but a number, "snap" the expression to the start and end of the input by wrapping it in anchors:

^[+-]?([0-9]*[.])?[0-9]+$

This will only find a match if the entire input is a floating point number, and will not find a match if the input contains additional characters. So, given the input 1.2, a match will be found, but given apple 1.2 pear no matches will be found.

Note that some regex engines have a validate, isMatch or similar function, which essentially does what I've described automatically, returning true if a match is found and false if no match is found. Also keep in mind that some engines allow you to set flags which change the definition of ^ and $, matching the beginning/end of a line rather than the beginning/end of the entire input. This is typically not the default, but be on the lookout for these flags.
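In Java, for instance, Matcher.matches() plays the role of such a whole-input check, and Pattern.MULTILINE is the flag that changes what ^ and $ mean. The sketch below contrasts the three behaviours; the class ValidateDemo and its method names are assumptions made for this example.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class ValidateDemo {
    static final String FLOAT = "[+-]?([0-9]*[.])?[0-9]+";

    // find() succeeds if any part of the input matches.
    static boolean containsFloat(String input) {
        return Pattern.compile(FLOAT).matcher(input).find();
    }

    // matches() succeeds only if the entire input matches, without explicit anchors.
    static boolean isFloat(String input) {
        return Pattern.compile(FLOAT).matcher(input).matches();
    }

    // find() with explicit anchors; passing Pattern.MULTILINE makes ^ and $ apply per line.
    static boolean anchoredFind(String input, int flags) {
        return Pattern.compile("^" + FLOAT + "$", flags).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(containsFloat("apple 1.2 pear"));                          // true
        System.out.println(isFloat("apple 1.2 pear"));                                // false
        System.out.println(isFloat("1.2"));                                           // true
        System.out.println(anchoredFind("apple\n1.2\npear", 0));                      // false
        System.out.println(anchoredFind("apple\n1.2\npear", Pattern.MULTILINE));      // true
    }
}
```

The last two lines show why the multi-line flag matters for validation: with it, one valid line is enough for the anchored pattern to report a match.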


25 Comments

JDB, thanks and I hope you are still around! I'm reading your post in the future :) Your answer certainly takes care of 0.24 and 2.2 and correctly disallows 4.2.44 All tested with regex101.com However, it disallows 123. which as you say may be acceptable (and I think it is!). I can fix this by changing your expression to [-+]?(\d*[.])?\d* (notice * at end instead of +) but then crazy things like . (your second example) are allowed. Anyway to have my cake and eat it too?
@Dave - \d+(\.\d*)?|\.\d+
@yeouuu yes, because 1. matches. Add ^ and $ to the beginning and end of the regex if you want to match only if the whole input matches.
floats can have exponents or be NaN/Inf, so i would use this: [-+]?(([0-9]*[.]?[0-9]+([ed][-+]?[0-9]+)?)|(inf)|(nan)), e/d for float/double precision float. Don't forget a fold case flag to the regex
I would recommend using a non-capturing group, as it is unlikely that someone aims at only capturing the integer part of the number. Like this: [+-]?(?:[0-9]*[.])?[0-9]+. Then, capturing the whole number is trivial.
46
+100

I don't think that any of the answers on this page at the time of writing are correct (and many other suggestions elsewhere on SO are wrong too). The complication is that you have to match all of the following possibilities:

  • No decimal point (i.e. an integer value)
  • Digits both before and after the decimal point (e.g. 0.35 , 22.165)
  • Digits before the decimal point only (e.g. 0. , 1234.)
  • Digits after the decimal point only (e.g. .0 , .5678)

At the same time, you must ensure that there is at least one digit somewhere, i.e. the following are not allowed:

  • a decimal point on its own
  • a signed decimal point with no digits (i.e. +. or -.)
  • + or - on their own
  • an empty string

This seems tricky at first, but one way of finding inspiration is to look at the OpenJDK source for the java.lang.Double.valueOf(String) method (start at http://hg.openjdk.java.net/jdk8/jdk8/jdk, click "browse", navigate down /src/share/classes/java/lang/ and find the Double class). The long regex that this class contains caters for various possibilities that the OP probably didn't have in mind, but ignoring for simplicity the parts of it that deal with NaN, infinity, Hexadecimal notation and exponents, and using \d rather than the POSIX notation for a single digit, I can reduce the important parts of the regex for a signed floating point number with no exponent to:

[+-]?((\d+\.?\d*)|(\.\d+))

I don't think that there is a way of avoiding the (...)|(...) construction without allowing something that contains no digits, or forbidding one of the possibilities that has no digits before the decimal point or no digits after it.

Obviously in practice you will need to cater for trailing or preceding whitespace, either in the regex itself or in the code that uses it.
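The two lists above translate directly into a quick check. This is a sketch in Java under a few assumptions: the class DigitRequiredDemo and method isFloat are made-up names, and surrounding whitespace is handled in the calling code with String.trim(), which is one of the two options mentioned above.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class DigitRequiredDemo {
    static final Pattern FLOAT = Pattern.compile("[+-]?((\\d+\\.?\\d*)|(\\.\\d+))");

    // Trims surrounding whitespace in code, then requires a whole-string match.
    static boolean isFloat(String input) {
        return FLOAT.matcher(input.trim()).matches();
    }

    public static void main(String[] args) {
        String[] allowed = {"7", "0.35", "22.165", "0.", "1234.", ".0", ".5678", "-.5", " +3. "};
        String[] rejected = {".", "+.", "-.", "+", "-", ""};
        for (String s : allowed) {
            System.out.println("\"" + s + "\" -> " + isFloat(s));  // true for each
        }
        for (String s : rejected) {
            System.out.println("\"" + s + "\" -> " + isFloat(s));  // false for each
        }
    }
}
```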

9 Comments

@JDB You are right, sorry for missing the version in your comment. My concern was that the regex that featured most prominently in the accepted answer wouldn't work for all cases. Thanks for linking to/including my suggestion.
This, and all/most other answers, ignore that a float can have an exponent.
@NateS That's right, I did write "ignoring for simplicity the parts of it that deal with NaN, infinity, Hexadecimal notation and exponents", because that seems to match the scope of the OP's question. There are more complete implementations around, including the one that I found in the JDK source code.
Can the regex [+-]?((?=\.?\d)\d*\.?\d*) be used to avoid the alternation? It uses a lookahead...
@4esn0k Nice regex! I have played around with it, and it does work. I have two caveats: (1) not all regex engines support zero-width assertions (although most modern ones do, AFAIK), and (2) the look-ahead is just an alternation by another name: the engine still has to try something and backtrack if it doesn't work. Have an upvote for a very neat idea nevertheless.
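For readers who want to try the look-ahead variant from the comment above, here is a small Java sketch (Java's regex engine does support zero-width look-ahead). LookaheadDemo and isFloat are hypothetical names for this example.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class LookaheadDemo {
    // (?=\.?\d) asserts that a digit follows, optionally after one period,
    // so the all-optional body \d*\.?\d* can no longer match without a digit.
    static final Pattern FLOAT = Pattern.compile("[+-]?((?=\\.?\\d)\\d*\\.?\\d*)");

    static boolean isFloat(String input) {
        return FLOAT.matcher(input).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"5", "1.", ".5", "1.5", "-.5"}) {
            System.out.println(s + " -> " + isFloat(s));            // true for each
        }
        for (String s : new String[] {".", "", "+", "-."}) {
            System.out.println("\"" + s + "\" -> " + isFloat(s));   // false for each
        }
    }
}
```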
35

I want to match what most languages consider valid numbers (integer and floats):

  • '5' / '-5'

  • '1.0' / '1.' / '.1' / '-1.' / '-.1'

  • '0.45326e+04', '666999e-05', '0.2e-3', '-33.e-1'

Notes:

  • preceding sign of number ('-' or '+') is optional

  • '-1.' and '-.1' are valid but '.' and '-.' are invalid

  • '.1e3' is valid, but '.e3' and 'e3' are invalid

In order to support both '1.' and '.1' we need an OR operator ('|') in order to make sure we exclude '.' from matching.

[+-]? +/- sign is optional since ? means 0 or 1 matches

( since we have 2 sub expressions we need to put them in parenthesis

\d+([.]\d*)?(e[+-]?\d+)? This is for numbers starting with a digit

| separates sub expressions

[.]\d+(e[+-]?\d+)? this is for numbers starting with '.'

) end of expressions

  • For numbers starting with '.'

[.] first character is dot (inside brackets or else it is a wildcard character)

\d+ one or more digits

(e[+-]?\d+)? this is an optional (0 or 1 matches due to ending '?') scientific notation

  • For numbers starting with a digit

\d+ one or more digits

([.]\d*)? optionally we can have a dot character and zero or more digits after it

(e[+-]?\d+)? this is an optional scientific notation

  • Scientific notation

e literal that specifies exponent

[+-]? optional exponent sign

\d+ one or more digits

All of those combined:

[+-]?(\d+([.]\d*)?(e[+-]?\d+)?|[.]\d+(e[+-]?\d+)?)

To accept E as well:

[+-]?(\d+([.]\d*)?([eE][+-]?\d+)?|[.]\d+([eE][+-]?\d+)?)

(Test cases)
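The valid and invalid examples listed in this answer can be run against the final pattern with a short Java sketch. The class ExponentDemo and method isNumber are names invented for this example; the backslashes are doubled because the pattern lives in a Java string.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class ExponentDemo {
    static final Pattern NUMBER = Pattern.compile(
        "[+-]?(\\d+([.]\\d*)?([eE][+-]?\\d+)?|[.]\\d+([eE][+-]?\\d+)?)");

    // True when the entire input is a number according to the pattern above.
    static boolean isNumber(String input) {
        return NUMBER.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] valid = {"5", "-5", "1.0", "1.", ".1", "-1.", "-.1",
                          "0.45326e+04", "666999e-05", "0.2e-3", "-33.e-1", ".1e3"};
        String[] invalid = {".", "-.", ".e3", "e3"};
        for (String s : valid) {
            System.out.println(s + " -> " + isNumber(s));   // true for each
        }
        for (String s : invalid) {
            System.out.println(s + " -> " + isNumber(s));   // false for each
        }
    }
}
```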

Comments

11
+200

This is simple: you have used Java and you ought to use \\. instead of \. (search for character escaping in Java).

Comments

10

what you need is:

[\-\+]?[0-9]*(\.[0-9]+)?

I escaped the "+" and "-" sign and also grouped the decimal with its following digits since something like "1." is not a valid number.

The changes will allow you to match integers and floats. for example:

0
+1
-2.0
2.23442

4 Comments

The problem with this expression is that .1 would not be permitted, even though such input is universally recognized as correct.
This will now accept zero length strings, - and +, which are not numbers. Regex is tricky! :)
Also, this doesn't answer the OP's actual question, which is that \. doesn't work.
it fails at most floating point numbers. Check this regex demo.
6

Match strings which are considered valid representations of floating point values by C and C++ (and many other language) compilers, using the C++ regex library:

In C++ with #include <regex> you can do this:

std::regex r("[+-]?[0-9]+[.][0-9]*([e][+-]?[0-9]+)?");
return std::regex_match(value, r);

which is considerably simpler than most of the C++-related answers above.

It matches strings which are considered to be valid string representations of floating point numbers according to C++ compilers.

That means things like

1.
-1.

are considered valid representations of floating point numbers but that

.1
-.1

are not.

To explain the expression in more detail, it is essentially composed of two parts:

[+-]?[0-9]+[.][0-9]*([e][+-]?[0-9]+)?

[+-]?[0-9]+[.][0-9]*
and                 ([e][+-]?[0-9]+)?

The first part is easy to understand:

  • Optional (meaning 0 or 1 occurrences of) '+' or '-' character
  • One or more digits
  • A literal '.' character, which is mandatory (otherwise you have a representation of an integer not a floating point value)
  • If you want the '.' to be optional, change it to [.]?
  • Followed by zero or more digits

The second part is also quite easy once broken down.

  • Firstly note that the expression is contained in parentheses, followed by a ?. This means the expression inside the parentheses must match 0 or 1 times. (Meaning it is optional.)
  • Inside we have a literal 'e' character which must match
  • Followed by an optional '+' or '-' character
  • Followed by 1 or more digits

The last part [+-]?[0-9]+ is a regex for matching an integer.

To match integer values as well use:

[+-]?[0-9]+[.]?[0-9]*([e][+-]?[0-9]+)?

Note the ? after the [.].

But be aware this will also match things like

+100e+100

which is perhaps an unusual representation of an integer. Although it is technically an integer, you probably wouldn't expect this to be a match.

Other answers provide a solution if you don't want this behaviour.
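The behaviour described above does not depend on the C++ regex library. As a sketch, here are the same two patterns exercised in Java (the switch of language and the names CppStyleDemo, isFloat and isFloatOrInt are assumptions of this example, not part of the answer):

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class CppStyleDemo {
    // Mandatory period: integers are rejected.
    static final Pattern FLOAT_ONLY =
        Pattern.compile("[+-]?[0-9]+[.][0-9]*([e][+-]?[0-9]+)?");
    // Optional period: integers are accepted as well.
    static final Pattern FLOAT_OR_INT =
        Pattern.compile("[+-]?[0-9]+[.]?[0-9]*([e][+-]?[0-9]+)?");

    static boolean isFloat(String s) {
        return FLOAT_ONLY.matcher(s).matches();
    }

    static boolean isFloatOrInt(String s) {
        return FLOAT_OR_INT.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isFloat("1."));              // true
        System.out.println(isFloat("-1."));             // true
        System.out.println(isFloat(".1"));              // false: a leading digit is required
        System.out.println(isFloat("100"));             // false: the period is mandatory
        System.out.println(isFloatOrInt("100"));        // true
        System.out.println(isFloatOrInt("+100e+100"));  // true
        System.out.println(isFloatOrInt("0E-10"));      // false: only lowercase e is allowed
    }
}
```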

To ensure an entire string is a match rather than just a string containing a match use anchors:

"^[+-]?[0-9]+[.][0-9]*([e][+-]?[0-9]+)?$"

Examples

Without anchor characters: [screenshot not preserved]

With anchor characters: [screenshot not preserved]

With optional '.' character: [screenshot not preserved]

Note that this matches the string .-100 and .1e100 if you do not include the anchor characters, which may not be what you want.

When considering this problem:

My aim was to validate user input to ensure it matches a valid C++ string representation of a floating point number. Hence I am assuming you will use anchor characters and that you do not consider strings like

hello world 3.14 this contains a floating point number

to be a valid floating point number - because although the string contains a floating point number, the whole string is not a valid floating point number.

Other answers may suit your needs better if you just want to detect floating points within larger strings/text.

9 Comments

This will fail to match "0E-10", an odd edge case I've run across while parsing JSON data. [+-]?[0-9]+([.][0-9]+)?([e][+-]?[0-9]+)? will match that and similar cases.
@Gravis You need a version which matches e and E for the exponent. What you have is not an unusual edge case, but a format which doesn't match.
aww dang it! It was supposed to be [eE] in my comment. Also, the site backend that generated this value was Java which does use the uppercase 'E' for scientific notation per String.valueOf()!
@Gravis That's weird, I would expect that to match. I've just put this into regex101 and it seems to be working? If you still find you can't get it working can you please post a new question and send a link to me either by pasting the link here or sending it to my dm
I see the problem now. Right above the "Examples" section you forgot to put ? after [.].
4

This one worked for me:

(?P<value>[-+]*\d+\.\d+|[-+]*\d+)

You can also use this one (without named parameter):

([-+]*\d+\.\d+|[-+]*\d+)

Use some online regex tester to test it (e.g. regex101 )

Comments

2

This captures floating-point numbers as recognized in C/C++ code:

[+-]?((((\d+\.?\d*)|(\.\d+))([eE][+-]?\d+[fF]?)?)|((\d+\.\d*)|(\.\d+))[fF]?)
  • +/- sign
  • either only digits, digits., .digits or digits.digits
  • optional exponent with e or E, +/- sign and digits
  • optional f or F at the end, but only if the number contains a . or an exponent

1 Comment

This works also for NP++!
2
^[+-]?([0-9]{1,})[.,]([0-9]{1,})$

This will match:

  1. 1.2
  2. 12.3
  3. 123.4
  4. 1,2
  5. 12,3
  6. 123,4

Comments

1
[+-]?(([1-9][0-9]*)|(0))([.,][0-9]+)?

[+-]? - optional leading sign

(([1-9][0-9]*)|(0)) - integer without leading zero, including single zero

([.,][0-9]+)? - optional fractional part
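A short Java sketch of how this pattern treats leading zeros, using whole-string matching. NoLeadingZeroDemo and isNumber are made-up names for this example.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class NoLeadingZeroDemo {
    static final Pattern NUMBER =
        Pattern.compile("[+-]?(([1-9][0-9]*)|(0))([.,][0-9]+)?");

    // True when the entire input matches; a partial match does not count.
    static boolean isNumber(String input) {
        return NUMBER.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isNumber("0"));       // true
        System.out.println(isNumber("0.5"));     // true
        System.out.println(isNumber("12,3"));    // true: comma is accepted as separator
        System.out.println(isNumber("-10.25"));  // true
        System.out.println(isNumber("000007"));  // false: leading zeros
        System.out.println(isNumber("007.5"));   // false
        System.out.println(isNumber("1."));      // false: digits must follow the separator
        System.out.println(isNumber(".5"));      // false: an integer part is required
    }
}
```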

2 Comments

Give more info - for people not knowing the regexps it is hieroglyphs. For people knowing them, they don't need it.
Cool, seems the only one that doesn't match inputs like 000007, exactly what I was trying to design, thanks!
1

for javascript

const test = new RegExp('^[+]?([0-9]{0,})*[.]?([0-9]{0,2})?$','g');

Which would work for 1.23 1234.22 0 0.12 12

You can change the parts in the {} to get different results in decimal length and front of the decimal as well. This is used in inputs for entering in number and checking every input as you type only allowing what passes.

Comments

1
(\d*)(\.)*(\d+)

This would parse the below.

11.00
12
.0

There must be at least one digit. The decimal point and the digits before it are optional.

Comments

0
[+/-] [0-9]*.[0-9]+

Try this solution.

Comments

0

In C++ using the regex library

The answer would go about like this:

[0-9]?([0-9]*[.])?[0-9]+

Notice that I don't take the sign symbol, if you wanted it with the sign symbol it would go about this:

[+-]?([0-9]*[.])?[0-9]+

This also separates a regular number or a decimal number.

Comments

0

In C notation, a float can take the following forms:

  1. 123
  2. 123.
  3. 123.24
  4. .24
  5. 2e-2 = 2 * 10 pow -2 = 2 * 0.01
  6. 4E+4 = 4 * 10 pow 4 = 4 * 10 000

To build the float regular expression, I will first define an "int" regular expression variable:

(([1-9][0-9]*)|0) will be int

Now I will write small chunks of the float regular expression; the solution is to join those chunks with the or symbol "|".

Chunks:

- ([+-]?{int}) satisfies case 1
- (([+-]?{int})"."[0-9]*) satisfies cases 2 and 3
- ("."[0-9]+) satisfies case 4
- ([+-]?{int}[eE][+-]?{int}) satisfies cases 5 and 6

Final solution (concatenating the small chunks):

(([+-]?{int})|(([+-]?{int})"."[0-9]*)|("."[0-9]+)|([+-]?{int}[eE][+-]?{int}))

Comments

0

This is for those looking for a regex that validates an input as a signed floating point number after every single character typed by the user, so partially typed input must also pass.

I.e. the sign comes first (and must already match and be valid on its own), then the digits (still a valid match), and then the optional decimal part.

In JS, we use the onkeydown/oninput event to do that, together with the following regex:

^[+-]?[0-9]*([\.][0-9]*)?$
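To show which partial inputs this pattern lets through, here is a sketch that replays a user's keystrokes one prefix at a time. It is written in Java rather than JS (an assumption of this example, as are the names KeystrokeDemo and isAcceptablePrefix); the regex itself is unchanged apart from the doubled backslash.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class KeystrokeDemo {
    static final Pattern PARTIAL = Pattern.compile("^[+-]?[0-9]*([\\.][0-9]*)?$");

    // True when the text typed so far could still become a signed decimal number.
    static boolean isAcceptablePrefix(String typedSoFar) {
        return PARTIAL.matcher(typedSoFar).matches();
    }

    public static void main(String[] args) {
        String typed = "-12.5";
        // Check every prefix, as an input handler would after each keystroke.
        for (int i = 0; i <= typed.length(); i++) {
            String prefix = typed.substring(0, i);
            System.out.println("\"" + prefix + "\" -> " + isAcceptablePrefix(prefix)); // true for each
        }
        System.out.println(isAcceptablePrefix("1.2.3"));  // false: second period
        System.out.println(isAcceptablePrefix("1a"));     // false: letter
    }
}
```

Because the empty string, a lone sign and a lone period all pass, a pattern like this is probably best paired with a stricter check once the user has finished typing.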

Comments

0

In C Language, the answer would go about like this:

[+-]?((\d+\.?\d*)|(\.\d+))(([eE][+-]?)?\d+)?[fFlL]?

Comments

0

If we are only looking to identify floating point numbers and not integers, then we can use this:

'\d*\.\d+'

Comments

0

I would suggest this pattern [-+]?[0-9]+[.]?[0-9]*

Comments

0

Based on other answers, I guess we need to set start and end for our expression too, cause we can have "match" on something like this: '3.2.3' where match will be '2.3'. So here is my solution:

^[+-]?[0-9]+(\.[0-9]+)?$

If you have problems like the author of this thread, use this:

^[+-]?[0-9]+([.][0-9]+)?$

And a reminder for better understanding:

  • '^' - Start of our expression
  • '[+-]' - '+' or '-' symbol
  • '?' - previous symbol (or sequence) is optional
  • '[0-9]' - any single digit
  • '()' - group of expressions
  • '+' - one or more symbols (or sequences) like previous
  • '$' - end of our expression

Hope it will help you)

Comments

-1

This is for javascript (idk if there's a large difference between languages)

`int: /0|[1-9][0-9]*/`

For floats:

`float:   /[0-9]+\.[0-9]+/`

Comments

-2

if you are using flutter you can use [0-9]([.]([0-9])*)? This would parse 123.123

Comments
