1

I'm a beginner with regular expressions and I'm trying to search JSON-formatted text, but I can't get it to work correctly:

SELECT DISTINCT tag, body FROM pages 
WHERE (body REGEXP BINARY '"listeListeOuiNon":".*1.*"')

The results include text with

"listeListeOuiNon":"1" and

"listeListeOuiNon":"1,2" and

"listeListeOuiNon":"0,1" as expected,

but also "listeListeOuiNon":"2" (not expected)

Any ideas? Maybe it's because the match is greedy, but I'm not sure...

Thanks in advance!

4
  • Just a word of caution: MySQL REGEXP is a very expensive operation. It's often faster to break what you can into several LIKE conditions if you can manage it. Also, is this a common query, or will the matching text change often? Commented Sep 23, 2011 at 12:31
  • Yes, I know LIKE is faster, but the text to match does change, and trying it with LIKE '"listeListeOuiNon":"%1%"' gives no better results. Commented Sep 23, 2011 at 12:41
  • Can you give us an example row where "listeListeOuiNon":"2" was matched? Commented Sep 23, 2011 at 12:59
  • Yes, for example: SELECT '"listeListeOuiNon":"2", "listeToto":"1"' REGEXP BINARY '"listeListeOuiNon":".*1.*"' Commented Sep 23, 2011 at 13:07

3 Answers

2

Well, it's quite easy to debug:

SELECT '"listeListeOuiNon":"2"' REGEXP BINARY '"listeListeOuiNon":".*1.*"'

returns 0

SELECT '"listeListeOuiNon":"1"' REGEXP BINARY '"listeListeOuiNon":".*1.*"'

returns 1

SELECT '"listeListeOuiNon":"1,2"' REGEXP BINARY '"listeListeOuiNon":".*1.*"'

returns 1

So something is not right on your side, because the query simply cannot return rows where body equals "listeListeOuiNon":"2". It is possible, though, that body contains several of these entries, something like:

body => '"listeListeOuiNon":"1,2", "listeListeOuiNon":"2"'

So you have to modify your regexp:

'^"listeListeOuiNon":".*1.*"$'

Well, then you have to modify your query:

SELECT DISTINCT tag, body FROM pages WHERE (body REGEXP BINARY '"listeListeOuiNon":".*1.*"') AND NOT (body REGEXP BINARY '"listeListeOuiNon":"2"')
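The checks above can also be reproduced outside MySQL. What follows is only a minimal sketch in Python, assuming that `re.search` behaves like MySQL `REGEXP` for these simple patterns (both look for a match anywhere in the string unless anchored). The `matches` helper and the constant names are made up for illustration.

```python
import re

# The pattern from the question, and the anchored variant suggested above.
PATTERN = r'"listeListeOuiNon":".*1.*"'
ANCHORED = r'^"listeListeOuiNon":".*1.*"$'


def matches(pattern, text):
    """Return 1 or 0, mimicking the result of `text REGEXP BINARY pattern`."""
    return 1 if re.search(pattern, text) else 0


# Single-field bodies behave as in the three SELECT checks.
print(matches(PATTERN, '"listeListeOuiNon":"2"'))
print(matches(PATTERN, '"listeListeOuiNon":"1"'))
print(matches(PATTERN, '"listeListeOuiNon":"1,2"'))

# A body with a second field containing a 1 (the row from the comments).
multi = '"listeListeOuiNon":"2", "listeToto":"1"'
print(matches(PATTERN, multi))

# Anchoring with ^ and $ does not change this result, because .* can
# still span the quotes and the fields in between.
print(matches(ANCHORED, multi))
```

In this sketch the single-field bodies give 0, 1 and 1, while the two-field body matches both the plain and the anchored pattern, which is consistent with the unexpected rows coming from other fields in body.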


1 Comment

Thanks Jauzsika, but in fact my data is more complicated, like: {"bf_titre":"Veille partage","listeListeLogiciel":"BAZ","listeListeEtatDuBug":"CONF","listeListeBugs":"USER","listeListeOuiNon":"2","bf_sauveur":"Florian","id_typeannonce":"31","createur":"Anonyme","categorie_fiche":"Maintenance du site","date_creation_fiche":"2011-08-30 14:20:09","date_debut_validite_fiche":"2011-08-30","date_fin_validite_fiche":"0000-00-00","statut_fiche":"1","id_fiche":"VeillePartagee"} <-- and for this it doesn't work.
1

I would try replacing the two .* with [^"]*. That will only be sufficient, however, if your listeListeOuiNon value cannot contain literal "s; otherwise you would also have to handle the escape sequence. Basically, because . also matches the quote character, your pattern matches any JSON text that has a 1 anywhere after "listeListeOuiNon":", even if it's in another field. And yes, the greedy .* is what runs across the field boundaries.
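Here is a minimal sketch of the difference, written in Python on the assumption that its `re` module treats `.`, `*` and `[^"]` the same way MySQL's `REGEXP` does for these patterns. `GREEDY`, `BOUNDED` and `found` are illustrative names only.

```python
import re

# Original pattern: . also matches the quote character, so .* can leave the field.
GREEDY = r'"listeListeOuiNon":".*1.*"'
# Suggested pattern: [^"]* stops at the closing quote of the value.
BOUNDED = r'"listeListeOuiNon":"[^"]*1[^"]*"'


def found(pattern, text):
    """True if the pattern matches anywhere in text."""
    return re.search(pattern, text) is not None


# The problematic row: the only 1 is inside another field.
row = '"listeListeOuiNon":"2", "listeToto":"1"'
print(found(GREEDY, row))   # True
print(found(BOUNDED, row))  # False

# The values that should match still do.
for value in ("1", "1,2", "0,1"):
    print(value, found(BOUNDED, f'"listeListeOuiNon":"{value}"'))
```

Note that the bounded pattern still accepts any value containing the character 1, so a value such as "10" would match as well.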

1 Comment

Wow, it seems to work! SELECT '"listeListeOuiNon":"2", "listeToto":"1"' REGEXP BINARY '"listeListeOuiNon":"[^"]*1[^"]*"' returns 0! Thank you very much!
1

Returns 0.


1 Comment

My problem is that I have more in my JSON text: SELECT '"listeListeOuiNon":"2", "listeToto":"1"' REGEXP BINARY '"listeListeOuiNon":".*1.*"'
