Simple Regex NOT on multidimensional JSON string - php

Here is a simple example of a JSON string that covers most of my actual cases:
"time":1430702635,\"id\":\"45.33\",\"state\":2,"stamp":14.30702635,
I'm trying to use preg_replace to enclose the numbers in this string in quotes, except for numbers whose key is already quoted with escaped quotes, like \"state\":2 in my string.
My regex so far is:
preg_replace('/(?!(\\\"))(\:)([0-9\.]+)(\,)/', '$2"$3"$4',$string);
The resulting string I'm trying to obtain leaves the \"state\" value unquoted. The regex should skip it because \" comes right before the :digit part:
"time":"1430702635",\"id\":\"45.33\",\"state\":2,"stamp":"14.30702635",
Why is the \"state\" number replaced as well?
I also tried it on https://regex101.com/r/xI1zI4/1.
New edit:
From what I tried,
(?!\\")
is not working.
If that's allowed, I'll leave this question open in case someone else knows why.
My workaround was to match what I do want instead of excluding what I don't:
$string2 = preg_replace('/(\w":)([0-9\.]+)(,)/', '$1"$2"$3',$string);
Thank you.

(?!\\") is a negative lookahead, which is rarely useful at the very beginning of a regular expression. In your regex it has no effect at all: (?!(\\\"))(\:) means "a position not followed by backslash-quote, then a colon". A position that is followed by a colon can never be followed by backslash-quote, so this is equivalent to matching a colon by itself.
I think what you were trying to write is a negative lookbehind, which has a slightly different syntax in PCRE: (?<!\\"). With this change the regex seems to match what you want: https://regex101.com/r/xI1zI4/2
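As a minimal sketch of the lookbehind fix, assuming the sample string from the question, the pattern can be tried like this. The nowdoc delimiters (TXT) and variable names are illustrative choices, and the pattern is simplified to a single capture group:

```php
<?php
// Sample input from the question. A nowdoc keeps every backslash literal.
$string = <<<'TXT'
"time":1430702635,\"id\":\"45.33\",\"state\":2,"stamp":14.30702635,
TXT;

// (?<!\\") is a negative lookbehind: the colon must NOT be preceded by \".
// The pattern is also a nowdoc so PHP does not touch its backslashes.
$pattern = <<<'TXT'
/(?<!\\"):([0-9.]+),/
TXT;

$result = preg_replace($pattern, ':"$1",', $string);
echo $result, PHP_EOL;
```

Only the values of "time" and "stamp" get quoted; the pairs whose key ends in \" are left alone, which is the target string given in the question.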

Related

PHP regex to parse a string of the form {string}\{string}

As per the title, I need to parse a string of the form string_1\string_2, that is, a string followed by a backslash and then another string, with the following requirements:
if string_1 and string_2 are present, break them into two tokens: string_1 and \string_2
if only string_1 is present, return it
if \string_2 is present but nothing behind the backslash, don't match anything.
So far I've come up with this :
^([\w\s]*)((?!\\\).*)
but the last character in string_1 keeps 'leaking' through and going to string_2 right before the backslash.
Is there a way to fix that? Or any other alternative regex?
The following regex does help with the leaking, but it breaks the third requirement.
^([\w\s]*).((?!\\\).*)
In order to make sure this question is not too localized, note that this could help parse a subset of latex when you have a string coming before say \section{section title comes here {*}}.
I think this is the regex you're looking for:
/^([^\\]+)(\\.+)?/
The first group matches one or more non-backslash characters. The optional second group matches a backslash followed by everything after it.
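A small sketch of how this pattern behaves against the three requirements. The helper name splitOnBackslash is hypothetical and only used for illustration:

```php
<?php
// Hypothetical helper: returns the tokens, or null when nothing matches.
function splitOnBackslash(string $input): ?array
{
    // '\\\\' in a single-quoted PHP string reaches PCRE as \\ (one literal backslash).
    if (preg_match('/^([^\\\\]+)(\\\\.+)?/', $input, $m) !== 1) {
        return null;
    }
    return array_slice($m, 1); // drop the full match, keep the captured groups
}

var_dump(splitOnBackslash('string_1\string_2')); // two tokens: string_1 and \string_2
var_dump(splitOnBackslash('string_1'));          // one token: string_1
var_dump(splitOnBackslash('\string_2'));         // NULL, nothing before the backslash
```

Because the first group requires at least one non-backslash character at the start of the string, an input that begins with a backslash fails to match at all.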

How would I write a regular expression to check for a string of text surrounded by equal signs?

How would I use regular expressions to check for characters within the following string of text:
=== logo ===
I tried to use a regex tester but couldn't come up with the correct expression. I've tried this:
/^[=]{3}$/
I want to search within a string and find where the text starts with 3 equal signs,
then find a string or any other characters between the equal signs,
then find 3 more equal signs ending the expression.
Thanks in advance.
Try using this regex:
/===[^=]+===/
If you want to capture the text, surround it in parentheses:
/===([^=]+)===/
Here's the fiddle: http://jsfiddle.net/jufXA/
If you might have equal signs in your text (but fewer than 3 in a row, obviously), you should instead match everything lazily (which is a tad slower):
/===(.+?)===/
Here's the fiddle: http://jsfiddle.net/jufXA/1/
How about as simple as...
/===(.+?)===/
For example:
$test = "here's ===something special===, like ===this=one===";
preg_match_all('/===(.+?)===/', $test, $matches);
var_dump($matches[1]);
Laziness is kind of a virtue here: the regex engine won't advance past the first closing delimiter ===. Without the ?, however, you need to use negated character classes (but then again, what about ===something=like=this===?).
I prefer:
/([=]{3})\s*(.+?)\s*\1/
This puts the text markup (three equal signs) in the beginning and then just uses a back reference for the end. It also trims your text of spaces, which is what you probably want.
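To compare the approaches from these answers, here is a short sketch. The sample text is made up for illustration: it has one padded marker and one marker containing a single equal sign.

```php
<?php
// Hypothetical sample text.
$text = 'intro === logo === and ===this=one===';

// Negated class: cannot cross any "=", and keeps the surrounding spaces.
preg_match_all('/===([^=]+)===/', $text, $negated);

// Lazy match with a backreference (\1) for the closing marker; \s* trims the padding.
preg_match_all('/(={3})\s*(.+?)\s*\1/', $text, $lazy);

var_dump($negated[1]); // only " logo " (spaces included)
var_dump($lazy[2]);    // "logo" and "this=one"
```

The negated class misses ===this=one=== entirely, while the lazy version finds both and returns the text already trimmed.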

PHP PregEX for detecting a variant

I am trying to create a regular expression that will detect the following syntax in a string:
OPEN-BRACKET > ANYTHING > PLUS-OR-MINUS > CLOSE-BRACKET
Example String: NB###-#####-#####-###[#+]
Please note that the expression could be anywhere in the string and have multiple occurrences.
I have tried [(.+)(\+|-)], which doesn't work in PHP the way I expected, but does work on rubular.com.
What would the expression be to return the string *and* whether it was a plus or minus?
I'd suggest the pattern:
"(\[(.+)(\+|-)\])"
The outer parentheses capture the whole group. The backslashes escape the [ and ] characters and the + character, which otherwise (when unescaped) have special meanings in regular expressions.
Maybe .+ consumes everything because it is greedy by default? What happens if you anchor the pattern using ^\[(.+)(\+|-)\]$?
If you cannot anchor the pattern because of multiple occurrences, try using a lookahead. And if "ANYTHING" really can be anything, how do you distinguish an ANYTHING ending in +] from a PLUS-OR-MINUS > CLOSE-BRACKET?
If neither plus nor minus is allowed in ANYTHING, go for \[([^+-]+)(\+|-)\].
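A minimal sketch of that last pattern with preg_match_all, returning both the bracket content and the sign. The first marker is the example from the question; the second one (XY[abc-]) is a made-up extra occurrence:

```php
<?php
$input = 'NB###-#####-#####-###[#+] and XY[abc-]';

// PREG_SET_ORDER groups the captures per occurrence.
preg_match_all('/\[([^+-]+)(\+|-)\]/', $input, $matches, PREG_SET_ORDER);

foreach ($matches as $m) {
    // $m[1] is ANYTHING, $m[2] is the plus or minus sign
    echo $m[1], ' => ', $m[2], PHP_EOL;
}
```

Excluding + and - from the first group means the engine cannot swallow the sign, so greediness is no longer a concern even with several occurrences in one string.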
Thanks for the help. I managed to figure out that /\[([\$|a-zA-Z]+)(\+|-)\]/ works as I intended.

Regular expression doesn't quite work

I have created a regular expression (using PHP), shown below, which must match ALL terms within the given string that contain only a-z0-9, ., _ and -.
My expression is: '~(?:\(|\s{0,},\s{0,})([a-z0-9._-]+)(?:\s{0,},\s{0,}|\))$~i'.
My target string is: ('word', word.2, a_word, another-word).
Expected terms in the results are: word.2, a_word, another-word.
I am currently getting: another-word.
My Goal
I am detecting a MySQL function from my target string, this works fine. I then want all of the fields from within that target string. It's for my own ORM.
I suppose there could be a situation whereby further parentheses are included inside this expression.
From what I can tell, you have a list of comma-separated terms and wish to find only the ones which satisfy [a-z0-9._\-]+. If so, this should be correct (it returns the correct results for your example at least):
'~(?<=[,(])\\s*([a-z0-9._-]+)\\s*(?=[,)])~i'
The main issues were:
The $ at the end anchored the pattern to the end of the string, so only the last term could match.
When matching all, each search continues from the end of the previous match. If one match consumes the comma or closing parenthesis at its end, that character is no longer there to match at the beginning of the next one. I've solved this with a lookbehind ((?<=...)) and a lookahead ((?=...)), which check those characters without consuming them.
The backslashes are doubled because PHP processes backslash escapes in the string literal before the regex engine sees the pattern.
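The points above can be checked with a short sketch, using the target string from the question:

```php
<?php
$target = "('word', word.2, a_word, another-word)";

// The lookbehind and lookahead test the separators without consuming them,
// so each comma stays available for the next match.
$pattern = '~(?<=[,(])\\s*([a-z0-9._-]+)\\s*(?=[,)])~i';

preg_match_all($pattern, $target, $m);
print_r($m[1]); // word.2, a_word, another-word
```

The quoted 'word' is skipped because the apostrophe is not in the character class, so no term can start right after the opening parenthesis.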
EDIT: Since you said in a comment that some of the terms may be strings that contain commas you will first want to run your input through this:
$input = preg_replace('~(\'([^\']+|(?<=\\\\)\')+\'|"([^"]+|(?<=\\\\)")+")~', '"STRING"', $input);
which should replace every quoted string with "STRING", so that the other regex can then be applied safely.
Maybe using a regex is overkill. For this kind of text you can just remove the parentheses and explode the string by comma.
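A rough sketch of that non-regex route, assuming the target string from the question. It does not handle quoted strings that contain commas or nested parentheses, and the final filter still uses a small regex to keep only the allowed characters:

```php
<?php
$target = "('word', word.2, a_word, another-word)";

// Strip the outer parentheses, split on commas, and trim each piece.
$parts = array_map('trim', explode(',', trim($target, '()')));

// Keep only terms made of a-z, 0-9, ".", "_" and "-" (this drops the quoted 'word').
$fields = array_values(array_filter(
    $parts,
    fn (string $p): bool => preg_match('/^[a-z0-9._-]+$/i', $p) === 1
));

print_r($fields); // word.2, a_word, another-word
```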

PHP Regex: match text urls until space or end of string

This is the text sample:
$text = "asd dasjfd fdsfsd http://11111.com/asdasd/?s=423%423%2F gfsdf http://22222.com/asdasd/?s=423%423%2F
asdfggasd http://3333333.com/asdasd/?s=423%423%2F";
This is my regex pattern:
preg_match_all( "#http:\/\/(.*?)[\s|\n]#is", $text, $m );
That matches the first two URLs, but how do I match the last one? I tried adding [\s|\n|$], but that also only matches the first two URLs.
Don't try to match \n (there's no line break after all!) and instead use $ (which will match to the end of the string).
Edit:
I'd love to hear why my initial idea doesn't work, so in case you know it, let me know. I'd guess because [] tries to match one character, while end of line isn't one? :)
This one will work:
preg_match_all('#http://(\S+)#is', $text, $m);
Note that you don't have to escape the / due to them not being the delimiting character, but you'd have to escape the \ as you're using double quotes (so the string is parsed). Instead I used single quotes for this.
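For reference, a minimal sketch that runs this pattern against the sample text from the question, written here as one double-quoted string with an explicit \n for the line break:

```php
<?php
// Sample text from the question, including the line break after the second URL.
$text = "asd dasjfd fdsfsd http://11111.com/asdasd/?s=423%423%2F gfsdf http://22222.com/asdasd/?s=423%423%2F\nasdfggasd http://3333333.com/asdasd/?s=423%423%2F";

// \S+ takes every non-whitespace character, so it stops at a space,
// a newline, or the end of the string.
preg_match_all('#http://(\S+)#i', $text, $m);
print_r($m[1]);
```

All three URLs are captured, including the last one, because \S+ does not require any character to follow the URL.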
I'm not familiar with PHP, so I don't have the exact syntax, but maybe this will give you something to try. The [] denotes a character class, so |$ inside it will literally look for a | or a $. I think what you need is a lookahead, so something like this:
#http:\/\/(.*?)(?=\s|$)#
I apologize if this is way off, but maybe it will give you another angle to try.
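To illustrate the difference between the character class and a lookahead, here is a small sketch. The URLs are shortened, made-up stand-ins for the ones in the question:

```php
<?php
// Hypothetical sample: three URLs, the last one at the very end of the string.
$text = "a http://11111.com/x b http://22222.com/y\nc http://3333333.com/z";

// Inside [...] the characters | and $ are literals, so this still needs a real
// character after the URL and misses the last one.
preg_match_all('#http://(.*?)[\s|$]#s', $text, $class);

// A lookahead can test for "whitespace or end of string" without consuming anything.
preg_match_all('#http://(.*?)(?=\s|$)#s', $text, $lookahead);

echo count($class[1]), ' vs ', count($lookahead[1]), PHP_EOL; // 2 vs 3
```

Outside a character class, $ is an assertion rather than a character, which is why it works in the lookahead alternation but not inside the brackets.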
See What is the best regular expression to check if a string is a valid URL?
It has some very long regular expressions that will match all urls.
