regex to explode a string json into values - php

I am trying to build a PHP regex that validates this type of input string:
{name:'something name here',type:'',id:''},{name:'other name',type:'small',id:34},{name:'orange',type:'weight',id:28}
etc...
So it is a list of JSON-like objects that each contain 3 fields: name, type, id. The name field is always present, whereas type and id can both be empty strings ( '' ). If the string has a valid format, I can then explode it by comma and obtain an array of JSON strings.
How can I do this?
UPDATE
It isn't valid JSON, as you say, but I have an input field where the user enters tags, and I want to track the name, type and id of each tag.
Example:
tag1 (has name, type, id), tag2 (has only name), tag3 (has name, type, id).
So I think I can post a string in this format:
{'name':'test','type':'first','id':3},{'name':'other','type':'second','id':45}, etc
But I must validate this string with a regex. I can do
$data = explode(',',$list);
and then I do:
foreach($data as $d){
$tmp = json_decode($d);
if($tmp == false) echo 'error invalid data';
}

As Gubo pointed out, this is not a valid JSON-encoded string. If the actual data you want to process in your script is valid, however, you're barking up the wrong tree looking for a regular expression: PHP has built-in JSON functions that will parse such strings much faster than a regular expression.
$string1 = "{name:'something name here',type:'',id:''},{name:'othername',type:'small',id:34},{name:'orange',type:'weight',id:28}";
$string2 = '[{"name":"something name here","type":"","id":""},{"name":"othername","type":"small","id":"34"},{"name":"orange","type":"weight","id":"28"}]';
Here $string2 is the data in valid JSON format. If your data is a valid JSON string, the following code will suffice:
$parsed = json_decode($string2, true);
// $parsed[0]['name'] returns 'something name here'
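If, however, you're dealing with invalid JSON strings, things get a bit more complicated. First off: if only the enclosing square brackets are missing, and your object properties (or array keys, as they will become in PHP) and string values are already double-quoted, a quick fix would be this:
// Works only for quoted keys and values; $string1 exactly as shown above would still decode to null
$parsed = json_decode('['.$string1.']');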
If you really want to parse them separately:
$separated= preg_split('/(?<=[\}]),/',$string1);
But I can't see why you would want to do that. The biggest issue here is the absence of quotes on the property strings (or keys). I have put together a regex (untested) that could quote those strings:
$parsed = json_decode('['.preg_replace('/(?<=[{,])([a-z]+)(?=:)/', '"$1"', str_replace('\'', '"', $string1)).']');
Keep in mind that the last regex is untested, so it might not behave as you expect, but it should help you on your way. The same rules apply to it as to the other examples: if the quotes and brackets are there, just use json_decode; if the brackets are missing, add them too.
It's getting rather late here, so I'm off to bed now. I hope this answer isn't packed with typos and sentences that nobody can understand. If it is, I do apologize.

You don't need a regex for that. Just use this:
var_dump(json_decode($json, true));
See: http://us.php.net/manual/en/function.json-decode.php
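A minimal sketch of validating the decoded list, assuming the client posts valid JSON with double-quoted keys. The helper name parseTags is hypothetical; the rules (name required, type and id possibly empty) are taken from the question.

```php
<?php
// Hypothetical helper: returns the decoded tags, or null if the input is invalid.
function parseTags(string $json): ?array
{
    $tags = json_decode($json, true);
    if (!is_array($tags)) {
        return null; // not valid JSON, or not a list
    }
    foreach ($tags as $tag) {
        // Each entry must carry the keys name, type and id.
        if (!is_array($tag) || !isset($tag['name'], $tag['type'], $tag['id'])) {
            return null;
        }
        // name is mandatory; type and id may be empty strings.
        if (!is_string($tag['name']) || $tag['name'] === '') {
            return null;
        }
    }
    return $tags;
}

$json = '[{"name":"test","type":"first","id":3},{"name":"other","type":"","id":""}]';
var_dump(parseTags($json) !== null);
```

Decoding first and checking the resulting array keeps the validation rules in plain PHP, so no regex is needed.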

Related

How to find certain text within a php variable and then replace all text between characters either side

I have a variable in PHP, coming from a form, that contains email addresses separated by commas (,).
For example:
user#domain1.com,user#domain2.com,user3#domain2.com,user2#domain4.com
What I am trying to achieve is to look at the variable, find for example #domain2.com, and remove everything between the commas on either side of that email.
I know I can use str_replace to replace just the text I'm after, like so:
$emails=str_replace("#domain2.com", "", "$emailscomma");
However, I'm looking to remove the entire email address based on the text I ask it to find.
So in this example I want to remove user#domain2.com and user3#domain2.com.
Is this possible?
I searched for articles but couldn't find anything that finds some text and then replaces more than just the text it finds.
You can of course use regular expressions, but I would suggest a somewhat easier way. Operating on arrays is much easier than on strings and substrings, so I would convert your string to an array and then filter it.
$emails = "user#domain1.com,user#domain2.com,user3#domain2.com,user2#domain4.com";
// Convert to array (by comma separator)
$emailsArray = explode(',', $emails);
$filteredArray = array_filter($emailsArray, function($email) {
// filters out all emails with '#domain2.com' substring
return strpos($email, '#domain2.com') === false;
});
print_r($filteredArray);
Now you can convert the filtered array back to a string with the implode() function.
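Putting the steps together, a short sketch of the full round trip (explode, filter, implode), using the sample addresses from the question. The variable names $kept and $result are illustrative.

```php
<?php
$emails = "user#domain1.com,user#domain2.com,user3#domain2.com,user2#domain4.com";

// Split on commas, then drop every address containing '#domain2.com'
$kept = array_filter(explode(',', $emails), function ($email) {
    return strpos($email, '#domain2.com') === false;
});

// Join the remaining addresses back into a comma-separated string
$result = implode(',', $kept);
echo $result, PHP_EOL; // user#domain1.com,user2#domain4.com
```

array_filter() preserves the original keys, which does not matter here because implode() ignores them.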

php regular expression breaks

I have the following string in an HTML page.
BookSelector.load([{"index":25,"label":"Science","booktype":"pdf","payload":"<script type=\"text\/javascript\" charset=\"utf-8\" src=\"\/\/www.192.168.10.85\/libs\/js\/books.min.js\" publisher_id=\"890\"><\/script>"}]);
I want to find the src and the publisher_id in the string.
For this I am trying the following code:
$regex = '#\BookSelector.load\(.*?src=\"(.*?)\"}]\)#s';
preg_match($regex, $html, $matches);
$match = $matches[1];
But it always returns null.
What would be my regex to select the src only?
What would be my regex if I need to parse the whole string between BookSelector.load( and );?
Why isn't your regex working?
First, I'll answer why your regex isn't working:
You're using \B in your regex. The \B at the start of \BookSelector is not a literal B: it asserts a position that is not a word boundary (the opposite of \b), so the pattern no longer says what you meant. The B needs to be a plain character, with no backslash.
Your original text contains escaped quotes, but your regex doesn't account for those.
The correct approach to solve this problem
Split this task into several parts, and solve them one by one, using the best tool available for each.
The data you need is encapsulated within a JSON structure. So the first step is obviously to extract the JSON content. For this purpose, you can use a regex.
Once you have the JSON content, you need to decode it to get the data in it. PHP has a built-in function for that purpose: json_decode(). Use it with the input string and set the second parameter as true, and you'll have a nice associative array.
Once you have the associative array, you can easily get the payload string, which contains the <script> tag contents.
If you're absolutely sure that the order of attributes will always be the same, you can use a regex to extract the required information. If not, it's better to use an HTML parser such as PHP's DOMDocument to do this.
The whole code for this looks like:
// Extract the JSON string from the whole block of text
if (preg_match('/BookSelector\.load\((.*?)\);/s', $text, $matches)) {
    // Get the JSON string and decode it using json_decode()
    $json = $matches[1];
    $content = json_decode($json, true)[0]['payload'];
    // Use DOMDocument to load the string, and get the required values
    $dom = new DOMDocument;
    $dom->loadHTML($content);
    $script_tag = $dom->getElementsByTagName('script')->item(0);
    $script_src = $script_tag->getAttribute('src');
    $publisher_id = $script_tag->getAttribute('publisher_id');
    var_dump($script_src, $publisher_id);
}
Output:
string(40) "//www.192.168.10.85/libs/js/books.min.js"
string(3) "890"
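For the case mentioned above where the attribute order is known to be fixed, a regex on the decoded payload could look like the following sketch. It assumes src always precedes publisher_id inside the same tag and that both values are double-quoted; the variable names are illustrative.

```php
<?php
// The payload as it looks after json_decode() has removed the JSON escaping
$payload = '<script type="text/javascript" charset="utf-8" '
         . 'src="//www.192.168.10.85/libs/js/books.min.js" publisher_id="890"></script>';

$src = null;
$publisherId = null;
// Capture src first, then publisher_id later in the same tag
if (preg_match('/\bsrc="([^"]*)"[^>]*\bpublisher_id="([^"]*)"/', $payload, $m)) {
    $src = $m[1];
    $publisherId = $m[2];
}
var_dump($src, $publisherId);
```

If the attributes can appear in any order, this pattern silently fails to match, which is why the DOMDocument route is the safer default.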

PHP Regex pattern to match magic search keywords

Ok regex experts. I'm having a ton of trouble trying to make a regex pattern for my needs.
The goal:
Take a search query such as "good food type:post format:gallery" and parse the type or format or both from the string.
This is what I wrote, but it doesn't work unless both type and format are present and type comes before format. Ideally, either type or format (or both) could be present.
$query = "Great food type:post format:gallery";
preg_match('/(.*?(?<=\btype:)(?P<type>[a-z]*\w+))(.*?(?<=\bformat:)(?P<format>[a-z]*\w+))/', $query, $matches);
I imagine I need the returned $matches to be named as well, right?
Thanks,
I don't think you'll want to use a regex for this. It'll be a pain to maintain and update when you add more operators like type: and format:, and the regex then also depends on the order in which they are entered.
A simple approach might look like this:
$flags = array();
$searchtokens = array();
$tokens = explode(" ", $searchString);
foreach ($tokens as $token) {
    if (preg_match('~^([^:]+):(.*)$~', $token, $flagMatch)) {
        // Operator such as type:post or format:gallery
        $flags[$flagMatch[1]] = $flagMatch[2];
    } else {
        // Plain search word
        $searchtokens[] = $token;
    }
}
The obvious caveat with that example is that it explodes straight on spaces, so you wouldn't be able to handle "quoted terms" that should be treated as one token.
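One possible way around that caveat, sketched under the assumption that phrases are wrapped in plain double quotes and operators look like word:value. The function name parseSearchQuery and the returned array layout are hypothetical.

```php
<?php
// Hypothetical helper: splits a query into operator flags and search terms.
function parseSearchQuery(string $query): array
{
    $flags = [];
    $terms = [];
    // Match either a double-quoted phrase or a run of non-space characters
    preg_match_all('/"([^"]*)"|(\S+)/', $query, $matches, PREG_SET_ORDER);
    foreach ($matches as $m) {
        if (isset($m[2]) && $m[2] !== '') {
            // Unquoted token: either an operator or a plain word
            if (preg_match('/^([a-z]+):(.+)$/', $m[2], $flag)) {
                $flags[$flag[1]] = $flag[2];
            } else {
                $terms[] = $m[2];
            }
        } elseif ($m[1] !== '') {
            // Quoted phrase, kept as a single term
            $terms[] = $m[1];
        }
    }
    return ['flags' => $flags, 'terms' => $terms];
}

print_r(parseSearchQuery('"good food" type:post format:gallery'));
```

The order of the operators no longer matters, and either one can be omitted.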

Regex replace matched subexpression (and nothing else)?

I've used regex for ages but somehow I managed to never run into something like this.
I'm looking to do some bulk search/replace operations within a file where I need to replace some data within tag-like elements. For example, converting <DelayEvent>13A</DelayEvent> to just <DelayEvent>X</DelayEvent> where X might be different for each.
This is how I currently do it:
$new_data = preg_replace('|<DelayEvent>(\w+)</DelayEvent>|', '<DelayEvent>X</DelayEvent>', $data);
I can shorten this a bit to:
$new_data = preg_replace('|(<DelayEvent>)(\w+)(</DelayEvent>)|', '${1}X${3}', $data);
But really all I want to do is simulate a "replace text between tags T with X".
Is there a way to do such a thing? In essence I'm trying to prevent having to match all the surrounding data and reassembling it later. I just want to replace a given matched sub-expression with something else.
Edit: The data is not XML, although it does have what appear to be tag-like elements. I know better than to parse HTML and XML with RegEx. ;)
It is possible using lookarounds:
$new_data = preg_replace('|(?<=<DelayEvent>)\w+(?=</DelayEvent>)|', 'X', $data);
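Since the question mentions that X might differ for each element, here is a sketch combining the same lookarounds with preg_replace_callback(). The $map lookup table and the sample data are made up for illustration.

```php
<?php
$data = "<DelayEvent>13A</DelayEvent>\n<DelayEvent>7B</DelayEvent>";

// Hypothetical mapping from old values to their replacements
$map = ['13A' => 'X1', '7B' => 'X2'];

$new_data = preg_replace_callback(
    '|(?<=<DelayEvent>)\w+(?=</DelayEvent>)|',
    function (array $m) use ($map) {
        // $m[0] is only the text between the tags; unknown values are left unchanged
        return $map[$m[0]] ?? $m[0];
    },
    $data
);
echo $new_data, PHP_EOL;
```

Because the tags are matched by lookarounds, they are never part of the match and do not have to be reassembled in the callback.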
See it working online: ideone

php string manipulation nonrandom sort

I am trying to rearrange a 4-character string that's being fed in by a user into a different order. For example, they might type "abcd", which I then turn into "bcad".
Here is an example of my attempt which is not working :P
<?php
$mixedDate = $_REQUEST['userDate'];
$formatted_date = firstSubString($mixedDate,2).secondSubString($mixedDate,3).thirdSubString($mixedDate,1).fourthSubString($mixedDate,4);
//... maybe some other stuff here then echo formatted_date
?>
any help would be appreciated.
Copied from comment:
You could pretty simply do this by doing something like:
$formatted_date = $mixedDate[1].$mixedDate[2].$mixedDate[0].$mixedDate[3];
That way, you don't have to bother with calling a substring method many times, since you're just moving individual characters around.
<?php
$mixedDate = $_REQUEST['userDate'];
$formatted_date = $mixedDate[1].$mixedDate[2].$mixedDate[0].$mixedDate[3];
echo $formatted_date;
?>
The square-bracket offset syntax gets just that one character from your string. (The older curly-brace form, $mixedDate{1}, was deprecated in PHP 7.4 and removed in PHP 8.0.)
Note that this works correctly on your sample string: if $_REQUEST['userDate'] is abcd, it is turned into bcad.
Look into explode() in PHP (or preg_split() if the delimiter needs to be a regex; the old split() function was removed in PHP 7). It takes a delimiter and a string, then splits the string into an array. Either force the user to use a certain format or use a regex on the input string to put the date into a known format, like dd/mm/yyyy or dd-mm-yyyy, then use the hyphen or / as the delimiter.
Once the string is split into an array, you can rearrange it any way you like.
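A minimal sketch of that split-and-rearrange idea, using explode() as the splitting function and assuming the input is always dd-mm-yyyy with hyphens. The sample value is invented.

```php
<?php
$mixedDate = '21-12-2010'; // assumed to be dd-mm-yyyy

// Split into the three parts, then reassemble as mm-dd-yyyy
list($day, $month, $year) = explode('-', $mixedDate);
$formatted_date = $month . '-' . $day . '-' . $year;

echo $formatted_date, PHP_EOL; // 12-21-2010
```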
That is very simple.
If
$mixedDate = '21-12-2010';
then try this:
echo substr($mixedDate, 3, 2).'-'.substr($mixedDate, 0, 2).'-'.substr($mixedDate, 6);
This will result in
12-21-2010
This assumes the format is fixed.
Use str_split() to break the string into single characters:
$char_array = str_split($input_string);
If you know exactly what order you want, and you only have four characters, then from here you can actually just do it the way you wanted from your question, and concatenate the array elements back into a single string, like so:
$output_string = $char_array[1].$char_array[2].$char_array[0].$char_array[3];
If your needs are more complex, you can sort and implode the string:
Use sort() to put the characters into order:
sort($char_array);
Or use one of the other related sorting functions that PHP provides if you need a different sort order. If you need a sort order that is specific to your requirements, you can use usort(), which lets you write a function that defines how the sorting works.
Then re-join the characters into a single string using implode():
$output_string = implode($char_array);
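As an illustration of the usort() route, here is a sketch with an invented ordering rule (vowels first, then alphabetical). Both the rule and the sample input are assumptions, not something from the question.

```php
<?php
$input_string = 'bead';
$char_array = str_split($input_string);

// Hypothetical custom order: vowels before consonants, alphabetical within each group
usort($char_array, function ($a, $b) {
    $aVowel = strpos('aeiou', $a) !== false;
    $bVowel = strpos('aeiou', $b) !== false;
    if ($aVowel !== $bVowel) {
        return $aVowel ? -1 : 1;
    }
    return strcmp($a, $b);
});

$output_string = implode($char_array);
echo $output_string, PHP_EOL; // aebd
```

The comparison function only has to return a negative, zero or positive integer, so any ordering rule can be plugged in the same way.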
Hope that helps.
