problem with php regular expression - php

Hi, I have data in the format below:
<option value="http://www.torontoairportlimoflatrate.com/aurora-limousine-service.html">Aurora</option>
<option value="http://www.torontoairportlimoflatrate.com/alexandria-limousine-service.html">Alexandria</option>
After banging my head on the table ten times, I figured out that I could use the regular expression below:
preg_match_all("#>\w*#",$data,$result);
This returns the following results:
Array
(
[0] => Array
(
[0] => >Ajax
[1] => >
[2] => >Aurora
[3] => >
[4] => >Alexandria
[5] => >
[6] => >Alliston
I only want a single array containing the values, i.e.
cities
[0] => Ajax
[1] => Aurora
... and so on.
Please help.

If you'd prefer not to use an HTML parser, you can do it with a regex, but keep in mind that you'll probably need to modify it based on what you'll receive as input in the future. For your specific problem, this is a regex that does the job:
<?php
preg_match_all('/<option\svalue=\"([a-zA-Z0-9-_.\/:]+)\">([a-zA-Z\s]+)<\/option>/', $data, $result);
var_dump($result[2]);
Note:
If you want to match every URL, you should replace ([a-zA-Z0-9-_.\/:]+) with a more capable URL-matching regex. You can find several on Stack Overflow, but which one to use is a matter of personal taste.
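A smaller change to the pattern from the question may also be enough for this input. The original #>\w*# includes the ">" in each match and, because \w* can match zero characters, also matches the bare ">" that closes every </option> tag. The sketch below assumes the input looks exactly like the sample; it puts the city name in a capture group and requires at least one character:

```php
<?php
// Sample input copied from the question.
$data = '<option value="http://www.torontoairportlimoflatrate.com/aurora-limousine-service.html">Aurora</option>' . "\n"
      . '<option value="http://www.torontoairportlimoflatrate.com/alexandria-limousine-service.html">Alexandria</option>';

// Capture the text between ">" and "</option>".
// [^<]+ needs at least one character, so an empty ">" is never matched.
preg_match_all('#>([^<]+)</option>#', $data, $result);

// Group 1 holds only the captured names, without the leading ">".
$cities = $result[1];
print_r($cities);
```

Using [^<]+ rather than \w+ also allows city names that contain spaces or hyphens.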

Related

PHP Array Break Down/Explode

STARTING ARRAY
Array
(
[0] => Array
(
[0] => /searchnew.aspx?Make=Toyota&Model=Tundra&Trim=CrewMax+5.7L+V8+6-Spd+AT+SR5&st=Price+asc
[1] => 19
)
)
I have been struggling to break down this array for the past couple of days. I have found a few useful functions that extract the strings I need when a start and end point are defined; however, I can't see that being good for long-term use. Basically, I'm trying to take the string at [0] and extract the values following "Model=" and "Trim=", in the hope of ending up with an array like this:
Array
(
[0] => Array
(
[0] => Tundra ***model***
[1] => CrewMax+5.7L+V8+6-Spd+AT+SR5 ***trim***
[2] => 19
)
)
This information is fed to me through an API, so coming up with a dynamic solution is my biggest challenge. I realize this is a big question, but is there a better, less hacky way of approaching this problem?
parse_url() will get you the query string and parse_str() parses the variables from that:
$q = parse_url($array[0][0], PHP_URL_QUERY);
parse_str($q, $result);
print_r($result);
Yields:
Array
(
[Make] => Toyota
[Model] => Tundra
[Trim] => CrewMax 5.7L V8 6-Spd AT SR5
[st] => Price asc
)
Now just echo $result['Model'] etc...
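To get from there to the array layout asked for in the question, something like the following sketch could work. The variable names $rows, $url and $count are illustrative, and it assumes every row has the URL at index 0 and the number at index 1:

```php
<?php
// Input in the shape shown in the question.
$array = [
    ['/searchnew.aspx?Make=Toyota&Model=Tundra&Trim=CrewMax+5.7L+V8+6-Spd+AT+SR5&st=Price+asc', 19],
];

$rows = [];
foreach ($array as [$url, $count]) {
    // Pull the query string out of the URL and parse it into an array.
    parse_str((string) parse_url($url, PHP_URL_QUERY), $query);

    // Missing parameters become null instead of raising a notice.
    $rows[] = [$query['Model'] ?? null, $query['Trim'] ?? null, $count];
}

print_r($rows);
```

Note that parse_str() URL-decodes the values, so the "+" signs in Trim come back as spaces.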

preg_replace json string match same character beginning/end

OK, so what I have is a JSON string which can contain one or many elements. Below is an example of the string; the real string is much more complicated, but this one highlights the issues I'm having.
{"elements":[{"id":2,"string":"something","string2":"","string3":"no html here","integer":2,"array":{"options":[{"id":1,"value":"data"},{"id":2,"value":"more data"}]},"string4":"text with <a href=\"http:\/\/www.example.com\">html<\/a>","string5":"naughty <a href=\"http:\/\/www.example.com\">link<\/a>"},{"id":2,"string":"something","string2":"","string3":"no html here","integer":2,"array":{"options":[{"id":1,"value":"data"},{"id":2,"value":"more data"}]},"string4":"text with <a href=\"http:\/\/www.example.com\">html<\/a>","string5":"naughty <a href=\"http:\/\/www.example.com\">link<\/a>"}]}
What I'm trying to do is match all of the strings (the data type, not the name) in the JSON data and then, depending on whether HTML is allowed for that key (using a blacklist), strip out the HTML. I'm no regex expert, so I can't work out what's going wrong.
Here is my regex:
([{,]"(?!(elements|string3|string4)":)(.*?)":)(?!,")"(.*?)",
I'm having two issues with it:
It matches elements holding integers and arrays by simply jumping to the " found within the next string. I expected the match to fail and move on.
I can't get it to handle the \" in the URL, so I need the , at the end of the regex, but this then stops the next string from matching. I tried \G, but this seemed to have no effect; I have a feeling it starts after the , in the previous match. I also tried a number of solutions that were supposed to allow for escaped text, but these all failed to work in my case.
The thought was that this would be quicker than converting the JSON string into an object and then traversing the array of hundreds of elements to remove the HTML. If that's quicker, then I'll just do that; it'll be a whole lot easier.
Don't work on the json directly, decode it using json_decode().
Then cleanup your HTML using HTMLPurifier, which does a great job at cleaning HTML code.
Then encode your data to json again using json_encode().
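A minimal sketch of that decode, clean, encode round trip is below. To stay self-contained it uses PHP's built-in strip_tags() as a stand-in for HTMLPurifier, which is a much cruder tool, and the $keepHtml list and the shortened sample JSON are assumptions based on the question:

```php
<?php
// Shortened sample based on the JSON in the question.
$json = '{"elements":[{"id":2,"string3":"no html here",'
      . '"string4":"text with <a href=\"http:\/\/www.example.com\">html<\/a>",'
      . '"string5":"naughty <a href=\"http:\/\/www.example.com\">link<\/a>"}]}';

// Keys whose values may keep their HTML (assumed from the question's regex).
$keepHtml = ['string3', 'string4'];

// Decode to nested associative arrays.
$data = json_decode($json, true);

// Visit every leaf value; strip tags from strings unless the key is exempt.
array_walk_recursive($data, function (&$value, $key) use ($keepHtml) {
    if (is_string($value) && !in_array($key, $keepHtml, true)) {
        $value = strip_tags($value);
    }
});

// Encode the cleaned structure back to JSON.
echo json_encode($data);
```

Because the decoder handles escaped quotes and nesting, the two problems described in the question do not arise here.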
Description
There were several problems with your expression. For one, .*? keeps capturing characters until the next required character is matched, so it can run past the closing quote. I replaced it with [^"]*?, which matches only non-quote characters and therefore forces the capture to stop at the end of the quoted group.
I also made a capture group for the opening quote. This is probably overkill, but written as a character class, (["]), it lets you simply add a single quote to the class. I then refer back to this captured group later to ensure that the corresponding closing quote is also matched. This way, if the opening quote is not required in your input string, you can insert a question mark, (["])?, and the closing quote will be matched against whatever the opening quote was.
I also moved the [{,] outside the capture group.
This is my cleaned-up version of the regex:
[{,]((")(?!(elements|string3|string4)\2:)([^"]*?)\2:)(")([^"]*?)\5(?=,)
PHP Code Example:
<?php
$sourcestring="your source string";
preg_match_all('/[{,]((")(?!(elements|string3|string4)\2:)([^"]*?)\2:)(")([^"]*?)\5(?=,)/i',$sourcestring,$matches);
echo "<pre>".print_r($matches,true);
?>
$matches Array:
(
[0] => Array
(
[0] => ,"string0":"something0"
[1] => ,"string1":""
[2] => ,"string":"something"
[3] => ,"string5":""
)
[1] => Array
(
[0] => "string0":
[1] => "string1":
[2] => "string":
[3] => "string5":
)
[2] => Array
(
[0] => "
[1] => "
[2] => "
[3] => "
)
[3] => Array
(
[0] =>
[1] =>
[2] =>
[3] =>
)
[4] => Array
(
[0] => string0
[1] => string1
[2] => string
[3] => string5
)
[5] => Array
(
[0] => "
[1] => "
[2] => "
[3] => "
)
[6] => Array
(
[0] => something0
[1] =>
[2] => something
[3] =>
)
)

Get name from hashtag using regex

I have this string/content:
#Salome, #Jessi H and #O'Ren were playing at the #Lean's yard with "#Ziggy" the mouse.
Well, I am trying to get all the names marked above. I use the # symbol as a kind of hashtag on my website. Note that some names contain spaces, like #Jessi H, and some have characters before and after them, like "#Ziggy". I don't mind if you suggest another way of handling the hashtags so that this works correctly. I was thinking that users whose names contain whitespace could write the hashtag with quotes, like #"Jessi H". What do you think? Other examples:
#Lean's => #"Lean"'s
#Jessi H => #"Jessi H"
"#Jessi H" => (sorry, I don't know how to parse it)
#O'Ren => #"O'Ren"
What have I tried?
I'm just starting to use regex in PHP, but some SO questions have been useful in getting me started. These are my attempts so far, using the preg_match_all function:
Result of /#(.*?)[,\" ]/:
Array ( [0] => Salome [1] => Jessi [2] => Charlie [3] => Lean's [4] => Ziggy" ) )
Result of /#"(.*?)"/ for names like #"name":
Empty array
I don't expect you to do it all for me. Pseudo-code or something similar would be enough to point me in the right direction.
Try the following regex: '/#(?:"([^"]+)|([^\b]+?))\b/'
This will return two match groups: the first contains any quoted names (e.g. #"Jessi H" and #"O'Ren"), and the second contains any unquoted names (e.g. #Salome, #Lean).
$matches = array();
preg_match_all('/#(?:"([^"]+)|([^\b]+?))\b/', '#Salome, #"Jessi H" and #"O\'Ren" were playing at the #Lean\'s yard with "#Ziggy" the mouse.', $matches);
print_r($matches);
Output:
Array
(
[0] => Array
(
[0] => #Salome
[1] => #"Jessi H
[2] => #"O'Ren
[3] => #Lean
[4] => #Ziggy
)
[1] => Array
(
[0] =>
[1] => Jessi H
[2] => O'Ren
[3] =>
[4] =>
)
[2] => Array
(
[0] => Salome
[1] =>
[2] =>
[3] => Lean
[4] => Ziggy
)
)
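If a single flat list of names is more convenient than two half-empty groups, one possible variation (not the pattern above) is a branch-reset group, (?|...), which makes both alternatives write into group 1. This sketch assumes unquoted names consist only of word characters:

```php
<?php
$text = '#Salome, #"Jessi H" and #"O\'Ren" were playing at the #Lean\'s yard with "#Ziggy" the mouse.';

// (?|...) is a branch-reset group: each alternative numbers its capture
// group from 1, so quoted and unquoted names both end up in $m[1].
preg_match_all('/#(?|"([^"]+)"|(\w+))/', $text, $m);

$names = $m[1];
print_r($names);
```

Here the closing quote is part of the match, so the full matches in $m[0] are also balanced (for example #"Jessi H").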
Are these requirements fixed, or can you choose them? If you can set the requirements, I would suggest using _ instead of spaces, which would allow you to use the regex:
/#(\w+)/
If spaces must be allowed and you're going with quotes, then the quotes should probably span the entire name, allowing for this regex:
/#"([^"]+)"/

Searching an array for a string pattern and store it in a variable

Array
(
[0] => LAMPION
[1] => BANBU
[2] => DT-T300-FNS
[3] => T65
[4] => DT-299-FNS
[5] => T30
)
I have an array that looks like this. The data stored in it is not consistent, so I have to search the array for the pattern "xx-xxx-xxx" and store the match in a variable. Is there any way I can do that? I really need help.
Yes.
$matches = preg_grep('/^.{2}-.{3}-.{3}\z/', $array);
If you want only the first match, note that preg_grep() preserves the original array keys, so the result is not necessarily indexed from 0; use reset($matches) to get the first element.
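As a small sketch using the array from the question: preg_grep() keeps the keys of the input array, so the matching entry stays at its original index, and reset() is one way to fetch the first match regardless of its key. The variable name $first is illustrative:

```php
<?php
$array = ['LAMPION', 'BANBU', 'DT-T300-FNS', 'T65', 'DT-299-FNS', 'T30'];

// Keep only the entries shaped like xx-xxx-xxx.
// 'DT-T300-FNS' is rejected because its middle part has four characters.
$matches = preg_grep('/^.{2}-.{3}-.{3}\z/', $array);

// Keys are preserved, so the match sits at index 4, not 0.
// reset() returns the first element whatever its key is.
$first = $matches ? reset($matches) : null;

var_dump($first);
```

If the middle part may be three or four characters long, .{3,4} in the pattern would accept both codes.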

PHP REGEX separate by different criteria

I'm trying to separate some strings by different criteria but I can't get the desired results.
Here are 3 examples:
$ppl[0] = "Balko, Vlado \"Panelák\" (2008) {Byt na tretom (#1.55)}";
$ppl[1] = "'Abd Al-Hamid, Ja'far A Two Hour Delay (2001)";
$ppl[2] = "'t Hoen, Frans De reünie (1963) (TV)";
I'm currently using this for the last 2:
$pattern = '#,|\t|\(#';
But I get an empty element in the result.
result:
Array ( [0] => 'Abd Al-Hamid [1] => Ja'far [2] => A Two Hour Delay [3] => 2001) )
Array ( [0] => 't Hoen [1] => Frans [2] => [3] => De reünie [4] => 1963) [5] => TV) )
As for the 1st expression I used another pattern but I still get empty spaces. Any ideas?
EDIT:
Thanks, this did help. I tried using a modified version on the first string:
$pattern4 = '#[",\t]+|[{}]+|[()]+#';
However I still get an empty space:
Array ( [0] => Balko [1] => Vlado [2] => Panelák [3] => [4] => 2008 [5] => [6] => Byt na tretom [7] => #1.55 [8] => [9] => )
What should I do? I think that the " and the brackets are causing the problem but I don't know how to fix it.
I would surmise you have two tabs as separator in your second and third example string. (Can't see that here, the SO editor converts them into spaces).
But you could adapt your regex slightly in that case:
$pattern = '#,|\t+|\(#';
Or simpler even:
$pattern = '#[,\t(]+#';
And the alternative, by the way, would be just applying array_filter() to the result arrays to remove the empty entries.
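Another option is to let preg_split() discard the empty pieces itself through its PREG_SPLIT_NO_EMPTY flag. The sketch below uses the original pattern from the question; the position of the tabs in the sample string is an assumption, since they are not visible in the post:

```php
<?php
// Third example from the question; two tabs are assumed after the name.
$line = "'t Hoen, Frans\t\tDe reünie (1963) (TV)";

// PREG_SPLIT_NO_EMPTY drops the empty string between the two tabs.
$parts = preg_split('#,|\t|\(#', $line, -1, PREG_SPLIT_NO_EMPTY);

// Trim surrounding spaces and the leftover closing parentheses.
$parts = array_map(function ($part) {
    return trim($part, ' )');
}, $parts);

print_r($parts);
```

The trim step is optional; it only tidies up the stray ")" that the pattern leaves at the end of the year and the "TV" marker.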
