preg_match to array. PHP - php

Calling all the PHP helpers out there.
So basically I would like to give the function preg_match a variable that can contain a couple thousand lines of code) and have it search using a wildcard + strings either side of the widlcard.
For example I would like to search for strings that look like this <a href="*.pdf">
I would then like the function to return every match (along with the html shiz around the wildcard, this is to catch any directory structures too) in an array that I can loop through using a foreach(){} loop.
I'm guessing this is possible, would anyone have the time to help me with this?
I've check through all the preg_match lit' and through the answers on here, but I can't seem to get the patterns correct. Thanks in advance.
Peace out.

unset($matches);
preg_match_all('/<a href="[^"]+\.pdf">/',$text,$matches);
foreach ($matches as $match)
{
$shiz = $match[0];
// Your code here ...
}

Related

Using preg_match_all to filter out strings containing this but not this

im having an issue with preg_match_all. I have this string:
$product_req = "ACTIVE-6,CATEGORY-ACTIVE-8,CATEGORY-ACTIVE-4,ACTIVE-9";
I need to get the numbers preceded by "ACTIVE-" but not by "CATEGORY-ACTIVE-", so in this case the result should be 6,9. I used the statement below:
preg_match_all("/ACTIVE-(\d+)/", $product_req, $this_act);
However this will return all the numbers because all of them are in fact preceded by "ACTIVE-" but thats not what i meant because i need to leave out those preceded by "CATEGORY-ACTIVE-". How can i configure preg_match_all to do it? Or maybe there is some other function that can do the job?
EDIT:
I tried this:
preg_match_all("/CATEGORY-ACTIVE-(\d+)/", $product_req, $this_cat_act);
preg_match_all("/ACTIVE-(\d+)/", $product_req, $this_act);
$act_cat = str_replace($this_cat_act[1],"",$this_act[1]);
it kinda works, but i guess there is a better and cleaner way to do it. Besides the output is kinda weird too.
Thank you.

Regex replace matched subexpression (and nothing else)?

I've used regex for ages but somehow I managed to never run into something like this.
I'm looking to do some bulk search/replace operations within a file where I need to replace some data within tag-like elements. For example, converting <DelayEvent>13A</DelayEvent> to just <DelayEvent>X</DelayEvent> where X might be different for each.
The current way I'm doing this is such:
$new_data = preg_replace('|<DelayEvent>(\w+)</DelayEvent>|', '<DelayEvent>X</DelayEvent>', $data);
I can shorten this a bit to:
$new_data = preg_replace('|(<DelayEvent>)(\w+)(</DelayEvent>)|', '${1}X${2}', $data);
But really all I want to do is simulate a "replace text between tags T with X".
Is there a way to do such a thing? In essence I'm trying to prevent having to match all the surrounding data and reassembling it later. I just want to replace a given matched sub-expression with something else.
Edit: The data is not XML, although it does what appear to be tag-like elements. I know better than parsing HTML and XML with RegEx. ;)
It is possible using lookarounds:
$new_data = preg_replace('|(?<=<DelayEvent>)\w+(?=</DelayEvent>)|', 'X', $data);
See it working online: ideone

PHP Regex for information between h4 tags

I am trying to grab what is the h4 text
$regex = '/<h4>([A-Za-z0-9\,\.])/';
I am just getting the first letter back, I cannot figure out how to use * to keep grabbing everything to the first < character.
I have made countless attempts and know I am overlooking something simple.
So I was making that much harder than I needed to, the following works:
$regex = '/<h4>.*?<\/h4>/';
If you can trust that grabbing all characters up to the first < is a good enough rule then use this:
$regex = '/<h4>([^<]*?)</';
Of course that definition will only grab 'The ' from <h4>The <b>Best</b> Book</h4> You can fix that be changing it to:
$regex = '/<h4>(.*?)<\/h4>/';
Which will grab everything between a <h4> and a </h4>, but still isn't perfect because anything like <h4 > or <h4 style="..."> will break it, along with a million other valid HTML examples. If you know that the contents won't have any < though, and you know your tag will always be exactly <h4> the first one works well enough for your situation.
If your situation is more complex you will want to use something like PHP's DOM extension (DOMDocument) which is meant for parsing HTML and XML, since neither are regular languages and cannot be parsed error free with regex.
You can use the below function to accomplish this task.
**function getTextBetweenTags($string, $tagname) {
$pattern = "/<$tagname ?.*>(.*)<\/$tagname>/";
preg_match($pattern, $string, $matches);
return $matches;
}**
In the first parameter you have to pass the complete string, and in the second parameter you have to pass the tagname ("h4")..

Strip All Urls From A Mixed String ( php )

i reposted this question because i didn't find a good answer.
i have a string which can contains text with urls.
i want a function to strip all urls from this string and just let the text.
by example the string can contains like this :
1) hey take a look here : http://xxx.xxx/545df5 this is nice!
2) hey take a look here : http://www.xxx.xxx/545df5 this is nice!
3) hey take a look here : xxx.xxx/545df5 this is nice!
4) hey take a look here : www.xxx.xxx/545df5 this is nice!
Thanks
Regular expression for URL and how to use regular expression with php should help you.
What you really need is a solid regex to find urls in a string and you can preg_replace that pattern with nothing. I can tell you though that tracking down a regex like that is not easy. Depending on the variations in the urls you're looking for (i.e. http:// vs https:// vs ftp://) You could run into real trouble trying to account for all that.
Here is a page that I found to be a good start though.
Regex is the way to go as was discussed prior. Finding one isn't that terribly hard (google: url regex pattern) One example returned is here
http://www.geekzilla.co.uk/View2D3B0109-C1B2-4B4E-BFFD-E8088CBC85FD.htm
I would also recommend you test your regex using one of the many fine online regex testers. My favorite (for non-java) is
http://www.regextester.com/
This function should do it(assuming your strings are seperated by space " "):
function isValidURL($url) {
return preg_match('|^http(s)?://[a-z0-9-]+(.[a-z0-9-]+)*(:[0-9]+)?(/.*)?$|i', $url);
}
function cleanUpUrls($urls) {
$urlArray = explode(' ',$urls);
$resultArray = array();
foreach ($urlArray as $url) {
if(!isValidURL($url)) {
$resultArray[] = $url;
}
}
return implode(' ',$resultArray);
}

PHP regex templating - find all occurrences of {{var}}

I need some help with creating a regex for my php script. Basically, I have an associative array containing my data, and I want to use preg_replace to replace some place-holders with real data. The input would be something like this:
<td>{{address}}</td><td>{{fixDate}}</td><td>{{measureDate}}</td><td>{{builder}}</td>
I don't want to use str_replace, because the array may hold many more items than I need.
If I understand correctly, preg_replace is able to take the text that it finds from the regex, and replace it with the value of that key in the array, e.g.
<td>{{address}}</td>
get replaced with the value of $replace['address']. Is this true, or did I misread the php docs?
If it is true, could someone please help show me a regex that will parse this for me (would appreciate it if you also explain how it works, since I am not very good with regexes yet).
Many thanks.
Use preg_replace_callback(). It's incredibly useful for this kind of thing.
$replace_values = array(
'test' => 'test two',
);
$result = preg_replace_callback('!\{\{(\w+)\}\}!', 'replace_value', $input);
function replace_value($matches) {
global $replace_values;
return $replace_values[$matches[1]];
}
Basically this says find all occurrences of {{...}} containing word characters and replace that value with the value from a lookup table (being the global $replace_values).
For well-formed HTML/XML parsing, consider using the Document Object Model (DOM) in conjunction with XPath. It's much more fun to use than regexes for that sort of thing.
To not have to use global variables and gracefully handle missing keys you can use
function render($template, $vars) {
return \preg_replace_callback("!{{\s*(?P<key>[a-zA-Z0-9_-]+?)\s*}}!", function($match) use($vars){
return isset($vars[$match["key"]]) ? $vars[$match["key"]] : $match[0];
}, $template);
}

Categories