Where's the bug in my regex? - php

I am trying to build a regex that extracts the values of pseudo-XML tags (enclosed in {} instead of <>), and it doesn't work. I have verified it with RegexBuddy, my favourite regex tool, which captured correctly, but when I use it in my PHP code, I do not get a result.
So, without further ado, here's the problem:
$match=array();
$ret=preg_match('\{lang\s*=\s*[\"\']*?(.*?)[\"\']*?\s*/\}',"{lang='DE'/}xxxxlxlxlxl",$match);
Why is $match empty?

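The pattern is missing its delimiters, and once / is used as the delimiter, the / inside the pattern has to be escaped. The pattern should be
/\{lang\s*=\s*[\"\']*?(.*?)[\"\']*?\s*\/\}/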
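To see the corrected pattern at work, here is a minimal sketch that runs it against the string from the question. Without delimiters, preg_match() typically emits a warning about the delimiter and returns false, which is why $match stayed empty. The second pattern ($strict) is my own hypothetical variant, not part of the answer: because the quote classes are lazy (*?), the opening quote appears to slip into the capture group, and a greedy optional quote avoids that.

```php
<?php
// Subject string from the question.
$subject = "{lang='DE'/}xxxxlxlxlxl";

// Corrected pattern: wrapped in / delimiters, inner slash escaped as \/.
$pattern = '/\{lang\s*=\s*[\"\']*?(.*?)[\"\']*?\s*\/\}/';
$found = preg_match($pattern, $subject, $match);

// Hypothetical variant (an assumption, not from the answer): the quotes are
// matched with a greedy "?" so the capture group holds only the bare value.
$strict = '/\{lang\s*=\s*["\']?(.*?)["\']?\s*\/\}/';
preg_match($strict, $subject, $strictMatch);

echo $match[0], "\n";        // the whole pseudo-tag
echo $strictMatch[1], "\n";  // the value between the quotes
```

If only the value is needed, the capture group (index 1) is the part to read; index 0 always holds the full match.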

Related

Find all hashtags in string using preg_match_all

I'm having problems figuring out the right regex pattern for the search preg_match_all("THIS PART", $my_string). I need to find all hashtags in my string with the word after the hashtag included as well.
So, these strings should be found by the mentioned function:
Input
#hi im like typing text right here hihih #asdasdasdasd #
Result
#hi
#asdasdasdasd
Input
#asd#asd xd so fun lol #lol
Result
#asd#asd would be two separate matches (#asd and #asd), and #lol would be matched as well.
I hope the question made sense and thanks beforehand!
This should work:
/#(?<hash>[^\s#]+)/
It searches for # and then creates a named group called hash; it stops matching when it reaches another # or any whitespace character (\s). Note that PHP has no g modifier: preg_match_all already returns every match.
You can use preg_match_all:
preg_match_all('/(?<!\w)#\w+/', $your_string, $allMatches);
It returns every hashtag together with the word that follows the #. Hope it helps.
print_r($allMatches);
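The two suggested patterns do not behave identically, so a small comparison may help. The sketch below runs both against the inputs from the question; find_hashtags is a hypothetical helper name introduced only for this illustration.

```php
<?php
// Hypothetical helper: returns every full match of $pattern in $text.
function find_hashtags(string $pattern, string $text): array
{
    preg_match_all($pattern, $text, $matches);
    return $matches[0];
}

// Pattern 1: "#" followed by characters that are neither whitespace nor "#".
$byClass = '/#(?<hash>[^\s#]+)/';
// Pattern 2: "#" not preceded by a word character, followed by word characters.
$byLookbehind = '/(?<!\w)#\w+/';

$inputs = [
    '#hi im like typing text right here hihih #asdasdasdasd #',
    '#asd#asd xd so fun lol #lol',
];

foreach ($inputs as $input) {
    echo implode(' ', find_hashtags($byClass, $input)), "\n";
    echo implode(' ', find_hashtags($byLookbehind, $input)), "\n";
}
```

On the second input the lookbehind pattern skips the hashtag glued to the end of #asd, because that # is preceded by a word character; the first pattern treats it as a separate match, which is what the question asks for.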

preg match all a tags with the same class from file_get_contents

I want to get all the a tags with the same class from an HTML file.
I have tried:
$html = file_get_contents('http://10tv.nana10.co.il/Category/?CategoryID=400008');
preg_match_all('/<a\s+class="FooterNavigationItemValue">(.*)<\/a>/', $html, $div_array);
return var_dump($div_array);
but I get an empty array. Help?
As Marc B commented, using DOM will be your best bet. But since you are looking for regex:
'#<a.*?class="FooterNavigationItemValue".*?>(.*?)</a>#s'
P.S. I looked into the site mentioned in the code and this piece of regex does its job perfectly.
Now the explanation:
The two .*? before and after class="FooterNavigationItemValue" make sure that the string still matches if the tag has other attributes before or after class="FooterNavigationItemValue". The s modifier lets . match newlines, so a tag that spans several lines still matches.
I used (.*?) instead of (.*) to prevent regex greediness. More info can be found here: What do lazy and greedy mean in the context of regular expressions?
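The difference between the two patterns can be sketched on a small inline snippet. The markup in $html is a hypothetical stand-in for the downloaded page (the real markup may differ); it only assumes that some links carry attributes other than class.

```php
<?php
// Hypothetical stand-in for the fetched page; not the site's real markup.
$html = '<a href="/one" class="FooterNavigationItemValue" id="n1">One</a>' . "\n"
      . '<a class="FooterNavigationItemValue">Two</a>';

// Pattern from the question: only matches when class is the sole attribute.
preg_match_all('/<a\s+class="FooterNavigationItemValue">(.*)<\/a>/', $html, $strict);

// Pattern from the answer: lazy .*? tolerates attributes around class.
preg_match_all('#<a.*?class="FooterNavigationItemValue".*?>(.*?)</a>#s', $html, $loose);

print_r($strict[1]); // link texts found by the question's pattern
print_r($loose[1]);  // link texts found by the answer's pattern
```

In this sample the question's pattern misses the first link because href comes before class, which may be similar to what happened on the real page.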

PHP Regex : several stopping characters with Positive lookbehind

Hi stackoverflow community !
I'm trying to use a simple regex in PHP based on a positive lookbehind. My objective is to extract everything in a URL between a domain name and one of a set of specific characters (?, & or /). I want to extract "bar" in these examples:
foo.com/bar?
foo.com/bar&
foo.com/bar/
I tried
(?<=foo\.com\/)[^/?&]+
It works fine on the test platform,
but not with preg_match in PHP 5.3.x: the error suggests that I can't use several stopping characters, while it works with one.
I also tried a combination of positive lookbehind/lookahead, but the issue remains the same.
What did I do wrong?
In PHP, unlike (say) JavaScript, you can't use the regex-delimiter without escaping it, even inside a character class. So, you need to change this:
"/(?<=foo\.com\/)[^/?&]+/"
to this:
"/(?<=foo\.com\/)[^\/?&]+/"
Escape the slashes, including the one inside the character class [^\/?&]:
preg_match("/(?<=foo\.com\/)[^\/?&]+/", "http://www.foo.com/bar?", $result);
Or use another delimiter, so that no slash needs escaping:
preg_match("#(?<=foo\.com/)[^/?&]+#", "http://www.foo.com/bar?", $result);
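As a quick check, both fixes can be run against the three example URLs from the question. This is a minimal sketch; the variable names are my own.

```php
<?php
$urls = [
    'http://www.foo.com/bar?',
    'http://www.foo.com/bar&',
    'http://www.foo.com/bar/',
];

$results = [];
foreach ($urls as $url) {
    // Fix 1: keep / as the delimiter and escape every slash in the pattern.
    preg_match('/(?<=foo\.com\/)[^\/?&]+/', $url, $escaped);
    // Fix 2: switch to # as the delimiter, so slashes need no escaping.
    preg_match('#(?<=foo\.com/)[^/?&]+#', $url, $hashDelimited);

    $results[] = [$escaped[0], $hashDelimited[0]];
    echo $url, ' => ', $escaped[0], ' / ', $hashDelimited[0], "\n";
}
```

Switching the delimiter tends to be the more readable option for URL patterns, since they usually contain several slashes.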

Convert PHP RegEx to JavaScript RegEx

I have a PHP regular expression I'm using to get the YouTube video code out of a URL.
I'd love to match this with a client-side regular expression in JavaScript. Can anyone tell me how to convert the following PHP regex to JavaScript?
preg_match("#(?<=v=)[a-zA-Z0-9-]+(?=&)|(?<=v\/)[^&\n]+(?=\?)|(?<=embed/)[^&\n]+|(?<=v=)[^&\n]+|(?<=youtu.be/)[^&\n]+#", $url, $matches);
Much appreciated, thanks!
I think the only problem is getting rid of the lookbehind assertions (?<=...): they were not supported in JavaScript before ES2018, so older engines reject them.
Their advantage is that they let you require that something precedes the match without including it in the match.
So you need to remove them, which means changing (?<=v=)[a-zA-Z0-9-]+(?=&) to v=[a-zA-Z0-9-]+(?=&), but now your match starts with "v=".
If you just need to validate and don't need the matched part, then that's fine and you are done.
But if you need the part after v=, put the needed pattern into a capturing group instead and continue working with the captured values:
v=([a-zA-Z0-9-]+)(?=&)
You will then find the matched substrings in the capture groups: $1 for the first group, $2 for the second, $3 ...
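The rewrite from a lookbehind to a capturing group can be sketched on the PHP side first, to confirm that both forms yield the same video code before porting the pattern. The URL is a hypothetical example, and only the first alternative of the original pattern is shown.

```php
<?php
// Hypothetical URL, used only for illustration.
$url = 'https://www.youtube.com/watch?v=abc123-XY&feature=share';

// Original style: the lookbehind keeps "v=" out of the match.
preg_match('#(?<=v=)[a-zA-Z0-9-]+(?=&)#', $url, $withLookbehind);

// Portable style: "v=" is part of the match, the code sits in group 1.
preg_match('#v=([a-zA-Z0-9-]+)(?=&)#', $url, $withGroup);

echo $withLookbehind[0], "\n"; // full match, lookbehind version
echo $withGroup[0], "\n";      // full match, group version (includes "v=")
echo $withGroup[1], "\n";      // captured group, group version
```

In JavaScript the captured value would be read from index 1 of the array returned by match or exec, in the same way as $withGroup[1] here.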
You can replace your lookbehind assertion using the approach from this post:
Javascript: negative lookbehind equivalent?

Problem using regex to remove number formatting in PHP

I'm having this issue with a regular expression in PHP that I can't seem to crack. I've spent hours searching to find out how to get it to work, but nothing seems to have the desired effect.
I have a file that contains lines similar to the one below:
Total','"127','004"','"118','116"','"129','754"','"126','184"','"129','778"','"128','341"','"127','477"','0','0','0','0','0','0
These lines are inserted into INSERT queries. The problem is that values like "127','004" are actually supposed to be 127,004, or without any formatting: 127004. The latter is the actual value I need to insert into the database table, so I figured I'd use preg_replace() to detect values like "127','004" and replace them with 127004.
I played around with a Regular Expression designer and found that I could use the following to get my desired results:
Regular Expression
"(\d+)','(\d{3})"
Replace Expression
$1$2
The line on the top of this post would end up like this: (which is what I am after)
Total','127004','118116','129754','126184','129778','128341','127477','0','0','0','0','0','0
This, however, does not work in PHP. Nothing is being replaced at all.
The code I am using is:
$line = preg_replace("\"(\d+)','(\d{3})\"", '$1$2', $line);
Any help would be greatly appreciated!
There are no delimiters in your regex. Delimiters are required in order for PHP to know what is the pattern to match and what is a pattern modifier (e.g. i - case-insensitive, U - ungreedy, ...). Use a character that doesn't occur in your pattern, typically you'll see a slash '/' used.
Try this:
$line = preg_replace("/\"(\d+)','(\d{3})\"/", '$1$2', $line);
You forgot to wrap your regular expression in forward slashes. Try this instead:
"/\"(\d+)','(\d{3})\"/"
Use preg_replace("#\"(\d+)','(\d+)\"#", '$1$2', $s); instead of yours.
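A minimal sketch of the delimited pattern applied to a shortened version of the line from the question (the sample is cut down for readability; the variable names are my own):

```php
<?php
// Shortened sample of the line from the question, kept verbatim via nowdoc.
$line = <<<'TXT'
Total','"127','004"','"118','116"','0','0
TXT;

// Same pattern as in the answers, written in single quotes:
// "digits','three digits" is replaced by the two digit groups joined.
$clean = preg_replace('/"(\d+)\',\'(\d{3})"/', '$1$2', $line);

echo $clean, "\n";
```

Note that this pattern only handles numbers with a single thousands separator; values of a million or more would need a different approach.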
