PHP preg_replace cuts off my $subject string - php

I was working on this project of mine when I encountered the following problem. I have a link which goes to:
file.php?page=1&color=all&pos=all&nat=all&mine=all&tree=all
Now, I wanted to change the color to 'gold', so I looked around on Google and found the PHP function preg_replace(). I implemented it in my code like this:
$pre='?page=1&color=all&pos=all&nat=all&mine=all&tree=all';
preg_replace('/color=(.*)&/', 'color=gold&', $pre);
For some reason my output is ?page=1&color=gold&tree=all so it seems that it cut off the middle of the string somehow.
This is the link I expect as my output: ?page=1&color=gold&pos=all&nat=all&mine=all&tree=all
Can anybody tell me what it is I'm doing wrong? Thanks!

Regular expression quantifiers are greedy by default. Your pattern says "find color=" and then "take as much as you can, as long as a & still follows", so (.*) runs all the way to the last & in the string. What you want is "take as much as you can as long as it is not a &". That would be:
preg_replace('/color=[^&]*/','color=gold',$pre);
The [^&] means "any character except &". Also, you aren't using the captured group, so you don't need the parentheses.
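To see the difference side by side, here is a minimal sketch that runs both patterns against the query string from the question. The variable names $greedy and $fixed are only for illustration. Note that preg_replace() returns the new string rather than modifying $pre, so the result has to be assigned.

```php
<?php
$pre = '?page=1&color=all&pos=all&nat=all&mine=all&tree=all';

// Greedy: (.*) runs to the LAST "&", swallowing pos, nat and mine.
$greedy = preg_replace('/color=(.*)&/', 'color=gold&', $pre);

// Negated class: [^&]* stops at the first "&" after "color=".
$fixed = preg_replace('/color=[^&]*/', 'color=gold', $pre);

echo $greedy, "\n"; // ?page=1&color=gold&tree=all
echo $fixed, "\n";  // ?page=1&color=gold&pos=all&nat=all&mine=all&tree=all
```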

Related

How to exclude part of the text via regex in PHP

I'm struggling with a regex. I need a pattern for preg_match: if the string contains "\n" that is not part of "\c\n", do something.
I have a sentence "Thank you for tuning in to I-See News.\c\nJames Movesworth reporting.Today, let’s learn about\nQuick Attack." I tried something, but it's not working at all.
preg_match('/(?!\\c)(\\n)/', $string);
Thank you!
Edit: My previous topic was closed because it's similar to "Regular expression for a string containing one word but not another". Perhaps it is, but even if I modify my pattern according to the suggested topic,
^(?!.*\\c).*(\\n).*$
it still doesn't give the correct answer. In this case it returns false, but it should be true, because there is a "\n" before "Quick Attack" in the sentence.
Thanks to The fourth bird, Wiktor, and bobble bubble, this works:
preg_match('/(?<!\\\c)\\\n/', $string);
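A small sketch of why the negative lookbehind works, assuming the text contains the literal two-character sequences \c and \n (a backslash followed by a letter), not real newline characters. The shortened test strings and variable names are made up for illustration. The pattern is written with four backslashes per literal backslash, which in a single-quoted PHP string produces the same regex as the three-backslash form above.

```php
<?php
// In a single-quoted PHP string, '\\\\' becomes the regex \\ (one literal backslash).
// Resulting regex: (?<!\\c)\\n  => a literal "\n" not preceded by a literal "\c".
$pattern = '/(?<!\\\\c)\\\\n/';

// Single-quoted subjects, so \c and \n stay as backslash + letter.
$withBareN   = 'News.\c\nJames reporting.Today\nQuick Attack.';
$onlyEscaped = 'News.\c\nJames reporting.';

$foundBare    = preg_match($pattern, $withBareN);   // 1: "\n" before "Quick" has no "\c" in front
$foundEscaped = preg_match($pattern, $onlyEscaped); // 0: the only "\n" follows "\c"

var_dump($foundBare, $foundEscaped);
```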

PHP: How would I remove parts of a string between 2 chunks of characters without removing too much?

This problem is driving me nuts. Let's say I have a string:
This is a &start;pretty bad&end; string that I want to &start;somehow&end; display differently
I want to be able to remove the &start; and &end; parts as well as everything in between so it says:
This is a string that I want to display differently
I tried using preg_replace with a regular expression but it took off too much, ie:
This is a display differently
The question is: how do I remove the stuff just between sets of &start; and &end; pairs and make sure that it doesn't remove anything between any &end; and &start; segments?
Keep in mind, I'm working with hundreds of strings that are very different to each other so I'm looking for a flexible solution that'll work with all of them.
Thanks in advance for any help with this.
Edit: Replaced dollar signs with ampersands. Oops!
Try this regex: /&start;(.+?)&end;/g
It looks like it works as desired: https://regex101.com/r/MW5nom/2
I quickly tried it in the Chrome console using JS; you will need to convert it to PHP:
"This is a &start;pretty bad&end; string that I want to &start;somehow&end; display differently".replace(/&start;(.+?)&end;/g, "")
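A possible PHP version of the same idea, as a sketch rather than a drop-in solution. PHP has no g flag: preg_replace() replaces every match by default. The lazy .*? stops at the first &end; after each &start;, and the trailing \s* is an addition (not in the answer above) that swallows the space left behind so the result has no double spaces.

```php
<?php
$text = 'This is a &start;pretty bad&end; string that I want to &start;somehow&end; display differently';

// Lazy match between each &start;/&end; pair; \s* removes the whitespace after the pair.
// Add the "s" modifier if a pair can span several lines.
$clean = preg_replace('/&start;.*?&end;\s*/', '', $text);

echo $clean, "\n"; // This is a string that I want to display differently
```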

Make links clickable with preg_replace

I want to make my links automatically clickable, but it doesn't work.
Here's my code:
$val['message'] = preg_replace('#https?://(w{3}.)?([a-zA-Z0-9_-]{1,20}(.[a-zA-Z0-9_-]{1,10}))(/[a-zA-Z0-9_-]{1,12}(/[a-zA-Z0-9_-]{1,12}))?(/([a-zA-Z0-9_-]{1,20})(.[a-zA-Z0-9_-]{1,7}))?(\?[a-zA-Z0-9_-]{1,7}=[a-zA-Z0-9_-]{1,7}(&[a-zA-Z0-9_-]{1,7}=[a-zA-Z0-9_-]{1,7}))?#is', '$0', $val['message']);
(here is my preg thing, but with lines:)
'https?://
(w{3}.)?
([a-zA-Z0-9_-]{1,20}(.[a-zA-Z0-9_-]{1,10}))
(/[a-zA-Z0-9_-]{1,12}(/[a-zA-Z0-9_-]{1,12}))?
(/([a-zA-Z0-9_-]{1,20})
(.[a-zA-Z0-9_-]{1,7}))?
(\?[a-zA-Z0-9_-]{1,7}=[a-zA-Z0-9_-]{1,7}
(&[a-zA-Z0-9_-]{1,7}=[a-zA-Z0-9_-]{1,7}))?
I also tried this:
$val['message'] = preg_replace("#(([\w]+?://[\w#$%&~.-;:=,?#[]+])(/[\w#$%&~/.-;:=,?#[]+])?)#is", "$1", $val['message']);
but it doesn't work with links like https://www.youtube.com/watch?v=videolink
Try this regex, worked for me:
(http|https|ftp|ftps)\:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(\/\S*)?
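For illustration, here is one way that pattern could be plugged into preg_replace(). The linkify() helper name and the choice to HTML-escape the text first are assumptions of this sketch, not part of the answer.

```php
<?php
// Hypothetical helper; the pattern is the one suggested above, wrapped in # delimiters.
function linkify(string $text): string
{
    // Escape first so the surrounding text is safe to output as HTML.
    $escaped = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
    $pattern = '#(http|https|ftp|ftps)\:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(\/\S*)?#';

    // $0 is the whole matched URL.
    return preg_replace($pattern, '<a href="$0">$0</a>', $escaped);
}

$html = linkify('Watch https://www.youtube.com/watch?v=videolink now');
echo $html, "\n";
```

Because \S* takes everything up to the next whitespace, a URL at the end of a sentence will also pick up the trailing full stop or closing bracket, which is one of the edge cases a dedicated library handles better.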
Why does everyone like to try to make their own regex for this? Linkifying links is hard work with lots of edge cases, not to mention what should or shouldn't be included in the link, e.g.
Are you talking about youtube.com?
I like the ASP.net language
I wonder what www.stackoverflow.com counts as a link
Parentheses are a particular pain in the butt (example: http://example.com/?auth=gH;2($Hd)DA0;QAb)
Aside: in the last line above, StackOverflow's preview section links everything until the last closing bracket, but after submission it only links up to the first punctuation mark bracket. Helps prove my point about how hard this is to get right and consistent though!
Best to use something established, example:
https://github.com/misd-service-development/php-linkify
For something a bit more quick n dirty:
http://buildinternet.com/2010/05/how-to-automatically-linkify-text-with-php-regular-expressions/

preg_replace multiple times in same string

I have some text and I want to do something like Wiki markup, creating links with [[]] and so on.
I am using this preg_replace to do that, and it seems to work:
<?=preg_replace("/\{\{([^\*]+)\|([^\*]+)\|([^\*]+)\}\}/", "<a href='$1.php#$2'>$3</a>", $conditions['pattern']); ?>
The problem is that when I have this text "can[not] build at %{{types|location|location}}% %{{some|other|stuff}}%" it outputs this:
can[not] build at %stuff%
It's like only the last one gets replaced, but wrong.
Any idea? Thanks
Fixed!
I changed the regular expression to /\{\{([a-zA-Z]+)\|([a-zA-Z]+)\|([a-zA-Z ]+)\}\}/ and now it works :D
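The original pattern failed because [^\*]+ is greedy and happily matches |, { and }, so the first group ran across both {{...}} blocks and only the last two fields were left for the other groups. As an alternative sketch (not the poster's fix), excluding the delimiters themselves keeps each group inside a single block while still allowing digits and punctuation:

```php
<?php
$text = 'can[not] build at %{{types|location|location}}% %{{some|other|stuff}}%';

// [^|{}]+ cannot cross a "|" or a brace, so each match stays inside one {{...}} block.
$html = preg_replace(
    '/\{\{([^|{}]+)\|([^|{}]+)\|([^|{}]+)\}\}/',
    '<a href="$1.php#$2">$3</a>',
    $text
);

echo $html, "\n";
// can[not] build at %<a href="types.php#location">location</a>% %<a href="some.php#other">stuff</a>%
```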

Scrape a price off a website

I'm trying to scrape a price from a web page using PHP and Regexes. The price will be in the format £123.12 or $123.12 (i.e., pounds or dollars).
I'm loading up the contents using libcurl. The output of which is then going into preg_match_all. So it looks a bit like this:
$contents = curl_exec($curl);
preg_match_all('/(?:\$|£)[0-9]+(?:\.[0-9]{2})?/', $contents, $matches);
So far so simple. The problem is, PHP isn't matching anything at all - even when there are prices on the page. I've narrowed it down to there being a problem with the '£' character - PHP doesn't seem to like it.
I think this might be a charset issue. But whatever I do, I can't seem to get PHP to match it! Anyone have any ideas?
(Edit: I should note if I try using the Regex Test Tool using the same regex and page content, it works fine)
Have you tried putting a \ in front of the £?
preg_match_all('/(\$|\£)[0-9]+(\.[0-9]{2})/', $contents, $matches);
I tried this expression in .NET with \£ and it works. I just edited it and removed some ":".
Read my comment about the possibility of Curl giving you bad encoding (comment of this post).
Maybe the pound sign appears as its HTML entity? I think you should try your regexp with some sort of testing program (i.e. match it against fixed text locally).
I'd change my regexp like this: '/(?:\$|£)\d+(?:\.\d{2})?/'
This should work for simple values.
'#(?:\$|\£|\€)(\d+(?:\.\d+)?)#'
This will not work with thousands separators, as in 234,343 and 34,454.45.
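Pulling the encoding and HTML-entity suggestions together, here is a hedged sketch. The extract_prices() helper is hypothetical, and the fallback assumes that a page which is not valid UTF-8 is ISO-8859-1; in practice the charset should come from the response headers or the page's meta tag. The pattern uses the u modifier and \x{A3} for £, so it does not depend on the encoding of the PHP source file.

```php
<?php
// Hypothetical helper: returns prices such as £123.12 or $123.12 found in $contents.
function extract_prices(string $contents): array
{
    // preg_match('//u', ...) fails on invalid UTF-8; ASSUME ISO-8859-1 in that case.
    if (preg_match('//u', $contents) !== 1) {
        $contents = iconv('ISO-8859-1', 'UTF-8', $contents);
    }

    // Turn entities such as &pound; into the real character.
    $contents = html_entity_decode($contents, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // /u = treat pattern and subject as UTF-8; \x{A3} is the pound sign.
    preg_match_all('/[$\x{A3}][0-9]+(?:\.[0-9]{2})?/u', $contents, $matches);

    return $matches[0];
}

$fromLatin1 = extract_prices("Was \xA3123.12, now " . '$99'); // ISO-8859-1 bytes
$fromEntity = extract_prices('Only &pound;9.99 today');
print_r($fromLatin1);
print_r($fromEntity);
```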
