php replace problem - php

I have some URLs, like 11162_sport.html, 11451_sport.html, 11245_sport.html or 231sport.html.
When a URL looks like XXXXX_sport.html, I want to turn it into XXXXX_football.html (11162_football.html, 11451_football.html, 11245_football.html), while 231sport.html stays unchanged.
How do I replace them? With $newurl = preg_replace("_sport.html","_football.html",$url)? Thanks.

Simply do $newurl = str_replace("_sport.html", "_football.html", $url);
This is faster than doing a preg_replace() and more accurate.
See the manual on str_replace.

You can use str_replace for such a simple replacement.

If it must be regular expressions, do:
preg_replace('/_sport\.html$/', '_football.html', $url);
str_replace() would indiscriminately replace all occurrences of _sport.html, whereas a regular expression with an end-of-line marker ($) will only replace the pattern at the end of the URL.
The dot needs to be escaped because it would match any character (except new-lines).
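As a rough illustration of the difference described above, here is a minimal sketch run against the URLs from the question. The helper name sport_to_football() and the last sample string are made up for this example.

```php
<?php
// Hypothetical helper name; wraps the anchored regex from the answer above.
function sport_to_football(string $url): string
{
    // \. matches a literal dot; $ anchors the match at the end of the string.
    return preg_replace('/_sport\.html$/', '_football.html', $url);
}

$urls = ['11162_sport.html', '11451_sport.html', '231sport.html'];
foreach ($urls as $url) {
    echo $url, ' => ', sport_to_football($url), PHP_EOL;
}

// str_replace() replaces every occurrence, wherever it appears in the string.
echo str_replace('_sport.html', '_football.html', 'a_sport.html/11162_sport.html'), PHP_EOL;
```

231sport.html is left alone by both approaches because it has no underscore before "sport".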

Related

How to use a variable as a pattern along with the other patterns in preg_match() function?

Actually I'm writing a web crawler for my mini project.
I want to crawl only the web pages that belong to the input website. For now, the crawler should not follow links to any site other than the one given as input.
This is what I'm doing:
$url = $_POST["url"];
$web = @file_get_contents($url);
preg_match_all("/<a\s.*href=\"(.*)\"/U", $web, $matches);
What I want to do is:
$url = $_POST["url"];
$web = @file_get_contents($url);
preg_match_all("/<a\s.*href=\"(.*$url.*)\"/U", $web, $matches);
for example:
Input: https://www.google.com/
then the regular expression should be :
preg_match("/.*google.com.*/U", xyz, xyz);
Any other suggestions will be helpful, thanks in advance.
Change your delimiters to something that is not in any of your URLs?
preg_match_all("#<a\s.*href=\"(.*$url.*)\"#U", $web, $matches);
edit
Probably better to escape the $url with preg_quote
I found the solution.
If you want to use a variable along with the regular expression:
preg_match("/regular_expression".($my_variable)."regular_expression/U", $source, $matches);
The real solution is to use preg_quote with the actual regex delimiter and to append the result to the literal parts of the regex with the dot syntax:
preg_match_all("/<a\s.*href=\"(.*" . preg_quote($url, "/") . ".*)\"/U", $web, $matches);
The dots are like + in some other languages used for string concatenation, and preg_quote will make sure all special regex metacharacters in the variable string are properly escaped.
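To make the preg_quote() idea concrete, here is a minimal sketch. The helper name links_containing() and the sample markup are assumptions for illustration, and the pattern is a variation on the one above: it uses # as the delimiter and [^"]* instead of .* with the /U modifier, so a match cannot run past the closing quote of the href.

```php
<?php
// Hypothetical helper: collects href values that contain $base literally.
function links_containing(string $html, string $base): array
{
    // preg_quote() escapes regex metacharacters in $base; the second
    // argument makes sure the delimiter (# here) is escaped as well.
    $pattern = '#<a\s[^>]*href="([^"]*' . preg_quote($base, '#') . '[^"]*)"#i';
    preg_match_all($pattern, $html, $matches);
    return $matches[1];
}

// Made-up sample markup for illustration.
$html = '<a href="https://www.google.com/maps">Maps</a> '
      . '<a href="https://example.org/">Other</a>';

print_r(links_containing($html, 'google.com'));
```

Because the dot in google.com is escaped, a host such as "googleXcom" would not be picked up by accident.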

Trim URL by variable

Hi I have a Link like this:
mypage.php?product=3&page=1
I want to delete the &page=1, &page=2, &page=5 and so.
I have tried this, but I think it is not right.
str_replace('/(\\?|&)page=.*?(&|$)/', '', $link);
Thanks for your help.
str_replace() doesn't work with regular expressions, so you'd use preg_replace() instead:
$url = preg_replace('/[?&]page=[^&]+/', '', $url);
Two changes here. First, a character class is better than alternation when you only need to match a single character (not having to escape ? inside the brackets is a nice bonus). Second, the [^&]+ construct ('match one or more non-& characters') is more direct and readable than .*?(&|$).
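A minimal sketch of that replacement applied to the link from the question; the helper name strip_page() and the second sample URL are made up for illustration.

```php
<?php
// Hypothetical helper around the preg_replace() call from the answer above.
function strip_page(string $url): string
{
    // [?&] matches either separator; [^&]+ consumes the value up to the next &.
    return preg_replace('/[?&]page=[^&]+/', '', $url);
}

echo strip_page('mypage.php?product=3&page=1'), PHP_EOL;
echo strip_page('mypage.php?product=3&page=12&sort=asc'), PHP_EOL;
```

One caveat with this pattern: if page is the first parameter and others follow (mypage.php?page=1&product=3), the leading ? is removed along with it, so that case would need separate handling.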

Regular Expression to remove underscore and string

I need to use PHP regex to remove '_normal' from the end of this url.
http://a0.twimg.com/profile_images/3707137637/8b020cf4023476238704a9fc40cdf445_normal.jpeg
so that it becomes
http://a0.twimg.com/profile_images/3707137637/8b020cf4023476238704a9fc40cdf445.jpeg.
I tried
$prof_img = preg_replace('_normal', '', $prof_img);
but the underscore seems to be throwing things off.
As others have stated, str_replace is probably the best option for this simple example.
The problem with your specific code is that your regex string is undelimited; you need to do this instead:
$prof_img = preg_replace('/_normal/', '', $prof_img);
See PCRE regex syntax for a reference.
The underscore is treated as a normal character in PCRE and isn't throwing things off.
If you require that only _normal at the end of the filename is matched, you can use:
$prof_img = preg_replace('/_normal(\.[^\.]+)$/', '$1', $prof_img);
See preg_replace for more information on how this works.
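To show why the anchored variant can matter, here is a minimal sketch. The helper name strip_normal_suffix() and the second URL are made up for illustration.

```php
<?php
// Hypothetical helper: drops "_normal" only when it directly precedes the extension.
function strip_normal_suffix(string $url): string
{
    // $1 puts the captured extension (e.g. ".jpeg") back into the result.
    return preg_replace('/_normal(\.[^.]+)$/', '$1', $url);
}

$url = 'http://a0.twimg.com/profile_images/3707137637/8b020cf4023476238704a9fc40cdf445_normal.jpeg';
echo strip_normal_suffix($url), PHP_EOL;

// Made-up URL: "_normal" in the directory name is left alone by the anchored
// pattern, whereas str_replace() removes both occurrences.
$tricky = 'http://example.com/img_normal/abc_normal.png';
echo strip_normal_suffix($tricky), PHP_EOL;
echo str_replace('_normal', '', $tricky), PHP_EOL;
```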
Try using str_replace; it's much more efficient than regex for something like this.
However, if you want to use regular expressions, you need a delimiter:
preg_replace('|_normal|','', $url);
str_replace should work.
$prof_img = str_replace('_normal', '', $prof_img);
You just forgot to add delimiters around your regex.
http://www.php.net/manual/en/regexp.reference.delimiters.php
When using the PCRE functions, it is required that the pattern is
enclosed by delimiters. A delimiter can be any non-alphanumeric,
non-backslash, non-whitespace character.
Often used delimiters are forward slashes (/), hash signs (#) and
tildes (~). The following are all examples of valid delimited
patterns.
$prof_img = preg_replace('/_normal/', '', $prof_img);
$prof_img = preg_replace('#_normal#', '', $prof_img);
$prof_img = preg_replace('~_normal~', '', $prof_img);
You can decompose the URL first, perform the replacement and stick the parts back together, i.e.
$url = 'http://a0.twimg.com/profile_images/3707137637/8b020cf4023476238704a9fc40cdf445_normal.jpeg';
$parts = pathinfo($url);
// transform
// pathinfo()'s dirname has no trailing slash, so add one back in the format
$url = sprintf('%s/%s.%s',
    $parts['dirname'],
    preg_replace('/_normal$/', '', $parts['filename']),
    $parts['extension']
);
You might note two differences between your expression and mine:
Yours wasn't delimited.
Mine is anchored, i.e. it only removes _normal if it occurs at the end of the file name.
Using non-capturing groups, you can also try like this:
$prof_img = preg_replace('/(.+)(?:_normal)(.+)/', '$1$2', $prof_img);
It will keep the required part as a match.

Regex split by character and capture

I'm looking for a regular expression to split and capture a URL path. For example, with example.com/param1/param2/param3 I would like param1, param2 and param3 to be captured. There could be an unknown number of parameters. This will be used with PHP's preg_match. Can this be done?
EDIT:
This will be used with PHP's preg_match, because I am using it as a Zend Router Rule. Can this be done?
You don't need regex if it's always going to be a /, you could just explode it into an array:
$array = explode('/', $url);
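It depends on whether it allows it:
(/[^/]*)+
Note that with preg_match a repeated capturing group keeps only its last repetition, so to get /param1, /param2 and /param3 as separate matches, apply /[^/]+ with preg_match_all instead.
Then just strip the slashes.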
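The explode() suggestion could look roughly like this. The helper name path_params() is made up, and the sketch assumes the input has no scheme (no "http://"), as in the question's example.

```php
<?php
// Hypothetical helper: returns the path segments after the host part.
function path_params(string $url): array
{
    $parts = explode('/', $url);
    array_shift($parts); // drop the leading "example.com"

    // Drop empty strings left by a trailing or doubled slash, then reindex.
    $nonEmpty = array_filter($parts, static function (string $s): bool {
        return $s !== '';
    });
    return array_values($nonEmpty);
}

print_r(path_params('example.com/param1/param2/param3'));
```

This works for any number of parameters, which a single preg_match call with a fixed number of groups cannot do.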

PHP preg_match url

I have a problem with preg_match in PHP.
I have URLs:
http://mysite.com/file/w867/2612512232
http://mysite.com/file/c3233/2123255464
etc.
I need URLs:
http://mysite.com/file/2612512232
http://mysite.com/file/2123255464
I must remove:
w867/
c3233/
etc.
You don't necessarily need to use preg_match. parse_url() can do the job.
http://us.php.net/manual/en/function.parse-url.php
Just make sure to concatenate it all again without the part you don't want.
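A minimal sketch of the parse_url() route, assuming every URL has a scheme, a host and a three-segment path starting with "file", and no query string. The helper name drop_segment_after_file() is made up for illustration.

```php
<?php
// Hypothetical helper: removes the path segment that follows "file".
function drop_segment_after_file(string $url): string
{
    $p = parse_url($url);
    // e.g. ['file', 'w867', '2612512232']
    $segments = explode('/', trim($p['path'], '/'));
    if (count($segments) === 3 && $segments[0] === 'file') {
        array_splice($segments, 1, 1); // remove the middle segment
    }
    // Reassemble only the parts this sketch assumes are present.
    return $p['scheme'] . '://' . $p['host'] . '/' . implode('/', $segments);
}

echo drop_segment_after_file('http://mysite.com/file/w867/2612512232'), PHP_EOL;
echo drop_segment_after_file('http://mysite.com/file/c3233/2123255464'), PHP_EOL;
```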
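You could try a pattern like this with preg_match (using # as the delimiter so the slashes need no escaping, and with the * inside the group so the whole segment is captured):
"#(.*)/([^/]*)/(.*)#"
and then with str_replace you can: $string = str_replace('/'.$matches[2], '', $string);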
$url = preg_replace("|file/\w+/|", "file/", $url);
This matches the first path segment right after the "file/" part (up to the next "/") and removes it.
