preg_replace with Regex - find number-sequence in URL - php

I'm a regex-noobie, so sorry for this "simple" question:
I've got a URL like the following:
http://stellenanzeige.monster.de/COST-ENGINEER-AUTOMOTIVE-m-w-Job-Mainz-Rheinland-Pfalz-Deutschland-146370543.aspx
What I'm trying to achieve is getting the number sequence (aka Job-ID) right before the ".aspx" with preg_replace.
I've already figured out that the regex for finding it could be
(?!.*-).*(?=\.)
Now preg_replace needs the opposite of that regular expression. How can I achieve that? Also worth mentioning:
The URL can have multiple numbers in it. I only need the sequence right before ".aspx". Also, there could be some query parameters after the ".aspx", like "&mobile=true".
Thank you for your answers!

You can use:
$re = '/[^-.]+(?=\.aspx)/i';
preg_match($re, $input, $matches);
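// $matches[0] => 146370543
This matches a run of characters that contains neither a hyphen nor a dot and that is immediately followed by .aspx. The lookahead (?=\.aspx) checks for .aspx without making it part of the match.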
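As a minimal sketch of the same lookahead idea, the variant below restricts the match to digits and runs against a URL that has another number earlier in the path and a query string after ".aspx". The URL is made up for illustration; only the trailing Job-ID comes from the question.

```php
<?php
// Hypothetical URL: extra digits earlier in the path, query string after ".aspx".
$input = 'http://stellenanzeige.monster.de/Job-24-Mainz-146370543.aspx?mobile=true';

// \d+ limits the match to digits; the lookahead requires ".aspx" right after them,
// so the "24" earlier in the path is rejected.
if (preg_match('/\d+(?=\.aspx)/i', $input, $matches)) {
    $jobId = $matches[0];
    echo $jobId, PHP_EOL; // 146370543
}
```

Because the lookahead only inspects what follows the digits, anything after ".aspx" does not affect the result.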

You can just use preg_match (you don't need preg_replace, since you don't want to change the original string) and capture the number before the .aspx. Assuming .aspx is always at the end of the string, the simplest way I can think of is:
<?php
$string = "http://stellenanzeige.monster.de/COST-ENGINEER-AUTOMOTIVE-m-w-Job-Mainz-Rheinland-Pfalz-Deutschland-146370543.aspx";
$regex = '/([0-9]+)\.aspx$/';
preg_match($regex, $string, $results);
print $results[1];
?>
A short explanation:
$results contains an array of results. Its first element holds the text matched by the complete regex, which is 146370543.aspx in this example. The second element holds the group captured by the parentheses around [0-9]+. Note that the $ anchor only matches when .aspx ends the string, so drop it if a query string may follow.
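To make the array layout concrete, here is a small sketch that prints both elements and then shows what happens once a query string is appended. The relaxed pattern with the (?=[?&#]|$) lookahead is my own assumption about which characters may follow ".aspx"; it is not part of the answer above.

```php
<?php
$string = "http://stellenanzeige.monster.de/COST-ENGINEER-AUTOMOTIVE-m-w-Job-Mainz-Rheinland-Pfalz-Deutschland-146370543.aspx";

preg_match('/([0-9]+)\.aspx$/', $string, $results);
echo $results[0], PHP_EOL; // full match: 146370543.aspx
echo $results[1], PHP_EOL; // first capture group: 146370543

// With a query string appended, the "$" anchor no longer matches.
$withQuery = $string . '?mobile=true';
$anchoredHits = preg_match('/([0-9]+)\.aspx$/', $withQuery);
echo $anchoredHits, PHP_EOL; // 0

// Assumed relaxation: ".aspx" may be followed by ?, & or #, or end the string.
$relaxedHits = preg_match('/([0-9]+)\.aspx(?=[?&#]|$)/', $withQuery, $relaxed);
echo $relaxed[1], PHP_EOL; // 146370543
```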

You can get the opposite by using this regex:
(\D*)\d+(.*)
Working demo
MATCH 1
1. [0-100] `http://stellenanzeige.monster.de/COST-ENGINEER-AUTOMOTIVE-m-w-Job-Mainz-Rheinland-Pfalz-Deutschland-`
2. [109-114] `.aspx`
If you only want the number from that URL, and the URL contains no other digits, this regex is enough:
(\d+)
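If you do want to stay with preg_replace, a rough sketch of both directions could look like this. The patterns are my own variations rather than the ones above: the first replaces the whole string with the captured digits, the second removes the digits and keeps the rest.

```php
<?php
$url = 'http://stellenanzeige.monster.de/COST-ENGINEER-AUTOMOTIVE-m-w-Job-Mainz-Rheinland-Pfalz-Deutschland-146370543.aspx';

// Match the entire string, capture the digits right before ".aspx",
// and replace everything with that capture. The lazy .*? keeps the
// capture from being shortened to its last digit.
$jobId = preg_replace('/^.*?(\d+)\.aspx.*$/i', '$1', $url);
echo $jobId, PHP_EOL; // 146370543

// The "opposite": delete only the digits that sit right before ".aspx".
$withoutId = preg_replace('/\d+(?=\.aspx)/i', '', $url);
echo $withoutId, PHP_EOL;
```

Keep in mind that preg_replace returns the input unchanged when nothing matches, so with the first pattern you cannot tell a missing Job-ID from a found one without an extra check. That is one reason preg_match is usually the better fit for extraction.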

Related

How to get a number from a html source page?

I'm trying to retrieve the followed by count on my instagram page. I can't seem to get the Regex right and would very much appreciate some help.
Here's what I'm looking for:
y":{"count":
That's the beginning of the string, and I want the 4 numbers after that.
$string = preg_replace("{y"\"count":([0-9]+)\}","",$code);
Someone suggested this ^ but I can't get the formatting right...
You haven't posted your strings, so it is a guess as to what the regex should be... so I'll answer on why your code fails.
preg_replace('"followed_by":{"count":\d')
This is very far from the correct preg_replace usage. You need to give it the replacement string and the string to search on. See http://php.net/manual/en/function.preg-replace.php
Your second usage:
$string = preg_replace(/^y":{"count[0-9]/","",$code);
Is closer, but preg_replace is global, so this searches your whole file (or it would if not for the anchor) and replaces the found value with nothing. What you really want (I think) is to use preg_match.
$string = preg_match('/y":\{"count(\d{4})/', $code, $match);
$counted = $match[1];
This presumes your regex was kind of correct already.
Per your update:
Demo: https://regex101.com/r/aR2iU2/1
$code = 'y":{"count:1234';
$string = preg_match('/y":\{"count:(\d{4})/', $code, $match);
$counted = $match[1];
echo $counted;
PHP Demo: https://eval.in/489436
I removed the ^, which requires the match to begin at the start of your string, escaped the {, and made \d match exactly 4 digits. The () is a capture group and stores whatever is found inside of it, in this case the 4 numbers.
Also if this isn't just for learning you should be prepared for this to stop working at some point as the service provider may change the format. The API is a safer route to go.
This regexp should capture value you're looking for in the first group:
\{"count":([0-9]+)\}
Use it with preg_match_all function to easily capture what you want into array (you're using preg_replace which isn't for retrieving data but for... well replacing it).
Your regexp isn't working because you didn't escape the curly brackets. You also didn't put a quantifier after the digit class (the plus sign in my example), so it would only capture the first digit anyway.
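A small sketch of that preg_match_all usage follows. The sample fragment is invented for illustration (the "follows" key in particular is an assumption), and the real page markup may look different or change at any time.

```php
<?php
// Hypothetical fragment of the page source.
$code = '"follows":{"count":321},"followed_by":{"count":1234}';

// Collect every {"count":N} value; the digits end up in group 1.
preg_match_all('/\{"count":([0-9]+)\}/', $code, $matches);
print_r($matches[1]); // the captured counts: 321 and 1234

// To pick one specific counter, include its key in the pattern.
if (preg_match('/"followed_by":\{"count":([0-9]+)\}/', $code, $m)) {
    $followedBy = (int) $m[1];
    echo $followedBy, PHP_EOL; // 1234
}
```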

How to cut out everything from a string except certain part of it in php?

Let's say I have string like this:
Village_name(315|431 K64)
What I want to do is when I paste that into let's say text box, and click a button, all I will be left with is 315|431.
Is there a way of doing this?
Use the below regex and then replace the match with \1.
(\d+\|\d+)|.
It captures the number|number part and matches every remaining character one at a time. Replacing each match with \1 leaves only the number|number part, because \1 is empty whenever the . alternative matched.
In php, you may use this also.
(?:\d+\|\d+)(*SKIP)(*F)|.
The substring matched by \d+\|\d+ is matched first, and the following (*SKIP)(*F) forces the regex to fail there and skip past it. The . after the pipe symbol then matches all the characters except number|number, because that part was already skipped. Replace those matches with an empty string.
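Both patterns can be tried with preg_replace as sketched below. The variable names are just for illustration.

```php
<?php
$str = 'Village_name(315|431 K64)';

// Variant 1: keep the capture, drop every other character.
// '$1' is empty whenever the "." branch matched.
$viaCapture = preg_replace('/(\d+\|\d+)|./', '$1', $str);
echo $viaCapture, PHP_EOL; // 315|431

// Variant 2: skip the number|number part, then delete whatever "." matches.
$viaSkip = preg_replace('/\d+\|\d+(*SKIP)(*F)|./', '', $str);
echo $viaSkip, PHP_EOL; // 315|431
```

Note that . does not match a newline by default, so multi-line input would need the s modifier.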
I know this question has been answered and an answer has been accepted, but I still want to suggest this, as you really don't need PHP for this requirement. JavaScript is enough:
var str = 'Village_name(315|431 K64)';
var pattern = /\((\w+\|\w+) /;
var res = str.match(pattern);
document.write(res[1]);
Please try this:-
<?php
$str = 'Village_name(315|431 K64)';
preg_match_all('/\d+\|\d+/', $str, $matches);
echo "<pre>"; print_r($matches); echo "</pre>"; // dump the complete match array
foreach ($matches[0] as $match) { // $matches[0] holds every full match
    echo $match;
}
?>
Output:- http://prntscr.com/74ddg9
Note:- explode can work with some adjustment, but only if the format is always exactly like the one you gave. preg_match_all is the safer choice.

Match a string that begin with certain pattern and ends with a certain pattern

I have to fetch all strings that start with [[{ and end with }]]. I tried to use:
'/^\[\[\{*$\}\}\]\]/'
but it does not work.
Basically I have to fetch some JSON string embedded inside HTML documents.
This is the perfect use for lookarounds and you can use them like so:
$re = '/(?=\[\[\{).*?(?<=\}\]\])/m';
preg_match_all($re, $str, $matches);
Here, preg_match_all() searches $str for all matches to the regular expression given in $re and puts them in $matches.
Your anchors are in the wrong place, and you don't need them here because the JSON is embedded inside a larger document. This is the proper version. You should use it with preg_match_all(...):
/\[\[\{.*?\}\]\]/
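Here is a rough sketch of that pattern applied to an HTML snippet, followed by json_decode on each hit. The HTML is invented for illustration, and I added the s modifier on the assumption that the JSON may span several lines.

```php
<?php
// Hypothetical HTML with two embedded JSON blobs, one spanning two lines.
$str = <<<'HTML'
<div data-a='[[{"id":1}]]'></div>
<script>var x = [[{"id":2,
"name":"b"}]];</script>
HTML;

// The "s" modifier lets "." match newlines; ".*?" stops at the first "}]]".
$count = preg_match_all('/\[\[\{.*?\}\]\]/s', $str, $matches);

$decoded = [];
foreach ($matches[0] as $json) {
    // json_decode returns null when the extracted text is not valid JSON.
    $decoded[] = json_decode($json, true);
}
print_r($decoded);
```

Since the lazy quantifier ends at the first }]], a blob that itself contains }]] inside a nested value would be cut short. For that kind of input a real HTML/JSON parser is the more robust route.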

How can I use a match in the same regex in PHP?

I have this string (that is a serialized variable in php):
s:12:"hello "world";
and I want to find "hello "world" using only a regex. I tried this, but it seems it is stupid :P
(s:(?P<num>[0-9]+):".{\k{num}}";)
I only want to know how I can use the "num" result within the same regex.
This regex is used in a big regex, so I can't check for end of string.
Thanks in advance!
You can use your named capturing groups as backreference like this
Back references to the named subpatterns can be achieved by (?P=name) or, since PHP 5.2.2, also by \k<name> or \k'name'. Additionally PHP 5.2.4 added support for \k{name} and \g{name}.
According to php.net
But I think this can only be used to match the found text again, not as a number in a quantifier. (At least I didn't get it to work.)
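Since the captured number cannot drive a quantifier, one possible workaround is to do it in two steps: match the s:N:" header with a regex, then cut out N bytes with substr. This is only a sketch of that idea, not a drop-in for the big regex mentioned in the question, and the second serialized string in the sample is made up.

```php
<?php
$text = 's:12:"hello "world";s:3:"foo";';

$offset = 0;
$values = [];

// \G anchors each match at $offset, so only a header at that exact spot counts.
while (preg_match('/\Gs:(?P<num>[0-9]+):"/', $text, $m, 0, $offset)) {
    $start  = $offset + strlen($m[0]); // first byte after the opening quote
    $length = (int) $m['num'];         // declared length in bytes
    $values[] = substr($text, $start, $length);
    $offset = $start + $length + 2;    // skip the closing quote and semicolon
}

print_r($values); // 'hello "world' and 'foo'
```

Because the length is taken from the header, quotes inside the value do not confuse the extraction.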
You can use preg_match function, which will populate an array of matches:
If matches is provided, then it is filled with the results of search. $matches[0] will contain the text that matched the full pattern, $matches[1] will have the text that matched the first captured parenthesized subpattern, and so on.
More information about preg_match: PHP: preg_match
$text = 's:12:"hello "world";s:12:"good bue world";';
$pattern = "(.*:[0-9]+:\"(.*)\";.*)U";
preg_match_all($pattern,$text,$r);

preg_match returning weird results

I am searching a string for urls...and my preg_match is giving me an incorrect amount of matches for my demo string.
String:
Hey there, come check out my site at www.example.com
Function:
preg_match("#(^|[\n ])([\w]+?://[\w]+[^ \"\n\r\t<]*)#ise", $string, $links);
echo count($links);
The result comes out as 3.
Can anybody help me solve this? I'm new to REGEX.
$links is the array of sub matches:
If matches is provided, then it is filled with the results of search. $matches[0] will contain the text that matched the full pattern, $matches[1] will have the text that matched the first captured parenthesized subpattern, and so on.
The matches of the two groups plus the match of the full regular expression results in three array items.
Maybe you rather want all matches using preg_match_all.
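The difference can be seen in a short sketch. The input string is made up and contains two URLs with a scheme, and I dropped the s and e modifiers from the pattern because they are not needed for matching.

```php
<?php
// Hypothetical input with two URLs that include a scheme.
$string = 'See http://www.example.com and https://example.org/page for details';
$pattern = '#(^|[\n ])(\w+?://\w+[^ "\n\r\t<]*)#i';

// preg_match stops at the first match: full match + 2 groups = 3 entries.
preg_match($pattern, $string, $links);
echo count($links), PHP_EOL; // 3

// preg_match_all returns the number of matches; group 2 holds the URLs.
$total = preg_match_all($pattern, $string, $all);
echo $total, PHP_EOL; // 2
print_r($all[2]);
```

So count($links) after preg_match tells you how many groups the pattern has, not how many URLs were found.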
If you use preg_match_all (as Gumbo suggested), please note that if you run your regex against this string, it will match both the value of your anchor's "href" attribute and the link text, which in this case happens to contain a URL. This makes TWO matches.
It would be wise to run an array_unique on your resultset :)
In addition to the advice on how to use preg_match, I believe there is something seriously wrong with the regular expression you are using. You may want to try something like this instead:
preg_match("_([a-zA-Z]+://)?([0-9a-zA-Z$-\_.+!*'(),]+\.)?([0-9a-zA-Z]+)+\.([a-zA-Z]+)_", $string, $links);
This should handle most cases (although it wouldn't work if there was a query string after the top-level domain). In the future, when writing regular expressions, I recommend the following web-sites to help: http://www.regular-expressions.info/ and especially http://regexpal.com/ for testing them as you're writing them.
