How to get last digits which are number before '.html' string - php

I have a string, for example: http://address.com/sef-title-of-topic-1111.html
I could not extract 1111 from it with a regexp in PHP, no matter what I tried. Is it possible? How?
My code:
$address = 'http://address.com/sef-title-of-topic-1111.html';
preg_match('#-(.*?)\.html#sim',$address,$result);

If the URL example is how they will always appear (i.e. ending in a hyphen, digits, then .html), this should work:
$str = "http://address.com/sef-title-of-topic-1111.html";
preg_match('#.*-(\d+)\.html#', $str, $matches);
print_r($matches);
If they won't always match the pattern you gave in your question, then clarify by showing alternative values for your $address value.

If you know that the extension is definitely .html (and not .htm for example) then you could use
$lastNos = substr($input, -9, -5); // take the last 9 characters, then drop the 5-character ".html"; assumes exactly 4 digits
Clearly a simple solution but you have not specified why regex is required.

If the URL will always be in this format I would use str_replace to strip the .html then explode by "-" and find the last piece.
Of course all of that is assuming the URL is always in this format.
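A minimal sketch of the str_replace/explode approach described above, assuming the URL always ends in a hyphen, the number, and ".html". The variable names are only illustrative. Note that str_replace removes every occurrence of ".html", not just the trailing one.

```php
<?php
$url = 'http://address.com/sef-title-of-topic-1111.html';

// Strip the extension, then split on hyphens.
$withoutExt = str_replace('.html', '', $url);
$pieces = explode('-', $withoutExt);

// end() takes its argument by reference, so it needs a real variable.
$last = end($pieces);

echo $last, "\n"; // 1111
```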

If the format is always the same, you don't need a regex.
$url = "http://address.com/sef-title-of-topic-1111.html";
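// Reverse the URL, take the piece before the first "-" ("lmth.1111"), take the part after the "." and reverse it back.
// Indexing the results directly avoids passing function results to array_shift(), which expects a variable by reference.
echo $str = strrev(explode(".", explode("-", strrev($url))[0])[1]);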
edit: sorry my php is a bit rusty

Related

Regex to extract a string between two specific forward slashes

Hi I have the following text:
file:/home/dx/reader/validation-garage/IDON/test-test-test#2016-10-04.txt#/
I need to retrieve test-test-test#2016-10-04.txt# from the string above. If I can also exclude the hash even better.
I've tried looking at examples like "Regex to find text between second and third slashes" but I'm having trouble getting it working. Can anyone help?
I'm using PHP regex to do this.
You may try the regular expression below:
\/([a-z\-]*\#[0-9\-\.]*[a-z]{3}\#)\/
A working example is here: https://www.regex101.com/r/RYsh7H/1
Explanation:
[a-z\-]* => Matches the test-test-test part: lowercase letters, which can contain dashes
\# => Matches constant # sign
[0-9\-\.]* => Matches the file name with digits, dashes and {dot}
[a-z]{3}\# => Matches your 3 letter extension and #
PS: If you really do not need # you do not have to use regex. And you may consider using parse_url method of PHP.
Hope this helps;
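As a rough sketch, the pattern above could be used from PHP like this. The variable names are illustrative, and the str_replace step for dropping the hashes is an assumption about what "exclude the hash" means.

```php
<?php
$path = 'file:/home/dx/reader/validation-garage/IDON/test-test-test#2016-10-04.txt#/';

$withHashes = null;
$withoutHashes = null;

// Same pattern as above, wrapped in / delimiters for preg_match.
if (preg_match('/\/([a-z\-]*\#[0-9\-\.]*[a-z]{3}\#)\//', $path, $m)) {
    $withHashes = $m[1];                              // test-test-test#2016-10-04.txt#
    $withoutHashes = str_replace('#', '', $withHashes); // test-test-test2016-10-04.txt
}

echo $withHashes, "\n", $withoutHashes, "\n";
```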
basename() also works, so you can also do like this:
echo basename('file:/home/dx/reader/validation-garage/IDON/test-test-test#2016-10-04.txt#/');
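Without regex you can do:
$url_parts = parse_url('file:/home/dx/reader/validation-garage/IDON/test-test-test#2016-10-04.txt#/');
$segments = explode('/', $url_parts['path']); // end() takes its argument by reference, so it needs a variable
echo end($segments);
or better:
$url_path = parse_url('file:/home/dx/reader/validation-garage/IDON/test-test-test#2016-10-04.txt#/', PHP_URL_PATH);
$segments = explode('/', $url_path);
echo end($segments);
Note that parse_url treats everything after the first # as the fragment, so the path here ends at test-test-test and the date part is not included.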

How to get a number from a html source page?

I'm trying to retrieve the followed by count on my instagram page. I can't seem to get the Regex right and would very much appreciate some help.
Here's what I'm looking for:
y":{"count":
That's the beginning of the string, and I want the 4 numbers after that.
$string = preg_replace("{y"\"count":([0-9]+)\}","",$code);
Someone suggested this ^ but I can't get the formatting right...
You haven't posted your strings, so what the regex should be is a guess... so I'll answer why your code fails.
preg_replace('"followed_by":{"count":\d')
This is very far from the correct preg_replace usage. You need to give it the replacement string and the string to search on. See http://php.net/manual/en/function.preg-replace.php
Your second usage:
$string = preg_replace(/^y":{"count[0-9]/","",$code);
Is closer, but preg_replace is global, so this searches your whole file (or it would if not for the anchor) and replaces the found value with nothing. What you really want (I think) is to use preg_match.
$string = preg_match('/y":\{"count(\d{4})/', $code, $match);
$counted = $match[1];
This presumes your regex was kind of correct already.
Per your update:
Demo: https://regex101.com/r/aR2iU2/1
$code = 'y":{"count:1234';
$string = preg_match('/y":\{"count:(\d{4})/', $code, $match);
$counted = $match[1];
echo $counted;
PHP Demo: https://eval.in/489436
I removed the ^, which requires the match to begin at the start of your string, escaped the {, and made the \d match exactly 4 digits. The () is a capture group and stores whatever is matched inside it, in this case the 4 digits.
Also if this isn't just for learning you should be prepared for this to stop working at some point as the service provider may change the format. The API is a safer route to go.
This regexp should capture value you're looking for in the first group:
\{"count":([0-9]+)\}
Use it with the preg_match_all function to capture what you want into an array (you're using preg_replace, which isn't for retrieving data but for, well, replacing it).
Your regexp isn't working because you didn't escape the curly brackets. You also didn't put a quantifier after the digit class (the plus sign in my example), so it would only capture the first digit anyway.
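A small sketch of that preg_match_all usage. The sample string in $code is made up for illustration, since the real page source was not posted.

```php
<?php
// Hypothetical sample input; the actual page source may look different.
$code = '"followed_by":{"count":1234},"follows":{"count":56}';

// Group 1 collects the digits of every {"count":N} occurrence.
preg_match_all('/\{"count":([0-9]+)\}/', $code, $matches);
$counts = $matches[1];

print_r($counts);
```

$matches[0] holds the full matches and $matches[1] holds only the captured digits, which is why the sketch reads index 1.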

Extracting text from URL using PHP

I'm curious as to how I would get a certain value after a delimiter in a URL?
If I have a URL of http://www.testing.site.com/site/biz/i-want-this, how would I extract only the part that says "i-want-this", or initially after the last /?
Thank you!
You want basename($path); It should give you what you need:
http://www.ideone.com/8hFSN
$url = "http://www.testing.site.com/site/biz/i-want-this";
preg_match( "/[^\/]*$/", $url, $match);
echo $match[0]; // i-want-this
You can use basename() but if you are on Windows, it will break on not just slashes but also backslashes. This is unlikely to come up as backslashes are unusual in a URL. But I suspect you could find them in a query string in a valid URL.
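If the platform-dependent behaviour of basename() is a concern, one possible alternative is to take the path with parse_url and cut after the last forward slash. This is only a sketch; last_path_segment is a hypothetical helper name, not a built-in.

```php
<?php
// Hypothetical helper: returns whatever follows the last "/" in the URL's path.
function last_path_segment(string $url): string
{
    // parse_url can return null or false, so cast to string before searching.
    $path = (string) parse_url($url, PHP_URL_PATH);
    $pos = strrpos($path, '/');
    return $pos === false ? $path : substr($path, $pos + 1);
}

$last = last_path_segment('http://www.testing.site.com/site/biz/i-want-this');
echo $last, "\n"; // i-want-this
```

Because only the path component is inspected, slashes or backslashes in a query string do not affect the result.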

PHP URL to Link with Regex

I know I've seen this done in a lot of places, but I need something a little different from the norm. Sadly, when I search for this anywhere it gets buried in posts about just making the link into an HTML anchor tag. I want the PHP function to strip out the "http://" and "https://" from the link as well as anything after the .*, so basically what I am looking for is to turn A into B.
A: http://www.youtube.com/watch?v=spsnQWtsUFM
B: www.youtube.com
If it helps, here is my current PHP regex replace function.
ereg_replace("[[:alpha:]]+://[^<>[:space:]]+[[:alnum:]/]", "\\0", htmlspecialchars($body, ENT_QUOTES)));
It would probably also be helpful to say that I have absolutely no understanding in regular expressions. Thanks!
EDIT: When I entered a comment like this: blahblah https://www.facebook.com/?sk=ff&ap=1 blah, I got HTML like this: <a class="bwl" href="blahblah https://www.facebook.com/?sk=ff&ap=1 blah">www.facebook.com</a>, which doesn't work at all because it takes the text around the link with it. It works great if someone only comments a link, however. This is after I changed the function to this:
preg_replace("#^(.*)//(.*)/(.*)$#",'<a class="bwl" href="\0">\2</a>', htmlspecialchars($body, ENT_QUOTES));
This is the simplest and cleanest way:
$str = 'http://www.youtube.com/watch?v=spsnQWtsUFM';
preg_match("#//(.+?)/#", $str, $matches);
$site_url = $matches[1];
EDIT: I assume that the $str had been checked to be a URL in the first place, so I left that out. Also, I assume that all the URLs will contain either 'http://' or 'https://'. In case the url is formatted like this www.youtube.com/watch?v=spsnQWtsUFM or even youtube.com/watch?v=spsnQWtsUFM, the above regexp won't work!
EDIT2: I'm sorry, I didn't realize that you were trying to replace all URLs in a whole text. In that case, this should work the way you want it:
$str = preg_replace('#(\A|[^=\]\'"a-zA-Z0-9])(http[s]?://(.+?)/[^()<>\s]+)#i', '\\1\\3', $str);
I am not a regex whizz either,
^(.*)//(.*)/(.*)$
\2
was what worked for me when I tried to use as find and replace in programmer's notepad.
^(.*)// should extract the protocol, referred to as \1 in the second line.
(.*)/ should extract everything up to the first /, referred to as \2 in the second line.
(.*)$ captures everything up to the end of the string, referred to as \3 in the second line.
Added later
^(.*)( )(.*)//(.*)/(.*)( )(.*)$
\1\2\4 \7
This should be a bit better, but will only replace just 1 URL
The \0 is replaced by the entire matched string, whereas \x (where x is a number other than 0 starting at 1) will be replaced by each subpart of your matched string based on what you wrap in parentheses and the order those groups appear. Your solution is as follows:
ereg_replace("[[:alpha:]]+://([^<>[:space:]]+[:alnum:]*)[[:alnum:]/]", "\\1
I haven't been able to test this though so let me know if it works.
I think this should do it (I haven't tested it):
preg_match('/^http[s]?:\/\/(.+?)\/.*/i', $main_url, $matches);
$final_url = ''.$matches[1].'';
I'm surprised no one remembers PHP's parse_url function:
$url = 'http://www.youtube.com/watch?v=spsnQWtsUFM';
echo parse_url($url, PHP_URL_HOST); // displays "www.youtube.com"
I think you know what to do from there.
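One way this could be combined with the replacement the question asks for is to find each URL in the text and let parse_url supply the link label. This is a sketch under assumptions: linkify_hosts is a hypothetical helper name, the URL pattern is a deliberately loose approximation, and the "bwl" class is taken from the question.

```php
<?php
// Hypothetical helper: wraps each http(s) URL in a link labelled with its host.
function linkify_hosts(string $text): string
{
    return preg_replace_callback(
        '#\bhttps?://[^\s<>"]+#i',
        function (array $m): string {
            $host = parse_url($m[0], PHP_URL_HOST);
            if (!is_string($host)) {
                return $m[0]; // leave anything parse_url cannot handle untouched
            }
            return '<a class="bwl" href="' . $m[0] . '">' . $host . '</a>';
        },
        $text
    );
}

// Escape first, as in the question, then linkify.
$body = htmlspecialchars('blahblah https://www.facebook.com/?sk=ff&ap=1 blah', ENT_QUOTES);
echo linkify_hosts($body), "\n";
```

Because the pattern is not anchored to the start and end of the string, the surrounding text stays outside the link.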
$result = preg_replace('%(http[s]?://)(\S+)%', '\2', $subject);
The code with regex does not work completely.
I made this code. It is much more comprehensive, but it works:
See the result here: http://cht.dk/data/php-scripts/inc_functions_links.php
See the source code here: http://cht.dk/data/php-scripts/inc_functions_links.txt

PHP if string contains URL isolate it

In PHP, I need to be able to figure out if a string contains a URL. If there is a URL, I need to isolate it as another separate string.
For example: "SESAC showin the Love! http://twitpic.com/1uk7fi"
I need to be able to isolate the URL in that string into a new string. At the same time the URL needs to be kept intact in the original string. Follow?
I know this is probably really simple but it's killing me.
Something like
preg_match('/[a-zA-Z]+:\/\/[0-9a-zA-Z;.\/?:#=_#&%~,+$]+/', $string, $matches);
$matches[0] will hold the result.
(Note: this regex is certainly not RFC compliant; it may fetch malformed (per the spec) URLs. See http://www.faqs.org/rfcs/rfc1738.html).
This doesn't account for dashes (-), so I needed to add \- to the character class:
preg_match('/[a-zA-Z]+:\/\/[0-9a-zA-Z;.\/\-?:#=_#&%~,+$]+/', $_POST['string'], $matches);
URLs can't contain spaces, so...
\b(?:https?|ftp)://\S+
Should match any URL-like thing in a string.
The above is the pure regex. PHP preg_* and string escaping rules apply before you can use it.
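For illustration, the pure regex above might be wrapped for preg_match like this. The ~ delimiter is an arbitrary choice that avoids having to escape the slashes.

```php
<?php
$string = "SESAC showin the Love! http://twitpic.com/1uk7fi";

$url = null;
// preg_match only reads $string, so the original text is left intact.
if (preg_match('~\b(?:https?|ftp)://\S+~i', $string, $matches)) {
    $url = $matches[0];
}

var_dump($url);
```

If the string contains no URL, $url stays null, which also answers the "does it contain a URL" part of the question.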
$test = "SESAC showin the Love! http://twitpic.com/1uk7fi";
$myURL= strstr ($test, "http");
echo $myURL; // prints http://twitpic.com/1uk7fi
