Regex to filter int from url - php

I'm trying to extract a value from a URL.
The URL looks like this:
http://userimages-akm.imvu.com/catalog/includes/modules/phpbb2/images/avatars/145870556_47076915459092eafd7b69.jpg
I only want the following part of the URL: 145870556
I thought about using a regex, but the only one I could get working is this:
^((http[s]?|ftp):\/)?\/?([^:\/\s]+)((\/\w+)*\/)([\w\-\.]+[^#?\s]+)(.*)?(#[\w\-]+)?$
Is there a better regex to use?

If the image filename always follows the same format <timestamp>_<hex-value>.<extension>, then you don't need to match the entire URL.
$url = 'http://userimages-akm.imvu.com/catalog/includes/modules/phpbb2/images/avatars/145870556_47076915459092eafd7b69.jpg';
preg_match('~\/(\d+)_.*$~', $url, $matches);
// $matches[1] === '145870556'
https://regex101.com/r/vsDnoj/2
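If you would rather apply the pattern to the file name only, and not to the whole URL, one option is to isolate the last path segment first. This is a minimal sketch, assuming the same <digits>_<rest> naming as above; the function name extractLeadingId is hypothetical.

```php
<?php
// Hypothetical helper (not from the original answer): returns the digits
// that precede the first underscore in the file name of $url, or null
// when the file name has a different shape.
function extractLeadingId(string $url): ?string
{
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        return null; // no path component, or a malformed URL
    }
    $file = basename($path); // e.g. "145870556_47076915459092eafd7b69.jpg"
    return preg_match('~^(\d+)_~', $file, $m) === 1 ? $m[1] : null;
}

$url = 'http://userimages-akm.imvu.com/catalog/includes/modules/phpbb2/images/avatars/145870556_47076915459092eafd7b69.jpg';
var_dump(extractLeadingId($url)); // string(9) "145870556"
```

Anchoring the pattern to the start of the file name means digits elsewhere in the path cannot be picked up by accident.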

Related

Codeigniter current_url avoid jpg etc

I'm quite new to CodeIgniter. I'm using the current_url() function to remember the URL of the previously viewed page. But the function (I think because of separate AJAX calls) sometimes returns the URL of a file, such as a .jpg.
For example:
/uploads/default/files/HTC.jpg
I'd like to skip these and keep only the URLs that appear in the browser's address bar.
Any ideas? Thanks in advance!
I assume you are using current_url() and then saving the output of this function somewhere for later retrieval.
So you could, before you save the string, perform a regex check to see if it fits the format you want:
$pattern = '~.+\.[a-zA-Z]{0,3}$~';
$string = current_url();
preg_match($pattern, $string, $matches);
if (empty($matches)) {
    // We can save the url
}
The regex matches URLs that end in a dot followed by zero to three letters, and those are the ones we skip:
HTC.jpg matches, so it is not saved
la/di.php matches, so it is not saved
la/items/2 does not match, so it is saved
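The check can be wrapped in a small function so the three examples are easy to test. This is a sketch only: isPageUrl is a hypothetical name, and it applies the answer's pattern to the path component, on the assumption that a bare host such as http://example.com (which itself ends in ".com") should still count as a page.

```php
<?php
// Hypothetical helper: true when $url looks like a page, false when it
// ends like a file name (a dot followed by zero to three letters).
function isPageUrl(string $url): bool
{
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        return true; // no path at all, e.g. "http://example.com"
    }
    // Same pattern as in the answer, applied to the path only.
    return preg_match('~.+\.[a-zA-Z]{0,3}$~', $path) !== 1;
}

var_dump(isPageUrl('HTC.jpg'));    // bool(false)
var_dump(isPageUrl('la/di.php'));  // bool(false)
var_dump(isPageUrl('la/items/2')); // bool(true)
```

Note that a four-letter extension such as .jpeg would pass this check; widen the {0,3} bound if that matters for your uploads.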

searching link with php regular expression

I have been programming in C and C#, using a third-party regular expression library to identify link patterns. Yesterday I was asked to use PHP instead. I am not familiar with PHP regular expressions; I tried, but didn't get the expected result. I have to extract and replace the src of an image tag of the form:
<img src="/a/b/c/d/binary/capture.php?id=main:slave:demo.jpg"/>
I only want the path in the src, but the quotes could be double or single, and the id varies from case to case (here it is main:slave:demo.jpg).
I tried the following code:
$searchfor = '/src="(.*?)binary\/capture.php?id=(.+?)"/';
$matches = array();
while (preg_match($searchfor, $stringtoreplace, $matches) == 1) {
    // if a match is found, replace the source text and search again
    $stringtoreplace = str_replace($matches, 'whatever', $stringtoreplace);
}
But it doesn't work. Is there anything I missed, or any mistake in the code above?
More specifically, let's say I have an image tag whose src is
<img src="ANY_THING/binary/capture.php?id=main:slave:demo.jpg"/>
Here ANY_THING could be anything, and "/binary/capture.php?id=" is fixed in all cases. The string after "id=" follows the pattern "main:slave:demo.jpg": the parts before the colons change from case to case, and the name of the JPEG varies too. I would expect it to be replaced with
<img src="/main/slave/demo.jpg"/>
Since I can only modify the PHP script at specific, limited times, I want to debug my code before making any changes. Thanks.
First of all, as you may know, regex shouldn't be used to manipulate HTML.
However, try:
$stringtoreplace = '<img src="/a/b/c/d/binary/capture.php?id=main:slave:demo.jpg"/>';
$new_str = preg_replace_callback(
    // The regex to match
    '/<img(.*?)src="([^"]+)"(.*?)>/i',
    function($matches) { // callback
        parse_str(parse_url($matches[2], PHP_URL_QUERY), $queries); // convert query strings to array
        $matches[2] = '/'.str_replace(':', '/', $queries['id']); // replace the url
        return '<img'.$matches[1].'src="'.$matches[2].'"'.$matches[3].'>'; // return the replacement
    },
    $stringtoreplace // str to replace
);
var_dump($new_str);
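The question mentions that the src may be wrapped in single or double quotes, which the pattern above does not cover. A possible variation is sketched below; rewriteCaptureSrc is a hypothetical name, and the sketch assumes that tags without an id query parameter should be left untouched.

```php
<?php
// Hypothetical variation: accepts src='...' as well as src="...",
// and leaves the tag alone when the src has no usable "id" parameter.
function rewriteCaptureSrc(string $html): string
{
    $result = preg_replace_callback(
        // 1: everything up to src=   2: the quote character   3: the src value
        '~(<img\b[^>]*?\bsrc=)(["\'])(.*?)\2~i',
        function (array $m): string {
            $query = parse_url($m[3], PHP_URL_QUERY);
            if (!is_string($query)) {
                return $m[0]; // no query string: keep the original text
            }
            parse_str($query, $params);
            if (!isset($params['id']) || !is_string($params['id'])) {
                return $m[0]; // no id parameter: keep the original text
            }
            $newSrc = '/' . str_replace(':', '/', $params['id']);
            return $m[1] . $m[2] . $newSrc . $m[2];
        },
        $html
    );
    return $result ?? $html; // preg_replace_callback returns null on error
}

echo rewriteCaptureSrc("<img src='/a/b/binary/capture.php?id=main:slave:demo.jpg'/>");
```

The backreference \2 makes the closing quote match whichever quote opened the attribute, so the original quoting style is preserved in the output.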

RegEx to grab url out of another url

String to pull from : http:\/\/c.ypcdn.com\/2\/c\/rtd?vrid=357c99c36bd7ed631eda2e43fc9e30f8&rid=283d465f-f63b-4b0d-90b0-be6c12ed7617&ptid=943aw4l8qj&ypid=11720135&lid=194823099&tl=6&lsrc=SP&dest=http%3A%2F%2FCleanation.com
RegEx I have used before: www\..*?\.\w{2,5}
However, the above RegEx will only grab the URL if it has a "www." in it. If I take the "www." out of the RegEx, it just grabs c.ypcdn.com. I want to grab the Cleanation.com at the end of the string.
It needs to be dynamic so it can grab any URL that doesn't have a "www." out of that URL.
Why not use parse_url() and then parse_str() on the returned query string to get it?
Edit: example:
$url= "http://c.ypcdn.com/2/c/rtd?vrid=357c99c36bd7ed631eda2e43fc9e30f8&rid=283d465f-f63b-4b0d-90b0-be6c12ed7617&ptid=943aw4l8qj&ypid=11720135&lid=194823099&tl=6&lsrc=SP&dest=http%3A%2F%2FCleanation.com";
$query = parse_url($url, PHP_URL_QUERY);
parse_str($query,$params);
echo $params['dest'];
If it is always the dest parameter, you can grab it with something like:
"dest=https?%3A%2F%2F([^?&]+)"
If it is always the last parameter, you can grab it with:
"dest=https?%3A%2F%2F(.+)$"
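For comparison, here is how the regex route might look in PHP. This is a sketch, not a replacement for the parse_url() approach: destHost is a hypothetical name, the sample URL is a shortened version of the one in the question, and the character class also stops at "%" so that an encoded path after the host is not captured.

```php
<?php
// Hypothetical helper: returns the host of the URL-encoded "dest"
// parameter, or null when no such parameter is present.
function destHost(string $url): ?string
{
    // [?&] ensures we match a parameter named exactly "dest".
    if (preg_match('~[?&]dest=https?%3A%2F%2F([^?&#%]+)~i', $url, $m) !== 1) {
        return null;
    }
    return $m[1];
}

// Shortened form of the URL from the question (assumption: fewer parameters).
$url = 'http://c.ypcdn.com/2/c/rtd?lid=194823099&tl=6&lsrc=SP&dest=http%3A%2F%2FCleanation.com';
var_dump(destHost($url)); // string(14) "Cleanation.com"
```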

URL with query string validation using PHP

I need a PHP validation function for URLs with a query string (parameters separated by &). Currently I have the following function for validating URLs:
$pattern = '/^(([\w]+:)?\/\/)?(([\d\w]|%[a-fA-f\d]{2,2})+(:([\d\w]|%[a-fA-f\d]{2,2})+)?@)?([\d\w][-\d\w]{0,253}[\d\w]\.)+[\w]{2,4}(:[\d]+)?(\/([-+_~.\d\w]|%[a-fA-f\d]{2,2})*)*(\?(&?([-+_~.\d\w]|%[a-fA-f\d]{2,2})=?)*)?(#([-+_~.\d\w]|%[a-fA-f\d]{2,2})*)?$/';
echo preg_match($pattern, $url);
This function correctly validates input like
google.com
www.google.com
http://google.com
http://www.google.com ...etc
But it does not validate a URL that carries parameters (a query string), for example:
http://google.com/index.html?prod=gmail&act=inbox
I need a function that accepts both types of URL inputs. Please help. Thanks in advance.
A simple filter_var() call
if (filter_var($yoururl, FILTER_VALIDATE_URL)) {
    echo 'Ok';
}
might do the trick, although it rejects URLs that are not preceded by a scheme:
http://codepad.org/1HAdufMG
You can work around the issue by placing http:// in front of URLs that lack it.
As suggested by @DaveRandom, you could do something like:
$parsed = parse_url($url);
if (!isset($parsed['scheme'])) $url = "http://$url";
before passing $url to filter_var().
Overall, it's still a simpler solution than an overly complicated regex.
It also has these flags available:
FILTER_FLAG_PATH_REQUIRED (used with FILTER_VALIDATE_URL): requires the URL to contain a path part.
FILTER_FLAG_QUERY_REQUIRED (used with FILTER_VALIDATE_URL): requires the URL to contain a query string.
http://php.net/manual/en/function.parse-url.php
Some might think this is not 100% bulletproof, but you can give it a try as a start.
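Putting the pieces of this answer together, a validation function could look roughly like the sketch below. The name isValidUrl and its $requireQuery parameter are hypothetical; the sketch assumes that input without a scheme should be treated as http.

```php
<?php
// Hypothetical helper combining the scheme fix with filter_var().
// $requireQuery is an assumed option that maps to FILTER_FLAG_QUERY_REQUIRED.
function isValidUrl(string $url, bool $requireQuery = false): bool
{
    // parse_url() returns null for a component that is absent.
    if (parse_url($url, PHP_URL_SCHEME) === null) {
        $url = 'http://' . $url; // assumption: default to http
    }
    $flags = $requireQuery ? FILTER_FLAG_QUERY_REQUIRED : 0;
    return filter_var($url, FILTER_VALIDATE_URL, $flags) !== false;
}

var_dump(isValidUrl('google.com'));
var_dump(isValidUrl('http://google.com/index.html?prod=gmail&act=inbox'));
var_dump(isValidUrl('http://google.com', true)); // no query string present
```

Both plain URLs and URLs with a query string go through the same call; the flag only matters when a query string must be present.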

PHP url create from string

I would like to check a string and convert every substring that could be a link, such as http://www.google.com or www.google.com, into
<a href='http://www.google.com'>http://www.google.com</a> so that I can create real links from them.
How can I do this?
You can create the HTML links with preg_replace() in PHP:
$stringToCheck = 'http://www.google.com, or www.google.com';
$stringWithHTMLLinks = preg_replace('/\b((https?|ftp|file):\/\/|www\.|ftp\.)[-A-Z0-9+&@#\/%?=~_|!:,.;]*[A-Z0-9+&@#\/%=~_|]/si', '<a href="\0">\0</a>', $stringToCheck);
This uses the regex provided on Daring Fireball to match a URL.
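A bare www.google.com needs a scheme in its href, otherwise the browser treats the link as relative. One way to handle that is a callback, sketched below. The function name linkify is hypothetical, the pattern is a trimmed variant of the one above, and the sketch assumes http:// is the right default for www. links.

```php
<?php
// Hypothetical helper: wraps URLs in <a> tags and prepends http://
// to the href of links that start with "www.".
function linkify(string $text): string
{
    $pattern = '/\b(?:(?:https?|ftp):\/\/|www\.)[-A-Z0-9+&@#\/%?=~_|!:,.;]*[A-Z0-9+&@#\/%=~_|]/i';
    $result = preg_replace_callback(
        $pattern,
        function (array $m): string {
            $url = $m[0];
            // Assumption: scheme-less "www." links should use http.
            $href = stripos($url, 'www.') === 0 ? 'http://' . $url : $url;
            return '<a href="' . htmlspecialchars($href, ENT_QUOTES) . '">'
                . htmlspecialchars($url, ENT_QUOTES) . '</a>';
        },
        $text
    );
    return $result ?? $text; // preg_replace_callback returns null on error
}

echo linkify('http://www.google.com, or www.google.com');
// <a href="http://www.google.com">http://www.google.com</a>, or <a href="http://www.google.com">www.google.com</a>
```

Only the matched URLs are escaped here; if the surrounding text comes from user input, it should be escaped separately before output.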
