Explode a list of URLs in PHP - php

I am trying to import URLs from a list of URLs using the explode function.
For example, let's say
<?php
$urls = "http://storage.google.com/gn-0be5doc/da7f835a8c249109e7a1_solr.txt
http://google.com/gn-0be5doc/1660ed76f46bfc2467239e_solr.txt
http://google.com/gn-0be5doc/6dffbff7483625699010_solr.txt
http://google.com/gn-0be5doc/ef246266ee2e857372ae5c73_solr.txt
http://google.com/gn-0be5doc/d0565363ec338567c79b54e6_solr.txt
http://google.com/gn-0be5doc/43bd2d2abd741b2858f2b727_solr.txt
http://google.com/gn-0be5doc/eb289a45e485c38ad3a23bc4726dc_solr.txt";
$url_array = explode (" ", $urls);
?>
Since there is no obvious delimiter here, the explode function returns the whole text as a single element.
Is there a way I can get the URLs separately? Perhaps by using the trailing .txt as the end of each URL?
Thanks in advance.

It looks like all you need is:
$urls = explode("\n", $urls);
or
$urls = explode("\r\n", $urls);
If you must, you could split on http:// instead.
If it were a string without line breaks, then:
$urls = "http://storage.google.com/gn-0be5doc/da7f835a8c249109e7a1_solr.txthttp://google.com/gn-0be5doc/1660ed76f46bfc2467239e_solr.txthttp://google.com/gn-0be5doc/6dffbff7483625699010_solr.txthttp://google.com/gn-0be5doc/ef246266ee2e857372ae5c73_solr.txt";
$urls = preg_split('#(?=http://)#', $urls);
print_r($urls);
explode is not used here because it removes the delimiter.

You clearly have line breaks in your string. If they match the line ending of the platform PHP runs on, you can use PHP_EOL as the delimiter:
$url_array = explode(PHP_EOL, $urls);
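PHP_EOL is the newline of the platform PHP runs on, so it may not match the line endings actually stored in the string. A minimal sketch, assuming the list may use \n, \r\n or \r and may carry stray whitespace, is to split on the regex \R and clean up afterwards. The shortened URLs are placeholders, not the ones from the question:

```php
<?php
// Placeholder list with mixed line endings (assumption for illustration).
$urls = "http://example.com/a_solr.txt\r\nhttp://example.com/b_solr.txt\n  http://example.com/c_solr.txt \n";

// \R matches any newline sequence: \n, \r\n or \r.
$url_array = preg_split('/\R/', $urls);

// Trim surrounding whitespace, drop empty entries, then reindex from 0.
$url_array = array_values(array_filter(array_map('trim', $url_array), 'strlen'));

print_r($url_array);
```

The trailing newline would otherwise leave an empty last element, which is why the filter step is there.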

Related

SPLIT URL in PHP

I have the URL below in my code and I want to split it and get the number from it.
For example, from the URL below I need to fetch 123456:
https://review-test.com/#/c/123456/
I have tried this and it is not working:
$completeURL = https://review-test.com/#/c/123456/ ;
list($url, $number) = explode('#c', preg_replace('/^.*\/+/', '', $completeURL));
Use parse_url
It's specifically made for this sort of thing.
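For reference, a minimal sketch of how parse_url breaks this URL apart. Everything after # lands in the fragment component; the variable names are only illustrative:

```php
<?php
$completeURL = 'https://review-test.com/#/c/123456/';

// parse_url() returns an associative array of the components it finds.
$parts = parse_url($completeURL);
// $parts['scheme'] is 'https', $parts['host'] is 'review-test.com',
// and $parts['fragment'] is '/c/123456/'.

// Take the last non-empty segment of the fragment.
$segments = explode('/', trim($parts['fragment'] ?? '', '/'));
$number = end($segments);

echo $number; // 123456
```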
You can also do this without regex:
$completeURL = 'https://review-test.com/#/c/123456/' ;
list($url, $number) = explode('#c', str_replace('/', '', $completeURL));
echo $number;
If you want to get the /c/123456/ parameters, you can do the following:
$url = 'https://review-test.com/#/c/123456/';
$url_fragment = parse_url($url, PHP_URL_FRAGMENT);
$fragments = explode('/', $url_fragment);
$fragments = array_filter(array_map('trim', $fragments));
$fragments = array_values($fragments);
PHP_URL_FRAGMENT makes parse_url return the part of the URL after #.
After parse_url you end up with a string like this: '/c/123456/'
explode('/', $url_fragment) returns an array with empty elements where the leading and trailing '/' were.
array_filter removes those empty elements, and array_map with trim strips excess whitespace. Trimming changes nothing in this case, but in a real-world scenario it is safer to trim.
If you var_dump the result at that point, you can see that the keys are no longer sequential, which is why the array is reindexed with array_values($fragments).
You should try this: basename
basename — Returns trailing name component of path
<?php
echo basename("https://review-test.com/#/c/123456/");
?>
Demo : http://codepad.org/9Ah83qaP
Alternatively, you can use a plain regex to fetch the number from the string:
preg_match('!\d+!', "https://review-test.com/#/c/123456/", $matches);
print_r($matches);
Simply:
$parts = explode('/', rtrim($completeURL, '/'));
$tmp = end($parts);
This splits the string on '/' (after dropping the trailing slash) and takes the last element.
If you have no other option than regex, for your example data you could use preg_match to split your URL instead of preg_replace.
An approach could be to:
Capture the first part as a group: (.+\/)
Then capture your number as a group: (\d+)
Followed by a forward slash at the end of the line: \/$
This takes the last number in the URL that is followed by a trailing forward slash.
Then you can use list and skip the first item of the $matches array, because it contains the text that matched the full pattern.
$completeURL = "https://review-test.com/#/c/123456/";
preg_match('/(.+\/)(\d+)\/$/', $completeURL, $matches);
list(, $url, $number) = $matches;
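Note that preg_match returns 0 when the pattern does not match, in which case $matches is empty and the list() assignment would raise warnings. A small sketch of the same pattern with that case guarded; the helper name splitReviewUrl is hypothetical:

```php
<?php
// Hypothetical helper: returns [prefix, number], or null when the URL
// does not end in "/<digits>/".
function splitReviewUrl(string $completeURL): ?array
{
    if (preg_match('/(.+\/)(\d+)\/$/', $completeURL, $matches) !== 1) {
        return null;
    }
    // $matches[0] is the full match, so skip it.
    list(, $url, $number) = $matches;
    return [$url, $number];
}

var_dump(splitReviewUrl('https://review-test.com/#/c/123456/'));
var_dump(splitReviewUrl('https://review-test.com/#/c/'));
```

The first call yields the prefix and the number as strings; the second has no digits before the final slash, so it returns null.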

How to get last part of a string?

I have this string:
"application/controllers/backend"
I want to get:
backend
Of course, backend is dynamic and can change, so I'm looking for a solution that gives me only the last part of the string. How can I do that?
You can take advantage of basename() to get the last part.
In your case, it will be:
basename("application/controllers/backend");
Output:
backend
Something like this:
echo end(explode("/", $url));
If this throws an error (end() takes its argument by reference, so it expects a variable), then do:
$parts = explode("/", $url);
echo end($parts);
$arr = explode ("/", $string);
//$arr[2] is your third element in the string
http://php.net/manual/en/function.explode.php
Just use
basename("application/controllers/backend");
http://php.net/manual/en/function.basename.php
And, if you want to do it with a regex:
$result = (preg_match('%.*[/\\\\](.*?)$%', $url, $regs)) ? $regs[1] : '';
You did ask initially for a solution with regex, so, although the other answers haven't involved regex, here is one approach which does.
You can use preg_match and str_replace for this:
$string = '"application/controllers/backend"';
preg_match('/[^\/]+"/', $string, $matches);
$last_item = str_replace('"','',$matches[0]);
$last_item is now a string containing the word backend.

Remove everything before http in every element of array

I have an array called $urls and I want to remove everything before http in every element of the array.
suppose
$urls[1] = hd720\u0026url=http%3A%2F%2Fr2---sn-h50gpup0nuxaxjvh-hg0l.googlevideo.com%2Fvideoplayback%3Fexpire%3D1387559704%26fexp%3D937407%252C908540%252C941239%252C916623%252C909717%252C932295%252C936912%252C936910%252C923305%252C936913%252C907231%252C907240%252C921090%
I want it to be
$urls[1] = http%3A%2F%2Fr2---sn-h50gpup0nuxaxjvh-hg0l.googlevideo.com%2Fvideoplayback%3Fexpire%3D1387559704%26fexp%3D937407%252C908540%252C941239%252C916623%252C909717%252C932295%252C936912%252C936910%252C923305%252C936913%252C907231%252C907240%252C921090%
Here I gave an example only for $urls[1], but I want to remove every character up to http in ALL elements of the array.
I tried
$urls = strstr($urls, 'http');
$urls = preg_replace('.*(?=http://)', '', $urls);
Neither worked.
Use array_map() with a callback function:
$urls = array_map(function($url) {
    // decode first so the literal "http://" is present, then drop everything before it
    return preg_replace('~.*(?=http://)~', '', urldecode($url));
}, $urls);
strstr coupled with array_map gives you the expected result.
$furls = array_map('filterArr',$urls);
function filterArr($v)
{
return urldecode(strstr($v,'http'));
}
print_r($furls);
I'd do it like this:
foreach($urls as $key=>$val) {
$e = &$urls[$key]; // notice the & sign
// now whatever you do with $e will go back
// into the original array element
$e = preg_replace(.............);
}
I always use this technique to convert arrays since it's fast and efficient. The array_walk / array_filter way is also good but much slower when your array is medium to big.
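A runnable sketch of this loop. The preg_replace arguments are elided above, so the call used here (borrowed from the array_map answer, including its urldecode step) is only an assumption for illustration, and the data is placeholder data. Note the unset() after the loop, since $e otherwise stays bound to the last element:

```php
<?php
// Placeholder data (assumption): encoded URLs with a junk prefix.
$urls = [
    'hd720\u0026url=http%3A%2F%2Fexample.com%2Fa',
    'medium\u0026url=http%3A%2F%2Fexample.com%2Fb',
];

foreach ($urls as $key => $val) {
    $e = &$urls[$key]; // reference to the original element
    // Assumed replacement, not necessarily what the author had in mind.
    $e = preg_replace('~.*(?=http://)~', '', urldecode($e));
}
unset($e); // break the reference so later writes to $e do not touch $urls

print_r($urls);
```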
You can cut off everything before http with explode.
$string = explode("http", $urls[1], 2); // split one element at the first "http"
$str = $string[0]; // the part before http, e.g. hd720\u0026url=
echo $str;
Also note that $string[1] holds the part after http: %3A%2F%2Fr2---sn-h50...
So you can do something like this:
$str1 = $string[1];
$fixedUrl = 'http'.$str1; // the fixed URL: http%3A%2F%2Fr2---sn-h50gpup0nuxaxjvh-hg0l...
You are just missing delimiters around your regex; preg_replace also works on an array:
$urls = preg_replace('~.*(?=http://)~', '', $urls);
// the ~ characters at both ends of the pattern are the added delimiters
I used ~ to avoid escaping the //. With / as the delimiter, it would be:
$urls = preg_replace('/.*(?=http:\/\/)/', '', $urls);
// here the / characters at both ends are the delimiters
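One caveat: the sample value in the question is URL-encoded (http%3A%2F%2F...), so a lookahead for the literal http:// would find nothing in it unless the value is decoded first. A sketch under that assumption, anchoring on http alone and using a lazy quantifier so the cut happens at the first occurrence; the shortened values are placeholders:

```php
<?php
$urls = [
    1 => 'hd720\u0026url=http%3A%2F%2Fexample.com%2Fvideoplayback%3Fexpire%3D1',
    2 => 'http://example.com/already-clean',
];

// preg_replace() accepts an array subject and returns an array with the keys preserved.
$urls = preg_replace('~^.*?(?=http)~', '', $urls);

print_r($urls);
```

Elements that already start with http are left unchanged, because the lazy match before the lookahead is empty.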

php preg_replace words last part

$str='<p>http://domain.com/1.html?u=1234576</p><p>http://domain.com/2.html?u=2345678</p><p>http://domain.com/3.html?u=3456789</p><p>http://domain.com/4.html?u=56789</p>';
$str = preg_replace('/.html\?(.*?)/','.html',$str);
echo $str;
I need to get
<p>http://domain.com/1.html</p>
<p>http://domain.com/2.html</p>
<p>http://domain.com/3.html</p>
<p>http://domain.com/4.html</p>
I want to remove ?u=*number* from the end of every URL. Thanks.
Change this line:
$str = preg_replace('/.html\?(.*?)/','.html',$str);
into this:
$str = preg_replace('/\.html\?(.*?)</','.html<',$str);
An alternative to the other answers:
preg_replace("/<p>([^?]*)\?[^<]*<\/p>/", "<p>$1</p>", $input);
This matches any URL with query parameters, not only those that point to .html files.
For example, it also handles these kinds of values:
<p>http://domain.com/1.php?u=1234576</p>
<p>http://domain.com?u=1234576</p>
<p>http://domain.com</p>
<p>http://domain.com/pages/users?uid=123</p>
With an output of:
<p>http://domain.com/1.php</p>
<p>http://domain.com</p>
<p>http://domain.com</p>
<p>http://domain.com/pages/users</p>
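A runnable sketch of this idea on the four sample values. It is a slight variant, not the exact pattern above: the captured class is tightened to [^?<]* (an assumption on my part) so that a capture cannot run across a </p><p> boundary when a paragraph has no query string:

```php
<?php
$input = implode("\n", [
    '<p>http://domain.com/1.php?u=1234576</p>',
    '<p>http://domain.com?u=1234576</p>',
    '<p>http://domain.com</p>',
    '<p>http://domain.com/pages/users?uid=123</p>',
]);

// [^?<]* keeps the captured URL inside a single <p>...</p> pair;
// \?[^<]* then consumes the query string up to the closing tag.
$output = preg_replace('~<p>([^?<]*)\?[^<]*</p>~', '<p>$1</p>', $input);

echo $output;
```

Paragraphs without a ? do not match the pattern and pass through untouched.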
This code loads the URLs into an array so they can be handled on the fly:
$str = '<p>http://domain.com/1.html?u=1234576</p><p>http://domain.com/2.html?u=2345678</p><p>http://domain.com/3.html?u=3456789</p><p>http://domain.com/4.html?u=56789</p>';
$str = str_replace("<p>","",$str);
$links = preg_split('`\?.*?</p>`', $str,-1,PREG_SPLIT_NO_EMPTY);
foreach($links as $v) {
echo "<p>".$v."</p>";
}

PHP - strip URL to get tag name

I need to strip a URL using PHP to add a class to a link if it matches.
The URL would look like this:
http://domain.com/tag/tagname/
How can I strip the URL so I'm only left with "tagname"?
So basically it takes out the final "/" and the start "http://domain.com/tag/"
For your URL
http://domain.com/tag/tagname/
The PHP function to get "tagname" is called basename():
echo basename('http://domain.com/tag/tagname/'); # tagname
Combine substr with a position search: first remove the trailing '/' from the URL, then find the index of the last remaining '/' and pass it to substr to take everything after it.
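A minimal sketch of that substring approach, assuming the URL always has the shape shown in the question; the variable names are only illustrative:

```php
<?php
$url = 'http://domain.com/tag/tagname/';

// Drop the trailing slash so the last '/' is the one before the tag name.
$trimmed = rtrim($url, '/');

// strrpos() finds the last occurrence; it returns false if there is none.
$pos = strrpos($trimmed, '/');
$tagname = ($pos === false) ? $trimmed : substr($trimmed, $pos + 1);

echo $tagname; // tagname
```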
As an alternative to the substring based answers, you could also use a regular expression, using preg_split to split the string:
<?php
$ptn = "/\//";
$str = "http://domain.com/tag/tagname/";
$result = preg_split($ptn, $str);
$tagname = $result[count($result)-2];
echo($tagname);
?>
(The -2 is needed because, due to the trailing /, the final element of the array is an empty string.)
And as an alternative to that, you could also use preg_match_all:
<?php
$ptn = "/[a-z]+/";
$str = "http://domain.com/tag/tagname/";
preg_match_all($ptn, $str, $matches);
$tagname = $matches[0][count($matches[0])-1]; // $matches[0] holds all full-pattern matches
echo($tagname);
?>
Many thanks to all, this code works for me:
$ptn = "/\//";
$str = "http://domain.com/tag/tagname/";
$result = preg_split($ptn, $str);
$tagname = $result[count($result)-2];
echo($tagname);
