In PHP/Apache I can get the full URL and cut it up into parts like this:
URL: example.com/friends/enemies-cats/
Then, using PHP's explode function, I can split the URL by "/" into an array.
Array[0] = 'friends';
Array[1] = 'enemies-cats';
I wonder, is it possible to do the same thing on a Java server? I am hoping the same approach works on all servers, e.g. Tomcat, JBoss, WebSphere. I would prefer not to use things like urlrewriter if I can avoid it.
Also is it possible to achieve the same thing in ASP?
Realistically, I would like to find the easiest way to convert the URL to an array in each of PHP, JSP, and ASP.
If it is possible, any idea where to start? Any limitations? Any security issues, etc.?
JSP:
String[] stringArray = url.split("/");
PHP: You already have it...
$parts = explode('/', $url);
ASP: I don't know ASP, but here is what Google found (VBScript, so no trailing semicolon):
parts = Split(url, "/")
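On the limitations asked about in the question: a plain split on "/" leaves empty strings for leading and trailing slashes and keeps any query string glued to the last segment. Below is a minimal PHP sketch that handles both, assuming the input is a request path such as the value of $_SERVER['REQUEST_URI']. The function name pathSegments is made up for illustration.

```php
<?php
/**
 * Split a request URI into decoded path segments.
 * pathSegments() is a hypothetical helper, not a built-in.
 */
function pathSegments(string $requestUri): array
{
    // Keep only the path; this drops any query string or fragment.
    $path = parse_url($requestUri, PHP_URL_PATH);
    if (!is_string($path)) {
        return []; // malformed input or no path component
    }

    // Split on "/", then discard the empty pieces produced by
    // leading, trailing, or doubled slashes.
    $segments = array_filter(
        explode('/', $path),
        static function (string $segment): bool {
            return $segment !== '';
        }
    );

    // Decode percent-escapes per segment and reindex from 0.
    return array_values(array_map('rawurldecode', $segments));
}

print_r(pathSegments('/friends/enemies-cats/?page=2'));
// Array ( [0] => friends [1] => enemies-cats )
```

On the security side, the segments are user input like any GET parameter, so it is probably safest to validate them (for example against a list of known values) before using them in file paths or database queries.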
I've recently run into a bug in PHP 7.1 which seems to have come back after being fixed in PHP 5.4.7
The problem is simply that if you pass a url to parse_url() and the url doesn't have a scheme it will return the whole url as if it's just a path. For example:
var_dump(parse_url('google.co.uk/test'));
Result:
array(1) { ["path"]=> string(17) "google.co.uk/test" }
While in reality here it should split into its domain and path.
I run parse_url tens of millions of times a day as part of URL decryption / encryption functionality. I'm looking for a fast way to fix this edge-case bug, or a reliable alternative to parse_url.
Edit:
Thanks for the helpful responses, here's the solution I used in the end, I hope it helps someone. I won't submit it as an answer because I already marked someone else as correct (which they are) which allowed me to write this.
$parsedUrl = parse_url($uri);
// if the uri has no scheme, parse_url() sees no host and returns everything as the path
if ($parsedUrl !== false && !isset($parsedUrl['host'])) {
// with a double slash prepended, $uri is parsed as if it had a scheme, and no scheme appears in the result
$parsedUrl = parse_url('//' . $uri);
}
if ($parsedUrl === false) {
throw new MalformedUrlException('Malformed URL: ' . $uri);
}
// use parsed url as needed
parse_url() needs some indication of where the host begins in the given string.
This is why parse_url('//domain/path') works: the leading double slash marks the host, and the result simply contains no scheme.
As for the problem you want solved: PHP would need to know every domain there is, and then guess which reading the user intended (basically impossible).
Take for example the URL 'http://whois.domaintools.com/test.at'. If I pass only the path, 'test.at', is that a path or a domain?
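To make the ambiguity concrete, here is a small sketch in which the caller, not PHP, decides that a scheme-less string starts with a host. The helper name parseAssumingHost is hypothetical, and the scheme check is a simple pattern chosen for illustration.

```php
<?php
/**
 * Parse a URI that may lack a scheme, assuming its first segment is a host.
 * parseAssumingHost() is a hypothetical helper for illustration.
 */
function parseAssumingHost(string $uri): array
{
    // Already has "scheme://" or a leading "//": parse_url() can see the host.
    $hasAuthority = preg_match('#^([a-z][a-z0-9+.-]*:)?//#i', $uri) === 1;

    $parsed = parse_url($hasAuthority ? $uri : '//' . $uri);
    if ($parsed === false) {
        throw new InvalidArgumentException('Malformed URL: ' . $uri);
    }
    return $parsed;
}

var_dump(parse_url('test.at'));         // only a "path" key: nothing marks it as a host
var_dump(parseAssumingHost('test.at')); // only a "host" key: the caller's assumption was applied
```

This only moves the decision to the caller. If the input can also contain genuine relative paths, the assumption will misclassify them.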
Here is the format of the affiliate URL I have: http://tracking.vcommission.com/aff_c?offer_id=2119&&url=http%3A%2F%2Fwww.netmeds.com%2F%3Fsource_attribution%3DVC-CPS-Emails%26utm_source%3DVC-CPS-Emails%26utm_medium%3DCPS-Emails%26utm_campaign%3DEmails
As you can see, it contains two URLs:
First URL: vcommission.com
Second URL: netmeds.com
I have a CSV file with a lot of rows. Each row may have a different second URL, and I want to get the second URL for each row. The first URL is not static either; it differs from one CSV file to another.
How can I get the second URL?
Some basic string parsing like this should give you an idea.
$url='http://tracking.vcommission.com/aff_c?offer_id=2119&&url=http%3A%2F%2Fwww.netmeds.com%2F%3Fsource_attribution%3DVC-CPS-Emails%26utm_source%3DVC-CPS-Emails%26utm_medium%3DCPS-Emails%26utm_campaign%3DEmails';
list($u,$q)=explode('url=',urldecode($url));
$o=(object)parse_url($q);
echo $o->host;
A good way to find the domain of a URL is parse_url().
Because the second URL here is percent-encoded inside the query string, parse_url() alone will not expose it. One option is to use a regex to find web addresses contained in the query string:
<?php
$url = "http://tracking.vcommission.com/aff_c?offer_id=2119&&url=http%3A%2F%2Fwww.netmeds.com%2F%3Fsource_attribution%3DVC-CPS-Emails%26utm_source%3DVC-CPS-Emails%26utm_medium%3DCPS-Emails%26utm_campaign%3DEmails";
$p = parse_url($url);
$pattern = "/www[^%]*/";
preg_match($pattern, $p['query'], $result);
var_dump($result);
You may need to adjust the regex pattern based on how the other data presents itself.
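As an alternative to splitting on 'url=' or matching with a regex, the embedded address can be read as an ordinary query parameter. The sketch below assumes the parameter is always named url, as in the sample link. The helper name embeddedUrlHost is made up, and the sample URL is shortened.

```php
<?php
/**
 * Return the host of the URL carried in a query parameter, or null.
 * embeddedUrlHost() and the default parameter name "url" are assumptions
 * based on the sample link.
 */
function embeddedUrlHost(string $affiliateUrl, string $param = 'url'): ?string
{
    $query = parse_url($affiliateUrl, PHP_URL_QUERY);
    if (!is_string($query)) {
        return null; // no query string at all
    }

    // parse_str() splits on "&" and percent-decodes each value.
    parse_str($query, $params);
    if (!isset($params[$param]) || !is_string($params[$param])) {
        return null;
    }

    $host = parse_url($params[$param], PHP_URL_HOST);
    return is_string($host) ? $host : null;
}

$url = 'http://tracking.vcommission.com/aff_c?offer_id=2119&&url=http%3A%2F%2Fwww.netmeds.com%2F%3Fsource_attribution%3DVC-CPS-Emails';
echo embeddedUrlHost($url), PHP_EOL; // www.netmeds.com
```

For the CSV case, the same function could be called once per row. Rows where it returns null would need a closer look.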
I have seen on most online newspaper websites that when I click on a headline link, e.g. "two thieves caught red handed", it normally opens a URL like this: www.example.co.uk/news/two-thieves-caught-red-handed.
How do I deal with this URL in PHP code so that I can pick out only the last part, e.g. two-thieves-caught-red-handed? After that I want to work with this string.
I know how to deal with GET parameters like "www.example.co.uk/news?headline=two thieves caught red handed".
But I do not want to do it that way. Could you show me another way?
You can use a combination of the explode and end functions for that,
for example:
<?php
$url = "www.example.co.uk/news/two-thieves-caught-red-handed";
$url = explode('/', $url);
$end = end($url);
echo $end;
?>
The code will output:
two-thieves-caught-red-handed
You have several options in php to get the current url. For a detailed overview look here.
One would be to use $_SERVER['REQUEST_URI'] and then use a string manipulation function to extract the parts you need.
Maybe this thread will help you too.
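A rough sketch of the $_SERVER['REQUEST_URI'] approach follows. The request URI is hard-coded so the snippet runs on its own, and lastPathSegment is a hypothetical helper name. It also assumes the web server already routes such URLs to the PHP script.

```php
<?php
/**
 * Return the last non-empty path segment of a request URI, or '' if none.
 * lastPathSegment() is a hypothetical helper.
 */
function lastPathSegment(string $requestUri): string
{
    // Drop the query string and fragment, keep the path only.
    $path = parse_url($requestUri, PHP_URL_PATH);
    if (!is_string($path)) {
        return '';
    }

    // Ignore a trailing slash so "/news/foo/" still yields "foo".
    $segments = explode('/', rtrim($path, '/'));
    return rawurldecode((string) end($segments));
}

// In a real request this would be $_SERVER['REQUEST_URI'].
$requestUri = '/news/two-thieves-caught-red-handed?utm_source=feed';
echo lastPathSegment($requestUri), PHP_EOL; // two-thieves-caught-red-handed
```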
Say I have a url like this in a php variable:
$url = "http://mywebsite.extension/names/level/etc/page/x";
How would I automatically remove everything after the .com (or other extension) and before /page/x?
Basically, I would like every URL that could be in $url to become http://mywebsite.extension/page/x
Is there a way to do this in php? :s
thanks for your help guys!
I think parse_url() is the function you're looking for. You can use it to break a URL down into its component parts, and then put it back together however you want, adding in your own things as needed.
As PeeHaa noted, explode() will be useful for dividing up the path.
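One possible way to combine the two, sketched under the assumption that the part to keep is always the last two path segments ("page" and its value). The helper name keepLastSegments is invented for this example.

```php
<?php
/**
 * Rebuild a URL keeping only the scheme, host, and last $keep path segments.
 * keepLastSegments() is a hypothetical helper; $keep is expected to be >= 1.
 */
function keepLastSegments(string $url, int $keep = 2): string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        throw new InvalidArgumentException('Not an absolute URL: ' . $url);
    }

    // Split the path and drop empty pieces from leading/trailing slashes.
    $segments = array_values(array_filter(
        explode('/', $parts['path'] ?? ''),
        static function (string $segment): bool {
            return $segment !== '';
        }
    ));

    // A negative offset takes segments from the end of the array.
    $tail = array_slice($segments, -$keep);

    return $parts['scheme'] . '://' . $parts['host'] . '/' . implode('/', $tail);
}

echo keepLastSegments('http://mywebsite.extension/names/level/etc/page/x'), PHP_EOL;
// http://mywebsite.extension/page/x
```

This sketch drops any port, query string, or fragment. If those matter, they would have to be appended from $parts as well.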
It seems Google's URLs are structured differently these days. So it is harder to extract the referring keyword from them. Here is an example:
http://www.google.co.uk/search?q=jquery+post+output+46&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a#pq=jquery+post+output+46&hl=en&cp=30&gs_id=1v&xhr=t&q=jquery+post+output+php+not+running&pf=p&sclient=psy-ab&client=firefox-a&hs=8N5&rls=org.mozilla:en-US%3Aofficial&source=hp&pbx=1&oq=jquery+post+output+php+not+run&aq=0w&aqi=q-w1&aql=&gs_sm=&gs_upl=&bav=on.2,or.r_gc.r_pw.,cf.osb&fp=bdeb326aa44b07c5&biw=1280&bih=875
The search I performed was actually "jquery post output php not running", so the first 'q=' does not contain the full search. The second one does. I'd like to write a script that always extracts the last 'q=', but I'm not sure whether Google's URLs always have the full search last. Has anyone had any experience with this?
You can accomplish this using parse_url() and parse_str(), where $str is the referrer string. parse_str() already URL-decodes the values, so no extra urldecode() call is needed:
$fragment = parse_url($str, PHP_URL_FRAGMENT);
parse_str((string) $fragment, $arr);
$query = $arr['q']; // jquery post output php not running
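Since not every referrer will carry a fragment, a slightly more defensive variant might look at the fragment first and fall back to the regular query string. This is only a sketch: searchTermFromReferrer is a made-up name, and the sample URL is a shortened version of the one in the question.

```php
<?php
/**
 * Return the "q" parameter from the fragment if present, else from the query.
 * searchTermFromReferrer() is a hypothetical helper.
 */
function searchTermFromReferrer(string $referrer): ?string
{
    // Check the fragment first, then the regular query string.
    foreach ([PHP_URL_FRAGMENT, PHP_URL_QUERY] as $component) {
        $raw = parse_url($referrer, $component);
        if (!is_string($raw)) {
            continue; // component missing or URL malformed
        }

        // parse_str() decodes "+" and percent-escapes in the values.
        parse_str($raw, $params);
        if (isset($params['q']) && is_string($params['q']) && $params['q'] !== '') {
            return $params['q'];
        }
    }
    return null;
}

$ref = 'http://www.google.co.uk/search?q=jquery+post+output+46&ie=utf-8'
     . '#pq=jquery+post+output+46&hl=en&q=jquery+post+output+php+not+running&pf=p';
echo searchTermFromReferrer($ref), PHP_EOL; // jquery post output php not running
```

Keep in mind that browsers normally strip the fragment from the Referer header, so on a live site the query-string fallback may be the branch that actually runs.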