Parsing out "site.com" from a passed variable in PHP? - php

I'm passing through a variety of URLs in a global variable called target_passthrough, so the URL of a page might look like:
http://www.mysite.com/index.php?target_passthrough=example.com
Or something like that. Formats for that variable may be a variety of things such as (minus quotes):
"www.example.com"
".example.com"
"example.com"
"http://www.example.com"
".example.com/subdir/"
".example.com/subdir/page.php"
"example.com/subdir/page.php"
Please note how some of those have a period as the first character, such as 2, 5, and 6.
Now, what I am trying to do is pull out just "example.com" from any of those possible scenarios with PHP and store it to a variable to echo out later. I tried parse_url but it gives me the "www" when that is present, which I do not want. In instances where the url is just "example.com" it returns a null value.
I don't really know how to do regex matching or if that is even what I need so any guidance would be appreciated--not really that advanced at php.

As you pointed out, you can use parse_url to do much of the work for you and then simply strip off the www or leading dot if it is present.
An alternative strategy of taking the last two "words" won't always work because there are domains like www.example.co.uk. Using this strategy would give you co.uk instead of example.co.uk. There is no simple rule for determining which parts are the domain or the sub-domain.
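A minimal sketch of the parse_url approach described above, run against the formats listed in the question. The helper name extract_domain is hypothetical, and the code assumes it is acceptable to prepend "http://" when no scheme is given, since parse_url only reports a host when a scheme is present. It strips only a leading dot and a leading "www."; other sub-domains are left alone.

```php
<?php
// Hypothetical helper (not from the original post): reduce any of the
// listed input formats to a bare host such as "example.com".
function extract_domain(string $input): ?string
{
    $input = trim($input);

    // parse_url() only fills in the host when a scheme is present,
    // so drop any leading dots and prepend a scheme if there is none.
    if (!preg_match('#^[a-z][a-z0-9+.-]*://#i', $input)) {
        $input = 'http://' . ltrim($input, '.');
    }

    $host = parse_url($input, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return null; // nothing usable was found
    }

    // Remove a leading dot and a single leading "www." if present.
    return preg_replace('/^www\./i', '', ltrim($host, '.'));
}

$candidates = [
    'www.example.com',
    '.example.com',
    'example.com',
    'http://www.example.com',
    '.example.com/subdir/',
    '.example.com/subdir/page.php',
    'example.com/subdir/page.php',
];

foreach ($candidates as $candidate) {
    echo extract_domain($candidate), PHP_EOL; // example.com for each input
}
```

Note that www.example.co.uk would come back as example.co.uk here only because "www." is stripped explicitly; the sketch makes no attempt to tell registrable domains from sub-domains in general.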

parse_url() returns an array of the different parts of the URL. You are getting null values because you are only referencing one item in the array, and that item is absent when the input has no scheme such as http://. The possible keys are:
Array (
[scheme] => http
[host] => hostname
[user] => username
[pass] => password
[path] => /path
[query] => arg=value
[fragment] => anchor
)

Related

PHP: rawurldecode() not showing plus sign

I have a URL like this:
abc.com/my+string
When I get the parameter, it obviously replaces the + with a space, so I get my string
I replaced the + in the url with %2B, then I use rawurldecode(), but the result is the same. Tried with urldecode() but I still can't get the plus sign in my variable, it's always an empty space.
Am I missing something, how do I get exactly my+string in PHP from the url abc.com/my%2Bstring ?
Thank you
In general, you don’t need to URL-decode GET parameter values manually, since PHP already does that for you, automatically. abc.com?var=my%2Bstring -> $_GET['var'] will contain my+string
The problem here was that URL rewriting was in play. As http://httpd.apache.org/docs/2.2/rewrite/flags.html#flag_b explains,
mod_rewrite has to unescape URLs before mapping them, so backreferences will be unescaped at the time they are applied.
So mod_rewrite has decoded my%2Bstring to my+string, and when you rewrite this as a query string parameter, you effectively get ?var=my+string. And when PHP applies automatic URL decoding on that value, the + will become a simple space.
The [B] flag exists to make mod_rewrite re-encode the value again.
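The decoding behaviour described above can be reproduced without a web server. This is only a sketch: it uses parse_str to stand in for the automatic decoding PHP applies to $_GET, and the variable names are made up for illustration.

```php
<?php
// What PHP sees after mod_rewrite has already unescaped %2B into "+".
parse_str('var=my+string', $afterRewrite);

// What PHP sees when the value is still percent-encoded
// (for example when the [B] flag re-encodes the backreference).
parse_str('var=my%2Bstring', $stillEncoded);

echo $afterRewrite['var'], PHP_EOL; // my string  (the "+" was read as a space)
echo $stillEncoded['var'], PHP_EOL; // my+string

// rawurldecode() never turns "+" into a space, but it cannot help once
// the query-string decoding has already replaced the plus sign.
echo rawurldecode('my+string'), PHP_EOL; // my+string
echo urldecode('my+string'), PHP_EOL;    // my string
```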
Like this:
echo urldecode("abc.com/my%2Bstring"); // => abc.com/my+string
echo PHP_EOL;
echo rawurldecode("abc.com/my%2Bstring"); // => abc.com/my+string
Further, if you want to get the actual my+string, you can use the parse_url function that ships with PHP, although you have to pass it a full URL.
Another way is to explode the value on / and read the part you need, like this:
$parts = explode('/', 'abc.com/my+string'); // => Array(2)
echo $parts[1] ?? 'not found'; // => string|not found
Also read the documentation on both: urldecode and rawurldecode.

how to turn ?page=home to ?home and still use $_GET

Is there any way to turn index.php?page=home into just index.php?home, so I can still use $_GET? Or is this impossible? I would have googled this but I couldn't find anything anywhere.
UPDATE: let me be more clear, I need to get the string AFTER index.php? so I can use it in my PHP code
If you want to take the URL and strip the hostname out, you can use Robert Elwell's answer:
$foo = "http://www.example.com/foo/bar?hat=bowler&accessory=cane";
$blah = parse_url($foo);
print_r($blah);
Array
(
[scheme] => http
[host] => www.example.com
[path] => /foo/bar
[query] => hat=bowler&accessory=cane
)
If you want to use any string you could use substr to return only a part of a string.
If you have a more concrete example I could give an example on how to solve it.
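For the updated question (getting the string after index.php?), one possible sketch is below. The helper name page_from_url is hypothetical. It assumes the page name is the first token of the query string; inside a live request the same raw string is available in $_SERVER['QUERY_STRING'], so no URL parsing would be needed there.

```php
<?php
// Hypothetical helper: return the bare token that follows "?" in a URL,
// e.g. "home" for "index.php?home". Returns null when there is no query.
function page_from_url(string $url): ?string
{
    $query = parse_url($url, PHP_URL_QUERY);
    if (!is_string($query) || $query === '') {
        return null;
    }

    // Keep only the first "&"-separated part, then drop any "=value" suffix.
    $first = explode('&', $query, 2)[0];
    $name  = explode('=', $first, 2)[0];

    return urldecode($name);
}

echo page_from_url('http://www.example.com/index.php?home'), PHP_EOL;       // home
echo page_from_url('http://www.example.com/index.php?home&x=1'), PHP_EOL;   // home
var_dump(page_from_url('http://www.example.com/index.php'));                // NULL
```

With a URL such as index.php?home, PHP also fills $_GET with the key "home" and an empty value, so reading the first key of $_GET is an alternative to parsing the string yourself.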

Rewrite rules - Just wanna know is it possible?

Example:
I want to make an url with different categorised variables. And there is an order. First variable is vegetables({variable01}), second is fruits({variable02}), third is trees ({variable03}) etc.
xxx.com/{variable01}-{variable02}-{variable03}-{variable04}-......
Yes I got this url.
BUT
What if a variable has two words (or three) AND I want the separator to be a hyphen as well (brussels-sprouts)?
Example:
xxx.com/brussels-sprouts-{variable02}-{variable03}-{variable04}-......
or
xxx.com/{variable02}-green-apple-{variable03}-{variable04}-......
xxx.com/brussels-sprouts-green-apple
How can this be possible?
Thanks.
There is no way to specify the boundaries of your variables anymore, so it's only possible if you know the exact number of variables and only one of your variables may contain hyphens. You can create a regex for that:
First variable:
^([\w-]+)-(\w+)-(\w+)-(\w+)$
Second variable:
^(\w+)-([\w-]+)-(\w+)-(\w+)$
But: if you know that each variable can have at most 1 hyphen, you can also do this:
/var1-foo1-var2-foo2-var3-foo3-var4
RewriteRule ^(\w+(-\w+)?)-(\w+(-\w+)?)-(\w+(-\w+)?)-(\w+(-\w+)?)$ index.php?var1=$1&var2=$3&var3=$5&var4=$7 [L]
Which results in:
array (
'var1' => 'var1-foo1',
'var2' => 'var2-foo2',
'var3' => 'var3-foo3',
'var4' => 'var4',
)
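The same pattern can be tried out in plain PHP before putting it into a rewrite rule. This is just a sketch with preg_match; the odd-numbered groups hold the variables, as in the RewriteRule above, and the $path value is the example path from this answer.

```php
<?php
// Same pattern as the RewriteRule, with PHP regex delimiters added.
$pattern = '/^(\w+(-\w+)?)-(\w+(-\w+)?)-(\w+(-\w+)?)-(\w+(-\w+)?)$/';
$path    = 'var1-foo1-var2-foo2-var3-foo3-var4';

$vars = [];
if (preg_match($pattern, $path, $m)) {
    // Groups 1, 3, 5 and 7 are the variables; the even groups are
    // only the optional "-word" suffixes nested inside them.
    $vars = [
        'var1' => $m[1],
        'var2' => $m[3],
        'var3' => $m[5],
        'var4' => $m[7],
    ];
}

print_r($vars);
```

Because each group is greedy, a hyphenated pair is assigned to the earliest variable that can take it, which is why var4 ends up with a single word in this example.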

preg_match from URL string

I have a string passed through a campaign source that looks like this:
/?source=SEARCH%20&utm_source=google&utm_medium=cpc&utm_term=<keyword/>&utm_content={creative}&utm_campaign=<campaign/>&cpao=111&cpca=<campaign/>&cpag=<group/>&kw=<mpl/>
When it's present I need to cut this up and pass it through to our form handler so we can track our campaigns. I can check for it, hold its contents in a cookie and pass it throughout our site, but I am having an issue using preg_match to cut this up and put it into variables so I can pass their values to the handler. I want the end product to look like:
$utm_source=google;
$utm_medium=cpc;
$utm_term=<keyword/>
There is no set number of characters, it could be Google, Bing etc., so I am trying to use preg_match to get the first part (utm_source) and stop just before the next & and so forth, but I don't understand preg_match well enough to do this.
PHP should be parsing your query string for you, into $_GET. Otherwise, PHP knows how to parse query strings. Don't use regular expressions for this, use parse_str.
Input:
<?php
$str = "/?source=SEARCH%20&utm_source=google&utm_medium=cpc&utm_term=<keyword/>&utm_content={creative}&utm_campaign=<campaign/>&cpao=111&cpca=<campaign/>&cpag=<group/>&kw=<mpl/>";
$ar = array();
parse_str($str, $ar);
print_r($ar);
Output:
Array
(
[/?source] => SEARCH
[utm_source] => google
[utm_medium] => cpc
[utm_term] => <keyword/>
[utm_content] => {creative}
[utm_campaign] => <campaign/>
[cpao] => 111
[cpca] => <campaign/>
[cpag] => <group/>
[kw] => <mpl/>
)
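Note that the first key in the output above is "/?source" because the leading "/?" was passed to parse_str along with the rest. A possible variation, sketched here with a shortened copy of the question's string, first isolates the query part with parse_url and then assigns the variables the question asked for.

```php
<?php
// Shortened copy of the campaign string from the question.
$str = "/?source=SEARCH%20&utm_source=google&utm_medium=cpc&utm_term=<keyword/>&utm_content={creative}";

// Keep only what follows the "?" so the first key is "source", not "/?source".
$query = parse_url($str, PHP_URL_QUERY);

$params = [];
parse_str((string) $query, $params);

// Fall back to null when a parameter is missing from the string.
$utm_source = $params['utm_source'] ?? null;
$utm_medium = $params['utm_medium'] ?? null;
$utm_term   = $params['utm_term'] ?? null;

echo $utm_source, PHP_EOL; // google
echo $utm_medium, PHP_EOL; // cpc
echo $utm_term, PHP_EOL;   // <keyword/>
```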

How to pass querystring to testAction in CakePHP 1.2?

In CakePHP putting a querystring in the url doesn't cause it to be automatically parsed and split like it normally is when the controller is directly invoked.
For example:
$this->testAction('/testing/post?company=utCompany', array('return' => 'vars')) ;
will result in:
[url] => /testing/post?company=utCompany
While invoking the url directly via the web browser results in:
[url] => Array
(
[url] => testing/post
[company] => utCompany
)
Without editing the CakePHP source, is there some way to have the querystring split when running unit tests?
I have what is either a hack (i.e. may not work for future CakePHP releases) or an undocumented feature.
If the second testAction parameter includes a named array called 'url', then the values will be placed in the $this->params object in the controller. This gives us the same net result as when the controller is directly invoked.
$data = array ('company' => 'utCompany') ;
$result = $this->testAction('/testing/post', array
(
'return' => 'vars',
'method' => 'get',
'url' => $data)
) ;
I'm satisfied with this method for what I need to do. I'll open the question to the community shortly so that a better answer can be provided in the future.
None of these answers will work in Cake 1.3. You should instead set the following before your testAction call:
$this->__savedGetData['company'] = 'utcompany';
CakePHP does provide some level of URL splitting, but it only seems to work in the run-time configuration and not the test configuration. I'll ask the CakePHP team whether this is intentional.
My suggestion for your querystring parser would be to use the PHP function explode.
I believe you can do something like this:
$result = explode('&', $queryString);
which would give you your key-value pairs in separate array slots, upon which you can iterate and perform a second explode like so:
$keyPair = explode('=', $result[$n], 2);
However, all this being said it would be better to peek under the hood of CakePHP and see what they are doing.
What I typed above won't correctly handle situations where your querystring contains html escaped characters (prefixed with &), nor will it handle hex encoded url strings.
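A small sketch of the explode approach that also covers the percent-encoded case mentioned above by running urldecode on each key and value. The function name split_query is hypothetical and this is not how CakePHP does it internally; PHP's built-in parse_str does the same job and is usually the simpler choice.

```php
<?php
// Hypothetical helper: split a raw query string into a key => value array.
function split_query(string $queryString): array
{
    $params = [];

    foreach (explode('&', $queryString) as $pair) {
        if ($pair === '') {
            continue; // skip empty segments such as a trailing "&"
        }

        // Limit of 2 keeps any further "=" characters inside the value.
        $parts = explode('=', $pair, 2);
        $key   = urldecode($parts[0]);
        $value = urldecode($parts[1] ?? '');

        $params[$key] = $value;
    }

    return $params;
}

print_r(split_query('company=utCompany&note=a%20b%3Dc'));
```

The second pair decodes to the value "a b=c", which shows why the decoding has to happen after the split rather than before it.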
Use $_GET['parmname'];
