I am working on a new script. Normally, when somebody searches for something on my website, it goes here:
http://domain.com/index.php?q=apples
but I want it to redirect to:
http://apples.domain.com
I have made this work perfectly in PHP as well as .htaccess, but the problem I am having is using the original keyword afterwards on the new subdomain page.
Right now I can use parse_url to get the keyword out of the URL, but my script also filters out potentially problematic characters:
public function sanitise($v, $separator = '-')
{
    // Replace each run of characters that is not a word character or a
    // hyphen with the separator, then trim stray separators from both ends.
    return trim(
        preg_replace('#[^\w\-]+#', $separator, $v),
        $separator
    );
}
So if somebody searches for netbook v1.2
The new subdomain would be:
http://netbook-v1-2.domain.com
Now I can take the keyword out, but it has the dashes in it and is no longer the original. I am looking for a way to send the original keyword over with the 301 redirect as well.
Thanks!
You can either just replace the hyphens with spaces when they visit the new subdomain or, since you're on the same top-level domain, you can cookie the keyword when redirecting them:
setcookie('clientkeyword', 'netbook-v1-2.domain.com:netbook v1.2', 0, '/', '.domain.com');
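On the subdomain you can then read the original keyword back out of the cookie. A minimal sketch, assuming the host:keyword value format used above:
// Recover the original keyword on the subdomain page.
// Assumes the cookie value is "<subdomain-host>:<original keyword>" as set above.
if (isset($_COOKIE['clientkeyword'])) {
    list($host, $keyword) = explode(':', $_COOKIE['clientkeyword'], 2);
    // $keyword now holds the original "netbook v1.2"
}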
Look at this answer: https://stackoverflow.com/a/358334/992437
And see if you can use the POST or GET data that's already there. If so, that might be your best bet.
I have a form that asks a user to submit a website, and on a different page I run a MySQL query to retrieve that website and turn it into a link by doing this in PHP (v5.6):
$link = '<a href="' . $school_website . '">' . $school_website . '</a>';
The problem is that when I try to click this link, instead of sending me to www.google.com for example, it directs me to www.mydomain.com/www.google.com.
I originally fixed it by using substr to see if the first part was "www" and, if it was, adding "http://" to it. That worked, but now I realize that not all websites will start with that.
I have tried googling this problem, but I'm not quite sure how to word it, so I am not getting anything.
My apologies if this is a duplicate. I did not see anything here that fits my problem, so any help would be greatly appreciated. Thanks.
You could always check whether it has http(s):// with a regex and, if it doesn't, add http:// so the link works as it should. Or make it ugly but simple.
The simplest way is to remove any protocol and prepend // to make the link protocol-relative, so it adopts the current page's protocol. Even if the value didn't have http(s):// in the first place, it would work as it should.
$school_website = '//' . str_ireplace(['https://', 'http://'], '', $school_website);
Example:
https://google.com becomes //google.com
google.com becomes //google.com
www.google.com becomes //www.google.com
In any of the above cases the result is treated as an absolute (protocol-relative) URL.
A better but longer way would be to validate the URL, for example with a regex or PHP's URL filter.
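A minimal sketch of that validation; note that FILTER_VALIDATE_URL only accepts URLs that already carry a scheme, so one is prepended first:
// Prepend a scheme if the submitted value is missing one, then validate.
if (!preg_match('#^https?://#i', $school_website)) {
    $school_website = 'http://' . $school_website;
}
if (filter_var($school_website, FILTER_VALIDATE_URL) === false) {
    // Reject or flag the invalid URL here.
}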
Until you add http:// or https:// in front of the URL, it remains relative. If you are on www.mydomain.com and your href attribute value is www.google.com, the browser treats that value as a relative path and resolves it against your current URL.
You need http:// or https:// at the beginning of the URL for external links. In fact, that should be part of your variable $school_website, and if that value is retrieved from a database, it has to be changed in the database.
My url is www.mysite.com/properties/property-1-someplace
The /property-1-someplace/ is dynamically generated.
I'm writing an if statement that checks whether the URL is properties/property-1-someplace and then executes code, but property-1-someplace is generated by WordPress and thus is constantly changing.
How can I target pages that are in the properties directory when I don't know the rest of the URL in advance?
I can use the PHP variable $pagename, but that does not address the properties part of the URL.
If I could do <?php if (is_page( 'property/*.*' ) ): that would be perfect.
Any ideas?
EDIT:
Sorry, I misunderstood your question; here is what you are probably looking for. Note that you may have to change the regular expression pattern to /^properties/ depending on the value of $pagename. If $pagename contains the whole URL (e.g. with the domain name), then you will need to include the domain name in the pattern.
if( preg_match( '/^\/properties/', $pagename ) ) {
    // Do your stuff here.
}
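If $pagename turns out not to contain the path at all, a sketch of the same check against the raw request path (assuming the site runs at the domain root):
// Match any request whose path starts with /properties/.
if (preg_match('#^/properties/#', $_SERVER['REQUEST_URI'])) {
    // Runs for every page under the properties directory.
}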
I want to generate an absolute URL with a specific scheme (https) in a Symfony2 controller. All the solutions I found point me to configure the targeted route so that it requires that scheme. But I need the route to remain accessible in http, so I can't set it to require https (in which case http requests are redirected to the corresponding https URL).
Is there a way to generate a URL with, in the scope of that URL generation only, a specific scheme?
I saw that using the 'network' keyword generates a "network path"-style URL, which looks like "//example.com/dir/file"; so maybe I can simply do
'https:' . $this->generateUrl($routeName, $parameters, 'network')
But I don't know if this will be robust enough to any route or request context.
UPDATE: after investigation in the URL generation code, this "network path" workaround seems fully robust. A network path is generated exactly as an absolute URL, without a scheme before the "//".
The best way to go with this (UrlGeneratorInterface here is Symfony\Component\Routing\Generator\UrlGeneratorInterface):
$url = 'https:' . $this->generateUrl($routeName, $parameters, UrlGeneratorInterface::NETWORK_PATH);
According to the code and documentation, you currently cannot do that within the generateUrl method. So your "hackish" solution is still the best, but as @RaymondNijland commented, you are better off with str_replace:
$url = str_replace('http:', 'https:', $this->generateUrl($routeName, $parameters));
If you want to make sure it's changed only at the beginning of the string, you can write:
$url = preg_replace('/^http:/', 'https:', $this->generateUrl($routeName, $parameters));
No, the colon (:) has no special meaning in the regex so you don't have to escape it.
With the default UrlGenerator, I don't think that is possible if you don't want to mess with strings.
You could make your own HttpsUrlGenerator extending UrlGenerator, introducing one slight change:
Within method generate(), instead of:
return $this->doGenerate(
    $compiledRoute->getVariables(),
    $route->getDefaults(),
    $route->getRequirements(),
    $compiledRoute->getTokens(),
    $parameters,
    $name,
    $referenceType,
    $compiledRoute->getHostTokens(),
    $route->getSchemes()
);
You could do:
return $this->doGenerate(
    $compiledRoute->getVariables(),
    $route->getDefaults(),
    $route->getRequirements(),
    $compiledRoute->getTokens(),
    $parameters,
    $name,
    $referenceType,
    $compiledRoute->getHostTokens(),
    ['https']
);
As you can see, $route->getSchemes() gets passed into doGenerate() based on the route settings (the tutorial link you provided above).
You could even go further, externalize this scheme array, and supply it via __construct.
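A sketch of what that could look like; the class and property names are chosen here, and the parent constructor signature assumes Symfony 2's UrlGenerator:
use Psr\Log\LoggerInterface;
use Symfony\Component\Routing\Generator\UrlGenerator;
use Symfony\Component\Routing\RequestContext;
use Symfony\Component\Routing\RouteCollection;

class HttpsUrlGenerator extends UrlGenerator
{
    /** @var array Schemes forced onto every generated URL. */
    private $forcedSchemes;

    public function __construct(RouteCollection $routes, RequestContext $context, array $forcedSchemes = array('https'), LoggerInterface $logger = null)
    {
        parent::__construct($routes, $context, $logger);
        $this->forcedSchemes = $forcedSchemes;
    }

    // generate() would then pass $this->forcedSchemes to doGenerate()
    // in place of $route->getSchemes(), as shown above.
}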
Hope this helps a bit ;)
So I have a custom made site that uses this type of input:
example.com/?id=4e2dc982
Or this would also work:
example.com/index.php?id=4e2dc982
But now I've started seeing hits in my log from GoogleBot trying to retrieve this for some reason:
example.com/index.php/?id=4e2dc982
The worst thing is that this actually works: it pulls the page with the right GET parameter, but because of the extra '/' all the links and references break. When the page tries to load "image.jpg", instead of loading the proper "example.com/image.jpg" it tries to load "example.com/index.php/image.jpg". How do I best fix this? I know I could go back and make every link an absolute path, but that's silly. The link with an extra '/' shouldn't work in the first place.
Update:
I found the fix, but still don't know why this is even allowed. I went to:
http://ca1.php.net/manual-lookup.php?pattern=test
And tried to see if the following was possible, and sure enough it works:
http://ca1.php.net/manual-lookup.php/?pattern=test
But their page doesn't break. So I looked at it and found out why:
<base href="http://ca1.php.net/manual-lookup.php" />
So basically, ANY PHP script seems to accept an extra /, but if you didn't code all your links as absolute paths, or use a base tag, your site will break whenever someone adds an extra '/'.
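That extra path segment reaches PHP as PATH_INFO (when the server is configured to populate it). A minimal sketch of rejecting such requests outright, if you'd rather they not work at all:
// Refuse any request that appends extra path segments after index.php.
if (!empty($_SERVER['PATH_INFO'])) {
    header('HTTP/1.1 404 Not Found');
    exit;
}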
It must be linked from somewhere, and you need to figure out where. You can use a Google site search (i.e. site:yoursite.com) to try to find it.
One suggestion for now is to use a canonical tag:
http://googlewebmastercentral.blogspot.com.au/2009/02/specify-your-canonical.html
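For example, emitting the canonical URL from the page itself (a sketch using the id parameter from the question):
// Point crawlers at the canonical form of the URL, without the extra "/".
$id = isset($_GET['id']) ? $_GET['id'] : '';
echo '<link rel="canonical" href="http://example.com/?id=' . htmlspecialchars($id) . '" />';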
One thing you could do is inspect the user-agent header (although some clients don't send it). If it contains anything bot-like, refuse to serve the page to the bot; otherwise handle the request as normal.
Below is an example:
$browser = $_SERVER['HTTP_USER_AGENT'];
checkbrowser($browser); // Calls checkbrowser() with the user-agent string.

function checkbrowser($analyze) {
    // Words that identify a crawler in the user-agent string.
    $searchwords = array("bot", "google", "crawler");
    $matches = array();
    $matchFound = preg_match_all(
        "/\b(" . implode("|", $searchwords) . ")\b/i",
        $analyze,
        $matches
    );
    if ($matchFound) {
        $words = array_unique($matches[0]);
        foreach ($words as $word) {
            // The regex match is case-insensitive, so normalise before comparing.
            if (strtolower($word) == "bot") {
                echo "Sorry, bots are not allowed to crawl this specific page.";
                die(); // Terminate the script so the bot cannot crawl further.
            }
        }
    }
}
This is how I often do it, though I use this method for different things. You can modify the function by changing $searchwords to whatever fits you best.
I'm using PHP and I have the following code to convert an absolute path to a URL.
function make_url($path, $secure = false){
    return (!$secure ? 'http://' : 'https://') . str_replace($_SERVER['DOCUMENT_ROOT'], $_SERVER['HTTP_HOST'], $path);
}
My question is basically, is there a better way to do this in terms of security / reliability that is portable between locations and servers?
The HTTP_HOST variable is not a reliable or secure value, as it is sent by the client. So be sure to validate its value before using it.
I don't think security is going to be affected, simply because this is a URL being printed to a browser... the worst that can happen is exposing the full directory path to the file, and potentially creating a broken link.
As a little side note, if this is being printed in an HTML document, I presume you are passing the output through something like htmlentities, just in case the input $path contains something like a <script> tag (XSS).
To make this a little more reliable, though, I wouldn't recommend matching on 'DOCUMENT_ROOT', as sometimes it's either not set or won't match (e.g. when Apache rewrite rules get in the way).
If I were to re-write it, I would simply ensure that 'HTTP_HOST' is always printed...
function make_url($path, $secure = false){
    return (!$secure ? 'http://' : 'https://') . $_SERVER['HTTP_HOST'] . str_replace($_SERVER['DOCUMENT_ROOT'], '', $path);
}
... and if possible, update the calling code so that it just passes the path, so I don't even need to consider removing the 'DOCUMENT_ROOT' (i.e. what happens if the path does not match the 'DOCUMENT_ROOT'?)...
function make_url($path, $secure = false){
    return (!$secure ? 'http://' : 'https://') . $_SERVER['HTTP_HOST'] . $path;
}
Which does leave the question... why have this function?
On my websites, I simply have a variable defined at the beginning of script execution which sets:
$GLOBALS['webDomain'] = 'http://' . (isset($_SERVER['HTTP_HOST']) ? $_SERVER['HTTP_HOST'] : '');
$GLOBALS['webDomainSSL'] = $GLOBALS['webDomain'];
I use GLOBALS so it can be accessed anywhere (e.g. in functions), but you may also want to consider making it a constant (define) if you know the value won't change (I sometimes change these values later in a site-wide configuration file, for example when I have an HTTPS/SSL certificate for the website).
I think this is the wrong approach.
URLs in HTML support relative locations. That is, you can use an href like "other-page.php" to refer to a page relative to the current page's URL, or "/some/path/page.php" to give a full path on the same website. These two tricks mean your website code doesn't really need to know where it is to produce working URLs.
That said, you might need some tricks so you can have one website on http://www.example.com/dev/site.php and another on http://www.example.com/testing/site.php. You'll need some code to figure out which directory prefix is in use, but you can use a configuration value for that, as sketched below. By which I mean a value that belongs to that (sub-)site's configuration, not the version-controlled code!
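A minimal sketch of that configuration value (the constant name here is hypothetical):
// Per-site configuration, kept out of version control.
define('BASE_PATH', '/dev'); // '' on the live site, '/testing' on the test site

// Links are then built against the prefix rather than hard-coded.
echo '<a href="' . BASE_PATH . '/site.php">site</a>';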