This may be a stupid question, but is it possible to capture what a user typed into a Google search box, so that it can then be used to generate a dynamic landing page on my Web site?
For example, let's say someone searches Google for "hot dog", and my site comes up as one of the search result links. If the user clicks the link that directs them to my Web site, is it possible for me to somehow know or capture the "hot dog" text from the Google search box, so that I can call a script that searches my local database for content related to hot dogs, and then display that? It seems totally impossible to me, but I don't really know. Thanks.
I'd do it like this:
$referringPage = parse_url( $_SERVER['HTTP_REFERER'] );
if ( stristr( $referringPage['host'], 'google.' ) )
{
    parse_str( $referringPage['query'], $queryVars );
    echo $queryVars['q']; // This is the search term used
}
This is an old question and the answer has changed since it was originally asked and answered. As of October 2011, Google encrypts this referral information for anyone who is logged into a Google account: http://googleblog.blogspot.com/2011/10/making-search-more-secure.html
For users not logged into Google, the search keywords are still found in the referral URL and the answers above still apply. However, for authenticated Google users, there is no way for a website to see their search keywords.
However, by creating dedicated landing pages it might still be possible to make an intelligent guess. (Visitors to the "Dignified charcoal sketches of Jabba the Hutt" page are probably...well, insane.)
Yes, it is possible. See the HTTP Referer header; it will contain the URL of the Google search results page.
When a user clicks a link on a Google search results page, the browser will make a request to your site with this kind of HTTP header:
Referer: http://www.google.fi/search?hl=en&q=http+header+referer&btnG=Google-search&meta=&aq=f&oq=
Just parse the URL from the request header; the search term the user typed will be in the q parameter. The search term in the example above is "http header referer".
The same approach usually works for other search engines too; they just have a different kind of URL in the Referer header.
This answer shows how to implement this in PHP.
The Referer header is only available with HTTP 1.1, but that covers just about any somewhat modern browser. A browser may also forge the Referer header, or the header might be missing altogether, so do not base any serious decisions on it.
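For illustration, a minimal sketch of that parsing in PHP (the variable names are my own, and the guards just cover a missing header or query string):
$search = null;
if (isset($_SERVER['HTTP_REFERER'])) {
    // Referer may be absent or forged, so everything below is best-effort
    $parts = parse_url($_SERVER['HTTP_REFERER']);
    if (isset($parts['host'], $parts['query']) && stristr($parts['host'], 'google.')) {
        parse_str($parts['query'], $vars); // url-decodes the parameters
        if (isset($vars['q'])) {
            $search = $vars['q']; // "http header referer" in the example above
        }
    }
}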
This is an old question, but I found out that Google no longer gives out the query term: by default it redirects every user to https, which does not expose the q parameter. Unless someone manually enters the Google URL with http (http://google.com) and then searches, there is no way as of now to get the q parameter.
Yes, it comes in the URL:
http://www.google.com/search?hl=es&q=hot+dog&lr=&aq=f&oq=
Here is an example. Google sends many visitors to your site; if you want to get the keywords they used to reach it, maybe to impress them by displaying the keywords back on the page, or just to store them in a database, here's the PHP code I use:
// take the referer
$thereferer = strtolower($_SERVER['HTTP_REFERER']);

// see if it comes from google
if (strpos($thereferer, "google")) {
    // remove everything before "q="
    $a = substr($thereferer, strpos($thereferer, "q="));
    // remove the "q=" itself
    $a = substr($a, 2);
    // remove everything from the next "&" onwards
    if (strpos($a, "&")) {
        $a = substr($a, 0, strpos($a, "&"));
    }
    // we have the result
    $mygooglekeyword = urldecode($a);
}
and we can use <?= $mygooglekeyword ?> when we want to output the keyword.
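One caveat worth adding: if you echo the keyword back into your page, escape it first, since the Referer header is attacker-controlled. A small sketch, reusing $mygooglekeyword from the snippet above:
// Escape before output so a crafted referrer can't inject HTML/JS
$keyword = isset($mygooglekeyword) ? $mygooglekeyword : '';
echo htmlspecialchars($keyword, ENT_QUOTES, 'UTF-8');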
You can grab the referring URL and pull the search term from its query string. The term will appear in the query as q=searchTerm, where searchTerm is the text you want.
Same thing, but with some error handling:
<?php
if (!empty($_SERVER['HTTP_REFERER'])) {
    $referringPage = parse_url($_SERVER['HTTP_REFERER']);
    if (isset($referringPage['host'], $referringPage['query'])
            && stristr($referringPage['host'], 'google.')) {
        // parse_str() already url-decodes, so "+" becomes a space here
        parse_str($referringPage['query'], $queryVars);
        $google = isset($queryVars['q']) ? $queryVars['q'] : false;
    } else {
        $google = false;
    }
} else {
    $google = false;
}

if ($google) {
    echo "You searched for " . $google . " at Google then came here!";
} else {
    echo "You didn't come here from Google";
}
?>
Sorry, a little more.
Adds support for Bing and Yahoo:
<?php
if (!empty($_SERVER['HTTP_REFERER'])) {
    $referringPage = parse_url($_SERVER['HTTP_REFERER']);
    if (isset($referringPage['host'], $referringPage['query'])) {
        parse_str($referringPage['query'], $queryVars); // already url-decodes "+" to space
        if (stristr($referringPage['host'], 'google.')
                || stristr($referringPage['host'], 'bing.')) {
            // Google and Bing both put the search term in "q"
            $search = isset($queryVars['q']) ? $queryVars['q'] : false;
        } elseif (stristr($referringPage['host'], 'yahoo.')) {
            // Yahoo uses "p" instead
            $search = isset($queryVars['p']) ? $queryVars['p'] : false;
        } else {
            $search = false;
        }
    } else {
        $search = false;
    }
} else {
    $search = false;
}

if ($search) {
    echo "You're in the right place for " . $search;
}
?>
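The same idea scales a bit better with a lookup table mapping a fragment of the referring host to that engine's query parameter. A sketch under the same assumptions as above (only these three engines, and only their classic parameter names):
// host fragment => name of the search-term parameter
$engines = array(
    'google.' => 'q',
    'bing.'   => 'q',
    'yahoo.'  => 'p',
);

$search = false;
if (!empty($_SERVER['HTTP_REFERER'])) {
    $ref = parse_url($_SERVER['HTTP_REFERER']);
    if (isset($ref['host'], $ref['query'])) {
        parse_str($ref['query'], $vars);
        foreach ($engines as $host => $param) {
            if (stristr($ref['host'], $host) && isset($vars[$param])) {
                $search = $vars[$param];
                break;
            }
        }
    }
}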
I'm struggling to make an AJAX-based website SEO-friendly. As recommended in tutorials on the web, I've added "pretty" href attributes to links, e.g. <a href="#!site=contact">contact</a>, and, in a div where content is loaded with AJAX by default, a PHP script for crawlers:
$files = glob('./pages/*.php');
foreach ($files as &$file) {
    $file = substr($file, 8, -4); // strip the "./pages/" prefix and ".php" suffix
}

if (isset($_GET['site'])) {
    if (in_array($_GET['site'], $files)) {
        include("./pages/" . $_GET['site'] . ".php");
    }
}
I have a feeling that I additionally need to cut the _escaped_fragment_= part from (...)/index.php?_escaped_fragment_=site=about, because otherwise the script won't be able to GET the site value from the URL. Am I right?
But anyway, how do I know that the crawler transforms pretty links (those with #!) into ugly links (containing ?_escaped_fragment_=)? I've been told that it happens automatically and that I don't need to provide this mapping, but Fetch as Googlebot doesn't give me any information about what happens to the URL.
Googlebot will automatically query for ?_escaped_fragment_= URLs.
So from www.example.com/index.php#!site=about
Googlebot will query: www.example.com/index.php?_escaped_fragment_=site=about
On the PHP side you will get it as $_GET['_escaped_fragment_'] = "site=about"
If you want to get the value of the "site" you need to do something like this:
if (isset($_GET['_escaped_fragment_'])) {
    // "_escaped_fragment_" arrives as "site=about"; take the part after "="
    $escaped = explode("=", $_GET['_escaped_fragment_']);
    if (isset($escaped[1]) && in_array($escaped[1], $files)) {
        include("./pages/" . $escaped[1] . ".php");
    }
}
Take a look at the documentation:
https://developers.google.com/webmasters/ajax-crawling/docs/specification
I am building a website where I need to retrieve Facebook shares and likes of numerous links and URLs from different sites.
The problem is, for some URLs it is impossible to get what I want. For example, when I look for data about links like http://www.example.com/?a=1&b=2&c=3, all I get is wrong data about http://www.example.com/?a=1; the rest of the URL (&b=2&c=3) is simply ignored by Facebook.
Here at StackOverflow, a lot of people are looking for an answer and many questions are simply unanswered. So, once I got it right, I came back to tell how I did it.
P.S.: This works only for non-Facebook URLs. If you're looking for the shares and likes counts of an internal Facebook link (image, video ...), this won't work for you. I'll be using PHP to answer the question.
To get the likes and shares counts, I use FQL instead of the Graph API (even though I actually use the Graph API endpoint to send the query).
But this is not enough: to make it work, I had to call the rawurlencode() function on the URL I want data about; otherwise, I kept getting errors.
So here is likes(), the function I am using to get the counts:
function likes($url) {
    $url = rawurlencode($url);
    $json_string = file_get_contents("http://graph.facebook.com/fql?format=json&q=SELECT%20share_count,%20like_count%20FROM%20link_stat%20WHERE%20url='$url'");
    $json = json_decode($json_string, true);
    if (array_key_exists("data", $json)) {
        if (is_array($json["data"])) {
            if (array_key_exists("0", $json["data"])) {
                return intval($json["data"]["0"]["share_count"])
                     + intval($json["data"]["0"]["like_count"]);
            } else {
                echo "Error : '0' is no key<br/>";
                return 0;
            }
        } else {
            echo "Error : data is no table <br/>";
            return 0;
        }
    } else {
        echo "Error : No data key <br/>";
        return 0;
    }
}
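A quick usage example (example.com is just a placeholder URL):
// prints the combined share + like count, or 0 on error
echo likes("http://www.example.com/?a=1&b=2&c=3");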
I hope this will help someone someday :)
As recently as two days ago, the following code worked to get the search query from google:
$refer = parse_url($_SERVER['HTTP_REFERER']);
$host = parse_url($_SERVER['HTTP_REFERER'], PHP_URL_HOST);
$query = parse_url($_SERVER['HTTP_REFERER'], PHP_URL_QUERY);

if (strstr($host, 'www.google.com')) {
    // do google stuff
    $qstart = strpos($query, 'q=') + 2;
    $qend = strpos($query, '&', $qstart);
    $qlength = $qend - $qstart;
    $querystring = substr($query, $qstart, $qlength);
    $querystring = str_replace('q=', '', $querystring);
    $keywords = explode('%20', $querystring);
    $keywords = implode(' ', $keywords);
    return $keywords;
}
However, now it does not. I tested it by echoing $query, and it appears that the way Google passes the referrer query has changed. Previously $query included:
"q=term1%20term2%20term3%20...
Now, however, I am getting the following output when $query is echoed:
sa=t&rct=j&q=&esrc=s&source=web&cd=2&ved=0CCsQFjAB&url=http%3A%2F%2Fexample.com%2F&ei=vDA-UNnxHuOjyAHlloGYCA&usg=AFQjCNEvzNXHULR0OvoPMPSWxIlB9-fmpg&sig2=iPinsBaCFuhCLGFf0JHAsQ
Is there a way to get around this?
Sorry to say, but it's a global Google policy change.
See this link:
http://googlewebmastercentral.blogspot.ru/2012/03/upcoming-changes-in-googles-http.html
This happens when the user is signed in to a Google account.
You can try it yourself: if your Google search URL starts with https://, it means Google will hide some search parameters for the sake of privacy.
I too ran into the same issue this week. I'm not sure if this is still relevant to you, but what I found was that Google initiated SSL (Secure Sockets Layer) search for users who were signed in about a year ago, and it looks like SSL search may now be applied to all Google search queries. When I tested this, I was not signed in to Google and was using Firefox and still got the encrypted referring query.
This article has some helpful background and some ideas for working without specific search term data: http://searchenginewatch.com/article/2227114/5-Tips-for-Handling-Not-Provided-Data
Google initiated SSL for all searches and the information is only available via Google Analytics.
However, for paid campaigns, search engines like Google, Bing and Yahoo tag landing URLs with query string parameters such as the utm_ parameters, and you can access the search query from the utm_term parameter.
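On the landing page that might look something like this (a sketch; it assumes the campaign URL was actually tagged with utm_term):
// Read the paid-search keyword from the landing page URL, if present
$paidKeyword = isset($_GET['utm_term']) ? $_GET['utm_term'] : null;
if ($paidKeyword !== null) {
    // escape before displaying it back to the visitor
    echo htmlspecialchars($paidKeyword, ENT_QUOTES, 'UTF-8');
}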
I want to use an idea that I have seen on another website where I enter a "keyword", press Enter, and it then takes the client to a specific page or website.
I have seen something like this on http://qldgov.remserv.com.au: on the right side there is a field called "My Employer"; type in "health", for example, and you will be provided with relevant content.
Essentially I have client-branded mini sites, and we want to assign a "keyword" to each client brand so that all of their employees can go to their site by entering this one keyword, without needing individual logins. I want to be able to link to a URL that I can define in some manner.
I have looked at the source code of the site mentioned above and see they are using a form, but I am not sure how they have assigned the keywords, or if it's even possible to do this without a database or anything like that. I'm trying to keep it as simple as possible, as I am not a PHP/Java expert by any means.
Any help would be appreciated, even if it's not code but just an idea of the direction I need to go in to make this work. Thanks in advance!! :-)
The easiest way in my eyes would be to define an array that contains all of the keywords and their respective URLs client-side (in JS). For example:
var array = { 'health' : '/health.php', 'sport' : '/swimming.php' };
You would then get the user input on onSubmit and, if it exists, modify window.location appropriately.
if (array[user_input] !== undefined) {
    window.location = array[user_input];
} else {
    alert('not found');
}
If the user supplies health they will be redirected to /health.php; if they supply sport they will be redirected to /swimming.php (JSFiddle). Alternatively you can use server-side code (PHP, Java) to handle the request, but this may not be worth the effort.
Good luck.
By using PHP (rather than JavaScript), you're not relying on JavaScript, and it's SEO-friendly.
Firstly, you're going to need either some sort of database or a list of keywords/URLs:
$keywords = array('keyword1' => 'path/to/load.php', 'another keyword' => 'another/path');
Then you'll need a basic form:
<form action="loadkeyword.php">
    <input name="query">
    <button type="submit">Go</button>
</form>
Then, in loadkeyword.php:
$keywords = array('keyword1' => 'path/to/load.php', 'another keyword' => 'another/path');
$query = isset($_GET['query']) ? $_GET['query'] : '';

if (isset($keywords[$query])) {
    $url = $keywords[$query];
    header("HTTP/1.0 301 Moved Permanently");
    header('Location: ' . $url);
    exit;
} else {
    header("HTTP/1.1 404 Not Found");
    die('unable to locate keyword');
}
If you have a large list of keywords, I would suggest using a database instead of an array to keep track of your keywords.
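A sketch of that database variant using PDO (the DSN, the credentials and the keywords(keyword, url) table are all assumptions):
// Hypothetical schema: keywords(keyword VARCHAR PRIMARY KEY, url VARCHAR)
$pdo = new PDO('mysql:host=localhost;dbname=mysite', 'user', 'pass');
$stmt = $pdo->prepare('SELECT url FROM keywords WHERE keyword = ?');
$stmt->execute(array(isset($_GET['query']) ? $_GET['query'] : ''));
$url = $stmt->fetchColumn(); // false if the keyword is unknown

if ($url !== false) {
    header("HTTP/1.0 301 Moved Permanently");
    header('Location: ' . $url);
    exit;
} else {
    header("HTTP/1.1 404 Not Found");
    die('unable to locate keyword');
}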
The site you link to is doing it server-side, either via a keyword-list that matches content or a search function (I suspect the latter).
There are a few different ways you could achieve your goal, all of them to do with matching keywords to content and then redirecting, either with an array, a list, or a database - the principle will be the same.
However, I would respectfully suggest this may not be the best solution anyway. My reasoning is that (based upon the example you give) you're effectively making your users guess which keyword matches which minisite (even if you have many keywords for each site). Why not just have some kind of menu to choose from (i.e. a selector with a list of minisites)?
Hi, I just want your opinions about this code I found on a website for telling real search spiders from spammers. Is it good? And do you have any recommendations for other scripts or methods on this subject?
<?php
$ua = $_SERVER['HTTP_USER_AGENT'];
$spiders = array('msnbot', 'googlebot', 'yahoo');
// patterns in the same order as $spiders: MSN, Google, Yahoo
$pattern = array("/search\.live\.com$/", "/\.google(bot)?\.com$/", "/\.yahoo\.com$/");

for ($i = 0; $i < count($spiders) and $i < count($pattern); $i++) {
    if (stristr($ua, $spiders[$i])) {
        // it's claiming to be MSN's bot, Google's bot or Yahoo's bot
        $ip = $_SERVER['REMOTE_ADDR'];
        $hostname = gethostbyaddr($ip);
        if (!preg_match($pattern[$i], $hostname)) {
            // the hostname does not belong to live.com, google.com or yahoo.com.
            // Remember the UA already said it is one of their bots.
            // So it's a spammer.
            echo "spammer";
            exit;
        } else {
            // Now we have a hit that half-passes the check. One last go:
            $real_ip = gethostbyname($hostname);
            if ($ip != $real_ip) {
                // spammer!
                echo "Please leave now, spammer";
                break;
            } else {
                // real bot
            }
        }
    } else {
        echo "hello user";
    }
}
Note: I tested it with a user agent switcher and it worked perfectly, but I am not sure it will work in the real world. What do you think?
What would keep a spammer from simply giving an entirely correct user agent string?
I think this is fairly pointless. You would have to at least compare IP ranges (or their name servers) as well in order to get reliable results. This is possible for Google:
Google Webmaster Central: How to verify Googlebot
but even if you test for Google and Bing this way, a spambot can enter your site simply by giving a browser user-agent. Therefore, it is ultimately impossible to detect a spam-bot. They are a reality, and there is no good way to keep them out of a web site.
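For what it's worth, the reverse-plus-forward DNS check that Google's article describes looks roughly like this in PHP (a sketch; it only verifies a client that claims to be Googlebot):
// Reverse-resolve the IP, check the crawler domain, then make sure the
// hostname forward-resolves back to the same IP (per Google's article)
function isRealGooglebot($ip) {
    $host = gethostbyaddr($ip); // returns the IP itself (or false) on failure
    if ($host === false || !preg_match('/\.(googlebot|google)\.com$/i', $host)) {
        return false;
    }
    return gethostbyname($host) === $ip;
}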
You can also use .htaccess to block unwanted bots, as in this tutorial:
http://perishablepress.com/press/2007/06/28/ultimate-htaccess-blacklist/