How to resolve multi-language content for social bots? - php

I have implemented multiple languages on my web site.
When a user visits a page for the first time, a mechanism detects their country and language.
Then the page reloads and the detected data is added to the current session.
The problem is that when I try to share links on social networks I get empty data, because a social bot has no session mechanism, so the web server returns an empty page without text.
How can I resolve this issue, or should I perhaps change something in the architecture?
Please share good practice on this.

Don't use sessions for this. If you want to share the page URL with the right language, make sure this language is part of the URL. A simple way to do this is to use a query string. Your URL could look like this: http://www.example.com/page?language=dutch
In PHP you can use $_GET to read which language to use:
<?php
// Whitelist the supported languages so arbitrary input is never used
$supported_languages = array( 'english', 'dutch' );

if ( isset( $_GET['language'] ) && in_array( $_GET['language'], $supported_languages, true ) ) {
    $language = $_GET['language']; // Display page in the specified language
} else {
    $language = 'english';         // Display page in the default language
}

Related

php menu and multilanguage page

I am developing a multi-language website with PHP. I use a session to store language change requests from users.
<?php
session_start();

// Whitelist of supported language codes
$check_lang = array("eng", "suo", "sve");

// Store a requested language in the session, but only if it is whitelisted
if (isset($_GET["lang"]) && in_array($_GET["lang"], $check_lang)) {
    $_SESSION["lang"] = $_GET["lang"];
}

// Fall back to English when no language has been chosen yet
// (empty() avoids an undefined-index notice on the first visit)
if (empty($_SESSION["lang"])) {
    $_SESSION["lang"] = "eng";
}

include("Lang/" . $_SESSION["lang"] . "/" . $_SESSION["lang"] . ".php");
?>
This works pretty well. But the problem is that once the user navigates to a different page (say, "About" in English) and clicks a link to change the language to Swedish, the page redirects to "Home" in Swedish.
I would like to know how to record which page the user is currently on and reload that particular page on a language change request.
-thanks in advance.
You can use $_SERVER["HTTP_REFERER"] to see which page people are coming from. Visitors can forge this header, so make sure you validate it so people don't get redirected to other domains, etc.
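A minimal sketch of that idea, assuming the language links point at a switcher script such as lang.php?lang=sve (the script name and the fallback URL are illustrative):

<?php
session_start();
$check_lang = array("eng", "suo", "sve");

if (isset($_GET["lang"]) && in_array($_GET["lang"], $check_lang)) {
    $_SESSION["lang"] = $_GET["lang"];
}

// Send the user back to the page they came from, but only when the
// referrer points at our own host; otherwise fall back to the home page.
$target = "/index.php";
if (isset($_SERVER["HTTP_REFERER"])) {
    $ref = parse_url($_SERVER["HTTP_REFERER"]);
    if (isset($ref["host"], $ref["path"]) && $ref["host"] === $_SERVER["HTTP_HOST"]) {
        $target = $ref["path"] . (isset($ref["query"]) ? "?" . $ref["query"] : "");
    }
}
header("Location: " . $target);
exit;
?>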

How to handle current language? Always in URL, or session?

I'm planning to add a language feature to my site. I can see two ways:
storing the language in the URL, so always www.mysite.com/en/introduce, www.mysite.com/en/home, or if the first parameter is missing, just using the default. It's good for bookmarks, but very hard to apply to all available links
storing it in the session. Much easier, but users may get confused by not seeing the language in the URL.
I would say: session. What would you say? Any experiences?
If you want all your pages to be indexed by search engines, you'll have to put the language parameter in the URL.
If you're producing something more like Facebook, where a user needs to be logged in to receive content in his personalized language, use sessions.
I would use the first method together with a URL rewrite engine.
For example, when using the RewriteEngine for Apache you could add this line to your .htaccess:
RewriteRule ^([a-zA-Z][a-zA-Z])/([a-zA-Z]*)$ content.php?culture=$1&content=$2
and even this can work:
RewriteRule ^([a-zA-Z][a-zA-Z])/([a-zA-Z]*)$ $2.php?culture=$1
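With either rule, a request such as /en/about is rewritten internally, so the PHP side only has to read the query string. A rough sketch of content.php under that assumption (the pages/ directory layout is made up):

<?php
// content.php - receives culture and content from the rewrite rule above
$culture = isset($_GET['culture']) ? strtolower($_GET['culture']) : 'en';
$content = isset($_GET['content']) ? $_GET['content'] : 'home';

// Whitelist both values before using them to build a file path
if (!preg_match('/^[a-z]{2}$/', $culture)) { $culture = 'en'; }
if (!preg_match('/^[a-zA-Z]+$/', $content)) { $content = 'home'; }

include "pages/" . $culture . "/" . $content . ".php";
?>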
You want to put your language as part of the URL, otherwise Google won't be able to index it for different countries. Also, they might think you serve two kinds of content on the same page.
I would store it in the session if only some parts of the content change, as that's easier to implement when you're just swapping, e.g., the company's contact details based on the country the user is coming from. But as a general rule, give each language a separate URL, either using .htaccess or your routing system.
Regular users don't look at the URL and change parameters there; normal users point and click. Keep the language selection somewhere visible on the page and also in the user settings. This is not something a user will want to change several times during a visit; it is a setting you can ask for and set on the first visit.
I currently hate the way Google does it using my IP, assuming (wrongly) that if I am entering from Norway I definitely speak Norwegian and can manage to find the English version through Norwegian menus. I do like the way Etsy.com does it: they ask on your first visit for your preferred language, currency and so on. If you accept them, good, but you can change them right there without having to navigate through a menu. In my opinion, go for cookies or a session instead of polluting the URL.

WordPress Referrer Tracking

I have a WordPress blog that's hosted within my site (http://www.mysite.com/blog) and my website itself is based on ASP.NET.
I'm tracking referrers within ASP.NET upon session start and storing them within a session variable to save into my database either after a session expires or after a visitor converts to a member.
How can I track the referrers for visitors that come to the blog first and click on a link to a page within the website? Is there a way in WordPress that I could pass the referrer using a query string?
When they land on the blog, drop a cookie with the original referrer.
Then read this cookie from the ASP.NET app.
Whip up a plugin to do something like this:
if ( !isset($_COOKIE['ref']) && isset($_SERVER['HTTP_REFERER']) ) {
    // Persist the first referrer for 30 days, site-wide
    // (the cookie name and lifetime here are just examples)
    setcookie('ref', $_SERVER['HTTP_REFERER'], time() + 30 * 24 * 60 * 60, '/');
}
If you feel like a hack you could just add this to WP's index.php, but a plugin would be the more portable, clean option.
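A minimal sketch of such a plugin, assuming the cookie name ref and WordPress's init hook (the plugin header and details are illustrative):

<?php
/*
Plugin Name: Referrer Cookie (sketch)
Description: Stores the first external referrer in a cookie for the main site to read.
*/

add_action('init', function () {
    // 'init' runs before output is sent, so setcookie() still works here
    if (!isset($_COOKIE['ref']) && isset($_SERVER['HTTP_REFERER'])) {
        setcookie('ref', $_SERVER['HTTP_REFERER'], time() + 30 * 24 * 60 * 60, '/');
    }
});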

How to hide a page url from bots/spiders?

On my website I have 1000 products, and they all have their own page, accessible via something like product.php?id=PRODUCT_ID.
On all of these pages I have a link whose URL is action.php?id=PRODUCT_ID&referer=CURRNT_PAGE_URL, so if I am visiting product.php?id=100 this URL becomes action.php?prod_id=100&referer=/product.php?id=100; clicking on this URL returns the user back to the referer.
Now, the problem I am facing is that I keep getting false hits from spiders. Is there any way I can avoid these false hits? I know I can "Disallow" this URL in robots.txt, but there are still bots that ignore it. What would you recommend?
Any ideas are welcome. Thanks
Currently, the easiest way of making a link inaccessible to 99% of robots (even those that choose to ignore robots.txt) is with JavaScript. Add some unobtrusive jQuery:
<script type="text/javascript">
$(document).ready(function() {
    // Copy data-href into href once the DOM is ready;
    // inside each(), 'this' refers to the current anchor
    $('a[data-href]').each(function() {
        $(this).attr('href', $(this).attr('data-href'));
    });
});
</script>
Then construct your links in the following fashion, carrying the target in data-href with no href for robots to follow:
<a data-href="action.php?prod_id=100&referer=/product.php?id=100">Click me!</a>
Because the href attribute is only written after the DOM is ready, robots won't find anything to follow.
Your problem consists of 2 separate issues:
multiple URLs lead to the same resource
crawlers don't respect robots.txt
The second issue is hard to tackle, read Detecting 'stealth' web-crawlers
The first one is easier.
You seem to need an option to let the user go back to the previous page.
I'm not sure why you do not let the browser's history take care of this (through the back button and JavaScript's history.back();), but there are enough valid reasons out there.
Why not use the referrer header?
Almost all common browsers send information about the referring page with every request. It can be spoofed, but for the majority of visitors this should be a working solution.
Why not use a cookie?
If you store the CURRNT_PAGE_URL in a cookie, you can still use a single unique URL for each page, and still dynamically create breadcrumbs and back links based on the referrer stored in the cookie, without depending on the HTTP referer value.
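A rough sketch of that cookie idea; the cookie name last_page is made up:

<?php
// product.php - remember the page the visitor is currently on
// (expiry 0 makes it a session cookie)
setcookie('last_page', $_SERVER['REQUEST_URI'], 0, '/');
?>

<?php
// action.php - send the visitor back without a referer query parameter
$back = isset($_COOKIE['last_page']) ? $_COOKIE['last_page'] : '/';
// Only accept site-relative paths, so a tampered cookie
// cannot redirect visitors to another domain
if (strpos($back, '/') !== 0) {
    $back = '/';
}
header('Location: ' . $back);
exit;
?>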
You can use the robots.txt file to stop compliant bots.
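For example (the path matches the action.php URL from the question):

User-agent: *
Disallow: /action.php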
The next thing you can do, once robots.txt is configured, is to check your server logs and find any user agents that seem suspicious.
Let's say you find evil_webspider_crawling_everywhere as a user agent; you can check for it in the request headers and deny the web spider access (sorry, no example, I haven't used PHP in a long time).
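To fill that gap, a sketch of such a user-agent check in PHP (evil_webspider_crawling_everywhere is the example name from above):

<?php
$ua = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';

// Deny access to a user agent identified as abusive in the server logs
if (stripos($ua, 'evil_webspider_crawling_everywhere') !== false) {
    header('HTTP/1.1 403 Forbidden');
    exit;
}
?>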
Another option is to use PHP to detect bots visiting your page.
You could use this PHP function to detect the bot (this gets most of them):
function bot_detected() {
    return (
        isset($_SERVER['HTTP_USER_AGENT'])
        && preg_match('/bot|crawl|slurp|spider|mediapartners/i', $_SERVER['HTTP_USER_AGENT'])
    );
}
And then echo the links only when you find that the user is not a bot:
if (bot_detected() === false) {
    echo '<a href="http://example.com/yourpage">your page</a>';
}
I don't believe you can stop user agents that don't obey your advice.
Before going down this route I would really want to ascertain that bots/spiders are actually a problem - doing anything that prevents natural navigation of your site should be seen as a last resort.
If you really want to stop spiders, you might consider using JavaScript in your links so that navigation only happens after the link is clicked. This should stop spiders.
Personally I'm not fussed about spiders or bots.

I'm not sure if I should use a redirect

I have an affiliate link on my webpage. When you click on the link it follows the href value which is as follows:
www.site_name.com/?refer=my_affiliate_id
This would be fine, except that the site offers no tracking for the ads, so I can't tell how many clicks I am getting. I could easily implement my own tracking by changing the original link href value to a php script which increments some click stats in a database and then redirects the user to the original page. Like so:
<?php
// Do database updating stuff here (increment the click counter)
header("Location: http://www.site_name.com/?refer=my_affiliate_id");
exit;
?>
But I have read some articles saying that using redirects may be seen by Google as a sign of 'blackhat' techniques, and they might rank me lower, de-index my site or even hurt the site I'm redirecting to.
Does anybody know if this is true, or have any idea of the best way I could go about this?
Many thanks in advance
Joe
You could always do what Google does with search results. They leave the link's href normal until the mousedown event, then swap it. Something to the effect of:
adlink.onmousedown = function(e) {
    var callingLink = this; // the anchor the handler is attached to
    callingLink.href = 'http://mysite.com/adtrack_redirect_page.ext?link=' +
        encodeURIComponent(callingLink.href);
};
Or something like that :P
So, Google will see a normal link, but almost all users will be redirected to your counter page.
Using a 301 redirect simply tells Google that the page has permanently moved. According to most random people on the internet, and according to Google itself, it should have no effect on your PageRank.
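In PHP that means sending the status code explicitly, since header('Location: ...') defaults to a 302 redirect (the target URL is the one from the question):

<?php
// Count the click here, then issue a permanent (301) redirect
header("Location: http://www.site_name.com/?refer=my_affiliate_id", true, 301);
exit;
?>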
Actually, I've read (can't remember where exactly) that this kind of redirect DOES HURT your rating. No, it won't "kill" your website or the referenced site, as far as I know (please do check further), but it will hurt your site's rating, as I said.
Anyway, I'd recommend using some JavaScript to refer to anything outside your domain - something like window.open(...) should do the trick, as Google will not follow this code.
From there, point to your tracking script, which then redirects further.
You could use a JavaScript onclick event to send an AJAX request to your server whenever the link is clicked. That way the outgoing link is still fully functional, and your server-side script can increment your counter to track the clickthrough.
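The server side of that idea can be tiny. A sketch, where count_click.php, the link_stats table and its columns are made-up names (the JavaScript would simply request this URL on click):

<?php
// count_click.php - called via AJAX each time the affiliate link is clicked
$pdo = new PDO('mysql:host=localhost;dbname=mysite', 'user', 'password');
$stmt = $pdo->prepare('UPDATE link_stats SET clicks = clicks + 1 WHERE link_id = ?');
$stmt->execute(array('affiliate_1'));
?>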
