Google bots get 503 errors while indexing my website - php

When Google (or any other robot) crawls my website, which is built with Laravel, it gets a 503 Service Unavailable error. I can visit the website myself without any problem or error: http://www.kurumyonetimsistemi.com
How can I fix this problem?
Edit:
The 503 error occurs because I redirect not-found pages to a custom page. If I remove this redirect, robots get a 500 Internal Server Error instead. I can still view the website in a browser without problems.

Well, after searching for a while, I found the problem. I was reading the visitor's browser language and storing it in a cookie.
This step causes errors for robots. Once I removed that code, the problem was solved.
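The answer above does not show the offending code. One plausible cause, assuming the language was read straight from the Accept-Language header, is that crawlers often do not send that header at all. A minimal sketch of a lookup that tolerates a missing header follows; the function name `detectLanguage`, the supported-language list, and the default are hypothetical, not taken from the original site.

```php
<?php
// Hypothetical helper: choose a language without assuming the header exists.
function detectLanguage(array $server, array $supported, string $default): string
{
    // Crawlers may omit Accept-Language, so fall back to an empty string.
    $header = $server['HTTP_ACCEPT_LANGUAGE'] ?? '';

    foreach (explode(',', $header) as $part) {
        // Reduce an entry such as "en-US;q=0.8" to its two-letter prefix "en".
        $code = strtolower(substr(trim($part), 0, 2));
        if (in_array($code, $supported, true)) {
            return $code;
        }
    }

    // No header, or nothing in it that the site supports.
    return $default;
}
```

It could be called as `detectLanguage($_SERVER, ['en', 'tr'], 'en')` before setting the cookie. The sketch returns the first supported entry in the order listed and ignores q-values, which is a simplification.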

Related

Website displays fine, but 503 error on page validation

I have a website running on Joomla that displays just fine, but when validated it returns a 503 server error. Since Joomla has so many PHP files, plugins, etc., and I don't know which one to fix, I was wondering if I could just do something in the .htaccess file to force all 503 errors to report 200 instead. It probably isn't the best way to do it, but I have no idea how to fix it otherwise. Does anyone know how to do this?
Thanks
If your site is in offline mode and you are logged in at the frontend, the site displays as normal for you. Others trying to access the site (in this case, the W3C validator) still get a 503 Service Temporarily Unavailable response. It is not possible to validate the website while it is in offline mode (and it is easy to forget that you are logged in).
To validate the site during development, you can set it to online mode and protect it with .htaccess authentication (http://www.htaccesstools.com/htaccess-authentication/); this will pop up a login box in the W3C validator.

/cache url requests causing 404 errors

We are developing a PHP web application. Lately our Apache logs show that Apache serves a 404 error page for a particular case. The logs indicate that the HTTP_REFERER is: http://ourhost.com?gclid=some_id. The REQUEST_URI is: /cache/some_other_id.
Our web application is built with symfony 1.4. It does not serve any pages beginning with /cache, so it serves a 404 page. It also does not serve any pages containing a link to /cache/some_other_id.
Why does Google (crawler) try to visit URLs beginning with /cache?
How should we handle these 404 errors?
It would seem it is this issue. Basically, some kind of browser extension making such requests... There is a suspicion of "Browser Companion Helper", part of "Ginyas Browser Companion" doing the requests.
I don't see much that can be done about it from the server side, except possibly advise a user that they have malware on their browser.

Strange issue with 404 error pages

I have a website set up that uses a custom 404 error page. This seems to be working on most pages.
In fact, I have two different error pages that I want to show, and now a third that I just found out about.
This page, which does not exist, shows the correct error page that should be shown if a page cannot be found. This shows the error page as configured in my .htaccess file:
http://www.canadiancommuter.com/wontfindthis.php
This error page is generated from my PHP code if someone tries to access an old article that no longer exists in the database:
http://www.canadiancommuter.com/2334054466-some+old+article.html
However, this link, which will also generate a 404 error, shows a different error page (which usually includes advertising):
http://www.canadiancommuter.com/2012062500-TTC+asks+Ministry+of+Labour+to+treat+CNE+like+Rolling+Stones+concert%2FCaribana.html
I know the reason WHY this URL doesn't work. I purposely added characters to it to cause it to return a 404 error. My problem is that I can't figure out WHERE this other 404 error page is coming from.
It's not in my .htaccess file, the error page from my .htaccess file can be seen in the first link above.
It's not in my code. The only error page generated by the code itself can be seen in the second link above.
The only other places it could come from are my domain registrar and my web host.
The domain is registered through one registrar, but points to my hosting account with another provider. The registrar says that because I'm just pointing the DNS for my domain to my web host, the error page wouldn't come from them, but would come from my web host.
My web host says this error page isn't coming from them, but must be in my code.
I've heavily modified all of the code used for this site, so I'm pretty confident that the error page is not coming from there.
Does anyone have any ideas where I should look for this error page?
(Just a note, I'm not certain the registrar or the web host were entirely sure of what they were talking about, so I haven't ruled out either of them as being the source of this page. However, a thorough look through the administrative consoles for both do not reveal anything to this effect.)
Your pages are being served through a proxy running cloudflare-nginx which could be catching some 404 errors because slashes in either / or %2F form cause a different 404 page to be served.
Do you have an .htaccess rule that catches all of the possible 404 errors and not just the ones that match your filename scheme? If not, try setting one up. You could also try to run the site in a local server instance and see if the 404 pages behave as expected.
Edited because I mistakenly took characters produced by Transfer-Encoding: chunked to be caused by misconfiguration.
If (as you have already determined) your code doesn't generate the error page, then the "mysterious" 404 page comes either from the default webserver configuration (which is presumably controlled by your hosting provider) or indirectly from your DNS service (if your webserver redirects your browser to an unregistered domain, for example, then you may be redirected to a page which invites you to buy it).
The most straightforward way IMO to track this down is by using a browser equipped with machinery for tracking redirects (e.g. Firefox with the Firebug extension installed). If the error pages are indeed coming from your domain (and not a misspelling of it), then that implicates the default webserver configuration (and so presumably your hosting provider).
EDIT:
Re-reading the above, I realize that I should clarify: your DNS service can't simply "redirect" you somewhere. If you find that typing a non-existent domain into your browser redirects you to a page with advertising, then you can be fairly sure that it's your network connectivity provider that is inspecting your HTTP request, doing a DNS lookup on your behalf behind the scenes, and redirecting you.
This is absolutely from your host provider!!
When %2F is given in url, nginx cannot handle that as an error (this might be a bug!) and it displays your host's default error page, you can see the same error on other websites hosted on the same server as your site:
http://aias-uic.org/not-found.html
http://halfdrawn.com/not-found.html
http://flyingmantis.com/not%2Ffound.html
...
and there are many other websites on the same server as your site! (you can check their IP to be sure)
The last one uses custom error page, so with %2F the mysterious error page is shown!
You can also disable your custom error pages for a while and you will probably get the mysterious error page!

501 Errors on Facebook IFRAME

I'm loading an iframe within a Facebook app, but I seem to be getting a 501 error response. I do not get this error response when navigating directly to the domain I am hosting on, e.g.
www.example.com/fbapp/
but when I go to
aps.facebook.com/fbapp
I get the error.
We have a valid SSL for our site and the issue from what I can tell is sporadic at best.
I would really just like to understand why it might be happening and any preventative measures I can take.
When your app is loaded from Facebook, the request from Facebook to your app is made via POST, so make sure your code can handle that. Also check your own server's error logs: that is where the 501 error is coming from, and only your own logs will be able to tell you what the issue is.
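As a rough illustration of accepting both request methods, here is a minimal sketch. The function name `resolveCanvasRequest` and the returned array shape are hypothetical. It assumes the app is loaded via POST with a `signed_request` field when opened inside Facebook and via GET when visited directly, and it does not verify the signed payload.

```php
<?php
// Hypothetical dispatcher: accept both POST (loaded inside Facebook)
// and GET (direct visit) instead of failing on an unexpected method.
function resolveCanvasRequest(string $method, array $post): array
{
    $method = strtoupper($method);

    // Reject anything other than the two methods the page is meant to serve.
    if ($method !== 'GET' && $method !== 'POST') {
        return ['status' => 405, 'signed_request' => null];
    }

    // Only a POST is expected to carry the signed_request field (assumption).
    $signed = $method === 'POST' ? ($post['signed_request'] ?? null) : null;

    return ['status' => 200, 'signed_request' => $signed];
}
```

In a page script it could be called as `resolveCanvasRequest($_SERVER['REQUEST_METHOD'] ?? 'GET', $_POST)`, with the page rendering the same content for either method.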

interesting facebook status code 500 error on linux server, PHP

index.html works in the Facebook debugger.
index.php does not work in the Facebook debugger.
The site is reachable in a browser, but Facebook cannot reach it. What is the problem?
Though it works in the browser, when I try to load your site from the command line using curl, it responds with a 500 Internal Server Error and no page. It seems that your site blows up whenever the client doesn't send the Accept-Language header. This header is optional, so your code should not depend on it being present.
Your page is returning a 500 error to Facebook's crawler.
Also when I check it manually I get the same problem, as Jeremy reported.
Do you have any logic in your PHP which checks the user agent header and does different things on different user agents?
The Facebook crawler presents as
facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)
