Increase in 404 errors where "%5C%22" features in google results - php

I hope someone here can help me.
I have started getting alerts from Google about an increasing number of 404s.
Every one of these has the string "%5C%22" in the URL rather than the ASCII characters it encodes.
This issue comes and goes every few months. It's a WordPress site, with only premium plugins.
The best/nearest answer I have found is here:
It seems that Google is looking in code that is not designed for it to look at. Indeed, Stack Overflow lists a similar issue: Ajax used for image loading causes 404 errors.
But there appears to be no real cause identified.
For example:
https://rapidbi.com/swotanalysistemplates/%5C%22/ is listed in Google, but when I go to the page that Google says contains this link (https://rapidbi.com/swotanalysistemplates/), there is no such link.
Sometimes the %5C%22 is in the middle of a URL as well as at the end. So the theory that it's an escaped \" sequence in the code makes sense, but how do we solve this?
Could it be that Google is reading the PHP instructions?
Should this be an issue that Google's coders fix rather than us poor webmasters?
Or is there a server-side solution to this?
I have hundreds of these errors, increasing daily!
Should I just ignore these Google-reported errors?
Should I mark them as fixed (they are not, as they never existed in the first place)?
Is there a fix? It's a WordPress-based site; should I be changing robots.txt to block something? If so, what?
Do we know of any plugins that might create this issue?
Thank you in advance
Mike

First, from Google itself: 404s don't affect the ranking of your website.
But of course we want to fix this kind of error.
Second, in Google Webmaster Tools you can see where Google saw/crawled this link. I suggest that you check where Google picked up this URL and look in your code for anything that adds /%5C%22/ to the URL.
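To see why these URLs look the way they do, it helps to decode the string. "%5C%22" is the percent-encoding of the two characters \" (backslash, double quote), which is exactly what a URL looks like when it is embedded inside a JSON string or inline JavaScript and a crawler extracts the escaped form literally. A quick sketch (in Python, just for illustration):

```python
from urllib.parse import unquote

# %5C decodes to a backslash, %22 decodes to a double quote,
# so "%5C%22" is the escape sequence \" from a JSON/JS string.
print(unquote("%5C%22"))  # -> \"

# The reported 404 URL therefore decodes to a path ending in \"/ --
# a JavaScript-escaped quote that was never a real link on the page.
bad_url = "https://rapidbi.com/swotanalysistemplates/%5C%22/"
print(unquote(bad_url))
```

This is why grepping the theme/plugin output for URLs printed inside JavaScript string literals is a good place to start looking.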

Related

What is "&ct=ga"? - Help an SEO determine if URLs containing parameter needs to be redirected

I am not a developer, just an SEO working with developers.
We are seeing a lot of URLs containing "&ct=ga", all of which generate a 404 status code. I want to clear these out, but before I redirect, does anyone know what it is?
I did a bit of research and it may be connected to some kind of feed (third-party or Google News). I didn't know if it could be related to Google Analytics (which we use on the site).
So my question is:
What is this?
Is it safe to do a 301 redirect on URLs with this parameter back to the original URL?
Thank you!
Your developers can probably figure out more based on the context. A grep through the PHP code base for $_REQUEST['ct'] or $_GET['ct'] should hopefully find something that they can latch onto. Alternatively, if the code is based on a framework, there may be a framework-specific way of getting parameters in the code, in which case they'll want to search for that syntax instead.
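If the parameter turns out to be harmless tracking noise, the 301 target is usually just the same URL with the parameter removed. A minimal sketch of that computation (in Python rather than PHP, purely for illustration; the function name strip_param is hypothetical):

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def strip_param(url, name="ct"):
    """Remove one query parameter and rebuild the URL, e.g. to
    compute the 301 target for /page?ct=ga&x=1 -> /page?x=1."""
    parts = urlsplit(url)
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != name]
    return urlunsplit(parts._replace(query=urlencode(query)))

print(strip_param("https://example.com/page?ct=ga&x=1"))
# -> https://example.com/page?x=1
```

The same logic can be expressed in a rewrite rule or in the framework's routing layer once the developers know where the parameter comes from.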

How to detect fake URLs in PHP

I'm working on a script for indexing and downloading a whole website from a user-submitted URL.
For example, when a user submits a domain like http://example.com, I copy all the links on the index page, download the pages those links point to, and repeat from the start.
I do this part with cURL and regular expressions to download pages and extract the links.
However, some websites generate endless fake URLs: for example, http://example.com?page=12 has links to http://example.com?page=12&id=10, http://example.com?page=13, and so on.
This creates a loop and the script can't finish downloading the site.
Is there any way to detect these kinds of pages?
P.S.: I think Google, Yahoo, and other search engines face this kind of problem too, but their databases are clean and their search results don't show this kind of data.
Some pages may use GET variables and be perfectly valid (as you've mentioned here, ?page=12 and ?page=13 may both be acceptable). So what I believe you're actually looking for is a unique page.
It's not possible, however, to detect these straight from their URL: ?page=12 may point to exactly the same thing as ?page=12&id=1, or it may not. The only way to tell is to download the page, compare the download to pages you've already got, and thereby find out whether it really is one you haven't seen yet. If you have seen it before, don't crawl its links.
Minor side note here: Make sure you block websites from a different domain, otherwise you may accidentally start crawling the whole web :)
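The download-and-compare idea above can be sketched with a content hash: fingerprint each downloaded body and skip any URL whose body you have already seen. A minimal illustration (in Python rather than PHP; the helper name is_new_page is hypothetical, and real crawlers normalize the body before hashing to ignore timestamps and session tokens):

```python
import hashlib

seen_hashes = set()

def is_new_page(body: bytes) -> bool:
    """Return True the first time a given page body is seen.
    Hashing the downloaded content (not the URL) catches cases
    where ?page=12 and ?page=12&id=10 serve identical pages."""
    digest = hashlib.sha256(body).hexdigest()
    if digest in seen_hashes:
        return False
    seen_hashes.add(digest)
    return True

print(is_new_page(b"<html>page 12</html>"))  # True, first sighting
print(is_new_page(b"<html>page 12</html>"))  # False, duplicate content
```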

Website optimization and random javascript errors in google chrome console

I've got a really big problem with my website: http://ap.v11.pl/sklep/
It loads really slowly and I don't know how to fix that.
I'm getting some weird errors in the Chrome console: http://scr.hu/0an/xq5bz
These errors are random; for example, I get an error that something can't be found, but the resource exists and the paths are good.
My .htaccess:
http://pastebin.com/ewZZBLFg
The page is running on Zend Framework 2.
Thank you for any advice.
My hypothesis is:
you are running Ghostery as a Chrome plugin, or something similar, so your browser blocks a couple of your scripts, like the adstat thing and Google Analytics;
your web server has a problem sending the correct MIME type for the JavaScript files. Check out this posting on the "resource interpreted as a ..." error message.
It may be that only one frontend is not working correctly. This would explain why you only get the errors some of the time.
In general your site is packed with scripts and images. The first page makes more than 250 requests and loads almost 4 MB. That's a lot, and it takes time. Amazon's front page makes half that number of requests and loads something like 300 KB.
You should check whether you can reduce the number of requests; the YSlow plugin may give you some good advice here. Can you reduce the image sizes and the number of images (CSS sprites)?
You should also check whether you have to deliver all the images through your regular web server or whether you can use a lightweight alternative. Are you using NGINX? AFAIK it has good options for performance tuning.
Edit: As a starting point: http://gtmetrix.com/reports/www.ap.v11.pl/fBGKScZ6

Can a piece of PHP code that blocks old browsers from accessing a website block search engine spiders?

I was looking for a way to block old browsers from accessing the contents of a page, because the page isn't compatible with old browsers like IE 6.0, and to return a message saying that the browser is outdated and that an upgrade is needed to see the page.
I know a bit of PHP, and writing a little script that serves this purpose isn't hard. I was just about to start when a huge question popped into my mind:
If I write a PHP script that blocks browsers based on their name and version, is it possible that this may block some search engine spiders?
I was thinking about doing the browser identification via this function: http://php.net/manual/en/function.get-browser.php
A crawler will probably be identified as a crawler, but is it possible that a crawler supplies some kind of browser name and version?
If nobody has tested this before, I will probably not risk it, or I will make a test folder inside a website to see if the pages there get indexed; if not, I'll abandon the idea or modify it until it works. But to save myself the trouble I figured it would be best to ask around, especially since I didn't find this information after a lot of searching.
No, it shouldn't affect any of the major crawlers. get_browser() relies on the User-Agent string sent with the request, so this shouldn't be a problem for crawlers, which use custom user-agent strings (e.g., Google's spiders identify themselves with "Googlebot" in the string).
Now, I personally think it's a bit unfriendly to completely block a website to someone with IE. I'd just put a red banner above saying "Site might not function correctly. Please update your browser or get a new one" or something to that effect.
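To make the answer concrete, the usual approach is to check the User-Agent string for known crawler tokens before applying any browser-version gate. A minimal sketch (in Python rather than PHP, just to show the idea; the token list is illustrative, not exhaustive):

```python
# Known-crawler substrings to look for in the User-Agent header.
# Real deployments should use a maintained list or verify by
# reverse DNS, since user agents can be spoofed.
CRAWLER_TOKENS = ("googlebot", "bingbot", "slurp", "duckduckbot")

def is_crawler(user_agent: str) -> bool:
    """Return True if the User-Agent string matches a known crawler."""
    ua = user_agent.lower()
    return any(token in ua for token in CRAWLER_TOKENS)

print(is_crawler(
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
))  # True
print(is_crawler(
    "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
))  # False -- this one would hit the old-browser message
```

Only requests that fail this check would then go through the browser-version test, so spiders are never shown the "upgrade your browser" page.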

Immediate update of site-map index in google, shouldn't that be impossible?

Check out http://www.blocket.se/goteborg?ca=15
Look at the first ad.
Copy the headline and paste it into Google, write 'blocket' after the headline, and search.
You will see it finds the ad right away.
How come?
Do Google's crawlers really update their index that fast?
Or is it just because you entered that search string that google quickly updates its index and returns the results?
Thanks
I will tag this php, mysql, etc., because usually you guys know these kinds of things!
It's not immediate. Google does crawl some sites a lot more often than other sites, but it's never immediate.
