Best solution to anti-spam in PHP?

How to distinguish robots from normal user?
How does SO do this job?
Currently I'm dealing with a robot that posts once every hour...

Try Akismet as your first line of defense. Bad Behaviour is efficient too, perhaps too efficient, as I had issues with false positives. Akismet, on the other hand, serves me well. Then, if necessary, add other layers that don't impede the user experience (like using form fields that should remain empty), and then, if you really have to, other techniques involving a Turing test of some sort (CAPTCHA being the worst in terms of user friendliness: try simple questions instead). Here is a good read.

There are multiple approaches to this problem and it's a good idea to use several that overlap. One of those that SO uses is reCAPTCHA

Make sure you've got a valid email address for anyone who posts (as per S.O.) and implement a CAPTCHA on registration and whenever you think someone might be behaving oddly. Keep a well-trained copy of SpamAssassin around and feed the posts through that.
C.

* QUICK, IMPLEMENT EASILY TO EXISTING FORM, SIMPLE INTERACTION FOR USERS *
http://www.codegravity.com/projects/mathguard

Related

Securing a php contact form

I have made a simple PHP contact form following this tutorial:
http://www.catswhocode.com/blog/how-to-create-a-built-in-contact-form-for-your-wordpress-theme
The big problem is that this form processing is not safe, I have heard people can use it to send spam and/or hack my server.
What are the basic steps needed to make this form more secure?
Ps: I don't want to use re-captcha if it can be avoided...
Edit: I need suggestions on which PHP functions to use to filter the input and make sure the form is submitted "the right way" and is not altered and/or used to hack my site or to send email to other people (using the site to send spam). Do I just need to use stripslashes()? Or is there a better way?

One way: If you're not a huge site, it's not likely anyone is going to figure this out/take the time to.
You could use some tricky JS to handle tokens on click. Your server issues token IDs to the clickable/focusable elements on the page during the backend render phase and logs them in a database or data file. Then, when users click around and submit, you can compare the IDs sent via the onclick() handler against the ones you issued. You could also apply some heuristics to determine whether the history of clicks is reasonably paced: even if someone scripted the hijacking of the token IDs and auto-submitted, you could check whether the time between click events looks automated (for example, too fast to be a human). Signed up for a Twitter account lately? They use passive human detection that, while not 100% foolproof, is slower and more difficult to break. Somebody would REALLY have to want to hack/spam your site.
Important Step 2: strip out/URL-encode strange characters if you think they will break your page. Common ones that break things are " and ' and :
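For the contact form in the question, the "strange characters" step could look roughly like the sketch below. It is only an illustration: the helper names are hypothetical and are not taken from the linked tutorial. The assumptions are that the visitor's address ends up in a mail header (so line breaks must be rejected) and that user text is echoed back into an HTML page (so it must be escaped).

```php
<?php
// Hypothetical helpers; none of these names come from the linked tutorial.

/**
 * Returns true when a value is safe to place in a mail header,
 * i.e. it contains no CR or LF that could start an extra header line.
 */
function is_header_safe(string $value): bool
{
    return preg_match('/[\r\n]/', $value) === 0;
}

/**
 * Validates the visitor's e-mail address.
 * Returns the address, or null when it cannot be used.
 */
function clean_sender_email(string $email): ?string
{
    $email = trim($email);
    if (!is_header_safe($email)) {
        return null;
    }
    $valid = filter_var($email, FILTER_VALIDATE_EMAIL);
    return $valid === false ? null : $valid;
}

/**
 * Escapes user text before it is echoed back into an HTML page,
 * so that characters such as " and ' cannot break the markup.
 */
function escape_for_html(string $text): string
{
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}
```

Note that stripslashes() only removes backslashes; it does not do either of these jobs.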
Another Way: http://areyouahuman.com/
As long as you are using encrypted methods, verifying humanity without crappy CAPTCHA is possible. I mean, don't ignore your headers either. These are complementary ways.
The key is to have enough complexity to make for an NP-Complete problem. http://en.wikipedia.org/wiki/NP-complete
When the day comes when AI can solve multiple complex Human problems on their own, we will have other things to worry about than request tampering.
http://louisville.academia.edu/RomanYampolskiy/Papers/1467394/AI-Complete_AI-Hard_or_AI-Easy_Classification_of_Problems_in_Artificial
Another company doing interesting research is http://www.vouchsafe.com/play-games. They actually use games designed to trick the RTT into training the RTT how to be more solvable by only humans!
Here's a great article on NP-Hard problems. I can see a huge possibility here: http://www.i-programmer.info/news/112-theory/3896-classic-nintendo-games-are-np-hard.html

How to avoid repeated post submissions?

I have a form that allows users to paste input, like on StackOverflow. But if users know the format I send to the server, they can keep sending requests to me. How can I ensure it is a real user sending a request instead of some kind of machine attack to insert information?

There are a load of ways to do this and employing a large variety of different things is a good idea to protect against spam.
What Stack Overflow does (from my experience) is this: if there is an abnormal amount of posting, or the posts are very short, or something else is a bit suspicious, they show a CAPTCHA.
You can monitor this by using cookies; for instance, the time between posts is a good indicator of whether someone is spamming. Similarly, if the messages posted are all about the same length, or include the same URL/link, you can also display a CAPTCHA to test whether the user is a human.
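The time-between-posts check could be sketched like this. The function name and the 30-second threshold are assumptions chosen for illustration, and the timestamps are passed in as arguments so the logic can be tried without a real request.

```php
<?php
/**
 * Decides whether a post arrives suspiciously soon after the previous one.
 *
 * $lastPostTime: Unix timestamp of the previous post, or null if there was none.
 * $now:          current Unix timestamp.
 * $minInterval:  assumed minimum number of seconds between two posts.
 */
function needs_captcha(?int $lastPostTime, int $now, int $minInterval = 30): bool
{
    if ($lastPostTime === null) {
        return false; // first post we know about
    }
    return ($now - $lastPostTime) < $minInterval;
}
```

In a real handler the previous timestamp would probably live in $_SESSION (which is itself tracked by a cookie) rather than in a plain cookie, since a bot can edit or drop its own cookies.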

You can use a captcha. That's probably the most common approach.

There are different techniques allowing you to achieve this with more or less success. Using a Captcha is one popular way used by many sites.

building a basic human test with php

I'm making a rudimentary "human test" for a form on my website.
I want to take the current date (to the minute, not second), and combine that with the user's REMOTE_ADDR, then from that generate a string (perhaps use md5?) and limit that to 6 characters.
This code will then be presented to the user, who is instructed to copy it into a particular text box; upon submission it will be verified and, if correct, the form will be processed.
I don't know if there is an easier way to do this, but this is something I think will work for me and be a quick fix. Any suggestions?

"I don't know if there is an easier way to do this, but this is something I think will work for me and be a quick fix. Any suggestions?"
If you just need a quick fix, try for something simpler. I had a very popular website with a notoriously effective Turing Test:
Check this box if you're a human: [ ]
This little fix brought my spam count down from tens of thousands of messages every day to 1 or 2 every few months. Of course, once the bots wised up, I had to make my test much more difficult:
What's the sound a cat makes? (Rhymes with 'cow') [________________]
Never had any more problems after that. YMMV.

I would say the simplest solution would be to use a honeypot.
Basically, create a hidden field called Name or something of that sort, and then check whether the field has data upon submission. If it does, you know it is a bot! Since it is hidden, humans will not be able to populate that field; only bots will!
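A minimal sketch of the server-side honeypot check, assuming the hidden field is named "website" (any tempting name would do, as long as it does not clash with a real field). The function name is made up for this example.

```php
<?php
/**
 * Returns true when the hidden honeypot field was filled in.
 *
 * $post:  the submitted form data, normally $_POST.
 * $field: name of the hidden field; "website" is an assumption.
 */
function honeypot_triggered(array $post, string $field = 'website'): bool
{
    $value = $post[$field] ?? '';
    if (!is_string($value)) {
        return true; // an array or other odd value is not something a browser would send here
    }
    return trim($value) !== '';
}
```

Usage would be something like `if (honeypot_triggered($_POST)) { /* drop the message */ }`, with the field itself hidden from humans in the form's HTML/CSS.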

Generate an MD5 from any source (inc. totally random). Put it on the screen and store it in the session. Check it. Voila.
Using a captcha library is, obv., much more secure though. There are plenty of very good and very easy-to-install ones about.
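The "generate, store in the session, check" idea might be sketched as follows. The function names and the 'human_code' session key are hypothetical, and the session is passed in as an array so the same code works with $_SESSION or with a plain array. As the answer says, this is a quick fix and not a substitute for a proper captcha library, since the code is shown as plain text.

```php
<?php
/**
 * Creates a 6-character code and remembers it in the given session array.
 */
function issue_human_code(array &$session): string
{
    // md5 is used here only to turn random bytes into hex text.
    $code = substr(md5(random_bytes(16)), 0, 6);
    $session['human_code'] = $code;
    return $code;
}

/**
 * Checks the submitted code and discards the stored one,
 * so each code can be used for a single submission only.
 */
function check_human_code(array &$session, string $submitted): bool
{
    $expected = $session['human_code'] ?? null;
    unset($session['human_code']);
    return is_string($expected)
        && hash_equals($expected, strtolower(trim($submitted)));
}
```

With real sessions this would be called as `issue_human_code($_SESSION)` when rendering the form and `check_human_code($_SESSION, $_POST['code'] ?? '')` when handling it, after session_start().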

Posting to website using captcha

Currently I'm wondering if there is a way to post to a website that uses a captcha as a human check. The following question is asked (of course, this is done with random numbers):
Type this number in digits; 'twohundredandfive': [ input ]
The form is sent using AJAX. So when reloading the website the number to be typed changes.
A way to get past this is to read and convert the number, then post some data, but by the second request the number has already changed (which is good). But IS there a way to avoid this?
Don't think I'm using this with bad intentions; the described form is used in one of my applications. It is just a check to make sure bots can't get past.
Thanks so far :-)

A CAPTCHA should test whether the entity solving it is human. To my eyes, the problem you are setting looks like it would be fairly trivial to solve algorithmically.
Given that a human can pass the test, then it's certainly possible to write an automated bot which will pass it too. As to whether there is a "back door" which allows access without solving the CAPTCHA, only you can decide that by analysing your source code.

I hate CAPTCHAs. More often than not, they are unreadable to humans as well :)
I heard one Microsoft researcher offer the following scheme: put 4 pictures up, 3 of little puppies, one with a kitten. Ask the user to click the kitten. With a large enough sample base, you can create a random picture/question any time the page refreshes. No one will bother developing an algorithm to analyze photos to that degree.
read this post for another interesting idea.

Converting strings to numbers has already been discussed in another question where many references to the google calculator were given, which does a great job in such conversions, so your approach is not suitable for testing whether your user is human.
As for an alternate solution, I can only link to another great answer.

How do I protect my forum against spam?

I have a forum on a website I administer, which gets a daily dose of porn spam. Currently I delete the spam and block the IP. But this does not work very well. The list of blocked IPs is growing quickly, but so is the number of spam posts in the forum.
The forum is entirely my own code. It is built in PHP and MySQL.
What are some concrete ways of stopping the spam?
Edit
The thing I forgot to mention is that the forum needs to be open for unregistered users to post. Kinda like a blog comment.

In a guestbook app I wrote, I implemented two features which prevent most of the spam:
Don't allow POST as the first request in a session
Require a valid HTTP Refer(r)er when posting
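Those two checks could be sketched as a single function. This is not the guestbook's actual code: the 'seen_get' session flag is a hypothetical name that the site would set whenever it serves a normal page, and "valid referer" is interpreted here as "a referer pointing at this site's own host", which is an assumption.

```php
<?php
/**
 * Applies the two guestbook checks to a request.
 *
 * $method:  request method, e.g. $_SERVER['REQUEST_METHOD'].
 * $session: session data, e.g. $_SESSION ('seen_get' is a hypothetical flag).
 * $referer: value of $_SERVER['HTTP_REFERER'], or null when it is absent.
 * $ownHost: host name of this site.
 */
function post_is_acceptable(string $method, array $session, ?string $referer, string $ownHost): bool
{
    if ($method !== 'POST') {
        return true; // only POST requests are restricted
    }
    if (empty($session['seen_get'])) {
        return false; // POST was the first request in this session
    }
    if ($referer === null) {
        return false; // no referer sent at all
    }
    $host = parse_url($referer, PHP_URL_HOST);
    return is_string($host) && strcasecmp($host, $ownHost) === 0;
}
```

Keep in mind that some browsers and privacy tools strip the Referer header, so the second check can reject a few legitimate visitors.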

One way that I know which works is to use JavaScript before submitting the form. For example, to change the method from GET to POST. ;) Spambots are lousy at executing JavaScript. Of course, this also means that non-Javascript people will not be able to use your site... if you care about them that is. ;) (Note: I don't)

In my experience, the best easy defenses come from just doing something "non-standard". If you make your site non-standard, this makes it so that any automated spam would have to be coded specifically for your site, which (no offense) probably isn't worth the effort. Note that if the spam is coming from human spammers, there's not really anything you can do that won't also stop legitimate posters. So the goal is to find a solution that will throw away any "standard" posts - that is, "fill out the whole form and push submit".
A couple examples that come to mind of things that you could try:
Have a hidden form field with a name that sounds like something a spammer would want to fill out, like "website" or "homepage" or something like that. If the form field gets filled out, throw away the message instead of posting it, because it was a bot automatically filling in the whole form, even invisible fields.
You don't have to use a "real" captcha, but even something simple like "Enter the following word backwards: <random backwards word>" or "What is the domain name of this website?". Easy for a human to do, but it would require a fairly complex bot to figure out what to fill in.
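The "enter the word backwards" idea might look like the sketch below. The word list, the function names and the 'challenge_answer' session key are all made up for illustration; strrev() works byte by byte, so the words are assumed to be plain ASCII.

```php
<?php
/**
 * Picks a random word, remembers it, and returns it reversed
 * so it can be shown to the visitor.
 */
function make_backwards_challenge(array $words, array &$session): string
{
    $word = $words[array_rand($words)];
    $session['challenge_answer'] = $word;
    return strrev($word);
}

/**
 * Checks the visitor's answer (the shown text typed backwards,
 * i.e. the original word) and discards the stored answer.
 */
function backwards_answer_ok(array &$session, string $submitted): bool
{
    $expected = $session['challenge_answer'] ?? null;
    unset($session['challenge_answer']);
    return is_string($expected)
        && strcasecmp($expected, trim($submitted)) === 0;
}
```

The form would print something like "Enter the following word backwards: egnaro" and the handler would call `backwards_answer_ok($_SESSION, $_POST['answer'] ?? '')`.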

You could use a captcha (there are some good scripts, like PHPCaptcha) or a spam control service like Akismet, which has a PHP API.

You might want to look at this question, which has several answers that describe how you could implement a non-intrusive captcha.
Another thing to consider is to require time between posts to prevent massive spamming.

Include a CAPTCHA that is always "orange".

The spams may be by bots or humans - bots are more likely.
To stop the bots, put in a hidden field populated by Javascript - there is a 99.5% chance that a standard, stupid bot that isn't customised to your site will fail to fill that in.
If they fail to fill it in correctly, give them a message that Javascript is required or something, and give them an opportunity to post some other way (e.g. with a captcha or registration). That way anonymous users who aren't spambots can (mostly) still post with no problems, and most spambots (which haven't been tailored for your specific site) won't.
Don't bother blacklisting IP addresses or using third party blacklists, that will just generate false positives. Almost all bots use the same IP addresses as (some) legitimate users.
Another trick is to put in a text field with a plausible sounding name, which is made difficult to see with CSS - anyone filling this field in with anything is considered to be a bot.

Advanced solutions:
Akismet
Defensio
Sblam! (open source clone of the above)
You can try your luck with a non-standard form:
fields that must stay empty hidden with CSS
fields with misleading names, e.g. <input name=email> for something that is not an e-mail.
For me CAPTCHA is like giving up to spammers and letting them damage your forum anyway – except that instead of spam damage, you get usability and accessibility damage.

Something I've found to be surprisingly effective: disallow comments that contain too many URLs (more than, say, 5). Since doing that, I've had zero comment spam.
Edit: Since writing the above, I've had recurring comment spam with only one link. I have now added some honeypot fields and have had no comment spam for a few months now.
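A rough sketch of the URL-count rule, with the limit of 5 taken from the answer. The function names and the regular expression are assumptions: the pattern counts anything starting with http://, https:// or www. as one URL.

```php
<?php
/**
 * Counts things that look like URLs in a piece of text.
 * "http://www.example.com" is counted once, not twice, because
 * the \S+ consumes the rest of the URL after the prefix matches.
 */
function count_urls(string $text): int
{
    return (int) preg_match_all('~\b(?:https?://|www\.)\S+~i', $text);
}

/**
 * Returns true when a comment contains more URLs than allowed.
 */
function has_too_many_urls(string $text, int $max = 5): bool
{
    return count_urls($text) > $max;
}
```

As the edit above notes, spam with a single link passes this test, so it works best combined with something like a honeypot field.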

Don't let anybody post until they respond to an email sent to their registered email address. You'll see lots of forums and mailing lists generate a unique email address or web url that is sent to the new user's given email address, and they have to respond to the email or click on the link to finalize their registration.

Captcha is definitely the easiest method - try KittenAuth if you want something bot-proof (Although I got pandas this time)
Kitten Auth

There is no single answer, since spam is really a matter of economics: how much is it worth to someone to put their stuff onto the web? There are, however, some solutions that seem pretty good:
reCAPTCHA
Use CSS to create an invisible field that robots fill in
Create a time-specific hidden field in your form so the robot can't use the same form over and over again.
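One way to build a time-specific hidden field is to sign a timestamp with a server-side secret, as in the sketch below. This is an assumed design, not something the answer spells out: the function names, the "timestamp:signature" format and the age limits (at least 3 seconds, at most one hour) are all illustrative.

```php
<?php
/**
 * Builds the value for a hidden form field: "<timestamp>:<signature>".
 * $secret is a key known only to the server.
 */
function make_form_stamp(string $secret, int $now): string
{
    return $now . ':' . hash_hmac('sha256', (string) $now, $secret);
}

/**
 * Verifies a submitted stamp: the signature must match and the form
 * must be neither too fresh (filled in instantly) nor too old (replayed).
 */
function form_stamp_ok(string $stamp, string $secret, int $now, int $minAge = 3, int $maxAge = 3600): bool
{
    $parts = explode(':', $stamp, 2);
    if (count($parts) !== 2 || !ctype_digit($parts[0])) {
        return false;
    }
    [$issued, $signature] = $parts;
    if (!hash_equals(hash_hmac('sha256', $issued, $secret), $signature)) {
        return false; // field was altered or forged
    }
    $age = $now - (int) $issued;
    return $age >= $minAge && $age <= $maxAge;
}
```

A stamp can still be reused until it expires; stopping that as well would mean recording used stamps on the server.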

I want to say that, most of the time, a CAPTCHA is enough to keep spammers out.
But do use a strong one, like http://www.captcha.net/.
Remember that spammers do not want to spend much time dealing with a particular site (except heavy-traffic sites); they use a tool to post ads on a lot of sites. So make your form a little unusual (e.g. show the user an image that says '1.5+2.4=?' and ask for the answer); this will block most of the spam tools :)

The easiest thing I've done to stop spammers with (so far) 100% consistency is to validate the text that was submitted. If you use the PHP function strstr() to check for "a href" or even a non-clickable http or www, you can then just reroute the spammer elsewhere. I actually have a script that then writes to my .htaccess file to deny the offending IP address. Not sure if there's any other kind of spam to be concerned about, but links are all I've seen so far.
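The text check might be sketched like this. It uses stripos() instead of strstr() so that "HTTP" and "A HREF" are caught too; that swap, and the function name, are my own choices. The .htaccess part is left out.

```php
<?php
/**
 * Returns true when the submitted text contains anything that
 * looks like a link: "a href", "http" or "www" (case-insensitive).
 */
function looks_like_link_spam(string $text): bool
{
    foreach (['a href', 'http', 'www'] as $needle) {
        // stripos() returns 0 for a match at the start, so compare with !== false.
        if (stripos($text, $needle) !== false) {
            return true;
        }
    }
    return false;
}
```

This also rejects legitimate posts that merely mention a link, so it only suits forms where links are never expected.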
