I'm looking to set up a whistleblowing/anonymous tip website, but I've run into some problems. The basic idea is that you navigate to a splash page, fill in a few fields (name and location optionally, and then the message), then fire it off. At that point the message gets sent to a specific email inbox so that our team can look at it.
I've done a bit of research and PHP seems like my best bet, but I would also like to be able to log IP addresses for every message (or, more ideally, append them to the email before it is sent) so that I can be sure I'm not getting trolled or spammed. Can anyone point me in the right direction with this? I'm kind of a PHP noob, but willing to learn.
Thanks!
The remote IP address is available in your PHP script through the superglobal $_SERVER['REMOTE_ADDR']. You can append that to your mail.
Just to mention: if you log the sender's IP address, you defeat the point of letting the sender stay ANONYMOUS. Once the IP is logged, the tip is no longer really anonymous.
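A minimal sketch of appending the address to the mail body might look like the following. The helper name buildTipBody and the address tips@example.com are made up for illustration; only $_SERVER['REMOTE_ADDR'] and mail() are PHP built-ins.

```php
<?php
// Hypothetical helper: returns the message with the sender's IP appended.
// $server is normally $_SERVER; it is passed in so the function is easy to test.
function buildTipBody(string $message, array $server): string
{
    // REMOTE_ADDR is filled in by the web server; it is missing on the CLI.
    $ip = $server['REMOTE_ADDR'] ?? 'unknown';

    return $message . "\n\n--\nSender IP: " . $ip;
}

// Usage inside the form handler (placeholder address, assumed field name "message"):
// mail('tips@example.com', 'New tip', buildTipBody($_POST['message'] ?? '', $_SERVER));
```

Keep in mind that behind a proxy or load balancer REMOTE_ADDR is the proxy's address, not the visitor's.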
Problem
Spambots usually operate from a network of (hacked!) computers, so blocking IP addresses usually does not work. I would also point out that a legitimate user who is unaware of malware on his PC would be locked out of your service because you are blocking his IP address. If IP blocking worked, CAPTCHAs would not be necessary and Google and Yahoo! would not use them, because as you most likely know these images are sometimes hard to read.
Solution
You should just have a good spam filter in place (Gmail's works very well) and use Akismet to detect spam messages instead. There are very decent libraries available, so you don't have to do any coding at all, and it is going to work a lot better than what you were about to implement.
Related
I am preparing a registration form for my website where people can register and I can track how many have registered so far. But I have run into a problem: some people register multiple times from the same device using different details. I would like to stop that cheating. Can anyone help me overcome this issue? I want to track their IP or device details and restrict multiple registrations.
You should look at the $_SERVER array: https://www.php.net/manual/en/reserved.variables.server.php
The thing that might interest you most in this case is $_SERVER['REMOTE_ADDR'].
Just remember that many PCs on the same network might share the same IP, so you can't block purely on the IP. Be careful not to block normal users by accident. You might want to set a cookie as well.
Obviously you won't be able to block 100% of multi-account cheaters if they know what they are doing, but you should be able to either catch most of them or force them to give up. Add things like not allowing multiple accounts on the same email, forcing users to solve a hard CAPTCHA, email confirmation links, etc. Each is often just a small hurdle, but it makes a multi-account cheater's life a little harder, and most of them will give up just because of that.
Sometimes it is a good idea to let them be for a while and just log the multi-accounts. Then block them all at once, so they won't know at the moment they create an account whether they managed to bypass your security.
Also check the other $_SERVER variables that you might find useful, like HTTP_USER_AGENT, which returns fairly specific information about the browser.
You can use the PHP superglobal $_SERVER to find their ip. Then use some validation on whether to accept the registration.
I have just added a CAPTCHA to a page to block spam, but we are getting spam as usual.
The website uses HTML, PHP, JavaScript and unsecured HTTP only, nothing else.
I am generating and comparing CAPTCHAs in PHP using an if statement. I am also adding both CAPTCHAs (generated and typed) to the mail as a comment for testing. Genuine mails arrive with both the generated and the typed CAPTCHA, but in the spam mails both CAPTCHAs are blank (the spammers are at work, hence the mystery and confusion).
I have checked all the files on the website and they are exactly as I uploaded them. I do not understand what the spammers have done, or how.
I need some guidance so I can start studying books and websites.
You can basically chalk that up to...
https://dynomapper.com/blog/514-online-captcha-solving-services-and-available-captcha-types
https://github.com/imagetyperz-api/imagetyperz-api-nodejs
https://github.com/bestcaptchasolver/bestcaptchasolver-php
The list goes on and on but I think you get the idea.
You're going to want to stack methods of spam prevention that don't annoy real users.
https://www.lifewire.com/solutions-to-protect-web-forms-from-spam-3467469
Good list there. My personal go-to has always been honeypots: just a hidden field that looks and feels like a real field, but if someone fills it out you immediately know it's a bot. Maybe throw it 9999px off the page to the right; a bot still sees it right under the other fields in your code.
Also, if you are bored and have a bit of time, here is another great way of banning most of the currently active botnets from your site. You could even set up a separate site on the same hosting provider just to harvest the IPs of all the active botnets trawling your hosting provider's IP range.
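To make the honeypot idea concrete, here is a rough sketch of the server-side check. The field name "website" and the function name isLikelyBot are assumptions for the example, not anything standard.

```php
<?php
// The form would contain a field that CSS pushes off-screen, for example:
//   <input type="text" name="website" style="position:absolute; left:-9999px"
//          tabindex="-1" autocomplete="off">
// Humans never see it, so it should always arrive empty.

// Hypothetical check: true when the hidden field was filled in.
function isLikelyBot(array $post, string $honeypotField = 'website'): bool
{
    $value = $post[$honeypotField] ?? '';

    // Anything other than an empty string (including an array) is suspicious.
    return !is_string($value) || trim($value) !== '';
}

// Usage in the form handler:
// if (isLikelyBot($_POST)) { http_response_code(400); exit; }
```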
Make a robots.txt file like this
User-agent: *
Disallow: /secret/
Any honest bot like Googlebot won't follow the path to /secret/ on your domain. But if it's a bot you do not want on your site, one of the first things it is going to be programmed to do is check files like this for cues about where the paths on your site are, especially private ones.
Then just set up a script to automatically IP ban all traffic to /secret
Now obviously this is going to end up banning some legit people who end up using the IP after the bot, etc. But honestly, tell me it doesn't sound like a fun idea.
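The ban script could be as small as the sketch below. It assumes a plain text ban list with one IP per line; the function names banIp and isBanned and the file location are invented for the example, and a real site would more likely ban at the firewall or web server level.

```php
<?php
// Hypothetical ban list: one IP address per line in a text file.
function isBanned(string $ip, string $banFile): bool
{
    if (!is_file($banFile)) {
        return false;
    }
    $banned = file($banFile, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);

    // file() returns false on failure; treat that as an empty list.
    return in_array($ip, $banned ?: [], true);
}

function banIp(string $ip, string $banFile): void
{
    if (!isBanned($ip, $banFile)) {
        // LOCK_EX avoids interleaved writes from concurrent requests.
        file_put_contents($banFile, $ip . "\n", FILE_APPEND | LOCK_EX);
    }
}

// In /secret/index.php:  banIp($_SERVER['REMOTE_ADDR'], '/path/to/banned.txt');
// At the top of every other page:
//   if (isBanned($_SERVER['REMOTE_ADDR'], '/path/to/banned.txt')) { http_response_code(403); exit; }
```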
Sometimes users misspell their email domain and so enter a wrong email address.
E.g. abc@gmial.com rather than abc@gmail.com
Has anybody thought about this before? Can anybody suggest how to handle this type of mistakes?
It didn't exist when this question was asked, but I recommend MailCheck which auto-suggests corrections to entered emails. It's used successfully by large companies.
Can anybody suggest how to handle this type of mistakes?
You would usually send a confirmation E-Mail to the address given, and proceed only if a link in that E-Mail has been clicked.
There is no other good way to deal with this - it's impossible to tell for sure whether gmial.com is a typo or not, seeing as it's a valid domain.
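For the confirmation link, the token part could be sketched roughly like this. The function names are made up, and storing the token next to the address and sending the mail are left out.

```php
<?php
// Create an unguessable token to embed in the confirmation link.
function createConfirmationToken(): string
{
    return bin2hex(random_bytes(16)); // 16 random bytes as 32 hex characters
}

// Compare the stored token with the one from the clicked link.
// hash_equals() compares in constant time to avoid timing leaks.
function tokenMatches(string $stored, string $supplied): bool
{
    return hash_equals($stored, $supplied);
}

// The link in the mail would then look something like
//   https://example.com/confirm.php?token=<token>
// where example.com and confirm.php are placeholders.
```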
Create a list of common email domain names:
hotmail.com
gmail.com
googlemail.com
... etc
When a user enters an email address, take the domain name of the entered address and compute the Levenshtein distance between it and each domain in your list. If the distance is 1 (or maybe up to 2) then ask the user to confirm that's the email address they meant.
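PHP has a built-in levenshtein() function, so a rough sketch of this check could look like the following. The function name suggestDomain and the default threshold of 2 are just choices for the example.

```php
<?php
// Hypothetical helper: returns the closest known domain when the entered
// domain is a near miss, or null when there is nothing to suggest.
function suggestDomain(string $email, array $knownDomains, int $maxDistance = 2): ?string
{
    $at = strrpos($email, '@');
    if ($at === false) {
        return null; // not an address at all; leave that to format validation
    }
    $domain = strtolower(substr($email, $at + 1));
    if ($domain === '' || in_array($domain, $knownDomains, true)) {
        return null; // empty, or already one of the known domains
    }

    $best = null;
    $bestDistance = $maxDistance + 1;
    foreach ($knownDomains as $known) {
        $distance = levenshtein($domain, $known);
        if ($distance < $bestDistance) {
            $best = $known;
            $bestDistance = $distance;
        }
    }

    return $best;
}

// Usage: ask, never auto-correct.
// $hint = suggestDomain($_POST['email'] ?? '', ['hotmail.com', 'gmail.com', 'googlemail.com']);
// if ($hint !== null) { /* show "Did you mean ...@$hint?" */ }
```

Note that swapping two adjacent letters counts as distance 2 in plain Levenshtein, which is why the sketch defaults to 2 rather than 1.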
In my opinion it is bordering on impossible to come up with a generic solution for the generic case.
That being said, the most common typo is to interchange two adjacent letters.
So you might want to check for character content for the largest sites gmail, yahoo and what have you; Based on that suggest an alternative spelling if the original does not match gmail etc.
Do not assume the user is at fault, suggest alternatives if it looks suspicious compared to common names. A white-list was mentioned in another reply.
Use confirmation mails if you need to know you can get a reply from this address.
You cannot assume the spelling you find is in error, that is what confirmation mails are for.
Make it very non-obtrusive (ajax springs to mind).
In our forms we're using a combination of techniques. While bad data can still slip through, the chances are vastly reduced.
First is to do a simple formatting regex that is commonly available - just be sure it's RFC-compliant. If this fails, it's good to offer the user a confirmation form at this point, because they may catch other errors for you while fixing this problem.
The next part is to check the TLD part of the domain. Since all TLDs can be known, these are relatively easy to scan for misspellings using some regex tests. Just keep a list of all current TLDs in a table somewhere and update it from time to time as needed. Mind you, this list can get complex when dealing with international TLDs. If you're only dealing with US traffic, the rules are much easier, and that's something else you can filter out. For example, if you're selling a service only available in the US, it would make sense to filter out international emails at form submission time. We are, so this works for us.
Third is to do something like what @npclaudiu suggested - scan for common misspellings of big-name mail hosts (gmail, hotmail, yahoo, etc) in the domain part and if a possible hit is detected, offer a confirmation form to the user. (You entered someone@hptmail.com, did you mean hotmail.com?)
If you get through those steps, then you can do the MX lookup suggested by @symcbean.
Finally, if all of that succeeds, there is a method (but I've not yet tested it) for communicating with the remote SMTP host to see if the mailbox exists. We're about to begin testing this ourselves. I found the how-to for such here:
http://www.webdigi.co.uk/blog/2009/how-to-check-if-an-email-address-exists-without-sending-an-email/
The funny thing is that the url does exist http://www.gmial.com
In fact it would be very difficult for you to know if it's a mistake or just a "strange" domain. Look at the Google APIs, because when you type something wrong in Google they suggest "did you mean...."
good luck
Arnaud
You cannot auto-correct misspelled email domain names, because a name you assume to be invalid may well be valid. You should expect anything to be entered as an email domain name.
I would suggest that, if you are creating a signup form, you give the user a dropdown with all the domain names you are aware of so that he can select from it.
Hope this helps.
You could create a list of popular e-mail domains (gmail.com, yahoo.com, ymail.com, etc) in your db and validate the e-mail address that the user inputs against this list. If the domain resembles one of these domains, you should show a warning and allow the user to correct it if necessary, not auto-correct it. To compare the domain entered with the domains in your list, you might use an algorithm like the one used in the SOUNDEX function in SQL Server, which matches words based on whether one word sounds like the other.
Edit: you can find more details on the SOUNDEX function here.
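The answer refers to SQL Server's SOUNDEX, but PHP ships its own soundex() function, so as a sketch the same comparison could be done in PHP instead. The function name soundsLikeKnownHost is made up, and it compares only the host label without the TLD.

```php
<?php
// Hypothetical helper: returns the first known host that has the same
// soundex key as the entered one, or null when there is no such host.
function soundsLikeKnownHost(string $host, array $knownHosts): ?string
{
    $host = strtolower($host);
    foreach ($knownHosts as $known) {
        if ($host !== $known && soundex($host) === soundex($known)) {
            return $known;
        }
    }

    return null;
}

// Usage: only warn the user, do not auto-correct.
// $hint = soundsLikeKnownHost('gmial', ['gmail', 'yahoo', 'ymail']);
```

Soundex keeps the first letter and ignores vowels, so it catches swapped vowels but misses a typo such as hptmail, and it can also flag unrelated names that happen to share a key.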
As mentioned before, it is not a good idea to automatically assume that someone has mistyped an email. A better approach would be to implement a little javascript function that checks if the domain of the email was possibly mistyped and alert the user instead of assuming they were wrong from the start.
Give me a minute to create a little mockup.
EDIT: OK, so maybe it was more than a minute. Take a look at http://jsbin.com/iyaxuq/8/edit and see for yourself how javascript can help prevent common typing errors. Try emails like: test@gmail.cmo, another@yhaoo.com, loser@htomali.ocm (typo of hotmail), and me@aol.com.
Note: I used a lazy regex to validate the email. Don't rely on it (or for that matter, most regexes) for a real app.
Trying to automate correction of bad data is a very dangerous practice. Ultimately, only the user can provide the correct data. However there are strict rules about formatting an email address - a regex check can be run in javascript (or using the preg functions with the same regex syntax) - but note that there are a lot of bad examples on the internet of regexes claiming to solve the problem.
This should be a fairly complete implementation of an RFC2822 ADDR_SPEC validator:
/[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?/gi
However in practice I find this to be adequate:
/^[a-z0-9\._%+!$&*=^|~#%'`?{}/\-]+@([a-z0-9\-]+\.){1,}([a-z]{2,22})$/gi
Then, serverside, you can do an MX lookup to verify that the domain provided not only meets the formatting requirements but exists as an email receiving site.
This does not prove that the named mailbox exists at that site, nor that it is accepting emails - ultimately you'd need to send an email to that address including a click back link / password to establish whether the email address is valid.
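The server-side MX lookup can be done with PHP's built-in checkdnsrr(). A minimal sketch follows; the function name domainAcceptsMail and the optional $resolver parameter are inventions of the example, the latter only so the lookup can be swapped out in tests.

```php
<?php
// Hypothetical helper: true when the domain part of the address has an MX record.
// $resolver defaults to PHP's checkdnsrr(); any callable with the same
// (hostname, type) signature can be passed instead.
function domainAcceptsMail(string $email, ?callable $resolver = null): bool
{
    $resolver = $resolver ?? 'checkdnsrr';

    $at = strrpos($email, '@');
    if ($at === false || $at === strlen($email) - 1) {
        return false; // no "@" or nothing after it
    }
    $domain = substr($email, $at + 1);

    return (bool) $resolver($domain, 'MX');
}

// Usage after the format check has passed:
// if (!domainAcceptsMail($_POST['email'] ?? '')) { /* ask the user to re-check the address */ }
```

A domain without an MX record can still receive mail through its A record, so a failed lookup is better treated as a reason to ask the user than as a hard rejection.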
Update
While, as the top voted answer here says, the best way to validate an ADDR_SPEC is to send a token to the address to be submitted back via the web, this is not of much help if the data is not coming from the person who controls the mailbox, and even when it is, the confirmation step is dissociated from the primary interaction. A further consideration is that an email address which is valid today might not be tomorrow.
Using a regex (and an MX lookup) is still a good idea to provide immediate feedback to the user, but for a complete solution you also need to monitor the bounces.
I have a classifieds website, and on each classifieds page there is a form for tipping a friend, where you just enter the person's email address and the tip is then sent. The form is submitted to tip.php, where all the "magic" happens: checking, sanitizing, etc.
Lastly I use PHP's mail() function to send the email from tip.php...
Now, I wouldn't want spam-bots and automated robots etc to send mail and blacklist my server.
What should I do?
One method which I would rather NOT use is logging the IP addresses of senders in a table (MySQL) and then allowing only x emails per sender.
As I said, I would prefer not to use that solution; there must be an easier way.
Is there any method you know of?
Is there any application to install maybe, on a linux server which does the job?
Thanks
I would say that the most used method is a CAPTCHA. It checks that whoever sends the email is a human, but anything can be cracked, so I would recommend finding a really good one; just type captcha into Google and you are good to go. You can also combine it with another method to make it more robust, e.g. a question that can be answered easily, a simple mathematical problem, etc.
I think you should do something in the form which makes it difficult for robots to submit rubbish into it.
Either a piece of Javascript which robots don't run (Hint: They usually don't) or if you MUST, a captcha.
You should definitely monitor the use of this facility, as well as monitoring outbound messages, message queues, and watch for bounced mail though.
Quite a lot of web spam seems to come from humans who are paid to submit rubbish into peoples' forms, which is difficult to block.
You can of course, also use something like Akismet - an API where you can ask them to spam-scan form input; I'm sure its licence terms are very reasonable and if spam is a real problem, paying for it will be acceptable to management (using Akismet is much cheaper than paying expensive developers to write and maintain an in-house anti-spam system)
Unless it's a paid-for service, or you can restrict the recipients to a pre-approved list and can establish the bona fides of the users, I would strongly recommend you don't do this. However...
Do have a look at spamassassin - but remember that one of its most important metrics is the Bayesian filtering engine - which needs to be trained using heuristics (but you can run spamassassin for your incoming mail and copy the database to your webserver).
Do make sure that you only allow authenticated customers (with an authenticated email) to use the facility, and limit the rate at which they can send messages (and the number of recipients) using a dead-man's lever.
C.
We built a newsletter module
and send email to members.
The environment is LAMP.
Is there any way to know whether a member opened the mail?
I have heard about embedding an image whose source is a PHP script.
What is the best way?
Ultimately there is obviously no fool-proof way to get notifications, because there is no guaranteed way of getting the email client to respond back in some fashion; the email client can be set up to just interpret the incoming email as ASCII text and nothing more, and there is nothing you can do about that.
However; in most cases if you are emailing to users that are expecting an email, odds are that HTML rendering and inline images are probably turned on for your source address, so using an inline IMG tag and monitoring access to the referenced file (obviously using some per-user unique ID in the reference) should give you the best you are going to get.
Regardless you'll never be able to make the assumption that if you do not get a notification back that that means the user has not seen the email.
There's no foolproof way to do it since you're not the one in control of the email client. Many people take their privacy seriously enough to prevent read-receipts, web beacons and all the other tricks which can be used to detect the reading (people can turn off read receipts, block images, prevent unsolicited outgoing connections and so on).
This is my opinion of course but I believe you're approaching the problem the wrong way. Instead of trying to force the user to let you know if they've read the email, just make it worth their while. It's obviously of some benefit to you to have this information (otherwise why do it?) so you share that benefit around and make sure it's the user's decision.
That way, you turn the relationship from a battleground into a partnership (win/win).
Yes, there is a standard mechanism (RFC 3798, since obsoleted by RFC 8098) called read receipts. It is implemented by many mail clients, and the user can choose whether or not to send the receipt.
There are also various non-standard subterfuges for doing this without the user's consent, which I won't detail.
EDIT:
It should be like the below (using built-in PHP mail function):
mail("foo@foo.com", "Let me know if you get this", "Important message", "Disposition-Notification-To: sender@sender.com\r\n");
A common way to check if an email has been read is a web beacon, which is usually a small 1x1px invisible image that is loaded from your server, which can track when the image has been loaded and therefore the email has been read.
This is not guaranteed to work, however, since many email clients block images in their emails or your readers could be using text-only email clients.
Each email has a uniquely named image in it corresponding to the user's account (or db row). When that image is loaded or accessed, you can see which user has opened the email. This relies on the user receiving HTML emails though.
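As a rough sketch of such a beacon in PHP: one function builds the per-user image URL for the newsletter, and another returns the bytes of a transparent 1x1 GIF for the tracking script to serve. The names beaconUrl, pixelBytes and open.php, and the "id" query parameter, are all assumptions of the example.

```php
<?php
// Hypothetical helper: URL to put in the <img src="..."> of the HTML mail.
function beaconUrl(string $baseUrl, string $userId): string
{
    return $baseUrl . '?id=' . rawurlencode($userId);
}

// Bytes of a transparent 1x1 GIF, decoded from base64.
function pixelBytes(): string
{
    return base64_decode('R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7');
}

// In the tracking script (say open.php) you would log the hit, then serve the image:
//   $id = $_GET['id'] ?? '';          // look this up in your members table
//   /* record "opened" for $id with a timestamp */
//   header('Content-Type: image/gif');
//   header('Cache-Control: no-store');
//   echo pixelBytes();
```

In the newsletter the tag would then be something like <img src="https://example.com/open.php?id=..." width="1" height="1" alt="">, with example.com standing in for your own host.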