This problem might not occur in English, but it really hurts in Polish. I guess my question is mostly for Polish users, since they might already have a decent solution.
What I mean is that verbs in Polish differ for male and female subjects in the past tense, and there are dozens of different variants. If my script needs to display lots and lots of text, this becomes a really painful problem to deal with. A short example (not very elegant use of language, but good enough for demonstration purposes):
Male: On poszedł i nie znalazł, więc klasnął w dłonie i nagle go coś pożarło.
Female: Ona poszła i nie znalazła, więc klasnęła w dłonie i nagle ją coś pożarło.
I managed to find the following solution: at the beginning of each script, I prepare a variable that looks like this:
$verb[$ending][$sex] = 'something';
//$ending contains - for my convenience - letters that say what kind of ending I am changing, instead of numeric options
//Examples:
$verb['-a']['male'] = '';
$verb['-a']['female'] = 'a';
//works for On=>Ona, znalazł=>znalazła
$verb['al-ela']['male'] = 'ął';
$verb['al-ela']['female'] = 'ęła';
//works for klasnął=>klasnęła
Now, if I add the fact that 99% of the time I don't know from the beginning which sex I am dealing with, my variable starts to look kind of scary: $verb['al-ela'][$_SESSION['user'.$id]['sex']]. So my final text looks like this:
O'.$verb['-a'][$_SESSION['user'.$id]['sex']].' posz'.$verb['edl-la'][$_SESSION['user'.$id]['sex']].' i nie znalazł'.$verb['-a'][$_SESSION['user'.$id]['sex']].', więc klasn'.$verb['al-ela'][$_SESSION['user'.$id]['sex']].' w dłonie i nagle '.$verb['go-ja'][$_SESSION['user'.$id]['sex']].' coś pożarło.
Yes, sure, this is a rather extreme example, but sometimes the text really does look like that and it is unavoidable.
To make a long story short, here are my questions:
Am I doing it wrong? Is there a better/faster/handier solution for this type of problem?
Is there a script that might detect/change the endings for me without ruining the rest of the text?
I struggled to find a full list of possible ending variations in Polish (for both singular and plural), so I'm creating my own list as I find new options. Perhaps someone has a list like that; it might help me create the script from my 2nd question.
Thanks a lot in advance, best regards!
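One possible way to keep such text readable, offered as a rough sketch rather than a complete answer: write both forms inline as {male|female} placeholders and pick one with a single function call. The genderize() name and the placeholder syntax are assumptions made up for this example; they do not come from any existing library.

```php
<?php
// Hypothetical helper: replaces every "{male form|female form}" placeholder
// with the form matching $sex ('male' or 'female').
function genderize(string $template, string $sex): string
{
    $result = preg_replace_callback(
        '/\{([^|{}]*)\|([^|{}]*)\}/u',
        static fn(array $m): string => $sex === 'female' ? $m[2] : $m[1],
        $template
    );

    // preg_replace_callback returns null on a regex error; fall back to the raw template.
    return $result ?? $template;
}

$template = '{On|Ona} {poszedł|poszła} i nie {znalazł|znalazła}, '
    . 'więc {klasnął|klasnęła} w dłonie i nagle {go|ją} coś pożarło.';

echo genderize($template, 'male'), "\n";
echo genderize($template, 'female'), "\n";
```

The sex would be looked up once (for example from the session) and passed in, so the sentence itself stays legible and no table of endings is needed.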
Some years ago I started including the following code at the top of my pages. I read that it was good, so I used it. But I was wondering: is it actually helpful?
$page = "index.php";
$cracktrack = $_SERVER['QUERY_STRING'];
$wormprotector = array('chr(', 'chr=', 'chr%20', '%20chr', 'wget%20', '%20wget', 'wget(',
'cmd=', '%20cmd', 'cmd%20', 'rush=', '%20rush', 'rush%20',
'union%20', '%20union', 'union(', 'union=', 'echr(', '%20echr', 'echr%20', 'echr=',
'esystem(', 'esystem%20', 'cp%20', '%20cp', 'cp(', 'mdir%20', '%20mdir', 'mdir(',
'mcd%20', 'mrd%20', 'rm%20', '%20mcd', '%20mrd', '%20rm',
'mcd(', 'mrd(', 'rm(', 'mcd=', 'mrd=', 'mv%20', 'rmdir%20', 'mv(', 'rmdir(',
'chmod(', 'chmod%20', '%20chmod', 'chmod(', 'chmod=', 'chown%20', 'chgrp%20', 'chown(', 'chgrp(',
'locate%20', 'grep%20', 'locate(', 'grep(', 'diff%20', 'kill%20', 'kill(', 'killall',
'passwd%20', '%20passwd', 'passwd(', 'telnet%20', 'vi(', 'vi%20',
'insert%20into', 'select%20', 'nigga(', '%20nigga', 'nigga%20', 'fopen', 'fwrite', '%20like', 'like%20',
'$_request', '$_get', '$request', '$get', '.system', 'HTTP_PHP', '&aim', '%20getenv', 'getenv%20',
'new_password', '&icq','/etc/password','/etc/shadow', '/etc/groups', '/etc/gshadow',
'HTTP_USER_AGENT', 'HTTP_HOST', '/bin/ps', 'wget%20', 'unamex20-a', '/usr/bin/id',
'/bin/echo', '/bin/kill', '/bin/', '/chgrp', '/chown', '/usr/bin', 'g++', 'bin/python',
'bin/tclsh', 'bin/nasm', 'perl%20', 'traceroute%20', 'ping%20', '.pl', '/usr/X11R6/bin/xterm', 'lsof%20',
'/bin/mail', '.conf', 'motd%20', 'HTTP/1.', '.inc.php', 'config.php', 'cgi-', '.eml',
'file://', 'window.open', '<SCRIPT>', 'javascript://','img src', 'img%20src','.jsp','ftp.exe',
'xp_enumdsn', 'xp_availablemedia', 'xp_filelist', 'xp_cmdshell', 'nc.exe', '.htpasswd',
'servlet', '/etc/passwd', 'wwwacl', '~root', '~ftp', '.js', '.jsp', 'admin_', '.history',
'bash_history', '.bash_history', '~nobody', 'server-info', 'server-status', 'reboot%20', 'halt%20',
'powerdown%20', '/home/ftp', '/home/www', 'secure_site, ok', 'chunked', 'org.apache', '/servlet/con',
'<script', '/robot.txt' ,'/perl' ,'mod_gzip_status', 'db_mysql.inc', '.inc', 'select%20from',
'select from', 'drop%20', '.system', 'getenv', 'http_', '_php', 'php_', 'phpinfo()', '<?php', '?>', 'sql=');
$checkworm = str_replace($wormprotector, '*', $cracktrack);
if ($cracktrack != $checkworm){
$cremotead = $_SERVER['REMOTE_ADDR'];
$cuseragent = $_SERVER['HTTP_USER_AGENT'];
header("location:$page");
die();
}
In general, I personally wouldn't use this strategy. I'd rather sanitize each and every input. If a user passes .bash_history in the URL I don't care because it's never going to do anything in my script.
I could maybe see something like this being useful if you had some third-party low reliability script that was available for anyone to hit. Even in that scenario though it seems like a semi-reliable band-aid at best.
For applications you write however, this should hopefully be unnecessary.
Although it's great that you're concerned about security, and you're following the principle of treating all input with suspicion, I don't think that list is terribly useful.
It's a rather arbitrary selection of potentially unwanted strings/commands/tags/folder names and other things. It's likely to get out of date over time, and probably is already. Having a generic list like this is never going to catch everything, and may also lend a false sense of security that your application is secure when really it's not.
As another answer has already mentioned, you want to be checking each input you get from your application (whether via query string variables, POST variables or wherever) and validating that it meets your expectations (e.g. if you're expecting a numeric value, is the value passed in numeric?).
Then if you plan to redisplay or re-use that data, you might want to sanitise it further, and strip out things that might potentially be dangerous in the context where it will be used. For example, you might strip out "script" tags if you're going to display the data on a web page.
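As a minimal sketch of that per-input approach: the page parameter, readPageNumber() and escapeForHtml() below are hypothetical names invented for illustration, and escaping on output is shown here as one common way to neutralise tags instead of stripping them.

```php
<?php
// Hypothetical example: validate one expected numeric parameter.
// Returns the page number, or null when the value is missing or not a positive integer.
function readPageNumber(array $query): ?int
{
    $value = filter_var(
        $query['page'] ?? null,
        FILTER_VALIDATE_INT,
        ['options' => ['min_range' => 1]]
    );

    return $value === false ? null : $value;
}

// Hypothetical example: escape text for the HTML context right before output.
function escapeForHtml(string $text): string
{
    return htmlspecialchars($text, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

var_dump(readPageNumber(['page' => '3']));                 // int(3)
var_dump(readPageNumber(['page' => '3; DROP TABLE users'])); // NULL
echo escapeForHtml('<script>alert(1)</script>'), "\n";
```

In a real page the array passed in would typically be $_GET or $_POST.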
If you sanitize all user input properly, there's absolutely no need to use a script like this.
Besides that, it's also case sensitive (str_replace vs str_ireplace), which means that I can easily bypass it by using a mix of uppercase and lowercase letters. It also only checks the query string, so it is useless against POST requests.
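A small sketch of the case-sensitivity point, using a shortened two-entry blacklist and a made-up query string (both are assumptions for illustration):

```php
<?php
// Shortened, hypothetical version of the blacklist from the question.
$blacklist = ['union%20', 'select%20'];

// Made-up query string that simply uses uppercase keywords.
$query = 'id=1%20UNION%20SELECT%20password';

// Case-sensitive replacement: nothing matches, so the check in the question would pass.
$caseSensitive = str_replace($blacklist, '*', $query);
var_dump($caseSensitive === $query); // bool(true)

// Case-insensitive replacement: both keywords are caught.
$caseInsensitive = str_ireplace($blacklist, '*', $query);
var_dump($caseInsensitive === $query); // bool(false)
```

Switching to str_ireplace only closes this one gap; the other objections to the blacklist approach still apply.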
How can I convert something like this
\xe6\xa6\x82\xe8\xa6\x81\n\xe3\x83\xbb\xe3\x82\xb0\xe3\x83\xaa\xe3\x83\xbc\xe3\x81\xae\xe3\x82\xa8\xe3\x83\xb3\xe3\x82\xb8\xe3\x83\x8b\xe3\x82\xa2\xe3\x81\xab\xe5\xbf\x9c\xe5\x8b\x9f\xe3\x81\x97\xe3\x81\xa6\xe3\x81\xbf\xe3\x81\x9f\xe3\x81\x84\xe3\x81\x8c\xe3\x80\x81\xe5\xbf\x9c\xe5\x8b\x9f\xe5\x89\x8d\xe3\x81\xab\xe8\x87\xaa\xe5\x88\x86\xe3\x81\xae\xe5\xae\x9f\xe5\x8a\x9b\xe3\x82\x92\xe8\xa9\xa6\xe3\x81\x97\xe3\x81\xa6\xe3\x81\xbf\xe3\x81\x9f\xe3\x81\x84\xe3\x80\x82\n\xe3\x83\xbb\xe5\x9c\xb0\xe6\x96\xb9\xe3\x81\xab\xe4\xbd\x8f\xe3\x82\x93\xe3\x81\xa7\xe3\x81\x84\xe3\x82\x8b\xe3\x81\xae\xe3\x81\xa7\xe9\x9d\xa2\xe6\x8e\xa5\xe5\x9b\x9e\xe6\x95\xb0\xe3\x81\x8c\xe5\xb0\x91\xe3\x81\xaa\xe3\x81\x84\xe6\x96\xb9\xe3\x81\x8c\xe3\x81\x82\xe3\x82\x8a\xe3\x81\x8c\xe3\x81\x9f\xe3\x81\x84\xe3\x80\x82\n\xe3\x83\xbb\xe9\x9d\xa2\xe6\x8e\xa5\xe3\x81\xaf\xe8\x8b\xa6\xe6\x89\x8b\xe3\x81\xa0\xe3\x81\x8c\xe3\x83\x97\xe3\x83\xad\xe3\x82\xb0\xe3\x83\xa9\xe3\x83\x9f\xe3\x83\xb3\xe3\x82\xb0\xe3\x81\xab\xe3\x81\xaf\xe8\x87\xaa\xe4\xbf\xa1\xe3\x81\x8c\xe3\x81\x82\xe3\x82\x8b\xe3\x80\x82\xe3\x81\xaf\xe3\x80\x81\xe3\x81\x93\xe3\x81\xae\xe3\x82\x88\xe3\x81\x86\xe3\x81\xaa\xe6\x96\xb9\xe3\x80\x85\xe3\x81\xae\xe3\x81\x94\xe8\xa6\x81\xe6\x9c\x9b\xe3\x81\xab\xe3\x81\x8a\xe5\xbf\x9c\xe3\x81\x88\xe3\x81\x99\xe3\x82\x8b\xe3\x81\x9f\xe3\x82\x81\xe3\x81\xab\xe4\xbd\x9c\xe3\x82\x89\xe3\x82\x8c\xe3\x81\x9f\xe6\x96\xb0\xe3\x81\x97\xe3\x81\x84\xe6\x8e\xa1\xe7\x94\xa8\xe3\x83\x97\xe3\x83\xad\xe3\x82\xb0\xe3\x83\xa9\xe3\x83\xa0\xe3\x81\xa7\xe3\x81\x99\xe3\x80\x82\n\xe3\x83\x97\xe3\x83\xad\xe3\x82\xb0\xe3\x83\xa9\xe3\x83\x9f\xe3\x83\xb3\xe3\x82\xb0\xe3\x82\xb9\xe3\x82\xad\xe3\x83\xab\xe3\x82\x92\xe8\xa9\x95\xe4\xbe\xa1\xe3\x81\x99\xe3\x82\x8b\xef\xbc\x91\xe6\xac\xa1\xe9\x9d\xa2\xe6\x8e\xa5\xe3\x82\x92\xe3\x83\x91\xe3\x82\xb9\xe3\x81\xa7\xe3\x81\x8d\xe3\x81\xbe\xe3\x81\x99\xe3\x81\xae\xe3\x81\xa7\xe5\x8a\xb9\xe7\x8e\x87\xe7\x9a\x84\xe3\x81\xaa\xe8\xbb\xa2\xe8\x81\xb7\xe6\xb4\xbb\xe5\x8b\x95\xe3\x82\x92\xe8\xa1\x8c
\xe3\x81\xa3\xe3\x81\xa6\xe9\xa0\x82\xe3\x81\x91\xe3\x81\xbe\xe3\x81\x99\xe3\x80\x82\n\xe3\x82\x82\xe3\x81\xa1\xe3\x82\x8d\xe3\x82\x93\xe5\xad\xa6\xe7\x94\x9f\xe3\x81\xae\xe7\x9a\x86\xe3\x81\x95\xe3\x82\x93\xe3\x81\xae\xe3\x83\x81\xe3\x83\xa3\xe3\x83\xac\xe3\x83\xb3\xe3\x82\xb8\xe3\x82\x82\xe3\x81\x8a\xe5\xbe\x85\xe3\x81\xa1\xe3\x81\x97\xe3\x81\xa6\xe3\x81\x8a\xe3\x82\x8a\xe3\x81\xbe\xe3\x81\x99\xe3\x80\x82
which I received in an HTTP POST, so that it shows properly on an HTML web page?
I have no idea what I am looking at, but I think it can be converted to something that looks like this ☺ format.
How can I do this in PHP?
If you send the appropriate character set encoding with your HTTP response, you don't have to do anything to the data, the browser should properly decode it as Japanese text.
Example:
<?php
header('Content-Type: text/html; charset=UTF-8');
$var = "\xe6\xa6\x82\xe8\xa6\x81\n\xe3\x83\xbb\xe3\x82\xb0\xe3\x83\xaa\xe3\x83\xbc\xe3\x81\xae\xe3\x82\xa8\xe3\x83\xb3\xe3\x82\xb8\xe3\x83\x8b\xe3\x82\xa2\xe3\x81\xab\xe5\xbf\x9c\xe5\x8b\x9f\xe3\x81\x97\xe3\x81\xa6\xe3\x81\xbf\xe3\x81\x9f\xe3\x81\x84\xe3\x81\x8c\xe3\x80\x81\xe5\xbf\x9c\xe5\x8b\x9f\xe5\x89\x8d\xe3\x81\xab\xe8\x87\xaa\xe5\x88\x86\xe3\x81\xae\xe5\xae\x9f\xe5\x8a\x9b\xe3\x82\x92\xe8\xa9\xa6\xe3\x81\x97\xe3\x81\xa6\xe3\x81\xbf\xe3\x81\x9f\xe3\x81\x84\xe3\x80\x82\n\xe3\x83\xbb\xe5\x9c\xb0\xe6\x96\xb9\xe3\x81\xab\xe4\xbd\x8f\xe3\x82\x93\xe3\x81\xa7\xe3\x81\x84\xe3\x82\x8b\xe3\x81\xae\xe3\x81\xa7\xe9\x9d\xa2\xe6\x8e\xa5\xe5\x9b\x9e\xe6\x95\xb0\xe3\x81\x8c\xe5\xb0\x91\xe3\x81\xaa\xe3\x81\x84\xe6\x96\xb9\xe3\x81\x8c\xe3\x81\x82\xe3\x82\x8a\xe3\x81\x8c\xe3\x81\x9f\xe3\x81\x84\xe3\x80\x82\n\xe3\x83\xbb\xe9\x9d\xa2\xe6\x8e\xa5\xe3\x81\xaf\xe8\x8b\xa6\xe6\x89\x8b\xe3\x81\xa0\xe3\x81\x8c\xe3\x83\x97\xe3\x83\xad\xe3\x82\xb0\xe3\x83\xa9\xe3\x83\x9f\xe3\x83\xb3\xe3\x82\xb0\xe3\x81\xab\xe3\x81\xaf\xe8\x87\xaa\xe4\xbf\xa1\xe3\x81\x8c\xe3\x81\x82\xe3\x82\x8b\xe3\x80\x82\xe3\x81\xaf\xe3\x80\x81\xe3\x81\x93\xe3\x81\xae\xe3\x82\x88\xe3\x81\x86\xe3\x81\xaa\xe6\x96\xb9\xe3\x80\x85\xe3\x81\xae\xe3\x81\x94\xe8\xa6\x81\xe6\x9c\x9b\xe3\x81\xab\xe3\x81\x8a\xe5\xbf\x9c\xe3\x81\x88\xe3\x81\x99\xe3\x82\x8b\xe3\x81\x9f\xe3\x82\x81\xe3\x81\xab\xe4\xbd\x9c\xe3\x82\x89\xe3\x82\x8c\xe3\x81\x9f\xe6\x96\xb0\xe3\x81\x97\xe3\x81\x84\xe6\x8e\xa1\xe7\x94\xa8\xe3\x83\x97\xe3\x83\xad\xe3\x82\xb0\xe3\x83\xa9\xe3\x83\xa0\xe3\x81\xa7\xe3\x81\x99\xe3\x80\x82\n\xe3\x83\x97\xe3\x83\xad\xe3\x82\xb0\xe3\x83\xa9\xe3\x83\x9f\xe3\x83\xb3\xe3\x82\xb0\xe3\x82\xb9\xe3\x82\xad\xe3\x83\xab\xe3\x82\x92\xe8\xa9\x95\xe4\xbe\xa1\xe3\x81\x99\xe3\x82\x8b\xef\xbc\x91\xe6\xac\xa1\xe9\x9d\xa2\xe6\x8e\xa5\xe3\x82\x92\xe3\x83\x91\xe3\x82\xb9\xe3\x81\xa7\xe3\x81\x8d\xe3\x81\xbe\xe3\x81\x99\xe3\x81\xae\xe3\x81\xa7\xe5\x8a\xb9\xe7\x8e\x87\xe7\x9a\x84\xe3\x81\xaa\xe8\xbb\xa2\xe8\x81\xb7\xe6\xb4\xbb\xe5\x8b\x95\xe3\x82\x92\xe8
\xa1\x8c\xe3\x81\xa3\xe3\x81\xa6\xe9\xa0\x82\xe3\x81\x91\xe3\x81\xbe\xe3\x81\x99\xe3\x80\x82\n\xe3\x82\x82\xe3\x81\xa1\xe3\x82\x8d\xe3\x82\x93\xe5\xad\xa6\xe7\x94\x9f\xe3\x81\xae\xe7\x9a\x86\xe3\x81\x95\xe3\x82\x93\xe3\x81\xae\xe3\x83\x81\xe3\x83\xa3\xe3\x83\xac\xe3\x83\xb3\xe3\x82\xb8\xe3\x82\x82\xe3\x81\x8a\xe5\xbe\x85\xe3\x81\xa1\xe3\x81\x97\xe3\x81\xa6\xe3\x81\x8a\xe3\x82\x8a\xe3\x81\xbe\xe3\x81\x99\xe3\x80\x82";
echo $var;
Since we send a header saying that the character encoding is UTF-8, the browser knows to decode it as such. You could also use a meta tag to specify the charset. If the browser is set to auto-detect the encoding, neither option may be necessary, but you can't rely on that.
It looks like Japanese
php > echo "\xe6\xa6\x82\xe8\xa6\x81\n\xe3\x83\xbb\xe3\x82\xb0\xe3\x83\xaa\xe3\x83\xbc\xe3\x81\xae\xe3\x82\xa8\xe3\x83\xb3\xe3\x82\xb8\xe3\x83\x8b\xe3\x82\xa2\xe3\x81\xab\xe5\xbf\x9c\xe5\x8b\x9f\xe3\x81\x97\xe3\x81\xa6\xe3\x81\xbf\xe3\x81\x9f\xe3\x81\x84\xe3\x81\x8c\xe3\x80\x81\xe5\xbf\x9c\xe5\x8b\x9f\xe5\x89\x8d\xe3\x81\xab\xe8\x87\xaa\xe5\x88\x86\xe3\x81\xae\xe5\xae\x9f\xe5\x8a\x9b\xe3\x82\x92\xe8\xa9\xa6\xe3\x81\x97\xe3\x81\xa6\xe3\x81\xbf\xe3\x81\x9f\xe3\x81\x84\xe3\x80\x82\n\xe3\x83\xbb\xe5\x9c\xb0\xe6\x96\xb9\xe3\x81\xab\xe4\xbd\x8f\xe3\x82\x93\xe3\x81\xa7\xe3\x81\x84\xe3\x82\x8b\xe3\x81\xae\xe3\x81\xa7\xe9\x9d\xa2\xe6\x8e\xa5\xe5\x9b\x9e\xe6\x95\xb0\xe3\x81\x8c\xe5\xb0\x91\xe3\x81\xaa\xe3\x81\x84\xe6\x96\xb9\xe3\x81\x8c\xe3\x81\x82\xe3\x82\x8a\xe3\x81\x8c\xe3\x81\x9f\xe3\x81\x84\xe3\x80\x82\n\xe3\x83\xbb\xe9\x9d\xa2\xe6\x8e\xa5\xe3\x81\xaf\xe8\x8b\xa6\xe6\x89\x8b\xe3\x81\xa0\xe3\x81\x8c\xe3\x83\x97\xe3\x83\xad\xe3\x82\xb0\xe3\x83\xa9\xe3\x83\x9f\xe3\x83\xb3\xe3\x82\xb0\xe3\x81\xab\xe3\x81\xaf\xe8\x87\xaa\xe4\xbf\xa1\xe3\x81\x8c\xe3\x81\x82\xe3\x82\x8b\xe3\x80\x82\xe3\x81\xaf\xe3\x80\x81\xe3\x81\x93\xe3\x81\xae\xe3\x82\x88\xe3\x81\x86\xe3\x81\xaa\xe6\x96\xb9\xe3\x80\x85\xe3\x81\xae\xe3\x81\x94\xe8\xa6\x81\xe6\x9c\x9b\xe3\x81\xab\xe3\x81\x8a\xe5\xbf\x9c\xe3\x81\x88\xe3\x81\x99\xe3\x82\x8b\xe3\x81\x9f\xe3\x82\x81\xe3\x81\xab\xe4\xbd\x9c\xe3\x82\x89\xe3\x82\x8c\xe3\x81\x9f\xe6\x96\xb0\xe3\x81\x97\xe3\x81\x84\xe6\x8e\xa1\xe7\x94\xa8\xe3\x83\x97\xe3\x83\xad\xe3\x82\xb0\xe3\x83\xa9\xe3\x83\xa0\xe3\x81\xa7\xe3\x81\x99\xe3\x80\x82\n\xe3\x83\x97\xe3\x83\xad\xe3\x82\xb0\xe3\x83\xa9\xe3\x83\x9f\xe3\x83\xb3\xe3\x82\xb0\xe3\x82\xb9\xe3\x82\xad\xe3\x83\xab\xe3\x82\x92\xe8\xa9\x95\xe4\xbe\xa1\xe3\x81\x99\xe3\x82\x8b\xef\xbc\x91\xe6\xac\xa1\xe9\x9d\xa2\xe6\x8e\xa5\xe3\x82\x92\xe3\x83\x91\xe3\x82\xb9\xe3\x81\xa7\xe3\x81\x8d\xe3\x81\xbe\xe3\x81\x99\xe3\x81\xae\xe3\x81\xa7\xe5\x8a\xb9\xe7\x8e\x87\xe7\x9a\x84\xe3\x81\xaa\xe8\xbb\xa2\xe8\x81\xb7\xe6\xb4\xbb\xe5\x8b\x95\xe3\x82\x92
\xe8\xa1\x8c\xe3\x81\xa3\xe3\x81\xa6\xe9\xa0\x82\xe3\x81\x91\xe3\x81\xbe\xe3\x81\x99\xe3\x80\x82\n\xe3\x82\x82\xe3\x81\xa1\xe3\x82\x8d\xe3\x82\x93\xe5\xad\xa6\xe7\x94\x9f\xe3\x81\xae\xe7\x9a\x86\xe3\x81\x95\xe3\x82\x93\xe3\x81\xae\xe3\x83\x81\xe3\x83\xa3\xe3\x83\xac\xe3\x83\xb3\xe3\x82\xb8\xe3\x82\x82\xe3\x81\x8a\xe5\xbe\x85\xe3\x81\xa1\xe3\x81\x97\xe3\x81\xa6\xe3\x81\x8a\xe3\x82\x8a\xe3\x81\xbe\xe3\x81\x99\xe3\x80\x82";
概要
・グリーのエンジニアに応募してみたいが、応募前に自分の実力を試してみたい。
・地方に住んでいるので面接回数が少ない方がありがたい。
・面接は苦手だがプログラミングには自信がある。は、このような方々のご要望にお応えするために作られた新しい採用プログラムです。
プログラミングスキルを評価する1次面接をパスできますので効率的な転職活動を行って頂けます。
もちろん学生の皆さんのチャレンジもお待ちしております。
The Google Translate version:
Summary
-But I would like to apply for the engineers of the glee, want to try their strength before application.
The smaller the number of times, because they appreciate the interview live in rural areas.
• The programming is confident but not good interview. Is a new adoption program was created to meet the needs of people like this.
You can Tenshoku efficient activities so you can pass the next one interview to evaluate the programming.
We look forward to challenge of course students
Maybe I'm wrong ;)
I actually have no real clue why you get this as POST, but I assume that
\x82
(and the like) stands for a hexadecimal number. To convert a whole string (ensure it's in that format):
$string = eval('return "' . $thatExactInputAsGiven . '";');
$string now contains the byte sequence that this submission represents. However, I cannot tell you which encoding it is in, but the one-liner above probably helps you with testing.
If you fear the eval, mind the error handling:
$string = implode('', array_map(function($v){
$r = sscanf($v, '\x%x', $ord);
if (!$r) throw new Exception('Invalid input.');
return chr($ord);
}, str_split($thatExactInputAsGiven, 4)));
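The sample input also contains literal \n sequences, which a fixed four-character split would not handle. A possible alternative without eval, sketched under the assumption that only \xHH and \n escapes occur in the input (decodeHexEscapes() is a hypothetical name):

```php
<?php
// Hypothetical helper: turns literal "\xHH" and "\n" escape sequences into raw bytes.
// Any other text is passed through unchanged.
function decodeHexEscapes(string $escaped): string
{
    $result = preg_replace_callback(
        '/\\\\x([0-9a-fA-F]{2})|\\\\n/',
        static function (array $m): string {
            // Group 1 is only set for the \xHH form; otherwise the match was \n.
            return isset($m[1]) && $m[1] !== '' ? chr(hexdec($m[1])) : "\n";
        },
        $escaped
    );

    if ($result === null) {
        throw new RuntimeException('Could not decode input.');
    }

    return $result;
}

// First few escapes of the posted data, as a single-quoted (literal) string.
$input = '\xe6\xa6\x82\xe8\xa6\x81\n\xe3\x83\xbb';
$decoded = decodeHexEscapes($input);

echo bin2hex($decoded), "\n"; // e6a682e8a6810ae383bb
```

The decoded bytes are UTF-8 here, so they can be sent to the browser with a UTF-8 charset header, escaped with htmlspecialchars() first.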
For security reasons, I don't want users on my website to be able to register a username that resembles their email address. Someone with the email address user#domain.com can't register as user or us.er, etc.
For example, I want this not to be possible:
tester -> tester#mydomain.com (wrong)
tes.ter -> tester#mydomain.com (wrong)
etc.
But I do want to be able to use the following:
tester6 -> tester#mydomain.com (good)
etc.
//edit
tester6 is wrong too. I meant user6 -> tester#mydomain.com (good).
Does anyone have an idea how to achieve this, or something as close as possible? I am checking this in JavaScript, and after that on the server in PHP.
Ciao!
PS. Maybe there is some jQuery plugin to do this, but I can't find one so far. The downside of using a plugin for this, though, is that I have to implement the same thing in PHP. If it is a long plugin, it will take some time to translate.
//Edit again
If I only check the part before the #, they can still use userhotmailcom, or usergmail, etc. If they supply that, then their email is obvious.
Typically, I use the Levenshtein distance algorithm to check whether a password looks like a login.
PHP has a native levenshtein function and here is one written in JavaScript.
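A rough sketch of how that could look on the PHP side. The usernameResemblesEmail() name, the normalisation step and the threshold of 2 are assumptions chosen for illustration, and the code assumes the # in the question's examples stands in for @.

```php
<?php
// Hypothetical helper: true when the username is within $maxDistance edits
// of either the local part of the email or the whole address.
function usernameResemblesEmail(string $username, string $email, int $maxDistance = 2): bool
{
    // Lowercase and drop everything except letters and digits, so "tes.ter" becomes "tester".
    $normalize = static fn(string $s): string
        => (string) preg_replace('/[^a-z0-9]/', '', strtolower($s));

    $atPos = strpos($email, '@');
    $localPart = $atPos === false ? $email : substr($email, 0, $atPos);

    $name = $normalize($username);
    foreach ([$normalize($localPart), $normalize($email)] as $candidate) {
        if (levenshtein($name, $candidate) <= $maxDistance) {
            return true;
        }
    }

    return false;
}

var_dump(usernameResemblesEmail('tes.ter', 'tester@mydomain.com'));           // bool(true)
var_dump(usernameResemblesEmail('tester6', 'tester@mydomain.com'));           // bool(true)
var_dump(usernameResemblesEmail('testermydomaincom', 'tester@mydomain.com')); // bool(true)
var_dump(usernameResemblesEmail('user6', 'tester@mydomain.com'));             // bool(false)
```

The threshold needs tuning: a larger value rejects more legitimate names, and partial variants such as the local part plus only the provider name are not caught by this comparison alone.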
Something like this?
var charsRe = /[.+]/g; // Add your characters here
if (username.replace(charsRe, '') == email.split('#')[0].replace(charsRe, ''))
doError();
If all you want is to disallow user names that vary from the email address only with periods (.), you can remove periods from the user name and compare it with email address.
//Remove the periods from the user name, then compare it with the part of the email before '#'
$email = "someone#something.com";
$emailname = substr($email, 0, strpos($email, '#'));
$uname = "som.e.on.e";
$uname = preg_replace('/\./', '', $uname); //regex matching every '.'
if ($uname === $emailname)
showInvalidNameErrorMessage();
Modified regex to prevent hyphens and underscores /[\-._]/g
Well, I am a newbie PHP developer, but the answer I have in mind is: wouldn't it be better to let them register only with their email address (which won't be shared with others), then ask for their first name and last name separately, and only show their first name in public content (e.g. blogs)? I am not an expert in programming, so if I am wrong please correct me. I also couldn't quite understand what you mean by security here. Sorry for the bad English, I am not a native English speaker.
I would like to be able to detect what country a visitor is from on my website, using PHP.
Please note that I'm not trying to use this as a security measure or for anything important, just changing the spelling of some words (Americans seem to believe that the word "enrolment" has 2 Ls.... crazy yanks), and perhaps to give a default option in a "Select your country" list.
As such, using a Geolocation database is a tad over-the-top and I really don't want to muck about with installing new PHP libraries just for this, so what's the easiest/simplest way to find what country a visitor is from?
Since 5.3.0, PHP provides a function (in the intl extension) to parse the $_SERVER['HTTP_ACCEPT_LANGUAGE'] variable into a locale.
$locale = Locale::acceptFromHttp($_SERVER['HTTP_ACCEPT_LANGUAGE']);
echo $locale; // returns "en_US"
Documentation: https://www.php.net/manual/en/locale.acceptfromhttp.php
Not guaranteed, but most browsers submit an Accept-Language HTTP header that specifies en-us if they're from the US. Some older browsers only said they are en, though. And not all machines are set up correctly to indicate which locale they prefer. But it's a good first guess.
UK-based English users usually set their system or user locale to English-UK, which in default browser configurations should result in en-gb as the Accept-Language header. (An earlier version of this said en-uk; that was a typo, sorry.) Other countries also have en locales, such as en-za (South Africa), and, primarily theoretically, combinations like en-jp are also possible.
Geo-IP based guesses will less likely be correct on the preferred language/locale, however. Google thinks that content-negotiation based on IP address geolocation makes sense, which really annoys me when I'm in Japan or Korea...
You can check out the HTTP_ACCEPT_LANGUAGE header (from $_SERVER) that most browsers will send.
Take a look at Zend_Locale for an example, or maybe you might even want to use the lib.
You can do some IP comparison without needing a whole library for it.
Solution #1
Use an API; this way nothing is needed on your side. This is a web API that lets you know the country:
Example: http://api.hostip.info/get_html.php?ip=12.215.42.19
Return : Country: UNITED STATES (US)
Solution #2
But have you thought about using the browser's language setting? You might be able to tell the variety of English from it.
Solution #3
The website called BlockCountry lets you get a list of IPs by country. Of course, you do not want to block anyone, but you can use the IP lists and compare against them (get all US IPs...). This might not be accurate, though.
Given your stated purpose, the Accept-Language header is a more suitable solution than IP-based geolocation. Indeed, it's precisely the intended purpose of Accept-Language.
I use HTTP_ACCEPT_LANGUAGE:
$localePreferences = explode(",",$_SERVER['HTTP_ACCEPT_LANGUAGE']);
if(is_array($localePreferences) && count($localePreferences) > 0) {
$browserLocale = $localePreferences[0];
$_SESSION['browser_locale'] = $browserLocale;
}
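The first entry is usually the preferred one, but the header can also carry q weights (for example en;q=0.8). A slightly fuller sketch that sorts by weight and then guesses a region from the first tag that has one; parseAcceptLanguage() and guessRegion() are hypothetical names, and only simple language-REGION tags are handled:

```php
<?php
// Hypothetical helper: returns the language tags from an Accept-Language header,
// lowercased and ordered by descending q value (header order breaks ties).
function parseAcceptLanguage(string $header): array
{
    $entries = [];
    foreach (explode(',', $header) as $index => $part) {
        $pieces = explode(';', trim($part));
        $tag = strtolower(trim($pieces[0]));
        if ($tag === '' || $tag === '*') {
            continue;
        }
        $q = 1.0; // default weight when no q parameter is given
        foreach (array_slice($pieces, 1) as $param) {
            if (preg_match('/^\s*q\s*=\s*([0-9.]+)\s*$/i', $param, $m)) {
                $q = (float) $m[1];
            }
        }
        $entries[] = [$tag, $q, $index];
    }

    usort($entries, static fn(array $a, array $b): int => [$b[1], $a[2]] <=> [$a[1], $b[2]]);

    return array_column($entries, 0);
}

// Hypothetical helper: first two-letter region found in tags such as "en-gb", or null.
function guessRegion(array $tags): ?string
{
    foreach ($tags as $tag) {
        if (preg_match('/^[a-z]{2,3}-([a-z]{2})$/', $tag, $m)) {
            return strtoupper($m[1]);
        }
    }
    return null;
}

$tags = parseAcceptLanguage('en-GB,en;q=0.8,pl;q=0.9');
print_r($tags);
var_dump(guessRegion($tags)); // string(2) "GB"
```

In a page, the header string would come from $_SERVER['HTTP_ACCEPT_LANGUAGE'] ?? '', since not every client sends it.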
Parse $_SERVER["HTTP_ACCEPT_LANGUAGE"] to get country and browser's locale.
For identifying your visitor's country I've used the GeoIP extension, which is very simple to use.
The http://countries.nerd.dk service is what I use for IP-to-country mapping. It works really well and being based on DNS, is cached well too.
You can also download the database for local use if you don't want to rely on an external service.
Or you can do the following:
download 'geoip.dat' and geoip.inc from http://www.maxmind.com/app/geoip_country
in geoip.inc header you will find how to use it (eg. initialize and the rest...)
The GeoIP extension is a good choice.
One thing is which language the viewer wants; another is which ones you can serve:
$SystemLocales = explode("\n", shell_exec('locale -a'));
$BrowserLocales = explode(",",str_replace("-","_",$_SERVER["HTTP_ACCEPT_LANGUAGE"])); // browsers use en-US, Linux uses en_US
for($i=0;$i<count($BrowserLocales);$i++) {
list($BrowserLocales[$i])=explode(";",$BrowserLocales[$i]); //trick for "en;q=0.8"
for($j=0;$j<count($SystemLocales);$j++) {
if ($BrowserLocales[$i]==substr($SystemLocales[$j],0,strlen($BrowserLocales[$i]))){
setlocale(LC_ALL, $SystemLocales[$j]);
break 2; // found and set, so no more check is needed
}
}
}
For example, my system serves only:
C
POSIX
pl_PL.UTF-8
and my browser languages are: pl, en-US, en => so the only matching locale is pl_PL.UTF-8.
When no match is found, setlocale is not called at all.