Special characters to HTML ASCII Entity Equivalent


How can I convert all special characters to their corresponding HTML entities?
Special characters would be things like $ & / \ { } ( - ' , # etc.
I tried htmlentities() and htmlspecialchars(), but they didn't solve my problem.
Please check the linked table: I want output like the Entity Number, i.e. column 3.
The actual scenario is this: I need to take input from FCKeditor and then save it into the database. So I need to convert every special character in the text to its corresponding HTML entity; otherwise I get an error.

What you are looking for is the ASCII code of a character, so you need to make use of ord().
By the way, what divaka mentioned is right.
Do it like this:
<?php
function getHTMLASCIIEquiv($val)
{
    $arr = ['$', '&', '/', '\\', '{', '}', '(', '-', '\'', ',', '#'];
    $val = str_split($val);
    $str = "";
    foreach ($val as $v) {
        if (in_array($v, $arr)) {
            $str .= "&#" . ord($v) . ";";
        } else {
            $str .= $v;
        }
    }
    return $str;
}
echo getHTMLASCIIEquiv('please check $100 & get email from test#cc.com');
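OUTPUT (as it appears in the page source; a browser renders the entities back as the original characters):
please check &#36;100 &#38; get email from test&#35;cc.com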
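As a variation on the function above, here is a minimal sketch that does not rely on a hand-written list. It assumes that "special character" means any ASCII punctuation or symbol, which may be broader than what the question needs. The name toNumericEntities is hypothetical.

```php
<?php
// Hypothetical helper: replace every ASCII punctuation/symbol character
// with its numeric HTML entity. Letters, digits, whitespace and the bytes
// of multibyte UTF-8 characters pass through unchanged.
function toNumericEntities(string $text): string
{
    return preg_replace_callback(
        '/[[:punct:]]/',
        static function (array $m): string {
            return '&#' . ord($m[0]) . ';';
        },
        $text
    );
}

echo toNumericEntities('test#cc.com & {x}'), "\n";
// prints: test&#35;cc&#46;com &#38; &#123;x&#125;
```

Note that this also encodes characters such as "." and "<", so it should only be applied to plain text, not to markup that must stay intact.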

Related

PHPs strpos does not work as intended with double quoted string

I'm using the following code to return true or false if a string contains a substring in PHP 8.0.
<?php
$username = "mothertrucker"; // This username should NOT be allowed
$banlistFile = file_get_contents("banlist.txt"); // Contains the word "trucker" in it
$banlist = explode("\n", $banlistFile); // Splits $banlistFile into an array, split by line
if (contains($username, $banlist)) {
    echo "Username is not allowed!";
} else {
    echo "Username is allowed";
}
function contains($str, array $arr)
{
    foreach ($arr as $a) { // For each word in the banlist
        if (stripos($str, $a) !== false) { // If I change $a to 'trucker', it works. "trucker" does not
            return true;
        }
    }
    return false;
}
?>
This is to detect whether an inappropriate word is used when creating a username. For example, if someone enters the username "mothertrucker" and "trucker" is in the ban list, I want to deny it.
Right now, with this code, if I type just the word "trucker" as a username, it is found and blocked. Cool. However, if there's more to the string than just "trucker", it isn't detected, so the username "mothertrucker" is allowed.
I discovered that if I explicitly type 'trucker' instead of $a in the stripos call, it works perfectly. However, if I explicitly type "trucker" (with double quotes), it stops working and only blocks when that is the only thing the user entered.
So it looks like the string $a that I'm passing is being interpreted by PHP as a double-quoted string, whereas it needs to be a single-quoted string to be detected properly. But as far as I can tell, I have no control over how PHP passes the variable.
Can I somehow convert it to a single-quoted string? Perhaps the explode call I'm using in line 2 is causing it? Is there another way I can pull the data from a txt document and have it interpreted as a single-quoted string? Hopefully I've made sense with my explanation, but you can copy and paste the code and see for yourself.
Thanks for any help!

One potential problem would be any whitespace (which includes things like \r) could stop the word matching, so just trimming the word to compare with can tidy that up...
stripos($str, $a)
to
stripos($str, trim($a))
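To illustrate why trimming helps, here is a small sketch. It assumes the ban list file uses Windows line endings (\r\n), which the question does not confirm; the file contents are simulated with a string, and the name containsBanned is hypothetical.

```php
<?php
// Simulated file contents with Windows line endings (an assumption).
$banlistFile = "trucker\r\nspammer\r\n";
$banlist = explode("\n", $banlistFile); // ["trucker\r", "spammer\r", ""]

// Hypothetical helper: trims each entry and skips blank lines.
// Skipping blanks matters because in PHP 8 an empty needle matches
// at position 0 of any string.
function containsBanned(string $str, array $words): bool
{
    foreach ($words as $word) {
        $word = trim($word);
        if ($word !== '' && stripos($str, $word) !== false) {
            return true;
        }
    }
    return false;
}

// Without trimming, the needle is "trucker\r", which is not a substring.
var_dump(stripos('mothertrucker', $banlist[0]) !== false); // bool(false)
var_dump(containsBanned('mothertrucker', $banlist));       // bool(true)
var_dump(containsBanned('alice', $banlist));               // bool(false)
```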

I do not know what your file actually contains, so I don't know what the result of explode is.
Anyway, my suggestion (depending on the speed you need, the length of the ban list file, and your level of banning) is to not explode the file and just search it as a whole.
<?php
$username = "allow"; // This username should be allowed
$banlist = "trucker\nmotherfucker\n donot\ngoodword";
var_dump(contains($username, $banlist));
// Note: this looks for the username inside the ban list text, not the other way round.
function contains($str, $arr)
{
    if (stripos($arr, $str) !== false) return true;
    else return false;
}
?>
One drawback: if you want to allow, say, "good", which is an allowed word, it will be rejected in my example because the file contains "goodword". In that case you should not use stripos; keep the loop from your example and compare with strcasecmp instead.
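The strcasecmp variant mentioned above could look like the following sketch. It assumes that a username should be rejected only when it equals a banned word (ignoring case), and the name isExactlyBanned is hypothetical.

```php
<?php
// Hypothetical helper: true only when the username equals a banned word,
// compared case-insensitively. strcasecmp() returns 0 for equal strings.
function isExactlyBanned(string $username, array $banlist): bool
{
    foreach ($banlist as $word) {
        if (strcasecmp($username, trim($word)) === 0) {
            return true;
        }
    }
    return false;
}

$banlist = explode("\n", "trucker\n donot\ngoodword");
var_dump(isExactlyBanned('good', $banlist));     // bool(false)
var_dump(isExactlyBanned('GoodWord', $banlist)); // bool(true)
```

With exact comparison, "mothertrucker" would be allowed again, so this only fits when substring matching is not wanted.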

Emoji name "family_mothers_one_boy" or "woman-woman-boy"?

I have a reference emoji file used by my PHP code. It contains, for example, "woman-woman-boy", but the browser (Chrome) replaces this name with "family_mothers_one_boy"...
Why are there two versions of the emoji names?
Is there an error (or several) in my file, or do I have to do something in my code to avoid the conversion?
NOTE:
The code related to this emoji is:
1F469;&#x200D;&#x1F469;&#x200D;&#x1F466
Here are the two functions I'm using to manage the emojis:
1. When I display the emoji, I replace the tag :name: with the HTML rendering (using Unicode)
function replaceEmojiNameByUnicode($inputText){
    $emoji_unicode = getTabEmojiUnicode();
    preg_match_all("/:([a-zA-Z0-9'_+-]+):/", $inputText, $emojis);
    foreach ($emojis[1] as $emojiname) {
        if (isset($emoji_unicode[$emojiname])) {
            $inputText = str_replace(":".$emojiname.":", "&#x".$emoji_unicode[$emojiname].";", $inputText);
        }
        else {
            $inputText = str_replace(":".$emojiname.":", "(:".$emojiname.":)", $inputText);
        }
    }
    return $inputText;
}
2. When I want to offer the list of emojis, I display an HTML SELECT in the page. The following function returns the list of options to add inside it:
/* Display the options in the HTML select */
function displayEmojisOptions(){
    $emoji_unicode = getTabEmojiUnicode();
    foreach ($emoji_unicode as $name => $unicode) {
        echo '<option value="&#x'.$unicode.';">'.$name.' => &#x'.$unicode.';</option>';
    }
}
In the array $emoji_unicode there is one entry like this (with 3 semicolons removed so that the emoji is not displayed here):
'family_one_girl' => '1F468;&#x200D&#x1F469&#x200D&#x1F467',
For example, in order to make it work, I have to replace the line 'thinking_face' => '1F914', with 'thinking' => '1F914',
My question is: why?
Thank you
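For readers following the array format above, here is a minimal sketch of how a multi-code-point value can be assembled. It assumes the convention visible in the two functions: the stored value omits the leading "&#x" and the trailing ";", which the rendering code adds. The name buildEmojiValue is hypothetical.

```php
<?php
// Hypothetical helper: join hex code points into the stored value format,
// e.g. for a sequence joined with U+200D (zero width joiner).
function buildEmojiValue(array $hexCodePoints): string
{
    return implode(';&#x', $hexCodePoints);
}

$value = buildEmojiValue(['1F469', '200D', '1F469', '200D', '1F466']);
echo $value, "\n";
// prints: 1F469;&#x200D;&#x1F469;&#x200D;&#x1F466

// What the rendering code would emit for this entry:
echo '&#x' . $value . ';', "\n";
// prints: &#x1F469;&#x200D;&#x1F469;&#x200D;&#x1F466;
```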

Nope, the emoji text was not changed by any code... I guess it was due to a wrong emoji file I was using. I corrected all the emojis manually and now I no longer see the mismatch.
If someone needs the corrected file, I can provide it.

if string is equal to alt+0173

In the Facebook comment section, when I type alt+0173 and press Enter, it submits my comment as an empty comment. I want to avoid this on my website, so I use the following code.
if ($react == '') {
    # do nothing
} else {
    # insert data
}
But it didn't work: it inserts the data as the letter "A" with two dots on top. When I copy and paste it, it shows as "Â".
I also tried the following code, but it didn't work either.
if ($react == '' || $react == 'Â') {
    # do nothing
} else {
    # insert data
}

I didn't verify this, but I think this is your solution:
Alt+0173 produces character 173 (0xAD), called the soft hyphen. Strictly speaking it lies outside 7-bit ASCII; 173 is its code in Latin-1 and Windows-1252.
It is sometimes used to get past security scripts, because you see no space but there is a character. A blocked word written as bloc + character 173 + ked is shown on screen as "blocked", but is sometimes not picked up by the security script.
The following line prevents use of this character by removing it (it has no good use anyway).
Put it before your if/else lines.
$string = str_replace(chr(173), "", $string);
In your case:
$react = str_replace(chr(173), "", $react);
So if the string contains only the alt+0173 character, it should now be empty.
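One caveat, offered as a possible explanation rather than a verified diagnosis: if $react arrives as UTF-8 (an assumption; the question does not state the encoding), the soft hyphen is the two bytes 0xC2 0xAD, and removing only byte 173 leaves 0xC2 behind. Read as Latin-1, 0xC2 is "Â". A minimal sketch:

```php
<?php
// Assumption: the input is UTF-8, where U+00AD is encoded as 0xC2 0xAD.
$react = "\u{00AD}";

// Removing only byte 173 (0xAD) leaves the lead byte 0xC2 in the string.
$byteWise = str_replace(chr(173), '', $react);
var_dump(bin2hex($byteWise)); // string(2) "c2"

// Removing the whole code point empties the string.
$clean = str_replace("\u{00AD}", '', $react);
var_dump($clean === ''); // bool(true)
```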
Update:
But...
In your case something strange is happening: you say your input is alt+0173, but you get an Ä, which is chr(142).
Even stranger, when I asked you to convert the character back to its code with ord($react), you got 97, which is a lowercase 'a'.
You stated that you use Ajax, but my knowledge of Ajax is minimal, so I can't help you there. Maybe someone else can; I hope I have clarified the case a bit.
My best guess is that something changes the value of $react on its way from the form to the PHP script, and you should look there.

This method helped me solve the problem.
Source: Remove alt-codes from string
$unwanted_array = array('Ä' => 'A');
$react = strtr($react, $unwanted_array);
$newreact = preg_replace("/[^A-Za-z]+/i", " ", $react);
if ($newreact == "" || $newreact == " ") {
    # do nothing
} else {
    # insert data
}

preg_match_all not working when trying to get all characters including whitespace

Here is my function:
function grabstuff()
{
    foreach (glob("../folder/*.php") as $fn)
    {
        $file = file_get_contents($fn);
        preg_match_all("#\{\('(\w+)'\)}#", $file, $matches);
        foreach ($matches[1] as $match)
        {
            $query = ("ALTER TABLE xxxxx ADD COLUMN `$match` LONGTEXT AFTER xxxxx;");
            $result = mysql_query($query) or die("Err: ".mysql_error());
        }
    }
}
And here is what it looks for on the pages:
<?/*{('test test')}*/?>
It ignores this instance because it contains whitespace. It works well for testtest and test_test. I'm not getting any PHP or MySQL errors. Do I need to use \w+\S or \w+\W? I tried both of those, and even (...), and it still didn't work. How do I get the function above to recognize any characters within the {('')}, whether they are normal alphabetic characters or whitespace? I'm sure this is simple, but I've researched on Google and here and wasn't able to find a solution. (There will be multiple instances of {('')} on any given page, if that helps.) I've been using this function for a while now, but would like to add the ability to include whitespace. Thanks!

'(\w+)'
\w matches all alphanumeric characters and the underscore; it does not include whitespace. Just use a character class:
'([\w\s]+)'
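A quick check of the adjusted pattern, as a sketch. The sample string is made up for illustration and stands in for the contents of one of the files:

```php
<?php
// Made-up sample content containing two markers, one with a space.
$file = "<?/*{('test test')}*/?> <?/*{('test_test')}*/?>";

// Same pattern as in the question, with \w+ replaced by [\w\s]+.
preg_match_all("#\{\('([\w\s]+)'\)}#", $file, $matches);
print_r($matches[1]);
```

This should list both "test test" and "test_test". Keep in mind that \s also matches tabs and newlines, so a plain space in the class may be preferable if only spaces are expected.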

Determine if a character is alphabetic

Having problems with this.
Let's say I have a parameter composed of a single character and I only want to accept alphabetic characters. How will I determine that the parameter passed is a member of the latin alphabet (a–z)?
By the way, I'm using PHP Kohana 3.
Thanks.

http://php.net/manual/en/function.ctype-alpha.php
<?php
$ch = 'a';
if (ctype_alpha($ch)) {
    // Accept
} else {
    // Reject
}
This also takes locale into account if you set it correctly.
EDIT: To be complete, other posters here seem to think that you need to ensure the parameter is a single character, or else the parameter is invalid. To check the length of a string, you can use strlen(). If strlen() returns any non-1 number, then you can reject the parameter, too.
As it stands, your question at the time of answering, conveys that you have a single character parameter somewhere and you want to check that it is alphabetical. I have provided a general purpose solution that does this, and is locale friendly too.

Use the following guard clause at the top of your method:
if (!preg_match("/^[a-z]$/", $param)) {
    // throw an Exception...
}
If you want to allow upper case letters too, change the regular expression accordingly:
if (!preg_match("/^[a-zA-Z]$/", $param)) {
    // throw an Exception...
}
Another way to support case insensitivity is to use the /i case insensitivity modifier:
if (!preg_match("/^[a-z]$/i", $param)) {
    // throw an Exception...
}
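One detail worth knowing about the patterns above: in PCRE, "$" also matches just before a final newline, so a parameter such as "a\n" passes the check. The sketch below shows two ways to anchor strictly at the end of the string, the \z assertion and the D modifier:

```php
<?php
// "$" matches at the very end or just before a trailing newline.
var_dump(preg_match('/^[a-z]$/', "a\n"));  // int(1)

// \z matches only at the very end of the subject.
var_dump(preg_match('/^[a-z]\z/', "a\n")); // int(0)

// The D modifier makes "$" match only at the very end.
var_dump(preg_match('/^[a-z]$/D', "a\n")); // int(0)

var_dump(preg_match('/^[a-z]\z/', "a"));   // int(1)
```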

preg_match('/^[a-zA-Z]$/', $var_vhar);
preg_match() returns an int: 0 when there is no match and 1 when there is a match (it returns false if an error occurs).

I'd use ctype, as Nick suggested, since it is not only faster than regex, it is even faster than most of the string functions built into PHP. But you also need to make sure it is a single character:
if (ctype_alpha($ch) && strlen($ch) == 1) {
    // Accept
} else {
    // Reject
}

You can't use [a-zA-Z] for Unicode.
Here is an example that works with Unicode:
if (preg_match('/^\p{L}+$/u', 'my text')) {
    echo 'match';
} else {
    echo 'not match';
}
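Since the question asks about a single character, the Unicode pattern can be narrowed by dropping the "+". This is a sketch that assumes the input is valid UTF-8; the name isSingleLetter is hypothetical.

```php
<?php
// Hypothetical helper: true when $ch is exactly one Unicode letter.
// The u modifier treats pattern and subject as UTF-8; \z anchors at the
// very end of the string.
function isSingleLetter(string $ch): bool
{
    return preg_match('/^\p{L}\z/u', $ch) === 1;
}

var_dump(isSingleLetter("\u{00E9}")); // bool(true), U+00E9 is a letter
var_dump(isSingleLetter('a'));        // bool(true)
var_dump(isSingleLetter('ab'));       // bool(false)
var_dump(isSingleLetter('1'));        // bool(false)
```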

Hopefully this will help. There is a simple function in PHP called ctype_alpha:
$mystring = 'a';
if (ctype_alpha($mystring))
{
    // Then do the code here.
}

You can try:
preg_match('/^[a-zA-Z]$/',$input_char);
The return value of the call above is 1 if $input_char is a single letter of the alphabet, and 0 otherwise. You can make use of the return value accordingly.
