I get data from a database that is supposed to be UTF-8 encoded, but some old records contain latin1 characters.
So this
$encod = mb_detect_encoding($string, 'UTF-8', true);
is always correct.
Is it safe to always use utf8_decode() to check for latin1 characters like 'äöüß'?
$string = utf8_decode($string);
$search = Array(" ", "ä", "ö", "ü", "ß", "."); //,"/Ä/","/Ö/","/Ü/");
$replace = Array("-", "ae", "oe", "ue", "ss", "-"); //,"Ae","Oe","Ue");
$string = str_replace($search, $replace, strtolower($string));
It seems to work without the utf8_decode() call:
<?php
$string = "äöüß";
$search = Array(" ", "ä", "ö", "ü", "ß", "."); //,"/Ä/","/Ö/","/Ü/");
$replace = Array("-", "ae", "oe", "ue", "ss", "-"); //,"Ae","Oe","Ue");
$string = str_replace($search, $replace, strtolower($string));
echo $string;
?>
DEMO: http://codepad.org/HGTyHkBU
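If the database really does mix UTF-8 and latin1 rows, one option is to normalize each value to UTF-8 first and only then do the replacement. The following is a minimal sketch, not a definitive fix: to_slug() is a hypothetical helper name, it assumes the mbstring extension is loaded, that the source file is saved as UTF-8, and that anything which is not valid UTF-8 is ISO-8859-1. It avoids utf8_decode(), which is deprecated as of PHP 8.2.

```php
<?php
// Hypothetical helper: normalize to UTF-8, lowercase, then transliterate.
// Assumption: input that is not valid UTF-8 is ISO-8859-1 (latin1).
function to_slug(string $string): string
{
    // Strict validity check; valid UTF-8 is left untouched.
    if (!mb_check_encoding($string, 'UTF-8')) {
        $string = mb_convert_encoding($string, 'UTF-8', 'ISO-8859-1');
    }

    // Multibyte-aware lowercasing, so "Ä" becomes "ä" before replacing.
    $string = mb_strtolower($string, 'UTF-8');

    // strtr() with an array replaces every key in a single pass.
    $map = [
        ' ' => '-', '.' => '-',
        'ä' => 'ae', 'ö' => 'oe', 'ü' => 'ue', 'ß' => 'ss',
    ];

    return strtr($string, $map);
}

echo to_slug("\xC4rger im B\xFCro"), "\n"; // latin1 bytes in, prints aerger-im-buero
```

A short latin1 string can occasionally also be valid UTF-8, so the check is a heuristic; fixing the old rows in the database once is usually the more robust route.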
Use htmlspecialchars(); it is safer.
More info:
http://php.net/manual/en/function.htmlspecialchars.php
How can I adjust my code by replacing the "eregi_replace" function with another one that does the same thing?
I know the alternatives, but I do not know how to update the code. The error I get is:
PHP Deprecated: Function eregi_replace
function url($String){
$Separador = "-";
$String = trim($String);
$String = strtolower($String);
$String = strip_tags($String);
$String = eregi_replace("[[:space:]]", $Separador, $String);
$String = eregi_replace("[çÇ]", "c", $String);
$String = eregi_replace("[áÁäÄàÀãÃâÂ]", "a", $String);
$String = eregi_replace("[éÉëËèÈêÊ]", "e", $String);
$String = eregi_replace("[íÍïÏìÌîÎ]", "i", $String);
$String = eregi_replace("[óÓöÖòÒõÕôÔ]", "o", $String);
$String = eregi_replace("[úÚüÜùÙûÛ]", "u", $String);
$String = eregi_replace("(\()|(\))", $Separador, $String);
$String = eregi_replace("(\/)|(\\\)", $Separador, $String);
$String = eregi_replace("(\[)|(\])", $Separador, $String);
$String = eregi_replace("[#®#\$%&\*\+=\|º]", $Separador, $String);
$String = eregi_replace("[;:'\"<>,\.?!_]", $Separador, $String);
$String = eregi_replace("[“”]", $Separador, $String);
$String = eregi_replace("(ª)+", $Separador, $String);
$String = eregi_replace("[´~^°]", $Separador, $String);
$String = eregi_replace("($Separador)+", $Separador, $String);
$String = substr($String, 0, 100);
$String = eregi_replace("(^($Separador)+)|(($Separador)+$)", "", $String);
$String = str_replace("-", $Separador, $String);
return $String;
}
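The ereg functions map onto preg_replace() once each pattern gets delimiters; the case-insensitivity of eregi corresponds to the i modifier, and the u modifier makes character classes match whole UTF-8 characters instead of single bytes. What follows is a hedged sketch rather than a line-for-line port: url_slug() is a hypothetical name, the code assumes UTF-8 input, a UTF-8 source file and the mbstring extension, and it simplifies the long list of punctuation classes into one "anything that is not a-z or 0-9" rule, so characters the original left alone (for example @) also become separators here.

```php
<?php
// Hypothetical preg-based replacement for the url() function above.
function url_slug(string $string): string
{
    $sep = '-';

    $string = strip_tags(trim($string));
    // Lowercase first, so only lowercase accented letters need to be listed.
    $string = mb_strtolower($string, 'UTF-8');

    // preg_replace() accepts parallel arrays of patterns and replacements.
    $patterns     = ['/ç/u', '/[áäàãâ]/u', '/[éëèê]/u', '/[íïìî]/u', '/[óöòõô]/u', '/[úüùû]/u'];
    $replacements = ['c', 'a', 'e', 'i', 'o', 'u'];
    $string = preg_replace($patterns, $replacements, $string);

    // Simplification: every run of other characters collapses into one separator.
    $string = preg_replace('/[^a-z0-9]+/', $sep, $string);

    // Same length limit as the original, then strip leading/trailing separators.
    $string = substr($string, 0, 100);

    return trim($string, $sep);
}

echo url_slug('  Olá, Mundo! (Ação) '), "\n"; // prints ola-mundo-acao
```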
I need to check whether a file name contains special characters that I don't want.
This is the code I'm currently using:
$files = array("logo.png", "légo.png");
$badChars = array(" ", "é", "É", "è", "È", "à", "À", "ç", "Ç", "¨", "^", "=", "/", "*", "-", "+", "'", "<", ">", ":", ";", ",", "`", "~", "/", "", "|", "!", "#", "#", "$", "%", "?", "&", "(", ")", "¬", "{", "}", "[", "]", "ù", "Ù", '"', "«", "»");
$matches = array();
foreach($files as $file) {
$matchFound = preg_match_all("#\b(" . implode("|", $badChars) . ")\b#i", $file, $matches);
}
if ($matchFound) {
$words = array_unique($matches[0]);
foreach($words as $word) {
$results[] = array('Error' => "Forbidden chars found : ". $word);
}
}
else {
$results[] = array('Success' => "OK.");
}
But I have an error saying:
Warning: preg_match_all(): Compilation failed: nothing to repeat at offset 38 in /home/public_html/upload.php on line 138
Which is:
$matchFound = preg_match_all("#\b(" . implode("|", $badChars) . ")\b#i", $file, $matches);
Any help or clue?
This is because ?, * and + are quantifiers. Since they are not escaped, the pattern ends up containing |?, where there is nothing to repeat, hence the error.
For your task you don't need to use an alternation, a character class should suffice:
if (preg_match_all('~[] éèàç¨^=/*-+\'<>:;,`\~/|!##$%?&()¬{}[ù"«»]~ui', $file, $m)) {
$m = array_unique($m[0]);
$m = array_map(function ($i) use ($file) { return array('Error' => 'Forbidden character found : ' . $i . ' in ' . $file); }, $m);
$results = array_merge($results, $m);
}
or perhaps this pattern: ~[^[:alnum:]]~
It's because your character list contains *, which tries to repeat the previous token; in your case that token is |, which is invalid. Your regex turns into:
..... |/|*|-| .....
Map preg_quote() to your character array before your loop and you'll be fine:
$badChars = array_map( 'preg_quote', $badChars);
Since this call does not pass your delimiter # to preg_quote(), make sure to escape it manually in your $badChars array.
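As a sketch of that approach with the delimiter handled explicitly: preg_quote() takes the delimiter as an optional second argument, so a small closure can pass it for every element. The function names bad_char_pattern() and find_bad_chars() are hypothetical, and the character list is shortened for illustration. Two assumptions differ from the question's code: empty strings are filtered out (an empty alternative would match everywhere), and the \b anchors are dropped, because a word boundary around a non-word character such as * only matches when word characters sit on both sides of it.

```php
<?php
// Hypothetical helper: build an alternation from literal characters.
function bad_char_pattern(array $badChars): string
{
    // Drop empty entries, which would otherwise match at every position.
    $badChars = array_filter($badChars, function ($c) {
        return $c !== '';
    });

    // Escape regex metacharacters and the # delimiter in each entry.
    $quoted = array_map(function ($c) {
        return preg_quote($c, '#');
    }, $badChars);

    // The u modifier treats pattern and subject as UTF-8.
    return '#(?:' . implode('|', $quoted) . ')#u';
}

// Hypothetical helper: return the distinct forbidden characters in a name.
function find_bad_chars(string $file, array $badChars): array
{
    preg_match_all(bad_char_pattern($badChars), $file, $m);

    return array_values(array_unique($m[0]));
}

$badChars = [' ', 'é', '*', '+', '?', '#', '(', ')', '[', ']', '|', ''];

print_r(find_bad_chars('légo.png', $badChars));  // contains only "é"
print_r(find_bad_chars('logo.png', $badChars));  // empty array
```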
How can I increase the performance of the following code:
$text = str_replace("A", "B", $text);
$text = str_replace("f", "F", $text);
$text = str_replace("c", "S", $text);
$text = str_replace("4", "G", $text);
//more str_replace here
Do it as one function call:
$text = str_replace(["A","f","c","4"], ["B","F","S","G"], $text);
I have written this method to replace special characters:
function replace_sonder($string)
{
$string2 = str_replace("ä", "&auml;", $string);
$string2 = str_replace("%E4", "&auml;", $string2);
$string2 = str_replace("ö", "&ouml;", $string2);
$string2 = str_replace("%F6", "&ouml;", $string2);
$string2 = str_replace("ü", "&uuml;", $string2);
$string2 = str_replace("%FC", "&uuml;", $string2);
$string2 = str_replace("Ä", "&Auml;", $string2);
$string2 = str_replace("%C4", "&Auml;", $string2);
$string2 = str_replace("Ö", "&Ouml;", $string2);
$string2 = str_replace("%D6", "&Ouml;", $string2);
$string2 = str_replace("Ü", "&Uuml;", $string2);
$string2 = str_replace("%DC", "&Uuml;", $string2);
$string2 = str_replace("ß", "&szlig;", $string2);
$string2 = str_replace("%DF", "&szlig;", $string2);
return $string2;
}
It always returns the same string that I pass in. What am I missing, or is there an alternative way to do this?
$string = preg_replace("/ä/", "&auml;", $string);
...
But a better way is:
$string = htmlentities($string, ENT_QUOTES);
Check that you are not viewing the output in an HTML page, because the browser will convert the entities back into the characters again.
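To see what htmlentities() actually produces, it helps to inspect the result outside a browser, for example on the command line. The sketch below also covers the percent-encoded forms from the question (%E4, %FC and so on), which look like URL-encoded latin1 bytes. to_entities() is a hypothetical helper; it assumes the mbstring extension, and that any input which is not valid UTF-8 after URL-decoding is ISO-8859-1.

```php
<?php
// Hypothetical helper: URL-decode, normalize to UTF-8, then encode entities.
function to_entities(string $string): string
{
    // Turns sequences such as %E4 into raw bytes; other text is unchanged.
    $string = rawurldecode($string);

    // Assumption: bytes that are not valid UTF-8 are latin1.
    if (!mb_check_encoding($string, 'UTF-8')) {
        $string = mb_convert_encoding($string, 'UTF-8', 'ISO-8859-1');
    }

    // Passing the charset explicitly avoids depending on the ini default.
    return htmlentities($string, ENT_QUOTES, 'UTF-8');
}

echo to_entities('Gr%FC%DFe'), "\n"; // prints Gr&uuml;&szlig;e
```

Note that rawurldecode() decodes every percent sequence, so this is only appropriate when the input is known to be URL-encoded.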
Is it possible to make this smoother, with fewer lines of code? I have to repeat it for every new box I need to insert it into.
$fil_namn = str_replace("5FSE_", "", $fil_url);
$fil_namn = str_replace(".pdf", "", $fil_namn);
$fil_namn = str_replace(".docx", "", $fil_namn);
$fil_namn = str_replace(".doc", "", $fil_namn);
$fil_namn = preg_replace("[_]",". ",$fil_namn);
$fil_namn = preg_replace('/^[0-9]+\. +/','', $fil_namn);
$fil_namn = preg_replace ("[AaA]","å",$fil_namn);
$fil_namn = preg_replace ("[AeA]","ä",$fil_namn);
$fil_namn = preg_replace ("[OoO]","ö",$fil_namn);
$fil_namn = preg_replace ("[aAa]","Å",$fil_namn);
$fil_namn = preg_replace ("[aEa]","Ä",$fil_namn);
$fil_namn = preg_replace ("[oOo]","ö",$fil_namn);
$fil_namn= str_replace("."," ", $fil_namn);
You could use this:
$fil_namn = str_replace(array('5FSE_', '.pdf', '.docx', '.doc'), '', $fil_url);
str_replace() accepts arrays for both the search and the replacement arguments.
You can also do this:
$string = "Hello";
echo str_replace(array("H", "e", "l", "o"), array("A", "l", "e", "x"), $string);
This will print out Aeeex.
Another method would be to use the strtr() function:
$string = "[AaA][AeA][OoO][aAa][aEa][oOo]";
$find = array("[AaA]", "[AeA]", "[OoO]", "[aAa]", "[aEa]", "[oOo]");
$replace = array("å", "ä", "ö", "Å", "Ä", "ö");
echo strtr($string, array_combine($find, $replace));
This echoes out:
åäöÅÄö
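The two functions are not interchangeable when a replacement can itself be matched by a later search term. str_replace() with arrays applies the pairs one after another to the running result, while strtr() with an array scans the input once and never re-replaces text it has already produced. A small sketch using the "Hello" example from above ($pairs, $sequential and $singlePass are illustrative names):

```php
<?php
$pairs = ['H' => 'A', 'e' => 'l', 'l' => 'e', 'o' => 'x'];

// Sequential: H->A, then e->l, then every l (including the new one)->e, then o->x.
$sequential = str_replace(array_keys($pairs), array_values($pairs), 'Hello');

// Single pass: each original character is translated exactly once.
$singlePass = strtr('Hello', $pairs);

echo $sequential, "\n"; // Aeeex
echo $singlePass, "\n"; // Aleex
```

For the bracketed placeholders in the question the results are the same either way, since none of the replacements contains another search term.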