I already searched here and found similar posts related to this post but i did not find a solution yet .
i tried this :
$text = "الحمد لله رب العالمين hello";
echo $is_arabic = preg_match('/\p{Arabic}/u', $text);
I add the unicode flag but if i add any English characters it is returning true ! any fix for this ?
Any idea folks ?
Thanks in advance
Use unicode flag:
$text = "الحمد لله رب العالمين";
echo $is_arabic = preg_match('/\p{Arabic}/u', $text);
here __^
If you want to match only arabic you should do:
echo $is_arabic = preg_match('/^[\s\p{Arabic}]+$/u', $text);
Update: I see I am apparently wrong about classes not being supported (though the docs at say "Extended properties such as "Greek" or "InMusicalSymbols" are not supported by PCRE" but the comment at http://php.net/manual/en/regexp.reference.unicode.php#102756 says they are supported), so I guess M42's is the better answer. They can, however, be done with ranges as follows:
$text = "الحمد لله رب العالمين";
echo $is_arabic =
preg_match('/^[\s\x{0600}-\x{06FF}\x{0750}-\x{077F}\x{08A0}-\x{08FF}\x{FB50}-\x{FDFF}\x{FE70}-\x{FEFF}\x{10E60}\x{10E60}—\x{10E7F}\x{1EE00}—\x{1EEFF}]+$/u', $text);
Related
I'm trying to use a regex to find and replace all URLs in a forum system. This works but it also selects anything that is within bbcode. This shouldn't be happening.
My code is as follows:
<?php
function make_links_clickable($text){
return preg_replace('!(([^=](f|ht)tp(s)?://)[-a-zA-Zа-яА-Я()0-9#:%_+.~#?&;//=]+)!i', '$1', $text);
}
//$text = "https://www.mcgamerzone.com<br>http://www.mcgamerzone.com/help/support<br>Just text<br>http://www.google.com/<br><b>More text</b>";
$text = "#Theareak We know this and [b][url=https://www.mcgamerzone.com/news/67/False-positive-proxy-bans-and-bot-attacks]here[/url] [/b]is an explanation, we are trying to fix this asap! https://www.mcgamerzone.com/news/67/False-positive-proxy-bans-and-bot-attacks aaa";
echo "<b>Unparsed text:</b><br>";
echo $text;
echo "<br><br>";
echo "<b>Parsed text:</b><br>";
echo make_links_clickable($text);
?>
All urls that occur in bb-code are following up on a = character, meaning that I don't want anything that starts with = to be selected.
I basically have that working but this results in selecting 1 extra character in in front of the string that should be selected.
I'm not very familiar with regex. The final output of my code is this:
<b>Unparsed text:</b><br>
#Theareak We know this and [b][url=https://www.mcgamerzone.com/news/67/False-positive-proxy-bans-and-bot-attacks]here[/url] [/b]is an explanation, we are trying to fix this asap! https://www.mcgamerzone.com/news/67/False-positive-proxy-bans-and-bot-attacks aaa<br>
<br>
<b>Parsed text:</b><br>
#Theareak We know this and [b][url=https://www.mcgamerzone.com/news/67/False-positive-proxy-bans-and-bot-attacks]here[/url] [/b]is an explanation, we are trying to fix this asap! https://www.mcgamerzone.com/news/67/False-positive-proxy-bans-and-bot-attacks aaa
You can match and skip [url=...] like this:
\[url=[^\]]*](*SKIP)(?!)|(((f|ht)tps?://)[-a-zA-Zа-яёЁА-Я()0-9#:%_+.\~#?&;/=]+)
See regex demo
That way, you will only match the URLs outside the [url=...] tag.
IDEONE demo:
function make_links_clickable($text){
return preg_replace('~\[url=[^\]]*](*SKIP)(?!)|(((f|ht)tps?://)[-a-zA-Zа-яёЁА-Я()0-9#:%_+.\~#?&;/=]+)~iu', '$1', $text);
}
$text = "#Theareak We know this and [b][url=https://www.mcgamerzone.com/news/67/False-positive-proxy-bans-and-bot-attacks]here[/url] [/b]is an explanation, we are trying to fix this asap! https://www.mcgamerzone.com/news/67/False-positive-proxy-bans-and-bot-attacks aaa";
echo "<b>Parsed text:</b><br>";
echo make_links_clickable($text);
You can use a negative lookbehind (?<!=) instead of your negated class. It asserts that what is going to be matched isn't preceded by something.
Example
My users sometimes use chinese characters for the title of their input.
My slugs are in the format of /stories/:id-:name where an example could be /stories/1-i-love-php.
How do I allow chinese characters?
I have googled and found the japanese version of this answer over here.
Don't quite understand Japanese, so I am asking about the chinese version.
Thank you.
i have tested in Bengali characters
it may work. try this:
at first the coded page (write code where in the page) have to convert into encoding type in UTF-8, then write code.
code here:
function to_slug($string, $separator = '-') {
$re = "/(\\s|\\".$separator.")+/mu";
$str = #trim($string);
$subst = $separator;
$result = preg_replace($re, $subst, $str);
return $result;
}
$id=34;
$string_text="আড়াইহাজারে দেড় বছরের --- শিশুর -গলায় ছুরি";
$base_url="http://example.com/";
echo $target_url=$base_url.$id."-". #to_slug($string_text);
var_dump($target_url);
output:
http://example.com/34-আড়াইহাজারে-দেড়-বছরের-শিশুর-গলায়-ছুরি
string 'http://example.com/34-আড়াইহাজারে-দেড়-বছরের-শিশুর-গলায়-ছুরি' (length=136)
I plan to use my custom function below while getting from data from my mysql table & print it as html. Since htmlspecialchars() translate tags to html entities, I retranslate them ( p, br, strong) to tags. My question is: Is it efficient enough or Is there any other shorter or more efficient way to achieve this aim? If you know any, can you please guide me with at least keywords? I can look fort he details in php.net and this site. Thanks, regards
function safe_output_from_mysql($safe_echo_to_html)
{
$safe_echo_to_html = mb_convert_encoding($safe_echo_to_html, 'UTF-8', mb_detect_encoding($safe_echo_to_html));
$safe_safe_echo_to_html = htmlspecialchars($safe_echo_to_html, ENT_QUOTES, "UTF-8");
$safe_echo_to_html = preg_replace("<br />","<br />",$safe_echo_to_html);
$safe_echo_to_html = preg_replace("<p>","<p>",$safe_echo_to_html);
$safe_echo_to_html = preg_replace("</p>","</p>",$safe_echo_to_html);
$safe_echo_to_html = preg_replace("<strong>","<strong>",$safe_echo_to_html);
$safe_echo_to_html = preg_replace("</strong>","</strong>",$safe_echo_to_html);
return $safe_echo_to_html;
}
There is no need to call preg_replace() multiple times. You can use a single pattern to match all the desired tags:
preg_replace('/<\s*(\/?(?:strong|p|br)\s*\/?)>/i', '<\1>', $s);
I'm assuming, of course, that you're actually planning to use regex to do the match. If the search strings are straight text, then strtr() is more efficient.
htmlspecialchars_decode: http://www.php.net/manual/en/function.htmlspecialchars-decode.php
This function is the opposite of htmlspecialchars(). It converts special HTML entities back to characters.
$str = "<p>this -> "</p>\n";
echo htmlspecialchars_decode($str);
The above example will output:
<p>this -> "</p>
Please see the function htmlspecialchars_decode($str); function.
just a quick question about Regular expressions: Will this code work for any grooming I will need to do? (i.e. Can this be inputted into a database and be safe?)
function markdown2html($text) {
$text = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
// Strong Emphasis
$text = preg_replace('/__(.+?)__/s', '<strong>$1</strong>', $text);
$text = preg_replace('/\*\*(.+?)\*\*/s', '<strong>$1</strong>', $text);
// Underline
$text = preg_replace('/_([^_]+)_/', '<p style="text-decoration: underline;">$1</p>', $text);
//Italic
$text = preg_replace('/\*([^\*]+)\*/', '<em>$1</em>', $text);
// Windows to Unix
$text = str_replace('\r\n', '\n', $text);
// Macintosh to Unix
$text = str_replace('\r', '\n', $text);
//Paragraphs
$text = '<p>' . str_replace("\n\n", '</p><p>', $text) . '</p>';
$text = str_replace("\n", '<br />', $text);
// [Linked Text](Url)
$text = preg_replace('/\[([^\]]+)]\(([a-z0-9._~:\/?##!$&\'()*+,;=%]+)\)/i', '$1', $text);
return $text;
}
No, absolutely not.
Your code has nothing to do with SQL -- it does not modify ' or \ characters at all. Commingling the formatting functionality of this function with SQL escaping is silly.
Your code may also introduce HTML injection in some situations -- I'm particularly suspicious of the URL linking regex. Without a proper parser involved, I would not trust it an inch.
No, the data can not assured to be safe after passing through that function.
You need to either escape sql-sensitive characters or use PDO/Mysqli. Preapared statements are much more handy anyway.
Don't use the old way of hacking together a query, ie:
$query = 'select * from table where col = '.$value;
You're just asking for trouble there.
A couple of things jumped out at me:
I believe that the first two regexs ('/__(.+?)__/s' and the corresponding one for *) handle ___word___ and ***word*** incorrectly –– they will treat the third character as part of the word, so you will get *word* (where the first * is bold and the trailing one is not) instead of word.
On the third one ('/_([^_]+)_/'), is it really appropriate for
do _not_ do that
to turn into
do <p style="text-decoration: underline;">not</p> do that
?
Of course I’m not saying that it’s OK to use if you fix these issues.
I'm grabbing a string from the database that could be something like String’s Title however I need to replace the ’ with a ' so that I can pass the string to an external API. I've used just about every variation of escaped strings in str_replace() that I can think of to no avail.
$stdin = mb_str_replace('’', '\'', $stdin);
Implementation of mb_str_replace() here: http://www.php.net/manual/en/ref.mbstring.php#107631
I mean this:
<?php
function mb_str_replace($needle, $replacement, $haystack) {
return implode($replacement, mb_split($needle, $haystack));
}
echo mb_str_replace('’', "'", "String’s Title");
It may solve encoding problems.
I have just tested this:
echo str_replace('’', "'", $cardnametitle);
//Outputs: String's Title
Edit: I believe that entries in your database have been htmlentitiesed.
Note: I'm pretty sure this is not a good solution, even though it did solve your problem I think there should be a better way to do it.
Try this
$s = "String’s Title";
$h = str_replace("’","'",$s);
echo $h;
Also can Try with preg_replace
echo preg_replace('/\’/',"'","String’s Title");
I don't know why str_replace() is not working for you.
I feel you haven't tried it in correct way.
Refer LIVE DEMO
<?php
$str = "String’s Title";
echo str_replace('’', '\'', $str) . "\n";
echo str_replace("’", "'", $str);
?>
OUTPUT:
String's Title
String's Title
UPDATE 1:
You may need to try setting the header as
header('Content-Type: text/html; charset=utf-8');
I came across a similar issue trying to replace apostrophes with underscores... I ended up having to write this (and this was for a WordPress site):
$replace = array(",","'","’"," ","’","–");
$downloadTitle = str_replace( $replace,"_",get_the_title($gallery_id));
I'm new to PHP myself, and realize this is pretty hideous code, but it worked for me. I realized it was the "’" that REALLY needed to be factored in for some reason.