How to make Stripos("UTF-8") in PHP? [duplicate] - php

I am using the stripos function to check whether a string occurs inside another string, ignoring case.
Here is the problem:
stripos("ø", "Ø")
returns false. While
stripos("Ø", "Ø")
returns true.
As you can see, the function does NOT do a case-insensitive search in this case.
The function has the same problems with characters like Ææ and Åå. These are Danish characters.

Use mb_stripos() instead. It's character set aware and will handle multi-byte character sets. stripos() is a holdover from the good old days when there was only ASCII and all chars were only 1 byte.
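A minimal sketch of that suggestion, assuming the mbstring extension is loaded and the strings are UTF-8. The characters are written as byte escapes so the result does not depend on how the source file is saved; the sample word is only an illustration.

```php
<?php
// "Ø" (U+00D8) and "ø" (U+00F8) as UTF-8 byte sequences.
$upper = "\xC3\x98";
$lower = "\xC3\xB8";

// Multibyte-aware, case-insensitive search with the encoding stated explicitly.
$pos = mb_stripos($upper, $lower, 0, 'UTF-8');
var_dump($pos); // int(0): found at character offset 0

// Offsets are counted in characters, not bytes.
$word = "Sm" . $lower . "rrebr" . $lower . "d"; // "Smørrebrød"
$posInWord = mb_stripos($word, $upper . "R", 0, 'UTF-8');
var_dump($posInWord); // int(2)
```

Because 0 is a valid position, compare the result with === false rather than treating it as a boolean.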

You need mb_stripos.

mb_stripos will take care of this.

As the other answers say, try mb_stripos() first. If that doesn't help, check the encoding of your PHP file: convert it to UTF-8 and save it. That did the trick for me after hours of research.

Related

How to get only integer value Started from Symbol from given string

I need the numeric value that follows either £ or Â£. I tried to do this with a regex, but I only get the values that start with £.
Here is the regex I use:
if(preg_match('/(\£[0-9]+(\.[0-9]{2})?)/',$vals,$matches))
{
$main[]= str_replace('£','',$matches[0]);
}
I am not familiar with regex, so please share any solution. Any help would be highly appreciated. Thank you.
From your question I understand that you are having trouble with character encodings, so first of all I would suggest addressing that issue one step earlier; it is really important to resolve encoding issues at the earliest possible step.
Back to the question: to avoid falling deeper into charset encoding hell, I would recommend writing the characters in your regexp literal as hex escapes, because otherwise the encoding in which you save your PHP files affects the result. For example, if you do something like this:
preg_match('/(£|Â£)(\d+)/', ...)
It would match the bytes \xa3 ("£") and \xc2\xa3 ("Â£") if you save your source code in ISO-8859-1, but it would actually match \xc2\xa3 and \xc3\x82\xc2\xa3 if you save your source code in UTF-8 (which might be a good idea in general). So be careful with this, and verify what your editor/IDE is doing!
My suggestion thus is to write it this way, which behaves the same whether the file is saved as ISO-8859-1 or UTF-8:
preg_match('/(\xa3|\xc2\xa3)(\d+)/', ...) // match "£" encoded as ISO-8859-1 or as UTF-8
I also suggest using the sub-pattern capture feature of regular expressions, so you don't have to call str_replace() afterwards, like this:
if (preg_match('/(?:\xa3|\xc2\xa3)([0-9]+(?:\.[0-9]{2})?)/', $data, $regp)) {
$main[] = $regp[1];
}
The "?:" right after the "(" means "this is a sub-pattern, but don't capture it".
Note that you can also replace preg_match with preg_match_all and you will find in $regp[1] the array of all matching numbers already prepared.
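As a rough sketch of the preg_match_all variant mentioned above: the sample strings and the variable names $utf8Input, $latin1Input, $m and $n are made up for illustration, and the pound sign is written as raw bytes so the example does not depend on the file encoding.

```php
<?php
// Pattern from the answer: "£" as ISO-8859-1 (\xa3) or UTF-8 (\xc2\xa3),
// followed by digits and an optional two-digit decimal part.
$pattern = '/(?:\xa3|\xc2\xa3)([0-9]+(?:\.[0-9]{2})?)/';

// Hypothetical input where the pound sign is UTF-8 encoded.
$utf8Input = "Shirt \xc2\xa312.50, socks \xc2\xa37";
preg_match_all($pattern, $utf8Input, $m);
print_r($m[1]); // the captured numbers: "12.50" and "7"

// Hypothetical input where the pound sign is a single ISO-8859-1 byte.
$latin1Input = "Total: \xa399";
preg_match_all($pattern, $latin1Input, $n);
print_r($n[1]); // the captured number: "99"
```

The captured values are still strings; cast them with (float) or (int) if you need numbers.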
Try this modified regex:
(?:£|Â£)([0-9]+(\.[0-9]{2})?)
It should do the trick. But it will also return decimal values, because of this part:
(\.[0-9]{2})?
You can remove it, and then only the integer part after £ or Â£ is returned.

PHP strpos says different croatian chars are the same: š č

I have the following code:
$text = 'Tomáš';
echo strpos($text, "č");
# result is 4
I believe they are different chars so why is PHP telling me they are the same?
What is going on and how can I correct this?
The encoding you chose to save your source code file in cannot encode the characters you're trying to save. Whatever characters PHP is seeing, it's not comparing the strings you think it is. Save your source code in an encoding that can encode all characters, preferably UTF-8.
You should try the mb_strpos function. From the manual:
Performs a multi-byte safe strpos() operation based on number of characters. The first character's position is 0, the second character position is 1, and so on.
With a regular setup, your code returns false for me.
However, if you have trouble with such special characters, using mb_strpos instead of strpos should help.
http://php.net/manual/en/function.mb-strpos.php
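A small sketch of how the two functions differ on UTF-8 input, assuming the mbstring extension is available. The strings are spelled as byte escapes so the file encoding cannot change them; the variable names are illustrative only.

```php
<?php
$text   = "Tom\xC3\xA1\xC5\xA1"; // "Tomáš" in UTF-8
$sCaron = "\xC5\xA1";            // "š"
$cCaron = "\xC4\x8D";            // "č"

// strpos() counts bytes: "á" occupies two bytes, so "š" starts at byte 5.
$bytePos = strpos($text, $sCaron);
var_dump($bytePos); // int(5)

// mb_strpos() counts characters: "š" is the fifth character, offset 4.
$charPos = mb_strpos($text, $sCaron, 0, 'UTF-8');
var_dump($charPos); // int(4)

// "č" does not occur in the string at all.
$missing = mb_strpos($text, $cCaron, 0, 'UTF-8');
var_dump($missing); // bool(false)
```

If both characters are reported as equal, printing bin2hex() of each string shows which bytes PHP actually received.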

Substr not working with html tags and entities

I have gone through the following question:
substr() not working
but it did not work for me :(
I am facing the same problem. I am using nicEditor, and at insert time I do htmlentities(addslashes(urlencode($description))).
When I view the description, it is displayed correctly, but when I use substr() it returns nothing,
like:
substr($description,0,10)
$description contains the content and it is fine, present in db, works without substr()
Please provide a var_dump() of $description and a bit more of the code that runs before $description is filled in, so we can see whether there is another problem.
Try this one
Use mb_substr for multibyte character encodings like UTF-8. substr just counts bytes, while mb_substr counts characters.
substr() is only safe for single-byte encodings.
http://php.net/manual/en/function.mb-substr.php
Source: PHP Substr Function Trimming Problem
This happens because in UTF-8 characters are not restricted to one byte; they have a variable length of between 1 and 4 bytes to cover all Unicode characters.
A safe way of cutting these strings without losing anything is to use the mb_substr PHP function instead. It works almost the same way as substr, but the difference is that you can add an extra parameter to specify the encoding, whether it is UTF-8 or a different encoding.
Source: http://osc.co.cr/extracting-a-substring-from-a-utf-8-string-in-php/
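To make the byte versus character distinction concrete, here is a short sketch, assuming mbstring is loaded and the data is UTF-8. The sample word is made up and written as byte escapes; it is not taken from the question.

```php
<?php
$s = "\xC3\x86r\xC3\xB8sk\xC3\xB8bing"; // "Ærøskøbing" in UTF-8

$byteLen = strlen($s);             // number of bytes
$charLen = mb_strlen($s, 'UTF-8'); // number of characters
var_dump($byteLen, $charLen);      // int(13), int(10)

// substr() cuts after 4 bytes, which splits the two-byte "ø" in half.
$byteCut = substr($s, 0, 4);
$byteCutIsValid = mb_check_encoding($byteCut, 'UTF-8');
var_dump($byteCutIsValid); // bool(false): the result is no longer valid UTF-8

// mb_substr() cuts after 3 characters and keeps "ø" intact.
$charCut = mb_substr($s, 0, 3, 'UTF-8');
var_dump($charCut === "\xC3\x86r\xC3\xB8"); // bool(true): "Ærø"
```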

Changing case with regex

I have been looking for this for a while, but was not able to find an answer. I need to change a string to lowercase in PHP.
Of course, this can be done with strtolower(), but I was wondering if it's possible to do it via preg_replace().
I noticed that in vim one can use the \L or \U modifiers in back references to change the case to lower or upper.
Is something like that possible in PHP, i.e. in the second argument of preg_replace()? The reason I want to change the case via preg_replace() is that I heard it might work better for UTF-8 strings (not sure if that's true).
Thanks.
You should actually just use
mb_strtolower($str, 'UTF-8')
That way you specify UTF-8 as the encoding, and all should work well.
Edit: sorry, I had strtoupper; changed it to lower. Also, you can leave off 'UTF-8', in which case the function uses the internal encoding (see mb_internal_encoding()), so make sure that is set to UTF-8.
Doing this with preg_replace() alone is practically impossible.
Its replacement string has no case-conversion escapes like vim's \L or \U, so preg_replace() cannot change case on its own. You would have to run strtolower() / strtoupper() on the matches yourself, for example through preg_replace_callback().
Go with the function Dave suggested.
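For illustration only, a sketch of both routes, assuming mbstring is loaded and the input is UTF-8. The sample strings and the /\p{Lu}+/u pattern are my own choices, not something from the answers above.

```php
<?php
// Direct route: lowercase the whole string with an explicit encoding.
$name = "\xC3\x86\xC3\x98\xC3\x85"; // "ÆØÅ" in UTF-8
$lowerName = mb_strtolower($name, 'UTF-8');
var_dump($lowerName === "\xC3\xA6\xC3\xB8\xC3\xA5"); // bool(true): "æøå"

// Regex route: preg_replace_callback() hands each match to a function,
// which is where the case conversion has to happen.
$input = "HELLO \xC3\x98RSTED"; // "HELLO ØRSTED"
$result = preg_replace_callback(
    '/\p{Lu}+/u', // runs of uppercase letters; /u treats the subject as UTF-8
    function (array $m) {
        return mb_strtolower($m[0], 'UTF-8');
    },
    $input
);
var_dump($result === "hello \xC3\xB8rsted"); // bool(true): "hello ørsted"
```

The callback version only pays off when part of the string must keep its case; for whole strings mb_strtolower() is simpler.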

Get last letter from arabic string Php

I've been working with Arabic characters for a while now.
Look at this:
$string = "السلام";
Works perfectly when I print it.
But. I want to get the last letter, "م".
I've tried
$string[strlen($string]-1)];
Tried substring too.
Getting this output: �
SOLVED:
Forgot to add: mb_internal_encoding("UTF-8");
Thanks a lot guys!
You're trying to use byte-type operations on a multi-byte string (utf-8? -16?) You need to use the mb_*() functions to work with multi-byte strings: http://php.net/mb_substr
Try this:
<?php
mb_internal_encoding("UTF-8");
$string = "السلام";
echo mb_substr($string, -1);
?>
Your code is also not correct (there is a syntax error):
$string[strlen($string]-1)];
The "]" after $string should be ")", and the ")" after -1 should be removed:
$string[strlen($string)-1];
You should use mb_strlen for multibyte strings. These characters take more than one byte, therefore when you fetch them with native non-mb functions, you take only one part of the character, which is usually some gibberish. mb_* functions take care of that.
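The "gibberish" can be made visible with a short sketch, assuming mbstring is loaded and the string is UTF-8. The word from the question is written as byte escapes here so the example does not depend on the file encoding.

```php
<?php
// "السلام" as UTF-8 bytes; each Arabic letter takes two bytes.
$string = "\xD8\xA7\xD9\x84\xD8\xB3\xD9\x84\xD8\xA7\xD9\x85";

var_dump(strlen($string));             // int(12): bytes
var_dump(mb_strlen($string, 'UTF-8')); // int(6): characters

// Byte indexing returns only the second half of the last letter.
$lastByte = $string[strlen($string) - 1];
var_dump(bin2hex($lastByte)); // string(2) "85"

// mb_substr() returns the whole last character (both bytes of "م").
$lastChar = mb_substr($string, -1, null, 'UTF-8');
var_dump(bin2hex($lastChar)); // string(4) "d985"
```

Passing 'UTF-8' explicitly has the same effect as calling mb_internal_encoding("UTF-8") first, but keeps each call independent of global state.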
