PHP str_split on string with decoded html_entity - php

If I run this code:
<?php
$string = 'My string &lsquo;to parse&rsquo;';
$string_decoded = html_entity_decode($string, ENT_QUOTES, 'utf-8');
$string_array = str_split($string_decoded);
var_dump($string_array);
?>
I get this result:
array (size=28)
0 => string 'M' (length=1)
1 => string 'y' (length=1)
2 => string ' ' (length=1)
3 => string 's' (length=1)
4 => string 't' (length=1)
5 => string 'r' (length=1)
6 => string 'i' (length=1)
7 => string 'n' (length=1)
8 => string 'g' (length=1)
9 => string ' ' (length=1)
10 => string '�' (length=1)
11 => string '�' (length=1)
12 => string '�' (length=1)
13 => string 't' (length=1)
14 => string 'o' (length=1)
15 => string ' ' (length=1)
16 => string 'p' (length=1)
17 => string 'a' (length=1)
18 => string 'r' (length=1)
19 => string 's' (length=1)
20 => string 'e' (length=1)
21 => string '�' (length=1)
22 => string '�' (length=1)
23 => string '�' (length=1)
As you can see, instead of the decoded single quotes (left/right), I'm getting these three characters for each quote...
I noticed that this happens with some entities, but not others. A few that present this issue are &lsquo; &rdquo; &copy;. Some that don't present the same problem are &amp; &gt;.
I tried different charsets but couldn't find one that would work for all.
What am I doing wrong? Is there a way to make it work for all entities? Or at least all the "common" ones?
Thanks.

This should do well:
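// mb_str_split() is built in since PHP 7.4; define this fallback only on older versions.
if (!function_exists('mb_str_split')) {
    function mb_str_split($string) {
        // The u modifier makes the regex split between UTF-8 characters, not bytes.
        return preg_split('/(?<!^)(?!$)/u', $string);
    }
}
$string = 'My string &lsquo;to parse&rsquo;';
// utf8_encode() assumes ISO-8859-1 input and is deprecated as of PHP 8.2;
// skip this line if the input is already UTF-8.
$string = utf8_encode($string);
$string_decoded = html_entity_decode($string, ENT_QUOTES, 'utf-8');
$string_array = mb_str_split($string_decoded);
var_dump($string_array);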
As mentioned in comments: you need to split the string with mb_split or by regex.
Proof: https://3v4l.org/3FRmG
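To make the cause visible: str_split() works on bytes, and a decoded curly quote occupies three bytes in UTF-8, which is why each quote shows up as three broken characters. The following is only a minimal sketch, assuming a PCRE build with Unicode support; the variable names are illustrative and not from the original answer.

```php
<?php
// Decode the entities to UTF-8; each curly quote becomes a 3-byte sequence.
$decoded = html_entity_decode('&lsquo;to parse&rsquo;', ENT_QUOTES, 'UTF-8');

// str_split() cuts after every byte, so each quote is torn into 3 pieces.
$bytes = str_split($decoded);

// An empty pattern with the u modifier matches between UTF-8 characters;
// PREG_SPLIT_NO_EMPTY drops the empty strings at both ends.
$chars = preg_split('//u', $decoded, -1, PREG_SPLIT_NO_EMPTY);

echo count($bytes), "\n"; // 14 (3 + 8 + 3 bytes)
echo count($chars), "\n"; // 10 characters
```

Entities such as &amp; and &gt; decode to single-byte ASCII characters, so a byte-wise split leaves them intact.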

Related

Not sure on how to randomize this

I'm trying to randomize a set of results from the database.
This is the basis of the array:
array (size=30)
0 => string '1' (length=1)
1 => string 'jordan' (length=6)
2 => string 'chris' (length=5)
3 => string '1' (length=1)
4 => string '1' (length=1)
5 => string 'card1, card2, card3, card4, card5, card6, card7, card8' (length=54)
6 => string 'card16, card20, card30, card40, card50, card60, card70, card80' (length=62)
7 => string '' (length=0)
8 => string '' (length=0)
9 => string '' (length=0)
10 => string '' (length=0)
11 => string '' (length=0)
12 => string '' (length=0)
13 => string '' (length=0)
14 => string '' (length=0)
15 => string '' (length=0)
16 => string '' (length=0)
17 => string '' (length=0)
18 => string '' (length=0)
19 => string '' (length=0)
20 => string '' (length=0)
21 => string '' (length=0)
22 => string '' (length=0)
23 => string '' (length=0)
24 => string '' (length=0)
25 => string '' (length=0)
26 => string '' (length=0)
27 => string '2013-11-21 04:23:19' (length=19)
28 => string '0' (length=1)
29 => string '0' (length=1)
I want to pull the data from array[5] and shuffle/randomize it:
while ($row = mysql_fetch_array($cards, MYSQL_NUM)) {
var_dump($row);
var_dump(array_rand($row[6], 2 ));
}
I've tried various things and now I'm more confused than when I first started. Can someone help me out?
Explode your string first (the values are separated by a comma and a space):
$cards = explode(", ", $row[6]);
Then randomize it with shuffle, and implode:
shuffle($cards);
$result = implode(", ", $cards);
$result should now be a shuffled list.
You can use the PHP function shuffle to randomize an array.
You just have to pull the data you want into an array, shuffle it, and then display it :)
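As a rough sketch of that idea, and of the array_rand() call the question attempted: array_rand() expects an array and returns keys, so the string has to be exploded first. The $column value below is a hypothetical stand-in for one column of $row.

```php
<?php
// Hypothetical column value, shaped like the card lists in the question.
$column = 'card1, card2, card3, card4, card5, card6, card7, card8';

// Split on commas and trim the space that follows each one.
$cards = array_map('trim', explode(',', $column));

// Option 1: shuffle a copy of the whole list in place.
$shuffled = $cards;
shuffle($shuffled);

// Option 2: pick two random cards; array_rand() returns keys, not values.
$keys = array_rand($cards, 2);
$picked = array($cards[$keys[0]], $cards[$keys[1]]);

echo implode(', ', $shuffled), "\n";
echo implode(', ', $picked), "\n";
```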
Try this
function getR($ipt)
{
$pieces = explode(",", $ipt);
return $pieces[mt_rand(0,count($pieces)-1)];
}
while ($row = mysql_fetch_array($cards, MYSQL_NUM))
{
echo getR($row[5]);
}

strange behavior of preg_match_all()

Following code:
$string ='۱۲۳۴۵۶۷۸۹۰';
$regex ='#۱#';
preg_match_all($regex,$string,$match);
var_dump($match);
will output:
array(1) {
[0] =>
array(1) {
[0] =>
string(2) "۱"
}
}
but
$regex2 ='#[۱]#';
preg_match_all($regex2,$string,$match);
var_dump($match);
will output
array (size=1)
0 =>
array (size=11)
0 => string '�' (length=1)
1 => string '�' (length=1)
2 => string '�' (length=1)
3 => string '�' (length=1)
4 => string '�' (length=1)
5 => string '�' (length=1)
6 => string '�' (length=1)
7 => string '�' (length=1)
8 => string '�' (length=1)
9 => string '�' (length=1)
10 => string '�' (length=1)
What I actually want is to use a regex like [۱۲۳۴۵۶۷۸۹۰], but the function gives strange results with such patterns. I am using PHP 5.4.
Try adding the Unicode flag:
$regex = '#[۱]#u';
The reason is that ۱ is actually two bytes long in UTF-8. On its own, it's harmless, because the pattern only matches those exact bytes in sequence. In a character class, however, each byte becomes a separate member, so either of them can match a single byte of the other characters, which it does here because these digits sit next to each other in Unicode and share their first byte.
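A small sketch comparing both modes on the string from the question, assuming the source file is saved as UTF-8 and PCRE has Unicode support; the variable names are illustrative.

```php
<?php
// Ten Extended Arabic-Indic digits; each is two bytes in UTF-8 (0xDB 0xB0..0xB9).
$string = '۱۲۳۴۵۶۷۸۹۰';

// Byte mode: the class holds the two bytes of ۱ (0xDB and 0xB1) as separate members.
$bytesMatched = preg_match_all('#[۱]#', $string, $byteMatches);

// Unicode mode: the class holds a single code point, U+06F1.
$charsMatched = preg_match_all('#[۱]#u', $string, $charMatches);

echo $bytesMatched, "\n"; // 11: ten 0xDB lead bytes plus the one 0xB1
echo $charsMatched, "\n"; // 1
```

The eleven byte-mode matches are the same eleven one-byte strings shown in the question's second var_dump.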

PHP range for hebrew alphabets

PHP has a function range('a','z') which prints the English alphabet a, b, c, d, etc.
Is there a similar function for the Hebrew alphabet?
You can do something like this:
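function utfOrd($c) {
    // unpack() returns an array indexed from 1; array_pop() on a function result raises a notice.
    $hex = unpack('H*', $c);
    return intval($hex[1], 16);
}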
function utfChr($c) {
return pack('H*', base_convert("$c", 10, 16));
}
var_dump(array_map('utfChr', range(utfOrd('א'), utfOrd('ת'))));
Prints:
array
0 => string 'א' (length=2)
1 => string 'ב' (length=2)
2 => string 'ג' (length=2)
3 => string 'ד' (length=2)
4 => string 'ה' (length=2)
5 => string 'ו' (length=2)
6 => string 'ז' (length=2)
7 => string 'ח' (length=2)
8 => string 'ט' (length=2)
9 => string 'י' (length=2)
10 => string 'ך' (length=2)
11 => string 'כ' (length=2)
12 => string 'ל' (length=2)
13 => string 'ם' (length=2)
14 => string 'מ' (length=2)
15 => string 'ן' (length=2)
16 => string 'נ' (length=2)
17 => string 'ס' (length=2)
18 => string 'ע' (length=2)
19 => string 'ף' (length=2)
20 => string 'פ' (length=2)
21 => string 'ץ' (length=2)
22 => string 'צ' (length=2)
23 => string 'ק' (length=2)
24 => string 'ר' (length=2)
25 => string 'ש' (length=2)
26 => string 'ת' (length=2)
If you need more characters, you can use this to create your hardcoded array or merge a few ranges.
range() works with the standard western alphabet because the characters A through Z have consecutive values in the ASCII (and UTF-8) character set.
Hebrew characters are not ASCII characters, but you could build a range of their numeric code point values and then array_map that to characters.
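One way this could look, as a sketch only: it assumes PHP 7.2 or later with the mbstring extension, since mb_ord() and mb_chr() are not available before that. The variable names are illustrative.

```php
<?php
// Assumes PHP 7.2+ with mbstring; mb_ord() returns the Unicode code point.
$first = mb_ord('א', 'UTF-8'); // U+05D0
$last  = mb_ord('ת', 'UTF-8'); // U+05EA

// Build the numeric range, then map each code point back to a UTF-8 character.
$letters = array_map(
    function ($codePoint) {
        return mb_chr($codePoint, 'UTF-8');
    },
    range($first, $last)
);

echo count($letters), "\n"; // 27 (includes the final letter forms)
echo implode(' ', $letters), "\n";
```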

How to filter a dirty array into a clean stream of data

I have what I would call a dirty array, which needs to be filtered into a clean array.
Below is the array:
array
0 => string '1' (length=1)
1 => string 'FIRSTNAME A' (length=7)
2 => string 'LASTNAME B' (length=10)
3 => string '2011-12-08 16:15:37' (length=19)
4 => string '2' (length=1)
5 => string 'FIRSTNAME B' (length=7)
6 => string 'LASTNAME B' (length=10)
7 => string '2011-12-08 16:15:43' (length=19)
8 => string '3' (length=1)
9 => string 'FIRSTNAME C' (length=7)
10 => string 'LASTNAME C' (length=10)
11 => string '2011-12-08 16:15:48' (length=19)
12 => string '4' (length=1)
13 => string 'FIRSTNAME D' (length=7)
14 => string 'LASTNAME D' (length=10)
15 => string '2011-12-08 16:15:55' (length=19)
16 => string '6' (length=1)
17 => string 'FIRSTNAME E' (length=7)
18 => string 'LASTNAME E' (length=10)
19 => string '2011-12-08 16:16:08' (length=19)
I want the final output to look like
array[0]= 1, FIRSTNAME A, LASTNAME A, DATE
array[1]= 2, FIRSTNAME B, LASTNAME B, DATE
array[2]= 3, FIRSTNAME C, LASTNAME C, DATE
array[3]= 4, FIRSTNAME D, LASTNAME D, DATE
array[4]= 6, FIRSTNAME E, LASTNAME E, DATE
This should work
$clean = array_chunk($dirty, 4);
more about array_chunk
Wow, I'm going to take a stab at this. I'm not completely sure what you're asking, but hopefully this will get us going in some direction:
$cleanArray = array_chunk($dirtyArray, 4);
$finalArray = array();
foreach ($cleanArray as $value) {
    // Join the four fields of each record into one comma-separated string.
    $finalArray[] = implode(", ", $value);
}
print_r($finalArray);
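If named fields are more useful than joined strings, the chunks could also be combined with keys. This is only a sketch: the key names are hypothetical, since the question does not name the columns, and the sample data is a shortened version of the question's array.

```php
<?php
// Two sample records in the flat layout shown in the question.
$dirty = array(
    '1', 'FIRSTNAME A', 'LASTNAME A', '2011-12-08 16:15:37',
    '2', 'FIRSTNAME B', 'LASTNAME B', '2011-12-08 16:15:43',
);

// Hypothetical column names; the question does not name the fields.
$keys = array('id', 'first', 'last', 'created');

$records = array();
foreach (array_chunk($dirty, count($keys)) as $chunk) {
    // array_combine() needs equal lengths, so skip a short trailing chunk.
    if (count($chunk) === count($keys)) {
        $records[] = array_combine($keys, $chunk);
    }
}

print_r($records);
```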

Get text that is within brackets with single or double quotes

I'm trying to find all the strings passed to the i18n functions in my PHP files. Here is an example:
$string = '__("String 2"); __("String 3", __("String 4"));' . "__('String 5'); __('String 6', __('String 7'));";
var_dump(preg_match_all('#__\((\'|")([^\'"]+)(\'|")\)#', $string, $match));
var_dump($match);
I want to get this result:
array
0 => array
0 => string 'String 2' (length=8)
1 => string 'String 3' (length=8)
2 => string 'String 4' (length=8)
3 => string 'String 5' (length=8)
4 => string 'String 6' (length=8)
5 => string 'String 7' (length=8)
But unfortunately I get this result
array
0 => array
0 => string '__("esto es una prueba")' (length=24)
1 => string '__("esto es una prueba 2")' (length=26)
2 => string '__("prueba 4")' (length=14)
3 => string '__('caca')' (length=10)
4 => string '__('asdsnasdad')' (length=16)
1 => array
0 => string '"' (length=1)
1 => string '"' (length=1)
2 => string '"' (length=1)
3 => string ''' (length=1)
4 => string ''' (length=1)
2 => array
0 => string 'esto es una prueba' (length=18)
1 => string 'esto es una prueba 2' (length=20)
2 => string 'prueba 4' (length=8)
3 => string 'caca' (length=4)
4 => string 'asdsnasdad' (length=10)
3 => array
0 => string '"' (length=1)
1 => string '"' (length=1)
2 => string '"' (length=1)
3 => string ''' (length=1)
4 => string ''' (length=1)
Thanks in advance.
preg_match_all('/(?<=\(["\']).*?(?=[\'"])/', $subject, $result, PREG_PATTERN_ORDER);
$result = $result[0];
Simple.
Note that I am using ( as an entry point to the match. If you have more exotic input you should provide it.
Don't capture the quotes.
preg_match_all('#__\([\'"]([^\'"]+)[\'"]\)#', $string, $match);
Also take a look at the flags parameter for preg_match_all() for different output formats.
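The pattern above requires a closing parenthesis right after the string, so calls with further arguments such as __("String 3", ...) would be skipped. A possible variant, offered as a sketch rather than as either answerer's method, uses a backreference so the closing quote must equal the opening one, and also shows the PREG_SET_ORDER flag:

```php
<?php
$string = '__("String 2"); __("String 3", __("String 4"));'
        . "__('String 5'); __('String 6', __('String 7'));";

// Group 1 captures the opening quote, \1 requires the same quote to close,
// and group 2 lazily captures the text in between.
$pattern = '#__\(([\'"])(.*?)\1#';

// Default PREG_PATTERN_ORDER: $match[2] lists every captured string.
preg_match_all($pattern, $string, $match);
print_r($match[2]);

// PREG_SET_ORDER groups the results per match instead of per capture group.
preg_match_all($pattern, $string, $sets, PREG_SET_ORDER);
echo $sets[0][2], "\n"; // String 2
```

This does not handle escaped quotes inside the translated strings.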
