I am trying to explode a text file into one array per line. The file is provided through a URL in this format:
Ville:Montr&eacute;al; Fichier:montreal.txt
Ville:Qu&eacute;bec; Fichier:quebec.txt
The problem is that the separator ";" also appears elsewhere in the string, at the end of the HTML entity &eacute;.
The wanted result is:
[0] => Ville:Québec [1] => Fichier:quebec.txt
[0] => Ville:Montréal [1] => Fichier:montreal.txt
I am using this code:
<?php
$tabCities = file('redacted');
//$oneLine = utf8_decode($tabCities[0]);
$oneLine = $tabCities[0];
$arrayLine = explode(";", $oneLine);
print_r($oneLine);
print_r($arrayLine);
?>
It outputs
Ville:Montréal; Fichier:montreal.txt Array ( [0] => Ville:Montré [1] => al [2] => Fichier:montreal.txt )
utf8_decode does not help. Is there any other function or strategy I can try?
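You are looking for html_entity_decode.
It converts an HTML entity such as &eacute; into the corresponding character (é). Decode the line before you explode it, so that the only remaining ";" is the real separator.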
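A minimal sketch of that approach, assuming the raw line really contains the entity &eacute; (the sample string below is hypothetical, since the real file comes from a redacted URL):

```php
<?php
// Hypothetical sample line; the real data is read with file() from a URL.
$oneLine = "Ville:Montr&eacute;al; Fichier:montreal.txt";

// Decode entities first, so the ';' that terminates '&eacute;' disappears.
$decoded = html_entity_decode($oneLine, ENT_QUOTES | ENT_HTML5, 'UTF-8');

// Split on the real separator and trim the space that follows it.
$arrayLine = array_map('trim', explode(';', $decoded));

print_r($arrayLine);
```

Passing 'UTF-8' explicitly keeps the result independent of the default_charset setting.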
I am reading a CSV file into PHP. The file is read properly except for one thing: words in non-English languages come back as symbols instead of the original text.
For example, the CSV file contains:
LBL_Hello Hello 你好
I read the contents into a PHP array using the code below:
<?php
$file = file_get_contents("test.csv");
$data = (array_map("str_getcsv", preg_split('/\r*\n+|\r+/', $file)));
echo "<pre>";
print_r(($data));
exit;
?>
Output:
Array
(
[0] => Array
(
[0] => LBL_Hello
[1] => Hello
[2] => ���
)
[1] => Array
(
[0] =>
)
)
Here, the Chinese word "你好" is shown as "���". The same happens with words in every other non-English language.
What am I missing to get the exact words when reading the CSV into PHP?
Thanks in advance.
Try sending a charset header at the top of the file (before any other output):
header('Content-Type: text/html; charset=UTF-8');
Sample code to read data containing non-English characters from a CSV file:
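<?php
// Tell the browser the page is UTF-8; this must run before any output.
header('Content-Type: text/html; charset=UTF-8');
$handle = fopen("test.csv", "r");
$dataArr = [];
echo "<pre>";
// fgetcsv() returns false at the end of the file.
while (($data = fgetcsv($handle, 1000, ",")) !== false) {
    $dataArr[] = $data;
}
fclose($handle);
print_r($dataArr);
echo "</pre>";
exit;
?>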
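If the header alone does not help, the file itself may not be saved as UTF-8. The sketch below is only an illustration under stated assumptions: it requires the mbstring extension, the helper name toUtf8 is hypothetical, and GB18030 is a guess at the source encoding (the question does not say how test.csv was saved).

```php
<?php
// Hypothetical helper: return $text as UTF-8, converting from $assumedEncoding
// only when the bytes are not already valid UTF-8.
function toUtf8(string $text, string $assumedEncoding = 'GB18030'): string
{
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text;
    }
    return mb_convert_encoding($text, 'UTF-8', $assumedEncoding);
}

// Simulate a CSV line saved in GB18030 (an assumption made for this demo).
$legacyLine = mb_convert_encoding('LBL_Hello,Hello,你好', 'GB18030', 'UTF-8');

// Convert to UTF-8 first, then parse the line as CSV.
$row = str_getcsv(toUtf8($legacyLine), ',', '"', '\\');

print_r($row);
```

Re-saving the CSV as UTF-8 in the editor that produced it avoids the conversion step entirely.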
Input - hey I'm smiling 😀
Output - hey I'm smiling <span class ="smile"></span>
Code
$emoticons = array('😀' =>'<span class ="smile"></span>') ;
$str = strtr($str, $emoticons) ;
I can't use str_replace because I have more than one element in the $emoticons array.
The code above is not working: the output is the same as the input.
This works for me:
<?php
$str = "hey I'm smiling 😀 and I'm crying 😢 😢"; // input
$emoticons = array('😀' =>'<span class="smile"></span>','😢' =>'<span class="cry"></span>') ; // array of emoticons and spans
$output = strtr($str, $emoticons); // change emoticons from array to spans from array
echo $output; // print it
?>
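As a side note, str_replace also accepts arrays for its search and replace arguments, so it can handle several emoticons as well. A small sketch comparing the two calls on the same sample data (the variable names $viaStrtr and $viaStrReplace are made up for this illustration):

```php
<?php
$str = "hey I'm smiling 😀 and I'm crying 😢 😢";
$emoticons = [
    '😀' => '<span class="smile"></span>',
    '😢' => '<span class="cry"></span>',
];

// strtr() takes the whole map in one call.
$viaStrtr = strtr($str, $emoticons);

// str_replace() takes parallel arrays of search and replacement strings.
$viaStrReplace = str_replace(array_keys($emoticons), array_values($emoticons), $str);

echo $viaStrtr, PHP_EOL;
echo $viaStrReplace, PHP_EOL;
```

The two differ when a replacement contains text that is itself a search key: strtr never rescans replaced text, while str_replace applies its pairs one after another.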
My Setup:
index.php:
<?php
$page = file_get_contents('a.html');
$arr = array();
preg_match('/<td class=\"myclass\">(.*)\<\/td>/s',$page,$arr);
print_r($arr);
?>
a.html:
...other content
<td class="myclass">
THE
CONTENT
</td>
other content...
Output:
Array
(
[0] => Array
(
)
)
If I change line 4 of index.php to:
preg_match('/<td class=\"myclass\">(.*)\<\/t/s',$page,$arr);
The output is:
Array
(
[0] => <td class="myclass">
THE
CONTENT
</t
[1] =>
THE
CONTENT
)
I can't make out what's wrong. Please help me match the content between <td class="myclass"> and </td>.
Your code appears to work. I edited the regex to use a different delimiter, which makes it easier to read. You may want to use the ungreedy modifier in case there is more than one myclass TD in your HTML.
I have not been able to reproduce the "array of array" behaviour you note, unless I manipulate the code to add an error -- see at bottom.
<?php
$page = <<<PAGE
...other content
<td class="myclass">
THE
CONTENT
</td>
other content...
PAGE;
preg_match('#<td class="myclass">(.*)</td>#s',$page,$arr);
print_r($arr);
?>
returns, as expected:
Array
(
[0] => <td class="myclass">
THE
CONTENT
</td>
[1] =>
THE
CONTENT
)
The code below is similar to yours but has been modified to cause an identical error. Doesn't seem likely you did this, though. The regexp is modified in order to not match, and the resulting empty array is stored into $arr[0] instead of $arr.
preg_match('#<td class="myclass">(.*)</ td>#s',$page,$arr[0]);
Returns the same error you observe:
Array
(
[0] => Array
(
)
)
I can duplicate the same behaviour you observe (works with </t, does not work with </td>) if I use your regexp, but modify the HTML to have </t d>. I still need to write to $arr[0] instead of $arr if I also want to get an identical output.
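To illustrate the ungreedy suggestion from this answer, here is a minimal sketch with a made-up row containing two myclass cells (the sample HTML and the variable names are assumptions, not taken from the question):

```php
<?php
// Hypothetical input with two cells of the same class.
$page = '<td class="myclass">A</td><td class="myclass">B</td>';

// Greedy: (.*) runs to the LAST </td> in the input.
preg_match('#<td class="myclass">(.*)</td>#s', $page, $greedy);

// Lazy quantifier: (.*?) stops at the FIRST </td>.
preg_match('#<td class="myclass">(.*?)</td>#s', $page, $lazy);

// U modifier: makes the plain (.*) ungreedy, same result as the lazy form.
preg_match('#<td class="myclass">(.*)</td>#sU', $page, $ungreedy);

print_r([$greedy[1], $lazy[1], $ungreedy[1]]);
```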
Note that the third parameter of preg_match receives the matches: element 0 holds the text matched by the full pattern, and the following elements hold the captured groups.
http://ca3.php.net/manual/en/function.preg-match.php
If matches is provided, then it is filled with the results of search. $matches[0] will contain the text that matched the full pattern, $matches[1] will have the text that matched the first captured parenthesized subpattern, and so on.
This code
preg_match('/<td class=\"myclass\">(.*)\<\/t/s',$page,$arr);
When applied on
...other content
<td class="myclass">
THE
CONTENT
</td>
other content...
Will return the match in $arr[0] and the result of (.*) in $arr[1]. This result is correct: There is your content in [1]
Array
(
[0] => <td class="myclass">
THE
CONTENT
</t
[1] =>
THE
CONTENT
)
Example two
<?php
header('Content-Type: text/plain');
$page = 'A B C D E F';
$arr = array();
preg_match('/C (D) E/', $page, $arr);
print_r($arr);
Example output
Array
(
[0] => C D E // This is the string found
[1] => D // this is what I wanted to look for and extracted out of [0], the matched parenthesis
)
Your regex seems correct. Isn't the syntax of preg_match as follows?
preg_match('/<td class=\"myclass\">(.*)\<\/td>/s',$page,$arr);
The | in the regex represents or
$contents contains an HTML document:
$contents = curl_exec ($ch)
I need to get the content from:
<span class="Menu1">Artur €2000</span>
It is repeated several times, so I want to save the matches into an array.
I tried to do it this way:
preg_match_all('<span class=\"Menu1\">(.*?)</span>#si',$contents,$wynik2);
But I get an error:
Warning: preg_match_all() [function.preg-match-all]: Unknown modifier '('
Can you help me, please?
EDIT: $contents = curl_exec ($ch)
SOLVED: The error was caused by invalid HTML on the cURLed website:
<span class="Menu1">Content</tr>
instead of:
<span class="Menu1">Content</span>
I didn't expect that someone would write invalid HTML. Thank you for the help!
You forgot the first delimiter (#):
$contents = '<span class="Menu1">Artur $2000</span> somehtml <span class="Menu1">Mark $1000</span>';
preg_match_all('#<span class="Menu1">(.*?)</span>#si', $contents, $wynik2);
print_r($wynik2);
/*
Array
(
[0] => Array
(
[0] => <span class="Menu1">Artur $2000</span>
[1] => <span class="Menu1">Mark $1000</span>
)
[1] => Array
(
[0] => Artur $2000
[1] => Mark $1000
)
)
*/
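For context on the warning itself: when a pattern starts with "<", PCRE in PHP treats "<" and ">" as a bracket-style delimiter pair, so the pattern ends at the first ">" and everything after it is read as modifiers, hence "Unknown modifier '('". A small sketch contrasting both calls (the sample $contents and the variable names are assumptions; the @ operator is used only to silence the expected warning):

```php
<?php
// Hypothetical sample input.
$contents = '<span class="Menu1">Artur $2000</span> somehtml <span class="Menu1">Mark $1000</span>';

// Missing opening delimiter: '<'...'>' becomes the delimiter pair,
// and '(.*?)</span>#si' is parsed as (invalid) modifiers.
$bad = @preg_match_all('<span class="Menu1">(.*?)</span>#si', $contents, $ignored);

// Both delimiters present: the pattern compiles and matches twice.
$good = preg_match_all('#<span class="Menu1">(.*?)</span>#si', $contents, $wynik2);

var_dump($bad, $good);
print_r($wynik2[1]);
```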
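You should put the delimiter "|" at the start and the end of your regular expression:
preg_match_all("|<span class=\"Menu1\">(.*?)</span>|",$contents,$wynik2);
The U modifier is left out here because it inverts greediness: combined with U, the lazy (.*?) would become greedy again.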
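A caveat when mixing the U modifier with a lazy quantifier: U swaps the default greediness, so a quantifier followed by "?" becomes greedy. The sketch below shows the difference on a hypothetical two-span input (the sample string and the variable names are made up):

```php
<?php
// Hypothetical input with two matching spans.
$contents = '<span class="Menu1">Artur</span> x <span class="Menu1">Mark</span>';

// Lazy quantifier, no U: one match per span.
preg_match_all('|<span class="Menu1">(.*?)</span>|', $contents, $lazy);

// Lazy quantifier plus U: (.*?) turns greedy and swallows both spans at once.
preg_match_all('|<span class="Menu1">(.*?)</span>|U', $contents, $inverted);

print_r($lazy[1]);
print_r($inverted[1]);
```

Using either the "?" or the U modifier, but not both, keeps the match ungreedy.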
I found this code, but two of the links are not converted properly.
Save my code in a PHP file to test it.
$a = <<<_END
www.coders.com/coder6602.html => Work
www.avan.ir?a=21 => Work
www.limited.ndot.in/kansas/#?w=500 => Not Work
http://www.sales.com => Work
http://www.mediafire.com/?298nlpla3eys9g6 => Work
http://diary.com/files/draft.html و http://www.logo.net/plogo.swf => Not Work (this is two link)
_END;
/*
limited.ndot.in/kansas/#?w=500
diary.com/files/draft.html و logo.net/plogo.swf
*/
function strHyperlink($str){
$pattern_url = '~(?>[a-z+]{2,}://|www\.)(?:[a-z0-9]+(?:\.[a-z0-9]+)?#)?(?:(?:[a-z](?:[a-z0-9]|(?<!-)-)*[a-z0-9])(?:\.[a-z](?:[a-z0-9]|(?<!-)-)*[a-z0-9])+|(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?))(?:/[^\\/:?*"<>|\n]*[a-z0-9])*/?(?:\?[a-z0-9_.%]+(?:=[a-z0-9_.%:/+-]*)?(?:&[a-z0-9_.%]+(?:=[a-z0-9_.%:/+-]*)?)*)?(?:#[a-z0-9_%.]+)?~i';
$str = preg_replace('/http:\/\//','www.', $str);
$str = preg_replace('/www.www./','www.', $str);
$str = preg_replace($pattern_url,"\\0", $str);
return preg_replace('/www./','',$str);
}
echo strHyperlink($a);
Use this regex: \b(([\w-]+://?|www[.])[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/)))
You can test it on: http://gskinner.com/RegExr/
I have just tested this regex with your URLs and it works perfectly.
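A quick way to try the suggested regex from PHP is preg_match_all. This is only a sketch: the "~" delimiter, the sample sentence, and the variable names are assumptions, and only the URLs themselves come from the question.

```php
<?php
// The regex from this answer, wrapped in '~' delimiters (an arbitrary choice).
$pattern = '~\b(([\w-]+://?|www[.])[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/)))~';

// Hypothetical sentence built around the two problem cases from the question.
$text = 'Visit www.limited.ndot.in/kansas/#?w=500 or '
      . 'http://diary.com/files/draft.html and http://www.logo.net/plogo.swf today';

// $matches[0] holds the full text of every URL found.
preg_match_all($pattern, $text, $matches);

print_r($matches[0]);
```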