FIXED!
File encoding is UTF-16LE, changed to UTF-8 in PhpStorm and it behaves.
===========================================================
I'm reading a text file in PHP and want to manipulate its contents, but as soon as I touch the contents in any way it 'breaks'.
If I read the file and then echo it, the text is displayed, but any other operation will not work.
$contents = file_get_contents($file);
echo $contents; // works
$contents .= 'a longer test' . $contents;
echo $contents;
My ultimate goal is to run some regexes on the contents before dumping them into a database, but I need to be able to work with the contents first.
If it makes any difference I am using Laravel. I tried File::get($file) but have the same outcome.
EDIT to show output - Unicode issue?
//// first echo
POUR L ’É T U DE DE L ’H IST O IR E ET DE LA LANGUE DU PAYS, LA CONSERVATION DES A N TIQ U ITÉS DE L ’IL E , ET LA PUBLICATION DE DOCUMENTS HISTORIQUES, ETC., ETC. FONDÉE LE 28 JANVIER, 1873. DIXIÈME BULLETIN ANNUEL. : C. LE F E U VRE, IM PR IM E U R -É D IT EU R D E LA SOCIÉTÉ, BERESFORD LIBRARY , ST. -H ÉLIE R . 1885. = Page 1 =
// Second echo
POUR L ’É T U DE DE L ’H IST O IR E ET DE LA LANGUE DU PAYS, LA CONSERVATION DES A N TIQ U ITÉS DE L ’IL E , ET LA PUBLICATION DE DOCUMENTS HISTORIQUES, ETC., ETC. FONDÉE LE 28 JANVIER, 1873. DIXIÈME BULLETIN ANNUEL. : C. LE F E U VRE, IM PR IM E U R -É D IT EU R D E LA SOCIÉTÉ, BERESFORD LIBRARY , ST. -H ÉLIE R . 1885. = Page 1 =潬杮牥琠獥エഀ匀伀䌀䤀䔀吀䔀 䨀䔀刀匀䤀䄀䤀匀䔀ഀ倀伀唀刀 䰀 ᤀ줠 吀 唀 䐀䔀 䐀䔀 䰀 ᤀ䠠 䤀匀吀 伀 䤀刀 䔀 䔀吀 䐀䔀 䰀䄀 䰀䄀一䜀唀䔀 䐀唀 倀䄀夀匀Ⰰഀ䰀䄀 䌀伀一匀䔀刀嘀䄀吀䤀伀一 䐀䔀匀 䄀 一 吀䤀儀 唀 䤀吀준匀 䐀䔀 䰀 ᤀ䤠䰀 䔀 Ⰰ 䔀吀 䰀䄀 倀唀䈀䰀䤀䌀䄀吀䤀伀一 ഀ䐀䔀 䐀伀䌀唀䴀䔀一吀匀 䠀䤀匀吀伀刀䤀儀唀䔀匀Ⰰ 䔀吀䌀⸀Ⰰ 䔀吀䌀⸀ഀ䘀伀一䐀준䔀 䰀䔀 ㈀㠀 䨀䄀一嘀䤀䔀刀Ⰰ 㠀㜀㌀⸀ഀ䐀䤀堀䤀저䴀䔀 䈀唀䰀䰀䔀吀䤀一 䄀一一唀䔀䰀⸀ഀ㨀ഀ䌀⸀ 䰀䔀 䘀 䔀 唀 嘀刀䔀Ⰰ 䤀䴀 倀刀 䤀䴀 䔀 唀 刀 ⴀ준 䐀 䤀吀 䔀唀 刀 䐀 䔀 䰀䄀 匀伀䌀䤀준吀준Ⰰഀ䈀䔀刀䔀匀䘀伀刀䐀 䰀䤀䈀刀䄀刀夀 Ⰰ 匀吀⸀ ⴀ䠀 준䰀䤀䔀 刀 ⸀ഀ㠀㠀㔀⸀ഀ 㴀 倀愀最攀 㴀
If I put the first string into a HEREDOC it all works fine, so might it be something with the txt file? It's text extracted by OCR from an old PDF.
Full code
public function import()
{
// get all the files
$files = File::files('../import');
foreach ($files as $file) {
// load text file contents
$contents = file_get_contents($file);
echo $contents; // as expected
$contents .= 'a longer test' . $contents;
echo $contents; // weird stuff
// test txt file contents inline
$contents2 = <<<EOD
SOCIETE JERSIAISE
POUR L ’É T U DE DE L ’H IST O IR E ET DE LA LANGUE DU PAYS,
LA CONSERVATION DES A N TIQ U ITÉS DE L ’IL E , ET LA PUBLICATION
DE DOCUMENTS HISTORIQUES, ETC., ETC.
FONDÉE LE 28 JANVIER, 1873.
DIXIÈME BULLETIN ANNUEL.
:
C. LE F E U VRE, IM PR IM E U R -É D IT EU R D E LA SOCIÉTÉ,
BERESFORD LIBRARY , ST. -H ÉLIE R .
1885.
= Page 1 =
EOD;
echo $contents2; // works
$contents2 .= 'a longer test' . $contents2;
echo $contents2; // prints as expected
} // end foreach
} // end import()
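FIXED!
The file is encoded as UTF-16LE, so appending a plain PHP string to its raw bytes mixed two encodings in one string. I changed the file to UTF-8 in PhpStorm and it behaves.
Or in code: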
foreach ($files as $file) {
// load text file contents
$contents = file_get_contents($file);
// fix encoding
$contents = mb_convert_encoding($contents, 'UTF-8', 'UTF-16');
echo $contents;
.....
$data_to_write = 'test';
$file_handle = fopen($file, 'a');
fwrite($file_handle, $data_to_write);
fclose($file_handle);
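Returning to the encoding fix above: a minimal sketch of doing the conversion more defensively, assuming the files may or may not start with a byte order mark (BOM). The helper name toUtf8 is hypothetical and not part of the original code; it only uses mb_convert_encoding, strncmp and substr.

```php
<?php
// Hypothetical helper: return UTF-8 text, choosing the source encoding from a leading BOM.
// Assumption: input without any BOM is already UTF-8.
function toUtf8(string $raw): string
{
    if (strncmp($raw, "\xFF\xFE", 2) === 0) {
        // UTF-16 little-endian BOM: strip it, then convert
        return mb_convert_encoding(substr($raw, 2), 'UTF-8', 'UTF-16LE');
    }
    if (strncmp($raw, "\xFE\xFF", 2) === 0) {
        // UTF-16 big-endian BOM: strip it, then convert
        return mb_convert_encoding(substr($raw, 2), 'UTF-8', 'UTF-16BE');
    }
    if (strncmp($raw, "\xEF\xBB\xBF", 3) === 0) {
        // UTF-8 BOM: just strip it
        return substr($raw, 3);
    }
    return $raw;
}
```

Once the contents are UTF-8, concatenating ordinary PHP string literals no longer mixes encodings, and regexes with the /u modifier can be applied.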
I need to replace the comma "," with "->" as the multi-value separator in the category field of a CSV file, using a PHP script.
In the attached CSV sample, the field value in the first row is
;ALIMENTACIÓN,GRANEL,Cereales legumbres y frutos secos,Desayuno y entre horas,Varios;
I need it to be replaced with:
;ALIMENTACIÓN->GRANEL->Cereales legumbres y frutos secos->Desayuno y entre horas->Varios;
I tried this code on my php script:
file_put_contents("result.csv",str_replace(",","->",file_get_contents("origin.csv")));
It works, but it replaces commas in every field. I need to change them only in the category (CAT) field; commas in the description field and the other fields must stay as they are.
Thank you in advance.
A piece of my CSV file as an example (header and 3 rows; I truncated the description field):
id;SKU;DEFINICION;AMPLIACION;DISPONIBLE;IVA;REC_EQ;PVD;PVD_IVA;PVD_IVA_REC;PVP;PESO;EAN;HAY_FOTO;IMAGEN;FECHA_IMAGEN;CAT;MARCA;FRIO;CONGELADO;BIO;APTO_DIABETICO;GLUTEN;HUEVO;LACTOSA;APTO_VEGANO;UNIDAD_MEDIDA;CANTIDAD_MEDIDA;
1003;"01003";"COPOS DE AVENA 1000GR";"Los copos son granos de cereales que han sido aplastados para facilitar su digestion, manteniendo integras las propiedades del grano.<br>
La avena contiene proteínas en abundancia, así como hidratos de carbono, grasas saludables...";59;2;1.40;2.20;2.42;2.45;3.14;1;"8423266500305";1;"https://distribudiet.net/webstore/images/01003.jpg";"04/03/2020 0:00:00";ALIMENTACIÓN,GRANEL,Cereales legumbres y frutos secos,Desayuno y entre horas,Varios;GRANOVITA;0;0;0;0;1;0;0;1;kilo;1
1018;"01018";"MUESLI 10 FRUTAS 1000GR";"Receta de muesli de cereales, diez tipos diferentes de deliciosas frutas desecadas, frutos secos, semillas de girasol, lino y sesamo.<br>
A finales del ...";63;2;1.40;4.66;5.13;5.19;6.65;1;"8423266500060";1;"https://distribudiet.net/webstore/images/01018.jpg";"04/03/2020 0:00:00";ALIMENTACIÓN,GRANEL,Desayuno y entre horas;GRANOVITA;0;0;0;0;;0;0;1;kilo;1
1037;"01037";"AZUCAR CAÑA INTEGRAL 1000GR";"Azúcar moreno de caña integral sin gluten para endulzar todo tipo de postres, batidos o tus recetas favoritas de repostería. 100% natural, obtenido sin procesamiento quimico por ...";17;2;1.40;3.43;3.77;3.82;4.90;1;"8423266500121";1;"https://distribudiet.net/webstore/images/01037.jpg";"04/03/2020 0:00:00";ALIMENTACIÓN,GRANEL,Endulzantes;GRANOVITA;0;0;0;0;0;0;0;1;kilo;1
<?php
$input = 'PRESTA.csv';
$output = 'OUTPUT.csv';
$file = str_replace("<br>\n", "<br>", file_get_contents($input)); // Remove the newlines inside the description field
$lines = explode("\r\n", $file); // Split the file into lines (assumes CRLF row endings)
$fp = fopen($output, 'w'); // Open output file for writing
for ($i = 0; $i < count($lines); ++$i) {
    $extract = str_getcsv($lines[$i], ';'); // Split using the ; delimiter
    if ($i > 0 && isset($extract[16])) { // Skip the header row; index 16 is the "CAT" column
        $extract[16] = str_replace(',', '->', $extract[16]);
    } else {
        var_dump($extract); // Dumps the header row and any line without a CAT field
    }
    fputcsv($fp, $extract, ';'); // Write the line to the file using the ; delimiter
}
fclose($fp);
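As an alternative to splitting on line endings, here is a minimal sketch that lets fgetcsv parse the file, since it reads quoted fields that span several lines as one record. The function name rewriteCategories and the default column index are assumptions for illustration; the index 16 comes from the position of CAT in the sample header.

```php
<?php
// Hypothetical helper: copy a ;-delimited CSV, replacing "," with "->" in one column only.
function rewriteCategories(string $inPath, string $outPath, int $catIndex = 16): void
{
    $src = fopen($inPath, 'r');
    $dst = fopen($outPath, 'w');
    if ($src === false || $dst === false) {
        throw new RuntimeException('Could not open the input or output file');
    }

    $isHeader = true;
    // fgetcsv keeps quoted multi-line fields (such as the description) in one record
    while (($row = fgetcsv($src, 0, ';', '"', '\\')) !== false) {
        if (!$isHeader && isset($row[$catIndex])) {
            $row[$catIndex] = str_replace(',', '->', $row[$catIndex]);
        }
        $isHeader = false;
        fputcsv($dst, $row, ';', '"', '\\');
    }

    fclose($src);
    fclose($dst);
}
```

A call such as rewriteCategories('PRESTA.csv', 'OUTPUT.csv') would mirror the file names used above. Note that fputcsv decides the quoting itself, so fields may be quoted differently than in the input.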
I am using the doc2txt.class.php class to extract the text from a Word file in PHP, with the code below:
require("doc2txt.class.php");
$docObj = new Doc2Txt("test.docx");
$txt = $docObj->convertToText();
My Word file contains the following text:
MWONGOZO WA MAOMBI MAALUMU (MAOMBI YA HATARI).
Huu ni Mfano Tu, Jinsi Ya Kuomba Na Maeneo Ya Kuombea! Unatakiwa pamoja na KUWA NA BIDII, KUMTEGEMEA SANA ROHO MTAKATIFU NI MUHIMU SANA!
MAOMBI MAALUMU YA JINSI YA KUPAMBANA KATIKA VITA VYA KIROHO
Jinsi Ya Kuomba Maombi Haya
But the output I get is slightly different:
MWONGOZO WA MAOMBI MAALUMU (MAOMBI YA HATARI).Huu ni Mfano Tu, Jinsi Ya Kuomba Na Maeneo Ya Kuombea! Unatakiwa pamoja na KUWA NA BIDII, KUMTEGEMEA SANA ROHO MTAKATIFU NI MUHIMU SANA! MAOMBI MAALUMU YA JINSI YA KUPAMBANA KATIKA VITA VYA KIROHOJinsi Ya Kuomba Maombi Haya
As you can see, the output runs the words KIROHO and Jinsi together as KIROHOJinsi, so when I count the words I get 45, but there are actually 46.
Is there any way to resolve this issue?
I have tested this code on a txt file and it works fine. I think it might help you. Thanks
$myfile = file_get_contents("test.txt");
$array = explode("\n", $myfile);
$count = 0; // running word total
if (!empty($array))
{
foreach ($array as $rowarray)
{
$a1 = array_filter(explode(" ", trim($rowarray)));
$count = $count + count($a1);
}
echo $count;
}
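A shorter variant of the same counting idea, as a sketch only: it splits on any run of whitespace (spaces, tabs, line breaks) in one step. The function name countWords is hypothetical. It does not repair text where the separator between paragraphs is already missing, as in KIROHOJinsi; that would have to be fixed where the text is extracted.

```php
<?php
// Hypothetical helper: count whitespace-separated words in a UTF-8 string.
function countWords(string $text): int
{
    // PREG_SPLIT_NO_EMPTY drops the empty pieces produced by leading or repeated whitespace
    $words = preg_split('/\s+/u', trim($text), -1, PREG_SPLIT_NO_EMPTY);

    // preg_split returns false on error, for example on invalid UTF-8 input
    return $words === false ? 0 : count($words);
}
```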
I need my PHP code to count the characters of a text as it is echoed. When the count reaches 64, it should echo "$something" and then keep echoing from where it stopped.
Also, in the best case the code shouldn't split words.
For example
-- This:
echo 'This is a huge string that i mean to crop acording to it\'s character\'s count. For every 64 characters including spaces i need it to echo some other thing in the middle';
-- Would end up like this:
echo 'This is a huge string that i mean to crop acording to it\'s ' . $something . 'character\'s count. For every 64 characters including spaces ' . $something . 'i need it to echo some other thing in the middle';
For context: I need this code to work around the fact that SVG text can't be wrapped and justified.
Would you use mb_strimwidth? How?
Thanks in advance!
--- UPDATE 1 - I've unsuccessfully tried
echo mb_strimwidth($row['resumen'], 0, 84, "$something");
echo mb_strimwidth($row['resumen'], 64, 64, "$something");
echo mb_strimwidth($row['resumen'], 128, 64, "$something");
--- UPDATE 2 - PARTIAL SUCCESS!
$uno = substr($row['resumen'], 0, 64);
$dos = substr($row['resumen'], 64, 64);
$tres = substr($row['resumen'], 128, 64);
$suma = $uno . "</text><text>" . $dos . "</text><text>" . $tres;
echo "$suma";
BUT THIS JUST echoes the first line of my text.
Finally got to this solution:
$n=0;
$var="Texto largo mayor a 64 caracteres que complica mi utilizacion de una infografia en svg, ya que este lenguaje no acepta wrappers para el texto. Era muy lindo para ser verdad.";
$ts= mb_strwidth($var);
// Now define a variable that tracks how much text is left to print.
$aimprimir=mb_strwidth($var);
if ($ts>64){
while ($aimprimir>64):
// while more than 64 characters remain to be printed...
echo mb_strimwidth($var,$n,70,"<br/>");
$aimprimir=$aimprimir-64;
$n=$n+65;
endwhile;
// if what remains is 64 characters or fewer, print it...
echo mb_strimwidth($var,$n,$aimprimir,"<br/>");
}
else {
echo "$var";
}
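For the "don't split words" part of the question, here is a minimal sketch built on PHP's wordwrap, which breaks only at spaces. The function name wrapLines is hypothetical. One loud assumption: wordwrap counts bytes, not characters, so lines containing accented characters may come out somewhat shorter than 64 visible characters.

```php
<?php
// Hypothetical helper: split text into lines of at most $width bytes without cutting words.
function wrapLines(string $text, int $width = 64): array
{
    // With the last argument false, a word longer than $width is kept whole on its own line
    return explode("\n", wordwrap($text, $width, "\n", false));
}
```

The resulting lines can then be joined with whatever separator the SVG needs, for example implode("</text><text>", wrapLines($row['resumen'])).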
I am trying to extract values such as R$XX,XX (X stands for a digit) using a regular expression, but I cannot.
Below is my code:
$str = 'Indicada para 21 velocidades, corente indexadaCAPACETE MTB MANTUA MUSIC R$140,00PEDIVELA SHIMANO DEORE R$380,00PEDIVELA SHIMANO TX-71 R$99,00CORRENTE SHIMANO HG 40 R$55,00ROLO PARA TREINAMENTO TRANZ-X R$545,00CAPACETE MTB HIGH ONE (PROMOÇÃO) R$85,00BOMBA DE PÉ HIGH ONE COM MANÔMETRO (NYLON) R$89,90CAPA SELIM GEL (PRÓ-SPIN) R$45,00SUPORTE DE PAREDE VERTICAL R$20,00SUPORTE DE PAREDE HORIZONTAL R$35,00SUPORTE DE PAREDE VERTICAL PRETO R$28,00ESPUMA PARA GUIDÃO R$11,00BOMBA DE PÉ BETO NYLON R$55,00
Bomba pé nylon, acompanha adaptadores: valvula,bola e infláveisALAVANCA SHIMANO XT DUAL CONTROL EFM 761 R$500,00
Alavanca (par) 27 velocidades com manetes para freios mecânicos, com tecnologia "Dual Control" que chega muito próximo do sistema "STI" das bikes de corrida.
SAPATILHA SHIMANO MTB M 064 R$285,00
Pele sintética e malha flexível, resistentes ao esticar.
Entressola de poliamida reforçada com fibra de de vidro.
Pamilha estruturalmente flexível de acordo com uma ampla variedade de formatos de pé.
Volume + forma para melhor acomodação dos dedos dos pés.
Proteção em borracha oferece excelante tração e conforto para o caminhar.
Indicada para o pedal PD-M530, PD-M520.
Acompanha a base interna da sapatilha.ALAVANCA SHIMANO EF 51 R$130,00
Alavanca shimano 21 vel, ez-fire c/ maneteCAMPAINHA "I LOVE MY BIKE" R$14,00
Em alumínio, nas cores: polido, preto, azul e vermelho.
Fácil fixação no guidão.CAPACETE INFANTIL R$57,00CESTA ALUMÍNIO E NYLON
';
$regex = "/R\$[0-9]{1,},[0-9]{1,}/";
$result = preg_match_all($regex, $content, $rs);
var_dump($rs);
What's going on?
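Try this code:
$content = 'R$13,57 more text R$123,456';
$regex = '/R\$[0-9]{1,},[0-9]{1,}/';
$result = preg_match_all($regex, $content, $rs);
var_dump($rs);
The pattern needs to be in single quotes. Inside double quotes PHP turns \$ into a plain $, which the regex engine then reads as an end-of-string anchor, so nothing matches. Parentheses are only needed if you want to capture part of the match separately. Also note that your code fills $str but passes $content to preg_match_all.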
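If the prices are needed as numbers rather than strings, a sketch along these lines may help. The function name extractPrices is hypothetical, and it assumes every price uses a comma as the decimal separator, as in the sample text.

```php
<?php
// Hypothetical helper: return every "R$123,45" style price found in $text as a float.
function extractPrices(string $text): array
{
    // Single quotes keep \$ intact, so the regex engine sees an escaped, literal dollar sign
    preg_match_all('/R\$(\d+),(\d+)/', $text, $matches, PREG_SET_ORDER);

    $prices = [];
    foreach ($matches as $m) {
        // $m[1] holds the integer part and $m[2] the decimals
        $prices[] = (float) ($m[1] . '.' . $m[2]);
    }
    return $prices;
}
```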
I made an HTML page in PHP to test a multilingual architecture and learn PHP.
A function in the file receives a language variable, paragraph number, sifts through a text file and returns the appropriate string.
I have been testing this locally and made sure it worked properly, but when I try this on my hosting service (making sure all paths and everything are unchanged), the script does not load and then a server error message appears.
I have also made sure that the version of PHP on my LAN and the version on the server match. (PHP Version 5.5.1.2).
What could be causing the program not to work?
The PHP file looks like this:
<!DOCTYPE html>
<html>
<head>
<title>PHP Test</title>
<link style rel=stylesheet href="style.css" type="text/css">
</head>
<?php
function skipLines($numberOfLanguages, $numberOfPairs, $openFile) {
//for the number of pairs given, skip numOfLangs * 2
//2 = one line for the numbers and one for the paragraph itself
for ($x=0; $x < (($numberOfPairs - 1) * $numberOfLanguages * 2); $x = $x + 1) {
//just casts them into the wind... like ashes
fgets($openFile);
}
}
function newRetriever($textPairNumber, $selectedLanguage, $openFile) {
//3 = numberOfLanguages
skipLines(3, $textPairNumber, $openFile);
$languageFound = false;
//test for language and return when true
while ($languageFound == false) {
//get the number and language in text after skipping is done
$currentPlaceArray = explode(" ", fgets($openFile));
$currentLanguage = $currentPlaceArray[1];
if (strcmp($currentLanguage, $selectedLanguage) == 2) {
$languageFound = true;
echo utf8_decode(fgets($openFile));
} else {
fgets($openFile);
}
}
//rewind
rewind($openFile);
}
?>
<body>
<h1>Human Rights Declaration</h1>
<p>This page uses PHP to retrieve different copies of text from each language and places them appropriately.</p>
<p>
<?php
$f = fopen("rawtext4languages.txt", "r");
newRetriever("3", "s", $f);
fclose($f);
?>
</p>
</body>
</html>
The text file has this basic format, divided by line breaks:
1 e
Line in English
1 s
Line in Spanish
1 f
Line in French
1 g
Line in German
2 e
Line in English
(...) And so forth
Also, you can see how it responds on the server here:
http://www.eamonbohan.com/Exercises/PHP/phplesson1.php
Additionally, I got an answer from my hosting service that says this:
Thank you for contacting our Help Desk. During the initial
investigation of your issue we noticed that your script is executing
in 30 seconds timeout. We increased the timeout for it to 120 seconds,
but it did not help. Here is the strace output of your script:
(Interjection: in the stack, there is an extra function I removed from the code above but it basically just builds a header based on the language variable. 0.000060 is the point in the strace output where things get weird).
eamonboh#eamonbohan.com [~/public_html/Exercises/PHP]# strace -s 2048 -r -p
`pgrep -u eamonboh php`
Process 29025 attached - interrupt to quit
0.000000 restart_syscall(<... resuming interrupted call ...>) = 0
4.021723 write(1, " <body>\n <h1>Human Rights Declaration</h1>\n <p>This page uses PHP to retrieve different copies of text from each language and places them appropriately.</p>\n <p>\n ", 182) = 182
0.000244 getcwd("/home/eamonboh/public_html/Exercises/PHP", 4096) = 41
0.000199 lstat("/home/eamonboh/public_html/Exercises/PHP/rawtexttrilingual.txt", {st_mode=S_IFREG|0644, st_size=3262, ...}) = 0
0.000359 open("/home/eamonboh/public_html/Exercises/PHP/rawtexttrilingual.txt", O_RDONLY) = 3
0.000125 fstat(3, {st_mode=S_IFREG|0644, st_size=3262, ...}) = 0
0.000163 lseek(3, 0, SEEK_CUR) = 0
0.000060 write(1, "<h2>", 4) = 4
0.000054 write(1, "1st", 3) = 3
0.000038 write(1, " ", 1) = 1
0.000134 write(1, "Spanis", 6) = 6
0.000038 write(1, "h article</h2>", 14) = 14
0.000060 read(3, "1 e\nAll human beings are born free and equal in dignity and rights. They are endowed with reason and conscience and should act towards one another in a spirit of brotherhood.\n1 s\nTodos los seres humanos nacen libres e iguales en dignidad y derechos y, dotados como est\303\241n de raz\303\263n y conciencia, deben comportarse fraternalmente los unos con los otros.\n1 f\nTous les \303\252tres humains naissent libres et \303\251gaux en dignit\303\251 et en droits. Ils sont dou\303\251s de raison et de conscience et doivent agir les uns envers les autres dans un esprit de fraternit\303\251.\n2 e\nEveryone is entitled to all the rights and freedoms set forth in this Declaration, without distinction of any kind, such as race, colour, sex, language, religion, political or other opinion, national or social origin, property, birth or other status.</p<p>Furthermore, no distinction shall be made on the basis of the political, jurisdictional or international status of the country or territory to which a person belongs, whether it be independent, trust, non-self-governing or under any other limitation of sovereignty.\n2 s\nToda persona tiene los derechos y libertades proclamados en esta Declaraci\303\263n, sin distinci\303\263n alguna de raza, color, sexo, idioma, religi\303\263n, opini\303\263n pol\303\255tica o de cualquier otra \303\255ndole, origen nacional o social, posici\303\263n econ\303\263mica, nacimiento o cualquier otra condici\303\263n.</p><p>Adem\303\241s, no se har\303\241 distinci\303\263n alguna fundada en la condici\303\263n pol\303\255tica, jur\303\255dica o internacional del pa\303\255s o territorio de cuya jurisdicci\303\263n dependa una persona, tanto si se trata de un pa\303\255s independiente, como de un territorio bajo administraci\303\263n fiduciaria, no aut\303\263nomo o sometido a cualquier otra limitaci\303\263n de soberan\303\255a.\n2 f\nChacun peut se pr\303\251valoir de tous les droits et de toutes les libert\303\251s 
proclam\303\251s dans la pr\303\251sente D\303\251claration, sans distinction aucune, notamment de race, de couleur, de sexe, de langue, de religion, d'opinion politique ou de toute autre opinion, d'origine nationale ou sociale, de fortune, de naissance ou de toute autre situation.</p><p>D"..., 8192) = 3262
0.000269 read(3, "", 8192) = 0
120.032177 --- SIGPROF (Profiling timer expired) # 0 (0) ---
0.000170 setitimer(ITIMER_PROF, {it_interval={0, 0}, it_value={120, 0}}, NULL) = 0
0.000134 rt_sigaction(SIGPROF, {0x810d70, [PROF], SA_RESTORER|SA_RESTART, 0x7f3369d02030}, {0x810d70, [PROF], SA_RESTORER|SA_RESTART, 0x7f3369d02030}, 8) = 0
0.000149 rt_sigprocmask(SIG_UNBLOCK, [PROF], NULL, 8) = 0
0.000078 open("php_errorlog", O_WRONLY|O_CREAT|O_APPEND, 0644) = 4
0.000224 write(4, "[12-Mar-2015 15:28:34 America/Chicago] PHP Fatal error: Maximum execution time of 120 seconds exceeded in /home/eamonboh/public_html/Exercises/PHP/phplesson1.php on line 56\n", 174) = 174
0.000102 close(4) = 0
0.000076 write(1, "<br />\n<b>Fatal error</b>: Maximum execution time of 120 seconds exceeded in <b>/home/eamonboh/public_html/Exercises/PHP/phplesson1.php</b> on line <b>56</b><br />\n", 165) = 165
0.000072 chdir("/home/eamonboh/public_html/Exercises/PHP") = 0
0.000048 setitimer(ITIMER_PROF, {it_interval={0, 0}, it_value={0, 0}}, NULL) = 0
0.000088 close(3) = 0
0.000277 --- SIGTERM (Terminated) # 0 (0) ---
Any advice would be greatly appreciated by this webdev newbie.
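The strace output shows the whole file being read, then an empty read (end of file), then nothing until the timeout, which is consistent with the while loop in newRetriever never seeing a match and never checking for end of file. Below is a minimal sketch of the same lookup that stops at end of file and trims line endings before comparing. It is a hypothetical rewrite, not the original function: the name findParagraph is made up, it assumes PHP 7.1 or newer, and it assumes the "number language" / paragraph line pairs described above.

```php
<?php
// Hypothetical helper: return the paragraph for a pair number and language code,
// or null when the file contains no such entry.
function findParagraph($handle, int $pairNumber, string $language): ?string
{
    rewind($handle);

    // fgets returns false at end of file, which ends the loop instead of spinning forever
    while (($header = fgets($handle)) !== false) {
        $text = fgets($handle);
        if ($text === false) {
            break; // header line without a paragraph line after it
        }

        // trim() removes "\n" or "\r\n", so the comparison does not depend on line endings
        $parts = explode(' ', trim($header));
        if (count($parts) === 2 && (int) $parts[0] === $pairNumber && $parts[1] === $language) {
            return rtrim($text, "\r\n");
        }
    }

    return null;
}
```

Comparing with === after trimming avoids relying on a particular non-zero return value from strcmp, which only signals ordering and can differ with the line endings of the file.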