I'm reading an RSS feed and outputting it on a page, and I need to take a substring of the <description> tag and store it as a variable (and then convert to a different time format, but I can figure that out myself). Here's a sample of the data I'm working with:
<description><b>When:</b> Tuesday, November 03, 2015 - 6:00 PM - 8:00 PM<br><b>Where:</b> Adult Literacy Classroom (Lower Level) dedicated in honor of Eleanor Moore<br><br>Clases de preparación para el GED grupos de estudio para ayudar con sus habilidades y preparación para obtener su diploma de equivalencia de escuela. Las clases se llevaran a cabo en español, según la materia (escritura, literatura, estudios sociales, ciencias, matemáticas y la constitución) <br /><br />GED preparation classes Study groups to help build your skills that will prepare you to get your high school equivalency diploma. Classes are taught in Spanish by subject area (writing, literature, social studies, science, math and the constitution)<br /></description>
I've already got everything within the description tag as a variable. I just need to grab the string Tuesday, November 03, 2015 - 6:00 PM - 8:00 PM, but I can't figure out how to do that. I have a feeling PHP's explode might work, but I'm terrible with regex. I'll keep working on it and post back my progress, but any help would be greatly appreciated.
By the way, I'm using this method to get the data: http://bavotasan.com/2010/display-rss-feed-with-php/
Thanks to @Bomberis123, I was able to do exactly what I needed to. My code may be a little messy, but I figured I'd share it for anyone who needs to do something similar:
<?php
$next_up_at_rss_feed = new DOMDocument();
$next_up_at_rss_feed->load("http://host7.evanced.info/waukegan/evanced/eventsxml.asp?ag=&et=&lib=0&nd=30&feedtitle=Waukegan+Public+Library%3CBR%3ECalendar+of+Programs+%26+Events&dm=rss2&LangType=0");
$next_up_at_posts = array();
foreach ($next_up_at_rss_feed->getElementsByTagName("item") as $node) {
$date = preg_match("/((\s)([^\<])+)/", $node->getElementsByTagName("description")->item(0)->nodeValue, $matches, PREG_OFFSET_CAPTURE, 3);
$date = $matches[0][0];
$next_up_at_post = array (
"title" => $node->getElementsByTagName("title")->item(0)->nodeValue,
"date" => $date,
"link" => $node->getElementsByTagName("guid")->item(0)->nodeValue,
);
array_push($next_up_at_posts, $next_up_at_post);
}
$next_up_at_limit = 4;
for ($next_up_at_counter = 0; $next_up_at_counter < $next_up_at_limit; $next_up_at_counter++) {
// get each value from the array;
$title = str_replace(" & ", " &amp; ", $next_up_at_posts[$next_up_at_counter]["title"]);
$link = $next_up_at_posts[$next_up_at_counter]["link"];
$date_raw = $next_up_at_posts[$next_up_at_counter]["date"];
// separate out the date so it can be formatted
$date_array = explode(" - ", $date_raw);
// set up various formats for date
$date = $date_array[0];
$date_time = strtotime($date);
$date_iso = date("Y-m-d", $date_time);
$date_pretty = date("F j", $date_time);
// set up various formats for start time
$start = $date_array[1];
$start_time = strtotime($start);
$start_iso = date("H:i", $start_time);
$start_pretty = date("g:ia", $start_time);
// set up various formats for end time
$end = $date_array[2];
$end_time = strtotime($end);
$end_iso = date("H:i", $end_time);
$end_pretty = date("g:ia", $end_time);
// display the data
echo "<article class='mini-article'><header class='mini-article-header'>";
echo "<h6 class='mini-article-heading'><a href='{$link}' target='_blank'>{$title}</a></h6>";
echo "<p class='mini-article-sub-heading'><a href='{$link}' target='_blank'><time datetime='{$date_iso}T{$start_iso}-06:00'>{$date_pretty}, {$start_pretty} - {$end_pretty}</time></a></p>";
echo "</header></article>";
}
?>
Try this regex. You can run it with PHP's preg_match and read the first group: https://regex101.com/r/fI8nU9/1
$subject = "<description><b>When:</b> Tuesday, November 03, 2015 - 6:00 PM - 8:00 PM<br><b>Where:</b> Adult Literacy Classroom (Lower Level) dedicated in honor of Eleanor Moore<br><br>Clases de preparación para el GED grupos de estudio para ayudar con sus habilidades y preparación para obtener su diploma de equivalencia de escuela. Las clases se llevaran a cabo en español, según la materia (escritura, literatura, estudios sociales, ciencias, matemáticas y la constitución) <br /><br />GED preparation classes Study groups to help build your skills that will prepare you to get your high school equivalency diploma. Classes are taught in Spanish by subject area (writing, literature, social studies, science, math and the constitution)<br /></description>";
$pattern = '/((\s)([^<])+)/';
preg_match($pattern, $subject, $matches, PREG_OFFSET_CAPTURE, 3);
echo $matches[0][0];
Hurray, something I can help with and my first StackOverflow answer! Try something like this. It does use regex but just a couple simple pieces of syntax you can pick up.
$data = "<description><b>When:</b> Tuesday, November 03, 2015 - 6:00 PM - 8:00 PM<br><b>Where:</b> Adult Literacy Classroom (Lower Level) dedicated in honor of Eleanor Moore<br><br>Clases de preparación para el GED grupos de estudio para ayudar con sus habilidades y preparación para obtener su diploma de equivalencia de escuela. Las clases se llevaran a cabo en español, según la materia (escritura, literatura, estudios sociales, ciencias, matemáticas y la constitución) <br /><br />GED preparation classes Study groups to help build your skills that will prepare you to get your high school equivalency diploma. Classes are taught in Spanish by subject area (writing, literature, social studies, science, math and the constitution)<br /></description>";
$regex = "~<description><b>When:</b> (.+?)<br><b>Where:</b>~";
preg_match($regex,$data,$match);
echo $match[1];
I tested this and it works.
In this instance, you just set up $regex with what you expect the raw string to look like, with ~ on either end and (.+?) where the part you want to extract is.
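A hedged sketch of the same idea wrapped in a function. The name `extractWhen` is made up for illustration, and the pattern assumes the value always sits between `<b>When:</b>` and the next `<br`. It also checks the return value of preg_match, so a description without a "When" entry gives null instead of an undefined index notice.

```php
<?php
// Hypothetical helper: returns the text after "<b>When:</b>", or null when it is absent.
function extractWhen($description)
{
    // ~ is the pattern delimiter; (.+?) lazily captures everything up to the next "<br"
    $pattern = '~<b>When:</b>\s*(.+?)\s*<br~i';
    if (preg_match($pattern, $description, $match) === 1) {
        return $match[1];
    }
    return null;
}

$sample = '<b>When:</b> Tuesday, November 03, 2015 - 6:00 PM - 8:00 PM<br><b>Where:</b> Adult Literacy Classroom';
echo extractWhen($sample), "\n";
```

Anchoring on `<br` rather than on `<br><b>Where:</b>` is a design choice here: it keeps working if an item has no "Where" entry.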
I am far from an expert on regexp, but this might be something for the more paranoid programmer:
$s = '<description><b>When:</b> Tuesday, November 03, 2015 - 6:00 PM - 8:00 PM<br><b>Where:</b> Adult Literacy Classroom (Lower Level) dedicated in honor of Eleanor Moore<br><br>Clases de preparación para el GED grupos de estudio para ayudar con sus habilidades y preparación para obtener su diploma de equivalencia de escuela. Las clases se llevaran a cabo en español, según la materia (escritura, literatura, estudios sociales, ciencias, matemáticas y la constitución) <br /><br />GED preparation classes Study groups to help build your skills that will prepare you to get your high school equivalency diploma. Classes are taught in Spanish by subject area (writing, literature, social studies, science, math and the constitution)<br /></description>';
$a = array();
$p = '/(Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday),\s'
.'(January|February|March|April|May|June|July|August|September|October|November|December)\s'
.'[0-3][0-9],\s[1-2][0-9]{3}\s-\s' // Day and year
.'[0-2]?[0-9]:[0-5][0-9]\s[AP]M\s-\s' // Time
.'[0-2]?[0-9]:[0-5][0-9]\s[AP]M/'; // Time
preg_match( $p, $s, $a, PREG_OFFSET_CAPTURE );
echo $a[0][0];
Tested and working...
This will catch a date formatted as described, somewhere in the text.
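Once the string is extracted, it still has to be turned into dates. The following is only a sketch under assumptions: `parseEventTimes` is a hypothetical name, the input is assumed to look exactly like the sample ("Weekday, Month DD, YYYY - H:MM AM - H:MM PM"), and the `America/Chicago` time zone is a guess based on the `-06:00` offset in the asker's code.

```php
<?php
// Hypothetical helper: splits the "When" string into start and end DateTime objects.
// Returns null when the string does not have the expected shape.
function parseEventTimes($when, DateTimeZone $tz)
{
    $pattern = '/^\w+,\s(\w+\s\d{1,2},\s\d{4})\s-\s'
        . '(\d{1,2}:\d{2}\s[AP]M)\s-\s'
        . '(\d{1,2}:\d{2}\s[AP]M)$/';
    if (preg_match($pattern, trim($when), $m) !== 1) {
        return null;
    }
    // "!" resets every field that the format does not mention (seconds, for example)
    $format = '!F j, Y g:i A';
    $start = DateTime::createFromFormat($format, $m[1] . ' ' . $m[2], $tz);
    $end = DateTime::createFromFormat($format, $m[1] . ' ' . $m[3], $tz);
    if ($start === false || $end === false) {
        return null;
    }
    return array('start' => $start, 'end' => $end);
}

$times = parseEventTimes(
    'Tuesday, November 03, 2015 - 6:00 PM - 8:00 PM',
    new DateTimeZone('America/Chicago')
);
echo $times['start']->format('c'), "\n"; // ISO 8601, usable in a <time datetime="..."> attribute
echo $times['end']->format('g:ia'), "\n";
```

Letting DateTime compute the offset avoids hardcoding it, which matters when daylight saving time changes the offset during the year.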
I need to replace the comma "," with "->" as the multi-value separator in the category field of a CSV, using a PHP script.
In the attached CSV sample, the field value in the first row is
;ALIMENTACIÓN,GRANEL,Cereales legumbres y frutos secos,Desayuno y entre horas,Varios;
I need it to be replaced with:
;ALIMENTACIÓN->GRANEL->Cereales legumbres y frutos secos->Desayuno y entre horas->Varios;
I tried this code in my PHP script:
file_put_contents("result.csv",str_replace(",","->",file_get_contents("origin.csv")));
It works, but it replaces commas in all fields. I need to change them only in the category field (CAT); commas in the description field and in the other fields must stay as they are.
Thank you in advance.
Sample of my CSV file (header and 3 rows; I truncated the description field):
id;SKU;DEFINICION;AMPLIACION;DISPONIBLE;IVA;REC_EQ;PVD;PVD_IVA;PVD_IVA_REC;PVP;PESO;EAN;HAY_FOTO;IMAGEN;FECHA_IMAGEN;CAT;MARCA;FRIO;CONGELADO;BIO;APTO_DIABETICO;GLUTEN;HUEVO;LACTOSA;APTO_VEGANO;UNIDAD_MEDIDA;CANTIDAD_MEDIDA;
1003;"01003";"COPOS DE AVENA 1000GR";"Los copos son granos de cereales que han sido aplastados para facilitar su digestion, manteniendo integras las propiedades del grano.<br>
La avena contiene proteínas en abundancia, así como hidratos de carbono, grasas saludables...";59;2;1.40;2.20;2.42;2.45;3.14;1;"8423266500305";1;"https://distribudiet.net/webstore/images/01003.jpg";"04/03/2020 0:00:00";ALIMENTACIÓN,GRANEL,Cereales legumbres y frutos secos,Desayuno y entre horas,Varios;GRANOVITA;0;0;0;0;1;0;0;1;kilo;1
1018;"01018";"MUESLI 10 FRUTAS 1000GR";"Receta de muesli de cereales, diez tipos diferentes de deliciosas frutas desecadas, frutos secos, semillas de girasol, lino y sesamo.<br>
A finales del ...";63;2;1.40;4.66;5.13;5.19;6.65;1;"8423266500060";1;"https://distribudiet.net/webstore/images/01018.jpg";"04/03/2020 0:00:00";ALIMENTACIÓN,GRANEL,Desayuno y entre horas;GRANOVITA;0;0;0;0;;0;0;1;kilo;1
1037;"01037";"AZUCAR CAÑA INTEGRAL 1000GR";"Azúcar moreno de caña integral sin gluten para endulzar todo tipo de postres, batidos o tus recetas favoritas de repostería. 100% natural, obtenido sin procesamiento quimico por ...";17;2;1.40;3.43;3.77;3.82;4.90;1;"8423266500121";1;"https://distribudiet.net/webstore/images/01037.jpg";"04/03/2020 0:00:00";ALIMENTACIÓN,GRANEL,Endulzantes;GRANOVITA;0;0;0;0;0;0;0;1;kilo;1
<?php
$input = 'PRESTA.csv';
$output = 'OUTPUT.csv';
$file = str_replace("<br>\n", "<br>", file_get_contents($input)); // Remove newlines in description
$lines = explode("\r\n", $file); // Split the file into lines
$fp = fopen($output, 'w'); // Open output file for writing
for ($i = 0; $i < count($lines); ++$i) {
    $extract = str_getcsv($lines[$i], ';'); // Split using ; delimiter
    if ($i > 0 && isset($extract[16])) { // Only replace in the "CAT" field (index 16), skipping the header
        $extract[16] = str_replace(',', '->', $extract[16]);
    } elseif ($i > 0) {
        var_dump($extract); // There are some lines that don't have a CAT field
    }
    fputcsv($fp, $extract, ';'); // Write line to file using ; delimiter
}
fclose($fp);
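An alternative sketch, offered with some caution: fgetcsv reads quoted fields that span several lines, so the description does not need to be patched first. The function name `convertCategoryColumn` is hypothetical. It assumes the first row is the header and looks up the `CAT` column by name instead of by position. Note that fputcsv only quotes fields when needed, so the quoting in the output may differ from the input.

```php
<?php
// Hypothetical helper: rewrites "," as "->" in one named column of a ;-separated CSV.
function convertCategoryColumn($inputPath, $outputPath, $column = 'CAT')
{
    $in = fopen($inputPath, 'r');
    $out = fopen($outputPath, 'w');

    // The first row is assumed to be the header
    $header = fgetcsv($in, 0, ';', '"', '\\');
    if ($header === false) {
        fclose($in);
        fclose($out);
        return;
    }
    $index = array_search($column, $header, true);
    fputcsv($out, $header, ';', '"', '\\');

    // fgetcsv keeps a quoted multi-line description together as one field
    while (($row = fgetcsv($in, 0, ';', '"', '\\')) !== false) {
        if ($index !== false && isset($row[$index])) {
            $row[$index] = str_replace(',', '->', $row[$index]);
        }
        fputcsv($out, $row, ';', '"', '\\');
    }

    fclose($in);
    fclose($out);
}

// Usage, with the file names from the answer above:
// convertCategoryColumn('PRESTA.csv', 'OUTPUT.csv');
```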
I am trying to match values such as R$XX,XX (where each X is a digit) using a regular expression, but I can't get it to work.
Below is my code:
$str = 'Indicada para 21 velocidades, corente indexadaCAPACETE MTB MANTUA MUSIC R$140,00PEDIVELA SHIMANO DEORE R$380,00PEDIVELA SHIMANO TX-71 R$99,00CORRENTE SHIMANO HG 40 R$55,00ROLO PARA TREINAMENTO TRANZ-X R$545,00CAPACETE MTB HIGH ONE (PROMOÇÃO) R$85,00BOMBA DE PÉ HIGH ONE COM MANÔMETRO (NYLON) R$89,90CAPA SELIM GEL (PRÓ-SPIN) R$45,00SUPORTE DE PAREDE VERTICAL R$20,00SUPORTE DE PAREDE HORIZONTAL R$35,00SUPORTE DE PAREDE VERTICAL PRETO R$28,00ESPUMA PARA GUIDÃO R$11,00BOMBA DE PÉ BETO NYLON R$55,00
Bomba pé nylon, acompanha adaptadores: valvula,bola e infláveisALAVANCA SHIMANO XT DUAL CONTROL EFM 761 R$500,00
Alavanca (par) 27 velocidades com manetes para freios mecânicos, com tecnologia "Dual Control" que chega muito próximo do sistema "STI" das bikes de corrida.
SAPATILHA SHIMANO MTB M 064 R$285,00
Pele sintética e malha flexível, resistentes ao esticar.
Entressola de poliamida reforçada com fibra de de vidro.
Pamilha estruturalmente flexível de acordo com uma ampla variedade de formatos de pé.
Volume + forma para melhor acomodação dos dedos dos pés.
Proteção em borracha oferece excelante tração e conforto para o caminhar.
Indicada para o pedal PD-M530, PD-M520.
Acompanha a base interna da sapatilha.ALAVANCA SHIMANO EF 51 R$130,00
Alavanca shimano 21 vel, ez-fire c/ maneteCAMPAINHA "I LOVE MY BIKE" R$14,00
Em alumínio, nas cores: polido, preto, azul e vermelho.
Fácil fixação no guidão.CAPACETE INFANTIL R$57,00CESTA ALUMÍNIO E NYLON
';
$regex = "/R\$[0-9]{1,},[0-9]{1,}/";
$result = preg_match_all($regex, $content, $rs);
var_dump($rs);
What's going on?
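Try this code:
$content = "R$13,57 more text R$123,456";
$regex = '/R\$[0-9]{1,},[0-9]{1,}/';
$result = preg_match_all($regex, $content, $rs);
var_dump($rs);
Inside a double-quoted PHP string, \$ is turned into a plain $ before the pattern reaches the regex engine, which then treats it as an end-of-line anchor. Put the pattern in single quotes so the backslash survives and escapes the dollar sign. Also note that your code runs the match against $content, while your text is stored in $str.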
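For pulling every price out of a product list like the one in the question, a sketch along these lines might help. `extractPrices` is a hypothetical name. The pattern is written in single quotes so that `\$` reaches the regex engine as an escaped dollar sign, and the two capture groups are joined with a dot to get a float. It assumes prices have no thousands separator.

```php
<?php
// Hypothetical helper: returns every "R$<digits>,<digits>" price as a float.
function extractPrices($text)
{
    // Single quotes: the backslash survives, so \$ matches a literal dollar sign
    preg_match_all('/R\$([0-9]+),([0-9]+)/', $text, $matches, PREG_SET_ORDER);

    $prices = array();
    foreach ($matches as $m) {
        // $m[1] is the whole part, $m[2] the decimals after the comma
        $prices[] = (float) ($m[1] . '.' . $m[2]);
    }
    return $prices;
}

print_r(extractPrices('CAPACETE MTB MANTUA MUSIC R$140,00PEDIVELA SHIMANO DEORE R$380,00'));
```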
I made an HTML page in PHP to test a multilingual architecture and learn PHP.
A function in the file receives a language variable, paragraph number, sifts through a text file and returns the appropriate string.
I have been testing this locally and made sure it worked properly, but when I try this on my hosting service (making sure all paths and everything are unchanged), the script does not load and then a server error message appears.
I have also made sure that the version of PHP on my LAN and the version on the server match. (PHP Version 5.5.1.2).
What could be causing the program not to work?
The PHP file looks like this:
<!DOCTYPE html>
<html>
<head>
<title>PHP Test</title>
<link style rel=stylesheet href="style.css" type="text/css">
</head>
<?php
function skipLines($numberOfLanguages, $numberOfPairs, $openFile) {
//for the number of pairs given, skip numOfLangs * 2
//2 = one line for the numbers and one for the paragraph itself
for ($x=0; $x < (($numberOfPairs - 1) * $numberOfLanguages * 2); $x = $x + 1) {
//just casts them into the wind... like ashes
fgets($openFile);
}
}
function newRetriever($textPairNumber, $selectedLanguage, $openFile) {
//3 = numberOfLanguages
skipLines(3, $textPairNumber, $openFile);
$languageFound = false;
//test for language and return when true
while ($languageFound == false) {
//get the number and language in text after skipping is done
$currentPlaceArray = explode(" ", fgets($openFile));
$currentLanguage = $currentPlaceArray[1];
if (strcmp($currentLanguage, $selectedLanguage) == 2) {
$languageFound = true;
echo utf8_decode(fgets($openFile));
} else {
fgets($openFile);
}
}
//rewind
rewind($openFile);
}
?>
<body>
<h1>Human Rights Declaration</h1>
<p>This page uses PHP to retrieve different copies of text from each language and places them appropriately.</p>
<p>
<?php
$f = fopen("rawtext4languages.txt", "r");
newRetriever("3", "s", $f);
fclose($f);
?>
</p>
</body>
</html>
The text file has this basic format, divided by line breaks:
1 e
Line in English
1 s
Line in Spanish
1 f
Line in French
1 g
Line in German
2 e
Line in English
(...) And so forth
Also, you can see how it responds on the server here:
http://www.eamonbohan.com/Exercises/PHP/phplesson1.php
Additionally, I got an answer from my hosting service that says this:
Thank you for contacting our Help Desk. During the initial
investigation of your issue we noticed that your script is executing
in 30 seconds timeout. We increased the timeout for it to 120 seconds,
but it did not help. Here is the strace output of your script:
(Interjection: in the stack, there is an extra function I removed from the code above but it basically just builds a header based on the language variable. 0.000060 is the point in the strace output where things get weird).
eamonboh@eamonbohan.com [~/public_html/Exercises/PHP]# strace -s 2048 -r -p `pgrep -u eamonboh php`
Process 29025 attached - interrupt to quit
0.000000 restart_syscall(<... resuming interrupted call ...>) = 0
4.021723 write(1, " <body>\n <h1>Human Rights Declaration</h1>\n <p>This page uses PHP to retrieve different copies of text from each language and places them appropriately.</p>\n <p>\n ", 182) = 182
0.000244 getcwd("/home/eamonboh/public_html/Exercises/PHP", 4096) = 41
0.000199 lstat("/home/eamonboh/public_html/Exercises/PHP/rawtexttrilingual.txt", {st_mode=S_IFREG|0644, st_size=3262, ...}) = 0
0.000359 open("/home/eamonboh/public_html/Exercises/PHP/rawtexttrilingual.txt", O_RDONLY) = 3
0.000125 fstat(3, {st_mode=S_IFREG|0644, st_size=3262, ...}) = 0
0.000163 lseek(3, 0, SEEK_CUR) = 0
0.000060 write(1, "<h2>", 4) = 4
0.000054 write(1, "1st", 3) = 3
0.000038 write(1, " ", 1) = 1
0.000134 write(1, "Spanis", 6) = 6
0.000038 write(1, "h article</h2>", 14) = 14
0.000060 read(3, "1 e\nAll human beings are born free and equal in dignity and rights. They are endowed with reason and conscience and should act towards one another in a spirit of brotherhood.\n1 s\nTodos los seres humanos nacen libres e iguales en dignidad y derechos y, dotados como est\303\241n de raz\303\263n y conciencia, deben comportarse fraternalmente los unos con los otros.\n1 f\nTous les \303\252tres humains naissent libres et \303\251gaux en dignit\303\251 et en droits. Ils sont dou\303\251s de raison et de conscience et doivent agir les uns envers les autres dans un esprit de fraternit\303\251.\n2 e\nEveryone is entitled to all the rights and freedoms set forth in this Declaration, without distinction of any kind, such as race, colour, sex, language, religion, political or other opinion, national or social origin, property, birth or other status.</p<p>Furthermore, no distinction shall be made on the basis of the political, jurisdictional or international status of the country or territory to which a person belongs, whether it be independent, trust, non-self-governing or under any other limitation of sovereignty.\n2 s\nToda persona tiene los derechos y libertades proclamados en esta Declaraci\303\263n, sin distinci\303\263n alguna de raza, color, sexo, idioma, religi\303\263n, opini\303\263n pol\303\255tica o de cualquier otra \303\255ndole, origen nacional o social, posici\303\263n econ\303\263mica, nacimiento o cualquier otra condici\303\263n.</p><p>Adem\303\241s, no se har\303\241 distinci\303\263n alguna fundada en la condici\303\263n pol\303\255tica, jur\303\255dica o internacional del pa\303\255s o territorio de cuya jurisdicci\303\263n dependa una persona, tanto si se trata de un pa\303\255s independiente, como de un territorio bajo administraci\303\263n fiduciaria, no aut\303\263nomo o sometido a cualquier otra limitaci\303\263n de soberan\303\255a.\n2 f\nChacun peut se pr\303\251valoir de tous les droits et de toutes les libert\303\251s 
proclam\303\251s dans la pr\303\251sente D\303\251claration, sans distinction aucune, notamment de race, de couleur, de sexe, de langue, de religion, d'opinion politique ou de toute autre opinion, d'origine nationale ou sociale, de fortune, de naissance ou de toute autre situation.</p><p>D"..., 8192) = 3262
0.000269 read(3, "", 8192) = 0
120.032177 --- SIGPROF (Profiling timer expired) # 0 (0) ---
0.000170 setitimer(ITIMER_PROF, {it_interval={0, 0}, it_value={120, 0}}, NULL) = 0
0.000134 rt_sigaction(SIGPROF, {0x810d70, [PROF], SA_RESTORER|SA_RESTART, 0x7f3369d02030}, {0x810d70, [PROF], SA_RESTORER|SA_RESTART, 0x7f3369d02030}, 8) = 0
0.000149 rt_sigprocmask(SIG_UNBLOCK, [PROF], NULL, 8) = 0
0.000078 open("php_errorlog", O_WRONLY|O_CREAT|O_APPEND, 0644) = 4
0.000224 write(4, "[12-Mar-2015 15:28:34 America/Chicago] PHP Fatal error: Maximum execution time of 120 seconds exceeded in /home/eamonboh/public_html/Exercises/PHP/phplesson1.php on line 56\n", 174) = 174
0.000102 close(4) = 0
0.000076 write(1, "<br />\n<b>Fatal error</b>: Maximum execution time of 120 seconds exceeded in <b>/home/eamonboh/public_html/Exercises/PHP/phplesson1.php</b> on line <b>56</b><br />\n", 165) = 165
0.000072 chdir("/home/eamonboh/public_html/Exercises/PHP") = 0
0.000048 setitimer(ITIMER_PROF, {it_interval={0, 0}, it_value={0, 0}}, NULL) = 0
0.000088 close(3) = 0
0.000277 --- SIGTERM (Terminated) # 0 (0) ---
Any advice would be greatly appreciated by this webdev newbie.
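The strace output shows the file being read to the end (`read` returns 0) and then nothing until the timeout, which would fit the `while` loop in `newRetriever` spinning after `fgets` starts returning false. One possible reason, offered as a guess: `fgets` keeps the line ending, and `strcmp(...) == 2` only holds for one particular line-ending length, so a file saved with different line endings on the server may never match. A minimal sketch of a more defensive lookup follows. `findParagraph` is a hypothetical name, and it assumes the label/paragraph line pairs shown above.

```php
<?php
// Hypothetical helper: returns the paragraph for the given pair number and
// language code, or null when the file ends without a match.
function findParagraph($handle, $pairNumber, $language)
{
    rewind($handle);
    // Stop at end of file instead of looping forever
    while (($label = fgets($handle)) !== false) {
        $text = fgets($handle);
        if ($text === false) {
            break; // label line without a paragraph line
        }
        // trim() removes "\n" or "\r\n", so the comparison ignores line endings
        $parts = explode(' ', trim($label));
        if (count($parts) === 2
            && (int) $parts[0] === (int) $pairNumber
            && $parts[1] === $language) {
            return rtrim($text, "\r\n");
        }
    }
    return null;
}

// Demo with an in-memory file that mixes both line-ending styles
$demo = fopen('php://memory', 'w+');
fwrite($demo, "1 e\r\nLine in English\r\n1 s\nLine in Spanish\n");
echo findParagraph($demo, 1, 's'), "\n";
```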
FIXED!
File encoding is UTF-16LE, changed to UTF-8 in PhpStorm and it behaves.
===========================================================
I'm reading a text file in PHP and want to manipulate the contents, but as soon as I touch the contents in any way, it 'breaks'.
If I read the file and then echo it, the text is displayed, but any other operation will not work.
$contents = file_get_contents($file);
echo $contents; // works
$contents .= 'a longer test' . $contents;
echo $contents;
My ultimate goal is to run some regexes on the contents before dumping them into a database, but I need to be able to work with the text first.
If it makes any difference I am using Laravel. I tried File::get($file) but have the same outcome.
EDIT to show output - Unicode issue?
//// first echo
POUR L ’É T U DE DE L ’H IST O IR E ET DE LA LANGUE DU PAYS, LA CONSERVATION DES A N TIQ U ITÉS DE L ’IL E , ET LA PUBLICATION DE DOCUMENTS HISTORIQUES, ETC., ETC. FONDÉE LE 28 JANVIER, 1873. DIXIÈME BULLETIN ANNUEL. : C. LE F E U VRE, IM PR IM E U R -É D IT EU R D E LA SOCIÉTÉ, BERESFORD LIBRARY , ST. -H ÉLIE R . 1885. = Page 1 =
// Second echo
POUR L ’É T U DE DE L ’H IST O IR E ET DE LA LANGUE DU PAYS, LA CONSERVATION DES A N TIQ U ITÉS DE L ’IL E , ET LA PUBLICATION DE DOCUMENTS HISTORIQUES, ETC., ETC. FONDÉE LE 28 JANVIER, 1873. DIXIÈME BULLETIN ANNUEL. : C. LE F E U VRE, IM PR IM E U R -É D IT EU R D E LA SOCIÉTÉ, BERESFORD LIBRARY , ST. -H ÉLIE R . 1885. = Page 1 =潬杮牥琠獥エഀ匀伀䌀䤀䔀吀䔀 䨀䔀刀匀䤀䄀䤀匀䔀ഀ倀伀唀刀 䰀 ᤀ줠 吀 唀 䐀䔀 䐀䔀 䰀 ᤀ䠠 䤀匀吀 伀 䤀刀 䔀 䔀吀 䐀䔀 䰀䄀 䰀䄀一䜀唀䔀 䐀唀 倀䄀夀匀Ⰰഀ䰀䄀 䌀伀一匀䔀刀嘀䄀吀䤀伀一 䐀䔀匀 䄀 一 吀䤀儀 唀 䤀吀준匀 䐀䔀 䰀 ᤀ䤠䰀 䔀 Ⰰ 䔀吀 䰀䄀 倀唀䈀䰀䤀䌀䄀吀䤀伀一 ഀ䐀䔀 䐀伀䌀唀䴀䔀一吀匀 䠀䤀匀吀伀刀䤀儀唀䔀匀Ⰰ 䔀吀䌀⸀Ⰰ 䔀吀䌀⸀ഀ䘀伀一䐀준䔀 䰀䔀 ㈀㠀 䨀䄀一嘀䤀䔀刀Ⰰ 㠀㜀㌀⸀ഀ䐀䤀堀䤀저䴀䔀 䈀唀䰀䰀䔀吀䤀一 䄀一一唀䔀䰀⸀ഀ㨀ഀ䌀⸀ 䰀䔀 䘀 䔀 唀 嘀刀䔀Ⰰ 䤀䴀 倀刀 䤀䴀 䔀 唀 刀 ⴀ준 䐀 䤀吀 䔀唀 刀 䐀 䔀 䰀䄀 匀伀䌀䤀준吀준Ⰰഀ䈀䔀刀䔀匀䘀伀刀䐀 䰀䤀䈀刀䄀刀夀 Ⰰ 匀吀⸀ ⴀ䠀 준䰀䤀䔀 刀 ⸀ഀ㠀㠀㔀⸀ഀ 㴀 倀愀最攀 㴀
If I put the first string into a HEREDOC, everything works fine, so might it be something with the txt file? It's text extracted via OCR from an old PDF.
Full code
public function import()
{
// get all the files
$files = File::files('../import');
foreach ($files as $file) {
// load text file contents
$contents = file_get_contents($file);
echo $contents; // as expected
$contents .= 'a longer test' . $contents;
echo $contents; // weird stuff
// test txt file contents inline
$contents2 = <<<EOD
SOCIETE JERSIAISE
POUR L ’É T U DE DE L ’H IST O IR E ET DE LA LANGUE DU PAYS,
LA CONSERVATION DES A N TIQ U ITÉS DE L ’IL E , ET LA PUBLICATION
DE DOCUMENTS HISTORIQUES, ETC., ETC.
FONDÉE LE 28 JANVIER, 1873.
DIXIÈME BULLETIN ANNUEL.
:
C. LE F E U VRE, IM PR IM E U R -É D IT EU R D E LA SOCIÉTÉ,
BERESFORD LIBRARY , ST. -H ÉLIE R .
1885.
= Page 1 =
EOD;
echo $contents2; // works
$contents2 .= 'a longer test' . $contents2;
echo $contents2; // prints as expected
}
FIXED!
File encoding is UTF-16LE, changed to UTF-8 in PhpStorm and it behaves.
Or in code:
foreach ($files as $file) {
// load text file contents
$contents = file_get_contents($file);
// fix encoding
$contents = mb_convert_encoding($contents, 'UTF-8', 'UTF-16');
echo $contents;
.....
$data_to_write = 'test';
$file_handle = fopen($file, 'a');
fwrite($file_handle, $data_to_write);
fclose($file_handle);
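If it is not known in advance which files are UTF-16, checking for a byte order mark before converting is one option. This is a sketch, not a full encoding detector: `toUtf8` is a hypothetical name, it needs the mbstring extension, and it assumes any file without a UTF-16 BOM is already usable as it is.

```php
<?php
// Hypothetical helper: converts UTF-16 text that starts with a byte order mark to UTF-8.
function toUtf8($bytes)
{
    $bom = substr($bytes, 0, 2);
    if ($bom === "\xFF\xFE") {
        // Little-endian BOM: drop it and convert the rest
        return mb_convert_encoding(substr($bytes, 2), 'UTF-8', 'UTF-16LE');
    }
    if ($bom === "\xFE\xFF") {
        // Big-endian BOM
        return mb_convert_encoding(substr($bytes, 2), 'UTF-8', 'UTF-16BE');
    }
    return $bytes; // no UTF-16 BOM found: returned unchanged (assumption)
}

// Build a small UTF-16LE sample in memory to show the round trip
$utf16 = "\xFF\xFE" . mb_convert_encoding('SOCIETE JERSIAISE', 'UTF-16LE', 'UTF-8');
echo toUtf8($utf16) . ' (converted)', "\n";
```

After the conversion, string concatenation and regexes operate on UTF-8 bytes, which is what the rest of the script and the browser expect.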
Hi, I'm trying to read a file in descending order.
I want to echo the last 10 words from the file.
Expected result:
brian tracy, brian tracy, der reiche
sack, der reiche sack, der reiche
sack, electrical machines by charles s
siskind second e, test de politica
fiscal, gigantomastia,gigantomastia,,
a,
File I want to read:
find a doctor, Find a Doctor,technique with fingers of right hand over left ven, la empresa adaptable, la empresa adaptable en la era de la informaci n, la pobre mia, probabilidad estadistica, crack beam, dwarf rabbit, probabilidad estadistica, kamsutra bangla, power of the dog, power of the dog, prinsip kerja uji ninhidrin, letramania 3, gre, gre, prinsip kerja uji ninhidrin, prinsip kerja uji ninhidrin, artificial intelligence a modern approach, configuring sap erp financials and controlling, gas spring, imperio carolingio, blue collar man, caligrafia, wonderlic, women and weight loss tamasha, women and the weight loss tamasha, vivir amar y aprender leo buscaglia, vivir amar y aprender leo buscaglia, wonderlic, plan de manejo ambiental, calibra o de manometros, curso de carpinteria, secreto industrial, secreto industrial, deneme, elementos secundarios de un triangulo, imperio carolingio, caligrafia, construir en lo construido, plan de manejo ambiental, lisboa, lisboa secreta, modelo de contrato secreto industrial, el conde de montecristo, metode titrasi formol, metode titrasi formol, probabilidad estadistica, probabilidad estadistica, history of islam akbar shah najeebabadi, caligrafia, caligrafia, conversacion en la catedral, brian tracy, brian tracy, der reiche sack, der reiche sack, der reiche sack, electrical machines by charles s siskind second e, test de politica fiscal, gigantomastia,gigantomastia, Find a Doctor, Find a Doctor,technique with fingers of right hand over left ven, la empresa adaptable, la empresa adaptable en la era de la informaci n, la pobre mia, probabilidad estadistica, crack beam, dwarf rabbit, probabilidad estadistica, kamsutra bangla, power of the dog, power of the dog, prinsip kerja uji ninhidrin, letramania 3, gre, gre, prinsip kerja uji ninhidrin, prinsip kerja uji ninhidrin, artificial intelligence a modern approach, configuring sap erp financials and controlling, gas spring, imperio carolingio, blue collar man, caligrafia, 
wonderlic, women and weight loss tamasha, women and the weight loss tamasha, vivir amar y aprender leo buscaglia, vivir amar y aprender leo buscaglia, wonderlic, plan de manejo ambiental, calibra o de manometros, curso de carpinteria, secreto industrial, secreto industrial, deneme, elementos secundarios de un triangulo, imperio carolingio, caligrafia, construir en lo construido, plan de manejo ambiental, lisboa, lisboa secreta, modelo de contrato secreto industrial, el conde de montecristo, metode titrasi formol, metode titrasi formol, probabilidad estadistica, probabilidad estadistica, history of islam akbar shah najeebabadi, caligrafia, caligrafia, conversacion en la catedral, brian tracy, brian tracy, der reiche sack, der reiche sack, der reiche sack, electrical machines by charles s siskind second e, test de politica fiscal, gigantomastia,gigantomastia,, a,
If the file will not be too big, you can simply read it all and then remove the data you don't need :
$content = file_get_contents($filename); // $filename is the file to read
$chunks = explode($delimiter, $content); // $delimiter is your word separator
$chunks = array_slice($chunks, -$n); // $n is the number of words to keep from the end of the file
// NOTE : -$n !
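The expected result in the question appears to skip the empty entries produced by the doubled and trailing commas, so a variant that trims and filters the tokens may be closer to what is wanted. A small sketch, with `lastTokens` as a hypothetical name and `$n` assumed to be greater than zero:

```php
<?php
// Hypothetical helper: returns the last $n non-empty, trimmed tokens of $content.
function lastTokens($content, $n, $delimiter = ',')
{
    $tokens = array_map('trim', explode($delimiter, $content));
    // Drop empty entries such as the ones produced by ",," or a trailing ","
    $tokens = array_values(array_filter($tokens, function ($token) {
        return $token !== '';
    }));
    // A negative offset counts from the end of the array
    return array_slice($tokens, -$n);
}

echo implode(', ', lastTokens('test de politica fiscal, gigantomastia,gigantomastia,, a,', 3)), "\n";
```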
If the file will grow beyond reasonable size to be loaded into memory, you may read it in chunks. Something like (untested):
function getLastTokens($filename, $n, $delimiter) {
    $chunksize = 4096; // 4K chunk
    $offset = filesize($filename);
    $buffer = '';
    $fp = fopen($filename, 'r');
    // read backwards until the buffer holds at least $n delimiters,
    // which guarantees that the last $n tokens are complete
    while ($offset > 0 && substr_count($buffer, $delimiter) < $n) {
        $length = min($chunksize, $offset); // can't seek before first byte
        $offset -= $length;
        fseek($fp, $offset);
        $buffer = fread($fp, $length) . $buffer; // prepend the previous chunk
    }
    fclose($fp);
    return array_slice(explode($delimiter, $buffer), -$n);
}
$file = "File contents"; //File get contents or anything else here.
$array = explode(",", $file);
$array = array_slice($array, -10, 10); //Starting from Last 10th element, get Ten elements.
$string = implode(", ", $array);
echo $string;
Edit:
Changed the implementation to remove the loop and the count etc.
$text = file_get_contents($file); //get contents of file
$words = explode(',', $text); //split into array
$length = count($words);
if ($length < 10) {
    $lastWords = $words; //shorter than 10 so return all
} else {
    $lastWords = array();
    for ($i = $length - 10; $i < $length; $i++) { //loop through last 10 words
        $lastWords[] = $words[$i]; //add to array
    }
}
$str = implode(',', $lastWords); //change array back into a string
echo $str;