quick question:
So I've written a bunch of crawling code, but one of the websites I'm crawling didn't include line breaks between tags. Since I've already written a bunch of code, I threw in a quick hack with preg_replace and continued from where I left off. Problem is, fopen() doesn't work on strings...
$string = file_get_contents($url);
$string = preg_replace("/>(^\n|\n+)?</", ">\n<", $string);
$file = fopen($string, 'r');
while(($buffer = fgets($file)) != false) { ... }
So, without rewriting my loop, how do I approach this?
Thanks for the help!
Rob
It seems you're looking to iterate over each line of the fetched page. You can do this after you've pulled it into a string by using the explode() function to break the string into an array at the line breaks:
foreach (explode("\n", $string) as $line) {
....
}
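If the existing fgets() loop has to stay as it is, another option is to put the string into an in-memory stream and read from that. The following is only a sketch: the helper name string_to_stream and the sample markup are made up for illustration, and the regex is a simplified stand-in for the one in the question (it also collapses spaces between tags).

```php
<?php
// Hypothetical helper: copies a string into a temporary read/write stream
// so that fgets() can be used on it like on an opened file.
function string_to_stream(string $contents)
{
    $stream = fopen('php://temp', 'r+');
    fwrite($stream, $contents);
    rewind($stream); // move back to the start before reading
    return $stream;
}

// Assumed sample markup standing in for the fetched page.
$string = "<html><body><p>Hello</p></body></html>";

// Put every tag on its own line (simplified version of the question's regex).
$string = preg_replace('/>\s*</', ">\n<", $string);

$file = string_to_stream($string);
$lines = [];
while (($buffer = fgets($file)) !== false) {
    $lines[] = rtrim($buffer, "\n");
}
fclose($file);

print_r($lines);
```

The loop body is unchanged from the file-based version; only the way $file is created differs.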
Related
I'm afraid this question won't be too popular and may well be downvoted, but I have searched and searched on this site (and others too) and I can't find a solution.
I have a text file with, say, this content:
I need to remove blank lines while keeping the existing carriage returns, like this:
The code I'm using:
if ($file = fopen("file.txt", "r")) {
    while (!feof($file)) {
        $line = fgets($file);
        echo str_replace("\r\n", "", $line);
    }
    fclose($file);
}
As stated above, I have tried with functions like str_replace, preg_replace, and \r\n or \n\n, etc. as characters to replace, but with all of them I'm getting this result:
The blank line is removed as desired, but the carriage returns are removed too, which is not acceptable in my case.
So I wonder if anyone could suggest a way to reach my goal :) Thanks.
There are bound to be duplicates for the replacing, but simply read into an array and skip the empty lines:
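$lines = file("file.txt", FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES); // FILE_SKIP_EMPTY_LINES only has an effect together with FILE_IGNORE_NEW_LINES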
Then loop the array to echo the lines or implode() to get it back into a string.
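As a rough sketch of that approach, assuming a small sample file with "\n" line endings (the file contents and variable names here are invented for the example):

```php
<?php
// Assumed sample data: three lines separated by blank lines.
$path = tempnam(sys_get_temp_dir(), 'lines');
file_put_contents($path, "first\n\nsecond\n\n\nthird\n");

// Read into an array, dropping the line endings and the empty lines.
$lines = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
unlink($path);

// Join the remaining lines again, one per line.
$result = implode("\n", $lines);
echo $result, "\n";
```

Lines that contain only spaces are not empty as far as file() is concerned; filtering with trim($line) !== '' would be needed to drop those too.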
#Abracadaver #nogad #GCRdev
I've been trying your methods, but they didn't work for me. I finally found a way (thanks to https://stackoverflow.com/a/20719126), which I'll leave here in case it is useful to someone:
$file = fopen("file.txt", "r");
while ($line = fgets($file)) {
    $tempData = nl2br($line);
    $tempData = explode("<br />", $tempData);
    foreach ($tempData as $val) {
        if (trim($val) != '') {
            echo $val . "<br />";
        }
    }
}
fclose($file);
I need to generate random sentences from a dictionary. The dictionary has one word per line. I first load it into an array, then pick some words at random in a for loop. When I output the result, it shows up on one line in the browser, but in the page source every word is on its own line. I then need to create a set of XML files for a search engine, and these newlines get indexed as \n\r and appear in the XML source as a symbol.
So my question is: how can I build a sentence that is on one line in the source code too? Thanks.
Here is a piece of my code. It doesn't include the random selection; I only wrote the for loop for illustration.
$file = fopen("test.txt", "r");
$data = array();
while (($buffer = fgets($file)) !== false) {
$data[] = $buffer;
}
$sentence = '';
for ($i=0;$i<10;$i++){
$sentence = $sentence . $data[$i];
}
Use trim function to filter new line characters.
In your code use:
$data[] = trim($buffer);
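Put together, this might look roughly like the following. The sample dictionary and the choice of three words are assumptions made for the example; array_rand() is one possible way to do the random pick that the question left out.

```php
<?php
// Assumed sample dictionary: one word per line, as described in the question.
$path = tempnam(sys_get_temp_dir(), 'dict');
file_put_contents($path, "alpha\nbravo\ncharlie\ndelta\n");

$data = [];
$file = fopen($path, 'r');
while (($buffer = fgets($file)) !== false) {
    $word = trim($buffer); // strips "\n" as well as "\r\n"
    if ($word !== '') {
        $data[] = $word;
    }
}
fclose($file);
unlink($path);

// Pick three distinct words at random and join them with single spaces.
$words = [];
foreach (array_rand($data, 3) as $key) {
    $words[] = $data[$key];
}
$sentence = implode(' ', $words);

echo $sentence, "\n";
```

Because each word is trimmed before it is stored, the resulting sentence contains no line breaks in the page source either.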
Present data in file ---
Kortrijk]]||75,592||74,790||73,777VWVVLG
Hasselt]]||65,503||68,085||70,584VLIVLG
Sint-Niklaas]]||68,277||68,290||70,016VOVVLG
Ostend]]||69,039||67,279||69,115VWVVLG
|22Tournai]]||67,291||67,379||67,844WHTWAL
|23Genk]]||61,532||62,842||64,095VLIVLG
|24Seraing]]||62,832||60,557||61,237WLGWAL
This is the data set I have in my wiki.txt file. I need to remove everything from "]]||" onwards on every line.
Required data after the code runs:
Kortrijk
Hasselt
Sint-Niklaas
Ostend
|22Tournai
|23Genk
|24Seraing
This is the code I came across, but I have no idea how to use it in my code. I read up on preg_replace, regular expressions, etc., but it all goes over my head. Please help, and please point me to any tutorial I can follow for this kind of work (especially regular expressions for a novice).
$file="wiki.txt";
file_put_contents($file,str_replace('find','replace',file_get_contents($file)));
try:
$arr = file("wiki.txt"); //will give you contents as array
$newContent = "";
foreach($arr as $key => $val) {
$newContent .=substr_replace($val, '', strpos($val, ']'))."\n";
}
//add changed content back to file
file_put_contents("wiki.txt", $newContent);
//result is:
Kortrijk
Hasselt
Sint-Niklaas
Ostend
|22Tournai
|23Genk
|24Seraing
preg_replace('/]]\|\|.*$/m', '', $fnames);
Matches ']]||' literally, then all characters (.*) until the end of a line '$' and replaces them with ''.
Also: Check out this tutorial on RegExp
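Applied to a few lines of the sample data, the pattern above could be used like this. This is a small sketch: the variable $cleaned is introduced for illustration, and the closing brackets are escaped here only for readability.

```php
<?php
// A few lines taken from the question's sample data.
$fnames = "Kortrijk]]||75,592||74,790||73,777VWVVLG\n"
        . "Hasselt]]||65,503||68,085||70,584VLIVLG\n"
        . "|22Tournai]]||67,291||67,379||67,844WHTWAL";

// The m modifier makes $ match at the end of every line, not only at the
// end of the whole string, so each line is cut at its own "]]||".
$cleaned = preg_replace('/\]\]\|\|.*$/m', '', $fnames);

echo $cleaned, "\n";
```

Note that preg_replace() returns the new string rather than changing $fnames in place, so the result has to be assigned.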
Rather than "change" the file, you probably want to open the file, read the contents, then write the parts you want to a new file. If everything goes as planned, replace the old file with the new file. Much safer that way.
<?php
$lines = file("input.txt");
$output = "";
foreach ($lines as $line) {
$output .= substr($line, 0, strpos($line, "]")) . "\n";
}
file_put_contents("output.txt", $output);
Lots of ways to solve this.
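The "write to a new file, then swap" idea described above could be sketched as follows. The function name strip_after_marker, the ".tmp" suffix, and the sample file are all assumptions for the example, not part of the original answers.

```php
<?php
// Hypothetical helper: cuts each line at the first occurrence of $marker,
// writes the result to a temporary file, then replaces the original.
function strip_after_marker(string $path, string $marker = ']]||'): void
{
    $lines = file($path, FILE_IGNORE_NEW_LINES);
    if ($lines === false) {
        throw new RuntimeException("Could not read $path");
    }

    $output = '';
    foreach ($lines as $line) {
        $pos = strpos($line, $marker);
        // Keep the line unchanged when the marker is missing.
        $output .= ($pos === false ? $line : substr($line, 0, $pos)) . "\n";
    }

    $tmp = $path . '.tmp';
    if (file_put_contents($tmp, $output) === false) {
        throw new RuntimeException("Could not write $tmp");
    }
    // Only replace the original once the new file was written successfully.
    if (!rename($tmp, $path)) {
        throw new RuntimeException("Could not replace $path");
    }
}

// Usage with a throwaway copy of two sample lines.
$path = tempnam(sys_get_temp_dir(), 'wiki');
file_put_contents($path, "Kortrijk]]||75,592\n|23Genk]]||61,532\n");
strip_after_marker($path);
$contents = file_get_contents($path);
unlink($path);

echo $contents;
```

If anything fails before the rename, the original file is left untouched.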
I'm trying to parse each IP line from the following file (loaded from the web). I'm going to store the values in a database, so I'm looking to put them into an array.
The file it's loading has the following source:
12174 in store for taking<hr>221.223.89.99:8909
<br>123.116.123.71:8909
<br>221.10.162.40:8909
<br>222.135.5.38:8909
<br>120.87.121.122:8909
<br>118.77.254.242:8909
<br>218.6.19.14:8909
<br>113.64.124.85:8909
<br>123.118.243.239:8909
<br>124.205.154.181:8909
<br>124.117.13.116:8909
<br>183.7.223.212:8909
<br>112.239.205.245:8909
<br>118.116.235.156:8909
<br>27.16.28.174:8909
<br>222.221.142.59:8909
<br>114.86.40.251:8909
<br>111.225.105.142:8909
<br>115.56.86.62:8909
<br>59.51.108.142:8909
<br>222.219.39.194:8909
<br>114.244.252.246:8909
<br>202.194.148.41:8909
<br>113.94.174.239:8909
<br><hr>total： 24。
So I guess I'm looking to take everything between the <hr>'s and add it to the array line by line.
However, the following doesn't seem to work (in terms of stripping out the parts I don't want):
<?php
$fileurl = "**MASKED**";
$lines = file($fileurl);
foreach ($lines as $line_num => $line) {
$line2 = strstr($line, 'taking', 'true');
$line3 = str_replace($line2, '', $line);
print_r($line3);
}
?>
If you want to add the values to an array, why not do that directly inside the loop? I'd do something like this:
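$output = array();
foreach ($lines as $line) {
    // Only lines that start an entry with "<br>" followed by a digit are kept
    if (preg_match("/<br>\d/", $line)) {
        // Drop the leading "<br>" and the trailing newline
        $output[] = trim(substr($line, 4));
    }
}
print_r($output);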
Look into PHP function explode: http://www.php.net/manual/en/function.explode.php
It can take a string, and create an array out of it, by splitting at a specific character. In your case, this might be <br>
Also, the trim function can get rid of surrounding whitespace when needed.
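A minimal sketch of the explode() and trim() suggestion, assuming the page has already been loaded into a single string (a shortened copy of the sample source is used here in place of the real URL, and the variable names are invented):

```php
<?php
// Shortened sample of the page source from the question.
$source = "12174 in store for taking<hr>221.223.89.99:8909\n"
        . "<br>123.116.123.71:8909\n"
        . "<br>221.10.162.40:8909\n"
        . "<br><hr>total: 3";

// Splitting on <hr> gives: [text before, the address list, text after].
$parts = explode('<hr>', $source);

$output = [];
foreach (explode('<br>', $parts[1] ?? '') as $entry) {
    $entry = trim($entry); // remove the newline left after each address
    if ($entry !== '') {
        $output[] = $entry;
    }
}

print_r($output);
```

Splitting on <hr> first means the first address, which sits on the same line as the header text, is picked up as well.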
I am using the following code, which lets me navigate to a particular array row and sub-array column and change its value.
What I need to do, however, is change the first column of all rows to BLANK or NULL, or otherwise clear them out.
How can I change the code below to accomplish this?
<?php
$row = $_GET['row'];
$nfv = $_GET['value'];
$col = $_GET['col'];
$data = file_get_contents("temp.php");
$csvpre = explode("###", $data);
$i = 0;
$j = 0;
if (isset($csvpre[$row]))
{
$target_row = $csvpre[$row];
$info = explode("%%", $target_row);
if (isset($info[$col]))
{
$info[$col] = $nfv;
}
$csvpre[$row] = implode("%%", $info);
}
$save = implode("###", $csvpre);
$fh = fopen("temp.php", 'w') or die("can't open file");
fwrite($fh, $save);
fclose($fh);
?>
Use foreach or array_map to perform the same action on all elements of an array.
In this case, something roughly along these lines?
foreach ($rows as &$row) {
    $row[0] = NULL;
}
unset($row); // break the reference left over from the loop
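Adapted to the "###" and "%%" separators from the question, the same idea might look like this. It is a sketch only: the function name blank_first_column and the sample string are made up for illustration.

```php
<?php
// Hypothetical helper: empties the first "%%"-separated column of every
// "###"-separated row and returns the re-joined string.
function blank_first_column(string $data): string
{
    $rows = explode('###', $data);
    foreach ($rows as &$row) {
        $cols = explode('%%', $row);
        $cols[0] = '';               // clear the first column
        $row = implode('%%', $cols);
    }
    unset($row); // break the reference left over from the loop

    return implode('###', $rows);
}

// Assumed sample data in the same format as temp.php.
$result = blank_first_column("a%%b%%c###d%%e%%f");
echo $result, "\n";
```

The returned string can then be written back with fwrite() exactly as in the original script.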
I don't have a ready answer for you but I would recommend checking out CakePHP's Set class. It does things like this very well and (in some methods) supports XPath. Hopefully you can find the code you need there.
Depending on the size of that file, this could be much more efficient than looping through:
$data = file_get_contents("temp.php"); // data = blah%%blah%%blah%%blah%%###blah%%blah%%blah
$data = preg_replace("/^.*?(?=%%)/", "", $data); // Blank the first column of the first row
$data = preg_replace("/(###).*?(?=%%)/", "\\1", $data); // Blank the first column of every other row, keeping the ### separator
After that, write it back to the file as you did above.
This would need to be adjusted to allow for escape characters if your columns allow %% to appear consecutively within them, but other than that, this should work.
If you expect this csv file to get REALLY large, you should start thinking about looping through the file line by line rather than reading it completely into memory with file_get_contents. I would point you to fgetcsv, but I don't believe it can split records on any delimiter other than a newline (unless you are willing to replace your ### separator with \r\n). If you end up going this way, the answer totally changes :P
For more information on Regex (specifically positive lookaheads) see Regex Tutorial - Lookahead and Lookbehind Zero-Width Assertions (also a great site for regex in general)
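For the large-file case mentioned above, one option that may be worth trying is stream_get_line(), which reads from a stream up to a given delimiter string. The sketch below assumes that no single record is longer than the chosen maximum length (records longer than that would be split), and the sample file is invented for the example.

```php
<?php
// Assumed sample data in the ###/%% format from the question.
$path = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($path, "a%%b%%c###d%%e%%f###g%%h%%i");

$in = fopen($path, 'r');
$rows = [];
// Read one "###"-terminated record at a time (at most 1 MiB each).
while (($record = stream_get_line($in, 1048576, '###')) !== false) {
    $cols = explode('%%', $record);
    $cols[0] = ''; // clear the first column of this record
    // Collected here for brevity; for a really large file each record
    // would be written to an output file instead of kept in memory.
    $rows[] = implode('%%', $cols);
}
fclose($in);
unlink($path);

$result = implode('###', $rows);
echo $result, "\n";
```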