Parsing a large text file into 140-character tweets (PHP)

I want to parse a large text file so that a line break is inserted every 140 characters, the character limit of a single tweet. Does anyone have any ideas?
Thanks.

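// Java, not PHP. Requires: import java.util.ArrayList; import java.util.List;
List<String> tweetList = new ArrayList<>();
while (string.length() > 0)
{
    if (string.length() > 140)
    {
        // take the first 140 characters, then keep the remainder
        tweetList.add(string.substring(0, 140));
        string = string.substring(140);
    }
    else
    {
        // the last chunk is 140 characters or fewer
        tweetList.add(string);
        string = "";
    }
}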

Much shorter: :)
String[] tweets = yourLongString.split("(?<=\\G.{140})");
Oops, I didn't read the PHP constraint. This is Java.

If you don't care about where the split occurs (it could be in the middle of a word, or the like):
define ('TWEET_SIZE', 140);
$parts = str_split ($data, TWEET_SIZE);
$new = implode ("\n", $parts);
UPDATE
Something like this:
define ('TWEET_SIZE', 140); // set the size of each segment
$data = file_get_contents ('<path to file>'); // load the data from file
$parts = str_split ($data, TWEET_SIZE); // split the data
$new = implode ("\n", $parts); // put it back together with newlines
file_put_contents ('<path to new file>', $new); // write the re-joined text to the new file (if needed)
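If the split should not land in the middle of a word, PHP's built-in wordwrap() is one option. The following is a minimal sketch, not part of the original answer: the sample text is made up, and it assumes single-byte (ASCII) text, since wordwrap() and strlen() count bytes rather than characters.

```php
<?php
// Hypothetical sample text; in practice load it with file_get_contents().
$data = str_repeat('The quick brown fox jumps over the lazy dog. ', 10);

define('TWEET_SIZE', 140);

// wordwrap() breaks at spaces; the fourth argument (true) force-splits
// any single word longer than TWEET_SIZE so no line exceeds the limit.
$wrapped = wordwrap($data, TWEET_SIZE, "\n", true);
$tweets = explode("\n", $wrapped);

foreach ($tweets as $tweet) {
    echo strlen($tweet), ': ', $tweet, PHP_EOL;
}
```

Each element of $tweets is at most 140 bytes long, so tweets may be shorter than with str_split() in exchange for keeping words intact.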

Related

Remove Line from String within txt file

I currently have code that reads a txt file into an array and picks random entries from it.
$array = explode("\n", file_get_contents('test.txt'));
$rand_keys = array_rand($array, 2);
After this random value is displayed,
$search = $array[$rand_keys[0]];
I want to store it in another txt file, such as completed.txt, and remove it from the original txt file. Here's the approach I tried, which didn't work out.
$a = 'test.txt';
$b = file_get_contents('test.txt');
$c = str_replace($search, '', $b);
file_put_contents($a, $c);
Then, to store it in a secondary file, I tried something like this.
$result = '';
foreach($lines as $line) {
if(stripos($line, $search) === false) {
$result .= $search;
}
}
file_put_contents('completed.txt', $result);
This appears to work to some extent. However, when I look at completed.txt, all of its contents are exactly the same, and a bunch of blank spaces are left behind in test.txt.

There are better ways of doing it (IMHO), but at the moment you are removing only the text of the line and leaving its newline character behind. You may also find that it changes other lines, because str_replace replaces the text wherever it occurs, with no notion of lines.
You can probably fix your code by also replacing the newline:
$c = str_replace($search."\n", '', $b);
An alternative way of doing it is...
$fileName = 'test.txt';
$fileComplete = "completed.txt";
// Read file into an array
$lines = file($fileName, FILE_IGNORE_NEW_LINES);
// Pick a line
$randomLineKey = array_rand($lines);
// Get the text of that line
$randomLine = $lines[$randomLineKey];
// Remove the line
unset($lines[$randomLineKey]);
// write out new file
file_put_contents($fileName, implode(PHP_EOL, $lines));
// Add chosen line to completed file
file_put_contents($fileComplete, $randomLine.PHP_EOL, FILE_APPEND);
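To illustrate the caution above that str_replace has no notion of lines, here is a small sketch with made-up contents (the strings "pineapple", "apple", and "banana" are assumptions for the demo). It compares the plain text replacement with an exact, line-based removal.

```php
<?php
// Hypothetical file contents, one entry per line.
$contents = "pineapple\napple\nbanana\n";
$search = 'apple';

// Plain text replacement: also hits the tail of "pineapple".
$naive = str_replace($search . "\n", '', $contents);

// Line-based removal: only an exact line match is dropped.
$lines = explode("\n", rtrim($contents, "\n"));
$key = array_search($search, $lines, true);
if ($key !== false) {
    unset($lines[$key]);
}
$exact = implode("\n", $lines) . "\n";

var_dump($naive); // "pinebanana\n"
var_dump($exact); // "pineapple\nbanana\n"
```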

remove line where multiple characters are present

I am reading a file with file_get_contents.
Some lines can have multiple "=" chars and I want to remove these lines.
I tried
str_replace("=", "", $content);
but this removes every occurrence of "=" rather than removing the lines that contain them.
Any ideas, please?
UPDATE: my content from file looks like this:
something
apple is =greee= maybe red
sugar is white
sky is =blue

Without seeing an example of your file/strings, it's a bit tricky to advise, but the basic principle I would work to would be something like this:
$FileName = "PathToFile";
$FileData = file_get_contents($FileName);
$FileDataLines = explode("\r\n", $FileData); // explode lines by ("\n", "\r\n", etc)
$FindChar = "="; // the character you want to find
$Results = array(); // collects the lines that pass the check
foreach($FileDataLines as $FileDataLine){
$NoOfChar = substr_count($FileDataLine, $FindChar); // finds the number of occurrences of character in string
if($NoOfChar <= 1){ // if the character appears less than two times
$Results[] = $FileDataLine; // add to the results
}
}
# print the results
print_r($Results);
# build a new file
$NewFileName = "YourNewFile";
$NewFileData = implode("\r\n", $Results);
file_put_contents($NewFileName, $NewFileData);
Hope it helps
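The same principle can be sketched more compactly with preg_split and array_filter. This is only an illustration using the sample content from the question; splitting on \R (any newline sequence) is a choice made here so that the line-ending style of the file does not matter.

```php
<?php
// Sample content taken from the question.
$content = "something\napple is =greee= maybe red\nsugar is white\nsky is =blue";

// \R matches any newline sequence (\n, \r\n, or \r).
$lines = preg_split('/\R/', $content);

// Keep only lines that contain "=" at most once.
$kept = array_values(array_filter($lines, function ($line) {
    return substr_count($line, '=') <= 1;
}));

echo implode(PHP_EOL, $kept), PHP_EOL;
```

With the sample above, the line "apple is =greee= maybe red" is dropped and the other three lines are kept.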

Multiple file_put_contents with str_replace?

I am trying to replace multiple parts of a string in a file and save the result with file_put_contents.
Essentially, the function finds particular phrases in the file (listed in the $old array) and replaces them with the corresponding entries of the $new array.
$file_path = "hello.txt";
$file_string = file_get_contents($file_path);
function replace_string_in_file($replace_old, $replace_new) {
global $file_string; global $file_path;
if(is_array($replace_old)) {
for($i = 0; $i < count($replace_old); $i++) {
$replace = str_replace($replace_old[$i], $replace_new[$i], $file_string);
file_put_contents($file_path, $replace); // overwrite
}
}
}
$old = array("hello8", "hello9"); // what to look for
$new = array("hello0", "hello3"); // what to replace with
replace_string_in_file($old, $new);
hello.txt is: hello8 hello1 hello2 hello9
Unfortunately it outputs: hello8 hello1 hello2 hello3
So only one change is applied when there should have been two:
hello0 hello1 hello2 hello3

That's a single file, so why output it after every replacement? Your workflow should be
a) read in file
b) do all replacements
c) write out modified file
In other words, move your file_put_contents() to OUTSIDE your loop.
As well, str_replace accepts arrays for its search and replace arguments, so there's no need to loop over your inputs. Basically you should have
$old = array(...);
$new = array(...);
$text = file_get_contents(...);
$modified = str_replace($old, $new, $text);
file_put_contents(..., $modified); // filename first, then the data
Your main problem is that your str_replace, as you wrote it, never uses the updated string. Every replacement starts from the same ORIGINAL string:
$replace = str_replace($replace_old[$i], $replace_new[$i], $file_string);
// the third argument ($file_string) should be the already-updated string, e.g. $replace

You're not updating $file_string with each iteration. I.e., you set it once at the start of the loop, replace the first pair, and then the second call to replace uses the original $file_string again.
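The read, replace, write workflow described in the answers above can be sketched as a complete script. The temporary file created with tempnam() is an assumption made only to keep the example self-contained; in the question the file would be hello.txt.

```php
<?php
// Hypothetical demo file standing in for hello.txt.
$filePath = tempnam(sys_get_temp_dir(), 'demo');
file_put_contents($filePath, 'hello8 hello1 hello2 hello9');

$old = array('hello8', 'hello9'); // what to look for
$new = array('hello0', 'hello3'); // what to replace with

// a) read in file, b) do all replacements in one call, c) write once
$text = file_get_contents($filePath);
$modified = str_replace($old, $new, $text);
file_put_contents($filePath, $modified);

echo file_get_contents($filePath), PHP_EOL; // hello0 hello1 hello2 hello3
unlink($filePath); // clean up the demo file
```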

echo partial text

I want to display just two lines of the paragraph.
How do I do this?
<p><?php if($display){ echo $crow->content;} ?></p>

Depending on the textual content you are referring to, you might be able to get away with this:
// `nl2br` inserts a '<br />' element before each new line in the text.
$newContent = nl2br($crow->content);
// `explode` will then split the content at each appearance of '<br />'.
$splitContent = explode("<br />", $newContent);
// Here we simply extract the first and second items in our array
// (trim removes the newline character that nl2br leaves in place).
$firstLine = trim($splitContent[0]);
$secondLine = trim($splitContent[1]);
NOTE - The extracted lines no longer contain any line breaks! You'll have to insert them again if you still want to preserve the text in its original formatting.

If you mean sentences, you can do this by exploding the paragraph and selecting the first two parts of the array:
$array = explode('.', $paragraph);
$twoLines = $array[0] . '.' . $array[1] . '.'; // explode drops the periods, so add them back
Otherwise you will have to count the number of characters across two lines and use the substr() function. For example, if one line holds 100 characters you would do:
$twoLines = substr($paragraph, 0, 200);
However, because not all characters of a font are the same width, it may be difficult to do this accurately. I would suggest taking the widest character, such as a 'W', and echoing as many of these as fit on one line. Then count the maximum number of that character that can be displayed across two lines; that gives you a safe character count. Although this will not always fill the two lines completely, it ensures the text cannot run over two lines.
This could, however, cause a word to be cut in two. To solve this, we can use the explode function to find the last word in the extracted characters.
$array = explode(' ', $twoLines);
We can then find the last word and remove the corresponding number of characters from the final output.
$numwords = count($array);
$lastword = $array[$numwords - 1]; // arrays are zero-indexed
$numchars = strlen($lastword);
if ($numchars > 0) { $twoLines = substr($twoLines, 0, -$numchars); }
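The word-boundary idea above can also be sketched with strrpos instead of explode. The helper name truncateAtWord is hypothetical, and the sketch assumes single-byte text because substr and strlen count bytes.

```php
<?php
// Hypothetical helper: cut $text to at most $limit bytes without splitting a word.
function truncateAtWord($text, $limit)
{
    if (strlen($text) <= $limit) {
        return $text; // nothing to cut
    }
    $cut = substr($text, 0, $limit);
    // If the character right after the cut is a space, the cut already ends on a word.
    if ($text[$limit] === ' ') {
        return $cut;
    }
    // Otherwise drop the partial last word by cutting at the last space.
    $lastSpace = strrpos($cut, ' ');
    return $lastSpace === false ? $cut : substr($cut, 0, $lastSpace);
}

echo truncateAtWord('The quick brown fox jumps', 12), PHP_EOL; // The quick
```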

function getLines($text, $lines)
{
    $text = explode("\n", $text, $lines + 1); // With a limit, the last entry holds all the remaining lines.
    $text = array_slice($text, 0, $lines); // Keep at most the first $lines entries, even if the text is shorter.
    return implode("<br>", $text); // Implode with "<br>" to a string. (This is for an HTML page, right?)
}
echo getLines($crow->content, 2); // The first two lines of $crow->content

Try this:
$lines = preg_split("/[\r\n]+/", $crow->content, 3);
echo $lines[0] . '<br />' . $lines[1];
and for variable number of lines, use:
$num_of_lines = 2;
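$lines = preg_split("/[\r\n]+/", $crow->content, $num_of_lines+1);
$lines = array_slice($lines, 0, $num_of_lines); // drop the remainder without losing a wanted line when the text is short
echo implode('<br />', $lines);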
Cheers!

This is a more general answer - you can get any number of lines using this:
function getLines($paragraph, $lines){
    $lineArr = explode("\n", $paragraph);
    $newParagraph = '';
    if(count($lineArr) > 0){
        for($i = 0; $i < $lines; $i++){
            if(isset($lineArr[$i]))
                $newParagraph .= $lineArr[$i] . "\n"; // re-append the newline that explode removed
            else
                break;
        }
    }
    return $newParagraph;
}
You could use echo getLines($crow->content,2); to do what you want.

editing values stored in each subarray of an array

I am using the following code, which lets me navigate to a particular row and column of the data and change its value.
What I need to do, however, is change the first column of all rows to blank or NULL, i.e. clear them out.
How can I change the code below to accomplish this?
<?php
$row = $_GET['row'];
$nfv = $_GET['value'];
$col = $_GET['col'];
$data = file_get_contents("temp.php");
$csvpre = explode("###", $data);
$i = 0;
$j = 0;
if (isset($csvpre[$row]))
{
$target_row = $csvpre[$row];
$info = explode("%%", $target_row);
if (isset($info[$col]))
{
$info[$col] = $nfv;
}
$csvpre[$row] = implode("%%", $info);
}
$save = implode("###", $csvpre);
$fh = fopen("temp.php", 'w') or die("can't open file");
fwrite($fh, $save);
fclose($fh);
?>

Use foreach or array_map to perform the same action on all elements of an array.
In this case, something roughly along these lines?
foreach($rows as &$row) {
$row[0] = NULL;
}
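Applied to the question's storage format (rows separated by ###, columns by %%), the foreach idea might look like the sketch below. The sample data string is made up for illustration; in the question it would come from temp.php.

```php
<?php
// Sample data in the question's format: rows separated by ###, columns by %%.
$data = 'a1%%b1%%c1###a2%%b2%%c2###a3%%b3%%c3';

$rows = explode('###', $data);
foreach ($rows as &$row) {
    $cols = explode('%%', $row);
    $cols[0] = '';               // blank the first column
    $row = implode('%%', $cols); // write the row back through the reference
}
unset($row); // break the reference left over from the loop

$save = implode('###', $rows);
echo $save, PHP_EOL; // %%b1%%c1###%%b2%%c2###%%b3%%c3
```

The unset($row) after the loop avoids accidentally modifying the last row if $row is reused later in the script.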

I don't have a ready answer for you but I would recommend checking out CakePHP's Set class. It does things like this very well and (in some methods) supports XPath. Hopefully you can find the code you need there.

Depending on the size of that file, this could be much more efficient than looping through:
$data = file_get_contents("temp.php"); //data = blah%%blah%%blah%%blah%%###blah%%blah%%blah
$data = preg_replace( "/^.*?(?=%%)/s", "", $data ); //Blank the first column of the first row
$data = preg_replace( "/(###).*?(?=%%)/s", "\\1", $data ); //Blank the first column of every other row
After that, write it back to the file as you did above.
This would need to be adjusted to allow for escape characters if your columns can themselves contain %%, or if a row can consist of a single column, but other than that, this should work.
If you expect this csv file to get REALLY large, you should start thinking of looping through the file line by line rather than reading it completely into memory using file_get_contents. I would point you to fgetcsv, but I don't believe it can split rows on anything other than a newline (unless you are willing to replace your ### separator with \r\n). If you end up going this way, the answer totally changes :P
For more information on Regex (specifically positive lookaheads) see Regex Tutorial - Lookahead and Lookbehind Zero-Width Assertions (also a great site for regex in general)
