I have a long text (3600 sentences) and I want to shuffle the sentences into a random order. Is there a simple PHP script that can change the order of the sentences?
You can accomplish it like this. Explode the string on the character that ends a sentence, e.g. the full stop. Shuffle the array using the shuffle function. Then implode the array back into a string, adding the full stops back.
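$sentences = 'Hello, this is one sentence. This is a second. This is a third. This is a fourth. This is a fifth.';
// Split on full stops, trim the leading spaces, and drop the empty piece after the final full stop.
$sentencesArray = array_filter(array_map('trim', explode('.', $sentences)), 'strlen');
shuffle($sentencesArray);
// Glue the sentences back together and restore the final full stop.
$sentences = implode('. ', $sentencesArray) . '.';
var_dump($sentences);
The output will be something like this (the order is random):
Hello, this is one sentence. This is a fifth. This is a fourth. This is a second. This is a third.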
I constructed a solution which solves the problem for sentences ending with ".", "!" or "?". I noticed that it is not a good idea to include the very last part of the sentences array in the shuffling, because the last part is never supposed to end with the particular character we're splitting on:
"Hi.| Hello.| "
I hope you get the idea. So I shuffle all the elements except the last. And I do the work separately for ".", "?", and "!".
You should know that "...", "?!", "!!!11!!1!!" will cause big trouble. :):)
<?php
function randomizeOrderOnDelimiter($glue, $sentences) {
    $sentencesArray = explode($glue, $sentences);
    // Get out the items to shuffle: all but the last.
    $work = array();
    for ($i = 0; $i < count($sentencesArray) - 1; $i++) {
        $work[$i] = $sentencesArray[$i];
    }
    shuffle($work); // shuffle them
    // And put them back.
    for ($i = 0; $i < count($sentencesArray) - 1; $i++) {
        $sentencesArray[$i] = $work[$i];
    }
    $sentences = implode($glue, $sentencesArray);
    return $sentences;
}
$sentences = 'Hello, this is one sentence. This is a second. THis is a third. This is a forth. This is a fifth. Sixth is imperative! Is seventh a question? Eighth is imperative! Is ninth also a question? Tenth.';
$sentences = randomizeOrderOnDelimiter('.', $sentences);
$sentences = randomizeOrderOnDelimiter('?', $sentences);
$sentences = randomizeOrderOnDelimiter('!', $sentences);
var_dump($sentences);
?>
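As a rough alternative sketch (not the method used in the answer above), the text can be split only at whitespace that follows ".", "!" or "?", so the punctuation stays attached to its sentence and runs such as "..." or "?!" are left alone. The function name shuffleSentences is made up for this example, and it assumes sentences are separated by whitespace.

```php
<?php
// Hypothetical helper: shuffle sentences while keeping their end punctuation.
function shuffleSentences(string $text): string
{
    // The lookbehind (?<=[.!?]) splits at whitespace that follows a sentence
    // terminator, without consuming the terminator itself.
    $sentences = preg_split('/(?<=[.!?])\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY);
    shuffle($sentences);
    return implode(' ', $sentences);
}

$text = 'One. Two! Three? Four... Five?!';
echo shuffleSentences($text), "\n";
```

Abbreviations such as "Mr." would still be treated as sentence ends by this pattern.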
I apologize for the confusing title. Basically, I'm facing a problem with my website that ends up breaking it completely. I need to remove the duplicated entries within a line, on every line of my text file list. For example:
123123
123
123
Sometimes I get entries like 123123 on a line when it should just be 123. This is just an example, of course; it's hard for me to explain. I apologize again. I hope this is enough for you to grasp what I mean.
To sum it up, I need to remove the duplicated part of the string 123123 so that it's just 123, for all of the lines in my text file.
Help appreciated.
A live example for this:
2017-06-21:127.0.0.12017-06-21:127.0.0.1
2017-06-21:127.0.0.12017-06-21:127.0.0.1
2017-06-21:127.0.0.12017-06-21:127.0.0.1
2017-06-21:127.0.0.1
$linesStr = '2017-06-21:127.0.0.12017-06-21:127.0.0.1
2017-06-21:127.0.0.12017-06-21:127.0.0.1
2017-06-21:127.0.0.12017-06-21:127.0.0.1
2017-06-21:127.0.0.1';
// \R matches any line ending (\r\n, \n or \r)
$lines = preg_split('/\R/', $linesStr);
// loop through all lines
foreach ($lines as $i => $line)
{
    $lineLen = ceil(strlen($line) / 2);
    $first = substr($line, 0, $lineLen);
    $second = substr($line, $lineLen);
    // strict comparison, so numeric-looking strings such as "1e3" and "1000" are not treated as equal
    if ($first === $second)
    {
        $lines[$i] = $first;
    }
}
$lines = implode("\n", $lines);
This should do it...
A basic algorithm to deduplicate a string:
Split it in half.
If both halves are the same, replace the whole string with either half.
Caveat: this doesn't care whether or not the string was intended to be a duplicate or not, and consequently may remove some things you don't want it to.
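function deduplicate($str) {
    $str = trim($str);
    $len = strlen($str);
    // An empty or odd-length string cannot consist of two identical halves.
    if ($len === 0 || $len % 2 !== 0) return $str;
    list($beginning, $end) = str_split($str, intdiv($len, 2));
    return ($beginning === $end) ? $end : $str;
}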
Assuming you have an array of lines from your file you can apply it with array_map.
$lines = array_map('deduplicate', $lines);
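The same "two identical halves" check can also be sketched with a regular expression backreference instead of splitting the string by length. This is a swapped-in technique, not what the answers above use, and the name dedupeLine is invented for the example. The same caveat applies: a line that is legitimately made of two equal halves gets shortened too.

```php
<?php
// Hypothetical helper: collapse a line that is exactly "X" followed by "X" into "X".
function dedupeLine(string $line): string
{
    $line = trim($line);
    // ^(.+)\1$ matches only when the whole line is some text repeated twice.
    // preg_replace returns null on a regex error, so fall back to the input.
    return preg_replace('/^(.+)\1$/', '$1', $line) ?? $line;
}

$lines = [
    '2017-06-21:127.0.0.12017-06-21:127.0.0.1',
    '2017-06-21:127.0.0.1',
    '123123',
];
print_r(array_map('dedupeLine', $lines));
```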
I have a very long text, and I need to break it after every N characters so that it is rendered on multiple rows without any of the words being cut.
So, if I have a text 1000 characters long that has been saved on one line, and I need to break it every 100 characters, I should end up with a text spread over roughly 10 lines.
I tried something, but I got stuck.
foreach does not work because the text is not seen as an array; also, my attempt does not yet make sure the words are kept intact.
Has anyone tried this? Or is there a link to a solution?
public static function cut_line_after_n_chars($str, $n = 70) {
$result = '';
$pos = 0;
foreach ($str as $c) {
$pos++;
if ($pos == $n) {
$result .= $c + '<br/>';
$pos = 0;
}
else
$result .= $c;
}
return $result;
}
It sounds like you need wordwrap.
http://php.net/manual/en/function.wordwrap.php
This inserts line breaks into a string without cutting off words. You can then explode the result on the break character to get an array of pieces and format these pieces as you like.
EDIT
If you still need each of your lines to be 100 characters, you can use str_pad to add extra spaces onto each row.
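A minimal sketch of that approach, assuming a width of 15 characters and a short sample sentence chosen for illustration: wordwrap inserts the breaks, explode turns the result into rows, and str_pad pads each row to the full width.

```php
<?php
$text  = 'The quick brown fox jumps over the lazy dog';
$width = 15;

// Break at word boundaries; the last argument (false) means long words are never cut.
$wrapped = wordwrap($text, $width, "\n", false);
$rows    = explode("\n", $wrapped);

// Pad every row with trailing spaces so that all rows have the same length.
$padded = array_map(function ($row) use ($width) {
    return str_pad($row, $width);
}, $rows);

foreach ($padded as $row) {
    echo '[' . $row . "]\n";
}
```

With the last argument set to false, a single word longer than the width stays on its own row and exceeds the width.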
Use the explode() function to get an array of words from your string.
$words = explode( ' ', $str );
$length = 0;
foreach( $words as $word ) {
// Your loop code goes here.
}
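One possible way to fill in such a loop, as a sketch only: keep appending words to the current row while it still fits, and start a new row otherwise. The function name wrapWords is invented here, and the code assumes that words are separated by single spaces.

```php
<?php
// Hypothetical helper: greedy word wrap that returns an array of rows.
function wrapWords(string $str, int $width): array
{
    $rows    = [];
    $current = '';
    foreach (explode(' ', $str) as $word) {
        if ($current === '') {
            // First word of a row: always accepted, even if longer than $width.
            $current = $word;
        } elseif (strlen($current) + 1 + strlen($word) <= $width) {
            // The word fits on the current row, including the separating space.
            $current .= ' ' . $word;
        } else {
            // The word does not fit: close the row and start a new one.
            $rows[]  = $current;
            $current = $word;
        }
    }
    if ($current !== '') {
        $rows[] = $current;
    }
    return $rows;
}

echo implode("<br/>\n", wrapWords('The quick brown fox jumps over the lazy dog', 15)), "\n";
```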
Here's the code that I came up with:
function Count($text)
{
$WordCount = str_word_count($text);
$TextToArray = explode(" ", $text);
$TextToArray2 = explode(" ", $text);
for($i=0; $i<$WordCount; $i++)
{
$count = substr_count($TextToArray2[$i], $text);
}
echo "Number of {$TextToArray2[$i]} is {$count}";
}
So, what's going to happen here is that the user will enter a text, sentence or paragraph. By using substr_count, I would like to know the number of occurrences of each word inside the array. Unfortunately, the output is not what I really need. Any suggestions?
I assume that you want an array with the word frequencies.
First off, convert the string to lowercase and remove all punctuation from the text. This way you won't get entries for "But", "but", and "but," but rather just "but" with 3 or more uses.
Second, use str_word_count with a second argument of 2 as Mark Baker says to get a list of words in the text. This will probably be more efficient than my suggestion of preg_split.
Then walk the array and increment the value of the word by one.
foreach($words as $word)
$output[$word] = isset($output[$word]) ? $output[$word] + 1 : 1;
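Put together, the steps above might look like the following sketch. The function name wordFrequencies is made up for the example. It relies on str_word_count with format 1, which returns a plain list of words and skips punctuation such as commas and full stops on its own, so no separate punctuation-stripping step is shown.

```php
<?php
// Hypothetical helper: map each lowercase word to the number of times it occurs.
function wordFrequencies(string $text): array
{
    // Lowercase first so that "But" and "but" are counted together.
    $words  = str_word_count(strtolower($text), 1);
    $output = [];
    foreach ($words as $word) {
        $output[$word] = isset($output[$word]) ? $output[$word] + 1 : 1;
    }
    return $output;
}

print_r(wordFrequencies('But I tried, but failed. BUT!'));
```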
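If I have understood your question correctly, this should also solve your problem:
// Named countWordOccurrences because "Count" clashes with PHP's built-in count() (function names are case-insensitive).
function countWordOccurrences($text) {
    // Get each distinct space-separated word once, skipping empty pieces.
    $TextToArray = array_unique(array_filter(explode(" ", $text), 'strlen'));
    foreach ($TextToArray as $needle) {
        // substr_count matches substrings, so "the" is also counted inside "there".
        $count = substr_count($text, $needle);
        echo "$needle has occurred $count times in the text\n";
    }
}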
$WordCounts = array_count_values(str_word_count(strtolower($text),2));
var_dump($WordCounts);
I want to display just two lines of the paragraph.
How do I do this ?
<p><?php if($display){ echo $crow->content;} ?></p>
Depending on the textual content you are referring to, you might be able to get away with this:
// `nl2br` inserts a '<br />' element before each newline in the text.
$newContent = nl2br($crow->content);
// `explode` will then split the content at each appearance of '<br />'.
$splitContent = explode("<br />", $newContent);
// Here we extract the first and second items, trimming the newline that nl2br leaves in place.
$firstLine = trim($splitContent[0]);
$secondLine = isset($splitContent[1]) ? trim($splitContent[1]) : '';
NOTE - This will destroy all the line breaks you have in your text! You'll have to insert them again if you still want to preserve the text in its original formatting.
If you mean sentences, you can do this by exploding the paragraph and selecting the first two parts of the array:
$array = explode('.', $paragraph);
// Variable names cannot start with a digit, hence $twoLines. The full stops removed by explode are put back.
$twoLines = $array[0] . '.' . $array[1] . '.';
Otherwise you will have to count the number of characters across two lines and use the substr() function. For example, if one line holds 100 characters, you would do:
$twoLines = substr($paragraph, 0, 200);
However, because not all characters of a font are the same width, it may be difficult to do this accurately. I would suggest taking the widest character, such as a 'W', and echoing as many of these as fit on one line. Then count the maximum number of that character that can be displayed across two lines; this is the number of characters to extract. Although this will not give you a compact two lines, it will ensure that the text cannot go over two lines.
This could, however, cause a word to be cut in two. To solve this, we can use the explode function to find the last word in the extracted characters.
$array = explode(' ', $twoLines);
We can then find the last word and remove the correct number of characters from the final output.
$numwords = count($array);
$lastword = $array[$numwords - 1]; // arrays are zero-indexed, so the last word is at count - 1
$numchars = strlen($lastword);
if ($numchars > 0) { $twoLines = substr($twoLines, 0, -$numchars); }
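The "cut at a character limit, then back up to the last whole word" idea can also be sketched with strrpos instead of exploding into words. The function name truncateAtWord is invented for this example, and it assumes that words are separated by plain spaces.

```php
<?php
// Hypothetical helper: return at most $max characters without cutting a word in two.
function truncateAtWord(string $text, int $max): string
{
    if (strlen($text) <= $max) {
        return $text; // nothing to cut
    }
    $cut = substr($text, 0, $max);
    if ($text[$max] === ' ') {
        return rtrim($cut); // the cut already falls on a word boundary
    }
    // Otherwise drop the partial word after the last space, if there is one.
    $lastSpace = strrpos($cut, ' ');
    return $lastSpace === false ? $cut : substr($cut, 0, $lastSpace);
}

echo truncateAtWord('The quick brown fox', 12), "\n";
```

If the text contains no space within the limit, this sketch returns the hard cut rather than an empty string.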
function getLines($text, $lines)
{
    $text = explode("\n", $text, $lines + 1); // The last entry, if present, holds all the lines you don't want.
    $text = array_slice($text, 0, $lines); // Keep only the wanted lines; safe even if the text has fewer lines.
    return implode("<br>", $text); // Implode with "<br>" to a string. (This is for an HTML page, right?)
}
echo getLines($crow->content, 2); // The first two lines of $crow->content
Try this:
$lines = preg_split("/[\r\n]+/", $crow->content, 3);
echo $lines[0] . '<br />' . $lines[1];
and for variable number of lines, use:
$num_of_lines = 2;
$lines = preg_split("/[\r\n]+/", $crow->content, $num_of_lines+1);
$lines = array_slice($lines, 0, $num_of_lines); // drop the remainder, but keep everything if the text is short
echo implode('<br />', $lines);
Cheers!
This is a more general answer - you can get any amount of lines using this:
function getLines($paragraph, $lines){
    $lineArr = explode("\n", $paragraph);
    $newParagraph = '';
    if (count($lineArr) > 0) {
        for ($i = 0; $i < $lines; $i++) {
            // Index the array of lines ($lineArr), not the requested line count ($lines).
            if (isset($lineArr[$i]))
                $newParagraph .= $lineArr[$i] . '<br>';
            else
                break;
        }
    }
    return $newParagraph;
}
You could use echo getLines($crow->content,2); to do what you want.
I have some text inside $content var, like this:
$content = $page_data->post_content;
I need to slice the content somehow and extract the sentences, putting each one into its own variable.
Something like this:
$sentence1 = 'first sentence of the text';
$sentence2 = 'second sentence of the text';
and so on...
How can I do this?
PS
I am thinking of something like this, but I need some kind of loop for each sentence:
$match = null;
preg_match('/(.*?[?\.!]{1,3})/', $content, $match);
$sentence1 = $match[1];
$sentence2 = $match[2];
Ty:)
Do you need them in variables? Can't you use an array?
$sentence = explode(". ", $page_data->post_content);
EDIT:
If you need variables:
$allSentence = explode(". ", $page_data->post_content);
foreach($allSentence as $key => $val)
{
${"sentence". $key} = $val;
}
Assuming each sentence ends with a full stop, you can use explode:
$content = $page_data->post_content;
$sentences = explode('.', $content);
Now your sentences can be accessed like:
echo $sentences[0]; // 1st sentence
echo $sentences[1]; // 2nd sentence
echo $sentences[2]; // 3rd sentence
// and so on
Note that you can count total sentences using count or sizeof:
echo count($sentences);
It is not a good idea to create a new variable for each sentence. A long piece of text would require a correspondingly large number of variables, and you would have no way of knowing in advance how many there are. You can simply use the array indexes $sentences[0], $sentences[1] and so on.
Assuming a sentence is delimited by terminating punctuation, optionally followed by a space, you can do the following to get the sentences in an array.
$sentences = preg_split('/[!?\.]\s?/', $content);
You may want to trim any additional spaces as well with
$sentences = array_map('trim', $sentences);
This way, $sentences[0] is the first, $sentences[1] is the second and so on. If you need to loop through them you can use foreach:
foreach($sentences as $sentence) {
// Do something with $sentence...
}
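As a small runnable sketch of the preg_split approach above, with a sample string chosen for illustration: the PREG_SPLIT_NO_EMPTY flag is an addition here, used to drop the empty element that would otherwise follow the final punctuation mark.

```php
<?php
$content = 'Is this one? Yes! And this is two.';

// Split on ., ! or ? optionally followed by one whitespace character;
// PREG_SPLIT_NO_EMPTY discards the empty piece after the last terminator.
$sentences = preg_split('/[!?\.]\s?/', $content, -1, PREG_SPLIT_NO_EMPTY);
$sentences = array_map('trim', $sentences);

foreach ($sentences as $i => $sentence) {
    echo ($i + 1) . ': ' . $sentence . "\n";
}
```

Note that the terminating punctuation is consumed by the split, so the pieces no longer show whether a sentence was a question or an exclamation.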
Don't use individually named variables like $sentence1, $sentence2 etc. Use an array.
$sentences = explode('.', $page_data->post_content);
This gives you an array of the "sentences" in the variable $page_data->post_content, where "sentences" really means sequences of characters between full stops. This logic will get tripped up wherever a full stop is used to mean something other than the end of a sentence (e.g. "Mr. Watson").
Edit: Of course, you can use more sophisticated logic to detect sentence boundaries, as you have suggested. You should still use an array, not create an unknown number of variables with numbers on the ends of their names.
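As one hedged illustration of slightly more sophisticated logic, the sketch below splits only at whitespace that follows ".", "!" or "?" and skips a small, hand-picked list of abbreviations. The function name splitSentences and the abbreviation list (Mr., Mrs., Dr.) are assumptions made for the example; real text needs a much longer list or a dedicated library.

```php
<?php
// Hypothetical helper: split text into sentences, keeping the end punctuation.
function splitSentences(string $text): array
{
    // Each negative lookbehind blocks a split right after one known abbreviation;
    // the positive lookbehind requires a sentence terminator before the whitespace.
    $pattern = '/(?<!Mr\.)(?<!Mrs\.)(?<!Dr\.)(?<=[.!?])\s+/';
    return preg_split($pattern, trim($text), -1, PREG_SPLIT_NO_EMPTY);
}

$sentences = splitSentences('Mr. Watson, come here. I want to see you! Is Dr. Bell in?');
print_r($sentences);
```

The result is still an array, in line with the advice above.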