In PHP, I'd like to crop the following sentence:
"Test 1. Test 2. Test 3."
and transform this into 2 strings:
"Test 1. Test 2." and "Test 3."
How do I achieve this?
Do I use strpos?
Many thanks for any pointers.
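// Return the array element if it exists, otherwise false.
// (Call-time pass-by-reference such as isIndex(&$str[0]) is a fatal error since PHP 5.4.)
function isIndex($arr, $i){
    return isset($arr[$i]) ? $arr[$i] : false;
}
$str = explode("2.", "Test 1. Test 2. Test 3.");
$nstr1 = isIndex($str, 0).'2.'; // "Test 1. Test 2."
$nstr2 = ltrim(isIndex($str, 1)); // "Test 3."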
To separate the first two sentences from the rest, this should do it quickly:
$str = "Lorem Ipsum dolor sit amet etc etc. Blabla 2. Blabla 3. Test 4.";
$p1 = "";
$p2 = "";
explode_paragraph($str, $p1, $p2); // fills $p1 and $p2
echo $p1; // two first sentences
echo $p2; // the rest of the paragraph
function explode_paragraph($str, &$part1, &$part2) {
    $first = strpos($str, "."); // position of the first dot, or false
    if ($first !== false) {
        $second = strpos($str, ".", $first + 1); // look for a second dot after the first one
        if ($second !== false) { // a second one?
            $part1 = substr($str, 0, $second + 1); // everything up to and including the second dot
            $part2 = ltrim(substr($str, $second + 1)); // the rest of the paragraph
        } else { // only one dot: part1 will be everything, no part2
            $part1 = $str;
            $part2 = "";
        }
    } else { // no sentences at all.. put something in part1 ?
        $part1 = ""; // $part1 = $str;
        $part2 = "";
    }
}
$string="Inspired by arthropod insects and spiders, BAIUST researchers have created an entirely new type of semi-soft robots capable of standing and walking using drinking straws and inflatable tubing. Inspired by arthropod insects and spiders, BAIUST researchers have created an entirely new type of semi-soft robots capable of standing and walking using drinking straws and inflatable tubing. Inspired by arthropod insects and spiders, BAIUST researchers have created an entirely new type of semi-soft robots capable of standing and walking using drinking straws and inflatable tubing.
";
call :
echo short_description_in_complete_sentence($string,3,1000,2);
function:
// "public" is only valid inside a class; drop it for a standalone function
function short_description_in_complete_sentence($string,$start_point,$end_point,$sentence_number=1){
    $final_string='';
    // Note: the third argument of substr() is a length, not an end position
    $short_string=substr($string,$start_point,$end_point);
    $div_string=explode('.',$short_string);
    for($i=0;$i<$sentence_number;$i++){
        if(!empty($div_string[$sentence_number-1])){
            $final_string=$final_string.$div_string[$i].'.';
        }else{
            $final_string='Invalid sentence number or total character number!';
            break;
        }
    }
    return $final_string;
}
Something like that?
$str = 'Test 1. Test 2. Test 3.';
$strArray = explode('.', $str);
$str1 = $strArray[0] . '.' . $strArray[1] . '.'; // "Test 1. Test 2." ($strArray[1] already starts with a space)
$str2 = trim($strArray[2]) . '.'; // "Test 3."
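A more general variant can be sketched with preg_split, assuming a sentence always ends with a dot followed by whitespace. The helper name split_after_sentences is made up for illustration, and abbreviations such as "Mr. Smith" would be split incorrectly:

```php
<?php
// Hypothetical helper: returns [first $n sentences, remainder].
function split_after_sentences($text, $n = 2) {
    // Split on whitespace that directly follows a dot; the lookbehind keeps the dot with its sentence.
    $sentences = preg_split('/(?<=\.)\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY);
    $head = implode(' ', array_slice($sentences, 0, $n));
    $tail = implode(' ', array_slice($sentences, $n));
    return [$head, $tail];
}

list($first, $rest) = split_after_sentences("Test 1. Test 2. Test 3.");
echo $first, PHP_EOL; // Test 1. Test 2.
echo $rest, PHP_EOL;  // Test 3.
```

If the text has fewer than $n sentences, the second element is simply an empty string.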
What is the main reason you must cut the text after "Test 2." and not before? The best solution depends on what you want to do now and what you will eventually want to do with the result.
Related
I need help writing a function for smart highlighting of fragments in a text.
Src text = "Regulation is mediated via many different mechanisms"
Highlight string = "mediate via"
Expected result = "Regulation is [b]mediated via[/b] many different mechanisms"
I found one solution on Google, but it does not work correctly with words that have varying endings:
<?php
$string = "The monkey hangs from the door";
$keyword = "the";
function highlightkeyword($str, $search) {
$occurrences = substr_count(strtolower($str), strtolower($search));
$newstring = $str;
$match = array();
for ($i=0;$i<$occurrences;$i++) {
$match[$i] = stripos($str, $search, $i);
$match[$i] = substr($str, $match[$i], strlen($search));
// Use two distinct markers so the opening and closing tags can be told apart
$newstring = str_replace($match[$i], '[#]'.$match[$i].'[@]', strip_tags($newstring));
}
$newstring = str_replace('[#]', '<b>', $newstring);
$newstring = str_replace('[@]', '</b>', $newstring);
return $newstring;
}
?>
More examples:
Ex1:
src = is mediated via many
search = mediate via
result = is [b]mediated via[/b] many
Ex2:
src = are meddling in local affairs.
search = meddle in
result = are [b]meddling in[/b] local affairs.
Ex3:
src = who can not get married in France.
search = marry in
result = who can not get [b]married in[/b] France.
Note: the search string contains "marry in", but the source contains "married in".
To make patterns recognizable you can use the power of regex
function highlightkeyword($keyword, $string) {
return preg_replace("/{$keyword}/", '<strong>\\0</strong>', $string);
}
Examples
$string = "Regulation is mediated via many different mechanisms";
$keyword = "mediate.*? via";
echo highlightkeyword($keyword, $string), PHP_EOL;
Regulation is <strong>mediated via</strong> many different mechanisms
$string = "Who can not get married in France.";
$keyword = "marr(ied|y)";
echo highlightkeyword($keyword, $string), PHP_EOL;
Who can not get <strong>married</strong> in France.
$string = "Who can not marry in France.";
$keyword = "marr(ied|y)";
echo highlightkeyword($keyword, $string), PHP_EOL;
Who can not <strong>marry</strong> in France.
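If you do not want to write the pattern by hand for every keyword, it can be built from the search phrase. The following is only a rough sketch: the helper name highlight_loose and the rule "drop a trailing e or y from words longer than three letters, then allow any word ending" are assumptions, not real stemming, so it will also match unrelated words that share the stem (for example "marrow" for "marry").

```php
<?php
// Hypothetical helper: highlight a phrase while tolerating different word endings.
function highlight_loose($search, $text) {
    $parts = [];
    foreach (preg_split('/\s+/', trim($search)) as $word) {
        if (strlen($word) > 3) {
            // Crude stem: "mediate" -> "mediat", "marry" -> "marr", then allow any ending
            $stem = preg_replace('/[ey]$/i', '', $word);
            $parts[] = preg_quote($stem, '/') . '\w*';
        } else {
            // Short words such as "in" or "via" must match exactly
            $parts[] = preg_quote($word, '/');
        }
    }
    $pattern = '/\b' . implode('\s+', $parts) . '\b/i';
    return preg_replace($pattern, '<b>$0</b>', $text);
}

echo highlight_loose("mediate via", "is mediated via many"), PHP_EOL;
// is <b>mediated via</b> many
echo highlight_loose("meddle in", "are meddling in local affairs."), PHP_EOL;
// are <b>meddling in</b> local affairs.
echo highlight_loose("marry in", "who can not get married in France."), PHP_EOL;
// who can not get <b>married in</b> France.
```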
I have a search string: $str (something like "test"), a wrap string: $wrap (something like "|") and a text string: $text (something like "This is a test Text").
$str occurs at least once in $text. What I want now is a function that wraps $str with the wrapper defined in $wrap and outputs the modified text (even if $str occurs more than once in $text).
But it should not output the whole text, just 1-2 of the words before $str, then 1-2 of the words after $str, and "..." (only if $str isn't the first or last word). It should also be case insensitive.
Example:
$str = "Text"
$wrap = "<span>|</span>"
$text = "This is a really long Text where the word Text appears about 3 times Text"
Output would be:
"...long <span>Text</span> where...word <span>Text</span> appears...times <span>Text</span>"
My code (obviously it doesn't work):
$tempar = preg_split("/$str/i", $text);
if (count($tempar) <= 2) {
$result = "... ".substr($tempar[0], -7).$wrap.substr($tempar[1], 7)." ...";
} else {
$amount = substr_count($text, $str);
for ($i = 0; $i < $amount; $i++) {
$result = $result.".. ".substr($tempar[$i], -7).$wrap.substr($tempar[$i+1], 0, 7)." ..";
}
}
If you have a tip or a solution, don't hesitate to let me know.
I have taken your approach and made it more flexible. If $str or $wrap changes you could have escaping issues within the regex pattern so I have used preg_quote.
Note that I added $placeholder to make it clearer, but you can use $placeholder = "|" if you don't like [placeholder].
function wrapInString($str, $text, $element = 'span') {
$placeholder = "[placeholder]"; // The string that will be replaced by $str
$wrap = "<{$element}>{$placeholder}</{$element}>"; // Dynamic string that can handle more than just span
$strExp = preg_quote($str, '/');
$matches = [];
$matchCount = preg_match_all("/(\w+\s+)?(\w+\s+)?({$strExp})(\s+\w+)?(\s+\w+)?/i", $text, $matches);
$response = '';
for ($i = 0; $i < $matchCount; $i++) {
if (strlen($matches[1][$i])) {
$response .= '...';
}
if (strlen($matches[2][$i])) {
$response .= $matches[2][$i];
}
$response .= str_replace($placeholder, $matches[3][$i], $wrap);
if (strlen($matches[4][$i])) {
$response .= $matches[4][$i];
}
if (strlen($matches[5][$i]) && $i == $matchCount - 1) {
$response .= '...';
}
}
return $response;
}
$text = "text This is a really long Text where the word Text appears about 3 times Text";
var_dump(wrapInString("text", $text));
string(107) "<span>text</span> This...long <span>Text</span> where...<span>Text</span> appears...times <span>Text</span>"
To make the replacement case insensitive you can use the i regex option.
If I understand your question correctly, just a little bit of implode and explode magic is needed:
$text = "This is a really long Text where the word Text appears about 3 times Text";
$arr = explode("Text", $text);
print_r(implode('<span>Text</span>', $arr));
If you specifically need to render the span tags using HTML, just write it that way
$arr = explode("Text", $text);
print_r(implode('<span>Text</span>', $arr));
Use the pattern below to get your word and the 1-2 words before and after it:
/((\w+\s+){1,2}|^)text((\s+\w+){1,2}|$)/i
In PHP code it can be:
$str = "Text";
$wrap = "<span>|</span>";
$text = "This is a really long Text where the word Text appears about 3 times Text";
$temp = str_replace('|', $str, $wrap); // <span>Text</span>
// find the pattern and 1-2 words before and after it
// (to make it case sensitive, delete 'i' from the pattern)
if(preg_match_all('/((\w+\s+){1,2}|^)text((\s+\w+){1,2}|$)/i', $text, $match)) {
$res = array_map(function($x) use($str, $temp) { return '... '.str_replace($str, $temp, $x) . ' ...';}, $match[0]);
echo implode(' ', $res);
}
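The same idea can be wrapped in a reusable function that keeps the original case of each hit. This is a sketch under assumptions: the name wrap_excerpts and the $words parameter are invented here, "|" is the placeholder from the question, and for simplicity it always adds leading and trailing "..." even when the keyword is the first or last word. It also matches the keyword inside longer words (for example "context").

```php
<?php
// Hypothetical helper: excerpts of up to $words words around each hit, with the hit wrapped.
function wrap_excerpts($str, $wrap, $text, $words = 2) {
    $q = preg_quote($str, '/');
    $n = (int) $words;
    $pattern = '/(?:\w+\s+){0,' . $n . '}' . $q . '(?:\s+\w+){0,' . $n . '}/i';
    if (!preg_match_all($pattern, $text, $m)) {
        return '';
    }
    $out = [];
    foreach ($m[0] as $snippet) {
        // Wrap the keyword exactly as it appears in the text, so its case is preserved.
        $out[] = preg_replace_callback('/' . $q . '/i', function ($hit) use ($wrap) {
            return str_replace('|', $hit[0], $wrap);
        }, $snippet);
    }
    return '...' . implode('...', $out) . '...';
}

$text = "This is a really long Text where the word Text appears about 3 times Text";
echo wrap_excerpts("text", "<span>|</span>", $text, 1);
// ...long <span>Text</span> where...word <span>Text</span> appears...times <span>Text</span>...
```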
Is it possible to get the string that sits between specific keywords? For example, I have two sentences like these:
I will go to #new bathroom and wash the car#
Result: bathroom and wash the car
Someone need an #new icebreaker# to hold that problem
Result : icebreaker
I want a condition that gets all the words between "#new " and the closing "#".
Any idea how to create this?
My code so far:
<?php
$sentence = "I will go to #new bathroom and wash the car#";
$start = strpos($sentence, "#new");
//$end = strpos($sentence, "#");
$end = 20; //because my strpos still wrong, I define a static number
$new = substr($sentence, $start, $end);
echo $new;
?>
My problem is that I can't find a way to locate the last hashtag.
Use this regular expression:
/#new (.+)#/i
Together with preg_match(), you'll get your match in an array:
<?php
$string = "Someone need an #new icebreaker# to hold that problem";
preg_match("/#new (.+)#/i", $string, $matches);
var_dump($matches[1]); // icebreaker
If you anticipate more than one possible match, use preg_match_all() to get them all.
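A minimal sketch of that, using a made-up sentence with two tags. Note the switch from (.+) to ([^#]+): this is a change from the pattern above, made because a greedy (.+) would run from the first "#new " to the very last "#" and return a single match.

```php
<?php
// Example input is invented for illustration.
$string = "Buy #new shoes# and an #new icebreaker# today";

// [^#]+ stops at the next "#", so each tag is captured separately.
preg_match_all('/#new ([^#]+)#/i', $string, $matches);

print_r($matches[1]);
// Array
// (
//     [0] => shoes
//     [1] => icebreaker
// )
```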
I have written the following code for your problem, but please bear in mind that I am still a beginner myself.
It works the way you want it to, but I am sure there are better solutions out there.
<?php
$string = "I will go to #new bathroom and wash the car#";
$stringArray = str_split($string);
$output = '';
$count = 0;
foreach($stringArray as $letter){
if($count == 0 && $letter == '#'){
$count = 1;
} elseif($count == 1){
if($letter != '#'){
$output .= $letter;
} else {
$count = 2;
}
}
}
echo substr($output, strlen('new ')); // drop the leading "new " keyword: bathroom and wash the car
?>
hope this helps :)
Another way, without regular expressions, is to explode the string and remove the "new" keyword from the result.
This will only work if the sentence contains a single #new keyword.
$string = "I will go to #new bathroom and wash the car#";
$string1 = "Someone need an #new icebreaker# to hold that problem";
function getString($string, $delimiter = '#')
{
$string_array = explode($delimiter, $string);
return str_replace('new ', '', $string_array[1]);
}
echo getString($string);
//bathroom and wash the car
echo getString($string1);
//icebreaker
I prefer to work with arrays:
$string = [
"I will go to #new bathroom and wash the car#",
"Someone need an #new icebreaker# to hold that problem"
];
function getString($string, $delimiter = '#')
{
$result = [];
foreach ($string as $value) {
$string_array = strstr($value, $delimiter) ? explode($delimiter, $value) : [];
$result[] = isset($string_array[1]) ? str_replace('new ', '', $string_array[1]) : NULL;
}
return $result;
}
print_r(getString($string));
/*
Array
(
[0] => bathroom and wash the car
[1] => icebreaker
)
*/
You can use a regex to match that. Here are two simple patterns and some links:
(#)\w+(#)
(\#)+(.)+(\#)
https://regexr.com/
http://php.net/manual/en/function.preg-match.php
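You can search for the last "#" from the end with strrpos(),
like $end = strrpos($sentence, "#");
and then take the substring between the two positions (the third argument of substr() is a length, not an end position):
$start = strpos($sentence, "#new") + strlen("#new ");
$new = substr($sentence, $start, $end - $start);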
This is my string:
monkey/rabbit/cat/donkey/duck
If my variable is cat...
$animal = cat
... I want to remove everything coming after cat.
My desired result is:
monkey/rabbit/cat
I tried to use str_replace:
$subject = 'monkey/rabbit/cat/donkey/duck';
$trimmed = str_replace($animal, '', $subject);
echo $trimmed;
But here I get the result:
monkey/rabbit//donkey/duck
So it is just cutting out cat.
You can combine strpos with substr:
$pos = strpos($subject, $animal);
if ($pos !== false) {
$result = substr($subject, 0, $pos + strlen($animal));
}
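If you would like to make sure that only full segments are removed in case of a partial match, you can use the offset argument of strpos to cut at the next separator:
$pos = strpos($subject, $animal);
if ($pos !== false) {
    $end = strpos($subject, '/', $pos); // next separator after the match, or false if the match is in the last segment
    $result = ($end === false) ? $subject : substr($subject, 0, $end);
}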
You can use explode in your case:
$string = "monkey/rabbit/cat/donkey/duck";
$val = explode("donkey", $string );
echo $val[0];
Result: monkey/rabbit/cat
PS: Of course there are better ways to do this.
My approach would be to explode by your variable.
Take the first part and append the variable.
<?php
$string = 'monkey/rabbit/cat/donkey/duck';
$animal = 'cat';
$temp = explode($animal,$string);
print $temp[0] . $animal;
Will output nicely
monkey/rabbit/cat
There's no need to use any of strpos, strlen, substr or donkeys
<?php
$animal="cat";
$string1="monkey/rabbit/cat/donkey/duck";
$parts = explode($animal, $string1);
$res = $parts[0];
print("$res$animal")
?>
Here is a bit of explanation for what each step does:
$subject = 'monkey/rabbit/cat/donkey/duck';
$target = 'cat';
$target_length = strlen($target); // get the length of your target string
$target_index = strpos($subject, $target); // find the position of your target string
$new_length = $target_index + $target_length; // find the length of the new string
$new_subject = substr($subject, 0, $new_length); // trim to the new length using substr
echo $new_subject;
This can all be combined into one statement.
$new_subject = substr($subject, 0, strpos($subject, $target) + strlen($target));
This assumes your target is found. If the target is not found, the subject will be trimmed to the length of the target, which obviously is not what you want. For example, if your target string was "fish" the new subject would be "monk". This is why the other answer checks if ($pos !== false) {.
One of the comments on your question raises a valid point. If you search for a string that happens to be contained in one of the other strings, you may get unexpected results. There is really not a good way to avoid this problem when using the substr/strpos method. If you want to be sure to only match a complete word between your separators (/), you can explode by / and search for your target in the resulting array.
$subject = explode('/', $subject); // convert to array
$index = array_search($target, $subject); // find the target
if ($index !== false) { // if it is found,
$subject = array_slice($subject, 0, $index + 1); // remove the end of the array after it
}
$new_subject = implode('/', $subject); // convert back to string
I'm probably going to cop some flak for going down the RegExp route but...
$subject = 'monkey/rabbit/polecat/cat/catfish/duck';
$animal = "cat";
echo preg_replace('~(.*(?:/|^)' . preg_quote($animal, '~') . ')(?:/|$).*~i', "$1", $subject); // pass the delimiter to preg_quote so "~" in $animal is escaped too
This will ensure that your animal is wrapped immediately on either side with / characters, or that it's at the start or end of the string (i.e. monkey or duck).
So in this example it'll output:
monkey/rabbit/polecat/cat
Ending specifically with cat rather than stumbling at polecat or catfish
I am trying to use a script to search a text file and return words that meet certain criteria:
*The word is listed only once
*The word is not in an ignore list
*The word is among the top 10% of the longest words
*The word has no repeating letters
*The final list would be a random ten words that meet the above criteria.
*If any of the above were false, the words reported would be null.
I've put together the following, but the script dies at arsort() saying it expects an array. Can anyone suggest a change to make arsort work? Or suggest an alternative (simpler) script to find metadata? I realize this second question may be better suited for another StackExchange site.
<?php
$fn="../story_link";
$str=readfile($fn);
function top_words($str, $limit=10, $ignore=""){
if(!$ignore) $ignore = "the of to and a in for is The that on said with be was by";
$ignore_arr = explode(" ", $ignore);
$str = trim($str);
$str = preg_replace("#[&].{2,7}[;]#sim", " ", $str);
$str = preg_replace("#[()°^!\"§\$%&/{(\[)\]=}?´`,;.:\-_\#'~+*]#", " ", $str);
$str = preg_replace("#\s+#sim", " ", $str);
$arraw = explode(" ", $str);
foreach($arraw as $v){
$v = trim($v);
if(strlen($v)<3 || in_array($v, $ignore_arr)) continue;
$arr[$v]++;
}
arsort($arr);
return array_keys( array_slice($arr, 0, $limit) );
}
$meta_keywords = implode(", ", top_words( strip_tags( $html_content ) ) );
?>
The problem occurs when your loop never increments $arr[$v], so $arr is never defined. arsort() is then given null as its argument instead of an array, which causes the error.
The solution is to define $arr as an array before the loop, to cover the cases where $arr[$v]++; is never executed.
function top_words($str, $limit=10, $ignore=""){
if(!$ignore) $ignore = "the of to and a in for is The that on said with be was by";
$ignore_arr = explode(" ", $ignore);
$str = trim($str);
$str = preg_replace("#[&].{2,7}[;]#sim", " ", $str);
$str = preg_replace("#[()°^!\"§\$%&/{(\[)\]=}?´`,;.:\-_\#'~+*]#", " ", $str);
$str = preg_replace("#\s+#sim", " ", $str);
$arraw = explode(" ", $str);
$arr = array(); // Defined $arr here.
foreach($arraw as $v){
$v = trim($v);
if(strlen($v)<3 || in_array($v, $ignore_arr)) continue;
$arr[$v] = isset($arr[$v]) ? $arr[$v] + 1 : 1; // start new words at 1 to avoid an undefined-key warning
}
arsort($arr);
return array_keys( array_slice($arr, 0, $limit) );
}
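Separately from the arsort() fix, note that readfile() prints the file and returns the number of bytes read, so $str in the question is an integer; file_get_contents() returns the contents as a string. The counting itself can also be sketched with built-ins. The helper name count_words and its default ignore list are assumptions for illustration, and str_word_count() only recognises basic Latin letters, apostrophes and hyphens:

```php
<?php
// Hypothetical helper: word frequencies, most frequent first.
function count_words($str, $ignore = "the of to and a in for is that on said with be was by") {
    $ignore_arr = explode(" ", $ignore);
    // Format 1 makes str_word_count() return the words as an array.
    $words = str_word_count(strtolower(strip_tags($str)), 1);
    $words = array_filter($words, function ($w) use ($ignore_arr) {
        return strlen($w) >= 3 && !in_array($w, $ignore_arr);
    });
    $counts = array_count_values($words); // always an array, even for empty input
    arsort($counts);
    return $counts;
}

// Usage with a file (path taken from the question):
// $str = file_get_contents("../story_link");
print_r(array_slice(count_words("The cat and the dog. The cat sat."), 0, 10));
```

Because array_count_values() always returns an array, arsort() never receives null here.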
I came across some excellent code that works well for this:
<?php
function extract_keywords($str, $minWordLen = 3, $minWordOccurrences = 2, $asArray = false, $maxWords = 5, $restrict = true)
{
$str = str_replace(array("?","!",";","(",")",":","[","]"), " ", $str);
$str = str_replace(array("\n","\r"," "), " ", $str);
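$str = strtolower($str); // strtolower() returns the result; it does not modify $str in place
// Guard the nested definition so that calling extract_keywords() twice does not redeclare it
if (!function_exists('keyword_count_sort')) {
function keyword_count_sort($first, $sec)
{
return $sec[1] - $first[1];
}
}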
$str = preg_replace('/[^\p{L}0-9 ]/', ' ', $str);
$str = trim(preg_replace('/\s+/', ' ', $str));
$words = explode(' ', $str);
// If we don't restrict tag usage, we'll remove common words from array
if ($restrict == false) {
$commonWords = array('a','able','about','above', 'get a list here http://www.wordfrequency.info','you\'ve','z','zero');
$words = array_udiff($words, $commonWords,'strcasecmp');
}
// Restrict Keywords based on values in the $allowedWords array
// Use if you want to limit available tags
if ($restrict == true) {
$allowedWords = array('engine','boeing','electrical','pneumatic','ice','pressurisation');
$words = array_uintersect($words, $allowedWords,'strcasecmp');
}
$keywords = array();
while(($c_word = array_shift($words)) !== null)
{
if(strlen($c_word) < $minWordLen) continue;
$c_word = strtolower($c_word);
if(array_key_exists($c_word, $keywords)) $keywords[$c_word][1]++;
else $keywords[$c_word] = array($c_word, 1);
}
usort($keywords, 'keyword_count_sort');
$final_keywords = array();
foreach($keywords as $keyword_det)
{
if($keyword_det[1] < $minWordOccurrences) break;
array_push($final_keywords, $keyword_det[0]);
}
$final_keywords = array_slice($final_keywords, 0, $maxWords);
return $asArray ? $final_keywords : implode(', ', $final_keywords);
}
$text = "Many systems that traditionally had a reliance on the pneumatic system have been transitioned to the electrical architecture. They include engine start, API start, wing ice protection, hydraulic pumps and cabin pressurisation. The only remaining bleed system on the 787 is the anti-ice system for the engine inlets. In fact, Boeing claims that the move to electrical systems has reduced the load on engines (from pneumatic hungry systems) by up to 35 percent (not unlike today’s electrically power flight simulators that use 20% of the electricity consumed by the older hydraulically actuated flight sims).";
echo extract_keywords($text);
// Advanced Usage
// $exampletext = "The quick brown fox jumped over the lazy dog. The quick brown fox jumped over the lazy dog. The quick brown fox jumped over the lazy dog.";
// echo extract_keywords($exampletext, 3, 1, false, 5, false);
?>