Extract Keywords from text php


I am trying to extract relevant keywords from a description field that uses a WYSIWYG editor and holds multilingual content (English/Arabic). The following function does not do what I want. Here is the function I am using:
function extractKeyWords($string) {
    mb_internal_encoding('UTF-8');
    $stopwords = array();
    $string = preg_replace('/[\pP]/u', '', trim(preg_replace('/\s\s+/iu', '', mb_strtolower($string))));
    $matchWords = array_filter(explode(' ', $string), function ($item) use ($stopwords) {
        return !($item == '' || in_array($item, $stopwords) || mb_strlen($item) <= 2 || is_numeric($item));
    });
    $wordCountArr = array_count_values($matchWords);
    // <p><p>
    arsort($wordCountArr);
    return array_keys(array_slice($wordCountArr, 0, 10));
}

Figured it out! Thanks.
function generateKeywords($str)
{
    $min_word_length = 3;
    $avoid = ['the','to','i','am','is','are','he','she','a','an','and','here','there','can','could','were','has','have','had','been','welcome','of','home',' ','“','words','into','this','there'];
    $strip_arr = ["," ,"." ,";" ,":", "\"", "'", "“","”","(",")", "!","?"];
    $str_clean = str_replace($strip_arr, "", $str);
    $str_arr = explode(' ', $str_clean);
    $clean_arr = [];
    foreach ($str_arr as $word)
    {
        if (strlen($word) > $min_word_length)
        {
            $word = strtolower($word);
            if (!in_array($word, $avoid)) {
                $clean_arr[] = $word;
            }
        }
    }
    return implode(',', $clean_arr);
}

This one seems nice and comprehensive: https://www.beliefmedia.com.au/create-keywords
Just make sure to change this line
$string = preg_replace('/[^\p{L}0-9 ]/', ' ', $string);
to
$string = preg_replace('/[^\p{L}0-9 ]/u', ' ', $string);
The u modifier makes the pattern treat the string as UTF-8, which is needed to support other languages (e.g. Arabic).
It is also better to use mb_strlen instead of strlen, since strlen counts bytes rather than characters.

If the string is in HTML format, you can add
$str = strip_tags($str);
before
$min_word_length = 3;
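
Putting the suggestions from this thread together (strip the HTML, use the u modifier, use the mb_* functions, count frequencies), a minimal sketch could look like the one below. The function name extractTopKeywords, its parameters, and the defaults are my own assumptions, not something from the answers above, and it needs the mbstring extension.

```php
<?php
// Hypothetical helper: the name, parameters, and defaults are illustrative assumptions.
function extractTopKeywords(string $html, array $stopwords = [], int $minLength = 3, int $limit = 10): array
{
    // Put a space before every tag so "<p>a</p><p>b</p>" does not collapse into "ab".
    $text = strip_tags(str_replace('<', ' <', $html));
    $text = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');
    $text = mb_strtolower($text, 'UTF-8');

    // Keep letters, combining marks, and digits from any script; everything else becomes a space.
    $text = preg_replace('/[^\p{L}\p{M}\p{N}]+/u', ' ', $text) ?? '';
    $words = preg_split('/\s+/u', trim($text), -1, PREG_SPLIT_NO_EMPTY) ?: [];

    // Drop short words, plain numbers, and stop words (stop words are expected in lowercase).
    $words = array_filter($words, function ($word) use ($stopwords, $minLength) {
        return mb_strlen($word, 'UTF-8') >= $minLength
            && !is_numeric($word)
            && !in_array($word, $stopwords, true);
    });

    // Most frequent words first.
    $counts = array_count_values($words);
    arsort($counts);

    return array_keys(array_slice($counts, 0, $limit, true));
}
```

Called as extractTopKeywords($description, ['the', 'and']), it returns an array of up to ten words ordered by how often they occur. The stop-word list is left to the caller because a useful list depends on the languages involved.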

Related

PHP: How can I cut the word and add "..."

How can I cut the text and add "..." after 4 or 5 words?
The code below cuts by character count, but I now need it to cut by word.
Currently I have this code:
if(strlen($post->post_title) > 35 )
{
$titlep = substr($post->post_title, 0, 35).'...';
}
else
{
$titlep = $post->post_title;
}
and this is the output of title:
if ( $params['show_title'] === 'true' ) {
$title = '<h3 class="wp-posts-carousel-title">';
$title.= '' . $titlep . '';
$title.= '</h3>';
}

Typically, I'll explode the body and pull out the first x words.
$split = explode(' ', $string);
$new = array_slice ( $split, 0 ,5);
$newstring = implode( ' ', $new) . '...';
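Just know that this method is slow on long strings, because explode() splits the entire text even though only the first few words are needed.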

Variant #1
function crop_str_word($text, $max_words = 50, $sep = ' ')
{
    // split() was removed in PHP 7; explode() does the same job for a plain separator.
    $words = explode($sep, $text);
    if (count($words) > $max_words)
    {
        $text = implode($sep, array_slice($words, 0, $max_words));
        $text .= ' ...';
    }
    return $text;
}
Variant #2
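function crop_str_word($text, $max_words, $append = ' …')
{
    // Split into at most $max_words + 1 pieces; the last piece holds the remainder.
    $words = explode(' ', $text, $max_words + 1);
    if (count($words) > $max_words)
    {
        // Drop the remainder and append the marker only when something was cut.
        array_pop($words);
        $text = implode(' ', $words) . $append;
    }
    return $text;
}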
Variant #3
function crop_str_word($text, $max_words)
{
$words = explode(' ',$text);
if(count($words) > $max_words && $max_words > 0)
{
$text = implode(' ',array_slice($words, 0, $max_words)).'...';
}
return $text;
}

You should use PHP's str_replace function.
str_replace('your word', '...', $variable);
See the manual: http://php.net/manual/en/function.str-replace.php

In WordPress, this is done by the wp_trim_words() function. Its second argument is a number of words, not characters, and it appends the third argument only when the text is actually trimmed, so no length check is needed:
<?php
$titlep = wp_trim_words( $post->post_title, 5, '...' );
?>
If you want to trim by characters in plain PHP instead, write the code as below:
<?php
$titlep = strlen($post->post_title) > 35 ? substr($post->post_title, 0, 35).'...' : $post->post_title;
?>

determine which word from an array is in a string

I have a string:
$str = "Hello is a greeting";
And I have an array of words:
$equals = array("is", "are", "was", "were", "will", "has", "have", "do", "does");
I'm trying to see which word is in the string:
$words = explode(" ", $str);
$new_association = false;
foreach($words as $word) {
if(in_array($word, $equals)) {
$new_association = true;
$e['response'] = 'You made an association.';
// determine which 'equals' word was used.
// $equal_used = 'is';
}
}
How do I determine which equals word was used?

$new_association = false;
$equal_used = array_intersect($equals, explode(' ', $str));
if (!empty($equal_used)) {
    $new_association = true;
    var_dump($equal_used);
}

Mark's answer above is superior but if you'd rather stick with your current approach:
$words = explode(" ", $str);
$new_association = false;
foreach($words as $word) {
if(in_array($word, $equals)) {
$new_association = true;
$e['response'] = 'You made an association.';
// determine which 'equals' word was used.
$equal_used = $word;
}
}
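
Both answers compare the words exactly, so "Is" or "is," would be missed. If that matters, a small sketch along these lines lowercases the sentence and splits on anything that is not a letter, digit, or apostrophe before intersecting. The name findListedWords is made up for this example, and it assumes the candidate words are already lowercase.

```php
<?php
// Hypothetical helper: returns the candidate words that occur in the sentence,
// in the order they first appear. Candidates are assumed to be lowercase.
function findListedWords(string $sentence, array $candidates): array
{
    $words = preg_split(
        '/[^\p{L}\p{N}\']+/u',
        mb_strtolower($sentence, 'UTF-8'),
        -1,
        PREG_SPLIT_NO_EMPTY
    ) ?: [];

    // array_intersect keeps the values of the first array, so the result follows sentence order.
    return array_values(array_unique(array_intersect($words, $candidates)));
}
```

With the $equals array from the question, findListedWords("Hello IS a greeting, is it?", $equals) gives a one-element array containing "is".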

replacing word in string that has length of 40 or more

I have the following code:
$caption = $picture->getCaption();
$words = explode(" ", $caption);
foreach ($words as $word) {
$string_length = strlen($word);
if ($string_length > 40) {
str_replace($word, '', $caption);
$picture->setCaption($caption);
}
}
However, why doesn't this replace the caption with the trimmed word removed?

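You need to do it like this (str_replace returns the new string, so keep the result and set the caption once after the loop):
$caption = $picture->getCaption();
$words = explode(" ", $caption);
foreach ($words as $word)
{
    $string_length = strlen($word);
    if ($string_length > 40) {
        $caption = str_replace($word, '', $caption);
    }
}
$picture->setCaption($caption);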

You need to assign the replacement made:
$caption = str_replace($word, '', $caption);
I think this is much better:
$caption = $picture->getCaption();
// explode the caption by spaces and filter the pieces,
// keeping only the elements within the 40-character limit,
// then put them back together again with implode
$caption = implode(' ', array_filter(explode(' ', $caption), function($piece){
return mb_strlen($piece) <= 40;
}));
$picture->setCaption($caption);

You have to do it like this:
$caption = $picture->getCaption();
$words = explode(" ", $caption);
foreach ($words as $word)
{
    $string_length = strlen($word);
    if ($string_length > 40) {
        // Reassign $caption so that every long word is removed, not only the last one found.
        $caption = str_replace($word, '', $caption);
        $picture->setCaption($caption);
    }
}
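
For completeness, the same clean-up can be done without a loop by letting a regular expression match any run of more than 40 non-space characters. This is only a sketch: dropLongWords and its $maxLength parameter are names I made up, and the u modifier makes the limit count characters rather than bytes.

```php
<?php
// Hypothetical helper: removes every whitespace-delimited word longer than $maxLength characters.
function dropLongWords(string $caption, int $maxLength = 40): string
{
    // \S{41,} (for the default limit) matches a run of 41 or more non-space characters.
    $pattern = '/\S{' . ($maxLength + 1) . ',}/u';
    $without = preg_replace($pattern, '', $caption) ?? $caption;

    // Collapse the gaps left behind by the removed words.
    return trim(preg_replace('/\s+/u', ' ', $without) ?? $without);
}
```

Usage would then be a single line: $picture->setCaption(dropLongWords($picture->getCaption()));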

PHP: remove word from sentence if it contains #

I want to remove a word from a sentence if the word contains #. I am using PHP.
Input: Hi I am #RaghavSoni
Output: Hi I am
Thank You.

You could do:
$str = preg_replace('/#\w+/', '', $str);

This is not a good way, but it works:
<?php
$input = "Hi I am #RaghavSoni";
$inputWords = explode(' ', $input);
foreach ($inputWords as $el)
{
    // Skip empty pieces (from double spaces) before looking at the first character.
    if ($el !== '' && $el[0] == "#")
    {
        $input = str_replace($el, "", $input);
    }
}
echo $input;
?>

while (($location1 = strpos($string, '#')) !== false) {
    $location2 = strpos($string, ' ', $location1);
    if ($location2 !== false) {
        // Cut the word out and keep everything after it.
        $string = substr($string, 0, $location1) . substr($string, $location2);
    } else {
        // The # word is the last word: without this branch the loop would never end.
        $string = substr($string, 0, $location1);
    }
}
echo $string;

echo str_replace("#RaghavSoni", "", "Hi I am #RaghavSoni.");
# Output: Hi I am.
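
If the # can appear anywhere in the word (not only at the start), a pattern that matches the whole whitespace-delimited token may be closer to what was asked. A rough sketch, with removeHashtagWords being a name invented for this example:

```php
<?php
// Hypothetical helper: removes every whitespace-delimited token that contains '#'.
function removeHashtagWords(string $text): string
{
    // \S*#\S* matches the '#' together with any non-space characters around it.
    $clean = preg_replace('/\S*#\S*/u', '', $text) ?? $text;

    // Tidy up the doubled or trailing spaces left by the removed tokens.
    return trim(preg_replace('/\s+/u', ' ', $clean) ?? $clean);
}
```

Note that punctuation attached to the word (as in "#RaghavSoni.") counts as part of the token here and is removed with it.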

PHP string find and replace until last instance

I have a string in a DB table which is separated by commas, i.e. this,is,the,first,string
What I would like to do, and don't know how, is to have the string output like:
this, is, the, first and string
Note the spaces after the commas, and that the last comma is replaced by the word 'and'.

This can be your solution:
$str = 'this,is,the,first,string';
$str = str_replace(',', ', ', $str);
echo preg_replace('/(.*),/', '$1 and', $str);

First, use the function provided in this answer: PHP Replace last occurrence of a String in a String?
function str_lreplace($search, $replace, $subject)
{
$pos = strrpos($subject, $search);
if($pos === false)
{
return $subject;
}
else
{
return substr_replace($subject, $replace, $pos, strlen($search));
}
}
Then replace the last comma with ' and ', and perform an ordinary str_replace to put a space after all the other commas (remember to assign the result):
$string = str_lreplace(',', ' and ', $string);
$string = str_replace(',', ', ', $string);

$words = explode(',', $string);
$output = '';
for ($x = 0; $x < count($words); $x++) {
    if ($x == 0) {
        $output = $words[$x];
    } else if ($x == (count($words) - 1)) {
        // Last word: join with 'and' instead of a comma.
        $output .= ' and ' . $words[$x];
    } else {
        $output .= ', ' . $words[$x];
    }
}
echo $output;
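
Another way to look at it is as an array problem: pop the last item off, join the rest with ", ", and glue the last one back on with " and ". The sketch below does that and also copes with a single item or an empty string. joinWithAnd is a made-up name, and the arrow function needs PHP 7.4 or later.

```php
<?php
// Hypothetical helper: turns "a,b,c" into "a, b and c".
function joinWithAnd(string $csv): string
{
    // Split, trim each piece, and drop empty pieces (e.g. from a trailing comma).
    $items = array_values(array_filter(
        array_map('trim', explode(',', $csv)),
        fn ($item) => $item !== ''
    ));

    $last = array_pop($items);
    if ($last === null) {
        return '';          // nothing to join
    }
    if ($items === []) {
        return $last;       // a single item needs no separator
    }

    return implode(', ', $items) . ' and ' . $last;
}
```

For the string in the question, echo joinWithAnd('this,is,the,first,string'); prints: this, is, the, first and string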
