PHP get words from a word using pspell_check - php

I have a PHP string that contains English words run together. I want to extract every possible word from it. explode() on spaces does not help because the string is a single word; I mean extracting words from within a word.
Example: from the word "stackoverflow" I need to extract stack, over, flow and overflow.
I am using pspell_check() for the spell checking. Currently I get the following combinations:
--> sta
--> stac
--> stack
and so on.
So I only find the words that start at the first letter (stack), but I also want to find the following words. Note that I don't want the full word itself, since I already have it.
--> stack
--> over
--> flow
My Code:
$myword = "stackoverflow";
$word_length = strlen($myword);
$myword_prediction = $myword[0].$myword[1];
//(initial condition as words detection starts after 3rd index)
for ($i=2; $i<$word_length; $i++) {
$myword_prediction .= $myword[$i];
if (pspell_check(pspell_new("en"), $myword_prediction))
{
$array[] = $myword_prediction;
}
}
var_dump($array);

How about adding an outer loop like this? The first time through you start at the first character of $myword, the second time through you start at the second character, and so on.
$myword = "stackoverflow";
$word_length = strlen($myword);
$pspell_link = pspell_new("en"); // create the dictionary link once, not on every check
$array = array();
$startLetter = 0;
while ($startLetter < $word_length - 2) {
    // seed the candidate with the two letters at the current start position
    $myword_prediction = $myword[$startLetter] . $myword[$startLetter + 1];
    // extend it one letter at a time, beginning with the third letter
    for ($i = $startLetter + 2; $i < $word_length; $i++) {
        $myword_prediction .= $myword[$i];
        if (pspell_check($pspell_link, $myword_prediction)) {
            $array[] = $myword_prediction;
        }
    }
    $startLetter++;
}

Well, you would need to get all substrings, and check each one:
function get_all_substrings($input){
$subs = array();
$length = strlen($input);
for($i=0; $i<$length; $i++){
for($j=$i; $j<$length; $j++){
$subs[] = substr($input, $i, $j - $i + 1); // third argument is a length, not an end index
}
}
return array_unique($subs);
}
$substrings = get_all_substrings("stackoverflow");
$pspell_link = pspell_new("en");
$words = array_filter($substrings, function($word) use ($pspell_link) {
return pspell_check($pspell_link, $word);
});
var_dump($words);
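The substring approach can also be tried without the pspell extension by passing the dictionary check in as a callback. This is only a minimal sketch: find_embedded_words() is a hypothetical helper, and the small hard-coded word list is an assumption standing in for pspell_check(). The minimum length of 3 and the exclusion of the full word follow the question.

```php
<?php
// Hypothetical helper (not from the answers above): collects every substring
// of $input that is at least $minLength characters long, is not the whole
// input, and is accepted by the $isWord callback.
function find_embedded_words(string $input, callable $isWord, int $minLength = 3): array
{
    $found = [];
    $length = strlen($input);
    for ($start = 0; $start <= $length - $minLength; $start++) {
        for ($len = $minLength; $start + $len <= $length; $len++) {
            if ($len === $length) {
                continue; // skip the full word itself
            }
            $candidate = substr($input, $start, $len);
            if ($isWord($candidate)) {
                $found[$candidate] = true; // array keys de-duplicate
            }
        }
    }
    return array_keys($found);
}

// Stand-in dictionary (an assumption). With pspell installed the callback
// could instead wrap pspell_check($link, $w) around a link from pspell_new().
$dictionary = ['stack', 'over', 'flow', 'overflow', 'tack'];
$isWord = function (string $w) use ($dictionary): bool {
    return in_array($w, $dictionary, true);
};

$words = find_embedded_words('stackoverflow', $isWord);
print_r($words);
```

Which words come back depends entirely on the dictionary behind the callback, so a real spell checker will probably report more (and shorter) matches than this toy list.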

Related

PHP Explode Show Separator

So I wrote the following code to show the words after the fourth full stop / period in a sentence.
$text = "this.is.the.message.seperated.with.full.stops.";
$limit = 4;
$minText = explode(".", $text);
for($i = $limit; $i < count($minText); $i++){
echo $minText[$i];
}
The algorithm works and shows the rest of the sentence after the fourth full stop. My problem is that the output does not include the full stops, so I get just the text without the "." punctuation. Can someone please help me change the code so that it also displays the full stops?
Thanks a lot
You could try this:
for($i = $limit; $i < count($minText); $i++){
echo $minText[$i].".";
}
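Notice the added period at the end of the echo statement (.".";). Because $text ends with a period, explode() also returns a trailing empty string, so this loop prints one extra period at the very end. The full code: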
$text = "this.is.the.message.seperated.with.full.stops.";
$limit = 4;
$minText = explode(".", $text);
for($i = $limit; $i < count($minText); $i++){
echo $minText[$i].".";
}
Instead of splitting the input string and then iterating over it, you can find the position of the nth separator (.) by calling strpos() repeatedly and advancing its offset parameter each time.
Then it is just a matter of printing the substring that starts at the position we determined.
<?php
$text = "this.is.the.message.seperated.with.full.stops.";
$limit = 4;
$pos = 0;
//find the position of 4th occurrence of dot
for($i = 0; $i < $limit; $i++) {
$pos = strpos($text, '.', $pos) + 1;
}
print substr($text, $pos);
If the desired output is "seperated.with.full.stops.", then you can use:
<?php
$text = "this.is.the.message.seperated.with.full.stops.";
$limit = 4;
$minText = explode(".", $text);
$minText = array_slice($minText, $limit);
echo implode('.', $minText); // the trailing empty element from explode() already supplies the final period
If you want to break it up on the periods between words, but keep the one at the end as actual punctuation, you may want to use preg_replace() to convert the periods to another character and then explode it.
$text = "this.is.the.message.seperated.with.full.stops.";
$limit = 4;
//replace periods if they are followed by an alphanumeric character
$toSplit = preg_replace('/\.(?=\w)/', '#', $text);
$minText = explode("#", $toSplit);
for($i = $limit; $i < count($minText); $i++){
echo $minText[$i] . "<br/>";
}
Which yields:
seperated
with
full
stops.
Of course, if you just simply want to print all the full stops, then add them in after you echo the term.
echo $minText[$i] . ".";
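Another option, not mentioned in the answers above, is the optional limit argument of explode(). With a positive limit the last element holds the unsplit remainder of the string, separators included. A minimal sketch, reusing the $text and $limit values from the question ($parts and $rest are names introduced here for illustration):

```php
<?php
$text = "this.is.the.message.seperated.with.full.stops.";
$limit = 4;

// With a positive limit, explode() returns at most that many elements and
// the last element holds the rest of the string, periods included.
$parts = explode(".", $text, $limit + 1);

// If the text has fewer than $limit periods there is no remainder to show.
$rest = count($parts) > $limit ? $parts[$limit] : "";

echo $rest; // seperated.with.full.stops.
```

This avoids both the loop and the need to re-insert the periods afterwards.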

Count consecutive occurrence of specific, identical characters in a string - PHP

I am trying to calculate a few 'streaks', specifically the highest number of wins and losses in a row, but also the longest runs of games without a win and of games without a loss.
I have a string that looks like this: 'WWWDDWWWLLWLLLL'
For this I need to be able to return:
Longest consecutive run of the W character (I will then replicate this for L)
Longest consecutive run without the W character (I will then replicate this for L)
I have found and adapted the following, which goes through my string and tells me the longest sequence, but I can't seem to adapt it to meet the criteria above.
All help and learning greatly appreciated :)
function getLongestSequence($sequence){
$sl = strlen($sequence);
$longest = 0;
for($i = 0; $i < $sl; )
{
$substr = substr($sequence, $i);
$len = strspn($substr, $substr[0]); // the $substr{0} syntax was removed in PHP 8
if($len > $longest)
$longest = $len;
$i += $len;
}
return $longest;
}
echo getLongestSequence($sequence);
You can use a regular expression to detect sequences of identical characters:
$string = 'WWWDDWWWLLWLLLL';
// The regex captures any single character -> (.)
// followed by as many identical characters as possible -> \1+
// (so only runs of two or more characters are matched)
$pattern = '/(.)\1+/';
preg_match_all($pattern, $string, $m);
// sort the runs by length, longest first
usort($m[0], function($a, $b) {
return strlen($b) - strlen($a);
});
echo "Longest sequence: " . $m[0][0] . PHP_EOL;
You can get the maximum count of a consecutive character in a string using the code below.
$string = "WWWDDWWWLLWLLLL";
function getLongestSequence($str,$c) {
$len = strlen($str);
$maximum=0;
$count=0;
for($i=0;$i<$len;$i++){
if(substr($str,$i,1)==$c){
$count++;
if($count>$maximum) $maximum=$count;
}else $count=0;
}
return $maximum;
}
$match="W";//change to L for lost count D for draw count
echo getLongestSequence($string,$match);
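Neither answer covers the second requirement, the longest run without a given character. One possible sketch uses a negated character class for that case; longestRunOf() and longestRunWithout() are hypothetical helper names, and both assume $char is a single character.

```php
<?php
// Hypothetical helpers for illustration; $char is assumed to be one character.

// Longest run made up only of $char.
function longestRunOf(string $str, string $char): int
{
    preg_match_all('/' . preg_quote($char, '/') . '+/', $str, $m);
    return $m[0] ? max(array_map('strlen', $m[0])) : 0;
}

// Longest run that does not contain $char at all.
function longestRunWithout(string $str, string $char): int
{
    preg_match_all('/[^' . preg_quote($char, '/') . ']+/', $str, $m);
    return $m[0] ? max(array_map('strlen', $m[0])) : 0;
}

$string = 'WWWDDWWWLLWLLLL';
echo longestRunOf($string, 'W') . PHP_EOL;      // 3
echo longestRunWithout($string, 'W') . PHP_EOL; // 4
echo longestRunOf($string, 'L') . PHP_EOL;      // 4
echo longestRunWithout($string, 'L') . PHP_EOL; // 8
```

Both functions return 0 when there is no matching run, for example on an empty string.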

reorder / rewrap bbcodes

I'm trying to reorder BBCodes, but I failed. For example:
[b][i][u]foo[/b][/u][/i] - wrong order
I want it to be:
[b][i][u]foo[/u][/i][/b] - right order
I tried with
<?php
$string = '[b][i][u]foo[/b][/u][/i]';
$search = array('/\[b](.+?)\[\/b]/is', '/\[i](.+?)\[\/i]/is', '/\[u](.+?)\[\/u]/is');
$replace = array('[b]$1[/b]', '[i]$1[/i]', '[u]$1[/u]');
echo preg_replace($search, $replace, $string);
?>
OUTPUT: [b][i][u]foo[/b][/u][/i]
any suggestions ? thanks!
Phew, I spent a while thinking through the logic for this (feel free to put it in a function).
This only works for the scenario given. As other users have commented, it is impossible in general, and you shouldn't be doing this, not even on the server side. I'd use a client-side parser just to throw a syntax error.
It supports [b]a[i]b[u]foo[/b]baa[/u]too[/i]
and BBCode with custom values: [url=test][i][u]foo[/url][/u][/i]
It will break with
[b] bold [/b][u] underline[/u]
and [b] bold [u][/b] underline[/u]
//input string to be reorganized
$string = '[url=test][i][u]foo[/url][/u][/i]';
echo $string . "<br />";
//search for all opentags (including ones with values
$tagsearch = "/\[([A-Za-z]+)[A-Za-z=._%?&:\/-]*\]/";
preg_match_all($tagsearch, $string, $tags);
//search for all close tags to store them for later
$closetagsearch = "/(\[\/([A-Za-z]+)\])/is";
preg_match_all($closetagsearch, $string, $closetags);
//flip the open tags for reverse parsing (index one is just letters)
$tags[1] = array_reverse($tags[1]);
//create temp var to store new ordered string
$temp = "";
//this is the last known position in the original string after a match
$last = 0;
//iterate through each char of the input string
for ($i = 0, $len = strlen($string); $i < $len; $i++) {
//if we run out of tags to replace/find, stop looping
if (empty($tags[1]) || empty($closetags[1]))
break;
//this is the part of the string that has no matches
$good = substr($string, $last, $i - $last);
//next closing tag to search for
$next = $closetags[1][0];
//how many chars ahead to compare against
$scope = substr($string, $i, strlen($next));
//if we have a match
if ($scope === "$next") {
//add to the temp variable with a modified
//version of an open tag letter to become a close tag
$temp .= $good . substr_replace("[" . $tags[1][0] . "]", "/", 1, 0);
//remove the first key/value in both arrays
array_shift($tags[1]);
array_shift($closetags[1]);
//update the last known unmatched char
$last += strlen($good . $scope);
}
}
//append whatever follows the last rewritten closing tag
$temp .= substr($string, $last);
echo $temp;
Please also note: it might be the users intention to nest the tags out of order :X

Search for pattern in a string

Pattern search within a string.
For example:
$string = "111111110000";
FindOut($string);
Function should return 0
function FindOut($str){
$items = str_split($str, 3);
print_r($items);
}
If I understand you correctly, your problem comes down to finding out whether a substring of 3 characters occurs in a string twice without overlapping. If it does, this will return the position of the first occurrence:
function findPattern($string, $minlen=3) {
$max = strlen($string)-$minlen;
for($i=0;$i<=$max;$i++) {
$pattern = substr($string,$i,$minlen);
if(substr_count($string,$pattern)>1)
return $i;
}
return false;
}
Or am I missing something here?
What you have here can conceptually be solved with a sliding window. For your example, you have a sliding window of size 3.
For each character in the string, you take the substring of the current character and the next two characters as the current pattern. You then slide the window up one position, and check if the remainder of the string has what the current pattern contains. If it does, you return the current index. If not, you repeat.
Example:
1010101101
|-|
So, pattern = 101. Now, we advance the sliding window by one character:
1010101101
|-|
And see if the rest of the string has 101, checking every combination of 3 characters.
Conceptually, this should be all you need to solve this problem.
Edit: I really don't like it when people just ask for code, but since this seemed to be an interesting problem, here is my implementation of the above algorithm. It allows the window size to vary instead of being fixed at 3. The function is only briefly tested and omits obvious error checking:
function findPattern( $str, $window_size = 3) {
// Start the index at 0 (beginning of the string)
$i = 0;
// Loop while a full-size window can still be taken at position $i
// (a loose != false check would also stop early on a window such as "0")
while( strlen( $current_pattern = substr( $str, $i, $window_size)) === $window_size) {
$possible_matches = array();
// Get the combination of all possible matches from the remainder of the string
for( $j = 0; $j < $window_size; $j++) {
$possible_matches = array_merge( $possible_matches, str_split( substr( $str, $i + 1 + $j), $window_size));
}
// If the current pattern is in the possible matches, we found a duplicate, return the index of the first occurrence
if( in_array( $current_pattern, $possible_matches)) {
return $i;
}
// Otherwise, increment $i and grab a new window
$i++;
}
// No duplicates were found, return -1
return -1;
}
It should be noted that this certainly isn't the most efficient algorithm or implementation, but it should help clarify the problem and give a straightforward example on how to solve it.
It looks like you want to use a substring function to walk along the string and check every three-character slice, rather than just breaking it into chunks of 3.
function fp($s, $len = 3){
$max = strlen($s) - $len; //borrowed from lafor as it was a terrible oversight by me
$parts = array();
for($i=0; $i <= $max; $i++){ // <= so that the last window is checked too
$three = substr($s, $i, $len);
if(array_key_exists("$three",$parts)){
return $parts["$three"];
//if we've already seen it before then this is the first duplicate, so return the index where it first started
}
else{
$parts["$three"] = $i; //save the index of the starting position.
}
}
return false; //if we get this far then we didn't find any duplicate strings
}
Based on the str_split documentation, calling str_split on "1010101101" with a chunk length of 3 will result in:
Array(
[0] => 101
[1] => 010
[2] => 110
[3] => 1
)
None of these will match each other.
You need to look at each 3-long slice of the string (starting at index 0, then index 1, and so on).
I suggest looking at substr, which you can use like this:
substr($input_string, $index, $length)
And it will get you the section of $input_string starting at $index of length $length.
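To make the difference from str_split concrete, here is a small sketch that lists every 3-character slice of the same example string. The helper name windows() is made up for this illustration.

```php
<?php
// Hypothetical helper: returns every $length-character slice of $input,
// keyed by the index at which the slice starts.
function windows(string $input, int $length = 3): array
{
    $slices = [];
    for ($index = 0; $index + $length <= strlen($input); $index++) {
        $slices[$index] = substr($input, $index, $length);
    }
    return $slices;
}

print_r(windows("1010101101"));
// Slices at indexes 0 to 7: 101, 010, 101, 010, 101, 011, 110, 101
```

Unlike the four str_split chunks, these eight overlapping slices do repeat: "101" already appears at indexes 0, 2 and 4.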
A quick and dirty implementation of such a pattern search:
function findPattern($string){
$matches = 0;
$substrStart = 0;
while($matches < 2 && $substrStart+ 3 < strlen($string) && $pattern = substr($string, $substrStart++, 3)){
$matches = substr_count($string,$pattern);
}
if($matches < 2){
return null;
}
return $substrStart-1;
}

Swap every pair of characters in string

I'd like to get all the variations of a string in which one pair of characters has been swapped. For example:
Base string: abcd
Combinations:
bacd
acbd
abdc
etc.
Edit
I want to swap only letters that are next to each other. Like first with second, second with third, but not third with sixth.
What's the best way to do this?
Edit
Just for fun: there are three or four solutions, could somebody post a speed test of those so we could compare which is fastest?
Speed test
I ran a speed test of nickf's code and mine. Mine beats nickf's at four letters (0.08 and 0.06 for 10K runs), but nickf's beats mine at 10 letters (nickf's 0.24 and mine 0.37).
Edit: Markdown hates me today...
$input = "abcd";
$len = strlen($input);
$output = array();
for ($i = 0; $i < $len - 1; ++$i) {
$output[] = substr($input, 0, $i)
. substr($input, $i + 1, 1)
. substr($input, $i, 1)
. substr($input, $i + 2);
}
print_r($output);
nickf made a beautiful solution, thank you. I came up with a less beautiful one:
$arr=array(0=>'a',1=>'b',2=>'c',3=>'d');
for($i=0;$i<count($arr)-1;$i++){
$swapped="";
//Make normal before swapped
for($z=0;$z<$i;$z++){
$swapped.=$arr[$z];
}
//Create swapped
$i1=$i+1;
$swapped.=$arr[$i1].$arr[$i];
//Make normal after swapped.
for($y=$z+2;$y<count($arr);$y++){
$swapped.=$arr[$y];
}
$arrayswapped[$i]=$swapped;
}
var_dump($arrayswapped);
A quick Google search gave me this:
http://cogo.wordpress.com/2008/01/08/string-permutation-in-php/
How about just using the following:
function swap($s, $i)
{
$t = $s[$i];
$s[$i] = $s[$i+1];
$s[$i+1] = $t;
return $s;
}
$s = "abcd";
$l = strlen($s);
for ($i=0; $i<$l-1; ++$i)
{
print swap($s,$i)."\n";
}
Here is a slightly faster solution, as it's not overusing substr().
function swapcharpairs($input = "abcd") {
$swaps = array(); // so that a one-letter input returns an empty array
$pre = "";
$a="";
$b = $input[0];
$post = substr($input, 1);
while($post!='') {
$pre.=$a;
$a=$b;
$b=$post[0];
$post=substr($post,1);
$swaps[] = $pre.$b.$a.$post;
}
return $swaps;
}
print_r(swapcharpairs());
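For anyone who wants to repeat the speed comparison, a rough harness could look like the sketch below. The function names swapsWithSubstr(), swapsInPlace() and timeIt() are made up here; the first two wrap the substr-based and the index-swapping approaches from the answers above. Timings depend heavily on the PHP version and machine, so no figures are claimed.

```php
<?php
// Substring-based variant, following the first answer above.
function swapsWithSubstr(string $input): array
{
    $output = [];
    for ($i = 0, $len = strlen($input); $i < $len - 1; ++$i) {
        $output[] = substr($input, 0, $i)
            . $input[$i + 1]
            . $input[$i]
            . substr($input, $i + 2);
    }
    return $output;
}

// In-place variant: copy the string and exchange two neighbouring bytes.
function swapsInPlace(string $input): array
{
    $output = [];
    for ($i = 0, $len = strlen($input); $i < $len - 1; ++$i) {
        $s = $input;
        $s[$i] = $input[$i + 1];
        $s[$i + 1] = $input[$i];
        $output[] = $s;
    }
    return $output;
}

// Hypothetical timing helper: seconds taken by $runs calls of $fn.
function timeIt(callable $fn, string $input, int $runs = 10000): float
{
    $start = microtime(true);
    for ($n = 0; $n < $runs; $n++) {
        $fn($input);
    }
    return microtime(true) - $start;
}

foreach (['abcd', 'abcdefghij'] as $input) {
    printf("%-10s substr: %.4fs  in-place: %.4fs\n", $input,
        timeIt('swapsWithSubstr', $input), timeIt('swapsInPlace', $input));
}
```

Both variants work on bytes, so they assume single-byte characters, just like the answers they are based on.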
