PHP Simple Spell Checker (Binary Search help)

I need help performing a binary search for a search term ($searchTerm) in a dictionary ($dictionary).
Basically, the script reads a dictionary file into an array. The user inputs some words, and that string becomes $checkMe. I run explode() on it to get $explodedCheckMe, then pass each term in $explodedCheckMe to binarySearch() as $searchTerm (okay, my code is confusing). I think my logic is sound, but my syntax isn't ...
I've been using this a lot: http://us3.php.net/manual/en/function.strcasecmp.php
Here is my code: paste2.org/p/457232

I know this doesn't directly answer your question, but have you considered using pspell and a custom dictionary?

So you are looking up exact strings in the dictionary. Why don't you use a simple array for this? PHP's native hash table is definitely going to be faster than a binary search implemented in PHP.
while (($line = fgets($file)) !== false) {
    // trim() strips the trailing newline that fgets() keeps
    $dictionary[strtolower(trim($line))] = 1;
}
...
function search($searchTerm, $dictionary) {
    // isset() avoids a warning when the word is not in the dictionary
    if (isset($dictionary[strtolower($searchTerm)])) {
        // do something
    }
}
But if you really want to use a binary search, try this:
function binarySearch($searchTerm, $dictionary) {
    $minVal = 0;
    $maxVal = count($dictionary);
    while ($minVal < $maxVal) {
        $guess = intval($minVal + ($maxVal - $minVal) / 2);
        $result = strcasecmp($dictionary[$guess], $searchTerm);
        if ($result == 0) {
            echo "FOUND";
            return;
        }
        elseif ($result < 0) {
            $minVal = $guess + 1;
        }
        else {
            $maxVal = $guess;
        }
    }
}
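The main problem was that you can't set $maxVal to $guess - 1. In the version above, $maxVal is an exclusive upper bound (it starts at count($dictionary)), so it has to be set to $guess; using $guess - 1 would skip the element just below the guess. See the Wikipedia article on binary search, it's really good.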
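To tie the pieces together, here is a minimal sketch of the whole spell-check flow using the hash-table lookup suggested above. The function name findMisspelled and the inline word list are hypothetical stand-ins; in the real script the words would come from the dictionary file.

```php
<?php
// Hypothetical word list standing in for the dictionary file.
$words = ['Apple', 'banana', 'cherry'];

// Build a lowercase lookup table: word => true.
$dictionary = array_fill_keys(array_map('strtolower', $words), true);

/**
 * Return the terms from $checkMe that are not keys of $dictionary.
 * The comparison is case-insensitive because both sides are lowercased.
 */
function findMisspelled(string $checkMe, array $dictionary): array
{
    $misspelled = [];
    // Split on any run of whitespace and drop empty pieces.
    foreach (preg_split('/\s+/', trim($checkMe), -1, PREG_SPLIT_NO_EMPTY) as $term) {
        if (!isset($dictionary[strtolower($term)])) {
            $misspelled[] = $term;
        }
    }
    return $misspelled;
}

print_r(findMisspelled('apple Bananna cherry', $dictionary));
```

For the sample input, only 'Bananna' is reported, since 'apple' and 'cherry' match the dictionary regardless of case.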

Related

Math / statistics problem analyse words in string

I'm in need of some help: I am trying to analyse news articles.
I have a list of positive words and a list of negative words. I am searching the article contents for instances of the words and counting them up.
My problem is that the negative word list is a lot longer than the positive one, so all the results are skewed towards negative.
I am looking for a way to normalise the results so that a positive word is weighted slightly against a negative one, to even out the fact that there is a considerably higher chance of finding a negative word. Unfortunately I have no idea where to start.
I appreciate you taking the time to read this.
Below is the code I have so far.
function process_scores($content)
{
    $positive_score = 0;
    for ($i = 0; $i < count($this->positive_words); $i++) {
        if($this->positive_words[$i] != "")
        {
            $c = substr_count( strtolower($content) , $this->positive_words[$i] );
            if($c > 0)
            {
                $positive_score += $c;
            }
        }
    }
    $negative_score = 0;
    for ($i = 0; $i < count($this->negative_words); $i++) {
        if($this->negative_words[$i] != "")
        {
            $c = substr_count( strtolower($content) , $this->negative_words[$i] );
            if($c > 0)
            {
                $negative_score += $c;
            }
        }
    }
    return ["positive_score" => $positive_score, "negative_score" => $negative_score];
}

So I don't know PHP, but this seems less like a PHP question and more like a question of method. Right now, when you analyze an article, you count words as positive or negative based on whether they are in your dictionaries, but because the dictionaries are of different sizes, you feel this isn't giving you a fair analysis of the article.
One method you could try is to assign each word in the article a value. If a word does not exist in your dictionary, have the program prompt for manual interpretation of the word through the command line. Then decide whether the word is positive, negative, or neutral, and have the program add that word to the appropriate dictionary. This will be really annoying at first, but English speakers use roughly the same 2000 words for almost all of our conversation, so after a few articles, you will have robust dictionaries and not have to worry about skew because every single word will have been assigned a value.

I would suggest just applying a weighting factor to the output. The exact weighting is determined by trial and error. I went ahead and refactored your code, since there was some repetition:
<?php
class WordScore {
    private $negative_words = [];
    private $positive_words = [];
    private $positive_weight = 1;
    private $negative_weight = 1;

    // Set the multipliers applied to the raw positive and negative counts.
    public function setScore(float $pos = 1, float $neg = 1) {
        $this->negative_weight = $neg;
        $this->positive_weight = $pos;
    }

    public function processScores($content) {
        $positive_score = $this->countWords($content, $this->positive_words);
        $negative_score = $this->countWords($content, $this->negative_words);
        return [
            "positive_score" => $positive_score * $this->positive_weight,
            "negative_score" => $negative_score * $this->negative_weight
        ];
    }

    // Count case-insensitive occurrences of every word in $words within $content.
    private function countWords(string $content, array $words) {
        $content = strtolower($content);
        $count = 0;
        foreach ($words as $word) {
            if ($word === "") {
                continue; // substr_count() rejects an empty needle
            }
            $count += substr_count($content, strtolower($word));
        }
        return $count;
    }
}
Working example at http://sandbox.onlinephpfunctions.com/code/19b4ac3c12d35cf253e9fa6049e91508e4797a2e
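If trial and error feels too arbitrary, one possible starting point is to scale each count by the inverse of its list length, so that a longer list does not dominate merely by being longer. This is only a sketch under that assumption; the helper name suggestWeights and the sample word lists are hypothetical, and the resulting weights would still need tuning against real articles.

```php
<?php
/**
 * Suggest starting weights that divide each raw count by the size of its
 * word list. This is an illustrative heuristic, not a statistically
 * derived rule.
 */
function suggestWeights(array $positiveWords, array $negativeWords): array
{
    // max(1, ...) guards against division by zero for an empty list.
    $pos = max(1, count($positiveWords));
    $neg = max(1, count($negativeWords));
    return [
        'positive_weight' => 1 / $pos,
        'negative_weight' => 1 / $neg,
    ];
}

// Sample lists: the negative list is twice as long as the positive one.
$weights = suggestWeights(['good', 'great'], ['bad', 'poor', 'awful', 'grim']);
print_r($weights);
```

With these sample lists the positive weight comes out at 0.5 and the negative weight at 0.25, and the two values can be passed to a weighting setter such as setScore().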

PHP Mathematical Equation with multiple functions in it

I have a mathematical question with PHP. Where n is a positive integer, this function f(n) satisfies the following.
This is a question asked in my programming class, and I am now trying to create a program to find f(n) using PHP. I am confused because the equation contains more than one function and I do not know how to express that in PHP. If you have any idea how to turn this equation into code, please post it. I would like to know how to write PHP code to solve this kind of mathematical equation.

If you look closely at the equation, you will find that it is the Fibonacci series. You can solve it using a recursive function like this:
function fib($n) {
    if ($n < 0) {
        return NULL;
    } elseif ($n === 0) {
        return 0;
    } elseif ($n === 1 || $n === 2) {
        return 1;
    } else {
        return fib($n-1) + fib($n-2);
    }
}
As you can see, I am calling the same function until a base condition is satisfied. Hope this helps.
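The plain recursive version recomputes the same values many times, so it slows down quickly as n grows. As an alternative, here is a sketch of an iterative version that follows the same base cases; the name fibIterative is hypothetical.

```php
<?php
/**
 * Iterative Fibonacci: fib(0) = 0, fib(1) = 1, fib(n) = fib(n-1) + fib(n-2).
 * Returns null for negative input, matching the recursive version.
 * On 64-bit PHP the result is an exact integer up to n = 92; beyond that
 * the value overflows into an approximate float.
 */
function fibIterative(int $n)
{
    if ($n < 0) {
        return null;
    }
    $prev = 0; // fib(0)
    $curr = 1; // fib(1)
    for ($i = 0; $i < $n; $i++) {
        // Slide the window one step along the sequence.
        [$prev, $curr] = [$curr, $prev + $curr];
    }
    return $prev;
}

echo fibIterative(10), "\n"; // prints 55
```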

Is f(n) in Ω(g(n)), Θ(g(n)) or O(g(n))?

Given two functions in PHP, say
function f($n) {
    return $n;
}
function g($n) {
    return pow($n, (2/3));
}
How to check if a function f(n) is in Ω(g(n)), Θ(g(n)) or O(g(n)) in PHP?
What I tried so far:
$n = INF;
$A = f($n) / g($n);
if ($A == 0) {
    echo "f(n) = O(g(n))";
} elseif (is_infinite($A)) {
    echo "f(n) = Ω(g(n))";
} elseif ($A != 0) {
    echo "f(n) = Θ(g(n))";
}
Shouldn't that work?

Your basic idea is correct: you have to find the limit of f(n)/g(n) as n grows without bound. Unfortunately there is no easy way to compute the exact limit in PHP, since that requires symbolic computation, which is best left to a computer algebra system such as Mathematica or Maxima.
You can approximate the limit by computing f(n)/g(n) for increasing values of n and seeing whether you get a sequence that approaches a fixed value. For example:
$n = 1;
while ($n < 1e300) {
    $A = f($n) / g($n);
    echo $A, "\n";
    $n *= 1e12;
}
In this particular case the sequence of f(n)/g(n) values seems to grow without bound (the ratio is n^(1/3)), so the numerical evidence suggests that f(n) is in Ω(g(n)) but not in O(g(n)). This is not a proof though; symbolic methods are needed for that.
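For this specific pair the limit can also be worked out by hand, which is a useful check on the numerical evidence. A short derivation, assuming the usual limit criteria for asymptotic notation:

```latex
\[
\lim_{n\to\infty} \frac{f(n)}{g(n)}
  = \lim_{n\to\infty} \frac{n}{n^{2/3}}
  = \lim_{n\to\infty} n^{1/3}
  = \infty
\]
\[
\lim_{n\to\infty} \frac{f(n)}{g(n)} = \infty
\;\Longrightarrow\;
f(n) \in \Omega(g(n)),\quad
f(n) \notin O(g(n)),\quad
f(n) \notin \Theta(g(n))
\]
```

A limit of 0 would instead put f(n) in O(g(n)) only, and a finite non-zero limit would put it in Θ(g(n)).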

Both the time and space requirements for both f() and g() are in Ω(1), Θ(1) and O(1).

MySQL / PHP, but more of a MATH Question (Shortening Script)

For my latest project I need to shorten URLs, which I then store in a MySQL database.
I have now run into a problem that I don't know how to solve. Basically, the shortened strings should look like this (I want to include lowercase letters, uppercase letters and numbers):
a
b
...
z
0
...
9
A
...
Z
aa
ab
ac
...
ba
So the first URL --> a, stored in MySQL.
Next time, a new URL gets stored as b, because a is already in the MySQL database.
And that is it. But I don't have any idea how to do this. Could someone please help me out?
Edit: Formatted & further explanation.
It is kind of like the imgur.com URL shortening service. It should continue like this until infinity (which is not needed, I think...)

You can use the following function (code adapted from my personal framework):
function Base($input, $output, $number = 1, $charset = 'abcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ')
{
    if (strlen($charset) >= 2)
    {
        $input = max(2, min(intval($input), strlen($charset)));
        $output = max(2, min(intval($output), strlen($charset)));
        $number = ltrim(preg_replace('~[^' . preg_quote(substr($charset, 0, max($input, $output)), '~') . ']+~', '', $number), $charset[0]);
        if (strlen($number) > 0)
        {
            if ($input != 10)
            {
                $result = 0;
                foreach (str_split(strrev($number)) as $key => $value)
                {
                    $result += pow($input, $key) * intval(strpos($charset, $value));
                }
                $number = $result;
            }
            if ($output != 10)
            {
                $result = $charset[$number % $output];
                while (($number = intval($number / $output)) > 0)
                {
                    $result = $charset[$number % $output] . $result;
                }
                $number = $result;
            }
            return $number;
        }
        return $charset[0];
    }
    return false;
}
Basically you just need to grab the newly generated auto-incremented ID (this also makes sure you don't generate any collisions) from your table and pass it to this function like this:
$short_id = Base(10, 62, $auto_increment_id);
Note that the first and second arguments define the input and output bases, respectively.
Also, I've reordered the charset from the "default" 0-9a-zA-Z to comply with your examples.
You can also just use base_convert() if you can live without the mixed alphabet case (base 36).
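One detail to be aware of: in a positional base-62 conversion 'a' plays the role of zero, so ID 62 becomes "ba", whereas the sequence in the question goes from Z straight to aa. If that exact sequence matters, a "bijective" numbering (no zero digit) reproduces it. The following is a sketch under that reading of the question; the function name shortCode is hypothetical.

```php
<?php
/**
 * Map a positive auto-increment ID to a short code in the order
 * a, b, ..., z, 0, ..., 9, A, ..., Z, aa, ab, ... (bijective base 62).
 */
function shortCode(int $id, string $charset = 'abcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ'): string
{
    if ($id < 1) {
        throw new InvalidArgumentException('ID must be a positive integer');
    }
    $base = strlen($charset);
    $code = '';
    while ($id > 0) {
        $id--; // shift to 0-based so there is no "zero" digit
        $code = $charset[$id % $base] . $code;
        $id = intdiv($id, $base);
    }
    return $code;
}

echo shortCode(1), "\n";  // a
echo shortCode(62), "\n"; // Z
echo shortCode(63), "\n"; // aa
```

As with the function above, feeding it the auto-incremented ID keeps the codes collision-free.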

Programming In General - Binary Search Algorithms

Given an array $array of N numbers and a key $key, write the binary search algorithm in plain English. If $array contains $key, return the index of the $key; otherwise, return -1.
Can someone show me how to do this?

Doesn't seem like I should give you the code here, but maybe this description can help?
1. Sort the list.
2. Let i = length / 2.
3. Compare the term at index i to your key.
   a. If they are equal, return the index.
   b. If the key is greater than this term, repeat step 3 (recurse) on the upper half of the list: i = (i + length) / 2 (or (i + top) / 2, depending on how you implement it).
   c. If the key is less than this term, repeat step 3 on the lower half: i = i / 2 or (i + bottom) / 2.
4. Stop recursing if/when the new i is the same as the old i. This means you've exhausted the search. Return -1.
Watch out for off-by-one errors, which can make you exclude certain terms by mistake or cause infinite recursion, but this is the general idea. Pretty straightforward.
Think of it as playing 'Guess the number' for the numbers 1 through 100. You take a guess, I tell you higher or lower. You say 50, I say lower. You say 25, I say higher. You say 37...

I know this is a little late :), but take it anyway. It also shows that a recursive binary search on a sorted array is faster than in_array(), which scans the array linearly.
function binarySearch($A, $value, $starting, $ending)
{
    if ($ending < $starting) {
        return -1;
    }
    $mid = intval(($starting + $ending) / 2);
    if ($value === $A[$mid]) {
        return $mid;
    } elseif ($value < $A[$mid]) {
        $ending = $mid - 1;
    } else {
        $starting = $mid + 1;
    }
    return binarySearch($A, $value, $starting, $ending);
}
// Build a sorted array of 1,000,000 integers.
$arr = [];
for ($i = 0; $i < 1000000; $i++) {
    $arr[$i] = $i;
}
$value = 99999;
$msc = microtime(true);
$found = in_array($value, $arr);
$msc = microtime(true) - $msc;
echo "Time taken for in_array() : " . round($msc * 1000, 3) . ' ms <br>';
if ($found)
    echo $value . ' found.';
else
    echo $value . ' not found';
echo "<br><br>";
$msc = microtime(true);
// $ending is the last valid index, so pass count($arr) - 1.
$pos = binarySearch($arr, $value, 0, count($arr) - 1);
$msc = microtime(true) - $msc;
echo "Time taken for recursive function : " . round($msc * 1000, 3) . ' ms<br>';
if ($pos >= 0)
    echo $value . ' found.';
else
    echo $value . ' not found';
Output:
Time taken for in_array() : 5.165 ms
99999 found.
Time taken for recursive function : 0.121 ms
99999 found.

Here is a better non-recursive solution. It returns TRUE or FALSE rather than an index, and the array must already be sorted.
function fast_in_array($elem, $array){
    $top = sizeof($array) - 1;
    $bot = 0;
    while ($top >= $bot)
    {
        $p = floor(($top + $bot) / 2);
        if ($array[$p] < $elem) $bot = $p + 1;
        elseif ($array[$p] > $elem) $top = $p - 1;
        else return TRUE;
    }
    return FALSE;
}
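Since the original question asks for the index of $key (or -1 when it is absent), the same loop can be adapted to return a position instead of a boolean. This is a sketch along those lines; the name binarySearchIndex is hypothetical, and the input array is assumed to be sorted in ascending order with consecutive integer keys starting at 0.

```php
<?php
/**
 * Iterative binary search over a sorted, 0-indexed array.
 * Returns the index of $key, or -1 if $key is not present.
 */
function binarySearchIndex(array $array, $key): int
{
    $bot = 0;
    $top = count($array) - 1;
    while ($bot <= $top) {
        $mid = intdiv($bot + $top, 2); // integer midpoint, no float index
        if ($array[$mid] < $key) {
            $bot = $mid + 1;   // key can only be in the upper half
        } elseif ($array[$mid] > $key) {
            $top = $mid - 1;   // key can only be in the lower half
        } else {
            return $mid;
        }
    }
    return -1;
}

$sorted = [2, 3, 5, 7, 11, 13];
echo binarySearchIndex($sorted, 7), "\n"; // prints 3
echo binarySearchIndex($sorted, 4), "\n"; // prints -1
```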
