PHP: Add words to pspell?

I am using pspell like this:
$ps = pspell_new("en");
if (!pspell_check($ps, $word))
{
    $suggestion = pspell_suggest($ps, $word);
}
However, I want to add some industry terms to the list.
I looked up pspell_add_to_session, whose documentation says the first parameter is int $dictionary_link, but I do not know what that is and there is no example.

In your case, the $ps variable created by pspell_new() is that "dictionary link":
$ps = pspell_new("en");
pspell_add_to_session($ps, "somenewword");

The $dictionary_link integer is an integer representation of the pspell library handle, as returned by pspell_new or pspell_new_personal. (Since PHP 8.1 these functions return a PSpell\Dictionary object instead of an integer, but you pass it around the same way.) The PHP documentation is incomplete in a lot of places regarding this variable.
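A minimal sketch of adding several industry terms at once, assuming the pspell extension and an "en" dictionary are installed. The clean_terms() helper, its letters-only filter, and the sample terms are hypothetical. Words added with pspell_add_to_session() only last for the current session.

```php
<?php
// Hypothetical helper: trim, filter and de-duplicate a raw list of terms.
function clean_terms(array $terms): array
{
    $clean = [];
    foreach ($terms as $term) {
        $term = trim($term);
        // Assumption: keep only single words made of letters and apostrophes.
        if ($term !== '' && preg_match("/^[A-Za-z']+$/", $term)) {
            $clean[] = $term;
        }
    }
    return array_values(array_unique($clean));
}

$terms = clean_terms(['kubernetes', ' microservice ', 'kubernetes', 'two words']);

// Only touch pspell when the extension is actually loaded.
if (function_exists('pspell_new')) {
    $ps = pspell_new('en');
    if ($ps !== false) {
        foreach ($terms as $term) {
            pspell_add_to_session($ps, $term);
        }
        // Each added term should now pass the spell check.
        var_dump(pspell_check($ps, 'kubernetes'));
    }
}
```

For terms that should survive between requests, a personal word list (pspell_config_personal with pspell_add_to_personal and pspell_save_wordlist) is the persistent counterpart.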

Related

PHP How to extract variable from serialized mysql string?

Within a php file I'm trying to extract the user_name var in this string
user_name|s:11:"testaccount";user_email|s:27:"testaccount#testaccount.com";user_login_status|i:1;
I can't figure out what formatting this is though. I am using php mysqli to query the database with this function
$q = "SELECT `data` FROM `sessions` WHERE `id` = '".$this->dbc->real_escape_string($cookie)."' LIMIT 1";
where $cookie is a cookie of the client. Does anyone recognize the format of the string?

This is PHP's session serialization format. The name, email and status entries are separated by semicolons, and each name is separated from its value by a pipe. The value is in PHP's serialized form.
For example: user_name|s:11:"testaccount";
Unserialize s:11:"testaccount"; and you get the value testaccount.
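As a small illustration of that description, the sketch below splits one entry at the pipe and unserializes the value. It only handles a single entry; the entry string is copied from the question.

```php
<?php
// One entry from the session string: name, pipe, serialized value.
$entry = 'user_name|s:11:"testaccount";';

// Split at the first pipe only, in case the value itself contains one.
list($name, $serialized) = explode('|', $entry, 2);

// allowed_classes => false stops unserialize() from creating objects (PHP 7+).
$value = unserialize($serialized, ['allowed_classes' => false]);

echo "$name = $value\n"; // prints "user_name = testaccount"
```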

Figured it out. Used this function to do it https://gist.github.com/phred/1201412.
//
// This is the result of about an hour's delving into PHP's hairy-ass serialization internals.
// PHP provides a session_decode function, however, it's only useful for setting the contents of
// $_SESSION. Say, for instance, you want to decode the session strings that PHP stores in its
// session files -- session_decode gets you nowhere.
//
// There are a bunch of nasty little solutions on the manual page[1] that use pretty hairy regular
// expressions to get the job done, but I found a simple way to use PHP's unserialize and recurse
// through the string extracting all of the serialized bits along the way.
//
// It's not speedy (it calls unserialize AND serialize for each session element), but it's accurate
// because it uses PHP's internal serialized object parser. Fun trivia: PHP's serialized object
// parser is an ugly-ass little compiled regular expression engine. But hey, it works, let's not
// reinvent this wheel.
//
// [1]: http://www.php.net/manual/en/function.session-decode.php
//
define("SESSION_DELIM", "|");
function unserialize_session($session_data, $start_index=0, &$dict=null) {
    isset($dict) or $dict = array();
    $name_end = strpos($session_data, SESSION_DELIM, $start_index);
    if ($name_end !== FALSE) {
        $name = substr($session_data, $start_index, $name_end - $start_index);
        $rest = substr($session_data, $name_end + 1);
        $value = unserialize($rest); // PHP will unserialize up to "|" delimiter.
        $dict[$name] = $value;
        return unserialize_session($session_data, $name_end + 1 + strlen(serialize($value)), $dict);
    }
    return $dict;
}
$session_data = …; // A string from a PHP session store.
$session_dict = unserialize_session($session_data);

PHP/MySQL include alternative spellings in search

My PHP search form pulls data from a MySQL database. I expect users to sometimes type a search term whose spelling differs slightly from my database entry, like "theater" instead of "theatre." There are just a few of these that I expect to be very common, so I added an additional row to my database table that contains those alternative spellings, and my PHP search form searches this row of the database as well. It works well, but it will cause a lot of additional work when maintaining the database, so I'm wondering if there's something I can do within my PHP code to search for those predefined alternative spellings. (I don't mean giving the user suggested spellings; I want the search form to return, for example, entries that have "theatre" in them even though the user typed "theater.") Is there an easy way to do this (without a search server)?

Yes, you can do this easily without the extra database search. Since what you need is the correct spelling, I suggest handling it in your PHP code instead of in the database.
You can do this with the PHP Pspell module. Pspell works like the autocorrect on an Android keyboard: whenever the user types a misspelled word in the search box, you check it against the dictionary and replace it with the correct spelling. For example, if the user types "theater", it can be corrected to "theatre".
Before you start programming, check whether the Pspell module is installed. If it is not, this call fails with an undefined-function error:
<?php
$config_dic = pspell_config_create('en');
Here is a small function to help you understand how Pspell works:
<?php
function orthograph($string)
{
    // Suggests possible words in case of misspelling
    $config_dic = pspell_config_create('en');
    // Ignore words under 3 characters
    pspell_config_ignore($config_dic, 3);
    // Configure the dictionary
    pspell_config_mode($config_dic, PSPELL_FAST);
    $dictionary = pspell_new_config($config_dic);
    // To find out if a replacement has been suggested
    $replacement_suggest = false;
    // Split into words on whitespace; commas count as separators
    $string = preg_split('/\s+/', trim(str_replace(',', ' ', $string)), -1, PREG_SPLIT_NO_EMPTY);
    foreach ($string as $key => $value) {
        if (!pspell_check($dictionary, $value)) {
            $suggestion = pspell_suggest($dictionary, $value);
            // Suggestions are case sensitive. Grab the first one, if there is one.
            if (!empty($suggestion) && strtolower($suggestion[0]) != strtolower($value)) {
                $string[$key] = $suggestion[0];
                $replacement_suggest = true;
            }
        }
    }
    if ($replacement_suggest) {
        // We have a suggestion, so we return the corrected string.
        return implode(' ', $string);
    } else {
        return null;
    }
}
To use this function, pass it the search string:
<?php
$search = $_POST['input'];
$suggestion_spell = orthograph($search);
if ($suggestion_spell) {
    // Escape the output, since it is derived from user input
    echo "Try with this spelling : " . htmlspecialchars($suggestion_spell);
}
// List every suggestion for a single misspelled word
$dict = pspell_new("en");
if (!pspell_check($dict, "lappin")) {
    $suggestions = pspell_suggest($dict, "lappin");
    foreach ($suggestions as $suggestion) {
        echo "Did you mean: $suggestion?<br />";
    }
}
// Add a word to a personal word list and save it
$config_dic = pspell_config_create('en');
pspell_config_personal($config_dic, 'path/perso.pws');
pspell_config_ignore($config_dic, 2);
pspell_config_mode($config_dic, PSPELL_FAST);
$dic = pspell_new_config($config_dic);
pspell_add_to_personal($dic, "word");
pspell_save_wordlist($dic);
?>
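If only a handful of predefined alternative spellings matter, as the question describes, a plain lookup in PHP may be enough without Pspell. The sketch below is one possible approach, not part of the answer above: the variant groups, the function names, and the `entries` table and `title` column are all hypothetical, and it only handles a single-word search term.

```php
<?php
// Return the search term plus any predefined alternative spellings.
// $groups is a hypothetical list; each inner array holds equivalent spellings.
function expand_spellings(string $term, array $groups): array
{
    $needle = strtolower(trim($term));
    foreach ($groups as $group) {
        if (in_array($needle, $group, true)) {
            return $group;
        }
    }
    return [$needle];
}

// Build one "column LIKE ?" condition per spelling, for a prepared statement.
function build_where(string $column, array $spellings): string
{
    return implode(' OR ', array_fill(0, count($spellings), "`$column` LIKE ?"));
}

$groups = [['theater', 'theatre'], ['color', 'colour']];
$spellings = expand_spellings('Theater', $groups);
$where = build_where('title', $spellings);

// Values to bind to the placeholders, wrapped for a substring match.
$params = array_map(function ($s) { return "%$s%"; }, $spellings);

echo "SELECT * FROM `entries` WHERE $where\n";
```

The variant list lives in one place in the code, so the database rows no longer need to carry the alternative spellings.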

Search for function call in php in order to generate a translation file

I am developing a php website that needs to be multilingual.
For this reason, I implemented a translation function which has the following header:
function t($string, $replace_pairs = array(), $language = NULL)
Basically, this function is called like this in multiples files of my project:
echo '<p>' . t('Hello world!') . '</p>';
$hello_String = t("Hello #name!", array('#name'=>$username));
I haven't generated the translation strings yet and I would like to generate multiple translation files automatically (one for each language).
What I am looking for is a bash program (or a single command, using grep for example) that would look for every call to this t() function and generate a php file with the following structure:
<?php
/* Translation file "fr.php" */
$strings['fr']['Hello world!'] = '';
$strings['fr']['Hello #name!'] = '';
Has anyone ever encountered this situation and could help me with this ?
Thank you very much.
Kind regards,
Matthieu

Yes, you're not exactly the first to come across this. :)
You can use the venerable gettext system for this, you don't need to invent your own functions. Then you'd get to use xgettext, which is a command line utility to extract strings using the _() function.
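If you want to roll your own system for whatever reason, your best bet is to write a PHP script which uses token_get_all to tokenize the source, then go through the tokens and look for T_STRING tokens with the value t that are followed by an opening parenthesis. (T_FUNCTION only marks the function keyword in a declaration; calls show up as T_STRING.)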
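A rough sketch of the token_get_all approach, producing the file layout the question asks for. The function names are hypothetical, only a plain string literal as the first argument is picked up, and method calls such as $obj->t('...') would match as well.

```php
<?php
// Strip the quotes from a string-literal token and undo its escapes.
function literal_value(string $literal): string
{
    $inner = substr($literal, 1, -1);
    if ($literal[0] === "'") {
        // Single-quoted strings only escape \\ and \'.
        return strtr($inner, ['\\\\' => '\\', "\\'" => "'"]);
    }
    // Simplification: stripcslashes() covers the common double-quote escapes.
    return stripcslashes($inner);
}

// Collect the first argument of every t('...') call found in $source.
function extract_t_strings(string $source): array
{
    $tokens = token_get_all($source);
    $count = count($tokens);
    $found = [];
    // Advance past whitespace tokens and return the next index.
    $skip = function (int $i) use ($tokens, $count): int {
        while ($i < $count && is_array($tokens[$i]) && $tokens[$i][0] === T_WHITESPACE) {
            $i++;
        }
        return $i;
    };
    for ($i = 0; $i < $count; $i++) {
        $tok = $tokens[$i];
        // A call to t() is a T_STRING token with the text "t" ...
        if (!is_array($tok) || $tok[0] !== T_STRING || $tok[1] !== 't') {
            continue;
        }
        // ... followed by "(" and then a plain string literal.
        $j = $skip($i + 1);
        if ($j >= $count || $tokens[$j] !== '(') {
            continue;
        }
        $j = $skip($j + 1);
        if ($j < $count && is_array($tokens[$j]) && $tokens[$j][0] === T_CONSTANT_ENCAPSED_STRING) {
            $found[] = literal_value($tokens[$j][1]);
        }
    }
    return array_values(array_unique($found));
}

// Render the translation file body for one language.
function render_translation_file(string $lang, array $strings): string
{
    $out = "<?php\n/* Translation file \"$lang.php\" */\n";
    foreach ($strings as $s) {
        $out .= '$strings[' . var_export($lang, true) . '][' . var_export($s, true) . "] = '';\n";
    }
    return $out;
}

// Demo source taken from the calls shown in the question.
$demo = <<<'SRC'
<?php
echo '<p>' . t('Hello world!') . '</p>';
$hello_String = t("Hello #name!", array('#name'=>$username));
SRC;

echo render_translation_file('fr', extract_t_strings($demo));
```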

No need to reinvent the wheel
Drupal uses the same t() function for its localization and the potx module is your friend.
If you don't already have, or want to install, a drupal instance you can look at the potx.inc file and reuse it in your script.
Here is the complete API documentation for the translation template extractor.

Try this script http://pastie.org/4568713
Usage:
php script.php ./proj-directory lang1 lang2 lang3
This creates lang1.php, lang2.php, lang3.php files in ./lang directory

You need two functions:
1- scan directories for php files. like this
2- match your t function, grep string and generate the language file. like
function genLang($file) {
$content = file_get_contents($file);
preg_match(...);
foreach(...){
echo(...);
}
}
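The two functions outlined above could be filled in roughly as follows. This is a sketch under assumptions: the function names and the regular expression are mine, the pattern only catches a quoted literal as the first argument, and escaped quotes stay escaped in the captured text.

```php
<?php
// 1- Collect every .php file under $dir, recursively.
function find_php_files(string $dir): array
{
    $files = [];
    $it = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($it as $file) {
        if ($file->isFile() && strtolower($file->getExtension()) === 'php') {
            $files[] = $file->getPathname();
        }
    }
    sort($files);
    return $files;
}

// 2- Pull the first string argument out of each t('...') or t("...") call.
function gen_lang_strings(string $content): array
{
    // Group 1 is the opening quote, group 2 the text up to the matching quote.
    $pattern = '/\bt\(\s*(["\'])((?:\\\\.|(?!\1).)*)\1/s';
    preg_match_all($pattern, $content, $m);
    return array_values(array_unique($m[2]));
}

// Gather the strings of a whole project directory.
function collect_project_strings(string $dir): array
{
    $all = [];
    foreach (find_php_files($dir) as $file) {
        $all = array_merge($all, gen_lang_strings(file_get_contents($file)));
    }
    return array_values(array_unique($all));
}
```

A regular expression is quicker to write than a tokenizer pass, but it can be fooled by t( appearing inside comments or strings.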

Yii framework also uses same functionality,
see their MessageCommand class
https://github.com/yiisoft/yii/blob/master/framework/cli/commands/MessageCommand.php#L125

What you need is a (very simple) "template system", but there are two instances of templating in your problem:
1- Transform "Hello $X!" into "Hello Jonh!" or "Hello Maria!" by setting $X. (PHP does this for you in string declarations.)
2- Select the appropriate template: "Hello $X!" for English, "¡Hola $X!" for Spanish.
Item 1 is the simpler one, but the algorithm applies them in the order 2, 1 (item 2, then item 1).
For this simple task you do not need a regular expression (that would reinvent PHP's "string with placeholder").
Illustrating
For item 1, the simplest way is to declare a specialized function to say "Hello":
// for any PHP version.
function template1($name) { return "<p>Hello $name!</p>";}
print template1("Maria");
For item 2 you need a generalization, which PHP also provides, through a closure:
header('Content-Type: text/html; charset=utf-8'); // only as a reminder to use UTF-8.
// for PHP 5.3+.
function generalTemplate1($K) {
    // $K was a literal constant, now it is customizable content.
    return function($name) use ($K) {return "<p>$K $name!</p>"; };
}
// Configuring template1 (T1) for each language:
$T1_en = generalTemplate1('Hello'); // english template
$T1_es = generalTemplate1('¡Hola'); // spanish template
// using the T1 multilingual
print $T1_en('Jonh'); // Hello Jonh!
print $T1_es('Maria'); // ¡Hola Maria!
For more templates, use generalTemplate2(), generalTemplate3(), etc., and $T2_en, $T2_es, $T2_fr, $T3_en, $T3_es, etc.
Solution
Now, for practical use, you will want arrays. That raises a data-structure problem and adds one more level of generalization. The cost is a variable-name parser for the placeholders; I used a simple regular expression with preg_replace_callback().
function expandMultilangTemplate($T,$K,$lang,$X) {
    // string $T is a template, an HTML structure with {#K} and #name placeholders.
    // array $K holds the language-specific constants for the template.
    // string $lang is the language, a standard 2-letter code: "en", "fr", etc.
    // array $X is a set of name-value pairs (compatible with the $T placeholders).
    // Parsing steps:
    $T = str_replace('{#K}',$K[$lang],$T); // STEP-1: expand K into T with lang.
    // STEP-2: expand X into T
    // Note: create_function() was removed in PHP 8.0; use the PHP 5.3+ version below instead.
    global $_expMultTpl_X; // needs to be global for old PHP versions
    $_expMultTpl_X = $X;
    $T = preg_replace_callback(
        '/#([a-z]+)/',
        create_function(
            '$m',
            'global $_expMultTpl_X;
            return array_key_exists($m[1],$_expMultTpl_X)?
                $_expMultTpl_X[$m[1]]:
                "";
            '
        ),
        $T
    );
    return $T;
}
// CONFIGURING YOUR TEMPLATE AND LANGUAGES:
$T = "<p>{#K} #name#surname!</p>";
$K = array('en'=>'Hello','es'=>'¡Hola');
// take care with things like "!", that is generic, and "¡" that is not.
// USING!
print expandMultilangTemplate(
$T, $K, 'en', array('name'=>'Jonh', 'surname'=>' Smith') );
print expandMultilangTemplate($T, $K, 'es', array('name'=>'Maria'));
I tested this script with PHP 5, but it also runs on older versions (PHP 4.0.7+).
About "multilingual files": if your translations are in files, you can use something like
$K = getTranslation('translationFile.txt');
function getTranslation($file,$sep='|') {
    $K = array();
    // Drop line endings and blank lines so the values stay clean
    foreach (file($file, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES) as $line) {
        list($lang,$words) = explode($sep,$line,2);
        $K[$lang]=$words;
    }
    return $K;
}
and a file as
en|Hello
es|¡Hola
Simplest with PHP 5.3
If you are using PHP 5.3+, there is a simple and elegant way to express this "simplest multilingual template system":
function expandMultilangTemplate($T,$K,$lang,$X) {
    $T = str_replace('{#K}',$K[$lang],$T);
    $T = preg_replace_callback(
        '/#([a-z]+)/',
        // $X comes in through use(); it must not also be declared as a parameter
        function($m) use ($X) {
            return array_key_exists($m[1],$X)? $X[$m[1]]: '';
        },
        $T
    );
    return $T;
}

Symfony and Zend Lucene Error

I use symfony with Zend Lucene Search. I have
$query = Zend_Search_Lucene_Search_QueryParser::parse($query.'*');
$hits = self::getLuceneIndex()->find($query);
Sometimes I get this error:
At least 3 non-wildcard characters are required at the beginning of pattern.
When I do it the way the documentation shows:
$pattern = new Zend_Search_Lucene_Index_Term($query.'*');
$query = new Zend_Search_Lucene_Search_Query_Wildcard($pattern);
$hits = self::getLuceneIndex()->find($query);
It finds nothing.

I do not know whether this is the right approach, but it works for me.
The query fails in my case because it has fewer than 3 characters or contains special characters, so in my search action I strip those characters and check the length first:
public function executeAds(sfWebRequest $request)
{
    if (!$query = $request->getParameter('query'))
    {
        return $this->forward('search', 'adssearch');
    }
    $query = str_replace(" ", "", $query);
    $query = preg_replace("/[^A-Za-z0-9]/","",$query);
    if (strlen(trim($query))<3)
    {
        $this->redirect('search/notice');
    }
    $this->ads = Doctrine_Core::getTable('Ads')->getAdsLuceneQuery($query);
}
I do not use
$pattern = new Zend_Search_Lucene_Index_Term($query.'*');
$query = new Zend_Search_Lucene_Search_Query_Wildcard($pattern);
$hits = self::getLuceneIndex()->find($query);
because it does not work for me.

Taken directly from the Zend reference documentation: you can use Zend_Search_Lucene_Search_Query_Wildcard::getMinPrefixLength() to query the minimum required prefix length and Zend_Search_Lucene_Search_Query_Wildcard::setMinPrefixLength() to set it.
So my suggestion would be either of two things:
1- Set the minimum prefix length to 0 using Zend_Search_Lucene_Search_Query_Wildcard::setMinPrefixLength(0). With that, your original code snippet should work fine (it did for my Zend Lucene implementation).
2- As you yourself suggested, validate all search queries, using JavaScript or otherwise, to ensure there are at least Zend_Search_Lucene_Search_Query_Wildcard::getMinPrefixLength() characters before any wildcard. (I recommend querying that value instead of assuming the default of 3, so the validation stays flexible.)
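The second option could be sketched on the server side like this. Both function names are hypothetical, and $minPrefix stands in for whatever Zend_Search_Lucene_Search_Query_Wildcard::getMinPrefixLength() reports in your setup; the sketch itself does not call Zend.

```php
<?php
// Count the characters that precede the first wildcard ("*" or "?") in a term.
function prefix_length(string $term): int
{
    return strcspn($term, '*?');
}

// True when the term has at least $minPrefix characters before any wildcard.
// Assumption: $minPrefix would normally come from getMinPrefixLength().
function is_valid_wildcard_term(string $term, int $minPrefix = 3): bool
{
    return prefix_length($term) >= $minPrefix;
}

var_dump(is_valid_wildcard_term('ab*'));    // bool(false)
var_dump(is_valid_wildcard_term('abc*'));   // bool(true)
var_dump(is_valid_wildcard_term('ab*', 0)); // bool(true)
```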

Good way to create script supporting translations?

I'm creating an open-source CMS and was wondering what the best way to add localizations is. I have already decided to keep them in files named like lang.en.php. I assume I would use arrays, but in which form?
$lang['xyz'] = "Text goes here!";
$lang['Text goes here!'] = "Translated text!";
Or should I create my custom parser and add localizations to a file, like this:
"Text goes here!" = "Translated text!";
And then just parse it.
What would you suggest? I tried to search but no results for me.
Martti Laine

I know the Gettext library for Desktop applications does something similar to your custom parser. Gettext has a module in PHP, but I'm not sure if it's installed in most PHP installations by default.
Basically, you would wrap every string in a function call such as tr("How are you?"). Then create a function to translate it:
include('lang.es.php');
function tr($txt) {
    global $tr;
    if(array_key_exists($txt,$tr)) {
        // $tr is an array, so index it with [] rather than calling it
        return $tr[$txt];
    }
    return $txt;
}
And in lang.es.php, have:
$tr = array();
$tr["How are you?"] = "¿Como Estas?";
You would probably want to do printf(tr("How are you, %s?"), $name); for variables, or proper nouns that should not be translated.

I think you should do it the Joomla way. Language files use the .ini format:
FOO=translation
BAR=translation2
then you parse the file with parse_ini_file function and get the translation array:
$dictionary=parse_ini_file("english.ini");
function translate($text)
{
    global $dictionary;
    if(isset($dictionary[strtoupper($text)])) return $dictionary[strtoupper($text)];
    else return $text;
}

It's not as simple as you think. Do you really need hundreds of rows in an array in order to translate "I deleted 45 comments" or "I deleted 192 comments", etc.?
It would be very helpful if you could call a translate function like this: translate('I deleted %d comments', $number);
<?php
$dict = parse_ini_file('lang.ini');
function translate($text){
    global $dict;
    $args = func_get_args();
    if(isset($dict[$text])){
        // I am not sure how to convert %d in $args[.], maybe someone else could provide a regular expression for this.
    } else {
        return $text;
    }
}
?>
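One way to fill in the missing %d step without a regular expression is vsprintf(), which substitutes printf-style placeholders from an array. The sketch below is an assumption about how that could look, not the answer author's code: make_translator() is a hypothetical name, and the inline array stands in for the result of parse_ini_file('lang.ini').

```php
<?php
// Build a translate function bound to one dictionary (avoids a global).
function make_translator(array $dict): callable
{
    return function (string $text, ...$args) use ($dict): string {
        // Fall back to the original text when there is no translation.
        $template = isset($dict[$text]) ? $dict[$text] : $text;
        // vsprintf() fills %d, %s, ... from the remaining arguments.
        return $args ? vsprintf($template, $args) : $template;
    };
}

// Hypothetical entry standing in for the parsed lang.ini file.
$translate = make_translator([
    'I deleted %d comments' => "J'ai supprimé %d commentaires",
]);

echo $translate('I deleted %d comments', 45), "\n"; // J'ai supprimé 45 commentaires
```

This handles the number substitution only; choosing between singular and plural wording is a separate problem.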

How will you manage plural forms?
Some languages have very tricky plural rules. An example:
In Polish we use e.g. plik (file) this way:
1 plik
2,3,4 pliki
5-21 plików
22-24 pliki
25-31 plików
For this reason, I suggest you use gettext, because everything has already been done for you.
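To show what such a rule looks like in code, here is a sketch of the Polish case from the table above. The function names are hypothetical; the condition mirrors the kind of plural formula gettext catalogs declare for Polish, which gettext evaluates for you.

```php
<?php
// Index of the Polish plural form: 0 = "plik", 1 = "pliki", 2 = "plików".
function polish_plural_index(int $n): int
{
    if ($n === 1) {
        return 0;
    }
    $mod10 = $n % 10;
    $mod100 = $n % 100;
    // 2-4, 22-24, 32-34, ... but not 12-14.
    if ($mod10 >= 2 && $mod10 <= 4 && ($mod100 < 10 || $mod100 >= 20)) {
        return 1;
    }
    return 2;
}

// Format a file count with the matching word form.
function format_files(int $n): string
{
    $forms = ['plik', 'pliki', 'plików'];
    return $n . ' ' . $forms[polish_plural_index($n)];
}

foreach ([1, 3, 5, 21, 22, 25] as $n) {
    echo format_files($n), "\n";
}
```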
