Sphinx randomly fails to combine subqueries

Sphinx randomly fails to combine subqueries - php

I have this sphinx search engine that I use through Zend using sphinxapi.php . It works fantastic. Really really great.
However, there is one problem: It randomly fails.
// Prepare Sphinx search
$sphinxClient = new SphinxClient();
$sphinxClient->SetServer($ip, $port);
$sphinxClient->SetConnectTimeout( 10 );
$sphinxClient->SetMatchMode( SPH_MATCH_ANY );
$sphinxClient->SetLimits( $resultsPerPage * ($nPage - 1), $resultsPerPage );
$sphinxClient->SetArrayResult( true );
$query = array();
$query['lang'] = '#lang "lang' . $language . '"';
if (isset($params)) {
foreach ($params as $param) {
$query['tags'] = '#tags "' . $param . '"';
}
}
// Make the Sphinx search
$sphinxClient->SetMatchMode(SPH_MATCH_EXTENDED);
$sphinxResult = $sphinxClient->Query(implode(' ', $query), '*');
As seen here, I search for a language and an arbitrary amount of tags, imploded into a single query string in the end (instead of making a battleload of subqueries).
So, normally, this works like a charm, but occassionally sphinx returns that it found 2000 entries in English and, say, 1000 entries with the tag "pictures" (or some other purely english word) but ZERO hits that fit both results, which is purely false. In fact, refreshing the page everything returns to normal with something like 800 real results.
My question is why does it do this and how do I make it stop?
Any ideas?
:Edit: Added shortened output log
[error] =>
[warning] =>
...
[total] => 0
[total_found] => 0
[time] => 0.000
[words] => Array (
[langen] => Array (
[docs] => 2700
[hits] => 2701 )
[picture] => Array (
[docs] => 829
[hits] => 1571 ) ) )

have you checked to see if the sphinx client is giving you any error or warning messages that may describe the failure?
if($sphinxResult === false) {
print "Query failed: " . $sphinxClient->GetLastError();
} else {
if($sphinxClient->GetLastWarning()) {
print "WARNING: " . $sphinxClient->GetLastWarning();
}
// process results
}

This issue has been solved completely a few months after the original post. The issue is that our service providers in the umbrella corporation har mistakenly assigned the wrong root values to the sphinx commands. The problem above was actually running on Sphinx 0.9.8 and was obviously buggy. My advice, if you ever experience similar problems is to double-tripple-check the version you use both to index and to query.
It feels like one of those times your math calculation doesn't roll out because you forgot a minus on the first row. Thanks to everyone that have tried to help in this and related threads.

Related

PHP write to .ini file

Everyone, hello!
I'm currently trying to write to an .ini file from PHP, and I'm using Teoman Soygul's answer and code from here: How to read and write to an ini file with PHP
This works out great, although, when I save the data to it, it shows up strange in my .ini:
[Server] = ""
p_ip = "192.168.10.100"
p_port = 80
p_password = 1234
[Variable] = ""
string1_find = "Caution"
Most notably it also seems to see attempt to give the categories Server and Variable an empty value. Also, sometimes it saves the variable between consistency and sometimes not. How come there is no consistency here?
The code I'm using to find/post in PHP is this:
...
$a=array("[Server]"=>'',"p_ip"=>$_POST['pip'],"p_port"=>$_POST['pport'], "p_password"=>$_POST['pass'],
"[Variable]"=>'',"string1_find"=>$_POST['string1_find'],
...
If anyone could point me into the right direction, that would really be appreciated. Thank you!

You are not using right, you should be passing a multidimentional array instead:
$data = array(
'Server' => array(
'p_ip' => '192.168.10.100',
'p_port' => 80,
'p_password' => 1234,
),
'Variable' => array(
'string1_find' => 'Caution'
)
);
//now call the ini function from Soygul's answer
write_php_ini($data, 'file.ini');
Here is my output:
[Server]
p_ip = "192.168.10.100"
p_port = 80
p_password = 1234
[Variable]
string1_find = "Caution"
Notice that you need to create an extra array per new section and then you can start listing your custom definitions.

Storing an object to database and then retrieving it back again

Please excuse the long post - I've tried to be as succinct as possible, whilst giving as much detail as I can ... and I've historically been a procedural PHP programmer, now getting my feet wet with objects.
I'm working on a test project where I'm using a 3rd party class (loaded via phar) to capture a stream from an online server. The online server posts events, I capture (using "capture.php") each event with the 3rd party class, serialize the data and then store it to a MySQL database as a BLOB field.
//capture.php
// include phar file
require_once 'PAMI-1.72.phar';
// set include path to have pami's phar first
ini_set('include_path', implode(PATH_SEPARATOR, array('phar://pami.phar', ini_get('include_path'))));
use PAMI\Client\Impl\ClientImpl as PamiClient;
use PAMI\Message\Event\EventMessage;
use PAMI\Listener\IEventListener;
$pamiClient = new PamiClient($pamiClientOptions);
// Open the connection
$pamiClient->open();
$pamiClient->registerEventListener(function (EventMessage $event) {
// Following line dumps the event to a text file in rawCapture.php
//var_dump($event);
$mysqli = new mysqli($server, $user, $pass, $db);
if ($mysqli->connect_error) {
echo "Failed to connect to MySQL: (" . $mysqli->connect_errno . ") " . $mysqli->connect_error;
exit;
}
$serializedData = $mysqli->real_escape_string(serialize($event));
$sql = 'insert into obj (data) values ("' . $serializedData . '")';
echo "\r\nSQL is $sql\r\n";
});
In a different script ("read.php" - I know - I'm VERY creative in script naming) I ensure that I have the include files to the class definition files, pick up the first entry in the database, pull the BLOB data and then unserialize it. At this point I'm testing the variable type in read.php and the variable is an object:
require_once 'PAMI-1.72.phar';
// set include path to have pami's phar first
ini_set('include_path', implode(PATH_SEPARATOR, array('phar://pami.phar', ini_get('include_path'))));
use PAMI\Client\Impl\ClientImpl as PamiClient;
use PAMI\Message\Event\EventMessage;
use PAMI\Listener\IEventListener;
/* All the MySQL stuff to get the $data from the db */
$event = unserialize($data);
echo "\r\nevent is " . gettype($event) . "\r\n";
print_r($event);
And this is the result I get:
event is object
PAMI\Message\Event\NewchannelEvent Object
(
[rawContent:protected] => Event: Newchannel
Privilege: call,all
Channel: Local/s#tc-maint-00006a43;2
ChannelState: 4
ChannelStateDesc: Ring
CallerIDNum:
CallerIDName:
AccountCode:
Exten: s
Context: tc-maint
Uniqueid: 1443368640.57856
[lines:protected] => Array
(
)
[variables:protected] => Array
(
)
[keys:protected] => Array
(
[event] => Newchannel
[privilege] => call,all
[channel] => Local/s#tc-maint-00006a43;2
[channelstate] => 4
[channelstatedesc] => Ring
[calleridnum] =>
[calleridname] =>
[accountcode] =>
[exten] => s
[context] => tc-maint
[uniqueid] => 1443368640.57856
)
[createdDate:protected] => 1443368636
)
Ok - so far so good, IMHO. This certainly appears to be an object, all the data certainly appears to have been stored correctly (rawCapture.php is another script which also polls the data and var_dumps the same events out to a text file, so that I can "compare" and ensure that everything is being recorded correctly).
However, now I want to deal with some of the object data and this is where I'm having issues. For example, I'm trying to obtain the value of the Uniqueid, but using $id = $event->Uniqueid; I get the following error PHP Notice: Undefined property: PAMI\Message\Event\NewchannelEvent::$Uniqueid in /home/dave/objRead.php on line 69
Now I'm stuck. I've tried checking that $event is in fact an object, using get_object_vars():
$objVars = get_object_vars($event);
if(is_null($objVars)){
echo "\r\nobjVars doesn't appear to be an object?!\r\n";
} else {
if(is_array($objVars)){
print_r($objVars);
} else {
echo "\r\nobjVars isn't an array!!\r\n";
}
}
and I'm getting my own error code "objVars doesn't appear to be an object?!".
Can anyone give me some advice on what on earth I'm doing wrong??
Also - I've tried to keep indentation in the code sections above but it wouldn't let me post without removing them. I've also amended my OP to show that I'm including the 3rd party code so that the class definitions are loaded, and the line that I use to unserialize the data from the MySQL table.

Have you tried changing the case of Uniqueid in $id = $event->Uniqueid; to lower case? So it would be $id = $event->uniqueid;

PHP array custom format

I am just starting on the PHP.
I am working on getting information from WordPress database.
The plugin writes data to a DB, from the Sign up form.
What I want is to get this data formatted on my own way, let's say a table, on a separate page.
So what I did already, is to connect to a DB, and print the data. Did it by this:
<?php
//connect to the database
mysql_connect ("host","user","pasw") or die ('Cannot connect to MySQL: ' . mysql_error());
mysql_select_db ("database") or die ('Cannot connect to the database: ' . mysql_error());
//query
$query = mysql_query("select id, data from wp_ninja_forms_subs") or die ('Query is invalid: ' . mysql_error());
//write the results
while ($row = mysql_fetch_array($query)) {
echo $row['id'] . " " . $row['data'] . "
";
// close the loop
}
?>
The thing is, that I get the results, which doesn't really suit me:
14 a:4:{i:0;a:2:{s:8:"field_id";i:2;s:10:"user_value";s:8:"John Doe";}i:1;a:2:{s:8:"field_id";i:4;s:10:"user_value";s:11:"+3706555213";}i:2;a:2:{s:8:"field_id";i:12;s:10:"user_value";s:9:"Company 1";}i:3;a:2:{s:8:"field_id";i:8;s:10:"user_value";a:1:{i:0;s:13:" Finansiniai ";}}} 15 a:4:{i:0;a:2:{s:8:"field_id";i:2;s:10:"user_value";s:10:"Bill Gates";}i:1;a:2:{s:8:"field_id";i:4;s:10:"user_value";s:11:"+5654412213";}i:2;a:2:{s:8:"field_id";i:12;s:10:"user_value";s:9:"Company 2";}i:3;a:2:{s:8:"field_id";i:8;s:10:"user_value";a:1:{i:0;s:13:" ?vaizd˛io ";}}} 16 a:4:{i:0;a:2:{s:8:"field_id";i:2;s:10:"user_value";s:7:"Person3";}i:1;a:2:{s:8:"field_id";i:4;s:10:"user_value";s:7:"6463213";}i:2;a:2:{s:8:"field_id";i:12;s:10:"user_value";s:9:"Company 3";}i:3;a:2:{s:8:"field_id";i:8;s:10:"user_value";a:2:{i:0;s:10:" HTML/CSS ";i:1;s:12:" Photoshop ";}}} 17 a:4:{i:0;a:2:{s:8:"field_id";i:2;s:10:"user_value";s:11:"Pretty Girl";}i:1;a:2:{s:8:"field_id";i:4;s:10:"user_value";s:9:"643122131";}i:2;a:2:{s:8:"field_id";i:12;s:10:"user_value";s:4:"Zara";}i:3;a:2:{s:8:"field_id";i:8;s:10:"user_value";a:1:{i:0;s:13:" ?vaizd˛io ";}}}
Now, what I want to see is a table:
2013.10.25 Pretty Girl 643122131 Zara Įvaizdžio
2013.10.25 Person3 6463213 Company 3 HTML/CSS , Photoshop
2013.10.25 Bill Gates +5654412213 Company 2 Įvaizdžio
2013.10.25 John Doe +3706555213 Company 1 Finansiniai
Could someone tell me, how to achieve that?
I believe, that my data output is an array, or am I wrong about it too?
If yes, maybe someone could give me an example how to format one part of that array, so I could do the rest?
Or even some hint on what to google for?
Thanks!

Use "\n" for new line character :)
Your individual values (a:4:{i:0....}) have been "serialized". This one was an array of four elements that was passed to the serialize() PHP function. The function returned it's textual representation (i.e. it has "serialized" the array). So you have the entire array saved in one cell in database as simple text.
Function unserialize() does the opposite - turns the "serialized" (text) values and returns the original PHP object (an array in this case).
You can serialize almost any PHP object as long as it doesn't have some any "resources" attached to it. But there is already a separate question for that: What could cause a failure in PHP serialize function?
As long as you serialize arrays of numbers, strings and arrays and even simple objects there is nothing to worry about.
Your first cell (no 14) when unserilalized:
array (
0 =>
array (
'field_id' => 2,
'user_value' => 'John Doe',
),
1 =>
array (
'field_id' => 4,
'user_value' => '+3706555213',
),
2 =>
array (
'field_id' => 12,
'user_value' => 'Company 1',
),
3 =>
array (
'field_id' => 8,
'user_value' =>
array (
0 => ' Finansiniai ',
),
),
)
(Using: http://www.functions-online.com/unserialize.html) Run these trough a for loop or something to get the rendering you need, that's up to you.

Synonym finder algorithm

I think example will be much better than loooong description :)
Let's assume we have an array of arrays:
("Server1", "Server_1", "Main Server", "192.168.0.3")
("Server_1", "VIP Server", "Main Server")
("Server_2", "192.168.0.4")
("192.168.0.3", "192.168.0.5")
("Server_2", "Backup")
Each line contains strings which are synonyms. And as a result of processing of this array I want to get this:
("Server1", "Server_1", "Main Server", "192.168.0.3", "VIP Server", "192.168.0.5")
("Server_2", "192.168.0.4", "Backup")
So I think I need a kind of recursive algorithm. Programming language actually doesn't matter — I need only a little help with idea in general. I'm going to use php or python.
Thank you!

This problem can be reduced to a problem in graph theory where you find all groups of connected nodes in a graph.
An efficient way to solve this problem is doing a "flood fill" algorithm, which is essentially a recursive breath first search. This wikipedia entry describes the flood fill algorithm and how it applies to solving the problem of finding connected regions of a graph.
To see how the original question can be made into a question on graphs: make each entry (e.g. "Server1", "Server_1", etc.) a node on a graph. Connect nodes with edges if and only if they are synonyms. A matrix data structure is particularly appropriate for keeping track of the edges, provided you have enough memory. Otherwise a sparse data structure like a map will work, especially since the number of synonyms will likely be limited.
Server1 is Node #0
Server_1 is Node #1
Server_2 is Node #2
Then edge[0][1] = edge[1][0] = 1, indicated that there is an edge between nodes #0 and #1 ( which means that they are synonyms ). While edge[0][2] = edge[2][0] = 0, indicating that Server1 and Server_2 are not synonyms.
Complexity Analysis
Creating this data structure is pretty efficient because a single linear pass with a lookup of the mapping of strings to node numbers is enough to crate it. If you store the mapping of strings to node numbers in a dictionary then this would be a O(n log n) step.
Doing the flood fill is O(n), you only visit each node in the graph once. So, the algorithm in all is O(n log n).

Introduce integer marking, which indicates synonym groups. On start one marks all words with different marks from 1 to N.
Then search trough your collection and if you find two words with indexes i and j are synonym, then remark all of words with marking i and j with lesser number of both. After N iteration you get all groups of synonyms.
It is some dirty and not throughly efficient solution, I believe one can get more performance with union-find structures.

Edit: This probably is NOT the most efficient way of solving your problem. If you are interested in max performance (e.g., if you have millions of values), you might be interested in writing more complex algorithm.
PHP, seems to be working (at least with data from given example):
$data = array(
array("Server1", "Server_1", "Main Server", "192.168.0.3"),
array("Server_1", "VIP Server", "Main Server"),
array("Server_2", "192.168.0.4"),
array("192.168.0.3", "192.168.0.5"),
array("Server_2", "Backup"),
);
do {
$foundSynonyms = false;
foreach ( $data as $firstKey => $firstValue ) {
foreach ( $data as $secondKey => $secondValue ) {
if ( $firstKey === $secondKey ) {
continue;
}
if ( array_intersect($firstValue, $secondValue) ) {
$data[$firstKey] = array_unique(array_merge($firstValue, $secondValue));
unset($data[$secondKey]);
$foundSynonyms = true;
break 2; // outer foreach
}
}
}
} while ( $foundSynonyms );
print_r($data);
Output:
Array
(
[0] => Array
(
[0] => Server1
[1] => Server_1
[2] => Main Server
[3] => 192.168.0.3
[4] => VIP Server
[6] => 192.168.0.5
)
[2] => Array
(
[0] => Server_2
[1] => 192.168.0.4
[3] => Backup
)
)

This would yield lower complexity then the PHP example (Python 3):
a = [set(("Server1", "Server_1", "Main Server", "192.168.0.3")),
set(("Server_1", "VIP Server", "Main Server")),
set(("Server_2", "192.168.0.4")),
set(("192.168.0.3", "192.168.0.5")),
set(("Server_2", "Backup"))]
b = {}
c = set()
for s in a:
full_s = s.copy()
for d in s:
if b.get(d):
full_s.update(b[d])
for d in full_s:
b[d] = full_s
c.add(frozenset(full_s))
for k,v in b.items():
fsv = frozenset(v)
if fsv in c:
print(list(fsv))
c.remove(fsv)

I was looking for a solution in python, so I came up with this solution. If you are willing to use python data structures like sets
you can use this solution too. "It's so simple a cave man can use it."
Simply this is the logic behind it.
foreach set_of_values in value_collection:
alreadyInSynonymSet = false
foreach synonym_set in synonym_collection:
if set_of_values in synonym_set:
alreadyInSynonymSet = true
synonym_set = synonym_set.union(set_of_values)
if not alreadyInSynonymSet:
synonym_collection.append(set(set_of_values))
vals = (
("Server1", "Server_1", "Main Server", "192.168.0.3"),
("Server_1", "VIP Server", "Main Server"),
("Server_2", "192.168.0.4"),
("192.168.0.3", "192.168.0.5"),
("Server_2", "Backup"),
)
value_sets = (set(value_tup) for value_tup in vals)
synonym_collection = []
for value_set in value_sets:
isConnected = False # If connected to a term in the graph
print(f'\nCurrent Value Set: {value_set}')
for synonyms in synonym_collection:
# IF two sets are disjoint, they don't have common elements
if not set(synonyms).isdisjoint(value_set):
isConnected = True
synonyms |= value_set # Appending elements of new value_set to synonymous set
break
# If it's not related to any other term, create a new set
if not isConnected:
print ('Value set not in graph, adding to graph...')
synonym_collection.append(value_set)
print('\nDone, Completed Graphing Synonyms')
print(synonym_collection)
This will have a result of
Current Value Set: {'Server1', 'Main Server', '192.168.0.3', 'Server_1'}
Value set not in graph, adding to graph...
Current Value Set: {'VIP Server', 'Main Server', 'Server_1'}
Current Value Set: {'192.168.0.4', 'Server_2'}
Value set not in graph, adding to graph...
Current Value Set: {'192.168.0.3', '192.168.0.5'}
Current Value Set: {'Server_2', 'Backup'}
Done, Completed Graphing Synonyms
[{'VIP Server', 'Main Server', '192.168.0.3', '192.168.0.5', 'Server1', 'Server_1'}, {'192.168.0.4', 'Server_2', 'Backup'}]

How to make a PHP XML/RPC call to UPC database

hay i am working on barcode reader project when i call upcdatabase from my php script it give me errors. i use the php example provided by www.upcdatabase.com
the code is
<?php error_reporting(E_ALL);
ini_set('display_errors', true);
require_once 'XML/RPC.php';
$rpc_key = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'; // Set your rpc_key here
$upc='0639382000393';
// Setup the URL of the XML-RPC service
$client = new XML_RPC_Client('/xmlrpc', 'http://www.upcdatabase.com');
$params = array( new XML_RPC_Value( array(
'rpc_key' => new XML_RPC_Value($rpc_key, 'string'),
'upc' => new XML_RPC_Value($upc, 'string'),
), 'struct'));
$msg = new XML_RPC_Message('lookup', $params);
$resp = $client->send($msg);
if (!$resp)
{
echo 'Communication error: ' . $client->errstr;
exit;
}
if(!$resp->faultCode())
{
$val = $resp->value();
$data = XML_RPC_decode($val);
echo "<pre>" . print_r($data, true) . "</pre>";
}else{
echo 'Fault Code: ' . $resp->faultCode() . "\n";
echo 'Fault Reason: ' . $resp->faultString() . "\n";
}
?>
when i check the $upc='0639382000393'; into upc data base view this then it works fine but i run this script into the browser then it give the following error Array
(
[status] => fail
[message] => Invalid UPC length
)

Unfortunately, their API appears rather short on documentation.
There are three types of codes the site mentions on the Item Lookup page:
13 digits for an EAN/UCC-13
12 digits for a Type A UPC code, or
8 digits for a Type-E (zero-supressed) UPC code.
Right after the page mentions those three types, it also says,
Anything other than 8 or 12 digits is not a UPC code!
The 13-digit EAN/UCC-13 is a superset of UPC. It includes valid UPCs, but it has many other values that are not valid UPCs.
From the Wikipedia article on EAN-13:
If the first digit is zero, all digits in the first group of six are encoded using the patterns used for UPC, hence a UPC barcode is also an EAN-13 barcode with the first digit set to zero.
Having said that, when I removed the leading zero from $upc, it worked as expected. Apparently the Item Lookup page has logic to remove the leading zero, while the API does not.
Array
(
[upc] => 639382000393
[pendingUpdates] => 0
[status] => success
[ean] => 0639382000393
[issuerCountryCode] => us
[found] => 1
[description] => The Teenager's Guide to the Real World by BYG Publishing
[message] => Database entry found
[size] => book
[issuerCountry] => United States
[noCacheAfterUTC] => 2011-01-22T14:46:15
[lastModifiedUTC] => 2002-08-23T23:07:36
)
Alternatively, instead of setting the upc param, you can set the original 13-digit value to the ean param and it will also work.

We Keep Coding

PHP, A popular general-purpose scripting language that is especially suited to web development.

Sphinx randomly fails to combine subqueries - php

Related

PHP write to .ini file

Storing an object to database and then retrieving it back again

PHP array custom format

Synonym finder algorithm

How to make a PHP XML/RPC call to UPC database

Categories

Resources