MongoDB MapReduce returning no data in PHP

I'm using a Mongo MapReduce to perform a word-count operation on a bunch of documents. The documents are very simple (just an ID and a hash of words):
{ "_id" : 6714078, "words" : { "my" : 1, "cat" : 1, "john" : 1, "likes" : 1, "cakes" : 1 } }
{ "_id" : 6715298, "words" : { "jeremy" : 1, "kicked" : 1, "the" : 1, "ball" : 1 } }
{ "_id" : 6717695, "words" : { "dogs" : 1, "can't" : 1, "look" : 1, "up" : 1 } }
The database is called "words" in my environment, the collections in question are named "wordsX" where X is a category number (I know, don't ask). The field in the document hash where the words are stored is also named "words". Gah.
The problem I'm having is that under certain conditions in my PHP app, the MapReduce doesn't return any data. Annoyingly, running the same commands from the Mongo shell gives perfect results. I'm trying to pin down where this bug is occurring but I'm really stumped, so hoping someone might be able to shed some light on this. The lead-up to this question does go on a bit, because the environment is a bit complicated, but please bear with me.
The commands I've tried running from the Mongo shell to replicate the PHP-based operations are as follows:
m = function () {
if (this.words) {
for (index in this.words) {
emit(index, this.words[index]);
}
}
}
r = function (key, values) {
var total = 0;
for (var i in values) {
total += values[i];
}
return total;
}
res = db.words.mapReduce(m, r, { query : { _id : { $in : [6714078,6715298,6717695] } } });
This results in a temporary collection being created containing the word count data. All OK so far.
However if I run the same commands from PHP (using the standard Mongo library), I end up with no data under certain conditions. It's a bit tricky to describe because I don't want to bore you with the details of the application/environment beyond Mongo, but basically I'm using Sphinx to filter some records, then supplying a list of content IDs to Mongo on which the MapReduce is performed. If I filter back into the data set by 2 or 3 days, I get results back from Mongo; if I don't filter, I get an empty dataset back. The PHP code to run the same operation is as follows. I've not included the Sphinx-based parts as I don't think they're relevant (just know that we get a list of IDs back) because I've tried supplying exactly the same list to Mongo on the command line and got the right results, whereas I don't from within PHP. Hope that makes sense.
The PHP code I'm using looks like this:
$objMongo = new Mongo();
$objDB = $objMongo->words;
$arrWordList = array();
$strMap = '
function() {
if (this.words) {
for (index in this.words) {
emit(index, this.words[index]);
}
}
}
';
$strReduce = '
function(key, values) {
var total = 0;
for (var i in values) {
total += values[i];
}
return total;
}
';
$objMapFunc = new MongoCode($strMap);
$objReduceFunc = new MongoCode($strReduce);
$arrQuery = array(
'_id' => array('$in' => $arrIDs) // <--- list of IDs from Sphinx
);
$arrCommand = array(
'mapreduce' => 'wordsX',
'map' => $objMapFunc,
'reduce' => $objReduceFunc,
'query' => $arrQuery
);
MongoCursor::$timeout = -1;
$arrStatsInfo = $objDB->command($arrCommand);
var_dump($arrStatsInfo);
The contents of the result-info array ($arrStatsInfo) under working and non-working conditions (the filtering as specified above) are as follows.
Working results:
array(4) {
["result"]=>
string(31) "tmp.mr.mapreduce_1279637336_227"
["timeMillis"]=>
int(171)
["counts"]=>
array(3) {
["input"]=>
int(54)
["emit"]=>
int(2517)
["output"]=>
int(1526)
}
["ok"]=>
float(1)
}
Empty results:
array(4) {
["result"]=>
string(31) "tmp.mr.mapreduce_1279637381_228"
["timeMillis"]=>
int(21)
["counts"]=>
array(3) {
["input"]=>
int(0)
["emit"]=>
int(0)
["output"]=>
int(0)
}
["ok"]=>
float(1)
}
So it looks like under the broken condition, no records even make it into the MapReduce. I've spent ages trying to work out what on earth is going on here but I've had no insights thus far. As I've said, running the same commands (as above) directly in the Mongo command line using exactly the same set of IDs returns the right results.
After all that, I guess my question is: is there anything obviously wrong with the PHP-Mongo interaction I'm doing above? Are there other steps I can take to try to debug this?
Please let me know if supplying any further information would be helpful. I appreciate this is a somewhat expansive and ill-defined question but I've tried my best to communicate the issue! Really hope someone can suggest a way out of this.
Many thanks for reading!

For future readers, this issue turned out to be the result of inconsistent handling of ints/numeric strings elsewhere in the app. Sorry about the red herring!
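A minimal sketch of the kind of normalization that avoids this class of bug, assuming the IDs arrive as a mix of integers and numeric strings (the sample list below is hypothetical). MongoDB compares types strictly, so a string "6714078" in an `$in` list does not match an integer `_id` of 6714078, which would explain an input count of 0.

```php
<?php
// Hypothetical ID list as it might arrive from another layer of the app:
// a mix of integers and numeric strings.
$arrIDs = array('6714078', 6715298, '6717695');

// Cast every ID to int so it matches the integer _id values stored in MongoDB.
// array_values() re-indexes the list so it has sequential keys 0..n-1.
$arrIDs = array_values(array_map('intval', $arrIDs));

$arrQuery = array(
    '_id' => array('$in' => $arrIDs)
);

var_dump($arrQuery);
```

Doing the cast in one place, right before the query is built, keeps the rest of the app free to pass IDs around in whatever form it receives them.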


Why does PHP Redis GET return data with colons?

The problem
I wrote a multi-threaded implementation of my code and tried to use Redis as a counter (code below). When I read the counter, I often, but not always, get a ':' (colon) in front of the value. Is it because I loop too fast for Redis to keep up?
Output result
cclilshy#192 debug % php st.php
registerRedis success!
registerSocket success!
1
2
3
1
2
string(2) ":5"
1
2
3
string(2) ":9"
3
string(1) "9"
// The lines above are the output. Why?
Code
$func = function($handle){
for($i=0;$i<3;$i++){
echo $handle->counter().PHP_EOL;
}
var_dump($handle->total());
};
//$handle->counter() :
public function counter($record = true){
if($record === false){
return $this->count;
}
$this->thread->counter();
$this->count++;
return $this->count;
}
//$handle->total() :
public function total(){
return $this->thread->counter(false);
}
//$handle->thread->counter() :
public function counter($record = true){
if($record === false){
return $this->redis->get('thread.' . $this->pids[0] . '.count');
}
return $this->redis->incr('thread.' . $this->pids[0] . '.count');
}
Redis is single-threaded by design, so it serves your threads sequentially anyway.
The response looks like this because the RESP protocol reply was not parsed, so you get the raw representation of an integer.
One reason could be that the parser is still blocked on one reply at the moment the next value is already being returned.
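To illustrate what "raw representation" means here: in RESP, an integer reply is sent on the wire as a colon, the number, and a CRLF terminator. The sketch below uses a hypothetical helper, `parseRespInteger()`, which is not part of any Redis client library; it only shows the decoding step that a client normally performs for you.

```php
<?php
// Hypothetical helper: decode a RESP integer reply such as ":5\r\n".
// Returns the integer, or null if the input is not an integer reply.
function parseRespInteger($reply) {
    // A RESP integer reply is ":" followed by an optionally signed number.
    // The trailing CRLF is made optional so an already trimmed ":5" also parses.
    if (!preg_match('/^:(-?\d+)(?:\r\n)?\z/', $reply, $m)) {
        return null;
    }
    return (int) $m[1];
}

var_dump(parseRespInteger(":5\r\n")); // int(5)
var_dump(parseRespInteger("9"));      // NULL, because there is no ":" prefix
```

Seeing ":5" in application code suggests the reply reached the caller without passing through this step, which is consistent with several processes reading from one shared connection.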
I solved this problem by using a separate Redis connection for each process.
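One way to get a connection per process is to key connections by process ID, so a forked child never reuses the socket it inherited from its parent. This is a sketch under assumptions: `connectionForCurrentProcess()` is a hypothetical helper, and the factory below returns a plain object as a stand-in so the example runs without a Redis server.

```php
<?php
// Hypothetical helper: return one connection per process ID.
// After a fork, the child has a different PID, so it gets a fresh connection
// instead of sharing the parent's socket.
function connectionForCurrentProcess($factory) {
    static $connections = array();
    $pid = getmypid();
    if (!isset($connections[$pid])) {
        $connections[$pid] = call_user_func($factory);
    }
    return $connections[$pid];
}

// Stand-in factory so the sketch runs anywhere. With the phpredis extension,
// the factory would create and connect a new Redis instance instead.
$factory = function () {
    return new stdClass();
};

$first  = connectionForCurrentProcess($factory);
$second = connectionForCurrentProcess($factory);
var_dump($first === $second); // bool(true): same process, same connection
```

With phpredis, the factory body might be `$r = new Redis(); $r->connect('127.0.0.1', 6379); return $r;`, where the host and port are assumptions for a local server.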

How to properly split data from email into an array

Preface:
My overall goal is to create a pie chart showing the data below. The pie chart would have slices depicted as 1-8, with matching percentages based on the numerical value to the right of the colon.
I have the following data that is sent automatically in an email:
1:64.00
2:63.07
3:62.78
4:61.87
5:47.47
6:43.97
7:36.99
8:19.85
Sent from: [email redacted]
Parameters:3000,0
Time Server:2018.11.05 08:21:53
Time Local: 2018.11.04 22:21:53
There are always this many lines in the email.
What I am trying to do is splice out the lines for data 1-8, which I have successfully done with this portion of the code:
if (strpos($row['subject'], 'Currency Relative Strength') !== false) {
$a1 = preg_split('/\r\n|\r|\n/', $row['body']);
$a = array("p0" => $a1[0], "p1" => $a1[1], "p2" => $a1[2], "p3" => $a1[3], "p4" => $a1[4], "p5" => $a1[5], "p6" => $a1[6], "p7" => $a1[7]);
$array[] = $a;
}
It looks like this:
[{"p0":"1:64.00","p1":"2:63.07","p2":"3:62.78","p3":"4:61.87","p4":"5:47.47","p5":"6:43.97","p6":"7:36.99","p7":"8:19.85"}]
The issue I am running into is that I am trying to follow the documentation here:
https://canvasjs.com/docs/charts/integration/jquery/chart-types/jquery-pie-chart/
Which for the datapoints portion requires an array that is completely different from the one I managed to put together.
This either leaves me needing to completely restructure the array, OR use a different pie chart system, which I am not against if anybody has any suggestions. I also understand that if I choose to go the route of canvasjs then to get an automatically updating pie chart with data that updates every minute, I have to implement something more like this:
CanvasJS: Making a chart dynamic with data.php, json encode and ajax(bandwidth meters)
To anybody willing to provide either assistance with the current code to better fit canvas js, OR suggest a whole different pie chart system that might work better with the array I have managed to build, I appreciate you very much! Btw I am not married to the idea of a jquery pie chart, I just figured it might be a better way to go...
Given that
$message = '
...
';
and the format for plotting a jQuery pie chart
dataPoints: [
{ label: "Samsung", y: 30.3, legendText: "Samsung"},
...
]
The data points can be extracted as follows.
$lines = explode("\n", trim($message));
$firstEightLines = array_slice($lines, 0, 8);
$dataPoints = array_map(function($line) {
list($index, $point) = explode(":", $line);
return [
'label' => "p{$index}",
'legendText' => "p{$index}",
'y' =>(float)$point,
];
}, $firstEightLines);
var_dump($dataPoints);
/*
array(8) {
[0]=>
array(3) {
["label"]=>
string(2) "p1"
["legendText"]=>
string(2) "p1"
["y"]=>
float(64)
}
...*/
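The same idea can be wrapped in a function that also tolerates Windows line endings and emits JSON for the chart. This is a sketch, not the answerer's code: `toDataPoints()` is a hypothetical helper, and the `\r\n` line endings in the sample body are an assumption about how the mail arrives.

```php
<?php
// Sample body copied from the question; the \r\n line endings are an assumption.
$body = "1:64.00\r\n2:63.07\r\n3:62.78\r\n4:61.87\r\n5:47.47\r\n6:43.97\r\n7:36.99\r\n8:19.85\r\n"
      . "Parameters:3000,0\r\nTime Server:2018.11.05 08:21:53\r\n";

// Hypothetical helper: turn the first $count "index:value" lines into
// the label/y pairs that the CanvasJS dataPoints array expects.
function toDataPoints($body, $count = 8) {
    // Split on any line-ending style, as the question's own code does.
    $lines = preg_split('/\r\n|\r|\n/', trim($body));
    $points = array();
    foreach (array_slice($lines, 0, $count) as $line) {
        $parts = explode(':', $line, 2);
        // Skip anything that is not "index:number".
        if (count($parts) !== 2 || !is_numeric($parts[1])) {
            continue;
        }
        $points[] = array(
            'label' => 'p' . $parts[0],
            'y'     => (float) $parts[1],
        );
    }
    return $points;
}

$dataPoints = toDataPoints($body);
echo json_encode($dataPoints), PHP_EOL;
```

The JSON string can be returned from a small PHP endpoint and assigned to `dataPoints` on the client side.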

How to sort Japanese like Excel

I want to sort Japanese words (kanji) the way Excel's sort feature does.
I have tried many ways to sort Japanese text in PHP, but the result never matches Excel 100%.
First, I tried converting kanji to katakana using this library (https://osdn.net/projects/igo-php/), but in some cases the result differs from Excel.
I want to sort these words in ascending order:
けやきの家
高森台病院
みのりの里
My Result :
けやきの家
高森台病院
みのりの里
Excel Result:
けやきの家
みのりの里
高森台病院
Second, I tried another way, using this function:
mb_convert_kana($text, "KVc", "utf-8");
The sort result is correct for the text above, but some cases are still wrong:
米田病院
米田病院
高森台病院
My result :
米田病院
米田病院
高森台病院
Excel Result:
高森台病院
米田病院
米田病院
Do you have any ideas about this? (Sorry for my English.) Thank you.
First, Japanese kanji are not inherently sortable. You can sort by code point, but that order carries no meaning.
Using Igo (or any other morphological analysis library) sounds like a good solution, though it cannot be perfect. Your first sort result looks fine to me. Why do you want the words sorted in Excel's order?
In Excel, if a cell remembers the phonetic reading the user typed in the Japanese IME (Input Method Editor), that reading is used for sorting. Since not every cell is necessarily typed by hand through an IME, some cells may carry no information about how their kanji are read, so the results of sorting kanji in Excel can be quite unpredictable. (If sorting really matters, we usually add a separate yomigana field, in either hiragana or katakana, and sort by that column.)
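A minimal sketch of the yomigana approach, assuming each record stores its reading in a separate field. The field names `name` and `yomi` and the readings themselves are assumptions made for illustration; the real readings must come from whoever maintains the data.

```php
<?php
// Each record carries a hiragana reading next to the kanji name.
// The readings below are assumed for illustration only.
$facilities = array(
    array('name' => 'みのりの里', 'yomi' => 'みのりのさと'),
    array('name' => '高森台病院', 'yomi' => 'たかもりだいびょういん'),
    array('name' => 'けやきの家', 'yomi' => 'けやきのいえ'),
);

// Use the intl Collator when available; otherwise fall back to a plain
// byte comparison, which follows code point order for UTF-8 hiragana.
$collator = class_exists('Collator') ? new Collator('ja_JP') : null;

usort($facilities, function ($a, $b) use ($collator) {
    if ($collator !== null) {
        return $collator->compare($a['yomi'], $b['yomi']);
    }
    return strcmp($a['yomi'], $b['yomi']);
});

foreach ($facilities as $facility) {
    echo $facility['name'], PHP_EOL;
}
```

Because the sort key is the reading rather than the kanji, the order no longer depends on how a morphological analyzer or an IME guessed the pronunciation.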
The second method, mb_convert_kana(), is beside the point. That function normalizes hiragana/katakana, since for historical reasons there are two sets of letters (full-width kana and half-width kana). Applying it to your Japanese text changes only the kana parts. If that produced the result you expected, it must be a coincidence.
You must first define which Excel Japanese sort order your customer requires. I will be happy to help once that is clear.
[Update]
As the OP commented, mb_convert_kana() was used to sort mixed hiragana/katakana. For that purpose, I suggest using the php_intl Collator. For example:
<?php
// demo: Japanese(kana) sort by php_intl Collator
if (version_compare(PHP_VERSION, '5.3.0', '<')) {
exit ('php_intl extension is available on PHP 5.3.0 or later.');
}
if (!class_exists('Collator')) {
exit ('You need to install php_intl extension.');
}
$collator = new Collator('ja_JP');
$textArray = [
'カキクケコ',
'日本語',
'アアト',
'Alphabet',
'アイランド',
'はひふへほ',
'あいうえお',
'漢字',
'たほいや',
'さしみじょうゆ',
'Roma',
'ラリルレロ',
'アート',
];
$result = $collator->sort($textArray);
if ($result === false) {
echo "sort failed" . PHP_EOL;
exit();
}
var_dump($textArray);
This sorts an array of mixed hiragana/katakana text. The results are:
array(13) {
[0]=>
string(8) "Alphabet"
[1]=>
string(4) "Roma"
[2]=>
string(9) "アート"
[3]=>
string(9) "アアト"
[4]=>
string(15) "あいうえお"
[5]=>
string(15) "アイランド"
[6]=>
string(15) "カキクケコ"
[7]=>
string(21) "さしみじょうゆ"
[8]=>
string(12) "たほいや"
[9]=>
string(15) "はひふへほ"
[10]=>
string(15) "ラリルレロ"
[11]=>
string(6) "漢字"
[12]=>
string(9) "日本語"
}
You won't need to normalize them yourself. Both PHP (with the php_intl extension) and databases (such as MySQL) know how to sort text in many languages, so you do not need to write that logic.
Note that this does not solve the original issue, kanji sorting.
Laravel: ordering romanized (alphabet) names in hiragana order with a custom function
Note: $modals is a collection of Laravel models (the result of get()).
$alphabets holds the romanized syllables in hiragana order.
Source: https://gist.github.com/mdzhang/899a427eb3d0181cd762
public static function orderByHiranagana ($modals,$column){
$outArray = array();
$alphabets = array("a","i","u","e","o","ka","ki","ku","ke","ko","sa","shi","su","se","so","ta","chi","tsu","te","to","na","ni","nu","ne","no","ha","hi","fu","he","ho","ma","mi","mu","me","mo","ya","yu","yo","ra","ri","ru","re","ro","wa","wo","n","ga","gi","gu","ge","go","za","ji","zu","ze","zo","da","ji","zu","de","do","ba","bi","bu","be","bo","pa","pi","pu","pe","po","(pause)","kya","kyu","kyo","sha","shu","sho","cha","chu","cho","nya","nyu","nyo","hya","hyu","hyo","mya","myu","myo","rya","ryu","ryo","gya","gyu","gyo","ja","ju","jo","bya","byu","byo","pya","pyu","pyo","yi","ye","va","vi","vu","ve","vo","vya","vyu","vyo","she","je","che","swa","swi","swu","swe","swo","sya","syu","syo","si","zwa","zwi","zwu","zwe","zwo","zya","zyu","zyo","zi","tsa","tsi","tse","tso","tha","ti","thu","tye","tho","tya","tyu","tyo","dha","di","dhu","dye","dho","dya","dyu","dyo","twa","twi","tu","twe","two","dwa","dwi","du","dwe","dwo","fa","fi","hu","fe","fo","fya","fyu","fyo","ryi","rye","(wa)","wi","(wu)","we","wo","wya","wyu","wyo","kwa","kwi","kwu","kwe","kwo","gwa","gwi","gwu","gwe","gwe","mwa","mwi","mwu","mwe","mwo");
$existIds = array();
foreach ($alphabets as $alpha){
foreach ($modals as $modal) {
if($alpha == strtolower(substr($modal->$column, 0, strlen($alpha))) && !in_array($modal->id,$existIds)) {
array_push($outArray,$modal);
array_push($existIds,$modal->id);
}
}
}
return $outArray;
}
Call it like this:
$students = Students::get();
$students = CommonHelper::orderByHiranagana($students,'lastname');

Composer Package (PHPLeague) in Codeigniter loading class

I am aiming to use the following package for my project: Color Extractor.
I have Composer set up and working in my project as per @philsturgeon's tutorial here. However, I am stuck: the function returns an empty array for an image I know is there.
I am autoloading in index.php using require FCPATH.'vendor/autoload.php';
I have tested this using Phil's example.
My code looks like this:
$client = new League\ColorExtractor\Client();
$image = $client->loadJpeg(FCPATH.'assets/images/tumblr_ma7gmzwfAq1r780z3o1_250.jpg');
// Get the most used color hex code
$palette = $image->extract();
// Get three most used color hex code
$palette = $image->extract(3);
// Change the Minimum Color Ratio (0 - 1)
// Default: 0
$image->setMinColorRatio(1);
$palette = $image->extract();
var_dump($palette);
Can anyone tell me what I am doing wrong here? I don't have any errors in my log, and I get this output:
array(0) { }
I was being utterly stupid; apologies to anyone who viewed this question. I was trying to declare multiple variables with the same name and call several variants of the function.
My final code looks like this:
$client = new League\ColorExtractor\Client();
$image = $client->loadJpeg(FCPATH.'assets/images/tumblr_ma7gmzwfAq1r780z3o1_250.jpg');
$palette = $image->extract(3);
var_dump($palette);
This returns the expected array:
array(3) { [0]=> string(7) "#E09D73" [1]=> string(7) "#AC4C34" [2]=>
string(7) "#EEDF6C" }

PHP PDO strange behavior: getting empty strings in result array ~ 1 out of 10 times

About 1 out of 10 times, MySQL via PDO replies with the right number of array elements, but all of them are empty.
The other ~9 times I get the correct result for the same query.
(The PHP version I am forced to use is 5.3.1; the MySQL version is 5.1.41.)
My PHP function:
function getClientsByLike($dbh, $term)
{
$k=$dbh->prepare("
SELECT code, text FROM my_table WHERE text LIKE :search_term
ORDER BY text ASC
LIMIT 500
");
$k->bindValue(":search_term", "%".$term."%"); // bindValue: bindParam needs a variable passed by reference, not an expression
$k->execute();
while($obj = $k->fetch())
{
$result.= $obj['code'].' '.$obj['text'].'<br />';
}
return $result;
}
$dbh = new PDO('mysql:host=localhost;dbname=my_db', 'my_user', 'my_password',
array(
PDO::ATTR_PERSISTENT => true,
PDO::ATTR_ERRMODE => PDO::ERRMODE_WARNING,
PDO::MYSQL_ATTR_INIT_COMMAND => 'SET NAMES utf8',
PDO::ATTR_DEFAULT_FETCH_MODE => PDO::FETCH_ASSOC
)
);
echo getClientsByLike($dbh, "term");
When it works, I get something like this as a result:
434 textblabla<br />
23 moretext<br />
95 evenmoretext<br />
When it doesn't work, I get the correct number of results, but all the returned strings are empty:
<br />
<br />
<br />
Sometimes the behavior alternates with every execution of the script: working, not working, working, not working, and so on.
I've been working with PDO for a while now (about 3 months), but this is the first time I've run into strange behavior like this.
Any suggestions are much appreciated.
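I am not sure what error you are getting. Try wrapping your prepare and execute calls in a try/catch for PDOException; that might reveal the reason. Note that prepare() and execute() only throw PDOException when PDO::ATTR_ERRMODE is set to PDO::ERRMODE_EXCEPTION; with PDO::ERRMODE_WARNING, as in your connection options, errors are raised as warnings instead.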
try{
$k=$dbh->prepare("
SELECT code, text FROM my_table WHERE text LIKE :search_term
ORDER BY text ASC
LIMIT 500
");
$k->bindValue(":search_term", "%".$term."%"); // bindValue: bindParam needs a variable passed by reference, not an expression
$k->execute();
}catch(PDOException $e){
echo "<p style='color:red;'>{$e->getMessage()}</p>";
}
After further debugging my script with this method:
$clients = $k->fetchAll();
ob_start();
var_dump($clients);
$debug = ob_get_clean();
file_put_contents("result.txt", $debug."\r\n", FILE_APPEND);
Note: I switched to fetchAll() here.
I have found the cause of the strange behavior, but still not exactly why it occurs.
I took a look at the log file I created for the result arrays PDO returns.
A result array that works for me:
array(1) {
[0]=>
array(2) {
["code"]=>
string(5) "31081"
["text"]=>
string(28) "some text here"
}
}
A result array that produced the strange behavior:
array(1) {
[0]=>
array(2) {
["my_table.code"]=>
string(5) "31081"
["my_table.text"]=>
string(28) "some text here"
}
}
So PDO suddenly replies with "table_name.field_name" keys instead of just "field_name" as expected. I don't change any PDO settings anywhere; they stay the same all the time.
This behavior of PDO, sometimes replying with the table name and sometimes without, seems quite random and looks like a serious bug to me (at least in PHP 5.3.1).
Explicitly setting PDO to never fetch table names (in this script) should fix this for me:
$dbh->setAttribute(PDO::ATTR_FETCH_TABLE_NAMES, false);
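As an extra safety net, the row keys can also be normalized after fetching, so the code works whether or not the table prefix is present. This is a sketch with a hypothetical helper, `stripTablePrefixes()`; it is written without newer syntax so it should also run on PHP 5.3.

```php
<?php
// Hypothetical helper: turn keys like "my_table.code" into "code".
// Keys without a dot are left unchanged.
function stripTablePrefixes(array $row) {
    $clean = array();
    foreach ($row as $key => $value) {
        $pos = is_string($key) ? strrpos($key, '.') : false;
        if ($pos === false) {
            $clean[$key] = $value;
        } else {
            $clean[substr($key, $pos + 1)] = $value;
        }
    }
    return $clean;
}

// Row shaped like the problematic result from the log file.
$row = array('my_table.code' => '31081', 'my_table.text' => 'some text here');
var_dump(stripTablePrefixes($row));
```

Applied inside the fetch loop, this makes `$obj['code']` and `$obj['text']` resolve in both cases. One caveat: if a query selects same-named columns from two tables, stripping the prefix makes the later column overwrite the earlier one.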
