MongoDB + PHP: How do you query the size of a document stored in a MongoDB collection? The document size limit, at the time of this writing (recently raised from 4MB), is 16MB.
How can I query the size of a Document using the ObjectId?
Object.bsonsize(document) is the function you're after. Reading through this thread, there are a few suggestions for doing it in PHP. The easiest is probably to query for the document and run strlen on the bson_encode'd object, giving you the size in bytes:
$byteSize = strlen( bson_encode( $yourDocument ) );
If you want to run this check on a lot of documents, say in a loop, without reading back every document, you'll need to execute a server-side command (Mardix posted this little function to do just that; note that it could easily be rewritten to avoid the hardcoded and global variables):
$DBName = "MyDBName";
$MongoDB = new MongoDB(new Mongo(), $DBName);

function documentSize(array $Criteria) {
    global $MongoDB;
    $collectionName = "MyCollection";
    // Build a server-side JS function that looks up the document and
    // measures its BSON size on the server, without transferring it back
    $jsonCriteria = json_encode($Criteria);
    $code = "function() {
        return Object.bsonsize(db.{$collectionName}.findOne({$jsonCriteria}));
    }";
    $resp = $MongoDB->execute($code);
    return $resp["retval"];
}
PHP example, where $id is the document id:
$myDocSize = documentSize(array("_id"=>$id));
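As the note above suggests, here's a sketch of the same helper rewritten with the connection and collection passed in instead of globals and hardcoded names (still assuming the legacy driver's MongoDB::execute()):
function documentSize(MongoDB $db, $collectionName, array $criteria) {
    // Measure the matching document's BSON size on the server side
    $jsonCriteria = json_encode($criteria);
    $code = "function() {
        return Object.bsonsize(db.{$collectionName}.findOne({$jsonCriteria}));
    }";
    $resp = $db->execute($code);
    return $resp["retval"];
}
$myDocSize = documentSize(new MongoDB(new Mongo(), "MyDBName"), "MyCollection", array("_id" => $id));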
I have some code running on my website that uses API calls to pull events from a calendar and show them on my website. The code is fairly simple overall and works well. However, to prevent the code from running every time the page loads, I'm using PHP Redis to save the key data to a Redis list, then a cronjob runs the PHP code that uses the Redis list to fetch the information from the calendar API and saves it back to Redis.
Everything works fine, except that I'm using foreach to run through each entry of the Redis list, and while it iterates over all the entries, it saves only the last one. What can I do to fix this?
My code:
<?php
function redis_calendar_fetch() {
    $redisObj1 = new Redis();
    $redisObj1->connect('localhost', 6379, 2.5, NULL, 150);
    date_default_timezone_set('America/Edmonton');

    $fetcharray = $redisObj1->smembers('fetch_list');
    $effectiveDate = date('Y-m-d', strtotime('+12 months'));
    $application_id = "REDACTED";
    $secret = "REDACTED";

    // Decode each Redis list entry into an associative array
    $fafull = array();
    foreach ($fetcharray as $faraw) {
        $fa1 = json_decode($faraw);
        $fa2 = json_decode(json_encode($fa1), true);
        $fafull[] = $fa2;
    }
    $redisObj1->close(); // This all works perfectly and returns the Redis list entries as an array usable by foreach

    foreach ($fafull as $fa) {
        $redisObj = new Redis();
        $redisObj->connect('localhost', 6379, 2.5, NULL, 150);
        // After this, I run through all the array data, pull data & process it properly.
        // I have omitted this from the question because it is long and arduous, and runs perfectly fine.
        // Right before this, an array called $redisarray is created that contains all the relevant event data
        $redisarrayfixed = json_encode($redisarray);
        $redisObj->set($key, $redisarrayfixed);
        $redisObj->close();
        // If I put a line here saying 'return $redisarrayfixed;', the code runs only the first
        // instance of the array and stops. If I omit it, it runs through all of them but only saves the last one
    }
}
redis_calendar_fetch();
As mentioned, I then use a cronjob to run this code every 30 minutes, and I have a separate piece of PHP code that handles the shortcode and fetches the proper saved events for the proper page.
My concern is solely with the foreach($fafull as $fa) loop, which only saves the final result to Redis. Is there a better way to force each array instance to save?
For performance, you might want to keep just one Redis connection active rather than reconnecting on every iteration.
Secondly, it only saves the final result because on each iteration you are using the same $key, so every set() overwrites the previous one. It seems like what you want to do is push to an array as you iterate, then save the whole thing at the end.
Example of how I'm understanding this:
$redisObj = new Redis();
$redisObj->connect('localhost', 6379, 2.5, NULL, 150);

$someArray = array();
foreach ($fafull as $fa) {
    // ... build $redisarray from $fa as before ...
    array_push($someArray, json_encode($redisarray));
}

// Redis values must be strings, so encode the collected array before saving it under a single key
$redisObj->set($key, json_encode($someArray));
$redisObj->close();
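If you instead want each event saved as its own Redis entry, a minimal sketch (assuming each $redisarray carries some hypothetical unique identifier such as $redisarray['event_id']) would vary the key per iteration:
$redisObj = new Redis();
$redisObj->connect('localhost', 6379, 2.5, NULL, 150);
foreach ($fafull as $fa) {
    // ... build $redisarray from $fa as before ...
    // A distinct key per event prevents each set() from overwriting the last
    $redisObj->set('event:' . $redisarray['event_id'], json_encode($redisarray));
}
$redisObj->close();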
I'm creating Solr documents via the Solarium plugin in PHP. But every field in the document is stored as the text_general data type except the id field; text_general is the default field type in Solr. My question is: why is only the id field stored as a string type by default? And is it possible to add documents with string-typed fields using the Solarium plugin?
My code is here:
public function updateQuery() {
    $update = $this->client2->createUpdate();

    // create a new document for the data
    $doc1 = $update->createDocument();
    // $doc1->id = 123;
    $doc1->name = 'value123';
    $doc1->price = 364;

    // and a second one
    $doc2 = $update->createDocument();
    // $doc2->id = 124;
    $doc2->name = 'value124';
    $doc2->price = 340;

    // add the documents and a commit command to the update query
    $update->addDocuments(array($doc1, $doc2));
    $update->addCommit();

    // this executes the query and returns the result
    $result = $this->client2->update($update);

    echo '<b>Update query executed</b><br/>';
    echo 'Query status: ' . $result->getStatus() . '<br/>';
    echo 'Query time: ' . $result->getQueryTime();
}
The resulting documents for the above code are:
{
  "responseHeader":{
    "status":0,
    "QTime":2,
    "params":{
      "q":"*:*",
      "_":"1562736411330"}},
  "response":{"numFound":2,"start":0,"docs":[
      {
        "name":["value123"],
        "price":[364],
        "id":"873dfec0-4f9b-4d16-9579-a4d5be8fee85",
        "_version_":1638647891775979520},
      {
        "name":["value124"],
        "price":[340],
        "id":"7228e92d-5ee6-4a09-bf12-78e24bdfa52a",
        "_version_":1638647892102086656}]
}}
This depends on the field type defined in the schema for your Solr installation. It does not have anything to do with how you're sending data through Solarium.
In the schemaless mode, the id field is always set as a string, since a unique field can't be tokenized (well, it can, but it'll give weird, non-obvious errors).
In your case I'd suggest defining the price field as an integer/long field (if it's integers all the way) and the name field as a string field. Be aware that string fields only generate hits on exact matches, so in your case you'd have to search for value124 with exact casing to get a hit.
You can also adjust the multiValued property of the field when you define the fields explicitly. That way you get only the string back in the JSON structure instead of an array containing the string.
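For example, a minimal sketch of defining such a field explicitly through Solr's Schema API (assuming Solr on localhost:8983 and a hypothetical core named mycore):
<?php
// Define "name" as a single-valued, stored string field via the Schema API
$payload = json_encode(array(
    'add-field' => array(
        'name'        => 'name',
        'type'        => 'string',
        'stored'      => true,
        'multiValued' => false,
    ),
));

$ch = curl_init('http://localhost:8983/solr/mycore/schema');
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $payload);
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-Type: application/json'));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
echo curl_exec($ch);
curl_close($ch);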
I am trying to implement a logging library which would fetch the current debug level from the environment the application runs in:
23 $level = $_SERVER['DEBUG_LEVEL'];
24 $handler = new StreamHandler('/var/log/php/php.log', Logger::${$level});
When I do this, the code fails with the error:
A valid variable name starts with a letter or underscore, followed by any number of letters, numbers, or underscores at line 24.
How would I use a specific Logger:: level in this way?
UPDATE:
I have tried having $level = "INFO" and changing ${$level} to $$level. Neither of these changes helped.
However, replacing line 24 with $handler = new StreamHandler('/var/log/php/php.log', Logger::INFO); makes the code compile and run as expected.
The variable itself is declared here
PHP Version => 5.6.99-hhvm
So the answer was to use a function for a constant lookup:
$handler = new StreamHandler('/var/log/php/php.log', constant("Monolog\Logger::" . $level));
<?php
class Logger {
    const MY = 1;
}

$lookingfor = 'MY';

// approach 1: reflection over the class constants
$value1 = (new ReflectionClass('Logger'))->getConstants()[$lookingfor];

// approach 2: constant() lookup by name
$value2 = constant("Logger::" . $lookingfor);

echo "$value1|$value2";
?>
Result: "1|1"
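Applied back to the original code, a minimal sketch (assuming Monolog, and that DEBUG_LEVEL holds a level name such as "INFO"; the fallback default is my own choice):
use Monolog\Handler\StreamHandler;

$level = isset($_SERVER['DEBUG_LEVEL']) ? strtoupper($_SERVER['DEBUG_LEVEL']) : 'INFO';

// defined() also works for class constants, so validate before the lookup
if (!defined("Monolog\\Logger::" . $level)) {
    $level = 'INFO'; // fall back rather than fail on an unknown level name
}

$handler = new StreamHandler('/var/log/php/php.log', constant("Monolog\\Logger::" . $level));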
I'm trying to extract data from thousands of premade SQL files. I have a script that does what I need using the mysqli driver in PHP, but it's really slow since it processes one SQL file at a time. I modified the script to create unique temp database names, and each SQL file is loaded into one of them. Data is extracted to an archive database table, then the temp database is dropped.
In an effort to speed things up, I created 4 scripts structured similar to the one below, where each for loop is stored in its own unique PHP file (the code below is only a quick demo of what's going on in the 4 separate files); they are set up to grab only 1/4 of the files from the source file folder. All of this works perfectly: the scripts run, and there is zero interference with file handling. The issue is that I get almost zero performance boost, maybe 10 seconds faster :( I quickly refreshed my phpMyAdmin database listing page and could see the 4 different databases loaded at any time, but I also noticed that it looked like it was still running more or less sequentially, as the DB names were changing on the fly.
I went the extra step of creating a unique MySQL user for each script with its own connection. No improvement. Can I get this to work with mysqli/PHP, or do I need to look into some other options? I'd prefer to do this all in PHP if I can (version 7.0). I tested by running the PHP scripts in my browser. Is that the issue? I haven't written any code to execute them on the command line and set them to the background yet. One last note: all the users in my MySQL database have no limits on connections, etc.
$numbers = array('0','1','2','3','4','5','6','7','8','9','10','11','12','13','14','15','16','17','18','19','20');
$numCount = count($numbers);

// Each script starts at a different offset and steps by 4, so together
// the four loops cover the whole set without overlapping
$a = 0;
$b = 1;
$c = 2;
$d = 3;
$rebuild = array();

echo "<br>";
for (; $a <= $numCount; $a += 4) {
    if (array_key_exists($a, $numbers)) {
        echo $numbers[$a] . "<br>";
    }
}

echo "<br>";
for (; $b <= $numCount; $b += 4) {
    if (array_key_exists($b, $numbers)) {
        echo $numbers[$b] . "<br>";
    }
}

echo "<br>";
for (; $c <= $numCount; $c += 4) {
    if (array_key_exists($c, $numbers)) {
        echo $numbers[$c] . "<br>";
    }
}

echo "<br>";
for (; $d <= $numCount; $d += 4) {
    if (array_key_exists($d, $numbers)) {
        echo $numbers[$d] . "<br>";
    }
}
Try this:
<?php
// Requires the pthreads extension (and a thread-safe PHP build)
class BackgroundTask extends Thread {
    public $output;
    protected $input;

    public function __construct($input_data) {
        $this->input = $input_data;
    }

    public function run() {
        /* Processing here, use $this->output for... well... outputting data */
        // Here you would implement your for() loops, for example, using $this->input as their data

        // Some dumb value to demonstrate (note: assign to the property,
        // not a local $output, or the result is lost when the thread ends)
        $this->output = "SOME DATA!";
    }
}

// Create instances with different input data
// Each "quarter" will be a quarter of your data, as you're trying to do right now
$job1 = new BackgroundTask($first_quarter);
$job1->start();

$job2 = new BackgroundTask($second_quarter);
$job2->start();

$job3 = new BackgroundTask($third_quarter);
$job3->start();

$job4 = new BackgroundTask($fourth_quarter);
$job4->start();

// ==================

// "join" each job, i.e. "wait until it's finished"
$job1->join();
echo "First output: " . $job1->output;

$job2->join();
echo "Second output: " . $job2->output;

$job3->join();
echo "Third output: " . $job3->output;

$job4->join();
echo "Fourth output: " . $job4->output;
?>
When making four calls to your own script through HTTP, you're using up connections for no good reason: you're taking away connection slots from other users who may be trying to access your website.
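Alternatively, since you mentioned you haven't tried running the scripts from the command line yet, here is a minimal sketch of launching them in the background (assuming hypothetical worker files worker0.php through worker3.php and a Linux-style shell):
<?php
// Each worker processes its quarter of the SQL files; "> /dev/null 2>&1 &"
// detaches the process so all four genuinely run at the same time
for ($i = 0; $i < 4; $i++) {
    exec("php worker{$i}.php > /dev/null 2>&1 &");
}
?>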
I'm working on a project that downloads up to 5000 individual pieces of data from a server. It is basically a PHP page that takes a POST variable, gets the data from the DB, and sends it back to the .NET client.
It is slow: it takes about 1 second per request. I've googled a lot and tried all sorts of tweaks to the code, like the famous proxy setting, etc., but nothing speeds it up.
Any ideas? All solutions that make this super fast are welcome, even C-written DLLs or anything else you can think of. This just needs to be a lot faster.
Public Function askServer(oCode As String) As String
    oBytesToSend = Encoding.ASCII.GetBytes("cmd=" & System.Web.HttpUtility.UrlEncode(oCode))
    Try
        oRequest = WebRequest.Create(webServiceUrl)
        oRequest.Timeout = 60000
        oRequest.Proxy = WebRequest.DefaultWebProxy
        CType(oRequest, HttpWebRequest).UserAgent = "XXXXX"
        oRequest.Method = "POST"
        oRequest.ContentLength = oBytesToSend.Length
        oRequest.ContentType = "application/x-www-form-urlencoded"

        oStream = oRequest.GetRequestStream()
        oStream.Write(oBytesToSend, 0, oBytesToSend.Length)

        oResponse = oRequest.GetResponse()
        If CType(oResponse, HttpWebResponse).StatusCode = Net.HttpStatusCode.OK Then
            oStream = oResponse.GetResponseStream()
            oReader = New StreamReader(oStream)
            oResponseFromServer = oReader.ReadToEnd()
            oResponseFromServer = System.Web.HttpUtility.UrlDecode(oResponseFromServer)
            Return oResponseFromServer
        Else
            MsgBox("Server error", CType(vbOKOnly + vbCritical, MsgBoxStyle), "")
            Return ""
        End If
    Catch e As Exception
        MsgBox("Oops" & vbCrLf & e.Message, CType(vbOKOnly + vbCritical, MsgBoxStyle), "")
        Return ""
    End Try
End Function
Some ideas:
Run the HTTP requests in parallel. (Client)
If the response size allows it, get all the data you need in one request; you'd need to change your server implementation, as in the sketch below. (Server)
Cache data. (Server)
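A minimal sketch of the batched server side (assuming a hypothetical lookup_data() helper wrapping your existing per-code DB query; the client would then send all codes as one JSON-encoded POST field instead of 5000 separate requests):
<?php
// Accept many codes in one POST and answer with a JSON map,
// so the client needs a single round trip
$codes = json_decode(isset($_POST['cmds']) ? $_POST['cmds'] : '[]', true);

$out = array();
foreach ($codes as $code) {
    $out[$code] = lookup_data($code); // hypothetical: your existing per-code lookup
}

header('Content-Type: application/json');
echo json_encode($out);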