UPDATE: It seems I have been wasting my time to some extent: according to http://www.browserscope.org/?category=network&v=top-d, most modern browsers already limit the number of connections to a single host, 6 being the common limit, which suits my purposes rather well. But I guess it is still an interesting problem.
The final piece of the jigsaw for my work task is breaking a list of potentially 250+ AJAX requests into batches.
The requests are generated as the output of the following PHP code:
<?php
$hitlist = urlBuilder($markets, $template);
foreach ($hitlist as $mktlist) {
    foreach ($mktlist as $id => $hit) {
        $cc = substr($id, 0, 2);
        $lc = substr($id, -4);
        echo ("$(\"#" . $cc . $lc . "\").load(\"psurl.php?server=" . $server . "&url=" . $hit . "&port=" . $port . "\");\n");
    }
}
?>
This generates a long list of jQuery .load() calls, which right now are all executed on a single click.
e.g.
$("#sesv-1").load("psurl.php?server=101.abc.com&url=/se/sv&port=80");
$("#sesv-2").load("psurl.php?server=101.abc.com&url=/se/sv/catalog/&port=80");
$("#sesv-3").load("psurl.php?server=101.abc.com&url=/se/sv/catalog/products/12345678&port=80");
$("#atde-1").load("psurl.php?server=101.abc.com&url=/at/de&port=80");
$("#atde-2").load("psurl.php?server=101.abc.com&url=/at/de/catalog/&port=80");
$("#atde-3").load("psurl.php?server=101.abc.com&url=/at/de/catalog/products/12345678&port=80");
$("#benl-1").load("psurl.php?server=101.abc.com&url=/be/nl&port=80");
$("#benl-2").load("psurl.php?server=101.abc.com&url=/be/nl/catalog/&port=80");
$("#benl-3").load("psurl.php?server=101.abc.com&url=/be/nl/catalog/products/12345678&port=80");
$("#befr-1").load("psurl.php?server=101.abc.com&url=/be/fr&port=80");
$("#befr-2").load("psurl.php?server=101.abc.com&url=/be/fr/catalog/&port=80");
$("#befr-3").load("psurl.php?server=101.abc.com&url=/be/fr/catalog/products/12345678&port=80");
Depending on circumstances it can be around 250 requests, or perhaps only 30-40. The whole purpose of the app is to warm up newly restarted app servers... so 250 simultaneous requests against a fresh JVM = death!
So ideally I would like to break them up, perhaps by market, which would mean at most 5-6 requests at a time.
Any ideas on how this can be accomplished? Is it possible in standard jQuery? I am trying to keep the dependencies as limited as possible, so preferably without plugins!
You can use jQuery's .queue().
// Define a queue for execution (any element can hold it; here the first target is reused)
var
    $elem = $("#sesv-1"),
    enqueue = function (request) { $elem.queue("status", request); };
// Queue your requests; each queued function receives a "next" callback, and passing
// it to .load() as the complete handler starts the following request only when this one is done
enqueue(function (next) {
    $("#sesv-1").load("psurl.php?server=101.abc.com&url=/se/sv&port=80", next);
});
enqueue(function (next) {
    $("#sesv-2").load("psurl.php?server=101.abc.com&url=/se/sv/catalog/&port=80", next);
});
// Execute the queue
$elem.dequeue("status");
You can create as many queues as you need (most probably one per market) and then enqueue your requests into them; a sketch of how that could plug into the PHP generator follows.
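For illustration, a rough sketch reusing the variables from the question; the queue is held on document, and the market name (e.g. "sesv") is derived from the element id:
<?php
foreach ($hitlist as $mktlist) {
    $market = null;
    foreach ($mktlist as $id => $hit) {
        $cc = substr($id, 0, 2);
        $lc = substr($id, -4);
        $market = $cc . substr($lc, 0, 2); // e.g. "sesv"
        // each queued function receives "next"; handing it to .load() means the
        // next request in this market starts only when the current one finishes
        echo "\$(document).queue(\"$market\", function(next) {\n";
        echo "    \$(\"#$cc$lc\").load(\"psurl.php?server=$server&url=$hit&port=$port\", next);\n";
        echo "});\n";
    }
    if ($market !== null) {
        echo "\$(document).dequeue(\"$market\");\n"; // kick off this market's queue
    }
}
?>
Since every market's queue runs independently, roughly one request per market is in flight at any time, instead of all 250 at once.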
I have a file whose job is to import data into an SQL database from an API. A problem I encountered is that the API can only return a maximum of 1000 records per call, even though I sometimes need to retrieve large amounts of data, ranging from 10 to 200,000 records. My first thought was to create a while loop in which I make calls to the API until all of the data has been retrieved, and only afterwards enter it into the database.
$moreDataToImport = true;
$lastId = null;
$query = '';
while ($moreDataToImport) {
    $result = json_decode(callToApi($lastId), true); // decode as array so $result['...'] works
    $query .= formatResult($result);
    $moreDataToImport = !empty($result['dataNotExported']);
    $lastId = getLastId($result['customers']);
}
mysqli_multi_query($con, $query);
The issue I encountered with this is that I was quickly reaching memory limits. The easy solution would be to simply increase the memory limit until it sufficed, but how much memory I need is unknown, because there is always a possibility that I have to import a very large dataset and could theoretically always run out of memory. I don't want to set an infinite memory limit, as the problems with that are unimaginable.
My second solution was, instead of looping through the imported data, to send each batch to my database and then do a page refresh, with a GET request specifying the last ID I left off on.
if (isset($_GET['lastId'])) {
    $lastId = $_GET['lastId'];
} else {
    $lastId = null;
}
$result = json_decode(callToApi($lastId), true);
$query = formatResult($result);
mysqli_multi_query($con, $query);
if (!empty($result['dataNotExported'])) {
    header('Location: ./page.php?lastId=' . getLastId($result['customers']));
    exit; // stop the script once the redirect is sent
}
This solution solves my memory limit issue, but now I have another issue: browsers will automatically kill the request after about 20 redirects (the exact number depends on the browser) to stop a potential redirect loop. The workaround would be to stop the redirect chain yourself before the 20th redirect and let the page refresh, continuing the process.
if (isset($_GET['redirects'])) {
    $redirects = $_GET['redirects'];
    if ($redirects == '20') {
        if ($lastId == null) {
            header("Location: ./page.php?redirects=2");
        } else {
            header("Location: ./page.php?lastId=$lastId&redirects=2");
        }
        exit;
    }
} else {
    $redirects = '1';
}
Though this works, I am afraid it is more impractical than other solutions, and there must be a better way to do this. Are these two approaches, redirect chains or risking the memory limit, really my only choices? And if so, is one more efficient/orthodox than the other?
Do the insert query inside the loop that fetches each page from the API, rather than concatenating all the queries.
$moreDataToImport = true;
$lastId = null;
while ($moreDataToImport) {
    $result = json_decode(callToApi($lastId), true);
    // run each page's insert immediately instead of accumulating one giant query string
    $query = formatResult($result);
    mysqli_query($con, $query);
    $moreDataToImport = !empty($result['dataNotExported']);
    $lastId = getLastId($result['customers']);
}
Page your work. Break it up into smaller chunks that will be below your memory limit.
If the API only returns 1000 at a time, then only process 1000 at a time in a loop. In each iteration of the loop you'll query the API, process the data, and store it. Then, on the next iteration, you'll be using the same variables so your memory won't skyrocket.
A couple of things to consider:
If this becomes a long-running script, you may hit the default script execution time limit, so you'll have to extend it with set_time_limit() (see the sketch after this list).
Some browsers will treat a script that runs too long as timed out and will show the appropriate error message.
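For example, a minimal sketch of lifting those limits at the top of such a script (the exact values are a judgment call, not a recommendation):
<?php
set_time_limit(0);       // 0 removes the time limit entirely; a finite cap is safer in production
ignore_user_abort(true); // keep importing even if the browser stops waiting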
For processing upwards of 200,000 pieces of data from an API, I think the best solution is to not make this work dependent on a page load. If possible, I'd put this in a cron job to be run by the server on a regular schedule.
If the dataset is dependent on the request (for example, if you're processing temperatures from one of thousands of weather stations, with the specific station ID set by the user), then consider creating a secondary script that does the work. Calling and forking the secondary script from your primary script lets the primary script finish execution while the secondary script runs in the background on your server. Something like:
exec('php path/to/secondary-script.php > /dev/null &');
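If the worker needs the user's input (the station ID in the example above), it could be passed as an argument; a hypothetical sketch, with escapeshellarg() guarding the shell call:
<?php
// launch the worker in the background with the user-supplied station ID
$stationId = escapeshellarg($_GET['station_id']); // the parameter name is made up
exec("php path/to/secondary-script.php {$stationId} > /dev/null 2>&1 &");
// the primary script can now finish and respond immediately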
I have a website on an Ubuntu LAMP server with a form that collects variables and submits them to a handler function. The handler calls other functions in the controller that "explode" the variables, order them in an array, and run a "for" loop over each one, fetching new data from slow APIs and inserting it into the relevant tables in the database.
Whenever I submit the form, the whole website gets stuck (only for my IP; on other desktops the website continues working normally), and I only get redirected once the requested redirect("new/url"); finally fires.
I have been researching this issue for a while and found this post as an example:
Continue PHP execution after sending HTTP response
After studying how this works on the server side, which is explained really well in this video: https://www.youtube.com/watch?v=xVSPv-9x3gk
I wanted to start learning the syntax, and found out that this apparently only works from the CLI and not from Apache, but I wasn't sure.
I opened this post a few days ago: PHP+fork(): How to run a fork in a PHP code
and after getting everything working on the server side, installing fork (pcntl) support and figuring out the differences between the php.ini files on a server (I edited the apache2 php.ini, to be clear), I stopped getting the errors I used to get for the "fork", but the processes don't run in the background, and I didn't get redirected.
This is the controller after adding fork:
<?php
// Registers a new keyword for prod to the DB.
public function add_keyword() {
    $keyword_p = $this->input->post('key_word');
    $prod = $this->input->post('prod_name');
    $prod = $this->kas_model->search_prod_name($prod);
    $prod = $prod[0]->prod_id;
    $country = $this->input->post('key_country');
    $keyword = explode(", ", $keyword_p);
    var_dump($keyword);
    $keyword_count = count($keyword);
    echo "the keyword count: $keyword_count";
    for ($i = 0; $i < $keyword_count; $i++) {
        // create your next fork
        $pid = pcntl_fork();
        if (!$pid) {
            //*** get new vars from $keyword_count
            //*** run API functions to get new data_arrays
            //*** inserts new data for each $keyword_count to the DB
            print "In child $i\n";
            exit($i);
            // end child
        }
    }
    // we are the parent (main), check the children (optional)
    while (pcntl_waitpid(0, $status) != -1) {
        $status = pcntl_wexitstatus($status);
        echo "Child $status completed\n";
    }
    // your other main code: redirect to main page.
    redirect('banana/kas');
}
?>
And this is the controller without the fork:
// Registers a new keyword for prod to the DB.
public function add_keyword() {
    $keyword_p = $this->input->post('key_word');
    $prod = $this->input->post('prod_name');
    $prod = $this->kas_model->search_prod_name($prod);
    $prod = $prod[0]->prod_id;
    $country = $this->input->post('key_country');
    $keyword = explode(", ", $keyword_p);
    var_dump($keyword);
    $keyword_count = count($keyword);
    echo "the keyword count: $keyword_count";
    // problematic part that needs forking
    for ($i = 0; $i < $keyword_count; $i++) {
        // get new vars from $keyword_count
        // run API functions to get new data_arrays
        // inserts new data for each $keyword_count to the DB
    }
    // Redirect to main page.
    redirect('banana/kas');
}
The for ($i = 0; $i < $keyword_count; $i++) loop is the part that I want to run in the background, because it takes too much time.
So now:
How can I get this working the way I explained? Because from what I see, fork isn't what I'm looking for, or I might be doing this wrong.
I will be happy to learn new techniques, so suggestions for doing this in different ways are welcome. I am a self-learner, and I have seen the great advantages of Node.js, for example, which could have worked perfectly in this case if I had learnt it. I will consider learning Node.js in the future; sending server requests and getting back responses is awesome ;).
If there is a need for more information about something, please tell me in the comments and I will add it to my post if you think it's relevant and I missed it.
What you're really after is a queue or a job system: one script running all the time, waiting for something to do. Once your original PHP script runs, it just adds a job to the list and can continue its process as normal.
There are a few implementations of this; take a look at something like https://laravel.com/docs/5.1/queues. A bare-bones sketch of the idea is below.
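For illustration only, stripped of everything a real queue needs, and assuming a hypothetical jobs table with id, payload and done columns:
<?php
// in the controller: record the work and return to the user immediately
$pdo = new PDO('mysql:host=localhost;dbname=test', 'user', 'pass');
$pdo->prepare('INSERT INTO jobs (payload, done) VALUES (?, 0)')
    ->execute([json_encode($keyword)]);

// worker.php: run by cron (or kept running in a loop), does the slow API work
$pdo = new PDO('mysql:host=localhost;dbname=test', 'user', 'pass');
foreach ($pdo->query('SELECT id, payload FROM jobs WHERE done = 0') as $job) {
    $keywords = json_decode($job['payload'], true);
    // ... call the slow APIs and insert the results into the DB here ...
    $pdo->prepare('UPDATE jobs SET done = 1 WHERE id = ?')->execute([$job['id']]);
}
A real queue also handles locking, retries and failures, which is exactly what a library like Laravel's gives you for free.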
I'm writing a chat program for a site that does live broadcasting, and as you can guess with any chat that is not application-driven, it relies on a looping AJAX call to get new messages, in my case once every 2 seconds. The JSON that is created via PHP and populated from SQL is of some concern to me: while it shows no noticeable impact on my server at present, I cannot predict what adding several hundred users to the mix may do.
<?php
require_once("../../../../wp-load.php");
global $wpdb;
$table_name = $wpdb->prefix . "chat_posts";
$last = (int) $_GET['last']; // cast to int so user input never reaches the query as a raw string
$posts = $wpdb->get_results("SELECT * FROM " . $table_name . " WHERE ID > " . $last . " ORDER BY ID");
echo json_encode($posts);
?>
There obviously isn't much wiggle room as far as optimizing the code itself, but I am a little worried about how well the WordPress SQL engine is written and whether it will bog my SQL server down once it is receiving 200 requests every 2 seconds. Would it be better to cache the JSON-encoded results of the DB query to a file, age-check it against new calls to the PHP script, and either reconstruct the file with a new query or pass back its contents based on its last modification date? That way I'd put a bigger load on my file system but reduce my SQL load to one query every 2 seconds, regardless of the number of users.
Or am I already on the right path with just querying the server on every call?
So this is what I came up with. I went the DB-only route for a few tests, and while response was snappy, it didn't scale well and connections quickly got eaten up. So I decided to write a quick little bit of caching logic. So far it has worked wonderfully and seems to allow me to scale my chat as big as I want.
$last = (int) $_GET['last']; // cast to int: it is used in both the file name and the query
$cacheFile = 'cache/chat_' . $last . '.json';
if (file_exists($cacheFile) && filemtime($cacheFile) + QUERY_REFRESH_RATE > time())
{
    readfile($cacheFile);
} else {
    require_once("../../../../wp-load.php");
    $timestampMin = gmdate("Y-m-d H:i:s", (time() - 7200));
    $sql = "/*qc=on*/" . "SELECT * FROM " . DB_TABLE . "chat_posts WHERE ID > " . $last . " AND timestamp > '" . $timestampMin . "' ORDER BY ID;";
    $posts = $wpdb->get_results($sql);
    $json = json_encode($posts);
    echo $json;
    file_put_contents($cacheFile, $json);
}
It's also great in that it allows me to run my formatting functions against messages, such as parsing URLs into actual links, with much less overhead.
My app first queries 2 large sets of data, then does some work on the first set of data, and "uses" it on the second.
If possible, I'd like it to instead query only the first set synchronously and the second asynchronously, do the work on the first set, then wait for the query of the second set to finish if it hasn't already, and finally use the first set of data on it.
Is this possible somehow?
It's possible.
$mysqli->query($long_running_sql, MYSQLI_ASYNC);
echo 'run other stuff';
$result = $mysqli->reap_async_query(); //gives result (and blocks script if query is not done)
$resultArray = $result->fetch_assoc();
Or you can use mysqli_poll() if you don't want a blocking call:
http://php.net/manual/en/mysqli.poll.php
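A rough sketch of the polling variant (the connection details and the doOtherWork() helper are placeholders):
$mysqli = new mysqli('localhost', 'user', 'pass', 'test');
$mysqli->query($long_running_sql, MYSQLI_ASYNC);

do {
    doOtherWork(); // anything useful while MySQL is still busy
    // mysqli_poll() modifies the arrays, so rebuild them before every call
    $read = $error = $reject = array($mysqli);
} while (mysqli_poll($read, $error, $reject, 0, 200000) === 0); // check roughly every 0.2s

$result = $mysqli->reap_async_query();
$resultArray = $result->fetch_assoc();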
MySQL requires that, inside one connection, a query is completely handled before the next query is launched. That includes the fetching of all results.
It is possible, however, to:
fetch results one by one instead of all at once
launch multiple queries by creating multiple connections
By default, PHP will wait until all results are available and then internally (in the mysql driver) fetch all results at once. This is true even when using, for example, PDOStatement::fetch() to bring them into your code one row at a time. When using PDO, this can be prevented by setting the attribute \PDO::MYSQL_ATTR_USE_BUFFERED_QUERY to false. This is useful for:
speeding up the handling of query results: your code can start processing as soon as one row is found, instead of only after every row is found.
working with result sets that are potentially larger than the memory available to PHP (PHP's self-imposed memory_limit or actual RAM).
Be aware that the speed is often limited by the storage system, whose characteristics can mean that the total processing time of two queries is larger when running them at the same time than when running them one by one.
An example (which could be done entirely in MySQL, but it shows the concept):
$dbConnectionOne = new \PDO('mysql:host=localhost;dbname=test', 'user', 'pass');
$dbConnectionOne->setAttribute(\PDO::ATTR_ERRMODE, \PDO::ERRMODE_EXCEPTION);

$dbConnectionTwo = new \PDO('mysql:host=localhost;dbname=test', 'user', 'pass');
$dbConnectionTwo->setAttribute(\PDO::ATTR_ERRMODE, \PDO::ERRMODE_EXCEPTION);
// unbuffered mode: rows can be fetched as they arrive instead of all at once
$dbConnectionTwo->setAttribute(\PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, false);

$synchStmt = $dbConnectionOne->prepare('SELECT id, name, factor FROM measurementConfiguration');
$synchStmt->execute();

$asynchStmt = $dbConnectionTwo->prepare('SELECT measurementConfiguration_id, timestamp, value FROM hugeMeasurementsTable');
$asynchStmt->execute();

$measurementConfiguration = array();
foreach ($synchStmt->fetchAll() as $synchStmtRow) {
    $measurementConfiguration[$synchStmtRow['id']] = array(
        'name' => $synchStmtRow['name'],
        'factor' => $synchStmtRow['factor']
    );
}

while (($asynchStmtRow = $asynchStmt->fetch()) !== false) {
    $currentMeasurementConfiguration = $measurementConfiguration[$asynchStmtRow['measurementConfiguration_id']];
    echo 'Measurement of sensor ' . $currentMeasurementConfiguration['name'] . ' at ' . $asynchStmtRow['timestamp'] . ' was ' . ($asynchStmtRow['value'] * $currentMeasurementConfiguration['factor']) . PHP_EOL;
}
I'm writing a multi-user, JavaScript based drawing app as a learning project. Right now it's one way: the "transmitter" client at transmitter.html sends the data as the user draws on the HTML5 canvas element, the "receiver" client at receiver.html replicates it on their own canvas.
The transmitter just draws a line between (previousX, previousY) and (currentX, currentY) in response to a mouseMove event. It sends those two sets of coordinates to transmitter.php via AJAX. They sit in a session var until the receiver collects them by calling receiver.php, also via AJAX. At least that's how it should work.
This is transmitter.php:
<?php
session_start();
if (!isset($_SESSION['strokes'])) $_SESSION['strokes'] = '';
$_SESSION['strokes'] .= $_GET['px'] . "," . $_GET['py'] . "," . $_GET['x'] . "," . $_GET['y'] . ';';
?>
This is receiver.php:
<?php
session_start();
echo($_SESSION['strokes']);
$_SESSION['strokes'] = "";
?>
In case you're wondering why I'm using a session var, it's because it's the fastest way I could think of to store the data in such a way that it could be accessed by the other script. I tried googling for alternatives but couldn't find anything. If there's a better way, I'd love to hear about it.
Anyway, the problem is that not all of the data is making it through. This manifests itself as gaps in the lines drawn on the receiver's canvas. I also set up a little counter in the transmitter's and receiver's JavaScript files to check exactly how many "strokes" were being sent/received. There are invariably fewer received than sent, so the data seems to be lost server-side.
At the risk of giving you more code than you need to see, this is the code in transmitter.js that sends the data to the server (n is the counter that I mentioned):
function mouseMoveHandler(e)
{
    var x = e.pageX - canvasX;
    var y = e.pageY - canvasY;
    if (mouseDown)
    {
        canvas.moveTo(prevX, prevY);
        canvas.lineTo(x, y);
        canvas.stroke();
        sendToServer(prevX, prevY, x, y);
        n++;
    }
    prevX = x;
    prevY = y;
}
This is the code in receiver that gets it back and draws it (again, n is the counter):
function responseHandler()
{
    if (xhr.readyState == 4)
    {
        var strokes = xhr.responseText.split(';');
        n += strokes.length - 1;
        for (var i = 0; i < strokes.length - 1; i++)
        {
            var stroke = strokes[i].split(','); // "var" added so stroke is not an implicit global
            canvas.moveTo(stroke[0], stroke[1]);
            canvas.lineTo(stroke[2], stroke[3]);
            canvas.stroke();
        }
        setTimeout(contactServer, 500); // pass the function itself rather than a string to evaluate
    }
}
If I read your question correctly, you're trying to access the same session from different clients? That isn't possible; sessions are bound to a single client/user.
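If you want to stay with plain PHP for now, one crude but workable alternative is a store that every client can reach, such as a shared file (or a DB table); a sketch, with each receiver tracking its own read offset instead of destructively clearing the data:
<?php
// transmitter.php: append the stroke to a file shared by all clients
$stroke = $_GET['px'] . ',' . $_GET['py'] . ',' . $_GET['x'] . ',' . $_GET['y'] . ';';
file_put_contents('strokes.txt', $stroke, FILE_APPEND | LOCK_EX);

// receiver.php: each client sends the offset it has already read up to
$offset = isset($_GET['offset']) ? (int) $_GET['offset'] : 0;
$data = (string) file_get_contents('strokes.txt');
echo substr($data, $offset); // only the strokes added since this client's last poll
// the client adds the response length to its offset before the next call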
If you want something realtime and multi-user, you should probably take a look at NodeJS, and specifically at NodeJS events. It's JS-based, so I think it will integrate nicely with your application.