PHP Amp\Mysql async slower than native blocking PDO?

I'm doing some testing with Amp to see how it could help speed up SQL queries by running them asynchronously. I think I'm doing something wrong, because the results of this test file are very disappointing and not what I expected. Is there something I'm doing wrong?
The code below gives me results like this; the first number is Amp\Mysql, and it is a lot slower for some reason:
0.37159991264343
0.10906314849854
PHP code:
<?php

require 'vendor/autoload.php';
require 'Timer.php';

$runThisManyTimes = 1000;

///////////////////////////////////////////////////////////

use Amp\Mysql\ConnectionConfig;
use Amp\Loop;

Loop::run(function () use ($runThisManyTimes) {
    $timer = Timer::start();

    $config = ConnectionConfig::fromString(
        "host=127.0.0.1 user=test password=test db=test"
    );

    /** @var \Amp\Mysql\Pool $pool */
    $pool = Amp\Mysql\pool($config);

    /** @var \Amp\Mysql\Statement $statement */
    $statement = yield $pool->prepare("SELECT * FROM accounts WHERE id = :id");

    for ($i = 1; $i <= $runThisManyTimes; $i++) {
        /** @var \Amp\Mysql\ResultSet $result */
        $result = yield $statement->execute(['id' => '206e5903-98bd-4af5-8fb1-86a520e9a330']);

        while (yield $result->advance()) {
            $row = $result->getCurrent();
        }
    }

    $timer->stop();
    echo $timer->getSeconds();

    Loop::stop();
});
echo PHP_EOL;
///////////////////////////////////////////////////////////
$timer = Timer::start();

$pdo = new PDO('mysql:host=127.0.0.1;dbname=test', 'test', 'test');

$statement = $pdo->prepare("SELECT * FROM accounts WHERE id = :id");

for ($i = 1; $i <= $runThisManyTimes; $i++) {
    $statement->execute(['id' => '206e5903-98bd-4af5-8fb1-86a520e9a330']);
    $statement->fetch();
}

$timer->stop();
echo $timer->getSeconds();

Parallel execution of MySQL queries is not productive when each query takes less than, say, 1 second.
Each thread must use its own connection, and establishing a connection takes some time.
Your particular benchmark (like most benchmarks) is not very useful. After the first execution of that single SELECT, all subsequent executions will probably take less than 1 ms. It would be better to use a sequence of statements that reflects your app.

Your benchmark doesn't include any concurrency, so it's basically like blocking I/O in the PDO example. amphp/mysql is a full protocol implementation in PHP, so it's somewhat expected to be slower than the C implementation of PDO.
If you want to find out whether non-blocking concurrent I/O has benefits for your application and you're currently using sequential blocking PDO queries, you should benchmark those against non-blocking concurrent queries using amphp/mysql instead of serial ones.
Additionally, amphp/mysql might not be optimized as much as the database drivers behind PDO, but it allows for non-blocking concurrent queries, which isn't supported by PDO. If you do sequential queries, PDO will definitely have better performance for the time being, but amphp/mysql is very useful once concurrency is involved.
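To illustrate, here is a minimal sketch of what a concurrent version of the benchmark might look like, using the same Amp v2 yield style as the question. Firing all 1000 queries at once is an illustrative choice; in practice the pool's connection limit bounds the actual concurrency:
<?php

require 'vendor/autoload.php';

use Amp\Loop;
use Amp\Mysql\ConnectionConfig;

Loop::run(function () {
    $config = ConnectionConfig::fromString("host=127.0.0.1 user=test password=test db=test");
    $pool   = Amp\Mysql\pool($config);

    // Start all queries without awaiting each one individually...
    $promises = [];
    for ($i = 1; $i <= 1000; $i++) {
        $promises[] = Amp\call(function () use ($pool) {
            $result = yield $pool->execute(
                "SELECT * FROM accounts WHERE id = ?",
                ['206e5903-98bd-4af5-8fb1-86a520e9a330']
            );
            while (yield $result->advance()) {
                $row = $result->getCurrent();
            }
        });
    }

    // ...then await them all at once; the pool distributes them over
    // its connections concurrently instead of running them serially.
    yield Amp\Promise\all($promises);
});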

Related

Get data from database with MySQL in for loop

I am a newbie in MySQL and PHP.
I have the following code to get data within a date range (day 1 to day 2, then day 2 to day 3 and so on).
function getData($query) {
    global $connect;
    $result = mysqli_query($connect, $query);
    if (!$result) {
        echo 'MySQL Error: ' . mysqli_error($connect);
        die();
    }
    return mysqli_fetch_assoc($result);
}

$dayZero  = date_create('2017-01-21');
$dayToday = date_create(date('Y-m-d')); // note: a bare date_create('Y-m-d') would fail; the format string must go through date()
$diff = date_diff($dayZero, $dayToday)->format('%a');

for ($i = 0; $i < $diff; ++$i) {
    $start[$i] = date('Y-m-d', date_format($dayZero, 'U') + (24 * 60 * 60) * $i);
    $end[$i]   = date('Y-m-d', date_format($dayZero, 'U') + (24 * 60 * 60) * ($i + 1));
    $days[$i]  = getData('SELECT COUNT(*) AS "b" FROM `table_name` WHERE `timestamp` BETWEEN "' . $start[$i] . '" AND "' . $end[$i] . '"')['b'];
}
The code works as expected, but it runs extremely slowly. My guess is that it is slow because it queries the database on every loop iteration.
Is there a way to make it run faster? Or is there any optimization that I can make?
Yes! Great question. While you can execute queries as you have done, the better option is to use prepared statements. This separates the query into a prepared statement and its variables; see here:
http://www.w3schools.com/php/php_mysql_prepared_statements.asp
The actual statement or query is sent to the server only once. After that, the server waits for you to supply the variables.
This is great for performance-sensitive applications (like yours), where the server can make use of caching to greatly speed up performance. It is also the preferred method for secure applications, since the server is protected from injection attacks.
As a final note, there are many ways to optimize SQL queries, and this is just one of them. You should always use prepared statements, though. A sketch of your loop rewritten this way follows.
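A minimal sketch of the loop rewritten with a prepared statement: prepare once, then bind and execute per iteration. It assumes $connect, $dayZero, and $diff from your code, and mysqli_stmt_get_result() requires the mysqlnd driver:
<?php
$stmt = mysqli_prepare(
    $connect,
    'SELECT COUNT(*) AS b FROM `table_name` WHERE `timestamp` BETWEEN ? AND ?'
);
mysqli_stmt_bind_param($stmt, 'ss', $start, $end); // bound by reference

$dayZeroTs = (int) date_format($dayZero, 'U');
for ($i = 0; $i < $diff; ++$i) {
    $start = date('Y-m-d', $dayZeroTs + 86400 * $i);
    $end   = date('Y-m-d', $dayZeroTs + 86400 * ($i + 1));
    mysqli_stmt_execute($stmt); // re-uses the single prepared statement
    $days[$i] = mysqli_stmt_get_result($stmt)->fetch_assoc()['b'];
}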

how to cleanup / free database query memory in zend?

After executing this simple code (against a MySQL database), each loop iteration leaks about 1 kB of memory, so after the 1000th iteration about 1 MB is used.
Now, if I have to loop in a long-running script (about 1,000,000 iterations), I will run out of memory quickly:
$_db = Zend_Db_Table::getDefaultAdapter();
$start_memory = memory_get_usage();

for ($i = 0; $i < 1000; $i++) {
    $update_query = "UPDATE table SET field='value'";
    $_db->query($update_query);
}

echo 'memory used: ' . (memory_get_usage() - $start_memory);
Is there a way to free the memory used by a database query?
I tried putting the update query in a function, so that after leaving the function scope the resources used by it should be freed automatically:
function update($_db) {
    $sql = "UPDATE table SET field='value'";
    $_db->query($sql);
}

...

for ($i = 0; $i < 1000; $i++) {
    update($_db);
}
but they are not!
I'm not interested in advice like 'try updating multiple rows in one go' ;)
Most probably you have the Zend_Db_Profiler enabled.
The database profiler stores each executed query, which is very useful for debugging and optimisation but leads to rather fast memory exhaustion if you execute a huge number of queries.
In the example you gave, disabling the profiler should do the trick:
$_db = Zend_Db_Table::getDefaultAdapter();
$_db->getProfiler()->setEnabled(false);
$start_memory = memory_get_usage();

for ($i = 0; $i < 1000; $i++) {
    $update_query = "UPDATE table SET field='value'";
    $_db->query($update_query);
}

echo 'memory used: ' . (memory_get_usage() - $start_memory);
When executing the same query multiple times, the best way to save memory is to use prepared statements. Your adapter will use prepared statements, but since you are calling the query() method inside the loop, the statement gets prepared every time. Move the preparation outside of the loop:
$_db = Zend_Db_Table::getDefaultAdapter();
$_stm = $_db->prepare("UPDATE table SET field = ?"); // note: the placeholder must not be quoted

$fieldValue = 'value';
for ($i = 0; $i < 1000; $i++) {
    $_stm->execute(array($fieldValue));
}

PHP Optimisation is this correct?

I have been creating a PHP application that makes quite a few queries to the database, roughly around 30 on each page load. This is needed due to the nature of the application. I am using OOP PHP techniques and optimising my queries as much as I can. Should I be using some sort of caching system, or would you say 30 is fine? Here is a typical query.
OK, so my __construct looks like this:
public function __construct($host = 'localhost', $user = 'root', $pass = 'root', $name = 'advert')
{
    $this->_conn = new mysqli($host, $user, $pass, $name);
    // note: "new mysqli(...) or trigger_error(...)" can never fire, because new
    // never returns false; check connect_error instead
    if ($this->_conn->connect_error) {
        trigger_error('Unable to connect to the server, please check your credentials.', E_USER_ERROR);
    }
}
And one method like so.
$sql = "SELECT `advert_id`,
`ad_title`,
`ad_image` FROM adverts WHERE UNIX_TIMESTAMP() < `ad_expires` AND `ad_show` = 0 AND `ad_enabled` = 1 ORDER BY `ad_id` DESC LIMIT 1";
$stmt = $this->_conn->prepare($sql);
if ($stmt) {
$stmt->execute();
$stmt->bind_result($ad_id, $ad_title, $ad_image);
$rows = array();
while ($row = $stmt->fetch()) {
$item = array(
'ad_id' => $ad_id,
'ad_title' => $ad_title,
'ad_image' => $ad_image
);
$rows[] = $item;
}
The app is kinda like this throughout.
Thanks, any feedback will be much appreciated.
EDIT: Sorry, I meant to say 30 queries, not 30 connections.
You should use caching when it actually helps. If page generation takes 3 seconds without query caching and 0.03 seconds with it, then you should obviously use caching. If caching doesn't give any noticeable boost, don't spend resources on it.
Just make one connection and re-use it. 30 connections is a lot, considering that you might have multiple users.
Edit: the initial question said connections. 30 queries is fine unless the data doesn't change very often. In that case, you can first check whether you need to pull fresh data or whether cached data is fine to serve to the user; a sketch of that idea follows.
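If you do decide to cache, here is a minimal sketch of the idea using APCu. It assumes the APCu extension is available and reuses the advert query from the question; the function name, cache key, and 60-second TTL are illustrative choices, and fetch_all() requires the mysqlnd driver:
<?php
function getAdverts(mysqli $conn): array
{
    $rows = apcu_fetch('adverts.latest', $hit);
    if ($hit) {
        return $rows; // served from cache: no query at all
    }

    $result = $conn->query(
        "SELECT `advert_id`, `ad_title`, `ad_image`
         FROM adverts
         WHERE UNIX_TIMESTAMP() < `ad_expires` AND `ad_show` = 0 AND `ad_enabled` = 1
         ORDER BY `ad_id` DESC
         LIMIT 1"
    );
    $rows = $result->fetch_all(MYSQLI_ASSOC); // requires mysqlnd

    apcu_store('adverts.latest', $rows, 60); // cache for 60 seconds
    return $rows;
}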

Efficient way to look up value based on a key in php [duplicate]

With a list of around 100,000 key/value pairs (both strings, mostly around 5-20 characters each), I am looking for a way to efficiently find the value for a given key.
This needs to be done in a PHP website. I am familiar with hash tables in Java (which is probably what I would use if working in Java) but am new to PHP.
I am looking for tips on how I should store this list (in a text file or in a database?) and how to search it.
The list would have to be updated occasionally, but I am mostly interested in lookup time.
You could do it as a straight PHP array, but Sqlite is going to be your best bet for speed and convenience if it is available.
PHP array
Just store everything in a php file like this:
<?php
return array(
    'key1' => 'value1',
    'key2' => 'value2',
    // snip
    'key100000' => 'value100000',
);
Then you can access it like this:
<?php
$s = microtime(true); // gets the start time for benchmarking
$data = require('data.php');
echo $data['key2'];
var_dump(microtime(true)-$s); // dumps the execution time
Not the most efficient thing in the world, but it's going to work. It takes 0.1 seconds on my machine.
Sqlite
PHP should come with sqlite enabled, which will work great for this kind of thing.
This script will create a database for you from start to finish with similar characteristics to the dataset you describe in the question:
<?php
// this will *create* data.sqlite if it does not exist. Make sure "/data"
// is writable and *not* publicly accessible.
// the ATTR_ERRMODE bit at the end is useful as it forces PDO to throw an
// exception when you make a mistake, rather than internally storing an
// error code and waiting for you to retrieve it.
$pdo = new PDO('sqlite:'.dirname(__FILE__).'/data/data.sqlite', null, null, array(PDO::ATTR_ERRMODE=>PDO::ERRMODE_EXCEPTION));
// create the table if you need to
$pdo->exec("CREATE TABLE stuff(id TEXT PRIMARY KEY, value TEXT)");
// insert the data
$stmt = $pdo->prepare('INSERT INTO stuff(id, value) VALUES(:id, :value)');
$id = null;
$value = null;
// this binds the variables by reference so you can re-use the prepared statement
$stmt->bindParam(':id', $id);
$stmt->bindParam(':value', $value);
// insert some data (in this case it's just dummy data)
for ($i = 0; $i < 100000; $i++) {
    $id = $i;
    $value = 'value'.$i;
    $stmt->execute();
}
And then to use the values:
<?php
$s = microtime(true);
$pdo = new PDO('sqlite:'.dirname(__FILE__).'/data/data.sqlite', null, null, array(PDO::ATTR_ERRMODE=>PDO::ERRMODE_EXCEPTION));
$stmt = $pdo->prepare("SELECT * FROM stuff WHERE id=:id");
$stmt->bindValue(':id', 5);
$stmt->execute();
$value = $stmt->fetchColumn(1);
var_dump($value);
// the number of seconds it took to do the lookup
var_dump(microtime(true)-$s);
This one is waaaay faster. 0.0009 seconds on my machine.
MySQL
You could also use MySQL for this instead of Sqlite, but if it's just one table with the characteristics you describe, it's probably going to be overkill. The above Sqlite example will work fine using MySQL if you have a MySQL server available to you. Just change the line that instantiates PDO to this:
$pdo = new PDO('mysql:host=your.host;dbname=your_db', 'user', 'password', array(PDO::ATTR_ERRMODE=>PDO::ERRMODE_EXCEPTION));
The queries in the sqlite example should all work fine with MySQL, but please note that I haven't tested this.
Let's get a bit crazy: Filesystem madness
Not that the Sqlite solution is slow (0.0009 seconds!), but this one is about four times faster on my machine. Also, Sqlite may not be available, setting up MySQL might be out of the question, etc.
In this case, you can also use the file system:
<?php
$s = microtime(true); // more hack benchmarking

class FileCache
{
    protected $basePath;

    public function __construct($basePath)
    {
        $this->basePath = $basePath;
    }

    public function add($key, $value)
    {
        $path = $this->getPath($key);
        file_put_contents($path, $value);
    }

    public function get($key)
    {
        $path = $this->getPath($key);
        return file_get_contents($path);
    }

    public function getPath($key)
    {
        $split = 3;
        $key = md5($key);
        if (!is_writable($this->basePath)) {
            throw new Exception("Base path '{$this->basePath}' was not writable");
        }
        $path = array();
        for ($i = 0; $i < $split; $i++) {
            $path[] = $key[$i];
        }
        $dir = $this->basePath.'/'.implode('/', $path);
        if (!file_exists($dir)) {
            mkdir($dir, 0777, true);
        }
        return $dir.'/'.substr($key, $split);
    }
}

$fc = new FileCache('/tmp/foo');

/*
// use this crap for generating a test example. it's slow to create though.
for ($i = 0; $i < 100000; $i++) {
    $fc->add('key'.$i, 'value'.$i);
}
//*/

echo $fc->get('key1'); // get() takes only the key
var_dump(microtime(true)-$s);
This one takes 0.0002 seconds for a lookup on my machine. This also has the benefit of being reasonably constant regardless of the cache size.
It depends on how frequently you would access your array; think of it in terms of how many users can access it at the same time. There are many advantages to storing it in a database, and here you have two options: MySQL and SQLite.
SQLite works more like a text file with SQL support; you can save a few milliseconds during queries, as it is located within reach of your application. Its main disadvantage is that it only allows one writer at a time (similar to a text file).
I would recommend SQLite for arrays with static content, like GEO IP data, translations, etc.
MySQL is a more powerful solution, but it requires authentication and is typically located on a separate machine.
PHP arrays will do everything you need. But shouldn't that much data be stored in a database?
http://php.net/array

PHP PDO: how does re-preparing a statement affect performance

I'm writing a semi-simple database wrapper class and want to have a fetching method which would operate automagically: it should prepare each different statement only the first time around and just bind and execute the query on successive calls.
I guess the main question is: how does re-preparing the same MySQL statement work? Will PDO magically recognize the statement (so I don't have to) and skip the re-prepare?
If not, I'm planning to achieve this by generating a unique key for each different query and keeping the prepared statements in a private array in the database object, under that unique key. I'm planning to obtain the array key in one of the following ways (none of which I like), in order of preference:
- have the programmer pass an extra, always-the-same parameter when calling the method - something along the lines of basename(__FILE__, ".php") . __LINE__ (this would work only if our method is called within a loop, which is the case most of the time this functionality is needed)
- have the programmer pass a totally random string (most likely generated beforehand) as an extra parameter
- use the passed query itself to generate the key - getting the hash of the query or something similar
- achieve the same as the first bullet by calling debug_backtrace
Has anyone had similar experience? Although the system I'm working on does deserve some attention to optimization (it's quite large and growing by the week), perhaps I'm worrying about nothing and there is no performance benefit in doing what I'm doing?
MySQL (like most DBMS) will cache execution plans for prepared statements, so if user A creates a plan for:
SELECT * FROM some_table WHERE a_col=:v1 AND b_col=:v2
(where v1 and v2 are bind vars) then sends values to be interpolated by the DBMS, then user B sends the same query (but with different values for interpolation) the DBMS does not have to regenerate the plan. i.e. it's the DBMS which finds the matching plan - not PDO.
However, this means that each operation on the database requires at least two round trips (the first to present the query, the second to present the bind vars), as opposed to a single round trip for a query with literal values, so it introduces additional network costs. There is also a small cost involved in dereferencing (and maintaining) the query/plan cache.
The key question is whether this cost is greater than the cost of generating the plan in the first place.
While (in my experience) there definitely seems to be a performance benefit using prepared statements with Oracle, I'm not convinced that the same is true for MySQL - however, a lot will depend on the structure of your database and the complexity of the query (or more specifically, how many different options the optimizer can find for resolving the query).
Try measuring it yourself (hint: you might want to set the slow query threshold to 0 and write some code to convert literal values back into anonymous representations for the queries written to the logs). A rough sketch of such a normalizer follows.
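For the log-anonymizing hint above, something like this might do (the function name is made up, and the two regexes are a deliberate simplification; they don't handle escaped quotes or every literal form):
<?php
// Normalize literal values in logged queries so that structurally
// identical queries compare equal.
function anonymizeQuery(string $sql): string
{
    $sql = preg_replace("/'[^']*'/", '?', $sql);           // string literals
    $sql = preg_replace('/\b\d+(?:\.\d+)?\b/', '?', $sql); // numeric literals
    return $sql;
}

// Both of these normalize to: SELECT * FROM t WHERE a=? AND b=?
// anonymizeQuery("SELECT * FROM t WHERE a=42 AND b='x'");
// anonymizeQuery("SELECT * FROM t WHERE a=7 AND b='y'");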
Believe me, I've done this before, and after building a cache of prepared statements the performance gain was very noticeable - see this question: Preparing SQL Statements with PDO.
And this is the code I came up with afterwards, with cached prepared statements:
function DB($query)
{
    static $db = null;
    static $result = array();

    if (is_null($db) === true)
    {
        // first call: $query is the path to the SQLite database
        $db = new PDO('sqlite:' . $query, null, null, array(PDO::ATTR_ERRMODE => PDO::ERRMODE_WARNING));
    }

    else if (is_a($db, 'PDO') === true)
    {
        // prepared statements are cached, keyed by the hash of the query string
        $hash = md5($query);

        if (empty($result[$hash]) === true)
        {
            $result[$hash] = $db->prepare($query);
        }

        if (is_a($result[$hash], 'PDOStatement') === true)
        {
            // any arguments after $query are treated as bind values
            if ($result[$hash]->execute(array_slice(func_get_args(), 1)) === true)
            {
                if (stripos($query, 'INSERT') === 0)
                {
                    return $db->lastInsertId();
                }

                else if (stripos($query, 'SELECT') === 0)
                {
                    return $result[$hash]->fetchAll(PDO::FETCH_ASSOC);
                }

                else if ((stripos($query, 'UPDATE') === 0) || (stripos($query, 'DELETE') === 0))
                {
                    return $result[$hash]->rowCount();
                }

                else if (stripos($query, 'REPLACE') === 0)
                {
                }

                return true;
            }
        }

        return false;
    }
}
Since I don't need to worry about collisions in queries, I've ended up using md5() instead of sha1().
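Hypothetical usage of the DB() helper above (the file path and table are made up): the very first call passes the SQLite path, and every later call runs a query, with each distinct string prepared only once:
<?php
// First call: $query is treated as the SQLite database path.
DB('/tmp/example.sqlite');

// Later calls: queries, with bind values as extra arguments.
DB('CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, name TEXT)');
$id   = DB('INSERT INTO users (name) VALUES (?)', 'Alice'); // returns lastInsertId()
$rows = DB('SELECT * FROM users WHERE id = ?', $id);        // returns fetchAll()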
OK, since I've been bashing methods of keying the queries for the cache, other than simply using the query string itself, I've done a naive benchmark. The following compares using the plain query string vs first creating the md5 hash:
$ php -v
PHP 5.3.0-3 with Suhosin-Patch (cli) (built: Aug 26 2009 08:01:52)
...
$ php benchmark.php
PHP hashing: 0.19465494155884 [microtime]
MD5 hashing: 0.57781004905701 [microtime]
799994
The code:
<?php
error_reporting(E_ALL);

$queries = array(
    "SELECT",
    "INSERT",
    "UPDATE",
    "DELETE",
);

$query_length = 256;
$num_queries = 256;
$iter = 10000;

for ($i = 0; $i < $num_queries; $i++) {
    $q = implode('',
        array_map("chr",
            array_map("rand",
                array_fill(0, $query_length, ord("a")),
                array_fill(0, $query_length, ord("z")))));
    $queries[] = $q;
}

echo count($queries), "\n";

$cache = array();
$side_effect1 = 0;
$t = microtime(true);

for ($i = 0; $i < $iter; $i++) {
    foreach ($queries as $q) {
        if (!isset($cache[$q])) {
            $cache[$q] = $q;
        }
        else {
            $side_effect1++;
        }
    }
}

echo microtime(true) - $t, "\n";

$cache = array();
$side_effect2 = 0;
$t = microtime(true);

for ($i = 0; $i < $iter; $i++) {
    foreach ($queries as $q) {
        $md5 = md5($q);
        if (!isset($cache[$md5])) {
            $cache[$md5] = $q;
        }
        else {
            $side_effect2++;
        }
    }
}

echo microtime(true) - $t, "\n";
echo $side_effect1 + $side_effect2, "\n";
To my knowledge, PDO does not reuse already prepared statements, as it does not analyse the query by itself, so it does not know if it is the same query.
If you want to create a cache of prepared queries, the simplest way IMHO would be to md5-hash the query string and generate a lookup table.
OTOH: how many queries are you executing per minute? If fewer than a few hundred, you are only complicating the code; the performance gain will be minor.
Using an MD5 hash as a key, you could eventually get two queries that result in the same MD5 hash. The probability is not high, but it could happen. Don't do it. Lossy hashing algorithms like MD5 are just meant as a way to tell, with high certainty, whether two objects are different; they are not a safe means of identifying something. A sketch of the alternative follows.
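A minimal sketch of that alternative: a statement cache keyed by the raw query string, which sidesteps hash collisions entirely (the class and method names are illustrative; requires PHP 7.4+ for the typed properties):
<?php
class CachingDb
{
    private $pdo;
    private $stmts = [];

    public function __construct(PDO $pdo)
    {
        $this->pdo = $pdo;
    }

    public function run(string $sql, array $params = []): PDOStatement
    {
        // Prepare each distinct SQL string only once per connection;
        // the string itself is the cache key, so no collisions are possible.
        if (!isset($this->stmts[$sql])) {
            $this->stmts[$sql] = $this->pdo->prepare($sql);
        }
        $stmt = $this->stmts[$sql];
        $stmt->execute($params);
        return $stmt;
    }
}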
