Got an obscure one. Been at this one for weeks. Here is what we’re doing:
Upgrade of Silverstripe 3.1 -> 3.7
Upgrade of Platform. PHP 5.6 -> 7.4, Postgres 9.5 -> 13
Old system is running mod_php, New is running PHP_FPM
Parallel hosting systems running. Old and New. Both have identical resources.
Performance tests run on both systems for Old -> New comparisons.
Both platforms consist of two App nodes, and one DB node
Most tests are great. PHP 7.4 obviously smokes the old system. But one test, Apache Bench just hitting the home page, shows a pretty serious degradation on the New system. Digging further into this, we created a series of test files for Apache Bench to hit that isolate specific things: static HTML, raw PHP, and PHP + raw pg_ commands and queries.
All of them thrash the old system except the last: DB interactions. So, drilling down further, we created a new target file that simply does a pg_connect() and immediately closes it, then hit that with Apache Bench. Same result as the query test script, suggesting the deficit is in the act of pg_connect() itself: database connections.
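For reference, a minimal sketch of the kind of target file meant here (the connection parameters are placeholders, not ours):

<?php
// pg_connect_test.php - Apache Bench target that does nothing except open
// and immediately close a Postgres connection.
$start = microtime(true);

$conn = pg_connect('host=db.internal dbname=db_name user=db_user password=secret');
if ($conn === false) {
    http_response_code(500);
    echo "connect failed\n";
    exit;
}
pg_close($conn);

// Echo the connect+close time so it can be cross-checked against ab's figures.
printf("connect+close: %.4fs\n", microtime(true) - $start);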
Clearly there's a lot of variables in this issue. And I'm loath to pack this question out with a bunch of details about the tests (and other tests like pgbench) until needed. I'm more hoping to stumble across someone who has observed a similar issue with pg_connect() specifically, and has even the most random idea for an angle to test.
Other testing ideas would be welcome. One thing we are trying to work out right now is how to get connection timing out of pgbench, to see whether the issue exists on Postgres itself. Our tests are as follows:
Apache Bench hitting several target pages
PGBench (showing the new system performing better than the old)
PHP test
JMeter is being set up now
UPDATE
We've done a bit of manual digging in the Postgres logs in each environment, and we see something a bit strange when looking at received/authorized log entry pairs like this:
2021-05-19 11:34:25.708 NZST,,,404734,"10.220.218.21:50560",60a44f01.62cfe,1,"",2021-05-19 11:34:25 NZST,,0,LOG,00000,"connection received: host=xx.xx.xx.xx port=50560",,,,,,,,,""
2021-05-19 11:34:25.908 NZST,"db_user","db_name",404734,"xx.xx.xx.xx:50560",60a44f01.62cfe,2,"authentication",2021-05-19 11:34:25 NZST,7/486524,0,LOG,00000,"connection authorized: user=db_user database=db_name",,,,,,,,,""
I captured a bunch of these pairs in both environments (Old and New) and calculated the gap between the received log message and the authorized log message.
On Old, the average is 0.011s. On New, the average is 0.195s. The difference aside, this makes no sense, as the test page on the application node of that environment takes ~0.02s to complete in full.
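For anyone wanting to reproduce the measurement, a rough sketch of pairing the entries up by session id (it assumes csvlog output; the log path is a placeholder):

<?php
// gap_from_csvlog.php - pair "connection received" / "connection authorized"
// entries by session id and report the average gap between them.

function toSeconds($stamp) {
    // e.g. "2021-05-19 11:34:25.708 NZST" - strtotime gives whole seconds,
    // the fractional part is added back by hand to keep millisecond resolution.
    if (!preg_match('/^(\S+ \S+?)(\.\d+)? (\S+)$/', $stamp, $m)) {
        return null;
    }
    return strtotime($m[1] . ' ' . $m[3]) + (float) ('0' . $m[2]);
}

$received = array();
$gaps     = array();

foreach (file('/var/log/postgresql/postgresql.csv') as $line) {
    $f = str_getcsv($line);
    if (count($f) < 14) {
        continue;                        // not a csvlog row
    }
    $sessionId = $f[5];                  // e.g. 60a44f01.62cfe
    $message   = $f[13];

    if (strpos($message, 'connection received') === 0) {
        $received[$sessionId] = toSeconds($f[0]);
    } elseif (strpos($message, 'connection authorized') === 0 && isset($received[$sessionId])) {
        $gaps[] = toSeconds($f[0]) - $received[$sessionId];
    }
}

if ($gaps) {
    printf("pairs: %d, average received->authorized gap: %.3fs\n",
           count($gaps), array_sum($gaps) / count($gaps));
}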
Related
We try to deploy APC user-cache in a high load environment as local 2nd-tier cache on each server for our central caching service (redis), for caching database queries with rarely changing results, and configuration. We basically looked at what Facebook did (years ago):
http://www.slideshare.net/guoqing75/4069180-caching-performance-lessons-from-facebook
http://www.slideshare.net/shire/php-tek-2007-apc-facebook
It works pretty well for some time, but after some hours under high load APC runs into problems and the whole mod_php stops executing any PHP at all.
Even a simple PHP script with only an echo in it does not answer anymore, while static resources are still delivered by Apache. It does not really crash; there is no segfault. We tried the latest stable and the latest beta of APC, we tried pthreads and spin locks, and every time it's the same problem. We gave APC far more memory than it can ever consume; 1 minute before a crash we have 2% fragmentation and about 90% of the memory is free. When it "crashes" we find nothing in the error logs, and only restarting Apache helps. Only with spin locks do we get a PHP error, which is:
PHP Fatal error: Unknown: Stuck spinlock (0x7fcbae9fe068) detected in
Unknown on line 0
This seems to be a kind of timeout, which does not occur with pthreads, because those don’t use timeouts.
What’s happening is probably something like that:
http://notmysock.org/blog/php/user-cache-timebomb.html
Some numbers: a server has about 400 APC user-cache hits per second and about 30 inserts per second (which is a lot, I think), and one request makes about 20-100 user-cache requests. There are about 300,000 variables in the user-cache, all with a TTL (we store without a TTL only in our central Redis).
Our APC-settings are:
apc.shm_segments=1
apc.shm_size=4096M
apc.num_files_hint=1000
apc.user_entries_hint=500000
apc.max_file_size=2M
apc.stat=0
Currently we are using version 3.1.13-beta compiled with spin locks, with an old PHP 5.2.6 (it's a legacy app; I've heard this PHP version could be a problem too?), on 64-bit Linux.
It's really hard to debug. We have written monitoring scripts which collect as much data as we can get every minute from APC, the system, etc., but we cannot see anything uncommon, even 1 minute before a crash.
I've seen a lot of similar problems here, but so far we haven't found a solution that solves our problem. And when I read something like this:
http://webadvent.org/2010/share-and-enjoy-by-gopal-vijayaraghavan
I'm not sure that going with APC for a local user-cache is the best idea in high-load environments. We have already worked with memcached here, but APC is a lot faster. But how do we get it stable?
best regards,
Andreas
Lesson 1: https://www.kernel.org/doc/Documentation/spinlocks.txt
The single spin-lock primitives above are by no means the only ones. They
are the most safe ones, and the ones that work under all circumstances,
but partly because they are safe they are also fairly slow. They are slower
than they'd need to be, because they do have to disable interrupts
(which is just a single instruction on a x86, but it's an expensive one -
and on other architectures it can be worse).
That's written by Linus ...
Spin locks are slow; that assertion is not based on some article I read online by facebook, but upon the actual facts of the matter.
It's also an incidental fact that spinlocks deployed at levels higher than the kernel run into the very problems you speak of: untraceable deadlocks caused by a bad implementation.
They are used by the kernel efficiently, because that's where they were designed to be used, locking tiny tiny tiny sections, not sitting around and waiting for you to copy your amazon soap responses into apc and back out a billion times a second.
The most suitable kind of locking (for the web, not the kernel) available in APC is definitely rwlocks; you have to enable rwlocks with a configure option in legacy APC, and they are the default in APCu.
The best advice that can be given, and I already gave it, is: don't use spinlocks; if mutexes are causing your stack to deadlock, then try rwlocks.
Before I continue: your main problem is that you are using a version of PHP from antiquity, which nobody even remembers how to support. In general you should look to upgrade; I'm aware of the constraints on the OP, but it would be irresponsible to neglect to mention that this is a real problem. You do not want to deploy on unsupported software. Additionally, APC is all but unmaintained and destined to die. OPcache (O+) and APCu are its replacements in modern versions of PHP.
Anyway, I digress ...
Synchronization is a headache when you are programming at the level of the kernel, with spinlocks or whatever. When you are several layers removed from the kernel, and you rely on 6 or 7 bits of complicated software underneath you synchronizing properly so that your code can synchronize properly, synchronization becomes not only a headache for the programmer but for the executor too; it can easily become the bottleneck of your shiny web application, even if there are no bugs in your implementation.
Happily, this is the year 2013, and Yahoo aren't the only people able to implement user caches in PHP :)
http://pecl.php.net/package/yac
This is an extremely clever lockless cache for userland PHP. It's marked as experimental, but once you are finished, have a play with it; maybe in another 7 years we won't be thinking about synchronization issues :)
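If you do try it, basic usage is along these lines (a sketch built on the Yac class the extension exposes; the key, values and TTL below are made up):

<?php
// Requires the yac PECL extension to be loaded.
$cache = new Yac();

$key   = 'config:feature_flags';          // illustrative key
$value = $cache->get($key);

if ($value === false) {
    // Miss: rebuild from the primary store (placeholder data here)
    // and cache it locally with a short TTL.
    $value = array('new_checkout' => true, 'beta_search' => false);
    $cache->set($key, $value, 300);       // 300 second TTL
}

var_dump($value);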
I hope you get to the bottom of it :)
Unless you are on a FreeBSD-derived operating system, it is not a good idea to use spinlocks; they are the worst kind of synchronization on the face of the earth. The only reason you must use them on FreeBSD is that the implementer neglected to include PTHREAD_PROCESS_SHARED support for mutexes and rwlocks, so you have little choice but to use the pgsql-inspired spin lock in that case.
Our setup is as follows:
Primary DB Server, Amazon EC2 m2.xlarge instance (17GB ram, 2x3.25ecu CPU) running Percona 5.5.x
Application Server(s), Amazon EC2 m1.large instance (7.5GB ram, 2x2ecu CPU) running PHP 5.4
php-handlersocket PECL library found here http://code.google.com/p/php-handlersocket/
For the most part it works, but as soon as I load up the app server with even relatively modest traffic, the results start failing on queries where the result record(s) have fields with medium to large values. The two main culprits in our case are XML strings that are ~5Kb, and media files stored as binary objects of 5-500Kb. The symptom is that if I request 10 fields and the XML is in the 8th field, I'll get 7 results with data, the 8th will be empty, and 9 and 10 are not included at all.
There is a reported issue for the php-handlersocket library relating to this kind of problem, and there's also a proposed fix, which I've implemented; I thought it helped, but it seems not entirely. The issue details and fix are here: http://code.google.com/p/php-handlersocket/issues/detail?id=28
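For context, the read path looks roughly like this (the database, table and column names below are placeholders, not our real schema):

<?php
// Connect to HandlerSocket's read port and open an index over the columns
// we want back. Under load it is the large xml_payload / media_blob values
// that come back empty or go missing entirely.
$hs = new HandlerSocket('127.0.0.1', 9998);   // loose_handlersocket_port

if (!$hs->openIndex(1, 'app_db', 'media', HandlerSocket::PRIMARY, 'id,title,xml_payload,media_blob')) {
    die($hs->getError() . PHP_EOL);
}

// Fetch a single row by primary key: index 1, '=' comparison, limit 1, offset 0.
$rows = $hs->executeSingle(1, '=', array('12345'), 1, 0);
var_dump($rows);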
My HandlerSocket settings are only slightly different from the defaults; should I be setting these differently?
loose_handlersocket_port = 9998
loose_handlersocket_port_wr = 9999
loose_handlersocket_threads = 4
loose_handlersocket_threads_wr = 1
open_files_limit = 65535
I've reduced the read threads to 4, down from the default of 16, since they recommend CORES * 2. I thought slower responses would be better than none at all, but this didn't seem to make a difference.
The php-handlersocket project looks to be dead, which on its own is a bit surprising; the last source updates were more than a year ago, but there doesn't seem to be any other PHP library available, so I'm stuck.
I'm wondering if anyone has had similar problems, if there are other libraries available or if I should be exploring skipping libraries and creating my own interface with something like CURL.
So in the end, as with many problems, I became wary of an open source project with limited attention and decided to write my own PHP sockets-based solution, which is working perfectly.
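To give an idea of the approach (not the production code): HandlerSocket speaks a line-based, tab-separated protocol, so a bare-bones read can be done with fsockopen. The sketch below skips the byte escaping a real client has to handle for binary values, and all names and ports are placeholders:

<?php
// Minimal raw-socket conversation with HandlerSocket's read port.
$fp = fsockopen('127.0.0.1', 9998, $errno, $errstr, 2);
if (!$fp) {
    die("connect failed: $errstr\n");
}

// Open index 1 on app_db.media via the primary key, selecting three columns.
fwrite($fp, implode("\t", array('P', 1, 'app_db', 'media', 'PRIMARY', 'id,title,xml_payload')) . "\n");
echo 'open_index: ' . fgets($fp);   // "0<TAB>1" means OK

// Single-row lookup: index 1, operator '=', 1 key value, limit 1, offset 0.
fwrite($fp, implode("\t", array(1, '=', 1, '12345', 1, 0)) . "\n");
echo 'find: ' . fgets($fp);         // "0<TAB><numcols><TAB>values..."

fclose($fp);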
A product is a CMS under constant development, and new feature patches are always being rolled out. The product is hosted on multiple servers; for some of them I just have FTP credentials and access to the DB, and for others I have full root access. Now, as soon as a new feature is released, after testing and all, I have to manually FTP files and run SQL queries on all the servers. This is very time consuming, error prone and inefficient. How can I make it more robust and foolproof, or automate parts of it? The CMS is based on PHP, MySQL, JS, HTML and CSS. The admin part is common to all. Skins and some custom modules are developed for different clients, and the only part we update is admin.
Update
For managing code we use Git; SQL is not part of this Git structure, and I will be talking to the product manager/owners about getting it under version control.
This is one of the big questions of unpackaged code.
Personally, I have a PHAR which, when executed, extracts the code to the appropriate folder and executes the needed queries.
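A trimmed-down sketch of that idea (every path, credential and file layout here is illustrative, and it assumes one SQL statement per patch file):

<?php
// deploy.php - the stub that runs when the deploy PHAR is executed.
$target = '/var/www/site';                    // placeholder web root

// Extract every file packaged in this PHAR over the target directory.
$phar = new Phar(Phar::running(false));
$phar->extractTo($target, null, true);        // true = overwrite existing files

// Run the release's SQL patches in order (one statement per file assumed).
$pdo = new PDO('mysql:host=localhost;dbname=app', 'deploy_user', 'secret');
foreach (glob($target . '/sql/*.sql') as $patch) {
    $pdo->exec(file_get_contents($patch));
}

echo "deployed\n";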
I've run web deployments across dozens of servers handling hundreds of millions of visitors/month.
SQL change management is always going to be a beast. Your only hope is either rolling your own in house (what I did) or using something like EMS DB Comparer.
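Rolling your own can start as small as a tracking table plus a runner script. A minimal sketch of the idea (not the utility I actually built; table name, directory and credentials are made up, and it assumes one statement per change file):

<?php
// migrate.php - apply any SQL change scripts that haven't run on this server yet.
$pdo = new PDO('mysql:host=localhost;dbname=app', 'deploy_user', 'secret');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Track which change scripts have already been applied here.
$pdo->exec('CREATE TABLE IF NOT EXISTS schema_changes (
                filename   VARCHAR(255) PRIMARY KEY,
                applied_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
            )');

$applied = $pdo->query('SELECT filename FROM schema_changes')->fetchAll(PDO::FETCH_COLUMN);

foreach (glob(__DIR__ . '/sql/*.sql') as $file) {
    $name = basename($file);
    if (in_array($name, $applied)) {
        continue;                              // already run on this server
    }
    $pdo->exec(file_get_contents($file));      // one statement per file assumed
    $stmt = $pdo->prepare('INSERT INTO schema_changes (filename) VALUES (?)');
    $stmt->execute(array($name));
    echo "applied $name\n";
}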
To handle the file syncing, you're going to need a number of tools, all expertly crafted to work together, including:
Source code version control (bzr, svn, etc.), that is properly branched (stable branch, dev branch, testing branch are required),
A Continuous Integration server,
SFTP support on every server,
Hopefully unit and integrative tests to determine build quality,
Rsync on every server,
Build scripts (I do these in Phing),
Deployment scripts (in Phing as well),
Know-how.
The general process takes approximately 20 hours to research thoroughly and about 40 hours to set up. When I drew up my documentation, there were over 40 distinct steps involved. Creating the SQL change management utility took another 20-30 hours. Add in another 10 hours of testing, and you're looking at a 100-120 hour project, but one which will save you considerably in botched deployments in the future as well as reduce deployment time to the time it takes to click a button.
That's why I do consultations on setting up this entire process, and it usually takes ~5 hours to set up on a client's network.
Running WordPress on IIS 7 (Windows Server 2008) with WP-SuperCache as per IIS.net's guide.
It was running great, but recently we changed the permissions on some folders and the administrator password, and now we're getting huge spikes in our CPU usage as a result of the php-cgi.exe processes.
This leads me to believe it's not caching; however, the pages themselves have the "Cached with WP-SuperCache" comments at the bottom, and the caching seems to be working correctly.
What else could be the issue here?
I think I may have found a solution, or at least a workaround, to this problem; it seems to be working reliably for me, at any rate.
Try setting the Max Instances setting, under IIS Server --> FastCGI Settings, to 1.
It seemed to me that only certain requests were causing a php-cgi.exe process to go rogue and hog the CPU, usually when updating a post. While reading other posts on this issue, I saw one that mentioned the Max Instances setting and that it defaults to 0, or automatic. I wondered if this might not have a good effect when things aren't as they should be. I'm guessing (though this isn't quite my field of expertise) that certain requests cause the process to lock up, so FastCGI just creates another while leaving the first in place. Somehow having only a single instance seems to let PHP move on from the lock-up, and the CPU stays under control.
For servers with high levels of requests, setting FastCGI to only a single instance may not be ideal, but it certainly beats the delays I was getting before. Used in combination with WP-SuperCache and WinCache, things seem to be nipping along nicely now.
Looking at that task manager, it looks like it's missing the cache on every request. Plus, that article dates to 2008, so it's difficult to say whether the directions as written would still work. Something in WP-SuperCache could have changed.
I would recommend using W3 Total Cache. I've done extensive testing with it on Windows Server 2008 and IIS 7 and it works great. It is also compatible with, and leverages, the WinCache extension for PHP. It has some other great features too if you're interested: minification, CDN support, etc. It's a really great performance plugin for WordPress. You can get the plugin here: http://wordpress.org/extend/plugins/w3-total-cache/
some other things to check...
What size is the app pool? (# of processes?)
Make sure you are using PHP 5.3.
Make sure you are using WinCache.
Make sure to set MaxInstanceRequests to something less than PHP_FCGI_MAX_REQUESTS. Definitely do not allow PHP to handle recycling the app pool; the default is 10K requests. If you are seeing these results during a load test, then this might be the cause. Increase MaxInstanceRequests, keeping it one less than PHP_FCGI_MAX_REQUESTS.
Hope that helps.
I'm writing some testing code on a Drupal 6 project, and I can't believe how slow these tests seem to be running, after working with other languages and frameworks like Ruby on Rails or Django.
Drupal.org thinks this question is spam and won't give me a way to prove I'm human, so I figured SO is the next best place to ask a question like this and get a sanity check on my approach to testing.
The following test code in this gist is relatively trivial.
http://gist.github.com/498656
In short I am:
creating a couple of content types,
creating some roles,
creating users,
creating content as the users,
checking if the content can be edited by them,
checking if it's visible to anonymous users
And here's the output when I run these tests from the command line:
Drupal test run
---------------
Tests to be run:
- (ClientProjectTestCase)
Test run started: Thu, 29/07/2010 - 19:29
Test summary:
-------------
ClientProject feature 52 passes, 0 fails, and 0 exceptions
Test run duration: 2 min 9 sec
I'm trying to run tests like this every time before I push code to a central repo, but if it's taking this long this early in the project, I dread to think what it will be like further down the line when we have ever more test cases.
What can I do to speed this up?
I'm using a MacbookPro with:
4gb of ram,
2.2ghz Core 2 Duo processor,
PHP 5.2,
Apache 2.2.14, without any opcode caching,
Mysql 5.1.42 (Innodb tables are my default)
A 5400 RPM laptop hard drive
I understand that in the examples above I'm bootstrapping Drupal each time, and that this is a very expensive operation, but this isn't unheard of in other frameworks like Ruby on Rails or Django, and I don't understand why it's averaging out at a little over a minute per test case on this project.
There's a decent list of tricks here for speeding up Drupal 7 testing, many of which look like they'd apply to Drupal 6 as well, but I haven't had a chance to try them yet, and it would be great to hear how these have worked out for others before I blunder down further blind alleys.
What has worked for you when you've been working with Drupal 6 in this situation, and where are the quick wins for this?
One minute per test case, when I'm expecting easily more than a hundred test cases, feels insane.
It looks like the biggest increase in speed will come from running the test database in a ram disk, based on this post here on Performance tuning tips for Drupal 7 testing on qa.drupal.org
DamZ wrote a modified mysql init.d script for /etc/init.d/mysql on Debian 5 that runs MySQL databases entirely out of tmpfs. It's at http://drupal.org/files/mysql-tmpfs.txt, attached to http://drupal.org/node/466972.
It allowed the donated dual quad core machine to go from a 50 minute test run with huge disk I/O on InnoDB to somewhere under 3 minutes per test. It's live as #32 on PIFR v1 for testing.d.o right now. It is certainly the only way to go.
I have not tried it on InnoDB and won't be anytime soon; if anyone wants to omit the skip-innodb step below and try it on tmpfs, go ahead.
There are also some instructions here for creating a ram disk on OS X, although they are for moving your entire stock of MySQL databases onto a ram disk instead of just a single database:
Update - I've tried this approach now with OS X, and documented what I've found
I've been able to cut 30-50% from the test times by switching to a ram disk. Here are the steps I've taken:
Create a ram disk
I've chosen a gigabyte mainly because I've got 4gb of RAM, and I'm not sure how much space I might need, so I'm playing it safe:
diskutil erasevolume HFS+ "ramdisk" `hdiutil attach -nomount ram://2048000`
Set up mysql
Next I ran the mysql install script to get mysql installed on the new ramdisk:
/usr/local/mysql/scripts/mysql_install_db \
--basedir=/usr/local/mysql \
--datadir=/Volumes/ramdisk
Then I made sure the previous mysqld was no longer running, and started the mysql daemon, telling it to use the ram disk as the data directory rather than the default location:
/usr/local/mysql/bin/mysqld \
--basedir=/usr/local/mysql \
--datadir=/Volumes/ramdisk \
--log-error=/Volumes/ramdisk/mysql.ramdisk.err \
--pid-file=/Volumes/ramdisk/mysql.ramdisk.pid \
--port=3306 \
--socket=/tmp/mysql_ram.sock
Add the database for testing
I then pulled down the latest database dump from our staging site with drush, before updating settings.php to point at the new database:
drush sql-dump > staging.project.database.dump.sql
Next was to get this data into the local testing setup on the ram disk. This involved creating a symlink to the ramdisk database socket, and creating the database, granting rights to the mysql user specified in the drupal installation, then loading the database in to start running tests. Step by step:
Creating the symlink - this is because the mysql command by default looks for /tmp/mysql.sock, and symlinking that to our short-term ram disk socket was simpler than constantly changing php.ini files:
ln -s /tmp/mysql_ram.sock /tmp/mysql.sock
Creating the database (from the mysql prompt):
CREATE DATABASE project_name;
GRANT ALL PRIVILEGES ON project_name.* TO 'db_user'@'localhost' IDENTIFIED BY 'db_password';
Loading the content into the new database...
mysql project_name < staging.project.database.dump.sql
Run the tests on the command line
...and finally running the test from the command line, and using growlnotify to tell me when tests have finished
php ./scripts/run-tests.sh --verbose --class ClientFeatureTestCase testFeaturesCreateNewsItem ; growlnotify -w -m "Tests have finished."
Two test cases still take around a minute and a half, which is still unusably slow: orders of magnitude slower than other frameworks I've used before.
What am I doing wrong here?
This can't be the standard way of running tests with Drupal, but I haven't been able to find any stats on how long I should expect a test suite to take with Drupal to tell me otherwise.
The biggest issue with Drupal SimpleTests is that it takes a long time to install Drupal, and that's done for every test case.
So use simpletest_clone -- basically, dump your database fresh after installation and it lets you use that dump as the starting point for each test case rather than running the entire installer.
I feel your pain, and your observations are spot on. A suite that takes minutes to run is a suite that inhibits TDD. I've resorted to plain PHPUnit tests run on the command line, which run as fast as you'd expect coming from a Rails environment. The real key is to get away from hitting the database at all; use mocks and stubs.
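To make that concrete, here is a sketch of the kind of database-free test meant (NewsItemPublisher and NewsItemRepository are hypothetical stand-ins for your own code, not Drupal APIs):

<?php
// NewsItemPublisherTest.php - run with: phpunit NewsItemPublisherTest.php
// Nothing here bootstraps Drupal or touches MySQL; the repository is mocked.

interface NewsItemRepository
{
    public function findVisibleTo($role);
}

class NewsItemPublisher
{
    private $repository;
    public function __construct(NewsItemRepository $repository) { $this->repository = $repository; }
    public function listFor($role) { return $this->repository->findVisibleTo($role); }
}

class NewsItemPublisherTest extends PHPUnit_Framework_TestCase
{
    public function testAnonymousUsersOnlyGetVisibleItems()
    {
        // Stub the repository instead of querying a real database.
        $repository = $this->getMock('NewsItemRepository');
        $repository->expects($this->once())
                   ->method('findVisibleTo')
                   ->with('anonymous')
                   ->will($this->returnValue(array(array('title' => 'Hello', 'status' => 1))));

        $publisher = new NewsItemPublisher($repository);
        $items = $publisher->listFor('anonymous');

        $this->assertEquals(1, count($items));
    }
}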