Get unique worker/thread/process/request ID in PHP


In multi-threaded environments (like most web platforms) I often include some sort of thread ID in the logs of my apps. This lets me tell exactly which log entry came from which request/thread when multiple requests are writing to the same log simultaneously.
In .NET/C#, this can be done by the formatters of log4net, which by default include the current thread's ManagedThreadId (a number) or Name (a given name). These properties uniquely identify a thread (see for example: How to log correct context with Threadpool threads using log4net?).
In PHP, I have not found anything similar (I asked Google, PHP docs and SO). Does it exist?

Up until recently, I used apache_getenv("UNIQUE_ID"), and it worked perfectly with a crc32 or another hash function.
Nowadays I'm just using the following, in order to remove dependency on Apache and this mod.
$uniqueid = sprintf("%08x", abs(crc32($_SERVER['REMOTE_ADDR'] . $_SERVER['REQUEST_TIME'] . $_SERVER['REMOTE_PORT'])));
It's unique enough to understand which logs belong to which request. If you need more precision, you can use other hash functions.
Hope this helps.

zend_thread_id():
int zend_thread_id ( void )
This function returns a unique identifier for the current thread.
Although:
This function is only available if PHP has been built with ZTS (Zend Thread Safety) support and debug mode (--enable-debug).
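You could also try to call mysql_thread_id() when you use that API for your database access (or mysqli::$thread_id when using mysqli). Note that the old mysql extension was removed in PHP 7, so only the mysqli variant is available on current versions.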

PHP does not seem to have a function for this available, but your web server might be able to pass the identifier via environment variables. There is for example an Apache module called "mod_unique_id"[1] which generates a unique identifier for each request and stores it in the environment variable UNIQUE_ID. If the variable is present, it should be visible via $_SERVER['UNIQUE_ID'] [2]
A "pure PHP" solution could be to write a script that generates a suitable random identifier, stores it via define("unique_id", val), and is then included in every script that executes by means of the auto_prepend_file [3] option in php.ini. This way the unique ID is created when the request starts processing and stays available for the rest of the request.
[1] http://httpd.apache.org/docs/current/mod/mod_unique_id.html
[2] http://forums.devshed.com/php-development-5/server-unique-id-questions-163269.html
[3] http://www.php.net/manual/en/ini.core.php#ini.auto-prepend-file
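The auto_prepend_file idea can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: the file name request_id.php, the function make_request_id() and the constant REQUEST_ID are all illustrative names that do not come from the answer, and it assumes PHP 7 or later for random_bytes().

```php
<?php
// request_id.php: hypothetical file name, meant to be loaded via auto_prepend_file.

/**
 * Return an ID for the current request: reuse Apache's UNIQUE_ID when
 * mod_unique_id provides it, otherwise generate 16 random hex characters.
 */
function make_request_id(array $server): string
{
    if (!empty($server['UNIQUE_ID'])) {
        return (string) $server['UNIQUE_ID'];
    }
    return bin2hex(random_bytes(8));
}

// Define the constant once per request; REQUEST_ID is an illustrative name.
if (!defined('REQUEST_ID')) {
    define('REQUEST_ID', make_request_id($_SERVER));
}
```

With auto_prepend_file pointing at this file in php.ini, every script could then write REQUEST_ID into its log lines without depending on Apache.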

I've seen getmypid() used for this purpose, but it seems to behave differently on different systems. In some cases the ID is unique to each request, but on others (for example, where one worker process is reused for many requests) it's shared.
So, you're probably better off going with one of the other answers to ensure portability.
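If the process ID is still useful in the logs, one way around the sharing problem is to pair it with a per-request random suffix. The sketch below assumes PHP 7 or later; request_tag() is a made-up helper name, not an existing PHP function.

```php
<?php
/**
 * Build a log tag from the worker PID plus a random suffix.
 * The static variable keeps the suffix stable for the rest of the request;
 * PHP resets static variables when the next request starts.
 */
function request_tag(): string
{
    static $suffix = null;
    if ($suffix === null) {
        $suffix = bin2hex(random_bytes(4));
    }
    return sprintf('%d-%s', getmypid(), $suffix);
}
```

Two requests served by the same worker would then share the PID part but differ in the suffix.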

Assigning an ID in order to identify logged data from serving a request probably is as simple as creating a UUID version 4 (random) and writing it to every line of the log.
There even is software helping with that: ramsey/uuid, php-middleware/request-id
Adding it to every line of logging is easy when using log4php: put the UUID into the LoggerMDC data and use an appropriate LogFormatter. With PSR-3 loggers, it might be a bit more complicated, YMMV.
A randomly created UUID will be suitable to identify one single request, and by using that UUID in the HTTP headers of sub requests and in the response, it will even be possible to trace one request across multiple systems and platforms inside the server farm. However, putting it as a header is not the task of any of the packages I mentioned.
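For illustration, a version 4 UUID can also be produced without any package. The following is a sketch assuming PHP 7 or later; uuid_v4() and format_log_line() are hypothetical helper names, and the bracketed log format is only an example. In a real project the ramsey/uuid package mentioned above is the more robust choice.

```php
<?php
/** Generate a random (version 4) UUID as described in RFC 4122. */
function uuid_v4(): string
{
    $bytes = random_bytes(16);
    // Set the version field to 4 (upper nibble of byte 6).
    $bytes[6] = chr((ord($bytes[6]) & 0x0f) | 0x40);
    // Set the variant field to binary 10xx (upper bits of byte 8).
    $bytes[8] = chr((ord($bytes[8]) & 0x3f) | 0x80);
    // 32 hex characters split into 8 groups of 4, joined as 8-4-4-4-12.
    return vsprintf('%s%s-%s-%s-%s-%s%s%s', str_split(bin2hex($bytes), 4));
}

/** Prefix a log message with the request ID; the format is illustrative. */
function format_log_line(string $requestId, string $message): string
{
    return sprintf('[%s] %s', $requestId, $message);
}
```

Generating the UUID once at the start of the request and passing it to every log call keeps all lines of one request searchable by a single value.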

Related

PHP session id in cookie and storage

I've read multiple comments about encrypting PHP session data, in case it is stored in a temp directory that is available on multiple accounts on a shared server. However, even if the data is encrypted, session_start() still generates filenames containing the session_id. For example, sess_uivrkk2c5ksnv2hnt5rc8tvgi5, where uivrkk2c5ksnv2hnt5rc8tvgi5 is the same session id I found in the cookie my browser received.
How is this problem typically addressed / could someone point me to an example? All of the simple examples I've found only address encrypting the data, not changing the filename.
Just to see what would happen, I made a SessionHandler wrapper that would do an MD5 hash on the $session_id variable before passing it on to its parent function, but that did not work. Instead, I ended up with two files: a blank one (with session_id as a part of its name) and a full one (with an MD5'ed session_id). Also, there was the problem of close() not accepting session_id as a parameter, so I couldn't pass it on to its parent.
EDIT: I'm learning about PHP sessions; this isn't for a live commercial site, etc.

Yes, in some scenarios (i.e. a very incompetently configured server - although these do unfortunately exist) on a shared server your session data may be readable by other people. Trying to hide the session files by changing their names serves no useful purpose - this is described as "Security through Obscurity". Go and Google the phrase - it is usually described as an oxymoron.
If your question is how do you prevent other customers accessing your session data on a badly configured server then the sensible choices (in order of priority) are:
1. switch service provider
2. use a custom session handler to store the data somewhere secure (e.g. database). There are lots of examples on the web - quality varies
3. use a custom session handler to encrypt the data and use file storage. Again you don't need to write the code yourself - just scrutinize any candidates
If you want to find out if your provider might be a culprit, just have a look at the value of __FILE__. Does it look as if you have access to the root filesystem? Write a script which tries to read from outside your home directory. If you can't, then the provider may have set an open_basedir restriction (it is possible to get around this - again Google will tell you how).
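The check described above could look something like this. It is only a sketch: probe_isolation() is a made-up name, and /etc/passwd is an assumed example of a file outside your home directory on a Unix-like host; substitute any path you should not be able to read.

```php
<?php
/**
 * Report where this script lives, which open_basedir restriction applies,
 * and whether a path outside the account's own tree is readable.
 * $probe is an assumption: any file you should NOT be able to read will do.
 */
function probe_isolation(string $probe): array
{
    return [
        'script'         => __FILE__,
        'open_basedir'   => (string) ini_get('open_basedir'),
        // @ suppresses the warning PHP emits when open_basedir blocks the check.
        'probe_readable' => @is_readable($probe),
    ];
}

print_r(probe_isolation('/etc/passwd'));
```

An empty open_basedir together with a readable probe file suggests the host does little to separate accounts at the PHP level.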

How works (and using) RESTful's caching

I developed a really small REST API (using PHP), which provides information about users (it can also update and create users, but that doesn't matter for this question). Here are the available calls (JSON output, by the way):
/api/users/54216
/api/users/54216?fields=id,name
/api/users/54216/photos
54216 is an example user id.
Until now, I have only used caching to save rendered HTML pages, which is not complicated; I have never cached raw data.
How should I cache these calls, and how do I use the cache afterwards? My goal (I think) is to save the data to a JSON file once every X minutes and, when needed, read the cached file and decode it.
In addition, how would you recommend caching a specific user's information? Call no. 1 outputs all the information and call no. 2 outputs only specific fields, and I don't want to use two cache files because that is inefficient.
This is my first time working with cached JSON data and a REST API, so I am very confused.
EDIT:
I am talking about server-side caching.

I suggest you read up on HTTP caching:
The first important principle is to understand how HTTP caching works. There are basically two parts: TTL (Cache-Control) and stale check (ETag). When a resource is generated by an origin server you need to think of it as gone. You no longer have control over it; you only get to make suggestions to the client about what to do with it. The two mechanisms you have are the TTL (how long the client should keep the object in cache before checking back) and the stale check (a version of the resource that was returned), which can be sent with a new GET request to the origin server to say "Hey, I have this version, is it still good?" This gives the origin server the opportunity to say "yep, keep using that one" and provide a new TTL, if it is still valid.
You need to use these two controls in different ways to get the effects you want. For instance, when serving files that will never change (like the CSS for a build) you can set a really long TTL and no ETag. For something that doesn't change very often, but whose changes need to show up quickly (like the party members on a reservation), you would set a low TTL (like 1 minute) and an ETag. In this second example the low TTL of 1 minute keeps bursts from clients from overwhelming the origin server (scale), and the ETag allows the origin server to skip the construction of the reservation object, if it has a way to verify what the current valid ETag is faster than constructing the entire reservation. Another example would be something that doesn't change often and, when it does, can propagate slowly (like a user's ad recommendation profile). You can set a higher TTL (like 6 hours) and not worry so much about an ETag (although it would still be useful).
Ref: https://groups.google.com/d/msg/api-craft/YJMH0XMQJIM/HtdAPEXbQLMJ
Or, if you want to cache on server side, have a look at memcached (tutorial)
And also look at Reverse Proxy cache solutions like varnish etc.
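For the server-side part of the question, a file-based cache with a TTL is one simple starting point before moving to memcached or Varnish. The sketch below is an assumption-laden example, not a recommended library: cached_json() and select_fields() are hypothetical helpers, and the cache directory must already exist. It caches the full user record once and derives the ?fields=... variant from it, so only one cache file per user is needed.

```php
<?php
/**
 * Return the cached value for $key, rebuilding it with $loader when the
 * cache file is missing or older than $ttl seconds.
 */
function cached_json(string $dir, string $key, int $ttl, callable $loader): array
{
    $file = $dir . '/' . sha1($key) . '.json';
    if (is_file($file) && time() - filemtime($file) < $ttl) {
        $data = json_decode((string) file_get_contents($file), true);
        if (is_array($data)) {
            return $data;
        }
    }
    $data = $loader();
    // LOCK_EX avoids two requests interleaving their writes.
    file_put_contents($file, json_encode($data), LOCK_EX);
    return $data;
}

/** Keep only the requested fields, e.g. for ?fields=id,name. */
function select_fields(array $record, string $fields): array
{
    if ($fields === '') {
        return $record;
    }
    return array_intersect_key($record, array_flip(explode(',', $fields)));
}
```

A handler for /api/users/54216?fields=id,name would call cached_json() with a loader that queries the database, then pass the result through select_fields() before encoding the response.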

https://devcenter.heroku.com/articles/ios-network-caching-http-headers
This explains caching from an iOS perspective, but the terms Cache-Control, max-age, ETag and Last-Modified are explained well.
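On the PHP side, those headers might be emitted along these lines. This is a simplified sketch: the helper names are invented for illustration, the ETag is just a hash of the body, and weak validators (the W/ prefix) are not handled.

```php
<?php
/** Build a strong ETag from the response body. */
function make_etag(string $body): string
{
    return '"' . sha1($body) . '"';
}

/** True when the client's If-None-Match header matches the current ETag. */
function is_not_modified(?string $ifNoneMatch, string $etag): bool
{
    if ($ifNoneMatch === null) {
        return false;
    }
    foreach (explode(',', $ifNoneMatch) as $candidate) {
        $candidate = trim($candidate);
        if ($candidate === '*' || $candidate === $etag) {
            return true;
        }
    }
    return false;
}

/** Send the body with caching headers, or an empty 304 when still fresh. */
function send_cacheable(string $body, int $maxAge): void
{
    $etag = make_etag($body);
    header('Cache-Control: max-age=' . $maxAge);
    header('ETag: ' . $etag);
    if (is_not_modified($_SERVER['HTTP_IF_NONE_MATCH'] ?? null, $etag)) {
        http_response_code(304);
        return;
    }
    echo $body;
}
```

Hashing the body still requires building it first, so this saves bandwidth rather than server work; skipping the construction entirely needs a cheaper version marker, such as a last-updated timestamp.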

Static object that is accessible to all users like Application.cfc

I've done a fair bit of PHP over the years but I'm currently learning ColdFusion and have come across the Application.cfc file.
Basically this is a class that's created once (has an expire date). The class handles incoming users and can set session variables and static memory objects, such as queries. For example I can load site wide statistical data for one user in another thread from the Application.cfc. Something that would usually take a few seconds for each page would make the whole site quick and responsive.
Another example (just for clarification).
If I put an incremental variable that's set to 0 in OnApplicationStart this variable can be incremented with each user request (multiple users) or in OnSessionStart without the need to contact the SQL database since it's constantly in the server's memory under this application.
I was wondering if PHP has a similar file or object? Something that can be created once and used to store temporary variables?

The PHP runtime itself initializes the environment from scratch on every HTTP request, so it has no built-in mechanism to do this. Of course you can serialize anything into common storage and then read it back and deserialize on each request, but this is not the same as keeping it in-memory.
This type of functionality in PHP is achieved by outsourcing to other programs; memcached and APC are two of the most commonly used programs that offer such services, and both come with PHP extensions that simplify working with them.
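To make the "common storage" option concrete, here is a sketch of the request counter from the question kept in a plain file. It is not in-memory like Application.cfc; increment_counter() is a hypothetical helper, and with the APCu extension installed apcu_inc() would be the in-memory counterpart.

```php
<?php
/**
 * Increment a counter that survives across requests by keeping it in a file.
 * flock() serialises concurrent requests so no increment is lost.
 */
function increment_counter(string $file): int
{
    $handle = fopen($file, 'c+');   // create if missing, do not truncate
    if ($handle === false) {
        throw new RuntimeException("Cannot open $file");
    }
    flock($handle, LOCK_EX);
    $value = (int) stream_get_contents($handle) + 1;
    ftruncate($handle, 0);
    rewind($handle);
    fwrite($handle, (string) $value);
    fflush($handle);
    flock($handle, LOCK_UN);
    fclose($handle);
    return $value;
}
```

Every request pays for a file lock and a disk write here, which is exactly the overhead that memcached or APC avoid.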

PHP Beginner: Where and how are objects stored?

In an app written in PHP (e.g., a social network), let's say that 10 users (signed-in) are browsing the website.
In PHP code, there is "user" object created to store users data and to pass values to other functions and classes.
Question: When these 10 users go to user.php, which has code to create "user" object, how are these objects stored in memory in PHP? Do they not conflict? Is each one of the "user" objects are uniquely stored in the memory or would one be overwritten by another?
For example, user a visits first so object "user" contains his/her data but when user second visits, the "user" object in memory is overwritten so when first user calls the object, it's the second users data retrieved.
Or, is it unique?
I want to understand object in PHP as a newbie, please explain it simply because none of the web pages I found regarding OOP explains this.

PHP is a CGI application, which means it is started and terminated on each request:
1. a client sends a request to the web server
2. the server starts PHP and passes the request to it
3. PHP allocates a chunk of memory for your script
4. your script is executed, and all objects it creates are stored in that chunk of memory
5. your script generates some HTML, and this HTML is sent to the client
6. the memory is freed and PHP is stopped
If you have 10 client requests coming in at the same time, 10 copies of PHP will be started and 10 independent memory chunks will be used. So, no, objects from different requests do not interfere.
(Note: this explanation is deliberately simplified, there are actually different php setups and persistence options).

The best way to learn this is to install php on a local PC or Mac and then create a php info file
<?php
phpinfo();
?>
... then open it in your browser. This will show you all the PHP settings on your server, among other things.
Regarding the answer to your question, it's a bit more of an advanced topic for a newbie, but PHP sessions are what do the work of keeping user info. They usually work off a session ID which is unique to the user for a small amount of time, and they store the relevant data in memory, on disk as flat files, or in a database (again, see the settings above).
Unfortunately for you, none of this is "automatic": you have to create the scripts to make it happen and behave in the way you want. Asking questions on this site is a good start...

You need to look at object design patterns in relation to php which is quite a big subject in its own right. There is an excellent Apress book called 'PHP Objects, Patterns and Practice' which explains some of the more common patterns and how you might use them and would be a good place to start learning.

The user's information is all stored in a database, and the user object has to retrieve this data each time the page loads.
The object knows which user is looking at the page because of their session_id, which in a nutshell is a random ID given to you and stored in a cookie.
Using the session_id, you can retrieve the correct information from the database.

$_SERVER vs. WSGI environ parameter

I'm designing a site. It is in a very early stage, and I have to make a decision whether or not to use a SingleSignOn service provided by the server. (it's a campus site, and more and more sites are using SSO here, so generally it's a good idea).
The target platform is most probably going to be django via mod_wsgi. However, any documentation provided with this service features php code. This method heavily relies on using custom $_SERVER['HTTPsomething'] variables. Unfortunately, right now I don't have access to this environment.
(How) can I access these custom variables in Django? According to the WSGI documentation, the environ variable should contain as many variables as possible. Can I be sure that I can access them?

In Django, the server environment variables are provided as dictionary members of the META attribute on the request object - so in your view, you can always access them via request.META['foo'] where foo is the name of the variable.
An easy way to see what is available is to create a view containing assert False to trigger an error. As long as you're running with DEBUG=True, you'll see a nice error page containing lots of information about the server status, including a full list of all the request attributes.

To determine the set of variables passed in the raw WSGI environment, before Django does anything to them, put the following code in the WSGI script file in place of your Django stuff.
import io

def application(environ, start_response):
    # Python 3 version: dump the raw WSGI environ, then the request body, as plain text.
    headers = [('Content-Type', 'text/plain; charset=utf-8')]
    start_response('200 OK', headers)
    output = io.StringIO()
    for key in sorted(environ):
        print('%s: %r' % (key, environ[key]), file=output)
    print(file=output)
    # CONTENT_LENGTH may be missing or an empty string.
    length = int(environ.get('CONTENT_LENGTH') or 0)
    body = environ['wsgi.input'].read(length)
    output.write(body.decode('utf-8', 'replace'))
    # WSGI on Python 3 requires the response body to be bytes.
    return [output.getvalue().encode('utf-8')]
It will display back to the browser the set of key/value pairs.
Finding out how the SSO mechanism works is important. If it does the sensible thing, you will possibly find that it sets REMOTE_USER and possibly AUTH_TYPE variables. If REMOTE_USER is set, it is an indicator that the user named in the variable has been authenticated by some higher level authentication mechanism in Apache. These variables would normally be set for HTTP Basic and Digest authentication, but to work with as many systems as possible, an SSO mechanism should also use them.
If they are set, then there is a Django feature, described at:
http://docs.djangoproject.com/en/dev/howto/auth-remote-user/
which can then be used to have Django accept authentication done at a higher level.
Even if the SSO mechanism doesn't use REMOTE_USER, but instead uses custom headers, you can use a custom WSGI wrapper around the whole Django application to translate any custom headers to a suitable REMOTE_USER value which Django can then make use of.

Well, $_SERVER is PHP. You are likely to be able to access the same variables via WSGI as well, but to be sure you need to figure out exactly how the SSO works, so you know what creates these variables (probably Apache) and that you can access them.
Or, you can get yourself access and try it out. :)
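When reading the PHP examples in the SSO documentation, it may help to know that request headers reach both $_SERVER and the WSGI environ (and therefore Django's request.META) under the same CGI-style names. The sketch below shows that mapping in PHP; the header name X-Sso-User is purely hypothetical, as are the helper names.

```php
<?php
/** Map an HTTP header name to the CGI-style key used in $_SERVER and the WSGI environ. */
function header_to_env_key(string $header): string
{
    return 'HTTP_' . strtoupper(str_replace('-', '_', $header));
}

/** Look up a request header in a $_SERVER-like array, or null when absent. */
function request_header(array $server, string $header): ?string
{
    return $server[header_to_env_key($header)] ?? null;
}
```

So a value the PHP samples read as $_SERVER['HTTP_X_SSO_USER'] would be request.META['HTTP_X_SSO_USER'] in a Django view. Content-Type and Content-Length are the exceptions: they appear as CONTENT_TYPE and CONTENT_LENGTH without the HTTP_ prefix. Variables set by an Apache module rather than sent as headers keep their own names.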
