How to execute a long task in the background? - php

I have an app made with Symfony 5, and I have a script that uploads a video located on the server to the logged-in user's channel.
Here's basically the code of my controller:
/**
 * Upload a video to YouTube.
 *
 * @Route("/upload_youtube/{id}", name="api_admin_video_upload_youtube", methods={"POST"}, requirements={"id" = "\d+"})
 */
public function upload_youtube(int $id, Request $request, VideoRepository $repository, \Google_Client $googleClient): JsonResponse
{
    $video = $repository->find($id);
    if (!$video) {
        return $this->json([], Response::HTTP_NOT_FOUND);
    }

    $data = json_decode($request->getContent(), true);

    // Exchange the OAuth authorization code sent by the client for an access token
    $googleClient->setRedirectUri($_SERVER['CLIENT_URL'] . '/admin/videos/youtube');
    $googleClient->fetchAccessTokenWithAuthCode($data['code']);

    $videoPath = $this->getParameter('videos_directory') . '/' . $video->getFilename();

    $service = new \Google_Service_YouTube($googleClient);

    $ytVideo = new \Google_Service_YouTube_Video();
    $ytVideoSnippet = new \Google_Service_YouTube_VideoSnippet();
    $ytVideoSnippet->setTitle($video->getTitle());
    $ytVideo->setSnippet($ytVideoSnippet);

    $ytVideoStatus = new \Google_Service_YouTube_VideoStatus();
    $ytVideoStatus->setPrivacyStatus('private');
    $ytVideo->setStatus($ytVideoStatus);

    // Resumable upload, 1 MB at a time
    $chunkSizeBytes = 1 * 1024 * 1024;
    $googleClient->setDefer(true);
    $insertRequest = $service->videos->insert('snippet,status', $ytVideo);
    $media = new \Google_Http_MediaFileUpload($googleClient, $insertRequest, 'video/*', null, true, $chunkSizeBytes);
    $media->setFileSize(filesize($videoPath));

    $uploadStatus = false;
    $handle = fopen($videoPath, "rb");
    while (!$uploadStatus && !feof($handle)) {
        $chunk = fread($handle, $chunkSizeBytes);
        $uploadStatus = $media->nextChunk($chunk);
    }
    fclose($handle);

    // Report success to the client
    return $this->json([]);
}
This basically works, but the problem is that the video can be very big (10 GB+), so the upload takes a very long time, and Nginx returns a "504 Gateway Timeout" before it is completed.
And anyway, I don't want the user to have to wait for a page to load while the upload is running.
So, instead of running that script immediately, I'm looking for a way to execute it in some kind of background thread, or in an asynchronous way.
The controller returns a 200 to the user; I can tell them that the upload is happening and to come back later to check progress.
How can I do this?

There are many ways to accomplish this, but what you basically want is to decouple the action trigger from its execution.
Simply:
Remove all heavy work from your controller. Your controller should at most just check that the video id provided by the client exists in your VideoRepository.
Exists? Good, then you need to store this "work order" somewhere.
There are many solutions for this, depending on what you have already installed, what technology you feel more comfortable with, etc.
For the sake of simplicity, let's say you have a PendingUploads table with videoId, status, createdAt and maybe userId. The only thing your controller would then do is create a new record in this table (perhaps after checking that the job is not already queued; that kind of detail is up to your implementation).
And then return 200 (or 202, which may be more appropriate in the circumstances).
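For illustration only, here is roughly what the thinned-down controller could look like. The PendingUpload entity and its fields are invented for this sketch; adapt them to whatever schema you actually create:
/**
 * @Route("/upload_youtube/{id}", name="api_admin_video_upload_youtube", methods={"POST"}, requirements={"id" = "\d+"})
 */
public function upload_youtube(int $id, Request $request, VideoRepository $repository, EntityManagerInterface $em): JsonResponse
{
    $video = $repository->find($id);
    if (!$video) {
        return $this->json([], Response::HTTP_NOT_FOUND);
    }

    $data = json_decode($request->getContent(), true);

    // PendingUpload is a hypothetical entity: videoId, status, createdAt, authCode, ...
    $job = new PendingUpload();
    $job->setVideo($video);
    $job->setAuthCode($data['code']); // the background worker will need the OAuth code later
    $job->setStatus(PendingUpload::STATUS_PENDING);
    $job->setCreatedAt(new \DateTimeImmutable());

    $em->persist($job);
    $em->flush();

    // 202 Accepted: the work has been queued, not performed yet
    return $this->json(['status' => 'queued'], Response::HTTP_ACCEPTED);
}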
You would then need to write a separate process.
Very likely a console command that you execute regularly (using cron would be the simplest way).
On each execution, that process (which would contain all the Google_Client logic, and probably a PendingUploadsRepository) checks which jobs are still pending, processes them sequentially, and sets the status to whatever value you use to signify "done". You could use a status of 0 (pending), 1 (processing) and 2 (processed), for example, and set it accordingly at each step of the script.
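As a rough skeleton (hedged: the command name, PendingUploadsRepository and the STATUS_* constants are assumptions that map to the 0/1/2 values above):
use Doctrine\ORM\EntityManagerInterface;
use Symfony\Component\Console\Command\Command;
use Symfony\Component\Console\Input\InputInterface;
use Symfony\Component\Console\Output\OutputInterface;

class ProcessPendingUploadsCommand extends Command
{
    protected static $defaultName = 'app:process-pending-uploads';

    private $em;
    private $pendingUploads;

    public function __construct(EntityManagerInterface $em, PendingUploadsRepository $pendingUploads)
    {
        parent::__construct();
        $this->em = $em;
        $this->pendingUploads = $pendingUploads;
    }

    protected function execute(InputInterface $input, OutputInterface $output): int
    {
        foreach ($this->pendingUploads->findBy(['status' => PendingUpload::STATUS_PENDING]) as $job) {
            $job->setStatus(PendingUpload::STATUS_PROCESSING);
            $this->em->flush();

            // ...all the Google_Client / chunked upload logic from the original controller goes here...

            $job->setStatus(PendingUpload::STATUS_DONE);
            $this->em->flush();
        }

        return Command::SUCCESS;
    }
}
You would then run it from cron every few minutes, e.g. */5 * * * * php /path/to/project/bin/console app:process-pending-uploads.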
The details of exactly how to implement this are up to you; that question would be too broad and opinionated. Pick something that you already understand and that allows you to move fast. Whether you store your jobs in RabbitMQ, Redis, a database or a flat file is not particularly important, and the same goes for whether you start your "consumer" with cron or supervisor.
Symfony has a ready-made component that can decouple this kind of messaging asynchronously (Symfony Messenger), and it's pretty nice. Investigate whether it's your cup of tea, although if you are not going to use it for anything else in your application I would keep things simple to begin with.
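If you do go down the Messenger route, the general shape would be something like the sketch below. The message and handler names are invented for this example, and you would still need to route the message class to an async transport in config/packages/messenger.yaml and run php bin/console messenger:consume for the work to actually happen in the background:
// src/Message/UploadVideoToYouTube.php - a plain value object describing the work
class UploadVideoToYouTube
{
    private $videoId;
    private $authCode;

    public function __construct(int $videoId, string $authCode)
    {
        $this->videoId = $videoId;
        $this->authCode = $authCode;
    }

    public function getVideoId(): int { return $this->videoId; }
    public function getAuthCode(): string { return $this->authCode; }
}

// In the controller: dispatch and return immediately
public function upload_youtube(int $id, Request $request, MessageBusInterface $bus): JsonResponse
{
    $data = json_decode($request->getContent(), true);
    $bus->dispatch(new UploadVideoToYouTube($id, $data['code']));

    return $this->json(['status' => 'queued'], Response::HTTP_ACCEPTED);
}

// src/MessageHandler/UploadVideoToYouTubeHandler.php - runs in the messenger:consume worker
class UploadVideoToYouTubeHandler implements MessageHandlerInterface
{
    public function __invoke(UploadVideoToYouTube $message): void
    {
        // the Google_Client chunked-upload logic from the original controller goes here
    }
}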

Related

How to cache facebook graph api call

I've created a function to get the likes for my Facebook page using the Graph API. However, the rate limit keeps being reached because the call is made on every request.
How would I cache this so it doesn't make the call every time?
The code I'm currently using is:
function fb_like_count() {
    $id = '389320241533001';
    $access_token = 'access token goes here';
    $json_url = 'https://graph.facebook.com/v3.2/' . $id . '?fields=fan_count&access_token=' . $access_token;

    $json = file_get_contents($json_url);
    $json_output = json_decode($json);

    if ($json_output->fan_count) {
        return like_count_format($json_output->fan_count);
    } else {
        return 0;
    }
}
There are many caching mechanisms in PHP that you can use, depending on your project size.
I would suggest looking at Memcached or Redis. These are in-memory caches that are very fast and will give you better performance.
You can read more about how to implement Memcached here, or Redis here.
The second and easier way is to use file caching. It works like this: you send a request to the Facebook API and, when the response comes back, you save it to a file. Before sending the next request, you first check whether the file already has recent content; if it does, you return that directly to your application, otherwise you send the request to the Facebook API again.
A simple integration looks like this:
if (file_exists($facebook_cache_file) && (filemtime($facebook_cache_file) > (time() - 60 * 15))) {
    // Cache file is less than 15 minutes old (adjust to taste)
    $file = file_get_contents($facebook_cache_file); // this holds the cached API data
} else {
    // Our cache is out of date, so load the data from the remote server
    // and save it over our cache for next time.
    $response = getFacebookData(); // get data from Facebook
    file_put_contents($facebook_cache_file, $response, LOCK_EX);
    $file = $response;
}
In any case, I would suggest using an existing PHP library for the file caching. Below are a couple that might be worth a look:
https://github.com/PHPSocialNetwork/phpfastcache
https://symfony.com/doc/current/components/cache.html
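For example, with the Symfony Cache component linked above, the question's function could be wrapped roughly like this (a sketch; the cache key and the 15-minute TTL are arbitrary choices):
use Symfony\Component\Cache\Adapter\FilesystemAdapter;
use Symfony\Contracts\Cache\ItemInterface;

function fb_like_count_cached() {
    $cache = new FilesystemAdapter();

    // The callback only runs on a cache miss; the result is then stored for 15 minutes
    return $cache->get('fb_like_count', function (ItemInterface $item) {
        $item->expiresAfter(60 * 15);
        return fb_like_count(); // the original function that calls the Graph API
    });
}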

Doctrine Pessimistic Lock

Scenario:
I am implementing an application in Symfony2, with a command running every five minutes (via cronjob) that goes through a MySQL table; for each record it first reads a JSON array field, performs a series of calculations, and finally saves the array with the new data back to the same field. There is also a web application where a user can edit and save this data in the same table.
To avoid concurrency problems, when the command accesses a record I take a pessimistic lock, so that if the user changes the data at that exact moment they have to wait until the end of the transaction, and when it finishes the user's data is saved.
But when the user saves the data, a bug occurs randomly: the user's data is not saved and the web application shows the previous data, which tells me the lock is not working.
Code in the Symfony2 command where I take the pessimistic lock:
foreach ($this->cisOfferMetaData as $oldCisOfferMeta) {
    // calculate budget used for each timeframe case and save it
    // begin transaction and lock the CisOfferMeta entity
    $this->em->getConnection()->beginTransaction();
    try {
        $cisOfferMeta = $this->em->getRepository('CroboCisBundle:CisOfferMeta')->find(
            $oldCisOfferMeta->getId(),
            LockMode::PESSIMISTIC_READ
        );
        $budget = $cisOfferMeta->getBudgetOffer();
        foreach (
            generator(CisOfferMeta::getTypesArray(), CisOfferMeta::getTimeframesArray())
            as $type => $timeframe
        ) {
            if (isset($budget[$type][$timeframe]['goal'])) {
                // if type=budget we need the revenue value, if type=conversion, the conversions value
                $budget[$type][$timeframe]['used'] =
                    ($type === 'conversion')
                        ? intval($allTimeframes[$key]['conversions'][$timeframe])
                        : round($allTimeframes[$key]['revenue'][$timeframe], 2);
                $budget[$type][$timeframe]['percent_reached'] =
                    ($budget[$type][$timeframe]['used'] == 0.0)
                        ? 0.0
                        : round(
                            $budget[$type][$timeframe]['used'] / intval($budget[$type][$timeframe]['goal']) * 100,
                            2
                        );
            }
        }
        $budget['current_conversions'] = $allTimeframes[$key]['conversions'];
        $budget['current_revenue'] = $allTimeframes[$key]['revenue'];
        $cisOfferMeta->setBudgetOffer($budget);
        $this->em->flush($cisOfferMeta);
        $this->em->getConnection()->commit();
    } catch (PessimisticLockException $e) {
        $this->em->getConnection()->rollback();
        throw $e;
    }
}
Am I doing something wrong? My understanding is that, from the moment the transaction is started until the changes are committed, a user who attempts to read or update the data has to wait until the lock on the blocked entity is released.
Reading the Doctrine documentation, it is not clear to me whether I should also add versioning to the entity.
In the end this code worked properly and did take the pessimistic lock; the problem was in a listener that read the data before this lock was acquired and then flushed it, unchanged, after the lock was released.
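For reference, a minimal sketch (not the original code) of how the web-side save could contend on the same row. Note that LockMode::PESSIMISTIC_WRITE (SELECT ... FOR UPDATE on MySQL) is usually what you want when both sides intend to write, whereas PESSIMISTIC_READ only takes a shared lock:
$this->em->getConnection()->beginTransaction();
try {
    // Blocks here until any other transaction holding a lock on this row commits
    $cisOfferMeta = $this->em->find('CroboCisBundle:CisOfferMeta', $id, LockMode::PESSIMISTIC_WRITE);

    $budget = $cisOfferMeta->getBudgetOffer();
    // ...apply the user's edits to $budget...
    $cisOfferMeta->setBudgetOffer($budget);

    $this->em->flush();
    $this->em->getConnection()->commit();
} catch (\Exception $e) {
    $this->em->getConnection()->rollback();
    throw $e;
}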

Slow responses using the Asana API

Information
I've started using the Asana API to build our own task overview in our CMS. I found an API wrapper on GitHub which helps me a great deal with this.
As I mentioned in an earlier question, I wanted to get all tasks for a certain user. I've managed to do this using the code below.
public function user($id)
{
    if (isset($_SERVER['HTTP_X_REQUESTED_WITH']) &&
        ($_SERVER['HTTP_X_REQUESTED_WITH'] == 'XMLHttpRequest')) {
        $this->layout = 'ajax';
    }

    $asana = new Asana(array(
        'apiKey' => 'xxxxxxxxxxxxxxxxxxxx'
    ));

    $results = json_decode($asana->getTasksByFilter(array(
        'assignee' => $id,
        'workspace' => 'xxxxxxxxxx'
    )));

    if ($asana->responseCode != '200' || is_null($results)) {
        throw new \Exception('Error while trying to connect to Asana, response code: ' . $asana->responseCode, 1);
    }

    $tasks = array();
    foreach ($results->data as $task) {
        $result = json_decode($asana->getTaskTags($task->id));
        $task->tags = $result->data;
        $tasks[] = $task;
    }

    $user = json_decode($asana->getUserInfo($id));

    if ($asana->responseCode != '200' || is_null($user)) {
        throw new \Exception('Error while trying to connect to Asana, response code: ' . $asana->responseCode, 1);
    }

    $this->render("tasks", array(
        'tasks' => $tasks,
        'title' => 'Tasks for ' . $user->data->name
    ));
}
The problem
The above works fine, except for one thing. It is slower than a booting Windows Vista machine (very slow :) ). If I include the tags, it can take up to 60 seconds before I get all results. If I do not include the tags it takes about 5 seconds which is still way too long. Now, I hope I am not the first one ever to have used the Asana API and that some of you might have experienced the same problem in the past.
The API itself could definitely be faster, and we have some long-term plans around how to improve responsiveness, but in the near-to-mid-term the API is probably going to remain the same basic speed.
The trick to not spending a lot of time accessing the API is generally to reduce the number of requests you make and only request the data you need. Sometimes, API clients don't make this easy, and I'm not familiar with the PHP client specifically, but I can give an example of how this would work in general with just the plain HTTP queries.
So right now you're doing the following in pseudocode:
GET /tasks?assignee=...&workspace=...
foreach task:
    GET /task/.../tags
GET /users/...
So if the user has 20 tasks (and real users typically have a lot more than 20 tasks - if you only care about incomplete tasks and tasks completed in, say, the last week, you could use ?completed_since=<DATE_ONE_WEEK_AGO>), you've made 22 requests. And because it's synchronous, you wait a few seconds for each and every one of those requests before you start the next one.
Fortunately, the API has a parameter called ?opt_fields that allows you to specify the exact data you need. For example: let's suppose that for each task, all you really want is to know the task ID, the task name, the tags it has and their names. You could then request:
GET /tasks?assignee=...&workspace=...&opt_fields=name,tags.name
(Each resource included always brings its id field)
This would allow you to get, in a single HTTP request, all the data you're after. (Well, the user lookup is still separate, but at least that's just 1 extra request instead of N). For more information on opt_fields, check out the documentation on Input/Output Options.
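To make that concrete, here is a rough sketch in plain PHP of the single combined request (not using the wrapper; the API-key-as-basic-auth scheme was how the Asana API authenticated at the time, so treat the auth details as an assumption):
$workspace = 'xxxxxxxxxx';
$assignee  = $id;
$apiKey    = 'xxxxxxxxxxxxxxxxxxxx';

$url = 'https://app.asana.com/api/1.0/tasks'
     . '?assignee=' . urlencode($assignee)
     . '&workspace=' . urlencode($workspace)
     . '&opt_fields=name,tags.name';

$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
// API-key auth: the key is the basic-auth username, the password is empty
curl_setopt($ch, CURLOPT_USERPWD, $apiKey . ':');
$response = curl_exec($ch);
curl_close($ch);

// One round trip: every task comes back with its name and its tags' names (ids are always included)
$tasks = json_decode($response)->data;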
Hope that helps!

Download millions of images from external website

I am working on a real estate website and we're about to get an external feed of ~1M listings. Assuming each listing has ~10 photos associated with it, that's about ~10M photos, and we're required to download each of them to our server so as to not "hot link" to them.
I'm at a complete loss as to how to do this efficiently. I played with some numbers and concluded that, based on a rate of 0.5 seconds per image, this could take upwards of ~58 days to complete (downloading ~10M images from an external server), which is obviously unacceptable.
Each photo seems to be roughly ~50KB, but that can vary with some being larger, much larger, and some being smaller.
I've been testing by simply using:
copy('http://www.external-site.com/image1.jpg', '/path/to/folder/image1.jpg');
I've also tried cURL, wget, and others.
I know other sites do it, and at a much larger scale, but I haven't the slightest clue how they manage this sort of thing without it taking months at a time.
Pseudocode based on the XML feed we're set to receive. We're parsing the XML using PHP:
<listing>
    <listing_id>12345</listing_id>
    <listing_photos>
        <photo>http://example.com/photo1.jpg</photo>
        <photo>http://example.com/photo2.jpg</photo>
        <photo>http://example.com/photo3.jpg</photo>
        <photo>http://example.com/photo4.jpg</photo>
        <photo>http://example.com/photo5.jpg</photo>
        <photo>http://example.com/photo6.jpg</photo>
        <photo>http://example.com/photo7.jpg</photo>
        <photo>http://example.com/photo8.jpg</photo>
        <photo>http://example.com/photo9.jpg</photo>
        <photo>http://example.com/photo10.jpg</photo>
    </listing_photos>
</listing>
So my script will iterate through each photo for a specific listing and download the photo to our server, and also insert the photo name into our photo database (the insert part is already done without issue).
Any thoughts?
I am surprised the vendor is not allowing you to hot-link. The truth is you will not serve every image every month, so why download every one of them? Allowing you to hot-link would be a better use of everyone's bandwidth.
I manage a catalog with millions of items where the data is local but the images are mostly hot-linked. Sometimes we need to hide the source of the image, or the vendor requires us to cache the image. To accomplish both goals we use a proxy. We wrote our own proxy, but you might find something open source that meets your needs.
The way the proxy works is that we encrypt the image URL and then URL-encode the encrypted string. So http://yourvendor.com/img1.jpg becomes xtX957z. In our markup the img src attribute is something like http://ourproxy.com/getImage.ashx?image=xtX957z.
When our proxy receives an image request, it decrypts the image URL. The proxy first looks on disk for the image. We derive the image name from the URL, so it is looking for something like yourvendorcom.img1.jpg. If the proxy cannot find the image on disk, it uses the decrypted URL to fetch the image from the vendor. It then writes the image to disk and serves it back to the client. This approach has the advantage of being on demand, with no wasted bandwidth: I only get the images I need, and I only get them once.
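A rough PHP sketch of the same idea (purely illustrative: the original proxy was not written in PHP, and the decrypt_image_token() helper, cache directory and parameter name are invented):
// getImage.php?image=<encrypted token>
$token = isset($_GET['image']) ? $_GET['image'] : '';
$remoteUrl = decrypt_image_token($token); // hypothetical helper that reverses the encryption step
if ($remoteUrl === false) {
    http_response_code(400);
    exit;
}

// Derive a flat local filename from the remote URL, e.g. yourvendorcom.img1.jpg
$localPath = '/var/cache/images/' . preg_replace('/[^A-Za-z0-9._-]/', '', str_replace(array('http://', '/'), array('', '.'), $remoteUrl));

if (!file_exists($localPath)) {
    // Not cached yet: fetch from the vendor on demand and store it
    $data = file_get_contents($remoteUrl);
    if ($data === false) {
        http_response_code(404);
        exit;
    }
    file_put_contents($localPath, $data, LOCK_EX);
}

header('Content-Type: image/jpeg');
readfile($localPath);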
You can save all the links into a database table (it will be your "job queue").
Then you can create a script which, in a loop, gets a job and does it (fetches the image for a single link and marks the job record as done).
You can execute the script multiple times, e.g. using supervisord, so the job queue will be processed in parallel. If it's too slow you can just start another worker script (as long as bandwidth does not slow you down).
If any script hangs for some reason you can easily run it again, and it will only fetch the images that haven't been downloaded yet. By the way, supervisord can be configured to automatically restart each script if it fails.
Another advantage is that at any time you can check the output of those scripts with supervisorctl. To check how many images are still waiting you can simply query the "job queue" table. A rough sketch of such a worker is below.
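A minimal sketch of one such worker, assuming a hypothetical photo_jobs table with url, local_path and status columns (adapt the schema, DSN and politeness delay to your situation):
$pdo = new PDO('mysql:host=localhost;dbname=realestate', 'user', 'pass');

while (true) {
    // Pick a pending job (status 0 = pending, 1 = claimed, 2 = done)
    $row = $pdo->query("SELECT id, url, local_path FROM photo_jobs WHERE status = 0 LIMIT 1")->fetch(PDO::FETCH_ASSOC);
    if (!$row) {
        break; // queue is empty
    }

    // Claim it; if another worker got there first, rowCount() is 0 and we just try again
    $claim = $pdo->prepare("UPDATE photo_jobs SET status = 1 WHERE id = ? AND status = 0");
    $claim->execute(array($row['id']));
    if ($claim->rowCount() === 0) {
        continue;
    }

    if (copy($row['url'], $row['local_path'])) {
        $pdo->prepare("UPDATE photo_jobs SET status = 2 WHERE id = ?")->execute(array($row['id']));
    }

    usleep(100000); // be polite to the remote host; tune this to whatever rate they allow
}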
Before you do this
Like @BrokenBinar said in the comments: take into account how many requests per second the host can handle. You don't want to flood them with requests without them knowing. Then use something like sleep to limit your requests to whatever rate they can accommodate.
Curl Multi
Anyway, use cURL. This is somewhat of a duplicate answer, but it's copied here anyway:
$nodes = array($url1, $url2, $url3);
$node_count = count($nodes);

$curl_arr = array();
$master = curl_multi_init();

for ($i = 0; $i < $node_count; $i++) {
    $url = $nodes[$i];
    $curl_arr[$i] = curl_init($url);
    curl_setopt($curl_arr[$i], CURLOPT_RETURNTRANSFER, true);
    curl_multi_add_handle($master, $curl_arr[$i]);
}

do {
    curl_multi_exec($master, $running);
} while ($running > 0);

for ($i = 0; $i < $node_count; $i++) {
    $results[] = curl_multi_getcontent($curl_arr[$i]);
}

print_r($results);
From: PHP Parallel curl requests
Another solution:
Pthread
<?php
class WebRequest extends Stackable {
    public $request_url;
    public $response_body;

    public function __construct($request_url) {
        $this->request_url = $request_url;
    }

    public function run() {
        $this->response_body = file_get_contents($this->request_url);
    }
}

class WebWorker extends Worker {
    public function run() {}
}

$list = array(
    new WebRequest("http://google.com"),
    new WebRequest("http://www.php.net")
);

$max = 8;
$threads = array();
$start = microtime(true);

/* start some workers */
while (@$thread++ < $max) {
    $threads[$thread] = new WebWorker();
    $threads[$thread]->start();
}

/* stack the jobs onto workers */
foreach ($list as $job) {
    $threads[array_rand($threads)]->stack($job);
}

/* wait for completion */
foreach ($threads as $thread) {
    $thread->shutdown();
}

$time = microtime(true) - $start;

/* tell you all about it */
printf("Fetched %d responses in %.3f seconds\n", count($list), $time);

$length = 0;
foreach ($list as $listed) {
    $length += strlen($listed["response_body"]);
}
printf("Total of %d bytes\n", $length);
?>
Source: PHP testing between pthreads and curl
You should really use the search feature, ya know :)

Ratchet PHP WAMP - React / ZeroMQ - Specific user broadcast

Note: This is not the same as this question which utilises MessageComponentInterface. I am using WampServerInterface instead, so this question pertains to that part specifically. I need an answer with code examples and an explanation, as I can see this being helpful to others in the future.
Attempting looped pushes for individual users
I'm using the WAMP part of Ratchet and ZeroMQ, and I currently have a working version of the push integration tutorial.
I'm attempting to perform the following:
The zeromq server is up and running, ready to log subscribers and unsubscribers
A user connects in their browser over the websocket protocol
A loop is started which sends data to the specific user who requested it
When the user disconnects, the loop for that user's data is stopped
I have points (1) and (2) working, however the issue I have is with the third one:
Firstly: How can I send data to each specific user only? Broadcast sends it to everyone, unless perhaps the "topics" end up being individual user IDs?
Secondly: I have a big security issue. If I'm sending the ID of the user who wants to subscribe from the client side, which it seems I need to, then the user could just change that variable to another user's ID and get that user's data returned instead.
Thirdly: I'm having to run a separate PHP script containing the ZeroMQ code to start the actual looping. I'm not sure this is the best way to do it, and I would rather have this working completely within the codebase as opposed to a separate PHP file. This is a major area I need sorted.
The following code shows what I currently have.
The server that just runs from console
I literally type php bin/push-server.php to run this. Subscriptions and un-subscriptions are output to this terminal for debugging purposes.
$loop = React\EventLoop\Factory::create();
$pusher = new Pusher(); // the Pusher class shown below
$context = new React\ZMQ\Context($loop);

$pull = $context->getSocket(ZMQ::SOCKET_PULL);
$pull->bind('tcp://127.0.0.1:5555');
$pull->on('message', array($pusher, 'onMessage'));

$webSock = new React\Socket\Server($loop);
$webSock->listen(8080, '0.0.0.0'); // Binding to 0.0.0.0 means remotes can connect

$webServer = new Ratchet\Server\IoServer(
    new Ratchet\WebSocket\WsServer(
        new Ratchet\Wamp\WampServer(
            $pusher
        )
    ),
    $webSock
);

$loop->run();
The Pusher that sends out data over websockets
I've omitted the useless stuff and concentrated on the onMessage() and onSubscribe() methods.
public function onSubscribe(ConnectionInterface $conn, $topic)
{
    $subject = $topic->getId();
    $ip = $conn->remoteAddress;

    if (!array_key_exists($subject, $this->subscribedTopics)) {
        $this->subscribedTopics[$subject] = $topic;
    }

    $this->clients[] = $conn->resourceId;
    echo sprintf("New Connection: %s" . PHP_EOL, $conn->remoteAddress);
}

public function onMessage($entry) {
    $entryData = json_decode($entry, true);
    var_dump($entryData);

    if (!array_key_exists($entryData['topic'], $this->subscribedTopics)) {
        return;
    }

    $topic = $this->subscribedTopics[$entryData['topic']];

    // This sends out everything to multiple users, not what I want!!
    // I can't send() to individual connections from here I don't think :S
    $topic->broadcast($entryData);
}
The script to start using the above Pusher code in a loop
This is my issue: it's a separate PHP file that will hopefully be integrated into other code in the future, but currently I'm not sure how to use it properly. Do I grab the user's ID from the session? I still need to send it from the client side...
// Thought sessions might work here but they don't work for subscription
session_start();
$userId = $_SESSION['userId'];

$loop = React\EventLoop\Factory::create();
$context = new ZMQContext();
$socket = $context->getSocket(ZMQ::SOCKET_PUSH, 'my pusher');
$socket->connect("tcp://localhost:5555");

$i = 0;
$loop->addPeriodicTimer(4, function() use ($socket, $loop, $userId, &$i) {
    $entryData = array(
        'topic' => 'subscriptionTopicHere',
        'userId' => $userId
    );

    $i++;

    // So it doesn't go on infinitely if run from browser
    if ($i >= 3) {
        $loop->stop();
    }

    // Send stuff to the queue
    $socket->send(json_encode($entryData));
});

$loop->run(); // the timer only fires once the loop is running
Finally, the client-side js to subscribe with
$(document).ready(function() {
    var conn = new ab.Session(
        'ws://localhost:8080',
        function() {
            conn.subscribe('topicHere', function(topic, data) {
                console.log(topic);
                console.log(data);
            });
        },
        function() {
            console.warn('WebSocket connection closed');
        },
        {
            'skipSubprotocolCheck': true
        }
    );
});
Conclusion
The above is working, but I really need to figure out the following:
How can I send individual messages to individual users? When they visit the page that starts the websocket connection in JS, should I also be starting the PHP script that shoves stuff into the queue (the ZeroMQ part)? That's what I'm currently doing manually, and it just feels wrong.
When subscribing a user from JS, it can't be safe to grab the user's ID from the session and send it from the client side. This could be faked. Please tell me there is an easier way, and if so, how?
Note: My answer here does not include references to ZeroMQ, as I am not using it any more. However, I'm sure you will be able to figure out how to use ZeroMQ with this answer if you need to.
Use JSON
First and foremost, the Websocket RFC and WAMP Spec state that the topic to subscribe to must be a string. I'm cheating a little here, but I'm still adhering to the spec: I'm passing JSON through instead.
{
"topic": "subject here",
"userId": "1",
"token": "dsah9273bui3f92h3r83f82h3"
}
JSON is still a string, but it allows me to pass through more data in place of the "topic", and it's simple for PHP to do a json_decode() on the other end. Of course, you should validate that you actually receive JSON, but that's up to your implementation.
So what am I passing through here, and why?
Topic
The topic is the subject the user is subscribing to. You use this to decide what data you pass back to the user.
UserId
Obviously the ID of the user. You must verify that this user exists and is allowed to subscribe, using the next part:
Token
This should be a one-use, randomly generated token, generated in your PHP and passed to a JavaScript variable. When I say "one use", I mean that every time you reload the page (and, by extension, on every HTTP request), your JavaScript variable should contain a new token. This token should be stored in the database against the user's ID.
Then, once a websocket request is made, you match the token and user id to those in the database to make sure the user is indeed who they say they are, and they haven't been messing around with the JS variables.
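A minimal sketch of the token side, assuming a hypothetical users.ws_token column (random_bytes() needs PHP 7; on older versions use openssl_random_pseudo_bytes()):
// On every normal HTTP request, before rendering the page:
$token = bin2hex(random_bytes(32)); // 64 random hex characters
$pdo->prepare('UPDATE users SET ws_token = ? WHERE id = ?')->execute(array($token, $userId));

// Expose it to the page so the JS can include it in the subscription JSON:
echo '<script>var wsToken = ' . json_encode($token) . '; var wsUserId = ' . json_encode($userId) . ';</script>';

// Later, in onSubscribe(), after json_decode(): load the user's stored ws_token by
// $data->userId and compare it to $data->token with hash_equals() before subscribing.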
Note: In your event handler, you can use $conn->remoteAddress to get the IP of the connection, so if someone is trying to connect maliciously, you can block them (log them or something).
Why does this work?
It works because every time a new connection comes through, the unique token ensures that no user will have access to anyone else's subscription data.
The Server
Here's what I am using for running the loop and event handler. I am creating the loop, doing all the decorator style object creation, and passing in my EventHandler (which I'll come to soon) with the loop in there too.
$loop = Factory::create();

new IoServer(
    new WsServer(
        new WampServer(
            new EventHandler($loop) // This is my class. Pass in the loop!
        )
    ),
    $webSock
);

$loop->run();
The Event Handler
class EventHandler implements WampServerInterface, MessageComponentInterface
{
    /**
     * @var \React\EventLoop\LoopInterface
     */
    private $loop;

    /**
     * @var array List of connected clients
     */
    private $clients;

    /**
     * Pass in the react event loop here
     */
    public function __construct(LoopInterface $loop)
    {
        $this->loop = $loop;
    }

    /**
     * A user connects; we store the connection by its unique resource id
     */
    public function onOpen(ConnectionInterface $conn)
    {
        $this->clients[$conn->resourceId]['conn'] = $conn;
    }

    /**
     * A user subscribes. The JSON is in $subscription->getId()
     */
    public function onSubscribe(ConnectionInterface $conn, $subscription)
    {
        // This is the JSON passed in from your JavaScript
        // Obviously you need to validate it's JSON and expected data etc...
        $data = json_decode($subscription->getId());

        // Validate the user's id and token together against the db values

        // Now, let's subscribe this user only
        // 5 = the interval, in seconds
        $timer = $this->loop->addPeriodicTimer(5, function() use ($subscription) {
            $data = "whatever data you want to broadcast";
            return $subscription->broadcast(json_encode($data));
        });

        // Store the timer against that user's connection resource id
        $this->clients[$conn->resourceId]['timer'] = $timer;
    }

    public function onClose(ConnectionInterface $conn)
    {
        // There might be a connection without a timer,
        // so make sure there is one before trying to cancel it!
        if (isset($this->clients[$conn->resourceId]['timer'])) {
            if ($this->clients[$conn->resourceId]['timer'] instanceof TimerInterface) {
                $this->loop->cancelTimer($this->clients[$conn->resourceId]['timer']);
            }
        }
        unset($this->clients[$conn->resourceId]);
    }

    /** Implement all the extra methods the interfaces say that you must use **/
}
That's basically it. The main points here are:
Unique token, userid and connection id provide the unique combination required to ensure that one user can't see another user's data.
A unique token means that if the same user opens another page and requests to subscribe, they'll have their own connection id + token combo, so the same user won't end up with duplicate subscriptions on the same page (basically, each connection has its own individual data).
Extension
You should be ensuring all data is validated and is not a hack attempt before you do anything with it. Log all connection attempts using something like Monolog, and set up e-mail forwarding if anything critical occurs (like the server stopping because someone is being a bastard and attempting to hack your server).
Closing Points
Validate Everything. I can't stress this enough. Your unique token that changes on every request is important.
Remember, if you re-generate the token on every HTTP request, and you make a POST request before attempting to connect via websockets, you'll have to pass back the re-generated token to your JavaScript before trying to connect (otherwise your token will be invalid).
Log everything. Keep a record of everyone that connects, asks for what topic, and disconnects. Monolog is great for this.
To send to specific users, you need a ROUTER-DEALER pattern instead of PUB-SUB. This is explained in the Guide, in chapter 3. Security, if you're using ZMQ v4.0, is handled at the wire level, so you don't see it in the application. It still requires some work, unless you use the CZMQ binding, which provides an authentication framework (zauth).
Basically, to authenticate, you install a handler on inproc://zeromq.zap.01, and respond to requests over that socket. Google ZeroMQ ZAP for the RFC; there is also a test case in the core libzmq/tests/test_security_curve.cpp program.
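For illustration only, a minimal php-zmq sketch of the ROUTER side (independent of the Ratchet code above; the payload format and addressing are assumptions):
$context = new ZMQContext();

// A ROUTER socket prefixes every received message with the sender's identity frame,
// and you address a specific peer by sending that identity frame first.
$router = $context->getSocket(ZMQ::SOCKET_ROUTER);
$router->bind('tcp://127.0.0.1:5556');

while (true) {
    // Frame 1: identity of the DEALER peer, frame 2: its payload
    $identity = $router->recv();
    $payload  = $router->recv();

    // Reply to that specific peer only
    $router->send($identity, ZMQ::MODE_SNDMORE);
    $router->send('reply for ' . $payload);
}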
