Serializing a priority queue with custom items in PHP - php

I'm implementing a customized priority queue based on PHP's SplPriorityQueue in a Zend application. Instead of plain values, it stores custom objects (PriorityQueueItem instances) alongside the priority value. When storing the queue in APC (or memcache) I have to make sure the queue and its items are serializable, so I've made them implement the Serializable interface using code from the upcoming Zend Framework 2.
public function serialize()
{
$data = array();
while ($this->valid()) {
$data[] = $this->current();
$this->next();
}
foreach ($data as $item) {
$this->insert($item['data'], $item['priority']);
}
return serialize($data);
}
public function unserialize($data)
{
foreach (unserialize($data) as $item) {
$this->insert($item['data'], $item['priority']);
}
}
After fetching the queue from APC and retrieving the top item in the priority queue, using $this->extract(), I don't get the item but the array that is created during serialization.
So, instead of a PriorityQueueItem, the base class I use for objects stored in the queue, I get an associative array with indices data and priority (similar to the array in the serialize function). To get the actual item I need to retrieve the data part of the array instead of treating the returned value as an item, which is how it works when not storing the queue in APC and how I assumed it would work now as well.
Is this a feature of serialization of objects or am I approaching this in a wrong way?
Update: The issue here was that I had a separate function that did extra cruft besides the extract(). This function returned the item as an array, but as soon as I called extract() explicitly I got the item as expected. Are there certain precautions to take with public functions in objects that have been serialized?

You have probably mixed two things up:
In your code you are serializing the $data array, not the object itself. I'm not entirely sure of this, because I do not know what the insert() function does in your class.
For an object that implements the Serializable interface, unserialize() is handed back exactly what object::serialize() returned.
Since you serialize an array, you get that serialized array back. Behind the scenes, PHP takes care of recording that the data was stored for your object's class.
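The point above can be illustrated with a minimal sketch. It assumes PHP 7.4 or later and uses the __serialize()/__unserialize() magic methods instead of the Serializable interface from the question; Item and ItemQueue are hypothetical names standing in for the asker's PriorityQueueItem and queue class. The queue stores an array of data/priority pairs, yet callers still get Item objects back after a round trip.

```php
<?php
// Hypothetical stand-in for the asker's PriorityQueueItem.
final class Item
{
    public string $name;

    public function __construct(string $name)
    {
        $this->name = $name;
    }
}

// Hypothetical wrapper around SplPriorityQueue (composition, not inheritance).
final class ItemQueue
{
    private SplPriorityQueue $queue;

    public function __construct()
    {
        $this->queue = new SplPriorityQueue();
    }

    public function insert(Item $item, int $priority): void
    {
        $this->queue->insert($item, $priority);
    }

    public function extract(): Item
    {
        // Default extract flags return only the data, i.e. the Item.
        return $this->queue->extract();
    }

    public function __serialize(): array
    {
        // Work on a clone so the live queue is not drained.
        $copy = clone $this->queue;
        $copy->setExtractFlags(SplPriorityQueue::EXTR_BOTH);
        $rows = [];
        while (!$copy->isEmpty()) {
            $rows[] = $copy->extract(); // ['data' => Item, 'priority' => int]
        }
        return $rows;
    }

    public function __unserialize(array $rows): void
    {
        // Rebuild the heap; the data/priority arrays stay an internal detail.
        $this->queue = new SplPriorityQueue();
        foreach ($rows as $row) {
            $this->queue->insert($row['data'], $row['priority']);
        }
    }
}
```

With this shape, unserialize(serialize($queue)) yields an ItemQueue again, and extract() returns an Item rather than the intermediate array. If a method does return the array, it is most likely reading the pairs directly instead of going through extract() with the default flags.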

Related

Create an array of objects and then fetch object without losing their class

I'm trying to do some experiments with arrays and objects to improve my PHP programming basics.
What I would like to do is to save a collection of objects instantiated by one of my classes in a text file to be able to fetch them later.
My current implementation is to save the objects in an array, encode the array and save it to a JSON file.
However, the problem that arises is that when I then go to extract the objects these are no longer objects deriving from my class but are transformed into stdClass objects.
Here is the method I use to save objects to the file:
public function store(string $titolo, string $nomeAutore, string $titoloToDoList): void
{
FileChecker::FindOrBuild('Data/Tasks.json', "[]");
$newTask = new Task($titolo, $nomeAutore, $titoloToDoList);
$file = file_get_contents('Data/Tasks.json');
$decodedFile = json_decode($file);
array_push($decodedFile, $newTask->toArray());
file_put_contents('Data/Tasks.json', json_encode($decodedFile));
FileChecker::FindOrBuild('log/log.txt', "Logs: \n");
Logger::logTaskStore($nomeAutore, $titoloToDoList);
}
and then I extract from the file with a simple json_decode().
Can anyone suggest some alternative method to save objects in a text file without losing the class?
edit:
forgot to put the toArray() code which is simply
public function toArray(): array
{
return get_object_vars($this);
}
There are as many file formats as there are people who need to store something slightly different in a file. It's up to you to figure out which one makes sense for your application.
JSON is a file format designed to be very simple and flexible, and portable between lots of different languages. It has no concept of "class" or "custom type", and its "object" type is just a list of key-value pairs. (Have a look at the file your current code creates, and you'll see for yourself.)
You can build a file format "on top of" JSON: that is, rather than storing your objects directly, you first build a custom structure with a way of recording the class name, perhaps as a special key on each object called "__class". Then to decode, you first decode the JSON, then loop through creating objects based on the name you recorded.
You mentioned in comments PHP's built-in serialize method. That can be a good choice when you want to store full PHP data for internal use within a program, and will happily store your array of objects without extra code.
In both cases, be aware of the security implications if anyone can edit the serialized data and specify names of classes you don't want them to create. The unserialize function has an option to list the expected class names, to avoid this, but may have other security problems because of its flexibility.
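As a rough sketch of the "format on top of JSON" idea, the code below records the class name under a "__class" key and only rebuilds classes from an explicit allow-list. It assumes PHP 7.4 or later; encodeObjects(), decodeObjects() and Task::fromArray() are hypothetical helpers, and the Task class is trimmed to two fields for brevity.

```php
<?php
// Simplified stand-in for the asker's Task class.
final class Task
{
    public string $titolo;
    public string $nomeAutore;

    public function __construct(string $titolo, string $nomeAutore)
    {
        $this->titolo = $titolo;
        $this->nomeAutore = $nomeAutore;
    }

    public function toArray(): array
    {
        return get_object_vars($this);
    }

    // Hypothetical factory used when decoding.
    public static function fromArray(array $row): self
    {
        return new self($row['titolo'], $row['nomeAutore']);
    }
}

// Store each object as a plain array plus its class name.
function encodeObjects(array $objects): string
{
    $rows = [];
    foreach ($objects as $object) {
        $rows[] = ['__class' => get_class($object)] + $object->toArray();
    }
    return json_encode($rows, JSON_THROW_ON_ERROR);
}

// Rebuild objects, refusing any class that is not explicitly allowed.
function decodeObjects(string $json, array $allowedClasses): array
{
    $objects = [];
    foreach (json_decode($json, true, 512, JSON_THROW_ON_ERROR) as $row) {
        $class = $row['__class'] ?? '';
        if (!in_array($class, $allowedClasses, true)) {
            throw new UnexpectedValueException("Unexpected class: $class");
        }
        unset($row['__class']);
        $objects[] = $class::fromArray($row);
    }
    return $objects;
}
```

The allow-list plays the same role as the allowed_classes option of unserialize(), for example unserialize($data, ['allowed_classes' => [Task::class]]).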

PHP: How to make a wrapper for collection, which will return next element of collection every time I call it

I have a collection of objects. Every time when I get an element of this collection, I want to be sure that I get next element in this collections, when I get to the end of collection I just simply start to iterate it from very beginning.
For instance:
$listOfObjects = new WrappedCollection(array('Apple','Banana','Pikachu'));
$listOfObjects->getElement(); // I get Apple
$listOfObjects->getElement(); // I get Banana
$listOfObjects->getElement(); // I get Pikachu
$listOfObjects->getElement(); // I get Apple
I already implemented this with SplDoublyLinkedList, but every time I need to loop over this list I have to save the position of the iterator. I am sure there is a more elegant way to implement this.
$this->listOfRunningCampaigns = new \SplDoublyLinkedList();
// Getting element of collection
public function getNextRunningCampaign(): Campaign
{
$this->listOfRunningCampaigns->next();
if ($this->listOfRunningCampaigns->current() !== null)
{
return $this->listOfRunningCampaigns->current();
}
$this->listOfRunningCampaigns->rewind();
return $this->listOfRunningCampaigns->current();
}
Here is example of what I have to do when I loop through collection:
// Saving current iterator position
$currentIteratorPosition = $this->listOfRunningCampaigns->key();
for ($this->listOfRunningCampaigns->rewind(); $this->listOfRunningCampaigns->valid(); $this->listOfRunningCampaigns->next())
{
//... some action
}
$this->moveRunningCampaignsListIterator($currentIteratorPosition);
// Function that moves iterator
private function moveRunningCampaignsListIterator($index): void
{
for ($this->listOfRunningCampaigns->rewind(); $this->listOfRunningCampaigns->valid(); $this->listOfRunningCampaigns->next())
{
if ($this->listOfRunningCampaigns->key() === $index)
{
break;
}
}
}
In my opinion the way I implemented this looks really bad. In the near future I am going to perform a lot of different actions on the elements of this collection, and taking care of the iterator every time is not how I want this to work. Can you please suggest some ways of implementing this?
Why don't you just use next()? This built-in function manages the array pointer for you.
If you want to develop an array wrapper, I think you should take a look at Doctrine\Common\Collections\ArrayCollection, or use it directly; they did a great job, and it is also available via Composer.
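One possible shape for the wrapper from the question, as a sketch rather than a definitive design: keep a private cursor for the round-robin getElement() call, and hand out a fresh iterator for foreach so that looping never disturbs that cursor. The class name WrappedCollection and the method getElement() come from the question; everything else is an assumption. PHP 7.4 or later is assumed.

```php
<?php
final class WrappedCollection implements IteratorAggregate
{
    private array $items;
    private int $cursor = 0;

    public function __construct(array $items)
    {
        // Re-index so the cursor can be used as a plain offset.
        $this->items = array_values($items);
    }

    // Returns the next element, wrapping around to the start at the end.
    public function getElement()
    {
        if ($this->items === []) {
            throw new UnderflowException('Collection is empty');
        }
        $item = $this->items[$this->cursor];
        $this->cursor = ($this->cursor + 1) % count($this->items);
        return $item;
    }

    // foreach gets its own iterator, independent of the round-robin cursor.
    public function getIterator(): Iterator
    {
        return new ArrayIterator($this->items);
    }
}
```

Because the loop position and the round-robin position are stored separately, there is no need to save and restore the iterator position around every loop.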

First use of array is slow

I have encountered a weird "bug" in PHP and since I'm a novice I'm at the end of my knowledge.
I'm developing a TYPO3 extension that has some major performance issues with data, or so I thought.
It turns out that the first use of the array, which stores all the objects I got from my database query, is taking way too long.
Every use or loop after that is fast again.
The code looks like this:
$productsArr = $this->productRepository->findByDetail($category, $properties);
$newSortArr = array();
$familyProductList = array();
$counter = count($productsArr);
/** @var Product $product */
for($i = 0; $i < $counter; $i++) {
// it takes too long to do this
$product = $productsArr[$i];
if(!empty($productsArr[$i])) {
$newSortArr[$product->getInFamily()->getUid()][] = $product;
}
}
It doesn't matter where I first use the object array. The first use of the array is always taking around 30 sec.
Has anybody encountered something similar?
If you need more information I will gladly provide that.
Thanks in advance!
Your $productsArr is not an array but an object of Extbase's QueryResult class, which you can iterate over with foreach or access by index. This object executes the query and builds its objects only when needed, so at the moment you do $product = $productsArr[$i];, all Product objects in $productsArr are built. The major problem is that building that many objects in PHP is slow and consumes a lot of memory.
So, to avoid the performance issue, consider using a custom query with
$this->productRepository->createQuery()->statement('select * from ...')->execute();
to get exactly what you want instead of loading a huge amount of objects and refine them later in PHP.
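The lazy behaviour described above can be mimicked with a small stand-in class. This is only an illustration of why the first access is the expensive one; it is not Extbase's QueryResult implementation, and LazyResult, Product and groupByFamily() are hypothetical names. PHP 7.4 or later is assumed.

```php
<?php
// Stand-in for a result set that builds its objects on first use.
final class LazyResult implements IteratorAggregate
{
    /** @var callable */
    private $loader;
    private ?array $rows = null;
    public int $loads = 0; // how often the expensive load actually ran

    public function __construct(callable $loader)
    {
        $this->loader = $loader;
    }

    public function getIterator(): Iterator
    {
        if ($this->rows === null) {
            // The whole cost is paid here, on the first access.
            $this->rows = ($this->loader)();
            $this->loads++;
        }
        return new ArrayIterator($this->rows);
    }
}

// Simplified product with just the fields needed for grouping.
final class Product
{
    public string $name;
    public int $familyUid;

    public function __construct(string $name, int $familyUid)
    {
        $this->name = $name;
        $this->familyUid = $familyUid;
    }
}

// Groups any iterable of products by family UID using foreach.
function groupByFamily(iterable $products): array
{
    $grouped = [];
    foreach ($products as $product) {
        $grouped[$product->familyUid][] = $product;
    }
    return $grouped;
}
```

Creating the LazyResult is cheap; the loader runs once, during the first loop, and later loops reuse the objects that were already built. That matches the symptom in the question, where only the first use of the "array" is slow.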
As Jay already mentioned, your result isn't an array, but a QueryResult. Just FYI, it's possible to transform it to an array by adding ->toArray() at the end of your query:
$productsArr = $this->productRepository->findByDetail($category, $properties)->toArray();
But this won't improve the situation. There are two possible issues:
Iterating all objects
The advantage of a QueryResult is that it reflects only the result of the query but doesn't resolve all objects already. A QueryResult can be passed e.g. to a Pagination widget and will then only load the results requested (e.g. 1-10, 11-20 etc.).
Since you're applying manual sorting, all your objects (depending on your project this can be a lot...) are loaded.
Apparently you would like to sort the products by their family UID? Why not do that with Extbase functionality in your ProductRepository:
protected $defaultOrderings = array(
'inFamily.uid' => \TYPO3\CMS\Extbase\Persistence\QueryInterface::ORDER_ASCENDING
);
Eager loading of sub objects
Your model Product might have relations to other models (e.g. Product to Category, Product to Options etc.). By default, Extbase resolves all these relations on accessing the objects.
To prevent this, you can use lazy loading for relations. This makes sense for sub objects that are not used in all the views. E.g. in your list view you only need the title, image and price of your product, but you don't need all options of the product.
To configure lazy loading for these sub objects, you just need to add the @lazy annotation in the model:
/**
* @var \TYPO3\CMS\Extbase\Persistence\ObjectStorage<\My\Extension\Domain\Model\ObjectStorageModel>
* @lazy
*/
protected $categories;
/**
* @var \My\Extension\Domain\Model\OtherModel
* @lazy
*/
protected $author;
Lazy loading can have some drawbacks, e.g. in certain situations when checking whether an object is an instance of OtherModel, you get an object of type LazyLoadingProxy instead. You can work around most of these issues, or you may never even stumble upon them in normal scenarios. A common workaround, if you really depend on an object not being a LazyLoadingProxy, is a check like this:
if ($product->getAuthor() instanceof \TYPO3\CMS\Extbase\Persistence\Generic\LazyLoadingProxy) {
$product->getAuthor()->_loadRealInstance();
}
This makes sure that in any case you have a "real" instance of the object.
Please don't forget to flush system caches when you're doing a change regarding either one of the issues.
I assume the array is populated on the first line. Have you considered populating $product using foreach ($productsArr as $product) instead of using for?

Zend Framework: Transforming post data before reaching Resource

I have a Zend Framework 2 project using Apigility, and I want to be able to send an array of objects via POST to create multiple entities at once. However, ZF/Rest/Resource automatically converts arrays to objects when making a POST. To make the logic a bit cleaner, I would like to convert the data array to an object of my liking (putting the array into a key such as 'storage') before it reaches the Resource.
// ZF\Rest\RestController
public function create($data) //$data is an array
{
$events = $this->getEventManager();
$events->trigger('create.pre', $this, array('data' => $data));
try {
// I want to convert $data to an object by this point
$entity = $this->getResource()->create($data);
} catch (\Exception $e) {
return new ApiProblem($this->getHttpStatusCodeFromException($e), $e);
}
I thought there must be a way to hook into the create.pre event to do this. I've attached a method which gets the Request from the Event, and gets, converts, and sets the Content of the Request, but my debugger says the Resource is still receiving the original array. I've also tried $event->setParam('data', $object), and this didn't work either. (I assume because the parameters are an array and not passed by reference.) Am I going about this the wrong way, or is this not possible?

Get the reference count of an object in PHP?

I realize the knee-jerk response to this question is "you don't", but hear me out.
Basically I am running on an active-record system on a SQL, and in order to prevent duplicate objects for the same database row I keep an 'array' in the factory with each currently loaded object (using an autoincrement 'id' as the key).
The problem is that when I try to process 90,000+ rows through this system on the odd occasion, PHP hits memory issues. This would very easily be solved by running a garbage collect every few hundred rows, but unfortunately since the factory stores a copy of each object - PHP's garbage collection won't free any of these nodes.
The only solution I can think of is to check whether the reference count of the objects stored in the factory is equal to one (i.e. nothing else is referencing that object), and if so free them. This would solve my issue; however, PHP doesn't have a reference count method (besides debug_zval_dump, but that's barely usable).
Sean's debug_zval_dump function looks like it will do the job of telling you the refcount, but really, the refcount doesn't help you in the long run.
You should consider using a bounded array to act as a cache; something like this:
<?php
class object_cache {
var $objs = array();
var $max_objs = 1024; // adjust to fit your use case
function add($obj) {
$key = $obj->getKey();
// remove it from its old position
unset($this->objs[$key]);
// If the cache is full, retire the eldest from the front
if (count($this->objs) >= $this->max_objs) {
$dead = array_shift($this->objs);
// commit any pending changes to db/disk
$dead->flushToStorage();
}
// (re-)add this item to the end
$this->objs[$key] = $obj;
}
function get($key) {
if (isset($this->objs[$key])) {
$obj = $this->objs[$key];
// promote to most-recently-used
unset($this->objs[$key]);
$this->objs[$key] = $obj;
return $obj;
}
// Not cached; go and get it
$obj = $this->loadFromStorage($key);
if ($obj) {
$this->objs[$key] = $obj;
}
return $obj;
}
}
Here, getKey() returns some unique id for the object that you want to store.
This relies on the fact that PHP remembers the order of insertion into its hash tables; each time you add a new element, it is logically appended to the array.
The get() function makes sure that the objects you access are kept at the end of the array, so the front of the array is going to be least recently used element, and this is the one that we want to dispose of when we decide that space is low; array_shift() does this for us.
This approach is usually called a least-recently-used (LRU) cache, after the item it evicts: it keeps the most recently used items and retires the least recently used one. The idea is that you are more likely to access the items that you have accessed most recently, so you keep them around.
What you get here is the ability to control the maximum number of objects that you keep around, and you don't have to poke around at the php implementation details that are deliberately difficult to access.
It seems like the best answer was still getting the reference count, although debug_zval_dump and ob_start was too ugly a hack to include in my application.
Instead I coded up a simple PHP module with a refcount() function, available at: http://github.com/qix/php_refcount
Yes, you can definitely get the refcount from PHP. Unfortunately, PHP has no built-in accessor for it, so it isn't easy to get. That's OK, because we have PREG!
<?php
function refcount($var)
{
ob_start();
debug_zval_dump($var);
$dump = ob_get_clean();
$matches = array();
preg_match('/refcount\(([0-9]+)/', $dump, $matches);
$count = $matches[1];
//3 references are added, including when calling debug_zval_dump()
return $count - 3;
}
?>
Source: PHP.net
I know this is a very old issue, but it still came up as a top result in a search so I thought I'd give you the "correct" answer to your problem.
Unfortunately getting the reference count as you've found is a minefield, but in reality you don't need it for 99% of problems that might want it.
What you really want to use is the WeakRef class, quite simply it holds a weak reference to an object, which will expire if there are no other references to the object, allowing it to be cleaned up by the garbage collector. It needs to be installed via PECL, but it really is something you want in every PHP installation.
You would use it like so (please forgive any typos):
class Cache {
private $max_size;
private $cache = [];
private $expired = 0;
public function __construct(int $max_size = 1024) { $this->max_size = $max_size; }
public function add(int $id, object $value) {
unset($this->cache[$id]);
$this->cache[$id] = new WeakRef($value);
if (($this->max_size > 0) && (count($this->cache) > $this->max_size)) {
$this->prune();
if (count($this->cache) > $this->max_size) {
array_shift($this->cache);
}
}
}
public function get(int $id) { // ?object
if (isset($this->cache[$id])) {
$ref = $this->cache[$id];
$result = $ref->get();
if ($result === null) {
// Prune if the cache gets too empty
if (++$this->expired > count($this->cache) / 4) {
$this->prune();
}
} else {
// Move the weak reference (not the object itself) to the end so it is culled last
unset($this->cache[$id]);
$this->cache[$id] = $ref;
}
return $result;
}
return null;
}
protected function prune() {
$this->cache = array_filter($this->cache, function($value) {
return $value->valid();
});
}
}
This is the overkill version that uses both weak references and a max size (set it to -1 to disable that). Basically if it gets too full or too many results were expired, then it will prune the cache of any empty references to make space, and only drop non-empty references if it has to for sanity.
PHP 7.4 now has WeakReference
To know if $obj is referenced by something else or not, you could use:
// 1: create a weak reference to the object
$wr = WeakReference::create($obj);
// 2: unset our reference
unset($obj);
// 3: test if the weak reference is still valid
$res = $wr->get();
if (!is_null($res)) {
// a handle to the object is still held somewhere else in addition to $obj
$obj = $res;
unset($res);
}
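Building on the WeakReference snippet above, here is a minimal sketch of how the factory from the question might keep one object per row id without keeping those objects alive. It assumes PHP 7.4 or later; IdentityMap and its methods are hypothetical names, not part of PHP.

```php
<?php
final class IdentityMap
{
    /** @var array<int, WeakReference> weak references keyed by row id */
    private array $refs = [];

    public function add(int $id, object $obj): void
    {
        // A weak reference does not keep the object alive.
        $this->refs[$id] = WeakReference::create($obj);
    }

    // Returns the live object for this id, or null if it has been freed.
    public function get(int $id): ?object
    {
        if (!isset($this->refs[$id])) {
            return null;
        }
        $obj = $this->refs[$id]->get();
        if ($obj === null) {
            unset($this->refs[$id]); // drop the dead entry
        }
        return $obj;
    }

    // Removes entries whose objects are gone; returns how many were removed.
    public function prune(): int
    {
        $before = count($this->refs);
        $this->refs = array_filter(
            $this->refs,
            fn(WeakReference $ref) => $ref->get() !== null
        );
        return $before - count($this->refs);
    }
}
```

Once the rest of the application drops its last reference to a row object, PHP can free it, and the map simply reports null for that id. Calling prune() every few hundred rows keeps the array of dead references from growing.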
I had a similar problem with the Incredibly Flexible Data Storage (IFDS) file format with trying to keep track of references to objects in an in-memory data cache. How I solved it was to create a ref-counting class that wrapped a reference to the underlying array. I generally prefer arrays over objects as PHP has traditionally tended to handle arrays better than objects with regards to unfortunate things like memory leaks.
class IFDS_RefCountObj
{
public $data;
public function __construct(&$data)
{
$this->data = &$data;
$this->data["refs"]++;
}
public function __destruct()
{
$this->data["refs"]--;
}
}
Since 'refs' is tracked as a regular value in the data, it is possible to know when the last reference to the data has gone away. Regardless of whether multiple variables reference the refcounting object or it is cloned, the refcount will always be non-zero until all references are gone. I don't need to care how many actual references there are internally in PHP as long as the value is correctly zero vs. non-zero. The IFDS implementation also tracks an estimated amount of RAM being used by each object (again, being exact isn't super important as long as it is in the ballpark), allowing it to prioritize writing and releasing unused objects that are occupying system resources first and then writing and releasing portions of still-referenced objects that are caching large quantities of DATA chunk information.
To get back to the topic/question, with this ref-counting class-based approach, it is, for example, mostly straightforward to prune to ~5,000 records in a cache upon hitting 10,000 records in the cache. General strategy is to not get rid of records still being referenced plus keep the most recently requested/used records that aren't being referenced because they are likely to be referenced again. Upon every new reference, unset() and then setting the item again will move the item to the end of the array so that the oldest probably unreferenced items appear first and the newest probably still referenced items appear last.
Weak references, as several people have suggested, won't solve every caching issue. They don't work in caching scenarios where you don't want to remove an item from the cache until the application is done working with it (i.e. deleting an item that the application later attempts to use) but also want to keep it around as long as RAM overhead permits even if the application stops referencing it temporarily but might need it again in a moment. Weak references are also incapable of working in scenarios where the item in the cache is holding onto unwritten data that may or may not be fine with staying unwritten even if there are no references to it in the application. In short, when there is a balancing act to maintain, weak references cannot be used.
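To make the ref-counting wrapper and the pruning strategy described above more concrete, here is a small sketch. It is not the IFDS implementation: RefHandle is a hypothetical variant of IFDS_RefCountObj that tolerates a missing 'refs' key, and pruneUnreferenced() is a hypothetical helper that assumes the oldest records sit at the front of the array.

```php
<?php
// Hypothetical handle that counts itself in the record it wraps.
final class RefHandle
{
    public $data;

    public function __construct(array &$data)
    {
        $this->data = &$data;
        // Initialise the counter if the record does not have one yet.
        $this->data['refs'] = ($this->data['refs'] ?? 0) + 1;
    }

    public function __destruct()
    {
        $this->data['refs']--;
    }
}

// Drops unreferenced records from the front until at most $keep remain.
// Records that still have a handle are never removed.
function pruneUnreferenced(array &$cache, int $keep): void
{
    foreach ($cache as $key => $record) {
        if (count($cache) <= $keep) {
            break;
        }
        if (($record['refs'] ?? 0) === 0) {
            unset($cache[$key]);
        }
    }
}
```

A record stays in the cache while any RefHandle for it exists; when the last handle is destroyed, its 'refs' value drops back to zero and the record becomes eligible for the next prune.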
