Is there a more efficient way to update a JSON file? - php

I'm developing a browser-based game, and for the combat instances I need to be able to track the player's hit points as well as the NPC's hit points. I'm thinking setting up a JSON file for each instance makes more sense than having a MySQL db get hammered with requests constantly. I've managed to create the JSON file, pull the contents, update the relevant vars, then overwrite the file, but I'm wondering if there's a more efficient way of handling it than how I've set it up.
// Build the full combat-instance state, encode it and overwrite <id>.json
$new_data = array(
    "id" => "$id",
    "master_id" => "$master_id",
    "leader" => "$leader",
    "group" => "$group",
    "ship_1" => "$ship_1",
    "ship_2" => "$ship_2",
    "ship_3" => "$ship_3",
    "date_start" => "$date_start",
    "date_end" => "$date_end",
    "public_private" => "$public_private",
    "passcode" => "$passcode",
    "npc_1" => "$npc_1",
    "npc_1_armor" => "$npc_1_armor",
    "npc_1_shields" => "$npc_1_shields",
    "npc_2" => "$npc_2",
    "npc_2_armor" => "$npc_2_armor",
    "npc_2_shields" => "$npc_2_shields",
    "npc_3" => "$npc_3",
    "npc_3_armor" => "$npc_3_armor",
    "npc_3_shields" => "$npc_3_shields",
    "npc_4" => "$npc_4",
    "npc_4_armor" => "$npc_4_armor",
    "npc_4_shields" => "$npc_4_shields",
    "npc_5" => "$npc_5",
    "npc_5_armor" => "$npc_5_armor",
    "npc_5_shields" => "$npc_5_shields",
    "ship_turn" => "$ship_turn",
    "status" => "$status"
);
$new_data = json_encode($new_data);
$file = "$id.json";
file_put_contents($file, $new_data);
It works, but I'm wondering if there is a way to update a single array item without having to pull ALL the data out, assign it to vars, and rewrite the file. In this example, I'm only changing one var (ship_turn).

I'm thinking setting up a JSON file for each instance makes more sense than having a MySQL db get hammered with requests constantly.
MySQL is optimized for this task.
If you use files (like JSON) as a database replacement, then you have to deal with race conditions, because file access is not optimized for concurrent read/write access (by default).
If you're in a high-concurrency environment you should avoid using the filesystem as a "database". Multiple operations on the file system are very hard to make atomic in PHP.
See flock for more details.
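For example, a minimal sketch of a flock-protected read-modify-write that updates only one field (ship_turn, following the question's structure), assuming the file already exists:

// Update a single field in <id>.json under an exclusive lock.
$file = "$id.json";
$fp = fopen($file, 'c+');              // open for read/write without truncating
if ($fp === false) {
    die("Could not open $file");
}
if (flock($fp, LOCK_EX)) {             // block until we hold the exclusive lock
    $raw  = stream_get_contents($fp);
    $data = json_decode($raw, true) ?: array();

    $data['ship_turn'] = $ship_turn;   // change just the one value

    ftruncate($fp, 0);                 // clear the old contents
    rewind($fp);
    fwrite($fp, json_encode($data));
    fflush($fp);                       // make sure it hits disk before unlocking
    flock($fp, LOCK_UN);
}
fclose($fp);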

It depends on the game. For a turn-based game or any non-real-time game a MySQL approach should be OK. After all, databases are designed to get hammered heavily :-) For real-time games I would go for WebSockets and Node.js as the backend. The server would keep a runtime state of the game, reacting appropriately to the client requests and dealing with race conditions (as you would on a stand-alone multiplayer server).

Related

Handling big arrays in PHP

The application I am working on needs to obtain a dataset of around 10 MB at most twice an hour. We use that dataset to display paginated results on the site; a simple search by one of the object properties should also be possible.
Currently we are thinking about 2 different ways to implement this:
1.) Store the JSON dataset in the database or in a file in the file system, read it and loop over it to display results whenever we need to.
2.) Store the JSON dataset in a relational MySQL table, query the results and loop over them whenever we need to display them.
Replacing/refreshing the results has to be done multiple times per hour, as I said.
Both ways have cons. I am trying to choose the way that is less evil overall. Reading 10 MB into memory is not a lot, but on the other hand rewriting a table a few times an hour could produce conflicts, in my opinion.
My concern regarding 1.) is how safe the app will be if we read 10 MB into memory all the time. What will happen if multiple users do this at the same time? Is this something to worry about, or is PHP able to handle this in the background?
What do you think would be best for this use case?
Thanks!
When php runs on a web server (as it usually does) the server starts new php processes on demand when they're needed to handle concurrent requests. A powerful web server may allow fifty or so php processes. If each of them is handling this large data set, you'll need to have enough RAM for fifty copies. And, you'll need to load that data somehow for each new request. Reading 10mb from a file is not an overwhelming burden unless you have some sort of parsing to do. But it is a burden.
As it starts to handle each request, php offers a clean context to the programming environment. php is not good at maintaining in-RAM context from one request to the next. You may be able to figure out how to do it, but it's a dodgy solution. If you're running on a server that's shared with other web applications -- especially applications you don't trust -- you should not attempt to do this; the other applications will have access to your in-RAM data.
You can control the concurrent processes with Apache or nginx configuration settings, and restrict it to five or ten copies of php. But if you have a lot of incoming requests, those requests get serialized and they will slow down.
Will this application need to scale up? Will you eventually need a pool of web servers to handle all your requests? If so, the in-RAM solution looks worse.
Does your json data look like a big array of objects? Do most of the objects in that array have the same elements as each other? If so, that maps naturally onto a SQL table: you can make a table in which the columns correspond to the elements of your objects. Then you can use SQL to avoid touching every row -- every element of the array -- every time you display or update data.
(The same sort of logic applies to Mongo, Redis, and other ways of storing your data.)
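As a rough illustration of the table idea above, a sketch of loading such a JSON array into MySQL with PDO; the table and column names (items, name, price) are invented for the example:

// Load the JSON array of objects and insert each one as a row.
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$objects = json_decode(file_get_contents('dataset.json'), true);

$stmt = $pdo->prepare('INSERT INTO items (name, price) VALUES (:name, :price)');

$pdo->beginTransaction();              // one transaction keeps the refresh fast and atomic
foreach ($objects as $obj) {
    $stmt->execute(array(
        ':name'  => $obj['name'],
        ':price' => $obj['price'],
    ));
}
$pdo->commit();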

Optimization: Where to process data? Database, Server or Client?

I've been thinking a lot about optimization lately. I'm developing an application that makes me think about where I should process data, considering the balance between server load, memory, client load, loading speed, size, etc.
I want to understand better how experienced programmers optimize their code when thinking about processing. Take the following 3 options:
Do some processing on the database level, when I'm getting the data.
Process the data on PHP
Pass the raw data to the client, and process with javascript.
Which would you guys prefer on which occasions and why? Sorry for the broad question, I'd also be thankful if someone could recommend me good reading sources on this.
The database is the heart of any application, so you should keep the load on it as light as possible. Here are some suggestions:
Get only the required fields from the database.
Two simple queries are better than a single complex query.
Get data from the database, process it with PHP and then store the processed data in temporary storage (a cache such as Memcache, Couchbase or Redis; see the sketch after this list). This data should be set with an expiry time, which depends entirely on the type of data. Caching will reduce your database load to a great extent.
Data is stored in normalized form, but if you know in advance that some data will be requested and producing it requires joins across many tables, that processed data can be prepared in advance, stored in a separate table and served from there.
Send as little data as possible to the client side. A smaller HTML payload saves bandwidth and lets the browser render the page quickly.
Load data on demand (using AJAX, lazy loading, etc.); e.g. if an image is not visible on a page until the user clicks a tab, that image should only be loaded on that click.
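As a rough sketch of the caching suggestion above, using PHP's Memcached extension together with PDO; the cache key, query and 5-minute expiry are just example values:

// Cache-aside: try the cache first, fall back to the database, then cache the result.
$cache = new Memcached();
$cache->addServer('127.0.0.1', 11211);

$key  = 'top_products';            // example cache key
$data = $cache->get($key);

if ($data === false) {             // cache miss: hit the database once
    $pdo  = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');
    $stmt = $pdo->query('SELECT id, name FROM products ORDER BY sales DESC LIMIT 10');
    $data = $stmt->fetchAll(PDO::FETCH_ASSOC);

    $cache->set($key, $data, 300); // expire after 5 minutes
}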
Two thoughts: Computers should work, people should think. (IBM ad from the 1960s.)
"Premature optimization is the root of all evil (or at least most of it) in programming." --Donald Knuth
Unless you are, or are planning to become, Google or Amazon or Facebook, you should focus on functionality. "Make it work before you make it fast." If you are planning to grow to that size, do what they did: throw hardware at the problem. It is cheaper and more likely to be effective.
Edited to add: Since you control the processing power on the server, but probably not on the client, it is generally better to put intensive tasks on the server, especially if the clients are likely to be mobile devices. However, consider network latency, bandwidth requirements, and response time. If you can improve response time by processing on the client, then consider doing so. So, optimize the user experience, not the CPU cycles; you can buy more CPU cycles when you need them.
Finally, remember that the client cannot be trusted. For that reason, some things must be on the server.
So as a rule of thumb, process as much of the data in the database as possible. The cost of creating a new connection to query is very high, so you want to limit it as much as possible. Even if you have to write some very ugly SQL, performing a JOIN will almost always be quicker than performing 2 SELECT statements.
PHP should really only be used to format and cache data. If you are performing a ton of data operations after every request, you are probably storing your data in a format that's not very practical. You want to cache anything that does not change often, in an almost ready-to-serve state, using something like Redis or APCu.
Finally, the client should never be performing data operations on more than a few objects. You never know the client's resource availability, so always keep the client data lean. Perform pagination and sorting on any data set larger than a few dozen items in the back-end. An AJAX request using AngularJS is usually just as quick as performing a sort on 100+ items on an iPad 2.
If you would like further details on any aspect of this answer please ask and I will do my best to provide examples or additional detail.
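For instance, a minimal sketch of back-end pagination and sorting with PDO; the articles table and its columns are invented for the example:

// Return one page of rows, sorted server-side, instead of shipping the whole set.
$perPage = 25;
$page    = isset($_GET['page']) ? max(1, (int) $_GET['page']) : 1;
$offset  = ($page - 1) * $perPage;

$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');
$sql = sprintf('SELECT id, title FROM articles ORDER BY created_at DESC LIMIT %d OFFSET %d',
               $perPage, $offset);     // ints via sprintf, so no injection risk here
$rows = $pdo->query($sql)->fetchAll(PDO::FETCH_ASSOC);

header('Content-Type: application/json');
echo json_encode($rows);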

large dataset for parsing in webpage

I have a large dataset of around 600,000 values that need to be compared, swapped, etc. on the fly for a web app. The entire data must be loaded since some calculations will require skipping values, comparing out of order, and so on.
However, each value is only 1 byte
I considered loading it as a giant JSON array, but this page makes me think that might not work dependably: http://www.ziggytech.net/technology/web-development/how-big-is-too-big-for-json/
At the same time, forcing the server to load it all for every request seems like a waste of server resources, since the clients can do the number crunching just as easily.
So I guess my question is this:
1) Is this possible to do reliably in jQuery/Javascript, and if so how?
2) If jQuery/Javascript is not the better option, what would be the best way to do this in PHP (read in files vs. giant arrays via include?)
Thanks!
I know Apache Cordova can make sql queries.
http://docs.phonegap.com/en/2.7.0/cordova_storage_storage.md.html#Storage
I know it's PhoneGap but it works on desktop browsers (At least all the ones I've used for phone app development)
So my suggestion:
Mirror your database in each user's local Cordova database, then run all the SQL queries you want!
Some tips:
-Transfer data from your server to the webapp via JSON
-Break the data requests down into a few parts. That way you can easily provide a progress bar instead of waiting for the entire database to download
-Create a table with one entry that keeps the current version of your database, and check this table before you send all that data. Change it each time you want to 'force' an update. This keeps the user's database up to date and lowers bandwidth
If you need a push in the right direction I have done this before.
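On the PHP side, the version-check and chunked-transfer tips above could look roughly like this; the db_version and game_data tables and the chunk parameter are hypothetical:

// Hypothetical endpoint: report the current data version, or send one chunk of rows as JSON.
$pdo = new PDO('mysql:host=localhost;dbname=game', 'user', 'pass');

if (isset($_GET['check_version'])) {
    // The client compares this against its locally stored version before downloading anything.
    $version = $pdo->query('SELECT version FROM db_version LIMIT 1')->fetchColumn();
    echo json_encode(array('version' => $version));
    exit;
}

// Otherwise send one chunk of the data so the client can show a progress bar.
$chunk = isset($_GET['chunk']) ? max(0, (int) $_GET['chunk']) : 0;
$size  = 500;                                     // rows per chunk
$sql   = sprintf('SELECT * FROM game_data LIMIT %d OFFSET %d', $size, $chunk * $size);
$rows  = $pdo->query($sql)->fetchAll(PDO::FETCH_ASSOC);

header('Content-Type: application/json');
echo json_encode($rows);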

PHP/HTTP non-stateless

PHP uses cookies, sessions or databases (and ORMs) in order to remember data (so it is not lost after a single HTTP request). However, in Java (I mean servlets etc.) there is another solution: in brief, you may choose different scopes for an object (how long it exists). Besides session scope or the simple single-HTTP-request lifetime, an object can "live" for the whole HTTP server runtime and can be initialized at the startup of the HTTP server.
Data can therefore be shared between different users / sessions, and no database requests are required (which would decrease the efficiency of the whole web application). (I mean they're not required while the HTTP server is already running - the object and its state are "remembered".)
(And I already do as much as I can to reduce SQL requests, even using PHP arrays for frequently read but never modified DB data.)
What I need in PHP is a way to:
Remember (store somewhere) data that can be changed and shared between many users, but not in the DB.
Without using sessions (or cookies), keep data across many requests (e.g. not a single AJAX request but many requests to the same URL), which of course must be stored somewhere else for some time. For instance, I want to read all the data (rows) with a single SQL request, remember it in PHP for a short period, and only then, one row at a time, send responses - say, each row in a separate response to the appropriate AJAX function.
Can anyone give me some hints on how I can achieve this in PHP, preferably the easiest possible way?
As a preface to this answer (which I'm sure you've already grasped), PHP's execution model essentially 'restarts' the process between requests and as such storage of anything cross-request in PHP alone is unachievable.
That leaves you with a few options, and they are all really some flavour of "database":
Use a simple key-value in-memory persistence layer, like memcached or Redis (a sketch follows below)
Use a noSQL solution with a bit more structure (and consistency, should this be required) that still works in-memory and is comparably quicker than an RDB
Use an RDBMS, because it'll work great, and the quantity of traffic you'd need to topple a well-designed schema on moderate hardware is probably much higher than you think
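As a small sketch of the first option, using the phpredis extension to share a value across requests and users without touching MySQL (the key names are made up):

// Shared, cross-request state kept in Redis instead of sessions or the database.
$redis = new Redis();
$redis->connect('127.0.0.1', 6379);

// Any PHP worker handling any user's request sees the same value.
$visits = $redis->incr('page:visits');       // atomic counter shared by everyone

// Structured data works too: serialize to JSON and set a short expiry.
$redis->setex('world:state', 60, json_encode(array('tick' => $visits)));
$state = json_decode($redis->get('world:state'), true);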
HTH

How to improve on PHP's XML loading time?

Dropping my lurker status to finally ask a question...
I need to know how I can improve on the performance of a PHP script that draws its data from XML files.
Some background:
I've already mapped the bottleneck to CPU - but want to optimize the script's performance before taking a hit on processor costs. Specifically, the most CPU-consuming part of the script is the XML loading.
The reason I'm using XML to store the object data is that the data needs to be accessible via a browser Flash interface, and we want to provide fast user access in that area. The project is still in its early stages though, so if best practice would be to abandon XML altogether, that would be a good answer too.
Lots of data: Currently plotting for roughly 100k objects, albeit usually small ones - and they must ALL be taken up into the script, with perhaps a few rare exceptions. The data set will only grow with time.
Frequent runs: Ideally, we'd run the script ~50k times an hour; realistically, we'd settle for ~1k/h runs. This coupled with data size makes performance optimization completely imperative.
I've already taken the optimization step of making several runs on the same data rather than loading it for each run, but it's still taking too long. The runs should generally use "fresh" data with the modifications done by users.
Just to clarify: is the data you're loading coming from XML files for processing in its current state and is it being modified before being sent to the Flash application?
It looks like you'd be better off using a database to store your data and pushing out XML as needed rather than reading it in XML first; if building the XML files gets slow you could cache files as they're generated in order to avoid redundant generation of the same file.
If the XML stays relatively static, you could cache it as a PHP array, something like this:
<xml><foo>bar</foo></xml>
is cached in a file as
<?php return array('foo' => 'bar');
It should be faster for PHP to just include the arrayified version of the XML.
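A small sketch of that idea, assuming a hypothetical data.xml and cache file; the cache is rebuilt only when the XML is newer:

// Rebuild the PHP-array cache only when the XML file has changed.
$xmlFile   = 'data.xml';
$cacheFile = 'data.cache.php';

if (!file_exists($cacheFile) || filemtime($xmlFile) > filemtime($cacheFile)) {
    $xml  = simplexml_load_file($xmlFile);
    $data = json_decode(json_encode($xml), true);   // quick SimpleXML -> array conversion
    file_put_contents($cacheFile, '<?php return ' . var_export($data, true) . ';');
}

$data = include $cacheFile;    // much cheaper than re-parsing the XML on every run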
~1k/hour over 3,600 seconds per hour is roughly one run every 3-4 seconds (let alone the 50k/hour, which would be around 14 runs a second)...
There are many questions. Some of them are:
Does your php script need to read/process all records of the data source for each single run? If not, what kind of subset does it need (~size, criteria, ...)?
Same question for the Flash application, plus: who's sending it the data? The php script? A "direct" request for the complete, static xml file?
What operations are performed on the data source?
Do you need some kind of concurrency mechanism?
...
And just because you want to deliver xml data to the Flash clients, it doesn't necessarily mean that you have to store xml data on the server. If, e.g., the clients only need a tiny subset of the available records, it is probably a lot faster not to store the data as xml but as something more suited to speed and "searchability", and then create the xml output for the subset on the fly, maybe assisted by some caching depending on what data the clients request and how much/how often the data changes.
edit: Let's assume that you really, really need the whole dataset and need a continuous simulation. Then you might want to consider a continuous process that keeps the complete "world model" in memory and operates on this model on each run (world tick). This way, at least, you wouldn't have to load the data on each tick. But such a process is usually written in something other than php.
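Purely as an illustration of that last idea (and noting, as above, that such processes are usually not written in php), a bare-bones tick loop for a long-running PHP CLI script; load_world(), advance_world() and publish_world() are placeholders for your own logic:

// Hypothetical long-running CLI process: load the world once, then tick forever.
$world = load_world();                 // e.g. parse the XML / query the DB a single time

while (true) {
    $start = microtime(true);

    $world = advance_world($world);    // apply one simulation step to the in-memory model
    publish_world($world);             // push results out (cache, socket, files, ...)

    // Keep roughly one tick per second regardless of how long the step took.
    $elapsed = microtime(true) - $start;
    if ($elapsed < 1.0) {
        usleep((int) ((1.0 - $elapsed) * 1000000));
    }
}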
