My PHP script is triggered via POST from different sources.
There is an array to which elements are added and from which they are deleted.
I save this array with
$jsonString = json_encode($curList);
file_put_contents("curList.obj", $jsonString);
and read it with
$stringified1 = file_get_contents("curList.obj") ?: '';
$curList = json_decode($stringified1, true) ?: [];
This usually works fine. But now I ran into trouble, because several instances run at the same time and read and write the array in the same time period. This creates problems, because some information gets lost.
Is it possible to make sure that another instance has to wait until the read and update of the file is finished?
Or is it possible to allow only one instance of the script to run at a time, without any POST getting lost?
Thanks to @Maksim I found the following solution:
$file = "curList.obj";
$lock = "curList.lock";

// Spin until we manage to create the lock file exclusively.
while (true) {
    if (!file_exists($lock)) {
        // 'x' fails if the file already exists, so only one instance
        // can win this race; @ suppresses the warning when another
        // instance created the file first.
        $fHandle = @fopen($lock, 'x');
        if ($fHandle !== false) {
            break;
        }
    }
    usleep(20);
}

// Read, update and write the list while holding the lock.
$stringified1 = file_get_contents($file) ?: '';
$curList = json_decode($stringified1, true) ?: [];
$curList[] = $_POST;

// save array
$jsonString = json_encode($curList);
file_put_contents($file, $jsonString);

// release lock handle and delete lock
fclose($fHandle);
unlink($lock);
Why this solution?
I am using IIS 10 and there is a problem with IIS and flock. Therefore I built a kind of wrapper around the read and write of the data update.
It uses an additional curList.lock file.
First I check whether the file exists; if not, I create it with fopen($lock, 'x'). The mode 'x' only creates the file if it does not already exist. Otherwise fopen fails and the routine stays in the while loop until it gets access.
Try using sessions in your project. With sessions you can temporarily save different data without conflicts.
This storage is unique to each user, but it is not kept forever.
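A minimal sketch of what that could look like (the 'curList' session key is illustrative); note that because sessions are per-user, this only avoids conflicts between requests from the same user:
// Minimal sketch: per-user session storage ('curList' key is illustrative).
session_start();
// Initialise the list on first use.
if (!isset($_SESSION['curList'])) {
    $_SESSION['curList'] = [];
}
// Append the incoming POST data; PHP persists the session automatically.
$_SESSION['curList'][] = $_POST;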
I already have a PHP script to upload a CSV file: it's a collection of tweets associated with a Twitter account (aka a brand). BTW, thanks T.A.G.S :)
I also have a script to parse this CSV file: I need to extract emojis, hashtags, links, retweets, mentions, and many more details I need to compute for each tweet (it's for my research project on digital affectiveness; I've already stored 280k tweets, with 170k emojis inside).
Then each tweet and its metrics are saved in a database (table TWEETS), as well as emojis (table EMOJIS), as well as account stats (table BRANDS).
I use a class quite similar to this one: CsvImporter > https://gist.github.com/Tazeg/b1db2c634651c574e0f8. I made a loop to parse each line one by one.
$importer = new CsvImporter($uploadfile, true);
while ($content = $importer->get(1)) {
    $pack = $content[0];
    $data = array();
    foreach ($pack as $key => $value) {
        $data[] = $value;
    }
    $id_str = $data[0];
    $from_user = $data[1];
    ...
After all my computations, I "INSERT INTO TWEETS VALUES(...)", same with EMOJIS. After that, I have to perform some other operations:
update the reach for each id_str (if a tweet I saved is a reply to a previous tweet)
save stats to the table BRAND
All these operations are scripted in a single file, insert.php, and triggered when I submit my upload form.
But everything falls down if there are too many tweets. My server cannot handle such long operations.
So I wonder if I can ajaxify parts of the process, especially the loop:
upload the file
parse 1 CSV line, save it in SQL, and display an 'OK' message each time a tweet is saved
compute all other things (reach and brand stats)
I'm not familiar enough with $.ajax(), but I guess there is something to do with beforeSend, success, complete and the other Ajax events. Or maybe I am completely wrong!?
Is there anybody who can help me?
As far as I can tell, you can lighten the load on your server substantially, because $pack is already an array of values, so there is no need for the key/value loop.
You can also write the mapping of values from the CSV row more idiomatically. Unless you know the CSV file is likely to be huge, you should also fetch multiple lines at a time:
$importer = new CsvImporter($uploadfile, true);
// get as many lines as possible at once...
while ($content = $importer->get()) {
// this loop works whether you get 1 row or many...
foreach ($content as $pack) {
list($id_str, $from_user, ...) = $pack;
// rest of your line processing and SQL inserts here....
}
}
You could also go on from this and insert multiple lines into your database in a single INSERT statement, which is supported by most SQL databases.
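A sketch of what that might look like with PDO (the table and column names are assumptions based on your description, and $pdo stands for your existing database connection):
// Sketch: one multi-row INSERT instead of one INSERT per row.
// Table/columns (TWEETS, id_str, from_user) are assumptions.
$rows = [
    ['123', 'alice'],
    ['456', 'bob'],
];
$placeholders = implode(', ', array_fill(0, count($rows), '(?, ?)'));
$stmt = $pdo->prepare("INSERT INTO TWEETS (id_str, from_user) VALUES $placeholders");
// Flatten the rows into one flat list of bound parameters.
$stmt->execute(array_merge(...$rows));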
// Read every CSV line into $entries.
$entries = array();
$f = fopen($filepath, "r");
while (($line = fgetcsv($f, 10000, ",")) !== false) {
    array_push($entries, $line);
}
fclose($f);
Try this; it may help.
Okay so I have a button. When pressed it does this:
JavaScript
$("#csv_dedupe").live("click", function(e) {
file_name = 'C:\\server\\xampp\\htdocs\\Gene\\IMEXporter\\include\\files\\' + $("#IMEXp_import_var-uploadFile-file").val();
$.post($_CFG_PROCESSORFILE, {"task": "csv_dupe", "file_name": file_name}, function(data) {
alert(data);
}, "json")
});
This ajax call gets sent out to this:
PHP
class ColumnCompare {
    function __construct($column) {
        $this->column = $column;
    }
    function compare($a, $b) {
        if ($a[$this->column] == $b[$this->column]) {
            return 0;
        }
        return ($a[$this->column] < $b[$this->column]) ? -1 : 1;
    }
}
if ($task == "csv_dupe") {
    $file_name = $_REQUEST["file_name"];
    // Hard-coded input
    $array_var = array();
    $sort_by_col = 9999;
    //Open csv file and dump contents
    if (($handler = fopen($file_name, "r")) !== FALSE) {
        while (($csv_handler = fgetcsv($handler, 0, ",")) !== FALSE) {
            $array_var[] = $csv_handler;
        }
        fclose($handler);
    }
    //copy original csv data array to be compared later
    $array_var2 = $array_var;
    //Find email column
    $new = $array_var[0];
    $findme = 'email';
    $counter = 0;
    foreach ($new as $key) {
        $pos = strpos($key, $findme);
        if ($pos === false) {
            $counter++;
        } else {
            $sort_by_col = $counter;
        }
    }
    if ($sort_by_col === 9999) { //matches the sentinel set above
        echo 'COULD NOT FIND EMAIL COLUMN';
        return;
    }
    //Temporarily remove headers from array
    $headers = array_shift($array_var);
    //Create object for sorting by a particular column
    $obj = new ColumnCompare($sort_by_col);
    usort($array_var, array($obj, 'compare'));
    //Remove duplicates from a column
    array_unshift($array_var, $headers);
    $newArr = array();
    foreach ($array_var as $val) {
        $newArr[$val[$sort_by_col]] = $val;
    }
    $array_var = array_values($newArr);
    //Write CSV back to the file
    $sout = fopen($file_name, 'w');
    foreach ($array_var as $fields) {
        fputcsv($sout, $fields);
    }
    fclose($sout);
    //How many dupes were there?
    $number = count($array_var2) - count($array_var);
    echo json_encode($number);
}
This PHP gets all the data from a CSV file, columns and rows, and using the fgetcsv function it assigns all the data to an array. Now I have code in there that also dedupes (finds and removes a copy of a duplicate) the CSV file by a single column, keeping intact the row and column structure of the entire array.
The only problem is, even though it works with the small files of 10 or so rows that I tested, it does not work for files with 25,000 rows.
Now before you say it, I have gone into my php.ini file and changed max_input, filesize, max execution time etc. to astronomical values to ensure PHP can accept file sizes of upwards of 999999999999999MB and run its script for a few hundred years.
I used a file with 25,000 records and executed the script. It's been two hours and Fiddler still shows that an HTTP response has not yet been sent back. Can someone please give me some ways that I can optimize my server and my code?
I was able to use that code from a user who helped me in another question I posted on how to even do this initially. My concern now is that even though I tested it to work, I want to know how to make it work in less than a minute. Excel can dedupe a column of a million records in a few seconds, why can't PHP do this?
Sophie, I assume that you are not experienced at writing this type of application because IMO this isn't the way to approach this. So I'll pitch this accordingly.
When you have a performance problem like this, you really need to binary-chop the problem to understand what is going on. So step 1 is to decouple the PHP timing problem from AJAX and get a simple understanding of why your approach is so unresponsive. Do this using a locally installed PHP-CGI, or even use your web install and issue a header('Content-Type: text/plain') and dump out microtiming of each step. How long does the CSV read take, ditto the sort, then the dedup, then the write? Do this for a range of CSV file sizes, going up by 10x in row count each time.
Also do a memory_get_usage() at each step to see how you are chomping up memory, because your approach is a real memory hog and you are probably erroring out by hitting the configured memory limits -- a phpinfo() will tell you these.
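A rough sketch of the kind of instrumentation meant here:
// Rough sketch: time and memory-profile each stage of the script.
header('Content-Type: text/plain');

$t = microtime(true);
// ... read the CSV into $array_var here ...
printf("read: %.3fs, %.1f MB\n", microtime(true) - $t, memory_get_usage() / 1048576);

$t = microtime(true);
// ... usort() here ...
printf("sort: %.3fs, %.1f MB\n", microtime(true) - $t, memory_get_usage() / 1048576);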
The read, dedup and write are all O(N), but the sort is O(N log N) at best and O(N²) at worst. Your sort is also calling a PHP method per comparison, so it will be slow.
What I don't understand is why you are even doing the sort, since your dedup algorithm does not make use of the fact that the rows are sorted (see the sketch below).
(BTW, the sort will also sort the header row in with the data, so you need to shift it off before you do the sort if you still want to do it.)
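To make that point concrete, a single keyed pass removes the duplicates with no sort at all (a sketch reusing the variable names from your code):
// O(N) dedup with no sort: rows sharing a value in the email column
// collapse to one entry because they land on the same array key.
$headers = array_shift($array_var);
$unique = array();
foreach ($array_var as $row) {
    $unique[$row[$sort_by_col]] = $row;
}
$array_var = array_values($unique);
array_unshift($array_var, $headers);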
There are other issues that you need to think about, such as:
Using a raw parameter as a filename makes you vulnerable to attack. Better to fix the path relative to, say, DOCROOT/Gene/IMEXporter/include and enforce some grammar on the file names.
You need to think about atomicity of reading and rewriting large files as a response to a web request -- what happens if two clients make the request at the same time?
Lastly, you compare this to Excel; well, loading and saving Excel files can take time, and Excel doesn't have to scale to respond to 10s or 100s of users at the same time. In a transactional system you typically use a D/B backend for this sort of thing, and if you are using a web interface to compute heavy tasks, you need to accept the Apache (or equivalent server) hard memory and timing constraints and chop your algorithms and approach accordingly.
In my application I have two pages: index and engine. Index has mostly jQuery scripts, the basic two of which are an autoload function for the engine.php file that runs every 3 minutes, and a "show more" button that shows 10 more posts when it is clicked.
$feeds = array('', '');
$entries = array();
foreach ($feeds as $feed) {
    $xml = simplexml_load_file($feed);
    $entries = array_merge($entries, $xml->xpath('/rss/channel//item'));
}
What I am looking for is a way to separate out the $feeds = array('',''); part, which is loaded with engine.php every 3 minutes or on every click.
What are your suggestions?
I'm not sure I understand your question, but global variables in PHP are those defined outside all functions. To use one inside a function, you have to declare it like this before using it:
$variable = ""; //this variable is global
function test(){
global $variable;
//now you can use your variable
}
If what you meant is that the value of your variable needs to be persisted between requests, then I would suggest you use the $_SESSION array (link).
If this does not answer your question let me know.
Edit for updated answer
You'll have to persist the $feeds array in some way. I think the easiest way would be to use the session, but that won't make the array available to all users like you mentioned. So I'd suggest you persist the list of feeds in a file; that way you can access the same information every time, for all users.
Doing that, you can create a new script that will update the file whenever you want (e.g. every 10 minutes), and engine.php will only have to work directly with the file.
Do you need some specific help with code for this proposed solution? Does it help you solve your problem?
Edit2
To save your $feeds array to a file, simply do the following (I guess this part you already know):
$fp = fopen("feeds.txt", "w"); // 'w' rather than 'a', so the file always holds just the current list
if ($fp) {
    foreach ($feeds as $line) {
        fwrite($fp, $line . "\n");
    }
    fclose($fp);
}
And to load that same file, you can do this:
$feeds = file("feeds.txt", FILE_IGNORE_NEW_LINES); //This will load each line, stripped of its newline, into a different position of the array $feeds.
If you want to serialize the $entries array, I would suggest taking a look at the serialize and unserialize functions.
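For example, a minimal sketch (the file name entries.txt is illustrative; note that SimpleXMLElement objects cannot be serialized directly, so cast them to XML strings first):
// Cast each SimpleXMLElement to its XML string, since SimpleXMLElement
// instances cannot be serialized directly.
$plain = array_map(function ($e) { return $e->asXML(); }, $entries);
file_put_contents("entries.txt", serialize($plain));
// Later: restore the array of XML strings.
$restored = unserialize(file_get_contents("entries.txt"));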
I am writing a script that checks for an updated version on an external server. I use this code in the config.php file to check for the latest version.
$data = get_theme_data('http://externalhost.com/style.css');
$latest_version = $data['Version'];
define('LATEST_VERSION', $latest_version);
This is fine and I can fetch the latest version (get_theme_data is a WordPress function), but the problem is that it will be executed on every single page load, which I do not want. I also do not want to check only when a form is submitted, for example. Alternatively, I was looking into some method to cache the result, or maybe check the version every set number of hours? Is such a thing possible, and how?
Here, gonna make it easy for you. Store the time of the next allowed check in a file.
function checkForUpdate() {
    // Read the timestamp before which no new check is needed;
    // @ suppresses the warning when the file does not exist yet.
    $file = @file_get_contents('./check.cfg', true);
    if ($file === false || $file === "") {
        // First run: create the file with the next check time.
        $fp = fopen('./check.cfg', 'w+');
        fwrite($fp, time() + 86400);
        fclose($fp);
    }
    if ((int)$file > time()) {
        echo "Do not update";
    } else {
        echo "Update";
        // Schedule the next check for 24 hours from now.
        $fp = fopen('./check.cfg', 'w+');
        fwrite($fp, time() + 86400);
        fclose($fp);
    }
}
You can obviously make this much more secure/efficient if you want to.
Edit: This function will check for an update once every day.
A scheduled task like this should be set up as a separate cron or at job. You can still write everything in PHP; just make a script that runs from the command line and does the updating. Check out "man crontab" for details, and/or check which scheduling services your server is running.
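For example, a crontab entry that runs a (hypothetical) update script once a day at 03:00 would look like this:
# Run the update check daily at 03:00 (script path is illustrative).
0 3 * * * php /path/to/check_update.php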
I have 3 questions that will greatly help me with my project that I am stuck on; after much narrowing down, these are the questions that arose from the solutions:
Can I use one PHP file to change a variable's value in another PHP file, and can these values also be read from one PHP file by another?
How can I use a cron job to change variable values within my PHP code?
Lastly, can cron read variable values in my PHP files? For example, if statements that will decide what to trigger and how to trigger it when the cron time comes?
I am a little new at cron and am going deeper into PHP, so I need all the expert help I can get. I can't use cURL or any frameworks.
Please prevent the hijacking of my topic; the data I want is simple: change $variable=1 in filenameA.php to $variable=2 using filenameB.php.
This is not a very good practice, but it's the simplest thing you can do:
You need three files: my_script.php, my_cron_job.php, and my_data.txt.
In the script that controls $data (this is called my_cron_job.php):
<?php
$values = array(
    "some_key" => "some_value",
    "anything" => "you want"
);
file_put_contents("my_data.txt", serialize($values));
Running it will also create my_data.txt.
Then, in my_script.php:
<?php
$data = unserialize(file_get_contents("my_data.txt"));
print_r($data); //if you want to look at what you've got.
I'm not sure what type of data you are exchanging between PHP files. I'm fairly new as well, but let's see what the community thinks of my answer. (Criticism welcome.)
I would have my PHP files write the common data to a txt file. When the cron job executes the PHP files, they can access/write to the txt file with the common data.
You seem to be describing a configuration file of some type.
I would recommend either an XML file or a database table.
For an XML file you could have something like:
<settings>
    <backup>
        <active>1</active>
        <frequency>daily</frequency>
        <script_file>backup.php</script_file>
    </backup>
    <reporting>
        <active>1</active>
        <frequency>weekly</frequency>
        <script_file>generate_report.php</script_file>
    </reporting>
    <time_chime>
        <active>1</active>
        <frequency>hourly</frequency>
        <script_file>ring_bell.php</script_file>
    </time_chime>
</settings>
then have some controller script, called hourly by cron, that reads the XML file and calls the scripts accordingly. Your crontab would look like:
0 * * * * php /path/to/script/cron_controller.php
and cron_controller.php would contain something like:
$run_time = time();
$cron_config = simplexml_load_file($conf_file_location);
if ($cron_config === false) die('failed to load config file');

foreach ($cron_config as $cron) {
    if ($cron->active != 1) continue; //cron must be active
    $run_script = false;
    switch ((string) $cron->frequency) {
        case 'hourly':
            $run_script = true;
            break;
        case 'daily':
            if (date('H', $run_time) == '00') //is it midnight?
                $run_script = true;
            break;
        case 'weekly':
            if (date('w:H', $run_time) == '0:00') //is it sunday at midnight?
                $run_script = true;
            break;
    }
    if ($run_script) {
        $script_file = (string) $cron->script_file;
        if (file_exists($script_file)) {
            echo "running $script_file\n";
            require($script_file);
        } else {
            echo "could not find $script_file\n";
        }
    }
}
And if you need to edit your configuration with PHP scripts, you can use SimpleXML to do it, then just save it back to the original location with $cron_config->saveXML($conf_file_location);
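For instance, a minimal sketch of flipping one setting off and saving, using the config layout above:
// Sketch: disable the hourly chime and write the config back.
$cron_config = simplexml_load_file($conf_file_location);
$cron_config->time_chime->active = 0;
$cron_config->saveXML($conf_file_location);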