Unpredictable log file writing in PHP

I have a script that runs every two minutes for a "Tweet-getter" application. In a nutshell it puts tweets onto Facebook. Every now and then it hiccups and despite my error checking, reposts old tweets continuously, every two minutes (the cycle of it being run as a cron job). I have a log.txt that in theory would help me determine what's going on here, but the problem is it isn't being written to every time the job runs. Here's the code:
<?php
$start_time = microtime();
require_once //a library and config
$facebook = new Facebook($api_key, $secret);
get_db_conn(); //returns $conn
$hold_me = mysql_fetch_array(mysql_query("SELECT * FROM `stats`"));
$last_id_posted = $hold_me[0]; //the status # of the most recently posted tweet
$me = "mytwittername";
$ch = curl_init("http://twitter.com/statuses/friends_timeline.xml?since_id=$last_id_posted");
curl_setopt($ch, CURLOPT_USERPWD, $me.":".$pw);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$xs = curl_exec($ch);
$data = new SimpleXMLElement($xs);
$latest_tweet_id = $last_id_posted;
$uid = get_uid(); //returns an array of facebookID->twittername
$user_count = count($uid);
curl_close($ch);
$total_tweets = 0;
$posted_tweets = 0;
foreach ($data->status as $tweet) {
    $name = strtolower($tweet->user->screen_name);
    if (array_key_exists($name, $uid)) {
        $total_tweets += 1;
        // $name = Twitter Name
        $message = $tweet->text;
        $fbid = $uid[$name];
        theposting($name, $message, $fbid); //posts tweet to facebook
        $this_id = $tweet->id;
        if ($this_id > $latest_tweet_id) {
            $latest_tweet_id = $this_id;
        }
    }
}
mysql_query("UPDATE stats SET lasttweet='$latest_tweet_id'");
commit_log(); //logs to a txt file how many tweets posted, how many users, execution duration, and time of execution
?>
So in theory the log is a string of "Monday 24th of August 2009 10:41:32 PM. Called all since # 3326415954. Updated to # 3526415953. 8 users. Took 0.086057 milliseconds. Posted 14 out of 20 tweets." lines. Occasionally though, it will skip two or three hours at a time, and in that time period it will "spam" people's facebook pages with multiple copies of the same tweet. I can't tell what might be breaking my code, but my suspicion is bad XML from twitter. All in all it's relatively low-traffic on my end, so I doubt I'm overloading my server or anything. The log.txt is 50kb right now, and last "broke" at ~35kb, so it's not a huge file slowing it down... Any thoughts would be appreciated!

The first thing I would do to improve the script is to check for cURL errors with curl_errno() and curl_error(). If your malformed-XML theory is correct, chances are that anything going wrong will show up there. You may also want to specify a timeout for both cURL and PHP.
I've not used the SimpleXML library much, but it does look as if there is a check for malformed XML: it will produce an E_WARNING if the document isn't well-formed.
Those two checks should eliminate any dodgy data.
Without seeing the other functions it's a bit hard to see any other potential places where it could be going wrong.
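For illustration, a minimal sketch of those checks dropped into the question's script; it swaps new SimpleXMLElement() for simplexml_load_string() so a parse failure returns false instead of throwing, and log_error() is a hypothetical helper that appends to log.txt:
// Sketch only: cURL timeouts, cURL error checks, and XML validation around
// the existing request. $ch and $xs follow the question's code.
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10); // give up connecting after 10s
curl_setopt($ch, CURLOPT_TIMEOUT, 30);        // give up on the whole request after 30s
$xs = curl_exec($ch);
if ($xs === false) {
    // cURL-level failure: log it and stop before touching the database
    log_error('cURL error ' . curl_errno($ch) . ': ' . curl_error($ch));
    exit;
}
// Collect XML parse errors instead of letting SimpleXML emit E_WARNINGs
libxml_use_internal_errors(true);
$data = simplexml_load_string($xs);
if ($data === false) {
    foreach (libxml_get_errors() as $err) {
        log_error('Malformed XML from Twitter: ' . trim($err->message));
    }
    libxml_clear_errors();
    exit; // bail out so the stored last tweet ID is never updated with bad data
}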

You should test that your database query was successful.
Try selecting only the last tweet ID in your SQL query, since you are throwing away the rest of the row anyway.
$last_id_posted has no default value. What is the expected result of requesting ?since_id= with an empty value?
Serialize the state of your DB/cURL response and XML and dump it into your log file.
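A minimal sketch of those suggestions combined, sticking with the legacy mysql_* API from the question; log_state() is a hypothetical helper that appends to log.txt:
// Sketch only: check the query result, select just the column that is used,
// and dump the raw state to the log so a bad run can be reconstructed later.
$res = mysql_query("SELECT `lasttweet` FROM `stats` LIMIT 1");
if ($res === false) {
    log_state('stats query failed: ' . mysql_error());
    exit;
}
$row = mysql_fetch_assoc($res);
$last_id_posted = $row ? $row['lasttweet'] : 0; // explicit default instead of an empty since_id

// Later, after the cURL call: serialize what actually came back
log_state(serialize(array(
    'last_id_posted' => $last_id_posted,
    'curl_errno'     => curl_errno($ch),
    'raw_response'   => substr($xs, 0, 2000), // first 2KB is usually enough
)));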

Related

Loading XML data without refresh

I have a script which gets some values from an XML record.
Here's the code:
<?php
//Data
$xml_data = '<image_process_call><image_url>https://i.pinimg.com/originals/e4/41/54/e44154308e3466d987665c6d50887f06.jpg</image_url><methods_list><method><name>collage</name><params>template_name=Nun Face in Hole;</params></method></methods_list><result_format>jpg</result_format><result_size>800</result_size><template_watermark>false</template_watermark></image_process_call>';
//Settings
$app_id = '';
$key = '';
$sign_data = hash_hmac('SHA1', $xml_data, $key);
//Send request
$request_url = 'http://opeapi.ws.pho.to/addtask?data='. $xml_data .'&sign_data='. $sign_data .'&app_id='. $app_id;
$request_xml = simplexml_load_file($request_url);
$request_id = strval($request_xml->request_id);
if (isset($request_id)) {
    $result_url = 'http://opeapi.ws.pho.to/getresult?request_id='. $request_id;
    sleep(6);
    $result_xml = simplexml_load_file($result_url);
    $result_status = strval($result_xml->status);
    $result_img = strval($result_xml->result_url);
    if (isset($result_img)) {
        echo $result_img;
    } else {
        echo 'Result image not found';
    }
} else {
    echo 'Request ID not found';
}
?>
The problem is the time it takes to generate the second XML file. Building $result_xml takes a few seconds, so I have to use the sleep(6) function.
If I remove it, I need to refresh the page (at least three times) to get a link to the generated image from the second XML.
Do you have an idea how to do this more professionally? I can't be sure that every image will be generated within 6 seconds (sometimes it's shorter, sometimes longer).
Is there any method for generating the result only after receiving $result_img? Thanks in advance for your help!
I think it is worth describing in more detail.
In practice, it looks like this:
The script performs the $request_xml call and the site returns this XML:
<image_process_response>
<request_id>2d8d4dec-4344-4df0-a1e1-0c8df304ad11</request_id>
<status>OK</status>
<description/>
<err_code>0</err_code>
</image_process_response>
The script gets request_id from this XML and requests $result_xml. However, the script doesn't get the image's URL from that XML immediately; it needs to wait a few seconds.
After refreshing the page three times, or using the sleep(6) function, we finally get:
<image_process_response>
<request_id>2d8d4dec-4344-4df0-a1e1-0c8df304ad11</request_id>
<status>OK</status>
<result_url>
http://worker-images.ws.pho.to/i1/9F1E2EAF-5B31-4407-8779-9A85F35862D3.jpg
</result_url>
<result_url_alt>
http://worker-images.ws.pho.to.s3.amazonaws.com/i1/9F1E2EAF-5B31-4407-8779-9A85F35862D3.jpg
</result_url_alt>
<limited_image_url>
http://worker-images.ws.pho.to/i1/3F797C83-2C2E-401C-B4AF-C4D36BBD442D.jpg
</limited_image_url>
<nowm_image_url>
http://worker-images.ws.pho.to/i1/9F1E2EAF-5B31-4407-8779-9A85F35862D3.jpg
</nowm_image_url>
<duration>2950.879097ms</duration>
<total_duration>2956.124067ms</total_duration>
</image_process_response>
Edit:
After trying to fetch the result immediately, I get XML like this:
<image_process_response>
<request_id>e615f0a1-ddee-4d81-94c4-a392f8f123e8</request_id>
<status>InProgress</status>
<description>The task is in progress, you need to wait for sometime.</description>
</image_process_response>
So this is the reason why I see a blank page...
Does someone have an idea how to force the script to keep re-requesting the second XML until it finds a result_url?
According to the Pho.to API, an add-task request is a queued POST request.
In my opinion, you should send the request in a while loop, but wait a smaller amount of time instead of a fixed 6 seconds: check the status in image_process_response and keep looping while it is still InProgress. After that, you can safely send the second request to get the processed image result.
You may encounter timeout issues due to a low timeout configuration for DoS protection if you run this script on a web server (via CGI/FastCGI). To resolve that, you need to queue the add-task work in your HTTP request and then process it offline (that is, outside the web environment).
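As a rough illustration of that loop, here is a minimal polling sketch based on the question's own variables and endpoints; the 30-attempt cap and 1-second delay are arbitrary choices:
// Sketch only: poll getresult until status is no longer InProgress,
// instead of a single fixed sleep(6). $request_id comes from the addtask call.
$result_url = 'http://opeapi.ws.pho.to/getresult?request_id=' . $request_id;
$result_img = null;

for ($attempt = 0; $attempt < 30; $attempt++) {   // give up after ~30 seconds
    $result_xml = simplexml_load_file($result_url);
    if ($result_xml !== false && strval($result_xml->status) !== 'InProgress') {
        $result_img = strval($result_xml->result_url);
        break;
    }
    sleep(1); // short wait before asking again
}

echo $result_img ? $result_img : 'Result image not ready';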

PHP DB caching, without including files

I've been searching for a suitable PHP caching method for MSSQL results.
Most of the examples I can find suggest storing the results in an array, which then gets included into the page. This seems great, unless a request for the content is made at the same time as it is being updated/rebuilt.
I was hoping to find something similar to ASP's application-level variables, but as far as I'm aware, PHP doesn't offer this functionality?
The problem I'm facing is that I need to perform 6 queries per page to populate dropdown boxes, and this happens on the vast majority of pages. It's also not an option to combine the queries. The cached data will also need to be rebuilt sporadically, when the system changes; this could be once a day, once a week or once a month. Any advice will be greatly received, thanks!
You can use Redis server and phpredis PHP extension to cache results fetched from database:
$redis = new Redis();
$redis->connect('/tmp/redis.sock');
$sql = "SELECT something FROM sometable WHERE condition";
$sql_hash = md5($sql);
$redis_key = "dbcache:${sql_hash}";
$ttl = 3600; // values expire in 1 hour
if ($result = $redis->get($redis_key)) {
    $result = json_decode($result, true);
} else {
    $result = Db::fetchArray($sql);
    $redis->setex($redis_key, $ttl, json_encode($result));
}
(Error checks skipped for clarity)
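Since the question says the cached data only needs to be rebuilt sporadically (when the system changes), the TTL can be generous and the key can simply be deleted at that point; a small sketch reusing the key scheme above:
// Sketch only: invalidate a cached query result when the underlying data changes.
// Assumes the same "dbcache:<md5 of SQL>" key scheme as the example above.
function invalidate_cached_query(Redis $redis, $sql)
{
    $redis->del('dbcache:' . md5($sql));
}

// e.g. call this from whatever admin action updates the dropdown source tables:
// invalidate_cached_query($redis, "SELECT something FROM sometable WHERE condition");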

Prevent PHP from sending multiple emails when running parallel instances

This is more of a logic question than language question, though the approach might vary depending on the language. In this instance I'm using Actionscript and PHP.
I have a flash graphic that is getting data stored in a mysql database served from a PHP script. This part is working fine. It cycles through database entries every time it is fired.
The graphic is not on a website, but is being used at 5 locations, set to load and run at regular intervals (all 5 locations fire at the same time, or at least within <500ms of each other). This is real-time info, so time is of the essence; currently the script loads and parses at all 5 locations in 30-300ms (depending on the distance from the server).
I was originally having a pagination problem, where each of the 5 locations would pull a different database entry, since I was moving to the next entry every time the script ran. I solved this by making the script move to the next entry only after a certain amount of time had passed.
However, I also need the script to send an email every time it displays a new entry, and I only want it to send one email. I've attempted to solve this by adding a "has been emailed" boolean to the database, but since all the scripts run at the same time, this rarely works (it does sometimes). Most of the time I get 5 emails sent. The timeliness of this email doesn't have to match how fast the graphic gets info from the script; a 5-10 second delay is fine.
I've been trying to come up with a solution for this. Currently I'm thinking of spawning a Python script through PHP with a random delay (between 2 and 5 seconds), hopefully alleviating the problem. However, I'm not quite sure how to run an exec() command from PHP without the script waiting for the command to finish. Or is there a better way to accomplish this?
UPDATE: here is my current logic (relevant code only):
//get the top "unread" information from the database
$query = "SELECT * FROM database WHERE Read = '0' ORDER BY Entry ASC LIMIT 1";
//DATA
$emailed = $row["emailed"];
$Entry = $row["databaseEntryID"];
if ($emailed == 0)
{
    **CODE TO SEND EMAIL**
    $EmailSent = "UPDATE database SET emailed = '1' WHERE databaseEntryID = '$Entry'";
    $mysqli->query($EmailSent);
}
Thanks!
You need to use some kind of locking, e.g. database locking:
function send_email_sync($message)
{
    sql_query("UPDATE email_table SET email_sent=1 WHERE email_sent=0");
    $result = FALSE;
    if (number_of_affected_rows() == 1) {
        send_email_now($message);
        $result = TRUE;
    }
    return $result;
}
The functions sql_query and number_of_affected_rows need to be adapted to your particular database.
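For the question's mysqli connection, a minimal sketch of that adaptation might look like this (table and column names are taken from the snippet above; send_email_now() stands in for the **CODE TO SEND EMAIL** block):
// Sketch only: adapt the "claim the row first" pattern to mysqli.
// Whichever of the 5 parallel instances wins the UPDATE sends the email;
// the others see 0 affected rows and skip it.
function claim_and_send($mysqli, $Entry, $message)
{
    $stmt = $mysqli->prepare(
        "UPDATE database SET emailed = 1 WHERE databaseEntryID = ? AND emailed = 0"
    );
    $stmt->bind_param('i', $Entry);
    $stmt->execute();

    if ($stmt->affected_rows === 1) {
        send_email_now($message); // placeholder for the actual mail code
        return true;
    }
    return false; // another instance already claimed this entry
}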
Old answer:
Use file-based locking: (only works if the script only runs on a single server)
function send_email_sync($message)
{
    $fd = fopen(__FILE__, "r");
    if (!$fd) {
        die("something bad happened in ".__FILE__.":".__LINE__);
    }
    $result = FALSE;
    if (flock($fd, LOCK_EX | LOCK_NB)) {
        if (!email_has_already_been_sent()) {
            actually_send_email($message);
            mark_email_as_sent();
            $result = TRUE; //email has been sent
        }
        flock($fd, LOCK_UN);
    }
    fclose($fd);
    return $result;
}
You will need to lock the row in your database by using a transaction.
Pseudo code:
Start transaction
select row .. for update
update row
commit
if (mysqli_affected_rows($connection) >= 1)
    send_email();
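A hedged sketch of that pseudocode with mysqli, reusing the table and column names from the question; SELECT ... FOR UPDATE assumes an InnoDB table:
// Sketch only: lock the row inside a transaction so that only one of the
// parallel instances can flip `emailed` from 0 to 1 and send the message.
$mysqli->begin_transaction();

$res = $mysqli->query("SELECT emailed FROM database WHERE databaseEntryID = '$Entry' FOR UPDATE");
$row = $res ? $res->fetch_assoc() : null;

if ($row && (int)$row['emailed'] === 0) {
    $mysqli->query("UPDATE database SET emailed = '1' WHERE databaseEntryID = '$Entry'");
    $mysqli->commit();
    // **CODE TO SEND EMAIL** goes here, after the lock is released
} else {
    $mysqli->rollback(); // another instance already handled this entry
}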

Instagram max requests per hour (30) exceeded?

{
"code":420,
"error_type":"OAuthRateLimitException",
"error_message":"You have exceeded the maximum number of requests per hour. You have performed a total of 253 requests in the last hour. Our general maximum request limit is set at 30 requests per hour."
}
I just noticed a clients website I am looking after has stopped showing the Instagram feed, so I loaded up the feed URL straight into the browser and I got the above error. I don't think there should have been 253 requests in an hour, but whilst Googling this problem, I came across someone saying it was because the API was being logged in on every request. Sadly, I have "inherited" this code, and haven't really worked with the Instagram API before, apart from to fix an error with this same website before.
The clients site is in WordPress so I have wrapped the code to get the images in a function:
function get_instagram($user_id=USERID, $count=6, $width=190, $height=190){
    $url = 'https://api.instagram.com/v1/users/'.$user_id.'/media/recent/?access_token=ACCESSTOKEN&count='.$count;
    // Also Perhaps you should cache the results as the instagram API is slow
    $cache = './'.sha1($url).'.json';
    if (file_exists($cache) && filemtime($cache) > time() - 60*60) {
        // If a cache file exists, and it is newer than 1 hour, use it
        $jsonData = json_decode(file_get_contents($cache));
    } else {
        $jsonData = json_decode((file_get_contents($url)));
        file_put_contents($cache, json_encode($jsonData));
    }
    $result = '<a style="background-image:url(/wp-content/themes/iwear/inc/img/instagram-background.jpg);" target="_BLANK" href="http://www.instagr.am" class="lc-box lcbox-4 instagram">'.PHP_EOL.'<ul>'.PHP_EOL;
    foreach ($jsonData->data as $key=>$value) {
        $result .= "\t".'<li><img src="'.$value->images->low_resolution->url.'" alt="'.$value->caption->text.'" data-width="'.$width.'" data-height="'.$height.'" /><div class="lc-box-inner"><div class="title"><h2>images</h2></div><div class="description">'.$value->caption->text.'</div></div></li>'.PHP_EOL;
    }
    $result .= '</ul></a>'.PHP_EOL;
    return $result;
}
But as I said, this code has stopped working. Is there any way I could optimize this to actually work? I also notice there is mention of a cache in the (probably stolen) instagram stuff, but it isn't actually caching, so that could also be a solution
Thanks
Try registering a new client in Instagram and then change
$url = 'https://api.instagram.com/v1/users/'.$user_id.'/media/recent/?access_token=ACCESSTOKEN&count='.$count;
to
$url = 'https://api.instagram.com/v1/users/'.$user_id.'/media/recent/?client_id=CLIENT_ID&count='.$count;
where CLIENT_ID is the client ID of your newly created client.
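On the caching point raised in the question: one likely reason the cache never seems to help is that an error response (such as the 420 above) gets cached and served for the next hour. A minimal sketch, keeping the question's variable names, that only writes the cache file when the response actually contains media:
// Sketch only: same cache layout as the question, but the file is only
// written when the API response actually contains a data array, so a
// rate-limit error is never cached and served for the next hour.
$cache = './' . sha1($url) . '.json';

if (file_exists($cache) && filemtime($cache) > time() - 60 * 60) {
    $jsonData = json_decode(file_get_contents($cache));
} else {
    $raw = @file_get_contents($url);
    $jsonData = ($raw !== false) ? json_decode($raw) : null;

    if ($jsonData && isset($jsonData->data)) {
        file_put_contents($cache, json_encode($jsonData));
    } elseif (file_exists($cache)) {
        // fall back to the stale cache rather than showing nothing
        $jsonData = json_decode(file_get_contents($cache));
    }
}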

PHP: Most efficient way to make multiple fsockopen(); connections?

Hey guys, I'm making a website where you submit a server for advertising. When the user goes to the index page of my website it grabs the IPs of all the submitted servers and then tests whether each one is online using fsockopen(), like so:
while ($row = mysql_fetch_assoc($rs)) {
    $ip = $row['ip'];
    $info = @fsockopen($ip, 25565, $errno, $errstr, 0.5);
    if ($info) {
        $status = "<div><img width='32px' height='32px'
        title='$name is online!' src='images/online.png'/></div>";
        $online = true;
    } else {
        $status = "<div><img width='32px' height='32px'
        title='$name is offline!' src='images/offline.png'/></div>";
        $online = false;
    }
}
This works fine, but the downside is that when you load the site it takes a good 2-4 seconds to start loading due to the fsockopen() calls. I want to know if there is a better way to do this that will reduce the wait time before the website loads.
Any information will be appreciated, thanks.
Store the online status and the last check time in a database; if the last check was more than, say, 15 minutes ago, update it. I am pretty sure you don't need to get the status on EVERY page load. It's the time it takes to connect to each server that slows down the website.
Then again, you would probably want to move the update process to a cron job instead of relying on someone visiting your website to update the server statuses.
Looking at your example, I'd make all the $status bits be JavaScript calls to another PHP page that checks that individual server.
However, the idea of moving the status checks to a cron job or using some kind of status caching is very good too. Maybe store statuses in a database and only check the ones that have expired (time limit set by you).
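A rough sketch of the cron-job approach; the `servers` table and its `online`/`last_checked` columns are assumptions for illustration, not from the question:
// Sketch only: run from cron (e.g. every 5 minutes) so page loads never
// block on fsockopen(). Table and column names are assumed for illustration.
$rs = mysql_query("SELECT id, ip FROM servers");
while ($row = mysql_fetch_assoc($rs)) {
    $info = @fsockopen($row['ip'], 25565, $errno, $errstr, 0.5);
    $online = $info ? 1 : 0;
    if ($info) {
        fclose($info);
    }
    mysql_query(sprintf(
        "UPDATE servers SET online = %d, last_checked = NOW() WHERE id = %d",
        $online,
        (int)$row['id']
    ));
}
// The index page then just reads `online` from the table instead of probing each server.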
