I need to write a PHP script that accepts a CSV file as input and parses the product URLs it contains.
After that I need to validate which product URLs exist and which do not.
I have two options for this: curl() and get_headers().
Can you please let me know which one is faster and more reliable?
Any help will be much appreciated.
I tested this with 100 unique domains:
$urls = [
    "http://familyshare.com/",
    "http://elitedaily.com/",
    "http://www.pickthebrain.com/",
    "http://i100.independent.co.uk/",
    "http://thingsorganizedneatly.tumblr.com/",
    "http://www.cheatsheet.com/",
    "https://jet.com/",
    "https://nightwalk.withgoogle.com/en/panorama",
    "http://www.vumble.com/",
    "http://fusion.net/",
    "https://www.zozi.com",
    "http://joshworth.com/dev/pixelspace/pixelspace_solarsystem.html",
    "http://how-old.net/",
    "https://www.dosomething.org",
    "https://devart.withgoogle.com/",
    "http://www.ranker.com/",
    "http://the-toast.net/",
    "https://www.futurelearn.com/",
    "https://croaciaaudio.com/",
    "http://www.thesimpledollar.com/",
    "http://giphy.com/giphytv",
    "http://snapzu.com/",
    "https://www.touchofmodern.com/",
    "http://www.howstuffworks.com/",
    "http://www.sporcle.com/",
    "http://www.factcheck.org/",
    "https://www.privacytools.io/",
    "http://tiffanithiessen.com/",
    "http://www.supercook.com/",
    "http://www.livescience.com/",
    "http://www.freshnessmag.com",
    "http://www.abeautifulmess.com/",
    "http://cardboardboxoffice.com/",
    "http://www.takepart.com/",
    "http://www.fixya.com/",
    "http://bestreviews.com/",
    "http://theodysseyonline.com/",
    "http://justdelete.me/",
    "http://adventure.com/",
    "http://www.carryology.com/",
    "http://whattheysee.tumblr.com/",
    "https://unsplash.com/",
    "http://fromwhereidrone.com/",
    "http://www.attn.com/",
    "http://ourworldindata.org/",
    "http://www.melty.com/",
    "http://www.truthdig.com/",
    "https://tosdr.org/",
    "https://thinga.com/",
    "http://forvo.com/",
    "http://tiii.me/",
    "https://snapguide.com/",
    "http://www.tubefilter.com/",
    "http://www.inherentlyfunny.com/",
    "http://www.someecards.com/",
    "https://this.cm/",
    "http://littlebigdetails.com/",
    "http://clapway.com/",
    "http://www.nerdfitness.com/",
    "http://iwantdis.com/",
    "http://Racked.com",
    "http://thesweetsetup.com/",
    "http://www.we-heart.com/",
    "https://www.revealnews.org/",
    "https://featuredcreature.com/",
    "http://www.scotthyoung.com/blog/",
    "http://www.thehandandeye.com/",
    "http://www.thenorthernpost.com/",
    "http://www.welzoo.com/",
    "http://www.tickld.com/",
    "http://thinksimplenow.com/",
    "http://www.quietrev.com/",
    "http://www.freshoffthegrid.com/",
    "https://www.generosity.com/",
    "http://addicted2success.com/",
    "http://cubiclane.com/",
    "http://waitbutwhy.com/",
    "http://toolsandtoys.net/",
    "http://googling.co/",
    "http://penelopetrunk.com/",
    "http://iaf.tv/",
    "http://artofvisuals.com/",
    "http://www.lifeaftercollege.org/blog",
    "http://listverse.com/",
    "http://chrisguillebeau.com/",
    "http://expeditionportal.com/",
    "http://www.marieforleo.com/",
    "http://mostexclusivewebsite.com/",
    "http://www.alphr.com/",
    "http://www.rtings.com/",
    "http://all-that-is-interesting.com/",
    "http://theunbeatnpath.xyz/",
    "http://www.keepinspiring.me/",
    "https://paidtoexist.com/blog/",
    "http://www.lovethispic.com/",
    "http://riskology.co/blog/",
    "http://geyserofawesome.com/",
    "http://www.eugenewei.com/",
    "http://clickotron.com/"
];
$startTime = microtime(true);

// make get_headers() issue HEAD requests instead of the default GET
stream_context_set_default(
    array(
        'http' => array(
            'method' => 'HEAD'
        )
    )
);

$headers1 = [];
foreach ($urls as $url) {
    $headers1[] = get_headers($url);
}

$endTime = microtime(true);
$elapsed = $endTime - $startTime;
echo "Execution time : $elapsed seconds \n";
$startTime = microtime(true);

$headers2 = [];
foreach ($urls as $url) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HEADER, true);
    curl_setopt($ch, CURLOPT_NOBODY, true);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);
    $headers2[] = curl_exec($ch);
    curl_close($ch);
}

$endTime = microtime(true);
$elapsed = $endTime - $startTime;
echo "Execution time : $elapsed seconds \n";
get_headers(GET) vs cURL:
Execution time : 139.95884609222 seconds
Execution time : 65.998840093613 seconds
get_headers(HEAD) vs cURL:
Execution time : 114.60515785217 seconds
Execution time : 66.077962875366 seconds
So indeed cURL is significantly faster.
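For the original task (flagging which product URLs exist), the HTTP status code matters more than raw speed. Below is a minimal sketch, assuming a 2xx or 3xx status counts as "exists" and reusing a single handle across requests (see the handle-reuse benchmarks further down):
$ch = curl_init();
curl_setopt($ch, CURLOPT_NOBODY, true);          // HEAD request, skip the body
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);  // count redirect targets as existing
curl_setopt($ch, CURLOPT_TIMEOUT, 10);

$exists = [];
foreach ($urls as $url) {
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_exec($ch);
    $code = curl_getinfo($ch, CURLINFO_HTTP_CODE); // 0 if the connection failed
    $exists[$url] = ($code >= 200 && $code < 400);
}
curl_close($ch);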
I'm developing a small piece of software based on API requests with PHP cURL.
I've run into a problem with the API's private requests. One of the request parameters is "nonce" (a Unix timestamp), but the response is "invalid nonce".
When I contacted support, they answered:
"Invalid Nonce is sent when nonce you sent is smaller or equal to the nonce that was previously sent."
And,
"if you make 2 requests at same second you need to increase nonce for 2nd request (you can use micro uniquestamp so that in one second you can create 1000000 unique nonces in 1 second)."
My question is: what function can I use to solve this problem? I tried the microtime() function, but I get the same error.
Thank you, and sorry for my bad English.
My code:
$unix_time = time();
$microtime = number_format(microtime(true), 5, '', '');
$message = $unix_time.$customer_id.$API_KEY; // nonce + customer id + api key
$signature = hash_hmac('sha256', $message, $API_SECRET);
$ticker_url = "https://www.bitstamp.net/api/v2/ticker/btceur";
$balance_url = "https://www.bitstamp.net/api/v2/balance/btceur/";
$param_array = array(
    "key" => $API_KEY,
    "signature" => strtoupper($signature),
    "nonce" => $microtime
);
switch ($_POST['action']) {
    case 'ticker_btceur':
        ticker_btceur($param_array, $ticker_url);
        break;
    case 'balance_btceur':
        balance_btceur($param_array, $balance_url);
        break;
}
function ticker_btceur($da, $b_url) { // cURL GET
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $b_url."?key=".$da['key']."&signature=".$da['signature']);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, true);
    curl_setopt($ch, CURLOPT_CAINFO, getcwd() . "/CAcerts/cacert.pem");
    $result = curl_exec($ch); // execute once and reuse the result
    if ($result === false) {
        echo "Error: ".curl_error($ch)." - Error code: ".curl_errno($ch);
    } else {
        echo $result;
    }
    curl_close($ch);
}
function balance_btceur($pa, $b_url) { // cURL POST
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $b_url);
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($pa));
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $result = curl_exec($ch); // execute once and reuse the result
    if ($result === false) {
        echo "Error: ".curl_error($ch)." - Error code: ".curl_errno($ch);
    } else {
        echo $result;
    }
    curl_close($ch);
}
microtime() returns the current Unix timestamp with microseconds, which is different from a plain microseconds count (1 second = 1,000,000 microseconds), so they are not the same thing.
If the service provider is asking you to send the time as a Unix timestamp with microseconds, then you have to use:
$time = microtime(true);
You can also add a random offset using rand(), like this:
// increase the time by a random value between 10 and 100
$time = microtime(true) + rand(10, 100);
If they are asking for a plain microseconds value, then use rand() like this:
$time = rand(1000, 10000000);
It seems the API requires microseconds; here is a function to get them:
function microseconds()
{
    list($usec, $sec) = explode(" ", microtime());
    // pad the fractional part to six digits so the result stays monotonic
    return $sec . sprintf('%06d', $usec * 1000000);
}
echo microseconds();
echo "\n";
My best guess is that they mean:
$stamp = (string)(int)(microtime(true) * 1000000);
This stamp will change one million times per second; depending on when you generate it, it looks something like
string(16) "1555177383042022"
Just note that this code won't work properly on a 32-bit system. If your code needs 32-bit PHP compatibility, do this instead:
$stamp2 = bcmul(number_format(microtime(true), 18, ".", ""), "1000000", 0);
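Whichever representation the API wants, the underlying requirement is that each nonce is strictly greater than the previous one. Here is a minimal sketch of a helper that guarantees this within a single process (the function name is illustrative):
function next_nonce()
{
    static $last = 0;
    $nonce = (int) (microtime(true) * 1000000); // microsecond-resolution Unix timestamp
    if ($nonce <= $last) {
        $nonce = $last + 1; // two calls in the same microsecond still increase
    }
    $last = $nonce;
    return (string) $nonce;
}
Also note that in the question's code the signed $message is built from $unix_time while the request sends $microtime as the nonce; the signature and the request must use the same nonce value.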
I'm trying to figure out why this PHP code takes so long to output its results.
For example, this is my apitest.php, and here is my PHP code:
<?php
function getRankedMatchHistory($summonerId, $serverName, $apiKey){
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, "https://".$serverName.".api.pvp.net/api/lol/".$serverName."/v2.2/matchhistory/".$summonerId."?api_key=".$apiKey);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
    $response = curl_exec($ch);
    curl_close($ch);
    $matchHistory = json_decode($response, true); // Is the whole JSON response now stored locally in $matchHistory, or is it requested again every time $matchHistory is used?
    for ($i = 9; $i >= 0; $i--){
        $farm1 = $matchHistory["matches"][$i]["participants"]["0"]["stats"]["minionsKilled"];
        $farm2 = $matchHistory["matches"][$i]["participants"]["0"]["stats"]["neutralMinionsKilled"];
        $farm3 = $matchHistory["matches"][$i]["participants"]["0"]["stats"]["neutralMinionsKilledTeamJungle"];
        $farm4 = $matchHistory["matches"][$i]["participants"]["0"]["stats"]["neutralMinionsKilledEnemyJungle"];
        $elapsedTime = (int) $matchHistory["matches"][$i]["matchDuration"];
        $elapsedTime = floor($elapsedTime / 60); // seconds to whole minutes
        $k = $matchHistory["matches"][$i]["participants"]["0"]["stats"]["kills"];
        $d = $matchHistory["matches"][$i]["participants"]["0"]["stats"]["deaths"];
        $a = $matchHistory["matches"][$i]["participants"]["0"]["stats"]["assists"];
        $championIdTmp = $matchHistory["matches"][$i]["participants"]["0"]["championId"];
        $championName = getChampionName($championIdTmp); // calls another function to resolve championId into championName
        $gameType = preg_replace('/[^A-Za-z0-9\-]/', ' ', $matchHistory["matches"][$i]["queueType"]);
        $result = $matchHistory["matches"][$i]["participants"]["0"]["stats"]["winner"] ? "Victory" : "Defeat";
        echo "<tr>"."<td>".$gameType."</td>"."<td>".$result."</td>"."<td>".$championName."</td>"."<td>".$k."/".$d."/".$a."</td>"."<td>".($farm1+$farm2+$farm3+$farm4)." in ".$elapsedTime." minutes"."</td>"."</tr>";
    }
}
?>
What I'd like to know is how to make the page output faster, as it currently takes around
10~15 seconds, which makes the browser think the website is dead, as if it hit a 500 Internal Server Error or something like it.
Here is a simple demonstration of how long it can take: Here
As you might have noticed, yes, I'm using the Riot API, which sends the response JSON-encoded.
Here is an example of the response that this function handles: Here
What I thought of was creating a temporary file called temp.php at the start of the cURL function, saving the whole response there, and then reading the variables from it to speed up the process; after reading the variables, it would delete temp.php, freeing up disk space.
But I have no idea how to do that in PHP only.
By the way, I should mention that I just started using PHP today, so I'd prefer some explanation with the answers if possible.
Thanks for your precious time.
Try benchmarking like this:
// start the timer
$start_curl = microtime(true);
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "https://".$serverName.".api.pvp.net/api/lol/".$serverName."/v2.2/matchhistory/".$summonerId."?api_key=".$apiKey);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
// debugging
curl_setopt($ch, CURLOPT_VERBOSE, true);
// start another timer
$start = microtime(true);
$response = curl_exec($ch);
echo 'curl_exec() in: '.(microtime(true) - $start).' seconds<br><br>';
// start another timer
$start = microtime(true);
curl_close($ch);
echo 'curl_close() in: '.(microtime(true) - $start).' seconds<br><br>';
// how long did the entire CURL take?
echo 'CURLed in: '.(microtime(true) - $start_curl).' seconds<br><br>';
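Once the benchmark confirms the cURL request itself is the slow part, the temp-file idea from the question is essentially response caching. Here is a minimal sketch, assuming the cache file name and the 60-second lifetime are arbitrary choices:
// cache the raw JSON so repeated page loads skip the slow HTTP request
$cacheFile = sys_get_temp_dir() . "/matchhistory_" . $summonerId . ".json";
if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < 60) {
    $response = file_get_contents($cacheFile); // fresh enough: reuse it
} else {
    $response = curl_exec($ch);                // $ch configured as above
    file_put_contents($cacheFile, $response); // refresh the cache
}
$matchHistory = json_decode($response, true);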
I'm using cURL to fetch a file from a given URL (the user provides the URL, and my server downloads the file).
For a progress bar, I use the CURLOPT_PROGRESSFUNCTION option.
I want the progress function to also calculate the download speed and how much time is left.
$fp = fopen($temp_file, "w");
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_NOPROGRESS, false );
curl_setopt($ch, CURLOPT_PROGRESSFUNCTION, "curl_progress_callback");
curl_setopt($ch, CURLOPT_FILE, $fp);
$success = curl_exec($ch);
$curl_info = curl_getinfo($ch);
curl_close($ch);
fclose($fp);
function curl_progress_callback($download_size, $downloaded_size, $upload_size, $uploaded_size) {
    global $fileinfo;
    if (!$downloaded_size) {
        if (!isset($fileinfo->size)) {
            $fileinfo->size = $download_size;
            event_callback(array("send" => $fileinfo));
        }
    }
    event_callback(array("progress" => array("loaded" => $downloaded_size, "total" => $download_size)));
}
Thank you, and sorry for my English!
Add this before curl_exec:
$startTime = $prevTime = microtime(true);
$prevSize = 0;
You can calculate the average speed, the current speed, and the remaining time by adding this inside the callback function (pull the variables in with global $startTime, $prevTime, $prevSize;):
$averageSpeed = $downloaded_size / (microtime(true) - $startTime);
$currentSpeed = ($downloaded_size - $prevSize) / (microtime(true) - $prevTime);
$prevTime = microtime(true);
$prevSize = $downloaded_size;
$timeRemaining = ($download_size - $downloaded_size) / $averageSpeed;
Speed is measured in Bytes/s and remaining time in seconds.
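Putting the pieces together, here is a minimal sketch of the complete callback, matching the four-argument signature used in the question (PHP 5.5+ passes the cURL handle as an extra first argument) and guarding against division by zero before any bytes have arrived:
function curl_progress_callback($download_size, $downloaded_size, $upload_size, $uploaded_size) {
    global $startTime, $prevTime, $prevSize;
    if ($download_size == 0 || $downloaded_size == 0) {
        return; // nothing measurable yet
    }
    $now = microtime(true);
    $averageSpeed  = $downloaded_size / ($now - $startTime);                       // Bytes/s since start
    $currentSpeed  = ($downloaded_size - $prevSize) / max($now - $prevTime, 1e-6); // Bytes/s since last call
    $timeRemaining = ($download_size - $downloaded_size) / $averageSpeed;          // seconds
    $prevTime = $now;
    $prevSize = $downloaded_size;
}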
I have a function that calls 3 different APIs using cURL multiple times. Each API's result is passed to the next API called in nested loops, so cURL is currently opened and closed over 500 times.
Should I leave cURL open for the entire function or is it OK to open and close it so many times in one function?
There's a performance increase to reusing the same handle. See: Reusing the same curl handle. Big performance increase?
If you don't need the requests to be synchronous, consider using the curl_multi_* functions (e.g. curl_multi_init, curl_multi_exec, etc.) which also provide a big performance boost.
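For reference, here is a minimal curl_multi sketch (the URLs are placeholders): it starts all requests in parallel and collects each body once the transfers finish.
$urls = array('http://www.example.com/', 'http://www.example.org/');
$mh = curl_multi_init();
$handles = array();
foreach ($urls as $u) {
    $ch = curl_init($u);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_multi_add_handle($mh, $ch);
    $handles[] = $ch;
}
do {
    curl_multi_exec($mh, $running);
    if ($running) {
        curl_multi_select($mh); // wait for activity instead of busy-looping
    }
} while ($running > 0);
foreach ($handles as $ch) {
    $body = curl_multi_getcontent($ch); // response body for this handle
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
}
curl_multi_close($mh);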
UPDATE:
I tried benchmarking cURL using a new handle for each request versus reusing the same handle, with the following code:
ob_start(); //Trying to avoid setting as many curl options as possible
$start_time = microtime(true);
for ($i = 0; $i < 100; ++$i) {
    $rand = rand();
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, "http://www.google.com/?rand=" . $rand);
    curl_exec($ch);
    curl_close($ch);
}
$end_time = microtime(true);
ob_end_clean();
echo 'Curl without handle reuse: ' . ($end_time - $start_time) . '<br>';

ob_start(); //Trying to avoid setting as many curl options as possible
$start_time = microtime(true);
$ch = curl_init();
for ($i = 0; $i < 100; ++$i) {
    $rand = rand();
    curl_setopt($ch, CURLOPT_URL, "http://www.google.com/?rand=" . $rand);
    curl_exec($ch);
}
curl_close($ch);
$end_time = microtime(true);
ob_end_clean();
echo 'Curl with handle reuse: ' . ($end_time - $start_time) . '<br>';
and got the following results:
Curl without handle reuse: 8.5690529346466
Curl with handle reuse: 5.3703031539917
So reusing the same handle actually provides a substantial performance increase when connecting to the same server multiple times. I tried connecting to different servers:
$url_arr = array(
    'http://www.google.com/',
    'http://www.bing.com/',
    'http://www.yahoo.com/',
    'http://www.slashdot.org/',
    'http://www.stackoverflow.com/',
    'http://github.com/',
    'http://www.harvard.edu/',
    'http://www.gamefaqs.com/',
    'http://www.mangaupdates.com/',
    'http://www.cnn.com/'
);

ob_start(); //Trying to avoid setting as many curl options as possible
$start_time = microtime(true);
foreach ($url_arr as $url) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_exec($ch);
    curl_close($ch);
}
$end_time = microtime(true);
ob_end_clean();
echo 'Curl without handle reuse: ' . ($end_time - $start_time) . '<br>';

ob_start(); //Trying to avoid setting as many curl options as possible
$start_time = microtime(true);
$ch = curl_init();
foreach ($url_arr as $url) {
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_exec($ch);
}
curl_close($ch);
$end_time = microtime(true);
ob_end_clean();
echo 'Curl with handle reuse: ' . ($end_time - $start_time) . '<br>';
And got the following result:
Curl without handle reuse: 3.7672290802002
Curl with handle reuse: 3.0146431922913
Still quite a substantial performance increase.
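The speedup likely comes mainly from connection reuse: a handle that stays open can keep the TCP (and TLS) connection to the same host alive between requests, skipping the handshake each time. Against ten different hosts only the handle setup and DNS cache are shared, which is why the gap narrows.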
My current code (see below) uses 147MB of virtual memory!
My provider has allocated 100MB by default, and the process is killed as soon as it runs, causing an internal error.
The code uses curl_multi and must be able to loop through more than 150 iterations while still minimising virtual memory. The code below is set to only 150 iterations and still causes the internal server error; at 90 iterations the issue does not occur.
How can I adjust my code to lower its resource usage / virtual memory?
Thanks!
<?php
function udate($format, $utimestamp = null) {
    if ($utimestamp === null)
        $utimestamp = microtime(true);
    $timestamp = floor($utimestamp);
    $milliseconds = round(($utimestamp - $timestamp) * 1000);
    return date(preg_replace('`(?<!\\\\)u`', $milliseconds, $format), $timestamp);
}

$url = 'https://www.testdomain.com/';
$curl_arr = array();
$master = curl_multi_init();

for ($i = 0; $i < 150; $i++) {
    $curl_arr[$i] = curl_init();
    curl_setopt($curl_arr[$i], CURLOPT_URL, $url);
    curl_setopt($curl_arr[$i], CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($curl_arr[$i], CURLOPT_SSL_VERIFYHOST, FALSE);
    curl_setopt($curl_arr[$i], CURLOPT_SSL_VERIFYPEER, FALSE);
    curl_multi_add_handle($master, $curl_arr[$i]);
}

do {
    curl_multi_exec($master, $running);
} while ($running > 0);

for ($i = 0; $i < 150; $i++) {
    $results = curl_multi_getcontent($curl_arr[$i]);
    $results = explode("<br>", $results);
    echo $results[0];
    echo "<br>";
    echo $results[1];
    echo "<br>";
    echo udate('H:i:s:u');
    echo "<br><br>";
    usleep(100000);
}
?>
As per your last comment:
Download RollingCurl.php.
Hopefully this will sufficiently spam the living daylights out of your API.
<?php
$url = '________';
$fetch_count = 150;
$window_size = 5;

require("RollingCurl.php");

function request_callback($response, $info, $request) {
    list($result0, $result1) = explode("<br>", $response);
    echo "{$result0}<br>{$result1}<br>";
    //print_r($info);
    //print_r($request);
    echo "<hr>";
}

$urls = array_fill(0, $fetch_count, $url);

$rc = new RollingCurl("request_callback");
$rc->window_size = $window_size;
foreach ($urls as $url) {
    $request = new RollingCurlRequest($url);
    $rc->add($request);
}
$rc->execute();
?>
Looking through your questions, I saw this comment:
"If the intention is domain snatching, then using one of the established services is a better option. Your script implementation is hardly as important as the actual connection and latency."
I agree with that comment.
Also, you seem to have posted the "same question" approximately seven hundred times:
https://stackoverflow.com/users/558865/icer
https://stackoverflow.com/users/516277/icer
How can I adjust the server to run my PHP script quicker?
How can I re-code my php script to run as quickly as possible?
How to run cURL once, checking domain availability in a loop? Help fixing code please
Help fixing php/api/curl code please
How to reduce virtual memory by optimising my PHP code?
Overlapping HTTPS requests?
Multiple https requests.. how to?
Doesn't the fact that you have to keep asking the same question over and over tell you that you're doing it wrong?
This comment of yours:
"#mario: Cheers. I'm competing against 2 other companies for specific ccTLD's. They are new to the game and they are snapping up those domains in slow time (up to 10 seconds after purge time). I'm just a little slower at the moment."
I'm fairly sure that PHP on a shared hosting account is the wrong tool to use if you are seriously trying to beat two companies at snapping up expired domain names.
The result of each of the 150 queries is stored in PHP memory, and by your evidence this is insufficient. The only conclusion is that you cannot keep 150 queries in memory. You must either stream to files instead of memory buffers, or reduce the number of queries and process the list of URLs in batches.
To use streams you must set CURLOPT_RETURNTRANSFER to 0 and implement a callback for CURLOPT_WRITEFUNCTION; there is an example in the PHP manual:
http://www.php.net/manual/en/function.curl-setopt.php#98491
function on_curl_write($ch, $data)
{
    global $fh;
    $bytes = fwrite($fh, $data, strlen($data));
    return $bytes;
}

curl_setopt($curl_arr[$i], CURLOPT_WRITEFUNCTION, 'on_curl_write');
Getting the correct file handle in the callback is left as a problem for the reader to solve.
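An alternative that sidesteps the write callback entirely is CURLOPT_FILE, which streams each response straight into its own file handle (leave CURLOPT_RETURNTRANSFER unset for this to work). Here is a minimal sketch against the question's loop; the file names are illustrative:
$fhs = array();
for ($i = 0; $i < 150; $i++) {
    $fhs[$i] = fopen("result_" . $i . ".txt", "w");      // one file per request
    $curl_arr[$i] = curl_init($url);
    curl_setopt($curl_arr[$i], CURLOPT_FILE, $fhs[$i]);  // body goes to disk, not memory
    curl_multi_add_handle($master, $curl_arr[$i]);
}
// ... run the curl_multi loop as before, then close every file handle:
foreach ($fhs as $fh) {
    fclose($fh);
}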
<?php
echo str_repeat(' ', 1024); //to make flush work

$url = 'http://__________/';
$fetch_count = 15;
$delay = 100000; //0.1 second
//$delay = 1000000; //1 second

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, FALSE);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);

for ($i = 0; $i < $fetch_count; $i++) {
    $start = microtime(true);
    $result = curl_exec($ch);
    list($result0, $result1) = explode("<br>", $result);
    echo "{$result0}<br>{$result1}<br>";
    flush();
    $end = microtime(true);
    // $delay is in microseconds, microtime() in seconds: convert before subtracting
    $sleeping = max(0, $delay - ($end - $start) * 1000000);
    echo 'sleeping: ' . ($sleeping / 1000000) . ' seconds<hr />';
    usleep((int) $sleeping);
}
curl_close($ch);
?>