I have this very simple PHP call to Alpha Vantage API to fill a table (or list) with NASDAQ stock prices:
<?php
function get_price($commodity = "")
{
    $url = 'https://www.alphavantage.co/query?function=TIME_SERIES_DAILY_ADJUSTED&symbol=' . $commodity . '&outputsize=full&apikey=myKey';
    $obj = json_decode(file_get_contents($url), true);
    $date = $obj['Meta Data']['3. Last Refreshed'];
    $result = $obj['Time Series (Daily)']['2018-03-23']['4. close'];
    $rd_result = round($result, 2);
    echo $result;
}
?>
<?php get_price("XOM");
get_price("AAPL");
get_price("MSFT");
get_price("CVX");
get_price("CAT");
get_price("BA");
?>
And it works, but it is just so freaking slow. It can take over 30 seconds to load, while the JSON file from Alpha Vantage itself loads in a fraction of a second.
Does anyone know where I am going wrong?
This is what I did when the API took time to reply. My solution is written in C#, but the logic would be the same.
string[] AlphaVantageApiKey = { "RK*********", "B2***********", "4FD*********QN", "7S3Z*********FRX", "U************I3" };
int ApiKeyValue = 0;
foreach (var stock in listOfStocks)
{
    DataTable dtResult = DataRetrival.GetIntradayStockFeedForSelectedStockAs(stock.Symbol.Trim().ToUpper(), ApiKeyValue);
    ApiKeyValue = (ApiKeyValue == 4) ? 0 : ApiKeyValue + 1;
}
I use five to six different API keys when querying data and loop through them, one per call, thereby reducing the load on any one particular token.
I observed that this improved my performance a lot; it takes me less than a minute to get intraday data for 50 stocks.
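The same rotation in PHP might look roughly like this (a sketch only; the key values are placeholders and the symbol list is taken from the question):
$apiKeys = ['KEY_ONE', 'KEY_TWO', 'KEY_THREE', 'KEY_FOUR', 'KEY_FIVE']; // placeholder keys
$keyIndex = 0;

foreach (['XOM', 'AAPL', 'MSFT', 'CVX', 'CAT', 'BA'] as $symbol) {
    $url = 'https://www.alphavantage.co/query?function=TIME_SERIES_DAILY_ADJUSTED'
         . '&symbol=' . urlencode($symbol)
         . '&apikey=' . $apiKeys[$keyIndex];
    $data = json_decode(file_get_contents($url), true);
    // ... use $data ...
    $keyIndex = ($keyIndex + 1) % count($apiKeys); // move on to the next key
}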
Another way you can improve your performance is to use outputsize=compact; compact returns only the latest 100 data points in the time series.
UPDATE: Batch Stock Quotes
You might want to consider using this type of query as well: multiple stock quotes in a single call.
Also, using the full output size grabs data from the past 20 years, where available. Take that parameter out of your query and let the API fall back to its condensed default output.
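For example, the function from the question could drop outputsize=full and index the series by the '3. Last Refreshed' date it already reads, instead of a hard-coded date. This is only a sketch and has not been run against the live API:
function get_price($commodity = "")
{
    // No outputsize parameter: the API defaults to compact (latest ~100 points)
    $url = 'https://www.alphavantage.co/query?function=TIME_SERIES_DAILY_ADJUSTED'
         . '&symbol=' . urlencode($commodity) . '&apikey=myKey';
    $obj = json_decode(file_get_contents($url), true);

    // "Last Refreshed" may include a time component, so keep only the date part
    $date = substr($obj['Meta Data']['3. Last Refreshed'], 0, 10);
    $result = $obj['Time Series (Daily)'][$date]['4. close'];

    echo round($result, 2);
}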
EDIT: According to the above, you should make changes to your query. But it can also be an issue with your server. I tested this for a use case I am working on and it takes me a few seconds to get the data, albeit I am only pulling it for one stock symbol on a page at a time.
Try increasing your memory limit if things are too slow for your liking.
<?php
ini_set('memory_limit','500M'); // or your desired limit
?>
Also, if you have shared hosting, that might be the problem. However, I do not know enough about your server to answer that fully.
I've been trying to validate over 1 million randomly generated values (strings) with PHP and a client side programming language on an online form, but there are a few challenges I'm facing:
PHP
Link to the (editable) PHP code: https://3v4l.org/AtTkO
The PHP code:
<?php
function generateRandomString($length = 10) {
    $characters = '0123456789abcdefghijklmnopqrstuvwxyz-_.';
    $charactersLength = strlen($characters);
    $randomString = '';
    for ($i = 0; $i < $length; $i++) {
        $randomString .= $characters[rand(0, $charactersLength - 1)];
    }
    return $randomString;
}

$unique = array();
for ($i = 0; $i < 9000000; $i++) {
    $u = $i + 1;
    $random = generateRandomString(5);
    if (!in_array($random, $unique)) {
        echo $u . ".m" . $random . "#[server]\n";
        $unique[] = $random;
        gc_collect_cycles();
    } else {
        echo "duplicate detected";
        $i--;
    }
}
echo memory_get_peak_usage();
What should happen:
New 5 character value gets randomly generated
Value gets checked if it already exists in the array
Value gets added to array
All randomly generated values are exported to a .txt file to be used for validating. (Not in the script yet)
What actually happens:
I hit either a memory usage limit or a server timeout for the execution time.
What I've tried
I've tried using sleep(3) during the for loop.
Setting Memory limit to -1 and timeout to 0. The unlimited memory doesn't make a difference and is too dangerous in a working environment.
Using gc_collect_cycles() during the for loop
Using echo memory_get_peak_usage(); -> I don't really understand how I could use this for debugging.
What I need help with:
Memory management in PHP
Having pauses in the script that will reset the PHP execution timer
Client Side Programming language
This is where I have absolutely no clue which way I should go or which programming language I should use for this.
What I want to achieve
Load a webpage that has a form
Load the .txt with all randomly generated strings
fill in the form with the first string
submit the form:
If positive response from form > save string in special .txt file or array, go to the next value
If negative response from form > delete string from file, go to the next value | or just go to the next value
All values with a positive response are filtered out and easily accessible at the end.
I don't know which programming language I should use for this function. I've been thinking about Javascript and Python but I'm not sure how I could combine that with PHP. A nudge in the right direction would be appreciated.
I might be completely wrong for trying to achieve this with PHP, if so, please let me know what would be the better and easier option.
Thanks!
Interesting question. First of all, whenever you think about a solution like this, one of the first things you need to consider is: can it be async? If the answer is yes, your implementation will likely be simple; otherwise, you will likely have to pay huge server costs or render random cached results.
NB: remove gc_collect_cycles(). It does the opposite of what you want, and you hardly ever need to call it manually.
That being said, the approach I would recommend in your case is as follows:
Use a websocket which is opened only once in the client browser, and then forward results in real time from the server to the browser. Of course, this code itself can run completely client-side via JavaScript, so if it's not just a PoC, you can convert the PHP code to JavaScript.
Change your code to yield items, or forward results over the websocket as soon as a generated code has been confirmed as unique.
However, if you're really doing only what the PHP code says, you can do that completely in JavaScript and save your server resources. See this answer for example code to replace your generateRandomString function.
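For the yielding part, a rough PHP sketch (my own illustration; the helper name uniqueCodes is hypothetical, and it uses the keys of a plain array as a set so each membership check is constant time):
function uniqueCodes(int $count, int $length = 5): Generator
{
    $chars = '0123456789abcdefghijklmnopqrstuvwxyz-_.';
    $seen  = [];                               // array keys act as a set

    while (count($seen) < $count) {
        $code = '';
        for ($i = 0; $i < $length; $i++) {
            $code .= $chars[random_int(0, strlen($chars) - 1)];
        }
        if (!isset($seen[$code])) {
            $seen[$code] = true;
            yield $code;                       // hand each code out immediately
        }
    }
}

foreach (uniqueCodes(1000000) as $i => $code) {
    echo ($i + 1) . ".m" . $code . "#[server]\n"; // or push it over the websocket
}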
Assuming you have the ability to edit the php.ini:
Increase your memory limit as described here:
PHP MEMORY LIMIT INCREASE
For the 'memory limit' see here, and for the 'timeout for the execution time' add:
set_time_limit(0);
at the top of the PHP file.
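Put together, the top of the script would then look something like this (the 512M value is only an example; pick a limit that suits your server):
<?php
ini_set('memory_limit', '512M'); // raise the memory ceiling (or -1 for unlimited, at your own risk)
set_time_limit(0);               // disable the execution time limit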
Have you tried using sets? https://www.php.net/manual/en/class.ds-set.php
Sets are very efficient whenever you want to ensure a value isn't present twice.
Checking for the presence of a value in a set is far faster than looping across all entries of an array.
I'm not an expert with PHP, but it would look something like this in Ruby:
require 'set'
CHARS = '0123456789abcdefghijklmnopqrstuvwxyz-_.'.split('');
unique = Set.new()
def generateRandomString(l = 10)
Array.new(l) { CHARS.sample }.join
end
while unique.length < 1_000_000
random_string = generateRandomString
if !unique.include?(random_string)
unique.add(random_string)
end
end
hope it helps
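For the PHP side, a rough equivalent using the Ds\Set class from the linked manual page (this is only a sketch and assumes the ds extension is installed; without it, storing the strings as array keys and testing with isset() gives similarly fast lookups):
$characters = '0123456789abcdefghijklmnopqrstuvwxyz-_.';
$unique = new \Ds\Set();

while ($unique->count() < 1000000) {
    $random = '';
    for ($i = 0; $i < 5; $i++) {
        $random .= $characters[random_int(0, strlen($characters) - 1)];
    }
    if (!$unique->contains($random)) {   // constant-time membership test
        $unique->add($random);
    }
}

echo $unique->count();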
I am running a range of queries in BigQuery and exporting them to CSV via PHP. There are reasons why this is the easiest method for me to do this (multiple queries dependent on variables within an app).
I am struggling with memory issues when the result set is larger than 100mb. It appears that the memory usage of my code seems to grow in line with the result set, which I thought would be avoided by paging. Here is my code:
$query = $bq->query($myQuery);
$queryResults = $bq->runQuery($query,['maxResults'=>5000]);
$FH = fopen($storagepath, 'w');
$rows = $queryResults->rows();
foreach ($rows as $row) {
fputcsv($FH, $row);
}
fclose($FH);
The $queryResults->rows() function returns a Google Iterator which uses paging to scroll through the results, so I do not understand why memory usage grows as the script runs.
Am I missing a way to discard previous pages from memory as I page through the results?
UPDATE
I have noticed that actually since upgrading to the v1.4.3 BigQuery PHP API, the memory usage does cap out at 120mb for this process, even when the result set reaches far beyond this (currently processing a 1gb result set). But still, 120mb seems too much. How can I identify and fix where this memory is being used?
UPDATE 2
This 120mb seems to be tied at 24kb per maxResult in the page. E.g. adding 1000 rows to maxResults adds 24mb of memory. So my question is now why is 1 row of data using 24kb in the Google Iterator? Is there a way to reduce this? The data itself is < 1kb per row.
Answering my own question
The extra memory is used by a load of PHP type mapping and other data structure info that comes alongside the data from BigQuery. Unfortunately I couldn't find a way to reduce the memory usage below around 24kb per row multiplied by the page size. If someone finds a way to reduce the bloat that comes along with the data please post below.
However thanks to one of the comments I realized you can extract a query directly to CSV in a Google Cloud Storage Bucket. This is really easy:
$query = $bq->query($myQuery);
$queryResults = $bq->runQuery($query);
$qJobInfo = $queryResults->job()->info();
$dataset = $bq->dataset($qJobInfo['configuration']['query']['destinationTable']['datasetId']);
$table = $dataset->table($qJobInfo['configuration']['query']['destinationTable']['tableId']);
$extractJob = $table->extract('gs://mybucket/'.$filename.'.csv');
$table->runJob($extractJob);
However, this still didn't solve my issue, as my result set was over 1 GB, so I had to make use of the data sharding function by adding a wildcard.
$extractJob = $table->extract('gs://mybucket/'.$filename.'*.csv');
This created ~100 shards in the bucket. These need to be recomposed using gsutil compose <shard filenames> <final filename>. However, gsutil only lets you compose 32 files at a time. Given that I will have a variable number of shards, often above 32, I had to write some code to clean them up.
//Save above job as variable
$eJob = $table->runJob($extractJob);
$eJobInfo = $eJob->info();
//This bit of info from the job tells you how many shards were created
$eJobFiles = $eJobInfo['statistics']['extract']['destinationUriFileCounts'][0];
$composedFiles = 0; $composeLength = 0; $subfile = 0; $fileString = "";
while (($composedFiles < $eJobFiles) && ($eJobFiles > 1)) {
    while (($composeLength < 32) && ($composedFiles < $eJobFiles)) {
        // gsutil creates shards with a 12 digit number after the filename, so build a string of 32 such filenames at a time
        $fileString .= "gs://bucket/$filename" . str_pad($composedFiles, 12, "0", STR_PAD_LEFT) . ".csv ";
        $composedFiles++;
        $composeLength++;
    }
    $composeLength = 0;
    // Compose a batch of 32 into a subfile
    system("gsutil compose $fileString gs://bucket/" . $filename . "-" . $subfile . ".csv");
    $subfile++;
    $fileString = "";
}
if ($eJobFiles > 1) {
    //Compose all the subfiles
    system('gsutil compose gs://bucket/' . $filename . '-* gs://fm-sparkbeyond/YouTube_1_0/' . $filepath . '.gz');
}
Note in order to give my Apache user access to gsutil I had to allow the user to create a .config directory in the web root. Ideally you would use the gsutil PHP library, but I didn't want the code bloat.
If anyone has a better answer please post it
Is there a way to get smaller output from the BigQuery library than 24kb per row?
Is there a more efficient way to clean up variable numbers of shards?
I have a MySQL table which can contain up to 500,000 rows, and I am fetching them on my site without any LIMIT clause. Without AJAX this works normally, but with AJAX, again without a LIMIT, no data is returned. I checked the AJAX code and there is no mistake there. The thing is, when I set a limit, for example 45,000, it works perfectly; but above that, AJAX returns nothing.
Can this be an AJAX issue (I found nothing similar on the web) or something else?
EDIT
Here is the SQL query:
SELECT ans.*, quest.inversion, t.wave_id, t.region_id, t.branch_id, quest.block, quest.saleschannelid, b.division, b.regionsid, quest.yes, quest.no FROM cms_vtb as ans
LEFT JOIN cms_vtb_question as quest ON ans.question_id=quest.id
LEFT JOIN cms_task as t ON t.id=ans.task_id
LEFT JOIN cms_wave as w ON w.id=t.wave_id
LEFT JOIN cms_branchemployees as b ON b.id=t.branchemployees_id
WHERE t.publish='1' AND t.concurent_id='' AND ans.answer<>'3' AND w.publish='1' AND quest.questhide<>1
ORDER BY t.concurent_id DESC
LIMIT 44115
and the AJAX call:
var url='&module=ajax_typespace1&<?=$base_url?>';
$.ajax({
url: 'moduls_ajax.php?'+url,
cache: false,
dataType:'html',
success: function(data)
{
$("#result").html(data);
}
});
Apparently it was a server error; adding ini_set('memory_limit', '2048M'); helped a lot.
The reason this happens has to do with how you format the data sent to the client. Not having seen the code of moduls_ajax.php, I can only suspect that you are probably assembling the query result into a variable - possibly in order to json_encode it properly?
But doing so may result in a huge memory allocation, whereas if you send the data piece by piece to the Web server, you may need a fraction of the memory only.
The same happens on your web page, where the same query output is either sent straight out or is not being encoded. In the latter case, you'll discover that when the row count grows to about two or three times the current value, the Web page that works now will stop working as well.
For example:
$result = array();
while ($tuple = $resultset->fetch()) {
$result[] = $tuple;
}
print json_encode($result);
Instead - of course, it's more complicated than before -
// Since we know it is an array with numeric keys, the JSON
// will be of the format [ <item>, <item>,...,<item> ]
$sep = '[';
while ($tuple = $resultset->fetch()) {
print $sep . json_encode($tuple);
$sep = ',';
}
print ']';
Pros and cons
This is about three times as expensive as a single function call, and can also yield a slightly worse compression performance (the web browser may receive the data in chunks of different size and find more difficulty in compressing them optimally; it's a matter of tenths of one percent, usually). On the other hand, in some setups the output will arrive much more quickly to the client browser and possibly prevent browser timeouts.
The memory requirements, if the tuples are all more or less of the same size, is around two to three N-ths of before - if you have one thousand rows, and needed one gigabyte to be able to process the query, now three-four megabytes ought to suffice. Of course, this also means that the more rows, the better... and the less rows, the less point there is in doing this.
More of the same
The same approach holds for other kind of assembling (to HTML, CSV and so on).
In some cases it may be helpful to dump the data into an external temporary file and send a Location header to have it loaded by the browser. Sometimes it is possible (if PHP is compiled as an Apache module on a Unix system) to output the file after having deleted it, so that it's not necessary to do garbage collection on the temporary files:
$fp = fopen($temporary_file, 'r');
unlink($temporary_file); // The file is deleted, the handle remains valid
fpassthru($fp); // On some platforms this results in the browser being "short-circuited" to the file descriptor, so that the PHP script may terminate while output continues normally.
die();
I am trying to parse xml files to store data into database. I have written a code with PHP (as below) and I could successfully run the code.
But the problem is, it requires around 8 mins to read a complete file (which is around 30 MB), and I have to parse around 100 files in each hour.
So, obviously my current code is of no use to me. Can anybody advise for a better solution? Or should I switch to other coding language?
What I gather from the net is that I could do it with Perl/Python, or with something called XSLT (which, frankly, I am not so sure about).
$xml = new XMLReader();
$xml->open($file);
while ($xml->name === 'node1') {
    $node = new SimpleXMLElement($xml->readOuterXML());
    foreach ($node->node2 as $node2) {
        //READ
    }
    $xml->next('node1');
}
$xml->close();
Here's an example of my script I used to parse the WURFL XML database found here.
I used the ElementTree module for Python and wrote out a JavaScript Array - although you can easily modify my script to write a CSV of the same (Just change the final 3 lines).
import xml.etree.ElementTree as ET
tree = ET.parse('C:/Users/Me/Documents/wurfl.xml')
root = tree.getroot()
dicto = {} #to store the data
for device in root.iter("device"): #parse out the device objects
dicto[device.get("id")] = [0, 0, 0, 0] #set up a list to store the needed variables
for child in device: #iterate through each device
if child.get("id") == "product_info": #find the product_info id
for grand in child:
if grand.get("name") == "model_name": #and the model_name id
dicto[device.get("id")][0] = grand.get("value")
dicto[device.get("id")][3] +=1
elif child.get("id") == "display": #and the display id
for grand in child:
if grand.get("name") == "physical_screen_height":
dicto[device.get("id")][1] = grand.get("value")
dicto[device.get("id")][3] +=1
elif grand.get("name") == "physical_screen_width":
dicto[device.get("id")][2] = grand.get("value")
dicto[device.get("id")][3] +=1
if not dicto[device.get("id")][3] == 3: #make sure I had enough
#otherwise it's an incomplete dataset
del dicto[device.get("id")]
arrays = []
for key in dicto.keys(): #sort this all into another list
arrays.append(key)
arrays.sort() #and sort it alphabetically
with open('C:/Users/Me/Documents/wurfl1.js', 'w') as new: #now to write it out
for item in arrays:
new.write('{\n id:"'+item+'",\n Product_Info:"'+dicto[item][0]+'",\n Height:"'+dicto[item][1]+'",\n Width:"'+dicto[item][2]+'"\n},\n')
Just counted this as I ran it again - took about 3 seconds.
In Perl you could use XML::Twig, which is designed to process huge XML files (bigger than can fit in memory)
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;
my $file= shift @ARGV;
XML::Twig->new( twig_handlers => { 'node1/node2' => \&read_node })
->parsefile( $file);
sub read_node
{ my( $twig, $node2)= @_;
# your code, the whole node2 string is $node2->sprint
$twig->purge; # if you want to reduce memory footprint
}
You can find more info about XML::Twig at xmltwig.org
In case of Python I would recommend using lxml.
As you are having performance problems, I would recommend iterating through your XML and processing things part by part, this would save a lot of memory and is likely to be much faster.
I am reading a 10 MB XML file within 3 seconds on an old server; your situation might be different.
About iterating with lxml: http://lxml.de/tutorial.html#tree-iteration
Review this line of code:
$node = new SimpleXMLElement($xml->readOuterXML());
The documentation for readOuterXML has a comment that it sometimes attempts to resolve namespaces, etc. In any case, this is where I would suspect a big performance problem.
Consider using readInnerXML() if you can.
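If readInnerXML() doesn't fit your structure, another option worth trying (my suggestion, not something tested against your data) is to expand the current node into DOM instead of serialising it to a string and re-parsing it; a sketch based on the loop from the question, with an initial read loop added to position the cursor:
$xml = new XMLReader();
$xml->open($file);

// position the reader on the first <node1> element
while ($xml->read() && $xml->name !== 'node1');

$doc = new DOMDocument();

while ($xml->name === 'node1') {
    // expand() returns a DOM node directly, so the subtree is not
    // serialised to a string and re-parsed as with readOuterXML()
    $node = simplexml_import_dom($doc->importNode($xml->expand(), true));
    foreach ($node->node2 as $node2) {
        //READ
    }
    $xml->next('node1');
}
$xml->close();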
I have this small bit of code (listed at the end of this message), that runs on page load. We get around 50,000 UNIQUE visitors per day (not counting repeats). It could be coincidental, but ever since implementation, there have been random server load issues.
So what I'm asking is...
1) Can someone confirm/deny whether or not the below code can in fact cause issues?
2) Can this be optimized?
Just fyi:
-- I have stuck this function in the HEADER file of a WordPress layout.
-- It is called 10+ times in the footer
-- It is a VPS server using NGINX
-- I have not checked the logs just yet
The code's purpose...
We specify a percentage to the function that tells the code to display a string that percent of the time (so if we put 60, then it means the string should show up 60% of the time). Each entry in the footer generates its own random number.
The code:
function writeRndString($theString, $percent) {
    $randno = rand(1, 100);
    if ($randno <= (int)$percent) {
        echo "Random String: " . $theString;
        echo "\n\n";
    }
}
This is a very simple function and it should be fast, even if you call it several times. Even with 50,000 daily visitors, that is about 2 pages per second.
If you can, simply remove it for a few minutes and check the server load. It could be called a lot more times than you assume :)
Maybe...
You forgot a $ on: echo "Random String: " . theString;
And a little better maybe: don't use variables you don't actually need.
Maybe also use return:
function writeRndString($theString, $percent) {
if (rand(1, 100) <= (int) $percent) {
return "Random String: " . $theString . "\n\n";
}
}
PHP:
<?php
echo "blablabla" . writeRndString($x, $y);
?>