I am not sure why this was working fine last night and this morning I am getting
Fatal error: Out of memory (allocated 1611137024) (tried to allocate
1610350592 bytes) in /home/twitcast/public_html/system/index.php on
line 121
The section of code being ran is as follows
function podcast()
{
$fetch = new server();
$fetch->connect("TCaster");
$collection = $fetch->db->shows;
// find everything in the collection
$cursor = $collection->find();
if($cursor->count() > 0)
{
$test = array();
// iterate through the results
while( $cursor->hasNext() ) {
$test[] = ($cursor->getNext());
}
$i = 0;
foreach($test as $d) {
for ( $i = 0; $i <= 3; $i ++) {
$url = $d["streams"][$i];
$xml = file_get_contents( $url );
$doc = new DOMDocument();
$doc->preserveWhiteSpace = false;
$doc->loadXML( $xml); // $xml = file_get_contents( "http://www.c3carlingford.org.au/podcast/C3CiTunesFeed.xml")
// Initialize XPath
$xpath = new DOMXpath( $doc);
// Register the itunes namespace
$xpath->registerNamespace( 'itunes', 'http://www.itunes.com/dtds/podcast-1.0.dtd');
$items = $doc->getElementsByTagName('item');
foreach( $items as $item) {
$title = $xpath->query( 'title', $item)->item(0)->nodeValue;
$published = strtotime($xpath->query( 'pubDate', $item)->item(0)->nodeValue);
$author = $xpath->query( 'itunes:author', $item)->item(0)->nodeValue;
$summary = $xpath->query( 'itunes:summary', $item)->item(0)->nodeValue;
$enclosure = $xpath->query( 'enclosure', $item)->item(0);
$url = $enclosure->attributes->getNamedItem('url')->value;
$fname = basename($url);
$collection = $fetch->db->shows_episodes;
$cursorfind = $collection->find(array("internal_url"=>"http://twitcatcher.russellharrower.com/videos/$fname"));
if($cursorfind->count() < 1)
{
$copydir = "/home/twt/public_html/videos/";
$data = file_get_contents($url);
$file = fopen($copydir . $fname, "w+");
fputs($file, $data);
fclose($file);
$collection->insert(array("show_id"=> new MongoId($d["_id"]),"stream"=>$i,"episode_title"=>$title, "episode_summary"=>$summary,"published"=>$published,"internal_url"=>"http://twitcatcher.russellharrower.com/videos/$fname"));
echo "$title <br> $published <br> $summary <br> $url<br><br>\n\n";
}
}
}
}
}
line 121 is
$data = file_get_contents($url);
You want to add 1.6GB of memory usage for a single PHP thread? While you can increase the memory limit, my strong advice is to look at another way of doing what you want.
Probably the easiest solution: you can use CURL to request a byte range of the source file (using Curl is wiser than get_file_contents anyway, for remote files). You can get 100K ata time, write to the local file then got the next 100k and appeand to the file etc, until the entire file is pulled in.
You may also do something with streams, but it gets a little more complex. This may be your only option if the remote server won't let you get part of a file by bytes.
Finally there's Linux commands such as wget, run through exec(), if your server has permissions.
Memory Limit - take a look at this directive. Suppose that is what you need.
or you may try to use copy instead of reading file to memory (which is video file, as I understand so there is nothing strange that it takes a lot of memory):
$copydir = "/home/twt/public_html/videos/";
copy($url, $copydir . $fname);
Looks like last night opened file were smaller)
Related
I am writing a web scraper in PHP using gitpod. After a while I have managed to solve all problems. But even though no problems are left, the code does not open the browser nor produce any output.
Does anybody have an idea why that could be the case?
<?php
if (file_exists('vendor/autoload.php')) {
require 'vendor/autoload.php';
}
use Goutte\Client;
$client = new Goutte\Client();
// Create a new array to store the scraped data
$data = array();
// Loop through the pages
if ($client->getResponse()->getStatus() != 200) {
echo 'Failed to access website. Exiting script.';
exit();
}
for ($i = 0; $i < 3; $i++) {
// Make a request to the website
$crawler = $client->request('GET', 'https://ec.europa.eu/info/law/better-regulation/have-your-say/initiatives_de?page=' . $i);
// Find all the initiatives on the page
$crawler->filter('.initiative')->each(function ($node) use (&$data) {
// Extract the information for each initiative
$title = $node->filter('h3')->text();
$link = $node->filter('a')->attr('href');
$description = $node->filter('p')->text();
$deadline = $node->filter('time')->attr('datetime');
// Append the data for the initiative to the data array
$data[] = array($title, $link, $description, $deadline);
});
// Sleep for a random amount of time between 5 and 10 seconds
$sleep = rand(5,10);
sleep($sleep);
}
// Open the output file
$fp = fopen('initiatives.csv', 'w');
// Write the header row
fputcsv($fp, array('Title', 'Link', 'Description', 'Deadline'));
// Write the data rows
foreach ($data as $row) {
fputcsv($fp, $row);
}
// Close the output file
fclose($fp);
?>
I'm using SimpleXML to fetch a remote XML file and im having some issues because sometimes SimpleXML can't load the XML. I don't know exactly the reason but i suspect the remote site takes longer than usual to return data, resulting in a timeout.
The code i use is the following:
$xml = #simplexml_load_file($url);
if(!$xml){
$database = Config_helper::get_config_option('mysql');
$db = new \DB($database['database'], $database['server'], $database['user'], $database['password']);
$date = date('Y-m-d H:i:s');
$db->query("INSERT INTO gearman_job_error (timestamp, data, attempt)
VALUES ('$date', '{$job->workload()}', '1')");
//$db->query("INSERT INTO gearman_job_error (timestamp, data, attempt) VALUES ({$date}, {$job->workload()}, 1);");
return $job->sendFail();
}
else {
foreach($xml->point as $key=>$value):
$length = count($value);
$timestamp = (string) $value->data[0];
$j=0;
for ($i = 1; $i < $length; $i++)
{
$forecast[$timestamp][$time_request][] = array($variables[$j] => (string) $value->data[$i]);
$j++;
}
endforeach;
return serialize($forecast);
}
Those url's i can't load are stored in the database and by checking them i confirm that they load correctly in the browser.. no problem with them.
Example: http://mandeo.meteogalicia.es/thredds/ncss/modelos/WRF_HIST/d02/2015/02/wrf_arw_det_history_d02_20150211_0000.nc4?latitude=40.393288&longitude=-8.873433&var=rh%2Ctemp%2Cswflx%2Ccfh%2Ccfl%2Ccfm%2Ccft&point=true&accept=xml&time_start=2015-02-11T00%3A00Z&time_end=2015-02-14T20%3A00Z
My question is, how can i insist the SimpleXML to take it's time to load the url? My goal is only after a reasonable time it assumes it can't load the file and store it in the database.
simplexml_load_file itself doesn't have any support for specifying timeouts, but you can combine file_get_contents and simplexml_load_string, like this:
<?php
$timeout = 30;
$url = 'http://...';
$context = stream_context_create(['http' => ['timeout' => $timeout]]);
$data = file_get_contents($url, false, $context);
$xml = simplexml_load_string($data);
print_r($xml);
I figured a way of doing this that for now suits me.
I set a maximum number of tries to fetch the xml and if it doesn't work that means the xml can be possibly damaged or missing.
I have tested and the results are accurate! It's simple and more effective then setting a timeout. I guess you can always set a timeout also.
$maxTries = 5;
do
{
$content = #file_get_contents($url);
}
while(!$content && --$maxTries);
if($content)
{
try
{
$xml = #simplexml_load_string($content);
# Do what you have to do here #
}
catch(Exception $exception)
{
print($exception->getMessage());
}
}
else
{
echo $url;
$job->sendFail();
}
Am dynamically loading an xml file and sending request to the api but getting
Warning: DOMDocument::loadXML(): Empty string supplied as input in /home/spotrech/public_html/ but this error is very inconsistent sometime appear sometime don't! I really no idea how to solve this. below is code
$rechargeApiUrl = "http://allrechargeapi.com/apirecharge.ashx?uid=$uid&apikey=$apike&number=$mobileNo&opcode=$opId&amount=$amount&ukey=$uniId&format=xml";
$url = file_get_contents($rechargeApiUrl);
$xmlDoc = new DOMDocument();
$xmlDoc->loadXML(preg_replace('/(<\?xml[^?]+?)utf-16/i', '$1utf-8', $url));
$itemInfo = $xmlDoc->getElementsByTagName('Result'); //returns an object.
$itemCount = $itemInfo->length;
foreach ($itemInfo as $userInfo) {
//Assigning node values to its specified variables.
$ukey = strtolower($userInfo->getElementsByTagName('ukey')->item(0)->childNodes->item(0)->nodeValue);
$status = $userInfo->getElementsByTagName('status')->item(0)->childNodes->item(0)->nodeValue;
$resultCode = $userInfo->getElementsByTagName('resultcode')->item(0)->childNodes->item(0)->nodeValue;
}
$strStatus = strtolower(trim($status));
$strResultCode = trim($resultCode);
$strCode = trim($ukey);
any response will be appreciated.Thank you
I'm using the following code to turn user's IP into latitude/longitude information using the hostip web service:
//get user's location
$ip=$_SERVER['REMOTE_ADDR'];
function get_location($ip) {
$content = file_get_contents('http://api.hostip.info/?ip='.$ip);
if ($content != FALSE) {
$xml = new SimpleXmlElement($content);
$coordinates = $xml->children('gml', TRUE)->featureMember->children('', TRUE)->Hostip->ipLocation->children('gml', TRUE)->pointProperty->Point->coordinates;
$longlat = explode(',', $coordinates);
$location['longitude'] = $longlat[0];
$location['latitude'] = $longlat[1];
$location['citystate'] = '==>'.$xml->children('gml', TRUE)->featureMember->children('', TRUE)->Hostip->children('gml', TRUE)->name;
$location['country'] = '==>'.$xml->children('gml', TRUE)->featureMember->children('', TRUE)->Hostip->countryName;
return $location;
}
else return false;
}
$data = get_location($ip);
$center_long=$data['latitude'];
$center_lat=$data['longitude'];
This works fine for me, using $center_long and $center_lat the google map on the page is centered around my city, but I have a friend in Thailand who tested it from there, and he got this error:
Warning: get_location() [function.get-location]: Node no longer exists in /home/bedbugs/registry/index.php on line 21
So I'm completely confused by this, how could he be getting an error if I don't? I tried googling it and it has something to do with parsing XML data, but the parsing process is the same for me and him. Note that line 21 is the one that starts with '$coordinates =' .
You need to check the service actually has an <ipLocation> listed, you're doing:
$xml->children('gml', TRUE)->featureMember->children('', TRUE)->Hostip->ipLocation
->children('gml', TRUE)->pointProperty->Point->coordinates
but the XML output for my IP is:
<HostipLookupResultSet version="1.0.1" xsi:noNamespaceSchemaLocation="http://www.hostip.info/api/hostip-1.0.1.xsd">
<gml:description>This is the Hostip Lookup Service</gml:description>
<gml:name>hostip</gml:name>
<gml:boundedBy>
<gml:Null>inapplicable</gml:Null>
</gml:boundedBy>
<gml:featureMember>
<Hostip>
<ip>...</ip>
<gml:name>(Unknown City?)</gml:name>
<countryName>(Unknown Country?)</countryName>
<countryAbbrev>XX</countryAbbrev>
<!-- Co-ordinates are unavailable -->
</Hostip>
</gml:featureMember>
</HostipLookupResultSet>
The last part ->children('gml', TRUE)->pointProperty->Point->coordinates gives the error because it has no children (for some IPs).
You can add a basic check to see if the <ipLocation> node exists like this (assuming the service always returns at least up to the <hostIp> node):
function get_location($ip) {
$content = file_get_contents('http://api.hostip.info/?ip='.$ip);
if ($content === FALSE) return false;
$location = array('latitude' => 'unknown', 'longitude' => 'unknown');
$xml = new SimpleXmlElement($content);
$hostIpNode = $xml->children('gml', TRUE)->featureMember->children('', TRUE)->Hostip;
if ($hostIpNode->ipLocation) {
$coordinates = $hostIpNode->ipLocation->children('gml', TRUE)->pointProperty->Point->coordinates;
$longlat = explode(',', $coordinates);
$location['longitude'] = $longlat[0];
$location['latitude'] = $longlat[1];
}
$location['citystate'] = '==>'.$hostIpNode->children('gml', TRUE)->name;
$location['country'] = '==>'.$hostIpNode->countryName;
return $location;
}
Just wondering if anyone can point me in the direction of some tips / a script that will help me create an XML from an original CSV File, using PHP.
Cheers
This is quite easy to do, just look at fgetcsv to read csv files and then DomDocument to write an xml file. This version uses the headers from the file as the keys of the xml document.
<?php
error_reporting(E_ALL | E_STRICT);
ini_set('display_errors', true);
ini_set('auto_detect_line_endings', true);
$inputFilename = 'input.csv';
$outputFilename = 'output.xml';
// Open csv to read
$inputFile = fopen($inputFilename, 'rt');
// Get the headers of the file
$headers = fgetcsv($inputFile);
// Create a new dom document with pretty formatting
$doc = new DomDocument();
$doc->formatOutput = true;
// Add a root node to the document
$root = $doc->createElement('rows');
$root = $doc->appendChild($root);
// Loop through each row creating a <row> node with the correct data
while (($row = fgetcsv($inputFile)) !== FALSE)
{
$container = $doc->createElement('row');
foreach($headers as $i => $header)
{
$child = $doc->createElement($header);
$child = $container->appendChild($child);
$value = $doc->createTextNode($row[$i]);
$value = $child->appendChild($value);
}
$root->appendChild($container);
}
$strxml = $doc->saveXML();
$handle = fopen($outputFilename, "w");
fwrite($handle, $strxml);
fclose($handle);
The code given above creates an XML document, but does not store it on any physical device. So replace echo $doc->saveXML(); with
$strxml = $doc->saveXML();
$handle = fopen($outputFilename, "w");
fwrite($handle, $strxml);
fclose($handle);
There are a number of sites out there that will do it for you.
If this is going to be a regular process rather than a one-time thing it may be ideal to just parse the CSV and output the XML yourself:
$csv = file("path/to/csv.csv");
foreach($csv as $line)
{
$data = explode(",", $line);
echo "<xmltag>".$data[0]."</xmltag>";
//etc...
}
Look up PHP's file and string functions.
Sorry to bring up an old thread but I tried this script and I'm getting a DOM Exception error
The headers of our CSV Files are display_name office_number mobile_number and the error I'm receiving is DOMDocument->createElement('\xEF\xBB\xBFdisplay_name') #1 {main}