Uncaught TypeError: DOMDocument::importNode() - php

I get the following error when running php on my local:
Fri, 25 Mar 2022 03:11:55 +0000---Starting f_contracts with query 1
Fri, 25 Mar 2022 03:12:01 +0000---Starting XML -> JSON conversion
Warning: XMLReader::expand(): /private/tmp/redshift-dump.xml:1109: parser error : Extra content at the end of the document in /Users/hm/repo/f_contract_update/data.redshift.sync.hsdp/scripts/primary.php on line 1605
Warning: XMLReader::expand(): </table_data> in /Users/hm/repo/f_contract_update/data.redshift.sync.hsdp/scripts/primary.php on line 1605
Warning: XMLReader::expand(): ^ in /Users/hm/repo/f_contract_update/data.redshift.sync.hsdp/scripts/primary.php on line 1605
Warning: XMLReader::expand(): An Error Occurred while expanding in /Users/hm/repo/f_contract_update/data.redshift.sync.hsdp/scripts/primary.php on line 1605
Fatal error: Uncaught TypeError: DOMDocument::importNode(): Argument #1 ($node) must be of type DOMNode, bool given in /Users/hm/repo/f_contract_update/data.redshift.sync.hsdp/scripts/primary.php:1605 Stack trace: #0 /Users/hm/repo/f_contract_update/data.redshift.sync.hsdp/scripts/primary.php(1605): DOMDocument->importNode(false, true) #1 /Users/hm/repo/f_contract_update/data.redshift.sync.hsdp/scripts/primary.php(1582): Primary->mysqlDumpXmlToJson() #2 /Users/hm/repo/f_contract_update/data.redshift.sync.hsdp/scripts/primary.php(1509): Primary->dumpData('f_contrac...', 1) #3 /Users/hm/repo/f_contract_update/data.redshift.sync.hsdp/scripts/primary.php(1460): Primary->process('f_contrac...', 1) #4 /Users/hm/repo/f_contract_update/data.redshift.sync.hsdp/scripts/primary.php(46): Primary->processFContracts() #5 /Users/hm/repo/f_contract_update/data.redshift.sync.hsdp/scripts/primary.php(1665): Primary->__construct() #6 {main} thrown in /Users/hm/repo/f_contract_update/data.redshift.sync.hsdp/scripts/primary.php on line 1605
The exact same script runs perfectly on the server, but gives me the above error when running locally.
When I look at the local XML dump and compare it to the dump on the server, I notice that the local XML doesn't close off properly:
Compared to the server:
This is the function it is complaining about:
private function mysqlDumpXmlToJson()
{
    if (file_exists($this->json_file)) {
        unlink($this->json_file);
    }
    $z = new \XMLReader();
    $z->open($this->xml_file);
    $doc = new \DOMDocument();
    // Skip ahead to the first <row> element.
    while ($z->read() && $z->name !== 'row');
    $f = fopen($this->json_file, 'a+');
    while ($z->name === 'row') {
        $data = [];
        // Expand the current <row> into a DOM node and wrap it in SimpleXML.
        $node = simplexml_import_dom($doc->importNode($z->expand(), true));
        foreach ($node as $col) {
            $value = (string)$col;
            $value = str_replace('0000-00-00 00:00:00', '', $value);
            $data[(string)$col['name']] = $value;
        }
        fwrite($f, json_encode($data));
        $z->next('row');
    }
    fclose($f);
}
Could it be that mysqldump is limiting the output size, and if so, where are these configurations set?
EDIT********
The following steps get executed:
private function process($table, $query)
{
    $this->info('Starting ' . $table . ' with query ' . $query);
    $this->dumpData($table, $query);
    $this->info('Compressing JSON');
    $this->compressJson();
    $this->info('Moving to S3');
    $this->moveToS3();
    $this->info('Moving data to staging table');
    $this->copyData($table);
    $this->info('Finished ' . $table . ' with query ' . $query);
}
The failure occurs during dumpData:
private function dumpData($table, $query)
{
    if (file_exists($this->xml_file)) {
        unlink($this->xml_file);
    }
    if ($table == 'c') {
        system(sprintf(
            'mysql -h rds.sdp.com -u redshift -****** --xml --database onnet --execute "select columns from c" >%s',
            $this->xml_file
        ));
    } else {
        system(sprintf(
            'mysqldump --single-transaction --no-tablespaces -h rds -u redshift -***** --xml onnet %s --where="%s"> %s',
            $table,
            $query,
            $this->xml_file
        ));
    }
    if ($table == 's_m') {
        system(sprintf(
            'sed -i "s/<..>/__#__/g" %s',
            $this->xml_file
        ));
        system(sprintf(
            'iconv -c -f utf8 -t ascii < %s > %s',
            $this->xml_file,
            $this->xml_file . '.tmp'
        ));
        system(sprintf(
            'strings %s > %s',
            $this->xml_file . '.tmp',
            $this->xml_file
        ));
    }
    $this->info('Starting XML -> JSON conversion');
    $this->mysqlDumpXmlToJson();
    $this->info('Finished XML -> JSON conversion');
    return (filesize($this->json_file) > 1000);
}
Hope that clarifies.
The file paths are:
private $xml_file = '/tmp/rsh-dmp.xml';
private $json_file = '/tmp/rsh-dmp.json';
private $json_file_compressed = '/tmp/rsh-dmp.json.gz';
private $json_file_s3 = 's3://sdp-ew/rsh-dmp.json.gz';

You have stated that the exact same script runs correctly on prod while it fails on your local, so the source code you have is capable of running correctly. The problem must be something else.
Now, we know that $this->xml_file is a file on the machine running the script, prod or local, and it is XMLReader::expand() that complains first; the call then returns false (a bool) instead of a DOMNode, which strongly suggests that the XML could not be parsed.
Since you have shown that the actual file ends unexpectedly, it is highly probable that your file is not well-formed.
First of all, you need a root node. From your example we see that your root node is not being closed, so you either have some unclosed nodes, or you do not have a root node in the first place.
If you download the file from prod and test your local code with that file, you will see that it executes correctly. So the bug is at the place where the file was created/generated. If you have a function/module which generates the file, you will need to debug how it is being generated and see why the nodes are not being closed, or why the root node is not being wrapped around your structure.
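As a side note, a minimal sketch (adapted from the mysqlDumpXmlToJson() shown above, nothing else changed) of how to fail loudly when expand() hits the malformed XML, instead of passing a bool into importNode():
while ($z->name === 'row') {
    $expanded = $z->expand();
    if ($expanded === false) {
        // expand() returns false when the reader hits malformed XML (e.g. a truncated dump);
        // stop here with a clear message instead of letting DOMDocument::importNode()
        // throw a TypeError on the bool.
        throw new \RuntimeException('Malformed XML in ' . $this->xml_file);
    }
    $node = simplexml_import_dom($doc->importNode($expanded, true));
    // ... rest of the loop unchanged ...
    $z->next('row');
}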

Related

Php-fpm error log file is still logging my error even using try-catch
$NUM_OF_ATTEMPTS = 100;
$attempts = 0;
do {
    try {
        $db = new SQLite3('proxies/socks5.db');
        $results = $db->query('SELECT proxy FROM socks5proxies WHERE timeout <= ' . $settimeout . $countryq . ';');
        while ($row = $results->fetchArray()) {
            echo $row['proxy'] . "\r\n";
        }
    } catch (Exception $e) {
        $attempts++;
        sleep(1);
        continue;
    }
    break;
} while ($attempts < $NUM_OF_ATTEMPTS);
Expected result:
Retry on error, and don't log the error
Actual results:
Logs the error in the php-fpm error log file:
thrown in /var/www/html/api.php on line 200
[10-Jan-2019 14:00:49 UTC] PHP Warning: SQLite3::query(): Unable to prepare statement: 11, database disk image is malformed in /var/www/html/api.php on line 140
[10-Jan-2019 14:00:49 UTC] PHP Fatal error: Uncaught Error: Call to a member function fetchArray() on boolean in /var/www/html/api.php:141
Stack trace:
#0 {main}
thrown in /var/www/html/api.php on line 141
Call SQLite3::enableExceptions to tell PHP it should throw exceptions instead of standard errors:
try {
    $db = new SQLite3('proxies/socks5.db');
    $db->enableExceptions(true);
    $results = $db->query('...');
} catch (\Exception $e) {
}
In any case, if you need to do 100 attempts to get this to work, then this really isn't the angle you should be taking to fix it.
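For completeness, a minimal sketch of how enableExceptions() might slot into the retry loop from the question (variables and the query are taken from the question as-is):
$attempts = 0;
do {
    try {
        $db = new SQLite3('proxies/socks5.db');
        // With exceptions enabled, a malformed database or a failed query throws,
        // and a caught exception is not written to the php-fpm error log.
        $db->enableExceptions(true);
        $results = $db->query('SELECT proxy FROM socks5proxies WHERE timeout <= ' . $settimeout . $countryq . ';');
        while ($row = $results->fetchArray()) {
            echo $row['proxy'] . "\r\n";
        }
        break; // success
    } catch (\Exception $e) {
        $attempts++;
        sleep(1);
    }
} while ($attempts < $NUM_OF_ATTEMPTS);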

Insert into mongo database does not work - why?

I am trying to set up some basic fixtures in MongoDB. To do so I read a set of JSON files and try to insert them into the local db (I am connecting with admin rights). Here's the weird part: for some reason I wrote two versions of the code that should work basically the same, so I have one:
$client = new \MongoClient($connectionURI);
$db = $client->selectDB($database);
$collections = $db->listCollections();
foreach ($collections as $collection) {
    //echo "Removing all documents from '$collection'" . PHP_EOL;
    $collection->remove();
    $tmp = explode('.', (string)$collection);
    $collectionName = $tmp[1];
    $tmp = explode('_', $tmp[0]);
    $dbName = $tmp[1];
    $country = $tmp[2];
    if (file_exists(__DIR__."/fixtures/{$country}/{$dbName}/{$collectionName}.json")) {
        echo "Inserting fixture data into '{$collection}'".PHP_EOL;
        $data = json_decode(file_get_contents(__DIR__."/fixtures/{$country}/{$dbName}/{$collectionName}.json"));
        $doc = $collection->insert($data, ["w" => "majority"]);
    }
}
And a second one based on iterating over the files to read instead of listing the existing collections:
$client = new \MongoClient($connectionURI);
foreach (glob(__DIR__.'/fixtures/*', GLOB_ONLYDIR) as $dir) {
    $country = basename($dir);
    foreach (glob(__DIR__.'/fixtures/'.$country.'/*', GLOB_ONLYDIR) as $dbDir) {
        $collections = array_diff(
            scandir(__DIR__."/fixtures/{$country}/".basename($dbDir)), ['.', '..']
        );
        $dbName = 'test_'.basename($dbDir).'_'.$country;
        foreach ($collections as $collectionFile) {
            $collectionName = pathinfo($collectionFile)['filename'];
            $data = [];//json_decode(file_get_contents(__DIR__."/fixtures/{$country}/".basename($dbDir)."/{$collectionName}.json"));
            // $client->$dbName->$collectionName->insert($data);
            $db = $client->selectDB($dbName);
            $collection = $db->selectCollection($collectionName);
            $collection->insert($data, ["w" => "majority"]);
            echo $country.'->'.$dbName.'->'.$collectionName.PHP_EOL;
        }
    }
}
The trick is that the first implementation works nicely and the second throws a MongoCursorException with an authentication problem. The problem I am having is that both versions in fact try to connect to the exact same database and collection. So I am getting the following output:
Inserting fixture data into 'test_customers_poland.accounts'
PHP Fatal error: Uncaught exception 'MongoCursorException' with message 'Couldn't get connection: Failed to connect to: 127.0.0.1:27017: SASL Authentication failed on database 'test_customers_poland': Authentication failed.' in /srv/dev-api/src/tests/init_databases.php:96
Stack trace:
#0 /srv/dev-api/src/tests/init_databases.php(96): MongoCollection->insert(Array, Array)
#1 {main}
thrown in /srv/dev-api/src/tests/init_databases.php on line 96
Of course I also checked running those snippets separately, with the same result, so the question is: what am I doing wrong in the second approach?

fgets() halting without error in Swiftmailer

I'm attempting to generate a test email in Laravel 4.2 (Windows 7, IIS 6.1), and I've encountered a silent termination - it just fails, doesn't return my view, and doesn't return an error or Exception. I've managed to brute force my way through the Laravel codebase and located the termination within Swift\Transport\AbstractSmtpTransport::_getFullResponse(), specifically the line $line = $this->_buffer->readLine($seq);:
protected function _getFullResponse($seq)
{
    $response = '';
    try {
        do {
            $line = $this->_buffer->readLine($seq);
            $response .= $line;
        } while (null !== $line && false !== $line && ' ' != $line{3});
    } catch (Swift_IoException $e) {
        $this->_throwException(
            new Swift_TransportException($e->getMessage())
        );
    } catch (Swift_TransportException $e) {
        $this->_throwException($e);
    }
    return $response;
}
That do loop executes twice. The first time $line is assigned the value * OK The Microsoft Exchange IMAP4 service is ready., which is great, as obviously I'm getting to the server. Unfortunately, the second iteration fails in Swift\Transport\StreamBuffer::readLine() at the line $line = fgets($this->_out); :
public function readLine($sequence)
{
    if (isset($this->_out) && !feof($this->_out)) {
        $line = fgets($this->_out);
        if (strlen($line) == 0) {
            $metas = stream_get_meta_data($this->_out);
            if ($metas['timed_out']) {
                throw new Swift_IoException(
                    'Connection to ' .
                    $this->_getReadConnectionDescription() .
                    ' Timed Out'
                );
            }
        }
        return $line;
    }
}
I've tried wrapping that line in a try/catch, and nothing happens; the code just halts with no information on the second iteration of the do loop. So, any advice as to a) how to squeeze more information out of the halt, or b) what could cause fgets() to halt this way?
Ok, progress: after broadening my search to Swiftmailer in general, I was able to find mention of a timeout occurring at that particular line in Swiftmailer. By extending the max_execution_time in php.ini I was able to get an actual Exception:
Expected response code 220 but got code "", with message "* OK The Microsoft Exchange IMAP4 service is ready. * BYE Connection is closed. 13 "
I think under the circumstances I'll close this question and move onto figuring out why I'm not getting a 220.
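For reference, the time-limit change mentioned above can be made in php.ini (max_execution_time = 300) or, equivalently for a single script, with something like the hypothetical snippet below (300 seconds is just an example value):
// Raise the script time limit so the blocked fgets() can hit its own timeout
// and Swiftmailer can surface a real exception instead of the script being cut off first.
set_time_limit(300);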

How do I include PHP required libs in an AWS EMR streaming cluster

I've created a PHP project that converts JSON format into AVRO format.
The original project requires PHP libs that I'm not sure how to add on EMR.
This is the stderr log received by EMR:
PHP Warning: require_once(vendor/autoload.php): failed to open stream: No such file or directory in /mnt/var/lib/hadoop/tmp/nm-local-dir/usercache/hadoop/filecache/12/convert-json-to-avro.php on line 3
PHP Fatal error: require_once(): Failed opening required 'vendor/autoload.php' (include_path='.:/usr/share/pear:/usr/share/php') in /mnt/var/lib/hadoop/tmp/nm-local-dir/usercache/hadoop/filecache/12/convert-json-to-avro.php on line 3
log4j:WARN No appenders could be found for logger (amazon.emr.metrics.MetricsUtil).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
And here is the main code for the mapper:
#!/usr/bin/php
<?php
require_once 'vendor/autoload.php';
error_reporting(E_ALL);
ini_set('display_errors', 1);
$outputFile = __DIR__ . '/test_avro_out.avr';
$avroJsonSchema = file_get_contents(__DIR__ . '/HttpRequestEvent.avsc');
// Open $file_name for writing, using the given writer's schema
$avroWriter = AvroDataIO::open_file($outputFile, 'w', $avroJsonSchema);
$counter = 1;
while (($buf = fgets(STDIN)) !== false) {
    try {
        // replace ,null: with ,"null": to prevent map keys which are not strings.
        $original = array("null:", "userIp");
        $replaceWith = array("\"null\":", "userIP");
        $data = json_decode(str_replace($original, $replaceWith, $buf), true);
        //print_r($buf);
        if ($data === false || $data == null) {
            throw new InvalidArgumentException("Unable to parse JSON line");
        }
        $mapped = map_request_event($data);
        var_dump($mapped);
        //$avroWriter->append($mapped);
        //echo json_encode($mapped), "\n";
    } catch (Exception $ex) {
        fprintf(STDERR, "Caught exception: %s\n", $ex->getMessage());
        fprintf(STDERR, "Line num: %s\n", $counter);
        fprintf(STDERR, "buf: %s\n", $buf);
    }
    $counter++;
}
$avroWriter->close();
Notice I'm using require_once 'vendor/autoload.php';, which assumes that autoload.php is under the vendor folder.
What is the right way to load the vendor folder into the EMR cluster (there are needed files there)?
Should the require_once path change?
Thanks.
Following Guy's comment, I've used a bash script similar to the one you can find here.
I've changed the require_once 'vendor/autoload.php' line in the code to point to the location where I dropped my files (/home/hadoop/contents worked perfectly).
Lastly, I've added an EMR custom bootstrap step where you can add the bash script so it runs before the PHP streaming step.
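To illustrate, the changed require line might look something like the sketch below (the /home/hadoop/contents path is the one mentioned above; adjust it to wherever your bootstrap script drops the files):
// Point the autoloader at the directory the bootstrap script populated,
// instead of a vendor/ folder relative to the streaming script.
require_once '/home/hadoop/contents/vendor/autoload.php';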

Error Reporting? Import .sql file from PHP and show errors

I am building a way of importing .SQL files into a MySQL database from PHP. This is used for executing batches of queries. The issue I am having is error reporting.
$command = "mysql -u $dbuser --password='$dbpassword' --host='$sqlhost' $dbname < $file";
exec($command, $output);
This is essentially how I am importing my .sql file into my database. The issue is that I have no way of knowing if any errors occurred within the PHP script executing this command. Successful imports are entirely indistinguishable from a failure.
I have tried:
Using PHP's sql error reporting functions.
Adding the verbose argument to the command and examining the output. It simply returns the contents of the .sql file and that is all.
Setting errors to a user variable within the .sql file and querying it from the PHP script.
I hope I am not forced to write the errors into a temporary table. Is there a better way?
UPDATE:
If possible, it would be very preferable if I could determine WHAT errors occurred, not simply IF one occurred.
$command = "mysql -u $dbuser --password='$dbpassword' --host='$sqlhost' $dbname"
. " < $file 2>&1";
exec($command, $output);
The error message you're looking for is probably printed to stderr rather than stdout. 2>&1 causes stderr to be included in stdout, and as a result, also included in $output.
Even better, use proc_open instead of exec, which gives you far more control over the process, including separate stdout and stderr pipes.
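A minimal sketch of that proc_open approach, reusing the command from the question (the error handling here is illustrative, not exhaustive):
$descriptors = [
    1 => ['pipe', 'w'], // child's stdout
    2 => ['pipe', 'w'], // child's stderr
];
$command = "mysql -u $dbuser --password='$dbpassword' --host='$sqlhost' $dbname < $file";
$process = proc_open($command, $descriptors, $pipes);
if (is_resource($process)) {
    $stdout = stream_get_contents($pipes[1]); // normal mysql output (usually empty for an import)
    $stderr = stream_get_contents($pipes[2]); // error messages land here
    fclose($pipes[1]);
    fclose($pipes[2]);
    $exitCode = proc_close($process);
    if ($exitCode !== 0 || $stderr !== '') {
        echo 'Import failed (exit code ' . $exitCode . '):' . PHP_EOL . $stderr;
    }
}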
Try using shell_exec
$output = shell_exec( "mysql -u $dbuser --password='$dbpassword' --host='$sqlhost' $dbname < $file" );
// parse $output here for errors
From the manual:
shell_exec — Execute command via shell and return the complete output as a string
Note:
This function is disabled when PHP is running in safe mode.
EDIT: Full solution:
What you need to do is grab STDERR and discard STDOUT. Do this by adding '2>&1 1> /dev/null' to the end of your command.
$output = shell_exec("mysql -u $dbuser --password='$dbpassword' --host='$sqlhost' $dbname < $file 2>&1 1> /dev/null");
$lines = explode(PHP_EOL, $output);
$errors = array();
foreach ($lines as $line) {
    if (strtolower(substr($line, 0, 5)) == 'error') {
        $errors[] = $line;
    }
}
if (count($errors)) {
    echo PHP_EOL . 'Errors occurred during import.';
    echo implode(PHP_EOL, $errors);
} else {
    echo 'No Errors' . PHP_EOL;
}
When issuing an exec, the shell returns 0 on success, or a non-zero number indicating a failure. Note that exec()'s own return value is the last line of output, so the exit status has to be captured via the third parameter:
exec($command, $output, $result);
should do the trick. Check $result and handle appropriately.
You have done everything but look at the PHP manual! There is an additional parameter for exec() to return a result:
http://php.net/manual/en/function.exec.php
"If the return_var argument is present along with the output argument, then the return status of the executed command will be written to this variable."
exec($command, $output, $result);
if ($result === 0) {
    // success
} else {
    // failure
}
