I have a Pig script (currently running in local mode) that processes a huge file containing a list of categories:
/root/level1/level2/level3
/root/level1/level2/level3/level4
...
I need to insert each of these into an existing database by calling a stored procedure. Because I'm new to Pig and the UDF interface is a little daunting, I'm trying to get something done by streaming the file's content through a PHP script.
I'm finding that the PHP script only sees half of the category lines I'm passing through it, though. More precisely, I get a record back for ceil(pig_categories / 2) of them. A LIMIT of 15 will produce 8 entries after streaming through the PHP script, and the last one will be empty.
-- Pig script snippet
ordered = ORDER mappable_categories BY category;
limited = LIMIT ordered 20;
categories = FOREACH limited GENERATE category;
DUMP categories; -- Displays all 20 categories
streamed = STREAM limited THROUGH `php -nF categorize.php`;
DUMP streamed; -- Displays 10 categories
# categorize.php
$category = fgets( STDIN );
echo $category;
Any thoughts on what I'm missing? I've pored over the Pig reference manual for a while now, and there doesn't seem to be much information about streaming through a PHP script. I've also tried the #hadoop channel on IRC, to no avail. Any guidance would be much appreciated.
Thanks.
UPDATE
It's becoming evident that this is EOL-related. If I change the PHP script from using fgets() to stream_get_line(), then I get 10 items back, but the record that should be first is skipped and there's a trailing empty record that gets displayed.
(Arts/Animation)
(Arts/Animation/Anime)
(Arts/Animation/Anime/Characters)
(Arts/Animation/Anime/Clubs_and_Organizations)
(Arts/Animation/Anime/Collectibles)
(Arts/Animation/Anime/Collectibles/Cels)
(Arts/Animation/Anime/Collectibles/Models_and_Figures)
(Arts/Animation/Anime/Collectibles/Models_and_Figures/Action_Figures)
(Arts/Animation/Anime/Collectibles/Models_and_Figures/Action_Figures/Gundam)
()
In that result set, there should be a first item of (Arts). Closing in, but there's still some gap to close.
So it turns out that this is one of those instances where whitespace matters. I had an empty line in front of my opening <?php tag. Once I tightened all of that up, everything sailed through and produced as expected. /punitive headslap/
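For anyone landing here with the same symptom, below is a minimal sketch of a cleaned-up categorize.php. Two things are worth noting: with php -F, PHP itself reads one input line per iteration into the built-in $argn variable, so calling fgets(STDIN) inside the script consumes an extra line per record (which would produce exactly the every-other-line behavior above); and there must be no whitespace before the opening <?php tag, or PHP echoes it and Pig sees a spurious empty record.
<?php
// categorize.php -- invoked as: php -nF categorize.php
// With -F, this script body runs once per input line and the current
// line is available in $argn; don't read STDIN here or you'll swallow
// the next record.
echo $argn, "\n";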
Scenario:
I have a PHP file that I'm using behind a zip code lookup form. It has numeric arrays of five-digit zip codes, running anywhere from 500 to 1400 zip codes each. So far it works, but I get PHP CodeSniffer warnings in my code editor (Brackets) that I'm exceeding the 120-character line limit.
Question:
Will this stop my PHP from running in certain browsers?
Do I have to go to every 120 characters and insert a line break just to keep the line length in compliance?
It appears I need to place these long strings into a database and read them into the array rather than hard-coding them all inside the PHP.
I am a front-end designer, so there is a lot to learn.
<?php
$zip = $_GET['zip']; // note: if your form method is post, use $_POST['zip']
// Region 01 - PersonOne Name Zips
$loc01 = array("59001", "59002", "59003", "59004", "59006");
// Region 02 - PersonTwo Name Zips (originally assigned to $loc01 twice, a bug)
$loc02 = array("00001", "00002", "00003", "00004", "00006");
// The above arrays could include 2000 zips each
// Region 01 - PersonOne redirect
if (in_array($zip, $loc01)) {
    header("Location: https://company.com/personone");
    exit;
}
// Region 02 - PersonTwo redirect
if (in_array($zip, $loc02)) {
    header("Location: https://company.com/persontwo");
    exit;
}
Question: Will this stop my PHP from running in certain browsers?
No. PHP runs entirely on the server; browsers never see it, because browsers are clients. Languages like HTML, CSS and (most) JavaScript are browser languages, but PHP is server-side only.
Do I have to go to every 120 characters and insert a line break just to keep the line length in compliance?
No, but I would highly suggest using a database to store tons of records like this; it's exactly what databases are for. Alternatively, you could put them in a file and simply read the file in with PHP's file_get_contents function.
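For illustration, here is a minimal sketch of the file-based approach; the filename region01_zips.txt (one zip code per line) is an assumption for the example, not something from your setup:
<?php
// region01_zips.txt is a hypothetical file with one five-digit zip per line.
$zips = file('region01_zips.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
if (in_array($_GET['zip'], $zips, true)) {
    header("Location: https://company.com/personone");
    exit;
}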
I will try to:
Add each array into a MySQL database record.
Create a PHP script that fetches each array and applies it to the respective location.
This will eliminate the bloated lines of array numbers in PHP.
BTW, I also need to define these as 5-digit numeric strings, since many of the zips start with one or two zeros, which get dropped when the POST value is matched as a number.
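A small sketch of that normalization; padding whatever arrives back out to five characters keeps "00001" from collapsing to 1 (the variable names are illustrative):
<?php
// Normalize the submitted zip (int or string) back to a 5-character string.
$zip = str_pad((string) $_POST['zip'], 5, '0', STR_PAD_LEFT);
// Keep the lists as strings so strict in_array() comparison works.
$loc02 = array("00001", "00002", "00003", "00004", "00006");
var_dump(in_array($zip, $loc02, true)); // bool(true) for a submitted "1"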
Thanks everyone for the input.
I have a script that is reading the last line of a log file for my radio station using the following command.
watch -n 1 php script.php
The command above executes my script at 1-second intervals. The log this script reads has output as listed below.
2016-04-28 23:30:34 INFO [ADMINCGI sid=1] Title updated [Bankroll Fresh Feat. Street Money Boochie - Get It]
2016-04-28 23:30:34 INFO [ADMINCGI sid=1] DJ name updated [1]
2016-04-28 23:30:36 INFO [YP] Updating listing details for stream #1 succeeded.
Every time the song changes, 3 more lines are added to the logs, as in the example output above. I need a way to do 3 things.
1) Detect only the latest occurrence of an entry in the logs matching the pattern of line #1
2) Execute code when that occurs, and do nothing else until it happens again.
3) Regex to extract the data between the second set of square brackets on a line, delimited by a "-", e.g. [Rihanna feat. Keyshia Cole - Title (Remix)]
Before the log output of my radio script changed, my script would detect when a song change occurred by tailing the logs for the 'Title Updated' line and then extract the artist and title name from within the square brackets on that same line.
Once that happens the data is sent to a MySQL database and sent to Twitter.
I have tried using strpos within an if statement to first detect a line that contains "Title updated" and then execute a function to grab the song information from that line. That works, but only in a static scenario: if I put an example excerpt of line #1 into a variable and run it through my script, it detects the line and serves its purpose. I need the script to remain dynamic, though, meaning it only does something when this event happens and sits idle in the meantime.
Right now I've had to go bootleg and do the following.
I created a function to grab the last 3 lines from the log and put each line into an array. Going off the example output above:
array[0] = target line
array[1] = next line
array[2] = next line
The logs remain in this fashion, as no other output is posted until the next song change; then it repeats, with the only thing changing being the artist and title information. Because my script forcibly looks at array[0], which is always the line I need, when it posts to Twitter on a song change it immediately sends duplicates. Luckily, I implemented error codes into the Twitter portion, so I can use sleep() to force my script to idle for 180 seconds (3 minutes), around the average song length. That's decent and it does post, but my tweets are no longer in real time because every song has a different length.
Here is a snippet from my script below...
// read_file() is the author's helper returning the last LINES_COUNT lines as an array
$lines = read_file(TEXT_FILE, LINES_COUNT);
$pattern = "/\[([^\]]*)\]/"; // capture text between square brackets
// Only the newest line, $lines[0], matters; the original looped but always re-read $lines[0]
if (preg_match_all($pattern, $lines[0], $matches)) {
    // $matches[1][1] is the second bracketed chunk: "Artist - Title"
    $fulltitle = explode("-", $matches[1][1], 2);
    $artist = trim($fulltitle[0]);
    $title  = trim($fulltitle[1]);
}
The script below is the direction I would like to go back to, and it does work when using the static version as in the example. As soon as I set the script to look at the last line of the log, it never detects a change, because it always picks up the next line after the target line. (I believe the regex may be responsible, but I'm not sure.)
$line = "2016-04-27 22:56:48 INFO [ADMINCGI sid=1] Title updated [Tessa Feat. Gucci Mane - Get You Right]";
echo $line;
$pattern = "/\[([^\]]*)\]/"; // capture text between square brackets
$needle  = " Title updated ";
if (strpos($line, $needle) !== false) {
    preg_match_all($pattern, $line, $matches);
    // $matches[1][1] is the second bracketed chunk: "Artist - Title"
    $fulltitle = explode("-", $matches[1][1], 2);
    $artist = trim($fulltitle[0]);
    $title  = trim($fulltitle[1]);
}
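One way to get back to that direction and stay dynamic is to remember the last "Title updated" line the script acted on, and only fire when a different one shows up. Below is a minimal sketch under that assumption; the log path and the last_title.txt state file are hypothetical names, not from the actual setup.
<?php
// Hypothetical paths; point these at the real log and a writable state file.
$logFile   = '/var/log/shoutcast/sc_serv.log';
$stateFile = __DIR__ . '/last_title.txt';

// Grab the last few lines and find the newest "Title updated" entry.
$lines  = array_slice(file($logFile, FILE_IGNORE_NEW_LINES), -10);
$target = null;
foreach (array_reverse($lines) as $line) {
    if (strpos($line, ' Title updated ') !== false) {
        $target = $line;
        break;
    }
}

// Act only if this entry differs from the one handled on the previous run.
if ($target !== null && $target !== @file_get_contents($stateFile)) {
    if (preg_match_all('/\[([^\]]*)\]/', $target, $matches)) {
        $parts  = array_map('trim', explode('-', $matches[1][1], 2));
        $artist = $parts[0];
        $title  = isset($parts[1]) ? $parts[1] : '';
        // ... post to MySQL and Twitter here ...
    }
    file_put_contents($stateFile, $target); // remember it to prevent duplicates
}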
I have a MySQL table which can contain up to 500,000 rows, and I am fetching them on my site without any LIMIT clause. When I do it without AJAX, it works normally, but with AJAX, again without a LIMIT, no data is returned. I checked the AJAX code and there is no mistake there. The thing is, when I set a limit, for example 45,000, it works perfectly; above that, the AJAX call returns nothing.
With the limit: (screenshot omitted)
Without the limit: (screenshot omitted)
Can this be an AJAX issue? I found nothing similar on the web. Or is it something else?
EDIT
Here is the SQL query:
SELECT ans.*, quest.inversion, t.wave_id, t.region_id, t.branch_id, quest.block, quest.saleschannelid, b.division, b.regionsid, quest.yes, quest.no FROM cms_vtb as ans
LEFT JOIN cms_vtb_question as quest ON ans.question_id=quest.id
LEFT JOIN cms_task as t ON t.id=ans.task_id
LEFT JOIN cms_wave as w ON w.id=t.wave_id
LEFT JOIN cms_branchemployees as b ON b.id=t.branchemployees_id
WHERE t.publish='1' AND t.concurent_id='' AND ans.answer<>'3' AND w.publish='1' AND quest.questhide<>1
ORDER BY t.concurent_id DESC
LIMIT 44115
and the JavaScript that makes the AJAX call:
var url = '&module=ajax_typespace1&<?=$base_url?>';
$.ajax({
    url: 'moduls_ajax.php?' + url,
    cache: false,
    dataType: 'html',
    success: function(data) {
        $("#result").html(data);
    }
});
Apparently it was a server error; adding ini_set('memory_limit', '2048M'); helped a lot.
The reason this happens has to do with how you format the data sent to the client. Not having seen the code of moduls_ajax.php, I can only suspect that you are assembling the whole query result into a variable, possibly in order to json_encode it properly.
But doing so may result in a huge memory allocation, whereas if you send the data piece by piece to the web server, you may need only a fraction of the memory.
The same applies to your working web page, where the same query is either output straight on or not being encoded: you'll find that when the row count grows to about two or three times the current value, that page will stop working too.
For example:
$result = array();
while ($tuple = $resultset->fetch()) {
    $result[] = $tuple; // the entire result set accumulates in memory here
}
print json_encode($result);
Instead, do this (of course, it's a little more involved than before):
// Since we know it is an array with numeric keys, the JSON
// will be of the format [ <item>, <item>, ..., <item> ]
$sep = '[';
while ($tuple = $resultset->fetch()) {
    print $sep . json_encode($tuple); // one row encoded at a time
    $sep = ',';
}
print $sep === '[' ? '[]' : ']'; // emit valid JSON even for an empty result
Pros and cons
This is about three times as expensive as a single json_encode call, and it can also yield slightly worse compression (the web browser receives the data in chunks of varying size and has more difficulty compressing them optimally; it's a matter of tenths of a percent, usually). On the other hand, in some setups the output starts arriving at the client much sooner, which can prevent browser timeouts.
The memory requirement, if the tuples are all more or less the same size, drops to around two or three N-ths of what it was: if you have one thousand rows and needed one gigabyte to process the query, three or four megabytes ought now to suffice. Of course, this also means that the more rows there are, the bigger the win; the fewer the rows, the less point there is in doing this.
More of the same
The same approach holds for other kinds of assembling (to HTML, CSV and so on).
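For instance, a CSV variant of the same idea could look like this (reusing the $resultset from the examples above):
<?php
// Stream CSV straight to the client instead of assembling a string first.
header('Content-Type: text/csv');
$out = fopen('php://output', 'w');
while ($tuple = $resultset->fetch()) {
    fputcsv($out, $tuple); // one row at a time; no large buffer needed
}
fclose($out);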
In some cases it may be helpful to dump the data into an external temporary file and send a Location header to have it loaded by the browser. Sometimes it is possible (if PHP is compiled as an Apache module on a Unix system) to output the file after having deleted it, so that it's not necessary to do garbage collection on the temporary files:
$fp = fopen($temporary_file, 'r');
unlink($temporary_file); // the file is deleted, but the open handle remains valid
fpassthru($fp); // on some platforms the browser is effectively "short-circuited" to the
                // file descriptor, so the PHP script may terminate while output
                // continues normally
die();
If I pass a local filename, the file is properly copied; however, if you leave the local filename empty, you are supposed to receive the content of the remote file as the return value.
Example code:
$stat = $sftp->get('xmlfile.cml','xmlfile.xml');
print "$stat";
(This works fine)
$xmlcontent = $sftp->get('cp1301080801_status.xml');
print "Content of file = $xmlcontent<br>";
(This prints what looks more like the stat of the file instead of the content. It starts with the date, which is the modified timestamp of the file, followed by some numbers and the name of the web server repeated about 10 times, each with a number after it that increases each time, like maybe a port number or byte offset.)
It would make things easier if I didn't have to fopen the local file after the transfer. Anyone have an idea what is going on here?
Can you post a copy of the logs? Here's an example of how to get them:
http://phpseclib.sourceforge.net/ssh/examples.html#logging
Note the define() and the $ssh->getLog() stuff.
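For reference, a minimal sketch of that logging setup with phpseclib 1.x (the host and credentials are placeholders):
<?php
include 'Net/SFTP.php';

// Must be defined before the connection is opened.
define('NET_SFTP_LOGGING', NET_SFTP_LOG_COMPLEX);

$sftp = new Net_SFTP('sftp.example.com');
if (!$sftp->login('username', 'password')) {
    exit('Login failed');
}
$xmlcontent = $sftp->get('cp1301080801_status.xml');
echo $sftp->getSFTPLog(); // dump the SFTP packet log for debugging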
As for the specific problem you're having... what does print "$stat" do? It should print "1".
Also, fwiw, you're opening two different files in your example. My best guess, atm, is that you think you're opening the same file and expecting the content to be the same, when in fact the files differ, and what you're getting from both of the $sftp->get()'s is, in fact, correct.
The logs will tell us for sure.
I'm learning PHP. Novice. For that purpose I decided to start with a flat-file comment system.
I'm using ajax to post to php that writes data to a flat-file database.
Similar to: 12.01.2011¦¦the name¦¦the comment¦¦md5email¦¦0
Where '0' is the starting count of the comment's 'likes' (thumbs-up).
Everything is working fine with ajax. Even the comment delete.
At page load, jQuery counts the comments (starting from 0) and assigns each comment's 'like' a numbered id.
That number is then posted via AJAX to PHP, giving the file line number to modify.
That system works great for identifying the line to delete.
And it deletes the right line!
Now, having the line (or string?) number, how do I:
Search the file for that line. (foreach...?!...)
Having found the line, split it into an array. (explode...?)
Increment the relevant array value by 1.
Limit maximum likes to 99.
(1 per user session).
Write the file, close, and so on.
- I just can't get started counting the 'like' clicks.
Please help.
Any ideas?
Thanks in advance!
$lines = file($filename, FILE_IGNORE_NEW_LINES);
$entry = explode('¦¦', $lines[$lineNumber]); // [date, name, comment, md5email, likes]
$entry[4] = min($entry[4] + 1, 99); // bump the like count, capped at 99
$lines[$lineNumber] = implode('¦¦', $entry);
file_put_contents($filename, implode("\n", $lines));
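And to cover the one-like-per-user-session requirement, a small sketch (the $_SESSION key name is illustrative):
<?php
session_start();
// Track which line numbers this visitor has already liked this session.
if (!isset($_SESSION['liked_lines'])) {
    $_SESSION['liked_lines'] = array();
}
if (in_array($lineNumber, $_SESSION['liked_lines'], true)) {
    exit('already liked'); // one like per comment per session
}
$_SESSION['liked_lines'][] = $lineNumber;
// ... then run the increment code above ...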