Python-CGI environment set-up? - php

I am doing a project in Natural Language Processing using nltk in python.
The block structure of project is as follows:
Interface (in php) ->
[NLP Engine] (in python) ->
API calls (in php) ->
Result (in php)
The input is supposed to go from the PHP interface to the Python engine via the GET method.
Background:
I have created a virtual host (URL = /linguistics/) using Easy-PHP Dev Server (location = D:\Computational_Linguistics). I have configured it to execute Python scripts, so that when I request linguistics/Test.py, the script runs.
Issue:
The basic CGI was successfully executed and I could see the output in Chrome. But as soon as I imported another module, it returned this error:
Server error!
The server encountered an internal error and was unable to complete your request.
Error message:
End of script output before headers: engine.py
If you think this is a server error, please contact the webmaster.
Error 500
linguistics
Apache/2.4.4 (Win32) PHP/5.5.0
When I do NOT import nltk (or any other non-standard package), it works.
I searched the web for a solution and learned that I have to set up some environment variables to make it work.
But I cannot figure out how.
My code:
#!C:/Python27/python.exe
import nltk
from nltk import *
import re
import cgi, cgitb
inpt=cgi.FieldStorage()
str_in = inpt.getvalue('query')
def is_noun (str):
    tags=nltk.pos_tag(nltk.word_tokenize(str))
    for i in tags:
        if i[1][1]=='N' or i[1][1]=='V': #Finding out the Nouns and the Verbs.
            print "<h5>%s is a noun.<h5>" %i[0]
is_noun(str_in)
print "Content-type:text/html\r\n\r\n"
print "<html>"
print "<head>"
print "<title>Hello - Second CGI Program</title>"
print "</head>"
print "<body>"
is_noun(str_in)
print "</body>"
print "</html>"

Since I received no answers (not blaming anyone!), I read more documentation. As described in my problem statement above, only the NLP engine is written in Python, and the problem exists in the CGI environment only.
Hence my solution:
I modified engine.py to receive its input as command-line arguments and process it. It writes the processed data (in a fixed format) back to its output stream (stdout).
I used the exec() function in PHP to call it.
The project is on GitHub, so if anyone wants to have a look at it, they are most welcome!
PS: I still don't know the reason for that error. I am quite sure that all environment paths were correct. So I'd call this answer a work-around rather than a solution.
PPS: I am answering my own question so that anybody with the same problem might consider this work-around.
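For anyone trying the same work-around, here is a minimal sketch of the idea. It is written for Python 3 and uses a stand-in tagger so that it runs without nltk. The function names, the word/TAG output format and the toy tagging rule are assumptions made for illustration; this is not the actual engine.py from the project.

```python
import sys


def toy_tag(words):
    # Hypothetical stand-in for nltk.pos_tag: capitalised words are
    # tagged NN, everything else XX. Swap in the real tagger here.
    return [(w, "NN" if w[:1].isupper() else "XX") for w in words]


def process(query):
    # Return one "word/TAG" line per token so the caller (for example
    # PHP's exec()) can split the output without parsing HTML.
    return "\n".join("%s/%s" % pair for pair in toy_tag(query.split()))


if __name__ == "__main__" and len(sys.argv) > 1:
    # Everything after the script name is treated as the input text.
    print(process(" ".join(sys.argv[1:])))
```

On the PHP side, exec() can collect these lines into its output array. Remember to escape the user input (escapeshellarg) before putting it on the command line.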

The problem is that you call is_noun twice, and the first call prints output before any headers have been sent. Hence the error.
Another problem is that str_in is a str, but I think nltk.pos_tag expects unicode. So you need to decode the str_in value. You should do this anyway, but you will only notice the problem when the input contains a character outside plain ASCII:
str_in = unicode(inpt.getfirst('query', ''), 'utf-8')
and then, when you print unicode, you need to encode it back:
print "<h5>%s is a noun.<h5>" % i[0].encode('utf-8')
But in its current form the output might look garbled in the browser, because you need to tell the browser that the charset is UTF-8. That is, you need to change the Content-Type header:
print "Content-Type: text/html; charset=utf-8"
print
P.S. Hopefully this is all for local use only and not reachable from the internet, because for that it would need to be much more complicated.
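Putting these points together, here is a minimal sketch in Python 3 syntax. It builds the whole body first and puts the header block in front of it, so nothing can reach the server before the headers. The tagger argument and fake_tagger are hypothetical stand-ins for nltk, used only so the example runs anywhere.

```python
import html


def build_response(query, tagger):
    # Build the body first; nothing has been written to stdout yet, so the
    # header block below is guaranteed to be the first thing the server sees.
    items = []
    for word, tag in tagger(query.split()):
        if tag.startswith("N"):
            items.append("<h5>%s is a noun.</h5>" % html.escape(word))
    body = "<html><body>%s</body></html>" % "".join(items)
    # Header lines end with CRLF; an empty line separates headers from body.
    return "Content-Type: text/html; charset=utf-8\r\n\r\n" + body


def fake_tagger(words):
    # Assumption for the example only: every capitalised word is a noun.
    return [(w, "NN" if w[:1].isupper() else "XX") for w in words]


if __name__ == "__main__":
    print(build_response("Paris is nice", fake_tagger))
```

Escaping the word with html.escape also keeps user input from being interpreted as markup.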

Related

Why do I get random text appended to my STOMP protocol messages?

I'm having an issue where I'm able to write many json encoded arrays to HornetQ without any problem, but when I try to read the frames back, every n'th message has random text appended to it (usually MESSAGE or RECEIPT).
Example:
I send the following to HornetQ:
{"data":9933753,"more_data":"Some Text"}
and I get back the following when I read the frame body:
{"data":9933753,"more_data":"Some Text"}
MESSAGE subscription:subscription/jms.queue.testing.qa.myqueue message-id:1310
destination:jms.queue.testing.qa.myqueue expires:0 redelivered:false priority:4 timestamp:1382637077839
I read the STOMP protocol definition, but I still don't see how to get back just the JSON string I sent, without the extra text: the MESSAGE frame appears inside the body itself, so I cannot decode the body without hacky string manipulation.
I have the following setup:
HornetQ (latest)
PHP 5.4
STOMP library: http://stomp.fusesource.org/documentation/php/book.html
Any suggestions are appreciated!
It's probably a bug. As I remember, there was a fix in that area at some point. If you still see it with the latest version, please provide a test case to the developers and we will gladly fix it. I'm speaking here as one of the developers.
But first check that you are on the latest version (2.3.0+ or 2.4.0 beta) or a recent EAP version.
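For reference while building a test case: in STOMP a frame is a command line, then header lines, then a blank line, then the body, terminated by a NUL byte. A reader should therefore stop the body at that byte (or after content-length bytes when that header is present). Below is a minimal Python sketch of splitting a receive buffer into frames. It assumes LF line endings and no content-length header, and the function name is illustrative rather than part of any STOMP library.

```python
def split_frames(buffer):
    # Each frame ends at a NUL byte; whatever follows the last NUL is an
    # incomplete frame that must wait for more data from the socket.
    *complete, rest = buffer.split(b"\x00")
    frames = []
    for raw in complete:
        raw = raw.lstrip(b"\r\n")  # servers may send EOLs between frames
        if not raw:
            continue
        head, _, body = raw.partition(b"\n\n")
        command, *header_lines = head.decode("utf-8").split("\n")
        headers = dict(line.split(":", 1) for line in header_lines)
        frames.append((command, headers, body))
    return frames, rest
```

If a second MESSAGE shows up inside a body, it means the reader went past the NUL terminator of the first frame, whichever side turns out to be at fault.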

Get windows title using php doesn't work on browser call

My problem is that I need to fetch foobar2000's window title, because it includes information about the file being played. So I created an executable using the Win32 API (GetWindowText(), EnumWindows()), and it works well.
TCHAR SearchText[MAX_LOADSTRING] = _T("foobar2000");
BOOL CALLBACK WorkerProc(HWND hwnd, LPARAM lParam)
{
    TCHAR buffer[MAX_TITLESTRING];
    GetWindowText(hwnd, buffer, MAX_TITLESTRING);
    if(_tcsstr(buffer, SearchText))
    {
        // find it output something
    }
    return TRUE;
}
EnumWindows(WorkerProc, NULL);
The output looks like "album artist title .... [foobar2000 v1.1.5]"
I created a PHP file, test.php, that uses exec() to run it.
exec("foobar.exe");
Then I run it from the console (cmd):
php test.php
This works too, with the same output as before.
Now, when I call the same PHP file (test.php) from a browser (Firefox), strange things happen.
The output is only foobar2000 v1.1.5; the other information is gone ...
I thought it might be a problem with exec(), such as priority or some limitation, so I created a COM object in C#, registered it, and rewrote the PHP code:
$mydll = new COM("FOOBAR_COMObject.FOOBAR_Class");
echo $mydll->GetFooBarTitle();
Still the same result: the command line works, but the browser fails.
My questions are:
Why is the output different between the command line and the browser? I can't figure it out.
How can I get the correct output via the browser?
Or is there an easier way to fetch foobar2000's title?
Does anyone have experience with this problem?
== 2012/11/28 edited ==
Following Enno's suggestion, I modified the http_control plug-in to add the file name; the original JSON info only has the "track title".
The modification is as follows:
state.cpp, line 380: add one line
+pb_helper1 = pfc::string_filename(pb_item_ptr->get_path());
pb_helper1x = xml_friendly_string(pb_helper1);
# 1: When Firefox requests the PHP file, it is executed in the context of the user that runs the PHP container (Apache). This is quite different from the command-line call, which is executed in your own user context.
# 2 and 3: There seems to be more than one way to get the title: use the foobar SDK and create a module that simply reads the current title via the API, then write the result to a static HTML document inside your HTTP root folder; OR use the HTTP client inside the SDK, in which case you do not need a webserver; or, even better, use an already implemented module, for instance foo_upnp or foo-httpcontrol.
Good luck!
If your webserver runs as a service on Windows, you need to enable "allow desktop interaction" for the service. When requested via a browser, your PHP script runs as a child of the webserver process.

How to read GTFS protocol buffer in PHP?

I have a GTFS protocol buffer message (VehiclePosition.pb), and the corresponding protocol format (gtfs-realtime.proto), I would like to read the message in PHP alone (is that even possible?).
I looked at Google's python tutorial https://developers.google.com/protocol-buffers/docs/pythontutorial and encoding documentation https://developers.google.com/protocol-buffers/docs/encoding and https://github.com/maxious/ACTBus-ui/tree/master/lib/Protobuf-PHP, but I am having a really hard time conceptualizing what is going on. I think I understand that gtfs-realtime.php is a compiled instruction set of the encoding defined in gtfs-realtime.proto (please correct me if I am wrong), but I have no clue how to get it to decode VehiclePosition.pb. Also, what are the dependencies of gtfs-realtime.php (or the python equivalent for that matter)? Is there anything else I have to compile myself or anything that is not a simple php script if all I want to do is read VehiclePosition.pb?
Thanks.
edmonscommerce and Julian are on the right track.
However, I've gone down the same path and I've found that the PHP implementation of Protocol Buffers is cumbersome (especially in the case of NYCT's MTA feed).
Alternative Method (Command Line + JSON):
If you're comfortable with command line tools and JSON, I wrote a standalone tool that converts GTFS-realtime into simple JSON: https://github.com/harrytruong/gtfs_realtime_json
Just download (no install), and run: gtfs_realtime_json <feed_url>
Here's a sample JSON output.
To use this in PHP, just put gtfs_realtime_json in the same directory as your scripts, and run the following:
<?php
$json = exec('./gtfs_realtime_json "http://developer.mbta.com/lib/GTRTFS/Alerts/VehiclePositions.pb"');
$feed = json_decode($json, TRUE);
var_dump($feed);
You can use the official tool: https://developers.google.com/transit/gtfs-realtime/code-samples#php
It was released very recently. I've been using it for a few days and it works like a charm.
I would assume something along the lines of this snippet:
<?php
require_once 'DrSlump\Protobuf.php';
use DrSlump\Protobuf;
$data = file_get_contents('data.pb');
$person = new Tutorial\Person($data);
echo $person->getName();
as taken from the man page: http://drslump.github.io/Protobuf-PHP/protobuf-php.3.html
Before that step, I think you need to generate your PHP classes using the CLI tool as described here: http://drslump.github.io/Protobuf-PHP/protoc-gen-php.1.html
so something along the lines of:
protoc-gen-php gtfs-realtime.proto
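To make the encoding documentation a little more concrete: a .pb file is a sequence of fields, each introduced by a key that packs the field number and a wire type into one varint. The generated classes do this decoding for you. Below is a minimal Python sketch of just that first step; the helper names are illustrative and not part of any protobuf library.

```python
def read_varint(data, pos=0):
    # Base-128 varint: 7 payload bits per byte, least significant group
    # first; the high bit of each byte says whether another byte follows.
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def read_key(data, pos=0):
    # A field key is a varint holding (field_number << 3) | wire_type.
    key, pos = read_varint(data, pos)
    return key >> 3, key & 0x07, pos
```

Applied to the bytes 08 96 01, the example used in the encoding documentation, read_key gives field 1 with wire type 0, and read_varint on the remaining bytes gives 150.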
Sorry Harry Truong, I tried your executable, but it always returns NULL.
What am I doing wrong?
Edit: The problem was that I have no permission to execute files on my server. Thanks for your executable.

Ajax issues, Invalid JSON

I am building a simple Ajax application (via jQuery). I have a strange issue. I found where the problem is, but I don't know how to solve it.
This is the simple server-side PHP code:
<?php
require('some.php');
$return['pageContent'] = 'test';
echo(json_encode($return));
?>
On the client side, the error "Invalid JSON" is thrown.
I have discovered that if I delete the require call, everything works fine.
Just for information, "some.php" is an empty PHP file. There is no error when I open the PHP files directly.
So, conclusion: I can't use require or include if I want to use Ajax?
Use Firebug to see what you're actually getting back during the AJAX call. My guess is that there's a PHP error somewhere, so you're getting more than just JSON back from the call (Firebug will show you that). As for your conclusion: using include/require by itself has absolutely no effect on the AJAX call (assuming there are no errors).
Try changing:
<?php
require('some.php');
$return['pageContent'] = 'test';
echo(json_encode($return));
?>
To:
<?php
$return = array(
    'pageContent' => 'test'
);
echo json_encode($return);
?>
The problem might have to do with $return not being declared as an array prior to use.
Edit: Alright, so that might not be the problem at all. But! What might be happening is you might be echoing out something in the response. For example, if you have an error echoing out prior to the JSON, you'd be unable to parse it on the client.
If "some.php" is an empty PHP file, why do you require it at all?!
The require function throws a fatal error if it can't load the file. Try using include instead and see what happens; if that works, then you probably have a problem with require 'some.php';
A require call won't have any effect. You need to check your returned output in Firebug. If using Chrome there is a plugin called Simple REST Client. https://chrome.google.com/extensions/detail/fhjcajmcbmldlhcimfajhfbgofnpcjmb through which you can quickly query for stuff.
Also, it's always good to send back proper HTTP headers with your response showing the response type.
It's most likely the BOM, as has been discussed above. I had the same problem multiple times and used Fiddler to check the file in hex; I noticed an extra 3 bytes that didn't exist in a prior backup or in new files I created. Somehow my original files were getting modified by Textpad (both on Windows), although when I created them in Notepad++ I was fine.
So make sure that your encoding and code pages are set up properly when you create, edit, and save your files, and keep that consistent across OSes if, say, you develop on Windows and publish to a Linux environment at your hosting provider.
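A file that contains nothing but a BOM looks empty in an editor, yet including it still sends those three bytes ahead of the JSON. Here is a short Python sketch for checking a response or source file for them; the function names are illustrative.

```python
import codecs
import json


def strip_utf8_bom(raw):
    # The UTF-8 byte order mark is the three bytes EF BB BF.
    if raw.startswith(codecs.BOM_UTF8):
        return raw[len(codecs.BOM_UTF8):]
    return raw


def parse_response(raw):
    # Decode strictly as UTF-8 after removing a leading BOM, then parse.
    return json.loads(strip_utf8_bom(raw).decode("utf-8"))
```

Stripping the bytes on the client is only a diagnostic; the real fix is to save the included file without a BOM.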

How to get Apache's running request at a specific moment?

I need to find a way to get all the requests Apache is handling at a given moment. I need to list the vhost, CPU, request IP address and some other information.
This information will be consumed by a PHP script.
I have mod_status installed and it has all the information I need. So I tried to use file_get_contents to get the report, generating a request from the server (http://localhost/server-status). It worked perfectly. Then I tried to parse the report, converting it to XML using simplexml_load_string. The problem is that the HTML outputted by mod_status is not well formed.
Here is the HTML of the table I need to parse:
<table border="0"><tr><th>Srv</th><th>PID</th><th>Acc</th><th>M</th><th>CPU
</th><th>SS</th><th>Req</th><th>Conn</th><th>Child</th><th>Slot</th><th>Client</th><th>VHost</th><th>Request</th></tr>
<tr><td><b>0-1</b></td><td>-</td><td>0/0/70</td><td>.
</td><td>0.00</td><td>107</td><td>0</td><td>0.0</td><td>0.00</td><td>0.34
</td><td>127.0.0.1</td><td nowrap>zsce</td><td nowrap>OPTIONS * HTTP/1.0</td></tr>
<tr><td><b>1-1</b></td><td>-</td><td>0/0/55</td><td>.
</td><td>0.04</td><td>108</td><td>0</td><td>0.0</td><td>0.00</td><td>0.70
</td><td>127.0.0.1</td><td nowrap>zsce</td><td nowrap>OPTIONS * HTTP/1.0</td></tr>
</table>
I'm sure someone has tried to do something like this before.
1) Is there another way to access the information I need?
2) Has anybody tried other tools / modules?
Thanks in advance.
I can't see the problem with the HTML. What's wrong with it?
Does PHP not have a liberal HTML parser; something like Python's BeautifulSoup or Ruby's Nokogiri?
Also, remember that mod_status has 'auto' mode for producing machine-readable output.
http://www.apache.org/server-status?auto
http://httpd.apache.org/docs/2.2/mod/mod_status.html#machinereadable
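The ?auto report is plain text with one "Key: value" pair per line, which makes it easy to consume. Below is a minimal Python sketch of parsing it; the function name is made up for the example. As far as I know, this report carries the summary counters and the scoreboard rather than the per-request Client/VHost columns, so check that it has what you need.

```python
def parse_status_auto(text):
    # Each line of the ?auto report has the form "Key: value".
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if sep:
            stats[key] = value
    return stats
```

The values come back as strings; convert the ones you need, for example int(stats["BusyWorkers"]).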
I just found that if I remove "nowrap" from the HTML before parsing it, it works.
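On the point about a liberal parser: here is a minimal Python sketch, using only the standard library's html.parser, that reads a table like the one above without tripping over valueless attributes such as nowrap. The class and function names are made up for the example. In PHP, DOMDocument::loadHTML plays a similar role.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    # Collects the text of every <th>/<td> cell, one list per <tr>.
    def __init__(self):
        super().__init__()
        self.rows = []
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th"):
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None and self.rows:
            self.rows[-1].append("".join(self._cell).strip())
            self._cell = None


def parse_status_table(html_text):
    parser = TableRows()
    parser.feed(html_text)
    parser.close()
    return parser.rows
```

The first row returned holds the column headers, so the remaining rows can be zipped with it to get one dictionary per request.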
