length issues with solr select POST with cURL in PHP - php

I have a solr query that has been working perfectly:
$ch = curl_init();
$ch_searchURL = "$base_url/$collection/select?q=$s&wt=json&indent=true";
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_URL, $ch_searchURL);
$rawData = curl_exec($ch);
$json = json_decode($rawData,true);
Initially, my $s variable was literally one thing: e.g. ?q=name:brian, but my user base wanted the ability to search multiple things at once, so I started to build that in:
?q=name:("brian"+OR+"mike"+OR+"james"+OR+"emma"+OR+"luke")
It then got to the point where they wanted to search 5,000 things at once. Building the solr GET query this way then failed, because the literal URL was longer than the max allowed length of ~2,000 characters. I thought using a POST might work, which I accomplished by adding the following lines:
$ch_searchURL = "$base_url/$collection/select";
$multiline_q = "q=$s&wt=json&indent=true";
curl_setopt($ch, CURLOPT_POSTFIELDS, $multiline_q);
This seemed to allow me to search for around 500 items at a time - (which would still, in GET world, cause a URL length of around 4,000) - so better than the GET method, but once I go past that number of items, the solr query fails again.
Because I'm POSTing (maybe?), I don't get any error response from solr, so I don't know what's causing the query to fail, and I can't manually test the query in the browser because it's ~40,000 characters long and won't paste. If I do var_dump($rawData);, I see this:
string(238) " 05 " // or 04, or 08
I've used solr quite a bit with PHP & cURL, but always with the GET method. This is my first foray into using POST. Am I doing something wrong here? Am I just exceeding the actual amount of q options that I can ask solr to retrieve for me, regardless of the method?
Any light that anyone could shed on this would be helpful...

There is no limit on the Solr side - we regularly use Solr in a similar way.
You need to look at the settings for your servlet container (Tomcat, Jetty etc.) and increase the maximum POST size. Look up maxPostSize if you are using Tomcat and maxFormContentSize if you are using Jetty.
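Before changing the container settings, it can help to know how large the request body really is. The following is only a sketch: buildSolrPostBody is a hypothetical helper (not part of Solr or cURL), and the field name `name` is taken from the question. It builds the form-encoded body with http_build_query and reports its size in bytes, which is the number to compare against maxPostSize or maxFormContentSize.

```php
<?php
// Hypothetical helper: builds the form-encoded POST body for a Solr select
// that ORs together many quoted terms on one field.
function buildSolrPostBody(string $field, array $terms): string
{
    // Quote each term and escape embedded quotes/backslashes.
    $quoted = array_map(function ($term) {
        return '"' . addcslashes($term, '"\\') . '"';
    }, $terms);
    $q = $field . ':(' . implode(' OR ', $quoted) . ')';

    // http_build_query URL-encodes the values (spaces become "+").
    return http_build_query(['q' => $q, 'wt' => 'json', 'indent' => 'true']);
}

$body = buildSolrPostBody('name', ['brian', 'mike', 'james']);

// The byte length is what the servlet container's POST limit applies to.
echo strlen($body), " bytes\n";
```

The resulting string can be passed to CURLOPT_POSTFIELDS as-is.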

Related

PostgreSQL Base64 Image decode issue

I am having an issue converting an image stored as base64 in a PostgreSQL database into an image to display on a website. The data type is bytea and I need to get the data via cURL.
I am working with an API to connect to a client's stock system which returns XML data.
I know storing images this way in a DB is not a great idea but that's how the client's system works and it can't be changed as it is a part of an enterprise solution provided by a 3rd Party.
I'm using the following to query the DB for the PICTURE field from the PICTURE table where the PART = 01000015
$ch = curl_init();
$server = 'xxxxxx';
$select = 'PICTURE';
$from = 'picture';
$where = 'part';
$answer = '01000015';
$myquery = "SELECT+".$select."+FROM+".$from.'+WHERE+'.$where."+=+'".$answer."'";
//Define curl options in an array
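$options = array(CURLOPT_URL => "http://xx.xxx.xx.xx/GetSql?datasource=$server&query=$myquery+limit+1",
CURLOPT_PORT => 82,
//Request headers belong in CURLOPT_HTTPHEADER; CURLOPT_HEADER is a boolean that only toggles response headers in the output
CURLOPT_HTTPHEADER => array("Content-Type: application/xml"),
CURLOPT_RETURNTRANSFER => TRUE
);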
//Set options against curl object
curl_setopt_array($ch, $options);
//Assign execution of curl object to a variable
$data = curl_exec($ch);
//Close curl object
curl_close($ch);
//Pass results to the SimpleXMLElement function
$xml = new SimpleXMLElement($data);
//Return String
echo $xml->row->picture;
The response I get from this is: System.Byte[]
Thus if I use base64_decode() in PHP I am obviously just decoding the string "System.Byte[]".
I am guessing that I need to use the DECODE() function in PostgreSQL to convert the data in the query? However, I've tried loads of combinations but I'm stuck. I've had a few downvotes for questions and I'm not too sure why so if this is a bad question I'm sorry, I just really need some help with this one.
(NB: I've replaced the IP and $server with xxxxx for security.)
To explain further:
The client has a POS system which is based on ASP.NET and saves the data as XML files on the remote server. I have access to this data via an API which includes a SQL query function using HTTP/cURL defined as follows:
http://remoteserver:82/pos.asmx.GetSql?datasource=DATASOURCE&query=MYQUERY
So to get the field that contains the picture data I am currently using the above code.
The query is in the cURL URL, i.e. http://remoteserver:82/pos.asmx.GetSql?datasource=12345&query=SELECT+*+FROM+picture+WHERE+part+=+'01000015'
However, this returns System.Byte[] instead of encoded data which I can then decode in PHP.
Additional info:
PostgreSQL version: PostgreSQL 9.1.3 on i686-pc-linux-gnu, compiled by gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-51), 32-bit
Table Schema:
Available here: http://i.stack.imgur.com/sc8Gw.png
You should preferably have the server storing the data in PostgreSQL as a bytea field, then encoding to base64 to send to the client, but it sounds like you don't control the server.
The string System.Byte[] suggests it's an app using .NET, like ASP.NET or similar, and it's not correctly handling a bytea array. Instead of formatting it as base64 for output it's embedding the type name in the output.
You can't fix that on the client side, because the server is sending the wrong data.
You'll need to show the server-side tables and queries.
Update after query amended:
You're storing a bytea and returning it directly. The client doesn't seem to understand byte arrays and tries to output it naïvely, probably something like casting it to a string. Since the documentation says it expects "base64" you should probably provide that, instead of a byte array.
PostgreSQL has a handy function to base64-encode bytea data: encode.
Try:
SELECT
account, company, date_amended,
depot, keyfield, part,
encode(picture, 'base64') AS picture,
picture_size, source
FROM picture
WHERE part = '01000015'
The formatting isn't significant; it just makes the query easier to read here.
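Once the server returns base64 text, the PHP side could decode it roughly like this. This is a sketch under assumptions: the sample XML only mimics the shape implied by $xml->row->picture in the question, and the PNG MIME type is a guess about the stored images. PostgreSQL's encode(..., 'base64') breaks its output into lines, so whitespace is stripped before decoding.

```php
<?php
// Stand-in for the API response (assumed shape, based on $xml->row->picture).
$fakeImageBytes = "\x89PNG\r\n\x1a\n";
$sampleXml = '<rows><row><picture>'
    . chunk_split(base64_encode($fakeImageBytes), 4, "\n")
    . '</picture></row></rows>';

$xml = new SimpleXMLElement($sampleXml);

// Remove the line breaks that the base64 text may contain.
$base64 = preg_replace('/\s+/', '', (string) $xml->row->picture);

// Strict mode returns false instead of silently skipping invalid characters.
$binary = base64_decode($base64, true);
if ($binary === false) {
    throw new RuntimeException('Picture field did not contain valid base64');
}

// Assumption: the image is a PNG; adjust the MIME type to the real format.
$dataUri = 'data:image/png;base64,' . base64_encode($binary);
echo '<img src="' . $dataUri . '" alt="part picture">', "\n";
```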

How do I post a long string into a PHP page?

This question has two parts:
Part I - restriction?
I'm able to store data to my DB with this:
www.mysite.com/myscript.php?testdata=abc123
This works for a short string (e.g. 'abc123') and the page echoes what was written to the DB. However, if the [testdata=] string is longer than 512 chars and I check the database, it shows that a row has been added but it's blank, and the echo statement in my script doesn't display the input string.
N.B. I'm on a shared server and have emailed my host to see if it's a restriction.
Part II - best practice?
If I can get past the above hurdle, I want to use a string that's ~15k chars long, created in a desktop app that concatenates the [testdata=] string from various parameters. What's the best way to send a long string in a PHP POST?
Thanks in advance for your help; I'm not too savvy with PHP.
Edit: Table config:
Edit2: Row anomaly with long string > 512 chars:
Edit3: here's my PHP script, if it helps:
<?php
include("connect.php");
$data = $_GET['testdata'];
$result = mysql_query("INSERT INTO test (testdata) VALUES ('$data')");
if ($result) // Check result
{
echo $data;
}
else echo "Error ".mysql_error(); // mysql_error() matches the mysql_* API used above; $mysqli is never defined here
mysql_close(); ?>
POST is definitely the method you want to use, and your best bet with that will be with cURL. Something like this should work:
$ch = curl_init();
curl_setopt( $ch, CURLOPT_URL, "http://www.mysite.com/myscript.php" );
curl_setopt( $ch, CURLOPT_POST, TRUE );
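curl_setopt( $ch, CURLOPT_POSTFIELDS, $my_really_long_string );
curl_setopt( $ch, CURLOPT_RETURNTRANSFER, TRUE ); // return the response instead of printing it
$data = curl_exec( $ch );
curl_close( $ch );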
You'll need to modify the above to include additional cURL options as per your environment, but something like this is what you'd be looking for.
You'll want to make sure that your DB field is long enough to hold the really long string as well.
Answer 1: Yes, the length of a URL is restricted. See:
What is the maximum possible length of a query string?
Answer 2: You can send your string as an ordinary POST variable and read it from $_POST. Just check the maximum input size and execution limits in php.ini (post_max_size, for example).
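To check whether a ~15k payload fits under the php.ini limit, something like the following could be used. It is a sketch: iniSizeToBytes is a hypothetical helper written for this example, and it only handles the K, M and G shorthand suffixes that php.ini size values use.

```php
<?php
// Hypothetical helper: converts a php.ini size value such as "8M" to bytes.
function iniSizeToBytes(string $value): int
{
    $value = trim($value);
    if ($value === '') {
        return 0;
    }
    $unit = strtolower($value[strlen($value) - 1]);
    $number = (int) $value; // an explicit cast keeps only the leading digits

    switch ($unit) {
        case 'g':
            return $number * 1024 * 1024 * 1024;
        case 'm':
            return $number * 1024 * 1024;
        case 'k':
            return $number * 1024;
        default:
            return $number;
    }
}

// Size of the form-encoded body the desktop app would send.
$payload = http_build_query(['testdata' => str_repeat('a', 15000)]);
$limit = iniSizeToBytes((string) ini_get('post_max_size'));

echo 'payload: ', strlen($payload), ' bytes, post_max_size: ', $limit, " bytes\n";
```

On the receiving page the value would then be read from $_POST['testdata'] instead of $_GET['testdata'].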

php preg_match why on earth doesnt it work?

This is such a bang-head-on-wall situation. This pattern works perfectly in JavaScript, and I have no idea what to do.
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://yugioh.wikia.com/wiki/List_of_Yu-Gi-Oh!_BAM_cards');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$chHtml = curl_exec($ch);
curl_close($ch);
$patt = '/<table class="wikitable sortable card-list">[\s\S]*?<\/table/im'; // this is the problem line
preg_match($patt, $chHtml, $matches);
If I make it greedy
[\s\S]*
it works fine, but it then matches all the way to the last </table.
There is nothing wrong with the pattern, the problem is that you need a larger backtrack limit than the default.
Explaining:
With regex problems like this, always check for errors using preg_last_error().
If you call it after running the pattern against the page you posted, you will see that you are getting a PREG_BACKTRACK_LIMIT_ERROR. This is a resource problem, so smaller texts do not raise the error.
Solution:
To overcome this limit you can raise it with the following in the start of your script:
ini_set ('pcre.backtrack_limit', 10000000);
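A small wrapper makes the error check hard to forget. This is a sketch only: firstMatchOrError is a hypothetical helper, and the short HTML string stands in for the downloaded page.

```php
<?php
// Hypothetical helper: returns the first match, or the PCRE error code if
// preg_match() itself failed (for example PREG_BACKTRACK_LIMIT_ERROR).
function firstMatchOrError(string $pattern, string $subject): array
{
    $ok = preg_match($pattern, $subject, $matches);
    if ($ok === false) {
        // preg_match() returns false on failure; the code says why.
        return ['match' => null, 'error' => preg_last_error()];
    }
    return ['match' => $ok === 1 ? $matches[0] : null, 'error' => PREG_NO_ERROR];
}

$html = '<p>intro</p><table class="card-list"><tr><td>x</td></tr></table><p>end</p>';
$result = firstMatchOrError('/<table class="card-list">[\s\S]*?<\/table/i', $html);

if ($result['error'] === PREG_BACKTRACK_LIMIT_ERROR) {
    echo "Backtrack limit hit; raise pcre.backtrack_limit\n";
} elseif ($result['error'] !== PREG_NO_ERROR) {
    echo 'PCRE error code: ', $result['error'], "\n";
} else {
    echo $result['match'], "\n";
}
```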

cURL size limit of get

I want to know if there is any way to get only a particular amount of data through cURL?
Option 1:
curl_setopt ($curl_handle, CURLOPT_HTTPHEADER, array("Range: bytes=0-1000"));
but it's not supported by all servers.
Option 2:
The write function from "Having trouble limiting download size of PHP's cURL function", but that function gives me the error Failed writing body (0 != 11350). From what I've read, many say it's a bug.
Following that write function, I tried calling curl_close($handle) instead of returning 0, but this throws the error Attempt to close cURL handle from a callback.
Now the only way I can think of is parsing the headers for the content length, but that would mean 2 requests: first getting the headers with CURLOPT_NOBODY, then getting the full content?
"Option 2: Having trouble limiting download size of PHP's cURL function but this function is giving me error Failed writing body (0 != 11350) and reading for which I found that many say its a bug."
It's not clear what you are doing there exactly. If you return 0 then cURL will signal an error, sure, but you will have read all the data you need. Just ignore the error.
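A write callback along these lines would stop the transfer once enough data has arrived. Treat it as a sketch: makeLimitedWriter is a hypothetical helper, and the demo feeds it chunks by hand instead of making a real request.

```php
<?php
// Hypothetical helper: returns a callback usable with CURLOPT_WRITEFUNCTION
// that keeps at most $limit bytes in $buffer.
function makeLimitedWriter(int $limit, string &$buffer): callable
{
    return function ($ch, string $chunk) use ($limit, &$buffer): int {
        $remaining = $limit - strlen($buffer);
        $buffer .= substr($chunk, 0, max(0, $remaining));

        // Returning fewer bytes than were received makes cURL abort the
        // transfer with a write error, which can simply be ignored here.
        return strlen($buffer) >= $limit ? 0 : strlen($chunk);
    };
}

// Demo without a network request: feed two chunks by hand.
$body = '';
$writer = makeLimitedWriter(10, $body);
$first = $writer(null, 'abcdef');    // under the limit, transfer would continue
$second = $writer(null, 'ghijklmn'); // limit reached, transfer would abort

echo $body, "\n";
```

With a real handle the callback would be registered via curl_setopt($ch, CURLOPT_WRITEFUNCTION, $writer), and the false return value of curl_exec() after the abort is expected.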
Another option, which you don't mention having tried, is to use fopen with the http:// wrapper. For example:
$h = fopen('http://example.com/file.php', 'r');
$first1000Bytes = fread($h, 1000);
fclose($h);
It is possible to use fopen and fgets to read a line at a time until you believe you've read enough lines, or to read a character at a time using fgetc.
Not sure if this is exactly what you're looking for, but it should limit the amount of data fetched from the remote source.
This seems to solve your problem:
mb_strlen($string, '8bit');

How to parse dict output in a user friendly way in PHP?

I am trying to implement a dictionary-type service.
I send a request with php using cURL to dict.org with the dict protocol.
This is my code (which on its own works and may be helpful for future readers):
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "dict://dict.org/define:(hello):english:exact");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$definition = curl_exec($ch);
curl_close($ch);
echo $definition;
The server returns the definition, as expected, along with several headers (that I do not need). The response looks something like this:
220 miranda.org dictd 1.9.15/rf on Linux 2.6.26-2-686 <auth.mime> <29631663.31530.1250750274#miranda.org>
250 ok
150 3 definitions retrieved
151 "Hello" gcide "The Collaborative International Dictionary of English v.0.48"
Hello \Hel*lo"\, interj. & n.
An exclamation used as a greeting, to call attention, as an
exclamation of surprise, or to encourage one. This variant of
{Halloo} and {Holloo} has become the dominant form. In the
United States, it is the most common greeting used in
answering a telephone.
[1913 Webster +PJC]
(... some content removed)
.
250 ok [d/m/c = 3/0/162; 0.000r 0.000u 0.000s]
221 bye [d/m/c = 0/0/0; 0.000r 0.000u 0.000s]
I was wondering if:
a) Is there a way to specify to curl (or an option in the dict protocol) to not return all that extra information (i.e. 250 ok [d/m/c = 3/0/162; 0.000r...])
b) You probably noticed that the dict response returns information that is not displayed in the most user friendly way. I was wondering if anybody knew of any existing php library that will allow me to display this in a nicer way. Otherwise I'd have to code my own.
c) If this is not the way most dictionary websites retrieve their definitions, how do they do it? In my understanding the most comprehensive dictionary database is the one at dict.org (which supports the dict protocol and is where I am sending my cURL request).
Thank you!
Before I start, let me state that I don't know the specifics of the dict protocol.
I doubt that you'll be able to create a request that only delivers the text. The information you wish to discard looks like status information and is therefore useful.
The way I'd handle this is as follows:
Read the curl response data into an array so that each line is a separate entry in the array. You could use explode() and split at the newline character (\n) to do this.
Iterate over the array, e.g. foreach ($response as $responseLine) {}
Perform a regex (or some other form of pattern matching) on $responseLine to find the definition. It looks like the lines of actual text are the only ones that don't start with a number.
You may want to check what character set the dict protocol uses. I haven't mentioned any error handling, but that should be straightforward.
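Those steps could be sketched as follows. extractDictText is a hypothetical helper, the sample response is shortened, and the filter assumes that status lines start with a three-digit code and that a lone "." ends a definition, which is what the response above looks like. A definition line that itself starts with three digits and a space would be dropped by this simple filter.

```php
<?php
// Hypothetical helper: keeps only the definition text from a dict response.
function extractDictText(string $response): array
{
    $textLines = [];
    foreach (explode("\n", trim($response)) as $responseLine) {
        $responseLine = rtrim($responseLine, "\r");

        // Status lines look like "250 ok" or "151 ...": three digits, a space.
        if (preg_match('/^\d{3} /', $responseLine)) {
            continue;
        }
        // A line holding only "." marks the end of a definition.
        if ($responseLine === '.') {
            continue;
        }
        $textLines[] = $responseLine;
    }
    return $textLines;
}

// Shortened sample in the shape of the response shown in the question.
$sample = "250 ok\r\n150 1 definitions retrieved\r\n"
    . "151 \"Hello\" gcide \"Sample dictionary\"\r\n"
    . "Hello, interj.\r\n  An exclamation used as a greeting.\r\n"
    . ".\r\n250 ok\r\n221 bye\r\n";

echo implode("\n", extractDictText($sample)), "\n";
```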
