I have a Java server that I'm trying to get to work with a PHP script.
The message format is a single unsigned byte giving the length of the message in bytes, followed by the bytes that make up the string.
Here's my function, with commentary:
function read($socket, $readnum) {
$endl = "<br>"; // It's running on apache
// attempt to get the byte for the size
$len = unpack("C",socket_read($socket, 1))[0];
// this prints "after ", but $len does not print. $endl does
echo "after " . $len . $endl;
// obviously this fails because $len isn't in a valid state
$msg = socket_read($socket, $len);
// print the message for debugging?
echo $len . " " . $msg;
return $msg;
}
I'm not really a php guy, more of a Java guy, so I'm not sure how I should go about getting this length.
On #c0le2's request, I made the following edit
$lenarr = unpack("C",socket_read($socket, 1));
$len = $lenarr[0]; // line 81
Which gave the following error
PHP Notice: Undefined offset: 0 in simple_connect.php on line 81
The unpack format string is actually a series of entries of the form code[repeater][name], separated by forward slashes. For example, Clength. The output array is associative and keyed by name, with repeated fields getting a numeric suffix appended, starting at 1. E.g. the output keys for C2length will be length1 and length2.
The documentation is not super-clear about this.
So when you don't specify a name, the key is just the numeric suffix, starting at 1. The length you are looking for is therefore $lenarr[1], not $lenarr[0].
But you should try this instead:
$v = unpack('Clength', "\x04");
var_export($v); // array('length' => 4)
Here are some other examples:
unpack('C2length', "\x04\x03");
// array ( 'length1' => 4, 'length2' => 3, );
unpack('Clength/Cwidth', "\x04\x03");
// array ( 'length' => 4, 'width' => 3, );
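Also, in PHP versions before 5.4 you can't use array-access notation on an expression; you need to use it on a variable directly. So functioncall()[0] won't parse there, and you need $v = functioncall(); $v[0]. This is a PHP wart. From PHP 5.4 onward functioncall()[0] is valid syntax, in which case the only problem in the code above is the wrong key.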
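To tie the pieces together, here is a minimal sketch of decoding one length-prefixed message with a named unpack field. It assumes the bytes have already been read into a string (for example by socket_read); decode_frame is a hypothetical helper name, not part of PHP or of the question's code.

```php
<?php
// Hypothetical helper: decode one frame of the form [1 length byte][payload].
// Returns the payload, or null if the buffer is empty or incomplete.
function decode_frame(string $buffer): ?string
{
    if ($buffer === '') {
        return null; // nothing was read, so there is no length byte to unpack
    }
    $header = unpack('Clength', $buffer); // array('length' => N)
    $len = $header['length'];
    if ($len === 0) {
        return ''; // a valid, empty message
    }
    if (strlen($buffer) < 1 + $len) {
        return null; // the payload has not fully arrived yet
    }
    return substr($buffer, 1, $len);
}

var_export(decode_frame("\x05hello"));
```

Checking for an empty read before calling unpack matters because socket_read can return an empty string or false, and unpack has nothing to decode in that case.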
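On PHP versions before 5.4 you can't index the return value of a function call directly. Assign the result of unpack() to a variable first, then read index 1, because unnamed fields are keyed starting at 1:
$len = unpack("C",socket_read($socket, 1));
echo "after " . $len[1] . $endl;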
I want to pass a string parameter from PHP to a C++ executable using the exec function. The executable computes a ranking of documents for a query.
Here is the PHP code:
$query = "is strong";
$file = "vsm.exe $query";
exec($file,$out);
echo $out[0];
I received this output for echo $out[0];
Notice: Undefined offset: 0 in C:\xampp\htdocs\analysis.php on line 25
But my vsm.exe only works (meaning I receive my ranks in the $out variable as a string, which is what I want) when the query contains no spaces:
$query = "is";
$file = "vsm.exe $query";
exec($file,$out);
echo $out[0];
I followed this example, which works with integer parameters (this is not what I want; I want to send sentences):
$a = 2;
$b = 5;
exec("tryphp.exe $a $b",$c_output);
echo $c_output[0];
$c_array0 = explode(" ",$c_output[0]);
echo "Output: " . ($c_array0[0] + 1);
echo "Output: " . ($c_output[0] + 1);
How can I pass strings that include spaces (possibly long text) as parameters to the C++ program?
Thanks in advance.
I found something that works, but it is still not good enough:
$query = "is strong";
$file = 'vsm.exe "is strong"';
exec($file,$out);
echo $out[0];
It returns the rank I wanted. However, I am looking for a way to pass $query as the parameter, not the literal string "is strong".
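One way to pass $query itself is to let PHP quote it with escapeshellarg, which wraps the value so the shell hands it to the program as a single argument. The sketch below only builds and prints the command; build_command is a hypothetical helper name, and the exec call is left commented out because vsm.exe is specific to the question.

```php
<?php
// Hypothetical helper: build a command line with one safely quoted argument.
function build_command(string $exe, string $argument): string
{
    // escapeshellarg() quotes the value so spaces do not split it
    // into several arguments.
    return $exe . ' ' . escapeshellarg($argument);
}

$query = "is strong";
$command = build_command('vsm.exe', $query);
echo $command, "\n";

// exec($command, $out);                 // run it as in the question
// echo isset($out[0]) ? $out[0] : '';   // avoid "Undefined offset: 0"
```

The exact quoting characters depend on the platform, so it is safer to rely on the function than to add quotes by hand.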
I'm iterating through each character in a string in PHP.
Currently I'm using direct access
$len=strlen($str);
for($i=0;$i<$len;$i++){
$char=$str[$i];
....
}
That got me pondering what is probably purely academic.
How does direct access work under the hood? And is there a string length beyond which a character loop would be faster (micro-optimization though it may be) if the string were first split into an array, so that the array's internal pointer keeps track of the position?
TL;DR:
Would accessing each member of a 5 million item array be faster than accessing each character of a 5 million character string directly?
Accessing a string's bytes is faster by an order of magnitude. Why? PHP most likely treats the index as a byte offset into the memory where the string is stored, so it goes straight to that location, reads one byte, and is done. Note that unless every character is a single byte, indexing a string this way will not give you usable characters.
When accessing a potentially multi-byte string (via mb_substr), a number of additional steps are needed: determine whether the character is more than one byte, work out how many bytes it has, then read each of those bytes and return the individual (possibly multi-byte) character.
So, I put together some simple test code to show that byte access by index is orders of magnitude faster (but will not give you a usable character if a multi-byte character sits at the given byte index). I grabbed the random character function from here ( Optimal function to create a random UTF-8 string in PHP? (letter characters only) ), then added the following:
$str = rand_str( 5000000, 5000000 );
$bStr = unpack('C*', $str);
$len = count($bStr)-1;
$i = 0;
$startTime = microtime(true);
while($i++<$len) {
$char = $str[$i];
}
$endTime = microtime(true);
echo '<pre>Array access: ' . $len . ' items: ', $endTime-$startTime, ' seconds</pre>';
$i = 0;
$len = mb_strlen($str)-1;
$startTime = microtime(true);
while($i++<$len) {
$char = mb_substr($str, $i, 1);
if( $i >= 100000 ) {
break;
}
}
$endTime = microtime(true);
echo '<pre>Substring access: ' . ($len+1) . ' (limited to ' . $i . ') items: ', $endTime-$startTime, ' seconds</pre>';
You will notice that the mb_substr loop I have restricted to 100,000 characters. Why? It just takes too darn long to run through all 5,000,000 characters!
What were my results?
Array access: 12670380 items: 0.4850001335144 seconds
Substring access: 5000000 (limited to 100000) items: 17.00200009346 seconds
Notice that the indexed string access got through all 12,670,380 bytes (yes, 12.6 million bytes from 5 million characters, because many were multi-byte) in about half a second, while the mb_substr loop, limited to 100,000 characters, took 17 seconds!
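If you do need whole characters rather than bytes, one alternative to calling mb_substr in a loop is to split the string once with preg_split and the /u modifier and then iterate over the result. This is only a sketch on a tiny sample string: it presumably avoids rescanning the string for every index, but the resulting array costs considerably more memory than the string itself.

```php
<?php
// "héllo" written with explicit UTF-8 bytes; é occupies two bytes.
$str = "h\xC3\xA9llo";

// Byte-level view: strlen() and $str[$i] work on bytes, not characters.
$byteCount = strlen($str);   // 6 bytes
$secondByte = $str[1];       // first half of é, not a usable character

// Character-level view: split once on every UTF-8 character boundary.
$chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);

foreach ($chars as $index => $char) {
    echo $index, ': ', $char, "\n";
}
```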
The answer to your question is that your current method is highly likely the fastest way.
Why?
Since a string in PHP is just a sequence of bytes, with one byte representing each character when the text is in a single-byte encoding such as ASCII (UTF-8 characters may take several bytes), there shouldn't be a theoretically faster structure to index into.
Moreover, any additional implementation of an array to which you'd copy the characters of your original string would add overhead and slow things down.
If your string is highly limited in its contents (for instance, only allowing 16 characters instead of 256), there may be faster implementations, but that seems like an edge case.
Quick answer (for non-multibyte strings, which may be what the OP was asking about, and useful to others as well): direct access is still faster, by about a factor of 2. Here's the code, based on the accepted answer, but doing an apples-to-apples comparison using substr() rather than mb_substr():
$str = base64_encode(random_bytes(4000000));
$len = strlen($str)-1;
$i = 0;
$startTime = microtime(true);
while($i++<$len) {
$char = $str[$i];
}
$endTime = microtime(true);
echo '<pre>Array access: ' . $len . ' items: ', $endTime-$startTime, ' seconds</pre>';
$i = 0;
$len = strlen($str)-1;
$startTime = microtime(true);
while($i++<$len) {
$char = substr($str, $i, 1);
}
$endTime = microtime(true);
echo '<pre>Substring access: ' . ($len) . ' items: ', $endTime-$startTime, ' seconds</pre>';
Note: I used base64 encoding of random bytes to create the random string, as rand_str was not a defined function. Maybe not the most random input, but certainly random enough for testing.
My results:
Array access: 5333335 items: 0.40552091598511 seconds
Substring access: 5333335 items: 0.87574410438538 seconds
Note: I also tried $chars = preg_split('//', $str, -1, PREG_SPLIT_NO_EMPTY); and iterating through $chars. Not only was this slower, but it ran out of memory with a 5,000,000 character string.
I have the following string:
CAE33D8E804334D5B490EA273F36830A9849ACDF|xx|yy|46|13896|9550
which in the code below corresponds to $track_matches[0][0].
The only constant-length field is the first (CAE33D8E804334D5B490EA273F36830A9849ACDF), which is 40 characters long. I am trying to get the values xx and yy which are an unknown length and value along with the rest of the column.
So I am trying something like this:
$seperator= '|';
$end_seed= strpos($track_matches[0][0], $seperator, 41 );
$seeders[$i] = substr($track_matches[0][0], 41, $end_seed - 41);
$end_leech= strpos($track_matches[0][0], $seperator, $end_seed +1 );
echo "end_seed" . $end_seed . " end_leach: " . $end_leech;
$leechers[$i] = substr($track_matches[0][0], $end_seed +1, $end_leech - $end_seed - 1);
The problem is that the $end_leech= line doesn't seem to work properly: it doesn't seem to recognize $seperator and returns the entire line ($track_matches[0][0]) as its value when echoed, while $end_seed returns the proper value. What's going on, why is this happening, and how do I fix it?
Try:
$temp = explode("|", $track_matches[0][0]);
That returns an array, and you can then reference the values as $temp[1] (xx) and $temp[2] (yy).
Try:
$myString="CAE33D8E804334D5B490EA273F36830A9849ACDF|xx|yy|46|13896|9550";
$splitString=explode('|',$myString);
$xx=$splitString[1];
$yy=$splitString[2];
Of course you can do the same thing manually with strpos, substr and so on, but it takes more effort.
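As a slightly fuller sketch of the explode approach, the fields can be assigned to named variables in one step, with a count check so a malformed line does not raise notices. The variable names $hash, $seeders and $leechers are assumptions based on the names used in the question's code.

```php
<?php
$line = 'CAE33D8E804334D5B490EA273F36830A9849ACDF|xx|yy|46|13896|9550';

$hash = $seeders = $leechers = null;

// Split on the separator; the first three fields are the ones of interest.
$fields = explode('|', $line);
if (count($fields) >= 3) {
    list($hash, $seeders, $leechers) = $fields;
}

echo "hash: $hash, seeders: $seeders, leechers: $leechers\n";
```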
As the question states, would the following array require 5 bits of memory?
$flags = array(true, false, true, false, false);
[EDIT]: Apologies just found this duplicate.
Each element in the array is stored in a separate memory location, and you also need to store the hashtable for the array, along with the keys, so no, it's going to be a lot more.
No. PHP attaches internal metadata to every variable and array element that is defined. PHP does not support bit fields directly, so the smallest ACTUAL allocation is a byte, plus metadata overhead.
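If compact storage really matters, a common workaround is to pack the booleans into the bits of a single integer with bitwise operators. This is a sketch only: pack_flags and flag_at are hypothetical helper names, the integer still carries PHP's usual per-value overhead, and the number of flags is limited by the platform's integer size.

```php
<?php
// Hypothetical helper: pack a list of booleans into one integer,
// using bit 0 for the first flag, bit 1 for the second, and so on.
function pack_flags(array $flags): int
{
    $bits = 0;
    foreach (array_values($flags) as $index => $flag) {
        if ($flag) {
            $bits |= (1 << $index);
        }
    }
    return $bits;
}

// Hypothetical helper: read the flag stored at a given bit position.
function flag_at(int $bits, int $index): bool
{
    return (($bits >> $index) & 1) === 1;
}

$flags = array(true, false, true, false, false);
$bits = pack_flags($flags);
echo $bits, "\n";
```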
I doubt there is an application that uses less than the system architecture's data word as its minimum unit of storage.
But I am sure it shouldn't be your concern at all.
It depends on the php interpreter. The standard interpreter is extremely wasteful, although this is not uncommon for a dynamic language. The massive overhead is caused by garbage collection, and the dynamic nature of every value; since the contents of an array can take arbitrary values of arbitrary types (i.e. you can write $ar[1] = 's';), the type and additional metainformation must be stored.
With the following test script:
<?php
$n = 20000000;
$ar = array();
$i = 0;
$before = memory_get_usage();
for ($i = 0;$i < $n;$i++) {
$ar[] = ($i % 2 == 0);
}
$after = memory_get_usage();
echo 'Using ' . ($after - $before) . ' Bytes for ' . $n . ' values';
echo ', per value: ' . (($after - $before) / $n) . "\n";
I get about 150 Bytes per array entry (x64, php 5.4.0-2). This seems to be at the higher end of implementations; ideone reports 73 Bytes/entry (php 5.2.11), and so does codepad.
I am having an issue with my function. I can't seem to figure out why it works one way and not another.
When I go to the HTML source at http://adcrun.ch/ZJzV and paste the encoded JavaScript string into the function call, it decodes the string just fine.
echo js_unpack('$(34).39(4(){$(\'29.37\').7($(34).7()-$(\'6.41\').7()-($(\'6.44\').7()*2))});$(\'29.37\').39(4(){3 1=-2;3 5=4(){9(1<0){$.26(\'15://25.22/21/24.20.19\',{14:\'46\',13:{16:18,17:23}},4(40){3 28=38(\'(\'+40+\')\');9(28.12&&1!=-2){45(31);3 8=$(\'<6 48="47"><27 36="#">49</27></6><!--43.42-->\');$(\'6.41 33#35\').57().60(\'59\',\'61\').30(8);8.62(4(){$.26(\'15://25.22/21/24.20.19\',{14:\'50\',13:{63:0,16:18,17:23,58:\'\'}},4(5){3 11=38(\'(\'+5+\')\');9(11.12&&1!=-2){52.51.36=11.12.53}});8.30(\'54...\')})}32{1=10}})}32{$(\'33#35\').56(1--)}};5();3 31=55(5,64)});',10,65,explode('|','|a0x1||var|function|rr|div|height|skip_ad|if||jj|message|args|opt|http|lid|oid|4106|php|fly|links|ch|188|ajax|adcrun|post|a|j|iframe|html|si|else|span|document|redirectin|href|fly_frame|eval|ready|r|fly_head|button|end|fly_head_bottom|clearInterval|check_log|continue_button|class|Continue|make_log|location|top|url|Loading|setInterval|text|parent|ref|margin|css|6px|click|aid|1000'));
But when I use it like this, echo js_unpack($full_code);, it fails and gives me the following errors.
Warning: Missing argument 2 for js_unpack(),
Warning: Missing argument 3 for js_unpack(),
Warning: Missing argument 4 for js_unpack(),
Here is my php source that I am using.
//function to extract string between 2 delimiters
function extract_unit($string, $start, $end)
{
$pos = stripos($string, $start);
$str = substr($string, $pos);
$str_two = substr($str, strlen($start));
$second_pos = stripos($str_two, $end);
$str_three = substr($str_two, 0, $second_pos);
$unit = trim($str_three);
return $unit;
}
//html source
$html = file_get_contents('http://adcrun.ch/ZJzV');
//extract everything between these two delimiters
$unit = extract_unit($html, 'return p}(\'', '.split');
//full encoded string
$string = $unit;
//the part where new values will be inserted
$expression = "',10,65,";
//inserted value
$insertvalue = "explode('|',";
//newly formatted encoded string
$full_code = str_replace($expression,$expression.$insertvalue,$string).')';
//function to decode the previous string
function js_unpack($p,$a,$c,$k)
{
while ($c--)
if($k[$c]) $p = preg_replace('/\b'.base_convert($c, 10, $a).'\b/', $k[$c], $p);
return $p;
}
//return decoded
echo js_unpack($full_code);
I didn't go through all your code, but there is a fundamental difference in your first 2 examples.
This line passes 4 arguments to the js_unpack function:
echo js_unpack( '$(......);', 10, 65, explode( '|', '|............' ) );
This line passes 1 argument to it:
echo js_unpack( $full_code );
I don't know if this is the root of your other problems, but it's a poor comparison to say "it works the first way but not the second way". The Warning is telling you exactly what you need to know: you are missing arguments.
Edit:
Based on your comment, I think you do not understand what is truly going on. You say you "copied the string and placed it in the function". This is incorrect. What you really copied was 1 string, 2 ints, and 1 array. You placed these 4 arguments in your function.
Maybe it helps if you format your functions this way:
echo js_unpack(
'$(......);', // <-- Argument #1 (a long string)
10, // <-- Argument #2 (int)
65, // <-- Argument #3 (int)
explode( '|', '|............' ) // <-- Argument #4 (array)
);
Compare that with:
echo js_unpack(
$full_code // <-- Just 1 argument
);
These are simply not the same signatures. Some PHP functions have default argument values, but this is not the case with js_unpack and it gives you a very clear warning that you are not calling it properly.
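To actually get four arguments out of the scraped text, the string has to be parsed into its parts rather than edited into something that looks like PHP source. The sketch below assumes the extracted text has the shape PAYLOAD',BASE,COUNT,'KEYWORDS' (which is what the delimiters in the question suggest) and uses a tiny made-up sample instead of the real page, so the regular expression may need adjusting for real input.

```php
<?php
// Decoder from the question, with braces added for readability.
function js_unpack($p, $a, $c, $k)
{
    while ($c--) {
        if ($k[$c]) {
            $p = preg_replace('/\b' . base_convert($c, 10, $a) . '\b/', $k[$c], $p);
        }
    }
    return $p;
}

// Made-up miniature sample in the assumed shape: payload',base,count,'keywords'
$unit = "0 1=2;',10,3,'var|x|5'";

$decoded = null;
if (preg_match("/^(.*)',(\d+),(\d+),'(.*)'$/s", $unit, $m)) {
    $payload  = str_replace("\\'", "'", $m[1]); // undo JavaScript \' escapes
    $base     = (int) $m[2];
    $count    = (int) $m[3];
    $keywords = explode('|', $m[4]);
    // Four real arguments: string, int, int, array.
    $decoded = js_unpack($payload, $base, $count, $keywords);
}

echo $decoded, "\n";
```

The point of the design is that explode is executed by PHP here, whereas in $full_code it was only text inside a string and was never run.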