C++ map lookup performance vs PHP array lookup performance

I can't understand the following and I'm hoping someone can shed some light on it for me:
In C++ if I create a vector of test data containing 2M different bits of text (testdata) then create a map using these strings as the index values, then look up all the values, like this:
//Create test data
for(int f=0; f<loopvalue; f++)
{
stringstream convertToString;
convertToString << f;
string strf = convertToString.str();
testdata[f] = "test" + strf;
}
time_t startTimeSeconds = time(NULL);
for(int f=0; f<2000000; f++) testmap[ testdata[f] ] = f; //Write to map
for(int f=0; f<2000000; f++) result = testmap[ testdata[f] ]; //Lookup
time_t endTimeSeconds = time(NULL);
cout << "Time taken " << endTimeSeconds - startTimeSeconds << "seconds." << endl;
It takes 10 seconds.
If I do seemingly at least the same in PHP:
<?php
$starttime = time();
$loopvalue = 2000000;
//fill array
for($f=0; $f<$loopvalue; $f++)
{
$filler = "test" . $f;
$testarray[$filler] = $f;
}
//look up array
for($f=0; $f<$loopvalue; $f++)
{
$filler = "test" . $f;
$result = $testarray[$filler];
}
$endtime = time();
echo "Time taken ".($endtime-$starttime)." seconds.";
?>
...it takes only 3 seconds.
Given that PHP is written in C does anyone know how PHP achieves this much faster text index lookup?
Thanks
C

Your loops are not exactly equivalent algorithms.
Note that in the C++ version you have
testmap[ testdata[f] ] - this is actually a lookup + insert
testmap[ testdata[f] ] - 2 lookups
In the PHP version you just have an insert in the first loop and a lookup in the second one.
PHP is interpreted - generally, if your code is faster in PHP, check the code first! ;-)
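To illustrate the point about operator[] doing more than a plain lookup, here is a minimal sketch. The helper names lookup_or and size_after_bracket_read are made up for this example and are not part of the question's code. It shows that reading a missing key through operator[] inserts a default value, while find() leaves the map untouched.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical helper: a read-only lookup through find(), which never inserts.
// Returns `fallback` when the key is absent.
int lookup_or(const std::map<std::string, int>& m, const std::string& key, int fallback) {
    std::map<std::string, int>::const_iterator it = m.find(key);
    return it == m.end() ? fallback : it->second;
}

// Hypothetical helper: reads `key` through operator[] on a copy of the map and
// reports the copy's size afterwards. A missing key gets a default-constructed
// value inserted, so the size grows by one.
std::size_t size_after_bracket_read(std::map<std::string, int> m, const std::string& key) {
    int ignored = m[key];
    (void)ignored;
    return m.size();
}
```

In the benchmark every key already exists by the time the second loop runs, so nothing is inserted there, but find() still states the intent more clearly for a pure lookup.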

I suspect you benchmark the wrong things.
Anyway, I used your code (had to make some assumptions on your data types) and here are the results from my machine:
PHP:
Time taken 2 seconds.
C++ (using std::map):
Time taken 3 seconds.
C++ (using std::tr1::unordered_map):
Time taken 1 seconds.
C++ compiled with
g++ -O3
Here is my test C++ code:
#include <ctime>
#include <map>
#include <sstream>
#include <string>
#include <vector>
#include <iostream>
#include <tr1/unordered_map>
int main(){
const int loopvalue=2000000;
std::vector<std::string> testdata(loopvalue);
std::tr1::unordered_map<std::string, int> testmap;
int result = 0; // the map stores ints, so the looked-up value is an int
for(int f=0; f<loopvalue; f++)
{
std::stringstream convertToString;
convertToString << f;
std::string strf = convertToString.str();
testdata[f] = "test" + strf;
}
time_t startTimeSeconds = time(NULL);
for(int f=0; f<loopvalue; f++) testmap[ testdata[f] ] = f; //Write to map
for(int f=0; f<loopvalue; f++) result = testmap[ testdata[f] ]; //Lookup
time_t endTimeSeconds = time(NULL);
std::cout << "Time taken " << endTimeSeconds - startTimeSeconds << " seconds." << std::endl;
}
Conclusion:
You tested unoptimized C++ code, probably even compiled with VC++, which by default has a bounds check in std::vector::operator[] when compiled in debug mode.
There is still a difference between PHP and the optimized C++ code when we use std::map, because of the difference in lookup complexity (see n0rd's answer), but C++ is faster when you use a hash map.

According to another question, associative arrays in PHP are implemented as hash tables, which have a search complexity of O(1) on average, while std::map in C++ is a balanced binary search tree with a search complexity of O(log n), which is slower.
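A rough way to see the tree versus hash table difference for yourself is sketched below. It assumes a C++11 compiler, and the names time_fill_and_lookup and make_keys are invented for this sketch. It fills a container, looks every key up again with find(), and returns the elapsed milliseconds, so the same code can be instantiated with std::map and std::unordered_map.

```cpp
#include <chrono>
#include <cstddef>
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: builds keys "test0", "test1", ... like the question does.
std::vector<std::string> make_keys(int n) {
    std::vector<std::string> keys;
    keys.reserve(static_cast<std::size_t>(n));
    for (int i = 0; i < n; ++i) keys.push_back("test" + std::to_string(i));
    return keys;
}

// Hypothetical helper: inserts every key, then looks every key up again.
// Returns elapsed milliseconds; `checksum` receives the sum of the values found,
// which also keeps the compiler from discarding the lookups.
template <typename Map>
long long time_fill_and_lookup(const std::vector<std::string>& keys, long long& checksum) {
    Map m;
    std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < keys.size(); ++i) m[keys[i]] = static_cast<int>(i);
    checksum = 0;
    for (std::size_t i = 0; i < keys.size(); ++i) {
        typename Map::const_iterator it = m.find(keys[i]);
        if (it != m.end()) checksum += it->second;
    }
    std::chrono::steady_clock::time_point stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(stop - start).count();
}
```

Calling it once with std::map<std::string, int> and once with std::unordered_map<std::string, int> on the same keys gives two timings to compare. The actual numbers depend on the compiler, the optimization flags and the machine.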

Related

Bit shifting in C is different from PHP

I have a small piece of C code that I am trying to make work the same way in PHP. It has to do with a bit shift, and I can't figure out what's wrong.
C:
unsigned u = 3910796769;
u += u << 8;
printf("%u\n",u);
//Result : 52422369
PHP:
$u = 3910796769;
$u += $u << 8;
printf("%u\n",$u);
//Result : 1005074769633
Well, unsigned in C is typically 32 bits wide, so you cannot shift the number you provided even once without overflowing. You shifted it left by 8 bits and added the original value once more, which is like multiplying the number by 257, so you get the result mod 2^32 == 4294967296:
unsigned u = 3910796769;
u += u << 8;
this should be 256*u + u == 257 * u == 1005074769633 ≡ 52422369 (mod 4294967296)
You can test it.
[...]
//Result : 52422369 /* correct (mod 2^32) */
PHP probably uses 64bit integers for the operations, and the result properly fits in 64bit.
$u = 3910796769;
$u += $u << 8;
printf("%u\n",$u);
//Result : 1005074769633
But if you try:
#include <stdio.h>
#include <inttypes.h> /* uint64_t and the PRIu64 format macro */
int main()
{
uint64_t u = 3910796769;
u += u << 8;
printf("%" PRIu64 "\n", u);
//Result : 1005074769633
}
you will get the same result as PHP.
In my case, I needed to select elements from an array filled with 32-bit values using a specific formula.
The answer from @Eugene Sh helped me do this in PHP.
$u = 3910796769;
$u += $u << 8;
$u = $u & 0xFFFFFFFF;
printf("%u\n",$u);
//Result : 52422369
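The three behaviours discussed in this thread can be put side by side in a short sketch. It assumes std::uint32_t stands in for C's unsigned on a platform where unsigned is 32 bits wide, and the function names are made up for this example.

```cpp
#include <cstdint>

// 32-bit unsigned arithmetic wraps around modulo 2^32.
std::uint32_t shift_add_32(std::uint32_t u) {
    u += u << 8; // same as u * 257, reduced mod 2^32
    return u;
}

// 64-bit arithmetic has room for the full product.
std::uint64_t shift_add_64(std::uint64_t u) {
    u += u << 8;
    return u;
}

// 64-bit arithmetic followed by a mask reproduces the 32-bit result,
// which is the same idea as the `& 0xFFFFFFFF` used in the PHP snippet.
std::uint64_t shift_add_64_masked(std::uint64_t u) {
    return shift_add_64(u) & 0xFFFFFFFFu;
}
```

With the value from the question, shift_add_32(3910796769) gives 52422369, shift_add_64(3910796769) gives 1005074769633, and shift_add_64_masked(3910796769) gives 52422369 again.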

Convert C++ code to PHP?

I am trying to write a struct in PHP. I know there is no such thing in PHP, but I'd at least like to get it working somehow...
C++:
// The struct
typedef struct data
{
char numbers[20];
char numbers2[50];
char numbers3[6];
char sometext[100];
}data_t;
data_t config;
char numbers[20] = "12345.12345";
char numbers3[6] = "12345";
char sometext[100] = "asdsadsad";
// Storing into struct
strcpy_s(config.numbers, numbers);
strcpy_s(config.numbers3, numbers3);
strcpy_s(config.sometext, sometext);
// Serializing struct to test.dat
ofstream output_file("test.dat", ios::binary);
output_file.write((char*)&config, sizeof(config));
output_file.close();
// Reading from it
data_t master;
ifstream input_file("test.dat", ios::binary);
input_file.read((char*)&master, sizeof(master));
cout << "NUMBERS : " << master.numbers << endl;
cout << "NUMBERS3 : " << master.numbers3 << endl;
cout << "SOMETEXT : " << master.sometext << endl;
cout << endl << endl;
Now storing with C++ in the struct and then reading it works just fine, but I want to store into that file through PHP and then read it from C++, so I have:
PHP:
$data = Array();
$data['numbers'] = "12345.12345";
$data['numbers3'] = "12345";
$data['sometext'] = "abcdfghs";
$fp=fopen("test.dat","wb") or die("Stop! i kill you...");
foreach($data as $key => $value){
echo 'written:'.$value;
fwrite($fp,$value."\t");
}
Now what is happening is:
NUMBERS : 12345.12345 12345 abcdfghs
NUMBERS3 :
SOMETEXT :
So as you can see, it's not good. I also noticed a difference: the file written from C++ contains binary data, while the file written from PHP is just plain text.
Some help would be appreciated, many thanks!
Your C++ struct allocates 20 bytes for the numbers member. That means when you write it to the file, all 20 bytes are written, the writing doesn't just stop after writing 12345.12345. Your PHP code, on the other hand, writes exactly what is in $data['numbers'] and stops immediately (well, after adding a useless "\t"). The "binary data" you noticed in the file is just the garbage which happened to be in memory in those leftover bytes after 12345.12345. Same goes for the other fields.
Your PHP code does not write the string's terminating NULL to the file.
Your PHP code does not write the numbers2 member to the file.
You need to ensure the PHP code writes the terminating NULL, pads the output to the same size as the field has in the C++ struct, and outputs the fields in the same order as the C++ struct. You can use pack() for this:
<?php
$data = array();
$data['numbers'] = "12345.12345";
$data['numbers2'] = '';
$data['numbers3'] = "12345";
$data['sometext'] = "abcdfghs";
$packed = pack('a20a50a6a100', $data['numbers'], $data['numbers2'], $data['numbers3'], $data['sometext']);
$written = file_put_contents("test.dat", $packed);
if($written === false) {
throw new RuntimeException("Failed to write data to file!");
} else if($written !== strlen($packed)) {
throw new RuntimeException("Writing to file was not complete!");
}
Note: For maximum compatibility, you should read/write each struct member to the file individually in a consistent order on both sides. Otherwise you can have problems due to C++ field padding/alignment.
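As a sketch of what writing each member individually could look like on the C++ side, the helpers below build the record field by field instead of dumping the struct's memory. The names fixed_field and pack_record are invented here. Unlike pack()'s 'a' format, this sketch truncates so that every field always keeps at least one terminating zero byte, which is an assumption about what the reader wants.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper: returns exactly `width` bytes holding `text`,
// truncated to width - 1 characters and padded with zero bytes.
std::string fixed_field(const std::string& text, std::size_t width) {
    std::string field(width, '\0');
    if (width == 0) return field;
    const std::size_t n = std::min(text.size(), width - 1);
    field.replace(0, n, text, 0, n); // overwrite the first n zero bytes with the text
    return field;
}

// Hypothetical helper: lays the four members out in struct order with the
// sizes from the question: 20 + 50 + 6 + 100 = 176 bytes, no compiler padding involved.
std::string pack_record(const std::string& numbers, const std::string& numbers2,
                        const std::string& numbers3, const std::string& sometext) {
    return fixed_field(numbers, 20) + fixed_field(numbers2, 50) +
           fixed_field(numbers3, 6) + fixed_field(sometext, 100);
}
```

The resulting string can be written with ofstream::write in binary mode, and the reader can slice it at offsets 0, 20, 70 and 76 regardless of how the compiler lays out the struct.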

I need to execute a C program from a PHP script

OK, I wanted to create a crawler with my PHP script. Certain parts of my crawler require really fast manipulation of strings, which is why I have decided to use a C/C++ program to assist my PHP script for that particular job. The following is my code:
$op=exec('main $a $b');
echo $op;
main is the executable generated from my C file main.c, i.e. main.exe. For the operation above I just made a simple C program which accepts 2 values from PHP and returns the sum of the two values. This is what my C program looks like:
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[])
{
int i=add(atoi(argv[1]),atoi(argv[2]));
printf("%d\n",i);
return 0;
}
int add(int a, int b)
{
int c;
c=a+b;
return c;
}
I tried to execute the program from CMD with main 1 1 and it returned 2... it worked! When I entered them in the PHP script like this,
$a=1;
$b=1;
$op=exec('main $a $b');
echo $op;
it didn't work as expected. Any ideas, suggestions, or anything else I need to do in my code? It would be great if you could show me an example. THANKS IN ADVANCE!!!
You should enclose the argument of exec in double quotes since you're passing variables: PHP does not interpolate variables inside single-quoted strings. The output of your program is collected in the second argument of exec.
exec("main $a $b", $out);
print_r($out);
See exec() reference.
The function atoi() cannot distinguish invalid and valid inputs.
I suggest you use strtol() instead.
#include <stdio.h>
#include <stdlib.h>
void quit(const char *msg) {
if (msg) fprintf(stderr, "%s\n", msg);
exit(EXIT_FAILURE);
}
int add(int, int);
int main(int argc, char *argv[]) {
int a, b, i;
char *err;
if (argc != 3) quit("wrong parameter count");
a = strtol(argv[1], &err, 10);
if (*err) quit("invalid first argument");
b = strtol(argv[2], &err, 10);
if (*err) quit("invalid second argument");
i = add(a, b);
printf("%d\n", i);
return 0;
}
int add(int a, int b) {
return a + b;
}
You need to create an executable ./main.
Then use this code. It works:
<?php
$a=1;
$b=1;
echo exec("./main $a $b");
?>

send PHP string to C++

I am trying to pass a string from PHP into C++. I managed to figure out how to pass numbers, but it doesn't work for letters. Here's what I have that works for PHP:
<?php
$r = 5;
$s = 12;
$x= 3;
$y= 4;
$q= "Hello World";
$c_output=`project1.exe $r $s $x $y $q`; // pass in the value to the c++ prog
echo "<pre>$c_output</pre>"; //received the sum
//modify the value in php and output
echo "output from C++ programm is" . ($c_output + 1);
?>
This sends the variables r, s, x, y, and q to the C++ program project1.exe and IT WORKS, but the problem is that it doesn't work for the string variable $q.
Here's the code that I have in my C++ program, it's simple:
#include<iostream>
#include<cstdlib>
#include<string>
using namespace std;
int main(int in, char* argv[]) {
int val[2];
for(int i = 1; i < in; i++) { // retrieve the value from php
val[i-1] = atoi(argv[i]);
}
double r = val[0];
double s = val[1];
double x = val[2];
double y = val[3];
double q = val[4]; // here's the problem, as soon as i try to define val[4] as a string or char, it screws up
cout << r;
cout <<s;
cout << x;
cout << y;
cout << q;
// will output to php
return 0;
}
It works, but the string "Hello World" which I pass through $q from PHP doesn't come back as a string (I know it's defined as a double, but as soon as I try to change it to a string or a char variable the code just doesn't compile).
Please explain to me how I can get around this problem so that $q can be processed as a string. FYI, I am a newbie to programming (6 months in).
Try not converting the final argument using atoi(argv[i]). Just keep it as argv[i].
for(int i = 1; i < in-1; i++) // convert every argument except the last one
{
val[i-1] = atoi(argv[i]);
}
string q = argv[in-1]; // the last argument stays a string
It doesn't work for letters because you are calling atoi(..) (which converts a char string to an integer) in the C++ program.
Have some means of letting the program know what to expect, whether a number or a string. Maybe the first argument can help the program differentiate, like the following:
$c_output = `project1.exe nnsnns 1 2 string1 3 4 string2`;
Then you could do:
for(int i = 0/*NOTE*/,len=strlen(argv[1]); i < len; i++) { // retrieve the value from php
if (argv[1][i] == 'n'){
//argv[2+i] must be an integer
}else if (argv[1][i] == 's'){
//argv[2+i] is a string
}
}
Of course you should check if (strlen(argv[1]) == in-2).
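The type-descriptor idea above could be fleshed out roughly like this. It is only a sketch assuming C++11: ParsedArgs and parse_typed_args are names invented for the example, and the arguments are passed as a vector that mimics argv (program name first, descriptor second).

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical result type: numbers and strings in the order they appeared.
// When ok is false the vectors may be partially filled and should be ignored.
struct ParsedArgs {
    std::vector<long> numbers;
    std::vector<std::string> strings;
    bool ok;
};

// args[0] is the program name, args[1] the descriptor ('n' = number, 's' = string),
// args[2..] the values. Fails on a length mismatch, an unknown descriptor
// character, or an 'n' value that is not a complete base-10 integer.
ParsedArgs parse_typed_args(const std::vector<std::string>& args) {
    ParsedArgs out;
    out.ok = false;
    if (args.size() < 2) return out;
    const std::string& types = args[1];
    if (types.size() != args.size() - 2) return out; // the strlen(argv[1]) == in-2 check
    for (std::size_t i = 0; i < types.size(); ++i) {
        const std::string& value = args[2 + i];
        if (types[i] == 'n') {
            char* end = 0;
            const long n = std::strtol(value.c_str(), &end, 10);
            if (value.empty() || *end != '\0') return out;
            out.numbers.push_back(n);
        } else if (types[i] == 's') {
            out.strings.push_back(value);
        } else {
            return out;
        }
    }
    out.ok = true;
    return out;
}
```

In main you would build the vector with std::vector<std::string> args(argv, argv + in) and bail out when ok is false.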
BTW, in the C++ code above, val is an array holding 2 ints, and you are trying to access well beyond index 1.
To pass one single string to the C++ you would do something like the following:
$output = `project1.exe $q`; //Read below.
NOTE: $q must be a single word. No spaces, no extra characters like '|', '&', or any other character which the shell might interpret differently. $q must be clean before you pass that on to C++ Program. If $q is more than one word, use quotes.
C++ Part (Just try the following, then you can modify as you go along)
cout<<argv[1]<<endl;

Porting code using sizeof() from C++ to PHP

I have some C++ code (segment seen below), I need to convert this to another language (namely PHP). The code, as seen, uses structs, which PHP doesn't do. I know I can "kind of" emulate structs through objects/arrays, however, this isn't the same. That is not my main problem though. I need a way to implement the sizeof() function found in C++ (since PHP's sizeof() function just counts the number of elements in an array/object).
typedef unsigned long Offset;
typedef unsigned long Size;
struct Location {
Offset offset;
Size size;
};
struct Header {
unsigned long magic;
unsigned long version;
struct Location elements;
struct Location ids;
struct Location strings;
struct Location integers;
struct Location decimals;
struct Location files;
};
int Build() {
Header theheader;
theheader.magic = *((unsigned long*)"P3TF");
theheader.version = 272;
theheader.elements.offset = sizeof(theheader);
theheader.elements.size = element_offset;
theheader.ids.offset = ((theheader.elements.offset + theheader.elements.size + 15) / 16) * 16;
theheader.ids.size = ids_offset;
theheader.strings.offset = ((theheader.ids.offset + theheader.ids.size + 15) / 16) * 16;
theheader.strings.size = string_offset;
theheader.integers.offset = ((theheader.strings.offset + theheader.strings.size + 15) / 16) * 16;
theheader.integers.size = 0;
theheader.decimals.offset = ((theheader.integers.offset + theheader.integers.size + 15) / 16) * 16;
theheader.decimals.size = 0;
theheader.files.offset = ((theheader.decimals.offset + theheader.decimals.size + 15) / 16) * 16;
theheader.files.size = file_offset;
theheader.padding[0] = 0;
theheader.padding[1] = 0;
fwrite(&theheader, 1, sizeof(theheader), file_handle);
}
Can anyone please point me in the right direction on how to do this?
Any help would be appreciated.
Obviously recreating sizeof from C will be a difficult feat, as C is statically-typed and, traditionally, sizeof is evaluated at compile time by the compiler. PHP is also pretty quiet about its memory usage.
One method of dynamically grabbing the size of an object is to use memory_get_usage (official PHP reference) before and after the allocation of the object in question. Of course, you'll run into some fun calculations when you compare the two memory usage values, as storing the values into variables will allocate memory also.
This is a pretty shaky method of recreating sizeof, but if it works it works.
You could simply sum all sizes of the objects in the array or object. However, that still only gets the length of strings, etc. If you want the actual size of the binary representation of the object, you'll have to do some additional math, such as converting all ints to 32 bits (or 64) and appending a null byte to all UTF-8 strings. If you're using charsets, do make sure that they are single-byte or at least measurable in bytes.
PHP does not have a function that checks the memory size of an object.
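Since the struct layout is fixed, the value the port needs is just a constant: the sum of the field sizes. Here is a sketch of that arithmetic, assuming unsigned long is 32 bits in the original build (it is 64 bits on many 64-bit Unix systems, which would double every number) and using only the fields declared in the struct shown in the question. kHeaderSize and round_up_16 are names made up for this example.

```cpp
#include <cstddef>
#include <cstdint>

// Fixed-width stand-ins for the question's typedefs, assuming 32-bit unsigned long.
struct Location {
    std::uint32_t offset;
    std::uint32_t size;
};

struct Header {
    std::uint32_t magic;
    std::uint32_t version;
    Location elements;
    Location ids;
    Location strings;
    Location integers;
    Location decimals;
    Location files;
};

// Sum of field sizes: 2 plain fields + 6 Locations of 2 fields each = 14 * 4 bytes.
const std::size_t kHeaderSize = sizeof(std::uint32_t) * (2 + 6 * 2);

// The ((x + 15) / 16) * 16 expression from Build(): round up to a multiple of 16.
std::size_t round_up_16(std::size_t value) {
    return ((value + 15) / 16) * 16;
}
```

Under these assumptions kHeaderSize is 56, so a PHP port can hard-code that number where the C++ code says sizeof(theheader) and reuse the same rounding expression for the offsets.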
