How does PHP represent strings internally?

I have a basic PHP question.
Let's say I have a string "02/03/2013". How is this represented internally in PHP? Is it converted to integers or to a hexadecimal equivalent?
When comparing two strings, how does PHP compare them internally?
Thanks in advance for the answer.

PHP is written in C. All variables are ZVAL structs.
Please read these tutorials to learn more about the PHP internals and get started with writing extensions.
Extension Writing Part I: Introduction to PHP and Zend
Extension Writing Part II: Parameters, Arrays, and ZVALs
Extension Writing Part III: Resources
Table 1 shows the various types, their corresponding letter codes, and the C types which can be used with zend_parse_parameters():

Type      Code  Variable Type
Boolean   b     zend_bool
Long      l     long
Double    d     double
String    s     char*, int
Resource  r     zval*
Array     a     zval*
Object    o     zval*
zval      z     zval*

A PHP string is just a sequence of bytes, with no encoding attached to it.

Strings are strings. No conversion takes place; your string just happens to contain some digits, which is fine, but PHP doesn't treat it any differently than any other string.
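PHP compares strings the way most languages do: it goes through the two strings byte by byte and looks for the first pair of bytes that differ. Once it finds one, the string whose byte has the lower value (the value you'd get from ord()) is considered "less" than the other string. One caveat: with the == and < operators, two numeric strings such as "10" and "1e1" are compared as numbers; use === or strcmp() to force a plain string comparison. "02/03/2013" is not numeric, so it is always compared as a string.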
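To make the byte-level view concrete, here is a small sketch. The two date strings are just illustrative values; bin2hex() is used only to display the bytes, not because PHP stores the string as hexadecimal.

```php
<?php
$a = "02/03/2013";
$b = "02/04/2013";

// The string is stored as its raw bytes; bin2hex() merely displays them.
$bytes = bin2hex($a);                // "30322f30332f32303133"

// strcmp() walks both strings byte by byte; a negative result means $a sorts first.
$aSortsFirst = strcmp($a, $b) < 0;   // true

// The first differing bytes are at offset 4: "3" (51) versus "4" (52).
$first  = ord($a[4]);                // 51
$second = ord($b[4]);                // 52

var_dump($bytes, $aSortsFirst, $first, $second);
```

Checking strcmp() against zero rather than against -1 keeps the example valid across PHP versions, since only the sign of the result is guaranteed.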

Related

How to convert binary string to string. json_encode binary string returns false

I have read that b in front of a string means binary.
I am receiving a text field from an MS database; the column type is CLOB.
I am using Laravel and when I die dump (dd()) I see:
b"""
My big text
"""
If I create simple string and dd() it I see:
"My big text"
The problem is that json_encode() returns false on this b-string, but everything is fine with the simple string.
Could you please tell me how I can make it a simple string?
P.S. I have tried unpack(), without success.
EDIT: actually, json_encode() is not related to this binary string. It was failing because of a non-UTF-8 symbol. I see ...(22:45 – 0:15 CEST)..., but when I do utf8_decode($text), I see ...(22:45 ? 0:15 CEST)..., and if I try json_encode() now, it works perfectly.
PHP does not have "binary" and "non-binary" strings. It just has strings, and they're always "binary", as they're just acting like byte arrays. The b prefix is added by the Symfony VarDumper component as a sign that the string is not valid UTF-8. Arguably UTF-8 should be the one and only sensible encoding in use today, and apparently Symfony goes so far as to declare anything else as "binary", i.e. atypical text.
That is also the reason why your json_encode failed.
FWIW, b was a proposed forward compatibility prefix to prepare PHP code for PHP 6, which was supposed to have very Python-like binary strings and Unicode strings. Only PHP 6 never happened and the b prefix still does nothing. Symfony seems to have gone all out and adopted the b and """ conventions from Python nonetheless.
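The json_encode() failure can be reproduced with a single invalid byte. This is a sketch under an assumption: the sample text uses "\x96", which is an en dash in Windows-1252, as a guess at what the database returned; the asker's real encoding is not known.

```php
<?php
// "\x96" is an en dash in Windows-1252, but on its own it is not valid UTF-8.
$text = "22:45 \x96 0:15 CEST";

$encoded = json_encode($text);                          // false
$isUtf8Error = json_last_error() === JSON_ERROR_UTF8;   // true

// JSON_INVALID_UTF8_SUBSTITUTE (PHP 7.2+) replaces invalid bytes with U+FFFD
// instead of failing. The original character is lost.
$lossy = json_encode($text, JSON_INVALID_UTF8_SUBSTITUTE);

var_dump($encoded, $isUtf8Error, $lossy);
```

If the real source encoding is known, converting the text to UTF-8 first (for example with mb_convert_encoding() or iconv()) keeps the character instead of substituting it.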

How do `intval()` and `(int)` handle whitespace?

The PHP manual on String conversion to numbers says:
The value is given by the initial portion of the string. If the string starts with valid numeric data, this will be the value used. Otherwise, the value will be 0 (zero).
This means that anything other than a number, plus or minus at the beginning of a string should return 0 when the string is converted to a number. Yet, (some) whitespace at the beginning of a string is ignored:
echo intval(" 3"); // 3
echo intval("
3"); // 3
Is there any kind of whitespace that intval() and (int) do not strip?
Where is this behavior documented?
The observed behavior is largely undocumented. Probably the space stripping depends on strtod() in C, which should use isspace().
intval() manual says:
The common rules of integer casting apply.
Following the link (links removed):
To explicitly convert a value to integer, use either the (int) or (integer) casts. However, in most cases the cast is not needed, since a value will be automatically converted if an operator, function or control structure requires an integer argument. A value can also be converted to integer with the intval() function.
So it looks like casting and intval() are equivalent. And a little below the quote above:
From strings
See String conversion to numbers
OK. Nothing really helpful there for us, except this little note:
For more information on this conversion, see the Unix manual page for strtod(3).
Following the thread of links again, and choosing the version of strtod() specified in POSIX, the API standard for UNIX:
These functions shall convert the initial portion of the string pointed to by nptr to double, float, and long double representation, respectively. First, they decompose the input string into three parts:
1. An initial, possibly empty, sequence of white-space characters (as specified by isspace())
…
Common sense tells us that this should apply to floats only because strtod() returns a floating point number in C, but number parsing and internal representation is quirky in PHP, as much as almost everything in this language. Who knows how it really works under the hood. Better not to know.
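The question of which whitespace gets skipped can be checked empirically. The sketch below assumes PHP 7 or 8, where I would expect the six ASCII whitespace characters recognised by isspace() in the C locale to be skipped and nothing else; treat the listed results as expectations to verify on your own build, not as documented guarantees.

```php
<?php
// Leading characters expected to be skipped:
// space, \t, \n, \r, \v (vertical tab), \f (form feed).
$skipped = [" 3", "\t3", "\n3", "\r3", "\v3", "\f3"];
$results = array_map('intval', $skipped);   // every element is expected to be 3

// Bytes outside that set are not treated as whitespace, so the cast yields 0.
$nbsp = (int) "\xC2\xA03";   // UTF-8 non-breaking space followed by "3"
$nul  = (int) "\x003";       // NUL byte followed by "3"

var_dump($results, $nbsp, $nul);
```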

php functions binary safe? [duplicate]

Possible Duplicate:
In PHP, what does it mean for a function to be binary-safe?
What exactly does it mean that a function (for example, dirname) is binary safe?
It means two things. First, the function works on strings that contain \0, the NUL byte. This is not a given, because functions are often implemented in C, which would treat that byte as the string terminator. PHP, however, stores an explicit length with each string.
Second, in some contexts it means that a particular string function ignores the character set and does not try to interpret UTF-8 sequences. For raw binary data the UTF-8 sequencing would be wrong, thus making functions fail if they try to treat it as text.
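A quick way to see the first point is to pass a string with an embedded NUL byte through a few core string functions. The sample string is arbitrary.

```php
<?php
// A string with a NUL byte in the middle.
$data = "abc\0def";

$length = strlen($data);       // 7: the NUL byte is counted, not treated as a terminator
$tail   = substr($data, 4);    // "def"
$pos    = strpos($data, "\0"); // 3

var_dump($length, $tail, $pos);
```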
It means that the data will not be interpreted as text.
It means that binary data can pass through the function, and it won't be treated as text. Sometimes if you have string functions and you try to use them for raw binary data (such as a string replace function in other languages), they will garble your data.
Perhaps a better description at http://en.wikipedia.org/wiki/Binary-safe

Check for binary string length?

Is there a native or inexpensive way to check for the length of a string in bytes in PHP?
See http://bytes.com/topic/php/answers/653733-binary-string-length
Relevant part:
"In PHP, like in C, the string ends with a zero-character, '\0', (char) 0, null-terminator, null-byte or whatever you like to call it."
No, that's not the case. PHP strings are stored with both the length and the data, unlike C strings, which just have one pointer and use a terminator. They're "binary-safe": NUL doesn't terminate the string.
See the definition of zvalue_value in zend.h; the string part has both a "char *val" and an "int len".
Problems would start if you're using mbstring.func_overload, which changes how strlen() and the other functions work, and does try to treat strings as strings of characters in a specific encoding rather than as strings of bytes. This is not the normal PHP behaviour.
The answer is that strlen() should return the number of bytes regardless of the content of the string. For multi-byte character strings, you get the wrong number of characters, but the right number of bytes. However, you need to be certain you're not using the mbstring overload (mbstring.func_overload, deprecated in PHP 7.2 and removed in PHP 8.0), which changes how strlen() behaves.
In the event that you have the mbstring overload set, or you are developing for platforms where you are unsure about this setting, you can do the following:
$len=strlen(bin2hex($data))/2;
The reason why this works is that in Hex you are guaranteed to get 2 characters for all bytes that come from bin2hex (it returns two chars even for the initial binary 0).
Note that it will use significantly more resources than a normal strlen(), so you should definitely not do this with large amounts of data unless it's absolutely necessary.
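As a sanity check, the two approaches can be compared side by side. This is a sketch with made-up sample data, assuming a PHP 7 or 8 build with no function overloading active.

```php
<?php
$binary = "\x00\xFFabc\x00";   // 6 bytes, including two NUL bytes
$utf8   = "h\xC3\xA9llo";      // "héllo" in UTF-8: 5 characters, 6 bytes

$direct    = strlen($binary);                       // 6
$viaHex    = intdiv(strlen(bin2hex($binary)), 2);   // 6
$utf8Bytes = strlen($utf8);                         // 6 bytes, not 5 characters

var_dump($direct, $viaHex, $utf8Bytes);
```

Where the mbstring extension is loaded, mb_strlen($data, '8bit') is another way to ask explicitly for a byte count.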
On php.org, someone was nice enough to create this function. Just multiply by 8 and you've got however many bits were in that string, as the function returns bytes.
In C, the length of a string (textual data) is determined by the position of the NUL character which marks the end.
In the case of binary data, NUL can be, and often is, in the middle of the data.
You don't check the length of binary data. You have to know it beforehand. In your case, the length is 16 (bytes, not bits, if it is UUID).
As far as UUID validity is concerned, any 16-byte value is a valid UUID, so you are out of luck there.

PHP: Very simple Encode/Decode string

Is there any PHP function that encodes a string to a int value, which later I can decode it back to a string without any key?
Sure, you can convert strings to numbers and vice versa. Consider:
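$a = "41" + 1;      // a numeric string is converted to a number: 42
echo gettype($a);   // integer
$b = "$a";          // interpolation converts it back to a string
echo gettype($b);   // string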
You can also do type casting with settype().
If I misunderstood you and you want to encode arbitrary strings, consider using base64_encode() and base64_decode(). Note that base_convert() will not turn the result into a base 10 integer: it only supports bases 2 to 36 and loses precision on large numbers.
An int has 4 or 8 bytes depending on the platform, and each character in a string is one byte (or more, depending on the encoding). So you can only encode very short strings as integers, which basically makes the answer to your question: no.
What do you want to accomplish?
I would suspect not, since there are far more possible strings than there are integers up to PHP_INT_MAX.
Does it have to be an integer?
I'm convinced that what you think you want to do is not really what you want to do. :-) This just sounds like a silly idea. As another user has asked before: what do you need this for? What are your intentions?
Well, now that you've mentioned that numbers and a-z letters are acceptable, I have one suggestion: you could loop through the individual letters' ordinal values and display each as a two-digit hexadecimal number. You can then convert these hexadecimals back to the ordinal values of the individual characters. I don't know what kind of characters you are about to encode; possibly you will need to use four hex digits per letter (e.g. the string Peter would become 00500065007400650072). Well... have fun with that, I still don't really see the rationale for doing what you're doing.
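The hexadecimal idea could look roughly like this. The helper names encodeHex4() and decodeHex4() are hypothetical, the code works on bytes rather than on characters, and it assumes non-empty input. With "Peter", whose first byte is 0x50, the four-digit form starts with 0050.

```php
<?php
// Hypothetical helper: every byte becomes four hexadecimal digits.
function encodeHex4(string $text): string
{
    $out = '';
    foreach (str_split($text) as $byte) {
        $out .= sprintf('%04x', ord($byte));
    }
    return $out;
}

// Hypothetical helper: reverses encodeHex4() by reading four digits at a time.
function decodeHex4(string $hex): string
{
    $out = '';
    foreach (str_split($hex, 4) as $group) {
        $out .= chr(hexdec($group));
    }
    return $out;
}

$encoded = encodeHex4("Peter");      // "00500065007400650072"
$decoded = decodeHex4($encoded);     // "Peter"

// The two-digit variant needs no custom code at all.
$short = bin2hex("Peter");           // "5065746572"
$back  = hex2bin($short);            // "Peter"

var_dump($encoded, $decoded, $short, $back);
```

The result is a string of digits and the letters a-f, not an integer, so it sidesteps the size limit of int at the cost of doubling (or quadrupling) the length.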
There is no function for PHP but I recently wrote a class to encrypt and decrypt a string in PHP. You can look at it at: https://github.com/Lars-/PHP-Security-class
