I have an encrypted image and before saving it I would like to know how much space it takes up. I can get the number of characters via strlen($img) or via mb_strlen($img) but I would like to get a number like 16KiB (or KB).
I then save the string into a MySQL database in blob format, where I can see the size of it using PhpMyAdmin.
EDIT
If I use strlen to get the byte size of the string (which is what I want), I get a different value from the byte size displayed in my MySQL database (where the string is saved not as a char but as a blob, meaning binary). How can this be? And how can I find out how large the binary size will be when I save the string in the database?
I save the string simply with the MySQL command
INSERT INTO table (content, bla) VALUES ($string, bla);
(not fully correct but for example purpose – this works when correct)
Now when I look inside my database, it shows a size of e.g. 315 KB, but when I take $string and run strlen on it, it returns something like 240000 (which is not the same amount once converted to KB).
I will investigate myself...
This does essentially the same thing as Dany's answer, but a little more compact.
function human_filesize($bytes, $decimals = 2) {
    $size = array('B','kB','MB','GB','TB','PB','EB','ZB','YB');
    // Every three decimal digits correspond roughly to one power of 1024.
    $factor = (int) floor((strlen((string) $bytes) - 1) / 3);
    return sprintf("%.{$decimals}f", $bytes / pow(1024, $factor)) . $size[$factor];
}
echo human_filesize(filesize($filename));
Source: http://jeffreysambells.com/2012/10/25/human-readable-filesize-php
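For the string case in the question, the same idea can be applied to strlen($img) instead of filesize(). A minimal sketch, assuming binary units (1 KiB = 1024 bytes); format_bytes is a hypothetical helper name, not something from the answer above:

```php
<?php
// Hypothetical helper: formats a byte count using binary units.
function format_bytes(int $bytes, int $decimals = 2): string
{
    $units = ['B', 'KiB', 'MiB', 'GiB', 'TiB'];
    $value = (float) $bytes;
    $i = 0;
    // Divide by 1024 until the value fits the current unit.
    while ($value >= 1024 && $i < count($units) - 1) {
        $value /= 1024;
        $i++;
    }
    return sprintf("%.{$decimals}f %s", $value, $units[$i]);
}

// strlen() counts bytes, not characters, so it works for binary strings.
$img = str_repeat("\x00", 16384); // stand-in for the encrypted image data
echo format_bytes(strlen($img)), "\n";
```

Unlike the digit-counting version, this one picks the unit by comparing against 1024 directly, so a value such as 1000 bytes stays in B instead of being shown as a fraction of a kB.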
Related
I need to split a big DBF file using PHP functions. This means that if I have, for example, 1000 records, I have to create 2 files with 500 records each.
I do not have any dbase extension available, nor can I install one, so I have to work with basic PHP functions. Using the basic fread function I am able to read and parse the file correctly, but when I try to write a new DBF I have some problems.
As I have understood, the DBF file is structured as a 2-line file: the first line contains file info and header info, and it's in binary. The second line contains the data, and it's plain text. So I thought I could simply write a new binary file replicating the first line and manually add the first records to the first file and the other records to the other file.
This is the code I use to parse the file, and it works nicely:
$fdbf = fopen($_FILES['userfile']['tmp_name'],'r');
$fields = array();
$buf = fread($fdbf,32);
$header=unpack( "VRecordCount/vFirstRecord/vRecordLength", substr($buf,4,8));
$goon = true;
$unpackString='';
while ($goon && !feof($fdbf)) { // read fields:
    $buf = fread($fdbf,32);
    if (substr($buf,0,1)==chr(13)) {$goon=false;} // end of field list
    else {
        $field=unpack( "a11fieldname/A1fieldtype/Voffset/Cfieldlen/Cfielddec", substr($buf,0,18));
        $unpackString.="A$field[fieldlen]$field[fieldname]/";
        array_push($fields, $field);
    }
}
fseek($fdbf, 0);
$first_line = fread($fdbf, $header['FirstRecord']+1);
fseek($fdbf, $header['FirstRecord']+1); // move back to the start of the first record (after the field definitions)
first_line is the variable that contains the header data, but when I try to write it to a new file something goes wrong and the row isn't written exactly as it was read. This is the code I use for writing:
$handle_log = fopen($new_filename, "wb");
fwrite($handle_log, $first_line, strlen($first_line) );
fwrite($handle_log, $string );
fclose($handle_log);
I've tried adding the b flag to the fopen mode parameter, as suggested, to open the file in binary mode. I've also followed a suggestion to pass the exact length of the string to avoid some characters being stripped, but without success: none of the files written are valid DBF files. What can I do to achieve my goal?
As I have understood, the DBF file is structured as a 2-line file: the
first line contains file info and header info, and it's in binary. The
second line contains the data, and it's plain text.
Well, it's a bit more complicated than that.
See here for a full description of the dbf file format.
So it would be best if you could use a library to read and write the dbf files.
If you really need to do this yourself, here are the most important parts:
Dbf is a binary file format, so you have to read and write it as binary. For example the number of records is stored in a 32 bit integer, which can contain zero bytes.
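Be careful with string functions on that binary data. PHP strings are binary-safe, so strlen() does count embedded null bytes such as those in that 32 bit integer, but multibyte-aware functions (mb_strlen(), or strlen() under the old mbstring.func_overload setting) and text-oriented functions such as trim() can miscount or alter it.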
If you split the file (the records), you'll have to adjust the record count in the header.
When splitting the records, keep in mind that each record is preceded by an extra byte: a space (0x20) if the record is not deleted, an asterisk (0x2A) if it is deleted. For example, if you have 4 fields of 10 bytes each, the length of each record will be 41. That value is also available in the header: bytes 10-11 hold the number of bytes in a record as a 16-bit number, least significant byte first.
The file could end with the end-of-file marker 0x1A, so you'll have to check for that as well.
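The points above can be sketched roughly as follows. This is only an illustration, assuming a plain dBASE III style layout (record count at bytes 4-7, header length at bytes 8-9, record length at bytes 10-11, all little-endian) and a file small enough to hold in memory; split_dbf is a hypothetical helper name:

```php
<?php
// Hypothetical helper: splits raw DBF bytes into chunks of at most
// $perFile records, each chunk carrying its own patched header.
function split_dbf(string $dbf, int $perFile): array
{
    if ($perFile < 1) {
        throw new InvalidArgumentException('perFile must be at least 1');
    }
    // Record count (uint32), header length (uint16), record length (uint16).
    $h = unpack('VRecordCount/vHeaderLength/vRecordLength', substr($dbf, 4, 8));
    $header = substr($dbf, 0, $h['HeaderLength']);
    $parts = [];
    for ($start = 0; $start < $h['RecordCount']; $start += $perFile) {
        $count = min($perFile, $h['RecordCount'] - $start);
        // Each record already includes its leading deletion-flag byte.
        $records = substr(
            $dbf,
            $h['HeaderLength'] + $start * $h['RecordLength'],
            $count * $h['RecordLength']
        );
        // Overwrite the record count in the copied header.
        $patched = substr_replace($header, pack('V', $count), 4, 4);
        // Terminate each part with the end-of-file marker 0x1A.
        $parts[] = $patched . $records . "\x1A";
    }
    return $parts;
}
```

Each element of the returned array can then be written with file_put_contents() or with fopen() in "wb" mode. The last-update date in the header is left untouched here.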
I am developing a PHP application where large amounts of text need to be stored in a MySQL database. I have come across PHP's gzcompress and MySQL's COMPRESS functions as possible ways of reducing the stored data size.
What is the difference, if any, between these two functions?
(My current thoughts are gzcompress seems more flexible in that it allows the compression level to be specified, whereas COMPRESS may be a bit simpler to implement and better decoupling? Performance is also a big consideration.)
The two methods are more or less the same thing, in fact you can mix them: compress in php and uncompress in MySQL and vice versa.
To compress in MySQL:
INSERT INTO table (data) VALUE(COMPRESS(data));
To compress in PHP:
$compressed_data = "\x1f\x8b\x08\x00".gzcompress($uncompressed_data);
To uncompress in MySQL:
SELECT UNCOMPRESS(data) FROM table;
To uncompress in PHP:
$uncompressed_data = gzuncompress(substr($compressed_data, 4));
Another option is to use MySQL table compression.
It only requires configuration, and then it is transparent.
This may be an old question, but it's important as a Google search destination. The results of MySQL's COMPRESS() and PHP's gzcompress() are the same, except that MySQL puts a 4-byte header on the data, which indicates the uncompressed data length. You can easily skip the first 4 bytes from MySQL's COMPRESS() and feed the rest to gzuncompress() and it will work, but you cannot take the result of PHP's gzcompress() and use MySQL's UNCOMPRESS() on it unless you take specific care to add that 4-byte length header, which of course requires having the uncompressed data already...
The accepted answer does not use the right 4 byte header.
The first 4 bytes are the LENGTH and not a static header.
I have no idea about the implications of using a wrong length, but it cannot be good and has the potential to crash the database or table content in the future (if not now).
The correct answer with POC example:
Output of mysql:
mysql : "select hex(compress('1234512345'))"
0A000000789C3334323631350411000AEB01FF
The php equivalent:
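A minimal sketch of what such a PHP equivalent could look like, assuming the 4-byte prefix is the uncompressed length as an unsigned little-endian integer, as described in this thread. The function names mysql_style_compress and mysql_style_uncompress are hypothetical:

```php
<?php
// Hypothetical helper: produces data in the layout MySQL's UNCOMPRESS()
// expects, i.e. a 4-byte little-endian length followed by the zlib stream.
function mysql_style_compress(string $data): string
{
    if ($data === '') {
        return ''; // MySQL stores empty strings as empty strings
    }
    // strlen() counts bytes, which is what the length header needs.
    return pack('V', strlen($data)) . gzcompress($data);
}

// Hypothetical helper: reverses the above; returns null on malformed input.
function mysql_style_uncompress(string $blob): ?string
{
    if ($blob === '') {
        return '';
    }
    if (strlen($blob) < 5) {
        return null;
    }
    $expected = unpack('V', substr($blob, 0, 4))[1];
    $data = @gzuncompress(substr($blob, 4));
    if ($data === false || strlen($data) !== $expected) {
        return null;
    }
    return $data;
}

echo strtoupper(bin2hex(substr(mysql_style_compress('1234512345'), 0, 4))), "\n";
```

The length prefix for '1234512345' is 0A000000, the same as in the MySQL output above. Whether the remaining bytes match exactly depends on the zlib build and compression level, so it is safer to compare round-trip results than raw hex.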
They both use zlib, so the compression will likely be about the same. Test it and see.
Adding this answer for reference, as I needed to use uncompress() to decompress data where the decompressed size was stored in a separate column to the blob.
As per the previous answers, uncompress() expects the first 4 bytes of the compressed data to be the length, stored in little-endian format. This can be prepended using concat e.g.
select uncompress(
concat(
char(size & 0x000000ff),
char((size & 0x0000ff00) >> 8),
char((size & 0x00ff0000) >> 16),
char((size & 0xff000000) >> 24),
compressed_data)) as decompressed
from my_blobs;
John's answer is almost correct. The length must be computed using strlen instead of mb_strlen, as the latter counts a multibyte character as "1 character" although it spans multiple bytes. Take the following example with a "▄" character, which consists of 3 bytes:
$string="▄";
$compressed = gzcompress($string, 6);
echo "with strlen\n";
$len = strlen($string);
$head = pack('V', $len);
$base64 = base64_encode($head.$compressed);
echo "Length of string: $len\n";
echo $base64."\n";
echo `mysql -e "SELECT UNCOMPRESS(FROM_BASE64('$base64'))" -u root -proot -h mysql`;
echo "\n\nwith mb_strlen\n";
$len = mb_strlen($string);
$head = pack('V', $len);
$base64 = base64_encode($head.$compressed);
echo "Length of string: $len\n";
echo $base64."\n";
echo `mysql -e "SELECT UNCOMPRESS(FROM_BASE64('$base64'))" -u root -proot -h mysql`;
Output:
with strlen
Length of string: 3
AwAAAHicezStBQAEWQH9
UNCOMPRESS(FROM_BASE64('AwAAAHicezStBQAEWQH9'))
▄
with mb_strlen
Length of string: 1
AQAAAHicezStBQAEWQH9
UNCOMPRESS(FROM_BASE64('AQAAAHicezStBQAEWQH9'))
NULL
I am working on a PHP script that encodes a given text and hides it in an image using LSB. The encoded text is a byte array (text encrypted using mcrypt with rijndael-256 and then unpacked with unpack("C*", $encryptedText);), and I have to add the array size at the beginning of the array. If I did not do this, reading the bytes back from the image would be terrible later on, because the script would not know where to stop reading. I added the size information at the beginning of the array using these lines of code:
$size = count($byteArray);
array_unshift($byteArray, $size >> 24, ($size & 0xff0000) >> 16, ($size & 0xff00) >> 8, $size & 0xff);
So the size is added in integer format (4 bytes), but now every image created has the characteristic that the first hidden bytes mostly start with zeros, because $size is mostly in the range of 60000 or lower. Is there any way I can encode the size, or change other parts of the program, so that it works and the beginning of the byte array is not nearly the same every time?
Instead of always having the first 4 bytes encode how long your message is, you can use the last two bits of the first byte to encode how many bytes you need to read for $size. Say, 00 = 1, 01 = 2, 10 = 3 and 11 = 4. For example, if $size is small enough to be expressed with just two bytes, the first few bytes will read as follows:
First byte: xxxxxx01
Second and third bytes: $size
Fourth byte and onward: ByteArray...
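The layout above could be sketched like this. It is only an illustration: the function names are hypothetical, the six unused bits of the first byte are filled with random values (an assumption, the answer leaves them unspecified), and the size bytes are written most significant byte first, as in the question's code:

```php
<?php
// Hypothetical helper: prepends a variable-length size field.
function prepend_size(array $bytes): array
{
    $size = count($bytes);
    // Smallest number of bytes (1 to 4) that can hold $size.
    $n = 1;
    while ($n < 4 && $size >= (1 << (8 * $n))) {
        $n++;
    }
    // Upper six bits: random filler. Lower two bits: $n - 1.
    $prefix = [(random_int(0, 63) << 2) | ($n - 1)];
    for ($i = $n - 1; $i >= 0; $i--) {
        $prefix[] = ($size >> (8 * $i)) & 0xFF;
    }
    // array_values() normalises the 1-based keys that unpack("C*") produces.
    return array_merge($prefix, array_values($bytes));
}

// Hypothetical helper: returns [size, payload] from a stream built above.
function read_size(array $stream): array
{
    $n = ($stream[0] & 0x03) + 1;
    $size = 0;
    for ($i = 1; $i <= $n; $i++) {
        $size = ($size << 8) | $stream[$i];
    }
    return [$size, array_slice($stream, $n + 1, $size)];
}
```

With a payload of 300 bytes, for instance, the first byte ends in the bits 01 and is followed by two size bytes, so the stream no longer starts with a fixed run of zeros.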
You can spice things up further by using a randomised embedding method. You can use a pseudorandom number generator, or chaotic maps such as the Logistic Map or the Tent Map. The recipient will need the seed or the initial condition parameters to work out in what order to read the bytes to extract the message. For example, consider 5 bytes to embed data and 5 numbers generated between 0 and 1.
(0.2843, 0.5643, 0.0904, 0.4308, 0.9866)
Sorting the numbers in ascending order gives you the following order, which you can use to embed your secret:
(3, 1, 4, 2, 5)
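As a rough sketch of that ordering step: the function names below are hypothetical, and the parameter r = 3.99 for the logistic map is just an assumed value in its chaotic range.

```php
<?php
// Hypothetical helper: generates $n values in (0, 1) from the logistic map
// x -> r * x * (1 - x), starting from the shared secret $seed.
function logistic_sequence(float $seed, int $n, float $r = 3.99): array
{
    $x = $seed;
    $out = [];
    for ($i = 0; $i < $n; $i++) {
        $x = $r * $x * (1 - $x);
        $out[] = $x;
    }
    return $out;
}

// Hypothetical helper: returns the 1-based positions of the values
// in ascending order of value.
function embedding_order(array $values): array
{
    asort($values); // sorts by value and keeps the original keys
    $order = [];
    foreach (array_keys($values) as $index) {
        $order[] = $index + 1;
    }
    return $order;
}

print_r(embedding_order([0.2843, 0.5643, 0.0904, 0.4308, 0.9866]));
```

For the five numbers in the example this yields 3, 1, 4, 2, 5. Sender and recipient get the same order as long as they use the same seed and parameter.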
I have a table with column name = recording_size.
In this column I am storing the size in bytes,
but on the Web I am showing the size in KB.
public static function getKbFromBytes($string){
    if($string){
        return ($string/1024);
    }
}
Now I have filtering functionality on the Web. Since I am showing the size in KB, I take the search input from the user in KB and not in bytes, although in the DB it is stored in bytes. So I take the input in KB and then convert it back to bytes like this:
public static function getBytesFromKb($string){
    if($string){
        return ($string * 1024);
    }
}
Example:
size in bytes = 127232
When I apply my function, it becomes 124.25 KB.
Now when the user writes exactly 124.25, the search works,
but I do not want the user to have to write exactly the same value.
The user can also write 124 instead of 124.25, and when the user writes 124 my search does not work, meaning it does not show any records.
I have also tried adding and subtracting 50 from the converted bytes, but it is not working.
$sql = $sql->where('r.recording_size BETWEEN "'.(Engine::getBytesFromKb($opt['sn']) - 50) .'" AND "'.(Engine::getBytesFromKb($opt['sn']) + 50) .'"');
How can I achieve this?
Searching for size should probably be as a range instead of an equality anyway. At least, the default should be the range, unless your app's primary focus is the exact size. For the SQL:
$kb = round($opt['sn']);
$sql = $sql->where('r.recording_size BETWEEN '.($kb * 1024).' AND '.(($kb + 1) * 1024));
By the way, you should omit the quotation marks ("). Double quotes are nonstandard for string literals in SQL (the standard uses single quotes), and here you are comparing integers anyway.
Another thing to watch for is KiB (1024 bytes) vs. KB (1000 bytes).
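A small sketch of the range idea, using floor() rather than round() so that an input of 124 covers everything that would be displayed as 124.xx KB. The function name kb_to_byte_range is hypothetical, and 1 KB is taken as 1024 bytes as in the question:

```php
<?php
// Hypothetical helper: turns a user-entered size in KB into an inclusive
// [lower, upper] byte range covering that whole KB.
function kb_to_byte_range($input): array
{
    $kb = (int) floor((float) $input);
    // BETWEEN is inclusive on both ends, so stop one byte short of the next KB.
    return [$kb * 1024, ($kb + 1) * 1024 - 1];
}

[$lower, $upper] = kb_to_byte_range('124');
echo "$lower $upper\n"; // 126976 127999, which includes 127232
```

Because both bounds are integers produced by the cast, they can be placed in the BETWEEN clause without quotation marks.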
What is the best way to calculate the length of an FLV file using PHP without external dependencies like ffmpeg? The client's site runs on shared hosting.
I tried http://code.google.com/p/flv4php/, but it extracts metadata, and not all videos contain metadata.
There's a not too complicated way to do that.
FLV files have a specific data structure which allow them to be parsed in reverse order, assuming the file is well-formed.
Just fopen the file and seek 4 bytes before the end of the file.
You will get a big endian 32 bit value that represents the size of the tag just before these bytes (FLV files are made of tags). You can use the unpack function with the 'N' format specification.
Then, you can seek back to the number of bytes that you just found, leading you to the start of the last tag in the file.
The tag contains the following fields:
one byte signaling the type of the tag
a big endian 24 bit integer representing the body length for this tag (should be the value you found before, minus 11... if not, then something is wrong)
a big endian 24 bit integer representing the tag's timestamp in the file, in milliseconds, plus a 8 bit integer extending the timestamp to 32 bits.
So all you have to do is then skip the first 32 bits, and unpack('N', ...) the timestamp value you read.
As FLV tag duration is usually very short, it should give a quite accurate duration for the file.
Here is some sample code:
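$flv = fopen("flvfile.flv", "rb");
// The last 4 bytes hold the size of the last tag (big endian).
fseek($flv, -4, SEEK_END);
$arr = unpack('N', fread($flv, 4));
$last_tag_offset = $arr[1];
// Jump back to the start of the last tag.
fseek($flv, -($last_tag_offset + 4), SEEK_END);
// Skip the tag type (1 byte) and the body length (3 bytes).
fseek($flv, 4, SEEK_CUR);
$t0 = fread($flv, 3); // lower 24 bits of the timestamp
$t1 = fread($flv, 1); // upper 8 bits (timestamp extension)
$arr = unpack('N', $t1 . $t0);
$milliseconds_duration = $arr[1];
fclose($flv);
The last two fseek calls can be combined, but I left them separate for clarity.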
Edit: Fixed the code after some testing
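For files that may not be well-formed, the same steps can be wrapped in a function that also checks the "body length equals tag size minus 11" rule mentioned above. This is only a sketch; flv_duration_ms is a hypothetical name and returning null on any inconsistency is an assumed convention:

```php
<?php
// Hypothetical helper: returns the timestamp of the last FLV tag in
// milliseconds, or null if the file cannot be read or looks malformed.
function flv_duration_ms(string $path): ?int
{
    $flv = @fopen($path, 'rb');
    if ($flv === false) {
        return null;
    }
    try {
        // Size of the last tag, stored in the final 4 bytes (big endian).
        if (fseek($flv, -4, SEEK_END) !== 0) {
            return null;
        }
        $raw = fread($flv, 4);
        if ($raw === false || strlen($raw) !== 4) {
            return null;
        }
        $tagSize = unpack('N', $raw)[1];
        // Seek to the start of the last tag and read its 11-byte header.
        if (fseek($flv, -($tagSize + 4), SEEK_END) !== 0) {
            return null;
        }
        $tag = fread($flv, 11);
        if ($tag === false || strlen($tag) !== 11) {
            return null;
        }
        // Bytes 1-3: body length as a 24-bit big-endian integer.
        $bodyLength = unpack('N', "\x00" . substr($tag, 1, 3))[1];
        if ($bodyLength !== $tagSize - 11) {
            return null;
        }
        // Bytes 4-6: lower 24 bits of the timestamp, byte 7: upper 8 bits.
        return unpack('N', $tag[7] . substr($tag, 4, 3))[1];
    } finally {
        fclose($flv);
    }
}
```

A call such as flv_duration_ms("flvfile.flv") then yields either a millisecond value or null, which the caller can treat as "duration unknown".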
The calculation to get the duration of a movie in seconds is roughly this:
size of file in bytes / (bitrate in kilobits per second × 1000 / 8)
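As a worked example of this estimate, assuming the bitrate is known, roughly constant, and given in kilobits per second with 1 kilobit = 1000 bits (the function name is hypothetical):

```php
<?php
// Hypothetical helper: estimates duration in seconds from file size and
// an average bitrate. Total bits divided by bits per second.
function estimate_duration_seconds(int $bytes, float $kbps): float
{
    return ($bytes * 8) / ($kbps * 1000);
}

// 1,000,000 bytes at 800 kbit/s: 8,000,000 bits / 800,000 bits per second.
echo estimate_duration_seconds(1000000, 800), "\n"; // 10
```

This only gives an approximation, since container overhead and variable bitrates are ignored.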