CJK White Space Characters Disappearing

CJK White Space Characters Disappearing - php

I have a PHP script which fetches data from Salesforce via the Salesforce API and writes the output to a file using file_put_contents. The data is a mixture of Korean characters and English characters.
When I run the script on a box (1) running Red Hat Enterprise Linux ES release 4 (Nahant Update 8) with PHP 5.2.8 and a similar box (2) running PHP 5.3.6 the spaces in between the Korean characters disappear.
e.g. (Using K to represent a Korean character and E to represent an English character)
EEEEEEEEK KKK KKKK EEE KKKK is appearing as EEEEEEEEKKKKKKKK EEE KKKK
However when I run the script on a box (3) running CentOs with PHP 5.3.5 or on (4) my local windows machine with PHP 5.3.6 the text in the file is correct.
Can anyone suggest what the problem might be?
EDIT - Originally I was accessing the php script via a browser however to (hopefully) simplify the problem I am currently storing the output in a text file and downloading it to my windows machine.
EDIT - Hex version
Original text - CFD란 무엇입니까?
Hex from (1) - 43 46 44 eb 9e 80 eb ac b4 ec 97 87 ec 9e 85 eb 8b 88 ea b9 8c 3f
Hex from (3) - 43 46 44 eb 9e 80 20 eb ac b4 ec 97 87 ec 9e 85 eb 8b 88 ea b9 8c 3f
EDIT - Code used to select text (with user, pass, table, id and path omitted)
<?php
ini_set("soap.wsdl_cache_enabled", "0");
require_once ("../soapclient/SforcePartnerClient.php");
require_once ("../soapclient/SforceHeaderOptions.php");
$partner_wsdl = "../soapclient/new-partner.wsdl.xml";
$client = new SforcePartnerClient();
$client->createConnection($partner_wsdl);
$loginResult = $client->login('--user--', '--pass--');
$query = "Select Name FROM --table-- WHERE Id = '--id--'";
$response = $client->query($query);
echo'<pre>';print_r($response);echo'</pre>';
$queryResult = new QueryResult($response);
foreach ($queryResult->records as $qr) {
$content = $qr->fields->Name;
file_put_contents('--path--',$content);
}
?>

After more research I discovered a function in the SforcePartnerClient.php
$QueryResult = $this->sforce->query(array ('queryString' => $query))->result;
is returning different values depending on the box used.
Box 1 and 2:
<sf:Name>CFD란 무엇입니까?</sf:Name>
Box 3 and 4:
<sf:Name>CFD란 무엇입니까?</sf:Name>
When this is combined with / parsed / converted using the XML parser (later in the file) and WSDL file the XML parser strips any white space that occurs between consecutive &#xxxxx; s - I believe this is related to a bug https://bugs.php.net/bug.php?id=33240 to avoid this I suggest commenting out line 364 of SforcePartnerClient.php
xml_parser_set_option( $parser, XML_OPTION_SKIP_WHITE, 1 );
Unfortunately I do not know if this will have any adverse effects on other code using SforcePartnerClient.php.

Related

What is the encoder used by PHP $POST command for audio blob?

The first sequence of audio samples is recorded in JavaScript from the browser in mono, PCM format, 16-bit, and 96,000 Hz. I wrote this audio file as a blob to a server through a JavaScript FormData object using ajax.
These are the raw audio samples. When I retrieved the audio from the web server's directory listing, I received the second sequence of audio samples. It has been downsampled to 48000 Hz and the samples have been altered. What encoder is being used?
Server-side PHP code:
$input = $_FILES['audio']['tmp_name']; //audio blob
$output = $_POST['filename'];
if(move_uploaded_file($input, $output))
exit('Audio file Uploaded');
Client-side JavaScript code:
function send_audio(fn, blob){
var formData = new FormData();
formData.append("filename", fn);
formData.append('audio', blob);
$.ajax({
url:'save_audio.php',
type:'post',
data: formData,
contentType:false,
processData:false,
cache:false,
success: function(data){
console.log("send_audio success!");
console.log(data);
}
});
}

It's not clear whats going on but consider the following:
(1) Try 16-bits not 32-bits per sample (might help).
There is no need to record these audio samples at 32 bits per sample. That means you're using 4 bytes to record one sample value that fits within 2 bytes (16 bits).
Your highest value is 30465 (bytes x77 x01) and the lowest value is -32513 (bytes x80 xFF).
Notice those values fit two bytes? By using 4 bytes you make x00 x00 x77 x01. A waste to store extra00.
Two bytes hold max value of 65535. Or even 32767 if you divide into two parts as +32767 and -32767.
(2) Make sure there is no confusion between encoding of bytes (as hex), and digits (as integers).
Bytes are written as hex inside a file. Your values seem to mess up when converting from integers into hex format. Also 16 bit vs 32 bits for the
Some examples from your numbers compared to 2 bytes (16-bit) hex:
FIRST SEQ: SECOND SEQ:
value | hex | value | hex
100 00 64 612 02 64
553 02 29 41 00 29
203 00 CB -309 FE CB
-409 FE 67 103 00 67
68 00 44 836 03 44
953 03 B9 185 00 B9
154 00 9A -102 FF 9A
-127 FF 81 385 01 81
Notice the value of integer 100 when written as hex (byte values) it becomes byte x64, for some reason a 2 is added in front of that 64 now making it x0264 which converts to integer value of 612.
In the second sequence, where are these extra 02 and FE etc coming from? Why if hex of first seq has a 00, then for the second seq there's something extra now added? Even vice-versa (first = something, then second = something removed to become as 00)?
You need to double check your recording code. Does it add some increasing number to the recorded value?? Because, as you know, 2 is followed by 3 and that pattern is there in your second sequence. Also true that FE is followed by FF (and therefore previous to FE would be FD, FC, FB, FA, etc).
If you still can't explain it then:
Record at 16-bit with 44,100 Hz like the normal way others do it. Then later get complicated if the simple setup works okay. Why 32-bit and 96,000 Hz anyway? Is it a scientific test?
Use a Hex Editor to view your bytes from both sides (saved blob file via JS, against file downloaded via PHP) and compare any changes between the two files.
Show how/what your recording code is doing.
Share the bytes of "I retrieved the audio from the web server" (save as file) for checking.
Maybe someone else can help you. I don't use html forms so maybe there's an encoding option for the POST data type in there too?

PHP SNMP Real Walk returns bad characters

I read via PHP function snmp2_real_walk OID value '.1.3.6.1.2.1.17.4.3.1.1' for get MAC address from Cisco switch device. Problem is, that some results (randomly around the 50 results of the 200) return bad characters (yet I have found a mistake if I read only MAC addresses). Ie.:
correct output examples:
[Dot1dTpFdbAddres] => 30 05 5C 38 A7 8C
[Dot1dTpFdbAddres] => C0 7B BC 0E 56 18
wrong output examples:
[Dot1dTpFdbAddres] => ,v�?.b (HEX DUMP: 0 : 20 2c 76 8a 3f 2e 62 20 [ ,v.?.b ])
[Dot1dTpFdbAddres] => ,A8��7 (HEX DUMP: 0 : 20 2c 41 38 82 d9 37 20 [ ,A8..7 ])
[Dot1dTpFdbAddres] => xE�\ � (HEX DUMP: 0 : 20 78 45 c4 2a 5c 20 d9 20 [ xE.\ . ])
If I try read OID '.1.3.6.1.2.1.17.4.3.1.1' from program Getif, I have got correct results.
I can't find solution for this problem - can you please help me?
The tested solution that fails
PHP - Chnaged snmp2_real_walk by snmprealwalk
Changed encoding of files (UTF8, ANSI)
SNMP longer timeout and try add PHP sleep() function
PHP directive: snmp_set_oid_numeric_print(1)
PHP directive: snmp_set_quick_print(true)
PHP directive: snmp_set_enum_print(true)
PHP directive: snmp_set_valueretrieval(SNMP_VALUE_LIBRARY and SNMP_VALUE_PLAIN)
About server:
PHP Version 5.5.3
Apache/2.4.4 (Win32) OpenSSL/1.0.1e PHP/5.5.3
LAMP are equivalent to WAMP (error too)
New important informations:
Device return randomly none-hex value for hex values, ie.:
[iso.3.6.1.2.1.17.4.3.1.1.92.38.10.129.123.27] => Hex-STRING: 5C 26 0A 81 7B 1B
[iso.3.6.1.2.1.17.4.3.1.1.120.69.196.42.25.241] => Hex-STRING: 78 45 C4 2A 19 F1
[iso.3.6.1.2.1.17.4.3.1.1.120.69.196.42.27.169] => Hex-STRING: 78 45 C4 2A 1B A9
[iso.3.6.1.2.1.17.4.3.1.1.120.69.196.42.34.45] => STRING: "xE�*\"-"
[iso.3.6.1.2.1.17.4.3.1.1.120.172.192.142.199.214] => STRING: "x�����"
[iso.3.6.1.2.1.17.4.3.1.1.124.30.179.254.9.201] => Hex-STRING: 7C 1E B3 FE 09 C9
Mibs are imported correctly to Apache server, devices look fine.
Is there a way in PHP how to write for snmp2_real_walk() all returned values as Hex-STRING?
*Similar problems (without results):
http://forums.cacti.net/viewtopic.php?f=15&t=50806
http://comments.gmane.org/gmane.network.net-snmp.user/31569
http://forums.cacti.net/viewtopic.php?f=21&t=46068*
Thanks and best regards,
Petr

Use the following config:
snmp_set_valueretrieval(SNMP_VALUE_LIBRARY);
snmp_set_quick_print(1);
snmp_set_enum_print(0);

Complicated: Error uncompressing data from a packet stream zlib

I have a incoming stream that is compressed using the zlib functions, but I cannot tell the ending of the compressed data, so am having a lot of trouble getting the data out.
I also have a snippet of the source code where it is being uncompressed in AS3 flash, which should have been enough for me to figure it out, but I am at a loss.
Two files included:
http://falazar.com/projects/irnfll/version4/test//stackoverflow/as3_code.as.txt
http://falazar.com/projects/irnfll/version4/test//stackoverflow/bin_data_file
Snippet of the binary data, and what I know:
00 00 02 34 2c 02 31 78 5e ed dc cd 6e da 40 a5
21 19 40 f5 f2 c4 b7 e9 18 85 e1 5b 89 66 3d 42
31 95 90 cd 15 74 99 55 37 51 14 59 c9 a8 8c 54
0234 appears to be a size marker - 564
2c = 44, the code to match as3 COMMAND_WORLD_DATA, that is ok
0231 another size marker, always 3 smaller than above
785e - is the header marker for the zlib compress weakest compression, ZLIB_ENCODING_DEFLATE level 4
Later there is also a 789c which is another larger compressed block
I need to uncompress the two of these to move forward in project, thank you for your help.
There is also mention in the script of bigendian conversion, and am I not sure if I need to handle that.
I have written a couple scripts to try and solve this, including a php snippet that chops off the end until and loops trying to uncompress with no luck.
falazar.com/projects/irnfll/version4/test//stackoverflow/php_test.php.txt
Ideal solution in php or c#, but anything I can see that works will translate into another language easy enough.
(Using Free hex editor nero to view the binary)

You mean zlib.
Use PHP's gzuncompress() starting at each zlib header (e.g. 789c).

JPEG files uploaded to a PHP script arrive corrupt - but not all the time

I have a PHP script to which I upload JPEG images (through an HTML form). You can see the code here, but I will attempt to present the relevant parts in this post. The form is declared like so:
<form action="adm_addphoto.php" method="POST" enctype="multipart/form-data" name="myform">
The MAX_FILE_SIZE form field is set to 5MB:
<input type="hidden" name="MAX_FILE_SIZE" value=5242880>
The images I want to upload are about 3MB in size.
Once it's uploaded, I turn the image file into a GD jpeg:
$filename = $_FILES['file']['tmp_name'];
$myImage = imagecreatefromjpeg($filename);
Sometimes the upload works fine, and sometimes imagecreatefromjpeg emits warnings about the JPEG being corrupt. For example (line breaks added for readability):
Warning: imagecreatefromjpeg() [function.imagecreatefromjpeg]:
gd-jpeg, libjpeg: recoverable error: Corrupt JPEG data:
47 extraneous bytes before marker 0xd9 in
/path/adm_addphoto.php on line 97
Warning: imagecreatefromjpeg() [function.imagecreatefromjpeg]:
'/tmp/phpwlSS9x' is not a valid JPEG file in
/path/adm_addphoto.php on line 97
The thing is, this doesn't happen reliably. That is, if I try the same image several times in a row, sometimes it will upload successfully and sometimes there will be errors. And within the attempts that result in errors, the specifics of the error message will vary too. (With the particular photo that produced the above messages, the number of "extraneous bytes" is sometimes 47, sometimes 20, occasionally 68.)
What might be causing the files to arrive corrupt on some attempts but not others?
PS. I know that there is an ini setting that tells GD to try hard to work with corrupt JPEGs. But that's not the point, I want to know why the result of the upload is inconsistent.
PPS. Here are the values of some possibly-relevant PHP ini settings:
memory_limit .......... 128M
post_max_size ......... 8M
file_uploads .......... On
max_file_uploads ...... 20
upload_max_filesize ... 128M
upload_tmp_dir ........ no value
UPDATE: I added code to echo out the size and MD5 checksum of the file once it was uploaded. The size is always the correct filesize (the size of my local copy of the file). The MD5, however, varies between attempts. When the MD5 is correct (matches the MD5 of my local copy), there is no error message. When there is an error message, the MD5 is always different from expected. Also, this is just a perceived difference, but errors seemed to be occurring more frequently last night than this morning - only a minority of transfers this morning resulted in an error.
Here are some "bad" uploads (curiously, there were a couple of uploads that gave an MD5 checksum different to that expected but did not produce errors):
Correct MD5: f7b9587f39c7332e62a08adf34cefbd0
-----------------------------------------------------------------
38 extraneous bytes: 2a28c46079071d9d2e2fd49865b35d59
ce1c69f798953b201dcc35f85f3b29b4
b013a0428a71adff674a46e92372d46b
71 extraneous bytes: a271928f3559b6deaa19804704b5bcb3
92 extraneous bytes: cab2a10ad8535addaca3b19bcb607a30
premature end of data segment: 4514d39db1d94ab691d6da26c0832cdb
No error (!): 83b2e3624ddcfdeb3efc10be81631916
07d0a97b21d423fdeb4c6f88d76f8cd3
UPDATE: In response to Andre and Pekka: Here are links to two versions of the same picture. This one is the original copy of the image, as stored on my machine; I uploaded it the way I usually upload files to my webhosting, i.e. I used an FTP client. This one is the same image, uploaded using my script (but with a line added to move it to that location using move_uploaded_file()). The error message on that upload was "Corrupt JPEG data: 30 extraneous bytes...". You will notice that the bottom part of the image appears incorrect. I apologise for the large size of the images, but I can only use examples of the type of images that are giving me problems.
I should make it clear that this is a different image to the one that I used for the above MD5 checksums. I didn't think to use the same image.
UPDATE: Using the two images I linked above, I found the differences between the two files. There are two sequences of 32 bytes that are different in one image from the other. The files are otherwise identical.
n.b. the first character is numbered as character 1, not character 0
n.b. 2116624 and 2493456 are both 16 modulo 32.
The difference between them is 368 * 1024.
character 2116624 through line 2116655 (32 characters):
original image:
3c bc 20 19 eb 93 cd 34 db 93 68 7c 8d 5e 37 d4
d3 84 91 70 7e 7c 82 3e e9 e8 3d eb 2e ff 00 ce
bad image:
89 c9 54 a6 e5 3d f4 fc 8e 7f 68 d5 47 14 41 f6
55 11 7d a6 55 24 a8 e0 f7 a8 a4 43 06 18 c2 c1
character 2493456 through character 2493487 (32 characters):
original image:
3d a4 61 37 19 75 63 2e 51 62 ca 07 e0 9e 49 35
5d 65 8d 22 01 7f 5e b4 a7 19 3a 7a ef 72 1c e3
bad image:
4f 99 26 39 6a db d9 cd 3e 5b ec 63 46 3c b4 ef
2d b6 30 35 0f 12 a1 b6 9a 11 68 1a 69 07 0c 49

Are you sure there is no problem with your internet connection? Have you tried it from a different connection or even ISP?

Its a stretch, but one possible issue I can think of is that you're working on the temporary file, try using move_uploaded_file to move the file to somewhere you've specified, and see if that makes a difference?

Rather than attempt a conversion at this point, dump the temp file info and compare the file size etc.
Because of the random nature of this, it is probably some form of transmission problem. You can probably rule out your own hardware by trying to move a decent size zip file (for example) from the computer you are running to something else on the local network (multiple times) and then seeing if the zip file passes the crc check or becomes corrupted.
Also: are you running a firewall appliance which sniffs http traffic? (e.g. untangle)

PHP Library to Generate xdot Files From dot Files

Apologies in advance is I'm misusing terminology, and corrections are appreciated. I'm fascinated by directed graphs, but I never has the math/cs background to know what they're really about, I just like the tech because it makes useful diagrams.
I'm trying to create a web application feature that will render a dynamic directed graph to the browser. I recently discovered Canviz, which is a cavas based xdot renderer, which I'd like to use.
Canviz is awesome, but it renders xdot files, which (appear?) to contain all the complicated positioning logic
/* example xdot file */
digraph abstract {
graph [size="6,6"];
node [label="\N"];
graph [bb="0,0,1250,612",
_draw_="c 9 -#ffffffff C 9 -#ffffffff P 4 0 -1 0 612 1251 612 1251 -1 ",
xdotversion="1.2"];
S1 [pos="464,594", width="0.75", height="0.5", _draw_="c 9 -#000000ff e 464 594 27 18 ", _ldraw_="F 14.000000 11 -Times-Roman c 9 -#000000ff T 464 588 0 15 2 -S1 "];
10 [pos="409,522", width="0.75", height="0.5", _draw_="c 9 -#000000ff e 409 522 27 18 ", _ldraw_="F 14.000000 11 -Times-Roman c 9 -#000000ff T 409 516 0 15 2 -10 "];
S1 -> 10 [pos="e,421.43,538.27 451.52,577.66 444.49,568.46 435.57,556.78 427.71,546.5", _draw_="c 9 -#000000ff B 4 452 578 444 568 436 557 428 546 ", _hdraw_="S 5 -solid c 9 -#000000ff C 9 -#000000ff P 3 430 544 421 538 425 548 "];
}
The files I'm generating with my application are dot files, which contain none of this positioning logic
digraph g {
ranksep=6
node [
fontsize = "16"
shape = "rectangle"
width =3
height =.5
];
edge [
];
S1 -> 10
}
I'm looking for a PHP library that can convert my dot file into an xdot file that can be consumed by Canviz. I realize that the command line program dot can do this, but this is for a redistributable PHP web application, and I'd prefer to avoid any binaries as dependencies.
My core problem: I'm generating dot files based on simple directed relationships, and I want to display the visual graph to end users in a browser. I'd like to do this without having to rely on the presence of a particular binary program on the server. I think the best solution for this is Canviz+PHP to generate xdot files. I'm looking for a PHP library that can do this. However, I'm more than open to other solutions.

Have you looked at Image_GraphViz ? It's really just a wrapper for the binary, but from the look of things, I don't think you'll find something better and this at least keeps you from having to do direct command line calls from your PHP script.
$dot_obj = new Image_GraphViz();
$dot_obj -> load('path/to/graph.gv');
$xdot = $dot_obj -> fetch('xdot');

We Keep Coding

PHP, A popular general-purpose scripting language that is especially suited to web development.