php sscanf doesn't parse string properly - php

I am developing an ADIF parser, and at one point in the parsing process I use PHP's sscanf() function. The string I parse looks like this: "QSO_DATE:8:D>20070909", and I need to extract "QSO_DATE", "8" and "20070909" from it, so I use this code:
sscanf("QSO_DATE:8:D>20070909", "%s:%d:D>%d")
But the returned array looks like this:
Array
(
[0] => QSO_DATE:8:D>20070909
[1] =>
[2] =>
)
What is wrong? Maybe there is a more efficient way to parse a bunch of records like these:
<CALL:7>EM200FT<QSO_DATE:8:D>20140324<TIME_ON:4>1657<BAND:3>12M<MODE:5>PSK63<RST_SENT:3>599<RST_RCVD:0><QSL_SENT:1>Y<QSL_SENT_VIA:1>E<APP_EQSL_AG:1>Y<GRIDSQUARE:6>KN45kj<EOR>
<CALL:5>9V1SV<QSO_DATE:8:D>20140328<TIME_ON:4>1019<BAND:3>10M<MODE:4>JT65<RST_SENT:6>VK4CMV<RST_RCVD:0><QSL_SENT:1>Y<QSL_SENT_VIA:1>E<QSLMSG:54>Thank you and I confirm your SWL report, 73's de Siva.<APP_EQSL_AG:1>Y<GRIDSQUARE:6>OJ11ui<EOR>
<CALL:5>RA6DQ<QSO_DATE:8:D>20140328<TIME_ON:4>1019<BAND:3>10M<MODE:4>JT65<RST_SENT:3>599<RST_RCVD:0><QSL_SENT:1>Y<QSL_SENT_VIA:1>E<QSLMSG:3>73!<APP_EQSL_AG:1>Y<GRIDSQUARE:6>KN85nf<EOR>

%s matches any characters except whitespace (including colons, digits, chevrons, etc.), and sscanf grabs greedily, so %s swallows the whole string. More precise alternatives such as %[A-Z_] or %[^:] may serve you better than %s:
$result = sscanf("QSO_DATE:8:D>20070909", "%[^:]:%d:D>%d");
var_dump($result);
This uses %[^:] to scan for a run of any characters other than a colon.
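As for parsing whole records, one possible approach is sketched below. It assumes every data field has the form <NAME:LENGTH> or <NAME:LENGTH:TYPE> and that LENGTH counts bytes. The function name parse_adif_record and the shortened sample record are made up for illustration, and file headers and multi-record files are not handled.

```php
<?php
// Hypothetical helper: parse one ADIF record into a NAME => value array.
function parse_adif_record(string $record): array
{
    $fields = [];
    $offset = 0;
    // Matches a field header such as <CALL:7> or <QSO_DATE:8:D>.
    // <EOR> has no length part, so it never matches and ends the loop.
    $pattern = '/<([A-Za-z0-9_]+):(\d+)(?::[A-Za-z])?>/';

    while (preg_match($pattern, $record, $m, PREG_OFFSET_CAPTURE, $offset) === 1) {
        $name   = strtoupper($m[1][0]);
        $length = (int) $m[2][0];
        // The data starts right after the closing ">" of the header.
        $start  = $m[0][1] + strlen($m[0][0]);

        $fields[$name] = substr($record, $start, $length);
        // Skip the data itself, so a ">" or "<" inside a value is never re-scanned.
        $offset = $start + $length;
    }

    return $fields;
}

// Shortened sample record (assumption, based on the records in the question).
$record = '<CALL:5>RA6DQ<QSO_DATE:8:D>20140328<TIME_ON:4>1019<RST_RCVD:0><QSLMSG:3>73!<EOR>';
$fields = parse_adif_record($record);
print_r($fields);
```

Reading the value by its declared length, instead of scanning up to the next "<", keeps free-text fields such as QSLMSG intact even if they contain chevrons.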

Related

getCsvControl() always returns same delimiter - PHP

$file = new SplFileObject('D:\BackUp\addressbook.csv');
print_r($file->getCsvControl());
What I am trying to do is find the delimiter of a CSV file using PHP. The addressbook.csv file looks like
"id";"firstname";"lastname";"phone";"email"
"1";"jishan";"ishrak";"17878";"jishan.ishrak#gmail.com"
and another file, addressbook1.csv, looks like
"id","firstname","lastname","phone","email"
"1","jishan","ishrak","17878","jishan.ishrak#gmail.com"
One is separated by "," and the other by ";", but the function
getCsvControl()
always returns an array like
Array ( [0] => , [1] => " )
That is, index [0] is always "," for both files.
Is there a way to solve this issue?
This is not a bug. SplFileObject::getCsvControl() was never intended to detect the delimiter of a CSV file. It only returns the default control characters, or the ones previously set with SplFileObject::setCsvControl(). Those control characters are then used when nothing is passed to the SplFileObject::fgetcsv() method.
OK, it is badly documented, but my first thought was that the method would never detect the characters, and a look into the PHP source code confirmed this.
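Since getCsvControl() does not detect anything, the delimiter has to be guessed from the data itself. Below is a minimal sketch, assuming the first line is representative and that the right delimiter is the candidate that yields the most fields. The function guess_csv_delimiter and its candidate list are hypothetical and not part of SPL.

```php
<?php
// Hypothetical helper: pick the candidate delimiter that splits the line
// into the largest number of fields.
function guess_csv_delimiter(string $line, array $candidates = [',', ';', "\t", '|']): string
{
    $best = $candidates[0];
    $bestCount = 0;

    foreach ($candidates as $delimiter) {
        // str_getcsv() honours the enclosure, so delimiters inside quotes are ignored.
        $count = count(str_getcsv($line, $delimiter, '"', '\\'));
        if ($count > $bestCount) {
            $bestCount = $count;
            $best = $delimiter;
        }
    }

    return $best;
}

// Header lines taken from the two files in the question.
$semicolonLine = '"id";"firstname";"lastname";"phone";"email"';
$commaLine     = '"id","firstname","lastname","phone","email"';

var_dump(guess_csv_delimiter($semicolonLine));
var_dump(guess_csv_delimiter($commaLine));
```

The guessed character could then be handed to SplFileObject::setCsvControl() before the file is read with fgetcsv().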
Probably this is a bug?
As the first comment on the PHP documentation page (posted a year ago) notes, this function seems to always return the same delimiter.
UPDATE
This is not a bug; see Pazi ツ's answer.

php read list from website into array

I have a list of product codes that I wish to read into an array using PHP.
The list is to be fetched from a website, has over 700 items, and looks something like this:
4310ABC
4590DEF
8950GHK
What I want to do is put every code into a PHP array like so:
php_array ( [0] => 4310ABC
[1] => 4590DEF
[2] => 8950GHK)
This is what i have:
$php_array = file_get_contents('http://anysite.net/product_codes.php');
print_r (explode("\n",$php_array));
But my result is:
Array ( [0] => 4310ABC
4590DEF
8950GHK)
I have tried explode and preg_split('/[\n\r]+/', $php_array), but nothing seems to do the trick. Can anyone give me some pointers? Thanks!
The lines are separated by a br tag, so use this instead:
$php_array = file_get_contents('http://anysite.net/product_codes.php');
print_r (explode("<br>",$php_array));
Don't forget to change the br to however it is spelled in the document you are fetching. For example, it is often written like this:
<br />
which is the XHTML-style self-closing spelling.
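To avoid depending on one exact spelling of the tag, the split can be done with a pattern instead. This is a minimal sketch, assuming the page separates the codes with some form of br tag; the $html string is a stand-in for the result of file_get_contents().

```php
<?php
// Stand-in for the fetched page body (assumption: codes separated by br tags).
$html = "4310ABC<br>4590DEF<br />\n8950GHK<br/>";

// Split on <br>, <br/> or <br />, case-insensitively.
$pieces = preg_split('/<br\s*\/?>/i', $html);

// Trim stray whitespace and newlines, then drop empty entries and reindex.
$codes = array_values(array_filter(array_map('trim', $pieces), 'strlen'));

print_r($codes);
```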
It depends on how your PHP file echoes the three values, so I am not sure how the line breaks are being interpreted. Try echoing the values with no line breaks, separated by some other character such as '*', then explode on that character and see if that works.

PHP Regex search failing when adding a second capture group

I have the following named capture group that works exactly as intended. It grabs the last date/time of a specific format from a string of text.
$re = "/.*(?<date>[0-9]+\\/[0-9]+\\/[0-9]+ [0-9]+:[0-9]+:[0-9]{1,2} (AM|PM))/s";
I want to capture the user ID that follows so I changed it to the following
$re = "/.*(?<date>[0-9]+\\/[0-9]+\\/[0-9]+ [0-9]+:[0-9]+:[0-9]{1,2} (AM|PM)) (?<name>\\w+)/s";
However when I do so it breaks both values giving the following error
Notice: Undefined index: date in Q:\XAMPP\htdocs\index.php on line 272
The Error stems from the preg_match matches array being blank. Print_r confirms the array does not contain any information once the regex is changed to the second value.
Both of these work fine in external sites as the link below to Regex101 shows
http://regex101.com/r/zO0mK0/4
Using PHP 5.5.9
So the question is: am I missing something in this regex that makes it behave differently between the external site and my own code, or does the regex work, meaning the problem is purely in my PHP?
$LastDateRegex = "/.*(?<date>[0-9]+\\/[0-9]+\\/[0-9]+ [0-9]+:[0-9]+:[0-9]{1,2} (AM|PM))/s";
preg_match($LastDateRegex, $arr2['WorkLog'], $LastDateMatches);
$Modsecs = (strtotime($ts) - strtotime($LastDateMatches['date']))%60;
This is an example of the code being used. As mentioned above, I know the error stems from the $LastDateMatches array being empty for the second regex example, however the code works 100% with the first so there is something between the two that causes the issue.
You have an extra \ here: \\w+ and \\ at a few places, if that makes a difference?
I'm not quite sure what you want returned, but I ran this regular expression, where $str is the text you have in your regex101 link:
$regex = "/.*(?<date>\d+\/\d+\/\d+ \d+:\d+:\d{1,2} (?:AM|PM)) (?<name>\w+)/s";
preg_match($regex, $str, $LastDateMatches);
Output,
Array
(
[0] => ...
[date] => 6/20/2014 10:04:32 PM
[1] => 6/20/2014 10:04:32 PM
[name] => ihugett
[2] => ihugett
)
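Whatever the cause turns out to be, the "Undefined index" notice can be avoided by checking what preg_match() returned before reading the named groups. The sketch below uses made-up sample log text (only the last timestamp and user name come from the output above) and is not a diagnosis of the original problem.

```php
<?php
$regex = '/.*(?<date>\d+\/\d+\/\d+ \d+:\d+:\d{1,2} (?:AM|PM)) (?<name>\w+)/s';

// Hypothetical work log; the real text sits behind the regex101 link.
$log = "6/19/2014 9:01:02 AM jdoe opened ticket\n"
     . "6/20/2014 10:04:32 PM ihugett closed ticket";

$date = null;
$name = null;

$result = preg_match($regex, $log, $matches);
if ($result === 1) {
    // The greedy .* makes the groups capture the LAST date/user pair.
    $date = $matches['date'];
    $name = $matches['name'];
} elseif ($result === 0) {
    echo "No match: the input does not contain the expected pattern\n";
} else {
    // false means the regex engine itself failed (e.g. a backtrack limit).
    echo "Regex error code: " . preg_last_error() . "\n";
}

var_dump($date, $name);
```

Separating "no match" from "engine error" shows whether the input text or the pattern is at fault.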

string delimiter ajax call

So a PHP file returns a string (to an AJAX call) like this:
$output = $sessID."###".$sessEmail."###".$sessFirstName."###".$sessLanguage."###".$sessRememberMe;
and in JavaScript I do:
if (reply.indexOf("###") >= 0) {
arrayReply = reply.split("###");
user.ID = arrayReply[0];
user.Email = arrayReply[1];
user.FirstName = arrayReply[2];
user.Language = arrayReply[3];
user.RememberMe = arrayReply[4];
}
A problem can arise when parts of the reply contain the delimiter I use, "###". What can I do in such a situation? Making the delimiter more complex or rare is not a solution, in my opinion.
PS: I did try JSON, but it's WAY SLOWER server side.
FINAL EDIT:
Server-side JSON is slower, and the same goes for the client side; however, it's not going to be a bottleneck (430ms for 100.000 calls), and as Jules said below, there is no need to reinvent the wheel. There was one more solution: bin2hex() in PHP (which reduced the time from 430ms to 240ms) and then getting the string back in JavaScript with a hex2string function; however, that is not worth the effort. JSON it is. Thank you all!
If, as you say, encoding as JSON is slower, then you could try the following:
$output = '"' . some_kind_of_escape_function($sessID).'","'.some_kind_of_escape_function($sessEmail).'","'.some_kind_of_escape_function($sessFirstName).'","'.some_kind_of_escape_function($sessLanguage).'","'.$sessRememberMe.'"';
Of course, replace some_kind_of_escape_function with the appropriate PHP function (e.g. addslashes or mysql_real_escape_string). It has been a while since I've done PHP development, so choose the one that best suits your needs.
Then it's a simple case of splitting by the comma and removing the quotes.
One option is to use a JSON object instead.
For PHP (using json_encode):
$output = json_encode(array(
"sessid" => $sessID,
"sessEmail" => $sessEmail,
"sessFirstName" => $sessFirstName,
"sessLanguage" => $sessLanguage,
"sessRememberMe" => $sessRememberMe
));
For JS (using jQuery method):
$.getJSON("/path/to/script.php", function(reply) {
user.ID = reply.sessid;
user.Email = reply.sessEmail;
user.FirstName = reply.sessFirstName;
user.Language = reply.sessLanguage;
user.RememberMe = reply.sessRememberMe;
});
Otherwise, you can use any other delimiter that is unlikely to appear in the fields (or that you can strip from the fields beforehand). One example is the newline character (\n).
Why develop your own format if there is already one?
Use JSON:
$output = json_encode(array('sessionID'=>$sessID,'sessionEmail'=>$sessEmail,'sessionFirstName'=>$sessFirstName,'sessLanguage'=>$sessLanguage,'sessRememberMe'=>$sessRememberMe));
For the JavaScript side, see
http://www.javascriptkit.com/dhtmltutors/ajaxgetpost4.shtml
or, if you're using jQuery etc., your framework is very likely to have some kind of built-in functionality, such as http://api.jquery.com/jQuery.getJSON/
However, if you want to keep your ### delimiter, I'd suggest reducing it to just "#" for the sake of simplicity and space. Then introduce what is called an escape character, such as "\". In a prepass you parse your input and replace all occurrences of # with \#, and do the reverse on output. You can then split your string with a special regex that splits only on # and not on \#.
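The escaping scheme can be sketched as follows. The names join_escaped and split_escaped are hypothetical, and the sketch also escapes the backslash itself, which the description above leaves out but which is needed for an unambiguous round trip. It uses a character scan instead of a regex, because a plain "not preceded by a backslash" test would misread a field that ends in an escaped backslash.

```php
<?php
// Hypothetical helper: escape "\" and "#" in every field, then join with "#".
function join_escaped(array $parts): string
{
    $escaped = array_map(
        // strtr() with an array never re-scans text it has already replaced.
        fn (string $p): string => strtr($p, ['\\' => '\\\\', '#' => '\\#']),
        $parts
    );
    return implode('#', $escaped);
}

// Hypothetical helper: split on unescaped "#" and undo the escaping.
function split_escaped(string $joined): array
{
    $parts = [''];
    $len = strlen($joined);

    for ($i = 0; $i < $len; $i++) {
        $ch = $joined[$i];
        if ($ch === '\\' && $i + 1 < $len) {
            // Escape character: take the next byte literally.
            $parts[count($parts) - 1] .= $joined[++$i];
        } elseif ($ch === '#') {
            // Unescaped delimiter: start a new field.
            $parts[] = '';
        } else {
            $parts[count($parts) - 1] .= $ch;
        }
    }

    return $parts;
}

$fields = ['id#1', 'a\\b', '', 'en'];
$wire   = join_escaped($fields);
$back   = split_escaped($wire);

var_dump($wire, $back === $fields);
```

In the scenario from the question the splitting half would have to be ported to JavaScript on the client, which is part of why json_encode() with a JSON-aware client is the simpler route.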
You can use json.
http://php.net/manual/en/function.json-encode.php
How to JSON decode array elements in JavaScript?

Regex not finding all variables

I'm parsing some HTML that I have generated in a form. This is a token system. I'm trying to get the information with a regexp later on, but somehow it only turns up the first of the matches. I found a regexp on the web that did almost what I needed, except that it cannot process multiple occurrences.
I want to be able to replace the content found with content that was generated from the found string.
So, here is my code:
$result = preg_replace_callback("/<\/?\w+((\s+(\w|\w[\w-]*\w)(\s*=\s*(?:\".*?\"|'.*?'|[^'\">\s]+))?)+\s*|\s*)\/?>\[\*.*\*\]\<\/[a]\>/i", array(get_class($this), 'embed_video'), $str);
public function embed_video($matches)
{
print_r($matches);
return $matches[1] . 'foo';
}
I really need only the attributes, since they contain all of the valuable information. The contents of the tag are used only to find the token. This is an example of what needs to happen:
<a type="TypeOfToken1" id="IdOfToken1">[*SomeTokenTitle1*]</a>
<a type="TypeOfToken2" id="IdOfToken2">[*SomeTokenTitle2*]</a>
After the preg_replace_callback() this should be returned:
type="TypeOfToken1" id="IdOfToken1" type="TypeOfToken2" id="IdOfToken2"
The callback function does output the matches, but they are not replaced with its return value, so $result stays the same after preg_replace_callback(). What could be the problem?
An example with real data:
Input:
<p><a id="someToken1" rel="someToken1">[*someToken1*]</a> sdfsdf <a id="someToken2" rel="someToken2">[*someToken2*]</a></p>
returned $result:
id="someToken1" rel="someToken1"foo
Output of the print_r() in the callback function:
Array ( [0] => [*someToken1*] sdfsdf [*someToken2*] [1] => id="someToken1" rel="someToken1" [2] => rel="someToken1" [3] => rel [4] => ="someToken1" )
I think that it is not returning both of the strings it should.
For anyone else stumbling into a problem like this, try checking your regexp and its modifiers.
Regarding the parsing of the document, I'm still doing it, just not with HTML tags. I have instead gone with something more text-like that can be parsed more easily. In my case: [*TokenName::TokenDetails*].
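One thing to check in a pattern like the one above is the greedy .* between [* and *]: it can run from the first token all the way to the last one, which would be consistent with the single match seen in the print_r() output. The sketch below uses a simplified, hypothetical pattern with a lazy .*? instead. It assumes the tokens are always wrapped in a plain a tag and is not the poster's actual fix.

```php
<?php
// Real-data input from the question.
$str = '<p><a id="someToken1" rel="someToken1">[*someToken1*]</a> sdfsdf '
     . '<a id="someToken2" rel="someToken2">[*someToken2*]</a></p>';

// Group 1 captures the attribute text; .*? stops at the FIRST closing *].
$pattern = '/<a\s+([^>]*)>\[\*.*?\*\]<\/a>/i';

$result = preg_replace_callback(
    $pattern,
    function (array $matches): string {
        // Replace the whole tag with just its attributes.
        return trim($matches[1]);
    },
    $str
);

echo $result, "\n";
```

With the lazy quantifier the callback runs once per token, so both tags are replaced separately.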
