Polymorph String To Pattern - php

I'm working on an issue where users (truck drivers in this case) use SMS to send in information about work status. I want to keep the keying simple as not all users have smart phones so I have adopted some simple short codes for their input. Here are some examples and their meanings:
P#123456-3 (This is for picking up load 123456-3)
D#456789-1 (For the dropping of load 456789-1)
L#345678-9 (Load 345678-9 is going to be late)
This is pretty simple but users (and truck drivers) being what they are will key the updates in somewhat deviant manners such as:
#D 456789-1
D# 456789 - 1
D#.456789-1 This load looks wet to me do weneed to cancelthis order
You can pretty much come up with a dozen other permutations and it's not hard for me to catch and fix those that I can imagine.
I mostly use regular expressions to test the input against all my imagined "bad" patterns and then extract what I assume are the good parts, reassembling them into the correct order.
It's the new errors that cause me problems, so I got to wondering whether there is a more generic method where I can pass a "pattern" and a "message" to a function that would do its best to turn the "message" into something matching the "pattern".
My searches have not found anything that really fits what I'm trying to do and I'm not even sure if there is a good general way to do this. I happen to be using PHP for this implementation but any type of example should help. Do any of you have a method?

If the user has problems with your software, fix the software, not the user!
The problem arises because your format looks unnecessarily complicated. Why do you need the hash in the first place? How about simplifying it down to the following:
operation-code maybe-space load-number maybe-space and comment
Operation codes are assigned to different phone keys, so that J, K and L mean the same thing. Load-numbers can be sent as digits and as letters as well, e.g. agja means 2452. It's hard for the user to make a mistake using this format.
Here's some code to illustrate this approach:
function parse($msg) {
$codes = array(
3 => 'DROP',
5 => 'LOAD',
// etc
);
preg_match('~(\S)\s*(\S+)(\s+.+)?~', $msg, $m);
if(!$m)
return null; // cannot parse
// Map punctuation and letters to phone-keypad digits (pqrs=7, tuv=8, wxyz=9).
$a = '.,"?!abcdefghijklmnopqrstuvwxyz';
$d = '1111122233344455566677778889999';
return array(
'opcode' => $codes[strtr($m[1], $a, $d)],
'load' => intval(strtr($m[2], $a, $d)),
'comment' => isset($m[3]) ? trim($m[3]) : ''
);
}
print_r(parse(' j ww03 This load looks wet to me'));
//[opcode] => LOAD
//[load] => 9903
//[comment] => This load looks wet to me
print_r(parse('dxx0123'));
//[opcode] => DROP
//[load] => 990123
//[comment] =>

Try something like this:
function parse($input) {
// Clean up your input: 'D#.456789 - 1 foo bar' to 'D 456789 1 foo bar'
$clean = trim(preg_replace('/\W+/', ' ', $input));
// Take first 3 words.
list($status, $loadId1, $loadId2) = explode(' ', $clean);
// Glue back your load ID to '456789-1'
$loadId = $loadId1 . '-' . $loadId2;
return compact('status', 'loadId');
}
Example:
$inputs = array(
'P#123456-3',
'#D 456789-1',
'D# 456789 - 1',
'D#.456789-1 This load looks wet to me do weneed to cancelthis order',
);
echo '<pre>';
foreach ($inputs as $s) {
print_r(parse($s));
}
Output:
Array
(
[status] => P
[loadId] => 123456-3
)
Array
(
[status] => D
[loadId] => 456789-1
)
Array
(
[status] => D
[loadId] => 456789-1
)
Array
(
[status] => D
[loadId] => 456789-1
)

First, remove stuff that shouldn't be there:
$str = preg_replace('/[^PDL\d-]/i', '', $str);
That gives you the following normalised results:
D456789-1
D456789-1
D456789-1ldlddld
Then, attempt to match the data you want:
if (preg_match('/^([PDL])(\d+-\d)/i', $str, $match)) {
$code = $match[1];
$load = $match[2];
} else {
// uh oh, something wrong with the format!
}
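The two steps above can be combined into one helper that returns the canonical short code. This is only a minimal sketch: the function name normalize_status() and the choice to return null on failure are assumptions, and it only handles messages where the status letter comes before the load number.

```php
<?php
// Hypothetical helper combining the strip-then-match steps shown above.
function normalize_status(string $message): ?string
{
    // Step 1: drop everything except the status letters, digits and hyphens.
    $stripped = preg_replace('/[^PDL\d-]/i', '', $message);

    // Step 2: the cleaned string must start with a status letter and a load number.
    if (preg_match('/^([PDL])(\d+)-(\d)/i', $stripped, $match)) {
        return strtoupper($match[1]) . '#' . $match[2] . '-' . $match[3];
    }

    return null; // nothing usable was found
}

var_dump(normalize_status('D#.456789-1 This load looks wet to me'));
```

Anything that comes back as null can be answered with an SMS asking the driver to resend the update.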

Something like
/^[#\s]*([PDL])[#\s]*(\d+[\s-]+\d)/
or to be even more relaxed,
/^[^\d]*([PDL])[^\d]*(\d+)[^\d]+(\d)/
would get you what you want. But I'd prefer HamZa's comment as a solution: throw it back and tell them to get their act together :)

Related

PHP sort list of subdomains by domain

I have a list of domains (array)
sub1.dom1.tld1
sub2.dom2.tld2
sub1.sub2.dom1.tld1
sub3.dom1.tld3
I want to achieve the following:
dom1.tld1
-> sub1.dom1.tld1
-> sub2.dom1.tld1
--> sub1.sub2.dom1.tld1
dom2.tld2
-> sub2.dom2.tld2
dom1.tld3
-> sub3.dom1.tld3
I have tried to adapt this, but it doesn't really fit:
How to alphabetically sort a php array after a certain character in a string
I would appreciate any kind of help.
I've had to attack a similar headache before. In the short term I flip the order of the domain components and use a hidden sorting column in a table/view:
$sortstring = implode('.',array_reverse(explode('.', $domain)));
In the long term I saved the reverse format of the domain records before saving changes to the DB into a computed field/column so that it didn't have to be re-computed every time the domain list is viewed.
If you don't want that sub-domain, just remove the last element of the array after the flip....
You can proceed like this:
$array=array(
'sub1.dom1.tld1',
'sub2.dom2.tld2',
'sub1.sub2.dom1.tld1',
'sub2.sub2.dom1.tld1',
'sub3.sub2.dom1.tld1',
'sub3.dom1.tld3');
function cmp($a,$b){
$a=array_reverse(explode('.',$a));
$b=array_reverse(explode('.',$b));
$ca=count($a);
$cb=count($b);
for($i=0,$c=min($ca,$cb);$i<$c;$i++){
$result=strnatcmp($a[$i],$b[$i]);
if($result!==0) return $result;
}
// All shared labels are equal: the shorter (parent) domain sorts first.
return $ca-$cb;
}
usort($array,'cmp');
print_r($array);
and the output is:
Array
(
[0] => sub1.dom1.tld1
[1] => sub1.sub2.dom1.tld1
[2] => sub2.sub2.dom1.tld1
[3] => sub3.sub2.dom1.tld1
[4] => sub2.dom2.tld2
[5] => sub3.dom1.tld3
)
Here is an approach similar to @Elementary's answer, combined with @CBO's:
$domains = [
'sub.bbb.com',
'www.aaa.com',
'*.zzz.com',
'aaa.com',
'*.sub.bbb.com',
'zzz.com',
'beta.bbb.com',
'bbb.com',
'aaa.fr',
];
// @see https://stackoverflow.com/a/61461912/1731473
$computeDomainToSort = static function (string $domain): string {
return \implode(
'.',
array_reverse(
explode('.', $domain,
// Keep the base domain.tld collapsed for comparison.
substr_count($domain, '.')
)
)
);
};
\usort($domains, static function (string $domain1, string $domain2) use ($computeDomainToSort): int {
$domain1 = $computeDomainToSort($domain1);
$domain2 = $computeDomainToSort($domain2);
return strnatcmp($domain1, $domain2);
});
That way, given domains will be sorted like this:
aaa.com
www.aaa.com
aaa.fr
bbb.com
beta.bbb.com
sub.bbb.com
*.sub.bbb.com
zzz.com
*.zzz.com
The main difference is in the $computeDomainToSort lambda function, where I keep the base domain.tld in one piece for a more natural sort order.
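None of the answers prints the indented hierarchy from the question, so here is a rough sketch of one way to do it. The function name domain_tree_lines() is made up for illustration, and the code assumes that every registrable domain has exactly two labels (name.tld), which is not true for suffixes such as co.uk.

```php
<?php
// Hypothetical helper: returns the "->" indented lines from the question.
function domain_tree_lines(array $domains): array
{
    $all = [];
    foreach ($domains as $domain) {
        $labels = explode('.', $domain);
        // Register the domain and every implied parent down to "name.tld".
        while (count($labels) >= 2) {
            $all[implode('.', $labels)] = array_reverse($labels);
            array_shift($labels);
        }
    }

    // Compare label by label, starting at the TLD; parents sort before children.
    uasort($all, function (array $a, array $b): int {
        $shared = min(count($a), count($b));
        for ($i = 0; $i < $shared; $i++) {
            $result = strnatcmp($a[$i], $b[$i]);
            if ($result !== 0) {
                return $result;
            }
        }
        return count($a) <=> count($b);
    });

    $lines = [];
    foreach ($all as $domain => $labels) {
        $depth = count($labels) - 2; // 0 for "name.tld"
        $lines[] = $depth === 0 ? $domain : str_repeat('-', $depth) . '> ' . $domain;
    }
    return $lines;
}

echo implode("\n", domain_tree_lines([
    'sub1.dom1.tld1',
    'sub2.dom2.tld2',
    'sub1.sub2.dom1.tld1',
    'sub3.dom1.tld3',
])), "\n";
```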

How to parse function name with arguments in PHP?

I am building a module for my own PHP framework, so my question is very specific. It's difficult to explain, so I will go ahead and show it in the code below.
I have a little piece of PHP in a $code variable, it looks like this:
$code = "___echo(TODAY_IS, date('j.n.Y', time()), time());";
What I need is to parse this $code variable and I want to get this result:
$result = array(
'function_name' => "___echo",
'arguments' => array(
0 => "TODAY_IS",
1 => "date('j.n.Y', time())",
2 => "time()"
)
);
I have tried using some regexes, but none of them worked sufficiently well. I also tried using Tokenizer, but I wasn't successful with that either.
Thanks for any hints or help in advance.
Here is a shot using PHP-Parser. It's likely going to be more useful than tokenizer or some freaky regex.
Example:
$code = "___echo(TODAY_IS, date('j.n.Y', time()), time());";
$parser = new PhpParser\Parser(new PhpParser\Lexer);
$prettyPrinter = new PhpParser\PrettyPrinter\Standard;
$statements = $parser->parse("<?php $code");
$result['function_name'] = $statements[0]->name->toString();
foreach ($statements[0]->args as $arg) {
$result['arguments'][] = $prettyPrinter->prettyPrint(array($arg));
}
var_export($result);
Output:
array (
'function_name' => '___echo',
'arguments' =>
array (
0 => 'TODAY_IS',
1 => 'date(\'j.n.Y\', time())',
2 => 'time()',
),
)
The token_get_all() function is what you need here:
token_get_all("<?php ___echo(TODAY_IS, date('j.n.Y', time()), time());")
This returns a list of tokens parsed from the given string. See the tokens documentation for recognizing the items of the list.
In my opinion, a tokenizer-based solution should be preferred over any regular expression built from what the PHP manual says about syntax.
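To make the tokenizer idea concrete, here is a minimal sketch that walks the token list and splits the arguments at top-level commas. The function name parse_call() is hypothetical, and the code assumes a single plain function call with a non-namespaced name; it tracks only parentheses and square brackets.

```php
<?php
// Hypothetical helper: split "name(arg, arg, ...)" into name and arguments.
function parse_call(string $code): array
{
    $name = null;
    $args = [];
    $current = '';
    $depth = 0;

    foreach (token_get_all('<?php ' . $code) as $token) {
        // Tokens are either [id, text, line] arrays or single-character strings.
        [$id, $text] = is_array($token) ? [$token[0], $token[1]] : [null, $token];

        if ($name === null) {
            if ($id === T_STRING) {
                $name = $text; // first identifier is taken as the function name
            }
            continue;
        }

        if ($id === null && ($text === '(' || $text === '[')) {
            $depth++;
            if ($depth === 1) {
                continue; // opening parenthesis of the call itself
            }
        } elseif ($id === null && ($text === ')' || $text === ']')) {
            $depth--;
            if ($depth === 0) { // closing parenthesis of the call
                if (trim($current) !== '') {
                    $args[] = trim($current);
                }
                break;
            }
        } elseif ($id === null && $text === ',' && $depth === 1) {
            $args[] = trim($current); // a top-level comma ends the argument
            $current = '';
            continue;
        }

        if ($depth >= 1) {
            $current .= $text;
        }
    }

    return ['function_name' => $name, 'arguments' => $args];
}

var_export(parse_call("___echo(TODAY_IS, date('j.n.Y', time()), time());"));
```

Because string literals arrive as single tokens, commas and parentheses inside quotes do not confuse the splitting, which is the main advantage over a regex.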

array of required twig variables in symfony

Is there any way to discover the variables required by a Twig template? For example, if I had:
Hello {{ user }}! You're {{ age }} years old, well done big man!
I'd be able to load this template and then gather each of the required variables, eventually allowing me to have something like:
Array ( [0] => user [1] => age )
The end goal of this is to be able to define a view and then have the system create a form based on the required variables in a template file.
Working Solution
Thanks to morg for pointing me towards tokenize, I was able to get what I wanted using the following (I placed it in my controller for testing):
$lexer = new \Twig_Lexer(new \Twig_Environment());
$stream = $lexer->tokenize(new \Twig_Source('{{test|raw}}{{test2|raw|asd}}{{another}}{{help_me}}', null));
$variables = array();
while (!$stream->isEOF()) {
$token = $stream->next();
if($token->getType() === \Twig_Token::NAME_TYPE){
$variables[] = $token->getValue();
while (!$stream->isEOF() && $token->getType() !== \Twig_Token::VAR_END_TYPE) {
$token = $stream->next();
}
}
}
$variables = array_unique($variables);
This returns:
Array
(
[0] => test
[1] => test2
[2] => another
[3] => help_me
)
You'll notice I only get variables and not any of the functions (this is by design), although you could remove the nested while loop if you wish to get both variables and functions.
You can use the twig tokenizer for this.
$stream = $twig->tokenize($source, $identifier);
The returned token stream has a __toString() method, whose resulting string you can parse for
VAR_START_TYPE()
NAME_TYPE(varname)
VAR_END_TYPE()
Look at this for more detailed information.
You can try using preg_match_all('/\{\{\s*(\w+)\s*\}\}/', 'template {{string }} with {{ var}}', $matchesArray, PREG_SET_ORDER);. With the PREG_SET_ORDER flag, $matchesArray is structured as follows:
Array(
0 => array(0 => '{{string }}', 1 => 'string'),
1 => array(0 => '{{ var}}', 1 => 'var')
)
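As a rough, runnable illustration of the regex route, the sketch below collects the unique names. The function name twig_variable_names() is made up, and the pattern only looks at the first identifier after each "{{", so it ignores filters, misses {% %} blocks and will also report function names as if they were variables.

```php
<?php
// Hypothetical helper: regex-only extraction of names used in {{ ... }} tags.
function twig_variable_names(string $template): array
{
    // Capture the first identifier after each "{{"; filters such as "|raw" are skipped.
    preg_match_all('/\{\{\s*(\w+)/', $template, $matches);

    // Re-index so the result is a plain list without gaps.
    return array_values(array_unique($matches[1]));
}

print_r(twig_variable_names("Hello {{ user }}! You're {{ age }} years old, {{ user|upper }}."));
```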
Another way of doing this from inside PHP code is not elegant, but still more reliable than any regex will be:
$source = "My template string with {{ some }} parameters.";
$stream = $twig->tokenize(new \Twig_Source($source, "source"));
$matches = [];
preg_match_all(
"/NAME_TYPE\((.*)\)/", $stream->__toString(), $matches
);
if (count($matches) > 1) {
$params = array_unique($matches[1]);
} else {
$params = [];
}
This works by using Twig's internal mechanisms to tokenize the template string and then extracting parameter names with a regex.
Edit: The previous version of my answer used the parse method to create a tree of nodes, but it no longer seemed to work, and matching on NAME_TYPE at the earlier step seems more reliable. I'm not sure whether I missed something there.

Regular Expression filtering all translation functions

I am working on a web interface that provides the same functionality as poEdit.
I want to walk through all .php files in a specified folder and search every line for a translation. For this I would like to use a regular expression that searches the current line of the PHP file and returns the translation-text parameter and the domain parameter.
My function looks like this:
__('This is my translation', 'domain');
But because for the domain-parameter I defined a default, the function __() can also be called like this:
__('this is my translation');
Now in PHP I tried to use the preg_match_all() function, but I can't get my regex together.
Here is an example of a possible line in the script and the output array I would like to receive with the preg_match_all() function:
echo __('Hello World'); echo __('Some domain specific translation', 'mydomain');
Array output:
Array
(
[0] => Array
(
[0] => Hello World
)
[1] => Array
(
[0] => Some domain specific translation
[1] => mydomain
)
)
Can anyone help me out with the regex and the preg_match_all() flags?
Thank you guys.
Something like this should work. The PREG_SET_ORDER flag groups the results per call, and array_shift() is needed because element zero of each set always contains the full match; there is no flag to exclude it AFAIK.
if(preg_match_all('/__\(\s*\'((?:[^\']|(?<=\\\)\')+)\'(?:\s*,\s*\'((?:[^\']|(?<=\\\)\')+)\')?\s*\)/us', $data, $result, PREG_SET_ORDER)) {
foreach ($result as &$item) {
array_shift($item);
}
unset($item);
var_dump($result);
}
It correctly finds calls like __('lorem \' ipsum', 'my\'domain'). It would fail on __('lorem \\') though.
The regex you would need for this is considerably complex.
__\(\s*(['"])((?:(?!(?<!\\)\1).)+)\1(?:,\s*(['"])((?:(?!(?<!\\)\3).)+)\3)?\s*\)
Matches would be in groups 2 and 4, for example
__('This is my translation', 'domain');
would produce these groups:
'
This is my translation
'
domain
and this
__('This is my \'translation\'', "domain");
would produce these groups:
'
This is my \'translation\'
"
domain
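To show how groups 2 and 4 might be consumed from PHP, here is a small sketch that uses the pattern above unchanged. The helper name extract_translations() and the fallback domain name 'default' are assumptions for illustration; a nowdoc is used so that the regex needs no extra PHP string escaping.

```php
<?php
// The pattern from above, kept in a nowdoc so backslashes stay untouched.
const TRANSLATION_PATTERN = <<<'REGEX'
/__\(\s*(['"])((?:(?!(?<!\\)\1).)+)\1(?:,\s*(['"])((?:(?!(?<!\\)\3).)+)\3)?\s*\)/
REGEX;

// Hypothetical helper; $defaultDomain is an assumed fallback name.
function extract_translations(string $source, string $defaultDomain = 'default'): array
{
    // PREG_SET_ORDER yields one array per __() call.
    preg_match_all(TRANSLATION_PATTERN, $source, $sets, PREG_SET_ORDER);

    $found = [];
    foreach ($sets as $set) {
        $found[] = [
            'text'   => $set[2],
            // Group 4 is missing when the call has no domain argument.
            'domain' => ($set[4] ?? '') !== '' ? $set[4] : $defaultDomain,
        ];
    }
    return $found;
}

print_r(extract_translations("echo __('Hello World'); echo __('Some domain specific translation', 'mydomain');"));
```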

Search in array with relevance

I am doing a very small online store application in PHP. So I have an array of maps in PHP. I want to search for a string (a product) in the array. I looked at array_search in PHP and it seems that it only looks for exact match. Do you guys know a better way to do this functionality? Since this is a very small part of what I am actually doing, I was hoping that there was something built in. Any ideas?
Thanks!
EDIT: The array contains "products" in this format:
[6] => SimpleXMLElement Object
(
[@attributes] => Array
(
[id] => 2000-YM
)
[Name] => Team Swim School T-Shirt
[size] => YM
[price] => 15
[group] => Team Clothing
[id] => 2000-YM
)
[7] => SimpleXMLElement Object
(
[@attributes] => Array
(
[id] => 3000-YS
)
[Name] => Youth Track Jacket
[size] => YS
[price] => 55
[group] => Team Clothing
[id] => 3000-YS
)
So I was wondering whether I can do a search such as "Team" and have it return the first item seen here. I am basing the search on the Name (again, this is just something small). I understand that I can find the exact string; I am just stuck on the "best results" if it cannot find the exact item. Efficiency is nice but not required, since I only have about 50 items, so even a "slow" algorithm won't take much time.
array_filter lets you specify a custom function to do the searching. In your case, a simple function that uses strpos() to check if your search string is present:
function my_search($haystack) {
$needle = 'value to search for';
// Caveat: strpos() returns 0 for a match at the very start, which array_filter treats as false.
return(strpos($haystack, $needle)); // or stripos() if you want case-insensitive searching.
}
$matches = array_filter($your_array, 'my_search');
Alternatively, you could use an anonymous function to help prevent namespace contamination:
$matches = array_filter($your_array, function ($haystack) use ($needle) {
return(strpos($haystack, $needle));
});
foreach($array as $item){
if(strpos($item,"mysearchword")!== false){
echo 'found';
}
}
or you can use preg_match for more flexible search instead of strpos.
I think Marc B's answer was a good starting point, but for me it had some problems. You have to know what the needle is at "compile time" because you can't change that value dynamically. Also, if the needle appeared at the start of the string element, it would act as if it weren't there at all. After a little experimenting I managed to come up with a way around both problems, so you no longer have to create a new function for every needle you want to use.
function my_search($haystack)
{
global $needle;
return strpos($haystack, $needle) !== false;
}
and it would be called like this:
$needle="item to search for";
$matches = array_filter($my_array, 'my_search');
Because $needle is now accessible in the same scope as the rest of the code, you can set it to any other string variable you want, including user input.
Unfortunately, search is one of the more difficult things to do in computer science. If you build search on literal string matches or regular expressions (regex), you may find that you are unhappy with the relevance of the results that are returned.
If you're interested in rolling up your sleeves and getting a little dirty with a more sophisticated solution, I'd try Zend's Lucene implementation ( http://framework.zend.com/manual/en/zend.search.lucene.html ). I've implemented a search on a site with it. It took a few days, but the results were MUCH better than the 15 minute solution of literal string matching.
PS. Here's an example: http://devzone.zend.com/article/91
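For a catalogue of about 50 items, something much lighter than Lucene may already be enough. The following is only a sketch of one possible scoring scheme using PHP's built-in similar_text(); the function name rank_by_name(), the score values and the use of plain arrays instead of SimpleXMLElement objects are all assumptions made to keep the example self-contained.

```php
<?php
// Hypothetical ranking: exact match > substring match > fuzzy similarity.
function rank_by_name(array $products, string $query): array
{
    $scored = [];
    foreach ($products as $product) {
        // With SimpleXMLElement items this would be (string) $product->Name.
        $name = (string) $product['Name'];

        if (strcasecmp($name, $query) === 0) {
            $score = 200.0; // exact (case-insensitive) match
        } elseif (stripos($name, $query) !== false) {
            $score = 150.0; // query appears somewhere in the name
        } else {
            // similar_text() reports a similarity percentage between 0 and 100.
            similar_text(strtolower($name), strtolower($query), $percent);
            $score = $percent;
        }
        $scored[] = ['score' => $score, 'product' => $product];
    }

    // Highest score first.
    usort($scored, fn (array $a, array $b): int => $b['score'] <=> $a['score']);

    return array_column($scored, 'product');
}

$products = [
    ['id' => '3000-YS', 'Name' => 'Youth Track Jacket'],
    ['id' => '2000-YM', 'Name' => 'Team Swim School T-Shirt'],
];
print_r(rank_by_name($products, 'Team'));
```

The caller can take the first few entries as the "best results" and drop anything below a score threshold of their choosing.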
I had the same issue, so I created a function that searches an array given the array, a key and a value.
public function searchinarr($array, $key, $value)
{
$results = array();
for($i=0;$i<count($array);$i++)
{
foreach($array[$i] as $k=>$val)
{
if($k==$key)
{
if(strpos($val,$value)!== false)
{
$results[] = $array[$i];
}
}
}
}
return $results;
}
