Convert Single Line Comments To Block Comments - php

I need to convert single line comments (//...) to block comments (/*...*/). I have nearly accomplished this in the following code; however, I need the function to skip any single line comment that is already inside a block comment. Currently it matches every single line comment, even when it sits inside a block comment.
## Convert Single Line Comment to Block Comments
function singleLineComments( &$output ) {
$output = preg_replace_callback('#//(.*)#m',
create_function(
'$match',
'return "/* " . trim(mb_substr($match[1], 0)) . " */";'
), $output
);
}

As already mentioned, "//..." can occur inside block comments and string literals. So if you create a small "parser" with the aid of a bit of regex-trickery, you could first match either of those things (string literals or block comments), and only after that test whether "//..." is present.
Here's a small demo:
$code ='A
B
// okay!
/*
C
D
// ignore me E F G
H
*/
I
// yes!
K
L = "foo // bar // string";
done // one more!';
// "~" is used as the delimiter because "#" starts a comment inside an /x pattern
$regex = '~
("(?:\\\\.|[^\r\n\\\\"])*+") # group 1: matches double quoted string literals
|
(/\*[\s\S]*?\*/) # group 2: matches multi-line comment blocks
|
(//[^\r\n]*+) # group 3: matches single line comments
~x';
preg_match_all($regex, $code, $matches, PREG_SET_ORDER | PREG_OFFSET_CAPTURE);
foreach($matches as $m) {
if(isset($m[3])) {
echo "replace the string '{$m[3][0]}' starting at offset: {$m[3][1]}\n";
}
}
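Which produces the following output (the offsets shown correspond to a $code string with \r\n line endings; with plain \n line endings they are 4, 46 and 89):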
replace the string '// okay!' starting at offset: 6
replace the string '// yes!' starting at offset: 56
replace the string '// one more!' starting at offset: 102
Of course, there are more string literals possible in PHP, but you get my drift, I presume.
HTH.
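To go from reporting offsets to actually rewriting the comments, the same three-alternative pattern can be fed to preg_replace_callback. This is only a sketch: the function name convertLineComments is made up for illustration, it uses ~ as the delimiter so that # can introduce comments inside the /x pattern, and it only knows about double quoted strings (no single quotes, heredocs, or comments that themselves contain */).

```php
<?php
// Hypothetical helper: converts "// ..." comments to "/* ... */" while
// leaving double quoted strings and existing block comments untouched.
function convertLineComments(string $code): string
{
    $regex = '~
        ("(?:\\\\.|[^\r\n\\\\"])*+")  # group 1: double quoted string literal
        |
        (/\*[\s\S]*?\*/)              # group 2: block comment
        |
        (//[^\r\n]*+)                 # group 3: single line comment
    ~x';

    return preg_replace_callback($regex, function (array $m): string {
        // Group 3 is only set when a real single line comment was matched.
        if (isset($m[3])) {
            return '/* ' . trim(substr($m[3], 2)) . ' */';
        }
        // Strings and block comments are returned unchanged.
        return $m[0];
    }, $code);
}

$code = "A\n// okay!\n/* C\n// ignore me\n*/\n\$L = \"foo // bar\";\ndone // one more!";
echo convertLineComments($code), "\n";
```

Because strings and block comments are consumed by the first two alternatives, any "//" inside them never gets a chance to match group 3.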

You could try a negative look behind: http://www.regular-expressions.info/lookaround.html
## Convert Single Line Comment to Block Comments
function sinlgeLineComments( &$output ) {
$output = preg_replace_callback('#^((?:(?!/\*).)*?)//(.*)#m',
function ($match) {
// $match[1] is the code before the comment, $match[2] is the comment text
return $match[1] . "/* " . trim($match[2]) . " */";
}, $output
);
}
However, I worry about strings with // in them, like:
$x = "some string // with slashes";
That would get converted as well.
If your source file is PHP, you could use tokenizer to parse the file with better precision.
http://php.net/manual/en/tokenizer.examples.php
Edit:
Forgot about the fixed length, which you can overcome by nesting the expression. The above should work now. I tested it with:
$foo = "// this is foo";
sinlgeLineComments($foo);
echo $foo . "\n";
$foo2 = "/* something // this is foo2 */";
sinlgeLineComments($foo2);
echo $foo2 . "\n";
$foo3 = "the quick brown fox";
sinlgeLineComments($foo3);
echo $foo3. "\n";;
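The tokenizer suggestion above could look roughly like the following. Treat it as a sketch under assumptions: the helper name lineCommentsToBlock is invented here, the input must be a complete PHP source string including the opening tag, and comments whose text contains */ are not handled.

```php
<?php
// Hypothetical helper built on token_get_all(): only real "//" comment
// tokens are rewritten, so strings and block comments are never touched.
function lineCommentsToBlock(string $source): string
{
    $out = '';
    foreach (token_get_all($source) as $token) {
        // Single-character tokens such as ";" or "=" come back as plain strings.
        if (!is_array($token)) {
            $out .= $token;
            continue;
        }
        [$id, $text] = $token;
        if ($id === T_COMMENT && strncmp($text, '//', 2) === 0) {
            // Depending on the PHP version the token may include the line break,
            // so split it off and put it back after the closing "*/".
            $body = rtrim($text, "\r\n");
            $eol  = substr($text, strlen($body));
            $text = '/* ' . trim(substr($body, 2)) . ' */' . $eol;
        }
        $out .= $text;
    }
    return $out;
}

$source = "<?php\n\$x = \"a // b\"; // note\n/* keep // this */\n";
echo lineCommentsToBlock($source);
```

The check on the first two characters matters because "#" comments and block comments are reported as T_COMMENT too.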

Related

PHP Form implode / explode

I am using a db field merchant_sku_item in a form. The original value is separated by / in the db like this:
2*CC689/1*CC368-8/1*SW6228-AB
I want to display in a text area on each line so I tried like this:
<textarea name="merchant_sku_item" rows="5" class="form-control" id="merchant_sku_item"><?
$items=explode('/',$merchant_sku_item);
foreach($items as $item){
echo $item."\r\n";
}
?></textarea>
All works fine:
2*CC689
1*CC368-8
1*SW6228-AB
but when I post the form I get a value like this:
2*CC689 1*CC368-8 1*SW6228-AB
but I want it back in the original format so I can update the DB correctly:
2*CC689/1*CC368-8/1*SW6228-AB
I tried to implode it with the / but I think it's just one string now so it's not working. I could replace the spaces I guess but this will not work if the field contains spaces.
Could somebody please tell me the best way to handle this?
The explode is correct, but you cannot just echo $item . "\r\n" because if $item contains </textarea> or any other HTML you'll screw up the page. You have to use echo htmlspecialchars($item) . "\n";. Normally, HTML pages have Linux line endings with "\n" and not Windows line endings with "\r\n".
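As a small sketch of that escaping step (the helper name textareaBody and the sample value are made up for illustration):

```php
<?php
// Hypothetical helper: turns the "/"-separated DB value into the escaped,
// one-item-per-line text that goes between <textarea> and </textarea>.
function textareaBody(string $dbValue): string
{
    $lines = array_map('htmlspecialchars', explode('/', $dbValue));
    return implode("\n", $lines);
}

// Sample value with some HTML in it to show the escaping.
echo '<textarea name="merchant_sku_item" rows="5">'
    . textareaBody('2*CC689/1*<b>')
    . '</textarea>';
```

The second item is emitted as 1*&lt;b&gt;, so the browser shows the literal text instead of interpreting it as markup.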
To re-create the value for the DB, you have to take in consideration that the user may add some spaces or new lines. So you might not just get "\r\n" between the values but also " \n" or I don't know what. This is why a regular expression will be more flexible than a simple explode().
The regular expression pattern: \s+
The pattern \s matches any space, tab or new line character. Adding the + sign after it means "one or more times". So " \r\n" will match, as it contains spaces, a carriage return and a new line. In PHP, you put the pattern between a delimiter character of your choice, placed at the beginning and at the end. Commonly it's a slash, so it becomes /\s+/, but you sometimes also see #\s+# or ~\s+~. After the closing delimiter, you can add flags that change the way the regular expression is executed. For example, /hello/i will match "Hello" or "hello" because the i flag makes the search case-insensitive.
Similar to what you did: explode and re-implode example:
<?php
// Example of values that could be posted because users are always
// stupid and add spaces that they then don't see anymore.
$examples = [
"2*CC689 1*CC368-8 1*SW6228-AB", // spaces
"2*CC689\n1*CC368-8\n1*SW6228-AB", // new lines
"2*CC689\r\n1*CC368-8\r\n1*SW6228-AB", // carriage returns and new lines
"2*CC689\n 1*CC368-8 \n1*SW6228-AB", // new lines and spaces
];
foreach ($examples as $merchant_sku_item) {
$values = preg_split('/\s+/', $merchant_sku_item);
$merchant_sku_item_for_db = implode('/', $values);
echo $merchant_sku_item_for_db . "\n";
}
?>
Output:
2*CC689/1*CC368-8/1*SW6228-AB
2*CC689/1*CC368-8/1*SW6228-AB
2*CC689/1*CC368-8/1*SW6228-AB
2*CC689/1*CC368-8/1*SW6228-AB
More simply, you could also just do a replacement with the same regular expression like this:
<?php
// Example of values that could be posted.
$examples = [
"2*CC689 1*CC368-8 1*SW6228-AB", // spaces
"2*CC689\n1*CC368-8\n1*SW6228-AB", // new lines
"2*CC689\r\n1*CC368-8\r\n1*SW6228-AB", // carriage returns and new lines
"2*CC689\n 1*CC368-8 \n1*SW6228-AB", // new lines and spaces
];
foreach ($examples as $merchant_sku_item) {
$merchant_sku_item_for_db = preg_replace('/\s+/', '/', $merchant_sku_item);
echo $merchant_sku_item_for_db . "\n";
}
?>
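One thing to watch with both variants: if the posted value starts or ends with whitespace (a trailing line break is common with textareas), preg_split yields empty elements and preg_replace leaves a stray slash. A possible way around it, sketched with a made-up helper name normalizeSkuItems, is to drop empty pieces with PREG_SPLIT_NO_EMPTY:

```php
<?php
// Hypothetical helper: rebuilds the "/"-separated DB value and ignores
// leading/trailing whitespace as well as blank lines.
function normalizeSkuItems(string $posted): string
{
    $values = preg_split('/\s+/', $posted, -1, PREG_SPLIT_NO_EMPTY);
    return implode('/', $values);
}

// Sample posted value with a leading space and a trailing line break.
echo normalizeSkuItems(" 2*CC689\r\n1*CC368-8\r\n1*SW6228-AB\r\n"), "\n";
```

For that sample the result is 2*CC689/1*CC368-8/1*SW6228-AB, without a leading or trailing slash.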
And just another important point regarding the data the user could input: What happens if the user types "2*CC/689" in the textarea?
Well, this will break your DB value :-/
This means that you have to validate the user input with some checks:
<?php
header('Content-Type: text/plain');
$examples = [
"2*CC689 1*CC368-8 1*SW6228-AB", // spaces
"2*CC689\n1*CC368-8\n1*SW6228-AB", // new lines
"2*CC689\r\n1*CC368-8\r\n1*SW6228-AB", // carriage returns and new lines
"2*CC689\n 1*CC368-8 \n1*SW6228-AB", // new lines and spaces
// Tests with invalid data:
"2*C/C689\n1*CC368-8\n1*SW6228-AB", // slash not allowed
"*CC689\n1*CC368-8\n1*SW6228-AB", // missing number before the *
"1*\n1*CC368-8\n1*SW62?28-AB", // missing product identifier and invalid ?
"1CC689 1*CC368-8 1SW6228-AB", // missing *
];
foreach ($examples as $example_nbr => $merchant_sku_item) {
echo str_repeat('=', 80) . "\n";
echo "Example $example_nbr\n\$merchant_sku_item = \"$merchant_sku_item\"\n";
$values = preg_split('/\s+/', $merchant_sku_item);
$errors = [];
foreach ($values as $i => $value) {
echo "Value $i = \"$value\"";
// Pattern: a number followed by * and followed by a product id (length between 3 and 10).
if (!preg_match('/^\d+\*[\d\w-]{3,10}$/i', $value)) {
echo " <-- ERROR\n";
$errors[] = $value;
} else {
echo "\n"; // It's ok
}
}
if (!empty($errors)) {
// You should handle the error and reload the form with the posted value and an error
// message explaining to the user what format is allowed.
echo "ERROR: Cannot save the value because the following products are wrong:\n";
echo implode("\n", $errors) . "\n";
}
}
?>
Test it here: https://onecompiler.com/php/3xtff6nk8
You can use preg_replace on the final posted string, like in the code below:
$str = "2*CC689 blue 1*CC368-8 red 1*SW6228-AB";
$items = preg_replace('/\s+/', '/', $str);
echo $items;
output
2*CC689/blue/1*CC368-8/red/1*SW6228-AB
I hope I understood your question correctly.
Try replacing the created spaces and removing newline characters like so:
<?php
$merchant_sku_item = str_replace("\r\n","/",trim($_POST["merchant_sku_item"]));
?>
Make it this way:
<textarea name="merchant_sku_item" rows="5" class="form-control" id="merchant_sku_item"><?
$items=explode('//',$merchant_sku_item);
foreach($items as $item){
echo $item."\r\n";
}
?></textarea>

Explode() String Breaking Array Into Too Many Elements

I am working on scraping and then parsing an HTML string to get the two URL parameters inside the href. After scraping the element I need, $description, the full string ready for parsing is:
<a target="_blank" href="CoverSheet.aspx?ItemID=18833&MeetingID=773">Description</a><br>
Below I use the explode function to split the $description string on the = delimiter. I then explode further on the double quote delimiter.
Problem I need to solve: I want to only print the numbers for MeetingID parameter before the double quote, "773".
<?php
echo "Description is: " . htmlentities($description); // prints the string referenced above
$htarray = explode('=', $description); // explode the $description string which includes the link. ... then, find out where the MeetingID is located
echo $htarray[4] . "<br>"; // this will print the string which includes the meeting ID: "773">Description</a><br>"
$meetingID = $htarray[4];
echo "Meeting ID is " . substr($meetingID,0,3);
?>
The above echo statement using substr works to print the meeting ID, 773.
However, I want to make this bulletproof in the event MeetingID parameter exceeds 999, then we would need 4 characters. So that's why I want to delimit it by the double quotes, so it prints all numbers before the double quotes.
Below I try to isolate everything before the double quote, but it doesn't seem to work correctly yet.
<?php
$htarray = explode('"', $meetingID); // split the $meetingID string based on the " delimiter
echo "Meeting ID0 is " . $meetingID[0] ; // this prints just the first number, 7
echo "Meeting ID1 is " . $meetingID[1] ; // this prints just the second number, 7
echo "Meeting ID2 is " . $meetingID[2] ; // this prints just the third number, 3
?>
Question, why is the array $meetingID[0] not printing the THREE numbers before the delimiter, ", but rather just printing a single number? If the explode function works properly, shouldn't it be splitting the string referenced above based on the double quotes, into just two elements? The string is
"773">Description</a><br>"
So I can't understand why when echoing after the explode with double quote delimiter, it's only printing one number at a time..
The reason you're getting the wrong response is because you're using the wrong variable.
$htarray = explode('"', $meetingID);
echo "Meeting ID0 is " . $meetingID[0] ; // this prints just the first number, 7
echo "Meeting ID1 is " . $meetingID[1] ; // this prints just the second number, 7
echo "Meeting ID2 is " . $meetingID[2] ; // this prints just the third number, 3
echo "Meeting ID is " . $htarray[0] ; // this prints 773
There's an easier way to do this though, using regular expressions:
$description = '<a target="_blank" href="CoverSheet.aspx?ItemID=18833&MeetingID=773">Description</a><br>';
$meetingID = "Not found";
if (preg_match('/MeetingID=([0-9]+)/', $description, $matches)) {
$meetingID = $matches[1];
}
echo "Meeting ID is " . $meetingID;
// this prints 773 or Not found if $description does not contain a (numeric) MeetingID value
There is a very easy way to do it:
Your Str:
$str ='<a target="_blank" href="CoverSheet.aspx?ItemID=18833&MeetingID=773">Description</a><br>';
Make substr:
$params = substr( $str, strpos( $str, 'ItemID'), strpos( $str, '">') - strpos( $str, 'ItemID') );
You will get substr like this :
ItemID=18833&MeetingID=773
Now do whatever you want to do!
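For instance, the extracted query string can be handed to parse_str, which avoids counting characters altogether. A minimal sketch, assuming both 'ItemID' and '">' are present in the string (the helper name queryParamsFromLink is invented for this example):

```php
<?php
// Hypothetical helper: cuts the query string out of the link as shown above
// and lets parse_str() split it into an array of parameters.
function queryParamsFromLink(string $str): array
{
    $start  = strpos($str, 'ItemID');
    $params = substr($str, $start, strpos($str, '">') - $start);
    parse_str($params, $query);
    return $query;
}

$str = '<a target="_blank" href="CoverSheet.aspx?ItemID=18833&MeetingID=773">Description</a><br>';
$query = queryParamsFromLink($str);
echo "Meeting ID is ", $query['MeetingID'], "\n";
```

This keeps working when MeetingID grows beyond three digits. If the scraped HTML encodes the ampersand as &amp;, running html_entity_decode on the string first would probably be needed.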

PHP regex replace based on \v character (vertical tab)

I have a character string like (ascii codes):
32,13,7,11,11,
"string1,blah;like: this...", 10,10, 32,32,32,32, 138,138, 32,32,32,32, 13,7, 11,11,
"string2/lorem/example-text...", 10,10, 32,32,32,32,32, 143,143,143,143,143
So the sequence is:
any characters, followed by my search string, followed by any
characters
11,11
the string I want to replace
any non-printable characters
If the block contains string1 then I need to replace the next string with something else. The second string always starts directly after the 11,11.
I'm using PHP.
I thought something like this, but I am not getting the correct result:
$updated = preg_replace("/(.*string1.*?\\v+)([[:print:]]+)([[:ascii:]]*)/mi", "$1"."new string"."$3", $orig);
This puts "new string" between the 10,10 and the 138,138 (and replaces the 32's).
Also tried \xb instead of \v.
Normally I test with regex101, but I'm not sure how to do that with non-printable characters. Any suggestions from regex gurus?
Edit: the expected output is the sequence:
32,13,7,11,11,
"string1,blah;like: this...", 10,10, 32,32,32,32, 138,138, 32,32,32,32, 13,7, 11,11,
"new string", 10,10, 32,32,32,32,32, 143,143,143,143,143
Edit: sorry for the confusion regarding the ascii codes.
Here's a complete example:
<?php
$s = chr(32).chr(32).chr(7).chr(11).chr(11);
$s .= "string1,blah;like: this...". chr(10).chr(10).chr(32).chr(32).chr(32).chr(32).chr(138).chr(138);
$s .= chr(32).chr(32).chr(32).chr(32).chr(13).chr(7).chr(11).chr(11);
$s .= "string2/lorem/example-text...". chr(10).chr(10).chr(32).chr(32).chr(32).chr(32).chr(32).chr(143).chr(143).chr(143);
$result = preg_replace('/(.*string1.*?\v+)([[:print:]]+)([[:ascii:]]*)/mi', "$1"."new string"."$3", $s);
echo "\n------------------------\n";
echo $result;
echo "\n------------------------\n";
The text string2/lorem/example-text... should be replaced by new string.
My php-cli halted every time preg_match reached chr(138) and I don't know why.
I will throw my hat in the ring with this regex (note: \v matches any vertical whitespace character, including a new-line; no flags are set):
"[^"]*"[^\x0b]+\v{2}"\K[^"]*
PHP code:
$source = chr(32).chr(13).chr(7).chr(11).chr(11)."\"string1,blah;like: this...\"".chr(10).
chr(10).chr(32).chr(32).chr(32).chr(32).chr(138).chr(138).chr(32).chr(32).chr(32).chr(32).
chr(13).chr(7).chr(11).chr(11)."\"string2/lorem/example-text...\"".chr(10).chr(10).chr(32).
chr(32).chr(32).chr(32).chr(32).chr(143).chr(143).chr(143).chr(143).chr(143);
echo preg_replace('~"[^"]*"[^\x0b]+\v{2}"\K[^"]*~', "new string", $source);
Beautiful output:
"string1,blah;like: this..."
��
"new string"
�����
Solved. It was a combination of things:
/mis was needed (instead of /mi)
\x0b was needed (instead of \v)
Complete working example:
<?php
$s = chr(32).chr(32).chr(7).chr(11).chr(11);
$s .= "string1,blah;like: this...". chr(10).chr(10).chr(32).chr(32).chr(32).chr(32).chr(138).chr(138);
$s .= chr(32).chr(32).chr(32).chr(32).chr(13).chr(7).chr(11).chr(11);
$s .= "string2/lorem/example-text...". chr(10).chr(10).chr(32).chr(32).chr(32).chr(32).chr(32).chr(143).chr(143).chr(143);
$result = preg_replace('/(.*string1.*?\x0b+)([[:print:]]+)/mis', "$1"."new string", $s);
echo "\n------------------------\n";
echo $result;
echo "\n------------------------\n";
Thanks for everyone's suggestions. It put me on the right track.
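For anyone wondering why those two changes matter: in PCRE, \v is a class for any vertical whitespace (so it also matches chr(10)), while \x0b matches only the vertical tab, and without the s flag the dot does not match a line feed. A quick check, with variable names chosen just for this illustration:

```php
<?php
$lf = chr(10); // line feed
$vt = chr(11); // vertical tab

// \v matches any vertical whitespace, so it also matches a plain line feed.
$vMatchesLf   = preg_match('/\v/', $lf);
// \x0b matches the vertical tab only.
$x0bMatchesLf = preg_match('/\x0b/', $lf);
$x0bMatchesVt = preg_match('/\x0b/', $vt);

// Without /s the dot stops at a line feed; with /s it crosses it.
$dotNoS   = preg_match('/a.b/', "a\nb");
$dotWithS = preg_match('/a.b/s', "a\nb");

var_dump($vMatchesLf, $x0bMatchesLf, $x0bMatchesVt, $dotNoS, $dotWithS);
```

The five results are 1, 0, 1, 0 and 1, which is consistent with the fix above: \x0b+ stops exactly at the 11,11 pair and /s lets .* run across the chr(10) bytes.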

PHP regex split text to insert HTML

Very(!) new to regex but...
I have the following text strings outputted from a $title variable:
A. This is a title
B. This is another title
etc...
I'm after the following:
<span>A.</span> This is a title
<span>B.</span> This is another title
etc...
Currently I have the following code:
$title = $element['#title'];
if (preg_match("([A-Z][\.])", $title)) {
return '<li' . drupal_attributes($element['#attributes']) . ">Blarg</li>\n";
} else {
return '<li' . drupal_attributes($element['#attributes']) . '>' . $output . $sub_menu . "</li>\n";
}
This replaces anything A. through to Z. with Blarg however I'm not sure how to progress this?
In the Text Wrangler app I could wrap regex in brackets and output each argument like so:
argument 1 = \1
argument 2 = \2
etc...
I know I need to add an additional regex to grab the remainder of the text string.
Perhaps a regex guru could help a novice out!
Thanks,
Steve
Try
$title = 'A. This is a title';
$title = preg_replace('/^[A-Z]\./', '<span>$0</span>', $title);
echo $title;
// <span>A.</span> This is a title
If the string contains newlines and other titles following them, add the m modifier after the ending delimiter.
If the regex doesn't match then no replacements will be made, so there is no need for the if statement.
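A quick illustration of the m modifier on a multi-line value (the sample titles are taken from the question, the variable names are only for this sketch):

```php
<?php
// With /m, ^ matches at the start of every line, not just the start of the string.
$titles = "A. This is a title\nB. This is another title";
$result = preg_replace('/^[A-Z]\./m', '<span>$0</span>', $titles);
echo $result, "\n";
```

Both lines get their prefix wrapped; without the m modifier only the "A." on the first line would be.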
Is it always just 2 characters ("A.", "B.", "C.", ...)?
Because then you could work with a substring instead of a regex.
Just pick off the first 2 chars of the link title and wrap the span around that substring.
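Something along these lines, assuming every title really starts with a two-character prefix (the function name wrapPrefix is made up here):

```php
<?php
// Hypothetical helper: wraps the first two characters in a <span>
// and appends the rest of the title unchanged.
function wrapPrefix(string $title): string
{
    return '<span>' . substr($title, 0, 2) . '</span>' . substr($title, 2);
}

echo wrapPrefix('A. This is a title'), "\n";
```

Unlike the regex, this does not check that the prefix is actually a letter followed by a period, so it is only safe when the input format is guaranteed.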
Try this (untested):
$title = $element['#title'];
if (preg_match("/([A-Z]\.)(.*)/", $title, $matches)) {
// $matches[1] is the "A." prefix, $matches[2] is the rest of the title
return '<li' . drupal_attributes($element['#attributes']) . "><span>{$matches[1]}</span>{$matches[2]}</li>\n";
} else {
return '<li' . drupal_attributes($element['#attributes']) . '>' . $output . $sub_menu . "</li>\n";
}
The change here was to first add / to the start and end of the string (to denote it's a regex), then remove the [ and ] around the period . because that's just a literal character on its own, then to add another group which will match the rest of the string. I also added a $matches argument to preg_match() to capture these two groups for later use, which we do on the next line.
Note: You could also do this instead:
$title = preg_replace('/^([A-Z]\.)/', "<span>$1</span>", $title);
This will simply replace the A-Z followed by the period at the start of the string (denoted with the ^ character) with <span>, that character (grabbed with the brackets) and </span>.
Again, that's not tested, but should give you a headstart :)

Scan HTML for values with a special character before them

Say I have values on my page, like #100 #246. What I want to do is scan the page for values with a # before them and then turn each of them into a hyperlink.
$MooringNumbers = '#!' . $MooringNumbers . ' | ' . '#!' . $row1["Number"];
}
$viewedResult = '<tr><td>' .$Surname.'</td><td>'.$Title.'</td><td>'.$MooringNumbers . '</td><td>'.$Telephone.'</td><td>' . '[EDIT]</td>'.'<td>'. '[x]</td>'. '</tr>';
preg_replace('/#!(\d\d\d)/', '${1}', $viewedResult);
echo $viewedResult;
This is the broken code, which doesn't work.
I second Xoc - use PHP manual. The method next to the one he pointed is preg-replace-callback
Just call:
$your_input = preg_replace_callback(
'/#\d\d\d/',
// an anonymous function is used here because create_function() was removed in PHP 8
function ($matches) {
return strtolower($matches[0]); // replace this with what you want to fetch from the database
},
$your_input
);
EDIT:
Since you want to always perform the same replacement go with Xoc's preg-replace:
preg_replace('/#!(\d\d\d)/', '${1}', $your_input);
Note: I don't have PHP here, so I give no guarantee of this code not wiping your entire hard disk ;)
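One detail that is easy to miss: preg_replace returns the new string and does not modify its argument, so the result has to be assigned back. A minimal sketch, where the link target mooring.php?number=... is purely a placeholder and not something from the question:

```php
<?php
// Sample cell content in the "#!123" format built in the question.
$viewedResult = '<td>#!100 | #!246</td>';

// Assign the return value; preg_replace() leaves the original string untouched.
// The URL "mooring.php?number=" is a hypothetical placeholder.
$viewedResult = preg_replace(
    '/#!(\d\d\d)/',
    '<a href="mooring.php?number=${1}">#${1}</a>',
    $viewedResult
);

echo $viewedResult, "\n";
```

Each #!NNN marker ends up as a link labelled #NNN, and the ! helper character disappears from the output.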
You can accomplish this by using regular expressions, see PHP's preg_replace function.
$text = 'Lorem ipsum #300 dolar amet #20';
preg_match_all('/(^|\s)#(\w+)/', $text, $matches);
// Perform your database magic here for each element in $matches[2]
var_dump($matches[2]);
// Fake query result
$query_result = array ( 300 => 'http://www.example1.com', 20 => 'http://www.example2.com');
foreach($query_result as $result_key => $result_value)
{
$text = str_replace('#'.$result_key, ''. $result_value . '', $text);
}
var_dump($text);
