regex failing with no errors - php

I have the following text in a string called $test:
Content-Type: text/plain
Server: testapp (4.2.1 (x86_64/linux))
Content-Length: 125
{"password":"123","email_address":"","name":"j.doe","username":"jd123"}
I am trying to write a regular expression in php that will return everything after content-length: 125.
Here's what I have so far:
if (preg_match('/^Content\-Length\:[0-9\\n]+([a-zA-Z0-9\{\}\"\:])*/',$test,$result))
{
var_dump($result[1]);
}
I don't get any error messages, but it doesn't find the pattern I've defined in my string.
I've also tried this pattern:
'/^Content\-Length\:[0-9\\n]+([a-zA-Z0-9{}\"\:])*/'
where I tried removing the escape character in front of the curly braces. But it's still a no-go.
Can you tell me what I'm missing?
Thanks.
EDIT 1
my code now looks like this:
<?php
$test = "Content-Type: text/plain
Server: kamailio (4.2.1 (x86_64/linux))
Content-Length: 125
{"password":"test123","email_address":"","name":"j.doe","username":"jd123"}";
//if (preg_match('/Content-Length\:[0-9\\n]*([a-zA-Z0-9{}\"\:])*/',$test,$result))
//{
// var_dump($result);
//}
preg_match('/({.*})/', $str, $matches);
echo $matches[0];
?>
That gives me the following error:
Undefined offset: 0 in /var/www/html/test/test.php on line 31
Line 31 is where I'm trying to echo the matches.

$str = <<<HEREDOC
Content-Type: text/plain
Server: testapp (4.2.1 (x86_64/linux))
Content-Length: 125
{"password":"123","email_address":"","name":"j.doe","username":"jd123"}
HEREDOC;
preg_match('/(\{.*\})/', $str, $matches);
echo $matches[0];
The regex here simply matches everything from the first { to the last } on a single line (without the s modifier, . does not match newlines). It's a quick and loose regex, however.
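The "Undefined offset: 0" notice in the question's edit comes from the variable names: the text is stored in $test, but $str is passed to preg_match, so $matches stays empty. A minimal sketch of guarding against that, assuming a shortened sample string (the sample and the 'no match' fallback text are illustrative, not from the original post):

```php
<?php
// Hypothetical sample input, shortened from the question.
$test = "Content-Length: 125\n" . '{"password":"123","username":"jd123"}';

// preg_match() returns 1 on a match, 0 on no match, and false on error,
// so only read $matches after checking the return value.
$json = null;
if (preg_match('/\{.*\}/', $test, $matches) === 1) {
    $json = $matches[0];
}

echo $json ?? 'no match', "\n";
```

Checking the return value first means a failed match yields a clear fallback instead of a notice about a missing array index.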

Instead of using a big pattern to match everything (which is time-consuming), why not use preg_split to cut your string into two pieces at your desired location?
$string = 'Content-Type: text/plain
Server: testapp (4.2.1 (x86_64/linux))
Content-Length: 125
{"password":"123","email_address":"","name":"j.doe","username":"jd123"}';
$parts = preg_split ("/Content-Length:\s*\d+\s*/", $string);
echo "The string i want is '" . $parts[1] . "'";
Output:
The string i want is '{"password":"123","email_address":"","name":"j.doe","username":"jd123"}'

You can avoid the regex altogether, because the HTTP headers are always separated from the response body by two consecutive line breaks (an empty line).
list($headers, $body) = explode("\n\n", $string);
Or for Windows-style line breaks (which, by the way, are the standard for HTTP headers):
list($headers, $body) = explode("\r\n\r\n", $string);
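One caveat with a plain explode is that a body containing its own empty line would be cut into more than two pieces. A possible sketch that accepts either line-ending style and splits only at the first empty line is shown here; the CRLF sample string is an assumption made for illustration, and decoding the body with json_decode is an extra step that the answer above does not include:

```php
<?php
// Sample response using CRLF line endings (assumed for illustration).
$string = "Content-Type: text/plain\r\n"
        . "Server: testapp (4.2.1 (x86_64/linux))\r\n"
        . "Content-Length: 125\r\n"
        . "\r\n"
        . '{"password":"123","email_address":"","name":"j.doe","username":"jd123"}';

// Split at the first empty line only; the limit of 2 keeps the body intact
// even if the body itself contains empty lines.
$parts   = preg_split('/\r?\n\r?\n/', $string, 2);
$headers = $parts[0];
$body    = $parts[1] ?? '';

// The body in this sample is JSON, so it can be decoded directly.
$data = json_decode($body, true);
echo $data['username'], "\n";
```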

Related

PHP regex replace based on \v character (vertical tab)

I have a character string like (ascii codes):
32,13,7,11,11,
"string1,blah;like: this...", 10,10, 32,32,32,32, 138,138, 32,32,32,32, 13,7, 11,11,
"string2/lorem/example-text...", 10,10, 32,32,32,32,32, 143,143,143,143,143
So the sequence is:
any characters, followed by my search string, followed by any characters
11,11
the string I want to replace
any non-printable characters
If the block contains string1 then I need to replace the next string with something else. The second string always starts directly after the 11,11.
I'm using PHP.
I thought something like this, but I am not getting the correct result:
$updated = preg_replace("/(.*string1.*?\\v+)([[:print:]]+)([[:ascii:]]*)/mi", "$1"."new string"."$3", $orig);
This puts "new string" between the 10,10 and the 138,138 (and replaces the 32's).
Also tried \xb instead of \v.
Normally I test with regex101, but I'm not sure how to do that with non-printable characters. Any suggestions from regex gurus?
Edit: the expected output is the sequence:
32,13,7,11,11,
"string1,blah;like: this...", 10,10, 32,32,32,32, 138,138, 32,32,32,32, 13,7, 11,11,
"new string", 10,10, 32,32,32,32,32, 143,143,143,143,143
Edit: sorry for the confusion regarding the ascii codes.
Here's a complete example:
<?php
$s = chr(32).chr(32).chr(7).chr(11).chr(11);
$s .= "string1,blah;like: this...". chr(10).chr(10).chr(32).chr(32).chr(32).chr(32).chr(138).chr(138);
$s .= chr(32).chr(32).chr(32).chr(32).chr(13).chr(7).chr(11).chr(11);
$s .= "string2/lorem/example-text...". chr(10).chr(10).chr(32).chr(32).chr(32).chr(32).chr(32).chr(143).chr(143).chr(143);
$result = preg_replace('/(.*string1.*?\v+)([[:print:]]+)([[:ascii:]]*)/mi', "$1"."new string"."$3", $s);
echo "\n------------------------\n";
echo $result;
echo "\n------------------------\n";
The text string2/lorem/example-text... should be replaced by new string.
My php-cli halted every time preg_match reached chr(138), and I don't know why.
I will throw my hat in the ring with this regex (note: \v also matches a newline; no flags are set):
"[^"]*"[^\x0b]+\v{2}"\K[^"]*
PHP code:
$source = chr(32).chr(13).chr(7).chr(11).chr(11)."\"string1,blah;like: this...\"".chr(10).
chr(10).chr(32).chr(32).chr(32).chr(32).chr(138).chr(138).chr(32).chr(32).chr(32).chr(32).
chr(13).chr(7).chr(11).chr(11)."\"string2/lorem/example-text...\"".chr(10).chr(10).chr(32).
chr(32).chr(32).chr(32).chr(32).chr(143).chr(143).chr(143).chr(143).chr(143);
echo preg_replace('~"[^"]*"[^\x0b]+\v{2}"\K[^"]*~', "new string", $source);
Beautiful output:
"string1,blah;like: this..."
��
"new string"
�����
Solved. It was a combination of things:
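/mis was needed (instead of /mi): the s modifier lets . match newlines, so .* can span the chr(10) characters in the string.
\x0b was needed (instead of \v): in PCRE, \v matches any vertical whitespace, including chr(10), so \v+ stopped at the 10,10 pair instead of the 11,11 pair.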
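The difference between the two escapes and the effect of the s modifier can be checked on a tiny string. This is only an illustrative sketch; the three-letter sample string and the "|" replacement are made up for the demonstration:

```php
<?php
// "a", line feed (10), "b", vertical tab (11), "c"
$s = "a" . chr(10) . "b" . chr(11) . "c";

// \v is PCRE's vertical-whitespace class, so it matches chr(10) as well as chr(11).
$withV = preg_replace('/\v/', '|', $s);

// \x0b matches only the vertical tab, chr(11).
$withX0b = preg_replace('/\x0b/', '|', $s);

// Without the s modifier, . does not match a line feed.
$noS   = preg_match('/a.b/', $s);
$withS = preg_match('/a.b/s', $s);

var_dump($withV, $withX0b, $noS, $withS);
```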
Complete working example:
<?php
$s = chr(32).chr(32).chr(7).chr(11).chr(11);
$s .= "string1,blah;like: this...". chr(10).chr(10).chr(32).chr(32).chr(32).chr(32).chr(138).chr(138);
$s .= chr(32).chr(32).chr(32).chr(32).chr(13).chr(7).chr(11).chr(11);
$s .= "string2/lorem/example-text...". chr(10).chr(10).chr(32).chr(32).chr(32).chr(32).chr(32).chr(143).chr(143).chr(143);
$result = preg_replace('/(.*string1.*?\x0b+)([[:print:]]+)/mis', "$1"."new string", $s);
echo "\n------------------------\n";
echo $result;
echo "\n------------------------\n";
Thanks for everyone's suggestions. It put me on the right track.

find word/string in string, do I need regex?

I have the following String:
... 12:32 +0304] "GET /test.php?param=value ....
I want to extract the test.php out of this String. I tried to find a PHP function that could do this, but found nothing helpful. So my next guess was regex, and I tried for a long time to get the part between GET / and ?. I failed badly...
Does a PHP function exist that could help me out, or do I need regex for this? If so, how can I get a string out of a string? Important: I don't want to know whether test.php is in the string. I want to get everything between GET / and ?.
The regex extracting anything between GET / and ? in a capture group:
GET \/(.*?)\?
Demo: https://regex101.com/r/wR9yM5/1
In PHP it can be used like this:
$str = '... 12:32 +0304] "GET /test.php?param=value ....';
preg_match('/GET \/(.*?)\?/', $str, $re);
print_r($re[1]);
Demo: https://ideone.com/0XzZwo
<?php
$string = '... 12:32 +0304] "GET /test.php?param=value ....';
$find = explode("GET /", explode(".php", $string)[0])[1].".php";
echo $find; // test.php
?>
Try this:
(?>(\/))(\w+\.php)
Or, if you want any extension of 2 or 3 characters:
(?>(\/))(\w+\.\w{2,3})
If you only want 3 characters, delete "2," from the braces.
PHP code:
<?php
$subject = '12:32 +0304] "GET /test.php?param=value';
// Group 1 is the slash; group 2 is the file name with its extension.
$pattern = '/(?>(\/))(\w+\.\w{2,3})/';
if (preg_match($pattern, $subject, $match)) {
    echo $match[2]; // test.php
}
?>
Without regex:
function between_solidus_and_question_mark($str) {
    $start = strtok($str, '/');
    $middle = strtok('?');
    $end = strtok(null);
    if ($start && $end) {
        return $middle;
    }
}
$str = '... 12:32 +0304] "GET /test.php?param=value ....';
var_dump(between_solidus_and_question_mark($str));
Outputs:
string(8) "test.php"
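For the access-log line in this question, another option is to capture the whole request target and let parse_url() separate the path from the query string. This is a sketch under the assumption that the request target contains no spaces; the variable names are illustrative:

```php
<?php
$line = '... 12:32 +0304] "GET /test.php?param=value ....';

$file = null;
// Capture the request target (everything up to the next whitespace).
if (preg_match('~GET (\S+)~', $line, $m) === 1) {
    // parse_url() splits the path from the query string.
    $path = parse_url($m[1], PHP_URL_PATH);   // "/test.php"
    $file = ltrim((string) $path, '/');
}

echo $file, "\n";
```

Unlike a pattern that requires a literal ?, this also works when the request has no query string at all.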

Return multiple lines from a long string

I have a large string with multiple instances of header information. For example:
HTTP/1.1 302 Found
Cache-Control: no-cache, no-store, must-revalidate
Content-Type: text/html; charset=iso-8859-1
Date: Tue, 01 Mar 2016 01:43:13 GMT
Expires: Sat, 26 Jul 1997 05:00:00 GMT
Location: http://www.google.com
Pragma: no-cache
Server: nginx/1.7.9
Content-Length: 294
Connection: keep-alive
After "Location:", I want to save all the data from that line to an array. There might be 3 or 4 lines to save from a big block of text.
How could I do this?
Thanks!
There are plenty of ways you could do this.
Here's one way:
Split the text up at the point where Location: occurs
Split the result by new lines into an array
Example:
$text = substr($text, strpos($text, 'Location:'));
$array = explode(PHP_EOL, $text);
Here's another way:
Using regex, match Location: and everything after it
As above - split the result by new lines
Example:
preg_match_all('~(Location:.+)~s', $text, $output);
$output = explode(PHP_EOL, $output[0][0]);
Note: the s modifier makes . match newlines as well. Without it, . stops at the first newline, so the capture would end at the end of the Location: line.
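If only the Location: lines themselves are wanted, and the text holds several header blocks, a multiline pattern can collect them all in one call. The following is a sketch, not the answer's method; the sample text is shortened and the second URL is a made-up placeholder:

```php
<?php
// Two shortened header blocks; the second URL is a made-up placeholder.
$text = "HTTP/1.1 302 Found\r\n"
      . "Location: http://www.google.com\r\n"
      . "Pragma: no-cache\r\n"
      . "\r\n"
      . "HTTP/1.1 302 Found\r\n"
      . "Location: http://example.com/next\r\n"
      . "Server: nginx/1.7.9\r\n";

// m: ^ and $ match at line boundaries; i: header names are case-insensitive.
// The optional \r keeps a trailing carriage return out of the capture.
preg_match_all('/^Location:[ \t]*(.*?)\r?$/mi', $text, $m);
$locations = $m[1];

print_r($locations);
```

Because the pattern handles both \n and \r\n endings, it does not depend on PHP_EOL, whose value varies by platform.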
I found another way that works too, so I figured I would add it in case it helps anyone:
foreach (preg_split("/((\r?\n)|(\r\n?))/", $bigString) as $line) {
    if (strpos($line, 'Location') !== false) {
        // Do stuff with the line
    }
}
Source: Iterate over each line in a string in PHP
There are a lot of other helpful approaches in there too.

str_replace multiple empty lines with 1 single line

I would like to replace multiple empty lines with a single one in PHP. Currently I am replacing = with whitespace:
$message = str_replace('=', ' ', $message);
Any suggestions on how to replace multiple empty lines (there could even be 5 of them) with just one?
Output
Received On Thu, 29 May 2014 - 01:50 AM / user#test.com
= test
=
$message = 'Received On Thu, 29 May 2014 - 01:50 AM / user#test.com
= test
=';
echo preg_replace('/\n(\s*\n){2,}/', "\n\n", $message); // Double quotes are required here so that \n is a newline.
OR
echo preg_replace('/\n(\s*\n){2,}/', "<br><br>", $message); //worked in browser
Here is a variant that replaces each line break, together with any line breaks, spaces, or tabs that follow it, with a single newline:
$message = preg_replace('/[\r\n][\r\n\t ]*/', "\n", $message);
Update: if you want to transform multi-line text into a single line of text, you can use:
$message = preg_replace('/[\r\n][\r\n\t ]*/', " ", $message);
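The two approaches behave differently: one keeps a single empty line between paragraphs, the other removes empty lines entirely. The sketch below shows both on a made-up sample string; the first pattern is a variation written for this illustration, not one taken from the answers:

```php
<?php
// Made-up sample: five line breaks, then a whitespace-only line.
$message = "line one\n\n\n\n\nline two\n   \n\nline three";

// Keep exactly one empty line: collapse any run of two or more line breaks
// (with optional spaces or tabs after each) into "\n\n".
// Caveat: this also strips leading indentation from the line that follows.
$oneBlank = preg_replace('/(?:\r?\n[ \t]*){2,}/', "\n\n", $message);

// Remove empty lines entirely, using the variant from the answer above.
$noBlank = preg_replace('/[\r\n][\r\n\t ]*/', "\n", $message);

echo $oneBlank, "\n---\n", $noBlank, "\n";
```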

remove text before and after the text i want - PHP

here is my example text
------=_Part_564200_22135560.1319730112436
Content-Type: text/plain; charset=utf-8; name=text_0.txt
Content-Transfer-Encoding: 7bit
Content-ID: <314>
Content-Location: text_0.txt
Content-Disposition: inline
I hate u
------=_Part_564200_22135560.1319730112436
Content-Type: image/gif; name=dottedline350.gif
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename=dottedline350.gif
Content-ID: <dottedline350.gif>
I need to be able to extract the "I hate u".
I know i can use explode to remove the last part after the message by doing
$body_content = garbled crap above;
$message_short = explode("------=_",$body_content);
echo $message_short;
but i need to remove the first part, along with the last part.
So, explode function above does what I need to remove the end
now i need something that says
FIND 'Content-Disposition: inline' and remove along with anything before
then
FIND '------=_' and remove along with anything after
then
Anything remaining = $message_short
echo $message_short;
will look like
I hate u
Any help would be appreciated.
Thanks @Marcus
This is the code i now have and works wonderfully.
$body_content2 = imap_qprint(imap_body($gminbox, $email_number));
$body_content2 = strip_tags ($body_content2);
$tmobile_body = explode("------=_", $body_content2);
$tmobile_body_short = explode("Content-Disposition: inline", $tmobile_body[1]);
$tmobile_body_complete1 = trim($tmobile_body_short[1]);
$tmobile_body_complete2 = explode("T-Mobile", $tmobile_body_complete1);
$tmobile_body_complete3 = trim($tmobile_body_complete2[1]);
If you want to do it with explode and explode on "Content-Disposition: inline" here's how you can do:
$message_short = explode("------=_", $body_content);
$message_shorter = explode("Content-Disposition: inline", $message_short[1]);
$content = trim($message_shorter[1]);
// $content == "I hate u"
Note that this requires Content-Disposition: inline to be the last header line before the message text.
If you want to solve it using regex and use the double line break as a delimiter, here's a code snippet for you:
$pattern = '/(?<=------=_Part_564200_22135560\.1319730112436)(?:.*?\s\s\s)(.*?)(?=------=_Part_564200_22135560\.1319730112436)/is';
preg_match_all($pattern, $body_content, $matches);
echo($matches[1][0]);
Output:
I hate u
Here's a link to codepad with sample text.
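Both approaches above depend on either the header order or a hard-coded boundary string. A looser sketch is to cut the message at each boundary marker and then separate each part's headers from its body at the first empty line. It assumes that an empty line follows the headers of each part, as is usual for MIME messages; the function name firstPlainTextBody and the shortened sample message are hypothetical:

```php
<?php
// Return the body of the first text/plain part, or null if there is none.
function firstPlainTextBody(string $raw): ?string
{
    // Cut the message at each boundary marker, as the explode() answers do.
    foreach (explode('------=_', $raw) as $part) {
        // Headers end at the first empty line; the limit of 2 keeps the body whole.
        $pieces = preg_split('/\r?\n\r?\n/', $part, 2);
        if (count($pieces) === 2
            && stripos($pieces[0], 'Content-Type: text/plain') !== false) {
            return trim($pieces[1]);
        }
    }
    return null;
}

// Shortened, hypothetical sample message with CRLF line endings.
$raw = "------=_Part_1\r\n"
     . "Content-Type: text/plain; charset=utf-8\r\n"
     . "Content-Disposition: inline\r\n"
     . "\r\n"
     . "I hate u\r\n"
     . "------=_Part_1\r\n"
     . "Content-Type: image/gif\r\n"
     . "\r\n"
     . "(base64 data)\r\n";

echo firstPlainTextBody($raw), "\n";
```

For real mailboxes, a dedicated MIME parser is likely to be more reliable than string splitting, since boundaries and encodings vary between senders.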
