How to extract text in php using regex [closed]


My Text :
12a49803-713c-4204-a8e6-248e554a352d_ Content-Type: text/plain; charset="iso-8859-6" Content-Transfer-Encoding: base64 DQrn0Ocg0dPH5MkgyszR6sjqySDl5iDH5OfoyuXq5A0KDQrH5OTaySDH5NnRyOrJIOXP2ejlySAx MDAlDQogCQkgCSAgIAkJICA= --_12a49803-713c-4204-a8e6-248e554a352d_ Content-Type: text/html; charset="iso-8859-6" Content-Transfer-Encoding: base64 PGh0bWw+DQo8aGVhZD4NCjxzdHlsZT48IS0tDQouaG1tZXNzYWdlIFANCnsNCm1hcmdpbjowcHg7
I want to extract iso-8859-6

You could do:
preg_match('/charset="([^"]+)"/', $string, $m);
echo $m[1];
Edit: To capture every occurrence rather than just the first (prompted by the other answer), use preg_match_all instead:
preg_match_all('/charset="([^"]+)"/', $string, $m);
print_r($m);
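The capture-group approach above can be sketched as a small function. This is only an illustration: the helper name extractCharsets is hypothetical, $string is a shortened stand-in for the message in the question, and the pattern assumes every charset value is wrapped in double quotes.

```php
<?php
// Hypothetical helper: returns every distinct charset="..." value found in $text.
function extractCharsets(string $text): array
{
    // preg_match_all() returns the number of matches (0 if none) or false on error.
    if (!preg_match_all('/charset="([^"]+)"/i', $text, $m)) {
        return [];
    }
    // $m[1] holds the first capture group of every match.
    return array_values(array_unique($m[1]));
}

// Shortened stand-in for the message in the question (assumption).
$string = 'Content-Type: text/plain; charset="iso-8859-6" Content-Transfer-Encoding: base64 '
        . '--_boundary_ Content-Type: text/html; charset="iso-8859-6"';

$charsets = extractCharsets($string);
print_r($charsets);
```

De-duplicating with array_unique is a design choice for this sketch; drop it if you need one entry per MIME part.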

The regex you are looking for is:
iso[^"]+
The php code you need is:
<?php
$subject='12a49803-713c-4204-a8e6-248e554a352d_ Content-Type: text/plain; charset="iso-8859-6" Content-Transfer-Encoding: base64 DQrn0Ocg0dPH5MkgyszR6sjqySDl5iDH5OfoyuXq5A0KDQrH5OTaySDH5NnRyOrJIOXP2ejlySAx MDAlDQogCQkgCSAgIAkJICA= --_12a49803-713c-4204-a8e6-248e554a352d_ Content-Type: text/html; charset="iso-8859-6" Content-Transfer-Encoding: base64 PGh0bWw+DQo8aGVhZD4NCjxzdHlsZT48IS0tDQouaG1tZXNzYWdlIFANCnsNCm1hcmdpbjowcHg7';
$pattern='/iso[^"]+/m';
if (preg_match($pattern, $subject, $match))
echo $match[0];
?>
The output is:
iso-8859-6

If you want both matches (the string contains two) and need to iterate over them, you can do something like this.
I also used single quotes so that the double quotes inside the regex do not need escaping, and applied ridgerunner's suggestions as well.
preg_match_all('/charset="([^"]+)"/', $subject, $result, PREG_PATTERN_ORDER);
for ($i = 0; $i < count($result[0]); $i++) {
# Matched text = $result[0][$i];
}

Related

How can I evaluate php code in a string variable? [closed]

If I use the below code, I get the literal string <b> DATE: <?php echo $date; ?> </b> appended to content:
$content .= '<b> DATE: <?php echo $date; ?> </b>';
$pdf->writeHTML($content);
How can I instead get the value of $date there?

You are trying to put php code inside a string which will not be evaluated again. Try this:
$content .= '<b> DATE: ' . $date . '</b>';
$pdf->writeHTML($content);
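As a rough illustration of why the original line printed the PHP tag literally, the sketch below compares concatenation, double-quoted interpolation, and a single-quoted literal. The sample value of $date is an assumption, and the writeHTML call is left out because it depends on the PDF library used in the question.

```php
<?php
// Assumed sample value; in the question $date comes from elsewhere.
$date = '2024-01-31';

// Concatenation, as in the answer above.
$viaConcat = '<b> DATE: ' . $date . '</b>';

// Double-quoted strings interpolate variables; single-quoted ones do not.
$viaInterpolation = "<b> DATE: {$date}</b>";
$literal = '<b> DATE: {$date}</b>';

echo $viaConcat, "\n";
echo $viaInterpolation, "\n";
echo $literal, "\n";
```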

PHP prevent strip_tags from removing broken tags

I have the same situation as this guy.
Basically strip_tags removes tags including broken tags (the term used in the documentation). Is there another way of doing this that doesn't involve removing < and any text after it if it's not an HTML tag?
I'm currently doing this:
$description = "<p>I am currently <30 years old.</p>";
$body = strip_tags(html_entity_decode($description, ENT_QUOTES, "UTF-8"), "<strong><em><u>");
echo $body;
But the code above will break something like:
<p>I am currently <30 years old.</p>
Into:
I am currently

The HTML you have as input is invalid, so it needs fixing first. You could replace each unclosed < with &lt; and then run html_entity_decode after strip_tags:
$description = "<p>I am currently <30 years old.</p>";
$description = preg_replace("/<([^>]*(<|$))/", "&lt;$1", $description);
$body = html_entity_decode(strip_tags($description, "<strong><em><u>"),
ENT_NOQUOTES, "UTF-8");
echo $body;
Alternatively you could use a DOM parser, which in some cases could give better results, but you'll still need to apply the fix first:
$description = "<p>I am currently <30 years old.</p>";
$description = preg_replace("/<([^>]*(<|$))/", "&lt;$1", $description);
$doc = new DOMDocument();
$doc->loadHTML($description);
$body = $doc->documentElement->textContent;
echo $body;

When less-than and greater-than signs appear as operators, they are nearly always followed by numbers (especially likely here, since you have since said there are no spaces involved). Assuming this is your case, you could use preg_replace to escape such occurrences before running the text through strip_tags:
$description = "<p>I am currently <30 years old.</p>";
$description = preg_replace("/<([0-9]+)/", "&lt;$1", $description);
$body = strip_tags($description, "<strong><em><u>");
echo $body;
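The escape-then-strip idea from the first answer can be wrapped in one function. This is a sketch only: the name stripTagsKeepLessThan is hypothetical, and it has been checked only against the sample string from the question, not against input with several stray < characters.

```php
<?php
// Hypothetical helper: escape a stray "<", strip real tags, decode entities.
function stripTagsKeepLessThan(string $html, string $allowed = ''): string
{
    // A "<" that meets another "<" or the end of input before any ">"
    // cannot open a tag, so turn it into the entity &lt;.
    $escaped = preg_replace('/<([^>]*(<|$))/', '&lt;$1', $html);
    // Strip the real tags, then decode entities back to characters.
    return html_entity_decode(strip_tags($escaped, $allowed), ENT_NOQUOTES, 'UTF-8');
}

$description = '<p>I am currently <30 years old.</p>';
$body = stripTagsKeepLessThan($description, '<strong><em><u>');
echo $body;
```

Note that the decoded result contains a raw < again, so escape it with htmlspecialchars before echoing it into an HTML page.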

How to find string after any tag in the match [closed]

My string is
grunds&\#x00E4;tzlich Prek&\#x00E4;re&\#x201C;.<a id="cein_fn31"
href="einleitung.html#ein_fn31"><sup>31</sup></a> Nur in den
mannigfaltigen Spielarten des Festlichen ist
I need a regex pattern that matches all the content after the last closing tag, i.e.:
Nur in den mannigfaltigen Spielarten des Festlichen ist
Note: the tag name may vary.
Any ideas would be appreciated.

This program works by finding all characters that are not angle brackets <> at the end of the string.
use strict;
use warnings;
my $s = <<'__END_STRING__';
grunds&\#x00E4;tzlich Prek&\#x00E4;re&\#x201C;.<a id="cein_fn31"
href="einleitung.html#ein_fn31"><sup>31</sup></a> Nur in den
mannigfaltigen Spielarten des Festlichen ist
__END_STRING__
my ($subtext) = $s =~ /([^<>]*)\z/;
print $subtext;
output
Nur in den
mannigfaltigen Spielarten des Festlichen ist

An alternative, regex-free solution
The code uses strrpos() to find the last > and substr() to keep only what follows it.
echo $str = trim(strip_tags(substr($str,strrpos($str,'>')+1)));
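A self-contained version of the strrpos() approach might look like the sketch below. The input is a shortened stand-in for the string in the question, and the fallback for input without any > is an addition that the answer does not cover.

```php
<?php
// Shortened stand-in for the string in the question (assumption).
$str = 'Prek&#x00E4;re.<a id="cein_fn31" href="einleitung.html#ein_fn31"><sup>31</sup></a> Nur in den' . "\n"
     . 'mannigfaltigen Spielarten des Festlichen ist';

// strrpos() returns the offset of the last ">", or false when there is none.
$pos = strrpos($str, '>');

// Keep only what follows the last ">"; fall back to the whole string otherwise.
$after = $pos === false ? $str : substr($str, $pos + 1);
$after = trim($after);

echo $after;
```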

You can also try the code below.
<?php
$string = "<a>Hii</a>
<div>div tag here</div>
<p>para tag here</p>";
$string = str_replace(">","&#*#&", $string);
$string_arr = explode("&#*#&", $string);
$final_arr = array();
for($i=0; $i<count($string_arr); $i++){
$string_arr[$i] = trim(strip_tags($string_arr[$i]));
if($string_arr[$i] != ''){
$final_arr[] = $string_arr[$i];
}
}
print_r($final_arr);
?>

Limited Characters on a line [closed]

I am designing a small application that displays the five latest status-like tweets from users. To keep it compact, I need it to accept only so many characters on a line and then continue on a new line just below. So for example:
Noah: The best thing about Stackoverflow
is that it is full of amazing programmers.
Something along those lines. Can you help me with the code below:
echo "<div style='position:relative;top:-20px;padding-top:10px;padding-left:30px;padding-right:30px;'>";
echo "<p> $first_name: $body. </p>";
if (strlen($body <= 100)) {
echo "\n";
}
echo "</div>";
}

wordwrap will do that for you
string wordwrap ( string $str [, int $width = 75 [, string $break = "\n" [, bool $cut = false ]]] )
Example:
$text = "A very long woooooooooooord.";
$newtext = wordwrap($text, 8, "\n", true);
echo "$newtext\n";
A very
long
wooooooo
ooooord.
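Applied to the question's scenario, wordwrap() could be used roughly as follows. The sample values and the width of 45 are assumptions chosen for illustration, and nl2br() is added here because browsers collapse plain newlines in HTML.

```php
<?php
// Assumed sample values standing in for the question's variables.
$first_name = 'Noah';
$body = 'The best thing about Stackoverflow is that it is full of amazing programmers';

// Wrap at 45 characters (arbitrary width for this sketch); words are kept whole.
$wrapped = wordwrap("$first_name: $body.", 45, "\n");

// Escape the text, then turn each "\n" into <br /> so the break shows in HTML.
echo '<p>' . nl2br(htmlspecialchars($wrapped)) . '</p>';
```

Passing true as the fourth argument, as in the example above, would additionally split single words longer than the width.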

remove text before and after the text i want - PHP

here is my example text
------=_Part_564200_22135560.1319730112436
Content-Type: text/plain; charset=utf-8; name=text_0.txt
Content-Transfer-Encoding: 7bit
Content-ID: <314>
Content-Location: text_0.txt
Content-Disposition: inline
I hate u
------=_Part_564200_22135560.1319730112436
Content-Type: image/gif; name=dottedline350.gif
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename=dottedline350.gif
Content-ID: <dottedline350.gif>
I need to be able to extract the "I hate u".
I know I can use explode to cut off the part after the message by doing
$body_content = garbled crap above;
$message_short = explode("------=_",$body_content);
echo $message_short;
but i need to remove the first part, along with the last part.
So, explode function above does what I need to remove the end
now i need something that says
FIND 'Content-Disposition: inline' and remove along with anything before
then
FIND '------=_' and remove along with anything after
then
Anything remaining = $message_short
echo $message_short;
will look like
I hate u
Any help would be appreciated.
Thanks @Marcus
This is the code I now have, and it works wonderfully.
$body_content2 = imap_qprint(imap_body($gminbox, $email_number));
$body_content2 = strip_tags ($body_content2);
$tmobile_body = explode("------=_", $body_content2);
$tmobile_body_short = explode("Content-Disposition: inline", $tmobile_body[1]);
$tmobile_body_complete1 = trim($tmobile_body_short[1]);
$tmobile_body_complete2 = explode("T-Mobile", $tmobile_body_complete1);
$tmobile_body_complete3 = trim($tmobile_body_complete2[1]);

If you want to do it with explode, splitting on "Content-Disposition: inline", here's how you can do it:
$message_short = explode("------=_", $body_content);
$message_shorter = explode("Content-Disposition: inline", $message_short[1]);
$content = trim($message_shorter[1]);
// $content == "I hate u"
Note that this requires Content-Disposition: inline to be the last header line before the message body.
If you want to solve it using a regex, with the double line break as a delimiter, here's a code snippet for you:
$pattern = '/(?<=------=_Part_564200_22135560.1319730112436)(?:.*?\s\s\s)(.*?)(?=------=_Part_564200_22135560.1319730112436)/is';
preg_match_all($pattern, $body_content, $matches);
echo($matches[1][0]);
Output:
I hate you
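The explode-based approach can also be written as a function that reports a missing marker instead of raising an undefined-index notice. This is a sketch under assumptions: the helper name extractInlineBody is hypothetical, it splits on the marker first and the boundary second (the reverse of the order above), and the sample input restores the blank line that normally separates MIME headers from the body.

```php
<?php
// Hypothetical helper: returns the text between the marker header and the next boundary.
function extractInlineBody(string $raw, string $marker = 'Content-Disposition: inline'): ?string
{
    // Everything after the first occurrence of the marker line.
    $parts = explode($marker, $raw, 2);
    if (count($parts) < 2) {
        return null; // marker not present
    }
    // Cut at the next MIME boundary, if there is one.
    $body = explode('------=_', $parts[1], 2)[0];
    return trim($body);
}

$body_content = <<<'EOT'
------=_Part_564200_22135560.1319730112436
Content-Type: text/plain; charset=utf-8; name=text_0.txt
Content-Transfer-Encoding: 7bit
Content-ID: <314>
Content-Location: text_0.txt
Content-Disposition: inline

I hate u
------=_Part_564200_22135560.1319730112436
Content-Type: image/gif; name=dottedline350.gif
Content-Transfer-Encoding: base64
EOT;

$message = extractInlineBody($body_content);
echo $message;
```

Like the original, this still assumes that Content-Disposition: inline is the last header of the text part.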
