Remove div php - part of string [duplicate] - php

I want to remove the part of a string between two HTML tags. I have something like this:
$variable = "This is something that I don't want to delete<blockquote>This is I want to delete </blockquote>";
The problem is that the text inside the blockquote tag changes, and it needs to be deleted no matter what it is. Does anyone know how?

Regular expressions are not the best tool for parsing an HTML string; you should take a look at Simple HTML DOM Parser or PHP's DOMDocument class.
If you still want to use a regex in this case, it would be, for example:
$variable = preg_replace('/<blockquote>.+<\/blockquote>/siU', '', $variable);

You can use regular expressions, but this is by no means fail-safe and should only be used in trivial cases. A better way is to use a full-fledged HTML parser.
<?php
$str = preg_replace('#<blockquote>.*</blockquote>#siU', '', $str);
?>
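The parser-based approach recommended above could look roughly like the following. This is a minimal sketch using DOMDocument, not a definitive implementation: the helper name removeElements and the wrapper div are assumptions made for illustration, and the sample string is the one from the question.

```php
<?php
// Hypothetical helper (name chosen for illustration): removes every element
// with the given tag name, together with its contents, from an HTML fragment.
function removeElements(string $html, string $tagName): string
{
    $doc = new DOMDocument();
    // Suppress warnings about imperfect markup while parsing.
    libxml_use_internal_errors(true);
    // Wrap the fragment in a div so its children can be serialized back out.
    $doc->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();

    $wrapper = $doc->getElementsByTagName('div')->item(0);

    // Copy the live node list to an array before removing nodes from it.
    foreach (iterator_to_array($wrapper->getElementsByTagName($tagName)) as $node) {
        $node->parentNode->removeChild($node);
    }

    // Serialize only the wrapper's children, not the added html/body/div.
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

$variable = "This is something that I don't want to delete<blockquote>This is I want to delete </blockquote>";
$cleaned = removeElements($variable, 'blockquote');
echo $cleaned;
```

Unlike the regex versions, this also copes with attributes on the tag and with nested markup inside it.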

Try using a regex this way. Note that this version removes the text inside the blockquote but keeps the tags themselves:
<?php
$variable = "This is something that I don't want to delete<blockquote>This is I want to delete </blockquote>";
$str = preg_replace('#(<blockquote>).*?(</blockquote>)#', '$1$2', $variable);
print($str);
?>

Related

"\/" text shows up within my JSON link values [duplicate]

My database contains links to images which are displayed properly within their structure. When I run my PHP code, the outputted JSON values are the same image links which fail to load because the links keep being outputted like this:
https:\/\/i.ebayimg.com\/00\/s\/NDQwWDgwMA==\/z\/ViAAAOSwhmtbN7fe\/$_59.JPG\r\n
Even though the database displays it like this:
https://i.ebayimg.com/00/s/NDQwWDgwMA==/z/ViAAAOSwhmtbN7fe/$_59.JPG
Is there something wrong with my PHP code?
You can use PHP's str_replace() function to remove the backslashes (\):
$your_string = str_replace("\\", "", $your_string);
Simply use stripslashes().
First remove the literal \r\n from the link/URL using str_replace(), then apply stripslashes():
$link = 'https:\/\/i.ebayimg.com\/00\/s\/NDQwWDgwMA==\/z\/ViAAAOSwhmtbN7fe\/$_59.JPG\r\n';
$link = stripslashes( str_replace("\\r\\n", '', $link) );
echo $link;
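The \/ sequences are a valid JSON escape that json_encode() produces by default, so any JSON parser turns them back into plain slashes. If unescaped output is still wanted, a sketch along these lines may help. It assumes the JSON is produced with json_encode(); the $rows array and the "image" key are hypothetical stand-ins for the database result.

```php
<?php
// Hypothetical row as it might come from the database, with a trailing CRLF.
$rows = [
    ['image' => 'https://i.ebayimg.com/00/s/NDQwWDgwMA==/z/ViAAAOSwhmtbN7fe/$_59.JPG' . "\r\n"],
];

// Trim stray whitespace (including \r\n) from each value before encoding.
$clean = array_map(
    function (array $row): array {
        return array_map('trim', $row);
    },
    $rows
);

// JSON_UNESCAPED_SLASHES stops json_encode() from writing "/" as "\/".
$json = json_encode($clean, JSON_UNESCAPED_SLASHES);
echo $json;

// Even without the flag, decoding restores the original string.
$roundTrip = json_decode(json_encode($rows), true);
```

Cleaning the data before encoding avoids having to strip backslashes from the JSON text afterwards, which would also destroy legitimate escapes.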

find/replace in PHP string [duplicate]

I have a PHP string in which I would like to find and replace using the strtr function. The problem is that I have variable fields, so I won't be able to replace by name. The string contains tags like the following:
[field_1=Company]
[field_4=Name]
What makes it difficult is the "Company" and "Name" part of the tag, which can vary. So I am basically looking for a way to replace the [field_1] part, where "=Company" and "=Name" must be discarded. Can this be done?
To explain: I'm using "=Company" so users don't just see "field_1" but know the value it represents. However, users are able to change the value to whatever they see fit.
You are probably looking for regular expressions. There is a function in PHP to do a regex replace:
http://php.net/manual/en/function.preg-replace.php
It has been a while since I've worked in PHP, but you might want to try something like this:
preg_replace('/field_\d/','REPLACEMENT','[field_1=Company]');
This should result in:
[REPLACEMENT=Company]
If you want to replace everything except the brackets:
preg_replace('/field_\d+=\w+/','REPLACEMENT','[field_1=Company]');
which should result in:
[REPLACEMENT]
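If each field should be replaced by its own value, preg_replace_callback() can look the value up by field number and ignore whatever label follows the equals sign. The following is only a sketch: the $values array, the template text, and the choice to leave unknown fields untouched are all assumptions for illustration.

```php
<?php
// Hypothetical replacement values, keyed by field number.
$values = [1 => 'Acme Ltd', 4 => 'Jane Doe'];

// Hypothetical template using the tag format from the question.
$template = 'Dear [field_4=Name] of [field_1=Company]';

$result = preg_replace_callback(
    // Capture the field number; the "=Label" part is optional and discarded.
    '/\[field_(\d+)(?:=[^\]]*)?\]/',
    function (array $m) use ($values): string {
        $id = (int) $m[1];
        // Leave the tag untouched when no value is known for this field.
        return $values[$id] ?? $m[0];
    },
    $template
);

echo $result;
```

Because the label is matched with [^\]]*, users can rename "Company" or "Name" to anything that does not contain a closing bracket.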

Capturing text within HTML tag using PHP and preg_match [duplicate]

I am hitting a roadblock with a script that checks availability on a certain website. I need the text within an HTML tag and I am unsure how to approach it.
The code I tested ended up like this:
<?php
ini_set("allow_url_fopen", 1);
$homepage2 = file_get_contents('https://www.someurlwithavailability.com');
//URL has the following HTML tag: <div id="Availability">
Availability: Special Offer, ships within 10 - 15 business days </div>"
preg_match("/<div id="Availability">(.*?)</div>/si", $homepage2, $avail);
print_r($avail);
echo '<br>', '~Availability is~', '<br>', $avail, '<br>';
$stringavail=implode(" ",$avail);
echo $stringavail;
?>
I get various errors depending on what I put after preg_match(***,$homepage2, $avail); and I am unsure about what syntax I need to enter to retrieve the text.
My code above gives me this:
Parse error: syntax error, unexpected 'Availability' (T_STRING) in /u/o/placeiamrunningthecodefrom.php on line 6
The URL that is requested comes back with a full HTML page that is quite large. This HTML tag is unique and does not repeat.
Anyone able to help me out?
Although this can be done with a regex, it is neither recommended nor easier.
I'd suggest giving DOMDocument::getElementById a go. It even has an example right on the manual page:
$doc = new DOMDocument;
// We need to validate our document before referring to the id
$doc->validateOnParse = true;
$doc->load('book.xml');
echo "The element whose id is 'php-basics' is: " . $doc->getElementById('php-basics')->tagName . "\n";
To get the content instead of the tagName, we can use ->textContent, which is inherited from DOMNode.
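Applied to the HTML from the question, that could look like the sketch below. It loads the markup with loadHTML() instead of load(), and the inline $html string is an assumed stand-in for the page fetched with file_get_contents(). The XPath fallback is an extra precaution added here, not part of the manual's example.

```php
<?php
// Stand-in for the downloaded page (assumption: the real page is much larger).
$html = '<html><body><div id="Availability">
Availability: Special Offer, ships within 10 - 15 business days </div></body></html>';

$doc = new DOMDocument();
// Real-world pages are rarely valid; collect parser warnings instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

// Look the element up by id, falling back to an XPath query if that returns null.
$node = $doc->getElementById('Availability')
    ?? (new DOMXPath($doc))->query('//*[@id="Availability"]')->item(0);

// textContent returns the text of the element without any markup.
$text = $node !== null ? trim($node->textContent) : null;
echo $text;
```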
Try using single quotes around that pattern.
Also, make sure you escape the regex delimiter where it appears inside the pattern (the / in </div>).
Finally, (.*?) matches any characters, including nested tags, up to the first </div>. If you only expect plain text inside the div, you can be more specific:
'/<div id="Availability">([^<]*)<\/div>/si'
instead of
"/<div id="Availability">(.*?)</div>/si"
Of course, this could still be unreliable if there is HTML inside the <div>.
But this should get you closer.
Also, try an online regex tool. I like this one:
https://regex101.com/
The problem is that you have double quotes inside your double-quoted string, and didn't escape them:
preg_match("/<div id="Availability">(.*?)</div>/si", $homepage2, $avail);
                     ^            ^
If you used a decent IDE it would have alerted you to this as you were typing.
Simply change the delimiting quotes to single quotes.
Also, since your regexp delimiter / appears in the regular expression, you either need to escape the character where it appears in the regexp, or use a delimiter that isn't in the expression.
preg_match('#<div id="Availability">(.*?)</div>#si', $homepage2, $avail);
However, using regular expressions to parse HTML is generally a bad idea. You should use a DOM parser library like the DOMDocument class.
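For completeness, here is a sketch of how the corrected regex call could be used. The text is in capture group 1 of the match array, so $avail[1] is read rather than echoing the whole array. The inline $homepage2 string is an assumed stand-in for the downloaded page.

```php
<?php
// Stand-in for the result of file_get_contents() (assumption for illustration).
$homepage2 = '<div id="Availability">
Availability: Special Offer, ships within 10 - 15 business days </div>';

// Single-quoted pattern with # as delimiter, so neither " nor / needs escaping.
if (preg_match('#<div id="Availability">(.*?)</div>#si', $homepage2, $avail)) {
    // $avail[0] is the whole match; $avail[1] is the first capture group.
    $availability = trim($avail[1]);
} else {
    $availability = null;
}

echo $availability;
```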

Why does PHP function strip_tags() remove data that is not tags? How to avoid this?

This code:
$input = 'I love <3 PHP!';
echo strip_tags($input);
Outputs:
I love
Is there a PHP function (or anyone's custom function) which would remove only tags (that means properly closed tags), not everything preceded by < ?
Why does PHP function strip_tags() remove data that is not tags?
It errs on the side of security.
How to avoid this?
If you are expecting text input, use htmlspecialchars to escape < characters (and a few others) instead of removing them.
Try htmlspecialchars; it will still show the tags, but converted to HTML entities.
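As a small illustration of the escaping approach, using the input from the question (the ENT_QUOTES flag and the explicit UTF-8 charset are choices made for this sketch, not something the answers specify):

```php
<?php
$input = 'I love <3 PHP!';

// Escape HTML special characters instead of stripping anything.
$escaped = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

echo $escaped; // I love &lt;3 PHP!
```

A browser renders &lt;3 as <3, so the user's text is displayed in full and no markup is interpreted.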
As of PHP 5, the Tidy extension is available in most compiled binaries. It is not 100% effective, but it could help in this case. Tidy tries to close all unclosed HTML tags in a string. Once the tag is closed, you can tell strip_tags() to ignore it, and then strip out the closing bracket that Tidy put in.
Tidy documentation
// Untested sketch: assumes Tidy turns the stray "<3" into "<3>"; actual Tidy output may differ
$str = tidy_repair_string("I <3 PHP");
// the second parameter tells strip_tags() to keep the closed tag <3>
$str = strip_tags($str, '<3>');
$str = str_replace('<3>', '<3', $str);
echo $str;

php regex for parsing html [duplicate]

I need some help parsing HTML: I want to extract everything that starts with http:// and contains "abc", up to the first occurrence of " or ' or a blank space.
I have a regex like this, /http:\/\/abc(.*)\"/, but it's not working well.
Are there any ideas?
P.S. Sorry for my bad English; it's not my native language.
Stack Overflow tends to prefer an HTML document parser over regular expressions for parsing HTML.
However, with that said, if you just want URLs from a string that happens to be HTML, I still believe a Regex is fine for the job.
Try preg_match_all:
preg_match_all("/http:\/\/[^\s'\"]*abc[^\s'\"]*/", $string, $matches);
Use a parser instead of a regex.
RegEx match open tags except XHTML self-contained tags
If all you want to do is extract URLs, regexen are a good choice. You don't need to get into the parser world.
If you have unix-like command tools you could approximate it very simply (assuming one url per line) with two passes:
grep http myfile.html | grep abc
You can use preg_grep() similarly.
preg_match_all('/http:[^"\' ]+/', $html, $urls);
// $urls[0] contains all the URLs from your document
$abc_urls = preg_grep('/abc/', $urls[0]);
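To see the single-pass regex from the first answer in action, here is a small self-contained sketch. The sample $html string and its URLs are made up for illustration.

```php
<?php
// Made-up sample markup: two URLs contain "abc", one does not.
$html = '<a href="http://example.com/abc/page">x</a> '
      . '<img src=\'http://example.com/other.png\'> '
      . 'http://abc.example.org/path text';

// Match http:// followed by non-space, non-quote characters that include "abc".
preg_match_all('/http:\/\/[^\s\'"]*abc[^\s\'"]*/', $html, $matches);

// Index 0 holds the full matches.
$urls = $matches[0];
print_r($urls);
```

Because the character classes exclude whitespace and both quote characters, each match stops at the end of its attribute value or at the next space.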
