Substr String With HTML In PHP - php

I have some text that comes back from my database like so:
<span style="color: rgb(61, 36, 36); font-family: 'Frutiger Neue W01 Book', 'Helvetica Neue', Helvetica, Arial, sans-serif; line-height: 23.8px;">The Department of ...
I use echo html_entity_decode($item->body); to display:
The Department of ...
However, if I use the PHP substr function on this content it never displays correctly. It will display the first x characters of HTML and not the HTML formatted text.
Here's what I tried: echo substr(html_entity_decode($item->body), 0, 5);
But it doesn't display anything. If I try an amount like 0, 200); it will display:
The Department of Molec
But this is most definitely not the first 200 characters of the formatted text because the first character is T.
My idea is that there must be a way to format the text first and then substr it, even though I can't get it to work using html_entity_decode() and substr() by themselves.
Can anybody help me out here? Thanks!

Try to use this instead of html_entity_decode():
strip_tags($item->body);
strip_tags removes all HTML tags from the string, so you are better off cleaning the string first and then working with it.
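A minimal sketch of that idea, assuming (as the question suggests) that the database value is entity-encoded HTML, so it has to be decoded before strip_tags() can see any tags. The helper name plain_excerpt and the sample string are hypothetical.

```php
<?php
// Hypothetical helper: return the first $length characters of the visible text.
function plain_excerpt(string $encodedHtml, int $length): string
{
    // Turn "&lt;span&gt;" back into "<span>" so strip_tags() can see the tags.
    $html = html_entity_decode($encodedHtml, ENT_QUOTES, 'UTF-8');
    // Drop the tags, keeping only the text between them.
    $text = trim(strip_tags($html));
    // substr() counts bytes; for multibyte text, mb_substr() is the safer choice.
    return substr($text, 0, $length);
}

// Sample value standing in for $item->body (made up for this sketch).
$body = '&lt;span style=&quot;line-height: 23.8px;&quot;&gt;The Department of Molecular Biology&lt;/span&gt;';
echo plain_excerpt($body, 5), "\n";
```

Because the tags are removed before the cut, the offset passed to substr() refers to the text the reader actually sees.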

You will see the output in the page source, but it is not being rendered. The source code will show:
echo substr(html_entity_decode($item->body), 0, 5);
// Output: "<span"
What you probably want to do is search for the end of the html-tag, and display 5 characters after that, like:
$text = html_entity_decode($item->body);
$start = strpos( $text, '>' ) + 1;
echo substr( $text, $start, 5 );

Related

Everything between "[reply]" and extract the reply number

Please take a look at the following situation below.
[reply="292"] Text Here [/reply]
What I am trying to get is the number between the quotations in reply="NUMBERS". I want to extract that to one variable and the text between [reply="NUMBER"] this text here [/reply] to another variable.
So for this example:
[reply="292"] Text Here [/reply]
I want to extract the reply number: 292 and the text between the reply tags: Text here.
I have tried this:
\[reply\=\"]([A-Z]\w)\[\/reply]
But this only works until the reply tag, doesn't work after that. How can I go about doing this?
I left the groups generic (.*), but you can specify a type such as decimal digits (\d+).
php:
$s = '[reply="292"] Text Here [/reply]';
$expr = '/\[reply=\"(.*)\"\](.*)\[\/reply\]/';
if(preg_match($expr,$s,$r)){
var_dump($r);
}
javascript:
s = '[reply="292"] Text Here [/reply]'
s.match(/\[reply=\"(.*)\"\](.*)\[\/reply\]/)
//["[reply="292"] Text Here [/reply]", "292", " Text Here "]
Easy!
\[reply\=\"(\d+)\"]([\w\s]+)\[\/reply]
Explanation
\d matches a digit
+ means one or more occurrences of the preceding token
[\w\s] matches any word character (\w) or whitespace character (\s)
Then apply it to PHP like this:
<?php
$str = "[reply=\"292\"] Text Here [/reply]";
preg_match('/\[reply\=\"(\d+)\"]([\w\s]+)\[\/reply]/', $str, $re);
print_r($re[1]); // printing group 1, the reply number
print_r($re[2]); // printing group 2, the text
?>
Important!!
Read only the captured groups ($re[1] and $re[2]), not the whole match in $re[0]. You only need those parts anyway.
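The answers above can be combined into a small function. This is only a sketch: the name parse_reply and the returned array keys are hypothetical, and it uses a lazy (.*?) for the text so that punctuation inside the reply is also accepted.

```php
<?php
// Hypothetical helper: split a [reply="N"]...[/reply] tag into its number and text.
function parse_reply(string $input): ?array
{
    // Group 1: the digits inside the quotes. Group 2: everything up to the
    // first closing tag (lazy), so several reply tags in one string stay separate.
    if (preg_match('/\[reply="(\d+)"\](.*?)\[\/reply\]/s', $input, $m) !== 1) {
        return null; // no reply tag found
    }
    return ['number' => (int) $m[1], 'text' => trim($m[2])];
}

$reply = parse_reply('[reply="292"] Text Here [/reply]');
echo $reply['number'], ': ', $reply['text'], "\n";
```

Returning null when nothing matches lets the caller tell "no reply tag" apart from an empty reply.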

How to display only the text with substr()

In my database I have saved an article that contains HTML such as <br>, <table>, etc.
Example:
<tr>
<td style="width:50%;text-align:left;">Temas nuevos - Comunidad</td>
<td class="blocksubhead" style="width:50%;text-align:left;">Temas actualizados - Comunidad Temas actualizados - Comunidad</td>
</tr>
What I want is to display a summary of the article on another screen using substr(). My problem is that I cannot print what I want, because the HTML code is printed first.
Example: echo substr($row["news"], 0, 20);
It prints the first 20 characters of the markup, so the browser only shows:
<td style="width:50%;text-align:l<td/>
What I want is to show only the text and discard the HTML code around it.
Use strip_tags() to strip html etc from the string...
So: echo substr(strip_tags($row["news"]), 0, 20);
http://php.net/manual/en/function.strip-tags.php
You could also do it using preg_replace() to match and replace anything that looks like a tag :)
The strip_tags() function strips HTML, XML, and PHP tags from a string.
//remove the html from the string.
$row["news"] = strip_tags($row["news"]);
//getting the first 20 character from the string and display as output.
echo substr($row["news"], 0, 20);
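A slightly fuller sketch of the same approach, under the assumption that the summary should also collapse the whitespace left behind by the table markup and mark the cut with an ellipsis. The helper name summarize and the shortened sample row are hypothetical.

```php
<?php
// Hypothetical helper: plain-text summary cut to $limit characters plus an ellipsis.
function summarize(string $html, int $limit): string
{
    // Put a space in front of every tag first, so text from adjacent cells
    // does not run together once the tags are gone.
    $text = strip_tags(str_replace('<', ' <', $html));
    // Collapse the runs of spaces and newlines left behind by the markup.
    $text = trim(preg_replace('/\s+/', ' ', $text));
    if (strlen($text) <= $limit) {
        return $text;
    }
    // Cut at the limit and mark the truncation.
    return rtrim(substr($text, 0, $limit)) . '...';
}

// Shortened sample standing in for $row["news"].
$news = "<tr>\n<td style=\"width:50%;text-align:left;\">Temas nuevos - Comunidad</td>\n</tr>";
echo summarize($news, 20), "\n";
```

As with substr() in the answers above, the length is counted in bytes; mb_strlen() and mb_substr() would be the multibyte-aware equivalents.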

PHP replace : find and replace the same characters with different text

How can I find and replace the same characters in a string with two different characters? I.E. The first occurrence with one character, and the second one with another character, for the entire string in one go?
This is what I'm trying to do (so users need not type HTML in the body). I've used preg_replace here, but I'm willing to use anything else.
$str = '>>Hello, this is code>> Here is some text >>This is more code>>';
$str = preg_replace('#[>>]+#','[code]',$str);
echo $str;
//output from the above
//[code]Hello, this is code[code] Here is some text [code]This is more code[code]
//expected output
//[code]Hello, this is code[/code] Here is some text [code]This is more code[/code]
But the problem here is that both >> get replaced with [code]. Is it possible to somehow replace the first >> with [code] and the second >> with [/code] for the entire output?
Does php have something to do this in one go? How can this be done?
$str = '>>Hello, this is code>> Here is some text >>This is more code>>';
echo preg_replace( "#>>([^>]+)>>#", "[code]$1[/code]", $str );
The above will fail if something like the following is your input:
>>Here is code >to break >stuff>>
To deal with this, use negative lookahead:
#>>((?!>[^>]).+?)>>#
will be your pattern.
echo preg_replace( "#>>((?!>[^>]).+?)>>#", "[code]$1[/code]", $str );
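To check the lookahead pattern against both inputs mentioned above, it can be wrapped in a small function. The name wrap_code is hypothetical, and falling back to the input when preg_replace() returns null is an assumption of this sketch, not part of the answer.

```php
<?php
// Hypothetical helper: turn each >>...>> pair into [code]...[/code].
function wrap_code(string $str): string
{
    // .+? is lazy, so each match ends at the nearest closing >>;
    // a single > inside the pair is allowed.
    $result = preg_replace('#>>((?!>[^>]).+?)>>#', '[code]$1[/code]', $str);
    // preg_replace() returns null on a regex error; fall back to the input then.
    return $result ?? $str;
}

echo wrap_code('>>Hello, this is code>> Here is some text >>This is more code>>'), "\n";
echo wrap_code('>>Here is code >to break >stuff>>'), "\n";
```

The replacement string is single-quoted so that $1 reaches preg_replace() untouched.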

Keep all html whitespaces in php mysql

I want to know how to keep all the whitespace of a textarea in PHP (to send to the database) and then echo it back later. I want to do it the way Stack Overflow does for code. Which is the best approach?
For now I'm using this:
$text = str_replace(' ', '&nbsp;', $text);
It keeps the ' ' whitespace, but I haven't tested it together with mysql_real_escape_string and other injection-prevention methods.
For better understanding, i want to echo later from db something like:
function jack(){
    var x = "blablabla";
}
Thanks for your time.
Code Blocks
If you're trying to just recreate code blocks like:
function test($param){
    return TRUE;
}
Then you should be using <pre></pre> tags in your html:
<pre>
function test($param){
    return TRUE;
}
</pre>
As plain html will only show one space even if multiple spaces/newlines/tabs are present. Inside of pre tags spaces will be shown as is.
At the moment your html will look something like this:
function test($param){ return TRUE; }
Which I would suggest isn't desirable...
Escaping
mysql_real_escape_string() turns newline characters into the literal two-character sequences \n or \r\n. MySQL converts them back when it stores the value, but if you echo the escaped string itself your code would output something like:
function test($param){\n return TRUE;\n}
OR
<pre>function test($param){\n return TRUE;\n}</pre>
To get around this you have to replace the literal \n or \r\n strings with newline characters.
Assuming that you're going to use pre tags:
echo preg_replace('#(\\\r\\\n|\\\n)#', "\n", $escapedString);
If you want to switch to HTML line breaks instead, you'd have to replace "\n" with <br />. In that case you'd also want to replace space characters with &nbsp;, which is why I suggest using the pre tags.
Try this, it works well:
$string = nl2br(str_replace(" ", "&nbsp;", $string));
echo "$string";
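A minimal sketch of the pre-tag approach, assuming the text is stored unchanged (for example through a prepared statement, without any &nbsp; replacement) and only prepared for HTML when it is echoed. The helper name render_code_block is hypothetical, and the htmlspecialchars() call is an addition that the answers above do not mention.

```php
<?php
// Hypothetical helper: render stored text as a code block, whitespace intact.
function render_code_block(string $text): string
{
    // Escape <, >, & and quotes so the code is shown rather than interpreted.
    $safe = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
    // <pre> keeps spaces, tabs and newlines exactly as stored.
    return '<pre>' . $safe . '</pre>';
}

// Sample value standing in for the text read back from the database.
$stored = "function jack(){\n    var x = \"blablabla\";\n}";
echo render_code_block($stored), "\n";
```

Keeping the raw text in the database and escaping only on output means the same stored value can be shown in a textarea again for editing.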

php preg_match_all html dates with slashes error

I'm trying to preg_match_all a date with slashes in it, sitting between two HTML tags; however, it's returning null.
Here is the HTML:
<td width='40%' align='right'class='SmallDimmedText'>Last Login: 11/14/2009</td>
Here is my preg_match_all() code
preg_match_all('/<td width=\'40%\' align=\'right\' class=\'SmallDimmedText\'>Last([a-zA-Z0-9\s\.\-\',]*)<\/td>/', $h, $table_content, PREG_PATTERN_ORDER);
where $h is the html above.
what am i doing wrong?
thanks in advance
It (from a quick glance) is because you are trying to match:
Last Login: 11/14/2009
With this regex:
Last([a-zA-Z0-9\s\.\-\',]*)
The character class doesn't include : and /, both of which appear in the text. Changing that part of the regex to:
Last([a-zA-Z0-9\s\.\-\',:\/]*)
gives a match (the / is escaped because / is also the pattern delimiter).
Would it be better to simply use a DOM parser, and then perform the regex on the result of the DOM lookup? It makes for nicer regex...
EDIT
The other issue is that your HTML is:
...40%' align='right'class='SmallDimmedText'>...
Where there is no space between align='right' and class='SmallDimmedText'
However your regex for that section is:
...40%\' align=\'right\' class=\'SmallDimmedText\'>...
Where it is indicated there is a space.
Use a DOM parser. It will save you more headaches caused by subtle bugs than you can count.
Just to give you an idea on how simple it is to parse using Simple HTML DOM.
$html = str_get_html(...);
$elems = $html->find('.SmallDimmedText'); // find() returns an array of matches
if ( count($elems) != 1 ){
    throw new Exception('Too many/few elements found');
}
$text = $elems[0]->plaintext;
//parsing here is only an example, but you have removed all
//the html so that any regex used is really simple.
$date = substr($text, strlen('Last Login: '));
$unixTime = strtotime($date);
I see at least two problems:
In your HTML string there is no space between 'right' and class=, but there is one in your regex.
You must add at least these 3 characters to the list of matched characters, between the []:
':' (there is one between "Login" and the date),
' ' (there are spaces between "Last" and "Login", and between ":" and the date; \s already covers these),
and '/' (between the date parts).
With this code, it seems to work better:
$h = "<td width='40%' align='right'class='SmallDimmedText'>Last Login: 11/14/2009</td>";
if (preg_match_all("#<td width='40%' align='right'class='SmallDimmedText'>Last([a-zA-Z0-9\s\.\-',: /]*)<\/td>#",
$h, $table_content, PREG_PATTERN_ORDER)) {
var_dump($table_content);
}
I get this output :
array
  0 =>
    array
      0 => string '<td width='40%' align='right'class='SmallDimmedText'>Last Login: 11/14/2009</td>' (length=80)
  1 =>
    array
      0 => string ' Login: 11/14/2009' (length=18)
Note I have also used :
# as a regex delimiter, to avoid having to escape slashes
" as a string delimiter, to avoid having to escape single quotes
My first suggestion would be to minimize the amount of text you have in the preg_match_all: why not just match between a ">" and a "<"? Second, I'd end up writing the regex like this (with # as the delimiter, so the slashes in the date need no escaping); not sure if it helps:
#>.*[0-9]{1,2}/[0-9]{1,2}/[0-9]{2,4}<#
That will look for the end of one tag, then any character, then a date, then the beginning of another tag.
I agree with Yacoby.
At the very least, remove all references to the HTML-specific parts and simply make the regex
preg_match_all('#Last Login: ([\d/]+)#', ...
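Following that last suggestion, here is a sketch that matches only the label and the date and then parses the date. It assumes the date is always in month/day/year order, as in the sample; the helper name last_login_date is hypothetical.

```php
<?php
// Hypothetical helper: pull the "Last Login" date out of a chunk of HTML.
function last_login_date(string $html): ?DateTimeImmutable
{
    // Match only the label and an m/d/Y date; the surrounding markup is ignored.
    if (preg_match('#Last Login:\s*(\d{1,2}/\d{1,2}/\d{4})#', $html, $m) !== 1) {
        return null;
    }
    // The leading "!" resets the time fields so only the date is set.
    $date = DateTimeImmutable::createFromFormat('!m/d/Y', $m[1]);
    return $date === false ? null : $date;
}

$h = "<td width='40%' align='right'class='SmallDimmedText'>Last Login: 11/14/2009</td>";
$date = last_login_date($h);
echo $date === null ? 'no date found' : $date->format('Y-m-d'), "\n";
```

Because the pattern no longer mentions the td attributes, the missing space between align='right' and class= stops mattering.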
