Scan HTML for values with a special character before them

Say I have values on my page, like #100 and #246. I want to scan the page for values with a # before them and turn each one into a hyperlink.
$MooringNumbers = '#!' . $MooringNumbers . ' | ' . '#!' . $row1["Number"];
}
$viewedResult = '<tr><td>' . $Surname . '</td><td>' . $Title . '</td><td>' . $MooringNumbers . '</td><td>' . $Telephone . '</td><td>' . '[EDIT]</td>' . '<td>' . '[x]</td>' . '</tr>';
preg_replace('/#!(\d\d\d)/', '${1}', $viewedResult);
echo $viewedResult;
This is the broken code, which doesn't work.

I second Xoc: use the PHP manual. The function listed next to the one he pointed to is preg_replace_callback.
Just call:
$result = preg_replace_callback(
    '/#\d\d\d/',
    function ($matches) {
        // replace this with whatever you want to fetch from the database
        return strtolower($matches[0]);
    },
    $your_input
);
EDIT:
Since you always want to perform the same replacement, go with Xoc's preg_replace. Assign the result, because preg_replace returns the new string and leaves its argument unchanged:
$result = preg_replace('/#!(\d\d\d)/', '${1}', $your_input);
Note: I don't have PHP here, so I give no guarantee of this code not wiping your entire hard disk ;)
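To make the fixed-replacement idea concrete, here is a minimal sketch of turning the #! markers into links. The link target (mooring.php?number=...) and the sample row are assumptions made up for illustration; they do not come from the question.

```php
<?php
// Sample row fragment; the real $viewedResult is built from database fields.
$viewedResult = '<td>#!100 | #!246</td>';

// preg_replace() returns the modified string, so the result must be assigned.
// "mooring.php?number=" is a hypothetical URL, not taken from the original post.
$linked = preg_replace(
    '/#!(\d{3})/',
    '<a href="mooring.php?number=$1">#$1</a>',
    $viewedResult
);

echo $linked;
```

The likely reason the code in the question appears to do nothing is that the return value of preg_replace is thrown away and the unchanged $viewedResult is echoed.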

You can accomplish this with regular expressions; see PHP's preg_match_all and preg_replace functions.
$text = 'Lorem ipsum #300 dolar amet #20';
preg_match_all('/(^|\s)#(\w+)/', $text, $matches);
// Perform your database magic here for each element in $matches[2]
var_dump($matches[2]);
// Fake query result
$query_result = array(300 => 'http://www.example1.com', 20 => 'http://www.example2.com');
foreach ($query_result as $result_key => $result_value) {
    // wrap the number in a link to the URL fetched for it
    $text = str_replace('#' . $result_key, '<a href="' . $result_value . '">#' . $result_key . '</a>', $text);
}
var_dump($text);
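One caveat with a plain str_replace loop is that '#20' would also match the start of '#200'. A possible variant, sketched here with the same fake query result, does the lookup inside preg_replace_callback so each number is matched as a whole. The sample text and the extra '#3000' value are assumptions added for illustration.

```php
<?php
$text = 'Lorem ipsum #300 dolor amet #20 and #3000';

// Fake query result, as in the answer above; the URLs are placeholders.
$links = array(300 => 'http://www.example1.com', 20 => 'http://www.example2.com');

$text = preg_replace_callback(
    '/(^|\s)#(\w+)/',
    function ($m) use ($links) {
        if (!isset($links[$m[2]])) {
            return $m[0]; // no URL known for this number: leave it untouched
        }
        // $m[1] is the leading whitespace (or empty at the start of the string)
        return $m[1] . '<a href="' . htmlspecialchars($links[$m[2]]) . '">#' . $m[2] . '</a>';
    },
    $text
);

echo $text;
```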

Related

Replacing the SPACE at first and last

I want to remove the spaces at the beginning and end of a string.
I use this code:
$text = ' this is the test for string. ';
echo $text = str_replace(" ", "", $text);
But when I use this replace code, all the spaces are removed, not just the first and last.
Can anybody help me?
I want to get this:
this is the test for string.

You probably want the trim function here:
$text = ' this is the test for string. ';
echo '***' . trim($text) . '***';
***this is the test for string.***
Just to round out this answer, if you wanted to accomplish the same thing using a replacement, you could do a regex replace as follows:
$out = preg_replace("/^\s*|\s*$/", "", $text);
echo '***' . $out . '***';
***this is the test for string.***
This approach might be a good starting point if you want to do a regex replacement with slightly different logic.
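For comparison, a short sketch of how str_replace and the trim family treat the string from the question. The variable names are made up for illustration.

```php
<?php
$text = ' this is the test for string. ';

$noSpaces  = str_replace(' ', '', $text); // removes every space, including inner ones
$bothEnds  = trim($text);                 // strips whitespace from both ends only
$leftOnly  = ltrim($text);                // strips the leading whitespace only
$rightOnly = rtrim($text);                // strips the trailing whitespace only

var_dump($noSpaces, $bothEnds, $leftOnly, $rightOnly);
```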

remove HTML from displaying in PHP

I have this text: http://pastebin.com/2Zgbs7hi
I want to remove the HTML code from it and display just the plain text, but keep at least one line break wherever there are currently several line breaks.
I have tried:
$ticket["summary"] = 'pastebin example';
$TicketSummaryDisplay = nl2br($ticket["summary"]);
$TicketSummaryDisplay = stripslashes($TicketSummaryDisplay);
$TicketSummaryDisplay = trim(strip_tags($TicketSummaryDisplay));
$TicketSummaryDisplay = preg_replace('/\n\s+$/m', '', $TicketSummaryDisplay);
echo $TicketSummaryDisplay;
That displays as plain text, but it all shows up as one big block of text with no line breaks at all.

Maybe this will save you some time.
<?php
libxml_use_internal_errors(true); //crazy o tags
$html = file_get_contents('http://pastebin.com/raw.php?i=2Zgbs7hi');
$dom = new DOMDocument;
$dom->loadHTML($html);
$result='';
foreach ($dom->getElementsByTagName('p') as $node) {
if (strstr($node->nodeValue, 'Legal Disclaimer:')){
break;
}
$result .= $node->nodeValue;
}
echo $result;

This example should store the text from the HTML in an array of strings.
After stripping all the tags, you can use preg_split with the \R escape sequence (which matches any newline sequence) to convert the string into an array. That array will contain several blank values, and also a number of HTML non-breaking space entities, so we drop the empty values with array_filter() (it removes all items that do not satisfy the filter condition; in our case, items that are empty). There is a problem with the &nbsp; entity: &nbsp; and the space character are not the same, they have different character codes, so trim() will not remove it. Here are two possible solutions: the first, uncommented one only replaces &nbsp; and trims white space, while the second, commented one decodes all HTML entities and then also handles the decoded non-breaking space.
PHP:
$text = file_get_contents( 'http://pastebin.com/raw.php?i=2Zgbs7hi' );
$text = strip_tags( $text );
// clean each line first (array_filter passes items by value, so it cannot modify them)
$lines = array_map(
    function( $item ) {
        return trim( str_replace( '&nbsp;', ' ', $item ) );
        // return trim( str_replace( "\xC2\xA0", ' ', html_entity_decode( $item ) ) );
    },
    preg_split( '/\R/', $text )
);
// then drop the lines that are empty after cleaning
$array = array_filter( $lines, 'strlen' );
foreach( $array as $value ) {
    echo $value . '<br />';
}
Array output:
Array
(
[8] => Hi,
[11] => Ashley has explained that I need to ask for another line and broadband for the wifi to work, please can you arrange this.
[13] => Regards
[23] => Legal Disclaimer:
[24] => This email and its attachments are confidential. If you received it by mistake, please don’t share it. Let us know and then delete it. Its content does not necessarily represent the views of The Dragon Enterprise
[25] => Centre and we cannot guarantee the information it contains is complete. All emails are monitored and may be seen by another member of The Dragon Enterprise Centre's staff for internal use
)
Now you should have a clean array containing only items with a value. By the way, line breaks in HTML are expressed with <br />, not with \n. Your example still contains the \n characters when it is sent to a web browser, but they are only visible in the page source. I hope I did not miss the point of the question.

Try this to get text output with line breaks:
<?php
$ticket["summary"] = file_get_contents('http://pastebin.com/raw.php?i=2Zgbs7hi');
$TicketSummaryDisplay = nl2br($ticket["summary"]);
echo strip_tags($TicketSummaryDisplay,'<br>');
?>

You are asking how to add line breaks to your "one big block of text with no line breaks at all".
Short answer
After you have stripped the HTML tags, apply wordwrap with the desired line length:
$text = wordwrap($text, 90, "<br />\n");
I really wonder why nobody suggested that function before.
There is also chunk_split, which doesn't take words into account and just splits after a certain number of characters, breaking words. But that's not what you want, I guess.
PHP
<?php
$text = file_get_contents('http://pastebin.com/raw.php?i=2Zgbs7hi');
/**
 * Returns the string without HTML tags; also
 * takes control chars, spaces and "&nbsp;" into account.
 */
function dropHtmlTags($string) {
    // remove html tags
    //$string = preg_replace ('/<[^>]*>/', ' ', $string);
    $string = strip_tags($string);
    // control characters and "&nbsp;"
    $string = str_replace("\r", '', $string); // remove
    $string = str_replace("\n", ' ', $string); // replace with space
    $string = str_replace("\t", ' ', $string); // replace with space
    $string = str_replace("&nbsp;", ' ', $string); // replace with space
    // remove multiple spaces
    $string = preg_replace('/ {2,}/', ' ', $string);
    $string = trim($string);
    return $string;
}
$text = dropHtmlTags($text);
// The Answer: insert line breaks after 95 chars,
// to get rid of the "one big block of text with no line breaks at all"
$text = wordwrap($text, 95, "<br />\n");
// if you want to insert line-breaks before the legal disclaimer,
// uncomment the next line
//$text = str_replace("Regards Legal Disclaimer", "<br /><br />Regards Legal Disclaimer", $text);
echo $text;
?>

Hello, it can be done as follows:
$abc = file_get_contents('http://pastebin.com/raw.php?i=2Zgbs7hi');
$abc = strip_tags($abc); // the string to clean is the first argument
echo $abc;
Please let me know whether it works.

you may use
<?php
$a= file_get_contents('a.txt');
echo nl2br(htmlspecialchars($a));
?>

<?php
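$handle = @fopen("pastebin.html", "r");
if ($handle) {
    while (($buffer = fgets($handle, 4096)) !== false) {
        // fgetss() was removed in PHP 8.0; fgets() plus strip_tags() is a
        // per-line substitute that assumes no tag spans a line break
        echo strip_tags($buffer);
    }
    fclose($handle);
}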
?>
output is
Hi,
Ashley has explained that I need to ask for another line and broadband for the wifi to work, please can you arrange this.
Regards
Legal Disclaimer:
This email and its attachments are confidential. If you received it by mistake, please don’t share it. Let us know and then delete it. Its content does not necessarily represent the views of The Dragon Enterprise
Centre and we cannot guarantee the information it contains is complete. All emails are monitored and may be seen by another member of The Dragon Enterprise Centre's staff for internal use
You can probably write additional code to convert &nbsp; to spaces etc.

I'm not sure I understood everything correctly, but this seems to be your expected result:
$txt = file_get_contents('http://pastebin.com/raw.php?i=2Zgbs7hi');
var_dump(preg_replace("/(\&nbsp\;(\s{1,})?)+/", "\n", trim(strip_tags(preg_replace("/(\s){1,}/", " ", $txt)))));
//more readable
$txt = preg_replace("/(\s){1,}/", " ", $txt);
$txt = trim(strip_tags($txt));
$txt = preg_replace("/(\&nbsp\;(\s{1,})?)+/", "\n", $txt);

The strip_tags() function strips HTML and PHP tags from a string, if that is what you are trying to accomplish.
Examples from the docs:
<?php
$text = '<p>Test paragraph.</p><!-- Comment --> Other text';
echo strip_tags($text);
echo "\n";
// Allow <p> and <a>
echo strip_tags($text, '<p><a>');
?>
The above example will output:
Test paragraph. Other text
<p>Test paragraph.</p> Other text
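Since the question asks to keep at least one line break where the HTML had several, one possible approach is to turn block endings into newlines before calling strip_tags, then collapse runs of line breaks into one. The following is only a sketch under that assumption; the sample $html is made up to stand in for the pastebin text.

```php
<?php
// Hypothetical input standing in for the pastebin text.
$html = "<p>Hi,</p>\n\n\n<p>Please arrange this.&nbsp;</p><br><br><p>Regards</p>";

$text = preg_replace('#<br\s*/?>|</p>#i', "\n", $html); // block endings become newlines
$text = strip_tags($text);                              // drop the remaining tags
$text = html_entity_decode($text, ENT_QUOTES, 'UTF-8'); // &nbsp; becomes U+00A0
$text = str_replace("\xC2\xA0", ' ', $text);            // U+00A0 becomes a plain space
$text = preg_replace('/[ \t]+/', ' ', $text);           // collapse runs of spaces and tabs
$text = preg_replace('/\s*\R\s*/', "\n", trim($text));  // collapse runs of line breaks

// Escape for HTML output and turn the remaining newlines into <br /> tags.
echo nl2br(htmlspecialchars($text));
```

The order matters here: nl2br comes last, after strip_tags, so the inserted <br /> tags are not stripped again.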

Merging preg_match_all and preg_replace

I have some code running which finds out hashtags in the string and turns them into links. I have done this using preg_match_all as shown below:
if(preg_match_all('/(#[A-z_]\w+)/', $postLong, $arrHashTags) > 0){
foreach ($arrHashTags[1] as $strHashTag) {
$long = str_replace($strHashTag, ''.$strHashTag.'', $postLong);
}
}
Also, for my search script, I need to bold the searched keywords in the result string. Something similar to the below code using preg_replace:
$string = "This is description for Search Demo";
$searchingFor = "/" . $searchQuery . "/i";
$replacePattern = "<b>$0<\/b>";
preg_replace($searchingFor, $replacePattern, $string);
The problem I am having is that both have to work together and produce a combined result. One way I can think of is to run the string resulting from the preg_match_all step through the preg_replace code, but what if a tag and the search string are the same? The second block would bold my tag as well, which is not desired.
Update: this is the code I'm running based on the answer given below, but it still doesn't work:
if(preg_match_all('/(#[A-z_]\w+)/', $postLong, $arrHashTags) > 0){
foreach ($arrHashTags[1] as $strHashTag) {
$postLong = str_replace($strHashTag, ''.$strHashTag.'', $postLong);
}
}
And immediately after this, I run:
$searchingFor = "/\b.?(?<!#)" . $keystring . "\b/i";
$replacePattern = "<b>$0<\/b>";
preg_replace($searchingFor, $replacePattern, $postLong);
Just so you know, this all happens inside a while loop that generates the list.

You just need to modify the search pattern so that it skips matches that start with a '#':
$postLong = "This is description for Search Demo";
// [A-Za-z_] instead of [A-z_]: the range A-z also covers a few punctuation characters
if (preg_match_all('/(#[A-Za-z_]\w+)/', $postLong, $arrHashTags) > 0) {
    foreach ($arrHashTags[1] as $strHashTag) {
        $postLong = str_replace($strHashTag, ''.$strHashTag.'', $postLong);
    }
}
# This expression finds any text with 0 or 1 characters in front of it
# and then does a negative look-behind to make sure that the character isn't a #
$searchingFor = "/\b.?(?<!#)" . $searchQuery . "\b/i";
$replacePattern = '<b>$0</b>';
// preg_replace returns the new string, so assign it
$postLong = preg_replace($searchingFor, $replacePattern, $postLong);
Or, if you don't need an array of the hashtags for another reason, you could use preg_replace only:
$postLong = "This is description for #Search Demo";
$patterns = array('/(#[A-Za-z_]\w+)/', "/\b.?(?<!#)" . $searchQuery . "\b/i");
// '$0' stands for the matched text; wrap the first entry in your link markup
$replacements = array('$0', '<b>$0</b>');
$postLong = preg_replace($patterns, $replacements, $postLong);
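Another way to keep the two replacements from interfering, sketched here as an alternative and not as the answerer's method, is a single preg_replace_callback pass with one alternative for hashtags and one for the keyword. Each piece of text is then matched by exactly one branch. The sample text, the search term and the "/tag/..." URL scheme are assumptions for illustration.

```php
<?php
// Hypothetical inputs; the "/tag/..." link target is an assumption.
$postLong    = 'A #Search demo about search engines';
$searchQuery = 'search';

// Branch 1 captures a hashtag, branch 2 captures the keyword as a whole word.
$pattern = '/(#[A-Za-z_]\w+)|\b(' . preg_quote($searchQuery, '/') . ')\b/i';

$result = preg_replace_callback(
    $pattern,
    function ($m) {
        if ($m[1] !== '') {
            // Hashtag branch: link it, never bold it.
            return '<a href="/tag/' . substr($m[1], 1) . '">' . $m[1] . '</a>';
        }
        // Keyword branch: bold the matched text, keeping its original case.
        return '<b>' . $m[2] . '</b>';
    },
    $postLong
);

echo $result;
```

Because the hashtag alternative is tried first at the '#' position, a tag that equals the search term is consumed as a tag and never reaches the keyword branch.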

How do I split a Wordpress title at the – in PHP?

I am working on my WordPress blog and need to get the title of a post and split it at the "-". The thing is, it's not working: in the source it is &ndash; and on the rendered page it is a "long minus" (–). Copying and pasting this long minus into some editors turns it into a normal minus (-). I can't split at "-" or at &ndash;, but somehow it must be possible. When I created the article I just typed "-" (minus), but somewhere it gets converted to – automatically.
Any ideas?
Thanks!

I think I found it. I remember having a similar problem: when I pasted code into a post, the quote marks were turned into typographic ones when displayed to readers.
I found the cause in /wp-includes/formatting.php around line 56 (WordPress 3.3.1), which defines the characters to be replaced:
$static_characters = array_merge( array('---', ' -- ', '--', ' - ', 'xn&#8211;', '...', '``', '\'\'', ' (tm)'), $cockney );
$static_replacements = array_merge( array($em_dash, ' ' . $em_dash . ' ', $en_dash, ' ' . $en_dash . ' ', 'xn--', '&#8230;', $opening_quote, $closing_quote, ' &#8482;'), $cockneyreplace );
and around line 85 it performs the replacement:
// This is not a tag, nor is the texturization disabled static strings
$curl = str_replace($static_characters, $static_replacements, $curl);
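Given that ' - ' ends up as an en dash entity, one way to split the title is to decode the entities first and then split on any kind of dash. This is a minimal sketch; the sample title is an assumption about what the filtered title might look like.

```php
<?php
// Hypothetical title as it might look after WordPress has texturized it.
$title = 'First part &#8211; second part';

// Decode entities so "&#8211;" or "&ndash;" becomes the literal en dash (U+2013).
$decoded = html_entity_decode($title, ENT_QUOTES, 'UTF-8');

// Split once on a hyphen, en dash or em dash, swallowing the surrounding spaces.
$parts = preg_split('/\s*[-\x{2013}\x{2014}]\s*/u', $decoded, 2);

print_r($parts);
```

The limit of 2 keeps any further dashes inside the second part, which may or may not be what a given title format needs.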

If you want to split a string at the "-" character, you can basically replace each "-" with a space.
Try this:
$string_to_be_stripped = "my-word-test";
$chars = array('-');
$new_string = str_replace($chars, ' ', $string_to_be_stripped);
echo $new_string;
These lines replace each "-" in the string with a space. For example, if you have my-word-test, it will echo "my word test". I hope it helps.
For more information, see the str_replace documentation in the PHP manual.
If you want to do this the WordPress way, try using filters. I suggest placing these lines in your functions.php file:
add_filter('the_title', function($title) {
    $string_to_be_stripped = $title;
    $chars = array('-');
    $new_string = str_replace($chars, ' ', $string_to_be_stripped);
    return $new_string;
});
Now, every time you use the_title in a loop, the replacement will be applied to the title.

Smiley Replace within CDATA of an HTML-String

I have a simple problem :( I need to replace text smileys with the corresponding smiley images. That's not really complex, but I have to replace only the smileys that appear outside of HTML tags. A short example:
Text:
Thats a good example :/ .. with a link inside.
I want to replace ":/" with the image of this smiley.
What is the best way to do that?

I won't try to create some super script, but think about it: smileys are almost always surrounded by spaces. So str_replace ' :/ ' with the smiley. You might ask, "what about a smiley at the end of a sentence (where it would be used the most)?" Well, just check for at least one space on either the left or the right of a potential smiley.
Using the above scripts:
$smiley_array = array(
":) " => "<a href...>",
" :)" => "<a href...>",
":/ " => "<a href...>",
" :/" => "<a href...>");
$codes = array_keys($smiley_array);
$links = array_values($smiley_array);
$str = str_replace($codes, $links, $str);
If you would rather not type everything twice, you can generate the array from a single smiley array.
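Generating the padded array could look like the sketch below. The image file names and the sample string are assumptions; unlike the hand-written array above, this version also keeps the surrounding space in the output.

```php
<?php
// Single source of truth; the image file names are hypothetical.
$smileys = array(
    ':)' => '<img src="smile.gif" alt="smile">',
    ':/' => '<img src="frown.gif" alt="frown">',
);

// Build both padded variants (space after, space before) for every smiley.
$padded = array();
foreach ($smileys as $code => $img) {
    $padded[$code . ' '] = $img . ' ';
    $padded[' ' . $code] = ' ' . $img;
}

$str = 'See http://example.com :/ ok :)';
$str = str_replace(array_keys($padded), array_values($padded), $str);

echo $str;
```

The "://" in the URL is left alone because it has no space on either side. The alt texts deliberately avoid the smiley codes, since str_replace applies the pairs one after another and could otherwise replace text inside an already inserted tag.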

Why don't you just use some special characters around your smiley text, like -:/- maybe?
This will make your smiley text unique and easy to recognize.

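Use preg_replace with a lookahead assertion. (PCRE rejects an unbounded lookbehind such as (?<!<[^<>]*), so the check looks forward for a closing > instead.) Example:
$smileys = array(
    ':/' => '<img src="..." alt=":/">'
);
foreach ($smileys as $smile => $img) {
    // skip a smiley if a > follows it before any other angle bracket, i.e. if it sits inside a tag
    $text = preg_replace('#' . preg_quote($smile, '#') . '(?![^<>]*>)#',
        $img, $text);
}
The regex should match only smileys that are not inside angle brackets. This might be slow if you have a lot of false positives.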
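As a quick check of the "not inside angle brackets" idea, here is a small sketch using a lookahead: a smiley is skipped when a > follows it before any other angle bracket. The sample text, link URL and image file name are assumptions for illustration.

```php
<?php
$smileys = array(
    ':/' => '<img src="frown.gif" alt=":/">', // hypothetical file name
);

// Sample text: one smiley in the prose and one ":/" inside the href attribute.
$text = 'Good example :/ with a <a href="http://example.com/">link</a> inside.';

foreach ($smileys as $smile => $img) {
    // (?![^<>]*>) fails when the next angle bracket is a closing >,
    // which means the match position is inside a tag.
    $text = preg_replace('#' . preg_quote($smile, '#') . '(?![^<>]*>)#', $img, $text);
}

echo $text;
```

Only the smiley in the prose is replaced; the ":/" that is part of "http://" stays untouched because it sits inside the opening a tag.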

I wouldn't know about the best way, only the way I would do it.
Build an array with the smiley codes as the keys and the links as the values. Then use str_replace: pass an array of the keys (the smiley codes) as the "search" argument and an array of the values as the "replace" argument.
For instance, suppose you have something like this:
$smiley_array = array(":)" => "<a href...>",
":(" => "<a href=....>");
$codes = array_keys($smiley_array);
$links = array_values($smiley_array);
$str = str_replace($codes, $links, $str);
EDIT: In case this could accidentally replace other instances with smiley-links you should consider using regexes with preg_replace. Obviously preg_replace is slower than str_replace.

You can use regex, or the extra sloppy version of the above:
$smiley_array = array(":)" => "<a href...>",
":(" => "<a href=....>");
$codes = array_keys($smiley_array);
$links = array_values($smiley_array);
$str = str_replace("://", "%%QF%%", $str);
$str = str_replace($codes, $links, $str);
$str = str_replace("%%QF%%", "://", $str);
Actually, assuming str_replace follows the array sorting...
this should work:
$smiley_array = array("://" => "%%QF%%", ":)" => "<a href...>",
":(" => "<a href=....>", "%%QF%%" => "://");
$codes = array_keys($smiley_array);
$links = array_values($smiley_array);
$str = str_replace($codes, $links, $str);

Possible overkill (increased cpu/load), but 99.99999999% safe:
<?php
$n = new DOMDocument();
$n->loadHTML('<p>Thats a good example :/ .. with a link inside.</p>');
$x = new DOMXPath($n);
$instances = $x->query('//text()[contains(.,\':/\')]');//or use '//*[child::text()]' for all textnodes
foreach($instances as $node){
if($node instanceof DOMText && preg_match_all('/:\//',$node->wholeText,$matches,PREG_OFFSET_CAPTURE|PREG_SET_ORDER)){
foreach($matches[0] as $match){
$newnode = $node->splitText($match[1]);
$newnode->replaceData(0,strlen($match[0]),'');
$img = $n->createElement('img');
$img->setAttribute('src','smily.gif');
$img = $newnode->parentNode->insertBefore($img,$newnode);
//var_dump($match);
}
}
}
var_dump($n->saveHTML());
?>
But in reality you do not want to do this very often: save once, show many times. If you let users edit the HTML (be it in a WYSIWYG editor or otherwise), the reverse transformation (img to text) is a lot lighter. It is up to you to extend this to different smileys: one monster regex to match them all, or several smaller ones / strstr() calls for readability, plus an array mapping smiley to src (e.g. array(':/'=>'frown.gif')), would be the way to go.
