After my $plates = explode(';', $plates); i'd like to include another regular expression when creating a new $dossier (in my foreach):
$minus = preg_replace('~[-]~', '', $license_plates);
How would I do that?
This is my code:
public function addLicensePlates(Request $request)
{
$product_id = $request->input('product_id');
$license_plates = $request->input('license_plates');
$plates = preg_replace('~\s+|[.,:;*/_]~', ';', $license_plates); // \s+|[.,:;*/_]
$plates = explode(';', $plates);
foreach($plates as &$plate) {
$dossier = new Dossier;
$dossier->license_plate = trim($plate);
$dossier->product_id = $product_id;
$dossier->save();
}
}
PS: I don't want to add the - to the $plates expression, but after the explode.
I suggest editing the existing function as follows:
Use trim() on the input string when preg_replaceing to get rid of empty values later
Use str_replace to replace the hyphens later in the foreach block.
See the PHP demo:
$license_plates = " WORD-HERE 36-LXD-5";
$plates = preg_replace('~\s+|[.,:;*/_]~', ';', trim($license_plates)); // trim the incoming value
$plates = explode(';', $plates);
foreach($plates as &$plate) {
$license_plate = str_replace('-', '', $plate); // Remove hyphens
echo $license_plate. "\n";
}
Output: WORDHERE and 36LXD5.
I am setting up a new content plugin for Joomla 3, that should replace plugin tags with html content. Everything works fine till the moment when i am preg_replace plugin tags in $row->fulltext.
Here is the plugin code
public function onContentPrepare($context, &$row, &$params, $page = 0) {
$pattern = '#\{uni\}(.*){\/uni\}#sU';
preg_match_all($pattern, $row->fulltext, $matches, PREG_PATTERN_ORDER);
foreach($matches[1] as $k=>$uni){
preg_match('/\{uni-title\}(.*)[\{]/Ui', $uni, $unititle);
preg_match('/\{uni-text\}(.*)/si', $uni, $unitext);
$titleID = str_replace(' ', '_', trim($unititle[1]));
$newString = '<span id="'.$titleID.'">'.$unititle[1].'</span><div class="university-info-holder"><div class="university-info"><i class="icon icon-close"></i>'.$unitext[1].'</div></div>';
$row->fulltext = preg_replace($pattern,$newString,$row->fulltext);
}
}
Any ideas, why it duplicates first found match, as many times as foreach goes?
Just to mention, if i do:
echo $unititle[1];
inside foreach, items aren't duplicated, but are rendered as it should be.
There are a few problems with the original code.
It should be using $row->text instead of $row->fulltext. This is because when rendering an article Joomla merges tht introtext and fulltext fields.
It's a mistake to use $pattern for the matching when making the substitution. That's because the $pattern matches all of the items. Instead use the $match[0][$k] to do the replacement. Use str_replace instead of preg_replace because now you are matching the exact string and don't need to do a regex.
Here's the code for the whole thing.
class PlgContentLivefilter extends JPlugin{
public function onContentPrepare($context, &$row, &$params, $page = 0) {
return $renderUniInfo = $this->renderUniInfo($row, $params, $page = 0);
}
private function renderUniInfo(&$row, &$params, $page = 0) {
$pattern = '#\{uni\}(.*){\/uni\}#sU';
preg_match_all($pattern, $row->text, $matches);
foreach($matches[0] as $k=>$uni){
preg_match('/\{uni-title\}(.*)[\{]/Ui', $uni, $unititle);
preg_match('/\{uni-text\}(.*)/si', $uni, $unitext);
print_r($unititle[1]);
$title = $unititle[1];
$text = $unitext[1];
if (preg_match('#(?:http://)?(?:https://)?(?:www\.)?(?:youtube\.com/(?:v/|embed/|watch\?v=)|youtu\.be/)([\w-]+)?#i', $unitext[1], $match)) {
$video_id = $match[1];
$video_string = '<div class="videoWrapper"><iframe src="http://youtube.com/embed/'.$video_id.'?rel=0"></iframe></div>';
$unitext[1] = preg_replace('#(?:http://)?(?:https://)?(?:www\.)?(?:youtube\.com/(?:v/|embed/|watch\?v=)|youtu\.be/)([\w-]+)?#i', $video_string, $unitext[1]);
$text = $unitext[1];
}
$titleID = str_replace(' ', '_', trim($title));
$newString = '<span id="'.$titleID.'">'.$title.'</span><div class="university-info-holder"><div class="university-info"><i class="icon icon-close"></i>'.$text.'</div></div>';
$row->text = str_replace($matches[0][$k],$newString,$row->text);
}
}
}
I'm using Michel Fortin's PHP Markdown for Markdown converting but i want to show images as links instead of inline. Because anyone can insert a 5MB jpg from Imgur and slow down page.
How can i change images to link-to-image?
An example of an override would look something like the follwing:
class CustomMarkdown extends Markdown
{
function _doImages_reference_callback($matches) {
$whole_match = $matches[1];
$alt_text = $matches[2];
$link_id = strtolower($matches[3]);
if ($link_id == "") {
$link_id = strtolower($alt_text); # for shortcut links like ![this][].
}
$alt_text = $this->encodeAttribute($alt_text);
if (isset($this->urls[$link_id])) {
$url = $this->encodeAttribute($this->urls[$link_id]);
$result = "$alt_text";
} else {
# If there's no such link ID, leave intact:
$result = $whole_match;
}
return $result;
}
function _doImages_inline_callback($matches) {
$whole_match = $matches[1];
$alt_text = $matches[2];
$url = $matches[3] == '' ? $matches[4] : $matches[3];
$title =& $matches[7];
$alt_text = $this->encodeAttribute($alt_text);
$url = $this->encodeAttribute($url);
return "$alt_text";
}
}
Demo: http://codepad.viper-7.com/VVa2hP
You should have a look at Parsedown. It is a more recent and, I believe, easier to extend implementation of Markdown.
i wrote simple 3 functions to scrape titles , description and keywords of simple html page
this is the first function to scrape titles
function getPageTitle ($url)
{
$content = $url;
if (eregi("<title>(.*)</title>", $content, $array)) {
$title = $array[1];
return $title;
}
}
and it works fine
and those are 2 functions to scrape description and keywords and those not working
function getPageKeywords($url)
{
$content = $url;
if ( preg_match('/<meta[\s]+[^>]*?name[\s]?=[\s\"\']+keywords[\s\"\']+content[\s]?=[\s\"\']+(.*?)[\"\']+.*?>/i', $content, $array)) {
$keywords = $array[1];
return $keywords;
}
}
function getPageDesc($url)
{
$content = $url;
if ( preg_match('/<meta[\s]+[^>]*?name[\s]?=[\s\"\']+description[\s\"\']+content[\s]?=[\s\"\']+(.*?)[\"\']+.*?>/i', $content, $array)) {
$desc = $array[1];
return $desc;
}
}
i know there may be something wrong with the preg_match line but i really don't know
i tried it so much things but it doesn't work
Why not use get_meta_tags? PHP Documentation Here
<?php
// Assuming the above tags are at www.example.com
$tags = get_meta_tags('http://www.example.com/');
// Notice how the keys are all lowercase now, and
// how . was replaced by _ in the key.
echo $tags['author']; // name
echo $tags['keywords']; // php documentation
echo $tags['description']; // a php manual
echo $tags['geo_position']; // 49.33;-86.59
?>
NOTE You can change the parameter to either a URL, local file or string.
Its better to use php's native DOMDocument to parse HTML then regex, you can also use , tho in this day in age allot of sites dont even add the keywords, description tags no more, so you cant rely on them always being there. But here is how you can do it with DOMDocument:
<?php
$source = file_get_contents('http://php.net');
$dom = new DOMDocument("1.0","UTF-8");
#$dom->loadHTML($source);
$dom->preserveWhiteSpace = false;
//Get Title
$title = $dom->getElementsByTagName('title')->item(0)->nodeValue;
$description = '';
$keywords = '';
foreach($dom->getElementsByTagName('meta') as $metas) {
if($metas->getAttribute('name') =='description'){ $description = $metas->getAttribute('content'); }
if($metas->getAttribute('name') =='keywords'){ $keywords = $metas->getAttribute('content'); }
}
print_r($title);
print_r($description);
print_r($keywords);
?>
The function below is designed to apply rel="nofollow" attributes to all external links and no internal links unless the path matches a predefined root URL defined as $my_folder below.
So given the variables...
$my_folder = 'http://localhost/mytest/go/';
$blog_url = 'http://localhost/mytest';
And the content...
internal
internal cloaked link
external
The end result, after replacement should be...
internal
internal cloaked link
external
Notice that the first link is not altered, since its an internal link.
The link on the second line is also an internal link, but since it matches our $my_folder string, it gets the nofollow too.
The third link is the easiest, since it does not match the blog_url, its obviously an external link.
However, in the script below, ALL of my links are getting nofollow. How can I fix the script to do what I want?
function save_rseo_nofollow($content) {
$my_folder = $rseo['nofollow_folder'];
$blog_url = get_bloginfo('url');
preg_match_all('~<a.*>~isU',$content["post_content"],$matches);
for ( $i = 0; $i <= sizeof($matches[0]); $i++){
if ( !preg_match( '~nofollow~is',$matches[0][$i])
&& (preg_match('~' . $my_folder . '~', $matches[0][$i])
|| !preg_match( '~'.$blog_url.'~',$matches[0][$i]))){
$result = trim($matches[0][$i],">");
$result .= ' rel="nofollow">';
$content["post_content"] = str_replace($matches[0][$i], $result, $content["post_content"]);
}
}
return $content;
}
Here is the DOMDocument solution...
$str = 'internal
internal cloaked link
external
external
external
external
';
$dom = new DOMDocument();
$dom->preserveWhitespace = FALSE;
$dom->loadHTML($str);
$a = $dom->getElementsByTagName('a');
$host = strtok($_SERVER['HTTP_HOST'], ':');
foreach($a as $anchor) {
$href = $anchor->attributes->getNamedItem('href')->nodeValue;
if (preg_match('/^https?:\/\/' . preg_quote($host, '/') . '/', $href)) {
continue;
}
$noFollowRel = 'nofollow';
$oldRelAtt = $anchor->attributes->getNamedItem('rel');
if ($oldRelAtt == NULL) {
$newRel = $noFollowRel;
} else {
$oldRel = $oldRelAtt->nodeValue;
$oldRel = explode(' ', $oldRel);
if (in_array($noFollowRel, $oldRel)) {
continue;
}
$oldRel[] = $noFollowRel;
$newRel = implode($oldRel, ' ');
}
$newRelAtt = $dom->createAttribute('rel');
$noFollowNode = $dom->createTextNode($newRel);
$newRelAtt->appendChild($noFollowNode);
$anchor->appendChild($newRelAtt);
}
var_dump($dom->saveHTML());
Output
string(509) "<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html><body>
internal
internal cloaked link
external
external
external
external
</body></html>
"
Try to make it more readable first, and only afterwards make your if rules more complex:
function save_rseo_nofollow($content) {
$content["post_content"] =
preg_replace_callback('~<(a\s[^>]+)>~isU', "cb2", $content["post_content"]);
return $content;
}
function cb2($match) {
list($original, $tag) = $match; // regex match groups
$my_folder = "/hostgator"; // re-add quirky config here
$blog_url = "http://localhost/";
if (strpos($tag, "nofollow")) {
return $original;
}
elseif (strpos($tag, $blog_url) && (!$my_folder || !strpos($tag, $my_folder))) {
return $original;
}
else {
return "<$tag rel='nofollow'>";
}
}
Gives following output:
[post_content] =>
internal
<a href="http://localhost/mytest/go/hostgator" rel=nofollow>internal cloaked link</a>
<a href="http://cnn.com" rel=nofollow>external</a>
The problem in your original code might have been $rseo which wasn't declared anywhere.
Try this one (PHP 5.3+):
skip selected address
allow manually set rel parameter
and code:
function nofollow($html, $skip = null) {
return preg_replace_callback(
"#(<a[^>]+?)>#is", function ($mach) use ($skip) {
return (
!($skip && strpos($mach[1], $skip) !== false) &&
strpos($mach[1], 'rel=') === false
) ? $mach[1] . ' rel="nofollow">' : $mach[0];
},
$html
);
}
Examples:
echo nofollow('something');
// will be same because it's already contains rel parameter
echo nofollow('something'); // ad
// add rel="nofollow" parameter to anchor
echo nofollow('something', 'localhost');
// skip this link as internall link
Using regular expressions to do this job properly would be quite complicated. It would be easier to use an actual parser, such as the one from the DOM extension. DOM isn't very beginner-friendly, so what you can do is load the HTML with DOM then run the modifications with SimpleXML. They're backed by the same library, so it's easy to use one with the other.
Here's how it can look like:
$my_folder = 'http://localhost/mytest/go/';
$blog_url = 'http://localhost/mytest';
$html = '<html><body>
internal
internal cloaked link
external
</body></html>';
$dom = new DOMDocument;
$dom->loadHTML($html);
$sxe = simplexml_import_dom($dom);
// grab all <a> nodes with an href attribute
foreach ($sxe->xpath('//a[#href]') as $a)
{
if (substr($a['href'], 0, strlen($blog_url)) === $blog_url
&& substr($a['href'], 0, strlen($my_folder)) !== $my_folder)
{
// skip all links that start with the URL in $blog_url, as long as they
// don't start with the URL from $my_folder;
continue;
}
if (empty($a['rel']))
{
$a['rel'] = 'nofollow';
}
else
{
$a['rel'] .= ' nofollow';
}
}
$new_html = $dom->saveHTML();
echo $new_html;
As you can see, it's really short and simple. Depending on your needs, you may want to use preg_match() in place of the strpos() stuff, for example:
// change the regexp to your own rules, here we match everything under
// "http://localhost/mytest/" as long as it's not followed by "go"
if (preg_match('#^http://localhost/mytest/(?!go)#', $a['href']))
{
continue;
}
Note
I missed the last code block in the OP when I first read the question. The code I posted (and basically any solution based on DOM) is better suited at processing a whole page rather than a HTML block. Otherwise, DOM will attempt to "fix" your HTML and may add a <body> tag, a DOCTYPE, etc...
Thanks #alex for your nice solution. But, I was having a problem with Japanese text. I have fixed it as following way. Also, this code can skip multiple domains with the $whiteList array.
public function addRelNoFollow($html, $whiteList = [])
{
$dom = new \DOMDocument();
$dom->preserveWhiteSpace = false;
$dom->loadHTML(mb_convert_encoding($html, 'HTML-ENTITIES', 'UTF-8'));
$a = $dom->getElementsByTagName('a');
/** #var \DOMElement $anchor */
foreach ($a as $anchor) {
$href = $anchor->attributes->getNamedItem('href')->nodeValue;
$domain = parse_url($href, PHP_URL_HOST);
// Skip whiteList domains
if (in_array($domain, $whiteList, true)) {
continue;
}
// Check & get existing rel attribute values
$noFollow = 'nofollow';
$rel = $anchor->attributes->getNamedItem('rel');
if ($rel) {
$values = explode(' ', $rel->nodeValue);
if (in_array($noFollow, $values, true)) {
continue;
}
$values[] = $noFollow;
$newValue = implode($values, ' ');
} else {
$newValue = $noFollow;
}
// Create new rel attribute
$rel = $dom->createAttribute('rel');
$node = $dom->createTextNode($newValue);
$rel->appendChild($node);
$anchor->appendChild($rel);
}
// There is a problem with saveHTML() and saveXML(), both of them do not work correctly in Unix.
// They do not save UTF-8 characters correctly when used in Unix, but they work in Windows.
// So we need to do as follows. #see https://stackoverflow.com/a/20675396/1710782
return $dom->saveHTML($dom->documentElement);
}
<?
$str='internal
internal cloaked link
external';
function test($x){
if (preg_match('#localhost/mytest/(?!go/)#i',$x[0])>0) return $x[0];
return 'rel="nofollow" '.$x[0];
}
echo preg_replace_callback('/href=[\'"][^\'"]+/i', 'test', $str);
?>
Here is the another solution which has whitelist option and add tagret Blank attribute.
And also it check if there already a rel attribute before add a new one.
function Add_Nofollow_Attr($Content, $Whitelist = [], $Add_Target_Blank = true)
{
$Whitelist[] = $_SERVER['HTTP_HOST'];
foreach ($Whitelist as $Key => $Link)
{
$Host = preg_replace('#^https?://#', '', $Link);
$Host = "https?://". preg_quote($Host, '/');
$Whitelist[$Key] = $Host;
}
if(preg_match_all("/<a .*?>/", $Content, $matches, PREG_SET_ORDER))
{
foreach ($matches as $Anchor_Tag)
{
$IS_Rel_Exist = $IS_Follow_Exist = $IS_Target_Blank_Exist = $Is_Valid_Tag = false;
if(preg_match_all("/(\w+)\s*=\s*['|\"](.*?)['|\"]/",$Anchor_Tag[0],$All_matches2))
{
foreach ($All_matches2[1] as $Key => $Attr_Name)
{
if($Attr_Name == 'href')
{
$Is_Valid_Tag = true;
$Url = $All_matches2[2][$Key];
// bypass #.. or internal links like "/"
if(preg_match('/^\s*[#|\/].*/', $Url))
{
continue 2;
}
foreach ($Whitelist as $Link)
{
if (preg_match("#$Link#", $Url)) {
continue 3;
}
}
}
else if($Attr_Name == 'rel')
{
$IS_Rel_Exist = true;
$Rel = $All_matches2[2][$Key];
preg_match("/[n|d]ofollow/", $Rel, $match, PREG_OFFSET_CAPTURE);
if( count($match) > 0 )
{
$IS_Follow_Exist = true;
}
else
{
$New_Rel = 'rel="'. $Rel . ' nofollow"';
}
}
else if($Attr_Name == 'target')
{
$IS_Target_Blank_Exist = true;
}
}
}
$New_Anchor_Tag = $Anchor_Tag;
if(!$IS_Rel_Exist)
{
$New_Anchor_Tag = str_replace(">",' rel="nofollow">',$Anchor_Tag);
}
else if(!$IS_Follow_Exist)
{
$New_Anchor_Tag = preg_replace("/rel=[\"|'].*?[\"|']/",$New_Rel,$Anchor_Tag);
}
if($Add_Target_Blank && !$IS_Target_Blank_Exist)
{
$New_Anchor_Tag = str_replace(">",' target="_blank">',$New_Anchor_Tag);
}
$Content = str_replace($Anchor_Tag,$New_Anchor_Tag,$Content);
}
}
return $Content;
}
To use it:
$Page_Content = 'internal
internal
google
example
stackoverflow';
$Whitelist = ["http://yoursite.com","http://localhost"];
echo Add_Nofollow_Attr($Page_Content,$Whitelist,true);
WordPress decision:
function replace__method($match) {
list($original, $tag) = $match; // regex match groups
$my_folder = "/articles"; // re-add quirky config here
$blog_url = 'https://'.$_SERVER['SERVER_NAME'];
if (strpos($tag, "nofollow")) {
return $original;
}
elseif (strpos($tag, $blog_url) && (!$my_folder || !strpos($tag, $my_folder))) {
return $original;
}
else {
return "<$tag rel='nofollow'>";
}
}
add_filter( 'the_content', 'add_nofollow_to_external_links', 1 );
function add_nofollow_to_external_links( $content ) {
$content = preg_replace_callback('~<(a\s[^>]+)>~isU', "replace__method", $content);
return $content;
}
a good script which allows to add nofollow automatically and to keep the other attributes
function nofollow(string $html, string $baseUrl = null) {
return preg_replace_callback(
'#<a([^>]*)>(.+)</a>#isU', function ($mach) use ($baseUrl) {
list ($a, $attr, $text) = $mach;
if (preg_match('#href=["\']([^"\']*)["\']#', $attr, $url)) {
$url = $url[1];
if (is_null($baseUrl) || !str_starts_with($url, $baseUrl)) {
if (preg_match('#rel=["\']([^"\']*)["\']#', $attr, $rel)) {
$relAttr = $rel[0];
$rel = $rel[1];
}
$rel = 'rel="' . ($rel ? (strpos($rel, 'nofollow') ? $rel : $rel . ' nofollow') : 'nofollow') . '"';
$attr = isset($relAttr) ? str_replace($relAttr, $rel, $attr) : $attr . ' ' . $rel;
$a = '<a ' . $attr . '>' . $text . '</a>';
}
}
return $a;
},
$html
);
}