I am trying to create a syntax highlighter. I need help writing a regular expression that finds and highlights punctuation marks, but the match should exclude anything inside quoted strings ("" or '') and anything that has been commented out or defined (//, /**/, #). I am currently using this pattern:
'/(!|%|\^|\*|\(|\)|\+|\=|\-|>|<|\?|,|\.|\:|\;|\'|&|\[|\]|\}|\{|~)(?=[^>]*(<|$))/'
This part seems to work to exclude stuff between tags but honestly it looks incorrect and I am clueless on how to structure the pattern correctly.
(?=[^>]*(<|$))
Here is my Highlight class
class SyntaxHighlight
{
public static function process($s,$lang,$raw = false)
{
$orig_s = $s;
$regexp = array(
//Keywords working
'/(?<!\w|\$|\%|\#|>)(and|or|xor|for|do|while|foreach|as|return|die|exit|if|then|else|
elseif|new|delete|try|throw|catch|finally|class|function|string|
array|object|resource|var|bool|boolean|int|integer|float|double|
real|string|array|global|const|static|public|private|protected|
published|extends|switch|true|false|null|void|this|self|struct|
char|signed|unsigned|short|long|break|goto|static_cast|const_cast|
LIMIT|ASC|DESC|ORDER|BY|SELECT|FROM|WHERE)(?!\w|=")/iux'
=> '<span class="K">$1</span>',
//$var, %var, #var
'/(?<!\w)(
(\$|\%|\#)(\->|\w)+
)(?!\w)/ix'
=> '<span class="V">$1</span>',
// Numbers (also look for Hex)
'/(?<!\w)(
0x[\da-f]+|
\d+
)(?!\w)(?=[^"]*(<|"))/ix'
=> '<span class="N">$1</span>',
//comments
'/(\/\*.*?\*\/|
\/\/.*?\n|
\#.[^a-zA-Z0-9]+?\|
\<\!\-\-[\s\S]+\-\-\>|
(?<!\\\)\'(.*?)(?<!\\\)\'
)/isx'
=> '<span class="C">$1</span>',
//char strings
'/\'(.+?)\'(?=[^>]*(<|$))/'
=> "<span class='CS'>'$1'</span>",
//back quote strings
'/\`(.+?)\`(?=[^>]*(<|$))/'
=> "<span class='BS'>`$1`</span>",
//strings
'/"(.+?)"(?=[^>]*(<|$))/'
=> '<span class="S">"$1"</span>',
//Punc
'/(!|%|\^|\*|\(|\)|\+|\=|\-|>|<|\?|,|\.|\:|\;|\'|&|\[|\]|\}|\{|~)(?=[^>]*(<|$))/'
=> '<span class="P">$1</span>',
);
$s = preg_replace( array_keys($regexp), array_values($regexp), $s);
if($lang == "Text" || $lang == "Other")
$s = $orig_s;
if($raw == "false")
$s = self::DisplayLineCount($s);
return '<pre style = "background-color:white;margin:0 auto;text-align:left;overflow:auto;">'.$s.'</pre>';
}
private static function DisplayLineCount($s)
{
$i=1;
$count = substr_count($s, "\n");
$s = "<span style = 'color:#dfecf2;'>".$i."\t</span>".$s;
while($i <= $count)
{
++$i;
$s = preg_replace("/[\n]/","<span style = 'color:#dfecf2;'>".$i."\t</span>",$s,1);
}
return $s;
}
}
I am getting the text to highlight from a database and calling the function like this:
if(isset($row['syntax']) && $row['syntax'] == $cat_ && isset($row['title']) && $row['title'] == $top_ && isset($row['id']) && $row['id'] == $id_)
{
$id_ = isset($row['id']) ? $row['id'] : '';
$top_ = isset($row['title']) ? $row['title'] : '';
$top_ = htmlspecialchars($top_);
$auth_ = isset($row['author']) ? $row['author'] : '';
$date_ = isset($row['date']) ? $row['date'] : '';
$post_ = isset($row['paste']) ? $row['paste'] : '';
$newtext = htmlspecialchars($post_);
echo SyntaxHighlight::process($newtext,$cat_,$raw);
}
I am very new to using regular expression patterns, so please forgive me, and thank you in advance.
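One way to keep punctuation inside strings and comments from being highlighted is to tokenize in a single pass: match strings and comments first, and wrap a match in a span only when it is bare punctuation. The following is a minimal sketch under that assumption, not a drop-in replacement for the class above. The function name highlight_punctuation is hypothetical, and it expects raw source text that has not yet been passed through htmlspecialchars().

```php
<?php
// Hypothetical helper (not part of the SyntaxHighlight class above).
// Strings and comments are matched first so that punctuation inside
// them is consumed without being wrapped in a span.
function highlight_punctuation(string $code): string
{
    $skip = '"(?:\\\\.|[^"\\\\])*"'       // double-quoted string
          . '|\'(?:\\\\.|[^\'\\\\])*\''   // single-quoted string
          . '|/\*.*?\*/'                  // block comment
          . '|(?://|#)[^\n]*';            // line comment (// or #)
    $punc = '[!%^*()+=\-<>?,.:;&\[\]{}~]';
    $pattern = '@(' . $skip . ')|(' . $punc . ')@s';

    return preg_replace_callback($pattern, function (array $m): string {
        // Group 2 is set only when bare punctuation matched.
        if (isset($m[2]) && $m[2] !== '') {
            return '<span class="P">' . htmlspecialchars($m[2]) . '</span>';
        }
        // A string or comment: escape it, but do not highlight inside it.
        return htmlspecialchars($m[0]);
    }, $code);
}

echo highlight_punctuation('a = "x,y"; // b, c'), "\n";
```

Because every token is decided in one pass, there is no need for a lookahead such as (?=[^>]*(<|$)) to guess whether a match sits inside an already generated tag. Keywords, numbers and variables could be added as further alternatives in the same pattern.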
I have the script below and I need to return both of the conditional return values, each result on its own line. Is there any way to return them as an array to make this work?
The script is:
if ($pdf_url != '') {
if ($title != '') {
$title_from_url = $this->make_title_from_url($pdf_url);
if ($title == $title_from_url || $this->make_title_from_url('/'.$title) == $title_from_url) {
// This would be the default title anyway based on URL
// OR if you take .pdf off title it would match, so that's close enough - don't load up shortcode with title param
$title = '';
} else {
$title = ' title="' . esc_attr( $title ) . '"';
}
}
return apply_filters('pdfemb_override_send_to_editor', '[pdf-embedder url="' . $pdf_url . '"'.$title.']', $html, $id, $attachment);
} else {
return $html;
}
If you want each result on a line, it sounds like you could use a temporary variable, and concatenate to it based on the logic.
<?php
$output = '';
if ($pdf_url != '') {
if ($title != '') {
$title_from_url = $this->make_title_from_url($pdf_url);
if ($title == $title_from_url || $this->make_title_from_url('/'.$title) == $title_from_url) {
// This would be the default title anyway based on URL
// OR if you take .pdf off title it would match, so that's close enough - don't load up shortcode with title param
$title = '';
} else {
$title = ' title="' . esc_attr( $title ) . '"';
}
}
// You may need print_r() around apply_filters() if it returns an array
$output .= apply_filters('pdfemb_override_send_to_editor', '[pdf-embedder url="' . $pdf_url . '"'.$title.']', $html, $id, $attachment) . "\n";
} else {
$output .= $html . "\n";
}
return $output;
Or if you want it in array:
<?php
$output = [];
if ($pdf_url != '') {
if ($title != '') {
$title_from_url = $this->make_title_from_url($pdf_url);
if ($title == $title_from_url || $this->make_title_from_url('/'.$title) == $title_from_url) {
// This would be the default title anyway based on URL
// OR if you take .pdf off title it would match, so that's close enough - don't load up shortcode with title param
$title = '';
} else {
$title = ' title="' . esc_attr( $title ) . '"';
}
}
// You may need print_r() around apply_filters() if it returns an array
$output[] = apply_filters('pdfemb_override_send_to_editor', '[pdf-embedder url="' . $pdf_url . '"'.$title.']', $html, $id, $attachment);
} else {
$output[] = $html;
}
return $output;
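If both values are wanted at once rather than one or the other, the array can collect them and the caller can decide how to print them. This is only a sketch: collect_results is a hypothetical stand-in for the method above, with the title handling and apply_filters() call left out.

```php
<?php
// Hypothetical stand-in for the method above: it gathers every result
// into one array instead of returning at the first match.
function collect_results(string $pdfUrl, string $html): array
{
    $output = [];
    if ($pdfUrl !== '') {
        $output[] = '[pdf-embedder url="' . $pdfUrl . '"]';
    }
    $output[] = $html;
    return $output;
}

// The caller puts each result on its own line.
echo implode("\n", collect_results('https://example.com/a.pdf', '<p>fallback</p>')), "\n";
```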
I'm using Smarty for my website, with a custom link modifier that turns URLs in users' comments into clickable links. It works well, except that it also converts pieces of text such as "p.s." or "someword.asag". Is there any way to improve it so that it only acts on real web links?
function smarty_modifier_autolink($string)
{
$string = preg_replace_callback('#(?:https?://\S+)|(?:www.\S+)|(?:\S+\.\S+)#', function($arr)
{
$email = filter_var($arr[0], FILTER_VALIDATE_EMAIL) === $arr[0] ? true : false;
$origArr0 = $arr[0];
if(strpos($arr[0], 'http://') !== 0 && strpos($arr[0], 'https://') !== 0)
{
$arr[0] = 'http://' . $arr[0];
}
$url = parse_url($arr[0]);
// images
if(isset($url['path'])){
if(preg_match('#\.(png|jpg|gif)$#', $url['path']))
{
return '<img class="img-responsive" src="'. $arr[0] . '" />';
}
}
// youtube
if(in_array($url['host'], array('www.youtube.com', 'youtube.com'))
&& $url['path'] == '/watch'
&& isset($url['query']))
{
parse_str($url['query'], $query);
return sprintf('<div class="embed-responsive embed-responsive-16by9"><iframe class="embed-responsive-item" src="http://www.youtube.com/embed/%s" allowfullscreen></iframe></div>', $query['v']);
}
//links
if($email){
return sprintf('<a href="mailto:%1$s">%1$s</a>', $origArr0);
}else{
return sprintf('<a href="%1$s">%2$s</a>', $arr[0], str_replace($url['scheme'].'://', '', $arr[0]));
}
}, $string);
return $string;
}
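The pattern above treats any "word.word" as a link, so one possible fix is to validate each candidate before converting it. The sketch below assumes that scheme-less text should only count as a link when its host ends in a known top-level domain. The function name looks_like_web_link and the short default list of TLDs are both hypothetical, and a real list would need to be much longer.

```php
<?php
// Hypothetical check, meant to be called from the callback before a link is built.
function looks_like_web_link(string $candidate, array $allowedTlds = ['com', 'net', 'org']): bool
{
    // Anything with an explicit scheme only needs to be a valid URL.
    if (preg_match('#^https?://#i', $candidate)) {
        return filter_var($candidate, FILTER_VALIDATE_URL) !== false;
    }

    // Scheme-less text: parse it as if it were a URL and inspect the host.
    $host = parse_url('http://' . $candidate, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return false;
    }

    // Take whatever follows the last dot and compare it with the allowed list.
    $tld = strtolower((string) substr(strrchr($host, '.') ?: '', 1));
    return in_array($tld, $allowedTlds, true);
}

var_dump(looks_like_web_link('p.s.'));
var_dump(looks_like_web_link('www.example.com/page'));
```

The callback could return $arr[0] unchanged whenever this check fails, leaving text like "p.s." as it was.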
I need to create a function that checks a parsed value against a few other values and returns the one that matches. For example, I am trying to match video URLs correctly: if it's YouTube do this, if it's Vimeo do that, and if it's neither do something else. I know how to create a function, but I'm not sure what to use for the parsing. Would it be parse_url?
For my test cases I need to send in the right parameter and then see that the returned values are matching what I want them to be.
Here's what I've tried so far:
function get_video_embed_string($videostring) {
$video_url_parse = parse_url( $videostring, PHP_URL_HOST ); //get the input string ready to parse
$returnstring = ""; //default return string to empty string
if ($video_url_parse === 'vimeo.com') {
$returnstring = str_replace( 'vimeo.com', 'player.vimeo.com', $video_url_parse );
} else if ($video_url_parse === 'youtube.com') {
$returnstring = str_replace( 'youtube.com', 'youtube.com/embed/', $video_url_parse );
} else {
//do nothing
}
return $returnstring;
}
parse_str($returnstring);
//now setup your test cases and see what echos out of the above method
if ($returnstring === 'player.vimeo.com') {
echo "vimeo: <" . get_video_embed_string ("https://vimeo.com/abcdefg123") . ">";
} else if ($returnstring === 'youtube.com/embed/'){
echo "youtube: <" . get_video_embed_string ("https://youtube.com/abcdefg123") . ">";
} else if($returnstring === '' ){
echo "nothing: <" . get_video_embed_string ("https://abc123.com/abcdefg123") . ">";
} else {
echo "empty:< " . get_video_embed_string ("") . ">";
}
I think you're on the right track using parse_url, but I have a few suggestions for improvement:
Instead of the run-on if/elseif chain, use a switch.
The str_replace isn't working as intended because you're replacing within the parsed host rather than the full URL. Besides, there is no need to search again for a string you have already found.
In the user comments for parse_url, there's an excellent example that reconstructs a parsed URL. This avoids string replacements going wrong when the host name also appears elsewhere in the URL (www.youtube.com/youtubevideo123).
Simplify your test cases by calling your function once per case instead of checking with another if/else chain.
function get_video_embed_string($videostring) {
$video_url_parse = parse_url($videostring); //get the input string ready to parse
switch ($video_url_parse['host'] ?? '') { // host is missing for empty or relative input
case 'vimeo.com':
$video_url_parse['host'] = 'player.vimeo.com';
return unparse_url($video_url_parse);
case 'youtube.com':
$video_url_parse['host'] = 'youtube.com/embed';
return unparse_url($video_url_parse);
default:
return unparse_url($video_url_parse);
}
}
function unparse_url($parsed_url) {
$scheme = isset($parsed_url['scheme']) ? $parsed_url['scheme'] . '://' : '';
$host = isset($parsed_url['host']) ? $parsed_url['host'] : '';
$port = isset($parsed_url['port']) ? ':' . $parsed_url['port'] : '';
$user = isset($parsed_url['user']) ? $parsed_url['user'] : '';
$pass = isset($parsed_url['pass']) ? ':' . $parsed_url['pass'] : '';
$pass = ($user || $pass) ? "$pass@" : '';
$path = isset($parsed_url['path']) ? $parsed_url['path'] : '';
$query = isset($parsed_url['query']) ? '?' . $parsed_url['query'] : '';
$fragment = isset($parsed_url['fragment']) ? '#' . $parsed_url['fragment'] : '';
return "$scheme$user$pass$host$port$path$query$fragment";
}
//now setup your test cases and see what echos out of the above method
echo "vimeo: <" . get_video_embed_string ("https://vimeo.com/abcdefg123") . ">\n";
echo "youtube: <" . get_video_embed_string ("https://youtube.com/abcdefg123") . ">\n";
echo "nothing: <" . get_video_embed_string ("https://abc123.com/abcdefg123") . ">\n";
echo "empty:< " . get_video_embed_string ("") . ">\n";
This will result in the following output in source:
vimeo: <https://player.vimeo.com/abcdefg123>
youtube: <https://youtube.com/embed/abcdefg123>
nothing: <https://abc123.com/abcdefg123>
empty:< >
parse_url() is well suited to parsing URLs and, in your case, extracting the host name.
Your example is a little mixed up: $returnstring is not defined outside of your function. You should turn error reporting on so that you see NOTICE messages for this kind of error.
I assume your function should return the video embed URL, not just the host name. So you should do the replacement on $videostring, not $video_url_parse:
function get_video_embed_string($videostring) {
$video_url_parse = parse_url( $videostring, PHP_URL_HOST ); //get the input string ready to parse
$returnstring = ""; //default return string to empty string
if ($video_url_parse === 'vimeo.com') {
$returnstring = str_replace( 'vimeo.com', 'player.vimeo.com', $videostring );
} else if ($video_url_parse === 'youtube.com') {
$returnstring = str_replace( 'youtube.com', 'youtube.com/embed', $videostring );
} else {
//do nothing
}
return $returnstring;
}
This will give you this output:
echo get_video_embed_string("https://vimeo.com/abcdefg123"); // https://player.vimeo.com/abcdefg123
echo get_video_embed_string("https://youtube.com/abcdefg123"); // https://youtube.com/embed/abcdefg123
echo get_video_embed_string("https://abc123.com/abcdefg123"); // <empty string>
[For a more robust approach, I would probably try to extract the video ID from all known valid URL schemes using regexp and just insert this ID in the embed url.]
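A minimal sketch of that ID-based approach follows. The function name video_embed_url is hypothetical, only a few common URL shapes are covered, and the 11-character YouTube ID length is an assumption rather than a guarantee.

```php
<?php
// Hypothetical helper: extract the video ID and build the embed URL from it.
function video_embed_url(string $url): string
{
    // YouTube: /watch?v=ID (possibly after other parameters) or youtu.be/ID.
    if (preg_match('~^https?://(?:www\.)?youtube\.com/watch\?(?:.*&)?v=([\w-]{11})~', $url, $m)
        || preg_match('~^https?://youtu\.be/([\w-]{11})~', $url, $m)) {
        return 'https://www.youtube.com/embed/' . $m[1];
    }

    // Vimeo: a numeric ID directly after the host.
    if (preg_match('~^https?://(?:www\.)?vimeo\.com/(\d+)~', $url, $m)) {
        return 'https://player.vimeo.com/video/' . $m[1];
    }

    // Unknown host or unrecognized URL shape.
    return '';
}

echo video_embed_url('https://vimeo.com/76979871'), "\n";
```

Building the embed URL from the ID alone means that extra path segments or query parameters in the input cannot leak into the result.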
Can anybody help? I'm trying to build a search in PHP that searches a text field in MySQL.
I would like users to be able to enter must-have criteria, OR criteria and NOT criteria, as well as search for phrases. For example:
("This phrase" OR "That phrase") AND word
At present I'm using the example below to generate a search string:
$all = $row['and_search'] ;
$any = $row['or_search'] ;
$none = $row['not_search'];
if((!$all) || ($all == "")) { $all = ""; } else { $all = "$all"; }
if((!$any) || ($any == "")) { $any = ""; }
if((!$none) || ($none == "")) { $none = ""; } else { $none = "$none"; }
The above works brilliantly for single words, but not for searches such as the example above.
Any ideas how I can achieve this?
Split $all, $any and $none on whitespace into arrays. Then join them like this:
$newAll = '("' . implode('" OR "', explode(' ', $all)) . '")';
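Splitting on single spaces breaks quoted phrases apart, so a variation is to tokenize with a regular expression that keeps quoted phrases whole. This is only a sketch: split_terms and join_terms are hypothetical helper names, and the result still has to be escaped or bound as a parameter before it goes anywhere near a query.

```php
<?php
// Hypothetical helper: split input into terms, keeping "quoted phrases" together.
function split_terms(string $input): array
{
    preg_match_all('/"([^"]*)"|(\S+)/', $input, $matches, PREG_SET_ORDER);
    $terms = [];
    foreach ($matches as $m) {
        // Group 1 holds a quoted phrase, group 2 an unquoted word.
        $term = ($m[1] !== '') ? $m[1] : ($m[2] ?? '');
        if ($term !== '') {
            $terms[] = $term;
        }
    }
    return $terms;
}

// Hypothetical helper: quote each term and join the terms with the given operator.
function join_terms(array $terms, string $operator): string
{
    return '("' . implode('" ' . $operator . ' "', $terms) . '")';
}

echo join_terms(split_terms('"This phrase" "That phrase"'), 'OR'), "\n";
```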
I'm using the jQuery Address plugin to build an Ajax-driven site, and I've got it working! Yay! For the purposes of this question we can use the test site:
http://www.asual.com/jquery/address/samples/crawling
http://www.asual.com/download/jquery/address
(I had to remove two calls to urlencode() to make the crawling example work.)
I'm encountering a problem with the $crawling->nav() call. It basically uses js and php to load parts of an xml file into the dom. I (mostly) understand how it works, and I would like to modify the example code to include sub pages.
For example, I would like to show 'subnav-project.html' at '/!#/project' and '/!#/project/blue', but not at '/!#/contact'. To do this, I figure php should 'know' what page the user is on, that way I can base my logic off of that.
Is this crazy? Can php ever know the current state of the site if I'm building it this way? If not, how does one selectively load html snippets, or modify what links are shown in navigation menus?
I've never gotten too crazy with ajax before, so any feedback at all would be helpful.
EDIT
This is the crawling class.
class Crawling {
const fragment = '_escaped_fragment_';
function Crawling(){
// Initializes the fragment value
$fragment = (!isset($_REQUEST[self::fragment]) || $_REQUEST[self::fragment] == '') ? '/' : $_REQUEST[self::fragment];
// Parses parameters if any
$this->parameters = array();
$arr = explode('?', $fragment);
if (count($arr) > 1) {
parse_str($arr[1], $this->parameters);
}
// Adds support for both /name and /?page=name
if (isset($this->parameters['page'])) {
$this->page = '/?page=' . $this->parameters['page'];
} else {
$this->page = $arr[0];
}
// Loads the data file
$this->doc = new DOMDocument();
$this->doc->load('data.xml');
$this->xp = new DOMXPath($this->doc);
$this->nodes = $this->xp->query('/data/page');
$this->node = $this->xp->query('/data/page[@href="' . $this->page . '"]')->item(0);
if (!isset($this->node)) {
header("HTTP/1.0 404 Not Found");
}
}
function base() {
$arr = explode('?', $_SERVER['REQUEST_URI']);
return $arr[0] != '/' ? preg_replace('/\/$/', '', $arr[0]) : $arr[0];
}
function title() {
if (isset($this->node)) {
$title = $this->node->getAttribute('title');
} else {
$title = 'Page not found';
}
echo($title);
}
function nav() {
$str = '';
// Prepares the navigation links
foreach ($this->nodes as $node) {
$href = $node->getAttribute('href');
$title = $node->getAttribute('title');
$str .= '<li><a href="' . $this->base() . ($href == '/' ? '' : '?' . self::fragment . '=' .html_entity_decode($href)) . '"'
. ($this->page == $href ? ' class="selected"' : '') . '>'
. $title . '</a></li>';
}
echo($str);
}
function content() {
$str = '';
// Prepares the content with support for a simple "More..." link
if (isset($this->node)) {
foreach ($this->node->childNodes as $node) {
if (!isset($this->parameters['more']) && $node->nodeType == XML_COMMENT_NODE && $node->nodeValue == ' page break ') {
$str .= '<p><a href="' . $this->page .
(count($this->parameters) == 0 ? '?' : '&') . 'more=true' . '">More...</a></p>';
break;
} else {
$str .= $this->doc->saveXML($node);
}
}
} else {
$str .= '<p>Page not found.</p>';
}
echo(preg_replace_callback('/href="(\/[^"]+|\/)"/', array(get_class($this), 'callback'), $str));
}
private function callback($m) {
return 'href="' . ($m[1] == '/' ? $this->base() : ($this->base() . '?' . self::fragment . '=' .$m[1])) . '"';
}
}
$crawling = new Crawling();
You won't be able to make server-side decisions using the fragment-identifier (i.e., everything to the right of the # character). This is because browsers don't send fragment-identifiers to the server. If you're going to want to make server-side decisions, you'll need to use some JavaScript assistance (including AJAX) to communicate what the current fragment-identifier is.
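Once the fragment path does reach the server as an ordinary request parameter, PHP can branch on it. The sketch below assumes the path arrives in the _escaped_fragment_ query parameter, as the Crawling class in the question expects. The helper name shows_project_subnav is hypothetical, and subnav-project.html is the snippet named in the question.

```php
<?php
// Hypothetical helper: decide whether the project sub-navigation applies to a page path.
function shows_project_subnav(string $path): bool
{
    return $path === '/project' || strpos($path, '/project/') === 0;
}

// Assumption: client-side JavaScript (or a crawler) passes the fragment path
// in the _escaped_fragment_ query parameter; default to the root page otherwise.
$path = (isset($_GET['_escaped_fragment_'])
        && is_string($_GET['_escaped_fragment_'])
        && $_GET['_escaped_fragment_'] !== '')
    ? $_GET['_escaped_fragment_']
    : '/';

if (shows_project_subnav($path)) {
    // Output the snippet only for /project and its sub pages.
    readfile('subnav-project.html');
}
```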