PHP Equivalent to Perl's URI::URL - php

I'm in the process of rewriting in PHP a Perl-based web crawler I wrote nearly 8 years ago. I used the quite handy URI::URL module in Perl to do things like:
$sourceUrl = '/blah.html';
$baseHost = 'http://www.example.com';
my $url = URI::URL->new($sourceUrl, $baseHost);
return $url->abs;
returns: 'http://www.example.com/blah.html'
The parse_url function in PHP is quite handy, but is there something more robust? Specifically, something that provides the functionality above?

Maybe Zend_Uri is what you are looking for?

print $baseHost . $sourceUrl;
Am I missing something? Your way seems needlessly overcomplicated.

I did a bit of searching on the PEAR archive, and my first-guess approximation of URI::URL is Net_URL2. Maybe you want to give that a shot?
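For reference, the resolution step itself can be sketched in plain PHP on top of parse_url(). This is only a rough sketch, not the behaviour of Zend_Uri or Net_URL2: resolve_url() is a hypothetical helper name, it assumes the base is an absolute URL with a scheme and host, and it follows RFC 3986 section 5.2 only loosely (userinfo and other edge cases are ignored). It also shows why plain concatenation is not enough once references such as ../c.html appear.

```php
<?php
// Hypothetical helper (not part of PHP or any library): resolve $relative against $base.
// Assumes $base is an absolute URL with a scheme and host.
function resolve_url(string $relative, string $base): string
{
    $b = parse_url($base);
    $scheme = $b['scheme'];

    // Protocol-relative reference such as //cdn.example.com/x.js
    if (strpos($relative, '//') === 0) {
        return $scheme . ':' . $relative;
    }
    // Reference that already carries its own scheme: return it unchanged.
    if (is_string(parse_url($relative, PHP_URL_SCHEME))) {
        return $relative;
    }

    $authority = $b['host'] . (isset($b['port']) ? ':' . $b['port'] : '');
    $basePath = $b['path'] ?? '/';

    // Split the reference into its path and any trailing ?query / #fragment.
    $cut = strcspn($relative, '?#');
    $path = substr($relative, 0, $cut);
    $suffix = (string) substr($relative, $cut);

    if ($path === '') {
        // Query- or fragment-only reference: keep the base path,
        // and keep the base query unless the reference supplies its own.
        $path = $basePath;
        if (($suffix === '' || $suffix[0] === '#') && isset($b['query'])) {
            $suffix = '?' . $b['query'] . $suffix;
        }
    } elseif ($path[0] !== '/') {
        // Relative path: replace everything after the last "/" of the base path.
        $path = substr($basePath, 0, strrpos($basePath, '/') + 1) . $path;
    }

    // Remove "." and ".." segments without climbing above the root.
    $out = [];
    $seg = '';
    foreach (explode('/', $path) as $seg) {
        if ($seg === '..') {
            if (count($out) > 1) {
                array_pop($out);
            }
        } elseif ($seg !== '.') {
            $out[] = $seg;
        }
    }
    // A trailing "." or ".." leaves a directory, so keep the trailing slash.
    if ($seg === '.' || $seg === '..') {
        $out[] = '';
    }

    return $scheme . '://' . $authority . implode('/', $out) . $suffix;
}

echo resolve_url('/blah.html', 'http://www.example.com'), "\n";
```

With the values from the question, this prints http://www.example.com/blah.html.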

Related

PHP Regex problem!

I was creating a syntax highlighter in PHP, but I failed. While implementing syntax highlighting (gray) for script comments (//), I ran into a problem, so I created a shortened version of my highlighting function to demonstrate it. Whenever a PHP variable, e.g. $example, appears inside a comment, it doesn't get grayed out as it should. I'm using preg_replace() to achieve this, but the regex I'm currently using doesn't seem to be right. I tried almost everything I know, but it doesn't work. See the demo code below.
Problem Demo Code
<?php
$str = '
<?php
//This is a php comment $test and resulted bad!
$text_cool++;
?>
';
$result = str_replace(array('<','>','/'),array('[',']','%%'),$str);
$result = preg_replace("/%%%%(.*?)(?=(\n))/","<span style=\"color:gray;\">$0</span>",$result);
$result = preg_replace("/(?<!\"|'|%%%%\w\s\t)[\$](?!\()(.*?)(?=(\W))/","<span style=\"color:green;\">$0</span>",$result);
$result = str_replace(array('[',']','%%'),array('<','>','/'),$result);
$resultArray = explode("\n",$result);
foreach ($resultArray as $i) {
echo $i.'</br>';
}
?>
Problem Demo Screen
So the result I want is that $test in the comment string of the 'Demo Screen' above should also be colored gray! (See below.)
Can anyone help me solve this problem?
I'm aware of the highlight_string() function!
THANKS IN ADVANCE!
Reinventing the wheel?
highlight_string()
Also, this is why they have parsers, and regex (despite popular demand) should not be used as a parser.
I agree that you should use existing parsers. Every IDE has a PHP parser, and many people have written more of them.
That said, I do think it is worth the mental exercise. So, you can replace:
$result = preg_replace("/(?<!\"|')[\$](?!\()(.*?)(?=(\W))/","<span style=\"color:green;\">$0</span>",$result);
with
//regular expression.:
//#([^(%%%%|\"|')]*)([\$](?!\()(.*?)(?=(\W)))#
//replacement text:
//$1<span style=\"color:green;\">$2</span>
$result = preg_replace("#([^(%%%%|\"|')]*)([\$](?!\()(.*?)(?=(\W)))#","$1<span style=\"color:green;\">$2</span>",$result);
Personally, I think your best bet is to use CSS selectors. Replace style=\"color:gray;\" with class="comment-text" and style=\"color:green;\" with class="variable-text" and this CSS should work for you:
.variable-text {
color: #00E;
}
.comment-text, .comment-text .variable-text {
color: #DDD;
}
Insert don't use regex to parse irregular languages here
Anyway, it looks like you've run into a prime example of why regular expressions are not suited for this kind of problem. You'd be better off looking into PHP's highlight_string functionality.
Well, you don't seem to care that php already has a function like this.
But because of the structure of php code one cannot simply use a regex for this or walk into mordor (the latter being the easier).
You have to use a parser or you will fly over the cuckoo's nest soon.
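As a rough illustration of the "use a parser" advice, PHP's built-in tokenizer (token_get_all()) already separates comments from variables, so a variable inside a comment never shows up as its own token. The sketch below is one possible approach, not the method anyone in this thread posted; highlight_tokens() is a hypothetical name, and it reuses the gray/green colors from the question.

```php
<?php
// Hypothetical helper: color comments gray and variables green using the tokenizer.
function highlight_tokens(string $source): string
{
    $html = '';
    foreach (token_get_all($source) as $token) {
        // Tokens are either [id, text, line] arrays or single-character strings.
        if (is_array($token)) {
            $id = $token[0];
            $text = $token[1];
        } else {
            $id = null;
            $text = $token;
        }
        $escaped = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

        if ($id === T_COMMENT || $id === T_DOC_COMMENT) {
            // The whole comment is one token, including any "$name" inside it.
            $html .= '<span style="color:gray;">' . $escaped . '</span>';
        } elseif ($id === T_VARIABLE) {
            $html .= '<span style="color:green;">' . $escaped . '</span>';
        } else {
            $html .= $escaped;
        }
    }
    return $html;
}

$demo = "<?php\n//This is a php comment \$test and resulted bad!\n\$text_cool++;\n";
echo '<pre>', highlight_tokens($demo), '</pre>';
```

The output is meant to be wrapped in a pre element (as in the last line) so that line breaks are preserved.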

Is there a PHP equivalent of Perl's URI::ParseSearchString?

I'm doing some work for a client that involves parsing the referrer information from Google et al to target various parts of a page to the user's search keywords.
I noticed that Perl's CPAN has a module called URI::ParseSearchString which seems to do exactly what I need. The problem is, I need to do it in PHP.
So, to avoid reinventing the wheel, does anyone know if there is a library out there for PHP that does the same / similar thing?
parse_str() is what you are looking for.
You may additionally want to use parse_url() to get the search string.
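Combining the two functions might look like the sketch below. search_terms_from_referrer() is a hypothetical name, and the assumption that the search terms arrive in a parameter called q holds for Google-style URLs but not for every search engine, so the parameter name is left configurable.

```php
<?php
// Hypothetical helper: return the search terms from a referrer URL, or null.
// Assumes the engine passes the terms in the "q" parameter unless told otherwise.
function search_terms_from_referrer(string $referrer, string $param = 'q'): ?string
{
    // Extract only the query string ("q=...&hl=en").
    $query = parse_url($referrer, PHP_URL_QUERY);
    if (!is_string($query)) {
        return null;
    }
    // parse_str() decodes the pairs into the array passed as its second argument.
    parse_str($query, $params);
    if (isset($params[$param]) && is_string($params[$param])) {
        return $params[$param];
    }
    return null;
}

var_dump(search_terms_from_referrer('http://www.google.com/search?q=php+uri+parser&hl=en'));
```

The returned string is user-controlled input, so it should be escaped before being echoed into a page.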
I'm the author of the module. As far as I know, I've never seen something similar for PHP. If you do come across anything, please do let me know.
That being said, I cannot imagine this being very hard to port to PHP, and I can have an attempt at it if you don't find anything similar out there.
Spiros
Maybe this is too inefficient or the http_referer isn't showing the full uri ...
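function parse_uri($uri) {
    // substr_count() takes the haystack first, then the needle.
    if (substr_count($uri, '?') > 0) {
        // Keep only the part after the first "?".
        $queryString = explode('?', $uri, 2);
        $query = $queryString[1];
    } else {
        $query = $uri;
    }
    // parse_str() returns nothing; it fills the array passed as its second argument.
    parse_str($query, $params);
    return $params;
}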
if (isset($_SERVER['HTTP_REFERER'])) {
print_r(parse_uri($_SERVER['HTTP_REFERER']));
}

Character-wise string diff in PHP

In short I am looking for something like google-diff-match-patch in PHP.
I have had a look at some similar questions at SO, and also at the algorithm provided here, but all of them fail:
diff("draßen", "da draußen")
should not give
<del>draßen</del> <ins>da draußen</ins>
(which is kind of stupid for my purpose, because I want to compare file names), but (try here)
<ins>da </ins>dra<ins>u</ins>ßen
Is there a code snippet in PHP that does this? Unfortunately, I cannot use (i.e. install) external packages.
https://github.com/gorhill/PHP-FineDiff supports character-wise diff and can render the differences in HTML
The PEAR Package Text_Diff provides Inline-Diffs.
There is a php version of google-diff-match-patch available here: https://github.com/nuxodin/diff_match_patch-php
There is a port of a recent version of the google-diff-match-patch library.
It is much faster than the previous one and has no problems with UTF-8.
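If no package can be installed at all, a character-wise diff can also be sketched directly with a longest-common-subsequence table. This is not the diff-match-patch algorithm, just a simple O(n·m) approach that should be adequate for short strings such as file names; char_diff() is a hypothetical name, and the source file and inputs are assumed to be UTF-8.

```php
<?php
// Hypothetical helper: character-wise diff via a longest-common-subsequence table.
function char_diff(string $old, string $new): string
{
    // Split into UTF-8 characters so multibyte characters such as "ß" stay intact.
    $a = preg_split('//u', $old, -1, PREG_SPLIT_NO_EMPTY);
    $b = preg_split('//u', $new, -1, PREG_SPLIT_NO_EMPTY);
    $n = count($a);
    $m = count($b);

    // $lcs[$i][$j] = length of the LCS of $a[$i..] and $b[$j..].
    $lcs = array_fill(0, $n + 1, array_fill(0, $m + 1, 0));
    for ($i = $n - 1; $i >= 0; $i--) {
        for ($j = $m - 1; $j >= 0; $j--) {
            $lcs[$i][$j] = $a[$i] === $b[$j]
                ? $lcs[$i + 1][$j + 1] + 1
                : max($lcs[$i + 1][$j], $lcs[$i][$j + 1]);
        }
    }

    // Walk the table and collect runs of [type, text].
    $runs = [];
    $push = function (string $type, string $ch) use (&$runs) {
        $last = count($runs) - 1;
        if ($last >= 0 && $runs[$last][0] === $type) {
            $runs[$last][1] .= $ch;
        } else {
            $runs[] = [$type, $ch];
        }
    };
    $i = 0;
    $j = 0;
    while ($i < $n || $j < $m) {
        if ($i < $n && $j < $m && $a[$i] === $b[$j]) {
            $push('eq', $a[$i]);
            $i++;
            $j++;
        } elseif ($j < $m && ($i === $n || $lcs[$i][$j + 1] >= $lcs[$i + 1][$j])) {
            $push('ins', $b[$j]);
            $j++;
        } else {
            $push('del', $a[$i]);
            $i++;
        }
    }

    // Render the runs as HTML.
    $html = '';
    foreach ($runs as $run) {
        $text = htmlspecialchars($run[1], ENT_QUOTES, 'UTF-8');
        $html .= $run[0] === 'eq' ? $text : '<' . $run[0] . '>' . $text . '</' . $run[0] . '>';
    }
    return $html;
}

echo char_diff('draßen', 'da draußen'), "\n";
```

For the example from the question, this marks only the added characters as inserted, although it may group the leading insertion differently from the output shown there, since several equally short diffs exist.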

Am I breaking any "php good practice" in the following php array which deals with 3 (human) languages?

This is the most optimal way of dealing with a multilingual website I can think of, right now (not sure) which doesn't involve gettext, zend_translate or any php plugin or framework.
I think it's pretty straightforward: I have 3 languages and I write their "content" in different files (in the form of arrays), and later I call that content from my index.php, as you can see in the following picture:
[Screenshot: http://img31.imageshack.us/img31/1471/codew.png]
I just started with php and I would like to know if I'm breaking php good practices, if the code is vulnerable to XSS attack or if I'm writing more code than necessary.
EDIT: I posted a picture so that you can see the files tree (I'm not being lazy)
EDIT2: I'm using Vim with the theme ir_black and NERDTree.
Looks all right to me, although I personally prefer creating and using a dictionary helper function:
<?php echo dictionary("showcase_li2"); ?>
that would enable you to easily switch methods later, and gives you generally more control over your dictionary. Also with an array, you will have the problem of scope - you will have to import it into every function using global $language; very annoying.
You will probably also reach the point when you have to insert values into an internationalized string:
You have %1 votes left in the next %2 hours.
Sie haben %1 stimmen übrig für die nächsten %2 stunden.
Sinulla on %1 ääntä jäljellä seuraavan %2 tunnin ajassa.
that is something a helper function can be very useful for:
<?php echo dictionary("xyz", $value1, $value2 ); ?>
$value1 and $value2 would be inserted into %1 and %2 in the dictionary string.
Such a helper function can easily be built with an unlimited number of parameters using func_get_args().
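A minimal sketch of such a helper follows. The array name $language matches the global mentioned above, but the key votes_left, the fallback of returning the key itself when no string is found, and the HTML-escaping of inserted values are all assumptions made for illustration.

```php
<?php
// Hypothetical dictionary; on a real site this would come from the language file.
$GLOBALS['language'] = [
    'votes_left' => 'You have %1 votes left in the next %2 hours.',
];

// Hypothetical helper: look up $key and insert any extra arguments into %1, %2, ...
function dictionary(string $key)
{
    $args = func_get_args();
    array_shift($args); // drop $key, keep only the values

    // Fall back to the key itself when no translation exists (an assumption).
    $text = $GLOBALS['language'][$key] ?? $key;

    $replacements = [];
    foreach ($args as $i => $value) {
        // Escape inserted values so user data cannot inject HTML.
        $replacements['%' . ($i + 1)] = htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8');
    }
    // strtr() matches the longest placeholder first, so %10 is not mistaken for %1.
    return $replacements ? strtr($text, $replacements) : $text;
}

echo dictionary('votes_left', 3, 24), "\n";
```

Because the lookup goes through one function, the storage behind it (arrays, gettext, a database) can be swapped later without touching the templates.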
It's OK generally. For instance, punBB's localization works this way. It is very fast. Faster than calling a function or an object's method or property. But I see a problem with this approach, since it doesn't support language fallbacks easily. I mean, if you don't have a string for Chinese, let it be displayed in English.
This problem is topical when you upgrade your system and you don't have time to translate everything in every language.
I'd rather use something like
lang.en.php
$langs['en'] = array(
...
);
lang.cn.php
$langs['cn'] = array(
...
);
[prepend].php (some common lib)
define('DEFAULT_LANG', 'en');
include_once('lang.' . DEFAULT_LANG . '.php');
include_once('lang.' . $user->lang . '.php');
$lang = array_merge($langs[DEFAULT_LANG], $langs[$user->lang]);
Looks all right to me also, but:
Seems that you have localization for multiple modules/sites, so why not break it down to multidimensional array?
$localization = array(
'module' => (object)array(
'heading' => 'oh, no!',
'perex' => 'oh, yes!'
)
);
I personally like to create stdClass objects out of arrays with
$localization = (object)$localization;
so you can use
$localization->module->heading;
:) my 2 cents
The only way that this could be XSS is if you have register_globals=On and you don't set $lang['showcase_lil'] or other $lang entries. But I don't think you have to worry about this, so I think you're in the clear.
as an xss test:
http://127.0.0.1/whatever.php?lang[showcase_lil]=alert(/xss/)
Wouldn't it have been better to post code and briefly explain this issue to us?
Anyway, putting each language in its own file and loading it through some sort of language component seems okay. I'd prefer using some sort of gettext, but this is okay too, I guess.
You should make a function for calling the language keys rather than relying on an array, something like
<?php echo lang('yourKey'); ?>
One thing to watch for is interpolation; that's really the only place XSS could sneak in if your server settings are sensible. If you at any point need to do something along the lines of translating "$project->name has $project->member_count members", you'll have to make sure you escape all HTML that goes in there.
But other than that, you should be fine.

Get full url and split into an array with JSP/ASP

In PHP/Apache I can get the full url and cut it up into parts like this
URL: example.com/friends/enemies-cats/
Then using PHP explode function I can split the URL by the "/" into an array.
Array[0] = 'friends';
Array[1] = 'enemies-cats';
I wonder, is it possible to do the same thing on a Java server. I am hoping the same thing could work on all servers e.g. tomcat, jboss, websphere etc. I would prefer not to use things like urlrewriter if I can avoid it.
Also is it possible to achieve the same thing in ASP?
Realistically, I would like to find the easiest way to convert the URL to an array in each of PHP, JSP, and ASP.
If it is possible, any idea where to start? Any limitations? Any security issues, etc.?
JSP:
String[] stringArray = url.split("/");
PHP: You already have it...
$parts = explode('/', $url);
ASP: I don't know ASP, but here is what google found:
parts = Split(url, "/")
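On the PHP side, a plain explode() on the full URL also returns the host and an empty trailing element. A slightly more careful sketch is shown below; path_segments() is a hypothetical name, and the URL is assumed to include a scheme, because parse_url() treats a scheme-less example.com/friends/ as a path.

```php
<?php
// Hypothetical helper: return the path segments of a URL as an array.
// Assumes the URL includes a scheme (e.g. "http://").
function path_segments(string $url): array
{
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        return [];
    }
    // Drop leading and trailing slashes so no empty segments are produced at the ends.
    $trimmed = trim($path, '/');
    return $trimmed === '' ? [] : explode('/', $trimmed);
}

print_r(path_segments('http://example.com/friends/enemies-cats/'));
```

The segments come straight from the request, so they should be validated or escaped before being used in file paths, queries, or output.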
