delete all bug string between two words - php

I've got a script which generates text. I need to strip all repeated blocks of text. The string is in XML format, so I can use the beginning and ending tags to determine where the strings are. I've been using substr_replace to remove the unnecessary text. However, this only works if I know how many times said text is going to be present in the string. Example:
<container>
<string1>This is the first string.</string1>
<string2>This is the second string.</string2>
<stuff>This is the important stuff.</stuff>
</container>
That container might appear once, twice, six times, seven times, whatever. The point is, it's necessary to only have it appear once in the string variable. Right now this is what I'm doing.
$where_begin = strpos($wsman_output,'<container');
$where_end = strpos($wsman_output,"</container>");
$end_length = strlen("</container>"); // length of the closing tag being searched for
$attack = $where_end - $where_begin;
$attack = $attack + $end_length;
$wsman_output = substr_replace($wsman_output,"",$where_begin,$attack);
And I do that each time the container exists. However, I just found out that the number of repetitions is not always going to be the same, which really messes things up.
Any ideas?

In the end I decided to use the method suggested here.
I pulled each block of string I wanted from the variable, then combined them back together in the required order.
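A different route, sketched here under assumptions rather than taken from the method I used: let a regular expression find every block and keep only the first one. The helper name keepFirstBlock is hypothetical, and the pattern assumes the blocks are not nested inside each other and that the tag may carry attributes.

```php
<?php
// Hypothetical helper: keep the first <tag>...</tag> block, drop later repeats.
// Assumes blocks of this tag are never nested inside one another.
function keepFirstBlock(string $xml, string $tag): string
{
    $quoted  = preg_quote($tag, '~');
    // Non-greedy match from the opening tag (attributes allowed) to the closing tag.
    $pattern = '~<' . $quoted . '\b[^>]*>.*?</' . $quoted . '>\s*~s';
    $seen    = 0;

    return preg_replace_callback(
        $pattern,
        function (array $m) use (&$seen): string {
            // The first match is returned untouched, every later one is removed.
            return $seen++ === 0 ? $m[0] : '';
        },
        $xml
    );
}

$input  = '<a><container id="1">x</container><container>y</container></a>';
$output = keepFirstBlock($input, 'container');
echo $output, "\n";
```

This works no matter how many times the block repeats, which is the part the strpos arithmetic could not handle. For anything more complex than flat repeated blocks, an XML parser is the safer tool.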

Related

how to decode php base64_decode

I'm a newbie starting to learn from source code. I bought a script on the internet that was supposed to come with full source code, but it turns out there is a part that is hidden. How do I decrypt/decode lines like this:
<?php
$keystroke1 = base64_decode("d2RyMTU5c3E0YXllejd4Y2duZl90djhubHVrNmpoYmlvMzJtcA==");
eval(gzinflate(base64_decode('hY5NCsIwEIWv8ixdZDCKWZcuPUfRdqrBmsBkAkrp3aVIi3Tj9v1+vje7PodWfQwNv3zSZAqJyqGNHRdE4+JiVU2ZVHy42fLyjDkoYUT54DdqpHxNKmsAJwtHFXxvksrAYXGort1cE9YsAe1dTJTOzCuEPZbhChN4SPw/iePMd/7ybSmcxeb+4Mj+vkzTBw==')));
$O0O0O0O0O0O0=$keystroke1[2].$keystroke1[32].$keystroke1[20].$keystroke1[11].$keystroke1[23].$keystroke1[15].$keystroke1[32].$keystroke1[1].$keystroke1[11];
$keystroke2 = $O0O0O0O0O0O0("xes26:tr5bzf{8ydhog`uw9omvl7kicjp43nq", -1);
$OO000OO000OO=$keystroke2[16].$keystroke2[12].$keystroke2[31].$keystroke2[23].$keystroke2[18].$keystroke2[24].$keystroke2[9].$keystroke2[20].$keystroke2[11];
$O0000000000O=$keystroke1[30].$keystroke1[9].$keystroke1[6].$keystroke1[11].$keystroke1[27].$keystroke1[8].$keystroke1[19].$keystroke1[1].$keystroke1[11].$keystroke1[15].$keystroke1[32].$keystroke1[1].$keystroke1[11];
eval($OO000OO000OO(base64_decode('LcTLsm
tKAADQn7lVZ+8yoBtB3ZH3OyEEMbnl0SLxTJrQvv
5M7hos9C36n38uF4Zh/u+nLDA6cf/VqJpq9PPHq2
IHD+dQlrVwpIa3BPicV2atbjLVsx+to7il1297dn
c+9PeDJGOoGn0MJUJnSqiJwrGcK5/bG2iiJtUoOk
3GKbHYjjzd5yLu3q2dPpWSFjDVTKWSS6MFsF6MU5
dsbJn7qHRxhGo0MNuluk29F3iwyAx/cYO+OfPWi1
ECDkWG1NsMLuAcM3F98vtMsubbvQjf1ZpVMUP5Eh
puFNzCi/CYkoM1VgsAetzjpvEe1M2AlX4YFjQZF0
A0VBRQKS0B5mcI7na2N/nER993+qocgmh9WawUrU
YhBMUiPNpuXNQy2o7VxHvhyO3nZkcWTmQu5kV1C2
ECbZiH8XsL4QuYbf7lI4SF1gDM/vVqRz4qyj7a8b
qS1nXP79731t4O0qcDaqN97BHDzlPwTEF6H7p9a3
Zu1Ut6X5GNTgZhWe3dHa+6yzJ58MX1Pc8mwAWK4v
EVLjGolQQLieOvkn4jD4d0FMQuLYvXhaxbzJyLR2
OHDKhMu2EwHthDt+I7YwOvVUydwEnCigk/n4iQei
SzwWNKicdunzmrVoOWl9gt8lhK+WzNpbPqkHEK7i
xBHT84UAbkHpity8i9eLUUulASI5d7cfpGWF6I4l
7tYBeJmYzXycA3FbbrSb+yNgd8XM5u7wU0mL8tVP
hJ2J/nu2QLr/OgzZrmp7xvKmpZCgHU7w0RlS1PT9
4JvxXtekif9dDGvBxSQjcwj2i32C7Abbcosvey5I
iq2hW7mjn/lUS6OUQ64Kw/v7+///4F')));
?>
Is code like this dangerous?
You are looking at a piece of obfuscated code. I will explain it line by line, but first let's go over the functions that are used:
base64_decode()
This function decodes a base64 encoded string. It's used here to unscramble intentionally scrambled code.
gzinflate()
This function decompresses a compressed string. It's used the same way as base64_decode().
eval()
This function executes a string as code. Its use is discouraged and is in itself a bit of a red flag, though it has legitimate uses.
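If you want to look at what such a line would run without actually running it, the usual trick is to swap eval() for echo. The sketch below builds its own harmless payload so that it is self-contained; the variable names are made up for illustration.

```php
<?php
// Build a harmless payload the same way an obfuscator would:
// compress the source, then base64-encode it.
$payload = base64_encode(gzdeflate('$cnk = array(\'localhost\');'));

// To inspect it, undo the two steps in reverse order and PRINT the result
// instead of passing it to eval().
$source = gzinflate(base64_decode($payload));
echo $source, "\n";
```

Applied to the snippet in the question, that means replacing each eval( with echo( on a copy of the file and reading what comes out.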
$keystroke1 = base64_decode("d2RyMTU5c3E0YXllejd4Y2duZl90djhubHVrNmpoYmlvMzJtcA==");
This line creates an apparently random string of characters: wdr159sq4ayez7xcgnf_tv8nluk6jhbio32mp
This string is saved to a variable, $keystroke1. The string itself is not important, other than that it contains some letters that are used later.
eval(gzinflate(base64_decode('hY5NCsIwEIWv8ixdZDCKWZcuPUfRdqrBmsBkAkrp3aVIi3Tj9v1+vje7PodWfQwNv3zSZAqJyqGNHRdE4+JiVU2ZVHy42fLyjDkoYUT54DdqpHxNKmsAJwtHFXxvksrAYXGort1cE9YsAe1dTJTOzCuEPZbhChN4SPw/iePMd/7ybSmcxeb+4Mj+vkzTBw==')));
This line unscrambles a doubly scrambled string and then runs this resulting code:
if(!function_exists("rotencode")){function rotencode($string,$amount) { $key = substr($string, 0, 1); if(strlen($string)==1) { return chr(ord($key) + $amount); } else { return chr(ord($key) + $amount) . rotEncode(substr($string, 1, strlen($string)-1), $amount); }}}
This creates a new function called rotencode(), which is yet another way of unscrambling strings.
$O0O0O0O0O0O0=$keystroke1[2].$keystroke1[32].$keystroke1[20].$keystroke1[11].$keystroke1[23].$keystroke1[15].$keystroke1[32].$keystroke1[1].$keystroke1[11];
This line takes specific characters from that random string from earlier to create the word "rotencode" as a string, stored in the variable named $O0O0O0O0O0O0.
$keystroke2 = $O0O0O0O0O0O0("xes26:tr5bzf{8ydhog`uw9omvl7kicjp43nq", -1);
This line uses the rotencode() function to unscramble yet another string (actually exactly the same string as before, for some reason).
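To see the shift for yourself, here is a small iterative stand-in for rotencode(). The name shiftChars is mine, not the seller's; it does the same thing as the recursive version above, moving every character by the given amount.

```php
<?php
// Iterative equivalent of the recursive rotencode(): shift each byte by $amount.
function shiftChars(string $s, int $amount): string
{
    $out = '';
    for ($i = 0; $i < strlen($s); $i++) {
        $out .= chr(ord($s[$i]) + $amount);
    }
    return $out;
}

$decoded = shiftChars('xes26:tr5bzf{8ydhog`uw9omvl7kicjp43nq', -1);
echo $decoded, "\n"; // wdr159sq4ayez7xcgnf_tv8nluk6jhbio32mp
```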
$OO000OO000OO=$keystroke2[16].$keystroke2[12].$keystroke2[31].$keystroke2[23].$keystroke2[18].$keystroke2[24].$keystroke2[9].$keystroke2[20].$keystroke2[11];
$O0000000000O=$keystroke1[30].$keystroke1[9].$keystroke1[6].$keystroke1[11].$keystroke1[27].$keystroke1[8].$keystroke1[19].$keystroke1[1].$keystroke1[11].$keystroke1[15].$keystroke1[32].$keystroke1[1].$keystroke1[11];
On these lines the two (identical but separate) random strings are used to create the words gzinflate and base64_decode. This is done so the coder can use these functions without it being apparent that that's what is happening. However, base64_decode() is never used this way in the snippet you posted. That might suggest that it is used later in the code in places you haven't seen or recognized yet. Searching your code for "$O0000000000O" might yield other uses.
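The trick of picking letters out of a key string and then calling the result as a function can be reproduced in a few lines. This is only an illustration using the decoded key from above; $key, $inflate and $decode are names I chose.

```php
<?php
// The decoded key string from the first line of the obfuscated file.
$key = 'wdr159sq4ayez7xcgnf_tv8nluk6jhbio32mp';

// Same indexes as in the question: they spell out two function names.
$inflate = $key[16] . $key[12] . $key[31] . $key[23] . $key[18]
         . $key[24] . $key[9] . $key[20] . $key[11];
$decode  = $key[30] . $key[9] . $key[6] . $key[11] . $key[27] . $key[8] . $key[19]
         . $key[1] . $key[11] . $key[15] . $key[32] . $key[1] . $key[11];

echo $inflate, ' / ', $decode, "\n"; // gzinflate / base64_decode

// PHP lets you call a function through a string variable holding its name.
$roundTrip = $inflate($decode(base64_encode(gzdeflate('hello'))));
echo $roundTrip, "\n"; // hello
```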
eval($OO000OO000OO(base64_decode('LcTLsmtKAADQn7lVZ+8yoBtB3ZH3OyEEMbnl0SLxTJrQvv5M7hos9C36n38uF4Zh/u+nLDA6cf/VqJpq9PPHq2IHD+dQlrVwpIa3BPicV2atbjLVsx+to7il1297dnc+9PeDJGOoGn0MJUJnSqiJwrGcK5/bG2iiJtUoOk3GKbHYjjzd5yLu3q2dPpWSFjDVTKWSS6MFsF6MU5dsbJn7qHRxhGo0MNuluk29F3iwyAx/cYO+OfPWi1ECDkWG1NsMLuAcM3F98vtMsubbvQjf1ZpVMUP5EhpuFNzCi/CYkoM1VgsAetzjpvEe1M2AlX4YFjQZF0A0VBRQKS0B5mcI7na2N/nER993+qocgmh9WawUrUYhBMUiPNpuXNQy2o7VxHvhyO3nZkcWTmQu5kV1C2ECbZiH8XsL4QuYbf7lI4SF1gDM/vVqRz4qyj7a8bqS1nXP79731t4O0qcDaqN97BHDzlPwTEF6H7p9a3Zu1Ut6X5GNTgZhWe3dHa+6yzJ58MX1Pc8mwAWK4vEVLjGolQQLieOvkn4jD4d0FMQuLYvXhaxbzJyLR2OHDKhMu2EwHthDt+I7YwOvVUydwEnCigk/n4iQeiSzwWNKicdunzmrVoOWl9gt8lhK+WzNpbPqkHEK7ixBHT84UAbkHpity8i9eLUUulASI5d7cfpGWF6I4l7tYBeJmYzXycA3FbbrSb+yNgd8XM5u7wU0mL8tVPhJ2J/nu2QLr/OgzZrmp7xvKmpZCgHU7w0RlS1PT94JvxXtekif9dDGvBxSQjcwj2i32C7Abbcosvey5Iiq2hW7mjn/lUS6OUQ64Kw/v7+///4F')));
This is where it all comes together. This line unscrambles a line of code which has been compressed and encoded 10 times over. The final result is this:
$cnk = array('localhost');
That's it. It sets the string "localhost" as the sole element of an array and saves it in a variable named $cnk.
In and of itself, there's nothing hazardous about running this code, but noting the lengths that the coder went to in order to hide this line, it's probably a safe bet that it wasn't placed there to help you - the buyer - in any way. Search your code for the $cnk variable if you want to know exactly what's being done. Or better yet, chalk this experience up to a loss and find a better way to learn coding. There are plenty of books, video tutorials and free resources online. Do not place your trust in whoever sold you this code. While they may not have been malicious (people suggested in comments that this might be part of a license check), anyone who includes something like this in their code is not someone you should be learning from.
Good luck on your coding journey!

PHP variables look the same but are not equal (I'm confused)

OK, so I shave my head, but if I had hair I wouldn't need a razor because I'd have torn it all out tonight. It's gone 3am and what looked like a simple solution at 00:30 has become far from it.
Please see the code extract below..
$psusername = substr($list[$count],16);
if ($psusername == $psu_value){
$answer = "YES";
}
else {
$answer = "NO";
}
$psusername holds the value "normann" which is taken from a URL in a text based file (url.db)
$psu_value also holds the value "normann" which is retrieved from a cookie set on the user's computer (or a parameter in the browser address bar - URL).
However, and I'm sure you can guess my problem, the variable $answer contains "NO" from the test above.
All the PHP I know I've picked up from Google searches and you guys here, so I'm no expert, which is perhaps evident.
Maybe this is a schoolboy error, but I cannot figure out what I'm doing wrong. My assumption is that the data types differ. Ultimately, I want to compare the two variables and have a TRUE result when they contain the same information (i.e normann = normann).
So if you very clever fellows can point out why two variables echo what appears to be the same information but are in fact different, it'd be a very useful lesson for me and make my users very happy.
Do they echo the same thing when you do:
echo gettype($psusername) . "\n" . gettype($psu_value);
Since I can't see what data is stored in the array $list (and the index $count), I cannot suggest a full solution to your problem.
But I can suggest that you insert this code right before the if statement:
var_dump($psusername);
var_dump($psu_value);
and see why the two variables are not identical.
The var_dump function will output the content stored in the variable along with its type (string, integer, array, etc.) and, for strings, the length, so you will be able to figure out why the if statement is returning false.
Since it looks like you have non-printable characters in your string, you can strip them out before the comparison. This will remove whatever is not printable in your character set:
$psusername = preg_replace("/[[:^print:]]/", "", $psusername);
0D 0A is a new line. The first is the carriage return (CR) character and the second is the line feed (LF) character, also called new line (NL). They are also known as \r and \n.
You can just trim it off using trim().
$psusername = trim($psusername);
Or if it only occurs at the end of the string then rtrim() would do the job:
$psusername = rtrim($psusername);
If you are getting the values from the file using file() then you can pass FILE_IGNORE_NEW_LINES as the second argument, and that will remove the new line:
$contents = file('url.db', FILE_IGNORE_NEW_LINES);
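A quick way to confirm this is what is happening, as a sketch with made-up values standing in for the line from url.db and the cookie:

```php
<?php
// Hypothetical stand-ins: a line as read from the file, and the cookie value.
$fromFile   = "normann\r\n";
$fromCookie = 'normann';

// Looks identical when echoed in a browser, but the comparison fails.
$rawEqual = ($fromFile == $fromCookie);
var_dump($rawEqual); // bool(false)

// bin2hex() makes the invisible trailing bytes visible (0d 0a at the end).
$hex = bin2hex($fromFile);
echo $hex, "\n"; // 6e6f726d616e6e0d0a

// After stripping the line ending, the two values match.
$trimmedEqual = (rtrim($fromFile, "\r\n") == $fromCookie);
var_dump($trimmedEqual); // bool(true)
```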
I just want to thank all who responded. After viewing the outputs in HEX format in my logfile, I realised that it was the carriage return values causing the variables to mismatch and, as I mentioned, I was able to resolve it (trim them) with the following code:
$psusername = preg_replace("/[^[:alnum:]]/u", '', $psusername);
I also know that the system within which the profiles and usernames are created allow both upper and lower case values to match, so I took the precaution of building that functionality into my code as an added measure of completeness.
And I'm happy to say, the code functions perfectly now.
Once again, thanks for your responses and suggestions.
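For anyone who needs the same case-insensitive match, one possible way to do it is sketched below. This is not the exact code from my script; sameUsername is a hypothetical helper, and it assumes plain ASCII usernames.

```php
<?php
// Hypothetical helper: compare two usernames ignoring case and
// surrounding whitespace (including stray \r\n). Assumes ASCII names.
function sameUsername(string $a, string $b): bool
{
    return strcasecmp(trim($a), trim($b)) === 0;
}

var_dump(sameUsername("Normann\r\n", 'normann')); // bool(true)
var_dump(sameUsername('normann', 'norman'));      // bool(false)
```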

How to easily generate debugging statements for PHP code?

I need to be able to generate debugging statements for my code. For example, here is some code I have:
$this->R->radius_ft = $this->TC->diameter / 24;
$this->R->TBETA2_rad = $this->D->beta2 / $rad; //Outer angle
$this->R->TBETA1_rad = $this->R->inner_beta1 / $rad; //Inner angle
I need to be able to display the results of computations so that they can be read by a human.
So far I have been doing this (example showing first line from above only):
$this->R->radius_ft = $this->TC->diameter / 24;
if (self::DEBUG)
print("radius_ft({$this->R->radius_ft}) = diameter({$this->TC->diameter}) / 24");
The above prints something like radius_ft(1.4583) = diameter(35) / 24. A few of those lines look like equations and are nicely traceable when I want to verify things on paper, or if I want to expose the intermediate work of the computations to someone else.
The problem is that it is a pain to construct those debugging statements. I craft them by hand, and usually it is not a problem, but in my current example, there are hundreds of lines of code where this needs to be done. Much pain.
I was wondering if there are facilities in PHP that will allow me to make print-outs of statements showing what each line of code does. Or methods to semi-automate creating the debug lines for me.
I have so far discovered this method to cut down on some of the work .... use Macro facilities of a text editor.
Paste line of code into TextPad (or similar editor that supports macros). Record macro and use Search, Mark and Copy facilities to carefully navigate between special symbols of the variable, such as $, >, and symbols that are not alphanumeric or $, >, etc. while copying and extracting and pasting parts of variable to craft my particular statement.
Exact steps may differ for one's needs. My macro operates on one variable like $this->R->radius_ft with cursor at the start and ends up with something like radius_ft({$this->R->radius_ft}), with cursor a few chars after the end, sometimes coinciding with the next variable to process.
Perhaps same could be done with regular expressions but I like the way Macro does it - I can process a variable and go to the next one and just repeat the macro with a hot key combination. This takes out the most tedious chunk of work for me.
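For comparison, here is roughly what the regular-expression route might look like. It is only a sketch: makeDebugLine is a name I made up, it takes a line of source code as text and returns the matching debug statement as text, and it assumes the line contains no string literals or double quotes.

```php
<?php
// Hypothetical generator: turn one line of source code into a debug print line.
// Assumes the line has no string literals (a "//" or quote inside one would break it).
function makeDebugLine(string $sourceLine): string
{
    // Drop a trailing // comment and the final semicolon.
    $expr = preg_replace('~\s*//.*$~', '', trim($sourceLine));
    $expr = rtrim($expr, "; \t");

    // Replace each variable such as $this->R->radius_ft
    // with radius_ft({$this->R->radius_ft}).
    $labelled = preg_replace_callback(
        '/\$\w+(?:->\w+)*/',
        function (array $m): string {
            $parts = explode('->', $m[0]);
            $name  = ltrim(end($parts), '$');
            return $name . '({' . $m[0] . '})';
        },
        $expr
    );

    return 'if (self::DEBUG) print("' . $labelled . '");';
}

echo makeDebugLine('$this->R->radius_ft = $this->TC->diameter / 24;'), "\n";
```

Feeding a whole file through it line by line would produce the statements in bulk, which could then be pasted under the lines they describe.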
Alternatively - hand the person the code and let them figure it out. Teach them how to read code.

PHP preg_replace markdown issue - detecting duplicates

In a project I am building I would like to use markdown as follows
*text* = <em>text</em>
**text** = <strong>text</strong>
***text*** = <strong><em>text</em></strong>
As those are the only three markdown formats I require, I would like to remain lightweight and avoid importing the entire PHP markdown library as that would introduce features I do not require and create issues.
So I have been trying to build some simple regex replaces. Using preg_replace I run:
'/(\*\*\*)(.*?)\1/' to '<strong><em>\2</em></strong>'
'/(\*\*)(.*?)\1/' to '<strong>\2</strong>'
'/(\*)(.*?)\1/' to '<em>\2</em>',
And this works great! em, bold, and the combo all work fine...
But if the user makes a mistake or enters too many stars, everything breaks.
i.e.
****hello**** = <strong><em><em>hello</em></strong></em>
*****hello***** = <strong><em><strong>hello</em></strong></strong>
******hello****** = <strong><em></em></strong>hello<strong><em></em></strong>
etc
When ideally it would create
****hello**** = *<strong><em>hello</em></strong>*
*****hello***** = **<strong><em>hello</em></strong>**
******hello****** = ***<strong><em>hello</em></strong>***
etc
Ignoring the un-required stars (so it would become clear to the user they made a mistake, and more importantly, the rendered HTML remains valid).
I presume there must be some way to modify my regex to do this but I cannot for the life of me work it out, even after a whole day trying!
I would also be happy with the result of
******hello****** = <strong><em>hello</em></strong>
So please, can anybody help me?
Also please consider uneven stars. In this case the below scenario would be ideal.
***hello* = **<em>hello</em>
And the time when a star should be part of the body and not detected, such as if a user inputs:
'terms and conditions may apply*'
or
'I give the film 5* out of 10'
Many many thanks
Try a different capturing pattern (match anything except * one or more times),
'/(\*\*\*)([^*]+)\1/'
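Building on that idea, one possible way to handle all three cases is a single pass with preg_replace_callback, so that text which has already been converted is never scanned again. This goes a step beyond the pattern above and is only a sketch; renderStars is a hypothetical name.

```php
<?php
// Hypothetical helper: convert *x*, **x** and ***x*** in one pass.
// The body may not contain stars, and the closing run must equal the opening run.
function renderStars(string $text): string
{
    return preg_replace_callback(
        '/(\*{1,3})([^*]+)\1/',
        function (array $m): string {
            switch (strlen($m[1])) {
                case 3:
                    return '<strong><em>' . $m[2] . '</em></strong>';
                case 2:
                    return '<strong>' . $m[2] . '</strong>';
                default:
                    return '<em>' . $m[2] . '</em>';
            }
        },
        $text
    );
}

echo renderStars('****hello****'), "\n"; // *<strong><em>hello</em></strong>*
echo renderStars('***hello*'), "\n";     // **<em>hello</em>
echo renderStars('terms and conditions may apply*'), "\n"; // unchanged
```

Extra stars are left in the output as literal characters, and a lone star is ignored. Two separate lone stars in the same text (for example two footnote markers) would still be treated as a pair, so that case needs extra rules if it matters.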

character-based pagination - inserting page breaks on text, not punctuation or code

I'm writing code to generate character-based pagination. I have articles in my site that I want to split up based on length.
The code I have so far is working, albeit with two issues:
It's splitting pages in the middle of words and HTML tags; I want it to only split after a complete word, tag, or a punctuation mark.
In the pagination bar, it's generating the wrong number of pages.
Need help addressing these two issues. Code follows:
$text = file_get_contents($View);
$ArticleLength = strlen($text);
$CharsPerPage = 5000;
$NoOfPages = round((double)$ArticleLength / (double)$CharsPerPage);
$CurrentPage = $this->ReturnNeededObject('pagenumber');
$Page = (isset($CurrentPage) && '' !== $CurrentPage) ? $CurrentPage : '1';
$PageText = substr($text, $CharsPerPage*($Page-1), $CharsPerPage);
echo $PageText, '<p>';
for ($i=1; $i<$NoOfPages+1; $i++)
{
if ($i == $CurrentPage)
{
echo '<strong>', $i, '</strong>';
}
else
{
echo '', $i, '';
}
echo ' | ';
}
echo '</p>';
What am I doing wrong?
Thanks, guys. I put in the fix for the 1st point and it worked beautifully.
Hm. I guess it is messy to do the second point. I've found some regex on-line. Will think, write, and get back to you when I make some progress.
Thanks again.
$NoOfPages = round((double)$ArticleLength / (double)$CharsPerPage);
That should use ceil instead of round - if you use round, 4.2 pages will only show 1-4 - you need a 5th page to show the last .2 of a page.
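A quick illustration of the difference, using a made-up article length:

```php
<?php
$articleLength = 21000; // hypothetical article size in characters
$charsPerPage  = 5000;  // 21000 / 5000 = 4.2 pages

$pagesRounded = (int) round($articleLength / $charsPerPage);
$pagesCeiled  = (int) ceil($articleLength / $charsPerPage);

echo "round: $pagesRounded, ceil: $pagesCeiled\n"; // round: 4, ceil: 5
```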
The other part is harder ... it's common to use some sort of marker in the file to indicate where the page breaks go, as no matter how clever your code, it can't appreciate where a good break is in the same way a human can.
If you insist on doing it in code, I'd suggest some logic that first works forwards/backwards to the nearest space when a page break is created, which isn't too tricky. More tricky is deciding when you are within a tag or not .... think you'll either need some fairly heavy regex, or else an HTML parsing tool.
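The "work backwards to the nearest space" part could look something like this. It is a sketch for plain text only (it knows nothing about tags), the function name is made up, and it assumes $charsPerPage is a positive number.

```php
<?php
// Hypothetical helper: split plain text into pages of at most $charsPerPage
// characters, backing up to the last space so words are never cut in half.
// Does NOT understand HTML tags. Assumes $charsPerPage > 0.
function splitAtWordBoundaries(string $text, int $charsPerPage): array
{
    $pages  = [];
    $offset = 0;
    $length = strlen($text);

    while ($offset < $length) {
        $chunk = substr($text, $offset, $charsPerPage);

        // Only look for a break point if more text follows this chunk.
        if ($offset + $charsPerPage < $length) {
            $lastSpace = strrpos($chunk, ' ');
            if ($lastSpace !== false && $lastSpace > 0) {
                $chunk = substr($chunk, 0, $lastSpace);
            }
        }

        $pages[] = $chunk;
        $offset += strlen($chunk);

        // Skip the space(s) the page was broken on.
        while ($offset < $length && $text[$offset] === ' ') {
            $offset++;
        }
    }

    return $pages;
}

$pages = splitAtWordBoundaries('the quick brown fox', 10);
print_r($pages); // two pages: "the quick" and "brown fox"
```

A side benefit is that count($pages) gives the real number of pages for the pagination bar, so it no longer has to be estimated from the article length.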
You're calculating the number of pages wrong... you should be using ceil(), not round() (for example, 4.1 pages' worth of text still needs 5 pages to display).
To fix the other issue, you're going to have big problems if there's arbitrary HTML in there. For example, you need to know that <div>s and <p>s are OK to split, but <table>s aren't (unless you want to get really fancy)!
To do it properly you should use an HTML library to build a tree of elements and then go from there.
Based on your first statement,
It's splitting pages in the middle of words and HTML tags
it appears that your character count is being done after markup is inserted. That would imply that e.g. long URLs in links would be counted against the page length you're trying to achieve. However, you didn't say how the articles were being created initially.
I'd suggest looking for a point in the process of creating the article where you could examine the raw text. By regarding the actual content (without markup) as a set of paragraphs, and estimating the vertical length of each paragraph based on typical number of characters per line, you can come up with a more consistent sizing.
I would also consider only breaking between paragraphs, to keep units of thought together on the same page. Speaking as a reader, I really hate going to sites that force me to pause, hit a button or link, and wait for a page reload, all in the middle of a single thought.
