RegEx for string replacement in MySQL Database - php

I recently upgraded my forum to a newer IPB version, but the quotes were not converted and IPS is not giving me the support I need.
I need to build a regular expression to find and replace them.
For example this is the old forum format:
<div class="quotetop">QUOTE(Cleber__v # Apr 14 2015, 12:25 PM)
<a href="index.php?act=findpost&pid=2778161"><{POST_SNAPBACK}>
</a></div><div class="quotemain"><!--quotec-->
TEXT TO BE KEPT
<!--QuoteEnd-->
</div><!--QuoteEEnd-->
And this is the new format it should be on:
`<div>
<blockquote class="ipsQuote"
data-cite="Em 14/04/2015, (Cleber__v disse:" data-ipsquote=""
data-ipsquote-timestamp="1428004301" data-ipsquote-userid="2350"
data-ipsquote-username="Cleber__v" data-ipsquote-contapp="forums"
data-ipsquote-contenttype="forums" data-ipsquote-contentclass="forums_Topic"
data-ipsquote-contentid="105179" data-ipsquote-contentcommentid="2768819">
TEXT TO BE INSERTED</blockquote></div><p><span>​</span></p>`
So I need to find the content, capture the username, post ID (or comment ID), time and text, and replace it with the correct format.
I've been researching regex for about a week now with no success on how to make this happen.
Could anybody help? Thank you

You should try for yourself and show us the regex you tried, but here is a first step for you:
.*?>QUOTE\((?P<name>.*)\ \#\ (?P<date>[^\)]+).*?pid\=(?P<pid>[0-9]*).*?\<\!\-\-quotec\-\-\>(?P<text>.*?)\<\!\-\-QuoteEnd\-\-\>
See here how it works : https://regex101.com/r/dM0eG3/1 and how the match information corresponds to your need.
NB : you must remove all new line characters before applying this regex, but this is fairly easy to do in PHP or in any language you might use to create your db upgrade script.
That will extract all the relevant information from your text. Replacing these in the new format is left as an exercise for the reader.
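For the replacement step, here is a rough sketch in PHP. It uses a tightened variant of the pattern above with the s flag (so newlines do not have to be stripped first) and preg_replace_callback(). The function name convert_quotes is made up for illustration, the forum date is assumed to be UTC, and only attributes that can be derived from the old markup are emitted; the user ID and topic ID are not present in the old quote and would need a database lookup.

```php
<?php
// Hypothetical helper: rewrites every old-style quote block found in $html.
function convert_quotes(string $html): string
{
    // Variant of the answer's pattern; "s" lets "." match newlines.
    $pattern = '~<div class="quotetop">QUOTE\((?P<name>[^)]*?) \# (?P<date>[^)]+)\)'
        . '.*?pid=(?P<pid>\d+).*?<!--quotec-->(?P<text>.*?)<!--QuoteEnd-->'
        . '\s*</div><!--QuoteEEnd-->~s';

    $result = preg_replace_callback($pattern, function (array $m): string {
        // Assumption: old dates look like "Apr 14 2015, 12:25 PM" and are UTC.
        $when = DateTime::createFromFormat(
            '!M j Y, g:i A',
            trim($m['date']),
            new DateTimeZone('UTC')
        );
        $timestamp = $when ? $when->getTimestamp() : 0;

        // Escape the name for use in an attribute without double-encoding.
        $name = htmlspecialchars($m['name'], ENT_QUOTES, 'UTF-8', false);

        return sprintf(
            '<div><blockquote class="ipsQuote" data-ipsquote="" '
            . 'data-ipsquote-timestamp="%d" data-ipsquote-username="%s" '
            . 'data-ipsquote-contentcommentid="%d">%s</blockquote></div>',
            $timestamp,
            $name,
            $m['pid'],
            trim($m['text'])
        );
    }, $html);

    // preg_replace_callback() returns null on a PCRE error; keep the input then.
    return $result ?? $html;
}

// Usage with the sample from the question.
$old = "<div class=\"quotetop\">QUOTE(Cleber__v # Apr 14 2015, 12:25 PM)\n"
    . "<a href=\"index.php?act=findpost&pid=2778161\"><{POST_SNAPBACK}>\n"
    . "</a></div><div class=\"quotemain\"><!--quotec-->\n"
    . "TEXT TO BE KEPT\n"
    . "<!--QuoteEnd-->\n"
    . "</div><!--QuoteEEnd-->";
$new = convert_quotes($old);
echo $new, "\n";
```

Run it against a copy of the posts table first; posts with nested quotes or a different date format will not match this pattern and are left unchanged.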

Related

PHP pdf form parse regex

I have two PDF forms that I'd like to fill in with values using PHP. There don't seem to be any open source solutions. The only solution seems to be SetaSign, which costs over $400. So instead I'm trying to dump the data as a string, parse it using a regex and then save it. This is what I have so far:
$pdf = file_get_contents("../forms/mypdf.pdf");
$decode = utf8_decode($pdf);
$re = "/(\d+)\s(?:0 obj <>\/AP<>\/)(.*)(?:>> endobj)/U";
preg_match_all($re, $decode, $matches);
print_r($matches);
However, my print_r is empty even after testing here. The matches on the right are first a numerical identifier for the field (I think) and then V(XX1) where "XX1" is the text I've manually entered into the form and saved (as a test to find how and where that data is stored). I'm assuming (but haven't tested) that N<>>>/AS/Off is a checkbox.
Is there something I need to change in my regex to find matches like (2811 0 obj <>/AP<>/V(XX2)>> endobj) where the first find will be a key and the second find is the value?
Part 1 - Extract text from PDF
Download the class.pdf2text.php # http://pastebin.com/dvwySU1a (Updated on 5 of April 2014) or http://www.phpclasses.org/browse/file/31030.html (Registration required)
Usage:
include('class.pdf2text.php');
$a = new PDF2Text();
$a->setFilename('test.pdf');
$a->decodePDF();
echo $a->output();
The class doesn't work with all the PDFs I've tested; give it a try and you may get lucky :)
Part 2 - Write to PDF
To write the pdf contents use tcpdf which is an enhanced and maintained version of fpdf.
Thanks to those who've looked into this. I decided to convert the PDFs (since I'm not doing this as a batch) into SVG files. This online converter kept the form fields, and with some small edits I've made them printable. Now I'll be able to populate the values and have a visual representation of the PDF. I may try tcpdf if I want to make it an actual PDF again, though I'm assuming it won't keep the form fields.

PHP Mysql CodeIgniter Converting characters to symbols in very bizarre circumstances

Application Built on CodeIgniter.
Has been running for over a year. No problems.
Client fills in a form about a customer.
A simple trim($_POST['notes']) captures textarea form field text and saves to MySQL
no error reported in PHP or JavaScript
The other day I noticed that some text the client had entered had the brackets "()" replaced with the equivalent entities "&#40;&#41;".
I think... "That's strange... I don't recall any reason why those characters would have been replaced like that.!"
I take a look ... and a day later... here is my madness revealed:
The text in question is verbatim "
Always run credit card on file (we do not charge this customer for pick-up or return)
"
No matter what I did or changed on the code side... I could not prevent the PHP... or JavaScript... or MySQL... or alien beings... - or whoever the heck is doing it - from converting the "()" in the text to "&#40;&#41;". And I tried many things, like cleaning the string in all ways known to man or god, and capturing the string just before saving it to the database. The conversion would always take place just before the save to MySQL. I tried posting in different forms and fields... Same thing every time... could not stop the magic conversion to "&#40;&#41;".
What in the name of batman is in this magical text that is causing this to happen?? is it magic pixie dust sprinkled on to godaddy server it is running on??? 0_o
.......
Being the genius that I am 0_0 I decide to remove one word from the paragraph at a time.
Magically... as all the creatures of the forest gathered around - as I finally got to the word "file" in the paragraph, and removed it !!! Like magic - the "()" stay as "()" and are NOT converted to "&#40;&#41;" ?!?!???!?!? :\ How come?? I simply removed the word "file" from the text... How could this change anything?? What is the word "file" causing to change with how the string is saved or converted??
OK - So I tested this out on any and every form field in the app. Every single time, in any field, if you type the word "file" followed by a "(" it will convert the first "(" to "&#40;" and the very next ")" to "&#41;".
So.. if the string is:
"file ( any number of characters or text ) any other text or characters"
On post, it will be converted mysteriously to:
"file &#40; any number of characters or text &#41; any other text or characters"
Remove the word "file" from the string, and you get:
"( any number of characters or text ) any other text or characters"
The alien beings return the abducted "()"
Anyone have a clue what the heck could be going on here?
What is causing this?
Is the word "file" a keyword that is tripping some sort of security measure? Interpreting it as "file()"???
I dunno :\
It's the strangest thing I ever saw... Except for that time I walked in on Mom and Dad 0_o
Any help would be greatly appreciated, and I will buy you a beer for sure :)
The very large headed, - (way too much power for such tender egos) -, Noo-Noos here at stack have paused this question as "Off topic" LOL... honest to God these guys are so silly.
So - in an effort to placate the stack-gestapo - I will attempt to edit this question so that it is... "on topic"??? 0_o ... anything for you oh so "King" Stack Guys O_O - too bad you would never have the wit to ever notice such a bug... maybe some day. ;)
Sample code:
<textarea name="notes">Always run credit card on file (we do not charge this customer for pick-up or return) blah blah</textarea>
<?php
if (isset($_POST['notes'])) {
    $this->db->where("ID = ".$_POST['ID']);
    $this->db->update('OWNER', $_POST['notes']);
}
?>
Resulting MySQL storage:
"Always run credit card on file &#40;we do not charge this customer for pick-up or return&#41; blah blah"
InnoDB - Type text utf8_general_ci
I am not looking for a way to prevent it, or clean it... I am clearly asking "What causes it"
/*
* Sanitize naughty scripting elements
*
* Similar to above, only instead of looking for
* tags it looks for PHP and JavaScript commands
* that are disallowed. Rather than removing the
* code, it simply converts the parenthesis to entities
* rendering the code un-executable.
*
* For example: eval('some code')
* Becomes: eval&#40;'some code'&#41;
*/
$str = preg_replace('#(alert|cmd|passthru|eval|exec|expression|system|fopen|fsockopen|file|file_get_contents|readfile|unlink)(\s*)\((.*?)\)#si', "\\1\\2&#40;\\3&#41;", $str);
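This is part of XSS Clean (system/core/Security.php). The word "file" is in its list of disallowed commands, so "file (...)" is treated like a call to file() and its parentheses are converted to entities.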
If you want the filter to run automatically every time it encounters POST or COOKIE data you can enable it by opening your application/config/config.php file and setting this:
$config['global_xss_filtering'] = TRUE;
https://www.codeigniter.com/user_guide/libraries/security.html
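The effect is easy to reproduce outside CodeIgniter. This is only a sketch that applies the same pattern to the sentence from the question, with the replacement string written out with the &#40; and &#41; entities; the variable names are mine.

```php
<?php
// Pattern copied from the xss_clean() excerpt: keyword, optional
// whitespace, then a parenthesised group.
$pattern = '#(alert|cmd|passthru|eval|exec|expression|system|fopen|fsockopen'
    . '|file|file_get_contents|readfile|unlink)(\s*)\((.*?)\)#si';

// Parentheses after a keyword are swapped for their numeric entities.
$replacement = '\\1\\2&#40;\\3&#41;';

$withFile = 'Always run credit card on file (we do not charge this customer for pick-up or return)';
$withoutFile = 'Always run credit card (we do not charge this customer for pick-up or return)';

$cleanWithFile = preg_replace($pattern, $replacement, $withFile);
$cleanWithoutFile = preg_replace($pattern, $replacement, $withoutFile);

echo $cleanWithFile, "\n";    // parentheses become &#40; and &#41;
echo $cleanWithoutFile, "\n"; // unchanged: no keyword directly before "("
```

Note that the pattern has no word boundaries, so any word ending in one of the keywords (for example "profile (") would presumably trigger the same conversion.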
try something like this
$this->db->set('OWNER', $_POST['notes'],FALSE);
$this->db->where('ID ', $_POST['ID']);
$this->db->update('table_name');
Man, I think the problem is in your server. If you're using WAMP, check whether something was installed or configured incorrectly. That's my idea, based on my experience with CodeIgniter. I hope you'll respond if you want some advice.
Use utf8 encoding to store these values.
To avoid injections use mysql_real_escape_string() (or prepared statements).
To protect from XSS use htmlspecialchars.
However, I'm not sure what the issue is in your case.
Maybe try using some other SQL keywords in the string and see whether the same thing happens.
Try replacing the &#40; and the &#41; with ( and ) using str_replace.
If you are storing &#40; and &#41; in your database, try replacing them on output; if not, try replacing them before input.
I'm not sure if this would work, but you could try inserting a slash in or before the word 'file':
fi\le ( any number of characters or text ) any other text or characters

PHP preg_replace markdown issue - detecting duplicates

In a project I am building I would like to use markdown as follows
*text* = <em>text</em>
**text** = <strong>text</strong>
***text*** = <strong><em>text</em></strong>
As those are the only three markdown formats I require, I would like to remain lightweight and avoid importing the entire PHP markdown library as that would introduce features I do not require and create issues.
So I have been trying to build some simple regex replaces. Using preg_replace I run:
'/(\*\*\*)(.*?)\1/' to '<strong><em>\2</em></strong>'
'/(\*\*)(.*?)\1/' to '<strong>\2</strong>'
'/(\*)(.*?)\1/' to '<em>\2</em>',
And this works great! em, bold, and the combo all work fine...
But if the user makes a mistake or enters too many stars, everything breaks.
i.e.
****hello**** = <strong><em><em>hello</em></strong></em>
*****hello***** = <strong><em><strong>hello</em></strong></strong>
******hello****** = <strong><em></em></strong>hello<strong><em></em></strong>
etc
When ideally it would create
****hello**** = *<strong><em>hello</em></strong>*
*****hello***** = **<strong><em>hello</em></strong>**
******hello****** = ***<strong><em>hello</em></strong>***
etc
Ignoring the un-required stars (so it would become clear to the user they made a mistake, and more importantly, the rendered HTML remains valid).
I presume there must be some way to modify my regex to do this, but I cannot for the life of me work it out, even after a whole day trying!
I would also be happy with the result of
******hello****** = <strong><em>hello</em></strong>
So please, can anybody help me?
Also please consider uneven stars. In this case the below scenario would be ideal.
***hello* = **<em>hello</em>
And the time when a star should be part of the body and not detected, such as if a user inputs:
'terms and conditions may apply*'
or
'I give the film 5* out of 10'
Many many thanks
Try a different capturing pattern (match anything except * one or more times):
'/(\*\*\*)([^*]+)\1/'
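Building on that idea, one possible sketch handles all three formats in a single pass, so that HTML produced by one rule is never rescanned by the next. The function name mini_markdown and the lookarounds (the text must not start or end with whitespace) are my own additions, not something from the question.

```php
<?php
// Hypothetical helper: converts *em*, **strong** and ***both*** in one pass.
function mini_markdown(string $text): string
{
    // Opening and closing tags, keyed by the number of stars.
    $tags = [
        1 => ['<em>', '</em>'],
        2 => ['<strong>', '</strong>'],
        3 => ['<strong><em>', '</em></strong>'],
    ];

    return preg_replace_callback(
        // 1-3 stars, text without stars that neither starts nor ends
        // with whitespace, then the same number of stars again.
        '/(\*{1,3})(?=\S)([^*]+)(?<=\S)\1/',
        function (array $m) use ($tags): string {
            [$open, $close] = $tags[strlen($m[1])];
            return $open . $m[2] . $close;
        },
        $text
    );
}

echo mini_markdown('****hello****'), "\n";
echo mini_markdown('***hello*'), "\n";
echo mini_markdown('I give the film 5* out of 10'), "\n";
```

Surplus stars stay in the output as literal characters, and a lone star followed by a space or the end of the string is left alone. Text that itself contains a star between the markers is not handled.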

yahoo weather Api

I would like to get the temperature value from Yahoo's weather API. I have found a tutorial, but it retrieves a different value. Could someone help me modify the tutorial so that it gets the temp value from Yahoo's weather RSS feed?
<yweather:condition text="Partly Cloudy" code="30" temp="3"
date="Mon, 09 Apr 2012 3:48 pm EEST" />
RSS feed: http://weather.yahooapis.com/forecastrss?w=566473&u=c
The tutorial I followed: http://css-tricks.com/using-weather-data-to-change-your-websites-apperance-through-php-and-css/
If someone has a better solution for getting the value, don't hesitate to say so. :)
This seems pretty straightforward. From the tutorial:
Since the only bit of information we care about is the yweather:condition element's text attribute, we're going to avoid creating an XML parsing object and use a short regular expression.
So, just look at the line with the regular expression:
$weather_class = format_result(
get_match( '/<yweather:condition text="(.*)"/isU', $data )
);
This is actually a bad regular expression because it assumes text will always be the first attribute (and that there'll always be that weird double-space). Here's a regular expression that will get the temp attribute regardless of where it falls:
/<yweather:condition\s+(?:.*\s)?temp="(.+)"/isU
Substitute that for the regular expression given to get_match() and you should be good to go.
Oh, and lest I be kicked off SO for not saying so: attempting to parse arbitrary XML with regular expressions is the path to madness.
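A quick check of that pattern against the element from the question, as a sketch: the $data string is pasted in by hand here rather than fetched from the feed, and preg_match() stands in for the tutorial's get_match() helper, whose code is not shown in this thread.

```php
<?php
// Sample element copied from the question (normally part of the RSS feed body).
$data = '<yweather:condition text="Partly Cloudy" code="30" temp="3" '
    . 'date="Mon, 09 Apr 2012 3:48 pm EEST" />';

// Pattern from the answer: the U flag makes every quantifier lazy,
// so (.+) stops at the first closing quote after temp=".
$pattern = '/<yweather:condition\s+(?:.*\s)?temp="(.+)"/isU';

$temp = null;
if (preg_match($pattern, $data, $m)) {
    $temp = (int) $m[1];
}

var_dump($temp);
```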

RegEx teaser

Let's say we have 2 php variables:
$name = 'caption';
$url = 'http://domain.com/photo.jpg';
The input string of '{#url,<img src="," alt="{#name}" />}' should return:
'<img src="http://domain.com/photo.jpg" alt="caption" />'
The {tag} takes up to 3 parameters: {#variable[,text_before][,text_after]}.
What regex would be needed to make this happen? The tricky part is that a {#..} tag is nested within another.
I think you've come across one of those situations where you shouldn't use regex.
much like this one.
Multi-Line group and search with Regex
It's for a CMS. Admins can add column fields, then add template code for how it will be displayed on the listing page. The {#tags} are used to output the dynamic column values. This template code:
<p>Link: {#name} - {#date}</p>
would create a listing page like:
Link: link one - 2 July 2008
Link: link two - 14 June 2008
Link: link three - 9 February 2007
...
I figured that people might want to use column values within others, hence the "alt" tag example from the first post. So using regex for this would be a bad idea?
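If a regex is used anyway, one way around the nesting is to expand only the innermost tags (those containing no further braces) and repeat until nothing is left to expand. The sketch below assumes that approach; render_tags, its pass limit, and the choice to replace unknown variables with an empty string are all my own assumptions.

```php
<?php
// Hypothetical helper: expands {#variable[,text_before][,text_after]} tags.
function render_tags(string $template, array $vars, int $maxPasses = 10): string
{
    // Matches only tags without nested braces, i.e. the innermost ones.
    // text_before may not contain a comma; text_after may.
    $innermost = '/\{#(\w+)(?:,([^,{}]*))?(?:,([^{}]*))?\}/';

    for ($pass = 0; $pass < $maxPasses; $pass++) {
        $template = preg_replace_callback(
            $innermost,
            function (array $m) use ($vars): string {
                // Assumption: an unknown variable renders as an empty string.
                $value = $vars[$m[1]] ?? '';
                return ($m[2] ?? '') . $value . ($m[3] ?? '');
            },
            $template,
            -1,
            $count
        );
        if ($count === 0) {
            break; // nothing left to expand
        }
    }

    return $template;
}

$vars = ['name' => 'caption', 'url' => 'http://domain.com/photo.jpg'];
echo render_tags('{#url,<img src="," alt="{#name}" />}', $vars), "\n";
echo render_tags('<p>Link: {#name}</p>', $vars), "\n";
```

This breaks down as soon as a value or a text parameter contains braces, or a nested tag inside text_before expands to something with a comma, which is a fair argument for a small hand-written parser instead.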
