Regular Expression in Serialized Data - php

I am looking to do a database search on serialized data. I am currently using Symfony2 as my framework, making pdo_mysql calls using Doctrine 2. What I would like to do is create a query that uses REGEXP to find data within a certain part of the array. The data I am trying to search within looks like this: -
a:1:{s:8:"bedrooms";a:5:{i:0;i:1;i:1;i:2;i:2;i:3;i:3;i:4;i:4;s:2:"5+";}}
So let's say I am looking for a record that has 3 bedrooms, then I would want it to find: -
i:2;i:3
The query I have come up with so far is: -
SELECT * FROM table WHERE field_name REGEXP '.*"bedrooms"; a:[0-9]+:{i:[0-9]+;i:3;}.*';
However this doesn't work. Can someone help me find a fix around this please? I think it's down to the way the regular expression is written.
Also, it's worth noting that there are other arrays stored in the field, such as credit limits and other data.
Thank you in advance.

I believe you can do it with the help of a negated character class, [^{}], which matches any character except { and }:
.*"bedrooms";a:[0-9]+:[{][^{}]*i:[0-9]+;i:3[^{}]*[}]
See the regex demo
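As a rough check of the pattern above, here is a minimal PHP sketch that runs it through preg_match against data built with serialize(). The variable names and the second, stricter pattern are assumptions of this sketch and not part of the answer. The stricter variant consumes whole key/value pairs, so i:3 can only match in a value position; whether MySQL's REGEXP accepts it unchanged is untested here.

```php
<?php
// Pattern from the answer above, wrapped in PCRE delimiters.
$loose = '/"bedrooms";a:[0-9]+:[{][^{}]*i:[0-9]+;i:3[^{}]*[}]/';

// Stricter variant (an assumption of this sketch, not from the answer):
// step over complete "key;value;" pairs, then require a pair whose value is i:3.
$strict = '/"bedrooms";a:[0-9]+:[{](i:[0-9]+;(i:[0-9]+|s:[0-9]+:"[^"]*");)*i:[0-9]+;i:3;/';

// Same shape as the sample data in the question.
$withThree = serialize(['bedrooms' => [1, 2, 3, 4, '5+']]);

// No 3-bedroom entry, but the serialized text contains "i:4;i:3"
// (value 4 followed by key 3), which the loose pattern also accepts.
$withoutThree = serialize(['bedrooms' => [1, 2, 4, 5]]);

var_dump(preg_match($loose, $withThree));     // int(1)
var_dump(preg_match($loose, $withoutThree));  // int(1), a false positive
var_dump(preg_match($strict, $withThree));    // int(1)
var_dump(preg_match($strict, $withoutThree)); // int(0)
```

If the records can be loaded into PHP anyway, unserializing them and checking the array directly avoids this kind of key/value ambiguity altogether.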

I see at least two mistakes and possible improvements:
First, drop the blank space after "bedrooms"; in the regex.
You should escape the curly braces, like \{ and \}, since the regex engine can treat them as quantifier delimiters rather than literals.
If you are interested in a specific chunk of the string, you must specify it as a group and state what kind of characters surround it, like
"bedrooms";a:[0-9]+:\{.*(i:[0-9];i:3).*\}
In this case I'm looking for i:*;i:3, where * is any digit.

Related

Regular expression to match identical repeated digits in phone numbers

Sorry if the title wasn't descriptive enough. I have a bunch of phone numbers in a MySQL database. I don't know if there is a query to do this or if it's better to use something like preg_match with PHP. But I need to search using a pattern like so:
Ends with XXXX
or
Contains 4XXX
The X means the same number. So if I searched for Ends with XXXX, I'm looking for any number like so:
671-0000
421-5555
789-1111
If I search Contains 4XXX, then I'm looking for any number like so:
345-4111
156-4777
For some reason I can't wrap my brain around this. Seems like it would be pretty easy. Can anyone help? Appreciate it!
The simplest expression here for the NXXX pattern is:
\d(\d)\1{2}
If the regular expression engine you're using supports back-references like \1, which refers to whatever digit was captured by the (\d) group, then this should work.
You could also adapt that for the XXXX pattern like:
(\d)\1{3}
That matches one digit followed by three more copies of the same digit.
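To make the two patterns concrete, here is a small PHP sketch using preg_match. The function names are hypothetical and only for illustration. The "ends with" case adds a $ anchor, and the "contains 4XXX" case puts a literal leading digit in place of the first \d.

```php
<?php
// Hypothetical helpers; the names are illustrative only.

// True if the number ends with four identical digits (the "XXXX" case).
function endsWithFourSame(string $phone): bool
{
    // (\d) captures one digit, \1{3} requires three more copies of it,
    // and $ anchors the run at the end of the string.
    return preg_match('/(\d)\1{3}$/', $phone) === 1;
}

// True if the number contains $lead followed by three identical digits
// (for example $lead = 4 gives the "4XXX" case). $lead is assumed to be 0-9.
function containsLeadThenThreeSame(string $phone, int $lead): bool
{
    return preg_match('/' . $lead . '(\d)\1{2}/', $phone) === 1;
}

var_dump(endsWithFourSame('671-0000'));             // bool(true)
var_dump(endsWithFourSame('345-4111'));             // bool(false)
var_dump(containsLeadThenThreeSame('345-4111', 4)); // bool(true)
var_dump(containsLeadThenThreeSame('671-0000', 4)); // bool(false)
```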

Insert £ before number

This whole problem has come up because our data input people are useless. We have a form for adding items to a database, and one of the fields is a price. The format is something like lowest - highest (lowest without 10% fee - highest without 10% fee), e.g. 11 - 22 (10 - 20)
The problem is that the people adding this data are REALLY inconsistent with adding the pound sign, so some entries look like 11-£22(£10-20). My idea is that when I'm bringing back the data, I remove any £ signs in there and re-add them all, so they will all look the same.
I'm guessing to do this I will need some sort of RegEx to match something, but I'm not sure what the pattern would be.
Can anyone help me figure out what RegEx I'd need to use?
If your regex flavour supports lookarounds you could use the expression:
£?(?<!\d)(\d+)
and use the following as the replacement:
£\1
This should work fine in PHP.
You could also use this expression if you expect the price to contain commas and full stops:
£?(?<![0-9,.])(\d+)
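Here is a minimal PHP sketch of that lookbehind pattern applied with preg_replace. The function name is hypothetical, and the sketch assumes the source file and the input are UTF-8, which is why the u modifier is added: without it PCRE works byte by byte, and the ? would apply only to the last byte of the multi-byte £ sign.

```php
<?php
// Hypothetical helper; the name is illustrative only.
function normalizePrices(string $text): string
{
    // £? swallows an existing sign, (?<![0-9,.]) stops a match from starting
    // in the middle of a number, and the u modifier makes PCRE treat the
    // multi-byte £ as a single character.
    $result = preg_replace('~£?(?<![0-9,.])(\d+)~u', '£$1', $text);

    // preg_replace returns null on error (for example invalid UTF-8 input);
    // fall back to the untouched text in that case.
    return $result ?? $text;
}

echo normalizePrices('11-£22(£10-20)'), "\n";    // £11-£22(£10-£20)
echo normalizePrices('1,000.50 - £2,000'), "\n"; // £1,000.50 - £2,000
```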
A simpler solution would be to provide a drop down with a list of currency symbols. That way the addition of the symbol is obvious to the users.
You can still add an expression that strips all non-numeric characters while allowing a single dot and any number of commas.
You could also use JavaScript to restrict the entered characters, but provide server-side validation/modification anyway.
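You can simply do this (the u modifier matters when the script is saved as UTF-8: without it, the ? applies only to the last byte of the £ sign and bare numbers are not matched):
$result = preg_replace('~£?(\d+)~u', '£$1', '11-£22(£10-20)');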

Basic Regular Expression for

For some reason I always get stuck making anything past extremely basic regular expressions.
I'm trying to make a regular expression that kind of looks like a URL. I only want basic checking.
I would like it to match the following patterns where X is "something".
X://X.X
X://X.X... etc.
X.X
X.X... etc
If the string contains one of these patterns, it is sufficient checking for me. This way a url like www.example.com:8888 will still match. I have tried many different REGEX combinations with preg_match and cannot seem to get any to behave the way I want it to. I have consulted many other related REGEX questions on SO but my readings have not helped me.
Any help? I will be happy to provide more information if you would like but I don't know what else you would need.
It takes practice but here is one that I made using a regex tester (http://www.regextester.com/) to check my pattern:
^.+(:\/\/|\.)([a-zA-Z0-9]+\.)+.+
My approach is to slowly build my pattern from the beginning and add on one piece at a time. This cheatsheet is extremely helpful for remembering what everything is: http://www.cheatography.com/davechild/cheat-sheets/regular-expressions/
Basically the pattern starts at the beginning of the string and checks for any characters followed by either :// or . then checks for groupings of letters and numbers followed by a . ending with any number of characters.
The pattern could probably be improved with groupings to not pass on invalid characters. But this one was quick and dirty. You could replace the first and last . with the characters that would be valid.
UPDATE
Per the comments here is an updated pattern:
^.+?(:\/\/|\.)?([a-zA-Z0-9]+?\.)+.+
/^(.+:\/\/)?[^.]+\.[^.\/]+([.\/][^.\/]+)*$/
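A quick way to try the last pattern against the shapes listed in the question is a small preg_match wrapper. This is only a sketch: the function name is hypothetical, and the check is as loose as the question asks for, not real URL validation.

```php
<?php
// Hypothetical helper; the name is illustrative only.
function looksLikeUrl(string $candidate): bool
{
    // Optional "X://" prefix, then at least "X.X", then any number of
    // further segments introduced by "." or "/".
    $pattern = '/^(.+:\/\/)?[^.]+\.[^.\/]+([.\/][^.\/]+)*$/';
    return preg_match($pattern, $candidate) === 1;
}

var_dump(looksLikeUrl('http://example.com'));   // bool(true)
var_dump(looksLikeUrl('www.example.com:8888')); // bool(true)
var_dump(looksLikeUrl('a.b'));                  // bool(true)
var_dump(looksLikeUrl('example'));              // bool(false)
```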

Simple Regular Expression that detects strings that start with a z, but not zz

I'm sorry if this sounds amateur (and for many of you this will) but I'm in a rush, and I thought I'd leverage the wonderful brains of this community rather than attempting to create an expression that does not work.
I need to achieve the following in MySQL!
In a certain field, if the string starts with a z, pick it up in the SQL statement, but NOT if it's followed by another z. This only applies to the beginning (^) of the string. Case insensitive. So if the string is already "zz_fdfad" that should not be picked up, but anything like "zfda" should be picked up. Also, if the z is followed by anything other than an alphanumeric value, it should NOT be picked up (so I do NOT want results that are like "z_fdfdsa" or "z-fdfsdaa", or "z#fdfds".. you get the idea).
All this while keeping in mind I need an SQL statement for this in MySQL, and it will be processed in PHP.
Thank you so much!
In SQL
col Like 'z%' and not col like 'zz%'
It's even index friendly!
This will match anything that starts with a z and has a non-z alphanumeric character after it:
^z[A-Ya-y0-9]
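Since the results are processed in PHP, the same character class can be checked there as well. The sketch below uses a hypothetical function name and adds the i modifier so that an upper-case Z is accepted too, as the question asks for a case-insensitive match.

```php
<?php
// Hypothetical helper; the name is illustrative only.
function startsWithSingleZ(string $value): bool
{
    // ^z anchors a z at the start; [a-y0-9] requires the next character to be
    // alphanumeric but not another z; the i modifier ignores case.
    return preg_match('/^z[a-y0-9]/i', $value) === 1;
}

var_dump(startsWithSingleZ('zfda'));     // bool(true)
var_dump(startsWithSingleZ('Zebra'));    // bool(true)
var_dump(startsWithSingleZ('zz_fdfad')); // bool(false)
var_dump(startsWithSingleZ('z_fdfdsa')); // bool(false)
```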

Finding correct php regex for this complex element

I'm trying to get a regex which is able to find the following part in a string.
[TABLE|head,border|{
#TEXT|TEXT|TEXT#
TEXT|TEXT|TEXT
TEXT|TEXT|TEXT
TEXT|TEXT|TEXT
}]
It's from a simple self-made WYSIWYG editor, which makes it possible to add tables. But the "syntax" for a table should be as simple as the one above.
Now, as there can be many of these table definitions, I need to find all of them with PHP's preg_match_all to replace them with the well-known <table> tag in HTML.
The regex I am trying to use is the following:
/\[TABLE\|(.*)\|\{(.*)\}\]/si
The \x0A stands for a newline; as my app is running on Linux this is enough (works fine with simpler regex).
I use the online regex tester on functions-online.com.
The matches it gets are not really useful. And if I have more than one TABLE definition like the one above, then the matches are completely useless. Because of the (.*), it covers everything from "head,border" up to the very last "|" character in the second TABLE definition.
I would like to get a list of matches giving me the complete table command one by one.
Assuming your code works correctly for an input containing only a single table, this is because .* is a greedy match by default. Placing a question mark after each of the two .*'s makes them lazy, which should prevent greediness from being an issue.
/\[TABLE\|(.*?)\|\{(.*?)\}\]/si
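To see the lazy version pick up each table separately, here is a minimal PHP sketch with preg_match_all. The function name, the sample input, and the shape of the returned array are assumptions made for illustration. It only extracts the option list and the raw body, and leaves the conversion to <table> markup out.

```php
<?php
// Hypothetical helper; the name and the returned array shape are assumptions.
function extractTables(string $text): array
{
    $tables = [];
    // The lazy (.*?) groups stop at the first "|{" and the first "}]",
    // so each [TABLE|...|{...}] block becomes its own match.
    preg_match_all('/\[TABLE\|(.*?)\|\{(.*?)\}\]/si', $text, $matches, PREG_SET_ORDER);
    foreach ($matches as $m) {
        $tables[] = [
            'options' => explode(',', $m[1]), // e.g. ['head', 'border']
            'body'    => trim($m[2]),         // raw rows, still "|"-separated
        ];
    }
    return $tables;
}

$input = "[TABLE|head,border|{\n#A|B|C#\n1|2|3\n}]\ntext\n[TABLE|border|{\nx|y\n}]";
$tables = extractTables($input);

echo count($tables), "\n";                      // 2
echo implode(',', $tables[0]['options']), "\n"; // head,border
echo $tables[1]['body'], "\n";                  // x|y
```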
