Extract an 8 character integer string from an ical file - php

This code, which finds every run of 8 digits, works fine:
preg_match_all('/[0-9]{8}/', $string, $match);
However, I am only interested in matches that start with 20.
I know I have to add ^20 somewhere, but I have tried many times with no success. I have looked at many regex tutorials, but none of them seems to explain how to do two separate searches.
I am actually trying to parse iCal files to extract the dates. If an 8-digit number starts with 20, it is almost certainly a date.
For example: DTSTART:20150112T120000Z

How about this solution:
/(20)\d{6}/

This will probably find what you are looking for:
(?=20)(\d{8})
It uses a positive lookahead to assert that the match starts with 20, then captures an 8-digit number.
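A minimal sketch of how these patterns behave in PHP. The second line of the sample string is a hypothetical value added to show a false positive, and the stricter pattern with `(?<!\d)` and `(?!\d)` is an assumption of mine, not something taken from the answers above.

```php
<?php
// First line is the example from the question; the UID line is made up.
$string = "DTSTART:20150112T120000Z\nUID:9920160203555";

// Pattern from the answers: "20" followed by six more digits, anywhere.
preg_match_all('/20\d{6}/', $string, $loose);

// Stricter variant (assumption): the eight digits must not be part of a
// longer run of digits.
preg_match_all('/(?<!\d)20\d{6}(?!\d)/', $string, $strict);

// $loose[0]  is ['20150112', '20160203']  (the second one comes from the UID)
// $strict[0] is ['20150112']
```

The lookarounds only inspect the neighbouring characters, so the matched text itself is still exactly eight digits.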

The answer depends heavily on what you want to achieve. Do you want to extract any and all dates from an iCalendar file? If so, you might miss birthdays, as their years most likely start with 19.
Also, matching any date will most likely yield many undesired ones, such as those in UNTIL, TRIGGER, DTEND, ...
Assuming from your example that you want to extract event start dates, you could try:
DTSTART[a-zA-Z._%+/=;-]*:(\d{8})(?:T\d{6})?
Keep in mind that DTSTART can be followed by a time zone parameter such as TZID=America/New_York and/or a value type parameter, VALUE=DATE or VALUE=DATE-TIME (see RFC 5545, DATE-TIME).
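As a rough sketch of that idea, the snippet below pulls only the DTSTART dates out of a small, made-up iCalendar fragment. The sample data and the pattern are assumptions for illustration; in particular, the pattern does not handle quoted parameter values that contain a colon, nor folded lines.

```php
<?php
// Hypothetical fragment covering three common DTSTART shapes.
$ical = <<<ICS
DTSTART:20150112T120000Z
DTSTART;TZID=America/New_York:20150305T090000
DTSTART;VALUE=DATE:20150401
DTEND:20150112T130000Z
ICS;

// Optional ";NAME=value" parameters before the colon, then an 8-digit date,
// then an optional "T" + 6-digit time + optional "Z".
$pattern = '/^DTSTART(?:;[^:\r\n]*)?:(\d{8})(?:T(\d{6})Z?)?/m';

preg_match_all($pattern, $ical, $matches);

// Group 1 holds the dates: ['20150112', '20150305', '20150401'].
$startDates = $matches[1];
```

Anchoring on the property name is what keeps DTEND, UNTIL and similar values out of the result.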

Related

Regex Number reservations

Good afternoon. I am creating a form with a number reservation field for the user to fill in. The system I had only allowed reserving numbers one by one; for example, 1,2,3 would book the numbers 1, 2 and 3.
Now I would like to add the option to book several numbers at once; for example, 1-5,9,10 would book numbers 1 to 5, 9 and 10.
I'm using the following regex, but it's not working as I want:
^\d{1,5}(?:-\d{1,5})*(?:,\d{1,5})*(?:,\d{1,5}-\d{1,5})*(?:-\d{1,5},\d{1,5})*$
The problem with this pattern is that once the user enters two ranges, such as 1-3,4-6, it only allows one more number. For example, 1-3,4-6,2,3 shows an error when the ,3 is typed.
There is also a problem where it allows several dashes without commas, for example 1-3-6-8-9.
Perhaps something like this:
\A\d{1,5}(?:-\d{1,5})?(?:,\d{1,5}(?:-\d{1,5})?)*\z
The idea:
the range part (?:-\d{1,5})? is optional and follows the first number
the group that contains a comma followed by a number or a range can occur zero or more times
Note that the whole problem can't be solved by a regex alone, since inputs such as 6-4 or 1-5,2,3,4 are syntactically valid. Sooner or later you will need to explode the string and check that the numbers and ranges are coherent.
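The two steps (syntax check, then explode and verify) could be sketched like this. The function name `parseReservation` and the choice to reject reversed ranges and repeated numbers are assumptions; your own rules for what counts as "coherent" may differ.

```php
<?php
/**
 * Parses a reservation string such as "1-5,9,10" into a sorted list of numbers.
 * Returns null when the syntax is wrong, a range is reversed, or a number repeats.
 * (Hypothetical helper; the rejection rules are illustrative.)
 */
function parseReservation(string $input): ?array
{
    // Step 1: syntax check with the pattern from the answer.
    if (!preg_match('/\A\d{1,5}(?:-\d{1,5})?(?:,\d{1,5}(?:-\d{1,5})?)*\z/', $input)) {
        return null;
    }

    // Step 2: expand each item and check coherence.
    $numbers = [];
    foreach (explode(',', $input) as $item) {
        $bounds = array_map('intval', explode('-', $item));
        $start = $bounds[0];
        $end = $bounds[1] ?? $start; // a single number is a range of one
        if ($start > $end) {
            return null; // reversed range such as 6-4
        }
        for ($n = $start; $n <= $end; $n++) {
            if (isset($numbers[$n])) {
                return null; // duplicate such as 1-5,2
            }
            $numbers[$n] = true;
        }
    }

    $result = array_keys($numbers);
    sort($result);
    return $result;
}
```

With these rules, `parseReservation('1-5,9,10')` gives the numbers 1 to 5, 9 and 10, while `6-4`, `1-5,2,3,4` and `1-3-6-8-9` are all rejected.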

PHP Regex extract date or date range

I have a database full of movie titles and I want to extract the date, which I've managed to do with the following:
(19|20)[0-9][0-9]
However, I've noticed some of my dates are ranges, for example 1998-2003, or sometimes there is a space, like 1998 - 2003. Is there any way to adapt the regex to match the ranges with or without spaces?
Use \s* to match zero or more spaces.
(?:19|20)[0-9]{2}\s*-\s*(?:19|20)[0-9]{2}
If you also want to match a single year, make the second part optional.
(?:19|20)[0-9]{2}(?:\s*-\s*(?:19|20)[0-9]{2})?
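Here is a small sketch of the optional-range pattern used from PHP. The helper name `extractYears`, the capture groups, the `\b` anchors and the sample titles are assumptions added for illustration.

```php
<?php
/**
 * Returns ['from' => year, 'to' => year] for the first year or year range
 * found in $title, or null if there is none. (Hypothetical helper.)
 */
function extractYears(string $title): ?array
{
    // Same idea as the pattern above, with capture groups around both years.
    $pattern = '/\b((?:19|20)\d{2})(?:\s*-\s*((?:19|20)\d{2}))?\b/';
    if (!preg_match($pattern, $title, $m)) {
        return null;
    }
    $from = (int) $m[1];
    // Group 2 is missing when only a single year was matched.
    $to = !empty($m[2]) ? (int) $m[2] : $from;
    return ['from' => $from, 'to' => $to];
}
```

For a single year, `from` and `to` are the same, so the caller does not need a separate code path.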

Parsing input from user in any order or format

I am having some trouble trying to figure out how to parse information collected from user. The information I am collecting is:
Age
Sex
Zip Code
Following are some examples of how I may receive this from users:
30 Male 90250
30/M/90250
30 M 90250
M 30 90250
30-M-90250
90250,M,30
I started off with the explode function, but I was left with a huge list of if/else statements trying to work out how the user separated the information (was it a space, comma, slash or hyphen?).
Any feedback is appreciated.
Thanks
It's easy enough. The ZIP code is always 5 digits, so a simple regex matching /\d{5}/ will work just fine. The age is a number of 1 to 3 digits, so /\b\d{1,3}\b/ takes care of that (the word boundaries stop it from matching part of the ZIP code). As for the gender, you could just look for an f for female and, if there isn't one, assume male.
With all that said, what's wrong with separate input fields?
You might want to use a few regular expressions:
One that looks for 5 numeric digits: (?<!\d)\d{5}(?!\d)
One that looks for 2 numeric digits: (?<!\d)\d{2}(?!\d)
One that looks for a single letter: [a-zA-Z]
[EDIT]
I've edited the RegExes. They now match every one of the presented alternatives, and don't require any alteration of the input string (which makes it a more efficient choice). They can also be run in any order.
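Putting the three lookups together might look like the sketch below. The function name `parseUserInput`, the lookaround-based patterns and the decision to accept "Male"/"Female" as well as "M"/"F" are assumptions; the sketch also assumes the age has at most three digits.

```php
<?php
/**
 * Extracts age, sex and ZIP code from free-form input, in any order.
 * (Hypothetical helper; fields that cannot be found stay null.)
 */
function parseUserInput(string $input): array
{
    $result = ['age' => null, 'sex' => null, 'zip' => null];

    // ZIP: exactly five digits, not part of a longer run of digits.
    if (preg_match('/(?<!\d)\d{5}(?!\d)/', $input, $m)) {
        $result['zip'] = $m[0];
    }
    // Age: one to three digits, not part of a longer run (so never the ZIP).
    if (preg_match('/(?<!\d)\d{1,3}(?!\d)/', $input, $m)) {
        $result['age'] = (int) $m[0];
    }
    // Sex: "M"/"F" or "Male"/"Female", case-insensitive, as a whole word.
    if (preg_match('/(?<![a-z])(m|f)(?:ale|emale)?(?![a-z])/i', $input, $m)) {
        $result['sex'] = strtoupper($m[1]);
    }
    return $result;
}
```

Because each pattern is searched independently, the separator and the order of the fields do not matter.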

How can I use regex to solve this?

I have two strings that I need to pull data out of but can't seem to get it working. I wish I knew regular expression but unfortunately I don't. I have read some beginner tutorials but I can't seem to find an expression that will do what I need.
Out of this first string delimited by the equal character, I need to skip the first 6 characters and grab the following 9 characters. After the equal character, I need to grab the first 4 characters which is a day and year. Lastly for this string, I need the remaining numbers which is a date in YYYYmmdd.
636014034657089=130719889904
The second string seems a little more difficult because the spaces between the characters differ but always seem to be delimited by at minimum, a single space. Sometimes, there are as many as 15 or 20 spaces separating the blocks of data.
Here are two different samples that show the space difference.
!!92519 C 01 M600200BLNBRN D55420090205M1O
!!95815 A M511195BRNBRN D62520070906 ":%/]Q2#0*&
The data that I need out of these last two strings are:
The zip code following the 2 exclamation marks.
The single letter 'M' following that. It always appears to be in a 13 character block
The 3 numbers after the single letter
The next 3 numbers which are the person's height
The following next 3 are the person's weight
The next 3 are eye color
The next block of 3 which are the person's hair color
The last block that I need data from:
I need to get the single letter which in the example appears to be a 'D'.
Skip the next 3 numbers
The last and remaining 8 numbers which is a date in YYYYmmdd
If someone could help me resolve this, I'd be very grateful.
For the first string you can use this regular expression:
^[0-9]{6}([0-9]{9})=([0-9]{4})([0-9]{4})([0-9]{2})([0-9]{2})$
Explanation:
^ Start of string/line
[0-9]{6} Match the first 6 digits
([0-9]{9}) Capture the next 9 digits
= Match an equals sign
([0-9]{4}) Capture the "day and year" (what format is this in?)
([0-9]{4}) Capture the year
([0-9]{2}) Capture the month
([0-9]{2}) Capture the day
$ End of string/line
For the second:
^!!([0-9]{5}) +.*? +M([0-9]{3})([0-9]{3})([A-Z]{3})([A-Z]{3}) +([A-Z])[0-9]{3}([0-9]{4})([0-9]{2})([0-9]{2})
It works in a similar way to the first. You may need to adjust it slightly if your data is not exactly in the format that the regular expression expects. You might want to replace the .*? with something more precise but I'm not sure what because you haven't described the format of the parts you are not interested in.
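For reference, a sketch of both expressions used with preg_match and named groups, so the captured pieces are easier to pick out. The group names (`id`, `prefix`, `num1`, `code1`, and so on) are placeholders I made up, and the extra spaces in the second sample are only there to illustrate the variable spacing described in the question.

```php
<?php
// First string: skip 6 digits, capture 9, then split the part after "=".
$first = '636014034657089=130719889904';
$pattern1 = '/^\d{6}(?<id>\d{9})=(?<prefix>\d{4})(?<year>\d{4})(?<month>\d{2})(?<day>\d{2})$/';
preg_match($pattern1, $first, $a);
// $a['id'] is '034657089', $a['prefix'] is '1307', $a['year'] is '1988'

// Second string: blocks separated by one or more spaces (spacing is illustrative).
$second = '!!92519     C 01      M600200BLNBRN        D55420090205M1O';
$pattern2 = '/^!!(?<zip>\d{5}) +.*? +M(?<num1>\d{3})(?<num2>\d{3})'
          . '(?<code1>[A-Z]{3})(?<code2>[A-Z]{3}) +(?<letter>[A-Z])\d{3}'
          . '(?<year>\d{4})(?<month>\d{2})(?<day>\d{2})/';
preg_match($pattern2, $second, $b);
// $b['zip'] is '92519', $b['num1'] is '600', $b['code1'] is 'BLN',
// $b['letter'] is 'D', and the date parts are '2009', '02', '05'
```

Named groups are optional here; the numbered groups from the original expressions are still available as `$a[1]`, `$a[2]`, and so on.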

One type of delimiter in one date time string in RegEx

Is there any way to write a Regex that can validate one type of delimiter in one date time string only?
For example, 30/04/2010 is correct but 30-04/2010 is incorrect.
I googled and found something about backtrack, but I am not very sure how to use it. For example, if I have this regex:
(?P<date>((31(?![\.\-\/\—\ \,\–\-]{1,2}(Feb(ruary)?|Apr(il)?|June?|(Sep(?=\b|t)t?|Nov)(ember)?)))|((30|29)(?![\.\-\/\—\ \,\–\-]{1,2}Feb(ruary)?))|(29(?=[\.\-\/\—\ \,\–\-]{1,2}Feb(ruary)?[\.\-\/\—\ \,\–\-]{1,2}(((1[6-9]|[2-9]\d)(0[48]|[2468][048]|[13579][26])|((16|[2468][048]|[3579][26])00)))))|(0?[1-9])|1\d|2[0-8])[\.\-\/\—\ \,\–\-]{1,2}(Jan(uary)?|Feb(ruary)?|Ma(r(ch)?|y)|Apr(il)?|Ju((ly?)|(ne?))|Aug(ust)?|Oct(ober)?|(Sep(?=\b|t)t?|Nov|Dec)(ember)?)[\.\-\/\—\ \,\–\-]{1,2}((1[6-9]|[2-9]\d)\d{2}))
Then how am I supposed to use backtrack here?
Thank you very much.
Not an answer to your question, but have you considered using strtotime()?
It is a very flexible function, able to parse just about any English date and time description:
Feb 2, 2010
February 2, 2010
02/02/10
4 weeks ago
Next Monday
If a date is unparseable, the function will return false (or -1 prior to PHP 5.1).
There are a few gotchas: as far as I remember, with xx-xx-xxxx notation it assumes a European DD-MM-YYYY date (and with slashes, an American MM/DD/YYYY one). All in all, though, you may fare much better with it than with a regex.
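A quick sketch of what that looks like in practice. The wrapper `toIsoDate` is a hypothetical helper, and the time zone is pinned to UTC only so that the results do not depend on the server configuration.

```php
<?php
date_default_timezone_set('UTC');

/**
 * Normalises a free-form date string to YYYY-MM-DD, or returns null
 * when strtotime() cannot parse it. (Hypothetical helper.)
 */
function toIsoDate(string $text): ?string
{
    $timestamp = strtotime($text);
    return $timestamp === false ? null : date('Y-m-d', $timestamp);
}

// toIsoDate('Feb 2, 2010')  gives '2010-02-02'
// toIsoDate('02/02/10')     gives '2010-02-02' (slashes: month/day/year)
// toIsoDate('30-04-2010')   gives '2010-04-30' (dashes: day-month-year)
// toIsoDate('Not Good')     gives null
```

Note that this accepts far more formats than the original regex, so it suits parsing better than strict validation.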
While Pekka's answer solves your problem much better, I think this part of your question is worth answering:
Then how am I supposed to use backtrack here?
Here is an example of a regex which matches "05-07-2010" and "05/07/2010", but not "05-07/2010":
"#^\d{2}([/-])\d{2}\1\d{4}$#"
The most important parts of the regex are ([/-]) and \1; \1 is a back reference to the first capturing subpattern ([/-]). You can find more information in the PHP Manual's "Back references" chapter.
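A minimal sketch of that back reference in use; the function name `hasConsistentDelimiter` is made up, and the pattern only checks the DD?MM?YYYY shape, not whether the date actually exists.

```php
<?php
/**
 * Returns true when the string looks like DD<sep>MM<sep>YYYY with the
 * same separator ("/" or "-") in both positions. (Hypothetical helper.)
 */
function hasConsistentDelimiter(string $date): bool
{
    // ([/-]) captures the first separator; \1 requires the same one again.
    return preg_match('#^\d{2}([/-])\d{2}\1\d{4}$#', $date) === 1;
}

// hasConsistentDelimiter('30/04/2010') is true
// hasConsistentDelimiter('30-04/2010') is false
```

The same trick should carry over to the long pattern in the question: capture the first delimiter class in a group and replace the second occurrence with a back reference to it.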
The regex you have seems to check whether the string is in any of the possible formats a date can take.
If you just want to check for your given example, 30/04/2010, you could use this simple one:
([\d]{1,2})/([\d]{1,2})/([\d]{2,4})
(day and month 1-2 digits, year 2-4 digits)
