PHP REGEX is a weakness of mine, but still I manage to get some things done with online tools. Consider the following:
A subject string which generally follows this pattern: 1551 UTC 04 June 2012
I want to extract the "04" and assign it to the $day variable using the code below:
$day = preg_replace("/^([0-9]{4})\s([A-Z]{3})\s([0-9]{2})\s([A-Za-z]{3,})\s([0-9]{4})$/", "$3", $weather['date']);
This works on the following website: http://sqa.fyicenter.com/Online_Test_Tools/Test_Regular_Expression_Search_Replace.php
but I can't get it to work in my script: $day ends up equal to the whole subject string.
The result of your var_dump() is string(38) "1551 UTC 04 June 2012 ". It has 38 characters when it should have only 21, so the string evidently contains extra whitespace. Because of that the pattern never matches, and preg_replace() returns the input unchanged.
Try to trim() your input string and replace \s with \s+ to allow for multiple whitespace characters:
$day = preg_replace("/^([0-9]{4})\s+([A-Z]{3})\s+([0-9]{2})\s+([A-Za-z]{3,})\s+([0-9]{4})$/", "$3", trim($weather['date']));
You say preg_replace, but I think you want to use preg_match(). Is it correct that you don't want to replace the "04" but just want to put it into the variable $day? If so, use preg_match(). In your description you say you want to capture only the "04" part, but your regex has many capture groups (anything within "()" is a capture group and will be returned in the array you give to preg_match).
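A minimal sketch of the preg_match() approach, for what it's worth. The padded input is an assumption made up to mimic the string(38) value from the var_dump() above, and only the day is wrapped in a capture group:

```php
<?php
// Hypothetical input: 21 visible characters plus 17 trailing spaces (38 in total).
$weather = ['date' => '1551 UTC 04 June 2012' . str_repeat(' ', 17)];

$day = null;
// Only the day has a capture group, so it ends up in $m[1].
if (preg_match('/^\d{4}\s+[A-Z]{3}\s+(\d{2})\s+[A-Za-z]{3,}\s+\d{4}$/', trim($weather['date']), $m)) {
    $day = $m[1];
}

var_dump($day); // string(2) "04"
```

If the pattern does not match, $day stays null, which is easier to detect than preg_replace() silently handing back the whole input.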
I am trying to change a string which may have a date inside, e.g.
"This is the test string with 22/12/2012. 23/12/12 could appear anywhere in the string"
I need to change the above string so that the dates are in the format d-m-y, i.e.
"This is the test string with 22-12-2012. 23-12-12 could appear anywhere in the string"
EDIT:
Please note that the year may be written with either 2 or 4 digits, i.e. 20/06/2012 or 20/06/12. Only the year can vary in length; the rest will stay the same.
Any help will be highly appreciated.
Cheers,
Use preg_replace like this:
$repl = preg_replace('~(\d{2})/(\d{2})/(\d{2,4})~', '$1-$2-$3', $str);
Live Demo: http://ideone.com/7HDNZa
$string = preg_replace("/([0-9]{2})\/([0-9]{2})\/([0-9]{2,4})/", "$1-$2-$3", $string);
The regex will find 3 lots of 2 numbers (or 2x2 + 1x4) separated by /'s and replace them with the same numbers separated by -'s.
You could try something like this:
preg_replace('~\b(0?[1-9]|[12][0-9]|3[01])/(0?[1-9]|1[0-2])/(?=(?:\d\d|\d{4})\b)~', '$1-$2-', $str);
This should match only plausible dates (day 1 to 31, month 1 to 12). It also matches dates where the leading 0 is not present, e.g. 4/6/13; if this is not desirable, remove the two question marks (in both 0? parts).
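To try the simple digit-based replacement against the sample string from the question, something like the following sketch should do. The \b anchors and the \d{4}|\d{2} alternation are my own additions (so that a 3-digit "year" is not accepted); they are not part of the answers above:

```php
<?php
$str = "This is the test string with 22/12/2012. 23/12/12 could appear anywhere in the string";

// Two digits, slash, two digits, slash, then a 4-digit or 2-digit year.
$out = preg_replace('~\b(\d{2})/(\d{2})/(\d{4}|\d{2})\b~', '$1-$2-$3', $str);

echo $out, "\n";
// This is the test string with 22-12-2012. 23-12-12 could appear anywhere in the string
```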
I need to get everything before "On Sun, May 27, 2012 at 6:25 AM,"
I am hoping to get everything before "On xxx, xxx xx, xxxx at xx:xx xx,"
The problem here is that May, 27, and 6 are all variable in length. What is the best tool for this job? Due to my lack of experience with regex I am trying to use explode(), but it doesn't appear that it can do the job here. Is regex my best option?
[EDIT]
I ended up using a combination of answers. I went with:
preg_match("/(.*)On\s+(Sun|Sat|Fri|Thu|Wed|Tue|Mon),\s+(January|February|March|April|May|June|July|August|September|October|November|December)\s+\d?\d,\s+\d{4}\s+at\s+\d?\d:\d\d\s+[AP]M,/i", $to, $end);
Something like this, I guess:
/On\s+(Sun|Sat|Fri|Thu|Wed|Tue|Mon),\s+(January|February|March|April|May|June|July|August|September|October|November|December)\s+\d?\d,\s+\d{4}\s+at\s+\d?\d:\d\d\s+[AP]M,/i
[EDIT]
As per the comment: I have added support for case-insensitive matching (by adding the i modifier to the end of the regex). I have also changed the spaces in the expression to \s to allow any whitespace character, and added + to allow multiple spaces between words.
I haven't changed it to support long day names or short month names, as the question specified that the month name was variable in length but didn't specify the day name as being variable. However, it should be trivial enough to add these variants if required.
[EDIT]
$to = "Let me know how this response looks..... On Sun, May 27, 2012 at 6:25 AM, Pr";
preg_match("/On\s+(Sun|Sat|Fri|Thu|Wed|Tue|Mon),\s+(January|February|March|April|May|June|July|August|September|October|November|December)\s+\d?\d,\s+\d{4}\s+at\s+\d?\d:\d\d\s+[AP]M,/i", $to, $end);
This code works for the example given in your comment.
Hope that helps.
preg_match('/(.*?) On \w+, \w+ \d?\d, \d+ at \d?\d:\d?\d \w\w,/', 'grab this text here On Sun, May 27, 2012 at 6:25 AM,', $matches);
echo $matches[1];
// echoes 'grab this text here'
(.*?) matches everything in the beginning, \w+ matches any alphanumeric character 1 or more times, \d?\d matches either one or two digits
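Another way to get "everything before" is to split on the date line instead of capturing in front of it. This is only a sketch: using preg_split() with a limit of 2 is my suggestion rather than something from the answers, and the looser [A-Za-z]{3,} and [A-Za-z]+ parts are an assumption so that long day names and short month names are accepted too:

```php
<?php
$to = "Let me know how this response looks..... On Sun, May 27, 2012 at 6:25 AM, Pr";

// Leading \s* swallows the whitespace in front of "On" so the result needs no trim().
$pattern = '/\s*On\s+[A-Za-z]{3,},\s+[A-Za-z]+\s+\d{1,2},\s+\d{4}\s+at\s+\d{1,2}:\d{2}\s+[AP]M,/i';

// A limit of 2 means we split at the first date line only.
$parts  = preg_split($pattern, $to, 2);
$before = $parts[0];

echo $before, "\n"; // Let me know how this response looks.....
```

If the pattern is not found, $parts[0] is simply the whole input string.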
A regular expression would work, since that's what it was made for: selecting data based on a pattern. You could however explode on ',' (comma) and just implode the first 3 elements together again to form your sentence. I doubt using a regular expression will be faster in this case.
Ultimately it's your preference: whichever is more readable and understandable to you.
The main advantage regular expressions would have in this particular case is that they can extract specific values/patterns, so you could easily have them set aside the month, for instance.
$dateString = "On Sun, May 27, 2012 at 6:25 AM, some other text here";
// using explode/implode
$result = explode(',',$dateString);
print "we got: " . implode(',', array_slice($result,0,3)) . "\n";
// using regular expression
$pattern = "/On [A-Za-z]{3}, [A-Za-z]+ [0-9]+, [0-9]{4} at [0-9]+:[0-9]+ [AP]M/";
preg_match($pattern,$dateString,$match);
print "We got: " . $match[0] . "\n";
Please also read the PHP manual, Regular Expressions subsection, together with an introductory tutorial.
Personally, in this case I think a regular expression might be overkill, both visually and performance-wise. Do learn regular expressions though; they can be very helpful at times.
How can I replace a date/time in this format 'Fri Mar 23 15:21:08 2012' with preg_replace?
A date in this format appears a couple of times in my text and I need to replace it with the current date/time.
Thanks,
Chris
Well, what you need is an expression that will match 3 letters (Fri) followed by a space and another three letters (Mar).
First we need to match some letters:
/[a-z]/
We can match exactly 3 letters like this:
/[a-z]{3}/
...and we'll need it to be case insensitive:
/[a-z]{3}/i
...so the first part is just:
/[a-z]{3} [a-z]{3}/i
Next, we need to match either 1 or 2 digits. A digit can be represented with the escape sequence \d, so we'd use:
/\d{1,2}/
Next we match the time string, using the same escape sequence:
/\d{2}:\d{2}:\d{2}/
...followed by a final 4 digit year:
/\d{4}/
Put it all together and we get:
/[a-z]{3} [a-z]{3} \d{1,2} \d{2}:\d{2}:\d{2} \d{4}/i
// Fri Mar 23 15 : 21 : 08 2012
Now, we need to replace it with the current date and time. The usual place we'd go for that is the date() function, but how do we get that into the replacement dynamically? We could pass it as a string literal, or we could use a callback function with preg_replace_callback(). Older versions of preg_replace() also offered the e modifier, which caused the replacement string to be evaluated as PHP code, but it was deprecated in PHP 5.5 and removed in PHP 7, and like any eval() it was easy to misuse. A callback does the same job safely.
So our final PHP code looks like this:
$str = preg_replace_callback(
'/[a-z]{3} [a-z]{3} \d{1,2} \d{2}:\d{2}:\d{2} \d{4}/i',
function ($match) { return date('D M j H:i:s Y'); },
$str
);
I think listing the finite sets of options is better for this kind of task, and it will also save you from false positives. These are the patterns to match each part of the date format:
Days: (?:Mon|Tue|Wed|Thu|Fri|Sat|Sun)
Months: (?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)
Day: \d{1,2}
Time: \d{1,2}:\d{2}:\d{2}
Year: \d{4}
Putting everything together:
(?:Mon|Tue|Wed|Thu|Fri|Sat|Sun) (?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) \d{1,2} \d{1,2}:\d{2}:\d{2} \d{4}
The code might look like:
$current_date = date('D M j H:i:s Y');
$text = preg_replace(
'/(?:Mon|Tue|Wed|Thu|Fri|Sat|Sun) (?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) \d{1,2} \d{1,2}:\d{2}:\d{2} \d{4}/i',
$current_date,
$text
);
preg_replace('/Fri Mar 23 15:21:08 2012/',date('D M d H:i:s Y'),$string);
That should normally do what you want.
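To check that every occurrence really gets replaced, the enumerated pattern can be run over a text with more than one date. This is a sketch under assumptions: the sample text is made up, and the \b anchors plus the $count argument are my additions:

```php
<?php
// Hypothetical sample text containing two dates in the format from the question.
$text = "Started Fri Mar 23 15:21:08 2012, finished Sat Mar 24 09:05:00 2012.";

$pattern = '/\b(?:Mon|Tue|Wed|Thu|Fri|Sat|Sun) '
         . '(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) '
         . '\d{1,2} \d{1,2}:\d{2}:\d{2} \d{4}\b/';

// Compute the replacement once so that all occurrences get the same value.
$now = date('D M j H:i:s Y');

// The fifth argument receives the number of replacements that were made.
$result = preg_replace($pattern, $now, $text, -1, $count);

echo $result, "\n";
echo "Replaced $count occurrence(s)\n"; // Replaced 2 occurrence(s)
```

Computing date() once up front avoids two occurrences getting different seconds if the clock ticks over mid-replacement.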
I am trying to extract a date from a string variable and was hoping to get some help.
$editdate = "Content last modified on 17 May 2011 at 23:13";
From this string, I am trying to extract 17 May 2011. Please keep in mind that the date will vary and the code needs to be able to extract any date in this format, DD MMM YYYY.
I thought of using preg_match to do this but I couldn't come up with a proper regex pattern that would extract the date properly.
Is this possible to do with regex or should I use a different function?
Thanks for the help !
Try:
$timestamp = strtotime( str_replace( array("Content last modified on ", " at"), "", $editdate ) );
This will leave you with a Unix timestamp that you can then output however you like using date().
This is possible with a regex. Given the format DD MMM YYYY you would need a regex that matches two (or one?) digits, then one space, three letters, one space and four digits.
That would look like:
$regex = '/(\d{2} [a-z]{3} \d{4})/i';
This can be optimized further.
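As a rough sketch of how that regex could be put to use: the \d{1,2} (to allow single-digit days), the \b anchors and the DateTime::createFromFormat() step are assumptions of mine, not part of the answer above:

```php
<?php
$editdate = "Content last modified on 17 May 2011 at 23:13";

$found = null;
$date  = false;
if (preg_match('/\b(\d{1,2} [a-z]{3} \d{4})\b/i', $editdate, $m)) {
    $found = $m[1];
    // '!' resets all fields, so the time part becomes 00:00:00.
    $date = DateTime::createFromFormat('!j M Y', $found);
}

echo $found, "\n"; // 17 May 2011
if ($date !== false) {
    echo $date->format('Y-m-d'), "\n"; // 2011-05-17
}
```

Turning the match into a DateTime object makes it easy to reformat the date afterwards.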
Presuming the textual content of your string is always the same, and that it always ends with the time...
$editdate = substr($editdate, 25, -9); // 17 May 2011
However, this is very inflexible if the date format were ever to change.
Try this 'un:
preg_match('/(\d?\d [A-Za-z]+ \d\d\d\d) at (\d\d\:\d\d)/', $editdate, $matches);
print_r($matches);
$date = $matches[1];
$time = $matches[2];
I THINK that'll work in all cases (though it is pretty ugly).... :)
This might be the pattern that does the trick:
([0-9]{1,2})\s([A-Za-z]+)\s([0-9]{4})
Search for one or two digits, followed by a space, the month name, another space and 4 digits for the year.
I have some strings I need to scrape data from. I need a simple way of telling PHP to look in the string and delete data before and after the part I need. An example is:
When: Sat 19 Sep 2009 22:00 to Sun 20 Sep 2009 03:00
I want to delete the "When: " and then remove the & and everything after it. Is this a Regex thing? Not really used them before.
I would not use regular expressions for this.
$data = substr($input, 6, strpos($input, '&') - 6);
Yes, regex can do this kind of thing in its sleep.
$result = preg_replace('/When:(.*)&.*/', '$1', $text);
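The two approaches can be compared side by side. Note that this is only a sketch: the text after the & in the input is made up for illustration, and the \s* and the guard for a missing & are my additions:

```php
<?php
// Hypothetical input: everything after "&" is invented for this example.
$input = "When: Sat 19 Sep 2009 22:00 to Sun 20 Sep 2009 03:00&rest of line";

// substr()/strpos() variant; "When: " is 6 characters long.
$amp = strpos($input, '&');
$viaSubstr = ($amp === false) ? substr($input, 6) : substr($input, 6, $amp - 6);

// Regex variant; \s* keeps the space after "When:" out of the captured group.
$viaRegex = preg_replace('/^When:\s*(.*?)&.*$/', '$1', $input);

echo $viaSubstr, "\n"; // Sat 19 Sep 2009 22:00 to Sun 20 Sep 2009 03:00
var_dump($viaSubstr === $viaRegex); // bool(true)
```

Without the \s*, the pattern '/When:(.*)&.*/' would leave a leading space in the result.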
UPDATE
If you want to find the date range only, in the middle of a lot of other text, here is a crude regex that will match the one in the question...
if (preg_match('/[a-z]{3} [0-9]{2} [a-z]{3} [0-9]{4} [0-9]{2}:[0-9]{2} to [a-z]{3} [0-9]{2} [a-z]{3} [0-9]{4} [0-9]{2}:[0-9]{2}/i', $text, $regs)) {
$result = $regs[0];
} else {
$result = "";
}
So you would want to keep "Sat 19 Sep 2009 22:00 to Sun 20 Sep 2009 03:00"
Well, you can certainly use a regexp. I don't know much about regexps in PHP, but in Perl you could do something like
/^When: (.*)\ $/ .
The (.*) captures everything that you want to keep. In Perl, you would read it from the $1 variable.
Or you could do something like
/^When: (.*)\&.*$/ if the content after the & is variable.
Also, watch out: if the string you want to keep contains &, then it might be a little more tricky.
But regexps are usually the way to go for this type of work.