OK, I'm using a library to get some strings from X website. Each string looks like this:
Mar 17 2019, 16:08:43 CET Died at Level 418 by Gaz'haragoth.
if($player->getDeaths()) {
$mystring = $player->getDeaths()[0];
$dateString = preg_replace("/\([^)]+\)/","",$mystring);
$date = new DateTime($dateString);
echo $date->format('Y-m-d H:i:s');
}
This is how my code looks right now. How can I get only "Mar 17 2019, 16:08:43"?
Thanks!
echo substr("Mar 17 2019, 16:08:43 CET Died at Level 418 by Gaz'haragoth.", 0, 21);
You can use a regex search with
(.*?)[0-9]+[:][0-9]+[:][0-9]+
on the string. The match covers everything up to and including the hh:mm:ss part and nothing after it.
Taking a substring up to a fixed position would also work if the date is the same length every time.
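As a rough sketch of the regex approach, preg_match can hand back the match directly. The variable names ($line, $m, $dateString) are only placeholders, and it assumes the time always appears as hh:mm:ss:

```php
<?php
$line = "Mar 17 2019, 16:08:43 CET Died at Level 418 by Gaz'haragoth.";

// The lazy (.*?) expands only until the first hh:mm:ss group is found,
// so the whole match ends right after the seconds.
$dateString = null;
if (preg_match('/(.*?)[0-9]+[:][0-9]+[:][0-9]+/', $line, $m)) {
    $dateString = $m[0];
}

echo $dateString; // Mar 17 2019, 16:08:43
```

If the pattern does not match, $dateString stays null, so the caller can tell "no date found" apart from an empty string.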
If the first half of the string (the date) is standard you can also use this without any regex needed:
$output_str = implode(" ", array_slice(explode(" ", $input_str), 0, 4)); // array_slice, not array_splice: array_splice takes its array by reference
You can use DateTime::createFromFormat to create a DateTime object from that string.
$string = "Mar 17 2019, 16:08:43 CET Died at Level 418 by Gaz'haragoth.";
$date = DateTime::createFromFormat('M d Y, H:i:s T+', $string);
Then you can output from that in whatever format you like.
You may not really need a DateTime object though. If all you need to do is strip off the trailing text, substr seems like the simplest way: as long as the day is formatted with a leading zero, the date part of the string should always be the same length.
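For the createFromFormat route, a minimal sketch that goes all the way to the Y-m-d H:i:s output the question asks for could look like this. The $formatted variable is just an illustrative name, and the failure branch is an addition, since createFromFormat returns false when the string does not fit the format:

```php
<?php
$string = "Mar 17 2019, 16:08:43 CET Died at Level 418 by Gaz'haragoth.";

// 'T' reads the timezone abbreviation; '+' tells the parser to ignore
// any trailing text instead of failing on it.
$date = DateTime::createFromFormat('M d Y, H:i:s T+', $string);

if ($date === false) {
    // Parsing failed; DateTime::getLastErrors() has the details.
    $formatted = null;
} else {
    $formatted = $date->format('Y-m-d H:i:s');
}

echo $formatted;
```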
Try
<?php
$str = "Mar 17 2019, 16:08:43 CET Died at Level 418 by Gaz'haragoth";
echo preg_replace('/^(.+\d+:\d+:\d+).+$/', '$1', $str);
?>
I have a date string, or datestamp, whichever is the easiest.
The string looks like this:
Thursday, 01 December 2016 19:00 - 22:00
Or, if there is no end time:
Saturday, 01 December 2016 19:00
I can also get them as datestamps.
The $event object contains information like this:
[dtstart] => 1479232800
[dtstart] => 1481094000
[dtend] => 1481151599
//apparently if there is no end time, the endtime is set to 23:59:59.
I tried doing this:
echo 'start '.date('H:i','$event->dtstart')
But that didn't give me anything.
Change this line:
echo 'start '.date('H:i','$event->dtstart');
to
echo 'start '.date('H:i',$event->dtstart); // no quotes around the second parameter
This will give you a result like:
start 10:00
Remove quotes from $event->dtstart
echo 'start '.date('H:i',$event->dtstart);
example: echo date('H:i',1171502725);
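If the goal is to print both the start and the end time, something along these lines might work. This is only a sketch: the helper name formatEventTimes and its timezone argument are made up for illustration, and treating an end time of 23:59:59 as "no end time" is an assumption based on the behaviour described in the question.

```php
<?php
// Hypothetical helper: returns "start" or "start - end" from Unix timestamps.
function formatEventTimes(int $dtstart, ?int $dtend, string $timezone): string
{
    $tz = new DateTimeZone($timezone);
    $start = (new DateTimeImmutable('@' . $dtstart))->setTimezone($tz);
    $out = $start->format('H:i');

    if ($dtend !== null) {
        $end = (new DateTimeImmutable('@' . $dtend))->setTimezone($tz);
        // Assumption: 23:59:59 means the event has no real end time.
        if ($end->format('H:i:s') !== '23:59:59') {
            $out .= ' - ' . $end->format('H:i');
        }
    }
    return $out;
}

echo 'start ' . formatEventTimes(1479232800, null, 'UTC');
```

Passing the timezone explicitly avoids depending on the server's default timezone, which date() would otherwise use.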
I have to edit a whole bunch of date intervals. But they are all mixed up. Most are in the form Month YearMonth Year
eg January 2014March 2015
How would I insert a hyphen in between so I end up with
January 2014 - March 2015
I also have the problem where these dates occur in the same year.
eg April 2012September2012
In such a case I would need to insert the hyphen and remove the year so that I'm left with
April - September
There must be some PHP string operators for stuff like this. Well, that's what I'm hoping.
Would appreciate some guidance. Thanks in advance.
Thanks, sorry for my delayed reply
$string = "January 2014March 2015";
preg_match('/([a-z]+) *(\d+) *([a-z]+) *(\d+)/i', $string, $match);
print "$match[1] $match[2] - $match[3] $match[4]";
outputs,
January 2014 - March 2015
You could do it using lookaround:
$string = "January 2014March 2015";
$res = preg_replace('/(?<=\d)(?=[A-Z])/', ' - ', $string);
echo $res,"\n";
Output:
January 2014 - March 2015
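Neither snippet above drops the year when both dates fall in the same year. A possible sketch that covers both cases from the question, assuming every input follows the Month Year Month Year shape with four-digit years (the function name formatInterval is hypothetical):

```php
<?php
// Hypothetical helper covering both cases from the question.
function formatInterval(string $input): string
{
    // Month name, 4-digit year, month name, 4-digit year, with optional spaces.
    if (!preg_match('/^\s*([a-z]+) *(\d{4}) *([a-z]+) *(\d{4})\s*$/i', $input, $m)) {
        return $input; // leave unrecognised strings untouched
    }
    [, $fromMonth, $fromYear, $toMonth, $toYear] = $m;

    if ($fromYear === $toYear) {
        // Same year: keep only the month names.
        return "$fromMonth - $toMonth";
    }
    return "$fromMonth $fromYear - $toMonth $toYear";
}

echo formatInterval('January 2014March 2015'), "\n";  // January 2014 - March 2015
echo formatInterval('April 2012September2012'), "\n"; // April - September
```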
I'm building a local events calendar which takes RSS feeds and website scrapes and extracts event dates from them.
I've previously asked how to extract dates from text in PHP here, and received a good answer at the time from MarcDefiant:
function parse_date_tokens($tokens) {
# only try to extract a date if we have 2 or more tokens
if(!is_array($tokens) || count($tokens) < 2) return false;
return strtotime(implode(" ", $tokens));
}
function extract_dates($text) {
static $patterns = Array(
'/^[0-9]+(st|nd|rd|th|)?$/i', # day
'/^(Jan(uary)?|Feb(ruary)?|Mar(ch)?|etc)$/i', # month
'/^20[0-9]{2}$/', # year
'/^of$/' #words
);
# defines which of the above patterns aren't actually part of a date
static $drop_patterns = Array(
false,
false,
false,
true
);
$tokens = Array();
$result = Array();
$text = str_word_count($text, 1, '0123456789'); # get all words in text
# iterate words and search for matching patterns
foreach($text as $word) {
$found = false;
foreach($patterns as $key => $pattern) {
if(preg_match($pattern, $word)) {
if(!$drop_patterns[$key]) {
$tokens[] = $word;
}
$found = true;
break;
}
}
if(!$found) {
$result[] = parse_date_tokens($tokens);
$tokens = Array();
}
}
$result[] = parse_date_tokens($tokens);
return array_filter($result);
}
# test
$texts = Array(
"The focus of the seminar, on Saturday 2nd February 2013 will be [...]",
"Valentines Special # The Radisson, Feb 14th",
"On Friday the 15th of February, a special Hollywood themed [...]",
"Symposium on Childhood Play on Friday, February 8th",
"Hosting a craft workshop March 9th - 11th in the old [...]"
);
$dates = extract_dates(implode(" ", $texts));
echo "Dates: \n";
foreach($dates as $date) {
echo " " . date('d.m.Y H:i:s', $date) . "\n";
}
However, the solution has some downsides - for one thing, it can't match date ranges.
I'm now looking for a more complex solution that can extract dates, times and date ranges from sample text.
What's the best approach for this? It seems like I'm leaning back toward a series of regex statements run one after the other to catch these cases. I can't see a better way of catching date ranges in particular, but I know there must be a better way of doing this. Are there any libraries out there just for date parsing in PHP?
Date / Date Range samples, as requested
$dates = [
" Saturday 28th December",
"2013/2014",
"Friday 10th of January",
"Thursday 19th December",
" on Sunday the 15th December at 1 p.m",
"On Saturday December 14th ",
"On Saturday December 21st at 7.30pm",
"Saturday, March 21st, 9.30 a.m.",
"Jan-April 2014",
"January 21st - Jan 24th 2014",
"Dec 30th - Jan 3rd, 2014",
"February 14th-16th, 2014",
"Mon 14 - Wed 16 April, 12 - 2pm",
"Sun 13 April, 8pm",
"Mon 21 - Wed 23 April",
"Friday 25 April, 10 – 3pm",
"The focus of the seminar, on Saturday 2nd February 2013 will be [...]",
"Valentines Special # The Radisson, Feb 14th",
"On Friday the 15th of February, a special Hollywood themed [...]",
"Symposium on Childhood Play on Friday, February 8th",
"Hosting a craft workshop March 9th - 11th in the old [...]"
];
The function I'm currently using (not the above) is about 90% accurate. It can catch date ranges, but has difficulty if a time is also specified. It uses a list of regex expressions and is very convoluted.
UPDATE: Jan 6th, 2014
I'm working on code that does this, working on my original method of a series of regex statements run one after the other. I think I'm close to a working solution that can pretty much extract almost any date/time range / format from a piece of text. When I'm done I'll post it here as an answer.
I think you can sum up the regex in your question like the one below.
(?<date_format_1>(?<day>(?i)\b\s*[0-9]+(?:st|nd|rd|th|)?)(?<month>(?i)\b\s*(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|etc))(?<year>\b\s*20[0-9]{2}) ) |
(?<date_format_2>(?&month)(?&day)(?!\s+-)) |
(?<date_format_3>(?&day)\s+of\s+(?&month)) |
(?<range_type_1>(?&month)(?&day)\s+-\s+(?&day))
Flags: x
Demo
http://regex101.com/r/wP5fR4
Discussion
By reusing the named groups as subroutine calls, e.g. (?&month), you reduce the complexity of the final regex.
I used a negative lookahead in date_format_2 because it would otherwise partially match range_type_1. You may need to add more range types depending on your data. Don't forget to check the other patterns in case of a partial match.
Another solution would be to build small regexes in separate string variables and then concatenate them in PHP to build a bigger regex.
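A small sketch of that concatenation idea follows. It is not the regex from the demo above: the fragment variables, the group names and the findDateExpressions helper are all made up for illustration, the month list is spelled out in full, and only three shapes are covered (a day range within one month, day-first dates and month-first dates).

```php
<?php
// Small building blocks, kept in single quotes so backslashes stay literal.
$day   = '\d{1,2}(?:st|nd|rd|th)?';
$month = '(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|June?|July?'
       . '|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|Nov(?:ember)?|Dec(?:ember)?)';
$year  = '20\d{2}';

// The range comes first so it wins over the single date it starts with.
$pattern = '/\b(?:'
    . "(?<range>{$month}\s+{$day}\s*-\s*{$day})"
    . "|(?<day_first>{$day}\s+(?:of\s+)?{$month}(?:\s+{$year})?)"
    . "|(?<month_first>{$month}\s+{$day})"
    . ')\b/i';

// Hypothetical helper: returns every matched date expression in the text.
function findDateExpressions(string $text, string $pattern): array
{
    preg_match_all($pattern, $text, $matches);
    return $matches[0];
}

print_r(findDateExpressions('Hosting a craft workshop March 9th - 11th in the old', $pattern));
```

The matched text would still need to be normalised before it is passed to something like strtotime, and times or cross-month ranges would need further alternatives.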
Given an arbitrary string, for example ("I'm going to play croquet next Friday" or "Gadzooks, is it 17th June already?"), how would you go about extracting the dates from there?
If this is looking like a good candidate for the too-hard basket, perhaps you could suggest an alternative. I want to be able to parse Twitter messages for dates. The tweets I'd be looking at would be ones which users are directing at this service, so they could be coached into using an easier format, however I'd like it to be as transparent as possible. Is there a good middle ground you could think of?
If you have the horsepower, you could try the following algorithm. I'm showing an example, and leaving the tedious work up to you :)
//Attempt to perform strtotime() on each contiguous subset of words...
//1st iteration
strtotime("Gadzooks, is it 17th June already")
strtotime("is it 17th June already")
strtotime("it 17th June already")
strtotime("17th June already")
strtotime("June already")
strtotime("already")
//2nd iteration
strtotime("Gadzooks, is it 17th June")
strtotime("is it 17th June")
strtotime("it 17th June")
strtotime("17th June") //date!
strtotime("June") //date!
//3rd iteration
strtotime("Gadzooks, is it 17th")
strtotime("is it 17th")
strtotime("it 17th")
strtotime("17th") //date!
//4th iteration
strtotime("Gadzooks, is it")
//etc
And we can assume that strtotime("17th June") is more accurate than strtotime("17th") simply because it contains more words... i.e. "next Friday" will always be more accurate than "Friday".
I would do it this way:
First check if the entire string is a valid date with strtotime(). If so, you're done.
If not, determine how many words are in your string (split on whitespace for example). Let this number be n.
Loop over every n-1 word combination and use strtotime() to see if the phrase is a valid date. If so you've found the longest valid date string within your original string.
If not, loop over every n-2 word combination and use strtotime() to see if the phrase is a valid date. If so you've found the longest valid date string within your original string.
...and so on until you've found a valid date string or searched every single/individual word. By finding the longest matches, you'll get the most informed dates (if that makes sense). Since you're dealing with tweets, your strings will never be huge.
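The steps above could be sketched roughly as follows. The function name findLongestDatePhrase is made up, and the optional $isDate callback is an addition so the date check can be swapped out or tested without relying on exactly which phrases strtotime() accepts. By default it falls back to strtotime(). The sketch assumes PHP 7.4 or later for the arrow function.

```php
<?php
// Hypothetical helper: returns the longest run of consecutive words that
// the checker accepts as a date, or null if there is none.
function findLongestDatePhrase(string $text, ?callable $isDate = null): ?string
{
    // Default check: strtotime() returns false when it cannot parse the phrase.
    $isDate = $isDate ?? static fn(string $s): bool => strtotime($s) !== false;

    $words = preg_split('/\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY);
    $n = count($words);

    // Try all n-word phrases first, then n-1, and so on down to single words.
    for ($len = $n; $len >= 1; $len--) {
        for ($start = 0; $start + $len <= $n; $start++) {
            $phrase = implode(' ', array_slice($words, $start, $len));
            if ($isDate($phrase)) {
                return $phrase;
            }
        }
    }
    return null;
}

var_dump(findLongestDatePhrase('Gadzooks, is it 17th June already'));
```

Because longer phrases are tried before shorter ones, the first hit is also the longest one, which is the "most informed" match described above.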
Inspired by Juan Cortes's broken link based off Dolph's algorithm, I went ahead and wrote it up myself. Note that I decided to just return on first successful match.
<?php
function extractDatetime($string) {
if(strtotime($string)) return $string;
$string = str_replace(array(" at ", " on ", " the "), " ", $string);
if(strtotime($string)) return $string;
$list = explode(" ", $string);
$first_length = count($list);
for($j=0; $j < $first_length; $j++) {
$original_length = count($list);
for($i=0; $i < $original_length; $i++) {
$temp_list = $list;
for($k = 0; $k < $i; $k++) unset($temp_list[$k]);
//echo "<code>".implode(" ", $temp_list)."</code><br/>"; // for visualizing the tests, if you want to see it
if(strtotime(implode(" ", $temp_list))) return implode(" ", $temp_list);
}
array_pop($list);
}
return false;
}
Inputs
$array = array(
"Gadzooks, is it 17th June already",
"I’m going to play croquet next Friday",
"Where was the Rav Four yesterday at 6 PM?",
"Where was Steve on Monday at 7am?"
);
foreach($array as $a) echo "$a => ".extractDatetime(str_replace("?", "", $a))."<hr/>";
Outputs (with the debug echo in the function uncommented)
Gadzooks, is it 17th June already
is it 17th June already
it 17th June already
17th June already
June already
already
Gadzooks, is it 17th June
is it 17th June
it 17th June
17th June
Gadzooks, is it 17th June already => 17th June
-----
I’m going to play croquet next Friday
going to play croquet next Friday
to play croquet next Friday
play croquet next Friday
croquet next Friday
next Friday
I’m going to play croquet next Friday => next Friday
-----
Where was Rav Four yesterday 6 PM
was Rav Four yesterday 6 PM
Rav Four yesterday 6 PM
Four yesterday 6 PM
yesterday 6 PM
Where was the Rav Four yesterday at 6 PM? => yesterday 6 PM
-----
Where was Steve Monday 7am
was Steve Monday 7am
Steve Monday 7am
Monday 7am
Where was Steve on Monday at 7am? => Monday 7am
-----
Something like the following might do it:
$months = array(
"01" => "January",
"02" => "February",
"03" => "March",
"04" => "April",
"05" => "May",
"06" => "June",
"07" => "July",
"08" => "August",
"09" => "September",
"10" => "October",
"11" => "November",
"12" => "December"
);
$weekDays = array(
"01" => "Monday",
"02" => "Tuesday",
"03" => "Wednesday",
"04" => "Thursday",
"05" => "Friday",
"06" => "Saturday",
"07" => "Sunday"
);
foreach($months as $value){
if(strpos(strtolower($string),strtolower($value)) !== false){ // !== false, because a match at position 0 is falsy
// extract and assign as you like...
}
}
You would probably do another loop to check for the weekdays or other formats, or just nest the loops.
Use the strtotime php function.
Of course you would need to set up some rules to parse them since you need to get rid of all the extra content on the string, but aside from that, it's a very flexible function that will more than likely help you out here.
For example, it can take strings like "next Friday" and "June 15th" and return the appropriate UNIX timestamp for the date in the string. I guess that if you consider some basic rules like looking for "next X" and week and month names you would be able to do this.
If you could locate the "next Friday" from the "I'm going to play croquet next Friday" you could extract the date. Looks like a fun project to do! But keep in mind that strtotime only takes English phrases and will not work with any other language.
For example, a rule that will locate all the "Next weekday" cases would be as simple as:
$datestring = "I'm going to play croquet next Friday";
$weekdays = array('monday','tuesday','wednesday',
'thursday','friday','saturday','sunday');
foreach($weekdays as $weekday){
if(strpos(strtolower($datestring),"next ".$weekday) !== false){
echo date("F j, Y, g:i a",strtotime("next ".$weekday));
}
}
This will return the date of the next weekday mentioned on the string as long as it follows the rule! In this particular case, the output was June 18, 2010, 12:00 am.
With a few (maybe more than a few!) of those rules you will more than likely extract the correct date in a high percentage of cases, provided that users spell things correctly.
Like it's been pointed out, with regular expressions and a little patience you can do this. The hardest part of coding is deciding what way you are going to approach your problem, not coding it once you know what!
Following Dolph Mathews's idea and basically ignoring my previous answer, I built a pretty nice function that does exactly that. It returns the string it thinks is the one that matches a date, the Unix timestamp of it, and the date itself, either in a user-specified format or in the predefined one (F j, Y). I wrote a small post about it on Extracting a date from a string with PHP. As a teaser, here's the output of the two example strings:
Input: “I’m going to play croquet next Friday”
Output: Array (
[string] => "next friday",
[unix] => 1276844400,
[date] => "June 18, 2010"
)
Input: “Gadzooks, is it 17th June already?”
Output: Array (
[string] => "17th june",
[unix] => 1276758000,
[date] => "June 17, 2010"
)
I hope it helps someone.
Based on Dolph's suggestion, I wrote out a function that I think serves the purpose.
public function parse_date($text, $offset, $length){
$parseArray = preg_split( "/[\s,.]/", $text);
$dateTest = implode(" ", array_slice($parseArray, $offset, $length == 0 ? null : $length));
$date = strtotime($dateTest);
if ($date){
return $date;
}
//make the string one word shorter in the front
$offset++;
//have we reached the end of the array?
if($offset > count($parseArray)){
//reset the start of the string
$offset = 0;
//trim the end by one
$length--;
//reached the very bottom with no date found
if(abs($length) >= count($parseArray)){
return false;
}
}
//try to find the date with the new substring
return $this->parse_date($text, $offset, $length);
}
You would call it like this, from inside the same class (it is declared as a method):
$this->parse_date('Setting the due date january 5th 2017 now', 0, 0);
What you're looking for is a temporal expression parser. You might look at the Wikipedia article to get started. Keep in mind that these parsers can get pretty complicated, because this is really a language recognition problem. That is commonly a problem tackled by the artificial intelligence/computational linguistics field.
The majority of the suggested algorithms are in fact pretty lame. I suggest using some nice regex for dates and testing the sentence with it. Use this as an example:
(\d{1,2})?
((mon|tue|wed|thu|fri|sat|sun)|(monday|tuesday|wednesday|thursday|friday|saturday|sunday))?
(\d{1,2})? (\d{2,4})?
I skipped months, since I'm not sure I remember them in the right order.
This is the easiest solution, yet it will do the job better than the other compute-power-based solutions. (And yeah, it's hardly a fail-proof regex, but you get the point.) Then apply the strtotime function to the matched string. This is the simplest and the fastest solution.