Adding and removing from a string containing dates using PHP - php

I have to edit a whole bunch of date intervals, but they are all mixed up. Most are in the form Month YearMonth Year,
e.g. January 2014March 2015
How would I insert a hyphen in between so that I end up with
January 2014 - March 2015
I also have cases where both dates fall in the same year,
e.g. April 2012September2012
In such a case I would need to insert the hyphen and remove the year so that I'm left with
April - September
There must be some PHP string functions for this kind of thing. Well, that's what I'm hoping.
I would appreciate some guidance. Thanks in advance.

Thanks, sorry for my delayed reply
$string = "January 2014March 2015";
preg_match('/([a-z]+) *(\d+) *([a-z]+) *(\d+)/i', $string, $match);
print "$match[1] $match[2] - $match[3] $match[4]";
outputs,
January 2014 - March 2015

You could do it using lookaround:
$string = "January 2014March 2015";
$res = preg_replace('/(?<=\d)(?=[A-Z])/', ' - ', $string);
echo $res,"\n";
Output:
January 2014 - March 2015
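Neither snippet above covers the same-year case from the question. Here is a minimal sketch that handles both, assuming every input consists of exactly two month names, each followed by a four-digit year. The helper name format_interval is made up for illustration:

```php
<?php
// Hypothetical helper (not from the answers above): formats "MonthYearMonthYear" strings.
function format_interval(string $s): string
{
    // Capture month/year twice; whitespace between the parts is optional.
    if (!preg_match('/^\s*([a-z]+)\s*(\d{4})\s*([a-z]+)\s*(\d{4})\s*$/i', $s, $m)) {
        return $s; // leave unrecognised input untouched
    }
    [, $month1, $year1, $month2, $year2] = $m;

    // Same year: drop the years entirely, as the question asks.
    return $year1 === $year2
        ? "$month1 - $month2"
        : "$month1 $year1 - $month2 $year2";
}

echo format_interval('January 2014March 2015'), "\n";  // January 2014 - March 2015
echo format_interval('April 2012September2012'), "\n"; // April - September
```

Anything that does not match the pattern is returned unchanged, so odd entries can be found and fixed by hand afterwards.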

Related

Extracting data from a string

I have a string and want to extract data from it.
$str = "Online (UVD) - 154,842 - Last Updated: Nov 23 2015 02:24 PM";
I want to extract this 154,842 and this 2015. I've successfully extracted the first part with this method:
trim(str_replace("Online (UVD) - ", "", str_replace(",", "", substr_replace($str, "", strpos($str, " - Last Updated"))), $str))
Now, I'm unsure how to extract the other one. Data can vary for instance,
$str = "Online (UVD) - 1123123 - Last Updated: Nov 23 2015 02:24 PM";
$str = "Online (UVD) - 12 - Last Updated: Nov 23 2015 02:24 PM";
$str = "Online (UVD) - 1546546 - Last Updated: Nov 23 2015 02:24 PM";
$str = "Online (UVD) - 3525252525 - Last Updated: Nov 23 2015 02:24 PM";
Is there a better method to extract them?
If the strings always have the same number of space-separated values, explode() with specific array positions may work for you.
$str = "Online (UVD) - 154,842 - Last Updated: Nov 23 2015 02:24 PM";
$pieces = explode(' ',$str);
echo 'Value is ' . $pieces[3] . ' and the year is ' . $pieces[9];
You can do it without regex if all the words in the string are in the same order as in your examples. Let's try with explode():
<?php
$str = "Online (UVD) - 1123123 - Last Updated: Nov 23 2015 02:24 PM";
$str = "Online (UVD) - 12 - Last Updated: Nov 23 2015 02:24 PM";
$str = "Online (UVD) - 1546546 - Last Updated: Nov 23 2015 02:24 PM";
$str = "Online (UVD) - 3525252525 - Last Updated: Nov 23 2015 02:24 PM";
$digit = explode(' ',$str);
echo trim($digit[3]); // prints the number
echo trim($digit[9]); // prints the year
?>
DEMO: https://3v4l.org/ttBDG
I know this is already answered, but I'd also like to provide a regex solution:
To extract your first group, you can use the regex below:
preg_match('/.-.(\d+).-/', $str, $numExtracted);
if (!empty($numExtracted)) {
echo $numExtracted[1].PHP_EOL;
}
To extract your Year:
preg_match('/(\w\w\w).(\d\d).(\d\d\d\d)/', $str, $year, PREG_OFFSET_CAPTURE);
$year = $year[3][0];
echo $year.PHP_EOL;
This worked on all of the below trials:
Online (UVD) - 1123123 - Last Updated: Nov 23 2015 02:24 PM
Online (UVD) - 12 - Last Updated: Nov 23 2015 02:24 PM
Online (UVD) oi oi - 1546546 - Last Updated: Nov 23 2015 02:24 PM
Online -sdtgstg346fg - (UVD) - 3525252525 - Last Updated: Nov 23 2015 02:24 PM
You can check the working code here
As per your question in the comments, you can enhance the regex to consider such cases:
.-.(\d+)?[\,\#\!\?\$\£\;\:]*(\d+)?.-
It will match all of the above plus these cases:
Online (UVD) - 1123,123 - Last Updated: Nov 23 2015 02:24 PM
Online (UVD) - 1123#!,123 - Last Updated: Nov 23 2015 02:24 PM
But I think at some point you need to decide whether you still want to hold on to the information you received or simply treat it as corrupt.
You could even add loops to parse every possible case, but if I am expecting a number and the regex suddenly matches something like 1A2B3C4G5D8D2F, I will discard it because it is too far from what I initially expected. It all depends on where your information comes from, how likely it is to change, etc. :)
Still, I think regex will make you happier and will cover far more possibilities.
PS: For the special cases introduced, because the number is interrupted by special characters (or even words, if you consider them), it is now interpreted as two numbers.
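For comparison, a single pattern can pull out both values at once. This is a minimal sketch, assuming the string always follows the "- number - Last Updated: Mon DD YYYY" layout from the question. The helper name extract_count_and_year is hypothetical:

```php
<?php
// Hypothetical helper: returns the count and the year, or null if the layout differs.
function extract_count_and_year(string $str): ?array
{
    // [\d,]+ accepts digits with optional thousands separators, e.g. "154,842".
    $pattern = '/-\s*([\d,]+)\s*-\s*Last Updated:\s*\w{3}\s+\d{1,2}\s+(\d{4})/';
    if (!preg_match($pattern, $str, $m)) {
        return null;
    }
    return [
        'count' => (int) str_replace(',', '', $m[1]), // strip commas before casting
        'year'  => (int) $m[2],
    ];
}

print_r(extract_count_and_year('Online (UVD) - 154,842 - Last Updated: Nov 23 2015 02:24 PM'));
```

Returning null for non-matching input keeps the "treat it as corrupt" decision in the caller's hands.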

distinct date mysql php year

I have a problem with my PHP/MySQL code.
I want to get the distinct years from a date field (e.g. 2013-06-20).
Now I get something like this:
2014
2013
2014
2013
2013
2014
2014
2014
2014
2014
2014
2014
PHP CODE :
<?$chuj=mysql_query("SELECT DISTINCT date_issue FROM invoices_sales ");
while($chuje=mysql_fetch_object($chuj)){
$test=explode("-",$chuje->date_issue);
print_r($test['0']."<br>");
}
?>
How can I get each year (2013, 2014) only once?
You need to extract the year from each date.
SELECT DISTINCT YEAR(date_issue) FROM invoices_sales
This ensures that only the year is compared. As it stands, each of those rows likely has a different date, so DISTINCT evaluates the entire date rather than only the year.
<?php
$chuj=mysql_query("SELECT DISTINCT YEAR(`date_issue`) as `yr` FROM `invoices_sales`");
while($chuje=mysql_fetch_object($chuj)) {
    // yr already holds the whole year; indexing it with ['0'] would print only its first character
    $test = $chuje->yr;
    print_r($test . "<br>");
}
?>
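If the query itself cannot be changed, the same de-duplication can be done on the PHP side. This is a rough sketch with made-up sample values standing in for the date_issue column. It is an alternative to the SQL approach above, not a replacement for it:

```php
<?php
// Sample values standing in for the date_issue column (invented for illustration).
$dates = ['2014-01-15', '2013-06-20', '2014-03-02', '2013-11-30'];

// Take the first four characters (the year) of each date, then drop duplicates.
$years = array_values(array_unique(array_map(
    static fn(string $d): string => substr($d, 0, 4),
    $dates
)));

print_r($years); // 2014 and 2013, each once
```

Letting MySQL do it with DISTINCT YEAR(...) is usually preferable, since fewer rows are transferred.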

Extract dates, times and date ranges from text in PHP

I'm building a local events calendar which takes RSS feeds and website scrapes and extracts event dates from them.
I've previously asked how to extract dates from text in PHP here, and received a good answer at the time from MarcDefiant:
function parse_date_tokens($tokens) {
    # only try to extract a date if we have 2 or more tokens
    if(!is_array($tokens) || count($tokens) < 2) return false;
    return strtotime(implode(" ", $tokens));
}
function extract_dates($text) {
    static $patterns = Array(
        '/^[0-9]+(st|nd|rd|th|)?$/i', # day
        '/^(Jan(uary)?|Feb(ruary)?|Mar(ch)?|etc)$/i', # month
        '/^20[0-9]{2}$/', # year
        '/^of$/' #words
    );
    # defines which of the above patterns aren't actually part of a date
    static $drop_patterns = Array(
        false,
        false,
        false,
        true
    );
    $tokens = Array();
    $result = Array();
    $text = str_word_count($text, 1, '0123456789'); # get all words in text
    # iterate words and search for matching patterns
    foreach($text as $word) {
        $found = false;
        foreach($patterns as $key => $pattern) {
            if(preg_match($pattern, $word)) {
                if(!$drop_patterns[$key]) {
                    $tokens[] = $word;
                }
                $found = true;
                break;
            }
        }
        if(!$found) {
            $result[] = parse_date_tokens($tokens);
            $tokens = Array();
        }
    }
    $result[] = parse_date_tokens($tokens);
    return array_filter($result);
}
# test
$texts = Array(
    "The focus of the seminar, on Saturday 2nd February 2013 will be [...]",
    "Valentines Special # The Radisson, Feb 14th",
    "On Friday the 15th of February, a special Hollywood themed [...]",
    "Symposium on Childhood Play on Friday, February 8th",
    "Hosting a craft workshop March 9th - 11th in the old [...]"
);
$dates = extract_dates(implode(" ", $texts));
echo "Dates: \n";
foreach($dates as $date) {
    echo " " . date('d.m.Y H:i:s', $date) . "\n";
}
However, the solution has some downsides - for one thing, it can't match date ranges.
I'm now looking for a more complex solution that can extract dates, times and date ranges from sample text.
What's the best approach for this? I'm leaning back toward a series of regex statements run one after the other to catch these cases. I can't see a better way of catching date ranges in particular, but I suspect there must be one. Are there any libraries out there just for date parsing in PHP?
Date / Date Range samples, as requested
$dates = [
" Saturday 28th December",
"2013/2014",
"Friday 10th of January",
"Thursday 19th December",
" on Sunday the 15th December at 1 p.m",
"On Saturday December 14th ",
"On Saturday December 21st at 7.30pm",
"Saturday, March 21st, 9.30 a.m.",
"Jan-April 2014",
"January 21st - Jan 24th 2014",
"Dec 30th - Jan 3rd, 2014",
"February 14th-16th, 2014",
"Mon 14 - Wed 16 April, 12 - 2pm",
"Sun 13 April, 8pm",
"Mon 21 - Wed 23 April",
"Friday 25 April, 10 – 3pm",
"The focus of the seminar, on Saturday 2nd February 2013 will be [...]",
"Valentines Special # The Radisson, Feb 14th",
"On Friday the 15th of February, a special Hollywood themed [...]",
"Symposium on Childhood Play on Friday, February 8th",
"Hosting a craft workshop March 9th - 11th in the old [...]"
];
The function I'm currently using (not the above) is about 90% accurate. It can catch date ranges, but has difficulty if a time is also specified. It uses a list of regular expressions and is very convoluted.
UPDATE: Jan 6th, 2014
I'm working on code that does this, building on my original method of a series of regex statements run one after the other. I think I'm close to a working solution that can extract almost any date/time range or format from a piece of text. When I'm done I'll post it here as an answer.
I think you can sum up the regex in your question like the one below.
(?<date_format_1>(?<day>(?i)\b\s*[0-9]+(?:st|nd|rd|th|)?)(?<month>(?i)\b\s*(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|etc))(?<year>\b\s*20[0-9]{2}) ) |
(?<date_format_2>(?&month)(?&day)(?!\s+-)) |
(?<date_format_3>(?&day)\s+of\s+(?&month)) |
(?<range_type_1>(?&month)(?&day)\s+-\s+(?&day))
Flags: x
Demo
http://regex101.com/r/wP5fR4
Discussion
By reusing named subpatterns as subroutines (for example (?&month) and (?&day)), you reduce the complexity of the final regex.
I have used a negative lookahead in date_format_2 because it would otherwise partially match range_type_1. You may need to add more range types depending on your data. Don't forget to check the other patterns in case of a partial match.
Another solution would be to build small regexes in separate string variables and then concatenate them in PHP to build a bigger regex.
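The concatenation idea could look roughly like this. It is a sketch only: the variable names are mine, and the full month list is my own fill-in rather than something taken from the question:

```php
<?php
// Small building blocks (single quotes keep the backslashes literal).
$day   = '\d{1,2}(?:st|nd|rd|th)?';
$month = '(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|June?|July?'
       . '|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|Nov(?:ember)?|Dec(?:ember)?)';
$year  = '20\d{2}';

// "2nd February 2013": day, month, year.
$fullDate = "/\\b($day)\\s+($month)\\s+($year)\\b/i";
// "March 9th - 11th": month, day, hyphen, day.
$dayRange = "/\\b($month)\\s+($day)\\s*-\\s*($day)\\b/i";

preg_match($fullDate, 'on Saturday 2nd February 2013 will be', $full);
preg_match($dayRange, 'craft workshop March 9th - 11th in the old', $range);

echo "$full[1] | $full[2] | $full[3]\n";    // 2nd | February | 2013
echo "$range[1] | $range[2] | $range[3]\n"; // March | 9th | 11th
```

Each new format (times, month ranges such as "Jan-April 2014") then becomes one more short pattern built from the same pieces, instead of another branch in one giant expression.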

Zend_Date, ISO_8601, date parsing and local system clock

I have a strange problem with the Zend_Date object.
It seems that the setters behave differently depending on the system clock date.
Assume the system date is 28 January 2013. The following code:
$now=new Zend_Date(Zend_Date::ISO_8601);
$now->now();
echo '<br/>now: ' . $now->toString();
echo '<br/>now->day: ' . $now->get(Zend_Date::DAY);
echo '<br/>now->month: ' . $now->get(Zend_Date::MONTH);
echo '<br/>now->year: ' . $now->get(Zend_Date::YEAR);
$end=new Zend_Date('2013-02-25 14:23:34', Zend_Date::ISO_8601);
echo '<br/>end: ' . $end->toString();
$end->setHour('23')->setMinute('59')->setSecond('59')->setDay($now->get(Zend_Date::DAY))->setMonth($now->get(Zend_Date::MONTH))->setYear($now->get(Zend_Date::YEAR));
echo '<br/>endAfterSetters: ' . $end->toString();
produces the following output:
now: 28-01-2013 14:04:28
now->day: 28
now->month: 01
now->year: 2013
end: 25-02-2013 14:23:34
endAfterSetters: 28-01-2013 23:59:59
But if you change the system clock to 29 January 2013, the output differs from what is expected:
now: 29-01-2013 14:07:22
now->day: 29
now->month: 01
now->year: 2013
end: 25-02-2013 14:23:34
endAfterSetters: 01-01-2013 23:59:59
The last output is 01-01-2013 23:59:59, but it should be 29-01-2013 23:59:59!
It happens on PHP 5.3.2 and 5.3.16, Zend_Framework 10.7, latest Zend_Date 24880 version.
Everything worked fine in the past.
Any ideas why this happens?
P.S.: I have also found a jQuery datetime plugin malfunction when using it on 29, 30 and 31 January... But I will describe it in another question.
Remember that your setters are called in sequence. So when you call setDay(29) you're telling it to change the date to 29th February 2013, which isn't a valid date, so it's rolling that over to make it 1st March 2013. Then you call setMonth(1), which changes the month to January, giving you 1st January 2013.
You can control this behaviour by passing the extend_month option to the Zend_Date constructor, see: http://framework.zend.com/manual/1.12/en/zend.date.overview.html#zend.date.options.extendmonth
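The rollover can be reproduced without Zend at all. The sketch below uses PHP's native DateTime as a stand-in (an analogy, not Zend_Date's API) to show why setting the day before the month goes wrong, and how setting all parts together avoids it:

```php
<?php
// Native DateTime stands in for Zend_Date here; same dates as in the question.
$end = new DateTime('2013-02-25 14:23:34');

// Step 1: set the day to 29 while the month is still February.
// 29 February 2013 does not exist, so the date overflows to 1 March 2013.
$sequential = clone $end;
$sequential->setDate(2013, 2, 29);
// Step 2: set the month to January, keeping the (already changed) day.
$sequential->setDate(2013, 1, (int) $sequential->format('j'));
echo $sequential->format('d-m-Y'), "\n"; // 01-01-2013

// Setting year, month and day in one call avoids the invalid intermediate date.
$atOnce = clone $end;
$atOnce->setDate(2013, 1, 29)->setTime(23, 59, 59);
echo $atOnce->format('d-m-Y H:i:s'), "\n"; // 29-01-2013 23:59:59
```

By the same reasoning, calling setMonth() before setDay() in the question's chain should avoid this particular case, since 29 January is a valid date.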

How can i use same regex for two different strings?

Hi, I have these strings:
$rt="Ability: B,Session: Session #2: Tues June 14th - Fri June 24th (9-2:00PM),Time: 9:30am,karthi";
$rt="Ability: B,Session: Session #2: Tues June 14th - Fri June 24th (9-2:00PM),Time: 9:30pm,karthi";
I used the regex below to remove the text after the last comma (,):
$it_nme = preg_replace('/(?<=pm,)\S*/is', '', $rt);
It works for the second string (because 'pm' comes before the comma). In the first string, 'am' comes before the comma.
How can I write a single regex for both?
preg_replace('/(?<=[ap]m,)\S*/is', '', $rt)
You can use a regex OR like so:
$it_nme = preg_replace('/(?<=(pm|am),)\S*/is', '', $rt);
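A quick check of the character-class version against both strings from the question. The s flag is dropped here because the pattern contains no dot; otherwise it is the same regex:

```php
<?php
$inputs = [
    'Ability: B,Session: Session #2: Tues June 14th - Fri June 24th (9-2:00PM),Time: 9:30am,karthi',
    'Ability: B,Session: Session #2: Tues June 14th - Fri June 24th (9-2:00PM),Time: 9:30pm,karthi',
];

foreach ($inputs as $rt) {
    // [ap]m matches either "am" or "pm" directly before the comma.
    echo preg_replace('/(?<=[ap]m,)\S*/i', '', $rt), "\n";
}
```

Both lines end in "am," or "pm," afterwards. The trailing comma stays because the lookbehind only asserts it and does not consume it.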
