Time Zones and Time Zone Offsets
Peter Perkins, in development, wrote this as an explanation for what the development team and documentation needed to understand to make sure time zones were implemented and explained correctly. I stole it (with his permission) to share with you.
A very common mistake when talking about time zones is to confuse a time zone with a specific offset from UTC. Those are two different things. To learn more about this in MATLAB, check out the documentation. A time zone consists of
- a name, like America/New_York
- a standard offset from UTC, like -5hrs
- a daylight saving time offset from that, like +1hr
- a pair of rules that determine when daylight saving time shifts occur, like "spring ahead the second Sunday in March, fall back the first Sunday in November, both at 2:00am"
- a history of those four things over time, like "from 1987-2006, it was the first Sunday in April and the last Sunday in October".
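To make those pieces concrete, here is a minimal sketch of how one named time zone carries both a standard offset and a DST shift. The two dates (mid-January and mid-July 2018) are just illustrative picks, not anything from Peter's notes.

```matlab
% Two instants in the same time zone: noon on 15-Jan-2018 and noon on 15-Jul-2018.
t = datetime(2018,[1 7],15,12,0,0,'TimeZone','America/New_York');

[offset,dstShift] = tzoffset(t);   % total offset from UTC, and the DST part of it
inDST = isdst(t);                  % true where daylight saving time is in effect

disp(t.TimeZone)        % one name for the whole array: America/New_York
disp(hours(offset))     % -5  -4
disp(hours(dstShift))   %  0   1
disp(inDST)             %  0   1
```

One time zone name, two different offsets, depending on where each element falls relative to the DST rules.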
Then there are time zone offsets, like Eastern Standard Time. This is the offset from UTC that you use to anchor a specific clockface time. Imagine someone gives you the timestamp
timestamp = '5-Nov-2018 13:58:55'
What does that mean in the real world? Maybe you and the other guy have some agreed-upon convention, but in general, you don't know if that means "in New York", or "in London", or whatever. You'd want to append an offset to that to remove the ambiguity, maybe like this:
timestamp = '5-Nov-2018 13:58:55 -05:00'
But people like words better than numbers, so usually it's written like this:
timestamp = '5-Nov-2018 13:58:55 EST'
As long as you can agree that EST means US Eastern Standard Time, and not Australian or Brazilian Eastern Standard Time (NEITHER of which is UTC-5, but both of which the locals call EST), you're good. It's 13:58:55 offset from UTC by -5hrs, or 18:58:55 UTC. But why not avoid language issues (French Canadians call it HNE) and say "-05:00"? You are better off.
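Here is a small sketch of that anchoring step: read the clockface time together with its numeric offset and ask for the result in UTC. The 'InputFormat' and 'Locale' values are my assumptions about how the text is laid out (English month abbreviations, an offset written like -05:00).

```matlab
% Parse the clockface time together with its numeric offset; xxx matches -05:00.
t = datetime('5-Nov-2018 13:58:55 -05:00', ...
    'InputFormat','d-MMM-yyyy HH:mm:ss xxx', ...
    'Locale','en_US', ...                 % assumption: month names are in English
    'TimeZone','UTC', ...                 % convert the parsed instant to UTC
    'Format','dd-MMM-yyyy HH:mm:ss Z');
disp(t)           % the same instant shown on a UTC clock
disp(hour(t))     % 18
```

The offset pins down the instant, so there is only one possible UTC value for it.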
Notice that all of the z/Z/x/X things you can put in a datetime array's display format spit out time zone offsets. The time zone is a property of a datetime array. Each element may display itself with a different offset, though (standard vs. daylight saving time).
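A quick sketch of that distinction, using two illustrative dates of my own choosing: the array has a single TimeZone property, but each element prints its own offset. The abbreviations produced by z depend on your locale, so treat the EST/EDT comment as what an English (US) setup would typically show.

```matlab
t = datetime(2018,[1 7],15,12,0,0,'TimeZone','America/New_York');

t.Format = 'dd-MMM-yyyy HH:mm:ss z';     % abbreviations, e.g. EST and EDT in an English (US) locale
disp(t)
t.Format = 'dd-MMM-yyyy HH:mm:ss Z';     % basic numeric offsets: -0500 and -0400
disp(t)
t.Format = 'dd-MMM-yyyy HH:mm:ss xxx';   % extended numeric offsets: -05:00 and -04:00
disp(t)

disp(t.TimeZone)                         % still the single name America/New_York
```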
Providing an offset for a timestamp (or agreeing on one by convention) helps with that one timestamp, but it doesn't provide context for arithmetic and computing other timestamps. You might be working with data collected in New York, or with data collected in Tulum (which does not observe DST at all). So does two days before the above timestamp work out to
fmt = 'dd-MMM-yyyy HH:mm:ss z';
datetime('5-Nov-2018 13:58:55 EST','TimeZone','America/New_York','Format',fmt) - hours(48)
ans = datetime 03-Nov-2018 14:58:55 EDT
or to
datetime('26-Oct-2018 13:58:55 EST','TimeZone','America/Cancun','Format',fmt) - hours(48)
ans = datetime 24-Oct-2018 13:58:55 EST
You might say, "Ha, Tulum, I wish, but get real." But even in the US, this comes up.
datetime('26-Oct-1998 13:58:55 EST','TimeZone','America/New_York','Format',fmt) - hours(48)
ans = datetime 24-Oct-1998 14:58:55 EDT
datetime('26-Oct-1998 13:58:55 EST','TimeZone','America/Indianapolis','Format',fmt) - hours(48)
ans = datetime 24-Oct-1998 13:58:55 EST
You might say, "Sure, everyone drags out that old chestnut", but these things matter, and have to be correct. And guess what? Much or all of New England is likely to change its time zone rules in the next few years, so you'd better get used to saying something like America/Boston instead of EST.
I said above that -05:00 is unambiguous and better than EST, and from a language standpoint, that's perfectly true. But from the context standpoint, it's actually more ambiguous than EST. There are a BUNCH of places that at one time of year or another specify their times with a UTC-5 offset. Chicago, for example, which observes what in the US is called Central Standard/Daylight Time. And lots of data acquisition hardware knows nothing about DST, so it spits out timestamps like 26-Oct-1998 13:58:55 -05:00 all year round. Should your calculations respect that, or respect New_York's DST rules, or what? (MATLAB supports a fixed UTC-5 time zone, specified as '-05:00', for that use case -- it observes no DST shift, never changes its behavior, and therefore its name is just its offset.)
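A minimal sketch of why that choice matters for arithmetic: the same instant, once in a fixed '-05:00' zone and once in America/New_York, gives different clockface answers 48 hours earlier. The variable names are mine, purely for illustration.

```matlab
fmt = 'dd-MMM-yyyy HH:mm:ss xxx';

% The same instant, expressed in a fixed-offset zone and in a zone with DST rules.
tFixed = datetime(2018,11,5,13,58,55,'TimeZone','-05:00','Format',fmt);
tNY    = datetime(2018,11,5,13,58,55,'TimeZone','America/New_York','Format',fmt);
disp(tFixed == tNY)          % 1: both are 18:58:55 UTC

% Step back 48 elapsed hours, across the 4-Nov-2018 fall-back in New York.
disp(tFixed - hours(48))     % 03-Nov-2018 13:58:55 -05:00
disp(tNY - hours(48))        % 03-Nov-2018 14:58:55 -04:00
```

Both results are still the same instant; only the clockface reading and the offset differ, which is exactly the context a bare offset cannot give you.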
So you need a time zone. But if you only specify the time zone (assuming you're clear about what that means, and you're beyond mistakenly specifying an offset), there is still an issue if you don't specify an offset. Consider this timestamp:
datetime('04-Nov-2018 01:35:23','TimeZone','America/New_York','Format',fmt)
ans = datetime 04-Nov-2018 01:35:23 EST
But what about the other 01:35:23 on that day, the first one, an hour earlier during daylight saving time?
The right way to get these right when you are reading timestamps in data is to demand an offset.
datetime('04-Nov-2018 01:35:23 EST','TimeZone','America/New_York','Format',fmt)
ans = datetime 04-Nov-2018 01:35:23 EST
datetime('04-Nov-2018 01:35:23 EDT','TimeZone','America/New_York','Format',fmt)
ans = datetime 04-Nov-2018 01:35:23 EDT
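To see that these really are two different instants, here is a sketch that builds them from UTC instead of parsing text. The UTC values are my own arithmetic (01:35:23 EDT is 05:35:23 UTC), not something from the original notes.

```matlab
fmt = 'dd-MMM-yyyy HH:mm:ss xxx';

% 05:35:23 UTC on 4-Nov-2018 is 01:35:23 in New York, still on daylight saving time.
first = datetime(2018,11,4,5,35,23,'TimeZone','UTC','Format',fmt);
first.TimeZone = 'America/New_York';    % same instant, New York clock

% One elapsed hour later the clocks have fallen back, so the clockface repeats.
second = first + hours(1);

disp(first)             % 04-Nov-2018 01:35:23 -04:00
disp(second)            % 04-Nov-2018 01:35:23 -05:00
disp(second - first)    % 01:00:00
```

Same clockface, same time zone, one hour apart. Only the offset tells them apart.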
OK, once a year, in the middle of the night, big deal. But these things matter, and have to be correct. And guess what? Without an offset, if you're working with data from data acquisition hardware that doesn't know about DST, and also doesn't add an offset to its timestamps, you are off by one hour in roughly half your data.
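Here is a rough sketch of that failure mode, assuming a hypothetical logger that always records UTC-5 clock time and writes no offset. The two sample readings are made up for illustration.

```matlab
% Unzoned readings from the hypothetical logger: noon in mid-January and mid-July.
raw = datetime(2018,[1 7],15,12,0,0);

% Interpretation A: what the logger actually meant, a fixed UTC-5 clock.
asFixed = raw;
asFixed.TimeZone = '-05:00';            % assigning a zone to unzoned data keeps the clockface

% Interpretation B: assuming the readings follow New York's DST rules.
asNY = raw;
asNY.TimeZone = 'America/New_York';

disp(hours(asNY - asFixed))             % 0  -1
```

The winter reading agrees either way; the summer reading lands an hour off if you pick the wrong interpretation.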
Conclusion: Time zones and time zone offsets are two different things. Specifying an offset gives you the precise meaning for one timestamp, but provides no context for calculations. Unless you are working with "unzoned" timestamps with only one possible meaning, and from only one data source, you need to specify a time zone too. And EST is not the way to do that.
Have you needed to master time zones? And have you had troubles? Let us know what has and has not worked for you here.