0:00

I want to talk briefly about dates and times in R, which is a very special topic and could require a lot more time. I just want to talk a little bit about how R represents dates and times, and how you can use them in arithmetic and data analysis types of computations.

0:16

So R has a special way to represent dates and times: they're represented using special data classes. In the past, we talked about different data types like lists, character vectors, numeric vectors, and so on. This is just another type of data on top of those kinds of classes. Dates are represented by the Date class, and times are represented by two separate classes: the POSIXct and the POSIXlt class.

0:45

Dates don't have times attached to them. They represent a day in a month in a year, and you can write them in a year-month-day format. So, for example, this date is 1970, January 1st. Internally, dates are stored as the number of days since 1970, January 1st. That particular detail is not very important, but it's there in case you're wondering how R actually does calculations based on dates.

Times are stored internally as the number of seconds since 1970, January 1st. That's, again, another underlying detail that's not very important, but it may be useful to know sometimes.

So, the way dates work in R is you can take a character string, like the following, 1970-01-01, and convert it into a date using the as.Date function. That's probably the most common way that you'll begin working with dates. You'll notice that if you print out this object, you'll get something that looks like a character string. Now, it's not actually a character string, but it prints out that way because there's a special print method.

Now, if I unclass the object here, you'll see I get the number 0. Remember, dates are stored internally as the number of days since 1970, January 1st, and since 1970, January 1st is 0 days from that point, you get 0. If I input January 2, 1970, then you'll see that underneath it is represented as the number 1, because that's 1 day after 1970, January 1st. If you had a date that was before 1970, it would be represented as a negative number.
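
The conversion and the underlying day count described here can be sketched in a few lines. The variable names (epoch, next_day, before) are only illustrative and are not from the lecture.

```r
# Convert character strings in year-month-day form to Date objects.
epoch    <- as.Date("1970-01-01")
next_day <- as.Date("1970-01-02")
before   <- as.Date("1969-12-31")

print(epoch)        # prints like a character string, via the Date print method
class(epoch)        # "Date"

# unclass() strips the class and exposes the stored number of days.
unclass(epoch)      # 0
unclass(next_day)   # 1
unclass(before)     # -1, dates before 1970 are negative
```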

2:24

But that's just a little background. You don't have to worry about the underlying representation. Ultimately, what you need to know is that dates are stored as objects of the Date class.

2:36

Times, on the other hand, are represented as two possible types. One is called POSIXct and the other POSIXlt. POSIX is a family of computing standards for how things should be done on certain types of computers or how data should be represented, and the way to represent dates and times is part of the POSIX standard. In the POSIXct class, times are represented as just very large integers.

3:05

So it's a useful class if you want to, say, store times in a data frame or something like that, because it's basically like a big integer vector. POSIXlt stores a time as a list underneath.

3:17

It also stores a bunch of other useful information about a given time: for example, what's the day of the week of that time, the day of the year, the day of the month, or the month itself.

There are a number of generic functions that operate on both dates and times. The weekdays function tells you what day of the week a given time or a given day is. The months function tells you what month that date or time is in. And the quarters function gives you the quarter number, so Q1 would be January through March, Q2 would be April through June, and so on. These generic functions operate on objects of class POSIXct, POSIXlt, or Date.
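
As a small sketch of these generic functions, the example below applies them to a Date and to a POSIXct object. The particular dates are made up for illustration, and the day and month names returned depend on the locale of the R session.

```r
d  <- as.Date("1970-01-01")
tm <- as.POSIXct("2012-10-25 01:00:00", tz = "UTC")

# The same generics work on Date, POSIXct and POSIXlt objects.
weekdays(d)               # day-of-week name (locale dependent)
months(tm)                # month name (locale dependent)
quarters(d)               # "Q1": January through March
quarters(tm)              # "Q4": October through December
quarters(as.POSIXlt(tm))  # same answer for the POSIXlt version
```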

4:02

You can coerce things back and forth between POSIXlt and POSIXct if you want, using the as.POSIXlt or the as.POSIXct function. For example, the Sys.time function here just gives you the current time as it's known by the system. You can see that when it prints out, it prints in a year-month-day format and then in an hour-minute-second format.

4:27

And then there's the time zone, which is Eastern Standard Time right now. You can convert this to a POSIXlt using as.POSIXlt. POSIXlt, remember, is a list underneath, so you can look at the names of the elements in this list if you unclass it. You can see that there's an element called sec, that's the seconds, then min for the minutes, hour for the hours, and mday, the day of the month, which in this case would be 23. mon is just the month you're in, so that's January, and then there's the year. wday is the day of the week, yday is the day of the year, and isdst says whether we're on daylight saving time or not.

If I extract the sec element of this list, you'll see it's 11.86. This actually gives you fractional seconds, so it's 11 seconds and then .86 seconds. That's the number of seconds in the time.
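
A minimal sketch of this step follows. The values you see will differ from the ones in the lecture, since Sys.time returns whatever the current time is when you run it.

```r
x <- Sys.time()        # current time, as known by the system
p <- as.POSIXlt(x)     # same instant, stored as a list of components

# Looking inside the list: the names include "sec", "min", "hour",
# "mday", "mon", "year", "wday", "yday" and "isdst".
names(unclass(p))

p$sec    # seconds, including the fractional part
p$mday   # day of the month
p$mon    # month, counted from 0 (so January is 0)
```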

5:25

You can also get the POSIXct format from the Sys.time function, and you can see that if I unclass the POSIXct object, I get this very large integer, because that's just the number of seconds since January 1, 1970. Now if I try to apply the list operator, the dollar sign, to this object, you see I get an error, because objects of class POSIXct don't have these list elements in them. If you want to get those list elements out, you have to convert it to POSIXlt using the as.POSIXlt function. Then I can get the seconds out. In this case it's 11.88 seconds.
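
The contrast between the two classes can be sketched as below. The tryCatch wrapper is my own addition so the snippet runs from top to bottom without stopping at the error; the variable names secs and res are illustrative.

```r
x <- Sys.time()       # Sys.time() already returns a POSIXct object

secs <- unclass(x)    # a large number: seconds since 1970-01-01
secs

# POSIXct is not a list, so the $ operator fails on it.
# tryCatch() records the failure instead of halting the script.
res <- tryCatch(x$sec, error = function(e) "error")
res

# Convert to POSIXlt first, then the list elements are available.
p <- as.POSIXlt(x)
p$sec
```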

6:01

Finally, there's the strptime function, which converts dates written in character string format into date or time objects. Well, in this case it converts them into time objects. It uses what are called format strings. Here I've got a string that says January 10th, 2012, and then 10:40, meaning hour 10, minute 40, and then I have another string that says December 9th, 2012, 9:10. If I want to convert these strings to time objects, I can use the strptime function. What I do is pass it the character vector and a format string. You'll see the format string is made of percent signs followed by letters, and you can look up what these symbols mean in the help page for strptime.

6:54

So %B means the month as an unabbreviated name, %d is the day, then a comma, and then %Y is the four-digit year. Then %H is the hour, followed by a colon, followed by %M, which is the minute. And so that's the format of the times here.

7:15

You can see that after I call strptime and print out x, I get these time objects printed out in the standard format. When I look at the class of this object, you'll see it's in POSIXlt format.

7:32

So that's the underlying list format here. Now, I personally can never remember what those formatting strings are, the %B, %d, %Y, and so I always have to look at the help page for strptime to remember what those details are.
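
Here is a sketch of the strptime call described above. The exact strings are reconstructed from what is said aloud, so treat them as an assumption, and the tz = "UTC" argument is my own addition to keep the result independent of the machine's time zone. Parsing %B also assumes the session uses English month names.

```r
# Two times written as character strings.
datestring <- c("January 10, 2012 10:40", "December 9, 2012 9:10")

# %B = full month name, %d = day, %Y = four-digit year,
# %H = hour, %M = minute. See ?strptime for the full list.
x <- strptime(datestring, "%B %d, %Y %H:%M", tz = "UTC")

x          # printed in the standard year-month-day hour:minute:second format
class(x)   # "POSIXlt" "POSIXt"
```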

7:49

Now, once you've got data in the date or time format, you can do operations on them, which can be very handy. For example, you can add and subtract dates, and you can compare dates: is one date less than another date, or are these two dates equal to each other?
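
A few of these operations, sketched with two made-up dates (a and b are illustrative names, not from the lecture):

```r
a <- as.Date("2012-01-01")
b <- as.Date("2012-01-15")

b - a     # a difftime: 14 days
a + 30    # adding a number of days gives another Date, 2012-01-31
a < b     # TRUE
a == b    # FALSE
```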

8:07

But you need to be careful, because you can't always mix different classes. So here, I have x, which is a Date object, and y, which is a POSIXlt object.

8:17

If I try to subtract the two, you'll get an error, because they're not the same type of object. But if I convert the date to a POSIXlt object by calling as.POSIXlt on it, then when I take the difference it'll say that there's a difference of 356.3 days between the two.
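
A rough sketch of this situation is below. The date and time strings are assumptions chosen so the gap is a bit over 356 days, and y is pinned to UTC here, so the fractional part will not match the 356.3 quoted above, which depends on the speaker's local time zone. The mixed subtraction is left commented out so the snippet runs to the end.

```r
x <- as.Date("2012-01-01")
y <- strptime("9 Jan 2011 11:34:21", "%d %b %Y %H:%M:%S", tz = "UTC")

class(x)   # "Date"
class(y)   # "POSIXlt" "POSIXt"

# x - y    # mixing a Date with a POSIXlt; the lecture reports an error here

# Put both on the same footing, then subtract.
x_lt <- as.POSIXlt(x)
d <- x_lt - y
d          # a difftime of a little over 356 days
```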

8:37

The nice thing about the date and time operators is that they keep track of very tricky things like leap years, leap seconds, daylight savings, and time zones. In this first example here, you can see that 2012 was a leap year, and so there was a February 29th. So the difference between March 1st and February 28th is actually a difference of two days, not a difference of one day like it is in non-leap years.
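
The leap year case can be checked directly. The 2013 comparison is my own addition, included only to show the ordinary one-day gap.

```r
# 2012 was a leap year, so February 29th sits between these two dates.
x <- as.Date("2012-03-01")
y <- as.Date("2012-02-28")
x - y    # Time difference of 2 days

# In a non-leap year the same pair of dates is one day apart.
non_leap <- as.Date("2013-03-01") - as.Date("2013-02-28")
non_leap # Time difference of 1 days
```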

9:02

Similarly, I could take two times: x, which is in my current time zone, and y, which is in the time zone of GMT, so Greenwich Mean Time, and take the difference between those. Even though it looks like it should be a 5-hour difference, it's actually only a 1-hour difference because of the difference in time zones. So one of the advantages of using the date-time classes is that they will automatically take care of these kinds of irregularities.
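
A sketch of the time zone case follows. The lecture uses the speaker's local time zone for x; here "America/New_York" is an assumption standing in for it so the result does not depend on where the code runs, and the specific clock times are illustrative. It also assumes the system has that zone in its time zone database.

```r
# 1:00 in New York on this date is 5:00 GMT (daylight saving time is in effect).
x <- as.POSIXct("2012-10-25 01:00:00", tz = "America/New_York")
y <- as.POSIXct("2012-10-25 06:00:00", tz = "GMT")

# The clock readings differ by 5 hours, but the instants are 1 hour apart.
y - x
```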

9:31

So that was just a quick overview of the date and time classes in R. Just to summarize, there are special classes in R that represent dates and times and that allow you to do numerical and statistical calculations. Dates use the Date class; times use either the POSIXct or the POSIXlt class.

9:49

And then character strings can be coerced to either a date or a time class using the strptime function or as.Date, as.POSIXlt, and as.POSIXct.

The other thing to note, which I haven't really talked about here, is that a lot of the plotting functions will recognize date-time objects. So when you try to plot an object that's of a date-time class, the function will recognize that object and format the x axis in a special way so that it has a time element to it. You might want to experiment with that a little bit to see how plots change when you use a date-time class.
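
One way to start experimenting is sketched below, using base graphics. The dates and the values are entirely made up for illustration.

```r
# Twelve monthly dates and some made-up values to plot against them.
dates  <- seq(as.Date("2012-01-01"), by = "month", length.out = 12)
values <- cumsum(seq_along(dates))

# Because dates is of class Date, the x axis is labelled with dates
# rather than with the underlying day counts.
plot(dates, values, type = "b", xlab = "Date", ylab = "Value")
```

Replacing dates with unclass(dates) in the plot call shows the difference: the x axis then carries plain numbers.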
