This release (hopefully) marks the stability of a tsibble data object (tbl_ts). A tbl_ts contains the following components:
key: one or more columns that uniquely identify observational units over time. A key consisting of nested and crossed variables reflects the structure underlying the data. The package itself takes care of updating the “key” when the data is manipulated. The “key” differs from the grouping variables in that the latter are manipulated by users.
index: a variable that represents time. Together with the “key”, it uniquely identifies each observation in the data table.
index2: why do we need a second index? It means re-indexing to a variable, not a second index. It is identical to the index most of the time, but starts to deviate when using index_by(). index_by() works similarly to group_by(), but groups the index only. The dplyr verbs, like mutate(), operate on each time group of the data defined by index_by(). You may wonder why a new function is introduced rather than using group_by(), which users are most familiar with. It is because time is indispensable to a tsibble, and index_by() provides a trace for understanding how the index changes. For this purpose, group_by() is just too general. For example, summarise() aggregates data to a less granular time period, which updates the index; this is now handled nicely and intuitively.
interval: an interval class that stores a list of time intervals. It computes the greatest common factor of the time differences in the index column, which should give a sensible interval in almost all cases, compared with the minimal time distance. It also depends on the time representation. For example, if the data is monthly, the index should use a yearmonth() format instead of Date, because Date only gives the number of days, not the number of months.
regular: since a tsibble factors in the implicit missing cases, whether the data is regular or not cannot be determined automatically. This relies on the user’s specification.
ordered: time-wise and rolling window functions assume temporally ordered data. A tsibble will be sorted by its time index. If a key is explicitly declared, the key is sorted first, followed by time in ascending order. If the data is not in time order, a warning is issued.
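As a rough illustration of the components listed above, the sketch below builds a small hourly tsibble, inspects its components, aggregates it to days with index_by() and summarise(), and compares a Date index with a yearmonth() index for monthly data. It assumes a recent tsibble release together with dplyr; the data and the column names (sensor, time, value, month) are made up for illustration, and accessor names such as key_vars() and index_var() may differ in older versions.

```r
library(tsibble)
library(dplyr)

# Hypothetical data: two sensors observed hourly over two days (UTC).
hours <- as.POSIXct("2018-01-01 00:00:00", tz = "UTC") + 3600 * 0:47
readings <- tsibble(
  sensor = rep(c("a", "b"), each = 48),
  time = rep(hours, times = 2),
  value = rep(1:48, times = 2),
  key = sensor,
  index = time
)

key_vars(readings)    # name(s) of the key column(s)
index_var(readings)   # name of the index column
interval(readings)    # interval derived from the index differences
is_regular(readings)
is_ordered(readings)

# index_by() groups the index only; summarise() then aggregates each
# sensor's hourly values to one row per day and updates the index.
daily <- readings %>%
  group_by_key() %>%
  index_by(date = as.Date(time)) %>%
  summarise(total = sum(value))
index_var(daily)

# Monthly data: the interval depends on how time is represented.
dates <- seq(as.Date("2018-01-01"), by = "month", length.out = 6)
by_date <- tsibble(month = dates, value = 1:6, index = month)
by_month <- tsibble(month = yearmonth(dates), value = 1:6, index = month)
interval(by_date)     # measured in days
interval(by_month)    # measured in months
```

Using group_by_key() before index_by() keeps one daily series per sensor; without it, summarise() would aggregate across sensors.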
tsummarise() and its scoped variants. It can be replaced by the combination of index_by() and summarise().
tsummarise() provides an unintuitive interface: the first argument keeps the same size as the index, while the remaining arguments reduce rows to a single one. Analogously, it does
summarise(). The proposed index_by() solves the issue of updating the index.
find_duplicates() to better reflect its functionality.
group_vars() return a character vector instead of a list.
distinct.tbl_ts() now returns a tibble instead of throwing an error.
tidyr::fill(), as they respect the input structure.
index_sum(), and replaced by index_valid() to extend index type support.
fill_na.tbl_ts() gained a new argument, .full = FALSE. .full = FALSE (the default) inserts NA for each key within its own time period, whereas TRUE does so over the entire time span. This changes the results of fill_na.tbl_ts(), which previously only took TRUE into account. (#15)
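To make the effect of .full concrete, here is a small sketch. It assumes a recent tsibble release, where this operation is exposed as fill_gaps() rather than fill_na(); the data and column names are invented for illustration.

```r
library(tsibble)

# Hypothetical yearly data: key "a" spans 2001-2004 with 2003 missing,
# key "b" spans 2002-2003 with no gaps.
x <- tsibble(
  id = c("a", "a", "a", "b", "b"),
  year = c(2001, 2002, 2004, 2002, 2003),
  value = c(1, 2, 3, 4, 5),
  key = id,
  index = year
)

# Default (.full = FALSE): gaps are filled within each key's own period.
within_key <- fill_gaps(x)

# .full = TRUE: every key is padded out to the entire time span of the data.
full_span <- fill_gaps(x, .full = TRUE)

nrow(within_key)
nrow(full_span)
```

With the default, only the missing 2003 row for key "a" is inserted; with .full = TRUE, key "b" is also extended to cover 2001 and 2004.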
.drop in column-wise dplyr verbs.
group_by.tbl_ts() now behaves exactly the same as group_by.tbl_df. Grouping variables are temporary for data manipulation. Nested or crossed variables are not the type that
tbl_ts gains a new attribute, index2, which is a candidate for the new index (a symbol) used by
attr(grouped_ts, "vars") stores characters instead of names, same as
This release introduces major changes into the underlying
tbl_ts class to reduce the object size, and computed on the fly when printing.
tbl_ts object is a symbol now instead of a quosure.
tbl_ts object is an unnamed list of symbols.
split_by() to split a tsibble into a list of data by unquoted variables.
build_tsibble() allows users to gain more control over tsibble construction.
as_tsibble.msts() for multiple seasonality time series defined in the forecast package.
stretch(), are no longer defined as S3 methods. Several new variants have been introduced for the purpose of type stability, such as slide_dfr() (a row-bound data frame) and slide_dfc() (a column-bound data frame).
The index variable must sit in the first name-value pair in tsummarise(), rather than at any position in the call.
transmute.tbl_ts() keeps the newly created variables along with the index and keys, instead of throwing an error as before.
This release marks the complete support of dplyr key verbs.
inform_duplicates() reports which rows have duplicated elements of key and index variables.
tsummarise.tbl_ts(), when calling functions with no parameters like
tsummarise.tbl_ts(), one grouping level should be dropped for consistency with dplyr::summarise() for a grouped
tbl_ts are supported in as_tsibble(). An empty tsibble is not allowed.
group_by.tbl_ts(.data, ..., add = TRUE) works as expected now.