Base R’s list operations are designed for interactive use, offering flexibility but often being overly permissive with user input. While convenient, this can lead to subtle bugs during development and requires additional, sometimes tedious, checks to ensure code robustness.
The {container} package addresses these challenges by providing operations that explicitly define the intent of each action. By prioritizing clarity and precision, {container} enables you to write leaner and more reliable code right from the start.
This vignette revisits some of the basic operations familiar from base R lists and demonstrates how {container} enhances them with strict validation and powerful additional features.
Using base R list notation, elements are usually added by name or concatenation.
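As a minimal sketch of the two base R idioms just mentioned (the variable name l and the values are only illustrative), note that concatenation accepts a duplicated name without complaint:

```r
# Start with an empty base R list.
l <- list()

# Add an element by name.
l[["x"]] <- 1

# Add an unnamed element by concatenation.
l <- c(l, 2)

# Concatenation also accepts a second element named "x" without any warning.
l <- c(l, x = 3)

names(l)
# [1] "x" ""  "x"
```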
The {container} package provides the add function to add elements.
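A short sketch of add, assuming the {container} package is installed and that a container is created with container() as in the later examples of this vignette. The element values are made up for illustration:

```r
library(container)

# Illustrative container with one named and one unnamed element.
co <- container(x = 1, 2)

# add() returns a new container; a duplicated name is accepted here.
co <- add(co, x = 3)

length(co)
```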
For container objects there is not much of a difference between the two methods. Now, if for example you don’t want to allow duplicated names, you can use dict objects instead. These are a subclass of container and would throw an error in this case.
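The dict behaviour described above can be sketched as follows. The example assumes the constructor dict() from {container}; tryCatch is used only so that the block runs to completion, and the exact wording of the error message is deliberately not shown:

```r
library(container)

d <- dict(x = 1)

# Adding a second element named "x" is expected to fail for a dict.
msg <- tryCatch(
    {
        add(d, x = 3)
        NA_character_          # only reached if no error was raised
    },
    error = function(e) conditionMessage(e)
)

msg
```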
For more details see the reference documentation or have a look at the Deque, Set, and Dict vignette. Lastly, note that the base append function also works with containers.
As demonstrated before, elements can be loosely replaced by index or name.
Also, in contrast to base lists, the container does not allow adding elements at positions beyond the length of the object.
If the name does not exist, the element is appended as known from base lists.
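For comparison, this is how a plain base R list behaves in the same situations (base R only, values are illustrative). Assigning beyond the current length silently pads the list with NULL elements, and assigning to a new name appends:

```r
l <- list(x = 0, 12, x = 3)

# Assigning at position 5 of a length-3 list is allowed in base R;
# position 4 is silently filled with NULL.
l[[5]] <- "far away"
length(l)
# [1] 5
is.null(l[[4]])
# [1] TRUE

# Assigning to a name that does not exist appends a new element.
l[["y"]] <- 5
length(l)
# [1] 6
```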
Let’s imagine you want to replace an element of a certain name, and therefore expect that the name exists already. In code development, this would require an additional check, for example:
name <- "z"
if (name %in% names(co)) {
    co[[name]] <- 10
} else {
    stop("Name '", name, "' does not exist.")
}
Clearly this is a lot of boilerplate code for a simple operation, and it is easy to forget such checks. In addition, you end up with a lot of unit tests basically to check the checks. Last but not least, the intent of the code is not as clear as it could be.
This is where the {container} package comes in. If you want to make sure that something is replaced, {container} provides the function replace_at, which will only replace elements at names or positions that exist. The following statements are all equivalent and show the different ways to use replace_at.
replace_at(co, x = 10, y = 13) # name = value pairs
# [x = 10, 12, x = 3, y = 13]
replace_at(co, c("x", "y"), c(10, 13)) # names followed by values
# [x = 10, 12, x = 3, y = 13]
replace_at(co, c(1, 4), c(10, 13)) # positions followed by values
# [x = 10, 12, x = 3, y = 13]
replace_at(co, list(1, "y"), c(10, 13)) # mixed indices followed by values
# [x = 10, 12, x = 3, y = 13]
Next, let’s see how invalid indices are signaled.
replace_at(co, z = 10)
# Error: names(s) not found: 'z'
replace_at(co, "z", 10)
# Error: names(s) not found: 'z'
replace_at(co, 5, 10)
# Error: index out of range (length = 4): 5
If you instead don’t mind that elements at new names will be added, set .add = TRUE. Invalid positional indices are still signaled.
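A sketch of the .add = TRUE case, with the container rebuilt by hand so that the block is self-contained. The out-of-range position 7 is an arbitrary choice, and tryCatch is only there to capture the error message:

```r
library(container)

co <- container(x = 0, 12, x = 3, y = 5)

# "z" does not exist yet, so with .add = TRUE it is appended.
co <- replace_at(co, z = 10, .add = TRUE)
length(co)

# A position beyond the length is still an error, even with .add = TRUE.
msg <- tryCatch(
    {
        replace_at(co, 7, 10, .add = TRUE)
        NA_character_          # only reached if no error was raised
    },
    error = function(e) conditionMessage(e)
)
msg
```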
It is also possible to replace elements by value, that is, you specify the value (not the index) that should be replaced. To see this, let’s replace 12 (located at the 2nd position) by "foo" and then y = 5 (located at the 4th position) by 1:2.
co <- replace(co, old = 12, new = "foo")
co
# [x = 0, "foo", x = 3, y = 5, z = 10]
co <- replace(co, old = 5, new = 1:2)
co
# [x = 0, "foo", x = 3, y = (1L 2L), z = 10]
Implementing this “manually” would require even more code than before. As intended, if the value does not exist, an error is signaled.
replace(co, old = "non-existent-value", new = "my value")
# Error: old element ("non-existent-value") is not in Container
Again, you can state the intent to replace an element but to add it if it does not exist yet.
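A sketch of this, assuming that replace accepts an add argument analogous to .add in replace_at. The argument name is an assumption here and should be checked against the reference documentation of your installed version:

```r
library(container)

co <- container(x = 0, "foo", x = 3)

# The old value is not present, so the new value is added instead
# (assumed argument: add = TRUE).
co2 <- replace(co, old = "non-existent-value", new = "my value", add = TRUE)
length(co2)
```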
Let’s recap the standard extract operators.
co[[1]]
# [1] 0
co[["x"]]
# [1] 0
co[3:5]
# [x = 3, y = (1L 2L), z = 10]
co[c("x", "y", "z")]
# [x = 0, y = (1L 2L), z = 10]
The {container} functions to strictly select one or multiple elements are named at2 and at [1].
at2(co, 1)
# [1] 0
at2(co, "x")
# [1] 0
at(co, 3:5)
# [x = 3, y = (1L 2L), z = 10]
at(co, c("x", "y", "z"))
# [x = 0, y = (1L 2L), z = 10]
As before you can specify mixed indices via lists.
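For instance, names and positions can be combined in one call by wrapping them in a list, in the same way as shown for replace_at. The container is rebuilt here so that the block is self-contained, and the chosen indices are only illustrative:

```r
library(container)

co <- container(x = 0, "foo", x = 3, y = 1:2, z = 10)

# Mix names and positions by wrapping the indices in a list.
sel <- at(co, list("x", 4, "z"))
length(sel)
```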
Accessing non-existent names or positions is signaled with an error as follows.
at2(co, 10)
# Error: index 10 exceeds length of Container, which is 5
at2(co, "a")
# Error: index 'a' not found
at(co, 3:6)
# Error: index 6 exceeds length of Container, which is 5
at(co, c("x", "a"))
# Error: index 'a' not found
Recall that with base lists, non-existent names and out-of-range positions selected with [ would just have returned NULL values.
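The permissive base R behaviour can be checked with a small list (base R only, values are illustrative). Note that [[ with an out-of-range position is the one case where base R raises an error as well:

```r
l <- list(x = 0, "foo", y = 1:2)

l[["a"]]                  # non-existent name: NULL
# NULL

res <- l[c("x", "a")]     # the missing name yields a NULL element
length(res)
# [1] 2
is.null(res[[2]])
# [1] TRUE

res <- l[2:4]             # position 4 is out of range: NULL element
is.null(res[[3]])
# [1] TRUE

# Only [[ with an out-of-range position raises an error in base R.
inherits(try(l[[10]], silent = TRUE), "try-error")
# [1] TRUE
```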
If needed, the (less strict) list access can be mimicked with peek_at and peek_at2.
co
# [x = 0, "foo", x = 3, y = (1L 2L), z = 10]
peek_at(co, 10, 11)
# []
peek_at(co, 5:10)
# [z = 10]
peek_at2(co, "a")
# NULL
As you can see, one important difference is that multiple access via peek_at by default returns nothing for missing indices instead of NULL values. However, both functions let you specify a custom default value that is returned if the index does not exist.
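A sketch of custom defaults, assuming the argument is called default in peek_at2 and .default in peek_at. Both names are assumptions and should be checked against the reference documentation:

```r
library(container)

co <- container(x = 0, "foo", x = 3, y = 1:2, z = 10)

# Single access: return -1 instead of NULL for a missing name.
single <- peek_at2(co, "a", default = -1)
single

# Multiple access: missing indices are filled with the default value.
filled <- peek_at(co, "z", "a", .default = -1)
length(filled)
```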
To remove elements in lists, they are usually replaced by NULL.
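In base R this looks as follows (base R only, values are illustrative). The sketch also shows that storing an actual NULL in a base list requires wrapping it in a list:

```r
l <- list(x = 0, "foo", y = 5)

# Assigning NULL removes the element.
l[["x"]] <- NULL
length(l)
# [1] 2

# To store an actual NULL, it has to be wrapped in a list.
l[1] <- list(NULL)
length(l)
# [1] 2
is.null(l[[1]])
# [1] TRUE
```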
With the container package this is done differently, as replacing by NULL will not delete the element but literally replace it by NULL.
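A minimal sketch of this difference, using a small container built only for illustration:

```r
library(container)

co <- container(x = 0, "foo", x = 3)

# The element is kept; only its value becomes NULL.
co[["x"]] <- NULL
length(co)
```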
Instead, elements can be deleted by index (delete_at) or value (delete) as follows.
co
# [x = NULL, "foo", x = 3, y = (1L 2L), z = 10]
delete_at(co, 1, "y", "z")
# ["foo", x = 3]
delete(co, NULL, 1:2, 10) # same but remove by value
# ["foo", x = 3]
As before, invalid indices or missing values are signaled.
co
# [x = NULL, "foo", x = 3, y = (1L 2L), z = 10]
delete_at(co, "a")
# Error: names(s) not found: 'a'
delete_at(co, 10)
# Error: index out of range (length = 5): 10
delete(co, 1:3)
# Error: (1L 2L 3L) is not in Container
If you need a less strict delete operation, use the discard functions, which delete all valid indices or values and ignore the rest.
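A sketch of the discard functions, assuming they are named discard_at (by index) and discard (by value), mirroring delete_at and delete. The container is rebuilt with illustrative values:

```r
library(container)

co <- container(x = 0, "foo", x = 3, y = 1:2, z = 10)

# Position 1 exists and is removed; name "a" does not exist and is ignored.
by_index <- discard_at(co, 1, "a")
length(by_index)

# The value "foo" is removed; the value 1:3 is not present and is ignored.
by_value <- discard(co, "foo", 1:3)
length(by_value)
```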
The update function is used to combine/merge two containers.
c1 <- container(1, b = 2)
c2 <- container(b = 0, c = 3)
update(c1, c2)
# [1, b = 0, c = 3]
update(c2, c1)
# [b = 2, c = 3, 1]
With the container package this function is also provided for base R lists.
l1 <- list(1, b = 2)
l2 <- list(b = 0, c = 3)
update(l1, l2)
# [[1]]
# [1] 1
#
# $b
# [1] 0
#
# $c
# [1] 3
update(l2, l1)
# $b
# [1] 2
#
# $c
# [1] 3
#
# [[3]]
# [1] 1
Note that there is a similar function, utils::modifyList, which, in contrast to update, does not “forward” unnamed elements.
modifyList(l1, l2)
# [[1]]
# [1] 1
#
# $b
# [1] 0
#
# $c
# [1] 3
modifyList(l2, l1) # drops l1[[1]] = 1
# $b
# [1] 2
#
# $c
# [1] 3
Also, while utils::modifyList modifies a list recursively by changing a subset of elements at each level, update works on the first level only.
l1 <- list(a = 1, b = list(c = "a", d = FALSE))
l2 <- list(e = 2, b = list(d = TRUE))
modifyList(l1, l2) # modifies l1$b$d from FALSE to TRUE
# $a
# [1] 1
#
# $b
# $b$c
# [1] "a"
#
# $b$d
# [1] TRUE
#
#
# $e
# [1] 2
update(l1, l2) # replaces l1$b by l2$b
# $a
# [1] 1
#
# $b
# $b$d
# [1] TRUE
#
#
# $e
# [1] 2
Both the apply family and common higher-order functions can be used with containers as usual.
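As a brief sketch, assuming a container is converted to a list internally by these base functions, sapply and Reduce can be applied directly. The container and its values are made up for illustration:

```r
library(container)

co <- container(a = 1, b = 2, c = 3, d = 4)

# apply family: square each element.
squares <- sapply(co, function(x) x^2)
squares

# Higher-order function: sum all elements.
total <- Reduce(`+`, co)
total
```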
This vignette demonstrates how {container} enhances robust code development by providing:
To see how some of the functions discussed here are applied with derived data structures, see:
[1] Resembling the base R internals .subset2 and .subset.