R Lists

R lists are general data objects made up of components, where each component is itself a data object of any mode or type.  The length() of a list is the number of components.  Lists are very flexible and convenient for packaging or storing different kinds of data in one object.  However, for large data of a single type, vectors and arrays are usually preferred because operations on them run faster.

One advantage of lists is that they are recursive: a component of a list can itself be a list.  In particular, many functions will loop across the components of a list without the need to write explicit loop structures when coding.

Creating R Lists

The list() function is used to combine arbitrary data objects into a single object. For example, suppose you have a vector x with a numeric sequence and a matrix y of character data.  You can combine the two distinct class objects into a single list:
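A minimal sketch of this step follows. The values in x and y and the component name Letters are illustrative assumptions; only the name Sequence appears elsewhere in this article.

```r
# x: a numeric sequence; y: a matrix of character data (illustrative values)
x <- seq(1, 10, by = 1)
y <- matrix(c("a", "b", "c", "d"), nrow = 2)

# Combine the two objects of different classes into one named list
mylist <- list(Sequence = x, Letters = y)

# Printing shows each component under its $name
print(mylist)
```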

When a list is printed, each component is preceded by the $ operator followed by its name (assuming a name was given).

The function vector() can also be used to create a list, since its first argument (mode) specifies the kind of vector to create and accepts "list":
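As a small sketch, vector() can pre-allocate an empty list that is filled in later. The object name empty and the length of 3 are assumptions chosen for illustration.

```r
# Pre-allocate a list with three components, each initially NULL
empty <- vector(mode = "list", length = 3)

is.list(empty)    # TRUE
length(empty)     # 3

# Fill a component later by position
empty[[1]] <- 1:5
```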

The number of components in any list can be found using the length() function, and the component names can be retrieved using the names() function.  The attributes() function is also valuable, especially for controlling custom function output where user-defined attributes or comments can be supplied.
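The sketch below shows these inspection functions on a small list. The list contents, the attribute name "source", and the comment text are all assumptions made up for the example.

```r
mylist <- list(Sequence = 1:10, Letters = matrix(letters[1:4], nrow = 2))

length(mylist)        # 2
names(mylist)         # "Sequence" "Letters"
attributes(mylist)    # a list holding the names attribute

# Attach a user-defined attribute and a comment to the list
attr(mylist, "source") <- "illustrative data"
comment(mylist) <- "built for this example"

# The new attribute and the comment now appear alongside names
attributes(mylist)
```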

List Subscripting

List components are also numbered, facilitating subscripting or the extraction and replacement of data.  The infix operators for list subscripting are [ ] , [[ ]] and $.  The examples use all three operators with varying results based on the level of the data object to be extracted/replaced:
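A brief sketch of the three operators, using an assumed two-component list (the component name Letters and all values are illustrative):

```r
mylist <- list(Sequence = 1:10, Letters = matrix(letters[1:4], nrow = 2))

sub  <- mylist[1]          # [ ] returns a list of length 1
vec  <- mylist[[1]]        # [[ ]] returns the component itself
same <- mylist$Sequence    # $ returns the component by name

class(sub)                 # "list"
class(vec)                 # "integer"

# Subscripts can be chained to reach inside a component
third <- mylist[[1]][3]    # third element of the first component: 3
mylist[[2]][1, 2]          # row 1, column 2 of the matrix: "c"

# Replacement uses the same operators on the left of the assignment
mylist$Sequence[3] <- 99L
```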

Using List Names

With the $ operator, names of components may be abbreviated to the shortest string that identifies them uniquely.  Thus, mylist$Sequence is equal to mylist$S, assuming no other component name begins with S.  Abbreviations are suitable for use in the interactive Console window, but they are bad programming practice because it is not clear what data is being referenced.  Meanwhile, if you define a list without component names, components can be accessed only using the [ ] and [[ ]] operators with numeric indexing.  Relying on index numbers is also bad programming practice, since numbers make it hard to understand what data is being extracted or replaced.  This is especially true when dealing with large data objects or dormant code that hasn't been used for a while.
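The abbreviation behavior can be sketched as follows, again with an assumed list whose second component name, Letters, is illustrative:

```r
mylist <- list(Sequence = 1:10, Letters = matrix(letters[1:4], nrow = 2))

short <- mylist$S          # partial match on "Sequence"
full  <- mylist$Sequence
identical(short, full)     # TRUE

# [[ ]] matches names exactly by default, so the abbreviation finds nothing
exact_only <- mylist[["S"]]                 # NULL
# Partial matching with [[ ]] has to be requested explicitly
partial <- mylist[["S", exact = FALSE]]
```

The stricter default of [[ ]] is one reason to prefer full names in scripts.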

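R lists can be very dynamic, as components can be added, deleted or replaced, so the numeric position of a component may change.  Use of the match() function helps to overcome potential confusion in indexing.  Looking up a component's position with match() against names() and passing the result to [[ ]] is guaranteed to extract the named component of the list, or return NULL if there is not one.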
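One way to write such a lookup is sketched below. The list contents and the name "Missing" are hypothetical and chosen only to show both outcomes.

```r
mylist <- list(Sequence = 1:10, Letters = matrix(letters[1:4], nrow = 2))

# Find the position of the component by its full name, then extract it
pos   <- match("Sequence", names(mylist))   # 1
found <- mylist[[pos]]

# "Missing" is a hypothetical name that is not in the list:
# match() returns NA, and [[ ]] on a list then yields NULL
absent <- mylist[[match("Missing", names(mylist))]]
is.null(absent)                             # TRUE
```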

Warning: Subscripting with [ ] returns a list, even when only one named component is selected, and that result cannot be used in functions that require vector or matrix inputs.  The extracted component is still classified as a list and will generate errors when another class is expected.  To avoid this problem, extract data at the lowest level possible with [[ ]] or $ and change the data class as needed.
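The following sketch illustrates the problem with sum(), which expects an atomic vector. The list is an assumed example, and try() is used only so the script keeps running after the error.

```r
mylist <- list(Sequence = 1:10, Letters = matrix(letters[1:4], nrow = 2))

as_list   <- mylist["Sequence"]     # still a list of length 1
as_vector <- mylist[["Sequence"]]   # the integer vector itself

# sum() rejects the list; try() captures the error instead of stopping
failed <- try(sum(as_list), silent = TRUE)
inherits(failed, "try-error")       # TRUE

total <- sum(as_vector)             # 55
```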

Modifying and Concatenating R Lists

The names of a list’s components can be changed by re-assigning them with the names() function:
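A short sketch of renaming, where the new names Numbers, Chars and Characters are made up for illustration:

```r
mylist <- list(Sequence = 1:10, Letters = matrix(letters[1:4], nrow = 2))

# Rename every component at once
names(mylist) <- c("Numbers", "Chars")

# Rename a single component by position
names(mylist)[2] <- "Characters"

names(mylist)    # "Numbers" "Characters"
```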

Lists can also have new components added or removed using list subscripting rules:
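As a sketch, components can be added by assigning to a new name and removed by assigning NULL. The component names Flag and Total are hypothetical.

```r
mylist <- list(Sequence = 1:10, Letters = matrix(letters[1:4], nrow = 2))

# Add components by assigning to names that do not exist yet
mylist$Flag <- TRUE
mylist[["Total"]] <- sum(mylist$Sequence)

# Remove a component by assigning NULL to it
mylist$Letters <- NULL

names(mylist)    # "Sequence" "Flag" "Total"
```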

When the concatenation function c() employs list arguments, the result is an object of mode list whose components are those of the argument lists joined together:
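For instance, with two small lists whose names and contents are assumed for the example:

```r
list_a <- list(Sequence = 1:3)
list_b <- list(Letters = c("a", "b"), Flag = TRUE)

# c() joins the components of both lists into one list
combined <- c(list_a, list_b)

mode(combined)      # "list"
length(combined)    # 3
names(combined)     # "Sequence" "Letters" "Flag"
```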

Warning: The concatenation function will combine elements or components, but at the expense of discarding dim attributes.  A matrix passed directly to c() is therefore flattened into a plain vector.
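The loss of dimensions can be seen in this sketch, which also shows one possible workaround (wrapping each object in list() first). The matrix m and the other values are illustrative assumptions.

```r
m <- matrix(1:4, nrow = 2)

# Passing the matrix straight to c() drops its dim attribute
flat <- c(m, 5:6)
dim(flat)          # NULL
length(flat)       # 6

# Wrapping in list() keeps the matrix intact as a single component
kept <- c(list(m), list(5:6))
dim(kept[[1]])     # 2 2
```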

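Two or more lists can also be combined using the append() function, which inserts the components of one list into another at a chosen position.  Note that detach() does not modify a list; it removes from the search path a list or data frame that was previously added with attach().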
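A small sketch of append() on lists follows. The list names and contents are hypothetical; the after argument controls where the second list is inserted.

```r
list_a <- list(First = 1, Third = 3)
list_b <- list(Second = 2)

# Insert list_b into list_a after its first component
merged <- append(list_a, list_b, after = 1)

names(merged)    # "First" "Second" "Third"
```

Without the after argument, append() places the new components at the end, which gives the same result as c().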

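Finally, the unlist() function removes the list structure and returns a single vector that combines all the data.  The components are coerced to a common type, so the result is numeric only if every component is numeric; if any component contains character data, the whole result is character.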
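The coercion can be sketched with two assumed lists, one all numeric and one mixed:

```r
nums  <- list(a = 1:2, b = 3.5)
mixed <- list(n = 1:2, s = "x")

flat_nums  <- unlist(nums)     # numeric vector with names a1, a2, b
flat_mixed <- unlist(mixed)    # character vector: "1" "2" "x"

class(flat_nums)               # "numeric"
class(flat_mixed)              # "character"
```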

Analysis of List Components

We have already seen the power of the apply() function when working with vectors or matrices.  The following functions likewise take a function as an argument and apply it to each list component:

Function    Arguments                         Input Object    Output Object
lapply()    (X=, FUN=, ...)                   Any R object    List
sapply()    (X=, FUN=, ..., simplify=TRUE)    Any R object    Vector, matrix, or list

For lapply(X, FUN=, …), the result is a list in which each component is the result of executing the function (FUN) on the corresponding component of X.  The sapply(X, FUN=, …) function calls lapply(), then attempts to simplify the result to a vector or matrix.  For example:
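A minimal sketch comparing the two functions, using an assumed list named scores with made-up values:

```r
scores <- list(first = c(1, 2, 3), second = c(10, 20))

# lapply() always returns a list: first = 2, second = 15
as_list <- lapply(scores, mean)

# sapply() simplifies the same result to a named numeric vector
as_vec <- sapply(scores, mean)

# When FUN returns two values per component, sapply() builds a matrix
ranges <- sapply(scores, range)
dim(ranges)    # 2 2
```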

Both functions are extremely flexible given the ability to apply a function within a function.  The … argument means that any number of additional arguments can be specified and passed on to FUN to refine list analysis.
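The sketch below shows an extra argument being passed through to FUN, and an anonymous function used in its place. The list scores and its values are assumptions for illustration.

```r
scores <- list(first = c(1, NA, 3), second = c(10, 20))

# Without extra arguments, the NA propagates into the mean of "first"
plain <- sapply(scores, mean)

# na.rm = TRUE is passed through to mean() for every component
clean <- sapply(scores, mean, na.rm = TRUE)    # first = 2, second = 15

# An anonymous function can be supplied as FUN: count non-missing values
counts <- sapply(scores, function(v) sum(!is.na(v)))    # first = 2, second = 2
```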
