Local vs Global Objects

R distinguishes local objects, which are temporary, from global objects, which persist until they are removed.

Local Objects and Frames

Data objects assigned within the body of a function are temporary: they are local to that function. Local objects have no effect outside the function, and they disappear when function evaluation is complete.
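The behavior described above can be checked directly. The following is a minimal sketch; the function scale_values() and the object scaled_tmp are hypothetical names chosen for illustration.

```r
# Hypothetical function: 'scaled_tmp' exists only while scale_values() runs.
scale_values <- function(x) {
  scaled_tmp <- x * 10   # local object, created in the function's own frame
  scaled_tmp + 1         # value returned to the caller
}

result <- scale_values(1:3)
result                   # 11 21 31
exists("scaled_tmp")     # FALSE, assuming no object of that name exists elsewhere
```

Only the returned value survives the call; the intermediate object is gone once the function finishes.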

Local objects are stored in a frame (in R, an environment) held in memory. Frame 0 is called the session frame or “global environment” and exists for as long as the R session runs. Frames 1 and higher are created only to support function execution, one for each active function call. Each of these frames contains a list of R objects with names and values. Objects stored in Frame 1 and higher become unreachable when the function returns and flow control passes back to the Console, and the garbage collector then reclaims their memory. Because this cleanup is automatic, functions keep the workspace small, whereas a script that does the same work at the top level leaves every intermediate object in the global environment until it is removed.
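Frame numbers can be inspected with sys.nframe(), which returns the number of the frame it is called from. The sketch below uses two hypothetical helper functions, inner_fn() and outer_fn(), to show that each nested call gets the next frame number.

```r
# Hypothetical helpers that report the frame number they run in.
inner_fn <- function() sys.nframe()
outer_fn <- function() c(outer = sys.nframe(), inner = inner_fn())

sys.nframe()          # 0 when typed at the Console (the global environment)
depths <- outer_fn()
depths                # outer = 1, inner = 2 when called from the Console
```

The absolute numbers are higher when the code is run through source() or another wrapper, because those functions add frames of their own; the inner call is always one frame deeper than the outer one.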

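Global Objects and the .RData File

Data objects that are global are stored in the global environment, which R also holds in memory. As a result, they persist until they are explicitly removed with rm() or the session ends, and they carry over to a later session only if the workspace is saved, by default to a .RData file in the working directory. Scripts that contain no functions, and expressions entered in the Console, create global objects. These objects are no slower to access than local ones, but they accumulate in the workspace and occupy memory until they are removed.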

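Local objects within functions become global objects only through explicit assignment: the assign() function called with envir = .GlobalEnv, or the infix operator <<-, which assigns in an enclosing environment and falls back to the global environment when no enclosing variable of that name exists. Explicit assignment does not move the object to the hard drive; it binds the value to a name in an environment that outlives the function call.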
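A minimal sketch of the difference between ordinary assignment, <<- and assign() follows. The names counter, flag, bump_local(), bump_global() and set_flag() are hypothetical and serve only as illustration.

```r
# Hypothetical counter defined at the top level.
counter <- 0

bump_local <- function() {
  counter <- counter + 1     # creates a local copy; the outer 'counter' is untouched
  invisible(counter)
}

bump_global <- function() {
  counter <<- counter + 1    # modifies 'counter' in the enclosing environment
  invisible(counter)
}

set_flag <- function() {
  assign("flag", TRUE, envir = .GlobalEnv)   # explicit assignment into the global environment
}

bump_local();  counter       # 0
bump_global(); counter       # 1
set_flag();    flag          # TRUE
```

Note that assign() without an envir argument assigns in the current frame, so inside a function it would create a local object.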

The assign() function can also be used with its pos or envir argument to put (or to replace) objects in other environments or in other entries on the search path.
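In current R, the target location for assign() is given through the pos or envir argument. The sketch below assumes a hypothetical environment named scratch, created with new.env(), as the target.

```r
# 'scratch' is a hypothetical environment used as a separate storage location.
scratch <- new.env()

assign("rate", 0.05, envir = scratch)    # create 'rate' inside scratch
assign("rate", 0.07, envir = scratch)    # replace it with a new value

get("rate", envir = scratch)                         # 0.07
exists("rate", envir = scratch, inherits = FALSE)    # TRUE
exists("rate", inherits = FALSE)                     # FALSE, assuming no 'rate' in the current environment
```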

Memory Schematic for Local vs Global Objects

The difference between local and global objects is a function of where each is stored, as shown below:



Object Masking

Object masking occurs when two or more objects have the same name but reside in different locations on the search path. The first object found on the search path is used, and all others with the same name are masked.
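Masking is easy to reproduce. The sketch below deliberately creates an object named pi, which hides the constant of the same name in the base package; the choice of pi is only an illustration.

```r
# Hypothetical: a user-defined 'pi' masks the constant stored in package:base.
pi <- 3
masked_value <- pi      # 3, the user-defined object is found first
base::pi                # 3.141593, the base package's copy is unchanged
find("pi")              # lists the search-path entries that hold an object named 'pi'

rm(pi)                  # removing the user-defined object ends the masking
pi                      # 3.141593
```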

To bypass masking, the get() function can retrieve an object from a given position on the search path. For example, get("myObject", pos = 3) returns myObject from the third entry on the search path, and assigning the result to a new name stores a copy in the current environment.
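A runnable sketch of this use of get() is given below. The setup is an assumption made for illustration: a list is attached at position 3 under the hypothetical name demo_db, and myCopy is a hypothetical name for the retrieved copy.

```r
# Hypothetical setup: the same name is stored in two places.
myObject <- "global version"
attach(list(myObject = "attached version"), pos = 3, name = "demo_db")

myObject                              # "global version" (found first)
myCopy <- get("myObject", pos = 3)    # read from position 3 of the search path
myCopy                                # "attached version"

detach("demo_db", character.only = TRUE)   # clean up the search path
```

R may print a message when attach() is called, noting that the attached myObject is masked by the one in the global environment.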

The conflicts() function is also available to check whether system objects are being masked, intentionally or not. The exists() function checks whether an object exists on the search path, or in a specific database if the where argument is given.
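Both checks can be sketched as follows. The attached list demo_db and the name no_such_object_xyz are hypothetical; the list contains an object called letters so that it clashes with the letters vector in the base package.

```r
# Hypothetical database whose 'letters' duplicates the name of base::letters.
attach(list(letters = c("x", "y")), name = "demo_db", warn.conflicts = FALSE)

has_conflict <- "letters" %in% conflicts()
has_conflict                                  # TRUE
names(conflicts(detail = TRUE))               # search-path entries involved in conflicts

in_db <- exists("letters", where = "demo_db", inherits = FALSE)
in_db                                         # TRUE
exists("no_such_object_xyz")                  # FALSE, assuming no object of that name

detach("demo_db", character.only = TRUE)      # clean up the search path
```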

Lists, data frames, environments or saved .RData files can be added to the search path with the attach() function and removed with the detach() function. By default a new entry is attached at position 2 on the search path, immediately after the global environment.
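The following sketch attaches and detaches a small data frame. The data frame trial and its columns are hypothetical values made up for illustration.

```r
# Hypothetical data frame used only for illustration.
trial <- data.frame(dose = c(1, 2, 3), response = c(0.2, 0.5, 0.9))

attach(trial)                      # attached at position 2 by default
second_entry <- search()[2]
second_entry                       # "trial"
avg_response <- mean(response)     # columns are now visible by name

detach(trial)                      # remove it from the search path
"trial" %in% search()              # FALSE
```

Detaching as soon as the work is done keeps the search path short and reduces the chance of accidental masking.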

Environment Management Functions

The following functions can be used to manage memory and environments on the search path.  Other functions listed also provide diagnostics to test objects and to ensure functions run efficiently:

assign(): Assigns a value to a name in an environment (frame) or at a position on the search path
attach(): Attaches a database (list, data frame, environment or .RData file) to the search path and sets its position in the search hierarchy
conflicts(): Checks specified locations for objects that appear more than once and returns conflict details
detach(): Detaches a database from the search path, whether a data object loaded with attach() or a package attached by library()
eval(): Evaluates an expression in a specified environment (frame)
exists(): Checks the search path to determine whether an object exists and returns a logical
gc(): Runs garbage collection and reports the memory in use
get(): Searches for an R object with a given name and returns it, looking across the entire search path or in a specified environment (frame)
new.env(): Creates a new environment (frame)
object.size(): Provides an estimate of the memory allocated to store an R object (very useful)
profr: A package to visualize data created by Rprof()
Rprof(): Profiles the time spent in different R functions by sampling the call stack (very useful)
Rprofmem(): Profiles R memory allocation by the functions in a call stack (very useful)
system.time(): Calls proc.time(); measures the time taken to run an expression or code chunk. The values returned (user, system, and elapsed) depend on your operating system, but generally the user time is the CPU time spent executing the code, the system time is the CPU time spent by the operating system on behalf of the code, and the elapsed time is the wall-clock time between starting and stopping the timer (it need not equal the sum of user and system times)
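A few of the functions listed above can be combined in a short diagnostic session. The sketch below is illustrative only; the object names big_vector, timing and env are hypothetical, and the timing values depend on the machine.

```r
# Hypothetical test vector of one million doubles.
big_vector <- numeric(1e6)
object.size(big_vector)            # about 8 million bytes (8 bytes per double plus a small header)

# Time a simple loop; the result holds user, system and elapsed times.
timing <- system.time(for (i in 1:1e5) sqrt(i))
timing[["elapsed"]]                # wall-clock seconds; varies by machine

# Evaluate an expression inside a separate environment.
env <- new.env()
assign("a", 2, envir = env)
answer <- eval(quote(a * 21), envir = env)
answer                             # 42

rm(big_vector)                     # drop the large object
invisible(gc())                    # run garbage collection to release its memory
```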
