Statement of Purpose
This documentation on the R programming language has been developed to provide a comprehensive answer to the question “What is R?” The approach is meant to appeal to new users, while the reliance on practical examples is meant to provide an applied, long-term reference for seasoned users.
What is R?
R is an open-source implementation of the S programming language, which was developed at Bell Labs “to improve data manipulation, analysis, and visualization.”
Development of S started in 1976 at Bell Labs, the research organization also responsible for the transistor, UNIX, and C. S was later commercialized as S-PLUS: Statistical Sciences Corp. (StatSci) began distributing it in the late 1980s, and the language was subsequently licensed by AT&T/Lucent exclusively to Insightful Corp., which sold it under the product name S-PLUS.
R represents a development path separate and distinct from S. R is often described as “GNU S”, an open-source language first developed in the early 1990s at the University of Auckland, New Zealand. Since 1997, the open-source framework has been managed by an international “R Core” team, and the language has attracted a substantial user base. R's user base is now significantly larger than that of S/S-PLUS. The language also enjoys strong development momentum, as evidenced by the large number of extension packages available: over 5,000 were released between 2006 and 2013.
R is an interpreted language, not a compiled language. As a result, R code depends on the R interpreter for machine interface and data handling. Reliance on the interpreter simplifies data handling, object-oriented programming, and memory management. Most of the user-visible functions are written in R, with primitive functions written in C and Fortran. It is possible for the user to interface R with C, C++, and Fortran, and also to write additional primitives.
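The split between functions written in R and primitives can be checked from the interpreter itself. The following is a minimal sketch using only base R; the helper `rms` is a hypothetical user-defined function introduced purely for illustration.

```r
# A function written in R is a "closure": typing its name prints its R source.
r_level         <- stats::sd
sd_is_primitive <- is.primitive(r_level)  # FALSE, sd() is ordinary R code
sd_type         <- typeof(r_level)        # "closure"

# A primitive is implemented inside the interpreter, in compiled code.
sum_is_primitive <- is.primitive(sum)     # TRUE
sum_type         <- typeof(sum)           # "builtin"

# User code is interpreted in the same way as R-level library functions:
# no compilation step and no manual memory management.
# `rms` (root mean square) is a hypothetical example function.
rms <- function(x) sqrt(mean(x^2))
rms_value <- rms(c(3, 4))                 # sqrt((9 + 16) / 2)
```

Typing `stats::sd` at the prompt prints the function's R source, whereas typing `sum` shows only a reference to the primitive.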
Why R for New Energy?
R is easy to learn and easy to implement, and it keeps the focus on data science and large-scale data analytics without the burden of writing code for memory or machine management. In practice, renewable energy depends on scientific and operational data sets that are very large. Examples include:
- Ground sensor networks for site-specific resource and weather monitoring;
- Long-term time series from satellites which monitor the atmospheric boundary layer;
- SCADA systems that provide real-time performance and equipment diagnostics;
- High-frequency historic and real-time data on end-user demand; and
- Geospatial data for land use, environmental, and power network models.
R relies on a wide array of data object structures that make it easy to combine diverse data sets and formats. R’s data object system is also extensible and lends itself to customization for applied science models.
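As a rough illustration of combining data sets and extending the object system, the sketch below joins time-stamped sensor readings to site metadata and wraps the result in a small custom class. It uses only base R. All site names, column names, and values are made up for the example, and the `site_summary` class is hypothetical.

```r
# Hypothetical ground-sensor readings (values invented for illustration).
readings <- data.frame(
  site_id   = c("A", "A", "B", "B"),
  timestamp = as.POSIXct(c("2013-01-01 00:00", "2013-01-01 01:00",
                           "2013-01-01 00:00", "2013-01-01 01:00"),
                         tz = "UTC"),
  wind_ms   = c(6.2, 7.1, 4.8, 5.5)
)

# Hypothetical site metadata held in a separate table.
sites <- data.frame(
  site_id      = c("A", "B"),
  hub_height_m = c(80, 100)
)

# Join the two data sets on their shared key.
combined <- merge(readings, sites, by = "site_id")

# Mean wind speed per site, computed on the combined table.
mean_wind <- aggregate(wind_ms ~ site_id, data = combined, FUN = mean)

# Extend the object system with a small S3 class and its own print method.
site_summary <- structure(
  list(data = combined, means = mean_wind),
  class = "site_summary"
)

print.site_summary <- function(x, ...) {
  cat("Sites:", paste(x$means$site_id, collapse = ", "), "\n")
  cat("Mean wind speed (m/s):", paste(x$means$wind_ms, collapse = ", "), "\n")
  invisible(x)
}

print(site_summary)
```

The same pattern applies to larger tables; only the key column passed to `by` needs to be shared between the data sets.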
There is great practical value in open-source R. First, R is free. R has also enjoyed wide support for many years without exposing users to performance, credit, or reputation risk. More importantly, the source code (both the base system and all package extensions) is readily available with standardized documentation. Finally, R benefits from vendor support and is one of the most actively discussed programming languages in online forums.
Base R, a core set of packages, and package extensions can be obtained from the Comprehensive R Archive Network (CRAN).