This article provides a comprehensive overview of data types in R, discussing primitive data types, composite data types, and conversion between types, with illustrative examples.
Primitive Data Types
Primitive data types, which R implements as atomic vectors, are the most basic types in R. The commonly used ones are numeric, integer, complex, logical, and character (a sixth type, raw, holds bytes and is rarely needed).
Numeric is the default computational data type in R. Any number written without an L suffix, with or without a decimal point, is stored as numeric (double precision). For example:
x <- 20.22
class(x) # Output: "numeric"
An integer is a whole number with no fractional part. To store a value as an integer in R, you append an L to the end of the number:
y <- 10L
class(y) # Output: "integer"
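To make the numeric/integer distinction concrete, the short sketch below compares the same literal written with and without the L suffix. The variable names are illustrative only.

```r
# A whole number typed without the L suffix is still stored as a double
whole <- 10
with_suffix <- 10L

class(whole)         # "numeric"
typeof(whole)        # "double"
class(with_suffix)   # "integer"
typeof(with_suffix)  # "integer"

# The two values compare equal, but they are not the same type
whole == with_suffix           # TRUE
identical(whole, with_suffix)  # FALSE
```

Here typeof() reports the internal storage mode, which is why it says "double" where class() says "numeric".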
R has support for complex numbers. A complex number is a number with a real and an imaginary part:
z <- 3 + 2i
class(z) # Output: "complex"
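As a small illustration of working with the two parts of a complex number, the following sketch uses base R's Re(), Im(), Conj(), and Mod() functions on the same value.

```r
z <- 3 + 2i

Re(z)    # 3, the real part
Im(z)    # 2, the imaginary part
Conj(z)  # 3-2i, the complex conjugate
Mod(z)   # the modulus, sqrt(3^2 + 2^2)
```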
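The logical data type represents the Boolean values TRUE and FALSE. The values are case sensitive, and R also recognizes T and F as shortcuts (these are ordinary variables that can be reassigned, so the full names are safer):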
a <- TRUE
b <- F
class(a) # Output: "logical"
class(b) # Output: "logical"
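Logical values most often arise from comparisons, and R treats TRUE as 1 and FALSE as 0 in arithmetic. A minimal sketch, with made-up variable names:

```r
flags <- c(TRUE, FALSE, TRUE, TRUE)

sum(flags)   # 3, the number of TRUE values
mean(flags)  # 0.75, the proportion of TRUE values

# Comparisons return logical vectors
scores <- c(4, 9, 7)
scores > 5   # FALSE TRUE TRUE
```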
Any data enclosed in quotes is considered a character string in R:
str <- "Hello, World!"
class(str) # Output: "character"
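A few base R functions for character strings are sketched below. The variable name greeting is illustrative; note that a number wrapped in quotes is a character string, not a number.

```r
greeting <- "Hello, World!"

nchar(greeting)              # 13, the number of characters
toupper(greeting)            # "HELLO, WORLD!"
paste(greeting, "Goodbye.")  # "Hello, World! Goodbye."

# Quoted digits are text, and single quotes work the same as double quotes
class("123")    # "character"
class('text')   # "character"
```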
Composite Data Types
Beyond these basic types, R provides several composite data types that allow you to work with collections of data, including vectors, lists, matrices, arrays, factors, and data frames.
A vector is a sequence of data elements of the same basic type. Members in a vector are officially called components. You can create a vector using the c() function:
vec <- c(1, 2, 3)
class(vec) # Output: "numeric"
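Because every element of a vector must share one type, c() silently coerces mixed inputs to the most general type present. The sketch below shows two such cases; the variable names are illustrative.

```r
# Mixing numbers, text, and logicals yields a character vector
mixed <- c(1, "a", TRUE)
class(mixed)  # "character"
mixed         # "1" "a" "TRUE"

# Mixing doubles, logicals, and integers yields a numeric vector
nums_and_flags <- c(1.5, TRUE, 2L)
class(nums_and_flags)  # "numeric"
nums_and_flags         # 1.5 1.0 2.0
```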
A list is a special type of vector that can contain elements of different types:
my_list <- list(1, "a", TRUE, 1 + 1i)
class(my_list) # Output: "list"
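One way to see that each list element keeps its own type is to inspect the elements individually. The sketch below also contrasts single brackets, which return a sub-list, with double brackets, which return the element itself.

```r
my_list <- list(1, "a", TRUE, 1 + 1i)

# Each element retains its original type
sapply(my_list, class)  # "numeric" "character" "logical" "complex"

class(my_list[2])    # "list": single brackets give a list of length 1
class(my_list[[2]])  # "character": double brackets extract the element
```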
A matrix is a two-dimensional array where each element has the same atomic type. You can create a matrix using the matrix() function:
mat <- matrix(1:9, nrow = 3, ncol = 3)
class(mat) # Output: "matrix" "array" (just "matrix" before R 4.0.0)
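By default matrix() fills values column by column; the byrow argument changes that. The following sketch shows the difference by reading the same position from both layouts (by_row is an illustrative name).

```r
mat <- matrix(1:9, nrow = 3, ncol = 3)
dim(mat)        # 3 3
mat[2, 3]       # 8, because columns are filled first
is.matrix(mat)  # TRUE on any R version

# Fill row by row instead
by_row <- matrix(1:9, nrow = 3, byrow = TRUE)
by_row[2, 3]    # 6
```

Using is.matrix() or inherits(mat, "matrix") is a version-independent way to test for a matrix, since the value returned by class() differs across R releases.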
Like matrices, arrays are also multi-dimensional. However, arrays can have more than two dimensions:
arr <- array(1:24, dim = c(3, 4, 2))
class(arr) # Output: "array"
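Indexing an array takes one index per dimension. As a rough sketch, the example below reads single elements and extracts one two-dimensional slice from the 3 x 4 x 2 array.

```r
arr <- array(1:24, dim = c(3, 4, 2))

dim(arr)      # 3 4 2
arr[2, 3, 1]  # 8: row 2, column 3 of the first slice
arr[1, 1, 2]  # 13: first element of the second slice

# Leaving an index empty selects everything along that dimension
first_slice <- arr[, , 1]
dim(first_slice)        # 3 4
is.matrix(first_slice)  # TRUE
```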
Factors are used to represent categorical data and can be ordered or unordered. You can create a factor using the factor() function:
fac <- factor(c("low", "high", "medium", "high", "low", "medium"))
class(fac) # Output: "factor"
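The difference between unordered and ordered factors can be sketched as follows. By default the levels are sorted alphabetically; passing levels and ordered = TRUE to factor() imposes a meaningful order. The name ordered_fac is illustrative.

```r
values <- c("low", "high", "medium", "high", "low", "medium")

# Unordered: levels default to alphabetical order
fac <- factor(values)
levels(fac)  # "high" "low" "medium"

# Ordered: levels follow the order given explicitly
ordered_fac <- factor(values,
                      levels = c("low", "medium", "high"),
                      ordered = TRUE)
levels(ordered_fac)              # "low" "medium" "high"
ordered_fac[1] < ordered_fac[2]  # TRUE, since "low" < "high"
```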
A data frame is a table or a two-dimensional array-like structure where each column contains values of one variable and each row contains one set of values from each column. Columns can be of different types:
df <- data.frame(
  numbers = 1:3,
  letters = c("a", "b", "c"),
  logical = c(TRUE, FALSE, TRUE)
)
class(df) # Output: "data.frame"
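To confirm that each column keeps its own type, one option is to apply class() to every column. The sketch below sets stringsAsFactors = FALSE explicitly, because R versions before 4.0.0 converted character columns to factors by default.

```r
df <- data.frame(
  numbers = 1:3,
  letters = c("a", "b", "c"),
  logical = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE  # explicit for consistency across R versions
)

# One class per column
sapply(df, class)  # numbers: "integer", letters: "character", logical: "logical"

dim(df)            # 3 3
class(df$numbers)  # "integer"
```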
Converting Data Types
You can convert data from one type to another using various functions in R, including the as.*() family (for example as.numeric(), as.integer(), as.character(), and as.logical()):
# Convert a character to numeric
x <- "123"
x <- as.numeric(x)
class(x) # Output: "numeric"
It’s important to note that not all conversions are valid: an inappropriate conversion, such as as.numeric("abc"), returns NA and issues a warning rather than stopping with an error.
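The sketch below illustrates a few conversions, including ones that yield NA. It uses suppressWarnings() only to keep the output quiet; the factor example at the end shows a common pitfall, where as.numeric() returns the internal level codes rather than the labels.

```r
# Invalid conversions give NA (normally with a warning)
bad <- suppressWarnings(as.numeric("abc"))
is.na(bad)          # TRUE
as.logical("yes")   # NA: only strings such as "TRUE" or "false" are recognized
as.logical("TRUE")  # TRUE

# Valid conversions may still change the value
as.integer(3.9)     # 3: the fractional part is dropped
as.character(42L)   # "42"

# Factors convert to their level codes, not their labels
f <- factor(c("10", "20", "10"))
as.numeric(f)                # 1 2 1
as.numeric(as.character(f))  # 10 20 10
```

Checking the result with is.na() after a conversion is a simple way to catch values that could not be converted.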
In summary, understanding and working with different data types is fundamental in R programming. Each data type has its specific use, and mastering how to convert between types and manipulate data structures can enhance the effectiveness and efficiency of data analysis in R. Whether it’s performing numerical computations, handling text data, or managing larger data structures like data frames, the right use of data types can make the job significantly easier.