R Data Types

This article provides a comprehensive overview of data types in R, discussing primitive data types, composite data types, and conversion between types, with illustrative examples.

Primitive Data Types

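Primitive data types are the element types of R's atomic vectors and are the most basic types in R. The most commonly used are numeric, integer, complex, logical, and character. (R also has a raw type for bytes, which this article does not cover.)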

Numeric

Numeric is the default computational data type in R. Any number written without an L suffix is stored as numeric (double precision), whether or not it has a decimal point. For example:

x <- 20.22
class(x)
# Output: "numeric"
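
As a small sketch (the variable name n is only illustrative), a whole number typed without a decimal point is still numeric, and typeof() reveals the underlying double-precision storage:

```r
# Whole numbers without an L suffix are still numeric
n <- 20
class(n)        # "numeric"
typeof(n)       # "double" (the storage type behind "numeric")
is.numeric(n)   # TRUE
```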

Integer

An integer is a whole number stored without a fractional part. A whole number typed on its own (such as 10) is still numeric, so to create an integer in R you append an L to the number:

y <- 10L
class(y)
# Output: "integer"
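
The effect of the L suffix can be checked directly. The sketch below uses only base R functions and illustrative values; it also shows that ordinary division of two integers gives a numeric result, while integer division keeps the integer type:

```r
# Without the L suffix the value is numeric, not integer
class(10)         # "numeric"
class(10L)        # "integer"

# is.integer() checks the storage type, not whether the value is whole
is.integer(10)    # FALSE
is.integer(10L)   # TRUE

# Ordinary division returns numeric; integer division stays integer
class(5L / 2L)    # "numeric"
class(5L %/% 2L)  # "integer"
```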

Complex

R has support for complex numbers. A complex number is a number with a real and an imaginary part:

z <- 3 + 2i
class(z)
# Output: "complex"
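
Base R includes functions for taking a complex number apart. A brief sketch, using a different illustrative value (3 + 4i) so that the modulus comes out as a whole number:

```r
w <- 3 + 4i
Re(w)     # 3 (real part)
Im(w)     # 4 (imaginary part)
Mod(w)    # 5 (modulus, sqrt(3^2 + 4^2))
Conj(w)   # 3-4i (complex conjugate)
```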

Logical

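The logical data type represents the Boolean values TRUE and FALSE. The values are case sensitive (true and True are not recognized). R also accepts T and F as shortcuts, but these are ordinary variables that can be reassigned, so TRUE and FALSE are the safer choice in scripts: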

a <- TRUE
b <- F
class(a)
# Output: "logical"
class(b)
# Output: "logical"
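
The following sketch, with an illustrative vector named flags, shows two practical points: logical values count as 1 and 0 in arithmetic, and T is an ordinary variable that can be overwritten by accident:

```r
# Logical values behave as 1 and 0 in arithmetic
flags <- c(TRUE, FALSE, TRUE, TRUE)
sum(flags)    # 3 (number of TRUE values)
mean(flags)   # 0.75 (proportion of TRUE values)

# T is a regular variable, so it can be overwritten
T <- 0
isTRUE(T)     # FALSE, because T now holds the number 0
rm(T)         # remove our variable; T refers to TRUE again
isTRUE(T)     # TRUE
```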

Character

Any text enclosed in single or double quotes is a character string in R:

str <- "Hello, World!"
class(str)
# Output: "character"
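
A few base R functions for working with character strings are sketched below. The variable name greeting is illustrative; the last line is a reminder that a number in quotes is text, not a number:

```r
greeting <- "Hello, World!"
nchar(greeting)            # 13 (number of characters)
toupper(greeting)          # "HELLO, WORLD!"
substr(greeting, 1, 5)     # "Hello"
paste(greeting, "Bye.")    # "Hello, World! Bye."

# Single and double quotes create the same kind of value
identical('text', "text")  # TRUE

# A number in quotes is a character string
class("42")                # "character"
```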

Composite Data Types

Beyond these basic types, R provides several composite data types that allow you to work with collections of data, including vectors, lists, matrices, arrays, factors, and data frames.

Vectors

A vector is a sequence of data elements of the same basic type. Members in a vector are officially called components. You can create a vector using the c() function:

vec <- c(1, 2, 3)
class(vec)
# Output: "numeric"
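
Because every element of a vector must have the same type, c() silently converts mixed inputs to the most general type among them. A short sketch with illustrative values:

```r
# Mixing types in c() coerces everything to one common type
class(c(1L, 2.5))       # "numeric"   (integer and numeric)
class(c(1, TRUE))       # "numeric"   (TRUE becomes 1)
class(c(1, "a", TRUE))  # "character" (everything becomes text)
c(1, "a", TRUE)         # "1" "a" "TRUE"

# Elements are accessed by position, starting at 1
scores <- c(10, 20, 30)
scores[2]               # 20
length(scores)          # 3
```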

Lists

A list is a special type of vector that can contain elements of different types:

my_list <- list(1, "a", TRUE, 1 + 1i)
class(my_list)
# Output: "list"
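
Unlike c(), list() leaves each element's type untouched. The sketch below also shows the difference between [[ ]] and [ ] when accessing elements; the named list person is hypothetical example data:

```r
my_list <- list(1, "a", TRUE, 1 + 1i)

# Each element keeps its own type
sapply(my_list, class)   # "numeric" "character" "logical" "complex"

# [[ ]] extracts an element; [ ] returns a shorter list
class(my_list[[2]])      # "character"
class(my_list[2])        # "list"

# Elements can be named and then accessed with $
person <- list(name = "Ada", age = 36)
person$name              # "Ada"
```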

Matrices

A matrix is a two-dimensional array where each element has the same atomic type. You can create a matrix using the matrix() function:

mat <- matrix(1:9, nrow = 3, ncol = 3)
class(mat)
# Output: "matrix" "array" (R 4.0.0 and later; older versions print only "matrix")
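
One detail worth checking is the fill order: matrix() fills column by column unless byrow = TRUE is given. A minimal sketch (the name by_row is illustrative):

```r
# Default: values fill the first column, then the second, and so on
mat <- matrix(1:9, nrow = 3, ncol = 3)
mat[1, 2]        # 4 (row 1, column 2)
dim(mat)         # 3 3

# With byrow = TRUE the values fill row by row instead
by_row <- matrix(1:9, nrow = 3, ncol = 3, byrow = TRUE)
by_row[1, 2]     # 2

# All elements share one type
typeof(mat)      # "integer"
```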

Arrays

Arrays generalize matrices: a matrix always has exactly two dimensions, while an array can have any number of dimensions:

arr <- array(1:24, dim = c(3, 4, 2))
class(arr)
# Output: "array"
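
Array elements are indexed with one position per dimension. As a sketch using the same 3 x 4 x 2 array, note that the values fill the first dimension fastest, so the second slice starts at 13:

```r
arr <- array(1:24, dim = c(3, 4, 2))
dim(arr)         # 3 4 2
arr[2, 3, 1]     # 8  (row 2, column 3, slice 1)
arr[1, 1, 2]     # 13 (first element of the second slice)
dim(arr[, , 2])  # 3 4 (a single slice is a matrix)
```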

Factors

Factors are used to represent categorical data and can be ordered or unordered. You can create a factor using the factor() function:

fac <- factor(c("low", "high", "medium", "high", "low", "medium"))
class(fac)
# Output: "factor"
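
Internally, a factor stores integer codes that point into a set of levels. The sketch below inspects those levels and then builds an ordered factor with an explicit level order; the variable name sizes is illustrative:

```r
fac <- factor(c("low", "high", "medium", "high", "low", "medium"))
levels(fac)        # "high" "low" "medium" (sorted alphabetically by default)
as.integer(fac)    # 2 1 3 1 2 3 (internal level codes)
table(fac)         # counts per level (2 each here)

# An ordered factor with an explicit level order
sizes <- factor(c("low", "high", "medium"),
                levels = c("low", "medium", "high"),
                ordered = TRUE)
sizes[1] < sizes[2]   # TRUE, because "low" comes before "high"
class(sizes)          # "ordered" "factor"
```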

Data Frames

A data frame is a table or a two-dimensional array-like structure where each column contains values of one variable and each row contains one set of values from each column. Columns can be of different types:

df <- data.frame(
  numbers = 1:3,
  letters = c("a", "b", "c"),
  logical = c(TRUE, FALSE, TRUE)
)
class(df)
# Output: "data.frame"
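
To confirm that each column keeps its own type, you can inspect the columns individually. In this sketch, stringsAsFactors = FALSE is set explicitly as a precaution, because R versions before 4.0.0 converted character columns to factors by default:

```r
df <- data.frame(
  numbers = 1:3,
  letters = c("a", "b", "c"),
  logical = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE   # keep text as character in every R version
)

sapply(df, class)   # numbers: "integer", letters: "character", logical: "logical"
dim(df)             # 3 3
df$numbers          # 1 2 3 (one column as a vector)
df[2, "letters"]    # "b" (row 2 of the letters column)
str(df)             # prints a compact summary of the structure
```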

Converting Data Types

You can convert data from one type to another using various functions in R, including as.numeric(), as.integer(), as.logical(), as.character(), etc.:

# Convert a character to numeric
x <- "123"
x <- as.numeric(x)
class(x)
# Output: "numeric"

Not all conversions are valid. When a value cannot be converted, for example as.numeric("abc"), R returns NA and issues a warning rather than stopping with an error.
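
A few conversions behave in ways that are easy to get wrong. The sketch below uses only base R functions with illustrative values; suppressWarnings() is used here just to keep the output quiet:

```r
# A failed conversion gives NA (normally with a warning)
bad <- suppressWarnings(as.numeric("abc"))
is.na(bad)               # TRUE

# as.integer() truncates toward zero; it does not round
as.integer(3.9)          # 3
as.integer(-3.9)         # -3

# Logical conversions
as.logical("TRUE")       # TRUE
as.logical("yes")        # NA
as.logical(0)            # FALSE
as.numeric(TRUE)         # 1

# Converting a factor of numbers: go through character first
f <- factor(c("10", "20", "10"))
as.numeric(f)                  # 1 2 1 (level codes, not the values)
as.numeric(as.character(f))    # 10 20 10
```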

In summary, understanding and working with different data types is fundamental in R programming. Each data type has its specific use, and mastering how to convert between types and manipulate data structures can enhance the effectiveness and efficiency of data analysis in R. Whether it’s performing numerical computations, handling text data, or managing larger data structures like data frames, the right use of data types can make the job significantly easier.
