How to Use the read.table Function in R


In R, one of the fundamental functions for reading data from text files (such as CSV and TSV files) into a data frame is read.table. This guide explains how to use it, from a basic call to its most useful arguments and its options for large files.


R’s read.table function is a versatile tool that allows you to import datasets from plain text files into R. The function reads the data into a data frame, which is a key data structure in R that stores data in a tabular format.

A key strength of read.table is its flexibility. It can handle files with different column separators, different numbers of rows and columns, missing data, and many other complexities. Additionally, read.table is a base R function, which means it is included with R and does not require any additional packages to be installed.

Basic Usage

The simplest way to use read.table is to call it with the name of the file you want to read:

data <- read.table("data.txt")

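In this example, read.table reads the file data.txt from the current working directory and stores the resulting data frame in the data variable. By default, read.table assumes that fields are separated by white space (spaces or tabs). It does not assume a header row: the first line is treated as column names only if it contains one fewer field than the remaining lines. Otherwise the first line is read as data and the columns are named V1, V2, and so on. Pass header = TRUE when the file has a header row.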
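The effect of the header setting is easy to check on a tiny file. The following is a minimal sketch; the temporary file and its contents are made up for illustration.

```r
# Write a small whitespace-separated file to a temporary location
# (file contents are invented for this example).
path <- tempfile(fileext = ".txt")
writeLines(c("name age", "Ann 31", "Bob 27"), path)

# With no extra arguments, the first line is read as data because it
# has the same number of fields as the other lines.
no_header <- read.table(path)
names(no_header)    # "V1" "V2"

# Asking for a header explicitly turns the first line into column names.
with_header <- read.table(path, header = TRUE)
names(with_header)  # "name" "age"
```

Note that no_header has three rows while with_header has two, because the first line is counted as data in the first call.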


read.table comes with a large number of optional arguments that give you fine control over how the data is read. Here are some of the most important ones:

  • file: A character string giving the name of the file to read.
  • header: A logical value indicating whether the file contains the names of the variables as its first line. If missing, the value is determined from the file format: header is set to TRUE if and only if the first row contains one fewer field than the number of columns.
  • sep: A character string that sets the field separator character. Values on each line of the file are separated by this character. If sep = "" (the default for read.table), the separator is ‘white space’, that is one or more spaces, tabs, newlines or carriage returns.
  • quote: A character string containing the set of quoting characters. To disable quoting altogether, use quote = "".
  • dec: A character string indicating the character used in the file for decimal points.
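  • row.names, col.names: These arguments specify the row and column names, respectively. row.names can be a vector of names, or a single number or name identifying the column of the file that holds the row names. col.names is a vector of names for the columns; if it is not given and there is no header, the columns are named "V" followed by the column number.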
  • na.strings: A character vector of strings that are to be interpreted as NA values. Blank fields are also considered to be missing values in logical, integer, numeric and complex fields.
  • stringsAsFactors: A logical value that controls whether character columns are converted to factors. Its default changed from TRUE to FALSE in R 4.0.0.

Here’s an example of using some of these options:

data <- read.table("data.csv", header = TRUE, sep = ",", quote = "\"", dec = ".", stringsAsFactors = FALSE)

In this example, read.table is set to read a CSV file with a header row. The field separator is set to a comma, the quoting character is a double quote, the decimal point character is a period, and character data is read as character vectors, not factors.
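To see several of these arguments working together without creating a file, you can pass the data as a string through the text argument of read.table. This is a small sketch with invented sample data: semicolon-separated fields, a comma as the decimal mark, and "-" used as an extra missing-value marker.

```r
# Inline sample data (invented for illustration); text = lets read.table
# parse a string instead of a file.
csv_text <- 'id;city;price
1;"New York";10,5
2;"Paris";NA
3;"Rome";-'

prices <- read.table(
  text = csv_text,
  header = TRUE,
  sep = ";",                   # fields are separated by semicolons
  quote = "\"",                # double quotes delimit quoted fields
  dec = ",",                   # comma is the decimal mark
  na.strings = c("NA", "-"),   # treat both NA and - as missing
  stringsAsFactors = FALSE
)

str(prices)
```

Here price comes back as a numeric column containing 10.5 followed by two NA values, and city stays a character vector.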

Dealing with Large Files

When working with large data files, reading the entire file into memory with read.table may not be feasible. In that case, you can use the nrows argument to limit how many rows are read from the file:

data <- read.table("large_data.csv", header = TRUE, sep = ",", nrows = 1000)

This code reads only the first 1000 data rows from the file. The header line is not counted toward nrows.
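nrows can also be combined with the skip argument of read.table to read a block of rows from the middle of a file. The sketch below is one possible approach, not the only one; the small generated file stands in for a large one, and the offsets are arbitrary.

```r
# Build a small CSV to stand in for a large file (contents are made up).
path <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:10, value = (1:10) * 1.5), path, row.names = FALSE)

# Read a single row with the header to recover the column names.
col_names <- names(read.table(path, header = TRUE, sep = ",", nrows = 1))

# Read data rows 4 to 6: skip the header line plus the first 3 data rows,
# then supply the column names by hand because the header was skipped.
chunk <- read.table(path, header = FALSE, sep = ",",
                    skip = 1 + 3, nrows = 3, col.names = col_names)

chunk$id  # 4 5 6
```

Each call still scans the file from the beginning up to the skipped position, so this keeps memory use down but does not make later chunks faster to reach.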

You can also use the colClasses argument to specify the class of each column in the data frame, with one entry per column in file order. This can noticeably speed up reading large files because read.table no longer has to work out the class of each column from the data.

classes <- c("numeric", "character", "Date")
data <- read.table("large_data.csv", header = TRUE, sep = ",", colClasses = classes)

In this code, the colClasses argument specifies that the first column should be read as numeric, the second as character, and the third as Date. The Date conversion expects dates in a format that as.Date recognizes by default, such as 2023-01-15.
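A quick way to confirm that the classes were applied is to inspect the result with sapply. The following sketch uses a made-up three-column file; the column names and values are assumptions for illustration only.

```r
# Small sample file (invented data) with a numeric, a text, and a date column.
path <- tempfile(fileext = ".csv")
writeLines(c("amount,code,day",
             "1.5,001,2023-01-15",
             "2,002,2023-02-01"), path)

classes <- c("numeric", "character", "Date")
typed <- read.table(path, header = TRUE, sep = ",", colClasses = classes)

# Check the class of every column.
sapply(typed, class)

# Leading zeros survive because the column is read as character.
typed$code  # "001" "002"
```

Declaring code as character is what preserves the leading zeros; left to its own devices, read.table would convert that column to integers.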


The read.table function in R provides a powerful and flexible way to import data from text files into R. Understanding its various options and arguments will allow you to efficiently work with data in R, regardless of how the data is formatted in your files.

However, keep in mind that while read.table is highly flexible and widely used, it is not always the fastest or most memory-efficient option for reading large data files. Add-on packages such as readr (with read_csv) and data.table (with fread) provide faster functions for reading text data and may be preferable for large datasets.
