read_csv( )

Import a CSV file as a tibble. read_csv() and read_csv2() are from the readr package, which is part of the tidyverse.

Original Documentation ↗ readr Cheatsheet ↗

Required Library

install.packages("tidyverse")
library(tidyverse)

Syntax

read_csv(file)

read_csv("file_name.csv") reads a comma-separated values (.csv) file and returns a tibble. Use it when your file uses commas as column separators and dots as decimal marks.

read_csv2(file)

read_csv2("file_name.csv") reads a .csv file that uses semicolons as column separators and commas for decimals. Note that the created tibble uses dots for decimals.

Why 2 different functions?

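The two functions exist because decimal conventions differ. The English standard writes decimals with a dot, so commas are free to separate columns. The Danish standard writes decimals with a comma, so columns are separated with semicolons instead. To find out how your .csv file is formatted, open it with a text editor on your computer or use the following code to print the first lines with R: read_lines("file.csv", n_max = 3)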
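As a minimal sketch of the difference, the code below reads the same two measurements in both formats. The data is made up for illustration and passed as literal text with I() (this needs readr 2.0 or newer), so no file is required.

```r
library(readr)  # also loaded by library(tidyverse)

# Hypothetical sample data, not taken from a real file
comma_text     <- I("tablet,weight_mg\n1,250.4\n2,249.8\n")
semicolon_text <- I("tablet;weight_mg\n1;250,4\n2;249,8\n")

# Commas between columns, dots for decimals
us <- read_csv(comma_text, show_col_types = FALSE)

# Semicolons between columns, commas for decimals
eu <- read_csv2(semicolon_text, show_col_types = FALSE)

us
eu
```

Both tibbles should hold the same numeric values, because read_csv2() converts the decimal commas while parsing.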

Argument Overview

Required arguments must be included when using a function while optional arguments can be included on demand.

file — path to the CSV file character | connection Required

A string giving the path to the CSV file. Use forward slashes (/) in file paths, even on Windows.

read_csv("data/capsule_weights.csv")              # relative path from working directory
read_csv("C:/Users/Morten/Documents/data.csv")    # absolute path on your computer

Data Types: character (file path or URL) | connection | I(text)

col_names — use first row as column names logical | character Optional

If TRUE (default), the first row is used as column names. If FALSE, column names are generated automatically (X1, X2, …). You can also supply a character vector to set names directly.

read_csv("data.csv", col_names = FALSE)
read_csv("data.csv", col_names = c("id", "dose", "response"))

Data Types: logical | character vector · Default: TRUE

col_select — select columns to read tidy-select Optional

Read only a subset of columns. Accepts tidy-select syntax — column names, ranges, or helper functions like starts_with().

read_csv("data.csv", col_select = c(id, dose, response))
read_csv("data.csv", col_select = starts_with("conc"))

Data Types: tidy-select expression · Default: NULL (all columns)

na — strings to interpret as missing character vector Optional

A character vector of strings that should be converted to NA when reading. Default is c("", "NA"). Extend this when your data uses other missing value codes.

read_csv("data.csv", na = c("", "NA", "N/A", "ND", "Missing", "."))

Data Types: character vector · Default: c("", "NA")

comment — comment character character Optional

Any text after this character on a line is ignored. Useful for files that contain comment lines starting with #.

read_csv("data.csv", comment = "#")

Data Types: single character · Default: "" (no comments)

trim_ws — trim whitespace from fields logical Optional

If TRUE (default), leading and trailing whitespace is stripped from each field before parsing.

read_csv("data.csv", trim_ws = FALSE)

Data Types: logical · Default: TRUE

skip — rows to skip at the top integer Optional

Number of lines to skip before reading the header row. Useful when CSV files exported from instruments contain metadata rows above the column names.

read_csv("instrument_export.csv", skip = 5)   # skip 5 metadata rows

Data Types: integer · Default: 0

n_max — maximum number of rows to read numeric Optional

Stop reading after this many data rows (not counting the header). Useful for previewing a large file or reading only a subset.

read_csv("large_dataset.csv", n_max = 100)   # first 100 rows only

Data Types: numeric · Default: Inf (read all rows)

name_repair — strategy for duplicate or invalid column names character | function Optional

How to handle column names that are duplicated or syntactically invalid. Common options: "unique" (default), "minimal", "universal", or a custom function.

read_csv("data.csv", name_repair = "universal")

Data Types: character | function · Default: "unique"

skip_empty_rows — skip blank rows logical Optional

If TRUE (default), completely empty rows are ignored. Set to FALSE to preserve them as rows of NA.

read_csv("data.csv", skip_empty_rows = FALSE)

Data Types: logical · Default: TRUE

Examples

Example File (.csv) ↗

Read a CSV file from your computer

In practice you pass a file path:
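A minimal sketch, assuming a small stand-in file: so that the code runs anywhere, it first writes a few made-up rows to a temporary location. The file name and values are hypothetical. With your own data you would skip that step and pass your own path.

```r
library(readr)  # also loaded by library(tidyverse)

# Create a small stand-in file (hypothetical content) in a temporary folder
path <- file.path(tempdir(), "dissolution_example.csv")
writeLines(c("tablet,time_min,absorbance",
             "1,5,0.12",
             "1,10,0.25"), path)

# Read the file by passing its path
dissolution <- read_csv(path, show_col_types = FALSE)
dissolution
```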

Handle missing value codes

The example file above has missing absorbance readings encoded three different ways — "N/A", "-", and a blank cell. Use the optional argument na to unify how all of them are handled, so they all become NA.
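The sketch below shows one way this could look. The rows are a made-up stand-in for the example file, passed as literal text with I() (readr 2.0 or newer), and the three missing readings use the three encodings described above.

```r
library(readr)  # also loaded by library(tidyverse)

# Hypothetical stand-in data: "N/A", "-" and a blank cell all mean "missing"
csv_text <- I("tablet,time_min,absorbance
1,5,0.12
1,10,N/A
2,5,-
2,10,
")

# List every missing value code so they are all read as NA
dissolution <- read_csv(csv_text,
                        na = c("", "NA", "N/A", "-"),
                        show_col_types = FALSE)
dissolution
```

Because all three codes become NA, absorbance can be parsed as a numeric column rather than as text.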

Skip blank rows left over from instrument exports

The example file also has a fully blank row between tablet 1 and tablet 2 — common after copy-pasting from Excel. By default skip_empty_rows = TRUE drops it automatically. Set it to FALSE if you need to keep it (it shows up as a row of NAs).
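A minimal sketch comparing both settings. It uses made-up rows with one blank line between the two tablets, passed as literal text with I() (readr 2.0 or newer).

```r
library(readr)  # also loaded by library(tidyverse)

# Hypothetical stand-in data with a fully blank line after tablet 1
csv_text <- I("tablet,absorbance\n1,0.12\n\n2,0.25\n")

# Default behaviour: the blank line is dropped
without_blank <- read_csv(csv_text, show_col_types = FALSE)

# Ask readr to keep blank lines instead
with_blank <- read_csv(csv_text, skip_empty_rows = FALSE,
                       show_col_types = FALSE)

nrow(without_blank)
nrow(with_blank)
```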

Select only the columns you need

The dissolution data has 5 columns (tablet, weight_mg, time_min, absorbance, dilution). Use col_select to keep only the ones relevant to your analysis — here, just the tablet ID and the measured absorbance over time.
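A minimal sketch using the five column names listed above. The values themselves are invented for illustration and passed as literal text with I() (readr 2.0 or newer).

```r
library(readr)  # also loaded by library(tidyverse)

# Hypothetical stand-in data with the five columns named in the text
csv_text <- I("tablet,weight_mg,time_min,absorbance,dilution
1,250.4,5,0.12,10
1,250.4,10,0.25,10
")

# Keep only the tablet ID, the time point and the absorbance
absorbance_data <- read_csv(csv_text,
                            col_select = c(tablet, time_min, absorbance),
                            show_col_types = FALSE)
absorbance_data
```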

Skip metadata rows from an instrument export

The example file starts with 2 lines of instrument metadata (device name, run date) before the real header row. Use skip to jump straight to the data — without it, read_csv() would try to use the device name as your column names.
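A minimal sketch, assuming two metadata lines like the ones described. The device name, run date, and measurements are placeholders, passed as literal text with I() (readr 2.0 or newer).

```r
library(readr)  # also loaded by library(tidyverse)

# Hypothetical export: two metadata lines, then the real header and data
csv_text <- I("Device: example spectrophotometer
Run date: example date
tablet,time_min,absorbance
1,5,0.12
1,10,0.25
")

# Skip the two metadata lines so the third line becomes the header
dissolution <- read_csv(csv_text, skip = 2, show_col_types = FALSE)
dissolution
```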

Preview the first rows of a large file

Use n_max to read in only a handful of rows — useful for previewing a large dataset without loading the whole thing.
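A minimal sketch: instead of a genuinely large file, it uses the small mtcars.csv that ships with readr and is located with readr_example(). Replace that path with your own file in practice.

```r
library(readr)  # also loaded by library(tidyverse)

# Path to a sample file bundled with readr (stand-in for a large dataset)
path <- readr_example("mtcars.csv")

# Read only the first 5 data rows (the header is not counted)
preview <- read_csv(path, n_max = 5, show_col_types = FALSE)
preview
```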

Ignore comment lines in the file

Some exported .csv files include annotation lines starting with #. Tell read_csv() to ignore everything after # on a line with the comment argument, so these annotations don’t get parsed as data.
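A minimal sketch with two made-up annotation lines, one above the header and one between data rows. The content is hypothetical and passed as literal text with I() (readr 2.0 or newer).

```r
library(readr)  # also loaded by library(tidyverse)

# Hypothetical export containing annotation lines that start with #
csv_text <- I("# exported by instrument software
tablet,absorbance
1,0.12
# tablet 2 measured after recalibration
2,0.25
")

# Lines starting with # are ignored instead of being parsed as data
dissolution <- read_csv(csv_text, comment = "#", show_col_types = FALSE)
dissolution
```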

Set your own column names

If a file has no header row, or you want clearer column names than the original file provides, supply a character vector to col_names. Make sure the order matches the columns in the file. If the file does have a header row, also use skip = 1 so that the original header is dropped rather than read as the first data row.
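A minimal sketch, assuming a file whose original header uses terse names. The header and values are invented for illustration and passed as literal text with I() (readr 2.0 or newer).

```r
library(readr)  # also loaded by library(tidyverse)

# Hypothetical stand-in data with unclear original column names
csv_text <- I("Tab,T,Abs\n1,5,0.12\n1,10,0.25\n")

# Replace the header: skip the original one and supply new names in file order
dissolution <- read_csv(csv_text,
                        col_names = c("tablet", "time_min", "absorbance"),
                        skip = 1,
                        show_col_types = FALSE)
dissolution
```

Without skip = 1, the original header line would be read as the first data row.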