R – Create DataFrame

To create a DataFrame in the R programming language, call the data.frame() function and pass the initial data as arguments.

In this tutorial, we will go through the syntax of the data.frame() function and how to use it to create a DataFrame, with examples.

Syntax

The syntax of the data.frame() function is

data.frame(..., row.names = NULL, check.rows = FALSE,
           check.names = TRUE, fix.empty.names = TRUE,
           stringsAsFactors = FALSE)

The default value of each argument is shown in the definition above. The arguments are described in the following table.

Argument          Description
...               One or more arguments, each usually of the form value or tag = value. value is typically a vector.
row.names         NULL, a single integer or character string specifying a column to be used as row names, or a character or integer vector giving the row names.
check.rows        If TRUE, the rows are checked for consistency of length and names.
check.names       If TRUE, the names of the variables in the DataFrame are checked to ensure that they are syntactically valid variable names and are not duplicated. If necessary, they are adjusted so that they are.
fix.empty.names   If TRUE, unnamed arguments get an automatically constructed name. If FALSE, they get the empty name "".
stringsAsFactors  If TRUE, character vectors are converted to factors. The default is FALSE since R 4.0.0.
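The effect of check.names and stringsAsFactors is easier to see on a small case. The following is a minimal sketch, not part of the original examples. The column name x and the city values are made up for illustration, and stringsAsFactors is set explicitly in both calls so the result does not depend on the R version.

```r
# Two columns are deliberately given the same name "x".
checked   <- data.frame(x = 1:2, x = 3:4)                        # check.names = TRUE (default)
unchecked <- data.frame(x = 1:2, x = 3:4, check.names = FALSE)   # names kept as given

print(names(checked))    # duplicate is made unique: "x" "x.1"
print(names(unchecked))  # duplicate is kept: "x" "x"

# stringsAsFactors controls how character vectors are stored.
as_char   <- data.frame(city = c("Pune", "Oslo"), stringsAsFactors = FALSE)
as_factor <- data.frame(city = c("Pune", "Oslo"), stringsAsFactors = TRUE)

print(class(as_char$city))    # "character"
print(class(as_factor$city))  # "factor"
```

Leaving check.names at its default is usually the safer choice, since duplicated column names make selection by name ambiguous.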

Examples

In the following examples, we will go through different scenarios of creating a DataFrame in R, based on the argument values.

1. Create a simple DataFrame

In the following program, we create a DataFrame from three vectors, which are used as its columns. We provide each vector in the form tag = value. The tag becomes the column name, and the value supplies the column's values.

example.R

df <- data.frame(a = c(41, 42, 43, 44),
                 b = c(45, 46, 47, 48),
                 c = c(49, 50, 51, 52))
print(df)

Output

   a  b  c
1 41 45 49
2 42 46 50
3 43 47 51
4 44 48 52

a, b, and c are the column names. Since no row names are provided, R uses the default row names 1, 2, 3, and so on.
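To confirm the column names, the default row names, and the shape of the result, the DataFrame can be inspected with base R functions. This is a small sketch that rebuilds the same df as above and queries it with names(), rownames(), and dim().

```r
df <- data.frame(a = c(41, 42, 43, 44),
                 b = c(45, 46, 47, 48),
                 c = c(49, 50, 51, 52))

print(names(df))     # column names taken from the tags
print(rownames(df))  # default row names, returned as character strings
print(dim(df))       # number of rows, then number of columns
```

Note that rownames() returns the default row names as the strings "1" to "4", even though they are printed like numbers.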

If an argument is passed without a tag, R generates a column name for it automatically.

In the following program, the first argument to the data.frame() function is a vector with no tag, and therefore no explicit column name.

example.R

df <- data.frame(c(41, 42, 43, 44),
                 b = c(45, 46, 47, 48),
                 c = c(49, 50, 51, 52))
print(df)

Output

  c.41..42..43..44.  b  c
1                41 45 49
2                42 46 50
3                43 47 51
4                44 48 52

Since the first argument is a value with no tag, R generated a column name from the text of the expression. In this case the name is c.41..42..43..44. (the trailing period is part of the name).
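The generated name is the text of the argument expression after it has been made syntactically valid. The sketch below illustrates this with check.names = FALSE and make.names(), and then shows one way to replace the generated name afterwards. The variable name df_raw and the replacement name a are chosen here only for illustration.

```r
# With check.names = FALSE, the unnamed column keeps the raw expression text.
df_raw <- data.frame(c(41, 42, 43, 44),
                     b = c(45, 46, 47, 48),
                     check.names = FALSE)
print(names(df_raw))

# make.names() replaces invalid characters with "." to form a valid name,
# which is what happens under the default check.names = TRUE.
print(make.names("c(41, 42, 43, 44)"))

# A generated name can be replaced after the DataFrame is created.
df <- data.frame(c(41, 42, 43, 44),
                 b = c(45, 46, 47, 48))
names(df)[1] <- "a"
print(names(df))
```

In practice, passing every column as tag = value avoids generated names altogether.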

2. Create DataFrame with Row Names

In the following program, we create a DataFrame with specific row names. We pass a character vector as the row.names argument in the data.frame() function call.

example.R

df <- data.frame(a = c(41, 42, 43, 44),
                 b = c(45, 46, 47, 48),
                 c = c(49, 50, 51, 52),
                 row.names = c('p', 'q', 'r', 's'))
print(df)

Output

   a  b  c
p 41 45 49
q 42 46 50
r 43 47 51
s 44 48 52
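Row names can be used to select rows, and row.names also accepts the name of a column whose values should become the row names. The following sketch shows both. The column name id and the variable df2 are made up for this illustration.

```r
df <- data.frame(a = c(41, 42, 43, 44),
                 b = c(45, 46, 47, 48),
                 row.names = c("p", "q", "r", "s"))

print(rownames(df))  # the row names supplied above
print(df["q", ])     # select a whole row by its name
print(df["q", "b"])  # select a single value by row and column name

# row.names given as a single column name: that column supplies the
# row names and is removed from the resulting columns.
df2 <- data.frame(id = c("p", "q", "r", "s"),
                  a = c(41, 42, 43, 44),
                  row.names = "id")
print(df2)
```

Row names must be unique, so a column used this way should not contain repeated values.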

Conclusion

In this R tutorial, we learned how to create a DataFrame in R using the data.frame() function, with the help of examples.