Data Types in R

In this R tutorial, we shall learn about the atomic data types of R, the data types of commonly used R objects, their syntax, and example R commands for each.

While writing a program, you may need to store your data in variables. This data might be of different types such as an integer, a string, or an array of integers. The data type determines how a value is represented in memory and which operations are valid on it. Data types also help the programmer understand what kind of data they are handling or manipulating.

Unlike statically typed languages, R derives the data type of a variable implicitly from the R object assigned to it.
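
As a small illustration of this implicit typing, the sketch below reassigns one variable several times and records what class() reports each time. The variable names are chosen only for this example.

```r
# The same variable name can hold objects of different types;
# class() reports the type of whatever object is currently assigned.
x <- 42L                 # the L suffix makes this an integer
class_int <- class(x)

x <- "forty-two"         # reassigning a string changes the reported class
class_chr <- class(x)

x <- 42                  # a plain number without L is stored as numeric
class_num <- class(x)

print(c(class_int, class_chr, class_num))
# [1] "integer"   "character" "numeric"
```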

There are many types of R objects, but all of them are built from R atomic data types. The R programming language has six atomic data types.

Atomic Vectors of R Atomic Data Types

Following are the six types of vectors that could be built from R atomic data types.

Data Type   Example                                    Description
Logical     TRUE, FALSE                                Boolean values
Numeric     2, 45.9, 3782                              Numbers of all kinds (double precision by default)
Integer     9L, 779L                                   Explicitly integers
Complex     8+9i                                       Real part + imaginary part
Character   'm', "hello"                               Characters and strings
Raw         68 65 6c 6c 6f (bytes of string "hello")   Any data stored as raw bytes

Note: When the data type is Raw, the user has to know the format or protocol of the data in order to interpret the bytes.
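
To make the note concrete, here is a minimal sketch that converts a string to raw bytes and back. It assumes the bytes are known to be plain ASCII text, which is exactly the kind of format knowledge the note refers to.

```r
# Convert a string to its raw bytes.
bytes <- charToRaw("hello")
print(bytes)
# [1] 68 65 6c 6c 6f

# The same bytes viewed as integers (decimal code points of h, e, l, l, o).
codes <- as.integer(bytes)
print(codes)
# [1] 104 101 108 108 111

# Interpreting the bytes as text only works because we know they encode text.
text <- rawToChar(bytes)
print(text)
# [1] "hello"
```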

Examples of Atomic Vectors

We shall run the following commands to assign values of different data types to a variable and print the class of the variable to verify its data type.

Logical

> x <- TRUE
> print(class(x))
[1] "logical"

Numeric

> x <- 67.54
> print(class(x))
[1] "numeric"

Integer

> x <- 63L
> print(class(x))
[1] "integer"

Complex

> x <- 6 + 4i
> print(class(x))
[1] "complex"

Character

> x <- "hello"
> print(class(x))
[1] "character"

Raw

> x <- charToRaw("hello")
> print(class(x))
[1] "raw"
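
The six checks above can also be run in one pass. The sketch below stores one value of each atomic type in a list and applies class() and typeof() to every item. typeof() is not used elsewhere in this tutorial; it is included here only to show that a numeric value is stored internally as a double.

```r
# One value of each atomic data type.
values <- list(TRUE, 67.54, 63L, 6 + 4i, "hello", charToRaw("hello"))

# class() gives the type names used in this tutorial.
classes <- sapply(values, class)
print(classes)
# [1] "logical"   "numeric"   "integer"   "complex"   "character" "raw"

# typeof() gives the internal storage type; note "double" for numeric.
types <- sapply(values, typeof)
print(types)
# [1] "logical"   "double"    "integer"   "complex"   "character" "raw"
```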

Data Types of R – Objects

As already mentioned, there are many types of R objects. We shall look into some of the most commonly used ones:

  • Vectors
  • Lists
  • Matrices
  • Arrays
  • Factors
  • Data Frames

We shall look in detail at each of these data types.

R Vectors

In R programming language, a Vector is an ordered collection of values that share a single data type. The vector takes the data type of the items in the collection; if items of different types are combined, R coerces them to a common type.

Syntax – Define a Vector

variable <- c(comma separated values belonging to one data type)

For example, c('apple','orange',"banana") is a vector: a collection of values of data type Character, so it is a Character Vector. Integer Vectors and Complex Vectors are built the same way.

Following is an example of a Character Vector. We shall learn how to assign an R Character Vector to a variable, print the vector and verify the data type of vector.

> fruits = c('apple','orange',"banana")
> print(class(fruits))
[1] "character"
> print(fruits)
[1] "apple"  "orange" "banana"
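
The following sketch shows two further properties of vectors: elements are accessed by 1-based index, and c() converts mixed inputs to one common type. The variable names mixed and nums are chosen only for this example.

```r
fruits <- c("apple", "orange", "banana")

# Vectors are indexed from 1.
n <- length(fruits)
second <- fruits[2]
print(second)
# [1] "orange"

# Mixing types in c() coerces every item to one common type.
mixed <- c(1L, 2.5, "three")   # the character item wins
print(class(mixed))
# [1] "character"

nums <- c(1L, 2.5)             # integer and numeric combine to numeric
print(class(nums))
# [1] "numeric"
```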

R Lists

In R programming language, a List is a collection of List Items (R Objects) that may belong to different data types. A List may contain another List as an item. A List Item may be a Matrix, an Array, a Factor, an R function or any other R Object.

Syntax to Define List

variable <- list(comma separated list items)

Following is an example of an R List. We shall learn how to build a List from a number, a character string, a function and another List, assign it to a variable and print it.

> listX = list(51,"hello",tan,list(8L,"a"))
> print(listX)
[[1]]
[1] 51

[[2]]
[1] "hello"

[[3]]
function (x)  .Primitive("tan")

[[4]]
[[4]][[1]]
[1] 8

[[4]][[2]]
[1] "a"

Please observe that the fourth and last item in the list is another list.
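
The [[n]] labels in the printed output are also the way to reach individual items. Here is a minimal sketch of item access on the same list. The variable names are chosen only for this example.

```r
listX <- list(51, "hello", tan, list(8L, "a"))

# Double brackets extract the item itself.
first <- listX[[1]]
print(first)
# [1] 51

# Single brackets return a shorter list that still wraps the item.
sub_list <- listX[1]
print(class(sub_list))
# [1] "list"

# Double brackets can be chained to reach into the nested list.
nested <- listX[[4]][[2]]
print(nested)
# [1] "a"

# The third item is a function, so it can be called directly.
tan0 <- listX[[3]](0)
print(tan0)
# [1] 0
```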

R Matrices

In R programming language, a Matrix is a two-dimensional set of data elements of the same data type. A Matrix can be created from a Vector, a number of rows and a number of columns.

Syntax – Define Matrix

variable <- matrix(vector, number of rows, number of columns, split by row or column)

split by row or column (the byrow argument): if TRUE, the values of the vector fill the matrix row by row; if FALSE, they fill it column by column.

Following are two examples of defining a matrix:

1. Split by row.

> A = matrix(c(1,2,3,4,5,6,7,8),2,4,TRUE)
> print(A)
     [,1] [,2] [,3] [,4]
[1,]    1    2    3    4
[2,]    5    6    7    8

2. Split by column.

> A <- matrix(c(1,2,3,4,5,6,7,8),2,4,FALSE)
> print(A)
     [,1] [,2] [,3] [,4]
[1,]    1    3    5    7
[2,]    2    4    6    8
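
The positional arguments used above can also be passed by name, which tends to be easier to read. The sketch below repeats both fill orders with named arguments and then reads one element from each matrix. It uses 1:8 as shorthand for the same eight values, so the elements here are integers rather than numerics.

```r
# Fill row by row.
A <- matrix(1:8, nrow = 2, ncol = 4, byrow = TRUE)

# Fill column by column (byrow = FALSE is the default, so it is omitted).
B <- matrix(1:8, nrow = 2, ncol = 4)

# dim() returns the number of rows and columns.
print(dim(A))
# [1] 2 4

# Elements are addressed as [row, column].
print(A[2, 3])
# [1] 7
print(B[2, 3])
# [1] 6
```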

R Arrays

In R programming language, Arrays are N-Dimensional data sets.

Syntax – Define an R Array

variable <- array(vector, dimension)

where vector contains the elements of the array and dimension is a vector giving the size of each dimension. If dimension is c(2,5,4,8), the array is 4-dimensional with dimensions 2x5x4x8.

Following is an example of 3-D array.

> A = array(c(1,2,3,4,5,6,7,8,9,10,11,12),c(2,3,2))
> print(A)
, , 1

     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6

, , 2

     [,1] [,2] [,3]
[1,]    7    9   11
[2,]    8   10   12
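
Elements of an array are addressed with one index per dimension. The following sketch builds the same 2x3x2 array (using 1:12 as shorthand for the twelve values) and reads single elements and a whole slice.

```r
A <- array(1:12, dim = c(2, 3, 2))

# dim() returns the size of each dimension.
print(dim(A))
# [1] 2 3 2

# Index order is [row, column, slice].
print(A[2, 3, 1])
# [1] 6
print(A[1, 2, 2])
# [1] 9

# Leaving an index empty selects everything along that dimension,
# so this is the whole second 2x3 slice.
slice2 <- A[, , 2]
print(dim(slice2))
# [1] 2 3
```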

R Factors

In R programming language, a Factor is a vector along with the distinct values of vector as levels. Factors are useful during statistical modelling.

Levels are stored as R Characters.

Syntax – Define an R Factor

variable <- factor(vector)

Following is an example of defining an R Factor:

> factorX = factor(c(1,4,7,2,6,7,1,6,4))
> print(factorX)
[1] 1 4 7 2 6 7 1 6 4
Levels: 1 2 4 6 7
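
To see how levels relate to the stored values, the sketch below inspects the same factor. It shows that the levels are character strings and that the factor internally holds integer codes pointing into those levels. The variable names are chosen only for this example.

```r
factorX <- factor(c(1, 4, 7, 2, 6, 7, 1, 6, 4))

# Levels are the distinct values, stored as character strings.
lv <- levels(factorX)
print(lv)
# [1] "1" "2" "4" "6" "7"

# as.integer() returns the position of each value within the levels,
# not the original numbers.
codes <- as.integer(factorX)
print(codes)
# [1] 1 3 5 2 4 5 1 4 3

# To recover the original numbers, convert through character first.
original <- as.numeric(as.character(factorX))
print(original)
# [1] 1 4 7 2 6 7 1 6 4

# table() counts how often each level occurs.
counts <- as.vector(table(factorX))
print(counts)
# [1] 2 1 2 2 2
```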

R Data Frames

In R programming language, a Data Frame is a set of equal length vectors. The vectors could be of different data types.

Syntax – Define an R Data Frame

variable <- data.frame(vectorA, vectorB, vectorC, .., vectorN)

Following is an example to define an R Data Frame :

> dataX = data.frame(values = c(21,42,113), RGB = c('red','blue','green'))
> print(dataX)
  values   RGB
1     21   red
2     42  blue
3    113 green
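
Because each column is a vector with its own data type, the columns can be inspected and filtered individually. The sketch below does so on the same data frame. It sets stringsAsFactors = FALSE explicitly, which is an addition to the example above, so that the RGB column is a Character Vector regardless of the R version.

```r
dataX <- data.frame(values = c(21, 42, 113),
                    RGB = c("red", "blue", "green"),
                    stringsAsFactors = FALSE)

# Each column keeps its own data type.
col_classes <- sapply(dataX, class)
print(col_classes)
#      values         RGB
#   "numeric" "character"

# A column is accessed with $ and behaves like an ordinary vector.
second_colour <- dataX$RGB[2]
print(second_colour)
# [1] "blue"

# Rows can be selected with a logical condition on a column.
large <- dataX[dataX$values > 40, ]
print(nrow(large))
# [1] 2
```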

Conclusion

In this R Tutorial, we have learnt about different R atomic data types and different data types of R-Objects used most commonly in R programming language.