Compare Two Data Frames in R
In this tutorial, we will learn how to compare two Data Frames using the compare() function.
There are several ways to compare two R Data Frames, such as the compare() function of the compare package or the sqldf() function of the sqldf package. In this article, we will use compare(). It is not part of base R, so the compare package has to be installed with install.packages("compare") and loaded with library(compare) before the function can be called.
The syntax of the compare() function is:
compare(model, comparison,
equal = TRUE,
coerce = allowAll,
shorten = allowAll,
ignoreOrder = allowAll,
ignoreNameCase = allowAll,
ignoreNames = allowAll,
ignoreAttrs = allowAll,
round = FALSE,
ignoreCase = allowAll,
trim = allowAll,
dropLevels = allowAll,
ignoreLevelOrder = allowAll,
ignoreDimOrder = allowAll,
ignoreColOrder = allowAll,
ignoreComponentOrder = allowAll,
colsOnly = !allowAll,
allowAll = FALSE)
where
model: The “correct” object.
comparison: The object to be compared with the model.
equal: Test for equality if test for identity fails.
coerce: If objects are not the same, allow coercion of comparison to model class.
shorten: If the length of one object is less than the other, shorten the longer object.
ignoreOrder: Ignore the order of values when comparing.
ignoreNameCase: Ignore the case of names when comparing.
ignoreNames: Ignore names attributes altogether.
ignoreAttrs: Ignore attributes altogether.
round: If objects are not the same, allow numbers to be rounded.
ignoreCase: Ignore the case of string values.
trim: Ignore leading and trailing spaces in string values.
dropLevels: If factors are not the same, allow unused levels to be dropped.
ignoreLevelOrder: Ignore the order of factor levels.
ignoreDimOrder: Ignore the order of dimensions when comparing matrices, arrays, or tables.
ignoreColOrder: Ignore the order of columns when comparing data frames.
ignoreComponentOrder: Ignore the order of components when comparing lists.
colsOnly: Only transform columns (not rows) when comparing data frames.
allowAll: Allow any sort of transformation (almost; see Details).
The list of arguments is long, but only model and comparison are required. Every other argument has a default value, and only a few of them are generally needed when comparing data frames.
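As an illustration of one of the optional arguments listed above, the following is a minimal sketch of ignoreColOrder. It assumes the compare package is installed; the column-swapped DF2 is made up for this example and is not part of the tutorial's data. With the default settings, the swapped columns should count as a difference, while ignoreColOrder = TRUE lets compare() reorder the columns before comparing.

```r
# Sketch only: assumes the compare package is installed
# (install.packages("compare")); the comparison is skipped otherwise.
DF1 <- data.frame(id = c(1, 2, 3, 4),
                  name = c("John", "Manu", "Surya", "Amith"),
                  stringsAsFactors = FALSE)

# Hypothetical comparison data: same values as DF1, columns swapped
DF2 <- DF1[, c("name", "id")]

if (requireNamespace("compare", quietly = TRUE)) {
  # Default settings: no transformations are allowed
  print(compare::compare(DF1, DF2))

  # Allow the columns of the comparison to be reordered
  print(compare::compare(DF1, DF2, ignoreColOrder = TRUE))
}
```

The printed result of the second call should also list any transformation that compare() applied to make the two data frames match.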
Basic Comparison between two Data Frames
In this case, we keep the default values and provide only the two required arguments: the original data frame (model) and the data frame to compare against it (comparison).
Consider two data frames, DF1 and DF2 shown below.
> DF1 = data.frame(id=c(1,2,3,4), name=c("John", "Manu", "Surya", "Amith"))
> DF2 = data.frame(id=c(1,2,3,4), name=c("John", "Manu", "Surya", "Tinu"))
> DF1
  id  name
1  1  John
2  2  Manu
3  3 Surya
4  4 Amith
> DF2
  id  name
1  1  John
2  2  Manu
3  3 Surya
4  4  Tinu
DF1 and DF2 differ only in the name value of the fourth row.
Now, load the compare package and call compare() with DF1 as model and DF2 as comparison.
> library(compare)
> compare(DF1, DF2)
FALSE [TRUE, FALSE]
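The comparison returns FALSE, as expected. The values in brackets give the result for each column: the id columns match (TRUE), while the name columns do not (FALSE).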
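compare() reports that the data frames differ, but not which cells differ. A minimal sketch of one way to locate them with base R follows; it is an addition to the compare() approach, not part of the compare package. It assumes both data frames have the same dimensions and column order, and sets stringsAsFactors = FALSE so that the names are compared as plain strings.

```r
DF1 <- data.frame(id = c(1, 2, 3, 4),
                  name = c("John", "Manu", "Surya", "Amith"),
                  stringsAsFactors = FALSE)
DF2 <- data.frame(id = c(1, 2, 3, 4),
                  name = c("John", "Manu", "Surya", "Tinu"),
                  stringsAsFactors = FALSE)

# Element-wise inequality gives a logical matrix, one cell per value;
# which(..., arr.ind = TRUE) returns the row and column of each TRUE cell
diff_cells <- which(DF1 != DF2, arr.ind = TRUE)
print(diff_cells)

# Show the differing rows from each data frame side by side
diff_rows <- unique(diff_cells[, "row"])
print(DF1[diff_rows, ])
print(DF2[diff_rows, ])
```

For the data above, diff_cells holds a single entry pointing at row 4, column 2, which is the name value of the fourth row.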
Now let us compare two identical data frames.
> DF1 = data.frame(id=c(1,2,3,4), name=c("John", "Manu", "Surya", "Amith"))
> DF2 = data.frame(id=c(1,2,3,4), name=c("John", "Manu", "Surya", "Amith"))
> compare(DF1, DF2)
TRUE
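The TRUE result can be cross-checked without the compare package. The sketch below uses the base R functions identical() and all.equal(); it is offered as a sanity check alongside compare(), not as a replacement for it. Note that all.equal() returns a description of the differences rather than FALSE when the objects differ, so it is wrapped in isTRUE().

```r
DF1 <- data.frame(id = c(1, 2, 3, 4),
                  name = c("John", "Manu", "Surya", "Amith"),
                  stringsAsFactors = FALSE)
DF2 <- data.frame(id = c(1, 2, 3, 4),
                  name = c("John", "Manu", "Surya", "Amith"),
                  stringsAsFactors = FALSE)

# Strict check: same values, types, and attributes
same_strict <- identical(DF1, DF2)

# Tolerant check: allows tiny numeric differences
same_tolerant <- isTRUE(all.equal(DF1, DF2))

print(same_strict)
print(same_tolerant)
```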
Conclusion
In this R Tutorial, we have learnt how to compare two Data Frames using the compare() function of the compare package.
