R’s coercion behavior may seem inconvenient, but it is not arbitrary. R always follows the same rules when it coerces data types. Once you are familiar with these rules, you can use R’s coercion behavior to do surprisingly useful things.
So how does R coerce data types? If a character string is present in an atomic vector, R will convert everything else in the vector to character strings. If a vector contains only logicals and numbers, R will convert the logicals to numbers; every TRUE becomes a 1, and every FALSE becomes a 0.
R always uses the same rules to coerce data to a single type. If character strings are present, everything will be coerced to a character string. Otherwise, logicals are coerced to numerics.
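Here is a minimal sketch of both rules in action. The vector names (mixed, nums) and the values in them are made up for illustration; typeof reports the type R settled on.

```r
# A single character string forces every element to become a character string
mixed <- c(1, TRUE, "ace")
typeof(mixed)
## "character"
mixed
## "1" "TRUE" "ace"

# With only logicals and numbers, the logicals become numbers
nums <- c(5, TRUE, FALSE)
typeof(nums)
## "double"
nums
## 5 1 0
```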
This arrangement preserves information. It is easy to look at a character string and tell what information it used to contain. For example, you can easily spot the origins of "TRUE" and "5". You can also easily back-transform a vector of 1s and 0s to TRUEs and FALSEs.
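As a rough illustration of back-transforming, the sketch below uses the as functions that are covered later in this post. The input values are assumptions chosen for the example.

```r
# Recover logicals from a vector of 1s and 0s
flags <- as.logical(c(1, 0, 1))
flags
## TRUE FALSE TRUE

# Recover the original values from their character-string forms
as.logical("TRUE")
## TRUE
as.numeric("5")
## 5
```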
R uses the same coercion rules when you try to do math with logical values. So the following code:
sum(c(TRUE, TRUE, FALSE, FALSE))
will become:
sum(c(1, 1, 0, 0))
## 2
This means that sum will count the number of TRUEs in a logical vector (and mean will calculate the proportion of TRUEs). Neat, huh?
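One common use of this trick is counting how many values pass a test. The sketch below assumes a hypothetical vector of six die rolls; the names rolls and is_six are invented for the example.

```r
# Six made-up die rolls
rolls <- c(2, 6, 3, 6, 1, 6)

# A comparison returns a logical vector: TRUE wherever a roll equals 6
is_six <- rolls == 6

# sum counts the TRUEs, mean gives the proportion of TRUEs
sum(is_six)
## 3
mean(is_six)
## 0.5
```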
You can explicitly ask R to convert data from one type to another with the as functions. R will convert the data whenever there is a sensible way to do so:
as.character(1)
## "1"
as.logical(1)
## TRUE
as.numeric(FALSE)
## 0
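When there is no sensible conversion, R returns NA instead of guessing. The inputs below ("five" and "yes") are assumed examples of strings that R cannot interpret as a number or a logical.

```r
# "five" is not a number that R can parse
not_a_number <- as.numeric("five")
not_a_number
## NA (R also warns that NAs were introduced by coercion)

# "yes" is not one of the strings R recognizes as a logical
not_a_logical <- as.logical("yes")
not_a_logical
## NA
```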
You now know how R coerces data types, but this won’t help you save a playing card. To do that, you will need to avoid coercion altogether. You can do this by using a new type of object, a list.
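To see why coercion gets in the way here, consider a sketch in which a playing card is stored as a name, a suit, and a point value. These particular values, and the name card, are assumptions for illustration.

```r
# Try to store a card's name, suit, and point value in one atomic vector
card <- c("ace", "hearts", 1)
card
## "ace" "hearts" "1"

# The point value has been coerced to a character string
typeof(card)
## "character"
```

The point value is no longer a number, so you could not do math with it without converting it back first.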
Before we look at lists, let’s address a question that might be on your mind.
Many data sets contain multiple types of information. The inability of vectors, matrices, and arrays to store multiple data types seems like a major limitation. So why bother with them?
In some cases, using only a single type of data is a huge advantage. Vectors, matrices, and arrays make it very easy to do math on large sets of numbers because R knows that it can manipulate each value the same way. Operations with vectors, matrices, and arrays also tend to be fast because the objects are so simple to store in memory.
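A small sketch of that convenience: because every element has the same type, one expression applies to all of them at once. The names die and m are assumed for the example.

```r
# A numeric vector: one operation is applied to every element
die <- c(1, 2, 3, 4, 5, 6)
doubled <- die * 2
doubled
## 2 4 6 8 10 12

# The same holds for a matrix built from those values (filled column by column)
m <- matrix(die, nrow = 2)
shifted <- m + 10
shifted[1, ]
## 11 13 15
```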
In other cases, allowing only a single type of data is not a disadvantage. Vectors are the most common data structure in R because they store variables very well. Each value in a variable measures the same property, so there’s no need to use different types of data.
Conclusion
In this post, we saw what coercion is in R and how it works, with examples.
Zigya Academy