In this article, you’ll learn about the statistical functions used in R. We will go through each of them with an example and show various ways to use them for better understanding.
The standard R installation contains a wide range of statistical functions. In this article, we will briefly look at the most important ones.
Arithmetic Mean mean()
Generic function for the (trimmed) arithmetic mean.
Usage
> mean(x, …)
> mean(x, trim = 0, na.rm = FALSE, …)
Arguments
Argument | Description |
x | An R object. Currently, there are methods for numeric/logical vectors and date, date-time, and time interval objects. Complex vectors are allowed for trim = 0 , only. |
trim | the fraction (0 to 0.5) of observations to be trimmed from each end of x before the mean is computed. Values of trim outside that range are taken as the nearest endpoint. |
na.rm | a logical value indicating whether NA values should be stripped before the computation proceeds. |
Example
# Create a vector with random values with mean of 50 and sd of 5
> x <- round(rnorm(10, mean = 50, sd = 5))
> x
[1] 52 47 54 51 49 49 60 54 56 50
> mean(x)
[1] 52.2
# trim must lie between 0 and 0.5; at 0.5 the trimmed mean equals the median
> mean(x, trim = 0.5)
[1] 51.5
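To make the trim and na.rm arguments more concrete, here is a minimal sketch. The vector `scores` is made up for illustration and is not part of the example above; it contains one extreme value and one missing value.

```r
# Hypothetical data: one outlier (500) and one missing value (NA)
scores <- c(48, 50, 51, 52, 49, 500, NA)

# With the default na.rm = FALSE, a single NA makes the result NA
plain_mean <- mean(scores)

# Remove the NA first; the outlier still pulls the mean upward
mean_no_na <- mean(scores, na.rm = TRUE)

# Trim 20% of the observations from each end before averaging,
# which drops the smallest and the largest remaining value
trimmed_mean <- mean(scores, trim = 0.2, na.rm = TRUE)

print(plain_mean)
print(mean_no_na)
print(trimmed_mean)
```

The trimmed mean stays close to the bulk of the data, while the untrimmed mean is dominated by the single extreme value.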
Median Value
Compute the sample median.
Usage
median(x, na.rm = FALSE, …)
Arguments
Argument | Description |
x | An object for which a method has been defined, or a numeric vector containing the values whose median is to be computed. |
na.rm | a logical value indicating whether NA values should be stripped before the computation proceeds. |
Let’s see it with an example.
> median(1:10)
[1] 5.5
> median(c(2,5,1,3,5,23,34,12,67))
[1] 5
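The sketch below illustrates the na.rm argument of median() and compares the median with the mean when the data contain an extreme value. The vector `incomes` is a made-up example, not data from this article.

```r
# Hypothetical data: one very large value and one missing entry
incomes <- c(30, 32, 35, 31, 1000, NA)

# With the default na.rm = FALSE, the NA propagates to the result
median_with_na <- median(incomes)

# After removing the NA, the median is the middle of the sorted values
median_clean <- median(incomes, na.rm = TRUE)

# The mean of the same values is pulled far upward by the value 1000
mean_clean <- mean(incomes, na.rm = TRUE)

print(median_with_na)
print(median_clean)
print(mean_clean)
```

This is one reason the median is often preferred as a summary when the data may contain outliers.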
Variance
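The variance is a numerical measure of how the data values are dispersed around the mean. In particular, the sample variance is defined as:
$$ s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 $$
where $\bar{x}$ is the sample mean and $n$ is the number of observations.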
In R, the sample variance is computed with var().
Usage
var(x, y = NULL, na.rm = FALSE, use)
Arguments
Argument | Description |
x | A numeric vector, matrix or data frame. |
y | NULL (default) or a vector, matrix or data frame with dimensions compatible with x. |
na.rm | a logical value indicating whether missing values should be removed before the computation proceeds. |
use | an optional character string giving a method for computing covariances in the presence of missing values. |
Let’s take the built-in dataset cars and find the variance of speed in cars.
# Load the cars dataset
> dt <- cars
> speed <- dt$speed
> speed
[1] 4 4 7 7 8 9 10 10 10 11 11 12 12 12 12 13 13 13 13 14 14 14 14 15 15
[26] 15 16 16 17 17 17 18 18 18 18 19 19 19 20 20 20 20 20 22 23 24 24 24 24 25
> var(speed)
[1] 27.95918
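As a quick check on what var() computes, the following sketch calculates the sample variance of the same speed column by hand, using the n - 1 denominator, and compares it with the built-in result. The variable names `manual_var` and `builtin_var` are chosen here for illustration.

```r
# Use the speed column of the built-in cars dataset
speed <- cars$speed
n <- length(speed)

# Sum of squared deviations from the mean, divided by n - 1
manual_var <- sum((speed - mean(speed))^2) / (n - 1)

# Built-in sample variance for comparison
builtin_var <- var(speed)

print(manual_var)
print(builtin_var)

# all.equal() compares numbers with a small tolerance for rounding error
print(isTRUE(all.equal(manual_var, builtin_var)))
```

Both values should agree up to floating-point rounding, which confirms that var() divides by n - 1 rather than n.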
Standard Deviation
This function computes the standard deviation of the values in x. If na.rm is TRUE, then missing values are removed before computation proceeds.
Usage
sd(x, na.rm = FALSE)
Arguments
Argument | Description |
x | A numeric vector or an R object, but not a factor, coercible to numeric by as.double(x). |
na.rm | a logical value indicating whether NA values should be stripped before the computation proceeds. |
Let’s see it with an example using the rnorm() function.
# Draw 10 random values from a normal distribution with mean 10 and sd 20
> x <- rnorm(10, mean = 10, sd = 20)
> x
[1] 36.1663953 32.9141902 18.7596222 -20.5158583 -15.9542984 0.8033739
[7] -3.1769068 6.7160742 3.7626753 8.7092254
> sd(x)
[1] 18.55818
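The standard deviation is the square root of the sample variance. The sketch below checks this relationship and shows the effect of na.rm; the vector `values` is a made-up example containing one missing value.

```r
# Hypothetical data with one missing value
values <- c(2, 4, 4, 4, 5, 5, 7, 9, NA)

# With the default na.rm = FALSE, the NA propagates to the result
sd_with_na <- sd(values)

# Remove missing values before computing the standard deviation
sd_clean <- sd(values, na.rm = TRUE)

# The same quantity obtained as the square root of the sample variance
sd_from_var <- sqrt(var(values, na.rm = TRUE))

print(sd_with_na)
print(sd_clean)
print(isTRUE(all.equal(sd_clean, sd_from_var)))
```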
Conclusion
Hence, we have seen the various functions that are used for statistical programming in R, along with an example of how to use each of them.
Zigya Academy