R Statistics and Software Tutorial
R is a programming language and software environment for statistical computing and graphics. It offers many built-in functions and also lets you write your own functions, so a task can be done either way. R is freely available under the GNU General Public License. It provides a wide variety of statistical and graphical techniques, including linear and non-linear models, time series analysis, classification, clustering, forecasting, classical statistical tests and many more. Nowadays R has also become a data mining tool, as it is used by many data miners. Base R produces only static graphics; dynamic graphics require additional packages to be installed.
Use the c() function to create vectors. R is case-sensitive, so the function name is a lowercase c.
Ex: mydata <- c(3, 4, 5, 6)
Arithmetic operations on vectors are carried out component-wise.
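As a minimal sketch of the component-wise behaviour described above, the snippet below reuses the mydata vector from the example. The variable names doubled, shifted and squared are illustrative only.

```r
# Build a numeric vector with c() (lowercase, since R is case-sensitive)
mydata <- c(3, 4, 5, 6)

# Each arithmetic operation is applied to every element in turn
doubled <- mydata * 2               # 6 8 10 12
shifted <- mydata + c(1, 1, 1, 1)   # 4 5 6 7
squared <- mydata^2                 # 9 16 25 36

print(doubled)
print(shifted)
print(squared)
```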
To get a sequence of numbers:
seq(0.5, 2.5, 3.5) or seq(0.5, 5.5, length = 5) or 1:10
We can use either the "seq" function (written in lowercase) or the ":" operator.
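The three ways of building a sequence could look like the sketch below. The step of 0.5 in the first call is an assumption chosen for illustration (a step larger than the range would return only the starting value), and length.out is the full name of the argument abbreviated as length above.

```r
# From 0.5 to 2.5 in steps of 0.5 (the step size is an assumed example value)
s1 <- seq(0.5, 2.5, by = 0.5)         # 0.5 1.0 1.5 2.0 2.5

# Five evenly spaced values between 0.5 and 5.5
s2 <- seq(0.5, 5.5, length.out = 5)   # 0.50 1.75 3.00 4.25 5.50

# The colon operator counts in steps of 1
s3 <- 1:10

print(s1)
print(s2)
print(s3)
```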
Basic mathematical operations in R:
- Complex arithmetic operations
- Exponential functions
- Hyperbolic functions
- Logical operators
- Matrix operations
- Trigonometric functions
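A short sketch with one example for each kind of operation in the list above, using only functions from base R. The variable names and input values are illustrative.

```r
# Complex arithmetic: (1 + 2i)(3 - i) = 5 + 5i
z <- (1 + 2i) * (3 - 1i)

# Exponential and logarithm are inverse functions
e2 <- log(exp(2))            # 2

# Hyperbolic function
ch <- cosh(0)                # 1

# Logical operators: TRUE & FALSE is FALSE
both <- (3 > 2) & (2 > 5)

# Matrix operations: matrix() fills column by column, %*% is matrix product
m <- matrix(1:4, nrow = 2)
mm <- m %*% m
mt <- t(m)                   # transpose

# Trigonometric function
s <- sin(pi / 2)             # 1

print(z)
print(mm)
```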
Statistical features in R:
- Standard deviation and variance
- Cross tabulations
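The features above map onto base R functions such as var(), sd() and table(). The following is a small sketch on made-up data; the vectors scores, gender and passed are hypothetical.

```r
# Hypothetical scores; var() and sd() use the sample (n - 1) formulas
scores <- c(2, 4, 4, 4, 5, 5, 7, 9)
v <- var(scores)   # 32 / 7
s <- sd(scores)    # square root of the variance

# Cross tabulation of two hypothetical categorical variables
gender <- c("F", "M", "F", "M", "F")
passed <- c("yes", "yes", "no", "no", "yes")
tab <- table(gender, passed)

print(v)
print(s)
print(tab)
```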
Probability functions and data mining methods in R:
- F-distribution
- Cluster analysis
  - K-means clustering
  - Hierarchical clustering
- Neural networks
- Trees and recursive partitioning
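A few of the items above can be tried with functions that ship with R in the stats package. The sketch below covers the F-distribution and both clustering methods on simulated points; the data, the seed and the choice of two clusters are assumptions made for illustration. Neural networks and trees need additional packages and are not shown.

```r
# F-distribution: cumulative probability and quantile
p <- pf(1, df1 = 5, df2 = 10)       # P(F <= 1)
q <- qf(0.95, df1 = 5, df2 = 10)    # 95th percentile

# Simulated data: two well-separated groups of 10 points each
set.seed(1)
pts <- rbind(matrix(rnorm(20, mean = 0, sd = 0.3), ncol = 2),
             matrix(rnorm(20, mean = 5, sd = 0.3), ncol = 2))

# K-means clustering with several random starts
km <- kmeans(pts, centers = 2, nstart = 5)

# Hierarchical clustering on the distance matrix, cut into two groups
hc <- hclust(dist(pts))
groups <- cutree(hc, k = 2)

print(km$cluster)
print(groups)
```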
Statistical modelling in R:
- ANOVA (Analysis of Variance)
- Factor analysis
  - Exploratory factor analysis
- Design of Experiments (DOE)
- Time series analysis
  - ARIMA models
  - Holt-Winters model
  - Exponential smoothing model
  - Double exponential smoothing model
  - Winters model
  - Moving average method
  - Linear model
  - GARCH model
- Linear models
  - Linear regression
  - Multiple linear regression
- Multivariate statistics
  - Multivariate ANOVA model
  - Multi-dimensional scaling
  - Principal component analysis
- Testing models
  - Pairwise t-tests
  - Two-sample tests of scale
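Several of the models listed above are available in a default R installation. The sketch below fits a handful of them on the built-in datasets mtcars, co2 and lh; the choice of datasets and of model formulas is an assumption made purely for illustration.

```r
# Linear and multiple linear regression
fit1 <- lm(mpg ~ wt, data = mtcars)
fit2 <- lm(mpg ~ wt + hp, data = mtcars)

# One-way ANOVA: does mpg differ by number of cylinders?
a <- aov(mpg ~ factor(cyl), data = mtcars)

# Principal component analysis on three scaled variables
pca <- prcomp(mtcars[, c("mpg", "wt", "hp")], scale. = TRUE)

# Time series: Holt-Winters smoothing and an AR(1) model fitted with arima()
hw <- HoltWinters(co2)
ar1 <- arima(lh, order = c(1, 0, 0))

# Pairwise t-tests between cylinder groups
pt <- pairwise.t.test(mtcars$mpg, mtcars$cyl)

print(coef(fit1))
print(summary(a))
print(pt$p.value)
```

Calling summary() on any of the fitted objects prints the detailed results.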
Top left section: holds the script file, which we can save.
Bottom left section: the console, where we can type commands directly and see the result immediately.
Top right section: shows the environment; any tables we create are listed here with a short description.
Bottom right section: where we can check the files, the available packages and the plots.
How to create data frames in R:
In the script file, I created a variable d and assigned it the result of the data.frame() function.
Inside the function, I defined a serial number as the subject id, along with a gender and a score for each subject.
The output then appears in the console section (bottom left).
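The steps above could be written as in the following sketch. The column names and the values are hypothetical, since the original screen is not reproduced here.

```r
# Create a data frame d with one row per subject
# (column names and values are made up for illustration)
d <- data.frame(
  subject_id = 1:4,
  gender     = c("M", "F", "F", "M"),
  score      = c(72, 85, 90, 66)
)

# Printing d shows the table in the console
print(d)
```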
Screen 3 shows:
How to find the number of rows and columns in the variable “d”, and which attributes it has.
How to display, view and edit the data frame.
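A sketch of the inspection commands mentioned above, applied to a hypothetical data frame d (the values are made up). View() and edit() open interactive windows, so they are only called when R runs interactively.

```r
# Hypothetical data frame to inspect
d <- data.frame(
  subject_id = 1:4,
  gender     = c("M", "F", "F", "M"),
  score      = c(72, 85, 90, 66)
)

n_rows <- nrow(d)        # number of rows
n_cols <- ncol(d)        # number of columns
print(dim(d))            # both at once: rows, then columns
print(names(d))          # column names
print(attributes(d))     # names, class and row names
str(d)                   # compact overview of each column

print(d)                 # display the whole data frame
print(head(d, 2))        # display only the first two rows

if (interactive()) {
  View(d)                # spreadsheet-style viewer
  d <- edit(d)           # edit in a data editor and keep the changes
}
```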
If we need help for a function, we type ? followed by the function name, for example ?data.frame.
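For illustration, the help lookup could be used as sketched below, with data.frame standing in for whichever function is of interest. The help() function is the long form of the ? operator.

```r
# In an interactive session, either of these opens the help page:
#   ?data.frame
#   help("data.frame")
# A keyword search across installed help pages is also possible:
#   ??"data frame"

# help() can also be stored in a variable, which locates the page
# without displaying it
h <- help("data.frame")
print(class(h))

# args() lists a function's arguments without opening the help page
print(args(data.frame))
```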
Screen 4 below shows how to install a package and where the package is downloaded from.
It also shows the command for installing packages.
Once the package is installed, we can check it in the packages list, which is shown in screen 5.
Now we need to load the package in order to use it.
The command is:
> library(package_name)
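Installing and loading can be combined in a small helper, sketched below. The function name ensure_package and the CRAN mirror URL are assumptions for illustration; install.packages(), requireNamespace() and library() are the standard R functions involved.

```r
# Hypothetical helper: install a package only if it is missing, then load it
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    # Downloads from a CRAN mirror (assumed URL; any mirror works)
    install.packages(pkg, repos = "https://cloud.r-project.org")
  }
  # character.only = TRUE lets library() accept the name as a string
  library(pkg, character.only = TRUE)
}

# "splines" ships with every R installation, so nothing is downloaded here;
# replace it with the name of the package you need
ensure_package("splines")
print(search())   # the loaded package now appears in the search path
```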
Screen 5 has 4 highlighted portions:
- The script file shows the commands.
- The console section shows the package status (installing or loading).
- The packages section shows whether a package is already installed or still needs to be installed.
- The data section shows the data available in the variable d.