# Preface

I have been learning more R recently, so I am summarizing it here. This part mainly covers R's graphics and some of its data-processing functions. Many of the functions are mathematical in nature; after all, R was originally designed for statistics.

P.S. Because I wrote the R code in R Markdown, lines that begin with `##` in the code blocks below are program output.

# Main contents

1. Common mathematical and statistical functions

2. Statistical plotting functions (histogram, kernel density plot, box plot, normal Q-Q plot, stem-and-leaf plot, empirical distribution plot, etc.)

3. High-level plotting functions (plot, coplot, pairs, qqnorm, contour, persp, etc.)

4. Arguments of high-level plotting functions (add, axes, log, type, etc.)

5. Low-level plotting functions and parameter settings

# Mathematical and statistical functions

```r
abs(-3)
## [1] 3
sqrt(9)
## [1] 3
ceiling(5/3)
## [1] 2
floor(5/3)
## [1] 1
round(4.55)
## [1] 5
log(exp(10))
## [1] 10
sin(pi/2)
## [1] 1
cos(pi/2) # Effectively 0; the tiny value is floating-point rounding error
## [1] 6.123032e-17
x <- c(1,2,3,3)
mean(x) # Equivalent to meanx <- sum(x) / length(x); meanx
## [1] 2.25
median(x)
## [1] 2.5
sd(x)
## [1] 0.9574271
var(x)
## [1] 0.9166667
min(x)
## [1] 1
max(x)
## [1] 3
```
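As the comment on `mean(x)` suggests, these summary statistics can be reproduced from their definitions. The following is only an illustrative sketch; the names `n`, `mean_by_hand`, `var_by_hand`, and `sd_by_hand` are made up for this example. Note that `var()` and `sd()` divide by n - 1 (the sample versions), not by n.

```r
x <- c(1, 2, 3, 3)
n <- length(x)

mean_by_hand <- sum(x) / n                           # definition of the mean
var_by_hand  <- sum((x - mean_by_hand)^2) / (n - 1)  # sample variance
sd_by_hand   <- sqrt(var_by_hand)                    # sample standard deviation

# Each comparison returns TRUE
all.equal(mean_by_hand, mean(x))
all.equal(var_by_hand, var(x))
all.equal(sd_by_hand, sd(x))
```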

## Standardization of data

```r
x <- c(1,3,5,4)
scale(x)
##           [,1]
## [1,] -1.317465
## [2,] -0.146385
## [3,]  1.024695
## [4,]  0.439155
## attr(,"scaled:center")
## [1] 3.25
## attr(,"scaled:scale")
## [1] 1.707825
```
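With its default arguments, `scale()` subtracts the mean and divides by the sample standard deviation, which are the two attributes shown in the output. A minimal check, using the made-up names `z_by_hand` and `z_scaled`:

```r
x <- c(1, 3, 5, 4)

z_by_hand <- (x - mean(x)) / sd(x)   # z-scores computed from the definition
z_scaled  <- as.vector(scale(x))     # drop the matrix shape and attributes

all.equal(z_by_hand, z_scaled)       # TRUE
mean(z_scaled)                       # 0 up to floating-point error
sd(z_scaled)                         # 1
```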

## Probability functions

```r
x <- pretty(c(-3, 3), 30); x
```
```
##  [1] -3.0 -2.8 -2.6 -2.4 -2.2 -2.0 -1.8 -1.6 -1.4 -1.2 -1.0 -0.8 -0.6 -0.4 -0.2
## [16]  0.0  0.2  0.4  0.6  0.8  1.0  1.2  1.4  1.6  1.8  2.0  2.2  2.4  2.6  2.8
## [31]  3.0
```
```r
# Density of the standard normal distribution at each x
y <- dnorm(x)
plot(x, y)
```
```r
# Generate 50 normally distributed pseudorandom numbers
rnorm(50, mean = 20, sd = 8)
```
```
##  [1] 11.233327 22.000808 10.066406 19.273047 21.771780 34.621389 21.706101
##  [8] 21.934236 19.608269 16.698233 13.597833 22.653223 20.194896 25.730459
## [15] 10.912291 18.101390 12.456593 21.678788  7.763024 27.193593 24.136049
## [22] 31.140113 22.691629 18.891392 18.009354 28.952892  8.203124 16.267587
## [29] 21.039319 26.668597 15.264060 15.474431 28.440294 14.970583 26.289378
## [36] 18.113167 11.175129  2.085909 26.948591 12.651352 17.815405 13.490284
## [43] 21.128309 41.396762 32.838635 14.187705 29.128805 16.050802 14.680583
## [50] 31.128813
```
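The `dnorm()` and `rnorm()` calls above follow R's naming scheme for distributions: the prefix `d` gives the density, `p` the cumulative distribution function, `q` the quantile function (the inverse of `p`), and `r` random draws. A short sketch for the standard normal distribution:

```r
dnorm(0)            # density at 0, equal to 1 / sqrt(2 * pi)
pnorm(1.96)         # P(X <= 1.96), about 0.975
qnorm(0.975)        # value with 97.5% of the probability below it, about 1.96
qnorm(pnorm(1.5))   # q and p are inverses, so this returns 1.5
```

The same prefixes work for other distributions, for example `dunif()`, `punif()`, `qunif()`, and `runif()` for the uniform distribution.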

## Generating uniformly distributed pseudorandom numbers

```r
runif(5)
```
```
## [1] 0.6973212 0.8353123 0.1633793 0.7737247 0.3019795
```
```r
# Set the random seed so that the generated numbers are reproducible
set.seed(12)
```
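To see what setting the seed does, the following sketch draws numbers twice from the same seed and once more without resetting it. The variable names `first`, `second`, and `third` are made up for this example.

```r
set.seed(12)
first <- runif(5)

set.seed(12)               # reset to the same seed
second <- runif(5)

third <- runif(5)          # no reset: the random stream simply continues

identical(first, second)   # TRUE: same seed, same numbers
identical(first, third)    # FALSE: different part of the stream
```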

# String handling functions

```r
# Count the number of characters in a string
nchar("abcde")
```
```
## [1] 5
```
```r
# Extract a substring
substr("abcde", 3, 5)
```
```
## [1] "cde"
```
```r
# Search for a pattern; returns the indices of the matching elements
grep("a", c("a", "c", "b", "a"))
```
```
## [1] 1 4
```
```r
# String substitution (replaces the first match)
sub("a", "A", "abcde")
```
```
## [1] "Abcde"
```
```r
# Split a string; the result is a list
strsplit("abcde", "c")
```
```
## [[1]]
## [1] "ab" "de"
```
```r
strsplit("abcde", "") # Split into individual characters
```
```
## [[1]]
## [1] "a" "b" "c" "d" "e"
```
```r
# Concatenate strings
paste("Today is", "Tuesday.")
```
```
## [1] "Today is Tuesday."
```
```r
# Case conversion functions
toupper("abc")
```
```
## [1] "ABC"
```
```r
tolower("ABc")
```
```
## [1] "abc"
```
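A few closely related variants are worth knowing alongside the functions above. This is a small sketch rather than a complete tour: `gsub()` replaces every match where `sub()` replaces only the first, `grepl()` returns a logical vector instead of indices, and `paste()` accepts `sep` and `collapse`.

```r
sub("a", "A", "banana")                  # "bAnana": first match only
gsub("a", "A", "banana")                 # "bAnAnA": every match

grepl("a", c("a", "c", "b", "a"))        # TRUE FALSE FALSE TRUE

paste("a", "b", sep = "-")               # "a-b"
paste(c("a", "b", "c"), collapse = "")   # "abc"

# strsplit() returns a list; unlist() turns it into a character vector
parts <- unlist(strsplit("abcde", "c"))
parts                                    # "ab" "de"
```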

# Functions applied to matrices and data frames

```r
b <- matrix(runif(12), nrow=3)
# Functions applied to a matrix
log(b) # Take the natural logarithm of each element of the matrix
```
```
##             [,1]      [,2]       [,3]       [,4]
## [1,] -2.66843174 -1.311625 -1.7215713 -4.7885131
## [2,] -0.20116780 -1.775799 -0.4436883 -0.9347165
## [3,] -0.05909021 -3.384469 -3.7775907 -0.2059417
```
```r
mean(b) # Average of all elements of the matrix
```
```
## [1] 0.3633845
```
```r
# The apply function processes the matrix by dimension (1 = rows, 2 = columns)
apply(b, 1, mean)
```
```
## [1] 0.1314632 0.5053715 0.4533189
```
```r
# The lapply function processes a list and returns the result for each component
x <- list(a = 1:10, beta = exp(-3:3),
          logic = c(TRUE,FALSE,FALSE,TRUE))
lapply(x, mean) # In arithmetic, TRUE is treated as 1 and FALSE as 0
```
```
## $a
## [1] 5.5
##
## $beta
## [1] 4.535125
##
## $logic
## [1] 0.5
```
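For comparison, the sketch below applies the same function by column as well as by row, and uses `sapply()`, which simplifies the list returned by `lapply()` into a named vector when it can. The seed value is arbitrary and only makes the example repeatable.

```r
set.seed(1)                          # arbitrary seed, for repeatability only
b <- matrix(runif(12), nrow = 3)

row_means <- apply(b, 1, mean)       # one value per row (length 3)
col_means <- apply(b, 2, mean)       # one value per column (length 4)

# rowMeans() and colMeans() are built-in shortcuts for the same results
all.equal(row_means, rowMeans(b))    # TRUE
all.equal(col_means, colMeans(b))    # TRUE

x <- list(a = 1:10, beta = exp(-3:3),
          logic = c(TRUE, FALSE, FALSE, TRUE))
res <- sapply(x, mean)               # named numeric vector instead of a list
res
```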

# Drawing graphics

## Histograms (hist)

A histogram shows the frequency distribution of the data.

```r
# Basic histogram
x <- mtcars$mpg; x
```
```
##  [1] 21.0 21.0 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 17.8 16.4 17.3 15.2 10.4
## [16] 10.4 14.7 32.4 30.4 33.9 21.5 15.5 15.2 13.3 19.2 27.3 26.0 30.4 15.8 19.7
## [31] 15.0 21.4
```
```r
hist(x)
```
```r
# Parameter settings. breaks sets the number of bins; by default the y-axis shows the frequency
hist(x, breaks = 12, col = "red", xlab = "Miles Per Gallon")
```
```r
# freq = FALSE makes the y-axis show the probability density
hist(x, freq = FALSE, breaks = 12, col = "green",
     xlab = "Miles Per Gallon")
# Overlay the kernel density curve on the histogram
lines(density(x), col = "red", lwd = 2)
```
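`hist()` also returns an object describing the bins, which can be inspected without drawing anything by passing `plot = FALSE`. A small sketch (the name `h` is made up here); note that a single number given as `breaks` is treated only as a suggestion, because R rounds the breakpoints to "pretty" values.

```r
h <- hist(mtcars$mpg, breaks = 12, plot = FALSE)

h$breaks                          # bin boundaries actually used
h$counts                          # number of observations in each bin
sum(h$counts)                     # 32, the number of cars in mtcars

# With freq = FALSE the bar areas (density * bin width) add up to 1
sum(h$density * diff(h$breaks))
```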

## Kernel density plots

A kernel density plot is a tool for examining the distribution of a continuous variable. The x-axis shows the values, and the y-axis shows the estimated probability density at each value.

```r
x <- density(mtcars$mpg); x
```
```
##
## Call:
## 	density.default(x = mtcars$mpg)
##
## Data: mtcars$mpg (32 obs.);	Bandwidth 'bw' = 2.477
##
##        x               y
##  Min.   : 2.97   Min.   :6.481e-05
##  1st Qu.:12.56   1st Qu.:5.461e-03
##  Median :22.15   Median :1.926e-02
##  Mean   :22.15   Mean   :2.604e-02
##  3rd Qu.:31.74   3rd Qu.:4.530e-02
##  Max.   :41.33   Max.   :6.795e-02
```
```r
plot(x)
```
```r
attach(mtcars)
library(sm) # The sm package must be installed first
```
```r
# Compare the density of mpg across the groups defined by cyl
sm.density.compare(mpg, cyl, xlab = "Miles Per Gallon")
```
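The object returned by `density()` stores the grid of x values, the estimated density at each of them, and the bandwidth. The sketch below checks that the area under the curve is close to 1 using a simple trapezoid rule, and shows how a smaller bandwidth gives a rougher estimate. The names `d`, `area`, and `d_narrow`, and the choice `bw = 1`, are illustrative assumptions.

```r
d <- density(mtcars$mpg)
d$bw                                  # default bandwidth, about 2.477

# Trapezoid-rule approximation of the area under the estimated density
area <- sum(diff(d$x) * (head(d$y, -1) + tail(d$y, -1)) / 2)
area                                  # close to 1

# A smaller bandwidth follows the data more closely (less smoothing)
d_narrow <- density(mtcars$mpg, bw = 1)
d_narrow$bw                           # 1
```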

## Box plots

```r
boxplot(mtcars$mpg, main = "Box Plot", ylab = "Miles per gallon")
```
```r
# One box per group: mpg split by number of cylinders
boxplot(mpg~cyl, data=mtcars, main = "Box Plot",
        xlab = "Number of Cylinders", ylab = "Miles per gallon")
```
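The numbers behind a box plot can be retrieved without drawing it by passing `plot = FALSE`. A brief sketch, with `bp` as a made-up variable name:

```r
bp <- boxplot(mpg ~ cyl, data = mtcars, plot = FALSE)

bp$names   # group labels: "4" "6" "8"
bp$n       # observations per group: 11 7 14
bp$stats   # one column per group: lower whisker, lower hinge, median,
           # upper hinge, upper whisker
bp$out     # values drawn as outlier points, if any
```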

## Empirical distribution plots

The empirical distribution plot is suitable for data from a continuous distribution.

```r
w <- c(75.0, 64.0, 47.4, 66.9, 62.2, 62.2, 58.7, 63.5,
       66.6, 64.0, 57.0, 69.0, 56.9, 50.0, 72.0); w
```
```
##  [1] 75.0 64.0 47.4 66.9 62.2 62.2 58.7 63.5 66.6 64.0 57.0 69.0 56.9 50.0 72.0
```
```r
# Five-number summary: minimum, lower hinge, median, upper hinge, maximum
fivenum(w)
```
```
## [1] 47.40 57.85 63.50 66.75 75.00
```
```r
# Empirical distribution plot
ecdf(w) # Returns the empirical cumulative distribution function of w
```
```
## Empirical CDF
## Call: ecdf(w)
##  x[1:13] =   47.4,     50,   56.9,  ...,     72,     75
```
```r
plot(ecdf(w),verticals = TRUE, do.p = TRUE)
# Overlay the normal CDF with the same mean and standard deviation as w
x <- 44:78
lines(x, pnorm(x, mean(w), sd(w)))
```
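The value returned by `ecdf()` is itself a function: calling it at a point gives the proportion of observations less than or equal to that point. A minimal sketch, where `Fn` is a made-up name:

```r
w <- c(75.0, 64.0, 47.4, 66.9, 62.2, 62.2, 58.7, 63.5,
       66.6, 64.0, 57.0, 69.0, 56.9, 50.0, 72.0)

Fn <- ecdf(w)        # Fn is a step function

Fn(63.5)             # proportion of values <= 63.5, here 8/15
mean(w <= 63.5)      # the same proportion computed directly
Fn(c(40, 80))        # 0 below the smallest value, 1 above the largest
```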

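## Normal Q-Q plots

A normal Q-Q plot plots the ordered sample values against the corresponding quantiles of the standard normal distribution, which are obtained by applying the inverse of the normal distribution function to probabilities between 0 and 1. If the sample is approximately normal, the points lie close to a straight line.

```r
w <- c(75.0, 64.0, 47.4, 66.9, 62.2, 62.2, 58.7, 63.5,
       66.6, 64.0, 57.0, 69.0, 56.9, 50.0, 72.0); w
```
```
##  [1] 75.0 64.0 47.4 66.9 62.2 62.2 58.7 63.5 66.6 64.0 57.0 69.0 56.9 50.0 72.0
```
```r
qqnorm(w) # Sample quantiles against theoretical normal quantiles
qqline(w) # Reference line through the first and third quartiles
```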
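The coordinates that `qqnorm()` plots can be reconstructed by hand, which makes the idea concrete: the sorted sample goes on the y-axis and `qnorm()` evaluated at evenly spread probabilities (from `ppoints()`) goes on the x-axis. The names `theoretical`, `sample_q`, and `qq` are made up for this sketch.

```r
w <- c(75.0, 64.0, 47.4, 66.9, 62.2, 62.2, 58.7, 63.5,
       66.6, 64.0, 57.0, 69.0, 56.9, 50.0, 72.0)
n <- length(w)

theoretical <- qnorm(ppoints(n))   # standard normal quantiles
sample_q    <- sort(w)             # ordered sample values

# qqnorm() returns the same coordinates (in the original data order)
qq <- qqnorm(w, plot.it = FALSE)
all.equal(sort(qq$x), theoretical) # TRUE
all.equal(sort(qq$y), sample_q)    # TRUE
```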

## Stem-and-leaf plots

```r
x<-c(25, 45, 50, 54, 55, 61, 64, 68, 72, 75, 75,
     78, 79, 81, 83, 84, 84, 84, 85, 86, 86, 86,
     87, 89, 89, 89, 90, 91, 91, 92, 100); x
```
```
##  [1]  25  45  50  54  55  61  64  68  72  75  75  78  79  81  83  84  84  84  85
## [20]  86  86  86  87  89  89  89  90  91  91  92 100
```
```r
stem(x)
```
```
##
##   The decimal point is 1 digit(s) to the right of the |
##
##    2 | 5
##    3 |
##    4 | 5
##    5 | 045
##    6 | 148
##    7 | 25589
##    8 | 1344456667999
##    9 | 0112
##   10 | 0
```

# High-level and low-level plotting functions

High-level plotting functions include plot(), coplot(), pairs(), qqnorm(), hist(), and contour(); each of them creates a complete graph and accepts arguments to customize it.
Low-level plotting functions, by contrast, cannot create a graph on their own; they can only add elements to a graph already created by a high-level function.

## High-level plotting functions

1. The plot() function

plot() draws scatter plots and curves of data.

It behaves in four main ways depending on its arguments: a scatter plot of two vectors; a plot of a single vector against its index (for a complex vector, the imaginary part against the real part); box plots when the first argument is a factor; and a matrix of scatter plots between the variables of a data frame (as well as regression diagnostic plots and so on).

```r
x <- c(1,3,2,3,3,5)
y <- c(3,2,3,4,5,6)
z <- complex(re = x, im = y)
plot(x)
```
```r
plot(x, y)
```
```r
plot(z)
```
```r
# Box plots by factor level
y<-c(1600, 1610, 1650, 1680, 1700, 1700, 1780, 1500, 1640,
     1400, 1700, 1750, 1640, 1550, 1600, 1620, 1640, 1600,
     1740, 1800, 1510, 1520, 1530, 1570, 1640, 1600)
f<-factor(c(rep(1,7),rep(2,5), rep(3,8), rep(4,6)))
plot(f,y)
```
```r
# Scatter plot matrix of the variables in a data frame
df<-data.frame(
  Age=c(13, 13, 14, 12, 12, 15, 11, 15, 14, 14, 14,
        15, 12, 13, 12, 16, 12, 11, 15 ),
  Height=c(56.5, 65.3, 64.3, 56.3, 59.8, 66.5, 51.3,
           62.5, 62.8, 69.0, 63.5, 67.0, 57.3, 62.5,
           59.0, 72.0, 64.8, 57.5, 66.5),
  Weight=c( 84.0, 98.0, 90.0, 77.0, 84.5, 112.0, 50.5,
            112.5, 102.5, 112.5, 102.5, 133.0, 83.0,
            84.0, 99.5, 150.0, 128.0, 85.0, 112.0))
plot(df)
```
```r
attach(df)
# Scatter plot of height against age
plot(~Age+Height)
```
```r
# Scatter plots of weight against age and against height
plot(Weight~Age+Height)
```
2. Functions for plotting multivariate data

When the data is a matrix or data frame, pairs() draws a scatter plot for every pair of columns.
coplot() draws conditioning plots: scatter plots of two variables, with one panel for each range of values of a third variable.

```r
# Same result as plot(df): a scatter plot matrix
pairs(df)
```
```r
# Scatter plots of weight against height, one panel per range of age
coplot(Weight ~ Height | Age)
```
3. qqnorm(), hist(), dotchart(), contour(), image(), persp(), etc.

dotchart() draws a dot chart of the data x.

```r
# Dot chart of death rates in Virginia in 1940
dotchart(VADeaths, main = "Death Rates in Virginia - 1940")
```
```r
dotchart(t(VADeaths), main = "Death Rates in Virginia - 1940")
```

image(), contour(), and persp() draw an image (heat) map, a contour map, and a 3D perspective surface, respectively. The example below uses elevation data for a mountainous area.

```r
x <- seq(0,2800, 400); y <- seq(0,2400,400)
z <- c(1180,1320,1450,1420,1400,1300,700,900,
       1230,1390,1500,1500,1400,900,1100,1060,
       1270,1500,1200,1100,1350,1450,1200,1150,
       1370,1500,1200,1100,1550,1600,1550,1380,
       1460,1500,1550,1600,1550,1600,1600,1600,
       1450,1480,1500,1550,1510,1430,1300,1200,
       1430,1450,1470,1320,1280,1200,1080,940)
# 8 rows (one per x value) by 7 columns (one per y value)
Z <- matrix(z, nrow = 8)
# Draw the image map
image(x, y, Z)
```
```r
# Draw the contour map
contour(x, y, Z, levels = seq(min(z), max(z), by = 50))
```
```r
# Draw the 3D surface
persp(x, y, Z, theta=30, phi=45, expand=.3)
```
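These three functions all expect a matrix with one row per x value and one column per y value, so that `Z[i, j]` is the height at `(x[i], y[j])`. When the surface comes from a formula rather than measured data, `outer()` builds such a matrix directly. The surface used below is a made-up example, not part of the elevation data above.

```r
x <- seq(-2, 2, length.out = 30)
y <- seq(-1, 1, length.out = 20)

# Z[i, j] = f(x[i], y[j]) for a hypothetical bell-shaped surface
Z <- outer(x, y, function(x, y) exp(-(x^2 + y^2)))
dim(Z)                      # 30 20: length(x) rows, length(y) columns

contour(x, y, Z)
persp(x, y, Z, theta = 30, phi = 45, expand = 0.5)
```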

## Arguments of high-level plotting functions

1. Logical arguments

add = TRUE draws the new graph on top of the existing one; the default is FALSE, which replaces the existing graph.
axes = FALSE hides the coordinate axes; the default is TRUE, which displays them.

```r
contour(x, y, Z)
# Add a second set of contour lines to the existing plot
contour(x, y, Z, levels = seq(min(z),
        max(z), by = 80), col=5, add=TRUE)
```
```r
contour(x, y, Z, levels = seq(min(z), max(z), by = 50), axes=FALSE)
```
2. Logarithmic axes

Logarithmic x-axis: log="x"
Logarithmic y-axis: log="y"
Logarithmic x-axis and y-axis: log="xy"

```r
x1 <- 1:10; x2 <- 4:13
plot(x1, x2, log="x")
```
```r
plot(x1, x2, log="xy", col=3)
```
3. The type argument

The type argument sets how the data is drawn.
Points only (default): type="p"
Lines only: type="l"
Points connected by line segments that stop short of the points: type="b"
Lines drawn through all points: type="o"
Vertical lines from each point down to the x-axis: type="h"
Stair steps: type="s"
Draw no points or lines: type="n"

```r
x1 <- 1:10; x2 <- 4:13
plot(x1, x2, type = "s")
```
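To compare the settings side by side, the following sketch draws the same data once per `type` value in a 2 by 3 grid of panels. The vector `types` and the loop variable `tp` are made-up names; `par(mfrow = ...)` splits the device into panels and the saved `op` restores the previous layout afterwards.

```r
x1 <- 1:10; x2 <- 4:13
types <- c("p", "l", "b", "o", "h", "s")

op <- par(mfrow = c(2, 3))          # 2 rows x 3 columns of panels
for (tp in types) {
  plot(x1, x2, type = tp, main = paste0('type = "', tp, '"'))
}
par(op)                             # restore the previous layout
```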
4. Other plotting arguments

pch: plotting symbol
cex: symbol size, given as a magnification factor relative to the default
lty: line type
lwd: line width
xlab (ylab): axis titles
main: main title
sub: subtitle

```r
x1 <- 1:10; x2 <- 4:13
plot(x1, x2, pch=12, cex=3, lty=2, lwd=2, col="red",
     xlab="x axis", ylab="y axis", main="straight line", sub="Minor line")
```

## Low-level plotting functions

points(): add points to the graph
lines(): add line segments connecting the given points
text(): add text labels at given points on the graph
abline(): add a straight line to the graph. abline(a, b) draws the line y = a + bx; abline(h = y) draws a horizontal line and abline(v = x) draws a vertical line
title(main = "", sub = ""): add a main title and a subtitle to the graph
axis(side): add an axis; side = 1, 2, 3, 4 means bottom, left, top, and right
legend(): add a legend to the graph

```r
x <- 1:10; y <- 4:13
plot(x, y)
xp <- c(8, 3, 4); yp <- c(9, 10, 5)
points(xp, yp, pch=16, col="green") # Add three filled green points
lines(xp, yp, col="blue")           # Connect them with blue line segments
text(x, y)                          # Label each original point with its index
abline(3, 4)                        # Draw the line y = 3 + 4x
legend("topleft", inset = .01, title = "legend",
       legend = c("A", "B"), lty=c(1, 2), pch=c(15, 17))
```
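A common use of `abline(a, b)` is to draw a fitted regression line, since `a` is the intercept and `b` the slope. The sketch below fits a line to the same data with `lm()`, which is not used elsewhere in this post and is brought in only for illustration; `fit` is a made-up name.

```r
x <- 1:10; y <- 4:13

fit <- lm(y ~ x)           # least-squares line through the points
coef(fit)                  # intercept 3 and slope 1, since y = x + 3 exactly

plot(x, y)
abline(a = 3, b = 1, col = "red")   # the line y = 3 + 1 * x
abline(h = 8, v = 5, lty = 2)       # dashed horizontal and vertical lines
```

`abline(fit)` draws the same line directly from the fitted model.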

Tags: R Language

Posted by stevehossy on Wed, 01 Jun 2022 17:10:50 +0530