Chapter 1 Fundamentals of machine learning (1) NumPy
1. NumPy overview
NumPy (Numerical Python) is an open-source numerical computation extension for Python. It can store and process large matrices much more efficiently than Python's own nested list structure (which can also be used to represent a matrix). It supports a large number of multidimensional array and matrix operations, and it also provides a large library of mathematical functions for array operations. (From Baidu Encyclopedia)
NumPy provides an N-dimensional array type, ndarray, which describes a collection of "items" of the same data type.
The difference between NumPy's ndarray and Python's native list:
As the picture shows, an ndarray stores its data in one contiguous block of memory, which makes batch operations on array elements faster. This is possible because all elements of an ndarray have the same type, whereas the elements of a Python list can be of any type. A native Python list stores references, so each element can only be found by following its address. The price is that NumPy's ndarray is less general than Python's native list, but in scientific computing an ndarray lets you omit many loop statements, and the code is much simpler than with a native list.
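To make the comparison concrete, here is a minimal sketch. The variable names are illustrative and not taken from the original picture. It shows that a list can mix element types while an ndarray has a single dtype, and that an element-wise operation needs a loop on a list but not on an ndarray.

```python
import numpy as np

# A Python list may hold elements of different types.
py_list = [1, 2.5, "three"]
list_types = [type(x).__name__ for x in py_list]
print(list_types)  # ['int', 'float', 'str']

# An ndarray has exactly one dtype; mixed ints and floats are promoted to float64.
arr = np.array([1, 2.5, 3])
print(arr.dtype)  # float64

# Doubling every element: the list needs an explicit loop (here a comprehension) ...
numbers = [1, 2, 3, 4, 5]
doubled_list = [x * 2 for x in numbers]

# ... while the ndarray applies the operation to all elements at once.
doubled_arr = np.array(numbers) * 2

print(doubled_list)  # [2, 4, 6, 8, 10]
print(doubled_arr)   # [ 2  4  6  8 10]
```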
2. Basic operations of NumPy
This tutorial is only meant to make it easier to read and work with the datasets used in machine learning, so it is kept relatively simple. If you want to learn more about NumPy, please head over to GitHub for the NumPy Chinese documentation.
2.1 Basic operations - reading and modifying values
If you need to use NumPy, you should first install the NumPy library. I personally recommend installing it through Anaconda; for the steps, please see the Anaconda installation tutorial.
If you already have the NumPy library, you can skip the step above: enter import numpy as np in your editor or interpreter to import the package, and have fun~
OK, joking aside, let's get down to business. First, import the package and define the sample data used in the examples.
# NumPy
import numpy as np  # import the package

# Values used in the examples
a = np.array(['a', 'b', 'c', 'd', 'e'])
b = np.array([1, 2, 3, 4, 5])
c = np.array([6, 7, 8, 9, 10])
d = np.eye(4)
print("a ===>", a)
print("b ===>", b)
print("c ===>", c)
print("d ===>", d)
The results obtained are:
a ===> ['a' 'b' 'c' 'd' 'e']
b ===> [1 2 3 4 5]
c ===> [ 6  7  8  9 10]
d ===> [[1. 0. 0. 0.]
 [0. 1. 0. 0.]
 [0. 0. 1. 0.]
 [0. 0. 0. 1.]]
Next, read and modify values:
# Read and modify
print(a[1])     # output: b
# Read a specific element of a two-dimensional array
print(d[1, 1])  # output: 1.0
# Read one row (row index 1)
print(d[1])     # output: [0. 1. 0. 0.]
# Read one column (column index 2). Note: d[,2] is a syntax error. The first
# index always refers to the row axis, so use ":" there to select every row.
print(d[:, 2])  # output: [0. 0. 1. 0.]
# Modify the 2D array
d[1, 2] = 3
print(d)
"""The result is:
[[1. 0. 0. 0.]
 [0. 1. 3. 0.]
 [0. 0. 1. 0.]
 [0. 0. 0. 1.]]"""
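Rows and columns look the same in an identity matrix, so the following sketch repeats the reads on a hypothetical 3x4 matrix (m is not part of the tutorial data) where they are easy to tell apart.

```python
import numpy as np

# Hypothetical 3x4 matrix, chosen so that rows and columns differ.
m = np.arange(12).reshape(3, 4)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

row = m[1]       # the whole row at index 1
col = m[:, 2]    # every row, column index 2
item = m[1, 2]   # a single element: row 1, column 2

print(row)   # [4 5 6 7]
print(col)   # [ 2  6 10]
print(item)  # 6

# Assigning through an index changes the array in place.
m[1, 2] = 100
print(m[1])  # [  4   5 100   7]
```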
2.2 Basic operations - matrix operations
# The four arithmetic operations (all applied element by element)
# Addition
print(b + c)   # output: [ 7  9 11 13 15]
# Subtraction
print(b - c)   # output: [-5 -5 -5 -5 -5]
# Multiplication
print(b * c)   # output: [ 6 14 24 36 50]
# Division
print(b / c)   # output: [0.16666667 0.28571429 0.375      0.44444444 0.5       ]
# Power
print(b ** 2)  # output: [ 1  4  9 16 25]
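One point worth checking for yourself: the * operator above multiplies matching positions, which is not the row-by-column matrix product from linear algebra. The sketch below contrasts the two using the @ operator; the 2x2 matrices m and n are made up for illustration.

```python
import numpy as np

# Hypothetical 2x2 matrices for illustration.
m = np.array([[1, 2], [3, 4]])
n = np.array([[5, 6], [7, 8]])

elementwise = m * n   # multiplies matching positions
product = m @ n       # row-by-column matrix product

print(elementwise)
# [[ 5 12]
#  [21 32]]
print(product)
# [[19 22]
#  [43 50]]

# For 1-D arrays such as b and c from the tutorial, @ gives the dot product.
b = np.array([1, 2, 3, 4, 5])
c = np.array([6, 7, 8, 9, 10])
print(b @ c)  # 130
```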
2.3 Basic operations - matrix properties
In machine learning, there are many places where we need to know the properties of matrices, such as dimensions, types, etc.
# shape returns the length of each dimension as a tuple
print(a.shape)  # output: (5,)
print(d.shape)  # output: (4, 4)
# ndim returns the number of dimensions
print(a.ndim)   # output: 1 ===> one-dimensional array
print(d.ndim)   # output: 2 ===> two-dimensional array
# dtype shows the element type of the array
print(a.dtype)  # output: <U1
print(d.dtype)  # output: float64
# Specify the data type when creating an array
arr = np.array([1, 2.2, 3, 3.2], dtype="int32")
print(arr, arr.dtype)  # output: [1 2 3 3] int32
# If a value cannot be converted to the requested type, an error is raised
# arr1 = np.array(['1', '2.2', 'a', 'ier'], dtype="int32")
# print(arr1)
# Use astype() to convert the data type. It returns a new array with the
# requested type, while the original array remains unchanged
arr = np.array([1, 2.2, 3.0, 4, 5, 6.6])
e1 = arr.astype(int)
print(e1, e1.dtype)  # output: [1 2 3 4 5 6] int32 (int64 where the default integer is 64-bit)
e2 = arr.astype(str)  # the np.str alias is gone from newer NumPy versions; use the built-in str
print(e2, e2.dtype)  # output: ['1.0' '2.2' '3.0' '4.0' '5.0' '6.6'] <U32
# itemsize: size in bytes of one element
print(a.itemsize)  # output: 4
# nbytes: total bytes used by all elements
print(d.nbytes)  # output: 128
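The following sketch checks two of the claims above: that astype() leaves the original array untouched, and that an impossible conversion raises an error. It assumes a ValueError is the exception raised for a non-numeric string, and it requests np.int32 explicitly so the result does not depend on the platform's default integer size.

```python
import numpy as np

arr = np.array([1, 2.2, 3.0, 4, 5, 6.6])
as_int = arr.astype(np.int32)    # explicit width, independent of the platform

print(as_int, as_int.dtype)      # [1 2 3 4 5 6] int32
print(arr.dtype)                 # float64, the original is unchanged

# A string that is not a number cannot become an integer.
try:
    np.array(['1', 'a'], dtype="int32")
    conversion_failed = False
except ValueError as exc:
    conversion_failed = True
    print("conversion failed:", exc)
```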
2.4 Basic operations - NumPy functions
In machine learning, NumPy functions such as tile and fill are sometimes used. Let's take a look together~
# fill() fills the array with the specified element
a.fill('a')
print(a)  # output: ['a' 'a' 'a' 'a' 'a']
# reshape() regenerates the array in the specified shape without changing the original data
# array.reshape(a, b) reorganizes the array into a rows and b columns
a = np.arange(1, 25)  # np.arange(start, end) generates the values from start to end-1
print(a)
# output: [ 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24]
# If the number of elements does not match, an error is raised
# a.reshape(4, 5)  # output: ValueError
print(a.reshape(4, 6))
# output:
'''
[[ 1  2  3  4  5  6]
 [ 7  8  9 10 11 12]
 [13 14 15 16 17 18]
 [19 20 21 22 23 24]]
'''
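A short sketch of how the two functions behave differently, under the assumption that you run it in a fresh session: reshape() hands back a new array object and leaves the shape of the original alone, whereas fill() changes the array in place and returns None. The array z is a made-up example.

```python
import numpy as np

a = np.arange(1, 25)
grid = a.reshape(4, 6)

print(a.shape)     # (24,)  the original keeps its shape
print(grid.shape)  # (4, 6)

# 24 elements cannot fill a 4x5 grid, so NumPy raises ValueError.
try:
    a.reshape(4, 5)
    reshape_failed = False
except ValueError:
    reshape_failed = True
print(reshape_failed)  # True

# fill(), by contrast, works in place and returns None.
z = np.zeros(3)   # hypothetical array for illustration
result = z.fill(7)
print(z, result)  # [7. 7. 7.] None
```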
# sum() has an axis parameter that specifies the dimension to sum over
'''
Suppose a is an array of shape (2, 3, 2). Then:
IF axis=0 (-3) THEN a is summed over dimension 0 (the third-to-last dimension), giving an array of shape (3, 2)
IF axis=1 (-2) THEN a is summed over dimension 1 (the second-to-last dimension), giving an array of shape (2, 2)
IF axis=2 (-1) THEN a is summed over dimension 2 (the last dimension), giving an array of shape (2, 3)
'''
a = np.arange(12).reshape(2, 3, 2)
'''
[[[ 0  1]
  [ 2  3]
  [ 4  5]]

 [[ 6  7]
  [ 8  9]
  [10 11]]]
'''
a_0 = a.sum(axis=0)
a_1 = a.sum(axis=1)
a_2 = a.sum(axis=2)
print(a_0, a_0.shape)  # output: [[ 6  8] [10 12] [14 16]] (3, 2)
print(a_1, a_1.shape)  # output: [[ 6  9] [24 27]] (2, 2)
print(a_2, a_2.shape)  # output: [[ 1  5  9] [13 17 21]] (2, 3)
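The pairing of positive and negative axis numbers can be verified with a small loop. This is only a sketch; the shapes list is a helper introduced here for illustration.

```python
import numpy as np

a = np.arange(12).reshape(2, 3, 2)

shapes = []  # illustrative helper: collects the result shape per axis
for axis in range(a.ndim):
    negative_axis = axis - a.ndim          # 0 -> -3, 1 -> -2, 2 -> -1
    by_positive = a.sum(axis=axis)
    by_negative = a.sum(axis=negative_axis)
    # Both spellings name the same dimension, so the sums are identical.
    assert np.array_equal(by_positive, by_negative)
    shapes.append(by_positive.shape)
    print(axis, negative_axis, by_positive.shape)

# Printed lines:
# 0 -3 (3, 2)
# 1 -2 (2, 2)
# 2 -1 (2, 3)
```

The dimension being summed is the one that disappears from the shape, which is a handy way to remember what axis means.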
# tile() copies the original matrix horizontally and vertically.
# As the name suggests, it lays copies of the matrix side by side like tiles.
# np.tile(matrix, (a, b)): repeat the matrix b times horizontally, then a times vertically.
import numpy as np
mat = np.array([[1, 2], [3, 4]])
print(mat)
np.tile(mat, 4)       # same as np.tile(mat, (1, 4)) ===> tile horizontally 4 times, i.e. [[1 2 1 2 1 2 1 2] [3 4 3 4 3 4 3 4]]
np.tile(mat, (3, 1))  # tile vertically 3 times, i.e. [[1 2] [3 4] [1 2] [3 4] [1 2] [3 4]]
# Horizontal + vertical
np.tile(mat, (3, 4))
'''
Results:
[[1 2 1 2 1 2 1 2]
 [3 4 3 4 3 4 3 4]
 [1 2 1 2 1 2 1 2]
 [3 4 3 4 3 4 3 4]
 [1 2 1 2 1 2 1 2]
 [3 4 3 4 3 4 3 4]]
'''
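A quick way to check which number controls which direction is to look at the resulting shapes. The names wide, tall and both are illustrative only.

```python
import numpy as np

mat = np.array([[1, 2], [3, 4]])

wide = np.tile(mat, 4)        # a single number is treated as (1, 4) for a 2-D input
tall = np.tile(mat, (3, 1))   # 3 copies stacked vertically
both = np.tile(mat, (3, 4))   # 3 copies down, 4 copies across

print(wide.shape)  # (2, 8)
print(tall.shape)  # (6, 2)
print(both.shape)  # (6, 8)
print(np.array_equal(wide, np.tile(mat, (1, 4))))  # True
```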
2.5 Basic operations - indexing
Indexes are used in all kinds of basic operations, such as reading values and looping, so this section is very important, but also very simple.
# Index operations
# For a sequence of 6 elements:
# Positive indexes:  0,  1,  2,  3,  4,  5
# Negative indexes: -6, -5, -4, -3, -2, -1
a = np.array([10, 5, 8, 9, 2, 1, 64])
b = np.arange(20).reshape(4, 5)
# Slicing
# One-dimensional slices
print(a[1:3])    # a[start:end] takes indexes start ~ end-1; output: [5 8]
print(a[3:])     # a[start:] takes index start ~ the last element; output: [ 9  2  1 64]
print(a[:3])     # a[:end] takes the first element ~ a[end-1]; output: [10  5  8]
# Negative indexes
print(a[-4:-2])  # a[-start:-end] runs from the start-th element from the end up to, but not including, the end-th from the end; output: [9 2]
# a[::step] takes every step-th element
print(a[::2])    # output: [10  8  2 64]
# Stepped slice a[start:end:step]
b = np.arange(10)  # [0 1 2 3 4 5 6 7 8 9]
print(b[1:7:2])    # [1 3 5]
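The same slicing rules carry over to two dimensions, one slice per axis. Here is a minimal sketch using the 4x5 array from above; the names block, last_row and every_other are illustrative.

```python
import numpy as np

a = np.array([10, 5, 8, 9, 2, 1, 64])
# A negative index i refers to the same element as len(a) + i.
print(np.array_equal(a[-4:-2], a[3:5]))  # True

b = np.arange(20).reshape(4, 5)
# [[ 0  1  2  3  4]
#  [ 5  6  7  8  9]
#  [10 11 12 13 14]
#  [15 16 17 18 19]]

block = b[1:3, 2:4]       # rows 1-2, columns 2-3
last_row = b[-1]          # the last row
every_other = b[:, ::2]   # all rows, columns 0, 2 and 4

print(block)
# [[ 7  8]
#  [12 13]]
print(last_row)           # [15 16 17 18 19]
print(every_other.shape)  # (4, 3)
```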
Closing words:
Dear students, please remember: what is the most important thing in the computer field? Practice. Other people's knowledge remains theirs, and only practice can turn it into your own. I believe you can do it. Come on~
Finally, if any like-minded friends would like to discuss together, please join QQ group [662151913], "An AI newbie's journey into the pit".