Foreword
This article records some of my ideas for Homework 2 of Li Hongyi's 2022 machine learning course, documenting my own learning process; the main references are attached at the end of the article.
The HW2 task:
This assignment belongs to the speech recognition part of the course: we need to predict phonemes from the given audio data. The data preprocessing step, extracting MFCC features from the raw waveform, has already been done by the teaching assistant, so our job is the classification step: frame-level phoneme classification using the pre-extracted MFCC features.
Phoneme classification is the task of predicting phonemes from speech data. A phoneme is the smallest unit of speech in a human language that can distinguish meaning, and it is a basic concept in phonological analysis; every language has its own phoneme system.
The accuracy requirements for each baseline are as follows:

| Level | Accuracy |
| --- | --- |
| simple | 0.45797 |
| medium | 0.69747 |
| strong | 0.75028 |
| boss | 0.82324 |
The following is a summary of the changes and methods applied to the teaching assistant's sample code.
1. Passing the strong baseline
1. concat_nframes
First, following the teaching assistant's hint: since a phoneme spans more than a single frame, concatenating the neighboring frames for training gives better results. So we increase concat_nframes, concatenating a symmetric number of frames on each side; for example, with concat_nframes = 19, the 9 frames before and the 9 frames after are concatenated to the current frame, as sketched below.
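As an illustration, here is a minimal sketch of symmetric frame concatenation; the function name and the edge handling (clamping indices, i.e. repeating the boundary frame) are my assumptions, not necessarily the TA's exact implementation:

```python
import torch

def concat_feats(x, concat_n):
    """Concatenate each frame with a symmetric window of neighbors.

    x: (seq_len, feat_dim) tensor of per-frame MFCC features.
    Returns a (seq_len, concat_n * feat_dim) tensor; frames near the
    boundaries reuse the edge frame as padding.
    """
    assert concat_n % 2 == 1, "concat_n must be odd for a symmetric window"
    half = concat_n // 2
    seq_len = x.size(0)
    # For every frame i, gather indices i-half .. i+half, clamped into range.
    idx = torch.arange(seq_len).unsqueeze(1) + torch.arange(-half, half + 1)
    idx = idx.clamp(0, seq_len - 1)        # (seq_len, concat_n)
    return x[idx].reshape(seq_len, -1)     # (seq_len, concat_n * feat_dim)
```

With concat_n = 19 and 39-dimensional MFCC frames, each training example becomes a 19 × 39 = 741-dimensional vector.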
2. Small details
Added Batch Normalization and dropout; a sketch of how they slot into a hidden block follows the links below.

- Advantages of Batch Normalization: link
- About weight decay: link
- About dropout: link
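Here is a minimal sketch of one hidden block with both regularizers; the layer sizes and the dropout rate of 0.25 are assumptions, not the TA's exact values:

```python
import torch.nn as nn

class BasicBlock(nn.Module):
    """One hidden block: Linear -> BatchNorm -> ReLU -> Dropout (sketch)."""
    def __init__(self, input_dim, output_dim, dropout=0.25):
        super().__init__()
        self.block = nn.Sequential(
            nn.Linear(input_dim, output_dim),
            nn.BatchNorm1d(output_dim),  # normalize activations over the batch
            nn.ReLU(),
            nn.Dropout(dropout),         # randomly zero activations during training
        )

    def forward(self, x):
        return self.block(x)
```

Batch Normalization stabilizes and speeds up training, while dropout and weight decay (set through the optimizer, see the next section) reduce overfitting.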
3. Cosine Annealing
Reference: link
Official PyTorch documentation: link
Here we add the following code:

```python
optimizer = torch.optim.AdamW(model.parameters(), lr=learning_rate, weight_decay=0.01)
scheduler = torch.optim.lr_scheduler.CosineAnnealingWarmRestarts(optimizer, T_0=8, T_mult=2, eta_min=learning_rate/2)
```
The cosine annealing learning rate formula is

$$\eta_t = \eta_{min} + \frac{1}{2}(\eta_{max} - \eta_{min})\left(1 + \cos\left(\frac{T_{cur}}{T_i}\pi\right)\right)$$

where $T_{cur}$ is the number of epochs since the last restart and $T_i$ is the number of epochs in the current cycle.
The function signature is as follows:

```python
torch.optim.lr_scheduler.CosineAnnealingWarmRestarts(optimizer, T_0, T_mult=1, eta_min=0, last_epoch=-1, verbose=False)
```
`T_0` is the length of the first cycle: exactly the number of epochs needed for the learning rate to go from the maximum value back up to the next maximum (i.e. until the first restart). Each subsequent cycle takes `T_mult` times as many epochs as the previous one. `eta_min` is the minimum learning rate. `last_epoch` is the index of the last epoch and defaults to -1. When `verbose` is `True`, the learning rate of each epoch is printed automatically.
We verify this with the following code:

```python
import torch
import torch.nn as nn
from torch.optim.lr_scheduler import CosineAnnealingWarmRestarts
import matplotlib.pyplot as plt

class Simple_Model(nn.Module):
    """Dummy model: we only need its parameters to drive the optimizer."""
    def __init__(self):
        super(Simple_Model, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=3, kernel_size=1)

    def forward(self, x):
        pass

learning_rate = 0.0001
model = Simple_Model()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
scheduler = CosineAnnealingWarmRestarts(optimizer, T_0=8, T_mult=2, eta_min=learning_rate/2)
print('initial learning rate', optimizer.defaults['lr'])

lr_get = []  # save lr for plotting
for epoch in range(1, 100):
    # a real training step would go here
    optimizer.zero_grad()
    optimizer.step()
    lr_get.append(optimizer.param_groups[0]['lr'])
    scheduler.step()

# plot how the learning rate changes under CosineAnnealingWarmRestarts
plt.plot(list(range(1, 100)), lr_get)
plt.xlabel('epoch')
plt.ylabel('learning_rate')
plt.title('How CosineAnnealingWarmRestarts goes')
plt.show()
```
The result is as follows: the learning rate decays from the maximum along a cosine curve and restarts at epoch 8, then at epoch 24 (8 + 16) and epoch 56 (24 + 32), each cycle twice as long as the previous one.
Finally, through the above steps, our model achieves the following result:
2. Follow-up improvements
1. LSTM (Long Short-Term Memory)
I am still studying this part and will fill it in after I have learned it; I expect to finish within the next three days.
In the meantime, the code below is a rough sketch of what an LSTM-based classifier could look like (example only; the hidden size, number of layers, and dropout rate are my assumptions):
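```python
import torch.nn as nn

class LSTMClassifier(nn.Module):
    """Frame-level phoneme classifier built on a bidirectional LSTM (sketch).

    Assumed dimensions: 39-dim MFCC frames in, 41 phoneme classes out;
    hidden_dim, num_layers, and dropout are placeholder hyperparameters.
    """
    def __init__(self, input_dim=39, hidden_dim=256, num_layers=3,
                 num_classes=41, dropout=0.25):
        super().__init__()
        self.lstm = nn.LSTM(input_dim, hidden_dim, num_layers,
                            batch_first=True, bidirectional=True, dropout=dropout)
        self.fc = nn.Linear(hidden_dim * 2, num_classes)  # *2: forward + backward

    def forward(self, x):
        # x: (batch, seq_len, input_dim) -> logits: (batch, seq_len, num_classes)
        out, _ = self.lstm(x)
        return self.fc(out)
```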
Summary
Not yet concluded.
References
- Link: link
- [ML2021 Li Hongyi Machine Learning] Homework 2: TIMIT phoneme classification, explanation of ideas (Bilibili). Link: link
- Link: link