[Serialized] TensorFlow ML cookbook, Chapter 9, Section 2: Implementing an LSTM Model

Posted by levycui on 2019-11-5 19:14:43
Last edited by levycui on 2019-11-5 19:29
Questions this post addresses:
1. How does the LSTM address the vanishing/exploding gradient problem that variable-length RNNs have?
2. How do we write a function that returns the two vocabulary dictionaries?
3. How do we declare the LSTM model and the test model?
4. How do we use the numpy.roll() function?




Previous post: TensorFlow ML cookbook, Chapter 9, Section 1: Implementing an RNN for Spam Prediction

Implementing an LSTM Model

We will extend our RNN model to be able to use longer sequences by introducing the LSTM unit in this recipe.

Getting ready
Long Short Term Memory (LSTM) is a variant of the traditional RNN. LSTM is a way to address the vanishing/exploding gradient problem that variable-length RNNs have. To address this issue, the LSTM cell introduces an internal forget gate, which can modify the flow of information from one cell to the next. To conceptualize how this works, we will walk through an unbiased version of LSTM one equation at a time. The first step is the same as for the regular RNN:
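In standard textbook notation (which may differ slightly from the symbols used in the book's figures), a bias-free LSTM cell can be written as:

$$
\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1}) \\
f_t &= \sigma(W_f x_t + U_f h_{t-1}) \\
o_t &= \sigma(W_o x_t + U_o h_{t-1}) \\
g_t &= \tanh(W_g x_t + U_g h_{t-1}) \\
c_t &= f_t \odot c_{t-1} + i_t \odot g_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
$$

Here $i_t$, $f_t$, and $o_t$ are the input, forget, and output gates, $g_t$ is the candidate update (its pre-activation is the same weighted combination of $x_t$ and $h_{t-1}$ used by the regular RNN), $c_t$ is the cell memory, and $\odot$ denotes element-wise multiplication.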




The idea with LSTM is to have a self-regulating flow of information through the cells, information that can be forgotten or modified based on what is fed into the cell.

For this recipe, we will use a sequence RNN with LSTM cells to try to predict the next words, trained on the works of Shakespeare. To test how we are doing, we will feed the model candidate phrases, such as thou art more, and see whether the model can figure out what words should follow the phrase.

How to do it…
1. To start, we load the necessary libraries for the script:
[Python]
import os
import re
import string
import requests
import numpy as np
import collections
import random
import pickle
import matplotlib.pyplot as plt
import tensorflow as tf 


2. Next, we start a graph session and set the RNN parameters:
[Python]
sess = tf.Session()
# Set RNN Parameters
min_word_freq = 5
rnn_size = 128
epochs = 10
batch_size = 100
learning_rate = 0.001
training_seq_len = 50
embedding_size = rnn_size
save_every = 500
eval_every = 50
prime_texts = ['thou art more', 'to be or not to', 'wherefore art thou'] 


3. We set up the data and model folders and filenames, and declare the punctuation to remove. We want to keep hyphens and apostrophes because Shakespeare uses them frequently to combine words and syllables:
[Python]
data_dir = 'temp'
data_file = 'shakespeare.txt'
model_path = 'shakespeare_model'
full_model_dir = os.path.join(data_dir, model_path)
# Declare punctuation to remove, everything except hyphens and apostrophes
punctuation = string.punctuation
punctuation = ''.join([x for x in punctuation if x not in ['-', "'"]])


4. Next we get the data. If the data file doesn't exist, we download and save the Shakespeare text. If it does exist, we load the data:
[Python]
if not os.path.exists(full_model_dir):
    os.makedirs(full_model_dir)
# Make data directory
if not os.path.exists(data_dir):
    os.makedirs(data_dir)
print('Loading Shakespeare Data')
# Check if file is downloaded.
if not os.path.isfile(os.path.join(data_dir, data_file)):
    print('Not found, downloading Shakespeare texts from www.gutenberg.org')
    shakespeare_url = 'http://www.gutenberg.org/cache/epub/100/pg100.txt'
    # Get Shakespeare text
    response = requests.get(shakespeare_url)
    shakespeare_file = response.content
    # Decode binary into string
    s_text = shakespeare_file.decode('utf-8')
    # Drop first few descriptive paragraphs.
    s_text = s_text[7675:]
    # Remove newlines
    s_text = s_text.replace('\r\n', '')
    s_text = s_text.replace('\n', '')
    # Write to file
    with open(os.path.join(data_dir, data_file), 'w') as out_conn:
        out_conn.write(s_text)
else:
    # If file has been saved, load from that file
    with open(os.path.join(data_dir, data_file), 'r') as file_conn:
        s_text = file_conn.read().replace('\n', '')

5. We clean the Shakespeare text by removing punctuation and extra whitespace (a quick check of this step follows the listing):
[Python]
s_text = re.sub(r'[{}]'.format(punctuation), ' ', s_text)
s_text = re.sub('\s+', ' ', s_text).strip().lower()
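As a quick, illustrative check of this cleaning step (the sample line below is made up for the example and is not part of the recipe):
[Python]
sample_line = "Shall I compare thee to a summer's day?\nThou art more lovely..."
cleaned = re.sub(r'[{}]'.format(punctuation), ' ', sample_line)
cleaned = re.sub('\s+', ' ', cleaned).strip().lower()
print(cleaned)
# expected: shall i compare thee to a summer's day thou art more lovely
# The apostrophe in "summer's" survives because it was excluded from `punctuation` in step 3.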


6. We now create the Shakespeare vocabulary to use. We write a function that returns two dictionaries (word to index, and index to word) containing only the words that appear more often than a specified frequency (a small sanity check follows the listing):
[Python]
def build_vocab(text, min_word_freq):
    word_counts = collections.Counter(text.split(' '))
    # limit word counts to those more frequent than cutoff
    word_counts = {key: val for key, val in word_counts.items() if val > min_word_freq}
    # Create vocab --> index mapping
    words = word_counts.keys()
    vocab_to_ix_dict = {key: (ix + 1) for ix, key in enumerate(words)}
    # Add unknown key --> 0 index
    vocab_to_ix_dict['unknown'] = 0
    # Create index --> vocab mapping
    ix_to_vocab_dict = {val: key for key, val in vocab_to_ix_dict.items()}
    return(ix_to_vocab_dict, vocab_to_ix_dict)

ix2vocab, vocab2ix = build_vocab(s_text, min_word_freq)
vocab_size = len(ix2vocab) + 1
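A small sanity check you could run after building the vocabulary (the word 'love' below is only an example; actual indices depend on the corpus and on dictionary ordering):
[Python]
print('Vocabulary size (including unknown): {}'.format(vocab_size))
# Round trip word -> index -> word for any word that made the frequency cutoff
test_word = 'love'
if test_word in vocab2ix:
    assert ix2vocab[vocab2ix[test_word]] == test_word
# Words below the cutoff are absent from vocab2ix; step 7 maps them to index 0,
# which the index-to-word dictionary resolves to 'unknown'
print(ix2vocab[0])  # expected: unknown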



7. Now that we have our vocabulary, we turn the Shakespeare text into an array of indices:
[Python]
s_text_words = s_text.split(' ')
s_text_ix = []
for ix, x in enumerate(s_text_words):
    try:
        s_text_ix.append(vocab2ix[x])
    except KeyError:
        s_text_ix.append(0)
s_text_ix = np.array(s_text_ix)


8. In this recipe, we will show how to create the model inside a class object. This is helpful because we want to use the same model (with the same weights) both to train on batches and to generate text from sample phrases, which would be hard to do without a class that has an internal sampling method. Ideally, this class code should sit in a separate Python file that we can import at the beginning of this script:
[Python]
class LSTM_Model():
    def __init__(self, rnn_size, batch_size, learning_rate,
                 training_seq_len, vocab_size, infer=False):
        self.rnn_size = rnn_size
        self.vocab_size = vocab_size
        self.infer = infer
        self.learning_rate = learning_rate

        # When inferring (generating text), we feed one word at a time
        if infer:
            self.batch_size = 1
            self.training_seq_len = 1
        else:
            self.batch_size = batch_size
            self.training_seq_len = training_seq_len

        self.lstm_cell = tf.nn.rnn_cell.BasicLSTMCell(rnn_size)
        self.initial_state = self.lstm_cell.zero_state(self.batch_size, tf.float32)

        self.x_data = tf.placeholder(tf.int32, [self.batch_size, self.training_seq_len])
        self.y_output = tf.placeholder(tf.int32, [self.batch_size, self.training_seq_len])

        with tf.variable_scope('lstm_vars'):
            # Softmax Output Weights
            W = tf.get_variable('W', [self.rnn_size, self.vocab_size], tf.float32,
                                tf.random_normal_initializer())
            b = tf.get_variable('b', [self.vocab_size], tf.float32,
                                tf.constant_initializer(0.0))
            # Define Embedding
            embedding_mat = tf.get_variable('embedding_mat',
                                            [self.vocab_size, self.rnn_size],
                                            tf.float32, tf.random_normal_initializer())
            embedding_output = tf.nn.embedding_lookup(embedding_mat, self.x_data)
            rnn_inputs = tf.split(1, self.training_seq_len, embedding_output)
            rnn_inputs_trimmed = [tf.squeeze(x, [1]) for x in rnn_inputs]

        # If we are inferring (generating text), we add a 'loop' function
        # Define how to get the i+1 th input from the i th output
        def inferred_loop(prev, count):
            prev_transformed = tf.matmul(prev, W) + b
            prev_symbol = tf.stop_gradient(tf.argmax(prev_transformed, 1))
            output = tf.nn.embedding_lookup(embedding_mat, prev_symbol)
            return(output)

        decoder = tf.nn.seq2seq.rnn_decoder
        outputs, last_state = decoder(rnn_inputs_trimmed,
                                      self.initial_state,
                                      self.lstm_cell,
                                      loop_function=inferred_loop if infer else None)
        # Non inferred outputs
        output = tf.reshape(tf.concat(1, outputs), [-1, self.rnn_size])
        # Logits and output
        self.logit_output = tf.matmul(output, W) + b
        self.model_output = tf.nn.softmax(self.logit_output)

        loss_fun = tf.nn.seq2seq.sequence_loss_by_example
        loss = loss_fun([self.logit_output], [tf.reshape(self.y_output, [-1])],
                        [tf.ones([self.batch_size * self.training_seq_len])],
                        self.vocab_size)
        self.cost = tf.reduce_sum(loss) / (self.batch_size * self.training_seq_len)
        self.final_state = last_state
        gradients, _ = tf.clip_by_global_norm(tf.gradients(self.cost, tf.trainable_variables()), 4.5)
        optimizer = tf.train.AdamOptimizer(self.learning_rate)
        self.train_op = optimizer.apply_gradients(zip(gradients, tf.trainable_variables()))

    def sample(self, sess, words=ix2vocab, vocab=vocab2ix, num=10, prime_text='thou art'):
        # Warm up the LSTM state on all but the last prime word
        state = sess.run(self.lstm_cell.zero_state(1, tf.float32))
        word_list = prime_text.split()
        for word in word_list[:-1]:
            x = np.zeros((1, 1))
            x[0, 0] = vocab[word]
            feed_dict = {self.x_data: x, self.initial_state: state}
            [state] = sess.run([self.final_state], feed_dict=feed_dict)

        # Greedily generate the next `num` words
        out_sentence = prime_text
        word = word_list[-1]
        for n in range(num):
            x = np.zeros((1, 1))
            x[0, 0] = vocab[word]
            feed_dict = {self.x_data: x, self.initial_state: state}
            [model_output, state] = sess.run([self.model_output, self.final_state],
                                             feed_dict=feed_dict)
            sample = np.argmax(model_output[0])
            if sample == 0:
                break
            word = words[sample]
            out_sentence = out_sentence + ' ' + word
        return(out_sentence)


9. Now we declare the LSTM model as well as the test model. We do this within a variable scope and tell the scope that we will reuse the variables for the test LSTM model:
[Python]
with tf.variable_scope('lstm_model') as scope:
    # Define LSTM Model
    lstm_model = LSTM_Model(rnn_size, batch_size, learning_rate,
                            training_seq_len, vocab_size)
    scope.reuse_variables()
    test_lstm_model = LSTM_Model(rnn_size, batch_size, learning_rate,
                                 training_seq_len, vocab_size, infer=True)


10. We create a saving operation, and split the input text into chunks of equal batch size; then we initialize the variables of the model (a quick shape check follows the listing):
[Python]
saver = tf.train.Saver()
# Create batches for each epoch
num_batches = int(len(s_text_ix)/(batch_size * training_seq_len)) + 1
# Split up text indices into subarrays, of equal size
batches = np.array_split(s_text_ix, num_batches)
# Reshape each split into [batch_size, training_seq_len]
batches = [np.resize(x, [batch_size, training_seq_len]) for x in batches]
# Initialize all variables
init = tf.initialize_all_variables()
sess.run(init) 
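If you want to confirm the batching before training starts, a quick check like the following (illustrative only; the exact numbers depend on the downloaded corpus) prints the batch count and the shape of one batch:
[Python]
print('Number of batches: {}'.format(num_batches))
print('Shape of one batch: {}'.format(batches[0].shape))
# expected shape: (batch_size, training_seq_len) = (100, 50)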


11. We can now iterate through our epochs, shuffling the data before each epoch starts. The target for our data is just the same data shifted by one value, using the numpy.roll() function (a minimal np.roll() illustration follows the listing):
[Python]
train_loss = []
iteration_count = 1

for epoch in range(epochs):
    # Shuffle word indices
    random.shuffle(batches)
    # Create targets from shuffled batches
    targets = [np.roll(x, -1, axis=1) for x in batches]
    # Run through one epoch
    print('Starting Epoch #{} of {}.'.format(epoch + 1, epochs))
    # Reset initial LSTM state every epoch
    state = sess.run(lstm_model.initial_state)
    for ix, batch in enumerate(batches):
        training_dict = {lstm_model.x_data: batch, lstm_model.y_output: targets[ix]}
        c, h = lstm_model.initial_state
        training_dict[c] = state.c
        training_dict[h] = state.h
        temp_loss, state, _ = sess.run([lstm_model.cost, lstm_model.final_state,
                                        lstm_model.train_op], feed_dict=training_dict)
        train_loss.append(temp_loss)
        # Print status every 10 gens
        if iteration_count % 10 == 0:
            summary_nums = (iteration_count, epoch + 1, ix + 1, num_batches + 1, temp_loss)
            print('Iteration: {}, Epoch: {}, Batch: {} out of {}, Loss: {:.2f}'.format(*summary_nums))
        # Save the model and the vocab
        if iteration_count % save_every == 0:
            # Save model
            model_file_name = os.path.join(full_model_dir, 'model')
            saver.save(sess, model_file_name, global_step=iteration_count)
            print('Model Saved To: {}'.format(model_file_name))
            # Save vocabulary
            dictionary_file = os.path.join(full_model_dir, 'vocab.pkl')
            with open(dictionary_file, 'wb') as dict_file_conn:
                pickle.dump([vocab2ix, ix2vocab], dict_file_conn)
        # Periodically generate sample text from the prime phrases
        if iteration_count % eval_every == 0:
            for sample in prime_texts:
                print(test_lstm_model.sample(sess, ix2vocab, vocab2ix, num=10, prime_text=sample))
        iteration_count += 1
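To make the target construction concrete, here is a minimal, self-contained illustration of what np.roll(x, -1, axis=1) does to one batch (toy numbers, not real word indices):
[Python]
import numpy as np

toy_batch = np.array([[1, 2, 3, 4],
                      [5, 6, 7, 8]])
toy_targets = np.roll(toy_batch, -1, axis=1)
print(toy_targets)
# [[2 3 4 1]
#  [6 7 8 5]]
# Each row is shifted left by one, so position i of the target holds the word that
# follows position i of the input; the last column wraps around to the row's first word.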


12. This results in the following output:
[Python]
Loading Shakespeare Data
Cleaning Text
Building Shakespeare Vocab
Vocabulary Length = 8009
Starting Epoch #1 of 10.
Iteration: 10, Epoch: 1, Batch: 10 out of 182, Loss: 10.37
Iteration: 20, Epoch: 1, Batch: 20 out of 182, Loss: 9.54
...
Iteration: 1790, Epoch: 10, Batch: 161 out of 182, Loss: 5.68
Iteration: 1800, Epoch: 10, Batch: 171 out of 182, Loss: 6.05
thou art more than i am a
to be or not to the man i have
wherefore art thou art of the long
Iteration: 1810, Epoch: 10, Batch: 181 out of 182, Loss: 5.99 


13. Finally, here is how we plot the training loss over the epochs:
[Python]
plt.plot(train_loss, 'k-')
plt.title('Sequence to Sequence Loss')
plt.xlabel('Generation')
plt.ylabel('Loss')
plt.show()


Figure 4: The sequence-to-sequence loss over all generations of the model.

How it works…
In this example, we built an RNN model with LSTM units to predict the next word, based on a Shakespearean vocabulary. A few things could be done to improve the model, such as increasing the sequence size, using a decaying learning rate, or training the model for more epochs.

There's more…
For sampling, we implemented a greedy sampler. Greedy samplers can get stuck repeating the same phrases over and over; for example, the model may get stuck saying for the for the for the…. To prevent this, we could implement a more random way of sampling words, perhaps by weighted sampling based on the logits or probability distribution of the output. A rough sketch of that idea is shown below.
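As one possible variation (not the book's code): inside the sample() method from step 8, the greedy line sample = np.argmax(model_output[0]) could be replaced with a draw from the predicted distribution, for example:
[Python]
# model_output[0] is the softmax distribution over the vocabulary for the next word
probs = model_output[0]
probs = probs / probs.sum()                    # renormalize to guard against rounding error
sample = np.random.choice(len(probs), p=probs)

A temperature parameter (dividing the logits by a constant before the softmax) is another common way to trade off diversity against repetition.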


