# Отчёт по лабораторной работе №3 ## по теме: "Распознавание изображений" --- ### Выполнили: Бригада 2, Мачулина Д.В., Бирюкова А.С., А-02-22 --- ### 1. Создание блокнота и настройка среды. Настройка блокнота для работы с аппаратным ускорителем GPU ```python from google.colab import drive drive.mount('/content/drive') import os os.chdir('/content/drive/MyDrive/Colab Notebooks/is_lab4') from tensorflow import keras from tensorflow.keras import layers from tensorflow.keras.models import Sequential import matplotlib.pyplot as plt import numpy as np from sklearn.metrics import classification_report, confusion_matrix from sklearn.metrics import ConfusionMatrixDisplay ``` ```python import tensorflow as tf device_name = tf.test.gpu_device_name() if device_name != '/device:GPU:0': raise SystemError('GPU device not found') print('Found GPU at: {}'.format(device_name)) ``` --- ### 2. Загрузка данных IMBd. Настройка набора данных (4*2 - 1) ```python # загрузка датасета from keras.datasets import imdb vocabulary_size = 5000 index_from = 3 (X_train, y_train), (X_test, y_test) = imdb.load_data(path="imdb.npz", num_words=vocabulary_size, skip_top=0, maxlen=None, seed=7, start_char=1, oov_char=2, index_from=index_from ) print('Shape of X train:', X_train.shape) print('Shape of y train:', y_train.shape) print('Shape of X test:', X_test.shape) print('Shape of y test:', y_test.shape) ``` Shape of X train: (25000,)
Shape of y train: (25000,)
Shape of X test: (25000,)
Shape of y test: (25000,)
--- ### 3. Вывод отзыва из обучающего множества в виде списка индекса слов. Преобразование списка индексов в текст и вывод отзыва в виде текста. Вывод длины отзыва, метки класса и названия. ```python # создание словаря для перевода индексов в слова # заргузка словаря "слово:индекс" word_to_id = imdb.get_word_index() # уточнение словаря word_to_id = {key:(value + index_from) for key,value in word_to_id.items()} word_to_id[""] = 0 word_to_id[""] = 1 word_to_id[""] = 2 word_to_id[""] = 3 # создание обратного словаря "индекс:слово" id_to_word = {value:key for key,value in word_to_id.items()} ``` ```python #Вывод отзыва из обучающего множества в виде списка индексов слов some_number = 192 review_indices = X_train[some_number] print("Список индексов слов:") print(review_indices) #Преобразование списка индексов в текст review_as_text = ' '.join(id_to_word[id] for id in X_train[some_number]) print("\nОтзыв в виде текста:") print(review_as_text) ``` Список индексов слов: [1, 225, 164, 433, 74, 2753, 35, 2188, 20, 5, 397, 35, 298, 20, 585, 305, 10, 10, 45, 64, 61, 652, 21, 6, 52, 708, 9, 2, 725, 4, 2, 7, 1451, 1089, 105, 17, 230, 17, 2, 9, 1947, 4, 3238, 11, 135, 422, 26, 6, 2, 1021, 378, 1780, 224, 472, 36, 26, 379, 724, 2607, 387, 178, 1582, 4, 771, 36, 2, 112, 2, 34, 6, 2, 10, 10, 300, 103, 6, 2, 2, 8, 516, 25, 30, 252, 8, 376, 90, 51, 1455, 335, 4010, 33, 54, 25, 2440, 90, 125, 10, 10, 241, 1559, 4, 609, 46, 7, 4, 2, 11, 3826, 2, 5, 11, 1011, 7, 4193, 7, 4684, 2, 3498, 90, 8, 3534, 2, 7, 4779, 10, 10, 342, 92, 1414, 979, 4, 568, 44, 4, 2, 5, 331, 2237, 18, 57, 684, 52, 282, 15, 4, 1788, 71, 2, 34, 90, 10, 10, 470, 137, 269, 8, 1090, 387, 129, 761, 46, 7, 129, 1682, 17, 76, 17, 614, 8, 2, 15, 4, 2, 2, 41, 10, 10, 457, 103, 397, 339, 39, 294, 8, 169, 4, 2, 103, 2, 129, 322, 30, 252, 8, 2222, 98, 245, 17, 515, 17, 614, 38, 25, 70, 393, 90, 31, 23, 31, 57, 213, 11, 112, 2, 208, 10, 10, 150, 474, 115, 535, 15, 101, 415, 62, 30, 2, 8, 231, 6, 171, 2497, 467, 134, 2, 2, 21, 4, 105, 11, 135, 422, 26, 38, 2, 5, 97, 38, 111, 1297, 2497, 15, 45, 2620, 1167, 18, 4, 529, 8, 459, 44, 68, 4369, 237, 36, 26, 1484, 7, 68, 205, 399, 14, 1098, 4, 2, 7, 4, 436, 22, 10, 10, 11, 420, 25, 71, 1535, 4, 2, 161, 570, 19, 2, 2, 105, 237, 36, 533, 26, 1348, 2, 2, 18, 487, 14, 2, 36, 872, 8, 97, 1186, 38, 2, 2270, 15, 32, 281, 7, 635, 271, 46, 4, 2054, 10, 10, 300, 4, 2, 1098, 6, 1002, 1004, 6, 568, 1709, 475, 137, 4, 2311, 9, 2359, 57, 53, 74, 747, 2194, 245, 10, 10, 241, 4, 2, 2, 11, 32, 2580, 7, 2, 4983, 11, 3826, 2, 5, 187, 3400, 7, 84, 246, 57, 31, 85, 74, 4, 1021, 378, 186, 8, 1495, 27, 1032, 2003, 10, 10, 342, 4, 2, 2, 35, 1755, 1166, 7, 567, 15, 62, 28, 556, 101, 406, 112, 10, 10, 470, 4, 836, 139, 69, 57, 1546, 1689, 11, 192, 49, 139, 71, 1504, 1677, 2, 39, 298, 102, 10, 10, 4, 64, 1123, 9, 4, 2, 751, 4, 130, 63, 16, 6, 184, 1770, 136, 237, 12, 16, 2, 725, 4, 322, 45, 99, 78, 4, 1057, 1477, 12, 56, 19, 35, 2, 379, 277, 15, 266, 46, 7, 317, 1845, 10, 10, 371, 4, 2, 496, 4, 231, 7, 135, 422, 144, 30, 3013, 7, 533, 128, 246, 36, 144, 43, 847, 8, 2642, 5, 193, 2, 19, 84, 37, 97, 102, 19, 6, 729, 2, 18, 489, 5, 1663] Отзыв в виде текста: there's nothing worse than renting an asian movie and getting an american movie experience instead br br it's only my opinion but a good thriller is upon the of likable intelligent characters as far as is concerned the protagonists in say yes are a married couple nicely done unfortunately they are stupid beyond belief let us count the ways they being by a br br 1 after a to kill you be sure to tell him what hotel you're staying at when you drop him off br br 2 beat the hell out of the in broad and in front of dozens of witnesses allowing him to press of assault br br 3 don't bother telling the police about the and simply assume for no apparently good reason that the cops were by him br br 4 while trying to escape let your lady out of your sight as much as possible to that the her br br 5 after getting help from someone to find the after your wife be sure to send them away as soon as possible so you can face him one on one no point in being right br br now i'd never expect that any person would be to making a few mistakes under these but the characters in say yes are so and make so many unbelievable mistakes that it's effectively impossible for the viewer to care about their safety since they are victims of their own doing this kills the of the entire film br br in case you were wondering the didn't stop with characters since they themselves are surely for writing this they decided to make situations so unrealistic that all sense of reality goes out the window br br 1 the kills a cop inside a police station – while the protagonist is asleep no more than ten feet away br br 2 the in all sorts of activities in broad and around tons of people yet no one other than the married couple seems to notice his odd behavior br br 3 the an absurd amount of violence that would have killed any human being br br 4 the suspense scenes had no imagination whatsoever in fact some scenes were direct rip from american movies br br the only positive is the near the end which was a pretty brutal scene since it was upon the wife it's too bad the filmmakers followed it up with an stupid ending that comes out of left field br br truly the behind the making of say yes should be ashamed of themselves better yet they should just move to california and take with people who make movies with a similar for quality and intelligence ```python #Вывод метки и названия класса class_label = y_train[some_number] class_name = "Positive" if class_label == 1 else "Negative" print(f"\nМетка класса: {class_label} - {class_name}") ``` Метка класса: 0 - Negative --- ### 4. Вывод максимальной и минимальной длины отзыва в обучающем множестве ```python #Вывод длины отзыва max_review_length = len(max(X_train, key=len)) print(f"\nМаксимальная длина отзыва: {max_review_length}") min_review_length = len(min(X_train, key=len)) print(f"\nМинимальная длина отзыва: {min_review_length}") review_length = len(review_indices) print(f"\nДлина отзыва: {review_length}") ``` Максимальная длина отзыва: 2494
Минимальная длина отзыва: 11
Длина отзыва: 502 --- ### 5. Предобработка данных ```python # предобработка данных from tensorflow.keras.utils import pad_sequences max_words = 500 X_train = pad_sequences(X_train, maxlen=max_words, value=0, padding='pre', truncating='post') X_test = pad_sequences(X_test, maxlen=max_words, value=0, padding='pre', truncating='post') ``` --- ### 6. Повтор пунктов 4-3. Вывод о том, как преобразовался отзыв после предобработки ```python #Вывод длины отзыва max_review_length = len(max(X_train, key=len)) print(f"\nМаксимальная длина отзыва: {max_review_length}") min_review_length = len(min(X_train, key=len)) print(f"\nМинимальная длина отзыва: {min_review_length}") review_length = len(review_indices) print(f"\nДлина отзыва: {review_length}") ``` Максимальная длина отзыва: 500
Минимальная длина отзыва: 500
Длина отзыва: 502 ```python #Вывод отзыва из обучающего множества в виде списка индексов слов some_number = 132 review_indices = X_train[some_number] print("Список индексов слов:") print(review_indices) #Преобразование списка индексов в текст review_as_text = ' '.join(id_to_word[id] for id in X_train[some_number]) print("\nОтзыв в виде текста:") print(review_as_text) ``` Список индексов слов: [ 1 225 164 433 74 2753 35 2188 20 5 397 35 298 20 585 305 10 10 45 64 61 652 21 6 52 708 9 2 725 4 2 7 1451 1089 105 17 230 17 2 9 1947 4 3238 11 135 422 26 6 2 1021 378 1780 224 472 36 26 379 724 2607 387 178 1582 4 771 36 2 112 2 34 6 2 10 10 300 103 6 2 2 8 516 25 30 252 8 376 90 51 1455 335 4010 33 54 25 2440 90 125 10 10 241 1559 4 609 46 7 4 2 11 3826 2 5 11 1011 7 4193 7 4684 2 3498 90 8 3534 2 7 4779 10 10 342 92 1414 979 4 568 44 4 2 5 331 2237 18 57 684 52 282 15 4 1788 71 2 34 90 10 10 470 137 269 8 1090 387 129 761 46 7 129 1682 17 76 17 614 8 2 15 4 2 2 41 10 10 457 103 397 339 39 294 8 169 4 2 103 2 129 322 30 252 8 2222 98 245 17 515 17 614 38 25 70 393 90 31 23 31 57 213 11 112 2 208 10 10 150 474 115 535 15 101 415 62 30 2 8 231 6 171 2497 467 134 2 2 21 4 105 11 135 422 26 38 2 5 97 38 111 1297 2497 15 45 2620 1167 18 4 529 8 459 44 68 4369 237 36 26 1484 7 68 205 399 14 1098 4 2 7 4 436 22 10 10 11 420 25 71 1535 4 2 161 570 19 2 2 105 237 36 533 26 1348 2 2 18 487 14 2 36 872 8 97 1186 38 2 2270 15 32 281 7 635 271 46 4 2054 10 10 300 4 2 1098 6 1002 1004 6 568 1709 475 137 4 2311 9 2359 57 53 74 747 2194 245 10 10 241 4 2 2 11 32 2580 7 2 4983 11 3826 2 5 187 3400 7 84 246 57 31 85 74 4 1021 378 186 8 1495 27 1032 2003 10 10 342 4 2 2 35 1755 1166 7 567 15 62 28 556 101 406 112 10 10 470 4 836 139 69 57 1546 1689 11 192 49 139 71 1504 1677 2 39 298 102 10 10 4 64 1123 9 4 2 751 4 130 63 16 6 184 1770 136 237 12 16 2 725 4 322 45 99 78 4 1057 1477 12 56 19 35 2 379 277 15 266 46 7 317 1845 10 10 371 4 2 496 4 231 7 135 422 144 30 3013 7 533 128 246 36 144 43 847 8 2642 5 193 2 19 84 37 97 102 19 6 729 2 18 489] Отзыв в виде текста: there's nothing worse than renting an asian movie and getting an american movie experience instead br br it's only my opinion but a good thriller is upon the of likable intelligent characters as far as is concerned the protagonists in say yes are a married couple nicely done unfortunately they are stupid beyond belief let us count the ways they being by a br br 1 after a to kill you be sure to tell him what hotel you're staying at when you drop him off br br 2 beat the hell out of the in broad and in front of dozens of witnesses allowing him to press of assault br br 3 don't bother telling the police about the and simply assume for no apparently good reason that the cops were by him br br 4 while trying to escape let your lady out of your sight as much as possible to that the her br br 5 after getting help from someone to find the after your wife be sure to send them away as soon as possible so you can face him one on one no point in being right br br now i'd never expect that any person would be to making a few mistakes under these but the characters in say yes are so and make so many unbelievable mistakes that it's effectively impossible for the viewer to care about their safety since they are victims of their own doing this kills the of the entire film br br in case you were wondering the didn't stop with characters since they themselves are surely for writing this they decided to make situations so unrealistic that all sense of reality goes out the window br br 1 the kills a cop inside a police station – while the protagonist is asleep no more than ten feet away br br 2 the in all sorts of activities in broad and around tons of people yet no one other than the married couple seems to notice his odd behavior br br 3 the an absurd amount of violence that would have killed any human being br br 4 the suspense scenes had no imagination whatsoever in fact some scenes were direct rip from american movies br br the only positive is the near the end which was a pretty brutal scene since it was upon the wife it's too bad the filmmakers followed it up with an stupid ending that comes out of left field br br truly the behind the making of say yes should be ashamed of themselves better yet they should just move to california and take with people who make movies with a similar for quality Отбросили два последних слова, чтобы длина отзыва стала равна заданному значению (500) --- ### 7. Вывод предобработанных массивов обучающих и тестовых данных и их размерностей ```python # вывод данных print('X train: \n',X_train) print('X train: \n',X_test) # вывод размерностей print('Shape of X train:', X_train.shape) print('Shape of X test:', X_test.shape) ``` X train:
[[ 0 0 0 ... 104 545 7]
[ 0 0 0 ... 2 262 372]
[ 0 0 0 ... 758 10 10]
...
[ 0 0 0 ... 2 27 375]
[ 0 0 0 ... 11 111 531]
[ 0 0 0 ... 152 1833 12]]

X train:
[[ 0 0 0 ... 2 126 3849]
[ 0 0 0 ... 25 1833 12]
[ 0 0 0 ... 129 249 4262]
...
[ 0 0 0 ... 2 24 1178]
[ 0 0 0 ... 61 278 145]
[ 0 0 0 ... 12 5 358]]
Shape of X train: (25000, 500)
Shape of X test: (25000, 500) --- ### 8. Реализация модели рекуррентной нейронной сети, состоящей из слоев Embedding, LSTM, Dropout, Dense и её обучение. Вывод информации об архитектуре нейронной сети. ```python model = Sequential() model.add(layers.Embedding(input_dim=vocabulary_size, output_dim=32, input_length=max_words, input_shape=(max_words,))) model.add(layers.LSTM(76)) model.add(layers.Dropout(0.3)) model.add(layers.Dense(1, activation='sigmoid')) model.summary() ```
Layer (type) Output Shape Param #
embedding_1 (Embedding) (None, 500, 32) 160,000
lstm_1 (LSTM) (None, 76) 33,136
dropout_1 (Dropout) (None, 76) 0
dense_1 (Dense) (None, 1) 77
Total params: 193,213 (754.74 KB)
Trainable params: 193,213 (754.74 KB)
Non-trainable params: 0 (0.00 B) ```python batch_size = 64 epochs = 5 model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"]) model.fit(X_train, y_train, batch_size=batch_size, epochs=epochs, validation_split=0.2) ``` ```python test_loss, test_acc = model.evaluate(X_test, y_test) print(f"\nTest accuracy: {test_acc}") ``` Test accuracy: 0.8670799732208252 --- ### 9. Оценить качество обучения на тестовых данных. ```python y_score = model.predict(X_test) y_pred = [1 if y_score[i,0]>=0.5 else 0 for i in range(len(y_score))] from sklearn.metrics import classification_report print(classification_report(y_test, y_pred, labels = [0, 1], target_names=['Negative', 'Positive'])) ```
precision recall f1-score support
Negative 0.86 0.88 0.87 12500
Positive 0.88 0.86 0.87 12500
accuracy 0.87 25000
macro avg 0.87 0.87 0.87 25000
weighted avg 0.87 0.87 0.87 25000
```python from sklearn.metrics import roc_curve, auc import matplotlib.pyplot as plt fpr, tpr, thresholds = roc_curve(y_test, y_score) plt.plot(fpr, tpr) plt.grid() plt.xlabel('False Positive Rate') plt.ylabel('True Positive Rate') plt.title('ROC') plt.show() print('Area under ROC is', auc(fpr, tpr)) ``` Area under ROC is 0.936850272 **Вывод**
Модель Количество настраиваемых параметров Количество эпох обучения Качество классификации тестовой выборки
Рекуррентная
193,213
3
accuracy: 0.867
loss: 0.331
ROC: 0.936
В ходе лабораторной работы было изучено применение рекуррентной нейронной сети. Исходя из анализа полученных результатов, представленных в таблице, делаем вывод, что модель хорошо справилась с задачей определения тональности текста: accuracy = 0.867, loss = 0.331. Показатель точности превышает требуемый порог в 0,8. Значение ROC превышает 0,9, что свидетельствует о способности модели хорошо различать два класса - отрицательные и положительные отзывы