{"nbformat":4,"nbformat_minor":0,"metadata":{"colab":{"provenance":[],"gpuType":"T4","mount_file_id":"1QIiOMAmZBjzdMF2Bi5JM-jXjf4HLANTm","authorship_tag":"ABX9TyP9+9H2XR/BwZPK6jFtI0kW"},"kernelspec":{"name":"python3","display_name":"Python 3"},"language_info":{"name":"python"},"accelerator":"GPU"},"cells":[{"cell_type":"markdown","source":["1) Created a new notebook in Google Colab. Imported the libraries and modules required for the work. Configured the notebook to run on a GPU hardware accelerator."],"metadata":{"id":"YqtYef25bm5U"}},{"cell_type":"code","source":["from tensorflow import keras\n","from tensorflow.keras import layers\n","from tensorflow.keras.models import Sequential\n","import matplotlib.pyplot as plt\n","import numpy as np"],"metadata":{"id":"Y1y4sLVsW546","executionInfo":{"status":"ok","timestamp":1765387055442,"user_tz":-180,"elapsed":13450,"user":{"displayName":"Егор Кирсанов","userId":"10290320580506007453"}}},"execution_count":2,"outputs":[]},{"cell_type":"code","source":["import tensorflow as tf\n","device_name = tf.test.gpu_device_name()\n","if device_name != '/device:GPU:0':\n","    raise SystemError('GPU device not found')\n","print('Found GPU at: {}'.format(device_name))"],"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"id":"0AqVK0T-bWzu","executionInfo":{"status":"ok","timestamp":1765387056464,"user_tz":-180,"elapsed":8,"user":{"displayName":"Егор Кирсанов","userId":"10290320580506007453"}},"outputId":"45a1df4b-1766-425c-85c0-c0ec3410336d"},"execution_count":3,"outputs":[{"output_type":"stream","name":"stdout","text":["Found GPU at: /device:GPU:0\n"]}]},{"cell_type":"markdown","source":["2) Loaded the IMDb dataset, which contains movie reviews encoded as sequences of word indices and labeled with two classes: positive and negative. When loading the dataset, set the seed parameter to (4k – 1) = 39, where k = 10 is the team number. 
Printed the shapes of the resulting training and test arrays."],"metadata":{"id":"6ADb8hakbfjl"}},{"cell_type":"code","source":["# load the dataset\n","from keras.datasets import imdb\n","\n","vocabulary_size = 5000\n","index_from = 3\n","\n","(X_train, y_train), (X_test, y_test) = imdb.load_data(\n","    path=\"imdb.npz\",\n","    num_words=vocabulary_size,\n","    skip_top=0,\n","    maxlen=None,\n","    seed=39,\n","    start_char=1,\n","    oov_char=2,\n","    index_from=index_from\n",")\n","\n","# print the array shapes\n","print('Shape of X train:', X_train.shape)\n","print('Shape of y train:', y_train.shape)\n","print('Shape of X test:', X_test.shape)\n","print('Shape of y test:', y_test.shape)"],"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"id":"JE5QF7cLbhKL","executionInfo":{"status":"ok","timestamp":1765387060440,"user_tz":-180,"elapsed":3967,"user":{"displayName":"Егор Кирсанов","userId":"10290320580506007453"}},"outputId":"4668f394-934d-4f2a-8a08-dd9961f12d12"},"execution_count":4,"outputs":[{"output_type":"stream","name":"stdout","text":["Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/imdb.npz\n","\u001b[1m17464789/17464789\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 0us/step\n","Shape of X train: (25000,)\n","Shape of y train: (25000,)\n","Shape of X test: (25000,)\n","Shape of y test: (25000,)\n"]}]},{"cell_type":"markdown","source":["3) Printed one review from the training set as a list of word indices. Converted the list of indices to text and printed the review as text. Printed the review length. 
Printed the class label of this review and the class name (1 – Positive, 0 – Negative)."],"metadata":{"id":"dm8qlFzecBEi"}},{"cell_type":"code","source":["# build a dictionary for converting indices to words\n","# load the \"word:index\" dictionary\n","word_to_id = imdb.get_word_index()\n","\n","# adjust the dictionary: shift indices by index_from and add the special tokens\n","word_to_id = {key:(value + index_from) for key,value in word_to_id.items()}\n","word_to_id[\"<PAD>\"] = 0\n","word_to_id[\"<START>\"] = 1\n","word_to_id[\"<UNK>\"] = 2\n","word_to_id[\"<UNUSED>\"] = 3\n","\n","# build the reverse \"index:word\" dictionary\n","id_to_word = {value:key for key,value in word_to_id.items()}"],"metadata":{"id":"6Uf4JIlDcCGm","executionInfo":{"status":"ok","timestamp":1765387060803,"user_tz":-180,"elapsed":361,"user":{"displayName":"Егор Кирсанов","userId":"10290320580506007453"}},"colab":{"base_uri":"https://localhost:8080/"},"outputId":"8dcff087-874f-4a5b-d579-cd42b1294037"},"execution_count":5,"outputs":[{"output_type":"stream","name":"stdout","text":["Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/imdb_word_index.json\n","\u001b[1m1641221/1641221\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 0us/step\n"]}]},{"cell_type":"code","source":["print(X_train[39])\n","print('len:',len(X_train[39]))"],"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"id":"Z0sjggGrcH-1","executionInfo":{"status":"ok","timestamp":1765387060812,"user_tz":-180,"elapsed":7,"user":{"displayName":"Егор Кирсанов","userId":"10290320580506007453"}},"outputId":"bf26b133-276c-420d-f55d-92e29e1b49b9"},"execution_count":6,"outputs":[{"output_type":"stream","name":"stdout","text":["[1, 3206, 2, 3413, 3852, 2, 2, 73, 256, 19, 4396, 3033, 34, 488, 2, 47, 2993, 4058, 11, 63, 29, 4653, 1496, 27, 4122, 54, 4, 1334, 1914, 380, 1587, 56, 351, 18, 147, 2, 2, 15, 29, 238, 30, 4, 455, 564, 167, 1024, 2, 2, 2, 4, 2, 65, 33, 6, 2, 1062, 3861, 6, 3793, 1166, 7, 1074, 1545, 6, 171, 2, 1134, 388, 
7, 3569, 2, 567, 31, 255, 37, 47, 6, 3161, 1244, 3119, 19, 6, 2, 11, 12, 2611, 120, 41, 419, 2, 17, 4, 3777, 2, 4952, 2468, 1457, 6, 2434, 4268, 23, 4, 1780, 1309, 5, 1728, 283, 8, 113, 105, 1037, 2, 285, 11, 6, 4800, 2905, 182, 5, 2, 183, 125, 19, 6, 327, 2, 7, 2, 668, 1006, 4, 478, 116, 39, 35, 321, 177, 1525, 2294, 6, 226, 176, 2, 2, 17, 2, 1220, 119, 602, 2, 2, 592, 2, 17, 2, 2, 1405, 2, 597, 503, 1468, 2, 2, 17, 2, 1947, 3702, 884, 1265, 3378, 1561, 2, 17, 2, 2, 992, 3217, 2393, 4923, 2, 17, 2, 2, 1255, 2, 2, 2, 117, 17, 6, 254, 2, 568, 2297, 5, 2, 2, 17, 1047, 2, 2186, 2, 1479, 488, 2, 4906, 627, 166, 1159, 2552, 361, 7, 2877, 2, 2, 665, 718, 2, 2, 2, 603, 4716, 127, 4, 2873, 2, 56, 11, 646, 227, 531, 26, 670, 2, 17, 6, 2, 2, 3510, 2, 17, 6, 2, 2, 2, 3014, 17, 6, 2, 668, 2, 503, 1468, 2, 19, 11, 4, 1746, 5, 2, 4778, 11, 31, 7, 41, 1273, 154, 255, 555, 6, 1156, 5, 737, 431]\n","len: 274\n"]}]},{"cell_type":"code","source":["review_as_text = ' '.join(id_to_word[id] for id in X_train[39])\n","print(review_as_text)\n","print('len:',len(review_as_text))"],"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"id":"KLVb0gQ7cMKH","executionInfo":{"status":"ok","timestamp":1765387060830,"user_tz":-180,"elapsed":16,"user":{"displayName":"Егор Кирсанов","userId":"10290320580506007453"}},"outputId":"d95be73d-324b-40de-d096-3a99a9cdfa2b"},"execution_count":7,"outputs":[{"output_type":"stream","name":"stdout","text":["<START> troubled <UNK> magazine photographer <UNK> <UNK> well played with considerable intensity by michael <UNK> has horrific nightmares in which he brutally murders his models when the lovely ladies start turning up dead for real <UNK> <UNK> that he might be the killer writer director william <UNK> <UNK> <UNK> the <UNK> story at a <UNK> pace builds a reasonable amount of tension delivers a few <UNK> effective moments of savage <UNK> violence one woman who has a plastic garbage bag with a <UNK> in it placed over her head <UNK> as the definite <UNK> 
inducing highlight puts a refreshing emphasis on the nicely drawn and engaging true to life characters further <UNK> everything in a plausible everyday world and <UNK> things off with a nice <UNK> of <UNK> female nudity the fine acting from an excellent cast helps matters a whole lot <UNK> <UNK> as <UNK> charming love interest <UNK> <UNK> james <UNK> as <UNK> <UNK> double <UNK> brother b j <UNK> <UNK> as <UNK> concerned psychiatrist dr frank curtis don <UNK> as <UNK> <UNK> gay assistant louis pamela <UNK> as <UNK> <UNK> detective <UNK> <UNK> <UNK> little as a hard <UNK> police chief and <UNK> <UNK> as sweet <UNK> model <UNK> r michael <UNK> polished cinematography makes impressive occasional use of breathtaking <UNK> <UNK> shots jack <UNK> <UNK> <UNK> score likewise does the trick <UNK> up in cool bit parts are robert <UNK> as a <UNK> <UNK> sally <UNK> as a <UNK> <UNK> <UNK> shower as a <UNK> female <UNK> b j <UNK> with in the ring and <UNK> bay in one of her standard old woman roles a solid and enjoyable picture\n","len: 1584\n"]}]},{"cell_type":"markdown","source":["4) Printed the maximum and minimum review lengths in the training set."],"metadata":{"id":"G4MDeQlFcVU0"}},{"cell_type":"code","source":["print('MAX Len: ',len(max(X_train, key=len)))\n","print('MIN Len: ',len(min(X_train, key=len)))"],"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"id":"tGV9ptPGcWXJ","executionInfo":{"status":"ok","timestamp":1765387060843,"user_tz":-180,"elapsed":6,"user":{"displayName":"Егор Кирсанов","userId":"10290320580506007453"}},"outputId":"285002ae-1451-468f-89bb-2b96b426446b"},"execution_count":8,"outputs":[{"output_type":"stream","name":"stdout","text":["MAX Len: 2494\n","MIN Len: 11\n"]}]},{"cell_type":"markdown","source":["5) Preprocessed the data. Chose a single length to which all reviews are brought. 
Short reviews were padded with a special symbol, and long reviews were truncated to the chosen length."],"metadata":{"id":"t6dS8DRnccz6"}},{"cell_type":"code","source":["# data preprocessing\n","from tensorflow.keras.utils import pad_sequences\n","\n","max_words = 500\n","X_train = pad_sequences(X_train, maxlen=max_words, value=0, padding='pre', truncating='post')\n","X_test = pad_sequences(X_test, maxlen=max_words, value=0, padding='pre', truncating='post')"],"metadata":{"id":"eRN7vYrScd_T","executionInfo":{"status":"ok","timestamp":1765387062120,"user_tz":-180,"elapsed":1275,"user":{"displayName":"Егор Кирсанов","userId":"10290320580506007453"}}},"execution_count":9,"outputs":[]},{"cell_type":"markdown","source":["6) Repeated step 4."],"metadata":{"id":"KduPqn6gcmJe"}},{"cell_type":"code","source":["print('MAX Len: ',len(max(X_train, key=len)))\n","print('MIN Len: ',len(min(X_train, key=len)))"],"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"id":"cG5lyZ1icon9","executionInfo":{"status":"ok","timestamp":1765387062147,"user_tz":-180,"elapsed":26,"user":{"displayName":"Егор Кирсанов","userId":"10290320580506007453"}},"outputId":"fcc705bd-894c-4c01-8b03-0e943c05cd15"},"execution_count":10,"outputs":[{"output_type":"stream","name":"stdout","text":["MAX Len: 500\n","MIN Len: 500\n"]}]},{"cell_type":"markdown","source":["7) Repeated step 3. 
Drew a conclusion about how the review changed after preprocessing: it was left-padded with zeros (the <PAD> index) from 274 to 500 tokens, and nothing was truncated."],"metadata":{"id":"8iQi-RT8cvrI"}},{"cell_type":"code","source":["print(X_train[39])\n","print('len:',len(X_train[39]))"],"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"id":"4LedIjtMcwo_","executionInfo":{"status":"ok","timestamp":1765387062160,"user_tz":-180,"elapsed":6,"user":{"displayName":"Егор Кирсанов","userId":"10290320580506007453"}},"outputId":"fd4978d5-fb69-4946-dfbb-a426a2261e5c"},"execution_count":11,"outputs":[{"output_type":"stream","name":"stdout","text":["[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"," 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"," 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"," 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"," 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"," 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"," 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"," 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"," 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"," 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"," 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"," 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"," 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"," 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"," 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"," 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"," 0 0 1 3206 2 3413 3852 2 2 73 256 19 4396 3033\n"," 34 488 2 47 2993 4058 11 63 29 4653 1496 27 4122 54\n"," 4 1334 1914 380 1587 56 351 18 147 2 2 15 29 238\n"," 30 4 455 564 167 1024 2 2 2 4 2 65 33 6\n"," 2 1062 3861 6 3793 1166 7 1074 1545 6 171 2 1134 388\n"," 7 3569 2 567 31 255 37 47 6 3161 1244 3119 19 6\n"," 2 11 12 2611 120 41 419 2 17 4 3777 2 4952 2468\n"," 1457 6 2434 4268 23 4 1780 1309 5 1728 283 8 113 105\n"," 1037 2 285 11 6 4800 2905 182 5 2 183 125 19 6\n"," 327 2 7 2 668 1006 4 478 116 39 35 321 177 1525\n"," 2294 6 226 176 2 2 17 2 1220 119 602 2 2 592\n"," 2 17 2 2 1405 2 597 503 1468 2 2 17 2 1947\n"," 3702 884 1265 3378 1561 2 17 2 2 992 3217 2393 4923 2\n"," 17 2 2 1255 2 2 2 117 17 6 254 2 568 2297\n"," 5 2 2 17 1047 2 2186 2 1479 488 2 4906 627 166\n"," 1159 2552 361 7 2877 2 2 665 718 2 2 2 603 4716\n"," 127 4 2873 2 56 11 646 227 531 26 670 2 17 6\n"," 2 2 3510 2 17 6 2 
2 2 3014 17 6 2 668\n"," 2 503 1468 2 19 11 4 1746 5 2 4778 11 31 7\n"," 41 1273 154 255 555 6 1156 5 737 431]\n","len: 500\n"]}]},{"cell_type":"code","source":["review_as_text = ' '.join(id_to_word[id] for id in X_train[39])\n","print(review_as_text)\n","print('len:',len(review_as_text))"],"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"id":"CSssMrdVc16R","executionInfo":{"status":"ok","timestamp":1765387062206,"user_tz":-180,"elapsed":44,"user":{"displayName":"Егор Кирсанов","userId":"10290320580506007453"}},"outputId":"59e36760-8cd9-4d62-b956-4d18e352836d"},"execution_count":12,"outputs":[{"output_type":"stream","name":"stdout","texttroubled <UNK> magazine photographer <UNK> <UNK> well played with considerable intensity by michael <UNK> has horrific nightmares in which he brutally murders his models when the lovely ladies start turning up dead for real <UNK> <UNK> that he might be the killer writer director william <UNK> <UNK> <UNK> the <UNK> story at a <UNK> pace builds a reasonable amount of tension delivers a few <UNK> effective moments of savage <UNK> violence one woman who has a plastic garbage bag with a <UNK> in it placed over her head <UNK> as the definite <UNK> inducing highlight puts a refreshing emphasis on the nicely drawn and engaging true to life characters further <UNK> everything in a plausible everyday world and <UNK> things off with a nice <UNK> of <UNK> female nudity the fine acting from an excellent cast helps matters a whole lot <UNK> <UNK> as <UNK> charming love interest <UNK> <UNK> james <UNK> as <UNK> <UNK> double <UNK> brother b j <UNK> <UNK> as <UNK> concerned psychiatrist dr frank curtis don <UNK> as <UNK> <UNK> gay assistant louis pamela <UNK> as <UNK> <UNK> detective <UNK> <UNK> <UNK> little as a hard <UNK> police chief and <UNK> <UNK> as sweet <UNK> model <UNK> r michael <UNK> polished cinematography makes impressive occasional use of breathtaking <UNK> <UNK> shots jack <UNK> <UNK> <UNK> score likewise does the trick 
<UNK> up in cool bit parts are robert <UNK> as a <UNK> <UNK> sally <UNK> as a <UNK> <UNK> <UNK> shower as a <UNK> female <UNK> b j <UNK> with in the ring and <UNK> bay in one of her standard old woman roles a solid and enjoyable picture\n","len: 2940\n"]}]},{"cell_type":"markdown","source":["8) Printed the preprocessed training and test arrays and their shapes."],"metadata":{"id":"q0vqP9aFc58N"}},{"cell_type":"code","source":["# print the data\n","print('X train: \\n',X_train)\n","print('X test: \\n',X_test)\n","\n","# print the array shapes\n","print('Shape of X train:', X_train.shape)\n","print('Shape of X test:', X_test.shape)"],"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"id":"g64JxLfVc607","executionInfo":{"status":"ok","timestamp":1765387062215,"user_tz":-180,"elapsed":6,"user":{"displayName":"Егор Кирсанов","userId":"10290320580506007453"}},"outputId":"18b90fd0-1838-4396-d59d-b248fda1d34b"},"execution_count":13,"outputs":[{"output_type":"stream","name":"stdout","text":["X train: \n"," [[ 0 0 0 ... 7 4 2407]\n"," [ 0 0 0 ... 34 705 2]\n"," [ 0 0 0 ... 2222 8 369]\n"," ...\n"," [ 0 0 0 ... 11 4 4596]\n"," [ 0 0 0 ... 574 42 24]\n"," [ 0 0 0 ... 7 13 3891]]\n","X test: \n"," [[ 0 0 0 ... 6 52 20]\n"," [ 0 0 0 ... 62 30 821]\n"," [ 0 0 0 ... 24 3081 25]\n"," ...\n"," [ 0 0 0 ... 19 666 3159]\n"," [ 0 0 0 ... 7 15 1716]\n"," [ 0 0 0 ... 1194 61 113]]\n","Shape of X train: (25000, 500)\n","Shape of X test: (25000, 500)\n"]}]},{"cell_type":"markdown","source":["9) Implemented a recurrent neural network model consisting of Embedding, LSTM, Dropout, and Dense layers, and trained it on the training data, holding out part of the training data for validation. Printed information about the network architecture. 
Achieved an accuracy of at least 0.8 (validation accuracy 0.8666 after three epochs)."],"metadata":{"id":"tt0ie0K0dAbR"}},{"cell_type":"code","source":["embed_dim = 32\n","lstm_units = 64\n","\n","model = Sequential()\n","model.add(layers.Embedding(input_dim=vocabulary_size, output_dim=embed_dim, input_length=max_words, input_shape=(max_words,)))\n","model.add(layers.LSTM(lstm_units))\n","model.add(layers.Dropout(0.5))\n","model.add(layers.Dense(1, activation='sigmoid'))\n","\n","model.summary()"],"metadata":{"colab":{"base_uri":"https://localhost:8080/","height":363},"id":"jD3pkS_qdBmo","executionInfo":{"status":"ok","timestamp":1765387064134,"user_tz":-180,"elapsed":1912,"user":{"displayName":"Егор Кирсанов","userId":"10290320580506007453"}},"outputId":"9e9c0371-8141-4d53-afb6-ebe331e090e3"},"execution_count":14,"outputs":[{"output_type":"stream","name":"stderr","text":["/usr/local/lib/python3.12/dist-packages/keras/src/layers/core/embedding.py:97: UserWarning: Argument `input_length` is deprecated. Just remove it.\n"," warnings.warn(\n","/usr/local/lib/python3.12/dist-packages/keras/src/layers/core/embedding.py:100: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. 
When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.\n"," super().__init__(**kwargs)\n"]},{"output_type":"display_data","data":{"text/plain":["\u001b[1mModel: \"sequential\"\u001b[0m\n"],"text/html":["<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\">Model: \"sequential\"</span>\n","</pre>\n"]},"metadata":{}},{"output_type":"display_data","data":{"text/plain":["┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓\n","┃\u001b[1m \u001b[0m\u001b[1mLayer (type) \u001b[0m\u001b[1m \u001b[0m┃\u001b[1m \u001b[0m\u001b[1mOutput Shape \u001b[0m\u001b[1m \u001b[0m┃\u001b[1m \u001b[0m\u001b[1m Param #\u001b[0m\u001b[1m \u001b[0m┃\n","┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩\n","│ embedding (\u001b[38;5;33mEmbedding\u001b[0m) │ (\u001b[38;5;45mNone\u001b[0m, \u001b[38;5;34m500\u001b[0m, \u001b[38;5;34m32\u001b[0m) │ \u001b[38;5;34m160,000\u001b[0m │\n","├─────────────────────────────────┼────────────────────────┼───────────────┤\n","│ lstm (\u001b[38;5;33mLSTM\u001b[0m) │ (\u001b[38;5;45mNone\u001b[0m, \u001b[38;5;34m64\u001b[0m) │ \u001b[38;5;34m24,832\u001b[0m │\n","├─────────────────────────────────┼────────────────────────┼───────────────┤\n","│ dropout (\u001b[38;5;33mDropout\u001b[0m) │ (\u001b[38;5;45mNone\u001b[0m, \u001b[38;5;34m64\u001b[0m) │ \u001b[38;5;34m0\u001b[0m │\n","├─────────────────────────────────┼────────────────────────┼───────────────┤\n","│ dense (\u001b[38;5;33mDense\u001b[0m) │ (\u001b[38;5;45mNone\u001b[0m, \u001b[38;5;34m1\u001b[0m) │ \u001b[38;5;34m65\u001b[0m │\n","└─────────────────────────────────┴────────────────────────┴───────────────┘\n"],"text/html":["<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier 
New',monospace\">┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓\n","┃<span style=\"font-weight: bold\"> Layer (type) </span>┃<span style=\"font-weight: bold\"> Output Shape </span>┃<span style=\"font-weight: bold\"> Param # </span>┃\n","┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩\n","│ embedding (<span style=\"color: #0087ff; text-decoration-color: #0087ff\">Embedding</span>) │ (<span style=\"color: #00d7ff; text-decoration-color: #00d7ff\">None</span>, <span style=\"color: #00af00; text-decoration-color: #00af00\">500</span>, <span style=\"color: #00af00; text-decoration-color: #00af00\">32</span>) │ <span style=\"color: #00af00; text-decoration-color: #00af00\">160,000</span> │\n","├─────────────────────────────────┼────────────────────────┼───────────────┤\n","│ lstm (<span style=\"color: #0087ff; text-decoration-color: #0087ff\">LSTM</span>) │ (<span style=\"color: #00d7ff; text-decoration-color: #00d7ff\">None</span>, <span style=\"color: #00af00; text-decoration-color: #00af00\">64</span>) │ <span style=\"color: #00af00; text-decoration-color: #00af00\">24,832</span> │\n","├─────────────────────────────────┼────────────────────────┼───────────────┤\n","│ dropout (<span style=\"color: #0087ff; text-decoration-color: #0087ff\">Dropout</span>) │ (<span style=\"color: #00d7ff; text-decoration-color: #00d7ff\">None</span>, <span style=\"color: #00af00; text-decoration-color: #00af00\">64</span>) │ <span style=\"color: #00af00; text-decoration-color: #00af00\">0</span> │\n","├─────────────────────────────────┼────────────────────────┼───────────────┤\n","│ dense (<span style=\"color: #0087ff; text-decoration-color: #0087ff\">Dense</span>) │ (<span style=\"color: #00d7ff; text-decoration-color: #00d7ff\">None</span>, <span style=\"color: #00af00; text-decoration-color: #00af00\">1</span>) │ <span style=\"color: #00af00; text-decoration-color: #00af00\">65</span> 
│\n","└─────────────────────────────────┴────────────────────────┴───────────────┘\n","</pre>\n"]},"metadata":{}},{"output_type":"display_data","data":{"text/plain":["\u001b[1m Total params: \u001b[0m\u001b[38;5;34m184,897\u001b[0m (722.25 KB)\n"],"text/html":["<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\"> Total params: </span><span style=\"color: #00af00; text-decoration-color: #00af00\">184,897</span> (722.25 KB)\n","</pre>\n"]},"metadata":{}},{"output_type":"display_data","data":{"text/plain":["\u001b[1m Trainable params: \u001b[0m\u001b[38;5;34m184,897\u001b[0m (722.25 KB)\n"],"text/html":["<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\"> Trainable params: </span><span style=\"color: #00af00; text-decoration-color: #00af00\">184,897</span> (722.25 KB)\n","</pre>\n"]},"metadata":{}},{"output_type":"display_data","data":{"text/plain":["\u001b[1m Non-trainable params: \u001b[0m\u001b[38;5;34m0\u001b[0m (0.00 B)\n"],"text/html":["<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\"> Non-trainable params: </span><span style=\"color: #00af00; text-decoration-color: #00af00\">0</span> (0.00 B)\n","</pre>\n"]},"metadata":{}}]},{"cell_type":"code","source":["# compile and train the model\n","batch_size = 64\n","epochs = 3\n","model.compile(loss=\"binary_crossentropy\", optimizer=\"adam\", metrics=[\"accuracy\"])\n","model.fit(X_train, y_train, batch_size=batch_size, epochs=epochs, validation_split=0.2)"],"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"id":"UK281-h_dcQk","executionInfo":{"status":"ok","timestamp":1765387088620,"user_tz":-180,"elapsed":24484,"user":{"displayName":"Егор 
Кирсанов","userId":"10290320580506007453"}},"outputId":"b252ad63-dabc-44fb-bf64-2d9e1e431d62"},"execution_count":15,"outputs":[{"output_type":"stream","name":"stdout","text":["Epoch 1/3\n","\u001b[1m313/313\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m11s\u001b[0m 23ms/step - accuracy: 0.6315 - loss: 0.6268 - val_accuracy: 0.8072 - val_loss: 0.4273\n","Epoch 2/3\n","\u001b[1m313/313\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m6s\u001b[0m 20ms/step - accuracy: 0.8559 - loss: 0.3469 - val_accuracy: 0.8496 - val_loss: 0.3603\n","Epoch 3/3\n","\u001b[1m313/313\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m7s\u001b[0m 21ms/step - accuracy: 0.8993 - loss: 0.2662 - val_accuracy: 0.8666 - val_loss: 0.3242\n"]},{"output_type":"execute_result","data":{"text/plain":["<keras.src.callbacks.history.History at 0x7ee57a9e6d20>"]},"metadata":{},"execution_count":15}]},{"cell_type":"code","source":["test_loss, test_acc = model.evaluate(X_test, y_test)\n","print(f\"\\nTest accuracy: {test_acc}\")"],"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"id":"2Jdh6S_8dgXE","executionInfo":{"status":"ok","timestamp":1765387095405,"user_tz":-180,"elapsed":6779,"user":{"displayName":"Егор Кирсанов","userId":"10290320580506007453"}},"outputId":"492ad1bc-cd39-4e7c-dca5-c523652318b4"},"execution_count":16,"outputs":[{"output_type":"stream","name":"stdout","text":["\u001b[1m782/782\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m7s\u001b[0m 9ms/step - accuracy: 0.8714 - loss: 0.3110\n","\n","Test accuracy: 0.8674799799919128\n"]}]},{"cell_type":"markdown","source":["10) Evaluated the trained model on the test data:\n","- printed the value of the classification quality metric on the test data\n","- printed a classification report for the test set\n","- plotted the ROC curve from the test-set predictions and computed the area under the ROC curve (AUC 
ROC)"],"metadata":{"id":"sDkhhezJdpNi"}},{"cell_type":"code","source":["# classification quality metric on the test data\n","print(f\"\\nTest accuracy: {test_acc}\")"],"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"id":"OeP6ss3CdotA","executionInfo":{"status":"ok","timestamp":1765387095425,"user_tz":-180,"elapsed":16,"user":{"displayName":"Егор Кирсанов","userId":"10290320580506007453"}},"outputId":"b8cd954e-ae86-4f53-fed5-f46a01a72288"},"execution_count":17,"outputs":[{"output_type":"stream","name":"stdout","text":["\n","Test accuracy: 0.8674799799919128\n"]}]},{"cell_type":"code","source":["# classification report for the test set\n","y_score = model.predict(X_test)\n","y_pred = [1 if y_score[i,0]>=0.5 else 0 for i in range(len(y_score))]\n","\n","from sklearn.metrics import classification_report\n","print(classification_report(y_test, y_pred, labels = [0, 1], target_names=['Negative', 'Positive']))"],"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"id":"YyVafpMldt5g","executionInfo":{"status":"ok","timestamp":1765387105893,"user_tz":-180,"elapsed":10466,"user":{"displayName":"Егор Кирсанов","userId":"10290320580506007453"}},"outputId":"88178647-99c1-4628-d374-e86bf709fae9"},"execution_count":18,"outputs":[{"output_type":"stream","name":"stdout","text":["\u001b[1m782/782\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m6s\u001b[0m 7ms/step\n"," precision recall f1-score support\n","\n"," Negative 0.86 0.88 0.87 12500\n"," Positive 0.87 0.86 0.87 12500\n","\n"," accuracy 0.87 25000\n"," macro avg 0.87 0.87 0.87 25000\n","weighted avg 0.87 0.87 0.87 25000\n","\n"]}]},{"cell_type":"code","source":["# plot the ROC curve and compute AUC ROC\n","from sklearn.metrics import roc_curve, auc\n","\n","fpr, tpr, thresholds = roc_curve(y_test, y_score)\n","plt.plot(fpr, tpr)\n","plt.grid()\n","plt.xlabel('False Positive Rate')\n","plt.ylabel('True Positive Rate')\n","plt.title('ROC')\n","plt.show()\n","print('AUC 
ROC:', auc(fpr, tpr))"],"metadata":{"colab":{"base_uri":"https://localhost:8080/","height":490},"id":"N05HejFXdyp-","executionInfo":{"status":"ok","timestamp":1765387106261,"user_tz":-180,"elapsed":366,"user":{"displayName":"Егор Кирсанов","userId":"10290320580506007453"}},"outputId":"acc69a49-92f4-4e4c-cd2d-e2870ddda448"},"execution_count":19,"outputs":[{"output_type":"display_data","data":{"text/plain":["<Figure size 640x480 with 1 Axes>"],"image/png":"\n"},"metadata":{}},{"output_type":"stream","name":"stdout","text":["AUC ROC: 0.9387573504\n"]}]}]}
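
Two numbers reported in the notebook can be cross-checked by hand: the 226 padding zeros that appear in front of review 39 after step 5, and the 184,897 parameters printed by `model.summary()` in step 9. The sketch below is a plain-Python illustration, not code from the notebook. The helper names (`pad_pre_truncate_post`, `embedding_params`, `lstm_params`, `dense_params`) are hypothetical, the padding helper only mimics what `pad_sequences(..., padding='pre', truncating='post')` does to a single sequence, and the LSTM formula assumes the standard four-gate layer with one bias vector per gate.

```python
# Hyperparameters copied from the notebook.
vocabulary_size = 5000
embed_dim = 32
lstm_units = 64
max_words = 500


def pad_pre_truncate_post(seq, maxlen, value=0):
    """Hypothetical stand-in for pad_sequences on one sequence.

    truncating='post' drops items from the end of a long sequence,
    padding='pre' puts the fill value in front of a short one.
    """
    seq = list(seq)[:maxlen]
    return [value] * (maxlen - len(seq)) + seq


def embedding_params(vocab, dim):
    # One dim-sized vector per vocabulary index.
    return vocab * dim


def lstm_params(input_dim, units):
    # Four gates, each with an input kernel, a recurrent kernel and a bias.
    return 4 * ((input_dim + units) * units + units)


def dense_params(inputs, outputs):
    # Weight matrix plus one bias per output unit.
    return inputs * outputs + outputs


# Review 39 has 274 tokens; dummy non-zero values stand in for its word indices.
review = list(range(1, 275))
padded = pad_pre_truncate_post(review, max_words)
n_pad = padded.count(0)

counts = {
    "embedding": embedding_params(vocabulary_size, embed_dim),
    "lstm": lstm_params(embed_dim, lstm_units),
    "dense": dense_params(lstm_units, 1),
}
total = sum(counts.values())

print("padding zeros:", n_pad)
print("parameters:", counts, "total:", total)
```

The padding count also explains the decoded text length in step 7: each of the 226 `<PAD>` tokens adds five characters plus a separating space, so the string grows from 1584 to 1584 + 226 * 6 = 2940 characters.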