Deep Learning with Python series notes (1): Deep learning basics.

A Preliminary Study on Neural Networks

Let's look at a first concrete example of a neural network, one that uses the Python library Keras to learn to classify handwritten digits. The MNIST dataset consists of 28 x 28 grayscale images belonging to 10 classes (the digits 0 through 9). You can think of "solving" MNIST as the "Hello World" of deep learning: it is what you do to verify that your algorithms work as expected.

Loading the MNIST dataset in Keras

from keras.datasets import mnist

(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

train_images and train_labels form the "training set", the data the model will learn from. The model will then be tested on the "test set", test_images and test_labels. The images are encoded as NumPy arrays, and the labels are simply an array of digits ranging from 0 to 9. There is a one-to-one correspondence between the images and the labels.

The training data


>>> train_images.shape
(60000, 28, 28)
>>> len(train_labels)
60000
>>> train_labels
array([5, 0, 4, ..., 5, 6, 8], dtype=uint8)

The test data

>>> test_images.shape
(10000, 28, 28)
>>> len(test_labels)
10000
>>> test_labels
array([7, 2, 1, ..., 4, 5, 6], dtype=uint8)

Our workflow will be as follows. First, we will feed the neural network the training data, train_images and train_labels. The network will then learn to associate images and labels. Finally, we will ask the network to produce predictions for test_images, and we will verify whether these predictions match the labels in test_labels.

Network structure

from keras import models

from keras import layers

network = models.Sequential()

network.add(layers.Dense(512, activation='relu', input_shape=(28 * 28,)))  # fully connected layer: 512 units, relu activation, input size 28 * 28

network.add(layers.Dense(10, activation='softmax'))  # output layer: returns probabilities for the 10 classes

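Here, our network consists of a sequence of two Dense layers, which are densely connected (also called "fully connected") neural layers. The second (and last) layer is a 10-way "softmax" layer, which means it returns an array of 10 probability scores that sum to 1. Each score is the probability that the current digit image belongs to one of the 10 digit classes.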
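To make the softmax output more concrete, here is a minimal NumPy sketch of the function itself. The helper name softmax and the ten raw scores are made up for illustration; this is not how Keras implements the layer internally, only the same mathematical operation applied to one sample.

```python
import numpy as np

def softmax(logits):
    # Subtracting the max improves numerical stability and leaves the result unchanged.
    shifted = logits - np.max(logits)
    exps = np.exp(shifted)
    return exps / np.sum(exps)

# Ten made-up raw scores, one per digit class (illustrative values only).
scores = np.array([1.0, 2.0, 0.5, 0.0, 3.0, 1.5, 0.2, 0.1, 2.5, 0.3])
probs = softmax(scores)

# The predicted class is the index with the highest probability.
predicted_class = int(np.argmax(probs))
print(probs.sum(), predicted_class)
```

The probabilities are all positive and sum to 1, and the class with the largest raw score keeps the largest probability.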

To make our network ready for training, we need to pick three more things as part of the "compilation" step:

1. A loss function: how the network measures its performance on the training data, and thus how it is able to steer itself in the right direction.

2. An optimizer: the mechanism through which the network updates itself based on the data it sees and its loss function, for example SGD or RMSprop.

3. Metrics to monitor during training and testing, for example accuracy.

Network compilation

network.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])

Before training, we will preprocess the data, modify it to the shape expected by the network, and scale it so that all values are in the interval [0,1].

Before preprocessing, our training images are stored in an array of shape (60000, 28, 28) and type uint8, with values in the [0, 255] interval. We transform it into a float32 array of shape (60000, 28 * 28) with values between 0 and 1.

Preparing the image data

train_images = train_images.reshape((60000, 28 * 28))

train_images = train_images.astype('float32') / 255

test_images = test_images.reshape((10000, 28 * 28))

test_images = test_images.astype('float32') / 255
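The reshape-and-scale step can be checked on a small stand-in array without downloading MNIST. The sketch below assumes nothing beyond NumPy; fake_images is a hypothetical array of random pixels, not real data.

```python
import numpy as np

# A stand-in for the MNIST array: 4 fake "images" of 28x28 uint8 pixels.
fake_images = np.random.randint(0, 256, size=(4, 28, 28), dtype=np.uint8)

# Flatten each 28x28 image into a vector of 784 values.
flat = fake_images.reshape((4, 28 * 28))

# Convert to float32 and rescale from [0, 255] to [0, 1].
scaled = flat.astype('float32') / 255

print(scaled.shape, scaled.dtype, scaled.min(), scaled.max())
```

Reshaping only changes how the values are indexed: the first 28 entries of each flattened vector are the first row of the original image.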

We also need to encode the labels.

from keras.utils import to_categorical

train_labels = to_categorical(train_labels)

test_labels = to_categorical(test_labels)

We are now ready to train our network, which in Keras is done via a call to the network's fit method: we "fit" the model to its training data.

Training network

>>> network.fit(train_images, train_labels, epochs=5, batch_size=128)

Epoch 1/5

60000/60000 [==============================] - 9s - loss: 0.2524 - acc: 0.9273

Epoch 2/5

51328/60000 [========================>.....] - ETA: 1s - loss: 0.1035 - acc: 0.9692

We quickly reached an accuracy of 0.989 (or 98.9%) on the training data.

Verify the network

test_loss, test_acc = network.evaluate(test_images, test_labels)

print('test_acc:', test_acc)

test_acc: 0.9785

The accuracy on the test set is 97.8%, about one percentage point lower than the 98.9% reached on the training set. This gap between training accuracy and test accuracy is an example of "overfitting": machine learning models tend to perform worse on new data than on the data they were trained on.

Tensors

Scalars (0D tensors)

A tensor containing only one number is called a "scalar" (or "scalar tensor", that is, a 0-dimensional tensor, or a 0D tensor). In Numpy, a float32 or float64 number is a scalar tensor (or scalar array). The number of axes of a Numpy tensor can be displayed through the ndim property; a scalar tensor has 0 axes (ndim == 0), and the number of axes of the tensor is also called rank.


>>> import numpy as np
>>> x = np.array(12)
>>> x
array(12)
>>> x.ndim
0


Vectors (1D tensors)

An array of numbers is called a vector, or 1D tensor. A 1D tensor has exactly one axis.

>>> x = np.array([12, 3, 6, 14])
>>> x
array([12, 3, 6, 14])
>>> x.ndim
1

This vector has 4 entries, so it is called a "4-dimensional vector". Don't confuse a 4D vector with a 4D tensor! A 4D vector has only one axis and has 4 dimensions along that axis, whereas a 4D tensor has 4 axes (and may have any number of dimensions along each axis).

Matrices (2D tensors)

An array of vectors is a matrix, or 2D tensor. A matrix has two axes (usually referred to as "rows" and "columns"). You can visually interpret a matrix as a rectangular grid of numbers.

>>> x = np.array([[5, 78, 2, 34, 0],
...               [6, 79, 3, 35, 1],
...               [7, 80, 4, 36, 2]])
>>> x.ndim
2

The first axis is called "row" and the second axis is called "column". In the above example, [5,78,2,34,0] is the first row and [5,6,7] is the first column.

3D tensors and higher-dimensional tensors

If you pack such matrices in a new array, you obtain a 3D tensor:

>>> x = np.array([[[5, 78, 2, 34, 0],
...                [6, 79, 3, 35, 1],
...                [7, 80, 4, 36, 2]],
...               [[5, 78, 2, 34, 0],
...                [6, 79, 3, 35, 1],
...                [7, 80, 4, 36, 2]],
...               [[5, 78, 2, 34, 0],
...                [6, 79, 3, 35, 1],
...                [7, 80, 4, 36, 2]]])
>>> x.ndim
3

By packing 3D tensors in an array, you can create a 4D tensor, and so on. In deep learning, you will generally manipulate tensors that are 0D to 4D, although you may go up to 5D if you process video data.

Key attributes of tensors

A tensor is defined by 3 key attributes:

1. Number of axes (rank). For example, a 3D tensor has 3 axes and a matrix has 2 axes. This is also called the tensor's ndim in Python libraries such as NumPy.

2. Shape. This is a tuple of integers that describes how many dimensions the tensor has along each axis. For example, the matrix example above has shape (3, 5), and our 3D tensor example has shape (3, 3, 5). A vector has a shape with a single element, such as (5,), whereas a scalar has an empty shape, ().

3. Data type (usually called dtype in Python libraries). This is the type of the data contained in the tensor, for example float32, uint8 or float64.

To make this more specific, let's review the data processed in our MNIST example:

from keras.datasets import mnist

(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

>>> print(train_images.ndim)

3

>>> print(train_images.shape)

(60000, 28, 28)

>>> print(train_images.dtype)

uint8

So what we have here is a 3D tensor of 8-bit integers. More precisely, it is an array of 60,000 matrices of 28 x 28 integers. Each such matrix is a grayscale image, with coefficients between 0 and 255.

Let's display the fifth digit in this 3D tensor (index 4, since indexing starts at 0), using the library Matplotlib (part of the standard scientific Python suite):

import matplotlib.pyplot as plt

digit = train_images[4]

plt.imshow(digit, cmap=plt.cm.binary)

plt.show()

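Manipulating tensors in NumPy

Selecting specific elements in a tensor is called "tensor slicing". The following example selects digits #10 to #100 (#100 is not included) and puts them in an array of shape (90, 28, 28):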

>>> my_slice = train_images[10:100]

>>> print(my_slice.shape)

(90, 28, 28)

Or as follows:

>>> my_slice = train_images[10:100, :, :] # equivalent to the above example

>>> my_slice.shape

(90, 28, 28)

>>> my_slice = train_images[10:100, 0:28, 0:28] # also equivalent to the above example

>>> my_slice.shape

(90, 28, 28)

In general, you may select between any two indices along each tensor axis. For example, to select the 14 x 14 pixels in the bottom-right corner of all images, you can do this:

my_slice = train_images[:, 14:, 14:]

You can also use negative indices. Much like negative indices in Python lists, they indicate a position relative to the end of the current axis. To crop the images to patches of 14 x 14 pixels centered in the middle, you can do this:

my_slice = train_images[:, 7:-7, 7:-7]

Data batches

Deep learning models do not process an entire dataset at once; instead, they break the data into small batches. Concretely, here is one batch of our MNIST digits, with a batch size of 128:

batch = train_images[:128]

# and here's the next batch

batch = train_images[128:256]

# and the n-th batch:

batch = train_images[128 * n:128 * (n + 1)]

When considering such a batch tensor, the first axis (axis 0) is called the "batch axis" or "batch dimension".
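The n-th batch formula generalizes to a small loop over the whole dataset. The sketch below is one possible way to write it; iter_batches is a hypothetical helper (it is not part of Keras or NumPy), and samples is a zero-filled stand-in for the image tensor.

```python
import numpy as np

def iter_batches(data, batch_size=128):
    """Yield consecutive slices of `data` along axis 0 (the batch axis)."""
    for start in range(0, len(data), batch_size):
        # The last slice may be shorter if len(data) is not a multiple of batch_size.
        yield data[start:start + batch_size]

# Stand-in for an image tensor: 300 blank 28x28 "images".
samples = np.zeros((300, 28, 28), dtype=np.uint8)

batch_shapes = [batch.shape for batch in iter_batches(samples)]
print(batch_shapes)
```

Only axis 0 changes from batch to batch; the remaining axes keep the shape of a single sample.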

Real-world examples of data tensors

The data you will manipulate will almost always fall into one of the following categories:

1. Vector data: 2D tensors of shape (samples, features).

2. Timeseries data or sequence data: 3D tensors of shape (samples, timesteps, features).

3. Images: 4D tensors of shape (samples, height, width, channels) or (samples, channels, height, width).

4. Video: 5D tensors of shape (samples, frames, height, width, channels) or (samples, frames, channels, height, width).

Vector data

In such a dataset, each single data point can be encoded as a vector, so a batch of data is encoded as a 2D tensor (that is, an array of vectors), where the first axis is the "samples axis" and the second axis is the "features axis". For example:

An actuarial dataset of people, in which we consider each person's age, identity code and income. Each person can be characterized as a vector of 3 values, so an entire dataset of 100,000 people can be stored in a 2D tensor of shape (100000, 3).

Timeseries data or sequence data

When time plays a role in your data (or the concept of sequence order), it makes sense to store it in a three-dimensional tensor with an explicit time axis. Each sample can be encoded as a sequence of vectors (a two-dimensional tensor), so a batch of data will be encoded as a three-dimensional tensor.

For example, a dataset of stock prices. Every minute, we store the current price of the stock, the highest price in the past minute and the lowest price in the past minute. Every minute is thus encoded as a 3D vector, an entire day of trading is encoded as a 2D tensor of shape (390, 3) (there are 390 minutes in a trading day), and 250 days' worth of data can be stored in a 3D tensor of shape (250, 390, 3). Here, each sample is one day's worth of data.

Image data

Images typically have 3 dimensions: height, width and color depth. Although grayscale images (like our MNIST digits) have only a single color channel and could thus be stored in 2D tensors, by convention image tensors are always 3D, with a one-dimensional color channel for grayscale images.

Video data

Video data is one of the few types of real-world data for which you need 5D tensors. A video can be understood as a sequence of frames, each frame being a color image. Because each frame can be stored in a 3D tensor (height, width, color_depth), a sequence of frames can be stored in a 4D tensor (frames, height, width, color_depth), and thus a batch of different videos can be stored in a 5D tensor of shape (samples, frames, height, width, color_depth).

For instance, a 60-second, 256 x 144 YouTube video clip sampled at 4 frames per second has 240 frames. A batch of 4 such video clips would be stored in a tensor of shape (4, 240, 144, 256, 3). That is a total of 106,168,320 values! If the dtype of the tensor is float32, each value is stored in 32 bits, so the tensor takes up about 425 MB. Videos encountered in real life are much lighter, because they are not stored in float32 and they are typically compressed by a large factor (for example in the MPEG format).
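The arithmetic behind these figures is easy to reproduce. The sketch below uses plain Python; the variable names are chosen here for illustration, and the megabyte figure uses the decimal convention (1 MB = 1,000,000 bytes).

```python
# A batch of 4 clips, each 60 seconds long, sampled at 4 frames per second.
clips, seconds, fps = 4, 60, 4
frames = seconds * fps

# Each frame is a 256 x 144 color image with 3 channels.
height, width, channels = 144, 256, 3

# Total number of values in the 5D tensor.
num_values = clips * frames * height * width * channels

# float32 uses 4 bytes (32 bits) per value.
bytes_float32 = num_values * 4
megabytes = bytes_float32 / 1e6

print(frames, num_values, round(megabytes))
```

Storing the same tensor as uint8 (1 byte per value) would already divide the size by four, before any compression.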

Tensor operations

A naive implementation of the element-wise relu operation:

def naive_relu(x):
    # x is a 2D NumPy tensor
    assert len(x.shape) == 2  # assertion: fail early if x is not 2D
    x = x.copy()  # avoid overwriting the input tensor
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            x[i, j] = max(x[i, j], 0)
    return x

The Python assert statement behaves as follows: it does nothing when the condition is true and raises an AssertionError when it is false.

>>> assert 1==1

>>> assert 1==0

Traceback (most recent call last):

File "<pyshell#1>", line 1, in <module>

assert 1==0

AssertionError

>>> assert True

>>> assert False

Traceback (most recent call last):

File "<pyshell#3>", line 1, in <module>

assert False

AssertionError

>>> assert 3<2

Traceback (most recent call last):

File "<pyshell#4>", line 1, in <module>

assert 3<2

AssertionError

A naive implementation of adding a vector to every row of a matrix (broadcasting):

def naive_add_matrix_and_vector(x, y):
    # x is a 2D NumPy tensor
    # y is a NumPy vector
    assert len(x.shape) == 2
    assert len(y.shape) == 1
    assert x.shape[1] == y.shape[0]
    x = x.copy()  # avoid overwriting the input tensor
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            x[i, j] += y[j]
    return x

The element-wise maximum of two tensors with different shapes, computed via broadcasting:

import numpy as np

# x is a random tensor with shape (64, 3, 32, 10)

x = np.random.random((64, 3, 32, 10))

# y is a random tensor with shape (32, 10)

y = np.random.random((32, 10))

# The output z has shape (64, 3, 32, 10) like x

z = np.maximum(x, y)
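To see what broadcasting does here, the smaller tensor can be expanded explicitly before taking the maximum. This is a sketch for illustration only: np.broadcast_to makes the implicit expansion visible, and NumPy does not need this step in practice.

```python
import numpy as np

x = np.random.random((64, 3, 32, 10))
y = np.random.random((32, 10))

# Broadcasting conceptually adds leading axes to y and repeats it along them.
# np.broadcast_to returns a read-only view with the target shape (no data is copied).
y_expanded = np.broadcast_to(y, x.shape)

z_explicit = np.maximum(x, y_expanded)
z_broadcast = np.maximum(x, y)

print(y_expanded.shape, np.array_equal(z_explicit, z_broadcast))
```

Both results are identical, which confirms that y is matched against the last two axes of x.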

Tensor multiplication

import numpy as np

z = np.dot(x, y)

A naive implementation of the dot product between a matrix and a vector:

import numpy as np

def naive_matrix_vector_dot(x, y):
    # x is a NumPy matrix
    # y is a NumPy vector
    assert len(x.shape) == 2
    assert len(y.shape) == 1
    # The 1st dimension of x must be
    # the same as the 0th dimension of y!
    assert x.shape[1] == y.shape[0]
    # z starts as a vector of 0s
    # with one entry per row of x
    z = np.zeros(x.shape[0])
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            z[i] += x[i, j] * y[j]
    return z
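The same idea extends to the dot product of two matrices: entry (i, j) of the result is the dot product of row i of the first matrix with column j of the second. The function below is a sketch written for these notes (naive_matrix_dot is a hypothetical name), and it is compared against np.dot on small made-up inputs.

```python
import numpy as np

def naive_matrix_dot(x, y):
    # x and y are 2D NumPy tensors
    assert x.ndim == 2 and y.ndim == 2
    # The number of columns of x must match the number of rows of y.
    assert x.shape[1] == y.shape[0]
    z = np.zeros((x.shape[0], y.shape[1]))
    for i in range(x.shape[0]):        # iterate over the rows of x
        for j in range(y.shape[1]):    # iterate over the columns of y
            z[i, j] = np.dot(x[i, :], y[:, j])
    return z

a = np.arange(6, dtype='float64').reshape((2, 3))
b = np.arange(12, dtype='float64').reshape((3, 4))
result = naive_matrix_dot(a, b)
print(result.shape, np.allclose(result, np.dot(a, b)))
```

A matrix of shape (2, 3) multiplied by a matrix of shape (3, 4) gives a result of shape (2, 4): the inner dimensions must match and disappear.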


Tensor reshaping

>>> x = np.array([[0., 1.],
...               [2., 3.],
...               [4., 5.]])
>>> print(x.shape)
(3, 2)
>>> x = x.reshape((6, 1))
>>> x
array([[ 0.],
       [ 1.],
       [ 2.],
       [ 3.],
       [ 4.],
       [ 5.]])
>>> x = x.reshape((2, 3))
>>> x
array([[ 0., 1., 2.],
       [ 3., 4., 5.]])

Analysis of neural networks

As we saw earlier, training a neural network revolves around the following objects:

1. Layers, which are combined into a network (or model).

2. The input data and the corresponding targets.

3. The loss function, which defines the feedback signal used for learning.

4. The optimizer, which determines how learning proceeds.

Layers: the cornerstone of deep learning

The fundamental data structure in neural networks is the "layer", which was introduced earlier. A layer is a data-processing module that takes one or more tensors as input and outputs one or more tensors. Some layers are stateless, but more frequently layers have a state: the layer's "weights", one or several tensors learned with stochastic gradient descent.

Different layers are appropriate for different tensor formats and different types of data processing. For instance, simple vector data, stored in 2D tensors of shape (samples, features), is usually processed by "fully connected" layers (the Dense class in Keras). Sequence data, stored in 3D tensors of shape (samples, timesteps, features), is typically processed by "recurrent" layers (such as an LSTM layer). Image data, stored in 4D tensors, is usually processed by 2D convolutional layers (Conv2D).

from keras import layers

# A dense layer with 32 output units
layer = layers.Dense(32, input_shape=(784,))

We are creating a layer that only accepts as input 2D tensors whose first dimension is 784 (axis 0, the batch dimension, is unspecified, so any value is accepted). This layer returns a tensor whose first dimension has been transformed to 32.

>>> layer.output_shape
(None, 32)

Therefore, this layer can only be connected to a downstream layer that expects 32-dimensional vectors as its input. When using Keras, you don't have to worry about compatibility, because the layers you add to your models are dynamically built to match the shape of the incoming layer. In the following model, the second layer receives no input_shape argument; it infers its input shape from the output shape of the layer before it:

from keras import models

from keras import layers

model = models.Sequential()

model.add(layers.Dense(32, input_shape=(784,)))

model.add(layers.Dense(32))
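The shape bookkeeping of these two layers can be mimicked in plain NumPy. This is only a sketch of the underlying arithmetic, assuming a Dense layer without activation computes dot(inputs, kernel) + bias; dense_forward and the random weights are hypothetical and are not taken from Keras.

```python
import numpy as np

def dense_forward(inputs, kernel, bias):
    # Affine transform performed by a Dense layer with no activation.
    return np.dot(inputs, kernel) + bias

batch = np.random.random((5, 784))      # batch size 5 is arbitrary

# First layer: maps 784 input features to 32 output features.
kernel1 = np.random.random((784, 32)) * 0.01
bias1 = np.zeros(32)

# Second layer: its kernel must have 32 rows to accept the first layer's output.
kernel2 = np.random.random((32, 32)) * 0.01
bias2 = np.zeros(32)

hidden = dense_forward(batch, kernel1, bias1)
output = dense_forward(hidden, kernel2, bias2)
print(hidden.shape, output.shape)
```

The batch axis (5 here) passes through unchanged, which is why Keras reports it as None in output_shape.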

A two-layer model defined with the Sequential class:

from keras import models

from keras import layers

model = models.Sequential()

model.add(layers.Dense(32, activation='relu', input_shape=(784,)))

model.add(layers.Dense(10, activation='softmax'))

The same model defined with the functional API:

input_tensor = layers.Input(shape=(784,))

x = layers.Dense(32, activation='relu')(input_tensor)

output_tensor = layers.Dense(10, activation='softmax')(x)

model = models.Model(inputs=input_tensor, outputs=output_tensor)

Once the model architecture is defined, it doesn't matter whether you used a Sequential model or the functional API: all of the following steps are the same.

from keras import optimizers

model.compile(optimizer=optimizers.RMSprop(lr=0.001),

loss='mse',

metrics=['accuracy'])

model.fit(input_tensor, target_tensor, batch_size=128, epochs=10)

Classifying movie reviews: an example of binary classification

Load the IMDB dataset

from keras.datasets import imdb

(train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=10000)

The argument num_words=10000 means that we only keep the 10,000 most frequently occurring words in the training data. Rare words are discarded.

The variables train_data and test_data are lists of reviews, and each review is a list of word indices (encoding a sequence of words). train_labels and test_labels are lists of 0s and 1s, where 0 stands for "negative" and 1 stands for "positive".

>>> train_data[0]
[1, 14, 22, 16, ... 178, 32]
>>> train_labels[0]
1
>>> max([max(sequence) for sequence in train_data])
9999

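We cannot feed lists of integers into a neural network; we have to turn the lists into tensors. Here each review is encoded as a 10,000-dimensional vector of 0s and 1s, with a 1 at every index that corresponds to a word present in the review: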

import numpy as np

def vectorize_sequences(sequences, dimension=10000):
    # Create an all-zero matrix of shape (len(sequences), dimension)
    results = np.zeros((len(sequences), dimension))
    for i, sequence in enumerate(sequences):
        results[i, sequence] = 1.  # set specific indices of results[i] to 1s
    return results

# Our vectorized training data

x_train = vectorize_sequences(train_data)

# Our vectorized test data

x_test = vectorize_sequences(test_data)

A sample after encoding:

>>> x_train[0]
array([ 0., 1., 1., ..., 0., 0., 0.])

Encoding the labels

# Our vectorized labels

y_train = np.asarray(train_labels).astype('float32')

y_test = np.asarray(test_labels).astype('float32')

Compile model

model.compile(optimizer='rmsprop',

loss='binary_crossentropy',

metrics=['accuracy'])

Or add the parameters of the optimizer:

from keras import optimizers

model.compile(optimizer=optimizers.RMSprop(lr=0.001),

loss='binary_crossentropy',

metrics=['accuracy'])

Use custom losses and metrics:

from keras import losses

from keras import metrics

model.compile(optimizer=optimizers.RMSprop(lr=0.001),

loss=losses.binary_crossentropy,

metrics=[metrics.binary_accuracy])

Validation model:

In order to monitor, during training, the accuracy of the model on data it has never seen before, we create a "validation set" by setting apart 10,000 samples from the original training data.

x_val = x_train[:10000]

partial_x_train = x_train[10000:]

y_val = y_train[:10000]

partial_y_train = y_train[10000:]

Training model:

history = model.fit(partial_x_train,
                    partial_y_train,
                    epochs=20,
                    batch_size=512,
                    validation_data=(x_val, y_val))

Note that the call to model.fit() returns a History object. This object has a member history, which is a dictionary containing data about everything that happened during training.

>>> history_dict = history.history

>>> history_dict.keys()

[u'acc', u'loss', u'val_acc', u'val_loss']

The dictionary contains 4 entries, one per metric monitored during training and validation, which can be used to plot the curves:

import matplotlib.pyplot as plt

acc = history.history['acc']

val_acc = history.history['val_acc']

loss = history.history['loss']

val_loss = history.history['val_loss']

epochs = range(1, len(acc) + 1)

# "bo" is for "blue dot"

plt.plot(epochs, loss, 'bo', label='Training loss')

# b is for "solid blue line"

plt.plot(epochs, val_loss, 'b', label='Validation loss')

plt.title('Training and validation loss')

plt.xlabel('Epochs')

plt.ylabel('Loss')

plt.legend()

plt.show()

plt.clf() # clear figure

acc_values = history_dict['acc']

val_acc_values = history_dict['val_acc']

plt.plot(epochs, acc, 'bo', label='Training acc')

plt.plot(epochs, val_acc, 'b', label='Validation acc')

plt.title('Training and validation accuracy')

plt.xlabel('Epochs')

plt.ylabel('Accuracy')

plt.legend()

plt.show()
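Rather than reading the best epoch off the plots by eye, it can also be picked programmatically as the epoch with the lowest validation loss. The sketch below uses made-up numbers; the val_loss list is hypothetical and does not come from the training run above. With a real run you would use history.history['val_loss'] instead.

```python
# Hypothetical validation losses, one per epoch (NOT results from the run above).
val_loss = [0.38, 0.30, 0.29, 0.28, 0.29, 0.31, 0.33]

# Index of the smallest validation loss.
best_index = min(range(len(val_loss)), key=val_loss.__getitem__)

# Epochs are counted from 1 in the plots, list indices from 0.
best_epoch = best_index + 1
print(best_epoch)
```

Training past this point keeps lowering the training loss, but the validation loss starts rising again, which is the usual sign of overfitting.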

It can be seen that validation performance peaks at around the fourth epoch. Let's train a new network from scratch for 4 epochs and then evaluate it on the test data:

model = models.Sequential()

model.add(layers.Dense(16, activation='relu', input_shape=(10000,)))

model.add(layers.Dense(16, activation='relu'))

model.add(layers.Dense(1, activation='sigmoid'))

model.compile(optimizer='rmsprop',

loss='binary_crossentropy',

metrics=['accuracy'])

model.fit(x_train, y_train, epochs=4, batch_size=512)

results = model.evaluate(x_test, y_test)

>>> results

[0.2929924130630493, 0.88327999999999995]

>>> model.predict(x_test)

[[ 0.98006207]

[ 0.99758697]

[ 0.99975556]

...,

[ 0.82167041]

[ 0.02885115]

[ 0.65371346]]

Classifying newswires: a multi-class classification example

Load the Reuters dataset

from keras.datasets import reuters

(train_data, train_labels), (test_data, test_labels) = reuters.load_data(num_words=10000)

As with the IMDB dataset, the parameter num_words = 10000 limits the data to the 10,000 most common words found in the data.

>>> len(train_data)

8982

>>> len(test_data)

2246

The data and labels are as follows:

>>> train_data[10]

[1, 245, 273, 207, 156, 53, 74, 160, 26, 14, 46, 296, 26, 39, 74, 2979,

3554, 14, 46, 4689, 4329, 86, 61, 3499, 4795, 14, 61, 451, 4329, 17, 12]

>>> train_labels[10]

Data preprocessing

import numpy as np

def vectorize_sequences(sequences, dimension=10000):
    results = np.zeros((len(sequences), dimension))
    for i, sequence in enumerate(sequences):
        results[i, sequence] = 1.
    return results

# Our vectorized training data

x_train = vectorize_sequences(train_data)

# Our vectorized test data

x_test = vectorize_sequences(test_data)

One-hot encoding the labels

def to_one_hot(labels, dimension=46):
    results = np.zeros((len(labels), dimension))
    for i, label in enumerate(labels):
        results[i, label] = 1.
    return results

# Our vectorized training labels

one_hot_train_labels = to_one_hot(train_labels)

# Our vectorized test labels

one_hot_test_labels = to_one_hot(test_labels)

One-hot encoding the labels, the Keras way

from keras.utils import to_categorical

one_hot_train_labels = to_categorical(train_labels)

one_hot_test_labels = to_categorical(test_labels)

Define the model:

from keras import models

from keras import layers

model = models.Sequential()

model.add(layers.Dense(64, activation='relu', input_shape=(10000,)))

model.add(layers.Dense(64, activation='relu'))

model.add(layers.Dense(46, activation='softmax'))

We end the network with a Dense layer of size 46. This means that for each input sample, our network will output a 46-dimensional vector. Each entry (each dimension) in this vector encodes a different output class.

The last layer is activated using softmax. You have seen this pattern in the MNIST example. This means that the network will output a probability distribution of 46 different output classes, that is, for each input sample, the network will generate a 46-dimensional output vector. The output [i] is the probability that the sample belongs to class i, and the sum of the 46 values is 1.

Compile model

model.compile(optimizer='rmsprop',

loss='categorical_crossentropy',

metrics=['accuracy'])

Validation model

x_val = x_train[:1000]

partial_x_train = x_train[1000:]

y_val = one_hot_train_labels[:1000]

partial_y_train = one_hot_train_labels[1000:]

history = model.fit(partial_x_train,

partial_y_train,

epochs=20,

batch_size=512,

validation_data=(x_val, y_val))

Visualization of the training and validation results:

import matplotlib.pyplot as plt

loss = history.history['loss']

val_loss = history.history['val_loss']

epochs = range(1, len(loss) + 1)

plt.plot(epochs, loss, 'bo', label='Training loss')

plt.plot(epochs, val_loss, 'b', label='Validation loss')

plt.title('Training and validation loss')

plt.xlabel('Epochs')

plt.ylabel('Loss')

plt.legend()

plt.show()

plt.clf() # clear figure

acc = history.history['acc']

val_acc = history.history['val_acc']

plt.plot(epochs, acc, 'bo', label='Training acc')

plt.plot(epochs, val_acc, 'b', label='Validation acc')

plt.title('Training and validation accuracy')

plt.xlabel('Epochs')

plt.ylabel('Accuracy')

plt.legend()

plt.show()

Make predictions on new data

predictions = model.predict(x_test)

>>> predictions[0].shape
(46,)
>>> np.sum(predictions[0])
1.0
>>> np.argmax(predictions[0])
4

Different ways to deal with labels and losses

Another way to encode the labels is to cast them as an integer tensor, like this:

y_train = np.array(train_labels)

y_test = np.array(test_labels)

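The only thing this changes is the choice of the loss function. Our previous loss, categorical_crossentropy, expects the labels to follow a categorical (one-hot) encoding. With integer labels, we should use sparse_categorical_crossentropy instead. The two losses are mathematically the same; only the label format differs.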

model.compile(optimizer='rmsprop', loss='sparse_categorical_crossentropy', metrics=['acc'])
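The equivalence of the two losses can be checked with a small NumPy sketch. The two functions below are simplified re-implementations written for these notes; they are not the Keras functions (Keras additionally clips the probabilities, for example), and the probabilities and labels are made up.

```python
import numpy as np

def categorical_crossentropy(one_hot_targets, probs):
    # Mean over samples of -sum_k target_k * log(prob_k).
    return float(np.mean(-np.sum(one_hot_targets * np.log(probs), axis=1)))

def sparse_categorical_crossentropy(int_targets, probs):
    # Pick the predicted probability of the true class for each sample.
    picked = probs[np.arange(len(int_targets)), int_targets]
    return float(np.mean(-np.log(picked)))

# Made-up softmax outputs for 3 samples and 3 classes (each row sums to 1).
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.3, 0.3, 0.4]])

int_labels = np.array([0, 1, 2])
one_hot_labels = np.eye(3)[int_labels]   # one-hot version of the same labels

loss_a = categorical_crossentropy(one_hot_labels, probs)
loss_b = sparse_categorical_crossentropy(int_labels, probs)
print(loss_a, loss_b)
```

Because a one-hot target zeroes out every term except the true class, both functions reduce to the same sum.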

Predicting house prices: an example of regression

In the previous two examples, we considered the classification problem, and the goal was to predict a single discrete label of the input data points. Another common machine learning problem is "regression," which involves predicting a continuous value instead of a discrete label. For example, based on meteorological data, predicting tomorrow's temperature, or predicting how long a software project will need to complete.

Load the dataset: the Boston Housing Price dataset

from keras.datasets import boston_housing

(train_data, train_targets), (test_data, test_targets) = boston_housing.load_data()

>>> train_data.shape

(404, 13)

>>> test_data.shape

(102, 13)

As you can see, we have 404 training samples and 102 test samples. The data consists of 13 features.

The training targets are as follows:

>>> train_targets
[ 15.2, 42.3, 50. ... 19.4, 19.4, 29.1]

Data preprocessing

It would be problematic to feed into a neural network values that all take wildly different ranges. The network might be able to adapt to such heterogeneous data automatically, but it would definitely make learning more difficult. A widespread best practice for dealing with such data is feature-wise normalization: for each feature in the input data (a column in the input data matrix), we subtract the mean of the feature and divide by its standard deviation, so that the feature is centered around 0 and has unit standard deviation. Note that the test data is normalized with the mean and standard deviation computed on the training data.

mean = train_data.mean(axis=0)

train_data -= mean

std = train_data.std(axis=0)

train_data /= std

test_data -= mean

test_data /= std
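The effect of this normalization can be verified on a toy example. The arrays below are made-up stand-ins with two features on very different scales; they are not the Boston housing data.

```python
import numpy as np

# Two features with very different ranges (illustrative values only).
train = np.array([[1.0, 100.0],
                  [2.0, 200.0],
                  [3.0, 300.0],
                  [4.0, 400.0]])
test = np.array([[2.5, 250.0]])

# Statistics are computed on the training data only...
mean = train.mean(axis=0)
std = train.std(axis=0)

# ...and then applied to both the training and the test data.
train_norm = (train - mean) / std
test_norm = (test - mean) / std

print(train_norm.mean(axis=0), train_norm.std(axis=0), test_norm)
```

After normalization every training feature has mean 0 and standard deviation 1. The test sample, which happens to sit exactly at the training mean, maps to 0 for both features.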

Building the model

from keras import models

from keras import layers

def build_model():
    # Because we will need to instantiate
    # the same model multiple times,
    # we use a function to construct it.
    model = models.Sequential()
    model.add(layers.Dense(64, activation='relu',
                           input_shape=(train_data.shape[1],)))
    model.add(layers.Dense(64, activation='relu'))
    model.add(layers.Dense(1))
    model.compile(optimizer='rmsprop', loss='mse', metrics=['mae'])
    return model

Our network ends with a single unit and no activation function (it is a linear layer). This is a typical setup for scalar regression (that is, a regression in which we try to predict a single continuous value). Applying an activation function would constrain the range that the output can take. For instance, if we applied a sigmoid activation function to the last layer, the network could only learn to predict values between 0 and 1. Here, because the last layer is purely linear, the network is free to learn to predict values in any range.

Using K-fold validation

To evaluate our network while we keep adjusting its parameters (such as the number of epochs used for training), we could simply split the data into a training set and a validation set, as we did in the previous examples. However, because we have so few data points, the validation set would end up being very small (for instance, about 100 examples). As a consequence, the validation scores could change a lot depending on which data points we chose for validation and which we chose for training: the validation scores would have a high variance with regard to the validation split. This would prevent us from reliably evaluating our model.

The best practice in such situations is to use K-fold cross-validation. It consists of splitting the available data into K partitions (typically K = 4 or 5), instantiating K identical models, and training each one on K - 1 partitions while evaluating it on the remaining partition. The validation score for the model is then the average of the K validation scores obtained.

import numpy as np

k = 4

num_val_samples = len(train_data) // k

num_epochs = 100

all_scores = []

for i in range(k):
    print('processing fold #', i)

    # Prepare the validation data: data from partition #i
    val_data = train_data[i * num_val_samples: (i + 1) * num_val_samples]
    val_targets = train_targets[i * num_val_samples: (i + 1) * num_val_samples]

    # Prepare the training data: data from all other partitions
    partial_train_data = np.concatenate(
        [train_data[:i * num_val_samples],
         train_data[(i + 1) * num_val_samples:]],
        axis=0)
    partial_train_targets = np.concatenate(
        [train_targets[:i * num_val_samples],
         train_targets[(i + 1) * num_val_samples:]],
        axis=0)

    # Build the Keras model (already compiled)
    model = build_model()

    # Train the model (in silent mode, verbose=0)
    model.fit(partial_train_data, partial_train_targets,
              epochs=num_epochs, batch_size=1, verbose=0)

    # Evaluate the model on the validation data
    val_mse, val_mae = model.evaluate(val_data, val_targets, verbose=0)
    all_scores.append(val_mae)

Running the above snippet with num_epochs = 100, you can get the following results:

>>> all_scores

[2.588258957792037, 3.1289568449719116, 3.1856116051248984, 3.0763342615401386]

>>> np.mean(all_scores)

2.9947904173572462

Let's try training the network a bit longer: 500 epochs. To keep a record of how well the model does at each epoch, we modify the training loop to save the per-epoch validation score log of each fold.

num_epochs = 500

all_mae_histories = []

for i in range(k):
    print('processing fold #', i)

    # Prepare the validation data: data from partition #i
    val_data = train_data[i * num_val_samples: (i + 1) * num_val_samples]
    val_targets = train_targets[i * num_val_samples: (i + 1) * num_val_samples]

    # Prepare the training data: data from all other partitions
    partial_train_data = np.concatenate(
        [train_data[:i * num_val_samples],
         train_data[(i + 1) * num_val_samples:]],
        axis=0)
    partial_train_targets = np.concatenate(
        [train_targets[:i * num_val_samples],
         train_targets[(i + 1) * num_val_samples:]],
        axis=0)

    # Build the Keras model (already compiled)
    model = build_model()

    # Train the model (in silent mode, verbose=0)
    history = model.fit(partial_train_data, partial_train_targets,
                        validation_data=(val_data, val_targets),
                        epochs=num_epochs, batch_size=1, verbose=0)

    # The key name depends on the Keras version (newer releases use 'val_mae')
    mae_history = history.history['val_mean_absolute_error']
    all_mae_histories.append(mae_history)


You can then compute, for each epoch, the average of the validation MAE scores across all folds:

average_mae_history = [

np.mean([x[i] for x in all_mae_histories]) for i in range(num_epochs)]

import matplotlib.pyplot as plt

plt.plot(range(1, len(average_mae_history) + 1), average_mae_history)

plt.xlabel('Epochs')

plt.ylabel('Validation MAE')

plt.show()

To make the plot easier to read, omit the first 10 data points, which are on a different scale from the rest of the curve.

Replace each point with an exponential moving average of the previous points, to obtain a smooth curve.

def smooth_curve(points, factor=0.9):
    smoothed_points = []
    for point in points:
        if smoothed_points:
            previous = smoothed_points[-1]
            smoothed_points.append(previous * factor + point * (1 - factor))
        else:
            smoothed_points.append(point)
    return smoothed_points

smooth_mae_history = smooth_curve(average_mae_history[10:])

plt.plot(range(1, len(smooth_mae_history) + 1), smooth_mae_history)

plt.xlabel('Epochs')

plt.ylabel('Validation MAE')

plt.show()

According to this plot, the validation MAE stops improving significantly after 80 epochs. Past that point, we start overfitting.

Once we have finished tuning the other parameters of the model (besides the number of epochs, we could also adjust the size of the hidden layers), we can train a final "production" model on all of the training data, with the best parameters, and then look at its performance on the test data.

# Get a fresh, compiled model.

model = build_model()

# Train it on the entirety of the data.

model.fit(train_data, train_targets,

epochs=80, batch_size=16, verbose=0)

test_mse_score, test_mae_score = model.evaluate(test_data, test_targets)

>>> test_mae_score

2.5532484335057877

Mohammad Mostofa Zaman
