A slightly more sophisticated approach is to use a network pre-trained on a large dataset. Such a network has already learned features that are useful for most computer vision problems, and building on these features lets us reach a higher accuracy.
We will use the VGG-16 network, which was trained on the ImageNet dataset and which we have mentioned before. Because ImageNet contains several "cat" and "dog" classes, this model has already learned features relevant to our dataset. In fact, simply recording the original network's final output, rather than the bottleneck features, would already solve our problem reasonably well. However, the method presented here generalizes better to similar problems, including classifying categories that do not appear in ImageNet.
The architecture of VGG-16 is as follows:
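In case the figure does not render, the convolutional part of VGG-16 can be summarized as five blocks of 3×3 convolutions, each followed by 2×2 max pooling. A short sketch of that layout (the block and filter counts are standard VGG-16 facts, not taken from the figure):

```python
# VGG-16 convolutional blocks: (number of 3x3 conv layers, filters per layer).
# Each block ends with a 2x2 max-pooling layer.
vgg16_blocks = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]

conv_layers = sum(n for n, _ in vgg16_blocks)   # 13 convolutional layers
fc_layers = 3                                   # the fully-connected head we will discard
print(conv_layers + fc_layers)                  # 16 weight layers -> the "16" in VGG-16
```

Only the 13 convolutional layers matter for our purposes; the 3 fully-connected layers are exactly the part we throw away.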
Our strategy is this: we take only the convolutional part of the network, discarding everything from the fully-connected layers upward. We then run this truncated network once over our training and validation sets and record the output (the "bottleneck features", i.e., the activation maps of the last layer before the fully-connected part) in two numpy arrays. Finally, we train a small fully-connected network on the recorded features.
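To make the shapes concrete: with 150×150 inputs, VGG-16's five pooling stages each halve the spatial size, leaving 4×4 feature maps with 512 channels. This arithmetic follows from standard VGG-16 structure and is not stated explicitly in the post:

```python
# Each of VGG-16's five max-pooling stages halves the spatial size (floor division);
# the zero-padded 3x3 convolutions leave the size unchanged.
size = 150
for _ in range(5):
    size //= 2          # 150 -> 75 -> 37 -> 18 -> 9 -> 4
print(size)             # 4

# So 2000 training images yield one numpy array of bottleneck features:
bottleneck_shape = (2000, 512, size, size)
```

This is why the recorded arrays are tiny compared to the raw images: 512×4×4 = 8192 floats per image.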
The reason we store these features offline, rather than attaching our fully-connected model directly on top of the network and freezing the earlier layers during training, is computational efficiency. Running VGG is expensive, especially on a CPU, so we only want to run it once. This is also why we do not use data augmentation here.
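Storing features offline is just a NumPy round trip: compute once, save to disk, and reload cheaply for every later experiment. A minimal sketch of that round trip (the array here is dummy data standing in for real bottleneck features):

```python
import os
import tempfile
import numpy as np

# stand-in for bottleneck features: 8 images, 512 channels, 4x4 maps
features = np.random.rand(8, 512, 4, 4).astype("float32")
path = os.path.join(tempfile.mkdtemp(), "bottleneck_features_train.npy")

np.save(path, features)      # the one-time, expensive-to-compute pass
restored = np.load(path)     # cheap to reload for every later experiment

assert (restored == features).all()
```

Any change to the top model then only requires reloading the .npy file, never re-running VGG.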
We will not repeat how to build the VGG-16 network; this has been covered before, and an implementation can also be found among the Keras examples. Let us instead look at how to record the bottleneck features.
generator = datagen.flow_from_directory(
    "data/train",
    target_size=(150, 150),
    batch_size=32,
    class_mode=None,  # this means our generator will only yield batches of data, no labels
    shuffle=False)  # our data will be in order, so all first 1000 images will be cats, then 1000 dogs
# the predict_generator method returns the output of a model, given
# a generator that yields batches of numpy data
bottleneck_features_train = model.predict_generator(generator, 2000)
# save the output as a Numpy array
np.save(open("bottleneck_features_train.npy", "wb"), bottleneck_features_train)

generator = datagen.flow_from_directory(
    "data/validation",
    target_size=(150, 150),
    batch_size=32,
    class_mode=None,
    shuffle=False)
bottleneck_features_validation = model.predict_generator(generator, 800)
np.save(open("bottleneck_features_validation.npy", "wb"), bottleneck_features_validation)
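Because shuffle=False makes flow_from_directory emit images class by class (subdirectories are visited in alphabetical order), the labels can be rebuilt afterwards without ever saving them. A small simulation of that ordering (the directory names match this dataset; the rest is illustrative):

```python
# flow_from_directory sorts class subdirectories alphabetically:
# "cats" < "dogs", so all cat images come first, then all dog images.
classes = sorted(["dogs", "cats"])        # -> ["cats", "dogs"]
per_class = 1000
labels = []
for index, name in enumerate(classes):
    labels.extend([index] * per_class)

assert labels[:1000] == [0] * 1000        # cats
assert labels[1000:] == [1] * 1000        # dogs
```

This is exactly why the training labels below can be written as a literal `[0] * 1000 + [1] * 1000`.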
Once the features are recorded, we can load them and train our fully-connected network:
train_data = np.load(open("bottleneck_features_train.npy", "rb"))
# the features were saved in order, so recreating the labels is easy
train_labels = np.array([0] * 1000 + [1] * 1000)

validation_data = np.load(open("bottleneck_features_validation.npy", "rb"))
validation_labels = np.array([0] * 400 + [1] * 400)

model = Sequential()
model.add(Flatten(input_shape=train_data.shape[1:]))
model.add(Dense(256, activation="relu"))
model.add(Dropout(0.5))
model.add(Dense(1, activation="sigmoid"))

model.compile(optimizer="rmsprop",
              loss="binary_crossentropy",
              metrics=["accuracy"])

model.fit(train_data, train_labels,
          nb_epoch=50, batch_size=32,
          validation_data=(validation_data, validation_labels))
model.save_weights("bottleneck_fc_model.h5")
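The classifier sitting on top is tiny: flatten the 512×4×4 map, one hidden ReLU layer, one sigmoid output. A pure-numpy sketch of its forward pass (random weights, inference only; dropout is omitted because it is inactive at test time):

```python
import numpy as np

rng = np.random.RandomState(0)
x = rng.rand(4, 512, 4, 4)                  # a batch of 4 bottleneck feature maps

flat = x.reshape(x.shape[0], -1)            # Flatten -> (4, 8192)
W1, b1 = rng.randn(flat.shape[1], 256) * 0.01, np.zeros(256)
h = np.maximum(0, flat @ W1 + b1)           # Dense(256, activation="relu")
W2, b2 = rng.randn(256, 1) * 0.01, np.zeros(1)
p = 1 / (1 + np.exp(-(h @ W2 + b2)))        # Dense(1, activation="sigmoid")

assert flat.shape == (4, 8192)
assert ((p > 0) & (p < 1)).all()            # per-image probability of the "dog" class
```

With only an 8192-dimensional input and a 256-unit hidden layer, each training epoch touches just a couple of million weights, which is why it runs in about a second on a CPU.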
Because the recorded features are small, the model also trains very quickly on a CPU, at roughly 1 s per epoch. The final accuracy is 90%–91%; such a good result is largely due to the pre-trained VGG network extracting the features for us.
Here is the complete script:
import os
import h5py
import numpy as np
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Convolution2D, MaxPooling2D, ZeroPadding2D
from keras.layers import Activation, Dropout, Flatten, Dense

import sys
defaultencoding = "utf-8"
if sys.getdefaultencoding() != defaultencoding:
    reload(sys)
    sys.setdefaultencoding(defaultencoding)

# path to the model weights file.
weights_path = "../weights/vgg16_weights.h5"
top_model_weights_path = "bottleneck_fc_model.h5"
# dimensions of our images.
img_width, img_height = 150, 150

train_data_dir = "../data/train"
validation_data_dir = "../data/validation"
nb_train_samples = 2000
nb_validation_samples = 800
nb_epoch = 50


def save_bottlebeck_features():
    datagen = ImageDataGenerator(rescale=1. / 255)

    # build the VGG16 network
    model = Sequential()
    model.add(ZeroPadding2D((1, 1), input_shape=(3, img_width, img_height)))

    model.add(Convolution2D(64, 3, 3, activation="relu", name="conv1_1"))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Convolution2D(64, 3, 3, activation="relu", name="conv1_2"))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))

    model.add(ZeroPadding2D((1, 1)))
    model.add(Convolution2D(128, 3, 3, activation="relu", name="conv2_1"))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Convolution2D(128, 3, 3, activation="relu", name="conv2_2"))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))

    model.add(ZeroPadding2D((1, 1)))
    model.add(Convolution2D(256, 3, 3, activation="relu", name="conv3_1"))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Convolution2D(256, 3, 3, activation="relu", name="conv3_2"))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Convolution2D(256, 3, 3, activation="relu", name="conv3_3"))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))

    model.add(ZeroPadding2D((1, 1)))
    model.add(Convolution2D(512, 3, 3, activation="relu", name="conv4_1"))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Convolution2D(512, 3, 3, activation="relu", name="conv4_2"))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Convolution2D(512, 3, 3, activation="relu", name="conv4_3"))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))

    model.add(ZeroPadding2D((1, 1)))
    model.add(Convolution2D(512, 3, 3, activation="relu", name="conv5_1"))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Convolution2D(512, 3, 3, activation="relu", name="conv5_2"))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Convolution2D(512, 3, 3, activation="relu", name="conv5_3"))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))

    # load the weights of the VGG16 networks
    # (trained on ImageNet, won the ILSVRC competition in 2014)
    # note: when there is a complete match between your model definition
    # and your weight savefile, you can simply call model.load_weights(filename)
    assert os.path.exists(weights_path), "Model weights not found (see 'weights_path' variable in script)."
    f = h5py.File(weights_path)
    for k in range(f.attrs["nb_layers"]):
        if k >= len(model.layers):
            # we don't look at the last (fully-connected) layers in the savefile
            break
        g = f["layer_{}".format(k)]
        weights = [g["param_{}".format(p)] for p in range(g.attrs["nb_params"])]
        model.layers[k].set_weights(weights)
    f.close()
    print("Model loaded.")

    generator = datagen.flow_from_directory(
        train_data_dir,
        target_size=(img_width, img_height),
        batch_size=32,
        class_mode=None,
        shuffle=False)
    print("generator ok.")
    bottleneck_features_train = model.predict_generator(generator, nb_train_samples)
    print("predict ok.")
    np.save(open("bottleneck_features_train.npy", "wb"), bottleneck_features_train)

    generator = datagen.flow_from_directory(
        validation_data_dir,
        target_size=(img_width, img_height),
        batch_size=32,
        class_mode=None,
        shuffle=False)
    bottleneck_features_validation = model.predict_generator(generator, nb_validation_samples)
    np.save(open("bottleneck_features_validation.npy", "wb"), bottleneck_features_validation)
    print("save_bottlebeck_features ok")


def train_top_model():
    train_data = np.load(open("bottleneck_features_train.npy", "rb"))
    train_labels = np.array([0] * (nb_train_samples // 2) + [1] * (nb_train_samples // 2))

    validation_data = np.load(open("bottleneck_features_validation.npy", "rb"))
    validation_labels = np.array([0] * (nb_validation_samples // 2) + [1] * (nb_validation_samples // 2))

    model = Sequential()
    model.add(Flatten(input_shape=train_data.shape[1:]))
    model.add(Dense(256, activation="relu"))
    model.add(Dropout(0.5))
    model.add(Dense(1, activation="sigmoid"))

    model.compile(optimizer="rmsprop", loss="binary_crossentropy", metrics=["accuracy"])

    model.fit(train_data, train_labels,
              nb_epoch=nb_epoch, batch_size=32,
              validation_data=(validation_data, validation_labels))
    model.save_weights(top_model_weights_path)
    print("train_top_model ok")


save_bottlebeck_features()
train_top_model()
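The weight-loading loop in save_bottlebeck_features walks the savefile layer by layer and stops once the index runs past the model's own layers; that is how the fully-connected weights at the end of vgg16_weights.h5 get skipped. The control flow can be mimicked without h5py (the layer counts below are illustrative, not read from a real weights file):

```python
def load_matching_layers(savefile_layer_count, model_layer_count):
    """Return the indices of savefile layers that would be copied into the model."""
    loaded = []
    for k in range(savefile_layer_count):
        if k >= model_layer_count:
            # the savefile's trailing (fully-connected) layers are ignored
            break
        loaded.append(k)
    return loaded

# Hypothetical counts: a savefile with extra FC layers at the end, and a model
# that stops after the convolutional blocks. Only the first
# model_layer_count entries are copied.
print(len(load_matching_layers(37, 31)))   # 31
```

The same early-exit pattern is why the truncated model can consume the full ImageNet weights file without error.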