How to Build a Dual-Encoder Neural Network Model

Source: LiveVideoStack (WeChat: livevideostack) · Author: LiveVideoStack · 2021-03-02 · Original title: Natural Language Image Search with a Dual Encoder

How to build a dual encoder (also known as two-tower) neural network model to search for images using natural language.

1 Introduction

This example demonstrates how to build a dual encoder (also known as two-tower) neural network model to search for images using natural language. The model is inspired by the CLIP approach proposed by Alec Radford et al.: the idea is to jointly train a vision encoder and a text encoder that project the representations of images and their captions into the same embedding space, so that the embedding of a caption lies near the embeddings of the images it describes. This example requires TensorFlow 2.4 or higher. In addition, TensorFlow Hub and TensorFlow Text are needed for the BERT model, and TensorFlow Addons is needed for the AdamW optimizer. These libraries can be installed with the following command:

pip install -q -U tensorflow-hub tensorflow-text tensorflow-addons

2 Setup

import os
import collections
import json
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import tensorflow_hub as hub
import tensorflow_text as text
import tensorflow_addons as tfa
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
from tqdm import tqdm

# Suppressing tf.hub warnings
tf.get_logger().setLevel("ERROR")

3 Prepare the Data

We use the MS-COCO dataset to train our dual encoder model. MS-COCO contains more than 82,000 images, each of which has at least 5 different caption annotations. The dataset is usually used for image captioning tasks, but we can repurpose the image-caption pairs to train a dual encoder model for image search.

Download and extract the data

First, download the dataset. It consists of two compressed folders: one containing the images, and the other the associated image captions. Note that the compressed images folder is 13GB in size.

root_dir = "datasets"
annotations_dir = os.path.join(root_dir, "annotations")
images_dir = os.path.join(root_dir, "train2014")
tfrecords_dir = os.path.join(root_dir, "tfrecords")
annotation_file = os.path.join(annotations_dir, "captions_train2014.json")

# Download caption annotation files
if not os.path.exists(annotations_dir):
    annotation_zip = tf.keras.utils.get_file(
        "captions.zip",
        cache_dir=os.path.abspath("."),
        origin="https://images.cocodataset.org/annotations/annotations_trainval2014.zip",
        extract=True,
    )
    os.remove(annotation_zip)

# Download image files
if not os.path.exists(images_dir):
    image_zip = tf.keras.utils.get_file(
        "train2014.zip",
        cache_dir=os.path.abspath("."),
        origin="https://images.cocodataset.org/zips/train2014.zip",
        extract=True,
    )
    os.remove(image_zip)

print("Dataset is downloaded and extracted successfully.")

with open(annotation_file, "r") as f:
    annotations = json.load(f)["annotations"]

image_path_to_caption = collections.defaultdict(list)
for element in annotations:
    caption = f"{element['caption'].lower().rstrip('.')}"
    image_path = images_dir + "/COCO_train2014_" + "%012d.jpg" % (element["image_id"])
    image_path_to_caption[image_path].append(caption)

image_paths = list(image_path_to_caption.keys())
print(f"Number of images: {len(image_paths)}")

Downloading data from https://images.cocodataset.org/annotations/annotations_trainval2014.zip
252878848/252872794 [==============================] - 5s 0us/step
Downloading data from https://images.cocodataset.org/zips/train2014.zip
13510574080/13510573713 [==============================] - 394s 0us/step
Dataset is downloaded and extracted successfully.
Number of images: 82783

Process and save the data to TFRecord files

You can change the train_size parameter to control how many image-caption pairs will be used for training the dual encoder model. In this example we set train_size to 30,000 images, which is about 35% of the dataset. We use 2 captions for each image, thus producing 60,000 image-caption pairs. The size of the training set affects the quality of the resulting encoders; more examples mean longer training time.

train_size = 30000
valid_size = 5000
captions_per_image = 2
images_per_file = 2000

train_image_paths = image_paths[:train_size]
num_train_files = int(np.ceil(train_size / images_per_file))
train_files_prefix = os.path.join(tfrecords_dir, "train")

valid_image_paths = image_paths[-valid_size:]
num_valid_files = int(np.ceil(valid_size / images_per_file))
valid_files_prefix = os.path.join(tfrecords_dir, "valid")

tf.io.gfile.makedirs(tfrecords_dir)


def bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))


def create_example(image_path, caption):
    feature = {
        "caption": bytes_feature(caption.encode()),
        "raw_image": bytes_feature(tf.io.read_file(image_path).numpy()),
    }
    return tf.train.Example(features=tf.train.Features(feature=feature))


def write_tfrecords(file_name, image_paths):
    caption_list = []
    image_path_list = []
    for image_path in image_paths:
        captions = image_path_to_caption[image_path][:captions_per_image]
        caption_list.extend(captions)
        image_path_list.extend([image_path] * len(captions))

    with tf.io.TFRecordWriter(file_name) as writer:
        for example_idx in range(len(image_path_list)):
            example = create_example(
                image_path_list[example_idx], caption_list[example_idx]
            )
            writer.write(example.SerializeToString())
    return example_idx + 1


def write_data(image_paths, num_files, files_prefix):
    example_counter = 0
    for file_idx in tqdm(range(num_files)):
        file_name = files_prefix + "-%02d.tfrecord" % (file_idx)
        start_idx = images_per_file * file_idx
        end_idx = start_idx + images_per_file
        example_counter += write_tfrecords(file_name, image_paths[start_idx:end_idx])
    return example_counter


train_example_count = write_data(train_image_paths, num_train_files, train_files_prefix)
print(f"{train_example_count} training examples were written to tfrecord files.")

valid_example_count = write_data(valid_image_paths, num_valid_files, valid_files_prefix)
print(f"{valid_example_count} evaluation examples were written to tfrecord files.")

100%|████████████████████████████████████████████████████████████████████████████████████████████| 15/15 [03:19<00:00, 13.27s/it]
  0%|                                                                                             | 0/3 [00:00

Create tf.data.Dataset objects for training and evaluation

feature_description = {
    "caption": tf.io.FixedLenFeature([], tf.string),
    "raw_image": tf.io.FixedLenFeature([], tf.string),
}


def read_example(example):
    features = tf.io.parse_single_example(example, feature_description)
    raw_image = features.pop("raw_image")
    features["image"] = tf.image.resize(
        tf.image.decode_jpeg(raw_image, channels=3), size=(299, 299)
    )
    return features


def get_dataset(file_pattern, batch_size):
    return (
        tf.data.TFRecordDataset(tf.data.Dataset.list_files(file_pattern))
        .map(
            read_example,
            num_parallel_calls=tf.data.experimental.AUTOTUNE,
            deterministic=False,
        )
        .shuffle(batch_size * 10)
        .prefetch(buffer_size=tf.data.experimental.AUTOTUNE)
        .batch(batch_size)
    )

4 Implement the Projection Head

The projection head is used to transform the image and text embeddings into the same embedding space with the same dimensionality.

def project_embeddings(
    embeddings, num_projection_layers, projection_dims, dropout_rate
):
    projected_embeddings = layers.Dense(units=projection_dims)(embeddings)
    for _ in range(num_projection_layers):
        x = tf.nn.gelu(projected_embeddings)
        x = layers.Dense(projection_dims)(x)
        x = layers.Dropout(dropout_rate)(x)
        x = layers.Add()([projected_embeddings, x])
        projected_embeddings = layers.LayerNormalization()(x)
    return projected_embeddings

5 Implement the Vision Encoder

In this example, we use Xception from Keras Applications as the base of the vision encoder.

def create_vision_encoder(
    num_projection_layers, projection_dims, dropout_rate, trainable=False
):
    # Load the pre-trained Xception model to be used as the base encoder.
    xception = keras.applications.Xception(
        include_top=False, weights="imagenet", pooling="avg"
    )
    # Set the trainability of the base encoder.
    for layer in xception.layers:
        layer.trainable = trainable
    # Receive the images as inputs.
    inputs = layers.Input(shape=(299, 299, 3), name="image_input")
    # Preprocess the input image.
    xception_input = tf.keras.applications.xception.preprocess_input(inputs)
    # Generate the embeddings for the images using the xception model.
    embeddings = xception(xception_input)
    # Project the embeddings produced by the model.
    outputs = project_embeddings(
        embeddings, num_projection_layers, projection_dims, dropout_rate
    )
    # Create the vision encoder model.
    return keras.Model(inputs, outputs, name="vision_encoder")

6 Implement the Text Encoder

We use BERT from TensorFlow Hub as the text encoder.

def create_text_encoder(
    num_projection_layers, projection_dims, dropout_rate, trainable=False
):
    # Load the BERT preprocessing module.
    preprocess = hub.KerasLayer(
        "https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/2",
        name="text_preprocessing",
    )
    # Load the pre-trained BERT model to be used as the base encoder.
    bert = hub.KerasLayer(
        "https://tfhub.dev/tensorflow/small_bert/bert_en_uncased_L-4_H-512_A-8/1",
        name="bert",
    )
    # Set the trainability of the base encoder.
    bert.trainable = trainable
    # Receive the text as inputs.
    inputs = layers.Input(shape=(), dtype=tf.string, name="text_input")
    # Preprocess the text.
    bert_inputs = preprocess(inputs)
    # Generate embeddings for the preprocessed text using the BERT model.
    embeddings = bert(bert_inputs)["pooled_output"]
    # Project the embeddings produced by the model.
    outputs = project_embeddings(
        embeddings, num_projection_layers, projection_dims, dropout_rate
    )
    # Create the text encoder model.
    return keras.Model(inputs, outputs, name="text_encoder")

7 Implement the Dual Encoder

To compute the loss, we compute the pairwise dot-product similarity between each caption_i and each image_j in the batch as the predictions. The target similarity between caption_i and image_j is computed as the average of the (dot-product similarity between caption_i and caption_j) and the (dot-product similarity between image_i and image_j). We then use crossentropy to compute the loss between the targets and the predictions.
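To make this loss construction concrete before the full Keras implementation below, here is a small, self-contained NumPy sketch on a toy batch of two caption-image pairs. The embedding values are made up purely for illustration.

import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Toy batch: 2 caption embeddings and 2 image embeddings of dimension 4 (illustrative values).
caption_embeddings = np.array([[0.1, 0.3, 0.5, 0.2], [0.4, 0.1, 0.2, 0.6]])
image_embeddings = np.array([[0.2, 0.2, 0.4, 0.1], [0.3, 0.0, 0.1, 0.7]])
temperature = 0.05

# Predictions: logits[i][j] = dot_similarity(caption_i, image_j) / temperature.
logits = caption_embeddings @ image_embeddings.T / temperature

# Targets: softmax over the average of caption-caption and image-image similarities.
captions_similarity = caption_embeddings @ caption_embeddings.T
images_similarity = image_embeddings @ image_embeddings.T
targets = softmax((captions_similarity + images_similarity) / (2 * temperature))

# Crossentropy between targets and softmaxed logits, in both directions, then averaged.
captions_loss = -(targets * np.log(softmax(logits))).sum(axis=1)
images_loss = -(targets.T * np.log(softmax(logits.T))).sum(axis=1)
loss = (captions_loss + images_loss) / 2
print(loss)  # One loss value per pair in the toy batch.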

class DualEncoder(keras.Model):
    def __init__(self, text_encoder, image_encoder, temperature=1.0, **kwargs):
        super(DualEncoder, self).__init__(**kwargs)
        self.text_encoder = text_encoder
        self.image_encoder = image_encoder
        self.temperature = temperature
        self.loss_tracker = keras.metrics.Mean(name="loss")

    @property
    def metrics(self):
        return [self.loss_tracker]

    def call(self, features, training=False):
        # Place each encoder on a separate GPU (if available).
        # TF will fall back on available devices if there are fewer than 2 GPUs.
        with tf.device("/gpu:0"):
            # Get the embeddings for the captions.
            caption_embeddings = self.text_encoder(features["caption"], training=training)
        with tf.device("/gpu:1"):
            # Get the embeddings for the images.
            image_embeddings = self.image_encoder(features["image"], training=training)
        return caption_embeddings, image_embeddings

    def compute_loss(self, caption_embeddings, image_embeddings):
        # logits[i][j] is the dot_similarity(caption_i, image_j).
        logits = (
            tf.matmul(caption_embeddings, image_embeddings, transpose_b=True)
            / self.temperature
        )
        # images_similarity[i][j] is the dot_similarity(image_i, image_j).
        images_similarity = tf.matmul(
            image_embeddings, image_embeddings, transpose_b=True
        )
        # captions_similarity[i][j] is the dot_similarity(caption_i, caption_j).
        captions_similarity = tf.matmul(
            caption_embeddings, caption_embeddings, transpose_b=True
        )
        # targets[i][j] = average of dot_similarity(caption_i, caption_j) and dot_similarity(image_i, image_j).
        targets = keras.activations.softmax(
            (captions_similarity + images_similarity) / (2 * self.temperature)
        )
        # Compute the loss for the captions using crossentropy.
        captions_loss = keras.losses.categorical_crossentropy(
            y_true=targets, y_pred=logits, from_logits=True
        )
        # Compute the loss for the images using crossentropy.
        images_loss = keras.losses.categorical_crossentropy(
            y_true=tf.transpose(targets), y_pred=tf.transpose(logits), from_logits=True
        )
        # Return the mean of the loss over the batch.
        return (captions_loss + images_loss) / 2

    def train_step(self, features):
        with tf.GradientTape() as tape:
            # Forward pass
            caption_embeddings, image_embeddings = self(features, training=True)
            loss = self.compute_loss(caption_embeddings, image_embeddings)
        # Backward pass
        gradients = tape.gradient(loss, self.trainable_variables)
        self.optimizer.apply_gradients(zip(gradients, self.trainable_variables))
        # Monitor loss
        self.loss_tracker.update_state(loss)
        return {"loss": self.loss_tracker.result()}

    def test_step(self, features):
        caption_embeddings, image_embeddings = self(features, training=False)
        loss = self.compute_loss(caption_embeddings, image_embeddings)
        self.loss_tracker.update_state(loss)
        return {"loss": self.loss_tracker.result()}

8 Train the Dual Encoder Model

In this experiment, we freeze the base encoders for text and images, and train only the projection heads.

num_epochs = 5  # In practice, train for at least 30 epochs
batch_size = 256

vision_encoder = create_vision_encoder(
    num_projection_layers=1, projection_dims=256, dropout_rate=0.1
)
text_encoder = create_text_encoder(
    num_projection_layers=1, projection_dims=256, dropout_rate=0.1
)
dual_encoder = DualEncoder(text_encoder, vision_encoder, temperature=0.05)
dual_encoder.compile(
    optimizer=tfa.optimizers.AdamW(learning_rate=0.001, weight_decay=0.001)
)

Note that training the model on 60,000 image-caption pairs with a batch size of 256 takes around 12 minutes per epoch on a V100 GPU accelerator. With 2 GPUs, each epoch takes around 8 minutes.

print(f"Number of GPUs: {len(tf.config.list_physical_devices('GPU'))}")print(f"Number of examples (caption-image pairs): {train_example_count}")print(f"Batch size: {batch_size}")print(f"Steps per epoch: {int(np.ceil(train_example_count / batch_size))}")train_dataset = get_dataset(os.path.join(tfrecords_dir, "train-*.tfrecord"), batch_size)valid_dataset = get_dataset(os.path.join(tfrecords_dir, "valid-*.tfrecord"), batch_size)# Create a learning rate scheduler callback.reduce_lr = keras.callbacks.ReduceLROnPlateau( monitor="val_loss", factor=0.2, patience=3)# Create an early stopping callback.early_stopping = tf.keras.callbacks.EarlyStopping( monitor="val_loss", patience=5, restore_best_weights=True)history = dual_encoder.fit( train_dataset, epochs=num_epochs, validation_data=valid_dataset, callbacks=[reduce_lr, early_stopping],)print("Training completed. Saving vision and text encoders...")vision_encoder.save("vision_encoder")text_encoder.save("text_encoder")print("Models are saved.")

Number of GPUs: 2
Number of examples (caption-image pairs): 60000
Batch size: 256
Steps per epoch: 235
Epoch 1/5
235/235 [==============================] - 573s 2s/step - loss: 60.8318 - val_loss: 9.0531
Epoch 2/5
235/235 [==============================] - 553s 2s/step - loss: 7.8959 - val_loss: 5.2654
Epoch 3/5
235/235 [==============================] - 541s 2s/step - loss: 4.6644 - val_loss: 4.9260
Epoch 4/5
235/235 [==============================] - 538s 2s/step - loss: 4.0188 - val_loss: 4.6312
Epoch 5/5
235/235 [==============================] - 539s 2s/step - loss: 3.5555 - val_loss: 4.3503
Training completed. Saving vision and text encoders...
Models are saved.

Plot the training loss:

plt.plot(history.history["loss"])
plt.plot(history.history["val_loss"])
plt.ylabel("Loss")
plt.xlabel("Epoch")
plt.legend(["train", "valid"], loc="upper right")
plt.show()

9 Search for Images Using Natural Language Queries

We can retrieve images corresponding to a natural language query through the following steps:

1. Generate embeddings for the images by feeding them into the vision_encoder.

2. Feed the natural language query to the text_encoder to generate a query embedding.

3. Compute the similarity between the query embedding and the image embeddings in the index to retrieve the indices of the top matches.

4. Look up the paths of the top-matching images and display them.

Note that after training the dual encoder, only the fine-tuned vision_encoder and text_encoder models will be used, while the dual_encoder model is discarded.

Generate embeddings for the images

We load the images and feed them into the vision_encoder to generate their embeddings. In large-scale systems, this step would be performed with a parallel data-processing framework such as Apache Spark or Apache Beam. Generating the image embeddings may take a few minutes.
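As a rough illustration of such a distributed setup, here is a minimal Apache Beam sketch. It reuses the image_paths list and the saved vision_encoder from earlier steps; the DoFn name, per-image processing, and output path are assumptions made for illustration and are not part of this example, which simply calls vision_encoder.predict as shown below.

import apache_beam as beam
import tensorflow as tf
from tensorflow import keras


class EmbedImages(beam.DoFn):
    # Hypothetical DoFn: loads the saved vision encoder once per worker and
    # emits (image_path, embedding) pairs.
    def setup(self):
        self.vision_encoder = keras.models.load_model("vision_encoder")

    def process(self, image_path):
        image = tf.image.decode_jpeg(tf.io.read_file(image_path), channels=3)
        image = tf.image.resize(image, (299, 299))
        embedding = self.vision_encoder.predict(tf.expand_dims(image, 0), verbose=0)[0]
        yield image_path, embedding.tolist()


with beam.Pipeline() as pipeline:
    (
        pipeline
        | "Create image paths" >> beam.Create(image_paths)
        | "Embed images" >> beam.ParDo(EmbedImages())
        | "Format as text" >> beam.Map(lambda pair: f"{pair[0]}\t{pair[1]}")
        | "Write embeddings" >> beam.io.WriteToText("datasets/image_embeddings")
    )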

print("Loading vision and text encoders...")vision_encoder = keras.models.load_model("vision_encoder")text_encoder = keras.models.load_model("text_encoder")print("Models are loaded.") def read_image(image_path): image_array = tf.image.decode_jpeg(tf.io.read_file(image_path), channels=3) return tf.image.resize(image_array, (299, 299)) print(f"Generating embeddings for {len(image_paths)} images...")image_embeddings = vision_encoder.predict( tf.data.Dataset.from_tensor_slices(image_paths).map(read_image).batch(batch_size), verbose=1,)print(f"Image embeddings shape: {image_embeddings.shape}.")

Loading vision and text encoders...
Models are loaded.
Generating embeddings for 82783 images...
324/324 [==============================] - 437s 1s/step
Image embeddings shape: (82783, 256).

Retrieve relevant images

In this example, we use exact matching by computing the dot-product similarity between the input query embedding and the image embeddings, and retrieve the top k matches. However, in real-world use cases, approximate matching with frameworks such as ScaNN, Annoy, or Faiss is preferred in order to scale to a large number of images.
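As a hedged sketch of that approximate-matching alternative, the snippet below indexes the L2-normalized image embeddings with Annoy. The number of trees, the index file name, and the helper find_matches_approximate are arbitrary choices for illustration; this example itself sticks to exact matching with the find_matches function below.

import tensorflow as tf
from annoy import AnnoyIndex

embedding_dim = image_embeddings.shape[1]
normalized_embeddings = tf.math.l2_normalize(image_embeddings, axis=1).numpy()

# "angular" distance on L2-normalized vectors ranks neighbors by cosine similarity.
annoy_index = AnnoyIndex(embedding_dim, "angular")
for idx, embedding in enumerate(normalized_embeddings):
    annoy_index.add_item(idx, embedding)
annoy_index.build(100)  # 100 trees: an illustrative accuracy/speed trade-off.
annoy_index.save("image_embeddings.ann")


def find_matches_approximate(query, k=9):
    # Embed and normalize the query, then look up approximate nearest neighbors.
    query_embedding = text_encoder(tf.convert_to_tensor([query]))
    query_embedding = tf.math.l2_normalize(query_embedding, axis=1).numpy()[0]
    indices = annoy_index.get_nns_by_vector(query_embedding, k)
    return [image_paths[idx] for idx in indices]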

def find_matches(image_embeddings, queries, k=9, normalize=True):
    # Get the embedding for the query.
    query_embedding = text_encoder(tf.convert_to_tensor(queries))
    # Normalize the query and the image embeddings.
    if normalize:
        image_embeddings = tf.math.l2_normalize(image_embeddings, axis=1)
        query_embedding = tf.math.l2_normalize(query_embedding, axis=1)
    # Compute the dot product between the query and the image embeddings.
    dot_similarity = tf.matmul(query_embedding, image_embeddings, transpose_b=True)
    # Retrieve top k indices.
    results = tf.math.top_k(dot_similarity, k).indices.numpy()
    # Return matching image paths.
    return [[image_paths[idx] for idx in indices] for indices in results]

Set the query variable to the kind of images you want to search for. Try queries like "a plate of healthy food", "a woman wearing a hat walking down a sidewalk", "a bird sitting by the water", or "wild animals standing in a field".

query = "a family standing next to the ocean on a sandy beach with a surf board"matches = find_matches(image_embeddings, [query], normalize=True)[0] plt.figure(figsize=(20, 20))for i in range(9): ax = plt.subplot(3, 3, i + 1) plt.imshow(mpimg.imread(matches[i])) plt.axis("off")

Evaluate the retrieval quality

To evaluate the dual encoder model, we use the captions as queries. We use the out-of-training-sample images and captions to evaluate the retrieval quality, using top-k accuracy: a prediction counts as a true positive if, for a given caption, its associated image is retrieved within the top k matches.

def compute_top_k_accuracy(image_paths, k=100):
    hits = 0
    num_batches = int(np.ceil(len(image_paths) / batch_size))
    for idx in tqdm(range(num_batches)):
        start_idx = idx * batch_size
        end_idx = start_idx + batch_size
        current_image_paths = image_paths[start_idx:end_idx]
        queries = [
            image_path_to_caption[image_path][0]
            for image_path in current_image_paths
        ]
        result = find_matches(image_embeddings, queries, k)
        hits += sum(
            [
                image_path in matches
                for (image_path, matches) in list(zip(current_image_paths, result))
            ]
        )
    return hits / len(image_paths)


print("Scoring training data...")
train_accuracy = compute_top_k_accuracy(train_image_paths)
print(f"Train accuracy: {round(train_accuracy * 100, 3)}%")

print("Scoring evaluation data...")
eval_accuracy = compute_top_k_accuracy(image_paths[train_size:])
print(f"Eval accuracy: {round(eval_accuracy * 100, 3)}%")

0%| | 0/118 [00:00

Closing Remarks

You can obtain better results by increasing the size of the training sample, training for more epochs, exploring other base encoders for the images and the text, setting the base encoders to be trainable, and tuning the hyperparameters, especially the temperature of the softmax used in the loss computation. One way to make the base encoders trainable is shown in the sketch below.
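For instance, a minimal fine-tuning sketch built on the functions defined above might look like the following. The specific values (trainable=True, temperature=0.1, learning_rate=1e-5, epochs=30) are illustrative assumptions, not settings from this example.

# Assumed fine-tuning configuration: unfreeze the base encoders and adjust the
# temperature and learning rate. All values here are illustrative, not prescriptive.
vision_encoder = create_vision_encoder(
    num_projection_layers=1, projection_dims=256, dropout_rate=0.1, trainable=True
)
text_encoder = create_text_encoder(
    num_projection_layers=1, projection_dims=256, dropout_rate=0.1, trainable=True
)
dual_encoder = DualEncoder(text_encoder, vision_encoder, temperature=0.1)
dual_encoder.compile(
    optimizer=tfa.optimizers.AdamW(learning_rate=1e-5, weight_decay=0.001)
)
dual_encoder.fit(train_dataset, epochs=30, validation_data=valid_dataset)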
