Hands-On Deep Learning: Building a Multi-Task, Multi-Label Model

  • 2024-11-15

Multi-task, multi-label models are a foundational architecture in modern machine learning. The task is conceptually simple: train one model to predict multiple outputs across multiple tasks at the same time.

In this article, we will build a multi-task, multi-label model on the popular MovieLens dataset using sparse features, walking through the whole process step by step: data preparation, model construction, the training loop, model diagnostics, and finally deploying the model with Ray Serve.

1. Setting Up the Environment

Before diving into the code, make sure the necessary libraries are installed (this is not an exhaustive list):

pip install pandas scikit-learn torch ray[serve] matplotlib requests tensorboard

The dataset we use here is small enough that training on a CPU is perfectly feasible.

2. Preparing the Dataset

We will start by creating a class that handles downloading and preprocessing the MovieLens dataset, then split the data into training and test sets.

The MovieLens dataset contains information about users, movies, and their ratings. We will use it to predict both the rating (a regression task) and whether the user liked the movie (a binary classification task).
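Before building the dataset class, it helps to see how the two targets relate: the rating is used directly as the regression target, and the classification target is derived from it by thresholding. The 4.0 cutoff below is an illustrative assumption, not something fixed by MovieLens:

```python
def make_targets(rating, like_threshold=4.0):
    """Turn one raw rating into (regression_target, classification_target).

    like_threshold=4.0 is an assumed convention for "liked",
    not part of the MovieLens data itself.
    """
    liked = 1.0 if rating >= like_threshold else 0.0
    return rating, liked

print(make_targets(4.5))  # (4.5, 1.0): high rating, liked
print(make_targets(2.0))  # (2.0, 0.0): low rating, not liked
```

Because the second target is a deterministic function of the first, any disagreement between the two heads at inference time is a useful diagnostic signal rather than a data issue.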

import os
import zipfile

import pandas as pd
import requests
import torch
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder


class MovieLensDataset:
    """Downloads, preprocesses, and splits MovieLens into train/test sets."""

    # Official GroupLens archive for the "small" dataset version
    URLS = {"small": "https://files.grouplens.org/datasets/movielens/ml-latest-small.zip"}

    def __init__(self, dataset_version="small", data_dir="data"):
        print("Initializing MovieLensDataset...")
        if not os.path.exists(data_dir):
            os.makedirs(data_dir)

        # Download and extract the archive if it is not already on disk
        archive_path = os.path.join(data_dir, f"ml-{dataset_version}.zip")
        if not os.path.exists(archive_path):
            response = requests.get(self.URLS[dataset_version])
            response.raise_for_status()
            with open(archive_path, "wb") as f:
                f.write(response.content)
            with zipfile.ZipFile(archive_path) as zf:
                zf.extractall(data_dir)

        ratings = pd.read_csv(os.path.join(data_dir, "ml-latest-small", "ratings.csv"))

        # Encode raw user/movie ids into contiguous indices for the embedding tables
        self.user_encoder = LabelEncoder()
        self.movie_encoder = LabelEncoder()
        ratings["user"] = self.user_encoder.fit_transform(ratings["userId"])
        ratings["movie"] = self.movie_encoder.fit_transform(ratings["movieId"])

        # Binary target: treat a rating of 4.0 or higher as "liked" (a common convention)
        ratings["liked"] = (ratings["rating"] >= 4.0).astype("float32")

        self.train_df, self.test_df = train_test_split(ratings, test_size=0.2, random_state=42)

    def get_dataset(self, split):
        if split == "train":
            data = self.train_df
        elif split == "test":
            data = self.test_df
        else:
            raise ValueError("Invalid split. Choose 'train' or 'test'.")
        dense_features = torch.tensor(data[['user', 'movie']].values, dtype=torch.long)
        labels = torch.tensor(data[['rating', 'liked']].values, dtype=torch.float32)
        return dense_features, labels

    def get_encoders(self):
        return self.user_encoder, self.movie_encoder

With MovieLensDataset defined, we can load the training and evaluation splits into memory and wrap them in DataLoaders:

# Example usage; tensors are moved to the GPU inside the training loop if you are using a GPU
from torch.utils.data import DataLoader, TensorDataset

dataset = MovieLensDataset(dataset_version="small")
print("Getting training dataset...")
train_features, train_labels = dataset.get_dataset("train")
print("Getting testing dataset...")
test_features, test_labels = dataset.get_dataset("test")

# Create the DataLoaders consumed by the training loop (batch size of 64 is an arbitrary choice)
train_loader = DataLoader(TensorDataset(train_features, train_labels), batch_size=64, shuffle=True)
test_loader = DataLoader(TensorDataset(test_features, test_labels), batch_size=64, shuffle=False)

3. Defining the Multi-Task, Multi-Label Model

We will define a basic PyTorch model that handles the two tasks: predicting the rating (regression) and predicting whether the user liked the movie (binary classification).

The model represents users and movies with sparse embeddings and passes them through shared layers, which then feed into two separate output heads.

By sharing some layers across tasks while keeping a separate output layer for each task, the model exploits a shared representation yet still tailors its predictions to each task.
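To get a feel for where the capacity of such a shared architecture sits, a quick back-of-the-envelope parameter count helps. The layer shapes mirror the model defined below; the user and movie counts are the approximate sizes of the MovieLens "small" dataset and are assumptions for illustration:

```python
def count_params(n_users, n_movies, embedding_size, hidden_size):
    """Parameter count for two embedding tables, one shared linear layer,
    and two single-unit task heads (weights plus biases)."""
    embeddings = (n_users + n_movies) * embedding_size
    shared = (2 * embedding_size) * hidden_size + hidden_size
    heads = 2 * (hidden_size + 1)
    return embeddings + shared + heads

# Approximate sizes for MovieLens "small": ~610 users, ~9724 movies
print(count_params(610, 9724, embedding_size=16, hidden_size=32))  # → 166466
```

Nearly all of the parameters live in the embedding tables, which is typical for recommender models built on sparse id features; the shared trunk and the per-task heads are comparatively tiny.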

import torch
from torch import nn


class MultiTaskMovieLensModel(nn.Module):
    def __init__(self, n_users, n_movies, embedding_size, hidden_size):
        super(MultiTaskMovieLensModel, self).__init__()
        self.user_embedding = nn.Embedding(n_users, embedding_size)
        self.movie_embedding = nn.Embedding(n_movies, embedding_size)
        self.shared_layer = nn.Linear(embedding_size * 2, hidden_size)
        self.shared_activation = nn.ReLU()
        self.task1_fc = nn.Linear(hidden_size, 1)
        self.task2_fc = nn.Linear(hidden_size, 1)
        self.task2_activation = nn.Sigmoid()

    def forward(self, x):
        user = x[:, 0]
        movie = x[:, 1]
        user_embed = self.user_embedding(user)
        movie_embed = self.movie_embedding(movie)
        combined = torch.cat((user_embed, movie_embed), dim=1)
        shared_out = self.shared_activation(self.shared_layer(combined))
        rating_out = self.task1_fc(shared_out)
        liked_out = self.task2_fc(shared_out)
        liked_out = self.task2_activation(liked_out)
        return rating_out, liked_out

  • Input (x): a batch of integer pairs, where column 0 is the encoded user id and column 1 is the encoded movie id.

  • User and movie embeddings: each id is looked up in its own embedding table, yielding two dense vectors of size embedding_size.

  • Concatenation: the two embeddings are concatenated into a single vector of size embedding_size * 2.

  • Shared layer: a fully connected layer followed by ReLU produces a hidden representation used by both tasks.

  • Task-specific outputs: task1_fc produces the raw rating prediction, while task2_fc followed by a sigmoid produces the "liked" probability.

  • Returns: the model returns two outputs, rating_out for the regression task and liked_out for the classification task.

4. Training Loop

First, we instantiate the model with some arbitrarily chosen hyperparameters (the embedding dimension and the number of neurons in the hidden layer). For the regression task we will use mean squared error loss, and for the classification task, binary cross-entropy.

We can normalize the two losses by their initial values to ensure they both stay on a roughly similar scale (uncertainty weighting is another option for normalizing the losses here).
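As a minimal sketch of this normalization scheme (the loss values are made up for illustration): each task loss is divided by the first value observed for that task, so the combined objective starts near 2.0 and each term then tracks that task's relative progress rather than its raw scale.

```python
def combined_loss(loss_rating, loss_liked, initial_loss_rating, initial_loss_liked):
    """Scale each task loss by its initial value so both contribute comparably."""
    return loss_rating / initial_loss_rating + loss_liked / initial_loss_liked

# On the first batch both normalized terms equal 1.0:
print(combined_loss(9.0, 0.7, 9.0, 0.7))  # → 2.0
# Halving both raw losses halves the combined objective, regardless of their scales:
print(combined_loss(4.5, 0.35, 9.0, 0.7))
```

Without this scaling, the MSE term (which can start around the square of the rating range) would dominate the BCE term, and the gradient signal for the classification head would be drowned out.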

We then train the model with the DataLoaders, tracking the loss for each task. The losses are plotted to visualize how the model learns and generalizes to the evaluation set over time.

import torch.optim as optim
import matplotlib.pyplot as plt

# Check if GPU is available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

embedding_size = 16
hidden_size = 32
n_users = len(dataset.get_encoders()[0].classes_)
n_movies = len(dataset.get_encoders()[1].classes_)

model = MultiTaskMovieLensModel(n_users, n_movies, embedding_size, hidden_size).to(device)
criterion_rating = nn.MSELoss()
criterion_liked = nn.BCELoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

train_rating_losses, train_liked_losses = [], []
eval_rating_losses, eval_liked_losses = [], []
epochs = 10

# used for loss normalization
initial_loss_rating = None
initial_loss_liked = None

for epoch in range(epochs):
    model.train()
    running_loss_rating = 0.0
    running_loss_liked = 0.0
    for dense_features, labels in train_loader:
        optimizer.zero_grad()
        dense_features = dense_features.to(device)
        labels = labels.to(device)
        rating_pred, liked_pred = model(dense_features)
        rating_target = labels[:, 0].unsqueeze(1)
        liked_target = labels[:, 1].unsqueeze(1)
        loss_rating = criterion_rating(rating_pred, rating_target)
        loss_liked = criterion_liked(liked_pred, liked_target)
        # Set initial losses
        if initial_loss_rating is None:
            initial_loss_rating = loss_rating.item()
        if initial_loss_liked is None:
            initial_loss_liked = loss_liked.item()
        # Normalize losses
        loss = (loss_rating / initial_loss_rating) + (loss_liked / initial_loss_liked)
        loss.backward()
        optimizer.step()
        running_loss_rating += loss_rating.item()
        running_loss_liked += loss_liked.item()
    train_rating_losses.append(running_loss_rating / len(train_loader))
    train_liked_losses.append(running_loss_liked / len(train_loader))

    model.eval()
    eval_loss_rating = 0.0
    eval_loss_liked = 0.0
    with torch.no_grad():
        for dense_features, labels in test_loader:
            dense_features = dense_features.to(device)
            labels = labels.to(device)
            rating_pred, liked_pred = model(dense_features)
            rating_target = labels[:, 0].unsqueeze(1)
            liked_target = labels[:, 1].unsqueeze(1)
            loss_rating = criterion_rating(rating_pred, rating_target)
            loss_liked = criterion_liked(liked_pred, liked_target)
            eval_loss_rating += loss_rating.item()
            eval_loss_liked += loss_liked.item()
    eval_rating_losses.append(eval_loss_rating / len(test_loader))
    eval_liked_losses.append(eval_loss_liked / len(test_loader))

    print(f'Epoch {epoch+1}, Train Rating Loss: {train_rating_losses[-1]}, Train Liked Loss: {train_liked_losses[-1]}, Eval Rating Loss: {eval_rating_losses[-1]}, Eval Liked Loss: {eval_liked_losses[-1]}')

# Plotting losses
plt.figure(figsize=(14, 6))
plt.subplot(1, 2, 1)
plt.plot(train_rating_losses, label='Train Rating Loss')
plt.plot(eval_rating_losses, label='Eval Rating Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Rating Loss')
plt.legend()
plt.subplot(1, 2, 2)
plt.plot(train_liked_losses, label='Train Liked Loss')
plt.plot(eval_liked_losses, label='Eval Liked Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Liked Loss')
plt.legend()
plt.tight_layout()
plt.show()

Training can also be monitored with TensorBoard:

from torch.utils.tensorboard import SummaryWriter

# Check if GPU is available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

# Model and Training Setup
embedding_size = 16
hidden_size = 32
user_encoder, movie_encoder = dataset.get_encoders()
n_users = len(user_encoder.classes_)
n_movies = len(movie_encoder.classes_)

model = MultiTaskMovieLensModel(n_users, n_movies, embedding_size, hidden_size).to(device)
criterion_rating = nn.MSELoss()
criterion_liked = nn.BCELoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
epochs = 10

# used for loss normalization
initial_loss_rating = None
initial_loss_liked = None

# TensorBoard setup
writer = SummaryWriter(log_dir='runs/multitask_movie_lens')

# Training Loop with TensorBoard Logging
for epoch in range(epochs):
    model.train()
    running_loss_rating = 0.0
    running_loss_liked = 0.0
    for batch_idx, (dense_features, labels) in enumerate(train_loader):
        # Move the batch to the training device
        dense_features = dense_features.to(device)
        labels = labels.to(device)
        optimizer.zero_grad()
        rating_pred, liked_pred = model(dense_features)
        rating_target = labels[:, 0].unsqueeze(1)
        liked_target = labels[:, 1].unsqueeze(1)
        loss_rating = criterion_rating(rating_pred, rating_target)
        loss_liked = criterion_liked(liked_pred, liked_target)
        if initial_loss_rating is None:
            initial_loss_rating = loss_rating.item()
        if initial_loss_liked is None:
            initial_loss_liked = loss_liked.item()
        loss = (loss_rating / initial_loss_rating) + (loss_liked / initial_loss_liked)
        loss.backward()
        optimizer.step()
        running_loss_rating += loss_rating.item()
        running_loss_liked += loss_liked.item()
    # Log the average per-epoch training losses
    writer.add_scalar('Loss/train_rating', running_loss_rating / len(train_loader), epoch)
    writer.add_scalar('Loss/train_liked', running_loss_liked / len(train_loader), epoch)
writer.close()

Launch TensorBoard with tensorboard --logdir=runs/multitask_movie_lens to follow the curves while training runs.

