How can text data be embedded into dimensional vectors using Python?
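TensorFlow is a machine learning framework provided by Google. It is an open-source framework that is used together with Python to implement algorithms, deep learning applications, and more. It is used in both research and production.
Keras was developed as part of the research for the ONEIROS project (Open-ended Neuro-Electronic Intelligent Robot Operating System). Keras is a deep learning API written in Python. It is a high-level API with a productive interface that helps solve machine learning problems, and it runs on top of the TensorFlow framework. It was designed to support fast experimentation, and it provides the essential abstractions and building blocks needed to develop and package machine learning solutions.
Keras is already included in the TensorFlow package. It can be accessed with the following lines of code.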
import tensorflow as tf
from tensorflow import keras
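The Keras functional API helps create models that are more flexible than those built with the Sequential API. The functional API can handle models with non-linear topology, can share layers, and can work with multiple inputs and outputs. A deep learning model is usually a directed acyclic graph (DAG) of layers, and the functional API is a way to build such graphs of layers.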
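To make the "graph of layers" idea concrete, the sketch below describes a small model as a plain dictionary and asks Python's standard graphlib module for a valid evaluation order. It does not use Keras at all. The layer names and the `execution_order` helper are assumptions chosen for illustration, and Keras tracks this graph internally in its own way.

```python
from graphlib import TopologicalSorter

# Hypothetical layer graph: each layer maps to the layers it takes input from.
layer_graph = {
    "title_embedding": {"title_input"},
    "body_embedding": {"body_input"},
    "title_lstm": {"title_embedding"},
    "body_lstm": {"body_embedding"},
    "concatenate": {"title_lstm", "body_lstm", "tags_input"},
    "priority_output": {"concatenate"},
    "class_output": {"concatenate"},
}


def execution_order(graph):
    """Return one valid order in which the layers could be evaluated."""
    return list(TopologicalSorter(graph).static_order())


order = execution_order(layer_graph)
print(order)
```

Every layer appears after all of the layers that feed it, which is only possible because the graph has no cycles. If the dictionary did contain a cycle, `static_order()` would raise `graphlib.CycleError`.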
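We are using Google Colaboratory to run the code below. Google Colab (Colaboratory) runs Python code in the browser, requires no configuration, and provides free access to GPUs (graphics processing units). Colaboratory is built on top of Jupyter Notebook. The following snippet embeds every word in the title into a 64-dimensional vector: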
Example
from tensorflow import keras
from tensorflow.keras import layers

print("Number of unique issue tags")
num_tags = 12
print("Size of vocabulary while preprocessing text data")
num_words = 10000
print("Number of classes for predictions")
num_classes = 4

# Variable-length sequences of integer word ids
title_input = keras.Input(shape=(None,), name="title")
print("Variable length int sequence")
body_input = keras.Input(shape=(None,), name="body")
# Binary vector of size num_tags
tags_input = keras.Input(shape=(num_tags,), name="tags")

print("Embed every word in the title to a 64-dimensional vector")
title_features = layers.Embedding(num_words, 64)(title_input)
print("Embed every word into a 64-dimensional vector")
body_features = layers.Embedding(num_words, 64)(body_input)

print("Reduce sequence of embedded words into single 128-dimensional vector")
title_features = layers.LSTM(128)(title_features)
print("Reduce sequence of embedded words into single 32-dimensional vector")
body_features = layers.LSTM(32)(body_features)

print("Merge available features into a single vector by concatenating it")
x = layers.concatenate([title_features, body_features, tags_input])

print("Use logistic regression on top of the features for prediction")
priority_pred = layers.Dense(1, name="priority")(x)
department_pred = layers.Dense(num_classes, name="class")(x)

print("Instantiate a model that predicts priority and class")
model = keras.Model(
    inputs=[title_input, body_input, tags_input],
    outputs=[priority_pred, department_pred],
)
Code credit: https://tensorflowcn.cn/guide/keras/functional
Output
Number of unique issue tags
Size of vocabulary while preprocessing text data
Number of classes for predictions
Variable length int sequence
Embed every word in the title to a 64-dimensional vector
Embed every word into a 64-dimensional vector
Reduce sequence of embedded words into single 128-dimensional vector
Reduce sequence of embedded words into single 32-dimensional vector
Merge available features into a single vector by concatenating it
Use logistic regression on top of the features for prediction
Instantiate a model that predicts priority and class
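Conceptually, an embedding layer is a lookup table with one row per word id. The sketch below imitates that behaviour in plain Python, without TensorFlow, to show how a sequence of word ids becomes a sequence of 64-dimensional vectors. The random initial values, the `embed` helper, and the sample ids are assumptions for illustration; a real Embedding layer learns its table during training.

```python
import random

num_words = 10000   # vocabulary size, matching the example above
embedding_dim = 64  # length of each word vector

rng = random.Random(0)
# One row of 64 numbers per word id (random here, learned in a real model).
embedding_table = [
    [rng.uniform(-0.05, 0.05) for _ in range(embedding_dim)]
    for _ in range(num_words)
]


def embed(word_ids):
    """Map a sequence of integer word ids to a sequence of vectors."""
    return [embedding_table[word_id] for word_id in word_ids]


title_ids = [12, 7, 431]  # hypothetical ids for a three-word title
title_vectors = embed(title_ids)
print(len(title_vectors), len(title_vectors[0]))  # prints: 3 64
```

A title of any length can be embedded this way, which is why the title and body inputs are declared with shape (None,).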
Explanation
The functional API can be used to work with multiple inputs and multiple outputs.
The Sequential API cannot do this.
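The widths of the vectors that flow through a model with three inputs and two outputs can be checked with simple arithmetic. The sketch below does this in plain Python, assuming the layer sizes from the example (LSTM outputs of 128 and 32 units, 12 tags, 4 classes). The `concatenated_width` helper is hypothetical and is not part of Keras.

```python
num_tags = 12      # width of the tags input
num_classes = 4    # width of the class output
title_units = 128  # output width of the title LSTM
body_units = 32    # output width of the body LSTM


def concatenated_width(*widths):
    """Concatenating feature vectors along the last axis adds their widths."""
    return sum(widths)


merged_width = concatenated_width(title_units, body_units, num_tags)
output_widths = {"priority": 1, "class": num_classes}

print(merged_width)   # prints: 172
print(output_widths)  # prints: {'priority': 1, 'class': 4}
```

Both output layers read the same 172-dimensional merged vector. A single model branching into two heads like this is the case that needs the functional API.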