Author: 芦子根_889 | Source: Internet | 2023-08-29 18:49
I'm trying to write a layer to merge two tensors with a certain formula (the original formula image is not reproduced here).
The shapes of x[0] and x[1] are both (?, 1, 500).
M is a 500*500 matrix.
I want the output to be (?, 500, 500), which in my opinion is theoretically feasible. The layer will output (1, 500, 500) for every pair of inputs, each shaped (1, 1, 500). Since the batch_size is variable, or dynamic, the output must be (?, 500, 500).
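For reference, a quick NumPy shape check of this expectation. I'm assuming the (lost) merge formula is the per-sample bilinear outer product (x[0]·M)ᵀ·x[1]; the variable names below are illustrative only.

import numpy as np

batch, d = 4, 500
x0 = np.random.rand(batch, 1, d)
x1 = np.random.rand(batch, 1, d)
M = np.random.rand(d, d)

# out[b] = (x0[b] @ M).T @ x1[b], batched in one einsum
out = np.einsum('bik,kl,bim->blm', x0, M, x1)
print(out.shape)   # (4, 500, 500)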
However, I know little about axes, and I have tried every combination of axes without making sense of it.
I tried numpy.tensordot and keras.backend.batch_dot (TensorFlow backend). If the batch_size is fixed, taking a = (100, 1, 500) for example, batch_dot(a, M, (2, 0)) can output (100, 1, 500).
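For comparison, a minimal NumPy sketch of that fixed-batch case with numpy.tensordot:

import numpy as np

a = np.ones((100, 1, 500))
M = np.ones((500, 500))
# Contract the last axis of a with the first axis of M.
out = np.tensordot(a, M, axes=(2, 0))
print(out.shape)   # (100, 1, 500)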
I'm a newbie to Keras, sorry for such a basic question, but I've spent two days trying to figure it out and it's driving me crazy :(
def call(self, x):
    input1 = x[0]
    input2 = x[1]
    # self.M is defined in the build function
    output = K.batch_dot(...)
    return output
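For completeness, a sketch of how the surrounding custom layer might be declared; the class name is hypothetical, and the imports and add_weight call assume a Keras 1.x/2.x-era Layer API:

from keras import backend as K
from keras.engine.topology import Layer  # keras.layers.Layer in newer versions

class BilinearMerge(Layer):  # hypothetical name
    def build(self, input_shape):
        # input_shape is a list of two shapes, each (None, 1, 500)
        dim = input_shape[0][2]
        self.M = self.add_weight(name='M',
                                 shape=(dim, dim),
                                 initializer='glorot_uniform',
                                 trainable=True)
        super(BilinearMerge, self).build(input_shape)

    def call(self, x):
        input1, input2 = x[0], x[1]
        output = K.batch_dot(...)  # the merge in question
        return output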
Update:
Sorry for the late update. I tried Daniel's answer with TensorFlow as Keras's backend, and it still raises a ValueError about unequal dimensions.
I tried the same code with Theano as the backend, and now it works.
>>> import numpy as np
>>> import keras.backend as K
Using Theano backend.
>>> from keras.layers import Input
>>> x1 = Input(shape=[1,500,])
>>> M = K.variable(np.ones([1,500,500]))
>>> firstMul = K.batch_dot(x1, M, axes=[1,2])
I don't know how to print a tensor's shape in Theano; for me it's definitely harder than TensorFlow... but it works.
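For what it's worth, the Keras backend can report the static shape without touching Theano directly; a small check, assuming K.int_shape is available in this Keras version:

>>> K.int_shape(x1)
(None, 1, 500)
>>> firstMul.ndim   # Theano variables at least expose their rank
3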
To see why, I scanned the two backend implementations, for TensorFlow and Theano. The differences follow.
In this case, x = (?, 1, 500), y = (1, 500, 500), axes = [1, 2]
In tensorflow_backend:
return tf.matmul(x, y, adjoint_a=True, adjoint_b=True)
In theano_backend:
return T.batched_tensordot(x, y, axes=axes)
(Assuming the subsequent assignments to out._keras_shape don't influence out's value.)
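A minimal sketch of why the TensorFlow path fails for these shapes (assuming TF 1.x-style placeholders; adjoint_a/adjoint_b transpose the last two axes before multiplying):

import tensorflow as tf

x = tf.placeholder('float32', shape=(None, 1, 500))
y = tf.placeholder('float32', shape=(1, 500, 500))
# After the adjoints, x is (?, 500, 1) and y stays (1, 500, 500):
# the inner dimensions 1 and 500 do not match, hence the ValueError.
tf.matmul(x, y, adjoint_a=True, adjoint_b=True)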
2 Solutions
Solution 1
Your multiplications should select which axes to use in the batch_dot function:
- Axis 0 - the batch dimension; it's your ?
- Axis 1 - the dimension you say has length 1
- Axis 2 - the last dimension, of size 500
You won't change the batch dimension, so you will always use batch_dot with axes=[1,2].
But for that to work, you must adjust M to be (?, 500, 500). For that, define M not as (500,500) but as (1,500,500), and repeat it along the first axis to match the batch size:
import keras.backend as K

# M has shape (1, 500, 500); repeat it along the batch axis.
BatchM = K.repeat_elements(x=M, rep=batch_size, axis=0)
# Not sure if repeating is really necessary; leaving M as (1, 500, 500)
# gives the same output shape at the end. I haven't checked the actual
# numbers for correctness, but I believe it's totally ok.

# Now we can use batch_dot properly:
firstMul = K.batch_dot(x[0], BatchM, axes=[1, 2])  # results in (?, 500, 500)

# We also need to transpose x[1]:
x1T = K.permute_dimensions(x[1], (0, 2, 1))

# And the second multiplication:
result = K.batch_dot(firstMul, x1T, axes=[1, 2])
Solution 2
I prefer using TensorFlow, so I tried to figure it out with TensorFlow over the past few days.
The first approach is very similar to Daniel's solution.
x = tf.placeholder('float32',shape=(None,1,3))
M = tf.placeholder('float32',shape=(None,3,3))
tf.matmul(x, M)
# returns a tensor of shape (?, 1, 3)
It needs values fed to M with matching shapes:
sess = tf.Session()
sess.run(tf.matmul(x,M), feed_dict = {x: [[[1,2,3]]], M: [[[1,2,3],[0,1,0],[0,0,1]]]})
# return : array([[[ 1., 4., 6.]]], dtype=float32)
Another way is simpler, with tf.einsum:
x = tf.placeholder('float32',shape=(None,1,3))
M = tf.placeholder('float32',shape=(3,3))
tf.einsum('ijk,kl->ijl', x, M)
# returns a tensor of shape (?, 1, 3)
Let's feed some values.
sess.run(tf.einsum('ijk,kl->ijl', x, M), feed_dict = {x: [[[1,2,3]]], M: [[1,2,3],[0,1,0],[0,0,1]]})
# return: array([[[ 1., 4., 6.]]], dtype=float32)
Now M is a 2D tensor, and there's no need to feed batch_size into M.
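Putting this back into Keras, a hedged sketch of wrapping the einsum in a Lambda layer (toy size d=3; the Lambda wrapper is my own addition, not part of the answer above, and M is a fixed constant standing in for a learned weight):

import numpy as np
import tensorflow as tf
from keras.layers import Input, Lambda
from keras.models import Model

d = 3
M_const = tf.constant(np.eye(d, dtype='float32'))  # stand-in for a learned M

x = Input(shape=(1, d))
y = Lambda(lambda t: tf.einsum('ijk,kl->ijl', t, M_const),
           output_shape=(1, d))(x)
model = Model(x, y)
print(model.output_shape)   # (None, 1, 3)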
What's more, it now seems such a question can be solved in TensorFlow with tf.einsum. Does that mean it's Keras's duty to invoke tf.einsum in some situations? At least I can find nowhere that Keras calls tf.einsum. And in my opinion, Keras behaves weirdly when batch_dot is given a 3D tensor and a 2D tensor. In Daniel's answer he pads M to (1,500,500), but in K.batch_dot() M will be adjusted to (500,500,1) automatically. I find that tf adjusts it with broadcasting rules, and I'm not sure Keras does the same.