Author: 走过滴岁月688 | Source: Internet | 2023-05-25 00:52
Is there a Python API in TensorFlow (or any other way) to check whether a TPU accelerator is a v2 or a v3 TPU?
You can use tf.profiler.experimental.client.monitor to find out the type of the TPU device. To use a TPU in Colab, you first need to create a TPU strategy. The complete code is shown below:
%tensorflow_version 2.x
import os

import tensorflow as tf

try:
    # Detect the TPU attached to this runtime
    tpu = tf.distribute.cluster_resolver.TPUClusterResolver()
    print('Running on TPU ', tpu.cluster_spec().as_dict()['worker'])
except ValueError:
    raise RuntimeError('ERROR: Not connected to a TPU runtime; please see the previous cell in this notebook for instructions!')

# Connect to the TPU cluster, initialize the TPU system, and create the strategy
tf.config.experimental_connect_to_cluster(tpu)
tf.tpu.experimental.initialize_tpu_system(tpu)
tpu_strategy = tf.distribute.TPUStrategy(tpu)
Output:
Running on TPU ['10.96.157.82:8470']
INFO:tensorflow:Initializing the TPU system: grpc://10.96.157.82:8470
INFO:tensorflow:Clearing out eager caches
INFO:tensorflow:Finished initializing TPU system.
INFO:tensorflow:Found TPU system:
INFO:tensorflow:*** Num TPU Cores: 8
INFO:tensorflow:*** Num TPU Workers: 1
INFO:tensorflow:*** Num TPU Cores Per Worker: 8
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:localhost/replica:0/task:0/device:CPU:0,CPU,0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:localhost/replica:0/task:0/device:CPU:0,0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:localhost/replica:0/task:0/device:XLA_CPU:0,XLA_CPU,0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:CPU:0,0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:0,TPU,0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:1,0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:2,0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:3,0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:4,0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:5,0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:6,0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:7,0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU_SYSTEM:0,TPU_SYSTEM,0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:XLA_CPU:0,0)
The TPU system was initialized at 10.96.157.82:8470, but the profiler service is started on all TPU workers at port 8466, so 8470 must be replaced with 8466:
tpu_worker = os.environ['COLAB_TPU_ADDR'].replace('8470','8466')
print(tf.profiler.experimental.client.monitor(tpu_worker,1))
Output:
Timestamp: 15:37:16
TPU type: TPU v2
Utilization of TPU Matrix Units (higher is better): 0.000%
From this output, you can tell the type of the TPU (here, TPU v2). For more information, see this.
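If you want the TPU type as a value rather than reading it from printed text, the two steps above (swapping the port, then reading the "TPU type:" line) can be wrapped in small helpers. This is only a minimal sketch: the names profiler_address and parse_tpu_type are hypothetical, and the parser assumes the monitor output keeps the "TPU type: ..." line format shown above, which may differ across TensorFlow versions.

```python
import re
from typing import Optional


def profiler_address(tpu_addr: str, profiler_port: int = 8466) -> str:
    """Replace the port of a 'host:port' TPU address with the profiler port.

    Hypothetical helper; 8466 is the profiler port mentioned in the answer.
    """
    host, _, _ = tpu_addr.rpartition(":")
    if not host:
        raise ValueError(f"expected 'host:port', got {tpu_addr!r}")
    return f"{host}:{profiler_port}"


def parse_tpu_type(monitor_output: str) -> Optional[str]:
    """Return the text after 'TPU type:' (e.g. 'TPU v2'), or None if absent.

    Assumes the line format seen in the sample monitor output.
    """
    match = re.search(r"^\s*TPU type:\s*(.+?)\s*$", monitor_output, re.MULTILINE)
    return match.group(1) if match else None


# Sample text copied from the monitor output shown in the answer.
sample = (
    "Timestamp: 15:37:16\n"
    "TPU type: TPU v2\n"
    "Utilization of TPU Matrix Units (higher is better): 0.000%\n"
)

print(profiler_address("10.96.157.82:8470"))
print(parse_tpu_type(sample))
```

On a Colab TPU runtime, you could presumably pass the string returned by tf.profiler.experimental.client.monitor(profiler_address(os.environ['COLAB_TPU_ADDR']), 1) to parse_tpu_type. Splitting on the last colon avoids touching the host part, which a plain string replace of '8470' could in principle do.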