The Rules of TensorFlow

As I see it, a TensorFlow program is a "program within a program".

For an ordinary program, code like the following is simple and clear:

// C code
#include <stdio.h>

int main(void) {
    const int a = 1;
    const int b = 2;
    int c = a + b; // once this line has executed, c already holds a + b = 3
    printf("%d\n", c);
    return 0;
}

With TensorFlow, you have to write it like this:

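# TensorFlow code (1.x API)
import tensorflow as tf

a = tf.constant(1)
b = tf.constant(2)
c = a + b # after this line, c represents the operation a + b; no value has been computed yet
with tf.Session() as sess: # instantiate the computation defined above
    print(sess.run(c)) # compute the value of c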

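So the rule in TensorFlow is: define the operations first. The actual computation starts only when you explicitly run them through a tf.Session. It is as if you wrote a TensorFlow program in Python code, and then used more Python code to execute that program.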
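
To make the "program within a program" idea concrete, here is a toy sketch in plain Python. It is not how TensorFlow is implemented, and the names Node, constant and run are made up for illustration. It only mimics the rule above: a + b records a description of the addition, and nothing is computed until run is called.

```python
# Toy model of "define first, run later". Not TensorFlow code:
# Node, constant and run are hypothetical names used only for illustration.


class Node:
    """One step of a deferred computation: an operation plus its input nodes."""

    def __init__(self, op, inputs=(), value=None):
        self.op = op          # "const" or "add"
        self.inputs = inputs  # the nodes this node depends on
        self.value = value    # only used by "const" nodes

    def __add__(self, other):
        # Building the expression only records the operation; nothing is added yet.
        return Node("add", (self, other))


def constant(value):
    return Node("const", value=value)


def run(node):
    """Walk the recorded operations and actually compute the result."""
    if node.op == "const":
        return node.value
    if node.op == "add":
        left, right = node.inputs
        return run(left) + run(right)
    raise ValueError("unknown op: " + node.op)


a = constant(1)
b = constant(2)
c = a + b                # c is a Node describing "a + b", not the number 3
print(type(c).__name__)  # Node
print(run(c))            # 3
```

In this sketch run plays the role that sess.run plays in the TensorFlow snippet: it is the only place where arithmetic actually happens.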

Basic types

Constants

a = tf.constant(1)
b = tf.constant(2, dtype = tf.int32, name = 'this_is_a_name')

Variables

a = tf.Variable(1)
b = tf.Variable(2, dtype = tf.float32, name = 'another_name')

Placeholder

a = tf.placeholder(dtype = tf.float32, shape = [2, 2])
b = tf.placeholder(dtype = tf.int32, shape = [1], name = 'place_holder')

The computation process

TensorFlow computations have to be started explicitly, and every computation runs inside a Session.

a = tf.constant(1)
b = tf.Variable(2)
c = tf.placeholder(dtype = tf.int32, shape = [])
data = 3
b_change = tf.assign(b, a) # defines an assignment operation, b = a
init = tf.global_variables_initializer() # an operation that initializes all global variables
# everything above only defines the data and the computation; nothing has been computed yet
with tf.Session() as sess: # get a Session object; computation starts only when Session.run() is called
    sess.run(init)
    print(sess.run(b)) # 2
    sess.run(b_change)
    print(sess.run(b)) # 1
    print(sess.run(c, feed_dict={c: data})) # rank-0 tensor, 3

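It follows that:

To use a tf.Variable, you also have to define a variable-initialization operation, tf.global_variables_initializer(), and run it first
To start a computation, you need a tf.Session object
A placeholder is like a parameter of this TensorFlow program: its value comes from outside and is supplied through feed_dict = {foo: bar}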
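
The three observations above can be mimicked in a few lines of plain Python. This is only a toy sketch: Placeholder, Variable and ToySession are hypothetical names, and the error types are chosen for the example rather than copied from TensorFlow. It shows a variable that cannot be read before it is initialized and a placeholder whose value arrives through feed_dict at run time.

```python
# Toy model of variable initialization and feed_dict. Not TensorFlow code:
# Placeholder, Variable and ToySession are hypothetical names for illustration.


class Placeholder:
    """A named slot whose value must be supplied when the computation runs."""

    def __init__(self, name):
        self.name = name


class Variable:
    """Knows its initial value, but has no current value until initialized."""

    def __init__(self, initial_value):
        self.initial_value = initial_value


class ToySession:
    def __init__(self):
        self.values = {}  # current values of the variables initialized so far

    def initialize(self, variables):
        for variable in variables:
            self.values[variable] = variable.initial_value

    def run(self, fetch, feed_dict=None):
        feed_dict = feed_dict or {}
        if isinstance(fetch, Placeholder):
            if fetch not in feed_dict:
                raise ValueError("placeholder %r was not fed" % fetch.name)
            return feed_dict[fetch]
        if isinstance(fetch, Variable):
            if fetch not in self.values:
                raise RuntimeError("variable used before initialization")
            return self.values[fetch]
        raise TypeError("cannot run %r" % (fetch,))


b = Variable(2)
c = Placeholder("c")
sess = ToySession()

try:
    sess.run(b)  # fails: b has not been initialized in this session yet
except RuntimeError as err:
    print("error:", err)

sess.initialize([b])
print(sess.run(b))                    # 2
print(sess.run(c, feed_dict={c: 3}))  # 3
```

The design point the sketch tries to capture is that the session, not the variable object, owns the current value, which is why an initialization step has to run inside the session before the variable can be read.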

Some shortcuts

Running a computation requires a tf.Session, and the usual way to use one is to put the code inside a with tf.Session() as sess: block.

TensorFlow provides a more convenient way:

a = tf.constant(1)
# Style 1
with tf.Session() as sess:
    print(sess.run(a))
# Style 2
sess = tf.InteractiveSession()
print(sess.run(a))
sess.close()

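As you can see, the advantage of InteractiveSession is that it needs no indentation, so you can use it directly at an interactive prompt, which is also why it is called Interactive (style 1 really is a pain at an interactive prompt). However, with style 2 you have to call sess.close() explicitly to release resources, whereas style 1 releases them for you automatically.

If you want the value of a tensor, there is another shortcut. It works because an InteractiveSession installs itself as the default session, which is what a.eval() uses when no session is passed: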

a = tf.constant(1)
sess = tf.InteractiveSession()
# Style 1
print(sess.run(a))
# Style 2
print(a.eval())
sess.close()

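Personally I find style 2 more comfortable, but the advantage of style 1 is that it can return several results from a single call, for example: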

a = tf.constant(1)
b = tf.constant(2)
sess = tf.InteractiveSession()
aa, bb = sess.run([a, b])
sess.close()

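Graphs

Everyone says that TensorFlow's paradigm is the computation graph. The graph here is a graph in the graph-theory sense: a directed graph in which each node is an operation, which is easy enough to understand. So when you write the code above, the tensors a, b, c (more precisely, the operations that produce them) and all the other operations are nodes in one graph, the default graph that TensorFlow creates for you automatically at startup. If TensorFlow is a program within a program, then a graph is a module of that program.

You can create a new graph and define that module's operations in it with the following code: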

default_graph = tf.get_default_graph()
new_graph = tf.Graph()
with new_graph.as_default():
    a = tf.constant(1) # a belongs to new_graph
b = tf.constant(2) # b belongs to default_graph
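
One way to picture what the with block does is a stack of graphs whose top entry is "the default". The sketch below is a toy in plain Python and not TensorFlow's implementation. ToyGraph, get_default_graph and constant are hypothetical names used only to show how a context manager can redirect where new nodes are recorded.

```python
# Toy model of a "default graph" that a with-block swaps temporarily.
# Not TensorFlow code: ToyGraph, get_default_graph and constant are
# hypothetical names used only for illustration.
from contextlib import contextmanager

_graph_stack = []  # the last entry is the current default graph


class ToyGraph:
    def __init__(self):
        self.nodes = []  # everything defined while this graph was the default

    @contextmanager
    def as_default(self):
        _graph_stack.append(self)  # become the default inside the with-block
        try:
            yield self
        finally:
            _graph_stack.pop()     # restore the previous default on exit


_graph_stack.append(ToyGraph())  # plays the role of the automatic default graph


def get_default_graph():
    return _graph_stack[-1]


def constant(value):
    # A new node is recorded in whichever graph is the default right now.
    get_default_graph().nodes.append(value)
    return value


default_graph = get_default_graph()
new_graph = ToyGraph()
with new_graph.as_default():
    constant(1)  # recorded in new_graph
constant(2)      # recorded in default_graph

print(new_graph.nodes)      # [1]
print(default_graph.nodes)  # [2]
```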

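Summary

TensorFlow's computation-graph model is quite novel and rather interesting, but there are a few things to complain about:

APIs get deprecated at the drop of a hat, so the code in most textbooks available in China is already out of date (it still runs, but being told "xxx is deprecated" is really annoying)
Data definitions are inconsistent with NumPy, for example: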

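# tensorflow: the shape is passed as a single list
foo = tf.random_uniform([2, 2])

# numpy: each dimension is passed as a separate argument
bar = np.random.rand(2, 2)

Method names are chosen far too casually. For example, while teaching myself I ran into this one:

tf.nn.softmax_cross_entropy_with_logits_v2() # seriously, do you think you are Objective-C???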