How to debug TensorFlow


This Python tutorial is a part of our series of Python packages related tutorials.

Try a faster and easier way to manage your Python packages and resolve dependencies. Use Python 3.9 by ActiveState and build your own runtime with the packages and dependencies you need. The ActiveState Platform’s command line interface, the State Tool, will automatically resolve dependencies for you so that your Python environment isn’t corrupted by incompatible dependencies.

Get started for free by creating an account on the ActiveState Platform or logging in with your GitHub account.


TensorFlow is an open source Python library for complex numeric computation. It also serves as a language for building computational graphs that represent neural networks and, like any code, TensorFlow programs need debugging when they do not behave as intended.


The graph-based functionality of TensorFlow, combined with the stochastic nature of many of the deep learning algorithms that it uses, makes debugging TensorFlow challenging. Certain traditional debugging methods do not work with TensorFlow.

For example, inserting print statements helps with debugging imperative code, but in graph mode a Python print statement runs only while the graph is being built, not each time the graph executes, so it does not show the tensor values you are looking for.

Different Types of Bugs

Debugging in TensorFlow can refer to bugs in code, as well as to models that are not converging or not producing target values.

This post provides recommendations and best practices for avoiding both kinds of problems by using: 

  • Methods that reduce the need for debugging wherever possible
  • Functions and debugging tools included in the TensorFlow library

TensorFlow Debug Methods

There are four principal methods for debugging TensorFlow:

  • Print values within Session.run
  • Use the tf.print operation
  • Increase logging
  • Use the TensorFlow debugger and API functions

Debug TensorFlow by Printing values within Session.run

Build TensorFlow objects in a graph, evaluate them with Session.run, and print the result. This is an easy, safe method for getting tensor values.

A Session object encapsulates the environment in which other objects are executed and evaluated. Because the session is encapsulated, errors inside it do not affect code outside it, which simplifies debugging. For example:

import tensorflow as tf 

# Eager execution is automatic in TensorFlow 2.x
# and needs to be disabled in this instance:
tf.compat.v1.disable_eager_execution()

# Create a graph:
a = tf.constant(4.0)
b = tf.constant(5.0)
c = a * b

# Launch the graph in a session:
sess = tf.compat.v1.Session()

# Evaluate tensor 'c' in this session:
print(sess.run(c))

Figure 1. Print session.run() values.
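Session.run also accepts a list of tensors, which is handy when you want to inspect an intermediate value alongside the final result. The following is a minimal sketch, assuming TensorFlow 2.x with the compat.v1 API; the extra tensor d and the variable names are illustrative and not part of the original example. Building the ops inside an explicit tf.Graph avoids disabling eager execution for the whole program.

```python
import tensorflow as tf

# Build the ops inside an explicit graph so eager execution
# can stay enabled for the rest of the program.
graph = tf.Graph()
with graph.as_default():
    a = tf.constant(4.0)
    b = tf.constant(5.0)
    c = a * b      # intermediate value we want to inspect
    d = c + 1.0    # final value (hypothetical extra step)

# Fetch both tensors in a single run call.
with tf.compat.v1.Session(graph=graph) as sess:
    c_val, d_val = sess.run([c, d])

print("c =", c_val, "d =", d_val)
```

Fetching several tensors in one call evaluates the graph once, so the printed values are consistent with each other.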


Debug TensorFlow Using the tf.print Operation

Substitute Python print with tf.print to print tensor values during graph execution.

tf.print() is a TensorFlow operator that prints its inputs to a specified output stream (sys.stderr by default), which makes it useful for tracking values during execution. For example:

import tensorflow as tf
import sys 

tensor = tf.range(8)

# Print "[0 1 2 ... 7]" to sys.stderr:
tf.print(tensor, output_stream=sys.stderr)
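The difference between Python print and tf.print shows up inside a function compiled with tf.function. The sketch below is illustrative only: the function double and the list trace_log are hypothetical names. The Python-side statements run once, while the function is traced into a graph, whereas tf.print becomes part of the graph and runs on every call.

```python
import tensorflow as tf

trace_log = []  # records each time the Python body actually runs


@tf.function
def double(x):
    # Python side effects execute only while the function is traced.
    trace_log.append("traced")
    print("Python print: tracing double()")
    # tf.print is a graph op, so it executes on every call.
    tf.print("tf.print: x =", x)
    return x * 2


# Both calls use the same dtype and shape, so the function is traced once.
first = double(tf.constant(3))
second = double(tf.constant(4))

print("Python body ran", len(trace_log), "time(s)")
```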

Debug TensorFlow by Increasing Logging

Log files can be a source of debugging information. Begin by importing the required library: import logging

Most TensorFlow logging facilities support different levels of severity. TensorFlow supports the following five standard levels, in order of increasing severity: DEBUG, INFO, WARN, ERROR, FATAL

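To make better use of logging, increase the verbosity of TensorFlow's C++ logs by setting the TF_CPP_VMODULE environment variable in the shell before launching Python (it is an environment variable, not Python code). The value is a comma-separated list of module=level pairs. The module names below belong to TensorFlow's TensorRT integration:

TF_CPP_VMODULE=segment=2,convert_graph=2,convert_nodes=2,trt_engine_op=2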
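If you prefer to configure this from a script rather than the shell, one option is to set the variable through os.environ. This is a sketch under the assumption that the variable is set before TensorFlow is imported, since the C++ runtime reads it from the process environment; the modules list simply repeats the module names used above.

```python
import os

# Module names whose verbose (VLOG) output we want, each at level 2.
modules = ["segment", "convert_graph", "convert_nodes", "trt_engine_op"]

# TF_CPP_VMODULE expects comma-separated module=level pairs.
vmodule = ",".join(f"{name}=2" for name in modules)
os.environ["TF_CPP_VMODULE"] = vmodule

# Import TensorFlow only after the variable is set, for example:
# import tensorflow as tf

print(vmodule)
```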

To return the TensorFlow logger instance, enter:

tf.get_logger()
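The object returned by tf.get_logger() is a standard Python logging.Logger, so its verbosity can be adjusted with the usual logging API. A minimal sketch, with an illustrative log message:

```python
import logging

import tensorflow as tf

# tf.get_logger() returns the Python logger that TensorFlow itself uses.
logger = tf.get_logger()

# Lower the threshold so that DEBUG and INFO messages are emitted too.
logger.setLevel(logging.DEBUG)

logger.debug("Debug-level logging is now enabled")
```

Setting the level back to logging.WARNING or logging.ERROR quiets the output again once debugging is finished.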

Debug TensorFlow Using the TensorFlow Debugger and API Functions

The TensorFlow debugger dumps debugging information like:

  • TensorFlow Function constructions, such as:
    • A compilation of Python functions decorated with @tf.function
    • Op types
    • Names (if available)
    • Context
    • Input and output tensors, and the associated stack traces
  • Execution of TensorFlow operations (ops) and Functions and their stack traces, op types, names (if available) and contexts. Depending on the value of the tensor_debug_mode argument, the value(s) of the output tensors or more concise summaries of the tensor values will be dumped.
  • A snapshot of Python source files involved in the execution of the TensorFlow program.

You can implement TensorFlow debugging and send the info to a specified location by adding a single line of code at the top of your TensorFlow application:

import tensorflow as tf

tf.debugging.experimental.enable_dump_debug_info('/tmp/my-tfdbg-dumps')

# TensorFlow application code:
....

Figure 2: TensorFlow debugging:
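As a slightly fuller sketch of the same idea, the example below dumps debug information for a small function into a temporary directory. The function square_plus_one and the use of tempfile are assumptions made for illustration. The tensor_debug_mode and circular_buffer_size arguments are parameters of enable_dump_debug_info; "FULL_HEALTH" records health summaries of the tensors, and -1 disables the circular buffer so that events are written out as they occur.

```python
import os
import tempfile

import tensorflow as tf

# Hypothetical dump location: a fresh temporary directory.
dump_dir = tempfile.mkdtemp(prefix="tfdbg-")

# Enable dumping; the call returns a writer that can be flushed explicitly.
writer = tf.debugging.experimental.enable_dump_debug_info(
    dump_dir,
    tensor_debug_mode="FULL_HEALTH",
    circular_buffer_size=-1,
)


@tf.function
def square_plus_one(x):
    return x * x + 1.0


result = square_plus_one(tf.constant(3.0))

# Make sure everything collected so far is written to disk.
writer.FlushNonExecutionFiles()
writer.FlushExecutionFiles()

tf.debugging.experimental.disable_dump_debug_info()

print("Result:", float(result))
print("Dump files:", sorted(os.listdir(dump_dir)))
```

The dumped files can then be inspected with the Debugger V2 plugin in TensorBoard by pointing it at the dump directory.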


TensorFlow debugging can be disabled by running: 

tf.debugging.experimental.disable_dump_debug_info()

Enable TensorFlow Numerics Checking 

Once tf.debugging.enable_check_numerics() has been called, TensorFlow raises an error and stops execution as soon as it encounters a tensor containing Infinity or NaN. The setting takes effect only on the thread in which it is called.

For example: 

import tensorflow as tf
import numpy as np

# Enable numeric checking:

tf.debugging.enable_check_numerics()

x = np.array([[0.0, -1.0], [7.0, 8.0]])

# A negative element in the input array produces a NaN. Because numeric checking
# is enabled, the program raises an error and prints an error message:
y = tf.math.sqrt(x) 

Figure 3. Numeric checking output: 


The numerics checking mechanism can be disabled with:

tf.debugging.disable_check_numerics()
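In a longer script you may want to catch the error rather than let it end the program. The following sketch assumes the error surfaces as tf.errors.InvalidArgumentError, which is what the checking mechanism raises in eager mode; the variable names are illustrative.

```python
import tensorflow as tf

tf.debugging.enable_check_numerics()

caught = False
try:
    # The square root of a negative number produces NaN,
    # which the numerics check reports as an error.
    tf.math.sqrt(tf.constant([-1.0, 4.0]))
except tf.errors.InvalidArgumentError as err:
    caught = True
    # Print only the first line of the (long) diagnostic message.
    print("Numerics check failed:", str(err).splitlines()[0])
finally:
    # Always switch checking off again so later code is unaffected.
    tf.debugging.disable_check_numerics()
```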

Decorate Python Functions with @tf.function for Error Detection

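tf.function is a decorator that converts a Python function into a callable TensorFlow graph function. Because the function body is traced into a graph, some errors, such as incompatible tensor shapes, surface while the graph is being built. In this example, the tf.function decorator converts ‘add’ into a callable graph function:

import tensorflow as tf

@tf.function
def add(a, b):
  return a + b

add(tf.ones([2, 2]), tf.ones([2, 2]))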

Figure 4. ‘add’ is called as a function by tf.function and returns a + b:
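To illustrate the error-detection side, the sketch below calls the same function once with compatible shapes and once with incompatible ones. The mismatched shapes and the variable names are assumptions chosen for illustration. The exact exception type can depend on where the check happens, so both likely types are caught.

```python
import tensorflow as tf


@tf.function
def add(a, b):
    return a + b


# Compatible shapes: the function is traced and executed normally.
result = add(tf.ones([2, 2]), tf.ones([2, 2]))
print(result.numpy())

# Incompatible shapes: the problem is reported when the call is traced.
mismatch_detected = False
try:
    add(tf.ones([2, 2]), tf.ones([3, 3]))
except (ValueError, tf.errors.InvalidArgumentError) as err:
    mismatch_detected = True
    print("Shape error detected:", type(err).__name__)
```

If you need to step through the function body with a regular Python debugger, tf.config.run_functions_eagerly(True) makes decorated functions run eagerly instead of as graphs.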


Utilize TensorFlow tf.debugging API Functions 

Use the API to check for bugs, errors, and True/False conditions depending on specifications within each function.

For example, to check tensors for NaN and Infinity values, you can use the tf.debugging.check_numerics function. When run, the function raises an InvalidArgumentError if the tensor contains any NaN or Infinity values; otherwise it returns the tensor unchanged:

tf.debugging.check_numerics(
    tensor, message, name=None
)
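A short usage sketch of this function follows. The tensors and messages are made up for illustration; the message argument is included in the error text, which helps locate the failing check.

```python
import tensorflow as tf

# A tensor with only finite values passes through unchanged.
healthy = tf.constant([1.0, 2.0])
checked = tf.debugging.check_numerics(healthy, message="healthy tensor")

# A tensor containing NaN triggers an InvalidArgumentError.
bad = tf.constant([1.0, float("nan")])
nan_detected = False
try:
    tf.debugging.check_numerics(bad, message="bad tensor contains NaN or Inf")
except tf.errors.InvalidArgumentError:
    nan_detected = True

print("Checked values:", checked.numpy())
print("NaN detected:", nan_detected)
```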

For a complete list of tf.debugging API functions, refer to the API docs.


A modern solution to Python package management – Try ActiveState’s Platform

The ActiveState Platform is a cloud-based build tool for Python. It provides build automation and vulnerability remediation for:

  • Python language cores, including Python 2.7 and Python 3.5+
  • Python packages and their dependencies, including:
    • Transitive dependencies (i.e., dependencies of dependencies)
    • Linked C and Fortran libraries, so you can build data science packages
    • Operating system-level dependencies for Windows, Linux, and macOS
    • Shared dependencies (e.g., OpenSSL)
  • Find, fix and automatically rebuild a secure version of Python packages like Django and environments in minutes


Python Package Management In Action

Get a hands-on appreciation for how the ActiveState Platform can help you manage your dependencies for Python environments. Just run the following command to install Python 3.9 and our package manager, the State Tool:

Windows

powershell -Command "& $([scriptblock]::Create((New-Object Net.WebClient).DownloadString('https://platform.activestate.com/dl/cli/install.ps1'))) -activate-default ActiveState-Labs/Python-3.9Beta"

Linux

sh <(curl -q https://platform.activestate.com/dl/cli/install.sh) --activate-default ActiveState-Labs/Python-3.9Beta

Now you can run state install <packagename>. Learn more about how to use the State Tool to manage your Python environment.

Let us know your experience in the ActiveState Community forum.

Remi M