[NNVM][TEST] Test against numerical grad #1505
yzhliu merged 23 commits into apache:master from sgrechanik-h:numgrad-pr
Conversation
@kazum @kevinthesun @srkreddy1238 could you please review this PR?
srkreddy1238
left a comment
@sgrechanik-h
That was a great start to standardize the helpers across various test cases 👍
A few comments to allow non-numpy forward/backward callers.
Otherwise this PR looks great, and I'm waiting to get this merged so that all other test cases can use this instead of local helpers.
    None (default) means that all input variables will be used. May be a
    list of pairs `(var_name, shape)`.
    ...
    np_forward : Callable[..., List[numpy.ndarray]], optional
Suggest giving a different name to np_forward: it need not be numpy always; it could be an implementation in any of the frontends we have.
Also receive and pass a flexible parameter 'attr' (a dictionary object with default value None) for this function, which can carry implementation-specific data like session info, etc.
    logging.debug(debug_stage)
    ...
    numpy_res = np_forward(**np_inputs_without_head_grads)
    np.testing.assert_allclose(nnvm_res[0], numpy_res, atol=atol, rtol=rtol)
In the context of the above comment on np_forward:
Please check if the result from np_forward is a list and, if yes, compare all arrays in the list.
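A minimal sketch of the comparison the reviewer suggests, using only numpy (`assert_forward_matches` is a hypothetical name, not part of the PR):

```python
import numpy as np

def assert_forward_matches(nnvm_res, reference_res, atol=1e-5, rtol=1e-5):
    # Hypothetical helper illustrating the suggestion: the reference
    # function may return either a single array or a list of arrays,
    # so normalize to a list and compare every output pairwise.
    if not isinstance(reference_res, (list, tuple)):
        reference_res = [reference_res]
    for out, ref in zip(nnvm_res, reference_res):
        np.testing.assert_allclose(out, ref, atol=atol, rtol=rtol)
```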
    dump_graph : bool
        Dump the graph even on success.
    """
Not sure an option for a custom comparison function argument makes any sense here.
Just raising it as it came across my mind.
    only_targets : Set[str]
        Test only for those targets from `ctx_list()` that are also in this set.
    ...
    numerical_grads : bool or "if_possible"
How about putting "if_possible" case inside numerical_grads=True?
    then check gradients numerically only if this graph can be created (i.e. if there are some
    operations with unimplemented gradients, it will just issue a warning).
    ...
    delta : float
Add optional to optional arguments.
    shape = shape.copy() if shape else {}
    ...
    grad_input_vars_real = []
    for x in grad_input_vars:
This logic looks complicated. How about just passing a list of symbols as grad_input_vars? Shape information can be defined in the shape argument (maybe input_shapes, a dict from str to tuple?) so that we don't have multiple shapes.
Yes, not having shapes in grad_input_vars would be cleaner, I'll fix it.
    dtypes = graph.json_attr('dtype')
    ...
    if shapes is None or dtypes is None:
        graph = graph.apply('InferShape').apply('InferType')
Here we assume set_shape_inputs and set_dtype_inputs have been called. It might be better to directly pass input_shapes and input_types into this function to make it more general.
    forward_graph = nnvm.graph.create(symbol)
    ...
    if dtype is not None:
We need to set_dtype_inputs. just check dtype is not None.
Sorry, I think I didn't understand this one.
If dtype is None, graph_to_function will raise exception? Same thing will happen for shape.
How about setting dtype as optional but not None as default value, such as dtype="float32"? For shape argument we should force user to pass in value. Then the code logic can be simpler.
Setting dtype='float32' as default would simplify a lot of tests a little bit. However, if some variables have their dtypes specified by attributes then either these attributes should be preferred or an error should be raised, so I'm not sure.
OK. Maybe we can put the shape/dtype check right in this function, before infer shape/dtype, and check either input variable shape/dtype attributes are set or shape/dtype arguments are passed.
Since shapes and dtypes of some inputs may be inferred from shapes and dtypes of other inputs, the proper solution would be to run inference passes, and only then check if some inputs are left without types. Then these inputs can be assigned the default type.
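That approach could be sketched as a small post-inference step (`fill_missing_dtypes` is a hypothetical name for illustration; in the real code this would run after the InferType pass):

```python
def fill_missing_dtypes(inferred_dtypes, default="float32"):
    # Hypothetical sketch: after the type-inference pass, any input
    # whose dtype is still unknown (None) falls back to the default,
    # while dtypes fixed by variable attributes or inference are kept.
    return {name: default if dt is None else dt
            for name, dt in inferred_dtypes.items()}
```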
    # nnvm_res contains the output and gradients (if they are needed)
    nnvm_res = main_function(**np_inputs)
    ...
    if np_forward is not None:
If np_forward is None, forward part comparison will be skipped? I think we should enforce np_forward to be set.
Sometimes we might want to check gradients only (or even only numerical gradients). But I think it is a good idea to require passing numerical_grads=True explicitly if no reference function is specified.
    # Since the result may be non-scalar, we have to put another operation on the top,
    # so we just multiply by the randomly generated head_grads.
    # This way we can reuse the gradient values which have already been computed.
    def function(**kwargs):
Change to a more meaningful name.
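The trick described in the quoted comment can be illustrated in isolation, with a more descriptive name (`make_scalar_function` is hypothetical; only numpy is assumed):

```python
import numpy as np

def make_scalar_function(forward, head_grads):
    # Sketch of the trick in the quoted comment: turn a possibly
    # non-scalar forward function into a scalar one by contracting
    # its output with fixed head_grads. The gradient of the resulting
    # scalar wrt the inputs matches the symbolic gradient that was
    # already computed with the same head_grads.
    def scalar_function(**inputs):
        return np.sum(forward(**inputs) * head_grads)
    return scalar_function
```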
    ----------
    function
        A function that takes inputs as keyword arguments (like `function(**input_values)`) and
        returns a scalar result. Should accept and return numpy arrays.
Description is not quite clear here. Scalar or ndarray?
        Relative tolerance.
    """
    ...
    if function_value is None:
We can just compute this value instead of setting it as an argument?
    inputs = [('x', dshape, x)]
    helper(y, inputs, dtype, forward, backward)
    inputs = [(x, dshape)]
    check_function(y, inputs, forward, backward, dtype=dtype)
Here we are setting grad_input_vars=inputs, but inputs are actually input_shapes. It would be more clear if we can refactor the interface of check_function.
    def check_numerical_grads(function, grad_input_vars, input_values, grad_values, function_value=None,
                              delta=1e-3, max_error=1e+3, max_discarded_frac=0.1,
                              atol=1e-2, rtol=1e-2):
        """A helper function that checks that numerical gradients of a function are equal to
Since this is new and will affect all backward operators, can you give more explanation of it?
If developers add new backward ops but find the check failing, how can they quickly debug the issue? Since this is not simply comparing with a numpy computation result, maybe we can have a tutorial for it. This would also be good for the stability of CI while we are introducing more and more backward ops.
    if grad.shape != input_values[x_name].shape:
        raise AssertionError(
            "Gradient wrt '{}' has unexpected shape {}, expected {} "
            .format(x.attr('name'), grad.shape, input_values[x_name].shape))
x is not defined here. x.attr('name') should be x_name.
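With that fix applied, the check might look like this (`check_grad_shapes` is a hypothetical name used just to make the sketch self-contained):

```python
import numpy as np

def check_grad_shapes(grad_values, input_values):
    # Sketch of the corrected check: report the mismatch using x_name
    # (the dict key) instead of the undefined variable x.
    for x_name, grad in grad_values.items():
        if grad.shape != input_values[x_name].shape:
            raise AssertionError(
                "Gradient wrt '{}' has unexpected shape {}, expected {}"
                .format(x_name, grad.shape, input_values[x_name].shape))
```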
    def check_numerical_grads(function, grad_input_vars, input_values, grad_values, function_value=None,
                              delta=1e-3, max_error=1e+3, max_discarded_frac=0.1,
The default value of delta is different from the check_function's one. Should be 1e-2?
Oh, it should be 1e-3 everywhere, thank you!
    return (function(**modified_values) - function_value)/a_delta
    ...
    for x_name in grad_input_vars:
        grad = grad_values[x_name]
We can replace these two lines with for x_name, grad in grad_values.items():. Then grad_input_vars is no longer necessary.
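For reference, the forward-difference scheme in the quoted return statement can be written out as a standalone sketch (`numerical_grad` is a hypothetical name; the real helper also discards points near singularities via max_error/max_discarded_frac):

```python
import numpy as np

def numerical_grad(function, input_values, x_name, delta=1e-3):
    # Minimal sketch of the forward-difference scheme from the quoted
    # line: perturb one element of one input at a time and divide the
    # change in the function value by delta.
    function_value = function(**input_values)
    grad = np.zeros_like(input_values[x_name], dtype=float)
    flat_grad = grad.reshape(-1)
    for i in range(flat_grad.size):
        modified = {k: v.copy() for k, v in input_values.items()}
        modified[x_name].reshape(-1)[i] += delta
        flat_grad[i] = (function(**modified) - function_value) / delta
    return grad
```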
    raise ValueError("numerical_grads must be a bool or 'if_possible', not {}"
                     .format(numerical_grads))
    ...
    input_vars = [x for x in symbol.list_input_variables()]
Can we simply replace this with input_vars = symbol.list_input_variables()?
Also added some tests. The main thing that is left to do is something like a tutorial on what to do if the numerical gradient check fails.

Simplified numerical grad checking, improved docs. Wasn't sure where to put the text on what to do when the numgrad check fails, so I put it under Python API/NNVM API/nnvm.testing. @kazum @kevinthesun @srkreddy1238 could you please review this again?
    shape : Dict[nnvm.Symbol or str, Tuple[int]] or Tuple[int], optional
        A dict mapping input variable names to shapes, or just a single shape.
        By default shapes will be inferred automatically.
How can we infer input shapes? I think this argument shouldn't be optional.
They are inferred from variables' attributes:

    x = sym.Variable("x", shape=(1, 2))
This interface doesn't have any constraint on the input symbol. So I can write some code like:

    data = sym.Variable("data")
    net = sym.conv2d(data, ...)
    check_function(net, shape=None, ...)

It will raise an exception later, since no shape information is provided? Currently this kind of input is marked as valid.
Also for kwargs, we can directly pass default values, instead of just None. Then we don't need to write a lot of "if xxx is None:" statements at the beginning.
Oh, the error message is not nice in this case. I think making it nicer and clarifying the documentation should be enough.
    if shape is not None:
        nnvm.compiler.graph_attr.set_shape_inputs(backward_graph, shape)
    ...
    backward_graph = backward_graph.apply('InferShape').apply('InferType')
https://github.com/dmlc/tvm/blob/master/nnvm/tests/python/compiler/test_top_level1.py#L32-L37
We just need to update the shape of head_grads in the input shape dictionary when compiling.
    Testing new operations
    ----------------------

    When adding new operations, it is a good idea to test them. Testing
    all_shapes = graph.json_attr('shape')
    all_dtypes = graph.json_attr('dtype')
    ...
    all_dtypes = [None if t == -1 else TCODE_TO_DTYPE[t] for t in all_dtypes]
TCODE_TO_DTYPE[-1] is defined as None in graph_attr.py. The if-else expression is not necessary.
    if None in dtype.values():
        raise ValueError("Input variables with no type: {}".format(dtype))
    ...
    if any([not s for s in shape.values()]):
not all(shape.values()) looks simpler.
    if (exclude_targets is not None and (target in exclude_targets or
                                         str(target) in exclude_targets)) or \
       (only_targets is not None and not (target in only_targets or
                                          str(target) in only_targets)):
The expression is a bit complicated. I'd suggest splitting them:

    if exclude_targets is not None:
        if target in exclude_targets or str(target) in exclude_targets:
            logging.info(...)
            continue
    if only_targets is not None:
        if target not in only_targets and str(target) not in only_targets:
            logging.info(...)
            continue
@srkreddy1238 @kevinthesun @kazum could you help to review again according to the new changes and explicitly approve if it looks good to you.
Thanks everyone. This is now merged.
* [NNVM][TEST] Numerical gradient testing
* [NNVM][TEST] Make some tests a little faster
* Fix the failing test_top_level3
* Target exclusion for the check_function
* Try to ignore singularities
* grad_input_vars now can't contain shapes
* Don't pass unnecessary grad_input_vars to check_function
* Multiple outputs; fixes; testing of check_function
* Use numerical_grads_params to pass parameters to numgrad checker
* Fail when no action is requested explicitly
* Pass additional params to functions
* Silence the linter issue
* Simplified numgrad checking
* Improved docs for check_function
* Fixed the error message when no dtype is provided
* Several fixes
* Tests with shape/dtype inference for inputs
* Don't check dense's grads on cuda
* Raise an error if output dtypes haven't been inferred
* Moved shape/dtype inference into a separate function; use float32 as fallback
* Remove redundant dtype=float32
* Fix multiple outputs
* Use check_function in the rest of the test_top_level1
This PR moves the helper function checking nnvm operations against numpy reference implementations from test_top_level# to nnvm.testing, and adds the ability to compare symbolic gradients to numerically computed gradients. It also fixes a wrong gradient implementation for the division operation (which was found using this approach). Currently only the tests that used the old helper function have been rewritten to use the new function. There are still many tests which didn't use the helper function; they should also be rewritten to use this function where possible, but this is left for future work.