2021-10-19 06:58:53,420 [INFO] root: Registry: ['nvcr.io']
2021-10-19 06:58:53,593 [WARNING] tlt.components.docker_handler.docker_handler: Docker will run the commands as root. If you would like to retain your local host permissions, please add the "user":"UID:GID" in the DockerOptions portion of the "/home/paperspace/.tao_mounts.json" file. You can obtain your users UID and GID by using the "id -u" and "id -g" commands on the terminal.
Using TensorFlow backend.
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
Using TensorFlow backend.
WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/cost_function/cost_auto_weight_hook.py:43: The name tf.train.SessionRunHook is deprecated. Please use tf.estimator.SessionRunHook instead.
WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/checkpoint_saver_hook.py:25: The name tf.train.CheckpointSaverHook is deprecated. Please use tf.estimator.CheckpointSaverHook instead.
WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py:68: The name tf.logging.set_verbosity is deprecated. Please use tf.compat.v1.logging.set_verbosity instead.
WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py:68: The name tf.logging.INFO is deprecated. Please use tf.compat.v1.logging.INFO instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/horovod/tensorflow/__init__.py:117: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.
2021-10-19 10:59:02,416 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/horovod/tensorflow/__init__.py:117: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/horovod/tensorflow/__init__.py:143: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.
2021-10-19 10:59:02,417 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/horovod/tensorflow/__init__.py:143: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.
2021-10-19 10:59:03,262 [INFO] iva.common.logging.logging: Log file already exists at /workspace/tao-experiments/detectnet_v2/experiment_dir_retrain/status.json
2021-10-19 10:59:03,264 [INFO] __main__: Loading experiment spec at /workspace/tao-experiments/detectnet_v2/specs/detectnet_v2_retrain_resnet18_kitti-def.txt.
2021-10-19 10:59:03,266 [INFO] iva.detectnet_v2.spec_handler.spec_loader: Merging specification from /workspace/tao-experiments/detectnet_v2/specs/detectnet_v2_retrain_resnet18_kitti-def.txt
2021-10-19 10:59:03,893 [INFO] __main__: Cannot iterate over exactly 1543 samples with a batch size of 4; each epoch will therefore take one extra step.
WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/cost_function/cost_auto_weight_hook.py:107: The name tf.variable_scope is deprecated. Please use tf.compat.v1.variable_scope instead.
2021-10-19 10:59:03,896 [WARNING] tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/cost_function/cost_auto_weight_hook.py:107: The name tf.variable_scope is deprecated. Please use tf.compat.v1.variable_scope instead.
WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/cost_function/cost_auto_weight_hook.py:110: The name tf.get_variable is deprecated. Please use tf.compat.v1.get_variable instead.
2021-10-19 10:59:03,897 [WARNING] tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/cost_function/cost_auto_weight_hook.py:110: The name tf.get_variable is deprecated. Please use tf.compat.v1.get_variable instead.
WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/cost_function/cost_auto_weight_hook.py:113: The name tf.assign is deprecated. Please use tf.compat.v1.assign instead.
2021-10-19 10:59:03,900 [WARNING] tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/cost_function/cost_auto_weight_hook.py:113: The name tf.assign is deprecated. Please use tf.compat.v1.assign instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:517: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.
2021-10-19 10:59:04,489 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:517: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4138: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.
2021-10-19 10:59:04,503 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4138: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:1834: The name tf.nn.fused_batch_norm is deprecated. Please use tf.compat.v1.nn.fused_batch_norm instead.
2021-10-19 10:59:04,531 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:1834: The name tf.nn.fused_batch_norm is deprecated. Please use tf.compat.v1.nn.fused_batch_norm instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:174: The name tf.get_default_session is deprecated. Please use tf.compat.v1.get_default_session instead.
2021-10-19 10:59:05,563 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:174: The name tf.get_default_session is deprecated. Please use tf.compat.v1.get_default_session instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:199: The name tf.is_variable_initialized is deprecated. Please use tf.compat.v1.is_variable_initialized instead.
2021-10-19 10:59:05,563 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:199: The name tf.is_variable_initialized is deprecated. Please use tf.compat.v1.is_variable_initialized instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:206: The name tf.variables_initializer is deprecated. Please use tf.compat.v1.variables_initializer instead.
2021-10-19 10:59:05,952 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:206: The name tf.variables_initializer is deprecated. Please use tf.compat.v1.variables_initializer instead.
/usr/local/lib/python3.6/dist-packages/keras/engine/saving.py:292: UserWarning: No training configuration found in save file: the model was *not* compiled. Compile it manually.
  warnings.warn('No training configuration found in save file: '
2021-10-19 10:59:06,405 [INFO] iva.detectnet_v2.objectives.bbox_objective: Default L1 loss function will be used.
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_1 (InputLayer)            (None, 3, 544, 960)  0
__________________________________________________________________________________________________
conv1 (Conv2D)                  (None, 32, 272, 480) 4736        input_1[0][0]
__________________________________________________________________________________________________
bn_conv1 (BatchNormalization)   (None, 32, 272, 480) 128         conv1[0][0]
__________________________________________________________________________________________________
activation_1 (Activation)       (None, 32, 272, 480) 0           bn_conv1[0][0]
__________________________________________________________________________________________________
block_1a_conv_1 (Conv2D)        (None, 40, 136, 240) 11560       activation_1[0][0]
__________________________________________________________________________________________________
block_1a_bn_1 (BatchNormalizati (None, 40, 136, 240) 160         block_1a_conv_1[0][0]
__________________________________________________________________________________________________
block_1a_relu_1 (Activation)    (None, 40, 136, 240) 0           block_1a_bn_1[0][0]
__________________________________________________________________________________________________
block_1a_conv_2 (Conv2D)        (None, 56, 136, 240) 20216       block_1a_relu_1[0][0]
__________________________________________________________________________________________________
block_1a_conv_shortcut (Conv2D) (None, 56, 136, 240) 1848        activation_1[0][0]
__________________________________________________________________________________________________
block_1a_bn_2 (BatchNormalizati (None, 56, 136, 240) 224         block_1a_conv_2[0][0]
__________________________________________________________________________________________________
block_1a_bn_shortcut (BatchNorm (None, 56, 136, 240) 224         block_1a_conv_shortcut[0][0]
__________________________________________________________________________________________________
add_1 (Add)                     (None, 56, 136, 240) 0           block_1a_bn_2[0][0]
                                                                 block_1a_bn_shortcut[0][0]
__________________________________________________________________________________________________
block_1a_relu (Activation)      (None, 56, 136, 240) 0           add_1[0][0]
__________________________________________________________________________________________________
block_1b_conv_1 (Conv2D)        (None, 32, 136, 240) 16160       block_1a_relu[0][0]
__________________________________________________________________________________________________
block_1b_bn_1 (BatchNormalizati (None, 32, 136, 240) 128         block_1b_conv_1[0][0]
__________________________________________________________________________________________________
block_1b_relu_1 (Activation)    (None, 32, 136, 240) 0           block_1b_bn_1[0][0]
__________________________________________________________________________________________________
block_1b_conv_2 (Conv2D)        (None, 64, 136, 240) 18496       block_1b_relu_1[0][0]
__________________________________________________________________________________________________
block_1b_conv_shortcut (Conv2D) (None, 64, 136, 240) 3648        block_1a_relu[0][0]
__________________________________________________________________________________________________
block_1b_bn_2 (BatchNormalizati (None, 64, 136, 240) 256         block_1b_conv_2[0][0]
__________________________________________________________________________________________________
block_1b_bn_shortcut (BatchNorm (None, 64, 136, 240) 256         block_1b_conv_shortcut[0][0]
__________________________________________________________________________________________________
add_2 (Add)                     (None, 64, 136, 240) 0           block_1b_bn_2[0][0]
                                                                 block_1b_bn_shortcut[0][0]
__________________________________________________________________________________________________
block_1b_relu (Activation)      (None, 64, 136, 240) 0           add_2[0][0]
__________________________________________________________________________________________________
block_2a_conv_1 (Conv2D)        (None, 40, 68, 120)  23080       block_1b_relu[0][0]
__________________________________________________________________________________________________
block_2a_bn_1 (BatchNormalizati (None, 40, 68, 120)  160         block_2a_conv_1[0][0]
__________________________________________________________________________________________________
block_2a_relu_1 (Activation)    (None, 40, 68, 120)  0           block_2a_bn_1[0][0]
__________________________________________________________________________________________________
block_2a_conv_2 (Conv2D)        (None, 56, 68, 120)  20216       block_2a_relu_1[0][0]
__________________________________________________________________________________________________
block_2a_conv_shortcut (Conv2D) (None, 56, 68, 120)  3640        block_1b_relu[0][0]
__________________________________________________________________________________________________
block_2a_bn_2 (BatchNormalizati (None, 56, 68, 120)  224         block_2a_conv_2[0][0]
__________________________________________________________________________________________________
block_2a_bn_shortcut (BatchNorm (None, 56, 68, 120)  224         block_2a_conv_shortcut[0][0]
__________________________________________________________________________________________________
add_3 (Add)                     (None, 56, 68, 120)  0           block_2a_bn_2[0][0]
                                                                 block_2a_bn_shortcut[0][0]
__________________________________________________________________________________________________
block_2a_relu (Activation)      (None, 56, 68, 120)  0           add_3[0][0]
__________________________________________________________________________________________________
block_2b_conv_1 (Conv2D)        (None, 128, 68, 120) 64640       block_2a_relu[0][0]
__________________________________________________________________________________________________
block_2b_bn_1 (BatchNormalizati (None, 128, 68, 120) 512         block_2b_conv_1[0][0]
__________________________________________________________________________________________________
block_2b_relu_1 (Activation)    (None, 128, 68, 120) 0           block_2b_bn_1[0][0]
__________________________________________________________________________________________________
block_2b_conv_2 (Conv2D)        (None, 48, 68, 120)  55344       block_2b_relu_1[0][0]
__________________________________________________________________________________________________
block_2b_conv_shortcut (Conv2D) (None, 48, 68, 120)  2736        block_2a_relu[0][0]
__________________________________________________________________________________________________
block_2b_bn_2 (BatchNormalizati (None, 48, 68, 120)  192         block_2b_conv_2[0][0]
__________________________________________________________________________________________________
block_2b_bn_shortcut (BatchNorm (None, 48, 68, 120)  192         block_2b_conv_shortcut[0][0]
__________________________________________________________________________________________________
add_4 (Add)                     (None, 48, 68, 120)  0           block_2b_bn_2[0][0]
                                                                 block_2b_bn_shortcut[0][0]
__________________________________________________________________________________________________
block_2b_relu (Activation)      (None, 48, 68, 120)  0           add_4[0][0]
__________________________________________________________________________________________________
block_3a_conv_1 (Conv2D)        (None, 48, 34, 60)   20784       block_2b_relu[0][0]
__________________________________________________________________________________________________
block_3a_bn_1 (BatchNormalizati (None, 48, 34, 60)   192         block_3a_conv_1[0][0]
__________________________________________________________________________________________________
block_3a_relu_1 (Activation)    (None, 48, 34, 60)   0           block_3a_bn_1[0][0]
__________________________________________________________________________________________________
block_3a_conv_2 (Conv2D)        (None, 256, 34, 60)  110848      block_3a_relu_1[0][0]
__________________________________________________________________________________________________
block_3a_conv_shortcut (Conv2D) (None, 256, 34, 60)  12544       block_2b_relu[0][0]
__________________________________________________________________________________________________
block_3a_bn_2 (BatchNormalizati (None, 256, 34, 60)  1024        block_3a_conv_2[0][0]
__________________________________________________________________________________________________
block_3a_bn_shortcut (BatchNorm (None, 256, 34, 60)  1024        block_3a_conv_shortcut[0][0]
__________________________________________________________________________________________________
add_5 (Add)                     (None, 256, 34, 60)  0           block_3a_bn_2[0][0]
                                                                 block_3a_bn_shortcut[0][0]
__________________________________________________________________________________________________
block_3a_relu (Activation)      (None, 256, 34, 60)  0           add_5[0][0]
__________________________________________________________________________________________________
block_3b_conv_1 (Conv2D)        (None, 256, 34, 60)  590080      block_3a_relu[0][0]
__________________________________________________________________________________________________
block_3b_bn_1 (BatchNormalizati (None, 256, 34, 60)  1024        block_3b_conv_1[0][0]
__________________________________________________________________________________________________
block_3b_relu_1 (Activation)    (None, 256, 34, 60)  0           block_3b_bn_1[0][0]
__________________________________________________________________________________________________
block_3b_conv_2 (Conv2D)        (None, 256, 34, 60)  590080      block_3b_relu_1[0][0]
__________________________________________________________________________________________________
block_3b_conv_shortcut (Conv2D) (None, 256, 34, 60)  65792       block_3a_relu[0][0]
__________________________________________________________________________________________________
block_3b_bn_2 (BatchNormalizati (None, 256, 34, 60)  1024        block_3b_conv_2[0][0]
__________________________________________________________________________________________________
block_3b_bn_shortcut (BatchNorm (None, 256, 34, 60)  1024        block_3b_conv_shortcut[0][0]
__________________________________________________________________________________________________
add_6 (Add)                     (None, 256, 34, 60)  0           block_3b_bn_2[0][0]
                                                                 block_3b_bn_shortcut[0][0]
__________________________________________________________________________________________________
block_3b_relu (Activation)      (None, 256, 34, 60)  0           add_6[0][0]
__________________________________________________________________________________________________
block_4a_conv_1 (Conv2D)        (None, 512, 34, 60)  1180160     block_3b_relu[0][0]
__________________________________________________________________________________________________
block_4a_bn_1 (BatchNormalizati (None, 512, 34, 60)  2048        block_4a_conv_1[0][0]
__________________________________________________________________________________________________
block_4a_relu_1 (Activation)    (None, 512, 34, 60)  0           block_4a_bn_1[0][0]
__________________________________________________________________________________________________
block_4a_conv_2 (Conv2D)        (None, 512, 34, 60)  2359808     block_4a_relu_1[0][0]
__________________________________________________________________________________________________
block_4a_conv_shortcut (Conv2D) (None, 512, 34, 60)  131584      block_3b_relu[0][0]
__________________________________________________________________________________________________
block_4a_bn_2 (BatchNormalizati (None, 512, 34, 60)  2048        block_4a_conv_2[0][0]
__________________________________________________________________________________________________
block_4a_bn_shortcut (BatchNorm (None, 512, 34, 60)  2048        block_4a_conv_shortcut[0][0]
__________________________________________________________________________________________________
add_7 (Add)                     (None, 512, 34, 60)  0           block_4a_bn_2[0][0]
                                                                 block_4a_bn_shortcut[0][0]
__________________________________________________________________________________________________
block_4a_relu (Activation)      (None, 512, 34, 60)  0           add_7[0][0]
__________________________________________________________________________________________________
block_4b_conv_1 (Conv2D)        (None, 512, 34, 60)  2359808     block_4a_relu[0][0]
__________________________________________________________________________________________________
block_4b_bn_1 (BatchNormalizati (None, 512, 34, 60)  2048        block_4b_conv_1[0][0]
__________________________________________________________________________________________________
block_4b_relu_1 (Activation)    (None, 512, 34, 60)  0           block_4b_bn_1[0][0]
__________________________________________________________________________________________________
block_4b_conv_2 (Conv2D)        (None, 128, 34, 60)  589952      block_4b_relu_1[0][0]
__________________________________________________________________________________________________
block_4b_conv_shortcut (Conv2D) (None, 128, 34, 60)  65664       block_4a_relu[0][0]
__________________________________________________________________________________________________
block_4b_bn_2 (BatchNormalizati (None, 128, 34, 60)  512         block_4b_conv_2[0][0]
__________________________________________________________________________________________________
block_4b_bn_shortcut (BatchNorm (None, 128, 34, 60)  512         block_4b_conv_shortcut[0][0]
__________________________________________________________________________________________________
add_8 (Add)                     (None, 128, 34, 60)  0           block_4b_bn_2[0][0]
                                                                 block_4b_bn_shortcut[0][0]
__________________________________________________________________________________________________
block_4b_relu (Activation)      (None, 128, 34, 60)  0           add_8[0][0]
__________________________________________________________________________________________________
output_bbox (Conv2D)            (None, 28, 34, 60)   3612        block_4b_relu[0][0]
__________________________________________________________________________________________________
output_cov (Conv2D)             (None, 7, 34, 60)    903         block_4b_relu[0][0]
==================================================================================================
Total params: 8,345,347
Trainable params: 8,336,643
Non-trainable params: 8,704
__________________________________________________________________________________________________
Traceback (most recent call last):
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py", line 843, in
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py", line 832, in
  File "", line 2, in main
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/utilities/timer.py", line 46, in wrapped_fn
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py", line 821, in main
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py", line 702, in run_experiment
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py", line 596, in train_gridbox
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py", line 229, in build_rasterizers
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/rasterizers/bbox_rasterizer.py", line 95, in __init__
AssertionError
2021-10-19 06:59:10,083 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.