I am trying to build MXNet with CUDA 10.2 in a Docker container.
For my build, I am using the following Docker image from NVIDIA: nvidia/cuda:10.2-cudnn7-devel-ubuntu18.04
Within the container (starting from a fresh image), I am running the following commands:
apt-get update && apt-get upgrade -y
apt-get install -y libopenblas-dev git python python-pip
apt-get install -y libjemalloc-dev
pip install cmake
git clone https://github.com/apache/incubator-mxnet.git mxnet
cd mxnet
mkdir build
cd build
export CUDA_HOME=/usr/local/cuda
export LD_LIBRARY_PATH=${CUDA_HOME}/lib64:${LD_LIBRARY_PATH}
export LD_LIBRARY_PATH=${CUDA_HOME}/compat:${LD_LIBRARY_PATH}
cmake -DUSE_CUDNN=1 -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda -DCMAKE_BUILD_TYPE=Release \
-DBLAS=open -DUSE_OPENCV=OFF -DUSE_CPP_PACKAGE=ON -DENABLE_CUDA_RTC=ON ..
make -j4
The build gets to the 95% mark, then fails with the following message:
[21:33:50] : [Step 1/1] [ 95%] Building CXX object tests/CMakeFiles/mxnet_unit_tests.dir/cpp/test_main.cc.o
[21:34:11] : [Step 1/1] [ 95%] Linking CUDA device code CMakeFiles/mxnet_unit_tests.dir/cmake_device_link.o
[21:34:27] : [Step 1/1] [ 95%] Linking CXX executable mxnet_unit_tests
[21:34:36]W: [Step 1/1] /usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/Scrt1.o: In function `_start':
[21:34:36]W: [Step 1/1] (.text+0x26): relocation truncated to fit: R_X86_64_GOTPCRELX against symbol `__libc_start_main@@GLIBC_2.2.5' defined in .text section in /lib/x86_64-linux-gnu/libc.so.6
[21:34:36] : [Step 1/1] tests/CMakeFiles/mxnet_unit_tests.dir/build.make:449: recipe for target 'tests/mxnet_unit_tests' failed
[21:34:36]W: [Step 1/1] /usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/Scrt1.o:(.eh_frame+0x20): I
[21:34:36]W: [Step 1/1] /usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/crti.o: In function `_init':
[21:34:36]W: [Step 1/1] (.init+0x7): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against undefined symbol `__gmon_start__'
[21:34:36]W: [Step 1/1] CMakeFiles/mxnet_unit_tests.dir/cpp/engine/omp_test.cc.o: In function `OMPBehaviour_after_fork_Test::TestBody()':
[21:34:36]W: [Step 1/1] omp_test.cc:(.text+0x193): relocation truncated to fit: R_X86_64_PC32 against symbol `vtable for std::basic_ios<char, std::char_traits<char> >@@GLIBCXX_3.4' defined in .data.rel.ro section in /usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/Scrt1.o
[21:34:36]W: [Step 1/1] omp_test.cc:(.text+0x1a7): relocation truncated to fit: R_X86_64_PC32 against symbol `VTT for std::__cxx11::basic_ostringstream<char, std::char_traits<char>, std::allocator<char> >@@GLIBCXX_3.4.21' defined in .data.rel.ro section in /usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/Scrt1.o
[21:34:36]W: [Step 1/1] omp_test.cc:(.text+0x1b2): relocation truncated to fit: R_X86_64_PC32 against symbol `VTT for std::__cxx11::basic_ostringstream<char, std::char_traits<char>, std::allocator<char> >@@GLIBCXX_3.4.21' defined in .data.rel.ro section in /usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/Scrt1.o
[21:34:36]W: [Step 1/1] omp_test.cc:(.text+0x1f6): relocation truncated to fit: R_X86_64_PC32 against symbol `vtable for std::__cxx11::basic_ostringstream<char, std::char_traits<char>, std::allocator<char> >@@GLIBCXX_3.4.21' defined in .data.rel.ro section in /usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/Scrt1.o
[21:34:36]W: [Step 1/1] omp_test.cc:(.text+0x217): relocation truncated to fit: R_X86_64_PC32 against symbol `vtable for std::basic_streambuf<char, std::char_traits<char> >@@GLIBCXX_3.4' defined in .data.rel.ro section in /usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/Scrt1.o
[21:34:36]W: [Step 1/1] omp_test.cc:(.text+0x242): relocation truncated to fit: R_X86_64_PC32 against symbol `vtable for std::__cxx11::basic_stringbuf<char, std::char_traits<char>, std::allocator<char> >@@GLIBCXX_3.4.21' defined in .data.rel.ro section in /usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/Scrt1.o
[21:34:36]W: [Step 1/1] omp_test.cc:(.text+0x432): relocation truncated to fit: R_X86_64_PC32 against symbol `vtable for std::basic_ios<char, std::char_traits<char> >@@GLIBCXX_3.4' defined in .data.rel.ro section in /usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/Scrt1.o
[21:34:36]W: [Step 1/1] omp_test.cc:(.text+0x446): additional relocation overflows omitted from the output
[21:34:36]W: [Step 1/1] /usr/bin/ld: failed to convert GOTPCREL relocation; relink with --no-relax
[21:34:36]W: [Step 1/1] collect2: error: ld returned 1 exit status
[21:34:36]W: [Step 1/1] make[2]: *** [tests/mxnet_unit_tests] Error 1
[21:34:36] : [Step 1/1] CMakeFiles/Makefile2:2398: recipe for target 'tests/CMakeFiles/mxnet_unit_tests.dir/all' failed
[21:34:36] : [Step 1/1] Makefile:140: recipe for target 'all' failed
[21:34:36]W: [Step 1/1] make[1]: *** [tests/CMakeFiles/mxnet_unit_tests.dir/all] Error 2
[21:34:36]W: [Step 1/1] make: *** [all] Error 2
[21:34:36]i: [Step 1/1] Docker event: {"status":"die","id":"a42467491c7573f601b26314aee381f40e74a1c4ae21e2e94929d22b6587b2a3","from":"nvidia/cuda:10.2-cudnn7-devel-ubuntu18.04","Type":"container","Action":"die","Actor":{"ID":"a42467491c7573f601b26314aee381f40e74a1c4ae21e2e94929d22b6587b2a3","Attributes":{"com.nvidia.cudnn.version":"7.6.5.32","exitCode":"2","image":"nvidia/cuda:10.2-cudnn7-devel-ubuntu18.04","jetbrains.teamcity.buildId":"202","maintainer":"NVIDIA CORPORATION <cudatools@nvidia.com>","name":"priceless_tereshkova"}},"scope":"local","time":1578605676,"timeNano":1578605676595309140}
[21:34:37]i: [Step 1/1] Docker event: {"status":"destroy","id":"a42467491c7573f601b26314aee381f40e74a1c4ae21e2e94929d22b6587b2a3","from":"nvidia/cuda:10.2-cudnn7-devel-ubuntu18.04","Type":"container","Action":"destroy","Actor":{"ID":"a42467491c7573f601b26314aee381f40e74a1c4ae21e2e94929d22b6587b2a3","Attributes":{"com.nvidia.cudnn.version":"7.6.5.32","image":"nvidia/cuda:10.2-cudnn7-devel-ubuntu18.04","jetbrains.teamcity.buildId":"202","maintainer":"NVIDIA CORPORATION <cudatools@nvidia.com>","name":"priceless_tereshkova"}},"scope":"local","time":1578605677,"timeNano":1578605677103449725}
[21:34:37]W: [Step 1/1] Process exited with code 2
[21:34:37]E: [Step 1/1] Process exited with code 2 (Step: Command Line)
[21:34:37]E: [Step 1/1] Step Command Line failed
[21:34:37] : Publishing artifacts
[21:34:37] : [Publishing artifacts] Collecting files to publish: [/opt/jetbrains/TeamCity/buildAgent/temp/buildTmp/.teamcity/docker/build_3/events.json => .teamcity/docker/]
[21:34:37] : [Publishing artifacts] Publishing 1 file using [WebPublisher]: /opt/jetbrains/TeamCity/buildAgent/temp/buildTmp/.teamcity/docker/build_3/events.json => .teamcity/docker
[21:34:37] : [Publishing artifacts] Publishing 1 file using [ArtifactsCachePublisher]: /opt/jetbrains/TeamCity/buildAgent/temp/buildTmp/.teamcity/docker/build_3/events.json => .teamcity/docker
[21:34:37]i: Docker wrapper: setting permissions for '/opt/jetbrains/TeamCity/buildAgent/temp/buildTmp' and '/opt/jetbrains/TeamCity/buildAgent/work/a29666811c94acf8' to 755
[21:34:37] : Publishing internal artifacts
[21:34:37] : [Publishing internal artifacts] Publishing 1 file using [WebPublisher]
[21:34:37] : [Publishing internal artifacts] Publishing 1 file using [ArtifactsCachePublisher]
[21:34:37] : Build is failed. Artifacts will not be published for this build
[21:34:37] : Build finished
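To see which relocation types dominate the linker output above, the log can be tallied with standard tools. This is a minimal sketch, assuming the build output was saved to a file; the file name `build.log` and the function name `summarize_relocs` are hypothetical and not part of the original build.

```shell
# Hypothetical helper: count "relocation truncated to fit" errors per relocation type.
# $1 is the path to a saved copy of the build log (assumed, e.g. build.log).
summarize_relocs() {
  grep -o 'relocation truncated to fit: R_X86_64_[A-Z0-9_]*' "$1" \
    | sed 's/.*: //' \
    | sort \
    | uniq -c \
    | sort -rn
}

# Usage (assuming the log was captured with: make -j4 2>&1 | tee build.log):
#   summarize_relocs build.log
```

Each output line is a count followed by a relocation type, most frequent first.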
If I build MXNet using the exact same build command and options but use the CUDA 10.0 image nvidia/cuda:10.0-cudnn7-devel-ubuntu18.04, then the build completes as expected.
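The linker output itself suggests relinking with `--no-relax`. One way to try that hint is to pass it through `CMAKE_EXE_LINKER_FLAGS` alongside the options from the cmake command in the question. This is an untested sketch, not a confirmed fix: the helper name `mxnet_cmake_args` is hypothetical, and it is an assumption that this flag reaches the `mxnet_unit_tests` link step. The `R_X86_64_PC32` overflows generally mean a target lies outside the range of a 32-bit PC-relative relocation, so the flag alone may not be enough.

```shell
# Hypothetical helper: print the cmake options from the question, one per line,
# plus the linker's own hint (--no-relax) forwarded via CMAKE_EXE_LINKER_FLAGS.
mxnet_cmake_args() {
  printf '%s\n' \
    -DUSE_CUDNN=1 \
    -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda \
    -DCMAKE_BUILD_TYPE=Release \
    -DBLAS=open \
    -DUSE_OPENCV=OFF \
    -DUSE_CPP_PACKAGE=ON \
    -DENABLE_CUDA_RTC=ON \
    '-DCMAKE_EXE_LINKER_FLAGS=-Wl,--no-relax'
}

# Usage, from a clean mxnet/build directory (none of the options contain spaces,
# so plain word splitting of the command substitution is safe here):
#   cmake $(mxnet_cmake_args) ..
#   make -j4
```

Keeping the options in one function makes it easy to compare the CUDA 10.0 and CUDA 10.2 images with identical arguments.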