This repository was archived by the owner on Nov 17, 2023. It is now read-only.

USE_DIST_KVSTORE triggers "undefined reference to 'void mxnet::op::ElemwiseBinaryOp::DnsCsrDnsOp'" linker error #21144

@jens-maus

Description

After switching to Ubuntu 22.04 with the latest gcc/g++ v11 and CUDA 11.7 (NVIDIA driver 515.65.01, Tesla V100S GPU cards), I tried to compile MXNet 1.9.1 for our new environment because we also need to install and update the R-package. While most of the MXNet build seems to succeed, it stops when linking im2rec, with the error message shown in the next section. Skipping the im2rec build leads to similar linker errors for other tools, for which I could not find a solution. The similar issue tickets #18761 and #18357 did not yield a fix we could apply either.

Any help in trying to solve this issue would be highly appreciated.

Error Message

$ cmake --build . --parallel 1
Consolidate compiler generated dependencies of target objects
[  8%] Built target objects
[  8%] Built target libzmq-static
Consolidate compiler generated dependencies of target dnnl_cpu_x64
[ 27%] Built target dnnl_cpu_x64
Consolidate compiler generated dependencies of target dnnl_common
[ 32%] Built target dnnl_common
Consolidate compiler generated dependencies of target dnnl_cpu
[ 41%] Built target dnnl_cpu
[ 41%] Built target dnnl
Consolidate compiler generated dependencies of target intgemm
[ 41%] Built target intgemm
[ 42%] Built target libomp-needed-headers
Consolidate compiler generated dependencies of target omp
[ 45%] Built target omp
Consolidate compiler generated dependencies of target dmlc
[ 46%] Built target dmlc
[ 46%] Built target proto_python
Consolidate compiler generated dependencies of target pslite
[ 47%] Built target pslite
Consolidate compiler generated dependencies of target mxnet
[ 94%] Built target mxnet
Consolidate compiler generated dependencies of target customop_lib
[ 94%] Built target customop_lib
Consolidate compiler generated dependencies of target transposecsr_lib
[ 94%] Built target transposecsr_lib
Consolidate compiler generated dependencies of target transposerowsp_lib
[ 95%] Built target transposerowsp_lib
Consolidate compiler generated dependencies of target subgraph_lib
[ 95%] Built target subgraph_lib
Consolidate compiler generated dependencies of target pass_lib
[ 95%] Built target pass_lib
Consolidate compiler generated dependencies of target customop_gpu_lib
[ 95%] Built target customop_gpu_lib
Consolidate compiler generated dependencies of target im2rec
[ 95%] Linking CXX executable im2rec
/usr/bin/ld: libmxnet.so: undefined reference to `void mxnet::op::ElemwiseBinaryOp::DnsCsrDnsOp<mxnet::op::mshadow_op::plus>(mshadow::Stream<mshadow::gpu>*, nnvm::NodeAttrs const&, mxnet::OpContext const&, mxnet::NDArray const&, mxnet::NDArray const&, mxnet::OpReqType, mxnet::NDArray const&, bool)'
/usr/bin/ld: libmxnet.so: undefined reference to `void mxnet::op::ElemwiseBinaryOp::DnsCsrDnsOp<mxnet::op::mshadow_op::minus>(mshadow::Stream<mshadow::gpu>*, nnvm::NodeAttrs const&, mxnet::OpContext const&, mxnet::NDArray const&, mxnet::NDArray const&, mxnet::OpReqType, mxnet::NDArray const&, bool)'
collect2: error: ld returned 1 exit status
gmake[2]: *** [CMakeFiles/im2rec.dir/build.make:130: im2rec] Error 1
gmake[1]: *** [CMakeFiles/Makefile2:749: CMakeFiles/im2rec.dir/all] Error 2
gmake: *** [Makefile:146: all] Error 2
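
To see at a glance which template instantiations the linker is missing, the output above can be reduced to its distinct undefined symbols. This is only a minimal sketch, assuming the build output was saved to a file; the function name `undefined_refs` and the file name `build.log` are hypothetical and not part of MXNet.

```shell
#!/bin/sh
# Hypothetical helper (not part of MXNet): print each distinct symbol that ld
# reported as an undefined reference in a saved build log.
undefined_refs() {
  # ld quotes the symbol as `symbol'; keep only the text between the quotes,
  # then sort and drop duplicates.
  sed -n "s/.*undefined reference to \`\(.*\)'\$/\1/p" "$1" | sort -u
}

# Example usage, assuming the build output was captured first:
#   cmake --build . --parallel 1 2>&1 | tee build.log
#   undefined_refs build.log
```

For the log above, this would list the two `DnsCsrDnsOp` instantiations (for `mshadow_op::plus` and `mshadow_op::minus`), both on the `mshadow::gpu` stream.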

To Reproduce

This is the config.cmake file we are using to build MXNet 1.9.1 in our Ubuntu 22.04 environment:

set(CMAKE_BUILD_TYPE "Distribution" CACHE STRING "Build type")
set(CFLAGS "-mno-avx" CACHE STRING "CFLAGS")
set(CXXFLAGS "-mno-avx" CACHE STRING "CXXFLAGS")
set(USE_CUDA ON CACHE BOOL "Build with CUDA support")
set(USE_CUDNN ON CACHE BOOL "Build with CUDA support")
set(USE_NCCL ON CACHE BOOL "Build with NCCL support")
set(USE_OPENCV ON CACHE BOOL "Build with OpenCV support")
set(USE_OPENMP ON CACHE BOOL "Build with Openmp support")
set(USE_MKL_IF_AVAILABLE OFF CACHE BOOL "Use Intel MKL if found")
set(USE_MKLDNN ON CACHE BOOL "Build with MKL-DNN support")
set(USE_LAPACK ON CACHE BOOL "Build with lapack support")
set(USE_TVM_OP OFF CACHE BOOL "Enable use of TVM operator build system.")
set(USE_SSE ON CACHE BOOL "Build with x86 SSE instruction support")
set(USE_F16C OFF CACHE BOOL "Build with x86 F16C instruction support")
set(USE_LIBJPEG_TURBO ON CACHE BOOL "Build with libjpeg-turbo")
set(USE_DIST_KVSTORE ON CACHE BOOL "Build with DIST_KVSTORE support")
set(MXNET_CUDA_ARCH "5.0;6.0;7.0;8.0;8.6" CACHE STRING "Cuda architectures")
set(CMAKE_CUDA_COMPILER "/usr/local/cuda-11.7/bin/nvcc" CACHE STRING "Cuda compiler")
set(OPENMP_FILECHECK_EXECUTABLE "/usr/lib/llvm-14/bin/FileCheck")
set(OPENMP_LLVM_LIT_EXECUTABLE "/usr/lib/llvm-14/build/utils/lit/lit.py")
set(USE_CPP_PACKAGE ON CACHE BOOL "Build C++ Package")
set(NCCL_ROOT "/usr/local/nccl" CACHE BOOL "NCCL install path. Supports autodetection.")

Steps to reproduce

  1. cmake -DCMAKE_INSTALL_PREFIX=/usr/local/mxnet-1.9.1 ..
  2. cmake --build . --parallel 20

What have you tried to solve it?

  1. Tried to apply fixes similar to those in "Fix undef symbol mxnet::op::ElemwiseBinaryOp::DnsCsrCsrOp [libmxnet.so]" (#18357) and "Ultimately fix undefined reference to void mxnet::op::ElemwiseBinaryOp::DnsCsrCsrOp" (#18761), but to no avail.

Environment

n/a
