§2024-12-10

I used NVIDIA SDK Manager to flash a Linux image onto a Jetson Orin Nano development board, and it booted. How can I verify that the Jetson driver software is installed and that I can drive the GPU?

  1. Check if NVIDIA drivers are installed
alexlai@jetsonOrinNano:~$ lsmod | grep nvidia
nvidia_drm             94208  1
nvidia_modeset       1261568  5 nvidia_drm
nvidia               1458176  9 nvidia_modeset
nvidia_vrs_pseq        16384  0
tegra_dce              98304  2 nvidia
tsecriscv              32768  1 nvidia
host1x_nvhost          40960  8 nvhost_isp5,nvhost_nvcsi_t194,nvidia,tegra_camera,nvhost_capture,nvhost_nvcsi,nvhost_vi5,nvidia_modeset
drm_kms_helper        278528  4 tegra_drm,nvidia_drm
nvidia_p2p             20480  0
host1x                180224  6 host1x_nvhost,host1x_fence,nvgpu,tegra_drm,nvidia_drm,nvidia_modeset
mc_utils               16384  3 nvidia,nvgpu,tegra_camera_platform
drm                   602112  15 drm_kms_helper,nvidia,tegra_drm,nvidia_drm

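If the drivers are installed correctly, this lists the NVIDIA kernel modules. Look for entries such as nvidia, nvidia_modeset, and nvidia_drm. nvidia_uvm, which is typically listed on desktop GPUs, does not appear in the output above; the integrated GPU's nvgpu module shows up only in the "used by" column of host1x and mc_utils, because its own name does not match the grep pattern.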
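
The module check can also be scripted. The following is a minimal sketch, assuming lsmod-style text on standard input; the function name missing_modules is hypothetical (not part of any NVIDIA tool), and the module names passed to it are only examples taken from the listing above.

```shell
# missing_modules: hypothetical helper.
# Reads lsmod-style text on stdin and prints every requested module
# name that does NOT appear in the first column.
missing_modules() {
    # First column of every line, skipping an optional "Module ..." header.
    mm_loaded=$(awk 'NR > 1 || $1 != "Module" { print $1 }')
    for mm_name in "$@"; do
        # -x matches the whole line, -F treats the name as a literal string.
        if ! printf '%s\n' "$mm_loaded" | grep -qxF -- "$mm_name"; then
            printf '%s\n' "$mm_name"
        fi
    done
}

# Usage on the board (prints nothing when every module is loaded):
#   lsmod | missing_modules nvidia nvidia_modeset nvidia_drm
```

Matching on the whole first column avoids false positives from the "used by" column, where a module name can appear even though the module itself was filtered out by grep.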

alexlai@jetsonOrinNano:~$ cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX Open Kernel Module for aarch64  540.4.0  Release Build  (buildbrain@mobile-u64-5441-d8000)  Thu Sep 12 21:22:07 PDT 2024
GCC version:  collect2: error: ld returned 1 exit status
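
The driver version can be pulled out of the NVRM line so it can be compared with the "Driver Version" field that nvidia-smi prints. This is a sketch only, assuming the version is the first whitespace-separated token made of dot-separated numbers, as in the line above; nvrm_version is a hypothetical name.

```shell
# nvrm_version: hypothetical helper.
# Reads the contents of /proc/driver/nvidia/version on stdin and prints
# the first token that looks like a dotted version number (e.g. 540.4.0).
nvrm_version() {
    awk '{
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^[0-9]+(\.[0-9]+)+$/) { print $i; exit }
        }
    }'
}

# Usage on the board:
#   nvrm_version < /proc/driver/nvidia/version
```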

  1. Check for GPU presence using nvidia-smi
alexlai@jetsonOrinNano:~$ nvidia-smi
Tue Dec 10 08:55:26 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 540.4.0                Driver Version: 540.4.0      CUDA Version: 12.6     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Orin (nvgpu)                  N/A  | N/A              N/A |                  N/A |
| N/A   N/A  N/A               N/A /  N/A | Not Supported        |     N/A          N/A |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+

  1. Use tegrastats for GPU monitoring
alexlai@jetsonOrinNano:~$ tegrastats
12-10-2024 08:56:18 RAM 2127/7620MB (lfb 4x4MB) SWAP 0/20194MB (cached 0MB) CPU [1%@729,1%@729,0%@729,1%@883,0%@729,1%@729] GR3D_FREQ 0% cpu@48.562C soc2@48.218C soc0@47.343C gpu@47.875C tj@48.562C soc1@48.312C VDD_IN 5164mW/5164mW VDD_CPU_GPU_CV 511mW/511mW VDD_SOC 1498mW/1498mW
12-10-2024 08:56:19 RAM 2127/7620MB (lfb 4x4MB) SWAP 0/20194MB (cached 0MB) CPU [5%@729,0%@729,1%@729,1%@729,0%@729,1%@729] GR3D_FREQ 0% cpu@48.562C soc2@48.156C soc0@47.406C gpu@47.937C tj@48.562C soc1@48.406C VDD_IN 5203mW/5184mW VDD_CPU_GPU_CV 511mW/511mW VDD_SOC 1498mW/1498mW
12-10-2024 08:56:20 RAM 2127/7620MB (lfb 3x4MB) SWAP 0/20194MB (cached 0MB) CPU [1%@729,1%@729,1%@729,2%@729,3%@729,1%@729] GR3D_FREQ 22% cpu@48.406C soc2@48.125C soc0@47.468C gpu@47.625C tj@48.406C soc1@48.156C VDD_IN 5282mW/5216mW VDD_CPU_GPU_CV 551mW/524mW VDD_SOC 1537mW/1511mW
12-10-2024 08:56:21 RAM 2127/7620MB (lfb 3x4MB) SWAP 0/20194MB (cached 0MB) CPU [2%@729,6%@729,6%@729,1%@729,2%@729,0%@729] GR3D_FREQ 0% cpu@48.656C soc2@48.125C soc0@47.437C gpu@47.562C tj@48.656C soc1@48.25C VDD_IN 5243mW/5223mW VDD_CPU_GPU_CV 511mW/521mW VDD_SOC 1498mW/1508mW

...
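
GR3D_FREQ is the GPU load figure in the tegrastats lines above (0% at idle, 22% in one sample). A minimal sketch for extracting it from saved output follows; the function names and the log file name in the usage comment are hypothetical.

```shell
# gr3d_percent: hypothetical helper.
# Reads tegrastats lines on stdin and prints the GR3D_FREQ percentage
# of each line, one number per line. Lines without the field are skipped.
gr3d_percent() {
    sed -n 's/.*GR3D_FREQ \([0-9][0-9]*\)%.*/\1/p'
}

# gr3d_peak: prints the highest GR3D_FREQ percentage seen (0 if none).
gr3d_peak() {
    gr3d_percent | awk 'BEGIN { max = 0 } $1 + 0 > max { max = $1 + 0 } END { print max }'
}

# Usage with output saved to a file (file name is an example):
#   gr3d_peak < tegrastats.log
```

A non-zero peak while a GPU workload is running is a quick sign that the workload actually reached the GPU.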
  1. Check GPU performance using clinfo (OpenCL)
$ sudo apt-get install clinfo

alexlai@jetsonOrinNano:~$ clinfo 
Number of platforms                               0
  1. Run a sample GPU program
cd /usr/local/cuda/samples/1_Utilities/deviceQuery   <-- no such directory
sudo make
./deviceQuery
  1. Check GPU memory usage with cuda-memcheck (if CUDA is installed)

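To check that a CUDA application accesses GPU memory correctly, you can use cuda-memcheck, which reports memory errors in CUDA applications (it does not report memory usage). It was removed from the CUDA Toolkit in release 12.0 in favor of compute-sanitizer. For example: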

cuda-memcheck ./your_cuda_application    <-- command not found
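
Since the command may be missing, it helps to probe for it first. The sketch below assumes that compute-sanitizer, the tool that replaced cuda-memcheck in newer CUDA toolkits, may be installed instead; pick_memcheck_tool is a hypothetical name, and whether either tool is present depends on which CUDA packages are installed.

```shell
# pick_memcheck_tool: hypothetical helper.
# Prints the first memory checker found on PATH, preferring
# compute-sanitizer over the older cuda-memcheck.
# Returns non-zero when neither is installed.
pick_memcheck_tool() {
    for pm_tool in compute-sanitizer cuda-memcheck; do
        if command -v "$pm_tool" >/dev/null 2>&1; then
            printf '%s\n' "$pm_tool"
            return 0
        fi
    done
    return 1
}

# Usage (./your_cuda_application is the placeholder used above):
#   tool=$(pick_memcheck_tool) && "$tool" ./your_cuda_application
```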
  1. nvcc not found
$ sudo apt update
$ sudo apt install nvidia-cuda-toolkit
....
The following packages have unmet dependencies:
 nvidia-cuda-toolkit : Depends: nvidia-cuda-dev (= 11.5.1-1ubuntu1) but 6.1+b123 is to be installed
                       Recommends: nvidia-cuda-toolkit-doc (= 11.5.1-1ubuntu1) but it is not going to be installed
                       Recommends: nvidia-cuda-gdb (= 11.5.114~11.5.1-1ubuntu1) but it is not going to be installed
                       Recommends: nsight-compute (= 2021.3.1.4~11.5.1-1ubuntu1)
                       Recommends: nsight-systems (= 2021.3.3.2~11.5.1-1ubuntu1)
E: Unable to correct problems, you have held broken packages.   <--- ???

alexlai@jetsonOrinNano:/usr/local$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Thu_Nov_18_09:45:25_PST_2021
Cuda compilation tools, release 11.5, V11.5.119
Build cuda_11.5.r11.5/compiler.30672275_0
alexlai@jetsonOrinNano:/usr/local$ sudo apt install nvidia-cuda-dev=11.5.1-1ubuntu1 && sudo apt install nvidia-cuda-toolkit^C
alexlai@jetsonOrinNano:/usr/local$ cd /usr/local/cuda/samples/1_Utilities/deviceQuery
bash: cd: /usr/local/cuda/samples/1_Utilities/deviceQuery: No such file or directory

$ git clone https://github.com/NVIDIA/cuda-samples.git
alexlai@jetsonOrinNano:/opt/ext4/alexlai/build/cuda-samples$ cd ..
alexlai@jetsonOrinNano:/opt/ext4/alexlai/build$ cd cuda-samples/Samples/1_Utilities/deviceQuery
alexlai@jetsonOrinNano:/opt/ext4/alexlai/build/cuda-samples/Samples/1_Utilities/deviceQuery$ ls
deviceQuery.cpp         deviceQuery_vs2017.vcxproj  deviceQuery_vs2019.vcxproj  deviceQuery_vs2022.vcxproj  NsightEclipse.xml
deviceQuery_vs2017.sln  deviceQuery_vs2019.sln      deviceQuery_vs2022.sln      Makefile                    README.md

alexlai@jetsonOrinNano:/opt/ext4/alexlai/build/cuda-samples/Samples/1_Utilities/deviceQuery$ make
/usr/local/cuda/bin/nvcc -ccbin g++ -I../../../Common -m64 --threads 0 --std=c++11 -gencode arch=compute_53,code=sm_53 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_72,code=sm_72 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_87,code=sm_87 -gencode arch=compute_90,code=sm_90 -gencode arch=compute_90,code=compute_90 -o deviceQuery.o -c deviceQuery.cpp
make: /usr/local/cuda/bin/nvcc: No such file or directory
make: *** [Makefile:341: deviceQuery.o] Error 127

alexlai@jetsonOrinNano:/opt/ext4/alexlai/build/cuda-samples/Samples/1_Utilities/deviceQuery$ which nvcc
/usr/bin/nvcc
alexlai@jetsonOrinNano:/opt/ext4/alexlai/build/cuda-samples/Samples/1_Utilities/deviceQuery$ ls /usr/local/
bin/  etc/  games/  include/  lib/  man/  sbin/  share/  src/
alexlai@jetsonOrinNano:/opt/ext4/alexlai/build/cuda-samples/Samples/1_Utilities/deviceQuery$ sudo mkdir -p /usr/local/cuda/bin

alexlai@jetsonOrinNano:/opt/ext4/alexlai/build/cuda-samples/Samples/1_Utilities/deviceQuery$ sudo ln -sf /usr/bin/nvcc /usr/local/cuda/bin/nvcc
alexlai@jetsonOrinNano:/opt/ext4/alexlai/build/cuda-samples/Samples/1_Utilities/deviceQuery$ ls -l /usr/bin/nvcc /usr/local/cuda/bin/nvcc
-rwxr-xr-x 1 root root 59 Dec  2  2021 /usr/bin/nvcc
lrwxrwxrwx 1 root root 13 Dec 10 09:36 /usr/local/cuda/bin/nvcc -> /usr/bin/nvcc
$ make SMS="50 60"

alexlai@jetsonOrinNano:/opt/ext4/alexlai/build/cuda-samples/Samples/1_Utilities/deviceQuery$ ./deviceQuery
./deviceQuery Starting...

CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "Orin"
  CUDA Driver Version / Runtime Version          12.6 / 11.5
  CUDA Capability Major/Minor version number:    8.7
  Total amount of global memory:                 7620 MBytes (7989878784 bytes)
  (008) Multiprocessors, (128) CUDA Cores/MP:    1024 CUDA Cores
  GPU Max Clock rate:                            624 MHz (0.62 GHz)
  Memory Clock rate:                             624 Mhz
  Memory Bus Width:                              128-bit
  L2 Cache Size:                                 2097152 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
  Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total shared memory per multiprocessor:        167936 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  1536
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z):  (1024, 1024, 64)
  Max dimension size of a grid size (x,y,z):     (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
  Run time limit on kernels:                     No
  Integrated GPU sharing Host Memory:            Yes
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device supports Managed Memory:                Yes
  Device supports Compute Preemption:            Yes
  Supports Cooperative Kernel Launch:            Yes
  Supports MultiDevice Co-op Kernel Launch:      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 0 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 12.6, CUDA Runtime Version = 11.5, NumDevs = 1
Result = PASS
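
The summary printed by deviceQuery is easy to check from a script. The following is a minimal sketch, assuming the summary wording shown above ("CUDA Driver Version = ...", "CUDA Runtime Version = ...", "Result = PASS"); dq_passed and dq_versions are hypothetical names.

```shell
# dq_passed: hypothetical helper.
# Reads deviceQuery output on stdin and succeeds only if the
# summary reports "Result = PASS".
dq_passed() {
    grep -q 'Result = PASS'
}

# dq_versions: hypothetical helper.
# Prints "<driver> <runtime>" taken from the summary line.
dq_versions() {
    sed -n 's/.*CUDA Driver Version = \([0-9.]*\), CUDA Runtime Version = \([0-9.]*\).*/\1 \2/p'
}

# Usage on the board:
#   ./deviceQuery | dq_passed && echo "GPU reachable from CUDA"
#   ./deviceQuery | dq_versions
```

For the run above, dq_versions would print the driver and runtime versions side by side, which makes the 12.6 driver versus 11.5 runtime difference easy to spot.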