§2024-12-10
I used NVIDIA SDK Manager to flash a Linux image onto a Jetson Orin Nano development board, and it booted. How can I verify that the Jetson driver software is installed and that I can drive the GPU cores?
- Check if NVIDIA drivers are installed
alexlai@jetsonOrinNano:~$ lsmod | grep nvidia
nvidia_drm 94208 1
nvidia_modeset 1261568 5 nvidia_drm
nvidia 1458176 9 nvidia_modeset
nvidia_vrs_pseq 16384 0
tegra_dce 98304 2 nvidia
tsecriscv 32768 1 nvidia
host1x_nvhost 40960 8 nvhost_isp5,nvhost_nvcsi_t194,nvidia,tegra_camera,nvhost_capture,nvhost_nvcsi,nvhost_vi5,nvidia_modeset
drm_kms_helper 278528 4 tegra_drm,nvidia_drm
nvidia_p2p 20480 0
host1x 180224 6 host1x_nvhost,host1x_fence,nvgpu,tegra_drm,nvidia_drm,nvidia_modeset
mc_utils 16384 3 nvidia,nvgpu,tegra_camera_platform
drm 602112 15 drm_kms_helper,nvidia,tegra_drm,nvidia_drm
- nvidia_drm, nvidia_modeset, nvidia: These modules are part of the NVIDIA driver stack.
- nvidia_drm is the Direct Rendering Manager module for NVIDIA.
- nvidia_modeset is used for managing the display modes.
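- nvidia is the core NVIDIA driver module. On Jetson the integrated GPU itself is handled by nvgpu, which shows up above as a user of host1x and mc_utils.
- nvidia_vrs_pseq: This module drives the NVIDIA VRS (Voltage Regulator Specification) power sequencer, a power-management device. It is unrelated to Variable Rate Shading.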
- tegra_dce, host1x_nvhost, host1x: These are related to the Tegra platform and its interaction with the NVIDIA hardware (such as GPU, video processing, and display).
- drm_kms_helper, drm: These modules are for the Direct Rendering Manager (DRM) system, which is responsible for managing display buffers and coordinating the GPU and display hardware.
- nvidia_p2p: This module provides peer-to-peer DMA between GPU memory and other devices (GPUDirect RDMA). It is loaded even though the Jetson has a single integrated GPU.
If the drivers are installed correctly, lsmod lists the NVIDIA kernel modules. Look for entries like nvidia, nvidia_modeset, and nvidia_drm. nvidia_uvm belongs to the desktop driver stack and does not appear in the Jetson output above.
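The module check above can be scripted. This is a minimal sketch, not an official tool: check_modules is a hypothetical helper that reads lsmod output on stdin and reports which of the module names passed as arguments are loaded. Which modules to require is an assumption; the list in the usage note is taken from the output above.

```shell
# check_modules: hypothetical helper, not part of JetPack.
# Reads lsmod-style text on stdin; module names to look for are the arguments.
# Returns 1 if at least one requested module is missing, 0 otherwise.
check_modules() {
    loaded=$(awk '{ print $1 }')        # first column holds the module name
    missing=0
    for mod in "$@"; do
        if printf '%s\n' "$loaded" | grep -qxF "$mod"; then
            echo "ok       $mod"
        else
            echo "MISSING  $mod"
            missing=1
        fi
    done
    return "$missing"
}
```

Usage on the board: `lsmod | check_modules nvgpu nvidia nvidia_modeset nvidia_drm`.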
alexlai@jetsonOrinNano:~$ cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX Open Kernel Module for aarch64 540.4.0 Release Build (buildbrain@mobile-u64-5441-d8000) Thu Sep 12 21:22:07 PDT 2024
GCC version: collect2: error: ld returned 1 exit status
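To compare the kernel module version with what nvidia-smi reports, the version number can be pulled out of the NVRM line. A small sketch, assuming the version is the first dotted-number field on that line (true for the output above, not verified for other releases); nvrm_version is a hypothetical helper name.

```shell
# nvrm_version: hypothetical helper.
# Reads the contents of /proc/driver/nvidia/version on stdin and prints the
# first field on the "NVRM version:" line that looks like a dotted number.
nvrm_version() {
    awk '/^NVRM version:/ {
        for (i = 1; i <= NF; i++)
            if ($i ~ /^[0-9]+(\.[0-9]+)+$/) { print $i; exit }
    }'
}
```

Usage on the board: `nvrm_version < /proc/driver/nvidia/version`. The result should agree with the Driver Version field printed by nvidia-smi.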
- Check for GPU presence using nvidia-smi
alexlai@jetsonOrinNano:~$ nvidia-smi
Tue Dec 10 08:55:26 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 540.4.0 Driver Version: 540.4.0 CUDA Version: 12.6 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 Orin (nvgpu) N/A | N/A N/A | N/A |
| N/A N/A N/A N/A / N/A | Not Supported | N/A N/A |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| No running processes found |
+---------------------------------------------------------------------------------------+
- Use tegrastats for GPU monitoring
alexlai@jetsonOrinNano:~$ tegrastats
12-10-2024 08:56:18 RAM 2127/7620MB (lfb 4x4MB) SWAP 0/20194MB (cached 0MB) CPU [1%@729,1%@729,0%@729,1%@883,0%@729,1%@729] GR3D_FREQ 0% cpu@48.562C soc2@48.218C soc0@47.343C gpu@47.875C tj@48.562C soc1@48.312C VDD_IN 5164mW/5164mW VDD_CPU_GPU_CV 511mW/511mW VDD_SOC 1498mW/1498mW
12-10-2024 08:56:19 RAM 2127/7620MB (lfb 4x4MB) SWAP 0/20194MB (cached 0MB) CPU [5%@729,0%@729,1%@729,1%@729,0%@729,1%@729] GR3D_FREQ 0% cpu@48.562C soc2@48.156C soc0@47.406C gpu@47.937C tj@48.562C soc1@48.406C VDD_IN 5203mW/5184mW VDD_CPU_GPU_CV 511mW/511mW VDD_SOC 1498mW/1498mW
12-10-2024 08:56:20 RAM 2127/7620MB (lfb 3x4MB) SWAP 0/20194MB (cached 0MB) CPU [1%@729,1%@729,1%@729,2%@729,3%@729,1%@729] GR3D_FREQ 22% cpu@48.406C soc2@48.125C soc0@47.468C gpu@47.625C tj@48.406C soc1@48.156C VDD_IN 5282mW/5216mW VDD_CPU_GPU_CV 551mW/524mW VDD_SOC 1537mW/1511mW
12-10-2024 08:56:21 RAM 2127/7620MB (lfb 3x4MB) SWAP 0/20194MB (cached 0MB) CPU [2%@729,6%@729,6%@729,1%@729,2%@729,0%@729] GR3D_FREQ 0% cpu@48.656C soc2@48.125C soc0@47.437C gpu@47.562C tj@48.656C soc1@48.25C VDD_IN 5243mW/5223mW VDD_CPU_GPU_CV 511mW/521mW VDD_SOC 1498mW/1508mW
...
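The GR3D_FREQ field is the GPU load, so a quick way to see whether a workload actually reaches the GPU is to extract that field from each line. A minimal sketch, assuming the field keeps the `GR3D_FREQ <n>%` form shown above; gr3d_load and gr3d_average are hypothetical helper names.

```shell
# gr3d_load: print the GPU load percentage from each tegrastats line on stdin.
gr3d_load() {
    sed -n 's/.*GR3D_FREQ \([0-9][0-9]*\)%.*/\1/p'
}

# gr3d_average: average GPU load over all tegrastats lines on stdin.
# Prints nothing if no line contains a GR3D_FREQ field.
gr3d_average() {
    gr3d_load | awk '{ sum += $1; n++ } END { if (n) printf "%.1f\n", sum / n }'
}
```

Usage on the board: `timeout 10 tegrastats | gr3d_average` samples for ten seconds and prints one number. Start the GPU workload in another terminal first; an average of 0.0 suggests the workload is not running on the GPU.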
- List OpenCL platforms using clinfo (OpenCL)
$ sudo apt-get install clinfo
alexlai@jetsonOrinNano:~$ clinfo
Number of platforms 0
- Run a sample GPU program
cd /usr/local/cuda/samples/1_Utilities/deviceQuery <-- no such directory
sudo make
./deviceQuery
- Check for GPU memory errors with cuda-memcheck (if CUDA is installed)
To check that CUDA works and that GPU memory is accessed correctly, run the application under cuda-memcheck, which reports memory errors in CUDA applications. cuda-memcheck was removed in CUDA 12 in favor of compute-sanitizer, so it may be absent on a recent JetPack image. For example:
cuda-memcheck ./your_cuda_application <-- command not found
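Since the memory checker's name depends on the CUDA release, a wrapper can pick whichever one is on PATH. A minimal sketch: pick_memcheck_tool is a hypothetical helper, and the preference for compute-sanitizer over cuda-memcheck is my assumption about which tool is current.

```shell
# pick_memcheck_tool: hypothetical helper.
# Prints the name of the first CUDA memory checker found on PATH,
# preferring compute-sanitizer. Returns 1 if neither is installed.
pick_memcheck_tool() {
    for tool in compute-sanitizer cuda-memcheck; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            return 0
        fi
    done
    return 1
}
```

Usage: `tool=$(pick_memcheck_tool) && "$tool" ./your_cuda_application`. If the function returns 1, no checker is on PATH. The toolkit's bin directory may simply be missing from PATH.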
- nvcc not found
$ sudo apt update
$ sudo apt install nvidia-cuda-toolkit
....
The following packages have unmet dependencies:
nvidia-cuda-toolkit : Depends: nvidia-cuda-dev (= 11.5.1-1ubuntu1) but 6.1+b123 is to be installed
Recommends: nvidia-cuda-toolkit-doc (= 11.5.1-1ubuntu1) but it is not going to be installed
Recommends: nvidia-cuda-gdb (= 11.5.114~11.5.1-1ubuntu1) but it is not going to be installed
Recommends: nsight-compute (= 2021.3.1.4~11.5.1-1ubuntu1)
Recommends: nsight-systems (= 2021.3.3.2~11.5.1-1ubuntu1)
E: Unable to correct problems, you have held broken packages. <--- ???
alexlai@jetsonOrinNano:/usr/local$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Thu_Nov_18_09:45:25_PST_2021
Cuda compilation tools, release 11.5, V11.5.119
Build cuda_11.5.r11.5/compiler.30672275_0
alexlai@jetsonOrinNano:/usr/local$ sudo apt install nvidia-cuda-dev=11.5.1-1ubuntu1 && sudo apt install nvidia-cuda-toolkit^C
alexlai@jetsonOrinNano:/usr/local$ cd /usr/local/cuda/samples/1_Utilities/deviceQuery
bash: cd: /usr/local/cuda/samples/1_Utilities/deviceQuery: No such file or directory
git clone https://github.com/NVIDIA/cuda-samples.git
alexlai@jetsonOrinNano:/opt/ext4/alexlai/build/cuda-samples$ cd ..
alexlai@jetsonOrinNano:/opt/ext4/alexlai/build$ cd cuda-samples/Samples/1_Utilities/deviceQuery
alexlai@jetsonOrinNano:/opt/ext4/alexlai/build/cuda-samples/Samples/1_Utilities/deviceQuery$ ls
deviceQuery.cpp deviceQuery_vs2017.vcxproj deviceQuery_vs2019.vcxproj deviceQuery_vs2022.vcxproj NsightEclipse.xml
deviceQuery_vs2017.sln deviceQuery_vs2019.sln deviceQuery_vs2022.sln Makefile README.md
alexlai@jetsonOrinNano:/opt/ext4/alexlai/build/cuda-samples/Samples/1_Utilities/deviceQuery$ make
/usr/local/cuda/bin/nvcc -ccbin g++ -I../../../Common -m64 --threads 0 --std=c++11 -gencode arch=compute_53,code=sm_53 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_72,code=sm_72 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_87,code=sm_87 -gencode arch=compute_90,code=sm_90 -gencode arch=compute_90,code=compute_90 -o deviceQuery.o -c deviceQuery.cpp
make: /usr/local/cuda/bin/nvcc: No such file or directory
make: *** [Makefile:341: deviceQuery.o] Error 127
alexlai@jetsonOrinNano:/opt/ext4/alexlai/build/cuda-samples/Samples/1_Utilities/deviceQuery$ which nvcc
/usr/bin/nvcc
alexlai@jetsonOrinNano:/opt/ext4/alexlai/build/cuda-samples/Samples/1_Utilities/deviceQuery$ ls /usr/local/
bin/ etc/ games/ include/ lib/ man/ sbin/ share/ src/
alexlai@jetsonOrinNano:/opt/ext4/alexlai/build/cuda-samples/Samples/1_Utilities/deviceQuery$ sudo mkdir -p /usr/local/cuda/bin
alexlai@jetsonOrinNano:/opt/ext4/alexlai/build/cuda-samples/Samples/1_Utilities/deviceQuery$ sudo ln -sf /usr/bin/nvcc /usr/local/cuda/bin/nvcc
alexlai@jetsonOrinNano:/opt/ext4/alexlai/build/cuda-samples/Samples/1_Utilities/deviceQuery$ ls -l /usr/bin/nvcc /usr/local/cuda/bin/nvcc
-rwxr-xr-x 1 root root 59 Dec 2 2021 /usr/bin/nvcc
lrwxrwxrwx 1 root root 13 Dec 10 09:36 /usr/local/cuda/bin/nvcc -> /usr/bin/nvcc
$ make SMS="50 60"
alexlai@jetsonOrinNano:/opt/ext4/alexlai/build/cuda-samples/Samples/1_Utilities/deviceQuery$ ./deviceQuery
./deviceQuery Starting...
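The make failure comes from the samples Makefile looking for nvcc under /usr/local/cuda. Instead of creating a symlink there, it may be enough to point the build at the directory that really contains bin/nvcc. This is an untested sketch: it assumes the Makefile honours a CUDA_PATH variable, which these notes do not confirm, and cuda_path_for is a hypothetical helper name.

```shell
# cuda_path_for: hypothetical helper.
# Given the full path of an nvcc binary, print the toolkit root that a
# "<root>/bin/nvcc" layout implies, e.g. /usr/bin/nvcc -> /usr.
cuda_path_for() {
    nvcc_path=$1
    dirname "$(dirname "$nvcc_path")"
}
```

Usage: `make CUDA_PATH="$(cuda_path_for "$(command -v nvcc)")" SMS="50 60"`. With nvcc at /usr/bin/nvcc this passes CUDA_PATH=/usr and leaves /usr/local untouched.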
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "Orin"
CUDA Driver Version / Runtime Version 12.6 / 11.5
CUDA Capability Major/Minor version number: 8.7
Total amount of global memory: 7620 MBytes (7989878784 bytes)
(008) Multiprocessors, (128) CUDA Cores/MP: 1024 CUDA Cores
GPU Max Clock rate: 624 MHz (0.62 GHz)
Memory Clock rate: 624 Mhz
Memory Bus Width: 128-bit
L2 Cache Size: 2097152 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
Maximum Layered 1D Texture Size, (num) layers 1D=(32768), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(32768, 32768), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total shared memory per multiprocessor: 167936 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 1536
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 2 copy engine(s)
Run time limit on kernels: No
Integrated GPU sharing Host Memory: Yes
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Device supports Managed Memory: Yes
Device supports Compute Preemption: Yes
Supports Cooperative Kernel Launch: Yes
Supports MultiDevice Co-op Kernel Launch: Yes
Device PCI Domain ID / Bus ID / location ID: 0 / 0 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 12.6, CUDA Runtime Version = 11.5, NumDevs = 1
Result = PASS
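The summary line shows a driver at CUDA 12.6 and a runtime at 11.5. The sample still passes, but the gap suggests the toolkit came from the Ubuntu archive and not from the JetPack image. A small sketch for flagging this automatically, assuming the summary line keeps the format shown above; cuda_versions and check_cuda_versions are hypothetical helper names.

```shell
# cuda_versions: print "<driver> <runtime>" from deviceQuery's summary line.
cuda_versions() {
    sed -n 's/.*CUDA Driver Version = \([0-9.]*\), CUDA Runtime Version = \([0-9.]*\),.*/\1 \2/p'
}

# check_cuda_versions: report whether the two versions agree.
check_cuda_versions() {
    cuda_versions | while read -r driver runtime; do
        if [ "$driver" = "$runtime" ]; then
            echo "match: driver and runtime are both $driver"
        else
            echo "mismatch: driver supports up to $driver, runtime is $runtime"
        fi
    done
}
```

Usage: `./deviceQuery | check_cuda_versions`. A mismatch is a hint to check which toolkit nvcc belongs to, not necessarily an error, since a newer driver can run an older runtime.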