
How To: GPU Acceleration for
StarNet++ V2 using CUDA
Updated 08 February 2022

StarNetIntro.png
cuda.jpeg

If you've used StarNet before, you know how long it can sometimes take to process a single image. Chances are, you'd love to find a way to speed it up! If you have an NVIDIA GPU, look no further: using CUDA acceleration, you can consistently improve performance by over 80%!

If you don't have an NVIDIA GPU, no worries: all DirectX 12 compatible GPUs on Windows can be accelerated using DirectML instead!

This tutorial is based on the tutorial by DarkArchon, which can be found on their website. It is written for StarNet V2, PixInsight 1.8.8-12, and CUDA 11.6. StarNet V2 has a few notable changes from the original: the standalone module now offers a GUI in addition to the command-line application, its highest stride has been increased from 128 to 256, and the PixInsight module now works on linear images as well (no need for the LinearStarnet script anymore!). This guide is meant to be used with the standalone StarNet++ module written by nekitmm, which can be found on SourceForge. A version using Anaconda Navigator with CUDA 11.4, written by Johannes Schäfer (MrSheppard), can be found on Google Drive. A PDF version of this tutorial will be made available shortly. Now, let's get started!


0. Compatibility

To follow this tutorial, you will need:

  • An x64 system running Windows 10 or 11 (Windows on ARM is not supported).

  • An NVIDIA GPU with a CUDA compute capability of 3.5 or higher. A list of compute capabilities can be found here, or use the sketch below to query your card directly.
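
Not sure what compute capability your card has? Besides checking the table, you can ask the driver. The snippet below is just a sketch using Python's ctypes and the CUDA driver API; it only assumes an NVIDIA driver is already installed (nvcuda.dll ships with the driver, not with the CUDA Toolkit, so nothing from this guide needs to be installed yet).

```python
# check_compute_capability.py - quick sketch (Python 3, Windows) that asks the
# NVIDIA driver for each GPU's compute capability via the CUDA driver API.
import ctypes

cuda = ctypes.WinDLL("nvcuda.dll")  # raises OSError if no NVIDIA driver is present
assert cuda.cuInit(0) == 0, "cuInit failed - is the NVIDIA driver installed?"

count = ctypes.c_int()
cuda.cuDeviceGetCount(ctypes.byref(count))

for i in range(count.value):
    dev = ctypes.c_int()
    cuda.cuDeviceGet(ctypes.byref(dev), i)

    name = ctypes.create_string_buffer(100)
    cuda.cuDeviceGetName(name, len(name), dev)

    major, minor = ctypes.c_int(), ctypes.c_int()
    cuda.cuDeviceGetAttribute(ctypes.byref(major), 75, dev)  # CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MAJOR
    cuda.cuDeviceGetAttribute(ctypes.byref(minor), 76, dev)  # CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MINOR

    print(f"{name.value.decode()}: compute capability {major.value}.{minor.value}")
```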


1.1 Extracting StarNet++

StarNetDL.png

- If you are using the standalone StarNet module:

Download StarNetv2GUI_Win.zip or StarNetv2CLI_Win.zip from nekitmm's website here, then extract the folder to wherever you would like to keep the module (e.g. Downloads, Desktop, etc.).
The GUI package gives you the graphical interface, while the CLI package gives you the command-line interface.

- If you are using StarNet with PixInsight:

Download StarNetv2PI_Win.zip from the SourceForge project, extract the folder, then follow the instructions in the README.txt file.

UPDATE: April 2023

The CUDA download links and paths in this article now default to CUDA 11.8. This new version continues to work with StarNet. I have not had a chance to test CUDA 12.x yet. Let me know if you've had any success or issues with CUDA 12.x in the comments below!


1.2 Extracting and Replacing libtensorflow

Download libtensorflow-gpu-windows-x86_64-2.7.0.zip from the TensorFlow project here, extract the archive, then navigate to the \lib folder inside the extracted files.

 

- If you are using the standalone StarNet module: 

Replace the file titled tensorflow.dll in StarNetv2GUI_Win or StarNetv2CLI_Win (whichever one you downloaded) with the one from \lib.

- If you are using StarNet with PixInsight: 

Navigate to C:\Program Files\PixInsight\bin, then replace the file titled tensorflow.dll with the one from \lib.
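
If you'd rather script this swap (and keep a backup of the original CPU build), a few lines of Python will do it. The two paths below are placeholders for wherever you extracted libtensorflow and StarNet; for the PixInsight case, point the target at the tensorflow.dll inside PixInsight's bin folder and run the script from an elevated prompt.

```python
# replace_tensorflow_dll.py - optional sketch: back up the original
# tensorflow.dll, then drop in the GPU-enabled build from libtensorflow.
# Both paths are placeholders - adjust them to your own folders.
import shutil
from pathlib import Path

gpu_dll = Path(r"C:\Users\you\Downloads\libtensorflow-gpu-windows-x86_64-2.7.0\lib\tensorflow.dll")
target  = Path(r"C:\Users\you\Downloads\StarNetv2GUI_Win\tensorflow.dll")

shutil.copy2(target, target.with_name(target.name + ".bak"))  # keep the CPU build as tensorflow.dll.bak
shutil.copy2(gpu_dll, target)                                 # overwrite with the GPU build
print(f"Replaced {target}")
```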

libtflow.png

2. Installing CUDA

Download the CUDA 11.8 installer from NVIDIA's website. Follow the on-screen instructions. Select Express (Recommended) to install all CUDA components.

If you would like to custom install CUDA, ensure that CUDA > Runtime and CUDA > Development > Tools > CUPTI are both checked.
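
Once the installer finishes, it's worth confirming the toolkit landed in the default location that the rest of this guide assumes. Running nvcc --version from a command prompt is enough; the sketch below does the same check in Python.

```python
# verify_cuda_install.py - sketch that checks the CUDA 11.8 Toolkit is in its
# default Express-install location and prints the compiler version.
import subprocess
from pathlib import Path

nvcc = Path(r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\bin\nvcc.exe")
if not nvcc.exists():
    raise SystemExit(f"nvcc not found at {nvcc} - check the install location")

# Prints something like "Cuda compilation tools, release 11.8, ..."
print(subprocess.run([str(nvcc), "--version"], capture_output=True, text=True).stdout)
```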

StarNet_CudaInstaller.png

3. Environment Variables

Open the Environment Variables editor; you can find it with the Windows Search Bar.
Once the "System Properties" window is open, click "Environment Variables" to open the editor.

Check that the following System Variables are present. If not, add them:

  • TF_FORCE_GPU_ALLOW_GROWTH = TRUE

  • CUDA_PATH = C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8

  • CUDA_PATH_V11_8 = C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8

EnvVar_TFGPU.png
EnvVar_CPath.png
EnvVar_CPathV116.png

Under the PATH environment variable, ensure that the following paths are present. If not, add them:

  • C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\bin

  • C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\include

  • C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\lib

  • C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\extras\CUPTI\lib64

  • C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\libnvvp

  • C:\tools\cuda\bin
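
Rather than eyeballing each entry, a short script can verify the variables and PATH entries listed above. Run it from a newly opened terminal, since windows that were already open won't see the updated environment. This sketch assumes the default v11.8 install path used throughout this guide.

```python
# check_env.py - sketch that verifies the variables and PATH entries above are
# visible to newly launched programs.
import os

CUDA_HOME = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8"

expected_vars = {
    "TF_FORCE_GPU_ALLOW_GROWTH": "TRUE",
    "CUDA_PATH": CUDA_HOME,
    "CUDA_PATH_V11_8": CUDA_HOME,
}
expected_paths = [
    CUDA_HOME + r"\bin",
    CUDA_HOME + r"\include",
    CUDA_HOME + r"\lib",
    CUDA_HOME + r"\extras\CUPTI\lib64",
    CUDA_HOME + r"\libnvvp",
    r"C:\tools\cuda\bin",
]

for name, value in expected_vars.items():
    status = "OK" if os.environ.get(name) == value else "missing or different"
    print(f"{name}: {status}")

path_entries = [p.rstrip("\\").lower() for p in os.environ.get("PATH", "").split(os.pathsep)]
for p in expected_paths:
    status = "OK" if p.rstrip("\\").lower() in path_entries else "missing"
    print(f"PATH entry {p}: {status}")
```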

EnvVar_Path.png

4. Final Files: cuDNN

Navigate to C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\include.

Copy the folder named \tensorflow from libtensorflow-gpu-windows-x86_64-2.7.0\include into C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\include.

tf_include_fldr.png

Then, copy tensorflow.lib from libtensorflow-gpu-windows-x86_64-2.7.0\lib into C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\lib.

Next, download cudnn-11.4-windows-x64-v8.2.4.15.zip from the NVIDIA website here. Note that although cuDNN v8.2.4 is advertised as compatible with CUDA 11.4, it works with CUDA 11.8 as well. As of the writing of this article, version 8.2.4 is the latest version of cuDNN compatible with TensorFlow 2.7.0.


If you don't want to register for an NVIDIA Developer Program account, you can also download it from this mirror.


Finally, copy the contents of the \bin and \include folders from cudnn-11.4-windows-x64-v8.2.4.15.zip into the corresponding folders in C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8, and copy the files from cuDNN's \lib folder into the toolkit's \lib\x64 folder.
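
If you prefer to script the copy (or want to re-run it after a CUDA reinstall), here is a sketch of the same steps. The source path is a placeholder for wherever you extracted the cuDNN zip; archives of this vintage unpack into a top-level \cuda folder containing \bin, \include, and \lib\x64. Run it from an elevated prompt, since the destination is under Program Files.

```python
# copy_cudnn.py - sketch mirroring the manual copy steps in this section.
import shutil
from pathlib import Path

SRC  = Path(r"C:\Users\you\Downloads\cudnn-11.4-windows-x64-v8.2.4.15\cuda")  # placeholder
DEST = Path(r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8")

pairs = [
    (SRC / "bin",         DEST / "bin"),          # cudnn64_8.dll and friends
    (SRC / "include",     DEST / "include"),      # cudnn*.h headers
    (SRC / "lib" / "x64", DEST / "lib" / "x64"),  # cudnn import libraries
]

for src_dir, dest_dir in pairs:
    for f in src_dir.glob("*"):
        if f.is_file():
            shutil.copy2(f, dest_dir / f.name)
            print(f"Copied {f.name} -> {dest_dir}")
```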

files.png

5. Running StarNet

To use the portable StarNet GUI module, run the application named starnetGUI.exe. Click "Browse" to select the 16-bit *.tiff file that you would like to run through StarNet, then click "Run".

If you are using the Command-Line Interface version of StarNet, copy the 16-bit *.tiff file you would like to process into the folder containing all the files, then drag the new file on top of starnet++.exe. Your output file will appear as starless.tif in the same folder that you executed the file from.
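
If you have a whole batch of images to de-star, the drag-and-drop step is easy to script: dropping a file onto starnet++.exe just passes its path as a command-line argument. The sketch below automates the manual procedure above (copy the TIFF into the StarNet folder, run the exe against it, then collect starless.tif under a new name); both folder paths are placeholders you'll want to change.

```python
# batch_starnet.py - sketch that automates the manual CLI procedure for a
# folder of 16-bit TIFFs. Folder paths are placeholders.
import shutil
import subprocess
from pathlib import Path

STARNET_DIR = Path(r"C:\Users\you\Downloads\StarNetv2CLI_Win")  # where starnet++.exe lives
INPUT_DIR   = Path(r"C:\Astro\to_process")                      # your 16-bit TIFFs

for tif in sorted(INPUT_DIR.glob("*.tif*")):
    local_copy = STARNET_DIR / tif.name
    shutil.copy2(tif, local_copy)                                # mirror the manual copy step
    subprocess.run([str(STARNET_DIR / "starnet++.exe"), tif.name],
                   cwd=STARNET_DIR, check=True)                  # same as dragging the file onto the exe
    shutil.move(str(STARNET_DIR / "starless.tif"),
                str(INPUT_DIR / f"{tif.stem}_starless.tif"))     # rename before the next run overwrites it
    local_copy.unlink()                                          # remove the temporary copy
```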

To use the PixInsight module, navigate to the "Process" tab, then select "StarNet2" under "All Processes". Apply the process to the desired image (linear data now supported!)

StarNet2_Test.png

6. Testing StarNet with CUDA and Troubleshooting

To check if CUDA is working, open Task Manager, click "More Details" at the bottom of the window, then click on the "Performance" tab at the top. Click on your GPU. Run an instance of StarNet that has had its original tensorflow.dll replaced with the GPU-enabled version.

 

If you were successful in enabling CUDA acceleration, the usage graph for your GPU should shoot to around 100%. Without GPU acceleration, the CPU usage graph would shoot upward but the GPU usage graph would stay constant. If your GPU usage doesn't go up, check to make sure all files were copied to the correct folders, and that all environment variables were added correctly. 

taskmgr.png

Running the standalone module can help reveal potential reasons behind an error or a crash, as the console output can give clues as to what's going on. The PixInsight module does not have this information in its console I/O.
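
If the GPU never spins up, a common culprit is a missing library. The standalone module's console will usually name the file it can't find; the sketch below simply pre-checks that the usual suspects for CUDA 11.x and cuDNN 8.x are sitting in the CUDA \bin folder. The exact DLL names are my assumption for these versions, so add whatever file your console output complains about.

```python
# check_dlls.py - troubleshooting sketch: confirms the CUDA and cuDNN libraries
# that the GPU-enabled tensorflow.dll needs are in the CUDA \bin folder.
from pathlib import Path

CUDA_BIN = Path(r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\bin")

for dll in ["cudart64_110.dll",   # CUDA 11.x runtime
            "cublas64_11.dll",    # cuBLAS for CUDA 11.x
            "cudnn64_8.dll"]:     # cuDNN 8.x, copied in section 4
    status = "found" if (CUDA_BIN / dll).exists() else "MISSING - revisit sections 2 and 4"
    print(f"{dll}: {status}")
```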


7. A Quick Benchmark

I wanted to see just how much performance increase I was able to get by upgrading to StarNet V2 with CUDA acceleration, so I ran a very short test. By no means is this an exhaustive test, nor do I guarantee that everyone else will see similar results. Your mileage will vary depending on your hardware configuration, as well as the image that you run through StarNet.

I'm running PixInsight 1.8.8-12 on Windows 11, with an AMD Ryzen 7 5800X CPU and NVIDIA GeForce RTX 3070 GPU. A 6000x4000 image of M31 was used as a benchmark. More information about the tests can be found in the PDF version of this tutorial.

Stock StarNet V1 averaged 1:47.52, while StarNet V1 with CUDA acceleration averaged 16.56s — an 85% improvement!

Stock StarNet V2 averaged 1:20.66, while StarNet V2 with CUDA acceleration averaged 12.50s — also an 85% improvement!

StarNet V2 with CUDA offered an 88% performance increase compared to stock StarNet V1.
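
For anyone who wants to check the math, 1:47.52 is 107.52 seconds and 1:20.66 is 80.66 seconds; the percentages above are just the reduction in run time:

```python
# speedup.py - the arithmetic behind the percentages above.
times = {
    "V1 stock": 107.52,  # 1:47.52
    "V1 CUDA":  16.56,
    "V2 stock": 80.66,   # 1:20.66
    "V2 CUDA":  12.50,
}

def improvement(before, after):
    """Percentage reduction in run time."""
    return 100 * (times[before] - times[after]) / times[before]

print(f"V1 CUDA vs V1 stock: {improvement('V1 stock', 'V1 CUDA'):.0f}%")  # ~85%
print(f"V2 CUDA vs V2 stock: {improvement('V2 stock', 'V2 CUDA'):.0f}%")  # ~85%
print(f"V2 CUDA vs V1 stock: {improvement('V1 stock', 'V2 CUDA'):.0f}%")  # ~88%
```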

Although not everyone will see the same results as these, CUDA acceleration of StarNet has been known to consistently show a performance increase of around 80-90% in most systems. Pretty impressive for so little work—that's the beauty of taking advantage of existing hardware!

StarNet_V2_GPU_2.png

Thanks for reading this tutorial, and clear skies to everyone! If you have any comments or questions, please don't hesitate to let me know! - WL
