Configuring Nvidia CUDA Drivers for Hashcat in Fedora 31 on AWS

I’m back, and in the time since my last post I have graduated with my Information Assurance and Cyber Defense degree from Eastern Michigan University. Immediately after that I started in the Master's in Cybersecurity program that they had just stood up. One of my first projects in an Offensive Security class was getting a remote shell on a vulnerable web app. Once I gained access to the machine and escalated my privileges to root, I was able to send the passwd and shadow files back to my Kali VM via netcat. With my shiny new passwd files in hand, I started thinking about what the next steps should be.

I currently use a hexa-core 9th-gen Dell XPS that came with an Nvidia GTX 1650. I figured that this machine would be perfect, because I could use the GPU's CUDA cores to crack passwords. But then I thought, why stop there? So I decided to spin up an Amazon EC2 instance and take my show on the road. In this guide I will walk you through that process and show you how to get password cracking with Hashcat running on a GPU (or several) in the cloud. While these instructions are for a cloud-based Fedora 31 image, the same steps apply to getting these applications running locally, so this guide should be of use to anyone attempting to configure these technologies.


Configuring AWS

The first step is to create and log into an Amazon Web Services account. From there you can click the Services dropdown menu at the top of the page and select EC2.

Then select “Instances” and click the blue “Create Instance” button.

On the next page you will select the type of image that would run in your EC2 instance. I typed in Fedora and selected the checkbox for AWS Marketplace to narrow down the results. From there I selected the Fedora 31 Base Minimal Image Golden AMI.

This image will give you a clean installation of Fedora 31 to play with.

Click through the pop-up that covers pricing for the different instance types and you are met with a long list of hardware options to host your Fedora server. If you scroll about two-thirds of the way down the list you will come to the GPU instances. These provide the raw password cracking power we will use with Hashcat. I went with the p2.xlarge instance, which costs $0.90/hour and is quite affordable. However, this instance type includes only one Nvidia Tesla K80 GPU. The p2.8xlarge has 8 GPUs, which would drastically increase your processing power, and the p2.16xlarge doubles that again (albeit at a much higher hourly rate).

The p2.16xlarge instance costs $14.40 an hour.
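
To put those hourly rates in daily terms, a quick calculation helps. This is only a rough sketch: the helper name daily_cost is made up for the example, and the rates are the ones quoted in this post, which may well have changed since.

```shell
# daily_cost is a hypothetical helper: hourly rate in USD -> cost of 24 hours.
daily_cost() {
    awk -v rate="$1" 'BEGIN { printf "%.2f\n", rate * 24 }'
}

daily_cost 0.90    # p2.xlarge rate quoted in this post
daily_cost 14.40   # p2.16xlarge rate quoted in this post
```

Remember that the meter keeps running until the instance is stopped or terminated, whether or not Hashcat is doing any work.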

At this point you can select “Review and Launch” to double check all of the settings are correct and create the instance. The first time I did this I needed to request the ability to use more vCPU cores, as a new account comes with an allotment of 0. If this is the case, when you click “Launch” it will fail and there will be a message informing you how to create a service request. This page can also be accessed under the Limits menu of the EC2 Management Page.

You will need to find the “Running On-Demand All P instances” option and request a limit increase to the number of vCPUs required by the instance type you are choosing. In the image above you can see the p2.xlarge requires 4 vCPUs.

The final step will be creating a key pair so that you can authenticate with the server and log in. This can be done while creating the instance or in advance under the “Key Pairs” header of the EC2 management pane. Make sure to select the correct file format: pem for Linux users and ppk for Windows users connecting with PuTTY.

At this point you should have a brand new AWS instance up and running and waiting for you to log in and get started.


Setting Up Your Fedora Server

To connect to your server you will need to get the Public DNS name for the machine and then connect with ssh. You'll need to change the name of your pem file and the address in the command below.

ssh -i raul.pem fedora@ec2-xx-xx-xx-99.compute-1.amazonaws.com
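
If ssh refuses the key with a warning about an unprotected private key file, the permissions on the pem file are too open. A minimal sketch of the fix follows; the helper name lock_key is invented for this example, and raul.pem is just the example key name used above.

```shell
# lock_key is a hypothetical helper: it makes a private key readable by
# its owner only, which ssh requires before it will use the key.
lock_key() {
    chmod 400 "$1"
}

# Usage, with the example key name from this post:
#   lock_key raul.pem
```

After that, the ssh command above should accept the key.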

Finally, you will be greeted by a command prompt and will be in your new instance.

The hostname is an internal IP address, or I would hide it from you.

The first thing I did once I was logged into the machine was to do an update. Then I installed the RPM fusion repos.

sudo dnf upgrade
sudo dnf install https://download1.rpmfusion.org/free/fedora/rpmfusion-free-release-$(rpm -E %fedora).noarch.rpm https://download1.rpmfusion.org/nonfree/fedora/rpmfusion-nonfree-release-$(rpm -E %fedora).noarch.rpm
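
The $(rpm -E %fedora) part of that command expands to the release number of the running system, which is why the same line works on other Fedora versions. As a rough illustration (the helper name rpmfusion_urls is made up for this example), this prints the two URLs that dnf ends up receiving for a given release:

```shell
# rpmfusion_urls is a hypothetical helper: given a Fedora release number,
# it prints the free and nonfree RPM Fusion release package URLs.
rpmfusion_urls() {
    for repo in free nonfree; do
        echo "https://download1.rpmfusion.org/${repo}/fedora/rpmfusion-${repo}-release-${1}.noarch.rpm"
    done
}

rpmfusion_urls 31   # on a live system: rpmfusion_urls "$(rpm -E %fedora)"
```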

These repositories will give you access to the Nvidia drivers. The next thing you will want to do is blacklist the Nouveau drivers. These are open source Nvidia drivers that will conflict with the proprietary ones we need to utilize the CUDA cores. To do this you will want to create a blacklist.conf file in /etc/modprobe.d/ with the following command.

sudo vi /etc/modprobe.d/blacklist.conf

Then paste in the following text:

blacklist nouveau
blacklist lbm-nouveau
options nouveau modeset=0
alias nouveau off
alias lbm-nouveau off
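
If you would rather not paste into an editor over ssh, the same file can be written from the shell. This is a sketch under a couple of assumptions: write_blacklist is a made-up helper name, and /tmp/blacklist.conf is just a scratch location chosen for the example.

```shell
# write_blacklist is a hypothetical helper: it writes the same five lines
# shown above to the path given as its first argument.
write_blacklist() {
    cat > "$1" <<'EOF'
blacklist nouveau
blacklist lbm-nouveau
options nouveau modeset=0
alias nouveau off
alias lbm-nouveau off
EOF
}

# Usage: write to a scratch file, then install it with root privileges:
#   write_blacklist /tmp/blacklist.conf
#   sudo install -m 644 /tmp/blacklist.conf /etc/modprobe.d/blacklist.conf
```

Writing to a scratch file first keeps sudo out of the helper itself and lets you inspect the result before it lands in /etc.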

With the prerequisites out of the way you can install the Nvidia CUDA drivers and test that they are working.

sudo dnf install xorg-x11-drv-nvidia-cuda

Test that it is working properly with the following command.

nvidia-smi

There will be a line item for every GPU installed on the system. If this command does not work, there is an issue. Perhaps the machine will need a restart before everything loads.
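
When nvidia-smi misbehaves, it helps to check whether Nouveau is still loaded and to ask for a compact one-row-per-GPU listing. The following is a sketch only; check_gpu is an invented helper name and the messages it prints are my own wording.

```shell
# check_gpu is a hypothetical helper that reports on the driver state.
check_gpu() {
    # nouveau should not appear among the loaded kernel modules
    if lsmod 2>/dev/null | grep -q '^nouveau'; then
        echo "nouveau is still loaded; the blacklist has not taken effect"
    fi
    if command -v nvidia-smi >/dev/null 2>&1; then
        # One CSV row per GPU: model name and total memory
        nvidia-smi --query-gpu=name,memory.total --format=csv \
            || echo "nvidia-smi failed; a reboot may be needed"
    else
        echo "nvidia-smi not found; is the driver package installed?"
    fi
}

check_gpu
```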

With the GPU drivers configured it is time to install and run Hashcat. This is as easy as running:

sudo dnf install hashcat

Now that all the pieces are in place you can finally run the Hashcat benchmarks to get an idea of the performance of your cloud-based password cracker. Here is the command and its output on my machine.

hashcat -b
hashcat (v5.1.0) starting in benchmark mode...

* Device #1: This hardware has outdated CUDA compute capability (3.7).
             For modern OpenCL performance, upgrade to hardware that supports
             CUDA compute capability version 5.0 (Maxwell) or higher.
* Device #2: Not a native Intel OpenCL runtime. Expect massive speed loss.
             You can use --force to override, but do not report related errors.
OpenCL Platform #1: NVIDIA Corporation
======================================
* Device #1: Tesla K80, 2860/11441 MB allocatable, 13MCU

OpenCL Platform #2: The pocl project
====================================
* Device #2: pthread-Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz, skipped.

OpenCL Platform #3: Mesa, skipped or no OpenCL compatible devices found.

Benchmark relevant options:
===========================
* --optimized-kernel-enable
* --workload-profile=3

Hashmode: 0 - MD5

Speed.#1.........:  3978.5 MH/s (53.04ms) @ Accel:512 Loops:128 Thr:256 Vec:2

Hashmode: 100 - SHA1

Speed.#1.........:  1619.0 MH/s (66.25ms) @ Accel:256 Loops:128 Thr:256 Vec:4

Hashmode: 1400 - SHA2-256

Speed.#1.........:   724.5 MH/s (74.09ms) @ Accel:256 Loops:64 Thr:256 Vec:1

Hashmode: 1700 - SHA2-512

Speed.#1.........:   211.4 MH/s (63.88ms) @ Accel:128 Loops:32 Thr:256 Vec:1

Hashmode: 2500 - WPA-EAPOL-PBKDF2 (Iterations: 4096)

Speed.#1.........:    82308 H/s (80.27ms) @ Accel:128 Loops:64 Thr:256 Vec:1

Hashmode: 1000 - NTLM

Speed.#1.........:  6742.3 MH/s (63.00ms) @ Accel:512 Loops:256 Thr:256 Vec:4

Hashmode: 3000 - LM

Speed.#1.........:  4243.1 MH/s (51.01ms) @ Accel:64 Loops:1024 Thr:256 Vec:1

Hashmode: 5500 - NetNTLMv1 / NetNTLMv1+ESS

Speed.#1.........:  3862.1 MH/s (55.02ms) @ Accel:512 Loops:128 Thr:256 Vec:4

Hashmode: 5600 - NetNTLMv2

Speed.#1.........:   247.0 MH/s (54.73ms) @ Accel:128 Loops:32 Thr:256 Vec:2

Hashmode: 1500 - descrypt, DES (Unix), Traditional DES

Speed.#1.........:   174.4 MH/s (77.91ms) @ Accel:4 Loops:1024 Thr:256 Vec:1

Hashmode: 500 - md5crypt, MD5 (Unix), Cisco-IOS $1$ (MD5) (Iterations: 1000)

Speed.#1.........:  1210.0 kH/s (81.99ms) @ Accel:512 Loops:500 Thr:32 Vec:1

Hashmode: 3200 - bcrypt $2*$, Blowfish (Unix) (Iterations: 32)

Speed.#1.........:     2550 H/s (39.78ms) @ Accel:16 Loops:2 Thr:8 Vec:1

Hashmode: 1800 - sha512crypt $6$, SHA512 (Unix) (Iterations: 5000)

Speed.#1.........:    28376 H/s (92.95ms) @ Accel:256 Loops:128 Thr:32 Vec:1

Hashmode: 7500 - Kerberos 5 AS-REQ Pre-Auth etype 23

Speed.#1.........: 46972.8 kH/s (72.29ms) @ Accel:128 Loops:32 Thr:64 Vec:1

Hashmode: 13100 - Kerberos 5 TGS-REP etype 23

Speed.#1.........: 47306.6 kH/s (71.78ms) @ Accel:128 Loops:32 Thr:64 Vec:1

Hashmode: 15300 - DPAPI masterkey file v1 (Iterations: 23999)

Speed.#1.........:    14347 H/s (78.45ms) @ Accel:128 Loops:64 Thr:256 Vec:1

Hashmode: 15900 - DPAPI masterkey file v2 (Iterations: 7999)

Speed.#1.........:    10108 H/s (83.90ms) @ Accel:256 Loops:64 Thr:32 Vec:1

Hashmode: 7100 - macOS v10.8+ (PBKDF2-SHA512) (Iterations: 35000)

Speed.#1.........:     2554 H/s (75.80ms) @ Accel:64 Loops:32 Thr:256 Vec:1

Hashmode: 11600 - 7-Zip (Iterations: 524288)

Speed.#1.........:     2012 H/s (99.54ms) @ Accel:512 Loops:64 Thr:256 Vec:1

Hashmode: 12500 - RAR3-hp (Iterations: 262144)

Speed.#1.........:     9942 H/s (83.53ms) @ Accel:4 Loops:16384 Thr:256 Vec:1

Hashmode: 13000 - RAR5 (Iterations: 32767)

Speed.#1.........:     7967 H/s (103.37ms) @ Accel:128 Loops:64 Thr:256 Vec:1

Hashmode: 6211 - TrueCrypt PBKDF2-HMAC-RIPEMD160 + XTS 512 bit (Iterations: 2000)

Speed.#1.........:    64961 H/s (103.67ms) @ Accel:128 Loops:32 Thr:256 Vec:1

Hashmode: 13400 - KeePass 1 (AES/Twofish) and KeePass 2 (AES) (Iterations: 6000)

Speed.#1.........:    32603 H/s (138.83ms) @ Accel:512 Loops:128 Thr:32 Vec:1

Hashmode: 6800 - LastPass + LastPass sniffed (Iterations: 500)

Speed.#1.........:   499.2 kH/s (45.16ms) @ Accel:64 Loops:62 Thr:256 Vec:1

Hashmode: 11300 - Bitcoin/Litecoin wallet.dat (Iterations: 199999)

Speed.#1.........:      946 H/s (71.35ms) @ Accel:128 Loops:32 Thr:256 Vec:1

Started: Wed Feb  5 05:20:54 2020
Stopped: Wed Feb  5 05:27:19 2020
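
The sha512crypt result above (mode 1800) is the one that matters for a typical modern Linux shadow file, where the hash field begins with $6$. As a hedged sketch of how the benchmark translates into an actual run: the helper name extract_sha512crypt and the file names shadow, hashes.txt and wordlist.txt are placeholders chosen for this example.

```shell
# extract_sha512crypt is a hypothetical helper: it prints the hash field
# (second colon-separated field) of every shadow entry that uses $6$.
extract_sha512crypt() {
    awk -F: '$2 ~ /^\$6\$/ { print $2 }' "$1"
}

# Usage, assuming files named shadow and wordlist.txt in the current directory:
#   extract_sha512crypt shadow > hashes.txt
#   hashcat -m 1800 -a 0 hashes.txt wordlist.txt
```

Here -m 1800 selects sha512crypt and -a 0 selects a straight wordlist attack. Running hashcat -b -m 1800 limits the benchmark to that one mode if you only care about a single hash type.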

As you can see, even this modest instance is capable of trying billions of fast, unsalted hashes such as MD5 or NTLM per second for less than $25/day. If you need more power, the larger instances should offer a good solution, considering the largest one costs about $346 a day, compared to what would likely be $18,000 to build a comparable machine. Interestingly enough, my laptop with the GTX 1650 outperforms these benchmarks, though it would quickly be left behind by some of the more powerful offerings. Hashcat is really a great tool and I will expand on its usage in another guide very soon. Thanks for reading.
