Detecting Parkinson’s Disease with OpenCV, Computer Vision, and the Spiral/Wave Test

In this tutorial, you will learn how to use OpenCV and machine learning to automatically detect Parkinson’s disease in hand-drawn images of spirals and waves.

Today’s tutorial is inspired by PyImageSearch reader Joao Paulo Folador, a PhD student from Brazil.

Joao is interested in utilizing computer vision and machine learning to automatically detect and predict Parkinson’s disease based on geometric drawings (i.e., spirals and sine waves).

While I am familiar with Parkinson’s disease, I had not heard of the geometric drawing test — a bit of research led me to a 2017 paper, Distinguishing Different Stages of Parkinson’s Disease Using Composite Index of Speed and Pen-Pressure of Sketching a Spiral, by Zham et al.

The researchers found that the drawing speed was slower and the pen pressure lower among Parkinson’s patients — this was especially pronounced for patients with more acute/advanced forms of the disease.

Among the symptoms of Parkinson’s are tremors and muscle rigidity, which make it harder to draw smooth spirals and waves.

Joao postulated that it might be possible to detect Parkinson’s disease using the drawings alone rather than having to measure the speed and pressure of the pen on paper.

Reducing the requirement of tracking pen speed and pressure:

  1. Eliminates the need for additional hardware when performing the test.
  2. Makes it far easier to automatically detect Parkinson’s.

Graciously, Joao and his advisor allowed me access to the dataset they collected of both spirals and waves drawn by (1) patients with Parkinson’s, and (2) healthy participants.

I took a look at the dataset and considered our options.

Originally, Joao wanted to apply deep learning to the project, but after consideration, I carefully explained that deep learning, while powerful, isn’t always the right tool for the job! You wouldn’t want to use a hammer to drive in a screw, for instance.

Instead, you look at your toolbox, carefully consider your options, and grab the right tool.

I explained this to Joao and then demonstrated how we can predict Parkinson’s in images with 83.33% accuracy using standard computer vision and machine learning algorithms.

To learn how to apply computer vision and OpenCV to detect Parkinson’s based on geometric drawings, just keep reading!

Looking for the source code to this post?
Jump right to the downloads section.

Detecting Parkinson’s with OpenCV, Computer Vision, and the Spiral/Wave Test

In the first part of this tutorial, we’ll briefly discuss Parkinson’s disease, including how geometric drawings can be used to detect and predict Parkinson’s.

We’ll then examine our dataset of drawings gathered from both patients with and without Parkinson’s.

After reviewing the dataset, I’ll show you how to use the HOG image descriptor to quantify the input images and then how we can train a Random Forest classifier on top of the extracted features.

We’ll wrap up by examining our results.

What is Parkinson’s disease?

Figure 1: Patients with Parkinson’s disease have nervous system issues. Symptoms include movement issues such as tremors and rigidity. In this blog post, we’ll use OpenCV and machine learning to detect Parkinson’s disease from hand drawings consisting of spirals and waves.

Parkinson’s disease is a nervous system disorder that affects movement. The disease is progressive and is marked by five different stages (source).

  1. Stage 1: Mild symptoms that do not typically interfere with daily life, including tremors and movement issues on only one side of the body.
  2. Stage 2: Symptoms worsen, with tremors and rigidity now affecting both sides of the body. Daily tasks become challenging.
  3. Stage 3: Loss of balance and slowed movements, with falls becoming more frequent. The patient is typically still capable of living independently.
  4. Stage 4: Symptoms become severe and constraining. The patient is unable to live alone and requires help to perform daily activities.
  5. Stage 5: Walking or standing is likely impossible. The patient is most likely wheelchair-bound and may even experience hallucinations.

While Parkinson’s cannot be cured, early detection along with proper medication can significantly improve symptoms and quality of life, making it an important topic for computer vision and machine learning practitioners to explore.

Drawing spirals and waves to detect Parkinson’s disease

Figure 2: A 2017 study by Zham et al. concluded that it is possible to detect Parkinson’s by asking the patient to draw a spiral while tracking the speed of pen movement and pressure. No image processing was conducted in this study. (image source)

A 2017 study by Zham et al. found that it was possible to detect Parkinson’s by asking the patient to draw a spiral and then track:

  1. Speed of drawing
  2. Pen pressure

The researchers found that the drawing speed was slower and the pen pressure lower among Parkinson’s patients — this was especially pronounced for patients with more acute/advanced forms of the disease.

We’ll be leveraging the fact that two of the most common Parkinson’s symptoms, tremors and muscle rigidity, directly impact the visual appearance of a hand-drawn spiral or wave.

The variation in visual appearance will enable us to train a computer vision + machine learning algorithm to automatically detect Parkinson’s disease.

The spiral and wave dataset

Figure 3: Today’s Parkinson’s image dataset is curated by Andrade and Folador from the NIATS of the Federal University of Uberlândia. We will use Python and OpenCV to train a model for automatically classifying Parkinson’s from similar spiral/wave drawings.

The dataset we’ll be using here today was curated by Adriano de Oliveira Andrade and Joao Paulo Folador from the NIATS of the Federal University of Uberlândia.

The dataset itself consists of 204 images and is pre-split into a training set and a testing set, consisting of:

  • Spiral: 102 images, 72 training, and 30 testing
  • Wave: 102 images, 72 training, and 30 testing

Figure 3 above shows examples of each of the drawings and corresponding classes.

While it would be challenging, if not impossible, for a person to classify Parkinson’s vs. healthy in some of these drawings, others show a clear deviation in visual appearance — our goal is to quantify the visual appearance of these drawings and then train a machine learning model to classify them.

Preparing a computing environment for today’s project

Today’s environment is straightforward to get up and running on your system.

You will need the following software:

  • OpenCV
  • NumPy
  • Scikit-learn
  • Scikit-image
  • imutils

Each package can be installed with pip, Python’s package manager.

But before you dive into pip, read this tutorial to set up your virtual environment and to install OpenCV with pip.

Below you can find the commands you’ll need to configure your development environment.
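
If the packages aren’t already installed, a minimal set of pip commands might look like the following (this assumes you’ve already created and activated your virtual environment, and opencv-contrib-python is just one of several pip-installable OpenCV packages):

$ pip install opencv-contrib-python
$ pip install numpy
$ pip install scikit-learn
$ pip install scikit-image
$ pip install imutils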

Project structure

Go ahead and grab the “Downloads” associated with today’s post. The .zip file contains the spiral and wave dataset along with a single Python script.

You may use the tree command in a terminal to inspect the structure of the files and folders:
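
Here is roughly what you should see (a trimmed view — the individual image files inside each healthy/ and parkinson/ folder are omitted for brevity):

$ tree . --dirsfirst
.
├── dataset
│   ├── spiral
│   │   ├── testing
│   │   │   ├── healthy
│   │   │   └── parkinson
│   │   └── training
│   │       ├── healthy
│   │       └── parkinson
│   └── wave
│       ├── testing
│       │   ├── healthy
│       │   └── parkinson
│       └── training
│           ├── healthy
│           └── parkinson
└── detect_parkinsons.py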

Our dataset/ directory is first broken down into spiral/ and wave/. Each of those folders is further split into testing/ and training/. Finally, our images reside in healthy/ or parkinson/ folders.

We’ll be reviewing a single Python script today: detect_parkinsons.py. This script will read all of the images, extract features, and train a machine learning model. Finally, the results will be displayed in a montage.

Implementing the Parkinson’s detector script

To implement our Parkinson’s detector you may be tempted to throw deep learning and Convolutional Neural Networks (CNNs) at the problem — there’s a problem with that approach though.

To start, we don’t have much training data, only 72 images for training. When confronted with a lack of training data we typically apply data augmentation — but data augmentation in this context is also problematic.

You would need to be extremely careful as improper use of data augmentation could potentially make a healthy patient’s drawing look like a Parkinson’s patient’s drawing (or vice versa).

And more to the point, effectively applying computer vision to a problem is all about bringing the right tool to the job — you wouldn’t use a screwdriver to bang in a nail, for instance.

Just because you may know how to apply deep learning to a problem doesn’t necessarily mean that deep learning is “always” the best choice for the problem.

In this example, I’ll show you how the Histogram of Oriented Gradients (HOG) image descriptor along with a Random Forest classifier can perform quite well given the limited amount of training data.

Open up a new file, name it detect_parkinsons.py, and insert the following code:
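
As a reference point, here is a minimal sketch of the import block that is consistent with the walkthrough below (the line numbers cited throughout this section refer to the original script and may not match these sketches exactly):

# import the necessary packages
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import confusion_matrix
from skimage import feature
from imutils import build_montages
from imutils import paths
import numpy as np
import argparse
import cv2
import os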

We begin with our imports on Lines 2-11:

  • We’ll be making heavy use of scikit-learn as is evident in the first three imports:
    • The classifier we are using is the RandomForestClassifier.
    • We’ll use a LabelEncoder to encode labels as integers.
    • A confusion_matrix will be built so that we can derive raw accuracy, sensitivity, and specificity.
  • Histogram of Oriented Gradients (HOG) will come from the feature import of scikit-image.
  • Two modules from imutils will be put to use:
    • We will use build_montages for visualization.
    • Our paths import will help us extract the file paths to each of the images in our dataset.
  • NumPy will help us calculate statistics and grab random indices.
  • The argparse import will allow us to parse command line arguments.
  • OpenCV (cv2) will be used to read, process, and display images.
  • Our program will accommodate both Unix and Windows file paths with the os module.

Let’s define a function to quantify a wave/spiral image with the HOG method:
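
Below is a sketch of quantify_image. The specific HOG parameters (9 orientations, 10×10 pixels per cell, 2×2 cells per block, L1 block normalization) are assumptions, but they are consistent with the 12,996-dim feature vector mentioned below when the input image is resized to 200×200 pixels:

def quantify_image(image):
    # compute the histogram of oriented gradients feature vector for
    # the input image
    features = feature.hog(image, orientations=9,
        pixels_per_cell=(10, 10), cells_per_block=(2, 2),
        transform_sqrt=True, block_norm="L1")

    # return the feature vector
    return features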

We will extract features from each input image with the quantify_image function.

First introduced by Dalal and Triggs in their CVPR 2005 paper, Histogram of Oriented Gradients for Human Detection, HOG will be used to quantify our image.

HOG is a structural descriptor that captures and quantifies changes in local gradient in the input image. HOG will naturally be able to quantify how the directions of both spirals and waves change.

And furthermore, HOG will be able to capture if these drawings have more of a “shake” to them, as we might expect from a Parkinson’s patient.

Another application of HOG is covered in this PyImageSearch Gurus sample lesson. Be sure to refer to the sample lesson for a full explanation of the feature.hog parameters.

The resulting features are a 12,996-dim feature vector (list of numbers) quantifying the wave or spiral. We’ll train a Random Forest classifier on top of the features from all images in the dataset.
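
As a quick sanity check on that number: with a 200×200 input, 10×10 pixels per cell, and 2×2 cells per block, HOG produces a 19×19 grid of overlapping blocks, each contributing 2 × 2 × 9 = 36 values, for a total of 19 × 19 × 36 = 12,996 features.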

Moving on, let’s load our data and extract features:
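
Here is a sketch of load_split, again assuming the 200×200 resize and an inverted Otsu threshold for the preprocessing step:

def load_split(path):
    # grab the list of images in the input directory, then initialize
    # the list of data (i.e., feature vectors) and class labels
    imagePaths = list(paths.list_images(path))
    data = []
    labels = []

    # loop over the image paths
    for imagePath in imagePaths:
        # extract the class label from the name of the parent
        # directory (either "healthy" or "parkinson")
        label = imagePath.split(os.path.sep)[-2]

        # load the input image, convert it to grayscale, and resize
        # it to 200x200 pixels, ignoring aspect ratio
        image = cv2.imread(imagePath)
        image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        image = cv2.resize(image, (200, 200))

        # threshold the image such that the drawing appears as white
        # foreground on a black background
        image = cv2.threshold(image, 0, 255,
            cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]

        # quantify the image with our HOG helper
        features = quantify_image(image)

        # update the data and labels lists, respectively
        data.append(features)
        labels.append(label)

    # return the data and labels as NumPy arrays in a tuple
    return (np.array(data), np.array(labels))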

The goal of the load_split function is to accept a dataset path and return all feature data and associated class labels. Let’s break it down step by step:

  • The function is defined to accept a path to the dataset (either waves or spirals) on Line 23.
  • From there we grab the input imagePaths, taking advantage of imutils (Line 26).
  • Both the data and labels lists are initialized (Lines 27 and 28).
  • From there we loop over all imagePaths beginning on Line 31:
    • Each label is extracted from the path (Line 33).
    • Each image is loaded and preprocessed (Lines 37-44). The thresholding step segments the drawing from the input image, making the drawing appear as white foreground on a black background.
    • Features are extracted via our quantify_image function (Line 47).
    • The features and label are appended to the data and labels lists, respectively (Lines 50 and 51).
  • Finally, data and labels are converted to NumPy arrays and returned conveniently in a tuple (Line 54).

Let’s go ahead and parse our command line arguments:
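
A sketch of the argument parsing block (the short -d/-t flags are assumptions):

# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-d", "--dataset", required=True,
    help="path to input dataset")
ap.add_argument("-t", "--trials", type=int, default=5,
    help="# of trials to run")
args = vars(ap.parse_args())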

Our script handles two command line arguments:

  • --dataset: The path to the input dataset (either waves or spirals).
  • --trials: The number of trials to run (by default we run 5 trials).

To prepare for training, we’ll perform a few initializations:
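
A sketch of the initialization step (the LabelEncoder usage here is an assumption based on the imports above; it simply maps the "healthy" and "parkinson" strings to integers):

# define the path to the training and testing directories
trainingPath = os.path.sep.join([args["dataset"], "training"])
testingPath = os.path.sep.join([args["dataset"], "testing"])

# load the training and testing data splits
print("[INFO] loading data...")
(trainX, trainY) = load_split(trainingPath)
(testX, testY) = load_split(testingPath)

# encode the class labels as integers
le = LabelEncoder()
trainY = le.fit_transform(trainY)
testY = le.transform(testY)

# initialize our trials dictionary
trials = {}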

Here we are building paths to training and testing input directories (Lines 65 and 66).

From there we load our training and testing splits by passing each path to load_split (Lines 70 and 71).

Our trials dictionary is initialized on Line 79 (recall that by default we will run 5 trials).

Let’s start our trials now:
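
Here is a sketch of the trial loop; the 100-tree forest matches the summary at the end of this post, while the metric key names ("acc", "sensitivity", "specificity") are placeholders of my choosing:

# loop over the number of trials to run
for i in range(0, args["trials"]):
    # train the Random Forest on the training data
    print("[INFO] training model {} of {}...".format(i + 1,
        args["trials"]))
    model = RandomForestClassifier(n_estimators=100)
    model.fit(trainX, trainY)

    # make predictions on the testing data, then initialize a
    # dictionary to store our computed metrics
    predictions = model.predict(testX)
    metrics = {}

    # compute the confusion matrix and use it to derive the raw
    # accuracy, sensitivity, and specificity
    (tn, fp, fn, tp) = confusion_matrix(testY, predictions).flatten()
    metrics["acc"] = (tp + tn) / float(tp + tn + fp + fn)
    metrics["sensitivity"] = tp / float(tp + fn)
    metrics["specificity"] = tn / float(tn + fp)

    # update the trials dictionary with the value of each metric
    for (k, v) in metrics.items():
        l = trials.get(k, [])
        l.append(v)
        trials[k] = l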

On Line 82, we loop over each trial. In each trial, we:

  • Initialize our Random Forest classifier and train the model (Lines 86 and 87). For more information about Random Forests, including how they are used in the context of computer vision, be sure to refer to PyImageSearch Gurus.
  • Make predictions on the testing data (Line 91).
  • Compute accuracy, sensitivity, and specificity metrics (Lines 96-100).
  • Update our trials dictionary (Lines 103-108).

Looping over each of our metrics, we’ll print statistical information:
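
A sketch of the reporting loop:

# loop over our metrics
for metric in ("acc", "sensitivity", "specificity"):
    # grab the list of values for the current metric, then compute
    # the mean and standard deviation
    values = trials[metric]
    mean = np.mean(values)
    std = np.std(values)

    # show the computed metric statistics in the terminal
    print(metric)
    print("=" * len(metric))
    print("mean={:.4f}, std={:.4f}".format(mean, std))
    print("")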

On Line 111, we loop over each metric.

Then we proceed to grab the values for that metric from the trials dictionary (Line 114).

Using the values, the mean and standard deviation are computed for each metric (Lines 115 and 116).

From there, the statistics are shown in the terminal.

Now comes the eye candy — we’re going to create a montage so that we can share our work visually:
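
Here is a sketch of the full visualization step; it reuses the model (and label encoder) from the final trial, and the 25-image sample, 128×128 tile size, and 5×5 montage layout are assumptions. The next few paragraphs walk through it piece by piece:

# randomly select a few images from the testing set, then initialize
# the list of annotated output images for the montage
testingPaths = list(paths.list_images(testingPath))
idxs = np.arange(0, len(testingPaths))
idxs = np.random.choice(idxs, size=(25,), replace=False)
images = []

# loop over the sampled testing paths
for i in idxs:
    # load the testing image, clone it, and resize the clone for
    # display purposes
    image = cv2.imread(testingPaths[i])
    output = image.copy()
    output = cv2.resize(output, (128, 128))

    # pre-process the image in the same manner we did earlier
    image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    image = cv2.resize(image, (200, 200))
    image = cv2.threshold(image, 0, 255,
        cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]

    # quantify the image, classify it with the trained Random Forest,
    # and convert the integer prediction back to a string label
    features = quantify_image(image)
    preds = model.predict([features])
    label = le.inverse_transform(preds)[0]

    # draw the color-coded label in the top-left corner of the output
    # image (green for "healthy", red otherwise) and store the result
    color = (0, 255, 0) if label == "healthy" else (0, 0, 255)
    cv2.putText(output, label, (3, 20), cv2.FONT_HERSHEY_SIMPLEX, 0.5,
        color, 2)
    images.append(output)

# build a 5x5 montage of 128x128 tiles and display it until a key is
# pressed
montage = build_montages(images, (128, 128), (5, 5))[0]
cv2.imshow("Output", montage)
cv2.waitKey(0)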

First, we randomly sample images from our testing set (Lines 126-128).

Our images list will hold each wave or spiral image along with annotations added via OpenCV drawing functions (Line 129).

We proceed to loop over the random image indices on Line 132.

Inside the loop, each image is processed in the same manner as during training (Lines 134-142).

From there we’ll automatically classify the image using our new HOG + Random Forest based classifier and add color-coded annotations:

Each image is quantified with HOG features (Line 146).

Then the image is classified by passing those features to model.predict (Lines 147 and 148).

The class label is colored green for "healthy" and red otherwise (Line 152). The label is drawn in the top-left corner of the image (Lines 153 and 154).

Each output image is then added to an images list (Line 155) so that we can develop a montage (Line 158). You can learn more about creating montages with OpenCV.

The montage is then displayed via Line 161 until a key is pressed.

Training the Parkinson’s detector model

Figure 4: Using Python, OpenCV, and machine learning (Random Forests), we have classified Parkinson’s patients using their hand-drawn spirals with 83.33% accuracy.

Let’s put our Parkinson’s disease detector to the test!

Use the “Downloads” section of this tutorial to download the source code and dataset.

From there, navigate to where you downloaded the .zip file, unarchive it, and execute the following command to train our “wave” model:
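
Assuming the directory layout shown earlier (with dataset/ sitting next to the script), the command would look something like this:

$ python detect_parkinsons.py --dataset dataset/wave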

Examining our output you’ll see that we obtained 71.33% classification accuracy on the testing set, with a sensitivity of 69.33% (true-positive rate) and specificity of 73.33% (true-negative rate).

It’s important that we measure both sensitivity and specificity as:

  1. Sensitivity measures the proportion of actual positives that were correctly predicted as positive.
  2. Specificity measures the proportion of actual negatives that were correctly predicted as negative.
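
In terms of the confusion matrix entries computed in the script, sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP).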

Machine learning models, especially machine learning models in the medical space, need to take utmost care when balancing true positives and true negatives:

  • We don’t want to classify someone as “No Parkinson’s” when they are in fact positive for Parkinson’s.
  • And similarly, we don’t want to classify someone as “Parkinson’s positive” when in fact they don’t have the disease.

Let’s now train our model on the “spiral” drawings:
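
Using the same layout, simply point --dataset at the spiral drawings instead:

$ python detect_parkinsons.py --dataset dataset/spiral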

This time we reach 83.33% accuracy on the testing set, with a sensitivity of 76.00% and specificity of 90.67%.

Looking at the standard deviations, we can also see significantly less variance and a more compact distribution.

When automatically detecting Parkinson’s disease in hand drawings, at least when utilizing this particular dataset, the “spiral” drawing seems to be much more useful and informative.

Fill your toolbox with the right computer vision tools for the job

Deep learning methods are all the rage right now, and yes, they are super powerful, but deep learning doesn’t make other computer vision techniques obsolete.

Instead, you need to bring the right tool to the job. You wouldn’t try to bang in a screw with a hammer, you would instead use a screwdriver. The same concept is true with computer vision — you bring the right tool to the job.

In order to help build your toolbox of computer vision algorithms I have put together the PyImageSearch Gurus course.

Inside the course you’ll learn:

  • Machine learning and image classification
  • Automatic License/Number Plate Recognition (ANPR)
  • Face recognition
  • How to train HOG + Linear SVM object detectors
  • Content-based Image Retrieval (i.e., image search engines)
  • Processing image datasets with Hadoop and MapReduce
  • Hand gesture recognition
  • Deep learning fundamentals
  • …and much more!

PyImageSearch Gurus is the most comprehensive computer vision education online today, covering 13 modules broken out into 168 lessons, with over 2,161 pages of content. You won’t find a more detailed computer vision course anywhere else online, I guarantee it.

The PyImageSearch Gurus course also includes private community forums. I participate in the Gurus forum virtually every day, so it’s a great way to get expert advice, both from me and from the other advanced students, on a daily basis.

To learn more about the PyImageSearch Gurus course + community (and grab 10 FREE sample lessons), just click the button below:

Click here to learn more about PyImageSearch Gurus!

Summary

In this tutorial, you learned how to detect Parkinson’s disease in geometric drawings (specifically spirals and waves) using OpenCV and computer vision. We utilized the Histogram of Oriented Gradients image descriptor to quantify each of the input images.

After extracting features from the input images we trained a Random Forest classifier with 100 total decision trees in the forest, obtaining:

  • 83.33% accuracy for the spiral drawings
  • 71.33% accuracy for the wave drawings

It’s also interesting to note that the Random Forest trained on the spiral dataset obtained 76.00% sensitivity, meaning that the model was capable of predicting a true positive (i.e., “Yes, the patient has Parkinson’s”) nearly 76% of the time.

This tutorial serves as yet another example of how computer vision can be applied to the medical domain (click here for more medical tutorials on PyImageSearch).

I hope you enjoyed it and find it helpful when performing your own research or building your own medical computer vision applications.

To download the source code to this post, and be notified when future tutorials are published on PyImageSearch, just enter your email address in the form below!

Downloads:

If you would like to download the code and images used in this post, please enter your email address in the form below. Not only will you get a .zip of the code, I’ll also send you a FREE 17-page Resource Guide on Computer Vision, OpenCV, and Deep Learning. Inside you'll find my hand-picked tutorials, books, courses, and libraries to help you master CV and DL! Sound good? If so, enter your email address and I’ll send you the code immediately!


18 Responses to Detecting Parkinson’s Disease with OpenCV, Computer Vision, and the Spiral/Wave Test

  1. Monty April 29, 2019 at 11:31 am #

    Thanks a lot Adrian, God bless You.

    • Adrian Rosebrock April 29, 2019 at 2:00 pm #

      You are welcome, Monty!

  2. Yuvrajsingh April 29, 2019 at 2:17 pm #

    Is there a reason for using random forests and not SVM?

    • Adrian Rosebrock April 29, 2019 at 2:38 pm #

      In this particular instance, I found that Random Forests obtained better accuracy than SVMs.

      • Yuvrajsingh April 29, 2019 at 2:50 pm #

        Hmmm.. I understand. By the way great tutorial and thank you for providing such great learning resources specially for CV and ML freshers like me, you really are an inspiration. Will download the code right away and will try to study other feature training methods with it.

        • Adrian Rosebrock April 29, 2019 at 4:12 pm #

          I would definitely encourage you to play with it. Download the code, hack with it, modify a few parameters or insert a new model altogether. It’s one of the best ways to learn!

          • Sanjay April 30, 2019 at 9:27 am #

            Thanks for great tutorial. I’m able to get overall accuracy of 98.9% using deep learning framework…

          • Adrian Rosebrock May 1, 2019 at 11:32 am #

            Nice job, Sanjay! Which network architecture did you use?

  3. Ziri April 29, 2019 at 8:46 pm #

    Do you think training 1D data of the curve can lead to better accuracy ?

    • Adrian Rosebrock May 1, 2019 at 11:39 am #

      Hey Ziri — try it and see, I would love to hear about your findings!

  4. Sanchit Ravindra Deshmukh April 30, 2019 at 8:02 am #

    Hi Adrian, Myself Sanchit, First of all, thank you for all the awesome tutorials, I am new to ML and AI, and I got into it because I saw real-world problems which needed to be solved and I see Computer Vision is a toolkit for solving it,
    I want to know how to learn to identify what’s needed and what’s not, I mean this tutorial starts with a list of libraries required to make this tutorial work, I understand you have years of experience in CV and that may be the factor, but is a general rule of thumb that can be followed by inexperienced people like me to know which libraries to pick up, what’s really required ?
    Can you suggest me a road map to follow to reach this point?
    I am a total rookie please take that into consideration.

    • Adrian Rosebrock May 1, 2019 at 11:33 am #

      It’s awesome that you are interested in the field of computer vision, Sanchit.

      I would strongly encourage you to read through Practical Python and OpenCV. That book will help you learn the fundamentals and enable you to be successful with basic computer vision projects. From there I can provide you with further suggestions.

  5. Sachin Bisht April 30, 2019 at 8:04 pm #

    Really nice tutorial and good explanation on choice b/w ML and DL

    • Adrian Rosebrock May 1, 2019 at 11:23 am #

      Thanks Sachin, I’m glad you found it helpful!

    • DL May 11, 2019 at 3:50 am #

      Yes, it is indeed a very valid point that DL is not the answer to everything. However, the post also seems to be cultivating the myth that DL needs a lot of data, while in reality this is not always the case!

      Took me about 20 minutes to get to 93.3%(!) accuracy on the “Wave” dataset using ResNet as a fixed feature extractor, and 86% on “Spiral”.

      • Adrian Rosebrock May 15, 2019 at 3:08 pm #

        Keep in mind that you’re performing transfer learning via feature extraction. You’re not actually training or fine-tuning the network. By definition that requires less data and enables you to get around not having enough data to reasonably train a model from scratch.

  6. Michael May 4, 2019 at 6:38 am #

    Hi Adrian,

    The number of dimensions produced from a HoG feature could mount up in the array for a larger dataset. Would you have any recommendations for memory management when there is a chance you could be memory constrained?

    I thought pickling and writing to disk would be one option, but do you have better suggestions?

    • Adrian Rosebrock May 8, 2019 at 1:27 pm #

      To reduce memory load you could do a number of things including kernel approximation or even doing incremental learning. Either (or both) methods are worth a shot.
