My review of Microsoft’s data science virtual machine (DSVM) for deep learning

Image credit: OnMSFT

Over the past few months, I’ve been using Microsoft’s Ubuntu deep learning and data science virtual machine (DSVM) for a few projects I’m working on here at PyImageSearch.

At first, I was a bit hesitant (and perhaps even a bit resistant) to giving it a try — I already have a pre-configured Amazon AWS deep learning AMI that (1) I use often and (2) is publicly available to any PyImageSearch reader who wants to utilize it in their own projects.

And while I’m not a fan of Amazon’s AWS user interface, I’ve gotten used to it over the years. I suppose there is a sort of “familiarity” in its clunky complexity.

But I had heard such good things about the Ubuntu DSVM that I decided to test it out.

I was incredibly impressed.

The interface was easier to use. The performance was great. The price was on point.

…and it didn’t hurt that all code from Deep Learning for Computer Vision with Python ran on it without a single change.

Microsoft even graciously allowed me to author a series of guest posts on their Machine Learning Blog and share my experiences using, testing, and evaluating it.

Microsoft is serious about establishing themselves as the “go to” cloud environment for deep learning, machine learning, and data science. The quality of their DSVM product shows that.

In the remainder of today’s special edition blog post I’ll be sharing my thoughts on the DSVM and even demonstrating how to start your first instance and run your first deep learning example on it.

To learn more about Microsoft’s deep learning virtual machine (and whether it’s right for you), keep reading!

A review of Microsoft’s deep learning virtual machine

When I first evaluated Microsoft’s data science and deep learning virtual machine (DSVM) I took all code examples from Deep Learning for Computer Vision with Python and ran each and every example on the DSVM.

The process of manually running each example and inspecting the output was a bit tedious, but it was also a great way to put the DSVM through the wringer and assess it for:

  • Beginner usage (i.e., just getting started with deep learning)
  • Practitioner usage, where you’re building deep learning models and need to quickly evaluate performance
  • Research usage, where you’re training deep neural networks on large image datasets

The codebase to Deep Learning for Computer Vision with Python complements this test perfectly.

The code inside the Starter Bundle is meant to help you take your first step with image classification, deep learning, and Convolutional Neural Networks (CNNs).

If the code ran without a hitch on the DSVM then I could certainly recommend it to beginners looking for a pre-configured deep learning environment.

The chapters + accompanying code in the Practitioner Bundle cover significantly more advanced techniques (transfer learning, fine-tuning, GANs, etc.). These are the techniques a deep learning practitioner or engineer would be applying in their day-to-day work.

If the DSVM handled these examples, then I knew I could recommend it to deep learning practitioners.

Finally, the code inside the ImageNet Bundle requires serious GPU horsepower (the more the better) and I/O performance. Inside this bundle I demonstrate how to replicate the results of state-of-the-art publications (e.g., ResNet, SqueezeNet) on massive image datasets, such as the 1.2-million-image ImageNet dataset.

If the DSVM could handle reproducing the results of state-of-the-art papers, then I knew I could recommend the DSVM to researchers.

In the first half of this blog post I’ll summarize my experience with each of these tests.

From there I’ll show you how to launch your first deep learning instance in the Microsoft cloud and then run your first deep learning code example in the DSVM.

Comprehensive deep learning libraries

Figure 1: The Microsoft Azure Data Science Virtual Machine comes with all packages shown pre-installed and pre-configured for your immediate use.

Microsoft’s deep learning virtual machine runs in their Azure cloud.

It can technically run either Windows or Linux, but for nearly all deep learning projects, I would recommend you use their Ubuntu DSVM instance (unless you have a specific reason to use Windows).

The list of packages installed on the DSVM is quite comprehensive — you can find the full list here. I have included the most notable deep learning and computer vision packages (particularly relevant to PyImageSearch readers) below to give you an idea of how comprehensive this list is:

  • TensorFlow
  • Keras
  • mxnet
  • Caffe/Caffe2
  • Torch/PyTorch
  • OpenCV
  • Jupyter
  • CUDA and cuDNN
  • Python 3

The DSVM team releases a new, updated DSVM every few months with the most up-to-date packages pre-configured and pre-installed. This is a huge testament not only to the DSVM team for keeping this instance running seamlessly (keeping the DSVM free of package conflicts must be a painful process, but it’s totally transparent to the end user), but also to Microsoft’s commitment to making sure users enjoy the experience.

What about GPUs?

The DSVM can run in both CPU-only and GPU instances.

For the majority of the experiments and tests I ran below, I utilized an Ubuntu GPU instance with the standard NVIDIA K80 GPU.

Additionally, Microsoft granted me access to their just-released NVIDIA V100 behemoth, with which I ran a few additional quick spot checks (see results below — it’s fast!)

For all Starter Bundle and Practitioner Bundle experiments I opted to test out Microsoft’s Jupyter Notebook.

The process was incredibly easy.

I copied and pasted the Jupyter Notebook server URL in my browser, launched a new notebook, and within a few minutes I was running examples from the book.

For the ImageNet Bundle experiments I used SSH, as replicating the results of state-of-the-art papers required days of training time, and I personally do not think that is a proper use of Jupyter Notebooks.

Easy for deep learning beginners to use

Figure 2: Training the LeNet architecture on the MNIST dataset. This combination is often referred to as the “hello world” example of Deep Learning.

In my first guest post on the Microsoft blog, I trained a simple Convolutional Neural Network (LeNet) on the MNIST handwritten digit dataset. Training LeNet on MNIST is likely the first “real” experiment for a beginner studying deep learning.

Both the model and dataset are straightforward, and training can be performed on either a CPU or GPU.

I took the code from Chapter 14 of Deep Learning for Computer Vision with Python (Starter Bundle) and executed it in a Jupyter Notebook (which you can find here) on the Microsoft DSVM.
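For context, a LeNet-style model in Keras looks roughly like the sketch below (assuming TensorFlow’s bundled Keras; the 20/50-filter and 500-unit sizes follow the common Caffe-style LeNet variant and may differ from the book’s exact code):

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# A LeNet-style CNN for 28x28 grayscale MNIST digits -- a sketch of the
# kind of model trained in the notebook, not the book's exact code.
model = models.Sequential([
    layers.Input(shape=(28, 28, 1)),
    layers.Conv2D(20, (5, 5), padding="same", activation="relu"),  # CONV => RELU
    layers.MaxPooling2D(pool_size=(2, 2), strides=(2, 2)),         # POOL
    layers.Conv2D(50, (5, 5), padding="same", activation="relu"),  # CONV => RELU
    layers.MaxPooling2D(pool_size=(2, 2), strides=(2, 2)),         # POOL
    layers.Flatten(),
    layers.Dense(500, activation="relu"),                          # FC => RELU
    layers.Dense(10, activation="softmax"),                        # 10 digit classes
])
model.compile(optimizer="sgd",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```

With the DSVM’s pre-installed TensorFlow/Keras, a model like this trains on MNIST in minutes on either CPU or GPU.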

The results can be seen in Figure 2 above.

I was able to obtain 98% classification accuracy after 20 epochs of training.

All other code examples from the Starter Bundle of Deep Learning for Computer Vision with Python ran without a hitch as well.

Being able to run the code in the browser via a Jupyter Notebook on the Azure DSVM (with no additional configuration) was a great experience, and one that I believe users new to deep learning will enjoy and appreciate.

Practical and useful for deep learning practitioners

Figure 3: Taking 2nd place on the Kaggle Leaderboard for the dogs vs. cats challenge is a breeze with the Microsoft Azure DSVM (pre-configured) using code from Deep Learning for Computer Vision with Python.

My second post on the Microsoft blog was geared towards practitioners.

A common technique used by deep learning practitioners is to apply transfer learning and in particular, feature extraction, to quickly train a model and obtain high accuracy.

To demonstrate how the DSVM can be used for practitioners looking to quickly train a model and evaluate different hyperparameters, I:

  1. Utilized feature extraction using a pre-trained ResNet model on the Kaggle Dogs vs. Cats dataset.
  2. Applied a Logistic Regression classifier with grid searched hyperparameters on the extracted features.
  3. Obtained a final model capable of capturing 2nd place in the competition.
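To make step 2 concrete, here is a minimal sketch of the grid-searched Logistic Regression stage. Random vectors stand in for the extracted ResNet features so the snippet stays self-contained; in the real pipeline each row would be the pooled activations from the pre-trained network for one dog/cat image:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split

# Stand-in features: in the real pipeline each row would be the pooled
# activations from a pre-trained ResNet for one dog/cat image.
rng = np.random.RandomState(42)
X = rng.randn(500, 128)
true_w = rng.randn(128)
y = (X @ true_w > 0).astype(int)        # synthetic "dog vs. cat" labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25,
                                          random_state=42)

# Step 2 above: grid search the regularization strength C of a
# Logistic Regression classifier over the extracted features.
params = {"C": [0.001, 0.01, 0.1, 1.0]}
model = GridSearchCV(LogisticRegression(max_iter=1000), params, cv=3)
model.fit(X_tr, y_tr)

print("best C:", model.best_params_["C"])
print("validation accuracy:", model.score(X_te, y_te))
```

Because the classifier is cheap to train, nearly all of the 22 minutes of computation goes into the one-time GPU feature extraction, which is what makes this workflow so fast to iterate on.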

I also wanted to accomplish all of this in under 25 minutes.

The end result was a model capable of sliding into 2nd place with only 22 minutes of computation (as Figure 3 demonstrates).

You can find a full writeup on how I accomplished this task, including the Jupyter Notebook + code, in this post.

But could it be done faster?

After I ran the Kaggle Dogs vs. Cats experiment on the NVIDIA K80, Microsoft allowed me access to their just-released NVIDIA V100 GPUs.

I had never used an NVIDIA V100 before so I was really excited to see the results.

I was blown away.

While it took 22 minutes for the NVIDIA K80 to complete the pipeline, the NVIDIA V100 completed the task in only 5 minutes — that’s a massive improvement of over 340%!

I believe deep learning practitioners will get a lot of value out of running their experiments on a V100 vs. a K80, but you’ll need to justify the price as well (covered below).
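A quick back-of-the-envelope check of that tradeoff, using this post’s Dogs vs. Cats timings and the per-GPU hourly prices discussed later in the post:

```python
# Back-of-the-envelope cost per run, using the per-GPU hourly prices
# covered later in this post ($0.90/hr per K80, $3.06/hr per V100).
k80_cost = 0.90 * (22 / 60)    # 22-minute pipeline on a K80
v100_cost = 3.06 * (5 / 60)    # 5-minute pipeline on a V100

print(f"K80:  ${k80_cost:.2f} per run")
print(f"V100: ${v100_cost:.2f} per run")
```

By this rough math the V100 run is both faster and slightly cheaper (about $0.26 vs. $0.33 per run), so for short experiments the higher hourly rate can pay for itself.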

Powerful enough for state-of-the-art deep learning research

Figure 4: The Microsoft Azure DSVM handles training SqueezeNet on the ImageNet dataset easily.

The DSVM is perfectly suitable for deep learning beginners and practitioners — but what about researchers doing state-of-the-art work? Is the DSVM still useful for them?

To evaluate this question, I:

  1. Downloaded the entire ImageNet dataset to the VM
  2. Took the code from Chapter 9 of the ImageNet Bundle of Deep Learning for Computer Vision with Python where I demonstrate how to train SqueezeNet on ImageNet

I chose SqueezeNet for a few reasons:

  1. I had a local machine already training SqueezeNet on ImageNet for a separate project, enabling me to easily compare results.
  2. SqueezeNet is one of my personal favorite architectures.
  3. The resulting model size (< 5MB without quantization) is more readily used in production environments where models need to be deployed over resource constrained networks or devices.

I trained SqueezeNet for a total of 80 epochs on the NVIDIA K80. SGD was used to train the network with an initial learning rate of 1e-2 (I found the Iandola et al. recommendation of 4e-2 to be far too large for stable training). The learning rate was lowered by an order of magnitude at epochs 50, 65, and 75.
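That learning rate schedule is simple to express as a function of the epoch; a sketch (which could, for example, be handed to Keras’s LearningRateScheduler callback):

```python
def squeezenet_lr(epoch, base_lr=1e-2):
    """Learning rate for a given epoch: start at 1e-2 and drop by an
    order of magnitude at epochs 50, 65, and 75."""
    drops = sum(epoch >= boundary for boundary in (50, 65, 75))
    return base_lr * (0.1 ** drops)

# Epochs 0-49 train at 1e-2; epoch 50 drops to 1e-3, 65 to 1e-4, 75 to 1e-5.
for epoch in (0, 49, 50, 65, 75):
    print(epoch, squeezenet_lr(epoch))
```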

Each epoch took approximately 140 minutes on the K80 so the entire training time was ~1 week.

Using multiple GPUs could have easily reduced training time to 1-3 days, depending on the number of GPUs utilized.

After training was complete, I evaluated the network on a 50,000-image testing set (which I sampled from the training set so I did not have to submit the results to the ImageNet evaluation server).

Overall, I obtained 58.86% rank-1 and 79.38% rank-5 accuracy. These results are consistent with the results reported by Iandola et al.
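For reference, rank-1 and rank-k accuracy can be computed in a few lines of NumPy. This is a sketch of the metric itself; the tiny 3-class example below is only illustrative (ImageNet predictions would have 1,000 columns):

```python
import numpy as np

def rank_k_accuracy(probs, labels, k):
    """Fraction of samples whose true label appears in the top-k predictions."""
    topk = np.argsort(-probs, axis=1)[:, :k]          # indices of k largest probs
    hits = [label in row for row, label in zip(topk, labels)]
    return float(np.mean(hits))

# Tiny 3-class illustration.
probs = np.array([[0.1, 0.7, 0.2],
                  [0.5, 0.3, 0.2],
                  [0.2, 0.3, 0.5]])
labels = np.array([1, 1, 2])

print(rank_k_accuracy(probs, labels, 1))   # rank-1: 2 of 3 correct
print(rank_k_accuracy(probs, labels, 2))   # rank-2: all 3 correct
```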

The full post on SqueezeNet + ImageNet can be found on the Microsoft blog.

Incredibly fast training with the NVIDIA V100

After I trained SqueezeNet on ImageNet using the NVIDIA K80, I repeated the experiment with a single V100 GPU.

The speedup in training was incredible.

Compared to the K80 (~140 minutes per epoch), the V100 was completing a single epoch in 28 minutes, a huge speedup of over 400%!

I was able to train SqueezeNet and replicate the results in my previous experiment in just over 36 hours.

Deep learning researchers should give the DSVM serious consideration, especially if you do not want to own and maintain the actual hardware.

But what about price?

Figure 5: GPU compute pricing comparison for various deep learning + GPU providers.

On Amazon’s EC2, for a p2.xlarge instance, you’ll pay $0.90/hr (1x K80), $7.20/hr (8x K80), or $14.40/hr (16x K80). That is $0.90/hr per K80.

On Microsoft Azure, prices are exactly the same: $0.90/hr (1x K80), $1.80/hr (2x K80), and $3.60/hr (4x K80). This also comes out to $0.90/hr per K80.

Amazon has V100 machines ready and priced at $3.06/hr (1x V100), $12.24/hr (4x V100), $24.48/hr (8x V100). Be prepared to spend $3.06/hr per V100 on Amazon EC2.

The recently released V100 instances on Azure are priced competitively at $3.06/hr (1x V100), $6.12/hr (2x V100), and $12.24/hr (4x V100). This also comes out to $3.06/hr per V100.

Microsoft offers Azure Batch AI pricing, similar to Amazon’s spot pricing, enabling you to potentially get a better deal on instances.

It wouldn’t be a complete (and fair) price comparison unless we look at Google, Paperspace, and Floydhub as well.

Google charges $0.45/hr (1x K80), $0.90/hr (2x K80), $1.80/hr (4x K80), and $3.60/hr (8x K80). This is clearly the best pricing model for the K80, at half the cost of Azure/EC2. Google does not have V100 machines available, from what I can tell. Instead, they offer their own accelerator, the TPU, which is priced at $6.50/hr per TPU.

Paperspace charges $2.30/hr (1x V100) and they’ve got API endpoints.

FloydHub pricing is $4.20/hr (1x V100), but they offer some great team collaboration solutions.

When it comes to reliability, EC2 and Azure stick out. And when you factor in how easy it is to use Azure (compared to EC2) it becomes harder and harder to justify sticking with Amazon for the long run.

If you’re interested in giving the Azure cloud a try, Microsoft offers free trial credits as well; however, the trial cannot be used for GPU machines (I know, this is a bummer, but GPU instances are at a premium).

Starting your first deep learning instance in the Microsoft cloud

Starting a DSVM instance is dead simple — this section will be your quick-start guide to launching one.

For advanced configurations you’ll want to refer to the documentation (as I’ll mainly be selecting the default options).

Additionally, you may want to consider signing up for Microsoft’s free Azure trial so you can test out their Azure cloud without committing to spending your funds.

Note: Microsoft’s trial cannot be used for GPU machines. I know, this is a bummer, but GPU instances are at a huge premium.

Let’s begin!

Step 1: Create a user account or log in.

Step 2: Click “Create Resource” in the top-left.

Figure 6: Microsoft Azure “Create Resource” screen.

Step 3: Enter “Data Science Virtual Machine for Linux” in the search box and it will auto-complete as you type. Select the first Ubuntu option.

Step 4: Configure the basic settings: Create a name (no spaces or special characters). Select HDD (do not select SSD). I elected to use a simple password rather than a key file, but this is up to you. Under “Subscription”, check to see if you have any free credits you can use. You’ll need to create a “Resource Group”; I used my existing “rg1”.

Figure 7: Microsoft Azure resource “Basic Settings”.

Step 5: Choose a Region and then choose your VM. I selected the available K80 instance (NC6). The V100 instance is also available if you scroll down (NC6S_V3). One of my complaints is that I don’t understand the naming conventions. I was hoping they would be named like sports cars, or at least something like “K80-2” for a 2x K80 machine; instead, they’re named after the number of vCPUs, which is a bit confusing when we’re instead interested in the GPUs.

Figure 8: The Microsoft Azure DSVM will run on a K80 GPU and V100 GPU.

Step 6: Review the Summary page and agree to the contract:

Figure 9: The summary page allows you to review and agree to the contract.

Step 7: Wait while the system deploys; you’ll see a convenient notification when your system is ready.

Step 8: Click “All resources”. You’ll see everything you’re paying for here:

Figure 10: The Azure “All Resources” page shows my DSVM and associated services.

If you select the virtual machine, then you’ll see information about your machine (open the screenshot below in a new tab so you can see a higher resolution version of the image which includes the IP address, etc.):

Figure 11: The “Resource Overview” page allows you to see your instance’s details.

Step 9: Connect via SSH and/or Jupyter.

Clicking the connect option will provide you with connectivity details for SSH whether you’re using a key file or password:

Unfortunately, a convenient link to Jupyter isn’t shown. To access Jupyter, you’ll need to:

  1. Open a new tab in your browser
  2. Navigate to https://yourAzureDsvmPublicIP:8000 (the “s” after “http” is important). Make sure you fill in the URL with your public IP.

Running code on the deep learning virtual machine

Now, let’s run the LeNet + MNIST example from my first Microsoft post in Jupyter.

This is a two step process:

Step 1: SSH into the machine (see Step 9 in the previous section).

Change into the ~/notebooks directory.

Clone the repo: $ git clone

Step 2: Fire up Jupyter in your browser (see Step 9 in the previous section).

Click the microsoft-dsvm directory.

Open the appropriate .ipynb file (pyimagesearch-training-your-first-cnn.ipynb).

But before running the notebook, I’d like to introduce you to a little trick.

It isn’t mandatory, but it can save some headache if you’re working with multiple notebooks in your DSVM.

The motivation for this trick is this: if you execute a notebook but leave it “running”, the kernel still has a lock on the GPU. Whenever you run a different notebook, you’ll see errors such as “resource exhausted”.

The quick fix is to place a small “self-destruct” snippet in its very own cell at the very bottom of the notebook:
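One way to implement that self-destruct cell is sketched below. This is an assumption on my part, not necessarily the exact snippet from the original post; it simply asks the running kernel to shut itself down once “Run All” reaches the bottom of the notebook:

```python
# Place this in its very own cell at the very bottom of the notebook.
# A sketch of the trick (the original post's exact snippet may differ):
# when "Run All" reaches this cell, the kernel shuts itself down and
# releases its lock on the GPU.
def shutdown_kernel():
    try:
        from IPython import get_ipython
    except ImportError:
        return False                      # not running under IPython at all
    ip = get_ipython()
    if ip is None or not hasattr(ip, "kernel"):
        return False                      # plain shell, no Jupyter kernel
    ip.kernel.do_shutdown(restart=False)  # graceful shutdown, no restart
    return True

shutdown_kernel()
```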

Now, when you execute all the cells in the notebook, the notebook will gracefully shut down its own kernel. This way you won’t have to remember to manually shut it down.

From there, you can click somewhere inside the first cell and then click “Cell > Run all”. This will run all cells in the notebook and train LeNet on MNIST. From there you can watch the output in the browser and obtain a result similar to mine below:

Figure 12: Training LeNet on MNIST in the Microsoft Azure cloud and the Data Science Virtual Machine (DSVM).

I like to clear all output when I’m finished or before starting new runs after modifications. You can do this from the “Kernel > Restart & Clear Output” menu selection.


Summary

In today’s blog post, I reviewed and discussed my personal experience with Microsoft’s data science and deep learning virtual machine (DSVM).

I also demonstrated how to launch your first DSVM instance and run your first deep learning example on it.

I’ll be the first to admit that I was a bit hesitant when trying out the DSVM — but I’m glad I did.

The DSVM handled each and every test I threw at it with ease, from beginner usage to replicating the results of state-of-the-art papers.

And when I was able to use Microsoft’s new NVIDIA V100 GPU instances, my experiments flew, with a whopping 400%+ speedup over the NVIDIA K80 instances.

If you’re in the market for a deep learning cloud-based GPU instance, I would encourage you to try out Microsoft’s DSVM — the experience was great, Microsoft’s support was excellent, and the DSVM itself was powerful yet easy to use.

Additionally, Microsoft and the DSVM team will be sponsoring PyImageConf 2018, PyImageSearch’s very own computer vision and deep learning conference.

PyImageConf attendees will have free access to DSVM GPU instances while at the conference, allowing you to:

  • Follow along with talks and workshops
  • Train your own models
  • Better learn from the speakers

To learn more about PyImageConf 2018, just click here.

I hope to see you there!


48 Responses to My review of Microsoft’s data science virtual machine (DSVM) for deep learning

  1. Gerardo March 21, 2018 at 3:49 pm #

    Does it makes sense to still be looking at VMs at this point when Docker is taking over a lot of the functionality of VMs?

    • Adrian Rosebrock March 21, 2018 at 3:53 pm #

      I think there might be a bit of confusion regarding the terminology. Microsoft calls their DSVM a “virtual machine” but it’s actually more similar to a super lightweight hypervisor or even container virtualization. It still runs as close “to the metal” as it possibly can, giving you access to the GPU, better CPU utilization, I/O, etc.

      • Cat Woman March 21, 2018 at 4:41 pm #

        Not true. It is actually a virtual machine. It is not a container.

        • Adrian Rosebrock March 22, 2018 at 9:47 am #

          See my reply to “TimY”. I was addressing what most people think of when they hear the term “VM”.

      • TimY March 21, 2018 at 9:55 pm #

        This is splitting hairs.

        Docker for Mac and Windows actually runs a thin VM layer (Alpine Linux) under the hood. So Docker on Linux is the most ‘pure’ non-VM scenario.

        The key to Docker is not VM or not but the ability to easily layer system level dependencies in such a flexible and maintainable way.

        • Adrian Rosebrock March 22, 2018 at 9:39 am #

          I agree, it’s splitting hairs at this point. But most people will tend to think of VMs as big bloated images that typically run on VMWare or VirtualBox. VMs encompass a wide range of applications and implementations. I was commenting on what most people think of when they hear the term “VM” and didn’t want others to get confused.

  2. Johan Ooghe March 21, 2018 at 4:07 pm #

    Very interesting document !!!!! Thanks !!!

  3. Yovel March 21, 2018 at 4:12 pm #

    Thanks for the great post!
    I actually had quite a bad experience with Azure: I used it for the first time to compete in Kaggle, but on the second day the machine just wouldn’t connect over SSH.
    The whole affair cost me about two thirds of a work day, in which I couldn’t access the previous day’s work.
    After the fact I thought that there must be a way to make a code directory easily available from any machine, thus serving two purposes:
    1. Backup, in cases like mine
    2. Ability to develop and test the code on a low-end machine, and just activate the high-end machine for the final run.
    It should be easy, but I couldn’t find any obvious way to do so.
    Does anyone know how to do that?

    • Adrian Rosebrock March 22, 2018 at 9:49 am #

      I’m sorry to hear about the bad experience, Yovel. That’s definitely a bummer. Did you try reaching out to Microsoft’s support directly?

      • Yovel March 24, 2018 at 4:07 pm #

        Yeah, they took their time… It was weird; there was probably a problem with my contact information.
        My question still stands though: do you know of any such way? It would be very helpful to anyone running code. Right now I’m sending from one computer to the other using FTP, which is *very* fast, but it gets tedious after some time, and keeping a shared directory would be much more helpful.

        • Adrian Rosebrock March 26, 2018 at 11:01 am #

          Take a look at Tom’s reply to your question.

    • Tom March 22, 2018 at 2:29 pm #

      You can create a storage resource that different VMs can mount so it appears like a local folder. There’s also a desktop client (Azure Storage Explorer) to upload/download files.

  4. Tom March 21, 2018 at 5:12 pm #

    Good coverage Adrian, I would just add that with the free trial you can request access to GPU VMs by sending a support request to Azure customer service. When on a paid plan you still need to contact them to get access to the NCxS_V3 machines. Plus, if you want to leave a process running for a long time, I recommend doing it in a ‘screen’ session that is spawned from the SSH terminal but runs independently. Cheers, Tom

    • Adrian Rosebrock March 22, 2018 at 9:48 am #

      Thanks for the tip, Tom!

      I also concur — for long running sessions absolutely use screen.

  5. Yovel March 21, 2018 at 5:19 pm #

    What environment did you use?
    The main problem with their environment is that their OpenCV doesn’t support reading from video.
    I specifically used their conda environment, but I’m pretty sure that the others don’t support it either.

    • Yovel March 21, 2018 at 5:27 pm #

      I mean, I could just install from source, but then I would lose the whole advantage of the pre-built environment.
      The conda environment didn’t cooperate, and even when I installed from conda-forge it didn’t install a video-supporting version.

      • Adrian Rosebrock March 22, 2018 at 9:46 am #

        I used the Python 3 environment included with the DSVM. I didn’t try accessing a video file or video stream from the DSVM so I can’t comment on that.

        I would also disagree with your statement of saying “lose the whole advantage of the pre-built environment”. That’s not entirely true. If you need to compile and install 1% of the software on the machine to achieve additional functionality and not change 99% of the other packages then I would say you are not losing very much.

        That’s not to say that compiling OpenCV isn’t a bit of a task. I do understand where you’re coming from.

        • Yovel March 24, 2018 at 4:14 pm #

          Well, theoretically it’s easier, but you know how it is… You reinstall one thing and you ruin everything else’s dependencies for some obscure reason.
          In the end I exported my native conda environment to run my code (if it will ever interest anyone).
          I just hoped you had already done it the “right” way.
          I will try to update if I ever do it the right way myself.

        • spincraft July 5, 2018 at 11:38 pm #

          Thanks for this info. I would be very interested to learn how to ‘process’ a video stream with Azure. I think the MS IoT hub would be too slow but I’ve not tried that route.

  6. Prafull March 21, 2018 at 5:49 pm #

    Link to the third blog post, titled “Training state-of-the-art neural networks in the Microsoft Azure cloud” points to the same page as previous one “22 minutes to 2nd place…”.

    • Adrian Rosebrock March 22, 2018 at 9:43 am #

      Thanks Prafull, I have fixed the links 🙂

  7. Nicole Finnie March 22, 2018 at 2:17 am #

    Hey Adrian, very cool, thanks for sharing the information. Somehow I got used to AWS EC2; their GUI designer must have travelled from 20 years ago :p. I’ve stopped using AWS since I tried IBM PowerAI 8 with 4 P100 GPUs. Maybe I’m a little biased to say that?

    • Adrian Rosebrock March 22, 2018 at 9:37 am #

      Ha! I don’t think you’re biased at all 😉

  8. Abkul March 22, 2018 at 7:23 am #

    Good work, Adrian, on giving a wide menu of existing approaches. My only hope is that Microsoft will not entice people with its DSVM only to do “vendor lock-in”.

    • Adrian Rosebrock March 22, 2018 at 9:34 am #

      Hey Abkul — sorry for any confusion here, but the DSVM is meant to be executed in the Azure environment just like Amazon’s AMIs are meant to be executed only in the Amazon ecosystem. You can’t move the two between the platforms.

  9. Jegama March 22, 2018 at 4:10 pm #

    It’s super painful to get the GPUs… How is this “easier to use” than AWS?

    • Adrian Rosebrock March 26, 2018 at 11:00 am #

      The user interface itself is significantly easier to use than AWS’s.

  10. deepMax April 21, 2018 at 1:04 am #

    I use the DSVM all the time… I can access it anytime from anywhere. I also use Azure File Share and have all my ML files and data in one place; I no longer need to copy or move files from one VM to another, regardless of Windows or Linux.

  11. Madhu July 5, 2018 at 2:18 pm #

    Can the DSVM utilize built-in models like we see in ML studio? I prefer ML Studio as it provides their own models like boosted decision trees and ANN’s and a very clear workflow from training to evaluating and then deploying models that can be called via web services. It also allows us to visualize each step of our ML pipeline. However, ML Studio also lacks certain features mainly due to the fact that it takes a very high-level approach, so I’m wondering if DSVM would work better when there is more customization involved (e.g. freezing NN inputs to constant values during model-scoring which is not possible on ML studio or repetitively scoring a model until a certain output constraint is satisfied) while still being able to use Azure’s native models. What are your thoughts?

    PS. Thanks for the tutorial! Extremely helpful as always.

    • Adrian Rosebrock July 10, 2018 at 9:00 am #

      I’ve only played with ML Studio. I don’t have enough experience to fully comment on it. But I do know that you can spin up an Azure instance (including the DSVM) and then connect it to ML Studio and their experiment workbench and run experiments that way. But again, I’m not well versed in ML Studio so I don’t think I’m equipped to answer that question.

  12. Torben B. Jensen July 27, 2018 at 6:01 am #

    Thanks a lot for the guide. I am trying to replicate it, and I have succeeded in:
    – getting a K80 VM in Azure
    – interfacing it through SSH using PuTTY
    – connecting through Jupyter, and viewing code in it

    I have so far not been able to accomplish the following:
    – Run the example above and get the output shown above. I have tried a lot; here is what I do:
    1. Under ‘Kernel’ choose ‘Restart & Clear Output’
    2. Click in the top cell and under ‘Cell’ choose ‘Run All’

    First, very little happens: under the first cell I get a warning:

    and then, after a while:
    ConnectionResetError Traceback (most recent call last)

    14 dataset = datasets.fetch_mldata(“MNIST Original”)
    15 data =

    and then a whole lot of other traceback.

    What could be the problem?

    • Adrian Rosebrock July 27, 2018 at 6:24 am #

      The scikit-learn library uses the MLData website to download datasets. MLData hosts these datasets for free. Sometimes the website does go down, which is why you are seeing an error related to the connection. To verify whether the website is down, you can try to load it in your browser. The website does come back though 🙂

      In the meantime you can use the Keras helper utilities to load the MNIST dataset. See this post for an example of such. If you swap out the scikit-learn MNIST helpers for the Keras helpers you’ll be good to go.

    • Torben B. Jensen July 30, 2018 at 11:03 am #

      I discovered that my connection to Jupyter was completely insecure, as the browser flatly refused to use the https://[ip-address]:8000 you recommend above; it just became //[ip-address]:8000, with a warning that the connection was insecure. So it seems that it didn’t use any SSH tunnel or anything. However, I found a guide here which made it possible for me to connect without the warning, and I assume that it is now actually tunnelling through SSH:

      • Adrian Rosebrock July 31, 2018 at 9:49 am #

        Nice, thanks for sharing Torben!

    • Davis Tsai October 25, 2018 at 11:48 pm #

      When using Jupyter to run “pyimagesearch-training-your-first-cnn” on an NC6:
      if someone got the same error as me, it can be fixed by changing the code as follows.
      I don’t know why.

      from sklearn import datasets
      dataset = datasets.fetch_mldata("MNIST Original")

      change to:

      from sklearn.datasets import fetch_mldata
      dataset = fetch_mldata("MNIST Original")

      • Adrian Rosebrock October 29, 2018 at 1:46 pm #

        Wow, that’s incredibly odd. I can’t imagine how that would make a difference but thanks for sharing Davis 🙂

  13. Torben B. Jensen July 27, 2018 at 6:13 am #

    Hi again,

    Another thing I am struggling with is how to interface the VM in Azure through a GUI. This is for convenience, and for transferring the DL4CV code and data from Google drive, where I have it, to the VM, using the Firefox browser. It is virtually impossible to upload them from my local pc because I have a very slow upload speed.

    I found some guidance on the internet, and installed something called xrdp and ubuntu desktop on the VM. But when I try to connect via Remote Desktop from Windows, I get a grey desktop for a few seconds, and then it closes down again. Maybe you could point to a good guide on how to get a remote desktop running towards the dsvm ?

    • Adrian Rosebrock July 27, 2018 at 6:26 am #

      I haven’t tried to use a GUI interface for the DSVM for an extended period of time and I have never used Windows to access the DSVM so I unfortunately do not have any suggestions for remote desktop. I would contact Azure support and mention the concern to them — I’m sure they have many guides on how to remote desktop to instances in the Azure cloud.

      • Zoltán Szalontay September 23, 2018 at 11:30 am #

        1. sudo apt-get install xrdp (on the VM)
        2. On the Azure portal, go to the DSVM’s Networking settings and add a new inbound rule to accept traffic on port 3389 (the well-known Remote Desktop port) for both TCP and UDP from any address.
        3. Press Connect (to VM) on the toolbar and save the RDP connection file, or just copy the FQDN of your machine and append “:3389” to it when connecting to it from a PC.
        4. Launch the RDP file.
        5. Alternatively, you could run mstsc.exe with IP_or_DNS_address_of_your_VM:3389
        6. Please note that unlike a PC RDP connection, you’ll have to reauthenticate yourself within that session.

        Please note that XRDP supports the US keyboard layout by default. However, there are workarounds by defining new layouts; e.g., this post describes how to switch to the Slovenian layout. Following that, I managed to switch to Hungarian, but the logon screen layout will still be US:

    • Zoltán Szalontay September 28, 2018 at 5:08 am #

      Hi Torben, it’ll work only if you add a new network rule to let your DSVM accept RDP packets on port 3389.

      To do that, follow the steps in this guide. The networking config is at #7.

      Please note that the RDP login screen uses the US keyboard layout by default, regardless of your locale settings.

      Once connected, you will most probably find that the clipboard is not working. This is a known XRDP problem. I read that disabling all other RDP channels, like printer or audio, helps, but it didn’t in my case. Still investigating.

      Pro tip: you could also subscribe to Azure Storage (and select File Storage). You’ll get an SMB file share that you can access from either your local PC or the cloud. I am using this to transfer training sets and project files.


  14. Torben B. Jensen September 23, 2018 at 1:28 pm #

    Hello Adrian
    I successfully ran the file on my VM in Azure, but it took 35 sec per epoch. It also noted at the start: “Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA”.

    Both the time per epoch and the message above seem to indicate that the TensorFlow binary is not using the K80 GPU that was supposed to be in the VM, and that the binary was not compiled to use some special CPU instructions. So it appears I am using the wrong TensorFlow binary? I didn’t install any of this on my own; everything was preinstalled.
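    One quick way to check what TensorFlow actually sees is to list its physical devices (a sketch; the code guards against missing or older TensorFlow installs, and an empty list means no GPU is visible):

```python
# Does TensorFlow see a GPU on this machine? An empty list means no.
try:
    import tensorflow as tf
    config = getattr(tf, "config", None)
    lister = getattr(config, "list_physical_devices", None) if config else None
    gpus = lister("GPU") if lister else []  # requires TF >= 2.1 for this API
except ImportError:
    gpus = []  # TensorFlow is not installed at all

print("GPUs visible to TensorFlow:", gpus)
```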

  15. Torben B. Jensen September 28, 2018 at 4:31 pm #

    Hello Adrian, congrats on the wedding!
    I just found the answer to my previous question: in order to utilize the GPU, you have to create a ‘DLVM’ (Deep Learning Virtual Machine) rather than a ‘DSVM’ (Data Science Virtual Machine). There is a surcharge of approximately 30% on the DLVM compared to the DSVM. I tried the NC6 / K80 VM with the LeNet code shown above. It ran at 5 sec per epoch.

    • Adrian Rosebrock October 8, 2018 at 12:20 pm #

      Thanks Torben! And thank you for providing the answer to your question.

  16. Nabila Miriam Abraham December 6, 2018 at 7:05 pm #

    Hi Adrian! Thanks for this guide. Unfortunately, I am unable to load the Jupyter Notebook interface: I have started the VM and tried to open https://myip:8000/, but the page never loads. Is there something obvious that I might have missed?

    • Adrian Rosebrock December 11, 2018 at 1:08 pm #

      The port of the server may be blocked from serving web traffic. I would contact Microsoft support and ask them if the port is blocked.
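      If the port does turn out to be blocked at the network layer, one way to open it yourself is with the Azure CLI (a sketch; “my-resource-group” and “my-dsvm” are placeholders for your own resource group and VM names):

```shell
# Add an inbound network security group rule allowing traffic
# to JupyterHub's port 8000 on the VM.
az vm open-port --resource-group my-resource-group --name my-dsvm --port 8000
```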
