Parallel Execution for Contests Part 1 — Powershell

This is the first part of the Parallel Execution for Contests series. For your convenience you can find other parts using the links below :
Part 1 — Powershell
Part 2 — C#
Part 3 — C++

Hello! Today we start a new series about running code in parallel. This is not intented to be neither a concurrent programming course nor a tutorial how to use MPI. I am going to show a few simple programs which are very useful when solving algorithmic tasks.

Introduction

When you take part in a contest similar to Deadline24 you need to solve multiple problems on your machine. These are algorithmic problems so you need to come up with a feasible solution, both in time and space complexity. However, when you have a slower algorithm, you might want to catch up using the power of your machine — if you have a powerfull cluster then you can easily solve tasks even using less optimized algorithms.

Since most of todays’ CPUs are multicore it is fairly easy to speed up execution time using multiple threads. Unfortunately, sometimes it is not easy to change algorithm to multithreaded and do not loose precious seconds on threads synchronization. You might try introducing interthreads communication using MPI but when there is not enough time you probably want to introduce parallelism with as few changes as possible.

Fortunately, contests like Deadline usualy consists of multiple test cases for each task. We can utilize this to simply execute multiple instances of our program in parallel. This approach doesn’t require any threads synchronization and is very effective because it is the operating system who will take care of most of the stuff. All we need to do is run multiple instances of application and pass different input files for each of them. Let’s see how to do that.

Powershell features

We assume that we have a single threaded program called Solver.exe which reads some data from standard input and writes to standard output. We also have multiple input files: input1.txt, …, input10.txt. All we want to do is run application, pass input file and capture output. We can do the following using Powershell:

We can also solve all files with one command:

This is very easy, however, it runs one by one. Another instance of Solver will be run only after previous one finished its work. This means that most of our cores will stay idle during the execution.

Fortunately, there are simple tricks to run code in parallel using Powershell. Let’s see them.

Workflows

We start with workflows. 64 bit Powershell is able to run code in parallel using a feature called Workflow like this:

We can see two importants things here. First, we introduce a block of code using workflow keyword. Next, we use foreach -parallel option to run code in parallel. You can see some description here

This approach is simple and works out of the box. However, there are some drawbacks:

  • we cannot simply stop the script since CTRL+C might not work
  • we cannot write to the console so simple echo "I am here!" will not work
  • we cannot easily control number of threads running the code
  • requires 64 bit Powershell

So we can see that while it works it might not suit all our needs.

Background jobs

Powershell is capable of running some jobs in the background using start-job commandlet. We can use this to run all jobs in background and wait for them:

This works as expected, we can also query the job result and do some magic with it. We could also await for specific jobs and implement custom thread pooling. Unfortunately, there are drawbacks as well:

  • it runs in different directory
  • it is slow — even after all jobs are done I need to wait few seconds to see the results (when writing to files)
  • we are unable to write to the console (since these are background jobs)

Parallel foreach

Previous methods were using only built-in features of Powershell. Right now we will take a look at different method which requires additional code.

If you google carefully you will find this script on Technet. It basically does everything we need to execute code in parallel. This code goes as follows:

It basically has all the things we would like it to have. We are able to specify number of threads, we also can easily pass script block and write to console. We execute this code in the following way:

When the task is finished its output will be print out to the console so we can easily verify our commands by echoing them. This is especially useful when we don’t have much time to debug scripts.

Summary

We now know how to execute multiple instances of our program. In the next parts we are going to see how to introduce parallelism to our algorithms by solving every test case on different thread using thread pools.