Why a Parallel Computing Cluster?
Parallel computing simply refers to breaking down a larger problem into smaller, independent parts, and executing those smaller parts simultaneously, or in parallel, on different processors. There are many resources available that describe the different types of parallel computing and the theory behind each, which will not be discussed here.
Overall, parallel computing follows the "more helping hands" theory: in general, the more processors running in parallel to solve a given problem, the quicker the final execution. Although parallel processing can occur at the instruction (or even the bit) level within a CPU, the focus for this project will be on Task Parallelism, or running multiple tasks in parallel across multiple processors, on the same or a related set of data.
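As a rough illustration of task parallelism, the sketch below splits one larger job (counting primes in a range) into independent sub-ranges and hands each to a separate worker process. It uses only Python's standard library and runs on the cores of a single machine, so it is not the cluster setup itself. The function names (count_primes, split_range, count_primes_parallel) and the choice of four workers are assumptions made for this example.

```python
import math
from concurrent.futures import ProcessPoolExecutor


def count_primes(bounds):
    """Count the primes in the half-open range [start, stop)."""
    start, stop = bounds
    count = 0
    for n in range(max(start, 2), stop):
        # n is prime if no integer from 2 up to sqrt(n) divides it evenly
        if all(n % d for d in range(2, math.isqrt(n) + 1)):
            count += 1
    return count


def split_range(start, stop, parts):
    """Split [start, stop) into at most `parts` contiguous, non-overlapping chunks.

    Assumes stop > start and parts >= 1.
    """
    step = -(-(stop - start) // parts)  # ceiling division
    return [(lo, min(lo + step, stop)) for lo in range(start, stop, step)]


def count_primes_parallel(start, stop, workers=4):
    """Run count_primes on each chunk in its own process and add up the results."""
    chunks = split_range(start, stop, workers)
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(count_primes, chunks))


if __name__ == "__main__":
    # Each worker handles one independent slice of the same overall problem.
    print(count_primes_parallel(0, 100_000))
```

Each chunk can be processed without any knowledge of the others, which is what makes the problem easy to spread across processors. The same split, compute, and combine pattern applies whether the workers are cores on one board or separate boards in a cluster.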
Although historically parallel computing has involved hundreds or even thousands of processors running in parallel, often in very expensive supercomputers, in recent years the model has also been shown to scale down to fewer than a dozen or so small, inexpensive processors. This has been made possible by the availability of low-cost processors like the single board Raspberry Pi, and widely available open source software.
Nonetheless, the model of breaking down a computing problem into smaller parts that execute in parallel, versus the traditional model of a single process or task running in a linear fashion, is the same regardless of whether the execution is across thousands of processors in an expensive supercomputer, or across a handful of inexpensive single board computers. The goal of this project is to build the latter, using widely available open source tools.
In addition, given the small size and light weight of a handful of Raspberry Pi computers, another goal of this project is portability: being able to take the computing cluster -- which you can hold in one hand -- and easily move it to a new location, with or without access to any other network. To facilitate access, a reference for adding a Wireless Access Point (WAP) to the cluster network is also included, so that it can be managed from any nearby computer with WiFi. The only infrastructure requirements are a 6-port USB power hub and a single electrical outlet.
Building a Mini Cluster
The goal of this project is to build a small Parallel Computing Cluster of four (4) mid-range single board computers (e.g., Raspberry Pi 3), using free open source software. It is assumed that the reader knows how to get a single board computer booted and running with the Linux operating system (the examples included here use Raspberry Pi OS 11, based on Debian Bullseye). If not, there are many resources available on the public Internet that describe how to do that.
The next step -- Part 2: Hardware / Networking -- will cover the hardware used in this example cluster, how to configure a cluster of single board computers to work together on a separate network with wireless access, and how to set up shared storage between them.