Distributed Checkpointing with Docker Containers in High Performance Computing
University West, Department of Engineering Science, Division of Computer, Electrical and Surveying Engineering.
University West, Department of Engineering Science, Division of Computer, Electrical and Surveying Engineering.
2017 (English). Independent thesis, Basic level (degree of Bachelor), 10 credits / 15 HE credits. Student thesis.
Abstract [sv, translated to English]

Container virtualization has become increasingly widely used since updates to the cgroups and namespace features were released in the Linux kernel. At the same time, industrial high-performance computing clusters suffer from expensive licensing costs that could be managed through virtualization. In this thesis, experiments were designed to find out whether Docker's checkpoint feature, which is still under development, could be used in industrial computing clusters. To demonstrate this concept and its ability to pause distributed containers running parallel processes, the well-known NAS Parallel Benchmark (NPB) was run distributed across two test machines. Containers were then paused in different orders, and Docker managed to resume the benchmark without problems both locally and distributed. If the order in which containers are written to disk (checkpointed) is considered carefully, the benchmark can be resumed without problems locally on the same machine. Finally, we also show that distributed containers can be resumed on a machine other than the one where they started, with a high success rate. Docker's performance, possibilities and flexibility make it well suited to future industrial high-performance clusters, where applications may very well run in containers instead of the traditional way, directly on the hardware. Using Docker containers makes it possible to address the problem of expensive licensing costs and prioritization.

Abstract [en]

Lightweight container virtualization has gained widespread adoption in recent years following updates to the namespace and cgroups features in the Linux kernel. At the same time, the industrial High Performance Computing (HPC) community suffers from expensive licensing costs that could be managed with virtualization. To determine whether Docker can suspend distributed containers running parallel processes, experiments were designed to find out if its experimental checkpoint feature is ready for this community. To prove the concept, we ran the well-known NAS Parallel Benchmark (NPB) inside containers spread over two systems under test. By pausing containers and unpausing them in different orders, we were able to resume the benchmark. We further demonstrate that, if the order in which containers are checkpointed and restored is considered carefully, the checkpoint feature is also able to resume the benchmark successfully. Finally, restoring distributed containers running the benchmark on a different system from the one where they started was shown to work with a high success rate. Our tests demonstrate the performance, possibilities and flexibility of Docker's future in the industrial HPC community. This might very well tip the community towards running their simulations and virtual engineering applications inside containers instead of on native hardware.
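
The ordering concern described in the abstract can be illustrated with a minimal sketch. It only builds the Docker CLI invocations (`docker checkpoint create` and `docker start --checkpoint`, both of which require Docker's experimental mode and CRIU) for a list of containers. The container names, the checkpoint name, and the choice to restore in the reverse of the checkpoint order are assumptions made for illustration; the abstract does not state which order the thesis found to work.

```python
import subprocess
from typing import List, Sequence


def checkpoint_commands(containers: Sequence[str], checkpoint: str) -> List[List[str]]:
    """Build one `docker checkpoint create` command per container, in the given order."""
    return [["docker", "checkpoint", "create", name, checkpoint] for name in containers]


def restore_commands(containers: Sequence[str], checkpoint: str) -> List[List[str]]:
    """Build one `docker start --checkpoint` command per container.

    Assumption (not from the thesis): containers are restored in the
    reverse of the order in which they were checkpointed.
    """
    return [
        ["docker", "start", "--checkpoint", checkpoint, name]
        for name in reversed(containers)
    ]


def run_all(commands: Sequence[Sequence[str]], dry_run: bool = True) -> None:
    """Print each command, or execute it when dry_run is False."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops at the first failing container.
            subprocess.run(list(cmd), check=True)


if __name__ == "__main__":
    # Hypothetical container names for an NPB run with one master and one worker.
    order = ["npb-worker", "npb-master"]
    run_all(checkpoint_commands(order, "cp1"))
    run_all(restore_commands(order, "cp1"))
```

With `dry_run` left at its default, the script only prints the commands, so different orderings can be inspected before anything is run against a real Docker daemon.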

Place, publisher, year, edition, pages
2017. p. 19
Keywords [en]
Industrial HPC, HPCC, Suspend, Pause, Checkpoint, Docker, CRIU
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
URN: urn:nbn:se:hv:diva-11645; Local ID: EXD500; OAI: oai:DiVA.org:hv-11645; DiVA, id: diva2:1144045
Subject / course
Computer engineering
Educational program
Course
Supervisors
Examiners
Available from: 2017-10-06. Created: 2017-09-25. Last updated: 2018-05-28. Bibliographically approved.

Open Access in DiVA

fulltext (901 kB), 1278 downloads
File information
File name: FULLTEXT01.pdf
File size: 901 kB
Checksum (SHA-512): c4a9ab447aa3a534b9b6961fa0da91bc9ca49910ad98f7bc1eafadfd0ab54aa0d07e6975eb323267e277cdb24f43ebe75047cd528af40ce393e17d1c855890cd
Type: fulltext
Mimetype: application/pdf

Total: 1281 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 5131 hits