Randomize the container list for uploads
When we work through the list of containers in alphabetical order, we end up duplicating much of the layer fetching because images from the same service family are processed at the same time. Things like cinder-api, cinder-backup and cinder-volume share many of the same layers. Since we don't ensure that a given layer hash is fetched only once during the multiprocessing, we end up fetching the same layers more than once. By randomizing the order, we reduce the likelihood that we'll be fetching the same family of service containers concurrently.

Change-Id: Ifbcd55de52c9e2283203b1c6e2adeb266d43eca6
Related-Bug: #1844446
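As a rough illustration of the reasoning above (not part of the change itself), the sketch below counts how often neighbouring entries in a list belong to the same service family, before and after `random.shuffle`. Adjacency is only a proxy for "processed concurrently", and the image list, the `family` naming rule and the `adjacent_same_family` helper are all hypothetical.

```python
import random


def family(image_name):
    """Return the service family, e.g. 'cinder' for 'cinder-api'.

    Hypothetical rule: everything before the first dash.
    """
    return image_name.split('-', 1)[0]


def adjacent_same_family(names):
    """Count neighbouring pairs that belong to the same family."""
    return sum(1 for a, b in zip(names, names[1:]) if family(a) == family(b))


# Hypothetical image list in the default alphabetical order.
images = [
    'cinder-api', 'cinder-backup', 'cinder-volume',
    'glance-api',
    'nova-api', 'nova-compute', 'nova-conductor',
]

shuffled = list(images)
random.shuffle(shuffled)  # the same call the change applies to the task list

print('alphabetical:', adjacent_same_family(images))
print('shuffled:    ', adjacent_same_family(shuffled))
```

Alphabetical order is the worst case for this measure, so a shuffle can only match or lower the count. It does not guarantee that related images never run together.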
parent 5bc0ef8fdf
commit 3adfefa13a
@@ -19,6 +19,7 @@ import hashlib
 import json
 import netifaces
 import os
+import random
 import re
 import requests
 from requests import auth as requests_auth
@@ -222,6 +223,7 @@ class ImageUploadManager(BaseImageManager):
         container_images = self.load_config_files(self.CONTAINER_IMAGES) or []
         upload_images = uploads + container_images
 
+        tasks = []
         for item in upload_images:
             image_name = item.get('imagename')
             uploader = item.get('uploader', DEFAULT_UPLOADER)
@@ -236,10 +238,24 @@ class ImageUploadManager(BaseImageManager):
             multi_arch = item.get('multi_arch', self.multi_arch)
 
             uploader = self.uploader(uploader)
-            task = UploadTask(
+            tasks.append(UploadTask(
                 image_name, pull_source, push_destination,
                 append_tag, modify_role, modify_vars, self.dry_run,
-                self.cleanup, multi_arch)
+                self.cleanup, multi_arch))
+
+        # NOTE(mwhahaha): We want to randomize the upload process because of
+        # the shared nature of container layers. Because we multiprocess the
+        # handling of containers, if performed in an alphabetical order (the
+        # default) we end up duplicating fetching of container layers. Things
+        # Like cinder-volume and cinder-backup share almost all of the same
+        # layers so when they are fetched at the same time, we will duplicate
+        # the processing. By randomizing the list we will reduce the amount
+        # of duplicating that occurs. In my testing I went from ~30mins to
+        # ~20mins to run. In the future this could be improved if we added
+        # some locking to the container fetching based on layer hashes but
+        # will require a significant rewrite.
+        random.shuffle(tasks)
+        for task in tasks:
             uploader.add_upload_task(task)
 
         for uploader in self.uploaders.values():
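
The NOTE in the change mentions locking container fetches by layer hash as a possible future improvement. The following is only a toy, thread-based sketch of that idea, assuming a single process. The `LayerFetcher` class, its `fetch` method and the layer hashes are hypothetical, and the real uploader uses multiprocessing, which would need coordination across processes rather than a plain `threading.Lock`.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class LayerFetcher:
    """Hypothetical helper: download each layer hash at most once."""

    def __init__(self):
        self._guard = threading.Lock()  # protects the two containers below
        self._locks = {}                # layer hash -> per-layer lock
        self.fetched = set()            # layer hashes already downloaded

    def _lock_for(self, layer_hash):
        # Create the per-layer lock on first use, under the guard.
        with self._guard:
            return self._locks.setdefault(layer_hash, threading.Lock())

    def fetch(self, layer_hash):
        """Return True if this call did the download, False if it was skipped."""
        # Only one thread at a time may work on a given layer hash.
        with self._lock_for(layer_hash):
            with self._guard:
                if layer_hash in self.fetched:
                    return False
            # Stand-in for the real download of the layer blob.
            with self._guard:
                self.fetched.add(layer_hash)
            return True


fetcher = LayerFetcher()
layers = ['sha256:aaa', 'sha256:bbb', 'sha256:aaa', 'sha256:aaa', 'sha256:bbb']
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(fetcher.fetch, layers))

print(sum(results), 'downloads for', len(layers), 'requests')
```

With a per-hash lock, the order of the task list no longer determines how many duplicate downloads occur, which is why the NOTE treats it as the more thorough fix.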