From 3ad02e834e6afada31531554b1aeb501c21511a5 Mon Sep 17 00:00:00 2001 From: Fabien Boucher Date: Fri, 10 May 2013 17:58:18 +0200 Subject: [PATCH 1/3] Update README file. --- README-filler.rst | 40 ----------- README.md | 179 ++++++++++++++++++++++++++++++++++++++++++++++ README.rst | 1 - 3 files changed, 179 insertions(+), 41 deletions(-) delete mode 100644 README-filler.rst create mode 100644 README.md delete mode 100644 README.rst diff --git a/README-filler.rst b/README-filler.rst deleted file mode 100644 index 475dd7a..0000000 --- a/README-filler.rst +++ /dev/null @@ -1,40 +0,0 @@ -swift-filler.py -============== - -Author(s): Fabien Boucher - -Description ------------ - -This script aims to fill in a swift cluster with random -data. -A custom amount of account will be created against keystone -then many containers and objects will be pushed to those accounts. -Accounts and objects will be flavored with some random meta data. - -Two indexes will be pickled to FS to store first which accounts has been -created (index_path) and second which containers/objects + MD5 and meta data -has been stored (index_containers_path). - -This script use eventlet to try to speedup the most -the fill in process. - - -Usage ------ - -To fill the cluster :: - - python swift-filler.py --create -a 10 -u 1 -f 5 -c 2 -s 5000 -l - -The above command will create 10 accounts with one user in each (in keystone) -then 2 containers will be created with 5 files in each. Each file will -be generated with a size between 1024 Bytes to 5000 Bytes. - -To delete data :: - - python swift-filler.py --delete - -Read pickled index file (index_path) to process a deletion -of objects/containers store in swift for each account then delete -accounts. diff --git a/README.md b/README.md new file mode 100644 index 0000000..0c8e48d --- /dev/null +++ b/README.md @@ -0,0 +1,179 @@ +A massive Swift syncer +====================== + +The purpose of this tool is to give you a way to migrate +the entire content of a swift cluster to an another by +using swift REST API. + +The swiftsync project come with two tools: + + - swfiller + - swsync + +The first one will ease swsync testing by the way +you will be able to populate a test cluster easily +by filling it quickly with heterogeneous data. + +The second is the syncer. Basically it will read +origin swift cluster account by account and +perform the data synchronization by taking care +of avoiding synchronization for data already up to +date. + +Run unit tests +-------------- + +Unitests can be run quickly just after cloning +the project from github. + + $ sudo pip install tox + $ cd swiftsync + $ tox + +Run functional tests +-------------------- + +You can easily start functional tests for swsync +tool by installing two swift on devstack and setting +the ResellerAdmin role to the admin user in keystone +(refer to swsync usage later in this readme) and start +nose as follow : + + $ nosetests -v --nologcapture tests/functional/test_syncer.py + +swiftsync installation +---------------------- + +Prepare a python virtual environment and start +setup.py install. + + $ virtualenv $HOME/venv + $ . $HOME/venv/bin/activate + $ python setup.py install + +swfiller usage +-------------- + +This script aims to fill in a swift cluster with random +data. A custom amount of account will be created against keystone +then many containers and objects will be pushed to those accounts. +Accounts and objects will be flavored with some random meta data. + +Two indexes will be pickled to filesystem to store first +which accounts has been created and second +which containers/objects + MD5 and metadata +has been stored. + +This script use eventlet to try to speedup the most +the fill in process. Default concurrency value can be modified +in configuration file. + +Before using the filler you need to add a configuration file +by copying the sample one (etc/config-sample.ini) and then +editing keystone_origin address and keystone_origin_admin_credential +(tenant:username:password). Be sure to use an user with keystone +admin role to let the filler create tenants and users. + +Which kind of randomization the filler will add to data: + +* random account name +* random metadata on account (some will contain unicode) +* random container name +* random metadata on container (some will contain unicode) +* random object name with random garbage data in it +* random metadata on object (some will contain unicode) +* some object will be created with empty content + +The command below will fill in the swift cluster: + + $ swfiller --create -a 10 -u 1 -c 10 -f 10 -s 1024 --config etc/config.ini + +Meaning of the options are as follow: + +* --create : creating mode (there also a deletion mode to clean tenant and data) +* -a : amount of account or tenant to create +* -u : amount of user to create or each account +* -c : amount of container to create for each account +* -f : amount of file to create in each container +* -s : the maximum size for file to create (in Bytes) + +As mention above there is also a deletion mode that use +index files created during fill in operations. Index files +will keep a list a user and tenant we have created in keystone. +So to clean all account and data the create mode has created +use the deletion mode: + + $ swfiller --delete --config etc/config.ini + +swsync usage +------------ + +The synchronization process will not handle keystone synchronization. +Database synchronization will need to be done by configuring +the replication capabilities of the keystone database. + +The user used by the sync tool will need to be able to perform +API operations on each account for both origin and destination +cluster. To do that the user must own the role ResellerAdmin. + +Adding the role ResellerAdmin to admin user in keystone is +straightforward by using the following command (be sure +to have properly set your environment variables before): + + $ keystone user-role-add --tenant admin --user \ + admin --role ResellerAdmin + +swsync will replicate : + +* account and account metadata +* container and container metadata +* object and object metadata + +The way it will act to do that is as follow: + +* will synchronize account metadata if they has changed on origin +* will delete container on destination if no longer exists on origin +* will create container on destination if not exists +* will synchronize destination container metadata if not same + as origin container. +* will remove container object if no longer exists in origin container +* will synchronize object and metadata object if the last-modified header + is the lastest on the origin. + + +To start the synchronization process you need to edit +the configuration file and configure keystone_dest +and keystone_dest_credentials (Fix it). Then to start +the process simply : + + $ swsync --config etc/config.ini + +As mention above the sync process won't +replicate origin keystone accounts to the destination +keystone so swift accounts on destination will +not work until you start a keystone database synchronization. + +swsync will take care of already synchronized containers/objects. When +re-starting swsync it will only synchronize data that have changed. +The tool can for instance be launched by a cron +job to perform diff synchronization each night. + +Swift Middleware last-modified +------------------------------ + +A swift middleware has been written to speedup the +synchronization process by adding a last modified metadata +to container header. The idea behind this is to only +process the container whether the timestamp is greater +on origin avoiding uselessly walking through container. +But the swsync tool does not use this meta for +now. If you want to contribute feel free to add it ! + +Synchronization performances +---------------------------- + +Things to considers +------------------- + +swfiller and swsync are not designed to work +with swift v1.0 authentication. diff --git a/README.rst b/README.rst deleted file mode 100644 index 4afc2f7..0000000 --- a/README.rst +++ /dev/null @@ -1 +0,0 @@ -A massive Swift syncer. From 2bf9db1961eaeab8dd4ef19b866f6074fcec3f4c Mon Sep 17 00:00:00 2001 From: Fabien Boucher Date: Fri, 10 May 2013 17:59:54 +0200 Subject: [PATCH 2/3] Fix indent in README.md. --- README.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index 0c8e48d..4322eea 100644 --- a/README.md +++ b/README.md @@ -39,7 +39,7 @@ the ResellerAdmin role to the admin user in keystone (refer to swsync usage later in this readme) and start nose as follow : - $ nosetests -v --nologcapture tests/functional/test_syncer.py + $ nosetests -v --nologcapture tests/functional/test_syncer.py swiftsync installation ---------------------- @@ -103,7 +103,7 @@ will keep a list a user and tenant we have created in keystone. So to clean all account and data the create mode has created use the deletion mode: - $ swfiller --delete --config etc/config.ini + $ swfiller --delete --config etc/config.ini swsync usage ------------ From f3e4df187c37ec03f4720d30caae0987e9c67c5a Mon Sep 17 00:00:00 2001 From: Fabien Boucher Date: Mon, 13 May 2013 17:13:54 +0200 Subject: [PATCH 3/3] Changes in README according to chmouel review --- README.md | 26 ++++++++++++++++---------- 1 file changed, 16 insertions(+), 10 deletions(-) diff --git a/README.md b/README.md index 4322eea..de24106 100644 --- a/README.md +++ b/README.md @@ -143,7 +143,7 @@ The way it will act to do that is as follow: To start the synchronization process you need to edit the configuration file and configure keystone_dest -and keystone_dest_credentials (Fix it). Then to start +and keystone_dest_credentials. Then to start the process simply : $ swsync --config etc/config.ini @@ -151,12 +151,16 @@ the process simply : As mention above the sync process won't replicate origin keystone accounts to the destination keystone so swift accounts on destination will -not work until you start a keystone database synchronization. +not work until you start a keystone database synchronization. But be sure +when performing the database synchronization to have swift endpoints +configured to reference the destination swift. swsync will take care of already synchronized containers/objects. When re-starting swsync it will only synchronize data that have changed. -The tool can for instance be launched by a cron -job to perform diff synchronization each night. +swsync has been designed to be run and run again and not ensuring that the +first pass goes well, if for example there is network failure swsync will +just skip it and hope to do it on the next run. So the tool can for instance +be launched by a cron job to perform diff synchronization each night. Swift Middleware last-modified ------------------------------ @@ -166,14 +170,16 @@ synchronization process by adding a last modified metadata to container header. The idea behind this is to only process the container whether the timestamp is greater on origin avoiding uselessly walking through container. -But the swsync tool does not use this meta for -now. If you want to contribute feel free to add it ! +When performing some tests we figured out that synchronization +performances was fast enough for our use case so we decided +to not support this metadata in swsync for now. But If you want to +contribute feel free to add it ! -Synchronization performances ----------------------------- Things to considers ------------------- -swfiller and swsync are not designed to work -with swift v1.0 authentication. +swfiller and swsync are not designed to work with swift v1.0 authentication. +We experienced some performances troubles when doing large synchronization +with token validation. Having to validate the token each time could come back with +error due to keystone capability to handle large amount of token validation requests.