This commit adds the configuration options related to resource limits
in the Heat project. The `max_software_configs_per_tenant`,
`max_software_deployments_per_tenant`, and `max_snapshots_per_stack`
options have been added to control the maximum number of software
configs, software deployments, and stack snapshots respectively.
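A minimal sketch of how such limits are typically registered with
oslo.config; the option names match the commit, but the defaults and
help text here are placeholders rather than Heat's actual values.

    from oslo_config import cfg

    limit_opts = [
        cfg.IntOpt('max_software_configs_per_tenant',
                   default=4096,  # placeholder default, not Heat's actual value
                   help='Maximum number of software configs per tenant.'),
        cfg.IntOpt('max_software_deployments_per_tenant',
                   default=4096,  # placeholder default
                   help='Maximum number of software deployments per tenant.'),
        cfg.IntOpt('max_snapshots_per_stack',
                   default=32,  # placeholder default
                   help='Maximum number of snapshots per stack.'),
    ]

    cfg.CONF.register_opts(limit_opts)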
Story: 2011006
Task: 49401
Change-Id: If33a1c6f3eb9e93f586931bc5c05104439c92bf9
Snapshot.get_all does not return all snapshots of the project but
returns all snapshots associated with a single stack, so its name
should contain _by_stack for consistency.
Change-Id: Ic6b93b7cfc84793077672b3f1052f03519e4c5a1
If there are underlying issues in Keystone that cause creation of
the stack's project to fail, the exception is never caught, so the
stack never enters a FAILED state or gives any feedback. See [1]
for an example.
This change catches all exceptions and sets the stack to a FAILED
state. We do not want to return the complete exception, since it
can contain information about the deployment's underlying issue,
so we log the exception instead.
[1] https://paste.opendev.org/show/810161/
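A minimal sketch of the pattern described above; the helper name,
client call, and messages are illustrative, not Heat's exact internals.

    import logging

    LOG = logging.getLogger(__name__)

    def create_stack_project(stack, keystone):
        try:
            return keystone.create_stack_domain_project(stack.id)  # assumed call
        except Exception:
            # Log the full traceback for operators, but keep the
            # user-facing reason generic so deployment details do not
            # leak out.
            LOG.exception('Failed to create project for stack %s', stack.name)
            stack.state_set(stack.CREATE, stack.FAILED,
                            'Error creating project for stack')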
Change-Id: If0bc726d8681fff2b45a3b353ae627c86eb298d2
Regenerate trust when updating a stack as a different user
We now regenerate the trust (and delete the old one) when the stored
user credential doesn't match the current context (meaning a
different user is operating).
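A minimal sketch of the check, assuming a context that carries a
user_id and a stack that stores the credentials captured at create
time; all names are illustrative.

    def ensure_trust(stack, context, keystone):
        stored = stack.user_creds  # credentials captured at create time
        if stored and stored['user_id'] != context.user_id:
            # A different user is operating: delete the old trust and
            # regenerate one for the current user.
            keystone.delete_trust(stored['trust_id'])
            stored = keystone.create_trust_context(context)
            stack.user_creds = stored
        return stored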
Story: #1752347
Task: #17352
Change-Id: I39795bdbd8ab255150153bf8b1e165b49e1a7027
If stacks are already migrated to convergence, there is no point
in checking the stack status before returning. This allows re-running
the command despite migrated stacks being in a FAILED state.
Change-Id: Ia0e34423377843adee8efc7f23d2c2df5dac8e20
Task: 40266
Six is in use to help us keep support for Python 2.7.
Since the Ussuri cycle we decided to remove Python 2.7
support, so we can go ahead and also remove six usage from
the Python code.
Review process and help
-----------------------
Removing six introduces a lot of changes and a huge number of modified
files. To simplify reviews we decided to split the changes into several
patches, to avoid painful reviews and avoid mistakes.
To review this patch you can use the six documentation [1] for help in
understanding the choices made.
Additional information
-----------------------
Changes related to 'six.b(data)' [2]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
six.b [2] encodes the given data as latin-1 on Python 3, so this patch
does the same.
Latin-1 is equivalent to ISO-8859-1 [3].
This encoding is the default encoding [4] of certain descriptive HTTP
headers.
I suggest keeping latin-1 for the moment and moving to a more capable
encoding (UTF-8) in a follow-up patch if needed. HTML4 supports the
UTF-8 charset, and UTF-8 is the default charset for HTML5 [5].
Note that this commit message is autogenerated and does not necessarily
contain changes related to 'six.b'.
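An illustrative before/after of the mechanical replacement this
implies:

    import six

    value = six.b('x-account-meta-name')             # before
    value = 'x-account-meta-name'.encode('latin-1')  # after: what six.b does on Python 3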
[1] https://six.readthedocs.io/
[2] https://six.readthedocs.io/#six.b
[3] https://docs.python.org/3/library/codecs.html#standard-encodings
[4] https://www.w3schools.com/charsets/ref_html_8859.asp
[5] https://www.w3schools.com/html/html_charset.asp
Patch 13 in a series of 28 patches
Change-Id: I09aa3b7ddd93087c3f92c76c893c609cb9473842
Avoid loading the tags from the DB and then re-saving every time
the stack is stored when the stack has tags. Avoid attempting to
lazy-load the tags from the DB multiple times when there are no tags.
Avoid lazy-loading the existing tags on an update when we intend to
overwrite them anyway. Avoid writing the same set of tags multiple
times.
e.g. in a legacy update, previously we rewrote the current set of tags
when changing the state to IN_PROGRESS, then wrote the new set of tags
regardless of whether they had changed, then wrote the new set of tags
again when the update completed. In a convergence update we also did
three writes but in a different order, so that the new tags were written
every time. With this change we write the new set of tags only once.
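One possible shape of the write-once behaviour, tracking what was last
persisted so unchanged tags are never rewritten (illustrative, not
Heat's exact code):

    def store_tags(context, stack, new_tags, db_api):
        if new_tags == getattr(stack, '_persisted_tags', None):
            return  # tags unchanged; skip the redundant DB write
        db_api.stack_tags_set(context, stack.id, new_tags)
        stack._persisted_tags = new_tags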
This could also have prevented stacks with tags from being updated from
legacy to convergence, because the (unchanged) tag list would get
rewritten from inside a DB transaction. Since that is not expected, the
stack_tags_set() API does not pass subtransactions=True when creating a
transaction, so the nested write would cause a DB error.
Change-Id: Ia52818cfc9479d5fa6e3b236988694f47998acda
Task: 37001
It is only necessary to pass subtransactions=True to session.begin()
when the new transaction may be started from inside an existing
transaction. There is *no* need to pass it from the top-level
transaction in anticipation that there may be subtransactions. The
sqlalchemy docs are not 100% clear on this, so we were doing both just
to be sure; however, we have confirmed that the latter is not required.
Since passing subtransactions=True when it is not required may cover up
other problems, remove it except in those cases where we know the API
may be called as a subtransaction.
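The rule, sketched with the old SQLAlchemy 1.x session API:

    def helper_write(session, row):
        # May be called from inside an existing transaction, so this is
        # one of the cases that genuinely needs subtransactions=True.
        with session.begin(subtransactions=True):
            session.add(row)

    def top_level_write(session, row):
        # A top-level transaction needs no subtransactions flag, even
        # though helper_write() begins a nested transaction inside it.
        with session.begin():
            helper_write(session, row)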
Change-Id: I34cbd3526aef79132f6d97569d48a347e904ab75
Task: 37000
The limit of 100 is fairly low for a normal production environment.
Allow setting the limit to -1 to turn it off. Also raise the
limit to 512, which should be a bit more reasonable as a default.
Change-Id: I9e54b20437875ed88b79414aa4fc17b17cbd305b
Related-Bug: #1221849
Make sure that the database can only contain one active stack with a
given name per tenant.
Change-Id: Icc3cd6900fa116a5e43054c7589195222ef90c78
Task: 27824
In practice we never pass any arguments to the callback function, but if
we did we'd be doing it wrong. Multiple projects appear to have copied
this code from us (directly or indirectly). Fix it before we lead anyone
else astray.
Change-Id: If9cddc470158f32587b2aac19e92d1e01b48bc50
We use the notify.signal() call to notify the main thread that the stack
status has moved to IN_PROGRESS so that it can wait before returning
control to the user. Therefore it is expected that if an operation
eventually fails (or succeeds), signal() will have been called a long
time previously. The only reason it is there is to guard against
failures before the resource attains the IN_PROGRESS state, where the
persistence of the state has been deferred to coincide with the lock
release.
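A sketch of the guard, using a threading.Event in place of Heat's
notify object (illustrative):

    import threading

    def run_operation(stack, notify=None):
        notify = notify or threading.Event()
        try:
            stack.state_set(stack.action, stack.IN_PROGRESS, 'operation started')
        finally:
            # Wake the waiting API thread even if we failed before the
            # IN_PROGRESS state was persisted, so it is never left hanging.
            notify.set()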
Story #2003988
Task: 26931
Change-Id: Ie0519ee78607f71855c2c0ace2cb4ff52c5809b6
When a heat-engine thread activity completes, it calls release on
its stack_lock object in the database. If that release action fails
due to an inability to update the database, that engine process is no
longer usable. This code catches that failure, logs it, and terminates
that engine process so that a new one can be started. New heat engines
will automatically purge stale stack_locks from the database.
Also, make sure that if the thread exit does not tear down the process
within 5 seconds, the unblockable OS-level exit call is invoked.
This bug is very timing-specific: the DB error needs to exist at the
moment the stack_lock release fails.
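A sketch of the failure handling described above; names are
illustrative.

    import logging
    import os
    import threading

    LOG = logging.getLogger(__name__)

    def release_or_die(stack_lock):
        try:
            stack_lock.release()
        except Exception:
            LOG.exception('Failed to release stack lock; terminating engine')
            # If graceful teardown has not finished after 5 seconds, force
            # the process down with the unblockable OS-level exit call.
            threading.Timer(5.0, lambda: os._exit(1)).start()
            raise SystemExit(1)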
Change-Id: I7663b2270bf325cd8e3dd194f2994227fd6f5e8a
Story: 2003439
Task: 24635
Remove the stack creation limit for admin, since admin can view all
stacks, and stacks created by other tenants would have been counted in
the limit check.
Change-Id: Ie2e9251245e7e16309661154e17724e5984c21e6
Story: 2003487
Task: 24756
When we hold a StackLock, we defer any persistence of COMPLETE or FAILED
states in state_set() until we release the lock, to avoid a race on the
client side. The logic for doing this was scattered about and needed to be
updated together, which has caused bugs in the past.
Collect all of the logic into a single implementation, for better
documentation and so that nothing can fall through the cracks.
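One way the single implementation could look: a context manager that
defers persisting terminal states until the lock is released
(illustrative, not Heat's exact code).

    import contextlib

    @contextlib.contextmanager
    def locked_task(stack, lock):
        lock.acquire()
        stack.defer_state_persist = True  # COMPLETE/FAILED writes are queued
        try:
            yield
        finally:
            stack.defer_state_persist = False
            stack.persist_state_changes()  # flush deferred states, then unlock
            lock.release()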
Change-Id: I6757d911a63708a6c6356f70c24ccf1d1b5ec076
The purpose of the cfg.CONF.convergence_engine option is to determine
whether new stacks will be created using convergence or not. For some
reason, however, the worker service was not started when the option was
disabled, meaning that previously-created convergence stacks could no
longer be updated (there was a hack in place to allow them to be deleted).
Always start the worker service and continue operating any existing
convergence stacks in convergence mode, regardless of the current setting
for new stacks.
Change-Id: Ic3574e4d15ac48b2bc8e0e8101c81d24f40f0606
Related-Bug: #1508324
Wait for the legacy stack to move to the IN_PROGRESS state before returning
from the API call in the stack update, suspend, resume, check, and restore
operations.
For the stack delete operation, do the same provided that we can acquire
the stack lock immediately, and thus don't need to wait for existing
operations to be cancelled before we can change the state to IN_PROGRESS.
In other cases there is still a race.
Change-Id: Id94d009d69342f311a00ed3859f4ca8ac6b0af09
Story: #1669608
Task: 23175
Previously when doing a delete in convergence, we spawned a new thread to
start the delete. This was to ensure the request returned without waiting
for potentially slow operations like deleting snapshots and stopping
existing workers (which could have caused RPC timeouts).
The result, however, was that the stack was not guaranteed to be
DELETE_IN_PROGRESS by the time the request returned. In the case where a
previous delete had failed, a client request to show the stack issued soon
after the delete had returned would likely show the stack status as
DELETE_FAILED still. Only a careful examination of the updated_at timestamp
would reveal that this corresponded to the previous delete and not the one
just issued. In the case of a nested stack, this could leave the parent
stack effectively undeletable. (Since the updated_at time is not modified
on delete in the legacy path, we never checked it when deleting a nested
stack.)
To prevent this, change the order of operations so that the stack is first
put into the DELETE_IN_PROGRESS state before the delete_stack call returns.
Only after the state is stored, spawn a thread to complete the operation.
Since there is no stack lock in convergence, this gives us the flexibility
to cancel other in-progress workers after we've already written to the
Stack itself to start a new traversal.
The previous patch in the series means that snapshots are now also deleted
after the stack is marked as DELETE_IN_PROGRESS. This is consistent with
the legacy path.
Change-Id: Ib767ce8b39293c2279bf570d8399c49799cbaa70
Story: #1669608
Task: 23174
This provides an option to specify a Swift container for stack
actions; all child templates and environment files will be fetched
from the container, if available. However, files coming in the
'files' map from the client take precedence if the same file is
also present in Swift.
Change-Id: Ifa21fbcb41fcb77827997cce2d5e9266ba849b17
Story: #1755453
Task: 17353
When deleting a snapshot, we used the current resources in the stack to
call delete_snapshot() on. However, there is no guarantee that the
resources that existed at the time the snapshot was created were of the
same type as any current resources of the same name.
Use resources created using the template in the snapshot to do the
snapshot deletion.
This also solves the problem addressed in
df1708b1a8, whereby snapshots had to be
deleted before the stack deletion was started in convergence because
otherwise the 'latest' template contained no resources.
That allows us to once again move the snapshot deletion after the start of
the stack deletion, which is consistent with when it happens in the legacy
path. Amongst other things, this ensures that any failures can be reported
correctly.
Change-Id: I1d239e9fcda30fec4795a82eba20c3fb11e9e72a
If an engine is stopped after a stack is marked DELETE_COMPLETE
(updated with an empty template in convergence) but before it is soft
deleted, the empty stack can't be soft deleted anymore and would always
end up in the stack list.
Keep the nested stack behaviour the same by returning EntityNotFound
as before.
Change-Id: Idc541fe0cd12d03e2d9b3cc11a1e7b0046be9d25
Story: #2002921
Task: 22901
A stack and its nested stacks must be in a *_COMPLETE state for it to
be migrated. Also, we should not allow migration of a nested stack.
This also changes stack_get_all_by_owner_id() to not select the
backup stacks. This seems to be used only in convergence code
path and migrate_convergence_1().
Change-Id: Icd54465d0c593557a12d853ddee4ee8ce6483499
Closes-Bug: #1767962
Story: #1767962
Task: #17363
This refactors the building of the schema from parameter validation to
use a new method (which doesn't keep stacks in memory), and uses that
new method to provide a proper schema for a resource group when its
size is 0.
Change-Id: Id3020e8f3fd94e2cef413d5eb9de9d1cd16ddeaa
Closes-Bug: #1751074
Closes-Bug: #1626025
Fix reset_stack_status on legacy by using a different session for each
stack reset and handling duplicate lock errors.
Closes-Bug: #1735755
Change-Id: I6bcd7448052e86ec3e4eb4c49ef3139c20d4f919
Allow using policy-in-code for resource type rules.
Also add a test for overriding the in-code resource type rule in a
JSON file.
Partially-Implements: bp policy-in-code
Change-Id: Id6c21732e66de6c421427ded98de52f5da0a4db2
When handling the command "openstack stack event list --nested-depth=n", we
obtain the list of nested stacks by querying the database for all resources
that share the root stack ID. However, since all we're getting is the stack
IDs, there's no need to query all of the fields and construct a versioned
object for each resource. Just get the set of stack IDs.
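A sketch of the narrower query with SQLAlchemy, assuming a Resource
model with stack_id and root_stack_id columns (illustrative names):

    from sqlalchemy import Column, Integer, String
    from sqlalchemy.orm import declarative_base

    Base = declarative_base()

    class Resource(Base):
        __tablename__ = 'resource'
        id = Column(Integer, primary_key=True)
        stack_id = Column(String(36))
        root_stack_id = Column(String(36))

    def stack_ids_by_root(session, root_id):
        # Select only the stack_id column instead of constructing a full
        # versioned object for every resource row.
        rows = session.query(Resource.stack_id).filter(
            Resource.root_stack_id == root_id).distinct()
        return {row[0] for row in rows}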
Change-Id: I12155433e2ac1af919aa4b5e780fb965cd5885d8
Related-Bug: #1588561
When doing "openstack stack resource list --nested-depth=n", we were
lazy-loading the resources' properties data. This is expensive, especially
when there are a large number of resources. Eager-load the data, as we
always use it to show the resources.
For consistency, always eager-load the resource 'data' (even in
resource_get), because the Resource versioned object accesses it
unconditionally.
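Eager-loading in this style can be expressed with SQLAlchemy's
joinedload, assuming the model has a data relationship (illustrative):

    from sqlalchemy.orm import joinedload

    def resources_with_data(session, resource_cls, stack_id):
        # Fetch the per-resource data in the same query rather than
        # lazy-loading it row by row later.
        return (session.query(resource_cls)
                .filter_by(stack_id=stack_id)
                .options(joinedload(resource_cls.data))
                .all())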
Change-Id: Idb871fddf77bf24828878c315e19e200c28841be
Related-Bug: #1665503
This is to enable preview of the merged environment
without merging the environment on the client side.
Related-Bug: #1635409
Change-Id: I7ec3af729a65164230153021f438bf226cc5e858
We already have REST api support for cancelling a
UPDATE_IN_PROGRESS stack with rollback. This adds a
new action 'cancel_without_rollback' to allow for
canceling a create/update in_progress stack without
rollback.
APIImpact
Change-Id: I6c6ffa0502ab8745cfb2f9c5ef263f1e02dfc4ca
Closes-Bug: #1709041
In change I84d2b34d65b3ce7d8d858de106dac531aff509b7, we changed to
call self._converge_create_or_update() in a sub-thread. However,
thread_group_mgr is not set for cancel_update (with rollback),
which in turn calls converge_stack.
This also enables test_cancel_update_server_with_port, as
bug #1607714 seems to be fixed now.
Change-Id: Ie674fd556418f6aa8e79654458cbe43648851db2
Closes-Bug: #1713952
In order to keep the engine service alive, we add a timer that periodically
does nothing. Calls to add_timer() require a stack_id, and currently we
pass cfg.CONF.periodic_interval. This is highly misleading, because the
value you pass for the stack_id has no effect on the interval. The cause
was a copy-paste error in 07884448fe, when
the code changed from calling ThreadGroup.add_timer() to
ThreadGroupManager.add_timer(). Use None as the stack ID instead.
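The fix, sketched (the callback name is illustrative):

    # Misleading: the value passed as the stack_id argument has no
    # effect on the timer interval.
    thread_group_mgr.add_timer(cfg.CONF.periodic_interval, keep_alive)

    # Clearer: this keep-alive timer is not associated with any stack.
    thread_group_mgr.add_timer(None, keep_alive)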
Change-Id: Ia24a0d3ae9a0295fc811eb5300656399f426408b
This RPC call only generates a list of the outputs defined in the template,
not their values, so don't load the resource data needed to calculate the
output values (which can be very slow).
Also, explicitly pass resolve_value=False instead of relying on the default
argument (which is different for format_stack_output() and
format_stack_outputs()), to reduce future confusion.
Change-Id: I79aae94b6552d465db6707cd4a40cd53ff18455b
Closes-Bug: #1719340
When we show a stack including the outputs, we calculate all of the
resource attributes that are referenced anywhere in the stack. In
convergence, these are either already cached (and therefore fast) or need
to be cached (and therefore the initial slowness will pay off in future).
This isn't the case in the legacy path though, since we are not doing
caching of attributes in the database in that path. So this is
unnecessarily calculating all of the referenced attribute values, which are
potentially very slow to get.
For legacy stacks, only calculate the attribute values needed to show the
outputs.
Change-Id: I35800c7f87b58daf05cbabd05bcbcd75d0c0fadb
Partial-Bug: #1719333