author: Brian Rosmaita <rosmaita.fossdev@gmail.com> 2017-05-25 18:02:09 -0400
committer: Brian Rosmaita <rosmaita.fossdev@gmail.com> 2018-04-05 13:51:36 +0000
commit: 4a829fc84293b2621080030a049d7984ff71f789
tree: 3d48725ab7857d17ee43a511bc45c93fd2521ea0
parent: ef0df98f9e89225661ecfabc66babce2a5b47304

Add spec to mitigate OSSN-0075

The spec addresses OSSN-0075 based on discussions at the Rocky PTG and
subsequent discussion in #openstack-glance [0].

[0] http://eavesdrop.openstack.org/irclogs/%23openstack-glance/%23openstack-glance.2018-03-16.log.html#t2018-03-16T16:31:52

Change-Id: I00e67b5a901f1e49f18dab3dfc7c0a8325c7bf85
Notes (review):
    Code-Review+2: Sean McGinnis <sean.mcginnis@gmail.com>
    Code-Review+1: Abhishek Kekane <akekane@redhat.com>
    Code-Review+2: Erno Kuvaja <jokke@usr.fi>
    Workflow+1: Erno Kuvaja <jokke@usr.fi>
    Verified+2: Zuul
    Submitted-by: Zuul
    Submitted-at: Thu, 07 Jun 2018 20:24:20 +0000
    Reviewed-on: https://review.openstack.org/468179
    Project: openstack/glance-specs
    Branch: refs/heads/master

-rw-r--r-- specs/rocky/approved/glance/mitigate-ossn-0075.rst | 276
1 files changed, 276 insertions, 0 deletions
diff --git a/specs/rocky/approved/glance/mitigate-ossn-0075.rst b/specs/rocky/approved/glance/mitigate-ossn-0075.rst
new file mode 100644
index 0000000..8609435
--- /dev/null
+++ b/specs/rocky/approved/glance/mitigate-ossn-0075.rst
@@ -0,0 +1,276 @@
..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License.

 http://creativecommons.org/licenses/by/3.0/legalcode

==================
Mitigate OSSN-0075
==================

https://blueprints.launchpad.net/glance/+spec/mitigate-ossn-0075

OpenStack Security Note `OSSN-0075`_, "Deleted Glance image IDs may be
reassigned", was made public on 13 September 2016. The current situation is
that due to a lack of agreement on how to fix it, we've left operators in a bad
state: our advice is that soft-deleted rows in the 'images' table in the Glance
database should *not* be purged from the database, yet at the same time, the
``glance-manage`` tool deletes such rows without warning.

Problem description
===================

Briefly, the problem is that Glance has always given a user with permission
to make the image-create call the option of specifying an image_id. If the
specified image_id clashed with an existing image_id, the image-create
operation would fail; otherwise, the specified image_id would be applied to the
new image. Consistency is enforced by a uniqueness constraint on the 'id'
column in the 'images' table in the database. Since Glance database entries
are soft-deleted, a proposed image_id will be checked against all image_ids
that were assigned since the last purge of the 'images' table.
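
The interaction between the uniqueness constraint, soft deletion, and a purge
can be illustrated with a minimal sketch. It uses an in-memory SQLite database
rather than Glance's real schema; the table layout, the helper functions, and
the image_id value are all assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, heavily simplified stand-in for the 'images' table.
conn.execute(
    "CREATE TABLE images ("
    " id TEXT PRIMARY KEY,"  # the uniqueness constraint on 'id'
    " name TEXT,"
    " deleted INTEGER NOT NULL DEFAULT 0)"
)


def image_create(image_id, name):
    """Return True if the row was inserted, False if the id is taken."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO images (id, name) VALUES (?, ?)", (image_id, name)
            )
        return True
    except sqlite3.IntegrityError:
        return False


def image_delete(image_id):
    """Soft delete: the row stays, so the id stays reserved."""
    with conn:
        conn.execute("UPDATE images SET deleted = 1 WHERE id = ?", (image_id,))


def purge():
    """Hard-delete every soft-deleted row, as a database purge would."""
    with conn:
        conn.execute("DELETE FROM images WHERE deleted = 1")


IMAGE_ID = "11111111-2222-3333-4444-555555555555"  # made-up id

created = image_create(IMAGE_ID, "popular-public-image")
image_delete(IMAGE_ID)
reuse_before_purge = image_create(IMAGE_ID, "impostor")  # id still reserved
purge()
reuse_after_purge = image_create(IMAGE_ID, "impostor")  # id is free again
```

In this sketch the second create fails while the soft-deleted row is still
present and succeeds once the purge has removed it.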

As described in `OSSN-0075`_, this problem becomes a security exploit when (a)
a popular public or community image is deleted, (b) the database is purged,
and (c) a user creates a new image with that same image_id. Users consuming an
image by image_id, which is the way Nova and Cinder consume images, may then
wind up booting virtual machines using an image different from the one they
intend to use.

Note that the new image would have its own data and checksum that would be
different from the original data and checksum, but there would be no way for
Nova, for instance, to know that these had changed. Were someone to boot a
server using the image_id, Nova would receive image data and then verify the
checksum against whatever checksum Glance has recorded as associated with the
image, which would be the *new* checksum.
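
A small sketch may make it clearer why the checksum verification does not help
here. The ``catalog`` dictionary stands in for Glance's record of an image, the
helper names are hypothetical, and SHA-256 is used only for illustration; none
of this is Nova's or Glance's actual code.

```python
import hashlib

# Hypothetical stand-in for the image catalog: image_id -> (data, checksum).
catalog = {}


def register(image_id, data):
    """Store image data together with the checksum computed from it."""
    catalog[image_id] = (data, hashlib.sha256(data).hexdigest())


def download_and_verify(image_id):
    """Fetch the data and compare it with the checksum on record."""
    data, recorded = catalog[image_id]
    return data, hashlib.sha256(data).hexdigest() == recorded


IMAGE_ID = "11111111-2222-3333-4444-555555555555"  # made-up id

register(IMAGE_ID, b"original image bits")
original_data, ok_before = download_and_verify(IMAGE_ID)

# Delete, purge, and re-create with the same id, modeled as a fresh entry.
del catalog[IMAGE_ID]
register(IMAGE_ID, b"attacker image bits")
new_data, ok_after = download_and_verify(IMAGE_ID)
```

Both verifications pass even though the data differs, because the consumer can
only compare against the checksum currently recorded for that image_id.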

The idea that once an image goes to 'active' status, the (image_id, image data,
checksum) will not change is called *image immutability*. It's important to
note that image immutability is required for Glance or else it cannot function
as an image catalog. If each consumer had to keep track of the image_id *and*
checksum *and* other essential properties in order to verify the downloaded
data, then there'd be no point in having Glance maintain this information.

.. note::

   The primary use case for allowing end-users to specify an image_id at the
   time of image creation is to make it easy to find the "same" image data
   (that is, the data is bit-for-bit identical although it's stored in
   different locations) in different regions of a cloud. It's important to
   note that the "sameness" of images in different regions is *not* guaranteed
   by Glance. (A Glance installation can guarantee the immutability of images
   within its own region, but it has no way of knowing what's happening in
   other regions.) Thus, under the current situation, when an end user relies
   on the image_id as the guarantor that they're getting the "same" data in
   different cloud regions, the end user is actually relying upon the
   trustworthiness of the *image owner*.

   This is a separate issue from `OSSN-0075`_ and is independent of whether or
   not the Glance database is ever purged. We point it out as something for
   operators to keep in mind. To be clear about the issue, here's an example.
   Suppose that a cloud operator puts an image with image_id A in regions R, S,
   T, though for some reason the operator does not put that image in region U.
   Any cloud user in region U could create an image with image_id A in
   region U. The image could then be made available to some target user by
   image sharing, or with the entire cloud by giving it 'community' visibility.

   An operator can avoid this scenario by creating an image record with
   image_id A in region U and not uploading any data to it. The image will
   remain in 'queued' status, and if the visibility is not changed to 'public'
   or 'community', the image will not appear in any end user's image-list
   response.

   There is also room for end user education here, namely, that image
   consumers should *not* rely solely upon image_id to guarantee that they are
   receiving the same image data in cross-region scenarios.
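
One way an image consumer might act on the advice in the note above is to
compare checksums as well as image_ids across regions. The following is only a
sketch: it assumes each image record is available as a dictionary with ``id``
and ``checksum`` keys, and the function name, region names, and values are
invented for the example.

```python
def consistent_across_regions(records_by_region):
    """Return True only if every region reports the same id and checksum.

    records_by_region maps a region name to an image record (a dict with
    'id' and 'checksum' keys). A missing checksum, as for an image that is
    still in 'queued' status, is treated as a mismatch.
    """
    seen = set()
    for record in records_by_region.values():
        checksum = record.get("checksum")
        if checksum is None:
            return False
        seen.add((record["id"], checksum))
    return len(seen) == 1


# Regions R and S hold the operator's image; region U holds a look-alike
# that reuses image_id "A" but carries different data.
trusted = {
    "R": {"id": "A", "checksum": "abc123"},
    "S": {"id": "A", "checksum": "abc123"},
}
suspect = dict(trusted, U={"id": "A", "checksum": "def456"})

trusted_ok = consistent_across_regions(trusted)
suspect_ok = consistent_across_regions(suspect)
```

A check like this still depends on the consumer knowing which region's record
to trust; it only detects that the regions disagree.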

Through discussions with operators, it's clear that the ability to set the
image_id on image creation is being used out in the field, so we can't simply
block this ability. At the same time, we must allow the database to be
occasionally purged, as there is evidence that for large deployments, having a
large number of soft-deleted rows in the 'images' table affects the response
time of the image-list API call.

Proposed change
===============

Modify the current ``glance-manage db purge`` command so that it will not purge
the images table.

Introduce a new command, ``glance-manage db purge-images-table`` to purge the
images table. The new command will take the same options as the current purge,
namely, ``--age-in-days`` and ``--max-rows``. The rationale for this being a
new command (rather than a ``--force`` option to the current command) is
twofold: (1) it's likely that the age-in-days used will be different for the
images table, and (2) given that purging the images table has a security
impact, having it as a completely separate command emphasizes this.
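
The split between the two commands can be sketched as follows. This is not the
``glance-manage`` implementation: the function names, the reduced table
layout, and the use of SQLite are assumptions, and the sketch ignores the
foreign-key ordering a real purge would have to respect.

```python
import sqlite3
import time

DAY = 24 * 60 * 60
# Associated tables named in this spec; columns are reduced to the minimum.
ASSOCIATED_TABLES = (
    "image_properties",
    "image_tags",
    "image_members",
    "image_locations",
)


def _purge_table(conn, table, age_in_days, max_rows):
    """Hard-delete up to max_rows soft-deleted rows older than age_in_days."""
    cutoff = time.time() - age_in_days * DAY
    with conn:
        cursor = conn.execute(
            f"DELETE FROM {table} WHERE rowid IN ("
            f"SELECT rowid FROM {table} "
            "WHERE deleted = 1 AND deleted_at < ? "
            "ORDER BY deleted_at LIMIT ?)",
            (cutoff, max_rows),
        )
    return cursor.rowcount


def db_purge(conn, age_in_days, max_rows):
    """Sketch of the modified purge: every table except 'images'."""
    return {
        table: _purge_table(conn, table, age_in_days, max_rows)
        for table in ASSOCIATED_TABLES
    }


def db_purge_images_table(conn, age_in_days, max_rows):
    """Sketch of the proposed separate command for the 'images' table."""
    return _purge_table(conn, "images", age_in_days, max_rows)


conn = sqlite3.connect(":memory:")
now = time.time()
for table in ("images",) + ASSOCIATED_TABLES:
    conn.execute(
        f"CREATE TABLE {table} (id TEXT, deleted INTEGER, deleted_at REAL)"
    )
    # One row soft-deleted 400 days ago in every table.
    conn.execute(f"INSERT INTO {table} VALUES ('x', 1, ?)", (now - 400 * DAY,))
# A second image row, soft-deleted only 100 days ago.
conn.execute("INSERT INTO images VALUES ('y', 1, ?)", (now - 100 * DAY,))
conn.commit()

purged = db_purge(conn, age_in_days=30, max_rows=100)
images_after_db_purge = conn.execute("SELECT COUNT(*) FROM images").fetchone()[0]
images_purged = db_purge_images_table(conn, age_in_days=365, max_rows=100)
images_remaining = conn.execute("SELECT COUNT(*) FROM images").fetchone()[0]
```

The example uses a short age for the ordinary purge and a much longer one for
the images table, which is the kind of difference rationale (1) anticipates.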

Alternatives
------------

1. Introduce a policy governing whether or not a user is allowed to specify
   the image_id at the time of image creation. The downside of this proposal
   is twofold:

   * it breaks backward compatibility given that this ability has been allowed
     up to now in both the v1 and v2 versions of the Image API
   * it breaks interoperability in that end users will have the ability in some
     clouds but not in others

   A further problem with this proposal is that if the cross-region use of
   a particular image_id is denied to end users, they will have to use some
   other piece of image metadata for this purpose. Since Cinder and Nova both
   use the image_id when services are requested, user workflows will have to
   change to introduce an extra call to the image service to find the image
   record before the image_id to pass to Cinder or Nova is determined.

2. Introduce a new single-column table with a uniqueness constraint to record
   "used" UUIDs. The image-create operation would try to insert a proposed
   UUID into this table instead of the 'images' table and fail as it currently
   does if the uniqueness constraint were violated. This "used" UUID table
   would *never* be purged, but the glance-manage tool could continue to purge
   all other tables.

   This alternative has the advantage of not impacting the image-list call. It
   would eventually introduce a small delay into the image-create operation,
   but that's probably acceptable.

   The downside is that this proposal introduces an unpurgeable table that is
   unbounded in size.

3. A variation on alternative #2: instead of a single-column table, have at
   least a deleted_at column in addition to the image_id. This table would not
   be touched by the "normal" ``glance-manage`` database purge operation.
   Rather, an additional purge operation could be introduced for this table
   that would purge rows that were, say, 5 years old from the table.

   A problem with this suggestion is that a determined attacker could
   nonetheless flood the "used" image_ids table. This is possible because
   while it might make sense to limit the number of existing images a user
   owns, it doesn't make sense to limit the number of deleted images a user
   owns. For example, an end user who creates an image of some important
   server every day, but only keeps around a week's worth, will accumulate many
   deleted images (multiplied by the number of servers this is being done for),
   but this is perfectly legitimate behavior. So it's not clear how flooding
   the "used" image_id table could be prevented, except by something like
   rate-limiting, though that would have to be set in such a way as not to
   impact legitimate use cases.

4. Introduce a new field, ``preserve_id``, for use in the images table. This
   field will be for internal Glance use only and will not be exposed through
   the API. This field will be null by default and will be set to true whenever
   the 'visibility' field of an image is set to 'public' or 'community'. There
   will be no way to unset the value of the field. In addition to this, modify
   the glance-manage tool so that it will never delete an entry from the images
   table that has ``preserve_id`` == True.

   As with alternatives 2 and 3, the database table will continue to grow, but
   this growth is constrained by keeping only rows relevant to the OSSN-0075
   exploit. On the other hand, all an attacker has to do is read this spec to
   realize that by creating image records with community visibility, the images
   table can still be flooded with spurious image records. Thus this strategy
   is too easily defeated to be worth implementing, especially as it might give
   operators a false sense of security.
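
To make alternative 2 above more concrete, here is a minimal sketch of a
never-purged "used" UUID table. The table name ``used_image_ids``, the helper
functions, and the SQLite schema are hypothetical; the spec only describes a
single-column table with a uniqueness constraint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE images (id TEXT PRIMARY KEY, deleted INTEGER NOT NULL DEFAULT 0)"
)
# Hypothetical name for the single-column table of "used" UUIDs.
conn.execute("CREATE TABLE used_image_ids (id TEXT PRIMARY KEY)")


def image_create(image_id):
    """Reserve the id in the 'used' table first, then create the image row."""
    try:
        with conn:  # both inserts commit together or not at all
            conn.execute("INSERT INTO used_image_ids (id) VALUES (?)", (image_id,))
            conn.execute("INSERT INTO images (id) VALUES (?)", (image_id,))
        return True
    except sqlite3.IntegrityError:
        return False


def delete_and_purge(image_id):
    """Soft-delete the image, then purge 'images' but not 'used_image_ids'."""
    with conn:
        conn.execute("UPDATE images SET deleted = 1 WHERE id = ?", (image_id,))
        conn.execute("DELETE FROM images WHERE deleted = 1")


IMAGE_ID = "11111111-2222-3333-4444-555555555555"  # made-up id

first_create = image_create(IMAGE_ID)
delete_and_purge(IMAGE_ID)
images_left = conn.execute("SELECT COUNT(*) FROM images").fetchone()[0]
second_create = image_create(IMAGE_ID)
```

Here the images table is empty after the purge, yet the second create is still
refused because the id remains in the "used" table. The sketch also shows the
stated downside: nothing ever removes rows from that table.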

Data model impact
-----------------

None

REST API impact
---------------

None

Security impact
---------------

This change will enhance security by providing operators with a means of
mitigating the exploit described in `OSSN-0075`_.

Notifications impact
--------------------

None

Other end user impact
---------------------

None

Performance Impact
------------------

The images table will grow indefinitely unless it is explicitly purged, though
the associated tables (image_properties, image_tags, image_members,
image_locations) can be purged by the ``glance-manage`` tool.

The images table can be partially purged at appropriate intervals.

Other deployer impact
---------------------

Operators will have to monitor Glance for abnormal usage patterns and take
appropriate action.

Additionally, operators should be made aware of the cross-region version of the
OSSN-0075 exploit (as discussed in the Note in the Problem Description
section).

Developer impact
----------------

None

Implementation
==============

Assignee(s)
-----------

Primary assignee:

* brian-rosmaita

Other contributors:

* undetermined

Work Items
----------

1. Modify the ``glance-manage`` tool:

   * The current behavior is that it purges all tables of soft-deleted rows.
     Change the behavior so that the images table is not purged by default.

   * Add a new command to purge the images table. It should take the
     ``--age-in-days`` and ``--max-rows`` options just like the current purge
     command.

2. Update the operator documentation.

3. Add a release note.

Dependencies
============

No new dependencies.

Testing
=======

Appropriate unit tests to ensure the changes to Glance and the glance-manage
tool function correctly.

Documentation Impact
====================

The Glance Administrator Guide will need to be updated.

References
==========

`OSSN-0075`_: `Deleted Glance image IDs may be reassigned`.

.. _OSSN-0075: https://wiki.openstack.org/wiki/OSSN/OSSN-0075