Add specification for list pagination

Related-to: blueprint pagination-of-resources Change-Id: Ia79276b0278c04ed7c404c74ba90b84e1d7600a0
2016-12-21 11:40:45 -06:00 · 2016-12-21 11:40:45 -06:00 · 0cc403fec2
parent da0aacda85
commit 0cc403fec2
1 changed files with 336 additions and 0 deletions
--- a/doc/source/specs/approved/pagination-of-resources.rst
+++ b/doc/source/specs/approved/pagination-of-resources.rst
@@ -0,0 +1,336 @@
+..
+ This work is licensed under a Creative Commons Attribution 3.0 Unported
+ License.
+
+ http://creativecommons.org/licenses/by/3.0/legalcode
+
+==============================
+ Pagination of List Resources
+==============================
+
+https://blueprints.launchpad.net/craton/+spec/pagination-of-resources
+
+Craton is intended to manage large quantities of devices and other objects
+without sacrificing performance. Craton needs to add pagination support in
+order to efficiently handle queries on large collections.
+
+
+Problem description
+===================
+
+In the current implementation, a request to one of our collection resources
+will attempt to return all of the values that can be returned (based on
+authentication, etc.).  For example, if a user and project have access to 5000
+hosts then making a ``GET`` request against ``/v1/hosts`` would return all
+5000. Such large result sets can and likely will slow down Craton's response
+times and make it unusable.
+
+
+Proposed change
+===============
+
+We propose adding pagination query parameters to all collection endpoints. The
+new parameters would assume defaults if the user does not include them.
+
+We specifically propose that:
+
+#. Craton choose a default page size of 30 and limit it to being at least 10
+   items and at most 100 items,
+
+#. Craton choose to make the next page both discoverable *and* calculable. In
+   other words, responses would use "link" hypermedia relations to indicate
+   the first, previous, next, and last page URLs, which the server generates
+   for the client,
+
+#. Craton should assume the defaults for requests that have no query
+   parameters. For example, if someone makes a ``GET`` request to
+   ``/v1/hosts`` it would imply an original page size of 30 and that the first
+   30 results should be returned.
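+
+As a rough illustration of the page-size rules above, the following sketch
+resolves a requested ``limit``. The names ``clamp_limit``, ``DEFAULT_LIMIT``,
+``MIN_LIMIT``, and ``MAX_LIMIT`` are hypothetical and not part of Craton. It
+also assumes that out-of-range values are clamped rather than rejected, which
+this specification does not decide.
+
+.. code-block:: python
+
+    DEFAULT_LIMIT = 30   # proposed default page size
+    MIN_LIMIT = 10       # proposed lower bound
+    MAX_LIMIT = 100      # proposed upper bound
+
+
+    def clamp_limit(raw_limit=None):
+        """Return a page size within the proposed bounds.
+
+        Assumption: out-of-range values are clamped, not rejected.
+        """
+        if raw_limit is None:
+            return DEFAULT_LIMIT
+        return max(MIN_LIMIT, min(MAX_LIMIT, int(raw_limit)))
+
+Under these assumptions a request with no ``limit`` resolves to 30, while
+``limit=5`` and ``limit=500`` resolve to 10 and 100 respectively.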
+
+To provide pagination to users, it is suggested that we use ``limit`` and
+``marker`` parameters to indicate the page size and last seen ID. This allows
+users to begin pagination after an item, rather than at a particular page. For
+example, if a user is checking for new hosts in the listing and they know the
+ID of the last host they encountered they can provide ``marker=:id&limit=30``
+to get the newer hosts. If, instead, we used ``page`` and ``per_page``, they
+could miss items, because deleted hosts change the page number on which the
+last host appears.
+
+This implies that the default ``limit`` value would be 30 and the default
+``marker`` would be null (to indicate that no last ID is seen).
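+
+The behaviour described above can be sketched over an in-memory list of IDs.
+This is only an illustration, not Craton's implementation: ``page_after`` is a
+hypothetical helper, and it assumes IDs are sortable and that results are
+listed in ascending ID order.
+
+.. code-block:: python
+
+    def page_after(ids, marker=None, limit=30):
+        """Return up to ``limit`` IDs strictly greater than ``marker``.
+
+        Assumption: results are ordered by ascending ID.
+        """
+        ordered = sorted(ids)
+        if marker is not None:
+            ordered = [i for i in ordered if i > marker]
+        return ordered[:limit]
+
+
+    host_ids = [1, 2, 3, 4, 5, 6]
+    first_page = page_after(host_ids, limit=3)
+    # A host is deleted between the two requests.
+    host_ids.remove(2)
+    second_page = page_after(host_ids, marker=first_page[-1], limit=3)
+
+Because the second request resumes after the last ID the user saw, the
+deletion does not cause any remaining host to be skipped.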
+
+This combination of parameters is practically the standard in OpenStack.
+Operators familiar with OpenStack's existing Compute, Images, etc. APIs
+will be familiar with these parameters.
+
+In addition to pagination parameters, this spec proposes adding link relations
+in the response body, as defined by JSON Hyper-Schema and `favored by the API
+WG`_.
+
+This makes API usage easier for everyone, including people using the API
+directly and people writing API wrappers such as python-cratonclient. This
+does, however, have the downside of affecting our response bodies and JSON
+Schema.
+
+Finally, this spec strongly proposes that we include these links in every
+response. Which relation types we include would depend on where the user is
+in the pagination, roughly as follows:
+
+#. Include a ``self`` relation for every page that tells the user exactly what
+   page they're presently on.
+
+#. If there is a page prior to the current one, we would include the ``prev``
+   **and** ``first`` relations. These tell the user what the previous page is
+   and what the first page is.
+
+#. If there is a page after the current one, we would include the ``next``
+   **and** ``last`` relations. These are the opposites to ``prev`` and
+   ``first`` respectively.
+
+   It is worth noting that, without properly implemented caching, the ``last``
+   relation could become computationally expensive to calculate for every
+   pagination query.
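+
+The rules above could be sketched as follows. The helpers ``page_url`` and
+``build_links`` and all of their parameters are hypothetical. The sketch
+assumes the server has already determined the markers for the neighbouring
+pages; how those are computed (and cached, in the case of ``last``) is outside
+its scope.
+
+.. code-block:: python
+
+    from urllib.parse import urlencode
+
+
+    def page_url(base_url, limit, marker=None):
+        """Build a collection URL carrying ``limit`` and optional ``marker``."""
+        params = {"limit": limit}
+        if marker is not None:
+            params["marker"] = marker
+        return "{}?{}".format(base_url, urlencode(params))
+
+
+    def build_links(base_url, limit, marker=None, has_prev=False,
+                    prev_marker=None, next_marker=None, last_marker=None):
+        """Return link relations for the current page.
+
+        ``has_prev`` is separate from ``prev_marker`` because the previous
+        page may be the first page, which has no marker.
+        """
+        links = [{"rel": "self", "href": page_url(base_url, limit, marker)}]
+        if has_prev:
+            links.append({"rel": "first", "href": page_url(base_url, limit)})
+            links.append(
+                {"rel": "prev", "href": page_url(base_url, limit, prev_marker)})
+        if next_marker is not None:
+            links.append(
+                {"rel": "next", "href": page_url(base_url, limit, next_marker)})
+            links.append(
+                {"rel": "last", "href": page_url(base_url, limit, last_marker)})
+        return links
+
+For the first page of a longer collection, this would yield only the ``self``,
+``next``, and ``last`` relations.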
+
+
+Alternatives
+------------
+
+Alternative query parameters to ``limit`` and ``marker`` are:
+
+#. Use ``page`` and ``per_page`` parameters to indicate the 1-indexed "page
+   number" and number of items on each page respectively. This means that
+   users can change how many items they get on each page request and can
+   resume in arbitrary places by specifying the ``page`` parameter.
+
+   This would imply that the default ``page`` value would be 1 and the default
+   ``per_page`` would be 30.
+
+   These two parameters are presently used by a significant number of large
+   APIs but are not common in OpenStack itself. They provide simplicity: an
+   API user can simply keep incrementing the page number to get the next
+   page, without having to calculate the next value from a combination of
+   values in the response to the last request.
+
+   This does, however, prevent users from resuming iteration from the last
+   item they received in a list. Further, this adds the potential that
+   users may miss objects due to deletions or other changes in the
+   corresponding collection. Finally, these parameters give users only an
+   opaque idea of where in a paginated resource they are and how to resume
+   pagination.
+
+#. Use ``limit`` and ``offset`` parameters to provide similar functionality
+   and opacity to ``per_page`` and ``page`` respectively.
+
+   The default ``limit`` would, again, be 30 and the default ``offset`` would
+   be 0.
+
+   This combination of parameters is also present in a small number of
+   OpenStack projects but has some of the same negative implications as the
+   ``page`` and ``per_page`` parameters when compared to ``limit`` and
+   ``marker``.
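+
+To make the "missed objects" concern concrete, here is a small sketch of
+offset-based paging over an in-memory list. ``page_by_offset`` is a
+hypothetical helper used only for illustration; the same effect applies to
+``page`` and ``per_page``, since a page number maps to an offset.
+
+.. code-block:: python
+
+    def page_by_offset(ids, offset=0, limit=30):
+        """Return the slice of sorted IDs selected by ``offset``/``limit``."""
+        return sorted(ids)[offset:offset + limit]
+
+
+    host_ids = [1, 2, 3, 4, 5, 6]
+    first_page = page_by_offset(host_ids, offset=0, limit=3)
+    # A host is deleted between the two requests.
+    host_ids.remove(2)
+    second_page = page_by_offset(host_ids, offset=3, limit=3)
+
+Here the first page contains hosts 1 through 3, but after the deletion the
+second page starts at host 5, so host 4 is never returned to the user.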
+
+An alternative way to provide pagination links is:
+
+#. Link headers - as defined in :rfc:`5988` - using relation types registered
+   in :rfc:`5988` and :rfc:`6903`.
+
+   These are also commonly used outside of OpenStack and were popular before
+   it became common to include the relations in the response body. The
+   benefit to Craton of using this method is that it doesn't affect our JSON
+   Schema or existing response bodies. A major problem with this approach is
+   that a relation type can be repeated in a Link header, and the HTTP
+   library used by the majority of the Python world - Requests - does not
+   parse such links correctly. Further, widespread support for parsing these
+   header values is not known to the author of this specification.
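+
+For comparison, a ``Link`` header value in the :rfc:`5988` syntax could be
+assembled as sketched below. ``format_link_header`` is a hypothetical helper,
+and the sketch ignores optional target attributes such as ``title``.
+
+.. code-block:: python
+
+    def format_link_header(links):
+        """Join ``(url, rel)`` pairs into a single ``Link`` header value."""
+        return ", ".join(
+            '<{}>; rel="{}"'.format(url, rel) for url, rel in links)
+
+
+    header_value = format_link_header([
+        ("https://craton.environment.com/v1/hosts?limit=30&marker=30", "next"),
+        ("https://craton.environment.com/v1/hosts?limit=30&marker=90", "last"),
+    ])
+
+Every client would then need to parse this string back into relations, which
+is the step the paragraph above identifies as poorly supported.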
+
+Data model impact
+-----------------
+
+This should have **no** impact on our data model.
+
+REST API impact
+---------------
+
+This specification will have two impacts on our REST API:
+
+#. It will add ``limit`` and ``marker`` query parameters that behave
+   identically across a number of existing and future endpoints.
+
+#. It will change the fundamental structure of our list responses in order to
+   accommodate the link relations.
+
+   At the moment, for example, a ``GET`` request made to ``/v1/hosts`` has a
+   response body that looks like:
+
+   .. code-block:: json
+
+       [
+         {
+            "active": true,
+            "cell_id": null,
+            "device_type": "Computer",
+            "id": 1,
+            "ip_address": "12.12.12.15",
+            "name": "foo2Host",
+            "note": null,
+            "parent_id": null,
+            "region_id": 1
+         },
+         {
+            "active": true,
+            "cell_id": null,
+            "device_type": "Phone",
+            "id": 2,
+            "ip_address": "11.11.11.14",
+            "name": "fooHost",
+            "note": null,
+            "parent_id": null,
+            "region_id": 1
+         }
+       ]
+
+   This would need to transform to:
+
+   .. code-block:: json
+
+       {
+         "items": [
+           {
+              "active": true,
+              "cell_id": null,
+              "device_type": "Computer",
+              "id": 1,
+              "ip_address": "12.12.12.15",
+              "name": "foo2Host",
+              "note": null,
+              "parent_id": null,
+              "region_id": 1
+           },
+           {
+              "active": true,
+              "cell_id": null,
+              "device_type": "Phone",
+              "id": 2,
+              "ip_address": "11.11.11.14",
+              "name": "fooHost",
+              "note": null,
+              "parent_id": null,
+              "region_id": 1
+           }
+         ],
+         "links": [
+           {
+             "rel": "first",
+             "href": "https://craton.environment.com/v1/hosts?limit=30"
+           },
+           {
+             "rel": "next",
+             "href": "https://craton.environment.com/v1/hosts?limit=30&marker=2"
+           },
+           {
+             "rel": "self",
+             "href": "https://craton.environment.com/v1/hosts?limit=30&marker=1"
+           }
+         ]
+       }
+
+
+Security impact
+---------------
+
+Pagination support reduces the potential attack surface for denial of service
+attacks aimed at Craton. It alone, however, is not sufficient to prevent DoS
+attacks and additional measures should be taken by deployers to further
+mitigate those possibilities.
+
+Notifications impact
+--------------------
+
+Craton does not yet have notifications.
+
+Other end user impact
+---------------------
+
+This will have a minor effect on python-cratonclient. The ``list`` calls it
+implements will need to become smarter so they can handle pagination for the
+user automatically.
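+
+A minimal sketch of what such automatic pagination might look like on the
+client side follows. It is not python-cratonclient's actual API: ``iter_items``
+and the ``fetch`` callable (which maps a URL to a decoded response body) are
+hypothetical, and the sketch assumes the ``items`` and ``links`` response
+structure proposed in this specification.
+
+.. code-block:: python
+
+    def iter_items(fetch, url):
+        """Yield every item, following ``next`` links until none remains."""
+        while url is not None:
+            body = fetch(url)
+            for item in body["items"]:
+                yield item
+            # Stop when the response carries no "next" relation.
+            url = next(
+                (link["href"] for link in body["links"]
+                 if link["rel"] == "next"),
+                None,
+            )
+
+
+    # Canned responses standing in for the API.
+    pages = {
+        "/v1/hosts?limit=2": {
+            "items": [{"id": 1}, {"id": 2}],
+            "links": [{"rel": "next", "href": "/v1/hosts?limit=2&marker=2"}],
+        },
+        "/v1/hosts?limit=2&marker=2": {"items": [{"id": 3}], "links": []},
+    }
+    all_ids = [
+        host["id"]
+        for host in iter_items(pages.__getitem__, "/v1/hosts?limit=2")
+    ]
+
+Written as a generator, the helper only requests a further page when the
+caller actually consumes past the end of the current one.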
+
+Performance Impact
+------------------
+
+This code should not have any performance impact on the service, although it
+will be called frequently.
+
+Other deployer impact
+---------------------
+
+None
+
+Developer impact
+----------------
+
+None
+
+
+Implementation
+==============
+
+Assignee(s)
+-----------
+
+Primary assignee:
+- icordasc
+
+Other contributors:
+- None
+
+Work Items
+----------
+
+- Add basic pagination support with tests to ensure that functionality works
+  independent of the other features proposed in this specification
+
+- Add link relation support to response bodies
+
+
+Dependencies
+============
+
+N/A
+
+
+Testing
+=======
+
+This should be tested on different levels, but at a minimum on a functional
+level.
+
+
+Documentation Impact
+====================
+
+This will impact our API reference documentation.
+
+
+References
+==========
+
+* `IANA Link Relations Registry`_
+
+* :rfc:`5988`
+
+* :rfc:`6903`
+
+* `JSON Hyper-Schema`_
+
+* `"Pagination, Filtering, and Sorting" by the OpenStack API WG`_
+
+.. _favored by the API WG:
+    http://specs.openstack.org/openstack/api-wg/guidelines/links.html
+.. _IANA Link Relations Registry:
+    https://www.iana.org/assignments/link-relations/link-relations.xhtml
+.. _JSON Hyper-Schema:
+    http://json-schema.org/latest/json-schema-hypermedia.html
+.. _"Pagination, Filtering, and Sorting" by the OpenStack API WG:
+    http://specs.openstack.org/openstack/api-wg/guidelines/pagination_filter_sort.html