From fc825d4586964d178f09802feb1c92fb09b4b586 Mon Sep 17 00:00:00 2001
From: Shachar Snapiri
Date: Tue, 20 Feb 2018 12:22:19 +0200
Subject: [PATCH] SPEC - Skydive integration

Spec for the integration of the SkyDive project in Dragonflow.
Still a few open issues (expect more to be there) - feedback is welcome

Change-Id: I2045fe79423b43b98faef41dc35f28aa872a2b78
Related-Bug: #1749429
---
 doc/source/specs/index.rst               |   1 +
 doc/source/specs/skydive_integration.rst | 122 +++++++++++++++++++++++
 2 files changed, 123 insertions(+)
 create mode 100644 doc/source/specs/skydive_integration.rst

diff --git a/doc/source/specs/index.rst b/doc/source/specs/index.rst
index 0a4f53b27..987a6f042 100644
--- a/doc/source/specs/index.rst
+++ b/doc/source/specs/index.rst
@@ -79,6 +79,7 @@ Specs
    lbaas2
    application_decoupling
    dnsaas
+   skydive_integration
 
 Templates
 ---------
diff --git a/doc/source/specs/skydive_integration.rst b/doc/source/specs/skydive_integration.rst
new file mode 100644
index 000000000..edad015a7
--- /dev/null
+++ b/doc/source/specs/skydive_integration.rst
@@ -0,0 +1,122 @@
+..
+ This work is licensed under a Creative Commons Attribution 3.0 Unported
+ License.
+
+ https://creativecommons.org/licenses/by/3.0/legalcode
+
+===================
+SkyDive integration
+===================
+
+Relevant launchpad RFE:
+
+https://bugs.launchpad.net/dragonflow/+bug/1749429
+
+Currently we do not have an easy way to visualize the way Dragonflow sees the
+topology and the relations between topology elements. This view is important
+both for operators using Dragonflow, and for developers trying to debug the
+system.
+
+To solve this, we would like to leverage one of the SkyDive [1]_ project
+capabilities - being a good topology visualization tool. This will allow
+us to provide the overall topology view (skydive model) and use skydive
+for all the different ways to visualize and dissect the information (skydive
+view).
+
+
+Problem Description
+===================
+
+Add the ability to view the topology as known by Dragonflow in a graphical way.
+This feature is needed to allow easier debugging of issues in flows and
+understanding the behaviour of Dragonflow.
+
+Implementation stages:
+
+1. Have a view of the topology of a specific chassis (Node running
+   Dragonflow controller)
+2. Have real-time updates of the information (topology elements added/removed)
+3. Allow aggregation of the information and have a global view of the
+   topology, which means the view of the entire network as Dragonflow sees it
+4. Allow filtering and dissecting of the information to show just specific
+   parts
+5. Allow simulation or tracing of packets in the system on the graphical
+   interface
+
+All of these features are supported by the skydive project, but for other
+agent types.
+What we need to do is implement our own agent (or a way to send the
+information to the analyzer) in a way that will be out-of-band with the
+operation of the controller (or at least not block it for long periods).
+We should also give an administrator (rather than a developer) easy access
+to the system and a way to debug it.
+
+Proposed Change
+===============
+
+Add one Skydive analyzer with one (or more) skydive agents.
+The agents are responsible for collecting the topology information,
+translating it to the Skydive model structure and sending it to the analyzer.
+The analyzer, in turn, is responsible for aggregating the information and
+displaying it, with or without filtering requested by the
+end-user (be it a developer / integrator / operator).
+
+The agents will run on the Dragonflow Controller nodes as a separate
+service (not within the controller) for several reasons:
+
+1. We do not want to affect the performance of the controller. Each update
+   sent to the analyzer may take up to a few seconds, during which the
+   controller will not be able to service other requests
+2. The code running in the controller must be monkey_patched (as it is using
+   the oslo infrastructure and the etcd driver, which both require
+   monkey_patching). This imposes various limitations on our code - e.g.
+   the asyncio loops and selectors are limited or behave badly
+
+Integration should be done in several stages:
+
+1. Create a basic service that runs every given period and sends an update
+   of the elements in the system to the analyzer.
+2. Support topology updates, including removal of objects from the DB, and
+   reflect them in the topology view in SkyDive.
+3. Handle cases of disconnect/re-authentication between the collector and the
+   analyzer
+4. Handle cases of disconnect/reconnect of the collector and the nb_db
+5. Add a mechanism in which the skydive_service is notified of
+   objects that were added-to/removed-from the topology to have an
+   experience that is closer to real-time as opposed to periodic updates.
+6. Specify an API for Dragonflow applications to add custom information to
+   the topology view (e.g. port-behind-port) and relevant metadata to be
+   used in the view filtering.
+7. Improve the visualization (custom icons, etc.)
+8. Add some SkyDive views / filters.
+9. In the nb_db the router only points to the internal network ports, so the
+   view we get is not complete. We would like it to point to its gateway
+   port as well. To achieve that we would have to add some kind of proxy
+   object (as we have the table name and owner ID) to be able to retrieve this
+   gateway port from the nb_db.
+
+Open issues / feature discussion:
+=================================
+
+- How do we get the topology change notifications? As we are an external
+  application we disable the pubsub feature, so we should have a different
+  way of getting these notifications.
+  One solution may be using the pubsub mechanism, but it will require
+  rewriting (at least) the etcd pub/sub subscriber driver to use (e.g.)
+  etcd3gw or etcd3 library (for patched code or not) so we can support
+  both cases for different use-cases.
+
+I believe that a separate spec would have to cover these:
+- What views / filters do we want to supply - need to investigate how to
+  define them.
+- If possible, we would like to support the option to emulate the passing
+  of a packet through the system. Is it supported by SkyDive? If so, how?
+- If possible, we would like to support visualization of tracing of packets
+  in our system. Is it supported by SkyDive? If so, how?
+  What is required on our side to visualize it on SkyDive?
+
+References
+==========
+
+.. [1] SkyDive project: http://skydive-project.github.io/skydive/
+.. [2] SkyDive intro: https://www.youtube.com/watch?v=nQSdGKV8ceM
\ No newline at end of file