Adds glossary and fixes some mistakes

Change-Id: I679ffa1383bf5214ebbf1ac273f9179bb825c09d
Sandy Walsh 2015-02-27 13:29:09 +00:00 committed by Sandy Walsh
parent f0a178b54c
commit a7e600585e
3 changed files with 96 additions and 11 deletions

@ -62,17 +62,17 @@
<img src='v3_arch.gif' class="img-rounded"/>
<ul>
<li>The source application (aka: your application) publishes <a href='apdx.html#notification'>notifications</a> to the <a href='faq.html#queues'>queue</a>.
<li><a href='apdx.html#yagi'>Yagi</a> consumes these notifications from the queue and passes them on to a chain of Yagi <a href='apdx.html#yagi_handler'>Handlers</a>.
<li>The source application (aka: your application) publishes <a href='glossary.html#notification'>notifications</a> to the <a href='glossary.html#queues'>queue</a>.
<li><a href='glossary.html#yagi'>Yagi</a> consumes these notifications from the queue and passes them on to a chain of Yagi <a href='glossary.html#handlers'>Handlers</a>.
<li>Some yagi handlers include:
<ul>
<li><a href='apdx.html#shoebox'>Shoebox</a> for long-term archiving.
<li><a href='apdx.html#atomhopper'>Atom-Hopper</a> for pub-sub ATOM feeds.
<li><a href='apdx.html#distiller'>Stack-distiller</a>/<a href='apdx.html#winchester'>Winchester</a> for StackTach.v3 stream processing.
<li><a href='glossary.html#shoebox'>Shoebox</a> for long-term archiving.
<li><a href='glossary.html#atomhopper'>Atom-Hopper</a> for pub-sub ATOM feeds.
<li><a href='glossary.html#distiller'>Stack-distiller</a>/<a href='glossary.html#winchester'>Winchester</a> for StackTach.v3 stream processing.
</ul>
<li>Winchester takes the distilled notifications (called <a href='apdx.html#events'>events</a>) and stores them in the MySQL database.
<li>Winchester takes the distilled notifications (called <a href='glossary.html#events'>events</a>) and stores them in the MySQL database.
<li>Streams may be created or processed as new events flow into the system. This can result in new events or notifications being generated.
<li>Any new notifications can be published back into the queue for subsequent processing via the <a href='apdx.html#notabene'>Notabene</a> pipeline handler.
<li>Any new notifications can be published back into the queue for subsequent processing via the <a href='glossary.html#notabene'>Notabene</a> pipeline handler.
<li>... the sequence repeats itself.
</ul>
@ -135,7 +135,7 @@ max_messages = 100
<p>You can write your own Yagi handlers if you like, but there are a number that ship with StackTach.v3 to do some interesting things. The most important of these is the <code>winchester.yagi_handler:WinchesterHandler</code>. This handler is your entry point into StackTach.v3 stream processing. But first, we need to convert those messy notifications into events ...</p>
<h3>Distilling Notifications to Events</h3>
<p>Now we have notifications coming into Winchester. But, as we hinted at above, we need to take the larger notification and <i>distill</i> it down into a more manageable event. The stack-distiller module makes this happen. Within StackTach.v3, this is part of <code>winchester.yagi_handler:WinchesterHandler</code>.</p>
<p>A notification is a large, nested JSON data structure. But we don't need all of that data for stream processing. In fact, we generally only require a few <a href='apdx.html#trait'>Traits</a> from the notification. That's what distilling does. It pulls out the important traits, scrubs the data and uses that. Distillations are done via the distillation configuration file (specified in winchester.conf). </p>
<p>A notification is a large, nested JSON data structure. But we don't need all of that data for stream processing. In fact, we generally only require a few <a href='glossary.html#trait'>Traits</a> from the notification. That's what distilling does. It pulls out the important traits, scrubs the data and uses that. Distillations are done via the distillation configuration file (specified in winchester.conf). </p>
<p>Only <code>timestamp</code> and <code>event_type</code> are required traits.</p>
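<p>The distilling step described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the stack-distiller implementation or its configuration format: the <code>TRAIT_PATHS</code> mapping, the <code>lookup</code> and <code>distill</code> helpers and the sample payload fields are all hypothetical.</p>

```python
# Hypothetical mapping of trait name -> path into the nested notification.
# This is NOT the stack-distiller config format, only an illustration.
TRAIT_PATHS = {
    "event_type": ("event_type",),
    "timestamp": ("timestamp",),
    "instance_id": ("payload", "instance_id"),
    "tenant_id": ("payload", "tenant_id"),
}

# Per the text above, only these two traits are required.
REQUIRED_TRAITS = ("event_type", "timestamp")


def lookup(notification, path):
    """Follow a sequence of keys into nested dicts; None if any key is missing."""
    value = notification
    for key in path:
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


def distill(notification, trait_paths=TRAIT_PATHS):
    """Return a flat dict of traits, or None if a required trait is missing."""
    event = {}
    for name, path in trait_paths.items():
        value = lookup(notification, path)
        if value is not None:
            event[name] = value
    if any(name not in event for name in REQUIRED_TRAITS):
        return None
    return event


# A made-up notification; real ones are much larger.
notification = {
    "event_type": "compute.instance.create.end",
    "timestamp": "2015-02-27 13:29:09",
    "message_id": "hypothetical-id-1",
    "payload": {"instance_id": "abc", "tenant_id": "t1", "metadata": {"k": "v"}},
}
event = distill(notification)
```

<p>In this sketch the nested <code>metadata</code> block is simply dropped, which mirrors the point that stream processing needs only a handful of traits.</p>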
<span class="label label-default">A sample notification</span>
@ -281,7 +281,7 @@ pipeline_handlers:
notabene: winchester.pipeline_handler:NotabeneHandler
</pre>
<p>The first thing you'll notice is the database connection string. But then you'll notice that the Winchester module needs three other configuration files. The distiller config file we've already covered. The other two require a little more explanation. They define your <a href='apdx.html#trigger'>Triggers</a> and your <a href='apdx.html#pipeline'>Pipelines</a>.</p>
<p>The first thing you'll notice is the database connection string. But then you'll notice that the Winchester module needs three other configuration files. The distiller config file we've already covered. The other two require a little more explanation. They define your <a href='glossary.html#trigger'>Triggers</a> and your <a href='glossary.html#pipeline'>Pipelines</a>.</p>
<div class="panel panel-info">
<div class="panel-heading">
@ -334,7 +334,7 @@ config_file = winchester.yaml
<p>Streams are buckets that collect events. The bucket the event goes in is determined by the distinguishing traits you define. Generally these are traits that have a somewhat constrained set of values. For example, instance_id, request_id, user_id, tenant_id, region, server, ip_address ... are all good choices. Timestamp is generally not a good distinguishing trait since it varies so greatly. You would end up with a different stream for every incoming event and each stream would only have one event in it. Not very useful. Also, you can define multiple distinguishing traits. For example: region and the "day" portion of the timestamp. This would produce one stream for each region for each day of the month. If you had five regions, you'd end up with 5*31 stream buckets. The choices are limitless.</p>
<p>At some point you have to do something with the data in your buckets. This is what the fire criteria define. You can make time-based firing criteria (such as 2 hours past the last collected event) or trait-based criteria (such as "when you see the 'foo' event"). Wildcards are permitted in matching criteria. Time-based firings are defined with the "expiration" setting. There is a simple grammar for defining how much time has to elapse for an expiry to occur. We will go into detail on this later. For real-time stream processing, it's best to keep these expiries short or stick with trait-based firing criteria. Expiries = lag.</p>
<p>Finally, we define the pipelines that will process the streams when they fire or expire. Pipelines are sets of <a href='apdx.html#pipeline_handlers'>pipeline handlers</a> that do the processing. A pipeline handler is called with all the events in that stream. The events are in the temporal order they were generated. A pipeline handler does not need to concern itself with querying the database. It has all that it needs. Out-of-the-box, StackTach.v3 comes with a collection of pipeline handlers for computing OpenStack usage for billing as well as re-publishing new notifications back into the queue. More are constantly being added and writing your own pipeline handlers is trivial. But more on that later.</p>
<p>Finally, we define the pipelines that will process the streams when they fire or expire. Pipelines are sets of <a href='glossary.html#handlers'>pipeline handlers</a> that do the processing. A pipeline handler is called with all the events in that stream. The events are in the temporal order they were generated. A pipeline handler does not need to concern itself with querying the database. It has all that it needs. Out-of-the-box, StackTach.v3 comes with a collection of pipeline handlers for computing OpenStack usage for billing as well as re-publishing new notifications back into the queue. More are constantly being added and writing your own pipeline handlers is trivial. But more on that later.</p>
<p>You can define different pipelines for streams that fire and streams that expire. In the trigger definition file you simply give the name of the pipeline. Your winchester config file points to the pipeline configuration file that lists the pipeline handlers to run.</p>
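<p>To make the bucket idea concrete, here is a minimal Python sketch of grouping events into streams by distinguishing traits, using the region plus day-of-month example from above. It is an assumption-laden illustration rather than Winchester's actual code: the <code>stream_key</code> and <code>bucket_events</code> names and the sample events are made up.</p>

```python
from collections import defaultdict
from datetime import datetime


def stream_key(event):
    """Distinguishing traits: the region and the "day" portion of the timestamp."""
    return (event["region"], event["timestamp"].day)


def bucket_events(events):
    """Group events into streams keyed by their distinguishing traits."""
    streams = defaultdict(list)
    for event in events:
        streams[stream_key(event)].append(event)
    # Keep each stream in temporal order, as a pipeline handler would expect.
    for stream in streams.values():
        stream.sort(key=lambda e: e["timestamp"])
    return dict(streams)


# Made-up events for illustration only.
events = [
    {"region": "ord", "timestamp": datetime(2015, 2, 27, 10, 0)},
    {"region": "ord", "timestamp": datetime(2015, 2, 27, 9, 0)},
    {"region": "dfw", "timestamp": datetime(2015, 2, 27, 9, 30)},
    {"region": "ord", "timestamp": datetime(2015, 2, 28, 1, 0)},
]
streams = bucket_events(events)
```

<p>Using the raw timestamp as the key instead of <code>(region, day)</code> would give every event its own stream, which is why it makes a poor distinguishing trait.</p>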

@ -58,7 +58,7 @@
<div class="col-lg-12">
<h3>Contributing to StackTach.v3</h3>
<p>StackTach.v3 is licensed under the Apache 2.0 license</p>
<p>All the source repos for StackTach.v3 (and .v2) are available on <a href='https://github.com/stackforge?query=stacktach'>SourceForge</a>. Details on contributing to StackForge projects are available <a href='https://wiki.openstack.org/wiki/How_To_Contribute'>here</a></p>
<p>All the source repos for StackTach.v3 (and .v2) are available on <a href='https://github.com/stackforge?query=stacktach'>StackForge</a>. Details on contributing to StackForge projects are available <a href='https://wiki.openstack.org/wiki/How_To_Contribute'>here</a></p>
<p>The core developers are available on Freenode IRC in the <code>#stacktach</code> channel</p>
<p>These docs are available in the Sandbox repo. Patches welcome!</p>

docs/glossary.html Normal file
@ -0,0 +1,85 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta name="description" content="StackTach.v3">
<meta name="author" content="Sandy Walsh">
<link rel="icon" href="../../favicon.ico">
<title>StackTach.v3</title>
<!-- Latest compiled and minified CSS -->
<link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.2/css/bootstrap.min.css">
<link href="stv3-narrow.css" rel="stylesheet">
<!-- Optional theme -->
<link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.2/css/bootstrap-theme.min.css">
<!-- Latest compiled and minified JavaScript -->
<script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.2/js/bootstrap.min.js"></script>
<style>
.bottom_padding {
padding-bottom: 20px;
}
</style>
<!-- HTML5 shim and Respond.js for IE8 support of HTML5 elements and media queries -->
<!--[if lt IE 9]>
<script src="https://oss.maxcdn.com/html5shiv/3.7.2/html5shiv.min.js"></script>
<script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
<![endif]-->
</head>
<body>
<div class="container">
<div class="header bottom_padding">
<nav>
<ul class="nav nav-pills pull-right">
<li role="presentation"><a href="index.html">Home</a></li>
<li role="presentation"><a href="about.html">Docs</a></li>
<li role="presentation"><a href="install.html">Getting Started</a></li>
<li role="presentation"><a href="#">API</a></li>
<li role="presentation"><a href="screencasts.html">Screencasts</a></li>
<li role="presentation"><a href="contribute.html">Contribute</a></li>
</ul>
</nav>
<img src="StackTach_160x70.png"/>
</div>
<div class="row marketing">
<div class="col-lg-12">
<h3>Glossary</h3>
<ul>
<li><a id='atomhopper'>Atom-Hopper</a> - Atom-Hopper is a Java system that produces ATOM feeds in a pub-sub manner.
<li><a id='distiller'>Stack Distiller</a> - Stack-Distiller is a Python library that extracts key traits from a complex JSON message to produce a smaller, flat set of key-value pairs.
<li><a id='events'>Events</a> - Events are what we call Notifications that have been distilled.
<li><a id='handlers'>Handlers</a> - A handler is Python code that processes a small chunk of data. In StackTach.v3 we have a variety of different handlers for different purposes. There are Shoebox Handlers for dealing with notification archives, Yagi Handlers for processing messages as they come off the queue, Winchester Pipeline Handlers for processing completed event streams, etc. Refer to the appropriate library to see the structure of that handler, as they are all a little different.
<li><a id='notification'>Notification</a> - A notification is a JSON data structure. It can contain nested data with all native JSON data types. A notification must have <code>event_type</code>, <code>message_id</code> and <code>timestamp</code> in the top level traits.
<li><a id='notigen'>Notigen</a> - Notigen is a Python library that generates fake OpenStack Nova-style notifications. It simulates the common operations of Nova such as Create/Delete/Resize/Rebuild instance.
<li><a id='notabene'>Notabene</a> - Notabene is a Python library that consumes notifications from and publishes notifications to RabbitMQ queues. Within StackTach.v3 it's used for its publishing capabilities. There is a Winchester Pipeline Handler that uses Notabene to publish new notifications back to RabbitMQ. Notigen also uses Notabene to push simulated notifications to RabbitMQ.
<li><a id='pipeline'>Pipeline</a> - A pipeline is a series of handlers that process data one after another. There are Yagi pipelines, Winchester pipelines and Shoebox pipelines.
<li><a id='queues'>Queues</a> - A queue refers to a RabbitMQ queue. In RabbitMQ, messages are published to exchanges, which route them to queues, where they wait until they are read by consumers.
<li><a id='shoebox'>Shoebox</a> - Shoebox is a Python library for archiving complex JSON messages. Messages can be stored locally and tarballed (like logfiles) or packaged into binary archives. Archives can be exported to external stores, like HDFS or Swift, when they reach a certain size or age.
<li><a id='trait'>Traits</a> - Traits are key-value pairs. For example, in <code>{'foo': 1, 'blah': 2}</code> foo and blah are traits.
<li><a id='trigger'>Trigger</a> - A trigger is a rule that determines when a Winchester stream should be processed. There are triggers that can fire when a particular event is seen or after a period of stream inactivity.
<li><a id='yagi'>Yagi</a> - <a href='https://github.com/rackerlabs/yagi'>Yagi</a> is a Python library for consuming messages from queues. It supports a handler-chain approach to processing these messages. A handler can do whatever it wants with consumed messages. Multiple yagi workers can be run to consume messages faster.
</ul>
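<p>As a small illustration of the Notification entry above, the following Python sketch checks that a message carries the three required top-level keys. The <code>missing_keys</code> helper and the sample messages are hypothetical and are not part of any StackTach.v3 library.</p>

```python
# Top-level keys a notification must have, per the glossary entry.
REQUIRED_KEYS = ("event_type", "message_id", "timestamp")


def missing_keys(notification):
    """Return the required top-level keys that are absent, in a fixed order."""
    return [key for key in REQUIRED_KEYS if key not in notification]


# Made-up messages for illustration only.
good = {
    "event_type": "compute.instance.create.end",
    "message_id": "hypothetical-id-1",
    "timestamp": "2015-02-27 13:29:09",
    "payload": {"foo": 1, "blah": 2},
}
bad = {"event_type": "compute.instance.create.end", "payload": {}}

print(missing_keys(good))
print(missing_keys(bad))
```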
<footer class="footer">
<p>&copy; Dark Secret Software Inc. 2014</p>
</footer>
</div> <!-- /container -->
<!-- IE10 viewport hack for Surface/desktop Windows 8 bug -->
<script src="../../assets/js/ie10-viewport-bug-workaround.js"></script>
</body>
</html>