320 Commits

Author SHA1 Message Date
chengebj5238 f67e6b376e Correct spelling mistakes
Change-Id: I4a87c3cc9562437ff7548bb523b9f94684bed05b
2021-03-01 08:16:33 +00:00
Georgina Shippey b05893ae39 Fixes for tests/gates
Bump ansible version to 2.10.5
Prefer python3 over python2
Fix ansible str vs int comparisons
Print a message if setting defaultIndex in kibana 6 fails
Use saved_objects API for setting defaultIndex in kibana 7
Update version in dashboard path for 7x
ILM doesn't like bodies from GET requests
Stop using forked galera_server role, now that 633321 is merged
Force osquery inventory to python3
Update zuul jobs

Change-Id: Ibfc20b1605245927ad4de4a54e751a13defb1ee0
2021-03-01 00:02:17 +01:00
Dmitriy Rabotyagov 3d2aed2c2d Use version test instead of version_compare
This test was changed to 'version' in ansible 2.5 [1].

[1] https://docs.ansible.com/ansible/2.8/user_guide/playbooks_tests.html#version-comparison

Change-Id: I21efa77fc743f9530d307dc06c8a345475d35dfa
2019-09-10 10:09:32 +00:00
Jonathan Herlin 9230c3f392 Fix spelling in README
Change-Id: I17238a1acad4f34137597457c6364ae50a8d22b5
2019-07-11 21:23:15 +02:00
melissaml 87cbdd6649 Replace git.openstack.org URLs with opendev.org URLs
Change-Id: I790c1876a3e44da8623c74632332f0e453dce1f6
2019-07-09 16:36:22 +00:00
cloudnull c12168e419 updates for elk6.7x
Some of the plugins are irrelevant now, so with this release they've been
removed. Additionally, the machine-learning switch in the updated beats
no longer does anything, so it's also been removed.

Change-Id: Ibac0177a61af5392cb80888a8fca1fa9ebe3ad4b
Signed-off-by: cloudnull <kevin@cloudnull.com>
2019-04-15 11:38:09 -05:00
cloudnull 28cb67cf33 improve deployments on 14.04
Change-Id: Ic2c335d8c3ede9dad2edb86a76139bdb71bdb6f7
Signed-off-by: cloudnull <kevin@cloudnull.com>
2019-03-07 20:05:43 -06:00
Zuul 8702dca38c Merge "Update heartbeat config for the latest stable release" 2019-02-27 14:04:25 +00:00
Kevin Carter aabf90d1a4 Update heartbeat config for the latest stable release
Change-Id: I0db06c07ac9320c5db927f23e32fdb8194e5106b
Signed-off-by: Kevin Carter <kevin@cloudnull.com>
2019-02-27 06:26:13 +00:00
Kevin Carter 280ff11746 Update auditbeat config for the latest stable release
Change-Id: I468992009f562ca7d48fb88aab41edb552e23831
Signed-off-by: Kevin Carter <kevin@cloudnull.com>
2019-02-27 06:26:09 +00:00
Kevin Carter c74eed3845 update packetbeat config for the latest release
Change-Id: If370e015ec2ec33b6f6e744958d7bcbed041ab42
Signed-off-by: Kevin Carter <kevin@cloudnull.com>
2019-02-26 22:29:53 -06:00
Kevin Carter 2d3c0d55f4 Update metricbeat config for the latest release
Change-Id: I312a0c272143973050f81f34867471098cec3286
Signed-off-by: Kevin Carter <kevin@cloudnull.com>
2019-02-26 22:22:46 -06:00
Kevin Carter 4490ed3dea Update journalbeat config
The journalbeat configuration has been updated to make it
similar to all other beats. This change updates our config
so that it is functional with the latest journalbeat release.

Change-Id: Ic70a031bdeb57f2f5439763a3bf9f6b7001e6a31
Signed-off-by: Kevin Carter <kevin@cloudnull.com>
2019-02-26 22:03:12 -06:00
cloudnull a3afb64654 Ensure the default version of Java is set
When installing ES and Logstash the system version of java needs to
match the expected version of java ES and Logstash will use. This
change uses the `update-alternatives` command to set the java version
to the expected value when more than one java exists on a system.
The necessity for this change came from OS level upgrades within
environments running OLD versions of ES. Upon upgrading the base OS
our playbooks could not complete an upgrade of ES which was due to the
java expectations. Once the alternatives were set accordingly the
upgrade completed without issues.

Change-Id: I9025967f723ee17940e11789f503e342cdad6f2a
Signed-off-by: cloudnull <kevin@cloudnull.com>
2019-02-26 11:36:22 -06:00
cloudnull aa41c62f9d comment out monitorstack installation for now
monitorstack for openstack services was added to site; however, it is
not quite ready yet, so it's been removed from site and will remain as a
beta capability for the time being.

Change-Id: Ia9a8af49c288bf04c02b3be6b4179c9a2fb07076
Signed-off-by: cloudnull <kevin@cloudnull.com>
2019-02-19 23:34:20 -06:00
cloudnull a4d2b3c1f9 Correct service name when running with upstart
The service name for packetbeat needs to be set correctly; this change
does that.

Change-Id: I39c10914ba2d0f16b6ebb94da480ad13f455a08f
Signed-off-by: cloudnull <kevin@cloudnull.com>
2019-02-19 21:39:36 +00:00
cloudnull faf940cbfc Add dynamic group detection to openstack metrics
This change allows the larger os metrics installation process to be
selective. A dynamic group will be created and hosts will be added to it
when the host is within a known target group and is using systemd
(required for now).

Change-Id: I225ec5b5ffe4aa8ba403624f9ebe8c9eebed9fee
Signed-off-by: cloudnull <kevin@cloudnull.com>
2019-02-19 20:35:35 +00:00
cloudnull 22f997f7fa Ensure upper and lower memory limits are whole numbers
This change will ensure that the upper and lower memory limits are set
to whole numbers when dynamically setting the memory limits. Before this
change a memory limit could be a float, and in certain versions of
ansible + python the option could evaluate to 0, which would break
everything.

Change-Id: Ibfb8af60db6566937cbf77243b0e6848d542665d
Signed-off-by: cloudnull <kevin@cloudnull.com>
2019-02-19 13:11:02 -06:00
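
To illustrate the whole-number limits described in commit 22f997f7fa, here is a minimal Python sketch. It is not the playbook's actual logic: the function name, the fractions, and the floor of 1 MB are all assumptions made for the example.

```python
def whole_number_limit(total_mb, fraction, floor_mb=1):
    """Return a memory limit in MB as an int.

    Hypothetical helper: truncates the fractional part and never
    returns less than floor_mb, so the result cannot collapse to 0.
    """
    limit = int(total_mb * fraction)  # drop any fractional part
    return max(limit, floor_mb)


# Example host with 3951 MB of memory (an arbitrary illustrative value).
lower = whole_number_limit(3951, 0.25)
upper = whole_number_limit(3951, 0.5)
```

The guard against a zero result is one possible way to avoid the failure mode the commit message mentions; the real change may handle it differently.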
Zuul e52e99166c Merge "Add monitorstack data collection into ES" 2019-02-19 15:59:08 +00:00
Kevin Carter 8ee8ec0832 Update grafana, use vendored role, and add es lb
The grafana role will now deploy and setup the grafana datasources using
the API as expected. API users will also be created for admin, viewer, editor.

The es config for grafana has been updated to correct issues where the system
expected a publicly accessible lb to handle grafana traffic back to an es
cluster. When the grafana role deploys, the traefik lb will now be used within
the grafana deployment to ensure grafana is able to deploy against an es cluster.

Change-Id: Iae3a5c2ab1b98390110d37f33b074156d32bb684
Signed-off-by: Kevin Carter <kevin@cloudnull.com>
2019-02-18 21:20:15 -06:00
cloudnull c6493a812b Add monitorstack data collection into ES
The monitorstack data collection can export data into elasticsearch.
A playbook has been added to deploy the data collection probes which
will leverage systemd-timers to run the probes at regular intervals. The
systemd timers will be deployed per-probe and run within the utility,
compute, and memcached hosts. Wherever the probes are deployed, an
isolated user will fence the probes from the cluster and limit
access. OpenStack probes will only be deployed when an openstack-sdk
clouds config is found within the system.

Change-Id: Ic5cd5fd51a7e0763c0a2db40af4150b8851bc748
Signed-off-by: cloudnull <kevin@cloudnull.com>
2019-02-18 08:46:39 -06:00
Kevin Carter 326fde4895 Increment nginx check port
In the event that both NGINX and apache are co-existing on the same machine
the status port check for both platforms will be the same, and that will
cause one of the services to not start. This change increments the NGINX
check port to ensure there are no conflicts.

Change-Id: I03d5d351fff2d6926f35ca860c01f5a075de42aa
Signed-off-by: Kevin Carter <kevin@cloudnull.com>
2019-02-14 20:42:56 -06:00
Kevin Carter 52c2702587 Fix dashboards and possible port conflicts
This change ensures we are not creating a port conflict for apache/nginx when
status is enabled. This change also ensures dashboards are created correctly
resolving an issue with index-patterns containing a regex.

Change-Id: I8228fc9832d02518d2db843c96abf6dffc63bdfc
Signed-off-by: Kevin Carter <kevin@cloudnull.com>
2019-02-14 18:14:28 -06:00
Kevin Carter 9c9efd9eb5 Change the q_mem and h_mem to lower and upper limits
This change removes the {h,q}_mem options in favor of a new variable which
clearly states the upper and lower limits for a given deployment. This change
also makes these options a lot more conservative by default which will allow
the deployment to better run on shared infra.

Change-Id: I169f457198c11edc4881a04df65312f6c4f67feb
Signed-off-by: Kevin Carter <kevin@cloudnull.com>
2019-02-14 08:50:29 -06:00
cloudnull 0a0a4a0880 Add the ability to enable or disable rollups / indexes
This change creates a new option to enable or disable rollup jobs. It
also provides the default basic index patterns for kibana index
patterns and elastic indexes.

Change-Id: I60e96a2cdbe27de760b54c4d9d43bcde4d09bbf5
Signed-off-by: cloudnull <kevin@cloudnull.com>
2019-02-11 23:14:51 -06:00
cloudnull 03d25dce3d Add logstash ingestion for collectd
This change will allow logstash to ingest metrics from collectd. New
options have been added to enable the deployment and configure it.

Change-Id: I995c0db69fc68d5f5bcae27ce16956876368e2a8
Signed-off-by: cloudnull <kevin@cloudnull.com>
2019-02-11 00:13:37 -06:00
Zuul 38f817aee7 Merge "Omit dahsboard on elk setup by default" 2019-02-04 06:02:11 +00:00
Kevin Carter 6017fc0e89 Add the ability to set the JVM heap size
This change makes it possible for users to set the `elastic_heap_size_default`
value. Before this change, the option was unreachable due to a series of
fact-generated template values. The options `elastic_heap_size` or `logstash_heap_size`
have also been exposed giving deployers the ability to define service specific
heap sizes as needed.

Change-Id: Ida3a57fdcff388f8e4bb3f325b787205a6183970
Signed-off-by: Kevin Carter <kevin@cloudnull.com>
2019-01-30 09:53:20 -06:00
Kevin Carter 151d80382c Omit dahsboard on elk setup by default
With the introduction of the "infrastructure" panel and "canvas" becoming
stable, there's not a lot of reason to import the general beat dashboards.
The default dashboards are almost always in a state of disrepair and take a
long time to import on high traffic clusters.

This change removes the default dashboard from the beat setup role by
default. If a deployer wishes to re-enable the default dashboards, or add any
other beat flags, the variable `elastic_setup_flags` can be used to extend
the setup.

Change-Id: If44845f53e4d0cb1e91ec804060316fb852b4bfa
Signed-off-by: Kevin Carter <kevin@cloudnull.com>
2019-01-27 20:13:31 -06:00
Kevin Carter 82cc72e166 Read the path for the logstash queue path
The queue path within logstash may be a symlink which will fail to mount
as tmpfs. To ensure the queue path can be tmpfs, a readlink command is
used to fetch the true path, which will be used in a mount when necessary.

Change-Id: I5fe6bf311e0621c98766ae458371b5f11f89a61f
Signed-off-by: Kevin Carter <kevin@cloudnull.com>
2019-01-25 13:53:41 -06:00
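
The symlink resolution described in commit 82cc72e166 can be sketched in Python as follows. The commit itself uses a readlink command inside a playbook; this example swaps in `os.path.realpath` to show the same idea, and the function name is hypothetical.

```python
import os


def resolve_queue_path(path):
    """Return the real location behind a possibly symlinked path.

    Hypothetical helper: follows every symlink in the path, which is
    roughly what a readlink-style lookup gives a mount task.
    """
    return os.path.realpath(path)
```

A mount step would then target the returned path rather than the symlink itself.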
Zuul a54602805f Merge "Fix the misspelling of "container"" 2019-01-23 22:34:34 +00:00
lijunjie defe320d86 Fix the misspelling of "container"
Change-Id: I201ed221941df93ed61eac3f256e8a60a0534c9b
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2019-01-23 09:53:34 -06:00
Kevin Carter abd6661b4e Update conditionals and namespaced options
This change implements namespaced variables and conditionals in needed
services. This will ensure systems running these playbooks are able to
be deployed in isolation without making osa specific assumptions.

Change-Id: Ia20b8514144f0b0bf925d405f06ef2ddc28f1003
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2019-01-23 09:38:40 -06:00
Kevin Carter 892d617dc6 Add a default retention policy for skydive indexes
Given the ops tools now have a skydive deployment capability,
curator needs to be able to detect the addition of skydive indexes and
build a curator policy accordingly.

This change adds the new retention policy to the overlay inventory
providing a sane default for most environments.

The retention action files have been updated to remove the "-" as an
index separator. This was done because not all indexes use a dash as a
divider.

Change-Id: I5b61720f27da00e0c3b92341355b09ea6c01caba
2019-01-15 17:35:09 -06:00
Kevin Carter b23ec9f8d9 Initial commit to add skydive
This commit adds playbooks and roles to the ops tooling setup to
build, deploy, and operate environments with skydive within
them.

Skydive is a network analyzer which will allow users to explore
their topology in real-time using a defined storage back-end for
captures, alerts, and more.

The initial implementation of skydive deploys agents throughout
the environment and wires them all back to a cluster of analyzers
which leverage elasticsearch for their persistent storage back-end.
Storage back-ends are load balanced from within the analyzer
nodes using the traefik light-weight reverse proxy. This setup
gives skydive a fully fault tolerant deployment.

Tests have been added to ensure the binary installation process
is validated. While these jobs are non-voting today, they'll be
iterated on and made passing in the subsequent PRs. All jobs are
following the selective pattern which allows these tools to be
gated in the mono-repo without impacting all other tools within
the environment.

Change-Id: Iaa1152566f2b615d67a33dc94ebdbebb1b492a9d
Signed-off-by: Kevin Carter <kevin@cloudnull.com>
2019-01-14 03:03:08 -06:00
Kevin Carter a1d6ebe4d3 remove dynamic ns.enable generators
The ns.enabled generators will fail when running packetbeat with a limit.
These generators were dynamically enabling/disabling packetbeat features
based on things discovered in the environment; however, they were
attempting to be a little too fancy, especially when running packetbeat
in a non-osa cloud. The values for the services have been reset to the
provider defaults and should the deployer want to configure these options
they can use config_template.

Change-Id: I36d7298ca5142e8b5f926ab5d59ab8283704b5af
Signed-off-by: Kevin Carter <kevin@cloudnull.com>
2019-01-10 16:00:37 -06:00
Kevin Carter 7491b6df8e Update the embedded-ansible-setup process to be configurable
This change allows the embedded ansible process to be configurable by
the end user.
  * Python requirements and ansible roles will all now be user
    configurable.
  * Setup is now a local only playbook. This playbook replaces the bash
    commands we were rerunning when the `bootstrap-embedded-ansible.sh`
    script was executed.
  * Embedded ansible version is now 2.7.5 as default.
  * Deprecation warnings have been resolved.
  * Tests impacted by this change have been updated.

Change-Id: I4303c44e249cda31457a4f05a681e298d225a8b7
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2019-01-04 11:46:19 -06:00
Kevin Carter b1232aead5 Add pretasks to exit quick when needed
The journalbeat playbook uses conditionals to know when to deploy the
journalbeat collector. This change makes it so the playbook simply exits
when the journal is not found or the environment being deployed is not
using systemd. This change will result in faster deployments in mixed
environments as the role will no longer need to iterate over its
conditional.

Change-Id: I581b61902723f54237623036566a83c9be79210e
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2019-01-03 09:05:32 -06:00
Kevin Carter 5586d8a80f Convert template setup to a role
This change reduces code throughout the playbooks thereby speeding up
the task execution.
  * A new role named `elastic_beat_setup` was created to
    facilitate template setup as needed.
  * Beats retention policies are now defined on the elastic-logstash
    nodes instead of on all target hosts. This method will speed-up
    deployments on massive installations while streamlining all deployments.
  * Kibana variable assumptions have been fixed. This will allow for
    deployments without Kibana to be accomplished.

Change-Id: I36343264042e81dfcb68bad0f6c3a503e525eceb
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2019-01-02 20:38:47 -06:00
Kevin Carter 9a896aa81a Fence options before casting to json
These options could be "undefined" which is an object and not json
serializable. This change ensures if an option is undefined it defaults
to an empty set which will allow the option to be json serialized.

Change-Id: I1a81bafa441aa6400bfbec50d57e56df4d09bda3
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2018-12-20 17:38:18 -06:00
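
The fencing idea in commit 9a896aa81a can be sketched in Python. This is an illustration only: the sentinel object stands in for a template "undefined" value, the function name is hypothetical, and an empty dict is used as the default because a Python set is not JSON serializable.

```python
import json

# Stand-in for an "undefined" template value (an assumption for this sketch).
_UNDEFINED = object()


def to_json(option=_UNDEFINED):
    """Serialize an option, defaulting to an empty mapping when unset."""
    if option is _UNDEFINED or option is None:
        option = {}  # fence the value so json.dumps always succeeds
    return json.dumps(option)
```

Calling `to_json()` with no argument returns an empty JSON object instead of raising a serialization error.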
Kevin Carter eb3bcb8daa Add local facts for template creations
Template creation can take a lot of time when dealing with large data
sets. This change makes it so template creation will only happen on a
greenfield installation or when upgrading. A simple rerun of the
playbooks will not trigger template creation, which will allow deployers
to change or modify deployments without having to
worry about extended runtimes due to template interactions.

Change-Id: Ia9b77277553fbdbe0444737f39ec3de75f07cc0f
2018-12-19 17:59:21 -06:00
Kevin Carter ad91d5773e Extend auto change detection
This change makes it possible for a deployer to modify the set of
indexes and weights associated with them. If modified, the local
facts will be automatically updated.

Change-Id: Iaea1f22d8aad2abdd02801dd9acad5f969b78d0e
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2018-12-19 16:15:28 -06:00
Zuul f7552334ba Merge "Add missing prometheus port for ceph auto discover" 2018-12-19 16:25:22 +00:00
Michael Vollman 0ea548f979 Add missing prometheus port for ceph auto discover
When ceph prometheus metrics are auto discovered the metricbeat config
should point to the ceph mgr prometheus port. Adding missing
brackets around metricset so the default is treated as an array.
Dropping ceph dir detection for prometheus auto discover and relying on
the port availability and inventory group only.

Change-Id: Iaba0fdece00414e17bc172f39e624374a9d273e8
2018-12-19 09:56:42 -05:00
Kevin Carter e08c58dd15 modify fact gathering to use local facts
This change makes the retention gathering operation faster by storing
the retention values as "local facts". The local facts are then
referenced in templates loading from the local fact file instead of
running repetitive queries which are slow making very large deployments
cumbersome. To make the retention policy fact gathering process smarter
it will now automatically refresh if undefined or should the
elasticsearch cluster size change. This will ensure we're improving
speed of execution while also catering to the needs of deployers to
grow, or shrink, elasticsearch cluster sizes.

Documentation has been added regarding the new option and why it may be
of use to deployers.

Change-Id: I3936ee94461ac39fb8bc78dc2c873e6067552461
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2018-12-19 01:37:06 -06:00
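
The caching behaviour described in commit e08c58dd15 (reuse stored retention values, refresh when they are undefined or the cluster size changes) can be sketched in Python. The file layout, key names, and the `compute` callback are assumptions for this example and do not reflect the real local fact format.

```python
import json
import os


def load_retention_facts(fact_file, cluster_size, compute):
    """Return cached retention values, recomputing only when needed.

    Hypothetical helper: `compute` is any callable that takes the
    cluster size and returns the (slow to gather) retention values.
    """
    cached = None
    if os.path.exists(fact_file):
        with open(fact_file) as handle:
            cached = json.load(handle)

    # Refresh when nothing is stored or the cluster has grown or shrunk.
    if cached is None or cached.get("cluster_size") != cluster_size:
        cached = {
            "cluster_size": cluster_size,
            "retention": compute(cluster_size),
        }
        with open(fact_file, "w") as handle:
            json.dump(cached, handle)

    return cached["retention"]
```

A second call with an unchanged cluster size reads the stored file and skips the slow query entirely.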
Kevin Carter 3a69a1c43d Update elk_6x for 6.5.x
This change updates the roles / playbooks to begin using Elasticsearch
release 6.5.x. Core to this change is the conversion of the journalbeat
role from custom compiled go, to simple package install which was made
possible by the folks at elastic within this release. Because of the
conversion the "beats-community" playbook has been removed given it's now
empty.

A change to the bootstrap script was made allowing it to parse an OS id
with a "-" in it, like "opensuse-tumbleweed".

Change-Id: Ic9b80234d6a6ce876bff885f3223874602d55dd6
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2018-11-27 18:56:22 +00:00
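
As an illustration of parsing an OS id that contains a "-" (commit 3a69a1c43d), here is a small Python sketch. The real bootstrap script is a shell script whose parsing is not shown in this log; the assumption here is that the id comes from an os-release style `ID=` line, and the function name is hypothetical.

```python
import re


def parse_os_id(os_release_text):
    """Extract the ID value from os-release style text.

    Hypothetical helper: the character class includes "-" so ids such
    as "opensuse-tumbleweed" are returned whole rather than truncated.
    """
    match = re.search(
        r'^ID="?([A-Za-z0-9._-]+)"?\s*$', os_release_text, re.MULTILINE
    )
    return match.group(1) if match else None
```

Anchoring on the start of the line keeps related keys such as `ID_LIKE` or `VERSION_ID` from matching.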
Kevin Carter 9af219910e reduce container count on cluster test
Change-Id: Id70967834f43009e6c91fb71877606118242a429
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2018-11-21 23:19:28 -06:00
Kevin Carter 6797da6b1e Correct clustered gating job
The cluster gating job was failing due to memory locking, which is
impossible due to limited resources in the CI environment. This change
disables memory locking in the clustered gate test.

Change-Id: I0a146c41a1b82425539e014b1baee2011d464e05
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2018-11-20 10:25:19 -06:00
Michael Vollman 3f70fc76d9 Add auto discover for ceph prometheus metrics
Add to elk_metrics_6x the capability to auto discover the ceph mgr
prometheus plugin metric port.  Pull ceph metrics from prometheus when
the plugin is enabled.

Change-Id: I530a99f42e396ba7b2cd2c1b3d587f528ef84242
2018-11-09 12:42:28 -05:00
Michael Vollman bc180319ba Metricbeat role check for a ceph api to monitor
Before enabling the metricbeat ceph module, first check that the
restapi port is up and responding.

Change-Id: Ic795df02b93ca22c19fe67d6d2319889dc0f06a0
2018-11-08 15:50:50 +00:00
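
The pre-check described in commit bc180319ba (confirm the ceph restapi port is up before enabling the module) can be sketched in Python. This is a simplified stand-in: it only verifies that a TCP connection can be opened, not that the API answers requests, and the function name and timeout are assumptions.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True when a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

A caller would enable the module only when the check returns True for the target host and port.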