Commit Graph

7 Commits

Sagi Shnaidman 0977f8502d Big clean of tripleo-ci
Change-Id: Iff0350a1fff1057d1de924f05693258445da9c37
2020-01-22 18:37:04 +02:00
Steven Hardy 821d84f34c Replace references to deprecated controllerExtraConfig
This should be ControllerExtraConfig: the lowercase parameter name
has been deprecated for some time and is inconsistent with all other
roles.

Since mitaka is now EOL this also removes references to the
worker-config-mitaka-and-below environment.

Change-Id: I0f07b3abbe290ed7f740a6f4915e16be39e3a4c6
2017-08-04 23:05:55 +00:00
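After this rename, environment files should set ControllerExtraConfig under parameter_defaults. A minimal sketch of such an environment file; the hieradata key shown is illustrative, not taken from this change:

```yaml
# ControllerExtraConfig is the current parameter name;
# controllerExtraConfig is the deprecated alias this change removes.
parameter_defaults:
  ControllerExtraConfig:
    # illustrative hieradata override -- actual keys depend on the deployment
    nova::compute::libvirt::libvirt_disk_cachemodes:
      - file=directsync
```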
Ben Nemec b5ce8758f2 Use only RAMWeigher for rh1 scheduling
We have a few black-sheep compute nodes in rh1 that have no SSD
and/or less memory than the rest.  The non-SSD nodes tend to be
preferred by the scheduler due to the larger disk capacity they report
(1000 GB > 200 GB), and that's particularly a problem when we have a
node with only 64 GB of memory and a 1 TB drive.  It tends to get
over-scheduled, even though it's our slowest node.

Since we almost exclusively care about distributing memory evenly,
let's weigh scheduler decisions on that alone.  Note that the normal
filters are left in place, so we shouldn't ever try to schedule
more VMs than a node can handle.  This will only stop us from
preferring the nodes with slower storage.

Also note that this change is already live in rh1 and seems to be
working fine.  I'm just updating the env to reflect the change.

Change-Id: I54731e49ed9bb08a6048bc52fe25412d9de6473c
2017-02-09 16:26:09 +00:00
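Restricting the scheduler to the RAM weigher is a one-line nova.conf change on this release series. A sketch, assuming the Mitaka/Newton-era option names:

```ini
[DEFAULT]
# Weigh hosts only by free RAM.  The default, nova.scheduler.weights.all_weighers,
# also includes the disk weigher that favors the big-but-slow non-SSD nodes.
scheduler_weight_classes = nova.scheduler.weights.ram.RAMWeigher
# The filters (RamFilter, DiskFilter, ...) still run, so hosts without
# enough memory or disk are excluded before weighing happens.
```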
Ben Nemec 9d26dbc764 Limit heat and nova workers
Running the full 24 workers doesn't increase our capacity from what
I can tell, so it's just a waste of CPU and memory.  Neutron is by
far the most heavily used service in rh1, and limiting its workers
seems to cause issues, so I'm leaving it alone.

Change-Id: I39b89a0eeb9791fb60b44f3bf62dc31bd721624f
2017-01-19 15:35:58 +00:00
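The nova worker counts are plain config options; by default each API service spawns one worker per core (24 on these hosts). A sketch of the nova.conf side; the value 4 is illustrative, since the commit does not record the exact count chosen:

```ini
# nova.conf -- cap API workers instead of the one-per-core default (24 here).
# The value 4 is a hypothetical example, not the number from this commit.
[DEFAULT]
osapi_compute_workers = 4
metadata_workers = 4
```

Neutron's equivalent knobs (api_workers, rpc_workers) are deliberately left at their defaults per the message above.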
Ben Nemec 26e940c68f Increase disk_allocation_ratio to 4
With a max of 60 envs, we're starting to hit scheduling errors due
to lack of disk space on some of the compute nodes.  In reality,
none of the compute nodes are using more than 61% of their disk
and most are under 50%, so a 33% increase in overcommit should be
safe enough.

We may also want to increase the scheduler retries to help with
this problem.  Part of the issue is that most of the compute nodes
have sufficient disk available, but if we happen to get unlucky and
pick three in a row that don't, the instance fails.  More
scheduler retries would help with that.

Change-Id: Ifb2db1ddafd183aa4c9584b406e3b47bf7b0c5a9
2016-09-29 18:20:49 +00:00
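The arithmetic behind the "33% increase in overcommit" implies the previous ratio was 3 (the retry knob mentioned would be scheduler_max_attempts, which defaults to 3 in this era). A small sketch of the overcommit math, with the 1 TB node from the earlier commit as the example:

```python
# Sketch of the disk-overcommit arithmetic behind this change.  The previous
# ratio of 3 is inferred from "a 33% increase", not stated in the commit.

def schedulable_disk_gb(physical_gb: float, allocation_ratio: float) -> float:
    """Disk capacity the scheduler treats as available on a compute node."""
    return physical_gb * allocation_ratio

old = schedulable_disk_gb(1000, 3)  # 3000 GB apparent capacity
new = schedulable_disk_gb(1000, 4)  # 4000 GB apparent capacity
increase = (4 - 3) / 3              # ~0.33 -> the "33% increase in overcommit"
print(old, new, round(increase, 2))
```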
Ben Nemec e9cabd607f Limit Heat engine workers to 4 in rh1
If we have more than 4 Heat engine workers in rh1 they can generate
so much traffic that the other services have trouble keeping up,
which causes all kinds of failures.  4 seems to be a pretty good
sweet spot: Heat has plenty of capacity to create stacks but
doesn't DoS the other services.

Note that this is already in the running config on rh1.  This is
just a patch to update the env file so it will be persistent.

Change-Id: I2c5d33cf2307349ea231ad1ba07170b250a84cef
2016-09-09 21:15:02 +00:00
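The persistent form of this setting is a single heat.conf option; a sketch, using the value 4 from this commit:

```ini
# heat.conf -- pin engine workers to 4 rather than one per core
[DEFAULT]
num_engine_workers = 4
```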
Derek Higgins 7ddd6d784c Add details for rh1 deployments
Change-Id: Ida58bba79bf5c790456b9fd82beb112eae446985
2016-07-27 12:58:19 +01:00