entropy

Commit Graph

Author	SHA1	Message	Date
Pranesh Pandurangan	c79d12c645	Finish changes to stop an engine Added a call to stop all react scripts, by passing all known routing keys to the react_killer function. Then stop the threadpool executor. Finally, Raise a known exception to stop the watchdog thread. This will throw an ugly traceback, but will shutdown the engine gracefully. Also made minor changes to the example react.json, to change the log format. Knowing the time a log was printed is useful. Change-Id: Ibed06f79547312d188feb499f937eb5390d60c3e	2014-07-14 16:47:43 -07:00
Pranesh Pandurangan	5b647e5f79	Add logic to stop repair scripts When watchdog detects that repair script(s) have been killed, get a list of scripts to nuke and pass to stop_repair_scripts. Then, get its routing key(s), and send a message from a special user to any queue listening on those keys. Modified an example repair script to show how it could be killed, but need a more concrete way than that. For now, messages from 'react_killer' will raise the RepairStoppedException, which will stop react scripts Modified the example engine cfg to have some details about the kombu connection to use. Implements blueprint kill-repair-scripts Change-Id: I67e15e9b9ebb5d36c5cb0e01995bc95f7a73b3dd	2014-06-20 11:41:23 -07:00
Pranesh Pandurangan	7a6999c9eb	Add an unregister repair option Added a parser and function to unregister repair scripts. Remove the repair script from the backend repair cfg, and watchdog will catch it in the engine. Change-Id: I7b93ca7e5eb4430b7c9502c8dd84af75b2a9fae3	2014-06-19 01:18:59 -07:00
Jenkins	e2f7101b0b	Merge "When creating an engine, add an enabled field"	2014-06-17 00:53:32 +00:00
Pranesh Pandurangan	f3c51ab67d	When creating an engine, add an enabled field We use this field when disabling an engine. Stay clear of possible KeyErrors by adding this field when registering an engine. Change-Id: Iacca4f99be018b5147b6a00f82e7c772ec88a8f3	2014-06-12 23:08:41 -07:00
Pranesh Pandurangan	473e2febc0	Add an audit unregister option Allow users to remove audit scripts through a simple CLI call. The unregister-audit call will remove an audit script from audit.cfg, thus stopping the execution of all but the currently scheduled audit scripts. partially implements blueprint simple api Change-Id: I1d5328d87b607c2f5cfdaebb7448a11673f38d48	2014-06-11 03:07:58 +00:00
Pranesh Pandurangan	b1ca5a4c91	Refactor some parser code Rename scheduler_parser to start_engine parser to make more sense, also add the -p (purge) option that an earlier commit needs. Change-Id: I5058e6f60804166e71fd5fefb2e298d27630b3d8	2014-06-11 03:07:23 +00:00
Pranesh Pandurangan	47e35d1d2c	Add a stop engine call Add a call to stop engine, this function just sets the enabled field to false, and psutil terminates the process (equivalent to what we do now). Also add psutil to requirements Change-Id: Idc3edb2bf1c9ed55d7d77973c59e3d3562e2ad8b	2014-06-10 20:06:05 -07:00
pran1990	81a9042ff1	Add some more checks to engine creation Add a field called enabled in the engine cfg file that keeps track of files. This can be used to keep track of stopped engines, and monitored using watchdog/similar. Also added some more checks at engine creation time. Partially Closes-bug:1309406 Change-Id: I1c365c2c438e6ed0a44413e1d09c69d3fab7ab7b	2014-06-10 19:20:55 -07:00
Pranesh Pandurangan	0be7808705	Move register-audit/repair code to backend Abstract out cfg file operations in the register workflow. Changed the get_driver code in engine to be static, so we can call it from main too. Added a new function in file_backend to return the config file given the script type. Eg. return audit_cfg for audit. Added a new function in file_backend to replace check_duplicate, that returns True if a script is already registered. Added a couple of string variables in base.py The function get_cfg_file, when using a db, will actually return a table. So this belongs in the backend, the code refactor here ensures this function is not called in the main() code. Raise errors instead of returning None in some backend functions Completes blueprint backend-abstraction Change-Id: I20d6bd46caf56c750e4b1193a6f5d00ce4e930f6	2014-06-09 19:50:56 -07:00
pran1990	d4fcac48f1	Move cfg file creation to driver Added code to load the file backend as a driver in engine __init__, and a function in FileBackend to create cfg files. Also added extra field in engine.cfg to specify what kind of backend to use. Change-Id: I6d3f24d4f676c72c94afff2c4c7f54a35cf1d4b1	2014-06-02 20:31:14 +00:00
pran1990	0f5954359c	Move some file creation code from main to utils As part of engine creation, we create files for audit and repair cfg. Move this to utils. Change-Id: I9d5075cb854ab5585fddcb58013ffa2530add970	2014-06-02 13:29:19 -07:00
pran1990	9e68d9cdd1	Add some structure for backend abstraction Add some backend placeholders. implements blueprint backend-abstraction Change-Id: Ia6bd3020d2f666ee317e1cf89ae2f3e10e6977aa	2014-05-30 08:41:29 +00:00
pran1990	1d38e2ca2b	Create audit and repair cfg at startup If those files don't exist, create them at startup. Else adding the first audit or repair script fails because the cfg file doesn't exist. Change-Id: I9143f59364167e98f69616351b4ac1df8ebd4ff8	2014-05-19 21:31:11 +00:00
pran1990	38520d41d8	Change format of audit and react cfg files Try to follow the format name1: arg1: arg2: name2: arg1: arg2: Changes in several places to reflect the new format. Change-Id: I182bbb701ac0e1885078f9ec3789fcff799acf5a	2014-05-18 00:19:33 -07:00
pran1990	ef2b443dac	Change format of engine configuration file Store pid along with other details. Change some code to take care of the new format, including the utils function that reads yaml. Henceforth, try to keep yaml files in the project in the format: name1: arg1: arg2: name2: arg1: arg2: Also wrote a function to write yaml in utils.py Change-Id: I838ec927a439ac1aeea88ba0b1d71fc782777204	2014-05-17 22:40:46 -07:00
pran1990	479662ad48	More logging fixes, and queue work There were some bugs in the commit to create queues dynamically. The engine now creates queues that are needed, and passes to react scripts. Also made some fixes to example code, added some config files. They contain usernames, but should be simple enough to modify and test Change-Id: Ife1977b3f8d669024fd853b6691300b5dd4fd73f	2014-04-27 14:08:26 -07:00
pran1990	24983c6fc6	Make entropy suitable for pypi distribution, part 2 Correct logging: reset loggers in engine.py, and setup your own Added functions to setup logging in engine.py and Audit class, can be used by audit scripts. Added a reset_logger function in utils Expose __main__.py as a CLI on packaging. So after installing the entropy package, you can call commands like entropy register-audit entropy start-engine from anywhere in your machine. Made changes to setup.cfg to make the main function an entry point. Moved the engine_cfg file to /tmp/engines.cfg. That way, even though it is hardcoded, it's at least fairly uniform across machines. Change-Id: I704bf5e4635ffc539d7a73c5f84ef4bf8b2e801e	2014-04-15 23:03:04 -07:00
pran1990	92b73b563b	Make entropy suitable for pypi distribution, part 1 Set logging handlers in each file, instead of a global one Move CLI output to stdout Remove one hardcoded value Change-Id: I0d1bfcbd642bdc43547bf177bed53c32eaf956b9	2014-04-14 17:52:26 -07:00
pran1990	e58f59c858	Use a config file to register engines Remain consistent with other options and just pass a name and config file to register a new engine. Changed the cmdline call start-engine a bit, and some changes to account for the difference in input. Change-Id: Idc73528dc39d79d530a5a5c901761ef59ab13f33	2014-04-13 16:39:55 -07:00
pran1990	e4fd9aa1f2	Make entropy suitable for pypi distribution, part 0 We need to separate out engine code and audit and repair scripts if pypi distribution is reqd. This is handled in this commit. Further commits will remove the two variables hardcoded currently, and expose __main__ as a cmdline script on installation. Use imp for module finding and loading, use full path to script in audit and repair cfg files. Move audit and repair scripts to examples. Make some changes to hardcoded stuff to account for this Update gitignore Change-Id: I50831003c6f7272967dbeb5c558b76b0183c91be	2014-04-10 13:08:51 -07:00
pran1990	090fc4dd21	Cleanup logging Change inappropriate usages of LOG.error and LOG.warning. Change-Id: I7964aaabc5533e6f438c56bde6817ba6a939b283	2014-03-27 15:08:27 -07:00
pran1990	b65a8a4d12	Add vmbooter audit and react scripts Added an audit script with functions to flavor-list, boot and delete vms. Use CLI for now, can switch to novaclient later. Use paramiko to ssh onto a host with access to the cluster, then run nova list, delete the vms we created, and boot a new one. Added a new queue for this audit/repair pair, dynamically creating queues or sharing them is TODO. Added react script, removed the ability to call from commandline. Can show off the feature without that. Added templates for conf files Change-Id: I3fe70534573aa70bd9407e18dcbe11e0e784595c	2014-03-26 19:44:38 -07:00
pran1990	a7a4d5feee	Do not start scheduler in constructor Set up the engine object in the constructor, and start it in a separate run function Change-Id: Ic688c3e8059f18e328735b6dd2a55ae86745fc50	2014-03-17 16:53:21 -07:00
pran1990	aa5e2e9775	Remove unused global The variable entropy_engine doesn't really have to be a global Change-Id: I26ea9b356d5de99a4657e55a9a6e503e5fb8833f	2014-03-17 16:52:49 -07:00
pran1990	616e6c69d0	Move code into a scheduler class, part III Remove usage of globals.py Rewrite register-audit and register-repair to avoid using globals. Get cfg file from engine name and script type instead. Remove cfg files from git Change-Id: I8ee119b4ebf55fa18ff4f6a83c0859ddc6699c5f	2014-03-16 23:09:24 -07:00
pran1990	3ac2fde405	Move code into a scheduler class, part II Moved all major functionalities, like adding audit/repair scripts, the watchdog handler, etc to the engine class. There is still some cleanup to do, like getting rid of some references to the globals variables, which will come in part III, in progress. Realized that we need full names to at least the audit.cfg and repair.cfg files, removing all the cfg files from git, because it makes the repo look ugly with usernames in them. Will add in sample cfg files, even though they shouldn't really be created manually. Verified that the code is now in the same state as before we used a scheduler class, ie no difference in expected behavior Change-Id: If9eeb9201ac6dd30705c3246c304b304054dc577	2014-03-16 16:59:08 -07:00
pran1990	925f45cf45	Move code into a scheduler class: part 1 Move start-scheduler logic into a class (in engine.py). Call to start-engine() will create a new engine class, and run a watchdog thread on the cfg files associated with that engine. In followup commits, will also start scheduler in the constructor, and move more code into the class. The existing globals file can be deleted at that point. Change-Id: I3a547a538fecaabdb84c927df9439c30119bf74f	2014-03-14 00:44:42 -07:00
pran1990	adaeaf3c05	Clean up code a bit While trying to migrate stuff to a class I noticed some functions that could be in utils. Removed a join statement that shouldn't be there because we use futures now. Change-Id: Iab457a4b34ff176a6e39f1a02ba9b5377602e652	2014-03-13 17:34:29 -07:00
pran1990	5c290a4b3b	Use a thread pool, not process pool The right thing to do is use a thread pool, that is the like for like replacement for what was going on earlier. use all_futures to track all currently started scripts, will use this to kill threads, etc later. Change-Id: I5274a381cb0ff8744cb1efee265c7d1e74895098	2014-03-12 12:33:46 -07:00
pran1990	ebf5c34f9b	Fix some bugs load_yaml needs a string, not a file handle Change-Id: I8743ac1dfa3730cf53c329700888c014c86a68f7	2014-03-11 19:34:22 -07:00
pran1990	a1360a4953	Use ProcessPoolExecutor instead of threading Currently we use threading, and do thread.join() to start audit/repair scripts. Using futures and executors makes the code simpler to read. Note that it might not necessarily improve performance. I set the max number of workers arbitrarily to 8, but we can work out a way to set this number correctly. Change-Id: I3c9f3194753c79d57204b49c7ec2444fd454bfc7	2014-03-11 17:46:15 -07:00
Debo~ Dutta	bf2e38f03b	test commit - fix typo Change-Id: I264869b5bd3edecfdf2346fe8f453700f6622ec5	2014-03-04 00:31:20 -08:00
pran1990	d509cbeb55	Add jobs at runtime Newly added jobs will change either audit.cfg or repair.cfg, which are watched by watchdog. Call the right function to add the job to currently running ones. Minor typo fixes to cfg files for scripts. Use utils.load_yaml instead of yaml.load_all. That function uses safe_load_all, so better security-wise Change-Id: Ib8137a0a1d9a3b960d9c64f9c6424709d57b8747	2014-02-24 19:33:48 -08:00
pran1990	75fc1bd888	Restructure code a bit There is some redundant code in start_scheduler() when starting scripts. Write start_scripts() to address this. Change-Id: I1f6b9a66c7385bbb066cefdf0cfb4cb176d84805	2014-02-24 19:25:12 -08:00
pran1990	02dceddfa7	Introduce watchdog into entropy Watch cfg files for changes, if so, call right callback. Move audit.cfg and repair.cfg into cfg/ for easier monitoring using watchdog. Change-Id: Iace75f36f0bfb5b83fe53c7d63b110f10534808f	2014-02-24 19:15:09 -08:00
pran1990	52bff736c9	Store list of running audits and repairs Store list of running audits and repairs, will aid in other things later on, like preventing duplicates, adding scripts at runtime, etc. Removed needless import in vm_count Change-Id: I9ed811783e5bc4a7799e8a5f73a4e55a15fdfee4	2014-02-24 18:48:50 -08:00
pran1990	bffddde2f9	Add an example audit/repair script vm_count.py gets number of vms running in a cluster, react will throw an error if it's above a limit. Not adding vm_count.json to git, similar to audit.json, contains api and compute hostnames in addition. Don't use extrapolation for every log message Add audit conf files to gitignore Remove stevedore stuff, not using now, can add back later if needed Use libvirt bindings to talk to hypervisor. Only one hypervisor for now, changes soon for multiple hypervisors. Change-Id: I843e3600a62cb6698526b3498358e4b90121ba1a	2014-02-10 18:08:26 -08:00
pran1990	1c9c640da3	Remove some hardcoding Use a new module field in audit.json to specify which module to load from the scheduler Change-Id: I6d9f846be3e379b5179da740697962a01f591d21	2014-02-07 00:49:46 -08:00
pran1990	5fc67635ad	Enable stevedore and dynamic loading Load modules dynamically, allowing better control over audit/react scripts Change code structure a bit (put audit scripts in audit/ dir, react scripts in repair/ dir) Enable stevedore, for audit/react scripts installed with the package Remove all homedir references Change-Id: I7351d6b7cd9ca5ba9cfa9526dfbefbfecacc3dc8	2014-02-03 00:01:15 -08:00
pran1990	ee0cc7d4c7	Change code structure a bit Use register-audit to register audit script, register-repair to register repair script. start-scheduler to start all the react scripts and then schedule audit scripts. Added audit.cfg and repair.cfg files to store registered scripts, this will help restart after failure. Added globals.py to store global variables Removed validate_cfg function that wasn't doing anything Change-Id: Id9140d2665e5710e6ffe2ed707135ff9a30ccdff	2014-01-03 16:29:22 -08:00
pran1990	a16b654009	Use pause library for sleeping As described in https://pypi.python.org/pypi/pause , the pause library has higher precision than the sleep lib, and uses machine timestamp instead of counters. Change-Id: I0f1135757ef8d1ed6e4eb203b84632ef5ec91977	2013-12-28 19:43:57 -08:00
Joshua Harlow	31413508a9	Small adjustments Move entropy.py -> __main__.py so that it can just be used by running $ python entropy (inside the entropy root folder). Also use yaml loading instead of json loading since yaml allows for comments inside the file (yaml is a superset of json). Removed import json from __main__.py to keep pep8 happy Also changed react.py to use a per module logger instead of root logger Change-Id: I5eb24319dee4f04891878c6e61cc4d7835b14d34	2013-12-16 23:47:17 -08:00

43 Commits