Create a Fact class for storing facts
Include the URL of your launchpad blueprint:
https://blueprints.launchpad.net/congress/+spec/fact-datastructure
Today, the congress runtime stores facts as Rules. This is memory-inefficient because each Rule contains a Literal, which in turn contains a Term for each column. Each of these is a Python object, and each carries extra fields such as head, body, location, and negated. Using Rules is also CPU-inefficient because congress must construct all of these objects for every fact. This blueprint proposes a Fact data structure to store each fact. A Fact is a subclass of the native tuple with one additional field for the table name. It is much cheaper than a Rule in both memory and CPU because it needs no extra objects such as Literal and Term. Preliminary tests show a 10x reduction in CPU for initializing tables and a 3x reduction in memory use.
Problem description
A detailed description of the problem:
- Today, congress stores each fact as a Rule object
- A Rule object contains many objects and fields
- Because of these objects and fields, creating and storing a fact consumes a lot of CPU and memory.
- High CPU and memory use makes congress unable to scale to larger datasets.
Proposed change
We propose to create a new class called Fact to store each fact. A Fact is a native tuple of column values plus one string holding the table name. Storing facts as Facts eliminates the subfields and subobjects that a Rule requires.
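The idea can be sketched in a few lines of Python. This is an illustrative sketch only, not the final implementation: the constructor signature and the attribute name `table` are assumptions, and the equality and hashing behaviour shown here (two facts are equal only if both table and values match) is one possible design choice.

```python
class Fact(tuple):
    """Sketch: a native tuple of column values plus a table name."""

    def __new__(cls, table, values):
        # tuple is immutable, so the column values must be supplied in __new__.
        fact = super().__new__(cls, values)
        # Assumed attribute name; stored on the instance next to the tuple data.
        fact.table = table
        return fact

    def __eq__(self, other):
        return (isinstance(other, Fact)
                and self.table == other.table
                and tuple.__eq__(self, other))

    def __ne__(self, other):
        # tuple defines its own __ne__, so it has to be overridden explicitly.
        return not self.__eq__(other)

    def __hash__(self):
        return hash((self.table, tuple.__hash__(self)))


fact = Fact("servers", ("vm-1", "ACTIVE"))
```

A Fact behaves like an ordinary tuple (`fact[0]`, `len(fact)`, iteration), so no Literal or Term objects need to be created for its columns.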
Alternatives
None
Policy
None
Policy Actions
None
Data Sources
None
Data model impact
None
REST API impact
None
Security impact
None
Notifications impact
None
Other end user impact
None
Performance impact
Preliminary testing shows a 10x reduction in CPU use and a 3x reduction in memory use in initialize_tables() for 7M facts where the payload is 700 MB.
Other deployer impact
None
Developer impact
A Theory object will internally contain Rules and Facts. The caller can insert a Fact into a RuleSet. However, whenever someone fetches the rules from a Theory, the RuleSet converts Facts to Rules before returning them.
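The conversion described above might look roughly like the following sketch. Everything here is hypothetical: `Term`, `Literal`, and `Rule` are simplified stand-ins for the runtime's real classes (which carry more fields), and the method names `add` and `get_rules` as well as the helper `fact_to_rule` are illustrative assumptions rather than the actual RuleSet API. A plain `set` stands in for the proposed FactSet.

```python
from collections import namedtuple

# Hypothetical, simplified stand-ins for the runtime's classes.
Term = namedtuple("Term", ["value"])
Literal = namedtuple("Literal", ["table", "arguments"])
Rule = namedtuple("Rule", ["head", "body"])


class Fact(tuple):
    """Minimal Fact sketch: tuple of values plus a table name."""

    def __new__(cls, table, values):
        fact = super().__new__(cls, values)
        fact.table = table
        return fact


def fact_to_rule(fact):
    """Build a body-less Rule from a Fact (assumed helper)."""
    arguments = tuple(Term(value) for value in fact)
    return Rule(head=Literal(fact.table, arguments), body=())


class RuleSet:
    """Sketch: stores Facts compactly, hands out Rules on request."""

    def __init__(self):
        self._rules = {}  # table name -> list of Rule
        self._facts = {}  # table name -> set of Fact (stand-in for FactSet)

    def add(self, item):
        if isinstance(item, Fact):
            self._facts.setdefault(item.table, set()).add(item)
        else:
            self._rules.setdefault(item.head.table, []).append(item)

    def get_rules(self, table):
        # Facts are converted to Rules only when a caller asks for them.
        rules = list(self._rules.get(table, []))
        rules.extend(fact_to_rule(f) for f in self._facts.get(table, ()))
        return rules
```

With this shape, existing callers that expect Rules keep working, while the cost of building Literal and Term objects is paid only when rules are fetched.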
Implementation
Assignee(s)
- Primary assignee: ayip
Work items
- Implement FactSet
- Use FactSet inside of RuleSet
- Change initialize_tables() to avoid instantiating a list of all facts coming from DSE
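The last work item can be illustrated with a generator that feeds facts into the store one at a time instead of first building a list of all of them. This is a sketch under assumptions: `iter_facts` and `load_table` are hypothetical names, the row format (an iterable of value sequences) is assumed, and the real initialize_tables() signature is not reproduced here.

```python
class Fact(tuple):
    """Minimal Fact sketch: tuple of values plus a table name."""

    def __new__(cls, table, values):
        fact = super().__new__(cls, values)
        fact.table = table
        return fact


def iter_facts(table, rows):
    """Yield one Fact per row without materializing an intermediate list."""
    for row in rows:
        yield Fact(table, row)


def load_table(store, table, rows):
    """Replace the contents of `table` in `store` (a dict of sets)."""
    # The set consumes the generator lazily, so only the final set of
    # Facts is held in memory, never a second full list of them.
    store[table] = set(iter_facts(table, rows))


store = {}
load_table(store, "p", ((i, i + 1) for i in range(3)))
```

Because `rows` can itself be a generator, the data arriving from the DSE does not need to be copied into a list before it is stored.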
Dependencies
None
Testing
Add a unit test for FactSet
Documentation impact
None
References
None