Skip hits that don't have data

The e-r graph and uncategorized fails jobs are both
currently failing: we're getting back hits with empty
data for the 'timestamp' attribute because the field
data is too large, e.g.:

"[FIELDDATA] Data too large, data for [@timestamp] would "
"be larger than limit of [22469214208/20.9gb]]"

To work around this for now, we check whether the hit
item list has anything in it before returning the
data for the facet results.

We should follow this up by logging errors on hits that
have bad data and we should remove those indexes.

Change-Id: Icf19af6580632ef52a55d3fb4bed3bced140024a
Closes-Bug: #1630355
Matt Riedemann 2016-10-04 15:10:37 -04:00
parent 02429cd85d
commit 44c967d3e3
1 changed files with 10 additions and 2 deletions


@@ -150,7 +150,10 @@ class FacetSet(dict):
     """
     def _histogram(self, data, facet, res=3600):
         """A preprocessor for data should we want to bucket it."""
-        if facet == "timestamp":
+        # NOTE(mriedem): We sometimes hit a case where the @timestamp attribute
+        # is too large and ES won't return it. At some point we should probably
+        # log a warning/error for these so we can clean them up.
+        if facet == "timestamp" and data is not None:
             ts = dp.parse(data)
             tsepoch = int(calendar.timegm(ts.timetuple()))
             # take the floor based on resolution
@@ -212,7 +215,12 @@ class Hit(object):
         """
         def first(item):
             if type(item) == list:
-                return item[0]
+                # We've seen cases where the field data, like @timestamp, is
+                # too large so we don't get anything back from elastic-search,
+                # so skip over those.
+                if len(item) > 0:
+                    return item[0]
+                return None
             return item
         result = None
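
For illustration only, the two guards in this change can be sketched as standalone functions. This is a minimal sketch, not the project's code: `first` mirrors the helper in the diff, while `histogram_bucket`, its fixed timestamp format, and the use of `datetime.strptime` in place of `dp.parse` are assumptions made so the example runs with the standard library alone.

```python
import calendar
from datetime import datetime


def first(item):
    """Return the first element of a list, or None if the list is empty."""
    if isinstance(item, list):
        # An empty list stands in for a field Elasticsearch did not return.
        return item[0] if item else None
    return item


def histogram_bucket(data, facet, res=3600):
    """Floor a timestamp to a res-second bucket; pass other values through.

    Hypothetical stand-in for FacetSet._histogram.
    """
    if facet == "timestamp" and data is not None:
        # Assumed input format, chosen for this sketch only.
        ts = datetime.strptime(data, "%Y-%m-%dT%H:%M:%S")
        epoch = calendar.timegm(ts.timetuple())
        # Milliseconds since the epoch, floored to the bucket boundary.
        return (epoch - epoch % res) * 1000
    return data


# A hit whose @timestamp came back empty yields None instead of raising.
missing = first([])
print(histogram_bucket(missing, "timestamp"))
print(histogram_bucket(first(["2016-10-04T15:10:37"]), "timestamp"))
```

With both guards in place, an empty field list becomes `None` in `first`, and the `data is not None` check keeps that `None` away from the timestamp parser.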