Fix hashtags that are not well formed and remove trailing spaces

Fix the hashtag program so that it no longer generates TAG-.md

Change-Id: I739505b2da699214a03516161394ac5702a8fa11
Yih Leong Sun 2017-05-30 21:49:02 +00:00
parent df0848eae0
commit 0c9c3aa215
69 changed files with 529 additions and 309 deletions

View File

@ -4,6 +4,8 @@ https://etherpad.openstack.org/p/BOS-forum-Compliance-Security-Certification: 14
https://etherpad.openstack.org/p/BOS-forum-Compliance-Security-Certification: 143: Guidelines for securing openstack ##actionitem ##compliance
https://etherpad.openstack.org/p/BOS-forum-LCOOGetToKnow: 49: * Define governance model in LCOO and it is allined with UC ##actionitem#lcoo
https://etherpad.openstack.org/p/BOS-forum-LCOORoadmap: 127: * set up and publish slack channel ##actionitem ##lcoo --> https://lcoo.slack.com >> publish this on wiki
https://etherpad.openstack.org/p/BOS-forum-LCOORoadmap: 128: * derive short term win ##actionitem ##lcoo
@ -36,6 +38,14 @@ https://etherpad.openstack.org/p/BOS-forum-product-wg-working-session: 65: * No
https://etherpad.openstack.org/p/BOS-forum-product-wg-working-session: 66: * How do we identify organization to help with this wg ##pwg ##actionitem
https://etherpad.openstack.org/p/BOS-forum-skip-level-upgrading: 123: What could the communiity do to help in place n+2 upgrades?.##all-projects#actionitem
https://etherpad.openstack.org/p/BOS-forum-skip-level-upgrading: 124: * Better release notes for all projects would help to identfiy what has changed.##all-projects#actionitem
https://etherpad.openstack.org/p/BOS-forum-skip-level-upgrading: 125: * Better description of what happens with mixed versions, i.e agents behind one or two versions.##all-projects#actionitem
https://etherpad.openstack.org/p/BOS-forum-skip-level-upgrading: 126: * TODO: call out projects that are doing upgrade impacts but not documenting them in the release notes.##all-projects#actionitem
https://etherpad.openstack.org/p/BOS-forum-telecom-nfv-collaboration: 192: 1. Monthly IRC meeting of the WG chairs introducing the happenngs of the last period ##actionitem ##nfv
https://etherpad.openstack.org/p/BOS-forum-telecom-nfv-collaboration: 201: 1. Wiki or similar to keep track of the related WGs and their work scope ##actionitem ##nfv

View File

@ -1,2 +0,0 @@
https://etherpad.openstack.org/p/BOS-forum-LCOOGetToKnow: 49: * Define governance model in LCOO and it is allined with UC ##actionitem#lcoo

View File

@ -0,0 +1,2 @@
https://etherpad.openstack.org/p/BOS-forum-developer-openstack-org: 11: 1. Help take notes by adding ##<actions> and "+1" any good ideas with which you agree.

View File

@ -0,0 +1,10 @@
https://etherpad.openstack.org/p/BOS-forum-skip-level-upgrading: 79: * Better release notes for all projects would help to identfiy what has changed.##all-projects
https://etherpad.openstack.org/p/BOS-forum-skip-level-upgrading: 123: What could the communiity do to help in place n+2 upgrades?.##all-projects#actionitem
https://etherpad.openstack.org/p/BOS-forum-skip-level-upgrading: 124: * Better release notes for all projects would help to identfiy what has changed.##all-projects#actionitem
https://etherpad.openstack.org/p/BOS-forum-skip-level-upgrading: 125: * Better description of what happens with mixed versions, i.e agents behind one or two versions.##all-projects#actionitem
https://etherpad.openstack.org/p/BOS-forum-skip-level-upgrading: 126: * TODO: call out projects that are doing upgrade impacts but not documenting them in the release notes.##all-projects#actionitem

View File

@ -1,2 +0,0 @@
https://etherpad.openstack.org/p/BOS-forum-101: 32: * You can use ##<tag> to highlight anything that needs to be shared with a team not present in the session (e.g. ##uservoice, ##ironic) and a group of people will share it in a mailing list summary after the event

View File

@ -1,3 +1,5 @@
https://etherpad.openstack.org/p/BOS-forum-101: 32: * You can use ##<tag> to highlight anything that needs to be shared with a team not present in the session (e.g. ##uservoice, ##ironic) and a group of people will share it in a mailing list summary after the event
https://etherpad.openstack.org/p/BOS-forum-ironic-feedback: 43: * - Port selection on multi physical switches. ##ironic ##painpoint
https://etherpad.openstack.org/p/BOS-forum-ironic-feedback: 51: * no concept of locality ##ironic ##painpoint

View File

@ -1,3 +1,5 @@
https://etherpad.openstack.org/p/BOS-forum-LCOOGetToKnow: 49: * Define governance model in LCOO and it is allined with UC ##actionitem#lcoo
https://etherpad.openstack.org/p/BOS-forum-LCOORoadmap: 61: * Containerized control plane ##roadmap ##lcoo
https://etherpad.openstack.org/p/BOS-forum-LCOORoadmap: 66: * Yes - we definately will include Kolla and OpenStack-Helm and others in the Gap Analysis. We have started 2 different Development Proposals (above) and from an LCOO perspective there should not be an assumption that we have chosen the solution. What we hoped to do next with the Containerized Control Plane user stories was to have an introduction from the Kolla knowledgeable SMEs and also form teh OpenStack-Helm SMEs to help us get started with Gap Analysis. ##lcoo ##

View File

@ -0,0 +1,2 @@
https://etherpad.openstack.org/p/BOS-forum-ops-tags-wg-session: 11: *##opstags - Feedback from session on best practices and areas for improvement

View File

@ -0,0 +1,2 @@
https://etherpad.openstack.org/p/BOS-forum-101: 32: * You can use ##<tag> to highlight anything that needs to be shared with a team not present in the session (e.g. ##uservoice, ##ironic) and a group of people will share it in a mailing list summary after the event

View File

@ -1,5 +1,7 @@
https://etherpad.openstack.org/p/BOS-forum-developer-openstack-org: 16: ##UC <-- goes to the UC chairs as suggestion.
https://etherpad.openstack.org/p/BOS-forum-evolving-the-community-generated-roadmap: 32: * How can the projects indicate that they are ready for deployment - defining the maturity level##uc
https://etherpad.openstack.org/p/BOS-forum-uc-governance-and-support-of-wgs: 50: * ##action ##uc get documents into gerrit
https://etherpad.openstack.org/p/BOS-forum-unanswered-requirements: 62: * ways to measure success/failure of proposed process(es)/plan(s) ##uc ##wg
@ -12,6 +14,8 @@ https://etherpad.openstack.org/p/BOS-forum-unanswered-requirements: 66: * Is th
https://etherpad.openstack.org/p/BOS-forum-unanswered-requirements: 77: * finding ways to steer member companies contributing developers to focus on the identified goals ##wg ##uc ##actionitem
https://etherpad.openstack.org/p/BOS-forum-user-committee-session: 11: *##uc - Feedback from session and open items
https://etherpad.openstack.org/p/BOS-forum-user-committee-session: 19: * Revising AUC criteria and documenting program motives/goals ##uc
https://etherpad.openstack.org/p/BOS-forum-user-committee-session: 21: * it would be great if we can see the statistics in stackalystics ##uc

View File

@ -0,0 +1,2 @@
https://etherpad.openstack.org/p/BOS-forum-uc-governance-and-support-of-wgs: 11: *##ucwg - Feedback from session and open items

View File

@ -1,5 +1,7 @@
https://etherpad.openstack.org/p/BOS-forum-evolving-the-user-survey: 16: * ##usersurvey - Feedback from sessions on how to improve the user survey
https://etherpad.openstack.org/p/BOS-forum-evolving-the-user-survey: 26: * Have someone from the user Survey WG take on the task of collecting data around a certain theme and work on making it non-confidential to be able to share more broadly##usersurvey
https://etherpad.openstack.org/p/BOS-forum-evolving-the-user-survey: 30: * The networking question is getting confusing with ML2 and underlying technologies. Can we revisit it? ##usersurvey
https://etherpad.openstack.org/p/BOS-forum-user-committee-session: 57: * Role/responsibilities for upcoming User Survey and timeline ##heidijoy ##usersurvey

18
forum/README.rst Normal file

@ -0,0 +1,18 @@
Working with the ##hashtag Program
==================================
* Retrieve etherpad content

      $ python get_etherpads.py

  This downloads the etherpads listed on the Forum wiki page to the
  local directory. Each downloaded file is named "PAD-<etherpad-name>".
* Generate ##hashtag result

      $ python extract_tags.py

  This reads the files downloaded in the previous step, extracts the
  ##hashtags from each line, and appends every tagged line to a file
  named "TAG-<hashtag>.md", one file per hashtag.
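
The extraction step described in this README can be sketched in isolation. This is a minimal Python 3 illustration under stated assumptions, not the code from this commit: `extract_tags` and `STRIP_CHARS` are names chosen for the example, and the character list is assumed to mirror the one in extract_tags.py. It shows how a tag that is not well formed, such as `##actionitem#lcoo`, is split into two tags, and how a bare `##` yields no tag at all, which is what avoids an empty `TAG-.md` file.

```python
import re

# Assumption: same characters that extract_tags.py strips from candidate tags.
STRIP_CHARS = ["##", ":", ",", "+1", "?", "<", ">", "(", ")"]


def extract_tags(line):
    """Return the sorted, de-duplicated hashtags found in one etherpad line.

    Hypothetical helper for illustration only.
    """
    tags = set()
    # A token starts at "##" and runs to the next whitespace character.
    for token in re.findall(r'#{2}\S*', line.lower()):
        # Splitting on '#' separates tags that were glued together,
        # e.g. '##something#action'.
        for tag in token.split('#'):
            for chars in STRIP_CHARS:
                tag = tag.replace(chars, "")
            # Skip the empty pieces produced by '##' itself.
            if tag:
                tags.add(tag)
    return sorted(tags)


if __name__ == "__main__":
    print(extract_tags("* Define governance model in LCOO ##actionitem#lcoo"))
    print(extract_tags("help us get started with Gap Analysis. ##lcoo ##"))
```

Filtering out empty strings before writing anything is the design choice that matters here: every `##` prefix produces empty pieces after the split, so without the check each tagged line would also be reported under an empty tag.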

114
forum/extract_tags.py Executable file

@ -0,0 +1,114 @@
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.
#!/usr/bin/env python
import re
import os
# remove all generated output files whose names begin with 'TAG'
def cleanup_files(dir_path):
    files_to_remove = [f for f in os.listdir(dir_path) if f.startswith("TAG")]
    for f in files_to_remove:
        # join with dir_path so removal does not depend on the current directory
        os.remove(os.path.join(dir_path, f))
# read all files that begin with 'PAD' in filename
def get_files(dir_path):
    files = [f for f in os.listdir(dir_path) if f.upper().startswith("PAD")]
    # print files
    return sorted(files)
# write the generated output to a file, prefix with 'TAG' in filename
def write_md_report(hashtag, etherpad, text, line_number):
    with open("TAG-" + hashtag + ".md", "a") as hashtag_file:
        hashtag_file.write('%s: %d: %s\n\n' % (create_link(etherpad), line_number, text))


# create the etherpad hyperlink
def create_link(etherpad):
    etherpad_link = "https://etherpad.openstack.org/p/" + etherpad
    return etherpad_link
# skip all lines containing any of these phrases,
# which are assumed to be the ##hashtag explanation notes
def skip_line(line):
    skip_words = [
        '##<hashtag>',
        '##hashtag',
        '##newfeature - Proposal for a new feature',
        '##gap - Feature gap that does not have a solution yet',
        '##uservoice - Feedback from users and operators',
        '##painpoint - Functional challenges/problems (either real or perceived)',
        '##<workload-type>',
        '##<project-name> ',
        '##<working-group/team-name>']
    for word in skip_words:
        if line.find(word) != -1:
            return 1
    return 0
# the main procedure
# 1. Clean up previously generated files that are prefixed with 'TAG'
# 2. Read all files prefixed with 'PAD'.
#    These 'PAD' files are downloaded by 'get_etherpads.py'
# 3. For every line in each PAD file, extract the ##hashtag
# 4. Write the line with ##hashtag to a file 'TAG-<##hashtag>'
def run():
    dir_path = os.path.abspath('')
    # Remove previously generated files
    cleanup_files(dir_path)
    hash_set = set([])
    rep_chars = ["##", ":", ",", "+1", "?", "<", ">", "(", ")"]
    # read each etherpad content, line by line
    for file in get_files(dir_path):
        with open(file, 'rt') as f:
            line_number = 1
            for line in f:
                # trim whitespace characters
                line = line.strip()
                if skip_line(line) != 1:
                    try:
                        for hashtag in re.findall(r'#{2}\S*', line.lower()):
                            # hack to work around a ##hashtag that was not properly formed
                            # eg: '##something#action' instead of '##something #action'
                            # if hashtag.find('<') != -1:
                            #     print hashtag
                            for tag in hashtag.split('#'):
                                for char in rep_chars:
                                    tag = tag.replace(char, "").lower()
                                # an empty tag would otherwise produce 'TAG-.md'
                                if tag != "":
                                    hash_set.add(tag)
                                    # the filename is the etherpad name that was
                                    # downloaded locally, prefixed with 'PAD-';
                                    # removing 'PAD-' gives the etherpad name
                                    write_md_report(tag, file.replace("PAD-", ""), line,
                                                    line_number)
                    except AttributeError:
                        line = ""
                line_number += 1
    print(sorted(hash_set))


if __name__ == "__main__":
    run()
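
As a possible variation on `write_md_report`, the matches could be grouped in memory and each TAG file written exactly once. The sketch below is Python 3 and purely illustrative: `group_by_tag`, `write_reports`, `ETHERPAD_BASE`, and the `out_dir` parameter are hypothetical names that do not appear in this commit. It keeps the same line format (`<link>: <line number>: <text>` followed by a blank line) and the same rule that an empty tag never produces a file.

```python
import os
from collections import defaultdict

# Assumption: same base URL that create_link() uses.
ETHERPAD_BASE = "https://etherpad.openstack.org/p/"


def group_by_tag(tagged_lines):
    """Group report entries by tag.

    tagged_lines is an iterable of (etherpad, line_number, text, tags) tuples.
    Hypothetical helper for illustration only.
    """
    report = defaultdict(list)
    for etherpad, line_number, text, tags in tagged_lines:
        for tag in tags:
            report[tag].append(
                "%s%s: %d: %s" % (ETHERPAD_BASE, etherpad, line_number, text))
    return report


def write_reports(report, out_dir):
    """Write one 'TAG-<tag>.md' file per non-empty tag; return the file names."""
    written = []
    for tag in sorted(report):
        if not tag:
            # Never create 'TAG-.md'.
            continue
        path = os.path.join(out_dir, "TAG-" + tag + ".md")
        # Mode "w" overwrites stale output instead of appending to it.
        with open(path, "w") as handle:
            for entry in report[tag]:
                handle.write(entry + "\n\n")
        written.append(os.path.basename(path))
    return written
```

Because each file is opened in write mode, a rerun replaces old content rather than appending to it, so stale lines cannot accumulate in a tag file even if the cleanup step is skipped. Files for tags that no longer occur would still need to be removed separately.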

54
forum/get_etherpads.py Executable file

@ -0,0 +1,54 @@
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.
#!/usr/bin/env python
import re
import os
import urllib2
import urllib
# remove previously downloaded files that are prefixed with 'PAD-'
def cleanup_files(dir_path):
    files_to_remove = [f for f in os.listdir(dir_path) if f.startswith("PAD-")]
    for f in files_to_remove:
        # join with dir_path so removal does not depend on the current directory
        os.remove(os.path.join(dir_path, f))
def run():
    dir_path = os.path.abspath('')
    cleanup_files(dir_path)
    # connect to a URL that contain list of etherpads
    wiki = urllib2.urlopen("https://wiki.openstack.org/wiki/Forum/Boston2017")
    # read html code
    html = wiki.read()
    # slice the needed section only
    start_index = html.index("Event intro")
    end_index = html.index("(old) Brainstorming")
    html = html[start_index:end_index]
    # use re.findall to get all the links
    links = re.findall('"(https?://.*?)"', html)
    for link in sorted(links):
        if "etherpad" in link:
            filename = "PAD-" + link.split('/')[-1]
            url = link + "/export/txt"
            urllib.urlretrieve(url, filename)


if __name__ == "__main__":
    run()
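
The link-selection logic in `run()` can be exercised without any network access by separating it from the download. The following Python 3 sketch assumes the same two section markers and the same `/export/txt` suffix as the script. The function name `etherpad_exports` and its parameters are hypothetical, and de-duplicating the links is an addition that the original does not perform.

```python
import re


def etherpad_exports(html,
                     start_marker="Event intro",
                     end_marker="(old) Brainstorming"):
    """Return (local filename, export URL) pairs for etherpad links.

    Only links between the two markers are considered.
    Hypothetical helper for illustration only.
    """
    # str.index raises ValueError if a marker is missing, like the original.
    section = html[html.index(start_marker):html.index(end_marker)]
    links = re.findall(r'"(https?://.*?)"', section)
    exports = []
    # Assumption: duplicates are dropped so each pad is fetched once.
    for link in sorted(set(links)):
        if "etherpad" in link:
            link = link.rstrip("/")
            filename = "PAD-" + link.split("/")[-1]
            exports.append((filename, link + "/export/txt"))
    return exports


if __name__ == "__main__":
    sample = (
        '<h2>Event intro</h2>'
        '<a href="https://etherpad.openstack.org/p/BOS-forum-101">101</a>'
        '<a href="https://wiki.openstack.org/wiki/Forum">wiki</a>'
        '<h2>(old) Brainstorming</h2>'
    )
    for filename, url in etherpad_exports(sample):
        print(filename, url)
```

Each returned pair could then be passed to a downloader of choice. In Python 3 the counterpart of the `urllib.urlretrieve` call used above is `urllib.request.urlretrieve`.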