CMonitor : Added inotify support to cmonitor. #49
base: master
Conversation
@@ -0,0 +1,195 @@
# =======================================================================================================
this file (cmonitor_launcher.py) really deserves its own folder: tools/launcher. tools/common-code is a sort of "library" folder where we store Python code that is "importable" and usable as a standalone library.
another cosmetic note: make sure to run "black" on this file to ensure consistent formatting
we also need the python3 shebang:
#!/usr/bin/python3
as the very first line
# - Execute command to launch CMonitor if the process name matches with the filter.
#
# =======================================================================================================
class CmonitorLauncher:
this class contains both the logic to watch for inotify events and the logic to process them. We must split these 2 parts into 2 different classes:

- CgroupWatcher: this class will contain the inotify_events() method, but it should also provide more "value" by containing the whitelist filter logic on the process name; in particular it should insert into the "queue" only when an event matches all filtering criteria (to avoid unblocking the worker thread "by mistake")
- CmonitorLauncher: this class will contain the launch_cmonitor() method. I think using os.system() may raise subtle issues: for example, if cmonitor_launcher.py is killed, there is no way for this script to terminate all the child cmonitor processes as well. By using subprocess.Popen() (or maybe even subprocess.run()) it should instead be possible to propagate the SIGTERM to all child cmonitor instances.

Anyway this is something that needs to be tested properly (maybe SIGTERM works also with os.system(), not sure!!!)
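A minimal sketch of the Popen-based approach this comment suggests, assuming each child is started in its own process group so the launcher can forward SIGTERM to it on shutdown (the class name and the `sleep` command are illustrative stand-ins, not the PR's actual code):

```python
import os
import signal
import subprocess


class ChildProcessManager:
    """Illustrative sketch: launch children in their own process groups so
    that a SIGTERM received by the launcher can be forwarded to all of them."""

    def __init__(self):
        self.children = []

    def launch(self, cmd):
        # start_new_session=True puts the child in a new process group,
        # so we can later signal the whole group (child + its descendants)
        proc = subprocess.Popen(cmd, start_new_session=True)
        self.children.append(proc)
        return proc

    def terminate_all(self):
        # forward SIGTERM to every still-running child's process group,
        # then reap them to avoid zombies
        for proc in self.children:
            if proc.poll() is None:
                os.killpg(os.getpgid(proc.pid), signal.SIGTERM)
        for proc in self.children:
            proc.wait()


mgr = ChildProcessManager()
mgr.launch(["sleep", "60"])  # stand-in for a cmonitor_collector command line
mgr.terminate_all()
```

With this scheme the launcher's own SIGTERM handler only has to call `terminate_all()`, which is exactly the propagation that is hard to get with `os.system()`.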
filename = event.get()
print("In process events entry:", entry, filename)
# print("In process events", filename)
time.sleep(50)
why do we need this sleep?
return allFiles

def process_task_files(self, dir):
time.sleep(5)
why do we need this sleep?
Hi @satyabratabharati , I think this MR still needs some work. Major points to raise are:
Generally speaking I think the design is good anyway: I like the idea of using ThreadPoolExecutor to watch a folder and then launch (synchronously) cmonitor instances. I admit however that I should study how ThreadPoolExecutor actually works because I have a lot of doubts:
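The design being discussed (a watcher task fills a queue, a worker task consumes it and launches instances) can be sketched in a self-contained way; the function names are illustrative, plain strings stand in for real inotify events, and only events matching the filter are queued, per the review comments:

```python
import queue
from concurrent.futures import ThreadPoolExecutor

event_queue = queue.Queue()


def watcher(events):
    # stand-in for CgroupWatcher.inotify_events(): enqueue only events
    # that match the filter, so the worker is never woken up "by mistake"
    for ev in events:
        if "match" in ev:
            event_queue.put(ev)
    event_queue.put(None)  # sentinel: no more events


def worker(results):
    # stand-in for the CmonitorLauncher side: consume events and
    # "launch" a cmonitor instance synchronously for each one
    while True:
        ev = event_queue.get()
        if ev is None:
            break
        results.append(f"launched cmonitor for {ev}")


results = []
with ThreadPoolExecutor(max_workers=2) as pool:
    pool.submit(watcher, ["match-1", "skip-2", "match-3"])
    pool.submit(worker, results)
```

The `with` block waits for both submitted tasks to finish, so after it exits `results` holds one entry per matching event, in queue order.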
…into feature/inotify-support
Hi @f18m , I tried to implement all the points you suggested, except for the unit test case from the above list; I will work on it. The following points have been incorporated:
Thank you.
…tabharati/cmonitor into feature/inotify-support
exit_flag = False
# =======================================================================================================
# CgroupWatcher : Basic inotify class
this class looks good except for one thing: the timeout / sleep logic should not be included in this class... it doesn't really belong here. The caller may want to sleep for whatever reason, but watching cgroups should be fast and imply no sleeps. So please remove the "timeout" field.
import logging
from datetime import datetime

exit_flag = False
this flag should not be here... this class should not care about the whole app being requested to exit: it must just process the inotify events and fill a queue as fast as possible...that's it
self.path = path
self.filter = filter
self.timeout = timeout
self.myFileList = {}
self.myFileList can be removed as variable
"""
self.path = path
self.filter = filter
self.timeout = timeout
as mentioned before, remove self.timeout
"""
self.path = path
self.filter = filter
from the documentation it's honestly not clear what the format of self.filter is. Is it a list of strings? please document
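For instance, if self.filter is indeed a list of process-name strings, the docstring could document it along these lines (a hypothetical sketch of what the comment asks for, not the PR's actual code; the example path and names are assumptions):

```python
class CgroupWatcher:
    def __init__(self, path, filter):
        """
        Args:
            path: directory to watch for new cgroups,
                  e.g. "/sys/fs/cgroup/memory/docker"
            filter: list of process-name strings to whitelist,
                    e.g. ["redis-server", "postgres"]; only events whose
                    process name matches one of these entries are queued
        """
        self.path = path
        self.filter = filter

    def matches(self, process_name):
        # a process is interesting only if it appears in the whitelist
        return process_name in self.filter


w = CgroupWatcher("/sys/fs/cgroup/memory/docker", ["redis-server", "postgres"])
```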
process_name = parts[1].strip("()")
return process_name

def __get_pid_list(self, filename):
rename to __get_pid_list_from_cgroup_procs
list.append(line.strip())
return list

def __get_list_of_files(self, dir):
rename to __get_files_recursively
and remove mention of "event" from the description... this function doesn't deal with any event by itself... also remove "watched dir"... this function is generic
"""
# time.sleep(20)
time.sleep(self.timeout)
remove this sleep as mentioned before
return allFiles

def __process_task_files(self, dir):
rename to __check_new_cgroup_against_filter(self, cgroup_dir) and provide an example of "cgroup_dir" that this function expects
logging.info(f"CgroupWatcher event in Queue:{fileList}")
queue.put(fileList)
# global exit_flag
if exit_flag is True:
remove this "if"
i.add_watch(self.path)
try:
for event in i.event_gen():
if event is not None:
add a comment like "if event is None, it means the Inotify() system has no more events that need to be processed... give control back to the caller"
"""
logging.info(f"CgroupWatcher calling inotify_event")
i = inotify.adapters.Inotify()
it's a waste of resources to allocate the Inotify() class every time the inotify_events() method is invoked. This instance "i" should be allocated in the ctor and stored into a "self.inotify_instance" member.
exit(1)

finally:
i.remove_watch(path)
this remove_watch() should be moved into the dtor of this class
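The lifecycle being suggested in the last two comments (acquire the watch once in the ctor, release it in the dtor) can be sketched as follows; since inotify.adapters is a third-party library, a placeholder class-level "watch registry" stands in here for the real Inotify() instance, with the real calls shown in comments:

```python
class CgroupWatcher:
    """Sketch of the suggested lifecycle: the watch is acquired once in the
    constructor and released in the destructor, instead of being re-created
    on every inotify_events() call."""

    active_watches = []  # illustrative stand-in for kernel-side inotify watches

    def __init__(self, path):
        self.path = path
        # real code would do (per the review comments):
        #   self.inotify_instance = inotify.adapters.Inotify()
        #   self.inotify_instance.add_watch(path)
        CgroupWatcher.active_watches.append(path)

    def __del__(self):
        # real code would do: self.inotify_instance.remove_watch(self.path)
        if self.path in CgroupWatcher.active_watches:
            CgroupWatcher.active_watches.remove(self.path)


w = CgroupWatcher("/sys/fs/cgroup/memory")
```

Note that relying on `__del__` for cleanup works deterministically in CPython (refcounting) but a context-manager (`__enter__`/`__exit__`) would be a more robust variant of the same idea.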
logging.info(f"Launch cMonitor with command: {monitor_cmd}")
# monitor_process = subprocess.Popen(monitor_cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
monitor_process = subprocess.Popen(monitor_cmd, shell=True)
self.monitored_processes[process_name] = monitor_process
my understanding of this class CmonitorLauncher is that it requires the caller to provide a static assignment "process_name -> IP address/port used by cmonitor with Prometheus frontend", right?
What if the same process_name is detected in e.g. 10 container instances? I think here we will overwrite self.monitored_processes[process_name] each time with a new subprocess.Popen() instance, leaking the previous instance.
I think we need a more flexible IP address/port allocation scheme (when launching a cmonitor_collector configured to stream data to Prometheus) or a more flexible JSON output name allocation scheme (when launching a cmonitor_collector configured to save data to JSON -- probably not a concern in this first implementation).
Perhaps we can have a meeting to discuss this, but I suggest asking the user (through CLI options) for a range of IPs/ports that can be used. Then CmonitorLauncher should have a pooling mechanism that keeps track of free IP address/port pairs and used ones. When a new cmonitor_collector is launched we ask for the next free IP address/port. When a cmonitor_collector completes, we must release its IP address/port back to the pool.
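The pooling mechanism proposed above could be sketched roughly like this; the class name, the instance-id keying, and the address range are illustrative assumptions, not part of the PR:

```python
class EndpointPool:
    """Sketch of a pool of free (IP, port) pairs for cmonitor_collector
    Prometheus endpoints: acquire one per launched instance, release it
    back when that instance completes."""

    def __init__(self, ip, port_range):
        # e.g. EndpointPool("127.0.0.1", range(9100, 9110)), taken from CLI options
        self.free = [(ip, port) for port in port_range]
        self.used = {}  # instance_id -> (ip, port)

    def acquire(self, instance_id):
        # instance_id must be unique per container instance: the same
        # process_name may be detected in many containers at once
        if not self.free:
            raise RuntimeError("no free IP/port pairs left in the pool")
        endpoint = self.free.pop(0)
        self.used[instance_id] = endpoint
        return endpoint

    def release(self, instance_id):
        # give the pair back once the corresponding collector has exited
        self.free.append(self.used.pop(instance_id))


pool = EndpointPool("127.0.0.1", range(9100, 9102))
```

Keying by a unique instance id (rather than by process_name alone) also sidesteps the overwrite/leak problem described above.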
CmonitorLauncher :
It will perform the following steps: