Draft updated doc for profiler #1144
base: master
Added doc for additional options on `profile/2`. Left "Visualizing profiling data" section as is; figure looks a little dated but difficult to generate decent X-Windows screenshots on MacOS.
man/profile.doc
Outdated
library. The latter can be hooked, which is used by the XPCE module
\pllib{swi/pce_profile} to provide an interactive graphical
frontend for the results.

The information gathering component can be run in two modes, controllable using
the \const{ports} option. The first mode (\const{ports(false)}) minimizes the
memory required to store the call tree, but only accumulates the the
typo: "the the"
man/profile.doc
Outdated
memory required to store the call tree, but only accumulates the the
\emph{call} port count for each predicate. (See \secref{debugoverview} for more information
on the Prolog "Byrd Box Model" or "4 Port Model".) This mode is most beneficial for long
profiler sessions on running applications. The second mode (\const{ports(false)}
should be `ports(true)`?
If \const{true} counts on all ports will be accumulated at the possible
expense of added memory overhead. If \const{false} only the \emph{call} port
counts are accumulated but memory overhead is reduced in some situations.
If neither value is specified, the default is the "classic" view which
Is the "classic" view different from specifying either `ports(true)` or `ports(false)`? If so, I suggest adding a `ports(classic)` option.
man/profile.doc
Outdated
\begin{description}
\termitem{sampling_rate}{+Rate}
\arg{Rate} is numeric value between 1 and 1000 specifying the number of
s/is numeric/is a numeric/
@@ -144,12 +174,12 @@ While the program executes under the profiler, the system builds a
kernel: one that starts a new goal (\emph{profCall}), one that tells the
system which goal is resumed after an \emph{exit} (\emph{profExit}) and
one that tells the system which goal is resumed after a \emph{fail}
(i.e., which goal is used to \emph{retry} (\emph{profRedo})). The
profCall() function finds or creates the subnode for the argument
(i.e., which goal is used to \emph{retry} (\emph{profFail})).
use \exam{...} for examples (not \emph{...})
`\const{}` is for constants (identifiers, atoms, ...), `\exam{}` is for (example) inline code fragments (i.e., generic "code"). Functions are written `name()` (will be styled automatically). Terms may be written as `\term{name}{args}`, especially if the arguments are placeholders.
Thanks for proofreading; I'll update the PR to fix the typos.
Regarding "classic" mode: perhaps I'm overly pedantic, but I think a mode which displays incorrect information in some situations is a mistake. It's (too?) easy to enable that mode just by omitting the `ports` option so, personally, I'd rather not legitimize it further in the doc.
As for the tex conventions used by SWIP, I just tried to mimic what was there previously as I don't really understand the nuances, and the changes required (if any) to what's been submitted. This is just a draft so I assume the editor-in-chief will have the final say for the official version.
I'm still confused: if I specify `ports(false)`, do I get the current (somewhat wrong) behaviour? Do I get a different behaviour if I don't specify either `ports(false)` or `ports(true)`?
If `ports(false)` gives the same behaviour as not specifying the `ports` option at all, then there's no problem. If this isn't the case, I don't like it because the API would have no way of specifying the default behaviour.
[I'll leave the markup fixing to Jan -- I often get it wrong because there's no documentation about the markup and I got lost while trying to read the TeX macros]
There are three different behaviours at the moment. The API (predicate `'$profile'/4`) supports all three. The user doc "should" explain how each one is specified in the `profile/2` predicate (no `ports` option defaults to "classic" mode).
I'd rather there were just two modes: `ports(true)` and `ports(false)`; that's how I intend to use it.
If the |
I agree, although it's not quite so clear-cut. "classic" port counts will be accurate except for indirect recursion loops. Perhaps an argument can be made that if you know what you're doing and/or you can somehow mark suspect data, you can almost have the best of both worlds. I would prefer to keep it simple for the average user (and the doc/implementation).
I have no opinion on whether there should be 2 or 3 options. I do have an opinion about APIs, and I don't like an API that has no positive way of specifying something. For example, suppose you have a predicate:
    my_profile(PortOption) :-
        profile(my_goal, [ports(PortOption)]).
then there's no way to specify the "classic" profile. Instead you'd have to do:
which is not as nice an API.
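The workaround alluded to in this comment is not preserved in the thread. A minimal sketch of what such a wrapper might look like, assuming (as stated earlier in the conversation) that omitting the `ports` option selects the "classic" mode; `profile_options/2` is a hypothetical helper and `my_goal/0` is a placeholder workload:

```prolog
:- use_module(library(error)).   % must_be/2
:- use_module(library(lists)).   % numlist/3, sum_list/2

% profile_options(+Mode, -Options)
% Hypothetical helper: translate a mode name into a profile/2 option list.
% Assumption: "classic" is selected by passing no ports option at all.
profile_options(classic, []) :- !.
profile_options(Mode, [ports(Mode)]) :-
    must_be(boolean, Mode).

% my_profile(+Mode): run the placeholder goal under the profiler.
my_profile(Mode) :-
    profile_options(Mode, Options),
    profile(my_goal, Options).

% Placeholder workload so the example is self-contained.
my_goal :-
    numlist(1, 1000, L),
    sum_list(L, _).
```

The caller has to special-case one mode, which is the API wart the comment objects to; a single `ports(Mode)` term covering all behaviours would make the helper unnecessary.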
As it turns out, the first version will work just fine with |
I think we're arguing over the colour of a bikeshed - as you say, the API to |
Hope it didn't feel like an argument; not my intent. More like a small difference in perspective. In any case, thanks for taking the time to read this. Doc is not my strong suit (if I have one).
I should have said that we're discussing the colour of a bikeshed. I have a pretty thick skin when it comes to technical discussions - my intent is to get to the best solution, not to prove that I'm right and you're wrong. So, I hope I haven't given the wrong impression. :)
As is, I think we should step back. I think there are two questions to answer first.
As to (1), I think it can, because deep mutual recursion isn't something unrealistic in Prolog. So, not-so-theoretical code may run fine but fill up all memory while profiling. An example is ordered merging based on compare/3 and some list iteration. Lists can have millions of elements, creating 2 profile nodes per element. To me, this means the current
As to (2), I think we can: if we create an indirect recursion loop, mark the node as such. If nothing was marked, all our counts are correct. Also, all nodes that have no marks above them are correct. I'm not sure whether we can do something with branches below the marks. Possibly, we could check that the walks over the call tree for exit and fail do (not) walk over such a mark and rule more as correct and/or give an upper bound on how far off they can be?
Even with the simple approach, though, the compacted version could simply give the correct counts and only give the call counts for predicates involved in indirect recursion.
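To make the scenario concrete, here is a minimal sketch of an ordered merge built on compare/3. It is not the library implementation; the predicate names `merge_sets/3`, `merge_with/4` and `merge_step/6` are made up for illustration. `merge_with/4` and `merge_step/6` call each other once per element, which is the kind of indirect recursion discussed above:

```prolog
% merge_sets(+Set1, +Set2, -Union)
% Merge two ordered, duplicate-free lists into their ordered union.
merge_sets([], Ys, Ys).
merge_sets([X|Xs], Ys, Zs) :-
    merge_with(Ys, X, Xs, Zs).

% merge_with(+List, +Head, +Tail, -Union): merge List with [Head|Tail].
merge_with([], X, Xs, [X|Xs]).
merge_with([Y|Ys], X, Xs, Zs) :-
    compare(Order, X, Y),
    merge_step(Order, X, Xs, Y, Ys, Zs).

% merge_step(+Order, +X, +Xs, +Y, +Ys, -Union)
% Emit the smaller head, then recurse back into merge_with/4.
merge_step(<, X, Xs, Y, Ys, [X|Zs]) :-
    merge_with(Xs, Y, Ys, Zs).
merge_step(=, X, Xs, _, Ys, [X|Zs]) :-
    merge_sets(Xs, Ys, Zs).
merge_step(>, X, Xs, Y, Ys, [Y|Zs]) :-
    merge_with(Ys, X, Xs, Zs).
```

Assuming a SWI-Prolog build that includes the `ports` option from this PR, a goal such as `profile(merge_sets(L1, L2, _), [ports(true)])` on two long lists would be one way to observe how the call tree grows with list length.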
Don't intend to be argumentative, but this seems pretty theoretical to me. Sure it's possible to write such an example (construct and process million element lists?), but I have difficulty understanding why anyone would do so (other than artificial benchmarks). And a million element list takes a minimum of 24 Mbytes just by itself, so an additional overhead of 200 Mbytes of profiler overhead doesn't feel all that prohibitive to me these days. Not to mention that surely one would have a pretty good handle on performance of 1000 element lists before scaling up to a million elements, so the need for detailed profiling at that excessive size seems questionable.
At the end of the day it's not my call but, subject to a real world counter example, I don't see any need for any more than two `ports` options and I don't think it even matters much which is default. The complications of figuring out how to efficiently mark erroneous information, pass that to the output generator, modify the GUI accordingly, and explain it all in user documentation just aren't worth the effort IMO.
I have some code that, amongst other things, processes a list with ~100,000
elements, so I could try running that under the profiler with both options.
It'll take me a day or so to extract the pieces and run them from the
command line.
…On Wed, Mar 22, 2023, 19:38 ridgeworks ***@***.***> wrote:
As to (1), I think it can because deep mutual recursion isn't something
unrealistic in Prolog. So, not-so-theoretical code may run fine but fills
up all memory while profiling. An example is ordered merging based on
compare/3 and some list iteration. Lists can have millions of elements,
creating 2 profile nodes per element.
Don't intend to be argumentative, but this seems pretty theoretical to me.
Sure it's possible to write such an example (construct and process million
element lists?), but I have difficulty understanding why anyone would do so
(other than artificial benchmarks). And a million element list takes a
minimum of 24 Mbytes just by itself, so an additional overhead of 200
Mbytes of profiler overhead doesn't feel all that prohibitive to me these
days. Not to mention that surely one would have a pretty good handle on
performance of 1000 element lists before scaling up to a million elements,
so the need for detailed profiling at that excessive size seems questionable
At the end of the day it's not my call but, subject to a real world
counter example, I don't see any need for any more than two port's
options and I don't think it even matters much which is default. The
complications of figuring out how to efficiently mark erroneous
information, pass that to the output generator, modify the GUI accordingly,
and explain it all in user documentation just aren't worth the effort IMO.
Seems we live in a different Prolog universe 😄. I've had the pleasure of working with Prolog processes using hundreds of gigabytes, including many million-element lists. There is nothing artificial about these. And yes, merging several such (ordered) lists can be done quite well and results (using the most natural implementation) in mutually recursive predicates. A 10-fold space increase is a problem, exactly for these applications that stress the limits of your hardware. Not to mention a malloc() for every cell we are processing. You seem to live mostly in domains using Prolog for (backtracking) search. Those programs typically do not stress memory a lot. Several of my projects deal with large amounts of data, (often) fairly simple logic and aggregation steps that often generate huge lists.
Apparently. My industry experience was on computers with disks smaller than memory sizes these days, and memory... well, you get the picture. I have a hard time imagining such behemoths, let alone profiling them. I would hazard a guess that not very many Prolog users are dealing with hundreds of gigabytes of data. I appreciate Peter's offer for getting some real experience to chew on. My suspicion is that the issues will be more on the analysis/display side than the collection side. In any case, the profiler now works much better for my "limited" applications, so I don't intend to put any more effort into it.
In my last academic job we had a machine with 1 TB main memory and 48 cores/96 threads 😄
I'm happy with the improvements. Thanks!
Do you want me to try the profiler with my moderately long lists? (It'll take me a little bit of work because it's part of a larger batch job, so I need to isolate one large step and re-run it with the profiler.)
If it's not too much trouble, that would be great. If it, or some part thereof, can be packaged up for me to poke at, I'm happy to go that route (assuming it works on my meagre computer resources); whatever works for you.
Bottom line to me is whether you consider mutually recursive procedures on long lists to be something one finds in the wild. From the library, ord_union/3 is a perfect example. Real-life code. Now you need to believe there are real Prolog programs processing ordered sets with millions of elements. I have seen and used them. QED 😄 Now it is also true that this scenario is not that common. It might be an option to change the collection to use `ports(true)` up to a specified recursion depth and use the "classic" algorithm above that. All that would need is to add a recursion depth field to the nodes.
Regarding profiling specifically, for this "uncommon scenario" you have |
Only some of the port counts are wrong. Only those for predicates involved in a branch of the call tree subject to mutual recursion are wrong. That leaves many correct in most applications. As it isn't too hard to identify the wrong ones, that seems a promising route. I don't care too much though. It works pretty well as it does right now. It is mostly a little complicated to explain.
What is the status of this PR? I'd like to submit a PR to fix a bug in the |
Enables module simplex to pass sandbox safety check
Remove meta-call in pivot_row/7
This pull request has been mentioned on SWI-Prolog. There might be relevant details there: https://swi-prolog.discourse.group/t/using-format-2-with-attributed-variable-on-a-pengine/6996/18
This pull request has been mentioned on SWI-Prolog. There might be relevant details there: https://swi-prolog.discourse.group/t/understanding-profiler/8582/4