How is ergm using Change Statistics to calculate summary statistics? #527
Replies: 3 comments
-
Although ergm 4.0 has added some alternatives, the standard way that
statistics are calculated is simple: one begins with the value for the
empty graph (which is usually, but not always, 0), and then performs one
additive edge toggle per observed edge. The accumulated changescores,
added to the empty-graph value, yield the statistic for the observed
graph. Since most graphs for which ergm is used are sparse, one
generally needs only order N toggles for this purpose. While there are
faster ways to compute some graph statistics (some are
implemented e.g. in the sna package, which is designed to compute
descriptives rather than to perform changescore based calculations), in
practice this is usually quite fast. (And if it isn't, you're probably
not going to be fitting an ergm with that term anyway, because you are
going to be doing a *lot* more toggles than that during the estimation
process.)
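
The accumulation described above can be sketched in plain Python. This is only an illustration under stated assumptions, not ergm's actual C implementation: the function names (ctriple_change, summary_by_toggles) and the edge-set representation are made up for the example. It counts directed 3-cycles by starting from the empty graph's value (0) and adding one changescore per observed edge.

```python
def ctriple_change(edges, i, j):
    """Change in the number of directed 3-cycles if edge i->j is added.

    A new cycle i->j->k->i appears for every node k such that
    j->k and k->i are already present. (Hypothetical helper, not ergm code.)
    """
    nodes = {v for e in edges for v in e}
    return sum(
        1
        for k in nodes
        if k != i and k != j and (j, k) in edges and (k, i) in edges
    )


def summary_by_toggles(observed_edges, change, empty_value=0):
    """Accumulate changescores, one additive toggle per observed edge."""
    current = set()        # start from the empty graph
    stat = empty_value     # statistic of the empty graph
    for (i, j) in observed_edges:
        stat += change(current, i, j)  # changescore for adding i->j
        current.add((i, j))            # then actually add the edge
    return stat


# Two directed 3-cycles sharing node 1: 1->2->3->1 and 1->4->5->1.
observed = [(1, 2), (2, 3), (3, 1), (1, 4), (4, 5), (5, 1)]
n_cycles = summary_by_toggles(observed, ctriple_change)
n_cycles_reversed = summary_by_toggles(list(reversed(observed)), ctriple_change)
```

The order in which the edges are added does not matter: each cycle is counted exactly once, at the moment its last edge is toggled on.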
Another advantage of this scheme is that it greatly simplifies
implementation: one only needs to define a changescore function, and to
know the statistic for the empty graph (which is passed by the
InitErgmTerm function). There is no need for a function to know how to
do anything other than handle edge toggles, and one can use it for both
summary() and MCMC calls.
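
As a rough sketch of that reuse (again plain Python, with hypothetical names, and not how ergm is actually coded), the same changescore function below serves both a summary-style calculation and a stream of random toggles. The random toggles merely stand in for MCMC proposals; there is no acceptance step, so this is not a Metropolis-Hastings sampler.

```python
import random


def ctriple_toggle_change(edges, n, i, j):
    """Change in the directed 3-cycle count if dyad (i, j) is toggled.

    Adding i->j creates one cycle per node k with j->k and k->i present;
    removing i->j destroys the same number. (Hypothetical helper.)
    """
    count = sum(
        1
        for k in range(n)
        if k != i and k != j and (j, k) in edges and (k, i) in edges
    )
    return -count if (i, j) in edges else count


def summary(observed, n, change):
    """Summary use: build up from the empty graph, one toggle per edge."""
    current, stat = set(), 0
    for (i, j) in observed:
        stat += change(current, n, i, j)
        current.add((i, j))
    return stat


def random_toggles(start_edges, n, change, steps, rng):
    """MCMC-style use: track the statistic through arbitrary toggles."""
    edges = set(start_edges)
    stat = summary(edges, n, change)
    for _ in range(steps):
        i, j = rng.sample(range(n), 2)     # pick a random ordered dyad
        stat += change(edges, n, i, j)     # update via the changescore
        edges ^= {(i, j)}                  # add if absent, remove if present
    return edges, stat


rng = random.Random(0)
final_edges, tracked = random_toggles(
    {(0, 1), (1, 2), (2, 0)}, 5, ctriple_toggle_change, 200, rng
)
recomputed = summary(final_edges, 5, ctriple_toggle_change)
```

The running value kept up to date by toggles should always agree with a fresh from-empty recomputation on the final graph, which is the property that lets one function cover both uses.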
As noted at the outset, ergm 4.0 has added the ability to have terms
that do implement separate summary() behavior, for cases where this is
helpful - it can thus be done, but is optional. It is also possible to
have stateful behavior for changescores (which was not supported
before); that should certainly speed up some kinds of changescores, but
it's pretty new functionality and not widely used yet.
Hope that helps,
-Carter
On 5/9/23 6:35 PM, akumar01 wrote:
I have a question regarding how the ergm package counts the summary
statistics associated with model terms (for example when calling
summary(nw ~ ctriple) to get the number of directed 3-cycles).
I can appreciate that the Metropolis-Hastings steps and subsequent
MCMLE optimization steps only require keeping track of how the count
of the model terms changes upon edge swaps/toggles, but this whole
process is presumably predicated on knowing the initial real count in
the observed network.
After digging through the codebase a little, I found the file
allstatistics.c, which seems to use all possible toggles to somehow get
subgraph counts, though it does not seem to be explicitly used
during normal model fitting initialization.
Given that subgraph enumeration is a non-trivial problem with lots of
literature (e.g.
https://link.springer.com/chapter/10.1007/978-3-540-71681-5_7), I'm a
little surprised there is no documentation on how it is done within
ergm. Could anyone shed some light on this? I am particularly
interested in what is required in order to add new model terms.
Thanks!
Answer selected by
akumar01
-
Thanks for the detailed response!
-
To what Carter said, I'd add that if you're interested in allstatistics, there is documentation for ergm.allstats, though it sounds like that isn't quite what you're after.
Best,
Dave