Methods Question #1

Open
davisv7 opened this issue Jan 27, 2021 · 6 comments

Comments

@davisv7

davisv7 commented Jan 27, 2021

Hello,

I was wondering how you gathered the 1ml metadata.

Did you scrape it from their website or do they provide it in a downloadable format?

Thanks!

@davisv7
Author

davisv7 commented Jan 27, 2021

Not nearly as much information, but I think you only used the pubkeys from the csv anyway.

https://gist.github.com/davisv7/73e8970dea442aa3fd55d528ae511f7c

@seresistvanandras
Collaborator

seresistvanandras commented Jan 27, 2021

Hey! Yeah, it was scraped from the 1ml website! If you download the data we used, you can find the 1ml metadata in the 1ml_meta_data.csv file. Hope this helps!

P.S. Not sure if we included the scraping script in this repo. We can definitely add it if necessary.

@defianalytics

Hi @seresistvanandras,
Many thanks for the clarification. May I know if I can reuse the following?

  1. Your scraping script for obtaining the 1ML data. If so, could you please share it with me at [email protected]
  2. The script used to produce the daily snapshots. Did you deploy a node and collect this data yourselves, or did you get it from a network explorer such as 1ML or amboss.space?

@seresistvanandras
Collaborator

seresistvanandras commented Sep 19, 2021

Hey! The data used in our work was kindly provided by Antoine Le Calvez (@khannib on Twitter), a data scientist and engineer at CoinMetrics. He had a list of all the channel openings from the very beginning of LN. I believe there are other people as well who have all the (public) channel data since day 0. Maybe you could also ask Christian Decker or other prominent members of the LN community. But if you just want a single snapshot of the current channel graph, that should be much easier to obtain.

I did not write the scraping script; hopefully @ferencberes still has it. Let's see if he can help you!
Hope this helps!
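
For readers wondering what "a single snapshot of the current channel graph" might look like in practice, here is a minimal sketch. It assumes you run your own lnd node and have saved the output of `lncli describegraph` to a file; the field names (`nodes`, `edges`, `capacity`) follow lnd's JSON output as far as I know, and the function name `summarize_snapshot` and the tiny example graph are purely illustrative, not part of this repo.

```python
import json


def summarize_snapshot(graph: dict) -> dict:
    """Count nodes, channels and total capacity in a describegraph-style dict."""
    nodes = graph.get("nodes", [])
    edges = graph.get("edges", [])
    # Capacity (in satoshis) may be serialized as a string, so convert before summing.
    total_capacity = sum(int(e.get("capacity", 0)) for e in edges)
    return {
        "nodes": len(nodes),
        "channels": len(edges),
        "capacity_sat": total_capacity,
    }


# Hand-made example in the assumed shape (not real network data).
example = {
    "nodes": [{"pub_key": "A"}, {"pub_key": "B"}, {"pub_key": "C"}],
    "edges": [
        {"channel_id": "1", "node1_pub": "A", "node2_pub": "B", "capacity": "100000"},
        {"channel_id": "2", "node1_pub": "B", "node2_pub": "C", "capacity": "250000"},
    ],
}
summary = summarize_snapshot(example)
print(json.dumps(summary))
```

With a real node, you would first run `lncli describegraph > snapshot.json`, then load the file with `json.load` and pass the result to the function. Note that this only covers the public graph your node has learned via gossip at that moment, not the historical edge stream.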

@ferencberes
Owner

Hi @defianalytics

I looked up our old 1ml scraper and made it available on the following link:
http://info.ilab.sztaki.hu/~fberes/ln/1ml_scraper.py
Unfortunately, we haven't used it since early 2019, so if the underlying structure of the 1ml.com website has changed, you will need to refactor this script accordingly.

Best regards,
Ferenc

@defianalytics

Many thanks, @ferencberes. I will take a look at the script and modify it accordingly.
Also, may I know whether there is a setup guide for collecting the latest data? In particular:

  1. How to gather the edge stream data files.

  2. How to produce the snapshot file sample.json.

  3. Whether the files ln.tsv and ln_edges.csv are the output of this line in the code:
    directed_edges = preprocess_json_file("%s/sample.json" % data_dir)
    Please let me know if I am correct.

  4. Have you used https://github.com/lnresearch/topology and, in your opinion, how can I use it?

Please note that I have read your paper on arXiv:
"First, we gathered
an edge stream data that describes every payment channel opening and closure from block height 501,337
(in December 28, 2017) to 576,140 (in May 15, 2019). Second, we collected snapshots of the public graph
using the lnd client and utilized snapshots taken by Rohrer et al [31] as well."
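
As a rough illustration of the snapshot-to-edges step asked about above, the sketch below expands each channel of a describegraph-style snapshot into up to two directed edges and writes them out as CSV. This is not the repo's `preprocess_json_file`; the function name `snapshot_to_directed_edges`, the output columns, and the assumption that `node1_policy` describes the node1 to node2 direction are mine and should be checked against the actual code and the lnd documentation.

```python
import csv
import io


def snapshot_to_directed_edges(graph: dict) -> list:
    """Turn each channel into one directed edge per usable routing policy."""
    rows = []
    for ch in graph.get("edges", []):
        directions = (
            (ch["node1_pub"], ch["node2_pub"], "node1_policy"),
            (ch["node2_pub"], ch["node1_pub"], "node2_policy"),
        )
        for src, dst, policy_key in directions:
            policy = ch.get(policy_key)
            # Skip directions with no advertised policy or a disabled one.
            if policy is None or policy.get("disabled", False):
                continue
            rows.append({
                "channel_id": ch["channel_id"],
                "src": src,
                "dst": dst,
                "capacity": int(ch.get("capacity", 0)),
                "fee_base_msat": int(policy.get("fee_base_msat", 0)),
                "fee_rate_milli_msat": int(policy.get("fee_rate_milli_msat", 0)),
            })
    return rows


# Hand-made example (not real data): only node1 advertises a policy.
example = {
    "edges": [{
        "channel_id": "1", "node1_pub": "A", "node2_pub": "B", "capacity": "100000",
        "node1_policy": {"fee_base_msat": "1000", "fee_rate_milli_msat": "1",
                         "disabled": False},
        "node2_policy": None,
    }]
}
rows = snapshot_to_directed_edges(example)

# Write to an in-memory buffer; swap in open("edges.csv", "w", newline="") for a file.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

Dropping directions without a usable policy is one possible design choice; an analysis that only cares about topology might instead keep both directions for every channel.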
