Community Database Dump #218
Comments
I will soon create a page under https://kescher.at/magneticod-db for sharing my own SQLite database backups. |
Thanks @kescherCode. I suggest you take the same approach of sharing via .torrent. As far as GitHub is concerned, I am confident that sharing an external webpage that contains .torrent files of self-created databases is not in violation of any rules. Neither the webpage posted here nor the torrent/database contains any copyrighted material. |
That's good news. Is someone brave enough to write a small script to merge two databases? |
https://github.com/kd8bny/sqlMerge Alternatively: |
I have put up a page at https://kescher.at/magneticod-db now :) |
There is a real issue here. Then, I explored manual merging. Obviously,
This is not so difficult, but I don't think it can be done using a generic script. |
It may be ideal to compress the database before sharing. Compressing with LZMA2 (7zip) on the 'fastest' preset (64K dictionary, 32 word size) yields a file 23.3% of the size of the uncompressed database. Compressing on the 'normal' preset (16M dictionary, 32 word size) yields a file 15-18.2% of the size of the uncompressed database. I'd suggest using the following command on linux systems with
This will produce a file named It can then be decompressed after being downloaded using I've taken the liberty of compressing both of the shared databases so far: https://public.tool.nz/DHT/ |
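For anyone who prefers to script the compression step, here is a minimal sketch using Python's standard `lzma` module, which writes the same `.xz` container. The function names, file paths and the default preset are assumptions for illustration; this is not the command suggested above.

```python
import lzma
import shutil

CHUNK = 1024 * 1024  # copy in 1 MiB chunks so a multi-GB database never sits in RAM


def compress_xz(src_path, dst_path, preset=6):
    """Stream src_path into an .xz file at dst_path (preset 0-9, higher = smaller/slower)."""
    with open(src_path, "rb") as src, lzma.open(dst_path, "wb", preset=preset) as dst:
        shutil.copyfileobj(src, dst, length=CHUNK)


def decompress_xz(src_path, dst_path):
    """Stream an .xz file back into its uncompressed form."""
    with lzma.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst, length=CHUNK)
```

Usage would look like `compress_xz("database.sqlite3", "database.sqlite3.xz")`, followed by `decompress_xz(...)` on the downloading side.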
@AlphaDelta I agree, and I will compress my future torrents containing a database.sqlite3 with xz, however probably with the most aggressive preset you've ever seen:
|
@kescherCode That is indeed the most aggressive preset I've ever seen 😂 Just keep in mind the entire dictionary is loaded into memory when decompressing, so it would require allocating 1536MiB of memory just to decompress the database. Probably not worth riding the exponential-cost train that far. |
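The memory point can be checked before publishing a dump. Below is a minimal sketch using Python's standard `lzma` module: it compresses with an explicit LZMA2 dictionary size and then tries to decompress under a memory limit. The helper names and the sizes used are arbitrary example values, not settings taken from this thread.

```python
import lzma

MiB = 1024 * 1024


def compress_with_dict(data, dict_size):
    """Compress bytes to .xz with an explicit LZMA2 dictionary size (in bytes)."""
    filters = [{"id": lzma.FILTER_LZMA2, "preset": 6, "dict_size": dict_size}]
    return lzma.compress(data, format=lzma.FORMAT_XZ, filters=filters)


def fits_in_memory(blob, memlimit):
    """Return True if blob decompresses within memlimit bytes of decoder memory.

    Note: any LZMAError (including corrupt input) is reported as False here.
    """
    try:
        lzma.decompress(blob, memlimit=memlimit)
        return True
    except lzma.LZMAError:
        return False
```

The decoder has to allocate roughly the dictionary size recorded in the file, so a stream written with a very large dictionary cannot be unpacked on a machine with less free memory than that, no matter how small the payload is.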
That's a very significant size reduction using I will share both compressed and uncompressed versions from next time. Those who have a low-memory VPS for hosting magneticow, may want to directly download the uncompressed database. |
BTW I was writing a simple and dirty tool to migrate old But if you're not scared of bad UPD: just pushed small |
Here it is : https://framagit.org/Glandos/magnetico_merge |
@Glandos It's worth mentioning that it only works with SQLite databases. |
Yes indeed, but it is the only database currently supported :) And the only database that is shared. Sharing a PostgreSQL database for merging is not complex, but it is different. |
Here it is. Now, I have a big merged database built from both of your databases. More than 7 million torrents with more than 216 million file entries. |
Did you not remove duplicates? I would guess that there would be significant overlap between the databases. |
There is a lot of overlap, as you saw in the merge report: |
What do you call a duplicate? Torrents with the same infohash couldn't be inserted again. |
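To illustrate why a torrent with an already-known infohash cannot be inserted twice, here is a minimal sketch of a merge that relies on a `UNIQUE` constraint. The two-table schema below is a simplified assumption for illustration; it is neither magnetico's real schema nor the code of magnetico_merge.

```python
import sqlite3

# Hypothetical, simplified schema: one row per torrent, many file rows per torrent.
SCHEMA = """
CREATE TABLE IF NOT EXISTS torrents (
    id        INTEGER PRIMARY KEY,
    info_hash BLOB NOT NULL UNIQUE,
    name      TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS files (
    id         INTEGER PRIMARY KEY,
    torrent_id INTEGER NOT NULL REFERENCES torrents(id),
    path       TEXT NOT NULL,
    size       INTEGER NOT NULL
);
"""


def merge(src, dst):
    """Copy torrents (and their files) from src into dst, skipping known infohashes.

    Returns (added, skipped).
    """
    added = skipped = 0
    with dst:  # one transaction: commit on success, roll back on error
        for old_id, info_hash, name in src.execute(
            "SELECT id, info_hash, name FROM torrents"
        ):
            cur = dst.execute(
                "INSERT OR IGNORE INTO torrents (info_hash, name) VALUES (?, ?)",
                (info_hash, name),
            )
            if cur.rowcount == 0:  # UNIQUE(info_hash) rejected the row
                skipped += 1
                continue
            new_id = cur.lastrowid  # ids differ between databases, so remap
            files = src.execute(
                "SELECT path, size FROM files WHERE torrent_id = ?", (old_id,)
            ).fetchall()
            dst.executemany(
                "INSERT INTO files (torrent_id, path, size) VALUES (?, ?, ?)",
                [(new_id, path, size) for path, size in files],
            )
            added += 1
    return added, skipped
```

The file rows must be re-pointed at the new torrent id because row ids are local to each database, which is one reason a fully generic merge script does not fit well here.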
Here is mine: https://antipoul.fr/dhtd/ This is very basic. |
@anindyamaiti Thanks for your regular updates. Your page is very nice. Do you think you can add an RSS feed? I know it's another thing to do :) |
Pinned! I think once we implement import & export functionality, it'd be even easier (and portable across different databases). =) Closing because it's not an issue but feel free to keep the discussion & sharing going. |
@Glandos I was thinking of the same. Here is a basic automated RSS feed of the ten most recently added files: Nothing fancy, just the filenames, but it should be good enough for a notification. If anyone else is interested in incorporating RSS for their shares, here is my (dirty) PHP code: https://tnt.maiti.info/dhtd/rss.php.txt
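For those not running PHP, the same idea can be sketched in Python: list a directory, keep the ten most recently modified files, and emit a bare RSS 2.0 document. This is not a port of the linked script; the function name, channel title and URL layout are assumptions for illustration.

```python
import os
from email.utils import formatdate
from urllib.parse import quote
from xml.etree import ElementTree as ET


def build_rss(directory, base_url, title="Database dumps", limit=10):
    """Return an RSS 2.0 string listing the `limit` most recently modified files."""
    with os.scandir(directory) as it:
        entries = [e for e in it if e.is_file()]
    entries.sort(key=lambda e: e.stat().st_mtime, reverse=True)  # newest first

    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = base_url
    ET.SubElement(channel, "description").text = "Most recently added files"

    for entry in entries[:limit]:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = entry.name
        ET.SubElement(item, "link").text = base_url.rstrip("/") + "/" + quote(entry.name)
        # RSS expects RFC 822 style dates, e.g. "Sat, 11 Jul 2020 19:20:00 GMT"
        ET.SubElement(item, "pubDate").text = formatdate(entry.stat().st_mtime, usegmt=True)

    return ET.tostring(rss, encoding="unicode")
```

Running it from a cron job and writing the result to a static file avoids any server-side scripting on a low-powered host.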
@boramalper thanks for the pin! 😊 |
@boramalper suggested I point to torrents.csv, an open repository of torrents / global search engine. Here's the issue for potentially adding people's data to this. |
I have a new dump of my own: https://antipoul.fr/dhtd/20200203_9.2M_magnetico-merge.torrent Since my server is really low on CPU, I didn't use XZ and switched to zstandard. The output is larger, but much faster to compress / decompress. No tracker inside, so I will be the first peer in the swarm. |
I, too, have released a new dump on https://kescher.at/magneticod-db. Obviously not relying on trackers either, just the DHT. In case your client allows manual adding of peers and your client doesn't seem to find a connection, feel free to add Also, I may seed other dumps here in order to increase availability for people that want to bootstrap their db, hence why I call my torrents "Magneticod bootstrap". |
I have released a new dump, having roughly 10.8 million torrents. You can get it here. If you can't find any peers through DHT, add |
@kescherCode could you update your dump? I'd offer to host it as a direct download on one of my servers. I considered sharing my version but figured your public magnetico instance has over 11.3 million torrents now which makes my 8 million look rather pale in comparison. |
@19h Small hint for the future: If you want to share your sqlite3 file, be sure to manually open it with sqlite3 and execute |
@19h and @kescherCode I recreated the torrent with announce URIs in it. My client announced it, so now, you should be able to find me. But you need to reimport the torrent (from https://antipoul.fr/dhtd/20200706_13.6M_magnetico-merge.torrent) as I've updated it. @skobkin I can't download from MEGA because the file is too big, and it requires me to install extra software. I won't do this, sorry :) |
@Glandos I'm removing it then 🤷 |
My latest dump, containing 13.8 million torrents. I will make sure this file is well-seeded by a fast connection as well as my home connection. |
@19h see updated dump above. |
@Glandos @kescherCode that's amazing, thanks both of you! |
@Glandos @kescherCode jfyi I fetched both dumps and my seedbox is seeding them. |
I'm seeding your torrents. Also ... I'm currently writing a merging tool in Rust so that it's a bit faster, but my ideal future of this would be migrating off sqlite to leveldb (or the fb fork rocksdb). I'm also playing with the idea of building a frontend searching the database using tantivy, but that's a bit of a stretch goal .. It would be cool if we could have a semi-dht where we can interconnect our instances so that they act as isolated satellites for each other.. |
how to install this project on vps? |
@sunnymme this isn't the right place for this question. Check the readme, check other issues or create one .. |
Thanks a lot. I have created an issue for this question. magnetico is really good. I want to set it up, but I don't know how to do it. |
Here is the dump of my database. 2.64M torrents. My database is not merged with any other database. Compressed zst file is 2.5GB. The sqlite3 file is 8.3GB. You can find my database at https://dyonr.nl/magnetico/ Preferably, use the .torrent to download it instead of downloading the zst file. The torrent is loaded on my seedbox (1Gbit/s), the .zst on my server, which is limited to 200Mbit/s. |
Thanks a lot. I will try it. You are so kind. Sincerely yours, Sunny |
@DyonR I'm seeding your database now, but I won't merge it in until my next dump |
For importing boramalper/magnetico#218
Here is my fresher dump: https://antipoul.fr/dhtd/20201112_14.1_magnetico-merge.torrent Unfortunately, discovery seems to be stalling a bit. Sometimes I get 0.1 torrents per second, but it is usually 10 times less… |
Since some of you have millions of torrents, maybe you are interested in adding support for other databases that scale better than SQLite. Some users are getting request timeouts in magneticow due to poor SQLite performance. I don't have time to work on this issue, but maybe some of you do. Jackett/Jackett#10174 (comment) UPDATE: Of course, having a faster backend will increase the discovery/indexing speed too. There is an attempt to include Postgres but I think it's abandoned #214 |
@ngosang It's not abandoned, it has been working for me for more than a year now 😄 I just forgot about it because Bora didn't answer my question. I think I can make the last change he asked for soon, but I'm not sure he'll merge it because he hasn't been supporting magnetico for a long time. UPD: You can test it using this Docker image: https://hub.docker.com/r/skobkin/magneticod |
@ngosang commented on 15 Nov. 2020 at 13:01 UTC+1:
Since magneticod is not using 100% of a CPU, I don't think this is the current bottleneck. |
@Glandos SQLite is really a bottleneck sometimes. It won't use 100% of the CPU because it's most likely using 100% of the disk. I don't have time to check it, so I could be wrong. If someone can check the disk usage (IOPS, throughput, latency) when searching torrents in a VERY LARGE database, let us know. |
From my experience as a software architect, if you have a 10GB database, SQLite read performance is between 100 and 1000 times slower than other relational databases like MySQL, Postgres, Oracle. |
BTW, I've just updated the PR with PostgreSQL, eliminating the last "problem" which was pointed out a year ago. |
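Numbers like these are easier to discuss with a measurement from your own dump. The following is a minimal sketch that times a substring search over torrent names; the `torrents(name)` table and the `LIKE` query are assumptions for illustration and may not match how magneticow actually searches.

```python
import sqlite3
import time


def time_search(conn, term, repeats=5):
    """Run a LIKE search `repeats` times; return (best_seconds, rows_returned)."""
    pattern = f"%{term}%"
    timings = []
    rows = []
    for _ in range(repeats):
        start = time.perf_counter()
        rows = conn.execute(
            "SELECT name FROM torrents WHERE name LIKE ? LIMIT 20", (pattern,)
        ).fetchall()
        timings.append(time.perf_counter() - start)
    # The minimum is the least noisy figure; the first run also shows cold-cache cost.
    return min(timings), len(rows)
```

For example, `time_search(sqlite3.connect("database.sqlite3"), "ubuntu")` on a large dump, run alongside a disk monitor, would show whether the time goes into I/O or CPU.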
@skobkin Now, how do I migrate my data from SQLite to Postgres? lol |
@kescherCode See this comment. Be aware that |
I started fresh last month, and have ~1.7M torrents in my database after about 3 weeks. I plan to keep my magneticod running for the foreseeable future, expecting to add about 50K per day after the initial spike levels off.
There have been requests for a database dump before, but no one has shared theirs, to my knowledge. So, I thought of taking the initiative. Here is my website where I will share (via .torrent) my database dump 1-2 times a month: https://tnt.maiti.info/dhtd/
You can use it as-is or to get a head start with magneticod. And, don't forget to seed!
Huge thanks to @boramalper for making this project happen.