Redis Lists As Wget Queues
How to improve your daily scripting and automatable tasks with Redis lists
Recently I’ve been skimming through texts I can find on
Anna’s Archive, and one of the problems I’ve run into is that the free, slower
servers limit each client to a single concurrent download. So if I want to
download a bunch of books, doing it manually means waiting for each download
to finish before I can enter the next wget.
A better way is to use a queue: a background worker waits for new URLs to be added and downloads them one at a time. This is easy to implement with a Redis list. Here’s an example:
redis-cli RPUSH archive "https://annas-archive.se/.../book_one.epub" "https://annas-archive.se/.../book_two.epub" ...
And then in the background I run a script like this:
#!/bin/bash
while :; do
    # Peek at the head of the queue without removing it, so the URL
    # isn't lost if the script dies mid-download.
    url=$(redis-cli LINDEX archive 0)
    if [[ -n "$url" && "$url" != "(nil)" ]]; then
        # Name the temporary file after the URL's extension (e.g. .epub).
        file="file.${url##*.}"
        if wget "$url" -O "$file"; then
            calibredb add "$file" --with-library /mnt/das/calibre
            echo "Added $url"
        else
            echo "Failed to add $url"
        fi
        # Pop the URL either way so a bad link doesn't block the queue.
        redis-cli LPOP archive
    fi
    sleep 30
done
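To keep the worker alive I launch it in the background. A minimal sketch, assuming the script above is saved as archive-worker.sh (the filename and log path are just examples):

nohup ./archive-worker.sh >> archive-worker.log 2>&1 &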
From then on, as long as I leave the background script running, I can download
from the archive without having to keep track of when downloads finish. All I
have to do is run redis-cli RPUSH commands and the worker takes care of the
rest. This technique also comes with a bonus: if I want to cancel downloads
but save them for later, they simply remain in the queue.
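And because the queue is just a Redis list, I can inspect it or drop a specific entry with the standard list commands; for example (the URL here is just a placeholder):

# List everything currently queued
redis-cli LRANGE archive 0 -1
# Remove every occurrence of one URL from the queue
redis-cli LREM archive 0 "https://annas-archive.se/.../book_one.epub"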
