Compare commits

5 Commits

7 changed files with 72 additions and 20 deletions

@ -1,12 +1,12 @@
ListDownloader
-====================
+==============
About
---------------------
+-----
This program takes a file containing a list of URLs and a destination directory, and downloads the files sequentially or in parallel. It can load the whole list at once or work through it in parts, and the number of threads/processes to use is configurable.
Installation
---------------------
+------------
(Installation was prepared for and tested with Debian Jessie.)
@ -14,7 +14,9 @@ You can install the package with pip using
# pip install listdownloader
-Or you can create the installation package yourself from the source using
+OR you can use the scripts that are provided to do that (`run_build`, and `run_install`), which are available in the repository.
+OR you can create the source installation package yourself using
python3 setup.py sdist
@ -22,7 +24,7 @@ and then use pip to install the package that will be built in the directory `dis
# pip3 install listdownloader-x.y.z.tar.gz
-where x.y.z is the current version of the program.
+where x.y.z is the current version of the program.
The program installs the package listdownloader and a script file for usage.
@ -34,10 +36,14 @@ The script can be executed (globally) using:
$ downloadlist.py -f file.txt -d destination -t threads -l lines
where:
-`file.txt` is the file name/path with the list of URLs to be downloaded
-`destination` is the path, to which the files should be downloaded
-`threads` is the number of processes to be used to download the URLs simultaneously
-`lines` is the number of lines to read from the files and read simultaneously. 0 leads to reading the whole file.
+`file.txt` is the file name/path with the list of URLs to be downloaded (line by line)
+`destination` is the path, to which the files should be downloaded
+`threads` is the number of processes to be used to download the URLs simultaneously
+`lines` is the number of lines to read from the files and read simultaneously. 0 leads to reading the whole file.
You may use the package in your own scripts by importing it:
@ -48,9 +54,12 @@ then you can download a list of files using:
listdownloader.download_files(URLs, destination, num_threads)
where:
-`URLs` is a list of the URLs to be downloaded
-`destination` is a string with the path, at which the files have to be saved
-`num_threads` is the number of threads/processes to use for the download.
+`URLs` is a list of the URLs to be downloaded
+`destination` is a string with the path, at which the files have to be saved
+`num_threads` is the number of threads/processes to use for the download.
You can also download a single file using the function:
@ -62,4 +71,4 @@ MPL
About
-----
-This script was written by Samer Afach, samer@afach.de for test purposes.
+This script was written by Samer Afach for test purposes.

@ -1,12 +1,12 @@
ListDownloader
-====================
+==============
About
---------------------
+-----
This program takes a file containing a list of URLs and a destination directory, and downloads the files sequentially or in parallel. It can load the whole list at once or work through it in parts, and the number of threads/processes to use is configurable.
Installation
---------------------
+------------
(Installation was prepared for and tested with Debian Jessie.)
@ -14,7 +14,9 @@ You can install the package with pip using
# pip install listdownloader
-Or you can create the installation package yourself from the source using
+OR you can use the scripts that are provided to do that (`run_build`, and `run_install`), which are available in the repository.
+OR you can create the source installation package yourself using
python3 setup.py sdist

@ -171,17 +171,23 @@ def download_files(list_of_urls, to_dir, processes=0):
    list_of_urls = [line.replace(' ', '').replace('\n', '').replace('\t', '') for line in list_of_urls]
    if not os.path.isdir(to_dir):
        mkdir_p(to_dir)
    # try to detect the number of CPUs automatically
    if processes <= 0:
        try:
            processes = mp.cpu_count()
        except NotImplementedError as e:
            sys.stderr.write("Unable to determine the number of CPUs for parallelization. Proceeding sequentially. "
-                             "Consider inputting the number of CPUs manually.\n")
+                             "Consider inputting the number of CPUs manually. The error says: " + str(e) + "\n")
            _download_files(list_of_urls, to_dir)
            return
    # if there's only 1 process or 1 URL, there's no need to use multiprocessing
    elif processes == 1 or len(list_of_urls) == 1:
        _download_files(list_of_urls, to_dir)
        return
    # if number of processes is larger than the number of URLs, reduce the number of processes to save resources
    elif processes > len(list_of_urls):
        processes = len(list_of_urls)
@ -189,4 +195,4 @@ def download_files(list_of_urls, to_dir, processes=0):
    pool = mp.Pool(processes)
    pool.starmap(download_file, params)
    pool.close()
-    pool.join()
+    pool.join()
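
The branch order in this hunk can be sketched as a standalone helper. This is only an illustration under stated assumptions: `choose_process_count` and its `cpu_count` parameter are hypothetical names that do not exist in `listdownloader`, and a return value of 1 stands in for the sequential `_download_files` path.

```python
import multiprocessing as mp
import sys


def choose_process_count(requested, num_urls, cpu_count=mp.cpu_count):
    """Return the number of worker processes to use (1 means sequential).

    Hypothetical helper that mirrors the if/elif chain in download_files.
    The cpu_count parameter is an assumption added so the detection step
    can be replaced in tests.
    """
    if requested <= 0:
        # No explicit request: try to detect the number of CPUs.
        try:
            return cpu_count()
        except NotImplementedError as e:
            sys.stderr.write("Unable to determine the number of CPUs; "
                             "proceeding sequentially. The error says: "
                             + str(e) + "\n")
            return 1
    elif requested == 1 or num_urls == 1:
        # One process or one URL: multiprocessing brings no benefit.
        return 1
    elif requested > num_urls:
        # More processes than URLs: cap to save resources.
        return num_urls
    return requested
```

As in the hunk, an auto-detected count is returned as-is and is not capped at the number of URLs, because the capping branch belongs to the same `if`/`elif` chain.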

run_build (new file, 14 lines)

@ -0,0 +1,14 @@
#!/bin/bash
#go to the directory of the script
reldir=`dirname $0`
cd $reldir
directory=`pwd`
rm -rf dist
python3 setup.py sdist
if [ $? -ne 0 ]; then
    echo 'Python packager source distribution tool failed.'
    # a bare `exit` would return the status of the echo above (0)
    exit 1
fi

run_install (new file, 18 lines)

@ -0,0 +1,18 @@
#!/bin/bash
#go to the directory of the script
reldir=`dirname $0`
cd $reldir
directory=`pwd`
if [ "$(id -u)" != "0" ]; then
    echo "This script must be run as root" 1>&2
    exit 1
fi
pip3 install dist/listdownloader* --upgrade
if [ $? -ne 0 ]; then
    echo 'Package installation failed. Did you build the package? Do you have pip3 installed?'
    # a bare `exit` would return the status of the echo above (0)
    exit 1
fi

run_publish (new file, 3 lines)

@ -0,0 +1,3 @@
#!/bin/bash
python3 setup.py register sdist upload

@ -5,7 +5,7 @@ del os.link
setup(
name="listdownloader",
-version="0.1.3",
+version="0.1.4",
author="Samer Afach",
author_email="samer@afach.de",
packages=["listdownloader"],