Compare commits

45 Commits
0.3.7 ... 0.4.2

SHA1 Message Date
14a53a0953 fix 2020-10-02 01:39:42 +08:00
c5e4b5ffa8 update 2020-10-02 01:39:14 +08:00
b3f25875d0 fix bug on mac #126 2020-10-02 01:32:18 +08:00
91053b98af 0.4.1 2020-10-02 01:02:41 +08:00
7570b6ae7d remove img2pdf in requirements 2020-10-02 00:55:26 +08:00
d2e68c6c45 fix #146 #142 #146 2020-10-02 00:51:37 +08:00
78429423d9 fix bug 2020-06-26 13:29:44 +08:00
38ff69d99d add sort options 2020-06-26 13:28:10 +08:00
2ce36204fe update tests 2020-06-26 13:18:08 +08:00
e9864d158f update tests 2020-06-26 13:15:57 +08:00
43013badd4 update .gitignore 2020-06-26 13:12:49 +08:00
7508a2010d 0.4.0 2020-06-26 13:12:37 +08:00
946761477d Merge pull request #139 from RicterZ/master
Merge into dev branch
2020-06-26 12:48:51 +08:00
db80408024 Merge pull request #138 from RicterZ/revert-134-master
Revert "Fix fatal error and keep index of id which from file"
2020-06-26 12:47:25 +08:00
4c85cebb78 Revert "Fix fatal error and keep index of id which from file" 2020-06-26 12:47:10 +08:00
e982a8170c Merge pull request #134 from ODtian/master
Fix fatal error and keep index of id which from file
2020-06-26 12:46:08 +08:00
0b62f0ebd9 Merge pull request #137 from jwfiredragon/patch-1
Fixing typos
2020-06-26 12:45:55 +08:00
37b4ee7d00 Fixing typos
ms-user-select should be -ms-user-select. #0d0d0d9 isn't a valid hex code - I assume it's supposed to be #0d0d0d?
2020-06-23 23:04:09 -07:00
84cad0d475 Update cmdline.py 2020-06-24 12:00:17 +08:00
bf03881ed6 Fix fatal error and keep index of id which from file 2020-06-23 20:39:41 +08:00
f97b814b45 Merge pull request #131 from myzWILLmake/dev
remove args.tag since no tag option in parser
2020-06-22 18:11:18 +08:00
7323eae99b remove args.tag since no tag option in parser 2020-06-15 10:00:23 +08:00
6e07f0426b Merge pull request #130 from jwfiredragon/patch-1
Fixing parser for nhentai site update
2020-06-12 10:32:34 +08:00
44c424a321 Fixing parser for nhentai site update
nhentai's recent site update broke the parser, this fixes it. Based off the work on [my fork here](8c4a4f02bc).
2020-06-10 22:39:35 -07:00
3db77e0ce3 Merge pull request #127 from Tsuribori/dev
Add PDF support
2020-06-08 11:11:42 +08:00
22dbb4dd0d Add PDF support 2020-06-07 19:07:40 +03:00
2be4bd71ce Merge pull request #123 from Alocks/dev
--search fix, removed --tag commands
2020-05-06 19:16:27 +08:00
fc39aeb49e stupid fix 2020-05-02 14:52:24 -03:00
be2ec3f452 updated documentation 2020-05-02 14:35:22 -03:00
0c23f64356 removed all --tag commands since --search API is working again, now --language is a setting, cleaned some code 2020-05-02 14:23:31 -03:00
7e4dff8fec move import statement to function 2020-05-01 22:20:55 +08:00
e2a1d79b1b fix #117 2020-05-01 22:18:03 +08:00
8183f3a7a9 Merge pull request #119 from BachoSeven/master
Updated README
2020-04-26 09:57:39 +08:00
80713d2e00 updated README.rst 2020-04-25 18:19:44 +02:00
a2cd025027 updated README.rst 2020-04-25 18:18:48 +02:00
2f7bb59e58 Update README.rst 2020-04-25 18:04:50 +02:00
e94685d9c5 Merge pull request #116 from AnhNhan/master
write ComicInfo.xml for CBZ files
2020-04-22 12:52:17 +08:00
07d804b047 move ComicInfo.xml behind the --comic-info flag 2020-04-22 06:19:12 +02:00
5552d39337 fix --artist, --character, --parody, --group 2020-04-21 14:54:04 +02:00
d35190f9d0 write ComicInfo.xml for CBZ files 2020-04-21 13:23:50 +02:00
c8bca4240a Merge pull request #115 from RicterZ/dev
fix bug #114
2020-04-20 20:17:09 +08:00
130386054f 0.3.9 2020-04-20 20:16:48 +08:00
df16109788 fix install script on python2 2020-04-20 20:15:06 +08:00
c18cd2aaa5 Merge pull request #112 from RicterZ/dev
0.3.8
2020-04-20 20:07:02 +08:00
197b5e4923 update 2020-04-09 22:04:45 +08:00
14 changed files with 197 additions and 202 deletions

.gitignore

@@ -6,3 +6,4 @@ dist/
.python-version .python-version
.DS_Store .DS_Store
output/ output/
venv/

@@ -3,7 +3,6 @@ os:
language: python language: python
python: python:
- 2.7
- 3.7 - 3.7
install: install:
@@ -11,10 +10,9 @@ install:
script: script:
- echo 268642 > /tmp/test.txt - echo 268642 > /tmp/test.txt
- nhentai --cookie "csrftoken=xIh7s9d4NB8qSLN7eJZG9064zsV84aHEYFoAU49Ib9anqmoT0pZRw6TIdayLzQuT; sessionid=un101zfgpglsyffdnsm72le4euuisp7t" - nhentai --cookie "_ga=GA1.2.2000087053.1558179358; __cfduid=d8930f7b43d04e1b2117719e28386b2e31593148489; csrftoken=3914GQGSmmqQyfQTBswNgfXuhFiefu8sAgOnsfZWiiqS4PJpKivuTp34p2USV6xu; sessionid=be0w2lwlprlmld3ahg9i592ipsuaw840"
- nhentai --search umaru - nhentai --search umaru
- nhentai --id=152503,146134 -t 10 --output=/tmp/ --cbz - nhentai --id=152503,146134 -t 10 --output=/tmp/ --cbz
- nhentai --tag lolicon --sorting popular
- nhentai -F - nhentai -F
- nhentai --file /tmp/test.txt - nhentai --file /tmp/test.txt
- nhentai --id=152503,146134 --gen-main --output=/tmp/ - nhentai --id=152503,146134 --gen-main --output=/tmp/

@@ -19,15 +19,30 @@ nhentai
nHentai is a CLI tool for downloading doujinshi from <http://nhentai.net> nHentai is a CLI tool for downloading doujinshi from <http://nhentai.net>
============ ===================
Installation Manual Installation
============ ===================
.. code-block:: .. code-block::
git clone https://github.com/RicterZ/nhentai git clone https://github.com/RicterZ/nhentai
cd nhentai cd nhentai
python setup.py install python setup.py install
==================
Installation (pip)
==================
Alternatively, install from PyPI with pip:
.. code-block::
pip install nhentai
For a self-contained installation, use `Pipx <https://github.com/pipxproject/pipx/>`_:
.. code-block::
pipx install nhentai
===================== =====================
Installation (Gentoo) Installation (Gentoo)
===================== =====================
@@ -50,6 +65,8 @@ Set your nhentai cookie against captcha:
nhentai --cookie "YOUR COOKIE FROM nhentai.net" nhentai --cookie "YOUR COOKIE FROM nhentai.net"
**NOTE**: The format of the cookie is `"csrftoken=TOKEN; sessionid=ID"`
Download specified doujinshi: Download specified doujinshi:
.. code-block:: bash .. code-block:: bash
@@ -62,53 +79,20 @@ Download doujinshi with ids specified in a file (doujinshi ids split by line):
nhentai --file=doujinshi.txt nhentai --file=doujinshi.txt
Set search default language
.. code-block:: bash
nhentai --language=english
Search a keyword and download the first page: Search a keyword and download the first page:
.. code-block:: bash .. code-block:: bash
nhentai --search="tomori" --page=1 --download nhentai --search="tomori" --page=1 --download
# you also can download by tags and multiple keywords
Download by tag name: nhentai --search="tag:lolicon, artist:henreader, tag:full color"
nhentai --search="lolicon, henreader, full color"
.. code-block:: bash
nhentai --tag lolicon --download --page=2
Download by language:
.. code-block:: bash
nhentai --language english --download --page=2
Download by artist name:
.. code-block:: bash
nhentai --artist henreader --download
Download by character name:
.. code-block:: bash
nhentai --character "kuro von einsbern" --download
Download by parody name:
.. code-block:: bash
nhentai --parody "the idolmaster" --download
Download by group name:
.. code-block:: bash
nhentai --group clesta --download
Download using multiple tags (--tag, --character, --parody and --group supported):
.. code-block:: bash
nhentai --tag "lolicon, teasing" --artist "tamano kedama, atte nanakusa"
Download your favorites with delay: Download your favorites with delay:
@@ -170,8 +154,9 @@ Other options:
--no-html don't generate HTML after downloading --no-html don't generate HTML after downloading
--gen-main generate a main viewer contain all the doujin in the folder --gen-main generate a main viewer contain all the doujin in the folder
-C, --cbz generate Comic Book CBZ File -C, --cbz generate Comic Book CBZ File
-P --pdf generate PDF file
--rm-origin-dir remove downloaded doujinshi dir when generated CBZ --rm-origin-dir remove downloaded doujinshi dir when generated CBZ
file. or PDF file.
# nHentai options # nHentai options
--cookie=COOKIE set cookie of nhentai to bypass Google recaptcha --cookie=COOKIE set cookie of nhentai to bypass Google recaptcha
@@ -183,7 +168,7 @@ nHentai Mirror
If you want to use a mirror, you should set up a reverse proxy of `nhentai.net` and `i.nhentai.net`. If you want to use a mirror, you should set up a reverse proxy of `nhentai.net` and `i.nhentai.net`.
For example: For example:
.. code-block:: .. code-block::
i.h.loli.club -> i.nhentai.net i.h.loli.club -> i.nhentai.net
h.loli.club -> nhentai.net h.loli.club -> nhentai.net
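The README hunk above notes that the cookie passed to `--cookie` has the form `"csrftoken=TOKEN; sessionid=ID"`. As a minimal sketch of how such a string could be sanity-checked before saving it, the following uses the standard library's `http.cookies`. The helper name `missing_cookie_keys` and the `REQUIRED_KEYS` tuple are illustrative assumptions, not part of this changeset.

```python
from http.cookies import SimpleCookie

# The two keys named in the README note; treating them as "required"
# is an assumption made for this sketch.
REQUIRED_KEYS = ('csrftoken', 'sessionid')


def missing_cookie_keys(cookie_string):
    """Return the required keys that are absent from a cookie string."""
    jar = SimpleCookie()
    jar.load(cookie_string)  # parses "name=value; name=value" pairs
    return [key for key in REQUIRED_KEYS if key not in jar]


if __name__ == '__main__':
    print(missing_cookie_keys('csrftoken=TOKEN; sessionid=ID'))
    print(missing_cookie_keys('sessionid=ID'))
```

A check like this would only catch a malformed string; it cannot tell whether the token values are still valid on the server.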

@@ -1,3 +1,3 @@
__version__ = '0.3.8' __version__ = '0.4.1'
__author__ = 'RicterZ' __author__ = 'RicterZ'
__email__ = 'ricterzheng@gmail.com' __email__ = 'ricterzheng@gmail.com'

@@ -38,7 +38,7 @@ def banner():
def cmd_parser(): def cmd_parser():
parser = OptionParser('\n nhentai --search [keyword] --download' parser = OptionParser('\n nhentai --search [keyword] --download'
'\n NHENTAI=http://h.loli.club nhentai --id [ID ...]' '\n NHENTAI=http://h.loli.club nhentai --id [ID ...]'
'\n nhentai --file [filename]' '\n nhentai --file [filename]'
'\n\nEnvironment Variable:\n' '\n\nEnvironment Variable:\n'
' NHENTAI nhentai mirror url') ' NHENTAI nhentai mirror url')
# operation options # operation options
@@ -50,26 +50,17 @@ def cmd_parser():
parser.add_option('--id', type='string', dest='id', action='store', help='doujinshi ids set, e.g. 1,2,3') parser.add_option('--id', type='string', dest='id', action='store', help='doujinshi ids set, e.g. 1,2,3')
parser.add_option('--search', '-s', type='string', dest='keyword', action='store', parser.add_option('--search', '-s', type='string', dest='keyword', action='store',
help='search doujinshi by keyword') help='search doujinshi by keyword')
parser.add_option('--tag', type='string', dest='tag', action='store', help='download doujinshi by tag')
parser.add_option('--artist', type='string', dest='artist', action='store', help='download doujinshi by artist')
parser.add_option('--character', type='string', dest='character', action='store',
help='download doujinshi by character')
parser.add_option('--parody', type='string', dest='parody', action='store', help='download doujinshi by parody')
parser.add_option('--group', type='string', dest='group', action='store', help='download doujinshi by group')
parser.add_option('--language', type='string', dest='language', action='store',
help='download doujinshi by language')
parser.add_option('--favorites', '-F', action='store_true', dest='favorites', parser.add_option('--favorites', '-F', action='store_true', dest='favorites',
help='list or download your favorites.') help='list or download your favorites.')
# page options # page options
parser.add_option('--page', type='int', dest='page', action='store', default=1, parser.add_option('--page', type='int', dest='page', action='store', default=1,
help='page number of search results') help='page number of search results')
parser.add_option('--max-page', type='int', dest='max_page', action='store', default=1,
help='The max page when recursive download tagged doujinshi')
parser.add_option('--page-range', type='string', dest='page_range', action='store', parser.add_option('--page-range', type='string', dest='page_range', action='store',
help='page range of favorites. e.g. 1,2-5,14') help='page range of favorites. e.g. 1,2-5,14')
parser.add_option('--sorting', dest='sorting', action='store', default='date', parser.add_option('--sorting', dest='sorting', action='store', default='recent',
help='sorting of doujinshi (date / popular)', choices=['date', 'popular']) help='sorting of doujinshi (recent / popular / popular-[today|week])',
choices=['recent', 'popular', 'popular-today', 'popular-week'])
# download options # download options
parser.add_option('--output', '-o', type='string', dest='output_dir', action='store', default='', parser.add_option('--output', '-o', type='string', dest='output_dir', action='store', default='',
@@ -95,12 +86,16 @@ def cmd_parser():
help='generate a main viewer contain all the doujin in the folder') help='generate a main viewer contain all the doujin in the folder')
parser.add_option('--cbz', '-C', dest='is_cbz', action='store_true', parser.add_option('--cbz', '-C', dest='is_cbz', action='store_true',
help='generate Comic Book CBZ File') help='generate Comic Book CBZ File')
parser.add_option('--pdf', '-P', dest='is_pdf', action='store_true',
help='generate PDF file')
parser.add_option('--rm-origin-dir', dest='rm_origin_dir', action='store_true', default=False, parser.add_option('--rm-origin-dir', dest='rm_origin_dir', action='store_true', default=False,
help='remove downloaded doujinshi dir when generated CBZ file.') help='remove downloaded doujinshi dir when generated CBZ or PDF file.')
# nhentai options # nhentai options
parser.add_option('--cookie', type='str', dest='cookie', action='store', parser.add_option('--cookie', type='str', dest='cookie', action='store',
help='set cookie of nhentai to bypass Google recaptcha') help='set cookie of nhentai to bypass Google recaptcha')
parser.add_option('--language', type='str', dest='language', action='store',
help='set default language to parse doujinshis')
parser.add_option('--save-download-history', dest='is_save_download_history', action='store_true', parser.add_option('--save-download-history', dest='is_save_download_history', action='store_true',
default=False, help='save downloaded doujinshis, whose will be skipped if you re-download them') default=False, help='save downloaded doujinshis, whose will be skipped if you re-download them')
parser.add_option('--clean-download-history', action='store_true', default=False, dest='clean_download_history', parser.add_option('--clean-download-history', action='store_true', default=False, dest='clean_download_history',
@@ -120,9 +115,7 @@ def cmd_parser():
generate_html() generate_html()
exit(0) exit(0)
if args.main_viewer and not args.id and not args.keyword and \ if args.main_viewer and not args.id and not args.keyword and not args.favorites:
not args.tag and not args.artist and not args.character and \
not args.parody and not args.group and not args.language and not args.favorites:
generate_main_html() generate_main_html()
exit(0) exit(0)
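The option changes in the hunks above replace the `date` / `popular` sorting values with four choices and add a `--pdf` flag. The following is a reduced, standalone sketch of just those two options using `optparse`, with every other option omitted; `build_parser` is a hypothetical helper name used only here.

```python
from optparse import OptionParser


def build_parser():
    """Build a parser containing only the two options discussed above."""
    parser = OptionParser()
    # 'choice' makes optparse reject any value outside the list.
    parser.add_option('--sorting', dest='sorting', action='store', type='choice',
                      default='recent',
                      choices=['recent', 'popular', 'popular-today', 'popular-week'],
                      help='sorting of doujinshi (recent / popular / popular-[today|week])')
    parser.add_option('--pdf', '-P', dest='is_pdf', action='store_true', default=False,
                      help='generate PDF file')
    return parser


if __name__ == '__main__':
    options, _ = build_parser().parse_args(['--sorting', 'popular-week', '-P'])
    print(options.sorting, options.is_pdf)
```

One consequence of the new choice list is that the old value `date` is no longer accepted, so scripts that passed `--sorting date` would need to switch to `recent`.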
@@ -151,6 +144,25 @@ def cmd_parser():
logger.info('Cookie saved.') logger.info('Cookie saved.')
exit(0) exit(0)
if os.path.exists(constant.NHENTAI_LANGUAGE) and not args.language:
with open(constant.NHENTAI_LANGUAGE, 'r') as f:
constant.LANGUAGE = f.read()
args.language = f.read()
if args.language:
try:
if not os.path.exists(constant.NHENTAI_HOME):
os.mkdir(constant.NHENTAI_HOME)
with open(constant.NHENTAI_LANGUAGE, 'w') as f:
f.write(args.language)
except Exception as e:
logger.error('Cannot create NHENTAI_HOME: {}'.format(str(e)))
exit(1)
logger.info('Default language now is {}.'.format(args.language))
exit(0)
if os.path.exists(constant.NHENTAI_PROXY): if os.path.exists(constant.NHENTAI_PROXY):
with open(constant.NHENTAI_PROXY, 'r') as f: with open(constant.NHENTAI_PROXY, 'r') as f:
link = f.read() link = f.read()
@@ -189,15 +201,12 @@ def cmd_parser():
_ = [i.strip() for i in f.readlines()] _ = [i.strip() for i in f.readlines()]
args.id = set(int(i) for i in _ if i.isdigit()) args.id = set(int(i) for i in _ if i.isdigit())
if (args.is_download or args.is_show) and not args.id and not args.keyword and \ if (args.is_download or args.is_show) and not args.id and not args.keyword and not args.favorites:
not args.tag and not args.artist and not args.character and \
not args.parody and not args.group and not args.language and not args.favorites:
logger.critical('Doujinshi id(s) are required for downloading') logger.critical('Doujinshi id(s) are required for downloading')
parser.print_help() parser.print_help()
exit(1) exit(1)
if not args.keyword and not args.id and not args.tag and not args.artist and \ if not args.keyword and not args.id and not args.favorites:
not args.character and not args.parody and not args.group and not args.language and not args.favorites:
parser.print_help() parser.print_help()
exit(1) exit(1)

@@ -6,12 +6,12 @@ import platform
import time import time
from nhentai.cmdline import cmd_parser, banner from nhentai.cmdline import cmd_parser, banner
from nhentai.parser import doujinshi_parser, search_parser, print_doujinshi, favorites_parser, tag_parser from nhentai.parser import doujinshi_parser, search_parser, print_doujinshi, favorites_parser
from nhentai.doujinshi import Doujinshi from nhentai.doujinshi import Doujinshi
from nhentai.downloader import Downloader from nhentai.downloader import Downloader
from nhentai.logger import logger from nhentai.logger import logger
from nhentai.constant import BASE_URL from nhentai.constant import BASE_URL
from nhentai.utils import generate_html, generate_cbz, generate_main_html, check_cookie, signal_handler, DB from nhentai.utils import generate_html, generate_cbz, generate_main_html, generate_pdf, check_cookie, signal_handler, DB
def main(): def main():
@@ -21,7 +21,7 @@ def main():
from nhentai.constant import PROXY from nhentai.constant import PROXY
# constant.PROXY will be changed after cmd_parser() # constant.PROXY will be changed after cmd_parser()
if PROXY != {}: if PROXY:
logger.info('Using proxy: {0}'.format(PROXY)) logger.info('Using proxy: {0}'.format(PROXY))
# check your cookie # check your cookie
@@ -37,25 +37,11 @@ def main():
doujinshis = favorites_parser(options.page_range) doujinshis = favorites_parser(options.page_range)
elif options.tag:
doujinshis = tag_parser(options.tag, sorting=options.sorting, max_page=options.max_page)
elif options.artist:
doujinshis = tag_parser(options.artist, max_page=options.max_page, index=1)
elif options.character:
doujinshis = tag_parser(options.character, max_page=options.max_page, index=2)
elif options.parody:
doujinshis = tag_parser(options.parody, max_page=options.max_page, index=3)
elif options.group:
doujinshis = tag_parser(options.group, max_page=options.max_page, index=4)
elif options.language:
doujinshis = tag_parser(options.language, max_page=options.max_page, index=5)
elif options.keyword: elif options.keyword:
from nhentai.constant import LANGUAGE
if LANGUAGE:
logger.info('Using default language: {0}'.format(LANGUAGE))
options.keyword += ', language:{}'.format(LANGUAGE)
doujinshis = search_parser(options.keyword, sorting=options.sorting, page=options.page) doujinshis = search_parser(options.keyword, sorting=options.sorting, page=options.page)
elif not doujinshi_ids: elif not doujinshi_ids:
@@ -96,10 +82,12 @@ def main():
with DB() as db: with DB() as db:
db.add_one(doujinshi.id) db.add_one(doujinshi.id)
if not options.is_nohtml and not options.is_cbz: if not options.is_nohtml and not options.is_cbz and not options.is_pdf:
generate_html(options.output_dir, doujinshi) generate_html(options.output_dir, doujinshi)
elif options.is_cbz: elif options.is_cbz:
generate_cbz(options.output_dir, doujinshi, options.rm_origin_dir) generate_cbz(options.output_dir, doujinshi, options.rm_origin_dir)
elif options.is_pdf:
generate_pdf(options.output_dir, doujinshi, options.rm_origin_dir)
if options.main_viewer: if options.main_viewer:
generate_main_html(options.output_dir) generate_main_html(options.output_dir)
@@ -115,6 +103,5 @@ def main():
signal.signal(signal.SIGINT, signal_handler) signal.signal(signal.SIGINT, signal_handler)
if __name__ == '__main__': if __name__ == '__main__':
main() main()
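The post-download hunk in this file now has three output branches: HTML, CBZ and PDF. The branch order matters, because CBZ is checked before PDF. The sketch below mirrors that `if` / `elif` chain as a pure function so the precedence is easy to see; `choose_output` is a hypothetical name, and the real code calls the generator functions directly instead of returning a label.

```python
def choose_output(is_nohtml, is_cbz, is_pdf):
    """Return which generator the branch order above would select."""
    if not is_nohtml and not is_cbz and not is_pdf:
        return 'html'
    elif is_cbz:
        return 'cbz'   # wins even when is_pdf is also set
    elif is_pdf:
        return 'pdf'
    return None        # --no-html alone: nothing is generated


if __name__ == '__main__':
    print(choose_output(False, False, False))
    print(choose_output(False, True, True))
    print(choose_output(True, False, True))
```

Under this ordering, passing both `--cbz` and `--pdf` would appear to produce only the CBZ file.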

@@ -12,17 +12,10 @@ except ImportError:
BASE_URL = os.getenv('NHENTAI', 'https://nhentai.net') BASE_URL = os.getenv('NHENTAI', 'https://nhentai.net')
__api_suspended_DETAIL_URL = '%s/api/gallery' % BASE_URL __api_suspended_DETAIL_URL = '%s/api/gallery' % BASE_URL
__api_suspended_SEARCH_URL = '%s/api/galleries/search' % BASE_URL
DETAIL_URL = '%s/g' % BASE_URL DETAIL_URL = '%s/g' % BASE_URL
SEARCH_URL = '%s/search/' % BASE_URL SEARCH_URL = '%s/api/galleries/search' % BASE_URL
TAG_URL = ['%s/tag' % BASE_URL,
'%s/artist' % BASE_URL,
'%s/character' % BASE_URL,
'%s/parody' % BASE_URL,
'%s/group' % BASE_URL,
'%s/language' % BASE_URL]
TAG_API_URL = '%s/api/galleries/tagged' % BASE_URL TAG_API_URL = '%s/api/galleries/tagged' % BASE_URL
LOGIN_URL = '%s/login/' % BASE_URL LOGIN_URL = '%s/login/' % BASE_URL
@@ -35,8 +28,10 @@ IMAGE_URL = '%s://i.%s/galleries' % (u.scheme, u.hostname)
NHENTAI_HOME = os.path.join(os.getenv('HOME', tempfile.gettempdir()), '.nhentai') NHENTAI_HOME = os.path.join(os.getenv('HOME', tempfile.gettempdir()), '.nhentai')
NHENTAI_PROXY = os.path.join(NHENTAI_HOME, 'proxy') NHENTAI_PROXY = os.path.join(NHENTAI_HOME, 'proxy')
NHENTAI_COOKIE = os.path.join(NHENTAI_HOME, 'cookie') NHENTAI_COOKIE = os.path.join(NHENTAI_HOME, 'cookie')
NHENTAI_LANGUAGE = os.path.join(NHENTAI_HOME, 'language')
NHENTAI_HISTORY = os.path.join(NHENTAI_HOME, 'history.sqlite3') NHENTAI_HISTORY = os.path.join(NHENTAI_HOME, 'history.sqlite3')
PROXY = {} PROXY = {}
COOKIE = '' COOKIE = ''
LANGUAGE = ''
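The constants in this file are all derived from a single base URL, which can be overridden with the `NHENTAI` environment variable. As a rough illustration of that derivation, the sketch below rebuilds three of the URLs from a base using the same format strings that appear in the hunks. `derive_urls` and the `mirror.example` host are hypothetical and exist only for this example.

```python
import os
from urllib.parse import urlparse


def derive_urls(base_url):
    """Derive detail, search and image URLs from one base URL."""
    u = urlparse(base_url)
    return {
        'detail': '%s/g' % base_url,
        'search': '%s/api/galleries/search' % base_url,
        # Images are served from the "i." subdomain of the same host.
        'image': '%s://i.%s/galleries' % (u.scheme, u.hostname),
    }


if __name__ == '__main__':
    base = os.getenv('NHENTAI', 'https://mirror.example')
    for name, url in derive_urls(base).items():
        print(name, url)
```

This is why a mirror needs two reverse-proxied hosts: one for the base host and one for its `i.` subdomain.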

@@ -5,11 +5,10 @@ import multiprocessing
import signal import signal
from future.builtins import str as text from future.builtins import str as text
import sys
import os import os
import requests import requests
import threadpool
import time import time
import multiprocessing as mp
try: try:
from urllib.parse import urlparse from urllib.parse import urlparse
@@ -18,10 +17,10 @@ except ImportError:
from nhentai.logger import logger from nhentai.logger import logger
from nhentai.parser import request from nhentai.parser import request
from nhentai.utils import Singleton, signal_handler from nhentai.utils import Singleton
requests.packages.urllib3.disable_warnings() requests.packages.urllib3.disable_warnings()
semaphore = mp.Semaphore() semaphore = multiprocessing.Semaphore(1)
class NHentaiImageNotExistException(Exception): class NHentaiImageNotExistException(Exception):
@@ -133,16 +132,14 @@ class Downloader(Singleton):
queue = [(self, url, folder) for url in queue] queue = [(self, url, folder) for url in queue]
pool = multiprocessing.Pool(self.size, init_worker) pool = multiprocessing.Pool(self.size, init_worker)
[pool.apply_async(download_wrapper, args=item) for item in queue]
for item in queue:
pool.apply_async(download_wrapper, args=item, callback=self._download_callback)
pool.close() pool.close()
pool.join() pool.join()
def download_wrapper(obj, url, folder=''): def download_wrapper(obj, url, folder=''):
if semaphore.get_value(): if sys.platform == 'darwin' or semaphore.get_value():
return Downloader.download_(obj, url=url, folder=folder) return Downloader.download_(obj, url=url, folder=folder)
else: else:
return -3, None return -3, None
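The pool hunk above submits each download with `apply_async`, then calls `close()` and `join()` to wait for all of them. The sketch below shows that submit / close / join pattern with a result callback. To stay runnable in any context it swaps in `multiprocessing.pool.ThreadPool`, which exposes the same `apply_async` interface as the process-based `multiprocessing.Pool` used in the diff. `fake_download` and `run_downloads` are stand-ins invented for this example.

```python
from multiprocessing.pool import ThreadPool


def fake_download(url, folder=''):
    """Stand-in for a real download; returns a (status, url) tuple."""
    return 1, url


def run_downloads(urls, size=4):
    """Submit every URL to a pool and collect results through a callback."""
    finished = []

    def on_done(result):
        # Invoked by the pool once a task has completed successfully.
        finished.append(result)

    pool = ThreadPool(size)
    for url in urls:
        pool.apply_async(fake_download, args=(url, '/tmp'), callback=on_done)
    pool.close()  # no more tasks will be submitted
    pool.join()   # wait for all submitted tasks to finish
    return finished


if __name__ == '__main__':
    print(sorted(run_downloads(['a.jpg', 'b.jpg', 'c.jpg'])))
```

Completion order is not guaranteed, which is why the usage line sorts the results before printing.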

@@ -120,15 +120,15 @@ def page_range_parser(page_range, max_page_num):
else: else:
try: try:
left = int(range_str[:idx]) left = int(range_str[:idx])
right = int(range_str[idx+1:]) right = int(range_str[idx + 1:])
if right > max_page_num: if right > max_page_num:
right = max_page_num right = max_page_num
for page in range(left, right+1): for page in range(left, right + 1):
pages.add(page) pages.add(page)
except ValueError: except ValueError:
logger.error('page range({0}) is not valid'.format(page_range)) logger.error('page range({0}) is not valid'.format(page_range))
return list(pages) return list(pages)
def doujinshi_parser(id_): def doujinshi_parser(id_):
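Only the tail of `page_range_parser` is visible in the hunk above: it parses `left-right` ranges, clamps the right edge to `max_page_num`, and collects pages in a set. The sketch below is a standalone reimplementation of that idea for strings such as `1,2-5,14`. How the original treats single page numbers is not shown in the diff, so accepting them unchanged is an assumption here, as is the name `parse_page_range`.

```python
def parse_page_range(page_range, max_page_num):
    """Expand a string like '1,2-5,14' into a sorted list of page numbers."""
    pages = set()
    for part in page_range.split(','):
        part = part.strip()
        if not part:
            continue
        try:
            if '-' in part:
                left_s, right_s = part.split('-', 1)
                left = int(left_s)
                # Clamp the right edge, as the visible hunk does.
                right = min(int(right_s), max_page_num)
                pages.update(range(left, right + 1))
            else:
                pages.add(int(part))  # assumption: single pages are kept as-is
        except ValueError:
            print('page range({0}) is not valid'.format(page_range))
    return sorted(pages)


if __name__ == '__main__':
    print(parse_page_range('1,2-5,14', 20))
    print(parse_page_range('3-9', 5))
```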
@@ -143,7 +143,7 @@ def doujinshi_parser(id_):
try: try:
response = request('get', url) response = request('get', url)
if response.status_code in (200, ): if response.status_code in (200,):
response = response.content response = response.content
else: else:
logger.debug('Slow down and retry ({}) ...'.format(id_)) logger.debug('Slow down and retry ({}) ...'.format(id_))
@@ -178,12 +178,9 @@ def doujinshi_parser(id_):
doujinshi['img_id'] = img_id.group(1) doujinshi['img_id'] = img_id.group(1)
doujinshi['ext'] = ext doujinshi['ext'] = ext
pages = 0 for _ in doujinshi_info.find_all('div', class_='tag-container field-name'):
for _ in doujinshi_info.find_all('div', class_=''): if re.search('Pages:', _.text):
pages = re.search('([\d]+) pages', _.text) pages = _.find('span', class_='name').string
if pages:
pages = pages.group(1)
break
doujinshi['pages'] = int(pages) doujinshi['pages'] = int(pages)
# gain information of the doujinshi # gain information of the doujinshi
@@ -192,7 +189,7 @@ def doujinshi_parser(id_):
for field in information_fields: for field in information_fields:
field_name = field.contents[0].strip().strip(':') field_name = field.contents[0].strip().strip(':')
if field_name in needed_fields: if field_name in needed_fields:
data = [sub_field.contents[0].strip() for sub_field in data = [sub_field.find('span', attrs={'class': 'name'}).contents[0].strip() for sub_field in
field.find_all('a', attrs={'class': 'tag'})] field.find_all('a', attrs={'class': 'tag'})]
doujinshi[field_name.lower()] = ', '.join(data) doujinshi[field_name.lower()] = ', '.join(data)
@@ -202,7 +199,7 @@ def doujinshi_parser(id_):
return doujinshi return doujinshi
def search_parser(keyword, sorting='date', page=1): def old_search_parser(keyword, sorting='date', page=1):
logger.debug('Searching doujinshis of keyword {0}'.format(keyword)) logger.debug('Searching doujinshis of keyword {0}'.format(keyword))
response = request('get', url=constant.SEARCH_URL, params={'q': keyword, 'page': page, 'sort': sorting}).content response = request('get', url=constant.SEARCH_URL, params={'q': keyword, 'page': page, 'sort': sorting}).content
@@ -222,50 +219,15 @@ def print_doujinshi(doujinshi_list):
tabulate(tabular_data=doujinshi_list, headers=headers, tablefmt='rst')) tabulate(tabular_data=doujinshi_list, headers=headers, tablefmt='rst'))
def tag_parser(tag_name, sorting='date', max_page=1, index=0): def search_parser(keyword, sorting, page):
result = []
tag_name = tag_name.lower()
if ',' in tag_name:
tag_name = [i.strip().replace(' ', '-') for i in tag_name.split(',')]
else:
tag_name = tag_name.strip().replace(' ', '-')
if sorting == 'date':
sorting = ''
for p in range(1, max_page + 1):
if sys.version_info >= (3, 0, 0):
unicode_ = str
else:
unicode_ = unicode
if isinstance(tag_name, (str, unicode_)):
logger.debug('Fetching page {0} for doujinshi with tag \'{1}\''.format(p, tag_name))
response = request('get', url='%s/%s/%s?page=%d' % (constant.TAG_URL[index], tag_name, sorting, p)).content
result += _get_title_and_id(response)
else:
for i in tag_name:
logger.debug('Fetching page {0} for doujinshi with tag \'{1}\''.format(p, i))
response = request('get',
url='%s/%s/%s?page=%d' % (constant.TAG_URL[index], i, sorting, p)).content
result += _get_title_and_id(response)
if not result:
logger.error('Cannot find doujinshi id of tag \'{0}\''.format(tag_name))
return
if not result:
logger.warn('No results for tag \'{}\''.format(tag_name))
return result
def __api_suspended_search_parser(keyword, sorting, page):
logger.debug('Searching doujinshis using keywords {0}'.format(keyword)) logger.debug('Searching doujinshis using keywords {0}'.format(keyword))
keyword = '+'.join([i.strip().replace(' ', '-').lower() for i in keyword.split(',')])
result = [] result = []
i = 0 i = 0
while i < 5: while i < 5:
try: try:
response = request('get', url=constant.SEARCH_URL, params={'query': keyword, 'page': page, 'sort': sorting}).json() url = request('get', url=constant.SEARCH_URL, params={'query': keyword, 'page': page, 'sort': sorting}).url
response = request('get', url.replace('%2B', '+')).json()
except Exception as e: except Exception as e:
i += 1 i += 1
if not i < 5: if not i < 5:
@@ -289,29 +251,6 @@ def __api_suspended_search_parser(keyword, sorting, page):
return result return result
def __api_suspended_tag_parser(tag_id, sorting, max_page=1):
logger.info('Searching for doujinshi with tag id {0}'.format(tag_id))
result = []
response = request('get', url=constant.TAG_API_URL, params={'sort': sorting, 'tag_id': tag_id}).json()
page = max_page if max_page <= response['num_pages'] else int(response['num_pages'])
for i in range(1, page + 1):
logger.info('Getting page {} ...'.format(i))
if page != 1:
response = request('get', url=constant.TAG_API_URL,
params={'sort': sorting, 'tag_id': tag_id}).json()
for row in response['result']:
title = row['title']['english']
title = title[:85] + '..' if len(title) > 85 else title
result.append({'id': row['id'], 'title': title})
if not result:
logger.warn('No results for tag id {}'.format(tag_id))
return result
def __api_suspended_doujinshi_parser(id_): def __api_suspended_doujinshi_parser(id_):
if not isinstance(id_, (int,)) and (isinstance(id_, (str,)) and not id_.isdigit()): if not isinstance(id_, (int,)) and (isinstance(id_, (str,)) and not id_.isdigit()):
raise Exception('Doujinshi id({0}) is not valid'.format(id_)) raise Exception('Doujinshi id({0}) is not valid'.format(id_))
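The new `search_parser` in this file joins comma-separated keywords with `+`, replaces spaces with `-`, lowercases them, and then undoes the `%2B` escaping that URL encoding applies to `+`. The sketch below reproduces those two steps with `urllib.parse.urlencode` instead of an HTTP request, so it can run offline. `normalize_keyword`, `build_search_url` and the `example.invalid` host are illustrative assumptions.

```python
from urllib.parse import urlencode


def normalize_keyword(keyword):
    """Turn 'Full Color, Umaru' into 'full-color+umaru'."""
    return '+'.join(part.strip().replace(' ', '-').lower()
                    for part in keyword.split(','))


def build_search_url(base, keyword, page=1, sorting='recent'):
    """Build a query string, keeping '+' literal as the hunk above does."""
    query = urlencode({'query': normalize_keyword(keyword),
                       'page': page, 'sort': sorting})
    # urlencode escapes '+' as '%2B'; restore it afterwards.
    return '%s?%s' % (base, query.replace('%2B', '+'))


if __name__ == '__main__':
    print(build_search_url('https://example.invalid/api/galleries/search',
                           'Full Color, Umaru'))
```

The diff achieves the same effect with two requests: the first only lets the HTTP library build the encoded URL, and the second fetches the URL after the `%2B` replacement.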

@@ -1,9 +1,10 @@
# coding: utf-8 # coding: utf-8
import json import json
import os import os
from xml.sax.saxutils import escape
def serialize(doujinshi, dir): def serialize_json(doujinshi, dir):
metadata = {'title': doujinshi.name, metadata = {'title': doujinshi.name,
'subtitle': doujinshi.info.subtitle} 'subtitle': doujinshi.info.subtitle}
if doujinshi.info.date: if doujinshi.info.date:
@@ -28,6 +29,51 @@ def serialize(doujinshi, dir):
json.dump(metadata, f, separators=','':') json.dump(metadata, f, separators=','':')
def serialize_comicxml(doujinshi, dir):
from iso8601 import parse_date
with open(os.path.join(dir, 'ComicInfo.xml'), 'w') as f:
f.write('<?xml version="1.0" encoding="utf-8"?>\n')
f.write('<ComicInfo xmlns:xsd="http://www.w3.org/2001/XMLSchema" '
'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">\n')
xml_write_simple_tag(f, 'Manga', 'Yes')
xml_write_simple_tag(f, 'Title', doujinshi.name)
xml_write_simple_tag(f, 'Summary', doujinshi.info.subtitle)
xml_write_simple_tag(f, 'PageCount', doujinshi.pages)
xml_write_simple_tag(f, 'URL', doujinshi.url)
xml_write_simple_tag(f, 'NhentaiId', doujinshi.id)
xml_write_simple_tag(f, 'Genre', doujinshi.info.categories)
xml_write_simple_tag(f, 'BlackAndWhite', 'No' if doujinshi.info.tags and 'full color' in doujinshi.info.tags else 'Yes')
if doujinshi.info.date:
dt = parse_date(doujinshi.info.date)
xml_write_simple_tag(f, 'Year', dt.year)
xml_write_simple_tag(f, 'Month', dt.month)
xml_write_simple_tag(f, 'Day', dt.day)
if doujinshi.info.parodies:
xml_write_simple_tag(f, 'Series', doujinshi.info.parodies)
if doujinshi.info.characters:
xml_write_simple_tag(f, 'Characters', doujinshi.info.characters)
if doujinshi.info.tags:
xml_write_simple_tag(f, 'Tags', doujinshi.info.tags)
if doujinshi.info.artists:
xml_write_simple_tag(f, 'Writer', ' & '.join([i.strip() for i in doujinshi.info.artists.split(',')]))
# if doujinshi.info.groups:
# metadata['group'] = [i.strip() for i in doujinshi.info.groups.split(',')]
if doujinshi.info.languages:
languages = [i.strip() for i in doujinshi.info.languages.split(',')]
xml_write_simple_tag(f, 'Translated', 'Yes' if 'translated' in languages else 'No')
[xml_write_simple_tag(f, 'Language', i) for i in languages if i != 'translated']
f.write('</ComicInfo>')
def xml_write_simple_tag(f, name, val, indent=1):
f.write('{}<{}>{}</{}>\n'.format(' ' * indent, name, escape(str(val)), name))
def merge_json(): def merge_json():
lst = [] lst = []
output_dir = "./" output_dir = "./"
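The `serialize_comicxml` function added in this file writes each metadata field through a small helper that XML-escapes the value. The sketch below exercises a helper of the same shape against an in-memory buffer, to show what the escaping does to characters such as `&` and `<`. The name `write_simple_tag` and the sample values are assumptions made for this example.

```python
import io
from xml.sax.saxutils import escape


def write_simple_tag(f, name, val, indent=1):
    """Write one '<name>value</name>' line with the value XML-escaped."""
    f.write('{}<{}>{}</{}>\n'.format(' ' * indent, name, escape(str(val)), name))


if __name__ == '__main__':
    buf = io.StringIO()
    write_simple_tag(buf, 'Title', 'Tom & Jerry <1>')
    write_simple_tag(buf, 'PageCount', 24)  # non-string values go through str()
    print(buf.getvalue(), end='')
```

Escaping only the value is sufficient here because the tag names are fixed strings chosen by the caller.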

@@ -12,7 +12,7 @@ import sqlite3
from nhentai import constant from nhentai import constant
from nhentai.logger import logger from nhentai.logger import logger
from nhentai.serializer import serialize, set_js_database from nhentai.serializer import serialize_json, serialize_comicxml, set_js_database
def request(method, url, **kwargs): def request(method, url, **kwargs):
@@ -86,7 +86,7 @@ def generate_html(output_dir='.', doujinshi_obj=None):
js = readfile('viewer/scripts.js') js = readfile('viewer/scripts.js')
if doujinshi_obj is not None: if doujinshi_obj is not None:
serialize(doujinshi_obj, doujinshi_dir) serialize_json(doujinshi_obj, doujinshi_dir)
name = doujinshi_obj.name name = doujinshi_obj.name
if sys.version_info < (3, 0): if sys.version_info < (3, 0):
name = doujinshi_obj.name.encode('utf-8') name = doujinshi_obj.name.encode('utf-8')
@@ -102,9 +102,9 @@ def generate_html(output_dir='.', doujinshi_obj=None):
with open(os.path.join(doujinshi_dir, 'index.html'), 'wb') as f: with open(os.path.join(doujinshi_dir, 'index.html'), 'wb') as f:
f.write(data.encode('utf-8')) f.write(data.encode('utf-8'))
logger.log(15, 'HTML Viewer has been write to \'{0}\''.format(os.path.join(doujinshi_dir, 'index.html'))) logger.log(15, 'HTML Viewer has been written to \'{0}\''.format(os.path.join(doujinshi_dir, 'index.html')))
except Exception as e: except Exception as e:
logger.warning('Writen HTML Viewer failed ({})'.format(str(e))) logger.warning('Writing HTML Viewer failed ({})'.format(str(e)))
def generate_main_html(output_dir='./'): def generate_main_html(output_dir='./'):
@@ -150,7 +150,7 @@ def generate_main_html(output_dir='./'):
image_html += element.format(FOLDER=folder, IMAGE=image, TITLE=title) image_html += element.format(FOLDER=folder, IMAGE=image, TITLE=title)
if image_html == '': if image_html == '':
logger.warning('None index.html found, --gen-main paused.') logger.warning('No index.html found, --gen-main paused.')
return return
try: try:
data = main.format(STYLES=css, SCRIPTS=js, PICTURE=image_html) data = main.format(STYLES=css, SCRIPTS=js, PICTURE=image_html)
@@ -163,14 +163,16 @@ def generate_main_html(output_dir='./'):
shutil.copy(os.path.dirname(__file__)+'/viewer/logo.png', './') shutil.copy(os.path.dirname(__file__)+'/viewer/logo.png', './')
set_js_database() set_js_database()
logger.log( logger.log(
15, 'Main Viewer has been write to \'{0}main.html\''.format(output_dir)) 15, 'Main Viewer has been written to \'{0}main.html\''.format(output_dir))
except Exception as e: except Exception as e:
logger.warning('Writen Main Viewer failed ({})'.format(str(e))) logger.warning('Writing Main Viewer failed ({})'.format(str(e)))
def generate_cbz(output_dir='.', doujinshi_obj=None, rm_origin_dir=False): def generate_cbz(output_dir='.', doujinshi_obj=None, rm_origin_dir=False, write_comic_info=False):
if doujinshi_obj is not None: if doujinshi_obj is not None:
doujinshi_dir = os.path.join(output_dir, doujinshi_obj.filename) doujinshi_dir = os.path.join(output_dir, doujinshi_obj.filename)
if write_comic_info:
serialize_comicxml(doujinshi_obj, doujinshi_dir)
cbz_filename = os.path.join(os.path.join(doujinshi_dir, '..'), '{}.cbz'.format(doujinshi_obj.filename)) cbz_filename = os.path.join(os.path.join(doujinshi_dir, '..'), '{}.cbz'.format(doujinshi_obj.filename))
else: else:
cbz_filename = './doujinshi.cbz' cbz_filename = './doujinshi.cbz'
@@ -188,7 +190,40 @@ def generate_cbz(output_dir='.', doujinshi_obj=None, rm_origin_dir=False):
if rm_origin_dir: if rm_origin_dir:
shutil.rmtree(doujinshi_dir, ignore_errors=True) shutil.rmtree(doujinshi_dir, ignore_errors=True)
logger.log(15, 'Comic Book CBZ file has been write to \'{0}\''.format(doujinshi_dir)) logger.log(15, 'Comic Book CBZ file has been written to \'{0}\''.format(doujinshi_dir))
def generate_pdf(output_dir='.', doujinshi_obj=None, rm_origin_dir=False):
"""Write images to a PDF file using img2pdf."""
try:
import img2pdf
except ImportError:
logger.error("Please install img2pdf package by using pip.")
# without img2pdf nothing below can run, so stop here
return
if doujinshi_obj is not None:
doujinshi_dir = os.path.join(output_dir, doujinshi_obj.filename)
pdf_filename = os.path.join(
os.path.join(doujinshi_dir, '..'),
'{}.pdf'.format(doujinshi_obj.filename)
)
else:
pdf_filename = './doujinshi.pdf'
doujinshi_dir = '.'
file_list = os.listdir(doujinshi_dir)
file_list.sort()
logger.info('Writing PDF file to path: {}'.format(pdf_filename))
with open(pdf_filename, 'wb') as pdf_f:
full_path_list = (
[os.path.join(doujinshi_dir, image) for image in file_list]
)
pdf_f.write(img2pdf.convert(full_path_list))
if rm_origin_dir:
shutil.rmtree(doujinshi_dir, ignore_errors=True)
logger.log(15, 'PDF file has been written to \'{0}\''.format(doujinshi_dir))
def format_filename(s): def format_filename(s):
@@ -202,6 +237,7 @@ and append a file extension like '.txt', so I avoid the potential of using
an invalid filename. an invalid filename.
""" """
# maybe you can use `--format` to select a suitable filename
valid_chars = "-_.()[] %s%s" % (string.ascii_letters, string.digits) valid_chars = "-_.()[] %s%s" % (string.ascii_letters, string.digits)
filename = ''.join(c for c in s if c in valid_chars) filename = ''.join(c for c in s if c in valid_chars)
if len(filename) > 100: if len(filename) > 100:
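The whitelist filter in this hunk can be sketched as a standalone function. The hunk cuts off at the length check, so the truncation step below is an assumption (a plain cut at `max_len`), and `sanitize_filename`, `VALID_CHARS`, and `max_len` are hypothetical names, not part of the project.

```python
import string

# Same whitelist as in the hunk: a few punctuation marks, space, ASCII letters and digits.
VALID_CHARS = "-_.()[] %s%s" % (string.ascii_letters, string.digits)


def sanitize_filename(s, max_len=100):
    # Keep only whitelisted characters; everything else is silently dropped.
    filename = ''.join(c for c in s if c in VALID_CHARS)
    # Assumption: over-long names are simply cut at max_len; the original
    # handling of this case is not shown in the diff.
    return filename[:max_len]


cleaned = sanitize_filename('a/b:c?.txt')
non_ascii = sanitize_filename('日本語')
```

Because only ASCII letters and digits are whitelisted, a title made up entirely of non-ASCII characters is reduced to an empty string, which is presumably why the added comment points at `--format` as a way to pick a more suitable filename.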

@@ -148,7 +148,7 @@ blockquote {
-webkit-user-select: none; /* Safari */ -webkit-user-select: none; /* Safari */
-khtml-user-select: none; /* Konqueror HTML */ -khtml-user-select: none; /* Konqueror HTML */
-moz-user-select: none; /* Old versions of Firefox */ -moz-user-select: none; /* Old versions of Firefox */
ms-user-select: none; /* Internet Explorer/Edge */ -ms-user-select: none; /* Internet Explorer/Edge */
user-select: none; user-select: none;
} }
@@ -157,7 +157,7 @@ blockquote {
padding: 5px 0px 5px 15px; padding: 5px 0px 5px 15px;
text-decoration: none; text-decoration: none;
font-size: 15px; font-size: 15px;
color: #0d0d0d9; color: #0d0d0d;
display: block; display: block;
text-align: left; text-align: left;
} }
@@ -329,4 +329,4 @@ html.theme-black .gallery:hover .caption {
html.theme-black .caption { html.theme-black .caption {
background-color: #404040; background-color: #404040;
color: #d9d9d9 color: #d9d9d9
} }

@@ -1,5 +1,7 @@
requests>=2.5.0 requests>=2.5.0
soupsieve<2.0
BeautifulSoup4>=4.0.0 BeautifulSoup4>=4.0.0
threadpool>=1.2.7 threadpool>=1.2.7
tabulate>=0.7.5 tabulate>=0.7.5
future>=0.15.2 future>=0.15.2
iso8601 >= 0.1

@@ -23,7 +23,7 @@ setup(
author=__author__, author=__author__,
author_email=__email__, author_email=__email__,
keywords='nhentai, doujinshi', keywords=['nhentai', 'doujinshi', 'downloader'],
description='nhentai.net doujinshis downloader', description='nhentai.net doujinshis downloader',
long_description=long_description(), long_description=long_description(),
url='https://github.com/RicterZ/nhentai', url='https://github.com/RicterZ/nhentai',