Compare commits

..

47 Commits

Author SHA1 Message Date
263dba51f3 modify tests #54 2019-05-18 19:40:09 +08:00
049ab4d9ad using cookie rather than login #54 2019-05-18 19:34:54 +08:00
b173a6c28f slow down #50 2019-05-04 12:12:57 +08:00
b64b718c88 remove eval 2019-05-04 11:31:41 +08:00
8317662664 fix #50 2019-05-04 11:29:01 +08:00
13e60a69e9 Merge pull request #51 from symant233/master
Add viewer arrow support, add README license badge.
2019-05-04 11:11:34 +08:00
b5acbc76fd Update README license badage 2019-05-04 11:07:15 +08:00
1eb1b5c04c Add viewer arrow support & Readme license badage 2019-05-04 11:04:43 +08:00
2acb6a1249 Update README.md 2019-04-25 03:36:31 +08:00
0660cb0fed update user-agent 2019-04-11 22:48:18 +08:00
680b004c24 update README 2019-04-11 22:47:49 +08:00
6709af2a20 0.3.1 - add login session 2019-04-11 22:44:26 +08:00
a3fead2852 pep-8 2019-04-11 22:43:42 +08:00
0728dd8c6d use text rather than content 2019-04-11 22:41:37 +08:00
9160b38c3f bypass the challenge 2019-04-11 22:39:20 +08:00
f74be0c665 add new tests 2019-04-11 22:10:16 +08:00
c30f562a83 Merge pull request #48 from onlymyflower/master
download ids from file
2019-04-11 22:09:30 +08:00
37547cc97f global login session #49 #46 2019-04-11 22:08:19 +08:00
f6fb90aab5 download ids from file 2019-03-06 16:46:47 +08:00
50be89db44 fix extension issue #44 2019-01-27 10:06:12 +08:00
fc0be35b2c 0.3.0 #40 2019-01-15 21:16:14 +08:00
5c3dace937 tag page download #40 2019-01-15 21:12:20 +08:00
b2d622f11a fix tag download issue #40 2019-01-15 21:09:24 +08:00
0c8264bcc6 fix download issues 2019-01-15 20:43:00 +08:00
a6074242fb nhentai suspended api #40 2019-01-15 20:29:10 +08:00
eb6df28fba 0.2.19 2018-12-30 14:13:27 +08:00
1091ea3e0a remove debug 2018-12-30 14:12:38 +08:00
0df51c83e5 change output filename 2018-12-30 14:06:15 +08:00
c5fa98ebd1 Update .travis.yml 2018-11-04 21:44:59 +08:00
3154a94c3d 0.2.18 2018-10-24 22:21:29 +08:00
c47018251f fix #27 2018-10-24 22:20:33 +08:00
74d0499092 add test 2018-10-24 22:07:43 +08:00
7e56d9b901 fix #33 2018-10-24 22:06:49 +08:00
8cbb334d36 fix #31 2018-10-24 21:56:21 +08:00
db6d45efe0 fix bug #34 2018-10-19 10:55:21 +08:00
d412794bce Merge pull request #32 from violetdarkness/patch-1
requirement.txt missing new line
2018-10-08 23:36:38 +08:00
8eedbf077b requirement.txt missing new line
I got error when installing and find this requirement.txt missing newline
2018-10-08 21:13:52 +07:00
c95ecdded4 remove gdb 2018-10-01 15:04:32 +08:00
489e8bf0f4 fix #29 0.2.16 2018-10-01 15:02:04 +08:00
86c31f9b5e Merge pull request #28 from tbinavsl/master
Max retries + misc. language fixes
2018-09-28 13:28:44 +08:00
6f20405f47 adding gif support and fixing yet another english typo 2018-09-09 23:38:30 +02:00
c0143548d1 reverted partially by mistake the max_page commit; also added retries on other features 2018-09-09 22:24:34 +02:00
114c364f03 oops 2018-09-09 21:42:03 +02:00
af26482b6d Max retries + misc. language fixes 2018-09-09 21:33:50 +02:00
b8ea917db2 max page #26 2018-08-24 23:55:34 +08:00
963f4d9ddf fix 2018-08-12 23:22:30 +08:00
ef36e012ce fix unicode error on windows / python2 2018-08-12 23:11:01 +08:00
15 changed files with 410 additions and 145 deletions

1
.gitignore vendored
View File

@ -5,3 +5,4 @@ dist/
*.egg-info *.egg-info
.python-version .python-version
.DS_Store .DS_Store
output/

View File

@ -4,14 +4,18 @@ os:
language: python language: python
python: python:
- 2.7 - 2.7
- 2.6
- 3.6 - 3.6
- 3.5
- 3.4
install: install:
- python setup.py install - python setup.py install
script: script:
- echo 268642 > /tmp/test.txt
- NHENTAI=https://nhentai.net nhentai --cookie '__cfduid=da09f237ceb0f51c75980b0b3fda3ce571558179357; _ga=GA1.2.2000087053.1558179358; _gid=GA1.2.717818542.1558179358; csrftoken=iSxrTFOjrujJqauhAqWvTTI9dl3sfWnxdEFoMuqgmlBrbMin5Gj9wJW4r61cmH1X; sessionid=ewuaayfewbzpiukrarx9d52oxwlz2esd'
- NHENTAI=https://nhentai.net nhentai --search umaru - NHENTAI=https://nhentai.net nhentai --search umaru
- NHENTAI=https://nhentai.net nhentai --id=152503,146134 -t 10 --output=/tmp/ - NHENTAI=https://nhentai.net nhentai --id=152503,146134 -t 10 --output=/tmp/ --cbz
- NHENTAI=https://nhentai.net nhentai -l nhentai_test:nhentai --output=/tmp/
- NHENTAI=https://nhentai.net nhentai --tag lolicon - NHENTAI=https://nhentai.net nhentai --tag lolicon
- NHENTAI=https://nhentai.net nhentai -F
- NHENTAI=https://nhentai.net nhentai --file /tmp/test.txt

View File

@ -7,7 +7,7 @@ nhentai
|_| |_|_| |_|\___|_| |_|\__\__,_|_| |_| |_|_| |_|\___|_| |_|\__\__,_|_|
あなたも変態。 いいね? あなたも変態。 いいね?
[![Build Status](https://travis-ci.org/RicterZ/nhentai.svg?branch=master)](https://travis-ci.org/RicterZ/nhentai) ![nhentai PyPI Downloads](https://pypistats.com/badge/nhentai.svg) [![Build Status](https://travis-ci.org/RicterZ/nhentai.svg?branch=master)](https://travis-ci.org/RicterZ/nhentai) ![nhentai PyPI Downloads](https://img.shields.io/pypi/dm/nhentai.svg) [![license](https://img.shields.io/cocoapods/l/AFNetworking.svg)](https://github.com/RicterZ/nhentai/blob/master/LICENSE)
nHentai is a CLI tool for downloading doujinshi from [nhentai.net](http://nhentai.net). nHentai is a CLI tool for downloading doujinshi from [nhentai.net](http://nhentai.net).
@ -18,17 +18,26 @@ nHentai is a CLI tool for downloading doujinshi from [nhentai.net](http://nhenta
cd nhentai cd nhentai
python setup.py install python setup.py install
### Gentoo ### Installation (Gentoo)
layman -fa glicOne layman -fa glicOne
sudo emerge net-misc/nhentai sudo emerge net-misc/nhentai
### Usage ### Usage
**IMPORTANT**: To bypass the nhentai frequency limit, you should use `--login` option to log into nhentai.net.
*The default download folder will be the path where you run the command (CLI path).*
Download specified doujinshi: Download specified doujinshi:
```bash ```bash
nhentai --id=123855,123866 nhentai --id=123855,123866
``` ```
Download doujinshi with ids specified in a file:
```bash
nhentai --file=doujinshi.txt
```
Search a keyword and download the first page: Search a keyword and download the first page:
```bash ```bash
nhentai --search="tomori" --page=1 --download nhentai --search="tomori" --page=1 --download
@ -71,8 +80,5 @@ NHENTAI=http://h.loli.club nhentai --id 123456
![](./images/download.png) ![](./images/download.png)
![](./images/viewer.png) ![](./images/viewer.png)
### License
MIT
### あなたも変態 ### あなたも変態
![](./images/image.jpg) ![](./images/image.jpg)

5
doujinshi.txt Normal file
View File

@ -0,0 +1,5 @@
184212
204944
222460
244502
261909

View File

@ -1,3 +1,3 @@
__version__ = '0.2.15' __version__ = '0.3.1'
__author__ = 'RicterZ' __author__ = 'RicterZ'
__email__ = 'ricterzheng@gmail.com' __email__ = 'ricterzheng@gmail.com'

View File

@ -1,20 +1,25 @@
# coding: utf-8 # coding: utf-8
from __future__ import print_function from __future__ import print_function
import os
import sys import sys
from optparse import OptionParser from optparse import OptionParser
from nhentai import __version__
try: try:
from itertools import ifilter as filter from itertools import ifilter as filter
except ImportError: except ImportError:
pass pass
import nhentai.constant as constant import nhentai.constant as constant
from nhentai import __version__
from nhentai.utils import urlparse, generate_html from nhentai.utils import urlparse, generate_html
from nhentai.logger import logger from nhentai.logger import logger
try: try:
reload(sys) if sys.version_info < (3, 0, 0):
sys.setdefaultencoding(sys.stdin.encoding) import codecs
import locale
sys.stdout = codecs.getwriter(locale.getpreferredencoding())(sys.stdout)
sys.stderr = codecs.getwriter(locale.getpreferredencoding())(sys.stderr)
except NameError: except NameError:
# python3 # python3
pass pass
@ -33,35 +38,51 @@ def banner():
def cmd_parser(): def cmd_parser():
parser = OptionParser('\n nhentai --search [keyword] --download' parser = OptionParser('\n nhentai --search [keyword] --download'
'\n NHENTAI=http://h.loli.club nhentai --id [ID ...]' '\n NHENTAI=http://h.loli.club nhentai --id [ID ...]'
'\n nhentai --file [filename]'
'\n\nEnvironment Variable:\n' '\n\nEnvironment Variable:\n'
' NHENTAI nhentai mirror url') ' NHENTAI nhentai mirror url')
# operation options
parser.add_option('--download', dest='is_download', action='store_true', parser.add_option('--download', dest='is_download', action='store_true',
help='download doujinshi (for search result)') help='download doujinshi (for search results)')
parser.add_option('--show-info', dest='is_show', action='store_true', help='just show the doujinshi information') parser.add_option('--show', dest='is_show', action='store_true', help='just show the doujinshi information')
# doujinshi options
parser.add_option('--id', type='string', dest='id', action='store', help='doujinshi ids set, e.g. 1,2,3') parser.add_option('--id', type='string', dest='id', action='store', help='doujinshi ids set, e.g. 1,2,3')
parser.add_option('--search', type='string', dest='keyword', action='store', help='search doujinshi by keyword') parser.add_option('--search', type='string', dest='keyword', action='store', help='search doujinshi by keyword')
parser.add_option('--page', type='int', dest='page', action='store', default=1,
help='page number of search result')
parser.add_option('--tag', type='string', dest='tag', action='store', help='download doujinshi by tag') parser.add_option('--tag', type='string', dest='tag', action='store', help='download doujinshi by tag')
parser.add_option('--favorites', '-F', action='store_true', dest='favorites',
help='list or download your favorites.')
# page options
parser.add_option('--page', type='int', dest='page', action='store', default=1,
help='page number of search results')
parser.add_option('--max-page', type='int', dest='max_page', action='store', default=1,
help='The max page when recursive download tagged doujinshi')
# download options
parser.add_option('--output', type='string', dest='output_dir', action='store', default='', parser.add_option('--output', type='string', dest='output_dir', action='store', default='',
help='output dir') help='output dir')
parser.add_option('--threads', '-t', type='int', dest='threads', action='store', default=5, parser.add_option('--threads', '-t', type='int', dest='threads', action='store', default=5,
help='thread count of download doujinshi') help='thread count for downloading doujinshi')
parser.add_option('--timeout', type='int', dest='timeout', action='store', default=30, parser.add_option('--timeout', type='int', dest='timeout', action='store', default=30,
help='timeout of download doujinshi') help='timeout for downloading doujinshi')
parser.add_option('--proxy', type='string', dest='proxy', action='store', default='', parser.add_option('--proxy', type='string', dest='proxy', action='store', default='',
help='use proxy, example: http://127.0.0.1:1080') help='uses a proxy, for example: http://127.0.0.1:1080')
parser.add_option('--file', '-f', type='string', dest='file', action='store', help='read gallery IDs from file.')
# generate options
parser.add_option('--html', dest='html_viewer', action='store_true', parser.add_option('--html', dest='html_viewer', action='store_true',
help='generate a html viewer at current directory') help='generate a html viewer at current directory')
parser.add_option('--login', '-l', type='str', dest='login', action='store',
help='username:password pair of nhentai account')
parser.add_option('--nohtml', dest='is_nohtml', action='store_true', parser.add_option('--nohtml', dest='is_nohtml', action='store_true',
help='Don\'t generate HTML') help='don\'t generate HTML')
parser.add_option('--cbz', dest='is_cbz', action='store_true', parser.add_option('--cbz', dest='is_cbz', action='store_true',
help='Generate Comic Book CBZ File') help='generate Comic Book CBZ File')
parser.add_option('--rm-origin-dir', dest='rm_origin_dir', action='store_true', default=False,
help='remove downloaded doujinshi dir when generated CBZ file.')
# nhentai options
parser.add_option('--cookie', type='str', dest='cookie', action='store',
help='set cookie of nhentai to bypass Google recaptcha')
try: try:
sys.argv = list(map(lambda x: unicode(x.decode(sys.stdin.encoding)), sys.argv)) sys.argv = list(map(lambda x: unicode(x.decode(sys.stdin.encoding)), sys.argv))
@ -76,6 +97,25 @@ def cmd_parser():
generate_html() generate_html()
exit(0) exit(0)
if os.path.exists(os.path.join(constant.NHENTAI_HOME, 'cookie')):
with open(os.path.join(constant.NHENTAI_HOME, 'cookie'), 'r') as f:
constant.COOKIE = f.read()
if args.cookie:
try:
if not os.path.exists(constant.NHENTAI_HOME):
os.mkdir(constant.NHENTAI_HOME)
with open(os.path.join(constant.NHENTAI_HOME, 'cookie'), 'w') as f:
f.write(args.cookie)
except Exception as e:
logger.error('Cannot create NHENTAI_HOME: {}'.format(str(e)))
exit(1)
logger.info('Cookie saved.')
exit(0)
'''
if args.login: if args.login:
try: try:
_, _ = args.login.split(':', 1) _, _ = args.login.split(':', 1)
@ -85,18 +125,29 @@ def cmd_parser():
if not args.is_download: if not args.is_download:
logger.warning('YOU DO NOT SPECIFY `--download` OPTION !!!') logger.warning('YOU DO NOT SPECIFY `--download` OPTION !!!')
'''
if args.favorites:
if not constant.COOKIE:
logger.warning('Cookie has not been set, please use `nhentai --cookie \'COOKIE\'` to set it.')
exit(1)
if args.id: if args.id:
_ = map(lambda id: id.strip(), args.id.split(',')) _ = map(lambda id_: id_.strip(), args.id.split(','))
args.id = set(map(int, filter(lambda id_: id_.isdigit(), _)))
if args.file:
with open(args.file, 'r') as f:
_ = map(lambda id: id.strip(), f.readlines())
args.id = set(map(int, filter(lambda id_: id_.isdigit(), _))) args.id = set(map(int, filter(lambda id_: id_.isdigit(), _)))
if (args.is_download or args.is_show) and not args.id and not args.keyword and \ if (args.is_download or args.is_show) and not args.id and not args.keyword and \
not args.login and not args.tag: not args.tag and not args.favorites:
logger.critical('Doujinshi id(s) are required for downloading') logger.critical('Doujinshi id(s) are required for downloading')
parser.print_help() parser.print_help()
exit(1) exit(1)
if not args.keyword and not args.id and not args.login and not args.tag: if not args.keyword and not args.id and not args.tag and not args.favorites:
parser.print_help() parser.print_help()
exit(1) exit(1)

View File

@ -5,7 +5,7 @@ import signal
import platform import platform
from nhentai.cmdline import cmd_parser, banner from nhentai.cmdline import cmd_parser, banner
from nhentai.parser import doujinshi_parser, search_parser, print_doujinshi, login_parser, tag_guessing, tag_parser from nhentai.parser import doujinshi_parser, search_parser, print_doujinshi, favorites_parser, tag_parser, login
from nhentai.doujinshi import Doujinshi from nhentai.doujinshi import Doujinshi
from nhentai.downloader import Downloader from nhentai.downloader import Downloader
from nhentai.logger import logger from nhentai.logger import logger
@ -21,18 +21,21 @@ def main():
doujinshi_ids = [] doujinshi_ids = []
doujinshi_list = [] doujinshi_list = []
if options.login: if options.favorites:
username, password = options.login.split(':', 1) if not options.is_download:
logger.info('Login to nhentai use credential \'%s:%s\'' % (username, '*' * len(password))) logger.warning('You do not specify --download option')
for doujinshi_info in login_parser(username=username, password=password):
for doujinshi_info in favorites_parser():
doujinshi_list.append(Doujinshi(**doujinshi_info)) doujinshi_list.append(Doujinshi(**doujinshi_info))
if not options.is_download:
print_doujinshi([{'id': i.id, 'title': i.name} for i in doujinshi_list])
exit(0)
if options.tag: if options.tag:
tag_id = tag_guessing(options.tag) doujinshis = tag_parser(options.tag, max_page=options.max_page)
if tag_id:
doujinshis = tag_parser(tag_id)
print_doujinshi(doujinshis) print_doujinshi(doujinshis)
if options.is_download: if options.is_download and doujinshis:
doujinshi_ids = map(lambda d: d['id'], doujinshis) doujinshi_ids = map(lambda d: d['id'], doujinshis)
if options.keyword: if options.keyword:
@ -41,6 +44,9 @@ def main():
if options.is_download: if options.is_download:
doujinshi_ids = map(lambda d: d['id'], doujinshis) doujinshi_ids = map(lambda d: d['id'], doujinshis)
if not doujinshi_ids:
doujinshi_ids = options.id
if doujinshi_ids: if doujinshi_ids:
for id_ in doujinshi_ids: for id_ in doujinshi_ids:
doujinshi_info = doujinshi_parser(id_) doujinshi_info = doujinshi_parser(id_)
@ -56,7 +62,7 @@ def main():
if not options.is_nohtml and not options.is_cbz: if not options.is_nohtml and not options.is_cbz:
generate_html(options.output_dir, doujinshi) generate_html(options.output_dir, doujinshi)
elif options.is_cbz: elif options.is_cbz:
generate_cbz(options.output_dir, doujinshi) generate_cbz(options.output_dir, doujinshi, options.rm_origin_dir)
if not platform.system() == 'Windows': if not platform.system() == 'Windows':
logger.log(15, '🍻 All done.') logger.log(15, '🍻 All done.')
@ -68,7 +74,7 @@ def main():
def signal_handler(signal, frame): def signal_handler(signal, frame):
logger.error('Ctrl-C signal received. Quit.') logger.error('Ctrl-C signal received. Stopping...')
exit(1) exit(1)

View File

@ -1,18 +1,28 @@
# coding: utf-8 # coding: utf-8
from __future__ import unicode_literals, print_function from __future__ import unicode_literals, print_function
import os import os
import tempfile
from nhentai.utils import urlparse from nhentai.utils import urlparse
BASE_URL = os.getenv('NHENTAI', 'https://nhentai.net') BASE_URL = os.getenv('NHENTAI', 'https://nhentai.net')
DETAIL_URL = '%s/api/gallery' % BASE_URL __api_suspended_DETAIL_URL = '%s/api/gallery' % BASE_URL
SEARCH_URL = '%s/api/galleries/search' % BASE_URL __api_suspended_SEARCH_URL = '%s/api/galleries/search' % BASE_URL
DETAIL_URL = '%s/g' % BASE_URL
SEARCH_URL = '%s/search/' % BASE_URL
TAG_URL = '%s/tag' % BASE_URL TAG_URL = '%s/tag' % BASE_URL
TAG_API_URL = '%s/api/galleries/tagged' % BASE_URL TAG_API_URL = '%s/api/galleries/tagged' % BASE_URL
LOGIN_URL = '%s/login/' % BASE_URL LOGIN_URL = '%s/login/' % BASE_URL
CHALLENGE_URL = '%s/challenge' % BASE_URL
FAV_URL = '%s/favorites/' % BASE_URL FAV_URL = '%s/favorites/' % BASE_URL
u = urlparse(BASE_URL) u = urlparse(BASE_URL)
IMAGE_URL = '%s://i.%s/galleries' % (u.scheme, u.hostname) IMAGE_URL = '%s://i.%s/galleries' % (u.scheme, u.hostname)
NHENTAI_HOME = os.path.join(os.getenv('HOME', tempfile.gettempdir()), '.nhentai')
PROXY = {} PROXY = {}
COOKIE = ''

View File

@ -11,6 +11,7 @@ from nhentai.utils import format_filename
EXT_MAP = { EXT_MAP = {
'j': 'jpg', 'j': 'jpg',
'p': 'png', 'p': 'png',
'g': 'gif',
} }
@ -35,6 +36,7 @@ class Doujinshi(object):
self.downloader = None self.downloader = None
self.url = '%s/%d' % (DETAIL_URL, self.id) self.url = '%s/%d' % (DETAIL_URL, self.id)
self.info = DoujinshiInfo(**kwargs) self.info = DoujinshiInfo(**kwargs)
self.filename = format_filename('[%s][%s][%s]' % (self.id, self.info.artist, self.name))
def __repr__(self): def __repr__(self):
return '<Doujinshi: {0}>'.format(self.name) return '<Doujinshi: {0}>'.format(self.name)
@ -43,8 +45,8 @@ class Doujinshi(object):
table = [ table = [
["Doujinshi", self.name], ["Doujinshi", self.name],
["Subtitle", self.info.subtitle], ["Subtitle", self.info.subtitle],
["Characters", self.info.characters], ["Characters", self.info.character],
["Authors", self.info.artists], ["Authors", self.info.artist],
["Language", self.info.language], ["Language", self.info.language],
["Tags", self.info.tags], ["Tags", self.info.tags],
["URL", self.url], ["URL", self.url],
@ -53,15 +55,25 @@ class Doujinshi(object):
logger.info(u'Print doujinshi information of {0}\n{1}'.format(self.id, tabulate(table))) logger.info(u'Print doujinshi information of {0}\n{1}'.format(self.id, tabulate(table)))
def download(self): def download(self):
logger.info('Start download doujinshi: %s' % self.name) logger.info('Starting to download doujinshi: %s' % self.name)
if self.downloader: if self.downloader:
download_queue = [] download_queue = []
if len(self.ext) != self.pages:
logger.warning('Page count and ext count do not equal')
for i in range(1, min(self.pages, len(self.ext)) + 1):
download_queue.append('%s/%d/%d.%s' % (IMAGE_URL, int(self.img_id), i, self.ext[i-1]))
self.downloader.download(download_queue, self.filename)
'''
for i in range(len(self.ext)): for i in range(len(self.ext)):
download_queue.append('%s/%d/%d.%s' % (IMAGE_URL, int(self.img_id), i+1, EXT_MAP[self.ext[i]])) download_queue.append('%s/%d/%d.%s' % (IMAGE_URL, int(self.img_id), i+1, EXT_MAP[self.ext[i]]))
'''
self.downloader.download(download_queue, format_filename('%s-%s' % (self.id, self.name[:200])))
else: else:
logger.critical('Downloader has not be loaded') logger.critical('Downloader has not been loaded')
if __name__ == '__main__': if __name__ == '__main__':

View File

@ -29,22 +29,40 @@ class Downloader(Singleton):
self.path = str(path) self.path = str(path)
self.thread_count = thread self.thread_count = thread
self.threads = [] self.threads = []
self.thread_pool = None
self.timeout = timeout self.timeout = timeout
def _download(self, url, folder='', filename='', retried=0): def _download(self, url, folder='', filename='', retried=0):
logger.info('Start downloading: {0} ...'.format(url)) logger.info('Starting to download {0} ...'.format(url))
filename = filename if filename else os.path.basename(urlparse(url).path) filename = filename if filename else os.path.basename(urlparse(url).path)
base_filename, extension = os.path.splitext(filename) base_filename, extension = os.path.splitext(filename)
try: try:
if os.path.exists(os.path.join(folder, base_filename.zfill(3) + extension)): if os.path.exists(os.path.join(folder, base_filename.zfill(3) + extension)):
logger.warning('File: {0} existed, ignore.'.format(os.path.join(folder, base_filename.zfill(3) + logger.warning('File: {0} exists, ignoring'.format(os.path.join(folder, base_filename.zfill(3) +
extension))) extension)))
return 1, url return 1, url
response = None
with open(os.path.join(folder, base_filename.zfill(3) + extension), "wb") as f: with open(os.path.join(folder, base_filename.zfill(3) + extension), "wb") as f:
i = 0
while i < 10:
try:
response = request('get', url, stream=True, timeout=self.timeout) response = request('get', url, stream=True, timeout=self.timeout)
if response.status_code != 200: if response.status_code != 200:
raise NhentaiImageNotExistException raise NhentaiImageNotExistException
except NhentaiImageNotExistException as e:
raise e
except Exception as e:
i += 1
if not i < 10:
logger.critical(str(e))
return 0, None
continue
break
length = response.headers.get('content-length') length = response.headers.get('content-length')
if length is None: if length is None:
f.write(response.content) f.write(response.content)
@ -77,7 +95,7 @@ class Downloader(Singleton):
elif result == -1: elif result == -1:
logger.warning('url {} return status code 404'.format(data)) logger.warning('url {} return status code 404'.format(data))
else: else:
logger.log(15, '{0} download successfully'.format(data)) logger.log(15, '{0} downloaded successfully'.format(data))
def download(self, queue, folder=''): def download(self, queue, folder=''):
if not isinstance(folder, text): if not isinstance(folder, text):
@ -87,7 +105,7 @@ class Downloader(Singleton):
folder = os.path.join(self.path, folder) folder = os.path.join(self.path, folder)
if not os.path.exists(folder): if not os.path.exists(folder):
logger.warn('Path \'{0}\' not exist.'.format(folder)) logger.warn('Path \'{0}\' does not exist, creating.'.format(folder))
try: try:
os.makedirs(folder) os.makedirs(folder)
except EnvironmentError as e: except EnvironmentError as e:

View File

@ -104,6 +104,9 @@ class ColorizingStreamHandler(logging.StreamHandler):
text = parts.pop(0) text = parts.pop(0)
if text: if text:
if sys.version_info < (3, 0, 0):
write(text.encode('utf-8'))
else:
write(text) write(text)
if parts: if parts:

View File

@ -5,6 +5,7 @@ import os
import re import re
import threadpool import threadpool
import requests import requests
import time
from bs4 import BeautifulSoup from bs4 import BeautifulSoup
from tabulate import tabulate from tabulate import tabulate
@ -12,44 +13,67 @@ import nhentai.constant as constant
from nhentai.logger import logger from nhentai.logger import logger
session = requests.Session()
session.headers.update({
'Referer': constant.LOGIN_URL,
'User-Agent': 'nhentai command line client (https://github.com/RicterZ/nhentai)',
})
def request(method, url, **kwargs): def request(method, url, **kwargs):
if not hasattr(requests, method): global session
raise AttributeError('\'requests\' object has no attribute \'{0}\''.format(method)) if not hasattr(session, method):
raise AttributeError('\'requests.Session\' object has no attribute \'{0}\''.format(method))
return requests.__dict__[method](url, proxies=constant.PROXY, verify=False, **kwargs) session.headers.update({'Cookie': constant.COOKIE})
return getattr(session, method)(url, proxies=constant.PROXY, verify=False, **kwargs)
def login_parser(username, password): def _get_csrf_token(content):
s = requests.Session()
s.proxies = constant.PROXY
s.verify = False
s.headers.update({'Referer': constant.LOGIN_URL})
s.get(constant.LOGIN_URL)
content = s.get(constant.LOGIN_URL).content
html = BeautifulSoup(content, 'html.parser') html = BeautifulSoup(content, 'html.parser')
csrf_token_elem = html.find('input', attrs={'name': 'csrfmiddlewaretoken'}) csrf_token_elem = html.find('input', attrs={'name': 'csrfmiddlewaretoken'})
if not csrf_token_elem: if not csrf_token_elem:
raise Exception('Cannot find csrf token to login') raise Exception('Cannot find csrf token to login')
csrf_token = csrf_token_elem.attrs['value'] return csrf_token_elem.attrs['value']
def login(username, password):
logger.warning('This feature is deprecated, please use --cookie to set your cookie.')
csrf_token = _get_csrf_token(request('get', url=constant.LOGIN_URL).text)
if os.getenv('DEBUG'):
logger.info('Getting CSRF token ...')
if os.getenv('DEBUG'):
logger.info('CSRF token is {}'.format(csrf_token))
login_dict = { login_dict = {
'csrfmiddlewaretoken': csrf_token, 'csrfmiddlewaretoken': csrf_token,
'username_or_email': username, 'username_or_email': username,
'password': password, 'password': password,
} }
resp = s.post(constant.LOGIN_URL, data=login_dict) resp = request('post', url=constant.LOGIN_URL, data=login_dict)
if 'Invalid username (or email) or password' in resp.text:
if 'You\'re loading pages way too quickly.' in resp.text or 'Really, slow down' in resp.text:
csrf_token = _get_csrf_token(resp.text)
resp = request('post', url=resp.url, data={'csrfmiddlewaretoken': csrf_token, 'next': '/'})
if 'Invalid username/email or password' in resp.text:
logger.error('Login failed, please check your username and password') logger.error('Login failed, please check your username and password')
exit(1) exit(1)
html = BeautifulSoup(s.get(constant.FAV_URL).content, 'html.parser') if 'You\'re loading pages way too quickly.' in resp.text or 'Really, slow down' in resp.text:
logger.error('Using nhentai --cookie \'YOUR_COOKIE_HERE\' to save your Cookie.')
exit(2)
def favorites_parser():
html = BeautifulSoup(request('get', constant.FAV_URL).content, 'html.parser')
count = html.find('span', attrs={'class': 'count'}) count = html.find('span', attrs={'class': 'count'})
if not count: if not count:
logger.error('Cannot get count of your favorites, maybe login failed.') logger.error("Can't get your number of favorited doujins. Did the login failed?")
return []
count = int(count.text.strip('(').strip(')')) count = int(count.text.strip('(').strip(')').replace(',', ''))
if count == 0: if count == 0:
logger.warning('No favorites found') logger.warning('No favorites found')
return [] return []
@ -60,7 +84,7 @@ def login_parser(username, password):
else: else:
pages = 1 pages = 1
logger.info('Your have %d favorites in %d pages.' % (count, pages)) logger.info('You have %d favorites in %d pages.' % (count, pages))
if os.getenv('DEBUG'): if os.getenv('DEBUG'):
pages = 1 pages = 1
@ -68,19 +92,15 @@ def login_parser(username, password):
ret = [] ret = []
doujinshi_id = re.compile('data-id="([\d]+)"') doujinshi_id = re.compile('data-id="([\d]+)"')
def _callback(request, result):
ret.append(result)
thread_pool = threadpool.ThreadPool(5)
for page in range(1, pages + 1): for page in range(1, pages + 1):
try: try:
logger.info('Getting doujinshi id of page %d' % page) logger.info('Getting doujinshi ids of page %d' % page)
resp = s.get(constant.FAV_URL + '?page=%d' % page).text resp = request('get', constant.FAV_URL + '?page=%d' % page).text
ids = doujinshi_id.findall(resp) ids = doujinshi_id.findall(resp)
requests_ = threadpool.makeRequests(doujinshi_parser, ids, _callback)
[thread_pool.putRequest(req) for req in requests_] for i in ids:
thread_pool.wait() ret.append(doujinshi_parser(i))
except Exception as e: except Exception as e:
logger.error('Error: %s, continue', str(e)) logger.error('Error: %s, continue', str(e))
@ -95,13 +115,110 @@ def doujinshi_parser(id_):
logger.log(15, 'Fetching doujinshi information of id {0}'.format(id_)) logger.log(15, 'Fetching doujinshi information of id {0}'.format(id_))
doujinshi = dict() doujinshi = dict()
doujinshi['id'] = id_ doujinshi['id'] = id_
url = '{0}/{1}'.format(constant.DETAIL_URL, id_) url = '{0}/{1}/'.format(constant.DETAIL_URL, id_)
try:
response = request('get', url)
if response.status_code in (200, ):
response = response.content
else:
logger.debug('Slow down and retry ({}) ...'.format(id_))
time.sleep(1)
return doujinshi_parser(str(id_))
except Exception as e:
logger.critical(str(e))
raise SystemExit
html = BeautifulSoup(response, 'html.parser')
doujinshi_info = html.find('div', attrs={'id': 'info'})
title = doujinshi_info.find('h1').text
subtitle = doujinshi_info.find('h2')
doujinshi['name'] = title
doujinshi['subtitle'] = subtitle.text if subtitle else ''
doujinshi_cover = html.find('div', attrs={'id': 'cover'})
img_id = re.search('/galleries/([\d]+)/cover\.(jpg|png)$', doujinshi_cover.a.img.attrs['data-src'])
ext = []
for i in html.find_all('div', attrs={'class': 'thumb-container'}):
_, ext_name = os.path.basename(i.img.attrs['data-src']).rsplit('.', 1)
ext.append(ext_name)
if not img_id:
logger.critical('Tried yo get image id failed')
exit(1)
doujinshi['img_id'] = img_id.group(1)
doujinshi['ext'] = ext
pages = 0
for _ in doujinshi_info.find_all('div', class_=''):
pages = re.search('([\d]+) pages', _.text)
if pages:
pages = pages.group(1)
break
doujinshi['pages'] = int(pages)
# gain information of the doujinshi
information_fields = doujinshi_info.find_all('div', attrs={'class': 'field-name'})
needed_fields = ['Characters', 'Artists', 'Language', 'Tags']
for field in information_fields:
field_name = field.contents[0].strip().strip(':')
if field_name in needed_fields:
data = [sub_field.contents[0].strip() for sub_field in
field.find_all('a', attrs={'class': 'tag'})]
doujinshi[field_name.lower()] = ', '.join(data)
return doujinshi
def search_parser(keyword, page):
logger.debug('Searching doujinshis of keyword {0}'.format(keyword))
result = []
try:
response = request('get', url=constant.SEARCH_URL, params={'q': keyword, 'page': page}).content
except requests.ConnectionError as e:
logger.critical(e)
logger.warn('If you are in China, please configure the proxy to fu*k GFW.')
raise SystemExit
html = BeautifulSoup(response, 'html.parser')
doujinshi_search_result = html.find_all('div', attrs={'class': 'gallery'})
for doujinshi in doujinshi_search_result:
doujinshi_container = doujinshi.find('div', attrs={'class': 'caption'})
title = doujinshi_container.text.strip()
title = title if len(title) < 85 else title[:82] + '...'
id_ = re.search('/g/(\d+)/', doujinshi.a['href']).group(1)
result.append({'id': id_, 'title': title})
if not result:
logger.warn('Not found anything of keyword {}'.format(keyword))
return result
def __api_suspended_doujinshi_parser(id_):
if not isinstance(id_, (int,)) and (isinstance(id_, (str,)) and not id_.isdigit()):
raise Exception('Doujinshi id({0}) is not valid'.format(id_))
id_ = int(id_)
logger.log(15, 'Fetching information of doujinshi id {0}'.format(id_))
doujinshi = dict()
doujinshi['id'] = id_
url = '{0}/{1}'.format(constant.DETAIL_URL, id_)
i = 0
while 5 > i:
try: try:
response = request('get', url).json() response = request('get', url).json()
except Exception as e: except Exception as e:
i += 1
if not i < 5:
logger.critical(str(e)) logger.critical(str(e))
exit(1) exit(1)
continue
break
doujinshi['name'] = response['title']['english'] doujinshi['name'] = response['title']['english']
doujinshi['subtitle'] = response['title']['japanese'] doujinshi['subtitle'] = response['title']['japanese']
@ -124,22 +241,29 @@ def doujinshi_parser(id_):
elif tag_type not in doujinshi: elif tag_type not in doujinshi:
doujinshi[tag_type] = tag['name'] doujinshi[tag_type] = tag['name']
else: else:
doujinshi[tag_type] += tag['name'] doujinshi[tag_type] += ', ' + tag['name']
return doujinshi return doujinshi
def search_parser(keyword, page): def __api_suspended_search_parser(keyword, page):
logger.debug('Searching doujinshis of keyword {0}'.format(keyword)) logger.debug('Searching doujinshis using keywords {0}'.format(keyword))
result = [] result = []
i = 0
while i < 5:
try: try:
response = request('get', url=constant.SEARCH_URL, params={'query': keyword, 'page': page}).json() response = request('get', url=constant.SEARCH_URL, params={'query': keyword, 'page': page}).json()
if 'result' not in response: except Exception as e:
raise Exception('No result in response') i += 1
except requests.ConnectionError as e: if not i < 5:
logger.critical(e) logger.critical(str(e))
logger.warn('If you are in China, please configure the proxy to fu*k GFW.') logger.warn('If you are in China, please configure the proxy to fu*k GFW.')
exit(1) exit(1)
continue
break
if 'result' not in response:
raise Exception('No result in response')
for row in response['result']: for row in response['result']:
title = row['title']['english'] title = row['title']['english']
@ -147,7 +271,7 @@ def search_parser(keyword, page):
result.append({'id': row['id'], 'title': title}) result.append({'id': row['id'], 'title': title})
if not result: if not result:
logger.warn('Not found anything of keyword {}'.format(keyword)) logger.warn('No results for keywords {}'.format(keyword))
return result return result
@ -161,47 +285,54 @@ def print_doujinshi(doujinshi_list):
tabulate(tabular_data=doujinshi_list, headers=headers, tablefmt='rst')) tabulate(tabular_data=doujinshi_list, headers=headers, tablefmt='rst'))
def tag_parser(tag_id): def __api_suspended_tag_parser(tag_id, max_page=1):
logger.info('Get doujinshi of tag id: {0}'.format(tag_id)) logger.info('Searching for doujinshi with tag id {0}'.format(tag_id))
result = [] result = []
response = request('get', url=constant.TAG_API_URL, params={'sort': 'popular', 'tag_id': tag_id}).json() response = request('get', url=constant.TAG_API_URL, params={'sort': 'popular', 'tag_id': tag_id}).json()
page = max_page if max_page <= response['num_pages'] else int(response['num_pages'])
for i in range(1, page + 1):
logger.info('Getting page {} ...'.format(i))
if page != 1:
response = request('get', url=constant.TAG_API_URL,
params={'sort': 'popular', 'tag_id': tag_id}).json()
for row in response['result']: for row in response['result']:
title = row['title']['english'] title = row['title']['english']
title = title[:85] + '..' if len(title) > 85 else title title = title[:85] + '..' if len(title) > 85 else title
result.append({'id': row['id'], 'title': title}) result.append({'id': row['id'], 'title': title})
if not result: if not result:
logger.warn('Not found anything of tag id {}'.format(tag_id)) logger.warn('No results for tag id {}'.format(tag_id))
return result return result
def tag_guessing(tag_name): def tag_parser(tag_name, max_page=1):
result = []
tag_name = tag_name.lower() tag_name = tag_name.lower()
tag_name = tag_name.replace(' ', '-') tag_name = tag_name.replace(' ', '-')
logger.info('Trying to get tag_id of tag \'{0}\''.format(tag_name))
response = request('get', url='%s/%s' % (constant.TAG_URL, tag_name)).content for p in range(1, max_page + 1):
logger.debug('Fetching page {0} for doujinshi with tag \'{1}\''.format(p, tag_name))
response = request('get', url='%s/%s?page=%d' % (constant.TAG_URL, tag_name, p)).content
html = BeautifulSoup(response, 'html.parser') html = BeautifulSoup(response, 'html.parser')
first_item = html.find('div', attrs={'class': 'gallery'}) doujinshi_items = html.find_all('div', attrs={'class': 'gallery'})
if not first_item: if not doujinshi_items:
logger.error('Cannot find doujinshi id of tag \'{0}\''.format(tag_name)) logger.error('Cannot find doujinshi id of tag \'{0}\''.format(tag_name))
return return
doujinshi_id = re.findall('(\d+)', first_item.a.attrs['href']) for i in doujinshi_items:
if not doujinshi_id: doujinshi_id = i.a.attrs['href'].strip('/g')
logger.error('Cannot find doujinshi id of tag \'{0}\''.format(tag_name)) doujinshi_title = i.a.text.strip()
return doujinshi_title = doujinshi_title if len(doujinshi_title) < 85 else doujinshi_title[:82] + '...'
result.append({'title': doujinshi_title, 'id': doujinshi_id})
ret = doujinshi_parser(doujinshi_id[0]) if not result:
if 'tag' in ret and tag_name in ret['tag']: logger.warn('No results for tag \'{}\''.format(tag_name))
tag_id = ret['tag'][tag_name]
logger.info('Tag id of tag \'{0}\' is {1}'.format(tag_name, tag_id))
else:
logger.error('Cannot find doujinshi id of tag \'{0}\''.format(tag_name))
return
return tag_id return result
if __name__ == '__main__': if __name__ == '__main__':

View File

@ -43,8 +43,7 @@ def generate_html(output_dir='.', doujinshi_obj=None):
image_html = '' image_html = ''
if doujinshi_obj is not None: if doujinshi_obj is not None:
doujinshi_dir = os.path.join(output_dir, format_filename('%s-%s' % (doujinshi_obj.id, doujinshi_dir = os.path.join(output_dir, doujinshi_obj.filename)
str(doujinshi_obj.name[:200]))))
else: else:
doujinshi_dir = '.' doujinshi_dir = '.'
@ -64,6 +63,8 @@ def generate_html(output_dir='.', doujinshi_obj=None):
if doujinshi_obj is not None: if doujinshi_obj is not None:
title = doujinshi_obj.name title = doujinshi_obj.name
if sys.version_info < (3, 0):
title = title.encode('utf-8')
else: else:
title = 'nHentai HTML Viewer' title = 'nHentai HTML Viewer'
@ -81,12 +82,10 @@ def generate_html(output_dir='.', doujinshi_obj=None):
logger.warning('Writen HTML Viewer failed ({})'.format(str(e))) logger.warning('Writen HTML Viewer failed ({})'.format(str(e)))
def generate_cbz(output_dir='.', doujinshi_obj=None): def generate_cbz(output_dir='.', doujinshi_obj=None, rm_origin_dir=False):
if doujinshi_obj is not None: if doujinshi_obj is not None:
doujinshi_dir = os.path.join(output_dir, format_filename('%s-%s' % (doujinshi_obj.id, doujinshi_dir = os.path.join(output_dir, doujinshi_obj.filename)
str(doujinshi_obj.name[:200])))) cbz_filename = os.path.join(os.path.join(doujinshi_dir, '..'), '%s.cbz' % doujinshi_obj.id)
cbz_filename = os.path.join(output_dir, format_filename('%s-%s.cbz' % (doujinshi_obj.id,
str(doujinshi_obj.name[:200]))))
else: else:
cbz_filename = './doujinshi.cbz' cbz_filename = './doujinshi.cbz'
doujinshi_dir = '.' doujinshi_dir = '.'
@ -94,20 +93,18 @@ def generate_cbz(output_dir='.', doujinshi_obj=None):
file_list = os.listdir(doujinshi_dir) file_list = os.listdir(doujinshi_dir)
file_list.sort() file_list.sort()
logger.info('Writing CBZ file to path: {}'.format(cbz_filename))
with zipfile.ZipFile(cbz_filename, 'w') as cbz_pf: with zipfile.ZipFile(cbz_filename, 'w') as cbz_pf:
for image in file_list: for image in file_list:
image_path = os.path.join(doujinshi_dir, image) image_path = os.path.join(doujinshi_dir, image)
cbz_pf.write(image_path, image) cbz_pf.write(image_path, image)
if rm_origin_dir:
shutil.rmtree(doujinshi_dir, ignore_errors=True) shutil.rmtree(doujinshi_dir, ignore_errors=True)
logger.log(15, 'Comic Book CBZ file has been write to \'{0}\''.format(doujinshi_dir)) logger.log(15, 'Comic Book CBZ file has been write to \'{0}\''.format(doujinshi_dir))
def format_filename(s): def format_filename(s):
"""Take a string and return a valid filename constructed from the string. """Take a string and return a valid filename constructed from the string.
Uses a whitelist approach: any characters not present in valid_chars are Uses a whitelist approach: any characters not present in valid_chars are
@ -119,7 +116,12 @@ and append a file extension like '.txt', so I avoid the potential of using
an invalid filename. an invalid filename.
""" """
valid_chars = "-_.() %s%s" % (string.ascii_letters, string.digits) valid_chars = "-_.()[] %s%s" % (string.ascii_letters, string.digits)
filename = ''.join(c for c in s if c in valid_chars) filename = ''.join(c for c in s if c in valid_chars)
filename = filename.replace(' ', '_') # I don't like spaces in filenames. filename = filename.replace(' ', '_') # I don't like spaces in filenames.
if len(filename) > 100:
filename = filename[:100] + '...]'
# Remove [] from filename
filename = filename.replace('[]', '')
return filename return filename

View File

@ -46,17 +46,33 @@ document.getElementById('image-container').onclick = event => {
document.onkeypress = event => { document.onkeypress = event => {
switch (event.key.toLowerCase()) { switch (event.key.toLowerCase()) {
// Previous Image // Previous Image
case 'arrowleft':
case 'a': case 'a':
changePage(currentPage - 1); changePage(currentPage - 1);
break; break;
// Next Image // Next Image
case ' ': case ' ':
case 'esc': // future close page function
case 'enter': case 'enter':
case 'arrowright':
case 'd': case 'd':
changePage(currentPage + 1); changePage(currentPage + 1);
break; break;
}// remove arrow cause it won't work
};
document.onkeydown = event =>{
switch (event.keyCode) {
case 37: //left
changePage(currentPage - 1);
break;
case 38: //up
changePage(currentPage - 1);
break;
case 39: //right
changePage(currentPage + 1);
break;
case 40: //down
changePage(currentPage + 1);
break;
} }
}; };

View File

@ -2,4 +2,4 @@ requests>=2.5.0
BeautifulSoup4>=4.0.0 BeautifulSoup4>=4.0.0
threadpool>=1.2.7 threadpool>=1.2.7
tabulate>=0.7.5 tabulate>=0.7.5
future>=0.15.2threadpool==1.3.2 future>=0.15.2