Compare commits

...

38 Commits

Author SHA1 Message Date
221ff6b32c 0.4.18 bugs fix 2023-02-04 20:24:53 +08:00
bc6ef0cf5d solve #251 2023-02-04 20:22:57 +08:00
c8c63cbc11 add usage images 2023-02-04 20:09:51 +08:00
a63856d076 update usage 2023-02-04 20:09:46 +08:00
aa4986189f resolve issue #264 2023-02-04 19:55:51 +08:00
0fb81599dc resolve #265 2023-02-04 19:47:24 +08:00
e9f9651d07 change the default sort method 2023-02-04 19:38:29 +08:00
1860b5f0cf resoved issue #249 2022-05-03 16:54:38 +08:00
eff4f3bf9b remove debug print 2022-05-03 16:51:49 +08:00
501840172e change sorting from recent to date 2022-05-03 16:49:26 +08:00
e5ed6d098a update README 2022-05-02 18:53:40 +08:00
98606202fb remove some unused images 2022-05-02 18:49:34 +08:00
5a3f1009c9 update README for issue #237 2022-05-02 18:48:02 +08:00
61945a6e97 fix for issue #236 2022-05-02 17:01:30 +08:00
443fcdc7da fix for issue #232 2022-05-02 16:53:23 +08:00
31b95fe2dd 0.4.17 releases, for #246 2022-05-02 16:24:04 +08:00
be8c97f8d4 Merge pull request #247 from krrr/master 2022-05-02 13:21:53 +08:00
348e51676e Update README.rst 2022-05-02 12:13:19 +08:00
ea356a1ca2 Merge pull request #244 from krrr/master 2022-04-30 13:47:57 +08:00
5a4dfb8a76 Add new option to avoid cloudflare captcha 2022-04-30 11:22:41 +08:00
4b15744ceb Merge pull request #235 from TravisDavis-ops/nixpkg 2021-12-24 03:27:07 +08:00
b05fa16286 Update README.rst 2021-12-23 12:43:20 -06:00
0879486881 Merge pull request #228 from culturecloud/master 2021-08-23 20:27:38 +08:00
c66ba730d3 Fix UnicodeEncodeError 2021-07-28 18:43:45 +06:00
606c5e0ffd Merge pull request #226 from nanaih/minimal_viewer 2021-06-23 18:14:47 +08:00
ba04f81a6f add minimal viewer, fix not using config's template on --html only option 2021-06-22 23:17:03 -04:00
6519e6f221 Merge pull request #224 from RicterZ/pull/221
Pull/221
2021-06-07 17:21:00 +08:00
7594625d72 fix format 2021-06-07 17:17:54 +08:00
4948c8f0c5 update README 2021-06-07 16:50:03 +08:00
e22a99fa8c Merge branch 'master' of github.com:RicterZ/nhentai 2021-06-07 16:48:36 +08:00
19a1d5c404 fix #220 add pretty name of doujinshi format 2021-06-07 16:47:54 +08:00
ad1e876611 Merge pull request #221 from SomeRandomDude870/master
HDoujin-format Metadata file
2021-06-07 16:02:43 +08:00
1de7e1f998 Merge branch 'pull/221' into master 2021-06-07 16:01:54 +08:00
b97e707817 HDoujin-format Metadata file 2021-06-05 17:13:18 +02:00
6ef2189bfe Merge pull request #214 from lleene/master
Add dryrun option to command line interface
2021-06-03 08:00:18 +08:00
24be2d37d4 0.4.16 2021-06-02 23:22:23 +08:00
bd38294bb7 undo whitespace edits 2021-05-16 19:49:26 +02:00
2cf4e6718e Add the option to perform a dry-run and only download meta-data / generate file structure 2021-05-16 19:44:01 +02:00
17 changed files with 420 additions and 121 deletions

1
.gitignore vendored
View File

@ -7,3 +7,4 @@ dist/
.DS_Store
output/
venv/
.vscode/

View File

@ -1,4 +1,5 @@
include README.md
include requirements.txt
include nhentai/viewer/*
include nhentai/viewer/default/*
include nhentai/viewer/default/*
include nhentai/viewer/minimal/*

View File

@ -50,11 +50,18 @@ Installation (Gentoo)
layman -fa glicOne
sudo emerge net-misc/nhentai
=====================
Installation (NixOs)
=====================
.. code-block::
nix-env -iA nixos.nhentai
=====
Usage
=====
**IMPORTANT**: To bypass the nhentai frequency limit, you should use `--cookie` option to store your cookie.
**⚠️IMPORTANT⚠️**: To bypass the nhentai frequency limit, you should use `--cookie` and `--useragent` options to store your cookie and your user-agent.
*The default download folder will be the path where you run the command (CLI path).*
@ -63,9 +70,13 @@ Set your nhentai cookie against captcha:
.. code-block:: bash
nhentai --useragent "USER AGENT of YOUR BROWSER"
nhentai --cookie "YOUR COOKIE FROM nhentai.net"
**NOTE**: The format of the cookie is `"csrftoken=TOKEN; sessionid=ID"`
**NOTE**
- The format of the cookie is `"csrftoken=TOKEN; sessionid=ID; cf_clearance=CLOUDFLARE"`
- `cf_clearance` cookie and useragent must be set if you encounter "blocked by cloudflare captcha" error. Make sure you use the same IP and useragent as when you got it
| To get csrftoken and sessionid, first login to your nhentai account in web browser, then:
| (Chrome) |ve| |ld| More tools |ld| Developer tools |ld| Application |ld| Storage |ld| Cookies |ld| https://nhentai.net
@ -76,6 +87,10 @@ Set your nhentai cookie against captcha:
.. |ve| unicode:: U+22EE .. https://www.compart.com/en/unicode/U+22EE
.. |ld| unicode:: U+2014 .. https://www.compart.com/en/unicode/U+2014
.. image:: ./images/usage.png?raw=true
:alt: nhentai
:align: center
Download specified doujinshi:
.. code-block:: bash
@ -121,30 +136,41 @@ Supported doujinshi folder formatter:
- %t: Doujinshi name
- %s: Doujinshi subtitle (translated name)
- %a: Doujinshi authors' name
- %p: Doujinshi pretty name
Other options:
.. code-block::
Usage:
nhentai --search [keyword] --download
NHENTAI=http://h.loli.club nhentai --id [ID ...]
nhentai --file [filename]
Environment Variable:
NHENTAI nhentai mirror url
Options:
# Operation options
# Operation options, control the program behaviors
-h, --help show this help message and exit
-D, --download download doujinshi (for search results)
-S, --show just show the doujinshi information
# Doujinshi options
# Doujinshi options, specify id, keyword, etc.
--id=ID doujinshi ids set, e.g. 1,2,3
-s KEYWORD, --search=KEYWORD
search doujinshi by keyword
--tag=TAG download doujinshi by tag
-F, --favorites list or download your favorites.
# Multi-page options
--page=PAGE page number of search results
--max-page=MAX_PAGE The max page when recursive download tagged doujinshi
# Page options, control the page to fetch / download
--page-all all search results
--page=PAGE, --page-range=PAGE
page number of search results. e.g. 1,2-5,14
--sorting=SORTING sorting of doujinshi (recent / popular /
popular-[today|week])
# Download options
# Download options, the output directory, threads, timeout, delay, etc.
-o OUTPUT_DIR, --output=OUTPUT_DIR
output dir
-t THREADS, --threads=THREADS
@ -153,23 +179,36 @@ Other options:
timeout for downloading doujinshi
-d DELAY, --delay=DELAY
slow down between downloading every doujinshi
-p PROXY, --proxy=PROXY
uses a proxy, for example: http://127.0.0.1:1080
--proxy=PROXY store a proxy, for example: -p 'http://127.0.0.1:1080'
-f FILE, --file=FILE read gallery IDs from file.
--format=NAME_FORMAT format the saved folder name
-r, --dry-run Dry run, skip file download.
# Generating options
# Generate options, for generate html viewer, cbz file, pdf file, etc
--html generate a html viewer at current directory
--no-html don't generate HTML after downloading
--gen-main generate a main viewer contain all the doujin in the folder
--gen-main generate a main viewer contain all the doujin in the
folder
-C, --cbz generate Comic Book CBZ File
-P --pdf generate PDF file
--rm-origin-dir remove downloaded doujinshi dir when generated CBZ
or PDF file.
# nHentai options
--cookie=COOKIE set cookie of nhentai to bypass Google recaptcha
-P, --pdf generate PDF file
--rm-origin-dir remove downloaded doujinshi dir when generated CBZ or
PDF file.
--meta generate a metadata file in doujinshi format
--regenerate-cbz regenerate the cbz file if exists
# nhentai options, set cookie, user-agent, language, remove caches, histories, etc
--cookie=COOKIE set cookie of nhentai to bypass Cloudflare captcha
--useragent=USERAGENT
set useragent to bypass Cloudflare captcha
--language=LANGUAGE set default language to parse doujinshis
--clean-language set DEFAULT as language to parse doujinshis
--save-download-history
save downloaded doujinshis, whose will be skipped if
you re-download them
--clean-download-history
clean download history
--template=VIEWER_TEMPLATE
set viewer template
==============
nHentai Mirror
@ -199,14 +238,6 @@ Set `NHENTAI` env var to your nhentai mirror.
:alt: nhentai
:align: center
============
あなたも変態
============
.. image:: ./images/image.jpg?raw=true
:alt: nhentai
:align: center
.. |travis| image:: https://travis-ci.org/RicterZ/nhentai.svg?branch=master
:target: https://travis-ci.org/RicterZ/nhentai

Binary file not shown.

Before

Width:  |  Height:  |  Size: 34 KiB

BIN
images/usage.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 679 KiB

View File

@ -1,3 +1,3 @@
__version__ = '0.4.15'
__version__ = '0.4.18'
__author__ = 'RicterZ'
__email__ = 'ricterzheng@gmail.com'

View File

@ -71,9 +71,9 @@ def cmd_parser():
help='all search results')
parser.add_option('--page', '--page-range', type='string', dest='page', action='store', default='',
help='page number of search results. e.g. 1,2-5,14')
parser.add_option('--sorting', dest='sorting', action='store', default='recent',
parser.add_option('--sorting', dest='sorting', action='store', default='popular',
help='sorting of doujinshi (recent / popular / popular-[today|week])',
choices=['recent', 'popular', 'popular-today', 'popular-week'])
choices=['recent', 'popular', 'popular-today', 'popular-week', 'date'])
# download options
parser.add_option('--output', '-o', type='string', dest='output_dir', action='store', default='./',
@ -86,9 +86,10 @@ def cmd_parser():
help='slow down between downloading every doujinshi')
parser.add_option('--proxy', type='string', dest='proxy', action='store',
help='store a proxy, for example: -p \'http://127.0.0.1:1080\'')
parser.add_option('--file', '-f', type='string', dest='file', action='store', help='read gallery IDs from file.')
parser.add_option('--file', '-f', type='string', dest='file', action='store', help='read gallery IDs from file.')
parser.add_option('--format', type='string', dest='name_format', action='store',
help='format the saved folder name', default='[%i][%a][%t]')
parser.add_option('--dry-run', '-r', action='store_true', dest='dryrun', help='Dry run, skip file download.')
# generate options
parser.add_option('--html', dest='html_viewer', action='store_true',
@ -103,10 +104,16 @@ def cmd_parser():
help='generate PDF file')
parser.add_option('--rm-origin-dir', dest='rm_origin_dir', action='store_true', default=False,
help='remove downloaded doujinshi dir when generated CBZ or PDF file.')
parser.add_option('--meta', dest='generate_metadata', action='store_true',
help='generate a metadata file in doujinshi format')
parser.add_option('--regenerate-cbz', dest='regenerate_cbz', action='store_true', default=False,
help='regenerate the cbz file if exists')
# nhentai options
parser.add_option('--cookie', type='str', dest='cookie', action='store',
help='set cookie of nhentai to bypass Google recaptcha')
help='set cookie of nhentai to bypass Cloudflare captcha')
parser.add_option('--useragent', '--user-agent', type='str', dest='useragent', action='store',
help='set useragent to bypass Cloudflare captcha')
parser.add_option('--language', type='str', dest='language', action='store',
help='set default language to parse doujinshis')
parser.add_option('--clean-language', dest='clean_language', action='store_true', default=False,
@ -117,6 +124,8 @@ def cmd_parser():
help='clean download history')
parser.add_option('--template', dest='viewer_template', action='store',
help='set viewer template', default='')
parser.add_option('--legacy', dest='legacy', action='store_true', default=False,
help='use legacy searching method')
try:
sys.argv = [unicode(i.decode(sys.stdin.encoding)) for i in sys.argv]
@ -128,7 +137,7 @@ def cmd_parser():
args, _ = parser.parse_args(sys.argv[1:])
if args.html_viewer:
generate_html()
generate_html(template=constant.CONFIG['template'])
exit(0)
if args.main_viewer and not args.id and not args.keyword and not args.favorites:
@ -145,20 +154,24 @@ def cmd_parser():
# --- set config ---
if args.cookie is not None:
constant.CONFIG['cookie'] = args.cookie
write_config()
logger.info('Cookie saved.')
write_config()
exit(0)
if args.language is not None:
constant.CONFIG['language'] = args.language
logger.info('Default language now set to \'{0}\''.format(args.language))
elif args.useragent is not None:
constant.CONFIG['useragent'] = args.useragent
write_config()
logger.info('User-Agent saved.')
exit(0)
elif args.language is not None:
constant.CONFIG['language'] = args.language
write_config()
logger.info('Default language now set to \'{0}\''.format(args.language))
exit(0)
# TODO: search without language
if args.proxy is not None:
proxy_url = urlparse(args.proxy)
if not args.proxy == '' and proxy_url.scheme not in ('http', 'https'):
if not args.proxy == '' and proxy_url.scheme not in ('http', 'https', 'socks5', 'socks5h', 'socks4', 'socks4a'):
logger.error('Invalid protocol \'{0}\' of proxy, ignored'.format(proxy_url.scheme))
exit(0)
else:
@ -203,7 +216,7 @@ def cmd_parser():
parser.print_help()
exit(1)
if not args.keyword and not args.id and not args.favorites:
if not args.keyword and not args.id and not args.favorites:
parser.print_help()
exit(1)
@ -214,4 +227,8 @@ def cmd_parser():
logger.critical('Maximum number of used threads is 15')
exit(1)
if args.dryrun and (args.is_cbz or args.is_pdf):
logger.critical('Cannot generate PDF or CBZ during dry-run')
exit(1)
return args

View File

@ -8,12 +8,12 @@ import time
from nhentai import constant
from nhentai.cmdline import cmd_parser, banner
from nhentai.parser import doujinshi_parser, search_parser, print_doujinshi, favorites_parser
from nhentai.parser import doujinshi_parser, search_parser, legacy_search_parser, print_doujinshi, favorites_parser
from nhentai.doujinshi import Doujinshi
from nhentai.downloader import Downloader
from nhentai.logger import logger
from nhentai.constant import BASE_URL
from nhentai.utils import generate_html, generate_cbz, generate_main_html, generate_pdf, \
from nhentai.utils import generate_html, generate_cbz, generate_main_html, generate_pdf, generate_metadata_file, \
paging, check_cookie, signal_handler, DB
@ -55,8 +55,10 @@ def main():
if constant.CONFIG['language']:
logger.info('Using default language: {0}'.format(constant.CONFIG['language']))
options.keyword += ' language:{}'.format(constant.CONFIG['language'])
doujinshis = search_parser(options.keyword, sorting=options.sorting, page=page_list,
is_page_all=options.page_all)
_search_parser = legacy_search_parser if options.legacy else search_parser
doujinshis = _search_parser(options.keyword, sorting=options.sorting, page=page_list,
is_page_all=options.page_all)
elif not doujinshi_ids:
doujinshi_ids = options.id
@ -71,27 +73,25 @@ def main():
doujinshi_ids = list(set(map(int, doujinshi_ids)) - set(data))
if doujinshi_ids:
for i, id_ in enumerate(doujinshi_ids):
if options.delay:
time.sleep(options.delay)
doujinshi_info = doujinshi_parser(id_)
if doujinshi_info:
doujinshi_list.append(Doujinshi(name_format=options.name_format, **doujinshi_info))
if (i + 1) % 10 == 0:
logger.info('Progress: %d / %d' % (i + 1, len(doujinshi_ids)))
if not options.is_show:
downloader = Downloader(path=options.output_dir, size=options.threads,
timeout=options.timeout, delay=options.delay)
for doujinshi in doujinshi_list:
for doujinshi_id in doujinshi_ids:
doujinshi_info = doujinshi_parser(doujinshi_id)
if doujinshi_info:
doujinshi = Doujinshi(name_format=options.name_format, **doujinshi_info)
else:
continue
if not options.dryrun:
doujinshi.downloader = downloader
doujinshi.download(regenerate_cbz=options.regenerate_cbz)
if options.generate_metadata:
table = doujinshi.table
generate_metadata_file(options.output_dir, table, doujinshi)
doujinshi.downloader = downloader
doujinshi.download()
if options.is_save_download_history:
with DB() as db:
db.add_one(doujinshi.id)
@ -112,11 +112,16 @@ def main():
logger.log(15, 'All done.')
else:
[doujinshi.show() for doujinshi in doujinshi_list]
for doujinshi_id in doujinshi_ids:
doujinshi_info = doujinshi_parser(doujinshi_id)
if doujinshi_info:
doujinshi = Doujinshi(name_format=options.name_format, **doujinshi_info)
else:
continue
doujinshi.show()
signal.signal(signal.SIGINT, signal_handler)
if __name__ == '__main__':
main()

View File

@ -14,6 +14,7 @@ BASE_URL = os.getenv('NHENTAI', 'https://nhentai.net')
__api_suspended_DETAIL_URL = '%s/api/gallery' % BASE_URL
DETAIL_URL = '%s/g' % BASE_URL
LEGACY_SEARCH_URL = '%s/search/' % BASE_URL
SEARCH_URL = '%s/api/galleries/search' % BASE_URL
@ -34,6 +35,7 @@ CONFIG = {
'cookie': '',
'language': '',
'template': '',
'useragent': 'nhentai command line client (https://github.com/RicterZ/nhentai)'
}
LANGUAGEISO ={

View File

@ -26,8 +26,10 @@ class DoujinshiInfo(dict):
class Doujinshi(object):
def __init__(self, name=None, id=None, img_id=None, ext='', pages=0, name_format='[%i][%a][%t]', **kwargs):
def __init__(self, name=None, pretty_name=None, id=None, img_id=None,
ext='', pages=0, name_format='[%i][%a][%t]', **kwargs):
self.name = name
self.pretty_name = pretty_name
self.id = id
self.img_id = img_id
self.ext = ext
@ -36,17 +38,15 @@ class Doujinshi(object):
self.url = '%s/%d' % (DETAIL_URL, self.id)
self.info = DoujinshiInfo(**kwargs)
name_format = name_format.replace('%i', str(self.id))
name_format = name_format.replace('%a', self.info.artists)
name_format = name_format.replace('%t', self.name)
name_format = name_format.replace('%s', self.info.subtitle)
self.filename = format_filename(name_format)
name_format = name_format.replace('%i', format_filename(str(self.id)))
name_format = name_format.replace('%a', format_filename(self.info.artists))
def __repr__(self):
return '<Doujinshi: {0}>'.format(self.name)
name_format = name_format.replace('%t', format_filename(self.name))
name_format = name_format.replace('%p', format_filename(self.pretty_name))
name_format = name_format.replace('%s', format_filename(self.info.subtitle))
self.filename = format_filename(name_format, 255, True)
def show(self):
table = [
self.table = [
["Parodies", self.info.parodies],
["Doujinshi", self.name],
["Subtitle", self.info.subtitle],
@ -57,26 +57,25 @@ class Doujinshi(object):
["URL", self.url],
["Pages", self.pages],
]
logger.info(u'Print doujinshi information of {0}\n{1}'.format(self.id, tabulate(table)))
def download(self):
def __repr__(self):
return '<Doujinshi: {0}>'.format(self.name)
def show(self):
logger.info(u'Print doujinshi information of {0}\n{1}'.format(self.id, tabulate(self.table)))
def download(self, regenerate_cbz=False):
logger.info('Starting to download doujinshi: %s' % self.name)
if self.downloader:
download_queue = []
if len(self.ext) != self.pages:
logger.warning('Page count and ext count do not equal')
for i in range(1, min(self.pages, len(self.ext)) + 1):
download_queue.append('%s/%d/%d.%s' % (IMAGE_URL, int(self.img_id), i, self.ext[i-1]))
self.downloader.download(download_queue, self.filename)
'''
for i in range(len(self.ext)):
download_queue.append('%s/%d/%d.%s' % (IMAGE_URL, int(self.img_id), i+1, EXT_MAP[self.ext[i]]))
'''
download_queue.append('%s/%d/%d.%s' % (IMAGE_URL, int(self.img_id), i, self.ext[i - 1]))
self.downloader.download(download_queue, self.filename, regenerate_cbz=regenerate_cbz)
else:
logger.critical('Downloader has not been loaded')

View File

@ -113,13 +113,18 @@ class Downloader(Singleton):
else:
logger.log(15, '{0} downloaded successfully'.format(data))
def download(self, queue, folder=''):
def download(self, queue, folder='', regenerate_cbz=False):
if not isinstance(folder, text):
folder = str(folder)
if self.path:
folder = os.path.join(self.path, folder)
if os.path.exists(folder + '.cbz'):
if not regenerate_cbz:
logger.warning('CBZ file \'{}.cbz\' exists, ignored download request'.format(folder))
return
if not os.path.exists(folder):
logger.warning('Path \'{0}\' does not exist, creating.'.format(folder))
try:

View File

@ -133,13 +133,16 @@ def doujinshi_parser(id_):
doujinshi_info = html.find('div', attrs={'id': 'info'})
title = doujinshi_info.find('h1').text
pretty_name = doujinshi_info.find('h1').find('span', attrs={'class': 'pretty'}).text
subtitle = doujinshi_info.find('h2')
doujinshi['name'] = title
doujinshi['pretty_name'] = pretty_name
doujinshi['subtitle'] = subtitle.text if subtitle else ''
doujinshi_cover = html.find('div', attrs={'id': 'cover'})
img_id = re.search('/galleries/([\d]+)/cover\.(jpg|png|gif)$', doujinshi_cover.a.img.attrs['data-src'])
img_id = re.search('/galleries/([0-9]+)/cover.(jpg|png|gif)$',
doujinshi_cover.a.img.attrs['data-src'])
ext = []
for i in html.find_all('div', attrs={'class': 'thumb-container'}):
@ -174,9 +177,11 @@ def doujinshi_parser(id_):
return doujinshi
def old_search_parser(keyword, sorting='date', page=1):
def legacy_search_parser(keyword, sorting='date', page=1, is_page_all=False):
logger.warning('Using legacy searching method, `--all` options will not be supported')
logger.debug('Searching doujinshis of keyword {0}'.format(keyword))
response = request('get', url=constant.SEARCH_URL, params={'q': keyword, 'page': page, 'sort': sorting}).content
response = request('get', url=constant.LEGACY_SEARCH_URL,
params={'q': keyword, 'page': page, 'sort': sorting}).content
result = _get_title_and_id(response)
if not result:
@ -197,6 +202,7 @@ def print_doujinshi(doujinshi_list):
def search_parser(keyword, sorting, page, is_page_all=False):
# keyword = '+'.join([i.strip().replace(' ', '-').lower() for i in keyword.split(',')])
result = []
response = None
if not page:
page = [1]
@ -206,6 +212,7 @@ def search_parser(keyword, sorting, page, is_page_all=False):
page = range(1, init_response['num_pages']+1)
total = '/{0}'.format(page[-1]) if is_page_all else ''
not_exists_persist = False
for p in page:
i = 0
@ -217,18 +224,21 @@ def search_parser(keyword, sorting, page, is_page_all=False):
response = request('get', url.replace('%2B', '+')).json()
except Exception as e:
logger.critical(str(e))
response = None
break
if 'result' not in response:
if response is None or 'result' not in response:
logger.warning('No result in response in page {}'.format(p))
break
if not_exists_persist is True:
break
continue
for row in response['result']:
title = row['title']['english']
title = title[:85] + '..' if len(title) > 85 else title
result.append({'id': row['id'], 'title': title})
not_exists_persist = False
if not result:
logger.warning('No results for keywords {}'.format(keyword))

View File

@ -4,6 +4,7 @@ import os
from xml.sax.saxutils import escape
from nhentai.constant import LANGUAGEISO
def serialize_json(doujinshi, dir):
metadata = {'title': doujinshi.name,
'subtitle': doujinshi.info.subtitle}
@ -26,12 +27,12 @@ def serialize_json(doujinshi, dir):
metadata['Pages'] = doujinshi.pages
with open(os.path.join(dir, 'metadata.json'), 'w') as f:
json.dump(metadata, f, separators=','':')
json.dump(metadata, f, separators=(',', ':'))
def serialize_comicxml(doujinshi, dir):
def serialize_comic_xml(doujinshi, dir):
from iso8601 import parse_date
with open(os.path.join(dir, 'ComicInfo.xml'), 'w') as f:
with open(os.path.join(dir, 'ComicInfo.xml'), 'w', encoding="utf-8") as f:
f.write('<?xml version="1.0" encoding="utf-8"?>\n')
f.write('<ComicInfo xmlns:xsd="http://www.w3.org/2001/XMLSchema" '
'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">\n')
@ -45,7 +46,8 @@ def serialize_comicxml(doujinshi, dir):
xml_write_simple_tag(f, 'NhentaiId', doujinshi.id)
xml_write_simple_tag(f, 'Genre', doujinshi.info.categories)
xml_write_simple_tag(f, 'BlackAndWhite', 'No' if doujinshi.info.tags and 'full color' in doujinshi.info.tags else 'Yes')
xml_write_simple_tag(f, 'BlackAndWhite', 'No' if doujinshi.info.tags and
'full color' in doujinshi.info.tags else 'Yes')
if doujinshi.info.date:
dt = parse_date(doujinshi.info.date)
@ -59,13 +61,13 @@ def serialize_comicxml(doujinshi, dir):
if doujinshi.info.tags:
xml_write_simple_tag(f, 'Tags', doujinshi.info.tags)
if doujinshi.info.artists:
xml_write_simple_tag(f, 'Writer', ' & '.join([i.strip() for i in doujinshi.info.artists.split(',')]))
# if doujinshi.info.groups:
# metadata['group'] = [i.strip() for i in doujinshi.info.groups.split(',')]
xml_write_simple_tag(f, 'Writer', ' & '.join([i.strip() for i in
doujinshi.info.artists.split(',')]))
if doujinshi.info.languages:
languages = [i.strip() for i in doujinshi.info.languages.split(',')]
xml_write_simple_tag(f, 'Translated', 'Yes' if 'translated' in languages else 'No')
[xml_write_simple_tag(f, 'LanguageISO', LANGUAGEISO[i]) for i in languages \
[xml_write_simple_tag(f, 'LanguageISO', LANGUAGEISO[i]) for i in languages
if (i != 'translated' and i in LANGUAGEISO)]
f.write('</ComicInfo>')
@ -121,7 +123,7 @@ def serialize_unique(lst):
def set_js_database():
with open('data.js', 'w') as f:
indexed_json = merge_json()
unique_json = json.dumps(serialize_unique(indexed_json), separators=','':')
indexed_json = json.dumps(indexed_json, separators=','':')
unique_json = json.dumps(serialize_unique(indexed_json), separators=(',', ':'))
indexed_json = json.dumps(indexed_json, separators=(',', ':'))
f.write('var data = ' + indexed_json)
f.write(';\nvar tags = ' + unique_json)

View File

@ -10,14 +10,17 @@ import sqlite3
from nhentai import constant
from nhentai.logger import logger
from nhentai.serializer import serialize_json, serialize_comicxml, set_js_database
from nhentai.serializer import serialize_json, serialize_comic_xml, set_js_database
MAX_FIELD_LENGTH = 100
def request(method, url, **kwargs):
session = requests.Session()
session.headers.update({
'Referer': constant.LOGIN_URL,
'User-Agent': 'nhentai command line client (https://github.com/RicterZ/nhentai)',
'User-Agent': constant.CONFIG['useragent'],
'Cookie': constant.CONFIG['cookie']
})
@ -28,10 +31,14 @@ def request(method, url, **kwargs):
def check_cookie():
response = request('get', constant.BASE_URL).text
username = re.findall('"/users/\d+/(.*?)"', response)
response = request('get', constant.BASE_URL)
if response.status_code == 503 and 'cf-browser-verification' in response.text:
logger.error('Blocked by Cloudflare captcha, please set your cookie and useragent')
exit(-1)
username = re.findall('"/users/\d+/(.*?)"', response.text)
if not username:
logger.error('Cannot get your username, please check your cookie or use `nhentai --cookie` to set your cookie')
logger.warning('Cannot get your username, please check your cookie or use `nhentai --cookie` to set your cookie')
else:
logger.info('Login successfully! Your username: {}'.format(username[0]))
@ -74,6 +81,13 @@ def generate_html(output_dir='.', doujinshi_obj=None, template='default'):
else:
doujinshi_dir = '.'
if not os.path.exists(doujinshi_dir):
logger.warning('Path \'{0}\' does not exist, creating.'.format(doujinshi_dir))
try:
os.makedirs(doujinshi_dir)
except EnvironmentError as e:
logger.critical('{0}'.format(str(e)))
file_list = os.listdir(doujinshi_dir)
file_list.sort()
@ -81,7 +95,7 @@ def generate_html(output_dir='.', doujinshi_obj=None, template='default'):
if not os.path.splitext(image)[1] in ('.jpg', '.png'):
continue
image_html += '<img src="{0}" class="image-item"/>\n'\
image_html += '<img src="{0}" class="image-item"/>\n' \
.format(image)
html = readfile('viewer/{}/index.html'.format(template))
css = readfile('viewer/{}/styles.css'.format(template))
@ -162,7 +176,7 @@ def generate_main_html(output_dir='./'):
else:
with open('./main.html', 'wb') as f:
f.write(data.encode('utf-8'))
shutil.copy(os.path.dirname(__file__)+'/viewer/logo.png', './')
shutil.copy(os.path.dirname(__file__) + '/viewer/logo.png', './')
set_js_database()
logger.log(
15, 'Main Viewer has been written to \'{0}main.html\''.format(output_dir))
@ -174,7 +188,7 @@ def generate_cbz(output_dir='.', doujinshi_obj=None, rm_origin_dir=False, write_
if doujinshi_obj is not None:
doujinshi_dir = os.path.join(output_dir, doujinshi_obj.filename)
if write_comic_info:
serialize_comicxml(doujinshi_obj, doujinshi_dir)
serialize_comic_xml(doujinshi_obj, doujinshi_dir)
cbz_filename = os.path.join(os.path.join(doujinshi_dir, '..'), '{}.cbz'.format(doujinshi_obj.filename))
else:
cbz_filename = './doujinshi.cbz'
@ -198,7 +212,7 @@ def generate_cbz(output_dir='.', doujinshi_obj=None, rm_origin_dir=False, write_
def generate_pdf(output_dir='.', doujinshi_obj=None, rm_origin_dir=False):
try:
import img2pdf
"""Write images to a PDF file using img2pdf."""
if doujinshi_obj is not None:
doujinshi_dir = os.path.join(output_dir, doujinshi_obj.filename)
@ -224,10 +238,11 @@ def generate_pdf(output_dir='.', doujinshi_obj=None, rm_origin_dir=False):
shutil.rmtree(doujinshi_dir, ignore_errors=True)
logger.log(15, 'PDF file has been written to \'{0}\''.format(doujinshi_dir))
except ImportError:
logger.error("Please install img2pdf package by using pip.")
def unicode_truncate(s, length, encoding='utf-8'):
"""https://stackoverflow.com/questions/1809531/truncating-unicode-so-it-fits-a-maximum-size-when-encoded-for-wire-transfer
"""
@ -235,7 +250,7 @@ def unicode_truncate(s, length, encoding='utf-8'):
return encoded.decode(encoding, 'ignore')
def format_filename(s):
def format_filename(s, length=MAX_FIELD_LENGTH, _truncate_only=False):
"""
It used to be a whitelist approach allowed only alphabet and a part of symbols.
but most doujinshi's names include Japanese 2-byte characters and these was rejected.
@ -243,16 +258,20 @@ def format_filename(s):
if filename include forbidden characters (\'/:,;*?"<>|) ,it replace space character(' ').
"""
# maybe you can use `--format` to select a suitable filename
ban_chars = '\\\'/:,;*?"<>|\t'
filename = s.translate(str.maketrans(ban_chars, ' '*len(ban_chars))).strip()
filename = ' '.join(filename.split())
print(repr(filename))
while filename.endswith('.'):
filename = filename[:-1]
if not _truncate_only:
ban_chars = '\\\'/:,;*?"<>|\t'
filename = s.translate(str.maketrans(ban_chars, ' ' * len(ban_chars))).strip()
filename = ' '.join(filename.split())
if len(filename) > 100:
filename = filename[:100] + u''
while filename.endswith('.'):
filename = filename[:-1]
else:
filename = s
# limit `length` chars
if len(filename) >= length:
filename = filename[:length - 1] + u''
# Remove [] from filename
filename = filename.replace('[]', '').strip()
@ -275,7 +294,7 @@ def paging(page_string):
start, end = i.split('-')
if not (start.isdigit() and end.isdigit()):
raise Exception('Invalid page number')
page_list.extend(list(range(int(start), int(end)+1)))
page_list.extend(list(range(int(start), int(end) + 1)))
else:
if not i.isdigit():
raise Exception('Invalid page number')
@ -284,6 +303,34 @@ def paging(page_string):
return page_list
def generate_metadata_file(output_dir, table, doujinshi_obj=None):
logger.info('Writing Metadata Info')
if doujinshi_obj is not None:
doujinshi_dir = os.path.join(output_dir, doujinshi_obj.filename)
else:
doujinshi_dir = '.'
logger.info(doujinshi_dir)
f = open(os.path.join(doujinshi_dir, 'info.txt'), 'w', encoding='utf-8')
fields = ['TITLE', 'ORIGINAL TITLE', 'AUTHOR', 'ARTIST', 'CIRCLE', 'SCANLATOR',
'TRANSLATOR', 'PUBLISHER', 'DESCRIPTION', 'STATUS', 'CHAPTERS', 'PAGES',
'TAGS', 'TYPE', 'LANGUAGE', 'RELEASED', 'READING DIRECTION', 'CHARACTERS',
'SERIES', 'PARODY', 'URL']
special_fields = ['PARODY', 'TITLE', 'ORIGINAL TITLE', 'CHARACTERS', 'AUTHOR',
'LANGUAGE', 'TAGS', 'URL', 'PAGES']
for i in range(len(fields)):
f.write('{}: '.format(fields[i]))
if fields[i] in special_fields:
f.write(str(table[special_fields.index(fields[i])][1]))
f.write('\n')
f.close()
class DB(object):
conn = None
cur = None

View File

@ -0,0 +1,25 @@
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1, user-scalable=yes, viewport-fit=cover" />
<title>{TITLE}</title>
<style>
{STYLES}
</style>
</head>
<body>
<nav id="list" hidden=true>
{IMAGES}</nav>
<div id="image-container">
<div id="dest"></div>
<span id="page-num"></span>
</div>
<script>
{SCRIPTS}
</script>
</body>
</html>

View File

@ -0,0 +1,79 @@
const pages = Array.from(document.querySelectorAll('img.image-item'));
let currentPage = 0;
function changePage(pageNum) {
const previous = pages[currentPage];
const current = pages[pageNum];
if (current == null) {
return;
}
previous.classList.remove('current');
current.classList.add('current');
currentPage = pageNum;
const display = document.getElementById('dest');
display.style.backgroundImage = `url("${current.src}")`;
scroll(0,0)
document.getElementById('page-num')
.innerText = [
(pageNum + 1).toLocaleString(),
pages.length.toLocaleString()
].join('\u200a/\u200a');
}
changePage(0);
document.getElementById('image-container').onclick = event => {
const width = document.getElementById('image-container').clientWidth;
const clickPos = event.clientX / width;
if (clickPos < 0.5) {
changePage(currentPage - 1);
} else {
changePage(currentPage + 1);
}
};
document.onkeypress = event => {
switch (event.key.toLowerCase()) {
// Previous Image
case 'w':
scrollBy(0, -40);
break;
case 'a':
changePage(currentPage - 1);
break;
// Return to previous page
case 'q':
window.history.go(-1);
break;
// Next Image
case ' ':
case 's':
scrollBy(0, 40);
break;
case 'd':
changePage(currentPage + 1);
break;
}// remove arrow cause it won't work
};
document.onkeydown = event =>{
switch (event.keyCode) {
case 37: //left
changePage(currentPage - 1);
break;
case 38: //up
break;
case 39: //right
changePage(currentPage + 1);
break;
case 40: //down
break;
}
};

View File

@ -0,0 +1,75 @@
*, *::after, *::before {
box-sizing: border-box;
}
img {
vertical-align: middle;
}
html, body {
display: flex;
background-color: #e8e6e6;
height: 100%;
width: 100%;
padding: 0;
margin: 0;
font-family: sans-serif;
}
#list {
height: 2000px;
overflow: scroll;
width: 260px;
text-align: center;
}
#list img {
width: 200px;
padding: 10px;
border-radius: 10px;
margin: 15px 0;
cursor: pointer;
}
#list img.current {
background: #0003;
}
#image-container {
flex: auto;
height: 100%;
background: rgb(0, 0, 0);
color: rgb(100, 100, 100);
text-align: center;
cursor: pointer;
-webkit-user-select: none;
user-select: none;
position: relative;
}
#image-container #dest {
height: 2000px;
width: 100%;
background-size: contain;
background-repeat: no-repeat;
background-position: top;
margin-left: auto;
margin-right: auto;
max-width: 100%;
max-height: 100vh;
margin: auto;
}
#image-container #page-num {
position: static;
font-size: 9pt;
left: 10px;
bottom: 5px;
font-weight: bold;
opacity: 0.9;
text-shadow: /* Duplicate the same shadow to make it very strong */
0 0 2px #222,
0 0 2px #222,
0 0 2px #222;
}