0.4.18 bug fixes

solve #251
add usage images
2025-07-02 00:19:29 +02:00 · 2023-02-04 20:24:53 +08:00 · 2023-02-04 20:22:57 +08:00 · 2023-02-04 20:09:51 +08:00 · 2023-02-04 20:09:46 +08:00 · 2023-02-04 19:55:51 +08:00
17 changed files with 420 additions and 121 deletions
--- a/.gitignore
+++ b/.gitignore
@ -7,3 +7,4 @@ dist/
 .DS_Store
 output/
 venv/
+.vscode/
--- a/MANIFEST.in
+++ b/MANIFEST.in
@ -1,4 +1,5 @@
 include README.md
 include requirements.txt
 include nhentai/viewer/*
-include nhentai/viewer/default/*
+include nhentai/viewer/default/*
+include nhentai/viewer/minimal/*
--- a/README.rst
+++ b/README.rst
@ -50,11 +50,18 @@ Installation (Gentoo)

    layman -fa glicOne
    sudo emerge net-misc/nhentai
+    
+=====================
+Installation (NixOS)
+=====================
+.. code-block::

+    nix-env -iA nixos.nhentai
+    
 =====
 Usage
 =====
-**IMPORTANT**: To bypass the nhentai frequency limit, you should use `--cookie` option to store your cookie.
+**⚠️IMPORTANT⚠️**: To bypass the nhentai frequency limit, you should use `--cookie` and `--useragent` options to store your cookie and your user-agent.

 *The default download folder will be the path where you run the command (CLI path).*

@ -63,9 +70,13 @@ Set your nhentai cookie against captcha:

 .. code-block:: bash

+    nhentai --useragent "USER AGENT of YOUR BROWSER"
    nhentai --cookie "YOUR COOKIE FROM nhentai.net"

-**NOTE**: The format of the cookie is `"csrftoken=TOKEN; sessionid=ID"`
+**NOTE**
+
+- The format of the cookie is `"csrftoken=TOKEN; sessionid=ID; cf_clearance=CLOUDFLARE"`
+- The `cf_clearance` cookie and user agent must be set if you encounter the "blocked by cloudflare captcha" error. Make sure you use the same IP address and user agent as when you obtained the cookie.

 | To get csrftoken and sessionid, first login to your nhentai account in web browser, then:
 | (Chrome) |ve| |ld| More tools    |ld| Developer tools     |ld| Application |ld| Storage |ld| Cookies |ld| https://nhentai.net
@ -76,6 +87,10 @@ Set your nhentai cookie against captcha:
 .. |ve| unicode:: U+22EE .. https://www.compart.com/en/unicode/U+22EE
 .. |ld| unicode:: U+2014 .. https://www.compart.com/en/unicode/U+2014

+.. image:: ./images/usage.png?raw=true
+    :alt: nhentai
+    :align: center
+
 Download specified doujinshi:

 .. code-block:: bash
@ -121,30 +136,41 @@ Supported doujinshi folder formatter:
 - %t: Doujinshi name
 - %s: Doujinshi subtitle (translated name)
 - %a: Doujinshi authors' name
+- %p: Doujinshi pretty name


 Other options:

 .. code-block::

+    Usage:
+      nhentai --search [keyword] --download
+      NHENTAI=http://h.loli.club nhentai --id [ID ...]
+      nhentai --file [filename]
+
+    Environment Variable:
+      NHENTAI                 nhentai mirror url
+
    Options:
-      # Operation options
+      # Operation options, control the program behaviors
      -h, --help            show this help message and exit
      -D, --download        download doujinshi (for search results)
      -S, --show            just show the doujinshi information

-      # Doujinshi options
+      # Doujinshi options, specify id, keyword, etc.
      --id=ID               doujinshi ids set, e.g. 1,2,3
      -s KEYWORD, --search=KEYWORD
                            search doujinshi by keyword
-      --tag=TAG             download doujinshi by tag
      -F, --favorites       list or download your favorites.

-      # Multi-page options
-      --page=PAGE           page number of search results
-      --max-page=MAX_PAGE   The max page when recursive download tagged doujinshi
+      # Page options, control the page to fetch / download
+      --page-all            all search results
+      --page=PAGE, --page-range=PAGE
+                            page number of search results. e.g. 1,2-5,14
+      --sorting=SORTING     sorting of doujinshi (recent / popular /
+                            popular-[today|week])

-      # Download options
+      # Download options, the output directory, threads, timeout, delay, etc.
      -o OUTPUT_DIR, --output=OUTPUT_DIR
                            output dir
      -t THREADS, --threads=THREADS
@ -153,23 +179,36 @@ Other options:
                            timeout for downloading doujinshi
      -d DELAY, --delay=DELAY
                            slow down between downloading every doujinshi
-      -p PROXY, --proxy=PROXY
-                            uses a proxy, for example: http://127.0.0.1:1080
+      --proxy=PROXY         store a proxy, for example: --proxy 'http://127.0.0.1:1080'
      -f FILE, --file=FILE  read gallery IDs from file.
      --format=NAME_FORMAT  format the saved folder name
+      -r, --dry-run         Dry run, skip file download.

-      # Generating options
+      # Generate options, for generating html viewer, cbz file, pdf file, etc
      --html                generate a html viewer at current directory
      --no-html             don't generate HTML after downloading
-      --gen-main            generate a main viewer contain all the doujin in the folder
+      --gen-main            generate a main viewer containing all the doujin in the
+                            folder
      -C, --cbz             generate Comic Book CBZ File
-      -P --pdf              generate PDF file
-      --rm-origin-dir       remove downloaded doujinshi dir when generated CBZ
-                            or PDF file.
-
-      # nHentai options
-      --cookie=COOKIE       set cookie of nhentai to bypass Google recaptcha
+      -P, --pdf             generate PDF file
+      --rm-origin-dir       remove downloaded doujinshi dir when generated CBZ or
+                            PDF file.
+      --meta                generate a metadata file in doujinshi format
+      --regenerate-cbz      regenerate the cbz file if exists

+      # nhentai options, set cookie, user-agent, language, remove caches, histories, etc
+      --cookie=COOKIE       set cookie of nhentai to bypass Cloudflare captcha
+      --useragent=USERAGENT
+                            set useragent to bypass Cloudflare captcha
+      --language=LANGUAGE   set default language to parse doujinshis
+      --clean-language      set DEFAULT as language to parse doujinshis
+      --save-download-history
+                            save downloaded doujinshis, which will be skipped if
+                            you re-download them
+      --clean-download-history
+                            clean download history
+      --template=VIEWER_TEMPLATE
+                            set viewer template

 ==============
 nHentai Mirror
@ -199,14 +238,6 @@ Set `NHENTAI` env var to your nhentai mirror.
    :alt: nhentai
    :align: center

-============
-あなたも変態
-============
-.. image:: ./images/image.jpg?raw=true
-    :alt: nhentai
-    :align: center
-
-

 .. |travis| image:: https://travis-ci.org/RicterZ/nhentai.svg?branch=master
   :target: https://travis-ci.org/RicterZ/nhentai
--- a/images/image.jpg
+++ b/images/image.jpg
--- a/images/usage.png
+++ b/images/usage.png
--- a/nhentai/__init__.py
+++ b/nhentai/__init__.py
@ -1,3 +1,3 @@
-__version__ = '0.4.15'
+__version__ = '0.4.18'
 __author__ = 'RicterZ'
 __email__ = 'ricterzheng@gmail.com'
--- a/nhentai/cmdline.py
+++ b/nhentai/cmdline.py
@ -71,9 +71,9 @@ def cmd_parser():
                      help='all search results')
    parser.add_option('--page', '--page-range', type='string', dest='page', action='store', default='',
                      help='page number of search results. e.g. 1,2-5,14')
-    parser.add_option('--sorting', dest='sorting', action='store', default='recent',
+    parser.add_option('--sorting', dest='sorting', action='store', default='popular',
                      help='sorting of doujinshi (recent / popular / popular-[today|week])',
-                      choices=['recent', 'popular', 'popular-today', 'popular-week'])
+                      choices=['recent', 'popular', 'popular-today', 'popular-week', 'date'])

    # download options
    parser.add_option('--output', '-o', type='string', dest='output_dir', action='store', default='./',
@ -86,9 +86,10 @@ def cmd_parser():
                      help='slow down between downloading every doujinshi')
    parser.add_option('--proxy', type='string', dest='proxy', action='store',
                      help='store a proxy, for example: -p \'http://127.0.0.1:1080\'')
-    parser.add_option('--file',  '-f', type='string', dest='file', action='store', help='read gallery IDs from file.')
+    parser.add_option('--file', '-f', type='string', dest='file', action='store', help='read gallery IDs from file.')
    parser.add_option('--format', type='string', dest='name_format', action='store',
                      help='format the saved folder name', default='[%i][%a][%t]')
+    parser.add_option('--dry-run', '-r', action='store_true', dest='dryrun', help='Dry run, skip file download.')

    # generate options
    parser.add_option('--html', dest='html_viewer', action='store_true',
@ -103,10 +104,16 @@ def cmd_parser():
                      help='generate PDF file')
    parser.add_option('--rm-origin-dir', dest='rm_origin_dir', action='store_true', default=False,
                      help='remove downloaded doujinshi dir when generated CBZ or PDF file.')
+    parser.add_option('--meta', dest='generate_metadata', action='store_true',
+                      help='generate a metadata file in doujinshi format')
+    parser.add_option('--regenerate-cbz', dest='regenerate_cbz', action='store_true', default=False,
+                      help='regenerate the cbz file if exists')

    # nhentai options
    parser.add_option('--cookie', type='str', dest='cookie', action='store',
-                      help='set cookie of nhentai to bypass Google recaptcha')
+                      help='set cookie of nhentai to bypass Cloudflare captcha')
+    parser.add_option('--useragent', '--user-agent', type='str', dest='useragent', action='store',
+                      help='set useragent to bypass Cloudflare captcha')
    parser.add_option('--language', type='str', dest='language', action='store',
                      help='set default language to parse doujinshis')
    parser.add_option('--clean-language', dest='clean_language', action='store_true', default=False,
@ -117,6 +124,8 @@ def cmd_parser():
                      help='clean download history')
    parser.add_option('--template', dest='viewer_template', action='store',
                      help='set viewer template', default='')
+    parser.add_option('--legacy', dest='legacy', action='store_true', default=False,
+                      help='use legacy searching method')

    try:
        sys.argv = [unicode(i.decode(sys.stdin.encoding)) for i in sys.argv]
@ -128,7 +137,7 @@ def cmd_parser():
    args, _ = parser.parse_args(sys.argv[1:])

    if args.html_viewer:
-        generate_html()
+        generate_html(template=constant.CONFIG['template'])
        exit(0)

    if args.main_viewer and not args.id and not args.keyword and not args.favorites:
@ -145,20 +154,24 @@ def cmd_parser():
    # --- set config ---
    if args.cookie is not None:
        constant.CONFIG['cookie'] = args.cookie
+        write_config()
        logger.info('Cookie saved.')
-        write_config()
        exit(0)
-
-    if args.language is not None:
-        constant.CONFIG['language'] = args.language
-        logger.info('Default language now set to \'{0}\''.format(args.language))
+    elif args.useragent is not None:
+        constant.CONFIG['useragent'] = args.useragent
        write_config()
+        logger.info('User-Agent saved.')
+        exit(0)
+    elif args.language is not None:
+        constant.CONFIG['language'] = args.language
+        write_config()
+        logger.info('Default language now set to \'{0}\''.format(args.language))
        exit(0)
        # TODO: search without language

    if args.proxy is not None:
        proxy_url = urlparse(args.proxy)
-        if not args.proxy == '' and proxy_url.scheme not in ('http', 'https'):
+        if not args.proxy == '' and proxy_url.scheme not in ('http', 'https', 'socks5', 'socks5h', 'socks4', 'socks4a'):
            logger.error('Invalid protocol \'{0}\' of proxy, ignored'.format(proxy_url.scheme))
            exit(0)
        else:
@ -203,7 +216,7 @@ def cmd_parser():
        parser.print_help()
        exit(1)

-    if not args.keyword and not args.id and not  args.favorites:
+    if not args.keyword and not args.id and not args.favorites:
        parser.print_help()
        exit(1)

@ -214,4 +227,8 @@ def cmd_parser():
        logger.critical('Maximum number of used threads is 15')
        exit(1)

+    if args.dryrun and (args.is_cbz or args.is_pdf):
+        logger.critical('Cannot generate PDF or CBZ during dry-run')
+        exit(1)
+
    return args
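The widened proxy check in this file can be read in isolation. The following is a minimal sketch of that validation only, assuming a hypothetical helper name `is_valid_proxy`; the project itself performs the check inline in `cmd_parser()`.

```python
from urllib.parse import urlparse

# Schemes accepted by the updated check in cmd_parser().
ALLOWED_PROXY_SCHEMES = ('http', 'https', 'socks5', 'socks5h', 'socks4', 'socks4a')


def is_valid_proxy(proxy):
    """Hypothetical helper: accept an empty string (proxy unset) or an allowed scheme."""
    if proxy == '':
        return True
    return urlparse(proxy).scheme in ALLOWED_PROXY_SCHEMES


print(is_valid_proxy('socks5://127.0.0.1:1080'))  # True
print(is_valid_proxy('ftp://127.0.0.1:21'))       # False
```

Keeping the allowed schemes in one tuple makes it easy to see which proxy types the command line accepts.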
--- a/nhentai/command.py
+++ b/nhentai/command.py
@ -8,12 +8,12 @@ import time

 from nhentai import constant
 from nhentai.cmdline import cmd_parser, banner
-from nhentai.parser import doujinshi_parser, search_parser, print_doujinshi, favorites_parser
+from nhentai.parser import doujinshi_parser, search_parser, legacy_search_parser, print_doujinshi, favorites_parser
 from nhentai.doujinshi import Doujinshi
 from nhentai.downloader import Downloader
 from nhentai.logger import logger
 from nhentai.constant import BASE_URL
-from nhentai.utils import generate_html, generate_cbz, generate_main_html, generate_pdf, \
+from nhentai.utils import generate_html, generate_cbz, generate_main_html, generate_pdf, generate_metadata_file, \
    paging, check_cookie, signal_handler, DB


@ -55,8 +55,10 @@ def main():
        if constant.CONFIG['language']:
            logger.info('Using default language: {0}'.format(constant.CONFIG['language']))
            options.keyword += ' language:{}'.format(constant.CONFIG['language'])
-        doujinshis = search_parser(options.keyword, sorting=options.sorting, page=page_list,
-                                   is_page_all=options.page_all)
+
+        _search_parser = legacy_search_parser if options.legacy else search_parser
+        doujinshis = _search_parser(options.keyword, sorting=options.sorting, page=page_list,
+                                    is_page_all=options.page_all)

    elif not doujinshi_ids:
        doujinshi_ids = options.id
@ -71,27 +73,25 @@ def main():

        doujinshi_ids = list(set(map(int, doujinshi_ids)) - set(data))

-    if doujinshi_ids:
-        for i, id_ in enumerate(doujinshi_ids):
-            if options.delay:
-                time.sleep(options.delay)
-
-            doujinshi_info = doujinshi_parser(id_)
-
-            if doujinshi_info:
-                doujinshi_list.append(Doujinshi(name_format=options.name_format, **doujinshi_info))
-
-            if (i + 1) % 10 == 0:
-                logger.info('Progress: %d / %d' % (i + 1, len(doujinshi_ids)))
-
    if not options.is_show:
        downloader = Downloader(path=options.output_dir, size=options.threads,
                                timeout=options.timeout, delay=options.delay)

-        for doujinshi in doujinshi_list:
+        for doujinshi_id in doujinshi_ids:
+            doujinshi_info = doujinshi_parser(doujinshi_id)
+            if doujinshi_info:
+                doujinshi = Doujinshi(name_format=options.name_format, **doujinshi_info)
+            else:
+                continue
+
+            if not options.dryrun:
+                doujinshi.downloader = downloader
+                doujinshi.download(regenerate_cbz=options.regenerate_cbz)
+
+            if options.generate_metadata:
+                table = doujinshi.table
+                generate_metadata_file(options.output_dir, table, doujinshi)

-            doujinshi.downloader = downloader
-            doujinshi.download()
            if options.is_save_download_history:
                with DB() as db:
                    db.add_one(doujinshi.id)
@ -112,11 +112,16 @@ def main():
            logger.log(15, 'All done.')

    else:
-        [doujinshi.show() for doujinshi in doujinshi_list]
+        for doujinshi_id in doujinshi_ids:
+            doujinshi_info = doujinshi_parser(doujinshi_id)
+            if doujinshi_info:
+                doujinshi = Doujinshi(name_format=options.name_format, **doujinshi_info)
+            else:
+                continue
+            doujinshi.show()


 signal.signal(signal.SIGINT, signal_handler)

-
 if __name__ == '__main__':
    main()
--- a/nhentai/constant.py
+++ b/nhentai/constant.py
@ -14,6 +14,7 @@ BASE_URL = os.getenv('NHENTAI', 'https://nhentai.net')
 __api_suspended_DETAIL_URL = '%s/api/gallery' % BASE_URL

 DETAIL_URL = '%s/g' % BASE_URL
+LEGACY_SEARCH_URL = '%s/search/' % BASE_URL
 SEARCH_URL = '%s/api/galleries/search' % BASE_URL


@ -34,6 +35,7 @@ CONFIG = {
    'cookie': '',
    'language': '',
    'template': '',
+    'useragent': 'nhentai command line client (https://github.com/RicterZ/nhentai)'
 }

 LANGUAGEISO ={
--- a/nhentai/doujinshi.py
+++ b/nhentai/doujinshi.py
@ -26,8 +26,10 @@ class DoujinshiInfo(dict):


 class Doujinshi(object):
-    def __init__(self, name=None, id=None, img_id=None, ext='', pages=0, name_format='[%i][%a][%t]', **kwargs):
+    def __init__(self, name=None, pretty_name=None, id=None, img_id=None,
+                 ext='', pages=0, name_format='[%i][%a][%t]', **kwargs):
        self.name = name
+        self.pretty_name = pretty_name
        self.id = id
        self.img_id = img_id
        self.ext = ext
@ -36,17 +38,15 @@ class Doujinshi(object):
        self.url = '%s/%d' % (DETAIL_URL, self.id)
        self.info = DoujinshiInfo(**kwargs)

-        name_format = name_format.replace('%i', str(self.id))
-        name_format = name_format.replace('%a', self.info.artists)
-        name_format = name_format.replace('%t', self.name)
-        name_format = name_format.replace('%s', self.info.subtitle)
-        self.filename = format_filename(name_format)
+        name_format = name_format.replace('%i', format_filename(str(self.id)))
+        name_format = name_format.replace('%a', format_filename(self.info.artists))

-    def __repr__(self):
-        return '<Doujinshi: {0}>'.format(self.name)
+        name_format = name_format.replace('%t', format_filename(self.name))
+        name_format = name_format.replace('%p', format_filename(self.pretty_name))
+        name_format = name_format.replace('%s', format_filename(self.info.subtitle))
+        self.filename = format_filename(name_format, 255, True)

-    def show(self):
-        table = [
+        self.table = [
            ["Parodies", self.info.parodies],
            ["Doujinshi", self.name],
            ["Subtitle", self.info.subtitle],
@ -57,26 +57,25 @@ class Doujinshi(object):
            ["URL", self.url],
            ["Pages", self.pages],
        ]
-        logger.info(u'Print doujinshi information of {0}\n{1}'.format(self.id, tabulate(table)))

-    def download(self):
+    def __repr__(self):
+        return '<Doujinshi: {0}>'.format(self.name)
+
+    def show(self):
+
+        logger.info(u'Print doujinshi information of {0}\n{1}'.format(self.id, tabulate(self.table)))
+
+    def download(self, regenerate_cbz=False):
        logger.info('Starting to download doujinshi: %s' % self.name)
        if self.downloader:
            download_queue = []
-
            if len(self.ext) != self.pages:
                logger.warning('Page count and ext count do not equal')

            for i in range(1, min(self.pages, len(self.ext)) + 1):
-                download_queue.append('%s/%d/%d.%s' % (IMAGE_URL, int(self.img_id), i, self.ext[i-1]))
-
-            self.downloader.download(download_queue, self.filename)
-
-            '''
-            for i in range(len(self.ext)):
-                download_queue.append('%s/%d/%d.%s' % (IMAGE_URL, int(self.img_id), i+1, EXT_MAP[self.ext[i]]))
-            '''
+                download_queue.append('%s/%d/%d.%s' % (IMAGE_URL, int(self.img_id), i, self.ext[i - 1]))

+            self.downloader.download(download_queue, self.filename, regenerate_cbz=regenerate_cbz)
        else:
            logger.critical('Downloader has not been loaded')
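The folder name is built by substituting each placeholder (`%i`, `%a`, `%t`, `%p`, `%s`) in turn. Below is a simplified sketch of that substitution, assuming a hypothetical function `render_folder_name` and made-up field values; it leaves out the per-field `format_filename` sanitising that the real constructor applies.

```python
def render_folder_name(name_format, fields):
    """Hypothetical helper: replace each %-placeholder with its value, in order."""
    for placeholder, value in fields.items():
        name_format = name_format.replace(placeholder, str(value))
    return name_format


# Illustrative values only; real values come from the parsed gallery page.
fields = {
    '%i': 123456,
    '%a': 'some artist',
    '%t': 'Full Title',
    '%p': 'Pretty Title',
    '%s': 'Subtitle',
}
print(render_folder_name('[%i][%a][%p]', fields))  # [123456][some artist][Pretty Title]
```

Because the replacements run sequentially, a value that itself contains a later placeholder (for example a title containing `%s`) would be substituted again; this sketch shares that property.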

--- a/nhentai/downloader.py
+++ b/nhentai/downloader.py
@ -113,13 +113,18 @@ class Downloader(Singleton):
        else:
            logger.log(15, '{0} downloaded successfully'.format(data))

-    def download(self, queue, folder=''):
+    def download(self, queue, folder='', regenerate_cbz=False):
        if not isinstance(folder, text):
            folder = str(folder)

        if self.path:
            folder = os.path.join(self.path, folder)

+        if os.path.exists(folder + '.cbz'):
+            if not regenerate_cbz:
+                logger.warning('CBZ file \'{}.cbz\' exists, ignored download request'.format(folder))
+                return
+
        if not os.path.exists(folder):
            logger.warning('Path \'{0}\' does not exist, creating.'.format(folder))
            try:
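The new guard skips a download when a CBZ archive for the same folder already exists, unless regeneration was requested. A minimal sketch of just that decision, assuming a hypothetical helper name `should_skip_download`:

```python
import os


def should_skip_download(folder, regenerate_cbz=False):
    """Hypothetical helper: True when '<folder>.cbz' exists and regeneration is off."""
    return os.path.exists(folder + '.cbz') and not regenerate_cbz
```

With `--regenerate-cbz` set, the function returns `False` even if the archive is present, so the download proceeds.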
--- a/nhentai/parser.py
+++ b/nhentai/parser.py
@ -133,13 +133,16 @@ def doujinshi_parser(id_):
    doujinshi_info = html.find('div', attrs={'id': 'info'})

    title = doujinshi_info.find('h1').text
+    pretty_name = doujinshi_info.find('h1').find('span', attrs={'class': 'pretty'}).text
    subtitle = doujinshi_info.find('h2')

    doujinshi['name'] = title
+    doujinshi['pretty_name'] = pretty_name
    doujinshi['subtitle'] = subtitle.text if subtitle else ''

    doujinshi_cover = html.find('div', attrs={'id': 'cover'})
-    img_id = re.search('/galleries/([\d]+)/cover\.(jpg|png|gif)$', doujinshi_cover.a.img.attrs['data-src'])
+    img_id = re.search('/galleries/([0-9]+)/cover.(jpg|png|gif)$',
+                       doujinshi_cover.a.img.attrs['data-src'])

    ext = []
    for i in html.find_all('div', attrs={'class': 'thumb-container'}):
@ -174,9 +177,11 @@ def doujinshi_parser(id_):
    return doujinshi


-def old_search_parser(keyword, sorting='date', page=1):
+def legacy_search_parser(keyword, sorting='date', page=1, is_page_all=False):
+    logger.warning('Using legacy searching method, the `--page-all` option will not be supported')
    logger.debug('Searching doujinshis of keyword {0}'.format(keyword))
-    response = request('get', url=constant.SEARCH_URL, params={'q': keyword, 'page': page, 'sort': sorting}).content
+    response = request('get', url=constant.LEGACY_SEARCH_URL,
+                       params={'q': keyword, 'page': page, 'sort': sorting}).content

    result = _get_title_and_id(response)
    if not result:
@ -197,6 +202,7 @@ def print_doujinshi(doujinshi_list):
 def search_parser(keyword, sorting, page, is_page_all=False):
    # keyword = '+'.join([i.strip().replace(' ', '-').lower() for i in keyword.split(',')])
    result = []
+    response = None
    if not page:
        page = [1]

@ -206,6 +212,7 @@ def search_parser(keyword, sorting, page, is_page_all=False):
        page = range(1, init_response['num_pages']+1)

    total = '/{0}'.format(page[-1]) if is_page_all else ''
+    not_exists_persist = False
    for p in page:
        i = 0

@ -217,18 +224,21 @@ def search_parser(keyword, sorting, page, is_page_all=False):
                response = request('get', url.replace('%2B', '+')).json()
            except Exception as e:
                logger.critical(str(e))
-
+                response = None
            break

-        if 'result' not in response:
+        if response is None or 'result' not in response:
            logger.warning('No result in response in page {}'.format(p))
-            break
+            if not_exists_persist is True:
+                break
+            continue

        for row in response['result']:
            title = row['title']['english']
            title = title[:85] + '..' if len(title) > 85 else title
            result.append({'id': row['id'], 'title': title})

+        not_exists_persist = False
        if not result:
            logger.warning('No results for keywords {}'.format(keyword))

--- a/nhentai/serializer.py
+++ b/nhentai/serializer.py
@ -4,6 +4,7 @@ import os
 from xml.sax.saxutils import escape
 from nhentai.constant import LANGUAGEISO

+
 def serialize_json(doujinshi, dir):
    metadata = {'title': doujinshi.name,
                'subtitle': doujinshi.info.subtitle}
@ -26,12 +27,12 @@ def serialize_json(doujinshi, dir):
    metadata['Pages'] = doujinshi.pages

    with open(os.path.join(dir, 'metadata.json'), 'w') as f:
-        json.dump(metadata, f, separators=','':')
+        json.dump(metadata, f, separators=(',', ':'))


-def serialize_comicxml(doujinshi, dir):
+def serialize_comic_xml(doujinshi, dir):
    from iso8601 import parse_date
-    with open(os.path.join(dir, 'ComicInfo.xml'), 'w') as f:
+    with open(os.path.join(dir, 'ComicInfo.xml'), 'w', encoding="utf-8") as f:
        f.write('<?xml version="1.0" encoding="utf-8"?>\n')
        f.write('<ComicInfo xmlns:xsd="http://www.w3.org/2001/XMLSchema" '
                'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">\n')
@ -45,7 +46,8 @@ def serialize_comicxml(doujinshi, dir):
        xml_write_simple_tag(f, 'NhentaiId', doujinshi.id)
        xml_write_simple_tag(f, 'Genre', doujinshi.info.categories)

-        xml_write_simple_tag(f, 'BlackAndWhite', 'No' if doujinshi.info.tags and 'full color' in doujinshi.info.tags else 'Yes')
+        xml_write_simple_tag(f, 'BlackAndWhite', 'No' if doujinshi.info.tags and
+                             'full color' in doujinshi.info.tags else 'Yes')

        if doujinshi.info.date:
            dt = parse_date(doujinshi.info.date)
@ -59,13 +61,13 @@ def serialize_comicxml(doujinshi, dir):
        if doujinshi.info.tags:
            xml_write_simple_tag(f, 'Tags', doujinshi.info.tags)
        if doujinshi.info.artists:
-            xml_write_simple_tag(f, 'Writer', ' & '.join([i.strip() for i in doujinshi.info.artists.split(',')]))
-        # if doujinshi.info.groups:
-        #     metadata['group'] = [i.strip() for i in doujinshi.info.groups.split(',')]
+            xml_write_simple_tag(f, 'Writer', ' & '.join([i.strip() for i in
+                                                          doujinshi.info.artists.split(',')]))
+
        if doujinshi.info.languages:
            languages = [i.strip() for i in doujinshi.info.languages.split(',')]
            xml_write_simple_tag(f, 'Translated', 'Yes' if 'translated' in languages else 'No')
-            [xml_write_simple_tag(f, 'LanguageISO', LANGUAGEISO[i]) for i in languages \
+            [xml_write_simple_tag(f, 'LanguageISO', LANGUAGEISO[i]) for i in languages
                if (i != 'translated' and i in LANGUAGEISO)]

        f.write('</ComicInfo>')
@ -121,7 +123,7 @@ def serialize_unique(lst):
 def set_js_database():
    with open('data.js', 'w') as f:
        indexed_json = merge_json()
-        unique_json = json.dumps(serialize_unique(indexed_json), separators=','':')
-        indexed_json = json.dumps(indexed_json, separators=','':')
+        unique_json = json.dumps(serialize_unique(indexed_json), separators=(',', ':'))
+        indexed_json = json.dumps(indexed_json, separators=(',', ':'))
        f.write('var data = ' + indexed_json)
        f.write(';\nvar tags = ' + unique_json)
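The `separators` change is easiest to see on a small input. The sketch below uses a made-up `metadata` dictionary and compares the explicit tuple form with the default separators. The earlier spelling `','':'` is two adjacent string literals that concatenate to `',:'`, which `json` unpacks into the same two separators, so the tuple form reads as the explicit version of the same setting.

```python
import json

# Illustrative data only.
metadata = {'title': 'Example', 'Pages': 24}

compact = json.dumps(metadata, separators=(',', ':'))
default = json.dumps(metadata)

print(compact)  # {"title":"Example","Pages":24}
print(default)  # {"title": "Example", "Pages": 24}
```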
--- a/nhentai/utils.py
+++ b/nhentai/utils.py
@ -10,14 +10,17 @@ import sqlite3

 from nhentai import constant
 from nhentai.logger import logger
-from nhentai.serializer import serialize_json, serialize_comicxml, set_js_database
+from nhentai.serializer import serialize_json, serialize_comic_xml, set_js_database
+
+
+MAX_FIELD_LENGTH = 100


 def request(method, url, **kwargs):
    session = requests.Session()
    session.headers.update({
        'Referer': constant.LOGIN_URL,
-        'User-Agent': 'nhentai command line client (https://github.com/RicterZ/nhentai)',
+        'User-Agent': constant.CONFIG['useragent'],
        'Cookie': constant.CONFIG['cookie']
    })

@ -28,10 +31,14 @@ def request(method, url, **kwargs):


 def check_cookie():
-    response = request('get', constant.BASE_URL).text
-    username = re.findall('"/users/\d+/(.*?)"', response)
+    response = request('get', constant.BASE_URL)
+    if response.status_code == 503 and 'cf-browser-verification' in response.text:
+        logger.error('Blocked by Cloudflare captcha, please set your cookie and useragent')
+        exit(-1)
+
+    username = re.findall(r'"/users/\d+/(.*?)"', response.text)
    if not username:
-        logger.error('Cannot get your username, please check your cookie or use `nhentai --cookie` to set your cookie')
+        logger.warning('Cannot get your username, please check your cookie or use `nhentai --cookie` to set your cookie')
    else:
        logger.info('Login successfully! Your username: {}'.format(username[0]))

@ -74,6 +81,13 @@ def generate_html(output_dir='.', doujinshi_obj=None, template='default'):
    else:
        doujinshi_dir = '.'

+    if not os.path.exists(doujinshi_dir):
+        logger.warning('Path \'{0}\' does not exist, creating.'.format(doujinshi_dir))
+        try:
+            os.makedirs(doujinshi_dir)
+        except EnvironmentError as e:
+            logger.critical('{0}'.format(str(e)))
+
    file_list = os.listdir(doujinshi_dir)
    file_list.sort()

@ -81,7 +95,7 @@ def generate_html(output_dir='.', doujinshi_obj=None, template='default'):
        if not os.path.splitext(image)[1] in ('.jpg', '.png'):
            continue

-        image_html += '<img src="{0}" class="image-item"/>\n'\
+        image_html += '<img src="{0}" class="image-item"/>\n' \
            .format(image)
    html = readfile('viewer/{}/index.html'.format(template))
    css = readfile('viewer/{}/styles.css'.format(template))
@ -162,7 +176,7 @@ def generate_main_html(output_dir='./'):
        else:
            with open('./main.html', 'wb') as f:
                f.write(data.encode('utf-8'))
-        shutil.copy(os.path.dirname(__file__)+'/viewer/logo.png', './')
+        shutil.copy(os.path.dirname(__file__) + '/viewer/logo.png', './')
        set_js_database()
        logger.log(
            15, 'Main Viewer has been written to \'{0}main.html\''.format(output_dir))
@ -174,7 +188,7 @@ def generate_cbz(output_dir='.', doujinshi_obj=None, rm_origin_dir=False, write_
    if doujinshi_obj is not None:
        doujinshi_dir = os.path.join(output_dir, doujinshi_obj.filename)
        if write_comic_info:
-            serialize_comicxml(doujinshi_obj, doujinshi_dir)
+            serialize_comic_xml(doujinshi_obj, doujinshi_dir)
        cbz_filename = os.path.join(os.path.join(doujinshi_dir, '..'), '{}.cbz'.format(doujinshi_obj.filename))
    else:
        cbz_filename = './doujinshi.cbz'
@ -198,7 +212,7 @@ def generate_cbz(output_dir='.', doujinshi_obj=None, rm_origin_dir=False, write_
 def generate_pdf(output_dir='.', doujinshi_obj=None, rm_origin_dir=False):
    try:
        import img2pdf
-        
+
        """Write images to a PDF file using img2pdf."""
        if doujinshi_obj is not None:
            doujinshi_dir = os.path.join(output_dir, doujinshi_obj.filename)
@ -224,10 +238,11 @@ def generate_pdf(output_dir='.', doujinshi_obj=None, rm_origin_dir=False):
            shutil.rmtree(doujinshi_dir, ignore_errors=True)

        logger.log(15, 'PDF file has been written to \'{0}\''.format(doujinshi_dir))
-        
+
    except ImportError:
        logger.error("Please install img2pdf package by using pip.")

+
 def unicode_truncate(s, length, encoding='utf-8'):
    """https://stackoverflow.com/questions/1809531/truncating-unicode-so-it-fits-a-maximum-size-when-encoded-for-wire-transfer
    """
@ -235,7 +250,7 @@ def unicode_truncate(s, length, encoding='utf-8'):
    return encoded.decode(encoding, 'ignore')


-def format_filename(s):
+def format_filename(s, length=MAX_FIELD_LENGTH, _truncate_only=False):
    """
    It used to be a whitelist approach allowed only alphabet and a part of symbols.
    but most doujinshi's names include Japanese 2-byte characters and these was rejected.
@ -243,16 +258,20 @@ def format_filename(s):
    if filename include forbidden characters (\'/:,;*?"<>|) ,it replace space character(' '). 
    """
    # maybe you can use `--format` to select a suitable filename
-    ban_chars = '\\\'/:,;*?"<>|\t'
-    filename = s.translate(str.maketrans(ban_chars, ' '*len(ban_chars))).strip()
-    filename = ' '.join(filename.split())
-    print(repr(filename))

-    while filename.endswith('.'):
-        filename = filename[:-1]
+    if not _truncate_only:
+        ban_chars = '\\\'/:,;*?"<>|\t'
+        filename = s.translate(str.maketrans(ban_chars, ' ' * len(ban_chars))).strip()
+        filename = ' '.join(filename.split())

-    if len(filename) > 100:
-        filename = filename[:100] + u'…'
+        while filename.endswith('.'):
+            filename = filename[:-1]
+    else:
+        filename = s
+
+    # limit `length` chars
+    if len(filename) >= length:
+        filename = filename[:length - 1] + u'…'

    # Remove [] from filename
    filename = filename.replace('[]', '').strip()
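
A minimal sketch of the sanitize-then-truncate behaviour this hunk introduces in `format_filename`. The helper names `sanitize` and `truncate_with_ellipsis` are hypothetical, and `MAX_FIELD_LENGTH = 100` is an assumption carried over from the old hard-coded limit; the diff does not show where the constant is defined.

```python
# Assumed value: the removed code used a literal 100, the new constant's
# definition is not visible in this diff.
MAX_FIELD_LENGTH = 100


def sanitize(name):
    """Replace forbidden characters with spaces and tidy the result (hypothetical helper)."""
    ban_chars = '\\\'/:,;*?"<>|\t'
    cleaned = name.translate(str.maketrans(ban_chars, ' ' * len(ban_chars))).strip()
    # Collapse runs of whitespace left behind by the replacement.
    cleaned = ' '.join(cleaned.split())
    # Equivalent to the `while filename.endswith('.')` loop in the diff.
    return cleaned.rstrip('.')


def truncate_with_ellipsis(name, length=MAX_FIELD_LENGTH):
    """Cap `name` at `length` characters, marking a cut with an ellipsis (hypothetical helper)."""
    if len(name) >= length:
        return name[:length - 1] + '…'
    return name


print(truncate_with_ellipsis(sanitize('a/b:c.'), 10))
print(truncate_with_ellipsis('x' * 12, 10))
```

Because the comparison is `>=`, a name of exactly `length` characters also loses its last character to the ellipsis, so the result never exceeds `length` characters.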
@ -275,7 +294,7 @@ def paging(page_string):
            start, end = i.split('-')
            if not (start.isdigit() and end.isdigit()):
                raise Exception('Invalid page number')
-            page_list.extend(list(range(int(start), int(end)+1)))
+            page_list.extend(list(range(int(start), int(end) + 1)))
        else:
            if not i.isdigit():
                raise Exception('Invalid page number')
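
The range handling touched by this hunk can be sketched as a standalone function. This is an illustration only: `expand_page_ranges` is a hypothetical name, the empty-input check and the final `append` are assumptions about the parts of `paging` that the diff does not show, and `ValueError` is swapped in for the bare `Exception` used in the source.

```python
def expand_page_ranges(page_string):
    """Expand a string such as '1,3-5,14' into a list of page numbers."""
    if not page_string:
        # Assumption: empty input yields an empty list.
        return []

    pages = []
    for part in page_string.split(','):
        if '-' in part:
            start, end = part.split('-', 1)
            if not (start.isdigit() and end.isdigit()):
                raise ValueError('Invalid page number')
            # The end of the range is inclusive, hence the + 1.
            pages.extend(range(int(start), int(end) + 1))
        else:
            if not part.isdigit():
                raise ValueError('Invalid page number')
            pages.append(int(part))
    return pages


print(expand_page_ranges('1,3-5,14'))
```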
@ -284,6 +303,34 @@ def paging(page_string):
    return page_list


+def generate_metadata_file(output_dir, table, doujinshi_obj=None):
+    logger.info('Writing Metadata Info')
+
+    if doujinshi_obj is not None:
+        doujinshi_dir = os.path.join(output_dir, doujinshi_obj.filename)
+    else:
+        doujinshi_dir = '.'
+
+    logger.info(doujinshi_dir)
+
+    f = open(os.path.join(doujinshi_dir, 'info.txt'), 'w', encoding='utf-8')
+
+    fields = ['TITLE', 'ORIGINAL TITLE', 'AUTHOR', 'ARTIST', 'CIRCLE', 'SCANLATOR',
+              'TRANSLATOR', 'PUBLISHER', 'DESCRIPTION', 'STATUS', 'CHAPTERS', 'PAGES',
+              'TAGS', 'TYPE', 'LANGUAGE', 'RELEASED', 'READING DIRECTION', 'CHARACTERS',
+              'SERIES', 'PARODY', 'URL']
+    special_fields = ['PARODY', 'TITLE', 'ORIGINAL TITLE', 'CHARACTERS', 'AUTHOR',
+                      'LANGUAGE', 'TAGS', 'URL', 'PAGES']
+
+    for i in range(len(fields)):
+        f.write('{}: '.format(fields[i]))
+        if fields[i] in special_fields:
+            f.write(str(table[special_fields.index(fields[i])][1]))
+        f.write('\n')
+
+    f.close()
+
+
 class DB(object):
    conn = None
    cur = None
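
`generate_metadata_file` above looks up each value by position: `special_fields.index(...)` is used as a row index into `table`, so the rows of `table` must arrive in the same order as `special_fields`. The sketch below makes that pairing explicit with a dictionary and returns the text instead of writing it. `render_metadata` is a hypothetical helper, and the sample rows in the usage lines are invented for illustration.

```python
FIELDS = ['TITLE', 'ORIGINAL TITLE', 'AUTHOR', 'ARTIST', 'CIRCLE', 'SCANLATOR',
          'TRANSLATOR', 'PUBLISHER', 'DESCRIPTION', 'STATUS', 'CHAPTERS', 'PAGES',
          'TAGS', 'TYPE', 'LANGUAGE', 'RELEASED', 'READING DIRECTION', 'CHARACTERS',
          'SERIES', 'PARODY', 'URL']
SPECIAL_FIELDS = ['PARODY', 'TITLE', 'ORIGINAL TITLE', 'CHARACTERS', 'AUTHOR',
                  'LANGUAGE', 'TAGS', 'URL', 'PAGES']


def render_metadata(table):
    """Build the info.txt text from `table`, a list of (label, value) rows.

    Rows are paired with SPECIAL_FIELDS by position, as in the diff.
    Unlike the original, a short table yields empty values rather than
    an IndexError.
    """
    values = dict(zip(SPECIAL_FIELDS, (row[1] for row in table)))
    lines = ['{}: {}'.format(field, values.get(field, '')) for field in FIELDS]
    return '\n'.join(lines) + '\n'


# Invented sample rows, ordered like SPECIAL_FIELDS.
sample = [('Parodies', 'original'), ('Doujinshi', 'Some Title'), ('Subtitle', ''),
          ('Characters', ''), ('Authors', 'someone'), ('Languages', 'english'),
          ('Tags', 'tag1, tag2'), ('URL', 'https://example.invalid/1'), ('Pages', 20)]
print(render_metadata(sample))
```

Writing the result inside a `with open(path, 'w', encoding='utf-8') as f:` block would also close the file if a write fails, which the explicit `f.close()` in the diff does not guarantee.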
--- a/nhentai/viewer/minimal/index.html
+++ b/nhentai/viewer/minimal/index.html
@ -0,0 +1,25 @@
+<!DOCTYPE html>
+<html>
+<head>
+    <meta charset="UTF-8">
+    <meta name="viewport" content="width=device-width, initial-scale=1, user-scalable=yes, viewport-fit=cover" />
+    <title>{TITLE}</title>
+    <style>
+{STYLES}
+    </style>
+</head>
+<body>
+
+<nav id="list" hidden=true>
+{IMAGES}</nav>
+
+<div id="image-container">
+    <div id="dest"></div>
+    <span id="page-num"></span>
+</div>
+
+<script>
+{SCRIPTS}
+</script>
+</body>
+</html>
--- a/nhentai/viewer/minimal/scripts.js
+++ b/nhentai/viewer/minimal/scripts.js
@ -0,0 +1,79 @@
+const pages = Array.from(document.querySelectorAll('img.image-item'));
+let currentPage = 0;
+
+function changePage(pageNum) {
+    const previous = pages[currentPage];
+    const current = pages[pageNum];
+
+    if (current == null) {
+        return;
+    }
+    
+    previous.classList.remove('current');
+    current.classList.add('current');
+
+    currentPage = pageNum;
+
+    const display = document.getElementById('dest');
+    display.style.backgroundImage = `url("${current.src}")`;
+
+    scroll(0,0)
+
+    document.getElementById('page-num')
+        .innerText = [
+                (pageNum + 1).toLocaleString(),
+                pages.length.toLocaleString()
+            ].join('\u200a/\u200a');
+}
+
+changePage(0);
+
+document.getElementById('image-container').onclick = event => {
+    const width = document.getElementById('image-container').clientWidth;
+    const clickPos = event.clientX / width;
+
+    if (clickPos < 0.5) {
+        changePage(currentPage - 1);
+    } else {
+        changePage(currentPage + 1);
+    }
+};
+
+document.onkeypress = event => {
+    switch (event.key.toLowerCase()) {
+        // Previous Image
+        case 'w':
+            scrollBy(0, -40);
+            break;
+        case 'a':
+            changePage(currentPage - 1);
+            break;
+        // Return to previous page
+        case 'q':
+            window.history.go(-1);
+            break;
+        // Next Image
+        case ' ':
+        case 's':
+            scrollBy(0, 40);
+            break;
+        case 'd':
+            changePage(currentPage + 1);
+            break;
+    } // arrow keys are handled in onkeydown below, since keypress does not fire for them
+};
+
+document.onkeydown = event =>{
+    switch (event.keyCode) {
+        case 37: //left
+            changePage(currentPage - 1);
+            break;
+        case 38: //up
+            break;
+        case 39: //right
+            changePage(currentPage + 1);
+            break;
+        case 40: //down
+            break;
+    }
+};
--- a/nhentai/viewer/minimal/styles.css
+++ b/nhentai/viewer/minimal/styles.css
@ -0,0 +1,75 @@
+  
+*, *::after, *::before {
+    box-sizing: border-box;
+}
+
+img {
+    vertical-align: middle;
+}
+
+html, body {
+    display: flex;
+    background-color: #e8e6e6;
+    height: 100%;
+    width: 100%;
+    padding: 0;
+    margin: 0;
+    font-family: sans-serif;
+}
+
+#list {
+    height: 2000px;
+    overflow: scroll;
+    width: 260px;
+    text-align: center;
+}
+
+#list img {
+    width: 200px;
+    padding: 10px;
+    border-radius: 10px;
+    margin: 15px 0;
+    cursor: pointer;
+}
+
+#list img.current {
+    background: #0003;
+}
+
+#image-container {
+    flex: auto;
+    height: 100%;
+    background: rgb(0, 0, 0);
+    color: rgb(100, 100, 100);
+    text-align: center;
+    cursor: pointer;
+    -webkit-user-select: none;
+    user-select: none;
+    position: relative;
+}
+
+#image-container #dest {
+    height: 2000px;
+    width: 100%;
+    background-size: contain;
+    background-repeat: no-repeat;
+    background-position: top;
+    margin-left: auto;
+    margin-right: auto;
+    max-width: 100%;
+    max-height: 100vh;
+    margin: auto;
+}
+
+#image-container #page-num {
+    position: static;
+    font-size: 9pt;
+    left: 10px;
+    bottom: 5px;
+    font-weight: bold;
+    opacity: 0.9;
+    text-shadow: /* Duplicate the same shadow to make it very strong */
+        0 0 2px #222,
+        0 0 2px #222,
+        0 0 2px #222;
+}
Author	SHA1	Message	Date
Ricter Z	221ff6b32c	0.4.18 bugs fix	2023-02-04 20:24:53 +08:00
Ricter Z	bc6ef0cf5d	solve #251	2023-02-04 20:22:57 +08:00
Ricter Z	c8c63cbc11	add usage images	2023-02-04 20:09:51 +08:00
Ricter Z	a63856d076	update usage	2023-02-04 20:09:46 +08:00
Ricter Z	aa4986189f	resolve issue #264	2023-02-04 19:55:51 +08:00
Ricter Z	0fb81599dc	resolve #265	2023-02-04 19:47:24 +08:00
Ricter Z	e9f9651d07	change the default sort method	2023-02-04 19:38:29 +08:00
Ricter Z	1860b5f0cf	resoved issue #249	2022-05-03 16:54:38 +08:00
Ricter Z	eff4f3bf9b	remove debug print	2022-05-03 16:51:49 +08:00
Ricter Z	501840172e	change sorting from recent to date	2022-05-03 16:49:26 +08:00
Ricter Z	e5ed6d098a	update README	2022-05-02 18:53:40 +08:00
Ricter Z	98606202fb	remove some unused images	2022-05-02 18:49:34 +08:00
Ricter Z	5a3f1009c9	update README for issue #237	2022-05-02 18:48:02 +08:00
Ricter Z	61945a6e97	fix for issue #236	2022-05-02 17:01:30 +08:00
Ricter Z	443fcdc7da	fix for issue #232	2022-05-02 16:53:23 +08:00
Ricter Z	31b95fe2dd	0.4.17 releases, for #246	2022-05-02 16:24:04 +08:00
Ricter Zheng	be8c97f8d4	Merge pull request #247 from krrr/master	2022-05-02 13:21:53 +08:00
krrr	348e51676e	Update README.rst	2022-05-02 12:13:19 +08:00
Ricter Zheng	ea356a1ca2	Merge pull request #244 from krrr/master	2022-04-30 13:47:57 +08:00
krrr	5a4dfb8a76	Add new option to avoid cloudflare captcha	2022-04-30 11:22:41 +08:00
Ricter Zheng	4b15744ceb	Merge pull request #235 from TravisDavis-ops/nixpkg	2021-12-24 03:27:07 +08:00
Travis Davis	b05fa16286	Update README.rst	2021-12-23 12:43:20 -06:00
Ricter Zheng	0879486881	Merge pull request #228 from culturecloud/master	2021-08-23 20:27:38 +08:00
RedoX	c66ba730d3	Fix UnicodeEncodeError	2021-07-28 18:43:45 +06:00
Ricter Zheng	606c5e0ffd	Merge pull request #226 from nanaih/minimal_viewer	2021-06-23 18:14:47 +08:00
rodrigo_qwertyuiop	ba04f81a6f	add minimal viewer, fix not using config's template on --html only option	2021-06-22 23:17:03 -04:00
Ricter Zheng	6519e6f221	Merge pull request #224 from RicterZ/pull/221 Pull/221	2021-06-07 17:21:00 +08:00
RicterZ	7594625d72	fix format	2021-06-07 17:17:54 +08:00
RicterZ	4948c8f0c5	update README	2021-06-07 16:50:03 +08:00
RicterZ	e22a99fa8c	Merge branch 'master' of github.com:RicterZ/nhentai	2021-06-07 16:48:36 +08:00
RicterZ	19a1d5c404	fix #220 add pretty name of doujinshi format	2021-06-07 16:47:54 +08:00
Ricter Zheng	ad1e876611	Merge pull request #221 from SomeRandomDude870/master HDoujin-format Metadata file	2021-06-07 16:02:43 +08:00
Ricter Zheng	1de7e1f998	Merge branch 'pull/221' into master	2021-06-07 16:01:54 +08:00
$DESKTOP-58CH9VE\Michael$ DESKTOP-58CH9VE\Michael	b97e707817	HDoujin-format Metadata file	2021-06-05 17:13:18 +02:00
Ricter Zheng	6ef2189bfe	Merge pull request #214 from lleene/master Add dryrun option to command line interface	2021-06-03 08:00:18 +08:00
RicterZ	24be2d37d4	0.4.16	2021-06-02 23:22:23 +08:00
Lieuwe Leene	bd38294bb7	undo whitespace edits	2021-05-16 19:49:26 +02:00
Lieuwe Leene	2cf4e6718e	Add the option to perform a dry-run and only download meta-data / generate file structure	2021-05-16 19:44:01 +02:00