Compare commits

..

24 Commits

Author SHA1 Message Date
ddc4a20251 0.2.12 2018-07-01 12:48:30 +08:00
206aa3710a fix bug 2018-07-01 12:48:05 +08:00
b5b201f61c 🍻 2018-07-01 02:15:26 +08:00
eb8b41cd1d Merge pull request #22 from Pizzacus/master
Rework the HTML Viewer
2018-06-03 22:53:00 +08:00
98bf88d638 Actually use MANIFEST.ini to specify the package data
*considers suicide*
2018-06-03 11:32:06 +02:00
0bc83982e4 Add the viewer to the package_data entry 2018-06-03 11:09:46 +02:00
99edcef9ac Rework the HTML Viewer
* More modern and efficient code, particularily for the JS
 * Also the layout is better, with flexboxes and all
 * The CSS and JS have their own files
 * The sidebar has proper margins around the images
 * You can use A + D and the arrow keys to navigate images, like on nhentai
 * Images with a lot of width are  properly sized
 * There is a page counter on the bottom left
2018-06-02 23:22:37 +02:00
3ddd474aab Merge pull request #21 from mentaterasmus/master
fixing issue 16 and adding functionalities
2018-05-15 23:17:10 +08:00
f2573d5f10 fixing identation 2018-05-14 01:52:38 -03:00
147eec57cf fixing issue 16 and adding functionalities 2018-05-09 15:42:12 -03:00
967e0b4ff5 fix #18 #19 use nhentai api 2018-04-19 17:21:43 +08:00
22cf2592dd 0.2.11 2018-03-16 23:48:58 +08:00
caa0753adb fix bug #13 2018-03-16 23:45:05 +08:00
0e14dd62d5 fix bug #13 2018-03-16 23:42:24 +08:00
7c9693785e fix #14 2018-03-16 23:39:04 +08:00
08ad73b683 fix bug #13 2018-03-16 23:33:16 +08:00
a56d3ca18c fix bug #13 2018-03-16 23:23:25 +08:00
c1975897d2 save downloaded doujinshi as doujinshi name #13 2018-03-16 23:16:26 +08:00
4ed596ff98 download user fav 2018-03-05 21:47:27 +08:00
debf287fb0 download user fav 2018-03-05 21:45:56 +08:00
308c5277b8 Merge pull request #12 from RomaniukVadim/master
Add install for Gentoo
2018-03-03 19:33:23 +08:00
b425c883c7 Add install for Gentoo 2018-03-02 17:18:22 +02:00
7bf9507bd2 0.2.10 2018-01-09 16:05:52 +08:00
5f5245f70f fix bug 2018-01-09 16:02:16 +08:00
15 changed files with 411 additions and 231 deletions

View File

@ -1,3 +1,5 @@
include README.md include README.md
include requirements.txt include requirements.txt
include nhentai/doujinshi.html include nhentai/viewer/index.html
include nhentai/viewer/styles.css
include nhentai/viewer/scripts.js

View File

@ -12,38 +12,52 @@ nhentai
🎉🎉 nhentai 现在支持 Windows 啦! 🎉🎉 nhentai 现在支持 Windows 啦!
由于 [http://nhentai.net](http://nhentai.net) 下载下来的种子速度很慢,而且官方也提供在线观看本子的功能,所以可以利用本脚本下载本子。 由于 [http://nhentai.net](http://nhentai.net) 下载下来的种子速度很慢,而且官方也提供在线观看本子的功能,所以可以利用本脚本下载本子。
### 安装
### Installation
git clone https://github.com/RicterZ/nhentai git clone https://github.com/RicterZ/nhentai
cd nhentai cd nhentai
python setup.py install python setup.py install
### Gentoo
### 用法 layman -fa glicOne
+ 下载指定 id 列表的本子: sudo emerge net-misc/nhentai
### Usage
下载指定 id 列表的本子:
```bash
nhentai --id=123855,123866
```
nhentai --id=123855,123866 下载某关键词第一页的本子:
```bash
nhentai --search="tomori" --page=1 --download
```
+ 下载某关键词第一页的本子(不推荐) 下载用户 favorites 内容
```bash
nhentai --login "username:password" --download
nhentai --search="tomori" --page=1 --download ```
### Options
`-t, --thread`:指定下载的线程数,最多为 10 线程。 `-t, --thread`:指定下载的线程数,最多为 10 线程。
`--path`:指定下载文件的输出路径,默认为当前目录。 `--path`:指定下载文件的输出路径,默认为当前目录。
`--timeout`:指定下载图片的超时时间,默认为 30 秒。 `--timeout`:指定下载图片的超时时间,默认为 30 秒。
`--proxy`:指定下载的代理,例如: http://127.0.0.1:8080/ `--proxy`:指定下载的代理,例如: http://127.0.0.1:8080/
`--login`nhentai 账号的“用户名:密码”组合
`--nohtml`nhentai Don't generate HTML
`--cbz`nhentai Generate Comic Book CBZ file
### 自建 nhentai 镜像 ### nHentai Mirror
如果想用自建镜像下载 nhentai 的本子,需要搭建 nhentai.net 和 i.nhentai.net 的反向代理。 如果想用自建镜像下载 nhentai 的本子,需要搭建 nhentai.net 和 i.nhentai.net 的反向代理。
例如用 h.loli.club 来做反向代理的话,需要 h.loli.club 反代 nhentai.neti.h.loli.club 反带 i.nhentai.net。 例如用 h.loli.club 来做反向代理的话,需要 h.loli.club 反代 nhentai.neti.h.loli.club 反带 i.nhentai.net。
然后利用环境变量来下载: 然后利用环境变量来下载:
NHENTAI=http://h.loli.club nhentai --id 123456 ```bash
NHENTAI=http://h.loli.club nhentai --id 123456
```
![](./images/search.png) ![](./images/search.png)
![](./images/download.png) ![](./images/download.png)
@ -53,4 +67,4 @@ nhentai
MIT MIT
### あなたも変態 ### あなたも変態
![](./images/image.jpg) ![](./images/image.jpg)

View File

@ -1,3 +1,3 @@
__version__ = '0.2.9' __version__ = '0.2.12'
__author__ = 'Ricter' __author__ = 'RicterZ'
__email__ = 'ricterzheng@gmail.com' __email__ = 'ricterzheng@gmail.com'

View File

@ -2,6 +2,7 @@
from __future__ import print_function from __future__ import print_function
import sys import sys
from optparse import OptionParser from optparse import OptionParser
from nhentai import __version__
try: try:
from itertools import ifilter as filter from itertools import ifilter as filter
except ImportError: except ImportError:
@ -20,13 +21,13 @@ except NameError:
def banner(): def banner():
logger.info(u'''nHentai: あなたも変態。 いいね? logger.info(u'''nHentai ver %s: あなたも変態。 いいね?
_ _ _ _ _ _ _ _
_ __ | | | | ___ _ __ | |_ __ _(_) _ __ | | | | ___ _ __ | |_ __ _(_)
| '_ \| |_| |/ _ \ '_ \| __/ _` | | | '_ \| |_| |/ _ \ '_ \| __/ _` | |
| | | | _ | __/ | | | || (_| | | | | | | _ | __/ | | | || (_| | |
|_| |_|_| |_|\___|_| |_|\__\__,_|_| |_| |_|_| |_|\___|_| |_|\__\__,_|_|
''') ''' % __version__)
def cmd_parser(): def cmd_parser():
@ -34,7 +35,8 @@ def cmd_parser():
'\n NHENTAI=http://h.loli.club nhentai --id [ID ...]' '\n NHENTAI=http://h.loli.club nhentai --id [ID ...]'
'\n\nEnvironment Variable:\n' '\n\nEnvironment Variable:\n'
' NHENTAI nhentai mirror url') ' NHENTAI nhentai mirror url')
parser.add_option('--download', dest='is_download', action='store_true', help='download doujinshi (for search result)') parser.add_option('--download', dest='is_download', action='store_true',
help='download doujinshi (for search result)')
parser.add_option('--show-info', dest='is_show', action='store_true', help='just show the doujinshi information') parser.add_option('--show-info', dest='is_show', action='store_true', help='just show the doujinshi information')
parser.add_option('--id', type='string', dest='id', action='store', help='doujinshi ids set, e.g. 1,2,3') parser.add_option('--id', type='string', dest='id', action='store', help='doujinshi ids set, e.g. 1,2,3')
parser.add_option('--search', type='string', dest='keyword', action='store', help='search doujinshi by keyword') parser.add_option('--search', type='string', dest='keyword', action='store', help='search doujinshi by keyword')
@ -49,8 +51,18 @@ def cmd_parser():
help='timeout of download doujinshi') help='timeout of download doujinshi')
parser.add_option('--proxy', type='string', dest='proxy', action='store', default='', parser.add_option('--proxy', type='string', dest='proxy', action='store', default='',
help='use proxy, example: http://127.0.0.1:1080') help='use proxy, example: http://127.0.0.1:1080')
parser.add_option('--html', dest='html_viewer', action='store_true', help='generate a html viewer at current directory') parser.add_option('--html', dest='html_viewer', action='store_true',
help='generate a html viewer at current directory')
parser.add_option('--login', '-l', type='str', dest='login', action='store',
help='username:password pair of nhentai account')
parser.add_option('--nohtml', dest='is_nohtml', action='store_true',
help='Don\'t generate HTML')
parser.add_option('--cbz', dest='is_cbz', action='store_true',
help='Generate Comic Book CBZ File')
try: try:
sys.argv = list(map(lambda x: unicode(x.decode(sys.stdin.encoding)), sys.argv)) sys.argv = list(map(lambda x: unicode(x.decode(sys.stdin.encoding)), sys.argv))
except (NameError, TypeError): except (NameError, TypeError):
@ -64,35 +76,45 @@ def cmd_parser():
generate_html() generate_html()
exit(0) exit(0)
if args.login:
try:
_, _ = args.login.split(':', 1)
except ValueError:
logger.error('Invalid `username:password` pair.')
exit(1)
if not args.is_download:
logger.warning('YOU DO NOT SPECIFY `--download` OPTION !!!')
if args.tags: if args.tags:
logger.warning('`--tags` is under construction') logger.warning('`--tags` is under construction')
exit(0) exit(1)
if args.id: if args.id:
_ = map(lambda id: id.strip(), args.id.split(',')) _ = map(lambda id: id.strip(), args.id.split(','))
args.id = set(map(int, filter(lambda id: id.isdigit(), _))) args.id = set(map(int, filter(lambda id_: id_.isdigit(), _)))
if (args.is_download or args.is_show) and not args.id and not args.keyword: if (args.is_download or args.is_show) and not args.id and not args.keyword and not args.login:
logger.critical('Doujinshi id(s) are required for downloading') logger.critical('Doujinshi id(s) are required for downloading')
parser.print_help() parser.print_help()
exit(0) exit(1)
if not args.keyword and not args.id: if not args.keyword and not args.id and not args.login:
parser.print_help() parser.print_help()
exit(0) exit(1)
if args.threads <= 0: if args.threads <= 0:
args.threads = 1 args.threads = 1
elif args.threads > 15: elif args.threads > 15:
logger.critical('Maximum number of used threads is 15') logger.critical('Maximum number of used threads is 15')
exit(0) exit(1)
if args.proxy: if args.proxy:
proxy_url = urlparse(args.proxy) proxy_url = urlparse(args.proxy)
if proxy_url.scheme not in ('http', 'https'): if proxy_url.scheme not in ('http', 'https'):
logger.error('Invalid protocol \'{0}\' of proxy, ignored'.format(proxy_url.scheme)) logger.error('Invalid protocol \'{0}\' of proxy, ignored'.format(proxy_url.scheme))
else: else:
constant.PROXY = {proxy_url.scheme: args.proxy} constant.PROXY = {'http': args.proxy, 'https': args.proxy}
return args return args

View File

@ -1,17 +1,16 @@
#!/usr/bin/env python2.7 #!/usr/bin/env python2.7
# coding: utf-8 # coding: utf-8
from __future__ import unicode_literals, print_function from __future__ import unicode_literals, print_function
import os
import signal import signal
import platform import platform
from nhentai.cmdline import cmd_parser, banner from nhentai.cmdline import cmd_parser, banner
from nhentai.parser import doujinshi_parser, search_parser, print_doujinshi from nhentai.parser import doujinshi_parser, search_parser, print_doujinshi, login_parser
from nhentai.doujinshi import Doujinshi from nhentai.doujinshi import Doujinshi
from nhentai.downloader import Downloader from nhentai.downloader import Downloader
from nhentai.logger import logger from nhentai.logger import logger
from nhentai.constant import BASE_URL from nhentai.constant import BASE_URL
from nhentai.utils import generate_html from nhentai.utils import generate_html, generate_cbz
def main(): def main():
@ -22,6 +21,12 @@ def main():
doujinshi_ids = [] doujinshi_ids = []
doujinshi_list = [] doujinshi_list = []
if options.login:
username, password = options.login.split(':', 1)
logger.info('Login to nhentai use credential \'%s:%s\'' % (username, '*' * len(password)))
for doujinshi_info in login_parser(username=username, password=password):
doujinshi_list.append(Doujinshi(**doujinshi_info))
if options.keyword: if options.keyword:
doujinshis = search_parser(options.keyword, options.page) doujinshis = search_parser(options.keyword, options.page)
print_doujinshi(doujinshis) print_doujinshi(doujinshis)
@ -31,11 +36,9 @@ def main():
doujinshi_ids = options.id doujinshi_ids = options.id
if doujinshi_ids: if doujinshi_ids:
for id in doujinshi_ids: for id_ in doujinshi_ids:
doujinshi_info = doujinshi_parser(id) doujinshi_info = doujinshi_parser(id_)
doujinshi_list.append(Doujinshi(**doujinshi_info)) doujinshi_list.append(Doujinshi(**doujinshi_info))
else:
exit(0)
if not options.is_show: if not options.is_show:
downloader = Downloader(path=options.output_dir, downloader = Downloader(path=options.output_dir,
@ -44,10 +47,13 @@ def main():
for doujinshi in doujinshi_list: for doujinshi in doujinshi_list:
doujinshi.downloader = downloader doujinshi.downloader = downloader
doujinshi.download() doujinshi.download()
generate_html(doujinshi, output_dir) if not options.is_nohtml and not options.is_cbz:
generate_html(options.output_dir, doujinshi)
elif options.is_cbz:
generate_cbz(options.output_dir, doujinshi)
if not platform.system() == 'Windows': if not platform.system() == 'Windows':
logger.log(15, '🍺 All done.') logger.log(15, '🍻 All done.')
else: else:
logger.log(15, 'All done.') logger.log(15, 'All done.')

View File

@ -5,8 +5,10 @@ from nhentai.utils import urlparse
BASE_URL = os.getenv('NHENTAI', 'https://nhentai.net') BASE_URL = os.getenv('NHENTAI', 'https://nhentai.net')
DETAIL_URL = '%s/g' % BASE_URL DETAIL_URL = '%s/api/gallery' % BASE_URL
SEARCH_URL = '%s/search/' % BASE_URL SEARCH_URL = '%s/api/galleries/search' % BASE_URL
LOGIN_URL = '%s/login/' % BASE_URL
FAV_URL = '%s/favorites/' % BASE_URL
u = urlparse(BASE_URL) u = urlparse(BASE_URL)
IMAGE_URL = '%s://i.%s/galleries' % (u.scheme, u.hostname) IMAGE_URL = '%s://i.%s/galleries' % (u.scheme, u.hostname)

View File

@ -1,126 +0,0 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>{TITLE}</title>
<style>
html, body {{
background-color: #e8e6e6;
height: 100%;
padding: 0;
margin: 0;
overflow: hidden;
}}
.container img {{
display: block;
width: 100%;
margin: 30px 0;
padding: 10px;
cursor: pointer;
}}
.container {{
height: 100%;
overflow: scroll;
background: #e8e6e6;
width: 200px;
padding: 30px;
float: left;
}}
.image {{
margin-left: 260px;
height: 100%;
background: #222;
text-align: center;
}}
.image img {{
height: 100%;
}}
.i a {{
display: block;
position: absolute;
top: 0;
width: 50%;
height: 100%;
}}
.i {{
position: relative;
height: 100%;
}}
.current {{
background: #BBB;
border-radius: 10px;
}}
</style>
<script>
function cursorfocus(elem) {{
var container = document.getElementsByClassName('container')[0];
container.scrollTop = elem.offsetTop - 500;
}}
function getImage(type) {{
var current = document.getElementsByClassName("current")[0];
current.className = "image-item";
var img_src = type == 1 ? current.getAttribute('attr-next') : current.getAttribute('attr-prev');
if (img_src === "") {{
img_src = current.src;
}}
var img_list = document.getElementsByClassName("image-item");
for (i=0; i<img_list.length; i++) {{
if (img_list[i].src.endsWith(img_src)) {{
img_list[i].className = "image-item current";
cursorfocus(img_list[i]);
break;
}}
}}
var display = document.getElementById("dest");
display.src = img_src;
display.focus();
}}
</script>
</head>
<body>
<div class="container">
{IMAGES}</div>
<div class="image">
<div class="i">
<img src="" id="dest">
<a href="javascript:getImage(-1)" style="left: 0;"></a>
<a href="javascript:getImage(1)" style="left: 50%;"></a>
</div>
</div>
</body>
<script>
var img_list = document.getElementsByClassName("image-item");
var display = document.getElementById("dest");
display.src = img_list[0].src;
for (var i = 0; i < img_list.length; i++) {{
img_list[i].addEventListener('click', function() {{
var current = document.getElementsByClassName("current")[0];
current.className = "image-item";
this.className = "image-item current";
var display = document.getElementById("dest");
display.src = this.src;
display.focus();
}}, false);
}}
document.onkeypress = function(e) {{
if (e.keyCode == 32) {{
getImage(1);
}}
}}
</script>
</html>

View File

@ -5,6 +5,13 @@ from future.builtins import range
from nhentai.constant import DETAIL_URL, IMAGE_URL from nhentai.constant import DETAIL_URL, IMAGE_URL
from nhentai.logger import logger from nhentai.logger import logger
from nhentai.utils import format_filename
EXT_MAP = {
'j': 'jpg',
'p': 'png',
}
class DoujinshiInfo(dict): class DoujinshiInfo(dict):
@ -19,7 +26,7 @@ class DoujinshiInfo(dict):
class Doujinshi(object): class Doujinshi(object):
def __init__(self, name=None, id=None, img_id=None, ext='jpg', pages=0, **kwargs): def __init__(self, name=None, id=None, img_id=None, ext='', pages=0, **kwargs):
self.name = name self.name = name
self.id = id self.id = id
self.img_id = img_id self.img_id = img_id
@ -49,9 +56,10 @@ class Doujinshi(object):
logger.info('Start download doujinshi: %s' % self.name) logger.info('Start download doujinshi: %s' % self.name)
if self.downloader: if self.downloader:
download_queue = [] download_queue = []
for i in range(1, self.pages + 1): for i in range(len(self.ext)):
download_queue.append('%s/%d/%d.%s' % (IMAGE_URL, int(self.img_id), i, self.ext)) download_queue.append('%s/%d/%d.%s' % (IMAGE_URL, int(self.img_id), i+1, EXT_MAP[self.ext[i]]))
self.downloader.download(download_queue, self.id)
self.downloader.download(download_queue, format_filename('%s-%s' % (self.id, self.name[:200])))
else: else:
logger.critical('Downloader has not be loaded') logger.critical('Downloader has not be loaded')

View File

@ -36,6 +36,11 @@ class Downloader(Singleton):
filename = filename if filename else os.path.basename(urlparse(url).path) filename = filename if filename else os.path.basename(urlparse(url).path)
base_filename, extension = os.path.splitext(filename) base_filename, extension = os.path.splitext(filename)
try: try:
if os.path.exists(os.path.join(folder, base_filename.zfill(3) + extension)):
logger.warning('File: {0} existed, ignore.'.format(os.path.join(folder, base_filename.zfill(3) +
extension)))
return 1, url
with open(os.path.join(folder, base_filename.zfill(3) + extension), "wb") as f: with open(os.path.join(folder, base_filename.zfill(3) + extension), "wb") as f:
response = request('get', url, stream=True, timeout=self.timeout) response = request('get', url, stream=True, timeout=self.timeout)
if response.status_code != 200: if response.status_code != 200:
@ -75,7 +80,7 @@ class Downloader(Singleton):
logger.log(15, '{0} download successfully'.format(data)) logger.log(15, '{0} download successfully'.format(data))
def download(self, queue, folder=''): def download(self, queue, folder=''):
if not isinstance(folder, (text)): if not isinstance(folder, text):
folder = str(folder) folder = str(folder)
if self.path: if self.path:

View File

@ -1,9 +1,11 @@
# coding: utf-8 # coding: utf-8
from __future__ import unicode_literals, print_function from __future__ import unicode_literals, print_function
from bs4 import BeautifulSoup import os
import re import re
import threadpool
import requests import requests
from bs4 import BeautifulSoup
from tabulate import tabulate from tabulate import tabulate
import nhentai.constant as constant import nhentai.constant as constant
@ -17,6 +19,66 @@ def request(method, url, **kwargs):
return requests.__dict__[method](url, proxies=constant.PROXY, verify=False, **kwargs) return requests.__dict__[method](url, proxies=constant.PROXY, verify=False, **kwargs)
def login_parser(username, password):
s = requests.Session()
s.proxies = constant.PROXY
s.verify = False
s.headers.update({'Referer': constant.LOGIN_URL})
s.get(constant.LOGIN_URL)
content = s.get(constant.LOGIN_URL).content
html = BeautifulSoup(content, 'html.parser').encode("ascii")
csrf_token_elem = html.find('input', attrs={'name': 'csrfmiddlewaretoken'})
if not csrf_token_elem:
raise Exception('Cannot find csrf token to login')
csrf_token = csrf_token_elem.attrs['value']
login_dict = {
'csrfmiddlewaretoken': csrf_token,
'username_or_email': username,
'password': password,
}
resp = s.post(constant.LOGIN_URL, data=login_dict)
if 'Invalid username (or email) or password' in resp.text:
logger.error('Login failed, please check your username and password')
exit(1)
html = BeautifulSoup(s.get(constant.FAV_URL).content, 'html.parser').encode("ascii")
count = html.find('span', attrs={'class': 'count'})
if not count:
logger.error('Cannot get count of your favorites, maybe login failed.')
count = int(count.text.strip('(').strip(')'))
pages = count / 25
pages += 1 if count % (25 * pages) else 0
logger.info('Your have %d favorites in %d pages.' % (count, pages))
if os.getenv('DEBUG'):
pages = 1
ret = []
doujinshi_id = re.compile('data-id="([\d]+)"')
def _callback(request, result):
ret.append(result)
thread_pool = threadpool.ThreadPool(5)
for page in range(1, pages+1):
try:
logger.info('Getting doujinshi id of page %d' % page)
resp = s.get(constant.FAV_URL + '?page=%d' % page).content
ids = doujinshi_id.findall(resp)
requests_ = threadpool.makeRequests(doujinshi_parser, ids, _callback)
[thread_pool.putRequest(req) for req in requests_]
thread_pool.wait()
except Exception as e:
logger.error('Error: %s, continue', str(e))
return ret
def doujinshi_parser(id_): def doujinshi_parser(id_):
if not isinstance(id_, (int,)) and (isinstance(id_, (str,)) and not id_.isdigit()): if not isinstance(id_, (int,)) and (isinstance(id_, (str,)) and not id_.isdigit()):
raise Exception('Doujinshi id({0}) is not valid'.format(id_)) raise Exception('Doujinshi id({0}) is not valid'.format(id_))
@ -25,49 +87,29 @@ def doujinshi_parser(id_):
logger.log(15, 'Fetching doujinshi information of id {0}'.format(id_)) logger.log(15, 'Fetching doujinshi information of id {0}'.format(id_))
doujinshi = dict() doujinshi = dict()
doujinshi['id'] = id_ doujinshi['id'] = id_
url = '{0}/{1}/'.format(constant.DETAIL_URL, id_) url = '{0}/{1}'.format(constant.DETAIL_URL, id_)
try: try:
response = request('get', url).content response = request('get', url).json()
except Exception as e: except Exception as e:
logger.critical(str(e)) logger.critical(str(e))
exit(1) exit(1)
html = BeautifulSoup(response, 'html.parser') doujinshi['name'] = response['title']['english']
doujinshi_info = html.find('div', attrs={'id': 'info'}) doujinshi['subtitle'] = response['title']['japanese']
doujinshi['img_id'] = response['media_id']
title = doujinshi_info.find('h1').text doujinshi['ext'] = ''.join(map(lambda s: s['t'], response['images']['pages']))
subtitle = doujinshi_info.find('h2') doujinshi['pages'] = len(response['images']['pages'])
doujinshi['name'] = title
doujinshi['subtitle'] = subtitle.text if subtitle else ''
doujinshi_cover = html.find('div', attrs={'id': 'cover'})
img_id = re.search('/galleries/([\d]+)/cover\.(jpg|png)$', doujinshi_cover.a.img.attrs['data-src'])
if not img_id:
logger.critical('Tried yo get image id failed')
exit(1)
doujinshi['img_id'] = img_id.group(1)
doujinshi['ext'] = img_id.group(2)
pages = 0
for _ in doujinshi_info.find_all('div', class_=''):
pages = re.search('([\d]+) pages', _.text)
if pages:
pages = pages.group(1)
break
doujinshi['pages'] = int(pages)
# gain information of the doujinshi # gain information of the doujinshi
information_fields = doujinshi_info.find_all('div', attrs={'class': 'field-name'}) needed_fields = ['character', 'artist', 'language']
needed_fields = ['Characters', 'Artists', 'Language', 'Tags'] for tag in response['tags']:
for field in information_fields: tag_type = tag['type']
field_name = field.contents[0].strip().strip(':') if tag_type in needed_fields:
if field_name in needed_fields: if tag_type not in doujinshi:
data = [sub_field.contents[0].strip() for sub_field in doujinshi[tag_type] = tag['name']
field.find_all('a', attrs={'class': 'tag'})] else:
doujinshi[field_name.lower()] = ', '.join(data) doujinshi[tag_type] += tag['name']
return doujinshi return doujinshi
@ -76,20 +118,19 @@ def search_parser(keyword, page):
logger.debug('Searching doujinshis of keyword {0}'.format(keyword)) logger.debug('Searching doujinshis of keyword {0}'.format(keyword))
result = [] result = []
try: try:
response = request('get', url=constant.SEARCH_URL, params={'q': keyword, 'page': page}).content response = request('get', url=constant.SEARCH_URL, params={'query': keyword, 'page': page}).json()
if 'result' not in response:
raise Exception('No result in response')
except requests.ConnectionError as e: except requests.ConnectionError as e:
logger.critical(e) logger.critical(e)
logger.warn('If you are in China, please configure the proxy to fu*k GFW.') logger.warn('If you are in China, please configure the proxy to fu*k GFW.')
exit(1) exit(1)
html = BeautifulSoup(response, 'html.parser') for row in response['result']:
doujinshi_search_result = html.find_all('div', attrs={'class': 'gallery'}) title = row['title']['english']
for doujinshi in doujinshi_search_result: title = title[:85] + '..' if len(title) > 85 else title
doujinshi_container = doujinshi.find('div', attrs={'class': 'caption'}) result.append({'id': row['id'], 'title': title})
title = doujinshi_container.text.strip()
title = (title[:85] + '..') if len(title) > 85 else title
id_ = re.search('/g/(\d+)/', doujinshi.a['href']).group(1)
result.append({'id': id_, 'title': title})
if not result: if not result:
logger.warn('Not found anything of keyword {}'.format(keyword)) logger.warn('Not found anything of keyword {}'.format(keyword))
@ -104,5 +145,6 @@ def print_doujinshi(doujinshi_list):
logger.info('Search Result\n' + logger.info('Search Result\n' +
tabulate(tabular_data=doujinshi_list, headers=headers, tablefmt='rst')) tabulate(tabular_data=doujinshi_list, headers=headers, tablefmt='rst'))
if __name__ == '__main__': if __name__ == '__main__':
print(doujinshi_parser("32271")) print(doujinshi_parser("32271"))

View File

@ -2,6 +2,9 @@
from __future__ import unicode_literals, print_function from __future__ import unicode_literals, print_function
import os import os
import string
import zipfile
import shutil
from nhentai.logger import logger from nhentai.logger import logger
@ -27,42 +30,86 @@ def urlparse(url):
return urlparse(url) return urlparse(url)
def readfile(path):
loc = os.path.dirname(__file__)
with open(os.path.join(loc, path), 'r') as file:
return file.read()
def generate_html(output_dir='.', doujinshi_obj=None): def generate_html(output_dir='.', doujinshi_obj=None):
image_html = '' image_html = ''
previous = ''
if doujinshi_obj is not None: if doujinshi_obj is not None:
doujinshi_dir = os.path.join(output_dir, str(doujinshi_obj.id)) doujinshi_dir = os.path.join(output_dir, format_filename('%s-%s' % (doujinshi_obj.id,
str(doujinshi_obj.name[:200]))))
else: else:
doujinshi_dir = '.' doujinshi_dir = '.'
file_list = os.listdir(doujinshi_dir) file_list = os.listdir(doujinshi_dir)
file_list.sort() file_list.sort()
for index, image in enumerate(file_list): for image in file_list:
if not os.path.splitext(image)[1] in ('.jpg', '.png'): if not os.path.splitext(image)[1] in ('.jpg', '.png'):
continue continue
try: image_html += '<img src="{0}" class="image-item"/>\n'\
next_ = file_list[file_list.index(image) + 1] .format(image)
except IndexError:
next_ = ''
image_html += '<img src="{0}" class="image-item {1}" attr-prev="{2}" attr-next="{3}">\n'\ html = readfile('viewer/index.html')
.format(image, 'current' if index == 0 else '', previous, next_) css = readfile('viewer/styles.css')
previous = image js = readfile('viewer/scripts.js')
with open(os.path.join(os.path.dirname(__file__), 'doujinshi.html'), 'r') as template:
html = template.read()
if doujinshi_obj is not None: if doujinshi_obj is not None:
title = doujinshi_obj.name title = doujinshi_obj.name
else: else:
title = 'nHentai HTML Viewer' title = 'nHentai HTML Viewer'
data = html.format(TITLE=title, IMAGES=image_html) data = html.format(TITLE=title, IMAGES=image_html, SCRIPTS=js, STYLES=css)
with open(os.path.join(doujinshi_dir, 'index.html'), 'w') as f: with open(os.path.join(doujinshi_dir, 'index.html'), 'w') as f:
f.write(data) f.write(data)
logger.log(15, 'HTML Viewer has been write to \'{0}\''.format(os.path.join(doujinshi_dir, 'index.html'))) logger.log(15, 'HTML Viewer has been write to \'{0}\''.format(os.path.join(doujinshi_dir, 'index.html')))
def generate_cbz(output_dir='.', doujinshi_obj=None):
if doujinshi_obj is not None:
doujinshi_dir = os.path.join(output_dir, format_filename('%s-%s' % (doujinshi_obj.id,
str(doujinshi_obj.name[:200]))))
cbz_filename = os.path.join(output_dir, format_filename('%s-%s.cbz' % (doujinshi_obj.id,
str(doujinshi_obj.name[:200]))))
else:
cbz_filename = './doujinshi.cbz'
doujinshi_dir = '.'
file_list = os.listdir(doujinshi_dir)
file_list.sort()
with zipfile.ZipFile(cbz_filename, 'w') as cbz_pf:
for image in file_list:
image_path = os.path.join(doujinshi_dir, image)
cbz_pf.write(image_path, image)
shutil.rmtree(doujinshi_dir, ignore_errors=True)
logger.log(15, 'Comic Book CBZ file has been write to \'{0}\''.format(doujinshi_dir))
def format_filename(s):
"""Take a string and return a valid filename constructed from the string.
Uses a whitelist approach: any characters not present in valid_chars are
removed. Also spaces are replaced with underscores.
Note: this method may produce invalid filenames such as ``, `.` or `..`
When I use this method I prepend a date string like '2009_01_15_19_46_32_'
and append a file extension like '.txt', so I avoid the potential of using
an invalid filename.
"""
valid_chars = "-_.() %s%s" % (string.ascii_letters, string.digits)
filename = ''.join(c for c in s if c in valid_chars)
filename = filename.replace(' ', '_') # I don't like spaces in filenames.
return filename

24
nhentai/viewer/index.html Normal file
View File

@ -0,0 +1,24 @@
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title>{TITLE}</title>
<style>
{STYLES}
</style>
</head>
<body>
<nav id="list">
{IMAGES}</nav>
<div id="image-container">
<span id="page-num"></span>
<div id="dest"></div>
</div>
<script>
{SCRIPTS}
</script>
</body>
</html>

62
nhentai/viewer/scripts.js Normal file
View File

@ -0,0 +1,62 @@
const pages = Array.from(document.querySelectorAll('img.image-item'));
let currentPage = 0;
function changePage(pageNum) {
const previous = pages[currentPage];
const current = pages[pageNum];
if (current == null) {
return;
}
previous.classList.remove('current');
current.classList.add('current');
currentPage = pageNum;
const display = document.getElementById('dest');
display.style.backgroundImage = `url("${current.src}")`;
document.getElementById('page-num')
.innerText = [
(pageNum + 1).toLocaleString(),
pages.length.toLocaleString()
].join('\u200a/\u200a');
}
changePage(0);
document.getElementById('list').onclick = event => {
if (pages.includes(event.target)) {
changePage(pages.indexOf(event.target));
}
};
document.getElementById('image-container').onclick = event => {
const width = document.getElementById('image-container').clientWidth;
const clickPos = event.clientX / width;
if (clickPos < 0.5) {
changePage(currentPage - 1);
} else {
changePage(currentPage + 1);
}
};
document.onkeypress = event => {
switch (event.key.toLowerCase()) {
// Previous Image
case 'arrowleft':
case 'a':
changePage(currentPage - 1);
break;
// Next Image
case ' ':
case 'enter':
case 'arrowright':
case 'd':
changePage(currentPage + 1);
break;
}
};

69
nhentai/viewer/styles.css Normal file
View File

@ -0,0 +1,69 @@
*, *::after, *::before {
box-sizing: border-box;
}
img {
vertical-align: middle;
}
html, body {
display: flex;
background-color: #e8e6e6;
height: 100%;
width: 100%;
padding: 0;
margin: 0;
font-family: sans-serif;
}
#list {
height: 100%;
overflow: auto;
width: 260px;
text-align: center;
}
#list img {
width: 200px;
padding: 10px;
border-radius: 10px;
margin: 15px 0;
cursor: pointer;
}
#list img.current {
background: #0003;
}
#image-container {
flex: auto;
height: 100vh;
background: #222;
color: #fff;
text-align: center;
cursor: pointer;
-webkit-user-select: none;
user-select: none;
position: relative;
}
#image-container #dest {
height: 100%;
width: 100%;
background-size: contain;
background-repeat: no-repeat;
background-position: center;
}
#image-container #page-num {
position: absolute;
font-size: 18pt;
left: 10px;
bottom: 5px;
font-weight: bold;
opacity: 0.75;
text-shadow: /* Duplicate the same shadow to make it very strong */
0 0 2px #222,
0 0 2px #222,
0 0 2px #222;
}

View File

@ -1,16 +1,19 @@
# coding: utf-8 # coding: utf-8
from __future__ import print_function, unicode_literals from __future__ import print_function, unicode_literals
import sys
import codecs import codecs
from setuptools import setup, find_packages from setuptools import setup, find_packages
from nhentai import __version__, __author__, __email__ from nhentai import __version__, __author__, __email__
with open('requirements.txt') as f: with open('requirements.txt') as f:
requirements = [l for l in f.read().splitlines() if l] requirements = [l for l in f.read().splitlines() if l]
def long_description(): def long_description():
with codecs.open('README.md', 'r') as f: with codecs.open('README.md', 'rb') as f:
return str(f.read()) if sys.version_info >= (3, 0, 0):
return str(f.read())
setup( setup(
name='nhentai', name='nhentai',