DEV Community

haywhnk

"mangabank.org" has disappeared.

French:
https://rentry.co/hzpue/

Spanish:
https://rentry.co/agr95

Japanese:
https://crieit.net/posts/Bank-6249a7b59dfd9

What Manga Bank was.

Manga Bank has disappeared.
It was succeeded by a new illegal site called 13dl.me. Because the new site abandoned online reading and became a download-type leech site, I also updated this article on the board to mark the new milestone.

Past board
Until mangaBank disappears
https://crieit.net/boards/manga-B

I had long wondered how to write JavaScript, so I wrote a program in JavaScript with Node.js to see what this new illegal site looks like.

It took a while because I wasn't familiar with Promises, but JavaScript has more documentation than Nim, so it wasn't long before the program worked.

Install Node.js, then install axios and cheerio with the npm package manager.

https://nodejs.org/

https://github.com/axios/axios#installing

https://cheerio.js.org/

axios is a module that makes HTTPS requests. It returns a Promise, so the response can be processed once the request succeeds.
cheerio parses HTML text into an object that can be queried with CSS selectors.
I used it the way I would use nokogiri in Ruby, beautifulsoup4 in Python, goquery in Go, or nimquery in Nim.
These convenience selectors are similar in usage but differ in detail, so you cannot use them properly without examining each one carefully. If you are not used to them, you can manage with regular expressions instead. In that case it makes little difference which language you use, because standard regular expression syntax can be written in much the same way everywhere.
Go also has Perl-style regular expression modules, though they are not part of the standard library.

RE2 is a fast, safe, thread-friendly alternative to backtracking regular expression engines like those used in PCRE, Perl, and Python. It is a C++ library.

License
BSD-3-Clause license
https://github.com/google/re2/
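
As a rough illustration of the regular-expression approach mentioned above, the sketch below pulls title and href pairs out of anchor tags without cheerio. The function name `extractLinks` and the sample markup are made up for this example, and it assumes simple markup with double-quoted attributes; it is not a general HTML parser.

```javascript
// Minimal sketch: extract title/href pairs from <a> tags with regular
// expressions instead of a CSS selector library. Assumes double-quoted
// attributes; real pages may need a proper parser such as cheerio.
function extractLinks(html) {
  const results = [];
  const anchorRe = /<a\b([^>]*)>/gi;           // each opening <a ...> tag
  const attrRe = /([\w-]+)\s*=\s*"([^"]*)"/g;  // name="value" pairs inside it
  for (const [, attrText] of html.matchAll(anchorRe)) {
    const attrs = {};
    for (const [, name, value] of attrText.matchAll(attrRe)) {
      attrs[name.toLowerCase()] = value;
    }
    // Keep only anchors that carry both attributes, as the programs below do.
    if (attrs.title !== undefined && attrs.href !== undefined) {
      results.push({ title: attrs.title, link: attrs.href });
    }
  }
  return results;
}

// Hypothetical sample markup, not taken from the real site.
const sampleHtml = `
  <a href="/a/" title="Sample Title A">A</a>
  <a href="/page/2/" title="Next">Next</a>
  <a href="/no-title/">skip</a>`;
console.log(extractLinks(sampleHtml));
```

The same two patterns could be written almost unchanged in Ruby, Python, or Perl, which is the portability point made above.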

1. Base

const cheerio = require('cheerio'),
    axios = require('axios');
//  url = `https://13dl.me/`;

var url = `https://13dl.me/list/popular/`;

var counter = 0;

function recursive(url){
    let temp_url = url;
    axios.get(url)
    .then((response) => {
        if (response.status == 200){
            let $ = cheerio.load(response.data);
            $('a').each(function (i, e) {
                let title = $(e).attr('title');      
                let link = $(e).attr('href');      
                if (title !== undefined){
                    let h = /^Home/.test(title),
                        po = /^Popular\s/.test(title),
                        p = /^Prev/.test(title),
                        pa = /^Page/.test(title),
                        n = /^Next/.test(title);
                    if (h || po || p || pa || n || title === ``){
                        //unless
                    } else {
                        counter++;
                        console.log(counter + `{` + title +`{` + link);
//                      console.log(counter,title);
//                      console.log(`____________________`);
                    }
                }
                if (title === `Next`){
                    url = link;
//                    recursive(url);
                }
            })
            if (url !== temp_url){
                    recursive(url);
            }
        }
    }).catch(function (e) {
        console.log(e);
        recursive(url);
    });
}

recursive(url);

What this program does is enumerate the contents of the site.
It calls the function recursively. I didn't know how to write that in JavaScript, but it worked, so next let's record the listed contents in an SQLite database.
Combined with awk, even this base program can produce an SQLite3 database file: convert the output to CSV using the { delimiter, import the CSV with SQLite3, and save it as a database file.
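
The conversion step can also be sketched in JavaScript rather than awk (a swapped-in tool, not what the text above uses). The helper name `toCsvRow` is hypothetical. It assumes each line has the `counter{title{link` shape printed by the base program and that the link itself never contains `{`.

```javascript
// Sketch: turn one "counter{title{link" line into a quoted CSV row.
// Splits on the first and last "{" so that a title containing "{" survives.
function toCsvRow(line) {
  const first = line.indexOf('{');
  const last = line.lastIndexOf('{');
  if (first === -1 || first === last) {
    throw new Error(`expected two "{" delimiters: ${line}`);
  }
  const fields = [
    line.slice(0, first),         // counter
    line.slice(first + 1, last),  // title
    line.slice(last + 1),         // link
  ];
  // Quote every field and double embedded quotes, the usual CSV convention.
  return fields.map((f) => `"${f.replace(/"/g, '""')}"`).join(',');
}

// Hypothetical input line, not real site data.
console.log(toCsvRow('1{Sample "Title", Vol. 1{https://example.com/a/'));
```

The resulting file can then be loaded from the sqlite3 shell, for example with `.mode csv` followed by `.import`.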

Ruby

A similar program in Ruby looks like this:

require 'nokogiri'
require 'open-uri'

url = 'https://13dl.me/list/popular/'
counter = 0
threads = []

def recursive(counter,url,threads) 
  html = URI.open(url).read
  doc = Nokogiri::HTML.parse(html)
  html = nil
  doc.css('a').each do |x|
    xx = x.attr('title')
    pa = /^Page/.match?("#{xx}")
    p = /^Prev/.match?("#{xx}")

    if xx && xx !='' &&  xx !='Home' && xx !='Popular Manga' then

      unless pa | p then
        n = /^Next/.match?("#{xx}")
        link = ''
        if n then
          link = x.attr('href')
          threads << Thread.new do
            recursive(counter,link,threads)
          end
        else
          counter += 1
          puts "#{counter} #{xx}"
        end
      end

    end

  end
  doc = nil
end

recursive(counter,url,threads)

threads.each(&:join)


Python

# In a notebook: !python -m pip install requests beautifulsoup4
import re  # needed for re.search below
import requests
from bs4 import BeautifulSoup

url = 'https://13dl.me/list/popular/'
counter = 0

def scrape(url, counter):
    response = requests.get(url)
    soup = BeautifulSoup(response.content, "html.parser")  # name the parser explicitly
    response.close()
    xx = soup.find_all("a",title=True)
    next_url = ""
    for x in xx:
        if (x['title'] != '' and x['title'] != "Home" and x['title'] != "Popular Manga"):
            pa_tf = bool(re.search("^Page",x['title']))
            p_tf = bool(re.search("^Prev",x['title']))
            if (pa_tf == False and p_tf == False):
                if (x['title'] != "Next"):
                    counter += 1
                    print(counter,x['title'])
                else:
                    next_url = x['href']

    del xx,soup
    return next_url, counter

while(url != ""):
    url,counter = scrape(url, counter)

recursive
https://rentry.co/5ibqk
while
https://rentry.co/ir4b3

Nim

while
https://rentry.co/856s2
threadpool ... Work in Progress
https://rentry.co/ze8f4

Perl 5

https://rentry.co/perl5_manga

Go

https://rentry.co/75gch
&sync.WaitGroup{} ... Work in Progress
https://rentry.co/wikwa
contents amount ,page | channel , sync.Mutex
https://rentry.co/o8fp4

2. SQLite3

const cheerio = require('cheerio'), 
    axios = require('axios');
const sqlite3 = require('sqlite3').verbose();

const db = new sqlite3.Database('13dlme0.db',(err) => {
    if(err) {
        return console.error(err.message);
    }
    console.log('Connect to SQLite database.');
});

db.serialize(() => {
    db.run(`CREATE TABLE IF NOT EXISTS manga(id integer,title text,link text)`);
});

var url = `https://13dl.me/list/popular/`;

var counter = 0;
function recursive(url){
    let temp_url = url;
    axios.get(url)
    .then((response) => {
        if (response.status == 200){
            let $ = cheerio.load(response.data);
            $('a').each(function (i, e) {
                let title = $(e).attr('title');      
                let link = $(e).attr('href');      
                if (title !== undefined){
                    let h = /^Home/.test(title),
                        po = /^Popular\s/.test(title),
                        p = /^Prev/.test(title),
                        pa = /^Page/.test(title),
                        n = /^Next/.test(title);
                    if (h || po || p || pa || n || title === ``){
                        //unless
                    } else {
                        counter++;
//                        console.log(counter + ` ** ` + title + ` ** ` + link);
                        db.run(`insert into manga(id,title) VALUES(?,?)`,[counter,title]);
                        console.log(counter,title);
                        console.log(`____________________`);
                    }
                }
                if (title === `Next`){
                    url = link;
                }
            })
            if (temp_url != url){
                recursive(url);
            };
        }
    }).catch(function (e) {
        console.log(e);
        recursive(url);
    });
}
recursive(url);

I haven't called db.close() yet.
How should I call db.close()?

3. SQLite3 + db.close()

const cheerio = require('cheerio'), 
    axios = require('axios');
const sqlite3 = require('sqlite3').verbose();

const db = new sqlite3.Database('13dlme_test1-1.db',(err) => {
    if(err) {
        return console.error(err.message);
    }
    console.log('Connect to SQLite database.');
});

db.serialize(() => {
    db.run(`CREATE TABLE IF NOT EXISTS manga(id integer,title text,link text)`);
});
var url = `https://13dl.me/list/popular/`;
var counter = 0;
function recursive(url){
    const temp_url = url;
    const promise1 = axios.get(url)
    const promiseData1 = promise1.then((response) => {
        if (response.status == 200){
            let $ = cheerio.load(response.data);
            $('a').each(function (i, e) {
                const title = $(e).attr('title');      
                const link = $(e).attr('href');      
                if (title !== undefined){
                    const h = /^Home/.test(title),
                        po = /^Popular\s/.test(title),
                        p = /^Prev/.test(title),
                        pa = /^Page/.test(title),
                        n = /^Next/.test(title);
                    if (h || po || p || pa || n || title === ``){
                        //unless
                    } else {
                        counter++;
//                        console.log(counter + ` ** ` + title + ` ** ` + link);
                        db.serialize(() => {
                            db.run(`insert into manga(id,title) VALUES(?,?)`,[counter,title]);
                        });
                        console.log(counter,title);
                        console.log(`____________________`);
                    }
                }
                if ((title === `Next`) && (link != undefined)){
                    url = link;
                }
            })
        } else {
            console.log(response);
        }

        if (temp_url !== url){
            return recursive(url);
//          `:keep going:`;
        } else {
            console.log('Close SQLite database.');
            return `:stop:`;
        }

    }).catch(function (e) {
        console.log(e);
    });

    Promise.all([promiseData1]).then((value) => {
        if (value[0] === `:stop:`){
            db.close((err) => {
            if(err) {
                console.error(err.message);
                }
            });
        }
    });
}

recursive(url);

4. async/await

const cheerio = require('cheerio'), 
    axios = require('axios');
const sqlite3 = require('sqlite3').verbose();

const db = new sqlite3.Database('13dlme_test2.db',(err) => {
    if(err) {
        return console.error(err.message);
    }
    console.log('Connect to SQLite database.');
});

db.serialize(() => {
    db.run(`CREATE TABLE IF NOT EXISTS manga(id integer,title text,link text)`);
});

var url = `https://13dl.me/list/popular/`;

var counter = 0;
const recursive = async (url) => {
    console.log(url);
    const temp_url = url;
    try {
        const {data} = await axios.get(url);
        const $ = cheerio.load(data)
        $('a').each(function (i, e) {
            const title = $(e).attr('title');      
            const link = $(e).attr('href');      
            if (title !== undefined){
                const h = /^Home/.test(title),
                    po = /^Popular\s/.test(title),
                    p = /^Prev/.test(title),
                    pa = /^Page/.test(title),
                    n = /^Next/.test(title);
                if (h || po || p || pa || n || title === ``){
                    //unless
                } else {
                    counter++;
//                    console.log(counter + ` ** ` + title + ` ** ` + link);
                    db.run(`insert into manga(id,title) VALUES(?,?)`,[counter,title]);
                    console.log(counter,title);
                    console.log(`____________________`);
                }
            }
            if ((title === `Next`) && (link != undefined)){
                url = link;
            }
        });

        if (temp_url !== url) {
            // Await the next page so errors propagate to the caller.
            await recursive(url);
            //`:keep going:`;
        } else {
            console.log('Close SQLite database.');
            db.close();
            //`:stop:`;
        }

    } catch(error) {
        throw error;
    }
}

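// Report a rethrown error instead of leaving the rejection unhandled.
recursive(url).catch((e) => console.error(e));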

However, even in case 2, the data seems to be written to the database without closing it, so the commits evidently go through.

On an old PC that writes to a hard disk, committing rows one at a time, or repeatedly opening and closing the SQLite file, sacrifices speed. Instead, you can create the table in memory, write the data there, and write the result to the .db file at the end. That may be better because you only have to commit once.
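
One way to approximate the "commit once at the end" idea is to buffer the rows in a JavaScript array and write them in a single transaction. This is a related technique, not an in-memory SQLite table. In this sketch `writeAll` is a made-up helper, `db` is assumed to behave like a node-sqlite3 Database with `serialize` and `run`, and a stand-in object replaces the real database so the example runs without any packages.

```javascript
// Sketch: collect rows in memory first, then write them in one transaction.
// `db` is assumed to expose serialize(fn) and run(sql, [params], [callback]).
function writeAll(db, rows, done = () => {}) {
  db.serialize(() => {
    db.run('BEGIN TRANSACTION');
    for (const { id, title } of rows) {
      db.run('INSERT INTO manga(id,title) VALUES(?,?)', [id, title]);
    }
    db.run('COMMIT', done); // callback fires after the commit statement runs
  });
}

// Stand-in for a real database: it only records what it is asked to run.
const log = [];
const fakeDb = {
  serialize(fn) { fn(); },
  run(sql, params, cb) {
    if (typeof params === 'function') { cb = params; params = []; }
    log.push([sql, params || []]);
    if (cb) cb(null);
  },
};

writeAll(fakeDb, [{ id: 1, title: 'A' }, { id: 2, title: 'B' }], () => {
  console.log(`${log.length} statements issued`); // BEGIN + 2 inserts + COMMIT
});
```

With a real database the scraper would push each title onto the array and call `writeAll` once after the last page.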

Programs 3 and 4 were written to close the database once all the data had been written. Because the writes are non-blocking, I expected the close to run before every insert had finished, yet the database closed cleanly. It is odd, and there is room to investigate, but that is difficult when nothing results in an error.
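
To make the ordering explicit instead of relying on that behaviour, the page loop can await each write and close in a `finally` block. The sketch below is only an outline under assumptions: `crawl`, `fetchPage`, and the `db.insert` / `db.close` methods are hypothetical stand-ins (with node-sqlite3 they would have to wrap `db.run` and `db.close` in Promises), and the pages are fake data.

```javascript
// Sketch: follow "Next" links with a plain loop and close exactly once.
// fetchPage(url) is assumed to resolve to { titles: [...], next: url-or-null }.
async function crawl(startUrl, fetchPage, db) {
  let url = startUrl;
  let counter = 0;
  try {
    while (url) {
      const { titles, next } = await fetchPage(url);
      for (const title of titles) {
        counter += 1;
        await db.insert(counter, title); // wait for each write to finish
      }
      url = next; // null on the last page ends the loop
    }
  } finally {
    await db.close(); // runs after the last insert, or after an error
  }
  return counter;
}

// Fake pages and a fake database so the sketch runs on its own.
const pages = {
  '/p1': { titles: ['A', 'B'], next: '/p2' },
  '/p2': { titles: ['C'], next: null },
};
const events = [];
const demoDb = {
  insert: async (id, title) => { events.push(`insert ${id} ${title}`); },
  close: async () => { events.push('close'); },
};

crawl('/p1', async (u) => pages[u], demoDb).then((n) => {
  console.log(n, events);
});
```

Because every insert is awaited before the loop moves on, the close can only run after the final write.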

There is nothing special to explain.

Ruby sqlite3

require 'nokogiri'
require 'open-uri'
require 'sqlite3'

url = 'https://13dl.me/list/popular/'
counter = 0
threads = []

SQL =<<EOS                                                                     
create table manga(
    id INTEGER PRIMARY KEY,
    title text
    );
EOS

db = SQLite3::Database.open("13dlme.db")
db.execute(SQL)

def recursive(counter,url,threads,db) 
  html = URI.open(url).read
  doc = Nokogiri::HTML.parse(html)
  html = nil
  doc.css('a').each do |x|
    xx = x.attr('title')
    pa = /^Page/.match?("#{xx}")
    p = /^Prev/.match?("#{xx}")

    if xx && xx !='' &&  xx !='Home' && xx !='Popular Manga' then

      unless pa | p then
        n = /^Next/.match?("#{xx}")
        link = ''
        if n then
          link = x.attr('href')
          threads << Thread.new do
            recursive(counter,link,threads,db)
          end
        else
          counter += 1
          puts "#{counter} #{xx}"
          # Bind values as parameters so titles containing quotes do not break the SQL.
          db.execute("insert into manga(id,title) values(?,?)", [counter, xx])
        end
      end

    end

  end
  doc = nil
end

recursive(counter,url,threads,db)
threads.each(&:join)
db.close

As for running the program: if Termux or UserLAnd works on your smartphone, Node.js can be installed as a package even on a non-rooted device, so the program runs there too.

In Termux, you can run the programs above after installing Node.js and then axios, cheerio, and sqlite3 with the npm package manager. UserLAnd starts by having you select an OS, so depending on the OS the package manager may be apt or apk. Apart from that difference, once Node.js is installed the rest is almost the same as in Termux.

For iOS there is a terminal emulator app called iSH, but I haven't tested it. I don't have a Mac. I did get hold of an old iPhone, but I have a visceral dislike of Apple as a company, so I don't touch it readily.

With Ruby and Python, various differences arise between operating systems even before you install a library. With Node.js such differences may not arise, which could be an advantage. Whether or not JavaScript is a good choice, it should be possible to rewrite the program so that it runs in the browser alone.

If you are a rights holder and a title you manage appears in the list, please proceed with a takedown request.

When the site was Manga Bank, the image data behind Cloudflare could not be accessed from IP addresses outside Japan. This time that does not seem to be the case, and the Japanese titles are also written in the Latin alphabet. The site may be conscious of overseas demand, given that it provides romanized titles.
