Merge pull request #342 from CIRCL/advanced_crawler

Advanced_crawler + tag by daterange + bootstrap 4 migration + bugs fix
Alexandre Dulaunoy 2019-04-25 15:39:19 +02:00 committed by GitHub
commit ecd14b91d9
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
95 changed files with 6782 additions and 1893 deletions

.gitignore (vendored, 2 changes)

@@ -34,6 +34,8 @@ var/www/submitted
 bin/packages/config.cfg
 bin/packages/config.cfg.backup
 configs/keys
+configs/update.cfg
+update/current_version
 files

 # Pystemon archives


@@ -12,46 +12,205 @@ Redis and ARDB overview
     - DB 0 - PubSub + Queue and Paste content LRU cache
     - DB 1 - _Mixer_ Cache
 * ARDB on TCP port 6382
-    - DB 1 - Curve
-    - DB 2 - Trending
-    - DB 3 - Terms
-    - DB 4 - Sentiments
+    - DB 1 - Curve
+    - DB 2 - TermFreq
+    - DB 3 - Trending
+    - DB 4 - Sentiments
+    - DB 5 - TermCred
+    - DB 6 - Tags
+    - DB 7 - Metadata
+    - DB 8 - Statistics
+    - DB 9 - Crawler
 * ARDB on TCP port <year>
     - DB 0 - Lines duplicate
     - DB 1 - Hashes
# Database Map:
## DB0 - Core:
##### Update keys:
| Key | Value |
| ------ | ------ |
| | |
| ail:version | **current version** |
| | |
| ail:update_**update_version** | **background update name** |
| | **background update name** |
| | **...** |
| | |
| ail:update_date_v1.5 | **update date** |
| | |
| ail:update_error | **update message error** |
| | |
| ail:update_in_progress | **update version in progress** |
| ail:current_background_update | **current update version** |
| | |
| ail:current_background_script | **name of the background script currently executed** |
| ail:current_background_script_stat | **progress in % of the background script** |
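The DB0 keys above can be combined into a single status readout for the dashboard. A minimal sketch, using a plain dict in place of ARDB DB0 (the helper name `update_status` is illustrative, not part of AIL):

```python
# Sketch: summarising the DB0 update keys from the table above.
# A dict stands in for ARDB DB0; key names follow the table.

def update_status(db):
    """Collect the current background-update state from DB0 keys."""
    return {
        'version': db.get('ail:version'),
        'background_update': db.get('ail:current_background_update'),
        'script': db.get('ail:current_background_script'),
        'progress': int(db.get('ail:current_background_script_stat', 0)),
        'error': db.get('ail:update_error'),
    }

db = {
    'ail:version': 'v1.5',
    'ail:current_background_update': 'v1.5',
    'ail:current_background_script': 'crawled_screenshot',
    'ail:current_background_script_stat': '45',
}
print(update_status(db)['progress'])  # 45
```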
## DB2 - TermFreq:
##### Set:
| Key | Value |
| ------ | ------ |
| TrackedSetTermSet | **tracked_term** |
| TrackedSetSet | **tracked_set** |
| TrackedRegexSet | **tracked_regex** |
| | |
| tracked_**tracked_term** | **item_path** |
| set_**tracked_set** | **item_path** |
| regex_**tracked_regex** | **item_path** |
| | |
| TrackedNotifications | **tracked_term / set / regex** |
| | |
| TrackedNotificationTags_**tracked_term / set / regex** | **tag** |
| | |
| TrackedNotificationEmails_**tracked_term / set / regex** | **email** |
##### Zset:
| Key | Field | Value |
| ------ | ------ | ------ |
| per_paste_TopTermFreq_set_month | **term** | **nb_seen** |
| per_paste_TopTermFreq_set_week | **term** | **nb_seen** |
| per_paste_TopTermFreq_set_day_**epoch** | **term** | **nb_seen** |
| | | |
| TopTermFreq_set_month | **term** | **nb_seen** |
| TopTermFreq_set_week | **term** | **nb_seen** |
| TopTermFreq_set_day_**epoch** | **term** | **nb_seen** |
##### Hset:
| Key | Field | Value |
| ------ | ------ | ------ |
| TrackedTermDate | **tracked_term** | **epoch** |
| TrackedSetDate | **tracked_set** | **epoch** |
| TrackedRegexDate | **tracked_regex** | **epoch** |
| | | |
| BlackListTermDate | **blacklisted_term** | **epoch** |
| | | |
| **epoch** | **term** | **nb_seen** |
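The day-level zsets above embed an epoch in the key name. A sketch of how such a key could be built; it assumes the epoch is the day's midnight UTC timestamp, which the tables do not state explicitly:

```python
# Sketch: building the per-day TopTermFreq key from the tables above.
# Assumption: **epoch** is the day's 00:00 UTC timestamp.
import datetime

def top_term_day_key(date, per_paste=False):
    epoch = int(datetime.datetime(date.year, date.month, date.day,
                                  tzinfo=datetime.timezone.utc).timestamp())
    prefix = 'per_paste_' if per_paste else ''
    return '{}TopTermFreq_set_day_{}'.format(prefix, epoch)

print(top_term_day_key(datetime.date(2019, 4, 25)))
# TopTermFreq_set_day_1556150400
```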
## DB6 - Tags:
##### Hset:
| Key | Field | Value |
| ------ | ------ | ------ |
| per_paste_**epoch** | **term** | **nb_seen** |
| | | |
| tag_metadata:**tag** | first_seen | **date** |
| tag_metadata:**tag** | last_seen | **date** |
##### Set:
| Key | Value |
| ------ | ------ |
| list_tags | **tag** |
| active_taxonomies | **taxonomy** |
| active_galaxies | **galaxy** |
| active_tag_**taxonomy or galaxy** | **tag** |
| synonym_tag_misp-galaxy:**galaxy** | **tag synonym** |
| list_export_tags | **user_tag** |
| **tag**:**date** | **paste** |
##### old:
| Key | Value |
| ------ | ------ |
| *tag* | *paste* |
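The `**tag**:**date**` sets above index pastes by tag and day, which is what the "tag by daterange" feature in this PR can build on: a date-range query is the union of the per-day sets. A sketch with plain Python sets standing in for ARDB (the helper name is illustrative):

```python
# Sketch: union the per-day DB6 tag sets over a date range.
import datetime

def pastes_for_tag_daterange(db, tag, start, end):
    items = set()
    day = start
    while day <= end:
        key = '{}:{}'.format(tag, day.strftime('%Y%m%d'))
        items |= db.get(key, set())
        day += datetime.timedelta(days=1)
    return items

db = {
    'infoleak:automatic-detection="iban":20190424': {'paste_a'},
    'infoleak:automatic-detection="iban":20190425': {'paste_b'},
}
tag = 'infoleak:automatic-detection="iban"'
print(sorted(pastes_for_tag_daterange(db, tag,
      datetime.date(2019, 4, 24), datetime.date(2019, 4, 25))))
# ['paste_a', 'paste_b']
```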
## DB7 - Metadata:
#### Crawled Items:
##### Hset:
| Key | Field | Value |
| ------ | ------ | ------ |
| paste_metadata:**item path** | super_father | **first url crawled** |
| | father | **item father** |
| | domain | **crawled domain**:**domain port** |
| | screenshot | **screenshot hash** |
##### Set:
| Key | Field |
| ------ | ------ |
| tag:**item path** | **tag** |
| | |
| paste_children:**item path** | **item path** |
| | |
| hash_paste:**item path** | **hash** |
| base64_paste:**item path** | **hash** |
| hexadecimal_paste:**item path** | **hash** |
| binary_paste:**item path** | **hash** |
##### Zset:
| Key | Field | Value |
| ------ | ------ | ------ |
| nb_seen_hash:**hash** | **item** | **nb_seen** |
| base64_hash:**hash** | **item** | **nb_seen** |
| binary_hash:**hash** | **item** | **nb_seen** |
| hexadecimal_hash:**hash** | **item** | **nb_seen** |
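Since `nb_seen_hash:<hash>` maps each item to a per-item count, the total number of sightings of a hash is the sum of the zset scores. A minimal sketch with a dict standing in for the zset:

```python
# Sketch: total sightings of a hash from the DB7 nb_seen_hash zset.

def total_hash_sightings(zset):
    return sum(zset.values())

nb_seen_hash = {'item/2019/04/25/a.gz': 3, 'item/2019/04/25/b.gz': 1}
print(total_hash_sightings(nb_seen_hash))  # 4
```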
## DB9 - Crawler:
##### Hset:
| Key | Field | Value |
| ------ | ------ | ------ |
| **service type**_metadata:**domain** | first_seen | **date** |
| | last_check | **date** |
| | ports | **port**;**port**;**port** ... |
| | paste_parent | **parent last crawling (can be auto or manual)** |
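The `ports` field above packs all crawled ports into one `;`-separated string; recovering the list is a split plus an int cast. A sketch (helper name illustrative):

```python
# Sketch: parsing the DB9 metadata 'ports' field ("port;port;port ...").

def parse_ports(ports_field):
    return [int(p) for p in ports_field.split(';') if p]

print(parse_ports('80;8080;443'))  # [80, 8080, 443]
```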
##### Zset:
| Key | Field | Value |
| ------ | ------ | ------ |
| crawler\_history\_**service type**:**domain**:**port** | **item root (first crawled item)** | **epoch (seconds)** |
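Because the crawl-history zset scores entries by epoch, history lookups are range queries. A sketch of the key layout and an epoch-range filter, with a list of `(item_root, epoch)` pairs standing in for the zset (domain and item paths are made up for illustration):

```python
# Sketch: crawl-history lookups against the zset above,
# keyed crawler_history_<service type>:<domain>:<port>.

def history_key(service_type, domain, port):
    return 'crawler_history_{}:{}:{}'.format(service_type, domain, port)

def crawls_between(history, epoch_min, epoch_max):
    return [item for item, epoch in history if epoch_min <= epoch <= epoch_max]

key = history_key('onion', 'abcdef.onion', 80)
print(key)  # crawler_history_onion:abcdef.onion:80

history = [('crawled/2019/04/24/root1', 1556100000),
           ('crawled/2019/04/25/root2', 1556186400)]
print(crawls_between(history, 1556150400, 1556236800))
# ['crawled/2019/04/25/root2']
```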
##### Set:
| Key | Value |
| ------ | ------ |
| screenshot:**sha256** | **item path** |
##### crawler config:
| Key | Value |
| ------ | ------ |
| crawler\_config:**crawler mode**:**service type**:**domain** | **json config** |
##### automatic crawler config:
| Key | Value |
| ------ | ------ |
| crawler\_config:**crawler mode**:**service type**:**domain**:**url** | **json config** |
###### example json config:
```json
{
"closespider_pagecount": 1,
"time": 3600,
"depth_limit": 0,
"har": 0,
"png": 0
}
```
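A stored per-domain config like the one above is merged over default options when the crawler loads it; the defaults below mirror the `default_crawler_config` dict added to `bin/Crawler.py` in this PR, except `depth_limit`, which really comes from the config file (hardcoded here for the sketch). The merge keeps only known options:

```python
# Sketch: merging a stored crawler json config over defaults,
# in the spirit of get_crawler_config() from this PR.
import json

default_crawler_config = {'html': 1, 'har': 1, 'png': 1,
                          'depth_limit': 1, 'closespider_pagecount': 50}

def merge_crawler_config(raw_json):
    config = json.loads(raw_json) if raw_json else {}
    return {opt: config.get(opt, default)
            for opt, default in default_crawler_config.items()}

raw = '{"closespider_pagecount": 1, "time": 3600, "depth_limit": 0, "har": 0, "png": 0}'
print(merge_crawler_config(raw))
# {'html': 1, 'har': 0, 'png': 0, 'depth_limit': 0, 'closespider_pagecount': 1}
```

Note that unknown keys such as `time` are dropped here; in the PR's `get_crawler_config`, auto mode additionally keeps `time` for rescheduling.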
ARDB overview
---------------------------
 ARDB_DB
 * DB 1 - Curve
 * DB 2 - TermFreq
-    ----------------------------------------- TERM ----------------------------------------
-    SET - 'TrackedRegexSet' term
-    HSET - 'TrackedRegexDate' tracked_regex today_timestamp
-    SET - 'TrackedSetSet' set_to_add
-    HSET - 'TrackedSetDate' set_to_add today_timestamp
-    SET - 'TrackedSetTermSet' term
-    HSET - 'TrackedTermDate' tracked_regex today_timestamp
-    SET - 'TrackedNotificationEmails_'+term/set email
-    SET - 'TrackedNotifications' term/set
     ----------------------------------------- SENTIMENT ------------------------------------
     SET - 'Provider_set' Provider
     KEY - 'UniqID' INT
     SET - provider_timestamp UniqID
     SET - UniqID avg_score
 * DB 3 - Trending
 * DB 4 - Sentiment
 * DB 5 - TermCred
 * DB 6 - Tags
 * DB 7 - Metadata
 * DB 8 - Statistics
 * DB 7 - Metadata:
+----------------------------------------------------------------------------------------
 ----------------------------------------- BASE64 ----------------------------------------
 HSET - 'metadata_hash:'+hash      'saved_path' saved_path
@@ -71,18 +230,10 @@ ARDB_DB
     SET - 'hash_base64_all_type' hash_type *
     SET - 'hash_binary_all_type' hash_type *
-    SET - 'hash_paste:'+paste hash *
-    SET - 'base64_paste:'+paste hash *
-    SET - 'binary_paste:'+paste hash *
     ZADD - 'hash_date:'+20180622 hash * nb_seen_this_day
     ZADD - 'base64_date:'+20180622 hash * nb_seen_this_day
     ZADD - 'binary_date:'+20180622 hash * nb_seen_this_day
-    ZADD - 'nb_seen_hash:'+hash paste * nb_seen_in_paste
-    ZADD - 'base64_hash:'+hash paste * nb_seen_in_paste
-    ZADD - 'binary_hash:'+hash paste * nb_seen_in_paste
     ZADD - 'base64_type:'+type date nb_seen
     ZADD - 'binary_type:'+type date nb_seen


@@ -40,7 +40,7 @@ def search_api_key(message):
         print('found google api key')
         print(to_print)
         publisher.warning('{}Checked {} found Google API Key;{}'.format(
-            to_print, len(google_api_key), paste.p_path))
+            to_print, len(google_api_key), paste.p_rel_path))
         msg = 'infoleak:automatic-detection="google-api-key";{}'.format(filename)
         p.populate_set_out(msg, 'Tags')

@@ -49,7 +49,7 @@ def search_api_key(message):
         print(to_print)
         total = len(aws_access_key) + len(aws_secret_key)
         publisher.warning('{}Checked {} found AWS Key;{}'.format(
-            to_print, total, paste.p_path))
+            to_print, total, paste.p_rel_path))
         msg = 'infoleak:automatic-detection="aws-key";{}'.format(filename)
         p.populate_set_out(msg, 'Tags')

@@ -57,8 +57,6 @@ def search_api_key(message):
         msg = 'infoleak:automatic-detection="api-key";{}'.format(filename)
         p.populate_set_out(msg, 'Tags')
-        msg = 'apikey;{}'.format(filename)
-        p.populate_set_out(msg, 'alertHandler')
         #Send to duplicate
         p.populate_set_out(filename, 'Duplicate')


@@ -43,8 +43,8 @@ if __name__ == "__main__":
             # FIXME why not all saving everything there.
             PST.save_all_attributes_redis()
             # FIXME Not used.
-            PST.store.sadd("Pastes_Objects", PST.p_path)
+            PST.store.sadd("Pastes_Objects", PST.p_rel_path)
         except IOError:
-            print("CRC Checksum Failed on :", PST.p_path)
+            print("CRC Checksum Failed on :", PST.p_rel_path)
             publisher.error('Duplicate;{};{};{};CRC Checksum Failed'.format(
                 PST.p_source, PST.p_date, PST.p_name))


@@ -67,7 +67,7 @@ def check_all_iban(l_iban, paste, filename):
     if(nb_valid_iban > 0):
         to_print = 'Iban;{};{};{};'.format(paste.p_source, paste.p_date, paste.p_name)
         publisher.warning('{}Checked found {} IBAN;{}'.format(
-            to_print, nb_valid_iban, paste.p_path))
+            to_print, nb_valid_iban, paste.p_rel_path))
         msg = 'infoleak:automatic-detection="iban";{}'.format(filename)
         p.populate_set_out(msg, 'Tags')

@@ -113,7 +113,7 @@ if __name__ == "__main__":
             try:
                 l_iban = iban_regex.findall(content)
             except TimeoutException:
-                print ("{0} processing timeout".format(paste.p_path))
+                print ("{0} processing timeout".format(paste.p_rel_path))
                 continue
             else:
                 signal.alarm(0)


@@ -62,8 +62,6 @@ def search_key(content, message, paste):
         to_print = 'Bitcoin found: {} address and {} private Keys'.format(len(bitcoin_address), len(bitcoin_private_key))
         print(to_print)
         publisher.warning(to_print)
-        msg = ('bitcoin;{}'.format(message))
-        p.populate_set_out( msg, 'alertHandler')
         msg = 'infoleak:automatic-detection="bitcoin-address";{}'.format(message)
         p.populate_set_out(msg, 'Tags')

@@ -75,7 +73,7 @@ def search_key(content, message, paste):
         to_print = 'Bitcoin;{};{};{};'.format(paste.p_source, paste.p_date,
                                               paste.p_name)
         publisher.warning('{}Detected {} Bitcoin private key;{}'.format(
-            to_print, len(bitcoin_private_key),paste.p_path))
+            to_print, len(bitcoin_private_key),paste.p_rel_path))

 if __name__ == "__main__":
     publisher.port = 6380


@@ -89,16 +89,10 @@ if __name__ == "__main__":
             paste = Paste.Paste(filename)
             content = paste.get_p_content()
-
-            #print('-----------------------------------------------------')
-            #print(filename)
-            #print(content)
-            #print('-----------------------------------------------------')
             for categ, pattern in tmp_dict.items():
                 found = set(re.findall(pattern, content))
                 if len(found) >= matchingThreshold:
-                    msg = '{} {}'.format(paste.p_path, len(found))
-                    #msg = " ".join( [paste.p_path, bytes(len(found))] )
+                    msg = '{} {}'.format(paste.p_rel_path, len(found))
                     print(msg, categ)
                     p.populate_set_out(msg, categ)

@@ -106,4 +100,4 @@ if __name__ == "__main__":
                     publisher.info(
                         'Categ;{};{};{};Detected {} as {};{}'.format(
                             paste.p_source, paste.p_date, paste.p_name,
-                            len(found), categ, paste.p_path))
+                            len(found), categ, paste.p_rel_path))


@@ -4,6 +4,8 @@
 import os
 import sys
 import re
+import uuid
+import json
 import redis
 import datetime
 import time

@@ -16,22 +18,179 @@ sys.path.append(os.environ['AIL_BIN'])
 from Helper import Process
 from pubsublogger import publisher
-def on_error_send_message_back_in_queue(type_hidden_service, domain, message):
-    # send this msg back in the queue
-    if not r_onion.sismember('{}_domain_crawler_queue'.format(type_hidden_service), domain):
-        r_onion.sadd('{}_domain_crawler_queue'.format(type_hidden_service), domain)
-    r_onion.sadd('{}_crawler_priority_queue'.format(type_hidden_service), message)
-
-def crawl_onion(url, domain, date, date_month, message):
+# ======== FUNCTIONS ========
+
+def load_blacklist(service_type):
+    try:
+        with open(os.environ['AIL_BIN']+'/torcrawler/blacklist_{}.txt'.format(service_type), 'r') as f:
+            redis_crawler.delete('blacklist_{}'.format(service_type))
+            lines = f.read().splitlines()
+            for line in lines:
+                redis_crawler.sadd('blacklist_{}'.format(service_type), line)
+    except Exception:
+        pass
+
+def update_auto_crawler():
+    current_epoch = int(time.time())
+    list_to_crawl = redis_crawler.zrangebyscore('crawler_auto_queue', '-inf', current_epoch)
+    for elem_to_crawl in list_to_crawl:
+        mess, type = elem_to_crawl.rsplit(';', 1)
+        redis_crawler.sadd('{}_crawler_priority_queue'.format(type), mess)
+        redis_crawler.zrem('crawler_auto_queue', elem_to_crawl)
+
+# Extract info from url (url, domain, domain url, ...)
+def unpack_url(url):
+    to_crawl = {}
+    faup.decode(url)
+    url_unpack = faup.get()
+    to_crawl['domain'] = url_unpack['domain'].decode()
+
+    if url_unpack['scheme'] is None:
+        to_crawl['scheme'] = 'http'
+        url = 'http://{}'.format(url_unpack['url'].decode())
+    else:
+        scheme = url_unpack['scheme'].decode()
+        if scheme in default_proto_map:
+            to_crawl['scheme'] = scheme
+            url = url_unpack['url'].decode()
+        else:
+            redis_crawler.sadd('new_proto', '{} {}'.format(scheme, url_unpack['url'].decode()))
+            to_crawl['scheme'] = 'http'
+            url = 'http://{}'.format(url_unpack['url'].decode().replace(scheme, '', 1))
+
+    if url_unpack['port'] is None:
+        to_crawl['port'] = default_proto_map[to_crawl['scheme']]
+    else:
+        port = url_unpack['port'].decode()
+        # Verify port number    #################### make function to verify/correct port number
+        try:
+            int(port)
+        # Invalid port Number
+        except Exception as e:
+            port = default_proto_map[to_crawl['scheme']]
+        to_crawl['port'] = port
+
+    #if url_unpack['query_string'] is None:
+    #    if to_crawl['port'] == 80:
+    #        to_crawl['url'] = '{}://{}'.format(to_crawl['scheme'], url_unpack['host'].decode())
+    #    else:
+    #        to_crawl['url'] = '{}://{}:{}'.format(to_crawl['scheme'], url_unpack['host'].decode(), to_crawl['port'])
+    #else:
+    #    to_crawl['url'] = '{}://{}:{}{}'.format(to_crawl['scheme'], url_unpack['host'].decode(), to_crawl['port'], url_unpack['query_string'].decode())
+
+    to_crawl['url'] = url
+    if to_crawl['port'] == 80:
+        to_crawl['domain_url'] = '{}://{}'.format(to_crawl['scheme'], url_unpack['host'].decode())
+    else:
+        to_crawl['domain_url'] = '{}://{}:{}'.format(to_crawl['scheme'], url_unpack['host'].decode(), to_crawl['port'])
+
+    to_crawl['tld'] = url_unpack['tld'].decode()
+    return to_crawl
+
+# get url, paste and service_type to crawl
+def get_elem_to_crawl(rotation_mode):
+    message = None
+    domain_service_type = None
+
+    #load_priority_queue
+    for service_type in rotation_mode:
+        message = redis_crawler.spop('{}_crawler_priority_queue'.format(service_type))
+        if message is not None:
+            domain_service_type = service_type
+            break
+    #load_normal_queue
+    if message is None:
+        for service_type in rotation_mode:
+            message = redis_crawler.spop('{}_crawler_queue'.format(service_type))
+            if message is not None:
+                domain_service_type = service_type
+                break
+
+    if message:
+        splitted = message.rsplit(';', 1)
+        if len(splitted) == 2:
+            url, paste = splitted
+            if paste:
+                paste = paste.replace(PASTES_FOLDER+'/', '')
+            message = {'url': url, 'paste': paste, 'type_service': domain_service_type, 'original_message': message}
+    return message
+
+def get_crawler_config(redis_server, mode, service_type, domain, url=None):
+    crawler_options = {}
+    if mode == 'auto':
+        config = redis_server.get('crawler_config:{}:{}:{}:{}'.format(mode, service_type, domain, url))
+    else:
+        config = redis_server.get('crawler_config:{}:{}:{}'.format(mode, service_type, domain))
+    if config is None:
+        config = {}
+    else:
+        config = json.loads(config)
+
+    for option in default_crawler_config:
+        if option in config:
+            crawler_options[option] = config[option]
+        else:
+            crawler_options[option] = default_crawler_config[option]
+    if mode == 'auto':
+        crawler_options['time'] = int(config['time'])
+    elif mode == 'manual':
+        redis_server.delete('crawler_config:{}:{}:{}'.format(mode, service_type, domain))
+    return crawler_options
+
+def load_crawler_config(service_type, domain, paste, url, date):
+    crawler_config = {}
+    crawler_config['splash_url'] = splash_url
+    crawler_config['item'] = paste
+    crawler_config['service_type'] = service_type
+    crawler_config['domain'] = domain
+    crawler_config['date'] = date
+
+    # Auto and Manual Crawling
+    # Auto ################################################# create new entry, next crawling => here or when ended ?
+    if paste == 'auto':
+        crawler_config['crawler_options'] = get_crawler_config(redis_crawler, 'auto', service_type, domain, url=url)
+        crawler_config['requested'] = True
+    # Manual
+    elif paste == 'manual':
+        crawler_config['crawler_options'] = get_crawler_config(r_cache, 'manual', service_type, domain)
+        crawler_config['requested'] = True
+    # default crawler
+    else:
+        crawler_config['crawler_options'] = get_crawler_config(redis_crawler, 'default', service_type, domain)
+        crawler_config['requested'] = False
+    return crawler_config
+
+def is_domain_up_day(domain, type_service, date_day):
+    if redis_crawler.sismember('{}_up:{}'.format(type_service, date_day), domain):
+        return True
+    else:
+        return False
+
+def set_crawled_domain_metadata(type_service, date, domain, father_item):
+    # first seen
+    if not redis_crawler.hexists('{}_metadata:{}'.format(type_service, domain), 'first_seen'):
+        redis_crawler.hset('{}_metadata:{}'.format(type_service, domain), 'first_seen', date['date_day'])
+
+    redis_crawler.hset('{}_metadata:{}'.format(type_service, domain), 'paste_parent', father_item)
+    # last check
+    redis_crawler.hset('{}_metadata:{}'.format(type_service, domain), 'last_check', date['date_day'])
+
+# Put message back on queue
+def on_error_send_message_back_in_queue(type_service, domain, message):
+    if not redis_crawler.sismember('{}_domain_crawler_queue'.format(type_service), domain):
+        redis_crawler.sadd('{}_domain_crawler_queue'.format(type_service), domain)
+    redis_crawler.sadd('{}_crawler_priority_queue'.format(type_service), message)
+
+def crawl_onion(url, domain, port, type_service, message, crawler_config):
+    crawler_config['url'] = url
+    crawler_config['port'] = port
+    print('Launching Crawler: {}'.format(url))
+
     r_cache.hset('metadata_crawler:{}'.format(splash_port), 'crawling_domain', domain)
     r_cache.hset('metadata_crawler:{}'.format(splash_port), 'started_time', datetime.datetime.now().strftime("%Y/%m/%d - %H:%M.%S"))

-    #if not r_onion.sismember('full_onion_up', domain) and not r_onion.sismember('onion_down:'+date , domain):
-    super_father = r_serv_metadata.hget('paste_metadata:'+paste, 'super_father')
-    if super_father is None:
-        super_father=paste
-
     retry = True
     nb_retry = 0
     while retry:
@@ -43,7 +202,7 @@ def crawl_onion(url, domain, date, date_month, message):
             nb_retry += 1

             if nb_retry == 6:
-                on_error_send_message_back_in_queue(type_hidden_service, domain, message)
+                on_error_send_message_back_in_queue(type_service, domain, message)
                 publisher.error('{} SPASH DOWN'.format(splash_url))
                 print('--------------------------------------')
                 print('       \033[91m DOCKER SPLASH DOWN\033[0m')
@@ -57,7 +216,11 @@ def crawl_onion(url, domain, date, date_month, message):
         if r.status_code == 200:
             r_cache.hset('metadata_crawler:{}'.format(splash_port), 'status', 'Crawling')
-            process = subprocess.Popen(["python", './torcrawler/tor_crawler.py', splash_url, type_hidden_service, url, domain, paste, super_father],
+            # save config in cache
+            UUID = str(uuid.uuid4())
+            r_cache.set('crawler_request:{}'.format(UUID), json.dumps(crawler_config))
+
+            process = subprocess.Popen(["python", './torcrawler/tor_crawler.py', UUID],
                                        stdout=subprocess.PIPE)
             while process.poll() is None:
                 time.sleep(1)
@@ -67,7 +230,7 @@ def crawl_onion(url, domain, date, date_month, message):
                 print(output)
                 # error: splash:Connection to proxy refused
                 if 'Connection to proxy refused' in output:
-                    on_error_send_message_back_in_queue(type_hidden_service, domain, message)
+                    on_error_send_message_back_in_queue(type_service, domain, message)
                     publisher.error('{} SPASH, PROXY DOWN OR BAD CONFIGURATION'.format(splash_url))
                     print('------------------------------------------------------------------------')
                     print('         \033[91m SPLASH: Connection to proxy refused')
@@ -80,56 +243,53 @@ def crawl_onion(url, domain, date, date_month, message):
                 print(process.stdout.read())
                 exit(-1)
     else:
-        on_error_send_message_back_in_queue(type_hidden_service, domain, message)
+        on_error_send_message_back_in_queue(type_service, domain, message)
         print('--------------------------------------')
         print('       \033[91m DOCKER SPLASH DOWN\033[0m')
         print(' {} DOWN'.format(splash_url))
         r_cache.hset('metadata_crawler:{}'.format(splash_port), 'status', 'Crawling')
         exit(1)
+# check external links (full_crawl)
+def search_potential_source_domain(type_service, domain):
+    external_domains = set()
+    for link in redis_crawler.smembers('domain_{}_external_links:{}'.format(type_service, domain)):
+        # unpack url
+        url_data = unpack_url(link)
+        if url_data['domain'] != domain:
+            if url_data['tld'] == 'onion' or url_data['tld'] == 'i2p':
+                external_domains.add(url_data['domain'])
+    # # TODO: add special tag ?
+    if len(external_domains) >= 20:
+        redis_crawler.sadd('{}_potential_source'.format(type_service), domain)
+        print('New potential source found: domain')
+    redis_crawler.delete('domain_{}_external_links:{}'.format(type_service, domain))
 if __name__ == '__main__':

-    if len(sys.argv) != 3:
-        print('usage:', 'Crawler.py', 'type_hidden_service (onion or i2p or regular)', 'splash_port')
+    if len(sys.argv) != 2:
+        print('usage:', 'Crawler.py', 'splash_port')
         exit(1)
+##################################################
+    #mode = sys.argv[1]
+    splash_port = sys.argv[1]

-    type_hidden_service = sys.argv[1]
-    splash_port = sys.argv[2]
+    rotation_mode = ['onion', 'regular']
+    default_proto_map = {'http': 80, 'https': 443}
+####################################################### add ftp ???

     publisher.port = 6380
     publisher.channel = "Script"
     publisher.info("Script Crawler started")

     config_section = 'Crawler'

     # Setup the I/O queues
     p = Process(config_section)

-    url_onion = "((http|https|ftp)?(?:\://)?([a-zA-Z0-9\.\-]+(\:[a-zA-Z0-9\.&%\$\-]+)*@)*((25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9])\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9]|0)\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9]|0)\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[0-9])|localhost|([a-zA-Z0-9\-]+\.)*[a-zA-Z0-9\-]+\.onion)(\:[0-9]+)*(/($|[a-zA-Z0-9\.\,\?\'\\\+&%\$#\=~_\-]+))*)"
-    re.compile(url_onion)
-    url_i2p = "((http|https|ftp)?(?:\://)?([a-zA-Z0-9\.\-]+(\:[a-zA-Z0-9\.&%\$\-]+)*@)*((25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9])\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9]|0)\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9]|0)\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[0-9])|localhost|([a-zA-Z0-9\-]+\.)*[a-zA-Z0-9\-]+\.i2p)(\:[0-9]+)*(/($|[a-zA-Z0-9\.\,\?\'\\\+&%\$#\=~_\-]+))*)"
-    re.compile(url_i2p)
-
-    if type_hidden_service == 'onion':
-        regex_hidden_service = url_onion
-        splash_url = '{}:{}'.format( p.config.get("Crawler", "splash_url_onion"), splash_port)
-    elif type_hidden_service == 'i2p':
-        regex_hidden_service = url_i2p
-        splash_url = '{}:{}'.format( p.config.get("Crawler", "splash_url_i2p"), splash_port)
-    elif type_hidden_service == 'regular':
-        regex_hidden_service = url_i2p
-        splash_url = '{}:{}'.format( p.config.get("Crawler", "splash_url_onion"), splash_port)
-    else:
-        print('incorrect crawler type: {}'.format(type_hidden_service))
-        exit(0)
+    splash_url = '{}:{}'.format( p.config.get("Crawler", "splash_url_onion"), splash_port)

     print('splash url: {}'.format(splash_url))

-    crawler_depth_limit = p.config.getint("Crawler", "crawler_depth_limit")
-
-    faup = Faup()
-
     PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], p.config.get("Directories", "pastes"))
     r_serv_metadata = redis.StrictRedis(

@@ -144,137 +304,122 @@ if __name__ == '__main__':
         db=p.config.getint("Redis_Cache", "db"),
         decode_responses=True)

-    r_onion = redis.StrictRedis(
+    redis_crawler = redis.StrictRedis(
         host=p.config.get("ARDB_Onion", "host"),
         port=p.config.getint("ARDB_Onion", "port"),
         db=p.config.getint("ARDB_Onion", "db"),
         decode_responses=True)
-    r_cache.sadd('all_crawler:{}'.format(type_hidden_service), splash_port)
+    faup = Faup()
+
+    # Default crawler options
+    default_crawler_config = {'html': 1,
+                              'har': 1,
+                              'png': 1,
+                              'depth_limit': p.config.getint("Crawler", "crawler_depth_limit"),
+                              'closespider_pagecount': 50,
+                              'user_agent': 'Mozilla/5.0 (Windows NT 6.1; rv:24.0) Gecko/20100101 Firefox/24.0'}
+
+    # Track launched crawler
+    r_cache.sadd('all_crawler', splash_port)
     r_cache.hset('metadata_crawler:{}'.format(splash_port), 'status', 'Waiting')
     r_cache.hset('metadata_crawler:{}'.format(splash_port), 'started_time', datetime.datetime.now().strftime("%Y/%m/%d - %H:%M.%S"))

-    # load domains blacklist
-    try:
-        with open(os.environ['AIL_BIN']+'/torcrawler/blacklist_onion.txt', 'r') as f:
-            r_onion.delete('blacklist_{}'.format(type_hidden_service))
-            lines = f.read().splitlines()
-            for line in lines:
-                r_onion.sadd('blacklist_{}'.format(type_hidden_service), line)
-    except Exception:
-        pass
+    # update hardcoded blacklist
+    load_blacklist('onion')
+    load_blacklist('regular')
     while True:
-        # Priority Queue - Recovering the streamed message informations.
-        message = r_onion.spop('{}_crawler_priority_queue'.format(type_hidden_service))
-
-        if message is None:
-            # Recovering the streamed message informations.
-            message = r_onion.spop('{}_crawler_queue'.format(type_hidden_service))
-
-        if message is not None:
-
-            splitted = message.split(';')
-            if len(splitted) == 2:
-                url, paste = splitted
-                paste = paste.replace(PASTES_FOLDER+'/', '')
-
-                url_list = re.findall(regex_hidden_service, url)[0]
-                if url_list[1] == '':
-                    url = 'http://{}'.format(url)
-
-                link, s, credential, subdomain, domain, host, port, \
-                    resource_path, query_string, f1, f2, f3, f4 = url_list
-                domain = url_list[4]
-                r_onion.srem('{}_domain_crawler_queue'.format(type_hidden_service), domain)
-
-                domain_url = 'http://{}'.format(domain)
-
-                print()
-                print()
-                print('\033[92m------------------START CRAWLER------------------\033[0m')
-                print('crawler type:     {}'.format(type_hidden_service))
-                print('\033[92m-------------------------------------------------\033[0m')
-                print('url:         {}'.format(url))
-                print('domain:      {}'.format(domain))
-                print('domain_url:  {}'.format(domain_url))
-
-                faup.decode(domain)
-                onion_domain = faup.get()['domain'].decode()
-
-                if not r_onion.sismember('blacklist_{}'.format(type_hidden_service), domain) and not r_onion.sismember('blacklist_{}'.format(type_hidden_service), onion_domain):
-
-                    date = datetime.datetime.now().strftime("%Y%m%d")
-                    date_month = datetime.datetime.now().strftime("%Y%m")
-
-                    if not r_onion.sismember('month_{}_up:{}'.format(type_hidden_service, date_month), domain) and not r_onion.sismember('{}_down:{}'.format(type_hidden_service, date), domain):
-
-                        # first seen
-                        if not r_onion.hexists('{}_metadata:{}'.format(type_hidden_service, domain), 'first_seen'):
-                            r_onion.hset('{}_metadata:{}'.format(type_hidden_service, domain), 'first_seen', date)
-
-                        # last_father
-                        r_onion.hset('{}_metadata:{}'.format(type_hidden_service, domain), 'paste_parent', paste)
-
-                        # last check
-                        r_onion.hset('{}_metadata:{}'.format(type_hidden_service, domain), 'last_check', date)
-
-                        crawl_onion(url, domain, date, date_month, message)
-                        if url != domain_url:
-                            print(url)
-                            print(domain_url)
-                            crawl_onion(domain_url, domain, date, date_month, message)
-
-                        # save down onion
-                        if not r_onion.sismember('{}_up:{}'.format(type_hidden_service, date), domain):
-                            r_onion.sadd('{}_down:{}'.format(type_hidden_service, date), domain)
-                            #r_onion.sadd('{}_down_link:{}'.format(type_hidden_service, date), url)
-                            #r_onion.hincrby('{}_link_down'.format(type_hidden_service), url, 1)
-                        else:
-                            #r_onion.hincrby('{}_link_up'.format(type_hidden_service), url, 1)
-                            if r_onion.sismember('month_{}_up:{}'.format(type_hidden_service, date_month), domain) and r_serv_metadata.exists('paste_children:'+paste):
-                                msg = 'infoleak:automatic-detection="{}";{}'.format(type_hidden_service, paste)
-                                p.populate_set_out(msg, 'Tags')
-
-                        # add onion screenshot history
-                        # add crawled days
-                        if r_onion.lindex('{}_history:{}'.format(type_hidden_service, domain), 0) != date:
-                            r_onion.lpush('{}_history:{}'.format(type_hidden_service, domain), date)
-                        # add crawled history by date
-                        r_onion.lpush('{}_history:{}:{}'.format(type_hidden_service, domain, date), paste) #add datetime here
-
-                        # check external onions links (full_scrawl)
-                        external_domains = set()
-                        for link in r_onion.smembers('domain_{}_external_links:{}'.format(type_hidden_service, domain)):
+        update_auto_crawler()
+
+        to_crawl = get_elem_to_crawl(rotation_mode)
+        if to_crawl:
+            url_data = unpack_url(to_crawl['url'])
+            # remove domain from queue
+            redis_crawler.srem('{}_domain_crawler_queue'.format(to_crawl['type_service']), url_data['domain'])
+
+            print()
+            print()
+            print('\033[92m------------------START CRAWLER------------------\033[0m')
+            print('crawler type:     {}'.format(to_crawl['type_service']))
+            print('\033[92m-------------------------------------------------\033[0m')
+            print('url:         {}'.format(url_data['url']))
+            print('domain:      {}'.format(url_data['domain']))
+            print('domain_url:  {}'.format(url_data['domain_url']))
+            print()
+
+            # Check blacklist
+            if not redis_crawler.sismember('blacklist_{}'.format(to_crawl['type_service']), url_data['domain']):
+                date = {'date_day': datetime.datetime.now().strftime("%Y%m%d"),
+                        'date_month': datetime.datetime.now().strftime("%Y%m"),
+                        'epoch': int(time.time())}
+
+                # Update crawler status type
+                r_cache.sadd('{}_crawlers'.format(to_crawl['type_service']), splash_port)
+
+                crawler_config = load_crawler_config(to_crawl['type_service'], url_data['domain'], to_crawl['paste'], to_crawl['url'], date)
+                # check if default crawler
+                if not crawler_config['requested']:
+                    # Auto crawl only if service not up this month
+                    if redis_crawler.sismember('month_{}_up:{}'.format(to_crawl['type_service'], date['date_month']), url_data['domain']):
+                        continue
+
+                set_crawled_domain_metadata(to_crawl['type_service'], date, url_data['domain'], to_crawl['paste'])
+
+                #### CRAWLER ####
+                # Manual and Auto Crawler
+                if crawler_config['requested']:
external_domain = re.findall(url_onion, link)
external_domain.extend(re.findall(url_i2p, link))
if len(external_domain) > 0:
external_domain = external_domain[0][4]
else:
continue
if '.onion' in external_domain and external_domain != domain:
external_domains.add(external_domain)
elif '.i2p' in external_domain and external_domain != domain:
external_domains.add(external_domain)
if len(external_domains) >= 10:
r_onion.sadd('{}_potential_source'.format(type_hidden_service), domain)
r_onion.delete('domain_{}_external_links:{}'.format(type_hidden_service, domain))
print(r_onion.smembers('domain_{}_external_links:{}'.format(type_hidden_service, domain)))
# update list, last crawled onions ######################################################crawler strategy
r_onion.lpush('last_{}'.format(type_hidden_service), domain) # CRAWL domain
r_onion.ltrim('last_{}'.format(type_hidden_service), 0, 15) crawl_onion(url_data['url'], url_data['domain'], url_data['port'], to_crawl['type_service'], to_crawl['original_message'], crawler_config)
#update crawler status # Default Crawler
r_cache.hset('metadata_crawler:{}'.format(splash_port), 'status', 'Waiting')
r_cache.hdel('metadata_crawler:{}'.format(splash_port), 'crawling_domain')
else: else:
print(' Blacklisted Onion') # CRAWL domain
print() crawl_onion(url_data['domain_url'], url_data['domain'], url_data['port'], to_crawl['type_service'], to_crawl['original_message'], crawler_config)
print() #if url != domain_url and not is_domain_up_day(url_data['domain'], to_crawl['type_service'], date['date_day']):
# crawl_onion(url_data['url'], url_data['domain'], to_crawl['original_message'])
# Save last_status day (DOWN)
if not is_domain_up_day(url_data['domain'], to_crawl['type_service'], date['date_day']):
redis_crawler.sadd('{}_down:{}'.format(to_crawl['type_service'], date['date_day']), url_data['domain'])
# if domain was UP at least one time
if redis_crawler.exists('crawler_history_{}:{}:{}'.format(to_crawl['type_service'], url_data['domain'], url_data['port'])):
# add crawler history (if domain is down)
if not redis_crawler.zrangebyscore('crawler_history_{}:{}:{}'.format(to_crawl['type_service'], url_data['domain'], url_data['port']), date['epoch'], date['epoch']):
# Domain is down
redis_crawler.zadd('crawler_history_{}:{}:{}'.format(to_crawl['type_service'], url_data['domain'], url_data['port']), int(date['epoch']), int(date['epoch']))
############################
# extract page content
############################
# update list, last crawled domains
redis_crawler.lpush('last_{}'.format(to_crawl['type_service']), '{}:{};{}'.format(url_data['domain'], url_data['port'], date['epoch']))
redis_crawler.ltrim('last_{}'.format(to_crawl['type_service']), 0, 15)
#update crawler status
r_cache.hset('metadata_crawler:{}'.format(splash_port), 'status', 'Waiting')
r_cache.hdel('metadata_crawler:{}'.format(splash_port), 'crawling_domain')
# Update crawler status type
r_cache.srem('{}_crawlers'.format(to_crawl['type_service']), splash_port)
# add next auto Crawling in queue:
if to_crawl['paste'] == 'auto':
redis_crawler.zadd('crawler_auto_queue', int(time.time()+crawler_config['crawler_options']['time']) , '{};{}'.format(to_crawl['original_message'], to_crawl['type_service']))
# update list, last auto crawled domains
redis_crawler.lpush('last_auto_crawled', '{}:{};{}'.format(url_data['domain'], url_data['port'], date['epoch']))
redis_crawler.ltrim('last_auto_crawled', 0, 9)
else: else:
continue print(' Blacklisted Domain')
print()
print()
else: else:
time.sleep(1) time.sleep(1)
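The new crawler records each crawl attempt in a per-domain sorted set keyed by epoch (`crawler_history_<service>:<domain>:<port>`), and only marks a domain down for a day on which it was never seen up. A minimal sketch of that bookkeeping using plain dicts instead of Redis — the class and method names here are illustrative, not part of AIL:

```python
import time

class CrawlHistory:
    """Toy model of the 'crawler_history_<service>:<domain>:<port>' sorted set
    and the '<service>_down:<day>' set used by the new crawler."""
    def __init__(self):
        self.history = {}   # (service, domain, port) -> set of crawl epochs
        self.down = {}      # (service, day) -> set of domains seen down that day

    def record_up(self, service, domain, port, epoch):
        # domain answered: always append to its crawl history
        self.history.setdefault((service, domain, port), set()).add(epoch)

    def record_down(self, service, domain, port, day, epoch):
        # Save last_status day (DOWN)
        self.down.setdefault((service, day), set()).add(domain)
        key = (service, domain, port)
        # only keep a history entry if the domain was UP at least one time
        if key in self.history and epoch not in self.history[key]:
            self.history[key].add(epoch)

h = CrawlHistory()
h.record_up('onion', 'example.onion', 80, 1556175000)
h.record_down('onion', 'example.onion', 80, '20190425', 1556175600)
print(sorted(h.history[('onion', 'example.onion', 80)]))  # [1556175000, 1556175600]
```

The real code gets the same "only if seen up before" behaviour from `redis_crawler.exists(...)` before the `zadd`, with the epoch doubling as both score and member so `zrangebyscore` can check a single instant.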
View file
@@ -97,7 +97,7 @@ if __name__ == "__main__":
             if sites_set:
                 message += ' Related websites: {}'.format( (', '.join(sites_set)) )
-            to_print = 'Credential;{};{};{};{};{}'.format(paste.p_source, paste.p_date, paste.p_name, message, paste.p_path)
+            to_print = 'Credential;{};{};{};{};{}'.format(paste.p_source, paste.p_date, paste.p_name, message, paste.p_rel_path)
             print('\n '.join(creds))
@@ -107,9 +107,6 @@ if __name__ == "__main__":
                 publisher.warning(to_print)
                 #Send to duplicate
                 p.populate_set_out(filepath, 'Duplicate')
-                #Send to alertHandler
-                msg = 'credential;{}'.format(filepath)
-                p.populate_set_out(msg, 'alertHandler')
                 msg = 'infoleak:automatic-detection="credential";{}'.format(filepath)
                 p.populate_set_out(msg, 'Tags')
View file
@@ -77,19 +77,16 @@ if __name__ == "__main__":
                 paste.p_source, paste.p_date, paste.p_name)
             if (len(creditcard_set) > 0):
                 publisher.warning('{}Checked {} valid number(s);{}'.format(
-                    to_print, len(creditcard_set), paste.p_path))
+                    to_print, len(creditcard_set), paste.p_rel_path))
                 print('{}Checked {} valid number(s);{}'.format(
-                    to_print, len(creditcard_set), paste.p_path))
+                    to_print, len(creditcard_set), paste.p_rel_path))
                 #Send to duplicate
                 p.populate_set_out(filename, 'Duplicate')
-                #send to Browse_warning_paste
-                msg = 'creditcard;{}'.format(filename)
-                p.populate_set_out(msg, 'alertHandler')
                 msg = 'infoleak:automatic-detection="credit-card";{}'.format(filename)
                 p.populate_set_out(msg, 'Tags')
             else:
-                publisher.info('{}CreditCard related;{}'.format(to_print, paste.p_path))
+                publisher.info('{}CreditCard related;{}'.format(to_print, paste.p_rel_path))
         else:
             publisher.debug("Script creditcard is idling 1m")
             time.sleep(10)
View file
@@ -31,10 +31,6 @@ def search_cve(message):
         print('{} contains CVEs'.format(paste.p_name))
         publisher.warning('{} contains CVEs'.format(paste.p_name))
-        #send to Browse_warning_paste
-        msg = 'cve;{}'.format(filepath)
-        p.populate_set_out(msg, 'alertHandler')
         msg = 'infoleak:automatic-detection="cve";{}'.format(filepath)
         p.populate_set_out(msg, 'Tags')
         #Send to duplicate
View file
@@ -147,9 +147,6 @@ def set_out_paste(decoder_name, message):
     publisher.warning(decoder_name+' decoded')
     #Send to duplicate
     p.populate_set_out(message, 'Duplicate')
-    #send to Browse_warning_paste
-    msg = (decoder_name+';{}'.format(message))
-    p.populate_set_out( msg, 'alertHandler')
     msg = 'infoleak:automatic-detection="'+decoder_name+'";{}'.format(message)
     p.populate_set_out(msg, 'Tags')
@@ -229,7 +226,7 @@ if __name__ == '__main__':
         except TimeoutException:
             encoded_list = []
             p.incr_module_timeout_statistic() # add encoder type
-            print ("{0} processing timeout".format(paste.p_path))
+            print ("{0} processing timeout".format(paste.p_rel_path))
             continue
         else:
             signal.alarm(0)
View file
@@ -54,14 +54,14 @@ def main():
             if localizeddomains:
                 print(localizeddomains)
                 publisher.warning('DomainC;{};{};{};Checked {} located in {};{}'.format(
-                    PST.p_source, PST.p_date, PST.p_name, localizeddomains, cc_tld, PST.p_path))
+                    PST.p_source, PST.p_date, PST.p_name, localizeddomains, cc_tld, PST.p_rel_path))
             localizeddomains = c.localizedomain(cc=cc)
             if localizeddomains:
                 print(localizeddomains)
                 publisher.warning('DomainC;{};{};{};Checked {} located in {};{}'.format(
-                    PST.p_source, PST.p_date, PST.p_name, localizeddomains, cc, PST.p_path))
+                    PST.p_source, PST.p_date, PST.p_name, localizeddomains, cc, PST.p_rel_path))
         except IOError:
-            print("CRC Checksum Failed on :", PST.p_path)
+            print("CRC Checksum Failed on :", PST.p_rel_path)
             publisher.error('Duplicate;{};{};{};CRC Checksum Failed'.format(
                 PST.p_source, PST.p_date, PST.p_name))
View file
@@ -142,17 +142,17 @@ if __name__ == "__main__":
                         paste_date = paste_date
                         paste_date = paste_date if paste_date != None else "No date available"
                         if paste_path != None:
-                            if paste_path != PST.p_path:
+                            if paste_path != PST.p_rel_path:
                                 hash_dico[dico_hash] = (hash_type, paste_path, percent, paste_date)
-                                print('['+hash_type+'] '+'comparing: ' + str(PST.p_path[44:]) + ' and ' + str(paste_path[44:]) + ' percentage: ' + str(percent))
+                                print('['+hash_type+'] '+'comparing: ' + str(PST.p_rel_path) + ' and ' + str(paste_path) + ' percentage: ' + str(percent))
                     except Exception:
                         print('hash not comparable, bad hash: '+dico_hash+' , current_hash: '+paste_hash)
             # Add paste in DB after checking to prevent its analysis twice
             # hash_type_i -> index_i AND index_i -> PST.PATH
-            r_serv1.set(index, PST.p_path)
+            r_serv1.set(index, PST.p_rel_path)
             r_serv1.set(index+'_date', PST._get_p_date())
             r_serv1.sadd("INDEX", index)
             # Adding hashes in Redis
@@ -180,7 +180,7 @@ if __name__ == "__main__":
                 PST.__setattr__("p_duplicate", dupl)
                 PST.save_attribute_duplicate(dupl)
                 PST.save_others_pastes_attribute_duplicate(dupl)
-                publisher.info('{}Detected {};{}'.format(to_print, len(dupl), PST.p_path))
+                publisher.info('{}Detected {};{}'.format(to_print, len(dupl), PST.p_rel_path))
                 print('{}Detected {}'.format(to_print, len(dupl)))
                 print('')
@@ -191,5 +191,5 @@ if __name__ == "__main__":
         except IOError:
             to_print = 'Duplicate;{};{};{};'.format(
                 PST.p_source, PST.p_date, PST.p_name)
-            print("CRC Checksum Failed on :", PST.p_path)
+            print("CRC Checksum Failed on :", PST.p_rel_path)
             publisher.error('{}CRC Checksum Failed'.format(to_print))
View file
@@ -45,6 +45,9 @@ if __name__ == '__main__':
     p = Process(config_section)
+    PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], p.config.get("Directories", "pastes"))
+    PASTES_FOLDERS = PASTES_FOLDER + '/'
     # LOGGING #
     publisher.info("Feed Script started to receive & publish.")
@@ -78,8 +81,7 @@ if __name__ == '__main__':
                 paste = rreplace(paste, file_name_paste, new_file_name_paste, 1)
             # Creating the full filepath
-            filename = os.path.join(os.environ['AIL_HOME'],
-                                    p.config.get("Directories", "pastes"), paste)
+            filename = os.path.join(PASTES_FOLDER, paste)
             dirname = os.path.dirname(filename)
             if not os.path.exists(dirname):
@@ -102,6 +104,11 @@ if __name__ == '__main__':
             print(filename)
             print(type)
             print('-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------')
            '''
-            p.populate_set_out(filename)
+
+            # remove PASTES_FOLDER from item path (crawled item + submited)
+            if PASTES_FOLDERS in paste:
+                paste = paste.replace(PASTES_FOLDERS, '', 1)
+
+            p.populate_set_out(paste)
             processed_paste+=1
View file
@@ -112,10 +112,7 @@ def search_key(paste):
         #Send to duplicate
         p.populate_set_out(message, 'Duplicate')
-        #send to Browse_warning_paste
-        msg = ('keys;{}'.format(message))
         print(message)
-        p.populate_set_out( msg, 'alertHandler')
 if __name__ == '__main__':
View file
@@ -16,7 +16,6 @@ export AIL_HOME="${DIR}"
 cd ${AIL_HOME}
 if [ -e "${DIR}/AILENV/bin/python" ]; then
-    echo "AIL-framework virtualenv seems to exist, good"
     ENV_PY="${DIR}/AILENV/bin/python"
 else
     echo "Please make sure you have a AIL-framework environment, au revoir"
@@ -75,6 +74,7 @@ function helptext {
     LAUNCH.sh
       [-l | --launchAuto]
       [-k | --killAll]
+      [-u | --update]
      [-c | --configUpdate]
      [-t | --thirdpartyUpdate]
      [-h | --help]
@@ -125,12 +125,7 @@ function launching_queues {
 function checking_configuration {
     bin_dir=${AIL_HOME}/bin
     echo -e "\t* Checking configuration"
-    if [ "$1" == "automatic" ]; then
-        bash -c "${ENV_PY} $bin_dir/Update-conf.py True"
-    else
-        bash -c "${ENV_PY} $bin_dir/Update-conf.py False"
-    fi
+    bash -c "${ENV_PY} $bin_dir/Update-conf.py"
     exitStatus=$?
     if [ $exitStatus -ge 1 ]; then
         echo -e $RED"\t* Configuration not up-to-date"$DEFAULT
@@ -140,7 +135,7 @@ function checking_configuration {
 }
 function launching_scripts {
-    checking_configuration $1;
+    checking_configuration;
     screen -dmS "Script_AIL"
     sleep 0.1
@@ -206,14 +201,14 @@ function launching_scripts {
     sleep 0.1
     screen -S "Script_AIL" -X screen -t "LibInjection" bash -c "cd ${AIL_BIN}; ${ENV_PY} ./LibInjection.py; read x"
     sleep 0.1
-    screen -S "Script_AIL" -X screen -t "alertHandler" bash -c "cd ${AIL_BIN}; ${ENV_PY} ./alertHandler.py; read x"
-    sleep 0.1
     screen -S "Script_AIL" -X screen -t "MISPtheHIVEfeeder" bash -c "cd ${AIL_BIN}; ${ENV_PY} ./MISP_The_Hive_feeder.py; read x"
     sleep 0.1
     screen -S "Script_AIL" -X screen -t "Tags" bash -c "cd ${AIL_BIN}; ${ENV_PY} ./Tags.py; read x"
     sleep 0.1
     screen -S "Script_AIL" -X screen -t "SentimentAnalysis" bash -c "cd ${AIL_BIN}; ${ENV_PY} ./SentimentAnalysis.py; read x"
     sleep 0.1
+    screen -S "Script_AIL" -X screen -t "UpdateBackground" bash -c "cd ${AIL_BIN}; ${ENV_PY} ./update-background.py; read x"
+    sleep 0.1
     screen -S "Script_AIL" -X screen -t "SubmitPaste" bash -c "cd ${AIL_BIN}; ${ENV_PY} ./submit_paste.py; read x"
 }
@@ -221,7 +216,7 @@ function launching_scripts {
 function launching_crawler {
     if [[ ! $iscrawler ]]; then
         CONFIG=$AIL_BIN/packages/config.cfg
-        lport=$(awk '/^\[Crawler\]/{f=1} f==1&&/^splash_onion_port/{print $3;exit}' "${CONFIG}")
+        lport=$(awk '/^\[Crawler\]/{f=1} f==1&&/^splash_port/{print $3;exit}' "${CONFIG}")
         IFS='-' read -ra PORTS <<< "$lport"
         if [ ${#PORTS[@]} -eq 1 ]
@@ -237,7 +232,7 @@ function launching_crawler {
         sleep 0.1
         for ((i=first_port;i<=last_port;i++)); do
-            screen -S "Crawler_AIL" -X screen -t "onion_crawler:$i" bash -c "cd ${AIL_BIN}; ${ENV_PY} ./Crawler.py onion $i; read x"
+            screen -S "Crawler_AIL" -X screen -t "onion_crawler:$i" bash -c "cd ${AIL_BIN}; ${ENV_PY} ./Crawler.py $i; read x"
             sleep 0.1
         done
@@ -266,20 +261,20 @@ function checking_redis {
     redis_dir=${AIL_HOME}/redis/src/
     bash -c $redis_dir'redis-cli -p 6379 PING | grep "PONG" &> /dev/null'
     if [ ! $? == 0 ]; then
         echo -e $RED"\t6379 not ready"$DEFAULT
         flag_redis=1
     fi
     sleep 0.1
     bash -c $redis_dir'redis-cli -p 6380 PING | grep "PONG" &> /dev/null'
     if [ ! $? == 0 ]; then
         echo -e $RED"\t6380 not ready"$DEFAULT
         flag_redis=1
     fi
     sleep 0.1
     bash -c $redis_dir'redis-cli -p 6381 PING | grep "PONG" &> /dev/null'
     if [ ! $? == 0 ]; then
         echo -e $RED"\t6381 not ready"$DEFAULT
         flag_redis=1
     fi
     sleep 0.1
@@ -292,13 +287,37 @@ function checking_ardb {
     sleep 0.2
     bash -c $redis_dir'redis-cli -p 6382 PING | grep "PONG" &> /dev/null'
     if [ ! $? == 0 ]; then
         echo -e $RED"\t6382 ARDB not ready"$DEFAULT
         flag_ardb=1
     fi
     return $flag_ardb;
 }
+function wait_until_redis_is_ready {
+    redis_not_ready=true
+    while $redis_not_ready; do
+        if checking_redis; then
+            redis_not_ready=false;
+        else
+            sleep 1
+        fi
+    done
+    echo -e $YELLOW"\t* Redis Launched"$DEFAULT
+}
+function wait_until_ardb_is_ready {
+    ardb_not_ready=true;
+    while $ardb_not_ready; do
+        if checking_ardb; then
+            ardb_not_ready=false
+        else
+            sleep 3
+        fi
+    done
+    echo -e $YELLOW"\t* ARDB Launched"$DEFAULT
+}
 function launch_redis {
     if [[ ! $isredis ]]; then
         launching_redis;
@@ -335,14 +354,14 @@ function launch_scripts {
     if [[ ! $isscripted ]]; then
         sleep 1
         if checking_ardb && checking_redis; then
-            launching_scripts $1;
+            launching_scripts;
         else
             no_script_launched=true
             while $no_script_launched; do
                 echo -e $YELLOW"\tScript not started, waiting 5 more secondes"$DEFAULT
                 sleep 5
                 if checking_redis && checking_ardb; then
-                    launching_scripts $1;
+                    launching_scripts;
                     no_script_launched=false
                 else
                     echo -e $RED"\tScript not started"$DEFAULT
@@ -380,17 +399,21 @@ function launch_feeder {
 }
 function killall {
-    if [[ $isredis || $isardb || $islogged || $isqueued || $isscripted || $isflasked || $isfeeded ]]; then
-        echo -e $GREEN"Gracefully closing redis servers"$DEFAULT
-        shutting_down_redis;
-        sleep 0.2
-        echo -e $GREEN"Gracefully closing ardb servers"$DEFAULT
-        shutting_down_ardb;
+    if [[ $isredis || $isardb || $islogged || $isqueued || $isscripted || $isflasked || $isfeeded || $iscrawler ]]; then
+        if [[ $isredis ]]; then
+            echo -e $GREEN"Gracefully closing redis servers"$DEFAULT
+            shutting_down_redis;
+            sleep 0.2
+        fi
+        if [[ $isardb ]]; then
+            echo -e $GREEN"Gracefully closing ardb servers"$DEFAULT
+            shutting_down_ardb;
+        fi
         echo -e $GREEN"Killing all"$DEFAULT
-        kill $isredis $isardb $islogged $isqueued $isscripted $isflasked $isfeeded
+        kill $isredis $isardb $islogged $isqueued $isscripted $isflasked $isfeeded $iscrawler
         sleep 0.2
         echo -e $ROSE`screen -ls`$DEFAULT
-        echo -e $GREEN"\t* $isredis $isardb $islogged $isqueued $isscripted killed."$DEFAULT
+        echo -e $GREEN"\t* $isredis $isardb $islogged $isqueued $isscripted $isflasked $isfeeded $iscrawler killed."$DEFAULT
     else
         echo -e $RED"\t* No screen to kill"$DEFAULT
     fi
@@ -400,6 +423,17 @@ function shutdown {
     bash -c "./Shutdown.py"
 }
+function update() {
+    bin_dir=${AIL_HOME}/bin
+    bash -c "python3 $bin_dir/Update.py"
+    exitStatus=$?
+    if [ $exitStatus -ge 1 ]; then
+        echo -e $RED"\t* Update Error"$DEFAULT
+        exit
+    fi
+}
 function update_thirdparty {
     echo -e "\t* Updating thirdparty..."
     bash -c "(cd ${AIL_FLASK}; ./update_thirdparty.sh)"
@@ -413,11 +447,12 @@ function update_thirdparty {
 }
 function launch_all {
+    update;
     launch_redis;
     launch_ardb;
     launch_logs;
     launch_queues;
-    launch_scripts $1;
+    launch_scripts;
     launch_flask;
 }
@@ -426,7 +461,7 @@ function launch_all {
         helptext;
-        options=("Redis" "Ardb" "Logs" "Queues" "Scripts" "Flask" "Killall" "Shutdown" "Update-config" "Update-thirdparty")
+        options=("Redis" "Ardb" "Logs" "Queues" "Scripts" "Flask" "Killall" "Shutdown" "Update" "Update-config" "Update-thirdparty")
         menu() {
             echo "What do you want to Launch?:"
@@ -451,7 +486,7 @@ function launch_all {
             if [[ "${choices[i]}" ]]; then
                 case ${options[i]} in
                     Redis)
-                        launch_redis
+                        launch_redis;
                        ;;
                    Ardb)
                        launch_ardb;
@@ -477,8 +512,11 @@ function launch_all {
                    Shutdown)
                        shutdown;
                        ;;
+                    Update)
+                        update;
+                        ;;
                    Update-config)
-                        checking_configuration "manual";
+                        checking_configuration;
                        ;;
                    Update-thirdparty)
                        update_thirdparty;
@@ -490,23 +528,40 @@ function launch_all {
     exit
 }
+#echo "$@"
 while [ "$1" != "" ]; do
     case $1 in
         -l | --launchAuto )         launch_all "automatic";
                                     ;;
+        -lr | --launchRedis )       launch_redis;
+                                    ;;
+        -la | --launchARDB )        launch_ardb;
+                                    ;;
+        -lrv | --launchRedisVerify ) launch_redis;
+                                    wait_until_redis_is_ready;
+                                    ;;
+        -lav | --launchARDBVerify ) launch_ardb;
+                                    wait_until_ardb_is_ready;
+                                    ;;
         -k | --killAll )            killall;
                                     ;;
+        -u | --update )             update;
+                                    ;;
         -t | --thirdpartyUpdate )   update_thirdparty;
                                     ;;
         -c | --crawler )            launching_crawler;
                                     ;;
         -f | --launchFeeder )       launch_feeder;
                                     ;;
         -h | --help )               helptext;
                                     exit
                                     ;;
+        -kh | --khelp )             helptext;
+                                    ;;
        * )                          helptext
                                     exit 1
    esac
    shift
 done
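The updated `launching_crawler` function reads the `splash_port` value (e.g. `8050` or `8050-8052`) and starts one `Crawler.py` per port in the range. A hedged Python sketch of the same range parsing — `parse_port_range` is an illustrative name, not an AIL function:

```python
def parse_port_range(value: str):
    """Parse a config value like '8050' or '8050-8052' into (first, last),
    mirroring the IFS='-' read -ra PORTS logic in LAUNCH.sh."""
    parts = value.split('-')
    if len(parts) == 1:
        first = last = int(parts[0])
    else:
        first, last = int(parts[0]), int(parts[1])
    return first, last

first_port, last_port = parse_port_range('8050-8052')
# one Splash crawler is launched per port, as the for-loop in launching_crawler does
ports = list(range(first_port, last_port + 1))
print(ports)  # [8050, 8051, 8052]
```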
View file
@@ -47,12 +47,11 @@ def analyse(url, path):
         paste = Paste.Paste(path)
         print("Detected (libinjection) SQL in URL: ")
         print(urllib.request.unquote(url))
-        to_print = 'LibInjection;{};{};{};{};{}'.format(paste.p_source, paste.p_date, paste.p_name, "Detected SQL in URL", paste.p_path)
+        to_print = 'LibInjection;{};{};{};{};{}'.format(paste.p_source, paste.p_date, paste.p_name, "Detected SQL in URL", paste.p_rel_path)
         publisher.warning(to_print)
         #Send to duplicate
         p.populate_set_out(path, 'Duplicate')
-        #send to Browse_warning_paste
-        p.populate_set_out('sqlinjection;{}'.format(path), 'alertHandler')
         msg = 'infoleak:automatic-detection="sql-injection";{}'.format(path)
         p.populate_set_out(msg, 'Tags')
View file
@@ -75,10 +75,11 @@ if __name__ == '__main__':
             PST.save_attribute_redis("p_max_length_line", lines_infos[1])
             # FIXME Not used.
-            PST.store.sadd("Pastes_Objects", PST.p_path)
+            PST.store.sadd("Pastes_Objects", PST.p_rel_path)
+            print(PST.p_rel_path)
             if lines_infos[1] < args.max:
-                p.populate_set_out( PST.p_path , 'LinesShort')
+                p.populate_set_out( PST.p_rel_path , 'LinesShort')
             else:
-                p.populate_set_out( PST.p_path , 'LinesLong')
+                p.populate_set_out( PST.p_rel_path , 'LinesLong')
         except IOError:
-            print("CRC Checksum Error on : ", PST.p_path)
+            print("CRC Checksum Error on : ", PST.p_rel_path)
View file
@@ -78,12 +78,11 @@ if __name__ == "__main__":
                 to_print = 'Mails;{};{};{};Checked {} e-mail(s);{}'.\
                     format(PST.p_source, PST.p_date, PST.p_name,
-                           MX_values[0], PST.p_path)
+                           MX_values[0], PST.p_rel_path)
                 if MX_values[0] > is_critical:
                     publisher.warning(to_print)
                     #Send to duplicate
                     p.populate_set_out(filename, 'Duplicate')
-                    p.populate_set_out('mail;{}'.format(filename), 'alertHandler')
                     msg = 'infoleak:automatic-detection="mail";{}'.format(filename)
                     p.populate_set_out(msg, 'Tags')
View file
@@ -82,6 +82,8 @@ if __name__ == '__main__':
     ttl_key = cfg.getint("Module_Mixer", "ttl_duplicate")
     default_unnamed_feed_name = cfg.get("Module_Mixer", "default_unnamed_feed_name")
+    PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], p.config.get("Directories", "pastes")) + '/'
     # STATS #
     processed_paste = 0
     processed_paste_per_feeder = {}
@@ -104,12 +106,14 @@ if __name__ == '__main__':
             feeder_name.replace(" ","")
             if 'import_dir' in feeder_name:
                 feeder_name = feeder_name.split('/')[1]
+            paste_name = complete_paste
         except ValueError as e:
             feeder_name = default_unnamed_feed_name
             paste_name = complete_paste
+        # remove absolute path
+        paste_name = paste_name.replace(PASTES_FOLDER, '', 1)
         # Processed paste
         processed_paste += 1
         try:
@@ -119,6 +123,7 @@ if __name__ == '__main__':
             processed_paste_per_feeder[feeder_name] = 1
             duplicated_paste_per_feeder[feeder_name] = 0
         relay_message = "{0} {1}".format(paste_name, gzip64encoded)
         #relay_message = b" ".join( [paste_name, gzip64encoded] )
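Both `Global.py` and `Mixer.py` now strip the absolute pastes folder so that downstream modules exchange relative item paths (`p_rel_path`) instead of absolute ones. A minimal sketch of that normalization; the function name and example paths are illustrative:

```python
def to_relative_item_path(paste_name: str, pastes_folder: str) -> str:
    """Strip the absolute PASTES_FOLDER prefix once, as Mixer.py does with
    paste_name.replace(PASTES_FOLDER, '', 1); already-relative names pass through."""
    if not pastes_folder.endswith('/'):
        pastes_folder += '/'
    return paste_name.replace(pastes_folder, '', 1)

print(to_relative_item_path('/opt/ail/PASTES/archive/2019/04/25/item.gz',
                            '/opt/ail/PASTES/'))  # archive/2019/04/25/item.gz
```

Replacing only the first occurrence (`count=1`) matters: a feed name that happens to contain the folder string deeper in the path is left untouched.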
View file
@ -32,6 +32,8 @@ import redis
import signal import signal
import re import re
from pyfaup.faup import Faup
from Helper import Process from Helper import Process
class TimeoutException(Exception): class TimeoutException(Exception):
@ -132,6 +134,8 @@ if __name__ == "__main__":
activate_crawler = False activate_crawler = False
print('Crawler disabled') print('Crawler disabled')
faup = Faup()
# Thanks to Faup project for this regex # Thanks to Faup project for this regex
# https://github.com/stricaud/faup # https://github.com/stricaud/faup
url_regex = "((http|https|ftp)?(?:\://)?([a-zA-Z0-9\.\-]+(\:[a-zA-Z0-9\.&%\$\-]+)*@)*((25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9])\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9]|0)\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9]|0)\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[0-9])|localhost|([a-zA-Z0-9\-]+\.)*[a-zA-Z0-9\-]+\.onion)(\:[0-9]+)*(/($|[a-zA-Z0-9\.\,\?\'\\\+&%\$#\=~_\-]+))*)" url_regex = "((http|https|ftp)?(?:\://)?([a-zA-Z0-9\.\-]+(\:[a-zA-Z0-9\.&%\$\-]+)*@)*((25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9])\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9]|0)\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9]|0)\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[0-9])|localhost|([a-zA-Z0-9\-]+\.)*[a-zA-Z0-9\-]+\.onion)(\:[0-9]+)*(/($|[a-zA-Z0-9\.\,\?\'\\\+&%\$#\=~_\-]+))*)"
@@ -167,7 +171,7 @@ if __name__ == "__main__":
         except TimeoutException:
             encoded_list = []
             p.incr_module_timeout_statistic()
-            print ("{0} processing timeout".format(PST.p_path))
+            print ("{0} processing timeout".format(PST.p_rel_path))
             continue
         signal.alarm(0)
@@ -185,7 +189,7 @@ if __name__ == "__main__":
                     r_onion.sadd('i2p_domain', domain)
                     r_onion.sadd('i2p_link', url)
                     r_onion.sadd('i2p_domain_crawler_queue', domain)
-                    msg = '{};{}'.format(url,PST.p_path)
+                    msg = '{};{}'.format(url,PST.p_rel_path)
                     r_onion.sadd('i2p_crawler_queue', msg)
                     '''
@@ -200,10 +204,10 @@ if __name__ == "__main__":
             if not activate_crawler:
                 publisher.warning('{}Detected {} .onion(s);{}'.format(
-                    to_print, len(domains_list),PST.p_path))
+                    to_print, len(domains_list),PST.p_rel_path))
             else:
                 publisher.info('{}Detected {} .onion(s);{}'.format(
-                    to_print, len(domains_list),PST.p_path))
+                    to_print, len(domains_list),PST.p_rel_path))
             now = datetime.datetime.now()
             path = os.path.join('onions', str(now.year).zfill(4),
                 str(now.month).zfill(2),
@@ -218,37 +222,50 @@ if __name__ == "__main__":
                 date = datetime.datetime.now().strftime("%Y%m%d")
                 for url in urls:
-                    domain = re.findall(url_regex, url)
-                    if len(domain) > 0:
-                        domain = domain[0][4]
+                    faup.decode(url)
+                    url_unpack = faup.get()
+                    domain = url_unpack['domain'].decode()
+                    ## TODO: blackilst by port ?
+                    # check blacklist
+                    if redis_crawler.sismember('blacklist_onion', domain):
+                        continue
+                    subdomain = re.findall(url_regex, url)
+                    if len(subdomain) > 0:
+                        subdomain = subdomain[0][4]
                     else:
                         continue
                     # too many subdomain
-                    if len(domain.split('.')) > 5:
-                        continue
+                    if len(subdomain.split('.')) > 3:
+                        subdomain = '{}.{}.onion'.format(subdomain[-3], subdomain[-2])
-                    if not r_onion.sismember('month_onion_up:{}'.format(date_month), domain) and not r_onion.sismember('onion_down:'+date , domain):
-                        if not r_onion.sismember('onion_domain_crawler_queue', domain):
+                    if not r_onion.sismember('month_onion_up:{}'.format(date_month), subdomain) and not r_onion.sismember('onion_down:'+date , subdomain):
+                        if not r_onion.sismember('onion_domain_crawler_queue', subdomain):
                             print('send to onion crawler')
-                            r_onion.sadd('onion_domain_crawler_queue', domain)
-                            msg = '{};{}'.format(url,PST.p_path)
-                            if not r_onion.hexists('onion_metadata:{}'.format(domain), 'first_seen'):
+                            r_onion.sadd('onion_domain_crawler_queue', subdomain)
+                            msg = '{};{}'.format(url,PST.p_rel_path)
+                            if not r_onion.hexists('onion_metadata:{}'.format(subdomain), 'first_seen'):
                                 r_onion.sadd('onion_crawler_priority_queue', msg)
                                 print('send to priority queue')
                             else:
                                 r_onion.sadd('onion_crawler_queue', msg)
-                            #p.populate_set_out(msg, 'Crawler')
+                            # tag if domain was up
+                            if r_onion.sismember('full_onion_up', subdomain):
+                                # TAG Item
+                                msg = 'infoleak:automatic-detection="onion";{}'.format(PST.p_rel_path)
+                                p.populate_set_out(msg, 'Tags')
                 else:
                     for url in fetch(p, r_cache, urls, domains_list, path):
-                        publisher.info('{}Checked {};{}'.format(to_print, url, PST.p_path))
-                        p.populate_set_out('onion;{}'.format(PST.p_path), 'alertHandler')
-                        msg = 'infoleak:automatic-detection="onion";{}'.format(PST.p_path)
-                        p.populate_set_out(msg, 'Tags')
+                        publisher.info('{}Checked {};{}'.format(to_print, url, PST.p_rel_path))
+                        # TAG Item
+                        msg = 'infoleak:automatic-detection="onion";{}'.format(PST.p_rel_path)
+                        p.populate_set_out(msg, 'Tags')
             else:
-                publisher.info('{}Onion related;{}'.format(to_print, PST.p_path))
+                publisher.info('{}Onion related;{}'.format(to_print, PST.p_rel_path))
             prec_filename = filename
         else:

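The hunk above replaces hand-rolled regex domain extraction with the Faup URL parser and trims over-deep `.onion` subdomains before queueing. As a minimal stand-in sketch (using only the standard library, since `pyfaup` may not be installed; the function name is illustrative, not from the codebase):

```python
from urllib.parse import urlparse

def normalize_onion_host(url):
    # Pull the hostname out of the URL and keep at most the last three
    # labels (sub.domain.onion), which is what the crawler's
    # "too many subdomain" check intends to enforce.
    host = urlparse(url).hostname or ''
    labels = host.split('.')
    if len(labels) > 3:
        labels = labels[-3:]
    return '.'.join(labels)

print(normalize_onion_host('http://a.b.c.example.onion/page'))  # c.example.onion
```

In the actual module, `faup.decode(url)` / `faup.get()` do the parsing and return byte strings, hence the `.decode()` call in the diff.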
View file

@@ -32,14 +32,11 @@ def search_phone(message):
         if len(results) > 4:
             print(results)
             publisher.warning('{} contains PID (phone numbers)'.format(paste.p_name))
-            #send to Browse_warning_paste
-            msg = 'phone;{}'.format(message)
-            p.populate_set_out(msg, 'alertHandler')
-            #Send to duplicate
             msg = 'infoleak:automatic-detection="phone-number";{}'.format(message)
             p.populate_set_out(msg, 'Tags')
+            #Send to duplicate
             p.populate_set_out(message, 'Duplicate')
             stats = {}
             for phone_number in results:

View file

@@ -108,7 +108,7 @@ if __name__ == "__main__":
         try:
             matched = compiled_regex.search(content)
         except TimeoutException:
-            print ("{0} processing timeout".format(paste.p_path))
+            print ("{0} processing timeout".format(paste.p_rel_path))
             continue
         else:
             signal.alarm(0)

View file

@@ -54,7 +54,7 @@ if __name__ == "__main__":
             if len(releases) == 0:
                 continue
-            to_print = 'Release;{};{};{};{} releases;{}'.format(paste.p_source, paste.p_date, paste.p_name, len(releases), paste.p_path)
+            to_print = 'Release;{};{};{};{} releases;{}'.format(paste.p_source, paste.p_date, paste.p_name, len(releases), paste.p_rel_path)
             print(to_print)
             if len(releases) > 30:
                 publisher.warning(to_print)
@@ -63,7 +63,7 @@ if __name__ == "__main__":
         except TimeoutException:
             p.incr_module_timeout_statistic()
-            print ("{0} processing timeout".format(paste.p_path))
+            print ("{0} processing timeout".format(paste.p_rel_path))
             continue
         else:
             signal.alarm(0)

View file

@@ -78,12 +78,10 @@ def analyse(url, path):
     if (result_path > 1) or (result_query > 1):
         print("Detected SQL in URL: ")
         print(urllib.request.unquote(url))
-        to_print = 'SQLInjection;{};{};{};{};{}'.format(paste.p_source, paste.p_date, paste.p_name, "Detected SQL in URL", paste.p_path)
+        to_print = 'SQLInjection;{};{};{};{};{}'.format(paste.p_source, paste.p_date, paste.p_name, "Detected SQL in URL", paste.p_rel_path)
         publisher.warning(to_print)
         #Send to duplicate
         p.populate_set_out(path, 'Duplicate')
-        #send to Browse_warning_paste
-        p.populate_set_out('sqlinjection;{}'.format(path), 'alertHandler')
         msg = 'infoleak:automatic-detection="sql-injection";{}'.format(path)
         p.populate_set_out(msg, 'Tags')
@@ -97,7 +95,7 @@ def analyse(url, path):
     else:
         print("Potential SQL injection:")
         print(urllib.request.unquote(url))
-        to_print = 'SQLInjection;{};{};{};{};{}'.format(paste.p_source, paste.p_date, paste.p_name, "Potential SQL injection", paste.p_path)
+        to_print = 'SQLInjection;{};{};{};{};{}'.format(paste.p_source, paste.p_date, paste.p_name, "Potential SQL injection", paste.p_rel_path)
         publisher.info(to_print)

View file

@@ -45,6 +45,7 @@ cfg = configparser.ConfigParser()
 cfg.read(configfile)
 sentiment_lexicon_file = cfg.get("Directories", "sentiment_lexicon_file")
+#time_clean_sentiment_db = 60*60

 def Analyse(message, server):
     path = message
@@ -157,9 +158,16 @@ if __name__ == '__main__':
         db=p.config.get("ARDB_Sentiment", "db"),
         decode_responses=True)
+    time1 = time.time()
     while True:
         message = p.get_from_set()
         if message is None:
+            #if int(time.time() - time1) > time_clean_sentiment_db:
+            #    clean_db()
+            #    time1 = time.time()
+            #    continue
+            #else:
             publisher.debug("{} queue is empty, waiting".format(config_section))
             time.sleep(1)
             continue

View file

@@ -17,6 +17,19 @@ from pubsublogger import publisher
 from Helper import Process
 from packages import Paste

+def get_item_date(item_filename):
+    l_directory = item_filename.split('/')
+    return '{}{}{}'.format(l_directory[-4], l_directory[-3], l_directory[-2])
+
+def set_tag_metadata(tag, date):
+    # First time we see this tag ## TODO: filter paste from the paste ?
+    if not server.hexists('tag_metadata:{}'.format(tag), 'first_seen'):
+        server.hset('tag_metadata:{}'.format(tag), 'first_seen', date)
+    # Check and Set tag last_seen
+    last_seen = server.hget('tag_metadata:{}'.format(tag), 'last_seen')
+    if last_seen is None or date > last_seen:
+        server.hset('tag_metadata:{}'.format(tag), 'last_seen', date)

 if __name__ == '__main__':
     # Port of the redis instance used by pubsublogger
@@ -42,12 +55,6 @@ if __name__ == '__main__':
         db=p.config.get("ARDB_Metadata", "db"),
         decode_responses=True)
-    serv_statistics = redis.StrictRedis(
-        host=p.config.get('ARDB_Statistics', 'host'),
-        port=p.config.get('ARDB_Statistics', 'port'),
-        db=p.config.get('ARDB_Statistics', 'db'),
-        decode_responses=True)
     # Sent to the logging a description of the module
     publisher.info("Tags module started")
@@ -68,12 +75,15 @@ if __name__ == '__main__':
                 if res == 1:
                     print("new tags added : {}".format(tag))
                 # add the path to the tag set
-                res = server.sadd(tag, path)
+                #curr_date = datetime.date.today().strftime("%Y%m%d")
+                item_date = get_item_date(path)
+                res = server.sadd('{}:{}'.format(tag, item_date), path)
                 if res == 1:
                     print("new paste: {}".format(path))
                     print(" tagged: {}".format(tag))
-                server_metadata.sadd('tag:'+path, tag)
-                curr_date = datetime.date.today()
-                serv_statistics.hincrby(curr_date.strftime("%Y%m%d"),'paste_tagged:'+tag, 1)
+                set_tag_metadata(tag, item_date)
+                server_metadata.sadd('tag:{}'.format(path), tag)
+                server.hincrby('daily_tags:{}'.format(item_date), tag, 1)
                 p.populate_set_out(message, 'MISP_The_Hive_feeder')

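The Tags hunk above switches from one flat set per tag to per-day keys (`<tag>:<YYYYMMDD>`), which is what enables tag search by date range. The date comes straight from the item's path layout. A minimal sketch of that helper, with a hypothetical item path for illustration:

```python
def get_item_date(item_filename):
    # AIL item paths end in .../YYYY/MM/DD/<item>, so the Tags module
    # rebuilds the YYYYMMDD date from the last directory levels.
    l_directory = item_filename.split('/')
    return '{}{}{}'.format(l_directory[-4], l_directory[-3], l_directory[-2])

# Hypothetical item path, for illustration only:
item = 'archive/pastebin.com_pro/2019/04/25/item.gz'
date = get_item_date(item)
print(date)  # 20190425
# The daily tag set key then becomes, e.g.:
print('infoleak:automatic-detection="onion":{}'.format(date))
```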
View file

@@ -57,11 +57,11 @@ if __name__ == "__main__":
         try:
             for word, score in paste._get_top_words().items():
                 if len(word) >= 4:
-                    msg = '{} {} {}'.format(paste.p_path, word, score)
+                    msg = '{} {} {}'.format(paste.p_rel_path, word, score)
                     p.populate_set_out(msg)
         except TimeoutException:
             p.incr_module_timeout_statistic()
-            print ("{0} processing timeout".format(paste.p_path))
+            print ("{0} processing timeout".format(paste.p_rel_path))
             continue
         else:
             signal.alarm(0)

View file

@ -1,123 +1,90 @@
#!/usr/bin/env python3 #!/usr/bin/env python3
# -*-coding:UTF-8 -* # -*-coding:UTF-8 -*
-import configparser
-from configparser import ConfigParser as cfgP
 import os
-from collections import OrderedDict
 import sys
-import shutil
+import argparse
+import configparser
def print_message(message_to_print, verbose):
if verbose:
print(message_to_print)
def update_config(config_file, config_file_sample, config_file_backup=False):
verbose = True
# Check if confile file exist
if not os.path.isfile(config_file):
# create config file
with open(config_file, 'w') as configfile:
with open(config_file_sample, 'r') as config_file_sample:
configfile.write(config_file_sample.read())
print_message('Config File Created', verbose)
else:
config_server = configparser.ConfigParser()
config_server.read(config_file)
config_sections = config_server.sections()
config_sample = configparser.ConfigParser()
config_sample.read(config_file_sample)
sample_sections = config_sample.sections()
mew_content_added = False
for section in sample_sections:
new_key_added = False
if section not in config_sections:
# add new section
config_server.add_section(section)
mew_content_added = True
for key in config_sample[section]:
if key not in config_server[section]:
# add new section key
config_server.set(section, key, config_sample[section][key])
if not new_key_added:
print_message('[{}]'.format(section), verbose)
new_key_added = True
mew_content_added = True
print_message(' {} = {}'.format(key, config_sample[section][key]), verbose)
# new keys have been added to config file
if mew_content_added:
# backup config file
if config_file_backup:
with open(config_file_backup, 'w') as configfile:
with open(config_file, 'r') as configfile_origin:
configfile.write(configfile_origin.read())
print_message('New Backup Created', verbose)
# create new config file
with open(config_file, 'w') as configfile:
config_server.write(configfile)
print_message('Config file updated', verbose)
else:
print_message('Nothing to update', verbose)
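The new `update_config()` above merges any sections and keys found in the sample config but missing locally, without overwriting values the user already set. The core of that merge can be sketched self-contained with `configparser` (the `[Update]` keys here are only illustrative):

```python
import configparser

sample = "[Update]\nauto_update = True\nupstream = upstream\n"
current = "[Update]\nauto_update = False\n"

cfg = configparser.ConfigParser()
cfg.read_string(current)
cfg_sample = configparser.ConfigParser()
cfg_sample.read_string(sample)

# Copy over any section/key present in the sample but missing locally,
# leaving user-set values untouched.
for section in cfg_sample.sections():
    if not cfg.has_section(section):
        cfg.add_section(section)
    for key, value in cfg_sample.items(section):
        if not cfg.has_option(section, key):
            cfg.set(section, key, value)

print(cfg.get('Update', 'auto_update'))  # False  (user value kept)
print(cfg.get('Update', 'upstream'))     # upstream  (added from sample)
```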
 #return true if the configuration is up-to-date
 def main():
-    if len(sys.argv) != 2:
-        print('usage:', 'Update-conf.py', 'Automatic (boolean)')
-        exit(1)
-    else:
-        automatic = sys.argv[1]
-        if automatic == 'True':
-            automatic = True
-        else:
-            automatic = False
-    configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
-    configfileBackup = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg') + '.backup'
-    if not os.path.exists(configfile):
+    #------------------------------------------------------------------------------------#
+    config_file_default = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
+    config_file_default_sample = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg.sample')
+    config_file_default_backup = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg.backup')
+    config_file_update = os.path.join(os.environ['AIL_HOME'], 'configs/update.cfg')
+    config_file_update_sample = os.path.join(os.environ['AIL_HOME'], 'configs/update.cfg.sample')
+    if not os.path.exists(config_file_default_sample):
         raise Exception('Unable to find the configuration file. \
             Did you set environment variables? \
             Or activate the virtualenv.')
configfileSample = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg.sample')
cfg = configparser.ConfigParser()
cfg.read(configfile)
cfgSample = configparser.ConfigParser()
cfgSample.read(configfileSample)
sections = cfgP.sections(cfg)
sectionsSample = cfgP.sections(cfgSample)
missingSection = []
dicoMissingSection = {}
missingItem = []
dicoMissingItem = {}
for sec in sectionsSample:
if sec not in sections:
missingSection += [sec]
dicoMissingSection[sec] = cfgP.items(cfgSample, sec)
else:
setSample = set(cfgP.options(cfgSample, sec))
setNormal = set(cfgP.options(cfg, sec))
if setSample != setNormal:
missing_items = list(setSample.difference(setNormal))
missingItem += [sec]
list_items = []
for i in missing_items:
list_items.append( (i, cfgSample.get(sec, i)) )
dicoMissingItem[sec] = list_items
if len(missingSection) == 0 and len(missingItem) == 0:
#print("Configuration up-to-date")
return True
print("/!\\ Configuration not complete. Missing following configuration: /!\\")
print("+--------------------------------------------------------------------+")
for section in missingSection:
print("["+section+"]")
for item in dicoMissingSection[section]:
print(" - "+item[0])
for section in missingItem:
print("["+section+"]")
for item in dicoMissingItem[section]:
print(" - "+item[0])
print("+--------------------------------------------------------------------+")
if automatic:
resp = 'y'
else:
resp = input("Do you want to auto fix it? [y/n] ")
if resp != 'y':
return False
else:
if automatic:
resp2 = 'y'
else:
resp2 = input("Do you want to keep a backup of the old configuration file? [y/n] ")
if resp2 == 'y':
shutil.move(configfile, configfileBackup)
#Do not keep item ordering in section. New items appened
for section in missingItem:
for item, value in dicoMissingItem[section]:
cfg.set(section, item, value)
#Keep sections ordering while updating the config file
new_dico = add_items_to_correct_position(cfgSample._sections, cfg._sections, missingSection, dicoMissingSection)
cfg._sections = new_dico
with open(configfile, 'w') as f:
cfg.write(f)
return True
+    update_config(config_file_default, config_file_default_sample, config_file_backup=config_file_default_backup)
+    update_config(config_file_update, config_file_update_sample)
+    return True

-''' Return a new dico with the section ordered as the old configuration with the updated one added '''
-def add_items_to_correct_position(sample_dico, old_dico, missingSection, dicoMissingSection):
-    new_dico = OrderedDict()
-    positions = {}
-    for pos_i, sec in enumerate(sample_dico):
-        if sec in missingSection:
-            positions[pos_i] = sec
-    for pos_i, sec in enumerate(old_dico):
-        if pos_i in positions:
-            missSection = positions[pos_i]
-            new_dico[missSection] = sample_dico[missSection]
-        new_dico[sec] = old_dico[sec]
-    return new_dico
if __name__ == "__main__": if __name__ == "__main__":

bin/Update.py (new executable file, 357 lines)
View file

@@ -0,0 +1,357 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
"""
Update AIL
============================
Update AIL clone and fork
"""
import configparser
import os
import sys
import subprocess
def auto_update_enabled(cfg):
auto_update = cfg.get('Update', 'auto_update')
if auto_update == 'True' or auto_update == 'true':
return True
else:
return False
# check if files are modified locally
def check_if_files_modified():
process = subprocess.run(['git', 'ls-files' ,'-m'], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
if process.returncode == 0:
modified_files = process.stdout
if modified_files:
print('Modified Files:')
print('{}{}{}'.format(TERMINAL_BLUE, modified_files.decode(), TERMINAL_DEFAULT))
return False
else:
return True
else:
print('{}{}{}'.format(TERMINAL_RED, process.stderr.decode(), TERMINAL_DEFAULT))
sys.exit(1)
def repo_is_fork():
print('Check if this repository is a fork:')
process = subprocess.run(['git', 'remote', '-v'], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
if process.returncode == 0:
res = process.stdout.decode()
if 'origin {}'.format(AIL_REPO) in res:
print(' This repository is a {}clone of {}{}'.format(TERMINAL_BLUE, AIL_REPO, TERMINAL_DEFAULT))
return False
else:
print(' This repository is a {}fork{}'.format(TERMINAL_BLUE, TERMINAL_DEFAULT))
print()
return True
else:
print('{}{}{}'.format(TERMINAL_RED, process.stderr.decode(), TERMINAL_DEFAULT))
aborting_update()
sys.exit(0)
def is_upstream_created(upstream):
process = subprocess.run(['git', 'remote', '-v'], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
if process.returncode == 0:
output = process.stdout.decode()
if upstream in output:
return True
else:
return False
else:
print('{}{}{}'.format(TERMINAL_RED, process.stderr.decode(), TERMINAL_DEFAULT))
aborting_update()
sys.exit(0)
def create_fork_upstream(upstream):
print('{}... Creating upstream ...{}'.format(TERMINAL_YELLOW, TERMINAL_DEFAULT))
print('git remote add {} {}'.format(upstream, AIL_REPO))
process = subprocess.run(['git', 'remote', 'add', upstream, AIL_REPO], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
if process.returncode == 0:
print(process.stdout.decode())
if is_upstream_created(upstream):
print('Fork upstream created')
print('{}... ...{}'.format(TERMINAL_YELLOW, TERMINAL_DEFAULT))
else:
print('Fork not created')
aborting_update()
sys.exit(0)
else:
print('{}{}{}'.format(TERMINAL_RED, process.stderr.decode(), TERMINAL_DEFAULT))
aborting_update()
sys.exit(0)
def update_fork():
print('{}... Updating fork ...{}'.format(TERMINAL_YELLOW, TERMINAL_DEFAULT))
if cfg.get('Update', 'update-fork') == 'True' or cfg.get('Update', 'update-fork') == 'true':
upstream = cfg.get('Update', 'upstream')
if not is_upstream_created(upstream):
create_fork_upstream(upstream)
print('{}git fetch {}:{}'.format(TERMINAL_YELLOW, upstream, TERMINAL_DEFAULT))
process = subprocess.run(['git', 'fetch', upstream], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
if process.returncode == 0:
print(process.stdout.decode())
print('{}git checkout master:{}'.format(TERMINAL_YELLOW, TERMINAL_DEFAULT))
process = subprocess.run(['git', 'checkout', 'master'], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
if process.returncode == 0:
print(process.stdout.decode())
print('{}git merge {}/master:{}'.format(TERMINAL_YELLOW, upstream, TERMINAL_DEFAULT))
process = subprocess.run(['git', 'merge', '{}/master'.format(upstream)], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
if process.returncode == 0:
print(process.stdout.decode())
print('{}... ...{}'.format(TERMINAL_YELLOW, TERMINAL_DEFAULT))
else:
print('{}{}{}'.format(TERMINAL_RED, process.stderr.decode(), TERMINAL_DEFAULT))
aborting_update()
sys.exit(1)
else:
print('{}{}{}'.format(TERMINAL_RED, process.stderr.decode(), TERMINAL_DEFAULT))
aborting_update()
sys.exit(0)
else:
print('{}{}{}'.format(TERMINAL_RED, process.stderr.decode(), TERMINAL_DEFAULT))
aborting_update()
sys.exit(0)
else:
print('{}Fork Auto-Update disabled in config file{}'.format(TERMINAL_YELLOW, TERMINAL_DEFAULT))
aborting_update()
sys.exit(0)
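Every git invocation in `Update.py` follows the same pattern: run the command with captured stdout/stderr, continue on a zero return code, and abort otherwise. That pattern can be factored into a small helper (a sketch, not part of the codebase; `run_checked` is a hypothetical name, demonstrated here with a Python subprocess instead of git):

```python
import subprocess
import sys

def run_checked(cmd):
    # Capture stdout/stderr and fail loudly on a non-zero return code,
    # mirroring the checks wrapped around each git call above.
    proc = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    if proc.returncode != 0:
        raise RuntimeError(proc.stderr.decode())
    return proc.stdout.decode()

out = run_checked([sys.executable, '-c', 'print("ok")'])
print(out.strip())  # ok
```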
def get_git_current_tag(current_version_path):
try:
with open(current_version_path, 'r') as version_content:
version = version_content.read()
except FileNotFoundError:
version = 'v1.4'
with open(current_version_path, 'w') as version_content:
version_content.write(version)
version = version.replace(" ", "").splitlines()
return version[0]
def get_git_upper_tags_remote(current_tag, is_fork):
if is_fork:
process = subprocess.run(['git', 'tag'], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
if process.returncode == 0:
list_all_tags = process.stdout.decode().splitlines()
list_upper_tags = []
if list_all_tags[-1][1:] == current_tag:
list_upper_tags.append( (list_all_tags[-1], None) )
return list_upper_tags
for tag in list_all_tags:
if float(tag[1:]) >= float(current_tag):
list_upper_tags.append( (tag, None) )
return list_upper_tags
else:
print('{}{}{}'.format(TERMINAL_RED, process.stderr.decode(), TERMINAL_DEFAULT))
aborting_update()
sys.exit(0)
else:
process = subprocess.run(['git', 'ls-remote' ,'--tags'], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
if process.returncode == 0:
list_all_tags = process.stdout.decode().splitlines()
last_tag = list_all_tags[-1].split('\trefs/tags/')
last_commit = last_tag[0]
last_tag = last_tag[1].split('^{}')[0]
list_upper_tags = []
if last_tag[1:] == current_tag:
list_upper_tags.append( (last_tag, last_commit) )
return list_upper_tags
else:
for mess_tag in list_all_tags:
commit, tag = mess_tag.split('\trefs/tags/')
# add tag with last commit
if float(tag.split('^{}')[0][1:]) >= float(current_tag):
if '^{}' in tag:
list_upper_tags.append( (tag.split('^{}')[0], commit) )
# add last commit
if last_tag not in list_upper_tags[-1][0]:
list_upper_tags.append( (last_tag, last_commit) )
return list_upper_tags
else:
print('{}{}{}'.format(TERMINAL_RED, process.stderr.decode(), TERMINAL_DEFAULT))
aborting_update()
sys.exit(0)
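`get_git_upper_tags_remote()` above selects every release tag at or above the current version by stripping the leading `v` and comparing as floats. Isolated as a sketch (the function name is illustrative):

```python
def tags_at_or_above(all_tags, current_tag):
    # Same comparison Update.py relies on: drop the leading 'v' and
    # compare tags numerically. Note this only works for single-dot
    # tags (v1.4, v1.5, ...), since float('1.10') < float('1.9').
    return [tag for tag in all_tags if float(tag[1:]) >= float(current_tag)]

print(tags_at_or_above(['v1.4', 'v1.5', 'v2.0'], '1.5'))  # ['v1.5', 'v2.0']
```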
def update_ail(current_tag, list_upper_tags_remote, current_version_path, is_fork):
print('{}git checkout master:{}'.format(TERMINAL_YELLOW, TERMINAL_DEFAULT))
process = subprocess.run(['git', 'checkout', 'master'], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
#process = subprocess.run(['ls'], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
if process.returncode == 0:
print(process.stdout.decode())
print()
print('{}git pull:{}'.format(TERMINAL_YELLOW, TERMINAL_DEFAULT))
process = subprocess.run(['git', 'pull'], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
if process.returncode == 0:
output = process.stdout.decode()
print(output)
if len(list_upper_tags_remote) == 1:
# additional update (between 2 commits on the same version)
additional_update_path = os.path.join(os.environ['AIL_HOME'], 'update', current_tag, 'additional_update.sh')
if os.path.isfile(additional_update_path):
print()
print('{}------------------------------------------------------------------'.format(TERMINAL_YELLOW))
print('- Launching Additional Update: -')
print('-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --{}'.format(TERMINAL_DEFAULT))
process = subprocess.run(['bash', additional_update_path], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
if process.returncode == 0:
output = process.stdout.decode()
print(output)
else:
print('{}{}{}'.format(TERMINAL_RED, process.stderr.decode(), TERMINAL_DEFAULT))
aborting_update()
sys.exit(1)
print()
                print('{}**************** AIL Successfully Updated *****************{}'.format(TERMINAL_YELLOW, TERMINAL_DEFAULT))
print()
exit(0)
else:
# map version with roll back commit
list_update = []
previous_commit = list_upper_tags_remote[0][1]
for tuple in list_upper_tags_remote[1:]:
tag = tuple[0]
list_update.append( (tag, previous_commit) )
previous_commit = tuple[1]
for update in list_update:
launch_update_version(update[0], update[1], current_version_path, is_fork)
        # Success
        print('{}**************** AIL Successfully Updated *****************{}'.format(TERMINAL_YELLOW, TERMINAL_DEFAULT))
print()
sys.exit(0)
else:
print('{}{}{}'.format(TERMINAL_RED, process.stderr.decode(), TERMINAL_DEFAULT))
aborting_update()
sys.exit(1)
else:
print('{}{}{}'.format(TERMINAL_RED, process.stderr.decode(), TERMINAL_DEFAULT))
aborting_update()
sys.exit(0)
def launch_update_version(version, roll_back_commit, current_version_path, is_fork):
update_path = os.path.join(os.environ['AIL_HOME'], 'update', version, 'Update.sh')
print()
print('{}------------------------------------------------------------------'.format(TERMINAL_YELLOW))
print('- Launching Update: {}{}{} -'.format(TERMINAL_BLUE, version, TERMINAL_YELLOW))
print('-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --{}'.format(TERMINAL_DEFAULT))
process = subprocess.Popen(['bash', update_path], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
while True:
output = process.stdout.readline().decode()
if output == '' and process.poll() is not None:
break
if output:
print(output.strip())
if process.returncode == 0:
#output = process.stdout.decode()
#print(output)
with open(current_version_path, 'w') as version_content:
version_content.write(version)
print('{}-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --'.format(TERMINAL_YELLOW))
        print('- Successfully Updated: {}{}{} -'.format(TERMINAL_BLUE, version, TERMINAL_YELLOW))
print('------------------------------------------------------------------{}'.format(TERMINAL_DEFAULT))
print()
else:
#print(process.stdout.read().decode())
print('{}{}{}'.format(TERMINAL_RED, process.stderr.read().decode(), TERMINAL_DEFAULT))
print('------------------------------------------------------------------')
print(' {}Update Error: {}{}{}'.format(TERMINAL_RED, TERMINAL_BLUE, version, TERMINAL_DEFAULT))
print('------------------------------------------------------------------')
if not is_fork:
roll_back_update(roll_back_commit)
else:
aborting_update()
sys.exit(1)
def roll_back_update(roll_back_commit):
print('Rolling back to safe commit: {}{}{}'.format(TERMINAL_BLUE ,roll_back_commit, TERMINAL_DEFAULT))
process = subprocess.run(['git', 'checkout', roll_back_commit], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
if process.returncode == 0:
output = process.stdout
print(output)
sys.exit(0)
else:
print(TERMINAL_RED+process.stderr.decode()+TERMINAL_DEFAULT)
aborting_update()
sys.exit(1)
def aborting_update():
print()
print('{}Aborting ...{}'.format(TERMINAL_RED, TERMINAL_DEFAULT))
print('{}******************************************************************'.format(TERMINAL_RED))
print('* AIL Not Updated *')
print('******************************************************************{}'.format(TERMINAL_DEFAULT))
print()
if __name__ == "__main__":
TERMINAL_RED = '\033[91m'
TERMINAL_YELLOW = '\33[93m'
TERMINAL_BLUE = '\33[94m'
TERMINAL_BLINK = '\33[6m'
TERMINAL_DEFAULT = '\033[0m'
AIL_REPO = 'https://github.com/CIRCL/AIL-framework.git'
configfile = os.path.join(os.environ['AIL_HOME'], 'configs/update.cfg')
if not os.path.exists(configfile):
raise Exception('Unable to find the configuration file. \
Did you set environment variables? \
Or activate the virtualenv.')
cfg = configparser.ConfigParser()
cfg.read(configfile)
current_version_path = os.path.join(os.environ['AIL_HOME'], 'update/current_version')
print('{}******************************************************************'.format(TERMINAL_YELLOW))
print('* Updating AIL ... *')
print('******************************************************************{}'.format(TERMINAL_DEFAULT))
if auto_update_enabled(cfg):
if check_if_files_modified():
is_fork = repo_is_fork()
if is_fork:
update_fork()
current_tag = get_git_current_tag(current_version_path)
print()
print('Current Version: {}{}{}'.format( TERMINAL_YELLOW, current_tag, TERMINAL_DEFAULT))
print()
list_upper_tags_remote = get_git_upper_tags_remote(current_tag[1:], is_fork)
            # new release
if len(list_upper_tags_remote) > 1:
print('New Releases:')
if is_fork:
for upper_tag in list_upper_tags_remote:
print(' {}{}{}'.format(TERMINAL_BLUE, upper_tag[0], TERMINAL_DEFAULT))
else:
for upper_tag in list_upper_tags_remote:
print(' {}{}{}: {}'.format(TERMINAL_BLUE, upper_tag[0], TERMINAL_DEFAULT, upper_tag[1]))
print()
update_ail(current_tag, list_upper_tags_remote, current_version_path, is_fork)
else:
print('Please, commit your changes or stash them before you can update AIL')
aborting_update()
sys.exit(0)
else:
print(' {}AIL Auto update is disabled{}'.format(TERMINAL_RED, TERMINAL_DEFAULT))
aborting_update()
sys.exit(0)

View file

@@ -153,7 +153,7 @@ if __name__ == "__main__":
                     pprint.pprint(A_values)
                     publisher.info('Url;{};{};{};Checked {} URL;{}'.format(
-                        PST.p_source, PST.p_date, PST.p_name, A_values[0], PST.p_path))
+                        PST.p_source, PST.p_date, PST.p_name, A_values[0], PST.p_rel_path))
             prec_filename = filename
         else:

View file

@@ -1,63 +0,0 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
"""
The Browse_warning_paste module
====================
This module saved signaled paste (logged as 'warning') in redis for further usage
like browsing by category
Its input comes from other modules, namely:
Credential, CreditCard, SQLinjection, CVE, Keys, Mail and Phone
"""
import redis
import time
from datetime import datetime, timedelta
from packages import Paste
from pubsublogger import publisher
from Helper import Process
import sys
sys.path.append('../')
flag_misp = False
if __name__ == "__main__":
publisher.port = 6380
publisher.channel = "Script"
config_section = 'alertHandler'
p = Process(config_section)
# port generated automatically depending on the date
curYear = datetime.now().year
server = redis.StrictRedis(
host=p.config.get("ARDB_DB", "host"),
port=p.config.get("ARDB_DB", "port"),
db=curYear,
decode_responses=True)
# FUNCTIONS #
publisher.info("Script duplicate started")
while True:
message = p.get_from_set()
if message is not None:
module_name, p_path = message.split(';')
print("new alert : {}".format(module_name))
#PST = Paste.Paste(p_path)
else:
publisher.debug("Script Attribute is idling 10s")
time.sleep(10)
continue
# Add in redis for browseWarningPaste
# Format in set: WARNING_moduleName -> p_path
key = "WARNING_" + module_name
server.sadd(key, p_path)
publisher.info('Saved warning paste {}'.format(p_path))


@@ -1,16 +0,0 @@
#!/bin/bash
set -e
set -x
[ -z "$AIL_HOME" ] && echo "Needs the env var AIL_HOME. Run the script from the virtual environment." && exit 1;
[ -z "$AIL_REDIS" ] && echo "Needs the env var AIL_REDIS. Run the script from the virtual environment." && exit 1;
[ -z "$AIL_LEVELDB" ] && echo "Needs the env var AIL_LEVELDB. Run the script from the virtual environment." && exit 1;
screen -dmS "Logging"
sleep 0.1
echo -e $GREEN"\t* Launching logging process"$DEFAULT
screen -S "Logging" -X screen -t "LogQueue" bash -c 'log_subscriber -p 6380 -c Queuing -l ../logs/; read x'
sleep 0.1
screen -S "Logging" -X screen -t "LogScript" bash -c 'log_subscriber -p 6380 -c Script -l ../logs/; read x'


@@ -1,29 +0,0 @@
#!/bin/bash
set -e
set -x
[ -z "$AIL_HOME" ] && echo "Needs the env var AIL_HOME. Run the script from the virtual environment." && exit 1;
[ -z "$AIL_REDIS" ] && echo "Needs the env var AIL_REDIS. Run the script from the virtual environment." && exit 1;
[ -z "$AIL_LEVELDB" ] && echo "Needs the env var AIL_LEVELDB. Run the script from the virtual environment." && exit 1;
lvdbhost='127.0.0.1'
lvdbdir="${AIL_HOME}/LEVEL_DB_DATA/"
nb_db=13
db_y=`date +%Y`
#Verify that a dir with the correct year exists, create it otherwise
if [ ! -d "$lvdbdir$db_y" ]; then
    mkdir -p "$db_y"
fi
screen -dmS "LevelDB"
sleep 0.1
echo -e $GREEN"\t* Launching Levels DB servers"$DEFAULT
#Launch a DB for each dir
for pathDir in $lvdbdir*/ ; do
    yDir=$(basename "$pathDir")
    sleep 0.1
    screen -S "LevelDB" -X screen -t "$yDir" bash -c 'redis-leveldb -H '$lvdbhost' -D '$pathDir'/ -P '$yDir' -M '$nb_db'; read x'
done


@@ -1,15 +0,0 @@
#!/bin/bash
set -e
set -x
[ -z "$AIL_HOME" ] && echo "Needs the env var AIL_HOME. Run the script from the virtual environment." && exit 1;
[ -z "$AIL_REDIS" ] && echo "Needs the env var AIL_REDIS. Run the script from the virtual environment." && exit 1;
[ -z "$AIL_LEVELDB" ] && echo "Needs the env var AIL_LEVELDB. Run the script from the virtual environment." && exit 1;
screen -dmS "Queue"
sleep 0.1
echo -e $GREEN"\t* Launching all the queues"$DEFAULT
screen -S "Queue" -X screen -t "Queues" bash -c './launch_queues.py; read x'


@@ -1,23 +0,0 @@
#!/bin/bash
set -e
set -x
[ -z "$AIL_HOME" ] && echo "Needs the env var AIL_HOME. Run the script from the virtual environment." && exit 1;
[ -z "$AIL_REDIS" ] && echo "Needs the env var AIL_REDIS. Run the script from the virtual environment." && exit 1;
[ -z "$AIL_LEVELDB" ] && echo "Needs the env var AIL_LEVELDB. Run the script from the virtual environment." && exit 1;
conf_dir="${AIL_HOME}/configs/"
screen -dmS "Redis"
sleep 0.1
echo -e $GREEN"\t* Launching Redis servers"$DEFAULT
screen -S "Redis" -X screen -t "6379" bash -c '../redis/src/redis-server '$conf_dir'6379.conf ; read x'
sleep 0.1
screen -S "Redis" -X screen -t "6380" bash -c '../redis/src/redis-server '$conf_dir'6380.conf ; read x'
sleep 0.1
screen -S "Redis" -X screen -t "6381" bash -c '../redis/src/redis-server '$conf_dir'6381.conf ; read x'
# For Words and curves
sleep 0.1
screen -S "Redis" -X screen -t "6382" bash -c '../redis/src/redis-server '$conf_dir'6382.conf ; read x'


@@ -1,77 +0,0 @@
#!/bin/bash
set -e
set -x
[ -z "$AIL_HOME" ] && echo "Needs the env var AIL_HOME. Run the script from the virtual environment." && exit 1;
[ -z "$AIL_REDIS" ] && echo "Needs the env var AIL_REDIS. Run the script from the virtual environment." && exit 1;
[ -z "$AIL_LEVELDB" ] && echo "Needs the env var AIL_LEVELDB. Run the script from the virtual environment." && exit 1;
echo -e "\t* Checking configuration"
bash -c "./Update-conf.py"
exitStatus=$?
if [ $exitStatus -ge 1 ]; then
    echo -e $RED"\t* Configuration not up-to-date"$DEFAULT
    exit
fi
echo -e $GREEN"\t* Configuration up-to-date"$DEFAULT
screen -dmS "Script"
sleep 0.1
echo -e $GREEN"\t* Launching ZMQ scripts"$DEFAULT
screen -S "Script" -X screen -t "ModuleInformation" bash -c './ModulesInformationV2.py -k 0 -c 1; read x'
sleep 0.1
screen -S "Script" -X screen -t "Mixer" bash -c './Mixer.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "Global" bash -c './Global.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "Duplicates" bash -c './Duplicates.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "Attributes" bash -c './Attributes.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "Lines" bash -c './Lines.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "DomClassifier" bash -c './DomClassifier.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "Categ" bash -c './Categ.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "Tokenize" bash -c './Tokenize.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "CreditCards" bash -c './CreditCards.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "Onion" bash -c './Onion.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "Mail" bash -c './Mail.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "Web" bash -c './Web.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "Credential" bash -c './Credential.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "Curve" bash -c './Curve.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "CurveManageTopSets" bash -c './CurveManageTopSets.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "RegexForTermsFrequency" bash -c './RegexForTermsFrequency.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "SetForTermsFrequency" bash -c './SetForTermsFrequency.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "Indexer" bash -c './Indexer.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "Keys" bash -c './Keys.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "Phone" bash -c './Phone.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "Release" bash -c './Release.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "Cve" bash -c './Cve.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "WebStats" bash -c './WebStats.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "ModuleStats" bash -c './ModuleStats.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "SQLInjectionDetection" bash -c './SQLInjectionDetection.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "alertHandler" bash -c './alertHandler.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "SentimentAnalysis" bash -c './SentimentAnalysis.py; read x'


@@ -37,7 +37,7 @@ class HiddenServices(object):
     """
-    def __init__(self, domain, type):
+    def __init__(self, domain, type, port=80):
         configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
         if not os.path.exists(configfile):
@@ -59,11 +59,14 @@ class HiddenServices(object):
                 db=cfg.getint("ARDB_Metadata", "db"),
                 decode_responses=True)

+        self.PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], cfg.get("Directories", "pastes")) + '/'
         self.domain = domain
         self.type = type
+        self.port = port
         self.tags = {}

-        if type == 'onion':
+        if type == 'onion' or type == 'regular':
             self.paste_directory = os.path.join(os.environ['AIL_HOME'], cfg.get("Directories", "pastes"))
             self.paste_crawled_directory = os.path.join(self.paste_directory, cfg.get("Directories", "crawled"))
             self.paste_crawled_directory_name = cfg.get("Directories", "crawled")
@@ -75,11 +78,24 @@ class HiddenServices(object):
             ## TODO: # FIXME: add error
             pass

+    #def remove_absolute_path_link(self, key, value):
+    #    print(key)
+    #    print(value)
+
+    def update_item_path_children(self, key, children):
+        if self.PASTES_FOLDER in children:
+            self.r_serv_metadata.srem(key, children)
+            children = children.replace(self.PASTES_FOLDER, '', 1)
+            self.r_serv_metadata.sadd(key, children)
+        return children
+
     def get_origin_paste_name(self):
-        origin_paste = self.r_serv_onion.hget('onion_metadata:{}'.format(self.domain), 'paste_parent')
-        if origin_paste is None:
+        origin_item = self.r_serv_onion.hget('onion_metadata:{}'.format(self.domain), 'paste_parent')
+        if origin_item is None:
             return ''
-        return origin_paste.replace(self.paste_directory+'/', '')
+        elif origin_item == 'auto' or origin_item == 'manual':
+            return origin_item
+        return origin_item.replace(self.paste_directory+'/', '')

     def get_domain_tags(self, update=False):
         if not update:
@@ -88,60 +104,119 @@ class HiddenServices(object):
             self.get_last_crawled_pastes()
             return self.tags

-    def update_domain_tags(self, children):
-        p_tags = self.r_serv_metadata.smembers('tag:'+children)
+    def update_domain_tags(self, item):
+        if self.r_serv_metadata.exists('tag:{}'.format(item)):
+            p_tags = self.r_serv_metadata.smembers('tag:{}'.format(item))
+        # update path here
+        else:
+            # need to remove it
+            if self.paste_directory in item:
+                p_tags = self.r_serv_metadata.smembers('tag:{}'.format(item.replace(self.paste_directory+'/', '')))
+            # need to remove it
+            else:
+                p_tags = self.r_serv_metadata.smembers('tag:{}'.format(os.path.join(self.paste_directory, item)))
         for tag in p_tags:
             self.tags[tag] = self.tags.get(tag, 0) + 1

-    #todo use the right paste
-    def get_last_crawled_pastes(self):
-        paste_parent = self.r_serv_onion.hget('onion_metadata:{}'.format(self.domain), 'paste_parent')
-        #paste_parent = paste_parent.replace(self.paste_directory, '')[1:]
-        return self.get_all_pastes_domain(paste_parent)
+    def get_first_crawled(self):
+        res = self.r_serv_onion.zrange('crawler_history_{}:{}:{}'.format(self.type, self.domain, self.port), 0, 0, withscores=True)
+        if res:
+            res = res[0]
+            return {'root_item':res[0], 'epoch':int(res[1])}
+        else:
+            return {}

-    def get_all_pastes_domain(self, father):
+    def get_last_crawled(self):
+        res = self.r_serv_onion.zrevrange('crawler_history_{}:{}:{}'.format(self.type, self.domain, self.port), 0, 0, withscores=True)
+        if res:
+            return {'root_item':res[0][0], 'epoch':res[0][1]}
+        else:
+            return {}
+
+    #todo use the right paste
+    def get_domain_crawled_core_item(self, epoch=None):
+        core_item = {}
+        if epoch:
+            list_root = self.r_serv_onion.zrevrangebyscore('crawler_history_{}:{}:{}'.format(self.type, self.domain, self.port), int(epoch), int(epoch))
+            if list_root:
+                core_item['root_item'] = list_root[0]
+                core_item['epoch'] = epoch
+            return core_item
+        # no history found for this epoch
+        if not core_item:
+            return self.get_last_crawled()
+
+    #todo use the right paste
+    def get_last_crawled_pastes(self, item_root=None):
+        if item_root is None:
+            item_root = self.get_domain_crawled_core_item(self)
+        return self.get_all_pastes_domain(item_root)
+
+    def get_all_pastes_domain(self, root_item):
+        if root_item is None:
+            return []
+        l_crawled_pastes = []
+        l_crawled_pastes = self.get_item_crawled_children(root_item)
+        l_crawled_pastes.append(root_item)
+        self.update_domain_tags(root_item)
+        return l_crawled_pastes
+
+    def get_item_crawled_children(self, father):
         if father is None:
             return []
         l_crawled_pastes = []
-        paste_parent = father.replace(self.paste_directory+'/', '')
-        paste_childrens = self.r_serv_metadata.smembers('paste_children:{}'.format(paste_parent))
-        ## TODO: # FIXME: remove me
-        paste_children = self.r_serv_metadata.smembers('paste_children:{}'.format(father))
-        paste_childrens = paste_childrens | paste_children
+        key = 'paste_children:{}'.format(father)
+        paste_childrens = self.r_serv_metadata.smembers(key)
         for children in paste_childrens:
+            children = self.update_item_path_children(key, children)
             if self.domain in children:
                 l_crawled_pastes.append(children)
                 self.update_domain_tags(children)
-                l_crawled_pastes.extend(self.get_all_pastes_domain(children))
+                l_crawled_pastes.extend(self.get_item_crawled_children(children))
         return l_crawled_pastes

+    def get_item_link(self, item):
+        link = self.r_serv_metadata.hget('paste_metadata:{}'.format(item), 'real_link')
+        if link is None:
+            if self.paste_directory in item:
+                self.r_serv_metadata.hget('paste_metadata:{}'.format(item.replace(self.paste_directory+'/', '')), 'real_link')
+            else:
+                key = os.path.join(self.paste_directory, item)
+                link = self.r_serv_metadata.hget('paste_metadata:{}'.format(key), 'real_link')
+                #if link:
+                #    self.remove_absolute_path_link(key, link)
+        return link
+
+    def get_all_links(self, l_items):
+        dict_links = {}
+        for item in l_items:
+            link = self.get_item_link(item)
+            if link:
+                dict_links[item] = link
+        return dict_links
+
+    # experimental
     def get_domain_son(self, l_paste):
         if l_paste is None:
             return None

         set_domain = set()
         for paste in l_paste:
-            paste_full = paste.replace(self.paste_directory+'/', '')
-            paste_childrens = self.r_serv_metadata.smembers('paste_children:{}'.format(paste_full))
-            ## TODO: # FIXME: remove me
-            paste_children = self.r_serv_metadata.smembers('paste_children:{}'.format(paste))
-            paste_childrens = paste_childrens | paste_children
+            paste_childrens = self.r_serv_metadata.smembers('paste_children:{}'.format(paste))
             for children in paste_childrens:
                 if not self.domain in children:
+                    print(children)
                     set_domain.add((children.split('.onion')[0]+'.onion').split('/')[-1])

         return set_domain

+    '''
     def get_all_domain_son(self, father):
         if father is None:
             return []
         l_crawled_pastes = []
-        paste_parent = father.replace(self.paste_directory+'/', '')
-        paste_childrens = self.r_serv_metadata.smembers('paste_children:{}'.format(paste_parent))
-        ## TODO: # FIXME: remove me
-        paste_children = self.r_serv_metadata.smembers('paste_children:{}'.format(father))
-        paste_childrens = paste_childrens | paste_children
+        paste_childrens = self.r_serv_metadata.smembers('paste_children:{}'.format(father))
         for children in paste_childrens:
             if not self.domain in children:
                 l_crawled_pastes.append(children)
@@ -149,16 +224,19 @@ class HiddenServices(object):
                 l_crawled_pastes.extend(self.get_all_domain_son(children))

         return l_crawled_pastes
+    '''

     def get_domain_random_screenshot(self, l_crawled_pastes, num_screenshot = 1):
         l_screenshot_paste = []
         for paste in l_crawled_pastes:
             ## FIXME: # TODO: remove me
+            origin_paste = paste
             paste= paste.replace(self.paste_directory+'/', '')

-            paste = paste.replace(self.paste_crawled_directory_name, '')
-            if os.path.isfile( '{}{}.png'.format(self.screenshot_directory, paste) ):
-                l_screenshot_paste.append(paste[1:])
+            screenshot = self.r_serv_metadata.hget('paste_metadata:{}'.format(paste), 'screenshot')
+            if screenshot:
+                screenshot = os.path.join(screenshot[0:2], screenshot[2:4], screenshot[4:6], screenshot[6:8], screenshot[8:10], screenshot[10:12], screenshot[12:])
+                l_screenshot_paste.append({'screenshot': screenshot, 'item': origin_paste})

         if len(l_screenshot_paste) > num_screenshot:
             l_random_screenshot = []
@@ -176,6 +254,7 @@ class HiddenServices(object):
             l_crawled_pastes = []
         return l_crawled_pastes

+    '''
     def get_last_crawled_pastes_fileSearch(self):

         last_check = self.r_serv_onion.hget('onion_metadata:{}'.format(self.domain), 'last_check')
@@ -185,3 +264,4 @@ class HiddenServices(object):
         pastes_path = os.path.join(self.paste_crawled_directory, date[0:4], date[4:6], date[6:8])
         l_crawled_pastes = [f for f in os.listdir(pastes_path) if self.domain in f]
         return l_crawled_pastes
+    '''
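The `crawler_history_<type>:<domain>:<port>` sorted set introduced in this file scores each crawl root item by its crawl epoch, which is what `get_first_crawled`/`get_last_crawled` query with `zrange`/`zrevrange`. A minimal sketch of that data model, with a plain sorted list standing in for the ARDB sorted set (all key names and values here are illustrative):

```python
# Sketch of the crawler-history lookups above, with a sorted list of
# (epoch, root_item) pairs standing in for the ARDB sorted set.
history = []

def record_crawl(epoch, root_item):
    # equivalent of ZADD: one entry per crawl, scored by its epoch
    history.append((epoch, root_item))
    history.sort(key=lambda entry: entry[0])

def get_first_crawled():
    # equivalent of ZRANGE key 0 0 WITHSCORES
    if not history:
        return {}
    epoch, root_item = history[0]
    return {'root_item': root_item, 'epoch': int(epoch)}

def get_last_crawled():
    # equivalent of ZREVRANGE key 0 0 WITHSCORES
    if not history:
        return {}
    epoch, root_item = history[-1]
    return {'root_item': root_item, 'epoch': int(epoch)}

record_crawl(1555588800, 'crawled/2019/04/18/example.onion/aaaa')
record_crawl(1556236800, 'crawled/2019/04/26/example.onion/bbbb')
print(get_first_crawled())
print(get_last_crawled())
```

Keeping one entry per epoch is what makes the tag-by-daterange view possible: every historical crawl of a domain stays addressable by its epoch.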

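`get_domain_random_screenshot` now resolves screenshots through the hash stored in `paste_metadata:<item>`, splitting the digest into two-character directory levels. A standalone sketch of that path derivation (the digest value is illustrative):

```python
import os

def screenshot_relative_path(digest):
    # Split a screenshot digest into 2-char directory levels, as
    # get_domain_random_screenshot does above.
    return os.path.join(digest[0:2], digest[2:4], digest[4:6], digest[6:8],
                        digest[8:10], digest[10:12], digest[12:])

digest = 'd1f608ce9f37a0c2b9eb4a4f47ef6d8e'  # illustrative value
print(screenshot_relative_path(digest))
# d1/f6/08/ce/9f/37/a0c2b9eb4a4f47ef6d8e
```

Fanning files out by digest prefix keeps any single screenshot directory from growing unboundedly.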

@@ -82,14 +82,14 @@ class Paste(object):
                 db=cfg.getint("ARDB_Metadata", "db"),
                 decode_responses=True)

-        PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], cfg.get("Directories", "pastes"))
-        if PASTES_FOLDER not in p_path:
+        self.PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], cfg.get("Directories", "pastes"))
+        if self.PASTES_FOLDER not in p_path:
             self.p_rel_path = p_path
-            p_path = os.path.join(PASTES_FOLDER, p_path)
+            self.p_path = os.path.join(self.PASTES_FOLDER, p_path)
         else:
-            self.p_rel_path = None
-
-        self.p_path = p_path
+            self.p_path = p_path
+            self.p_rel_path = p_path.replace(self.PASTES_FOLDER+'/', '', 1)

         self.p_name = os.path.basename(self.p_path)
         self.p_size = round(os.path.getsize(self.p_path)/1024.0, 2)
         self.p_mime = magic.from_buffer("test", mime=True)
@@ -101,7 +101,7 @@ class Paste(object):
         var = self.p_path.split('/')
         self.p_date = Date(var[-4], var[-3], var[-2])
-        self.p_rel_path = os.path.join(var[-4], var[-3], var[-2], self.p_name)
+        self.p_date_path = os.path.join(var[-4], var[-3], var[-2], self.p_name)
         self.p_source = var[-5]
         self.supposed_url = 'https://{}/{}'.format(self.p_source.replace('_pro', ''), var[-1].split('.gz')[0])
@@ -241,6 +241,16 @@ class Paste(object):
     def _get_p_date(self):
         return self.p_date

+    # used
+    def get_p_date(self):
+        return self.p_date
+
+    def get_item_source(self):
+        return self.p_source
+
+    def get_item_size(self):
+        return self.p_size
+
     def _get_p_size(self):
         return self.p_size
@@ -286,14 +296,22 @@ class Paste(object):
         return False, var

     def _get_p_duplicate(self):
-        self.p_duplicate = self.store_metadata.smembers('dup:'+self.p_path)
-        if self.p_rel_path is not None:
-            self.p_duplicate.union( self.store_metadata.smembers('dup:'+self.p_rel_path) )
+        p_duplicate = self.store_metadata.smembers('dup:'+self.p_path)
+        # remove absolute path #fix-db
+        if p_duplicate:
+            for duplicate_string in p_duplicate:
+                self.store_metadata.srem('dup:'+self.p_path, duplicate_string)
+                self.store_metadata.sadd('dup:'+self.p_rel_path, duplicate_string.replace(self.PASTES_FOLDER+'/', '', 1))
+
+        self.p_duplicate = self.store_metadata.smembers('dup:'+self.p_rel_path)
         if self.p_duplicate is not None:
             return list(self.p_duplicate)
         else:
             return '[]'

+    def get_nb_duplicate(self):
+        # # TODO: FIXME use relative path
+        return self.store_metadata.scard('dup:'+self.p_path) + self.store_metadata.scard('dup:'+self.p_rel_path)
+
     def _get_p_tags(self):
         self.p_tags = self.store_metadata.smembers('tag:'+path, tag)
         if self.self.p_tags is not None:
@@ -304,6 +322,9 @@ class Paste(object):
     def get_p_rel_path(self):
         return self.p_rel_path

+    def get_p_date_path(self):
+        return self.p_date_path
+
     def save_all_attributes_redis(self, key=None):
         """
         Saving all the attributes in a "Redis-like" Database (Redis, LevelDB)
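The constructor change above makes `Paste` accept either an absolute or a relative item path and derive the other form, so downstream keys can consistently use relative paths. A standalone sketch of that normalization (the `PASTES` value is an illustrative assumption):

```python
import os

def normalize_item_path(p_path, pastes_folder):
    # Return (absolute_path, relative_path) for an item, mirroring the
    # Paste.__init__ logic above.
    if pastes_folder not in p_path:
        # caller passed a relative path
        rel_path = p_path
        abs_path = os.path.join(pastes_folder, p_path)
    else:
        # caller passed an absolute path
        abs_path = p_path
        rel_path = p_path.replace(pastes_folder + '/', '', 1)
    return abs_path, rel_path

PASTES = '/opt/AIL/PASTES'  # illustrative value
print(normalize_item_path('archive/2019/04/25/a.gz', PASTES))
print(normalize_item_path('/opt/AIL/PASTES/archive/2019/04/25/a.gz', PASTES))
```

Both calls yield the same pair, which is the point: every caller can hand over whichever form it has.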


@@ -249,5 +249,5 @@ db = 0
 [Crawler]
 activate_crawler = False
 crawler_depth_limit = 1
-splash_url_onion = http://127.0.0.1
-splash_onion_port = 8050-8052
+splash_url = http://127.0.0.1
+splash_port = 8050-8052
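A consumer of the renamed `splash_url`/`splash_port` keys has to expand the `first-last` port range itself. A sketch with `configparser` (the range-expansion snippet is illustrative, not code from this PR):

```python
import configparser

cfg = configparser.ConfigParser()
cfg.read_string("""
[Crawler]
activate_crawler = False
crawler_depth_limit = 1
splash_url = http://127.0.0.1
splash_port = 8050-8052
""")

splash_url = cfg.get('Crawler', 'splash_url')
port_range = cfg.get('Crawler', 'splash_port')

# "8050-8052" -> [8050, 8051, 8052]; a single "8050" would yield [8050]
if '-' in port_range:
    first, last = map(int, port_range.split('-'))
    ports = list(range(first, last + 1))
else:
    ports = [int(port_range)]

print(ports)  # [8050, 8051, 8052]
```

One Splash instance typically listens per port, so the expanded list is the pool of crawler slots.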

bin/packages/git_status.py (new executable file, 140 lines added)

@@ -0,0 +1,140 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*

import subprocess

TERMINAL_RED = '\033[91m'
TERMINAL_YELLOW = '\33[93m'
TERMINAL_BLUE = '\33[94m'
TERMINAL_BLINK = '\33[6m'
TERMINAL_DEFAULT = '\033[0m'

# Check if working directory is clean
def is_working_directory_clean(verbose=False):
    if verbose:
        print('check if this git directory is clean ...')
        #print('git ls-files -m')
    process = subprocess.run(['git', 'ls-files', '-m'], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    if process.returncode == 0:
        res = process.stdout
        if res == b'':
            return True
        else:
            return False
    else:
        if verbose:
            print('{}{}{}'.format(TERMINAL_RED, process.stderr.decode(), TERMINAL_DEFAULT))
        return False

# Check if this git is not a fork
def is_not_fork(origin_repo, verbose=False):
    if verbose:
        print('check if this git is a fork ...')
        #print('git remote -v')
    process = subprocess.run(['git', 'remote', '-v'], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    if process.returncode == 0:
        res = process.stdout.decode()
        if verbose:
            print(res)
        if 'origin {}'.format(origin_repo) in res:
            return True
        else:
            return False
    else:
        if verbose:
            print('{}{}{}'.format(TERMINAL_RED, process.stderr.decode(), TERMINAL_DEFAULT))
        return False

# Get current branch
def get_current_branch(verbose=False):
    if verbose:
        print('retrieving current branch ...')
        #print('git rev-parse --abbrev-ref HEAD')
    process = subprocess.run(['git', 'rev-parse', '--abbrev-ref', 'HEAD'], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    if process.returncode == 0:
        current_branch = process.stdout.replace(b'\n', b'').decode()
        if verbose:
            print(current_branch)
        return current_branch
    else:
        if verbose:
            print('{}{}{}'.format(TERMINAL_RED, process.stderr.decode(), TERMINAL_DEFAULT))
        return ''

# Get last commit id on master branch from remote
def get_last_commit_id_from_remote(branch='master', verbose=False):
    if verbose:
        print('retrieving last remote commit id ...')
        #print('git ls-remote origin master')
    process = subprocess.run(['git', 'ls-remote', 'origin', branch], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    if process.returncode == 0:
        last_commit_id = process.stdout.split(b'\t')[0].replace(b'\n', b'').decode()
        if verbose:
            print(last_commit_id)
        return last_commit_id
    else:
        if verbose:
            print('{}{}{}'.format(TERMINAL_RED, process.stderr.decode(), TERMINAL_DEFAULT))
        return ''

# Get last commit id on master branch from local
def get_last_commit_id_from_local(branch='master', verbose=False):
    if verbose:
        print('retrieving last local commit id ...')
        #print('git rev-parse master')
    process = subprocess.run(['git', 'rev-parse', branch], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    if process.returncode == 0:
        last_commit_id = process.stdout.replace(b'\n', b'').decode()
        if verbose:
            print(last_commit_id)
        return last_commit_id
    else:
        if verbose:
            print('{}{}{}'.format(TERMINAL_RED, process.stderr.decode(), TERMINAL_DEFAULT))
        return ''

# Get last local tag
def get_last_tag_from_local(verbose=False):
    if verbose:
        print('retrieving last local tag ...')
        #print('git describe --abbrev=0 --tags')
    process = subprocess.run(['git', 'describe', '--abbrev=0', '--tags'], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    if process.returncode == 0:
        last_local_tag = process.stdout.replace(b'\n', b'').decode()
        if verbose:
            print(last_local_tag)
        return last_local_tag
    else:
        if verbose:
            print('{}{}{}'.format(TERMINAL_RED, process.stderr.decode(), TERMINAL_DEFAULT))
        return ''

# Get last remote tag
def get_last_tag_from_remote(verbose=False):
    if verbose:
        print('retrieving last remote tag ...')
        #print('git ls-remote --tags')
    process = subprocess.run(['git', 'ls-remote', '--tags'], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    if process.returncode == 0:
        res = process.stdout.split(b'\n')[-2].split(b'/')[-1].replace(b'^{}', b'').decode()
        if verbose:
            print(res)
        return res
    else:
        if verbose:
            print('{}{}{}'.format(TERMINAL_RED, process.stderr.decode(), TERMINAL_DEFAULT))
        return ''

if __name__ == "__main__":
    get_last_commit_id_from_remote(verbose=True)
    get_last_commit_id_from_local(verbose=True)
    get_last_tag_from_local(verbose=True)
    get_current_branch(verbose=True)
    print(is_not_fork('https://github.com/CIRCL/AIL-framework.git'))
    print(is_working_directory_clean())
    get_last_tag_from_remote(verbose=True)
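The checks above can feed the updater's go/no-go decision. A sketch of that policy as a pure function (the `can_auto_update` helper is illustrative, not part of this PR; in practice its arguments would come from `is_working_directory_clean()`, `is_not_fork(...)` and the two `get_last_commit_id_*` helpers):

```python
def can_auto_update(is_clean, is_origin_repo, local_commit, remote_commit):
    # Mirror of the updater's checks: only a clean, non-fork checkout
    # that is behind the remote should be auto-updated.
    if not is_origin_repo or not is_clean:
        return False
    return local_commit != remote_commit

print(can_auto_update(True, True, 'abc123', 'def456'))   # behind remote -> update
print(can_auto_update(False, True, 'abc123', 'def456'))  # dirty tree -> refuse
```

Keeping the decision separate from the `subprocess` plumbing makes it trivial to test without a git checkout.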


@@ -51,7 +51,7 @@ publish = Redis_CreditCards,Redis_Mail,Redis_Onion,Redis_Web,Redis_Credential,Re
 [CreditCards]
 subscribe = Redis_CreditCards
-publish = Redis_Duplicate,Redis_ModuleStats,Redis_alertHandler,Redis_Tags
+publish = Redis_Duplicate,Redis_ModuleStats,Redis_Tags

 [BankAccount]
 subscribe = Redis_Global
@@ -59,12 +59,12 @@ publish = Redis_Duplicate,Redis_Tags

 [Mail]
 subscribe = Redis_Mail
-publish = Redis_Duplicate,Redis_ModuleStats,Redis_alertHandler,Redis_Tags
+publish = Redis_Duplicate,Redis_ModuleStats,Redis_Tags

 [Onion]
 subscribe = Redis_Onion
-publish = Redis_ValidOnion,ZMQ_FetchedOnion,Redis_alertHandler,Redis_Tags,Redis_Crawler
-#publish = Redis_Global,Redis_ValidOnion,ZMQ_FetchedOnion,Redis_alertHandler
+publish = Redis_ValidOnion,ZMQ_FetchedOnion,Redis_Tags,Redis_Crawler
+#publish = Redis_Global,Redis_ValidOnion,ZMQ_FetchedOnion

 [DumpValidOnion]
 subscribe = Redis_ValidOnion
@@ -78,18 +78,15 @@ subscribe = Redis_Url
 [LibInjection]
 subscribe = Redis_Url
-publish = Redis_alertHandler,Redis_Duplicate,Redis_Tags
+publish = Redis_Duplicate,Redis_Tags

 [SQLInjectionDetection]
 subscribe = Redis_Url
-publish = Redis_alertHandler,Redis_Duplicate,Redis_Tags
+publish = Redis_Duplicate,Redis_Tags

 [ModuleStats]
 subscribe = Redis_ModuleStats

-[alertHandler]
-subscribe = Redis_alertHandler
-
 [Tags]
 subscribe = Redis_Tags
 publish = Redis_Tags_feed
@@ -99,7 +96,7 @@ subscribe = Redis_Tags_feed
 #[send_to_queue]
 #subscribe = Redis_Cve
-#publish = Redis_alertHandler,Redis_Tags
+#publish = Redis_Tags

 [SentimentAnalysis]
 subscribe = Redis_Global
@@ -109,31 +106,31 @@ subscribe = Redis_Global
 [Credential]
 subscribe = Redis_Credential
-publish = Redis_Duplicate,Redis_ModuleStats,Redis_alertHandler,Redis_Tags
+publish = Redis_Duplicate,Redis_ModuleStats,Redis_Tags

 [Cve]
 subscribe = Redis_Cve
-publish = Redis_alertHandler,Redis_Duplicate,Redis_Tags
+publish = Redis_Duplicate,Redis_Tags

 [Phone]
 subscribe = Redis_Global
-publish = Redis_Duplicate,Redis_alertHandler,Redis_Tags
+publish = Redis_Duplicate,Redis_Tags

 [Keys]
 subscribe = Redis_Global
-publish = Redis_Duplicate,Redis_alertHandler,Redis_Tags
+publish = Redis_Duplicate,Redis_Tags

 [ApiKey]
 subscribe = Redis_ApiKey
-publish = Redis_Duplicate,Redis_alertHandler,Redis_Tags
+publish = Redis_Duplicate,Redis_Tags

 [Decoder]
 subscribe = Redis_Global
-publish = Redis_Duplicate,Redis_alertHandler,Redis_Tags
+publish = Redis_Duplicate,Redis_Tags

 [Bitcoin]
 subscribe = Redis_Global
-publish = Redis_Duplicate,Redis_alertHandler,Redis_Tags
+publish = Redis_Duplicate,Redis_Tags

 [submit_paste]
 subscribe = Redis
@@ -142,4 +139,3 @@ publish = Redis_Mixer
 [Crawler]
 subscribe = Redis_Crawler
 publish = Redis_Mixer,Redis_Tags
-


@@ -96,25 +96,48 @@ def remove_submit_uuid(uuid):
     r_serv_db.srem('submitted:uuid', uuid)
     print('{} all file submitted'.format(uuid))

+def add_item_tag(tag, item_path):
+    item_date = int(get_item_date(item_path))
+
+    #add tag
+    r_serv_metadata.sadd('tag:{}'.format(item_path), tag)
+    r_serv_tags.sadd('{}:{}'.format(tag, item_date), item_path)
+
+    r_serv_tags.hincrby('daily_tags:{}'.format(item_date), tag, 1)
+
+    tag_first_seen = r_serv_tags.hget('tag_metadata:{}'.format(tag), 'first_seen')
+    if tag_first_seen is None:
+        tag_first_seen = 99999999
+    else:
+        tag_first_seen = int(tag_first_seen)
+
+    tag_last_seen = r_serv_tags.hget('tag_metadata:{}'.format(tag), 'last_seen')
+    if tag_last_seen is None:
+        tag_last_seen = 0
+    else:
+        tag_last_seen = int(tag_last_seen)
+
+    #add new tag in list of all used tags
+    r_serv_tags.sadd('list_tags', tag)
+
+    # update first_seen
+    if item_date < tag_first_seen:
+        r_serv_tags.hset('tag_metadata:{}'.format(tag), 'first_seen', item_date)
+
+    # update metadata last_seen
+    if item_date > tag_last_seen:
+        r_serv_tags.hset('tag_metadata:{}'.format(tag), 'last_seen', item_date)
+
 def add_tags(tags, tagsgalaxies, path):
     list_tag = tags.split(',')
     list_tag_galaxies = tagsgalaxies.split(',')

     if list_tag != ['']:
         for tag in list_tag:
-            #add tag
-            r_serv_metadata.sadd('tag:'+path, tag)
-            r_serv_tags.sadd(tag, path)
-            #add new tag in list of all used tags
-            r_serv_tags.sadd('list_tags', tag)
+            add_item_tag(tag, path)

     if list_tag_galaxies != ['']:
         for tag in list_tag_galaxies:
-            #add tag
-            r_serv_metadata.sadd('tag:'+path, tag)
-            r_serv_tags.sadd(tag, path)
-            #add new tag in list of all used tags
-            r_serv_tags.sadd('list_tags', tag)
+            add_item_tag(tag, path)

 def verify_extention_filename(filename):
     if not '.' in filename:
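`add_item_tag` maintains a per-tag date range in the `tag_metadata:<tag>` hash, which is what drives the new tag-by-daterange view. A sketch of just that first_seen/last_seen bookkeeping, with a dict standing in for the Redis hash (tag and dates are illustrative):

```python
tag_metadata = {}  # stand-in for the tag_metadata:<tag> Redis hashes

def update_tag_date_range(tag, item_date):
    # Keep first_seen as the minimum and last_seen as the maximum item
    # date observed for a tag (dates are YYYYMMDD integers), mirroring
    # the add_item_tag logic above.
    meta = tag_metadata.setdefault(tag, {})
    first_seen = int(meta.get('first_seen', 99999999))
    last_seen = int(meta.get('last_seen', 0))
    if item_date < first_seen:
        meta['first_seen'] = item_date
    if item_date > last_seen:
        meta['last_seen'] = item_date

update_tag_date_range('infoleak:automatic-detection="credential"', 20190420)
update_tag_date_range('infoleak:automatic-detection="credential"', 20190425)
update_tag_date_range('infoleak:automatic-detection="credential"', 20190301)
print(tag_metadata)
```

Together with the `<tag>:<date>` sets and `daily_tags:<date>` counters, this lets the UI enumerate items for a tag over an arbitrary date range without scanning all items.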


@ -12,6 +12,8 @@ import redis
import json import json
import time import time
from hashlib import sha256
from scrapy.spidermiddlewares.httperror import HttpError from scrapy.spidermiddlewares.httperror import HttpError
from twisted.internet.error import DNSLookupError from twisted.internet.error import DNSLookupError
from twisted.internet.error import TimeoutError from twisted.internet.error import TimeoutError
@@ -28,10 +30,10 @@ from Helper import Process

 class TorSplashCrawler():

-    def __init__(self, splash_url, crawler_depth_limit):
+    def __init__(self, splash_url, crawler_options):
         self.process = CrawlerProcess({'LOG_ENABLED': False})
         self.crawler = Crawler(self.TorSplashSpider, {
-            'USER_AGENT': 'Mozilla/5.0 (Windows NT 6.1; rv:24.0) Gecko/20100101 Firefox/24.0',
+            'USER_AGENT': crawler_options['user_agent'],
             'SPLASH_URL': splash_url,
             'ROBOTSTXT_OBEY': False,
             'DOWNLOADER_MIDDLEWARES': {'scrapy_splash.SplashCookiesMiddleware': 723,
@@ -42,26 +44,34 @@ class TorSplashCrawler():
             'DUPEFILTER_CLASS': 'scrapy_splash.SplashAwareDupeFilter',
             'HTTPERROR_ALLOW_ALL': True,
             'RETRY_TIMES': 2,
-            'CLOSESPIDER_PAGECOUNT': 50,
-            'DEPTH_LIMIT': crawler_depth_limit
+            'CLOSESPIDER_PAGECOUNT': crawler_options['closespider_pagecount'],
+            'DEPTH_LIMIT': crawler_options['depth_limit']
             })

-    def crawl(self, type, url, domain, original_paste, super_father):
-        self.process.crawl(self.crawler, type=type, url=url, domain=domain, original_paste=original_paste, super_father=super_father)
+    def crawl(self, type, crawler_options, date, url, domain, port, original_item):
+        self.process.crawl(self.crawler, type=type, crawler_options=crawler_options, date=date, url=url, domain=domain, port=port, original_item=original_item)
         self.process.start()
     class TorSplashSpider(Spider):
         name = 'TorSplashSpider'

-        def __init__(self, type, url, domain, original_paste, super_father, *args, **kwargs):
+        def __init__(self, type, crawler_options, date, url, domain, port, original_item, *args, **kwargs):
             self.type = type
-            self.original_paste = original_paste
-            self.super_father = super_father
+            self.original_item = original_item
+            self.root_key = None
             self.start_urls = url
             self.domains = [domain]
-            date = datetime.datetime.now().strftime("%Y/%m/%d")
-            self.full_date = datetime.datetime.now().strftime("%Y%m%d")
-            self.date_month = datetime.datetime.now().strftime("%Y%m")
+            self.port = str(port)
+            date_str = '{}/{}/{}'.format(date['date_day'][0:4], date['date_day'][4:6], date['date_day'][6:8])
+            self.full_date = date['date_day']
+            self.date_month = date['date_month']
+            self.date_epoch = int(date['epoch'])
+            self.arg_crawler = { 'html': crawler_options['html'],
+                                 'wait': 10,
+                                 'render_all': 1,
+                                 'har': crawler_options['har'],
+                                 'png': crawler_options['png']}
             config_section = 'Crawler'
             self.p = Process(config_section)
@@ -90,12 +100,13 @@ class TorSplashCrawler():
                 db=self.p.config.getint("ARDB_Onion", "db"),
                 decode_responses=True)

-            self.crawler_path = os.path.join(self.p.config.get("Directories", "crawled"), date )
-            self.crawled_paste_filemame = os.path.join(os.environ['AIL_HOME'], self.p.config.get("Directories", "pastes"),
-                                            self.p.config.get("Directories", "crawled"), date )
-            self.crawled_screenshot = os.path.join(os.environ['AIL_HOME'], self.p.config.get("Directories", "crawled_screenshot"), date )
+            self.crawler_path = os.path.join(self.p.config.get("Directories", "crawled"), date_str )
+            self.crawled_paste_filemame = os.path.join(os.environ['AIL_HOME'], self.p.config.get("Directories", "pastes"),
+                                            self.p.config.get("Directories", "crawled"), date_str )
+            self.crawled_har = os.path.join(os.environ['AIL_HOME'], self.p.config.get("Directories", "crawled_screenshot"), date_str )
+            self.crawled_screenshot = os.path.join(os.environ['AIL_HOME'], self.p.config.get("Directories", "crawled_screenshot") )
         def start_requests(self):
             yield SplashRequest(
@@ -103,12 +114,8 @@ class TorSplashCrawler():
                 self.parse,
                 errback=self.errback_catcher,
                 endpoint='render.json',
-                meta={'father': self.original_paste},
-                args={ 'html': 1,
-                       'wait': 10,
-                       'render_all': 1,
-                       'har': 1,
-                       'png': 1}
+                meta={'father': self.original_item, 'root_key': None},
+                args=self.arg_crawler
                 )
         def parse(self,response):
@@ -131,12 +138,13 @@ class TorSplashCrawler():
                 UUID = self.domains[0][-215:]+str(uuid.uuid4())
             else:
                 UUID = self.domains[0]+str(uuid.uuid4())
-            filename_paste = os.path.join(self.crawled_paste_filemame, UUID)
+            filename_paste_full = os.path.join(self.crawled_paste_filemame, UUID)
             relative_filename_paste = os.path.join(self.crawler_path, UUID)
-            filename_screenshot = os.path.join(self.crawled_screenshot, UUID +'.png')
+            filename_har = os.path.join(self.crawled_har, UUID)

+            # # TODO: modify me
             # save new paste on disk
-            if self.save_crawled_paste(filename_paste, response.data['html']):
+            if self.save_crawled_paste(relative_filename_paste, response.data['html']):
                 # add this paste to the domain crawled set # TODO: # FIXME: put this on cache ?
                 #self.r_serv_onion.sadd('temp:crawled_domain_pastes:{}'.format(self.domains[0]), filename_paste)
@@ -148,28 +156,55 @@ class TorSplashCrawler():
                 # create onion metadata
                 if not self.r_serv_onion.exists('{}_metadata:{}'.format(self.type, self.domains[0])):
                     self.r_serv_onion.hset('{}_metadata:{}'.format(self.type, self.domains[0]), 'first_seen', self.full_date)
                 self.r_serv_onion.hset('{}_metadata:{}'.format(self.type, self.domains[0]), 'last_seen', self.full_date)

+                # create root_key
+                if self.root_key is None:
+                    self.root_key = relative_filename_paste
+                    # Create/Update crawler history
+                    self.r_serv_onion.zadd('crawler_history_{}:{}:{}'.format(self.type, self.domains[0], self.port), self.date_epoch, self.root_key)
+                    # Update domain port number
+                    all_domain_ports = self.r_serv_onion.hget('{}_metadata:{}'.format(self.type, self.domains[0]), 'ports')
+                    if all_domain_ports:
+                        all_domain_ports = all_domain_ports.split(';')
+                    else:
+                        all_domain_ports = []
+                    if self.port not in all_domain_ports:
+                        all_domain_ports.append(self.port)
+                        self.r_serv_onion.hset('{}_metadata:{}'.format(self.type, self.domains[0]), 'ports', ';'.join(all_domain_ports))

                 #create paste metadata
-                self.r_serv_metadata.hset('paste_metadata:'+filename_paste, 'super_father', self.super_father)
-                self.r_serv_metadata.hset('paste_metadata:'+filename_paste, 'father', response.meta['father'])
-                self.r_serv_metadata.hset('paste_metadata:'+filename_paste, 'domain', self.domains[0])
-                self.r_serv_metadata.hset('paste_metadata:'+filename_paste, 'real_link', response.url)
+                self.r_serv_metadata.hset('paste_metadata:{}'.format(relative_filename_paste), 'super_father', self.root_key)
+                self.r_serv_metadata.hset('paste_metadata:{}'.format(relative_filename_paste), 'father', response.meta['father'])
+                self.r_serv_metadata.hset('paste_metadata:{}'.format(relative_filename_paste), 'domain', '{}:{}'.format(self.domains[0], self.port))
+                self.r_serv_metadata.hset('paste_metadata:{}'.format(relative_filename_paste), 'real_link', response.url)

-                self.r_serv_metadata.sadd('paste_children:'+response.meta['father'], filename_paste)
+                self.r_serv_metadata.sadd('paste_children:'+response.meta['father'], relative_filename_paste)
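The `ports` hash field written above is a `;`-separated string. A small standalone sketch of that parse, append, and re-serialize cycle (the helper name is illustrative, not part of the patch):

```python
def add_port(ports_field, port):
    """Mirror of the ';'-separated ports update in the spider:
    parse the hash field value, append the port if it is new,
    and return the serialized value to store back."""
    ports = ports_field.split(';') if ports_field else []
    if port not in ports:
        ports.append(port)
    return ';'.join(ports)
```

This keeps the field idempotent: re-crawling the same port leaves the stored value unchanged.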
-                dirname = os.path.dirname(filename_screenshot)
-                if not os.path.exists(dirname):
-                    os.makedirs(dirname)
-
-                size_screenshot = (len(response.data['png'])*3) /4
-                if size_screenshot < 5000000: #bytes
-                    with open(filename_screenshot, 'wb') as f:
-                        f.write(base64.standard_b64decode(response.data['png'].encode()))
-
-                with open(filename_screenshot+'har.txt', 'wb') as f:
-                    f.write(json.dumps(response.data['har']).encode())
+                if 'png' in response.data:
+                    size_screenshot = (len(response.data['png'])*3) /4
+                    if size_screenshot < 5000000: #bytes
+                        image_content = base64.standard_b64decode(response.data['png'].encode())
+                        hash = sha256(image_content).hexdigest()
+                        img_dir_path = os.path.join(hash[0:2], hash[2:4], hash[4:6], hash[6:8], hash[8:10], hash[10:12])
+                        filename_img = os.path.join(self.crawled_screenshot, 'screenshot', img_dir_path, hash[12:] +'.png')
+                        dirname = os.path.dirname(filename_img)
+                        if not os.path.exists(dirname):
+                            os.makedirs(dirname)
+                        if not os.path.exists(filename_img):
+                            with open(filename_img, 'wb') as f:
+                                f.write(image_content)
+                        # add item metadata
+                        self.r_serv_metadata.hset('paste_metadata:{}'.format(relative_filename_paste), 'screenshot', hash)
+                        # add sha256 metadata
+                        self.r_serv_onion.sadd('screenshot:{}'.format(hash), relative_filename_paste)

+                if 'har' in response.data:
+                    dirname = os.path.dirname(filename_har)
+                    if not os.path.exists(dirname):
+                        os.makedirs(dirname)
+                    with open(filename_har+'.json', 'wb') as f:
+                        f.write(json.dumps(response.data['har']).encode())
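The new screenshot storage is content-addressed: the sha256 digest is split into 2-character directories to keep directory fan-out small, and identical screenshots from different pages collapse into one file. A minimal standalone sketch of that layout (the root directory name here is an illustrative assumption):

```python
import os
from hashlib import sha256

def screenshot_path(image_content: bytes, root: str = 'CRAWLED_SCREENSHOT') -> str:
    """Build a content-addressed path like the spider does: the first
    six byte-pairs of the sha256 hex digest become nested directories,
    the remainder becomes the file name."""
    digest = sha256(image_content).hexdigest()
    sub_dirs = [digest[i:i + 2] for i in range(0, 12, 2)]
    return os.path.join(root, 'screenshot', *sub_dirs, digest[12:] + '.png')

def decoded_size(b64_len: int) -> int:
    # a base64 payload decodes to ~3/4 of its encoded length,
    # which is what the (len(png) * 3) / 4 pre-check estimates
    return (b64_len * 3) // 4

path = screenshot_path(b'fake png bytes')
```

The size check runs on the still-encoded payload, so oversized screenshots are skipped before any base64 decoding work is done.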
                 # save external links in set
                 #lext = LinkExtractor(deny_domains=self.domains, unique=True)
@@ -184,12 +219,8 @@ class TorSplashCrawler():
                         self.parse,
                         errback=self.errback_catcher,
                         endpoint='render.json',
-                        meta={'father': relative_filename_paste},
-                        args={ 'html': 1,
-                               'png': 1,
-                               'render_all': 1,
-                               'har': 1,
-                               'wait': 10}
+                        meta={'father': relative_filename_paste, 'root_key': response.meta['root_key']},
+                        args=self.arg_crawler
                         )
         def errback_catcher(self, failure):
@@ -203,17 +234,17 @@ class TorSplashCrawler():
                 self.logger.error('Splash, ResponseNeverReceived for %s, retry in 10s ...', url)
                 time.sleep(10)

+                if response:
+                    response_root_key = response.meta['root_key']
+                else:
+                    response_root_key = None
+
                 yield SplashRequest(
                     url,
                     self.parse,
                     errback=self.errback_catcher,
                     endpoint='render.json',
-                    meta={'father': father},
-                    args={ 'html': 1,
-                           'png': 1,
-                           'render_all': 1,
-                           'har': 1,
-                           'wait': 10}
+                    meta={'father': father, 'root_key': response.meta['root_key']},
+                    args=self.arg_crawler
                     )
             else:


@@ -3,13 +3,15 @@
 import os
 import sys
+import json
+import redis
 import configparser

 from TorSplashCrawler import TorSplashCrawler

 if __name__ == '__main__':

-    if len(sys.argv) != 7:
-        print('usage:', 'tor_crawler.py', 'splash_url', 'type', 'url', 'domain', 'paste', 'super_father')
+    if len(sys.argv) != 2:
+        print('usage:', 'tor_crawler.py', 'uuid')
         exit(1)

     configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
@@ -21,14 +23,28 @@ if __name__ == '__main__':
     cfg = configparser.ConfigParser()
     cfg.read(configfile)

-    splash_url = sys.argv[1]
-    type = sys.argv[2]
-    crawler_depth_limit = cfg.getint("Crawler", "crawler_depth_limit")
-
-    url = sys.argv[3]
-    domain = sys.argv[4]
-    paste = sys.argv[5]
-    super_father = sys.argv[6]
-
-    crawler = TorSplashCrawler(splash_url, crawler_depth_limit)
-    crawler.crawl(type, url, domain, paste, super_father)
+    redis_cache = redis.StrictRedis(
+        host=cfg.get("Redis_Cache", "host"),
+        port=cfg.getint("Redis_Cache", "port"),
+        db=cfg.getint("Redis_Cache", "db"),
+        decode_responses=True)
+
+    # get crawler config key
+    uuid = sys.argv[1]
+
+    # get configs
+    crawler_json = json.loads(redis_cache.get('crawler_request:{}'.format(uuid)))
+
+    splash_url = crawler_json['splash_url']
+    service_type = crawler_json['service_type']
+    url = crawler_json['url']
+    domain = crawler_json['domain']
+    port = crawler_json['port']
+    original_item = crawler_json['item']
+    crawler_options = crawler_json['crawler_options']
+    date = crawler_json['date']
+
+    redis_cache.delete('crawler_request:{}'.format(uuid))
+
+    crawler = TorSplashCrawler(splash_url, crawler_options)
+    crawler.crawl(service_type, crawler_options, date, url, domain, port, original_item)

bin/update-background.py (new executable file, +62 lines)

@@ -0,0 +1,62 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
"""
Update AIL
============================
Update AIL in the background
"""
import os
import sys
import redis
import subprocess
import configparser
if __name__ == "__main__":
configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
if not os.path.exists(configfile):
raise Exception('Unable to find the configuration file. \
Did you set environment variables? \
Or activate the virtualenv.')
cfg = configparser.ConfigParser()
cfg.read(configfile)
r_serv = redis.StrictRedis(
host=cfg.get("ARDB_DB", "host"),
port=cfg.getint("ARDB_DB", "port"),
db=cfg.getint("ARDB_DB", "db"),
decode_responses=True)
if r_serv.scard('ail:update_v1.5') != 5:
r_serv.delete('ail:update_error')
r_serv.set('ail:update_in_progress', 'v1.5')
r_serv.set('ail:current_background_update', 'v1.5')
if not r_serv.sismember('ail:update_v1.5', 'onions'):
update_file = os.path.join(os.environ['AIL_HOME'], 'update', 'v1.5', 'Update-ARDB_Onions.py')
process = subprocess.run(['python' ,update_file])
if not r_serv.sismember('ail:update_v1.5', 'metadata'):
update_file = os.path.join(os.environ['AIL_HOME'], 'update', 'v1.5', 'Update-ARDB_Metadata.py')
process = subprocess.run(['python' ,update_file])
if not r_serv.sismember('ail:update_v1.5', 'tags'):
update_file = os.path.join(os.environ['AIL_HOME'], 'update', 'v1.5', 'Update-ARDB_Tags.py')
process = subprocess.run(['python' ,update_file])
if not r_serv.sismember('ail:update_v1.5', 'tags_background'):
update_file = os.path.join(os.environ['AIL_HOME'], 'update', 'v1.5', 'Update-ARDB_Tags_background.py')
process = subprocess.run(['python' ,update_file])
if not r_serv.sismember('ail:update_v1.5', 'crawled_screenshot'):
update_file = os.path.join(os.environ['AIL_HOME'], 'update', 'v1.5', 'Update-ARDB_Onions_screenshots.py')
process = subprocess.run(['python' ,update_file])
if r_serv.scard('ail:update_v1.5') != 5:
r_serv.set('ail:update_error', 'Update v1.5 Failed, please relaunch the bin/update-background.py script')
else:
r_serv.delete('ail:update_in_progress')
r_serv.delete('ail:current_background_script')
r_serv.delete('ail:current_background_script_stat')
r_serv.delete('ail:current_background_update')
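update-background.py gates each step on membership in the set `ail:update_v1.5`: finished steps record themselves there, so after a crash only the missing steps run again and `scard != 5` detects an incomplete update. A minimal model of that completion gate using a plain Python set in place of the Redis set:

```python
# The five background steps tracked in 'ail:update_v1.5' above.
ALL_STEPS = ['onions', 'metadata', 'tags', 'tags_background', 'crawled_screenshot']

def pending_steps(done_steps, all_steps=ALL_STEPS):
    """Equivalent to the sismember checks in the script: return the
    steps that still need to run, preserving their launch order."""
    return [step for step in all_steps if step not in done_steps]
```

Each update script is expected to `sadd` its own name on success, which is what makes the whole procedure safely re-runnable.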


@@ -0,0 +1,4 @@
[Update]
auto_update = True
upstream = upstream
update-fork = False


@@ -39,7 +39,7 @@ sudo apt-get install p7zip-full -y
 # REDIS #
 test ! -d redis/ && git clone https://github.com/antirez/redis.git
 pushd redis/
-git checkout 3.2
+git checkout 5.0
 make
 popd

update/bin/Update_ARDB.sh (new executable file, +18 lines)

@@ -0,0 +1,18 @@
#!/bin/bash
echo "Killing all screens ..."
bash -c "bash ../../bin/LAUNCH.sh -k"
echo ""
echo "Updating ARDB ..."
pushd ../../
rm -r ardb
git clone https://github.com/yinqiwen/ardb.git
pushd ardb/
git checkout 0.10 || exit 1
make || exit 1
popd
popd
echo "ARDB Updated"
echo ""
exit 0

update/bin/Update_Redis.sh (new executable file, +27 lines)

@@ -0,0 +1,27 @@
#!/bin/bash
[ -z "$AIL_HOME" ] && echo "Needs the env var AIL_HOME. Run the script from the virtual environment." && exit 1;
[ -z "$AIL_REDIS" ] && echo "Needs the env var AIL_REDIS. Run the script from the virtual environment." && exit 1;
[ -z "$AIL_ARDB" ] && echo "Needs the env var AIL_ARDB. Run the script from the virtual environment." && exit 1;
[ -z "$AIL_BIN" ] && echo "Needs the env var AIL_BIN. Run the script from the virtual environment." && exit 1;
[ -z "$AIL_FLASK" ] && echo "Needs the env var AIL_FLASK. Run the script from the virtual environment." && exit 1;
export PATH=$AIL_HOME:$PATH
export PATH=$AIL_REDIS:$PATH
export PATH=$AIL_ARDB:$PATH
export PATH=$AIL_BIN:$PATH
export PATH=$AIL_FLASK:$PATH
echo "Killing all screens ..."
bash -c "bash ${AIL_BIN}/LAUNCH.sh -k"
echo ""
echo "Updating Redis ..."
pushd $AIL_HOME/redis
git pull || exit 1
git checkout 5.0 || exit 1
make || exit 1
popd
echo "Redis Updated"
echo ""
exit 0


@@ -0,0 +1,193 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
import os
import sys
import time
import redis
import configparser
def update_tracked_terms(main_key, tracked_container_key):
for tracked_item in r_serv_term.smembers(main_key):
all_items = r_serv_term.smembers(tracked_container_key.format(tracked_item))
for item_path in all_items:
if PASTES_FOLDER in item_path:
new_item_path = item_path.replace(PASTES_FOLDER, '', 1)
r_serv_term.sadd(tracked_container_key.format(tracked_item), new_item_path)
r_serv_term.srem(tracked_container_key.format(tracked_item), item_path)
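update_tracked_terms migrates set members from absolute paths to paths relative to PASTES_FOLDER by adding the stripped member and removing the old one. The path rewrite itself can be checked in isolation (the helper name is illustrative):

```python
def make_relative(item_path: str, pastes_folder: str) -> str:
    """Strip the absolute PASTES_FOLDER prefix once, exactly as the
    update above does before swapping the set members."""
    if pastes_folder in item_path:
        return item_path.replace(pastes_folder, '', 1)
    return item_path
```

Using `replace(..., 1)` instead of a blanket replace guards against the folder name recurring later in the path.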
def update_hash_item(has_type):
#get all hash items:
all_hash_items = r_serv_tag.smembers('infoleak:automatic-detection=\"{}\"'.format(has_type))
for item_path in all_hash_items:
if PASTES_FOLDER in item_path:
base64_key = '{}_paste:{}'.format(has_type, item_path)
hash_key = 'hash_paste:{}'.format(item_path)
if r_serv_metadata.exists(base64_key):
new_base64_key = base64_key.replace(PASTES_FOLDER, '', 1)
res = r_serv_metadata.renamenx(base64_key, new_base64_key)
if res == 0:
print('same key, double name: {}'.format(item_path))
# fusion
all_key = r_serv_metadata.smembers(base64_key)
for elem in all_key:
r_serv_metadata.sadd(new_base64_key, elem)
r_serv_metadata.srem(base64_key, elem)
if r_serv_metadata.exists(hash_key):
new_hash_key = hash_key.replace(PASTES_FOLDER, '', 1)
res = r_serv_metadata.renamenx(hash_key, new_hash_key)
if res == 0:
print('same key, double name: {}'.format(item_path))
# fusion
all_key = r_serv_metadata.smembers(hash_key)
for elem in all_key:
r_serv_metadata.sadd(new_hash_key, elem)
r_serv_metadata.srem(hash_key, elem)
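update_hash_item relies on RENAMENX semantics: the rename refuses to clobber an existing destination key (returns 0), in which case the script falls back to merging members one by one. A sketch of that rename-or-merge pattern, modeled with a dict of plain Python sets instead of Redis sets:

```python
def rename_or_merge(store: dict, old_key: str, new_key: str) -> None:
    """Model of the RENAMENX-then-fusion logic above: rename when the
    destination is free, otherwise merge the members into it."""
    if new_key not in store:
        store[new_key] = store.pop(old_key)   # RENAMENX succeeded (returns 1)
    else:
        store[new_key] |= store.pop(old_key)  # returned 0: fusion, member by member
```

Without the fallback, two absolute paths normalizing to the same relative key would silently lose one set of members.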
if __name__ == '__main__':
start_deb = time.time()
configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
if not os.path.exists(configfile):
raise Exception('Unable to find the configuration file. \
Did you set environment variables? \
Or activate the virtualenv.')
cfg = configparser.ConfigParser()
cfg.read(configfile)
PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], cfg.get("Directories", "pastes")) + '/'
r_serv = redis.StrictRedis(
host=cfg.get("ARDB_DB", "host"),
port=cfg.getint("ARDB_DB", "port"),
db=cfg.getint("ARDB_DB", "db"),
decode_responses=True)
r_serv_metadata = redis.StrictRedis(
host=cfg.get("ARDB_Metadata", "host"),
port=cfg.getint("ARDB_Metadata", "port"),
db=cfg.getint("ARDB_Metadata", "db"),
decode_responses=True)
r_serv_tag = redis.StrictRedis(
host=cfg.get("ARDB_Tags", "host"),
port=cfg.getint("ARDB_Tags", "port"),
db=cfg.getint("ARDB_Tags", "db"),
decode_responses=True)
r_serv_term = redis.StrictRedis(
host=cfg.get("ARDB_TermFreq", "host"),
port=cfg.getint("ARDB_TermFreq", "port"),
db=cfg.getint("ARDB_TermFreq", "db"),
decode_responses=True)
r_serv_onion = redis.StrictRedis(
host=cfg.get("ARDB_Onion", "host"),
port=cfg.getint("ARDB_Onion", "port"),
db=cfg.getint("ARDB_Onion", "db"),
decode_responses=True)
r_serv.set('ail:current_background_script', 'metadata')
## Update metadata ##
print('Updating ARDB_Metadata ...')
index = 0
start = time.time()
#update stats
r_serv.set('ail:current_background_script_stat', 0)
# Update base64
update_hash_item('base64')
#update stats
r_serv.set('ail:current_background_script_stat', 20)
# Update binary
update_hash_item('binary')
#update stats
r_serv.set('ail:current_background_script_stat', 40)
# Update hexadecimal
update_hash_item('hexadecimal')
#update stats
r_serv.set('ail:current_background_script_stat', 60)
total_onion = r_serv_tag.scard('infoleak:submission=\"crawler\"')
nb_updated = 0
last_progress = 0
# Update onion metadata
all_crawled_items = r_serv_tag.smembers('infoleak:submission=\"crawler\"')
for item_path in all_crawled_items:
domain = None
if PASTES_FOLDER in item_path:
old_item_metadata = 'paste_metadata:{}'.format(item_path)
item_path = item_path.replace(PASTES_FOLDER, '', 1)
new_item_metadata = 'paste_metadata:{}'.format(item_path)
res = r_serv_metadata.renamenx(old_item_metadata, new_item_metadata)
#key already exist
if res == 0:
r_serv_metadata.delete(old_item_metadata)
# update domain port
domain = r_serv_metadata.hget(new_item_metadata, 'domain')
if domain:
if domain[-3:] != ':80':
r_serv_metadata.hset(new_item_metadata, 'domain', '{}:80'.format(domain))
super_father = r_serv_metadata.hget(new_item_metadata, 'super_father')
if super_father:
if PASTES_FOLDER in super_father:
r_serv_metadata.hset(new_item_metadata, 'super_father', super_father.replace(PASTES_FOLDER, '', 1))
father = r_serv_metadata.hget(new_item_metadata, 'father')
if father:
if PASTES_FOLDER in father:
r_serv_metadata.hset(new_item_metadata, 'father', father.replace(PASTES_FOLDER, '', 1))
nb_updated += 1
progress = int((nb_updated * 30) /total_onion)
print('{}/{} updated {}%'.format(nb_updated, total_onion, progress + 60))
# update progress stats
if progress != last_progress:
r_serv.set('ail:current_background_script_stat', progress + 60)
last_progress = progress
#update stats
r_serv.set('ail:current_background_script_stat', 90)
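The onion-metadata loop above maps its own completion onto the 60-90% slice of the overall progress bar via `int((nb_updated * 30) / total_onion) + 60`. A hypothetical helper (not in the patch) making that banded-progress calculation explicit:

```python
def scaled_progress(done: int, total: int, band_start: int = 60, band_width: int = 30) -> int:
    """Map this phase's completion onto its slice of the overall
    progress bar, as the stat updates above do."""
    return band_start + int((done * band_width) / total)
```

Writing the stat key only when the integer percentage changes (as the script does with `last_progress`) keeps Redis writes proportional to 100, not to the item count.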
## update tracked term/set/regex
# update tracked term
update_tracked_terms('TrackedSetTermSet', 'tracked_{}')
#update stats
r_serv.set('ail:current_background_script_stat', 93)
# update tracked set
update_tracked_terms('TrackedSetSet', 'set_{}')
#update stats
r_serv.set('ail:current_background_script_stat', 96)
# update tracked regex
update_tracked_terms('TrackedRegexSet', 'regex_{}')
#update stats
r_serv.set('ail:current_background_script_stat', 100)
##
end = time.time()
print('Updating ARDB_Metadata Done => {} paths: {} s'.format(index, end - start))
print()
r_serv.sadd('ail:update_v1.5', 'metadata')
##
#Key, Dynamic Update
##
#paste_children
#nb_seen_hash, base64_hash, binary_hash
#paste_onion_external_links
#misp_events, hive_cases
##

update/v1.5/Update-ARDB_Onions.py (new executable file, +152 lines)

@@ -0,0 +1,152 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
import os
import sys
import time
import redis
import datetime
import configparser
def substract_date(date_from, date_to):
date_from = datetime.date(int(date_from[0:4]), int(date_from[4:6]), int(date_from[6:8]))
date_to = datetime.date(int(date_to[0:4]), int(date_to[4:6]), int(date_to[6:8]))
delta = date_to - date_from # timedelta
l_date = []
for i in range(delta.days + 1):
date = date_from + datetime.timedelta(i)
l_date.append( date.strftime('%Y%m%d') )
return l_date
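substract_date inclusively enumerates YYYYMMDD strings from date_from to date_to. A compact standalone equivalent of the helper above, to show its behavior:

```python
import datetime

def substract_date(date_from, date_to):
    """Inclusive day-by-day walk from date_from to date_to,
    both given and returned as YYYYMMDD strings."""
    d_from = datetime.date(int(date_from[0:4]), int(date_from[4:6]), int(date_from[6:8]))
    d_to = datetime.date(int(date_to[0:4]), int(date_to[4:6]), int(date_to[6:8]))
    return [(d_from + datetime.timedelta(i)).strftime('%Y%m%d')
            for i in range((d_to - d_from).days + 1)]
```

The `delta.days + 1` makes both endpoints part of the range, which is what the per-day cleanup loops below depend on.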
def get_date_epoch(date):
return int(datetime.datetime(int(date[0:4]), int(date[4:6]), int(date[6:8])).timestamp())
def get_domain_root_from_paste_childrens(item_father, domain):
item_children = r_serv_metadata.smembers('paste_children:{}'.format(item_father))
domain_root = ''
for item_path in item_children:
# remove absolute_path
if PASTES_FOLDER in item_path:
r_serv_metadata.srem('paste_children:{}'.format(item_father), item_path)
item_path = item_path.replace(PASTES_FOLDER, '', 1)
r_serv_metadata.sadd('paste_children:{}'.format(item_father), item_path)
if domain in item_path:
domain_root = item_path
return domain_root
if __name__ == '__main__':
start_deb = time.time()
configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
if not os.path.exists(configfile):
raise Exception('Unable to find the configuration file. \
Did you set environment variables? \
Or activate the virtualenv.')
cfg = configparser.ConfigParser()
cfg.read(configfile)
PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], cfg.get("Directories", "pastes")) + '/'
r_serv = redis.StrictRedis(
host=cfg.get("ARDB_DB", "host"),
port=cfg.getint("ARDB_DB", "port"),
db=cfg.getint("ARDB_DB", "db"),
decode_responses=True)
r_serv_metadata = redis.StrictRedis(
host=cfg.get("ARDB_Metadata", "host"),
port=cfg.getint("ARDB_Metadata", "port"),
db=cfg.getint("ARDB_Metadata", "db"),
decode_responses=True)
r_serv_tag = redis.StrictRedis(
host=cfg.get("ARDB_Tags", "host"),
port=cfg.getint("ARDB_Tags", "port"),
db=cfg.getint("ARDB_Tags", "db"),
decode_responses=True)
r_serv_onion = redis.StrictRedis(
host=cfg.get("ARDB_Onion", "host"),
port=cfg.getint("ARDB_Onion", "port"),
db=cfg.getint("ARDB_Onion", "db"),
decode_responses=True)
r_serv.set('ail:current_background_script', 'onions')
r_serv.set('ail:current_background_script_stat', 0)
## Update Onion ##
print('Updating ARDB_Onion ...')
index = 0
start = time.time()
# clean down domain from db
date_from = '20180929'
date_today = datetime.date.today().strftime("%Y%m%d")
for date in substract_date(date_from, date_today):
onion_down = r_serv_onion.smembers('onion_down:{}'.format(date))
#print(onion_down)
for onion_domain in onion_down:
if not r_serv_onion.sismember('full_onion_up', onion_domain):
# delete history
all_onion_history = r_serv_onion.lrange('onion_history:{}'.format(onion_domain), 0 ,-1)
if all_onion_history:
for date_history in all_onion_history:
#print('onion_history:{}:{}'.format(onion_domain, date_history))
r_serv_onion.delete('onion_history:{}:{}'.format(onion_domain, date_history))
r_serv_onion.delete('onion_history:{}'.format(onion_domain))
#stats
total_domain = r_serv_onion.scard('full_onion_up')
nb_updated = 0
last_progress = 0
# clean up domain
all_domain_up = r_serv_onion.smembers('full_onion_up')
for onion_domain in all_domain_up:
# delete history
all_onion_history = r_serv_onion.lrange('onion_history:{}'.format(onion_domain), 0 ,-1)
if all_onion_history:
for date_history in all_onion_history:
print('--------')
print('onion_history:{}:{}'.format(onion_domain, date_history))
item_father = r_serv_onion.lrange('onion_history:{}:{}'.format(onion_domain, date_history), 0, 0)
print('item_father: {}'.format(item_father))
try:
item_father = item_father[0]
except IndexError:
r_serv_onion.delete('onion_history:{}:{}'.format(onion_domain, date_history))
continue
#print(item_father)
# delete old history
r_serv_onion.delete('onion_history:{}:{}'.format(onion_domain, date_history))
# create new history
root_key = get_domain_root_from_paste_childrens(item_father, onion_domain)
if root_key:
r_serv_onion.zadd('crawler_history_onion:{}:80'.format(onion_domain), get_date_epoch(date_history), root_key)
print('crawler_history_onion:{}:80 {} {}'.format(onion_domain, get_date_epoch(date_history), root_key))
#update service metadata: paste_parent
r_serv_onion.hset('onion_metadata:{}'.format(onion_domain), 'paste_parent', root_key)
r_serv_onion.delete('onion_history:{}'.format(onion_domain))
r_serv_onion.hset('onion_metadata:{}'.format(onion_domain), 'ports', '80')
r_serv_onion.hdel('onion_metadata:{}'.format(onion_domain), 'last_seen')
nb_updated += 1
progress = int((nb_updated * 100) /total_domain)
print('{}/{} updated {}%'.format(nb_updated, total_domain, progress))
# update progress stats
if progress != last_progress:
r_serv.set('ail:current_background_script_stat', progress)
last_progress = progress
end = time.time()
print('Updating ARDB_Onion Done => {} paths: {} s'.format(index, end - start))
print()
print('Done in {} s'.format(end - start_deb))
r_serv.sadd('ail:update_v1.5', 'onions')


@@ -0,0 +1,135 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
import os
import sys
import time
import redis
import datetime
import configparser
from hashlib import sha256
def rreplace(s, old, new, occurrence):
li = s.rsplit(old, occurrence)
return new.join(li)
def substract_date(date_from, date_to):
date_from = datetime.date(int(date_from[0:4]), int(date_from[4:6]), int(date_from[6:8]))
date_to = datetime.date(int(date_to[0:4]), int(date_to[4:6]), int(date_to[6:8]))
delta = date_to - date_from # timedelta
l_date = []
for i in range(delta.days + 1):
date = date_from + datetime.timedelta(i)
l_date.append( date.strftime('%Y%m%d') )
return l_date
if __name__ == '__main__':
start_deb = time.time()
configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
if not os.path.exists(configfile):
raise Exception('Unable to find the configuration file. \
Did you set environment variables? \
Or activate the virtualenv.')
cfg = configparser.ConfigParser()
cfg.read(configfile)
SCREENSHOT_FOLDER = os.path.join(os.environ['AIL_HOME'], cfg.get("Directories", "crawled_screenshot"))
NEW_SCREENSHOT_FOLDER = os.path.join(os.environ['AIL_HOME'], cfg.get("Directories", "crawled_screenshot"), 'screenshot')
PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], cfg.get("Directories", "pastes")) + '/'
r_serv = redis.StrictRedis(
host=cfg.get("ARDB_DB", "host"),
port=cfg.getint("ARDB_DB", "port"),
db=cfg.getint("ARDB_DB", "db"),
decode_responses=True)
r_serv_metadata = redis.StrictRedis(
host=cfg.get("ARDB_Metadata", "host"),
port=cfg.getint("ARDB_Metadata", "port"),
db=cfg.getint("ARDB_Metadata", "db"),
decode_responses=True)
r_serv_tag = redis.StrictRedis(
host=cfg.get("ARDB_Tags", "host"),
port=cfg.getint("ARDB_Tags", "port"),
db=cfg.getint("ARDB_Tags", "db"),
decode_responses=True)
r_serv_onion = redis.StrictRedis(
host=cfg.get("ARDB_Onion", "host"),
port=cfg.getint("ARDB_Onion", "port"),
db=cfg.getint("ARDB_Onion", "db"),
decode_responses=True)
r_serv.set('ail:current_background_script', 'crawled_screenshot')
r_serv.set('ail:current_background_script_stat', 0)
## Update Onion ##
print('Updating ARDB_Onion ...')
index = 0
start = time.time()
# clean down domain from db
date_from = '20180801'
date_today = datetime.date.today().strftime("%Y%m%d")
list_date = substract_date(date_from, date_today)
nb_done = 0
last_progress = 0
total_to_update = len(list_date)
for date in list_date:
screenshot_dir = os.path.join(SCREENSHOT_FOLDER, date[0:4], date[4:6], date[6:8])
if os.path.isdir(screenshot_dir):
print(screenshot_dir)
for file in os.listdir(screenshot_dir):
if file.endswith(".png"):
index += 1
#print(file)
img_path = os.path.join(screenshot_dir, file)
with open(img_path, 'br') as f:
image_content = f.read()
hash = sha256(image_content).hexdigest()
img_dir_path = os.path.join(hash[0:2], hash[2:4], hash[4:6], hash[6:8], hash[8:10], hash[10:12])
filename_img = os.path.join(NEW_SCREENSHOT_FOLDER, img_dir_path, hash[12:] +'.png')
dirname = os.path.dirname(filename_img)
if not os.path.exists(dirname):
os.makedirs(dirname)
if not os.path.exists(filename_img):
os.rename(img_path, filename_img)
else:
os.remove(img_path)
item = os.path.join('crawled', date[0:4], date[4:6], date[6:8], file[:-4])
# add item metadata
r_serv_metadata.hset('paste_metadata:{}'.format(item), 'screenshot', hash)
# add sha256 metadata
r_serv_onion.sadd('screenshot:{}'.format(hash), item)
if file.endswith('.pnghar.txt'):
har_path = os.path.join(screenshot_dir, file)
new_file = rreplace(file, '.pnghar.txt', '.json', 1)
new_har_path = os.path.join(screenshot_dir, new_file)
os.rename(har_path, new_har_path)
progress = int((nb_done * 100) /total_to_update)
# update progress stats
if progress != last_progress:
r_serv.set('ail:current_background_script_stat', progress)
print('{}/{} screenshot updated {}%'.format(nb_done, total_to_update, progress))
last_progress = progress
nb_done += 1
r_serv.set('ail:current_background_script_stat', 100)
end = time.time()
print('Updating ARDB_Onion Done => {} paths: {} s'.format(index, end - start))
print()
print('Done in {} s'.format(end - start_deb))
r_serv.sadd('ail:update_v1.5', 'crawled_screenshot')

update/v1.5/Update-ARDB_Tags.py (new executable file, +148 lines)

@@ -0,0 +1,148 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
import os
import sys
import time
import redis
import configparser
if __name__ == '__main__':
start_deb = time.time()
configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
if not os.path.exists(configfile):
raise Exception('Unable to find the configuration file. \
Did you set environment variables? \
Or activate the virtualenv.')
cfg = configparser.ConfigParser()
cfg.read(configfile)
PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], cfg.get("Directories", "pastes")) + '/'
r_serv = redis.StrictRedis(
host=cfg.get("ARDB_DB", "host"),
port=cfg.getint("ARDB_DB", "port"),
db=cfg.getint("ARDB_DB", "db"),
decode_responses=True)
r_serv_metadata = redis.StrictRedis(
host=cfg.get("ARDB_Metadata", "host"),
port=cfg.getint("ARDB_Metadata", "port"),
db=cfg.getint("ARDB_Metadata", "db"),
decode_responses=True)
r_serv_tag = redis.StrictRedis(
host=cfg.get("ARDB_Tags", "host"),
port=cfg.getint("ARDB_Tags", "port"),
db=cfg.getint("ARDB_Tags", "db"),
decode_responses=True)
r_serv_onion = redis.StrictRedis(
host=cfg.get("ARDB_Onion", "host"),
port=cfg.getint("ARDB_Onion", "port"),
db=cfg.getint("ARDB_Onion", "db"),
decode_responses=True)
r_important_paste_2018 = redis.StrictRedis(
host=cfg.get("ARDB_Metadata", "host"),
port=cfg.getint("ARDB_Metadata", "port"),
db=2018,
decode_responses=True)
r_important_paste_2019 = redis.StrictRedis(
host=cfg.get("ARDB_Metadata", "host"),
port=cfg.getint("ARDB_Metadata", "port"),
db=2019,
decode_responses=True)
r_serv.set('ail:current_background_script', 'tags')
r_serv.set('ail:current_background_script_stat', 0)
if r_serv.sismember('ail:update_v1.5', 'onions') and r_serv.sismember('ail:update_v1.5', 'metadata'):
print('Updating ARDB_Tags ...')
index = 0
nb_tags_to_update = 0
nb_updated = 0
last_progress = 0
start = time.time()
tags_list = r_serv_tag.smembers('list_tags')
# create temp tags metadata
tag_metadata = {}
for tag in tags_list:
tag_metadata[tag] = {}
tag_metadata[tag]['first_seen'] = r_serv_tag.hget('tag_metadata:{}'.format(tag), 'first_seen')
if tag_metadata[tag]['first_seen'] is None:
tag_metadata[tag]['first_seen'] = 99999999
else:
tag_metadata[tag]['first_seen'] = int(tag_metadata[tag]['first_seen'])
tag_metadata[tag]['last_seen'] = r_serv_tag.hget('tag_metadata:{}'.format(tag), 'last_seen')
if tag_metadata[tag]['last_seen'] is None:
tag_metadata[tag]['last_seen'] = 0
else:
tag_metadata[tag]['last_seen'] = int(tag_metadata[tag]['last_seen'])
nb_tags_to_update += r_serv_tag.scard(tag)
for tag in tags_list:
all_item = r_serv_tag.smembers(tag)
for item_path in all_item:
splitted_item_path = item_path.split('/')
#print(tag)
#print(item_path)
try:
item_date = int( ''.join([splitted_item_path[-4], splitted_item_path[-3], splitted_item_path[-2]]) )
except IndexError:
r_serv_tag.srem(tag, item_path)
continue
# remove absolute path
new_path = item_path.replace(PASTES_FOLDER, '', 1)
if new_path != item_path:
# save in queue absolute path to remove
r_serv_tag.sadd('maj:v1.5:absolute_path_to_rename', item_path)
# update metadata first_seen
if item_date < tag_metadata[tag]['first_seen']:
tag_metadata[tag]['first_seen'] = item_date
r_serv_tag.hset('tag_metadata:{}'.format(tag), 'first_seen', item_date)
# update metadata last_seen
if item_date > tag_metadata[tag]['last_seen']:
tag_metadata[tag]['last_seen'] = item_date
last_seen_db = r_serv_tag.hget('tag_metadata:{}'.format(tag), 'last_seen')
if last_seen_db:
if item_date > int(last_seen_db):
r_serv_tag.hset('tag_metadata:{}'.format(tag), 'last_seen', item_date)
else:
tag_metadata[tag]['last_seen'] = last_seen_db
r_serv_tag.sadd('{}:{}'.format(tag, item_date), new_path)
r_serv_tag.hincrby('daily_tags:{}'.format(item_date), tag, 1)
# clean db
r_serv_tag.srem(tag, item_path)
index = index + 1
nb_updated += 1
progress = int((nb_updated * 100) /nb_tags_to_update)
print('{}/{} updated {}%'.format(nb_updated, nb_tags_to_update, progress))
# update progress stats
if progress != last_progress:
r_serv.set('ail:current_background_script_stat', progress)
last_progress = progress
# flush the browsed important pastes DBs
r_important_paste_2018.flushdb()
r_important_paste_2019.flushdb()
end = time.time()
print('Updating ARDB_Tags Done => {} paths: {} s'.format(index, end - start))
r_serv.sadd('ail:update_v1.5', 'tags')
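The date extraction in the loop above relies on the item path layout; a self-contained sketch of that logic (the sample path is hypothetical):

```python
def get_item_date(item_path):
    # item paths end with .../<yyyy>/<mm>/<dd>/<filename>
    parts = item_path.split('/')
    # a malformed path (too few components) raises IndexError,
    # which the update loop uses to drop the entry from the tag set
    return int(''.join([parts[-4], parts[-3], parts[-2]]))

print(get_item_date('archive/pastebin.com_pro/2019/04/25/abc.gz'))  # -> 20190425
```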


@ -0,0 +1,89 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
import os
import sys
import time
import redis
import configparser
def tags_key_fusion(old_item_path_key, new_item_path_key):
print('fusion:')
print(old_item_path_key)
print(new_item_path_key)
for tag in r_serv_metadata.smembers(old_item_path_key):
r_serv_metadata.sadd(new_item_path_key, tag)
r_serv_metadata.srem(old_item_path_key, tag)
if __name__ == '__main__':
start_deb = time.time()
configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
if not os.path.exists(configfile):
raise Exception('Unable to find the configuration file. \
Did you set environment variables? \
Or activate the virtualenv.')
cfg = configparser.ConfigParser()
cfg.read(configfile)
PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], cfg.get("Directories", "pastes")) + '/'
r_serv = redis.StrictRedis(
host=cfg.get("ARDB_DB", "host"),
port=cfg.getint("ARDB_DB", "port"),
db=cfg.getint("ARDB_DB", "db"),
decode_responses=True)
r_serv_metadata = redis.StrictRedis(
host=cfg.get("ARDB_Metadata", "host"),
port=cfg.getint("ARDB_Metadata", "port"),
db=cfg.getint("ARDB_Metadata", "db"),
decode_responses=True)
r_serv_tag = redis.StrictRedis(
host=cfg.get("ARDB_Tags", "host"),
port=cfg.getint("ARDB_Tags", "port"),
db=cfg.getint("ARDB_Tags", "db"),
decode_responses=True)
if r_serv.sismember('ail:update_v1.5', 'tags'):
r_serv.set('ail:current_background_script', 'tags_background')
r_serv.set('ail:current_background_script_stat', 0)
print('Updating ARDB_Tags ...')
start = time.time()
#update item metadata tags
tag_not_updated = True
total_to_update = r_serv_tag.scard('maj:v1.5:absolute_path_to_rename')
nb_updated = 0
last_progress = 0
if total_to_update > 0:
while tag_not_updated:
item_path = r_serv_tag.srandmember('maj:v1.5:absolute_path_to_rename')
old_tag_item_key = 'tag:{}'.format(item_path)
new_item_path = item_path.replace(PASTES_FOLDER, '', 1)
new_tag_item_key = 'tag:{}'.format(new_item_path)
res = r_serv_metadata.renamenx(old_tag_item_key, new_tag_item_key)
if res == 0:
tags_key_fusion(old_tag_item_key, new_tag_item_key)
nb_updated += 1
r_serv_tag.srem('maj:v1.5:absolute_path_to_rename', item_path)
if r_serv_tag.scard('maj:v1.5:absolute_path_to_rename') == 0:
tag_not_updated = False
else:
progress = int((nb_updated * 100) /total_to_update)
print('{}/{} Tags updated {}%'.format(nb_updated, total_to_update, progress))
# update progress stats
if progress != last_progress:
r_serv.set('ail:current_background_script_stat', progress)
last_progress = progress
end = time.time()
print('Updating ARDB_Tags Done: {} s'.format(end - start))
r_serv.sadd('ail:update_v1.5', 'tags_background')
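The rename-or-merge pattern above (`RENAMENX`, then `tags_key_fusion` if the target key already exists) can be modelled without a live ARDB instance; a toy sketch with plain Python sets, where key names and tag values are illustrative:

```python
# toy key/value store standing in for ARDB: key -> set of tags
store = {
    'tag:/home/ail/PASTES/onion/2019/04/25/a.gz': {'infoleak:automatic-detection="onion"'},
    'tag:onion/2019/04/25/a.gz': {'misp-galaxy:tool="mimikatz"'},
}

def renamenx(src, dst):
    # like Redis RENAMENX: fails (returns 0) if dst already exists
    if dst in store or src not in store:
        return 0
    store[dst] = store.pop(src)
    return 1

def tags_key_fusion(src, dst):
    # merge the old key's members into the new key, then delete the old key
    store.setdefault(dst, set()).update(store.pop(src, set()))

old_key = 'tag:/home/ail/PASTES/onion/2019/04/25/a.gz'
new_key = 'tag:onion/2019/04/25/a.gz'
if renamenx(old_key, new_key) == 0:
    tags_key_fusion(old_key, new_key)
print(sorted(store))  # only the relative-path key remains
```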

update/v1.5/Update.py Executable file

@ -0,0 +1,68 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
import os
import sys
import time
import redis
import datetime
import configparser
if __name__ == '__main__':
start_deb = time.time()
configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
if not os.path.exists(configfile):
raise Exception('Unable to find the configuration file. \
Did you set environment variables? \
Or activate the virtualenv.')
cfg = configparser.ConfigParser()
cfg.read(configfile)
PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], cfg.get("Directories", "pastes")) + '/'
r_serv = redis.StrictRedis(
host=cfg.get("ARDB_DB", "host"),
port=cfg.getint("ARDB_DB", "port"),
db=cfg.getint("ARDB_DB", "db"),
decode_responses=True)
r_serv_onion = redis.StrictRedis(
host=cfg.get("ARDB_Onion", "host"),
port=cfg.getint("ARDB_Onion", "port"),
db=cfg.getint("ARDB_Onion", "db"),
decode_responses=True)
print()
print('Updating ARDB_Onion ...')
index = 0
start = time.time()
# update crawler queue
for elem in r_serv_onion.smembers('onion_crawler_queue'):
if PASTES_FOLDER in elem:
r_serv_onion.srem('onion_crawler_queue', elem)
r_serv_onion.sadd('onion_crawler_queue', elem.replace(PASTES_FOLDER, '', 1))
index = index +1
for elem in r_serv_onion.smembers('onion_crawler_priority_queue'):
if PASTES_FOLDER in elem:
r_serv_onion.srem('onion_crawler_priority_queue', elem)
r_serv_onion.sadd('onion_crawler_priority_queue', elem.replace(PASTES_FOLDER, '', 1))
index = index +1
end = time.time()
print('Updating ARDB_Onion Done => {} paths: {} s'.format(index, end - start))
print()
#Set current ail version
r_serv.set('ail:version', 'v1.5')
#Set current update_in_progress
r_serv.set('ail:update_in_progress', 'v1.5')
r_serv.set('ail:current_background_update', 'v1.5')
#Set current ail version
r_serv.set('ail:update_date_v1.5', datetime.datetime.now().strftime("%Y%m%d"))
print('Done in {} s'.format(end - start_deb))
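The queue rewrite above strips only the leading pastes prefix: `str.replace(prefix, '', 1)` removes the first occurrence and leaves the rest of the path intact. A small illustration (paths are hypothetical):

```python
PASTES_FOLDER = '/opt/ail/PASTES/'

def to_relative(item_path):
    # drop the absolute prefix once, from the left
    return item_path.replace(PASTES_FOLDER, '', 1)

# one absolute entry to migrate, one already-relative entry to keep as-is
queue = {'/opt/ail/PASTES/onion/2019/04/25/a.gz', 'onion/2019/04/25/b.gz'}
migrated = {to_relative(elem) if PASTES_FOLDER in elem else elem for elem in queue}
print(sorted(migrated))  # -> ['onion/2019/04/25/a.gz', 'onion/2019/04/25/b.gz']
```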

update/v1.5/Update.sh Executable file

@ -0,0 +1,60 @@
#!/bin/bash
[ -z "$AIL_HOME" ] && echo "Needs the env var AIL_HOME. Run the script from the virtual environment." && exit 1;
[ -z "$AIL_REDIS" ] && echo "Needs the env var AIL_REDIS. Run the script from the virtual environment." && exit 1;
[ -z "$AIL_ARDB" ] && echo "Needs the env var AIL_ARDB. Run the script from the virtual environment." && exit 1;
[ -z "$AIL_BIN" ] && echo "Needs the env var AIL_BIN. Run the script from the virtual environment." && exit 1;
[ -z "$AIL_FLASK" ] && echo "Needs the env var AIL_FLASK. Run the script from the virtual environment." && exit 1;
export PATH=$AIL_HOME:$PATH
export PATH=$AIL_REDIS:$PATH
export PATH=$AIL_ARDB:$PATH
export PATH=$AIL_BIN:$PATH
export PATH=$AIL_FLASK:$PATH
GREEN="\\033[1;32m"
DEFAULT="\\033[0;39m"
echo -e $GREEN"Shutting down AIL ..."$DEFAULT
bash ${AIL_BIN}/LAUNCH.sh -k &
wait
echo ""
bash -c "bash ${AIL_HOME}/update/bin/Update_Redis.sh"
#bash -c "bash ${AIL_HOME}/update/bin/Update_ARDB.sh"
echo ""
echo -e $GREEN"Update DomainClassifier"$DEFAULT
echo ""
pip3 install --upgrade --force-reinstall "git+https://github.com/D4-project/BGP-Ranking.git/@28013297efb039d2ebbce96ee2d89493f6ae56b0#subdirectory=client&egg=pybgpranking"
pip3 install --upgrade --force-reinstall git+https://github.com/adulau/DomainClassifier.git
wait
echo ""
echo ""
echo -e $GREEN"Update Web thirdparty"$DEFAULT
echo ""
bash ${AIL_FLASK}/update_thirdparty.sh &
wait
echo ""
bash ${AIL_BIN}/LAUNCH.sh -lav &
wait
echo ""
echo ""
echo -e $GREEN"Fixing ARDB ..."$DEFAULT
echo ""
python ${AIL_HOME}/update/v1.5/Update.py &
wait
echo ""
echo ""
echo ""
echo -e $GREEN"Shutting down ARDB ..."$DEFAULT
bash ${AIL_BIN}/LAUNCH.sh -k &
wait
echo ""
exit 0


@ -154,14 +154,22 @@ if baseUrl != '':
max_preview_char = int(cfg.get("Flask", "max_preview_char")) # Maximum number of character to display in the tooltip
max_preview_modal = int(cfg.get("Flask", "max_preview_modal")) # Maximum number of character to display in the modal
max_tags_result = 50
DiffMaxLineLength = int(cfg.get("Flask", "DiffMaxLineLength")) # Use to display the estimated percentage instead of a raw value
bootstrap_label = ['primary', 'success', 'danger', 'warning', 'info']
dict_update_description = {'v1.5':{'nb_background_update': 5, 'update_warning_message': 'An update is running in the background. Some information, like tags or screenshots, may be',
'update_warning_message_notice_me': 'missing from the UI.'}
}
UPLOAD_FOLDER = os.path.join(os.environ['AIL_FLASK'], 'submitted')
PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], cfg.get("Directories", "pastes")) + '/'
SCREENSHOT_FOLDER = os.path.join(os.environ['AIL_HOME'], cfg.get("Directories", "crawled_screenshot"), 'screenshot')
REPO_ORIGIN = 'https://github.com/CIRCL/AIL-framework.git'
max_dashboard_logs = int(cfg.get("Flask", "max_dashboard_logs"))


@ -8,8 +8,7 @@ import redis
from flask import Flask, render_template, jsonify, request, Blueprint, redirect, url_for
import json
import datetime
import Paste
@ -28,6 +27,7 @@ r_serv_statistics = Flask_config.r_serv_statistics
max_preview_char = Flask_config.max_preview_char
max_preview_modal = Flask_config.max_preview_modal
bootstrap_label = Flask_config.bootstrap_label
max_tags_result = Flask_config.max_tags_result
PASTES_FOLDER = Flask_config.PASTES_FOLDER
Tags = Blueprint('Tags', __name__, template_folder='templates')
@ -56,6 +56,16 @@ for name, tags in clusters.items(): #galaxie name + tags
def one():
return 1
def date_substract_day(date, num_day=1):
new_date = datetime.date(int(date[0:4]), int(date[4:6]), int(date[6:8])) - datetime.timedelta(num_day)
new_date = str(new_date).replace('-', '')
return new_date
def date_add_day(date, num_day=1):
new_date = datetime.date(int(date[0:4]), int(date[4:6]), int(date[6:8])) + datetime.timedelta(num_day)
new_date = str(new_date).replace('-', '')
return new_date
def get_tags_with_synonyms(tag):
str_synonyms = ' - synonyms: '
synonyms = r_serv_tags.smembers('synonym_tag_' + tag)
@ -68,12 +78,277 @@ def get_tags_with_synonyms(tag):
else:
return {'name':tag,'id':tag}
def get_item_date(item_filename):
l_directory = item_filename.split('/')
return '{}{}{}'.format(l_directory[-4], l_directory[-3], l_directory[-2])
def substract_date(date_from, date_to):
date_from = datetime.date(int(date_from[0:4]), int(date_from[4:6]), int(date_from[6:8]))
date_to = datetime.date(int(date_to[0:4]), int(date_to[4:6]), int(date_to[6:8]))
delta = date_to - date_from # timedelta
l_date = []
for i in range(delta.days + 1):
date = date_from + datetime.timedelta(i)
l_date.append( date.strftime('%Y%m%d') )
return l_date
def get_all_dates_range(date_from, date_to):
all_dates = {}
date_range = []
if date_from is not None and date_to is not None:
#change format
try:
if len(date_from) != 8:
date_from = date_from[0:4] + date_from[5:7] + date_from[8:10]
date_to = date_to[0:4] + date_to[5:7] + date_to[8:10]
date_range = substract_date(date_from, date_to)
except:
pass
if not date_range:
date_range.append(datetime.date.today().strftime("%Y%m%d"))
date_from = date_range[0][0:4] + '-' + date_range[0][4:6] + '-' + date_range[0][6:8]
date_to = date_from
else:
date_from = date_from[0:4] + '-' + date_from[4:6] + '-' + date_from[6:8]
date_to = date_to[0:4] + '-' + date_to[4:6] + '-' + date_to[6:8]
all_dates['date_from'] = date_from
all_dates['date_to'] = date_to
all_dates['date_range'] = date_range
return all_dates
def get_last_seen_from_tags_list(list_tags):
min_last_seen = 99999999
for tag in list_tags:
tag_last_seen = r_serv_tags.hget('tag_metadata:{}'.format(tag), 'last_seen')
if tag_last_seen:
tag_last_seen = int(tag_last_seen)
if tag_last_seen < min_last_seen:
min_last_seen = tag_last_seen
return str(min_last_seen)
def add_item_tag(tag, item_path):
item_date = int(get_item_date(item_path))
#add tag
r_serv_metadata.sadd('tag:{}'.format(item_path), tag)
r_serv_tags.sadd('{}:{}'.format(tag, item_date), item_path)
r_serv_tags.hincrby('daily_tags:{}'.format(item_date), tag, 1)
tag_first_seen = r_serv_tags.hget('tag_metadata:{}'.format(tag), 'first_seen')
if tag_first_seen is None:
tag_first_seen = 99999999
else:
tag_first_seen = int(tag_first_seen)
tag_last_seen = r_serv_tags.hget('tag_metadata:{}'.format(tag), 'last_seen')
if tag_last_seen is None:
tag_last_seen = 0
else:
tag_last_seen = int(tag_last_seen)
#add new tag in list of all used tags
r_serv_tags.sadd('list_tags', tag)
# update first_seen/last_seen
if item_date < tag_first_seen:
r_serv_tags.hset('tag_metadata:{}'.format(tag), 'first_seen', item_date)
# update metadata last_seen
if item_date > tag_last_seen:
r_serv_tags.hset('tag_metadata:{}'.format(tag), 'last_seen', item_date)
def remove_item_tag(tag, item_path):
item_date = int(get_item_date(item_path))
#remove tag
r_serv_metadata.srem('tag:{}'.format(item_path), tag)
res = r_serv_tags.srem('{}:{}'.format(tag, item_date), item_path)
if res == 1:
# no tag for this day
if int(r_serv_tags.hget('daily_tags:{}'.format(item_date), tag)) == 1:
r_serv_tags.hdel('daily_tags:{}'.format(item_date), tag)
else:
r_serv_tags.hincrby('daily_tags:{}'.format(item_date), tag, -1)
tag_first_seen = int(r_serv_tags.hget('tag_metadata:{}'.format(tag), 'first_seen'))
tag_last_seen = int(r_serv_tags.hget('tag_metadata:{}'.format(tag), 'last_seen'))
# update first_seen/last_seen
if item_date == tag_first_seen:
update_tag_first_seen(tag, tag_first_seen, tag_last_seen)
if item_date == tag_last_seen:
update_tag_last_seen(tag, tag_first_seen, tag_last_seen)
else:
return 'Error incorrect tag'
def update_tag_first_seen(tag, tag_first_seen, tag_last_seen):
if tag_first_seen == tag_last_seen:
if r_serv_tags.scard('{}:{}'.format(tag, tag_first_seen)) > 0:
r_serv_tags.hset('tag_metadata:{}'.format(tag), 'first_seen', tag_first_seen)
# no tag in db
else:
r_serv_tags.srem('list_tags', tag)
r_serv_tags.hdel('tag_metadata:{}'.format(tag), 'first_seen')
r_serv_tags.hdel('tag_metadata:{}'.format(tag), 'last_seen')
else:
if r_serv_tags.scard('{}:{}'.format(tag, tag_first_seen)) > 0:
r_serv_tags.hset('tag_metadata:{}'.format(tag), 'first_seen', tag_first_seen)
else:
tag_first_seen = date_add_day(tag_first_seen)
update_tag_first_seen(tag, tag_first_seen, tag_last_seen)
def update_tag_last_seen(tag, tag_first_seen, tag_last_seen):
if tag_first_seen == tag_last_seen:
if r_serv_tags.scard('{}:{}'.format(tag, tag_last_seen)) > 0:
r_serv_tags.hset('tag_metadata:{}'.format(tag), 'last_seen', tag_last_seen)
# no tag in db
else:
r_serv_tags.srem('list_tags', tag)
r_serv_tags.hdel('tag_metadata:{}'.format(tag), 'first_seen')
r_serv_tags.hdel('tag_metadata:{}'.format(tag), 'last_seen')
else:
if r_serv_tags.scard('{}:{}'.format(tag, tag_last_seen)) > 0:
r_serv_tags.hset('tag_metadata:{}'.format(tag), 'last_seen', tag_last_seen)
else:
tag_last_seen = date_substract_day(tag_last_seen)
update_tag_last_seen(tag, tag_first_seen, tag_last_seen)
# ============= ROUTES ==============
@Tags.route("/tags/", methods=['GET'])
def Tags_page():
date_from = request.args.get('date_from')
date_to = request.args.get('date_to')
tags = request.args.get('ltags')
if tags is None:
dates = get_all_dates_range(date_from, date_to)
return render_template("Tags.html", date_from=dates['date_from'], date_to=dates['date_to'])
# unpack tags
list_tags = tags.split(',')
list_tag = []
for tag in list_tags:
list_tag.append(tag.replace('"','\"'))
#no search by date, use last_seen for date_from/date_to
if date_from is None and date_to is None and tags is not None:
date_from = get_last_seen_from_tags_list(list_tags)
date_to = date_from
# TODO verify input
dates = get_all_dates_range(date_from, date_to)
if(type(list_tags) is list):
# no tag
if list_tags is False:
print('empty')
# 1 tag
elif len(list_tags) < 2:
tagged_pastes = []
for date in dates['date_range']:
tagged_pastes.extend(r_serv_tags.smembers('{}:{}'.format(list_tags[0], date)))
# 2 tags or more
else:
tagged_pastes = []
for date in dates['date_range']:
tag_keys = []
for tag in list_tags:
tag_keys.append('{}:{}'.format(tag, date))
if len(tag_keys) > 1:
daily_items = r_serv_tags.sinter(tag_keys[0], *tag_keys[1:])
else:
daily_items = r_serv_tags.sinter(tag_keys[0])
tagged_pastes.extend(daily_items)
else :
return 'INCORRECT INPUT'
all_content = []
paste_date = []
paste_linenum = []
all_path = []
allPastes = list(tagged_pastes)
paste_tags = []
try:
page = int(request.args.get('page'))
except:
page = 1
if page <= 0:
page = 1
nb_page_max = len(tagged_pastes)/(max_tags_result)
if not nb_page_max.is_integer():
nb_page_max = int(nb_page_max)+1
else:
nb_page_max = int(nb_page_max)
if page > nb_page_max:
page = nb_page_max
start = max_tags_result*(page -1)
stop = max_tags_result*page
for path in allPastes[start:stop]:
all_path.append(path)
paste = Paste.Paste(path)
content = paste.get_p_content()
content_range = max_preview_char if len(content)>max_preview_char else len(content)-1
all_content.append(content[0:content_range].replace("\"", "\'").replace("\r", " ").replace("\n", " "))
curr_date = str(paste._get_p_date())
curr_date = curr_date[0:4]+'/'+curr_date[4:6]+'/'+curr_date[6:]
paste_date.append(curr_date)
paste_linenum.append(paste.get_lines_info()[0])
p_tags = r_serv_metadata.smembers('tag:'+path)
complete_tags = []
l_tags = []
for tag in p_tags:
complete_tag = tag
tag = tag.split('=')
if len(tag) > 1:
if tag[1] != '':
tag = tag[1][1:-1]
# no value
else:
tag = tag[0][1:-1]
# use for custom tags
else:
tag = tag[0]
l_tags.append( (tag,complete_tag) )
paste_tags.append(l_tags)
if len(allPastes) > 10:
finished = False
else:
finished = True
if len(list_tag) == 1:
tag_nav=tags.replace('"', '').replace('=', '').replace(':', '')
else:
tag_nav='empty'
return render_template("Tags.html",
all_path=all_path,
tags=tags,
tag_nav=tag_nav,
list_tag = list_tag,
date_from=dates['date_from'],
date_to=dates['date_to'],
page=page, nb_page_max=nb_page_max,
paste_tags=paste_tags,
bootstrap_label=bootstrap_label,
content=all_content,
paste_date=paste_date,
paste_linenum=paste_linenum,
char_to_display=max_preview_modal,
finished=finished)
@Tags.route("/Tags/get_all_tags")
def get_all_tags():
@ -173,94 +448,6 @@ def get_tags_galaxy():
else:
return 'this galaxy is disabled'
@Tags.route("/Tags/get_tagged_paste")
def get_tagged_paste():
tags = request.args.get('ltags')
list_tags = tags.split(',')
list_tag = []
for tag in list_tags:
list_tag.append(tag.replace('"','\"'))
# TODO verify input
if(type(list_tags) is list):
# no tag
if list_tags is False:
print('empty')
# 1 tag
elif len(list_tags) < 2:
tagged_pastes = r_serv_tags.smembers(list_tags[0])
# 2 tags or more
else:
tagged_pastes = r_serv_tags.sinter(list_tags[0], *list_tags[1:])
else :
return 'INCORRECT INPUT'
#TODO FIXME
currentSelectYear = int(datetime.now().year)
all_content = []
paste_date = []
paste_linenum = []
all_path = []
allPastes = list(tagged_pastes)
paste_tags = []
for path in allPastes[0:50]: ######################moduleName
all_path.append(path)
paste = Paste.Paste(path)
content = paste.get_p_content()
content_range = max_preview_char if len(content)>max_preview_char else len(content)-1
all_content.append(content[0:content_range].replace("\"", "\'").replace("\r", " ").replace("\n", " "))
curr_date = str(paste._get_p_date())
curr_date = curr_date[0:4]+'/'+curr_date[4:6]+'/'+curr_date[6:]
paste_date.append(curr_date)
paste_linenum.append(paste.get_lines_info()[0])
p_tags = r_serv_metadata.smembers('tag:'+path)
complete_tags = []
l_tags = []
for tag in p_tags:
complete_tag = tag
tag = tag.split('=')
if len(tag) > 1:
if tag[1] != '':
tag = tag[1][1:-1]
# no value
else:
tag = tag[0][1:-1]
# use for custom tags
else:
tag = tag[0]
l_tags.append( (tag,complete_tag) )
paste_tags.append(l_tags)
if len(allPastes) > 10:
finished = False
else:
finished = True
return render_template("tagged.html",
year=currentSelectYear,
all_path=all_path,
tags=tags,
list_tag = list_tag,
paste_tags=paste_tags,
bootstrap_label=bootstrap_label,
content=all_content,
paste_date=paste_date,
paste_linenum=paste_linenum,
char_to_display=max_preview_modal,
finished=finished)
@Tags.route("/Tags/remove_tag")
def remove_tag():
@ -268,12 +455,7 @@ def remove_tag():
path = request.args.get('paste')
tag = request.args.get('tag')
remove_item_tag(tag, path)
return redirect(url_for('showsavedpastes.showsavedpaste', paste=path))
@ -285,17 +467,11 @@ def confirm_tag():
tag = request.args.get('tag')
if(tag[9:28] == 'automatic-detection'):
remove_item_tag(tag, path)
tag = tag.replace('automatic-detection','analyst-detection', 1)
#add analyst tag
add_item_tag(tag, path)
return redirect(url_for('showsavedpastes.showsavedpaste', paste=path))
@ -345,12 +521,7 @@ def addTags():
tax = tag.split(':')[0]
if tax in active_taxonomies:
if tag in r_serv_tags.smembers('active_tag_' + tax):
add_item_tag(tag, path)
else:
return 'INCORRECT INPUT1'
@ -365,12 +536,7 @@ def addTags():
if gal in active_galaxies:
if tag in r_serv_tags.smembers('active_tag_galaxies_' + gal):
add_item_tag(tag, path)
else:
return 'INCORRECT INPUT3'
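After this change, tags are indexed per day: `add_item_tag` writes the item into a `<tag>:<yyyymmdd>` set, bumps a `daily_tags:<yyyymmdd>` counter hash, and maintains `first_seen`/`last_seen` in `tag_metadata:<tag>`. A dictionary sketch of the resulting key layout (item path and tag value are illustrative):

```python
item = 'onion/2019/04/25/a.gz'
tag = 'infoleak:automatic-detection="credential"'
date = 20190425

# each entry mirrors one ARDB key written by add_item_tag
keys = {
    'tag:{}'.format(item): {tag},                    # set: tags attached to one item
    '{}:{}'.format(tag, date): {item},               # set: items carrying this tag that day
    'daily_tags:{}'.format(date): {tag: 1},          # hash: per-day tag counters
    'tag_metadata:{}'.format(tag): {'first_seen': date, 'last_seen': date},
}
print(sorted(keys))
```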


@ -1,112 +1,407 @@
<!DOCTYPE html> <!DOCTYPE html>
<html> <html>
<head> <head>
<meta charset="utf-8"> <title>Tags - AIL</title>
<meta name="viewport" content="width=device-width, initial-scale=1.0"> <link rel="icon" href="{{ url_for('static', filename='image/ail-icon.png') }}">
<title>Tags - AIL</title> <!-- Core CSS -->
<link rel="icon" href="{{ url_for('static', filename='image/ail-icon.png') }}"> <link href="{{ url_for('static', filename='css/bootstrap4.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/font-awesome.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/daterangepicker.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/dataTables.bootstrap4.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/tags.css') }}" rel="stylesheet" type="text/css" />
<!-- Core CSS --> <!-- JS -->
<link href="{{ url_for('static', filename='css/bootstrap.min.css') }}" rel="stylesheet"> <script src="{{ url_for('static', filename='js/jquery.js')}}"></script>
<link href="{{ url_for('static', filename='font-awesome/css/font-awesome.css') }}" rel="stylesheet"> <script src="{{ url_for('static', filename='js/popper.min.js')}}"></script>
<link href="{{ url_for('static', filename='css/sb-admin-2.css') }}" rel="stylesheet"> <script src="{{ url_for('static', filename='js/bootstrap4.min.js')}}"></script>
<link href="{{ url_for('static', filename='css/dygraph_gallery.css') }}" rel="stylesheet" type="text/css" /> <script src="{{ url_for('static', filename='js/jquery.dataTables.min.js')}}"></script>
<link href="{{ url_for('static', filename='css/tags.css') }}" rel="stylesheet" type="text/css" /> <script src="{{ url_for('static', filename='js/dataTables.bootstrap.min.js')}}"></script>
<!-- JS --> <script language="javascript" src="{{ url_for('static', filename='js/moment.min.js') }}"></script>
<script type="text/javascript" src="{{ url_for('static', filename='js/dygraph-combined.js') }}"></script> <script language="javascript" src="{{ url_for('static', filename='js/jquery.daterangepicker.min.js') }}"></script>
<script language="javascript" src="{{ url_for('static', filename='js/jquery.js')}}"></script> <script src="{{ url_for('static', filename='js/tags.js') }}"></script>
<script src="{{ url_for('static', filename='js/jquery.flot.js') }}"></script>
<script src="{{ url_for('static', filename='js/jquery.flot.pie.js') }}"></script>
<script src="{{ url_for('static', filename='js/jquery.flot.time.js') }}"></script>
<script src="{{ url_for('static', filename='js/tags.js') }}"></script>
</head> <style>
<body> .rotate{
-moz-transition: all 0.1s linear;
-webkit-transition: all 0.1s linear;
transition: all 0.1s linear;
}
{% include 'navbar.html' %} .rotate.down{
-moz-transform:rotate(180deg);
-webkit-transform:rotate(180deg);
transform:rotate(180deg);
}
</style>
<div id="page-wrapper"> </head>
<div class="row"> <body>
<div class="col-lg-12">
<h1 class="page-header" data-page="page-tags" >Tags</h1>
</div>
<!-- /.col-lg-12 -->
</div>
<!-- /.row -->
<div class="form-group input-group" >
<input id="ltags" style="width:100%;" type="text" name="ltags" autocomplete="off">
<div class="input-group-btn"> {% include 'nav_bar.html' %}
<button type="button" class="btn btn-search btn-primary btn-tags" onclick="searchTags()"<button class="btn btn-primary" onclick="emptyTags()">
<span class="glyphicon glyphicon-search "></span> <!-- Modal -->
<span class="label-icon">Search Tags</span> <div id="mymodal" class="modal fade" role="dialog">
</button> <div class="modal-dialog modal-lg">
<!-- Modal content-->
<div id="mymodalcontent" class="modal-content">
<div id="mymodalbody" class="modal-body" max-width="850px">
<p>Loading paste information...</p>
<img id="loading-gif-modal" src="{{url_for('static', filename='image/loading.gif') }}" height="26" width="26" style="margin: 4px;">
</div>
<div class="modal-footer">
<a id="button_show_path" target="_blank" href=""><button type="button" class="btn btn-info">Show saved paste</button></a>
<button type="button" class="btn btn-default" data-dismiss="modal">Close</button>
</div>
</div>
</div>
</div>
<div class="container-fluid">
<div class="row">
{% include 'tags/menu_sidebar.html' %}
<div class="col-12 col-lg-10" id="core_content">
<div class="card mb-3 mt-1">
<div class="card-header text-white bg-dark">
<h5 class="card-title">Search Tags by date range :</h5>
</div> </div>
<div class="card-body">
<div class="row mb-3">
<div class="col-md-6">
<div class="input-group" id="date-range-from">
<div class="input-group-prepend"><span class="input-group-text"><i class="far fa-calendar-alt" aria-hidden="true"></i></span></div>
<input class="form-control" id="date-range-from-input" placeholder="yyyy-mm-dd" value="{{ date_from }}" name="date_from" autocomplete="off">
</div>
</div>
<div class="col-md-6">
<div class="input-group" id="date-range-to">
<div class="input-group-prepend"><span class="input-group-text"><i class="far fa-calendar-alt" aria-hidden="true"></i></span></div>
<input class="form-control" id="date-range-to-input" placeholder="yyyy-mm-dd" value="{{ date_to }}" name="date_to" autocomplete="off">
</div>
</div>
</div>
<div class="input-group mb-3">
<div class="input-group-prepend">
<button class="btn btn-outline-danger" type="button" id="button-clear-tags" style="z-index: 1;" onclick="emptyTags()">
<i class="fas fa-eraser"></i>
</button>
</div>
<input id="ltags" name="ltags" type="text" class="form-control" aria-describedby="button-clear-tags" autocomplete="off">
</div>
<button class="btn btn-primary" type="button" id="button-search-tags" onclick="searchTags()">
<i class="fas fa-search"></i> Search Tags
</button>
</div>
</div>
</div>
<div>
{%if all_path%}
<table class="table table-bordered table-hover" id="myTable_">
<thead class="thead-dark">
<tr>
<th>Date</th>
<th style="max-width: 800px;">Path</th>
<th># of lines</th>
<th>Action</th>
</tr>
</thead>
<tbody>
{% for path in all_path %}
<tr>
<td class="pb-0">{{ paste_date[loop.index0] }}</td>
<td class="pb-0"><a target="_blank" href="{{ url_for('showsavedpastes.showsavedpaste') }}?paste={{path}}" class="text-secondary">
<div style="line-height:0.9;">{{ path }}</div>
</a>
<div class="mb-2">
{% for tag in paste_tags[loop.index0] %}
<a href="{{ url_for('Tags.Tags_page') }}?ltags={{ tag[1] }}">
<span class="badge badge-{{ bootstrap_label[loop.index0 % 5] }} pull-left">{{ tag[0] }}</span>
</a>
{% endfor %}
</div>
</td>
<td class="pb-0">{{ paste_linenum[loop.index0] }}</td>
<td class="pb-0"><p>
<span class="fas fa-info-circle" data-toggle="tooltip" data-placement="left" title="{{ content[loop.index0] }} "></span>
<button type="button" class="btn btn-light" data-num="{{ loop.index0 + 1 }}" data-toggle="modal" data-target="#mymodal" data-url="{{ url_for('showsavedpastes.showsaveditem_min') }}?paste={{ path }}&num={{ loop.index0+1 }}" data-path="{{ path }}">
<span class="fas fa-search-plus"></span>
</button></p>
</td>
</tr>
{% endfor %}
</tbody>
</table>
<div class="d-flex justify-content-center border-top mt-2">
<nav class="mt-4" aria-label="...">
<ul class="pagination">
<li class="page-item {%if page==1%}disabled{%endif%}">
<a class="page-link" href="{{ url_for('Tags.Tags_page') }}?page={{page-1}}&date_from={{date_from}}&date_to={{date_to}}&ltags={{tags}}">Previous</a>
</li>
{%if page>3%}
<li class="page-item"><a class="page-link" href="{{ url_for('Tags.Tags_page') }}?page=1&date_from={{date_from}}&date_to={{date_to}}&ltags={{tags}}">1</a></li>
<li class="page-item disabled"><a class="page-link" aria-disabled="true" href="#">...</a></li>
<li class="page-item"><a class="page-link" href="{{ url_for('Tags.Tags_page') }}?page={{page-1}}&date_from={{date_from}}&date_to={{date_to}}&ltags={{tags}}">{{page-1}}</a></li>
<li class="page-item active"><a class="page-link" href="{{ url_for('Tags.Tags_page') }}?page={{page}}&date_from={{date_from}}&date_to={{date_to}}&ltags={{tags}}">{{page}}</a></li>
{%else%}
{%if page>2%}<li class="page-item"><a class="page-link" href="{{ url_for('Tags.Tags_page') }}?page={{page-2}}&date_from={{date_from}}&date_to={{date_to}}&ltags={{tags}}">{{page-2}}</a></li>{%endif%}
{%if page>1%}<li class="page-item"><a class="page-link" href="{{ url_for('Tags.Tags_page') }}?page={{page-1}}&date_from={{date_from}}&date_to={{date_to}}&ltags={{tags}}">{{page-1}}</a></li>{%endif%}
<li class="page-item active"><a class="page-link" href="{{ url_for('Tags.Tags_page') }}?page={{page}}&date_from={{date_from}}&date_to={{date_to}}&ltags={{tags}}">{{page}}</a></li>
{%endif%}
{%if nb_page_max-page>3%}
<li class="page-item"><a class="page-link" href="{{ url_for('Tags.Tags_page') }}?page={{page+1}}&date_from={{date_from}}&date_to={{date_to}}&ltags={{tags}}">{{page+1}}</a></li>
<li class="page-item disabled"><a class="page-link" aria-disabled="true" href="#">...</a></li>
<li class="page-item"><a class="page-link" href="{{ url_for('Tags.Tags_page') }}?page={{nb_page_max}}&date_from={{date_from}}&date_to={{date_to}}&ltags={{tags}}">{{nb_page_max}}</a></li>
{%else%}
{%if nb_page_max-page>2%}<li class="page-item"><a class="page-link" href="{{ url_for('Tags.Tags_page') }}?page={{nb_page_max-2}}&date_from={{date_from}}&date_to={{date_to}}&ltags={{tags}}">{{nb_page_max-2}}</a></li>{%endif%}
{%if nb_page_max-page>1%}<li class="page-item"><a class="page-link" href="{{ url_for('Tags.Tags_page') }}?page={{nb_page_max-1}}&date_from={{date_from}}&date_to={{date_to}}&ltags={{tags}}">{{nb_page_max-1}}</a></li>{%endif%}
{%if nb_page_max-page>0%}<li class="page-item"><a class="page-link" href="{{ url_for('Tags.Tags_page') }}?page={{nb_page_max}}&date_from={{date_from}}&date_to={{date_to}}&ltags={{tags}}">{{nb_page_max}}</a></li>{%endif%}
{%endif%}
<li class="page-item {%if page==nb_page_max%}disabled{%endif%}">
<a class="page-link" href="{{ url_for('Tags.Tags_page') }}?page={{page+1}}&date_from={{date_from}}&date_to={{date_to}}&ltags={{tags}}" aria-disabled="true">Next</a>
</li>
</ul>
</nav>
</div>
{%endif%}
</div>
<div>
<br>
<a class="btn btn-light text-secondary" href="{{ url_for('Tags.taxonomies') }}" target="_blank">
<i class="fas fa-wrench fa-2x"></i>
<br>
<span class="label-icon">Edit Taxonomies List</span>
</a>
<a class="btn btn-light text-secondary" href="{{ url_for('Tags.galaxies') }}" target="_blank">
<i class="fas fa-rocket fa-2x"></i>
<br>
<span class="label-icon">Edit Galaxies List</span>
</a>
</div>
<div>
<br>
<a class="btn btn-light text-secondary" href="{{ url_for('PasteSubmit.edit_tag_export') }}" target="_blank">
<i class="fas fa-cogs fa-2x"></i>
<br>
<span class="label-icon">MISP and Hive, auto push</span>
</a>
</div>
</div>
</div>
</div>
</body>
<script>
var ltags;
var search_table;
var last_clicked_paste;
var can_change_modal_content = true;
$(document).ready(function(){
$("#nav_quick_search").removeClass("text-muted");
$("#nav_tag_{{tag_nav}}").addClass("active");
search_table = $('#myTable_').DataTable({ "order": [[ 0, "asc" ]] });
// Used to bind the buttons with the newly displayed data
// (the binding does not happen if the dataTable is in tabs and the clicked data is on another page)
search_table.on( 'draw.dt', function () {
// Bind tooltip each time we draw a new page
$('[data-toggle="tooltip"]').tooltip();
// On click, get html content from url and update the corresponding modal
$("[data-toggle='modal']").off('click.openmodal').on("click.openmodal", function (event) {
get_html_and_update_modal(event, $(this));
});
} );
search_table.draw()
var valueData = [
{% for tag in list_tag %}
'{{tag|safe}}',
{% endfor %}
];
$.getJSON("{{ url_for('Tags.get_all_tags') }}",
function(data) {
ltags = $('#ltags').tagSuggest({
data: data,
value: valueData,
sortOrder: 'name',
maxDropHeight: 200,
name: 'ltags'
});
});
$('#date-range-from').dateRangePicker({
separator : ' to ',
getValue: function(){
if ($('#date-range-from-input').val() && $('#date-range-to-input').val() )
return $('#date-range-from-input').val() + ' to ' + $('#date-range-to-input').val();
else
return '';
},
setValue: function(s,s1,s2){
$('#date-range-from-input').val(s1);
$('#date-range-to-input').val(s2);
}
});
$('#date-range-to').dateRangePicker({
separator : ' to ',
getValue: function(){
if ($('#date-range-from-input').val() && $('#date-range-to-input').val() )
return $('#date-range-from-input').val() + ' to ' + $('#date-range-to-input').val();
else
return '';
},
setValue: function(s,s1,s2){
$('#date-range-from-input').val(s1);
$('#date-range-to-input').val(s2);
}
});
});
</script>
<script>
function searchTags() {
var data = ltags.getValue();
var date_from = $('#date-range-from-input').val();
var date_to =$('#date-range-to-input').val();
window.location.replace("{{ url_for('Tags.Tags_page') }}?date_from="+date_from+"&date_to="+date_to+"&ltags=" + data);
}
function emptyTags() {
ltags.clear();
}
function toggle_sidebar(){
if($('#nav_menu').is(':visible')){
$('#nav_menu').hide();
$('#side_menu').removeClass('border-right')
$('#side_menu').removeClass('col-lg-2')
$('#core_content').removeClass('col-lg-10')
}else{
$('#nav_menu').show();
$('#side_menu').addClass('border-right')
$('#side_menu').addClass('col-lg-2')
$('#core_content').addClass('col-lg-10')
}
}
</script>
<!-- Dynamically update the modal -->
<script>
// static data
var alert_message = '<div class="alert alert-info alert-dismissable"><button type="button" class="close" data-dismiss="alert" aria-hidden="true">×</button><strong>No more data.</strong> Full paste displayed.</div>';
var complete_paste = null;
var char_to_display = {{ char_to_display }};
var start_index = 0;
// When the modal is dismissed, reset it to its default content
$("#mymodal").on('hidden.bs.modal', function () {
can_change_modal_content = true;
$("#mymodalbody").html("<p>Loading paste information...</p>");
var loading_gif = "<img id='loading-gif-modal' class='img-center' src=\"{{url_for('static', filename='image/loading.gif') }}\" height='26' width='26' style='margin: 4px;'>";
$("#mymodalbody").append(loading_gif); // Show the loading GIF
$("#button_show_path").attr('href', '');
$("#button_show_path").hide();
complete_paste = null;
start_index = 0;
});
// Update the paste preview in the modal
function update_preview() {
if (start_index + char_to_display > complete_paste.length-1){ // end of paste reached
var final_index = complete_paste.length-1;
var flag_stop = true;
} else {
var final_index = start_index + char_to_display;
}
if (final_index != start_index){ // still have data to display
// Append the new content using text() and not append (XSS)
$("#mymodalbody").find("#paste-holder").text($("#mymodalbody").find("#paste-holder").text()+complete_paste.substring(start_index+1, final_index+1));
start_index = final_index;
if (flag_stop)
nothing_to_display();
} else {
nothing_to_display();
}
}
// Update the modal when there is no more data
function nothing_to_display() {
var new_content = $(alert_message).hide();
$("#mymodalbody").find("#panel-body").append(new_content);
new_content.show('fast');
$("#load-more-button").hide();
}
function get_html_and_update_modal(event, truemodal) {
event.preventDefault();
var modal=truemodal;
var url = " {{ url_for('showsavedpastes.showpreviewpaste') }}?paste=" + modal.attr('data-path') + "&num=" + modal.attr('data-num');
last_clicked_paste = modal.attr('data-num');
$.get(url, function (data) {
// Verify that the received data really belongs to the currently clicked paste; otherwise, ignore it.
var received_num = parseInt(data.split("|num|")[1]);
if (received_num == last_clicked_paste && can_change_modal_content) {
can_change_modal_content = false;
// Clear the data by removing the html, body and head tags, to prevent the dark modal background stacking bug.
var cleared_data = data.split("<body>")[1].split("</body>")[0];
$("#mymodalbody").html(cleared_data);
var button = $('<button type="button" id="load-more-button" class="btn btn-outline-primary rounded-circle px-1 py-0" data-url="' + $(modal).attr('data-path') +'" data-toggle="tooltip" data-placement="bottom" title="Load more content"><i class="fas fa-arrow-circle-down mt-1"></i></button>');
button.tooltip(button);
$("#container-show-more").append(button);
$("#button_show_path").attr('href', '{{ url_for('showsavedpastes.showsavedpaste') }}?paste=' + $(modal).attr('data-path'));
$("#button_show_path").show('fast');
$("#loading-gif-modal").css("visibility", "hidden"); // Hide the loading GIF
if ($("[data-initsize]").attr('data-initsize') < char_to_display) { // All the content is displayed
nothing_to_display();
}
// collapse decoded
$('#collapseDecoded').collapse('hide');
// On click, download the paste's full content
$("#load-more-button").on("click", function (event) {
if (complete_paste == null) { // Download only once
$.get("{{ url_for('showsavedpastes.getmoredata') }}"+"?paste="+$(modal).attr('data-path'), function(data, status){
complete_paste = data;
update_preview();
});
} else {
update_preview();
}
});
} else if (can_change_modal_content) {
$("#mymodalbody").html("Ignoring previous unfinished query of paste #" + received_num);
}
});
}
</script>
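The modal script above lazy-loads paste content in `char_to_display`-sized chunks: each click advances a start index, clamps it at the end of the paste, and flags when the end is reached. The chunking arithmetic of `update_preview` as a standalone Python sketch, faithfully mirroring the JavaScript (including its `substring(start_index+1, …)` offset); the helper name `next_chunk` is hypothetical:

```python
def next_chunk(complete_paste, start_index, char_to_display):
    """Return (text_to_append, new_start_index, reached_end), mirroring update_preview():
    advance by char_to_display characters, clamping at the end of the paste."""
    if start_index + char_to_display > len(complete_paste) - 1:  # end of paste reached
        final_index = len(complete_paste) - 1
        reached_end = True
    else:
        final_index = start_index + char_to_display
        reached_end = False
    # same slice as the JS substring(start_index+1, final_index+1)
    return complete_paste[start_index + 1:final_index + 1], final_index, reached_end
```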
</html>
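The Jinja pagination in the template above shows at most three page links on either side of the current page, collapsing the rest behind a leading `1 …` or trailing `… nb_page_max`. The same windowing logic as a standalone Python sketch (the function name `page_window` is hypothetical; `'...'` stands for the disabled ellipsis item):

```python
def page_window(page, nb_page_max):
    """Mirror the template's pagination: show '1 ... page-1 page' when more than
    three pages precede the current one, and 'page+1 ... nb_page_max' when more
    than three pages follow; otherwise list the nearby pages explicitly."""
    items = []
    if page > 3:
        items += [1, '...', page - 1, page]
    else:
        items += list(range(max(1, page - 2), page + 1))
    if nb_page_max - page > 3:
        items += [page + 1, '...', nb_page_max]
    else:
        items += list(range(page + 1, nb_page_max + 1))
    return items
```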

View file

@@ -1,181 +0,0 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
'''
Flask functions and routes for the trending modules page
'''
import redis
import json
import flask
import os
from datetime import datetime
from flask import Flask, render_template, jsonify, request, Blueprint
import Paste
# ============ VARIABLES ============
import Flask_config
app = Flask_config.app
cfg = Flask_config.cfg
baseUrl = Flask_config.baseUrl
max_preview_char = Flask_config.max_preview_char
max_preview_modal = Flask_config.max_preview_modal
r_serv_metadata = Flask_config.r_serv_metadata
bootstrap_label = Flask_config.bootstrap_label
#init all lvlDB servers
curYear = datetime.now().year
int_year = int(curYear)
r_serv_db = {}
# one DB per year, selected automatically from the available ARDB years
yearList = []
for x in range(0, (int_year - 2018) + 1):
intYear = int_year - x
yearList.append([str(intYear), intYear, int(curYear) == intYear])
r_serv_db[intYear] = redis.StrictRedis(
host=cfg.get("ARDB_DB", "host"),
port=cfg.getint("ARDB_DB", "port"),
db=intYear,
decode_responses=True)
yearList.sort(reverse=True)
browsepastes = Blueprint('browsepastes', __name__, template_folder='templates')
# ============ FUNCTIONS ============
def getPastebyType(server, module_name):
all_path = []
for path in server.smembers('WARNING_'+module_name):
all_path.append(path)
return all_path
def event_stream_getImportantPasteByModule(module_name, year):
index = 0
all_pastes_list = getPastebyType(r_serv_db[year], module_name)
paste_tags = []
for path in all_pastes_list:
index += 1
paste = Paste.Paste(path)
content = paste.get_p_content()
content_range = max_preview_char if len(content)>max_preview_char else len(content)-1
curr_date = str(paste._get_p_date())
curr_date = curr_date[0:4]+'/'+curr_date[4:6]+'/'+curr_date[6:]
p_tags = r_serv_metadata.smembers('tag:'+path)
l_tags = []
for tag in p_tags:
complete_tag = tag.replace('"', '&quot;')
tag = tag.split('=')
if len(tag) > 1:
if tag[1] != '':
tag = tag[1][1:-1]
# no value
else:
tag = tag[0][1:-1]
# use for custom tags
else:
tag = tag[0]
l_tags.append( (tag, complete_tag) )
data = {}
data["module"] = module_name
data["index"] = index
data["path"] = path
data["content"] = content[0:content_range]
data["linenum"] = paste.get_lines_info()[0]
data["date"] = curr_date
data["l_tags"] = l_tags
data["bootstrap_label"] = bootstrap_label
data["char_to_display"] = max_preview_modal
data["finished"] = True if index == len(all_pastes_list) else False
yield 'retry: 100000\ndata: %s\n\n' % json.dumps(data) # high retry delay to avoid browser reconnection
# ============ ROUTES ============
@browsepastes.route("/browseImportantPaste/", methods=['GET'])
def browseImportantPaste():
module_name = request.args.get('moduleName')
return render_template("browse_important_paste.html", year_list=yearList, selected_year=curYear)
@browsepastes.route("/importantPasteByModule/", methods=['GET'])
def importantPasteByModule():
module_name = request.args.get('moduleName')
# # TODO: VERIFY YEAR VALIDITY
try:
currentSelectYear = int(request.args.get('year'))
except (TypeError, ValueError):
print('Invalid year input')
currentSelectYear = int(datetime.now().year)
all_content = []
paste_date = []
paste_linenum = []
all_path = []
paste_tags = []
allPastes = getPastebyType(r_serv_db[currentSelectYear], module_name)
for path in allPastes[0:10]:
all_path.append(path)
paste = Paste.Paste(path)
content = paste.get_p_content()
content_range = max_preview_char if len(content)>max_preview_char else len(content)-1
all_content.append(content[0:content_range].replace("\"", "\'").replace("\r", " ").replace("\n", " "))
curr_date = str(paste._get_p_date())
curr_date = curr_date[0:4]+'/'+curr_date[4:6]+'/'+curr_date[6:]
paste_date.append(curr_date)
paste_linenum.append(paste.get_lines_info()[0])
p_tags = r_serv_metadata.smembers('tag:'+path)
l_tags = []
for tag in p_tags:
complete_tag = tag
tag = tag.split('=')
if len(tag) > 1:
if tag[1] != '':
tag = tag[1][1:-1]
# no value
else:
tag = tag[0][1:-1]
# use for custom tags
else:
tag = tag[0]
l_tags.append( (tag, complete_tag) )
paste_tags.append(l_tags)
if len(allPastes) > 10:
finished = False
else:
finished = True
return render_template("important_paste_by_module.html",
moduleName=module_name,
year=currentSelectYear,
all_path=all_path,
content=all_content,
paste_date=paste_date,
paste_linenum=paste_linenum,
char_to_display=max_preview_modal,
paste_tags=paste_tags,
bootstrap_label=bootstrap_label,
finished=finished)
@browsepastes.route("/_getImportantPasteByModule", methods=['GET'])
def getImportantPasteByModule():
module_name = request.args.get('moduleName')
currentSelectYear = int(request.args.get('year'))
return flask.Response(event_stream_getImportantPasteByModule(module_name, currentSelectYear), mimetype="text/event-stream")
# ========= REGISTRATION =========
app.register_blueprint(browsepastes, url_prefix=baseUrl)
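The deleted blueprint above streams table rows to the browser as Server-Sent Events: each message is a JSON payload on a `data:` line, preceded by a `retry:` hint and terminated by a blank line, and served with the `text/event-stream` mimetype. A minimal sketch of that framing, using only the standard library (`format_sse_event` is a hypothetical helper name, not part of the codebase):

```python
import json

def format_sse_event(payload, retry_ms=100000):
    """Frame one Server-Sent Event the way event_stream_getImportantPasteByModule does:
    a 'retry:' reconnection hint, then a 'data:' line with the JSON payload,
    then the mandatory blank-line terminator."""
    return 'retry: %d\ndata: %s\n\n' % (retry_ms, json.dumps(payload))
```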

View file

@@ -1,190 +0,0 @@
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Browse Important Paste - AIL</title>
<link rel="icon" href="{{ url_for('static', filename='image/ail-icon.png') }}">
<!-- Core CSS -->
<link href="{{ url_for('static', filename='css/bootstrap.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='font-awesome/css/font-awesome.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/sb-admin-2.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/dataTables.bootstrap.css') }}" rel="stylesheet" type="text/css" />
<script language="javascript" src="{{ url_for('static', filename='js/jquery.js')}}"></script>
<script src="{{ url_for('static', filename='js/bootstrap.min.js') }}"></script>
<script src="{{ url_for('static', filename='js/jquery.dataTables.min.js') }}"></script>
<script src="{{ url_for('static', filename='js/dataTables.bootstrap.js') }}"></script>
<style>
.tooltip-inner {
text-align: left;
height: 200%;
width: 200%;
max-width: 500px;
max-height: 500px;
font-size: 13px;
}
xmp {
white-space:pre-wrap;
word-wrap:break-word;
}
</style>
</head>
<body>
{% include 'navbar.html' %}
<!-- Modal -->
<div id="mymodal" class="modal fade" role="dialog">
<div class="modal-dialog modal-lg">
<!-- Modal content-->
<div id="mymodalcontent" class="modal-content">
<div id="mymodalbody" class="modal-body" max-width="850px">
<p>Loading paste information...</p>
<img id="loading-gif-modal" src="{{url_for('static', filename='image/loading.gif') }}" height="26" width="26" style="margin: 4px;">
</div>
<div class="modal-footer">
<a id="button_show_path" target="_blank" href=""><button type="button" class="btn btn-info">Show saved paste</button></a>
<button type="button" class="btn btn-default" data-dismiss="modal">Close</button>
</div>
</div>
</div>
</div>
<div id="page-wrapper">
<div class="row">
<div class="col-lg-12">
<h1 class="page-header" data-page="page-browse" >Browse important pastes</h1>
</div>
<!-- /.col-lg-12 -->
</div>
<div class="row">
<div class="col-md-12" style="margin-bottom: 0.2cm;">
<strong style="">Year: </strong>
<select class="form-control" id="index_year" style="display: inline-block; margin-bottom: 5px; width: 5%">
{% for yearElem in year_list %}
<option {% if yearElem[2] %} selected="selected" {% endif %} value="{{ yearElem[0] }}" >{{ yearElem[1] }}</option>
{% endfor %}
</select>
</div>
<br>
</div>
<!-- /.row -->
<div class="row">
<!-- /.nav-tabs -->
<ul class="nav nav-tabs">
<li name='nav-pan' class="active"><a data-toggle="tab" href="#credential-tab" data-attribute-name="credential" data-panel="credential-panel">Credentials</a></li>
<li name='nav-pan'><a data-toggle="tab" href="#creditcard-tab" data-attribute-name="creditcard" data-panel="creditcard-panel">Credit cards</a></li>
<li name='nav-pan'><a data-toggle="tab" href="#sqlinjection-tab" data-attribute-name="sqlinjection" data-panel="sqlinjection-panel">SQL injections</a></li>
<li name='nav-pan'><a data-toggle="tab" href="#cve-tab" data-attribute-name="cve" data-panel="cve-panel">CVEs</a></li>
<li name='nav-pan'><a data-toggle="tab" href="#keys-tab" data-attribute-name="keys" data-panel="keys-panel">Keys</a></li>
<li name='nav-pan'><a data-toggle="tab" href="#apikey-tab" data-attribute-name="apikey" data-panel="apikey-panel">API Keys</a></li>
<li name='nav-pan'><a data-toggle="tab" href="#mail-tab" data-attribute-name="mail" data-panel="mail-panel">Mails</a></li>
<li name='nav-pan'><a data-toggle="tab" href="#phone-tab" data-attribute-name="phone" data-panel="phone-panel">Phones</a></li>
<li name='nav-pan'><a data-toggle="tab" href="#onion-tab" data-attribute-name="onion" data-panel="onion-panel">Onions</a></li>
<li name='nav-pan'><a data-toggle="tab" href="#bitcoin-tab" data-attribute-name="bitcoin" data-panel="bitcoin-panel">Bitcoin</a></li>
<li name='nav-pan'><a data-toggle="tab" href="#base64-tab" data-attribute-name="base64" data-panel="base64-panel">Base64</a></li>
</ul>
<br>
<div class="tab-content">
<div class="col-lg-12 tab-pane fade in active" id="credential-tab" >
<img id="loading-gif-modal" src="{{url_for('static', filename='image/loading.gif') }}" style="margin: 4px;">
</div>
<div class="col-lg-12 tab-pane fade" id="creditcard-tab">
<img id="loading-gif-modal" src="{{url_for('static', filename='image/loading.gif') }}" style="margin: 4px;">
</div>
<div class="col-lg-12 tab-pane fade" id="sqlinjection-tab">
<img id="loading-gif-modal" src="{{url_for('static', filename='image/loading.gif') }}" style="margin: 4px;">
</div>
<div class="col-lg-12 tab-pane fade" id="cve-tab">
<img id="loading-gif-modal" src="{{url_for('static', filename='image/loading.gif') }}" style="margin: 4px;">
</div>
<div class="col-lg-12 tab-pane fade" id="keys-tab">
<img id="loading-gif-modal" src="{{url_for('static', filename='image/loading.gif') }}" style="margin: 4px;">
</div>
<div class="col-lg-12 tab-pane fade" id="apikey-tab">
<img id="loading-gif-modal" src="{{url_for('static', filename='image/loading.gif') }}" style="margin: 4px;">
</div>
<div class="col-lg-12 tab-pane fade" id="mail-tab">
<img id="loading-gif-modal" src="{{url_for('static', filename='image/loading.gif') }}" style="margin: 4px;">
</div>
<div class="col-lg-12 tab-pane fade" id="phone-tab">
<img id="loading-gif-modal" src="{{url_for('static', filename='image/loading.gif') }}" style="margin: 4px;">
</div>
<div class="col-lg-12 tab-pane fade" id="onion-tab">
<img id="loading-gif-modal" src="{{url_for('static', filename='image/loading.gif') }}" style="margin: 4px;">
</div>
<div class="col-lg-12 tab-pane fade" id="bitcoin-tab">
<img id="loading-gif-modal" src="{{url_for('static', filename='image/loading.gif') }}" style="margin: 4px;">
</div>
<div class="col-lg-12 tab-pane fade" id="base64-tab">
<img id="loading-gif-modal" src="{{url_for('static', filename='image/loading.gif') }}" style="margin: 4px;">
</div>
</div> <!-- tab-content -->
<!-- /.row -->
</div>
<!-- /#page-wrapper -->
<!-- import graph function -->
<script>
$(document).ready(function(){
activePage = $('h1.page-header').attr('data-page');
$("#"+activePage).addClass("active");
var dataPath = 'credential';
$.get("{{ url_for('browsepastes.importantPasteByModule') }}"+"?moduleName="+dataPath+"&year="+currentSelectYear, function(data, status){
$('#'+dataPath+'-tab').html(data);
});
});
</script>
<script>
// When a pannel is shown, create the data-table.
var previous_tab = $('[data-attribute-name="credential"]');
var currentTabName = previous_tab.attr('data-attribute-name');
var loading_gif = "<img id='loading-gif-modal' class='img-center' src=\"{{url_for('static', filename='image/loading.gif') }}\" height='26' width='26' style='margin: 4px;'>";
var currentSelectYear = {{ selected_year }};
$('#index_year').on('change', function() {
currentSelectYear = this.value;
$.get("{{ url_for('browsepastes.importantPasteByModule') }}"+"?moduleName="+currentTabName+"&year="+currentSelectYear, function(data, status){
$('#'+currentTabName+'-tab').html(data);
});
})
$('.nav-tabs a').on('shown.bs.tab', function(event){
var dataPath = $(event.target).attr('data-attribute-name');
currentTabName = dataPath;
$.get("{{ url_for('browsepastes.importantPasteByModule') }}"+"?moduleName="+currentTabName+"&year="+currentSelectYear, function(data, status){
currentTab = $('[name].active').children();
$('#'+previous_tab.attr('data-attribute-name')+'-tab').html(loading_gif);
currentTab.removeClass( "active" );
$('#'+dataPath+'-tab').html(data);
$(event.target).parent().addClass( "active" );
previous_tab = currentTab;
});
});
</script>
</div>
</body>
</html>

View file

@@ -1 +0,0 @@
<li id='page-browse'><a href="{{ url_for('browsepastes.browseImportantPaste') }}"><i class="fa fa-search-plus "></i> Browse important pastes</a></li>

View file

@@ -1,261 +0,0 @@
<table class="table table-striped table-bordered table-hover" id="myTable_{{ moduleName }}">
<thead>
<tr>
<th>#</th>
<th style="max-width: 800px;">Path</th>
<th>Date</th>
<th># of lines</th>
<th>Action</th>
</tr>
</thead>
<tbody>
{% for path in all_path %}
<tr>
<td> {{ loop.index0 }}</td>
<td><a target="_blank" href="{{ url_for('showsavedpastes.showsavedpaste') }}?paste={{path}}">{{ path }}</a>
<div>
{% for tag in paste_tags[loop.index0] %}
<a href="{{ url_for('Tags.get_tagged_paste') }}?ltags={{ tag[1] }}">
<span class="label label-{{ bootstrap_label[loop.index0 % 5] }} pull-left">{{ tag[0] }}</span>
</a>
{% endfor %}
</div>
</td>
<td>{{ paste_date[loop.index0] }}</td>
<td>{{ paste_linenum[loop.index0] }}</td>
<td><p><span class="glyphicon glyphicon-info-sign" data-toggle="tooltip" data-placement="left" title="{{ content[loop.index0] }} "></span> <button type="button" class="btn-link" data-num="{{ loop.index0 + 1 }}" data-toggle="modal" data-target="#mymodal" data-url="{{ url_for('showsavedpastes.showsavedpaste') }}?paste={{ path }}&num={{ loop.index0+1 }}" data-path="{{ path }}"><span class="fa fa-search-plus"></span></button></p></td>
</tr>
{% endfor %}
</tbody>
</table>
<br>
<div id="nbr_entry" class="alert alert-info">
</div>
<div id="div_stil_data">
<button id="load_more_json_button1" type="button" class="btn btn-default" onclick="add_entries(100)" style="display: none">Load 100 entries</button>
<button id="load_more_json_button2" type="button" class="btn btn-warning" onclick="add_entries(300)" style="display: none">Load 300 entries</button>
<img id="loading_gif_browse" src="{{url_for('static', filename='image/loading.gif') }}" height="20" width="20" style="margin: 2px;"></div>
<br>
<script>
var json_array = [];
var all_data_received = false;
var curr_numElem;
var elem_added = 10; // 10 elements are added by default on page load
var tot_num_entry = 10; // 10 elements are added by default on page load
function deploy_source() {
var button_load_more_displayed = false;
if(typeof(EventSource) !== "undefined") {
$("#load_more_json_button1").show();
$("#load_more_json_button2").show();
var source = new EventSource("{{ url_for('browsepastes.getImportantPasteByModule') }}"+"?moduleName="+moduleName+"&year="+currentSelectYear);
source.onmessage = function(event) {
var feed = jQuery.parseJSON( event.data );
curr_numElem = parseInt($("#myTable_"+moduleName).attr('data-numElem'));
if (feed.index > curr_numElem && feed.module == moduleName) { // avoid duplicating pastes
json_array.push(feed);
tot_num_entry++;
$("#nbr_entry").text(tot_num_entry + " entries available, " + (tot_num_entry - elem_added) + " not displayed");
$("#myTable_"+moduleName).attr('data-numElem', curr_numElem+1);
if(feed.index > 100 && !button_load_more_displayed) {
button_load_more_displayed = true;
add_entries_X(20);
}
if(feed.finished) {
$("#loading_gif_browse").hide();
source.close();
all_data_received = true;
add_entries_X(10);
}
}
};
} else {
console.log("Sorry, your browser does not support server-sent events...");
}
}
function add_entries(iter) { // Disable the buttons before entering the big loop
$("#load_more_json_button1").attr('disabled','disabled');
$("#load_more_json_button2").attr('disabled','disabled');
setTimeout(function() { add_entries_X(iter);}, 50);
}
function add_entries_X(to_add) {
for(i=0; i<to_add; i++) {
if(json_array.length == 0 && all_data_received) {
$("#load_more_json_button1").hide();
$("#load_more_json_button2").hide();
$("#nbr_entry").hide();
return false;
} else {
var feed = json_array.shift();
elem_added++;
var tag = ""
for(j=0; j<feed.l_tags.length; j++) {
console.log(feed.l_tags[j][1])
tag = tag + "<a href=\"{{ url_for('Tags.get_tagged_paste') }}?ltags=" + feed.l_tags[j][1] + "\">"
+ "<span class=\"label label-" + feed.bootstrap_label[j % 5] + " pull-left\">" + feed.l_tags[j][0] + "</span>" + "</a>";
}
search_table.row.add( [
feed.index,
"<a target=\"_blank\" href=\"{{ url_for('showsavedpastes.showsavedpaste') }}?paste="+feed.path+"&num="+feed.index+"\"> "+ feed.path +"</a>"
+ "<div>" + tag + "</div>" ,
feed.date,
feed.linenum,
"<p><span class=\"glyphicon glyphicon-info-sign\" data-toggle=\"tooltip\" data-placement=\"left\" title=\""+feed.content.replace(/\"/g, "\'").replace(/\r/g, "\'").replace(/\n/g, "\'")+"\"></span> <button type=\"button\" class=\"btn-link\" data-num=\""+feed.index+"\" data-toggle=\"modal\" data-target=\"#mymodal\" data-url=\"{{ url_for('showsavedpastes.showsavedpaste') }}?paste="+feed.path+"&num="+feed.index+"\" data-path=\""+feed.path+"\"><span class=\"fa fa-search-plus\"></span></button></p>"
] ).draw( false );
$("#myTable_"+moduleName).attr('data-numElem', curr_numElem+1);
}
}
$("#load_more_json_button1").removeAttr('disabled');
$("#load_more_json_button2").removeAttr('disabled');
$("#nbr_entry").text(tot_num_entry + " entries available, " + (tot_num_entry - elem_added) + " not displayed");
return true;
}
</script>
<script>
var moduleName = "{{ moduleName }}";
var currentSelectYear = "{{ year }}";
var search_table;
var last_clicked_paste;
var can_change_modal_content = true;
$("#myTable_"+moduleName).attr('data-numElem', "{{ all_path|length }}");
$(document).ready(function(){
$('[data-toggle="tooltip"]').tooltip();
$("[data-toggle='modal']").off('click.openmodal').on("click.openmodal", function (event) {
get_html_and_update_modal(event, $(this));
});
search_table = $('#myTable_'+moduleName).DataTable({ "order": [[ 2, "desc" ]] });
if( "{{ finished }}" == "True"){
$("#load_more_json_button1").hide();
$("#load_more_json_button2").hide();
$("#nbr_entry").hide();
$("#loading_gif_browse").hide();
} else {
deploy_source();
}
});
</script>
<!-- Dynamically update the modal -->
<script type="text/javascript">
// static data
var alert_message = '<div class="alert alert-info alert-dismissable"><button type="button" class="close" data-dismiss="alert" aria-hidden="true">×</button><strong>No more data.</strong> Full paste displayed.</div>';
var complete_paste = null;
var char_to_display = {{ char_to_display }};
var start_index = 0;
// When the modal is dismissed, reset it to its default content
$("#mymodal").on('hidden.bs.modal', function () {
can_change_modal_content = true;
$("#mymodalbody").html("<p>Loading paste information...</p>");
var loading_gif = "<img id='loading-gif-modal' class='img-center' src=\"{{url_for('static', filename='image/loading.gif') }}\" height='26' width='26' style='margin: 4px;'>";
$("#mymodalbody").append(loading_gif); // Show the loading GIF
$("#button_show_path").attr('href', '');
$("#button_show_path").hide();
complete_paste = null;
start_index = 0;
});
// Update the paste preview in the modal
function update_preview() {
if (start_index + char_to_display > complete_paste.length-1){ // end of paste reached
var final_index = complete_paste.length-1;
var flag_stop = true;
} else {
var final_index = start_index + char_to_display;
}
if (final_index != start_index){ // still have data to display
// Append the new content using text() and not append (XSS)
$("#mymodalbody").find("#paste-holder").text($("#mymodalbody").find("#paste-holder").text()+complete_paste.substring(start_index+1, final_index+1));
start_index = final_index;
if (flag_stop)
nothing_to_display();
} else {
nothing_to_display();
}
}
// Update the modal when there is no more data
function nothing_to_display() {
var new_content = $(alert_message).hide();
$("#mymodalbody").find("#panel-body").append(new_content);
new_content.show('fast');
$("#load-more-button").hide();
}
function get_html_and_update_modal(event, truemodal) {
event.preventDefault();
var modal=truemodal;
var url = " {{ url_for('showsavedpastes.showpreviewpaste') }}?paste=" + modal.attr('data-path') + "&num=" + modal.attr('data-num');
last_clicked_paste = modal.attr('data-num');
$.get(url, function (data) {
// verify that the received data really belongs to the currently clicked paste; otherwise, ignore it.
var received_num = parseInt(data.split("|num|")[1]);
if (received_num == last_clicked_paste && can_change_modal_content) {
can_change_modal_content = false;
// Clear data by removing the html, body and head tags to prevent the dark modal background stacking bug.
var cleared_data = data.split("<body>")[1].split("</body>")[0];
$("#mymodalbody").html(cleared_data);
var button = $('<button type="button" id="load-more-button" class="btn btn-info btn-xs center-block" data-url="' + $(modal).attr('data-path') +'" data-toggle="tooltip" data-placement="bottom" title="Load more content"><span class="glyphicon glyphicon-download"></span></button>');
button.tooltip();
$("#mymodalbody").children(".panel-default").append(button);
$("#button_show_path").attr('href', $(modal).attr('data-url'));
$("#button_show_path").show('fast');
$("#loading-gif-modal").css("visibility", "hidden"); // Hide the loading GIF
if ($("[data-initsize]").attr('data-initsize') < char_to_display) { // All the content is displayed
nothing_to_display();
}
// On click, download the paste's full content
$("#load-more-button").on("click", function (event) {
if (complete_paste == null) { //Download only once
$.get("{{ url_for('showsavedpastes.getmoredata') }}"+"?paste="+$(modal).attr('data-path'), function(data, status){
complete_paste = data;
update_preview();
});
} else {
update_preview();
}
});
} else if (can_change_modal_content) {
$("#mymodalbody").html("Ignoring previous unfinished query of paste #" + received_num);
}
});
}
// Used to bind the button with the newly displayed data
// (The binding does not happen if the dataTable is in tabs and the clicked data is on another page)
search_table.on( 'draw.dt', function () {
// Bind tooltip each time we draw a new page
$('[data-toggle="tooltip"]').tooltip();
// On click, get html content from url and update the corresponding modal
$("[data-toggle='modal']").off('click.openmodal').on("click.openmodal", function (event) {
get_html_and_update_modal(event, $(this));
});
} );
</script>
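The modal's chunked preview above advances a character window through the paste. The same index arithmetic can be sketched in Python; `next_chunk` is a hypothetical helper that mirrors `update_preview()`, including its `substring(start_index+1, final_index+1)` offset:

```python
def next_chunk(full_text, start_index, chunk_size):
    """Return the next window of text, the new start index, and
    whether more data remains, mirroring update_preview()."""
    end = len(full_text) - 1
    if start_index + chunk_size > end:  # end of paste reached
        final_index = end
        more = False
    else:
        final_index = start_index + chunk_size
        more = True
    # JS substring(start+1, final+1) == Python slice [start+1:final+1]
    return full_text[start_index + 1:final_index + 1], final_index, more
```

Note that, like the JS, the sketch starts each window one character past `start_index`, so the very first character of the paste is never emitted by the preview loop.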


@ -22,8 +22,10 @@ cfg = Flask_config.cfg
baseUrl = Flask_config.baseUrl
r_serv = Flask_config.r_serv
r_serv_log = Flask_config.r_serv_log
r_serv_db = Flask_config.r_serv_db
max_dashboard_logs = Flask_config.max_dashboard_logs
dict_update_description = Flask_config.dict_update_description
dashboard = Blueprint('dashboard', __name__, template_folder='templates')
@ -164,8 +166,22 @@ def index():
log_select.add(max_dashboard_logs)
log_select = list(log_select)
log_select.sort()
# Check if update in progress
update_in_progress = False
update_warning_message = ''
update_warning_message_notice_me = ''
current_update = r_serv_db.get('ail:current_background_update')
if current_update:
if r_serv_db.scard('ail:update_{}'.format(current_update)) != dict_update_description[current_update]['nb_background_update']:
update_in_progress = True
update_warning_message = dict_update_description[current_update]['update_warning_message']
update_warning_message_notice_me = dict_update_description[current_update]['update_warning_message_notice_me']
return render_template("index.html", default_minute = default_minute, threshold_stucked_module=threshold_stucked_module,
log_select=log_select, selected=max_dashboard_logs,
update_warning_message=update_warning_message, update_in_progress=update_in_progress,
update_warning_message_notice_me=update_warning_message_notice_me)
# ========= REGISTRATION =========
app.register_blueprint(dashboard, url_prefix=baseUrl)
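The dashboard hunk above reads the `ail:*` update keys (see the DB0 map) to decide whether a background update is still running. A minimal sketch of that check, using a hypothetical in-memory stand-in for the Redis handle (`FakeDB` and `update_status` are illustrative names; AIL itself uses `redis.StrictRedis` via `Flask_config`):

```python
class FakeDB:
    """Hypothetical stand-in for r_serv_db, just enough for the check."""
    def __init__(self):
        self.kv = {}
        self.sets = {}
    def get(self, k):
        return self.kv.get(k)
    def scard(self, k):
        return len(self.sets.get(k, set()))

# Assumed shape, mirroring Flask_config.dict_update_description
dict_update_description = {
    'v1.5': {'nb_background_update': 3,
             'update_warning_message': 'An update is running.',
             'update_warning_message_notice_me': 'Some data may be missing.'}
}

def update_status(r_serv_db):
    """Return (in_progress, warning, notice) like the dashboard route:
    the update is in progress while the set of finished background
    scripts is smaller than the expected count."""
    current_update = r_serv_db.get('ail:current_background_update')
    if current_update:
        done = r_serv_db.scard('ail:update_{}'.format(current_update))
        desc = dict_update_description[current_update]
        if done != desc['nb_background_update']:
            return True, desc['update_warning_message'], desc['update_warning_message_notice_me']
    return False, '', ''
```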


@ -99,6 +99,19 @@
</div>
</div>
{%if update_in_progress%}
<div class="alert alert-warning alert-dismissible fade in">
<a href="#" class="close" data-dismiss="alert" aria-label="close">&times;</a>
<strong>Warning!</strong> {{update_warning_message}} <strong>{{update_warning_message_notice_me}}</strong>
(<a href="{{ url_for('settings.settings_page') }}">Check Update Status</a>)
</div>
{%endif%}
<div class="alert alert-info alert-dismissible fade in">
<a href="#" class="close" data-dismiss="alert" aria-label="close">&times;</a>
<strong>Bootstrap 4 migration!</strong> Some pages are still in bootstrap 3. You can check the migration progress <strong><a href="https://github.com/CIRCL/AIL-framework/issues/330" target="_blank">Here</a></strong>.
</div>
<div class="row">
<div class="col-lg-6">
<div class="panel panel-default">
@ -187,6 +200,7 @@
<!-- /#page-wrapper -->
</div>
<script> var url_showSavedPath = "{{ url_for('showsavedpastes.showsavedpaste') }}"; </script>
<script type="text/javascript" src="{{ url_for('static', filename='js/indexjavascript.js')}}"
data-urlstuff="{{ url_for('dashboard.stuff') }}" data-urllog="{{ url_for('dashboard.logs') }}">


@ -25,6 +25,7 @@ baseUrl = Flask_config.baseUrl
r_serv_metadata = Flask_config.r_serv_metadata
vt_enabled = Flask_config.vt_enabled
vt_auth = Flask_config.vt_auth
PASTES_FOLDER = Flask_config.PASTES_FOLDER
hashDecoded = Blueprint('hashDecoded', __name__, template_folder='templates')
@ -589,6 +590,12 @@ def hash_graph_node_json():
#get related paste
l_pastes = r_serv_metadata.zrange('nb_seen_hash:'+hash, 0, -1)
for paste in l_pastes:
# dynamic update
if PASTES_FOLDER in paste:
score = r_serv_metadata.zscore('nb_seen_hash:{}'.format(hash), paste)
r_serv_metadata.zrem('nb_seen_hash:{}'.format(hash), paste)
paste = paste.replace(PASTES_FOLDER, '', 1)
r_serv_metadata.zadd('nb_seen_hash:{}'.format(hash), score, paste)
url = paste
#nb_seen_in_this_paste = nb_in_file = int(r_serv_metadata.zscore('nb_seen_hash:'+hash, paste))
nb_hash_in_paste = r_serv_metadata.scard('hash_paste:'+paste)


@ -8,6 +8,9 @@ import redis
import datetime
import sys
import os
import time
import json
from pyfaup.faup import Faup
from flask import Flask, render_template, jsonify, request, Blueprint, redirect, url_for
from Date import Date
@ -23,10 +26,13 @@ r_cache = Flask_config.r_cache
r_serv_onion = Flask_config.r_serv_onion
r_serv_metadata = Flask_config.r_serv_metadata
bootstrap_label = Flask_config.bootstrap_label
PASTES_FOLDER = Flask_config.PASTES_FOLDER
hiddenServices = Blueprint('hiddenServices', __name__, template_folder='templates')
faup = Faup()
list_types=['onion', 'regular']
dic_type_name={'onion':'Onion', 'regular':'Website'}
# ============ FUNCTIONS ============
def one():
return 1
@ -68,48 +74,114 @@ def unpack_paste_tags(p_tags):
l_tags.append( (tag, complete_tag) )
return l_tags
def is_valid_domain(domain):
faup.decode(domain)
domain_unpack = faup.get()
if domain_unpack['tld'] is not None and domain_unpack['scheme'] is None and domain_unpack['port'] is None and domain_unpack['query_string'] is None:
return True
else:
return False
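`is_valid_domain()` relies on pyfaup to accept only a bare hostname with a TLD and reject anything carrying a scheme, port, or query string. A rough pyfaup-free approximation using the standard library, for illustration only (`is_bare_domain` is a hypothetical name and the `urlsplit`-based parsing is an assumption, not what AIL runs):

```python
from urllib.parse import urlsplit

def is_bare_domain(candidate):
    """Approximate is_valid_domain(): TLD present, and no scheme,
    port, or query string attached to the input."""
    parts = urlsplit('//' + candidate)  # force netloc parsing
    has_tld = '.' in (parts.hostname or '')
    has_scheme = '://' in candidate
    has_port = parts.port is not None
    has_query = parts.query != ''
    return has_tld and not (has_scheme or has_port or has_query)
```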
def get_onion_status(domain, date):
if r_serv_onion.sismember('onion_up:'+date , domain):
return True
else:
return False
def get_domain_type(domain):
type_id = domain.split(':')[-1]
if type_id == 'onion':
return 'onion'
else:
return 'regular'

def get_type_domain(domain):
if domain is None:
type = 'regular'
else:
if domain.rsplit('.', 1)[1] == 'onion':
type = 'onion'
else:
type = 'regular'
return type

def get_domain_from_url(url):
faup.decode(url)
unpack_url = faup.get()
domain = unpack_url['domain'].decode()
return domain
def get_last_domains_crawled(type):
return r_serv_onion.lrange('last_{}'.format(type), 0 ,-1)
def get_stats_last_crawled_domains(type, date):
statDomains = {}
statDomains['domains_up'] = r_serv_onion.scard('{}_up:{}'.format(type, date))
statDomains['domains_down'] = r_serv_onion.scard('{}_down:{}'.format(type, date))
statDomains['total'] = statDomains['domains_up'] + statDomains['domains_down']
statDomains['domains_queue'] = r_serv_onion.scard('{}_crawler_queue'.format(type))
statDomains['domains_queue'] += r_serv_onion.scard('{}_crawler_priority_queue'.format(type))
return statDomains
def get_last_crawled_domains_metadata(list_domains_crawled, date, type=None, auto_mode=False):
list_crawled_metadata = []
for domain_epoch in list_domains_crawled:
if not auto_mode:
domain, epoch = domain_epoch.rsplit(';', 1)
else:
url = domain_epoch
domain = domain_epoch
domain = domain.split(':')
if len(domain) == 1:
port = 80
domain = domain[0]
else:
port = domain[1]
domain = domain[0]
metadata_domain = {}
# get Domain type
if type is None:
type_domain = get_type_domain(domain)
else:
type_domain = type
if auto_mode:
metadata_domain['url'] = url
epoch = r_serv_onion.zscore('crawler_auto_queue', '{};auto;{}'.format(domain, type_domain))
#domain in priority queue
if epoch is None:
epoch = 'In Queue'
else:
epoch = datetime.datetime.fromtimestamp(float(epoch)).strftime('%Y-%m-%d %H:%M:%S')
metadata_domain['domain'] = domain
if len(domain) > 45:
domain_name, tld_domain = domain.rsplit('.', 1)
metadata_domain['domain_name'] = '{}[...].{}'.format(domain_name[:40], tld_domain)
else:
metadata_domain['domain_name'] = domain
metadata_domain['port'] = port
metadata_domain['epoch'] = epoch
metadata_domain['last_check'] = r_serv_onion.hget('{}_metadata:{}'.format(type_domain, domain), 'last_check')
if metadata_domain['last_check'] is None:
metadata_domain['last_check'] = '********'
metadata_domain['first_seen'] = r_serv_onion.hget('{}_metadata:{}'.format(type_domain, domain), 'first_seen')
if metadata_domain['first_seen'] is None:
metadata_domain['first_seen'] = '********'
if r_serv_onion.sismember('{}_up:{}'.format(type_domain, metadata_domain['last_check']) , domain):
metadata_domain['status_text'] = 'UP'
metadata_domain['status_color'] = 'Green'
metadata_domain['status_icon'] = 'fa-check-circle'
else:
metadata_domain['status_text'] = 'DOWN'
metadata_domain['status_color'] = 'Red'
metadata_domain['status_icon'] = 'fa-times-circle'
list_crawled_metadata.append(metadata_domain)
return list_crawled_metadata
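Two details of the metadata loop above are easy to get wrong: splitting an optional `:port` suffix off a crawled entry, and shortening very long (e.g. onion v3) names for display. A hedged sketch of both, with hypothetical helper names (the route inlines this logic):

```python
def split_domain_port(entry, default_port=80):
    """Mirror the 'domain[:port]' handling of the metadata loop."""
    parts = entry.split(':')
    if len(parts) == 1:
        return parts[0], default_port
    return parts[0], parts[1]

def display_name(domain, max_len=45):
    """Shorten long domain names the same way as the loop above:
    keep the first 40 characters and the TLD."""
    if len(domain) > max_len:
        name, tld = domain.rsplit('.', 1)
        return '{}[...].{}'.format(name[:40], tld)
    return domain
```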
def get_crawler_splash_status(type):
crawler_metadata = []
all_crawlers = r_cache.smembers('{}_crawlers'.format(type))
for crawler in all_crawlers:
crawling_domain = r_cache.hget('metadata_crawler:{}'.format(crawler), 'crawling_domain')
started_time = r_cache.hget('metadata_crawler:{}'.format(crawler), 'started_time')
status_info = r_cache.hget('metadata_crawler:{}'.format(crawler), 'status')
@ -120,10 +192,322 @@ def hiddenServices_page():
status=False
crawler_metadata.append({'crawler_info': crawler_info, 'crawling_domain': crawling_domain, 'status_info': status_info, 'status': status})
return crawler_metadata
def create_crawler_config(mode, service_type, crawler_config, domain, url=None):
if mode == 'manual':
r_cache.set('crawler_config:{}:{}:{}'.format(mode, service_type, domain), json.dumps(crawler_config))
elif mode == 'auto':
r_serv_onion.set('crawler_config:{}:{}:{}:{}'.format(mode, service_type, domain, url), json.dumps(crawler_config))
def send_url_to_crawl_in_queue(mode, service_type, url):
r_serv_onion.sadd('{}_crawler_priority_queue'.format(service_type), '{};{}'.format(url, mode))
# add auto crawled url for user UI
if mode == 'auto':
r_serv_onion.sadd('auto_crawler_url:{}'.format(service_type), url)
def delete_auto_crawler(url):
domain = get_domain_from_url(url)
type = get_type_domain(domain)
# remove from set
r_serv_onion.srem('auto_crawler_url:{}'.format(type), url)
# remove config
r_serv_onion.delete('crawler_config:auto:{}:{}:{}'.format(type, domain, url))
# remove from queue
r_serv_onion.srem('{}_crawler_priority_queue'.format(type), '{};auto'.format(url))
# remove from crawler_auto_queue
r_serv_onion.zrem('crawler_auto_queue', '{};auto;{}'.format(url, type))
# ============= ROUTES ==============
@hiddenServices.route("/crawlers/", methods=['GET'])
def dashboard():
crawler_metadata_onion = get_crawler_splash_status('onion')
crawler_metadata_regular = get_crawler_splash_status('regular')
now = datetime.datetime.now()
date = now.strftime("%Y%m%d")
statDomains_onion = get_stats_last_crawled_domains('onion', date)
statDomains_regular = get_stats_last_crawled_domains('regular', date)
return render_template("Crawler_dashboard.html", crawler_metadata_onion = crawler_metadata_onion,
crawler_metadata_regular=crawler_metadata_regular,
statDomains_onion=statDomains_onion, statDomains_regular=statDomains_regular)
@hiddenServices.route("/hiddenServices/2", methods=['GET'])
def hiddenServices_page_test():
return render_template("Crawler_index.html")
@hiddenServices.route("/crawlers/manual", methods=['GET'])
def manual():
return render_template("Crawler_Splash_manual.html")
@hiddenServices.route("/crawlers/crawler_splash_onion", methods=['GET'])
def crawler_splash_onion():
type = 'onion'
last_onions = get_last_domains_crawled(type)
list_onion = []
now = datetime.datetime.now()
date = now.strftime("%Y%m%d")
statDomains = get_stats_last_crawled_domains(type, date)
list_onion = get_last_crawled_domains_metadata(last_onions, date, type=type)
crawler_metadata = get_crawler_splash_status(type)
date_string = '{}-{}-{}'.format(date[0:4], date[4:6], date[6:8])
return render_template("Crawler_Splash_onion.html", last_onions=list_onion, statDomains=statDomains,
crawler_metadata=crawler_metadata, date_from=date_string, date_to=date_string)
@hiddenServices.route("/crawlers/Crawler_Splash_last_by_type", methods=['GET'])
def Crawler_Splash_last_by_type():
type = request.args.get('type')
# verify user input
if type not in list_types:
type = 'onion'
type_name = dic_type_name[type]
list_domains = []
now = datetime.datetime.now()
date = now.strftime("%Y%m%d")
date_string = '{}-{}-{}'.format(date[0:4], date[4:6], date[6:8])
statDomains = get_stats_last_crawled_domains(type, date)
list_domains = get_last_crawled_domains_metadata(get_last_domains_crawled(type), date, type=type)
crawler_metadata = get_crawler_splash_status(type)
return render_template("Crawler_Splash_last_by_type.html", type=type, type_name=type_name,
last_domains=list_domains, statDomains=statDomains,
crawler_metadata=crawler_metadata, date_from=date_string, date_to=date_string)
@hiddenServices.route("/crawlers/blacklisted_domains", methods=['GET'])
def blacklisted_domains():
blacklist_domain = request.args.get('blacklist_domain')
unblacklist_domain = request.args.get('unblacklist_domain')
type = request.args.get('type')
if type in list_types:
type_name = dic_type_name[type]
if blacklist_domain is not None:
blacklist_domain = int(blacklist_domain)
if unblacklist_domain is not None:
unblacklist_domain = int(unblacklist_domain)
try:
page = int(request.args.get('page'))
except:
page = 1
if page <= 0:
page = 1
nb_page_max = r_serv_onion.scard('blacklist_{}'.format(type))/(1000)
if isinstance(nb_page_max, float):
nb_page_max = int(nb_page_max)+1
if page > nb_page_max:
page = nb_page_max
start = 1000*(page -1)
stop = 1000*page
list_blacklisted = list(r_serv_onion.smembers('blacklist_{}'.format(type)))
list_blacklisted_1 = list_blacklisted[start:stop]
list_blacklisted_2 = list_blacklisted[stop:stop+1000]
return render_template("blacklisted_domains.html", list_blacklisted_1=list_blacklisted_1, list_blacklisted_2=list_blacklisted_2,
type=type, type_name=type_name, page=page, nb_page_max=nb_page_max,
blacklist_domain=blacklist_domain, unblacklist_domain=unblacklist_domain)
else:
return 'Incorrect Type'
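The page count above is computed as `scard/1000` followed by `int(x)+1`; under Python 3's true division the result is always a float, so the `isinstance` branch always fires and an empty trailing page appears whenever the set size is an exact multiple of 1000. A ceil-based sketch with hypothetical helper names, not the route's code:

```python
import math

def page_count(nb_items, per_page=1000):
    """Ceil-based page count, never below one page."""
    return max(1, math.ceil(nb_items / per_page))

def page_slice(page, per_page=1000):
    """Start/stop indices for a 1-based page number, matching the
    route's start = 1000*(page-1), stop = 1000*page."""
    start = per_page * (page - 1)
    return start, start + per_page
```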
@hiddenServices.route("/crawler/blacklist_domain", methods=['GET'])
def blacklist_domain():
domain = request.args.get('domain')
type = request.args.get('type')
try:
page = int(request.args.get('page'))
except:
page = 1
if type in list_types:
if is_valid_domain(domain):
res = r_serv_onion.sadd('blacklist_{}'.format(type), domain)
if page:
if res == 0:
return redirect(url_for('hiddenServices.blacklisted_domains', page=page, type=type, blacklist_domain=2))
else:
return redirect(url_for('hiddenServices.blacklisted_domains', page=page, type=type, blacklist_domain=1))
else:
return redirect(url_for('hiddenServices.blacklisted_domains', page=page, type=type, blacklist_domain=0))
else:
return 'Incorrect type'
@hiddenServices.route("/crawler/unblacklist_domain", methods=['GET'])
def unblacklist_domain():
domain = request.args.get('domain')
type = request.args.get('type')
try:
page = int(request.args.get('page'))
except:
page = 1
if type in list_types:
if is_valid_domain(domain):
res = r_serv_onion.srem('blacklist_{}'.format(type), domain)
if page:
if res == 0:
return redirect(url_for('hiddenServices.blacklisted_domains', page=page, type=type, unblacklist_domain=2))
else:
return redirect(url_for('hiddenServices.blacklisted_domains', page=page, type=type, unblacklist_domain=1))
else:
return redirect(url_for('hiddenServices.blacklisted_domains', page=page, type=type, unblacklist_domain=0))
else:
return 'Incorrect type'
@hiddenServices.route("/crawlers/create_spider_splash", methods=['POST'])
def create_spider_splash():
url = request.form.get('url_to_crawl')
automatic = request.form.get('crawler_type')
crawler_time = request.form.get('crawler_epoch')
#html = request.form.get('html_content_id')
screenshot = request.form.get('screenshot')
har = request.form.get('har')
depth_limit = request.form.get('depth_limit')
max_pages = request.form.get('max_pages')
# validate url
if url is None or url=='' or url=='\n':
return 'incorrect url'
crawler_config = {}
# verify user input
if automatic:
automatic = True
else:
automatic = False
if not screenshot:
crawler_config['png'] = 0
if not har:
crawler_config['har'] = 0
# verify user input
if depth_limit:
try:
depth_limit = int(depth_limit)
if depth_limit < 0:
return 'incorrect depth_limit'
else:
crawler_config['depth_limit'] = depth_limit
except:
return 'incorrect depth_limit'
if max_pages:
try:
max_pages = int(max_pages)
if max_pages < 1:
return 'incorrect max_pages'
else:
crawler_config['closespider_pagecount'] = max_pages
except:
return 'incorrect max_pages'
# get service_type
faup.decode(url)
unpack_url = faup.get()
domain = unpack_url['domain'].decode()
if unpack_url['tld'] == b'onion':
service_type = 'onion'
else:
service_type = 'regular'
if automatic:
mode = 'auto'
try:
crawler_time = int(crawler_time)
if crawler_time < 0:
return 'incorrect epoch'
else:
crawler_config['time'] = crawler_time
except:
return 'incorrect epoch'
else:
mode = 'manual'
epoch = None
create_crawler_config(mode, service_type, crawler_config, domain, url=url)
send_url_to_crawl_in_queue(mode, service_type, url)
return redirect(url_for('hiddenServices.manual'))
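The route above validates `depth_limit` and `max_pages` with the same try/int/bounds pattern. Factored out, it might look like the following (`parse_positive_int` is a hypothetical helper; the route inlines the logic per field and returns a field-specific error string):

```python
def parse_positive_int(value, minimum=0):
    """Validate an optional form field: missing/empty -> (None, None),
    non-numeric or below the minimum -> (None, error string)."""
    if not value:
        return None, None
    try:
        value = int(value)
    except (TypeError, ValueError):
        return None, 'incorrect value'
    if value < minimum:
        return None, 'incorrect value'
    return value, None
```

For example, `depth_limit` would use `minimum=0` and `max_pages` would use `minimum=1`, matching the bounds checked above.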
@hiddenServices.route("/crawlers/auto_crawler", methods=['GET'])
def auto_crawler():
nb_element_to_display = 100
try:
page = int(request.args.get('page'))
except:
page = 1
if page <= 0:
page = 1
nb_auto_onion = r_serv_onion.scard('auto_crawler_url:onion')
nb_auto_regular = r_serv_onion.scard('auto_crawler_url:regular')
if nb_auto_onion > nb_auto_regular:
nb_max = nb_auto_onion
else:
nb_max = nb_auto_regular
nb_page_max = nb_max/(nb_element_to_display)
if isinstance(nb_page_max, float):
nb_page_max = int(nb_page_max)+1
if page > nb_page_max:
page = nb_page_max
start = nb_element_to_display*(page -1)
stop = nb_element_to_display*page
last_auto_crawled = get_last_domains_crawled('auto_crawled')
last_domains = get_last_crawled_domains_metadata(last_auto_crawled, '')
if start > nb_auto_onion:
auto_crawler_domain_onions = []
elif stop > nb_auto_onion:
auto_crawler_domain_onions = list(r_serv_onion.smembers('auto_crawler_url:onion'))[start:nb_auto_onion]
else:
auto_crawler_domain_onions = list(r_serv_onion.smembers('auto_crawler_url:onion'))[start:stop]
if start > nb_auto_regular:
auto_crawler_domain_regular = []
elif stop > nb_auto_regular:
auto_crawler_domain_regular = list(r_serv_onion.smembers('auto_crawler_url:regular'))[start:nb_auto_regular]
else:
auto_crawler_domain_regular = list(r_serv_onion.smembers('auto_crawler_url:regular'))[start:stop]
auto_crawler_domain_onions_metadata = get_last_crawled_domains_metadata(auto_crawler_domain_onions, '', type='onion', auto_mode=True)
auto_crawler_domain_regular_metadata = get_last_crawled_domains_metadata(auto_crawler_domain_regular, '', type='regular', auto_mode=True)
return render_template("Crawler_auto.html", page=page, nb_page_max=nb_page_max,
last_domains=last_domains,
auto_crawler_domain_onions_metadata=auto_crawler_domain_onions_metadata,
auto_crawler_domain_regular_metadata=auto_crawler_domain_regular_metadata)
@hiddenServices.route("/crawlers/remove_auto_crawler", methods=['GET'])
def remove_auto_crawler():
url = request.args.get('url')
page = request.args.get('page')
if url:
delete_auto_crawler(url)
return redirect(url_for('hiddenServices.auto_crawler', page=page))
@hiddenServices.route("/crawlers/crawler_dashboard_json", methods=['GET'])
def crawler_dashboard_json():
crawler_metadata_onion = get_crawler_splash_status('onion')
crawler_metadata_regular = get_crawler_splash_status('regular')
now = datetime.datetime.now()
date = now.strftime("%Y%m%d")
statDomains_onion = get_stats_last_crawled_domains('onion', date)
statDomains_regular = get_stats_last_crawled_domains('regular', date)
return jsonify({'statDomains_onion': statDomains_onion, 'statDomains_regular': statDomains_regular,
'crawler_metadata_onion':crawler_metadata_onion, 'crawler_metadata_regular':crawler_metadata_regular})
# # TODO: refactor
@hiddenServices.route("/hiddenServices/last_crawled_domains_with_stats_json", methods=['GET'])
def last_crawled_domains_with_stats_json():
last_onions = r_serv_onion.lrange('last_onion', 0 ,-1)
@ -261,29 +645,54 @@ def show_domains_by_daterange():
date_from=date_from, date_to=date_to, domains_up=domains_up, domains_down=domains_down,
domains_tags=domains_tags, bootstrap_label=bootstrap_label)
@hiddenServices.route("/crawlers/show_domain", methods=['GET'])
def show_domain():
domain = request.args.get('domain')
epoch = request.args.get('epoch')
try:
epoch = int(epoch)
except:
epoch = None
port = request.args.get('port')
faup.decode(domain)
unpack_url = faup.get()
domain = unpack_url['domain'].decode()
if not port:
if unpack_url['port']:
port = unpack_url['port'].decode()
else:
port = 80
try:
port = int(port)
except:
port = 80
type = get_type_domain(domain)
if domain is None or not r_serv_onion.exists('{}_metadata:{}'.format(type, domain)):
return '404'
# # TODO: FIXME return 404
last_check = r_serv_onion.hget('{}_metadata:{}'.format(type, domain), 'last_check')
if last_check is None:
last_check = '********'
last_check = '{}/{}/{}'.format(last_check[0:4], last_check[4:6], last_check[6:8])
first_seen = r_serv_onion.hget('{}_metadata:{}'.format(type, domain), 'first_seen')
if first_seen is None:
first_seen = '********'
first_seen = '{}/{}/{}'.format(first_seen[0:4], first_seen[4:6], first_seen[6:8])
origin_paste = r_serv_onion.hget('{}_metadata:{}'.format(type, domain), 'paste_parent')
h = HiddenServices(domain, type, port=port)
item_core = h.get_domain_crawled_core_item(epoch=epoch)
if item_core:
l_pastes = h.get_last_crawled_pastes(item_root=item_core['root_item'])
else:
l_pastes = []
dict_links = h.get_all_links(l_pastes)
if l_pastes:
status = True
else:
status = False
last_check = '{} - {}'.format(last_check, time.strftime('%H:%M.%S', time.gmtime(epoch)))
screenshot = h.get_domain_random_screenshot(l_pastes)
if screenshot:
screenshot = screenshot[0]
@ -295,15 +704,14 @@ def onion_domain():
origin_paste_name = h.get_origin_paste_name()
origin_paste_tags = unpack_paste_tags(r_serv_metadata.smembers('tag:{}'.format(origin_paste)))
paste_tags = []
for path in l_pastes:
p_tags = r_serv_metadata.smembers('tag:'+path)
paste_tags.append(unpack_paste_tags(p_tags))
return render_template("showDomain.html", domain=domain, last_check=last_check, first_seen=first_seen,
l_pastes=l_pastes, paste_tags=paste_tags, bootstrap_label=bootstrap_label,
origin_paste_tags=origin_paste_tags, status=status, dict_links=dict_links,
origin_paste=origin_paste, origin_paste_name=origin_paste_name,
domain_tags=domain_tags, screenshot=screenshot)
@ -314,7 +722,6 @@ def onion_son():
h = HiddenServices(onion_domain, 'onion')
l_pastes = h.get_last_crawled_pastes()
l_son = h.get_domain_son(l_pastes)
print(l_son)
return 'l_son'
# ============= JSON ==============
@ -336,5 +743,26 @@ def domain_crawled_7days_json():
return jsonify(json_domain_stats)
@hiddenServices.route('/hiddenServices/domain_crawled_by_type_json')
def domain_crawled_by_type_json():
current_date = request.args.get('date')
type = request.args.get('type')
if type in list_types:
num_day_type = 7
date_range = get_date_range(num_day_type)
range_decoder = []
for date in date_range:
day_crawled = {}
day_crawled['date']= date[0:4] + '-' + date[4:6] + '-' + date[6:8]
day_crawled['UP'] = r_serv_onion.scard('{}_up:{}'.format(type, date))
day_crawled['DOWN'] = r_serv_onion.scard('{}_down:{}'.format(type, date))
range_decoder.append(day_crawled)
return jsonify(range_decoder)
else:
return jsonify('Incorrect Type')
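The JSON route above iterates `get_date_range(7)` and reformats each `YYYYMMDD` key into `YYYY-MM-DD`. A sketch of the assumed date-range behaviour (the real helper lives in AIL's `Date` module, so this is an approximation, oldest date first and today included):

```python
import datetime

def get_date_range(num_days):
    """Assumed behaviour: the last num_days dates as YYYYMMDD
    strings, oldest first, ending with today."""
    today = datetime.date.today()
    return [(today - datetime.timedelta(days=n)).strftime("%Y%m%d")
            for n in range(num_days - 1, -1, -1)]

def format_day(date):
    """Same reformatting as the JSON route: YYYYMMDD -> YYYY-MM-DD."""
    return '{}-{}-{}'.format(date[0:4], date[4:6], date[6:8])
```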
# ========= REGISTRATION =========
app.register_blueprint(hiddenServices, url_prefix=baseUrl)


@ -0,0 +1,477 @@
<!DOCTYPE html>
<html>
<head>
<title>AIL-Framework</title>
<link rel="icon" href="{{ url_for('static', filename='image/ail-icon.png')}}">
<!-- Core CSS -->
<link href="{{ url_for('static', filename='css/bootstrap4.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/font-awesome.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/daterangepicker.min.css') }}" rel="stylesheet">
<!-- JS -->
<script src="{{ url_for('static', filename='js/jquery.js')}}"></script>
<script src="{{ url_for('static', filename='js/popper.min.js')}}"></script>
<script src="{{ url_for('static', filename='js/bootstrap4.min.js')}}"></script>
<script language="javascript" src="{{ url_for('static', filename='js/moment.min.js') }}"></script>
<script language="javascript" src="{{ url_for('static', filename='js/jquery.daterangepicker.min.js') }}"></script>
<script language="javascript" src="{{ url_for('static', filename='js/d3.min.js') }}"></script>
<style>
.bar {
fill: steelblue;
}
.bar:hover{
fill: brown;
cursor: pointer;
}
.bar_stack:hover{
cursor: pointer;
}
.popover{
max-width: 100%;
}
</style>
</head>
<body>
{% include 'nav_bar.html' %}
<div class="container-fluid">
<div class="row">
{% include 'crawler/menu_sidebar.html' %}
<div class="col-12 col-lg-10" id="core_content">
<div class="row">
<div class="col-12 col-xl-6">
<div class="table-responsive mt-1 table-hover table-borderless table-striped">
<table class="table">
<thead class="thead-dark">
<tr>
<th>Domain</th>
<th>First Seen</th>
<th>Last Check</th>
<th>Status</th>
</tr>
</thead>
<tbody id="tbody_last_crawled">
{% for metadata_domain in last_domains %}
<tr data-toggle="popover" data-trigger="hover"
title="<span class='badge badge-dark'>{{metadata_domain['domain']}}</span>"
data-content="port: <span class='badge badge-secondary'>{{metadata_domain['port']}}</span><br>
epoch: {{metadata_domain['epoch']}}">
<td><a target="_blank" href="{{ url_for('hiddenServices.show_domain') }}?domain={{ metadata_domain['domain'] }}&port={{metadata_domain['port']}}&epoch={{metadata_domain['epoch']}}">{{ metadata_domain['domain_name'] }}</a></td>
<td>{{'{}/{}/{}'.format(metadata_domain['first_seen'][0:4], metadata_domain['first_seen'][4:6], metadata_domain['first_seen'][6:8])}}</td>
<td>{{'{}/{}/{}'.format(metadata_domain['last_check'][0:4], metadata_domain['last_check'][4:6], metadata_domain['last_check'][6:8])}}</td>
<td><div style="color:{{metadata_domain['status_color']}}; display:inline-block">
<i class="fas {{metadata_domain['status_icon']}} "></i>
{{metadata_domain['status_text']}}
</div>
</td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
<a href="{{ url_for('hiddenServices.blacklisted_domains') }}?type={{type}}">
<button type="button" class="btn btn-outline-danger">Show Blacklisted {{type_name}}s</button>
</a>
</div>
<div class="col-12 col-xl-6">
<div class="card text-white bg-dark mb-3 mt-1">
<div class="card-header">
<div class="row">
<div class="col-6">
<span class="badge badge-success">{{ statDomains['domains_up'] }}</span> UP
<span class="badge badge-danger ml-md-3">{{ statDomains['domains_down'] }}</span> DOWN
</div>
<div class="col-6">
<span class="badge badge-success">{{ statDomains['total'] }}</span> Crawled
<span class="badge badge-warning ml-md-3">{{ statDomains['domains_queue'] }}</span> Queue
</div>
</div>
</div>
<div class="card-body">
<h5 class="card-title">Select domains by date range:</h5>
<p class="card-text">List the domains crawled during the selected date range, filtered by status, with optional tags.</p>
<form action="{{ url_for('hiddenServices.get_onions_by_daterange') }}" id="hash_selector_form" method='post'>
<div class="row">
<div class="col-6">
<div class="input-group" id="date-range-from">
<div class="input-group-prepend"><span class="input-group-text"><i class="far fa-calendar-alt" aria-hidden="true"></i></span></div>
<input class="form-control" id="date-range-from-input" placeholder="yyyy-mm-dd" value="{{ date_from }}" name="date_from" autocomplete="off">
</div>
<div class="input-group" id="date-range-to">
<div class="input-group-prepend"><span class="input-group-text"><i class="far fa-calendar-alt" aria-hidden="true"></i></span></div>
<input class="form-control" id="date-range-to-input" placeholder="yyyy-mm-dd" value="{{ date_to }}" name="date_to" autocomplete="off">
</div>
</div>
<div class="col-6">
<div class="custom-control custom-switch">
<input class="custom-control-input" type="checkbox" name="domains_up" value="True" id="domains_up_id" checked>
<label class="custom-control-label" for="domains_up_id">
<span class="badge badge-success"><i class="fas fa-check-circle"></i> Domains UP </span>
</label>
</div>
<div class="custom-control custom-switch">
<input class="custom-control-input" type="checkbox" name="domains_down" value="True" id="domains_down_id">
<label class="custom-control-label" for="domains_down_id">
<span class="badge badge-danger"><i class="fas fa-times-circle"></i> Domains DOWN</span>
</label>
</div>
<div class="custom-control custom-switch mt-2">
<input class="custom-control-input" type="checkbox" name="domains_tags" value="True" id="domains_tags_id">
<label class="custom-control-label" for="domains_tags_id">
<span class="badge badge-dark"><i class="fas fa-tags"></i> Domains Tags</span>
</label>
</div>
</div>
</div>
<button class="btn btn-primary">
<i class="fas fa-eye"></i> Show {{type_name}}s
</button>
</form>
</div>
</div>
<div id="barchart_type">
</div>
<div class="card mt-1 mb-1">
<div class="card-header text-white bg-dark">
Crawlers Status
</div>
<div class="card-body px-0 py-0 ">
<table class="table">
<tbody id="tbody_crawler_info">
{% for crawler in crawler_metadata %}
<tr>
<td>
<i class="fas fa-{%if crawler['status']%}check{%else%}times{%endif%}-circle" style="color:{%if crawler['status']%}Green{%else%}Red{%endif%};"></i> {{crawler['crawler_info']}}
</td>
<td>
{{crawler['crawling_domain']}}
</td>
<td style="color:{%if crawler['status']%}Green{%else%}Red{%endif%};">
{{crawler['status_info']}}
</td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</body>
<script>
var chart = {};
$(document).ready(function(){
$("#page-Crawler").addClass("active");
$("#nav_{{type}}_crawler").addClass("active");
$('#date-range-from').dateRangePicker({
separator : ' to ',
getValue: function(){
if ($('#date-range-from-input').val() && $('#date-range-to-input').val() )
return $('#date-range-from-input').val() + ' to ' + $('#date-range-to-input').val();
else
return '';
},
setValue: function(s,s1,s2){
$('#date-range-from-input').val(s1);
$('#date-range-to-input').val(s2);
}
});
$('#date-range-to').dateRangePicker({
separator : ' to ',
getValue: function(){
if ($('#date-range-from-input').val() && $('#date-range-to-input').val() )
return $('#date-range-from-input').val() + ' to ' + $('#date-range-to-input').val();
else
return '';
},
setValue: function(s,s1,s2){
$('#date-range-from-input').val(s1);
$('#date-range-to-input').val(s2);
}
});
chart.stackBarChart = barchart_type_stack("{{ url_for('hiddenServices.domain_crawled_by_type_json') }}?type={{type}}", 'id');
chart.onResize();
$(window).on("resize", function() {
chart.onResize();
});
$('[data-toggle="popover"]').popover({
placement: 'top',
container: 'body',
html : true,
});
});
function toggle_sidebar(){
if($('#nav_menu').is(':visible')){
$('#nav_menu').hide();
$('#side_menu').removeClass('border-right')
$('#side_menu').removeClass('col-lg-2')
$('#core_content').removeClass('col-lg-10')
}else{
$('#nav_menu').show();
$('#side_menu').addClass('border-right')
$('#side_menu').addClass('col-lg-2')
$('#core_content').addClass('col-lg-10')
}
}
</script>
<script>/*
function refresh_list_crawled(){
$.getJSON("{{ url_for('hiddenServices.last_crawled_domains_with_stats_json') }}",
function(data) {
var tableRef = document.getElementById('tbody_last_crawled');
$("#tbody_last_crawled").empty()
for (var i = 0; i < data.last_domains.length; i++) {
var data_domain = data.last_domains[i]
var newRow = tableRef.insertRow(tableRef.rows.length);
var newCell = newRow.insertCell(0);
newCell.innerHTML = "<td><a target=\"_blank\" href=\"{{ url_for('hiddenServices.show_domain') }}?onion_domain="+data_domain['domain']+"\">"+data_domain['domain']+"</a></td>";
newCell = newRow.insertCell(1);
newCell.innerHTML = "<td>"+data_domain['first_seen'].substr(0, 4)+"/"+data_domain['first_seen'].substr(4, 2)+"/"+data_domain['first_seen'].substr(6, 2)+"</td>"
newCell = newRow.insertCell(2);
newCell.innerHTML = "<td>"+data_domain['last_check'].substr(0, 4)+"/"+data_domain['last_check'].substr(4, 2)+"/"+data_domain['last_check'].substr(6, 2)+"</td>"
newCell = newRow.insertCell(3);
newCell.innerHTML = "<td><div style=\"color:"+data_domain['status_color']+"; display:inline-block\"><i class=\"fa "+data_domain['status_icon']+" fa-2x\"></i>"+data_domain['status_text']+"</div></td>"
}
var statDomains = data.statDomains
document.getElementById('text_domain_up').innerHTML = statDomains['domains_up']
document.getElementById('text_domain_down').innerHTML = statDomains['domains_down']
document.getElementById('text_domain_queue').innerHTML = statDomains['domains_queue']
document.getElementById('text_total_domains').innerHTML = statDomains['total']
if(data.crawler_metadata.length!=0){
$("#tbody_crawler_info").empty();
var tableRef = document.getElementById('tbody_crawler_info');
for (var i = 0; i < data.crawler_metadata.length; i++) {
var crawler = data.crawler_metadata[i];
var newRow = tableRef.insertRow(tableRef.rows.length);
var text_color;
var icon;
if(crawler['status']){
text_color = 'Green';
icon = 'check';
} else {
text_color = 'Red';
icon = 'times';
}
var newCell = newRow.insertCell(0);
newCell.innerHTML = "<td><i class=\"fa fa-"+icon+"-circle\" style=\"color:"+text_color+";\"></i>"+crawler['crawler_info']+"</td>";
newCell = newRow.insertCell(1);
newCell.innerHTML = "<td><a target=\"_blank\" href=\"{{ url_for('hiddenServices.show_domain') }}?onion_domain="+crawler['crawling_domain']+"\">"+crawler['crawling_domain']+"</a></td>";
newCell = newRow.insertCell(2);
newCell.innerHTML = "<td><div style=\"color:"+text_color+";\">"+crawler['status_info']+"</div></td>";
$("#panel_crawler").show();
}
} else {
$("#panel_crawler").hide();
}
}
);
if (to_refresh) {
setTimeout("refresh_list_crawled()", 10000);
}
}*/
</script>
<script>
var margin = {top: 20, right: 90, bottom: 55, left: 0},
    width = 1000 - margin.left - margin.right,
    height = 500 - margin.top - margin.bottom;
var x = d3.scaleBand().rangeRound([0, width]).padding(0.1);
var y = d3.scaleLinear().rangeRound([height, 0]);
var xAxis = d3.axisBottom(x);
var yAxis = d3.axisLeft(y);
var color = d3.scaleOrdinal(d3.schemeSet3);
var svg = d3.select("#barchart_type").append("svg")
.attr("id", "thesvg")
.attr("viewBox", "0 0 "+width+" 500")
.attr("width", width + margin.left + margin.right)
.attr("height", height + margin.top + margin.bottom)
.append("g")
.attr("transform", "translate(" + margin.left + "," + margin.top + ")");
function barchart_type_stack(url, id) {
d3.json(url)
.then(function(data){
var labelVar = 'date'; //A
var varNames = d3.keys(data[0])
.filter(function (key) { return key !== labelVar;}); //B
data.forEach(function (d) { //D
var y0 = 0;
d.mapping = varNames.map(function (name) {
return {
name: name,
label: d[labelVar],
y0: y0,
y1: y0 += +d[name]
};
});
d.total = d.mapping[d.mapping.length - 1].y1;
});
x.domain(data.map(function (d) { return (d.date); })); //E
y.domain([0, d3.max(data, function (d) { return d.total; })]);
svg.append("g")
.attr("class", "x axis")
.attr("transform", "translate(0," + height + ")")
.call(xAxis)
.selectAll("text")
.attr("class", "bar")
.on("click", function (d) { window.location.href = "#" })
.attr("transform", "rotate(-18)" )
//.attr("transform", "rotate(-40)" )
.style("text-anchor", "end");
svg.append("g")
.attr("class", "y axis")
.call(yAxis)
.append("text")
.attr("transform", "rotate(-90)")
.attr("y", 6)
.attr("dy", ".71em")
.style("text-anchor", "end");
var selection = svg.selectAll(".series")
.data(data)
.enter().append("g")
.attr("class", "series")
.attr("transform", function (d) { return "translate(" + x((d.date)) + ",0)"; });
selection.selectAll("rect")
.data(function (d) { return d.mapping; })
.enter().append("rect")
.attr("class", "bar_stack")
.attr("width", x.bandwidth())
.attr("y", function (d) { return y(d.y1); })
.attr("height", function (d) { return y(d.y0) - y(d.y1); })
.style("fill", function (d) { return color(d.name); })
.style("stroke", "grey")
.on("mouseover", function (d) { showPopover.call(this, d); })
.on("mouseout", function (d) { removePopovers(); })
.on("click", function(d){ window.location.href = "#" });
data.forEach(function(d) {
if(d.total != 0){
svg.append("text")
.attr("class", "bar")
.attr("dy", "-.35em")
.attr('x', x(d.date) + x.bandwidth()/2)
.attr('y', y(d.total))
.on("click", function () {window.location.href = "#" })
.style("text-anchor", "middle")
.text(d.total);
}
});
drawLegend(varNames);
});
}
function drawLegend (varNames) {
var legend = svg.selectAll(".legend")
.data(varNames.slice().reverse())
.enter().append("g")
.attr("class", "legend")
.attr("transform", function (d, i) { return "translate(0," + i * 20 + ")"; });
legend.append("rect")
.attr("x", 943)
.attr("width", 10)
.attr("height", 10)
.style("fill", color)
.style("stroke", "grey");
legend.append("text")
.attr("class", "svgText")
.attr("x", 941)
.attr("y", 6)
.attr("dy", ".35em")
.style("text-anchor", "end")
.text(function (d) { return d; });
}
function removePopovers () {
$('.popover').each(function() {
$(this).remove();
});
}
function showPopover (d) {
$(this).popover({
title: d.name,
placement: 'top',
container: 'body',
trigger: 'manual',
html : true,
content: function() {
return d.label +
"<br/>num: " + d3.format(",")(d.value ? d.value: d.y1 - d.y0); }
});
$(this).popover('show')
}
chart.onResize = function () {
var aspect = width / height, chart = $("#thesvg");
var targetWidth = chart.parent().width();
chart.attr("width", targetWidth);
chart.attr("height", targetWidth / 2);
}
window.chart = chart;
</script>

View file

@@ -0,0 +1,164 @@
<!DOCTYPE html>
<html>
<head>
<title>AIL-Framework</title>
<link rel="icon" href="{{ url_for('static', filename='image/ail-icon.png')}}">
<!-- Core CSS -->
<link href="{{ url_for('static', filename='css/bootstrap4.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/font-awesome.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/daterangepicker.min.css') }}" rel="stylesheet">
<!-- JS -->
<script src="{{ url_for('static', filename='js/jquery.js')}}"></script>
<script src="{{ url_for('static', filename='js/popper.min.js')}}"></script>
<script src="{{ url_for('static', filename='js/bootstrap4.min.js')}}"></script>
<script language="javascript" src="{{ url_for('static', filename='js/moment.min.js') }}"></script>
<script language="javascript" src="{{ url_for('static', filename='js/jquery.daterangepicker.min.js') }}"></script>
<script language="javascript" src="{{ url_for('static', filename='js/d3.min.js') }}"></script>
</head>
<body>
{% include 'nav_bar.html' %}
<div class="container-fluid">
<div class="row">
{% include 'crawler/menu_sidebar.html' %}
<div class="col-12 col-lg-10" id="core_content">
<div class="card text-white bg-dark mb-3 mt-1">
<div class="card-header">
<h5 class="card-title">Crawl a Domain</h5>
</div>
<div class="card-body">
<p class="card-text">Enter a domain and choose what kind of data you want.</p>
<form action="{{ url_for('hiddenServices.create_spider_splash') }}" method='post'>
<div class="row">
<div class="col-12 col-lg-6">
<div class="input-group" id="date-range-from">
<input type="text" class="form-control" id="url_to_crawl" name="url_to_crawl" placeholder="Address or Domain">
</div>
<div class="d-flex mt-1">
<i class="fas fa-user-ninja mt-1"></i> &nbsp;Manual&nbsp;&nbsp;
<div class="custom-control custom-switch">
<input class="custom-control-input" type="checkbox" name="crawler_type" value="True" id="crawler_type">
<label class="custom-control-label" for="crawler_type">
<i class="fas fa-clock"></i> &nbsp;Automatic
</label>
</div>
</div>
<div class="input-group mt-2 mb-2" id="crawler_epoch_input">
<div class="input-group-prepend">
<span class="input-group-text bg-light"><i class="fas fa-clock"></i>&nbsp;</span>
</div>
<input class="form-control" type="number" id="crawler_epoch" value="3600" min="1" name="crawler_epoch" required>
<div class="input-group-append">
<span class="input-group-text">Time (seconds) between each crawling</span>
</div>
</div>
</div>
<div class="col-12 col-lg-6 mt-2 mt-lg-0">
<div class="row">
<div class="col-12 col-xl-6">
<div class="custom-control custom-switch">
<input class="custom-control-input" type="checkbox" name="html_content" value="True" id="html_content_id" checked disabled>
<label class="custom-control-label" for="html_content_id">
<i class="fab fa-html5"></i> &nbsp;HTML
</label>
</div>
<div class="custom-control custom-switch mt-1">
<input class="custom-control-input" type="checkbox" name="screenshot" value="True" id="screenshot_id">
<label class="custom-control-label" for="screenshot_id">
<i class="fas fa-image"></i> Screenshot
</label>
</div>
<div class="custom-control custom-switch mt-1">
<input class="custom-control-input" type="checkbox" name="har" value="True" id="har_id">
<label class="custom-control-label" for="har_id">
<i class="fas fa-file"></i> &nbsp;HAR
</label>
</div>
</div>
<div class="col-12 col-xl-6">
<div class="input-group form-group mb-0">
<div class="input-group-prepend">
<span class="input-group-text bg-light"><i class="fas fa-water"></i></span>
</div>
<input class="form-control" type="number" id="depth_limit" name="depth_limit" min="0" value="0" required>
<div class="input-group-append">
<span class="input-group-text">Depth Limit</span>
</div>
</div>
<div class="input-group mt-2">
<div class="input-group-prepend">
<span class="input-group-text bg-light"><i class="fas fa-copy"></i>&nbsp;</span>
</div>
<input class="form-control" type="number" id="max_pages" name="max_pages" min="1" value="1" required>
<div class="input-group-append">
<span class="input-group-text">Max Pages</span>
</div>
</div>
</div>
</div>
</div>
</div>
<button class="btn btn-primary mt-2">
<i class="fas fa-spider"></i> Send to Spider
</button>
</form>
</div>
</div>
</div>
</div>
</div>
</body>
<script>
var chart = {};
$(document).ready(function(){
$("#page-Crawler").addClass("active");
$("#nav_manual_crawler").addClass("active");
manual_crawler_input_controler();
$('#crawler_type').change(function () {
manual_crawler_input_controler();
});
});
function toggle_sidebar(){
if($('#nav_menu').is(':visible')){
$('#nav_menu').hide();
$('#side_menu').removeClass('border-right')
$('#side_menu').removeClass('col-lg-2')
$('#core_content').removeClass('col-lg-10')
}else{
$('#nav_menu').show();
$('#side_menu').addClass('border-right')
$('#side_menu').addClass('col-lg-2')
$('#core_content').addClass('col-lg-10')
}
}
function manual_crawler_input_controler() {
if($('#crawler_type').is(':checked')){
$("#crawler_epoch_input").show();
}else{
$("#crawler_epoch_input").hide();
}
}
</script>

View file

@@ -0,0 +1,476 @@
<!DOCTYPE html>
<html>
<head>
<title>AIL-Framework</title>
<link rel="icon" href="{{ url_for('static', filename='image/ail-icon.png')}}">
<!-- Core CSS -->
<link href="{{ url_for('static', filename='css/bootstrap4.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/font-awesome.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/daterangepicker.min.css') }}" rel="stylesheet">
<!-- JS -->
<script src="{{ url_for('static', filename='js/jquery.js')}}"></script>
<script src="{{ url_for('static', filename='js/popper.min.js')}}"></script>
<script src="{{ url_for('static', filename='js/bootstrap4.min.js')}}"></script>
<script language="javascript" src="{{ url_for('static', filename='js/moment.min.js') }}"></script>
<script language="javascript" src="{{ url_for('static', filename='js/jquery.daterangepicker.min.js') }}"></script>
<script language="javascript" src="{{ url_for('static', filename='js/d3.min.js') }}"></script>
<style>
.bar {
fill: steelblue;
}
.bar:hover{
fill: brown;
cursor: pointer;
}
.bar_stack:hover{
cursor: pointer;
}
div.tooltip {
position: absolute;
text-align: center;
padding: 2px;
font: 12px sans-serif;
background: #ebf4fb;
border: 2px solid #b7ddf2;
border-radius: 8px;
pointer-events: none;
color: #000000;
}
</style>
</head>
<body>
{% include 'nav_bar.html' %}
<div class="container-fluid">
<div class="row">
{% include 'crawler/menu_sidebar.html' %}
<div class="col-12 col-lg-10" id="core_content">
<div class="row">
<div class="col-12 col-xl-6">
<div class="table-responsive mt-1 table-hover table-borderless table-striped">
<table class="table">
<thead class="thead-dark">
<tr>
<th>Domain</th>
<th>First Seen</th>
<th>Last Check</th>
<th>Status</th>
</tr>
</thead>
<tbody id="tbody_last_crawled">
{% for metadata_onion in last_onions %}
<tr>
<td><a target="_blank" href="{{ url_for('hiddenServices.onion_domain') }}?onion_domain={{ metadata_onion['domain'] }}">{{ metadata_onion['domain'] }}</a></td>
<td>{{'{}/{}/{}'.format(metadata_onion['first_seen'][0:4], metadata_onion['first_seen'][4:6], metadata_onion['first_seen'][6:8])}}</td>
<td>{{'{}/{}/{}'.format(metadata_onion['last_check'][0:4], metadata_onion['last_check'][4:6], metadata_onion['last_check'][6:8])}}</td>
<td><div style="color:{{metadata_onion['status_color']}}; display:inline-block">
<i class="fas {{metadata_onion['status_icon']}} "></i>
{{metadata_onion['status_text']}}
</div>
</td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
<a href="{{ url_for('hiddenServices.blacklisted_onion') }}">
<button type="button" class="btn btn-outline-danger">Show Blacklisted Onion</button>
</a>
</div>
<div class="col-12 col-xl-6">
<div class="card text-white bg-dark mb-3 mt-1">
<div class="card-header">
<div class="row">
<div class="col-6">
<span class="badge badge-success">{{ statDomains['domains_up'] }}</span> UP
<span class="badge badge-danger ml-md-3">{{ statDomains['domains_down'] }}</span> DOWN
</div>
<div class="col-6">
<span class="badge badge-success">{{ statDomains['total'] }}</span> Crawled
<span class="badge badge-warning ml-md-3">{{ statDomains['domains_queue'] }}</span> Queue
</div>
</div>
</div>
<div class="card-body">
<h5 class="card-title">Select domains by date range:</h5>
<p class="card-text">List the onion domains crawled during the selected date range, filtered by status, with optional tags.</p>
<form action="{{ url_for('hiddenServices.get_onions_by_daterange') }}" id="hash_selector_form" method='post'>
<div class="row">
<div class="col-6">
<div class="input-group" id="date-range-from">
<div class="input-group-prepend"><span class="input-group-text"><i class="far fa-calendar-alt" aria-hidden="true"></i></span></div>
<input class="form-control" id="date-range-from-input" placeholder="yyyy-mm-dd" value="{{ date_from }}" name="date_from">
</div>
<div class="input-group" id="date-range-to">
<div class="input-group-prepend"><span class="input-group-text"><i class="far fa-calendar-alt" aria-hidden="true"></i></span></div>
<input class="form-control" id="date-range-to-input" placeholder="yyyy-mm-dd" value="{{ date_to }}" name="date_to">
</div>
</div>
<div class="col-6">
<div class="custom-control custom-switch">
<input class="custom-control-input" type="checkbox" name="domains_up" value="True" id="domains_up_id" checked>
<label class="custom-control-label" for="domains_up_id">
<span class="badge badge-success"><i class="fas fa-check-circle"></i> Domains UP </span>
</label>
</div>
<div class="custom-control custom-switch">
<input class="custom-control-input" type="checkbox" name="domains_down" value="True" id="domains_down_id">
<label class="custom-control-label" for="domains_down_id">
<span class="badge badge-danger"><i class="fas fa-times-circle"></i> Domains DOWN</span>
</label>
</div>
<div class="custom-control custom-switch mt-2">
<input class="custom-control-input" type="checkbox" name="domains_tags" value="True" id="domains_tags_id">
<label class="custom-control-label" for="domains_tags_id">
<span class="badge badge-dark"><i class="fas fa-tags"></i> Domains Tags</span>
</label>
</div>
</div>
</div>
<button class="btn btn-primary">
<i class="fas fa-eye"></i> Show Onions
</button>
</form>
</div>
</div>
<div id="barchart_type">
</div>
<div class="card mt-1 mb-1">
<div class="card-header text-white bg-dark">
Crawlers Status
</div>
<div class="card-body px-0 py-0 ">
<table class="table">
<tbody id="tbody_crawler_info">
{% for crawler in crawler_metadata %}
<tr>
<td>
<i class="fas fa-{%if crawler['status']%}check{%else%}times{%endif%}-circle" style="color:{%if crawler['status']%}Green{%else%}Red{%endif%};"></i> {{crawler['crawler_info']}}
</td>
<td>
{{crawler['crawling_domain']}}
</td>
<td style="color:{%if crawler['status']%}Green{%else%}Red{%endif%};">
{{crawler['status_info']}}
</td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</body>
<script>
var chart = {};
$(document).ready(function(){
$("#page-Crawler").addClass("active");
$("#nav_onion_crawler").addClass("active");
$('#date-range-from').dateRangePicker({
separator : ' to ',
getValue: function(){
if ($('#date-range-from-input').val() && $('#date-range-to-input').val() )
return $('#date-range-from-input').val() + ' to ' + $('#date-range-to-input').val();
else
return '';
},
setValue: function(s,s1,s2){
$('#date-range-from-input').val(s1);
$('#date-range-to-input').val(s2);
}
});
$('#date-range-to').dateRangePicker({
separator : ' to ',
getValue: function(){
if ($('#date-range-from-input').val() && $('#date-range-to-input').val() )
return $('#date-range-from-input').val() + ' to ' + $('#date-range-to-input').val();
else
return '';
},
setValue: function(s,s1,s2){
$('#date-range-from-input').val(s1);
$('#date-range-to-input').val(s2);
}
});
chart.stackBarChart = barchart_type_stack("{{ url_for('hiddenServices.automatic_onion_crawler_json') }}", 'id');
chart.onResize();
$(window).on("resize", function() {
chart.onResize();
});
});
function toggle_sidebar(){
if($('#nav_menu').is(':visible')){
$('#nav_menu').hide();
$('#side_menu').removeClass('border-right')
$('#side_menu').removeClass('col-lg-2')
$('#core_content').removeClass('col-lg-10')
}else{
$('#nav_menu').show();
$('#side_menu').addClass('border-right')
$('#side_menu').addClass('col-lg-2')
$('#core_content').addClass('col-lg-10')
}
}
</script>
<script>/*
function refresh_list_crawled(){
$.getJSON("{{ url_for('hiddenServices.last_crawled_domains_with_stats_json') }}",
function(data) {
var tableRef = document.getElementById('tbody_last_crawled');
$("#tbody_last_crawled").empty()
for (var i = 0; i < data.last_onions.length; i++) {
var data_domain = data.last_onions[i]
var newRow = tableRef.insertRow(tableRef.rows.length);
var newCell = newRow.insertCell(0);
newCell.innerHTML = "<td><a target=\"_blank\" href=\"{{ url_for('hiddenServices.onion_domain') }}?onion_domain="+data_domain['domain']+"\">"+data_domain['domain']+"</a></td>";
newCell = newRow.insertCell(1);
newCell.innerHTML = "<td>"+data_domain['first_seen'].substr(0, 4)+"/"+data_domain['first_seen'].substr(4, 2)+"/"+data_domain['first_seen'].substr(6, 2)+"</td>"
newCell = newRow.insertCell(2);
newCell.innerHTML = "<td>"+data_domain['last_check'].substr(0, 4)+"/"+data_domain['last_check'].substr(4, 2)+"/"+data_domain['last_check'].substr(6, 2)+"</td>"
newCell = newRow.insertCell(3);
newCell.innerHTML = "<td><div style=\"color:"+data_domain['status_color']+"; display:inline-block\"><i class=\"fa "+data_domain['status_icon']+" fa-2x\"></i>"+data_domain['status_text']+"</div></td>"
}
var statDomains = data.statDomains
document.getElementById('text_domain_up').innerHTML = statDomains['domains_up']
document.getElementById('text_domain_down').innerHTML = statDomains['domains_down']
document.getElementById('text_domain_queue').innerHTML = statDomains['domains_queue']
document.getElementById('text_total_domains').innerHTML = statDomains['total']
if(data.crawler_metadata.length!=0){
$("#tbody_crawler_info").empty();
var tableRef = document.getElementById('tbody_crawler_info');
for (var i = 0; i < data.crawler_metadata.length; i++) {
var crawler = data.crawler_metadata[i];
var newRow = tableRef.insertRow(tableRef.rows.length);
var text_color;
var icon;
if(crawler['status']){
text_color = 'Green';
icon = 'check';
} else {
text_color = 'Red';
icon = 'times';
}
var newCell = newRow.insertCell(0);
newCell.innerHTML = "<td><i class=\"fa fa-"+icon+"-circle\" style=\"color:"+text_color+";\"></i>"+crawler['crawler_info']+"</td>";
newCell = newRow.insertCell(1);
newCell.innerHTML = "<td><a target=\"_blank\" href=\"{{ url_for('hiddenServices.onion_domain') }}?onion_domain="+crawler['crawling_domain']+"\">"+crawler['crawling_domain']+"</a></td>";
newCell = newRow.insertCell(2);
newCell.innerHTML = "<td><div style=\"color:"+text_color+";\">"+crawler['status_info']+"</div></td>";
$("#panel_crawler").show();
}
} else {
$("#panel_crawler").hide();
}
}
);
if (to_refresh) {
setTimeout("refresh_list_crawled()", 10000);
}
}*/
</script>
<script>
var margin = {top: 20, right: 90, bottom: 55, left: 0},
    width = 1000 - margin.left - margin.right,
    height = 500 - margin.top - margin.bottom;
var x = d3.scaleBand().rangeRound([0, width]).padding(0.1);
var y = d3.scaleLinear().rangeRound([height, 0]);
var xAxis = d3.axisBottom(x);
var yAxis = d3.axisLeft(y);
var color = d3.scaleOrdinal(d3.schemeSet3);
var svg = d3.select("#barchart_type").append("svg")
.attr("id", "thesvg")
.attr("viewBox", "0 0 "+width+" 500")
.attr("width", width + margin.left + margin.right)
.attr("height", height + margin.top + margin.bottom)
.append("g")
.attr("transform", "translate(" + margin.left + "," + margin.top + ")");
function barchart_type_stack(url, id) {
d3.json(url)
.then(function(data){
var labelVar = 'date'; //A
var varNames = d3.keys(data[0])
.filter(function (key) { return key !== labelVar;}); //B
data.forEach(function (d) { //D
var y0 = 0;
d.mapping = varNames.map(function (name) {
return {
name: name,
label: d[labelVar],
y0: y0,
y1: y0 += +d[name]
};
});
d.total = d.mapping[d.mapping.length - 1].y1;
});
x.domain(data.map(function (d) { return (d.date); })); //E
y.domain([0, d3.max(data, function (d) { return d.total; })]);
svg.append("g")
.attr("class", "x axis")
.attr("transform", "translate(0," + height + ")")
.call(xAxis)
.selectAll("text")
.attr("class", "bar")
.on("click", function (d) { window.location.href = "#" })
.attr("transform", "rotate(-18)" )
//.attr("transform", "rotate(-40)" )
.style("text-anchor", "end");
svg.append("g")
.attr("class", "y axis")
.call(yAxis)
.append("text")
.attr("transform", "rotate(-90)")
.attr("y", 6)
.attr("dy", ".71em")
.style("text-anchor", "end");
var selection = svg.selectAll(".series")
.data(data)
.enter().append("g")
.attr("class", "series")
.attr("transform", function (d) { return "translate(" + x((d.date)) + ",0)"; });
selection.selectAll("rect")
.data(function (d) { return d.mapping; })
.enter().append("rect")
.attr("class", "bar_stack")
.attr("width", x.bandwidth())
.attr("y", function (d) { return y(d.y1); })
.attr("height", function (d) { return y(d.y0) - y(d.y1); })
.style("fill", function (d) { return color(d.name); })
.style("stroke", "grey")
.on("mouseover", function (d) { showPopover.call(this, d); })
.on("mouseout", function (d) { removePopovers(); })
.on("click", function(d){ window.location.href = "#" });
data.forEach(function(d) {
if(d.total != 0){
svg.append("text")
.attr("class", "bar")
.attr("dy", "-.35em")
.attr('x', x(d.date) + x.bandwidth()/2)
.attr('y', y(d.total))
.on("click", function () {window.location.href = "#" })
.style("text-anchor", "middle")
.text(d.total);
}
});
drawLegend(varNames);
});
}
function drawLegend (varNames) {
var legend = svg.selectAll(".legend")
.data(varNames.slice().reverse())
.enter().append("g")
.attr("class", "legend")
.attr("transform", function (d, i) { return "translate(0," + i * 20 + ")"; });
legend.append("rect")
.attr("x", 943)
.attr("width", 10)
.attr("height", 10)
.style("fill", color)
.style("stroke", "grey");
legend.append("text")
.attr("class", "svgText")
.attr("x", 941)
.attr("y", 6)
.attr("dy", ".35em")
.style("text-anchor", "end")
.text(function (d) { return d; });
}
function removePopovers () {
$('.popover').each(function() {
$(this).remove();
});
}
function showPopover (d) {
$(this).popover({
title: d.name,
placement: 'top',
container: 'body',
trigger: 'manual',
html : true,
content: function() {
return d.label +
"<br/>num: " + d3.format(",")(d.value ? d.value: d.y1 - d.y0); }
});
$(this).popover('show')
}
chart.onResize = function () {
var aspect = width / height, chart = $("#thesvg");
var targetWidth = chart.parent().width();
chart.attr("width", targetWidth);
chart.attr("height", targetWidth / 2);
}
window.chart = chart;
</script>
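The stacked-bar chart above builds, for each data row, a `mapping` array in which every series segment carries a running total: `y0` is the cumulative sum before that series and `y1` the sum after it. A standalone sketch of that step (the helper name `stackRow` is hypothetical, not from the page):

```javascript
// Per-row stacking as done in the chart code above: each series name
// becomes a segment whose y0/y1 bracket the running total, and the
// row total is the last segment's y1.
function stackRow(row, varNames, labelVar) {
  var y0 = 0;
  var mapping = varNames.map(function (name) {
    var seg = { name: name, label: row[labelVar], y0: y0, y1: y0 + (+row[name]) };
    y0 = seg.y1; // advance the running total
    return seg;
  });
  return { mapping: mapping, total: y0 };
}
```

The `y1 - y0` difference is what the popover later shows as the segment's count.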


@ -0,0 +1,217 @@
<!DOCTYPE html>
<html>
<head>
<title>AIL-Framework</title>
<link rel="icon" href="{{ url_for('static', filename='image/ail-icon.png')}}">
<!-- Core CSS -->
<link href="{{ url_for('static', filename='css/bootstrap4.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/font-awesome.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/dataTables.bootstrap4.min.css') }}" rel="stylesheet">
<!-- JS -->
<script src="{{ url_for('static', filename='js/jquery.js')}}"></script>
<script src="{{ url_for('static', filename='js/bootstrap4.min.js')}}"></script>
<script src="{{ url_for('static', filename='js/jquery.dataTables.min.js')}}"></script>
<script src="{{ url_for('static', filename='js/dataTables.bootstrap.min.js')}}"></script>
</head>
<body>
{% include 'nav_bar.html' %}
<div class="container-fluid">
<div class="row">
{% include 'crawler/menu_sidebar.html' %}
<div class="col-12 col-lg-10" id="core_content">
{%if last_domains%}
<div class="table-responsive mt-1 mb-3 table-hover table-borderless table-striped">
<table class="table">
<thead class="thead-dark">
<tr>
<th>Domain</th>
<th>First Seen</th>
<th>Last Check</th>
<th>Status</th>
</tr>
</thead>
<tbody id="tbody_last_crawled">
{% for metadata_domain in last_domains %}
<tr>
<td><a target="_blank" href="{{ url_for('hiddenServices.show_domain') }}?domain={{ metadata_domain['domain'] }}&port={{metadata_domain['port']}}&epoch={{metadata_domain['epoch']}}">{{ metadata_domain['domain_name'] }}</a></td>
<td>{{'{}/{}/{}'.format(metadata_domain['first_seen'][0:4], metadata_domain['first_seen'][4:6], metadata_domain['first_seen'][6:8])}}</td>
<td>{{'{}/{}/{}'.format(metadata_domain['last_check'][0:4], metadata_domain['last_check'][4:6], metadata_domain['last_check'][6:8])}}</td>
<td><div style="color:{{metadata_domain['status_color']}}; display:inline-block">
<i class="fas {{metadata_domain['status_icon']}} "></i>
{{metadata_domain['status_text']}}
</div>
</td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
{%endif%}
<div class="row">
<div class="col-lg-6">
<div class="table-responsive mt-1 table-hover table-borderless table-striped">
<table class="table" id="myTable_1">
<thead class="thead-dark">
<tr>
<th>Onion Url</th>
<th></th>
<th>Next Check</th>
<th></th>
<th></th>
</tr>
</thead>
<tbody id="tbody_last_crawled">
{% for metadata_domain in auto_crawler_domain_onions_metadata %}
<tr>
<td><a target="_blank" href="{{ url_for('hiddenServices.show_domain') }}?domain={{ metadata_domain['domain'] }}&port={{metadata_domain['port']}}&epoch={{metadata_domain['epoch']}}">{{ metadata_domain['url'] }}</a></td>
<td><a class="btn btn-outline-danger px-1 py-0" href="{{ url_for('hiddenServices.remove_auto_crawler') }}?url={{ metadata_domain['url'] }}&page={{page}}">
<i class="fas fa-trash-alt"></i></a>
</td>
<td>{{metadata_domain['epoch']}}</td>
<td><div style="color:{{metadata_domain['status_color']}}; display:inline-block">
<i class="fas {{metadata_domain['status_icon']}} "></i>
{{metadata_domain['status_text']}}
</div>
</td>
<td>
<button class="btn btn-outline-secondary px-1 py-0 disabled"><i class="fas fa-pencil-alt"></i></button>
</td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
</div>
<div class="col-lg-6">
<div class="table-responsive mt-1 table-hover table-borderless table-striped">
<table class="table" id="myTable_2">
<thead class="thead-dark">
<tr>
<th>Regular Url</th>
<th></th>
<th>Next Check</th>
<th></th>
<th></th>
</tr>
</thead>
<tbody id="tbody_last_crawled">
{% for metadata_domain in auto_crawler_domain_regular_metadata %}
<tr>
<td><a target="_blank" href="{{ url_for('hiddenServices.show_domain') }}?domain={{ metadata_domain['domain'] }}&port={{metadata_domain['port']}}&epoch={{metadata_domain['epoch']}}">{{ metadata_domain['url'] }}</a></td>
<td><a class="btn btn-outline-danger px-1 py-0" href="{{ url_for('hiddenServices.remove_auto_crawler') }}?url={{ metadata_domain['url'] }}&page={{page}}">
<i class="fas fa-trash-alt"></i></a>
</td>
<td>{{metadata_domain['epoch']}}</td>
<td><div style="color:{{metadata_domain['status_color']}}; display:inline-block">
<i class="fas {{metadata_domain['status_icon']}} "></i>
{{metadata_domain['status_text']}}
</div>
</td>
<td>
<button class="btn btn-outline-secondary px-1 py-0 disabled"><i class="fas fa-pencil-alt"></i></button>
</td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
</div>
</div>
<hr class="mt-4">
<div class="d-flex justify-content-center">
<nav aria-label="...">
<ul class="pagination">
<li class="page-item {%if page==1%}disabled{%endif%}">
<a class="page-link" href="{{ url_for('hiddenServices.auto_crawler') }}?page={{page-1}}">Previous</a>
</li>
{%if page>3%}
<li class="page-item"><a class="page-link" href="{{ url_for('hiddenServices.auto_crawler') }}?page=1">1</a></li>
<li class="page-item disabled"><a class="page-link" aria-disabled="true" href="#">...</a></li>
<li class="page-item"><a class="page-link" href="{{ url_for('hiddenServices.auto_crawler') }}?page={{page-1}}">{{page-1}}</a></li>
<li class="page-item active"><a class="page-link" href="{{ url_for('hiddenServices.auto_crawler') }}?page={{page}}">{{page}}</a></li>
{%else%}
{%if page>2%}<li class="page-item"><a class="page-link" href="{{ url_for('hiddenServices.auto_crawler') }}?page={{page-2}}">{{page-2}}</a></li>{%endif%}
{%if page>1%}<li class="page-item"><a class="page-link" href="{{ url_for('hiddenServices.auto_crawler') }}?page={{page-1}}">{{page-1}}</a></li>{%endif%}
<li class="page-item active"><a class="page-link" href="{{ url_for('hiddenServices.auto_crawler') }}?page={{page}}">{{page}}</a></li>
{%endif%}
{%if nb_page_max-page>3%}
<li class="page-item"><a class="page-link" href="{{ url_for('hiddenServices.auto_crawler') }}?page={{page+1}}">{{page+1}}</a></li>
<li class="page-item disabled"><a class="page-link" aria-disabled="true" href="#">...</a></li>
<li class="page-item"><a class="page-link" href="{{ url_for('hiddenServices.auto_crawler') }}?page={{nb_page_max}}">{{nb_page_max}}</a></li>
{%else%}
{%if nb_page_max-page>2%}<li class="page-item"><a class="page-link" href="{{ url_for('hiddenServices.auto_crawler') }}?page={{nb_page_max-2}}">{{nb_page_max-2}}</a></li>{%endif%}
{%if nb_page_max-page>1%}<li class="page-item"><a class="page-link" href="{{ url_for('hiddenServices.auto_crawler') }}?page={{nb_page_max-1}}">{{nb_page_max-1}}</a></li>{%endif%}
{%if nb_page_max-page>0%}<li class="page-item"><a class="page-link" href="{{ url_for('hiddenServices.auto_crawler') }}?page={{nb_page_max}}">{{nb_page_max}}</a></li>{%endif%}
{%endif%}
<li class="page-item {%if page==nb_page_max%}disabled{%endif%}">
<a class="page-link" href="{{ url_for('hiddenServices.auto_crawler') }}?page={{page+1}}" aria-disabled="true">Next</a>
</li>
</ul>
</nav>
</div>
</div>
</div>
</div>
</body>
<script>
$(document).ready(function(){
$("#page-Crawler").addClass("active");
$("#nav_auto_crawler").addClass("active");
table1 = $('#myTable_1').DataTable(
{
//"aLengthMenu": [[5, 10, 15, 20, -1], [5, 10, 15, 20, "All"]],
//"iDisplayLength": 5,
//"order": [[ 0, "desc" ]]
columnDefs: [
{ orderable: false, targets: [-1, -4] }
]
});
table2 = $('#myTable_2').DataTable(
{
//"aLengthMenu": [[5, 10, 15, 20, -1], [5, 10, 15, 20, "All"]],
//"iDisplayLength": 5,
//"order": [[ 0, "desc" ]]
columnDefs: [
{ orderable: false, targets: [-1, -4] }
]
});
});
function toggle_sidebar(){
if($('#nav_menu').is(':visible')){
$('#nav_menu').hide();
$('#side_menu').removeClass('border-right')
$('#side_menu').removeClass('col-lg-2')
$('#core_content').removeClass('col-lg-10')
}else{
$('#nav_menu').show();
$('#side_menu').addClass('border-right')
$('#side_menu').addClass('col-lg-2')
$('#core_content').addClass('col-lg-10')
}
}
</script>
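The Jinja pagination block in this template (and the identical one on the blacklisted-domains page) renders a sliding window around the current page: up to two neighbours on each side, with a leading "1 ..." when the page is far from the start and a trailing "... last" when far from the end. A minimal sketch of the same conditionals in plain JavaScript (`pageWindow` is an assumed name, not in the source):

```javascript
// Mirrors the template's {%if page>3%} / {%if nb_page_max-page>3%}
// branches: returns the sequence of page entries to render, with
// '...' standing in for the disabled ellipsis items.
function pageWindow(page, nbPageMax) {
  var items = [];
  if (page > 3) {
    items.push(1, '...', page - 1, page);
  } else {
    if (page > 2) items.push(page - 2);
    if (page > 1) items.push(page - 1);
    items.push(page);
  }
  if (nbPageMax - page > 3) {
    items.push(page + 1, '...', nbPageMax);
  } else {
    if (nbPageMax - page > 2) items.push(nbPageMax - 2);
    if (nbPageMax - page > 1) items.push(nbPageMax - 1);
    if (nbPageMax - page > 0) items.push(nbPageMax);
  }
  return items;
}
```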


@ -0,0 +1,222 @@
<!DOCTYPE html>
<html>
<head>
<title>AIL-Framework</title>
<link rel="icon" href="{{ url_for('static', filename='image/ail-icon.png')}}">
<!-- Core CSS -->
<link href="{{ url_for('static', filename='css/bootstrap4.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/font-awesome.min.css') }}" rel="stylesheet">
<!-- JS -->
<script src="{{ url_for('static', filename='js/jquery.js')}}"></script>
<script src="{{ url_for('static', filename='js/bootstrap4.min.js')}}"></script>
</head>
<body>
{% include 'nav_bar.html' %}
<div class="container-fluid">
<div class="row">
{% include 'crawler/menu_sidebar.html' %}
<div class="col-12 col-lg-10" id="core_content">
<div class="row">
<div class="col-xl-6">
<div class="card mt-1 mb-1">
<div class="card-header text-white bg-dark">
<h5><a class="text-info" href="{{ url_for('hiddenServices.Crawler_Splash_last_by_type')}}?type=onion"><i class="fas fa-user-secret"></i> Onions Crawlers</a></h5>
<div class="row">
<div class="col-6">
<span class="badge badge-success" id="stat_onion_domain_up">{{ statDomains_onion['domains_up'] }}</span> UP
<span class="badge badge-danger ml-md-3" id="stat_onion_domain_down">{{ statDomains_onion['domains_down'] }}</span> DOWN
</div>
<div class="col-6">
<span class="badge badge-success" id="stat_onion_total">{{ statDomains_onion['total'] }}</span> Crawled
<span class="badge badge-warning ml-md-3" id="stat_onion_queue">{{ statDomains_onion['domains_queue'] }}</span> Queue
</div>
</div>
</div>
<div class="card-body px-0 py-0 ">
<table class="table">
<tbody id="tbody_crawler_onion_info">
{% for crawler in crawler_metadata_onion %}
<tr>
<td>
<i class="fas fa-{%if crawler['status']%}check{%else%}times{%endif%}-circle" style="color:{%if crawler['status']%}Green{%else%}Red{%endif%};"></i> {{crawler['crawler_info']}}
</td>
<td>
{{crawler['crawling_domain']}}
</td>
<td style="color:{%if crawler['status']%}Green{%else%}Red{%endif%};">
{{crawler['status_info']}}
</td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
</div>
</div>
<div class="col-xl-6">
<div class="card mt-1 mb-1">
<div class="card-header text-white bg-dark">
<h5><a class="text-info" href="{{ url_for('hiddenServices.Crawler_Splash_last_by_type')}}?type=regular"><i class="fab fa-html5"></i> Regular Crawlers</a></h5>
<div class="row">
<div class="col-6">
<span class="badge badge-success" id="stat_regular_domain_up">{{ statDomains_regular['domains_up'] }}</span> UP
<span class="badge badge-danger ml-md-3" id="stat_regular_domain_down">{{ statDomains_regular['domains_down'] }}</span> DOWN
</div>
<div class="col-6">
<span class="badge badge-success" id="stat_regular_total">{{ statDomains_regular['total'] }}</span> Crawled
<span class="badge badge-warning ml-md-3" id="stat_regular_queue">{{ statDomains_regular['domains_queue'] }}</span> Queue
</div>
</div>
</div>
<div class="card-body px-0 py-0 ">
<table class="table">
<tbody id="tbody_crawler_regular_info">
{% for crawler in crawler_metadata_regular %}
<tr>
<td>
<i class="fas fa-{%if crawler['status']%}check{%else%}times{%endif%}-circle" style="color:{%if crawler['status']%}Green{%else%}Red{%endif%};"></i> {{crawler['crawler_info']}}
</td>
<td>
{{crawler['crawling_domain']}}
</td>
<td style="color:{%if crawler['status']%}Green{%else%}Red{%endif%};">
{{crawler['status_info']}}
</td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</body>
<script>
var to_refresh = false
$(document).ready(function(){
$("#page-Crawler").addClass("active");
$("#nav_dashboard").addClass("active");
$( window ).focus(function() {
to_refresh = true
refresh_crawler_status();
});
$( window ).blur(function() {
to_refresh = false
});
to_refresh = true
refresh_crawler_status();
});
function toggle_sidebar(){
if($('#nav_menu').is(':visible')){
$('#nav_menu').hide();
$('#side_menu').removeClass('border-right')
$('#side_menu').removeClass('col-lg-2')
$('#core_content').removeClass('col-lg-10')
}else{
$('#nav_menu').show();
$('#side_menu').addClass('border-right')
$('#side_menu').addClass('col-lg-2')
$('#core_content').addClass('col-lg-10')
}
}
function refresh_crawler_status(){
$.getJSON("{{ url_for('hiddenServices.crawler_dashboard_json') }}",
function(data) {
$('#stat_onion_domain_up').text(data.statDomains_onion['domains_up']);
$('#stat_onion_domain_down').text(data.statDomains_onion['domains_down']);
$('#stat_onion_total').text(data.statDomains_onion['total']);
$('#stat_onion_queue').text(data.statDomains_onion['domains_queue']);
$('#stat_regular_domain_up').text(data.statDomains_regular['domains_up']);
$('#stat_regular_domain_down').text(data.statDomains_regular['domains_down']);
$('#stat_regular_total').text(data.statDomains_regular['total']);
$('#stat_regular_queue').text(data.statDomains_regular['domains_queue']);
if(data.crawler_metadata_onion.length!=0){
$("#tbody_crawler_onion_info").empty();
var tableRef = document.getElementById('tbody_crawler_onion_info');
for (var i = 0; i < data.crawler_metadata_onion.length; i++) {
var crawler = data.crawler_metadata_onion[i];
var newRow = tableRef.insertRow(tableRef.rows.length);
var text_color;
var icon;
if(crawler['status']){
text_color = 'Green';
icon = 'check';
} else {
text_color = 'Red';
icon = 'times';
}
var newCell = newRow.insertCell(0);
newCell.innerHTML = "<td><i class=\"fas fa-"+icon+"-circle\" style=\"color:"+text_color+";\"></i> "+crawler['crawler_info']+"</td>";
newCell = newRow.insertCell(1);
newCell.innerHTML = "<td>"+crawler['crawling_domain']+"</td>";
newCell = newRow.insertCell(2);
newCell.innerHTML = "<td><div style=\"color:"+text_color+";\">"+crawler['status_info']+"</div></td>";
//$("#panel_crawler").show();
}
}
if(data.crawler_metadata_regular.length!=0){
$("#tbody_crawler_regular_info").empty();
var tableRef = document.getElementById('tbody_crawler_regular_info');
for (var i = 0; i < data.crawler_metadata_regular.length; i++) {
var crawler = data.crawler_metadata_regular[i];
var newRow = tableRef.insertRow(tableRef.rows.length);
var text_color;
var icon;
if(crawler['status']){
text_color = 'Green';
icon = 'check';
} else {
text_color = 'Red';
icon = 'times';
}
var newCell = newRow.insertCell(0);
newCell.innerHTML = "<td><i class=\"fas fa-"+icon+"-circle\" style=\"color:"+text_color+";\"></i> "+crawler['crawler_info']+"</td>";
newCell = newRow.insertCell(1);
newCell.innerHTML = "<td>"+crawler['crawling_domain']+"</td>";
newCell = newRow.insertCell(2);
newCell.innerHTML = "<td><div style=\"color:"+text_color+";\">"+crawler['status_info']+"</div></td>";
//$("#panel_crawler").show();
}
}
}
);
if (to_refresh) {
setTimeout("refresh_crawler_status()", 10000);
}
}
</script>
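In `refresh_crawler_status` above, each crawler's boolean `status` drives two presentation choices at once: the text colour and the Font Awesome icon suffix (`fa-check-circle` vs `fa-times-circle`). A small sketch of that mapping as a pure helper (the function name is hypothetical):

```javascript
// status === true  -> green check icon
// status === false -> red times icon
function crawlerStatusStyle(status) {
  return status
    ? { color: 'Green', icon: 'check' }
    : { color: 'Red', icon: 'times' };
}
```

Extracting this kind of mapping into one helper would also remove the duplication between the onion and regular loops in the refresh function.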


@ -0,0 +1,66 @@
<!DOCTYPE html>
<html>
<head>
<title>AIL-Framework</title>
<link rel="icon" href="{{ url_for('static', filename='image/ail-icon.png')}}">
<!-- Core CSS -->
<link href="{{ url_for('static', filename='css/bootstrap4.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/font-awesome.min.css') }}" rel="stylesheet">
<!-- JS -->
<script src="{{ url_for('static', filename='js/jquery.js')}}"></script>
<script src="{{ url_for('static', filename='js/bootstrap4.min.js')}}"></script>
</head>
<body>
{% include 'nav_bar.html' %}
<div class="container-fluid">
<div class="row">
{% include 'crawler/menu_sidebar.html' %}
<div class="col-12 col-lg-10" id="core_content">
<div>
<pre>
--------------
--------------
</pre>
</div>
</div>
</div>
</div>
</body>
<script>
$(document).ready(function(){
$("#page-Crawler").addClass("active");
});
function toggle_sidebar(){
if($('#nav_menu').is(':visible')){
$('#nav_menu').hide();
$('#side_menu').removeClass('border-right')
$('#side_menu').removeClass('col-lg-2')
$('#core_content').removeClass('col-lg-10')
}else{
$('#nav_menu').show();
$('#side_menu').addClass('border-right')
$('#side_menu').addClass('col-lg-2')
$('#core_content').addClass('col-lg-10')
}
}
</script>


@ -0,0 +1,208 @@
<!DOCTYPE html>
<html>
<head>
<title>AIL-Framework</title>
<link rel="icon" href="{{ url_for('static', filename='image/ail-icon.png')}}">
<!-- Core CSS -->
<link href="{{ url_for('static', filename='css/bootstrap4.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/font-awesome.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/dataTables.bootstrap.min.css') }}" rel="stylesheet">
<!-- JS -->
<script src="{{ url_for('static', filename='js/jquery.js')}}"></script>
<script src="{{ url_for('static', filename='js/jquery.dataTables.min.js')}}"></script>
<script src="{{ url_for('static', filename='js/dataTables.bootstrap.min.js')}}"></script>
</head>
<body>
{% include 'nav_bar.html' %}
<div class="container-fluid">
<div class="row">
{% include 'crawler/menu_sidebar.html' %}
<div class="col-12 col-lg-10" id="core_content">
<div class="card-deck justify-content-center mx-0">
<div class="card border-dark mt-2">
<div class="card-header bg-dark text-white">
Blacklisted {{type_name}}s
</div>
<div class="card-body text-dark">
<div class="row">
<div class="col-12 col-md-6">
<div class="card text-center border-danger">
<div class="card-body text-danger">
<h5 class="card-title">Blacklist {{type_name}}</h5>
<input type="text" class="form-control {%if blacklist_domain is not none %}{%if blacklist_domain==1 %}is-valid{% else %}is-invalid{%endif%}{%endif%}" id="blacklist_domain_input" placeholder="{{type_name}} Address">
<div class="invalid-feedback">
{%if blacklist_domain==2 %}
This {{type_name}} is already blacklisted
{% else %}
Incorrect {{type_name}} address
{% endif %}
</div>
<div class="valid-feedback">
{{type_name}} Blacklisted
</div>
<button type="button" class="btn btn-danger mt-2" onclick="window.location.href ='{{ url_for('hiddenServices.blacklist_domain') }}?redirect=0&type={{type}}&domain='+$('#blacklist_domain_input').val();">Blacklist {{type_name}}</button>
</div>
</div>
</div>
<div class="col-12 col-md-6 mt-4 mt-md-0">
<div class="card text-center border-success">
<div class="card-body">
<h5 class="card-title">Unblacklist {{type_name}}</h5>
<input type="text" class="form-control {%if unblacklist_domain is not none %}{%if unblacklist_domain==1 %}is-valid{% else %}is-invalid{%endif%}{%endif%}" id="unblacklist_domain_input" placeholder="{{type_name}} Address">
<div class="invalid-feedback">
{%if unblacklist_domain==2 %}
This {{type_name}} is not blacklisted
{% else %}
Incorrect {{type_name}} address
{% endif %}
</div>
<div class="valid-feedback">
{{type_name}} Unblacklisted
</div>
<button type="button" class="btn btn-outline-secondary mt-2" onclick="window.location.href ='{{ url_for('hiddenServices.unblacklist_domain') }}?redirect=0&type={{type}}&domain='+$('#unblacklist_domain_input').val();">Unblacklist {{type_name}}</button>
</div>
</div>
</div>
</div>
<div class="row mt-4">
<div class="col-12 col-xl-6">
<table class="table table-striped table-bordered table-hover" id="myTable_1">
<thead class="thead-dark">
<tr>
<th style="max-width: 800px;">{{type_name}}</th>
<th style="max-width: 800px;">Unblacklist {{type_name}}</th>
</tr>
</thead>
<tbody>
{% for domain in list_blacklisted_1 %}
<tr>
<td>{{domain}}</td>
<td>
<a href="{{ url_for('hiddenServices.unblacklist_domain') }}?page={{page}}&domain={{domain}}&type={{type}}">
<button type="button" class="btn btn-outline-danger">UnBlacklist {{type_name}}</button>
</a>
</td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
<div class="col-12 col-xl-6">
<table class="table table-striped table-bordered table-hover" id="myTable_2">
<thead class="thead-dark">
<tr>
<th style="max-width: 800px;">{{type_name}}</th>
<th style="max-width: 800px;">Unblacklist {{type_name}}</th>
</tr>
</thead>
<tbody>
{% for domain in list_blacklisted_2 %}
<tr>
<td>{{domain}}</td>
<td>
<a href="{{ url_for('hiddenServices.unblacklist_domain') }}?page={{page}}&domain={{domain}}&type={{type}}">
<button type="button" class="btn btn-outline-danger">UnBlacklist {{type_name}}</button>
</a>
</td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
</div>
</div>
</div>
</div>
<div class="d-flex justify-content-center">
<nav class="mt-4" aria-label="...">
<ul class="pagination">
<li class="page-item {%if page==1%}disabled{%endif%}">
<a class="page-link" href="{{ url_for('hiddenServices.blacklisted_domains') }}?page={{page-1}}">Previous</a>
</li>
{%if page>3%}
<li class="page-item"><a class="page-link" href="{{ url_for('hiddenServices.blacklisted_domains') }}?page=1">1</a></li>
<li class="page-item disabled"><a class="page-link" aria-disabled="true" href="#">...</a></li>
<li class="page-item"><a class="page-link" href="{{ url_for('hiddenServices.blacklisted_domains') }}?page={{page-1}}">{{page-1}}</a></li>
<li class="page-item active"><a class="page-link" href="{{ url_for('hiddenServices.blacklisted_domains') }}?page={{page}}">{{page}}</a></li>
{%else%}
{%if page>2%}<li class="page-item"><a class="page-link" href="{{ url_for('hiddenServices.blacklisted_domains') }}?page={{page-2}}">{{page-2}}</a></li>{%endif%}
{%if page>1%}<li class="page-item"><a class="page-link" href="{{ url_for('hiddenServices.blacklisted_domains') }}?page={{page-1}}">{{page-1}}</a></li>{%endif%}
<li class="page-item active"><a class="page-link" href="{{ url_for('hiddenServices.blacklisted_domains') }}?page={{page}}">{{page}}</a></li>
{%endif%}
{%if nb_page_max-page>3%}
<li class="page-item"><a class="page-link" href="{{ url_for('hiddenServices.blacklisted_domains') }}?page={{page+1}}">{{page+1}}</a></li>
<li class="page-item disabled"><a class="page-link" aria-disabled="true" href="#">...</a></li>
<li class="page-item"><a class="page-link" href="{{ url_for('hiddenServices.blacklisted_domains') }}?page={{nb_page_max}}">{{nb_page_max}}</a></li>
{%else%}
{%if nb_page_max-page>2%}<li class="page-item"><a class="page-link" href="{{ url_for('hiddenServices.blacklisted_domains') }}?page={{nb_page_max-2}}">{{nb_page_max-2}}</a></li>{%endif%}
{%if nb_page_max-page>1%}<li class="page-item"><a class="page-link" href="{{ url_for('hiddenServices.blacklisted_domains') }}?page={{nb_page_max-1}}">{{nb_page_max-1}}</a></li>{%endif%}
{%if nb_page_max-page>0%}<li class="page-item"><a class="page-link" href="{{ url_for('hiddenServices.blacklisted_domains') }}?page={{nb_page_max}}">{{nb_page_max}}</a></li>{%endif%}
{%endif%}
<li class="page-item {%if page==nb_page_max%}disabled{%endif%}">
<a class="page-link" href="{{ url_for('hiddenServices.blacklisted_domains') }}?page={{page+1}}" aria-disabled="true">Next</a>
</li>
</ul>
</nav>
</div>
</div>
</div>
</div>
</body>
<script>
var table
$(document).ready(function(){
table = $('#myTable_1').DataTable(
{
/*"aLengthMenu": [[5, 10, 15, 20, -1], [5, 10, 15, 20, "All"]],
"iDisplayLength": 10,*/
"order": [[ 0, "asc" ]]
}
);
table = $('#myTable_2').DataTable(
{
/*"aLengthMenu": [[5, 10, 15, 20, -1], [5, 10, 15, 20, "All"]],
"iDisplayLength": 10,*/
"order": [[ 0, "asc" ]]
}
);
$("#page-Crawler").addClass("active");
});
function toggle_sidebar(){
if($('#nav_menu').is(':visible')){
$('#nav_menu').hide();
$('#side_menu').removeClass('border-right')
$('#side_menu').removeClass('col-lg-2')
$('#core_content').removeClass('col-lg-10')
}else{
$('#nav_menu').show();
$('#side_menu').addClass('border-right')
$('#side_menu').addClass('col-lg-2')
$('#core_content').addClass('col-lg-10')
}
}
</script>
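The blacklist template above branches on a small result code: `1` marks the input valid (domain blacklisted/unblacklisted), `2` means the domain was already in (or absent from) the blacklist, and any other value means the address was malformed. A sketch of the blacklist-side feedback logic (function name assumed, not from the source):

```javascript
// Maps the template's blacklist_domain result code to the feedback
// string shown under the input field.
function blacklistFeedback(code, typeName) {
  if (code === 1) return typeName + ' Blacklisted';
  if (code === 2) return 'This ' + typeName + ' is already blacklisted';
  return 'Incorrect ' + typeName + ' address';
}
```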


@ -63,7 +63,7 @@
{% for domain in domains_by_day[date] %}
<tr>
<td>
<a target="_blank" href="{{ url_for('hiddenServices.show_domain') }}?domain={{ domain }}">{{ domain }}</a>
<div>
{% for tag in domain_metadata[domain]['tags'] %}
<a href="{{ url_for('Tags.get_tagged_paste') }}?ltags={{ tag }}">


@ -1 +1 @@
<li id='page-hiddenServices'><a href="{{ url_for('hiddenServices.dashboard') }}"><i class="fa fa-user-secret"></i> hidden Services </a></li>


@ -1,213 +1,230 @@
<!DOCTYPE html> <!DOCTYPE html>
<html> <html>
<head>
<title>Show Domain - AIL</title>
<link rel="icon" href="{{ url_for('static', filename='image/ail-icon.png') }}">
<!-- Core CSS -->
<link href="{{ url_for('static', filename='css/bootstrap4.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/font-awesome.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/daterangepicker.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/dataTables.bootstrap4.min.css') }}" rel="stylesheet">
<!-- JS -->
<script src="{{ url_for('static', filename='js/jquery.js')}}"></script>
<script src="{{ url_for('static', filename='js/jquery.dataTables.min.js')}}"></script>
<script src="{{ url_for('static', filename='js/dataTables.bootstrap.min.js')}}"></script>
<head> </head>
<meta charset="utf-8"> <body>
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Show Domain - AIL</title> {% include 'nav_bar.html' %}
<link rel="icon" href="{{ url_for('static', filename='image/ail-icon.png') }}">
<!-- Core CSS --> <div class="container-fluid">
<link href="{{ url_for('static', filename='css/bootstrap.min.css') }}" rel="stylesheet"> <div class="row">
<link href="{{ url_for('static', filename='font-awesome/css/font-awesome.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/sb-admin-2.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/dygraph_gallery.css') }}" rel="stylesheet" type="text/css" />
<!-- JS -->
<script type="text/javascript" src="{{ url_for('static', filename='js/dygraph-combined.js') }}"></script>
<script language="javascript" src="{{ url_for('static', filename='js/jquery.js')}}"></script>
<script src="{{ url_for('static', filename='js/jquery.dataTables.min.js')}}"></script>
<script src="{{ url_for('static', filename='js/dataTables.bootstrap.js')}}"></script>
<style> {% include 'crawler/menu_sidebar.html' %}
.test thead{
background: #d91f2d;
color: #fff;
}
</style>
</head> <div class="col-12 col-lg-10" id="core_content">
<body>
{% include 'navbar.html' %}
<div id="page-wrapper">
<div class="row"> <div class="row">
<div class="col-md-6"> <div class="col-12 col-xl-6">
<div class="row"> <div class="card mt-2">
<div class="panel panel-info"> <div class="card-header bg-dark">
<div class="panel-heading"> <span class="badge badge-pill badge-light flex-row-reverse float-right">
{% if status %} {% if status %}
<div class="pull-right" style="color:Green;"> <div style="color:Green;">
<i class="fa fa-check-circle fa-2x"></i> <i class="fas fa-check-circle fa-2x"></i>
UP UP
</div> </div>
{% else %} {% else %}
<div class="pull-right" style="color:Red;"> <div style="color:Red;">
<i class="fa fa-times-circle fa-2x"></i> <i class="fas fa-times-circle fa-2x"></i>
DOWN DOWN
</div> </div>
{% endif %} {% endif %}
<h3>{{ domain }} :</h3> </span>
<ul class="list-group"> <h3 class="card-title text-white">{{ domain }} :</h3>
<li class="list-group-item"> </div>
<div class="card-body">
<table class="table table-condensed"> <table class="table table-responsive table-condensed">
<thead> <thead>
<tr> <tr>
<th>First Seen</th> <th>First Seen</th>
<th>Last Check</th> <th>Last Check</th>
</tr> </tr>
</thead> </thead>
<tbody> <tbody>
<tr> <tr>
<td class="panelText"><a href="#">{{ first_seen }}</a></td> <td class="panelText">{{ first_seen }}</td>
<td class="panelText"><a href="#">{{ last_check }}</a></td> <td class="panelText">{{ last_check }}</td>
</tr> </tr>
</tbody> </tbody>
</table> </table>
</li>
<li class="list-group-item">
Origin Paste: <a target="_blank" href="{{ url_for('showsavedpastes.showsavedpaste', paste=origin_paste) }}" />{{ origin_paste_name }}</a>
<div>
{% for tag in origin_paste_tags %}
<a href="{{ url_for('Tags.get_tagged_paste') }}?ltags={{ tag[1] }}">
<span class="label label-{{ bootstrap_label[loop.index0 % 5] }} pull-left">{{ tag[0] }}</span>
</a>
{% endfor %}
<br>
</div>
</li>
</ul>
Origin Paste:
{% if origin_paste_name=='manual' or origin_paste_name=='auto' %}
<span class="badge badge-dark">{{ origin_paste_name }}</span>
{%else%}
<a class="badge badge-dark" target="_blank" href="{{ url_for('showsavedpastes.showsavedpaste', paste=origin_paste) }}" />{{ origin_paste_name }}</a>
{%endif%}
<div>
{% for tag in origin_paste_tags %}
<a href="{{ url_for('Tags.Tags_page') }}?ltags={{ tag[1] }}">
<span class="badge badge-{{ bootstrap_label[loop.index0 % 5] }} pull-left">{{ tag[0] }}</span>
</a>
{% endfor %}
<br>
</div> </div>
</div> </div>
<div>
{% for tag in domain_tags %}
<a href="{{ url_for('Tags.get_tagged_paste') }}?ltags={{ tag }}">
<span class="label label-{{ bootstrap_label[loop.index0 % 5] }} pull-left">{{ tag }} <i>{{ domain_tags[tag] }}</i></span>
</a>
{% endfor %}
<br>
<br>
</div>
<table class="test table table-striped table-bordered table-hover table-responsive " id="myTable_">
<thead>
<tr>
<th style="max-width: 800px;">Crawled Pastes</th>
</tr>
</thead>
<tbody>
{% for path in l_pastes %}
<tr>
<td><a target="_blank" href="{{ url_for('showsavedpastes.showsavedpaste') }}?paste={{path}}">{{ path_name[loop.index0] }}</a>
<div>
{% for tag in paste_tags[loop.index0] %}
<a href="{{ url_for('Tags.get_tagged_paste') }}?ltags={{ tag[1] }}">
<span class="label label-{{ bootstrap_label[loop.index0 % 5] }} pull-left">{{ tag[0] }}</span>
</a>
{% endfor %}
</div>
</td>
</tr>
{% endfor %}
</tbody>
</table>
</div> </div>
<div>
{% for tag in domain_tags %}
<a href="{{ url_for('Tags.Tags_page') }}?ltags={{ tag }}">
<span class="badge badge-{{ bootstrap_label[loop.index0 % 5] }} pull-left">{{ tag }} <i>{{ domain_tags[tag] }}</i></span>
</a>
{% endfor %}
<br>
<br>
</div>
{% if l_pastes %}
<table class="table table-striped table-bordered table-hover" id="myTable_1">
<thead class="thead-dark">
<tr>
<th>Crawled Pastes</th>
</tr>
</thead>
<tbody>
{% for path in l_pastes %}
<tr>
<td>
<a target="_blank" href="{{ url_for('showsavedpastes.showsavedpaste') }}?paste={{path}}" class="text-secondary">
<div style="line-height:0.9;">{{ dict_links[path] }}</div>
</a>
<div>
{% for tag in paste_tags[loop.index0] %}
<a href="{{ url_for('Tags.Tags_page') }}?ltags={{ tag[1] }}">
<span class="badge badge-{{ bootstrap_label[loop.index0 % 5] }} pull-left">{{ tag[0] }}</span>
</a>
{% endfor %}
</div>
</td>
</tr>
{% endfor %}
</tbody>
</table>
{%endif%}
</div> </div>
<div class="col-md-6"> <div class="col-12 col-xl-6">
<div class="panel panel-info" style="text-align:center;"> <div class="card my-2" style="background-color:#ecf0f1;">
<div class="panel-heading"> <div class="card-body py-2">
<div class="row"> <div class="row">
<div class="col-md-8"> <div class="col-md-8">
<input class="center" id="blocks" type="range" min="1" max="50" value="13"> <input class="custom-range mt-2" id="blocks" type="range" min="1" max="50" value="13">
</div>
<div class="col-md-4">
<button class="btn btn-primary btn-tags" onclick="blocks.value=50;pixelate();">
<span class="glyphicon glyphicon-zoom-in"></span>
<span class="label-icon">Full resolution</span>
</button>
</div>
</div> </div>
<div class="col-md-4">
<button class="btn btn-primary" onclick="blocks.value=50;pixelate();">
<i class="fas fa-search-plus"></i>
<span class="label-icon">Full resolution</span>
</button>
</div>
</div>
</div> </div>
</div> </div>
<canvas id="canvas" style="width:100%;"></canvas> <canvas id="canvas" style="width:100%;"></canvas>
<div class="text-center">
<small>
<a target="_blank" href="{{ url_for('showsavedpastes.showsavedpaste') }}?paste={{screenshot['item']}}" class="text-info">
<div style="line-height:0.9;">{{dict_links[screenshot['item']]}}</div>
</a>
</small>
</div>
</div> </div>
</div> </div>
</div>
</div>
</div>
</div> </body>
<!-- /#page-wrapper -->
<script>
$(document).ready(function(){
activePage = "page-hiddenServices"
$("#"+activePage).addClass("active");
table = $('#myTable_').DataTable(
{
"aLengthMenu": [[5, 10, 15, 20, -1], [5, 10, 15, 20, "All"]],
"iDisplayLength": 5,
"order": [[ 0, "desc" ]]
}
);
});
</script>
<script> <script>
var ctx = canvas.getContext('2d'), img = new Image(); var table;
$(document).ready(function(){
table = $('#myTable_1').DataTable(
{
//"aLengthMenu": [[5, 10, 15, 20, -1], [5, 10, 15, 20, "All"]],
//"iDisplayLength": 5,
//"order": [[ 0, "desc" ]]
});
});
/// turn off image smoothing function toggle_sidebar(){
ctx.webkitImageSmoothingEnabled = false; if($('#nav_menu').is(':visible')){
ctx.imageSmoothingEnabled = false; $('#nav_menu').hide();
$('#side_menu').removeClass('border-right')
$('#side_menu').removeClass('col-lg-2')
$('#core_content').removeClass('col-lg-10')
}else{
$('#nav_menu').show();
$('#side_menu').addClass('border-right')
$('#side_menu').addClass('col-lg-2')
$('#core_content').addClass('col-lg-10')
}
}
</script>
img.onload = pixelate; <script>
img.addEventListener("error", img_error); var ctx = canvas.getContext('2d'), img = new Image();
var draw_img = false;
img.src = "{{ url_for('showsavedpastes.screenshot', filename=screenshot) }}"; /// turn off image smoothing
ctx.webkitImageSmoothingEnabled = false;
ctx.imageSmoothingEnabled = false;
function pixelate() { img.onload = pixelate;
img.addEventListener("error", img_error);
var draw_img = false;
/// use slider value img.src = "{{ url_for('showsavedpastes.screenshot', filename=screenshot['screenshot']) }}";
if( blocks.value == 50 ){
size = 1;
} else {
var size = (blocks.value) * 0.01;
}
canvas.width = img.width; function pixelate() {
canvas.height = img.height; /// use slider value
if( blocks.value == 50 ){
size = 1;
} else {
var size = (blocks.value) * 0.01;
}
/// cache scaled width and height canvas.width = img.width;
w = canvas.width * size; canvas.height = img.height;
h = canvas.height * size;
/// draw original image to the scaled size /// cache scaled width and height
ctx.drawImage(img, 0, 0, w, h); w = canvas.width * size;
h = canvas.height * size;
/// pixelated /// draw original image to the scaled size
ctx.drawImage(canvas, 0, 0, w, h, 0, 0, canvas.width, canvas.height); ctx.drawImage(img, 0, 0, w, h);
} /// pixelated
ctx.drawImage(canvas, 0, 0, w, h, 0, 0, canvas.width, canvas.height);
blocks.addEventListener('change', pixelate, false); }
function img_error() { blocks.addEventListener('change', pixelate, false);
img.onerror=null;
img.src="{{ url_for('static', filename='image/AIL.png') }}";
blocks.value = 50;
pixelate();
}
</script>
</body> function img_error() {
img.onerror=null;
img.src="{{ url_for('static', filename='image/AIL.png') }}";
blocks.value = 50;
pixelate();
}
</script>
</html> </html>
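The pixelation script above relies on a canvas trick: draw the image at a scaled-down size, then stretch it back up with image smoothing disabled, so each sampled pixel becomes a visible block. A minimal pure-Python stand-in for that nearest-neighbour behaviour (a 2D list plays the role of the canvas; values and sizes are illustrative):

```python
# Pure-Python sketch of the canvas pixelation trick: downscale by `size`,
# then upscale back with nearest-neighbour sampling (no smoothing).
def pixelate(grid, size):
    """0 < size <= 1; size == 1 returns the image unchanged."""
    h, w = len(grid), len(grid[0])
    small_w = max(1, int(w * size))
    small_h = max(1, int(h * size))
    # downscale: sample the nearest source pixel
    small = [[grid[y * h // small_h][x * w // small_w]
              for x in range(small_w)] for y in range(small_h)]
    # upscale: repeat each small pixel over a block
    return [[small[y * small_h // h][x * small_w // w]
             for x in range(w)] for y in range(h)]

img = [[1, 2, 3, 4],
       [5, 6, 7, 8],
       [9, 10, 11, 12],
       [13, 14, 15, 16]]
print(pixelate(img, 0.5))       # 2x2 blocks, each filled with one sampled value
print(pixelate(img, 1) == img)  # True: full resolution (slider at 50)
```

This mirrors why the template sets `size = 1` when the slider reaches 50: a scale factor of 1 skips the downscale entirely and shows the screenshot at full resolution.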
View file
@@ -29,7 +29,7 @@ r_serv_metadata = Flask_config.r_serv_metadata
max_preview_char = Flask_config.max_preview_char max_preview_char = Flask_config.max_preview_char
max_preview_modal = Flask_config.max_preview_modal max_preview_modal = Flask_config.max_preview_modal
bootstrap_label = Flask_config.bootstrap_label bootstrap_label = Flask_config.bootstrap_label
PASTES_FOLDER = Flask_config.PASTES_FOLDER
baseindexpath = os.path.join(os.environ['AIL_HOME'], cfg.get("Indexer", "path")) baseindexpath = os.path.join(os.environ['AIL_HOME'], cfg.get("Indexer", "path"))
indexRegister_path = os.path.join(os.environ['AIL_HOME'], indexRegister_path = os.path.join(os.environ['AIL_HOME'],
@@ -133,8 +133,8 @@ def search():
query = QueryParser("content", ix.schema).parse("".join(q)) query = QueryParser("content", ix.schema).parse("".join(q))
results = searcher.search_page(query, 1, pagelen=num_elem_to_get) results = searcher.search_page(query, 1, pagelen=num_elem_to_get)
for x in results: for x in results:
r.append(x.items()[0][1]) r.append(x.items()[0][1].replace(PASTES_FOLDER, '', 1))
path = x.items()[0][1] path = x.items()[0][1].replace(PASTES_FOLDER, '', 1)
paste = Paste.Paste(path) paste = Paste.Paste(path)
content = paste.get_p_content() content = paste.get_p_content()
content_range = max_preview_char if len(content)>max_preview_char else len(content)-1 content_range = max_preview_char if len(content)>max_preview_char else len(content)-1
@@ -208,6 +208,7 @@ def get_more_search_result():
results = searcher.search_page(query, page_offset, num_elem_to_get) results = searcher.search_page(query, page_offset, num_elem_to_get)
for x in results: for x in results:
path = x.items()[0][1] path = x.items()[0][1]
path = path.replace(PASTES_FOLDER, '', 1)
path_array.append(path) path_array.append(path)
paste = Paste.Paste(path) paste = Paste.Paste(path)
content = paste.get_p_content() content = paste.get_p_content()
View file
@@ -101,7 +101,7 @@
<td><a target="_blank" href="{{ url_for('showsavedpastes.showsavedpaste') }}?paste={{path}}">{{ path }}</a> <td><a target="_blank" href="{{ url_for('showsavedpastes.showsavedpaste') }}?paste={{path}}">{{ path }}</a>
<div> <div>
{% for tag in paste_tags[loop.index0] %} {% for tag in paste_tags[loop.index0] %}
<a href="{{ url_for('Tags.get_tagged_paste') }}?ltags={{ tag[1] }}"> <a href="{{ url_for('Tags.Tags_page') }}?ltags={{ tag[1] }}">
<span class="label label-{{ bootstrap_label[loop.index0 % 5] }} pull-left">{{ tag[0] }}</span> <span class="label label-{{ bootstrap_label[loop.index0 % 5] }} pull-left">{{ tag[0] }}</span>
</a> </a>
{% endfor %} {% endfor %}
@@ -201,7 +201,7 @@
var curr_preview = data.preview_array[i].replace(/\"/g, "\'"); var curr_preview = data.preview_array[i].replace(/\"/g, "\'");
var tag = "" var tag = ""
for(j=0; j<data.list_tags[i].length; j++) { for(j=0; j<data.list_tags[i].length; j++) {
tag = tag + "<a href=\"{{ url_for('Tags.get_tagged_paste') }}?ltags=" + data.list_tags[i][j][1] + ">" tag = tag + "<a href=\"{{ url_for('Tags.Tags_page') }}?ltags=" + data.list_tags[i][j][1] + ">"
+ "<span class=\"label label-" + data.bootstrap_label[j % 5] + " pull-left\">" + data.list_tags[i][j][0] + "<span class=\"label label-" + data.bootstrap_label[j % 5] + " pull-left\">" + data.list_tags[i][j][0]
+ "</span>" + "</a>" + "</span>" + "</a>"
} }
View file
@@ -0,0 +1,129 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
'''
Flask functions and routes for the settings modules page
'''
from flask import Flask, render_template, jsonify, request, Blueprint, redirect, url_for
import json
import datetime
import git_status
# ============ VARIABLES ============
import Flask_config
app = Flask_config.app
cfg = Flask_config.cfg
baseUrl = Flask_config.baseUrl
r_serv_db = Flask_config.r_serv_db
max_preview_char = Flask_config.max_preview_char
max_preview_modal = Flask_config.max_preview_modal
REPO_ORIGIN = Flask_config.REPO_ORIGIN
dict_update_description = Flask_config.dict_update_description
settings = Blueprint('settings', __name__, template_folder='templates')
# ============ FUNCTIONS ============
def one():
return 1
#def get_v1.5_update_tags_backgroud_status():
# return '38%'
def get_git_metadata():
dict_git = {}
dict_git['current_branch'] = git_status.get_current_branch()
dict_git['is_clone'] = git_status.is_not_fork(REPO_ORIGIN)
dict_git['is_working_directory_clean'] = git_status.is_working_directory_clean()
dict_git['current_commit'] = git_status.get_last_commit_id_from_local()
dict_git['last_remote_commit'] = git_status.get_last_commit_id_from_remote()
dict_git['last_local_tag'] = git_status.get_last_tag_from_local()
dict_git['last_remote_tag'] = git_status.get_last_tag_from_remote()
if dict_git['current_commit'] != dict_git['last_remote_commit']:
dict_git['new_git_update_available'] = True
else:
dict_git['new_git_update_available'] = False
if dict_git['last_local_tag'] != dict_git['last_remote_tag']:
dict_git['new_git_version_available'] = True
else:
dict_git['new_git_version_available'] = False
return dict_git
def get_update_metadata():
dict_update = {}
dict_update['current_version'] = r_serv_db.get('ail:version')
dict_update['current_background_update'] = r_serv_db.get('ail:current_background_update')
dict_update['update_in_progress'] = r_serv_db.get('ail:update_in_progress')
dict_update['update_error'] = r_serv_db.get('ail:update_error')
if dict_update['update_in_progress']:
dict_update['update_progression'] = r_serv_db.scard('ail:update_{}'.format(dict_update['update_in_progress']))
dict_update['update_nb'] = dict_update_description[dict_update['update_in_progress']]['nb_background_update']
dict_update['update_stat'] = int(dict_update['update_progression']*100/dict_update['update_nb'])
dict_update['current_background_script'] = r_serv_db.get('ail:current_background_script')
dict_update['current_background_script_stat'] = r_serv_db.get('ail:current_background_script_stat')
return dict_update
# ============= ROUTES ==============
@settings.route("/settings/", methods=['GET'])
def settings_page():
git_metadata = get_git_metadata()
current_version = r_serv_db.get('ail:version')
update_metadata = get_update_metadata()
return render_template("settings_index.html", git_metadata=git_metadata,
current_version=current_version)
@settings.route("/settings/get_background_update_stats_json", methods=['GET'])
def get_background_update_stats_json():
# handle :end, error
update_stats = {}
current_update = r_serv_db.get('ail:current_background_update')
update_in_progress = r_serv_db.get('ail:update_in_progress')
if current_update:
update_stats['update_version']= current_update
update_stats['background_name']= r_serv_db.get('ail:current_background_script')
update_stats['background_stats']= r_serv_db.get('ail:current_background_script_stat')
if update_stats['background_stats'] is None:
update_stats['background_stats'] = 0
else:
update_stats['background_stats'] = int(update_stats['background_stats'])
update_progression = r_serv_db.scard('ail:update_{}'.format(current_update))
update_nb_scripts = dict_update_description[current_update]['nb_background_update']
update_stats['update_stat'] = int(update_progression*100/update_nb_scripts)
update_stats['update_stat_label'] = '{}/{}'.format(update_progression, update_nb_scripts)
if not update_in_progress:
update_stats['error'] = True
error_message = r_serv_db.get('ail:update_error')
if error_message:
update_stats['error_message'] = error_message
else:
update_stats['error_message'] = 'Please relaunch the bin/update-background.py script'
else:
if update_stats['background_name'] is None:
update_stats['error'] = True
update_stats['error_message'] = 'Please launch the bin/update-background.py script'
else:
update_stats['error'] = False
return jsonify(update_stats)
else:
return jsonify({})
# ========= REGISTRATION =========
app.register_blueprint(settings, url_prefix=baseUrl)
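The progress figures returned by `get_background_update_stats_json` are derived from the number of completed background scripts (the cardinality of `ail:update_<version>`) against the expected total from `dict_update_description`. A minimal stand-in with the Redis lookups replaced by plain values (names are illustrative, not AIL's API):

```python
# Sketch of the global-progress computation: `done` plays the role of
# SCARD ail:update_<version>, `nb_background_update` the expected total.
def update_progress(done, nb_background_update):
    stat = int(done * 100 / nb_background_update)        # percent for the bar width
    label = '{}/{}'.format(done, nb_background_update)   # text shown on the bar
    return stat, label

print(update_progress(3, 4))  # (75, '3/4')
```

The front end then polls this JSON every 4 seconds and writes `stat` into the progress bar's `aria-valuenow` and width.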
View file
@@ -0,0 +1 @@
<li id='page-hiddenServices'><a href="{{ url_for('settings.settings_page') }}"><i class="fa fa-cog"></i> Server Management </a></li>
View file
@@ -0,0 +1,196 @@
<!DOCTYPE html>
<html>
<head>
<title>Server Management - AIL</title>
<link rel="icon" href="{{ url_for('static', filename='image/ail-icon.png') }}">
<!-- Core CSS -->
<link href="{{ url_for('static', filename='css/bootstrap4.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/font-awesome.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/dataTables.bootstrap4.min.css') }}" rel="stylesheet">
<!-- JS -->
<script src="{{ url_for('static', filename='js/jquery.js')}}"></script>
<script src="{{ url_for('static', filename='js/popper.min.js')}}"></script>
<script src="{{ url_for('static', filename='js/bootstrap4.min.js')}}"></script>
<script src="{{ url_for('static', filename='js/jquery.dataTables.min.js')}}"></script>
<script src="{{ url_for('static', filename='js/dataTables.bootstrap.min.js')}}"></script>
</head>
<body>
{% include 'nav_bar.html' %}
<div class="container-fluid">
<div class="row">
{% include 'settings/menu_sidebar.html' %}
<div class="col-12 col-lg-10" id="core_content">
<div class="card mb-3 mt-1">
<div class="card-header text-white bg-dark pb-1">
<h5 class="card-title">AIL-framework Status :</h5>
</div>
<div class="card-body">
<div class="row">
<div class="col-xl-6">
<div class="card text-center border-secondary">
<div class="card-body px-1 py-0">
<table class="table table-sm">
<tbody>
<tr>
<td>AIL Version</td>
<td>{{current_version}}<a target="_blank" href="https://github.com/CIRCL/AIL-framework/releases/tag/{{current_version}}" class="text-info"><small> (release note)</small></a></td>
</tr>
<tr
{%if git_metadata['current_branch'] != 'master'%}
class="table-danger"
{%endif%}
>
<td>Current Branch</td>
<td>
{%if git_metadata['current_branch'] != 'master'%}
<i class="fas fa-times-circle text-danger" data-toggle="tooltip" data-placement="top" title="Please checkout the master branch"></i>&nbsp;
{%endif%}
{{git_metadata['current_branch']}}
</td>
</tr>
<tr
{%if git_metadata['new_git_update_available']%}
class="table-warning"
{%endif%}
>
<td>Current Commit ID</td>
<td>
{%if git_metadata['new_git_update_available']%}
<i class="fas fa-exclamation-triangle text-secondary" data-toggle="tooltip" data-placement="top" title="A New Update Is Available"></i>&nbsp;
{%endif%}
{{git_metadata['current_commit']}}
</td>
</tr>
<tr
{%if git_metadata['new_git_version_available']%}
class="table-danger"
{%endif%}
>
<td>Current Tag</td>
<td>
{%if git_metadata['new_git_version_available']%}
<i class="fas fa-exclamation-circle text-danger" data-toggle="tooltip" data-placement="top" title="A New Version Is Available"></i>&nbsp;&nbsp;
{%endif%}
{{git_metadata['last_local_tag']}}
</td>
</tr>
</tbody>
</table>
</div>
</div>
</div>
<div class="col-xl-6">
<div class="card text-center border-success" id="card_progress">
<div class="card-body" id="card_progress_body">
<h5 class="card-title">Background Update: <span id="backgroud_update_version"></span></h5>
<div class="progress">
<div class="progress-bar bg-danger" role="progressbar" id="update_global_progress" aria-valuenow="0" aria-valuemin="0" aria-valuemax="100"></div>
</div>
<hr class="my-1">
Updating: <strong id="backgroud_update_name"></strong> ...
<div class="progress">
<div class="progress-bar progress-bar-striped bg-warning progress-bar-animated" role="progressbar" id="update_background_progress" aria-valuenow="0" aria-valuemin="0" aria-valuemax="100"></div>
</div>
<div class="text-danger" id="update_error_div">
<hr>
<h5 class="card-title"><i class="fas fa-times-circle text-danger"></i> Update Error:</h5>
<p id="update_error_mess"></p>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
{%if git_metadata['new_git_version_available']%}
<div class="alert alert-danger" role="alert">
<h4 class="alert-heading">New Version Available!</h4>
<hr class="my-0">
<p>A new version is available, new version: <strong>{{git_metadata['last_remote_tag']}}</strong></p>
<a target="_blank" href="https://github.com/CIRCL/AIL-framework/releases/tag/{{git_metadata['last_remote_tag']}}"> Check last release note.</a>
</div>
{%endif%}
{%if git_metadata['new_git_update_available']%}
<div class="alert alert-warning" role="alert">
<h4 class="alert-heading">New Update Available!</h4>
<hr class="my-0">
<p>A new update is available, new commit ID: <strong>{{git_metadata['last_remote_commit']}}</strong></p>
<a target="_blank" href="https://github.com/CIRCL/AIL-framework/commit/{{git_metadata['last_remote_commit']}}"> Check last commit content.</a>
</div>
{%endif%}
</div>
</div>
</div>
</body>
<script>
$(document).ready(function(){
$("#page-options").addClass("active");
} );
function toggle_sidebar(){
if($('#nav_menu').is(':visible')){
$('#nav_menu').hide();
$('#side_menu').removeClass('border-right')
$('#side_menu').removeClass('col-lg-2')
$('#core_content').removeClass('col-lg-10')
}else{
$('#nav_menu').show();
$('#side_menu').addClass('border-right')
$('#side_menu').addClass('col-lg-2')
$('#core_content').addClass('col-lg-10')
}
}
function update_progress(){
$.getJSON("{{ url_for('settings.get_background_update_stats_json') }}", function(data){
if(! jQuery.isEmptyObject(data)){
$('#card_progress').show();
$('#backgroud_update_version').text(data['update_version']);
$('#backgroud_update_name').text(data['background_name']);
$('#update_global_progress').attr('aria-valuenow', data['update_stat']).width(data['update_stat']+'%').text(data['update_stat_label']);
$('#update_background_progress').attr('aria-valuenow', data['background_stats']).width(data['background_stats']+'%').text(data['background_stats']+'%');
if(data['error']){
$('#update_error_div').show();
$('#update_error_mess').text(data['error_message']);
$('#card_progress').removeClass("border-success");
$('#card_progress').addClass("border-danger");
} else {
$('#update_error_div').hide();
$('#card_progress').removeClass("border-danger");
$('#card_progress').addClass("border-success");
}
} else {
$('#card_progress').hide();
clearInterval(progress_interval);
}
});
}
update_progress();
//Interval
var progress_interval = setInterval(function(){
update_progress()
}, 4000);
</script>
</html>
View file
@@ -40,15 +40,22 @@ showsavedpastes = Blueprint('showsavedpastes', __name__, template_folder='templa
# ============ FUNCTIONS ============ # ============ FUNCTIONS ============
def get_item_screenshot_path(item):
screenshot = r_serv_metadata.hget('paste_metadata:{}'.format(item), 'screenshot')
if screenshot:
screenshot = os.path.join(screenshot[0:2], screenshot[2:4], screenshot[4:6], screenshot[6:8], screenshot[8:10], screenshot[10:12], screenshot[12:])
return screenshot
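`get_item_screenshot_path` shards screenshots across nested two-character directories derived from the stored name (in practice a sha256-style hex filename), keeping any single directory from holding too many files. A standalone sketch of the same split (the sample name is shortened for illustration):

```python
import os

# First 12 characters become six two-character directory levels;
# the remainder is the filename inside the deepest directory.
def screenshot_rel_path(screenshot):
    return os.path.join(screenshot[0:2], screenshot[2:4], screenshot[4:6],
                        screenshot[6:8], screenshot[8:10], screenshot[10:12],
                        screenshot[12:])

print(screenshot_rel_path('554e64a8e2c0eafd9b91'))
# 55/4e/64/a8/e2/c0/eafd9b91 (on POSIX path separators)
```

This is the standard hash-sharding layout: lookups stay O(1) while directory listings stay small.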
def showpaste(content_range, requested_path): def showpaste(content_range, requested_path):
relative_path = None
if PASTES_FOLDER not in requested_path: if PASTES_FOLDER not in requested_path:
relative_path = requested_path # remove full path
requested_path = os.path.join(PASTES_FOLDER, requested_path) requested_path_full = os.path.join(requested_path, PASTES_FOLDER)
# remove old full path else:
#requested_path = requested_path.replace(PASTES_FOLDER, '') requested_path_full = requested_path
requested_path = requested_path.replace(PASTES_FOLDER, '', 1)
# escape directory transversal # escape directory transversal
if os.path.commonprefix((os.path.realpath(requested_path),PASTES_FOLDER)) != PASTES_FOLDER: if os.path.commonprefix((requested_path_full,PASTES_FOLDER)) != PASTES_FOLDER:
return 'path transversal detected' return 'path transversal detected'
vt_enabled = Flask_config.vt_enabled vt_enabled = Flask_config.vt_enabled
@@ -58,7 +65,7 @@ def showpaste(content_range, requested_path):
p_date = p_date[6:]+'/'+p_date[4:6]+'/'+p_date[0:4] p_date = p_date[6:]+'/'+p_date[4:6]+'/'+p_date[0:4]
p_source = paste.p_source p_source = paste.p_source
p_encoding = paste._get_p_encoding() p_encoding = paste._get_p_encoding()
p_language = paste._get_p_language() p_language = 'None'
p_size = paste.p_size p_size = paste.p_size
p_mime = paste.p_mime p_mime = paste.p_mime
p_lineinfo = paste.get_lines_info() p_lineinfo = paste.get_lines_info()
@@ -124,8 +131,6 @@ def showpaste(content_range, requested_path):
active_taxonomies = r_serv_tags.smembers('active_taxonomies') active_taxonomies = r_serv_tags.smembers('active_taxonomies')
l_tags = r_serv_metadata.smembers('tag:'+requested_path) l_tags = r_serv_metadata.smembers('tag:'+requested_path)
if relative_path is not None:
l_tags.union( r_serv_metadata.smembers('tag:'+relative_path) )
#active galaxies #active galaxies
active_galaxies = r_serv_tags.smembers('active_galaxies') active_galaxies = r_serv_tags.smembers('active_galaxies')
@@ -154,7 +159,19 @@ def showpaste(content_range, requested_path):
if r_serv_metadata.scard('hash_paste:'+requested_path) > 0: if r_serv_metadata.scard('hash_paste:'+requested_path) > 0:
set_b64 = r_serv_metadata.smembers('hash_paste:'+requested_path) set_b64 = r_serv_metadata.smembers('hash_paste:'+requested_path)
for hash in set_b64: for hash in set_b64:
nb_in_file = int(r_serv_metadata.zscore('nb_seen_hash:'+hash, requested_path)) nb_in_file = r_serv_metadata.zscore('nb_seen_hash:'+hash, requested_path)
# item list not updated
if nb_in_file is None:
l_pastes = r_serv_metadata.zrange('nb_seen_hash:'+hash, 0, -1)
for paste_name in l_pastes:
# dynamic update
if PASTES_FOLDER in paste_name:
score = r_serv_metadata.zscore('nb_seen_hash:{}'.format(hash), paste_name)
r_serv_metadata.zrem('nb_seen_hash:{}'.format(hash), paste_name)
paste_name = paste_name.replace(PASTES_FOLDER, '', 1)
r_serv_metadata.zadd('nb_seen_hash:{}'.format(hash), score, paste_name)
nb_in_file = r_serv_metadata.zscore('nb_seen_hash:'+hash, requested_path)
nb_in_file = int(nb_in_file)
estimated_type = r_serv_metadata.hget('metadata_hash:'+hash, 'estimated_type') estimated_type = r_serv_metadata.hget('metadata_hash:'+hash, 'estimated_type')
file_type = estimated_type.split('/')[0] file_type = estimated_type.split('/')[0]
# set file icon # set file icon
@@ -189,7 +206,7 @@ def showpaste(content_range, requested_path):
crawler_metadata['domain'] = r_serv_metadata.hget('paste_metadata:'+requested_path, 'domain') crawler_metadata['domain'] = r_serv_metadata.hget('paste_metadata:'+requested_path, 'domain')
crawler_metadata['paste_father'] = r_serv_metadata.hget('paste_metadata:'+requested_path, 'father') crawler_metadata['paste_father'] = r_serv_metadata.hget('paste_metadata:'+requested_path, 'father')
crawler_metadata['real_link'] = r_serv_metadata.hget('paste_metadata:'+requested_path,'real_link') crawler_metadata['real_link'] = r_serv_metadata.hget('paste_metadata:'+requested_path,'real_link')
crawler_metadata['screenshot'] = paste.get_p_rel_path() crawler_metadata['screenshot'] = get_item_screenshot_path(requested_path)
else: else:
crawler_metadata['get_metadata'] = False crawler_metadata['get_metadata'] = False
@@ -223,6 +240,141 @@ def showpaste(content_range, requested_path):
crawler_metadata=crawler_metadata, crawler_metadata=crawler_metadata,
l_64=l_64, vt_enabled=vt_enabled, misp=misp, hive=hive, misp_eventid=misp_eventid, misp_url=misp_url, hive_caseid=hive_caseid, hive_url=hive_url) l_64=l_64, vt_enabled=vt_enabled, misp=misp, hive=hive, misp_eventid=misp_eventid, misp_url=misp_url, hive_caseid=hive_caseid, hive_url=hive_url)
def get_item_basic_info(item):
item_basic_info = {}
item_basic_info['date'] = str(item.get_p_date())
item_basic_info['date'] = '{}/{}/{}'.format(item_basic_info['date'][0:4], item_basic_info['date'][4:6], item_basic_info['date'][6:8])
item_basic_info['source'] = item.get_item_source()
item_basic_info['size'] = item.get_item_size()
## TODO: FIXME ##performance
item_basic_info['encoding'] = item._get_p_encoding()
## TODO: FIXME ##performance
#item_basic_info['language'] = item._get_p_language()
## TODO: FIXME ##performance
info_line = item.get_lines_info()
item_basic_info['nb_lines'] = info_line[0]
item_basic_info['max_length_line'] = info_line[1]
return item_basic_info
def show_item_min(requested_path , content_range=0):
relative_path = None
if PASTES_FOLDER not in requested_path:
relative_path = requested_path
requested_path = os.path.join(PASTES_FOLDER, requested_path)
else:
relative_path = requested_path.replace(PASTES_FOLDER, '', 1)
# remove old full path
#requested_path = requested_path.replace(PASTES_FOLDER, '')
# escape directory traversal
if os.path.commonprefix((os.path.realpath(requested_path),PASTES_FOLDER)) != PASTES_FOLDER:
return 'path traversal detected'
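The guard above resolves the requested path and checks that it still lies under the pastes folder before reading anything from disk. A self-contained sketch of the same check (the folder path is illustrative, not AIL's real configuration):

```python
import os

# Directory-traversal guard: resolve the requested path, then verify the
# resolved path still starts with the pastes folder.
PASTES_FOLDER = '/opt/ail/PASTES'   # hypothetical location

def is_safe(requested_path):
    full = os.path.realpath(os.path.join(PASTES_FOLDER, requested_path))
    return os.path.commonprefix((full, PASTES_FOLDER)) == PASTES_FOLDER

print(is_safe('archive/2019/04/25/paste.gz'))  # True
print(is_safe('../../etc/passwd'))             # False
```

Note one limitation of `os.path.commonprefix`: it compares character by character, so a sibling directory such as `/opt/ail/PASTES2` would also pass; `os.path.commonpath` compares whole path components and is the stricter choice.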
item_info ={}
paste = Paste.Paste(requested_path)
item_basic_info = get_item_basic_info(paste)
item_info['nb_duplictates'] = paste.get_nb_duplicate()
## TODO: use this for fix ?
item_content = paste.get_p_content()
char_to_display = len(item_content)
if content_range != 0:
item_content = item_content[0:content_range]
vt_enabled = Flask_config.vt_enabled
p_hashtype_list = []
print(requested_path)
l_tags = r_serv_metadata.smembers('tag:'+relative_path)
if relative_path is not None:
l_tags.union( r_serv_metadata.smembers('tag:'+relative_path) )
item_info['tags'] = l_tags
item_info['name'] = relative_path.replace('/', ' / ')
l_64 = []
# load hash files
if r_serv_metadata.scard('hash_paste:'+relative_path) > 0:
set_b64 = r_serv_metadata.smembers('hash_paste:'+relative_path)
for hash in set_b64:
nb_in_file = r_serv_metadata.zscore('nb_seen_hash:'+hash, relative_path)
# item list not updated
if nb_in_file is None:
l_pastes = r_serv_metadata.zrange('nb_seen_hash:'+hash, 0, -1)
for paste_name in l_pastes:
# dynamic update
if PASTES_FOLDER in paste_name:
score = r_serv_metadata.zscore('nb_seen_hash:{}'.format(hash), paste_name)
r_serv_metadata.zrem('nb_seen_hash:{}'.format(hash), paste_name)
paste_name = paste_name.replace(PASTES_FOLDER, '', 1)
r_serv_metadata.zadd('nb_seen_hash:{}'.format(hash), score, paste_name)
nb_in_file = r_serv_metadata.zscore('nb_seen_hash:{}'.format(hash), relative_path)
nb_in_file = int(nb_in_file)
estimated_type = r_serv_metadata.hget('metadata_hash:'+hash, 'estimated_type')
file_type = estimated_type.split('/')[0]
# set file icon
if file_type == 'application':
file_icon = 'fa-file '
elif file_type == 'audio':
file_icon = 'fa-file-audio'
elif file_type == 'image':
file_icon = 'fa-file-image'
elif file_type == 'text':
file_icon = 'fa-file-alt'
else:
file_icon = 'fa-file'
saved_path = r_serv_metadata.hget('metadata_hash:'+hash, 'saved_path')
if r_serv_metadata.hexists('metadata_hash:'+hash, 'vt_link'):
b64_vt = True
b64_vt_link = r_serv_metadata.hget('metadata_hash:'+hash, 'vt_link')
b64_vt_report = r_serv_metadata.hget('metadata_hash:'+hash, 'vt_report')
else:
b64_vt = False
b64_vt_link = ''
b64_vt_report = r_serv_metadata.hget('metadata_hash:'+hash, 'vt_report')
# hash never refreshed
if b64_vt_report is None:
b64_vt_report = ''
l_64.append( (file_icon, estimated_type, hash, saved_path, nb_in_file, b64_vt, b64_vt_link, b64_vt_report) )
crawler_metadata = {}
if 'infoleak:submission="crawler"' in l_tags:
crawler_metadata['get_metadata'] = True
crawler_metadata['domain'] = r_serv_metadata.hget('paste_metadata:'+relative_path, 'domain')
crawler_metadata['paste_father'] = r_serv_metadata.hget('paste_metadata:'+relative_path, 'father')
crawler_metadata['real_link'] = r_serv_metadata.hget('paste_metadata:'+relative_path,'real_link')
crawler_metadata['screenshot'] = get_item_screenshot_path(relative_path)
else:
crawler_metadata['get_metadata'] = False
misp_event = r_serv_metadata.get('misp_events:' + requested_path)
if misp_event is None:
misp_eventid = False
misp_url = ''
else:
misp_eventid = True
misp_url = misp_event_url + misp_event
hive_case = r_serv_metadata.get('hive_cases:' + requested_path)
if hive_case is None:
hive_caseid = False
hive_url = ''
else:
hive_caseid = True
hive_url = hive_case_url.replace('id_here', hive_case)
return render_template("show_saved_item_min.html", bootstrap_label=bootstrap_label, content=item_content,
item_basic_info=item_basic_info, item_info=item_info,
initsize=len(item_content),
hashtype_list = p_hashtype_list,
crawler_metadata=crawler_metadata,
l_64=l_64, vt_enabled=vt_enabled, misp_eventid=misp_eventid, misp_url=misp_url, hive_caseid=hive_caseid, hive_url=hive_url)
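The `nb_seen_hash` handling inside `show_item_min` above performs a lazy migration: sorted-set members that still carry the full pastes path are renamed in place to relative paths while keeping their scores. A dict-based simulation of that ZSCORE/ZREM/ZADD sequence (the folder value is illustrative):

```python
# A dict stands in for the Redis ZSET: member -> score.
PASTES_FOLDER = '/opt/ail/PASTES/'   # hypothetical value

def migrate_zset(zset):
    for member in list(zset):                # snapshot, like ZRANGE 0 -1
        if PASTES_FOLDER in member:
            score = zset.pop(member)         # ZSCORE + ZREM on the old member
            zset[member.replace(PASTES_FOLDER, '', 1)] = score   # ZADD new member

zset = {'/opt/ail/PASTES/archive/a.gz': 2.0, 'archive/b.gz': 1.0}
migrate_zset(zset)
print(sorted(zset.items()))  # [('archive/a.gz', 2.0), ('archive/b.gz', 1.0)]
```

Doing the rewrite on first access spreads the migration cost across normal page views instead of requiring a blocking one-shot update.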
# ============ ROUTES ============ # ============ ROUTES ============
@showsavedpastes.route("/showsavedpaste/") #completely shows the paste in a new tab @showsavedpastes.route("/showsavedpaste/") #completely shows the paste in a new tab
@@ -230,6 +382,11 @@ def showsavedpaste():
requested_path = request.args.get('paste', '') requested_path = request.args.get('paste', '')
return showpaste(0, requested_path) return showpaste(0, requested_path)
@showsavedpastes.route("/showsaveditem_min/") #completely shows the paste in a new tab
def showsaveditem_min():
requested_path = request.args.get('paste', '')
return show_item_min(requested_path)
@showsavedpastes.route("/showsavedrawpaste/") #shows raw @showsavedpastes.route("/showsavedrawpaste/") #shows raw
def showsavedrawpaste(): def showsavedrawpaste():
requested_path = request.args.get('paste', '') requested_path = request.args.get('paste', '')
@@ -241,7 +398,7 @@ def showsavedrawpaste():
def showpreviewpaste(): def showpreviewpaste():
num = request.args.get('num', '') num = request.args.get('num', '')
requested_path = request.args.get('paste', '') requested_path = request.args.get('paste', '')
return "|num|"+num+"|num|"+showpaste(max_preview_modal, requested_path) return "|num|"+num+"|num|"+show_item_min(requested_path, content_range=max_preview_modal)
@showsavedpastes.route("/getmoredata/") @showsavedpastes.route("/getmoredata/")
@@ -278,6 +435,7 @@ def send_file_to_vt():
paste = request.form['paste'] paste = request.form['paste']
hash = request.form['hash'] hash = request.form['hash']
## TODO: # FIXME: path transversal
b64_full_path = os.path.join(os.environ['AIL_HOME'], b64_path) b64_full_path = os.path.join(os.environ['AIL_HOME'], b64_path)
b64_content = '' b64_content = ''
with open(b64_full_path, 'rb') as f: with open(b64_full_path, 'rb') as f:
View file
@@ -0,0 +1,302 @@
<!DOCTYPE html>
<html lang="en">
<head>
<title>Paste information - AIL</title>
<link rel="icon" href="{{ url_for('static', filename='image/ail-icon.png') }}">
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<link href="{{ url_for('static', filename='css/bootstrap4.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/font-awesome.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/dataTables.bootstrap4.min.css') }}" rel="stylesheet">
<script language="javascript" src="{{ url_for('static', filename='js/jquery.js')}}"></script>
<script src="{{ url_for('static', filename='js/bootstrap4.min.js') }}"></script>
<script src="{{ url_for('static', filename='js/jquery.dataTables.min.js') }}"></script>
<script src="{{ url_for('static', filename='js/dataTables.bootstrap.min.js') }}"></script>
<style>
.scrollable-menu {
height: auto;
max-height: 200px;
overflow-x: hidden;
width:100%;
}
.red_table thead{
background: #d91f2d;
color: #fff;
}
</style>
</head>
<body>
<div class="card mb-2">
<div class="card-header bg-dark">
<h3 class="text-white text-center" >{{ item_info['name'] }}</h3>
</div>
<div class="card-body pb-1">
<table class="table table-condensed table-responsive">
<thead class="">
<tr>
<th>Date</th>
<th>Source</th>
<th>Encoding</th>
<th>Size (Kb)</th>
<th>Number of lines</th>
<th>Max line length</th>
</tr>
</thead>
<tbody>
<tr>
<td>{{ item_basic_info['date'] }}</td>
<td>{{ item_basic_info['source'] }}</td>
<td>{{ item_basic_info['encoding'] }}</td>
<td>{{ item_basic_info['size'] }}</td>
<td>{{ item_basic_info['nb_lines'] }}</td>
<td>{{ item_basic_info['max_length_line'] }}</td>
</tr>
</tbody>
</table>
<div>
<h5>
{% for tag in item_info['tags'] %}
<span class="badge badge-{{ bootstrap_label[loop.index0 % 5] }}">{{ tag }}</span>
{% endfor %}
</h5>
</div>
</div>
</div>
{% if misp_eventid %}
<div class="list-group" id="misp_event">
<li class="list-group-item active">MISP Events already Created</li>
<a target="_blank" href="{{ misp_url }}" class="list-group-item">{{ misp_url }}</a>
</div>
{% endif %}
{% if hive_caseid %}
<div class="list-group" id="misp_event">
<li class="list-group-item active">The Hive Case already Created</li>
<a target="_blank" href="{{ hive_url }}" class="list-group-item">{{ hive_url }}</a>
</div>
{% endif %}
{% if item_info['nb_duplictates'] != 0 %}
<div id="accordionDuplicate" class="mb-2">
<div class="card">
<div class="card-header py-1" id="headingDuplicate">
<div class="my-1">
<i class="far fa-clone"></i> duplicates&nbsp;&nbsp;
<div class="badge badge-warning">{{item_info['nb_duplictates']}}</div>
</div>
</div>
</div>
</div>
{% endif %}
{% if l_64|length != 0 %}
<div id="accordionDecoded" class="mb-3">
<div class="card">
<div class="card-header py-1" id="headingDecoded">
<div class="row">
<div class="col-11">
<div class="mt-2">
<i class="fas fa-lock-open"></i> Decoded Files&nbsp;&nbsp;
<div class="badge badge-warning">{{l_64|length}}</div>
</div>
</div>
<div class="col-1">
<button class="btn btn-link py-2 rotate" data-toggle="collapse" data-target="#collapseDecoded" aria-expanded="true" aria-controls="collapseDecoded">
<i class="fas fa-chevron-circle-down"></i>
</button>
</div>
</div>
</div>
<div id="collapseDecoded" class="collapse show" aria-labelledby="headingDecoded" data-parent="#accordionDecoded">
<div class="card-body">
<table id="tableb64" class="red_table table table-striped">
<thead>
<tr>
<th>estimated type</th>
<th>hash</th>
</tr>
</thead>
<tbody>
{% for b64 in l_64 %}
<tr>
<td><i class="fas {{ b64[0] }}"></i>&nbsp;&nbsp;{{ b64[1] }}</td>
<td><a target="_blank" href="{{ url_for('hashDecoded.showHash') }}?hash={{ b64[2] }}">{{ b64[2] }}</a> ({{ b64[4] }})</td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
</div>
</div>
</div>
{% endif %}
{% if crawler_metadata['get_metadata'] %}
<div id="accordionCrawler" class="mb-3">
<div class="card">
<div class="card-header py-1" id="headingCrawled">
<div class="row">
<div class="col-11">
<div class="mt-2">
<i class="fas fa-spider"></i> Crawled Item
</div>
</div>
<div class="col-1">
<button class="btn btn-link py-2 rotate" data-toggle="collapse" data-target="#collapseCrawled" aria-expanded="true" aria-controls="collapseCrawled">
<i class="fas fa-chevron-circle-up"></i>
</button>
</div>
</div>
</div>
<div id="collapseCrawled" class="collapse show" aria-labelledby="headingCrawled" data-parent="#accordionCrawler">
<div class="card-body">
<table class="table table-hover table-striped">
<tbody>
<tr>
<td>Domain</td>
<td><a target="_blank" href="{{ url_for('hiddenServices.show_domain') }}?domain={{ crawler_metadata['domain'] }}" id='domain'>{{ crawler_metadata['domain'] }}</a></td>
</tr>
<tr>
<td>Father</td>
<td><a target="_blank" href="{{ url_for('showsavedpastes.showsavedpaste') }}?paste={{ crawler_metadata['paste_father'] }}" id='paste_father'>{{ crawler_metadata['paste_father'] }}</a></td>
</tr>
<tr>
<td>Url</td>
<td>{{ crawler_metadata['real_link'] }}</td>
</tr>
</tbody>
</table>
<div class="card my-2" style="background-color:#ecf0f1;">
<div class="card-body py-2">
<div class="row">
<div class="col-md-8">
<input class="custom-range mt-2" id="blocks" type="range" min="1" max="50" value="13">
</div>
<div class="col-md-4">
<button class="btn btn-primary" onclick="blocks.value=50;pixelate();">
<i class="fas fa-search-plus"></i>
<span class="label-icon">Full resolution</span>
</button>
</div>
</div>
</div>
</div>
<canvas id="canvas" style="width:100%;"></canvas>
</div>
</div>
</div>
</div>
{% endif %}
<div class="card bg-dark text-white">
<div class="card-header">
<div class="row">
<div class="col-10">
<h3> Content: </h3>
</div>
<div class="col-2">
<div class="mt-2">
<small><a class="text-info" href="{{ url_for('showsavedpastes.showsavedrawpaste') }}?paste={{ request.args.get('paste') }}" id='raw_paste' > [Raw content] </a></small>
</div>
</div>
</div>
</div>
</div>
<p class="my-0" data-initsize="{{ initsize }}"> <pre id="paste-holder" class="border">{{ content }}</pre></p>
<script>
var ltags
var ltagsgalaxies
$(document).ready(function(){
$('#tableDup').DataTable();
$('#tableb64').DataTable({
"aLengthMenu": [[5, 10, 15, -1], [5, 10, 15, "All"]],
"iDisplayLength": 5,
"order": [[ 1, "asc" ]]
});
$(".rotate").click(function(){
$(this).toggleClass("down") ;
})
});
</script>
{% if crawler_metadata['get_metadata'] %}
<script>
var ctx = canvas.getContext('2d'), img = new Image();
/// turn off image smoothing
ctx.webkitImageSmoothingEnabled = false;
ctx.imageSmoothingEnabled = false;
img.onload = pixelate;
img.addEventListener("error", img_error);
var draw_img = false;
img.src = "{{ url_for('showsavedpastes.screenshot', filename=crawler_metadata['screenshot']) }}";
function pixelate() {
/// use slider value (50 = full resolution)
var size;
if( blocks.value == 50 ){
size = 1;
} else {
size = blocks.value * 0.01;
}
canvas.width = img.width;
canvas.height = img.height;
/// cache scaled width and height
var w = canvas.width * size;
var h = canvas.height * size;
/// draw the original image at the scaled-down size
ctx.drawImage(img, 0, 0, w, h);
/// scale it back up without smoothing to get the pixelated effect
ctx.drawImage(canvas, 0, 0, w, h, 0, 0, canvas.width, canvas.height);
}
function img_error() {
img.onerror=null;
img.src="{{ url_for('static', filename='image/AIL.png') }}";
blocks.value = 50;
pixelate();
}
blocks.addEventListener('change', pixelate, false);
</script>
{% endif %}
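The `pixelate()` function above draws the screenshot at a fraction of its size, then redraws that small copy back at full size with image smoothing disabled, which produces the pixelation. The slider-to-scale mapping can be sketched as a standalone function (`pixelScale` is an illustrative name, not part of the template):

```javascript
// Sketch of the block-size math used by pixelate(): slider values
// 1..49 map to a draw scale of 1%..49% of the original image, and the
// maximum value 50 means full resolution (scale factor 1).
function pixelScale(sliderValue) {
  if (sliderValue == 50) {
    return 1; // full resolution, no pixelation
  }
  return sliderValue * 0.01; // e.g. the default 13 -> 13% of original size
}

console.log(pixelScale(50)); // → 1
console.log(pixelScale(13));
```

The "Full resolution" button relies on the same convention: it forces `blocks.value` to 50 before calling `pixelate()`.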
<div id="container-show-more" class="text-center">
</div>
</body>
</html>


@ -433,7 +433,7 @@
 <tbody>
 <tr>
 <td>Domain</td>
-<td><a target="_blank" href="{{ url_for('hiddenServices.onion_domain') }}?onion_domain={{ crawler_metadata['domain'] }}" id='onion_domain'>{{ crawler_metadata['domain'] }}</a></td>
+<td><a target="_blank" href="{{ url_for('hiddenServices.show_domain') }}?domain={{ crawler_metadata['domain'] }}" id='domain'>{{ crawler_metadata['domain'] }}</a></td>
 </tr>
 <tr>
 <td>Father</td>


@ -281,14 +281,13 @@ def terms_management_query_paste():
 p_date = str(paste._get_p_date())
 p_date = p_date[0:4]+'/'+p_date[4:6]+'/'+p_date[6:8]
 p_source = paste.p_source
-p_encoding = paste._get_p_encoding()
 p_size = paste.p_size
 p_mime = paste.p_mime
 p_lineinfo = paste.get_lines_info()
 p_content = paste.get_p_content()
 if p_content != 0:
     p_content = p_content[0:400]
-paste_info.append({"path": path, "date": p_date, "source": p_source, "encoding": p_encoding, "size": p_size, "mime": p_mime, "lineinfo": p_lineinfo, "content": p_content})
+paste_info.append({"path": path, "date": p_date, "source": p_source, "size": p_size, "mime": p_mime, "lineinfo": p_lineinfo, "content": p_content})
 return jsonify(paste_info)


@ -131,7 +131,7 @@
 <span class="term_name">{{ set }}</span>
 <div>
 {% for tag in notificationTagsTermMapping[set] %}
-<a href="{{ url_for('Tags.get_tagged_paste') }}?ltags={{ tag }}">
+<a href="{{ url_for('Tags.Tags_page') }}?ltags={{ tag }}">
 <span class="label label-{{ bootstrap_label[loop.index0 % 5] }} pull-left">{{ tag }}</span>
 </a>
 {% endfor %}
@ -208,7 +208,7 @@
 <span class="term_name">{{ regex }}</span>
 <div>
 {% for tag in notificationTagsTermMapping[regex] %}
-<a href="{{ url_for('Tags.get_tagged_paste') }}?ltags={{ tag }}">
+<a href="{{ url_for('Tags.Tags_page') }}?ltags={{ tag }}">
 <span class="label label-{{ bootstrap_label[loop.index0 % 5] }} pull-left">{{ tag }}</span>
 </a>
 {% endfor %}
@ -285,7 +285,7 @@
 <span class="term_name">{{ term }}</span>
 <div>
 {% for tag in notificationTagsTermMapping[term] %}
-<a href="{{ url_for('Tags.get_tagged_paste') }}?ltags={{ tag }}">
+<a href="{{ url_for('Tags.Tags_page') }}?ltags={{ tag }}">
 <span class="label label-{{ bootstrap_label[loop.index0 % 5] }} pull-left">{{ tag }}</span>
 </a>
 {% endfor %}
@ -443,7 +443,7 @@ function bindEventsForCurrentPage() {
 html_to_add += "<tr>";
 html_to_add += "<th>Source</th>";
 html_to_add += "<th>Date</th>";
-html_to_add += "<th>Encoding</th>";
+html_to_add += "<th>Mime</th>";
 html_to_add += "<th>Size (Kb)</th>";
 html_to_add += "<th># lines</th>";
 html_to_add += "<th>Max length</th>";
@ -456,7 +456,7 @@ function bindEventsForCurrentPage() {
 html_to_add += "<tr>";
 html_to_add += "<td>"+curr_data.source+"</td>";
 html_to_add += "<td>"+curr_data.date+"</td>";
-html_to_add += "<td>"+curr_data.encoding+"</td>";
+html_to_add += "<td>"+curr_data.mime+"</td>";
 html_to_add += "<td>"+curr_data.size+"</td>";
 html_to_add += "<td>"+curr_data.lineinfo[0]+"</td>";
 html_to_add += "<td>"+curr_data.lineinfo[1]+"</td>";
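The hunks above switch the paste table from the dropped `encoding` field to `mime`, both in the header row and in the rows that `bindEventsForCurrentPage()` builds by string concatenation. That row-building pattern can be sketched as a self-contained function using mock data (`buildRow` and the sample object are illustrative, not part of the source):

```javascript
// Builds one <tr> the same way the template's JS does: one string
// concatenation per <td>, in the column order Source / Date / Mime /
// Size / number of lines / max line length.
function buildRow(curr_data) {
  var html_to_add = "<tr>";
  html_to_add += "<td>" + curr_data.source + "</td>";
  html_to_add += "<td>" + curr_data.date + "</td>";
  html_to_add += "<td>" + curr_data.mime + "</td>"; // was curr_data.encoding before this change
  html_to_add += "<td>" + curr_data.size + "</td>";
  html_to_add += "<td>" + curr_data.lineinfo[0] + "</td>";
  html_to_add += "<td>" + curr_data.lineinfo[1] + "</td>";
  return html_to_add + "</tr>";
}

console.log(buildRow({ source: "pastebin", date: "2019/04/25",
                       mime: "text/plain", size: 1.2, lineinfo: [10, 80] }));
```

The field names match the keys in the JSON returned by `terms_management_query_paste()` after this PR, which is why dropping `encoding` on the Python side required the matching rename here.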


@ -1,6 +1,7 @@
 .tag-ctn{
 position: relative;
-height: 30px;
+height: 38px;
+min-height: 38px;
 padding: 0;
 margin-bottom: 0px;
 font-size: 14px;
@ -63,6 +64,7 @@
 }
 .tag-ctn .tag-empty-text{
 color: #DDD;
+width: 0;
 }
 .tag-ctn input:focus{
 border: 0;


@ -0,0 +1,48 @@
<div class="col-12 col-lg-2 p-0 bg-light border-right" id="side_menu">
<button type="button" class="btn btn-outline-secondary mt-1 ml-3" onclick="toggle_sidebar()">
<i class="fas fa-align-left"></i>
<span>Toggle Sidebar</span>
</button>
<nav class="navbar navbar-expand navbar-light bg-light flex-md-column flex-row align-items-start py-2" id="nav_menu">
<h5 class="d-flex text-muted w-100">
<span>Splash Crawlers </span>
<a class="ml-auto" href="{{url_for('hiddenServices.manual')}}">
<i class="fas fa-plus-circle ml-auto"></i>
</a>
</h5>
<ul class="nav flex-md-column flex-row navbar-nav justify-content-between w-100"> <!--nav-pills-->
<li class="nav-item">
<a class="nav-link" href="{{url_for('hiddenServices.dashboard')}}" id="nav_dashboard">
<i class="fas fa-search"></i>
<span>Dashboard</span>
</a>
</li>
<li class="nav-item">
<a class="nav-link" href="{{url_for('hiddenServices.Crawler_Splash_last_by_type')}}?type=onion" id="nav_onion_crawler">
<i class="fas fa-user-secret"></i>
Onion Crawler
</a>
</li>
<li class="nav-item">
<a class="nav-link" href="{{url_for('hiddenServices.Crawler_Splash_last_by_type')}}?type=regular" id="nav_regular_crawler">
<i class="fab fa-html5"></i>
Regular Crawler
</a>
</li>
<li class="nav-item">
<a class="nav-link" href="{{url_for('hiddenServices.manual')}}" id="nav_manual_crawler">
<i class="fas fa-spider"></i>
Manual Crawler
</a>
</li>
<li class="nav-item">
<a class="nav-link" href="{{url_for('hiddenServices.auto_crawler')}}" id="nav_auto_crawler">
<i class="fas fa-sync"></i>
Automatic Crawler
</a>
</li>
</ul>
</nav>
</div>


@ -0,0 +1,48 @@
<nav class="navbar navbar-expand-xl navbar-dark bg-dark">
<a class="navbar-brand" href="{{ url_for('dashboard.index') }}">
<img src="{{ url_for('static', filename='image/ail-icon.png')}}" alt="AIL" style="width:80px;">
</a>
<button class="navbar-toggler" type="button" data-toggle="collapse" data-target="#navbarSupportedContent" aria-controls="navbarSupportedContent" aria-expanded="false" aria-label="Toggle navigation">
<span class="navbar-toggler-icon"></span>
</button>
<div class="collapse navbar-collapse" id="navbarSupportedContent">
<ul class="navbar-nav">
<li class="nav-item">
<a class="nav-link mr-3" href="{{ url_for('dashboard.index') }}">Home</a>
</li>
<li class="nav-item mr-3">
<a class="nav-link" href="{{ url_for('PasteSubmit.PasteSubmit_page') }}" aria-disabled="true"><i class="fas fa-external-link-alt"></i> Submit</a>
</li>
<li class="nav-item mr-3">
<a class="nav-link" id="page-Browse-Items" href="{{ url_for('Tags.Tags_page') }}" aria-disabled="true"><i class="fas fa-tag"></i> Browse Items</a>
</li>
<li class="nav-item mr-3">
<a class="nav-link" href="{{ url_for('terms.terms_management') }}" aria-disabled="true"><i class="fas fa-crosshairs"></i> Leaks Hunter</a>
</li>
<li class="nav-item mr-3">
<a class="nav-link" id="page-Crawler" href="{{ url_for('hiddenServices.dashboard') }}" tabindex="-1" aria-disabled="true"><i class="fas fa-spider"></i> Crawlers</a>
</li>
<li class="nav-item mr-3">
<a class="nav-link" href="{{ url_for('hashDecoded.hashDecoded_page') }}" aria-disabled="true"><i class="fas fa-lock-open"></i> Decoded</a>
</li>
<li class="nav-item mr-3">
<a class="nav-link" href="{{ url_for('trendingmodules.moduletrending') }}" aria-disabled="true"><i class="fas fa-chart-bar"></i> Statistics</a>
</li>
<li class="nav-item mr-3">
<a class="nav-link" id="page-options" href="{{ url_for('settings.settings_page') }}" aria-disabled="true"><i class="fas fa-cog"></i> Server Management</a>
</li>
</ul>
<form class="form-inline my-2 my-lg-0 ml-auto justify-content-center">
<div class="d-flex flex-column">
<div>
<input class="form-control mr-sm-2" type="search" id="global_search" placeholder="Search" aria-label="Search" aria-describedby="advanced_search">
<button class="btn btn-outline-info my-2 my-sm-0" type="submit"><i class="fas fa-search"></i></button>
</div>
<small id="advanced_search" class="form-text"><a class="nav text-muted" href="#" aria-disabled="true">Advanced Search</a></small>
</div>
</form>
</div>
</nav>


@ -0,0 +1,13 @@
<div class="col-12 col-lg-2 p-0 bg-light border-right" id="side_menu">
<button type="button" class="btn btn-outline-secondary mt-1 ml-3" onclick="toggle_sidebar()">
<i class="fas fa-align-left"></i>
<span>Toggle Sidebar</span>
</button>
<nav class="navbar navbar-expand navbar-light bg-light flex-md-column flex-row align-items-start py-2" id="nav_menu">
<h5 class="d-flex text-muted w-100">
<span>Diagnostic</span>
</h5>
</nav>
</div>


@ -0,0 +1,91 @@
<div class="col-12 col-lg-2 p-0 bg-light border-right" id="side_menu">
<button type="button" class="btn btn-outline-secondary mt-1 ml-3" onclick="toggle_sidebar()">
<i class="fas fa-align-left"></i>
<span>Toggle Sidebar</span>
</button>
<nav class="navbar navbar-expand navbar-light bg-light flex-md-column flex-row align-items-start py-2" id="nav_menu">
<h5 class="d-flex text-muted w-100">
<span>Tags Management </span>
</h5>
<ul class="nav flex-md-column flex-row navbar-nav justify-content-between w-100">
<li class="nav-item">
<a class="nav-link" href="{{ url_for('Tags.taxonomies') }}" id="nav_taxonomies">
<i class="fas fa-wrench"></i>
Edit Taxonomies List
</a>
</li>
<li class="nav-item">
<a class="nav-link" href="{{ url_for('Tags.galaxies') }}" id="nav_onion_galaxies">
<i class="fas fa-rocket"></i>
Edit Galaxies List
</a>
</li>
</ul>
<h5 class="d-flex text-muted w-100">
<span>Tags Export </span>
</h5>
<ul class="nav flex-md-column flex-row navbar-nav justify-content-between w-100">
<li class="nav-item">
<a class="nav-link" href="{{url_for('PasteSubmit.edit_tag_export')}}" id="nav_regular_edit_tag_export">
<i class="fas fa-cogs"></i>
MISP and Hive, auto push
</a>
</li>
</ul>
<h5 class="d-flex text-muted w-100" id="nav_quick_search">
<span>Quick Search </span>
</h5>
<ul class="nav flex-md-column flex-row navbar-nav justify-content-between w-100">
<li class="nav-item">
<a class="nav-link" href='{{url_for('Tags.Tags_page')}}?ltags=infoleak:automatic-detection="credential"' id='nav_tag_infoleakautomatic-detectioncredential'>
<i class="fas fa-unlock-alt"></i>
Credentials
</a>
</li>
<li class="nav-item">
<a class="nav-link" href='{{url_for('Tags.Tags_page')}}?ltags=infoleak:automatic-detection="credit-card"' id='nav_tag_infoleakautomatic-detectioncredit-card'>
<i class="far fa-credit-card"></i>
Credit cards
</a>
</li>
<li class="nav-item">
<a class="nav-link" href='{{url_for('Tags.Tags_page')}}?ltags=infoleak:automatic-detection="mail"' id='nav_tag_infoleakautomatic-detectionmail'>
<i class="fas fa-envelope"></i>
Mails
</a>
</li>
<li class="nav-item">
<a class="nav-link" href='{{url_for('Tags.Tags_page')}}?ltags=infoleak:automatic-detection="cve"' id='nav_tag_infoleakautomatic-detectioncve'>
<i class="fas fa-bug"></i>
CVEs
</a>
</li>
<li class="nav-item">
<a class="nav-link" href='{{url_for('Tags.Tags_page')}}?ltags=infoleak:automatic-detection="onion"' id='nav_tag_infoleakautomatic-detectiononion'>
<i class="fas fa-user-secret"></i>
Onions
</a>
</li>
<li class="nav-item">
<a class="nav-link" href='{{url_for('Tags.Tags_page')}}?ltags=infoleak:automatic-detection="bitcoin-address"' id='nav_tag_infoleakautomatic-detectionbitcoin-address'>
<i class="fab fa-bitcoin"></i>
Bitcoin
</a>
</li>
<li class="nav-item">
<a class="nav-link" href='{{url_for('Tags.Tags_page')}}?ltags=infoleak:automatic-detection="base64"' id='nav_tag_infoleakautomatic-detectionbase64'>
<i class="fas fa-lock-open"></i>
Base64
</a>
</li>
<li class="nav-item">
<a class="nav-link" href='{{url_for('Tags.Tags_page')}}?ltags=infoleak:automatic-detection="phone-number"' id='nav_tag_infoleakautomatic-detectionphone-number'>
<i class="fas fa-phone"></i>
Phones
</a>
</li>
</ul>
</nav>
</div>


@ -5,33 +5,51 @@ set -e
 wget http://dygraphs.com/dygraph-combined.js -O ./static/js/dygraph-combined.js
 SBADMIN_VERSION='3.3.7'
-FONT_AWESOME_VERSION='4.7.0'
+BOOTSTRAP_VERSION='4.2.1'
+FONT_AWESOME_VERSION='5.7.1'
 D3_JS_VERSION='5.5.0'
 rm -rf temp
 mkdir temp
+wget https://github.com/twbs/bootstrap/releases/download/v${BOOTSTRAP_VERSION}/bootstrap-${BOOTSTRAP_VERSION}-dist.zip -O temp/bootstrap${BOOTSTRAP_VERSION}.zip
+wget https://github.com/FezVrasta/popper.js/archive/v1.14.3.zip -O temp/popper.zip
 wget https://github.com/BlackrockDigital/startbootstrap-sb-admin/archive/v${SBADMIN_VERSION}.zip -O temp/${SBADMIN_VERSION}.zip
 wget https://github.com/BlackrockDigital/startbootstrap-sb-admin-2/archive/v${SBADMIN_VERSION}.zip -O temp/${SBADMIN_VERSION}-2.zip
-wget https://github.com/FortAwesome/Font-Awesome/archive/v${FONT_AWESOME_VERSION}.zip -O temp/FONT_AWESOME_${FONT_AWESOME_VERSION}.zip
+wget https://github.com/FortAwesome/Font-Awesome/archive/v4.7.0.zip -O temp/FONT_AWESOME_4.7.0.zip
+wget https://github.com/FortAwesome/Font-Awesome/archive/5.7.1.zip -O temp/FONT_AWESOME_${FONT_AWESOME_VERSION}.zip
 wget https://github.com/d3/d3/releases/download/v${D3_JS_VERSION}/d3.zip -O temp/d3_${D3_JS_VERSION}.zip
 # dateRangePicker
 wget https://github.com/moment/moment/archive/2.22.2.zip -O temp/moment_2.22.2.zip
 wget https://github.com/longbill/jquery-date-range-picker/archive/v0.18.0.zip -O temp/daterangepicker_v0.18.0.zip
+unzip temp/bootstrap${BOOTSTRAP_VERSION}.zip -d temp/
+unzip temp/popper.zip -d temp/
 unzip temp/${SBADMIN_VERSION}.zip -d temp/
 unzip temp/${SBADMIN_VERSION}-2.zip -d temp/
+unzip temp/FONT_AWESOME_4.7.0.zip -d temp/
 unzip temp/FONT_AWESOME_${FONT_AWESOME_VERSION}.zip -d temp/
 unzip temp/d3_${D3_JS_VERSION}.zip -d temp/
 unzip temp/moment_2.22.2.zip -d temp/
 unzip temp/daterangepicker_v0.18.0.zip -d temp/
+mv temp/bootstrap-${BOOTSTRAP_VERSION}-dist/js/bootstrap.min.js ./static/js/bootstrap4.min.js
+mv temp/bootstrap-${BOOTSTRAP_VERSION}-dist/js/bootstrap.min.js.map ./static/js/bootstrap.min.js.map
+mv temp/bootstrap-${BOOTSTRAP_VERSION}-dist/css/bootstrap.min.css ./static/css/bootstrap4.min.css
+mv temp/bootstrap-${BOOTSTRAP_VERSION}-dist/css/bootstrap.min.css.map ./static/css/bootstrap4.min.css.map
+mv temp/popper.js-1.14.3/dist/umd/popper.min.js ./static/js/
+mv temp/popper.js-1.14.3/dist/umd/popper.min.js.map ./static/js/
 mv temp/startbootstrap-sb-admin-${SBADMIN_VERSION} temp/sb-admin
 mv temp/startbootstrap-sb-admin-2-${SBADMIN_VERSION} temp/sb-admin-2
-mv temp/Font-Awesome-${FONT_AWESOME_VERSION} temp/font-awesome
+mv temp/Font-Awesome-4.7.0 temp/font-awesome
+rm -rf ./static/webfonts/
+mv temp/Font-Awesome-${FONT_AWESOME_VERSION}/css/all.min.css ./static/css/font-awesome.min.css
+mv temp/Font-Awesome-${FONT_AWESOME_VERSION}/webfonts ./static/webfonts
 rm -rf ./static/js/plugins
 mv temp/sb-admin/js/* ./static/js/
@ -59,6 +77,9 @@ wget https://cdn.datatables.net/1.10.12/js/jquery.dataTables.min.js -O ./static/
 wget https://cdn.datatables.net/plug-ins/1.10.7/integration/bootstrap/3/dataTables.bootstrap.css -O ./static/css/dataTables.bootstrap.css
 wget https://cdn.datatables.net/plug-ins/1.10.7/integration/bootstrap/3/dataTables.bootstrap.js -O ./static/js/dataTables.bootstrap.js
+wget https://cdn.datatables.net/1.10.18/css/dataTables.bootstrap4.min.css -O ./static/css/dataTables.bootstrap.min.css
+wget https://cdn.datatables.net/1.10.18/js/dataTables.bootstrap4.min.js -O ./static/js/dataTables.bootstrap.min.js
 #Ressource for graph
 wget https://raw.githubusercontent.com/flot/flot/958e5fd43c6dff4bab3e1fd5cb6109df5c1e8003/jquery.flot.js -O ./static/js/jquery.flot.js
 wget https://raw.githubusercontent.com/flot/flot/958e5fd43c6dff4bab3e1fd5cb6109df5c1e8003/jquery.flot.pie.js -O ./static/js/jquery.flot.pie.js