Merge pull request #342 from CIRCL/advanced_crawler

Advanced_crawler + tag by daterange + bootstrap 4 migration + bugs fix
This commit is contained in:
Alexandre Dulaunoy 2019-04-25 15:39:19 +02:00 committed by GitHub
commit ecd14b91d9
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
95 changed files with 6782 additions and 1893 deletions

2
.gitignore vendored
View file

@ -34,6 +34,8 @@ var/www/submitted
bin/packages/config.cfg
bin/packages/config.cfg.backup
configs/keys
configs/update.cfg
update/current_version
files
# Pystemon archives

View file

@ -12,46 +12,205 @@ Redis and ARDB overview
- DB 0 - PubSub + Queue and Paste content LRU cache
- DB 1 - _Mixer_ Cache
* ARDB on TCP port 6382
- DB 1 - Curve
- DB 2 - Trending
- DB 3 - Terms
- DB 4 - Sentiments
DB 1 - Curve
DB 2 - TermFreq
DB 3 - Trending
DB 4 - Sentiments
DB 5 - TermCred
DB 6 - Tags
DB 7 - Metadata
DB 8 - Statistics
DB 9 - Crawler
* ARDB on TCP port <year>
- DB 0 - Lines duplicate
- DB 1 - Hashes
# Database Map:
## DB0 - Core:
##### Update keys:
| Key | Value |
| ------ | ------ |
| | |
| ail:version | **current version** |
| | |
| ail:update_**update_version** | **background update name** |
| | **background update name** |
| | **...** |
| | |
| ail:update_date_v1.5 | **update date** |
| | |
| ail:update_error | **update message error** |
| | |
| ail:update_in_progress | **update version in progress** |
| ail:current_background_update | **current update version** |
| | |
| ail:current_background_script | **name of the background script currently executed** |
| ail:current_background_script_stat | **progress in % of the background script** |
## DB2 - TermFreq:
##### Set:
| Key | Value |
| ------ | ------ |
| TrackedSetTermSet | **tracked_term** |
| TrackedSetSet | **tracked_set** |
| TrackedRegexSet | **tracked_regex** |
| | |
| tracked_**tracked_term** | **item_path** |
| set_**tracked_set** | **item_path** |
| regex_**tracked_regex** | **item_path** |
| | |
| TrackedNotifications | **tracked_trem / set / regex** |
| | |
| TrackedNotificationTags_**tracked_trem / set / regex** | **tag** |
| | |
| TrackedNotificationEmails_**tracked_trem / set / regex** | **email** |
##### Zset:
| Key | Field | Value |
| ------ | ------ | ------ |
| per_paste_TopTermFreq_set_month | **term** | **nb_seen** |
| per_paste_TopTermFreq_set_week | **term** | **nb_seen** |
| per_paste_TopTermFreq_set_day_**epoch** | **term** | **nb_seen** |
| | | |
| TopTermFreq_set_month | **term** | **nb_seen** |
| TopTermFreq_set_week | **term** | **nb_seen** |
| TopTermFreq_set_day_**epoch** | **term** | **nb_seen** |
##### Hset:
| Key | Field | Value |
| ------ | ------ | ------ |
| TrackedTermDate | **tracked_term** | **epoch** |
| TrackedSetDate | **tracked_set** | **epoch** |
| TrackedRegexDate | **tracked_regex** | **epoch** |
| | | |
| BlackListTermDate | **blacklisted_term** | **epoch** |
| | | |
| **epoch** | **term** | **nb_seen** |
## DB6 - Tags:
##### Hset:
| Key | Field | Value |
| ------ | ------ | ------ |
| per_paste_**epoch** | **term** | **nb_seen** |
| | |
| tag_metadata:**tag** | first_seen | **date** |
| tag_metadata:**tag** | last_seen | **date** |
##### Set:
| Key | Value |
| ------ | ------ |
| list_tags | **tag** |
| active_taxonomies | **taxonomie** |
| active_galaxies | **galaxie** |
| active_tag_**taxonomie or galaxy** | **tag** |
| synonym_tag_misp-galaxy:**galaxy** | **tag synonym** |
| list_export_tags | **user_tag** |
| **tag**:**date** | **paste** |
##### old:
| Key | Value |
| ------ | ------ |
| *tag* | *paste* |
## DB7 - Metadata:
#### Crawled Items:
##### Hset:
| Key | Field | Value |
| ------ | ------ | ------ |
| paste_metadata:**item path** | super_father | **first url crawled** |
| | father | **item father** |
| | domain | **crawled domain**:**domain port** |
| | screenshot | **screenshot hash** |
##### Set:
| Key | Field |
| ------ | ------ |
| tag:**item path** | **tag** |
| | |
| paste_children:**item path** | **item path** |
| | |
| hash_paste:**item path** | **hash** |
| base64_paste:**item path** | **hash** |
| hexadecimal_paste:**item path** | **hash** |
| binary_paste:**item path** | **hash** |
##### Zset:
| Key | Field | Value |
| ------ | ------ | ------ |
| nb_seen_hash:**hash** | **item** | **nb_seen** |
| base64_hash:**hash** | **item** | **nb_seen** |
| binary_hash:**hash** | **item** | **nb_seen** |
| hexadecimal_hash:**hash** | **item** | **nb_seen** |
## DB9 - Crawler:
##### Hset:
| Key | Field | Value |
| ------ | ------ | ------ |
| **service type**_metadata:**domain** | first_seen | **date** |
| | last_check | **date** |
| | ports | **port**;**port**;**port** ... |
| | paste_parent | **parent last crawling (can be auto or manual)** |
##### Zset:
| Key | Field | Value |
| ------ | ------ | ------ |
| crawler\_history\_**service type**:**domain**:**port** | **item root (first crawled item)** | **epoch (seconds)** |
##### Set:
| Key | Value |
| ------ | ------ | ------ |
| screenshot:**sha256** | **item path** |
##### crawler config:
| Key | Value |
| ------ | ------ |
| crawler\_config:**crawler mode**:**service type**:**domain** | **json config** |
##### automatic crawler config:
| Key | Value |
| ------ | ------ |
| crawler\_config:**crawler mode**:**service type**:**domain**:**url** | **json config** |
###### exemple json config:
```json
{
"closespider_pagecount": 1,
"time": 3600,
"depth_limit": 0,
"har": 0,
"png": 0
}
```
ARDB overview
---------------------------
ARDB_DB
* DB 1 - Curve
* DB 2 - TermFreq
----------------------------------------- TERM ----------------------------------------
SET - 'TrackedRegexSet' term
----------------------------------------- SENTIMENT ------------------------------------
HSET - 'TrackedRegexDate' tracked_regex today_timestamp
SET - 'Provider_set' Provider
SET - 'TrackedSetSet' set_to_add
KEY - 'UniqID' INT
HSET - 'TrackedSetDate' set_to_add today_timestamp
SET - provider_timestamp UniqID
SET - 'TrackedSetTermSet' term
SET - UniqID avg_score
HSET - 'TrackedTermDate' tracked_regex today_timestamp
SET - 'TrackedNotificationEmails_'+term/set email
SET - 'TrackedNotifications' term/set
* DB 3 - Trending
* DB 4 - Sentiment
* DB 5 - TermCred
* DB 6 - Tags
* DB 7 - Metadata
* DB 8 - Statistics
* DB 7 - Metadata:
----------------------------------------------------------------------------------------
----------------------------------------- BASE64 ----------------------------------------
HSET - 'metadata_hash:'+hash 'saved_path' saved_path
@ -71,18 +230,10 @@ ARDB_DB
SET - 'hash_base64_all_type' hash_type *
SET - 'hash_binary_all_type' hash_type *
SET - 'hash_paste:'+paste hash *
SET - 'base64_paste:'+paste hash *
SET - 'binary_paste:'+paste hash *
ZADD - 'hash_date:'+20180622 hash * nb_seen_this_day
ZADD - 'base64_date:'+20180622 hash * nb_seen_this_day
ZADD - 'binary_date:'+20180622 hash * nb_seen_this_day
ZADD - 'nb_seen_hash:'+hash paste * nb_seen_in_paste
ZADD - 'base64_hash:'+hash paste * nb_seen_in_paste
ZADD - 'binary_hash:'+hash paste * nb_seen_in_paste
ZADD - 'base64_type:'+type date nb_seen
ZADD - 'binary_type:'+type date nb_seen

View file

@ -40,7 +40,7 @@ def search_api_key(message):
print('found google api key')
print(to_print)
publisher.warning('{}Checked {} found Google API Key;{}'.format(
to_print, len(google_api_key), paste.p_path))
to_print, len(google_api_key), paste.p_rel_path))
msg = 'infoleak:automatic-detection="google-api-key";{}'.format(filename)
p.populate_set_out(msg, 'Tags')
@ -49,7 +49,7 @@ def search_api_key(message):
print(to_print)
total = len(aws_access_key) + len(aws_secret_key)
publisher.warning('{}Checked {} found AWS Key;{}'.format(
to_print, total, paste.p_path))
to_print, total, paste.p_rel_path))
msg = 'infoleak:automatic-detection="aws-key";{}'.format(filename)
p.populate_set_out(msg, 'Tags')
@ -57,8 +57,6 @@ def search_api_key(message):
msg = 'infoleak:automatic-detection="api-key";{}'.format(filename)
p.populate_set_out(msg, 'Tags')
msg = 'apikey;{}'.format(filename)
p.populate_set_out(msg, 'alertHandler')
#Send to duplicate
p.populate_set_out(filename, 'Duplicate')

View file

@ -43,8 +43,8 @@ if __name__ == "__main__":
# FIXME why not all saving everything there.
PST.save_all_attributes_redis()
# FIXME Not used.
PST.store.sadd("Pastes_Objects", PST.p_path)
PST.store.sadd("Pastes_Objects", PST.p_rel_path)
except IOError:
print("CRC Checksum Failed on :", PST.p_path)
print("CRC Checksum Failed on :", PST.p_rel_path)
publisher.error('Duplicate;{};{};{};CRC Checksum Failed'.format(
PST.p_source, PST.p_date, PST.p_name))

View file

@ -67,7 +67,7 @@ def check_all_iban(l_iban, paste, filename):
if(nb_valid_iban > 0):
to_print = 'Iban;{};{};{};'.format(paste.p_source, paste.p_date, paste.p_name)
publisher.warning('{}Checked found {} IBAN;{}'.format(
to_print, nb_valid_iban, paste.p_path))
to_print, nb_valid_iban, paste.p_rel_path))
msg = 'infoleak:automatic-detection="iban";{}'.format(filename)
p.populate_set_out(msg, 'Tags')
@ -113,7 +113,7 @@ if __name__ == "__main__":
try:
l_iban = iban_regex.findall(content)
except TimeoutException:
print ("{0} processing timeout".format(paste.p_path))
print ("{0} processing timeout".format(paste.p_rel_path))
continue
else:
signal.alarm(0)

View file

@ -62,8 +62,6 @@ def search_key(content, message, paste):
to_print = 'Bitcoin found: {} address and {} private Keys'.format(len(bitcoin_address), len(bitcoin_private_key))
print(to_print)
publisher.warning(to_print)
msg = ('bitcoin;{}'.format(message))
p.populate_set_out( msg, 'alertHandler')
msg = 'infoleak:automatic-detection="bitcoin-address";{}'.format(message)
p.populate_set_out(msg, 'Tags')
@ -75,7 +73,7 @@ def search_key(content, message, paste):
to_print = 'Bitcoin;{};{};{};'.format(paste.p_source, paste.p_date,
paste.p_name)
publisher.warning('{}Detected {} Bitcoin private key;{}'.format(
to_print, len(bitcoin_private_key),paste.p_path))
to_print, len(bitcoin_private_key),paste.p_rel_path))
if __name__ == "__main__":
publisher.port = 6380

View file

@ -89,16 +89,10 @@ if __name__ == "__main__":
paste = Paste.Paste(filename)
content = paste.get_p_content()
#print('-----------------------------------------------------')
#print(filename)
#print(content)
#print('-----------------------------------------------------')
for categ, pattern in tmp_dict.items():
found = set(re.findall(pattern, content))
if len(found) >= matchingThreshold:
msg = '{} {}'.format(paste.p_path, len(found))
#msg = " ".join( [paste.p_path, bytes(len(found))] )
msg = '{} {}'.format(paste.p_rel_path, len(found))
print(msg, categ)
p.populate_set_out(msg, categ)
@ -106,4 +100,4 @@ if __name__ == "__main__":
publisher.info(
'Categ;{};{};{};Detected {} as {};{}'.format(
paste.p_source, paste.p_date, paste.p_name,
len(found), categ, paste.p_path))
len(found), categ, paste.p_rel_path))

View file

@ -4,6 +4,8 @@
import os
import sys
import re
import uuid
import json
import redis
import datetime
import time
@ -16,22 +18,179 @@ sys.path.append(os.environ['AIL_BIN'])
from Helper import Process
from pubsublogger import publisher
def on_error_send_message_back_in_queue(type_hidden_service, domain, message):
# send this msg back in the queue
if not r_onion.sismember('{}_domain_crawler_queue'.format(type_hidden_service), domain):
r_onion.sadd('{}_domain_crawler_queue'.format(type_hidden_service), domain)
r_onion.sadd('{}_crawler_priority_queue'.format(type_hidden_service), message)
# ======== FUNCTIONS ========
def crawl_onion(url, domain, date, date_month, message):
def load_blacklist(service_type):
try:
with open(os.environ['AIL_BIN']+'/torcrawler/blacklist_{}.txt'.format(service_type), 'r') as f:
redis_crawler.delete('blacklist_{}'.format(service_type))
lines = f.read().splitlines()
for line in lines:
redis_crawler.sadd('blacklist_{}'.format(service_type), line)
except Exception:
pass
def update_auto_crawler():
current_epoch = int(time.time())
list_to_crawl = redis_crawler.zrangebyscore('crawler_auto_queue', '-inf', current_epoch)
for elem_to_crawl in list_to_crawl:
mess, type = elem_to_crawl.rsplit(';', 1)
redis_crawler.sadd('{}_crawler_priority_queue'.format(type), mess)
redis_crawler.zrem('crawler_auto_queue', elem_to_crawl)
# Extract info form url (url, domain, domain url, ...)
def unpack_url(url):
to_crawl = {}
faup.decode(url)
url_unpack = faup.get()
to_crawl['domain'] = url_unpack['domain'].decode()
if url_unpack['scheme'] is None:
to_crawl['scheme'] = 'http'
url= 'http://{}'.format(url_unpack['url'].decode())
else:
scheme = url_unpack['scheme'].decode()
if scheme in default_proto_map:
to_crawl['scheme'] = scheme
url = url_unpack['url'].decode()
else:
redis_crawler.sadd('new_proto', '{} {}'.format(scheme, url_unpack['url'].decode()))
to_crawl['scheme'] = 'http'
url= 'http://{}'.format(url_unpack['url'].decode().replace(scheme, '', 1))
if url_unpack['port'] is None:
to_crawl['port'] = default_proto_map[to_crawl['scheme']]
else:
port = url_unpack['port'].decode()
# Verify port number #################### make function to verify/correct port number
try:
int(port)
# Invalid port Number
except Exception as e:
port = default_proto_map[to_crawl['scheme']]
to_crawl['port'] = port
#if url_unpack['query_string'] is None:
# if to_crawl['port'] == 80:
# to_crawl['url']= '{}://{}'.format(to_crawl['scheme'], url_unpack['host'].decode())
# else:
# to_crawl['url']= '{}://{}:{}'.format(to_crawl['scheme'], url_unpack['host'].decode(), to_crawl['port'])
#else:
# to_crawl['url']= '{}://{}:{}{}'.format(to_crawl['scheme'], url_unpack['host'].decode(), to_crawl['port'], url_unpack['query_string'].decode())
to_crawl['url'] = url
if to_crawl['port'] == 80:
to_crawl['domain_url'] = '{}://{}'.format(to_crawl['scheme'], url_unpack['host'].decode())
else:
to_crawl['domain_url'] = '{}://{}:{}'.format(to_crawl['scheme'], url_unpack['host'].decode(), to_crawl['port'])
to_crawl['tld'] = url_unpack['tld'].decode()
return to_crawl
# get url, paste and service_type to crawl
def get_elem_to_crawl(rotation_mode):
message = None
domain_service_type = None
#load_priority_queue
for service_type in rotation_mode:
message = redis_crawler.spop('{}_crawler_priority_queue'.format(service_type))
if message is not None:
domain_service_type = service_type
break
#load_normal_queue
if message is None:
for service_type in rotation_mode:
message = redis_crawler.spop('{}_crawler_queue'.format(service_type))
if message is not None:
domain_service_type = service_type
break
if message:
splitted = message.rsplit(';', 1)
if len(splitted) == 2:
url, paste = splitted
if paste:
paste = paste.replace(PASTES_FOLDER+'/', '')
message = {'url': url, 'paste': paste, 'type_service': domain_service_type, 'original_message': message}
return message
def get_crawler_config(redis_server, mode, service_type, domain, url=None):
crawler_options = {}
if mode=='auto':
config = redis_server.get('crawler_config:{}:{}:{}:{}'.format(mode, service_type, domain, url))
else:
config = redis_server.get('crawler_config:{}:{}:{}'.format(mode, service_type, domain))
if config is None:
config = {}
else:
config = json.loads(config)
for option in default_crawler_config:
if option in config:
crawler_options[option] = config[option]
else:
crawler_options[option] = default_crawler_config[option]
if mode == 'auto':
crawler_options['time'] = int(config['time'])
elif mode == 'manual':
redis_server.delete('crawler_config:{}:{}:{}'.format(mode, service_type, domain))
return crawler_options
def load_crawler_config(service_type, domain, paste, url, date):
crawler_config = {}
crawler_config['splash_url'] = splash_url
crawler_config['item'] = paste
crawler_config['service_type'] = service_type
crawler_config['domain'] = domain
crawler_config['date'] = date
# Auto and Manual Crawling
# Auto ################################################# create new entry, next crawling => here or when ended ?
if paste == 'auto':
crawler_config['crawler_options'] = get_crawler_config(redis_crawler, 'auto', service_type, domain, url=url)
crawler_config['requested'] = True
# Manual
elif paste == 'manual':
crawler_config['crawler_options'] = get_crawler_config(r_cache, 'manual', service_type, domain)
crawler_config['requested'] = True
# default crawler
else:
crawler_config['crawler_options'] = get_crawler_config(redis_crawler, 'default', service_type, domain)
crawler_config['requested'] = False
return crawler_config
def is_domain_up_day(domain, type_service, date_day):
if redis_crawler.sismember('{}_up:{}'.format(type_service, date_day), domain):
return True
else:
return False
def set_crawled_domain_metadata(type_service, date, domain, father_item):
# first seen
if not redis_crawler.hexists('{}_metadata:{}'.format(type_service, domain), 'first_seen'):
redis_crawler.hset('{}_metadata:{}'.format(type_service, domain), 'first_seen', date['date_day'])
redis_crawler.hset('{}_metadata:{}'.format(type_service, domain), 'paste_parent', father_item)
# last check
redis_crawler.hset('{}_metadata:{}'.format(type_service, domain), 'last_check', date['date_day'])
# Put message back on queue
def on_error_send_message_back_in_queue(type_service, domain, message):
if not redis_crawler.sismember('{}_domain_crawler_queue'.format(type_service), domain):
redis_crawler.sadd('{}_domain_crawler_queue'.format(type_service), domain)
redis_crawler.sadd('{}_crawler_priority_queue'.format(type_service), message)
def crawl_onion(url, domain, port, type_service, message, crawler_config):
crawler_config['url'] = url
crawler_config['port'] = port
print('Launching Crawler: {}'.format(url))
r_cache.hset('metadata_crawler:{}'.format(splash_port), 'crawling_domain', domain)
r_cache.hset('metadata_crawler:{}'.format(splash_port), 'started_time', datetime.datetime.now().strftime("%Y/%m/%d - %H:%M.%S"))
#if not r_onion.sismember('full_onion_up', domain) and not r_onion.sismember('onion_down:'+date , domain):
super_father = r_serv_metadata.hget('paste_metadata:'+paste, 'super_father')
if super_father is None:
super_father=paste
retry = True
nb_retry = 0
while retry:
@ -43,7 +202,7 @@ def crawl_onion(url, domain, date, date_month, message):
nb_retry += 1
if nb_retry == 6:
on_error_send_message_back_in_queue(type_hidden_service, domain, message)
on_error_send_message_back_in_queue(type_service, domain, message)
publisher.error('{} SPASH DOWN'.format(splash_url))
print('--------------------------------------')
print(' \033[91m DOCKER SPLASH DOWN\033[0m')
@ -57,7 +216,11 @@ def crawl_onion(url, domain, date, date_month, message):
if r.status_code == 200:
r_cache.hset('metadata_crawler:{}'.format(splash_port), 'status', 'Crawling')
process = subprocess.Popen(["python", './torcrawler/tor_crawler.py', splash_url, type_hidden_service, url, domain, paste, super_father],
# save config in cash
UUID = str(uuid.uuid4())
r_cache.set('crawler_request:{}'.format(UUID), json.dumps(crawler_config))
process = subprocess.Popen(["python", './torcrawler/tor_crawler.py', UUID],
stdout=subprocess.PIPE)
while process.poll() is None:
time.sleep(1)
@ -67,7 +230,7 @@ def crawl_onion(url, domain, date, date_month, message):
print(output)
# error: splash:Connection to proxy refused
if 'Connection to proxy refused' in output:
on_error_send_message_back_in_queue(type_hidden_service, domain, message)
on_error_send_message_back_in_queue(type_service, domain, message)
publisher.error('{} SPASH, PROXY DOWN OR BAD CONFIGURATION'.format(splash_url))
print('------------------------------------------------------------------------')
print(' \033[91m SPLASH: Connection to proxy refused')
@ -80,56 +243,53 @@ def crawl_onion(url, domain, date, date_month, message):
print(process.stdout.read())
exit(-1)
else:
on_error_send_message_back_in_queue(type_hidden_service, domain, message)
on_error_send_message_back_in_queue(type_service, domain, message)
print('--------------------------------------')
print(' \033[91m DOCKER SPLASH DOWN\033[0m')
print(' {} DOWN'.format(splash_url))
r_cache.hset('metadata_crawler:{}'.format(splash_port), 'status', 'Crawling')
exit(1)
# check external links (full_crawl)
def search_potential_source_domain(type_service, domain):
external_domains = set()
for link in redis_crawler.smembers('domain_{}_external_links:{}'.format(type_service, domain)):
# unpack url
url_data = unpack_url(link)
if url_data['domain'] != domain:
if url_data['tld'] == 'onion' or url_data['tld'] == 'i2p':
external_domains.add(url_data['domain'])
# # TODO: add special tag ?
if len(external_domains) >= 20:
redis_crawler.sadd('{}_potential_source'.format(type_service), domain)
print('New potential source found: domain')
redis_crawler.delete('domain_{}_external_links:{}'.format(type_service, domain))
if __name__ == '__main__':
if len(sys.argv) != 3:
print('usage:', 'Crawler.py', 'type_hidden_service (onion or i2p or regular)', 'splash_port')
if len(sys.argv) != 2:
print('usage:', 'Crawler.py', 'splash_port')
exit(1)
##################################################
#mode = sys.argv[1]
splash_port = sys.argv[1]
type_hidden_service = sys.argv[1]
splash_port = sys.argv[2]
rotation_mode = ['onion', 'regular']
default_proto_map = {'http': 80, 'https': 443}
######################################################## add ftp ???
publisher.port = 6380
publisher.channel = "Script"
publisher.info("Script Crawler started")
config_section = 'Crawler'
# Setup the I/O queues
p = Process(config_section)
url_onion = "((http|https|ftp)?(?:\://)?([a-zA-Z0-9\.\-]+(\:[a-zA-Z0-9\.&%\$\-]+)*@)*((25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9])\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9]|0)\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9]|0)\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[0-9])|localhost|([a-zA-Z0-9\-]+\.)*[a-zA-Z0-9\-]+\.onion)(\:[0-9]+)*(/($|[a-zA-Z0-9\.\,\?\'\\\+&%\$#\=~_\-]+))*)"
re.compile(url_onion)
url_i2p = "((http|https|ftp)?(?:\://)?([a-zA-Z0-9\.\-]+(\:[a-zA-Z0-9\.&%\$\-]+)*@)*((25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9])\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9]|0)\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9]|0)\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[0-9])|localhost|([a-zA-Z0-9\-]+\.)*[a-zA-Z0-9\-]+\.i2p)(\:[0-9]+)*(/($|[a-zA-Z0-9\.\,\?\'\\\+&%\$#\=~_\-]+))*)"
re.compile(url_i2p)
if type_hidden_service == 'onion':
regex_hidden_service = url_onion
splash_url = '{}:{}'.format( p.config.get("Crawler", "splash_url_onion"), splash_port)
elif type_hidden_service == 'i2p':
regex_hidden_service = url_i2p
splash_url = '{}:{}'.format( p.config.get("Crawler", "splash_url_i2p"), splash_port)
elif type_hidden_service == 'regular':
regex_hidden_service = url_i2p
splash_url = '{}:{}'.format( p.config.get("Crawler", "splash_url_onion"), splash_port)
else:
print('incorrect crawler type: {}'.format(type_hidden_service))
exit(0)
print('splash url: {}'.format(splash_url))
crawler_depth_limit = p.config.getint("Crawler", "crawler_depth_limit")
faup = Faup()
PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], p.config.get("Directories", "pastes"))
r_serv_metadata = redis.StrictRedis(
@ -144,137 +304,122 @@ if __name__ == '__main__':
db=p.config.getint("Redis_Cache", "db"),
decode_responses=True)
r_onion = redis.StrictRedis(
redis_crawler = redis.StrictRedis(
host=p.config.get("ARDB_Onion", "host"),
port=p.config.getint("ARDB_Onion", "port"),
db=p.config.getint("ARDB_Onion", "db"),
decode_responses=True)
r_cache.sadd('all_crawler:{}'.format(type_hidden_service), splash_port)
faup = Faup()
# Default crawler options
default_crawler_config = {'html': 1,
'har': 1,
'png': 1,
'depth_limit': p.config.getint("Crawler", "crawler_depth_limit"),
'closespider_pagecount': 50,
'user_agent': 'Mozilla/5.0 (Windows NT 6.1; rv:24.0) Gecko/20100101 Firefox/24.0'}
# Track launched crawler
r_cache.sadd('all_crawler', splash_port)
r_cache.hset('metadata_crawler:{}'.format(splash_port), 'status', 'Waiting')
r_cache.hset('metadata_crawler:{}'.format(splash_port), 'started_time', datetime.datetime.now().strftime("%Y/%m/%d - %H:%M.%S"))
# load domains blacklist
try:
with open(os.environ['AIL_BIN']+'/torcrawler/blacklist_onion.txt', 'r') as f:
r_onion.delete('blacklist_{}'.format(type_hidden_service))
lines = f.read().splitlines()
for line in lines:
r_onion.sadd('blacklist_{}'.format(type_hidden_service), line)
except Exception:
pass
# update hardcoded blacklist
load_blacklist('onion')
load_blacklist('regular')
while True:
# Priority Queue - Recovering the streamed message informations.
message = r_onion.spop('{}_crawler_priority_queue'.format(type_hidden_service))
update_auto_crawler()
if message is None:
# Recovering the streamed message informations.
message = r_onion.spop('{}_crawler_queue'.format(type_hidden_service))
if message is not None:
splitted = message.split(';')
if len(splitted) == 2:
url, paste = splitted
paste = paste.replace(PASTES_FOLDER+'/', '')
url_list = re.findall(regex_hidden_service, url)[0]
if url_list[1] == '':
url= 'http://{}'.format(url)
link, s, credential, subdomain, domain, host, port, \
resource_path, query_string, f1, f2, f3, f4 = url_list
domain = url_list[4]
r_onion.srem('{}_domain_crawler_queue'.format(type_hidden_service), domain)
domain_url = 'http://{}'.format(domain)
to_crawl = get_elem_to_crawl(rotation_mode)
if to_crawl:
url_data = unpack_url(to_crawl['url'])
# remove domain from queue
redis_crawler.srem('{}_domain_crawler_queue'.format(to_crawl['type_service']), url_data['domain'])
print()
print()
print('\033[92m------------------START CRAWLER------------------\033[0m')
print('crawler type: {}'.format(type_hidden_service))
print('crawler type: {}'.format(to_crawl['type_service']))
print('\033[92m-------------------------------------------------\033[0m')
print('url: {}'.format(url))
print('domain: {}'.format(domain))
print('domain_url: {}'.format(domain_url))
print('url: {}'.format(url_data['url']))
print('domain: {}'.format(url_data['domain']))
print('domain_url: {}'.format(url_data['domain_url']))
print()
faup.decode(domain)
onion_domain=faup.get()['domain'].decode()
# Check blacklist
if not redis_crawler.sismember('blacklist_{}'.format(to_crawl['type_service']), url_data['domain']):
date = {'date_day': datetime.datetime.now().strftime("%Y%m%d"),
'date_month': datetime.datetime.now().strftime("%Y%m"),
'epoch': int(time.time())}
if not r_onion.sismember('blacklist_{}'.format(type_hidden_service), domain) and not r_onion.sismember('blacklist_{}'.format(type_hidden_service), onion_domain):
# Update crawler status type
r_cache.sadd('{}_crawlers'.format(to_crawl['type_service']), splash_port)
date = datetime.datetime.now().strftime("%Y%m%d")
date_month = datetime.datetime.now().strftime("%Y%m")
if not r_onion.sismember('month_{}_up:{}'.format(type_hidden_service, date_month), domain) and not r_onion.sismember('{}_down:{}'.format(type_hidden_service, date), domain):
# first seen
if not r_onion.hexists('{}_metadata:{}'.format(type_hidden_service, domain), 'first_seen'):
r_onion.hset('{}_metadata:{}'.format(type_hidden_service, domain), 'first_seen', date)
# last_father
r_onion.hset('{}_metadata:{}'.format(type_hidden_service, domain), 'paste_parent', paste)
# last check
r_onion.hset('{}_metadata:{}'.format(type_hidden_service, domain), 'last_check', date)
crawl_onion(url, domain, date, date_month, message)
if url != domain_url:
print(url)
print(domain_url)
crawl_onion(domain_url, domain, date, date_month, message)
# save down onion
if not r_onion.sismember('{}_up:{}'.format(type_hidden_service, date), domain):
r_onion.sadd('{}_down:{}'.format(type_hidden_service, date), domain)
#r_onion.sadd('{}_down_link:{}'.format(type_hidden_service, date), url)
#r_onion.hincrby('{}_link_down'.format(type_hidden_service), url, 1)
else:
#r_onion.hincrby('{}_link_up'.format(type_hidden_service), url, 1)
if r_onion.sismember('month_{}_up:{}'.format(type_hidden_service, date_month), domain) and r_serv_metadata.exists('paste_children:'+paste):
msg = 'infoleak:automatic-detection="{}";{}'.format(type_hidden_service, paste)
p.populate_set_out(msg, 'Tags')
# add onion screenshot history
# add crawled days
if r_onion.lindex('{}_history:{}'.format(type_hidden_service, domain), 0) != date:
r_onion.lpush('{}_history:{}'.format(type_hidden_service, domain), date)
# add crawled history by date
r_onion.lpush('{}_history:{}:{}'.format(type_hidden_service, domain, date), paste) #add datetime here
# check external onions links (full_scrawl)
external_domains = set()
for link in r_onion.smembers('domain_{}_external_links:{}'.format(type_hidden_service, domain)):
external_domain = re.findall(url_onion, link)
external_domain.extend(re.findall(url_i2p, link))
if len(external_domain) > 0:
external_domain = external_domain[0][4]
else:
crawler_config = load_crawler_config(to_crawl['type_service'], url_data['domain'], to_crawl['paste'], to_crawl['url'], date)
# check if default crawler
if not crawler_config['requested']:
# Auto crawl only if service not up this month
if redis_crawler.sismember('month_{}_up:{}'.format(to_crawl['type_service'], date['date_month']), url_data['domain']):
continue
if '.onion' in external_domain and external_domain != domain:
external_domains.add(external_domain)
elif '.i2p' in external_domain and external_domain != domain:
external_domains.add(external_domain)
if len(external_domains) >= 10:
r_onion.sadd('{}_potential_source'.format(type_hidden_service), domain)
r_onion.delete('domain_{}_external_links:{}'.format(type_hidden_service, domain))
print(r_onion.smembers('domain_{}_external_links:{}'.format(type_hidden_service, domain)))
# update list, last crawled onions
r_onion.lpush('last_{}'.format(type_hidden_service), domain)
r_onion.ltrim('last_{}'.format(type_hidden_service), 0, 15)
set_crawled_domain_metadata(to_crawl['type_service'], date, url_data['domain'], to_crawl['paste'])
#### CRAWLER ####
# Manual and Auto Crawler
if crawler_config['requested']:
######################################################crawler strategy
# CRAWL domain
crawl_onion(url_data['url'], url_data['domain'], url_data['port'], to_crawl['type_service'], to_crawl['original_message'], crawler_config)
# Default Crawler
else:
# CRAWL domain
crawl_onion(url_data['domain_url'], url_data['domain'], url_data['port'], to_crawl['type_service'], to_crawl['original_message'], crawler_config)
#if url != domain_url and not is_domain_up_day(url_data['domain'], to_crawl['type_service'], date['date_day']):
# crawl_onion(url_data['url'], url_data['domain'], to_crawl['original_message'])
# Save last_status day (DOWN)
if not is_domain_up_day(url_data['domain'], to_crawl['type_service'], date['date_day']):
redis_crawler.sadd('{}_down:{}'.format(to_crawl['type_service'], date['date_day']), url_data['domain'])
# if domain was UP at least one time
if redis_crawler.exists('crawler_history_{}:{}:{}'.format(to_crawl['type_service'], url_data['domain'], url_data['port'])):
# add crawler history (if domain is down)
if not redis_crawler.zrangebyscore('crawler_history_{}:{}:{}'.format(to_crawl['type_service'], url_data['domain'], url_data['port']), date['epoch'], date['epoch']):
# Domain is down
redis_crawler.zadd('crawler_history_{}:{}:{}'.format(to_crawl['type_service'], url_data['domain'], url_data['port']), int(date['epoch']), int(date['epoch']))
############################
# extract page content
############################
# update list, last crawled domains
redis_crawler.lpush('last_{}'.format(to_crawl['type_service']), '{}:{};{}'.format(url_data['domain'], url_data['port'], date['epoch']))
redis_crawler.ltrim('last_{}'.format(to_crawl['type_service']), 0, 15)
#update crawler status
r_cache.hset('metadata_crawler:{}'.format(splash_port), 'status', 'Waiting')
r_cache.hdel('metadata_crawler:{}'.format(splash_port), 'crawling_domain')
# Update crawler status type
r_cache.srem('{}_crawlers'.format(to_crawl['type_service']), splash_port)
# add next auto Crawling in queue:
if to_crawl['paste'] == 'auto':
redis_crawler.zadd('crawler_auto_queue', int(time.time()+crawler_config['crawler_options']['time']) , '{};{}'.format(to_crawl['original_message'], to_crawl['type_service']))
# update list, last auto crawled domains
redis_crawler.lpush('last_auto_crawled', '{}:{};{}'.format(url_data['domain'], url_data['port'], date['epoch']))
redis_crawler.ltrim('last_auto_crawled', 0, 9)
else:
print(' Blacklisted Onion')
print(' Blacklisted Domain')
print()
print()
else:
continue
else:
time.sleep(1)

View file

@ -97,7 +97,7 @@ if __name__ == "__main__":
if sites_set:
message += ' Related websites: {}'.format( (', '.join(sites_set)) )
to_print = 'Credential;{};{};{};{};{}'.format(paste.p_source, paste.p_date, paste.p_name, message, paste.p_path)
to_print = 'Credential;{};{};{};{};{}'.format(paste.p_source, paste.p_date, paste.p_name, message, paste.p_rel_path)
print('\n '.join(creds))
@ -107,9 +107,6 @@ if __name__ == "__main__":
publisher.warning(to_print)
#Send to duplicate
p.populate_set_out(filepath, 'Duplicate')
#Send to alertHandler
msg = 'credential;{}'.format(filepath)
p.populate_set_out(msg, 'alertHandler')
msg = 'infoleak:automatic-detection="credential";{}'.format(filepath)
p.populate_set_out(msg, 'Tags')

View file

@ -77,19 +77,16 @@ if __name__ == "__main__":
paste.p_source, paste.p_date, paste.p_name)
if (len(creditcard_set) > 0):
publisher.warning('{}Checked {} valid number(s);{}'.format(
to_print, len(creditcard_set), paste.p_path))
to_print, len(creditcard_set), paste.p_rel_path))
print('{}Checked {} valid number(s);{}'.format(
to_print, len(creditcard_set), paste.p_path))
to_print, len(creditcard_set), paste.p_rel_path))
#Send to duplicate
p.populate_set_out(filename, 'Duplicate')
#send to Browse_warning_paste
msg = 'creditcard;{}'.format(filename)
p.populate_set_out(msg, 'alertHandler')
msg = 'infoleak:automatic-detection="credit-card";{}'.format(filename)
p.populate_set_out(msg, 'Tags')
else:
publisher.info('{}CreditCard related;{}'.format(to_print, paste.p_path))
publisher.info('{}CreditCard related;{}'.format(to_print, paste.p_rel_path))
else:
publisher.debug("Script creditcard is idling 1m")
time.sleep(10)

View file

@ -31,10 +31,6 @@ def search_cve(message):
print('{} contains CVEs'.format(paste.p_name))
publisher.warning('{} contains CVEs'.format(paste.p_name))
#send to Browse_warning_paste
msg = 'cve;{}'.format(filepath)
p.populate_set_out(msg, 'alertHandler')
msg = 'infoleak:automatic-detection="cve";{}'.format(filepath)
p.populate_set_out(msg, 'Tags')
#Send to duplicate

View file

@ -147,9 +147,6 @@ def set_out_paste(decoder_name, message):
publisher.warning(decoder_name+' decoded')
#Send to duplicate
p.populate_set_out(message, 'Duplicate')
#send to Browse_warning_paste
msg = (decoder_name+';{}'.format(message))
p.populate_set_out( msg, 'alertHandler')
msg = 'infoleak:automatic-detection="'+decoder_name+'";{}'.format(message)
p.populate_set_out(msg, 'Tags')
@ -229,7 +226,7 @@ if __name__ == '__main__':
except TimeoutException:
encoded_list = []
p.incr_module_timeout_statistic() # add encoder type
print ("{0} processing timeout".format(paste.p_path))
print ("{0} processing timeout".format(paste.p_rel_path))
continue
else:
signal.alarm(0)

View file

@ -54,14 +54,14 @@ def main():
if localizeddomains:
print(localizeddomains)
publisher.warning('DomainC;{};{};{};Checked {} located in {};{}'.format(
PST.p_source, PST.p_date, PST.p_name, localizeddomains, cc_tld, PST.p_path))
PST.p_source, PST.p_date, PST.p_name, localizeddomains, cc_tld, PST.p_rel_path))
localizeddomains = c.localizedomain(cc=cc)
if localizeddomains:
print(localizeddomains)
publisher.warning('DomainC;{};{};{};Checked {} located in {};{}'.format(
PST.p_source, PST.p_date, PST.p_name, localizeddomains, cc, PST.p_path))
PST.p_source, PST.p_date, PST.p_name, localizeddomains, cc, PST.p_rel_path))
except IOError:
print("CRC Checksum Failed on :", PST.p_path)
print("CRC Checksum Failed on :", PST.p_rel_path)
publisher.error('Duplicate;{};{};{};CRC Checksum Failed'.format(
PST.p_source, PST.p_date, PST.p_name))

View file

@ -142,17 +142,17 @@ if __name__ == "__main__":
paste_date = paste_date
paste_date = paste_date if paste_date != None else "No date available"
if paste_path != None:
if paste_path != PST.p_path:
if paste_path != PST.p_rel_path:
hash_dico[dico_hash] = (hash_type, paste_path, percent, paste_date)
print('['+hash_type+'] '+'comparing: ' + str(PST.p_path[44:]) + ' and ' + str(paste_path[44:]) + ' percentage: ' + str(percent))
print('['+hash_type+'] '+'comparing: ' + str(PST.p_rel_path) + ' and ' + str(paste_path) + ' percentage: ' + str(percent))
except Exception:
print('hash not comparable, bad hash: '+dico_hash+' , current_hash: '+paste_hash)
# Add paste in DB after checking to prevent its analysis twice
# hash_type_i -> index_i AND index_i -> PST.PATH
r_serv1.set(index, PST.p_path)
r_serv1.set(index, PST.p_rel_path)
r_serv1.set(index+'_date', PST._get_p_date())
r_serv1.sadd("INDEX", index)
# Adding hashes in Redis
@ -180,7 +180,7 @@ if __name__ == "__main__":
PST.__setattr__("p_duplicate", dupl)
PST.save_attribute_duplicate(dupl)
PST.save_others_pastes_attribute_duplicate(dupl)
publisher.info('{}Detected {};{}'.format(to_print, len(dupl), PST.p_path))
publisher.info('{}Detected {};{}'.format(to_print, len(dupl), PST.p_rel_path))
print('{}Detected {}'.format(to_print, len(dupl)))
print('')
@ -191,5 +191,5 @@ if __name__ == "__main__":
except IOError:
to_print = 'Duplicate;{};{};{};'.format(
PST.p_source, PST.p_date, PST.p_name)
print("CRC Checksum Failed on :", PST.p_path)
print("CRC Checksum Failed on :", PST.p_rel_path)
publisher.error('{}CRC Checksum Failed'.format(to_print))

View file

@ -45,6 +45,9 @@ if __name__ == '__main__':
p = Process(config_section)
PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], p.config.get("Directories", "pastes"))
PASTES_FOLDERS = PASTES_FOLDER + '/'
# LOGGING #
publisher.info("Feed Script started to receive & publish.")
@ -78,8 +81,7 @@ if __name__ == '__main__':
paste = rreplace(paste, file_name_paste, new_file_name_paste, 1)
# Creating the full filepath
filename = os.path.join(os.environ['AIL_HOME'],
p.config.get("Directories", "pastes"), paste)
filename = os.path.join(PASTES_FOLDER, paste)
dirname = os.path.dirname(filename)
if not os.path.exists(dirname):
@ -103,5 +105,10 @@ if __name__ == '__main__':
print(type)
print('-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------')
'''
p.populate_set_out(filename)
# remove PASTES_FOLDER from item path (crawled item + submited)
if PASTES_FOLDERS in paste:
paste = paste.replace(PASTES_FOLDERS, '', 1)
p.populate_set_out(paste)
processed_paste+=1

View file

@ -112,10 +112,7 @@ def search_key(paste):
#Send to duplicate
p.populate_set_out(message, 'Duplicate')
#send to Browse_warning_paste
msg = ('keys;{}'.format(message))
print(message)
p.populate_set_out( msg, 'alertHandler')
if __name__ == '__main__':

View file

@ -16,7 +16,6 @@ export AIL_HOME="${DIR}"
cd ${AIL_HOME}
if [ -e "${DIR}/AILENV/bin/python" ]; then
echo "AIL-framework virtualenv seems to exist, good"
ENV_PY="${DIR}/AILENV/bin/python"
else
echo "Please make sure you have a AIL-framework environment, au revoir"
@ -75,6 +74,7 @@ function helptext {
LAUNCH.sh
[-l | --launchAuto]
[-k | --killAll]
[-u | --update]
[-c | --configUpdate]
[-t | --thirdpartyUpdate]
[-h | --help]
@ -125,12 +125,7 @@ function launching_queues {
function checking_configuration {
bin_dir=${AIL_HOME}/bin
echo -e "\t* Checking configuration"
if [ "$1" == "automatic" ]; then
bash -c "${ENV_PY} $bin_dir/Update-conf.py True"
else
bash -c "${ENV_PY} $bin_dir/Update-conf.py False"
fi
bash -c "${ENV_PY} $bin_dir/Update-conf.py"
exitStatus=$?
if [ $exitStatus -ge 1 ]; then
echo -e $RED"\t* Configuration not up-to-date"$DEFAULT
@ -140,7 +135,7 @@ function checking_configuration {
}
function launching_scripts {
checking_configuration $1;
checking_configuration;
screen -dmS "Script_AIL"
sleep 0.1
@ -206,14 +201,14 @@ function launching_scripts {
sleep 0.1
screen -S "Script_AIL" -X screen -t "LibInjection" bash -c "cd ${AIL_BIN}; ${ENV_PY} ./LibInjection.py; read x"
sleep 0.1
screen -S "Script_AIL" -X screen -t "alertHandler" bash -c "cd ${AIL_BIN}; ${ENV_PY} ./alertHandler.py; read x"
sleep 0.1
screen -S "Script_AIL" -X screen -t "MISPtheHIVEfeeder" bash -c "cd ${AIL_BIN}; ${ENV_PY} ./MISP_The_Hive_feeder.py; read x"
sleep 0.1
screen -S "Script_AIL" -X screen -t "Tags" bash -c "cd ${AIL_BIN}; ${ENV_PY} ./Tags.py; read x"
sleep 0.1
screen -S "Script_AIL" -X screen -t "SentimentAnalysis" bash -c "cd ${AIL_BIN}; ${ENV_PY} ./SentimentAnalysis.py; read x"
sleep 0.1
screen -S "Script_AIL" -X screen -t "UpdateBackground" bash -c "cd ${AIL_BIN}; ${ENV_PY} ./update-background.py; read x"
sleep 0.1
screen -S "Script_AIL" -X screen -t "SubmitPaste" bash -c "cd ${AIL_BIN}; ${ENV_PY} ./submit_paste.py; read x"
}
@ -221,7 +216,7 @@ function launching_scripts {
function launching_crawler {
if [[ ! $iscrawler ]]; then
CONFIG=$AIL_BIN/packages/config.cfg
lport=$(awk '/^\[Crawler\]/{f=1} f==1&&/^splash_onion_port/{print $3;exit}' "${CONFIG}")
lport=$(awk '/^\[Crawler\]/{f=1} f==1&&/^splash_port/{print $3;exit}' "${CONFIG}")
IFS='-' read -ra PORTS <<< "$lport"
if [ ${#PORTS[@]} -eq 1 ]
@ -237,7 +232,7 @@ function launching_crawler {
sleep 0.1
for ((i=first_port;i<=last_port;i++)); do
screen -S "Crawler_AIL" -X screen -t "onion_crawler:$i" bash -c "cd ${AIL_BIN}; ${ENV_PY} ./Crawler.py onion $i; read x"
screen -S "Crawler_AIL" -X screen -t "onion_crawler:$i" bash -c "cd ${AIL_BIN}; ${ENV_PY} ./Crawler.py $i; read x"
sleep 0.1
done
@ -299,6 +294,30 @@ function checking_ardb {
return $flag_ardb;
}
function wait_until_redis_is_ready {
redis_not_ready=true
while $redis_not_ready; do
if checking_redis; then
redis_not_ready=false;
else
sleep 1
fi
done
echo -e $YELLOW"\t* Redis Launched"$DEFAULT
}
function wait_until_ardb_is_ready {
ardb_not_ready=true;
while $ardb_not_ready; do
if checking_ardb; then
ardb_not_ready=false
else
sleep 3
fi
done
echo -e $YELLOW"\t* ARDB Launched"$DEFAULT
}
function launch_redis {
if [[ ! $isredis ]]; then
launching_redis;
@ -335,14 +354,14 @@ function launch_scripts {
if [[ ! $isscripted ]]; then
sleep 1
if checking_ardb && checking_redis; then
launching_scripts $1;
launching_scripts;
else
no_script_launched=true
while $no_script_launched; do
echo -e $YELLOW"\tScript not started, waiting 5 more secondes"$DEFAULT
sleep 5
if checking_redis && checking_ardb; then
launching_scripts $1;
launching_scripts;
no_script_launched=false
else
echo -e $RED"\tScript not started"$DEFAULT
@ -380,17 +399,21 @@ function launch_feeder {
}
function killall {
if [[ $isredis || $isardb || $islogged || $isqueued || $isscripted || $isflasked || $isfeeded ]]; then
if [[ $isredis || $isardb || $islogged || $isqueued || $isscripted || $isflasked || $isfeeded || $iscrawler ]]; then
if [[ $isredis ]]; then
echo -e $GREEN"Gracefully closing redis servers"$DEFAULT
shutting_down_redis;
sleep 0.2
fi
if [[ $isardb ]]; then
echo -e $GREEN"Gracefully closing ardb servers"$DEFAULT
shutting_down_ardb;
fi
echo -e $GREEN"Killing all"$DEFAULT
kill $isredis $isardb $islogged $isqueued $isscripted $isflasked $isfeeded
kill $isredis $isardb $islogged $isqueued $isscripted $isflasked $isfeeded $iscrawler
sleep 0.2
echo -e $ROSE`screen -ls`$DEFAULT
echo -e $GREEN"\t* $isredis $isardb $islogged $isqueued $isscripted killed."$DEFAULT
echo -e $GREEN"\t* $isredis $isardb $islogged $isqueued $isscripted $isflasked $isfeeded $iscrawler killed."$DEFAULT
else
echo -e $RED"\t* No screen to kill"$DEFAULT
fi
@ -400,6 +423,17 @@ function shutdown {
bash -c "./Shutdown.py"
}
function update() {
bin_dir=${AIL_HOME}/bin
bash -c "python3 $bin_dir/Update.py"
exitStatus=$?
if [ $exitStatus -ge 1 ]; then
echo -e $RED"\t* Update Error"$DEFAULT
exit
fi
}
function update_thirdparty {
echo -e "\t* Updating thirdparty..."
bash -c "(cd ${AIL_FLASK}; ./update_thirdparty.sh)"
@ -413,11 +447,12 @@ function update_thirdparty {
}
function launch_all {
update;
launch_redis;
launch_ardb;
launch_logs;
launch_queues;
launch_scripts $1;
launch_scripts;
launch_flask;
}
@ -426,7 +461,7 @@ function launch_all {
helptext;
options=("Redis" "Ardb" "Logs" "Queues" "Scripts" "Flask" "Killall" "Shutdown" "Update-config" "Update-thirdparty")
options=("Redis" "Ardb" "Logs" "Queues" "Scripts" "Flask" "Killall" "Shutdown" "Update" "Update-config" "Update-thirdparty")
menu() {
echo "What do you want to Launch?:"
@ -451,7 +486,7 @@ function launch_all {
if [[ "${choices[i]}" ]]; then
case ${options[i]} in
Redis)
launch_redis
launch_redis;
;;
Ardb)
launch_ardb;
@ -477,8 +512,11 @@ function launch_all {
Shutdown)
shutdown;
;;
Update)
update;
;;
Update-config)
checking_configuration "manual";
checking_configuration;
;;
Update-thirdparty)
update_thirdparty;
@ -490,12 +528,26 @@ function launch_all {
exit
}
#echo "$@"
while [ "$1" != "" ]; do
case $1 in
-l | --launchAuto ) launch_all "automatic";
;;
-lr | --launchRedis ) launch_redis;
;;
-la | --launchARDB ) launch_ardb;
;;
-lrv | --launchRedisVerify ) launch_redis;
wait_until_redis_is_ready;
;;
-lav | --launchARDBVerify ) launch_ardb;
wait_until_ardb_is_ready;
;;
-k | --killAll ) killall;
;;
-u | --update ) update;
;;
-t | --thirdpartyUpdate ) update_thirdparty;
;;
-c | --crawler ) launching_crawler;
@ -504,6 +556,9 @@ while [ "$1" != "" ]; do
;;
-h | --help ) helptext;
exit
;;
-kh | --khelp ) helptext;
;;
* ) helptext
exit 1

View file

@ -47,12 +47,11 @@ def analyse(url, path):
paste = Paste.Paste(path)
print("Detected (libinjection) SQL in URL: ")
print(urllib.request.unquote(url))
to_print = 'LibInjection;{};{};{};{};{}'.format(paste.p_source, paste.p_date, paste.p_name, "Detected SQL in URL", paste.p_path)
to_print = 'LibInjection;{};{};{};{};{}'.format(paste.p_source, paste.p_date, paste.p_name, "Detected SQL in URL", paste.p_rel_path)
publisher.warning(to_print)
#Send to duplicate
p.populate_set_out(path, 'Duplicate')
#send to Browse_warning_paste
p.populate_set_out('sqlinjection;{}'.format(path), 'alertHandler')
msg = 'infoleak:automatic-detection="sql-injection";{}'.format(path)
p.populate_set_out(msg, 'Tags')

View file

@ -75,10 +75,11 @@ if __name__ == '__main__':
PST.save_attribute_redis("p_max_length_line", lines_infos[1])
# FIXME Not used.
PST.store.sadd("Pastes_Objects", PST.p_path)
PST.store.sadd("Pastes_Objects", PST.p_rel_path)
print(PST.p_rel_path)
if lines_infos[1] < args.max:
p.populate_set_out( PST.p_path , 'LinesShort')
p.populate_set_out( PST.p_rel_path , 'LinesShort')
else:
p.populate_set_out( PST.p_path , 'LinesLong')
p.populate_set_out( PST.p_rel_path , 'LinesLong')
except IOError:
print("CRC Checksum Error on : ", PST.p_path)
print("CRC Checksum Error on : ", PST.p_rel_path)

View file

@ -78,12 +78,11 @@ if __name__ == "__main__":
to_print = 'Mails;{};{};{};Checked {} e-mail(s);{}'.\
format(PST.p_source, PST.p_date, PST.p_name,
MX_values[0], PST.p_path)
MX_values[0], PST.p_rel_path)
if MX_values[0] > is_critical:
publisher.warning(to_print)
#Send to duplicate
p.populate_set_out(filename, 'Duplicate')
p.populate_set_out('mail;{}'.format(filename), 'alertHandler')
msg = 'infoleak:automatic-detection="mail";{}'.format(filename)
p.populate_set_out(msg, 'Tags')

View file

@ -82,6 +82,8 @@ if __name__ == '__main__':
ttl_key = cfg.getint("Module_Mixer", "ttl_duplicate")
default_unnamed_feed_name = cfg.get("Module_Mixer", "default_unnamed_feed_name")
PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], p.config.get("Directories", "pastes")) + '/'
# STATS #
processed_paste = 0
processed_paste_per_feeder = {}
@ -104,12 +106,14 @@ if __name__ == '__main__':
feeder_name.replace(" ","")
if 'import_dir' in feeder_name:
feeder_name = feeder_name.split('/')[1]
paste_name = complete_paste
except ValueError as e:
feeder_name = default_unnamed_feed_name
paste_name = complete_paste
# remove absolute path
paste_name = paste_name.replace(PASTES_FOLDER, '', 1)
# Processed paste
processed_paste += 1
try:
@ -119,6 +123,7 @@ if __name__ == '__main__':
processed_paste_per_feeder[feeder_name] = 1
duplicated_paste_per_feeder[feeder_name] = 0
relay_message = "{0} {1}".format(paste_name, gzip64encoded)
#relay_message = b" ".join( [paste_name, gzip64encoded] )

View file

@ -32,6 +32,8 @@ import redis
import signal
import re
from pyfaup.faup import Faup
from Helper import Process
class TimeoutException(Exception):
@ -132,6 +134,8 @@ if __name__ == "__main__":
activate_crawler = False
print('Crawler disabled')
faup = Faup()
# Thanks to Faup project for this regex
# https://github.com/stricaud/faup
url_regex = "((http|https|ftp)?(?:\://)?([a-zA-Z0-9\.\-]+(\:[a-zA-Z0-9\.&%\$\-]+)*@)*((25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9])\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9]|0)\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9]|0)\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[0-9])|localhost|([a-zA-Z0-9\-]+\.)*[a-zA-Z0-9\-]+\.onion)(\:[0-9]+)*(/($|[a-zA-Z0-9\.\,\?\'\\\+&%\$#\=~_\-]+))*)"
@ -167,7 +171,7 @@ if __name__ == "__main__":
except TimeoutException:
encoded_list = []
p.incr_module_timeout_statistic()
print ("{0} processing timeout".format(PST.p_path))
print ("{0} processing timeout".format(PST.p_rel_path))
continue
signal.alarm(0)
@ -185,7 +189,7 @@ if __name__ == "__main__":
r_onion.sadd('i2p_domain', domain)
r_onion.sadd('i2p_link', url)
r_onion.sadd('i2p_domain_crawler_queue', domain)
msg = '{};{}'.format(url,PST.p_path)
msg = '{};{}'.format(url,PST.p_rel_path)
r_onion.sadd('i2p_crawler_queue', msg)
'''
@ -200,10 +204,10 @@ if __name__ == "__main__":
if not activate_crawler:
publisher.warning('{}Detected {} .onion(s);{}'.format(
to_print, len(domains_list),PST.p_path))
to_print, len(domains_list),PST.p_rel_path))
else:
publisher.info('{}Detected {} .onion(s);{}'.format(
to_print, len(domains_list),PST.p_path))
to_print, len(domains_list),PST.p_rel_path))
now = datetime.datetime.now()
path = os.path.join('onions', str(now.year).zfill(4),
str(now.month).zfill(2),
@ -218,37 +222,50 @@ if __name__ == "__main__":
date = datetime.datetime.now().strftime("%Y%m%d")
for url in urls:
domain = re.findall(url_regex, url)
if len(domain) > 0:
domain = domain[0][4]
faup.decode(url)
url_unpack = faup.get()
domain = url_unpack['domain'].decode()
## TODO: blackilst by port ?
# check blacklist
if redis_crawler.sismember('blacklist_onion', domain):
continue
subdomain = re.findall(url_regex, url)
if len(subdomain) > 0:
subdomain = subdomain[0][4]
else:
continue
# too many subdomain
if len(domain.split('.')) > 5:
continue
if len(subdomain.split('.')) > 3:
subdomain = '{}.{}.onion'.format(subdomain[-3], subdomain[-2])
if not r_onion.sismember('month_onion_up:{}'.format(date_month), domain) and not r_onion.sismember('onion_down:'+date , domain):
if not r_onion.sismember('onion_domain_crawler_queue', domain):
if not r_onion.sismember('month_onion_up:{}'.format(date_month), subdomain) and not r_onion.sismember('onion_down:'+date , subdomain):
if not r_onion.sismember('onion_domain_crawler_queue', subdomain):
print('send to onion crawler')
r_onion.sadd('onion_domain_crawler_queue', domain)
msg = '{};{}'.format(url,PST.p_path)
if not r_onion.hexists('onion_metadata:{}'.format(domain), 'first_seen'):
r_onion.sadd('onion_domain_crawler_queue', subdomain)
msg = '{};{}'.format(url,PST.p_rel_path)
if not r_onion.hexists('onion_metadata:{}'.format(subdomain), 'first_seen'):
r_onion.sadd('onion_crawler_priority_queue', msg)
print('send to priority queue')
else:
r_onion.sadd('onion_crawler_queue', msg)
#p.populate_set_out(msg, 'Crawler')
# tag if domain was up
if r_onion.sismember('full_onion_up', subdomain):
# TAG Item
msg = 'infoleak:automatic-detection="onion";{}'.format(PST.p_rel_path)
p.populate_set_out(msg, 'Tags')
else:
for url in fetch(p, r_cache, urls, domains_list, path):
publisher.info('{}Checked {};{}'.format(to_print, url, PST.p_path))
p.populate_set_out('onion;{}'.format(PST.p_path), 'alertHandler')
publisher.info('{}Checked {};{}'.format(to_print, url, PST.p_rel_path))
msg = 'infoleak:automatic-detection="onion";{}'.format(PST.p_path)
# TAG Item
msg = 'infoleak:automatic-detection="onion";{}'.format(PST.p_rel_path)
p.populate_set_out(msg, 'Tags')
else:
publisher.info('{}Onion related;{}'.format(to_print, PST.p_path))
publisher.info('{}Onion related;{}'.format(to_print, PST.p_rel_path))
prec_filename = filename
else:

View file

@ -32,14 +32,11 @@ def search_phone(message):
if len(results) > 4:
print(results)
publisher.warning('{} contains PID (phone numbers)'.format(paste.p_name))
#send to Browse_warning_paste
msg = 'phone;{}'.format(message)
p.populate_set_out(msg, 'alertHandler')
#Send to duplicate
msg = 'infoleak:automatic-detection="phone-number";{}'.format(message)
p.populate_set_out(msg, 'Tags')
#Send to duplicate
p.populate_set_out(message, 'Duplicate')
stats = {}
for phone_number in results:

View file

@ -108,7 +108,7 @@ if __name__ == "__main__":
try:
matched = compiled_regex.search(content)
except TimeoutException:
print ("{0} processing timeout".format(paste.p_path))
print ("{0} processing timeout".format(paste.p_rel_path))
continue
else:
signal.alarm(0)

View file

@ -54,7 +54,7 @@ if __name__ == "__main__":
if len(releases) == 0:
continue
to_print = 'Release;{};{};{};{} releases;{}'.format(paste.p_source, paste.p_date, paste.p_name, len(releases), paste.p_path)
to_print = 'Release;{};{};{};{} releases;{}'.format(paste.p_source, paste.p_date, paste.p_name, len(releases), paste.p_rel_path)
print(to_print)
if len(releases) > 30:
publisher.warning(to_print)
@ -63,7 +63,7 @@ if __name__ == "__main__":
except TimeoutException:
p.incr_module_timeout_statistic()
print ("{0} processing timeout".format(paste.p_path))
print ("{0} processing timeout".format(paste.p_rel_path))
continue
else:
signal.alarm(0)

View file

@ -78,12 +78,10 @@ def analyse(url, path):
if (result_path > 1) or (result_query > 1):
print("Detected SQL in URL: ")
print(urllib.request.unquote(url))
to_print = 'SQLInjection;{};{};{};{};{}'.format(paste.p_source, paste.p_date, paste.p_name, "Detected SQL in URL", paste.p_path)
to_print = 'SQLInjection;{};{};{};{};{}'.format(paste.p_source, paste.p_date, paste.p_name, "Detected SQL in URL", paste.p_rel_path)
publisher.warning(to_print)
#Send to duplicate
p.populate_set_out(path, 'Duplicate')
#send to Browse_warning_paste
p.populate_set_out('sqlinjection;{}'.format(path), 'alertHandler')
msg = 'infoleak:automatic-detection="sql-injection";{}'.format(path)
p.populate_set_out(msg, 'Tags')
@ -97,7 +95,7 @@ def analyse(url, path):
else:
print("Potential SQL injection:")
print(urllib.request.unquote(url))
to_print = 'SQLInjection;{};{};{};{};{}'.format(paste.p_source, paste.p_date, paste.p_name, "Potential SQL injection", paste.p_path)
to_print = 'SQLInjection;{};{};{};{};{}'.format(paste.p_source, paste.p_date, paste.p_name, "Potential SQL injection", paste.p_rel_path)
publisher.info(to_print)

View file

@ -45,6 +45,7 @@ cfg = configparser.ConfigParser()
cfg.read(configfile)
sentiment_lexicon_file = cfg.get("Directories", "sentiment_lexicon_file")
#time_clean_sentiment_db = 60*60
def Analyse(message, server):
path = message
@ -157,9 +158,16 @@ if __name__ == '__main__':
db=p.config.get("ARDB_Sentiment", "db"),
decode_responses=True)
time1 = time.time()
while True:
message = p.get_from_set()
if message is None:
#if int(time.time() - time1) > time_clean_sentiment_db:
# clean_db()
# time1 = time.time()
# continue
#else:
publisher.debug("{} queue is empty, waiting".format(config_section))
time.sleep(1)
continue

View file

@ -17,6 +17,19 @@ from pubsublogger import publisher
from Helper import Process
from packages import Paste
def get_item_date(item_filename):
l_directory = item_filename.split('/')
return '{}{}{}'.format(l_directory[-4], l_directory[-3], l_directory[-2])
def set_tag_metadata(tag, date):
# First time we see this tag ## TODO: filter paste from the paste ?
if not server.hexists('tag_metadata:{}'.format(tag), 'first_seen'):
server.hset('tag_metadata:{}'.format(tag), 'first_seen', date)
# Check and Set tag last_seen
last_seen = server.hget('tag_metadata:{}'.format(tag), 'last_seen')
if last_seen is None or date > last_seen:
server.hset('tag_metadata:{}'.format(tag), 'last_seen', date)
if __name__ == '__main__':
# Port of the redis instance used by pubsublogger
@ -42,12 +55,6 @@ if __name__ == '__main__':
db=p.config.get("ARDB_Metadata", "db"),
decode_responses=True)
serv_statistics = redis.StrictRedis(
host=p.config.get('ARDB_Statistics', 'host'),
port=p.config.get('ARDB_Statistics', 'port'),
db=p.config.get('ARDB_Statistics', 'db'),
decode_responses=True)
# Sent to the logging a description of the module
publisher.info("Tags module started")
@ -68,12 +75,15 @@ if __name__ == '__main__':
if res == 1:
print("new tags added : {}".format(tag))
# add the path to the tag set
res = server.sadd(tag, path)
#curr_date = datetime.date.today().strftime("%Y%m%d")
item_date = get_item_date(path)
res = server.sadd('{}:{}'.format(tag, item_date), path)
if res == 1:
print("new paste: {}".format(path))
print(" tagged: {}".format(tag))
server_metadata.sadd('tag:'+path, tag)
set_tag_metadata(tag, item_date)
server_metadata.sadd('tag:{}'.format(path), tag)
curr_date = datetime.date.today()
serv_statistics.hincrby(curr_date.strftime("%Y%m%d"),'paste_tagged:'+tag, 1)
curr_date = datetime.date.today().strftime("%Y%m%d")
server.hincrby('daily_tags:{}'.format(item_date), tag, 1)
p.populate_set_out(message, 'MISP_The_Hive_feeder')

View file

@ -57,11 +57,11 @@ if __name__ == "__main__":
try:
for word, score in paste._get_top_words().items():
if len(word) >= 4:
msg = '{} {} {}'.format(paste.p_path, word, score)
msg = '{} {} {}'.format(paste.p_rel_path, word, score)
p.populate_set_out(msg)
except TimeoutException:
p.incr_module_timeout_statistic()
print ("{0} processing timeout".format(paste.p_path))
print ("{0} processing timeout".format(paste.p_rel_path))
continue
else:
signal.alarm(0)

View file

@ -1,123 +1,90 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
import configparser
from configparser import ConfigParser as cfgP
import os
from collections import OrderedDict
import sys
import shutil
import argparse
import configparser
def print_message(message_to_print, verbose):
if verbose:
print(message_to_print)
def update_config(config_file, config_file_sample, config_file_backup=False):
verbose = True
# Check if confile file exist
if not os.path.isfile(config_file):
# create config file
with open(config_file, 'w') as configfile:
with open(config_file_sample, 'r') as config_file_sample:
configfile.write(config_file_sample.read())
print_message('Config File Created', verbose)
else:
config_server = configparser.ConfigParser()
config_server.read(config_file)
config_sections = config_server.sections()
config_sample = configparser.ConfigParser()
config_sample.read(config_file_sample)
sample_sections = config_sample.sections()
mew_content_added = False
for section in sample_sections:
new_key_added = False
if section not in config_sections:
# add new section
config_server.add_section(section)
mew_content_added = True
for key in config_sample[section]:
if key not in config_server[section]:
# add new section key
config_server.set(section, key, config_sample[section][key])
if not new_key_added:
print_message('[{}]'.format(section), verbose)
new_key_added = True
mew_content_added = True
print_message(' {} = {}'.format(key, config_sample[section][key]), verbose)
# new keys have been added to config file
if mew_content_added:
# backup config file
if config_file_backup:
with open(config_file_backup, 'w') as configfile:
with open(config_file, 'r') as configfile_origin:
configfile.write(configfile_origin.read())
print_message('New Backup Created', verbose)
# create new config file
with open(config_file, 'w') as configfile:
config_server.write(configfile)
print_message('Config file updated', verbose)
else:
print_message('Nothing to update', verbose)
#return true if the configuration is up-to-date
def main():
if len(sys.argv) != 2:
print('usage:', 'Update-conf.py', 'Automatic (boolean)')
exit(1)
else:
automatic = sys.argv[1]
if automatic == 'True':
automatic = True
else:
automatic = False
configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
configfileBackup = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg') + '.backup'
if not os.path.exists(configfile):
#------------------------------------------------------------------------------------#
config_file_default = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
config_file_default_sample = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg.sample')
config_file_default_backup = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg.backup')
config_file_update = os.path.join(os.environ['AIL_HOME'], 'configs/update.cfg')
config_file_update_sample = os.path.join(os.environ['AIL_HOME'], 'configs/update.cfg.sample')
if not os.path.exists(config_file_default_sample):
raise Exception('Unable to find the configuration file. \
Did you set environment variables? \
Or activate the virtualenv.')
configfileSample = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg.sample')
cfg = configparser.ConfigParser()
cfg.read(configfile)
cfgSample = configparser.ConfigParser()
cfgSample.read(configfileSample)
sections = cfgP.sections(cfg)
sectionsSample = cfgP.sections(cfgSample)
update_config(config_file_default, config_file_default_sample, config_file_backup=config_file_default_backup)
update_config(config_file_update, config_file_update_sample)
missingSection = []
dicoMissingSection = {}
missingItem = []
dicoMissingItem = {}
for sec in sectionsSample:
if sec not in sections:
missingSection += [sec]
dicoMissingSection[sec] = cfgP.items(cfgSample, sec)
else:
setSample = set(cfgP.options(cfgSample, sec))
setNormal = set(cfgP.options(cfg, sec))
if setSample != setNormal:
missing_items = list(setSample.difference(setNormal))
missingItem += [sec]
list_items = []
for i in missing_items:
list_items.append( (i, cfgSample.get(sec, i)) )
dicoMissingItem[sec] = list_items
if len(missingSection) == 0 and len(missingItem) == 0:
#print("Configuration up-to-date")
return True
print("/!\\ Configuration not complete. Missing following configuration: /!\\")
print("+--------------------------------------------------------------------+")
for section in missingSection:
print("["+section+"]")
for item in dicoMissingSection[section]:
print(" - "+item[0])
for section in missingItem:
print("["+section+"]")
for item in dicoMissingItem[section]:
print(" - "+item[0])
print("+--------------------------------------------------------------------+")
if automatic:
resp = 'y'
else:
resp = input("Do you want to auto fix it? [y/n] ")
if resp != 'y':
return False
else:
if automatic:
resp2 = 'y'
else:
resp2 = input("Do you want to keep a backup of the old configuration file? [y/n] ")
if resp2 == 'y':
shutil.move(configfile, configfileBackup)
#Do not keep item ordering in section. New items appened
for section in missingItem:
for item, value in dicoMissingItem[section]:
cfg.set(section, item, value)
#Keep sections ordering while updating the config file
new_dico = add_items_to_correct_position(cfgSample._sections, cfg._sections, missingSection, dicoMissingSection)
cfg._sections = new_dico
with open(configfile, 'w') as f:
cfg.write(f)
return True
''' Return a new dico with the section ordered as the old configuration with the updated one added '''
def add_items_to_correct_position(sample_dico, old_dico, missingSection, dicoMissingSection):
new_dico = OrderedDict()
positions = {}
for pos_i, sec in enumerate(sample_dico):
if sec in missingSection:
positions[pos_i] = sec
for pos_i, sec in enumerate(old_dico):
if pos_i in positions:
missSection = positions[pos_i]
new_dico[missSection] = sample_dico[missSection]
new_dico[sec] = old_dico[sec]
return new_dico
if __name__ == "__main__":

357
bin/Update.py Executable file
View file

@ -0,0 +1,357 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
"""
Update AIL
============================
Update AIL clone and fork
"""
import configparser
import os
import sys
import subprocess
def auto_update_enabled(cfg):
auto_update = cfg.get('Update', 'auto_update')
if auto_update == 'True' or auto_update == 'true':
return True
else:
return False
# check if files are modify locally
def check_if_files_modified():
process = subprocess.run(['git', 'ls-files' ,'-m'], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
if process.returncode == 0:
modified_files = process.stdout
if modified_files:
print('Modified Files:')
print('{}{}{}'.format(TERMINAL_BLUE, modified_files.decode(), TERMINAL_DEFAULT))
return False
else:
return True
else:
print('{}{}{}'.format(TERMINAL_RED, process.stderr.decode(), TERMINAL_DEFAULT))
sys.exit(1)
def repo_is_fork():
print('Check if this repository is a fork:')
process = subprocess.run(['git', 'remote', '-v'], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
if process.returncode == 0:
res = process.stdout.decode()
if 'origin {}'.format(AIL_REPO) in res:
print(' This repository is a {}clone of {}{}'.format(TERMINAL_BLUE, AIL_REPO, TERMINAL_DEFAULT))
return False
else:
print(' This repository is a {}fork{}'.format(TERMINAL_BLUE, TERMINAL_DEFAULT))
print()
return True
else:
print('{}{}{}'.format(TERMINAL_RED, process.stderr.decode(), TERMINAL_DEFAULT))
aborting_update()
sys.exit(0)
def is_upstream_created(upstream):
process = subprocess.run(['git', 'remote', '-v'], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
if process.returncode == 0:
output = process.stdout.decode()
if upstream in output:
return True
else:
return False
else:
print('{}{}{}'.format(TERMINAL_RED, process.stderr.decode(), TERMINAL_DEFAULT))
aborting_update()
sys.exit(0)
def create_fork_upstream(upstream):
print('{}... Creating upstream ...{}'.format(TERMINAL_YELLOW, TERMINAL_DEFAULT))
print('git remote add {} {}'.format(upstream, AIL_REPO))
process = subprocess.run(['git', 'remote', 'add', upstream, AIL_REPO], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
if process.returncode == 0:
print(process.stdout.decode())
if is_upstream_created(upstream):
print('Fork upstream created')
print('{}... ...{}'.format(TERMINAL_YELLOW, TERMINAL_DEFAULT))
else:
print('Fork not created')
aborting_update()
sys.exit(0)
else:
print('{}{}{}'.format(TERMINAL_RED, process.stderr.decode(), TERMINAL_DEFAULT))
aborting_update()
sys.exit(0)
def update_fork():
print('{}... Updating fork ...{}'.format(TERMINAL_YELLOW, TERMINAL_DEFAULT))
if cfg.get('Update', 'update-fork') == 'True' or cfg.get('Update', 'update-fork') == 'true':
upstream = cfg.get('Update', 'upstream')
if not is_upstream_created(upstream):
create_fork_upstream(upstream)
print('{}git fetch {}:{}'.format(TERMINAL_YELLOW, upstream, TERMINAL_DEFAULT))
process = subprocess.run(['git', 'fetch', upstream], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
if process.returncode == 0:
print(process.stdout.decode())
print('{}git checkout master:{}'.format(TERMINAL_YELLOW, TERMINAL_DEFAULT))
process = subprocess.run(['git', 'checkout', 'master'], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
if process.returncode == 0:
print(process.stdout.decode())
print('{}git merge {}/master:{}'.format(TERMINAL_YELLOW, upstream, TERMINAL_DEFAULT))
process = subprocess.run(['git', 'merge', '{}/master'.format(upstream)], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
if process.returncode == 0:
print(process.stdout.decode())
print('{}... ...{}'.format(TERMINAL_YELLOW, TERMINAL_DEFAULT))
else:
print('{}{}{}'.format(TERMINAL_RED, process.stderr.decode(), TERMINAL_DEFAULT))
aborting_update()
sys.exit(1)
else:
print('{}{}{}'.format(TERMINAL_RED, process.stderr.decode(), TERMINAL_DEFAULT))
aborting_update()
sys.exit(0)
else:
print('{}{}{}'.format(TERMINAL_RED, process.stderr.decode(), TERMINAL_DEFAULT))
aborting_update()
sys.exit(0)
else:
print('{}Fork Auto-Update disabled in config file{}'.format(TERMINAL_YELLOW, TERMINAL_DEFAULT))
aborting_update()
sys.exit(0)
def get_git_current_tag(current_version_path):
try:
with open(current_version_path, 'r') as version_content:
version = version_content.read()
except FileNotFoundError:
version = 'v1.4'
with open(current_version_path, 'w') as version_content:
version_content.write(version)
version = version.replace(" ", "").splitlines()
return version[0]
def get_git_upper_tags_remote(current_tag, is_fork):
if is_fork:
process = subprocess.run(['git', 'tag'], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
if process.returncode == 0:
list_all_tags = process.stdout.decode().splitlines()
list_upper_tags = []
if list_all_tags[-1][1:] == current_tag:
list_upper_tags.append( (list_all_tags[-1], None) )
return list_upper_tags
for tag in list_all_tags:
if float(tag[1:]) >= float(current_tag):
list_upper_tags.append( (tag, None) )
return list_upper_tags
else:
print('{}{}{}'.format(TERMINAL_RED, process.stderr.decode(), TERMINAL_DEFAULT))
aborting_update()
sys.exit(0)
else:
process = subprocess.run(['git', 'ls-remote' ,'--tags'], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
if process.returncode == 0:
list_all_tags = process.stdout.decode().splitlines()
last_tag = list_all_tags[-1].split('\trefs/tags/')
last_commit = last_tag[0]
last_tag = last_tag[1].split('^{}')[0]
list_upper_tags = []
if last_tag[1:] == current_tag:
list_upper_tags.append( (last_tag, last_commit) )
return list_upper_tags
else:
for mess_tag in list_all_tags:
commit, tag = mess_tag.split('\trefs/tags/')
# add tag with last commit
if float(tag.split('^{}')[0][1:]) >= float(current_tag):
if '^{}' in tag:
list_upper_tags.append( (tag.split('^{}')[0], commit) )
# add last commit
if last_tag not in list_upper_tags[-1][0]:
list_upper_tags.append( (last_tag, last_commit) )
return list_upper_tags
else:
print('{}{}{}'.format(TERMINAL_RED, process.stderr.decode(), TERMINAL_DEFAULT))
aborting_update()
sys.exit(0)
def update_ail(current_tag, list_upper_tags_remote, current_version_path, is_fork):
print('{}git checkout master:{}'.format(TERMINAL_YELLOW, TERMINAL_DEFAULT))
process = subprocess.run(['git', 'checkout', 'master'], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
#process = subprocess.run(['ls'], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
if process.returncode == 0:
print(process.stdout.decode())
print()
print('{}git pull:{}'.format(TERMINAL_YELLOW, TERMINAL_DEFAULT))
process = subprocess.run(['git', 'pull'], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
if process.returncode == 0:
output = process.stdout.decode()
print(output)
if len(list_upper_tags_remote) == 1:
# additional update (between 2 commits on the same version)
additional_update_path = os.path.join(os.environ['AIL_HOME'], 'update', current_tag, 'additional_update.sh')
if os.path.isfile(additional_update_path):
print()
print('{}------------------------------------------------------------------'.format(TERMINAL_YELLOW))
print('- Launching Additional Update: -')
print('-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --{}'.format(TERMINAL_DEFAULT))
process = subprocess.run(['bash', additional_update_path], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
if process.returncode == 0:
output = process.stdout.decode()
print(output)
else:
print('{}{}{}'.format(TERMINAL_RED, process.stderr.decode(), TERMINAL_DEFAULT))
aborting_update()
sys.exit(1)
print()
print('{}**************** AIL Sucessfully Updated *****************{}'.format(TERMINAL_YELLOW, TERMINAL_DEFAULT))
print()
exit(0)
else:
# map version with roll back commit
list_update = []
previous_commit = list_upper_tags_remote[0][1]
for tuple in list_upper_tags_remote[1:]:
tag = tuple[0]
list_update.append( (tag, previous_commit) )
previous_commit = tuple[1]
for update in list_update:
launch_update_version(update[0], update[1], current_version_path, is_fork)
# Sucess
print('{}**************** AIL Sucessfully Updated *****************{}'.format(TERMINAL_YELLOW, TERMINAL_DEFAULT))
print()
sys.exit(0)
else:
print('{}{}{}'.format(TERMINAL_RED, process.stderr.decode(), TERMINAL_DEFAULT))
aborting_update()
sys.exit(1)
else:
print('{}{}{}'.format(TERMINAL_RED, process.stderr.decode(), TERMINAL_DEFAULT))
aborting_update()
sys.exit(0)
def launch_update_version(version, roll_back_commit, current_version_path, is_fork):
update_path = os.path.join(os.environ['AIL_HOME'], 'update', version, 'Update.sh')
print()
print('{}------------------------------------------------------------------'.format(TERMINAL_YELLOW))
print('- Launching Update: {}{}{} -'.format(TERMINAL_BLUE, version, TERMINAL_YELLOW))
print('-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --{}'.format(TERMINAL_DEFAULT))
process = subprocess.Popen(['bash', update_path], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
while True:
output = process.stdout.readline().decode()
if output == '' and process.poll() is not None:
break
if output:
print(output.strip())
if process.returncode == 0:
#output = process.stdout.decode()
#print(output)
with open(current_version_path, 'w') as version_content:
version_content.write(version)
print('{}-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --'.format(TERMINAL_YELLOW))
print('- Sucessfully Updated: {}{}{} -'.format(TERMINAL_BLUE, version, TERMINAL_YELLOW))
print('------------------------------------------------------------------{}'.format(TERMINAL_DEFAULT))
print()
else:
#print(process.stdout.read().decode())
print('{}{}{}'.format(TERMINAL_RED, process.stderr.read().decode(), TERMINAL_DEFAULT))
print('------------------------------------------------------------------')
print(' {}Update Error: {}{}{}'.format(TERMINAL_RED, TERMINAL_BLUE, version, TERMINAL_DEFAULT))
print('------------------------------------------------------------------')
if not is_fork:
roll_back_update(roll_back_commit)
else:
aborting_update()
sys.exit(1)
def roll_back_update(roll_back_commit):
print('Rolling back to safe commit: {}{}{}'.format(TERMINAL_BLUE ,roll_back_commit, TERMINAL_DEFAULT))
process = subprocess.run(['git', 'checkout', roll_back_commit], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
if process.returncode == 0:
output = process.stdout
print(output)
sys.exit(0)
else:
print(TERMINAL_RED+process.stderr.decode()+TERMINAL_DEFAULT)
aborting_update()
sys.exit(1)
def aborting_update():
print()
print('{}Aborting ...{}'.format(TERMINAL_RED, TERMINAL_DEFAULT))
print('{}******************************************************************'.format(TERMINAL_RED))
print('* AIL Not Updated *')
print('******************************************************************{}'.format(TERMINAL_DEFAULT))
print()
if __name__ == "__main__":
TERMINAL_RED = '\033[91m'
TERMINAL_YELLOW = '\33[93m'
TERMINAL_BLUE = '\33[94m'
TERMINAL_BLINK = '\33[6m'
TERMINAL_DEFAULT = '\033[0m'
AIL_REPO = 'https://github.com/CIRCL/AIL-framework.git'
configfile = os.path.join(os.environ['AIL_HOME'], 'configs/update.cfg')
if not os.path.exists(configfile):
raise Exception('Unable to find the configuration file. \
Did you set environment variables? \
Or activate the virtualenv.')
cfg = configparser.ConfigParser()
cfg.read(configfile)
current_version_path = os.path.join(os.environ['AIL_HOME'], 'update/current_version')
print('{}******************************************************************'.format(TERMINAL_YELLOW))
print('* Updating AIL ... *')
print('******************************************************************{}'.format(TERMINAL_DEFAULT))
if auto_update_enabled(cfg):
if check_if_files_modified():
is_fork = repo_is_fork()
if is_fork:
update_fork()
current_tag = get_git_current_tag(current_version_path)
print()
print('Current Version: {}{}{}'.format( TERMINAL_YELLOW, current_tag, TERMINAL_DEFAULT))
print()
list_upper_tags_remote = get_git_upper_tags_remote(current_tag[1:], is_fork)
# new realease
if len(list_upper_tags_remote) > 1:
print('New Releases:')
if is_fork:
for upper_tag in list_upper_tags_remote:
print(' {}{}{}'.format(TERMINAL_BLUE, upper_tag[0], TERMINAL_DEFAULT))
else:
for upper_tag in list_upper_tags_remote:
print(' {}{}{}: {}'.format(TERMINAL_BLUE, upper_tag[0], TERMINAL_DEFAULT, upper_tag[1]))
print()
update_ail(current_tag, list_upper_tags_remote, current_version_path, is_fork)
else:
print('Please, commit your changes or stash them before you can update AIL')
aborting_update()
sys.exit(0)
else:
print(' {}AIL Auto update is disabled{}'.format(TERMINAL_RED, TERMINAL_DEFAULT))
aborting_update()
sys.exit(0)

View file

@ -153,7 +153,7 @@ if __name__ == "__main__":
pprint.pprint(A_values)
publisher.info('Url;{};{};{};Checked {} URL;{}'.format(
PST.p_source, PST.p_date, PST.p_name, A_values[0], PST.p_path))
PST.p_source, PST.p_date, PST.p_name, A_values[0], PST.p_rel_path))
prec_filename = filename
else:

View file

@ -1,63 +0,0 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
"""
The Browse_warning_paste module
====================
This module saved signaled paste (logged as 'warning') in redis for further usage
like browsing by category
Its input comes from other modules, namely:
Credential, CreditCard, SQLinjection, CVE, Keys, Mail and Phone
"""
import redis
import time
from datetime import datetime, timedelta
from packages import Paste
from pubsublogger import publisher
from Helper import Process
import sys
sys.path.append('../')
flag_misp = False
if __name__ == "__main__":
publisher.port = 6380
publisher.channel = "Script"
config_section = 'alertHandler'
p = Process(config_section)
# port generated automatically depending on the date
curYear = datetime.now().year
server = redis.StrictRedis(
host=p.config.get("ARDB_DB", "host"),
port=p.config.get("ARDB_DB", "port"),
db=curYear,
decode_responses=True)
# FUNCTIONS #
publisher.info("Script duplicate started")
while True:
message = p.get_from_set()
if message is not None:
module_name, p_path = message.split(';')
print("new alert : {}".format(module_name))
#PST = Paste.Paste(p_path)
else:
publisher.debug("Script Attribute is idling 10s")
time.sleep(10)
continue
# Add in redis for browseWarningPaste
# Format in set: WARNING_moduleName -> p_path
key = "WARNING_" + module_name
server.sadd(key, p_path)
publisher.info('Saved warning paste {}'.format(p_path))

View file

@ -1,16 +0,0 @@
#!/bin/bash
set -e
set -x
[ -z "$AIL_HOME" ] && echo "Needs the env var AIL_HOME. Run the script from the virtual environment." && exit 1;
[ -z "$AIL_REDIS" ] && echo "Needs the env var AIL_REDIS. Run the script from the virtual environment." && exit 1;
[ -z "$AIL_LEVELDB" ] && echo "Needs the env var AIL_LEVELDB. Run the script from the virtual environment." && exit 1;
screen -dmS "Logging"
sleep 0.1
echo -e $GREEN"\t* Launching logging process"$DEFAULT
screen -S "Logging" -X screen -t "LogQueue" bash -c 'log_subscriber -p 6380 -c Queuing -l ../logs/; read x'
sleep 0.1
screen -S "Logging" -X screen -t "LogScript" bash -c 'log_subscriber -p 6380 -c Script -l ../logs/; read x'

View file

@ -1,29 +0,0 @@
#!/bin/bash
set -e
set -x
[ -z "$AIL_HOME" ] && echo "Needs the env var AIL_HOME. Run the script from the virtual environment." && exit 1;
[ -z "$AIL_REDIS" ] && echo "Needs the env var AIL_REDIS. Run the script from the virtual environment." && exit 1;
[ -z "$AIL_LEVELDB" ] && echo "Needs the env var AIL_LEVELDB. Run the script from the virtual environment." && exit 1;
lvdbhost='127.0.0.1'
lvdbdir="${AIL_HOME}/LEVEL_DB_DATA/"
nb_db=13
db_y=`date +%Y`
#Verify that a dir with the correct year exists, create it otherwise
if [ ! -d "$lvdbdir$db_y" ]; then
mkdir -p "$db_y"
fi
screen -dmS "LevelDB"
sleep 0.1
echo -e $GREEN"\t* Launching Levels DB servers"$DEFAULT
#Launch a DB for each dir
for pathDir in $lvdbdir*/ ; do
yDir=$(basename "$pathDir")
sleep 0.1
screen -S "LevelDB" -X screen -t "$yDir" bash -c 'redis-leveldb -H '$lvdbhost' -D '$pathDir'/ -P '$yDir' -M '$nb_db'; read x'
done

View file

@ -1,15 +0,0 @@
#!/bin/bash
set -e
set -x
[ -z "$AIL_HOME" ] && echo "Needs the env var AIL_HOME. Run the script from the virtual environment." && exit 1;
[ -z "$AIL_REDIS" ] && echo "Needs the env var AIL_REDIS. Run the script from the virtual environment." && exit 1;
[ -z "$AIL_LEVELDB" ] && echo "Needs the env var AIL_LEVELDB. Run the script from the virtual environment." && exit 1;
screen -dmS "Queue"
sleep 0.1
echo -e $GREEN"\t* Launching all the queues"$DEFAULT
screen -S "Queue" -X screen -t "Queues" bash -c './launch_queues.py; read x'

View file

@ -1,23 +0,0 @@
#!/bin/bash
set -e
set -x
[ -z "$AIL_HOME" ] && echo "Needs the env var AIL_HOME. Run the script from the virtual environment." && exit 1;
[ -z "$AIL_REDIS" ] && echo "Needs the env var AIL_REDIS. Run the script from the virtual environment." && exit 1;
[ -z "$AIL_LEVELDB" ] && echo "Needs the env var AIL_LEVELDB. Run the script from the virtual environment." && exit 1;
conf_dir="${AIL_HOME}/configs/"
screen -dmS "Redis"
sleep 0.1
echo -e $GREEN"\t* Launching Redis servers"$DEFAULT
screen -S "Redis" -X screen -t "6379" bash -c '../redis/src/redis-server '$conf_dir'6379.conf ; read x'
sleep 0.1
screen -S "Redis" -X screen -t "6380" bash -c '../redis/src/redis-server '$conf_dir'6380.conf ; read x'
sleep 0.1
screen -S "Redis" -X screen -t "6381" bash -c '../redis/src/redis-server '$conf_dir'6381.conf ; read x'
# For Words and curves
sleep 0.1
screen -S "Redis" -X screen -t "6382" bash -c '../redis/src/redis-server '$conf_dir'6382.conf ; read x'

View file

@ -1,77 +0,0 @@
#!/bin/bash
set -e
set -x
[ -z "$AIL_HOME" ] && echo "Needs the env var AIL_HOME. Run the script from the virtual environment." && exit 1;
[ -z "$AIL_REDIS" ] && echo "Needs the env var AIL_REDIS. Run the script from the virtual environment." && exit 1;
[ -z "$AIL_LEVELDB" ] && echo "Needs the env var AIL_LEVELDB. Run the script from the virtual environment." && exit 1;
echo -e "\t* Checking configuration"
bash -c "./Update-conf.py"
exitStatus=$?
if [ $exitStatus -ge 1 ]; then
echo -e $RED"\t* Configuration not up-to-date"$DEFAULT
exit
fi
echo -e $GREEN"\t* Configuration up-to-date"$DEFAULT
screen -dmS "Script"
sleep 0.1
echo -e $GREEN"\t* Launching ZMQ scripts"$DEFAULT
screen -S "Script" -X screen -t "ModuleInformation" bash -c './ModulesInformationV2.py -k 0 -c 1; read x'
sleep 0.1
screen -S "Script" -X screen -t "Mixer" bash -c './Mixer.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "Global" bash -c './Global.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "Duplicates" bash -c './Duplicates.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "Attributes" bash -c './Attributes.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "Lines" bash -c './Lines.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "DomClassifier" bash -c './DomClassifier.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "Categ" bash -c './Categ.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "Tokenize" bash -c './Tokenize.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "CreditCards" bash -c './CreditCards.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "Onion" bash -c './Onion.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "Mail" bash -c './Mail.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "Web" bash -c './Web.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "Credential" bash -c './Credential.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "Curve" bash -c './Curve.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "CurveManageTopSets" bash -c './CurveManageTopSets.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "RegexForTermsFrequency" bash -c './RegexForTermsFrequency.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "SetForTermsFrequency" bash -c './SetForTermsFrequency.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "Indexer" bash -c './Indexer.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "Keys" bash -c './Keys.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "Phone" bash -c './Phone.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "Release" bash -c './Release.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "Cve" bash -c './Cve.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "WebStats" bash -c './WebStats.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "ModuleStats" bash -c './ModuleStats.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "SQLInjectionDetection" bash -c './SQLInjectionDetection.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "alertHandler" bash -c './alertHandler.py; read x'
sleep 0.1
screen -S "Script" -X screen -t "SentimentAnalysis" bash -c './SentimentAnalysis.py; read x'

View file

@ -37,7 +37,7 @@ class HiddenServices(object):
"""
def __init__(self, domain, type):
def __init__(self, domain, type, port=80):
configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
if not os.path.exists(configfile):
@ -59,11 +59,14 @@ class HiddenServices(object):
db=cfg.getint("ARDB_Metadata", "db"),
decode_responses=True)
self.PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], cfg.get("Directories", "pastes")) + '/'
self.domain = domain
self.type = type
self.port = port
self.tags = {}
if type == 'onion':
if type == 'onion' or type == 'regular':
self.paste_directory = os.path.join(os.environ['AIL_HOME'], cfg.get("Directories", "pastes"))
self.paste_crawled_directory = os.path.join(self.paste_directory, cfg.get("Directories", "crawled"))
self.paste_crawled_directory_name = cfg.get("Directories", "crawled")
@ -75,11 +78,24 @@ class HiddenServices(object):
## TODO: # FIXME: add error
pass
#def remove_absolute_path_link(self, key, value):
# print(key)
# print(value)
def update_item_path_children(self, key, children):
if self.PASTES_FOLDER in children:
self.r_serv_metadata.srem(key, children)
children = children.replace(self.PASTES_FOLDER, '', 1)
self.r_serv_metadata.sadd(key, children)
return children
def get_origin_paste_name(self):
origin_paste = self.r_serv_onion.hget('onion_metadata:{}'.format(self.domain), 'paste_parent')
if origin_paste is None:
origin_item = self.r_serv_onion.hget('onion_metadata:{}'.format(self.domain), 'paste_parent')
if origin_item is None:
return ''
return origin_paste.replace(self.paste_directory+'/', '')
elif origin_item == 'auto' or origin_item == 'manual':
return origin_item
return origin_item.replace(self.paste_directory+'/', '')
def get_domain_tags(self, update=False):
if not update:
@ -88,60 +104,119 @@ class HiddenServices(object):
self.get_last_crawled_pastes()
return self.tags
def update_domain_tags(self, children):
p_tags = self.r_serv_metadata.smembers('tag:'+children)
def update_domain_tags(self, item):
if self.r_serv_metadata.exists('tag:{}'.format(item)):
p_tags = self.r_serv_metadata.smembers('tag:{}'.format(item))
# update path here
else:
# need to remove it
if self.paste_directory in item:
p_tags = self.r_serv_metadata.smembers('tag:{}'.format(item.replace(self.paste_directory+'/', '')))
# need to remove it
else:
p_tags = self.r_serv_metadata.smembers('tag:{}'.format(os.path.join(self.paste_directory, item)))
for tag in p_tags:
self.tags[tag] = self.tags.get(tag, 0) + 1
#todo use the right paste
def get_last_crawled_pastes(self):
paste_parent = self.r_serv_onion.hget('onion_metadata:{}'.format(self.domain), 'paste_parent')
#paste_parent = paste_parent.replace(self.paste_directory, '')[1:]
return self.get_all_pastes_domain(paste_parent)
def get_first_crawled(self):
res = self.r_serv_onion.zrange('crawler_history_{}:{}:{}'.format(self.type, self.domain, self.port), 0, 0, withscores=True)
if res:
res = res[0]
return {'root_item':res[0], 'epoch':int(res[1])}
else:
return {}
def get_all_pastes_domain(self, father):
def get_last_crawled(self):
res = self.r_serv_onion.zrevrange('crawler_history_{}:{}:{}'.format(self.type, self.domain, self.port), 0, 0, withscores=True)
if res:
return {'root_item':res[0][0], 'epoch':res[0][1]}
else:
return {}
#todo use the right paste
def get_domain_crawled_core_item(self, epoch=None):
core_item = {}
if epoch:
list_root = self.r_serv_onion.zrevrangebyscore('crawler_history_{}:{}:{}'.format(self.type, self.domain, self.port), int(epoch), int(epoch))
if list_root:
core_item['root_item'] = list_root[0]
core_item['epoch'] = epoch
return core_item
# no history found for this epoch
if not core_item:
return self.get_last_crawled()
#todo use the right paste
def get_last_crawled_pastes(self, item_root=None):
if item_root is None:
item_root = self.get_domain_crawled_core_item(self)
return self.get_all_pastes_domain(item_root)
def get_all_pastes_domain(self, root_item):
if root_item is None:
return []
l_crawled_pastes = []
l_crawled_pastes = self.get_item_crawled_children(root_item)
l_crawled_pastes.append(root_item)
self.update_domain_tags(root_item)
return l_crawled_pastes
def get_item_crawled_children(self, father):
if father is None:
return []
l_crawled_pastes = []
paste_parent = father.replace(self.paste_directory+'/', '')
paste_childrens = self.r_serv_metadata.smembers('paste_children:{}'.format(paste_parent))
## TODO: # FIXME: remove me
paste_children = self.r_serv_metadata.smembers('paste_children:{}'.format(father))
paste_childrens = paste_childrens | paste_children
key = 'paste_children:{}'.format(father)
paste_childrens = self.r_serv_metadata.smembers(key)
for children in paste_childrens:
children = self.update_item_path_children(key, children)
if self.domain in children:
l_crawled_pastes.append(children)
self.update_domain_tags(children)
l_crawled_pastes.extend(self.get_all_pastes_domain(children))
l_crawled_pastes.extend(self.get_item_crawled_children(children))
return l_crawled_pastes
def get_item_link(self, item):
link = self.r_serv_metadata.hget('paste_metadata:{}'.format(item), 'real_link')
if link is None:
if self.paste_directory in item:
self.r_serv_metadata.hget('paste_metadata:{}'.format(item.replace(self.paste_directory+'/', '')), 'real_link')
else:
key = os.path.join(self.paste_directory, item)
link = self.r_serv_metadata.hget('paste_metadata:{}'.format(key), 'real_link')
#if link:
#self.remove_absolute_path_link(key, link)
return link
def get_all_links(self, l_items):
dict_links = {}
for item in l_items:
link = self.get_item_link(item)
if link:
dict_links[item] = link
return dict_links
# experimental
def get_domain_son(self, l_paste):
if l_paste is None:
return None
set_domain = set()
for paste in l_paste:
paste_full = paste.replace(self.paste_directory+'/', '')
paste_childrens = self.r_serv_metadata.smembers('paste_children:{}'.format(paste_full))
## TODO: # FIXME: remove me
paste_children = self.r_serv_metadata.smembers('paste_children:{}'.format(paste))
paste_childrens = paste_childrens | paste_children
paste_childrens = self.r_serv_metadata.smembers('paste_children:{}'.format(paste))
for children in paste_childrens:
if not self.domain in children:
print(children)
set_domain.add((children.split('.onion')[0]+'.onion').split('/')[-1])
return set_domain
'''
def get_all_domain_son(self, father):
if father is None:
return []
l_crawled_pastes = []
paste_parent = father.replace(self.paste_directory+'/', '')
paste_childrens = self.r_serv_metadata.smembers('paste_children:{}'.format(paste_parent))
## TODO: # FIXME: remove me
paste_children = self.r_serv_metadata.smembers('paste_children:{}'.format(father))
paste_childrens = paste_childrens | paste_children
paste_childrens = self.r_serv_metadata.smembers('paste_children:{}'.format(father))
for children in paste_childrens:
if not self.domain in children:
l_crawled_pastes.append(children)
@ -149,16 +224,19 @@ class HiddenServices(object):
l_crawled_pastes.extend(self.get_all_domain_son(children))
return l_crawled_pastes
'''
def get_domain_random_screenshot(self, l_crawled_pastes, num_screenshot = 1):
l_screenshot_paste = []
for paste in l_crawled_pastes:
## FIXME: # TODO: remove me
origin_paste = paste
paste= paste.replace(self.paste_directory+'/', '')
paste = paste.replace(self.paste_crawled_directory_name, '')
if os.path.isfile( '{}{}.png'.format(self.screenshot_directory, paste) ):
l_screenshot_paste.append(paste[1:])
screenshot = self.r_serv_metadata.hget('paste_metadata:{}'.format(paste), 'screenshot')
if screenshot:
screenshot = os.path.join(screenshot[0:2], screenshot[2:4], screenshot[4:6], screenshot[6:8], screenshot[8:10], screenshot[10:12], screenshot[12:])
l_screenshot_paste.append({'screenshot': screenshot, 'item': origin_paste})
if len(l_screenshot_paste) > num_screenshot:
l_random_screenshot = []
@ -176,6 +254,7 @@ class HiddenServices(object):
l_crawled_pastes = []
return l_crawled_pastes
'''
def get_last_crawled_pastes_fileSearch(self):
last_check = self.r_serv_onion.hget('onion_metadata:{}'.format(self.domain), 'last_check')
@ -185,3 +264,4 @@ class HiddenServices(object):
pastes_path = os.path.join(self.paste_crawled_directory, date[0:4], date[4:6], date[6:8])
l_crawled_pastes = [f for f in os.listdir(pastes_path) if self.domain in f]
return l_crawled_pastes
'''

View file

@ -82,14 +82,14 @@ class Paste(object):
db=cfg.getint("ARDB_Metadata", "db"),
decode_responses=True)
PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], cfg.get("Directories", "pastes"))
if PASTES_FOLDER not in p_path:
self.PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], cfg.get("Directories", "pastes"))
if self.PASTES_FOLDER not in p_path:
self.p_rel_path = p_path
p_path = os.path.join(PASTES_FOLDER, p_path)
self.p_path = os.path.join(self.PASTES_FOLDER, p_path)
else:
self.p_rel_path = None
self.p_path = p_path
self.p_rel_path = p_path.replace(self.PASTES_FOLDER+'/', '', 1)
self.p_name = os.path.basename(self.p_path)
self.p_size = round(os.path.getsize(self.p_path)/1024.0, 2)
self.p_mime = magic.from_buffer("test", mime=True)
@ -101,7 +101,7 @@ class Paste(object):
var = self.p_path.split('/')
self.p_date = Date(var[-4], var[-3], var[-2])
self.p_rel_path = os.path.join(var[-4], var[-3], var[-2], self.p_name)
self.p_date_path = os.path.join(var[-4], var[-3], var[-2], self.p_name)
self.p_source = var[-5]
self.supposed_url = 'https://{}/{}'.format(self.p_source.replace('_pro', ''), var[-1].split('.gz')[0])
@ -241,6 +241,16 @@ class Paste(object):
def _get_p_date(self):
return self.p_date
# used
def get_p_date(self):
return self.p_date
def get_item_source(self):
return self.p_source
def get_item_size(self):
return self.p_size
def _get_p_size(self):
return self.p_size
@ -286,14 +296,22 @@ class Paste(object):
return False, var
def _get_p_duplicate(self):
self.p_duplicate = self.store_metadata.smembers('dup:'+self.p_path)
if self.p_rel_path is not None:
self.p_duplicate.union( self.store_metadata.smembers('dup:'+self.p_rel_path) )
p_duplicate = self.store_metadata.smembers('dup:'+self.p_path)
# remove absolute path #fix-db
if p_duplicate:
for duplicate_string in p_duplicate:
self.store_metadata.srem('dup:'+self.p_path, duplicate_string)
self.store_metadata.sadd('dup:'+self.p_rel_path, duplicate_string.replace(self.PASTES_FOLDER+'/', '', 1))
self.p_duplicate = self.store_metadata.smembers('dup:'+self.p_rel_path)
if self.p_duplicate is not None:
return list(self.p_duplicate)
else:
return '[]'
def get_nb_duplicate(self):
# # TODO: FIXME use relative path
return self.store_metadata.scard('dup:'+self.p_path) + self.store_metadata.scard('dup:'+self.p_rel_path)
def _get_p_tags(self):
self.p_tags = self.store_metadata.smembers('tag:'+path, tag)
if self.self.p_tags is not None:
@ -304,6 +322,9 @@ class Paste(object):
def get_p_rel_path(self):
return self.p_rel_path
def get_p_date_path(self):
return self.p_date_path
def save_all_attributes_redis(self, key=None):
"""
Saving all the attributes in a "Redis-like" Database (Redis, LevelDB)

View file

@ -249,5 +249,5 @@ db = 0
[Crawler]
activate_crawler = False
crawler_depth_limit = 1
splash_url_onion = http://127.0.0.1
splash_onion_port = 8050-8052
splash_url = http://127.0.0.1
splash_port = 8050-8052

140
bin/packages/git_status.py Executable file
View file

@ -0,0 +1,140 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
import subprocess
TERMINAL_RED = '\033[91m'
TERMINAL_YELLOW = '\33[93m'
TERMINAL_BLUE = '\33[94m'
TERMINAL_BLINK = '\33[6m'
TERMINAL_DEFAULT = '\033[0m'
# Check if working directory is clean
def is_working_directory_clean(verbose=False):
if verbose:
print('check if this git directory is clean ...')
#print('git ls-files -m')
process = subprocess.run(['git', 'ls-files', '-m'], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
if process.returncode == 0:
res = process.stdout
if res == b'':
return True
else:
return False
else:
if verbose:
print('{}{}{}'.format(TERMINAL_RED, process.stderr.decode(), TERMINAL_DEFAULT))
return False
# Check if this git is a fork
def is_not_fork(origin_repo, verbose=False):
if verbose:
print('check if this git is a fork ...')
#print('git remote -v')
process = subprocess.run(['git', 'remote', '-v'], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
if process.returncode == 0:
res = process.stdout.decode()
if verbose:
print(res)
if 'origin {}'.format(origin_repo) in res:
return True
else:
return False
else:
if verbose:
print('{}{}{}'.format(TERMINAL_RED, process.stderr.decode(), TERMINAL_DEFAULT))
return False
# Get current branch
def get_current_branch(verbose=False):
if verbose:
print('retrieving current branch ...')
#print('git rev-parse --abbrev-ref HEAD')
process = subprocess.run(['git', 'rev-parse', '--abbrev-ref', 'HEAD'], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
if process.returncode == 0:
current_branch = process.stdout.replace(b'\n', b'').decode()
if verbose:
print(current_branch)
return current_branch
else:
if verbose:
print('{}{}{}'.format(TERMINAL_RED, process.stderr.decode(), TERMINAL_DEFAULT))
return ''
# Get last commit id on master branch from remote
def get_last_commit_id_from_remote(branch='master', verbose=False):
if verbose:
print('retrieving last remote commit id ...')
#print('git ls-remote origin master')
process = subprocess.run(['git', 'ls-remote', 'origin', branch], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
if process.returncode == 0:
last_commit_id = process.stdout.split(b'\t')[0].replace(b'\n', b'').decode()
if verbose:
print(last_commit_id)
return last_commit_id
else:
if verbose:
print('{}{}{}'.format(TERMINAL_RED, process.stderr.decode(), TERMINAL_DEFAULT))
return ''
# Get last commit id on master branch from local
def get_last_commit_id_from_local(branch='master', verbose=False):
if verbose:
print('retrieving last local commit id ...')
#print('git rev-parse master')
process = subprocess.run(['git', 'rev-parse', branch], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
if process.returncode == 0:
last_commit_id = process.stdout.replace(b'\n', b'').decode()
if verbose:
print(last_commit_id)
return last_commit_id
else:
if verbose:
print('{}{}{}'.format(TERMINAL_RED, process.stderr.decode(), TERMINAL_DEFAULT))
return ''
# Get last local tag
def get_last_tag_from_local(verbose=False):
if verbose:
print('retrieving last local tag ...')
#print('git describe --abbrev=0 --tags')
process = subprocess.run(['git', 'describe', '--abbrev=0', '--tags'], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
if process.returncode == 0:
last_local_tag = process.stdout.replace(b'\n', b'').decode()
if verbose:
print(last_local_tag)
return last_local_tag
else:
if verbose:
print('{}{}{}'.format(TERMINAL_RED, process.stderr.decode(), TERMINAL_DEFAULT))
return ''
# Get last local tag
def get_last_tag_from_remote(verbose=False):
if verbose:
print('retrieving last remote tag ...')
#print('git ls-remote --tags')
process = subprocess.run(['git', 'ls-remote', '--tags'], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
if process.returncode == 0:
res = process.stdout.split(b'\n')[-2].split(b'/')[-1].replace(b'^{}', b'').decode()
if verbose:
print(res)
return res
else:
if verbose:
print('{}{}{}'.format(TERMINAL_RED, process.stderr.decode(), TERMINAL_DEFAULT))
return ''
if __name__ == "__main__":
get_last_commit_id_from_remote(verbose=True)
get_last_commit_id_from_local(verbose=True)
get_last_tag_from_local(verbose=True)
get_current_branch(verbose=True)
print(is_fork('https://github.com/CIRCL/AIL-framework.git'))
print(is_working_directory_clean())
get_last_tag_from_remote(verbose=True)

View file

@ -51,7 +51,7 @@ publish = Redis_CreditCards,Redis_Mail,Redis_Onion,Redis_Web,Redis_Credential,Re
[CreditCards]
subscribe = Redis_CreditCards
publish = Redis_Duplicate,Redis_ModuleStats,Redis_alertHandler,Redis_Tags
publish = Redis_Duplicate,Redis_ModuleStats,Redis_Tags
[BankAccount]
subscribe = Redis_Global
@ -59,12 +59,12 @@ publish = Redis_Duplicate,Redis_Tags
[Mail]
subscribe = Redis_Mail
publish = Redis_Duplicate,Redis_ModuleStats,Redis_alertHandler,Redis_Tags
publish = Redis_Duplicate,Redis_ModuleStats,Redis_Tags
[Onion]
subscribe = Redis_Onion
publish = Redis_ValidOnion,ZMQ_FetchedOnion,Redis_alertHandler,Redis_Tags,Redis_Crawler
#publish = Redis_Global,Redis_ValidOnion,ZMQ_FetchedOnion,Redis_alertHandler
publish = Redis_ValidOnion,ZMQ_FetchedOnion,Redis_Tags,Redis_Crawler
#publish = Redis_Global,Redis_ValidOnion,ZMQ_FetchedOnion
[DumpValidOnion]
subscribe = Redis_ValidOnion
@ -78,18 +78,15 @@ subscribe = Redis_Url
[LibInjection]
subscribe = Redis_Url
publish = Redis_alertHandler,Redis_Duplicate,Redis_Tags
publish = Redis_Duplicate,Redis_Tags
[SQLInjectionDetection]
subscribe = Redis_Url
publish = Redis_alertHandler,Redis_Duplicate,Redis_Tags
publish = Redis_Duplicate,Redis_Tags
[ModuleStats]
subscribe = Redis_ModuleStats
[alertHandler]
subscribe = Redis_alertHandler
[Tags]
subscribe = Redis_Tags
publish = Redis_Tags_feed
@ -99,7 +96,7 @@ subscribe = Redis_Tags_feed
#[send_to_queue]
#subscribe = Redis_Cve
#publish = Redis_alertHandler,Redis_Tags
#publish = Redis_Tags
[SentimentAnalysis]
subscribe = Redis_Global
@ -109,31 +106,31 @@ subscribe = Redis_Global
[Credential]
subscribe = Redis_Credential
publish = Redis_Duplicate,Redis_ModuleStats,Redis_alertHandler,Redis_Tags
publish = Redis_Duplicate,Redis_ModuleStats,Redis_Tags
[Cve]
subscribe = Redis_Cve
publish = Redis_alertHandler,Redis_Duplicate,Redis_Tags
publish = Redis_Duplicate,Redis_Tags
[Phone]
subscribe = Redis_Global
publish = Redis_Duplicate,Redis_alertHandler,Redis_Tags
publish = Redis_Duplicate,Redis_Tags
[Keys]
subscribe = Redis_Global
publish = Redis_Duplicate,Redis_alertHandler,Redis_Tags
publish = Redis_Duplicate,Redis_Tags
[ApiKey]
subscribe = Redis_ApiKey
publish = Redis_Duplicate,Redis_alertHandler,Redis_Tags
publish = Redis_Duplicate,Redis_Tags
[Decoder]
subscribe = Redis_Global
publish = Redis_Duplicate,Redis_alertHandler,Redis_Tags
publish = Redis_Duplicate,Redis_Tags
[Bitcoin]
subscribe = Redis_Global
publish = Redis_Duplicate,Redis_alertHandler,Redis_Tags
publish = Redis_Duplicate,Redis_Tags
[submit_paste]
subscribe = Redis
@ -142,4 +139,3 @@ publish = Redis_Mixer
[Crawler]
subscribe = Redis_Crawler
publish = Redis_Mixer,Redis_Tags

View file

@ -96,25 +96,48 @@ def remove_submit_uuid(uuid):
r_serv_db.srem('submitted:uuid', uuid)
print('{} all file submitted'.format(uuid))
def add_item_tag(tag, item_path):
item_date = int(get_item_date(item_path))
#add tag
r_serv_metadata.sadd('tag:{}'.format(item_path), tag)
r_serv_tags.sadd('{}:{}'.format(tag, item_date), item_path)
r_serv_tags.hincrby('daily_tags:{}'.format(item_date), tag, 1)
tag_first_seen = r_serv_tags.hget('tag_metadata:{}'.format(tag), 'last_seen')
if tag_first_seen is None:
tag_first_seen = 99999999
else:
tag_first_seen = int(tag_first_seen)
tag_last_seen = r_serv_tags.hget('tag_metadata:{}'.format(tag), 'last_seen')
if tag_last_seen is None:
tag_last_seen = 0
else:
tag_last_seen = int(tag_last_seen)
#add new tag in list of all used tags
r_serv_tags.sadd('list_tags', tag)
# update fisrt_seen/last_seen
if item_date < tag_first_seen:
r_serv_tags.hset('tag_metadata:{}'.format(tag), 'first_seen', item_date)
# update metadata last_seen
if item_date > tag_last_seen:
r_serv_tags.hset('tag_metadata:{}'.format(tag), 'last_seen', item_date)
def add_tags(tags, tagsgalaxies, path):
list_tag = tags.split(',')
list_tag_galaxies = tagsgalaxies.split(',')
if list_tag != ['']:
for tag in list_tag:
#add tag
r_serv_metadata.sadd('tag:'+path, tag)
r_serv_tags.sadd(tag, path)
#add new tag in list of all used tags
r_serv_tags.sadd('list_tags', tag)
add_item_tag(tag, path)
if list_tag_galaxies != ['']:
for tag in list_tag_galaxies:
#add tag
r_serv_metadata.sadd('tag:'+path, tag)
r_serv_tags.sadd(tag, path)
#add new tag in list of all used tags
r_serv_tags.sadd('list_tags', tag)
add_item_tag(tag, path)
def verify_extention_filename(filename):
if not '.' in filename:

View file

@ -12,6 +12,8 @@ import redis
import json
import time
from hashlib import sha256
from scrapy.spidermiddlewares.httperror import HttpError
from twisted.internet.error import DNSLookupError
from twisted.internet.error import TimeoutError
@ -28,10 +30,10 @@ from Helper import Process
class TorSplashCrawler():
def __init__(self, splash_url, crawler_depth_limit):
def __init__(self, splash_url, crawler_options):
self.process = CrawlerProcess({'LOG_ENABLED': False})
self.crawler = Crawler(self.TorSplashSpider, {
'USER_AGENT': 'Mozilla/5.0 (Windows NT 6.1; rv:24.0) Gecko/20100101 Firefox/24.0',
'USER_AGENT': crawler_options['user_agent'],
'SPLASH_URL': splash_url,
'ROBOTSTXT_OBEY': False,
'DOWNLOADER_MIDDLEWARES': {'scrapy_splash.SplashCookiesMiddleware': 723,
@ -42,26 +44,34 @@ class TorSplashCrawler():
'DUPEFILTER_CLASS': 'scrapy_splash.SplashAwareDupeFilter',
'HTTPERROR_ALLOW_ALL': True,
'RETRY_TIMES': 2,
'CLOSESPIDER_PAGECOUNT': 50,
'DEPTH_LIMIT': crawler_depth_limit
'CLOSESPIDER_PAGECOUNT': crawler_options['closespider_pagecount'],
'DEPTH_LIMIT': crawler_options['depth_limit']
})
def crawl(self, type, url, domain, original_paste, super_father):
self.process.crawl(self.crawler, type=type, url=url, domain=domain,original_paste=original_paste, super_father=super_father)
def crawl(self, type, crawler_options, date, url, domain, port, original_item):
self.process.crawl(self.crawler, type=type, crawler_options=crawler_options, date=date, url=url, domain=domain, port=port, original_item=original_item)
self.process.start()
class TorSplashSpider(Spider):
name = 'TorSplashSpider'
def __init__(self, type, url, domain,original_paste, super_father, *args, **kwargs):
def __init__(self, type, crawler_options, date, url, domain, port, original_item, *args, **kwargs):
self.type = type
self.original_paste = original_paste
self.super_father = super_father
self.original_item = original_item
self.root_key = None
self.start_urls = url
self.domains = [domain]
date = datetime.datetime.now().strftime("%Y/%m/%d")
self.full_date = datetime.datetime.now().strftime("%Y%m%d")
self.date_month = datetime.datetime.now().strftime("%Y%m")
self.port = str(port)
date_str = '{}/{}/{}'.format(date['date_day'][0:4], date['date_day'][4:6], date['date_day'][6:8])
self.full_date = date['date_day']
self.date_month = date['date_month']
self.date_epoch = int(date['epoch'])
self.arg_crawler = { 'html': crawler_options['html'],
'wait': 10,
'render_all': 1,
'har': crawler_options['har'],
'png': crawler_options['png']}
config_section = 'Crawler'
self.p = Process(config_section)
@ -90,12 +100,13 @@ class TorSplashCrawler():
db=self.p.config.getint("ARDB_Onion", "db"),
decode_responses=True)
self.crawler_path = os.path.join(self.p.config.get("Directories", "crawled"), date )
self.crawler_path = os.path.join(self.p.config.get("Directories", "crawled"), date_str )
self.crawled_paste_filemame = os.path.join(os.environ['AIL_HOME'], self.p.config.get("Directories", "pastes"),
self.p.config.get("Directories", "crawled"), date )
self.p.config.get("Directories", "crawled"), date_str )
self.crawled_screenshot = os.path.join(os.environ['AIL_HOME'], self.p.config.get("Directories", "crawled_screenshot"), date )
self.crawled_har = os.path.join(os.environ['AIL_HOME'], self.p.config.get("Directories", "crawled_screenshot"), date_str )
self.crawled_screenshot = os.path.join(os.environ['AIL_HOME'], self.p.config.get("Directories", "crawled_screenshot") )
def start_requests(self):
yield SplashRequest(
@ -103,12 +114,8 @@ class TorSplashCrawler():
self.parse,
errback=self.errback_catcher,
endpoint='render.json',
meta={'father': self.original_paste},
args={ 'html': 1,
'wait': 10,
'render_all': 1,
'har': 1,
'png': 1}
meta={'father': self.original_item, 'root_key': None},
args=self.arg_crawler
)
def parse(self,response):
@ -131,12 +138,13 @@ class TorSplashCrawler():
UUID = self.domains[0][-215:]+str(uuid.uuid4())
else:
UUID = self.domains[0]+str(uuid.uuid4())
filename_paste = os.path.join(self.crawled_paste_filemame, UUID)
filename_paste_full = os.path.join(self.crawled_paste_filemame, UUID)
relative_filename_paste = os.path.join(self.crawler_path, UUID)
filename_screenshot = os.path.join(self.crawled_screenshot, UUID +'.png')
filename_har = os.path.join(self.crawled_har, UUID)
# # TODO: modify me
# save new paste on disk
if self.save_crawled_paste(filename_paste, response.data['html']):
if self.save_crawled_paste(relative_filename_paste, response.data['html']):
# add this paste to the domain crawled set # TODO: # FIXME: put this on cache ?
#self.r_serv_onion.sadd('temp:crawled_domain_pastes:{}'.format(self.domains[0]), filename_paste)
@ -148,27 +156,54 @@ class TorSplashCrawler():
# create onion metadata
if not self.r_serv_onion.exists('{}_metadata:{}'.format(self.type, self.domains[0])):
self.r_serv_onion.hset('{}_metadata:{}'.format(self.type, self.domains[0]), 'first_seen', self.full_date)
self.r_serv_onion.hset('{}_metadata:{}'.format(self.type, self.domains[0]), 'last_seen', self.full_date)
# create root_key
if self.root_key is None:
self.root_key = relative_filename_paste
# Create/Update crawler history
self.r_serv_onion.zadd('crawler_history_{}:{}:{}'.format(self.type, self.domains[0], self.port), self.date_epoch, self.root_key)
# Update domain port number
all_domain_ports = self.r_serv_onion.hget('{}_metadata:{}'.format(self.type, self.domains[0]), 'ports')
if all_domain_ports:
all_domain_ports = all_domain_ports.split(';')
else:
all_domain_ports = []
if self.port not in all_domain_ports:
all_domain_ports.append(self.port)
self.r_serv_onion.hset('{}_metadata:{}'.format(self.type, self.domains[0]), 'ports', ';'.join(all_domain_ports))
#create paste metadata
self.r_serv_metadata.hset('paste_metadata:'+filename_paste, 'super_father', self.super_father)
self.r_serv_metadata.hset('paste_metadata:'+filename_paste, 'father', response.meta['father'])
self.r_serv_metadata.hset('paste_metadata:'+filename_paste, 'domain', self.domains[0])
self.r_serv_metadata.hset('paste_metadata:'+filename_paste, 'real_link', response.url)
self.r_serv_metadata.hset('paste_metadata:{}'.format(relative_filename_paste), 'super_father', self.root_key)
self.r_serv_metadata.hset('paste_metadata:{}'.format(relative_filename_paste), 'father', response.meta['father'])
self.r_serv_metadata.hset('paste_metadata:{}'.format(relative_filename_paste), 'domain', '{}:{}'.format(self.domains[0], self.port))
self.r_serv_metadata.hset('paste_metadata:{}'.format(relative_filename_paste), 'real_link', response.url)
self.r_serv_metadata.sadd('paste_children:'+response.meta['father'], filename_paste)
dirname = os.path.dirname(filename_screenshot)
if not os.path.exists(dirname):
os.makedirs(dirname)
self.r_serv_metadata.sadd('paste_children:'+response.meta['father'], relative_filename_paste)
if 'png' in response.data:
size_screenshot = (len(response.data['png'])*3) /4
if size_screenshot < 5000000: #bytes
with open(filename_screenshot, 'wb') as f:
f.write(base64.standard_b64decode(response.data['png'].encode()))
image_content = base64.standard_b64decode(response.data['png'].encode())
hash = sha256(image_content).hexdigest()
img_dir_path = os.path.join(hash[0:2], hash[2:4], hash[4:6], hash[6:8], hash[8:10], hash[10:12])
filename_img = os.path.join(self.crawled_screenshot, 'screenshot', img_dir_path, hash[12:] +'.png')
dirname = os.path.dirname(filename_img)
if not os.path.exists(dirname):
os.makedirs(dirname)
if not os.path.exists(filename_img):
with open(filename_img, 'wb') as f:
f.write(image_content)
# add item metadata
self.r_serv_metadata.hset('paste_metadata:{}'.format(relative_filename_paste), 'screenshot', hash)
# add sha256 metadata
self.r_serv_onion.sadd('screenshot:{}'.format(hash), relative_filename_paste)
with open(filename_screenshot+'har.txt', 'wb') as f:
if 'har' in response.data:
dirname = os.path.dirname(filename_har)
if not os.path.exists(dirname):
os.makedirs(dirname)
with open(filename_har+'.json', 'wb') as f:
f.write(json.dumps(response.data['har']).encode())
# save external links in set
@ -184,12 +219,8 @@ class TorSplashCrawler():
self.parse,
errback=self.errback_catcher,
endpoint='render.json',
meta={'father': relative_filename_paste},
args={ 'html': 1,
'png': 1,
'render_all': 1,
'har': 1,
'wait': 10}
meta={'father': relative_filename_paste, 'root_key': response.meta['root_key']},
args=self.arg_crawler
)
def errback_catcher(self, failure):
@ -203,17 +234,17 @@ class TorSplashCrawler():
self.logger.error('Splash, ResponseNeverReceived for %s, retry in 10s ...', url)
time.sleep(10)
if response:
response_root_key = response.meta['root_key']
else:
response_root_key = None
yield SplashRequest(
url,
self.parse,
errback=self.errback_catcher,
endpoint='render.json',
meta={'father': father},
args={ 'html': 1,
'png': 1,
'render_all': 1,
'har': 1,
'wait': 10}
meta={'father': father, 'root_key': response.meta['root_key']},
args=self.arg_crawler
)
else:

View file

@ -3,13 +3,15 @@
import os
import sys
import json
import redis
import configparser
from TorSplashCrawler import TorSplashCrawler
if __name__ == '__main__':
if len(sys.argv) != 7:
print('usage:', 'tor_crawler.py', 'splash_url', 'type', 'url', 'domain', 'paste', 'super_father')
if len(sys.argv) != 2:
print('usage:', 'tor_crawler.py', 'uuid')
exit(1)
configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
@ -21,14 +23,28 @@ if __name__ == '__main__':
cfg = configparser.ConfigParser()
cfg.read(configfile)
splash_url = sys.argv[1]
type = sys.argv[2]
crawler_depth_limit = cfg.getint("Crawler", "crawler_depth_limit")
redis_cache = redis.StrictRedis(
host=cfg.get("Redis_Cache", "host"),
port=cfg.getint("Redis_Cache", "port"),
db=cfg.getint("Redis_Cache", "db"),
decode_responses=True)
url = sys.argv[3]
domain = sys.argv[4]
paste = sys.argv[5]
super_father = sys.argv[6]
# get crawler config key
uuid = sys.argv[1]
crawler = TorSplashCrawler(splash_url, crawler_depth_limit)
crawler.crawl(type, url, domain, paste, super_father)
# get configs
crawler_json = json.loads(redis_cache.get('crawler_request:{}'.format(uuid)))
splash_url = crawler_json['splash_url']
service_type = crawler_json['service_type']
url = crawler_json['url']
domain = crawler_json['domain']
port = crawler_json['port']
original_item = crawler_json['item']
crawler_options = crawler_json['crawler_options']
date = crawler_json['date']
redis_cache.delete('crawler_request:{}'.format(uuid))
crawler = TorSplashCrawler(splash_url, crawler_options)
crawler.crawl(service_type, crawler_options, date, url, domain, port, original_item)

62
bin/update-background.py Executable file
View file

@ -0,0 +1,62 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
"""
Update AIL
============================
Update AIL in the background
"""
import os
import sys
import redis
import subprocess
import configparser
if __name__ == "__main__":
configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
if not os.path.exists(configfile):
raise Exception('Unable to find the configuration file. \
Did you set environment variables? \
Or activate the virtualenv.')
cfg = configparser.ConfigParser()
cfg.read(configfile)
r_serv = redis.StrictRedis(
host=cfg.get("ARDB_DB", "host"),
port=cfg.getint("ARDB_DB", "port"),
db=cfg.getint("ARDB_DB", "db"),
decode_responses=True)
if r_serv.scard('ail:update_v1.5') != 5:
r_serv.delete('ail:update_error')
r_serv.set('ail:update_in_progress', 'v1.5')
r_serv.set('ail:current_background_update', 'v1.5')
if not r_serv.sismember('ail:update_v1.5', 'onions'):
update_file = os.path.join(os.environ['AIL_HOME'], 'update', 'v1.5', 'Update-ARDB_Onions.py')
process = subprocess.run(['python' ,update_file])
if not r_serv.sismember('ail:update_v1.5', 'metadata'):
update_file = os.path.join(os.environ['AIL_HOME'], 'update', 'v1.5', 'Update-ARDB_Metadata.py')
process = subprocess.run(['python' ,update_file])
if not r_serv.sismember('ail:update_v1.5', 'tags'):
update_file = os.path.join(os.environ['AIL_HOME'], 'update', 'v1.5', 'Update-ARDB_Tags.py')
process = subprocess.run(['python' ,update_file])
if not r_serv.sismember('ail:update_v1.5', 'tags_background'):
update_file = os.path.join(os.environ['AIL_HOME'], 'update', 'v1.5', 'Update-ARDB_Tags_background.py')
process = subprocess.run(['python' ,update_file])
if not r_serv.sismember('ail:update_v1.5', 'crawled_screenshot'):
update_file = os.path.join(os.environ['AIL_HOME'], 'update', 'v1.5', 'Update-ARDB_Onions_screenshots.py')
process = subprocess.run(['python' ,update_file])
if r_serv.scard('ail:update_v1.5') != 5:
r_serv.set('ail:update_error', 'Update v1.5 Failed, please relaunch the bin/update-background.py script')
else:
r_serv.delete('ail:update_in_progress')
r_serv.delete('ail:current_background_script')
r_serv.delete('ail:current_background_script_stat')
r_serv.delete('ail:current_background_update')

View file

@ -0,0 +1,4 @@
[Update]
auto_update = True
upstream = upstream
update-fork = False

View file

@ -39,7 +39,7 @@ sudo apt-get install p7zip-full -y
# REDIS #
test ! -d redis/ && git clone https://github.com/antirez/redis.git
pushd redis/
git checkout 3.2
git checkout 5.0
make
popd

18
update/bin/Update_ARDB.sh Executable file
View file

@ -0,0 +1,18 @@
#!/bin/bash
echo "Killing all screens ..."
bash -c "bash ../../bin/LAUNCH.sh -k"
echo ""
echo "Updating ARDB ..."
pushd ../../
rm -r ardb
pushd ardb/
git clone https://github.com/yinqiwen/ardb.git
git checkout 0.10 || exit 1
make || exit 1
popd
popd
echo "ARDB Updated"
echo ""
exit 0

27
update/bin/Update_Redis.sh Executable file
View file

@ -0,0 +1,27 @@
#!/bin/bash
[ -z "$AIL_HOME" ] && echo "Needs the env var AIL_HOME. Run the script from the virtual environment." && exit 1;
[ -z "$AIL_REDIS" ] && echo "Needs the env var AIL_REDIS. Run the script from the virtual environment." && exit 1;
[ -z "$AIL_ARDB" ] && echo "Needs the env var AIL_ARDB. Run the script from the virtual environment." && exit 1;
[ -z "$AIL_BIN" ] && echo "Needs the env var AIL_ARDB. Run the script from the virtual environment." && exit 1;
[ -z "$AIL_FLASK" ] && echo "Needs the env var AIL_FLASK. Run the script from the virtual environment." && exit 1;
export PATH=$AIL_HOME:$PATH
export PATH=$AIL_REDIS:$PATH
export PATH=$AIL_ARDB:$PATH
export PATH=$AIL_BIN:$PATH
export PATH=$AIL_FLASK:$PATH
echo "Killing all screens ..."
bash -c "bash ${AIL_BIN}/LAUNCH.sh -k"
echo ""
echo "Updating Redis ..."
pushd $AIL_HOME/redis
git pull || exit 1
git checkout 5.0 || exit 1
make || exit 1
popd
echo "Redis Updated"
echo ""
exit 0

View file

@ -0,0 +1,193 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
import os
import sys
import time
import redis
import configparser
def update_tracked_terms(main_key, tracked_container_key):
for tracked_item in r_serv_term.smembers(main_key):
all_items = r_serv_term.smembers(tracked_container_key.format(tracked_item))
for item_path in all_items:
if PASTES_FOLDER in item_path:
new_item_path = item_path.replace(PASTES_FOLDER, '', 1)
r_serv_term.sadd(tracked_container_key.format(tracked_item), new_item_path)
r_serv_term.srem(tracked_container_key.format(tracked_item), item_path)
def update_hash_item(has_type):
#get all hash items:
all_hash_items = r_serv_tag.smembers('infoleak:automatic-detection=\"{}\"'.format(has_type))
for item_path in all_hash_items:
if PASTES_FOLDER in item_path:
base64_key = '{}_paste:{}'.format(has_type, item_path)
hash_key = 'hash_paste:{}'.format(item_path)
if r_serv_metadata.exists(base64_key):
new_base64_key = base64_key.replace(PASTES_FOLDER, '', 1)
res = r_serv_metadata.renamenx(base64_key, new_base64_key)
if res == 0:
print('same key, double name: {}'.format(item_path))
# fusion
all_key = r_serv_metadata.smembers(base64_key)
for elem in all_key:
r_serv_metadata.sadd(new_base64_key, elem)
r_serv_metadata.srem(base64_key, elem)
if r_serv_metadata.exists(hash_key):
new_hash_key = hash_key.replace(PASTES_FOLDER, '', 1)
res = r_serv_metadata.renamenx(hash_key, new_hash_key)
if res == 0:
print('same key, double name: {}'.format(item_path))
# fusion
all_key = r_serv_metadata.smembers(hash_key)
for elem in all_key:
r_serv_metadata.sadd(new_hash_key, elem)
r_serv_metadata.srem(hash_key, elem)
if __name__ == '__main__':
start_deb = time.time()
configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
if not os.path.exists(configfile):
raise Exception('Unable to find the configuration file. \
Did you set environment variables? \
Or activate the virtualenv.')
cfg = configparser.ConfigParser()
cfg.read(configfile)
PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], cfg.get("Directories", "pastes")) + '/'
r_serv = redis.StrictRedis(
host=cfg.get("ARDB_DB", "host"),
port=cfg.getint("ARDB_DB", "port"),
db=cfg.getint("ARDB_DB", "db"),
decode_responses=True)
r_serv_metadata = redis.StrictRedis(
host=cfg.get("ARDB_Metadata", "host"),
port=cfg.getint("ARDB_Metadata", "port"),
db=cfg.getint("ARDB_Metadata", "db"),
decode_responses=True)
r_serv_tag = redis.StrictRedis(
host=cfg.get("ARDB_Tags", "host"),
port=cfg.getint("ARDB_Tags", "port"),
db=cfg.getint("ARDB_Tags", "db"),
decode_responses=True)
r_serv_term = redis.StrictRedis(
host=cfg.get("ARDB_TermFreq", "host"),
port=cfg.getint("ARDB_TermFreq", "port"),
db=cfg.getint("ARDB_TermFreq", "db"),
decode_responses=True)
r_serv_onion = redis.StrictRedis(
host=cfg.get("ARDB_Onion", "host"),
port=cfg.getint("ARDB_Onion", "port"),
db=cfg.getint("ARDB_Onion", "db"),
decode_responses=True)
r_serv.set('ail:current_background_script', 'metadata')
## Update metadata ##
print('Updating ARDB_Metadata ...')
index = 0
start = time.time()
#update stats
r_serv.set('ail:current_background_script_stat', 0)
# Update base64
update_hash_item('base64')
#update stats
r_serv.set('ail:current_background_script_stat', 20)
# Update binary
update_hash_item('binary')
#update stats
r_serv.set('ail:current_background_script_stat', 40)
# Update binary
update_hash_item('hexadecimal')
#update stats
r_serv.set('ail:current_background_script_stat', 60)
total_onion = r_serv_tag.scard('infoleak:submission=\"crawler\"')
nb_updated = 0
last_progress = 0
# Update onion metadata
all_crawled_items = r_serv_tag.smembers('infoleak:submission=\"crawler\"')
for item_path in all_crawled_items:
domain = None
if PASTES_FOLDER in item_path:
old_item_metadata = 'paste_metadata:{}'.format(item_path)
item_path = item_path.replace(PASTES_FOLDER, '', 1)
new_item_metadata = 'paste_metadata:{}'.format(item_path)
res = r_serv_metadata.renamenx(old_item_metadata, new_item_metadata)
#key already exist
if res == 0:
r_serv_metadata.delete(old_item_metadata)
# update domain port
domain = r_serv_metadata.hget(new_item_metadata, 'domain')
if domain:
if domain[-3:] != ':80':
r_serv_metadata.hset(new_item_metadata, 'domain', '{}:80'.format(domain))
super_father = r_serv_metadata.hget(new_item_metadata, 'super_father')
if super_father:
if PASTES_FOLDER in super_father:
r_serv_metadata.hset(new_item_metadata, 'super_father', super_father.replace(PASTES_FOLDER, '', 1))
father = r_serv_metadata.hget(new_item_metadata, 'father')
if father:
if PASTES_FOLDER in father:
r_serv_metadata.hset(new_item_metadata, 'father', father.replace(PASTES_FOLDER, '', 1))
nb_updated += 1
progress = int((nb_updated * 30) /total_onion)
print('{}/{} updated {}%'.format(nb_updated, total_onion, progress + 60))
# update progress stats
if progress != last_progress:
r_serv.set('ail:current_background_script_stat', progress + 60)
last_progress = progress
#update stats
r_serv.set('ail:current_background_script_stat', 90)
## update tracked term/set/regex
# update tracked term
update_tracked_terms('TrackedSetTermSet', 'tracked_{}')
#update stats
r_serv.set('ail:current_background_script_stat', 93)
# update tracked set
update_tracked_terms('TrackedSetSet', 'set_{}')
#update stats
r_serv.set('ail:current_background_script_stat', 96)
# update tracked regex
update_tracked_terms('TrackedRegexSet', 'regex_{}')
#update stats
r_serv.set('ail:current_background_script_stat', 100)
##
end = time.time()
print('Updating ARDB_Metadata Done => {} paths: {} s'.format(index, end - start))
print()
r_serv.sadd('ail:update_v1.5', 'metadata')
##
#Key, Dynamic Update
##
#paste_children
#nb_seen_hash, base64_hash, binary_hash
#paste_onion_external_links
#misp_events, hive_cases
##

152
update/v1.5/Update-ARDB_Onions.py Executable file
View file

@ -0,0 +1,152 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
import os
import sys
import time
import redis
import datetime
import configparser
def substract_date(date_from, date_to):
date_from = datetime.date(int(date_from[0:4]), int(date_from[4:6]), int(date_from[6:8]))
date_to = datetime.date(int(date_to[0:4]), int(date_to[4:6]), int(date_to[6:8]))
delta = date_to - date_from # timedelta
l_date = []
for i in range(delta.days + 1):
date = date_from + datetime.timedelta(i)
l_date.append( date.strftime('%Y%m%d') )
return l_date
def get_date_epoch(date):
return int(datetime.datetime(int(date[0:4]), int(date[4:6]), int(date[6:8])).timestamp())
def get_domain_root_from_paste_childrens(item_father, domain):
item_children = r_serv_metadata.smembers('paste_children:{}'.format(item_father))
domain_root = ''
for item_path in item_children:
# remove absolute_path
if PASTES_FOLDER in item_path:
r_serv_metadata.srem('paste_children:{}'.format(item_father), item_path)
item_path = item_path.replace(PASTES_FOLDER, '', 1)
r_serv_metadata.sadd('paste_children:{}'.format(item_father), item_path)
if domain in item_path:
domain_root = item_path
return domain_root
if __name__ == '__main__':
start_deb = time.time()
configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
if not os.path.exists(configfile):
raise Exception('Unable to find the configuration file. \
Did you set environment variables? \
Or activate the virtualenv.')
cfg = configparser.ConfigParser()
cfg.read(configfile)
PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], cfg.get("Directories", "pastes")) + '/'
r_serv = redis.StrictRedis(
host=cfg.get("ARDB_DB", "host"),
port=cfg.getint("ARDB_DB", "port"),
db=cfg.getint("ARDB_DB", "db"),
decode_responses=True)
r_serv_metadata = redis.StrictRedis(
host=cfg.get("ARDB_Metadata", "host"),
port=cfg.getint("ARDB_Metadata", "port"),
db=cfg.getint("ARDB_Metadata", "db"),
decode_responses=True)
r_serv_tag = redis.StrictRedis(
host=cfg.get("ARDB_Tags", "host"),
port=cfg.getint("ARDB_Tags", "port"),
db=cfg.getint("ARDB_Tags", "db"),
decode_responses=True)
r_serv_onion = redis.StrictRedis(
host=cfg.get("ARDB_Onion", "host"),
port=cfg.getint("ARDB_Onion", "port"),
db=cfg.getint("ARDB_Onion", "db"),
decode_responses=True)
r_serv.set('ail:current_background_script', 'onions')
r_serv.set('ail:current_background_script_stat', 0)
## Update Onion ##
print('Updating ARDB_Onion ...')
index = 0
start = time.time()
# clean down domain from db
date_from = '20180929'
date_today = datetime.date.today().strftime("%Y%m%d")
for date in substract_date(date_from, date_today):
onion_down = r_serv_onion.smembers('onion_down:{}'.format(date))
#print(onion_down)
for onion_domain in onion_down:
if not r_serv_onion.sismember('full_onion_up', onion_domain):
# delete history
all_onion_history = r_serv_onion.lrange('onion_history:{}'.format(onion_domain), 0 ,-1)
if all_onion_history:
for date_history in all_onion_history:
#print('onion_history:{}:{}'.format(onion_domain, date_history))
r_serv_onion.delete('onion_history:{}:{}'.format(onion_domain, date_history))
r_serv_onion.delete('onion_history:{}'.format(onion_domain))
#stats
total_domain = r_serv_onion.scard('full_onion_up')
nb_updated = 0
last_progress = 0
# clean up domain
all_domain_up = r_serv_onion.smembers('full_onion_up')
for onion_domain in all_domain_up:
# delete history
all_onion_history = r_serv_onion.lrange('onion_history:{}'.format(onion_domain), 0 ,-1)
if all_onion_history:
for date_history in all_onion_history:
print('--------')
print('onion_history:{}:{}'.format(onion_domain, date_history))
item_father = r_serv_onion.lrange('onion_history:{}:{}'.format(onion_domain, date_history), 0, 0)
print('item_father: {}'.format(item_father))
try:
item_father = item_father[0]
except IndexError:
r_serv_onion.delete('onion_history:{}:{}'.format(onion_domain, date_history))
continue
#print(item_father)
# delete old history
r_serv_onion.delete('onion_history:{}:{}'.format(onion_domain, date_history))
# create new history
root_key = get_domain_root_from_paste_childrens(item_father, onion_domain)
if root_key:
r_serv_onion.zadd('crawler_history_onion:{}:80'.format(onion_domain), get_date_epoch(date_history), root_key)
print('crawler_history_onion:{}:80 {} {}'.format(onion_domain, get_date_epoch(date_history), root_key))
#update service metadata: paste_parent
r_serv_onion.hset('onion_metadata:{}'.format(onion_domain), 'paste_parent', root_key)
r_serv_onion.delete('onion_history:{}'.format(onion_domain))
r_serv_onion.hset('onion_metadata:{}'.format(onion_domain), 'ports', '80')
r_serv_onion.hdel('onion_metadata:{}'.format(onion_domain), 'last_seen')
nb_updated += 1
progress = int((nb_updated * 100) /total_domain)
print('{}/{} updated {}%'.format(nb_updated, total_domain, progress))
# update progress stats
if progress != last_progress:
r_serv.set('ail:current_background_script_stat', progress)
last_progress = progress
end = time.time()
print('Updating ARDB_Onion Done => {} paths: {} s'.format(index, end - start))
print()
print('Done in {} s'.format(end - start_deb))
r_serv.sadd('ail:update_v1.5', 'onions')

View file

@ -0,0 +1,135 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
import os
import sys
import time
import redis
import datetime
import configparser
from hashlib import sha256
def rreplace(s, old, new, occurrence):
li = s.rsplit(old, occurrence)
return new.join(li)
def substract_date(date_from, date_to):
date_from = datetime.date(int(date_from[0:4]), int(date_from[4:6]), int(date_from[6:8]))
date_to = datetime.date(int(date_to[0:4]), int(date_to[4:6]), int(date_to[6:8]))
delta = date_to - date_from # timedelta
l_date = []
for i in range(delta.days + 1):
date = date_from + datetime.timedelta(i)
l_date.append( date.strftime('%Y%m%d') )
return l_date
if __name__ == '__main__':
start_deb = time.time()
configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
if not os.path.exists(configfile):
raise Exception('Unable to find the configuration file. \
Did you set environment variables? \
Or activate the virtualenv.')
cfg = configparser.ConfigParser()
cfg.read(configfile)
SCREENSHOT_FOLDER = os.path.join(os.environ['AIL_HOME'], cfg.get("Directories", "crawled_screenshot"))
NEW_SCREENSHOT_FOLDER = os.path.join(os.environ['AIL_HOME'], cfg.get("Directories", "crawled_screenshot"), 'screenshot')
PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], cfg.get("Directories", "pastes")) + '/'
r_serv = redis.StrictRedis(
host=cfg.get("ARDB_DB", "host"),
port=cfg.getint("ARDB_DB", "port"),
db=cfg.getint("ARDB_DB", "db"),
decode_responses=True)
r_serv_metadata = redis.StrictRedis(
host=cfg.get("ARDB_Metadata", "host"),
port=cfg.getint("ARDB_Metadata", "port"),
db=cfg.getint("ARDB_Metadata", "db"),
decode_responses=True)
r_serv_tag = redis.StrictRedis(
host=cfg.get("ARDB_Tags", "host"),
port=cfg.getint("ARDB_Tags", "port"),
db=cfg.getint("ARDB_Tags", "db"),
decode_responses=True)
r_serv_onion = redis.StrictRedis(
host=cfg.get("ARDB_Onion", "host"),
port=cfg.getint("ARDB_Onion", "port"),
db=cfg.getint("ARDB_Onion", "db"),
decode_responses=True)
r_serv.set('ail:current_background_script', 'crawled_screenshot')
r_serv.set('ail:current_background_script_stat', 0)
## Update Onion ##
print('Updating ARDB_Onion ...')
index = 0
start = time.time()
# clean down domain from db
date_from = '20180801'
date_today = datetime.date.today().strftime("%Y%m%d")
list_date = substract_date(date_from, date_today)
nb_done = 0
last_progress = 0
total_to_update = len(list_date)
for date in list_date:
screenshot_dir = os.path.join(SCREENSHOT_FOLDER, date[0:4], date[4:6], date[6:8])
if os.path.isdir(screenshot_dir):
print(screenshot_dir)
for file in os.listdir(screenshot_dir):
if file.endswith(".png"):
index += 1
#print(file)
img_path = os.path.join(screenshot_dir, file)
with open(img_path, 'br') as f:
image_content = f.read()
hash = sha256(image_content).hexdigest()
img_dir_path = os.path.join(hash[0:2], hash[2:4], hash[4:6], hash[6:8], hash[8:10], hash[10:12])
filename_img = os.path.join(NEW_SCREENSHOT_FOLDER, img_dir_path, hash[12:] +'.png')
dirname = os.path.dirname(filename_img)
if not os.path.exists(dirname):
os.makedirs(dirname)
if not os.path.exists(filename_img):
os.rename(img_path, filename_img)
else:
os.remove(img_path)
item = os.path.join('crawled', date[0:4], date[4:6], date[6:8], file[:-4])
# add item metadata
r_serv_metadata.hset('paste_metadata:{}'.format(item), 'screenshot', hash)
# add sha256 metadata
r_serv_onion.sadd('screenshot:{}'.format(hash), item)
if file.endswith('.pnghar.txt'):
har_path = os.path.join(screenshot_dir, file)
new_file = rreplace(file, '.pnghar.txt', '.json', 1)
new_har_path = os.path.join(screenshot_dir, new_file)
os.rename(har_path, new_har_path)
progress = int((nb_done * 100) /total_to_update)
# update progress stats
if progress != last_progress:
r_serv.set('ail:current_background_script_stat', progress)
print('{}/{} screenshot updated {}%'.format(nb_done, total_to_update, progress))
last_progress = progress
nb_done += 1
r_serv.set('ail:current_background_script_stat', 100)
end = time.time()
print('Updating ARDB_Onion Done => {} paths: {} s'.format(index, end - start))
print()
print('Done in {} s'.format(end - start_deb))
r_serv.sadd('ail:update_v1.5', 'crawled_screenshot')

148
update/v1.5/Update-ARDB_Tags.py Executable file
View file

@ -0,0 +1,148 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
import os
import sys
import time
import redis
import configparser
if __name__ == '__main__':
start_deb = time.time()
configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
if not os.path.exists(configfile):
raise Exception('Unable to find the configuration file. \
Did you set environment variables? \
Or activate the virtualenv.')
cfg = configparser.ConfigParser()
cfg.read(configfile)
PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], cfg.get("Directories", "pastes")) + '/'
r_serv = redis.StrictRedis(
host=cfg.get("ARDB_DB", "host"),
port=cfg.getint("ARDB_DB", "port"),
db=cfg.getint("ARDB_DB", "db"),
decode_responses=True)
r_serv_metadata = redis.StrictRedis(
host=cfg.get("ARDB_Metadata", "host"),
port=cfg.getint("ARDB_Metadata", "port"),
db=cfg.getint("ARDB_Metadata", "db"),
decode_responses=True)
r_serv_tag = redis.StrictRedis(
host=cfg.get("ARDB_Tags", "host"),
port=cfg.getint("ARDB_Tags", "port"),
db=cfg.getint("ARDB_Tags", "db"),
decode_responses=True)
r_serv_onion = redis.StrictRedis(
host=cfg.get("ARDB_Onion", "host"),
port=cfg.getint("ARDB_Onion", "port"),
db=cfg.getint("ARDB_Onion", "db"),
decode_responses=True)
r_important_paste_2018 = redis.StrictRedis(
host=cfg.get("ARDB_Metadata", "host"),
port=cfg.getint("ARDB_Metadata", "port"),
db=2018,
decode_responses=True)
r_important_paste_2019 = redis.StrictRedis(
host=cfg.get("ARDB_Metadata", "host"),
port=cfg.getint("ARDB_Metadata", "port"),
db=2018,
decode_responses=True)
r_serv.set('ail:current_background_script', 'tags')
r_serv.set('ail:current_background_script_stat', 0)
if r_serv.sismember('ail:update_v1.5', 'onions') and r_serv.sismember('ail:update_v1.5', 'metadata'):
print('Updating ARDB_Tags ...')
index = 0
nb_tags_to_update = 0
nb_updated = 0
last_progress = 0
start = time.time()
tags_list = r_serv_tag.smembers('list_tags')
# create temp tags metadata
tag_metadata = {}
for tag in tags_list:
tag_metadata[tag] = {}
tag_metadata[tag]['first_seen'] = r_serv_tag.hget('tag_metadata:{}'.format(tag), 'first_seen')
if tag_metadata[tag]['first_seen'] is None:
tag_metadata[tag]['first_seen'] = 99999999
else:
tag_metadata[tag]['first_seen'] = int(tag_metadata[tag]['first_seen'])
tag_metadata[tag]['last_seen'] = r_serv_tag.hget('tag_metadata:{}'.format(tag), 'last_seen')
if tag_metadata[tag]['last_seen'] is None:
tag_metadata[tag]['last_seen'] = 0
else:
tag_metadata[tag]['last_seen'] = int(tag_metadata[tag]['last_seen'])
nb_tags_to_update += r_serv_tag.scard(tag)
for tag in tags_list:
all_item = r_serv_tag.smembers(tag)
for item_path in all_item:
splitted_item_path = item_path.split('/')
#print(tag)
#print(item_path)
try:
item_date = int( ''.join([splitted_item_path[-4], splitted_item_path[-3], splitted_item_path[-2]]) )
except IndexError:
r_serv_tag.srem(tag, item_path)
continue
# remove absolute path
new_path = item_path.replace(PASTES_FOLDER, '', 1)
if new_path != item_path:
# save in queue absolute path to remove
r_serv_tag.sadd('maj:v1.5:absolute_path_to_rename', item_path)
# update metadata first_seen
if item_date < tag_metadata[tag]['first_seen']:
tag_metadata[tag]['first_seen'] = item_date
r_serv_tag.hset('tag_metadata:{}'.format(tag), 'first_seen', item_date)
# update metadata last_seen
if item_date > tag_metadata[tag]['last_seen']:
tag_metadata[tag]['last_seen'] = item_date
last_seen_db = r_serv_tag.hget('tag_metadata:{}'.format(tag), 'last_seen')
if last_seen_db:
if item_date > int(last_seen_db):
r_serv_tag.hset('tag_metadata:{}'.format(tag), 'last_seen', item_date)
else:
tag_metadata[tag]['last_seen'] = last_seen_db
r_serv_tag.sadd('{}:{}'.format(tag, item_date), new_path)
r_serv_tag.hincrby('daily_tags:{}'.format(item_date), tag, 1)
# clean db
r_serv_tag.srem(tag, item_path)
index = index + 1
nb_updated += 1
progress = int((nb_updated * 100) /nb_tags_to_update)
print('{}/{} updated {}%'.format(nb_updated, nb_tags_to_update, progress))
# update progress stats
if progress != last_progress:
r_serv.set('ail:current_background_script_stat', progress)
last_progress = progress
#flush browse importante pastes db
r_important_paste_2018.flushdb()
r_important_paste_2019.flushdb()
end = time.time()
print('Updating ARDB_Tags Done => {} paths: {} s'.format(index, end - start))
r_serv.sadd('ail:update_v1.5', 'tags')

View file

@ -0,0 +1,89 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
import os
import sys
import time
import redis
import configparser
def tags_key_fusion(old_item_path_key, new_item_path_key):
print('fusion:')
print(old_item_path_key)
print(new_item_path_key)
for tag in r_serv_metadata.smembers(old_item_path_key):
r_serv_metadata.sadd(new_item_path_key, tag)
r_serv_metadata.srem(old_item_path_key, tag)
if __name__ == '__main__':
start_deb = time.time()
configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
if not os.path.exists(configfile):
raise Exception('Unable to find the configuration file. \
Did you set environment variables? \
Or activate the virtualenv.')
cfg = configparser.ConfigParser()
cfg.read(configfile)
PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], cfg.get("Directories", "pastes")) + '/'
r_serv = redis.StrictRedis(
host=cfg.get("ARDB_DB", "host"),
port=cfg.getint("ARDB_DB", "port"),
db=cfg.getint("ARDB_DB", "db"),
decode_responses=True)
r_serv_metadata = redis.StrictRedis(
host=cfg.get("ARDB_Metadata", "host"),
port=cfg.getint("ARDB_Metadata", "port"),
db=cfg.getint("ARDB_Metadata", "db"),
decode_responses=True)
r_serv_tag = redis.StrictRedis(
host=cfg.get("ARDB_Tags", "host"),
port=cfg.getint("ARDB_Tags", "port"),
db=cfg.getint("ARDB_Tags", "db"),
decode_responses=True)
if r_serv.sismember('ail:update_v1.5', 'tags'):
r_serv.set('ail:current_background_script', 'tags_background')
r_serv.set('ail:current_background_script_stat', 0)
print('Updating ARDB_Tags ...')
start = time.time()
#update item metadata tags
tag_not_updated = True
total_to_update = r_serv_tag.scard('maj:v1.5:absolute_path_to_rename')
nb_updated = 0
last_progress = 0
if total_to_update > 0:
while tag_not_updated:
item_path = r_serv_tag.srandmember('maj:v1.5:absolute_path_to_rename')
old_tag_item_key = 'tag:{}'.format(item_path)
new_item_path = item_path.replace(PASTES_FOLDER, '', 1)
new_tag_item_key = 'tag:{}'.format(new_item_path)
res = r_serv_metadata.renamenx(old_tag_item_key, new_tag_item_key)
if res == 0:
tags_key_fusion(old_tag_item_key, new_tag_item_key)
nb_updated += 1
r_serv_tag.srem('maj:v1.5:absolute_path_to_rename', item_path)
if r_serv_tag.scard('maj:v1.5:absolute_path_to_rename') == 0:
tag_not_updated = False
else:
progress = int((nb_updated * 100) /total_to_update)
print('{}/{} Tags updated {}%'.format(nb_updated, total_to_update, progress))
# update progress stats
if progress != last_progress:
r_serv.set('ail:current_background_script_stat', progress)
last_progress = progress
end = time.time()
print('Updating ARDB_Tags Done: {} s'.format(end - start))
r_serv.sadd('ail:update_v1.5', 'tags_background')

68
update/v1.5/Update.py Executable file
View file

@ -0,0 +1,68 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
import os
import sys
import time
import redis
import datetime
import configparser
if __name__ == '__main__':
start_deb = time.time()
configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
if not os.path.exists(configfile):
raise Exception('Unable to find the configuration file. \
Did you set environment variables? \
Or activate the virtualenv.')
cfg = configparser.ConfigParser()
cfg.read(configfile)
PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], cfg.get("Directories", "pastes")) + '/'
r_serv = redis.StrictRedis(
host=cfg.get("ARDB_DB", "host"),
port=cfg.getint("ARDB_DB", "port"),
db=cfg.getint("ARDB_DB", "db"),
decode_responses=True)
r_serv_onion = redis.StrictRedis(
host=cfg.get("ARDB_Onion", "host"),
port=cfg.getint("ARDB_Onion", "port"),
db=cfg.getint("ARDB_Onion", "db"),
decode_responses=True)
print()
print('Updating ARDB_Onion ...')
index = 0
start = time.time()
# update crawler queue
for elem in r_serv_onion.smembers('onion_crawler_queue'):
if PASTES_FOLDER in elem:
r_serv_onion.srem('onion_crawler_queue', elem)
r_serv_onion.sadd('onion_crawler_queue', elem.replace(PASTES_FOLDER, '', 1))
index = index +1
for elem in r_serv_onion.smembers('onion_crawler_priority_queue'):
if PASTES_FOLDER in elem:
r_serv_onion.srem('onion_crawler_queue', elem)
r_serv_onion.sadd('onion_crawler_queue', elem.replace(PASTES_FOLDER, '', 1))
index = index +1
end = time.time()
print('Updating ARDB_Onion Done => {} paths: {} s'.format(index, end - start))
print()
#Set current ail version
r_serv.set('ail:version', 'v1.5')
#Set current update_in_progress
r_serv.set('ail:update_in_progress', 'v1.5')
r_serv.set('ail:current_background_update', 'v1.5')
#Set current ail version
r_serv.set('ail:update_date_v1.5', datetime.datetime.now().strftime("%Y%m%d"))
print('Done in {} s'.format(end - start_deb))

60
update/v1.5/Update.sh Executable file
View file

@ -0,0 +1,60 @@
#!/bin/bash
[ -z "$AIL_HOME" ] && echo "Needs the env var AIL_HOME. Run the script from the virtual environment." && exit 1;
[ -z "$AIL_REDIS" ] && echo "Needs the env var AIL_REDIS. Run the script from the virtual environment." && exit 1;
[ -z "$AIL_ARDB" ] && echo "Needs the env var AIL_ARDB. Run the script from the virtual environment." && exit 1;
[ -z "$AIL_BIN" ] && echo "Needs the env var AIL_ARDB. Run the script from the virtual environment." && exit 1;
[ -z "$AIL_FLASK" ] && echo "Needs the env var AIL_FLASK. Run the script from the virtual environment." && exit 1;
export PATH=$AIL_HOME:$PATH
export PATH=$AIL_REDIS:$PATH
export PATH=$AIL_ARDB:$PATH
export PATH=$AIL_BIN:$PATH
export PATH=$AIL_FLASK:$PATH
GREEN="\\033[1;32m"
DEFAULT="\\033[0;39m"
echo -e $GREEN"Shutting down AIL ..."$DEFAULT
bash ${AIL_BIN}/LAUNCH.sh -k &
wait
echo ""
bash -c "bash ${AIL_HOME}/update/bin/Update_Redis.sh"
#bash -c "bash ${AIL_HOME}/update/bin/Update_ARDB.sh"
echo ""
echo -e $GREEN"Update DomainClassifier"$DEFAULT
echo ""
pip3 install --upgrade --force-reinstall git+https://github.com/D4-project/BGP-Ranking.git/@28013297efb039d2ebbce96ee2d89493f6ae56b0#subdirectory=client&egg=pybgpranking
pip3 install --upgrade --force-reinstall git+https://github.com/adulau/DomainClassifier.git
wait
echo ""
echo ""
echo -e $GREEN"Update Web thirdparty"$DEFAULT
echo ""
bash ${AIL_FLASK}update_thirdparty.sh &
wait
echo ""
bash ${AIL_BIN}LAUNCH.sh -lav &
wait
echo ""
echo ""
echo -e $GREEN"Fixing ARDB ..."$DEFAULT
echo ""
python ${AIL_HOME}/update/v1.4/Update.py &
wait
echo ""
echo ""
echo ""
echo -e $GREEN"Shutting down ARDB ..."$DEFAULT
bash ${AIL_BIN}/LAUNCH.sh -k &
wait
echo ""
exit 0

View file

@ -154,14 +154,22 @@ if baseUrl != '':
max_preview_char = int(cfg.get("Flask", "max_preview_char")) # Maximum number of character to display in the tooltip
max_preview_modal = int(cfg.get("Flask", "max_preview_modal")) # Maximum number of character to display in the modal
max_tags_result = 50
DiffMaxLineLength = int(cfg.get("Flask", "DiffMaxLineLength"))#Use to display the estimated percentage instead of a raw value
bootstrap_label = ['primary', 'success', 'danger', 'warning', 'info']
dict_update_description = {'v1.5':{'nb_background_update': 5, 'update_warning_message': 'An Update is running on the background. Some informations like Tags, screenshot can be',
'update_warning_message_notice_me': 'missing from the UI.'}
}
UPLOAD_FOLDER = os.path.join(os.environ['AIL_FLASK'], 'submitted')
PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], cfg.get("Directories", "pastes"))
SCREENSHOT_FOLDER = os.path.join(os.environ['AIL_HOME'], cfg.get("Directories", "crawled_screenshot"))
PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], cfg.get("Directories", "pastes")) + '/'
SCREENSHOT_FOLDER = os.path.join(os.environ['AIL_HOME'], cfg.get("Directories", "crawled_screenshot"), 'screenshot')
REPO_ORIGIN = 'https://github.com/CIRCL/AIL-framework.git'
max_dashboard_logs = int(cfg.get("Flask", "max_dashboard_logs"))

View file

@ -8,8 +8,7 @@ import redis
from flask import Flask, render_template, jsonify, request, Blueprint, redirect, url_for
import json
from datetime import datetime
import ssdeep
import datetime
import Paste
@ -28,6 +27,7 @@ r_serv_statistics = Flask_config.r_serv_statistics
max_preview_char = Flask_config.max_preview_char
max_preview_modal = Flask_config.max_preview_modal
bootstrap_label = Flask_config.bootstrap_label
max_tags_result = Flask_config.max_tags_result
PASTES_FOLDER = Flask_config.PASTES_FOLDER
Tags = Blueprint('Tags', __name__, template_folder='templates')
@ -56,6 +56,16 @@ for name, tags in clusters.items(): #galaxie name + tags
def one():
return 1
def date_substract_day(date, num_day=1):
new_date = datetime.date(int(date[0:4]), int(date[4:6]), int(date[6:8])) - datetime.timedelta(num_day)
new_date = str(new_date).replace('-', '')
return new_date
def date_add_day(date, num_day=1):
new_date = datetime.date(int(date[0:4]), int(date[4:6]), int(date[6:8])) + datetime.timedelta(num_day)
new_date = str(new_date).replace('-', '')
return new_date
def get_tags_with_synonyms(tag):
str_synonyms = ' - synonyms: '
synonyms = r_serv_tags.smembers('synonym_tag_' + tag)
@ -68,12 +78,277 @@ def get_tags_with_synonyms(tag):
else:
return {'name':tag,'id':tag}
def get_item_date(item_filename):
l_directory = item_filename.split('/')
return '{}{}{}'.format(l_directory[-4], l_directory[-3], l_directory[-2])
def substract_date(date_from, date_to):
date_from = datetime.date(int(date_from[0:4]), int(date_from[4:6]), int(date_from[6:8]))
date_to = datetime.date(int(date_to[0:4]), int(date_to[4:6]), int(date_to[6:8]))
delta = date_to - date_from # timedelta
l_date = []
for i in range(delta.days + 1):
date = date_from + datetime.timedelta(i)
l_date.append( date.strftime('%Y%m%d') )
return l_date
def get_all_dates_range(date_from, date_to):
all_dates = {}
date_range = []
if date_from is not None and date_to is not None:
#change format
try:
if len(date_from) != 8:
date_from = date_from[0:4] + date_from[5:7] + date_from[8:10]
date_to = date_to[0:4] + date_to[5:7] + date_to[8:10]
date_range = substract_date(date_from, date_to)
except:
pass
if not date_range:
date_range.append(datetime.date.today().strftime("%Y%m%d"))
date_from = date_range[0][0:4] + '-' + date_range[0][4:6] + '-' + date_range[0][6:8]
date_to = date_from
else:
date_from = date_from[0:4] + '-' + date_from[4:6] + '-' + date_from[6:8]
date_to = date_to[0:4] + '-' + date_to[4:6] + '-' + date_to[6:8]
all_dates['date_from'] = date_from
all_dates['date_to'] = date_to
all_dates['date_range'] = date_range
return all_dates
def get_last_seen_from_tags_list(list_tags):
min_last_seen = 99999999
for tag in list_tags:
tag_last_seen = r_serv_tags.hget('tag_metadata:{}'.format(tag), 'last_seen')
if tag_last_seen:
tag_last_seen = int(tag_last_seen)
if tag_last_seen < min_last_seen:
min_last_seen = tag_last_seen
return str(min_last_seen)
def add_item_tag(tag, item_path):
item_date = int(get_item_date(item_path))
#add tag
r_serv_metadata.sadd('tag:{}'.format(item_path), tag)
r_serv_tags.sadd('{}:{}'.format(tag, item_date), item_path)
r_serv_tags.hincrby('daily_tags:{}'.format(item_date), tag, 1)
tag_first_seen = r_serv_tags.hget('tag_metadata:{}'.format(tag), 'last_seen')
if tag_first_seen is None:
tag_first_seen = 99999999
else:
tag_first_seen = int(tag_first_seen)
tag_last_seen = r_serv_tags.hget('tag_metadata:{}'.format(tag), 'last_seen')
if tag_last_seen is None:
tag_last_seen = 0
else:
tag_last_seen = int(tag_last_seen)
#add new tag in list of all used tags
r_serv_tags.sadd('list_tags', tag)
# update fisrt_seen/last_seen
if item_date < tag_first_seen:
r_serv_tags.hset('tag_metadata:{}'.format(tag), 'first_seen', item_date)
# update metadata last_seen
if item_date > tag_last_seen:
r_serv_tags.hset('tag_metadata:{}'.format(tag), 'last_seen', item_date)
def remove_item_tag(tag, item_path):
item_date = int(get_item_date(item_path))
#remove tag
r_serv_metadata.srem('tag:{}'.format(item_path), tag)
res = r_serv_tags.srem('{}:{}'.format(tag, item_date), item_path)
if res ==1:
# no tag for this day
if int(r_serv_tags.hget('daily_tags:{}'.format(item_date), tag)) == 1:
r_serv_tags.hdel('daily_tags:{}'.format(item_date), tag)
else:
r_serv_tags.hincrby('daily_tags:{}'.format(item_date), tag, -1)
tag_first_seen = int(r_serv_tags.hget('tag_metadata:{}'.format(tag), 'last_seen'))
tag_last_seen = int(r_serv_tags.hget('tag_metadata:{}'.format(tag), 'last_seen'))
# update fisrt_seen/last_seen
if item_date == tag_first_seen:
update_tag_first_seen(tag, tag_first_seen, tag_last_seen)
if item_date == tag_last_seen:
update_tag_last_seen(tag, tag_first_seen, tag_last_seen)
else:
return 'Error incorrect tag'
def update_tag_first_seen(tag, tag_first_seen, tag_last_seen):
if tag_first_seen == tag_last_seen:
if r_serv_tags.scard('{}:{}'.format(tag, tag_first_seen)) > 0:
r_serv_tags.hset('tag_metadata:{}'.format(tag), 'first_seen', tag_first_seen)
# no tag in db
else:
r_serv_tags.srem('list_tags', tag)
r_serv_tags.hdel('tag_metadata:{}'.format(tag), 'first_seen')
r_serv_tags.hdel('tag_metadata:{}'.format(tag), 'last_seen')
else:
if r_serv_tags.scard('{}:{}'.format(tag, tag_first_seen)) > 0:
r_serv_tags.hset('tag_metadata:{}'.format(tag), 'first_seen', tag_first_seen)
else:
tag_first_seen = date_add_day(tag_first_seen)
update_tag_first_seen(tag, tag_first_seen, tag_last_seen)
def update_tag_last_seen(tag, tag_first_seen, tag_last_seen):
if tag_first_seen == tag_last_seen:
if r_serv_tags.scard('{}:{}'.format(tag, tag_last_seen)) > 0:
r_serv_tags.hset('tag_metadata:{}'.format(tag), 'last_seen', tag_last_seen)
# no tag in db
else:
r_serv_tags.srem('list_tags', tag)
r_serv_tags.hdel('tag_metadata:{}'.format(tag), 'first_seen')
r_serv_tags.hdel('tag_metadata:{}'.format(tag), 'last_seen')
else:
if r_serv_tags.scard('{}:{}'.format(tag, tag_last_seen)) > 0:
r_serv_tags.hset('tag_metadata:{}'.format(tag), 'last_seen', tag_last_seen)
else:
tag_last_seen = date_substract_day(tag_last_seen)
update_tag_last_seen(tag, tag_first_seen, tag_last_seen)
# ============= ROUTES ==============
@Tags.route("/Tags/", methods=['GET'])
@Tags.route("/tags/", methods=['GET'])
def Tags_page():
return render_template("Tags.html")
date_from = request.args.get('date_from')
date_to = request.args.get('date_to')
tags = request.args.get('ltags')
if tags is None:
dates = get_all_dates_range(date_from, date_to)
return render_template("Tags.html", date_from=dates['date_from'], date_to=dates['date_to'])
# unpack tags
list_tags = tags.split(',')
list_tag = []
for tag in list_tags:
list_tag.append(tag.replace('"','\"'))
#no search by date, use last_seen for date_from/date_to
if date_from is None and date_to is None and tags is not None:
date_from = get_last_seen_from_tags_list(list_tags)
date_to = date_from
# TODO verify input
dates = get_all_dates_range(date_from, date_to)
if(type(list_tags) is list):
# no tag
if list_tags is False:
print('empty')
# 1 tag
elif len(list_tags) < 2:
tagged_pastes = []
for date in dates['date_range']:
tagged_pastes.extend(r_serv_tags.smembers('{}:{}'.format(list_tags[0], date)))
# 2 tags or more
else:
tagged_pastes = []
for date in dates['date_range']:
tag_keys = []
for tag in list_tags:
tag_keys.append('{}:{}'.format(tag, date))
if len(tag_keys) > 1:
daily_items = r_serv_tags.sinter(tag_keys[0], *tag_keys[1:])
else:
daily_items = r_serv_tags.sinter(tag_keys[0])
tagged_pastes.extend(daily_items)
else :
return 'INCORRECT INPUT'
all_content = []
paste_date = []
paste_linenum = []
all_path = []
allPastes = list(tagged_pastes)
paste_tags = []
try:
page = int(request.args.get('page'))
except:
page = 1
if page <= 0:
page = 1
nb_page_max = len(tagged_pastes)/(max_tags_result)
if not nb_page_max.is_integer():
nb_page_max = int(nb_page_max)+1
else:
nb_page_max = int(nb_page_max)
if page > nb_page_max:
page = nb_page_max
start = max_tags_result*(page -1)
stop = max_tags_result*page
for path in allPastes[start:stop]:
all_path.append(path)
paste = Paste.Paste(path)
content = paste.get_p_content()
content_range = max_preview_char if len(content)>max_preview_char else len(content)-1
all_content.append(content[0:content_range].replace("\"", "\'").replace("\r", " ").replace("\n", " "))
curr_date = str(paste._get_p_date())
curr_date = curr_date[0:4]+'/'+curr_date[4:6]+'/'+curr_date[6:]
paste_date.append(curr_date)
paste_linenum.append(paste.get_lines_info()[0])
p_tags = r_serv_metadata.smembers('tag:'+path)
complete_tags = []
l_tags = []
for tag in p_tags:
complete_tag = tag
tag = tag.split('=')
if len(tag) > 1:
if tag[1] != '':
tag = tag[1][1:-1]
# no value
else:
tag = tag[0][1:-1]
# use for custom tags
else:
tag = tag[0]
l_tags.append( (tag,complete_tag) )
paste_tags.append(l_tags)
if len(allPastes) > 10:
finished = False
else:
finished = True
if len(list_tag) == 1:
tag_nav=tags.replace('"', '').replace('=', '').replace(':', '')
else:
tag_nav='empty'
return render_template("Tags.html",
all_path=all_path,
tags=tags,
tag_nav=tag_nav,
list_tag = list_tag,
date_from=dates['date_from'],
date_to=dates['date_to'],
page=page, nb_page_max=nb_page_max,
paste_tags=paste_tags,
bootstrap_label=bootstrap_label,
content=all_content,
paste_date=paste_date,
paste_linenum=paste_linenum,
char_to_display=max_preview_modal,
finished=finished)
@Tags.route("/Tags/get_all_tags")
def get_all_tags():
@ -173,94 +448,6 @@ def get_tags_galaxy():
else:
return 'this galaxy is disable'
@Tags.route("/Tags/get_tagged_paste")
def get_tagged_paste():
tags = request.args.get('ltags')
list_tags = tags.split(',')
list_tag = []
for tag in list_tags:
list_tag.append(tag.replace('"','\"'))
# TODO verify input
if(type(list_tags) is list):
# no tag
if list_tags is False:
print('empty')
# 1 tag
elif len(list_tags) < 2:
tagged_pastes = r_serv_tags.smembers(list_tags[0])
# 2 tags or more
else:
tagged_pastes = r_serv_tags.sinter(list_tags[0], *list_tags[1:])
else :
return 'INCORRECT INPUT'
#TODO FIXME
currentSelectYear = int(datetime.now().year)
all_content = []
paste_date = []
paste_linenum = []
all_path = []
allPastes = list(tagged_pastes)
paste_tags = []
for path in allPastes[0:50]: ######################moduleName
all_path.append(path)
paste = Paste.Paste(path)
content = paste.get_p_content()
content_range = max_preview_char if len(content)>max_preview_char else len(content)-1
all_content.append(content[0:content_range].replace("\"", "\'").replace("\r", " ").replace("\n", " "))
curr_date = str(paste._get_p_date())
curr_date = curr_date[0:4]+'/'+curr_date[4:6]+'/'+curr_date[6:]
paste_date.append(curr_date)
paste_linenum.append(paste.get_lines_info()[0])
p_tags = r_serv_metadata.smembers('tag:'+path)
complete_tags = []
l_tags = []
for tag in p_tags:
complete_tag = tag
tag = tag.split('=')
if len(tag) > 1:
if tag[1] != '':
tag = tag[1][1:-1]
# no value
else:
tag = tag[0][1:-1]
# use for custom tags
else:
tag = tag[0]
l_tags.append( (tag,complete_tag) )
paste_tags.append(l_tags)
if len(allPastes) > 10:
finished = False
else:
finished = True
return render_template("tagged.html",
year=currentSelectYear,
all_path=all_path,
tags=tags,
list_tag = list_tag,
paste_tags=paste_tags,
bootstrap_label=bootstrap_label,
content=all_content,
paste_date=paste_date,
paste_linenum=paste_linenum,
char_to_display=max_preview_modal,
finished=finished)
@Tags.route("/Tags/remove_tag")
def remove_tag():
@ -268,12 +455,7 @@ def remove_tag():
path = request.args.get('paste')
tag = request.args.get('tag')
#remove tag
r_serv_metadata.srem('tag:'+path, tag)
r_serv_tags.srem(tag, path)
if r_serv_tags.scard(tag) == 0:
r_serv_tags.srem('list_tags', tag)
remove_item_tag(tag, path)
return redirect(url_for('showsavedpastes.showsavedpaste', paste=path))
@ -285,17 +467,11 @@ def confirm_tag():
tag = request.args.get('tag')
if(tag[9:28] == 'automatic-detection'):
#remove automatic tag
r_serv_metadata.srem('tag:'+path, tag)
r_serv_tags.srem(tag, path)
remove_item_tag(tag, path)
tag = tag.replace('automatic-detection','analyst-detection', 1)
#add analyst tag
r_serv_metadata.sadd('tag:'+path, tag)
r_serv_tags.sadd(tag, path)
#add new tag in list of all used tags
r_serv_tags.sadd('list_tags', tag)
add_item_tag(tag, path)
return redirect(url_for('showsavedpastes.showsavedpaste', paste=path))
@ -345,12 +521,7 @@ def addTags():
tax = tag.split(':')[0]
if tax in active_taxonomies:
if tag in r_serv_tags.smembers('active_tag_' + tax):
#add tag
r_serv_metadata.sadd('tag:'+path, tag)
r_serv_tags.sadd(tag, path)
#add new tag in list of all used tags
r_serv_tags.sadd('list_tags', tag)
add_item_tag(tag, path)
else:
return 'INCORRECT INPUT1'
@ -365,12 +536,7 @@ def addTags():
if gal in active_galaxies:
if tag in r_serv_tags.smembers('active_tag_galaxies_' + gal):
#add tag
r_serv_metadata.sadd('tag:'+path, tag)
r_serv_tags.sadd(tag, path)
#add new tag in list of all used tags
r_serv_tags.sadd('list_tags', tag)
add_item_tag(tag, path)
else:
return 'INCORRECT INPUT3'

View file

@ -2,65 +2,194 @@
<html>
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Tags - AIL</title>
<link rel="icon" href="{{ url_for('static', filename='image/ail-icon.png') }}">
<!-- Core CSS -->
<link href="{{ url_for('static', filename='css/bootstrap.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='font-awesome/css/font-awesome.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/sb-admin-2.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/dygraph_gallery.css') }}" rel="stylesheet" type="text/css" />
<link href="{{ url_for('static', filename='css/bootstrap4.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/font-awesome.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/daterangepicker.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/dataTables.bootstrap4.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/tags.css') }}" rel="stylesheet" type="text/css" />
<!-- JS -->
<script type="text/javascript" src="{{ url_for('static', filename='js/dygraph-combined.js') }}"></script>
<script language="javascript" src="{{ url_for('static', filename='js/jquery.js')}}"></script>
<script src="{{ url_for('static', filename='js/jquery.flot.js') }}"></script>
<script src="{{ url_for('static', filename='js/jquery.flot.pie.js') }}"></script>
<script src="{{ url_for('static', filename='js/jquery.flot.time.js') }}"></script>
<script src="{{ url_for('static', filename='js/jquery.js')}}"></script>
<script src="{{ url_for('static', filename='js/popper.min.js')}}"></script>
<script src="{{ url_for('static', filename='js/bootstrap4.min.js')}}"></script>
<script src="{{ url_for('static', filename='js/jquery.dataTables.min.js')}}"></script>
<script src="{{ url_for('static', filename='js/dataTables.bootstrap.min.js')}}"></script>
<script language="javascript" src="{{ url_for('static', filename='js/moment.min.js') }}"></script>
<script language="javascript" src="{{ url_for('static', filename='js/jquery.daterangepicker.min.js') }}"></script>
<script src="{{ url_for('static', filename='js/tags.js') }}"></script>
<style>
.rotate{
-moz-transition: all 0.1s linear;
-webkit-transition: all 0.1s linear;
transition: all 0.1s linear;
}
.rotate.down{
-moz-transform:rotate(180deg);
-webkit-transform:rotate(180deg);
transform:rotate(180deg);
}
</style>
</head>
<body>
{% include 'navbar.html' %}
{% include 'nav_bar.html' %}
<div id="page-wrapper">
<!-- Modal -->
<div id="mymodal" class="modal fade" role="dialog">
<div class="modal-dialog modal-lg">
<!-- Modal content-->
<div id="mymodalcontent" class="modal-content">
<div id="mymodalbody" class="modal-body" max-width="850px">
<p>Loading paste information...</p>
<img id="loading-gif-modal" src="{{url_for('static', filename='image/loading.gif') }}" height="26" width="26" style="margin: 4px;">
</div>
<div class="modal-footer">
<a id="button_show_path" target="_blank" href=""><button type="button" class="btn btn-info">Show saved paste</button></a>
<button type="button" class="btn btn-default" data-dismiss="modal">Close</button>
</div>
</div>
</div>
</div>
<div class="container-fluid">
<div class="row">
<div class="col-lg-12">
<h1 class="page-header" data-page="page-tags" >Tags</h1>
</div>
<!-- /.col-lg-12 -->
</div>
<!-- /.row -->
<div class="form-group input-group" >
<input id="ltags" style="width:100%;" type="text" name="ltags" autocomplete="off">
<div class="input-group-btn">
<button type="button" class="btn btn-search btn-primary btn-tags" onclick="searchTags()"<button class="btn btn-primary" onclick="emptyTags()">
<span class="glyphicon glyphicon-search "></span>
<span class="label-icon">Search Tags</span>
{% include 'tags/menu_sidebar.html' %}
<div class="col-12 col-lg-10" id="core_content">
<div class="card mb-3 mt-1">
<div class="card-header text-white bg-dark">
<h5 class="card-title">Search Tags by date range :</h5>
</div>
<div class="card-body">
<div class="row mb-3">
<div class="col-md-6">
<div class="input-group" id="date-range-from">
<div class="input-group-prepend"><span class="input-group-text"><i class="far fa-calendar-alt" aria-hidden="true"></i></span></div>
<input class="form-control" id="date-range-from-input" placeholder="yyyy-mm-dd" value="{{ date_from }}" name="date_from" autocomplete="off">
</div>
</div>
<div class="col-md-6">
<div class="input-group" id="date-range-to">
<div class="input-group-prepend"><span class="input-group-text"><i class="far fa-calendar-alt" aria-hidden="true"></i></span></div>
<input class="form-control" id="date-range-to-input" placeholder="yyyy-mm-dd" value="{{ date_to }}" name="date_to" autocomplete="off">
</div>
</div>
</div>
<div class="input-group mb-3">
<div class="input-group-prepend">
<button class="btn btn-outline-danger" type="button" id="button-clear-tags" style="z-index: 1;" onclick="emptyTags()">
<i class="fas fa-eraser"></i>
</button>
</div>
<input id="ltags" name="ltags" type="text" class="form-control" aria-describedby="button-clear-tags" autocomplete="off">
</div>
<button class="btn btn-primary" onclick="emptyTags()" style="margin-bottom: 30px;">
<span class="glyphicon glyphicon-remove"></span>
<span class="label-icon">Clear Tags</span>
<button class="btn btn-primary" type="button" id="button-search-tags" onclick="searchTags()">
<i class="fas fa-search"></i> Search Tags
</button>
</div>
</div>
<div>
{%if all_path%}
<table class="table table-bordered table-hover" id="myTable_">
<thead class="thead-dark">
<tr>
<th>Date</th>
<th style="max-width: 800px;">Path</th>
<th># of lines</th>
<th>Action</th>
</tr>
</thead>
<tbody>
{% for path in all_path %}
<tr>
<td class="pb-0">{{ paste_date[loop.index0] }}</td>
<td class="pb-0"><a target="_blank" href="{{ url_for('showsavedpastes.showsavedpaste') }}?paste={{path}}" class="text-secondary">
<div style="line-height:0.9;">{{ path }}</div>
</a>
<div class="mb-2">
{% for tag in paste_tags[loop.index0] %}
<a href="{{ url_for('Tags.Tags_page') }}?ltags={{ tag[1] }}">
<span class="badge badge-{{ bootstrap_label[loop.index0 % 5] }} pull-left">{{ tag[0] }}</span>
</a>
{% endfor %}
</div>
</td class="pb-0">
<td class="pb-0">{{ paste_linenum[loop.index0] }}</td>
<td class="pb-0"><p>
<span class="fas fa-info-circle" data-toggle="tooltip" data-placement="left" title="{{ content[loop.index0] }} "></span>
<button type="button" class="btn btn-light" data-num="{{ loop.index0 + 1 }}" data-toggle="modal" data-target="#mymodal" data-url="{{ url_for('showsavedpastes.showsaveditem_min') }}?paste={{ path }}&num={{ loop.index0+1 }}" data-path="{{ path }}">
<span class="fas fa-search-plus"></span>
</button></p>
</td>
</tr>
{% endfor %}
</tbody>
</table>
<div class="d-flex justify-content-center border-top mt-2">
<nav class="mt-4" aria-label="...">
<ul class="pagination">
<li class="page-item {%if page==1%}disabled{%endif%}">
<a class="page-link" href="{{ url_for('Tags.Tags_page') }}?page={{page-1}}&date_from={{date_from}}&date_to={{date_to}}&ltags={{tags}}">Previous</a>
</li>
{%if page>3%}
<li class="page-item"><a class="page-link" href="{{ url_for('Tags.Tags_page') }}?page=1&date_from={{date_from}}&date_to={{date_to}}&ltags={{tags}}">1</a></li>
<li class="page-item disabled"><a class="page-link" aria-disabled="true" href="#">...</a></li>
<li class="page-item"><a class="page-link" href="{{ url_for('Tags.Tags_page') }}?page={{page-1}}&date_from={{date_from}}&date_to={{date_to}}&ltags={{tags}}">{{page-1}}</a></li>
<li class="page-item active"><a class="page-link" href="{{ url_for('Tags.Tags_page') }}?page={{page}}&date_from={{date_from}}&date_to={{date_to}}&ltags={{tags}}">{{page}}</a></li>
{%else%}
{%if page>2%}<li class="page-item"><a class="page-link" href="{{ url_for('Tags.Tags_page') }}?page={{page-2}}&date_from={{date_from}}&date_to={{date_to}}&ltags={{tags}}">{{page-2}}</a></li>{%endif%}
{%if page>1%}<li class="page-item"><a class="page-link" href="{{ url_for('Tags.Tags_page') }}?page={{page-1}}&date_from={{date_from}}&date_to={{date_to}}&ltags={{tags}}">{{page-1}}</a></li>{%endif%}
<li class="page-item active"><a class="page-link" href="{{ url_for('Tags.Tags_page') }}?page={{page}}&date_from={{date_from}}&date_to={{date_to}}&ltags={{tags}}">{{page}}</a></li>
{%endif%}
{%if nb_page_max-page>3%}
<li class="page-item"><a class="page-link" href="{{ url_for('Tags.Tags_page') }}?page={{page+1}}&date_from={{date_from}}&date_to={{date_to}}&ltags={{tags}}">{{page+1}}</a></li>
<li class="page-item disabled"><a class="page-link" aria-disabled="true" href="#">...</a></li>
<li class="page-item"><a class="page-link" href="{{ url_for('Tags.Tags_page') }}?page={{nb_page_max}}&date_from={{date_from}}&date_to={{date_to}}&ltags={{tags}}">{{nb_page_max}}</a></li>
{%else%}
{%if nb_page_max-page>2%}<li class="page-item"><a class="page-link" href="{{ url_for('Tags.Tags_page') }}?page={{nb_page_max-2}}&date_from={{date_from}}&date_to={{date_to}}&ltags={{tags}}">{{nb_page_max-2}}</a></li>{%endif%}
{%if nb_page_max-page>1%}<li class="page-item"><a class="page-link" href="{{ url_for('Tags.Tags_page') }}?page={{nb_page_max-1}}&date_from={{date_from}}&date_to={{date_to}}&ltags={{tags}}">{{nb_page_max-1}}</a></li>{%endif%}
{%if nb_page_max-page>0%}<li class="page-item"><a class="page-link" href="{{ url_for('Tags.Tags_page') }}?page={{nb_page_max}}&date_from={{date_from}}&date_to={{date_to}}&ltags={{tags}}">{{nb_page_max}}</a></li>{%endif%}
{%endif%}
<li class="page-item {%if page==nb_page_max%}disabled{%endif%}">
<a class="page-link" href="{{ url_for('Tags.Tags_page') }}?page={{page+1}}&date_from={{date_from}}&date_to={{date_to}}&ltags={{tags}}" aria-disabled="true">Next</a>
</li>
</ul>
</nav>
</div>
{%endif%}
</div>
<div>
<br></br>
<a class="btn btn-tags" href="{{ url_for('Tags.taxonomies') }}" target="_blank">
<i class="fa fa-wrench fa-2x"></i>
<a class="btn btn-light text-secondary" href="{{ url_for('Tags.taxonomies') }}" target="_blank">
<i class="fas fa-wrench fa-2x"></i>
<br></br>
<span class="label-icon">Edit Taxonomies List </span>
</a>
<a class="btn btn-tags" href="{{ url_for('Tags.galaxies') }}" target="_blank">
<i class="fa fa-rocket fa-2x"></i>
<a class="btn btn-light text-secondary" href="{{ url_for('Tags.galaxies') }}" target="_blank">
<i class="fas fa-rocket fa-2x"></i>
<br></br>
<span class="label-icon">Edit Galaxies List</span>
</a>
@ -68,45 +197,211 @@
<div>
<br></br>
<a class="btn btn-tags" href="{{ url_for('PasteSubmit.edit_tag_export') }}" target="_blank">
<i class="fa fa-cogs fa-2x"></i>
<a class="btn btn-light text-secondary" href="{{ url_for('PasteSubmit.edit_tag_export') }}" target="_blank">
<i class="fas fa-cogs fa-2x"></i>
<br></br>
<span class="label-icon">MISP and Hive, auto push</span>
</a>
</div>
</div>
</div>
</div>
<!-- /#page-wrapper -->
</body>
<script>
var ltags
var ltags;
var search_table;
var last_clicked_paste;
var can_change_modal_content = true;
$(document).ready(function(){
activePage = "page-Tags"
$("#"+activePage).addClass("active");
$("#nav_quick_search").removeClass("text-muted");
$("#nav_tag_{{tag_nav}}").addClass("active");
search_table = $('#myTable_').DataTable({ "order": [[ 0, "asc" ]] });
// Use to bind the button with the new displayed data
// (The bind do not happens if the dataTable is in tabs and the clicked data is in another page)
search_table.on( 'draw.dt', function () {
// Bind tooltip each time we draw a new page
$('[data-toggle="tooltip"]').tooltip();
// On click, get html content from url and update the corresponding modal
$("[data-toggle='modal']").off('click.openmodal').on("click.openmodal", function (event) {
get_html_and_update_modal(event, $(this));
});
} );
search_table.draw()
var valueData = [
{% for tag in list_tag %}
'{{tag|safe}}',
{% endfor %}
];
$.getJSON("{{ url_for('Tags.get_all_tags') }}",
function(data) {
ltags = $('#ltags').tagSuggest({
data: data,
value: valueData,
sortOrder: 'name',
maxDropHeight: 200,
name: 'ltags'
});
});
$('#date-range-from').dateRangePicker({
separator : ' to ',
getValue: function(){
if ($('#date-range-from-input').val() && $('#date-range-to-input').val() )
return $('#date-range-from-input').val() + ' to ' + $('#date-range-to-input').val();
else
return '';
},
setValue: function(s,s1,s2){
$('#date-range-from-input').val(s1);
$('#date-range-to-input').val(s2);
}
});
$('#date-range-to').dateRangePicker({
separator : ' to ',
getValue: function(){
if ($('#date-range-from-input').val() && $('#date-range-to-input').val() )
return $('#date-range-from-input').val() + ' to ' + $('#date-range-to-input').val();
else
return '';
},
setValue: function(s,s1,s2){
$('#date-range-from-input').val(s1);
$('#date-range-to-input').val(s2);
}
});
});
</script>
<script>
function searchTags() {
var data = ltags.getValue();
window.location.replace("{{ url_for('Tags.get_tagged_paste') }}?ltags=" + data);
var date_from = $('#date-range-from-input').val();
var date_to =$('#date-range-to-input').val();
window.location.replace("{{ url_for('Tags.Tags_page') }}?date_from="+date_from+"&date_to="+date_to+"&ltags=" + data);
}
function emptyTags() {
ltags.clear();
}
function toggle_sidebar(){
if($('#nav_menu').is(':visible')){
$('#nav_menu').hide();
$('#side_menu').removeClass('border-right')
$('#side_menu').removeClass('col-lg-2')
$('#core_content').removeClass('col-lg-10')
}else{
$('#nav_menu').show();
$('#side_menu').addClass('border-right')
$('#side_menu').addClass('col-lg-2')
$('#core_content').addClass('col-lg-10')
}
}
</script>
<script src="{{ url_for('static', filename='js/bootstrap.min.js') }}"></script>
</body>
<!-- Dynamically update the modal -->
<script>
// static data
var alert_message = '<div class="alert alert-info alert-dismissable"><button type="button" class="close" data-dismiss="alert" aria-hidden="true">×</button><strong>No more data.</strong> Full paste displayed.</div>';
var complete_paste = null;
var char_to_display = {{ char_to_display }};
var start_index = 0;
// When the modal goes out, refresh it to normal content
$("#mymodal").on('hidden.bs.modal', function () {
can_change_modal_content = true;
$("#mymodalbody").html("<p>Loading paste information...</p>");
var loading_gif = "<img id='loading-gif-modal' class='img-center' src=\"{{url_for('static', filename='image/loading.gif') }}\" height='26' width='26' style='margin: 4px;'>";
$("#mymodalbody").append(loading_gif); // Show the loading GIF
$("#button_show_path").attr('href', '');
$("#button_show_path").hide();
complete_paste = null;
start_index = 0;
});
// Update the paste preview in the modal
function update_preview() {
if (start_index + char_to_display > complete_paste.length-1){ // end of paste reached
var final_index = complete_paste.length-1;
var flag_stop = true;
} else {
var final_index = start_index + char_to_display;
}
if (final_index != start_index){ // still have data to display
// Append the new content using text() and not append (XSS)
$("#mymodalbody").find("#paste-holder").text($("#mymodalbody").find("#paste-holder").text()+complete_paste.substring(start_index+1, final_index+1));
start_index = final_index;
if (flag_stop)
nothing_to_display();
} else {
nothing_to_display();
}
}
// Update the modal when there is no more data
function nothing_to_display() {
var new_content = $(alert_message).hide();
$("#mymodalbody").find("#panel-body").append(new_content);
new_content.show('fast');
$("#load-more-button").hide();
}
function get_html_and_update_modal(event, truemodal) {
event.preventDefault();
var modal=truemodal;
var url = " {{ url_for('showsavedpastes.showpreviewpaste') }}?paste=" + modal.attr('data-path') + "&num=" + modal.attr('data-num');
last_clicked_paste = modal.attr('data-num');
$.get(url, function (data) {
// verify that the reveived data is really the current clicked paste. Otherwise, ignore it.
var received_num = parseInt(data.split("|num|")[1]);
if (received_num == last_clicked_paste && can_change_modal_content) {
can_change_modal_content = false;
// clear data by removing html, body, head tags. prevent dark modal background stack bug.
var cleared_data = data.split("<body>")[1].split("</body>")[0];
$("#mymodalbody").html(cleared_data);
var button = $('<button type="button" id="load-more-button" class="btn btn-outline-primary rounded-circle px-1 py-0" data-url="' + $(modal).attr('data-path') +'" data-toggle="tooltip" data-placement="bottom" title="Load more content"><i class="fas fa-arrow-circle-down mt-1"></i></button>');
button.tooltip(button);
$("#container-show-more").append(button);
$("#button_show_path").attr('href', '{{ url_for('showsavedpastes.showsavedpaste') }}?paste=' + $(modal).attr('data-path'));
$("#button_show_path").show('fast');
$("#loading-gif-modal").css("visibility", "hidden"); // Hide the loading GIF
if ($("[data-initsize]").attr('data-initsize') < char_to_display) { // All the content is displayed
nothing_to_display();
}
// collapse decoded
$('#collapseDecoded').collapse('hide');
// On click, donwload all paste's content
$("#load-more-button").on("click", function (event) {
if (complete_paste == null) { //Donwload only once
$.get("{{ url_for('showsavedpastes.getmoredata') }}"+"?paste="+$(modal).attr('data-path'), function(data, status){
complete_paste = data;
update_preview();
});
} else {
update_preview();
}
});
} else if (can_change_modal_content) {
$("#mymodalbody").html("Ignoring previous not finished query of paste #" + received_num);
}
});
}
</script>
</html>

View file

@ -1,181 +0,0 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
'''
Flask functions and routes for the trending modules page
'''
import redis
import json
import flask
import os
from datetime import datetime
from flask import Flask, render_template, jsonify, request, Blueprint
import Paste
# ============ VARIABLES ============
import Flask_config
app = Flask_config.app
cfg = Flask_config.cfg
baseUrl = Flask_config.baseUrl
max_preview_char = Flask_config.max_preview_char
max_preview_modal = Flask_config.max_preview_modal
r_serv_metadata = Flask_config.r_serv_metadata
bootstrap_label = Flask_config.bootstrap_label
#init all lvlDB servers
curYear = datetime.now().year
int_year = int(curYear)
r_serv_db = {}
# port generated automatically depending on available levelDB date
yearList = []
for x in range(0, (int_year - 2018) + 1):
intYear = int_year - x
yearList.append([str(intYear), intYear, int(curYear) == intYear])
r_serv_db[intYear] = redis.StrictRedis(
host=cfg.get("ARDB_DB", "host"),
port=cfg.getint("ARDB_DB", "port"),
db=intYear,
decode_responses=True)
yearList.sort(reverse=True)
browsepastes = Blueprint('browsepastes', __name__, template_folder='templates')
# ============ FUNCTIONS ============
def getPastebyType(server, module_name):
all_path = []
for path in server.smembers('WARNING_'+module_name):
all_path.append(path)
return all_path
def event_stream_getImportantPasteByModule(module_name, year):
index = 0
all_pastes_list = getPastebyType(r_serv_db[year], module_name)
paste_tags = []
for path in all_pastes_list:
index += 1
paste = Paste.Paste(path)
content = paste.get_p_content()
content_range = max_preview_char if len(content)>max_preview_char else len(content)-1
curr_date = str(paste._get_p_date())
curr_date = curr_date[0:4]+'/'+curr_date[4:6]+'/'+curr_date[6:]
p_tags = r_serv_metadata.smembers('tag:'+path)
l_tags = []
for tag in p_tags:
complete_tag = tag.replace('"', '&quot;')
tag = tag.split('=')
if len(tag) > 1:
if tag[1] != '':
tag = tag[1][1:-1]
# no value
else:
tag = tag[0][1:-1]
# use for custom tags
else:
tag = tag[0]
l_tags.append( (tag, complete_tag) )
data = {}
data["module"] = module_name
data["index"] = index
data["path"] = path
data["content"] = content[0:content_range]
data["linenum"] = paste.get_lines_info()[0]
data["date"] = curr_date
data["l_tags"] = l_tags
data["bootstrap_label"] = bootstrap_label
data["char_to_display"] = max_preview_modal
data["finished"] = True if index == len(all_pastes_list) else False
yield 'retry: 100000\ndata: %s\n\n' % json.dumps(data) #retry to avoid reconnection of the browser
# ============ ROUTES ============
@browsepastes.route("/browseImportantPaste/", methods=['GET'])
def browseImportantPaste():
module_name = request.args.get('moduleName')
return render_template("browse_important_paste.html", year_list=yearList, selected_year=curYear)
@browsepastes.route("/importantPasteByModule/", methods=['GET'])
def importantPasteByModule():
module_name = request.args.get('moduleName')
# # TODO: VERIFY YEAR VALIDITY
try:
currentSelectYear = int(request.args.get('year'))
except:
print('Invalid year input')
currentSelectYear = int(datetime.now().year)
all_content = []
paste_date = []
paste_linenum = []
all_path = []
paste_tags = []
allPastes = getPastebyType(r_serv_db[currentSelectYear], module_name)
for path in allPastes[0:10]:
all_path.append(path)
paste = Paste.Paste(path)
content = paste.get_p_content()
content_range = max_preview_char if len(content)>max_preview_char else len(content)-1
all_content.append(content[0:content_range].replace("\"", "\'").replace("\r", " ").replace("\n", " "))
curr_date = str(paste._get_p_date())
curr_date = curr_date[0:4]+'/'+curr_date[4:6]+'/'+curr_date[6:]
paste_date.append(curr_date)
paste_linenum.append(paste.get_lines_info()[0])
p_tags = r_serv_metadata.smembers('tag:'+path)
l_tags = []
for tag in p_tags:
complete_tag = tag
tag = tag.split('=')
if len(tag) > 1:
if tag[1] != '':
tag = tag[1][1:-1]
# no value
else:
tag = tag[0][1:-1]
# use for custom tags
else:
tag = tag[0]
l_tags.append( (tag, complete_tag) )
paste_tags.append(l_tags)
if len(allPastes) > 10:
finished = False
else:
finished = True
return render_template("important_paste_by_module.html",
moduleName=module_name,
year=currentSelectYear,
all_path=all_path,
content=all_content,
paste_date=paste_date,
paste_linenum=paste_linenum,
char_to_display=max_preview_modal,
paste_tags=paste_tags,
bootstrap_label=bootstrap_label,
finished=finished)
@browsepastes.route("/_getImportantPasteByModule", methods=['GET'])
def getImportantPasteByModule():
module_name = request.args.get('moduleName')
currentSelectYear = int(request.args.get('year'))
return flask.Response(event_stream_getImportantPasteByModule(module_name, currentSelectYear), mimetype="text/event-stream")
# ========= REGISTRATION =========
app.register_blueprint(browsepastes, url_prefix=baseUrl)

View file

@ -1,190 +0,0 @@
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Browse Important Paste - AIL</title>
<link rel="icon" href="{{ url_for('static', filename='image/ail-icon.png') }}">
<!-- Core CSS -->
<link href="{{ url_for('static', filename='css/bootstrap.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='font-awesome/css/font-awesome.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/sb-admin-2.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/dataTables.bootstrap.css') }}" rel="stylesheet" type="text/css" />
<script language="javascript" src="{{ url_for('static', filename='js/jquery.js')}}"></script>
<script src="{{ url_for('static', filename='js/bootstrap.min.js') }}"></script>
<script src="{{ url_for('static', filename='js/jquery.dataTables.min.js') }}"></script>
<script src="{{ url_for('static', filename='js/dataTables.bootstrap.js') }}"></script>
<style>
.tooltip-inner {
text-align: left;
height: 200%;
width: 200%;
max-width: 500px;
max-height: 500px;
font-size: 13px;
}
xmp {
white-space:pre-wrap;
word-wrap:break-word;
}
</style>
</head>
<body>
{% include 'navbar.html' %}
<!-- Modal -->
<div id="mymodal" class="modal fade" role="dialog">
<div class="modal-dialog modal-lg">
<!-- Modal content-->
<div id="mymodalcontent" class="modal-content">
<div id="mymodalbody" class="modal-body" max-width="850px">
<p>Loading paste information...</p>
<img id="loading-gif-modal" src="{{url_for('static', filename='image/loading.gif') }}" height="26" width="26" style="margin: 4px;">
</div>
<div class="modal-footer">
<a id="button_show_path" target="_blank" href=""><button type="button" class="btn btn-info">Show saved paste</button></a>
<button type="button" class="btn btn-default" data-dismiss="modal">Close</button>
</div>
</div>
</div>
</div>
<div id="page-wrapper">
<div class="row">
<div class="col-lg-12">
<h1 class="page-header" data-page="page-browse" >Browse important pastes</h1>
</div>
<!-- /.col-lg-12 -->
</div>
<div class="row">
<div class="col-md-12" style="margin-bottom: 0.2cm;">
<strong style="">Year: </strong>
<select class="form-control" id="index_year" style="display: inline-block; margin-bottom: 5px; width: 5%">
{% for yearElem in year_list %}
<option {% if yearElem[2] %} selected="selected" {% endif %} value="{{ yearElem[0] }}" >{{ yearElem[1] }}</option>
{% endfor %}
</select>
</div>
</br>
</div>
<!-- /.row -->
<div class="row">
<!-- /.nav-tabs -->
<ul class="nav nav-tabs">
<li name='nav-pan' class="active"><a data-toggle="tab" href="#credential-tab" data-attribute-name="credential" data-panel="credential-panel">Credentials</a></li>
<li name='nav-pan'><a data-toggle="tab" href="#creditcard-tab" data-attribute-name="creditcard" data-panel="creditcard-panel">Credit cards</a></li>
<li name='nav-pan'><a data-toggle="tab" href="#sqlinjection-tab" data-attribute-name="sqlinjection" data-panel="sqlinjection-panel">SQL injections</a></li>
<li name='nav-pan'><a data-toggle="tab" href="#cve-tab" data-attribute-name="cve" data-panel="cve-panel">CVEs</a></li>
<li name='nav-pan'><a data-toggle="tab" href="#keys-tab" data-attribute-name="keys" data-panel="keys-panel">Keys</a></li>
<li name='nav-pan'><a data-toggle="tab" href="#apikey-tab" data-attribute-name="apikey" data-panel="apikey-panel">API Keys</a></li>
<li name='nav-pan'><a data-toggle="tab" href="#mail-tab" data-attribute-name="mail" data-panel="mail-panel">Mails</a></li>
<li name='nav-pan'><a data-toggle="tab" href="#phone-tab" data-attribute-name="phone" data-panel="phone-panel">Phones</a></li>
<li name='nav-pan'><a data-toggle="tab" href="#onion-tab" data-attribute-name="onion" data-panel="onion-panel">Onions</a></li>
<li name='nav-pan'><a data-toggle="tab" href="#bitcoin-tab" data-attribute-name="bitcoin" data-panel="bitcoin-panel">Bitcoin</a></li>
<li name='nav-pan'><a data-toggle="tab" href="#base64-tab" data-attribute-name="base64" data-panel="base64-panel">Base64</a></li>
</ul>
</br>
<div class="tab-content">
<div class="col-lg-12 tab-pane fade in active" id="credential-tab" >
<img id="loading-gif-modal" src="{{url_for('static', filename='image/loading.gif') }}" style="margin: 4px;">
</div>
<div class="col-lg-12 tab-pane fade" id="creditcard-tab">
<img id="loading-gif-modal" src="{{url_for('static', filename='image/loading.gif') }}" style="margin: 4px;">
</div>
<div class="col-lg-12 tab-pane fade" id="sqlinjection-tab">
<img id="loading-gif-modal" src="{{url_for('static', filename='image/loading.gif') }}" style="margin: 4px;">
</div>
<div class="col-lg-12 tab-pane fade" id="cve-tab">
<img id="loading-gif-modal" src="{{url_for('static', filename='image/loading.gif') }}" style="margin: 4px;">
</div>
<div class="col-lg-12 tab-pane fade" id="keys-tab">
<img id="loading-gif-modal" src="{{url_for('static', filename='image/loading.gif') }}" style="margin: 4px;">
</div>
<div class="col-lg-12 tab-pane fade" id="apikey-tab">
<img id="loading-gif-modal" src="{{url_for('static', filename='image/loading.gif') }}" style="margin: 4px;">
</div>
<div class="col-lg-12 tab-pane fade" id="mail-tab">
<img id="loading-gif-modal" src="{{url_for('static', filename='image/loading.gif') }}" style="margin: 4px;">
</div>
<div class="col-lg-12 tab-pane fade" id="phone-tab">
<img id="loading-gif-modal" src="{{url_for('static', filename='image/loading.gif') }}" style="margin: 4px;">
</div>
<div class="col-lg-12 tab-pane fade" id="onion-tab">
<img id="loading-gif-modal" src="{{url_for('static', filename='image/loading.gif') }}" style="margin: 4px;">
</div>
<div class="col-lg-12 tab-pane fade" id="bitcoin-tab">
<img id="loading-gif-modal" src="{{url_for('static', filename='image/loading.gif') }}" style="margin: 4px;">
</div>
<div class="col-lg-12 tab-pane fade" id="base64-tab">
<img id="loading-gif-modal" src="{{url_for('static', filename='image/loading.gif') }}" style="margin: 4px;">
</div>
</div> <!-- tab-content -->
<!-- /.row -->
</div>
<!-- /#page-wrapper -->
<!-- import graph function -->
<script>
$(document).ready(function(){
activePage = $('h1.page-header').attr('data-page');
$("#"+activePage).addClass("active");
var dataPath = 'credential';
$.get("{{ url_for('browsepastes.importantPasteByModule') }}"+"?moduleName="+dataPath+"&year="+currentSelectYear, function(data, status){
$('#'+dataPath+'-tab').html(data);
});
});
</script>
<script>
// When a pannel is shown, create the data-table.
var previous_tab = $('[data-attribute-name="credential');
var currentTabName = previous_tab.attr('data-attribute-name');
var loading_gif = "<img id='loading-gif-modal' class='img-center' src=\"{{url_for('static', filename='image/loading.gif') }}\" height='26' width='26' style='margin: 4px;'>";
var currentSelectYear = {{ selected_year }};
$('#index_year').on('change', function() {
currentSelectYear = this.value;
$.get("{{ url_for('browsepastes.importantPasteByModule') }}"+"?moduleName="+currentTabName+"&year="+currentSelectYear, function(data, status){
$('#'+currentTabName+'-tab').html(data);
});
})
$('.nav-tabs a').on('shown.bs.tab', function(event){
var dataPath = $(event.target).attr('data-attribute-name');
currentTabName = dataPath;
$.get("{{ url_for('browsepastes.importantPasteByModule') }}"+"?moduleName="+currentTabName+"&year="+currentSelectYear, function(data, status){
currentTab = $('[name].active').children();
$('#'+previous_tab.attr('data-attribute-name')+'-tab').html(loading_gif);
currentTab.removeClass( "active" );
$('#'+dataPath+'-tab').html(data);
$(event.target).parent().addClass( "active" );
previous_tab = currentTab;
});
});
</script>
</div>
</body>
</html>

View file

@ -1 +0,0 @@
<li id='page-browse'><a href="{{ url_for('browsepastes.browseImportantPaste') }}"><i class="fa fa-search-plus "></i> Browse important pastes</a></li>

View file

@ -1,261 +0,0 @@
<table class="table table-striped table-bordered table-hover" id="myTable_{{ moduleName }}">
<thead>
<tr>
<th>#</th>
<th style="max-width: 800px;">Path</th>
<th>Date</th>
<th># of lines</th>
<th>Action</th>
</tr>
</thead>
<tbody>
{% for path in all_path %}
<tr>
<td> {{ loop.index0 }}</td>
<td><a target="_blank" href="{{ url_for('showsavedpastes.showsavedpaste') }}?paste={{path}}">{{ path }}</a>
<div>
{% for tag in paste_tags[loop.index0] %}
<a href="{{ url_for('Tags.get_tagged_paste') }}?ltags={{ tag[1] }}">
<span class="label label-{{ bootstrap_label[loop.index0 % 5] }} pull-left">{{ tag[0] }}</span>
</a>
{% endfor %}
</div>
</td>
<td>{{ paste_date[loop.index0] }}</td>
<td>{{ paste_linenum[loop.index0] }}</td>
<td><p><span class="glyphicon glyphicon-info-sign" data-toggle="tooltip" data-placement="left" title="{{ content[loop.index0] }} "></span> <button type="button" class="btn-link" data-num="{{ loop.index0 + 1 }}" data-toggle="modal" data-target="#mymodal" data-url="{{ url_for('showsavedpastes.showsavedpaste') }}?paste={{ path }}&num={{ loop.index0+1 }}" data-path="{{ path }}"><span class="fa fa-search-plus"></span></button></p></td>
</tr>
{% endfor %}
</tbody>
</table>
</br>
<div id="nbr_entry" class="alert alert-info">
</div>
<div id="div_stil_data">
<button id="load_more_json_button1" type="button" class="btn btn-default" onclick="add_entries(100)" style="display: none">Load 100 entries</button>
<button id="load_more_json_button2" type="button" class="btn btn-warning" onclick="add_entries(300)" style="display: none">Load 300 entries</button>
<img id="loading_gif_browse" src="{{url_for('static', filename='image/loading.gif') }}" heigt="20" width="20" style="margin: 2px;"></div>
</br>
<script>
var json_array = [];
var all_data_received = false;
var curr_numElem;
var elem_added = 10; //10 elements are added by default in the page loading
var tot_num_entry = 10; //10 elements are added by default in the page loading
function deploy_source() {
var button_load_more_displayed = false;
if(typeof(EventSource) !== "undefined" && typeof(source) !== "") {
$("#load_more_json_button1").show();
$("#load_more_json_button2").show();
var source = new EventSource("{{ url_for('browsepastes.getImportantPasteByModule') }}"+"?moduleName="+moduleName+"&year="+currentSelectYear);
source.onmessage = function(event) {
var feed = jQuery.parseJSON( event.data );
curr_numElem = parseInt($("#myTable_"+moduleName).attr('data-numElem'));
if (feed.index > curr_numElem & feed.module == moduleName) { // Avoid doubling the pastes
json_array.push(feed);
tot_num_entry++;
$("#nbr_entry").text(tot_num_entry + " entries available, " + (tot_num_entry - elem_added) + " not displayed");
$("#myTable_"+moduleName).attr('data-numElem', curr_numElem+1);
if(feed.index > 100 && !button_load_more_displayed) {
button_load_more_displayed = true;
add_entries_X(20);
}
if(feed.finished) {
$("#loading_gif_browse").hide();
source.close();
all_data_received = true;
add_entries_X(10);
}
}
};
} else {
console.log("Sorry, your browser does not support server-sent events...");
}
}
function add_entries(iter) { //Used to disable the button before going to the big loop
$("#load_more_json_button1").attr('disabled','disabled');
$("#load_more_json_button2").attr('disabled','disabled');
setTimeout(function() { add_entries_X(iter);}, 50);
}
function add_entries_X(to_add) {
for(i=0; i<to_add; i++) {
if(json_array.length == 0 && all_data_received) {
$("#load_more_json_button1").hide();
$("#load_more_json_button2").hide();
$("#nbr_entry").hide();
return false;
} else {
var feed = json_array.shift();
elem_added++;
var tag = ""
for(j=0; j<feed.l_tags.length; j++) {
console.log(feed.l_tags[j][1])
tag = tag + "<a href=\"{{ url_for('Tags.get_tagged_paste') }}?ltags=" + feed.l_tags[j][1] + "\">"
+ "<span class=\"label label-" + feed.bootstrap_label[j % 5] + " pull-left\">" + feed.l_tags[j][0] + "</span>" + "</a>";
}
search_table.row.add( [
feed.index,
"<a target=\"_blank\" href=\"{{ url_for('showsavedpastes.showsavedpaste') }}?paste="+feed.path+"&num="+feed.index+"\"> "+ feed.path +"</a>"
+ "<div>" + tag + "</div>" ,
feed.date,
feed.linenum,
"<p><span class=\"glyphicon glyphicon-info-sign\" data-toggle=\"tooltip\" data-placement=\"left\" title=\""+feed.content.replace(/\"/g, "\'").replace(/\r/g, "\'").replace(/\n/g, "\'")+"\"></span> <button type=\"button\" class=\"btn-link\" data-num=\""+feed.index+"\" data-toggle=\"modal\" data-target=\"#mymodal\" data-url=\"{{ url_for('showsavedpastes.showsavedpaste') }}?paste="+feed.path+"&num="+feed.index+"\" data-path=\""+feed.path+"\"><span class=\"fa fa-search-plus\"></span></button></p>"
] ).draw( false );
$("#myTable_"+moduleName).attr('data-numElem', curr_numElem+1);
}
}
$("#load_more_json_button1").removeAttr('disabled');
$("#load_more_json_button2").removeAttr('disabled');
$("#nbr_entry").text(tot_num_entry + " entries available, " + (tot_num_entry - elem_added) + " not displayed");
return true;
}
</script>
<script>
var moduleName = "{{ moduleName }}";
var currentSelectYear = "{{ year }}";
var search_table;
var last_clicked_paste;
var can_change_modal_content = true;
$("#myTable_"+moduleName).attr('data-numElem', "{{ all_path|length }}");
$(document).ready(function(){
$('[data-toggle="tooltip"]').tooltip();
$("[data-toggle='modal']").off('click.openmodal').on("click.openmodal", function (event) {
get_html_and_update_modal(event, $(this));
});
search_table = $('#myTable_'+moduleName).DataTable({ "order": [[ 2, "desc" ]] });
if( "{{ finished }}" == "True"){
$("#load_more_json_button1").hide();
$("#load_more_json_button2").hide();
$("#nbr_entry").hide();
$("#loading_gif_browse").hide();
} else {
deploy_source();
}
});
</script>
<!-- Dynamically update the modal -->
<script type="text/javascript">
// static data
var alert_message = '<div class="alert alert-info alert-dismissable"><button type="button" class="close" data-dismiss="alert" aria-hidden="true">×</button><strong>No more data.</strong> Full paste displayed.</div>';
var complete_paste = null;
var char_to_display = {{ char_to_display }};
var start_index = 0;
// When the modal goes out, refresh it to normal content
$("#mymodal").on('hidden.bs.modal', function () {
can_change_modal_content = true;
$("#mymodalbody").html("<p>Loading paste information...</p>");
var loading_gif = "<img id='loading-gif-modal' class='img-center' src=\"{{url_for('static', filename='image/loading.gif') }}\" height='26' width='26' style='margin: 4px;'>";
$("#mymodalbody").append(loading_gif); // Show the loading GIF
$("#button_show_path").attr('href', '');
$("#button_show_path").hide();
complete_paste = null;
start_index = 0;
});
// Update the paste preview in the modal
function update_preview() {
if (start_index + char_to_display > complete_paste.length-1){ // end of paste reached
var final_index = complete_paste.length-1;
var flag_stop = true;
} else {
var final_index = start_index + char_to_display;
}
if (final_index != start_index){ // still have data to display
// Append the new content using text() and not append (XSS)
$("#mymodalbody").find("#paste-holder").text($("#mymodalbody").find("#paste-holder").text()+complete_paste.substring(start_index+1, final_index+1));
start_index = final_index;
if (flag_stop)
nothing_to_display();
} else {
nothing_to_display();
}
}
// Update the modal when there is no more data
function nothing_to_display() {
var new_content = $(alert_message).hide();
$("#mymodalbody").find("#panel-body").append(new_content);
new_content.show('fast');
$("#load-more-button").hide();
}
function get_html_and_update_modal(event, truemodal) {
event.preventDefault();
var modal=truemodal;
var url = " {{ url_for('showsavedpastes.showpreviewpaste') }}?paste=" + modal.attr('data-path') + "&num=" + modal.attr('data-num');
last_clicked_paste = modal.attr('data-num');
$.get(url, function (data) {
// verify that the reveived data is really the current clicked paste. Otherwise, ignore it.
var received_num = parseInt(data.split("|num|")[1]);
if (received_num == last_clicked_paste && can_change_modal_content) {
can_change_modal_content = false;
// clear data by removing html, body, head tags. prevent dark modal background stack bug.
var cleared_data = data.split("<body>")[1].split("</body>")[0];
$("#mymodalbody").html(cleared_data);
var button = $('<button type="button" id="load-more-button" class="btn btn-info btn-xs center-block" data-url="' + $(modal).attr('data-path') +'" data-toggle="tooltip" data-placement="bottom" title="Load more content"><span class="glyphicon glyphicon-download"></span></button>');
button.tooltip();
$("#mymodalbody").children(".panel-default").append(button);
$("#button_show_path").attr('href', $(modal).attr('data-url'));
$("#button_show_path").show('fast');
$("#loading-gif-modal").css("visibility", "hidden"); // Hide the loading GIF
if ($("[data-initsize]").attr('data-initsize') < char_to_display) { // All the content is displayed
nothing_to_display();
}
// On click, donwload all paste's content
$("#load-more-button").on("click", function (event) {
if (complete_paste == null) { //Donwload only once
$.get("{{ url_for('showsavedpastes.getmoredata') }}"+"?paste="+$(modal).attr('data-path'), function(data, status){
complete_paste = data;
update_preview();
});
} else {
update_preview();
}
});
} else if (can_change_modal_content) {
$("#mymodalbody").html("Ignoring previous not finished query of paste #" + received_num);
}
});
}
// Use to bind the button with the new displayed data
// (The bind do not happens if the dataTable is in tabs and the clicked data is in another page)
search_table.on( 'draw.dt', function () {
// Bind tooltip each time we draw a new page
$('[data-toggle="tooltip"]').tooltip();
// On click, get html content from url and update the corresponding modal
$("[data-toggle='modal']").off('click.openmodal').on("click.openmodal", function (event) {
get_html_and_update_modal(event, $(this));
});
} );
</script>

View file

@ -22,8 +22,10 @@ cfg = Flask_config.cfg
baseUrl = Flask_config.baseUrl
r_serv = Flask_config.r_serv
r_serv_log = Flask_config.r_serv_log
r_serv_db = Flask_config.r_serv_db
max_dashboard_logs = Flask_config.max_dashboard_logs
dict_update_description = Flask_config.dict_update_description
dashboard = Blueprint('dashboard', __name__, template_folder='templates')
@ -164,8 +166,22 @@ def index():
log_select.add(max_dashboard_logs)
log_select = list(log_select)
log_select.sort()
# Check if update in progress
update_in_progress = False
update_warning_message = ''
update_warning_message_notice_me = ''
current_update = r_serv_db.get('ail:current_background_update')
if current_update:
if r_serv_db.scard('ail:update_{}'.format(current_update)) != dict_update_description[current_update]['nb_background_update']:
update_in_progress = True
update_warning_message = dict_update_description[current_update]['update_warning_message']
update_warning_message_notice_me = dict_update_description[current_update]['update_warning_message_notice_me']
return render_template("index.html", default_minute = default_minute, threshold_stucked_module=threshold_stucked_module,
log_select=log_select, selected=max_dashboard_logs)
log_select=log_select, selected=max_dashboard_logs,
update_warning_message=update_warning_message, update_in_progress=update_in_progress,
update_warning_message_notice_me=update_warning_message_notice_me)
# ========= REGISTRATION =========
app.register_blueprint(dashboard, url_prefix=baseUrl)

View file

@ -99,6 +99,19 @@
</div>
</div>
{%if update_in_progress%}
<div class="alert alert-warning alert-dismissible fade in">
<a href="#" class="close" data-dismiss="alert" aria-label="close">&times;</a>
<strong>Warning!</strong> {{update_warning_message}} <strong>{{update_warning_message_notice_me}}</strong>
(<a href="{{ url_for('settings.settings_page') }}">Check Update Status</a>)
</div>
{%endif%}
<div class="alert alert-info alert-dismissible fade in">
<a href="#" class="close" data-dismiss="alert" aria-label="close">&times;</a>
<strong>Bootstrap 4 migration!</strong> Some pages are still in bootstrap 3. You can check the migration progress <strong><a href="https://github.com/CIRCL/AIL-framework/issues/330" target="_blank">Here</a></strong>.
</div>
<div class="row">
<div class="col-lg-6">
<div class="panel panel-default">
@ -187,6 +200,7 @@
<!-- /#page-wrapper -->
</div>
<script> var url_showSavedPath = "{{ url_for('showsavedpastes.showsavedpaste') }}"; </script>
<script type="text/javascript" src="{{ url_for('static', filename='js/indexjavascript.js')}}"
data-urlstuff="{{ url_for('dashboard.stuff') }}" data-urllog="{{ url_for('dashboard.logs') }}">

View file

@ -25,6 +25,7 @@ baseUrl = Flask_config.baseUrl
r_serv_metadata = Flask_config.r_serv_metadata
vt_enabled = Flask_config.vt_enabled
vt_auth = Flask_config.vt_auth
PASTES_FOLDER = Flask_config.PASTES_FOLDER
hashDecoded = Blueprint('hashDecoded', __name__, template_folder='templates')
@ -589,6 +590,12 @@ def hash_graph_node_json():
#get related paste
l_pastes = r_serv_metadata.zrange('nb_seen_hash:'+hash, 0, -1)
for paste in l_pastes:
# dynamic update
if PASTES_FOLDER in paste:
score = r_serv_metadata.zscore('nb_seen_hash:{}'.format(hash), paste)
r_serv_metadata.zrem('nb_seen_hash:{}'.format(hash), paste)
paste = paste.replace(PASTES_FOLDER, '', 1)
r_serv_metadata.zadd('nb_seen_hash:{}'.format(hash), score, paste)
url = paste
#nb_seen_in_this_paste = nb_in_file = int(r_serv_metadata.zscore('nb_seen_hash:'+hash, paste))
nb_hash_in_paste = r_serv_metadata.scard('hash_paste:'+paste)

View file

@ -8,6 +8,9 @@ import redis
import datetime
import sys
import os
import time
import json
from pyfaup.faup import Faup
from flask import Flask, render_template, jsonify, request, Blueprint, redirect, url_for
from Date import Date
@ -23,10 +26,13 @@ r_cache = Flask_config.r_cache
r_serv_onion = Flask_config.r_serv_onion
r_serv_metadata = Flask_config.r_serv_metadata
bootstrap_label = Flask_config.bootstrap_label
PASTES_FOLDER = Flask_config.PASTES_FOLDER
hiddenServices = Blueprint('hiddenServices', __name__, template_folder='templates')
faup = Faup()
list_types=['onion', 'regular']
dic_type_name={'onion':'Onion', 'regular':'Website'}
# ============ FUNCTIONS ============
def one():
return 1
@ -68,48 +74,114 @@ def unpack_paste_tags(p_tags):
l_tags.append( (tag, complete_tag) )
return l_tags
def is_valid_domain(domain):
faup.decode(domain)
domain_unpack = faup.get()
if domain_unpack['tld'] is not None and domain_unpack['scheme'] is None and domain_unpack['port'] is None and domain_unpack['query_string'] is None:
return True
else:
return False
def get_onion_status(domain, date):
if r_serv_onion.sismember('onion_up:'+date , domain):
return True
else:
return False
# ============= ROUTES ==============
@hiddenServices.route("/hiddenServices/", methods=['GET'])
def hiddenServices_page():
last_onions = r_serv_onion.lrange('last_onion', 0 ,-1)
list_onion = []
now = datetime.datetime.now()
date = '{}{}{}'.format(now.strftime("%Y"), now.strftime("%m"), now.strftime("%d"))
statDomains = {}
statDomains['domains_up'] = r_serv_onion.scard('onion_up:{}'.format(date))
statDomains['domains_down'] = r_serv_onion.scard('onion_down:{}'.format(date))
statDomains['total'] = statDomains['domains_up'] + statDomains['domains_down']
statDomains['domains_queue'] = r_serv_onion.scard('onion_domain_crawler_queue')
for onion in last_onions:
metadata_onion = {}
metadata_onion['domain'] = onion
metadata_onion['last_check'] = r_serv_onion.hget('onion_metadata:{}'.format(onion), 'last_check')
if metadata_onion['last_check'] is None:
metadata_onion['last_check'] = '********'
metadata_onion['first_seen'] = r_serv_onion.hget('onion_metadata:{}'.format(onion), 'first_seen')
if metadata_onion['first_seen'] is None:
metadata_onion['first_seen'] = '********'
if get_onion_status(onion, metadata_onion['last_check']):
metadata_onion['status_text'] = 'UP'
metadata_onion['status_color'] = 'Green'
metadata_onion['status_icon'] = 'fa-check-circle'
def get_domain_type(domain):
type_id = domain.split(':')[-1]
if type_id == 'onion':
return 'onion'
else:
metadata_onion['status_text'] = 'DOWN'
metadata_onion['status_color'] = 'Red'
metadata_onion['status_icon'] = 'fa-times-circle'
list_onion.append(metadata_onion)
return 'regular'
def get_type_domain(domain):
if domain is None:
type = 'regular'
else:
if domain.rsplit('.', 1)[1] == 'onion':
type = 'onion'
else:
type = 'regular'
return type
def get_domain_from_url(url):
faup.decode(url)
unpack_url = faup.get()
domain = unpack_url['domain'].decode()
return domain
def get_last_domains_crawled(type):
return r_serv_onion.lrange('last_{}'.format(type), 0 ,-1)
def get_stats_last_crawled_domains(type, date):
statDomains = {}
statDomains['domains_up'] = r_serv_onion.scard('{}_up:{}'.format(type, date))
statDomains['domains_down'] = r_serv_onion.scard('{}_down:{}'.format(type, date))
statDomains['total'] = statDomains['domains_up'] + statDomains['domains_down']
statDomains['domains_queue'] = r_serv_onion.scard('{}_crawler_queue'.format(type))
statDomains['domains_queue'] += r_serv_onion.scard('{}_crawler_priority_queue'.format(type))
return statDomains
def get_last_crawled_domains_metadata(list_domains_crawled, date, type=None, auto_mode=False):
list_crawled_metadata = []
for domain_epoch in list_domains_crawled:
if not auto_mode:
domain, epoch = domain_epoch.rsplit(';', 1)
else:
url = domain_epoch
domain = domain_epoch
domain = domain.split(':')
if len(domain) == 1:
port = 80
domain = domain[0]
else:
port = domain[1]
domain = domain[0]
metadata_domain = {}
# get Domain type
if type is None:
type_domain = get_type_domain(domain)
else:
type_domain = type
if auto_mode:
metadata_domain['url'] = url
epoch = r_serv_onion.zscore('crawler_auto_queue', '{};auto;{}'.format(domain, type_domain))
#domain in priority queue
if epoch is None:
epoch = 'In Queue'
else:
epoch = datetime.datetime.fromtimestamp(float(epoch)).strftime('%Y-%m-%d %H:%M:%S')
metadata_domain['domain'] = domain
if len(domain) > 45:
domain_name, tld_domain = domain.rsplit('.', 1)
metadata_domain['domain_name'] = '{}[...].{}'.format(domain_name[:40], tld_domain)
else:
metadata_domain['domain_name'] = domain
metadata_domain['port'] = port
metadata_domain['epoch'] = epoch
metadata_domain['last_check'] = r_serv_onion.hget('{}_metadata:{}'.format(type_domain, domain), 'last_check')
if metadata_domain['last_check'] is None:
metadata_domain['last_check'] = '********'
metadata_domain['first_seen'] = r_serv_onion.hget('{}_metadata:{}'.format(type_domain, domain), 'first_seen')
if metadata_domain['first_seen'] is None:
metadata_domain['first_seen'] = '********'
if r_serv_onion.sismember('{}_up:{}'.format(type_domain, metadata_domain['last_check']) , domain):
metadata_domain['status_text'] = 'UP'
metadata_domain['status_color'] = 'Green'
metadata_domain['status_icon'] = 'fa-check-circle'
else:
metadata_domain['status_text'] = 'DOWN'
metadata_domain['status_color'] = 'Red'
metadata_domain['status_icon'] = 'fa-times-circle'
list_crawled_metadata.append(metadata_domain)
return list_crawled_metadata
def get_crawler_splash_status(type):
crawler_metadata = []
all_onion_crawler = r_cache.smembers('all_crawler:onion')
for crawler in all_onion_crawler:
all_crawlers = r_cache.smembers('{}_crawlers'.format(type))
for crawler in all_crawlers:
crawling_domain = r_cache.hget('metadata_crawler:{}'.format(crawler), 'crawling_domain')
started_time = r_cache.hget('metadata_crawler:{}'.format(crawler), 'started_time')
status_info = r_cache.hget('metadata_crawler:{}'.format(crawler), 'status')
@ -120,10 +192,322 @@ def hiddenServices_page():
status=False
crawler_metadata.append({'crawler_info': crawler_info, 'crawling_domain': crawling_domain, 'status_info': status_info, 'status': status})
return crawler_metadata
def create_crawler_config(mode, service_type, crawler_config, domain, url=None):
if mode == 'manual':
r_cache.set('crawler_config:{}:{}:{}'.format(mode, service_type, domain), json.dumps(crawler_config))
elif mode == 'auto':
r_serv_onion.set('crawler_config:{}:{}:{}:{}'.format(mode, service_type, domain, url), json.dumps(crawler_config))
def send_url_to_crawl_in_queue(mode, service_type, url):
r_serv_onion.sadd('{}_crawler_priority_queue'.format(service_type), '{};{}'.format(url, mode))
# add auto crawled url for user UI
if mode == 'auto':
r_serv_onion.sadd('auto_crawler_url:{}'.format(service_type), url)
def delete_auto_crawler(url):
domain = get_domain_from_url(url)
type = get_type_domain(domain)
# remove from set
r_serv_onion.srem('auto_crawler_url:{}'.format(type), url)
# remove config
r_serv_onion.delete('crawler_config:auto:{}:{}:{}'.format(type, domain, url))
# remove from queue
r_serv_onion.srem('{}_crawler_priority_queue'.format(type), '{};auto'.format(url))
# remove from crawler_auto_queue
r_serv_onion.zrem('crawler_auto_queue'.format(type), '{};auto;{}'.format(url, type))
# ============= ROUTES ==============
@hiddenServices.route("/crawlers/", methods=['GET'])
def dashboard():
crawler_metadata_onion = get_crawler_splash_status('onion')
crawler_metadata_regular = get_crawler_splash_status('regular')
now = datetime.datetime.now()
date = now.strftime("%Y%m%d")
statDomains_onion = get_stats_last_crawled_domains('onion', date)
statDomains_regular = get_stats_last_crawled_domains('regular', date)
return render_template("Crawler_dashboard.html", crawler_metadata_onion = crawler_metadata_onion,
crawler_metadata_regular=crawler_metadata_regular,
statDomains_onion=statDomains_onion, statDomains_regular=statDomains_regular)
@hiddenServices.route("/hiddenServices/2", methods=['GET'])
def hiddenServices_page_test():
return render_template("Crawler_index.html")
@hiddenServices.route("/crawlers/manual", methods=['GET'])
def manual():
return render_template("Crawler_Splash_manual.html")
@hiddenServices.route("/crawlers/crawler_splash_onion", methods=['GET'])
def crawler_splash_onion():
type = 'onion'
last_onions = get_last_domains_crawled(type)
list_onion = []
now = datetime.datetime.now()
date = now.strftime("%Y%m%d")
statDomains = get_stats_last_crawled_domains(type, date)
list_onion = get_last_crawled_domains_metadata(last_onions, date, type=type)
crawler_metadata = get_crawler_splash_status(type)
date_string = '{}-{}-{}'.format(date[0:4], date[4:6], date[6:8])
return render_template("hiddenServices.html", last_onions=list_onion, statDomains=statDomains,
return render_template("Crawler_Splash_onion.html", last_onions=list_onion, statDomains=statDomains,
crawler_metadata=crawler_metadata, date_from=date_string, date_to=date_string)
@hiddenServices.route("/crawlers/Crawler_Splash_last_by_type", methods=['GET'])
def Crawler_Splash_last_by_type():
type = request.args.get('type')
# verify user input
if type not in list_types:
type = 'onion'
type_name = dic_type_name[type]
list_domains = []
now = datetime.datetime.now()
date = now.strftime("%Y%m%d")
date_string = '{}-{}-{}'.format(date[0:4], date[4:6], date[6:8])
statDomains = get_stats_last_crawled_domains(type, date)
list_domains = get_last_crawled_domains_metadata(get_last_domains_crawled(type), date, type=type)
crawler_metadata = get_crawler_splash_status(type)
return render_template("Crawler_Splash_last_by_type.html", type=type, type_name=type_name,
last_domains=list_domains, statDomains=statDomains,
crawler_metadata=crawler_metadata, date_from=date_string, date_to=date_string)
@hiddenServices.route("/crawlers/blacklisted_domains", methods=['GET'])
def blacklisted_domains():
blacklist_domain = request.args.get('blacklist_domain')
unblacklist_domain = request.args.get('unblacklist_domain')
type = request.args.get('type')
if type in list_types:
type_name = dic_type_name[type]
if blacklist_domain is not None:
blacklist_domain = int(blacklist_domain)
if unblacklist_domain is not None:
unblacklist_domain = int(unblacklist_domain)
try:
page = int(request.args.get('page'))
except:
page = 1
if page <= 0:
page = 1
nb_page_max = r_serv_onion.scard('blacklist_{}'.format(type))/(1000)
if isinstance(nb_page_max, float):
nb_page_max = int(nb_page_max)+1
if page > nb_page_max:
page = nb_page_max
start = 1000*(page -1)
stop = 1000*page
list_blacklisted = list(r_serv_onion.smembers('blacklist_{}'.format(type)))
list_blacklisted_1 = list_blacklisted[start:stop]
list_blacklisted_2 = list_blacklisted[stop:stop+1000]
return render_template("blacklisted_domains.html", list_blacklisted_1=list_blacklisted_1, list_blacklisted_2=list_blacklisted_2,
type=type, type_name=type_name, page=page, nb_page_max=nb_page_max,
blacklist_domain=blacklist_domain, unblacklist_domain=unblacklist_domain)
else:
return 'Incorrect Type'
@hiddenServices.route("/crawler/blacklist_domain", methods=['GET'])
def blacklist_domain():
domain = request.args.get('domain')
type = request.args.get('type')
try:
page = int(request.args.get('page'))
except:
page = 1
if type in list_types:
if is_valid_domain(domain):
res = r_serv_onion.sadd('blacklist_{}'.format(type), domain)
if page:
if res == 0:
return redirect(url_for('hiddenServices.blacklisted_domains', page=page, type=type, blacklist_domain=2))
else:
return redirect(url_for('hiddenServices.blacklisted_domains', page=page, type=type, blacklist_domain=1))
else:
return redirect(url_for('hiddenServices.blacklisted_domains', page=page, type=type, blacklist_domain=0))
else:
return 'Incorrect type'
@hiddenServices.route("/crawler/unblacklist_domain", methods=['GET'])
def unblacklist_domain():
domain = request.args.get('domain')
type = request.args.get('type')
try:
page = int(request.args.get('page'))
except:
page = 1
if type in list_types:
if is_valid_domain(domain):
res = r_serv_onion.srem('blacklist_{}'.format(type), domain)
if page:
if res == 0:
return redirect(url_for('hiddenServices.blacklisted_domains', page=page, type=type, unblacklist_domain=2))
else:
return redirect(url_for('hiddenServices.blacklisted_domains', page=page, type=type, unblacklist_domain=1))
else:
return redirect(url_for('hiddenServices.blacklisted_domains', page=page, type=type, unblacklist_domain=0))
else:
return 'Incorrect type'
@hiddenServices.route("/crawlers/create_spider_splash", methods=['POST'])
def create_spider_splash():
url = request.form.get('url_to_crawl')
automatic = request.form.get('crawler_type')
crawler_time = request.form.get('crawler_epoch')
#html = request.form.get('html_content_id')
screenshot = request.form.get('screenshot')
har = request.form.get('har')
depth_limit = request.form.get('depth_limit')
max_pages = request.form.get('max_pages')
# validate url
if url is None or url=='' or url=='\n':
return 'incorrect url'
crawler_config = {}
# verify user input
if automatic:
automatic = True
else:
automatic = False
if not screenshot:
crawler_config['png'] = 0
if not har:
crawler_config['har'] = 0
# verify user input
if depth_limit:
try:
depth_limit = int(depth_limit)
if depth_limit < 0:
return 'incorrect depth_limit'
else:
crawler_config['depth_limit'] = depth_limit
except:
return 'incorrect depth_limit'
if max_pages:
try:
max_pages = int(max_pages)
if max_pages < 1:
return 'incorrect max_pages'
else:
crawler_config['closespider_pagecount'] = max_pages
except:
return 'incorrect max_pages'
# get service_type
faup.decode(url)
unpack_url = faup.get()
domain = unpack_url['domain'].decode()
if unpack_url['tld'] == b'onion':
service_type = 'onion'
else:
service_type = 'regular'
if automatic:
mode = 'auto'
try:
crawler_time = int(crawler_time)
if crawler_time < 0:
return 'incorrect epoch'
else:
crawler_config['time'] = crawler_time
except:
return 'incorrect epoch'
else:
mode = 'manual'
epoch = None
create_crawler_config(mode, service_type, crawler_config, domain, url=url)
send_url_to_crawl_in_queue(mode, service_type, url)
return redirect(url_for('hiddenServices.manual'))
@hiddenServices.route("/crawlers/auto_crawler", methods=['GET'])
def auto_crawler():
nb_element_to_display = 100
try:
page = int(request.args.get('page'))
except:
page = 1
if page <= 0:
page = 1
nb_auto_onion = r_serv_onion.scard('auto_crawler_url:onion')
nb_auto_regular = r_serv_onion.scard('auto_crawler_url:regular')
if nb_auto_onion > nb_auto_regular:
nb_max = nb_auto_onion
else:
nb_max = nb_auto_regular
nb_page_max = nb_max/(nb_element_to_display)
if isinstance(nb_page_max, float):
nb_page_max = int(nb_page_max)+1
if page > nb_page_max:
page = nb_page_max
start = nb_element_to_display*(page -1)
stop = nb_element_to_display*page
last_auto_crawled = get_last_domains_crawled('auto_crawled')
last_domains = get_last_crawled_domains_metadata(last_auto_crawled, '')
if start > nb_auto_onion:
auto_crawler_domain_onions = []
elif stop > nb_auto_onion:
auto_crawler_domain_onions = list(r_serv_onion.smembers('auto_crawler_url:onion'))[start:nb_auto_onion]
else:
auto_crawler_domain_onions = list(r_serv_onion.smembers('auto_crawler_url:onion'))[start:stop]
if start > nb_auto_regular:
auto_crawler_domain_regular = []
elif stop > nb_auto_regular:
auto_crawler_domain_regular = list(r_serv_onion.smembers('auto_crawler_url:regular'))[start:nb_auto_regular]
else:
auto_crawler_domain_regular = list(r_serv_onion.smembers('auto_crawler_url:regular'))[start:stop]
auto_crawler_domain_onions_metadata = get_last_crawled_domains_metadata(auto_crawler_domain_onions, '', type='onion', auto_mode=True)
auto_crawler_domain_regular_metadata = get_last_crawled_domains_metadata(auto_crawler_domain_regular, '', type='regular', auto_mode=True)
return render_template("Crawler_auto.html", page=page, nb_page_max=nb_page_max,
last_domains=last_domains,
auto_crawler_domain_onions_metadata=auto_crawler_domain_onions_metadata,
auto_crawler_domain_regular_metadata=auto_crawler_domain_regular_metadata)
@hiddenServices.route("/crawlers/remove_auto_crawler", methods=['GET'])
def remove_auto_crawler():
url = request.args.get('url')
page = request.args.get('page')
if url:
delete_auto_crawler(url)
return redirect(url_for('hiddenServices.auto_crawler', page=page))
@hiddenServices.route("/crawlers/crawler_dashboard_json", methods=['GET'])
def crawler_dashboard_json():
crawler_metadata_onion = get_crawler_splash_status('onion')
crawler_metadata_regular = get_crawler_splash_status('regular')
now = datetime.datetime.now()
date = now.strftime("%Y%m%d")
statDomains_onion = get_stats_last_crawled_domains('onion', date)
statDomains_regular = get_stats_last_crawled_domains('regular', date)
return jsonify({'statDomains_onion': statDomains_onion, 'statDomains_regular': statDomains_regular,
'crawler_metadata_onion':crawler_metadata_onion, 'crawler_metadata_regular':crawler_metadata_regular})
# # TODO: refractor
@hiddenServices.route("/hiddenServices/last_crawled_domains_with_stats_json", methods=['GET'])
def last_crawled_domains_with_stats_json():
last_onions = r_serv_onion.lrange('last_onion', 0 ,-1)
@ -261,29 +645,54 @@ def show_domains_by_daterange():
date_from=date_from, date_to=date_to, domains_up=domains_up, domains_down=domains_down,
domains_tags=domains_tags, bootstrap_label=bootstrap_label)
@hiddenServices.route("/hiddenServices/onion_domain", methods=['GET'])
def onion_domain():
onion_domain = request.args.get('onion_domain')
if onion_domain is None or not r_serv_onion.exists('onion_metadata:{}'.format(onion_domain)):
@hiddenServices.route("/crawlers/show_domain", methods=['GET'])
def show_domain():
domain = request.args.get('domain')
epoch = request.args.get('epoch')
try:
epoch = int(epoch)
except:
epoch = None
port = request.args.get('port')
faup.decode(domain)
unpack_url = faup.get()
domain = unpack_url['domain'].decode()
if not port:
if unpack_url['port']:
port = unpack_url['port'].decode()
else:
port = 80
try:
port = int(port)
except:
port = 80
type = get_type_domain(domain)
if domain is None or not r_serv_onion.exists('{}_metadata:{}'.format(type, domain)):
return '404'
# # TODO: FIXME return 404
last_check = r_serv_onion.hget('onion_metadata:{}'.format(onion_domain), 'last_check')
last_check = r_serv_onion.hget('{}_metadata:{}'.format(type, domain), 'last_check')
if last_check is None:
last_check = '********'
last_check = '{}/{}/{}'.format(last_check[0:4], last_check[4:6], last_check[6:8])
first_seen = r_serv_onion.hget('onion_metadata:{}'.format(onion_domain), 'first_seen')
first_seen = r_serv_onion.hget('{}_metadata:{}'.format(type, domain), 'first_seen')
if first_seen is None:
first_seen = '********'
first_seen = '{}/{}/{}'.format(first_seen[0:4], first_seen[4:6], first_seen[6:8])
origin_paste = r_serv_onion.hget('onion_metadata:{}'.format(onion_domain), 'paste_parent')
origin_paste = r_serv_onion.hget('{}_metadata:{}'.format(type, domain), 'paste_parent')
h = HiddenServices(onion_domain, 'onion')
l_pastes = h.get_last_crawled_pastes()
h = HiddenServices(domain, type, port=port)
item_core = h.get_domain_crawled_core_item(epoch=epoch)
if item_core:
l_pastes = h.get_last_crawled_pastes(item_root=item_core['root_item'])
else:
l_pastes = []
dict_links = h.get_all_links(l_pastes)
if l_pastes:
status = True
else:
status = False
last_check = '{} - {}'.format(last_check, time.strftime('%H:%M.%S', time.gmtime(epoch)))
screenshot = h.get_domain_random_screenshot(l_pastes)
if screenshot:
screenshot = screenshot[0]
@ -295,15 +704,14 @@ def onion_domain():
origin_paste_name = h.get_origin_paste_name()
origin_paste_tags = unpack_paste_tags(r_serv_metadata.smembers('tag:{}'.format(origin_paste)))
paste_tags = []
path_name = []
for path in l_pastes:
path_name.append(path.replace(PASTES_FOLDER+'/', ''))
p_tags = r_serv_metadata.smembers('tag:'+path)
paste_tags.append(unpack_paste_tags(p_tags))
return render_template("showDomain.html", domain=onion_domain, last_check=last_check, first_seen=first_seen,
return render_template("showDomain.html", domain=domain, last_check=last_check, first_seen=first_seen,
l_pastes=l_pastes, paste_tags=paste_tags, bootstrap_label=bootstrap_label,
path_name=path_name, origin_paste_tags=origin_paste_tags, status=status,
dict_links=dict_links,
origin_paste_tags=origin_paste_tags, status=status,
origin_paste=origin_paste, origin_paste_name=origin_paste_name,
domain_tags=domain_tags, screenshot=screenshot)
@ -314,7 +722,6 @@ def onion_son():
h = HiddenServices(onion_domain, 'onion')
l_pastes = h.get_last_crawled_pastes()
l_son = h.get_domain_son(l_pastes)
print(l_son)
return 'l_son'
# ============= JSON ==============
@ -336,5 +743,26 @@ def domain_crawled_7days_json():
return jsonify(json_domain_stats)
@hiddenServices.route('/hiddenServices/domain_crawled_by_type_json')
def domain_crawled_by_type_json():
current_date = request.args.get('date')
type = request.args.get('type')
if type in list_types:
num_day_type = 7
date_range = get_date_range(num_day_type)
range_decoder = []
for date in date_range:
day_crawled = {}
day_crawled['date']= date[0:4] + '-' + date[4:6] + '-' + date[6:8]
day_crawled['UP']= nb_domain_up = r_serv_onion.scard('{}_up:{}'.format(type, date))
day_crawled['DOWN']= nb_domain_up = r_serv_onion.scard('{}_up:{}'.format(type, date))
range_decoder.append(day_crawled)
return jsonify(range_decoder)
else:
return jsonify('Incorrect Type')
# ========= REGISTRATION =========
app.register_blueprint(hiddenServices, url_prefix=baseUrl)

View file

@ -0,0 +1,477 @@
<!DOCTYPE html>
<html>
<head>
<title>AIL-Framework</title>
<link rel="icon" href="{{ url_for('static', filename='image/ail-icon.png')}}">
<!-- Core CSS -->
<link href="{{ url_for('static', filename='css/bootstrap4.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/font-awesome.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/daterangepicker.min.css') }}" rel="stylesheet">
<!-- JS -->
<script src="{{ url_for('static', filename='js/jquery.js')}}"></script>
<script src="{{ url_for('static', filename='js/popper.min.js')}}"></script>
<script src="{{ url_for('static', filename='js/bootstrap4.min.js')}}"></script>
<script language="javascript" src="{{ url_for('static', filename='js/moment.min.js') }}"></script>
<script language="javascript" src="{{ url_for('static', filename='js/jquery.daterangepicker.min.js') }}"></script>
<script language="javascript" src="{{ url_for('static', filename='js/d3.min.js') }}"></script>
<style>
.bar {
fill: steelblue;
}
.bar:hover{
fill: brown;
cursor: pointer;
}
.bar_stack:hover{
cursor: pointer;
}
.popover{
max-width: 100%;
}
</style>
</head>
<body>
{% include 'nav_bar.html' %}
<div class="container-fluid">
<div class="row">
{% include 'crawler/menu_sidebar.html' %}
<div class="col-12 col-lg-10" id="core_content">
<div class="row">
<div class="col-12 col-xl-6">
<div class="table-responsive mt-1 table-hover table-borderless table-striped">
<table class="table">
<thead class="thead-dark">
<tr>
<th>Domain</th>
<th>First Seen</th>
<th>Last Check</th>
<th>Status</th>
</tr>
</thead>
<tbody id="tbody_last_crawled">
{% for metadata_domain in last_domains %}
<tr data-toggle="popover" data-trigger="hover"
title="<span class='badge badge-dark'>{{metadata_domain['domain']}}</span>"
data-content="port: <span class='badge badge-secondary'>{{metadata_domain['port']}}</span><br>
epoch: {{metadata_domain['epoch']}}">
<td><a target="_blank" href="{{ url_for('hiddenServices.show_domain') }}?domain={{ metadata_domain['domain'] }}&port={{metadata_domain['port']}}&epoch={{metadata_domain['epoch']}}">{{ metadata_domain['domain_name'] }}</a></td>
<td>{{'{}/{}/{}'.format(metadata_domain['first_seen'][0:4], metadata_domain['first_seen'][4:6], metadata_domain['first_seen'][6:8])}}</td>
<td>{{'{}/{}/{}'.format(metadata_domain['last_check'][0:4], metadata_domain['last_check'][4:6], metadata_domain['last_check'][6:8])}}</td>
<td><div style="color:{{metadata_domain['status_color']}}; display:inline-block">
<i class="fas {{metadata_domain['status_icon']}} "></i>
{{metadata_domain['status_text']}}
</div>
</td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
<a href="{{ url_for('hiddenServices.blacklisted_domains') }}?type={{type}}">
<button type="button" class="btn btn-outline-danger">Show Blacklisted {{type_name}}s</button>
</a>
</div>
<div class="col-12 col-xl-6">
<div class="card text-white bg-dark mb-3 mt-1">
<div class="card-header">
<div class="row">
<div class="col-6">
<span class="badge badge-success">{{ statDomains['domains_up'] }}</span> UP
<span class="badge badge-danger ml-md-3">{{ statDomains['domains_down'] }}</span> DOWN
</div>
<div class="col-6">
<span class="badge badge-success">{{ statDomains['total'] }}</span> Crawled
<span class="badge badge-warning ml-md-3">{{ statDomains['domains_queue'] }}</span> Queue
</div>
</div>
</div>
<div class="card-body">
<h5 class="card-title">Select domains by date range :</h5>
<p class="card-text">Some quick example text to build on the card title and make up the bulk of the card's content.</p>
<form action="{{ url_for('hiddenServices.get_onions_by_daterange') }}" id="hash_selector_form" method='post'>
<div class="row">
<div class="col-6">
<div class="input-group" id="date-range-from">
<div class="input-group-prepend"><span class="input-group-text"><i class="far fa-calendar-alt" aria-hidden="true"></i></span></div>
<input class="form-control" id="date-range-from-input" placeholder="yyyy-mm-dd" value="{{ date_from }}" name="date_from" autocomplete="off">
</div>
<div class="input-group" id="date-range-to">
<div class="input-group-prepend"><span class="input-group-text"><i class="far fa-calendar-alt" aria-hidden="true"></i></span></div>
<input class="form-control" id="date-range-to-input" placeholder="yyyy-mm-dd" value="{{ date_to }}" name="date_to" autocomplete="off">
</div>
</div>
<div class="col-6">
<div class="custom-control custom-switch">
<input class="custom-control-input" type="checkbox" name="domains_up" value="True" id="domains_up_id" checked>
<label class="custom-control-label" for="domains_up_id">
<span class="badge badge-success"><i class="fas fa-check-circle"></i> Domains UP </span>
</label>
</div>
<div class="custom-control custom-switch">
<input class="custom-control-input" type="checkbox" name="domains_down" value="True" id="domains_down_id">
<label class="custom-control-label" for="domains_down_id">
<span class="badge badge-danger"><i class="fas fa-times-circle"></i> Domains DOWN</span>
</label>
</div>
<div class="custom-control custom-switch mt-2">
<input class="custom-control-input" type="checkbox" name="domains_tags" value="True" id="domains_tags_id">
<label class="custom-control-label" for="domains_tags_id">
<span class="badge badge-dark"><i class="fas fa-tags"></i> Domains Tags</span>
</label>
</div>
</div>
</div>
<button class="btn btn-primary">
<i class="fas fa-eye"></i> Show {{type_name}}s
</button>
<form>
</div>
</div>
<div id="barchart_type">
</div>
<div class="card mt-1 mb-1">
<div class="card-header text-white bg-dark">
Crawlers Status
</div>
<div class="card-body px-0 py-0 ">
<table class="table">
<tbody id="tbody_crawler_info">
{% for crawler in crawler_metadata %}
<tr>
<td>
<i class="fas fa-{%if crawler['status']%}check{%else%}times{%endif%}-circle" style="color:{%if crawler['status']%}Green{%else%}Red{%endif%};"></i> {{crawler['crawler_info']}}
</td>
<td>
{{crawler['crawling_domain']}}
</td>
<td style="color:{%if crawler['status']%}Green{%else%}Red{%endif%};">
{{crawler['status_info']}}
</td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</body>
<script>
var chart = {};
$(document).ready(function(){
$("#page-Crawler").addClass("active");
$("#nav_{{type}}_crawler").addClass("active");
$('#date-range-from').dateRangePicker({
separator : ' to ',
getValue: function(){
if ($('#date-range-from-input').val() && $('#date-range-to-input').val() )
return $('#date-range-from-input').val() + ' to ' + $('#date-range-to-input').val();
else
return '';
},
setValue: function(s,s1,s2){
$('#date-range-from-input').val(s1);
$('#date-range-to-input').val(s2);
}
});
$('#date-range-to').dateRangePicker({
separator : ' to ',
getValue: function(){
if ($('#date-range-from-input').val() && $('#date-range-to-input').val() )
return $('#date-range-from-input').val() + ' to ' + $('#date-range-to-input').val();
else
return '';
},
setValue: function(s,s1,s2){
$('#date-range-from-input').val(s1);
$('#date-range-to-input').val(s2);
}
});
chart.stackBarChart =barchart_type_stack("{{ url_for('hiddenServices.domain_crawled_by_type_json') }}?type={{type}}", 'id');
chart.onResize();
$(window).on("resize", function() {
chart.onResize();
});
$('[data-toggle="popover"]').popover({
placement: 'top',
container: 'body',
html : true,
});
});
function toggle_sidebar(){
if($('#nav_menu').is(':visible')){
$('#nav_menu').hide();
$('#side_menu').removeClass('border-right')
$('#side_menu').removeClass('col-lg-2')
$('#core_content').removeClass('col-lg-10')
}else{
$('#nav_menu').show();
$('#side_menu').addClass('border-right')
$('#side_menu').addClass('col-lg-2')
$('#core_content').addClass('col-lg-10')
}
}
</script>
<script>/*
function refresh_list_crawled(){
$.getJSON("{{ url_for('hiddenServices.last_crawled_domains_with_stats_json') }}",
function(data) {
var tableRef = document.getElementById('tbody_last_crawled');
$("#tbody_last_crawled").empty()
for (var i = 0; i < data.last_domains.length; i++) {
var data_domain = data.last_domains[i]
var newRow = tableRef.insertRow(tableRef.rows.length);
var newCell = newRow.insertCell(0);
newCell.innerHTML = "<td><a target=\"_blank\" href=\"{{ url_for('hiddenServices.show_domain') }}?onion_domain="+data_domain['domain']+"\">"+data_domain['domain']+"</a></td>";
newCell = newRow.insertCell(1);
newCell.innerHTML = "<td>"+data_domain['first_seen'].substr(0, 4)+"/"+data_domain['first_seen'].substr(4, 2)+"/"+data_domain['first_seen'].substr(6, 2)+"</td>"
newCell = newRow.insertCell(2);
newCell.innerHTML = "<td>"+data_domain['last_check'].substr(0, 4)+"/"+data_domain['last_check'].substr(4, 2)+"/"+data_domain['last_check'].substr(6, 2)+"</td>"
newCell = newRow.insertCell(3);
newCell.innerHTML = "<td><div style=\"color:"+data_domain['status_color']+"; display:inline-block\"><i class=\"fa "+data_domain['status_icon']+" fa-2x\"></i>"+data_domain['status_text']+"</div></td>"
}
var statDomains = data.statDomains
document.getElementById('text_domain_up').innerHTML = statDomains['domains_up']
document.getElementById('text_domain_down').innerHTML = statDomains['domains_down']
document.getElementById('text_domain_queue').innerHTML = statDomains['domains_queue']
document.getElementById('text_total_domains').innerHTML = statDomains['total']
if(data.crawler_metadata.length!=0){
$("#tbody_crawler_info").empty();
var tableRef = document.getElementById('tbody_crawler_info');
for (var i = 0; i < data.crawler_metadata.length; i++) {
var crawler = data.crawler_metadata[i];
var newRow = tableRef.insertRow(tableRef.rows.length);
var text_color;
var icon;
if(crawler['status']){
text_color = 'Green';
icon = 'check';
} else {
text_color = 'Red';
icon = 'times';
}
var newCell = newRow.insertCell(0);
newCell.innerHTML = "<td><i class=\"fa fa-"+icon+"-circle\" style=\"color:"+text_color+";\"></i>"+crawler['crawler_info']+"</td>";
newCell = newRow.insertCell(1);
newCell.innerHTML = "<td><a target=\"_blank\" href=\"{{ url_for('hiddenServices.show_domain') }}?onion_domain="+crawler['crawling_domain']+"\">"+crawler['crawling_domain']+"</a></td>";
newCell = newRow.insertCell(2);
newCell.innerHTML = "<td><div style=\"color:"+text_color+";\">"+crawler['status_info']+"</div></td>";
$("#panel_crawler").show();
}
} else {
$("#panel_crawler").hide();
}
}
);
if (to_refresh) {
setTimeout("refresh_list_crawled()", 10000);
}
}*/
</script>
<script>
var margin = {top: 20, right: 90, bottom: 55, left: 0},
width = parseInt(d3.select('#barchart_type').style('width'), 10);
width = 1000 - margin.left - margin.right,
height = 500 - margin.top - margin.bottom;
var x = d3.scaleBand().rangeRound([0, width]).padding(0.1);
var y = d3.scaleLinear().rangeRound([height, 0]);
var xAxis = d3.axisBottom(x);
var yAxis = d3.axisLeft(y);
var color = d3.scaleOrdinal(d3.schemeSet3);
var svg = d3.select("#barchart_type").append("svg")
.attr("id", "thesvg")
.attr("viewBox", "0 0 "+width+" 500")
.attr("width", width + margin.left + margin.right)
.attr("height", height + margin.top + margin.bottom)
.append("g")
.attr("transform", "translate(" + margin.left + "," + margin.top + ")");
function barchart_type_stack(url, id) {
d3.json(url)
.then(function(data){
var labelVar = 'date'; //A
var varNames = d3.keys(data[0])
.filter(function (key) { return key !== labelVar;}); //B
data.forEach(function (d) { //D
var y0 = 0;
d.mapping = varNames.map(function (name) {
return {
name: name,
label: d[labelVar],
y0: y0,
y1: y0 += +d[name]
};
});
d.total = d.mapping[d.mapping.length - 1].y1;
});
x.domain(data.map(function (d) { return (d.date); })); //E
y.domain([0, d3.max(data, function (d) { return d.total; })]);
svg.append("g")
.attr("class", "x axis")
.attr("transform", "translate(0," + height + ")")
.call(xAxis)
.selectAll("text")
.attr("class", "bar")
.on("click", function (d) { window.location.href = "#" })
.attr("transform", "rotate(-18)" )
//.attr("transform", "rotate(-40)" )
.style("text-anchor", "end");
svg.append("g")
.attr("class", "y axis")
.call(yAxis)
.append("text")
.attr("transform", "rotate(-90)")
.attr("y", 6)
.attr("dy", ".71em")
.style("text-anchor", "end");
var selection = svg.selectAll(".series")
.data(data)
.enter().append("g")
.attr("class", "series")
.attr("transform", function (d) { return "translate(" + x((d.date)) + ",0)"; });
selection.selectAll("rect")
.data(function (d) { return d.mapping; })
.enter().append("rect")
.attr("class", "bar_stack")
.attr("width", x.bandwidth())
.attr("y", function (d) { return y(d.y1); })
.attr("height", function (d) { return y(d.y0) - y(d.y1); })
.style("fill", function (d) { return color(d.name); })
.style("stroke", "grey")
.on("mouseover", function (d) { showPopover.call(this, d); })
.on("mouseout", function (d) { removePopovers(); })
.on("click", function(d){ window.location.href = "#" });
data.forEach(function(d) {
if(d.total != 0){
svg.append("text")
.attr("class", "bar")
.attr("dy", "-.35em")
.attr('x', x(d.date) + x.bandwidth()/2)
.attr('y', y(d.total))
.on("click", function () {window.location.href = "#" })
.style("text-anchor", "middle")
.text(d.total);
}
});
drawLegend(varNames);
});
}
function drawLegend (varNames) {
var legend = svg.selectAll(".legend")
.data(varNames.slice().reverse())
.enter().append("g")
.attr("class", "legend")
.attr("transform", function (d, i) { return "translate(0," + i * 20 + ")"; });
legend.append("rect")
.attr("x", 943)
.attr("width", 10)
.attr("height", 10)
.style("fill", color)
.style("stroke", "grey");
legend.append("text")
.attr("class", "svgText")
.attr("x", 941)
.attr("y", 6)
.attr("dy", ".35em")
.style("text-anchor", "end")
.text(function (d) { return d; });
}
function removePopovers () {
$('.popover').each(function() {
$(this).remove();
});
}
function showPopover (d) {
$(this).popover({
title: d.name,
placement: 'top',
container: 'body',
trigger: 'manual',
html : true,
content: function() {
return d.label +
"<br/>num: " + d3.format(",")(d.value ? d.value: d.y1 - d.y0); }
});
$(this).popover('show')
}
chart.onResize = function () {
var aspect = width / height, chart = $("#thesvg");
var targetWidth = chart.parent().width();
chart.attr("width", targetWidth);
chart.attr("height", targetWidth / 2);
}
window.chart = chart;
</script>

View file

@ -0,0 +1,164 @@
<!DOCTYPE html>
<html>
<head>
<title>AIL-Framework</title>
<link rel="icon" href="{{ url_for('static', filename='image/ail-icon.png')}}">
<!-- Core CSS -->
<link href="{{ url_for('static', filename='css/bootstrap4.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/font-awesome.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/daterangepicker.min.css') }}" rel="stylesheet">
<!-- JS -->
<script src="{{ url_for('static', filename='js/jquery.js')}}"></script>
<script src="{{ url_for('static', filename='js/popper.min.js')}}"></script>
<script src="{{ url_for('static', filename='js/bootstrap4.min.js')}}"></script>
<script language="javascript" src="{{ url_for('static', filename='js/moment.min.js') }}"></script>
<script language="javascript" src="{{ url_for('static', filename='js/jquery.daterangepicker.min.js') }}"></script>
<script language="javascript" src="{{ url_for('static', filename='js/d3.min.js') }}"></script>
</head>
<body>
{% include 'nav_bar.html' %}
<div class="container-fluid">
<div class="row">
{% include 'crawler/menu_sidebar.html' %}
<div class="col-12 col-lg-10" id="core_content">
<div class="card text-white bg-dark mb-3 mt-1">
<div class="card-header">
<h5 class="card-title">Crawl a Domain</h5>
</div>
<div class="card-body">
<p class="card-text">Enter a domain and choose what kind of data you want.</p>
<form action="{{ url_for('hiddenServices.create_spider_splash') }}" method='post'>
<div class="row">
<div class="col-12 col-lg-6">
<div class="input-group" id="date-range-from">
<input type="text" class="form-control" id="url_to_crawl" name="url_to_crawl" placeholder="Address or Domain">
</div>
<div class="d-flex mt-1">
<i class="fas fa-user-ninja mt-1"></i> &nbsp;Manual&nbsp;&nbsp;
<div class="custom-control custom-switch">
<input class="custom-control-input" type="checkbox" name="crawler_type" value="True" id="crawler_type">
<label class="custom-control-label" for="crawler_type">
<i class="fas fa-clock"></i> &nbsp;Automatic
</label>
</div>
</div>
<div class="input-group mt-2 mb-2" id="crawler_epoch_input">
<div class="input-group-prepend">
<span class="input-group-text bg-light"><i class="fas fa-clock"></i>&nbsp;</span>
</div>
<input class="form-control" type="number" id="crawler_epoch" value="3600" min="1" name="crawler_epoch" required>
<div class="input-group-append">
<span class="input-group-text">Time (seconds) between each crawling</span>
</div>
</div>
</div>
<div class="col-12 col-lg-6 mt-2 mt-lg-0">
<div class="row">
<div class="col-12 col-xl-6">
<div class="custom-control custom-switch">
<input class="custom-control-input" type="checkbox" name="html_content" value="True" id="html_content_id" checked disabled>
<label class="custom-control-label" for="html_content_id">
<i class="fab fa-html5"></i> &nbsp;HTML
</label>
</div>
<div class="custom-control custom-switch mt-1">
<input class="custom-control-input" type="checkbox" name="screenshot" value="True" id="screenshot_id">
<label class="custom-control-label" for="screenshot_id">
<i class="fas fa-image"></i> Screenshot
</label>
</div>
<div class="custom-control custom-switch mt-1">
<input class="custom-control-input" type="checkbox" name="har" value="True" id="har_id">
<label class="custom-control-label" for="har_id">
<i class="fas fa-file"></i> &nbsp;HAR
</label>
</div>
</div>
<div class="col-12 col-xl-6">
<div class="input-group form-group mb-0">
<div class="input-group-prepend">
<span class="input-group-text bg-light"><i class="fas fa-water"></i></span>
</div>
<input class="form-control" type="number" id="depth_limit" name="depth_limit" min="0" value="0" required>
<div class="input-group-append">
<span class="input-group-text">Depth Limit</span>
</div>
</div>
<div class="input-group mt-2">
<div class="input-group-prepend">
<span class="input-group-text bg-light"><i class="fas fa-copy"></i>&nbsp;</span>
</div>
<input class="form-control" type="number" id="max_pages" name="max_pages" min="1" value="1" required>
<div class="input-group-append">
<span class="input-group-text">Max Pages</span>
</div>
</div>
</div>
</div>
</div>
</div>
<button class="btn btn-primary mt-2">
<i class="fas fa-spider"></i> Send to Spider
</button>
<form>
</div>
</div>
</div>
</div>
</div>
</body>
<script>
var chart = {};
$(document).ready(function(){
$("#page-Crawler").addClass("active");
$("#nav_manual_crawler").addClass("active");
manual_crawler_input_controler();
$('#crawler_type').change(function () {
manual_crawler_input_controler();
});
});
function toggle_sidebar(){
if($('#nav_menu').is(':visible')){
$('#nav_menu').hide();
$('#side_menu').removeClass('border-right')
$('#side_menu').removeClass('col-lg-2')
$('#core_content').removeClass('col-lg-10')
}else{
$('#nav_menu').show();
$('#side_menu').addClass('border-right')
$('#side_menu').addClass('col-lg-2')
$('#core_content').addClass('col-lg-10')
}
}
function manual_crawler_input_controler() {
if($('#crawler_type').is(':checked')){
$("#crawler_epoch_input").show();
}else{
$("#crawler_epoch_input").hide();
}
}
</script>

View file

@ -0,0 +1,476 @@
<!DOCTYPE html>
<html>
<head>
<title>AIL-Framework</title>
<link rel="icon" href="{{ url_for('static', filename='image/ail-icon.png')}}">
<!-- Core CSS -->
<link href="{{ url_for('static', filename='css/bootstrap4.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/font-awesome.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/daterangepicker.min.css') }}" rel="stylesheet">
<!-- JS -->
<script src="{{ url_for('static', filename='js/jquery.js')}}"></script>
<script src="{{ url_for('static', filename='js/popper.min.js')}}"></script>
<script src="{{ url_for('static', filename='js/bootstrap4.min.js')}}"></script>
<script language="javascript" src="{{ url_for('static', filename='js/moment.min.js') }}"></script>
<script language="javascript" src="{{ url_for('static', filename='js/jquery.daterangepicker.min.js') }}"></script>
<script language="javascript" src="{{ url_for('static', filename='js/d3.min.js') }}"></script>
<style>
.bar {
fill: steelblue;
}
.bar:hover{
fill: brown;
cursor: pointer;
}
.bar_stack:hover{
cursor: pointer;
}
div.tooltip {
position: absolute;
text-align: center;
padding: 2px;
font: 12px sans-serif;
background: #ebf4fb;
border: 2px solid #b7ddf2;
border-radius: 8px;
pointer-events: none;
color: #000000;
}
</style>
</head>
<body>
{% include 'nav_bar.html' %}
<div class="container-fluid">
<div class="row">
{% include 'crawler/menu_sidebar.html' %}
<div class="col-12 col-lg-10" id="core_content">
<div class="row">
<div class="col-12 col-xl-6">
<div class="table-responsive mt-1 table-hover table-borderless table-striped">
<table class="table">
<thead class="thead-dark">
<tr>
<th>Domain</th>
<th>First Seen</th>
<th>Last Check</th>
<th>Status</th>
</tr>
</thead>
<tbody id="tbody_last_crawled">
{% for metadata_onion in last_onions %}
<tr>
<td><a target="_blank" href="{{ url_for('hiddenServices.onion_domain') }}?onion_domain={{ metadata_onion['domain'] }}">{{ metadata_onion['domain'] }}</a></td>
<td>{{'{}/{}/{}'.format(metadata_onion['first_seen'][0:4], metadata_onion['first_seen'][4:6], metadata_onion['first_seen'][6:8])}}</td>
<td>{{'{}/{}/{}'.format(metadata_onion['last_check'][0:4], metadata_onion['last_check'][4:6], metadata_onion['last_check'][6:8])}}</td>
<td><div style="color:{{metadata_onion['status_color']}}; display:inline-block">
<i class="fas {{metadata_onion['status_icon']}} "></i>
{{metadata_onion['status_text']}}
</div>
</td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
<a href="{{ url_for('hiddenServices.blacklisted_onion') }}">
<button type="button" class="btn btn-outline-danger">Show Blacklisted Onion</button>
</a>
</div>
<div class="col-12 col-xl-6">
<div class="card text-white bg-dark mb-3 mt-1">
<div class="card-header">
<div class="row">
<div class="col-6">
<span class="badge badge-success">{{ statDomains['domains_up'] }}</span> UP
<span class="badge badge-danger ml-md-3">{{ statDomains['domains_down'] }}</span> DOWN
</div>
<div class="col-6">
<span class="badge badge-success">{{ statDomains['total'] }}</span> Crawled
<span class="badge badge-warning ml-md-3">{{ statDomains['domains_queue'] }}</span> Queue
</div>
</div>
</div>
<div class="card-body">
<h5 class="card-title">Select domains by date range :</h5>
<p class="card-text">Some quick example text to build on the card title and make up the bulk of the card's content.</p>
<form action="{{ url_for('hiddenServices.get_onions_by_daterange') }}" id="hash_selector_form" method='post'>
<div class="row">
<div class="col-6">
<div class="input-group" id="date-range-from">
<div class="input-group-prepend"><span class="input-group-text"><i class="far fa-calendar-alt" aria-hidden="true"></i></span></div>
<input class="form-control" id="date-range-from-input" placeholder="yyyy-mm-dd" value="{{ date_from }}" name="date_from">
</div>
<div class="input-group" id="date-range-to">
<div class="input-group-prepend"><span class="input-group-text"><i class="far fa-calendar-alt" aria-hidden="true"></i></span></div>
<input class="form-control" id="date-range-to-input" placeholder="yyyy-mm-dd" value="{{ date_to }}" name="date_to">
</div>
</div>
<div class="col-6">
<div class="custom-control custom-switch">
<input class="custom-control-input" type="checkbox" name="domains_up" value="True" id="domains_up_id" checked>
<label class="custom-control-label" for="domains_up_id">
<span class="badge badge-success"><i class="fas fa-check-circle"></i> Domains UP </span>
</label>
</div>
<div class="custom-control custom-switch">
<input class="custom-control-input" type="checkbox" name="domains_down" value="True" id="domains_down_id">
<label class="custom-control-label" for="domains_down_id">
<span class="badge badge-danger"><i class="fas fa-times-circle"></i> Domains DOWN</span>
</label>
</div>
<div class="custom-control custom-switch mt-2">
<input class="custom-control-input" type="checkbox" name="domains_tags" value="True" id="domains_tags_id">
<label class="custom-control-label" for="domains_tags_id">
<span class="badge badge-dark"><i class="fas fa-tags"></i> Domains Tags</span>
</label>
</div>
</div>
</div>
<button class="btn btn-primary">
<i class="fas fa-eye"></i> Show Onions
</button>
<form>
</div>
</div>
<div id="barchart_type">
</div>
<div class="card mt-1 mb-1">
<div class="card-header text-white bg-dark">
Crawlers Status
</div>
<div class="card-body px-0 py-0 ">
<table class="table">
<tbody id="tbody_crawler_info">
{% for crawler in crawler_metadata %}
<tr>
<td>
<i class="fas fa-{%if crawler['status']%}check{%else%}times{%endif%}-circle" style="color:{%if crawler['status']%}Green{%else%}Red{%endif%};"></i> {{crawler['crawler_info']}}
</td>
<td>
{{crawler['crawling_domain']}}
</td>
<td style="color:{%if crawler['status']%}Green{%else%}Red{%endif%};">
{{crawler['status_info']}}
</td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</body>
<script>
var chart = {};
$(document).ready(function(){
$("#page-Crawler").addClass("active");
$("#nav_onion_crawler").addClass("active");
$('#date-range-from').dateRangePicker({
separator : ' to ',
getValue: function(){
if ($('#date-range-from-input').val() && $('#date-range-to-input').val() )
return $('#date-range-from-input').val() + ' to ' + $('#date-range-to-input').val();
else
return '';
},
setValue: function(s,s1,s2){
$('#date-range-from-input').val(s1);
$('#date-range-to-input').val(s2);
}
});
$('#date-range-to').dateRangePicker({
separator : ' to ',
getValue: function(){
if ($('#date-range-from-input').val() && $('#date-range-to-input').val() )
return $('#date-range-from-input').val() + ' to ' + $('#date-range-to-input').val();
else
return '';
},
setValue: function(s,s1,s2){
$('#date-range-from-input').val(s1);
$('#date-range-to-input').val(s2);
}
});
chart.stackBarChart =barchart_type_stack("{{ url_for('hiddenServices.automatic_onion_crawler_json') }}", 'id');
chart.onResize();
$(window).on("resize", function() {
chart.onResize();
});
});
function toggle_sidebar(){
if($('#nav_menu').is(':visible')){
$('#nav_menu').hide();
$('#side_menu').removeClass('border-right')
$('#side_menu').removeClass('col-lg-2')
$('#core_content').removeClass('col-lg-10')
}else{
$('#nav_menu').show();
$('#side_menu').addClass('border-right')
$('#side_menu').addClass('col-lg-2')
$('#core_content').addClass('col-lg-10')
}
}
</script>
<script>/*
function refresh_list_crawled(){
$.getJSON("{{ url_for('hiddenServices.last_crawled_domains_with_stats_json') }}",
function(data) {
var tableRef = document.getElementById('tbody_last_crawled');
$("#tbody_last_crawled").empty()
for (var i = 0; i < data.last_onions.length; i++) {
var data_domain = data.last_onions[i]
var newRow = tableRef.insertRow(tableRef.rows.length);
var newCell = newRow.insertCell(0);
newCell.innerHTML = "<td><a target=\"_blank\" href=\"{{ url_for('hiddenServices.onion_domain') }}?onion_domain="+data_domain['domain']+"\">"+data_domain['domain']+"</a></td>";
newCell = newRow.insertCell(1);
newCell.innerHTML = "<td>"+data_domain['first_seen'].substr(0, 4)+"/"+data_domain['first_seen'].substr(4, 2)+"/"+data_domain['first_seen'].substr(6, 2)+"</td>"
newCell = newRow.insertCell(2);
newCell.innerHTML = "<td>"+data_domain['last_check'].substr(0, 4)+"/"+data_domain['last_check'].substr(4, 2)+"/"+data_domain['last_check'].substr(6, 2)+"</td>"
newCell = newRow.insertCell(3);
newCell.innerHTML = "<td><div style=\"color:"+data_domain['status_color']+"; display:inline-block\"><i class=\"fa "+data_domain['status_icon']+" fa-2x\"></i>"+data_domain['status_text']+"</div></td>"
}
var statDomains = data.statDomains
document.getElementById('text_domain_up').innerHTML = statDomains['domains_up']
document.getElementById('text_domain_down').innerHTML = statDomains['domains_down']
document.getElementById('text_domain_queue').innerHTML = statDomains['domains_queue']
document.getElementById('text_total_domains').innerHTML = statDomains['total']
if(data.crawler_metadata.length!=0){
$("#tbody_crawler_info").empty();
var tableRef = document.getElementById('tbody_crawler_info');
for (var i = 0; i < data.crawler_metadata.length; i++) {
var crawler = data.crawler_metadata[i];
var newRow = tableRef.insertRow(tableRef.rows.length);
var text_color;
var icon;
if(crawler['status']){
text_color = 'Green';
icon = 'check';
} else {
text_color = 'Red';
icon = 'times';
}
var newCell = newRow.insertCell(0);
newCell.innerHTML = "<td><i class=\"fa fa-"+icon+"-circle\" style=\"color:"+text_color+";\"></i>"+crawler['crawler_info']+"</td>";
newCell = newRow.insertCell(1);
newCell.innerHTML = "<td><a target=\"_blank\" href=\"{{ url_for('hiddenServices.onion_domain') }}?onion_domain="+crawler['crawling_domain']+"\">"+crawler['crawling_domain']+"</a></td>";
newCell = newRow.insertCell(2);
newCell.innerHTML = "<td><div style=\"color:"+text_color+";\">"+crawler['status_info']+"</div></td>";
$("#panel_crawler").show();
}
} else {
$("#panel_crawler").hide();
}
}
);
if (to_refresh) {
setTimeout("refresh_list_crawled()", 10000);
}
}*/
</script>
<script>
var margin = {top: 20, right: 90, bottom: 55, left: 0},
width = parseInt(d3.select('#barchart_type').style('width'), 10);
width = 1000 - margin.left - margin.right,
height = 500 - margin.top - margin.bottom;
var x = d3.scaleBand().rangeRound([0, width]).padding(0.1);
var y = d3.scaleLinear().rangeRound([height, 0]);
var xAxis = d3.axisBottom(x);
var yAxis = d3.axisLeft(y);
var color = d3.scaleOrdinal(d3.schemeSet3);
var svg = d3.select("#barchart_type").append("svg")
.attr("id", "thesvg")
.attr("viewBox", "0 0 "+width+" 500")
.attr("width", width + margin.left + margin.right)
.attr("height", height + margin.top + margin.bottom)
.append("g")
.attr("transform", "translate(" + margin.left + "," + margin.top + ")");
function barchart_type_stack(url, id) {
d3.json(url)
.then(function(data){
var labelVar = 'date'; //A
var varNames = d3.keys(data[0])
.filter(function (key) { return key !== labelVar;}); //B
data.forEach(function (d) { //D
var y0 = 0;
d.mapping = varNames.map(function (name) {
return {
name: name,
label: d[labelVar],
y0: y0,
y1: y0 += +d[name]
};
});
d.total = d.mapping[d.mapping.length - 1].y1;
});
x.domain(data.map(function (d) { return (d.date); })); //E
y.domain([0, d3.max(data, function (d) { return d.total; })]);
svg.append("g")
.attr("class", "x axis")
.attr("transform", "translate(0," + height + ")")
.call(xAxis)
.selectAll("text")
.attr("class", "bar")
.on("click", function (d) { window.location.href = "#" })
.attr("transform", "rotate(-18)" )
//.attr("transform", "rotate(-40)" )
.style("text-anchor", "end");
svg.append("g")
.attr("class", "y axis")
.call(yAxis)
.append("text")
.attr("transform", "rotate(-90)")
.attr("y", 6)
.attr("dy", ".71em")
.style("text-anchor", "end");
var selection = svg.selectAll(".series")
.data(data)
.enter().append("g")
.attr("class", "series")
.attr("transform", function (d) { return "translate(" + x((d.date)) + ",0)"; });
selection.selectAll("rect")
.data(function (d) { return d.mapping; })
.enter().append("rect")
.attr("class", "bar_stack")
.attr("width", x.bandwidth())
.attr("y", function (d) { return y(d.y1); })
.attr("height", function (d) { return y(d.y0) - y(d.y1); })
.style("fill", function (d) { return color(d.name); })
.style("stroke", "grey")
.on("mouseover", function (d) { showPopover.call(this, d); })
.on("mouseout", function (d) { removePopovers(); })
.on("click", function(d){ window.location.href = "#" });
data.forEach(function(d) {
if(d.total != 0){
svg.append("text")
.attr("class", "bar")
.attr("dy", "-.35em")
.attr('x', x(d.date) + x.bandwidth()/2)
.attr('y', y(d.total))
.on("click", function () {window.location.href = "#" })
.style("text-anchor", "middle")
.text(d.total);
}
});
drawLegend(varNames);
});
}
function drawLegend (varNames) {
var legend = svg.selectAll(".legend")
.data(varNames.slice().reverse())
.enter().append("g")
.attr("class", "legend")
.attr("transform", function (d, i) { return "translate(0," + i * 20 + ")"; });
legend.append("rect")
.attr("x", 943)
.attr("width", 10)
.attr("height", 10)
.style("fill", color)
.style("stroke", "grey");
legend.append("text")
.attr("class", "svgText")
.attr("x", 941)
.attr("y", 6)
.attr("dy", ".35em")
.style("text-anchor", "end")
.text(function (d) { return d; });
}
function removePopovers () {
$('.popover').each(function() {
$(this).remove();
});
}
function showPopover (d) {
$(this).popover({
title: d.name,
placement: 'top',
container: 'body',
trigger: 'manual',
html : true,
content: function() {
return d.label +
"<br/>num: " + d3.format(",")(d.value ? d.value: d.y1 - d.y0); }
});
$(this).popover('show')
}
chart.onResize = function () {
var aspect = width / height, chart = $("#thesvg");
var targetWidth = chart.parent().width();
chart.attr("width", targetWidth);
chart.attr("height", targetWidth / 2);
}
window.chart = chart;
</script>

View file

@ -0,0 +1,217 @@
<!DOCTYPE html>
<html>
<head>
<title>AIL-Framework</title>
<link rel="icon" href="{{ url_for('static', filename='image/ail-icon.png')}}">
<!-- Core CSS -->
<link href="{{ url_for('static', filename='css/bootstrap4.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/font-awesome.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/dataTables.bootstrap4.min.css') }}" rel="stylesheet">
<!-- JS -->
<script src="{{ url_for('static', filename='js/jquery.js')}}"></script>
<script src="{{ url_for('static', filename='js/bootstrap4.min.js')}}"></script>
<script src="{{ url_for('static', filename='js/jquery.dataTables.min.js')}}"></script>
<script src="{{ url_for('static', filename='js/dataTables.bootstrap.min.js')}}"></script>
</head>
<body>
{% include 'nav_bar.html' %}
<div class="container-fluid">
<div class="row">
{% include 'crawler/menu_sidebar.html' %}
<div class="col-12 col-lg-10" id="core_content">
{%if last_domains%}
<div class="table-responsive mt-1 mb-3 table-hover table-borderless table-striped">
<table class="table">
<thead class="thead-dark">
<tr>
<th>Domain</th>
<th>First Seen</th>
<th>Last Check</th>
<th>Status</th>
</tr>
</thead>
<tbody id="tbody_last_crawled">
{% for metadata_domain in last_domains %}
<tr>
<td><a target="_blank" href="{{ url_for('hiddenServices.show_domain') }}?domain={{ metadata_domain['domain'] }}&port={{metadata_domain['port']}}&epoch={{metadata_domain['epoch']}}">{{ metadata_domain['domain_name'] }}</a></td>
<td>{{'{}/{}/{}'.format(metadata_domain['first_seen'][0:4], metadata_domain['first_seen'][4:6], metadata_domain['first_seen'][6:8])}}</td>
<td>{{'{}/{}/{}'.format(metadata_domain['last_check'][0:4], metadata_domain['last_check'][4:6], metadata_domain['last_check'][6:8])}}</td>
<td><div style="color:{{metadata_domain['status_color']}}; display:inline-block">
<i class="fas {{metadata_domain['status_icon']}} "></i>
{{metadata_domain['status_text']}}
</div>
</td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
{%endif%}
<div class="row">
<div class="col-lg-6">
<div class="table-responsive mt-1 table-hover table-borderless table-striped">
<table class="table" id="myTable_1">
<thead class="thead-dark">
<tr>
<th>Onion Url</th>
<th></th>
<th>Next Check</th>
<th></th>
<th></th>
</tr>
</thead>
<tbody id="tbody_last_crawled">
{% for metadata_domain in auto_crawler_domain_onions_metadata %}
<tr>
<td><a target="_blank" href="{{ url_for('hiddenServices.show_domain') }}?domain={{ metadata_domain['domain'] }}&port={{metadata_domain['port']}}&epoch={{metadata_domain['epoch']}}">{{ metadata_domain['url'] }}</a></td>
<td><a class="btn btn-outline-danger px-1 py-0" href="{{ url_for('hiddenServices.remove_auto_crawler') }}?url={{ metadata_domain['url'] }}&page={{page}}">
<i class="fas fa-trash-alt"></i></a>
</td>
<td>{{metadata_domain['epoch']}}</td>
<td><div style="color:{{metadata_domain['status_color']}}; display:inline-block">
<i class="fas {{metadata_domain['status_icon']}} "></i>
{{metadata_domain['status_text']}}
</div>
</td>
<td>
<button class="btn btn-outline-secondary px-1 py-0 disabled"><i class="fas fa-pencil-alt"></i></button>
</td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
</div>
<div class="col-lg-6">
<div class="table-responsive mt-1 table-hover table-borderless table-striped">
<table class="table" id="myTable_2">
<thead class="thead-dark">
<tr>
<th>Regular Url</th>
<th></th>
<th>Next Check</th>
<th></th>
<th></th>
</tr>
</thead>
<tbody id="tbody_last_crawled">
{% for metadata_domain in auto_crawler_domain_regular_metadata %}
<tr>
<td><a target="_blank" href="{{ url_for('hiddenServices.show_domain') }}?domain={{ metadata_domain['domain'] }}&port={{metadata_domain['port']}}&epoch={{metadata_domain['epoch']}}">{{ metadata_domain['url'] }}</a></td>
<td><a class="btn btn-outline-danger px-1 py-0" href="{{ url_for('hiddenServices.remove_auto_crawler') }}?url={{ metadata_domain['url'] }}&page={{page}}">
<i class="fas fa-trash-alt"></i></a>
</td>
<td>{{metadata_domain['epoch']}}</td>
<td><div style="color:{{metadata_domain['status_color']}}; display:inline-block">
<i class="fas {{metadata_domain['status_icon']}} "></i>
{{metadata_domain['status_text']}}
</div>
</td>
<td>
<button class="btn btn-outline-secondary px-1 py-0 disabled"><i class="fas fa-pencil-alt"></i></button>
</td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
</div>
</div>
<hr class="mt-4">
<div class="d-flex justify-content-center">
<nav aria-label="...">
<ul class="pagination">
<li class="page-item {%if page==1%}disabled{%endif%}">
<a class="page-link" href="{{ url_for('hiddenServices.auto_crawler') }}?page={{page-1}}">Previous</a>
</li>
{%if page>3%}
<li class="page-item"><a class="page-link" href="{{ url_for('hiddenServices.auto_crawler') }}?page=1">1</a></li>
<li class="page-item disabled"><a class="page-link" aria-disabled="true" href="#">...</a></li>
<li class="page-item"><a class="page-link" href="{{ url_for('hiddenServices.auto_crawler') }}?page={{page-1}}">{{page-1}}</a></li>
<li class="page-item active"><a class="page-link" href="{{ url_for('hiddenServices.auto_crawler') }}?page={{page}}">{{page}}</a></li>
{%else%}
{%if page>2%}<li class="page-item"><a class="page-link" href="{{ url_for('hiddenServices.auto_crawler') }}?page={{page-2}}">{{page-2}}</a></li>{%endif%}
{%if page>1%}<li class="page-item"><a class="page-link" href="{{ url_for('hiddenServices.auto_crawler') }}?page={{page-1}}">{{page-1}}</a></li>{%endif%}
<li class="page-item active"><a class="page-link" href="{{ url_for('hiddenServices.auto_crawler') }}?page={{page}}">{{page}}</a></li>
{%endif%}
{%if nb_page_max-page>3%}
<li class="page-item"><a class="page-link" href="{{ url_for('hiddenServices.auto_crawler') }}?page={{page+1}}">{{page+1}}</a></li>
<li class="page-item disabled"><a class="page-link" aria-disabled="true" href="#">...</a></li>
<li class="page-item"><a class="page-link" href="{{ url_for('hiddenServices.auto_crawler') }}?page={{nb_page_max}}">{{nb_page_max}}</a></li>
{%else%}
{%if nb_page_max-page>2%}<li class="page-item"><a class="page-link" href="{{ url_for('hiddenServices.auto_crawler') }}?page={{nb_page_max-2}}">{{nb_page_max-2}}</a></li>{%endif%}
{%if nb_page_max-page>1%}<li class="page-item"><a class="page-link" href="{{ url_for('hiddenServices.auto_crawler') }}?page={{nb_page_max-1}}">{{nb_page_max-1}}</a></li>{%endif%}
{%if nb_page_max-page>0%}<li class="page-item"><a class="page-link" href="{{ url_for('hiddenServices.auto_crawler') }}?page={{nb_page_max}}">{{nb_page_max}}</a></li>{%endif%}
{%endif%}
<li class="page-item {%if page==nb_page_max%}disabled{%endif%}">
<a class="page-link" href="{{ url_for('hiddenServices.auto_crawler') }}?page={{page+1}}" aria-disabled="true">Next</a>
</li>
</ul>
</nav>
</div>
</div>
</div>
</div>
</body>
<script>
$(document).ready(function(){
$("#page-Crawler").addClass("active");
$("#nav_auto_crawler").addClass("active");
table1 = $('#myTable_1').DataTable(
{
//"aLengthMenu": [[5, 10, 15, 20, -1], [5, 10, 15, 20, "All"]],
//"iDisplayLength": 5,
//"order": [[ 0, "desc" ]]
columnDefs: [
{ orderable: false, targets: [-1, -4] }
]
});
table2 = $('#myTable_2').DataTable(
{
//"aLengthMenu": [[5, 10, 15, 20, -1], [5, 10, 15, 20, "All"]],
//"iDisplayLength": 5,
//"order": [[ 0, "desc" ]]
columnDefs: [
{ orderable: false, targets: [-1, -4] }
]
});
});
function toggle_sidebar(){
if($('#nav_menu').is(':visible')){
$('#nav_menu').hide();
$('#side_menu').removeClass('border-right')
$('#side_menu').removeClass('col-lg-2')
$('#core_content').removeClass('col-lg-10')
}else{
$('#nav_menu').show();
$('#side_menu').addClass('border-right')
$('#side_menu').addClass('col-lg-2')
$('#core_content').addClass('col-lg-10')
}
}
</script>

View file

@ -0,0 +1,222 @@
<!DOCTYPE html>
<html>
<head>
<title>AIL-Framework</title>
<link rel="icon" href="{{ url_for('static', filename='image/ail-icon.png')}}">
<!-- Core CSS -->
<link href="{{ url_for('static', filename='css/bootstrap4.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/font-awesome.min.css') }}" rel="stylesheet">
<!-- JS -->
<script src="{{ url_for('static', filename='js/jquery.js')}}"></script>
<script src="{{ url_for('static', filename='js/bootstrap4.min.js')}}"></script>
</head>
<body>
{% include 'nav_bar.html' %}
<div class="container-fluid">
<div class="row">
{% include 'crawler/menu_sidebar.html' %}
<div class="col-12 col-lg-10" id="core_content">
<div class="row">
<div class="col-xl-6">
<div class="card mt-1 mb-1">
<div class="card-header text-white bg-dark">
<h5><a class="text-info" href="{{ url_for('hiddenServices.Crawler_Splash_last_by_type')}}?type=onion"><i class="fas fa-user-secret"></i> Onions Crawlers</a></h5>
<div class="row">
<div class="col-6">
<span class="badge badge-success" id="stat_onion_domain_up">{{ statDomains_onion['domains_up'] }}</span> UP
<span class="badge badge-danger ml-md-3" id="stat_onion_domain_down">{{ statDomains_onion['domains_down'] }}</span> DOWN
</div>
<div class="col-6">
<span class="badge badge-success" id="stat_onion_total">{{ statDomains_onion['total'] }}</span> Crawled
<span class="badge badge-warning ml-md-3" id="stat_onion_queue">{{ statDomains_onion['domains_queue'] }}</span> Queue
</div>
</div>
</div>
<div class="card-body px-0 py-0 ">
<table class="table">
<tbody id="tbody_crawler_onion_info">
{% for crawler in crawler_metadata_onion %}
<tr>
<td>
<i class="fas fa-{%if crawler['status']%}check{%else%}times{%endif%}-circle" style="color:{%if crawler['status']%}Green{%else%}Red{%endif%};"></i> {{crawler['crawler_info']}}
</td>
<td>
{{crawler['crawling_domain']}}
</td>
<td style="color:{%if crawler['status']%}Green{%else%}Red{%endif%};">
{{crawler['status_info']}}
</td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
</div>
</div>
<div class="col-xl-6">
<div class="card mt-1 mb-1">
<div class="card-header text-white bg-dark">
<h5><a class="text-info" href="{{ url_for('hiddenServices.Crawler_Splash_last_by_type')}}?type=regular"><i class="fab fa-html5"></i> Regular Crawlers</a></h5>
<div class="row">
<div class="col-6">
<span class="badge badge-success" id="stat_regular_domain_up">{{ statDomains_regular['domains_up'] }}</span> UP
<span class="badge badge-danger ml-md-3" id="stat_regular_domain_down">{{ statDomains_regular['domains_down'] }}</span> DOWN
</div>
<div class="col-6">
<span class="badge badge-success" id="stat_regular_total">{{ statDomains_regular['total'] }}</span> Crawled
<span class="badge badge-warning ml-md-3" id="stat_regular_queue">{{ statDomains_regular['domains_queue'] }}</span> Queue
</div>
</div>
</div>
<div class="card-body px-0 py-0 ">
<table class="table">
<tbody id="tbody_crawler_regular_info">
{% for crawler in crawler_metadata_regular %}
<tr>
<td>
<i class="fas fa-{%if crawler['status']%}check{%else%}times{%endif%}-circle" style="color:{%if crawler['status']%}Green{%else%}Red{%endif%};"></i> {{crawler['crawler_info']}}
</td>
<td>
{{crawler['crawling_domain']}}
</td>
<td style="color:{%if crawler['status']%}Green{%else%}Red{%endif%};">
{{crawler['status_info']}}
</td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</body>
<script>
var to_refresh = false
$(document).ready(function(){
$("#page-Crawler").addClass("active");
$("#nav_dashboard").addClass("active");
$( window ).focus(function() {
to_refresh = true
refresh_crawler_status();
});
$( window ).blur(function() {
to_refresh = false
});
to_refresh = true
refresh_crawler_status();
});
function toggle_sidebar(){
if($('#nav_menu').is(':visible')){
$('#nav_menu').hide();
$('#side_menu').removeClass('border-right')
$('#side_menu').removeClass('col-lg-2')
$('#core_content').removeClass('col-lg-10')
}else{
$('#nav_menu').show();
$('#side_menu').addClass('border-right')
$('#side_menu').addClass('col-lg-2')
$('#core_content').addClass('col-lg-10')
}
}
function refresh_crawler_status(){
$.getJSON("{{ url_for('hiddenServices.crawler_dashboard_json') }}",
function(data) {
$('#stat_onion_domain_up').text(data.statDomains_onion['domains_up']);
$('#stat_onion_domain_down').text(data.statDomains_onion['domains_down']);
$('#stat_onion_total').text(data.statDomains_onion['total']);
$('#stat_onion_queue').text(data.statDomains_onion['domains_queue']);
$('#stat_regular_domain_up').text(data.statDomains_regular['domains_up']);
$('#stat_regular_domain_down').text(data.statDomains_regular['domains_down']);
$('#stat_regular_total').text(data.statDomains_regular['total']);
$('#stat_regular_queue').text(data.statDomains_regular['domains_queue']);
if(data.crawler_metadata_onion.length!=0){
$("#tbody_crawler_onion_info").empty();
var tableRef = document.getElementById('tbody_crawler_onion_info');
for (var i = 0; i < data.crawler_metadata_onion.length; i++) {
var crawler = data.crawler_metadata_onion[i];
var newRow = tableRef.insertRow(tableRef.rows.length);
var text_color;
var icon;
if(crawler['status']){
text_color = 'Green';
icon = 'check';
} else {
text_color = 'Red';
icon = 'times';
}
var newCell = newRow.insertCell(0);
newCell.innerHTML = "<td><i class=\"fas fa-"+icon+"-circle\" style=\"color:"+text_color+";\"></i> "+crawler['crawler_info']+"</td>";
newCell = newRow.insertCell(1);
newCell.innerHTML = "<td>"+crawler['crawling_domain']+"</td>";
newCell = newRow.insertCell(2);
newCell.innerHTML = "<td><div style=\"color:"+text_color+";\">"+crawler['status_info']+"</div></td>";
//$("#panel_crawler").show();
}
}
if(data.crawler_metadata_regular.length!=0){
$("#tbody_crawler_regular_info").empty();
var tableRef = document.getElementById('tbody_crawler_regular_info');
for (var i = 0; i < data.crawler_metadata_regular.length; i++) {
var crawler = data.crawler_metadata_regular[i];
var newRow = tableRef.insertRow(tableRef.rows.length);
var text_color;
var icon;
if(crawler['status']){
text_color = 'Green';
icon = 'check';
} else {
text_color = 'Red';
icon = 'times';
}
var newCell = newRow.insertCell(0);
newCell.innerHTML = "<td><i class=\"fas fa-"+icon+"-circle\" style=\"color:"+text_color+";\"></i> "+crawler['crawler_info']+"</td>";
newCell = newRow.insertCell(1);
newCell.innerHTML = "<td>"+crawler['crawling_domain']+"</td>";
newCell = newRow.insertCell(2);
newCell.innerHTML = "<td><div style=\"color:"+text_color+";\">"+crawler['status_info']+"</div></td>";
//$("#panel_crawler").show();
}
}
}
);
if (to_refresh) {
setTimeout("refresh_crawler_status()", 10000);
}
}
</script>

View file

@ -0,0 +1,66 @@
<!DOCTYPE html>
<html>
<head>
<title>AIL-Framework</title>
<link rel="icon" href="{{ url_for('static', filename='image/ail-icon.png')}}">
<!-- Core CSS -->
<link href="{{ url_for('static', filename='css/bootstrap4.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/font-awesome.min.css') }}" rel="stylesheet">
<!-- JS -->
<script src="{{ url_for('static', filename='js/jquery.js')}}"></script>
<script src="{{ url_for('static', filename='js/bootstrap4.min.js')}}"></script>
</head>
<body>
{% include 'nav_bar.html' %}
<div class="container-fluid">
<div class="row">
{% include 'crawler/menu_sidebar.html' %}
<div class="col-12 col-lg-10" id="core_content">
<div >
<pre>
--------------
--------------
</pre>
</div>
</div>
</div>
</div>
</body>
<script>
$(document).ready(function(){
$("#page-Crawler").addClass("active");
});
function toggle_sidebar(){
if($('#nav_menu').is(':visible')){
$('#nav_menu').hide();
$('#side_menu').removeClass('border-right')
$('#side_menu').removeClass('col-lg-2')
$('#core_content').removeClass('col-lg-10')
}else{
$('#nav_menu').show();
$('#side_menu').addClass('border-right')
$('#side_menu').addClass('col-lg-2')
$('#core_content').addClass('col-lg-10')
}
}
</script>

View file

@ -0,0 +1,208 @@
<!DOCTYPE html>
<html>
<head>
<title>AIL-Framework</title>
<link rel="icon" href="{{ url_for('static', filename='image/ail-icon.png')}}">
<!-- Core CSS -->
<link href="{{ url_for('static', filename='css/bootstrap4.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/font-awesome.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/dataTables.bootstrap.min.css') }}" rel="stylesheet">
<!-- JS -->
<script src="{{ url_for('static', filename='js/jquery.js')}}"></script>
<script src="{{ url_for('static', filename='js/jquery.dataTables.min.js')}}"></script>
<script src="{{ url_for('static', filename='js/dataTables.bootstrap.min.js')}}"></script>
</head>
<body>
{% include 'nav_bar.html' %}
<div class="container-fluid">
<div class="row">
{% include 'crawler/menu_sidebar.html' %}
<div class="col-12 col-lg-10" id="core_content">
<div class="card-deck justify-content-center mx-0">
<div class="card border-dark mt-2">
<div class="card-header bg-dark text-white">
Blacklisted {{type_name}}s
</div>
<div class="card-body text-dark">
<div class="row">
<div class="col-12 col-md-6">
<div class="card text-center border-danger">
<div class="card-body text-danger">
<h5 class="card-title">Blacklist {{type_name}}</h5>
<input type="text" class="form-control {%if blacklist_domain is not none %}{%if blacklist_domain==1 %}is-valid{% else %}is-invalid{%endif%}{%endif%}" id="blacklist_domain_input" placeholder="{{type_name}} Address">
<div class="invalid-feedback">
{%if blacklist_domain==2 %}
This {{type_name}} is already blacklisted
{% else %}
Incorrect {{type_name}} address
{% endif %}
</div>
<div class="valid-feedback">
{{type_name}} Blacklisted
</div>
<button type="button" class="btn btn-danger mt-2" onclick="window.location.href ='{{ url_for('hiddenServices.blacklist_domain') }}?redirect=0&type={{type}}&domain='+$('#blacklist_domain_input').val();">Blacklist {{type_name}}</button>
</div>
</div>
</div>
<div class="col-12 col-md-6 mt-4 mt-md-0">
<div class="card text-center border-success">
<div class="card-body">
<h5 class="card-title">Unblacklist {{type_name}}</h5>
<input type="text" class="form-control {%if unblacklist_domain is not none %}{%if unblacklist_domain==1 %}is-valid{% else %}is-invalid{%endif%}{%endif%}" id="unblacklist_domain_input" placeholder="{{type_name}} Address">
<div class="invalid-feedback">
{%if unblacklist_domain==2 %}
This {{type_name}} is not blacklisted
{% else %}
Incorrect {{type_name}} address
{% endif %}
</div>
<div class="valid-feedback">
{{type_name}} Unblacklisted
</div>
<button type="button" class="btn btn-outline-secondary mt-2" onclick="window.location.href ='{{ url_for('hiddenServices.unblacklist_domain') }}?redirect=0&type={{type}}&domain='+$('#unblacklist_domain_input').val();">Unblacklist {{type_name}}</button>
</div>
</div>
</div>
</div>
<div class="row mt-4">
<div class="col-12 col-xl-6">
<table class="table table-striped table-bordered table-hover" id="myTable_1">
<thead class="thead-dark">
<tr>
<th style="max-width: 800px;">{{type_name}}</th>
<th style="max-width: 800px;">Unblacklist {{type_name}}</th>
</tr>
</thead>
<tbody>
{% for domain in list_blacklisted_1 %}
<tr>
<td>{{domain}}</td>
<td>
<a href="{{ url_for('hiddenServices.unblacklist_domain') }}?page={{page}}&domain={{domain}}&type={{type}}">
<button type="button" class="btn btn-outline-danger">UnBlacklist {{type_name}}</button>
</a>
</td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
<div class="col-12 col-xl-6">
<table class="table table-striped table-bordered table-hover" id="myTable_2">
<thead class="thead-dark">
<tr>
<th style="max-width: 800px;">{{type_name}}</th>
<th style="max-width: 800px;">Unblacklist {{type_name}}</th>
</tr>
</thead>
<tbody>
{% for domain in list_blacklisted_2 %}
<tr>
<td>{{domain}}</td>
<td>
<a href="{{ url_for('hiddenServices.unblacklist_domain') }}?page={{page}}&domain={{domain}}&type={{type}}">
<button type="button" class="btn btn-outline-danger">UnBlacklist {{type_name}}</button>
</a>
</td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
</div>
</div>
</div>
</div>
<div class="d-flex justify-content-center">
<nav class="mt-4" aria-label="...">
<ul class="pagination">
<li class="page-item {%if page==1%}disabled{%endif%}">
<a class="page-link" href="{{ url_for('hiddenServices.blacklisted_domains') }}?page={{page-1}}">Previous</a>
</li>
{%if page>3%}
<li class="page-item"><a class="page-link" href="{{ url_for('hiddenServices.blacklisted_domains') }}?page=1">1</a></li>
<li class="page-item disabled"><a class="page-link" aria-disabled="true" href="#">...</a></li>
<li class="page-item"><a class="page-link" href="{{ url_for('hiddenServices.blacklisted_domains') }}?page={{page-1}}">{{page-1}}</a></li>
<li class="page-item active"><a class="page-link" href="{{ url_for('hiddenServices.blacklisted_domains') }}?page={{page}}">{{page}}</a></li>
{%else%}
{%if page>2%}<li class="page-item"><a class="page-link" href="{{ url_for('hiddenServices.blacklisted_domains') }}?page={{page-2}}">{{page-2}}</a></li>{%endif%}
{%if page>1%}<li class="page-item"><a class="page-link" href="{{ url_for('hiddenServices.blacklisted_domains') }}?page={{page-1}}">{{page-1}}</a></li>{%endif%}
<li class="page-item active"><a class="page-link" href="{{ url_for('hiddenServices.blacklisted_domains') }}?page={{page}}">{{page}}</a></li>
{%endif%}
{%if nb_page_max-page>3%}
<li class="page-item"><a class="page-link" href="{{ url_for('hiddenServices.blacklisted_domains') }}?page={{page+1}}">{{page+1}}</a></li>
<li class="page-item disabled"><a class="page-link" aria-disabled="true" href="#">...</a></li>
<li class="page-item"><a class="page-link" href="{{ url_for('hiddenServices.blacklisted_domains') }}?page={{nb_page_max}}">{{nb_page_max}}</a></li>
{%else%}
{%if nb_page_max-page>2%}<li class="page-item"><a class="page-link" href="{{ url_for('hiddenServices.blacklisted_domains') }}?page={{nb_page_max-2}}">{{nb_page_max-2}}</a></li>{%endif%}
{%if nb_page_max-page>1%}<li class="page-item"><a class="page-link" href="{{ url_for('hiddenServices.blacklisted_domains') }}?page={{nb_page_max-1}}">{{nb_page_max-1}}</a></li>{%endif%}
{%if nb_page_max-page>0%}<li class="page-item"><a class="page-link" href="{{ url_for('hiddenServices.blacklisted_domains') }}?page={{nb_page_max}}">{{nb_page_max}}</a></li>{%endif%}
{%endif%}
<li class="page-item {%if page==nb_page_max%}disabled{%endif%}">
<a class="page-link" href="{{ url_for('hiddenServices.blacklisted_domains') }}?page={{page+1}}" aria-disabled="true">Next</a>
</li>
</ul>
</nav>
</div>
</div>
</div>
</div>
</body>
<script>
var table
$(document).ready(function(){
table = $('#myTable_1').DataTable(
{
/*"aLengthMenu": [[5, 10, 15, 20, -1], [5, 10, 15, 20, "All"]],
"iDisplayLength": 10,*/
"order": [[ 0, "asc" ]]
}
);
table = $('#myTable_2').DataTable(
{
/*"aLengthMenu": [[5, 10, 15, 20, -1], [5, 10, 15, 20, "All"]],
"iDisplayLength": 10,*/
"order": [[ 0, "asc" ]]
}
);
$("#page-Crawler").addClass("active");
});
function toggle_sidebar(){
if($('#nav_menu').is(':visible')){
$('#nav_menu').hide();
$('#side_menu').removeClass('border-right')
$('#side_menu').removeClass('col-lg-2')
$('#core_content').removeClass('col-lg-10')
}else{
$('#nav_menu').show();
$('#side_menu').addClass('border-right')
$('#side_menu').addClass('col-lg-2')
$('#core_content').addClass('col-lg-10')
}
}
</script>

View file

@ -63,7 +63,7 @@
{% for domain in domains_by_day[date] %}
<tr>
<td>
<a target="_blank" href="{{ url_for('hiddenServices.onion_domain') }}?onion_domain={{ domain }}">{{ domain }}</a>
<a target="_blank" href="{{ url_for('hiddenServices.show_domain') }}?domain={{ domain }}">{{ domain }}</a>
<div>
{% for tag in domain_metadata[domain]['tags'] %}
<a href="{{ url_for('Tags.get_tagged_paste') }}?ltags={{ tag }}">

View file

@ -1 +1 @@
<li id='page-hiddenServices'><a href="{{ url_for('hiddenServices.hiddenServices_page') }}"><i class="fa fa-user-secret"></i> hidden Services </a></li>
<li id='page-hiddenServices'><a href="{{ url_for('hiddenServices.dashboard') }}"><i class="fa fa-user-secret"></i> hidden Services </a></li>

View file

@ -1,60 +1,54 @@
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Show Domain - AIL</title>
<link rel="icon" href="{{ url_for('static', filename='image/ail-icon.png') }}">
<!-- Core CSS -->
<link href="{{ url_for('static', filename='css/bootstrap.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='font-awesome/css/font-awesome.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/sb-admin-2.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/dygraph_gallery.css') }}" rel="stylesheet" type="text/css" />
<link href="{{ url_for('static', filename='css/bootstrap4.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/font-awesome.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/daterangepicker.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/dataTables.bootstrap4.min.css') }}" rel="stylesheet">
<!-- JS -->
<script type="text/javascript" src="{{ url_for('static', filename='js/dygraph-combined.js') }}"></script>
<script language="javascript" src="{{ url_for('static', filename='js/jquery.js')}}"></script>
<script src="{{ url_for('static', filename='js/jquery.js')}}"></script>
<script src="{{ url_for('static', filename='js/jquery.dataTables.min.js')}}"></script>
<script src="{{ url_for('static', filename='js/dataTables.bootstrap.js')}}"></script>
<style>
.test thead{
background: #d91f2d;
color: #fff;
}
</style>
<script src="{{ url_for('static', filename='js/dataTables.bootstrap.min.js')}}"></script>
</head>
<body>
{% include 'navbar.html' %}
{% include 'nav_bar.html' %}
<div class="container-fluid">
<div class="row">
{% include 'crawler/menu_sidebar.html' %}
<div class="col-12 col-lg-10" id="core_content">
<div id="page-wrapper">
<div class="row">
<div class="col-md-6">
<div class="row">
<div class="panel panel-info">
<div class="panel-heading">
<div class="col-12 col-xl-6">
<div class="card mt-2">
<div class="card-header bg-dark">
<span class="badge badge-pill badge-light flex-row-reverse float-right">
{% if status %}
<div class="pull-right" style="color:Green;">
<i class="fa fa-check-circle fa-2x"></i>
<div style="color:Green;">
<i class="fas fa-check-circle fa-2x"></i>
UP
</div>
{% else %}
<div class="pull-right" style="color:Red;">
<i class="fa fa-times-circle fa-2x"></i>
<div style="color:Red;">
<i class="fas fa-times-circle fa-2x"></i>
DOWN
</div>
{% endif %}
<h3>{{ domain }} :</h3>
<ul class="list-group">
<li class="list-group-item">
<table class="table table-condensed">
</span>
<h3 class="card-title text-white">{{ domain }} :</h3>
</div>
<div class="card-body">
<table class="table table-responsive table-condensed">
<thead>
<tr>
<th>First Seen</th>
@ -63,76 +57,80 @@
</thead>
<tbody>
<tr>
<td class="panelText"><a href="#">{{ first_seen }}</a></td>
<td class="panelText"><a href="#">{{ last_check }}</a></td>
<td class="panelText">{{ first_seen }}</td>
<td class="panelText">{{ last_check }}</td>
</tr>
</tbody>
</table>
</li>
<li class="list-group-item">
Origin Paste: <a target="_blank" href="{{ url_for('showsavedpastes.showsavedpaste', paste=origin_paste) }}" />{{ origin_paste_name }}</a>
Origin Paste:
{% if origin_paste_name=='manual' or origin_paste_name=='auto' %}
<span class="badge badge-dark">{{ origin_paste_name }}</span>
{%else%}
<a class="badge badge-dark" target="_blank" href="{{ url_for('showsavedpastes.showsavedpaste', paste=origin_paste) }}" />{{ origin_paste_name }}</a>
{%endif%}
<div>
{% for tag in origin_paste_tags %}
<a href="{{ url_for('Tags.get_tagged_paste') }}?ltags={{ tag[1] }}">
<span class="label label-{{ bootstrap_label[loop.index0 % 5] }} pull-left">{{ tag[0] }}</span>
<a href="{{ url_for('Tags.Tags_page') }}?ltags={{ tag[1] }}">
<span class="badge badge-{{ bootstrap_label[loop.index0 % 5] }} pull-left">{{ tag[0] }}</span>
</a>
{% endfor %}
<br>
</div>
</li>
</ul>
</div>
</div>
<div>
{% for tag in domain_tags %}
<a href="{{ url_for('Tags.get_tagged_paste') }}?ltags={{ tag }}">
<span class="label label-{{ bootstrap_label[loop.index0 % 5] }} pull-left">{{ tag }} <i>{{ domain_tags[tag] }}</i></span>
<a href="{{ url_for('Tags.Tags_page') }}?ltags={{ tag }}">
<span class="badge badge-{{ bootstrap_label[loop.index0 % 5] }} pull-left">{{ tag }} <i>{{ domain_tags[tag] }}</i></span>
</a>
{% endfor %}
<br>
<br>
</div>
<table class="test table table-striped table-bordered table-hover table-responsive " id="myTable_">
<thead>
{% if l_pastes %}
<table class="table table-striped table-bordered table-hover" id="myTable_1">
<thead class="thead-dark">
<tr>
<th style="max-width: 800px;">Crawled Pastes</th>
<th>Crawled Pastes</th>
</tr>
</thead>
<tbody>
{% for path in l_pastes %}
<tr>
<td><a target="_blank" href="{{ url_for('showsavedpastes.showsavedpaste') }}?paste={{path}}">{{ path_name[loop.index0] }}</a>
<td>
<a target="_blank" href="{{ url_for('showsavedpastes.showsavedpaste') }}?paste={{path}}" class="text-secondary">
<div style="line-height:0.9;">{{ dict_links[path] }}</div>
</a>
<div>
{% for tag in paste_tags[loop.index0] %}
<a href="{{ url_for('Tags.get_tagged_paste') }}?ltags={{ tag[1] }}">
<span class="label label-{{ bootstrap_label[loop.index0 % 5] }} pull-left">{{ tag[0] }}</span>
<a href="{{ url_for('Tags.Tags_page') }}?ltags={{ tag[1] }}">
<span class="badge badge-{{ bootstrap_label[loop.index0 % 5] }} pull-left">{{ tag[0] }}</span>
</a>
{% endfor %}
</div>
</td>
</tr>
{% endfor %}
</tbody>
</table>
{%endif%}
</div>
</div>
<div class="col-md-6">
<div class="panel panel-info" style="text-align:center;">
<div class="panel-heading">
<div class="col-12 col-xl-6">
<div class="card my-2" style="background-color:#ecf0f1;">
<div class="card-body py-2">
<div class="row">
<div class="col-md-8">
<input class="center" id="blocks" type="range" min="1" max="50" value="13">
<input class="custom-range mt-2" id="blocks" type="range" min="1" max="50" value="13">
</div>
<div class="col-md-4">
<button class="btn btn-primary btn-tags" onclick="blocks.value=50;pixelate();">
<span class="glyphicon glyphicon-zoom-in"></span>
<button class="btn btn-primary" onclick="blocks.value=50;pixelate();">
<i class="fas fa-search-plu"></i>
<span class="label-icon">Full resolution</span>
</button>
</div>
@ -140,25 +138,47 @@
</div>
</div>
<canvas id="canvas" style="width:100%;"></canvas>
<div class="text-center">
<small>
<a target="_blank" href="{{ url_for('showsavedpastes.showsavedpaste') }}?paste={{screenshot['item']}}" class="text-info">
<div style="line-height:0.9;">{{dict_links[screenshot['item']]}}</div>
</a>
<small>
</div>
</div>
</div>
</div>
<!-- /#page-wrapper -->
</div>
</div>
</body>
<script>
var table;
$(document).ready(function(){
activePage = "page-hiddenServices"
$("#"+activePage).addClass("active");
table = $('#myTable_').DataTable(
table = $('#myTable_1').DataTable(
{
"aLengthMenu": [[5, 10, 15, 20, -1], [5, 10, 15, 20, "All"]],
"iDisplayLength": 5,
"order": [[ 0, "desc" ]]
}
);
//"aLengthMenu": [[5, 10, 15, 20, -1], [5, 10, 15, 20, "All"]],
//"iDisplayLength": 5,
//"order": [[ 0, "desc" ]]
});
});
function toggle_sidebar(){
if($('#nav_menu').is(':visible')){
$('#nav_menu').hide();
$('#side_menu').removeClass('border-right')
$('#side_menu').removeClass('col-lg-2')
$('#core_content').removeClass('col-lg-10')
}else{
$('#nav_menu').show();
$('#side_menu').addClass('border-right')
$('#side_menu').addClass('col-lg-2')
$('#core_content').addClass('col-lg-10')
}
}
</script>
<script>
@ -172,10 +192,9 @@
img.addEventListener("error", img_error);
var draw_img = false;
img.src = "{{ url_for('showsavedpastes.screenshot', filename=screenshot) }}";
img.src = "{{ url_for('showsavedpastes.screenshot', filename=screenshot['screenshot']) }}";
function pixelate() {
/// use slider value
if( blocks.value == 50 ){
size = 1;
@ -208,6 +227,4 @@
}
</script>
</body>
</html>

View file

@ -29,7 +29,7 @@ r_serv_metadata = Flask_config.r_serv_metadata
max_preview_char = Flask_config.max_preview_char
max_preview_modal = Flask_config.max_preview_modal
bootstrap_label = Flask_config.bootstrap_label
PASTES_FOLDER = Flask_config.PASTES_FOLDER
baseindexpath = os.path.join(os.environ['AIL_HOME'], cfg.get("Indexer", "path"))
indexRegister_path = os.path.join(os.environ['AIL_HOME'],
@ -133,8 +133,8 @@ def search():
query = QueryParser("content", ix.schema).parse("".join(q))
results = searcher.search_page(query, 1, pagelen=num_elem_to_get)
for x in results:
r.append(x.items()[0][1])
path = x.items()[0][1]
r.append(x.items()[0][1].replace(PASTES_FOLDER, '', 1))
path = x.items()[0][1].replace(PASTES_FOLDER, '', 1)
paste = Paste.Paste(path)
content = paste.get_p_content()
content_range = max_preview_char if len(content)>max_preview_char else len(content)-1
@ -208,6 +208,7 @@ def get_more_search_result():
results = searcher.search_page(query, page_offset, num_elem_to_get)
for x in results:
path = x.items()[0][1]
path = path.replace(PASTES_FOLDER, '', 1)
path_array.append(path)
paste = Paste.Paste(path)
content = paste.get_p_content()

View file

@ -101,7 +101,7 @@
<td><a target="_blank" href="{{ url_for('showsavedpastes.showsavedpaste') }}?paste={{path}}">{{ path }}</a>
<div>
{% for tag in paste_tags[loop.index0] %}
<a href="{{ url_for('Tags.get_tagged_paste') }}?ltags={{ tag[1] }}">
<a href="{{ url_for('Tags.Tags_page') }}?ltags={{ tag[1] }}">
<span class="label label-{{ bootstrap_label[loop.index0 % 5] }} pull-left">{{ tag[0] }}</span>
</a>
{% endfor %}
@ -201,7 +201,7 @@
var curr_preview = data.preview_array[i].replace(/\"/g, "\'");
var tag = ""
for(j=0; j<data.list_tags[i].length; j++) {
tag = tag + "<a href=\"{{ url_for('Tags.get_tagged_paste') }}?ltags=" + data.list_tags[i][j][1] + ">"
tag = tag + "<a href=\"{{ url_for('Tags.Tags_page') }}?ltags=" + data.list_tags[i][j][1] + ">"
+ "<span class=\"label label-" + data.bootstrap_label[j % 5] + " pull-left\">" + data.list_tags[i][j][0]
+ "</span>" + "</a>"
}

View file

@ -0,0 +1,129 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
'''
Flask functions and routes for the settings modules page
'''
from flask import Flask, render_template, jsonify, request, Blueprint, redirect, url_for
import json
import datetime
import git_status
# ============ VARIABLES ============
import Flask_config
app = Flask_config.app
cfg = Flask_config.cfg
baseUrl = Flask_config.baseUrl
r_serv_db = Flask_config.r_serv_db
max_preview_char = Flask_config.max_preview_char
max_preview_modal = Flask_config.max_preview_modal
REPO_ORIGIN = Flask_config.REPO_ORIGIN
dict_update_description = Flask_config.dict_update_description
settings = Blueprint('settings', __name__, template_folder='templates')
# ============ FUNCTIONS ============
def one():
return 1
#def get_v1.5_update_tags_backgroud_status():
# return '38%'
def get_git_metadata():
dict_git = {}
dict_git['current_branch'] = git_status.get_current_branch()
dict_git['is_clone'] = git_status.is_not_fork(REPO_ORIGIN)
dict_git['is_working_directory_clean'] = git_status.is_working_directory_clean()
dict_git['current_commit'] = git_status.get_last_commit_id_from_local()
dict_git['last_remote_commit'] = git_status.get_last_commit_id_from_remote()
dict_git['last_local_tag'] = git_status.get_last_tag_from_local()
dict_git['last_remote_tag'] = git_status.get_last_tag_from_remote()
if dict_git['current_commit'] != dict_git['last_remote_commit']:
dict_git['new_git_update_available'] = True
else:
dict_git['new_git_update_available'] = False
if dict_git['last_local_tag'] != dict_git['last_remote_tag']:
dict_git['new_git_version_available'] = True
else:
dict_git['new_git_version_available'] = False
return dict_git
def get_update_metadata():
dict_update = {}
dict_update['current_version'] = r_serv_db.get('ail:version')
dict_update['current_background_update'] = r_serv_db.get('ail:current_background_update')
dict_update['update_in_progress'] = r_serv_db.get('ail:update_in_progress')
dict_update['update_error'] = r_serv_db.get('ail:update_error')
if dict_update['update_in_progress']:
dict_update['update_progression'] = r_serv_db.scard('ail:update_{}'.format(dict_update['update_in_progress']))
dict_update['update_nb'] = dict_update_description[dict_update['update_in_progress']]['nb_background_update']
dict_update['update_stat'] = int(dict_update['update_progression']*100/dict_update['update_nb'])
dict_update['current_background_script'] = r_serv_db.get('ail:current_background_script')
dict_update['current_background_script_stat'] = r_serv_db.get('ail:current_background_script_stat')
return dict_update
# ============= ROUTES ==============
@settings.route("/settings/", methods=['GET'])
def settings_page():
git_metadata = get_git_metadata()
current_version = r_serv_db.get('ail:version')
update_metadata = get_update_metadata()
return render_template("settings_index.html", git_metadata=git_metadata,
current_version=current_version)
@settings.route("/settings/get_background_update_stats_json", methods=['GET'])
def get_background_update_stats_json():
# handle :end, error
update_stats = {}
current_update = r_serv_db.get('ail:current_background_update')
update_in_progress = r_serv_db.get('ail:update_in_progress')
if current_update:
update_stats['update_version']= current_update
update_stats['background_name']= r_serv_db.get('ail:current_background_script')
update_stats['background_stats']= r_serv_db.get('ail:current_background_script_stat')
if update_stats['background_stats'] is None:
update_stats['background_stats'] = 0
else:
update_stats['background_stats'] = int(update_stats['background_stats'])
update_progression = r_serv_db.scard('ail:update_{}'.format(current_update))
update_nb_scripts = dict_update_description[current_update]['nb_background_update']
update_stats['update_stat'] = int(update_progression*100/update_nb_scripts)
update_stats['update_stat_label'] = '{}/{}'.format(update_progression, update_nb_scripts)
if not update_in_progress:
update_stats['error'] = True
error_message = r_serv_db.get('ail:update_error')
if error_message:
update_stats['error_message'] = error_message
else:
update_stats['error_message'] = 'Please relaunch the bin/update-background.py script'
else:
if update_stats['background_name'] is None:
update_stats['error'] = True
update_stats['error_message'] = 'Please launch the bin/update-background.py script'
else:
update_stats['error'] = False
return jsonify(update_stats)
else:
return jsonify({})
# ========= REGISTRATION =========
app.register_blueprint(settings, url_prefix=baseUrl)

View file

@ -0,0 +1 @@
<li id='page-hiddenServices'><a href="{{ url_for('settings.settings_page') }}"><i class="fa fa-cog"></i> Server Management </a></li>

View file

@ -0,0 +1,196 @@
<!DOCTYPE html>
<html>
<head>
<title>Server Management - AIL</title>
<link rel="icon" href="{{ url_for('static', filename='image/ail-icon.png') }}">
<!-- Core CSS -->
<link href="{{ url_for('static', filename='css/bootstrap4.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/font-awesome.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/dataTables.bootstrap4.min.css') }}" rel="stylesheet">
<!-- JS -->
<script src="{{ url_for('static', filename='js/jquery.js')}}"></script>
<script src="{{ url_for('static', filename='js/popper.min.js')}}"></script>
<script src="{{ url_for('static', filename='js/bootstrap4.min.js')}}"></script>
<script src="{{ url_for('static', filename='js/jquery.dataTables.min.js')}}"></script>
<script src="{{ url_for('static', filename='js/dataTables.bootstrap.min.js')}}"></script>
</head>
<body>
{% include 'nav_bar.html' %}
<div class="container-fluid">
<div class="row">
{% include 'settings/menu_sidebar.html' %}
<div class="col-12 col-lg-10" id="core_content">
<div class="card mb-3 mt-1">
<div class="card-header text-white bg-dark pb-1">
<h5 class="card-title">AIL-framework Status :</h5>
</div>
<div class="card-body">
<div class="row">
<div class="col-xl-6">
<div class="card text-center border-secondary">
<div class="card-body px-1 py-0">
<table class="table table-sm">
<tbody>
<tr>
<td>AIL Version</td>
<td>{{current_version}}<a target="_blank" href="https://github.com/CIRCL/AIL-framework/releases/tag/{{current_version}}" class="text-info"><small> (release note)</small></a></td>
</tr>
<tr
{%if git_metadata['current_branch'] != 'master'%}
class="table-danger"
{%endif%}
>
<td>Current Branch</td>
<td>
{%if git_metadata['current_branch'] != 'master'%}
<i class="fas fa-times-circle text-danger" data-toggle="tooltip" data-placement="top" title="Please checkout the master branch"></i>&nbsp;
{%endif%}
{{git_metadata['current_branch']}}
</td>
</tr>
<tr
{%if git_metadata['new_git_update_available']%}
class="table-warning"
{%endif%}
>
<td>Current Commit ID</td>
<td>
{%if git_metadata['new_git_update_available']%}
<i class="fas fa-exclamation-triangle text-secondary" data-toggle="tooltip" data-placement="top" title="A New Update Is Available"></i>&nbsp;
{%endif%}
{{git_metadata['current_commit']}}
</td>
</tr>
<tr
{%if git_metadata['new_git_version_available']%}
class="table-danger"
{%endif%}
>
<td>Current Tag</td>
<td>
{%if git_metadata['new_git_version_available']%}
<i class="fas fa-exclamation-circle text-danger" data-toggle="tooltip" data-placement="top" title="A New Version Is Available"></i>&nbsp;&nbsp;
{%endif%}
{{git_metadata['last_local_tag']}}
</td>
</tr>
</tbody>
</table>
</div>
</div>
</div>
<div class="col-xl-6">
<div class="card text-center border-success" id="card_progress">
<div class="card-body" id="card_progress_body">
<h5 class="card-title">Backgroud Update: <span id="backgroud_update_version"></span></h5>
<div class="progress">
<div class="progress-bar bg-danger" role="progressbar" id="update_global_progress" aria-valuenow="0" aria-valuemin="0" aria-valuemax="100"></div>
</div>
<hr class="my-1">
Updating: <strong id="backgroud_update_name"></strong> ...
<div class="progress">
<div class="progress-bar progress-bar-striped bg-warning progress-bar-animated" role="progressbar" id="update_background_progress" aria-valuenow="0" aria-valuemin="0" aria-valuemax="100"></div>
</div>
<div class="text-danger" id="update_error_div">
<hr>
<h5 class="card-title"><i class="fas fa-times-circle text-danger"></i> Update Error:</h5>
<p id="update_error_mess"></p>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
{%if git_metadata['new_git_version_available']%}
<div class="alert alert-danger" role="alert">
<h4 class="alert-heading">New Version Available!</h4>
<hr class="my-0">
<p>A new version is available, new version: <strong>{{git_metadata['last_remote_tag']}}</strong></p>
<a target="_blank" href="https://github.com/CIRCL/AIL-framework/releases/tag/{{git_metadata['last_remote_tag']}}"> Check last release note.</a>
</div>
{%endif%}
{%if git_metadata['new_git_update_available']%}
<div class="alert alert-warning" role="alert">
<h4 class="alert-heading">New Update Available!</h4>
<hr class="my-0">
<p>A new update is available, new commit ID: <strong>{{git_metadata['last_remote_commit']}}</strong></p>
<a target="_blank" href="https://github.com/CIRCL/AIL-framework/commit/{{git_metadata['last_remote_commit']}}"> Check last commit content.</a>
</div>
{%endif%}
</div>
</div>
</div>
</body>
<script>
$(document).ready(function(){
$("#page-options").addClass("active");
} );
function toggle_sidebar(){
if($('#nav_menu').is(':visible')){
$('#nav_menu').hide();
$('#side_menu').removeClass('border-right')
$('#side_menu').removeClass('col-lg-2')
$('#core_content').removeClass('col-lg-10')
}else{
$('#nav_menu').show();
$('#side_menu').addClass('border-right')
$('#side_menu').addClass('col-lg-2')
$('#core_content').addClass('col-lg-10')
}
}
function update_progress(){
$.getJSON("{{ url_for('settings.get_background_update_stats_json') }}", function(data){
if(! jQuery.isEmptyObject(data)){
$('#card_progress').show();
$('#backgroud_update_version').text(data['update_version']);
$('#backgroud_update_name').text(data['background_name']);
$('#update_global_progress').attr('aria-valuenow', data['update_stat']).width(data['update_stat']+'%').text(data['update_stat_label']);
$('#update_background_progress').attr('aria-valuenow', data['background_stats']).width(data['background_stats']+'%').text(data['background_stats']+'%');
if(data['error']){
$('#update_error_div').show();
$('#update_error_mess').text(data['error_message']);
$('#card_progress').removeClass("border-success");
$('#card_progress').addClass("border-danger");
} else {
$('#update_error_div').hide();
$('#card_progress').removeClass("border-danger");
$('#card_progress').add("border-success");
}
} else {
$('#card_progress').hide();
clearInterval(progress_interval);
}
});
}
update_progress();
//Interval
var progress_interval = setInterval(function(){
update_progress()
}, 4000);
</script>
</html>

View file

@ -40,15 +40,22 @@ showsavedpastes = Blueprint('showsavedpastes', __name__, template_folder='templa
# ============ FUNCTIONS ============
def get_item_screenshot_path(item):
screenshot = r_serv_metadata.hget('paste_metadata:{}'.format(item), 'screenshot')
if screenshot:
screenshot = os.path.join(screenshot[0:2], screenshot[2:4], screenshot[4:6], screenshot[6:8], screenshot[8:10], screenshot[10:12], screenshot[12:])
return screenshot
def showpaste(content_range, requested_path):
relative_path = None
if PASTES_FOLDER not in requested_path:
relative_path = requested_path
requested_path = os.path.join(PASTES_FOLDER, requested_path)
# remove old full path
#requested_path = requested_path.replace(PASTES_FOLDER, '')
# remove full path
requested_path_full = os.path.join(requested_path, PASTES_FOLDER)
else:
requested_path_full = requested_path
requested_path = requested_path.replace(PASTES_FOLDER, '', 1)
# escape directory transversal
if os.path.commonprefix((os.path.realpath(requested_path),PASTES_FOLDER)) != PASTES_FOLDER:
if os.path.commonprefix((requested_path_full,PASTES_FOLDER)) != PASTES_FOLDER:
return 'path transversal detected'
vt_enabled = Flask_config.vt_enabled
@ -58,7 +65,7 @@ def showpaste(content_range, requested_path):
p_date = p_date[6:]+'/'+p_date[4:6]+'/'+p_date[0:4]
p_source = paste.p_source
p_encoding = paste._get_p_encoding()
p_language = paste._get_p_language()
p_language = 'None'
p_size = paste.p_size
p_mime = paste.p_mime
p_lineinfo = paste.get_lines_info()
@ -124,8 +131,6 @@ def showpaste(content_range, requested_path):
active_taxonomies = r_serv_tags.smembers('active_taxonomies')
l_tags = r_serv_metadata.smembers('tag:'+requested_path)
if relative_path is not None:
l_tags.union( r_serv_metadata.smembers('tag:'+relative_path) )
#active galaxies
active_galaxies = r_serv_tags.smembers('active_galaxies')
@ -154,7 +159,19 @@ def showpaste(content_range, requested_path):
if r_serv_metadata.scard('hash_paste:'+requested_path) > 0:
set_b64 = r_serv_metadata.smembers('hash_paste:'+requested_path)
for hash in set_b64:
nb_in_file = int(r_serv_metadata.zscore('nb_seen_hash:'+hash, requested_path))
nb_in_file = r_serv_metadata.zscore('nb_seen_hash:'+hash, requested_path)
# item list not updated
if nb_in_file is None:
l_pastes = r_serv_metadata.zrange('nb_seen_hash:'+hash, 0, -1)
for paste_name in l_pastes:
# dynamic update
if PASTES_FOLDER in paste_name:
score = r_serv_metadata.zscore('nb_seen_hash:{}'.format(hash), paste_name)
r_serv_metadata.zrem('nb_seen_hash:{}'.format(hash), paste_name)
paste_name = paste_name.replace(PASTES_FOLDER, '', 1)
r_serv_metadata.zadd('nb_seen_hash:{}'.format(hash), score, paste_name)
nb_in_file = r_serv_metadata.zscore('nb_seen_hash:'+hash, requested_path)
nb_in_file = int(nb_in_file)
estimated_type = r_serv_metadata.hget('metadata_hash:'+hash, 'estimated_type')
file_type = estimated_type.split('/')[0]
# set file icon
@ -189,7 +206,7 @@ def showpaste(content_range, requested_path):
crawler_metadata['domain'] = r_serv_metadata.hget('paste_metadata:'+requested_path, 'domain')
crawler_metadata['paste_father'] = r_serv_metadata.hget('paste_metadata:'+requested_path, 'father')
crawler_metadata['real_link'] = r_serv_metadata.hget('paste_metadata:'+requested_path,'real_link')
crawler_metadata['screenshot'] = paste.get_p_rel_path()
crawler_metadata['screenshot'] = get_item_screenshot_path(requested_path)
else:
crawler_metadata['get_metadata'] = False
@ -223,6 +240,141 @@ def showpaste(content_range, requested_path):
crawler_metadata=crawler_metadata,
l_64=l_64, vt_enabled=vt_enabled, misp=misp, hive=hive, misp_eventid=misp_eventid, misp_url=misp_url, hive_caseid=hive_caseid, hive_url=hive_url)
def get_item_basic_info(item):
item_basic_info = {}
item_basic_info['date'] = str(item.get_p_date())
item_basic_info['date'] = '{}/{}/{}'.format(item_basic_info['date'][0:4], item_basic_info['date'][4:6], item_basic_info['date'][6:8])
item_basic_info['source'] = item.get_item_source()
item_basic_info['size'] = item.get_item_size()
## TODO: FIXME ##performance
item_basic_info['encoding'] = item._get_p_encoding()
## TODO: FIXME ##performance
#item_basic_info['language'] = item._get_p_language()
## TODO: FIXME ##performance
info_line = item.get_lines_info()
item_basic_info['nb_lines'] = info_line[0]
item_basic_info['max_length_line'] = info_line[1]
return item_basic_info
def show_item_min(requested_path , content_range=0):
relative_path = None
if PASTES_FOLDER not in requested_path:
relative_path = requested_path
requested_path = os.path.join(PASTES_FOLDER, requested_path)
else:
relative_path = requested_path.replace(PASTES_FOLDER, '', 1)
# remove old full path
#requested_path = requested_path.replace(PASTES_FOLDER, '')
# escape directory transversal
if os.path.commonprefix((os.path.realpath(requested_path),PASTES_FOLDER)) != PASTES_FOLDER:
return 'path transversal detected'
item_info ={}
paste = Paste.Paste(requested_path)
item_basic_info = get_item_basic_info(paste)
item_info['nb_duplictates'] = paste.get_nb_duplicate()
## TODO: use this for fix ?
item_content = paste.get_p_content()
char_to_display = len(item_content)
if content_range != 0:
item_content = item_content[0:content_range]
vt_enabled = Flask_config.vt_enabled
p_hashtype_list = []
print(requested_path)
l_tags = r_serv_metadata.smembers('tag:'+relative_path)
if relative_path is not None:
l_tags.union( r_serv_metadata.smembers('tag:'+relative_path) )
item_info['tags'] = l_tags
item_info['name'] = relative_path.replace('/', ' / ')
l_64 = []
# load hash files
if r_serv_metadata.scard('hash_paste:'+relative_path) > 0:
set_b64 = r_serv_metadata.smembers('hash_paste:'+relative_path)
for hash in set_b64:
nb_in_file = r_serv_metadata.zscore('nb_seen_hash:'+hash, relative_path)
# item list not updated
if nb_in_file is None:
l_pastes = r_serv_metadata.zrange('nb_seen_hash:'+hash, 0, -1)
for paste_name in l_pastes:
# dynamic update
if PASTES_FOLDER in paste_name:
score = r_serv_metadata.zscore('nb_seen_hash:{}'.format(hash), paste_name)
r_serv_metadata.zrem('nb_seen_hash:{}'.format(hash), paste_name)
paste_name = paste_name.replace(PASTES_FOLDER, '', 1)
r_serv_metadata.zadd('nb_seen_hash:{}'.format(hash), score, paste_name)
nb_in_file = r_serv_metadata.zscore('nb_seen_hash:{}'.format(hash), relative_path)
nb_in_file = int(nb_in_file)
estimated_type = r_serv_metadata.hget('metadata_hash:'+hash, 'estimated_type')
file_type = estimated_type.split('/')[0]
# set file icon
if file_type == 'application':
file_icon = 'fa-file '
elif file_type == 'audio':
file_icon = 'fa-file-video '
elif file_type == 'image':
file_icon = 'fa-file-image'
elif file_type == 'text':
file_icon = 'fa-file-alt'
else:
file_icon = 'fa-file'
saved_path = r_serv_metadata.hget('metadata_hash:'+hash, 'saved_path')
if r_serv_metadata.hexists('metadata_hash:'+hash, 'vt_link'):
b64_vt = True
b64_vt_link = r_serv_metadata.hget('metadata_hash:'+hash, 'vt_link')
b64_vt_report = r_serv_metadata.hget('metadata_hash:'+hash, 'vt_report')
else:
b64_vt = False
b64_vt_link = ''
b64_vt_report = r_serv_metadata.hget('metadata_hash:'+hash, 'vt_report')
# hash never refreshed
if b64_vt_report is None:
b64_vt_report = ''
l_64.append( (file_icon, estimated_type, hash, saved_path, nb_in_file, b64_vt, b64_vt_link, b64_vt_report) )
crawler_metadata = {}
if 'infoleak:submission="crawler"' in l_tags:
crawler_metadata['get_metadata'] = True
crawler_metadata['domain'] = r_serv_metadata.hget('paste_metadata:'+relative_path, 'domain')
crawler_metadata['paste_father'] = r_serv_metadata.hget('paste_metadata:'+relative_path, 'father')
crawler_metadata['real_link'] = r_serv_metadata.hget('paste_metadata:'+relative_path,'real_link')
crawler_metadata['screenshot'] = get_item_screenshot_path(relative_path)
else:
crawler_metadata['get_metadata'] = False
misp_event = r_serv_metadata.get('misp_events:' + requested_path)
if misp_event is None:
misp_eventid = False
misp_url = ''
else:
misp_eventid = True
misp_url = misp_event_url + misp_event
hive_case = r_serv_metadata.get('hive_cases:' + requested_path)
if hive_case is None:
hive_caseid = False
hive_url = ''
else:
hive_caseid = True
hive_url = hive_case_url.replace('id_here', hive_case)
return render_template("show_saved_item_min.html", bootstrap_label=bootstrap_label, content=item_content,
item_basic_info=item_basic_info, item_info=item_info,
initsize=len(item_content),
hashtype_list = p_hashtype_list,
crawler_metadata=crawler_metadata,
l_64=l_64, vt_enabled=vt_enabled, misp_eventid=misp_eventid, misp_url=misp_url, hive_caseid=hive_caseid, hive_url=hive_url)
# ============ ROUTES ============
@showsavedpastes.route("/showsavedpaste/") #completely shows the paste in a new tab
@ -230,6 +382,11 @@ def showsavedpaste():
requested_path = request.args.get('paste', '')
return showpaste(0, requested_path)
@showsavedpastes.route("/showsaveditem_min/") #completely shows the paste in a new tab
def showsaveditem_min():
requested_path = request.args.get('paste', '')
return show_item_min(requested_path)
@showsavedpastes.route("/showsavedrawpaste/") #shows raw
def showsavedrawpaste():
requested_path = request.args.get('paste', '')
@ -241,7 +398,7 @@ def showsavedrawpaste():
def showpreviewpaste():
num = request.args.get('num', '')
requested_path = request.args.get('paste', '')
return "|num|"+num+"|num|"+showpaste(max_preview_modal, requested_path)
return "|num|"+num+"|num|"+show_item_min(requested_path, content_range=max_preview_modal)
@showsavedpastes.route("/getmoredata/")
@ -278,6 +435,7 @@ def send_file_to_vt():
paste = request.form['paste']
hash = request.form['hash']
## TODO: # FIXME: path transversal
b64_full_path = os.path.join(os.environ['AIL_HOME'], b64_path)
b64_content = ''
with open(b64_full_path, 'rb') as f:

View file

@ -0,0 +1,302 @@
<!DOCTYPE html>
<html lang="en">
<head>
<title>Paste information - AIL</title>
<link rel="icon" href="{{ url_for('static', filename='image/ail-icon.png') }}">
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<link href="{{ url_for('static', filename='css/bootstrap4.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/font-awesome.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/dataTables.bootstrap4.min.css') }}" rel="stylesheet">
<script language="javascript" src="{{ url_for('static', filename='js/jquery.js')}}"></script>
<script src="{{ url_for('static', filename='js/bootstrap.min.js') }}"></script>
<script src="{{ url_for('static', filename='js/jquery.dataTables.min.js') }}"></script>
<script src="{{ url_for('static', filename='js/dataTables.bootstrap.min.js') }}"></script>
<style>
.scrollable-menu {
height: auto;
max-height: 200px;
overflow-x: hidden;
width:100%;
}
.red_table thead{
background: #d91f2d;
color: #fff;
}
</style>
</head>
<body>
<div class="card mb-2">
<div class="card-header bg-dark">
<h3 class="text-white text-center" >{{ item_info['name'] }}</h3>
</div>
<div class="card-body pb-1">
<table class="table table-condensed table-responsive">
<thead class="">
<tr>
<th>Date</th>
<th>Source</th>
<th>Encoding</th>
<th>Size (Kb)</th>
<th>Number of lines</th>
<th>Max line length</th>
</tr>
</thead>
<tbody>
<tr>
<td>{{ item_basic_info['date'] }}</td>
<td>{{ item_basic_info['source'] }}</td>
<td>{{ item_basic_info['encoding'] }}</td>
<td>{{ item_basic_info['size'] }}</td>
<td>{{ item_basic_info['nb_lines'] }}</td>
<td>{{ item_basic_info['max_length_line'] }}</td>
</tr>
</tbody>
</table>
<div>
<h5>
{% for tag in item_info['tags'] %}
<span class="badge badge-{{ bootstrap_label[loop.index0 % 5] }}">{{ tag }}</span>
{% endfor %}
</h5>
</div>
</div>
</div>
{% if misp_eventid %}
<div class="list-group" id="misp_event">
<li class="list-group-item active">MISP Events already Created</li>
<a target="_blank" href="{{ misp_url }}" class="list-group-item">{{ misp_url }}</a>
</div>
{% endif %}
{% if hive_caseid %}
<div class="list-group" id="misp_event">
<li class="list-group-item active">The Hive Case already Created</li>
<a target="_blank" href="{{ hive_url }}" class="list-group-item">{{ hive_url }}</a>
</div>
{% endif %}
{% if item_info['nb_duplictates'] != 0 %}
<div id="accordionDuplicate" class="mb-2">
<div class="card">
<div class="card-header py-1" id="headingDuplicate">
<div class="my-1">
<i class="far fa-clone"></i> duplicates&nbsp;&nbsp;
<div class="badge badge-warning">{{item_info['nb_duplictates']}}</div>
</div>
</div>
</div>
</div>
{% endif %}
{% if l_64|length != 0 %}
<div id="accordionDecoded" class="mb-3">
<div class="card">
<div class="card-header py-1" id="headingDecoded">
<div class="row">
<div class="col-11">
<div class="mt-2">
<i class="fas fa-lock-open"></i> Decoded Files&nbsp;&nbsp;
<div class="badge badge-warning">{{l_64|length}}</div>
</div>
</div>
<div class="col-1">
<button class="btn btn-link py-2 rotate" data-toggle="collapse" data-target="#collapseDecoded" aria-expanded="true" aria-controls="collapseDecoded">
<i class="fas fa-chevron-circle-down"></i>
</button>
</div>
</div>
</div>
<div id="collapseDecoded" class="collapse show" aria-labelledby="headingDecoded" data-parent="#accordionDecoded">
<div class="card-body">
<table id="tableb64" class="red_table table table-striped">
<thead>
<tr>
<th>estimated type</th>
<th>hash</th>
</tr>
</thead>
<tbody>
{% for b64 in l_64 %}
<tr>
<td><i class="fas {{ b64[0] }}"></i>&nbsp;&nbsp;{{ b64[1] }}</td>
<td><a target="_blank" href="{{ url_for('hashDecoded.showHash') }}?hash={{ b64[2] }}">{{ b64[2] }}</a> ({{ b64[4] }})</td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
</div>
</div>
</div>
{% endif %}
{% if crawler_metadata['get_metadata'] %}
<div id="accordionCrawler" class="mb-3">
<div class="card">
<div class="card-header py-1" id="headingCrawled">
<div class="row">
<div class="col-11">
<div class="mt-2">
<i class="fas fa-spider"></i> Crawled Item
</div>
</div>
<div class="col-1">
<button class="btn btn-link py-2 rotate" data-toggle="collapse" data-target="#collapseCrawled" aria-expanded="true" aria-controls="collapseCrawled">
<i class="fas fa-chevron-circle-up"></i>
</button>
</div>
</div>
</div>
<div id="collapseCrawled" class="collapse show" aria-labelledby="headingCrawled" data-parent="#accordionCrawler">
<div class="card-body">
<table class="table table-hover table-striped">
<tbody>
<tr>
<td>Domain</td>
<td><a target="_blank" href="{{ url_for('hiddenServices.show_domain') }}?domain={{ crawler_metadata['domain'] }}" id='domain'>{{ crawler_metadata['domain'] }}</a></td>
</tr>
<tr>
<td>Father</td>
<td><a target="_blank" href="{{ url_for('showsavedpastes.showsavedpaste') }}?paste={{ crawler_metadata['paste_father'] }}" id='paste_father'>{{ crawler_metadata['paste_father'] }}</a></td>
</tr>
<tr>
<td>Url</td>
<td>{{ crawler_metadata['real_link'] }}</td>
</tr>
</tbody>
</table>
<div class="card my-2" style="background-color:#ecf0f1;">
<div class="card-body py-2">
<div class="row">
<div class="col-md-8">
<input class="custom-range mt-2" id="blocks" type="range" min="1" max="50" value="13">
</div>
<div class="col-md-4">
<button class="btn btn-primary" onclick="blocks.value=50;pixelate();">
<i class="fas fa-search-plu"></i>
<span class="label-icon">Full resolution</span>
</button>
</div>
</div>
</div>
</div>
<canvas id="canvas" style="width:100%;"></canvas>
</div>
</div>
</div>
</div>
{% endif %}
<div class="card bg-dark text-white">
<div class="card-header">
<div class="row">
<div class="col-10">
<h3> Content: </h3>
</div>
<div class="col-2">
<div class="mt-2">
<small><a class="text-info" href="{{ url_for('showsavedpastes.showsavedrawpaste') }}?paste={{ request.args.get('paste') }}" id='raw_paste' > [Raw content] </a></small>
</div>
</div>
</div>
</div>
</div>
<p class="my-0" data-initsize="{{ initsize }}"> <pre id="paste-holder" class="border">{{ content }}</pre></p>
<script>
var ltags
var ltagsgalaxies
$(document).ready(function(){
$('#tableDup').DataTable();
$('#tableb64').DataTable({
"aLengthMenu": [[5, 10, 15, -1], [5, 10, 15, "All"]],
"iDisplayLength": 5,
"order": [[ 1, "asc" ]]
});
$(".rotate").click(function(){
$(this).toggleClass("down") ;
})
});
</script>
{% if crawler_metadata['get_metadata'] %}
<script>
var ctx = canvas.getContext('2d'), img = new Image();
/// turn off image smoothing
ctx.webkitImageSmoothingEnabled = false;
ctx.imageSmoothingEnabled = false;
img.onload = pixelate;
img.addEventListener("error", img_error);
var draw_img = false;
img.src = "{{ url_for('showsavedpastes.screenshot', filename=crawler_metadata['screenshot']) }}";
function pixelate() {
/// use slider value
if( blocks.value == 50 ){
size = 1;
} else {
var size = (blocks.value) * 0.01;
}
canvas.width = img.width;
canvas.height = img.height;
/// cache scaled width and height
w = canvas.width * size;
h = canvas.height * size;
/// draw original image to the scaled size
ctx.drawImage(img, 0, 0, w, h);
/// pixelated
ctx.drawImage(canvas, 0, 0, w, h, 0, 0, canvas.width, canvas.height);
}
function img_error() {
img.onerror=null;
img.src="{{ url_for('static', filename='image/AIL.png') }}";
blocks.value = 50;
pixelate;
}
blocks.addEventListener('change', pixelate, false);
</script>
{% endif %}
<div id="container-show-more" class="text-center">
</div>
</html>
</body>
</html>

View file

@ -433,7 +433,7 @@
<tbody>
<tr>
<td>Domain</td>
<td><a target="_blank" href="{{ url_for('hiddenServices.onion_domain') }}?onion_domain={{ crawler_metadata['domain'] }}" id='onion_domain'>{{ crawler_metadata['domain'] }}</a></td>
<td><a target="_blank" href="{{ url_for('hiddenServices.show_domain') }}?domain={{ crawler_metadata['domain'] }}" id='domain'>{{ crawler_metadata['domain'] }}</a></td>
</tr>
<tr>
<td>Father</td>

View file

@ -281,14 +281,13 @@ def terms_management_query_paste():
p_date = str(paste._get_p_date())
p_date = p_date[0:4]+'/'+p_date[4:6]+'/'+p_date[6:8]
p_source = paste.p_source
p_encoding = paste._get_p_encoding()
p_size = paste.p_size
p_mime = paste.p_mime
p_lineinfo = paste.get_lines_info()
p_content = paste.get_p_content()
if p_content != 0:
p_content = p_content[0:400]
paste_info.append({"path": path, "date": p_date, "source": p_source, "encoding": p_encoding, "size": p_size, "mime": p_mime, "lineinfo": p_lineinfo, "content": p_content})
paste_info.append({"path": path, "date": p_date, "source": p_source, "size": p_size, "mime": p_mime, "lineinfo": p_lineinfo, "content": p_content})
return jsonify(paste_info)

View file

@ -131,7 +131,7 @@
<span class="term_name">{{ set }}</span>
<div>
{% for tag in notificationTagsTermMapping[set] %}
<a href="{{ url_for('Tags.get_tagged_paste') }}?ltags={{ tag }}">
<a href="{{ url_for('Tags.Tags_page') }}?ltags={{ tag }}">
<span class="label label-{{ bootstrap_label[loop.index0 % 5] }} pull-left">{{ tag }}</span>
</a>
{% endfor %}
@ -208,7 +208,7 @@
<span class="term_name">{{ regex }}</span>
<div>
{% for tag in notificationTagsTermMapping[regex] %}
<a href="{{ url_for('Tags.get_tagged_paste') }}?ltags={{ tag }}">
<a href="{{ url_for('Tags.Tags_page') }}?ltags={{ tag }}">
<span class="label label-{{ bootstrap_label[loop.index0 % 5] }} pull-left">{{ tag }}</span>
</a>
{% endfor %}
@ -285,7 +285,7 @@
<span class="term_name">{{ term }}</span>
<div>
{% for tag in notificationTagsTermMapping[term] %}
<a href="{{ url_for('Tags.get_tagged_paste') }}?ltags={{ tag }}">
<a href="{{ url_for('Tags.Tags_page') }}?ltags={{ tag }}">
<span class="label label-{{ bootstrap_label[loop.index0 % 5] }} pull-left">{{ tag }}</span>
</a>
{% endfor %}
@ -443,7 +443,7 @@ function bindEventsForCurrentPage() {
html_to_add += "<tr>";
html_to_add += "<th>Source</th>";
html_to_add += "<th>Date</th>";
html_to_add += "<th>Encoding</th>";
html_to_add += "<th>Mime</th>";
html_to_add += "<th>Size (Kb)</th>";
html_to_add += "<th># lines</th>";
html_to_add += "<th>Max length</th>";
@ -456,7 +456,7 @@ function bindEventsForCurrentPage() {
html_to_add += "<tr>";
html_to_add += "<td>"+curr_data.source+"</td>";
html_to_add += "<td>"+curr_data.date+"</td>";
html_to_add += "<td>"+curr_data.encoding+"</td>";
html_to_add += "<td>"+curr_data.mime+"</td>";
html_to_add += "<td>"+curr_data.size+"</td>";
html_to_add += "<td>"+curr_data.lineinfo[0]+"</td>";
html_to_add += "<td>"+curr_data.lineinfo[1]+"</td>";

View file

@ -1,6 +1,7 @@
.tag-ctn{
position: relative;
height: 30px;
height: 38px;
min-height: 38px;
padding: 0;
margin-bottom: 0px;
font-size: 14px;
@ -63,6 +64,7 @@
}
.tag-ctn .tag-empty-text{
color: #DDD;
width: 0;
}
.tag-ctn input:focus{
border: 0;

View file

@ -0,0 +1,48 @@
<div class="col-12 col-lg-2 p-0 bg-light border-right" id="side_menu">
<button type="button" class="btn btn-outline-secondary mt-1 ml-3" onclick="toggle_sidebar()">
<i class="fas fa-align-left"></i>
<span>Toggle Sidebar</span>
</button>
<nav class="navbar navbar-expand navbar-light bg-light flex-md-column flex-row align-items-start py-2" id="nav_menu">
<h5 class="d-flex text-muted w-100">
<span>Splash Crawlers </span>
<a class="ml-auto" href="{{url_for('hiddenServices.manual')}}">
<i class="fas fa-plus-circle ml-auto"></i>
</a>
</h5>
<ul class="nav flex-md-column flex-row navbar-nav justify-content-between w-100"> <!--nav-pills-->
<li class="nav-item">
<a class="nav-link" href="{{url_for('hiddenServices.dashboard')}}" id="nav_dashboard">
<i class="fas fa-search"></i>
<span>Dashboard</span>
</a>
</li>
<li class="nav-item">
<a class="nav-link" href="{{url_for('hiddenServices.Crawler_Splash_last_by_type')}}?type=onion" id="nav_onion_crawler">
<i class="fas fa-user-secret"></i>
Onion Crawler
</a>
</li>
<li class="nav-item">
<a class="nav-link" href="{{url_for('hiddenServices.Crawler_Splash_last_by_type')}}?type=regular" id="nav_regular_crawler">
<i class="fab fa-html5"></i>
Regular Crawler
</a>
</li>
<li class="nav-item">
<a class="nav-link" href="{{url_for('hiddenServices.manual')}}" id="nav_manual_crawler">
<i class="fas fa-spider"></i>
Manual Crawler
</a>
</li>
<li class="nav-item">
<a class="nav-link" href="{{url_for('hiddenServices.auto_crawler')}}" id="nav_auto_crawler">
<i class="fas fa-sync"></i>
Automatic Crawler
</a>
</li>
</ul>
</nav>
</div>

View file

@ -0,0 +1,48 @@
<nav class="navbar navbar-expand-xl navbar-dark bg-dark">
<a class="navbar-brand" href="{{ url_for('dashboard.index') }}">
<img src="{{ url_for('static', filename='image/ail-icon.png')}}" alt="AIL" style="width:80px;">
</a>
<button class="navbar-toggler" type="button" data-toggle="collapse" data-target="#navbarSupportedContent" aria-controls="navbarSupportedContent" aria-expanded="false" aria-label="Toggle navigation">
<span class="navbar-toggler-icon"></span>
</button>
<div class="collapse navbar-collapse" id="navbarSupportedContent">
<ul class="navbar-nav">
<li class="nav-item">
<a class="nav-link mr-3" href="{{ url_for('dashboard.index') }}">Home</a>
</li>
<li class="nav-item mr-3">
<a class="nav-link" href="{{ url_for('PasteSubmit.PasteSubmit_page') }}" aria-disabled="true"><i class="fas fa-external-link-alt"></i> Submit</a>
</li>
<li class="nav-item mr-3">
<a class="nav-link" id="page-Browse-Items" href="{{ url_for('Tags.Tags_page') }}" aria-disabled="true"><i class="fas fa-tag"></i> Browse Items</a>
</li>
<li class="nav-item mr-3">
<a class="nav-link" href="{{ url_for('terms.terms_management') }}" aria-disabled="true"><i class="fas fa-crosshairs"></i> Leaks Hunter</a>
</li>
<li class="nav-item mr-3">
<a class="nav-link" id="page-Crawler" href="{{ url_for('hiddenServices.dashboard') }}" tabindex="-1" aria-disabled="true"><i class="fas fa-spider"></i> Crawlers</a>
</li>
<li class="nav-item mr-3">
<a class="nav-link" href="{{ url_for('hashDecoded.hashDecoded_page') }}" aria-disabled="true"><i class="fas fa-lock-open"></i> Decoded</a>
</li>
<li class="nav-item mr-3">
<a class="nav-link" href="{{ url_for('trendingmodules.moduletrending') }}" aria-disabled="true"><i class="fas fa-chart-bar"></i> Statistics</a>
</li>
<li class="nav-item mr-3">
<a class="nav-link" id="page-options" href="{{ url_for('settings.settings_page') }}" aria-disabled="true"><i class="fas fa-cog"></i> Server Management</a>
</li>
</ul>
<form class="form-inline my-2 my-lg-0 ml-auto justify-content-center">
<div class="d-flex flex-column">
<div>
<input class="form-control mr-sm-2" type="search" id="global_search" placeholder="Search" aria-label="Search" aria-describedby="advanced_search">
<button class="btn btn-outline-info my-2 my-sm-0" type="submit"><i class="fas fa-search"></i></button>
</div>
<small id="advanced_search" class="form-text"><a class="nav text-muted" href="#" aria-disabled="true">Advanced Search</a></small>
</div>
</form>
</div>
</nav>

View file

@ -0,0 +1,13 @@
<div class="col-12 col-lg-2 p-0 bg-light border-right" id="side_menu">
<button type="button" class="btn btn-outline-secondary mt-1 ml-3" onclick="toggle_sidebar()">
<i class="fas fa-align-left"></i>
<span>Toggle Sidebar</span>
</button>
<nav class="navbar navbar-expand navbar-light bg-light flex-md-column flex-row align-items-start py-2" id="nav_menu">
<h5 class="d-flex text-muted w-100">
<span>Diagnostic</span>
</h5>
</nav>
</div>

View file

@ -0,0 +1,91 @@
<div class="col-12 col-lg-2 p-0 bg-light border-right" id="side_menu">
<button type="button" class="btn btn-outline-secondary mt-1 ml-3" onclick="toggle_sidebar()">
<i class="fas fa-align-left"></i>
<span>Toggle Sidebar</span>
</button>
<nav class="navbar navbar-expand navbar-light bg-light flex-md-column flex-row align-items-start py-2" id="nav_menu">
<h5 class="d-flex text-muted w-100">
<span>Tags Management </span>
</h5>
<ul class="nav flex-md-column flex-row navbar-nav justify-content-between w-100">
<li class="nav-item">
<a class="nav-link" href="{{ url_for('Tags.taxonomies') }}" id="nav_taxonomies">
<i class="fas fa-wrench"></i>
Edit Taxonomies List
</a>
</li>
<li class="nav-item">
<a class="nav-link" href="{{ url_for('Tags.galaxies') }}" id="nav_onion_galaxies">
<i class="fas fa-rocket"></i>
Edit Galaxies List
</a>
</li>
</ul>
<h5 class="d-flex text-muted w-100">
<span>Tags Export </span>
</h5>
<ul class="nav flex-md-column flex-row navbar-nav justify-content-between w-100">
<li class="nav-item">
<a class="nav-link" href="{{url_for('PasteSubmit.edit_tag_export')}}" id="nav_regular_edit_tag_export">
<i class="fas fa-cogs"></i>
MISP and Hive, auto push
</a>
</li>
</ul>
<h5 class="d-flex text-muted w-100" id="nav_quick_search">
<span>Quick Search </span>
</h5>
<ul class="nav flex-md-column flex-row navbar-nav justify-content-between w-100">
<li class="nav-item">
<a class="nav-link" href='{{url_for('Tags.Tags_page')}}?ltags=infoleak:automatic-detection="credential"' id='nav_tag_infoleakautomatic-detectioncredential'>
<i class="fas fa-unlock-alt"></i>
Credentials
</a>
</li>
<li class="nav-item">
<a class="nav-link" href='{{url_for('Tags.Tags_page')}}?ltags=infoleak:automatic-detection="credit-card"' id='nav_tag_infoleakautomatic-detectioncredit-card'>
<i class="far fa-credit-card"></i>
Credit cards
</a>
</li>
<li class="nav-item">
<a class="nav-link" href='{{url_for('Tags.Tags_page')}}?ltags=infoleak:automatic-detection="mail"' id='nav_tag_infoleakautomatic-detectionmail'>
<i class="fas fa-envelope"></i>
Mails
</a>
</li>
<li class="nav-item">
<a class="nav-link" href='{{url_for('Tags.Tags_page')}}?ltags=infoleak:automatic-detection="cve"' id='nav_tag_infoleakautomatic-detectioncve'>
<i class="fas fa-bug"></i>
CVEs
</a>
</li>
<li class="nav-item">
<a class="nav-link" href='{{url_for('Tags.Tags_page')}}?ltags=infoleak:automatic-detection="onion"' id='nav_tag_infoleakautomatic-detectiononion'>
<i class="fas fa-user-secret"></i>
Onions
</a>
</li>
<li class="nav-item">
<a class="nav-link" href='{{url_for('Tags.Tags_page')}}?ltags=infoleak:automatic-detection="bitcoin-address"' id='nav_tag_infoleakautomatic-detectionbitcoin-address'>
<i class="fab fa-bitcoin"></i>
Bitcoin
</a>
</li>
<li class="nav-item">
<a class="nav-link" href='{{url_for('Tags.Tags_page')}}?ltags=infoleak:automatic-detection="base64"' id='nav_tag_infoleakautomatic-detectionbase64'>
<i class="fas fa-lock-open"></i>
Base64
</a>
</li>
<li class="nav-item">
<a class="nav-link" href='{{url_for('Tags.Tags_page')}}?ltags=infoleak:automatic-detection="phone-number"' id='nav_tag_infoleakautomatic-detectionphone-number'>
<i class="fas fa-phone"></i>
Phones
</a>
</li>
</ul>
</nav>
</div>

View file

@ -5,33 +5,51 @@ set -e
wget http://dygraphs.com/dygraph-combined.js -O ./static/js/dygraph-combined.js
SBADMIN_VERSION='3.3.7'
FONT_AWESOME_VERSION='4.7.0'
BOOTSTRAP_VERSION='4.2.1'
FONT_AWESOME_VERSION='5.7.1'
D3_JS_VERSION='5.5.0'
rm -rf temp
mkdir temp
wget https://github.com/twbs/bootstrap/releases/download/v${BOOTSTRAP_VERSION}/bootstrap-${BOOTSTRAP_VERSION}-dist.zip -O temp/bootstrap${BOOTSTRAP_VERSION}.zip
wget https://github.com/FezVrasta/popper.js/archive/v1.14.3.zip -O temp/popper.zip
wget https://github.com/BlackrockDigital/startbootstrap-sb-admin/archive/v${SBADMIN_VERSION}.zip -O temp/${SBADMIN_VERSION}.zip
wget https://github.com/BlackrockDigital/startbootstrap-sb-admin-2/archive/v${SBADMIN_VERSION}.zip -O temp/${SBADMIN_VERSION}-2.zip
wget https://github.com/FortAwesome/Font-Awesome/archive/v${FONT_AWESOME_VERSION}.zip -O temp/FONT_AWESOME_${FONT_AWESOME_VERSION}.zip
wget https://github.com/FortAwesome/Font-Awesome/archive/v4.7.0.zip -O temp/FONT_AWESOME_4.7.0.zip
wget https://github.com/FortAwesome/Font-Awesome/archive/5.7.1.zip -O temp/FONT_AWESOME_${FONT_AWESOME_VERSION}.zip
wget https://github.com/d3/d3/releases/download/v${D3_JS_VERSION}/d3.zip -O temp/d3_${D3_JS_VERSION}.zip
# dateRangePicker
wget https://github.com/moment/moment/archive/2.22.2.zip -O temp/moment_2.22.2.zip
wget https://github.com/longbill/jquery-date-range-picker/archive/v0.18.0.zip -O temp/daterangepicker_v0.18.0.zip
unzip temp/bootstrap${BOOTSTRAP_VERSION}.zip -d temp/
unzip temp/popper.zip -d temp/
unzip temp/${SBADMIN_VERSION}.zip -d temp/
unzip temp/${SBADMIN_VERSION}-2.zip -d temp/
unzip temp/FONT_AWESOME_4.7.0.zip -d temp/
unzip temp/FONT_AWESOME_${FONT_AWESOME_VERSION}.zip -d temp/
unzip temp/d3_${D3_JS_VERSION}.zip -d temp/
unzip temp/moment_2.22.2.zip -d temp/
unzip temp/daterangepicker_v0.18.0.zip -d temp/
mv temp/bootstrap-${BOOTSTRAP_VERSION}-dist/js/bootstrap.min.js ./static/js/bootstrap4.min.js
mv temp/bootstrap-${BOOTSTRAP_VERSION}-dist/js/bootstrap.min.js.map ./static/js/bootstrap.min.js.map
mv temp/bootstrap-${BOOTSTRAP_VERSION}-dist/css/bootstrap.min.css ./static/css/bootstrap4.min.css
mv temp/bootstrap-${BOOTSTRAP_VERSION}-dist/css/bootstrap.min.css.map ./static/css/bootstrap4.min.css.map
mv temp/popper.js-1.14.3/dist/umd/popper.min.js ./static/js/
mv temp/popper.js-1.14.3/dist/umd/popper.min.js.map ./static/js/
mv temp/startbootstrap-sb-admin-${SBADMIN_VERSION} temp/sb-admin
mv temp/startbootstrap-sb-admin-2-${SBADMIN_VERSION} temp/sb-admin-2
mv temp/Font-Awesome-${FONT_AWESOME_VERSION} temp/font-awesome
mv temp/Font-Awesome-4.7.0 temp/font-awesome
rm -rf ./static/webfonts/
mv temp/Font-Awesome-${FONT_AWESOME_VERSION}/css/all.min.css ./static/css/font-awesome.min.css
mv temp/Font-Awesome-${FONT_AWESOME_VERSION}/webfonts ./static/webfonts
rm -rf ./static/js/plugins
mv temp/sb-admin/js/* ./static/js/
@ -59,6 +77,9 @@ wget https://cdn.datatables.net/1.10.12/js/jquery.dataTables.min.js -O ./static/
wget https://cdn.datatables.net/plug-ins/1.10.7/integration/bootstrap/3/dataTables.bootstrap.css -O ./static/css/dataTables.bootstrap.css
wget https://cdn.datatables.net/plug-ins/1.10.7/integration/bootstrap/3/dataTables.bootstrap.js -O ./static/js/dataTables.bootstrap.js
wget https://cdn.datatables.net/1.10.18/css/dataTables.bootstrap4.min.css -O ./static/css/dataTables.bootstrap.min.css
wget https://cdn.datatables.net/1.10.18/js/dataTables.bootstrap4.min.js -O ./static/js/dataTables.bootstrap.min.js
#Ressource for graph
wget https://raw.githubusercontent.com/flot/flot/958e5fd43c6dff4bab3e1fd5cb6109df5c1e8003/jquery.flot.js -O ./static/js/jquery.flot.js
wget https://raw.githubusercontent.com/flot/flot/958e5fd43c6dff4bab3e1fd5cb6109df5c1e8003/jquery.flot.pie.js -O ./static/js/jquery.flot.pie.js