chg: [merge] merge master in misp_modules

Terrtia 2019-11-17 15:22:30 +01:00
commit cd4f81ec64
No known key found for this signature in database
GPG key ID: 1E1B1F50D84613D0
135 changed files with 10034 additions and 4836 deletions

13
.gitignore vendored

@ -31,15 +31,24 @@ var/www/static/
!var/www/static/js/trendingchart.js
var/www/templates/header.html
var/www/submitted
var/www/server.crt
var/www/server.key
# Local config
bin/packages/config.cfg
bin/packages/config.cfg.backup
configs/keys
bin/packages/core.cfg
bin/packages/config.cfg.backup
configs/core.cfg
configs/core.cfg.backup
configs/update.cfg
update/current_version
files
# Helper
bin/helper/gen_cert/rootCA.*
bin/helper/gen_cert/server.*
# Pystemon archives
pystemon/archives


@ -25,7 +25,7 @@ Feed data to AIL:
3. Launch pystemon ``` ./pystemon ```
4. Edit your configuration file ```bin/packages/config.cfg``` and modify the pystemonpath path accordingly
4. Edit your configuration file ```configs/core.cfg``` and modify the pystemonpath path accordingly
5. Launch pystemon-feeder ``` ./bin/feeder/pystemon-feeder.py ```
@ -123,7 +123,7 @@ There are two types of installation. You can install a *local* or a *remote* Spl
(for a Linux Docker, the localhost IP is *172.17.0.1*; this should be adapted for other platforms)
- Restart the tor proxy: ``sudo service tor restart``
3. *(AIL host)* Edit the ``/bin/packages/config.cfg`` file:
3. *(AIL host)* Edit the ``/configs/core.cfg`` file:
- In the crawler section, set ``activate_crawler`` to ``True``
- Change the IP address of Splash servers if needed (remote only)
- Set ``splash_onion_port`` according to your Splash servers port numbers that will be used.
@ -134,7 +134,7 @@ There are two types of installation. You can install a *local* or a *remote* Spl
- *(Splash host)* Launch all Splash servers with:
```sudo ./bin/torcrawler/launch_splash_crawler.sh -f <config absolute_path> -p <port_start> -n <number_of_splash>```
With ``<port_start>`` and ``<number_of_splash>`` matching those specified at ``splash_onion_port`` in the configuration file of point 3 (``/bin/packages/config.cfg``)
With ``<port_start>`` and ``<number_of_splash>`` matching those specified at ``splash_onion_port`` in the configuration file of point 3 (``/configs/core.cfg``)
All Splash dockers are launched inside the ``Docker_Splash`` screen. You can use ``sudo screen -r Docker_Splash`` to connect to the screen session and check the status of all Splash servers.
@ -148,7 +148,7 @@ All Splash dockers are launched inside the ``Docker_Splash`` screen. You can use
- ```crawler_hidden_services_install.sh -y```
- Add the following line to ``/etc/tor/torrc``: ``SOCKSPolicy accept 172.17.0.0/16``
- ```sudo service tor restart```
- set activate_crawler to True in ``/bin/packages/config.cfg``
- set activate_crawler to True in ``/configs/core.cfg``
#### Start
- ```sudo ./bin/torcrawler/launch_splash_crawler.sh -f $AIL_HOME/configs/docker/splash_onion/etc/splash/proxy-profiles/ -p 8050 -n 1```
@ -166,4 +166,3 @@ Then starting the crawler service (if you follow the procedure above)
##### Python 3 Upgrade
To upgrade from an existing AIL installation, you have to launch [python3_upgrade.sh](./python3_upgrade.sh); this script will delete and create a new virtual environment. The script **will upgrade the packages but won't keep your previous data** (nevertheless, the data is copied into a directory called `old`). If you install from scratch, you don't need to launch [python3_upgrade.sh](./python3_upgrade.sh).


@ -38,6 +38,21 @@ Redis and ARDB overview
| failed_login_ip:**ip** | **nb login failed** | TTL
| failed_login_user_id:**user_id** | **nb login failed** | TTL
##### Item Import:
| Key | Value |
| ------ | ------ |
| **uuid**:nb_total | **nb total** | TTL *(if imported)*
| **uuid**:nb_end | **nb** | TTL *(if imported)*
| **uuid**:nb_sucess | **nb success** | TTL *(if imported)*
| **uuid**:end | **0 (in progress) or (item imported)** | TTL *(if imported)*
| **uuid**:processing | **process status: 0 or 1** | TTL *(if imported)*
| **uuid**:error | **error message** | TTL *(if imported)*
| Set Key | Value |
| ------ | ------ |
| **uuid**:paste_submit_link | **item_path** | TTL *(if imported)*
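A minimal sketch of how these import keys could be polled from a client, assuming a plain redis connection to the corresponding ARDB database (the host/port below are placeholders, not the real configuration):

```python
import redis

# Placeholder connection; the real parameters come from configs/core.cfg.
r = redis.StrictRedis(host='localhost', port=6382, decode_responses=True)

def import_status(import_uuid):
    """Poll the per-import counters documented above (all keys expire via TTL)."""
    return {
        'total': r.get('{}:nb_total'.format(import_uuid)),
        'done': r.get('{}:nb_end'.format(import_uuid)),
        'success': r.get('{}:nb_sucess'.format(import_uuid)),  # key name as documented
        'finished': r.get('{}:end'.format(import_uuid)),        # 0 = in progress
        'processing': r.get('{}:processing'.format(import_uuid)),
        'error': r.get('{}:error'.format(import_uuid)),
        'items': r.smembers('{}:paste_submit_link'.format(import_uuid)),
    }
```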
## DB0 - Core:
##### Update keys:
@ -92,47 +107,91 @@ Redis and ARDB overview
| ------ | ------ |
| misp_module:**module name** | **module dict** |
## DB2 - TermFreq:
##### Set:
##### Item Import:
| Key | Value |
| ------ | ------ |
| TrackedSetTermSet | **tracked_term** |
| TrackedSetSet | **tracked_set** |
| TrackedRegexSet | **tracked_regex** |
| | |
| tracked_**tracked_term** | **item_path** |
| set_**tracked_set** | **item_path** |
| regex_**tracked_regex** | **item_path** |
| | |
| TrackedNotifications | **tracked_term / set / regex** |
| | |
| TrackedNotificationTags_**tracked_term / set / regex** | **tag** |
| | |
| TrackedNotificationEmails_**tracked_term / set / regex** | **email** |
| **uuid**:isfile | **boolean** |
| **uuid**:paste_content | **item_content** |
##### Zset:
## DB2 - TermFreq:
| Set Key | Value |
| ------ | ------ |
| submitted:uuid | **uuid** |
| **uuid**:ltags | **tag** |
| **uuid**:ltagsgalaxies | **tag** |
## DB3 - Leak Hunter:
##### Tracker metadata:
| Hset - Key | Field | Value |
| ------ | ------ | ------ |
| tracker:**uuid** | tracker | **tracked word/set/regex** |
| | type | **word/set/regex** |
| | date | **date added** |
| | user_id | **created by user_id** |
| | dashboard | **0/1 Display alert on dashboard** |
| | description | **Tracker description** |
| | level | **0/1 Tracker visibility** |
##### Tracker by user_id (visibility level: user only):
| Set - Key | Value |
| ------ | ------ |
| user:tracker:**user_id** | **uuid - tracker uuid** |
| user:tracker:**user_id**:**word/set/regex - tracker type** | **uuid - tracker uuid** |
##### Global Tracker (visibility level: all users):
| Set - Key | Value |
| ------ | ------ |
| gobal:tracker | **uuid - tracker uuid** |
| gobal:tracker:**word/set/regex - tracker type** | **uuid - tracker uuid** |
##### All Tracker by type:
| Set - Key | Value |
| ------ | ------ |
| all:tracker:**word/set/regex - tracker type** | **tracked item** |
| Set - Key | Value |
| ------ | ------ |
| all:tracker_uuid:**tracker type**:**tracked item** | **uuid - tracker uuid** |
##### All Tracked items:
| Set - Key | Value |
| ------ | ------ |
| tracker:item:**uuid**:**date** | **item_id** |
##### All Tracked tags:
| Set - Key | Value |
| ------ | ------ |
| tracker:tags:**uuid** | **tag** |
##### All Tracked mail:
| Set - Key | Value |
| ------ | ------ |
| tracker:mail:**uuid** | **mail** |
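As a hedged illustration, the keys above could be read back as follows; the connection parameters are placeholders and the helper name is made up for this sketch:

```python
import redis

# Placeholder connection to DB3 - Leak Hunter.
r = redis.StrictRedis(host='localhost', port=6382, db=3, decode_responses=True)

def get_tracker(tracker_uuid, date):
    """Read a tracker's metadata, tags, notification mails and the items matched on a date."""
    return {
        'metadata': r.hgetall('tracker:{}'.format(tracker_uuid)),   # tracker / type / date / user_id / ...
        'tags': r.smembers('tracker:tags:{}'.format(tracker_uuid)),
        'mails': r.smembers('tracker:mail:{}'.format(tracker_uuid)),
        'items': r.smembers('tracker:item:{}:{}'.format(tracker_uuid, date)),
    }
```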
##### Refresh Tracker:
| Key | Value |
| ------ | ------ |
| tracker:refresh:word | **last refreshed epoch** |
| tracker:refresh:set | - |
| tracker:refresh:regex | - |
##### Zset Stat Tracker:
| Key | Field | Value |
| ------ | ------ | ------ |
| per_paste_TopTermFreq_set_month | **term** | **nb_seen** |
| per_paste_TopTermFreq_set_week | **term** | **nb_seen** |
| per_paste_TopTermFreq_set_day_**epoch** | **term** | **nb_seen** |
| | | |
| TopTermFreq_set_month | **term** | **nb_seen** |
| TopTermFreq_set_week | **term** | **nb_seen** |
| TopTermFreq_set_day_**epoch** | **term** | **nb_seen** |
| tracker:stat:**uuid** | **date** | **nb_seen** |
##### Hset:
##### Stat token:
| Key | Field | Value |
| ------ | ------ | ------ |
| TrackedTermDate | **tracked_term** | **epoch** |
| TrackedSetDate | **tracked_set** | **epoch** |
| TrackedRegexDate | **tracked_regex** | **epoch** |
| stat_token_total_by_day:**date** | **word** | **nb_seen** |
| | | |
| BlackListTermDate | **blacklisted_term** | **epoch** |
| | | |
| **epoch** | **term** | **nb_seen** |
| stat_token_per_item_by_day:**date** | **word** | **nb_seen** |
| Set - Key | Value |
| ------ | ------ |
| stat_token_history | **date** |
## DB6 - Tags:
@ -214,6 +273,9 @@ Redis and ARDB overview
| set_pgpdump_name:*name* | *item_path* |
| | |
| set_pgpdump_mail:*mail* | *item_path* |
| | |
| | |
| set_domain_pgpdump_**pgp_type**:**key** | **domain** |
##### Hset date:
| Key | Field | Value |
@ -241,11 +303,20 @@ Redis and ARDB overview
| item_pgpdump_name:*item_path* | *name* |
| | |
| item_pgpdump_mail:*item_path* | *mail* |
| | |
| | |
| domain_pgpdump_**pgp_type**:**domain** | **key** |
#### Cryptocurrency
Supported cryptocurrencies:
- bitcoin
- bitcoin-cash
- dash
- ethereum
- litecoin
- monero
- zcash
##### Hset:
| Key | Field | Value |
@ -256,7 +327,8 @@ Supported cryptocurrency:
##### set:
| Key | Value |
| ------ | ------ |
| set_cryptocurrency_**cryptocurrency name**:**cryptocurrency address** | **item_path** |
| set_cryptocurrency_**cryptocurrency name**:**cryptocurrency address** | **item_path** | PASTE
| domain_cryptocurrency_**cryptocurrency name**:**cryptocurrency address** | **domain** | DOMAIN
##### Hset date:
| Key | Field | Value |
@ -271,8 +343,14 @@ Supported cryptocurrency:
##### set:
| Key | Value |
| ------ | ------ |
| item_cryptocurrency_**cryptocurrency name**:**item_path** | **cryptocurrency address** |
| item_cryptocurrency_**cryptocurrency name**:**item_path** | **cryptocurrency address** | PASTE
| domain_cryptocurrency_**cryptocurrency name**:**item_path** | **cryptocurrency address** | DOMAIN
#### HASH
| Key | Value |
| ------ | ------ |
| hash_domain:**domain** | **hash** |
| domain_hash:**hash** | **domain** |
## DB9 - Crawler:
@ -315,6 +393,20 @@ Supported cryptocurrency:
}
```
##### CRAWLER QUEUES:
| SET - Key | Value |
| ------ | ------ |
| onion_crawler_queue | **url**;**item_id** | RE-CRAWL
| regular_crawler_queue | - |
| | |
| onion_crawler_priority_queue | **url**;**item_id** | USER
| regular_crawler_priority_queue | - |
| | |
| onion_crawler_discovery_queue | **url**;**item_id** | DISCOVER
| regular_crawler_discovery_queue | - |
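A short sketch of how a URL could be pushed into these queues; the ``url;item_id`` member format follows the table above, while the connection details and helper name are assumptions:

```python
import redis

# Placeholder connection to DB9 - Crawler.
r = redis.StrictRedis(host='localhost', port=6382, db=9, decode_responses=True)

def queue_onion(url, item_id, priority=False):
    """Add an onion URL to the re-crawl or user priority queue (both are Redis sets)."""
    queue = 'onion_crawler_priority_queue' if priority else 'onion_crawler_queue'
    r.sadd(queue, '{};{}'.format(url, item_id))
```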
##### TO CHANGE:
ARDB overview
----------------------------------------- SENTIMENT ------------------------------------


@ -42,8 +42,8 @@ Features
* Multiple feed support
* Each module can process and reprocess the information already processed by AIL
* Detecting and extracting URLs including their geographical location (e.g. IP address location)
* Extracting and validating potential leak of credit cards numbers, credentials, ...
* Extracting and validating email addresses leaked including DNS MX validation
* Extracting and validating potential leaks of credit card numbers, credentials, ...
* Extracting and validating leaked email addresses, including DNS MX validation
* Module for extracting Tor .onion addresses (to be further processed for analysis)
* Keep tracks of duplicates (and diffing between each duplicate found)
* Extracting and validating potential hostnames (e.g. to feed Passive DNS systems)
@ -103,7 +103,7 @@ Starting AIL
```bash
cd bin/
./LAUNCH -l
./LAUNCH.sh -l
```
You can then browse the status of the AIL framework at the following URL:
@ -119,6 +119,12 @@ Training
CIRCL organises training on how to use or extend the AIL framework. AIL training materials are available at [https://www.circl.lu/services/ail-training-materials/](https://www.circl.lu/services/ail-training-materials/).
API
-----
The API documentation is available in [doc/README.md](doc/README.md)
HOWTO
-----


@ -92,8 +92,6 @@ if __name__ == "__main__":
publisher.info("BankAccount started")
message = p.get_from_set()
#iban_regex = re.compile(r'\b[A-Za-z]{2}[0-9]{2}(?:[ ]?[0-9]{4}){4}(?:[ ]?[0-9]{1,2})?\b')
iban_regex = re.compile(r'\b([A-Za-z]{2}[ \-]?[0-9]{2})(?=(?:[ \-]?[A-Za-z0-9]){9,30})((?:[ \-]?[A-Za-z0-9]{3,5}){2,6})([ \-]?[A-Za-z0-9]{1,3})\b')
iban_regex_verify = re.compile(r'^([A-Z]{2})([0-9]{2})([A-Z0-9]{9,30})$')
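A self-contained sketch of how these two regexes might be combined: the first extracts IBAN-looking candidates from free text, the second checks the normalized (separators stripped, upper-cased) candidate; the sample string is purely illustrative and the module's own validation may differ:

```python
import re

iban_regex = re.compile(r'\b([A-Za-z]{2}[ \-]?[0-9]{2})(?=(?:[ \-]?[A-Za-z0-9]){9,30})((?:[ \-]?[A-Za-z0-9]{3,5}){2,6})([ \-]?[A-Za-z0-9]{1,3})\b')
iban_regex_verify = re.compile(r'^([A-Z]{2})([0-9]{2})([A-Z0-9]{9,30})$')

text = 'please transfer to DE89 3704 0044 0532 0130 00 tomorrow'  # illustrative sample
for match in iban_regex.finditer(text):
    candidate = ''.join(match.groups()).replace(' ', '').replace('-', '').upper()
    if iban_regex_verify.match(candidate):
        print('IBAN candidate:', candidate)
```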


@ -1,142 +0,0 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
"""
The Bitcoin Module
============================
It tries to extract Bitcoin addresses and secret keys from pastes
..seealso:: Paste method (get_regex)
Requirements
------------
*Need running Redis instances. (Redis).
"""
from packages import Paste
from Helper import Process
from pubsublogger import publisher
import re
import time
import redis
from hashlib import sha256
#### thanks to http://rosettacode.org/wiki/Bitcoin/address_validation#Python for these 2 functions
def decode_base58(bc, length):
n = 0
for char in bc:
n = n * 58 + digits58.index(char)
return n.to_bytes(length, 'big')
def check_bc(bc):
try:
bcbytes = decode_base58(bc, 25)
return bcbytes[-4:] == sha256(sha256(bcbytes[:-4]).digest()).digest()[:4]
except Exception:
return False
########################################################
def search_key(content, message, paste):
bitcoin_address = re.findall(regex_bitcoin_public_address, content)
bitcoin_private_key = re.findall(regex_bitcoin_private_key, content)
date = str(paste._get_p_date())
validate_address = False
key = False
if(len(bitcoin_address) >0):
#print(message)
for address in bitcoin_address:
if(check_bc(address)):
validate_address = True
print('Bitcoin address found : {}'.format(address))
if(len(bitcoin_private_key) > 0):
for private_key in bitcoin_private_key:
print('Bitcoin private key found : {}'.format(private_key))
key = True
# build bitcoin correlation
save_cryptocurrency_data('bitcoin', date, message, address)
if(validate_address):
p.populate_set_out(message, 'Duplicate')
to_print = 'Bitcoin found: {} address and {} private Keys'.format(len(bitcoin_address), len(bitcoin_private_key))
print(to_print)
publisher.warning(to_print)
msg = 'infoleak:automatic-detection="bitcoin-address";{}'.format(message)
p.populate_set_out(msg, 'Tags')
if(key):
msg = 'infoleak:automatic-detection="bitcoin-private-key";{}'.format(message)
p.populate_set_out(msg, 'Tags')
to_print = 'Bitcoin;{};{};{};'.format(paste.p_source, paste.p_date,
paste.p_name)
publisher.warning('{}Detected {} Bitcoin private key;{}'.format(
to_print, len(bitcoin_private_key),paste.p_rel_path))
def save_cryptocurrency_data(cryptocurrency_name, date, item_path, cryptocurrency_address):
# create basic metadata
if not serv_metadata.exists('cryptocurrency_metadata_{}:{}'.format(cryptocurrency_name, cryptocurrency_address)):
serv_metadata.hset('cryptocurrency_metadata_{}:{}'.format(cryptocurrency_name, cryptocurrency_address), 'first_seen', date)
serv_metadata.hset('cryptocurrency_metadata_{}:{}'.format(cryptocurrency_name, cryptocurrency_address), 'last_seen', date)
else:
last_seen = serv_metadata.hget('cryptocurrency_metadata_{}:{}'.format(cryptocurrency_name, cryptocurrency_address), 'last_seen')
if not last_seen:
serv_metadata.hset('cryptocurrency_metadata_{}:{}'.format(cryptocurrency_name, cryptocurrency_address), 'last_seen', date)
else:
if int(last_seen) < int(date):
serv_metadata.hset('cryptocurrency_metadata_{}:{}'.format(cryptocurrency_name, cryptocurrency_address), 'last_seen', date)
# global set
serv_metadata.sadd('set_cryptocurrency_{}:{}'.format(cryptocurrency_name, cryptocurrency_address), item_path)
# daily
serv_metadata.hincrby('cryptocurrency:{}:{}'.format(cryptocurrency_name, date), cryptocurrency_address, 1)
# all type
serv_metadata.zincrby('cryptocurrency_all:{}'.format(cryptocurrency_name), cryptocurrency_address, 1)
# item_metadata
serv_metadata.sadd('item_cryptocurrency_{}:{}'.format(cryptocurrency_name, item_path), cryptocurrency_address)
if __name__ == "__main__":
publisher.port = 6380
publisher.channel = "Script"
config_section = 'Bitcoin'
# Setup the I/O queues
p = Process(config_section)
serv_metadata = redis.StrictRedis(
host=p.config.get("ARDB_Metadata", "host"),
port=p.config.getint("ARDB_Metadata", "port"),
db=p.config.getint("ARDB_Metadata", "db"),
decode_responses=True)
# Sent to the logging a description of the module
publisher.info("Run Keys module ")
digits58 = '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz'
regex_bitcoin_public_address = re.compile(r'(?<![a-km-zA-HJ-NP-Z0-9])[13][a-km-zA-HJ-NP-Z0-9]{26,33}(?![a-km-zA-HJ-NP-Z0-9])')
regex_bitcoin_private_key = re.compile(r'[5KL][1-9A-HJ-NP-Za-km-z]{50,51}')
# Endless loop getting messages from the input queue
while True:
# Get one message from the input queue
message = p.get_from_set()
if message is None:
publisher.debug("{} queue is empty, waiting".format(config_section))
time.sleep(1)
continue
# Do something with the message from the queue
paste = Paste.Paste(message)
content = paste.get_p_content()
search_key(content, message, paste)
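For reference, the Base58Check validation performed by ``decode_base58``/``check_bc`` above can be exercised on the well-known genesis-block address; the helpers are reproduced here so the sketch runs on its own:

```python
from hashlib import sha256

digits58 = '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz'

def decode_base58(bc, length):
    # Decode a Base58 string into `length` big-endian bytes.
    n = 0
    for char in bc:
        n = n * 58 + digits58.index(char)
    return n.to_bytes(length, 'big')

def check_bc(bc):
    # Valid if the last 4 bytes equal the double-SHA256 checksum of the first 21 bytes.
    try:
        bcbytes = decode_base58(bc, 25)
        return bcbytes[-4:] == sha256(sha256(bcbytes[:-4]).digest()).digest()[:4]
    except Exception:
        return False

print(check_bc('1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa'))  # genesis-block address -> True
```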


@ -128,6 +128,13 @@ def get_elem_to_crawl(rotation_mode):
if message is not None:
domain_service_type = service_type
break
#load_discovery_queue
if message is None:
for service_type in rotation_mode:
message = redis_crawler.spop('{}_crawler_discovery_queue'.format(service_type))
if message is not None:
domain_service_type = service_type
break
#load_normal_queue
if message is None:
for service_type in rotation_mode:
@ -341,13 +348,27 @@ if __name__ == '__main__':
faup = Faup()
# get HAR files
default_crawler_har = p.config.getboolean("Crawler", "default_crawler_har")
if default_crawler_har:
default_crawler_har = 1
else:
default_crawler_har = 0
# get PNG files
default_crawler_png = p.config.getboolean("Crawler", "default_crawler_png")
if default_crawler_png:
default_crawler_png = 1
else:
default_crawler_png = 0
# Default crawler options
default_crawler_config = {'html': 1,
'har': 1,
'png': 1,
'har': default_crawler_har,
'png': default_crawler_png,
'depth_limit': p.config.getint("Crawler", "crawler_depth_limit"),
'closespider_pagecount': 50,
'user_agent': 'Mozilla/5.0 (Windows NT 6.1; rv:24.0) Gecko/20100101 Firefox/24.0'}
'closespider_pagecount': p.config.getint("Crawler", "default_crawler_closespider_pagecount"),
'user_agent': p.config.get("Crawler", "default_crawler_user_agent")}
# Track launched crawler
r_cache.sadd('all_crawler', splash_port)
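The new options come from the ``[Crawler]`` section of the configuration file; a hedged sketch of how they might be resolved outside the module (option names mirror the code above, the config path follows the configs/core.cfg rename elsewhere in this commit):

```python
import configparser
import os

config = configparser.ConfigParser()
config.read(os.path.join(os.environ['AIL_HOME'], 'configs/core.cfg'))

default_crawler_config = {
    'html': 1,
    'har': 1 if config.getboolean('Crawler', 'default_crawler_har') else 0,
    'png': 1 if config.getboolean('Crawler', 'default_crawler_png') else 0,
    'depth_limit': config.getint('Crawler', 'crawler_depth_limit'),
    'closespider_pagecount': config.getint('Crawler', 'default_crawler_closespider_pagecount'),
    'user_agent': config.get('Crawler', 'default_crawler_user_agent'),
}
```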

192
bin/Cryptocurrencies.py Executable file

@ -0,0 +1,192 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
"""
The Cryptocurrency Module
============================
It tries to extract cryptocurrency addresses and private keys from pastes
..seealso:: Paste method (get_regex)
Requirements
------------
*Need running Redis instances. (Redis).
"""
from Helper import Process
from pubsublogger import publisher
import os
import re
import sys
import time
import redis
import signal
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'packages'))
import Item
import Cryptocurrency
class TimeoutException(Exception):
pass
def timeout_handler(signum, frame):
raise TimeoutException
signal.signal(signal.SIGALRM, timeout_handler)
def search_crytocurrency(item_id, item_content):
is_cryptocurrency_found = False
for crypto_name in cryptocurrency_dict:
crypto_dict = cryptocurrency_dict[crypto_name]
signal.alarm(crypto_dict['max_execution_time'])
try:
crypto_addr = re.findall(crypto_dict['regex'], item_content)
except TimeoutException:
crypto_addr = []
p.incr_module_timeout_statistic() # add encoder type
print ("{0} processing timeout".format(item_id))
continue
else:
signal.alarm(0)
if crypto_addr:
is_valid_crypto_addr = False
# validate cryptocurrency address
for address in crypto_addr:
if(Cryptocurrency.verify_cryptocurrency_address(crypto_name, address)):
is_valid_crypto_addr = True
print('{} address found : {}'.format(crypto_name, address))
# build bitcoin correlation
Cryptocurrency.save_cryptocurrency_data(crypto_name, Item.get_item_date(item_id), item_id, address)
# At least one valid cryptocurrency address was found
if(is_valid_crypto_addr):
# valid cryptocurrency found in this item
is_cryptocurrency_found = True
# Tag Item
msg = '{};{}'.format(crypto_dict['tag'], item_id)
p.populate_set_out(msg, 'Tags')
# search cryptocurrency private key
if crypto_dict.get('private_key'):
signal.alarm(crypto_dict['private_key']['max_execution_time'])
try:
addr_private_key = re.findall(crypto_dict['private_key']['regex'], item_content)
except TimeoutException:
addr_private_key = []
p.incr_module_timeout_statistic() # add encoder type
print ("{0} processing timeout".format(item_id))
continue
else:
signal.alarm(0)
if addr_private_key:
# Tag Item
msg = '{};{}'.format(crypto_dict['private_key']['tag'], item_id)
p.populate_set_out(msg, 'Tags')
# debug
print(addr_private_key)
to_print = '{} found: {} address and {} private Keys'.format(crypto_name, len(crypto_addr), len(addr_private_key))
print(to_print)
publisher.warning(to_print)
to_print = 'Cryptocurrency;{};{};{};'.format(Item.get_source(item_id), Item.get_item_date(item_id), Item.get_item_basename(item_id))
publisher.warning('{}Detected {} {} private key;{}'.format(
to_print, len(addr_private_key), crypto_name, item_id))
if is_cryptocurrency_found:
# send to duplicate module
p.populate_set_out(item_id, 'Duplicate')
default_max_execution_time = 30
cryptocurrency_dict = {
'bitcoin': {
'name': 'bitcoin', # e.g. 1NbEPRwbBZrFDsx1QW19iDs8jQLevzzcms
'regex': r'\b(?<![+/=])[13][A-Za-z0-9]{26,33}(?![+/=])\b',
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="bitcoin-address"',
'private_key': {
'regex': r'\b(?<![+/=])[5KL][1-9A-HJ-NP-Za-km-z]{50,51}(?![+/=])\b',
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="bitcoin-private-key"',
},
},
'ethereum': {
'name': 'ethereum', # e.g. 0x8466b50B53c521d0B4B163d186596F94fB8466f1
'regex': r'\b(?<![+/=])0x[A-Za-z0-9]{40}(?![+/=])\b',
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="ethereum-address"',
},
'bitcoin-cash': {
'name': 'bitcoin-cash', # e.g. bitcoincash:pp8skudq3x5hzw8ew7vzsw8tn4k8wxsqsv0lt0mf3g
'regex': r'bitcoincash:[a-za0-9]{42}(?![+/=])\b',
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="bitcoin-cash-address"',
},
'litecoin': {
'name': 'litecoin', # e.g. MV5rN5EcX1imDS2gEh5jPJXeiW5QN8YrK3
'regex': r'\b(?<![+/=])[ML][A-Za-z0-9]{33}(?![+/=])\b',
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="litecoin-address"',
},
'monero': {
'name': 'monero', # e.g. 47JLdZWteNPFQPaGGNsqLBAU3qmTcWbRda4yJvaPTCB8JbY18MNrcmfCcxrfDF61Dm7pJc4bHbBW57URjwTWzTRH2RfsUB4
'regex': r'\b(?<![+/=()])4[A-Za-z0-9]{94}(?![+/=()])\b',
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="monero-address"',
},
'zcash': {
'name': 'zcash', # e.g. t1WvvNmFuKkUipcoEADNFvqamRrBec8rpUn
'regex': r'\b(?<![+/=()])t[12][A-Za-z0-9]{33}(?![+/=()])\b',
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="zcash-address"',
},
'dash': {
'name': 'dash', # e.g. XmNfXq2kDmrNBTiDTofohRemwGur1WmgTT
'regex': r'\b(?<![+/=])X[A-Za-z0-9]{33}(?![+/=])\b',
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="dash-address"',
}
}
if __name__ == "__main__":
publisher.port = 6380
publisher.channel = "Script"
config_section = 'Bitcoin'
# Setup the I/O queues
p = Process(config_section)
# Sent to the logging a description of the module
publisher.info("Run Cryptocurrency module ")
# Endless loop getting messages from the input queue
while True:
# Get one message from the input queue
item_id = p.get_from_set()
if item_id is None:
publisher.debug("{} queue is empty, waiting".format(config_section))
time.sleep(1)
continue
# Do something with the message from the queue
item_content = Item.get_item_content(item_id)
search_crytocurrency(item_id, item_content)


@ -1,184 +0,0 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
"""
This module is consuming the Redis-list created by the ZMQ_Sub_Curve_Q Module.
This module updates a .csv file used to draw curves representing selected
words and their occurrence per day.
..note:: The channel will have the name of the file created.
..note:: Module ZMQ_Something_Q and ZMQ_Something are closely bound, always put
the same Subscriber name in both of them.
This Module is also used for term frequency.
/!\ Top set management is done in the module Curve_manage_top_set
Requirements
------------
*Need running Redis instances. (Redis)
*Categories files of words in /files/ need to be created
*Need the ZMQ_PubSub_Tokenize_Q Module running to be able to work properly.
"""
import redis
import time
from pubsublogger import publisher
from packages import lib_words
import os
import datetime
import calendar
from Helper import Process
# Email notifications
from NotificationHelper import *
# Config Variables
BlackListTermsSet_Name = "BlackListSetTermSet"
TrackedTermsSet_Name = "TrackedSetTermSet"
top_term_freq_max_set_cardinality = 20 # Max cardinality of the terms frequences set
oneDay = 60*60*24
top_termFreq_setName_day = ["TopTermFreq_set_day_", 1]
top_termFreq_setName_week = ["TopTermFreq_set_week", 7]
top_termFreq_setName_month = ["TopTermFreq_set_month", 31]
top_termFreq_set_array = [top_termFreq_setName_day,top_termFreq_setName_week, top_termFreq_setName_month]
TrackedTermsNotificationTagsPrefix_Name = "TrackedNotificationTags_"
# create direct link in mail
full_paste_url = "/showsavedpaste/?paste="
def check_if_tracked_term(term, path):
if term in server_term.smembers(TrackedTermsSet_Name):
#add_paste to tracked_word_set
set_name = "tracked_" + term
server_term.sadd(set_name, path)
print(term, 'added', set_name, '->', path)
p.populate_set_out("New Term added", 'CurveManageTopSets')
# Send a notification only when the member is in the set
if term in server_term.smembers(TrackedTermsNotificationEnabled_Name):
# create mail body
mail_body = ("AIL Framework,\n"
"New occurrence for term: " + term + "\n"
''+full_paste_url + path)
# Send to every associated email address
for email in server_term.smembers(TrackedTermsNotificationEmailsPrefix_Name + term):
sendEmailNotification(email, 'Term', mail_body)
# tag paste
for tag in server_term.smembers(TrackedTermsNotificationTagsPrefix_Name + term):
msg = '{};{}'.format(tag, path)
p.populate_set_out(msg, 'Tags')
def getValueOverRange(word, startDate, num_day):
to_return = 0
for timestamp in range(startDate, startDate - num_day*oneDay, -oneDay):
value = server_term.hget(timestamp, word)
to_return += int(value) if value is not None else 0
return to_return
if __name__ == "__main__":
publisher.port = 6380
publisher.channel = "Script"
config_section = 'Curve'
p = Process(config_section)
# REDIS #
r_serv1 = redis.StrictRedis(
host=p.config.get("ARDB_Curve", "host"),
port=p.config.get("ARDB_Curve", "port"),
db=p.config.get("ARDB_Curve", "db"),
decode_responses=True)
server_term = redis.StrictRedis(
host=p.config.get("ARDB_TermFreq", "host"),
port=p.config.get("ARDB_TermFreq", "port"),
db=p.config.get("ARDB_TermFreq", "db"),
decode_responses=True)
# FUNCTIONS #
publisher.info("Script Curve started")
# create direct link in mail
full_paste_url = p.config.get("Notifications", "ail_domain") + full_paste_url
# FILE CURVE SECTION #
csv_path = os.path.join(os.environ['AIL_HOME'],
p.config.get("Directories", "wordtrending_csv"))
wordfile_path = os.path.join(os.environ['AIL_HOME'],
p.config.get("Directories", "wordsfile"))
message = p.get_from_set()
prec_filename = None
generate_new_graph = False
# Term Frequency
top_termFreq_setName_day = ["TopTermFreq_set_day_", 1]
top_termFreq_setName_week = ["TopTermFreq_set_week", 7]
top_termFreq_setName_month = ["TopTermFreq_set_month", 31]
while True:
if message is not None:
generate_new_graph = True
filename, word, score = message.split()
temp = filename.split('/')
date = temp[-4] + temp[-3] + temp[-2]
timestamp = calendar.timegm((int(temp[-4]), int(temp[-3]), int(temp[-2]), 0, 0, 0))
curr_set = top_termFreq_setName_day[0] + str(timestamp)
low_word = word.lower()
#Old curve with words in file
r_serv1.hincrby(low_word, date, int(score))
# Update redis
#consider the num of occurence of this term
curr_word_value = int(server_term.hincrby(timestamp, low_word, int(score)))
#1 term per paste
curr_word_value_perPaste = int(server_term.hincrby("per_paste_" + str(timestamp), low_word, int(1)))
# Add in set only if term is not in the blacklist
if low_word not in server_term.smembers(BlackListTermsSet_Name):
#consider the num of occurence of this term
server_term.zincrby(curr_set, low_word, float(score))
#1 term per paste
server_term.zincrby("per_paste_" + curr_set, low_word, float(1))
#Add more info for tracked terms
check_if_tracked_term(low_word, filename)
#send to RegexForTermsFrequency
to_send = "{} {} {}".format(filename, timestamp, word)
p.populate_set_out(to_send, 'RegexForTermsFrequency')
else:
if generate_new_graph:
generate_new_graph = False
print('Building graph')
today = datetime.date.today()
year = today.year
month = today.month
lib_words.create_curve_with_word_file(r_serv1, csv_path,
wordfile_path, year,
month)
publisher.debug("Script Curve is Idling")
print("sleeping")
time.sleep(10)
message = p.get_from_set()


@ -1,166 +0,0 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
"""
This module manage top sets for terms frequency.
Every 'refresh_rate' update the weekly and monthly set
"""
import redis
import time
import datetime
import copy
from pubsublogger import publisher
from packages import lib_words
import datetime
import calendar
import os
import configparser
# Config Variables
Refresh_rate = 60*5 #sec
BlackListTermsSet_Name = "BlackListSetTermSet"
TrackedTermsSet_Name = "TrackedSetTermSet"
top_term_freq_max_set_cardinality = 20 # Max cardinality of the terms frequences set
oneDay = 60*60*24
num_day_month = 31
num_day_week = 7
top_termFreq_setName_day = ["TopTermFreq_set_day_", 1]
top_termFreq_setName_week = ["TopTermFreq_set_week", 7]
top_termFreq_setName_month = ["TopTermFreq_set_month", 31]
top_termFreq_set_array = [top_termFreq_setName_day,top_termFreq_setName_week, top_termFreq_setName_month]
def manage_top_set():
startDate = datetime.datetime.now()
startDate = startDate.replace(hour=0, minute=0, second=0, microsecond=0)
startDate = calendar.timegm(startDate.timetuple())
blacklist_size = int(server_term.scard(BlackListTermsSet_Name))
dico = {}
dico_per_paste = {}
# Retrieve top data (max_card + blacklist_size) from day sets
for timestamp in range(startDate, startDate - top_termFreq_setName_month[1]*oneDay, -oneDay):
curr_set = top_termFreq_setName_day[0] + str(timestamp)
array_top_day = server_term.zrevrangebyscore(curr_set, '+inf', '-inf', withscores=True, start=0, num=top_term_freq_max_set_cardinality+blacklist_size)
array_top_day_per_paste = server_term.zrevrangebyscore("per_paste_" + curr_set, '+inf', '-inf', withscores=True, start=0, num=top_term_freq_max_set_cardinality+blacklist_size)
for word, value in array_top_day:
if word not in server_term.smembers(BlackListTermsSet_Name):
if word in dico.keys():
dico[word] += value
else:
dico[word] = value
for word, value in array_top_day_per_paste:
if word not in server_term.smembers(BlackListTermsSet_Name):
if word in dico_per_paste.keys():
dico_per_paste[word] += value
else:
dico_per_paste[word] = value
if timestamp == startDate - num_day_week*oneDay:
dico_week = copy.deepcopy(dico)
dico_week_per_paste = copy.deepcopy(dico_per_paste)
# convert dico into sorted array
array_month = []
for w, v in dico.items():
array_month.append((w, v))
array_month.sort(key=lambda tup: -tup[1])
array_month = array_month[0:20]
array_week = []
for w, v in dico_week.items():
array_week.append((w, v))
array_week.sort(key=lambda tup: -tup[1])
array_week = array_week[0:20]
# convert dico_per_paste into sorted array
array_month_per_paste = []
for w, v in dico_per_paste.items():
array_month_per_paste.append((w, v))
array_month_per_paste.sort(key=lambda tup: -tup[1])
array_month_per_paste = array_month_per_paste[0:20]
array_week_per_paste = []
for w, v in dico_week_per_paste.items():
array_week_per_paste.append((w, v))
array_week_per_paste.sort(key=lambda tup: -tup[1])
array_week_per_paste = array_week_per_paste[0:20]
# suppress every terms in top sets
for curr_set, curr_num_day in top_termFreq_set_array[1:3]:
for w in server_term.zrange(curr_set, 0, -1):
server_term.zrem(curr_set, w)
for w in server_term.zrange("per_paste_" + curr_set, 0, -1):
server_term.zrem("per_paste_" + curr_set, w)
# Add top term from sorted array in their respective sorted sets
for elem in array_week:
server_term.zadd(top_termFreq_setName_week[0], float(elem[1]), elem[0])
for elem in array_week_per_paste:
server_term.zadd("per_paste_" + top_termFreq_setName_week[0], float(elem[1]), elem[0])
for elem in array_month:
server_term.zadd(top_termFreq_setName_month[0], float(elem[1]), elem[0])
for elem in array_month_per_paste:
server_term.zadd("per_paste_" + top_termFreq_setName_month[0], float(elem[1]), elem[0])
timestamp = int(time.mktime(datetime.datetime.now().timetuple()))
value = str(timestamp) + ", " + "-"
r_temp.set("MODULE_"+ "CurveManageTopSets" + "_" + str(os.getpid()), value)
print("refreshed module")
if __name__ == '__main__':
# If you wish to use an other port of channel, do not forget to run a subscriber accordingly (see launch_logs.sh)
# Port of the redis instance used by pubsublogger
publisher.port = 6380
# Script is the default channel used for the modules.
publisher.channel = 'Script'
configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
if not os.path.exists(configfile):
raise Exception('Unable to find the configuration file. \
Did you set environment variables? \
Or activate the virtualenv.')
cfg = configparser.ConfigParser()
cfg.read(configfile)
# For Module Manager
r_temp = redis.StrictRedis(
host=cfg.get('RedisPubSub', 'host'),
port=cfg.getint('RedisPubSub', 'port'),
db=cfg.getint('RedisPubSub', 'db'),
decode_responses=True)
timestamp = int(time.mktime(datetime.datetime.now().timetuple()))
value = str(timestamp) + ", " + "-"
r_temp.set("MODULE_"+ "CurveManageTopSets" + "_" + str(os.getpid()), value)
r_temp.sadd("MODULE_TYPE_"+ "CurveManageTopSets" , str(os.getpid()))
server_term = redis.StrictRedis(
host=cfg.get("ARDB_TermFreq", "host"),
port=cfg.getint("ARDB_TermFreq", "port"),
db=cfg.getint("ARDB_TermFreq", "db"),
decode_responses=True)
publisher.info("Script Curve_manage_top_set started")
# Sent to the logging a description of the module
publisher.info("Manage the top sets with the data created by the module curve.")
manage_top_set()
while True:
# Get one message from the input queue (module only work if linked with a queue)
time.sleep(Refresh_rate) # sleep a long time then manage the set
manage_top_set()

57
bin/DbCleaner.py Executable file

@ -0,0 +1,57 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
"""
The DbCleaner Module
===================
"""
import os
import sys
import time
import datetime
from pubsublogger import publisher
import NotificationHelper
from packages import Date
from packages import Item
from packages import Term
def clean_term_db_stat_token():
all_stat_date = Term.get_all_token_stat_history()
list_date_to_keep = Date.get_date_range(31)
for date in all_stat_date:
if date not in list_date_to_keep:
# remove history
Term.delete_token_statistics_by_date(date)
print('Term Stats Cleaned')
if __name__ == "__main__":
publisher.port = 6380
publisher.channel = "Script"
publisher.info("DbCleaner started")
# low priority
time.sleep(180)
daily_cleaner = True
current_date = datetime.datetime.now().strftime("%Y%m%d")
while True:
if daily_cleaner:
clean_term_db_stat_token()
daily_cleaner = False
else:
sys.exit(0)
time.sleep(600)
new_date = datetime.datetime.now().strftime("%Y%m%d")
if new_date != current_date:
current_date = new_date
daily_cleaner = True


@ -18,6 +18,7 @@ from pubsublogger import publisher
from Helper import Process
from packages import Paste
from packages import Item
import re
import signal
@ -120,6 +121,12 @@ def save_hash(decoder_name, message, date, decoded):
serv_metadata.zincrby('nb_seen_hash:'+hash, message, 1)# hash - paste map
serv_metadata.zincrby(decoder_name+'_hash:'+hash, message, 1) # number of b64 on this paste
# Domain Object
if Item.is_crawled(message):
domain = Item.get_item_domain(message)
serv_metadata.sadd('hash_domain:{}'.format(domain), hash) # domain - hash map
serv_metadata.sadd('domain_hash:{}'.format(hash), domain) # hash - domain map
def save_hash_on_disk(decode, type, hash, json_data):


@ -1,48 +0,0 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
import argparse
import redis
from pubsublogger import publisher
from packages.lib_words import create_dirfile
import configparser
def main():
"""Main Function"""
# CONFIG #
cfg = configparser.ConfigParser()
cfg.read('./packages/config.cfg')
parser = argparse.ArgumentParser(
description='''This script is a part of the Analysis Information Leak
framework. It create a redis list called "listfile" which contain
the absolute filename of all the files from the directory given in
the argument "directory".''',
epilog='Example: ./Dir.py /home/2013/03/')
parser.add_argument('directory', type=str,
help='The directory to run inside', action='store')
parser.add_argument('-db', type=int, default=0,
help='The name of the Redis DB (default 0)',
choices=[0, 1, 2, 3, 4], action='store')
parser.add_argument('-ow', help='trigger the overwritting mode',
action='store_true')
args = parser.parse_args()
r_serv = redis.StrictRedis(host=cfg.get("Redis_Queues", "host"),
port=cfg.getint("Redis_Queues", "port"),
db=cfg.getint("Redis_Queues", "db"),
decode_responses=True)
publisher.port = 6380
publisher.channel = "Script"
create_dirfile(r_serv, args.directory, args.ow)
if __name__ == "__main__":
main()


@ -20,10 +20,10 @@ import datetime
import json
class PubSub(object):
class PubSub(object): ## TODO: remove config, use ConfigLoader by default
def __init__(self):
configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
configfile = os.path.join(os.environ['AIL_HOME'], 'configs/core.cfg')
if not os.path.exists(configfile):
raise Exception('Unable to find the configuration file. \
Did you set environment variables? \
@ -58,7 +58,6 @@ class PubSub(object):
for address in addresses.split(','):
new_sub = context.socket(zmq.SUB)
new_sub.connect(address)
# bytes64 encode bytes to ascii only bytes
new_sub.setsockopt_string(zmq.SUBSCRIBE, channel)
self.subscribers.append(new_sub)
@ -112,7 +111,7 @@ class PubSub(object):
class Process(object):
def __init__(self, conf_section, module=True):
configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
configfile = os.path.join(os.environ['AIL_HOME'], 'configs/core.cfg')
if not os.path.exists(configfile):
raise Exception('Unable to find the configuration file. \
Did you set environment variables? \

81
bin/IPAddress.py Executable file

@ -0,0 +1,81 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
"""
The IP Module
======================
This module is consuming the global channel.
It first performs a regex to find IP addresses and then matches those IPs to
some configured ip ranges.
The list of IP ranges is expected to be in CIDR format (e.g. 192.168.0.0/16)
and should be defined in the config.cfg file, under the [IP] section
"""
import time
import re
from pubsublogger import publisher
from packages import Paste
from Helper import Process
from ipaddress import IPv4Network, IPv4Address
def search_ip(message):
paste = Paste.Paste(message)
content = paste.get_p_content()
# regex to find IPs
reg_ip = re.compile(r'^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)', flags=re.MULTILINE)
# list of the regex results in the Paste, may be null
results = reg_ip.findall(content)
matching_ips = []
for res in results:
address = IPv4Address(res)
for network in ip_networks:
if address in network:
matching_ips.append(address)
if len(matching_ips) > 0:
print('{} contains {} IPs'.format(paste.p_name, len(matching_ips)))
publisher.warning('{} contains {} IPs'.format(paste.p_name, len(matching_ips)))
#Tag message with IP
msg = 'infoleak:automatic-detection="ip";{}'.format(message)
p.populate_set_out(msg, 'Tags')
#Send to duplicate
p.populate_set_out(message, 'Duplicate')
if __name__ == '__main__':
# If you wish to use an other port of channel, do not forget to run a subscriber accordingly (see launch_logs.sh)
# Port of the redis instance used by pubsublogger
publisher.port = 6380
# Script is the default channel used for the modules.
publisher.channel = 'Script'
# Section name in bin/packages/modules.cfg
config_section = 'IP'
# Setup the I/O queues
p = Process(config_section)
ip_networks = []
for network in p.config.get("IP", "networks").split(","):
ip_networks.append(IPv4Network(network))
# Sent to the logging a description of the module
publisher.info("Run IP module")
# Endless loop getting messages from the input queue
while True:
# Get one message from the input queue
message = p.get_from_set()
if message is None:
publisher.debug("{} queue is empty, waiting".format(config_section))
time.sleep(1)
continue
# Do something with the message from the queue
search_ip(message)


@ -121,6 +121,13 @@ def search_key(paste):
p.populate_set_out(msg, 'Tags')
find = True
if '-----BEGIN PUBLIC KEY-----' in content:
publisher.warning('{} has a public key message'.format(paste.p_name))
msg = 'infoleak:automatic-detection="public-key";{}'.format(message)
p.populate_set_out(msg, 'Tags')
find = True
# pgp content
if get_pgp_content:
p.populate_set_out(message, 'PgpDump')


@ -66,8 +66,8 @@ function helptext {
"$DEFAULT"
This script launch:
"$CYAN"
- All the ZMQ queuing modules.
- All the ZMQ processing modules.
- All the queuing modules.
- All the processing modules.
- All Redis in memory servers.
- All ARDB on disk servers.
"$DEFAULT"
@ -76,12 +76,15 @@ function helptext {
Usage:
-----
LAUNCH.sh
[-l | --launchAuto]
[-k | --killAll]
[-u | --update]
[-c | --configUpdate]
[-t | --thirdpartyUpdate]
[-h | --help]
[-l | --launchAuto] LAUNCH DB + Scripts
[-k | --killAll] Kill DB + Scripts
[-ks | --killscript] Scripts
[-u | --update] Update AIL
[-c | --crawler] LAUNCH Crawlers
[-f | --launchFeeder] LAUNCH Pystemon feeder
[-t | --thirdpartyUpdate] Update Web
[-m | --menu] Display Advanced Menu
[-h | --help] Help
"
}
@ -143,7 +146,7 @@ function launching_scripts {
screen -dmS "Script_AIL"
sleep 0.1
echo -e $GREEN"\t* Launching ZMQ scripts"$DEFAULT
echo -e $GREEN"\t* Launching scripts"$DEFAULT
screen -S "Script_AIL" -X screen -t "ModuleInformation" bash -c "cd ${AIL_BIN}; ${ENV_PY} ./ModulesInformationV2.py -k 0 -c 1; read x"
sleep 0.1
@ -153,14 +156,10 @@ function launching_scripts {
sleep 0.1
screen -S "Script_AIL" -X screen -t "Duplicates" bash -c "cd ${AIL_BIN}; ${ENV_PY} ./Duplicates.py; read x"
sleep 0.1
screen -S "Script_AIL" -X screen -t "Lines" bash -c "cd ${AIL_BIN}; ${ENV_PY} ./Lines.py; read x"
sleep 0.1
screen -S "Script_AIL" -X screen -t "DomClassifier" bash -c "cd ${AIL_BIN}; ${ENV_PY} ./DomClassifier.py; read x"
sleep 0.1
screen -S "Script_AIL" -X screen -t "Categ" bash -c "cd ${AIL_BIN}; ${ENV_PY} ./Categ.py; read x"
sleep 0.1
screen -S "Script_AIL" -X screen -t "Tokenize" bash -c "cd ${AIL_BIN}; ${ENV_PY} ./Tokenize.py; read x"
sleep 0.1
screen -S "Script_AIL" -X screen -t "CreditCards" bash -c "cd ${AIL_BIN}; ${ENV_PY} ./CreditCards.py; read x"
sleep 0.1
screen -S "Script_AIL" -X screen -t "BankAccount" bash -c "cd ${AIL_BIN}; ${ENV_PY} ./BankAccount.py; read x"
@ -175,13 +174,9 @@ function launching_scripts {
sleep 0.1
screen -S "Script_AIL" -X screen -t "Credential" bash -c "cd ${AIL_BIN}; ${ENV_PY} ./Credential.py; read x"
sleep 0.1
screen -S "Script_AIL" -X screen -t "Curve" bash -c "cd ${AIL_BIN}; ${ENV_PY} ./Curve.py; read x"
screen -S "Script_AIL" -X screen -t "TermTrackerMod" bash -c "cd ${AIL_BIN}; ${ENV_PY} ./TermTrackerMod.py; read x"
sleep 0.1
screen -S "Script_AIL" -X screen -t "CurveManageTopSets" bash -c "cd ${AIL_BIN}; ${ENV_PY} ./CurveManageTopSets.py; read x"
sleep 0.1
screen -S "Script_AIL" -X screen -t "RegexForTermsFrequency" bash -c "cd ${AIL_BIN}; ${ENV_PY} ./RegexForTermsFrequency.py; read x"
sleep 0.1
screen -S "Script_AIL" -X screen -t "SetForTermsFrequency" bash -c "cd ${AIL_BIN}; ${ENV_PY} ./SetForTermsFrequency.py; read x"
screen -S "Script_AIL" -X screen -t "RegexTracker" bash -c "cd ${AIL_BIN}; ${ENV_PY} ./RegexTracker.py; read x"
sleep 0.1
screen -S "Script_AIL" -X screen -t "Indexer" bash -c "cd ${AIL_BIN}; ${ENV_PY} ./Indexer.py; read x"
sleep 0.1
@ -191,7 +186,9 @@ function launching_scripts {
sleep 0.1
screen -S "Script_AIL" -X screen -t "Decoder" bash -c "cd ${AIL_BIN}; ${ENV_PY} ./Decoder.py; read x"
sleep 0.1
screen -S "Script_AIL" -X screen -t "Bitcoin" bash -c "cd ${AIL_BIN}; ${ENV_PY} ./Bitcoin.py; read x"
screen -S "Script_AIL" -X screen -t "Cryptocurrency" bash -c "cd ${AIL_BIN}; ${ENV_PY} ./Cryptocurrencies.py; read x"
sleep 0.1
screen -S "Script_AIL" -X screen -t "Tools" bash -c "cd ${AIL_BIN}; ${ENV_PY} ./Tools.py; read x"
sleep 0.1
screen -S "Script_AIL" -X screen -t "Phone" bash -c "cd ${AIL_BIN}; ${ENV_PY} ./Phone.py; read x"
sleep 0.1
@ -213,15 +210,19 @@ function launching_scripts {
sleep 0.1
screen -S "Script_AIL" -X screen -t "SentimentAnalysis" bash -c "cd ${AIL_BIN}; ${ENV_PY} ./SentimentAnalysis.py; read x"
sleep 0.1
screen -S "Script_AIL" -X screen -t "DbCleaner" bash -c "cd ${AIL_BIN}; ${ENV_PY} ./DbCleaner.py; read x"
sleep 0.1
screen -S "Script_AIL" -X screen -t "UpdateBackground" bash -c "cd ${AIL_BIN}; ${ENV_PY} ./update-background.py; read x"
sleep 0.1
screen -S "Script_AIL" -X screen -t "SubmitPaste" bash -c "cd ${AIL_BIN}; ${ENV_PY} ./submit_paste.py; read x"
sleep 0.1
screen -S "Script_AIL" -X screen -t "IPAddress" bash -c "cd ${AIL_BIN}; ${ENV_PY} ./IPAddress.py; read x"
}
function launching_crawler {
if [[ ! $iscrawler ]]; then
CONFIG=$AIL_BIN/packages/config.cfg
CONFIG=$AIL_HOME/configs/core.cfg
lport=$(awk '/^\[Crawler\]/{f=1} f==1&&/^splash_port/{print $3;exit}' "${CONFIG}")
IFS='-' read -ra PORTS <<< "$lport"
@ -404,6 +405,18 @@ function launch_feeder {
fi
}
function killscript {
if [[ $islogged || $isqueued || $isscripted || $isflasked || $isfeeded || $iscrawler ]]; then
echo -e $GREEN"Killing Script"$DEFAULT
kill $islogged $isqueued $isscripted $isflasked $isfeeded $iscrawler
sleep 0.2
echo -e $ROSE`screen -ls`$DEFAULT
echo -e $GREEN"\t* $islogged $isqueued $isscripted $isflasked $isfeeded $iscrawler killed."$DEFAULT
else
echo -e $RED"\t* No script to kill"$DEFAULT
fi
}
function killall {
if [[ $isredis || $isardb || $islogged || $isqueued || $isscripted || $isflasked || $isfeeded || $iscrawler ]]; then
if [[ $isredis ]]; then
@ -463,76 +476,82 @@ function launch_all {
launch_flask;
}
#If no params, display the menu
function menu_display {
options=("Redis" "Ardb" "Logs" "Queues" "Scripts" "Flask" "Killall" "Shutdown" "Update" "Update-config" "Update-thirdparty")
menu() {
echo "What do you want to Launch?:"
for i in ${!options[@]}; do
printf "%3d%s) %s\n" $((i+1)) "${choices[i]:- }" "${options[i]}"
done
[[ "$msg" ]] && echo "$msg"; :
}
prompt="Check an option (again to uncheck, ENTER when done): "
while menu && read -rp "$prompt" numinput && [[ "$numinput" ]]; do
for num in $numinput; do
[[ "$num" != *[![:digit:]]* ]] && (( num > 0 && num <= ${#options[@]} )) || {
msg="Invalid option: $num"; break
}
((num--)); msg="${options[num]} was ${choices[num]:+un}checked"
[[ "${choices[num]}" ]] && choices[num]="" || choices[num]="+"
done
done
for i in ${!options[@]}; do
if [[ "${choices[i]}" ]]; then
case ${options[i]} in
Redis)
launch_redis;
;;
Ardb)
launch_ardb;
;;
Logs)
launch_logs;
;;
Queues)
launch_queues;
;;
Scripts)
launch_scripts;
;;
Flask)
launch_flask;
;;
Crawler)
launching_crawler;
;;
Killall)
killall;
;;
Shutdown)
shutdown;
;;
Update)
update;
;;
Update-config)
checking_configuration;
;;
Update-thirdparty)
update_thirdparty;
;;
esac
fi
done
exit
}
#If no params, display the help
[[ $@ ]] || {
helptext;
options=("Redis" "Ardb" "Logs" "Queues" "Scripts" "Flask" "Killall" "Shutdown" "Update" "Update-config" "Update-thirdparty")
menu() {
echo "What do you want to Launch?:"
for i in ${!options[@]}; do
printf "%3d%s) %s\n" $((i+1)) "${choices[i]:- }" "${options[i]}"
done
[[ "$msg" ]] && echo "$msg"; :
}
prompt="Check an option (again to uncheck, ENTER when done): "
while menu && read -rp "$prompt" numinput && [[ "$numinput" ]]; do
for num in $numinput; do
[[ "$num" != *[![:digit:]]* ]] && (( num > 0 && num <= ${#options[@]} )) || {
msg="Invalid option: $num"; break
}
((num--)); msg="${options[num]} was ${choices[num]:+un}checked"
[[ "${choices[num]}" ]] && choices[num]="" || choices[num]="+"
done
done
for i in ${!options[@]}; do
if [[ "${choices[i]}" ]]; then
case ${options[i]} in
Redis)
launch_redis;
;;
Ardb)
launch_ardb;
;;
Logs)
launch_logs;
;;
Queues)
launch_queues;
;;
Scripts)
launch_scripts;
;;
Flask)
launch_flask;
;;
Crawler)
launching_crawler;
;;
Killall)
killall;
;;
Shutdown)
shutdown;
;;
Update)
update;
;;
Update-config)
checking_configuration;
;;
Update-thirdparty)
update_thirdparty;
;;
esac
fi
done
exit
}
#echo "$@"
@ -553,6 +572,10 @@ while [ "$1" != "" ]; do
;;
-k | --killAll ) killall;
;;
-ks | --killscript ) killscript;
;;
-m | --menu ) menu_display;
;;
-u | --update ) update;
;;
-t | --thirdpartyUpdate ) update_thirdparty;
@ -565,7 +588,6 @@ while [ "$1" != "" ]; do
exit
;;
-kh | --khelp ) helptext;
;;
* ) helptext
exit 1


@ -1,85 +0,0 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
"""
The ZMQ_PubSub_Lines Module
============================
This module is consuming the Redis-list created by the ZMQ_PubSub_Line_Q
Module.
It performs a sort on the lines' length and publishes/forwards them to
different channels:
*Channel 1 if max length(line) < max
*Channel 2 if max length(line) > max
The collected information about the processed pastes
(number of lines and maximum line length) is stored in Redis.
..note:: Module ZMQ_Something_Q and ZMQ_Something are closely bound, always put
the same Subscriber name in both of them.
Requirements
------------
*Need running Redis instances. (LevelDB & Redis)
*Need the ZMQ_PubSub_Line_Q Module running to be able to work properly.
"""
import argparse
import time
from packages import Paste
from pubsublogger import publisher
from Helper import Process
if __name__ == '__main__':
publisher.port = 6380
publisher.channel = 'Script'
config_section = 'Lines'
p = Process(config_section)
# SCRIPT PARSER #
parser = argparse.ArgumentParser(
description='This script is a part of the Analysis Information \
Leak framework.')
parser.add_argument(
'-max', type=int, default=500,
help='The limit between "short lines" and "long lines"',
action='store')
args = parser.parse_args()
# FUNCTIONS #
tmp_string = "Lines script Subscribed to channel {} and Start to publish \
on channel Longlines, Shortlines"
publisher.info(tmp_string)
while True:
try:
message = p.get_from_set()
print(message)
if message is not None:
PST = Paste.Paste(message)
else:
publisher.debug("Tokeniser is idling 10s")
time.sleep(10)
continue
# FIXME do it in the paste class
lines_infos = PST.get_lines_info()
PST.save_attribute_redis("p_nb_lines", lines_infos[0])
PST.save_attribute_redis("p_max_length_line", lines_infos[1])
# FIXME Not used.
PST.store.sadd("Pastes_Objects", PST.p_rel_path)
print(PST.p_rel_path)
if lines_infos[1] < args.max:
p.populate_set_out( PST.p_rel_path , 'LinesShort')
else:
p.populate_set_out( PST.p_rel_path , 'LinesLong')
except IOError:
print("CRC Checksum Error on : ", PST.p_rel_path)


@ -8,20 +8,20 @@ module
This module send tagged pastes to MISP or THE HIVE Project
"""
import redis
import sys
import os
import sys
import uuid
import redis
import time
import json
import configparser
from pubsublogger import publisher
from Helper import Process
from packages import Paste
import ailleakObject
import uuid
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'lib/'))
import ConfigLoader
from pymisp import PyMISP
@ -133,26 +133,10 @@ if __name__ == "__main__":
config_section = 'MISP_The_hive_feeder'
configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
if not os.path.exists(configfile):
raise Exception('Unable to find the configuration file. \
Did you set environment variables? \
Or activate the virtualenv.')
config_loader = ConfigLoader.ConfigLoader()
cfg = configparser.ConfigParser()
cfg.read(configfile)
r_serv_db = redis.StrictRedis(
host=cfg.get("ARDB_DB", "host"),
port=cfg.getint("ARDB_DB", "port"),
db=cfg.getint("ARDB_DB", "db"),
decode_responses=True)
r_serv_metadata = redis.StrictRedis(
host=cfg.get("ARDB_Metadata", "host"),
port=cfg.getint("ARDB_Metadata", "port"),
db=cfg.getint("ARDB_Metadata", "db"),
decode_responses=True)
r_serv_db = config_loader.get_redis_conn("ARDB_DB")
r_serv_metadata = config_loader.get_redis_conn("ARDB_Metadata")
# set sensor uuid
uuid_ail = r_serv_db.get('ail:uuid')
@ -212,7 +196,9 @@ if __name__ == "__main__":
refresh_time = 3
## FIXME: remove it
PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], cfg.get("Directories", "pastes"))
PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], config_loader.get_config_str("Directories", "pastes"))
config_loader = None
time_1 = time.time()
while True:


@ -29,16 +29,20 @@ Every data coming from a named feed can be sent to a pre-processing module befor
The mapping can be done via the variable FEED_QUEUE_MAPPING
"""
import os
import sys
import base64
import hashlib
import os
import time
from pubsublogger import publisher
import redis
import configparser
from Helper import Process
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'lib/'))
import ConfigLoader
# CONFIG #
refresh_time = 30
@ -52,37 +56,22 @@ if __name__ == '__main__':
p = Process(config_section)
configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
if not os.path.exists(configfile):
raise Exception('Unable to find the configuration file. \
Did you set environment variables? \
Or activate the virtualenv.')
cfg = configparser.ConfigParser()
cfg.read(configfile)
config_loader = ConfigLoader.ConfigLoader()
# REDIS #
server = redis.StrictRedis(
host=cfg.get("Redis_Mixer_Cache", "host"),
port=cfg.getint("Redis_Mixer_Cache", "port"),
db=cfg.getint("Redis_Mixer_Cache", "db"),
decode_responses=True)
server_cache = redis.StrictRedis(
host=cfg.get("Redis_Log_submit", "host"),
port=cfg.getint("Redis_Log_submit", "port"),
db=cfg.getint("Redis_Log_submit", "db"),
decode_responses=True)
server = config_loader.get_redis_conn("Redis_Mixer_Cache")
server_cache = config_loader.get_redis_conn("Redis_Log_submit")
# LOGGING #
publisher.info("Feed Script started to receive & publish.")
# OTHER CONFIG #
operation_mode = cfg.getint("Module_Mixer", "operation_mode")
ttl_key = cfg.getint("Module_Mixer", "ttl_duplicate")
default_unnamed_feed_name = cfg.get("Module_Mixer", "default_unnamed_feed_name")
operation_mode = config_loader.get_config_int("Module_Mixer", "operation_mode")
ttl_key = config_loader.get_config_int("Module_Mixer", "ttl_duplicate")
default_unnamed_feed_name = config_loader.get_config_str("Module_Mixer", "default_unnamed_feed_name")
PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], p.config.get("Directories", "pastes")) + '/'
PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], config_loader.get_config_str("Directories", "pastes")) + '/'
config_loader = None
# STATS #
processed_paste = 0


@ -1,354 +0,0 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
'''
This module can be used to see information about running modules.
This information is logged in "logs/moduleInfo.log"
It can also try to manage them by killing inactive ones.
However, it does not support multiple occurrences of the same module
(it will kill the first one obtained by get)
'''
import time
import datetime
import redis
import os
import signal
import argparse
from subprocess import PIPE, Popen
import configparser
import json
from terminaltables import AsciiTable
import textwrap
from colorama import Fore, Back, Style, init
import curses
# CONFIG VARIABLES
kill_retry_threshold = 60 #1m
log_filename = "../logs/moduleInfo.log"
command_search_pid = "ps a -o pid,cmd | grep {}"
command_search_name = "ps a -o pid,cmd | grep {}"
command_restart_module = "screen -S \"Script\" -X screen -t \"{}\" bash -c \"./{}.py; read x\""
init() #Necessary for colorama
printarrayGlob = [None]*14
printarrayGlob.insert(0, ["Time", "Module", "PID", "Action"])
lastTimeKillCommand = {}
#Curses init
#stdscr = curses.initscr()
#curses.cbreak()
#stdscr.keypad(1)
# GLOBAL
last_refresh = 0
def getPid(module):
p = Popen([command_search_pid.format(module+".py")], stdin=PIPE, stdout=PIPE, bufsize=1, shell=True)
for line in p.stdout:
print(line)
splittedLine = line.split()
if 'python2' in splittedLine:
return int(splittedLine[0])
return None
def clearRedisModuleInfo():
for k in server.keys("MODULE_*"):
server.delete(k)
inst_time = datetime.datetime.fromtimestamp(int(time.time()))
printarrayGlob.insert(1, [inst_time, "*", "-", "Cleared redis module info"])
printarrayGlob.pop()
def cleanRedis():
for k in server.keys("MODULE_TYPE_*"):
moduleName = k[12:].split('_')[0]
for pid in server.smembers(k):
flag_pid_valid = False
proc = Popen([command_search_name.format(pid)], stdin=PIPE, stdout=PIPE, bufsize=1, shell=True)
for line in proc.stdout:
splittedLine = line.split()
if ('python2' in splittedLine or 'python' in splittedLine) and "./"+moduleName+".py" in splittedLine:
flag_pid_valid = True
if not flag_pid_valid:
print(flag_pid_valid, 'cleaning', pid, 'in', k)
server.srem(k, pid)
inst_time = datetime.datetime.fromtimestamp(int(time.time()))
printarrayGlob.insert(1, [inst_time, moduleName, pid, "Cleared invalid pid in " + k])
printarrayGlob.pop()
#time.sleep(5)
def kill_module(module, pid):
print('')
print('-> trying to kill module:', module)
if pid is None:
print('pid was None')
printarrayGlob.insert(1, [0, module, pid, "PID was None"])
printarrayGlob.pop()
pid = getPid(module)
else: #Verify that the pid is at least in redis
if server.exists("MODULE_"+module+"_"+str(pid)) == 0:
return
lastTimeKillCommand[pid] = int(time.time())
if pid is not None:
try:
os.kill(pid, signal.SIGUSR1)
except OSError:
print(pid, 'already killed')
inst_time = datetime.datetime.fromtimestamp(int(time.time()))
printarrayGlob.insert(1, [inst_time, module, pid, "Already killed"])
printarrayGlob.pop()
return
time.sleep(1)
if getPid(module) is None:
print(module, 'has been killed')
print('restarting', module, '...')
p2 = Popen([command_restart_module.format(module, module)], stdin=PIPE, stdout=PIPE, bufsize=1, shell=True)
inst_time = datetime.datetime.fromtimestamp(int(time.time()))
printarrayGlob.insert(1, [inst_time, module, pid, "Killed"])
printarrayGlob.insert(1, [inst_time, module, "?", "Restarted"])
printarrayGlob.pop()
printarrayGlob.pop()
else:
print('killing failed, retrying...')
inst_time = datetime.datetime.fromtimestamp(int(time.time()))
printarrayGlob.insert(1, [inst_time, module, pid, "Killing #1 failed."])
printarrayGlob.pop()
time.sleep(1)
os.kill(pid, signal.SIGUSR1)
time.sleep(1)
if getPid(module) is None:
print(module, 'has been killed')
print('restarting', module, '...')
p2 = Popen([command_restart_module.format(module, module)], stdin=PIPE, stdout=PIPE, bufsize=1, shell=True)
inst_time = datetime.datetime.fromtimestamp(int(time.time()))
printarrayGlob.insert(1, [inst_time, module, pid, "Killed"])
printarrayGlob.insert(1, [inst_time, module, "?", "Restarted"])
printarrayGlob.pop()
printarrayGlob.pop()
else:
print('killing failed!')
inst_time = datetime.datetime.fromtimestamp(int(time.time()))
printarrayGlob.insert(1, [inst_time, module, pid, "Killing failed!"])
printarrayGlob.pop()
else:
print('Module does not exist')
inst_time = datetime.datetime.fromtimestamp(int(time.time()))
printarrayGlob.insert(1, [inst_time, module, pid, "Killing failed, module not found"])
printarrayGlob.pop()
#time.sleep(5)
cleanRedis()
def get_color(time, idle):
if time is not None:
temp = time.split(':')
time = int(temp[0])*3600 + int(temp[1])*60 + int(temp[2])
if time >= args.treshold:
if not idle:
return Back.RED + Style.BRIGHT
else:
return Back.MAGENTA + Style.BRIGHT
elif time > args.treshold/2:
return Back.YELLOW + Style.BRIGHT
else:
return Back.GREEN + Style.BRIGHT
else:
return Style.RESET_ALL
def waiting_refresh():
global last_refresh
if time.time() - last_refresh < args.refresh:
return False
else:
last_refresh = time.time()
return True
if __name__ == "__main__":
parser = argparse.ArgumentParser(description='Show info concerning running modules and log suspected stuck modules. May be used to automatically kill and restart stuck ones.')
parser.add_argument('-r', '--refresh', type=int, required=False, default=1, help='Refresh rate')
parser.add_argument('-t', '--treshold', type=int, required=False, default=60*10*1, help='Threshold in seconds before a module is considered stuck')
parser.add_argument('-k', '--autokill', type=int, required=False, default=0, help='Enable auto kill option (1 for TRUE, anything else for FALSE)')
parser.add_argument('-c', '--clear', type=int, required=False, default=0, help='Clear the current module information (Used to clear data from old launched modules)')
args = parser.parse_args()
configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
if not os.path.exists(configfile):
raise Exception('Unable to find the configuration file. \
Did you set environment variables? \
Or activate the virtualenv.')
cfg = configparser.ConfigParser()
cfg.read(configfile)
# REDIS #
server = redis.StrictRedis(
host=cfg.get("Redis_Queues", "host"),
port=cfg.getint("Redis_Queues", "port"),
db=cfg.getint("Redis_Queues", "db"),
decode_responses=True)
if args.clear == 1:
clearRedisModuleInfo()
lastTime = datetime.datetime.now()
module_file_array = set()
no_info_modules = {}
path_allmod = os.path.join(os.environ['AIL_HOME'], 'doc/all_modules.txt')
with open(path_allmod, 'r') as module_file:
for line in module_file:
module_file_array.add(line[:-1])
cleanRedis()
while True:
if waiting_refresh():
#key = ''
#while key != 'q':
# key = stdsrc.getch()
# stdscr.refresh()
all_queue = set()
printarray1 = []
printarray2 = []
printarray3 = []
for queue, card in server.hgetall("queues").items():
all_queue.add(queue)
key = "MODULE_" + queue + "_"
keySet = "MODULE_TYPE_" + queue
array_module_type = []
for moduleNum in server.smembers(keySet):
value = server.get(key + str(moduleNum))
if value is not None:
timestamp, path = value.split(", ")
if timestamp is not None and path is not None:
startTime_readable = datetime.datetime.fromtimestamp(int(timestamp))
processed_time_readable = str((datetime.datetime.now() - startTime_readable)).split('.')[0]
if int(card) > 0:
if int((datetime.datetime.now() - startTime_readable).total_seconds()) > args.treshold:
log = open(log_filename, 'a')
log.write(json.dumps([queue, card, str(startTime_readable), str(processed_time_readable), path]) + "\n")
try:
last_kill_try = time.time() - lastTimeKillCommand[moduleNum]
except KeyError:
last_kill_try = kill_retry_threshold+1
if args.autokill == 1 and last_kill_try > kill_retry_threshold :
kill_module(queue, int(moduleNum))
array_module_type.append([get_color(processed_time_readable, False) + str(queue), str(moduleNum), str(card), str(startTime_readable), str(processed_time_readable), str(path) + get_color(None, False)])
else:
printarray2.append([get_color(processed_time_readable, True) + str(queue), str(moduleNum), str(card), str(startTime_readable), str(processed_time_readable), str(path) + get_color(None, True)])
array_module_type.sort(key=lambda x: x[4], reverse=True)
for e in array_module_type:
printarray1.append(e)
for curr_queue in module_file_array:
if curr_queue not in all_queue:
printarray3.append([curr_queue, "Not running"])
else:
if len(list(server.smembers('MODULE_TYPE_'+curr_queue))) == 0:
if curr_queue not in no_info_modules:
no_info_modules[curr_queue] = int(time.time())
printarray3.append([curr_queue, "No data"])
else:
#If no info since long time, try to kill
if args.autokill == 1:
if int(time.time()) - no_info_modules[curr_queue] > args.treshold:
kill_module(curr_queue, None)
no_info_modules[curr_queue] = int(time.time())
printarray3.append([curr_queue, "Stuck or idle, restarting in " + str(abs(args.treshold - (int(time.time()) - no_info_modules[curr_queue]))) + "s"])
else:
printarray3.append([curr_queue, "Stuck or idle, restarting disabled"])
## FIXME To add:
## Button KILL Process using Curses
printarray1.sort(key=lambda x: x[0][9:], reverse=False)
printarray2.sort(key=lambda x: x[0][9:], reverse=False)
printarray1.insert(0,["Queue", "PID", "Amount", "Paste start time", "Processing time for current paste (H:M:S)", "Paste hash"])
printarray2.insert(0,["Queue", "PID","Amount", "Paste start time", "Time since idle (H:M:S)", "Last paste hash"])
printarray3.insert(0,["Queue", "State"])
os.system('clear')
t1 = AsciiTable(printarray1, title="Working queues")
t1.column_max_width(1)
if not t1.ok:
longest_col = t1.column_widths.index(max(t1.column_widths))
max_length_col = t1.column_max_width(longest_col)
if max_length_col > 0:
for i, content in enumerate(t1.table_data):
if len(content[longest_col]) > max_length_col:
temp = ''
for l in content[longest_col].splitlines():
if len(l) > max_length_col:
temp += '\n'.join(textwrap.wrap(l, max_length_col)) + '\n'
else:
temp += l + '\n'
content[longest_col] = temp.strip()
t1.table_data[i] = content
t2 = AsciiTable(printarray2, title="Idling queues")
t2.column_max_width(1)
if not t2.ok:
longest_col = t2.column_widths.index(max(t2.column_widths))
max_length_col = t2.column_max_width(longest_col)
if max_length_col > 0:
for i, content in enumerate(t2.table_data):
if len(content[longest_col]) > max_length_col:
temp = ''
for l in content[longest_col].splitlines():
if len(l) > max_length_col:
temp += '\n'.join(textwrap.wrap(l, max_length_col)) + '\n'
else:
temp += l + '\n'
content[longest_col] = temp.strip()
t2.table_data[i] = content
t3 = AsciiTable(printarray3, title="Not running queues")
t3.column_max_width(1)
printarray4 = []
for elem in printarrayGlob:
if elem is not None:
printarray4.append(elem)
t4 = AsciiTable(printarray4, title="Last actions")
t4.column_max_width(1)
legend_array = [["Color", "Meaning"], [Back.RED+Style.BRIGHT+" "*10+Style.RESET_ALL, "Time >=" +str(args.treshold)+Style.RESET_ALL], [Back.MAGENTA+Style.BRIGHT+" "*10+Style.RESET_ALL, "Time >=" +str(args.treshold)+" while idle"+Style.RESET_ALL], [Back.YELLOW+Style.BRIGHT+" "*10+Style.RESET_ALL, "Time >=" +str(args.treshold/2)+Style.RESET_ALL], [Back.GREEN+Style.BRIGHT+" "*10+Style.RESET_ALL, "Time <" +str(args.treshold)]]
legend = AsciiTable(legend_array, title="Legend")
legend.column_max_width(1)
print(legend.table)
print('\n')
print(t1.table)
print('\n')
print(t2.table)
print('\n')
print(t3.table)
print('\n')
print(t4.table)
if (datetime.datetime.now() - lastTime).total_seconds() > args.refresh*5:
lastTime = datetime.datetime.now()
cleanRedis()
#time.sleep(args.refresh)

View file

@ -9,7 +9,6 @@ import time
import datetime
import redis
import os
from packages import lib_words
from packages.Date import Date
from pubsublogger import publisher
from Helper import Process

View file

@ -10,13 +10,16 @@ from asciimatics.event import Event
from asciimatics.event import KeyboardEvent, MouseEvent
import sys, os
import time, datetime
import argparse, configparser
import argparse
import json
import redis
import psutil
from subprocess import PIPE, Popen
from packages import Paste
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'lib/'))
import ConfigLoader
# CONFIG VARIABLES
kill_retry_threshold = 60 #1m
log_filename = "../logs/moduleInfo.log"
@ -798,21 +801,11 @@ if __name__ == "__main__":
args = parser.parse_args()
configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
if not os.path.exists(configfile):
raise Exception('Unable to find the configuration file. \
Did you set environment variables? \
Or activate the virtualenv.')
cfg = configparser.ConfigParser()
cfg.read(configfile)
config_loader = ConfigLoader.ConfigLoader()
# REDIS #
server = redis.StrictRedis(
host=cfg.get("Redis_Queues", "host"),
port=cfg.getint("Redis_Queues", "port"),
db=cfg.getint("Redis_Queues", "db"),
decode_responses=True)
server = config_loader.get_redis_conn("Redis_Queues")
config_loader = None
if args.clear == 1:
clearRedisModuleInfo()

View file

@ -1,46 +1,34 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
import argparse
import configparser
import traceback
import os
import sys
import argparse
import traceback
import smtplib
from pubsublogger import publisher
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'lib/'))
import ConfigLoader
"""
This module allows the global configuration and management of notification settings and methods.
"""
# CONFIG #
configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
config_loader = ConfigLoader.ConfigLoader()
publisher.port = 6380
publisher.channel = "Script"
# notifications enabled/disabled
TrackedTermsNotificationEnabled_Name = "TrackedNotifications"
# associated notification email addresses for a specific term
# Keys will be e.g. TrackedNotificationEmails<TERMNAME>
TrackedTermsNotificationEmailsPrefix_Name = "TrackedNotificationEmails_"
def sendEmailNotification(recipient, alert_name, content):
if not os.path.exists(configfile):
raise Exception('Unable to find the configuration file. \
Did you set environment variables? \
Or activate the virtualenv?')
cfg = configparser.ConfigParser()
cfg.read(configfile)
sender = cfg.get("Notifications", "sender")
sender_host = cfg.get("Notifications", "sender_host")
sender_port = cfg.getint("Notifications", "sender_port")
sender_pw = cfg.get("Notifications", "sender_pw")
sender = config_loader.get_config_str("Notifications", "sender")
sender_host = config_loader.get_config_str("Notifications", "sender_host")
sender_port = config_loader.get_config_int("Notifications", "sender_port")
sender_pw = config_loader.get_config_str("Notifications", "sender_pw")
if sender_pw == 'None':
sender_pw = None
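NotificationHelper now resolves the SMTP settings once at import time through ConfigLoader, so callers only pass a recipient, a subject label and the body. A hedged usage sketch (the recipient address and body are placeholders, not values from this commit, and a configured AIL environment is assumed):
```
import NotificationHelper

# Placeholder recipient and body, for illustration only.
mail_body = ("AIL Framework,\n"
             "New occurrence for tracked term: example\n"
             "item id: submitted/2019/11/17/example.gz")
NotificationHelper.sendEmailNotification("analyst@example.org", 'Term Tracker', mail_body)
```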

View file

@ -21,6 +21,8 @@ from bs4 import BeautifulSoup
from Helper import Process
from packages import Paste
from packages import Pgp
class TimeoutException(Exception):
pass
@ -117,31 +119,6 @@ def extract_id_from_output(pgp_dump_outpout):
key_id = key_id.replace(key_id_str, '', 1)
set_key.add(key_id)
def save_pgp_data(type_pgp, date, item_path, data):
# create basic metadata
if not serv_metadata.exists('pgpdump_metadata_{}:{}'.format(type_pgp, data)):
serv_metadata.hset('pgpdump_metadata_{}:{}'.format(type_pgp, data), 'first_seen', date)
serv_metadata.hset('pgpdump_metadata_{}:{}'.format(type_pgp, data), 'last_seen', date)
else:
last_seen = serv_metadata.hget('pgpdump_metadata_{}:{}'.format(type_pgp, data), 'last_seen')
if not last_seen:
serv_metadata.hset('pgpdump_metadata_{}:{}'.format(type_pgp, data), 'last_seen', date)
else:
if int(last_seen) < int(date):
serv_metadata.hset('pgpdump_metadata_{}:{}'.format(type_pgp, data), 'last_seen', date)
# global set
serv_metadata.sadd('set_pgpdump_{}:{}'.format(type_pgp, data), item_path)
# daily
serv_metadata.hincrby('pgpdump:{}:{}'.format(type_pgp, date), data, 1)
# all type
serv_metadata.zincrby('pgpdump_all:{}'.format(type_pgp), data, 1)
# item_metadata
serv_metadata.sadd('item_pgpdump_{}:{}'.format(type_pgp, item_path), data)
if __name__ == '__main__':
# If you wish to use another port or channel, do not forget to run a subscriber accordingly (see launch_logs.sh)
@ -236,12 +213,12 @@ if __name__ == '__main__':
for key_id in set_key:
print(key_id)
save_pgp_data('key', date, message, key_id)
Pgp.save_pgp_data('key', date, message, key_id)
for name_id in set_name:
print(name_id)
save_pgp_data('name', date, message, name_id)
Pgp.save_pgp_data('name', date, message, name_id)
for mail_id in set_mail:
print(mail_id)
save_pgp_data('mail', date, message, mail_id)
Pgp.save_pgp_data('mail', date, message, mail_id)

View file

@ -1,57 +0,0 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
import redis
import argparse
import configparser
import time
import os
from pubsublogger import publisher
import texttable
def main():
"""Main Function"""
# CONFIG #
cfg = configparser.ConfigParser()
cfg.read('./packages/config.cfg')
# SCRIPT PARSER #
parser = argparse.ArgumentParser(
description='''This script is a part of the Analysis Information Leak framework.''',
epilog='''''')
parser.add_argument('-db', type=int, default=0,
help='The name of the Redis DB (default 0)',
choices=[0, 1, 2, 3, 4], action='store')
# REDIS #
r_serv = redis.StrictRedis(
host=cfg.get("Redis_Queues", "host"),
port=cfg.getint("Redis_Queues", "port"),
db=cfg.getint("Redis_Queues", "db"),
decode_responses=True)
# LOGGING #
publisher.port = 6380
publisher.channel = "Queuing"
while True:
table = texttable.Texttable()
table.header(["Queue name", "#Items"])
row = []
for queue in r_serv.smembers("queues"):
current = r_serv.llen(queue)
current = current - r_serv.llen(queue)
row.append((queue, r_serv.llen(queue)))
time.sleep(0.5)
row.sort()
table.add_rows(row, header=False)
os.system('clear')
print(table.draw())
if __name__ == "__main__":
main()

View file

@ -1,157 +0,0 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
"""
This module is used for term frequency.
It processes every paste coming from the global module and tests the regexes
supplied in the term webpage.
"""
import redis
import time
from pubsublogger import publisher
from packages import Paste
import calendar
import re
import signal
import time
from Helper import Process
# Email notifications
from NotificationHelper import *
class TimeoutException(Exception):
pass
def timeout_handler(signum, frame):
raise TimeoutException
signal.signal(signal.SIGALRM, timeout_handler)
# Config Variables
DICO_REFRESH_TIME = 60 # s
BlackListTermsSet_Name = "BlackListSetTermSet"
TrackedTermsSet_Name = "TrackedSetTermSet"
TrackedRegexSet_Name = "TrackedRegexSet"
top_term_freq_max_set_cardinality = 20 # Max cardinality of the term frequency set
oneDay = 60*60*24
top_termFreq_setName_day = ["TopTermFreq_set_day_", 1]
top_termFreq_setName_week = ["TopTermFreq_set_week", 7]
top_termFreq_setName_month = ["TopTermFreq_set_month", 31]
top_termFreq_set_array = [top_termFreq_setName_day, top_termFreq_setName_week, top_termFreq_setName_month]
TrackedTermsNotificationTagsPrefix_Name = "TrackedNotificationTags_"
# create direct link in mail
full_paste_url = "/showsavedpaste/?paste="
def refresh_dicos():
dico_regex = {}
dico_regexname_to_redis = {}
for regex_str in server_term.smembers(TrackedRegexSet_Name):
dico_regex[regex_str[1:-1]] = re.compile(regex_str[1:-1])
dico_regexname_to_redis[regex_str[1:-1]] = regex_str
return dico_regex, dico_regexname_to_redis
if __name__ == "__main__":
publisher.port = 6380
publisher.channel = "Script"
config_section = 'RegexForTermsFrequency'
p = Process(config_section)
max_execution_time = p.config.getint(config_section, "max_execution_time")
# REDIS #
server_term = redis.StrictRedis(
host=p.config.get("ARDB_TermFreq", "host"),
port=p.config.get("ARDB_TermFreq", "port"),
db=p.config.get("ARDB_TermFreq", "db"),
decode_responses=True)
# FUNCTIONS #
publisher.info("RegexForTermsFrequency script started")
# create direct link in mail
full_paste_url = p.config.get("Notifications", "ail_domain") + full_paste_url
# compile the regex
dico_refresh_cooldown = time.time()
dico_regex, dico_regexname_to_redis = refresh_dicos()
message = p.get_from_set()
# Regex Frequency
while True:
if message is not None:
if time.time() - dico_refresh_cooldown > DICO_REFRESH_TIME:
dico_refresh_cooldown = time.time()
dico_regex, dico_regexname_to_redis = refresh_dicos()
print('dico got refreshed')
filename = message
temp = filename.split('/')
timestamp = calendar.timegm((int(temp[-4]), int(temp[-3]), int(temp[-2]), 0, 0, 0))
curr_set = top_termFreq_setName_day[0] + str(timestamp)
paste = Paste.Paste(filename)
content = paste.get_p_content()
# iterate the word with the regex
for regex_str, compiled_regex in dico_regex.items():
signal.alarm(max_execution_time)
try:
matched = compiled_regex.search(content)
except TimeoutException:
print ("{0} processing timeout".format(paste.p_rel_path))
continue
else:
signal.alarm(0)
if matched is not None: # there is a match
print('regex matched {}'.format(regex_str))
matched = matched.group(0)
regex_str_complete = "/" + regex_str + "/"
# Add in Regex track set only if term is not in the blacklist
if regex_str_complete not in server_term.smembers(BlackListTermsSet_Name):
# Send a notification only when the member is in the set
if regex_str_complete in server_term.smembers(TrackedTermsNotificationEnabled_Name):
# create mail body
mail_body = ("AIL Framework,\n"
"New occurrence for regex: " + regex_str + "\n"
''+full_paste_url + filename)
# Send to every associated email address
for email in server_term.smembers(TrackedTermsNotificationEmailsPrefix_Name + regex_str_complete):
sendEmailNotification(email, 'Term', mail_body)
# tag paste
for tag in server_term.smembers(TrackedTermsNotificationTagsPrefix_Name + regex_str_complete):
msg = '{};{}'.format(tag, filename)
p.populate_set_out(msg, 'Tags')
set_name = 'regex_' + dico_regexname_to_redis[regex_str]
new_to_the_set = server_term.sadd(set_name, filename)
new_to_the_set = True if new_to_the_set == 1 else False
# consider the number of occurrences of this term
regex_value = int(server_term.hincrby(timestamp, dico_regexname_to_redis[regex_str], int(1)))
# 1 term per paste
if new_to_the_set:
regex_value_perPaste = int(server_term.hincrby("per_paste_" + str(timestamp), dico_regexname_to_redis[regex_str], int(1)))
server_term.zincrby("per_paste_" + curr_set, dico_regexname_to_redis[regex_str], float(1))
server_term.zincrby(curr_set, dico_regexname_to_redis[regex_str], float(1))
else:
pass
else:
publisher.debug("Script RegexForTermsFrequency is Idling")
print("sleeping")
time.sleep(5)
message = p.get_from_set()

96
bin/RegexTracker.py Executable file
View file

@ -0,0 +1,96 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
"""
This module is used for regex tracking.
It processes every paste coming from the global module and tests the regexes
supplied in the term webpage.
"""
import os
import re
import sys
import time
import signal
from Helper import Process
from pubsublogger import publisher
import NotificationHelper
from packages import Item
from packages import Term
full_item_url = "/showsavedpaste/?paste="
mail_body_template = "AIL Framework,\nNew occurrence for term tracked regex: {}\nitem id: {}\nurl: {}{}"
dict_regex_tracked = Term.get_regex_tracked_words_dict()
last_refresh = time.time()
class TimeoutException(Exception):
pass
def timeout_handler(signum, frame):
raise TimeoutException
signal.signal(signal.SIGALRM, timeout_handler)
def new_term_found(term, term_type, item_id, item_date):
uuid_list = Term.get_term_uuid_list(term, 'regex')
print('new tracked term found: {} in {}'.format(term, item_id))
for term_uuid in uuid_list:
Term.add_tracked_item(term_uuid, item_id, item_date)
tags_to_add = Term.get_term_tags(term_uuid)
for tag in tags_to_add:
msg = '{};{}'.format(tag, item_id)
p.populate_set_out(msg, 'Tags')
mail_to_notify = Term.get_term_mails(term_uuid)
if mail_to_notify:
mail_body = mail_body_template.format(term, item_id, full_item_url, item_id)
for mail in mail_to_notify:
NotificationHelper.sendEmailNotification(mail, 'Term Tracker', mail_body)
if __name__ == "__main__":
publisher.port = 6380
publisher.channel = "Script"
publisher.info("Script RegexTracker started")
config_section = 'RegexTracker'
p = Process(config_section)
max_execution_time = p.config.getint(config_section, "max_execution_time")
full_item_url = p.config.get("Notifications", "ail_domain") + full_item_url
# Regex Frequency
while True:
item_id = p.get_from_set()
if item_id is not None:
item_date = Item.get_item_date(item_id)
item_content = Item.get_item_content(item_id)
for regex in dict_regex_tracked:
signal.alarm(max_execution_time)
try:
matched = dict_regex_tracked[regex].search(item_content)
except TimeoutException:
print ("{0} processing timeout".format(item_id))
continue
else:
signal.alarm(0)
if matched:
new_term_found(regex, 'regex', item_id, item_date)
else:
time.sleep(5)
# refresh Tracked term
if last_refresh < Term.get_tracked_term_last_updated_by_type('regex'):
dict_regex_tracked = Term.get_regex_tracked_words_dict()
last_refresh = time.time()
print('Tracked set refreshed')
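RegexTracker wraps every regex search in a SIGALRM watchdog so a pathological pattern cannot stall the module. A stripped-down sketch of that pattern with an invented regex and content (Unix only, since it relies on SIGALRM):
```
import re
import signal

class TimeoutException(Exception):
    pass

def timeout_handler(signum, frame):
    raise TimeoutException

signal.signal(signal.SIGALRM, timeout_handler)

max_execution_time = 30                 # seconds, mirrors the per-module config value
content = "some item content"           # placeholder item content
tracked = re.compile(r"example-regex")  # placeholder tracked regex

signal.alarm(max_execution_time)        # arm the watchdog before the search
try:
    matched = tracked.search(content)
except TimeoutException:
    matched = None                      # search took too long; the item is skipped
else:
    signal.alarm(0)                     # disarm the watchdog once the search returns
print(matched)
```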

View file

@ -1,97 +0,0 @@
#!/usr/bin/python3
# -*-coding:UTF-8 -*
import redis
import argparse
import configparser
from datetime import datetime
from pubsublogger import publisher
import matplotlib.pyplot as plt
def main():
"""Main Function"""
# CONFIG #
cfg = configparser.ConfigParser()
cfg.read('./packages/config.cfg')
# SCRIPT PARSER #
parser = argparse.ArgumentParser(
description='''This script is a part of the Analysis Information Leak framework.''',
epilog='''''')
parser.add_argument('-f', type=str, metavar="filename", default="figure",
help='The absolute path name of the "figure.png"',
action='store')
parser.add_argument('-y', '--year', type=int, required=False, default=None, help='The date related to the DB')
args = parser.parse_args()
# REDIS #
# port generated automatically depending on the date
curYear = datetime.now().year if args.year is None else args.year
r_serv = redis.StrictRedis(
host=cfg.get("ARDB_Hashs", "host"),
port=cfg.getint("ARDB_Hashs", "port"),
db=curYear,
decode_responses=True)
# LOGGING #
publisher.port = 6380
publisher.channel = "Graph"
# FUNCTIONS #
publisher.info("""Creating the Repartition Graph""")
total_list = []
codepad_list = []
pastie_list = []
pastebin_list = []
for hash in r_serv.keys():
total_list.append(r_serv.scard(hash))
code = 0
pastie = 0
pastebin = 0
for paste in r_serv.smembers(hash):
source = paste.split("/")[5]
if source == "codepad.org":
code = code + 1
elif source == "pastie.org":
pastie = pastie + 1
elif source == "pastebin.com":
pastebin = pastebin + 1
codepad_list.append(code)
pastie_list.append(pastie)
pastebin_list.append(pastebin)
codepad_list.sort(reverse=True)
pastie_list.sort(reverse=True)
pastebin_list.sort(reverse=True)
total_list.sort(reverse=True)
plt.plot(codepad_list, 'b', label='Codepad.org')
plt.plot(pastebin_list, 'g', label='Pastebin.org')
plt.plot(pastie_list, 'y', label='Pastie.org')
plt.plot(total_list, 'r', label='Total')
plt.xscale('log')
plt.xlabel('Hashs')
plt.ylabel('Occur[Hash]')
plt.title('Repartition')
plt.legend()
plt.grid()
plt.tight_layout()
plt.savefig(args.f+".png", dpi=None, facecolor='w', edgecolor='b',
orientation='portrait', papertype=None, format="png",
transparent=False, bbox_inches=None, pad_inches=0.1,
frameon=True)
if __name__ == "__main__":
main()

View file

@ -14,7 +14,6 @@ It test different possibility to makes some sqlInjection.
import time
import datetime
import redis
import string
import urllib.request
import re
from pubsublogger import publisher
@ -22,131 +21,41 @@ from Helper import Process
from packages import Paste
from pyfaup.faup import Faup
# Config Var
regex_injection = []
word_injection = []
word_injection_suspect = []
# Classic atome injection
regex_injection1 = "([[AND |OR ]+[\'|\"]?[0-9a-zA-Z]+[\'|\"]?=[\'|\"]?[0-9a-zA-Z]+[\'|\"]?])"
regex_injection.append(regex_injection1)
# Time-based attack
regex_injection2 = ["SLEEP\([0-9]+", "BENCHMARK\([0-9]+", "WAIT FOR DELAY ", "WAITFOR DELAY"]
regex_injection2 = re.compile('|'.join(regex_injection2))
regex_injection.append(regex_injection2)
# Interesting keyword
word_injection1 = [" IF ", " ELSE ", " CASE ", " WHEN ", " END ", " UNION ", "SELECT ", " FROM ", " ORDER BY ", " WHERE ", " DELETE ", " DROP ", " UPDATE ", " EXEC "]
word_injection.append(word_injection1)
# Database special keywords
word_injection2 = ["@@version", "POW(", "BITAND(", "SQUARE("]
word_injection.append(word_injection2)
# Html keywords
word_injection3 = ["<script>"]
word_injection.append(word_injection3)
# Suspect char
word_injection_suspect1 = ["\'", "\"", ";", "<", ">"]
word_injection_suspect += word_injection_suspect1
# Comment
word_injection_suspect2 = ["--", "#", "/*"]
word_injection_suspect += word_injection_suspect2
# Reference: https://github.com/stamparm/maltrail/blob/master/core/settings.py
SQLI_REGEX = r"information_schema|sysdatabases|sysusers|floor\(rand\(|ORDER BY \d+|\bUNION\s+(ALL\s+)?SELECT\b|\b(UPDATEXML|EXTRACTVALUE)\(|\bCASE[^\w]+WHEN.*THEN\b|\bWAITFOR[^\w]+DELAY\b|\bCONVERT\(|VARCHAR\(|\bCOUNT\(\*\)|\b(pg_)?sleep\(|\bSELECT\b.*\bFROM\b.*\b(WHERE|GROUP|ORDER)\b|\bSELECT \w+ FROM \w+|\b(AND|OR|SELECT)\b.*/\*.*\*/|/\*.*\*/.*\b(AND|OR|SELECT)\b|\b(AND|OR)[^\w]+\d+['\") ]?[=><]['\"( ]?\d+|ODBC;DRIVER|\bINTO\s+(OUT|DUMP)FILE"
def analyse(url, path):
faup.decode(url)
url_parsed = faup.get()
resource_path = url_parsed['resource_path']
query_string = url_parsed['query_string']
result_path = 0
result_query = 0
if resource_path is not None:
## TODO: # FIXME: remove me
try:
resource_path = resource_path.decode()
except:
pass
result_path = is_sql_injection(resource_path)
if query_string is not None:
## TODO: # FIXME: remove me
try:
query_string = query_string.decode()
except:
pass
result_query = is_sql_injection(query_string)
if (result_path > 0) or (result_query > 0):
if is_sql_injection(url):
faup.decode(url)
url_parsed = faup.get()
paste = Paste.Paste(path)
if (result_path > 1) or (result_query > 1):
print("Detected SQL in URL: ")
print(urllib.request.unquote(url))
to_print = 'SQLInjection;{};{};{};{};{}'.format(paste.p_source, paste.p_date, paste.p_name, "Detected SQL in URL", paste.p_rel_path)
publisher.warning(to_print)
#Send to duplicate
p.populate_set_out(path, 'Duplicate')
print("Detected SQL in URL: ")
print(urllib.request.unquote(url))
to_print = 'SQLInjection;{};{};{};{};{}'.format(paste.p_source, paste.p_date, paste.p_name, "Detected SQL in URL", paste.p_rel_path)
publisher.warning(to_print)
#Send to duplicate
p.populate_set_out(path, 'Duplicate')
msg = 'infoleak:automatic-detection="sql-injection";{}'.format(path)
p.populate_set_out(msg, 'Tags')
#statistics
tld = url_parsed['tld']
if tld is not None:
## TODO: # FIXME: remove me
try:
tld = tld.decode()
except:
pass
date = datetime.datetime.now().strftime("%Y%m")
server_statistics.hincrby('SQLInjection_by_tld:'+date, tld, 1)
else:
print("Potential SQL injection:")
print(urllib.request.unquote(url))
to_print = 'SQLInjection;{};{};{};{};{}'.format(paste.p_source, paste.p_date, paste.p_name, "Potential SQL injection", paste.p_rel_path)
publisher.info(to_print)
msg = 'infoleak:automatic-detection="sql-injection";{}'.format(path)
p.populate_set_out(msg, 'Tags')
#statistics
tld = url_parsed['tld']
if tld is not None:
## TODO: # FIXME: remove me
try:
tld = tld.decode()
except:
pass
date = datetime.datetime.now().strftime("%Y%m")
server_statistics.hincrby('SQLInjection_by_tld:'+date, tld, 1)
# Try to detect if the url passed might be an SQL injection by applying the regex
# defined above on it.
def is_sql_injection(url_parsed):
line = urllib.request.unquote(url_parsed)
line = str.upper(line)
result = []
result_suspect = []
for regex in regex_injection:
temp_res = re.findall(regex, line)
if len(temp_res)>0:
result.append(temp_res)
for word_list in word_injection:
for word in word_list:
temp_res = str.find(line, str.upper(word))
if temp_res!=-1:
result.append(line[temp_res:temp_res+len(word)])
for word in word_injection_suspect:
temp_res = str.find(line, str.upper(word))
if temp_res!=-1:
result_suspect.append(line[temp_res:temp_res+len(word)])
if len(result)>0:
print(result)
return 2
elif len(result_suspect)>0:
print(result_suspect)
return 1
else:
return 0
return re.search(SQLI_REGEX, line, re.I) is not None
if __name__ == '__main__':
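Detection is now delegated to the single SQLI_REGEX above instead of the hand-written keyword lists. A small sketch of the check on a decoded URL, using a shortened subset of the pattern and an invented sample URL:
```
import re
import urllib.request

# Shortened subset of the SQLI_REGEX defined above, kept small for illustration only.
SQLI_REGEX = r"\bUNION\s+(ALL\s+)?SELECT\b|\b(pg_)?sleep\(|\bWAITFOR[^\w]+DELAY\b"

def is_sql_injection(url):
    line = urllib.request.unquote(url)
    return re.search(SQLI_REGEX, line, re.I) is not None

# Invented sample URL: matches the UNION SELECT alternative, so the module would tag the item.
print(is_sql_injection("http://example.com/item.php?id=1%20UNION%20SELECT%20password%20FROM%20users"))
```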

View file

@ -14,6 +14,8 @@
Hutto, C.J. & Gilbert, E.E. (2014). VADER: A Parsimonious Rule-based Model for Sentiment Analysis of Social Media Text. Eighth International Conference on Weblogs and Social Media (ICWSM-14). Ann Arbor, MI, June 2014.
"""
import os
import sys
import time
import datetime
@ -24,6 +26,9 @@ from pubsublogger import publisher
from Helper import Process
from packages import Paste
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'lib/'))
import ConfigLoader
from nltk.sentiment.vader import SentimentIntensityAnalyzer
from nltk import tokenize
@ -32,19 +37,6 @@ accepted_Mime_type = ['text/plain']
size_threshold = 250
line_max_length_threshold = 1000
import os
import configparser
configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
if not os.path.exists(configfile):
raise Exception('Unable to find the configuration file. \
Did you set environment variables? \
Or activate the virtualenv.')
cfg = configparser.ConfigParser()
cfg.read(configfile)
sentiment_lexicon_file = cfg.get("Directories", "sentiment_lexicon_file")
#time_clean_sentiment_db = 60*60
def Analyse(message, server):
@ -151,12 +143,12 @@ if __name__ == '__main__':
# Sent to the logging a description of the module
publisher.info("<description of the module>")
config_loader = ConfigLoader.ConfigLoader()
sentiment_lexicon_file = config_loader.get_config_str("Directories", "sentiment_lexicon_file")
# REDIS_LEVEL_DB #
server = redis.StrictRedis(
host=p.config.get("ARDB_Sentiment", "host"),
port=p.config.get("ARDB_Sentiment", "port"),
db=p.config.get("ARDB_Sentiment", "db"),
decode_responses=True)
server = config_loader.get_redis_conn("ARDB_Sentiment")
config_loader = None
time1 = time.time()

View file

@ -1,151 +0,0 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
"""
This module is used for term frequency.
It processes every paste coming from the global module and tests the sets
supplied in the term webpage.
"""
import redis
import time
from pubsublogger import publisher
from packages import lib_words
from packages import Paste
import os
import datetime
import calendar
import re
import ast
from Helper import Process
# Email notifications
from NotificationHelper import *
# Config Variables
BlackListTermsSet_Name = "BlackListSetTermSet"
TrackedTermsSet_Name = "TrackedSetTermSet"
TrackedRegexSet_Name = "TrackedRegexSet"
TrackedSetSet_Name = "TrackedSetSet"
top_term_freq_max_set_cardinality = 20 # Max cardinality of the term frequency set
oneDay = 60*60*24
top_termFreq_setName_day = ["TopTermFreq_set_day_", 1]
top_termFreq_setName_week = ["TopTermFreq_set_week", 7]
top_termFreq_setName_month = ["TopTermFreq_set_month", 31]
top_termFreq_set_array = [top_termFreq_setName_day,top_termFreq_setName_week, top_termFreq_setName_month]
TrackedTermsNotificationTagsPrefix_Name = "TrackedNotificationTags_"
# create direct link in mail
full_paste_url = "/showsavedpaste/?paste="
def add_quote_inside_tab(tab):
quoted_tab = "["
for elem in tab[1:-1].split(','):
elem = elem.lstrip().strip()
quoted_tab += "\'{}\', ".format(elem)
quoted_tab = quoted_tab[:-2] #remove trailing ,
quoted_tab += "]"
return str(quoted_tab)
if __name__ == "__main__":
publisher.port = 6380
publisher.channel = "Script"
config_section = 'SetForTermsFrequency'
p = Process(config_section)
# REDIS #
server_term = redis.StrictRedis(
host=p.config.get("ARDB_TermFreq", "host"),
port=p.config.get("ARDB_TermFreq", "port"),
db=p.config.get("ARDB_TermFreq", "db"),
decode_responses=True)
# FUNCTIONS #
publisher.info("RegexForTermsFrequency script started")
# create direct link in mail
full_paste_url = p.config.get("Notifications", "ail_domain") + full_paste_url
#get the dico and matching percent
dico_percent = {}
dico_set_tab = {}
dico_setname_to_redis = {}
for set_str in server_term.smembers(TrackedSetSet_Name):
tab_set = set_str[1:-1]
tab_set = add_quote_inside_tab(tab_set)
perc_finder = re.compile("\[[0-9]{1,3}\]").search(tab_set)
if perc_finder is not None:
match_percent = perc_finder.group(0)[1:-1]
dico_percent[tab_set] = float(match_percent)
dico_set_tab[tab_set] = ast.literal_eval(tab_set)
dico_setname_to_redis[tab_set] = set_str
else:
continue
message = p.get_from_set()
while True:
if message is not None:
filename = message
temp = filename.split('/')
timestamp = calendar.timegm((int(temp[-4]), int(temp[-3]), int(temp[-2]), 0, 0, 0))
content = Paste.Paste(filename).get_p_content()
curr_set = top_termFreq_setName_day[0] + str(timestamp)
#iterate over the words of the file
match_dico = {}
for word in content.split():
for cur_set, array_set in dico_set_tab.items():
for w_set in array_set[:-1]: #avoid the percent matching
if word == w_set:
try:
match_dico[str(array_set)] += 1
except KeyError:
match_dico[str(array_set)] = 1
#compute matching %
for the_set, matchingNum in match_dico.items():
eff_percent = float(matchingNum) / float((len(ast.literal_eval(the_set))-1)) * 100 # -1 because the last element is the matching percent, not a word
if eff_percent >= dico_percent[the_set]:
# Send a notification only when the member is in the set
if dico_setname_to_redis[str(the_set)] in server_term.smembers(TrackedTermsNotificationEnabled_Name):
# create mail body
mail_body = ("AIL Framework,\n"
"New occurrence for term: " + dico_setname_to_redis[str(the_set)] + "\n"
''+full_paste_url + filename)
# Send to every associated email address
for email in server_term.smembers(TrackedTermsNotificationEmailsPrefix_Name + dico_setname_to_redis[str(the_set)]):
sendEmailNotification(email, 'Term', mail_body)
# tag paste
for tag in server_term.smembers(TrackedTermsNotificationTagsPrefix_Name + dico_setname_to_redis[str(the_set)]):
msg = '{};{}'.format(tag, filename)
p.populate_set_out(msg, 'Tags')
print(the_set, "matched in", filename)
set_name = 'set_' + dico_setname_to_redis[the_set]
new_to_the_set = server_term.sadd(set_name, filename)
new_to_the_set = True if new_to_the_set == 1 else False
#consider the num of occurence of this set
set_value = int(server_term.hincrby(timestamp, dico_setname_to_redis[the_set], int(1)))
# FIXME - avoid using per paste as a set is checked over the entire paste
#1 term per paste
if new_to_the_set:
set_value_perPaste = int(server_term.hincrby("per_paste_" + str(timestamp), dico_setname_to_redis[the_set], int(1)))
server_term.zincrby("per_paste_" + curr_set, dico_setname_to_redis[the_set], float(1))
server_term.zincrby(curr_set, dico_setname_to_redis[the_set], float(1))
else:
publisher.debug("Script RegexForTermsFrequency is Idling")
print("sleeping")
time.sleep(5)
message = p.get_from_set()

View file

@ -1,68 +0,0 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
"""
The ZMQ_Feed_Q Module
=====================
This module consumes the Redis list created by the ZMQ_Feed_Q Module
and saves the paste on disk to allow other modules to work on them.
..todo:: Be able to choose to delete or not the saved paste after processing.
..todo:: Store the empty paste (unprocessed) somewhere in Redis.
..note:: Module ZMQ_Something_Q and ZMQ_Something are closely bound, always put
the same Subscriber name in both of them.
Requirements
------------
*Need running Redis instances.
*Need the ZMQ_Feed_Q Module running to be able to work properly.
"""
import redis
import configparser
import os
configfile = os.path.join(os.environ['AIL_BIN'], './packages/config.cfg')
def main():
"""Main Function"""
# CONFIG #
cfg = configparser.ConfigParser()
cfg.read(configfile)
# REDIS
r_serv = redis.StrictRedis(host=cfg.get("Redis_Queues", "host"),
port=cfg.getint("Redis_Queues", "port"),
db=cfg.getint("Redis_Queues", "db"),
decode_responses=True)
# FIXME: automatic based on the queue name.
# ### SCRIPTS ####
r_serv.sadd("SHUTDOWN_FLAGS", "Feed")
r_serv.sadd("SHUTDOWN_FLAGS", "Categ")
r_serv.sadd("SHUTDOWN_FLAGS", "Lines")
r_serv.sadd("SHUTDOWN_FLAGS", "Tokenize")
r_serv.sadd("SHUTDOWN_FLAGS", "Attributes")
r_serv.sadd("SHUTDOWN_FLAGS", "Creditcards")
r_serv.sadd("SHUTDOWN_FLAGS", "Duplicate")
r_serv.sadd("SHUTDOWN_FLAGS", "Mails")
r_serv.sadd("SHUTDOWN_FLAGS", "Onion")
r_serv.sadd("SHUTDOWN_FLAGS", "Urls")
r_serv.sadd("SHUTDOWN_FLAGS", "Feed_Q")
r_serv.sadd("SHUTDOWN_FLAGS", "Categ_Q")
r_serv.sadd("SHUTDOWN_FLAGS", "Lines_Q")
r_serv.sadd("SHUTDOWN_FLAGS", "Tokenize_Q")
r_serv.sadd("SHUTDOWN_FLAGS", "Attributes_Q")
r_serv.sadd("SHUTDOWN_FLAGS", "Creditcards_Q")
r_serv.sadd("SHUTDOWN_FLAGS", "Duplicate_Q")
r_serv.sadd("SHUTDOWN_FLAGS", "Mails_Q")
r_serv.sadd("SHUTDOWN_FLAGS", "Onion_Q")
r_serv.sadd("SHUTDOWN_FLAGS", "Urls_Q")
if __name__ == "__main__":
main()

View file

@ -16,6 +16,8 @@ import datetime
from pubsublogger import publisher
from Helper import Process
from packages import Paste
from packages import Item
def get_item_date(item_filename):
l_directory = item_filename.split('/')
@ -84,6 +86,12 @@ if __name__ == '__main__':
set_tag_metadata(tag, item_date)
server_metadata.sadd('tag:{}'.format(path), tag)
# Domain Object
if Item.is_crawled(path) and tag!='infoleak:submission="crawler"':
domain = Item.get_item_domain(path)
server_metadata.sadd('tag:{}'.format(domain), tag)
server.sadd('domain:{}:{}'.format(tag, item_date), domain)
curr_date = datetime.date.today().strftime("%Y%m%d")
server.hincrby('daily_tags:{}'.format(item_date), tag, 1)
p.populate_set_out(message, 'MISP_The_Hive_feeder')
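For crawled items the tag is now also propagated to the item's domain object: the tag joins the domain's tag set and the domain is indexed per tag and date. A rough sketch of the key layout written by the new lines (tag, date and domain values are invented):
```
tag = 'infoleak:automatic-detection="credential"'  # example tag, invented
item_date = '20191117'                             # item date, as used in the hunk above
domain = 'example2345dummy.onion'                  # Item.get_item_domain(path) for a crawled item

# Keys written by the new lines (metadata server and main server respectively):
print('tag:{}'.format(domain))                # SADD tag:<domain> <tag>
print('domain:{}:{}'.format(tag, item_date))  # SADD domain:<tag>:<date> <domain>
```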

121
bin/TermTrackerMod.py Executable file
View file

@ -0,0 +1,121 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
"""
The TermTracker Module
===================
"""
import os
import sys
import time
import signal
from Helper import Process
from pubsublogger import publisher
import NotificationHelper
from packages import Item
from packages import Term
full_item_url = "/showsavedpaste/?paste="
mail_body_template = "AIL Framework,\nNew occurrence for term tracked term: {}\nitem id: {}\nurl: {}{}"
# loads tracked words
list_tracked_words = Term.get_tracked_words_list()
last_refresh_word = time.time()
set_tracked_words_list = Term.get_set_tracked_words_list()
last_refresh_set = time.time()
class TimeoutException(Exception):
pass
def timeout_handler(signum, frame):
raise TimeoutException
signal.signal(signal.SIGALRM, timeout_handler)
def new_term_found(term, term_type, item_id, item_date):
uuid_list = Term.get_term_uuid_list(term, term_type)
print('new tracked term found: {} in {}'.format(term, item_id))
for term_uuid in uuid_list:
Term.add_tracked_item(term_uuid, item_id, item_date)
tags_to_add = Term.get_term_tags(term_uuid)
for tag in tags_to_add:
msg = '{};{}'.format(tag, item_id)
p.populate_set_out(msg, 'Tags')
mail_to_notify = Term.get_term_mails(term_uuid)
if mail_to_notify:
mail_body = mail_body_template.format(term, item_id, full_item_url, item_id)
for mail in mail_to_notify:
NotificationHelper.sendEmailNotification(mail, 'Term Tracker', mail_body)
if __name__ == "__main__":
publisher.port = 6380
publisher.channel = "Script"
publisher.info("Script TermTrackerMod started")
config_section = 'TermTrackerMod'
p = Process(config_section)
max_execution_time = p.config.getint(config_section, "max_execution_time")
full_item_url = p.config.get("Notifications", "ail_domain") + full_item_url
while True:
item_id = p.get_from_set()
if item_id is not None:
item_date = Item.get_item_date(item_id)
item_content = Item.get_item_content(item_id)
signal.alarm(max_execution_time)
try:
dict_words_freq = Term.get_text_word_frequency(item_content)
except TimeoutException:
print ("{0} processing timeout".format(paste.p_rel_path))
continue
else:
signal.alarm(0)
# create token statistics
#for word in dict_words_freq:
# Term.create_token_statistics(item_date, word, dict_words_freq[word])
# check solo words
for word in list_tracked_words:
if word in dict_words_freq:
new_term_found(word, 'word', item_id, item_date)
# check words set
for elem in set_tracked_words_list:
list_words = elem[0]
nb_words_threshold = elem[1]
word_set = elem[2]
nb_uniq_word = 0
for word in list_words:
if word in dict_words_freq:
nb_uniq_word += 1
if nb_uniq_word >= nb_words_threshold:
new_term_found(word_set, 'set', item_id, item_date)
else:
time.sleep(5)
# refresh Tracked term
if last_refresh_word < Term.get_tracked_term_last_updated_by_type('word'):
list_tracked_words = Term.get_tracked_words_list()
last_refresh_word = time.time()
print('Tracked word refreshed')
if last_refresh_set < Term.get_tracked_term_last_updated_by_type('set'):
set_tracked_words_list = Term.get_set_tracked_words_list()
last_refresh_set = time.time()
print('Tracked set refreshed')
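For tracked word sets, the loop above counts how many distinct words of a set occur in the item and only calls new_term_found when that count reaches the stored threshold. A toy illustration with invented values:
```
# Hypothetical tracked set: at least 2 of the 3 words must appear in the item.
list_words = ['leak', 'database', 'dump']
nb_words_threshold = 2
dict_words_freq = {'database': 4, 'dump': 1, 'hello': 7}  # toy word frequencies for one item

nb_uniq_word = sum(1 for word in list_words if word in dict_words_freq)
if nb_uniq_word >= nb_words_threshold:
    print('set matched')  # 2 >= 2, so the module would report it via new_term_found
```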

View file

@ -1,71 +0,0 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
"""
The Tokenize Module
===================
This module is consuming the Redis-list created by the ZMQ_PubSub_Tokenize_Q
Module.
It tokenizes the content of the paste and publishes the result in the following
format:
channel_name+' '+/path/of/the/paste.gz+' '+tokenized_word+' '+scoring
..seealso:: Paste method (_get_top_words)
..note:: Module ZMQ_Something_Q and ZMQ_Something are closely bound, always put
the same Subscriber name in both of them.
Requirements
------------
*Need running Redis instances. (Redis)
*Need the ZMQ_PubSub_Tokenize_Q Module running to be able to work properly.
"""
import time
from packages import Paste
from pubsublogger import publisher
from Helper import Process
import signal
class TimeoutException(Exception):
pass
def timeout_handler(signum, frame):
raise TimeoutException
signal.signal(signal.SIGALRM, timeout_handler)
if __name__ == "__main__":
publisher.port = 6380
publisher.channel = "Script"
config_section = 'Tokenize'
p = Process(config_section)
# LOGGING #
publisher.info("Tokeniser started")
while True:
message = p.get_from_set()
print(message)
if message is not None:
paste = Paste.Paste(message)
signal.alarm(5)
try:
for word, score in paste._get_top_words().items():
if len(word) >= 4:
msg = '{} {} {}'.format(paste.p_rel_path, word, score)
p.populate_set_out(msg)
except TimeoutException:
p.incr_module_timeout_statistic()
print ("{0} processing timeout".format(paste.p_rel_path))
continue
else:
signal.alarm(0)
else:
publisher.debug("Tokeniser is idling 10s")
time.sleep(10)
print("Sleeping")

753
bin/Tools.py Executable file
View file

@ -0,0 +1,753 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
"""
Tools Module
============================
Search tools output
"""
from Helper import Process
from pubsublogger import publisher
import os
import re
import sys
import time
import redis
import signal
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'packages'))
import Item
class TimeoutException(Exception):
pass
def timeout_handler(signum, frame):
raise TimeoutException
signal.signal(signal.SIGALRM, timeout_handler)
def search_tools(item_id, item_content):
tools_in_item = False
for tools_name in tools_dict:
tool_dict = tools_dict[tools_name]
regex_match = False
for regex_nb in list(range(tool_dict['nb_regex'])):
regex_index = regex_nb + 1
regex = tool_dict['regex{}'.format(regex_index)]
signal.alarm(tool_dict['max_execution_time'])
try:
tools_found = re.findall(regex, item_content)
except TimeoutException:
tools_found = []
p.incr_module_timeout_statistic() # count regex timeouts for this module
print ("{0} processing timeout".format(item_id))
continue
else:
signal.alarm(0)
if not tools_found:
regex_match = False
break
else:
regex_match = True
if 'tag{}'.format(regex_index) in tool_dict:
print('{} found: {}'.format(item_id, tool_dict['tag{}'.format(regex_index)]))
msg = '{};{}'.format(tool_dict['tag{}'.format(regex_index)], item_id)
p.populate_set_out(msg, 'Tags')
if regex_match:
tools_in_item = True
print('{} found: {}'.format(item_id, tool_dict['name']))
# Tag Item
msg = '{};{}'.format(tool_dict['tag'], item_id)
p.populate_set_out(msg, 'Tags')
if tools_in_item:
# send to duplicate module
p.populate_set_out(item_id, 'Duplicate')
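Each entry in tools_dict below declares nb_regex patterns that must all match before the entry's tag is applied, and an optional tag<N> key tags the item as soon as regex<N> matches. A hypothetical two-regex entry (tool name, regexes and tags are invented for illustration) would look like this:
```
# Hypothetical entry, not part of this commit: 'tag2' is applied as soon as regex2
# matches, while 'tag' is only applied once every declared regex has matched.
default_max_execution_time = 30

tools_dict = {
    'sometool': {
        'name': 'sometool',
        'regex1': r'SomeTool v[\d.]+',
        'regex2': r'Scan completed in \d+ seconds',
        'tag2': 'infoleak:automatic-detection="sometool-scan-summary"',
        'nb_regex': 2,
        'max_execution_time': default_max_execution_time,
        'tag': 'infoleak:automatic-detection="sometool-tool"',
    },
}
```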
default_max_execution_time = 30
tools_dict = {
'sqlmap': {
'name': 'sqlmap',
'regex1': r'Usage of sqlmap for attacking targets without|all tested parameters do not appear to be injectable|sqlmap identified the following injection point|Title:[^\n]*((error|time|boolean)-based|stacked queries|UNION query)',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="sqlmap-tool"', # tag if all regex match
},
'wig': {
'name': 'wig',
'regex1': r'(?s)wig - WebApp Information Gatherer.+?_{10,}',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="wig-tool"', # tag if all regex match
},
'dmytry': {
'name': 'dmitry',
'regex1': r'(?s)Gathered (TCP Port|Inet-whois|Netcraft|Subdomain|E-Mail) information for.+?-{10,}',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="dmitry-tool"', # tag if all regex match
},
'inurlbr': {
'name': 'inurlbr',
'regex1': r'Usage of INURLBR for attacking targets without prior mutual consent is illegal',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="inurlbr-tool"', # tag if all regex match
},
'wafw00f': {
'name': 'wafw00f',
'regex1': r'(?s)WAFW00F - Web Application Firewall Detection Tool.+?Checking',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="wafw00f-tool"', # tag if all regex match
},
'sslyze': {
'name': 'sslyze',
'regex1': r'(?s)PluginSessionRenegotiation.+?SCAN RESULTS FOR',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="sslyze-tool"', # tag if all regex match
},
'nmap': {
'name': 'nmap',
'regex1': r'(?s)Nmap scan report for.+?Host is',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="nmap-tool"', # tag if all regex match
},
'dnsenum': {
'name': 'dnsenum',
'regex1': r'(?s)dnsenum(\.pl)? VERSION:.+?Trying Zone Transfer',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="dnsenum-tool"', # tag if all regex match
},
'knock': {
'name': 'knock',
'regex1': r'I scannig with my internal wordlist',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="knock-tool"', # tag if all regex match
},
'nikto': {
'name': 'nikto',
'regex1': r'(?s)\+ Target IP:.+?\+ Start Time:',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="nikto-tool"', # tag if all regex match
},
'dnscan': {
'name': 'dnscan',
'regex1': r'(?s)\[\*\] Processing domain.+?\[\+\] Getting nameservers.+?records found',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="dnscan-tool"', # tag if all regex match
},
'dnsrecon': {
'name': 'dnsrecon',
'regex1': r'Performing General Enumeration of Domain:|Performing TLD Brute force Enumeration against',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="dnsrecon-tool"', # tag if all regex match
},
'striker': {
'name': 'striker',
'regex1': r'Crawling the target for fuzzable URLs|Honeypot Probabilty:',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="striker-tool"', # tag if all regex match
},
'rhawk': {
'name': 'rhawk',
'regex1': r'S U B - D O M A I N F I N D E R',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="rhawk-tool"', # tag if all regex match
},
'uniscan': {
'name': 'uniscan',
'regex1': r'\| \[\+\] E-mail Found:',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="uniscan-tool"', # tag if all regex match
},
'masscan': {
'name': 'masscan',
'regex1': r'(?s)Starting masscan [\d.]+.+?Scanning|bit.ly/14GZzcT',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="masscan-tool"', # tag if all regex match
},
'msfconsole': {
'name': 'msfconsole',
'regex1': r'=\[ metasploit v[\d.]+.+?msf >',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="msfconsole-tool"', # tag if all regex match
},
'amap': {
'name': 'amap',
'regex1': r'\bamap v[\d.]+ \(www.thc.org/thc-amap\)',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="amap-tool"', # tag if all regex match
},
'automater': {
'name': 'automater',
'regex1': r'(?s)\[\*\] Checking.+?_+ Results found for:',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="automater-tool"', # tag if all regex match
},
'braa': {
'name': 'braa',
'regex1': r'\bbraa public@[\d.]+',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="braa-tool"', # tag if all regex match
},
'ciscotorch': {
'name': 'ciscotorch',
'regex1': r'Becase we need it',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="ciscotorch-tool"', # tag if all regex match
},
'theharvester': {
'name': 'theharvester',
'regex1': r'Starting harvesting process for domain:',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="theharvester-tool"', # tag if all regex match
},
'sslstrip': {
'name': 'sslstrip',
'regex1': r'sslstrip [\d.]+ by Moxie Marlinspike running',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="sslstrip-tool"', # tag if all regex match
},
'sslcaudit': {
'name': 'sslcaudit',
'regex1': r'# filebag location:',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="sslcaudit-tool"', # tag if all regex match
},
'smbmap': {
'name': 'smbmap',
'regex1': r'\[\+\] Finding open SMB ports\.\.\.',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="smbmap-tool"', # tag if all regex match
},
'reconng': {
'name': 'reconng',
'regex1': r'\[\*\] Status: unfixed|\[recon-ng\]\[default\]',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="reconng-tool"', # tag if all regex match
},
'p0f': {
'name': 'p0f',
'regex1': r'\bp0f [^ ]+ by Michal Zalewski',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="p0f-tool"', # tag if all regex match
},
'hping3': {
'name': 'hping3',
'regex1': r'\bHPING [^ ]+ \([^)]+\): [^ ]+ mode set',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="hping3-tool"', # tag if all regex match
},
'enum4linux': {
'name': 'enum4linux',
'regex1': r'Starting enum4linux v[\d.]+|\| Target Information \|',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="enum4linux-tool"', # tag if all regex match
},
'dnstracer': {
'name': 'dnstracer',
'regex1': r'(?s)Tracing to.+?DNS HEADER \(send\)',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="dnstracer-tool"', # tag if all regex match
},
'dnmap': {
'name': 'dnmap',
'regex1': r'dnmap_(client|server)|Nmap output files stored in \'nmap_output\' directory',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="dnmap-tool"', # tag if all regex match
},
'arpscan': {
'name': 'arpscan',
'regex1': r'Starting arp-scan [^ ]+ with \d+ hosts',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="arpscan-tool"', # tag if all regex match
},
'cdpsnarf': {
'name': 'cdpsnarf',
'regex1': r'(?s)CDPSnarf v[^ ]+.+?Waiting for a CDP packet\.\.\.',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="cdpsnarf-tool"', # tag if all regex match
},
'dnsmap': {
'name': 'dnsmap',
'regex1': r'DNS Network Mapper by pagvac',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="dnsmap-tool"', # tag if all regex match
},
'dotdotpwn': {
'name': 'dotdotpwn',
'regex1': r'DotDotPwn v[^ ]+|dotdotpwn@sectester.net|\[\+\] Creating Traversal patterns',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="dotdotpwn-tool"', # tag if all regex match
},
'searchsploit': {
'name': 'searchsploit',
'regex1': r'(exploits|shellcodes)/|searchsploit_rc|Exploit Title',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="searchsploit-tool"', # tag if all regex match
},
'fierce': {
'name': 'fierce',
'regex1': r'(?s)Trying zone transfer first.+Checking for wildcard DNS',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="fierce-tool"', # tag if all regex match
},
'firewalk': {
'name': 'firewalk',
'regex1': r'Firewalk state initialization completed successfully|Ramping phase source port',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="firewalk-tool"', # tag if all regex match
},
'fragroute': {
'name': 'fragroute',
'regex1': r'\bfragroute: tcp_seg -> ip_frag',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="fragroute-tool"', # tag if all regex match
},
'fragrouter': {
'name': 'fragrouter',
'regex1': r'fragrouter: frag-\d+:',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="fragrouter-tool"', # tag if all regex match
},
'goofile': {
'name': 'goofile',
'regex1': r'code.google.com/p/goofile\b',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="goofile-tool"', # tag if all regex match
},
'intrace': {
'name': 'intrace',
'regex1': r'\bInTrace [\d.]+ \-\-',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="intrace-tool"', # tag if all regex match
},
'ismtp': {
'name': 'ismtp',
'regex1': r'Testing SMTP server \[user enumeration\]',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="ismtp-tool"', # tag if all regex match
},
'lbd': {
'name': 'lbd',
'regex1': r'Checking for (DNS|HTTP)-Loadbalancing',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="lbd-tool"', # tag if all regex match
},
'miranda': {
'name': 'miranda',
'regex1': r'Entering discovery mode for \'upnp:',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="miranda-tool"', # tag if all regex match
},
'ncat': {
'name': 'ncat',
'regex1': r'nmap.org/ncat',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="ncat-tool"', # tag if all regex match
},
'ohrwurm': {
'name': 'ohrwurm',
'regex1': r'\bohrwurm-[\d.]+',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="ohrwurm-tool"', # tag if all regex match
},
'oscanner': {
'name': 'oscanner',
'regex1': r'Loading services/sids from service file',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="oscanner-tool"', # tag if all regex match
},
'sfuzz': {
'name': 'sfuzz',
'regex1': r'AREALLYBADSTRING|sfuzz/sfuzz',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="sfuzz-tool"', # tag if all regex match
},
'sidguess': {
'name': 'sidguess',
'regex1': r'SIDGuesser v[\d.]+',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="sidguess-tool"', # tag if all regex match
},
'sqlninja': {
'name': 'sqlninja',
'regex1': r'Sqlninja rel\. [\d.]+',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="sqlninja-tool"', # tag if all regex match
},
'sqlsus': {
'name': 'sqlsus',
'regex1': r'sqlsus version [\d.]+',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="sqlsus-tool"', # tag if all regex match
},
'dnsdict6': {
'name': 'dnsdict6',
'regex1': r'Starting DNS enumeration work on',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="dnsdict6-tool"', # tag if all regex match
},
'unixprivesccheck': {
'name': 'unixprivesccheck',
'regex1': r'Recording Interface IP addresses',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="unixprivesccheck-tool"', # tag if all regex match
},
'yersinia': {
'name': 'yersinia',
'regex1': r'yersinia@yersinia.net',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="yersinia-tool"', # tag if all regex match
},
'armitage': {
'name': 'armitage',
'regex1': r'\[\*\] Starting msfrpcd for you',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="armitage-tool"', # tag if all regex match
},
'backdoorfactory': {
'name': 'backdoorfactory',
'regex1': r'\[\*\] In the backdoor module',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="backdoorfactory-tool"', # tag if all regex match
},
'beef': {
'name': 'beef',
'regex1': r'Please wait as BeEF services are started',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="beef-tool"', # tag if all regex match
},
'cat': {
'name': 'cat',
'regex1': r'Cisco Auditing Tool.+?g0ne',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="cat-tool"', # tag if all regex match
},
'cge': {
'name': 'cge',
'regex1': r'Vulnerability successful exploited with \[',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="cge-tool"', # tag if all regex match
},
'john': {
'name': 'john',
'regex1': r'John the Ripper password cracker, ver:|Loaded \d+ password hash \(',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="john-tool"', # tag if all regex match
},
'keimpx': {
'name': 'keimpx',
'regex1': r'\bkeimpx [\d.]+',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="keimpx-tool"', # tag if all regex match
},
'maskprocessor': {
'name': 'maskprocessor',
'regex1': r'mp by atom, High-Performance word generator',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="maskprocessor-tool"', # tag if all regex match
},
'ncrack': {
'name': 'ncrack',
'regex1': r'Starting Ncrack[^\n]+http://ncrack.org',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="ncrack-tool"', # tag if all regex match
},
'patator': {
'name': 'patator',
'regex1': r'http://code.google.com/p/patator/|Starting Patator v',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="patator-tool"', # tag if all regex match
},
'phrasendrescher': {
'name': 'phrasendrescher',
'regex1': r'phrasen\|drescher [\d.]+',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="phrasendrescher-tool"', # tag if all regex match
},
'polenum': {
'name': 'polenum',
'regex1': r'\[\+\] Password Complexity Flags:',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="polenum-tool"', # tag if all regex match
},
'rainbowcrack': {
'name': 'rainbowcrack',
'regex1': r'Official Website: http://project-rainbowcrack.com/',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="rainbowcrack-tool"', # tag if all regex match
},
'rcracki_mt': {
'name': 'rcracki_mt',
'regex1': r'Found \d+ rainbowtable files\.\.\.',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="rcracki_mt-tool"', # tag if all regex match
},
'tcpdump': {
'name': 'tcpdump',
'regex1': r'tcpdump: listening on.+capture size \d+|\d+ packets received by filter',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="tcpdump-tool"', # tag if all regex match
},
'hydra': {
'name': 'hydra',
'regex1': r'Hydra \(http://www.thc.org/thc-hydra\)',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="hydra-tool"', # tag if all regex match
},
'netcat': {
'name': 'netcat',
'regex1': r'Listening on \[[\d.]+\] \(family',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="netcat-tool"', # tag if all regex match
},
'nslookup': {
'name': 'nslookup',
'regex1': r'Non-authoritative answer:',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="nslookup-tool"', # tag if all regex match
},
'dig': {
'name': 'dig',
'regex1': r'; <<>> DiG [\d.]+',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="dig-tool"', # tag if all regex match
},
'whois': {
'name': 'whois',
'regex1': r'(?i)Registrar WHOIS Server:|Registrar URL: http://|DNSSEC: unsigned|information on Whois status codes|REGISTERED, DELEGATED|[Rr]egistrar:|%[^\n]+(WHOIS|2016/679)',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="whois-tool"', # tag if all regex match
},
'nessus': {
'name': 'nessus',
'regex1': r'nessus_(report_(get|list|exploits)|scan_(new|status))|nessuscli|nessusd|nessus-service',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="nessus-tool"', # tag if all regex match
},
'openvas': {
'name': 'openvas',
'regex1': r'/openvas/',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="openvas-tool"', # tag if all regex match
},
'golismero': {
'name': 'golismero',
'regex1': r'GoLismero[\n]+The Web Knife',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="golismero-tool"', # tag if all regex match
},
'wpscan': {
'name': 'wpscan',
'regex1': r'WordPress Security Scanner by the WPScan Team|\[\+\] Interesting header:',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="wpscan-tool"', # tag if all regex match
},
'skipfish': {
'name': 'skipfish',
'regex1': r'\[\+\] Sorting and annotating crawl nodes:|skipfish version [\d.]+',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="skipfish-tool"', # tag if all regex match
},
'arachni': {
'name': 'arachni',
'regex1': r'With the support of the community and the Arachni Team|\[\*\] Waiting for plugins to settle\.\.\.',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="arachni-tool"', # tag if all regex match
},
'dirb': {
'name': 'dirb',
'regex1': r'==> DIRECTORY:|\bDIRB v[\d.]+',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="dirb-tool"', # tag if all regex match
},
'joomscan': {
'name': 'joomscan',
'regex1': r'OWASP Joomla! Vulnerability Scanner v[\d.]+',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="joomscan-tool"', # tag if all regex match
},
'jbossautopwn': {
'name': 'jbossautopwn',
'regex1': r'\[x\] Now creating BSH script\.\.\.|\[x\] Now deploying \.war file:',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="jbossautopwn-tool"', # tag if all regex match
},
'grabber': {
'name': 'grabber',
'regex1': r'runSpiderScan @',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="grabber-tool"', # tag if all regex match
},
'fimap': {
'name': 'fimap',
'regex1': r'Automatic LFI/RFI scanner and exploiter',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="fimap-tool"', # tag if all regex match
},
'dsxs': {
'name': 'dsxs',
'regex1': r'Damn Small XSS Scanner \(DSXS\)',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="dsxs-tool"', # tag if all regex match
},
'dsss': {
'name': 'dsss',
'regex1': r'Damn Small SQLi Scanner \(DSSS\)',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="dsss-tool"', # tag if all regex match
},
'dsjs': {
'name': 'dsjs',
'regex1': r'Damn Small JS Scanner \(DSJS\)',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="dsjs-tool"', # tag if all regex match
},
'dsfs': {
'name': 'dsfs',
'regex1': r'Damn Small FI Scanner \(DSFS\)',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="dsfs-tool"', # tag if all regex match
},
'identywaf': {
'name': 'identywaf',
'regex1': r'\[o\] initializing handlers\.\.\.',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="identywaf-tool"', # tag if all regex match
},
'whatwaf': {
'name': 'whatwaf',
'regex1': r'<sCRIPT>ALeRt.+?WhatWaf\?',
'nb_regex': 1,
'max_execution_time': default_max_execution_time,
'tag': 'infoleak:automatic-detection="whatwaf-tool"', # tag if all regex match
}
}
if __name__ == "__main__":
publisher.port = 6380
publisher.channel = "Script"
config_section = 'Tools'
# # TODO: add duplicate
# Setup the I/O queues
p = Process(config_section)
# Send a description of the module to the logger
publisher.info("Run Tools module ")
# Endless loop getting messages from the input queue
while True:
# Get one message from the input queue
item_id = p.get_from_set()
if item_id is None:
publisher.debug("{} queue is empty, waiting".format(config_section))
time.sleep(1)
continue
# Do something with the message from the queue
item_content = Item.get_item_content(item_id)
search_tools(item_id, item_content)
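The `search_tools` function itself is not part of this hunk. A minimal sketch of what such a detection pass could look like, assuming it simply applies each tool signature to the item content and collects the corresponding tags (all entries shown above declare a single `regex1` with `nb_regex` set to 1); this is an illustration, not the module's actual implementation:

```python
import re

# Minimal sketch (not the real search_tools): apply each tool signature from
# tools_dict (defined above) to the item content and return the matching tags.
def search_tools_sketch(item_id, item_content, tools_dict):
    matching_tags = []
    for tool_name, tool in tools_dict.items():
        # Only 'regex1' entries appear in this hunk; a tool matches when all
        # of its declared regexes ('nb_regex') are found in the content.
        if re.search(tool['regex1'], item_content):
            matching_tags.append(tool['tag'])
    return matching_tags
```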

View file

@ -68,9 +68,9 @@ def main():
#------------------------------------------------------------------------------------#
config_file_default = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
config_file_default_sample = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg.sample')
config_file_default_backup = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg.backup')
config_file_default = os.path.join(os.environ['AIL_HOME'], 'configs/core.cfg')
config_file_default_sample = os.path.join(os.environ['AIL_HOME'], 'configs/core.cfg.sample')
config_file_default_backup = os.path.join(os.environ['AIL_HOME'], 'configs/core.cfg.backup')
config_file_update = os.path.join(os.environ['AIL_HOME'], 'configs/update.cfg')
config_file_update_sample = os.path.join(os.environ['AIL_HOME'], 'configs/update.cfg.sample')

View file

@ -1,13 +1,18 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
import os
import sys
from pymisp.tools.abstractgenerator import AbstractMISPObjectGenerator
import configparser
from packages import Paste
import datetime
import json
from io import BytesIO
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'lib/'))
import ConfigLoader
class AilLeakObject(AbstractMISPObjectGenerator):
def __init__(self, uuid_ail, p_source, p_date, p_content, p_duplicate, p_duplicate_number):
super(AbstractMISPObjectGenerator, self).__init__('ail-leak')
@ -35,9 +40,9 @@ class ObjectWrapper:
self.pymisp = pymisp
self.currentID_date = None
self.eventID_to_push = self.get_daily_event_id()
cfg = configparser.ConfigParser()
cfg.read('./packages/config.cfg')
self.maxDuplicateToPushToMISP = cfg.getint("ailleakObject", "maxDuplicateToPushToMISP")
config_loader = ConfigLoader.ConfigLoader()
self.maxDuplicateToPushToMISP = config_loader.get_config_int("ailleakObject", "maxDuplicateToPushToMISP")
config_loader = None
self.attribute_to_tag = None
def add_new_object(self, uuid_ail, path, p_source, tag):

View file

@ -17,36 +17,33 @@
#
# Copyright (c) 2014 Alexandre Dulaunoy - a@foo.be
import os
import sys
import zmq
import random
import sys
import time
import redis
import base64
import os
import configparser
configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
if not os.path.exists(configfile):
raise Exception('Unable to find the configuration file. \
Did you set environment variables? \
Or activate the virtualenv.')
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'lib/'))
import ConfigLoader
cfg = configparser.ConfigParser()
cfg.read(configfile)
config_loader = ConfigLoader.ConfigLoader()
if cfg.has_option("ZMQ_Global", "bind"):
zmq_url = cfg.get("ZMQ_Global", "bind")
if config_loader.has_option("ZMQ_Global", "bind"):
zmq_url = config_loader.get_config_str("ZMQ_Global", "bind")
else:
zmq_url = "tcp://127.0.0.1:5556"
pystemonpath = cfg.get("Directories", "pystemonpath")
pastes_directory = cfg.get("Directories", "pastes")
pystemonpath = config_loader.get_config_str("Directories", "pystemonpath")
pastes_directory = config_loader.get_config_str("Directories", "pastes")
pastes_directory = os.path.join(os.environ['AIL_HOME'], pastes_directory)
base_sleeptime = 0.01
sleep_inc = 0
config_loader = None
context = zmq.Context()
socket = context.socket(zmq.PUB)
socket.bind(zmq_url)
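Only the configuration loading of the feeder changes in this hunk; the publishing loop is not shown. For quick debugging, a minimal subscriber sketch that connects to the fallback bind address above (the real address comes from the `[ZMQ_Global]` section of `configs/core.cfg`); since the message layout is not part of this hunk, the sketch just prints raw frames:

```python
import zmq

# Minimal ZMQ subscriber sketch for inspecting the pystemon-feeder output.
# "tcp://127.0.0.1:5556" is the fallback bind address from the hunk above.
context = zmq.Context()
socket = context.socket(zmq.SUB)
socket.connect("tcp://127.0.0.1:5556")
socket.setsockopt_string(zmq.SUBSCRIBE, "")  # subscribe to every topic

while True:
    # The exact message format is not shown in this diff, so dump raw frames.
    print(socket.recv())
```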

View file

@ -10,11 +10,13 @@
#
# Copyright (c) 2014 Alexandre Dulaunoy - a@foo.be
import configparser
import argparse
import gzip
import os
import sys
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'lib/'))
import ConfigLoader
def readdoc(path=None):
if path is None:
@ -22,13 +24,11 @@ def readdoc(path=None):
f = gzip.open(path, 'r')
return f.read()
configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
cfg = configparser.ConfigParser()
cfg.read(configfile)
config_loader = ConfigLoader.ConfigLoader()
# Indexer configuration - index dir and schema setup
indexpath = os.path.join(os.environ['AIL_HOME'], cfg.get("Indexer", "path"))
indexertype = cfg.get("Indexer", "type")
indexpath = os.path.join(os.environ['AIL_HOME'], config_loader.get_config_str("Indexer", "path"))
indexertype = config_loader.get_config_str("Indexer", "type")
argParser = argparse.ArgumentParser(description='Fulltext search for AIL')
argParser.add_argument('-q', action='append', help='query to lookup (one or more)')
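The hunk above only swaps the configuration loading; the full-text index itself is still a Whoosh index located at `indexpath`. As a hedged illustration (not part of this diff), a lookup might proceed as sketched below; the `content` field name is an assumption about the schema defined elsewhere in AIL:

```python
from whoosh import index
from whoosh.qparser import QueryParser

def lookup(indexpath, query_string):
    # Open the index built by the AIL Indexer module (path from core.cfg).
    ix = index.open_dir(indexpath)
    with ix.searcher() as searcher:
        # 'content' is an assumed field name, not taken from this diff.
        query = QueryParser("content", ix.schema).parse(query_string)
        for hit in searcher.search(query, limit=10):
            print(hit)
```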

54
bin/lib/ConfigLoader.py Executable file
View file

@ -0,0 +1,54 @@
#!/usr/bin/python3
"""
The ``ConfigLoader``
===================
"""
import os
import sys
import time
import redis
import configparser
# Get Config file
config_dir = os.path.join(os.environ['AIL_HOME'], 'configs')
config_file = os.path.join(config_dir, 'core.cfg')
if not os.path.exists(config_file):
raise Exception('Unable to find the configuration file. \
Did you set environment variables? \
Or activate the virtualenv.')
# # TODO: create sphinx doc
# # TODO: add config_field to reload
class ConfigLoader(object):
"""docstring for Config_Loader."""
def __init__(self):
self.cfg = configparser.ConfigParser()
self.cfg.read(config_file)
def get_redis_conn(self, redis_name, decode_responses=True): ## TODO: verify redis name
return redis.StrictRedis( host=self.cfg.get(redis_name, "host"),
port=self.cfg.getint(redis_name, "port"),
db=self.cfg.getint(redis_name, "db"),
decode_responses=decode_responses )
def get_config_str(self, section, key_name):
return self.cfg.get(section, key_name)
def get_config_int(self, section, key_name):
return self.cfg.getint(section, key_name)
def get_config_boolean(self, section, key_name):
return self.cfg.getboolean(section, key_name)
def has_option(self, section, key_name):
return self.cfg.has_option(section, key_name)
def has_section(self, section):
return self.cfg.has_section(section)
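This is the pattern used throughout the new modules below: instantiate the loader once at import time, read the Redis/ARDB connections and configuration values the module needs, then drop the reference. A short usage sketch based on the class above:

```python
import os
import sys

sys.path.append(os.path.join(os.environ['AIL_BIN'], 'lib/'))
import ConfigLoader

# Load once, read what is needed, then release the loader reference.
config_loader = ConfigLoader.ConfigLoader()
r_serv_metadata = config_loader.get_redis_conn("ARDB_Metadata")
pastes_dir = config_loader.get_config_str("Directories", "pastes")
if config_loader.has_option("ZMQ_Global", "bind"):
    zmq_url = config_loader.get_config_str("ZMQ_Global", "bind")
config_loader = None
```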

288
bin/lib/Correlate_object.py Executable file
View file

@ -0,0 +1,288 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
import os
import sys
import uuid
import redis
from flask import url_for
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'lib/'))
import ConfigLoader
import Decoded
import Domain
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'packages/'))
import Pgp
import Cryptocurrency
import Item
config_loader = ConfigLoader.ConfigLoader()
r_serv_metadata = config_loader.get_redis_conn("ARDB_Metadata")
config_loader = None
def get_all_correlation_names():
'''
Return a list of all available correlations
'''
return ['pgp', 'cryptocurrency', 'decoded']
def get_all_correlation_objects():
'''
Return a list of all correlated objects
'''
return ['domain', 'paste']
def get_object_metadata(object_type, correlation_id, type_id=None):
if object_type == 'domain':
return Domain.Domain(correlation_id).get_domain_metadata()
elif object_type == 'paste':
return {}
elif object_type == 'decoded':
return Decoded.get_decoded_metadata(correlation_id, nb_seen=True, size=True)
elif object_type == 'pgp':
return Pgp.pgp.get_metadata(type_id, correlation_id)
elif object_type == 'cryptocurrency':
return Cryptocurrency.cryptocurrency.get_metadata(type_id, correlation_id)
def get_object_correlation(object_type, value, correlation_names, correlation_objects, requested_correl_type=None):
if object_type == 'domain':
return Domain.get_domain_all_correlation(value, correlation_names=correlation_names)
elif object_type == 'paste':
return Item.get_item_all_correlation(value, correlation_names=correlation_names)
elif object_type == 'decoded':
return Decoded.get_decoded_correlated_object(value, correlation_objects)
elif object_type == 'pgp':
return Pgp.pgp.get_correlation_all_object(requested_correl_type, value, correlation_objects=correlation_objects)
elif object_type == 'cryptocurrency':
return Cryptocurrency.cryptocurrency.get_correlation_all_object(requested_correl_type, value, correlation_objects=correlation_objects)
return {}
def get_correlation_node_icon(correlation_name, correlation_type=None, value=None):
'''
Used in UI Graph.
Return a font awesome icon for a given correlation_name.
:param correlation_name: correlation name
:param correlation_name: str
:param correlation_type: correlation type
:type correlation_type: str, optional
:return: a dictionary {font awesome class, icon_code}
:rtype: dict
'''
icon_class = 'fas'
icon_text = ''
node_color = "#332288"
node_radius = 6
if correlation_name == "pgp":
node_color = '#44AA99'
if correlation_type == 'key':
icon_text = '\uf084'
elif correlation_type == 'name':
icon_text = '\uf507'
elif correlation_type == 'mail':
icon_text = '\uf1fa'
else:
icon_text = 'times'
elif correlation_name == 'cryptocurrency':
node_color = '#DDCC77'
if correlation_type == 'bitcoin':
icon_class = 'fab'
icon_text = '\uf15a'
elif correlation_type == 'monero':
icon_class = 'fab'
icon_text = '\uf3d0'
elif correlation_type == 'ethereum':
icon_class = 'fab'
icon_text = '\uf42e'
else:
icon_text = '\uf51e'
elif correlation_name == 'decoded':
node_color = '#88CCEE'
correlation_type = Decoded.get_decoded_item_type(value).split('/')[0]
if correlation_type == 'application':
icon_text = '\uf15b'
elif correlation_type == 'audio':
icon_text = '\uf1c7'
elif correlation_type == 'image':
icon_text = '\uf1c5'
elif correlation_type == 'text':
icon_text = '\uf15c'
else:
icon_text = '\uf249'
elif correlation_name == 'domain':
node_radius = 5
node_color = '#3DA760'
if Domain.get_domain_type(value) == 'onion':
icon_text = '\uf06e'
else:
icon_class = 'fab'
icon_text = '\uf13b'
elif correlation_name == 'paste':
node_radius = 5
if Item.is_crawled(value):
node_color = 'red'
else:
node_color = '#332288'
return {"icon_class": icon_class, "icon_text": icon_text, "node_color": node_color, "node_radius": node_radius}
def get_item_url(correlation_name, value, correlation_type=None):
'''
Warning: use only in flask
'''
url = '#'
if correlation_name == "pgp":
endpoint = 'correlation.show_correlation'
url = url_for(endpoint, object_type="pgp", type_id=correlation_type, correlation_id=value)
elif correlation_name == 'cryptocurrency':
endpoint = 'correlation.show_correlation'
url = url_for(endpoint, object_type="cryptocurrency", type_id=correlation_type, correlation_id=value)
elif correlation_name == 'decoded':
endpoint = 'correlation.show_correlation'
url = url_for(endpoint, object_type="decoded", correlation_id=value)
elif correlation_name == 'domain':
endpoint = 'crawler_splash.showDomain'
url = url_for(endpoint, domain=value)
elif correlation_name == 'paste':
endpoint = 'showsavedpastes.showsavedpaste'
url = url_for(endpoint, paste=value)
return url
def create_graph_links(links_set):
graph_links_list = []
for link in links_set:
graph_links_list.append({"source": link[0], "target": link[1]})
return graph_links_list
def create_graph_nodes(nodes_set, root_node_id):
graph_nodes_list = []
for node_id in nodes_set:
correlation_name, correlation_type, value = node_id.split(';', 3)
dict_node = {"id": node_id}
dict_node['style'] = get_correlation_node_icon(correlation_name, correlation_type, value)
dict_node['text'] = value
if node_id == root_node_id:
dict_node["style"]["node_color"] = 'orange'
dict_node["style"]["node_radius"] = 7
dict_node['url'] = get_item_url(correlation_name, value, correlation_type)
graph_nodes_list.append(dict_node)
return graph_nodes_list
def create_node_id(correlation_name, value, correlation_type=''):
if correlation_type is None:
correlation_type = ''
return '{};{};{}'.format(correlation_name, correlation_type, value)
# # TODO: filter by correlation type => bitcoin, mail, ...
def get_graph_node_object_correlation(object_type, root_value, mode, correlation_names, correlation_objects, max_nodes=300, requested_correl_type=None):
links = set()
nodes = set()
root_node_id = create_node_id(object_type, root_value, requested_correl_type)
nodes.add(root_node_id)
root_correlation = get_object_correlation(object_type, root_value, correlation_names, correlation_objects, requested_correl_type=requested_correl_type)
for correl in root_correlation:
if correl in ('pgp', 'cryptocurrency'):
for correl_type in root_correlation[correl]:
for correl_val in root_correlation[correl][correl_type]:
# add correlation
correl_node_id = create_node_id(correl, correl_val, correl_type)
if mode=="union":
if len(nodes) > max_nodes:
break
nodes.add(correl_node_id)
links.add((root_node_id, correl_node_id))
# get second correlation
res = get_object_correlation(correl, correl_val, correlation_names, correlation_objects, requested_correl_type=correl_type)
if res:
for corr_obj in res:
for correl_key_val in res[corr_obj]:
#filter root value
if correl_key_val == root_value:
continue
if len(nodes) > max_nodes:
break
new_corel_1 = create_node_id(corr_obj, correl_key_val)
new_corel_2 = create_node_id(correl, correl_val, correl_type)
nodes.add(new_corel_1)
nodes.add(new_corel_2)
links.add((new_corel_1, new_corel_2))
if mode=="inter":
nodes.add(correl_node_id)
links.add((root_node_id, correl_node_id))
if correl in ('decoded', 'domain', 'paste'):
for correl_val in root_correlation[correl]:
correl_node_id = create_node_id(correl, correl_val)
if mode=="union":
if len(nodes) > max_nodes:
break
nodes.add(correl_node_id)
links.add((root_node_id, correl_node_id))
res = get_object_correlation(correl, correl_val, correlation_names, correlation_objects)
if res:
for corr_obj in res:
if corr_obj in ('decoded', 'domain', 'paste'):
for correl_key_val in res[corr_obj]:
#filter root value
if correl_key_val == root_value:
continue
if len(nodes) > max_nodes:
break
new_corel_1 = create_node_id(corr_obj, correl_key_val)
new_corel_2 = create_node_id(correl, correl_val)
nodes.add(new_corel_1)
nodes.add(new_corel_2)
links.add((new_corel_1, new_corel_2))
if mode=="inter":
nodes.add(correl_node_id)
links.add((root_node_id, correl_node_id))
if corr_obj in ('pgp', 'cryptocurrency'):
for correl_key_type in res[corr_obj]:
for correl_key_val in res[corr_obj][correl_key_type]:
#filter root value
if correl_key_val == root_value:
continue
if len(nodes) > max_nodes:
break
new_corel_1 = create_node_id(corr_obj, correl_key_val, correl_key_type)
new_corel_2 = create_node_id(correl, correl_val)
nodes.add(new_corel_1)
nodes.add(new_corel_2)
links.add((new_corel_1, new_corel_2))
if mode=="inter":
nodes.add(correl_node_id)
links.add((root_node_id, correl_node_id))
return {"nodes": create_graph_nodes(nodes, root_node_id), "links": create_graph_links(links)}
######## API EXPOSED ########
######## ########
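Node identifiers are plain `correlation_name;correlation_type;value` strings (the type field stays empty for objects without a sub-type), and the graph helpers turn sets of ids and id pairs into the dictionaries consumed by the UI. A small self-contained illustration of the format, with invented values and no Redis needed:

```python
# Node ids are plain strings: '<correlation_name>;<correlation_type>;<value>'.
root_node_id = 'domain;;example.onion'                     # invented value
correl_node_id = 'cryptocurrency;bitcoin;1ExampleBtcAddr'   # invented value

links = {(root_node_id, correl_node_id)}
graph_links = [{"source": src, "target": tgt} for src, tgt in links]
print(graph_links)
# [{'source': 'domain;;example.onion',
#   'target': 'cryptocurrency;bitcoin;1ExampleBtcAddr'}]
```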

141
bin/lib/Decoded.py Executable file
View file

@ -0,0 +1,141 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
import os
import sys
import redis
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'packages'))
import Item
import Date
import ConfigLoader
config_loader = ConfigLoader.ConfigLoader()
r_serv_metadata = config_loader.get_redis_conn("ARDB_Metadata")
config_loader = None
def get_decoded_item_type(sha1_string):
'''
Return the estimated type of a given decoded item.
:param sha1_string: sha1_string
'''
return r_serv_metadata.hget('metadata_hash:{}'.format(sha1_string), 'estimated_type')
def nb_decoded_seen_in_item(sha1_string):
nb = r_serv_metadata.hget('metadata_hash:{}'.format(sha1_string), 'nb_seen_in_all_pastes')
if nb is None:
return 0
else:
return int(nb)
def nb_decoded_item_size(sha1_string):
nb = r_serv_metadata.hget('metadata_hash:{}'.format(sha1_string), 'size')
if nb is None:
return 0
else:
return int(nb)
def get_decoded_metadata(sha1_string, nb_seen=False, size=False):
metadata_dict = {}
metadata_dict['first_seen'] = r_serv_metadata.hget('metadata_hash:{}'.format(sha1_string), 'first_seen')
metadata_dict['last_seen'] = r_serv_metadata.hget('metadata_hash:{}'.format(sha1_string), 'last_seen')
if nb_seen:
metadata_dict['nb_seen'] = nb_decoded_seen_in_item(sha1_string)
if size:
metadata_dict['size'] = nb_decoded_item_size(sha1_string)
return metadata_dict
def get_list_nb_previous_hash(sha1_string, num_day):
nb_previous_hash = []
for date_day in Date.get_previous_date_list(num_day):
nb_previous_hash.append(get_nb_hash_seen_by_date(sha1_string, date_day))
return nb_previous_hash
def get_nb_hash_seen_by_date(sha1_string, date_day):
nb = r_serv_metadata.zscore('hash_date:{}'.format(date_day), sha1_string)
if nb is None:
return 0
else:
return int(nb)
def get_decoded_vt_report(sha1_string):
vt_dict = {}
res = r_serv_metadata.hget('metadata_hash:{}'.format(sha1_string), 'vt_link')
if res:
vt_dict["link"] = res
res = r_serv_metadata.hget('metadata_hash:{}'.format(sha1_string), 'vt_report')
if res:
vt_dict["report"] = res
return vt_dict
def get_decoded_items_list(sha1_string):
return r_serv_metadata.zrange('nb_seen_hash:{}'.format(sha1_string), 0, -1)
def get_item_decoded(item_id):
'''
Return all decoded items of a given item id.
:param item_id: item id
'''
res = r_serv_metadata.smembers('hash_paste:{}'.format(item_id))
if res:
return list(res)
else:
return []
def get_domain_decoded_item(domain):
'''
Return all decoded items of a given domain.
:param domain: crawled domain
'''
res = r_serv_metadata.smembers('hash_domain:{}'.format(domain))
if res:
return list(res)
else:
return []
def get_decoded_domain_item(sha1_string):
'''
Return all domains of a given decoded item.
:param sha1_string: sha1_string
'''
res = r_serv_metadata.smembers('domain_hash:{}'.format(sha1_string))
if res:
return list(res)
else:
return []
def get_decoded_correlated_object(sha1_string, correlation_objects=[]):
'''
Return all correlations of a given sha1.
:param sha1_string: sha1
:type sha1_string: str
:return: a dict of all correlation for a given sha1
:rtype: dict
'''
if correlation_objects is None:
correlation_objects = Correlation.get_all_correlation_objects()
decoded_correlation = {}
for correlation_object in correlation_objects:
if correlation_object == 'paste':
res = get_decoded_items_list(sha1_string)
elif correlation_object == 'domain':
res = get_decoded_domain_item(sha1_string)
else:
res = None
if res:
decoded_correlation[correlation_object] = res
return decoded_correlation
def save_domain_decoded(domain, sha1_string):
r_serv_metadata.sadd('hash_domain:{}'.format(domain), sha1_string) # domain - hash map
r_serv_metadata.sadd('domain_hash:{}'.format(sha1_string), domain) # hash - domain map
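A short usage sketch for the accessors above; it assumes a configured AIL environment (the ARDB_Metadata connection from `configs/core.cfg`), and the sha1 value is an invented placeholder:

```python
import os
import sys

sys.path.append(os.path.join(os.environ['AIL_BIN'], 'lib/'))
import Decoded  # the module defined above

# Invented sha1 placeholder; in practice this comes from the Decoder module.
sha1_string = 'da39a3ee5e6b4b0d3255bfef95601890afd80709'

meta = Decoded.get_decoded_metadata(sha1_string, nb_seen=True, size=True)
print(meta)  # {'first_seen': ..., 'last_seen': ..., 'nb_seen': 0, 'size': 0}
print(Decoded.get_decoded_domain_item(sha1_string))  # domains where it was seen
```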

409
bin/lib/Domain.py Executable file
View file

@ -0,0 +1,409 @@
#!/usr/bin/python3
"""
The ``Domain``
===================
"""
import os
import sys
import time
import redis
import random
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'packages/'))
import Cryptocurrency
from Pgp import pgp
import Decoded
import Item
import Tag
cryptocurrency = Cryptocurrency.cryptocurrency
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'lib/'))
import ConfigLoader
import Correlate_object
config_loader = ConfigLoader.ConfigLoader()
r_serv_onion = config_loader.get_redis_conn("ARDB_Onion")
config_loader = None
def get_domain_type(domain):
if str(domain).endswith('.onion'):
return 'onion'
else:
return 'regular'
def sanathyse_port(port, domain, domain_type, strict=False, current_port=None):
'''
Return a port number. If the port number is invalid, a port of the provided domain is randomly selected.
'''
try:
port = int(port)
except (TypeError, ValueError):
if strict:
port = current_port
else:
port = get_random_domain_port(domain, domain_type)
return port
def is_domain_up(domain, domain_type):
return r_serv_onion.hexists('{}_metadata:{}'.format(domain_type, domain), 'ports')
def get_domain_all_ports(domain, domain_type):
'''
Return a list of all crawled ports
'''
l_ports = r_serv_onion.hget('{}_metadata:{}'.format(domain_type, domain), 'ports')
if l_ports:
return l_ports.split(";")
return []
def get_random_domain_port(domain, domain_type):
return random.choice(get_domain_all_ports(domain, domain_type))
def get_all_domain_up_by_type(domain_type):
if domain_type in domains:
list_domain = list(r_serv_onion.smembers('full_{}_up'.format(domain_type)))
return ({'type': domain_type, 'domains': list_domain}, 200)
else:
return ({"status": "error", "reason": "Invalid domain type"}, 400)
def get_domain_items(domain, root_item_id):
dom_item = get_domain_item_children(domain, root_item_id)
dom_item.append(root_item_id)
return dom_item
def get_domain_item_children(domain, root_item_id):
all_items = []
for item_id in Item.get_item_children(root_item_id):
if Item.is_item_in_domain(domain, item_id):
all_items.append(item_id)
all_items.extend(get_domain_item_children(domain, item_id))
return all_items
def get_domain_last_crawled_item_root(domain, domain_type, port):
'''
Return the last crawled item root dict
'''
res = r_serv_onion.zrevrange('crawler_history_{}:{}:{}'.format(domain_type, domain, port), 0, 0, withscores=True)
if res:
return {"root_item": res[0][0], "epoch": int(res[0][1])}
else:
return {}
def get_domain_crawled_item_root(domain, domain_type, port, epoch=None):
'''
Return the first item crawled for a given domain:port (and epoch)
'''
if epoch:
res = r_serv_onion.zrevrangebyscore('crawler_history_{}:{}:{}'.format(domain_type, domain, port), int(epoch), int(epoch))
if res:
return {"root_item": res[0], "epoch": int(epoch)}
# invalid epoch
epoch = None
if not epoch:
return get_domain_last_crawled_item_root(domain, domain_type, port)
def get_domain_items_crawled(domain, domain_type, port, epoch=None, items_link=False, item_screenshot=False, item_tag=False):
'''
'''
item_crawled = {}
item_root = get_domain_crawled_item_root(domain, domain_type, port, epoch=epoch)
if item_root:
item_crawled['port'] = port
item_crawled['epoch'] = item_root['epoch']
item_crawled['date'] = time.strftime('%Y/%m/%d - %H:%M.%S', time.gmtime(item_root['epoch']))
item_crawled['items'] = []
for item in get_domain_items(domain, item_root['root_item']):
dict_item = {"id": item}
if items_link:
dict_item['link'] = Item.get_item_link(item)
if item_screenshot:
dict_item['screenshot'] = Item.get_item_screenshot(item)
if item_tag:
dict_item['tags'] = Tag.get_item_tags_minimal(item)
item_crawled['items'].append(dict_item)
return item_crawled
def get_link_tree():
pass
def get_domain_last_check(domain, domain_type=None, r_format="str"):
'''
Get domain last check date
:param domain: crawled domain
:type domain: str
:param domain_type: domain type
:type domain_type: str
:return: domain last check date
:rtype: str
'''
if not domain_type:
domain_type = get_domain_type(domain)
last_check = r_serv_onion.hget('{}_metadata:{}'.format(domain_type, domain), 'last_check')
if last_check is not None:
if r_format=="int":
last_check = int(last_check)
# str
else:
last_check = '{}/{}/{}'.format(last_check[0:4], last_check[4:6], last_check[6:8])
return last_check
def get_domain_last_origin(domain, domain_type):
'''
Get domain last origin
:param domain: crawled domain
:type domain: str
:param domain_type: domain type
:type domain_type: str
:return: last origin item_id
:rtype: str
'''
origin_item = r_serv_onion.hget('{}_metadata:{}'.format(domain_type, domain), 'paste_parent')
return origin_item
def get_domain_tags(domain):
'''
Return all tags of a given domain.
:param domain: crawled domain
'''
return Tag.get_item_tags(domain)
def get_domain_cryptocurrency(domain, currencies_type=None, get_nb=False):
'''
Return all cryptocurrencies of a given domain.
:param domain: crawled domain
:param currencies_type: list of cryptocurrencies type
:type currencies_type: list, optional
'''
return cryptocurrency.get_domain_correlation_dict(domain, correlation_type=currencies_type, get_nb=get_nb)
def get_domain_pgp(domain, currencies_type=None, get_nb=False):
'''
Return all PGP correlations of a given domain.
:param domain: crawled domain
:param currencies_type: list of pgp type
:type currencies_type: list, optional
'''
return pgp.get_domain_correlation_dict(domain, correlation_type=currencies_type, get_nb=get_nb)
def get_domain_decoded(domain):
'''
Return all decoded items of a given domain.
:param domain: crawled domain
'''
return Decoded.get_domain_decoded_item(domain)
def get_domain_all_correlation(domain, correlation_names=[], get_nb=False):
'''
Return all correlations of a given domain.
:param domain: crawled domain
:type domain: str
:return: a dict of all correlation for a given domain
:rtype: dict
'''
if not correlation_names:
correlation_names = Correlate_object.get_all_correlation_names()
domain_correl = {}
for correlation_name in correlation_names:
if correlation_name=='cryptocurrency':
res = get_domain_cryptocurrency(domain, get_nb=get_nb)
elif correlation_name=='pgp':
res = get_domain_pgp(domain, get_nb=get_nb)
elif correlation_name=='decoded':
res = get_domain_decoded(domain)
else:
res = None
# add correllation to dict
if res:
domain_correl[correlation_name] = res
return domain_correl
# TODO: handle port
def get_domain_history(domain, domain_type, port): # TODO: add date_range: from to + nb_elem
'''
Return the crawl history of a given domain and port.
:param domain: crawled domain
:type domain: str
:return:
:rtype: list of tuple (item_core, epoch)
'''
return r_serv_onion.zrange('crawler_history_{}:{}:{}'.format(domain_type, domain, port), 0, -1, withscores=True)
def get_domain_history_with_status(domain, domain_type, port): # TODO: add date_range: from to + nb_elem
'''
Return the crawl history (with status) of a given domain and port.
:param domain: crawled domain
:type domain: str
:return:
:rtype: list of dict (epoch, date: %Y/%m/%d - %H:%M.%S, boolean status)
'''
l_history = []
history = get_domain_history(domain, domain_type, port)
for root_item, epoch_val in history:
epoch_val = int(epoch_val) # force int
# domain down, root_item==epoch_val
try:
int(root_item)
status = False
# domain up, root_item=str
except ValueError:
status = True
l_history.append({"epoch": epoch_val, "date": time.strftime('%Y/%m/%d - %H:%M.%S', time.gmtime(epoch_val)), "status": status})
return l_history
def verify_if_domain_exist(domain):
return r_serv_onion.exists('{}_metadata:{}'.format(get_domain_type(domain), domain))
def api_verify_if_domain_exist(domain):
if not verify_if_domain_exist(domain):
return ({'status': 'error', 'reason': 'Unknown Domain'}, 404)
else:
return None
class Domain(object):
"""docstring for Domain."""
def __init__(self, domain, port=None):
self.domain = str(domain)
self.type = get_domain_type(domain)
if self.is_domain_up():
self.current_port = sanathyse_port(port, self.domain, self.type)
def get_domain_name(self):
return self.domain
def get_domain_type(self):
return self.type
def get_current_port(self):
return self.current_port
def get_domain_first_seen(self):
'''
Get domain first seen date
:return: domain first seen date
:rtype: str
'''
first_seen = r_serv_onion.hget('{}_metadata:{}'.format(self.type, self.domain), 'first_seen')
if first_seen is not None:
first_seen = '{}/{}/{}'.format(first_seen[0:4], first_seen[4:6], first_seen[6:8])
return first_seen
def get_domain_last_check(self):
'''
Get domain last check date
:return: domain last check date
:rtype: str
'''
return get_domain_last_check(self.domain, domain_type=self.type)
def get_domain_last_origin(self):
'''
Get domain last origin
:param domain: crawled domain
:type domain: str
:param domain_type: domain type
:type domain_type: str
:return: last origin item_id
:rtype: str
'''
return get_domain_last_origin(self.domain, self.type)
def is_domain_up(self): # # TODO: handle multiple ports
'''
Return True if this domain is UP
'''
return is_domain_up(self.domain, self.type)
def get_domain_all_ports(self):
return get_domain_all_ports(self.domain, self.type)
def get_domain_metadata(self, first_seen=True, last_ckeck=True, status=True, ports=True):
'''
Get Domain basic metadata
:param first_seen: get domain first_seen
:type first_seen: boolean
:param last_ckeck: get domain last_check
:type last_ckeck: boolean
:param ports: get all domain ports
:type ports: boolean
:return: a dict of all metadata for a given domain
:rtype: dict
'''
dict_metadata = {}
if first_seen:
res = self.get_domain_first_seen()
if res is not None:
dict_metadata['first_seen'] = res
if last_ckeck:
res = self.get_domain_last_check()
if res is not None:
dict_metadata['last_check'] = res
if status:
dict_metadata['status'] = self.is_domain_up()
if ports:
dict_metadata['ports'] = self.get_domain_all_ports()
return dict_metadata
def get_domain_tags(self):
'''
Return all tags of a given domain.
:param domain: crawled domain
'''
return get_domain_tags(self.domain)
def get_domain_correlation(self):
'''
Return all correlations of a given domain.
'''
return get_domain_all_correlation(self.domain, get_nb=True)
def get_domain_history(self):
'''
Return the full history of a given domain and port.
'''
return get_domain_history(self.domain, self.type, 80)
def get_domain_history_with_status(self):
'''
Return the full history (with status) of a given domain and port.
'''
return get_domain_history_with_status(self.domain, self.type, 80)
def get_domain_items_crawled(self, port=None, epoch=None, items_link=False, item_screenshot=False, item_tag=False):
'''
Return ........................
'''
port = sanathyse_port(port, self.domain, self.type, strict=True, current_port=self.current_port)
return get_domain_items_crawled(self.domain, self.type, port, epoch=epoch, items_link=items_link, item_screenshot=item_screenshot, item_tag=item_tag)
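A hedged usage sketch for the `Domain` class above; the onion address is an invented placeholder and the calls assume a populated ARDB_Onion database in a configured AIL environment:

```python
import os
import sys

sys.path.append(os.path.join(os.environ['AIL_BIN'], 'lib/'))
import Domain

# Invented domain used purely for illustration.
dom = Domain.Domain('examplexxxxxxxxxxxx.onion')

print(dom.get_domain_type())      # 'onion'
print(dom.get_domain_metadata())  # {'first_seen': ..., 'last_check': ..., 'status': ..., 'ports': [...]}
print(dom.get_domain_history_with_status())
```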

260
bin/packages/Correlation.py Executable file
View file

@ -0,0 +1,260 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
import os
import sys
import redis
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'lib/'))
import ConfigLoader
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'packages/'))
import Date
config_loader = ConfigLoader.ConfigLoader()
r_serv_metadata = config_loader.get_redis_conn("ARDB_Metadata")
config_loader = None
class Correlation(object):
def __init__(self, correlation_name, all_correlation_types):
self.correlation_name = correlation_name
self.all_correlation_types = all_correlation_types
def _exist_corelation_field(self, correlation_type, field_name, item_type='paste'):
if item_type=='paste':
return r_serv_metadata.exists('set_{}_{}:{}'.format(self.correlation_name, correlation_type, field_name))
else:
return r_serv_metadata.exists('set_domain_{}_{}:{}'.format(self.correlation_name, correlation_type, field_name))
def _get_items(self, correlation_type, field_name):
res = r_serv_metadata.smembers('set_{}_{}:{}'.format(self.correlation_name, correlation_type, field_name))
if res:
return list(res)
else:
return []
def _get_metadata(self, correlation_type, field_name):
meta_dict = {}
meta_dict['first_seen'] = r_serv_metadata.hget('{}_metadata_{}:{}'.format(self.correlation_name, correlation_type, field_name), 'first_seen')
meta_dict['last_seen'] = r_serv_metadata.hget('{}_metadata_{}:{}'.format(self.correlation_name, correlation_type, field_name), 'last_seen')
meta_dict['nb_seen'] = r_serv_metadata.scard('set_{}_{}:{}'.format(self.correlation_name, correlation_type, field_name))
return meta_dict
def get_metadata(self, correlation_type, field_name, date_format='str_date'):
meta_dict = self._get_metadata(correlation_type, field_name)
if date_format == "str_date":
if meta_dict['first_seen']:
meta_dict['first_seen'] = '{}/{}/{}'.format(meta_dict['first_seen'][0:4], meta_dict['first_seen'][4:6], meta_dict['first_seen'][6:8])
if meta_dict['last_seen']:
meta_dict['last_seen'] = '{}/{}/{}'.format(meta_dict['last_seen'][0:4], meta_dict['last_seen'][4:6], meta_dict['last_seen'][6:8])
return meta_dict
def get_nb_object_seen_by_date(self, correlation_type, field_name, date_day):
nb = r_serv_metadata.hget('{}:{}:{}'.format(self.correlation_name, correlation_type, date_day), field_name)
if nb is None:
return 0
else:
return int(nb)
def get_list_nb_previous_correlation_object(self, correlation_type, field_name, numDay):
nb_previous_correlation = []
for date_day in Date.get_previous_date_list(numDay):
nb_previous_correlation.append(self.get_nb_object_seen_by_date(correlation_type, field_name, date_day))
return nb_previous_correlation
def _get_correlation_by_date(self, correlation_type, date):
return r_serv_metadata.hkeys('{}:{}:{}'.format(self.correlation_name, correlation_type, date))
def verify_correlation_field_request(self, request_dict, correlation_type, item_type='paste'):
if not request_dict:
return ({'status': 'error', 'reason': 'Malformed JSON'}, 400)
field_name = request_dict.get(correlation_type, None)
if not field_name:
return ( {'status': 'error', 'reason': 'Mandatory parameter(s) not provided'}, 400 )
if not self._exist_corelation_field(correlation_type, field_name, item_type=item_type):
return ( {'status': 'error', 'reason': 'Item not found'}, 404 )
def get_correlation(self, request_dict, correlation_type, field_name):
dict_resp = {}
if request_dict.get('items'):
dict_resp['items'] = self._get_items(correlation_type, field_name)
if request_dict.get('metadata'):
dict_resp['metadata'] = self._get_metadata(correlation_type, field_name)
dict_resp[correlation_type] = field_name
return (dict_resp, 200)
def get_all_correlation_types(self):
'''
Get all correlation types
:return: A list of all the correlation types
:rtype: list
'''
return self.all_correlation_types
def sanythise_correlation_types(self, correlation_types):
'''
Check if all correlation types in the list are valid.
:param correlation_types: list of correlation type
:type currency_type: list
:return: If a type is invalid, return the full list of correlation types else return the provided list
:rtype: list
'''
if correlation_types is None:
return self.get_all_correlation_types()
for correl in correlation_types: # # TODO: # OPTIMIZE:
if correl not in self.get_all_correlation_types():
return self.get_all_correlation_types()
return correlation_types
def _get_domain_correlation_obj(self, domain, correlation_type):
'''
Return correlation of a given domain.
:param domain: crawled domain
:type domain: str
:param correlation_type: correlation type
:type correlation_type: str
:return: a list of correlation
:rtype: list
'''
res = r_serv_metadata.smembers('domain_{}_{}:{}'.format(self.correlation_name, correlation_type, domain))
if res:
return list(res)
else:
return []
def get_domain_correlation_dict(self, domain, correlation_type=None, get_nb=False):
'''
Return all correlation of a given domain.
:param domain: crawled domain
:param correlation_type: list of correlation types
:type correlation_type: list, optional
:return: a dictionary of all the requested correlations
:rtype: dict
'''
correlation_type = self.sanythise_correlation_types(correlation_type)
dict_correlation = {}
for correl in correlation_type:
res = self._get_domain_correlation_obj(domain, correl)
if res:
dict_correlation[correl] = res
if get_nb:
dict_correlation['nb'] = dict_correlation.get('nb', 0) + len(dict_correlation[correl])
return dict_correlation
def _get_correlation_obj_domain(self, field_name, correlation_type):
'''
Return all domains that contain this correlation.
:param domain: field name
:type domain: str
:param correlation_type: correlation type
:type correlation_type: str
:return: a list of correlation
:rtype: list
'''
res = r_serv_metadata.smembers('set_domain_{}_{}:{}'.format(self.correlation_name, correlation_type, field_name))
if res:
return list(res)
else:
return []
def get_correlation_obj_domain(self, field_name, correlation_type=None):
'''
Return all domain correlation of a given correlation_value.
:param field_name: field_name
:param correlation_type: list of correlation types
:type correlation_type: list, optional
:return: a dictionary of all the requested correlations
:rtype: list
'''
correlation_type = self.sanythise_correlation_types(correlation_type)
for correl in correlation_type:
res = self._get_correlation_obj_domain(field_name, correl)
if res:
return res
return []
def _get_item_correlation_obj(self, item_id, correlation_type):
'''
Return correlation of a given item id.
:param item_id: item id
:type item_id: str
:param correlation_type: correlation type
:type correlation_type: str
:return: a list of correlation
:rtype: list
'''
res = r_serv_metadata.smembers('item_{}_{}:{}'.format(self.correlation_name, correlation_type, item_id))
if res:
return list(res)
else:
return []
def get_item_correlation_dict(self, item_id, correlation_type=None, get_nb=False):
'''
Return all correlation of a given item id.
:param item_id: item id
:param correlation_type: list of correlation types
:type correlation_type: list, optional
:return: a dictionary of all the requested correlations
:rtype: dict
'''
correlation_type = self.sanythise_correlation_types(correlation_type)
dict_correlation = {}
for correl in correlation_type:
res = self._get_item_correlation_obj(item_id, correl)
if res:
dict_correlation[correl] = res
if get_nb:
dict_correlation['nb'] = dict_correlation.get('nb', 0) + len(dict_correlation[correl])
return dict_correlation
def get_correlation_all_object(self, correlation_type, correlation_value, correlation_objects=[]):
if correlation_objects is None:
correlation_objects = get_all_correlation_objects()
correlation_obj = {}
for correlation_object in correlation_objects:
if correlation_object == 'paste':
res = self._get_items(correlation_type, correlation_value)
elif correlation_object == 'domain':
res = self.get_correlation_obj_domain(correlation_value, correlation_type=correlation_type)
else:
res = None
if res:
correlation_obj[correlation_object] = res
return correlation_obj
def save_domain_correlation(self, domain, correlation_type, correlation_value):
r_serv_metadata.sadd('domain_{}_{}:{}'.format(self.correlation_name, correlation_type, domain), correlation_value)
r_serv_metadata.sadd('set_domain_{}_{}:{}'.format(self.correlation_name, correlation_type, correlation_value), domain)
######## API EXPOSED ########
######## ########
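Concrete correlation types are implemented as thin subclasses of `Correlation`; the `Cryptocurrency` module below does exactly this. The `Pgp` module imported elsewhere in this commit presumably follows the same pattern with the `key`/`name`/`mail` types used by the graph icons in `Correlate_object.py`; a sketch under that assumption, not the actual file:

```python
from Correlation import Correlation

# Sketch of a Correlation subclass, mirroring the Cryptocurrency module below.
# The 'pgp' types ('key', 'name', 'mail') are taken from the icon mapping in
# Correlate_object.py; the real Pgp module is not part of this excerpt.
class Pgp(Correlation):
    def __init__(self):
        super().__init__('pgp', ['key', 'name', 'mail'])

pgp = Pgp()
# pgp.get_domain_correlation_dict('example.onion', get_nb=True) would then
# return e.g. {'key': [...], 'mail': [...], 'nb': 5} for a crawled domain.
```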

96
bin/packages/Cryptocurrency.py Executable file
View file

@ -0,0 +1,96 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
import os
import sys
import redis
from hashlib import sha256
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'packages'))
from Correlation import Correlation
import Item
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'lib/'))
import ConfigLoader
config_loader = ConfigLoader.ConfigLoader()
r_serv_metadata = config_loader.get_redis_conn("ARDB_Metadata")
config_loader = None
digits58 = '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz'
class Cryptocurrency(Correlation):
def __init__(self):
super().__init__('cryptocurrency', ['bitcoin', 'ethereum', 'bitcoin-cash', 'litecoin', 'monero', 'zcash', 'dash'])
cryptocurrency = Cryptocurrency()
# http://rosettacode.org/wiki/Bitcoin/address_validation#Python
def decode_base58(bc, length):
n = 0
for char in bc:
n = n * 58 + digits58.index(char)
return n.to_bytes(length, 'big')
# http://rosettacode.org/wiki/Bitcoin/address_validation#Python
def check_base58_address(bc):
try:
bcbytes = decode_base58(bc, 25)
return bcbytes[-4:] == sha256(sha256(bcbytes[:-4]).digest()).digest()[:4]
except Exception:
return False
def verify_cryptocurrency_address(cryptocurrency_type, cryptocurrency_address):
if cryptocurrency_type in ('bitcoin', 'litecoin', 'dash'):
return check_base58_address(cryptocurrency_address)
else:
return True
def get_cryptocurrency(request_dict, cryptocurrency_type):
# basic verification
res = cryptocurrency.verify_correlation_field_request(request_dict, cryptocurrency_type)
if res:
return res
# verify address
field_name = request_dict.get(cryptocurrency_type)
if not verify_cryptocurrency_address(cryptocurrency_type, field_name):
return ( {'status': 'error', 'reason': 'Invalid Cryptocurrency address'}, 400 )
return cryptocurrency.get_correlation(request_dict, cryptocurrency_type, field_name)
# # TODO: refactor/move me into Correlation
def save_cryptocurrency_data(cryptocurrency_name, date, item_path, cryptocurrency_address):
# create basic metadata
if not r_serv_metadata.exists('cryptocurrency_metadata_{}:{}'.format(cryptocurrency_name, cryptocurrency_address)):
r_serv_metadata.hset('cryptocurrency_metadata_{}:{}'.format(cryptocurrency_name, cryptocurrency_address), 'first_seen', date)
r_serv_metadata.hset('cryptocurrency_metadata_{}:{}'.format(cryptocurrency_name, cryptocurrency_address), 'last_seen', date)
else:
last_seen = r_serv_metadata.hget('cryptocurrency_metadata_{}:{}'.format(cryptocurrency_name, cryptocurrency_address), 'last_seen')
if not last_seen:
r_serv_metadata.hset('cryptocurrency_metadata_{}:{}'.format(cryptocurrency_name, cryptocurrency_address), 'last_seen', date)
else:
if int(last_seen) < int(date):
r_serv_metadata.hset('cryptocurrency_metadata_{}:{}'.format(cryptocurrency_name, cryptocurrency_address), 'last_seen', date)
## global set
# item
r_serv_metadata.sadd('set_cryptocurrency_{}:{}'.format(cryptocurrency_name, cryptocurrency_address), item_path)
# daily
r_serv_metadata.hincrby('cryptocurrency:{}:{}'.format(cryptocurrency_name, date), cryptocurrency_address, 1)
# all type
r_serv_metadata.zincrby('cryptocurrency_all:{}'.format(cryptocurrency_name), cryptocurrency_address, 1)
## object_metadata
# item
r_serv_metadata.sadd('item_cryptocurrency_{}:{}'.format(cryptocurrency_name, item_path), cryptocurrency_address)
# domain
if Item.is_crawled(item_path): # # TODO: use save_domain_correlation
domain = Item.get_item_domain(item_path)
r_serv_metadata.sadd('domain_cryptocurrency_{}:{}'.format(cryptocurrency_name, domain), cryptocurrency_address)
r_serv_metadata.sadd('set_domain_cryptocurrency_{}:{}'.format(cryptocurrency_name, cryptocurrency_address), domain)
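The Base58Check validation above can be exercised on its own; a quick sanity check using the well-known Bitcoin genesis address (any string with a corrupted checksum should return False), run in an AIL environment so the module imports cleanly:

```python
import os
import sys

sys.path.append(os.path.join(os.environ['AIL_BIN'], 'packages/'))
from Cryptocurrency import check_base58_address, verify_cryptocurrency_address

print(check_base58_address('1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa'))   # True  (Bitcoin genesis address)
print(check_base58_address('1A1zP1eP5QGefi2DMPTfTL5SLmv7Divfxx'))   # False (corrupted checksum)
print(verify_cryptocurrency_address('monero', 'anything'))          # True  (only base58 coins are checked)
```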

View file

@ -1,5 +1,9 @@
#!/usr/bin/python3
import datetime
# # TODO: refactor me
class Date(object):
"""docstring for Date"""
def __init__(self, *args):
@ -34,9 +38,46 @@ class Date(object):
self.day = day
def substract_day(self, numDay):
import datetime
computed_date = datetime.date(int(self.year), int(self.month), int(self.day)) - datetime.timedelta(numDay)
comp_year = str(computed_date.year)
comp_month = str(computed_date.month).zfill(2)
comp_day = str(computed_date.day).zfill(2)
return comp_year + comp_month + comp_day
def date_add_day(date, num_day=1):
new_date = datetime.date(int(date[0:4]), int(date[4:6]), int(date[6:8])) + datetime.timedelta(num_day)
new_date = str(new_date).replace('-', '')
return new_date
def date_substract_day(date, num_day=1):
new_date = datetime.date(int(date[0:4]), int(date[4:6]), int(date[6:8])) - datetime.timedelta(num_day)
new_date = str(new_date).replace('-', '')
return new_date
# # TODO: remove me ## FIXME:
def get_date_range(num_day):
curr_date = datetime.date.today()
date = Date(str(curr_date.year)+str(curr_date.month).zfill(2)+str(curr_date.day).zfill(2))
date_list = []
for i in range(0, num_day+1):
date_list.append(date.substract_day(i))
return list(reversed(date_list))
def get_previous_date_list(num_day):
curr_date = datetime.date.today()
date = Date(str(curr_date.year)+str(curr_date.month).zfill(2)+str(curr_date.day).zfill(2))
date_list = []
for i in range(0, num_day+1):
date_list.append(date.substract_day(i))
return list(reversed(date_list))
def substract_date(date_from, date_to):
date_from = datetime.date(int(date_from[0:4]), int(date_from[4:6]), int(date_from[6:8]))
date_to = datetime.date(int(date_to[0:4]), int(date_to[4:6]), int(date_to[6:8]))
delta = date_to - date_from # timedelta
l_date = []
for i in range(delta.days + 1):
date = date_from + datetime.timedelta(i)
l_date.append( date.strftime('%Y%m%d') )
return l_date
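The new helpers operate on plain `YYYYMMDD` strings; a couple of worked examples based on the functions above:

```python
from Date import date_add_day, date_substract_day, substract_date

print(date_add_day('20191231'))                # '20200101'
print(date_substract_day('20191101'))          # '20191031'
print(substract_date('20191115', '20191117'))  # ['20191115', '20191116', '20191117']
```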

View file

@ -17,6 +17,7 @@ Conditions to fulfill to be able to use this class correctly:
"""
import os
import sys
import time
import gzip
import redis
@ -25,11 +26,12 @@ import random
from io import BytesIO
import zipfile
import configparser
import sys
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'packages/'))
from Date import Date
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'lib/'))
import ConfigLoader
class HiddenServices(object):
"""
This class representing a hiddenServices as an object.
@ -43,27 +45,11 @@ class HiddenServices(object):
def __init__(self, domain, type, port=80):
configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
if not os.path.exists(configfile):
raise Exception('Unable to find the configuration file. \
Did you set environment variables? \
Or activate the virtualenv.')
config_loader = ConfigLoader.ConfigLoader()
self.r_serv_onion = config_loader.get_redis_conn("ARDB_Onion")
self.r_serv_metadata = config_loader.get_redis_conn("ARDB_Metadata")
cfg = configparser.ConfigParser()
cfg.read(configfile)
self.r_serv_onion = redis.StrictRedis(
host=cfg.get("ARDB_Onion", "host"),
port=cfg.getint("ARDB_Onion", "port"),
db=cfg.getint("ARDB_Onion", "db"),
decode_responses=True)
self.r_serv_metadata = redis.StrictRedis(
host=cfg.get("ARDB_Metadata", "host"),
port=cfg.getint("ARDB_Metadata", "port"),
db=cfg.getint("ARDB_Metadata", "db"),
decode_responses=True)
self.PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], cfg.get("Directories", "pastes")) + '/'
self.PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], config_loader.get_config_str("Directories", "pastes")) + '/'
self.domain = domain
self.type = type
@ -71,17 +57,19 @@ class HiddenServices(object):
self.tags = {}
if type == 'onion' or type == 'regular':
self.paste_directory = os.path.join(os.environ['AIL_HOME'], cfg.get("Directories", "pastes"))
self.paste_crawled_directory = os.path.join(self.paste_directory, cfg.get("Directories", "crawled"))
self.paste_crawled_directory_name = cfg.get("Directories", "crawled")
self.screenshot_directory = os.path.join(os.environ['AIL_HOME'], cfg.get("Directories", "crawled_screenshot"))
self.paste_directory = os.path.join(os.environ['AIL_HOME'], config_loader.get_config_str("Directories", "pastes"))
self.paste_crawled_directory = os.path.join(self.paste_directory, config_loader.get_config_str("Directories", "crawled"))
self.paste_crawled_directory_name = config_loader.get_config_str("Directories", "crawled")
self.screenshot_directory = os.path.join(os.environ['AIL_HOME'], config_loader.get_config_str("Directories", "crawled_screenshot"))
self.screenshot_directory_screenshot = os.path.join(self.screenshot_directory, 'screenshot')
elif type == 'i2p':
self.paste_directory = os.path.join(os.environ['AIL_HOME'], cfg.get("Directories", "crawled_screenshot"))
self.screenshot_directory = os.path.join(os.environ['AIL_HOME'], cfg.get("Directories", "crawled_screenshot"))
self.paste_directory = os.path.join(os.environ['AIL_HOME'], config_loader.get_config_str("Directories", "crawled_screenshot"))
self.screenshot_directory = os.path.join(os.environ['AIL_HOME'], config_loader.get_config_str("Directories", "crawled_screenshot"))
else:
## TODO: # FIXME: add error
pass
config_loader = None
#def remove_absolute_path_link(self, key, value):
# print(key)

80
bin/packages/Import_helper.py Executable file
View file

@ -0,0 +1,80 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
import os
import sys
import uuid
import redis
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'lib/'))
import ConfigLoader
config_loader = ConfigLoader.ConfigLoader()
r_serv_db = config_loader.get_redis_conn("ARDB_DB")
r_serv_log_submit = config_loader.get_redis_conn("Redis_Log_submit")
config_loader = None
def is_valid_uuid_v4(UUID):
UUID = UUID.replace('-', '')
try:
uuid_test = uuid.UUID(hex=UUID, version=4)
return uuid_test.hex == UUID
except:
return False
def create_import_queue(tags, galaxy, paste_content, UUID, password=None, isfile = False):
# save temp value on disk
for tag in tags:
r_serv_db.sadd(UUID + ':ltags', tag)
for tag in galaxy:
r_serv_db.sadd(UUID + ':ltagsgalaxies', tag)
r_serv_db.set(UUID + ':paste_content', paste_content)
if password:
r_serv_db.set(UUID + ':password', password)
r_serv_db.set(UUID + ':isfile', isfile)
r_serv_log_submit.set(UUID + ':end', 0)
r_serv_log_submit.set(UUID + ':processing', 0)
r_serv_log_submit.set(UUID + ':nb_total', -1)
r_serv_log_submit.set(UUID + ':nb_end', 0)
r_serv_log_submit.set(UUID + ':nb_sucess', 0)
# save UUID on disk
r_serv_db.sadd('submitted:uuid', UUID)
return UUID
def check_import_status(UUID):
if not is_valid_uuid_v4(UUID):
return ({'status': 'error', 'reason': 'Invalid uuid'}, 400)
processing = r_serv_log_submit.get(UUID + ':processing')
if not processing:
return ({'status': 'error', 'reason': 'Unknown uuid'}, 404)
# nb_total = r_serv_log_submit.get(UUID + ':nb_total')
# nb_sucess = r_serv_log_submit.get(UUID + ':nb_sucess')
# nb_end = r_serv_log_submit.get(UUID + ':nb_end')
items_id = list(r_serv_log_submit.smembers(UUID + ':paste_submit_link'))
error = r_serv_log_submit.get(UUID + ':error')
end = r_serv_log_submit.get(UUID + ':end')
dict_import_status = {}
if items_id:
dict_import_status['items'] = items_id
if error:
dict_import_status['error'] = error
if processing == '0':
status = 'in queue'
else:
if end == '0':
status = 'in progress'
else:
status = 'imported'
dict_import_status['status'] = status
return (dict_import_status, 200)
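
A minimal usage sketch of the helpers above (illustrative only: it assumes AIL_BIN is exported, the ARDB/Redis databases from core.cfg are reachable, and the tag value is just an example):
```
import os
import sys
import uuid
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'packages/'))
import Import_helper

# queue a text submission, then poll its status
submit_uuid = str(uuid.uuid4())
Import_helper.create_import_queue(tags=['infoleak:submission="manual"'], galaxy=[],
                                  paste_content='example content', UUID=submit_uuid)
status, http_code = Import_helper.check_import_status(submit_uuid)
print(http_code, status)  # e.g. 200 {'status': 'in queue'} until the submit module picks it up
```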

261
bin/packages/Item.py Executable file
View file

@ -0,0 +1,261 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
import os
import sys
import gzip
import redis
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'packages/'))
import Date
import Tag
import Correlation
import Cryptocurrency
from Pgp import pgp
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'lib/'))
import ConfigLoader
import Decoded
config_loader = ConfigLoader.ConfigLoader()
PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], config_loader.get_config_str("Directories", "pastes")) + '/'
r_cache = config_loader.get_redis_conn("Redis_Cache")
r_serv_metadata = config_loader.get_redis_conn("ARDB_Metadata")
config_loader = None
def exist_item(item_id):
if os.path.isfile(os.path.join(PASTES_FOLDER, item_id)):
return True
else:
return False
def get_item_id(full_path):
return full_path.replace(PASTES_FOLDER, '', 1)
def get_item_date(item_id):
l_directory = item_id.split('/')
return '{}{}{}'.format(l_directory[-4], l_directory[-3], l_directory[-2])
def get_source(item_id):
return item_id.split('/')[-5]
def get_item_basename(item_id):
return os.path.basename(item_id)
def get_item_size(item_id):
return round(os.path.getsize(os.path.join(PASTES_FOLDER, item_id))/1024.0, 2)
def get_lines_info(item_id, item_content=None):
if not item_content:
item_content = get_item_content(item_id)
max_length = 0
line_id = 0
nb_line = 0
for line in item_content.splitlines():
length = len(line)
if length > max_length:
max_length = length
nb_line += 1
return {'nb': nb_line, 'max_length': max_length}
def get_item_content(item_id):
item_full_path = os.path.join(PASTES_FOLDER, item_id)
try:
item_content = r_cache.get(item_full_path)
except UnicodeDecodeError:
item_content = None
except Exception as e:
item_content = None
if item_content is None:
try:
with gzip.open(item_full_path, 'r') as f:
item_content = f.read().decode()
r_cache.set(item_full_path, item_content)
r_cache.expire(item_full_path, 300)
except:
item_content = ''
return str(item_content)
# API
def get_item(request_dict):
if not request_dict:
return ({'status': 'error', 'reason': 'Malformed JSON'}, 400)
item_id = request_dict.get('id', None)
if not item_id:
return ( {'status': 'error', 'reason': 'Mandatory parameter(s) not provided'}, 400 )
if not exist_item(item_id):
return ( {'status': 'error', 'reason': 'Item not found'}, 404 )
dict_item = {}
dict_item['id'] = item_id
date = request_dict.get('date', True)
if date:
dict_item['date'] = get_item_date(item_id)
tags = request_dict.get('tags', True)
if tags:
dict_item['tags'] = Tag.get_item_tags(item_id)
size = request_dict.get('size', False)
if size:
dict_item['size'] = get_item_size(item_id)
content = request_dict.get('content', False)
if content:
# UTF-8 output, # TODO: use base64
dict_item['content'] = get_item_content(item_id)
lines_info = request_dict.get('lines', False)
if lines_info:
dict_item['lines'] = get_lines_info(item_id, dict_item.get('content', None))
if request_dict.get('pgp'):
dict_item['pgp'] = {}
if request_dict['pgp'].get('key'):
dict_item['pgp']['key'] = get_item_pgp_key(item_id)
if request_dict['pgp'].get('mail'):
dict_item['pgp']['mail'] = get_item_pgp_mail(item_id)
if request_dict['pgp'].get('name'):
dict_item['pgp']['name'] = get_item_pgp_name(item_id)
if request_dict.get('cryptocurrency'):
dict_item['cryptocurrency'] = {}
if request_dict['cryptocurrency'].get('bitcoin'):
dict_item['cryptocurrency']['bitcoin'] = get_item_bitcoin(item_id)
return (dict_item, 200)
###
### correlation
###
def get_item_cryptocurrency(item_id, currencies_type=None, get_nb=False):
'''
Return all cryptocurrencies of a given item.
:param item_id: item id
:param currencies_type: list of cryptocurrencies type
:type currencies_type: list, optional
'''
return Cryptocurrency.cryptocurrency.get_item_correlation_dict(item_id, correlation_type=currencies_type, get_nb=get_nb)
def get_item_pgp(item_id, currencies_type=None, get_nb=False):
'''
Return all pgp correlations of a given item.
:param item_id: item id
:param currencies_type: list of pgp field types (key, mail, name)
:type currencies_type: list, optional
'''
return pgp.get_item_correlation_dict(item_id, correlation_type=currencies_type, get_nb=get_nb)
def get_item_decoded(item_id):
'''
Return all decoded items (hashes) correlated with a given item.
:param item_id: item id
'''
return Decoded.get_item_decoded(item_id)
def get_item_all_correlation(item_id, correlation_names=[], get_nb=False):
'''
Return all correlations of a given item id.
:param item_id: item id
:type item_id: str
:return: a dict of all correlations for an item id
:rtype: dict
'''
if not correlation_names:
correlation_names = Correlation.get_all_correlation_names()
item_correl = {}
for correlation_name in correlation_names:
if correlation_name=='cryptocurrency':
res = get_item_cryptocurrency(item_id, get_nb=get_nb)
elif correlation_name=='pgp':
res = get_item_pgp(item_id, get_nb=get_nb)
elif correlation_name=='decoded':
res = get_item_decoded(item_id)
else:
res = None
# add correlation to dict
if res:
item_correl[correlation_name] = res
return item_correl
## TODO: REFACTOR
def _get_item_correlation(correlation_name, correlation_type, item_id):
res = r_serv_metadata.smembers('item_{}_{}:{}'.format(correlation_name, correlation_type, item_id))
if res:
return list(res)
else:
return []
## TODO: REFACTOR
def get_item_bitcoin(item_id):
return _get_item_correlation('cryptocurrency', 'bitcoin', item_id)
## TODO: REFACTOR
def get_item_pgp_key(item_id):
return _get_item_correlation('pgpdump', 'key', item_id)
## TODO: REFACTOR
def get_item_pgp_name(item_id):
return _get_item_correlation('pgpdump', 'name', item_id)
## TODO: REFACTOR
def get_item_pgp_mail(item_id):
return _get_item_correlation('pgpdump', 'mail', item_id)
## TODO: REFACTOR
def get_item_pgp_correlation(item_id):
pass
###
### GET Internal Module DESC
###
def get_item_list_desc(list_item_id):
desc_list = []
for item_id in list_item_id:
desc_list.append( {'id': item_id, 'date': get_item_date(item_id), 'tags': Tag.get_item_tags(item_id)} )
return desc_list
# # TODO: add an option to check the tag
def is_crawled(item_id):
return item_id.startswith('crawled')
def is_onion(item_id):
is_onion = False
if len(item_id) > 62:
if is_crawled(item_id) and item_id[-42:-36] == '.onion':
is_onion = True
return is_onion
def is_item_in_domain(domain, item_id):
is_in_domain = False
domain_length = len(domain)
if len(item_id) > (domain_length+48):
if item_id[-36-domain_length:-36] == domain:
is_in_domain = True
return is_in_domain
def get_item_domain(item_id):
return item_id[19:-36]
def get_item_children(item_id):
return list(r_serv_metadata.smembers('paste_children:{}'.format(item_id)))
def get_item_link(item_id):
return r_serv_metadata.hget('paste_metadata:{}'.format(item_id), 'real_link')
def get_item_screenshot(item_id):
screenshot = r_serv_metadata.hget('paste_metadata:{}'.format(item_id), 'screenshot')
if screenshot:
return os.path.join(screenshot[0:2], screenshot[2:4], screenshot[4:6], screenshot[6:8], screenshot[8:10], screenshot[10:12], screenshot[12:])
return ''
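
For reference, a short sketch of querying the new Item helpers (the item id is a made-up example; AIL_HOME/AIL_BIN and the ARDB databases are assumed to be configured):
```
import os
import sys
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'packages/'))
import Item

item_id = 'archive/pastebin.com_pro/2019/11/17/example.gz'  # hypothetical item id
if Item.exist_item(item_id):
    print(Item.get_source(item_id))      # 'pastebin.com_pro'
    print(Item.get_item_date(item_id))   # '20191117'
    print(Item.get_lines_info(item_id))  # {'nb': ..., 'max_length': ...}
    # same data as exposed through the API layer
    res, http_code = Item.get_item({'id': item_id, 'content': True, 'lines': True})
```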

View file

@ -17,20 +17,22 @@ Conditions to fulfill to be able to use this class correctly:
"""
import os
import re
import sys
import magic
import gzip
import redis
import operator
import string
import re
import json
import configparser
from io import StringIO
import sys
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'packages/'))
from Date import Date
from Hash import Hash
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'lib/'))
import ConfigLoader
from langid.langid import LanguageIdentifier, model
from nltk.tokenize import RegexpTokenizer
@ -58,31 +60,12 @@ class Paste(object):
def __init__(self, p_path):
configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
if not os.path.exists(configfile):
raise Exception('Unable to find the configuration file. \
Did you set environment variables? \
Or activate the virtualenv.')
config_loader = ConfigLoader.ConfigLoader()
self.cache = config_loader.get_redis_conn("Redis_Queues")
self.store = config_loader.get_redis_conn("Redis_Data_Merging")
self.store_metadata = config_loader.get_redis_conn("ARDB_Metadata")
cfg = configparser.ConfigParser()
cfg.read(configfile)
self.cache = redis.StrictRedis(
host=cfg.get("Redis_Queues", "host"),
port=cfg.getint("Redis_Queues", "port"),
db=cfg.getint("Redis_Queues", "db"),
decode_responses=True)
self.store = redis.StrictRedis(
host=cfg.get("Redis_Data_Merging", "host"),
port=cfg.getint("Redis_Data_Merging", "port"),
db=cfg.getint("Redis_Data_Merging", "db"),
decode_responses=True)
self.store_metadata = redis.StrictRedis(
host=cfg.get("ARDB_Metadata", "host"),
port=cfg.getint("ARDB_Metadata", "port"),
db=cfg.getint("ARDB_Metadata", "db"),
decode_responses=True)
self.PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], cfg.get("Directories", "pastes"))
self.PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], config_loader.get_config_str("Directories", "pastes"))
if self.PASTES_FOLDER not in p_path:
self.p_rel_path = p_path
self.p_path = os.path.join(self.PASTES_FOLDER, p_path)
@ -115,6 +98,17 @@ class Paste(object):
self.p_duplicate = None
self.p_tags = None
def get_item_dict(self):
dict_item = {}
dict_item['id'] = self.p_rel_path
dict_item['date'] = str(self.p_date)
dict_item['content'] = self.get_p_content()
tags = self._get_p_tags()
if tags:
dict_item['tags'] = tags
return dict_item
def get_p_content(self):
"""
Returning the content of the Paste
@ -321,8 +315,8 @@ class Paste(object):
return self.store_metadata.scard('dup:'+self.p_path) + self.store_metadata.scard('dup:'+self.p_rel_path)
def _get_p_tags(self):
self.p_tags = self.store_metadata.smembers('tag:'+path, tag)
if self.self.p_tags is not None:
self.p_tags = self.store_metadata.smembers('tag:'+self.p_rel_path)
if self.p_tags is not None:
return list(self.p_tags)
else:
return '[]'

67
bin/packages/Pgp.py Executable file
View file

@ -0,0 +1,67 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
import os
import sys
import redis
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'packages'))
from Correlation import Correlation
import Item
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'lib/'))
import ConfigLoader
config_loader = ConfigLoader.ConfigLoader()
serv_metadata = config_loader.get_redis_conn("ARDB_Metadata")
config_loader = None
class Pgp(Correlation):
def __init__(self):
super().__init__('pgpdump', ['key', 'mail', 'name'])
pgp = Pgp()
def get_pgp(request_dict, pgp_type):
# basic verification
res = pgp.verify_correlation_field_request(request_dict, pgp_type)
if res:
return res
# verify address
field_name = request_dict.get(pgp_type)
return pgp.get_correlation(request_dict, pgp_type, field_name)
def save_pgp_data(type_pgp, date, item_path, data):
# create basic metadata
if not serv_metadata.exists('pgpdump_metadata_{}:{}'.format(type_pgp, data)):
serv_metadata.hset('pgpdump_metadata_{}:{}'.format(type_pgp, data), 'first_seen', date)
serv_metadata.hset('pgpdump_metadata_{}:{}'.format(type_pgp, data), 'last_seen', date)
else:
last_seen = serv_metadata.hget('pgpdump_metadata_{}:{}'.format(type_pgp, data), 'last_seen')
if not last_seen:
serv_metadata.hset('pgpdump_metadata_{}:{}'.format(type_pgp, data), 'last_seen', date)
else:
if int(last_seen) < int(date):
serv_metadata.hset('pgpdump_metadata_{}:{}'.format(type_pgp, data), 'last_seen', date)
# global set
serv_metadata.sadd('set_pgpdump_{}:{}'.format(type_pgp, data), item_path)
# daily
serv_metadata.hincrby('pgpdump:{}:{}'.format(type_pgp, date), data, 1)
# all type
serv_metadata.zincrby('pgpdump_all:{}'.format(type_pgp), data, 1)
## object_metadata
# paste
serv_metadata.sadd('item_pgpdump_{}:{}'.format(type_pgp, item_path), data)
# domain object
if Item.is_crawled(item_path):
domain = Item.get_item_domain(item_path)
serv_metadata.sadd('domain_pgpdump_{}:{}'.format(type_pgp, domain), data)
serv_metadata.sadd('set_domain_pgpdump_{}:{}'.format(type_pgp, data), domain)
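
A small sketch of how a module could record a PGP correlation with this helper (the item id, date and mail address are placeholders; ARDB_Metadata must be running):
```
import os
import sys
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'packages/'))
import Pgp

# index a pgp mail address seen in one item;
# the domain sets are only filled when the item comes from the crawler
Pgp.save_pgp_data('mail', '20191117',
                  'archive/pastebin.com_pro/2019/11/17/example.gz',
                  'alice@example.com')
```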

304
bin/packages/Tag.py Executable file
View file

@ -0,0 +1,304 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
import os
import sys
import redis
import Date
import Item
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'lib/'))
import ConfigLoader
import Domain
from pytaxonomies import Taxonomies
from pymispgalaxies import Galaxies, Clusters
config_loader = ConfigLoader.ConfigLoader()
r_serv_tags = config_loader.get_redis_conn("ARDB_Tags")
r_serv_metadata = config_loader.get_redis_conn("ARDB_Metadata")
config_loader = None
def get_taxonomie_from_tag(tag):
return tag.split(':')[0]
def get_galaxy_from_tag(tag):
galaxy = tag.split(':')[1]
galaxy = galaxy.split('=')[0]
return galaxy
def get_active_taxonomies():
return r_serv_tags.smembers('active_taxonomies')
def get_active_galaxies():
return r_serv_tags.smembers('active_galaxies')
def is_taxonomie_tag_enabled(taxonomie, tag):
if tag in r_serv_tags.smembers('active_tag_' + taxonomie):
return True
else:
return False
def is_galaxy_tag_enabled(galaxy, tag):
if tag in r_serv_tags.smembers('active_tag_galaxies_' + galaxy):
return True
else:
return False
# Check if tags are enabled in AIL
def is_valid_tags_taxonomies_galaxy(list_tags, list_tags_galaxy):
if list_tags:
active_taxonomies = get_active_taxonomies()
for tag in list_tags:
taxonomie = get_taxonomie_from_tag(tag)
if taxonomie not in active_taxonomies:
return False
if not is_taxonomie_tag_enabled(taxonomie, tag):
return False
if list_tags_galaxy:
active_galaxies = get_active_galaxies()
for tag in list_tags_galaxy:
galaxy = get_galaxy_from_tag(tag)
if galaxy not in active_galaxies:
return False
if not is_galaxy_tag_enabled(galaxy, tag):
return False
return True
def get_tag_metadata(tag):
first_seen = r_serv_tags.hget('tag_metadata:{}'.format(tag), 'first_seen')
last_seen = r_serv_tags.hget('tag_metadata:{}'.format(tag), 'last_seen')
return {'tag': tag, 'first_seen': first_seen, 'last_seen': last_seen}
def is_tag_in_all_tag(tag):
if r_serv_tags.sismember('list_tags', tag):
return True
else:
return False
def get_all_tags():
return list(r_serv_tags.smembers('list_tags'))
'''
Return all the tags of a given item.
:param item_id: (Paste or domain)
'''
def get_item_tags(item_id):
tags = r_serv_metadata.smembers('tag:{}'.format(item_id))
if tags:
return list(tags)
else:
return []
def get_min_tag(tag):
tag = tag.split('=')
if len(tag) > 1:
if tag[1] != '':
tag = tag[1][1:-1]
# no value
else:
tag = tag[0][1:-1]
# custom tags
else:
tag = tag[0]
return tag
def get_item_tags_minimal(item_id):
return [ {"tag": tag, "min_tag": get_min_tag(tag)} for tag in get_item_tags(item_id) ]
# TEMPLATE + API QUERY
def add_items_tag(tags=[], galaxy_tags=[], item_id=None): ## TODO: remove me
res_dict = {}
if item_id == None:
return ({'status': 'error', 'reason': 'Item id not found'}, 404)
if not tags and not galaxy_tags:
return ({'status': 'error', 'reason': 'Tags or Galaxy not specified'}, 400)
res_dict['tags'] = []
for tag in tags:
taxonomie = get_taxonomie_from_tag(tag)
if is_taxonomie_tag_enabled(taxonomie, tag):
add_item_tag(tag, item_id)
res_dict['tags'].append(tag)
else:
return ({'status': 'error', 'reason': 'Tags or Galaxy not enabled'}, 400)
for tag in galaxy_tags:
galaxy = get_galaxy_from_tag(tag)
if is_galaxy_tag_enabled(galaxy, tag):
add_item_tag(tag, item_id)
res_dict['tags'].append(tag)
else:
return ({'status': 'error', 'reason': 'Tags or Galaxy not enabled'}, 400)
res_dict['id'] = item_id
return (res_dict, 200)
# TEMPLATE + API QUERY
def add_items_tags(tags=[], galaxy_tags=[], item_id=None, item_type="paste"):
res_dict = {}
if item_id == None:
return ({'status': 'error', 'reason': 'Item id not found'}, 404)
if not tags and not galaxy_tags:
return ({'status': 'error', 'reason': 'Tags or Galaxy not specified'}, 400)
if item_type not in ('paste', 'domain'):
return ({'status': 'error', 'reason': 'Incorrect item_type'}, 400)
res_dict['tags'] = []
for tag in tags:
if tag:
taxonomie = get_taxonomie_from_tag(tag)
if is_taxonomie_tag_enabled(taxonomie, tag):
add_item_tag(tag, item_id, item_type=item_type)
res_dict['tags'].append(tag)
else:
return ({'status': 'error', 'reason': 'Tags or Galaxy not enabled'}, 400)
for tag in galaxy_tags:
if tag:
galaxy = get_galaxy_from_tag(tag)
if is_galaxy_tag_enabled(galaxy, tag):
add_item_tag(tag, item_id, item_type=item_type)
res_dict['tags'].append(tag)
else:
return ({'status': 'error', 'reason': 'Tags or Galaxy not enabled'}, 400)
res_dict['id'] = item_id
res_dict['type'] = item_type
return (res_dict, 200)
def add_domain_tag(tag, domain, item_date):
r_serv_metadata.sadd('tag:{}'.format(domain), tag)
r_serv_tags.sadd('domain:{}:{}'.format(tag, item_date), domain)
def add_item_tag(tag, item_path, item_type="paste", tag_date=None):
if item_type=="paste":
item_date = int(Item.get_item_date(item_path))
#add tag
r_serv_metadata.sadd('tag:{}'.format(item_path), tag)
r_serv_tags.sadd('{}:{}'.format(tag, item_date), item_path)
if Item.is_crawled(item_path):
domain = Item.get_item_domain(item_path)
r_serv_metadata.sadd('tag:{}'.format(domain), tag)
r_serv_tags.sadd('domain:{}:{}'.format(tag, item_date), domain)
# domain item
else:
item_date = int(Domain.get_domain_last_check(item_path, r_format="int"))
add_domain_tag(tag, item_path, item_date)
r_serv_tags.hincrby('daily_tags:{}'.format(item_date), tag, 1)
tag_first_seen = r_serv_tags.hget('tag_metadata:{}'.format(tag), 'first_seen')
if tag_first_seen is None:
tag_first_seen = 99999999
else:
tag_first_seen = int(tag_first_seen)
tag_last_seen = r_serv_tags.hget('tag_metadata:{}'.format(tag), 'last_seen')
if tag_last_seen is None:
tag_last_seen = 0
else:
tag_last_seen = int(tag_last_seen)
#add new tag in list of all used tags
r_serv_tags.sadd('list_tags', tag)
# update first_seen/last_seen
if item_date < tag_first_seen:
r_serv_tags.hset('tag_metadata:{}'.format(tag), 'first_seen', item_date)
# update metadata last_seen
if item_date > tag_last_seen:
r_serv_tags.hset('tag_metadata:{}'.format(tag), 'last_seen', item_date)
# API QUERY
def remove_item_tags(tags=[], item_id=None):
if item_id == None:
return ({'status': 'error', 'reason': 'Item id not found'}, 404)
if not tags:
return ({'status': 'error', 'reason': 'No Tag(s) specified'}, 400)
dict_res = {}
dict_res['tags'] = []
for tag in tags:
res = remove_item_tag(tag, item_id)
if res[1] != 200:
return res
else:
dict_res['tags'].append(tag)
dict_res['id'] = item_id
return (dict_res, 200)
# TEMPLATE + API QUERY
def remove_item_tag(tag, item_id):
item_date = int(Item.get_item_date(item_id))
#remove tag
r_serv_metadata.srem('tag:{}'.format(item_id), tag)
res = r_serv_tags.srem('{}:{}'.format(tag, item_date), item_id)
if res ==1:
# no tag for this day
if int(r_serv_tags.hget('daily_tags:{}'.format(item_date), tag)) == 1:
r_serv_tags.hdel('daily_tags:{}'.format(item_date), tag)
else:
r_serv_tags.hincrby('daily_tags:{}'.format(item_date), tag, -1)
tag_first_seen = int(r_serv_tags.hget('tag_metadata:{}'.format(tag), 'first_seen'))
tag_last_seen = int(r_serv_tags.hget('tag_metadata:{}'.format(tag), 'last_seen'))
# update first_seen/last_seen
if item_date == tag_first_seen:
update_tag_first_seen(tag, tag_first_seen, tag_last_seen)
if item_date == tag_last_seen:
update_tag_last_seen(tag, tag_first_seen, tag_last_seen)
return ({'status': 'success'}, 200)
else:
return ({'status': 'error', 'reason': 'Item id or tag not found'}, 400)
def update_tag_first_seen(tag, tag_first_seen, tag_last_seen):
if tag_first_seen == tag_last_seen:
if r_serv_tags.scard('{}:{}'.format(tag, tag_first_seen)) > 0:
r_serv_tags.hset('tag_metadata:{}'.format(tag), 'first_seen', tag_first_seen)
# no tag in db
else:
r_serv_tags.srem('list_tags', tag)
r_serv_tags.hdel('tag_metadata:{}'.format(tag), 'first_seen')
r_serv_tags.hdel('tag_metadata:{}'.format(tag), 'last_seen')
else:
if r_serv_tags.scard('{}:{}'.format(tag, tag_first_seen)) > 0:
r_serv_tags.hset('tag_metadata:{}'.format(tag), 'first_seen', tag_first_seen)
else:
tag_first_seen = Date.date_add_day(tag_first_seen)
update_tag_first_seen(tag, tag_first_seen, tag_last_seen)
def update_tag_last_seen(tag, tag_first_seen, tag_last_seen):
if tag_first_seen == tag_last_seen:
if r_serv_tags.scard('{}:{}'.format(tag, tag_last_seen)) > 0:
r_serv_tags.hset('tag_metadata:{}'.format(tag), 'last_seen', tag_last_seen)
# no tag in db
else:
r_serv_tags.srem('list_tags', tag)
r_serv_tags.hdel('tag_metadata:{}'.format(tag), 'first_seen')
r_serv_tags.hdel('tag_metadata:{}'.format(tag), 'last_seen')
else:
if r_serv_tags.scard('{}:{}'.format(tag, tag_last_seen)) > 0:
r_serv_tags.hset('tag_metadata:{}'.format(tag), 'last_seen', tag_last_seen)
else:
tag_last_seen = Date.date_substract_day(tag_last_seen)
update_tag_last_seen(tag, tag_first_seen, tag_last_seen)
# used by modal
def get_modal_add_tags(item_id, tag_type='paste'):
'''
Modal: add tags to domain or Paste
'''
return {"active_taxonomies": get_active_taxonomies(), "active_galaxies": get_active_galaxies(),
"item_id": item_id, "type": tag_type}

501
bin/packages/Term.py Executable file
View file

@ -0,0 +1,501 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
import os
import re
import sys
import time
import uuid
import redis
import datetime
from collections import defaultdict
from nltk.tokenize import RegexpTokenizer
from textblob import TextBlob
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'lib/'))
import ConfigLoader
from flask import escape
import Date
import Item
config_loader = ConfigLoader.ConfigLoader()
r_serv_term = config_loader.get_redis_conn("ARDB_Tracker")
r_serv_db = config_loader.get_redis_conn("ARDB_DB") # used by is_in_role()
config_loader = None
email_regex = r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,6}'
email_regex = re.compile(email_regex)
special_characters = set('[<>~!?@#$%^&*|()_-+={}":;,.\'\n\r\t]/\\')
special_characters.add('\\s')
# NLTK tokenizer
tokenizer = RegexpTokenizer('[\&\~\:\;\,\.\(\)\{\}\|\[\]\\\\/\-/\=\'\"\%\$\?\@\+\#\_\^\<\>\!\*\n\r\t\s]+',
gaps=True, discard_empty=True)
def is_valid_uuid_v4(UUID):
UUID = UUID.replace('-', '')
try:
uuid_test = uuid.UUID(hex=UUID, version=4)
return uuid_test.hex == UUID
except:
return False
# # TODO: use new package => duplicate fct
def is_in_role(user_id, role):
if r_serv_db.sismember('user_role:{}'.format(role), user_id):
return True
else:
return False
def check_term_uuid_valid_access(term_uuid, user_id):
if not is_valid_uuid_v4(term_uuid):
return ({"status": "error", "reason": "Invalid uuid"}, 400)
level = r_serv_term.hget('tracker:{}'.format(term_uuid), 'level')
if not level:
return ({"status": "error", "reason": "Unknown uuid"}, 404)
if level == 0:
if r_serv_term.hget('tracker:{}'.format(term_uuid), 'user_id') != user_id:
if not is_in_role(user_id, 'admin'):
return ({"status": "error", "reason": "Unknown uuid"}, 404)
return None
def is_valid_mail(email):
result = email_regex.match(email)
if result:
return True
else:
return False
def verify_mail_list(mail_list):
for mail in mail_list:
if not is_valid_mail(mail):
return ({'status': 'error', 'reason': 'Invalid email', 'value': mail}, 400)
return None
def is_valid_regex(term_regex):
try:
re.compile(term_regex)
return True
except:
return False
def get_text_word_frequency(item_content, filtering=True):
item_content = item_content.lower()
words_dict = defaultdict(int)
if filtering:
blob = TextBlob(item_content , tokenizer=tokenizer)
else:
blob = TextBlob(item_content)
for word in blob.tokens:
words_dict[word] += 1
return words_dict
# # TODO: create all tracked words
def get_tracked_words_list():
return list(r_serv_term.smembers('all:tracker:word'))
def get_set_tracked_words_list():
set_list = r_serv_term.smembers('all:tracker:set')
all_set_list = []
for elem in set_list:
res = elem.split(';')
num_words = int(res[1])
ter_set = res[0].split(',')
all_set_list.append((ter_set, num_words, elem))
return all_set_list
def get_regex_tracked_words_dict():
regex_list = r_serv_term.smembers('all:tracker:regex')
dict_tracked_regex = {}
for regex in regex_list:
dict_tracked_regex[regex] = re.compile(regex)
return dict_tracked_regex
def get_tracked_term_list_item(term_uuid, date_from, date_to):
all_item_id = []
if date_from and date_to:
for date in r_serv_term.zrangebyscore('tracker:stat:{}'.format(term_uuid), int(date_from), int(date_to)):
all_item_id = all_item_id + list(r_serv_term.smembers('tracker:item:{}:{}'.format(term_uuid, date)))
return all_item_id
def is_term_tracked_in_global_level(term, term_type):
res = r_serv_term.smembers('all:tracker_uuid:{}:{}'.format(term_type, term))
if res:
for elem_uuid in res:
if r_serv_term.hget('tracker:{}'.format(elem_uuid), 'level')=='1':
return True
return False
def is_term_tracked_in_user_level(term, term_type, user_id):
res = r_serv_term.smembers('user:tracker:{}'.format(user_id))
if res:
for elem_uuid in res:
if r_serv_term.hget('tracker:{}'.format(elem_uuid), 'tracked')== term:
if r_serv_term.hget('tracker:{}'.format(elem_uuid), 'type')== term_type:
return True
return False
def parse_json_term_to_add(dict_input, user_id):
term = dict_input.get('term', None)
if not term:
return ({"status": "error", "reason": "Term not provided"}, 400)
term_type = dict_input.get('type', None)
if not term_type:
return ({"status": "error", "reason": "Term type not provided"}, 400)
nb_words = dict_input.get('nb_words', 1)
description = dict_input.get('description', '')
description = escape(description)
res = parse_tracked_term_to_add(term , term_type, nb_words=nb_words)
if res[1]!=200:
return res
term = res[0]['term']
term_type = res[0]['type']
tags = dict_input.get('tags', [])
mails = dict_input.get('mails', [])
res = verify_mail_list(mails)
if res:
return res
## TODO: add dashboard key
level = dict_input.get('level', 1)
try:
level = int(level)
if level not in range(0, 1):
level = 1
except:
level = 1
# check if term already tracked in global
if level==1:
if is_term_tracked_in_global_level(term, term_type):
return ({"status": "error", "reason": "Term already tracked"}, 409)
else:
if is_term_tracked_in_user_level(term, term_type, user_id):
return ({"status": "error", "reason": "Term already tracked"}, 409)
term_uuid = add_tracked_term(term , term_type, user_id, level, tags, mails, description)
return ({'term': term, 'type': term_type, 'uuid': term_uuid}, 200)
def parse_tracked_term_to_add(term , term_type, nb_words=1):
if term_type=='regex':
if not is_valid_regex(term):
return ({"status": "error", "reason": "Invalid regex"}, 400)
elif term_type=='word' or term_type=='set':
# force lowercase
term = term.lower()
word_set = set(term)
set_inter = word_set.intersection(special_characters)
if set_inter:
return ({"status": "error", "reason": "special character not allowed", "message": "Please use a regex or remove all special characters"}, 400)
words = term.split()
# not a word
if term_type=='word' and len(words)>1:
term_type = 'set'
# output format: term1,term2,term3;2
if term_type=='set':
try:
nb_words = int(nb_words)
except:
nb_words = 1
if nb_words==0:
nb_words = 1
words_set = set(words)
words_set = sorted(words_set)
term = ",".join(words_set)
term = "{};{}".format(term, nb_words)
if nb_words > len(words_set):
nb_words = len(words_set)
else:
return ({"status": "error", "reason": "Incorrect type"}, 400)
return ({"status": "success", "term": term, "type": term_type}, 200)
def add_tracked_term(term , term_type, user_id, level, tags, mails, description, dashboard=0):
term_uuid = str(uuid.uuid4())
# create metadata
r_serv_term.hset('tracker:{}'.format(term_uuid), 'tracked',term)
r_serv_term.hset('tracker:{}'.format(term_uuid), 'type', term_type)
r_serv_term.hset('tracker:{}'.format(term_uuid), 'date', datetime.date.today().strftime("%Y%m%d"))
r_serv_term.hset('tracker:{}'.format(term_uuid), 'user_id', user_id)
r_serv_term.hset('tracker:{}'.format(term_uuid), 'level', level)
r_serv_term.hset('tracker:{}'.format(term_uuid), 'dashboard', dashboard)
if description:
r_serv_term.hset('tracker:{}'.format(term_uuid), 'description', description)
# create all term set
r_serv_term.sadd('all:tracker:{}'.format(term_type), term)
# create term - uuid map
r_serv_term.sadd('all:tracker_uuid:{}:{}'.format(term_type, term), term_uuid)
# add display level set
if level == 0: # user only
r_serv_term.sadd('user:tracker:{}'.format(user_id), term_uuid)
r_serv_term.sadd('user:tracker:{}:{}'.format(user_id, term_type), term_uuid)
elif level == 1: # global
r_serv_term.sadd('global:tracker', term_uuid)
r_serv_term.sadd('global:tracker:{}'.format(term_type), term_uuid)
# create term tags list
for tag in tags:
r_serv_term.sadd('tracker:tags:{}'.format(term_uuid), escape(tag) )
# create term tags mail notification list
for mail in mails:
r_serv_term.sadd('tracker:mail:{}'.format(term_uuid), escape(mail) )
# toggle refresh module tracker list/set
r_serv_term.set('tracker:refresh:{}'.format(term_type), time.time())
return term_uuid
def parse_tracked_term_to_delete(dict_input, user_id):
term_uuid = dict_input.get("uuid", None)
res = check_term_uuid_valid_access(term_uuid, user_id)
if res:
return res
delete_term(term_uuid)
return ({"uuid": term_uuid}, 200)
def delete_term(term_uuid):
term = r_serv_term.hget('tracker:{}'.format(term_uuid), 'tracked')
term_type = r_serv_term.hget('tracker:{}'.format(term_uuid), 'type')
level = r_serv_term.hget('tracker:{}'.format(term_uuid), 'level')
r_serv_term.srem('all:tracker_uuid:{}:{}'.format(term_type, term), term_uuid)
# Term not tracked by other users
if not r_serv_term.exists('all:tracker_uuid:{}:{}'.format(term_type, term)):
r_serv_term.srem('all:tracker:{}'.format(term_type), term)
# toggle refresh module tracker list/set
r_serv_term.set('tracker:refresh:{}'.format(term_type), time.time())
if level == '0': # user only
user_id = r_serv_term.hget('tracker:{}'.format(term_uuid), 'user_id')
r_serv_term.srem('user:tracker:{}'.format(user_id), term_uuid)
r_serv_term.srem('user:tracker:{}:{}'.format(user_id, term_type), term_uuid)
elif level == '1': # global
r_serv_term.srem('global:tracker', term_uuid)
r_serv_term.srem('global:tracker:{}'.format(term_type), term_uuid)
# delete metadata
r_serv_term.delete('tracker:{}'.format(term_uuid))
# remove tags
r_serv_term.delete('tracker:tags:{}'.format(term_uuid))
# remove mails
r_serv_term.delete('tracker:mail:{}'.format(term_uuid))
# remove item set
all_item_date = r_serv_term.zrange('tracker:stat:{}'.format(term_uuid), 0, -1)
for date in all_item_date:
r_serv_term.delete('tracker:item:{}:{}'.format(term_uuid, date))
r_serv_term.delete('tracker:stat:{}'.format(term_uuid))
def replace_tracker_description(term_uuid, description):
description = escape(description)
r_serv_term.hset('tracker:{}'.format(term_uuid), 'description', description)
def replace_tracked_term_tags(term_uuid, tags):
r_serv_term.delete('tracker:tags:{}'.format(term_uuid))
for tag in tags:
tag = escape(tag)
r_serv_term.sadd('tracker:tags:{}'.format(term_uuid), tag)
def replace_tracked_term_mails(term_uuid, mails):
res = verify_mail_list(mails)
if res:
return res
else:
r_serv_term.delete('tracker:mail:{}'.format(term_uuid))
for mail in mails:
mail = escape(mail)
r_serv_term.sadd('tracker:mail:{}'.format(term_uuid), mail)
def get_term_uuid_list(term, term_type):
return list(r_serv_term.smembers('all:tracker_uuid:{}:{}'.format(term_type, term)))
def get_term_tags(term_uuid):
return list(r_serv_term.smembers('tracker:tags:{}'.format(term_uuid)))
def get_term_mails(term_uuid):
return list(r_serv_term.smembers('tracker:mail:{}'.format(term_uuid)))
def add_tracked_item(term_uuid, item_id, item_date):
# track item
r_serv_term.sadd('tracker:item:{}:{}'.format(term_uuid, item_date), item_id)
# track nb item by date
r_serv_term.zadd('tracker:stat:{}'.format(term_uuid), item_date, int(item_date))
def create_token_statistics(item_date, word, nb):
r_serv_term.zincrby('stat_token_per_item_by_day:{}'.format(item_date), word, 1)
r_serv_term.zincrby('stat_token_total_by_day:{}'.format(item_date), word, nb)
r_serv_term.sadd('stat_token_history', item_date)
def delete_token_statistics_by_date(item_date):
r_serv_term.delete('stat_token_per_item_by_day:{}'.format(item_date))
r_serv_term.delete('stat_token_total_by_day:{}'.format(item_date))
r_serv_term.srem('stat_token_history', item_date)
def get_all_token_stat_history():
return r_serv_term.smembers('stat_token_history')
def get_tracked_term_last_updated_by_type(term_type):
epoch_update = r_serv_term.get('tracker:refresh:{}'.format(term_type))
if not epoch_update:
epoch_update = 0
return float(epoch_update)
def parse_get_tracker_term_item(dict_input, user_id):
term_uuid = dict_input.get('uuid', None)
res = check_term_uuid_valid_access(term_uuid, user_id)
if res:
return res
date_from = dict_input.get('date_from', None)
date_to = dict_input.get('date_to', None)
if date_from is None:
date_from = get_tracked_term_first_seen(term_uuid)
if date_from:
date_from = date_from[0]
if date_to is None:
date_to = date_from
if date_from > date_to:
date_from = date_to
all_item_id = get_tracked_term_list_item(term_uuid, date_from, date_to)
all_item_id = Item.get_item_list_desc(all_item_id)
res_dict = {}
res_dict['uuid'] = term_uuid
res_dict['date_from'] = date_from
res_dict['date_to'] = date_to
res_dict['items'] = all_item_id
return (res_dict, 200)
def get_tracked_term_first_seen(term_uuid):
res = r_serv_term.zrange('tracker:stat:{}'.format(term_uuid), 0, 0)
if res:
return res[0]
else:
return None
def get_tracked_term_last_seen(term_uuid):
res = r_serv_term.zrevrange('tracker:stat:{}'.format(term_uuid), 0, 0)
if res:
return res[0]
else:
return None
def get_term_metedata(term_uuid, user_id=False, description=False, level=False, tags=False, mails=False, sparkline=False):
dict_uuid = {}
dict_uuid['term'] = r_serv_term.hget('tracker:{}'.format(term_uuid), 'tracked')
dict_uuid['type'] = r_serv_term.hget('tracker:{}'.format(term_uuid), 'type')
dict_uuid['date'] = r_serv_term.hget('tracker:{}'.format(term_uuid), 'date')
dict_uuid['description'] = r_serv_term.hget('tracker:{}'.format(term_uuid), 'description')
dict_uuid['first_seen'] = get_tracked_term_first_seen(term_uuid)
dict_uuid['last_seen'] = get_tracked_term_last_seen(term_uuid)
if user_id:
dict_uuid['user_id'] = r_serv_term.hget('tracker:{}'.format(term_uuid), 'user_id')
if level:
dict_uuid['level'] = r_serv_term.hget('tracker:{}'.format(term_uuid), 'level')
if mails:
dict_uuid['mails'] = get_list_trackeed_term_mails(term_uuid)
if tags:
dict_uuid['tags'] = get_list_trackeed_term_tags(term_uuid)
if sparkline:
dict_uuid['sparkline'] = get_tracked_term_sparkline(term_uuid)
dict_uuid['uuid'] = term_uuid
return dict_uuid
def get_tracked_term_sparkline(tracker_uuid, num_day=6):
date_range_sparkline = Date.get_date_range(num_day)
sparklines_value = []
for date_day in date_range_sparkline:
nb_seen_this_day = r_serv_term.scard('tracker:item:{}:{}'.format(tracker_uuid, date_day))
if nb_seen_this_day is None:
nb_seen_this_day = 0
sparklines_value.append(int(nb_seen_this_day))
return sparklines_value
def get_list_tracked_term_stats_by_day(list_tracker_uuid, num_day=31, date_from=None, date_to=None):
if date_from and date_to:
date_range = Date.substract_date(date_from, date_to)
else:
date_range = Date.get_date_range(num_day)
list_tracker_stats = []
for tracker_uuid in list_tracker_uuid:
dict_tracker_data = []
tracker = r_serv_term.hget('tracker:{}'.format(tracker_uuid), 'tracked')
for date_day in date_range:
nb_seen_this_day = r_serv_term.scard('tracker:item:{}:{}'.format(tracker_uuid, date_day))
if nb_seen_this_day is None:
nb_seen_this_day = 0
dict_tracker_data.append({"date": date_day,"value": int(nb_seen_this_day)})
list_tracker_stats.append({"name": tracker,"Data": dict_tracker_data})
return list_tracker_stats
def get_list_trackeed_term_tags(term_uuid):
res = r_serv_term.smembers('tracker:tags:{}'.format(term_uuid))
if res:
return list(res)
else:
return []
def get_list_trackeed_term_mails(term_uuid):
res = r_serv_term.smembers('tracker:mail:{}'.format(term_uuid))
if res:
return list(res)
else:
return []
def get_user_tracked_term_uuid(user_id, filter_type=None):
if filter_type:
return list(r_serv_term.smembers('user:tracker:{}:{}'.format(user_id,filter_type)))
else:
return list(r_serv_term.smembers('user:tracker:{}'.format(user_id)))
def get_global_tracked_term_uuid(filter_type=None):
if filter_type:
return list(r_serv_term.smembers('global:tracker:{}'.format(filter_type)))
else:
return list(r_serv_term.smembers('global:tracker'))
def get_all_user_tracked_terms(user_id, filter_type=None):
all_user_term = []
all_user_term_uuid = get_user_tracked_term_uuid(user_id, filter_type=filter_type)
for term_uuid in all_user_term_uuid:
all_user_term.append(get_term_metedata(term_uuid, tags=True, mails=True, sparkline=True))
return all_user_term
def get_all_global_tracked_terms(filter_type=None):
all_user_term = []
all_user_term_uuid = get_global_tracked_term_uuid(filter_type=filter_type)
for term_uuid in all_user_term_uuid:
all_user_term.append(get_term_metedata(term_uuid, user_id=True, tags=True, mails=True, sparkline=True))
return all_user_term
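
Sketch of creating and inspecting a tracker with this module (the user id and tracked term are placeholders; ARDB_Tracker must be running):
```
import os
import sys
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'packages/'))
import Term

res, http_code = Term.parse_json_term_to_add({'term': 'bitcoin mixer', 'type': 'set',
                                              'nb_words': 2, 'tags': [], 'mails': []},
                                             user_id='admin@admin.test')
if http_code == 200:
    # returns the term, its type and the generated tracker uuid
    print(Term.get_term_metedata(res['uuid'], tags=True, mails=True, sparkline=True))
```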

View file

@ -2,9 +2,12 @@
# -*-coding:UTF-8 -*
import os
import sys
import redis
import bcrypt
import configparser
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'lib/'))
import ConfigLoader
from flask_login import UserMixin
@ -12,20 +15,10 @@ class User(UserMixin):
def __init__(self, id):
configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
if not os.path.exists(configfile):
raise Exception('Unable to find the configuration file. \
Did you set environment variables? \
Or activate the virtualenv.')
config_loader = ConfigLoader.ConfigLoader()
cfg = configparser.ConfigParser()
cfg.read(configfile)
self.r_serv_db = redis.StrictRedis(
host=cfg.get("ARDB_DB", "host"),
port=cfg.getint("ARDB_DB", "port"),
db=cfg.getint("ARDB_DB", "db"),
decode_responses=True)
self.r_serv_db = config_loader.get_redis_conn("ARDB_DB")
config_loader = None
if self.r_serv_db.hexists('user:all', id):
self.id = id

View file

@ -1,14 +1,20 @@
#!/usr/bin/python3
import re
import os
import configparser
import re
import sys
import dns.resolver
from pubsublogger import publisher
from datetime import timedelta
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'lib/'))
import ConfigLoader
config_loader = ConfigLoader.ConfigLoader()
dns_server = config_loader.get_config_str("Web", "dns")
config_loader = None
def is_luhn_valid(card_number):
"""Apply the Luhn algorithm to validate credit card.
@ -103,14 +109,6 @@ def checking_MX_record(r_serv, adress_set, addr_dns):
def checking_A_record(r_serv, domains_set):
configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
if not os.path.exists(configfile):
raise Exception('Unable to find the configuration file. \
Did you set environment variables? \
Or activate the virtualenv.')
cfg = configparser.ConfigParser()
cfg.read(configfile)
dns_server = cfg.get("Web", "dns")
score = 0
num = len(domains_set)

View file

@ -11,62 +11,10 @@ from dateutil.rrule import rrule, DAILY
import csv
def listdirectory(path):
"""Path Traversing Function.
:param path: -- The absolute pathname to a directory.
This function is returning all the absolute path of the files contained in
the argument directory.
"""
fichier = []
for root, dirs, files in os.walk(path):
for i in files:
fichier.append(os.path.join(root, i))
return fichier
clean = lambda dirty: ''.join(filter(string.printable.__contains__, dirty))
"""It filters out non-printable characters from the string it receives."""
def create_dirfile(r_serv, directory, overwrite):
"""Create a file of path.
:param r_serv: -- connexion to redis database
:param directory: -- The folder where to launch the listing of the .gz files
This function create a list in redis with inside the absolute path
of all the pastes needed to be proceeded by function using parallel
(like redis_words_ranking)
"""
if overwrite:
r_serv.delete("filelist")
for x in listdirectory(directory):
r_serv.lpush("filelist", x)
publisher.info("The list was overwritten")
else:
if r_serv.llen("filelist") == 0:
for x in listdirectory(directory):
r_serv.lpush("filelist", x)
publisher.info("New list created")
else:
for x in listdirectory(directory):
r_serv.lpush("filelist", x)
publisher.info("The list was updated with new elements")
def create_curve_with_word_file(r_serv, csvfilename, feederfilename, year, month):
"""Create a csv file used with dygraph.

View file

@ -19,31 +19,20 @@ subscribe = Redis_Global
[Attributes]
subscribe = Redis_Global
[Lines]
subscribe = Redis_Global
publish = Redis_LinesShort,Redis_LinesLong
[DomClassifier]
subscribe = Redis_Global
[Tokenize]
subscribe = Redis_LinesShort
publish = Redis_Words
[Curve]
subscribe = Redis_Words
publish = Redis_CurveManageTopSets,Redis_Tags
[RegexForTermsFrequency]
[TermTrackerMod]
subscribe = Redis_Global
publish = Redis_Tags
[SetForTermsFrequency]
[RegexTracker]
subscribe = Redis_Global
publish = Redis_Tags
[CurveManageTopSets]
subscribe = Redis_CurveManageTopSets
[Tools]
subscribe = Redis_Global
publish = Redis_Tags
[Categ]
subscribe = Redis_Global
@ -143,3 +132,7 @@ publish = Redis_Mixer
[Crawler]
subscribe = Redis_Crawler
publish = Redis_Mixer,Redis_Tags
[IP]
subscribe = Redis_Global
publish = Redis_Duplicate,Redis_Tags
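
The renamed sections (TermTrackerMod, RegexTracker) and the new IP section follow the same subscribe/publish queue format as the rest of the file; a quick way to inspect them, assuming the file sits at its usual bin/packages/modules.cfg location:
```
import configparser

cfg = configparser.ConfigParser()
cfg.read('bin/packages/modules.cfg')
print(cfg.get('TermTrackerMod', 'subscribe'))  # Redis_Global
print(cfg.get('TermTrackerMod', 'publish'))    # Redis_Tags
print(cfg.get('IP', 'publish'))                # Redis_Duplicate,Redis_Tags
```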

View file

@ -1,7 +1,6 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
import configparser
import os
import sys
import gzip
@ -17,6 +16,9 @@ import sflock
from Helper import Process
from pubsublogger import publisher
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'lib/'))
import ConfigLoader
def create_paste(uuid, paste_content, ltags, ltagsgalaxies, name):
now = datetime.datetime.now()
@ -47,7 +49,11 @@ def create_paste(uuid, paste_content, ltags, ltagsgalaxies, name):
r_serv_log_submit.hincrby("mixer_cache:list_feeder", "submitted", 1)
# add tags
add_tags(ltags, ltagsgalaxies, rel_item_path)
for tag in ltags:
add_item_tag(tag, rel_item_path)
for tag in ltagsgalaxies:
add_item_tag(tag, rel_item_path)
r_serv_log_submit.incr(uuid + ':nb_end')
r_serv_log_submit.incr(uuid + ':nb_sucess')
@ -92,7 +98,6 @@ def remove_submit_uuid(uuid):
r_serv_log_submit.expire(uuid + ':nb_sucess', expire_time)
r_serv_log_submit.expire(uuid + ':nb_end', expire_time)
r_serv_log_submit.expire(uuid + ':error', expire_time)
r_serv_log_submit.srem(uuid + ':paste_submit_link', '')
r_serv_log_submit.expire(uuid + ':paste_submit_link', expire_time)
# delete uuid
@ -134,18 +139,6 @@ def add_item_tag(tag, item_path):
if item_date > tag_last_seen:
r_serv_tags.hset('tag_metadata:{}'.format(tag), 'last_seen', item_date)
def add_tags(tags, tagsgalaxies, path):
list_tag = tags.split(',')
list_tag_galaxies = tagsgalaxies.split(',')
if list_tag != ['']:
for tag in list_tag:
add_item_tag(tag, path)
if list_tag_galaxies != ['']:
for tag in list_tag_galaxies:
add_item_tag(tag, path)
def verify_extention_filename(filename):
if not '.' in filename:
return True
@ -163,44 +156,13 @@ if __name__ == "__main__":
publisher.port = 6380
publisher.channel = "Script"
configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
if not os.path.exists(configfile):
raise Exception('Unable to find the configuration file. \
Did you set environment variables? \
Or activate the virtualenv.')
config_loader = ConfigLoader.ConfigLoader()
cfg = configparser.ConfigParser()
cfg.read(configfile)
r_serv_db = redis.StrictRedis(
host=cfg.get("ARDB_DB", "host"),
port=cfg.getint("ARDB_DB", "port"),
db=cfg.getint("ARDB_DB", "db"),
decode_responses=True)
r_serv_log_submit = redis.StrictRedis(
host=cfg.get("Redis_Log_submit", "host"),
port=cfg.getint("Redis_Log_submit", "port"),
db=cfg.getint("Redis_Log_submit", "db"),
decode_responses=True)
r_serv_tags = redis.StrictRedis(
host=cfg.get("ARDB_Tags", "host"),
port=cfg.getint("ARDB_Tags", "port"),
db=cfg.getint("ARDB_Tags", "db"),
decode_responses=True)
r_serv_metadata = redis.StrictRedis(
host=cfg.get("ARDB_Metadata", "host"),
port=cfg.getint("ARDB_Metadata", "port"),
db=cfg.getint("ARDB_Metadata", "db"),
decode_responses=True)
serv_statistics = redis.StrictRedis(
host=cfg.get('ARDB_Statistics', 'host'),
port=cfg.getint('ARDB_Statistics', 'port'),
db=cfg.getint('ARDB_Statistics', 'db'),
decode_responses=True)
r_serv_db = config_loader.get_redis_conn("ARDB_DB")
r_serv_log_submit = config_loader.get_redis_conn("Redis_Log_submit")
r_serv_tags = config_loader.get_redis_conn("ARDB_Tags")
r_serv_metadata = config_loader.get_redis_conn("ARDB_Metadata")
serv_statistics = config_loader.get_redis_conn("ARDB_Statistics")
expire_time = 120
MAX_FILE_SIZE = 1000000000
@ -209,7 +171,9 @@ if __name__ == "__main__":
config_section = 'submit_paste'
p = Process(config_section)
PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], cfg.get("Directories", "pastes")) + '/'
PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], config_loader.get_config_str("Directories", "pastes")) + '/'
config_loader = None
while True:
@ -218,8 +182,8 @@ if __name__ == "__main__":
uuid = r_serv_db.srandmember('submitted:uuid')
# get temp value save on disk
ltags = r_serv_db.get(uuid + ':ltags')
ltagsgalaxies = r_serv_db.get(uuid + ':ltagsgalaxies')
ltags = r_serv_db.smembers(uuid + ':ltags')
ltagsgalaxies = r_serv_db.smembers(uuid + ':ltagsgalaxies')
paste_content = r_serv_db.get(uuid + ':paste_content')
isfile = r_serv_db.get(uuid + ':isfile')
password = r_serv_db.get(uuid + ':password')
@ -230,8 +194,6 @@ if __name__ == "__main__":
r_serv_log_submit.set(uuid + ':nb_total', -1)
r_serv_log_submit.set(uuid + ':nb_end', 0)
r_serv_log_submit.set(uuid + ':nb_sucess', 0)
r_serv_log_submit.set(uuid + ':error', 'error:')
r_serv_log_submit.sadd(uuid + ':paste_submit_link', '')
r_serv_log_submit.set(uuid + ':processing', 1)
@ -275,7 +237,7 @@ if __name__ == "__main__":
else:
#decompress file
try:
if password == '':
if password == None:
files = unpack(file_full_path.encode())
#print(files.children)
else:

View file

@ -68,9 +68,11 @@ class TorSplashCrawler():
self.date_month = date['date_month']
self.date_epoch = int(date['epoch'])
# # TODO: timeout in config
self.arg_crawler = { 'html': crawler_options['html'],
'wait': 10,
'render_all': 1,
'timeout': 30,
'har': crawler_options['har'],
'png': crawler_options['png']}
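
These options are passed to Splash for every crawled page; roughly the same request can be reproduced against a Splash instance directly (the URL, port and target below are assumptions, not values taken from the crawler):
```
import requests

splash = 'http://127.0.0.1:8050'
params = {'url': 'http://example.onion', 'html': 1, 'png': 1, 'har': 1,
          'wait': 10, 'render_all': 1, 'timeout': 30}
# returns the rendered HTML, a screenshot and a HAR archive in a single JSON document
r = requests.get(splash + '/render.json', params=params)
print(r.status_code)
```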

View file

@ -5,29 +5,21 @@ import os
import sys
import json
import redis
import configparser
from TorSplashCrawler import TorSplashCrawler
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'lib/'))
import ConfigLoader
if __name__ == '__main__':
if len(sys.argv) != 2:
print('usage:', 'tor_crawler.py', 'uuid')
exit(1)
configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
if not os.path.exists(configfile):
raise Exception('Unable to find the configuration file. \
Did you set environment variables? \
Or activate the virtualenv.')
cfg = configparser.ConfigParser()
cfg.read(configfile)
redis_cache = redis.StrictRedis(
host=cfg.get("Redis_Cache", "host"),
port=cfg.getint("Redis_Cache", "port"),
db=cfg.getint("Redis_Cache", "db"),
decode_responses=True)
config_loader = ConfigLoader.ConfigLoader()
redis_cache = config_loader.get_redis_conn("Redis_Cache")
config_loader = None
# get crawler config key
uuid = sys.argv[1]

View file

@ -13,23 +13,17 @@ import os
import sys
import redis
import subprocess
import configparser
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'lib/'))
import ConfigLoader
if __name__ == "__main__":
configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
if not os.path.exists(configfile):
raise Exception('Unable to find the configuration file. \
Did you set environment variables? \
Or activate the virtualenv.')
cfg = configparser.ConfigParser()
cfg.read(configfile)
config_loader = ConfigLoader.ConfigLoader()
r_serv = redis.StrictRedis(
host=cfg.get("ARDB_DB", "host"),
port=cfg.getint("ARDB_DB", "port"),
db=cfg.getint("ARDB_DB", "db"),
decode_responses=True)
r_serv = config_loader.get_redis_conn("ARDB_DB")
r_serv_onion = config_loader.get_redis_conn("ARDB_Onion")
config_loader = None
if r_serv.scard('ail:update_v1.5') != 5:
r_serv.delete('ail:update_error')
@ -60,3 +54,23 @@ if __name__ == "__main__":
r_serv.delete('ail:current_background_script')
r_serv.delete('ail:current_background_script_stat')
r_serv.delete('ail:current_background_update')
if r_serv.get('ail:current_background_update') == 'v2.4':
r_serv.delete('ail:update_error')
r_serv.set('ail:update_in_progress', 'v2.4')
r_serv.set('ail:current_background_update', 'v2.4')
r_serv.set('ail:current_background_script', 'domain update')
update_file = os.path.join(os.environ['AIL_HOME'], 'update', 'v2.4', 'Update_domain.py')
process = subprocess.run(['python' ,update_file])
if int(r_serv_onion.scard('domain_update_v2.4')) != 0:
r_serv.set('ail:update_error', 'Update v2.4 Failed, please relaunch the bin/update-background.py script')
else:
r_serv.delete('ail:update_in_progress')
r_serv.delete('ail:current_background_script')
r_serv.delete('ail:current_background_script_stat')
r_serv.delete('ail:current_background_update')
r_serv.delete('update:nb_elem_to_convert')
r_serv.delete('update:nb_elem_converted')

View file

@ -23,7 +23,7 @@ sentiment_lexicon_file = sentiment/vader_lexicon.zip/vader_lexicon/vader_lexicon
##### Notifications ######
[Notifications]
ail_domain = http://localhost:7000
ail_domain = https://localhost:7000
sender = sender@example.com
sender_host = smtp.example.com
sender_port = 1337
@ -107,7 +107,10 @@ operation_mode = 3
ttl_duplicate = 86400
default_unnamed_feed_name = unnamed_feeder
[RegexForTermsFrequency]
[TermTrackerMod]
max_execution_time = 120
[RegexTracker]
max_execution_time = 60
##### Redis #####
@ -177,6 +180,11 @@ host = localhost
port = 6382
db = 3
[ARDB_Tracker]
host = localhost
port = 6382
db = 3
[ARDB_Hashs]
host = localhost
db = 1
@ -252,5 +260,14 @@ db = 0
[Crawler]
activate_crawler = False
crawler_depth_limit = 1
default_crawler_har = True
default_crawler_png = True
default_crawler_closespider_pagecount = 50
default_crawler_user_agent = Mozilla/5.0 (Windows NT 6.1; rv:60.0) Gecko/20100101 Firefox/60.0
splash_url = http://127.0.0.1
splash_port = 8050-8052
[IP]
# list of comma-separated CIDR that you wish to be alerted for. e.g:
#networks = 192.168.34.0/24,10.0.0.0/8,192.168.33.0/24
networks =
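
A short sketch of reading the new crawler and IP keys through the ConfigLoader wrapper used elsewhere in this commit (assumes AIL_BIN is set and configs/core.cfg is in place):
```
import os
import sys
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'lib/'))
import ConfigLoader

config_loader = ConfigLoader.ConfigLoader()
splash_url = config_loader.get_config_str('Crawler', 'splash_url')    # http://127.0.0.1
splash_port = config_loader.get_config_str('Crawler', 'splash_port')  # a port or a range such as 8050-8052
networks = config_loader.get_config_str('IP', 'networks')             # comma-separated CIDRs, empty by default
config_loader = None
```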

1199
doc/README.md Normal file

File diff suppressed because it is too large

View file

@ -1,53 +0,0 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
'''
submit your own pastes in AIL
empty values must be initialized
'''
import requests
if __name__ == '__main__':
#AIL url
url = 'http://localhost:7000'
ail_url = url + '/PasteSubmit/submit'
# MIPS TAXONOMIE, need to be initialized (tags_taxonomies = '')
tags_taxonomies = 'CERT-XLM:malicious-code=\"ransomware\",CERT-XLM:conformity=\"standard\"'
# MISP GALAXY, need to be initialized (tags_galaxies = '')
tags_galaxies = 'misp-galaxy:cert-seu-gocsector=\"Constituency\",misp-galaxy:cert-seu-gocsector=\"EU-Centric\"'
# user paste input, need to be initialized (paste_content = '')
paste_content = 'paste content test'
#file full or relative path
file_to_submit = 'test_file.zip'
#compress file password, need to be initialized (password = '')
password = ''
'''
submit user text
'''
r = requests.post(ail_url, data={ 'password': password,
'paste_content': paste_content,
'tags_taxonomies': tags_taxonomies,
'tags_galaxies': tags_galaxies})
print(r.status_code, r.reason)
'''
submit a file
'''
with open(file_submit,'rb') as f:
r = requests.post(ail_url, data={ 'password': password,
'paste_content': paste_content,
'tags_taxonomies': tags_taxonomies,
'tags_galaxies': tags_galaxies}, files={'file': (file_to_submit, f.read() )})
print(r.status_code, r.reason)

Binary file not shown.


View file

@ -81,8 +81,8 @@ pushd ardb/
make
popd
if [ ! -f bin/packages/config.cfg ]; then
cp bin/packages/config.cfg.sample bin/packages/config.cfg
if [ ! -f configs/core.cfg ]; then
cp configs/core.cfg.sample configs/core.cfg
fi
if [ -z "$VIRTUAL_ENV" ]; then

View file

@ -50,6 +50,7 @@ flask-login
bcrypt
#DomainClassifier
git+https://github.com/D4-project/BGP-Ranking.git/#egg=pybgpranking&subdirectory=client
DomainClassifier
#Indexer requirements
whoosh

View file

@ -6,52 +6,144 @@ GREEN="\\033[1;32m"
[ -z "$AIL_HOME" ] && echo "Needs the env var AIL_HOME. Run the script from the virtual environment." && exit 1;
# Make sure the reseting is intentional
num=$(( ( RANDOM % 100 ) + 1 ))
function reset_dir {
# Access dirs and delete
cd $AIL_HOME
echo -e $RED"To reset the platform, enter the following number: "$DEFAULT $num
read userInput
# Kill all screens
screen -ls | grep Detached | cut -d. -f1 | awk '{print $1}' | xargs kill
if [ $userInput -eq $num ]
then
echo "Reseting AIL..."
else
echo "Wrong number"
exit 1;
fi
# Access dirs and delete
cd $AIL_HOME
if [ -d indexdir/ ]; then
pushd indexdir/
rm -r *
echo 'cleaned indexdir'
popd
fi
# Kill all screens
screen -ls | grep Detached | cut -d. -f1 | awk '{print $1}' | xargs kill
if [ $userInput -eq $num ]
then
if [ -d DATA_ARDB/ ]; then
pushd DATA_ARDB/
rm -r *
echo 'cleaned DATA_ARDB'
popd
fi
fi
set -e
if [ -d logs/ ]; then
pushd logs/
rm *
echo 'cleaned logs'
popd
fi
# Access dirs and delete
cd $AIL_HOME
if [ -d PASTES/ ]; then
pushd PASTES/
rm -r *
echo 'cleaned PASTES'
popd
fi
pushd dumps/
rm *
echo 'cleaned dumps'
popd
if [ -d HASHS/ ]; then
pushd HASHS/
rm -r *
echo 'cleaned HASHS'
popd
fi
pushd indexdir/
rm -r *
echo 'cleaned indexdir'
popd
if [ -d CRAWLED_SCREESHOT/ ]; then
pushd CRAWLED_SCREESHOT/
rm -r *
echo 'cleaned CRAWLED_SCREESHOT'
popd
fi
pushd LEVEL_DB_DATA/
rm -r *
echo 'cleaned LEVEL_DB_DATA'
popd
if [ -d temp/ ]; then
pushd temp/
rm -r *
echo 'cleaned temp'
popd
fi
pushd logs/
rm *
echo 'cleaned logs'
popd
if [ -d var/www/submitted/ ]; then
pushd var/www/submitted
rm -r *
echo 'cleaned submitted'
popd
fi
pushd PASTES/
rm -r *
echo 'cleaned PASTES'
popd
echo -e $GREEN"* AIL has been reset *"$DEFAULT
}
echo -e $GREEN"* AIL has been reset *"$DEFAULT
function flush_DB_keep_user {
bash ${AIL_BIN}LAUNCH.sh -lav &
wait
echo ""
pushd redis/src
./redis-cli -p 6382 -n 1 FLUSHDB;
./redis-cli -p 6382 -n 2 FLUSHDB;
./redis-cli -p 6382 -n 3 FLUSHDB;
./redis-cli -p 6382 -n 4 FLUSHDB;
./redis-cli -p 6382 -n 5 FLUSHDB;
./redis-cli -p 6382 -n 6 FLUSHDB;
./redis-cli -p 6382 -n 7 FLUSHDB;
./redis-cli -p 6382 -n 8 FLUSHDB;
./redis-cli -p 6382 -n 9 FLUSHDB;
echo "ARDB FLUSHED"
popd
bash ${AIL_BIN}LAUNCH.sh -k
}
function soft_reset {
reset_dir;
flush_DB_keep_user;
}
#If no params,
[[ $@ ]] || {
# Make sure the resetting is intentional
num=$(( ( RANDOM % 100 ) + 1 ))
echo -e $RED"To reset the platform, enter the following number: "$DEFAULT $num
read userInput
if [ $userInput -eq $num ]
then
echo "Reseting AIL..."
else
echo "Wrong number"
exit 1;
fi
num=$(( ( RANDOM % 100 ) + 1 ))
echo -e $RED"If yes you want to delete the DB , enter the following number: "$DEFAULT $num
read userInput
reset_dir;
if [ $userInput -eq $num ]
then
if [ -d DATA_ARDB/ ]; then
pushd DATA_ARDB/
rm -r *
echo 'cleaned DATA_ARDB'
popd
fi
fi
echo -e $GREEN"* AIL has been reset *"$DEFAULT
exit
}
while [ "$1" != "" ]; do
case $1 in
--softReset ) soft_reset;
;;
* ) exit 1
esac
shift
done
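
For reference, here is a minimal Python equivalent of the ARDB flush loop used in `flush_DB_keep_user` above. It assumes ARDB answers the Redis protocol on localhost:6382, as the `redis-cli` calls in the script do, and it skips db 0, which the script also leaves untouched (presumably because the user accounts live there).

```python
import redis

# flush ARDB databases 1..9; db 0 (kept by the script) is left alone
for db_number in range(1, 10):
    r = redis.StrictRedis(host='localhost', port=6382, db=db_number, decode_responses=True)
    r.flushdb()
    print('ARDB db {} flushed'.format(db_number))
```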

171
tests/testApi.py Normal file
View file

@ -0,0 +1,171 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import os
import sys
import time
import unittest
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'packages'))
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'bin'))
sys.path.append(os.environ['AIL_FLASK'])
sys.path.append(os.path.join(os.environ['AIL_FLASK'], 'modules'))
import Import_helper
import Tag
from Flask_server import app
def parse_response(obj, ail_response):
res_json = ail_response.get_json()
if 'status' in res_json:
if res_json['status'] == 'error':
return obj.fail('{}: {}: {}'.format(ail_response.status_code, res_json['status'], res_json['reason']))
return res_json
def get_api_key():
api_file = os.path.join(os.environ['AIL_HOME'], 'DEFAULT_PASSWORD')
if os.path.isfile(api_file):
with open(os.path.join(os.environ['AIL_HOME'], 'DEFAULT_PASSWORD'), 'r') as f:
content = f.read()
content = content.splitlines()
apikey = content[-1]
apikey = apikey.replace('API_Key=', '', 1)
# manual tests
else:
apikey = sys.argv[1]
return apikey
APIKEY = get_api_key()
class TestApiV1(unittest.TestCase):
import_uuid = None
item_id = None
def setUp(self):
self.app = app
self.app.config['TESTING'] = True
self.client = self.app.test_client()
self.apikey = APIKEY
self.item_content = "text to import"
self.item_tags = ["infoleak:analyst-detection=\"private-key\""]
self.expected_tags = ["infoleak:analyst-detection=\"private-key\"", 'infoleak:submission="manual"']
# POST /api/v1/import/item
def test_0001_api_import_item(self):
input_json = {"type": "text","tags": self.item_tags,"text": self.item_content}
req = self.client.post('/api/v1/import/item', json=input_json ,headers={ 'Authorization': self.apikey })
req_json = parse_response(self, req)
import_uuid = req_json['uuid']
self.__class__.import_uuid = import_uuid
self.assertTrue(Import_helper.is_valid_uuid_v4(import_uuid))
# POST /api/v1/get/import/item
def test_0002_api_get_import_item(self):
input_json = {"uuid": self.__class__.import_uuid}
item_not_imported = True
import_timeout = 30
start = time.time()
while item_not_imported:
req = self.client.post('/api/v1/get/import/item', json=input_json ,headers={ 'Authorization': self.apikey })
req_json = parse_response(self, req)
if req_json['status'] == 'imported':
try:
item_id = req_json['items'][0]
item_not_imported = False
except Exception as e:
if time.time() - start > import_timeout:
item_not_imported = False
self.fail("Import error: {}".format(req_json))
else:
if time.time() - start > import_timeout:
item_not_imported = False
self.fail("Import Timeout, import status: {}".format(req_json['status']))
self.__class__.item_id = item_id
# Process item
time.sleep(5)
# POST /api/v1/get/item/content
def test_0003_api_get_item_content(self):
input_json = {"id": self.__class__.item_id}
req = self.client.post('/api/v1/get/item/content', json=input_json ,headers={ 'Authorization': self.apikey })
req_json = parse_response(self, req)
item_content = req_json['content']
self.assertEqual(item_content, self.item_content)
# POST /api/v1/get/item/tag
def test_0004_api_get_item_tag(self):
input_json = {"id": self.__class__.item_id}
req = self.client.post('/api/v1/get/item/tag', json=input_json ,headers={ 'Authorization': self.apikey })
req_json = parse_response(self, req)
item_tags = req_json['tags']
self.assertCountEqual(item_tags, self.expected_tags)
# POST /api/v1/get/item/default
def test_0005_api_get_item_default(self):
input_json = {"id": self.__class__.item_id}
req = self.client.post('/api/v1/get/item/default', json=input_json ,headers={ 'Authorization': self.apikey })
req_json = parse_response(self, req)
item_tags = req_json['tags']
self.assertCountEqual(item_tags, self.expected_tags)
item_content = req_json['content']
self.assertEqual(item_content, self.item_content)
# POST /api/v1/get/item
# # TODO: add more test
def test_0006_api_get_item(self):
input_json = {"id": self.__class__.item_id, "content": True}
req = self.client.post('/api/v1/get/item', json=input_json ,headers={ 'Authorization': self.apikey })
req_json = parse_response(self, req)
item_tags = req_json['tags']
self.assertCountEqual(item_tags, self.expected_tags)
item_content = req_json['content']
self.assertEqual(item_content, self.item_content)
# POST api/v1/add/item/tag
def test_0007_api_add_item_tag(self):
tags_to_add = ["infoleak:analyst-detection=\"api-key\""]
current_item_tag = Tag.get_item_tags(self.__class__.item_id)
current_item_tag.append(tags_to_add[0])
#galaxy_to_add = ["misp-galaxy:stealer=\"Vidar\""]
input_json = {"id": self.__class__.item_id, "tags": tags_to_add}
req = self.client.post('/api/v1/add/item/tag', json=input_json ,headers={ 'Authorization': self.apikey })
req_json = parse_response(self, req)
item_tags = req_json['tags']
self.assertEqual(item_tags, tags_to_add)
new_item_tag = Tag.get_item_tags(self.__class__.item_id)
self.assertCountEqual(new_item_tag, current_item_tag)
# DELETE api/v1/delete/item/tag
def test_0008_api_delete_item_tag(self):
tags_to_delete = ["infoleak:analyst-detection=\"api-key\""]
input_json = {"id": self.__class__.item_id, "tags": tags_to_delete}
req = self.client.delete('/api/v1/delete/item/tag', json=input_json ,headers={ 'Authorization': self.apikey })
req_json = parse_response(self, req)
item_tags = req_json['tags']
self.assertCountEqual(item_tags, tags_to_delete)
current_item_tag = Tag.get_item_tags(self.__class__.item_id)
if tags_to_delete[0] in current_item_tag:
self.fail('Tag not deleted')
# POST api/v1/get/tag/metadata
def test_0009_api_get_tag_metadata(self):
input_json = {"tag": self.item_tags[0]}
req = self.client.post('/api/v1/get/tag/metadata', json=input_json ,headers={ 'Authorization': self.apikey })
req_json = parse_response(self, req)
self.assertEqual(req_json['tag'], self.item_tags[0])
# GET api/v1/get/tag/all
def test_0010_api_get_tag_all(self):
input_json = {"tag": self.item_tags[0]}
req = self.client.get('/api/v1/get/tag/all', json=input_json ,headers={ 'Authorization': self.apikey })
req_json = parse_response(self, req)
self.assertTrue(req_json['tags'])
if __name__ == "__main__":
unittest.main(argv=['first-arg-is-ignored'], exit=False)

View file

@ -9,6 +9,9 @@ import argparse
import datetime
import configparser
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'lib/'))
import ConfigLoader
if __name__ == '__main__':
parser = argparse.ArgumentParser(description='AIL default update')
parser.add_argument('-t', help='version tag' , type=str, dest='tag', required=True)
@ -23,19 +26,9 @@ if __name__ == '__main__':
start_deb = time.time()
configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
if not os.path.exists(configfile):
raise Exception('Unable to find the configuration file. \
Did you set environment variables? \
Or activate the virtualenv.')
cfg = configparser.ConfigParser()
cfg.read(configfile)
r_serv = redis.StrictRedis(
host=cfg.get("ARDB_DB", "host"),
port=cfg.getint("ARDB_DB", "port"),
db=cfg.getint("ARDB_DB", "db"),
decode_responses=True)
config_loader = ConfigLoader.ConfigLoader()
r_serv = config_loader.get_redis_conn("ARDB_DB")
config_loader = None
#Set current ail version
r_serv.set('ail:version', update_tag)
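
The same migration recurs throughout the update scripts that follow: the configparser boilerplate and the explicit `redis.StrictRedis(...)` calls are replaced by the `ConfigLoader` helper. Below is a minimal usage sketch, assuming `AIL_BIN` is exported and that `ConfigLoader` reads `configs/core.cfg`; `get_redis_conn`, `get_config_str` and `get_config_int` are the accessors visible in these diffs.

```python
import os
import sys

# make bin/lib/ConfigLoader.py importable, as the update scripts do
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'lib/'))
import ConfigLoader

config_loader = ConfigLoader.ConfigLoader()
r_serv = config_loader.get_redis_conn("ARDB_DB")                       # ready-to-use redis client
pastes_dir = config_loader.get_config_str("Directories", "pastes")     # string option
metadata_port = config_loader.get_config_int("ARDB_Metadata", "port")  # integer option
config_loader = None  # the scripts drop the loader once all connections are built
```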

View file

@ -20,8 +20,8 @@ export PATH=$AIL_FLASK:$PATH
GREEN="\\033[1;32m"
DEFAULT="\\033[0;39m"
echo -e $GREEN"Shutting down AIL ..."$DEFAULT
bash ${AIL_BIN}/LAUNCH.sh -k
echo -e $GREEN"Shutting down AIL Script ..."$DEFAULT
bash ${AIL_BIN}/LAUNCH.sh -ks
wait
echo ""
@ -37,8 +37,8 @@ echo ""
echo ""
echo ""
echo -e $GREEN"Shutting down ARDB ..."$DEFAULT
bash ${AIL_BIN}/LAUNCH.sh -k
echo -e $GREEN"Killing Script ..."$DEFAULT
bash ${AIL_BIN}/LAUNCH.sh -ks
wait
echo ""

View file

@ -5,7 +5,9 @@ import os
import sys
import time
import redis
import configparser
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'lib/'))
import ConfigLoader
def update_tracked_terms(main_key, tracked_container_key):
for tracked_item in r_serv_term.smembers(main_key):
@ -50,45 +52,16 @@ if __name__ == '__main__':
start_deb = time.time()
configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
if not os.path.exists(configfile):
raise Exception('Unable to find the configuration file. \
Did you set environment variables? \
Or activate the virtualenv.')
cfg = configparser.ConfigParser()
cfg.read(configfile)
config_loader = ConfigLoader.ConfigLoader()
PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], cfg.get("Directories", "pastes")) + '/'
PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], config_loader.get_config_str("Directories", "pastes")) + '/'
r_serv = redis.StrictRedis(
host=cfg.get("ARDB_DB", "host"),
port=cfg.getint("ARDB_DB", "port"),
db=cfg.getint("ARDB_DB", "db"),
decode_responses=True)
r_serv_metadata = redis.StrictRedis(
host=cfg.get("ARDB_Metadata", "host"),
port=cfg.getint("ARDB_Metadata", "port"),
db=cfg.getint("ARDB_Metadata", "db"),
decode_responses=True)
r_serv_tag = redis.StrictRedis(
host=cfg.get("ARDB_Tags", "host"),
port=cfg.getint("ARDB_Tags", "port"),
db=cfg.getint("ARDB_Tags", "db"),
decode_responses=True)
r_serv_term = redis.StrictRedis(
host=cfg.get("ARDB_TermFreq", "host"),
port=cfg.getint("ARDB_TermFreq", "port"),
db=cfg.getint("ARDB_TermFreq", "db"),
decode_responses=True)
r_serv_onion = redis.StrictRedis(
host=cfg.get("ARDB_Onion", "host"),
port=cfg.getint("ARDB_Onion", "port"),
db=cfg.getint("ARDB_Onion", "db"),
decode_responses=True)
r_serv = config_loader.get_redis_conn("ARDB_DB")
r_serv_metadata = config_loader.get_redis_conn("ARDB_Metadata")
r_serv_tag = config_loader.get_redis_conn("ARDB_Tags")
r_serv_term = config_loader.get_redis_conn("ARDB_TermFreq")
r_serv_onion = config_loader.get_redis_conn("ARDB_Onion")
config_loader = None
r_serv.set('ail:current_background_script', 'metadata')

View file

@ -6,7 +6,9 @@ import sys
import time
import redis
import datetime
import configparser
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'lib/'))
import ConfigLoader
def substract_date(date_from, date_to):
date_from = datetime.date(int(date_from[0:4]), int(date_from[4:6]), int(date_from[6:8]))
@ -39,39 +41,15 @@ if __name__ == '__main__':
start_deb = time.time()
configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
if not os.path.exists(configfile):
raise Exception('Unable to find the configuration file. \
Did you set environment variables? \
Or activate the virtualenv.')
cfg = configparser.ConfigParser()
cfg.read(configfile)
config_loader = ConfigLoader.ConfigLoader()
PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], cfg.get("Directories", "pastes")) + '/'
PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], config_loader.get_config_str("Directories", "pastes")) + '/'
r_serv = redis.StrictRedis(
host=cfg.get("ARDB_DB", "host"),
port=cfg.getint("ARDB_DB", "port"),
db=cfg.getint("ARDB_DB", "db"),
decode_responses=True)
r_serv_metadata = redis.StrictRedis(
host=cfg.get("ARDB_Metadata", "host"),
port=cfg.getint("ARDB_Metadata", "port"),
db=cfg.getint("ARDB_Metadata", "db"),
decode_responses=True)
r_serv_tag = redis.StrictRedis(
host=cfg.get("ARDB_Tags", "host"),
port=cfg.getint("ARDB_Tags", "port"),
db=cfg.getint("ARDB_Tags", "db"),
decode_responses=True)
r_serv_onion = redis.StrictRedis(
host=cfg.get("ARDB_Onion", "host"),
port=cfg.getint("ARDB_Onion", "port"),
db=cfg.getint("ARDB_Onion", "db"),
decode_responses=True)
r_serv = config_loader.get_redis_conn("ARDB_DB")
r_serv_metadata = config_loader.get_redis_conn("ARDB_Metadata")
r_serv_tag = config_loader.get_redis_conn("ARDB_Tags")
r_serv_onion = config_loader.get_redis_conn("ARDB_Onion")
config_loader = None
r_serv.set('ail:current_background_script', 'onions')
r_serv.set('ail:current_background_script_stat', 0)

View file

@ -6,10 +6,12 @@ import sys
import time
import redis
import datetime
import configparser
from hashlib import sha256
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'lib/'))
import ConfigLoader
def rreplace(s, old, new, occurrence):
li = s.rsplit(old, occurrence)
return new.join(li)
@ -28,41 +30,18 @@ if __name__ == '__main__':
start_deb = time.time()
configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
if not os.path.exists(configfile):
raise Exception('Unable to find the configuration file. \
Did you set environment variables? \
Or activate the virtualenv.')
cfg = configparser.ConfigParser()
cfg.read(configfile)
SCREENSHOT_FOLDER = os.path.join(os.environ['AIL_HOME'], cfg.get("Directories", "crawled_screenshot"))
NEW_SCREENSHOT_FOLDER = os.path.join(os.environ['AIL_HOME'], cfg.get("Directories", "crawled_screenshot"), 'screenshot')
config_loader = ConfigLoader.ConfigLoader()
PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], cfg.get("Directories", "pastes")) + '/'
SCREENSHOT_FOLDER = os.path.join(os.environ['AIL_HOME'], config_loader.get_config_str("Directories", "crawled_screenshot"))
NEW_SCREENSHOT_FOLDER = os.path.join(os.environ['AIL_HOME'], config_loader.get_config_str("Directories", "crawled_screenshot"), 'screenshot')
r_serv = redis.StrictRedis(
host=cfg.get("ARDB_DB", "host"),
port=cfg.getint("ARDB_DB", "port"),
db=cfg.getint("ARDB_DB", "db"),
decode_responses=True)
PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], config_loader.get_config_str("Directories", "pastes")) + '/'
r_serv_metadata = redis.StrictRedis(
host=cfg.get("ARDB_Metadata", "host"),
port=cfg.getint("ARDB_Metadata", "port"),
db=cfg.getint("ARDB_Metadata", "db"),
decode_responses=True)
r_serv_tag = redis.StrictRedis(
host=cfg.get("ARDB_Tags", "host"),
port=cfg.getint("ARDB_Tags", "port"),
db=cfg.getint("ARDB_Tags", "db"),
decode_responses=True)
r_serv_onion = redis.StrictRedis(
host=cfg.get("ARDB_Onion", "host"),
port=cfg.getint("ARDB_Onion", "port"),
db=cfg.getint("ARDB_Onion", "db"),
decode_responses=True)
r_serv = config_loader.get_redis_conn("ARDB_DB")
r_serv_metadata = config_loader.get_redis_conn("ARDB_Metadata")
r_serv_tag = config_loader.get_redis_conn("ARDB_Tags")
r_serv_onion = config_loader.get_redis_conn("ARDB_Onion")
config_loader = None
r_serv.set('ail:current_background_script', 'crawled_screenshot')
r_serv.set('ail:current_background_script_stat', 0)

View file

@ -5,58 +5,36 @@ import os
import sys
import time
import redis
import configparser
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'lib/'))
import ConfigLoader
if __name__ == '__main__':
start_deb = time.time()
configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
if not os.path.exists(configfile):
raise Exception('Unable to find the configuration file. \
Did you set environment variables? \
Or activate the virtualenv.')
cfg = configparser.ConfigParser()
cfg.read(configfile)
config_loader = ConfigLoader.ConfigLoader()
PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], cfg.get("Directories", "pastes")) + '/'
r_serv = redis.StrictRedis(
host=cfg.get("ARDB_DB", "host"),
port=cfg.getint("ARDB_DB", "port"),
db=cfg.getint("ARDB_DB", "db"),
decode_responses=True)
r_serv_metadata = redis.StrictRedis(
host=cfg.get("ARDB_Metadata", "host"),
port=cfg.getint("ARDB_Metadata", "port"),
db=cfg.getint("ARDB_Metadata", "db"),
decode_responses=True)
r_serv_tag = redis.StrictRedis(
host=cfg.get("ARDB_Tags", "host"),
port=cfg.getint("ARDB_Tags", "port"),
db=cfg.getint("ARDB_Tags", "db"),
decode_responses=True)
r_serv_onion = redis.StrictRedis(
host=cfg.get("ARDB_Onion", "host"),
port=cfg.getint("ARDB_Onion", "port"),
db=cfg.getint("ARDB_Onion", "db"),
decode_responses=True)
PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], config_loader.get_config_str("Directories", "pastes")) + '/'
r_serv = config_loader.get_redis_conn("ARDB_DB")
r_serv_metadata = config_loader.get_redis_conn("ARDB_Metadata")
r_serv_tag = config_loader.get_redis_conn("ARDB_Tags")
r_serv_onion = config_loader.get_redis_conn("ARDB_Onion")
r_important_paste_2018 = redis.StrictRedis(
host=cfg.get("ARDB_Metadata", "host"),
port=cfg.getint("ARDB_Metadata", "port"),
host=config_loader.get_config_str("ARDB_Metadata", "host"),
port=config_loader.get_config_int("ARDB_Metadata", "port"),
db=2018,
decode_responses=True)
r_important_paste_2019 = redis.StrictRedis(
host=cfg.get("ARDB_Metadata", "host"),
port=cfg.getint("ARDB_Metadata", "port"),
db=2018,
host=config_loader.get_config_str("ARDB_Metadata", "host"),
port=config_loader.get_config_int("ARDB_Metadata", "port"),
db=2019,
decode_responses=True)
config_loader = None
r_serv.set('ail:current_background_script', 'tags')
r_serv.set('ail:current_background_script_stat', 0)

View file

@ -5,7 +5,9 @@ import os
import sys
import time
import redis
import configparser
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'lib/'))
import ConfigLoader
def tags_key_fusion(old_item_path_key, new_item_path_key):
print('fusion:')
@ -19,33 +21,14 @@ if __name__ == '__main__':
start_deb = time.time()
configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
if not os.path.exists(configfile):
raise Exception('Unable to find the configuration file. \
Did you set environment variables? \
Or activate the virtualenv.')
cfg = configparser.ConfigParser()
cfg.read(configfile)
config_loader = ConfigLoader.ConfigLoader()
PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], cfg.get("Directories", "pastes")) + '/'
PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], config_loader.get_config_str("Directories", "pastes")) + '/'
r_serv = redis.StrictRedis(
host=cfg.get("ARDB_DB", "host"),
port=cfg.getint("ARDB_DB", "port"),
db=cfg.getint("ARDB_DB", "db"),
decode_responses=True)
r_serv_metadata = redis.StrictRedis(
host=cfg.get("ARDB_Metadata", "host"),
port=cfg.getint("ARDB_Metadata", "port"),
db=cfg.getint("ARDB_Metadata", "db"),
decode_responses=True)
r_serv_tag = redis.StrictRedis(
host=cfg.get("ARDB_Tags", "host"),
port=cfg.getint("ARDB_Tags", "port"),
db=cfg.getint("ARDB_Tags", "db"),
decode_responses=True)
r_serv = config_loader.get_redis_conn("ARDB_DB")
r_serv_metadata = config_loader.get_redis_conn("ARDB_Metadata")
r_serv_tag = config_loader.get_redis_conn("ARDB_Tags")
config_loader = None
if r_serv.sismember('ail:update_v1.5', 'tags'):

View file

@ -6,33 +6,21 @@ import sys
import time
import redis
import datetime
import configparser
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'lib/'))
import ConfigLoader
if __name__ == '__main__':
start_deb = time.time()
configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
if not os.path.exists(configfile):
raise Exception('Unable to find the configuration file. \
Did you set environment variables? \
Or activate the virtualenv.')
cfg = configparser.ConfigParser()
cfg.read(configfile)
config_loader = ConfigLoader.ConfigLoader()
PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], cfg.get("Directories", "pastes")) + '/'
PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], config_loader.get_config_str("Directories", "pastes")) + '/'
r_serv = redis.StrictRedis(
host=cfg.get("ARDB_DB", "host"),
port=cfg.getint("ARDB_DB", "port"),
db=cfg.getint("ARDB_DB", "db"),
decode_responses=True)
r_serv_onion = redis.StrictRedis(
host=cfg.get("ARDB_Onion", "host"),
port=cfg.getint("ARDB_Onion", "port"),
db=cfg.getint("ARDB_Onion", "db"),
decode_responses=True)
r_serv = config_loader.get_redis_conn("ARDB_DB")
r_serv_onion = config_loader.get_redis_conn("ARDB_Onion")
config_loader = None
print()
print('Updating ARDB_Onion ...')

View file

@ -6,25 +6,18 @@ import sys
import time
import redis
import datetime
import configparser
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'lib/'))
import ConfigLoader
if __name__ == '__main__':
start_deb = time.time()
configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
if not os.path.exists(configfile):
raise Exception('Unable to find the configuration file. \
Did you set environment variables? \
Or activate the virtualenv.')
cfg = configparser.ConfigParser()
cfg.read(configfile)
config_loader = ConfigLoader.ConfigLoader()
r_serv = redis.StrictRedis(
host=cfg.get("ARDB_DB", "host"),
port=cfg.getint("ARDB_DB", "port"),
db=cfg.getint("ARDB_DB", "db"),
decode_responses=True)
r_serv = config_loader.get_redis_conn("ARDB_DB")
config_loader = None
#Set current ail version
r_serv.set('ail:version', 'v1.7')

View file

@ -6,25 +6,18 @@ import sys
import time
import redis
import datetime
import configparser
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'lib/'))
import ConfigLoader
if __name__ == '__main__':
start_deb = time.time()
configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
if not os.path.exists(configfile):
raise Exception('Unable to find the configuration file. \
Did you set environment variables? \
Or activate the virtualenv.')
cfg = configparser.ConfigParser()
cfg.read(configfile)
config_loader = ConfigLoader.ConfigLoader()
r_serv = redis.StrictRedis(
host=cfg.get("ARDB_DB", "host"),
port=cfg.getint("ARDB_DB", "port"),
db=cfg.getint("ARDB_DB", "db"),
decode_responses=True)
r_serv = config_loader.get_redis_conn("ARDB_DB")
config_loader = None
#Set current ail version
r_serv.set('ail:version', 'v2.0')

110
update/v2.2/Update.py Executable file
View file

@ -0,0 +1,110 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
import os
import re
import sys
import time
import redis
import datetime
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'packages'))
import Item
import Term
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'lib/'))
import ConfigLoader
def rreplace(s, old, new, occurrence):
li = s.rsplit(old, occurrence)
return new.join(li)
if __name__ == '__main__':
start_deb = time.time()
config_loader = ConfigLoader.ConfigLoader()
r_serv = config_loader.get_redis_conn("ARDB_DB")
r_serv_term_stats = config_loader.get_redis_conn("ARDB_Trending")
r_serv_termfreq = config_loader.get_redis_conn("ARDB_TermFreq")
config_loader = None
r_serv_term_stats.flushdb()
#convert all regex:
all_regex = r_serv_termfreq.smembers('TrackedRegexSet')
for regex in all_regex:
tags = list( r_serv_termfreq.smembers('TrackedNotificationTags_{}'.format(regex)) )
mails = list( r_serv_termfreq.smembers('TrackedNotificationEmails_{}'.format(regex)) )
new_term = regex[1:-1]
res = Term.parse_json_term_to_add({"term": new_term, "type": 'regex', "tags": tags, "mails": mails, "level": 1}, 'admin@admin.test')
if res[1] == 200:
term_uuid = res[0]['uuid']
list_items = r_serv_termfreq.smembers('regex_{}'.format(regex))
for paste_item in list_items:
item_id = Item.get_item_id(paste_item)
item_date = Item.get_item_date(item_id)
Term.add_tracked_item(term_uuid, item_id, item_date)
# Invalid Tracker => remove it
else:
print('Invalid Regex Removed: {}'.format(regex))
print(res[0])
# allow reprocess
r_serv_termfreq.srem('TrackedRegexSet', regex)
all_tokens = r_serv_termfreq.smembers('TrackedSetTermSet')
for token in all_tokens:
tags = list( r_serv_termfreq.smembers('TrackedNotificationTags_{}'.format(token)) )
mails = list( r_serv_termfreq.smembers('TrackedNotificationEmails_{}'.format(token)) )
res = Term.parse_json_term_to_add({"term": token, "type": 'word', "tags": tags, "mails": mails, "level": 1}, 'admin@admin.test')
if res[1] == 200:
term_uuid = res[0]['uuid']
list_items = r_serv_termfreq.smembers('tracked_{}'.format(token))
for paste_item in list_items:
item_id = Item.get_item_id(paste_item)
item_date = Item.get_item_date(item_id)
Term.add_tracked_item(term_uuid, item_id, item_date)
# Invalid Tracker => remove it
else:
print('Invalid Token Removed: {}'.format(token))
print(res[0])
# allow reprocess
r_serv_termfreq.srem('TrackedSetTermSet', token)
all_set = r_serv_termfreq.smembers('TrackedSetSet')
for curr_set in all_set:
tags = list( r_serv_termfreq.smembers('TrackedNotificationTags_{}'.format(curr_set)) )
mails = list( r_serv_termfreq.smembers('TrackedNotificationEmails_{}'.format(curr_set)) )
to_remove = ',{}'.format(curr_set.split(',')[-1])
new_set = rreplace(curr_set, to_remove, '', 1)
new_set = new_set[2:]
new_set = new_set.replace(',', '')
res = Term.parse_json_term_to_add({"term": new_set, "type": 'set', "nb_words": 1, "tags": tags, "mails": mails, "level": 1}, 'admin@admin.test')
if res[1] == 200:
term_uuid = res[0]['uuid']
list_items = r_serv_termfreq.smembers('tracked_{}'.format(curr_set))
for paste_item in list_items:
item_id = Item.get_item_id(paste_item)
item_date = Item.get_item_date(item_id)
Term.add_tracked_item(term_uuid, item_id, item_date)
# Invalid Tracker => remove it
else:
print('Invalid Set Removed: {}'.format(curr_set))
print(res[0])
# allow reprocess
r_serv_termfreq.srem('TrackedSetSet', curr_set)
r_serv_termfreq.flushdb()
#Set current ail version
r_serv.set('ail:version', 'v2.2')
#Set current ail version
r_serv.hset('ail:update_date', 'v2.2', datetime.datetime.now().strftime("%Y%m%d"))

39
update/v2.2/Update.sh Executable file
View file

@ -0,0 +1,39 @@
#!/bin/bash
[ -z "$AIL_HOME" ] && echo "Needs the env var AIL_HOME. Run the script from the virtual environment." && exit 1;
[ -z "$AIL_REDIS" ] && echo "Needs the env var AIL_REDIS. Run the script from the virtual environment." && exit 1;
[ -z "$AIL_ARDB" ] && echo "Needs the env var AIL_ARDB. Run the script from the virtual environment." && exit 1;
[ -z "$AIL_BIN" ] && echo "Needs the env var AIL_ARDB. Run the script from the virtual environment." && exit 1;
[ -z "$AIL_FLASK" ] && echo "Needs the env var AIL_FLASK. Run the script from the virtual environment." && exit 1;
export PATH=$AIL_HOME:$PATH
export PATH=$AIL_REDIS:$PATH
export PATH=$AIL_ARDB:$PATH
export PATH=$AIL_BIN:$PATH
export PATH=$AIL_FLASK:$PATH
GREEN="\\033[1;32m"
DEFAULT="\\033[0;39m"
echo -e $GREEN"Shutting down AIL ..."$DEFAULT
bash ${AIL_BIN}/LAUNCH.sh -ks
wait
bash ${AIL_BIN}/LAUNCH.sh -lav &
wait
echo ""
echo ""
echo -e $GREEN"Updating AIL VERSION ..."$DEFAULT
echo ""
python ${AIL_HOME}/update/v2.2/Update.py
wait
echo ""
echo ""
echo ""
echo -e $GREEN"Shutting down ARDB ..."$DEFAULT
bash ${AIL_BIN}/LAUNCH.sh -ks
wait
exit 0

37
update/v2.4/Update.py Executable file
View file

@ -0,0 +1,37 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
import os
import re
import sys
import time
import redis
import datetime
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'lib/'))
import ConfigLoader
new_version = 'v2.4'
if __name__ == '__main__':
start_deb = time.time()
config_loader = ConfigLoader.ConfigLoader()
r_serv = config_loader.get_redis_conn("ARDB_DB")
r_serv_onion = config_loader.get_redis_conn("ARDB_Onion")
config_loader = None
#Set current update_in_progress
r_serv.set('ail:update_in_progress', new_version)
r_serv.set('ail:current_background_update', new_version)
r_serv_onion.sunionstore('domain_update_v2.4', 'full_onion_up', 'full_regular_up')
r_serv.set('update:nb_elem_to_convert', r_serv_onion.scard('domain_update_v2.4'))
r_serv.set('update:nb_elem_converted',0)
#Set current ail version
r_serv.set('ail:version', new_version)
#Set current ail version
r_serv.hset('ail:update_date', new_version, datetime.datetime.now().strftime("%Y%m%d"))

42
update/v2.4/Update.sh Executable file
View file

@ -0,0 +1,42 @@
#!/bin/bash
[ -z "$AIL_HOME" ] && echo "Needs the env var AIL_HOME. Run the script from the virtual environment." && exit 1;
[ -z "$AIL_REDIS" ] && echo "Needs the env var AIL_REDIS. Run the script from the virtual environment." && exit 1;
[ -z "$AIL_ARDB" ] && echo "Needs the env var AIL_ARDB. Run the script from the virtual environment." && exit 1;
[ -z "$AIL_BIN" ] && echo "Needs the env var AIL_ARDB. Run the script from the virtual environment." && exit 1;
[ -z "$AIL_FLASK" ] && echo "Needs the env var AIL_FLASK. Run the script from the virtual environment." && exit 1;
export PATH=$AIL_HOME:$PATH
export PATH=$AIL_REDIS:$PATH
export PATH=$AIL_ARDB:$PATH
export PATH=$AIL_BIN:$PATH
export PATH=$AIL_FLASK:$PATH
GREEN="\\033[1;32m"
DEFAULT="\\033[0;39m"
echo -e $GREEN"Shutting down AIL ..."$DEFAULT
bash ${AIL_BIN}/LAUNCH.sh -ks
wait
bash ${AIL_BIN}/LAUNCH.sh -lav &
wait
echo ""
cp ${AIL_BIN}/packages/config.cfg ${AIL_HOME}/configs/core.cfg
rm ${AIL_BIN}/packages/config.cfg
echo ""
echo -e $GREEN"Updating AIL VERSION ..."$DEFAULT
echo ""
python ${AIL_HOME}/update/v2.4/Update.py
wait
echo ""
echo ""
echo ""
echo -e $GREEN"Shutting down ARDB ..."$DEFAULT
bash ${AIL_BIN}/LAUNCH.sh -ks
wait
exit 0

77
update/v2.4/Update_domain.py Executable file
View file

@ -0,0 +1,77 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
import os
import re
import sys
import time
import redis
import datetime
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'packages/'))
import Item
import Tag
from Cryptocurrency import cryptocurrency
from Pgp import pgp
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'lib/'))
import ConfigLoader
import Decoded
import Domain
def update_update_stats():
nb_updated = int(r_serv_db.get('update:nb_elem_converted'))
progress = int((nb_updated * 100) / nb_elem_to_update)
print('{}/{} updated {}%'.format(nb_updated, nb_elem_to_update, progress))
r_serv_db.set('ail:current_background_script_stat', progress)
def update_domain_by_item(domain_obj, item_id):
domain_name = domain_obj.get_domain_name()
# update domain tags
for tag in Tag.get_item_tags(item_id):
if tag != 'infoleak:submission="crawler"' and tag != 'infoleak:submission="manual"':
Tag.add_domain_tag(tag, domain_name, Item.get_item_date(item_id))
# update domain correlation
item_correlation = Item.get_item_all_correlation(item_id)
for correlation_name in item_correlation:
for correlation_type in item_correlation[correlation_name]:
if correlation_name in ('pgp', 'cryptocurrency'):
for correl_value in item_correlation[correlation_name][correlation_type]:
if correlation_name=='pgp':
pgp.save_domain_correlation(domain_name, correlation_type, correl_value)
if correlation_name=='cryptocurrency':
cryptocurrency.save_domain_correlation(domain_name, correlation_type, correl_value)
if correlation_name=='decoded':
for decoded_item in item_correlation['decoded']:
Decoded.save_domain_decoded(domain_name, decoded_item)
if __name__ == '__main__':
start_deb = time.time()
config_loader = ConfigLoader.ConfigLoader()
r_serv_db = config_loader.get_redis_conn("ARDB_DB")
r_serv_onion = config_loader.get_redis_conn("ARDB_Onion")
config_loader = None
nb_elem_to_update = int( r_serv_db.get('update:nb_elem_to_convert') )
while True:
domain = r_serv_onion.spop('domain_update_v2.4')
if domain is not None:
print(domain)
domain = Domain.Domain(domain)
for domain_history in domain.get_domain_history():
domain_item = domain.get_domain_items_crawled(epoch=domain_history[1]) # item_tag
if "items" in domain_item:
for item_dict in domain_item['items']:
update_domain_by_item(domain, item_dict['id'])
r_serv_db.incr('update:nb_elem_converted')
update_update_stats()
else:
sys.exit(0)

View file

@ -2,86 +2,75 @@
# -*-coding:UTF-8 -*
import os
import re
import sys
import ssl
import json
import time
import redis
import random
import logging
import logging.handlers
import configparser
from flask import Flask, render_template, jsonify, request, Request, session, redirect, url_for
from flask import Flask, render_template, jsonify, request, Request, Response, session, redirect, url_for
from flask_login import LoginManager, current_user, login_user, logout_user, login_required
import bcrypt
import flask
import importlib
from os.path import join
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'packages/'))
sys.path.append('./modules/')
import Paste
from Date import Date
from User import User
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'lib/'))
import ConfigLoader
from pytaxonomies import Taxonomies
# Import config
import Flask_config
# Import Role_Manager
from Role_Manager import create_user_db, check_password_strength, check_user_role_integrity
from Role_Manager import login_admin, login_analyst
# Import Blueprint
from blueprints.root import root
from blueprints.crawler_splash import crawler_splash
from blueprints.correlation import correlation
Flask_dir = os.environ['AIL_FLASK']
# CONFIG #
cfg = Flask_config.cfg
baseUrl = cfg.get("Flask", "baseurl")
config_loader = ConfigLoader.ConfigLoader()
baseUrl = config_loader.get_config_str("Flask", "baseurl")
baseUrl = baseUrl.replace('/', '')
if baseUrl != '':
baseUrl = '/'+baseUrl
# ========= REDIS =========#
r_serv_db = redis.StrictRedis(
host=cfg.get("ARDB_DB", "host"),
port=cfg.getint("ARDB_DB", "port"),
db=cfg.getint("ARDB_DB", "db"),
decode_responses=True)
r_serv_tags = redis.StrictRedis(
host=cfg.get("ARDB_Tags", "host"),
port=cfg.getint("ARDB_Tags", "port"),
db=cfg.getint("ARDB_Tags", "db"),
decode_responses=True)
r_cache = redis.StrictRedis(
host=cfg.get("Redis_Cache", "host"),
port=cfg.getint("Redis_Cache", "port"),
db=cfg.getint("Redis_Cache", "db"),
decode_responses=True)
r_serv_db = config_loader.get_redis_conn("ARDB_DB")
r_serv_tags = config_loader.get_redis_conn("ARDB_Tags")
r_cache = config_loader.get_redis_conn("Redis_Cache")
# logs
log_dir = os.path.join(os.environ['AIL_HOME'], 'logs')
if not os.path.isdir(log_dir):
os.makedirs(log_dir)
log_filename = os.path.join(log_dir, 'flask_server.logs')
logger = logging.getLogger()
formatter = logging.Formatter('%(asctime)s - %(levelname)s - %(message)s')
handler_log = logging.handlers.TimedRotatingFileHandler(log_filename, when="midnight", interval=1)
handler_log.suffix = '%Y-%m-%d.log'
handler_log.setFormatter(formatter)
handler_log.setLevel(30)
logger.addHandler(handler_log)
logger.setLevel(30)
# log_filename = os.path.join(log_dir, 'flask_server.logs')
# logger = logging.getLogger()
# formatter = logging.Formatter('%(asctime)s - %(levelname)s - %(message)s')
# handler_log = logging.handlers.TimedRotatingFileHandler(log_filename, when="midnight", interval=1)
# handler_log.suffix = '%Y-%m-%d.log'
# handler_log.setFormatter(formatter)
# handler_log.setLevel(30)
# logger.addHandler(handler_log)
# logger.setLevel(30)
# ========= =========#
# ========= TLS =========#
ssl_context = ssl.SSLContext(ssl.PROTOCOL_TLSv1_2)
ssl_context.load_cert_chain(certfile='server.crt', keyfile='server.key')
ssl_context.load_cert_chain(certfile=os.path.join(Flask_dir, 'server.crt'), keyfile=os.path.join(Flask_dir, 'server.key'))
#print(ssl_context.get_ciphers())
# ========= =========#
@ -89,19 +78,27 @@ Flask_config.app = Flask(__name__, static_url_path=baseUrl+'/static/')
app = Flask_config.app
app.config['MAX_CONTENT_LENGTH'] = 900 * 1024 * 1024
# ========= BLUEPRINT =========#
app.register_blueprint(root, url_prefix=baseUrl)
app.register_blueprint(crawler_splash, url_prefix=baseUrl)
app.register_blueprint(correlation, url_prefix=baseUrl)
# ========= =========#
# ========= session ========
app.secret_key = str(random.getrandbits(256))
login_manager = LoginManager()
login_manager.login_view = 'login'
login_manager.login_view = 'root.login'
login_manager.init_app(app)
print()
# ========= LOGIN MANAGER ========
@login_manager.user_loader
def load_user(user_id):
return User.get(user_id)
# ========= HEADER GENERATION ========
# ========= HEADER GENERATION ======== DEPRECATED
# Get headers items that should be ignored (not displayed)
toIgnoreModule = set()
@ -112,13 +109,12 @@ try:
toIgnoreModule.add(line)
except IOError:
f = open('templates/ignored_modules.txt', 'w')
f.close()
pass
# Dynamically import routes and functions from modules
# Also, prepare header.html
to_add_to_header_dico = {}
for root, dirs, files in os.walk('modules/'):
for root, dirs, files in os.walk(os.path.join(Flask_dir, 'modules')):
sys.path.append(join(root))
# Ignore the module
@ -140,7 +136,7 @@ for root, dirs, files in os.walk('modules/'):
#create header.html
complete_header = ""
with open('templates/header_base.html', 'r') as f:
with open(os.path.join(Flask_dir, 'templates', 'header_base.html'), 'r') as f:
complete_header = f.read()
modified_header = complete_header
@ -159,7 +155,7 @@ for module_name, txt in to_add_to_header_dico.items():
modified_header = modified_header.replace('<!--insert here-->', '\n'.join(to_add_to_header))
#Write the header.html file
with open('templates/header.html', 'w') as f:
with open(os.path.join(Flask_dir, 'templates', 'header.html'), 'w') as f:
f.write(modified_header)
# ========= JINJA2 FUNCTIONS ========
@ -176,120 +172,45 @@ def add_header(response):
and also to cache the rendered page for 10 minutes.
"""
response.headers['X-UA-Compatible'] = 'IE=Edge,chrome=1'
response.headers['Cache-Control'] = 'public, max-age=0'
if 'Cache-Control' not in response.headers:
response.headers['Cache-Control'] = 'private, max-age=0'
return response
# @app.route('/test', methods=['GET'])
# def test():
# for rule in app.url_map.iter_rules():
# print(rule)
# return 'o'
# ========== ROUTES ============
@app.route('/login', methods=['POST', 'GET'])
def login():
current_ip = request.remote_addr
login_failed_ip = r_cache.get('failed_login_ip:{}'.format(current_ip))
# brute force by ip
if login_failed_ip:
login_failed_ip = int(login_failed_ip)
if login_failed_ip >= 5:
error = 'Max Connection Attempts reached, Please wait {}s'.format(r_cache.ttl('failed_login_ip:{}'.format(current_ip)))
return render_template("login.html", error=error)
if request.method == 'POST':
username = request.form.get('username')
password = request.form.get('password')
#next_page = request.form.get('next_page')
if username is not None:
user = User.get(username)
login_failed_user_id = r_cache.get('failed_login_user_id:{}'.format(username))
# brute force by user_id
if login_failed_user_id:
login_failed_user_id = int(login_failed_user_id)
if login_failed_user_id >= 5:
error = 'Max Connection Attempts reached, Please wait {}s'.format(r_cache.ttl('failed_login_user_id:{}'.format(username)))
return render_template("login.html", error=error)
if user and user.check_password(password):
if not check_user_role_integrity(user.get_id()):
error = 'Incorrect User ACL, Please contact your administrator'
return render_template("login.html", error=error)
login_user(user) ## TODO: use remember me ?
if user.request_password_change():
return redirect(url_for('change_password'))
else:
return redirect(url_for('dashboard.index'))
# login failed
else:
# set brute force protection
logger.warning("Login failed, ip={}, username={}".format(current_ip, username))
r_cache.incr('failed_login_ip:{}'.format(current_ip))
r_cache.expire('failed_login_ip:{}'.format(current_ip), 300)
r_cache.incr('failed_login_user_id:{}'.format(username))
r_cache.expire('failed_login_user_id:{}'.format(username), 300)
#
error = 'Password Incorrect'
return render_template("login.html", error=error)
return 'please provide a valid username'
else:
#next_page = request.args.get('next')
error = request.args.get('error')
return render_template("login.html" , error=error)
@app.route('/change_password', methods=['POST', 'GET'])
@login_required
def change_password():
password1 = request.form.get('password1')
password2 = request.form.get('password2')
error = request.args.get('error')
if error:
return render_template("change_password.html", error=error)
if current_user.is_authenticated and password1!=None:
if password1==password2:
if check_password_strength(password1):
user_id = current_user.get_id()
create_user_db(user_id , password1, update=True)
return redirect(url_for('dashboard.index'))
else:
error = 'Incorrect password'
return render_template("change_password.html", error=error)
else:
error = "Passwords don't match"
return render_template("change_password.html", error=error)
else:
error = 'Please choose a new password'
return render_template("change_password.html", error=error)
@app.route('/logout')
@login_required
def logout():
logout_user()
return redirect(url_for('login'))
# role error template
@app.route('/role', methods=['POST', 'GET'])
@login_required
def role():
return render_template("error/403.html"), 403
@app.route('/searchbox/')
@login_required
@login_analyst
def searchbox():
return render_template("searchbox.html")
#@app.route('/endpoints')
#def endpoints():
# for rule in app.url_map.iter_rules():
# str_endpoint = str(rule)
# if len(str_endpoint)>5:
# if str_endpoint[0:5]=='/api/': ## add baseUrl ???
# print(str_endpoint)
# #print(rule.endpoint) #internal endpoint name
# #print(rule.methods)
# return 'ok'
# ========== ERROR HANDLER ============
@app.errorhandler(405)
def _handle_client_error(e):
if request.path.startswith('/api/'): ## # TODO: add baseUrl
res_dict = {"status": "error", "reason": "Method Not Allowed: The method is not allowed for the requested URL"}
anchor_id = request.path[8:]
anchor_id = anchor_id.replace('/', '_')
api_doc_url = 'https://github.com/CIRCL/AIL-framework/tree/master/doc#{}'.format(anchor_id)
res_dict['documentation'] = api_doc_url
return Response(json.dumps(res_dict, indent=2, sort_keys=True), mimetype='application/json'), 405
else:
return e
@app.errorhandler(404)
def error_page_not_found(e):
if request.path.startswith('/api/'): ## # TODO: add baseUrl
return Response(json.dumps({"status": "error", "reason": "404 Not Found"}, indent=2, sort_keys=True), mimetype='application/json'), 404
else:
# avoid endpoint enumeration
return page_not_found(e)
@login_required
def page_not_found(e):
# avoid endpoint enumeration

View file

@ -0,0 +1,200 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
'''
Blueprint Flask: correlation endpoints: show correlation graph, graph node json ...
'''
import os
import sys
import json
import random
from flask import Flask, render_template, jsonify, request, Blueprint, redirect, url_for, Response
from flask_login import login_required, current_user, login_user, logout_user
sys.path.append('modules')
import Flask_config
# Import Role_Manager
from Role_Manager import create_user_db, check_password_strength, check_user_role_integrity
from Role_Manager import login_admin, login_analyst
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'lib'))
import Correlate_object
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'packages'))
import Cryptocurrency
import Pgp
import Decoded
bootstrap_label = Flask_config.bootstrap_label
vt_enabled = Flask_config.vt_enabled
# ============ BLUEPRINT ============
correlation = Blueprint('correlation', __name__, template_folder=os.path.join(os.environ['AIL_FLASK'], 'templates/correlation'))
# ============ VARIABLES ============
######
### graph_line_json
### 'hashDecoded.pgpdump_graph_line_json'
### 'hashDecoded.cryptocurrency_graph_line_json'
###
######
# ============ FUNCTIONS ============
def sanitise_graph_mode(graph_mode):
if graph_mode not in ('inter', 'union'):
return 'union'
else:
return graph_mode
def sanitise_nb_max_nodes(nb_max_nodes):
try:
nb_max_nodes = int(nb_max_nodes)
if nb_max_nodes < 2:
nb_max_nodes = 300
except:
nb_max_nodes = 300
return nb_max_nodes
def sanitise_correlation_names(correlation_names):
'''
correlation_names ex = 'pgp,crypto'
'''
all_correlation_names = Correlate_object.get_all_correlation_names()
if correlation_names is None:
return all_correlation_names
else:
l_correlation_names = []
for correl in correlation_names.split(','):
if correl in all_correlation_names:
l_correlation_names.append(correl)
if l_correlation_names:
return l_correlation_names
else:
return all_correlation_names
def sanitise_correlation_objects(correlation_objects):
'''
correlation_objects ex = 'domain,decoded'
'''
all_correlation_objects = Correlate_object.get_all_correlation_objects()
if correlation_objects is None:
return all_correlation_objects
else:
l_correlation_objects = []
for correl in correlation_objects.split(','):
if correl in all_correlation_objects:
l_correlation_objects.append(correl)
if l_correlation_objects:
return l_correlation_objects
else:
return all_correlation_objects
def get_card_metadata(object_type, correlation_id, type_id=None):
card_dict = {}
if object_type == 'cryptocurrency':
card_dict["sparkline"] = Cryptocurrency.cryptocurrency.get_list_nb_previous_correlation_object(type_id, correlation_id, 6)
card_dict["icon"] = Correlate_object.get_correlation_node_icon(object_type, type_id)
elif object_type == 'pgp':
card_dict["sparkline"] = Pgp.pgp.get_list_nb_previous_correlation_object(type_id, correlation_id, 6)
card_dict["icon"] = Correlate_object.get_correlation_node_icon(object_type, type_id)
elif object_type == 'decoded':
card_dict["sparkline"] = Decoded.get_list_nb_previous_hash(correlation_id, 6)
card_dict["icon"] = Correlate_object.get_correlation_node_icon(object_type, value=correlation_id)
card_dict["vt"] = Decoded.get_decoded_vt_report(correlation_id)
card_dict["vt"]["status"] = vt_enabled
elif object_type == 'domain':
pass
elif object_type == 'paste':
pass
return card_dict
# ============= ROUTES ==============
@correlation.route('/correlation/show_correlation', methods=['GET', 'POST']) # GET + POST
@login_required
@login_analyst
def show_correlation():
if request.method == 'POST':
object_type = request.form.get('object_type')
type_id = request.form.get('type_id')
correlation_id = request.form.get('correlation_id')
max_nodes = request.form.get('max_nb_nodes_in')
mode = request.form.get('mode')
if mode:
mode = 'inter'
else:
mode = 'union'
## get all selected correlations
correlation_names = []
correlation_objects = []
#correlation_names
correl_option = request.form.get('CryptocurrencyCheck')
if correl_option:
correlation_names.append('cryptocurrency')
correl_option = request.form.get('PgpCheck')
if correl_option:
correlation_names.append('pgp')
correl_option = request.form.get('DecodedCheck')
if correl_option:
correlation_names.append('decoded')
# correlation_objects
correl_option = request.form.get('DomainCheck')
if correl_option:
correlation_objects.append('domain')
correl_option = request.form.get('PasteCheck')
if correl_option:
correlation_objects.append('paste')
# list as params
correlation_names = ",".join(correlation_names)
correlation_objects = ",".join(correlation_objects)
# redirect to keep history and bookmark
return redirect(url_for('correlation.show_correlation', object_type=object_type, type_id=type_id, correlation_id=correlation_id, mode=mode,
max_nodes=max_nodes, correlation_names=correlation_names, correlation_objects=correlation_objects))
# request.method == 'GET'
else:
object_type = request.args.get('object_type')
type_id = request.args.get('type_id')
correlation_id = request.args.get('correlation_id')
max_nodes = sanitise_nb_max_nodes(request.args.get('max_nodes'))
mode = sanitise_graph_mode(request.args.get('mode'))
correlation_names = sanitise_correlation_names(request.args.get('correlation_names'))
correlation_objects = sanitise_correlation_objects(request.args.get('correlation_objects'))
dict_object = {"object_type": object_type, "correlation_id": correlation_id}
dict_object["max_nodes"] = max_nodes
dict_object["mode"] = mode
dict_object["correlation_names"] = correlation_names
dict_object["correlation_names_str"] = ",".join(correlation_names)
dict_object["correlation_objects"] = correlation_objects
dict_object["correlation_objects_str"] = ",".join(correlation_objects)
dict_object["metadata"] = Correlate_object.get_object_metadata(object_type, correlation_id, type_id=type_id)
if type_id:
dict_object["metadata"]['type_id'] = type_id
dict_object["metadata_card"] = get_card_metadata(object_type, correlation_id, type_id=type_id)
return render_template("show_correlation.html", dict_object=dict_object)
@correlation.route('/correlation/graph_node_json')
@login_required
@login_analyst
def graph_node_json(): # # TODO: use post
correlation_id = request.args.get('correlation_id')
type_id = request.args.get('type_id')
object_type = request.args.get('object_type')
max_nodes = sanitise_nb_max_nodes(request.args.get('max_nodes'))
correlation_names = sanitise_correlation_names(request.args.get('correlation_names'))
correlation_objects = sanitise_correlation_objects(request.args.get('correlation_objects'))
mode = sanitise_graph_mode(request.args.get('mode'))
res = Correlate_object.get_graph_node_object_correlation(object_type, correlation_id, mode, correlation_names, correlation_objects, requested_correl_type=type_id, max_nodes=max_nodes)
return jsonify(res)
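
A hedged usage sketch of the correlation graph endpoint added above (not part of the commit): the base URL, the login credentials and the example `correlation_id` are assumptions, while the query parameters mirror the `sanitise_*` helpers and the arguments read by `graph_node_json()`.

```python
import requests

base_url = 'https://127.0.0.1:7000'   # assumed local instance
session = requests.Session()

# the endpoint is wrapped by @login_required / @login_analyst, so log in first
session.post(base_url + '/login',
             data={'username': 'admin@admin.test', 'password': 'YOUR_PASSWORD'},
             verify=False)

params = {
    'object_type': 'decoded',
    'correlation_id': 'sha256-of-a-decoded-file',   # placeholder value
    'max_nodes': 300,
    'mode': 'union',
    'correlation_names': 'pgp,cryptocurrency,decoded',
    'correlation_objects': 'domain,paste',
}
r = session.get(base_url + '/correlation/graph_node_json', params=params, verify=False)
print(r.json())
```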

View file

@ -0,0 +1,73 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
'''
Blueprint Flask: crawler splash endpoints: dashboard, onion crawler ...
'''
import os
import sys
import json
import random
from flask import Flask, render_template, jsonify, request, Blueprint, redirect, url_for, Response
from flask_login import login_required, current_user, login_user, logout_user
sys.path.append('modules')
import Flask_config
# Import Role_Manager
from Role_Manager import create_user_db, check_password_strength, check_user_role_integrity
from Role_Manager import login_admin, login_analyst
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'packages'))
from Tag import get_modal_add_tags
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'lib'))
import Domain
r_cache = Flask_config.r_cache
r_serv_db = Flask_config.r_serv_db
r_serv_tags = Flask_config.r_serv_tags
bootstrap_label = Flask_config.bootstrap_label
# ============ BLUEPRINT ============
crawler_splash = Blueprint('crawler_splash', __name__, template_folder=os.path.join(os.environ['AIL_FLASK'], 'templates/crawler/crawler_splash'))
# ============ VARIABLES ============
# ============ FUNCTIONS ============
def api_validator(api_response):
if api_response:
return Response(json.dumps(api_response[0], indent=2, sort_keys=True), mimetype='application/json'), api_response[1]
# ============= ROUTES ==============
# add route : /crawlers/show_domain
@crawler_splash.route('/crawlers/showDomain')
@login_required
@login_analyst
def showDomain():
domain_name = request.args.get('domain')
epoch = request.args.get('epoch')
port = request.args.get('port')
res = api_validator(Domain.api_verify_if_domain_exist(domain_name))
if res:
return res
domain = Domain.Domain(domain_name, port=port)
dict_domain = domain.get_domain_metadata()
dict_domain['domain'] = domain_name
if domain.is_domain_up():
dict_domain = {**dict_domain, **domain.get_domain_correlation()}
dict_domain['origin_item'] = domain.get_domain_last_origin()
dict_domain['tags'] = domain.get_domain_tags()
dict_domain['history'] = domain.get_domain_history_with_status()
dict_domain['crawler_history'] = domain.get_domain_items_crawled(items_link=True, epoch=epoch, item_screenshot=True, item_tag=True) # # TODO: handle multiple port
dict_domain['crawler_history']['random_item'] = random.choice(dict_domain['crawler_history']['items'])
return render_template("showDomain.html", dict_domain=dict_domain, bootstrap_label=bootstrap_label,
modal_add_tags=get_modal_add_tags(dict_domain['domain'], tag_type="domain"))

140
var/www/blueprints/root.py Normal file
View file

@ -0,0 +1,140 @@
#!/usr/bin/env python3
# -*-coding:UTF-8 -*
'''
Blueprint Flask: root endpoints: login, ...
'''
import os
import sys
from flask import Flask, render_template, jsonify, request, Blueprint, redirect, url_for, Response
from flask_login import login_required, current_user, login_user, logout_user
sys.path.append('modules')
import Flask_config
# Import Role_Manager
from Role_Manager import create_user_db, check_password_strength, check_user_role_integrity
from Role_Manager import login_admin, login_analyst
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'packages'))
from User import User
r_cache = Flask_config.r_cache
r_serv_db = Flask_config.r_serv_db
r_serv_tags = Flask_config.r_serv_tags
# ============ BLUEPRINT ============
root = Blueprint('root', __name__, template_folder='templates')
# ============ VARIABLES ============
# ============ FUNCTIONS ============
# ============= ROUTES ==============
@root.route('/login', methods=['POST', 'GET'])
def login():
current_ip = request.remote_addr
login_failed_ip = r_cache.get('failed_login_ip:{}'.format(current_ip))
# brute force by ip
if login_failed_ip:
login_failed_ip = int(login_failed_ip)
if login_failed_ip >= 5:
error = 'Max Connection Attempts reached, Please wait {}s'.format(r_cache.ttl('failed_login_ip:{}'.format(current_ip)))
return render_template("login.html", error=error)
if request.method == 'POST':
username = request.form.get('username')
password = request.form.get('password')
#next_page = request.form.get('next_page')
if username is not None:
user = User.get(username)
login_failed_user_id = r_cache.get('failed_login_user_id:{}'.format(username))
# brute force by user_id
if login_failed_user_id:
login_failed_user_id = int(login_failed_user_id)
if login_failed_user_id >= 5:
error = 'Max Connection Attempts reached, Please wait {}s'.format(r_cache.ttl('failed_login_user_id:{}'.format(username)))
return render_template("login.html", error=error)
if user and user.check_password(password):
if not check_user_role_integrity(user.get_id()):
error = 'Incorrect User ACL, Please contact your administrator'
return render_template("login.html", error=error)
login_user(user) ## TODO: use remember me ?
if user.request_password_change():
return redirect(url_for('root.change_password'))
else:
return redirect(url_for('dashboard.index'))
# login failed
else:
# set brute force protection
#logger.warning("Login failed, ip={}, username={}".format(current_ip, username))
r_cache.incr('failed_login_ip:{}'.format(current_ip))
r_cache.expire('failed_login_ip:{}'.format(current_ip), 300)
r_cache.incr('failed_login_user_id:{}'.format(username))
r_cache.expire('failed_login_user_id:{}'.format(username), 300)
#
error = 'Password Incorrect'
return render_template("login.html", error=error)
return 'please provide a valid username'
else:
#next_page = request.args.get('next')
error = request.args.get('error')
return render_template("login.html" , error=error)
@root.route('/change_password', methods=['POST', 'GET'])
@login_required
def change_password():
password1 = request.form.get('password1')
password2 = request.form.get('password2')
error = request.args.get('error')
if error:
return render_template("change_password.html", error=error)
if current_user.is_authenticated and password1!=None:
if password1==password2:
if check_password_strength(password1):
user_id = current_user.get_id()
create_user_db(user_id , password1, update=True)
return redirect(url_for('dashboard.index'))
else:
error = 'Incorrect password'
return render_template("change_password.html", error=error)
else:
error = "Passwords don't match"
return render_template("change_password.html", error=error)
else:
error = 'Please choose a new password'
return render_template("change_password.html", error=error)
@root.route('/logout')
@login_required
def logout():
logout_user()
return redirect(url_for('root.login'))
# role error template
@root.route('/role', methods=['POST', 'GET'])
@login_required
def role():
return render_template("error/403.html"), 403
@root.route('/searchbox/')
@login_required
@login_analyst
def searchbox():
return render_template("searchbox.html")

View file

@ -4,28 +4,18 @@
import os
import sys
import redis
import configparser
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'lib/'))
import ConfigLoader
sys.path.append(os.path.join(os.environ['AIL_FLASK'], 'modules'))
from Role_Manager import create_user_db, edit_user_db, get_default_admin_token, gen_password
config_loader = ConfigLoader.ConfigLoader()
configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
if not os.path.exists(configfile):
raise Exception('Unable to find the configuration file. \
Did you set environment variables? \
Or activate the virtualenv.')
cfg = configparser.ConfigParser()
cfg.read(configfile)
r_serv = redis.StrictRedis(
host=cfg.get("ARDB_DB", "host"),
port=cfg.getint("ARDB_DB", "port"),
db=cfg.getint("ARDB_DB", "db"),
decode_responses=True)
r_serv = config_loader.get_redis_conn("ARDB_DB")
config_loader = None
if __name__ == "__main__":

View file

@ -4,110 +4,34 @@
'''
Flask global variables shared accross modules
'''
import configparser
import redis
import os
import re
import sys
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'lib/'))
import ConfigLoader
# FLASK #
app = None
#secret_key = 'ail-super-secret_key01C'
# CONFIG #
configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
if not os.path.exists(configfile):
raise Exception('Unable to find the configuration file. \
Did you set environment variables? \
Or activate the virtualenv.')
cfg = configparser.ConfigParser()
cfg.read(configfile)
config_loader = ConfigLoader.ConfigLoader()
# REDIS #
r_serv = redis.StrictRedis(
host=cfg.get("Redis_Queues", "host"),
port=cfg.getint("Redis_Queues", "port"),
db=cfg.getint("Redis_Queues", "db"),
decode_responses=True)
r_cache = redis.StrictRedis(
host=cfg.get("Redis_Cache", "host"),
port=cfg.getint("Redis_Cache", "port"),
db=cfg.getint("Redis_Cache", "db"),
decode_responses=True)
r_serv_log = redis.StrictRedis(
host=cfg.get("Redis_Log", "host"),
port=cfg.getint("Redis_Log", "port"),
db=cfg.getint("Redis_Log", "db"),
decode_responses=True)
r_serv_log_submit = redis.StrictRedis(
host=cfg.get("Redis_Log_submit", "host"),
port=cfg.getint("Redis_Log_submit", "port"),
db=cfg.getint("Redis_Log_submit", "db"),
decode_responses=True)
r_serv_charts = redis.StrictRedis(
host=cfg.get("ARDB_Trending", "host"),
port=cfg.getint("ARDB_Trending", "port"),
db=cfg.getint("ARDB_Trending", "db"),
decode_responses=True)
r_serv_sentiment = redis.StrictRedis(
host=cfg.get("ARDB_Sentiment", "host"),
port=cfg.getint("ARDB_Sentiment", "port"),
db=cfg.getint("ARDB_Sentiment", "db"),
decode_responses=True)
r_serv_term = redis.StrictRedis(
host=cfg.get("ARDB_TermFreq", "host"),
port=cfg.getint("ARDB_TermFreq", "port"),
db=cfg.getint("ARDB_TermFreq", "db"),
decode_responses=True)
r_serv_cred = redis.StrictRedis(
host=cfg.get("ARDB_TermCred", "host"),
port=cfg.getint("ARDB_TermCred", "port"),
db=cfg.getint("ARDB_TermCred", "db"),
decode_responses=True)
r_serv_pasteName = redis.StrictRedis(
host=cfg.get("Redis_Paste_Name", "host"),
port=cfg.getint("Redis_Paste_Name", "port"),
db=cfg.getint("Redis_Paste_Name", "db"),
decode_responses=True)
r_serv_tags = redis.StrictRedis(
host=cfg.get("ARDB_Tags", "host"),
port=cfg.getint("ARDB_Tags", "port"),
db=cfg.getint("ARDB_Tags", "db"),
decode_responses=True)
r_serv_metadata = redis.StrictRedis(
host=cfg.get("ARDB_Metadata", "host"),
port=cfg.getint("ARDB_Metadata", "port"),
db=cfg.getint("ARDB_Metadata", "db"),
decode_responses=True)
r_serv_db = redis.StrictRedis(
host=cfg.get("ARDB_DB", "host"),
port=cfg.getint("ARDB_DB", "port"),
db=cfg.getint("ARDB_DB", "db"),
decode_responses=True)
r_serv_statistics = redis.StrictRedis(
host=cfg.get("ARDB_Statistics", "host"),
port=cfg.getint("ARDB_Statistics", "port"),
db=cfg.getint("ARDB_Statistics", "db"),
decode_responses=True)
r_serv_onion = redis.StrictRedis(
host=cfg.get("ARDB_Onion", "host"),
port=cfg.getint("ARDB_Onion", "port"),
db=cfg.getint("ARDB_Onion", "db"),
decode_responses=True)
r_serv = config_loader.get_redis_conn("Redis_Queues")
r_cache = config_loader.get_redis_conn("Redis_Cache")
r_serv_log = config_loader.get_redis_conn("Redis_Log")
r_serv_log_submit = config_loader.get_redis_conn("Redis_Log_submit")
r_serv_charts = config_loader.get_redis_conn("ARDB_Trending")
r_serv_sentiment = config_loader.get_redis_conn("ARDB_Sentiment")
r_serv_term = config_loader.get_redis_conn("ARDB_Tracker")
r_serv_cred = config_loader.get_redis_conn("ARDB_TermCred")
r_serv_pasteName = config_loader.get_redis_conn("Redis_Paste_Name")
r_serv_tags = config_loader.get_redis_conn("ARDB_Tags")
r_serv_metadata = config_loader.get_redis_conn("ARDB_Metadata")
r_serv_db = config_loader.get_redis_conn("ARDB_DB")
r_serv_statistics = config_loader.get_redis_conn("ARDB_Statistics")
r_serv_onion = config_loader.get_redis_conn("ARDB_Onion")
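# ConfigLoader is assumed to read the framework's central configuration file;
# get_redis_conn("SECTION") is assumed to build a StrictRedis connection from that
# section's host/port/db keys, i.e. roughly equivalent to the explicit calls it replaces:
#   redis.StrictRedis(host=cfg.get(section, "host"), port=cfg.getint(section, "port"),
#                     db=cfg.getint(section, "db"), decode_responses=True)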
sys.path.append('../../configs/keys')
# MISP #
@ -146,39 +70,43 @@ if HiveApi != False:
HiveApi = False
print('TheHive is not connected')
# VARIABLES #
baseUrl = cfg.get("Flask", "baseurl")
#### VARIABLES ####
baseUrl = config_loader.get_config_str("Flask", "baseurl")
baseUrl = baseUrl.replace('/', '')
if baseUrl != '':
baseUrl = '/'+baseUrl
max_preview_char = int(cfg.get("Flask", "max_preview_char")) # Maximum number of character to display in the tooltip
max_preview_modal = int(cfg.get("Flask", "max_preview_modal")) # Maximum number of character to display in the modal
max_preview_char = int(config_loader.get_config_str("Flask", "max_preview_char")) # Maximum number of characters to display in the tooltip
max_preview_modal = int(config_loader.get_config_str("Flask", "max_preview_modal")) # Maximum number of characters to display in the modal
max_tags_result = 50
DiffMaxLineLength = int(cfg.get("Flask", "DiffMaxLineLength"))#Use to display the estimated percentage instead of a raw value
DiffMaxLineLength = int(config_loader.get_config_str("Flask", "DiffMaxLineLength")) # Used to display the estimated percentage instead of a raw value
bootstrap_label = ['primary', 'success', 'danger', 'warning', 'info']
dict_update_description = {'v1.5':{'nb_background_update': 5, 'update_warning_message': 'An update is running in the background. Some information like Tags and screenshots can be',
'update_warning_message_notice_me': 'missing from the UI.'}
'update_warning_message_notice_me': 'missing from the UI.'},
'v2.4':{'nb_background_update': 1, 'update_warning_message': 'An update is running in the background. Some information like Domain Tags/Correlation can be',
'update_warning_message_notice_me': 'missing from the UI.'}
}
UPLOAD_FOLDER = os.path.join(os.environ['AIL_FLASK'], 'submitted')
PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], cfg.get("Directories", "pastes")) + '/'
SCREENSHOT_FOLDER = os.path.join(os.environ['AIL_HOME'], cfg.get("Directories", "crawled_screenshot"), 'screenshot')
PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], config_loader.get_config_str("Directories", "pastes")) + '/'
SCREENSHOT_FOLDER = os.path.join(os.environ['AIL_HOME'], config_loader.get_config_str("Directories", "crawled_screenshot"), 'screenshot')
REPO_ORIGIN = 'https://github.com/CIRCL/AIL-framework.git'
max_dashboard_logs = int(cfg.get("Flask", "max_dashboard_logs"))
max_dashboard_logs = int(config_loader.get_config_str("Flask", "max_dashboard_logs"))
crawler_enabled = cfg.getboolean("Crawler", "activate_crawler")
crawler_enabled = config_loader.get_config_boolean("Crawler", "activate_crawler")
email_regex = r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,6}'
email_regex = re.compile(email_regex)
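# Illustrative matches for the regex above: 'jane.doe+test@example.org' is accepted,
# while a TLD longer than 6 letters (e.g. 'user@host.technology') is not.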
IMPORT_MAX_TEXT_SIZE = 900000 # size in bytes
# VT
try:
from virusTotalKEYS import vt_key
@ -190,6 +118,6 @@ try:
vt_enabled = False
print('VT submission is disabled')
except:
vt_auth = {'apikey': cfg.get("Flask", "max_preview_char")}
vt_auth = {'apikey': config_loader.get_config_str("Flask", "max_preview_char")}
vt_enabled = False
print('VT submission is disabled')
@ -23,6 +23,9 @@ import json
import Paste
import Import_helper
import Tag
from pytaxonomies import Taxonomies
from pymispgalaxies import Galaxies, Clusters
@ -41,7 +44,6 @@ except:
import Flask_config
app = Flask_config.app
cfg = Flask_config.cfg
baseUrl = Flask_config.baseUrl
r_serv_tags = Flask_config.r_serv_tags
r_serv_metadata = Flask_config.r_serv_metadata
@ -87,65 +89,6 @@ def clean_filename(filename, whitelist=valid_filename_chars, replace=' '):
# keep only whitelisted chars
return ''.join(c for c in cleaned_filename if c in whitelist)
def launch_submit(ltags, ltagsgalaxies, paste_content, UUID, password, isfile = False):
# save temp value on disk
r_serv_db.set(UUID + ':ltags', ltags)
r_serv_db.set(UUID + ':ltagsgalaxies', ltagsgalaxies)
r_serv_db.set(UUID + ':paste_content', paste_content)
r_serv_db.set(UUID + ':password', password)
r_serv_db.set(UUID + ':isfile', isfile)
r_serv_log_submit.set(UUID + ':end', 0)
r_serv_log_submit.set(UUID + ':processing', 0)
r_serv_log_submit.set(UUID + ':nb_total', -1)
r_serv_log_submit.set(UUID + ':nb_end', 0)
r_serv_log_submit.set(UUID + ':nb_sucess', 0)
r_serv_log_submit.set(UUID + ':error', 'error:')
r_serv_log_submit.sadd(UUID + ':paste_submit_link', '')
# save UUID on disk
r_serv_db.sadd('submitted:uuid', UUID)
def addTagsVerification(tags, tagsgalaxies):
list_tag = tags.split(',')
list_tag_galaxies = tagsgalaxies.split(',')
taxonomies = Taxonomies()
active_taxonomies = r_serv_tags.smembers('active_taxonomies')
active_galaxies = r_serv_tags.smembers('active_galaxies')
if list_tag != ['']:
for tag in list_tag:
# verify input
tax = tag.split(':')[0]
if tax in active_taxonomies:
if tag in r_serv_tags.smembers('active_tag_' + tax):
pass
else:
return False
else:
return False
if list_tag_galaxies != ['']:
for tag in list_tag_galaxies:
# verify input
gal = tag.split(':')[1]
gal = gal.split('=')[0]
if gal in active_galaxies:
if tag in r_serv_tags.smembers('active_tag_galaxies_' + gal):
pass
else:
return False
else:
return False
return True
def date_to_str(date):
return "{0}-{1}-{2}".format(date.year, date.month, date.day)
@ -279,11 +222,9 @@ def hive_create_case(hive_tlp, threat_level, hive_description, hive_case_title,
@login_required
@login_analyst
def PasteSubmit_page():
#active taxonomies
active_taxonomies = r_serv_tags.smembers('active_taxonomies')
#active galaxies
active_galaxies = r_serv_tags.smembers('active_galaxies')
# Get all active tags/galaxy
active_taxonomies = Tag.get_active_taxonomies()
active_galaxies = Tag.get_active_galaxies()
return render_template("submit_items.html",
active_taxonomies = active_taxonomies,
@ -311,21 +252,27 @@ def submit():
submitted_tag = 'infoleak:submission="manual"'
#active taxonomies
active_taxonomies = r_serv_tags.smembers('active_taxonomies')
active_taxonomies = Tag.get_active_taxonomies()
#active galaxies
active_galaxies = r_serv_tags.smembers('active_galaxies')
active_galaxies = Tag.get_active_galaxies()
if ltags or ltagsgalaxies:
if not addTagsVerification(ltags, ltagsgalaxies):
ltags = ltags.split(',')
ltagsgalaxies = ltagsgalaxies.split(',')
print(ltags)
print(ltagsgalaxies)
if not Tag.is_valid_tags_taxonomies_galaxy(ltags, ltagsgalaxies):
content = 'INVALID TAGS'
print(content)
return content, 400
# add submitted tags
if(ltags != ''):
ltags = ltags + ',' + submitted_tag
else:
ltags = submitted_tag
if not ltags:
ltags = []
ltags.append(submitted_tag)
if is_file:
if file:
@ -358,7 +305,7 @@ def submit():
paste_content = full_path
launch_submit(ltags, ltagsgalaxies, paste_content, UUID, password ,True)
Import_helper.create_import_queue(ltags, ltagsgalaxies, paste_content, UUID, password ,True)
return render_template("submit_items.html",
active_taxonomies = active_taxonomies,
@ -376,12 +323,7 @@ def submit():
# get id
UUID = str(uuid.uuid4())
#if paste_name:
# clean file name
#id = clean_filename(paste_name)
launch_submit(ltags, ltagsgalaxies, paste_content, UUID, password)
Import_helper.create_import_queue(ltags, ltagsgalaxies, paste_content, UUID, password)
return render_template("submit_items.html",
active_taxonomies = active_taxonomies,
@ -415,7 +357,7 @@ def submit_status():
nb_sucess = r_serv_log_submit.get(UUID + ':nb_sucess')
paste_submit_link = list(r_serv_log_submit.smembers(UUID + ':paste_submit_link'))
if (end != None) and (nb_total != None) and (nb_end != None) and (error != None) and (processing != None) and (paste_submit_link != None):
if (end != None) and (nb_total != None) and (nb_end != None) and (processing != None):
link = ''
if paste_submit_link:
@ -433,10 +375,10 @@ def submit_status():
else:
prog = 0
if error == 'error:':
isError = False
else:
if error:
isError = True
else:
isError = False
if end == '0':
end = False
@ -3,39 +3,47 @@
import os
import re
import sys
import redis
import bcrypt
import configparser
sys.path.append(os.path.join(os.environ['AIL_BIN'], 'lib/'))
import ConfigLoader
from functools import wraps
from flask_login import LoginManager, current_user, login_user, logout_user, login_required
from flask import request, current_app
from flask import request, make_response, current_app
login_manager = LoginManager()
login_manager.login_view = 'role'
# CONFIG #
configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
if not os.path.exists(configfile):
raise Exception('Unable to find the configuration file. \
Did you set environment variables? \
Or activate the virtualenv.')
config_loader = ConfigLoader.ConfigLoader()
cfg = configparser.ConfigParser()
cfg.read(configfile)
r_serv_db = redis.StrictRedis(
host=cfg.get("ARDB_DB", "host"),
port=cfg.getint("ARDB_DB", "port"),
db=cfg.getint("ARDB_DB", "db"),
decode_responses=True)
r_serv_db = config_loader.get_redis_conn("ARDB_DB")
config_loader = None
default_passwd_file = os.path.join(os.environ['AIL_HOME'], 'DEFAULT_PASSWORD')
regex_password = r'^(?=(.*\d){2})(?=.*[a-z])(?=.*[A-Z]).{10,100}$'
regex_password = re.compile(regex_password)
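# Policy encoded by the regex: 10-100 characters with at least two digits, one lowercase
# and one uppercase letter. For example (illustrative values), 'Xy12abcdef' passes while
# 'short1A' (too short, single digit) and 'alllowercase12' (no uppercase) are rejected.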
###############################################################
############### FLASK CACHE ##################
###############################################################
def no_cache(func):
@wraps(func)
def decorated_view(*args, **kwargs):
resp = make_response(func(*args, **kwargs))
resp.headers['Cache-Control'] = 'no-cache, no-store, must-revalidate'
resp.headers['Pragma'] = 'no-cache'
return resp
return decorated_view
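# Typical usage (illustrative blueprint and route names), stacked with the other view
# decorators so the rendered response carries the no-cache headers set above:
#   @root.route('/sensitive_page')
#   @login_required
#   @no_cache
#   def sensitive_page():
#       return render_template("sensitive_page.html")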
###############################################################
###############################################################
###############################################################
###############################################################
############### CHECK ROLE ACCESS ##################
###############################################################

View file

@ -20,9 +20,9 @@ from pymispgalaxies import Galaxies, Clusters
# ============ VARIABLES ============
import Flask_config
import Tag
app = Flask_config.app
cfg = Flask_config.cfg
baseUrl = Flask_config.baseUrl
r_serv_tags = Flask_config.r_serv_tags
r_serv_metadata = Flask_config.r_serv_metadata
@ -59,16 +59,6 @@ for name, tags in clusters.items(): #galaxie name + tags
def one():
return 1
def date_substract_day(date, num_day=1):
new_date = datetime.date(int(date[0:4]), int(date[4:6]), int(date[6:8])) - datetime.timedelta(num_day)
new_date = str(new_date).replace('-', '')
return new_date
def date_add_day(date, num_day=1):
new_date = datetime.date(int(date[0:4]), int(date[4:6]), int(date[6:8])) + datetime.timedelta(num_day)
new_date = str(new_date).replace('-', '')
return new_date
def get_tags_with_synonyms(tag):
str_synonyms = ' - synonyms: '
synonyms = r_serv_tags.smembers('synonym_tag_' + tag)
@ -131,93 +121,6 @@ def get_last_seen_from_tags_list(list_tags):
min_last_seen = tag_last_seen
return str(min_last_seen)
def add_item_tag(tag, item_path):
item_date = int(get_item_date(item_path))
#add tag
r_serv_metadata.sadd('tag:{}'.format(item_path), tag)
r_serv_tags.sadd('{}:{}'.format(tag, item_date), item_path)
r_serv_tags.hincrby('daily_tags:{}'.format(item_date), tag, 1)
tag_first_seen = r_serv_tags.hget('tag_metadata:{}'.format(tag), 'last_seen')
if tag_first_seen is None:
tag_first_seen = 99999999
else:
tag_first_seen = int(tag_first_seen)
tag_last_seen = r_serv_tags.hget('tag_metadata:{}'.format(tag), 'last_seen')
if tag_last_seen is None:
tag_last_seen = 0
else:
tag_last_seen = int(tag_last_seen)
#add new tag in list of all used tags
r_serv_tags.sadd('list_tags', tag)
# update fisrt_seen/last_seen
if item_date < tag_first_seen:
r_serv_tags.hset('tag_metadata:{}'.format(tag), 'first_seen', item_date)
# update metadata last_seen
if item_date > tag_last_seen:
r_serv_tags.hset('tag_metadata:{}'.format(tag), 'last_seen', item_date)
def remove_item_tag(tag, item_path):
item_date = int(get_item_date(item_path))
#remove tag
r_serv_metadata.srem('tag:{}'.format(item_path), tag)
res = r_serv_tags.srem('{}:{}'.format(tag, item_date), item_path)
if res ==1:
# no tag for this day
if int(r_serv_tags.hget('daily_tags:{}'.format(item_date), tag)) == 1:
r_serv_tags.hdel('daily_tags:{}'.format(item_date), tag)
else:
r_serv_tags.hincrby('daily_tags:{}'.format(item_date), tag, -1)
tag_first_seen = int(r_serv_tags.hget('tag_metadata:{}'.format(tag), 'last_seen'))
tag_last_seen = int(r_serv_tags.hget('tag_metadata:{}'.format(tag), 'last_seen'))
# update fisrt_seen/last_seen
if item_date == tag_first_seen:
update_tag_first_seen(tag, tag_first_seen, tag_last_seen)
if item_date == tag_last_seen:
update_tag_last_seen(tag, tag_first_seen, tag_last_seen)
else:
return 'Error incorrect tag'
def update_tag_first_seen(tag, tag_first_seen, tag_last_seen):
if tag_first_seen == tag_last_seen:
if r_serv_tags.scard('{}:{}'.format(tag, tag_first_seen)) > 0:
r_serv_tags.hset('tag_metadata:{}'.format(tag), 'first_seen', tag_first_seen)
# no tag in db
else:
r_serv_tags.srem('list_tags', tag)
r_serv_tags.hdel('tag_metadata:{}'.format(tag), 'first_seen')
r_serv_tags.hdel('tag_metadata:{}'.format(tag), 'last_seen')
else:
if r_serv_tags.scard('{}:{}'.format(tag, tag_first_seen)) > 0:
r_serv_tags.hset('tag_metadata:{}'.format(tag), 'first_seen', tag_first_seen)
else:
tag_first_seen = date_add_day(tag_first_seen)
update_tag_first_seen(tag, tag_first_seen, tag_last_seen)
def update_tag_last_seen(tag, tag_first_seen, tag_last_seen):
if tag_first_seen == tag_last_seen:
if r_serv_tags.scard('{}:{}'.format(tag, tag_last_seen)) > 0:
r_serv_tags.hset('tag_metadata:{}'.format(tag), 'last_seen', tag_last_seen)
# no tag in db
else:
r_serv_tags.srem('list_tags', tag)
r_serv_tags.hdel('tag_metadata:{}'.format(tag), 'first_seen')
r_serv_tags.hdel('tag_metadata:{}'.format(tag), 'last_seen')
else:
if r_serv_tags.scard('{}:{}'.format(tag, tag_last_seen)) > 0:
r_serv_tags.hset('tag_metadata:{}'.format(tag), 'last_seen', tag_last_seen)
else:
tag_last_seen = date_substract_day(tag_last_seen)
update_tag_last_seen(tag, tag_first_seen, tag_last_seen)
# ============= ROUTES ==============
@Tags.route("/tags/", methods=['GET'])
@ -472,8 +375,9 @@ def remove_tag():
path = request.args.get('paste')
tag = request.args.get('tag')
remove_item_tag(tag, path)
res = Tag.remove_item_tag(tag, path)
if res[1] != 200:
return str(res[0])
return redirect(url_for('showsavedpastes.showsavedpaste', paste=path))
@Tags.route("/Tags/confirm_tag")
@ -486,11 +390,11 @@ def confirm_tag():
tag = request.args.get('tag')
if(tag[9:28] == 'automatic-detection'):
remove_item_tag(tag, path)
Tag.remove_item_tag(tag, path)
tag = tag.replace('automatic-detection','analyst-detection', 1)
#add analyst tag
add_item_tag(tag, path)
Tag.add_item_tag(tag, path)
return redirect(url_for('showsavedpastes.showsavedpaste', paste=path))
@ -530,44 +434,36 @@ def addTags():
list_tag = tags.split(',')
list_tag_galaxies = tagsgalaxies.split(',')
taxonomies = Taxonomies()
active_taxonomies = r_serv_tags.smembers('active_taxonomies')
active_galaxies = r_serv_tags.smembers('active_galaxies')
if not path:
return 'INCORRECT INPUT0'
if list_tag != ['']:
for tag in list_tag:
# verify input
tax = tag.split(':')[0]
if tax in active_taxonomies:
if tag in r_serv_tags.smembers('active_tag_' + tax):
add_item_tag(tag, path)
else:
return 'INCORRECT INPUT1'
else:
return 'INCORRECT INPUT2'
if list_tag_galaxies != ['']:
for tag in list_tag_galaxies:
# verify input
gal = tag.split(':')[1]
gal = gal.split('=')[0]
if gal in active_galaxies:
if tag in r_serv_tags.smembers('active_tag_galaxies_' + gal):
add_item_tag(tag, path)
else:
return 'INCORRECT INPUT3'
else:
return 'INCORRECT INPUT4'
res = Tag.add_items_tag(list_tag, list_tag_galaxies, path)
print(res)
# error
if res[1] != 200:
return str(res[0])
# success
return redirect(url_for('showsavedpastes.showsavedpaste', paste=path))
@Tags.route("/Tags/add_item_tags")
@login_required
@login_analyst
def add_item_tags():
tags = request.args.get('tags')
tagsgalaxies = request.args.get('tagsgalaxies')
item_id = request.args.get('item_id')
item_type = request.args.get('type')
list_tag = tags.split(',')
list_tag_galaxies = tagsgalaxies.split(',')
res = Tag.add_items_tags(tags=list_tag, galaxy_tags=list_tag_galaxies, item_id=item_id, item_type=item_type)
# error
if res[1] != 200:
return str(res[0])
# success
if item_type=='domain':
return redirect(url_for('crawler_splash.showDomain', domain=item_id))
else:
return redirect(url_for('showsavedpastes.showsavedpaste', paste=item_id))
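# Illustrative request (hypothetical values): tagging a crawled domain would look like
#   /Tags/add_item_tags?tags=infoleak:submission="manual"&tagsgalaxies=&item_id=example.onion&type=domain
# and, on success, redirects to the domain view handled above.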
@Tags.route("/Tags/taxonomies")
@login_required
@ -21,7 +21,7 @@ from flask_login import login_required
import Flask_config
app = Flask_config.app
cfg = Flask_config.cfg
config_loader = Flask_config.config_loader
baseUrl = Flask_config.baseUrl
r_serv = Flask_config.r_serv
r_serv_log = Flask_config.r_serv_log
@ -171,8 +171,8 @@ def stuff():
@login_required
@login_analyst
def index():
default_minute = cfg.get("Flask", "minute_processed_paste")
threshold_stucked_module = cfg.getint("Module_ModuleInformation", "threshold_stucked_module")
default_minute = config_loader.get_config_str("Flask", "minute_processed_paste")
threshold_stucked_module = config_loader.get_config_int("Module_ModuleInformation", "threshold_stucked_module")
log_select = {10, 25, 50, 100}
log_select.add(max_dashboard_logs)
log_select = list(log_select)
@ -24,7 +24,6 @@ from flask_login import login_required
import Flask_config
app = Flask_config.app
cfg = Flask_config.cfg
baseUrl = Flask_config.baseUrl
r_serv_metadata = Flask_config.r_serv_metadata
vt_enabled = Flask_config.vt_enabled
@ -34,7 +33,7 @@ PASTES_FOLDER = Flask_config.PASTES_FOLDER
hashDecoded = Blueprint('hashDecoded', __name__, template_folder='templates')
## TODO: put me in option
all_cryptocurrency = ['bitcoin', 'monero']
all_cryptocurrency = ['bitcoin', 'ethereum', 'bitcoin-cash', 'litecoin', 'monero', 'zcash', 'dash']
all_pgpdump = ['key', 'name', 'mail']
# ============ FUNCTIONS ============
@ -225,13 +224,7 @@ def get_correlation_type_page_endpoint(correlation_type):
return endpoint
def get_show_key_id_endpoint(correlation_type):
if correlation_type == 'pgpdump':
endpoint = 'hashDecoded.show_pgpdump'
elif correlation_type == 'cryptocurrency':
endpoint = 'hashDecoded.show_cryptocurrency'
else:
endpoint = 'hashDecoded.hashDecoded_page'
return endpoint
return 'correlation.show_correlation'
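# All correlation types are now rendered by the generic 'correlation.show_correlation'
# view; the object is assumed to be identified through query parameters
# (object_type, type_id, correlation_id, correlation_objects), as built by the updated
# template links below.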
def get_range_type_json_endpoint(correlation_type):
if correlation_type == 'pgpdump':
@ -352,8 +345,13 @@ def main_correlation_page(correlation_type, type_id, date_from, date_to, show_de
l_type = get_all_types_id(correlation_type)
correlation_type_n = correlation_type
if correlation_type_n=='pgpdump':
correlation_type_n = 'pgp'
return render_template("DaysCorrelation.html", all_metadata=keys_id_metadata,
correlation_type=correlation_type,
correlation_type_n=correlation_type_n,
correlation_type_endpoint=get_correlation_type_page_endpoint(correlation_type),
correlation_type_search_endpoint=get_correlation_type_search_endpoint(correlation_type),
show_key_id_endpoint=get_show_key_id_endpoint(correlation_type),
@ -363,27 +361,27 @@ def main_correlation_page(correlation_type, type_id, date_from, date_to, show_de
date_from=date_from, date_to=date_to,
show_decoded_files=show_decoded_files)
def show_correlation(correlation_type, type_id, key_id):
if is_valid_type_id(correlation_type, type_id):
key_id_metadata = get_key_id_metadata(correlation_type, type_id, key_id)
if key_id_metadata:
num_day_sparkline = 6
date_range_sparkline = get_date_range(num_day_sparkline)
sparkline_values = list_sparkline_type_id_values(date_range_sparkline, correlation_type, type_id, key_id)
return render_template('showCorrelation.html', key_id=key_id, type_id=type_id,
correlation_type=correlation_type,
graph_node_endpoint=get_graph_node_json_endpoint(correlation_type),
graph_line_endpoint=get_graph_line_json_endpoint(correlation_type),
font_family=get_font_family(correlation_type),
key_id_metadata=key_id_metadata,
type_icon=get_icon(correlation_type, type_id),
sparkline_values=sparkline_values)
else:
return '404'
else:
return 'error'
# def show_correlation(correlation_type, type_id, key_id):
# if is_valid_type_id(correlation_type, type_id):
# key_id_metadata = get_key_id_metadata(correlation_type, type_id, key_id)
# if key_id_metadata:
#
# num_day_sparkline = 6
# date_range_sparkline = get_date_range(num_day_sparkline)
#
# sparkline_values = list_sparkline_type_id_values(date_range_sparkline, correlation_type, type_id, key_id)
# return render_template('showCorrelation.html', key_id=key_id, type_id=type_id,
# correlation_type=correlation_type,
# graph_node_endpoint=get_graph_node_json_endpoint(correlation_type),
# graph_line_endpoint=get_graph_line_json_endpoint(correlation_type),
# font_family=get_font_family(correlation_type),
# key_id_metadata=key_id_metadata,
# type_icon=get_icon(correlation_type, type_id),
# sparkline_values=sparkline_values)
# else:
# return '404'
# else:
# return 'error'
def correlation_type_range_type_json(correlation_type, date_from, date_to):
date_range = []
@ -621,60 +619,60 @@ def hash_hash():
hash = request.args.get('hash')
return render_template('hash_hash.html')
@hashDecoded.route('/hashDecoded/showHash')
@login_required
@login_analyst
def showHash():
hash = request.args.get('hash')
#hash = 'e02055d3efaad5d656345f6a8b1b6be4fe8cb5ea'
# TODO FIXME show error
if hash is None:
return hashDecoded_page()
estimated_type = r_serv_metadata.hget('metadata_hash:'+hash, 'estimated_type')
# hash not found
# TODO FIXME show error
if estimated_type is None:
return hashDecoded_page()
else:
file_icon = get_file_icon(estimated_type)
size = r_serv_metadata.hget('metadata_hash:'+hash, 'size')
first_seen = r_serv_metadata.hget('metadata_hash:'+hash, 'first_seen')
last_seen = r_serv_metadata.hget('metadata_hash:'+hash, 'last_seen')
nb_seen_in_all_pastes = r_serv_metadata.hget('metadata_hash:'+hash, 'nb_seen_in_all_pastes')
# get all encoding for this hash
list_hash_decoder = []
list_decoder = r_serv_metadata.smembers('all_decoder')
for decoder in list_decoder:
encoding = r_serv_metadata.hget('metadata_hash:'+hash, decoder+'_decoder')
if encoding is not None:
list_hash_decoder.append({'encoding': decoder, 'nb_seen': encoding})
num_day_type = 6
date_range_sparkline = get_date_range(num_day_type)
sparkline_values = list_sparkline_values(date_range_sparkline, hash)
if r_serv_metadata.hexists('metadata_hash:'+hash, 'vt_link'):
b64_vt = True
b64_vt_link = r_serv_metadata.hget('metadata_hash:'+hash, 'vt_link')
b64_vt_report = r_serv_metadata.hget('metadata_hash:'+hash, 'vt_report')
else:
b64_vt = False
b64_vt_link = ''
b64_vt_report = r_serv_metadata.hget('metadata_hash:'+hash, 'vt_report')
# hash never refreshed
if b64_vt_report is None:
b64_vt_report = ''
return render_template('showHash.html', hash=hash, vt_enabled=vt_enabled, b64_vt=b64_vt, b64_vt_link=b64_vt_link,
b64_vt_report=b64_vt_report,
size=size, estimated_type=estimated_type, file_icon=file_icon,
first_seen=first_seen, list_hash_decoder=list_hash_decoder,
last_seen=last_seen, nb_seen_in_all_pastes=nb_seen_in_all_pastes, sparkline_values=sparkline_values)
#
# @hashDecoded.route('/hashDecoded/showHash')
# @login_required
# @login_analyst
# def showHash():
# hash = request.args.get('hash')
# #hash = 'e02055d3efaad5d656345f6a8b1b6be4fe8cb5ea'
#
# # TODO FIXME show error
# if hash is None:
# return hashDecoded_page()
#
# estimated_type = r_serv_metadata.hget('metadata_hash:'+hash, 'estimated_type')
# # hash not found
# # TODO FIXME show error
# if estimated_type is None:
# return hashDecoded_page()
#
# else:
# file_icon = get_file_icon(estimated_type)
# size = r_serv_metadata.hget('metadata_hash:'+hash, 'size')
# first_seen = r_serv_metadata.hget('metadata_hash:'+hash, 'first_seen')
# last_seen = r_serv_metadata.hget('metadata_hash:'+hash, 'last_seen')
# nb_seen_in_all_pastes = r_serv_metadata.hget('metadata_hash:'+hash, 'nb_seen_in_all_pastes')
#
# # get all encoding for this hash
# list_hash_decoder = []
# list_decoder = r_serv_metadata.smembers('all_decoder')
# for decoder in list_decoder:
# encoding = r_serv_metadata.hget('metadata_hash:'+hash, decoder+'_decoder')
# if encoding is not None:
# list_hash_decoder.append({'encoding': decoder, 'nb_seen': encoding})
#
# num_day_type = 6
# date_range_sparkline = get_date_range(num_day_type)
# sparkline_values = list_sparkline_values(date_range_sparkline, hash)
#
# if r_serv_metadata.hexists('metadata_hash:'+hash, 'vt_link'):
# b64_vt = True
# b64_vt_link = r_serv_metadata.hget('metadata_hash:'+hash, 'vt_link')
# b64_vt_report = r_serv_metadata.hget('metadata_hash:'+hash, 'vt_report')
# else:
# b64_vt = False
# b64_vt_link = ''
# b64_vt_report = r_serv_metadata.hget('metadata_hash:'+hash, 'vt_report')
# # hash never refreshed
# if b64_vt_report is None:
# b64_vt_report = ''
#
# return render_template('showHash.html', hash=hash, vt_enabled=vt_enabled, b64_vt=b64_vt, b64_vt_link=b64_vt_link,
# b64_vt_report=b64_vt_report,
# size=size, estimated_type=estimated_type, file_icon=file_icon,
# first_seen=first_seen, list_hash_decoder=list_hash_decoder,
# last_seen=last_seen, nb_seen_in_all_pastes=nb_seen_in_all_pastes, sparkline_values=sparkline_values)
@hashDecoded.route('/hashDecoded/downloadHash')
@ -1208,22 +1206,22 @@ def all_cryptocurrency_search():
show_decoded_files = request.form.get('show_decoded_files')
return redirect(url_for('hashDecoded.cryptocurrency_page', date_from=date_from, date_to=date_to, type_id=type_id, show_decoded_files=show_decoded_files))
@hashDecoded.route('/correlation/show_pgpdump')
@login_required
@login_analyst
def show_pgpdump():
type_id = request.args.get('type_id')
key_id = request.args.get('key_id')
return show_correlation('pgpdump', type_id, key_id)
@hashDecoded.route('/correlation/show_cryptocurrency')
@login_required
@login_analyst
def show_cryptocurrency():
type_id = request.args.get('type_id')
key_id = request.args.get('key_id')
return show_correlation('cryptocurrency', type_id, key_id)
# @hashDecoded.route('/correlation/show_pgpdump')
# @login_required
# @login_analyst
# def show_pgpdump():
# type_id = request.args.get('type_id')
# key_id = request.args.get('key_id')
# return show_correlation('pgpdump', type_id, key_id)
#
#
# @hashDecoded.route('/correlation/show_cryptocurrency')
# @login_required
# @login_analyst
# def show_cryptocurrency():
# type_id = request.args.get('type_id')
# key_id = request.args.get('key_id')
# return show_correlation('cryptocurrency', type_id, key_id)
@hashDecoded.route('/correlation/cryptocurrency_range_type_json')
@login_required
@ -1249,6 +1247,7 @@ def pgpdump_graph_node_json():
key_id = request.args.get('key_id')
return correlation_graph_node_json('pgpdump', type_id, key_id)
# # TODO: REFACTOR
@hashDecoded.route('/correlation/cryptocurrency_graph_node_json')
@login_required
@login_analyst
@ -1257,6 +1256,7 @@ def cryptocurrency_graph_node_json():
key_id = request.args.get('key_id')
return correlation_graph_node_json('cryptocurrency', type_id, key_id)
# # TODO: REFACTOR
@hashDecoded.route('/correlation/pgpdump_graph_line_json')
@login_required
@login_analyst
@ -20,6 +20,7 @@
<script language="javascript" src="{{ url_for('static', filename='js/moment.min.js') }}"></script>
<script language="javascript" src="{{ url_for('static', filename='js/jquery.daterangepicker.min.js') }}"></script>
<script language="javascript" src="{{ url_for('static', filename='js/d3.min.js') }}"></script>
<script src="{{ url_for('static', filename='js/d3/sparklines.js')}}"></script>
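<!-- sparklines.js is assumed to expose the shared sparkline(container_id, data, options)
     helper called at the bottom of this template, replacing the inline sparklines()
     function that was previously defined here. -->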
<style>
.input-group .form-control {
@ -144,7 +145,7 @@
{% for key_id in all_metadata %}
<tr>
<td><i class="{{ all_metadata[key_id]['type_icon'] }}"></i>&nbsp;&nbsp;{{ all_metadata[key_id]['type_id'] }}</td>
<td><a target="_blank" href="{{ url_for(show_key_id_endpoint) }}?type_id={{ all_metadata[key_id]['type_id'] }}&key_id={{ key_id }}">{{ key_id }}</a></td>
<td><a target="_blank" href="{{ url_for(show_key_id_endpoint) }}?object_type={{correlation_type_n}}&type_id={{ all_metadata[key_id]['type_id'] }}&correlation_id={{ key_id }}&correlation_objects=paste">{{ key_id }}</a></td>
<td>{{ all_metadata[key_id]['first_seen'] }}</td>
<td>{{ all_metadata[key_id]['last_seen'] }}</td>
<td>{{ all_metadata[key_id]['nb_seen'] }}</td>
@ -223,9 +224,6 @@
chart.stackBarChart = barchart_type_stack("{{ url_for(range_type_json_endpoint) }}?date_from={{date_from}}&date_to={{date_to}}", 'id');
{% endif %}
//draw_pie_chart("pie_chart_encoded" ,"{{ url_for('hashDecoded.decoder_type_json') }}?date_from={{date_from}}&date_to={{date_to}}&type={{type}}", "{{ url_for('hashDecoded.hashDecoded_page') }}?date_from={{date_from}}&date_to={{date_to}}&type={{type}}&encoding=");
//draw_pie_chart("pie_chart_top5_types" ,"{{ url_for('hashDecoded.top5_type_json') }}?date_from={{date_from}}&date_to={{date_to}}&type={{type}}", "{{ url_for('hashDecoded.hashDecoded_page') }}?date_from={{date_from}}&date_to={{date_to}}&type=");
chart.onResize();
$(window).on("resize", function() {
chart.onResize();
@ -246,45 +244,10 @@ function toggle_sidebar(){
}
}
</script>
<script>
// a sparklines plot
function sparklines(id, points) {
var width = 100, height = 60;
var data = []
for (i = 0; i < points.length; i++) {
data[i] = {
'x': i,
'y': +points[i]
}
}
var x = d3.scaleLinear()
.range([0, width - 10])
.domain([0,5]);
var y = d3.scaleLinear()
.range([height, 0])
.domain([0,10]);
var line = d3.line()
.x(function(d) {return x(d.x)})
.y(function(d) {return y(d.y)});
d3.select("#"+id).append('svg')
.attr('width', width)
.attr('height', height)
.append('path')
.attr('class','line')
.datum(data)
.attr('d', line);
}
</script>
<script>
{% for key_id in all_metadata %}
sparklines("sparklines_{{ all_metadata[key_id]['sparklines_id'] }}", {{ all_metadata[key_id]['sparklines_data'] }})
sparkline("sparklines_{{ all_metadata[key_id]['sparklines_id'] }}", {{ all_metadata[key_id]['sparklines_data'] }}, {});
{% endfor %}
</script>
@ -20,6 +20,7 @@
<script language="javascript" src="{{ url_for('static', filename='js/moment.min.js') }}"></script>
<script language="javascript" src="{{ url_for('static', filename='js/jquery.daterangepicker.min.js') }}"></script>
<script language="javascript" src="{{ url_for('static', filename='js/d3.min.js') }}"></script>
<script src="{{ url_for('static', filename='js/d3/sparklines.js')}}"></script>
<style>
.input-group .form-control {
@ -157,7 +158,7 @@
{% for b64 in l_64 %}
<tr>
<td><i class="fas {{ b64[0] }}"></i>&nbsp;&nbsp;{{ b64[1] }}</td>
<td><a target="_blank" href="{{ url_for('hashDecoded.showHash') }}?hash={{ b64[2] }}">{{ b64[2] }}</a></td>
<td><a target="_blank" href="{{ url_for('correlation.show_correlation') }}?object_type=decoded&correlation_id={{ b64[2] }}&correlation_objects=paste">{{ b64[2] }}</a></td>
<td>{{ b64[5] }}</td>
<td>{{ b64[6] }}</td>
<td>{{ b64[3] }}</td>
@ -296,46 +297,10 @@ function toggle_sidebar(){
});
}
</script>
<script>
//var data = [6,3,3,2,5,3,9];
// a sparklines plot
function sparklines(id, points) {
var width = 100, height = 60;
var data = []
for (i = 0; i < points.length; i++) {
data[i] = {
'x': i,
'y': +points[i]
}
}
var x = d3.scaleLinear()
.range([0, width - 10])
.domain([0,5]);
var y = d3.scaleLinear()
.range([height, 0])
.domain([0,10]);
var line = d3.line()
.x(function(d) {return x(d.x)})
.y(function(d) {return y(d.y)});
d3.select("#"+id).append('svg')
.attr('width', width)
.attr('height', height)
.append('path')
.attr('class','line')
.datum(data)
.attr('d', line);
}
</script>
<script>
{% for b64 in l_64 %}
sparklines("sparklines_{{ b64[2] }}", {{ b64[10] }})
sparkline("sparklines_{{ b64[2] }}", {{ b64[10] }}, {});
{% endfor %}
</script>
@ -1,628 +0,0 @@
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>Hash Information - AIL</title>
<link rel="icon" href="{{ url_for('static', filename='image/ail-icon.png') }}">
<!-- Core CSS -->
<link href="{{ url_for('static', filename='css/bootstrap4.min.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/font-awesome.min.css') }}" rel="stylesheet">
<!-- JS -->
<script src="{{ url_for('static', filename='js/jquery.js')}}"></script>
<script src="{{ url_for('static', filename='js/popper.min.js')}}"></script>
<script src="{{ url_for('static', filename='js/bootstrap4.min.js')}}"></script>
<script language="javascript" src="{{ url_for('static', filename='js/d3.min.js') }}"></script>
<style>
line.link {
stroke: #666;
}
line.link:hover{
stroke: red;
stroke-width: 2px
}
.line_sparkline {
fill: none;
stroke: #000;
stroke-width: 2.0px;
}
.node {
pointer-events: all;
}
circle {
stroke: none;
}
.graph_text_node {
font: 8px sans-serif;
pointer-events: none;
}
.graph_node_icon {
pointer-events: none;
}
.node text {
font: 8px sans-serif;
pointer-events: auto;
}
div.tooltip {
position: absolute;
text-align: center;
padding: 2px;
font: 12px sans-serif;
background: #ebf4fb;
border: 2px solid #b7ddf2;
border-radius: 8px;
pointer-events: none;
color: #000000;
}
.graph_panel {
padding: unset;
}
.line_graph {
fill: none;
stroke: steelblue;
stroke-width: 2px;
stroke-linejoin: round;
stroke-linecap: round;
stroke-width: 1.5;
/*attr('stroke', '#bcbd22').*/
}
</style>
</head>
<body>
{% include 'nav_bar.html' %}
<div class="container-fluid">
<div class="row">
{% include 'decoded/menu_sidebar.html' %}
<div class="col-12 col-lg-10" id="core_content">
<div class="card my-3">
<div class="card-header" style="background-color:#d9edf7;font-size: 15px">
<h4 class="text-secondary">{{ hash }} :</h4>
<ul class="list-group mb-2">
<li class="list-group-item py-0">
<div class="row">
<div class="col-md-10">
<table class="table">
<thead>
<tr>
<th>Estimated type</th>
<th>First_seen</th>
<th>Last_seen</th>
<th>Size (Kb)</th>
<th>Nb seen</th>
</tr>
</thead>
<tbody>
<tr>
<td><i class="fas {{ file_icon }}"></i>&nbsp;&nbsp;{{ estimated_type }}</td>
<td>{{ first_seen }}</td>
<td>{{ last_seen }}</td>
<td>{{ size }}</td>
<td>{{ nb_seen_in_all_pastes }}</td>
</tr>
</tbody>
</table>
</div>
<div class="col-md-1">
<div id="sparkline"></div>
</div>
</div>
</li>
</ul>
{% if vt_enabled %}
{% if not b64_vt %}
<darkbutton>
<button id="submit_vt_b" class="btn btn-primary" onclick="sendFileToVT('{{ hash }}')" style="font-size: 15px">
<i class="fas fa-paper-plane"></i>&nbsp;Send this file to VT
</button>
</darkbutton>
{% else %}
<a class="btn btn-primary" target="_blank" href="{{ b64_vt_link }}" style="font-size: 15px"><i class="fas fa-link"></i> VT Report</a>
{% endif %}
<button class="btn btn-outline-secondary" onclick="updateVTReport('{{ hash }}')" style="font-size: 15px">
<div id="report_vt_b"><i class="fas fa-sync-alt"></i>&nbsp;{{ b64_vt_report }}</div>
</button>
{% else %}
Virus Total submission is disabled
{% endif %}
<a href="{{ url_for('hashDecoded.downloadHash') }}?hash={{hash}}" target="blank" class="float-right" style="font-size: 15px">
<button class='btn btn-info'><i class="fas fa-download"></i> Download Decoded file
</button>
</a>
</div>
</div>
<div class="row">
<div class="col-xl-10">
<div class="card mb-3">
<div class="card-header">
<i class="fas fa-project-diagram"></i> Graph
</div>
<div class="card-body graph_panel">
<div id="graph">
</div>
</div>
</div>
</div>
<div class="col-xl-2">
<div class="card">
<div class="card-header">
<i class="fas fa-unlock-alt" aria-hidden="true"></i> Encoding
</div>
<div class="card-body text-center">
{% for encoding in list_hash_decoder %}
<button class="btn btn-outline-dark" disabled>
{{encoding['encoding']}} <span class="badge badge-dark">{{encoding['nb_seen']}}</span>
</button>
{% endfor %}
</div>
</div>
<div class="card my-3">
<div class="card-header">
<i class="fas fa-project-diagram"></i> Graph
</div>
<div class="card-body text-center px-0 py-0">
<button class="btn btn-primary my-4" onclick="resize_graph();">
<i class="fas fa-sync"></i>&nbsp;Resize Graph
</button>
<ul class="list-group">
<li class="list-group-item list-group-item-info"><i class="fas fa-info-circle fa-2x"></i></li>
<li class="list-group-item text-left">
<p>Double click on a node to open Hash/Paste<br><br>
<svg height="12" width="12"><g class="nodes"><circle cx="6" cy="6" r="6" fill="orange"></circle></g></svg>
Current Hash<br>
<svg height="12" width="12"><g class="nodes"><circle cx="6" cy="6" r="6" fill="rgb(141, 211, 199)"></circle></g></svg>
Hashes<br>
<svg height="12" width="12"><g class="nodes"><circle cx="6" cy="6" r="6" fill="#1f77b4"></circle></g></svg>
Pastes
</p>
</li>
<li class="list-group-item list-group-item-info">
Hash Types:
</li>
<li class="list-group-item text-left">
<i class="fas fa-file"></i> Application<br>
<i class="fas fa-file-video"></i> Audio<br>
<i class="fas fa-file-image"></i> Image<br>
<i class="fas fa-file-alt"></i> Text<br>
<i class="fas fa-sticky-note"></i> Other
</li>
</ul>
</div>
</div>
</div>
</div>
<div class="card">
<div class="card-header">
<i class="fas fa-chart-bar"></i> Graph
</div>
<div class="panel-body ">
<div id="graph_line">
</div>
</div>
</div>
</div>
</div>
</div>
<script>
var all_graph = {};
$(document).ready(function(){
$("#page-Decoded").addClass("active");
sparklines("sparkline", {{ sparkline_values }})
all_graph.node_graph = create_graph("{{ url_for('hashDecoded.hash_graph_node_json') }}?hash={{hash}}");
all_graph.line_chart = create_line_chart('graph_line', "{{ url_for('hashDecoded.hash_graph_line_json') }}?hash={{hash}}");
all_graph.onResize();
});
$(window).on("resize", function() {
all_graph.onResize();
});
function toggle_sidebar(){
if($('#nav_menu').is(':visible')){
$('#nav_menu').hide();
$('#side_menu').removeClass('border-right')
$('#side_menu').removeClass('col-lg-2')
$('#core_content').removeClass('col-lg-10')
}else{
$('#nav_menu').show();
$('#side_menu').addClass('border-right')
$('#side_menu').addClass('col-lg-2')
$('#core_content').addClass('col-lg-10')
}
}
</script>
<script>
function sendFileToVT(hash) {
//send file to vt
$.getJSON("{{ url_for('hashDecoded.send_file_to_vt_js') }}?hash="+hash,
function(data) {
var content = '<a id="submit_vt_b" class="btn btn-primary" target="_blank" href="'+ data['vt_link'] +'"><i class="fas fa-link"> '+ ' VT Report' +'</i></a>';
$('#submit_vt_b').remove();
$('darkbutton').append(content);
});
}
function updateVTReport(hash) {
//updateReport
$.getJSON("{{ url_for('hashDecoded.update_vt_result') }}?hash="+hash,
function(data) {
var content = '<i class="fas fa-sync-alt"></i> ' +data['report_vt'];
$( "#report_vt_b" ).html(content);
});
}
</script>
<script>
function resize_graph() {
zoom.translateTo(svg_node, 200, 200);
zoom.scaleTo(svg_node, 2);
}
</script>
<script>
//var data = [6,3,3,2,5,3,9];
// a sparklines plot
function sparklines(id, points) {
var width_spark = 100, height_spark = 60;
var data = []
for (i = 0; i < points.length; i++) {
data[i] = {
'x': i,
'y': +points[i]
}
}
var x = d3.scaleLinear()
.range([0, width_spark - 10])
.domain([0,5]);
var y = d3.scaleLinear()
.range([height_spark, 0])
.domain([0,10]);
var line = d3.line()
.x(function(d) {return x(d.x)})
.y(function(d) {return y(d.y)});
d3.select("#"+id).append('svg')
.attr('width', width_spark)
.attr('height', height_spark)
.append('path')
.attr('class','line_sparkline')
.datum(data)
.attr('d', line);
}
</script>
<script>
var width = 400,
height = 400;
var link;
var zoom = d3.zoom()
.scaleExtent([.2, 10])
.on("zoom", zoomed);
//var transform = d3.zoomIdentity;
var color = d3.scaleOrdinal(d3.schemeCategory10);
var div = d3.select("body").append("div")
.attr("class", "tooltip")
.style("opacity", 0);
var simulation = d3.forceSimulation()
.force("link", d3.forceLink().id(function(d) { return d.id; }))
.force("charge", d3.forceManyBody())
.force("center", d3.forceCenter(width / 2, height / 2));
//.on("tick", ticked);
var svg_node = d3.select("#graph").append("svg")
.attr("id", "graph_div")
.attr("width", width)
.attr("height", height)
.call(d3.zoom().scaleExtent([1, 8]).on("zoom", zoomed))
.on("dblclick.zoom", null)
var container_graph = svg_node.append("g");
//.attr("transform", "translate(40,0)")
//.attr("transform", "scale(2)");
function create_graph(url){
d3.json(url)
.then(function(data){
link = container_graph.append("g")
.selectAll("line")
.data(data.links)
.enter().append("line")
.attr("class", "link");
//.attr("stroke-width", function(d) { return Math.sqrt(d.value); })
var node = container_graph.selectAll(".node")
.data(data.nodes)
.enter().append("g")
.attr("class", "nodes")
.on("dblclick", doubleclick)
.on("click", click)
.on("mouseover", mouseovered)
.on("mouseout", mouseouted)
.call(d3.drag()
.on("start", drag_start)
.on("drag", dragged)
.on("end", drag_end));
node.append("circle")
.attr("r", function(d) {
return (d.hash) ? 6 : 5; })
.attr("fill", function(d) {
if(!d.hash){ return color(d.group);}
if(d.group == 1){ return "orange";}
return "rgb(141, 211, 199)"; });
node.append('text')
.attr('text-anchor', 'middle')
.attr('dominant-baseline', 'central')
.attr("class", "graph_node_icon fa")
.attr('font-size', '8px' )
.attr('pointer-events', 'none')
.text(function(d) {
if(d.hash){
return d.icon
} });
zoom.translateTo(svg_node, 200, 200);
zoom.scaleTo(svg_node, 2);
/* node.append("text")
.attr("dy", 3)
.attr("dx", 7)
.attr("class", "graph_text_node")
//.style("text-anchor", function(d) { return d.children ? "end" : "start"; })
.text(function(d) { return d.id; });*/
simulation
.nodes(data.nodes)
.on("tick", ticked);
simulation.force("link")
.links(data.links);
function ticked() {
link
.attr("x1", function(d) { return d.source.x; })
.attr("y1", function(d) { return d.source.y; })
.attr("x2", function(d) { return d.target.x; })
.attr("y2", function(d) { return d.target.y; });
/*node
.attr("cx", function(d) { return d.x; })
.attr("cy", function(d) { return d.y; });*/
node.attr("transform", function(d) { return "translate(" + d.x + "," + d.y + ")"; });
}
});
}
function zoomed() {
container_graph.attr("transform", d3.event.transform);
}
function doubleclick (d) {
window.open(d.url, '_blank');
}
function click (d) {
console.log('clicked')
}
function drag_start(d) {
if (!d3.event.active) simulation.alphaTarget(0.3).restart();
d.fx = d.x;
d.fy = d.y;
}
function dragged(d) {
d.fx = d3.event.x;
d.fy = d3.event.y;
}
function drag_end(d) {
if (!d3.event.active) simulation.alphaTarget(0);
d.fx = d.x;
d.fy = d.y;
}
function mouseovered(d) {
// tooltip
var content;
if(d.hash == true){
content = "<b><span id='tooltip-id-name'></span></b><br/>"+
"<br/>"+
"<i>First seen</i>: <span id='tooltip-id-first_seen'></span><br/>"+
"<i>Last seen</i>: <span id='tooltip-id-last_seen'></span><br/>"+
"<i>nb_seen</i>: <span id='tooltip-id-nb_seen'></span>";
div.transition()
.duration(200)
.style("opacity", .9);
div.html(content)
.style("left", (d3.event.pageX) + "px")
.style("top", (d3.event.pageY - 28) + "px");
$("#tooltip-id-name").text(d.id);
$("#tooltip-id-first_seen").text(d.first_seen);
$("#tooltip-id-last_seen").text(d.last_seen);
$("#tooltip-id-nb_seen").text(d.nb_seen_in_paste);
} else {
content = "<b><span id='tooltip-id-name'></span></b><br/>";
div.transition()
.duration(200)
.style("opacity", .9);
div.html(content)
.style("left", (d3.event.pageX) + "px")
.style("top", (d3.event.pageY - 28) + "px");
$("#tooltip-id-name").text(d.id);
}
//links
/*link.style("stroke-opacity", function(o) {
return o.source === d || o.target === d ? 1 : opacity;
});*/
link.style("stroke", function(o){
return o.source === d || o.target === d ? "#666" : "#ddd";
});
}
function mouseouted() {
div.transition()
.duration(500)
.style("opacity", 0);
link.style("stroke", "#666");
}
all_graph.onResize = function () {
var aspect = 1000 / 500, all_graph = $("#graph_div");
var targetWidth = all_graph.parent().width();
all_graph.attr("width", targetWidth);
all_graph.attr("height", targetWidth / aspect);
}
window.all_graph = all_graph;
</script>
<script>
function create_line_chart(id, url){
var width = 900;
var height = Math.round(width / 4);
var margin = {top: 20, right: 55, bottom: 50, left: 40};
var x = d3.scaleTime().range([0, width]);
var y = d3.scaleLinear().rangeRound([height, 0]);
var xAxis = d3.axisBottom(x);
var yAxis = d3.axisLeft(y);
var parseTime = d3.timeParse("%Y-%m-%d");
var line = d3.line()
.x(function(d) {
return x(d.date);
}).y(function(d) {
return y(d.value);
});
var svg_line = d3.select('#'+id).append('svg')
.attr("id", "graph_div")
.attr("width", width + margin.left + margin.right)
.attr("height", height + margin.top + margin.bottom)
.append('g')
.attr('transform', "translate("+ margin.left +","+ margin.top +")");
var div = d3.select('body').append('div')
.attr('class', 'tooltip')
.style('opacity', 0);
//add div tooltip
d3.json(url)
.then(function(data){
data.forEach(function(d) {
d.date_label = d.date;
d.date = parseTime(d.date);
d.value = +d.value;
});
// fit the data
x.domain(d3.extent(data, function(d) { return d.date; }));
//x.domain(data.map(function (d) { return d.date; })); //E
y.domain([0, d3.max(data, function(d){ return d.value ; })]);
//line
svg_line.append("path")
.data([data])
.attr("class", "line_graph")
.attr("d", line);
// add X axis
svg_line.append("g")
.attr("transform", "translate(0," + height + ")")
.call(d3.axisBottom(x))
.selectAll("text")
.style("text-anchor", "end")
.attr("transform", "rotate(-45)" );
// Add the Y Axis
svg_line.append("g")
.call(d3.axisLeft(y));
//add a dot circle
svg_line.selectAll('dot')
.data(data).enter()
.append('circle')
.attr('r', 2)
.attr('cx', function(d) { return x(d.date); })
.attr('cy', function(d) { return y(d.value); })
.on('mouseover', function(d) {
div.transition().style('opacity', .9);
div.html('' + d.date_label+ '<br/>' + d.value).style('left', (d3.event.pageX) + 'px')
.style("left", (d3.event.pageX) + "px")
.style("top", (d3.event.pageY - 28) + "px");
})
.on('mouseout', function(d)
{
div.transition().style('opacity', 0);
});
});
}
</script>
</body>
</html>