AIL
===
<table>
<tr>
<td>Latest Release</td>
<td><a href="https://github.com/CIRCL/AIL-framework/releases/latest"><img src="https://img.shields.io/github/release/CIRCL/AIL-framework/all.svg"></a></td>
</tr>
<tr>
<td>Travis</td>
<td><a href="https://travis-ci.org/CIRCL/AIL-framework"><img src="https://img.shields.io/travis/CIRCL/AIL-framework.svg" /></a></td>
</tr>
<tr>
<td>Gitter</td>
<td><a href="https://gitter.im/SteveClement/AIL-framework?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge"><img src="https://badges.gitter.im/SteveClement/AIL-framework.svg" /></a></td>
</tr>
<tr>
<td>Contributors</td>
<td><img src="https://img.shields.io/github/contributors/CIRCL/AIL-Framework.svg" /></td>
</tr>
<tr>
<td>License</td>
<td><img src="https://img.shields.io/github/license/CIRCL/AIL-Framework.svg" /></td>
</tr>
</table>

![Logo](./doc/logo/logo-small.png?raw=true "AIL logo")

AIL framework - Framework for Analysis of Information Leaks

AIL is a modular framework to analyse potential information leaks from unstructured data sources like pastes from Pastebin or similar services or unstructured data streams. The AIL framework is flexible and can be extended to support other functionality to mine or process sensitive information (e.g. data leak prevention).

![Dashboard](./doc/screenshots/dashboard.png?raw=true "AIL framework dashboard")

Features
--------
* Modular architecture to handle streams of unstructured or structured information
* Default support for external ZMQ feeds, such as those provided by CIRCL or other providers
* Multiple feed support
* Each module can process and reprocess the information already processed by AIL
* Detecting and extracting URLs including their geographical location (e.g. IP address location)
* Extracting and validating potential leaks of credit card numbers, credentials, ...
* Extracting and validating leaked email addresses, including DNS MX validation
* Module for extracting Tor .onion addresses (to be further processed for analysis)
* Keeping track of duplicates (and diffing between each duplicate found)
* Extracting and validating potential hostnames (e.g. to feed Passive DNS systems)
* A full-text indexer module to index unstructured information
* Statistics on modules and web
* Real-time modules manager in terminal
* Global sentiment analysis for each provider, based on the NLTK VADER module
* Tracking of terms, sets of terms and regexes, and of their occurrences
* Many more modules for extracting phone numbers, credentials and others
* Alerting to [MISP](https://github.com/MISP/MISP) to share found leaks within a threat intelligence platform using the [MISP standard](https://www.misp-project.org/objects.html#_ail_leak)
* Detect and decode encoded files (Base64, hex encoded or your own decoding scheme) and store files
* Detect Amazon AWS and Google API keys
* Detect Bitcoin addresses and Bitcoin private keys
* Detect private keys, certificates and keys (including SSH, OpenVPN)
* Detect IBAN bank accounts
* Tagging system with [MISP Galaxy](https://github.com/MISP/misp-galaxy) and [MISP Taxonomies](https://github.com/MISP/misp-taxonomies) tags
* UI paste submission
* Create events on [MISP](https://github.com/MISP/MISP) and cases on [The Hive](https://github.com/TheHive-Project/TheHive)
* Automatic paste export on detection to [MISP](https://github.com/MISP/MISP) (events) and [The Hive](https://github.com/TheHive-Project/TheHive) (alerts) for selected tags
* Extracted and decoded files can be searched by date range, type of file (mime-type) and encoding discovered
* Graph relationships between decoded files (hashes), similar PGP UIDs and addresses of cryptocurrencies
* Tor hidden services crawler to crawl and parse output
* Tor onion availability is monitored to detect when hidden services go up or down
* Browsed hidden services are screenshotted and integrated in the analysed output, including a screenshot blurring interface (to avoid "burning the eyes" of the security analyst with specific content)
* Tor hidden services crawling is part of the standard framework; all the AIL modules are available to the crawled hidden services
* Generic web crawler to trigger crawling of URLs or Tor hidden services on demand or at regular intervals
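
To illustrate what "extracting and validating" can mean for credit card numbers in the list above, a widely used validity check is the Luhn checksum. The sketch below is not taken from the AIL code base, and this README does not say which checks the AIL modules actually apply; the function name `luhn_valid` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of AIL): Luhn checksum for a candidate card number.
# Returns 0 when the checksum is valid, 1 otherwise.
luhn_valid() {
  local digits="${1//[^0-9]/}"   # drop spaces, dashes and any other non-digit
  local len=${#digits} sum=0 i d
  (( len > 0 )) || return 1
  for (( i = 0; i < len; i++ )); do
    d=${digits:len-1-i:1}        # walk the digits from right to left
    if (( i % 2 == 1 )); then    # double every second digit
      d=$(( d * 2 ))
      if (( d > 9 )); then d=$(( d - 9 )); fi
    fi
    sum=$(( sum + d ))
  done
  (( sum % 10 == 0 ))
}

# Example with a well-known test number (not a real card).
if luhn_valid "4111 1111 1111 1111"; then
  echo "checksum ok"
fi
```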

Installation
------------
Run these commands for a fully automated installation and to start the AIL framework:
```bash
git clone https://github.com/CIRCL/AIL-framework.git
cd AIL-framework
./installing_deps.sh
cd ~/AIL-framework/
cd bin/
./LAUNCH.sh -l
```
The default [installing_deps.sh](./installing_deps.sh) is for Debian and Ubuntu based distributions.
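
Because the installer targets Debian and Ubuntu based distributions, it can be useful to check the host before running it. The following is a minimal sketch, assuming the system provides the standard `/etc/os-release` file; the function name `is_debian_like` is hypothetical and not part of the AIL scripts.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight check (not part of AIL).
# Returns 0 when the os-release file describes Debian, Ubuntu or a derivative.
is_debian_like() {
  local os_release="${1:-/etc/os-release}"   # path can be overridden, e.g. for testing
  [ -r "$os_release" ] || return 1
  # ID names the distribution; ID_LIKE lists the families it derives from.
  grep -Eq '^ID(_LIKE)?=.*(debian|ubuntu)' "$os_release"
}

if is_debian_like; then
  echo "Debian-like system detected: installing_deps.sh should apply."
else
  echo "Not a Debian-like system: review installing_deps.sh before running it." >&2
fi
```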

There is also a [Travis file](.travis.yml) used to automate the installation; it can serve as a reference for building and installing AIL on other systems.

Installation Notes
------------
In order to use AIL combined with **ZFS** or **unprivileged LXC** it's necessary to disable Direct I/O in `$AIL_HOME/configs/6382.conf` by changing the value of the directive `use_direct_io_for_flush_and_compaction` to `false`.
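
The change described above can be scripted. This is a minimal sketch, assuming the directive appears in the file as `use_direct_io_for_flush_and_compaction=true` (or with whitespace instead of `=`); the exact layout of `6382.conf` is an assumption here, so inspect the file first. The helper name `disable_direct_io` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of AIL): set the Direct I/O directive to false.
disable_direct_io() {
  local conf="$1"
  [ -f "$conf" ] || { echo "config not found: $conf" >&2; return 1; }
  # Keep a .bak copy of the file, then flip only this directive from true to false.
  sed -i.bak -E \
    's/(use_direct_io_for_flush_and_compaction[[:space:]]*=?[[:space:]]*)true/\1false/' \
    "$conf"
}

# Only touch the configuration when AIL_HOME is set in the environment.
if [ -n "${AIL_HOME:-}" ]; then
  disable_direct_io "$AIL_HOME/configs/6382.conf"
fi
```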

Python 3 Upgrade
------------
To upgrade an existing AIL installation, launch [python3_upgrade.sh](./python3_upgrade.sh). This script deletes the existing virtual environment and creates a new one. The script **will upgrade the packages but won't keep your previous data** (nevertheless, the data is copied into a directory called `old`). If you install from scratch, you don't need to launch [python3_upgrade.sh](./python3_upgrade.sh).

Docker Quick Start (Ubuntu 16.04 LTS)
------------
:warning:
Not maintained at the moment.
If you are interested in getting this running, please:
Fork -> Branch -> PR

1. Install Docker
```bash
sudo su
apt-get install -y curl
curl https://get.docker.com | /bin/bash
```
2. Type these commands to build the Docker image:
```bash
git clone https://github.com/CIRCL/AIL-framework.git
cd AIL-framework
docker build -t ail-framework .
```
3. To start AIL on port 7000, type the following command:
```bash
docker run -p 7000:7000 ail-framework
```
4. To debug the running container, type the following command and note the container name or identifier:
```bash
docker ps
```
After getting the name or identifier, type the following commands:
```bash
docker exec -it CONTAINER_NAME_OR_IDENTIFIER bash
cd /opt/ail
```

Install using Ansible
---------------------
Please check the [Ansible readme](ansible/README.md).

Starting AIL
--------------------------
```bash
cd bin/
./LAUNCH.sh -l
```
Once AIL has started, you can check the status of the AIL framework in its web interface at the following URL:
```
https://localhost:7000/
```
The default credentials for the web interface are located in ``DEFAULT_PASSWORD``. This file is removed when you change your password.
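
The web interface may need a moment to come up after `./LAUNCH.sh -l`, so a small retry loop can poll the URL above before you open it. This is a sketch under assumptions: `retry` and `wait_for_ail` are hypothetical helper names, `curl` is assumed to be installed, and `-k` is used on the assumption that the local instance serves a self-signed certificate.

```bash
#!/usr/bin/env bash
# Hypothetical helpers (not part of AIL).

# retry TRIES DELAY CMD...: run CMD until it succeeds or TRIES attempts are used up.
retry() {
  local tries="$1" delay="$2" i
  shift 2
  for (( i = 1; i <= tries; i++ )); do
    "$@" && return 0
    sleep "$delay"
  done
  return 1
}

# Poll the local web interface: up to 30 attempts, 2 seconds apart.
# -k accepts a self-signed certificate (assumption about the local setup),
# -s silences progress output, -f turns HTTP errors (>= 400) into a failure.
wait_for_ail() {
  retry 30 2 curl -ksf -o /dev/null https://localhost:7000/
}

# Call wait_for_ail after launching AIL, for example:
#   wait_for_ail && echo "AIL web interface is reachable"
```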

Training
--------
CIRCL organises training on how to use or extend the AIL framework. The next training will be [Thursday, 20 Dec](https://en.xing-events.com/ZEQWMLJ.html) in Luxembourg.

HOWTO
-----
HOWTOs are available in [HOWTO.md](HOWTO.md).

Privacy and GDPR
----------------
The document [AIL information leaks analysis and the GDPR in the context of collection, analysis and sharing information leaks](https://www.circl.lu/assets/files/information-leaks-analysis-and-gdpr.pdf) provides an overview of how to use AIL lawfully, especially within the scope of the General Data Protection Regulation.

Research using AIL
------------------
If you write an academic paper that relies on or uses AIL, you can cite it with the following BibTeX entry:
~~~~
@inproceedings{mokaddem2018ail,
title={AIL-The design and implementation of an Analysis Information Leak framework},
author={Mokaddem, Sami and Wagener, G{\'e}rard and Dulaunoy, Alexandre},
booktitle={2018 IEEE International Conference on Big Data (Big Data)},
pages={5049--5057},
year={2018},
organization={IEEE}
}
~~~~

Screenshots
===========

Tor hidden service crawler
--------------------------
![Tor hidden service](./doc/screenshots/ail-bitcoinmixer.png?raw=true "Tor hidden service crawler")

Trending charts
---------------
![Trending-Web](./doc/screenshots/trending-web.png?raw=true "AIL framework webtrending")
![Trending-Modules](./doc/screenshots/trending-module.png?raw=true "AIL framework modulestrending")

Extracted encoded files from pastes
-----------------------------------
![Extracted files from pastes](./doc/screenshots/ail-hashedfiles.png?raw=true "AIL extracted decoded files statistics")
![Relationships between extracted files from encoded file in unstructured data](./doc/screenshots/hashedfile-graph.png?raw=true "Relationships between extracted files from encoded file in unstructured data")

Browsing
--------
![Browse-Pastes](./doc/screenshots/browse-important.png?raw=true "AIL framework browseImportantPastes")

Tagging system
--------
![Tags](./doc/screenshots/tags.png?raw=true "AIL framework tags")

MISP and The Hive, automatic events and alerts creation
--------
![tag_auto_export](./doc/screenshots/tag_auto_export.png?raw=true "AIL framework MISP and Hive auto export")

Paste submission
--------
![paste_submit](./doc/screenshots/paste_submit.png?raw=true "AIL framework paste submission")

Sentiment analysis
------------------
![Sentiment](./doc/screenshots/sentiment.png?raw=true "AIL framework sentimentanalysis")

Terms manager and occurrence
---------------------------
![Term-Manager](./doc/screenshots/terms-manager.png?raw=true "AIL framework termManager")
### Top terms
![Term-Top](./doc/screenshots/terms-top.png?raw=true "AIL framework termTop")
![Term-Plot](./doc/screenshots/terms-plot.png?raw=true "AIL framework termPlot")
[AIL framework screencast](https://www.youtube.com/watch?v=1_ZrZkRKmNo)

Command line module manager
---------------------------
![Module-Manager](./doc/screenshots/module_information.png?raw=true "AIL framework ModuleInformationV2.py")

License
=======
```
Copyright (C) 2014 Jules Debra
Copyright (C) 2014-2019 CIRCL - Computer Incident Response Center Luxembourg (c/o smile, security made in Lëtzebuerg, Groupement d'Intérêt Economique)
Copyright (c) 2014-2019 Raphaël Vinot
Copyright (c) 2014-2019 Alexandre Dulaunoy
Copyright (c) 2016-2019 Sami Mokaddem
Copyright (c) 2018-2019 Thirion Aurélien

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Affero General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU Affero General Public License for more details.

You should have received a copy of the GNU Affero General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
```