mirror of https://github.com/ail-project/ail-framework.git synced 2025-01-30 22:16:16 +00:00

AIL framework - Analysis Information Leak framework

AIL framework

AIL framework - Framework for Analysis of Information Leaks

AIL is a modular framework to analyse potential information leaks from unstructured data sources, such as pastes from Pastebin or similar services, or from unstructured data streams. The AIL framework is flexible and can be extended with other functionality to mine or process sensitive information (e.g. data leak prevention).

AIL v5.0:

AIL v5.0 introduces significant improvements and new features:

Codebase Rewrite: The codebase has undergone a substantial rewrite, resulting in improved performance and speed.
Database Upgrade: The database has been migrated from ARDB to Kvrocks.
New Correlation Engine: AIL v5.0 introduces a new powerful correlation engine with two new correlation types: CVE and Title.
Enhanced Logging: The logging system has been improved to provide better troubleshooting capabilities.
Tagging Support: AIL objects now support tagging, allowing users to categorize and label extracted information for easier analysis and organization.
Trackers: Improved object filtering; PGP and decoded tracking added.
UI Content Visualization: The user interface has been upgraded to visualize extracted and tracked information.
New Crawler Lacus: Improved crawling capabilities.
Modular Importers and Exporters: New modular design for importers (ZMQ, AIL Feeders) and exporters (MISP, Mail, TheHive), allowing easy creation and customization by extending an abstract class.
Module Queues: Improved queuing mechanism between detection modules.
New Object CVE and Title: Extract and correlate CVE IDs and web page titles.
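
As an illustration of the "extend an abstract class" idea behind the modular importers, here is a minimal sketch in plain Python. The names `BaseImporter`, `ListImporter`, `fetch` and `run` are hypothetical and chosen for this example only; they are not AIL's actual classes or methods, so check the code under bin/ for the real interfaces.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterable, Iterator, List


class BaseImporter(ABC):
    """Hypothetical base class (not AIL's real API): subclasses only supply fetch()."""

    def __init__(self, name: str) -> None:
        self.name = name

    @abstractmethod
    def fetch(self) -> Iterable[str]:
        """Return raw text items from the underlying source."""

    def run(self) -> Iterator[Dict[str, str]]:
        # Shared logic lives in the base class: strip whitespace, skip empty items,
        # and wrap each item in a small message dictionary.
        for raw in self.fetch():
            text = raw.strip()
            if text:
                yield {"source": self.name, "content": text}


class ListImporter(BaseImporter):
    """Toy importer reading from an in-memory list (stands in for a real feed)."""

    def __init__(self, lines: List[str]) -> None:
        super().__init__("list")
        self._lines = list(lines)

    def fetch(self) -> Iterable[str]:
        return iter(self._lines)


messages = list(ListImporter(["  leaked line  ", "", "second"]).run())
```

The base class owns the common processing, so a new source only needs to implement one method. Instantiating `BaseImporter` directly raises `TypeError` because `fetch` is abstract.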

Features

Modular architecture to handle streams of unstructured or structured information
Default support for external ZMQ feeds, such as provided by CIRCL or other providers
Multiple Importers and feeds support
Each module can process and reprocess the information already analyzed by AIL
Detecting and extracting URLs including their geographical location (e.g. IP address location)
Extracting and validating potential leaks of credit card numbers, credentials, ...
Extracting and validating leaked email addresses, including DNS MX validation
Module for extracting Tor .onion addresses for further analysis
Keeping track of duplicate credentials (and diffing between each duplicate found)
Extracting and validating potential hostnames (e.g. to feed Passive DNS systems)
A full-text indexer module to index unstructured information
Tracking of terms, sets of terms, regexes, typosquatting and YARA rules, and their occurrences
YARA Retro Hunt
Many more modules for extracting phone numbers, credentials, and more
Alerting to MISP to share found leaks within a threat intelligence platform using MISP standard
Detecting and decoding encoded files (Base64, hex encoded or your own decoding scheme) and storing files
Detecting Amazon AWS and Google API keys
Detecting Bitcoin addresses and Bitcoin private keys
Detecting private keys, certificates and keys (including SSH, OpenVPN)
Detecting IBAN bank accounts
Tagging system with MISP Galaxy and MISP Taxonomies tags
UI submission
Create events on MISP and cases on The Hive
Automatic export on detection with MISP (events) and The Hive (alerts) on selected tags
Extracted and decoded files can be searched by date range, type of file (mime-type) and encoding discovered
Correlation engine and graph to visualize relationships between decoded files (hashes), PGP UIDs, domains, usernames, and cryptocurrency addresses
Crawler for websites, forums and Tor hidden services to crawl and parse output
Domain availability monitoring to detect when websites and hidden services go up or down
Browsed hidden services are automatically captured and integrated into the analyzed output, including a blurring screenshot interface (to avoid "burning the eyes" of security analysts with sensitive content)
Tor hidden services support is part of the standard framework; all the AIL modules are available to the crawled hidden services
Crawler scheduler to trigger crawling on demand or at regular intervals for URLs or Tor hidden services
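
To make "extracting and validating" from the list above more concrete, here is a minimal sketch of how credit card candidates could be pulled out of text and checked with the Luhn checksum. This is an illustration only and not AIL's actual module code; the regular expression and the function names `luhn_valid` and `extract_valid_cards` are assumptions made for this example.

```python
import re
from typing import List

# Candidate pattern (an assumption for this sketch): 13 to 19 digits,
# optionally separated by single spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if not 13 <= len(digits) <= 19:
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def extract_valid_cards(text: str) -> List[str]:
    """Return candidate numbers found in `text` that pass the Luhn check."""
    found = []
    for match in CARD_CANDIDATE.finditer(text):
        number = re.sub(r"[ -]", "", match.group())
        if luhn_valid(number):
            found.append(number)
    return found
```

The checksum step filters out most random digit runs, which is why validation after extraction reduces false positives. For example, `extract_valid_cards("dump: 4111 1111 1111 1111 and 1234-5678-9012-3456")` keeps only the first number, a well-known test card value.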

Installation

To install the AIL framework, run the following commands:

# Clone the repo first
git clone https://github.com/ail-project/ail-framework.git
cd ail-framework

# For Debian and Ubuntu based distributions
./installing_deps.sh

# Launch AIL (from the ail-framework directory)
cd bin/
./LAUNCH.sh -l

The default installing_deps.sh is for Debian and Ubuntu based distributions.

Requirement:

Python 3.7+

Installation Notes

For Lacus Crawler and LibreTranslate installation instructions (if you want to use those features), refer to HOWTO.md.

Starting AIL

To start AIL, use the following commands:

cd bin/
./LAUNCH.sh -l

You can access the AIL framework web interface at the following URL:

https://localhost:7000/

The default credentials for the web interface are located in the DEFAULT_PASSWORD file, which is deleted when you change your password.
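
If you want to check from a script that the web interface has come up after launching, a small sketch like the following may help. It assumes the interface answers on https://localhost:7000/ as stated above and that the instance may use a self-signed certificate (an assumption, which is why certificate verification is disabled here; do not do this against untrusted hosts). The function name `ail_ui_reachable` is hypothetical.

```python
import ssl
import urllib.error
import urllib.request


def ail_ui_reachable(url: str = "https://localhost:7000/", timeout: float = 5.0) -> bool:
    """Return True if something answers HTTP(S) requests at `url`."""
    # Assumption: a local instance may present a self-signed certificate,
    # so verification is turned off for this local check only.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    try:
        with urllib.request.urlopen(url, timeout=timeout, context=context):
            return True
    except urllib.error.HTTPError:
        # An HTTP error status still means a server is answering.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, TLS failure, and similar.
        return False
```

Calling `ail_ui_reachable()` shortly after `./LAUNCH.sh -l` may return False until the web server has finished starting, so retry after a few seconds.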

Training

CIRCL organises training on how to use or extend the AIL framework. AIL training materials are available at https://github.com/ail-project/ail-training.

API

The API documentation is available in doc/api.md

HOWTO

HOWTOs are available in HOWTO.md

For information on AIL's compliance with GDPR and privacy considerations, refer to the document "AIL information leaks analysis and the GDPR in the context of collection, analysis and sharing information leaks". It provides an overview of how to use AIL lawfully, especially within the scope of the General Data Protection Regulation.

Research using AIL

If you use or reference AIL in an academic paper, you can cite it using the following BibTeX:

@inproceedings{mokaddem2018ail,
  title={AIL-The design and implementation of an Analysis Information Leak framework},
  author={Mokaddem, Sami and Wagener, G{\'e}rard and Dulaunoy, Alexandre},
  booktitle={2018 IEEE International Conference on Big Data (Big Data)},
  pages={5049--5057},
  year={2018},
  organization={IEEE}
}

Screenshots

Websites, Forums and Tor Hidden-Services

Extracted encoded files from items

Correlation Engine

Investigation

Tagging system

MISP Export

MISP and The Hive, automatic events and alerts creation

UI submission

Trackers

License

    Copyright (C) 2014 Jules Debra
    Copyright (c) 2021 Olivier Sagit
    Copyright (C) 2014-2024 CIRCL - Computer Incident Response Center Luxembourg (c/o smile, security made in Lëtzebuerg, Groupement d'Intérêt Economique)
    Copyright (c) 2014-2024 Raphaël Vinot
    Copyright (c) 2014-2024 Alexandre Dulaunoy
    Copyright (c) 2016-2024 Sami Mokaddem
    Copyright (c) 2018-2024 Thirion Aurélien

    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU Affero General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU Affero General Public License for more details.

    You should have received a copy of the GNU Affero General Public License
    along with this program.  If not, see <http://www.gnu.org/licenses/>.